Operations & quality

Testing in the Playground

The Playground is a built-in test chat window in which you can try out your agent live — exactly as a customer would experience it on your website, but without using your real channels (WhatsApp, SMS, phone, email). Use it after every change to the persona, knowledge base, model or booking settings to check whether the agent responds the way you expect before it meets real customers.

You will find the Playground in the agent menu under the "Playground" tab (Agents → your agent → "Playground" tab).

How the test window works

The Playground consists of three parts: the header with the reset button, the chat history in the middle, and the input field at the bottom. There is nothing to configure here in the usual sense (no form fields to fill in): the Playground automatically uses your agent's currently saved configuration. So what you test is always the currently saved state.

Step 1: Enter a message
  • Type your test question into the input field at the bottom. The placeholder text reads "Enter a message… (Enter to send, Shift+Enter for a new line)".
  • Enter sends the message immediately.
  • Shift+Enter inserts a line break without sending, which is handy for reconstructing longer, multi-line customer enquiries.
  • The field is two lines high but grows with longer text. As long as the field is empty, the send button (the paper-plane icon on the right) stays disabled.

Step 2: Wait for the reply
  • After sending, your message appears on the right (as a speech bubble with a person icon). While the agent is thinking, you see an animated "typing indicator" on the left (three pulsing dots).
  • The agent's reply appears on the left with a robot icon. It is generated with the real configuration: the chosen model, temperature, persona, base prompt, knowledge base (including recall-fallback, see Tips) and all enabled tools such as booking or lead capture.

Step 3: Continue the conversation
  • Every further message remains part of the same conversation, so the agent remembers the history so far, exactly as in live operation. This also lets you test multi-step dialogues (e.g. an appointment booking spread across several messages).
  • As soon as the first reply has arrived, a small grey label such as "conv 1a2b3c4d" appears in the description at the top right. That is the shortened conversation ID of this test conversation. It is purely for orientation; you do not need to do anything with it.

Step 4: Reset for a fresh test
  • The "Reset" button at the top right (with a circular-arrow icon) clears the chat history and starts a completely new conversation. A brief notice "Conversation reset" confirms this.
  • Use it when you want to start a new scenario from scratch, for example to check how the agent handles a first enquiry with no prior history. The button is only active when there are already messages in the history.
  • Important: resetting only clears your view. The test conversation already held (and any leads/costs that arose from it) remains in the system.
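
The steps above can be pictured with a small model. The following Python sketch is purely illustrative and is not the product's actual code: the class PlaygroundView, the dictionary SERVER_CONVERSATIONS and the rule that the label shows the first eight characters of the ID are assumptions, chosen only to match the behaviour described (one conversation across messages, a short "conv …" label, and a reset that clears the view but not the stored conversation).

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical stand-in for the conversations stored in the system.
SERVER_CONVERSATIONS: dict[str, list[str]] = {}


@dataclass
class PlaygroundView:
    """Illustrative model of the test window, not the real implementation."""

    conversation_id: Optional[str] = None
    messages: list[str] = field(default_factory=list)

    def send(self, text: str) -> None:
        # An empty field cannot be sent (the send button stays disabled).
        if not text.strip():
            raise ValueError("empty message cannot be sent")
        # The first message opens a conversation; later ones reuse it.
        if self.conversation_id is None:
            self.conversation_id = uuid.uuid4().hex
            SERVER_CONVERSATIONS[self.conversation_id] = []
        SERVER_CONVERSATIONS[self.conversation_id].append(text)
        self.messages.append(text)

    def label(self) -> Optional[str]:
        # Assumption: the grey label shows the first 8 characters of the ID.
        if self.conversation_id is None:
            return None
        return f"conv {self.conversation_id[:8]}"

    def can_reset(self) -> bool:
        # The reset button is only active once the history has messages.
        return bool(self.messages)

    def reset(self) -> None:
        # Clears the view only; the stored conversation is left untouched.
        self.messages.clear()
        self.conversation_id = None
```

In this model, calling reset() after two messages empties the view, while the two messages stay in SERVER_CONVERSATIONS. That is why test conversations can still show up in your lists later.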

What happens behind the scenes (and why it matters)

The Playground is not a mere dry run. It uses the same agent engine as live operation, via an internal channel called "portal". In concrete terms, this means:

  • Real conversations are created. Every test conversation is saved and later shows up in your inbox / your conversations.
  • Real leads can be created. If you have enabled CRM mode (the "Create automatically on contact" setting / auto_create_on_contact), every contact becomes a lead — and this applies to all channels, including the Playground. Without CRM mode, a lead is only created when the agent itself decides to capture the contact details (the capture_lead tool). So expect that your test chats may generate lead entries.
  • Real costs are incurred. Every reply calls the AI provider (e.g. OpenAI), and these costs are fully recorded and counted against your daily budget (daily_budget_usd). Intensive testing can therefore eat into the limit.
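
The three points above can be summarised in a short Python sketch. It is a simplified model under stated assumptions, not the engine's real code: the function handle_turn, the classes AgentSettings and DayState and the exact "spent is greater than or equal to budget" check are hypothetical. Only the setting names auto_create_on_contact and daily_budget_usd, the capture_lead tool, the "portal" channel and the unavailability message come from this guide.

```python
from dataclasses import dataclass, field
from typing import Optional

PLAYGROUND_CHANNEL = "portal"  # internal channel name used by the Playground
BUDGET_STOP_MESSAGE = (
    "This assistant is temporarily unavailable. Please try again later."
)


@dataclass
class AgentSettings:
    auto_create_on_contact: bool        # CRM mode
    daily_budget_usd: Optional[float]   # None = no daily budget set


@dataclass
class DayState:
    spent_usd: float = 0.0
    saved_turns: int = 0
    leads: set = field(default_factory=set)


def handle_turn(settings: AgentSettings, state: DayState, contact_id: str,
                reply_cost_usd: float, agent_calls_capture_lead: bool,
                channel: str = PLAYGROUND_CHANNEL) -> str:
    """One test message. Note that nothing here branches on the channel:
    the Playground is treated like any other channel."""
    # Assumed budget stop: no reply once the daily budget is used up.
    if (settings.daily_budget_usd is not None
            and state.spent_usd >= settings.daily_budget_usd):
        return BUDGET_STOP_MESSAGE
    state.saved_turns += 1             # the conversation is really stored
    state.spent_usd += reply_cost_usd  # the provider call is really billed
    # CRM mode: every contact becomes a lead. Otherwise a lead is created
    # only if the agent itself decides to call capture_lead.
    if settings.auto_create_on_contact or agent_calls_capture_lead:
        state.leads.add(contact_id)
    return "agent reply"
```

With auto_create_on_contact enabled, the very first test message already adds a lead in this model. With it disabled, a lead appears only on a turn where the agent calls capture_lead.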

Tips & pitfalls

  • The daily budget also applies in the Playground. If you have set a daily budget and it is already used up, the agent will no longer reply here either, but will show something to the effect of "This assistant is temporarily unavailable. Please try again later." This is not a fault of the Playground but your budget stop. In that case, check the agent's budget.
  • Test chats end up in your real conversations and leads. If you want clean "real" statistics and lists, delete the test conversations and test leads afterwards, or accept that test data and real data will be mixed. Especially with CRM mode enabled, stale test entries otherwise pile up quickly.
  • The agent must be active. The Playground only talks to an active, non-deleted agent. If the agent is disabled, an error message appears in the chat.
  • You always test the saved state. Changes you make to the persona, prompt or knowledge base only take effect after saving. If you send a test message without saving first, the agent will still reply with the old settings.
  • Mind the "recall-fallback" for knowledge. The knowledge search works "recall-first": if no snippet falls clearly within the distance threshold (kb_max_distance, default 0.5), the best near-match is still used. So an answer in the Playground that is "somehow fitting but not quite right" can come from exactly this. Automatically crawled pages cluttered with navigation and footer text worsen the match quality. In that case, re-upload the content as clean text.
  • Why did it answer that way? The Playground only shows the finished reply, not the reasoning. If you want to see which knowledge snippets were drawn on, which tools the agent called and which decisions it made, use the separate "AI debug" area (overview/knowledge health, the live "retrieval test" with a recommended threshold and one-click save, the "turn traces" per message, as well as the decision overview with thumbs up/down).
  • Realistic tests instead of keywords. Write whole sentences, the way a customer would type them — not just keywords. This shows you how the agent really reacts in practice, including booking and follow-up logic.
  • On an error, [error] … appears. If a request fails, the reply bubble shows an error message prefixed with [error], plus a brief pop-up notice. Simply send the message again; if the error recurs, check the model settings and budget.
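
The recall-first behaviour mentioned in the tips can be sketched in a few lines of Python. This is a minimal illustration under assumptions, not the actual search code: the function pick_snippet and the (text, distance) hit format are hypothetical, and whether the real threshold comparison is inclusive is not stated in this guide. Only the name kb_max_distance and its default of 0.5 are taken from the text.

```python
from typing import Optional

KB_MAX_DISTANCE = 0.5  # default of kb_max_distance according to this guide


def pick_snippet(
    hits: list[tuple[str, float]],
    max_distance: float = KB_MAX_DISTANCE,
) -> Optional[tuple[str, float, bool]]:
    """Return (text, distance, is_fallback) for the closest hit.

    A smaller distance means a better match. If even the closest hit lies
    above max_distance, it is still returned, flagged as a fallback.
    """
    if not hits:
        return None
    text, distance = min(hits, key=lambda hit: hit[1])
    return text, distance, distance > max_distance
```

For example, if the only hits are a footer snippet at distance 0.8 and a navigation snippet at 0.7, the navigation snippet is returned with is_fallback set to True. An answer built on such a snippet is the kind that feels "somehow fitting but not quite right".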