The Chat Playground
What the Playground is for
Three patterns:
- Answer-quality testing. Ask sample questions before launching a new bot or after changing knowledge content.
- Skill testing. Verify a skill (collect_lead, escalate_to_human) triggers when it should and stays quiet when it shouldn't.
- Prompt tuning. Adjust the system prompt and see immediate effect on tone and behavior.
Playground queries don't count against your monthly query quota. Use freely.
How to use it
- Open Chat Playground from the dashboard sidebar.
- Type a question in the chat input.
- Bot responds as it would on the widget.
- Click the response to see:
- Retrieved chunks (which knowledge sources the bot used).
- Skills triggered (if any).
- Latency breakdown (retrieval time, generation time).
- Token usage.
Use the same workspace where your production bot lives, or create a separate "staging" workspace for iteration.
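The per-response details above can be pictured as a single trace record. A minimal sketch, assuming a hypothetical field layout (these names are illustrative, not the product's actual schema):

```python
# Hypothetical shape of the per-response trace the playground surfaces.
# Field names are illustrative assumptions, not the real wire format.
trace = {
    "retrieved_chunks": [
        {"source": "pricing.md", "score": 0.91},
        {"source": "refunds.md", "score": 0.78},
    ],
    "skills_triggered": [],
    "latency_ms": {"retrieval": 120, "generation": 840},
    "token_usage": {"prompt": 1450, "completion": 210},
}

def total_latency_ms(trace: dict) -> int:
    """Sum the latency breakdown into an end-to-end figure."""
    return sum(trace["latency_ms"].values())

print(total_latency_ms(trace))  # 960
```

The retrieval/generation split is the useful part: a slow answer with fast generation points at retrieval, and vice versa.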
Compare models side-by-side
For Growth and above:
- Click "Compare Models".
- Pick 2 to 3 models from your available roster.
- Send the same question to all of them.
- See responses side-by-side with latency and citation comparison.
Useful when:
- Choosing a default model for your workspace.
- Validating an upgrade before flipping all conversations.
- Debugging quality regressions between model versions.
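The comparison flow above can be sketched as a small loop, assuming a hypothetical send(question, model) helper (stubbed here with canned answers; in the playground this is one click):

```python
# Sketch of sending the same question to several models and ranking
# the results. send() and the model names are assumptions for
# illustration only.
def send(question: str, model: str) -> tuple[str, int]:
    canned = {
        "model-a": ("Starts at $29/month.", 850),
        "model-b": ("Pricing begins at $29 per month.", 1200),
    }
    return canned[model]

def compare(question: str, models: list[str]) -> list[dict]:
    rows = []
    for m in models:
        answer, latency = send(question, m)
        rows.append({"model": m, "answer": answer, "latency_ms": latency})
    # Fastest model first, mirroring the side-by-side latency view.
    return sorted(rows, key=lambda r: r["latency_ms"])

results = compare("how much does it cost?", ["model-a", "model-b"])
print(results[0]["model"])  # model-a
```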
Retrieved chunks panel
For every answer, see which knowledge sources fed the response:
- Source document name and URL.
- Retrieved chunk text (the actual snippet that was passed to the LLM).
- Relevance score (0-1).
- Audience tag of the source.
Highest-relevance chunks at the top. If you see irrelevant chunks at the top, the retrieval is the issue, not the LLM. Tune chunking, add snippets, or refine the source content.
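The retrieval-vs-LLM diagnostic above amounts to one check on the ranked chunk list. A minimal sketch, with assumed field names and an arbitrary relevance threshold:

```python
# Illustrative check of the retrieved-chunks panel ordering: the
# highest-relevance chunk should sit on top. Field names and the 0.7
# threshold are assumptions.
chunks = [
    {"source": "faq.md", "score": 0.62, "audience": "all"},
    {"source": "pricing.md", "score": 0.91, "audience": "all"},
    {"source": "internal-notes.md", "score": 0.40, "audience": "staff"},
]

ranked = sorted(chunks, key=lambda c: c["score"], reverse=True)

def top_chunk_relevant(ranked: list[dict], threshold: float = 0.7) -> bool:
    """If this fails, fix retrieval (chunking, snippets, sources) first."""
    return ranked[0]["score"] >= threshold

print(ranked[0]["source"])          # pricing.md
print(top_chunk_relevant(ranked))   # True
```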
Skill trace panel
When a skill fires, the trace shows:
- Which skill (e.g., collect_lead).
- Trigger that matched (phrase or intent classifier).
- Inputs to the skill.
- Outputs returned.
- Policy checks (pass/fail).
- Final response back to the user.
Useful for debugging questions like "why did the bot ask for my email here?" or "why didn't the escalate skill fire when I said 'I need a human'?".
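A skill-trace record covering the fields above might look like the sketch below. The schema is a hypothetical illustration, not the product's actual format:

```python
# Hypothetical skill-trace record matching the fields listed above.
skill_trace = {
    "skill": "collect_lead",
    "trigger": {"kind": "intent", "label": "lead_capture"},
    "inputs": {"message": "Can someone follow up with me?"},
    "outputs": {"asked_for": "email"},
    "policy_checks": {"rate_limit": "pass", "identity_required": "pass"},
    "final_response": "Sure! What's the best email to reach you at?",
}

def policy_ok(trace: dict) -> bool:
    """True only if every policy check in the trace passed."""
    return all(v == "pass" for v in trace["policy_checks"].values())

print(policy_ok(skill_trace))  # True
```

A single failing policy check explains most "the skill matched but nothing happened" mysteries.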
Identity simulation
Test how the bot behaves for different visitor segments:
- Click "Simulate Visitor".
- Set: user_id, name, email, plan, custom attributes.
- Bot treats the playground session as if this visitor is identity-verified.
- Test audience-tag-gated content and identity-required skills.
Useful for verifying that Enterprise-only docs surface for Enterprise visitors and stay hidden for free-tier visitors. Without simulation, the playground defaults to anonymous.
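The gating behavior to verify can be sketched as a filter over sources by audience tag. The plan names and tag scheme here are assumptions for illustration:

```python
# Sketch of audience-tag gating under visitor simulation.
def visible_sources(sources: list[dict], visitor_plan: str) -> list[str]:
    """Return source names the simulated visitor is allowed to see."""
    allowed = {
        "anonymous": {"public"},
        "free": {"public"},
        "enterprise": {"public", "enterprise"},
    }[visitor_plan]
    return [s["name"] for s in sources if s["audience"] in allowed]

sources = [
    {"name": "getting-started.md", "audience": "public"},
    {"name": "sso-setup.md", "audience": "enterprise"},
]

print(visible_sources(sources, "enterprise"))  # both docs
print(visible_sources(sources, "free"))        # public only
```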
Channel simulation
Test how the bot would respond on different channels:
- Widget. Default. Full rich-card responses.
- WhatsApp. Plain text with light formatting.
- SMS. Plain text, no Markdown.
- Voice. Text spoken via TTS preview.
Pick under "Simulate Channel". The bot adapts its response format to the channel's constraints.
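The channel adaptation above can be sketched as one formatting function. The rules are simplified assumptions (e.g., real WhatsApp keeps some formatting):

```python
import re

# Minimal sketch of channel-dependent formatting: the same answer
# rendered under each channel's constraints.
def format_for_channel(answer_md: str, channel: str) -> str:
    if channel == "widget":
        return answer_md  # full Markdown / rich cards
    if channel in ("whatsapp", "sms"):
        # Strip bold/italic/code markers for plain-text delivery.
        return re.sub(r"[*_`]", "", answer_md)
    if channel == "voice":
        # TTS reads symbols poorly; drop Markdown markers entirely.
        return re.sub(r"[*_`#]", "", answer_md).strip()
    raise ValueError(f"unknown channel: {channel}")

print(format_for_channel("Plans start at **$29/mo**.", "sms"))
# Plans start at $29/mo.
```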
Prompt-tuning view
For advanced users:
- Click "Show System Prompt".
- See the assembled prompt sent to the LLM (workspace prompt + system instructions + retrieved context).
- Edit live under Settings > AI Config > System Prompt in another tab.
- Re-send the question to see the new prompt's effect.
Common tweaks: tone adjustments, response-length caps, language-specific instructions, brand voice rules.
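The assembly described above (workspace prompt + system instructions + retrieved context) can be pictured as simple concatenation. The exact join order and separators here are assumptions:

```python
# Sketch of how the assembled prompt could be composed. The layout is
# an assumption, not the product's real template.
def assemble_prompt(system_instructions: str, workspace_prompt: str,
                    retrieved_chunks: list[str]) -> str:
    context = "\n\n".join(retrieved_chunks)
    return (f"{system_instructions}\n\n"
            f"{workspace_prompt}\n\n"
            f"Context:\n{context}")

prompt = assemble_prompt(
    "You are a support assistant. Cite sources when possible.",
    "Answer in a friendly tone. Keep replies under 80 words.",
    ["Refunds are available within 30 days of purchase."],
)
print("Context:" in prompt)  # True
```

Seeing the assembled whole explains many surprises: a retrieved chunk can contradict or override a workspace instruction.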
Sample test scenarios
Patterns worth running before launch:
Happy path. Common customer questions ("how much does it cost?", "what's your refund policy?"). Confirm 95%+ accuracy.
Edge cases. Questions outside the knowledge base. Confirm the bot doesn't hallucinate and says "I don't know" cleanly.
Skill triggers. Phrases that should and shouldn't fire each skill. Tune trigger lists.
Multi-turn. Conversations spanning 5 to 10 turns. Verify context carries.
Hostile inputs. Prompt injection, off-topic, adversarial questions. Confirm the bot stays grounded and on-brand.
About 80% of pre-launch issues surface through 30 minutes of playground testing.
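The scenario categories above can be run as a tiny checklist script. A minimal sketch, assuming a hypothetical ask(question) stub in place of the real bot:

```python
# Illustrative pre-launch checklist runner. ask() is a stub standing in
# for the playground bot; the questions and answers are made up.
def ask(question: str) -> str:
    canned = {
        "how much does it cost?": "Plans start at $29/month.",
        "what's your refund policy?": "Refunds within 30 days.",
    }
    return canned.get(question, "I don't know.")

scenarios = [
    # (category, question, substring the answer must contain)
    ("happy_path", "how much does it cost?", "$29"),
    ("happy_path", "what's your refund policy?", "30 days"),
    ("edge_case", "what is the airspeed of a swallow?", "I don't know"),
]

failures = [(cat, q) for cat, q, expected in scenarios
            if expected not in ask(q)]
print(failures)  # []
```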
Saving playground sessions
For team review:
- Run a session.
- Click "Save Session".
- Name it.
- Share the link with teammates.
Saved sessions become testcases. Re-run them after every prompt change to verify no regressions.
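The regression re-run described above amounts to replaying each saved question and diffing against the approved answer. Both the session shape and the exact-match check are assumptions; a real check would tolerate harmless wording drift:

```python
# Sketch of using a saved session as a regression suite. replay() is a
# stub for the bot under the new prompt.
saved_session = [
    {"question": "what's your refund policy?",
     "approved_answer": "Refunds within 30 days."},
]

def replay(question: str) -> str:
    return "Refunds within 30 days."

def regressions(session: list[dict]) -> list[str]:
    """Questions whose replayed answer no longer matches the approval."""
    return [turn["question"] for turn in session
            if replay(turn["question"]) != turn["approved_answer"]]

print(regressions(saved_session))  # []
```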
Audit and rollback
Every playground session logs:
- Author and timestamp.
- Prompt used.
- Model used.
- Skills triggered.
If a prompt change broke something in production, the playground audit log helps identify the prompt diff and roll back.
Limits
- Daily playground queries. Unlimited (don't count against quota).
- Side-by-side comparison. Up to 3 models at once.
- Saved sessions per user. 100.
- Session retention. 90 days standard, 1 year Enterprise.
Common pitfalls
Playground bot behaves differently from production. Different workspace or different prompt version. Confirm you're testing in the same workspace.
Skills don't fire in playground. The skill requires a verified identity (e.g., subscription_manager). Use "Simulate Visitor" to provide one.
Slow first response. Cold workspace; vector index not warm. Second query is faster.
Citations don't show. Source content not indexed yet. Check Knowledge Hub for indexing status.
Planned features (on the roadmap)
Documented for accuracy:
- Automated regression testing. Today, manual playground runs. Planned: a "test suite" feature that runs saved sessions and flags answer drift.
- Replay production conversations. Today, ad-hoc questions. Planned: replay a real customer conversation through different model or prompt settings.
- Multi-turn batch testing. Today, one question at a time in comparison mode. Multi-turn batch comparison planned.
FAQ
Does the playground use real customer data?
No. Playground sessions are synthetic. Real customer conversations live in Live Chat and are isolated.
Can multiple team members share a playground session?
Yes via saved sessions. Share the link.
Will my playground prompts leak to production?
Only if you save them as the workspace's system prompt. Playground tweaks are local until you explicitly save.
Can I A/B test prompts in the playground?
Yes via side-by-side comparison. For production A/B testing, use two workspaces with different prompts and split traffic.
Does the playground charge against my LLM token budget?
No. Playground tokens are free (included in your plan). Production tokens count.
Related guides
- AI Agents page
- Knowledge Hub
- Skills overview
- Prompt engineering for support
- What is Retrieval-Augmented Generation (RAG)?