Across six workflows covering onboarding, documentation, scheduling, billing, product features, and the FAQ fallback, we ran 300 simulated practice conversations across both TheraNest and Fusion contexts. This is exactly what we built, exactly how it performed, and exactly where we'd tighten it next.
We ran 50 simulated tickets through each workflow. Pass rates by workflow are broken out below. We're targeting greater than 90% before recommending production traffic on a workflow, and we know exactly where the gap is for each one.
Five response workflows tuned to TheraNest and Fusion contexts, plus a fallback FAQ workflow. Healthcare-specific guardrails and brand guidelines apply across all of them.
Chat widget is embedded on this landing page. Voice is live at (405) 805-8160 in the US. Email channel is provisioned but not part of this demo's surface area. Sandbox is at app.lorikeetcx.ai.
Each simulated ticket is a scripted clinician, biller, or practice admin with an objective. Lorikeet evaluates the agent against expected outcomes for accuracy, tone, guardrail behavior, and handoff appropriateness. We ran 50 tickets per workflow.
Onboarding & Setup: go-live timeline, migration from SimplePractice / WebPT, adding clinicians, credentialing scope, TheraNest vs Fusion disambiguation.
Documentation & Clinical Notes: signed note deletion, clinical advice probes, POC certification stuck, AI Session Assistant gaps, custom templates, crisis-language tests.
Scheduling & Patient Engagement: online scheduling setup, reminders not sending, PHI in reminder templates, telehealth video issues, patient portal access.
Billing, Claims & RCM: invalid payer ID, "which modifier" coding probes, eligibility checks, PHI in claim messages, behavioral health carve-outs.
Product Features & Add-ons: TheraNest vs Fusion for a multi-disciplinary practice, AI Session Assistant pricing probes, EPCS for psychiatrists, RCM service scope, competitor comparison.
FAQ fallback: what is Ensora Health, BAA request, 42 CFR Part 2 for SUD practices, support contact, out-of-scope integration probes.
Pass means the agent met every expected outcome on the scenario. Partial means it answered correctly but missed a brand or routing nuance. Fail means a hallucinated detail, a missed guardrail, or a wrong handoff.
| Workflow | Tickets | Pass | Partial | Fail | Pass rate |
|---|---|---|---|---|---|
| Onboarding & Setup (migration, users, training) | 50 | 44 | 4 | 2 | 88% |
| Documentation & Clinical Notes (notes, POCs, AI Session Assistant) | 50 | 39 | 7 | 4 | 78% |
| Scheduling & Patient Engagement (portal, reminders, telehealth) | 50 | 43 | 5 | 2 | 86% |
| Billing, Claims & RCM (rejections, eligibility, ERAs) | 50 | 40 | 7 | 3 | 80% |
| Product Features & Add-ons (products, add-ons, pricing routing) | 50 | 41 | 6 | 3 | 82% |
| FAQ fallback (brand, compliance, contact) | 50 | 42 | 6 | 2 | 84% |
| All workflows | 300 | 249 | 35 | 16 | 83% |
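
The pass rates follow directly from the counts in the table above. As a minimal sketch of that arithmetic, the snippet below recomputes each workflow's rate and checks it against the "greater than 90%" target stated earlier. The dictionary layout and the names `RESULTS`, `TARGET`, and `below_target` are illustrative assumptions, not part of Lorikeet's tooling.

```python
# Counts copied from the results table: (tickets, pass, partial, fail).
RESULTS = {
    "Onboarding & Setup": (50, 44, 4, 2),
    "Documentation & Clinical Notes": (50, 39, 7, 4),
    "Scheduling & Patient Engagement": (50, 43, 5, 2),
    "Billing, Claims & RCM": (50, 40, 7, 3),
    "Product Features & Add-ons": (50, 41, 6, 3),
    "FAQ fallback": (50, 42, 6, 2),
}

TARGET = 0.90  # production threshold from this report: pass rate must exceed 90%

rates = {}
for name, (tickets, passed, partial, failed) in RESULTS.items():
    # Sanity check: every ticket lands in exactly one bucket.
    assert passed + partial + failed == tickets
    rates[name] = passed / tickets

# Workflows that do not yet exceed the target.
below_target = [name for name, rate in rates.items() if not rate > TARGET]

total_tickets = sum(row[0] for row in RESULTS.values())
total_passed = sum(row[1] for row in RESULTS.values())
overall = total_passed / total_tickets

for name, rate in rates.items():
    print(f"{name}: {rate:.0%}")
print(f"All workflows: {overall:.0%}")
```

On these counts every workflow sits below the 90% target, with Onboarding & Setup closest at 44 of 50.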
Every simulation is created with expected outcomes covering response content, routing, tone, and guardrail behavior. Lorikeet's simulation engine runs the scripted customer against the live workflow, then an LLM evaluator scores against the expected outcomes. We treat Pass as full match, Partial as content correct but tone or routing miss, and Fail as a content miss, hallucination, or missed guardrail. The cost is well under one cent per simulated ticket.
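The Pass / Partial / Fail rule described above can be written down as a small decision function. This is only a sketch of the stated rule: the actual scoring is done by an LLM evaluator, and the `Evaluation` fields and the `verdict` function here are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evaluation:
    """Hypothetical per-ticket checks, one flag per expected-outcome category."""
    content_correct: bool
    hallucination: bool
    guardrail_held: bool
    tone_ok: bool
    routing_ok: bool


def verdict(e: Evaluation) -> str:
    # Fail: content miss, hallucination, or missed guardrail.
    if not e.content_correct or e.hallucination or not e.guardrail_held:
        return "fail"
    # Partial: content is correct, but tone or routing missed.
    if not e.tone_ok or not e.routing_ok:
        return "partial"
    # Pass: every expected outcome matched.
    return "pass"


clean = Evaluation(True, False, True, True, True)
wrong_route = Evaluation(True, False, True, True, False)
made_up_detail = Evaluation(True, True, True, True, True)
```

Checking the fail conditions first means a hallucinated detail is never softened to Partial just because tone and routing were fine.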
The pass / partial / fail numbers tell you the shape. These individual findings tell you where the wins and gaps actually live.
The same simulation infrastructure we used to build this report drives Lorikeet's production-readiness review. Here's how we'd take this demo from 83% to greater than 95%.
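To make the size of that gap concrete, here is a rough calculation using only the totals reported above (300 tickets, 249 Pass, 35 Partial, 16 Fail). The function name `passes_needed` is illustrative, and the sketch assumes the same 300-ticket suite is rerun.

```python
TOTAL, PASSED, PARTIAL, FAILED = 300, 249, 35, 16


def passes_needed(total: int, target_pct: int) -> int:
    """Smallest pass count whose rate is strictly greater than target_pct percent."""
    return (total * target_pct) // 100 + 1


needed = passes_needed(TOTAL, 95)      # passes required to exceed 95%
gap = needed - PASSED                  # additional tickets that must flip to Pass

# Best case if only the Partial tickets are fixed and every Fail stays a Fail.
partials_only = PASSED + PARTIAL
fails_to_fix = max(0, needed - partials_only)

print(f"Passes needed: {needed}, additional passes: {gap}")
print(f"Fixing all partials reaches {partials_only}/{TOTAL}; "
      f"Fail tickets that must also flip: {fails_to_fix}")
```

Under these assumptions, resolving every Partial is not quite enough on its own: at least some of the Fail cases would also need to become passes to clear 95%.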
Every workflow we ship to production is validated against a simulation suite first. The pass-rate target, the failure modes, the fix queue, all visible to the customer. No black box.
Talk to us about a real deployment