Mental health triage with conversational AI for a Colombian health insurer

The pain

Why the first contact was not working

· The first contact for an affiliate in psychological distress fell to the call center or a generic appointment. Someone calling with anxiety, in crisis, or grieving was answered by an operator or a general practitioner with no screening tool. The appointment came 8 days later, in the wrong channel, or never happened.
· There was no way to distinguish a mild case from an urgent one without human intervention. The affiliate who wrote "I don't want to be alive" received the same treatment as the one asking for a sleeping pill. The system could not read sentiment, prioritize, or tell when to escalate.
· The psychology team was saturated with cases that did not require a professional. The few consultation hours available filled up with first assessments that ended in "this resolves with self-care". The cases that did need a professional waited, or never arrived.

The value

What changed with the triage app

· A 24/7 front door with an LLM, without saturating the clinical team. The affiliate opens the app, talks to the LLM, and within minutes has a path. No call center, no 8-day wait for a first read.
· Semi-structured triage: validated script + free conversation + sentiment scoring. The LLM follows a script (PHQ-9 / GAD-7 / WHO-5 style questions) and switches to free conversation when the user needs to express themselves. The scoring combines the two signals.
· Ethical routing, with the critical case always in human hands. Low → self-check and resources. Medium → schedule with outpatient psychology or psychiatry. High → the system escalates immediately to the operator's crisis team. The AI does not decide on crisis.
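The three-tier routing described above can be sketched as a small lookup. This is a minimal illustration under stated assumptions, not the project's code: the `Category` enum, the `ROUTES` table, and the resource identifiers are hypothetical names chosen to mirror the tiers in the text.

```python
from enum import Enum


class Category(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


# Hypothetical resource identifiers, one per tier described in the text.
ROUTES = {
    Category.LOW: "self_check_and_resources",
    Category.MEDIUM: "outpatient_psychology_or_psychiatry",
    Category.HIGH: "human_crisis_team",
}


def route(category: Category) -> dict:
    """Return the destination for a triage category.

    HIGH is never resolved by the AI: it is flagged for immediate
    handoff to a human.
    """
    return {
        "destination": ROUTES[category],
        "human_handoff": category is Category.HIGH,
    }
```

Keeping the mapping in a plain table rather than in the prompt means the routing rules can be reviewed and audited without reading model output.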

Stack

Technologies we worked with

Flutter · Python 3 · FastAPI · PostgreSQL · LLM (OpenAI / Anthropic) · Sentiment scoring · Mobile first · Cloud

Figure: mental health triage chat screen. Conversational mental health triage: semi-structured script, sentiment scoring, ethical routing to the right resource.

The technical challenge

From an LLM chat to a triage instrument with a human bypass

The hard part was not wiring the LLM to a chat. The hard part was making the LLM follow a clinical script, evaluate sentiment without hallucinating, escalate correctly, and step aside when the case is critical. The three critical decisions were:

1. Triage system prompt with explicit limits. The LLM is not a therapist. Its role is to ask questions, record answers, and classify. It does not diagnose, does not prescribe, does not decide on crisis. The prompt defines its personality (empathetic, brief, validating) and its structured output (category + score + reason).
2. Sentiment scoring, not only answers to questions. Every user message passes through a classifier that evaluates tone, risk-related vocabulary, and consistency. The final score combines the questionnaire and the sentiment, with a weight the operator can configure.
3. Ethical routing with a human bypass for crisis. If at any moment the LLM detects crisis signals, it does not classify: it routes directly to the operator's human crisis team, with the conversation logged. The AI does not decide on crisis; a human takes over from the first message that shows crisis signals.
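Decisions 2 and 3 can be sketched together: a weighted blend of the two signals, with the crisis bypass checked before any classification happens. This is only an illustration under assumptions. The names `TriageOutcome`, `combined_score`, and `triage`, the default weight, and the two thresholds are all hypothetical; the source states only that the weight is configurable.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TriageOutcome:
    category: Optional[str]   # None when the crisis bypass fired
    score: Optional[float]    # None when the crisis bypass fired
    reason: str
    escalate_to_human: bool


def combined_score(questionnaire: float, sentiment: float,
                   sentiment_weight: float) -> float:
    """Weighted blend of two signals, each already normalized to [0, 1]."""
    if not 0.0 <= sentiment_weight <= 1.0:
        raise ValueError("sentiment_weight must be in [0, 1]")
    return (1.0 - sentiment_weight) * questionnaire + sentiment_weight * sentiment


def triage(questionnaire: float, sentiment: float, crisis_signal: bool,
           sentiment_weight: float = 0.4,   # assumed default
           medium_at: float = 0.35,         # assumed threshold
           high_at: float = 0.7) -> TriageOutcome:  # assumed threshold
    # Bypass first: on a crisis signal nothing is scored or classified.
    if crisis_signal:
        return TriageOutcome(None, None, "crisis signal detected", True)
    score = combined_score(questionnaire, sentiment, sentiment_weight)
    if score >= high_at:
        category = "high"
    elif score >= medium_at:
        category = "medium"
    else:
        category = "low"
    reason = f"combined score {score:.2f} (sentiment weight {sentiment_weight})"
    return TriageOutcome(category, score, reason, category == "high")
```

Checking `crisis_signal` before computing anything makes the bypass independent of the weight and thresholds, so a misconfigured score cannot suppress an escalation.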

┌──────────────────┐
│  Flutter app     │  ◄── iOS + Android, single codebase
│  LLM chat + UI   │      PHQ-9 / GAD-7 + free chat
└────────┬─────────┘
         │ User message
         ▼
┌──────────────────┐
│  FastAPI +       │  ◄── Triage system prompt
│  Orchestrator +  │      sentiment classifier
│  LLM + scoring   │      accumulated score
└────────┬─────────┘
         │ Category: low / medium / HIGH
         ▼
┌──────────────────┐
│  Ethical         │  ◄── low: self-check
│  routing         │      medium: psychology schedule
│  (with bypass)   │      HIGH: human crisis team
└──────────────────┘
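The per-message flow in the diagram can be sketched as a small session object: each message goes through a classifier, a crisis flag stops the scripted flow, and otherwise the risk accumulates. Everything here is an assumption for illustration. `Session`, `toy_classifier`, the keyword rules, and the use of a mean as the "accumulated score" are hypothetical; a real deployment would call a trained classifier rather than match keywords.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# A classifier maps one user message to (risk in [0, 1], crisis flag).
Classifier = Callable[[str], Tuple[float, bool]]


@dataclass
class Session:
    classify: Classifier
    risks: List[float] = field(default_factory=list)
    escalated: bool = False

    def handle(self, message: str) -> str:
        """Process one user message and return the next action."""
        if self.escalated:
            return "handoff"       # conversation already belongs to a human
        risk, crisis = self.classify(message)
        if crisis:
            self.escalated = True  # bypass: leave the scripted flow
            return "handoff"
        self.risks.append(risk)
        return "continue"

    @property
    def accumulated(self) -> float:
        """Mean per-message risk so far (0.0 before any message)."""
        return sum(self.risks) / len(self.risks) if self.risks else 0.0


def toy_classifier(message: str) -> Tuple[float, bool]:
    """Placeholder only: keyword matching stands in for a real model."""
    text = message.lower()
    crisis = "don't want to be alive" in text
    risk = 0.8 if "anxious" in text or "can't sleep" in text else 0.2
    return risk, crisis
```

A usage sketch: create `Session(toy_classifier)`, call `handle()` once per incoming message, and read `accumulated` when the script finishes. Once `handle()` returns `"handoff"`, every later call returns it too, so the AI cannot re-enter the conversation.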

Lessons learned

What we took away

01

An LLM without a clinical script is a chatbot. With a script, it is an instrument.

The difference is not in the model; it is in the system prompt, in the order of the questions, and in what the model is and is not allowed to say. The prompt is the tool, not the LLM.
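To make the point concrete, here is a hedged sketch of a constrained prompt paired with validation of the structured output (category + score + reason). The prompt wording, the JSON shape, and the `parse_triage_output` helper are illustrative assumptions, not the prompt or code used in the project.

```python
import json

# Illustrative wording only; not the prompt used in the project.
SYSTEM_PROMPT = """You are a triage assistant, not a therapist.
Ask the scripted questions in order, record the answers, and classify.
Do not diagnose, prescribe, or decide on crisis.
Tone: empathetic, brief, validating.
Reply with JSON: {"category": "low|medium|high", "score": 0-1, "reason": "..."}"""

ALLOWED_CATEGORIES = {"low", "medium", "high"}


def parse_triage_output(raw: str) -> dict:
    """Validate the model's structured output; raise ValueError if malformed."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError("output is not valid JSON") from exc
    if not isinstance(data, dict):
        raise ValueError("output must be a JSON object")
    category = data.get("category")
    if not isinstance(category, str) or category not in ALLOWED_CATEGORIES:
        raise ValueError("unknown category")
    score = data.get("score")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(score, bool) or not isinstance(score, (int, float)):
        raise ValueError("score must be a number")
    if not 0 <= score <= 1:
        raise ValueError("score must be in [0, 1]")
    reason = data.get("reason")
    if not isinstance(reason, str) or not reason.strip():
        raise ValueError("reason must be a non-empty string")
    return data
```

Rejecting anything outside the schema keeps the model's freedom limited to what the script allows: an answer that is not a valid category, score, and reason never reaches the routing step.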

02

Sentiment scoring does not replace the clinician — it protects them

The operator's psychology and psychiatry team moved from attending everything to attending the cases that actually need a professional. The AI does not compete with the clinician; it frees them from the cases that do not need them.

03

The human bypass for crisis is not a feature — it is a duty

If the AI detects high risk, it does not classify: it escalates. This is designed in from day one, not added later as an improvement. Any mental health AI system without an explicit human bypass is irresponsible.

"Before, when an affiliate called with psychological distress, the call center scheduled a generic appointment in 8 days, or scheduled nothing. Today the app receives them 24/7, does a first read with the LLM, and takes them to the right resource: self-help, a psychology consultation, or immediate activation of our crisis team. We went from not knowing which cases were urgent to having a clear and auditable path for each one."

Technology and operations team

Client health insurer · paraphrased testimonial


Is your case similar?

Got a health operator that still routes through a call center?

Let's talk about building a conversational AI front door with sentiment scoring and a human bypass that respects the ethical limits of mental health triage.


or email us at hola@tikal.com.co

