A reference design for any service business that turns missed calls into booked appointments — and the six points along a single call where most builds leak the booking before it's made.
Every service business with a phone has the same leak: calls that come in when no one can pick up are bookings walking out the door. An AI voice agent catches them — if the path is built right. Here is that path.
The call goes unanswered and the carrier forwards it to a number the agent owns.
A cheapest-first spam gate drops junk before it ever reaches the model.
An exact phone lookup says new or returning — then one question confirms who's actually on the line.
A grounded dialogue fills service, time and contact — every price and slot pulled from a real source, never invented.
The appointment is written only after a confirmation clears — so the calendar holds real bookings, nothing else.
On a voice line, every call that reaches the agent costs real money — speech-to-text, the model, and text-to-speech run on each one, on the order of ~$0.09 a call and up. The architecture's job is to spend that compute only on callers who are plausibly real, and to protect the calendar at the very end. Get the sequence wrong and you pay for noise while losing real bookings.
In call order. Each one is invisible in a design document and obvious only after you've shipped a few of these. The principle is here; the tuned values and the implementation are the build.
A public phone number attracts robodialers and junk. Route every call straight into the model and they drain your budget and bury the real callers — you're paying per call to talk to no one.
PrincipleBlock by geography and number reputation at the carrier before you answer; drop autodialers with a one-step challenge before the model runs; cap call length; and gate the booking behind a code so junk can never reach the calendar.
"Is this a new or returning customer?" looks like a question for the AI. It isn't. Retrieval is semantic and fuzzy — point it at identity and it will eventually match the wrong record, adding latency and cost on every single call.
PrincipleIdentity is an exact phone-number lookup in your customer table — deterministic, instant, effectively free. Keep retrieval for free-form business knowledge only. It grounds what the agent says; it never decides who is calling.
The whole lookup assumes the forwarded call still carries the original caller's number. Depending on the carrier, a no-answer forward can instead present the business's own number — and then every caller looks identical and identity fails silently for everyone.
PrincipleVerify the forwarding behavior on the real line before you build anything on top of it — and design the unknown-number fallback regardless, because withheld and unrecognized numbers happen daily.
A matched number is not a verified person. Phones get shared in a household and numbers get recycled to strangers. Greet whoever answers as the account holder and you can hand one customer's appointment history to someone else — an awkward call and a real privacy gap.
PrincipleTreat a matched number as "likely this person — confirm," never "definitely." One light confirmation before revealing anything, and a small step-up only for sensitive actions. It costs a single line of dialogue and closes the gap without annoying regulars.
Most voice failures aren't model quality. They're a missed interruption, an unconfirmed detail, or a half-second of dead air that reads as "the line dropped." The caller hangs up — and a hang-up is a lost booking, full stop.
PrincipleLet callers talk over the agent and stop playback instantly. When the model is unsure, re-ask or offer options instead of guessing. Hold a latency budget so the line never goes quiet — and if it's blown, the agent acknowledges and keeps moving rather than going silent.
An agent that invents a price or an open slot books appointments that don't exist and quotes numbers you never set. And the final step quietly fails too: asking a caller to read a texted code back over the phone runs straight into mis-heard digits and a text that lands mid-call.
PrinciplePrices and open times come only from your tools — if a tool returns nothing, the agent says so rather than filling the gap. And confirm the booking in a way that survives a live phone call; reading a code back by voice is exactly where confirmations break, and there's a cleaner pattern.
This reference comes out of voice agents running in production, not a whiteboard. The numbers are from live systems.
No AI theater. Measurable production outcomes, or it doesn't ship.
What's above is the map. The spam thresholds, the retrieval layer that grounds every answer, the parts that have to survive a real phone line — that's where these systems are won or lost, and it's what we do. Bring your design or a blank page; we'll harden it into something that ships.
Book a 30-min architecture call →