AI Agents · Retail · 60 locations · 450K customers · 10% held-out control from day one

An AI agent that reactivates lapsing customers across 60 locations and 23 service categories.

A LangGraph agent picks the channel, message, offer, and timing for every customer about to lapse. Brand voice is schema-validated. A 10% held-out control ran from day one, so the 3.5x ROI is measured against what would have happened without the agent — not against an optimistic baseline.

3.5× ROI vs. held-out control
18.7% Reactivation rate
192K Customer × category intervals
23 Service categories

Client

Regional dry-cleaning chain. 60+ locations. ~450K active customers. 23 service categories — from everyday laundry through wedding-dress preservation and leather restoration. Each category has its own buying cadence and its own definition of "lapsing."

Engagement

16-week build. Ongoing retainer for brand-voice tuning, new-category onboarding, and agent policy updates.

Blast-email retention didn't know who was actually lapsing.

The chain had been running quarterly blast emails with a flat 15% coupon. Open rates were 8%, redemption was 1.4%, and the offer was cannibalizing margin on customers who would have come back anyway. Nobody knew which cohort drove the redemption — or which cohort ignored it and churned the next quarter.

The real problem: "lapsing" meant something different for each of the 23 service categories. A laundry customer who hadn't visited in three weeks was lapsing. A wedding-gown customer who hadn't visited in three weeks was right on schedule. Reactivation needed to be scored per customer per category, and the intervention had to match the category — an SMS nudge about everyday laundry, an email about seasonal down-jacket cleaning, a WhatsApp follow-up after wedding-season wraps.

Off-the-shelf marketing automation couldn't do this. It could segment, but it couldn't reason — and it couldn't keep the brand voice consistent across tens of thousands of hand-tailored messages a month.

A LangGraph agent, schema-validated brand voice, and a held-out control from day one.

Four systems that stop a generic LLM from writing off-brand gibberish at scale.

Per-category lapse model in dbt. 192K customer × category intervals fitted in dbt — the empirical distribution of time between purchases for every customer in every category they buy. "Lapsing" is a quantile on that distribution, not a calendar rule. The model runs daily; every customer × category pair has a current lapse probability.
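The production model lives in dbt; the same quantile logic can be sketched in a few lines of Python. The function name and the three-purchase minimum below are illustrative, not the production thresholds:

```python
from bisect import bisect_left
from datetime import date

def lapse_probability(purchase_dates: list[date], today: date) -> float:
    """Empirical P(lapsing): the fraction of this customer's historical
    inter-purchase gaps that are shorter than the current gap.
    A value of 0.9 means the current wait is longer than 90% of past
    gaps in this category -- the customer looks like they are lapsing."""
    if len(purchase_dates) < 3:   # too little history to fit a distribution
        return 0.0
    dates = sorted(purchase_dates)
    gaps = sorted((b - a).days for a, b in zip(dates, dates[1:]))
    current_gap = (today - dates[-1]).days
    return bisect_left(gaps, current_gap) / len(gaps)
```

A weekly laundry customer who hasn't visited in a month scores near 1.0; a wedding-gown customer with one purchase scores 0.0 — exactly the "three weeks means different things" distinction from the problem statement.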

LangGraph agent chooses channel, message, offer, timing. For each lapsing customer, a LangGraph state machine runs: pick the channel (SMS / WhatsApp / email) from the customer's engagement history, pick the offer depth from unit economics (not everyone gets 15% off — some get 5%, some get a free add-on, some get nothing because they're about to come back anyway), draft the message, validate, schedule. A tool-using agent, not a one-shot prompt.

Brand voice as Pydantic schema. Every message passes through a validator that checks tone, forbidden phrases, required category-specific disclaimers, offer math, and length per channel. If the validator rejects, the agent retries with the validator's feedback. Over 40K messages a month go out; none have a typo or an off-brand phrase because none can.
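The production validator is a Pydantic model; the checking logic can be illustrated stdlib-only. The length limits, blacklist, and function name below are examples, not the real brand rules:

```python
import re

CHANNEL_MAX = {"sms": 160, "whatsapp": 1000, "email": 10_000}
FORBIDDEN = {"limited time only", "act now", "!!!"}   # example blacklist

def check_message(text: str, channel: str, offer_pct: int) -> list[str]:
    """Return a list of violations; an empty list means the message passes.
    The agent feeds these strings back to the drafter on retry."""
    errors = []
    if len(text) > CHANNEL_MAX[channel]:
        errors.append(f"over {CHANNEL_MAX[channel]} chars for {channel}")
    for phrase in FORBIDDEN:
        if phrase in text.lower():
            errors.append(f"forbidden phrase: {phrase!r}")
    # Offer math: every percentage in the copy must match the approved offer.
    for pct in re.findall(r"(\d+)\s*%", text):
        if int(pct) != offer_pct:
            errors.append(f"copy says {pct}% but approved offer is {offer_pct}%")
    return errors
```

The rejection messages double as retry feedback — which is why they are written as human-readable strings rather than booleans.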

10% held-out control from day one. 10% of the eligible population gets no message — same scoring, same segmentation, just no intervention. That's the control. ROI is measured against what those customers did. The 3.5x is a real lift, not a before-after number that hides seasonality and mean reversion.
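The "stable hash" split from the architecture diagram is a one-liner worth showing, because it is what makes the control cohort identical across daily runs. A minimal sketch (hash choice and function name are illustrative):

```python
import hashlib

def in_control(customer_id: str, holdout_pct: int = 10) -> bool:
    """Deterministic holdout: the same customer lands in the same arm on
    every run, on every machine -- no random seed to lose, no drift."""
    digest = hashlib.sha256(customer_id.encode()).hexdigest()
    return int(digest, 16) % 100 < holdout_pct
```

Because the split is a pure function of `customer_id`, the control cohort survives re-runs, backfills, and infrastructure changes — the property that makes the 3.5x defensible two years later.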

Architecture (daily run)

1. POS extract: Transactions from 60 locations → PostgreSQL staging
2. Lapse model: dbt → customer × category interval quantile → lapse probability
3. Control split: 10% held out by stable hash on customer_id — same cohort every day
4. Agent: LangGraph: channel · offer · message · timing (GPT-4o planner, Claude Sonnet drafter)
5. Validate: Pydantic brand-voice schema (tone, disclaimers, offer math, length)
6. Dispatch: n8n orchestrator → Twilio SMS · WhatsApp Business API · SendGrid email
7. Attribute: Redemption / return-visit joined back to message → Metabase dashboard
8. Observe: LangSmith traces · weekly lift report against held-out control
LangGraph · LangChain · GPT-4o · Claude 3.5 Sonnet · Pydantic · PostgreSQL · dbt · n8n · Twilio SMS · WhatsApp Business API · SendGrid · Python 3.12 · FastAPI · LangSmith · Metabase

3.5x ROI measured the hard way — against a real held-out control.

3.5×

ROI vs. held-out control

Incremental gross margin over the 10% control cohort, net of agent run cost + offer cost. The boring number, honestly measured.

18.7%

Reactivation rate

Across 23 service categories. Messaged customers who returned within their category's lapse window. Control: 9.2%.

40K+

Messages per month

Schema-validated. Every message passed brand-voice checks before dispatch. Zero typos, zero off-brand offers in production.

192K

Customer × category intervals

Fitted empirically. Every customer's "lapsing" is defined against their own history in that category — not a chain-wide calendar rule.

-62%

Margin leak on coupons

Vs. the old flat-15%-to-everyone blast. The agent only offers what's needed — many lapsing customers get nothing because they were going to come back anyway.

3

Channels, one agent

SMS / WhatsApp / email. Channel choice is per-customer, from their engagement history. No A/B test rotation — the agent picks.

The honest comparison — 10% held out from day one — is the thing that convinced us. We stopped running blast campaigns and we don't miss them. The agent knows who's actually lapsing and who's just quiet.

— Head of Marketing, Regional Dry-Cleaning Chain

Four decisions that separate an agent from an automation.

1. Lapse is a per-category quantile, not a calendar rule. Every generic retention tool has a "hasn't visited in 90 days" rule. For 23 categories with buying cadences from 7 days to 2 years, that's nonsense. Fit the interval distribution empirically; take a quantile; ship.

2. The agent decides the offer, not the marketing team. A customer who was going to come back anyway costs margin if you give them 15% off. The agent has unit economics in context and picks the offer depth that maximizes incremental gross margin, not redemptions.
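"Maximize incremental gross margin, not redemptions" has a concrete shape. A minimal sketch, with made-up response lifts and discount costs standing in for the real unit-economics tables:

```python
def best_offer(base_return_prob: float, ticket_margin: float,
               offers: dict[str, tuple[float, float]]) -> str:
    """Pick the offer maximizing incremental gross margin over no offer.
    offers maps name -> (lift in return probability, discount cost)."""
    baseline = base_return_prob * ticket_margin   # margin with no offer
    def incremental(item):
        name, (lift, cost) = item
        p = min(1.0, base_return_prob + lift)
        return p * (ticket_margin - cost) - baseline
    return max(offers.items(), key=incremental)[0]

# Illustrative numbers only: (return-probability lift, discount cost in $).
OFFERS = {"none": (0.00, 0.0), "5% off": (0.05, 2.0), "15% off": (0.12, 3.0)}
```

On these numbers, a customer with a 90% chance of returning anyway gets nothing (every discount destroys margin), while a 20%-likely customer gets the deep offer — which is the whole difference between optimizing margin and optimizing redemption counts.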

3. Brand voice is a validator, not a prompt. Telling an LLM "be on-brand" gets you on-brand 80% of the time. Rejecting every message that fails a Pydantic validator gets you on-brand 100% of the time, at the cost of some retries. For 40K messages a month, the second option is the only option.

4. Held-out control from day one. If you don't have a control from day one, you'll spend the next two years arguing with the CMO about whether the number is real. Hold out 10%, report lift weekly, sleep at night.
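"Report lift weekly" reduces to a two-proportion comparison against the control cohort. A stdlib-only sketch (the function name and return shape are illustrative; the z-test is a standard two-proportion test, not necessarily the exact statistic used in the weekly report):

```python
from math import sqrt, erf

def lift_report(treated_n: int, treated_returns: int,
                control_n: int, control_returns: int) -> tuple[float, float]:
    """Absolute lift vs the held-out control, plus a two-sided p-value
    from a pooled two-proportion z-test."""
    p_t = treated_returns / treated_n
    p_c = control_returns / control_n
    p_pool = (treated_returns + control_returns) / (treated_n + control_n)
    se = sqrt(p_pool * (1 - p_pool) * (1 / treated_n + 1 / control_n))
    z = (p_t - p_c) / se
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return p_t - p_c, p_value
```

Plugging in rates of the shape reported above (18.7% messaged vs 9.2% control) gives a lift around 9.5 points at vanishing p-values — the kind of number nobody argues with.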

Sixteen weeks to production.

We can build your reactivation agent in 12–16 weeks.

Bring us your POS data, your channel accounts, and your brand guide. We'll ship with a held-out control from day one — you'll know the real number.