Function 03 / 07 · Perform

How your operation actually shows up. Across humans and AI.

Perform is the function nobody names but everyone feels. It's the difference between an operation that closes tickets and an operation that builds trust. The bar that defines what "good" looks like, every day, on every conversation. Human or AI.

SAMPLE READING · CALIBRATION-04 · APR 26 · v 02 · live

You're emerging on this one.

Composite: 63/100 · Emerging
Levels: 01 Reactive · 02 Emerging (now) · 03 Defined · 04 Optimised

Next move: Build a shared standard. You have a bar. Your team doesn't know what it is. Your AI was never told.
02

What Perform means.

Perform is the work of showing up consistently across an operation. Not the work of writing scripts. Not the work of monitoring dashboards. The work of agreeing what a good conversation sounds like, calibrating to that bar every week, and scoring every interaction against it. The team's interactions and the AI's, against one standard.

Most CX teams have a vague sense of "good." Senior agents carry it in their heads. The AI runs on a system prompt nobody calibrated. New hires absorb a third version through osmosis. Quality scoring catches the worst cases on the human side; the AI's vendor dashboard reports against its own metrics. Nobody can answer the question "what does great look like, today, for this customer's situation?" consistently across both populations.

That's the gap Perform names. The bar isn't written down. The standard isn't shared. The AI's behavior is calibrated by a vendor CSM the team has never met. The calibration cadence is monthly at best, and one-sided when it happens. Agents fly on instinct. Managers grade on instinct. The AI drifts on whatever the prompt last said. Quality drifts everywhere.

Haven's Perform module builds the shared standard first. Three to five named dimensions. Four named levels per dimension. Every interaction scored against it, human and AI alike, in real time. The bar becomes legible. New hires onboard against it. The AI is built on it. Senior agents teach against it. Quality stops drifting on either side.

The work isn't fancy. It's a craft skill that's been buried under "QA software" for ten years and split across two stacks for the last two. Haven names it, structures it, and holds it across both populations. That's the function.

03

The progression. Four levels.

Level 01 · You've passed
Reactive

"Good" lives in senior agents' heads, and in whatever the AI's system prompt last said. Quality is monitored after the fact on the human side. The AI's behavior is read by a vendor dashboard nobody verifies. New hires learn through ride-alongs. Calibration is monthly or quarterly, one-sided, often skipped.

  • No written standard
  • No shared definition of good
  • QA scoring exists but isn't trusted
  • AI runs on whatever the system prompt last said

Level 02 · Now · You are here
Emerging

A bar exists for humans, but it isn't shared. The AI is on its own. The lead has the bar. Some senior agents have it. New hires don't. The AI is calibrated by a vendor CSM the team has never met. Quality scoring catches obvious misses on the human side but doesn't read the AI at all.

  • Lead carries the bar
  • Coaching happens 1:1 on humans only
  • AI calibration owned by the vendor
  • Humans and AI drift in different directions; nobody reads them together

Level 03 · 2-3 months out
Defined

The bar is named, owned, and calibrated weekly across humans and AI. A shared standard. Three to five dimensions. Four levels each. Every interaction scored against it, whether the team handled it or the AI did. Every operator knows what good looks like, today.

  • Written standard, version-controlled
  • Weekly calibration across humans and AI
  • Shared with the team and the AI's prompt owner
  • Owner named

Level 04 · 12+ months out
Optimised

The standard evolves with the work and updates both populations. Calibration findings update the standard. The standard trains new hires and updates the AI's prompt structure. Quality drift on either side is detected before it shows up in CSAT.

  • Self-improving standard
  • Auto-onboarding from standard (human and AI)
  • Drift detection on both populations
  • Drift caught at the standard line, not at CSAT
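
Drift detection on both populations can be sketched in a few lines. This is a minimal illustration, not Haven's implementation: the function name `detect_drift`, the 0.5-level tolerance, and the sample scores are all assumptions. It compares each population's recent mean level against a baseline and reports only the populations that moved beyond the tolerance.

```python
from statistics import mean


def detect_drift(baseline, recent, tolerance=0.5):
    """Return {population: change in mean level} for populations that drifted.

    baseline and recent map a population name ("human", "ai") to a list of
    levels (1-4) scored against the shared standard. The tolerance of half
    a level is a hypothetical default, not a documented threshold.
    """
    drift = {}
    for population, base_levels in baseline.items():
        recent_levels = recent.get(population)
        if not base_levels or not recent_levels:
            continue  # nothing to compare for this population
        delta = mean(recent_levels) - mean(base_levels)
        if abs(delta) > tolerance:
            drift[population] = round(delta, 2)
    return drift


# Hypothetical scores: the human team holds steady, the AI slips.
baseline = {"human": [3, 3, 3, 3], "ai": [3, 3, 3, 3]}
recent = {"human": [3, 3, 3, 2], "ai": [2, 2, 3, 2]}
drifted = detect_drift(baseline, recent)
print(drifted)  # {'ai': -0.75}
```

Because both populations are scored against one standard, one comparison covers both. The AI's slip surfaces at the standard line rather than later in CSAT.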

04

What Perform builds.

Artifact 01

The shared standard

Three to five dimensions. Four levels per dimension. Calibrated weekly across both humans and AI. The highest-leverage artifact in the function.

  • 3–5 dimensions, 4 levels each
  • Calibrated examples per level
  • Scored against every interaction, human and AI
  • Linked to onboarding & Enable
~3 hours to first draft
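
One way to picture the shared standard is as a small data structure. The sketch below is an assumption-laden illustration, not Haven's schema: the class names `Standard` and `Score`, the `validate` helper, and the dimension names "accuracy", "tone", and "resolution" are all hypothetical. It enforces the two constraints stated above, three to five dimensions and four levels each, and scores human and AI conversations with the same shape.

```python
from dataclasses import dataclass

LEVEL_COUNT = 4  # four levels per dimension


@dataclass(frozen=True)
class Standard:
    version: str
    dimensions: tuple  # dimension names; three to five of them

    def __post_init__(self):
        if not 3 <= len(self.dimensions) <= 5:
            raise ValueError("a standard needs three to five dimensions")


@dataclass(frozen=True)
class Score:
    conversation_id: str
    handled_by: str  # "human" or "ai"; both use the same standard
    levels: dict     # dimension name -> level, 1 to LEVEL_COUNT


def validate(standard, score):
    """Return the score if it fits the standard, else raise ValueError."""
    if score.handled_by not in ("human", "ai"):
        raise ValueError("handled_by must be 'human' or 'ai'")
    if set(score.levels) != set(standard.dimensions):
        raise ValueError("score must cover exactly the standard's dimensions")
    for dimension, level in score.levels.items():
        if not 1 <= level <= LEVEL_COUNT:
            raise ValueError(f"{dimension}: level must be 1 to {LEVEL_COUNT}")
    return score


# Hypothetical standard and one AI-handled conversation scored against it.
standard = Standard(version="v02", dimensions=("accuracy", "tone", "resolution"))
ai_score = validate(
    standard,
    Score("c-101", "ai", {"accuracy": 3, "tone": 2, "resolution": 2}),
)
```

Keeping `handled_by` as a field on the score, rather than using separate rubrics, is what lets one standard read both populations.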

Artifact 02

The calibration ritual

A weekly 30-minute session where the team scores the same five conversations. Mix of human-handled and AI-handled. Findings update the standard and route to prompt changes where the AI is the source.

  • 5 conversations, mix of human and AI
  • Findings update the standard live
  • Disagreement log → coaching or prompt updates
  • Whole team participates; prompt owner included
30 min · weekly
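
The disagreement log from a calibration session can be derived mechanically once everyone has scored the same conversations. A minimal sketch, assuming ratings arrive as `(rater, conversation_id, dimension, level)` tuples; the function name `disagreement_log`, the rater names, and the one-level spread threshold are hypothetical choices, not part of the product.

```python
from collections import defaultdict


def disagreement_log(ratings, max_spread=1):
    """List (conversation_id, dimension, spread) where raters disagree.

    ratings is an iterable of (rater, conversation_id, dimension, level).
    An item is logged when the gap between the highest and lowest level
    given by any two raters exceeds max_spread (an assumed threshold).
    """
    levels_by_item = defaultdict(list)
    for _rater, conversation_id, dimension, level in ratings:
        levels_by_item[(conversation_id, dimension)].append(level)

    log = []
    for (conversation_id, dimension), levels in sorted(levels_by_item.items()):
        spread = max(levels) - min(levels)
        if spread > max_spread:
            log.append((conversation_id, dimension, spread))
    return log


# Two hypothetical raters score the same conversation on two dimensions.
ratings = [
    ("ana", "c-1", "tone", 2),
    ("ben", "c-1", "tone", 4),
    ("ana", "c-1", "accuracy", 3),
    ("ben", "c-1", "accuracy", 3),
]
log = disagreement_log(ratings)
print(log)  # [('c-1', 'tone', 2)]
```

Each logged entry is a prompt for the session: either the standard is ambiguous on that dimension, a rater needs coaching, or, for an AI-handled conversation, the prompt needs an update.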

Artifact 03

The onboarding ladder

Six-week ramp where new hires move from Level 01 to Level 02 against the standard, with named milestones. Reduces ramp time from 12 weeks to 6.

  • Six-week ramp, Level 01 → 02
  • Named milestones every two weeks
  • Live scoring of the cohort against the standard
  • Cuts ramp from 12 weeks to 6
6 weeks · per hire

05

See it cascade.

A Perform signal rarely stops at Perform. A QA flag on the team often traces to a knowledge gap the AI is hitting on the same intent, and the fix routes back through Enable. One root cause, not two performance conversations.