Expert in the Loop · Part 2 of 6 · Explainer · Operations leaders introducing AI into existing workflows

AI doesn't fix bad thinking. It scales it.

5 min read · Published 2026-03-20 · Updated 2026-07-12

Teams introducing AI before they have clarified how they think are not failing at adoption. They are succeeding at amplification — of whatever was already there. Clear thinking becomes a multiplier. Vague thinking becomes noise at scale. This piece gives you four questions to run before any model touches a workflow, two failure stories you will recognize, and a ninety-minute exercise that surfaces where your operation is not ready for speed.

#Four questions to answer before AI touches the workflow

Before you wire a model into a process, your team should be able to answer these without hedging:

What problem are we actually solving? Not "we want AI for support" — the specific failure mode. Tickets sit unassigned for six hours. Disputes reopen because resolution notes are incomplete. Onboarding steps get skipped when volume spikes.

What does a good outcome look like? Observable, not aspirational. "Customer receives accurate resolution within SLA" beats "better customer experience." Name the fields, timestamps, or statuses that prove good.

Where do we need expert judgment vs automation? Every workflow has a boundary. Inside it, rules are clear enough to execute. Outside it, a person with context must decide. If you cannot draw that line, AI will draw it for you — incorrectly.

What tradeoffs are we willing to accept? Speed vs thoroughness. Automation rate vs escalation rate. Cost per case vs risk of a wrong automatic action. Unspoken tradeoffs become incidents.
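Two of these questions, what good looks like and where judgment is required, can be written down as checks rather than left as intentions. The following is a minimal sketch, assuming a hypothetical ticket record. Every field name, category, tier, and the confidence threshold is an illustrative assumption, not taken from any particular system.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


# Hypothetical ticket record; all field names are illustrative assumptions.
@dataclass
class Ticket:
    category: str
    account_tier: str
    model_confidence: float
    sla_deadline: datetime
    resolved_at: Optional[datetime] = None
    resolution_notes: str = ""


# Assumed judgment boundary: cases that always go to a person.
JUDGMENT_CATEGORIES = {"billing_dispute", "security_report"}
JUDGMENT_TIERS = {"vip"}
MIN_CONFIDENCE = 0.9  # assumed threshold; a tradeoff the team sets deliberately


def needs_expert(ticket: Ticket) -> bool:
    """Return True when the ticket falls outside the automation boundary."""
    return (
        ticket.category in JUDGMENT_CATEGORIES
        or ticket.account_tier in JUDGMENT_TIERS
        or ticket.model_confidence < MIN_CONFIDENCE
    )


def is_good_outcome(ticket: Ticket) -> bool:
    """Observable 'good': resolved within SLA, with resolution notes present."""
    return (
        ticket.resolved_at is not None
        and ticket.resolved_at <= ticket.sla_deadline
        and bool(ticket.resolution_notes.strip())
    )
```

The value of a sketch like this is less the code than the argument it forces: someone has to name the categories, the tiers, and the threshold out loud.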

If those answers are not clear, AI will not help. It will give you more to sort through. The teams getting real value are not moving faster because of the tool. They are moving faster because their thinking was already clear.

#Two ops stories where speed made things worse

The pilot with pretty output and the wrong decision. A fifty-person services firm piloted AI-generated project status summaries for client QBRs. The output read well. Stakeholders in the review meeting nodded. Two weeks later, a renewal slipped because the summary smoothed over a staffing gap the account lead would have flagged — but the lead was no longer in the loop on every draft. The model did not hallucinate. It optimized for readable prose on inputs that never included capacity constraints. The failure was not the model. It was the absence of a named expert evaluating whether the summary was safe to send.

Support triage with confident wrong routing. A team automated first-pass ticket classification. Accuracy looked acceptable in aggregate: eighty-seven percent. The thirteen percent that were wrong were not random. They clustered on escalations: billing disputes tagged as "how-to," security reports tagged as "billing," VIP clients routed to the junior queue because the model weighted keywords over account tier. Confidence was high on the wrong routes. Throughput increased. Decision quality did not.
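An aggregate accuracy figure can hide exactly this kind of clustering. The sketch below breaks accuracy down by true category. The counts are made-up illustrative data chosen to total eighty-seven percent, not results from the team in the story, and the category names are assumptions.

```python
from collections import defaultdict

# Made-up illustrative data: (true_category, predicted_category, count).
SAMPLE = [
    ("how_to", "how_to", 70),
    ("how_to", "billing", 1),
    ("billing", "billing", 14),
    ("billing_dispute", "billing_dispute", 2),
    ("billing_dispute", "how_to", 7),
    ("security_report", "security_report", 1),
    ("security_report", "billing", 5),
]


def overall_accuracy(rows):
    """Share of all tickets whose predicted category matches the true one."""
    total = sum(count for _, _, count in rows)
    hits = sum(count for true_cat, pred_cat, count in rows if true_cat == pred_cat)
    return hits / total


def accuracy_by_category(rows):
    """Accuracy computed separately for each true category."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for true_cat, pred_cat, count in rows:
        totals[true_cat] += count
        if true_cat == pred_cat:
            correct[true_cat] += count
    return {cat: correct[cat] / totals[cat] for cat in totals}


if __name__ == "__main__":
    print(f"overall: {overall_accuracy(SAMPLE):.2f}")
    for cat, acc in sorted(accuracy_by_category(SAMPLE).items()):
        print(f"{cat}: {acc:.2f}")
```

In this toy data the overall figure is 0.87, while billing disputes and security reports are classified correctly only about 22 and 17 percent of the time. The headline number is dominated by the high-volume, low-risk category.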

Both stories share a root cause: AI was added to workflows where problem definition, success criteria, and judgment boundaries were implicit. Speed exposed the gap faster than spreadsheets ever did.

#Noise at scale

Vague thinking does not stay vague when you add AI. It multiplies.

| Vague input | What AI produces | What the operation experiences |
| --- | --- | --- |
| Unclear problem definition | Plausible outputs on the wrong task | Activity without progress |
| No shared success criteria | Polished text that fails review unpredictably | Rework loops, trust erosion |
| Implicit judgment boundaries | Automation crossing lines it should not | Incidents, escalations, rollbacks |
| Unspoken tradeoffs | Optimized for the wrong variable | Fast wrong answers |

This is why tool-first strategies stall. A copilot license does not create a decision framework. An agent framework does not name who owns the outcome. You get more artifacts — summaries, drafts, classifications — without better decisions.

This is the mechanism at work when your process layer is broken. In the Readiness Stack, Layer 2 (Process) asks whether a smart outsider could follow written rules and get the right answer. When that layer fails, AI does not fix it. It scales the inconsistency with confidence.

#Ninety-minute exercise: the four-question test

Pick one workflow that hurts — escalations, disputes, onboarding handoffs, whatever leadership complained about last week. Gather the person who runs it and one person who receives its output. Ninety minutes, whiteboard or shared doc.

Work through the four questions in order. Write answers in plain language. When someone says "it depends," capture the dependency as a rule or flag it as a judgment point. When two people define "done" differently, stop — that is the finding.

Note where the room goes silent. Silence on "what does good look like" means you are not ready for automation. Silence on "where judgment matters" means your next incident is already scheduled.

At minute ninety, you should have one of two outcomes: clear enough answers to sketch control points, or a short list of alignment work that must happen first. Both are wins. Only one should proceed to a model API.
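If the session is captured in a shared doc, the two outcomes can also be recorded in a small structure. This is a minimal sketch under stated assumptions: the question keys, the `WorkshopNotes` name, and the readiness rule (every question answered, no open items) are hypothetical conventions, not part of the exercise as described.

```python
from dataclasses import dataclass, field

# Assumed short keys for the four questions, in the order they are asked.
QUESTIONS = ("problem", "good_outcome", "judgment_boundary", "tradeoffs")


@dataclass
class WorkshopNotes:
    workflow: str
    # question key -> agreed plain-language answer
    answers: dict = field(default_factory=dict)
    # disagreements, silences, and unresolved "it depends" items
    open_items: list = field(default_factory=list)

    def unanswered(self):
        """Questions with no agreed answer recorded."""
        return [q for q in QUESTIONS if not self.answers.get(q, "").strip()]

    def ready_for_control_points(self) -> bool:
        """True only when every question is answered and nothing is left open."""
        return not self.unanswered() and not self.open_items


notes = WorkshopNotes(workflow="dispute handling")
notes.answers["problem"] = "Disputes reopen because resolution notes are incomplete."
notes.open_items.append("Two definitions of 'done' between support and finance.")
print(notes.unanswered())
print(notes.ready_for_control_points())
```

Whatever lands in `open_items` is the alignment work list: it is the output of the session, not a sign the session failed.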

#Make thinking visible next

Clarity is not a one-time workshop. It has to live in artifacts the team actually uses — not documentation theater. The next piece shifts the question from "How do we use AI?" to "Is our thinking clear enough for AI?" See Make thinking visible before you automate.
