Expert in the Loop · Part 5 of 6 · Explainer · For operations and engineering leads designing agent workflows

Where AI contributes and where judgment takes over

4 min read · Published 2026-04-09 · Updated 2026-07-12

Philosophy is cheap. Workflow design is where expert-in-the-loop discipline survives contact with Monday morning. Every operational process can be decomposed into four kinds of steps — generate, recommend, decide, act — and AI is strong in the first two only when the boundaries around the last two are explicit. This piece gives you that split, a worked invoice-dispute example, a swimlane you can sketch in one meeting, and a build-this-week exercise before anyone touches a model API.

#Generate, recommend, decide, act

Generate. Produce a candidate artifact — draft reply, summary, classification label, reconciliation diff. AI belongs here when inputs are frozen and output format is specified.

Recommend. Rank options, suggest next step, surface confidence and rationale. AI belongs here when recommendations are advisory — nothing ships on recommendation alone.

Decide. Commit to a course of action — approve adjustment, choose escalation tier, accept liability. Expert ownership unless the boundary is airtight and the cost of error is low.

Act. Execute in a downstream system — send email, post credit, close ticket, update ERP. Often automated after a logged decision; never silent on high-cost paths.

| Step type | AI default | Expert default | When AI can cross into decide/act |
| --- | --- | --- | --- |
| Generate | Yes | Review sample | Rarely, and only with a human spot-check |
| Recommend | Yes | Owns override | When a wrong recommendation cannot ship |
| Decide | No | Yes | Penny-level thresholds with audit |
| Act | No | Approves first | Idempotent, reversible, low-cost actions |

The mistake is treating a generate step as a decide step because the output "looks done." A polished draft is not a decision.

#Worked example: invoice dispute escalation

A B2B services company faces a dispute over a $24,000 invoice line: the delivery dates do not match the statement of work. The workflow below is design-stage, not build-stage. The same pattern carries over to a reviewer console later; here the control points are labeled first.

Generate. AI pulls contract clauses, delivery logs, and prior dispute notes; drafts a summary of mismatch and a proposed credit percentage.

Recommend. AI suggests route: standard adjustment (under $5k pattern), manager review ($5k–$25k), executive escalation (above $25k or strategic account). Flags confidence and missing fields.

Decide. Domain expert — the billing operations lead — confirms route, adjusts credit recommendation, or escalates to client success for relationship context. Strategic accounts always hit a control point regardless of dollar amount.

Act. After logged approval, system posts credit memo, notifies client success, updates CRM dispute status. Idempotent send — retries do not double-post.
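The recommend step above can be sketched as a pure function that returns advice and nothing else. This is a minimal sketch, not the article's implementation: the thresholds come from the text, while `recommend_route`, `RouteRecommendation`, the `REQUIRED_FIELDS` names, and the idea that confidence arrives as a number from the model are all assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from decimal import Decimal

STANDARD_LIMIT = Decimal("5000")   # under $5k: standard adjustment
MANAGER_LIMIT = Decimal("25000")   # $5k to $25k: manager review

# Hypothetical field names; use whatever your dispute record actually holds.
REQUIRED_FIELDS = ("contract_id", "delivery_log", "sow_dates")


@dataclass(frozen=True)
class RouteRecommendation:
    route: str
    confidence: float
    missing_fields: tuple[str, ...]
    advisory: bool = True  # a recommendation never executes anything


def recommend_route(
    amount: Decimal,
    strategic_account: bool,
    fields: dict[str, str],
    confidence: float,
) -> RouteRecommendation:
    """Suggest a route from the dollar bands in the worked example."""
    missing = tuple(name for name in REQUIRED_FIELDS if not fields.get(name))
    if strategic_account or amount > MANAGER_LIMIT:
        route = "executive_escalation"
    elif amount >= STANDARD_LIMIT:
        route = "manager_review"
    else:
        route = "standard_adjustment"
    return RouteRecommendation(route, confidence, missing)
```

Under these assumptions the $24,000 dispute lands in `manager_review`, and a strategic account lands in `executive_escalation` at any amount. The function has no side effects, so a wrong recommendation cannot ship on its own.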

The wrong automatic action here is not a bad paragraph. It is a credit memo on a strategic account or a wrong percentage applied without a human who knows the renewal is in sixty days.

flowchart TB
  subgraph ai_lane [AI lane]
    g1[Generate dispute summary]
    r1[Recommend route and credit range]
  end
  subgraph expert_lane [Expert lane]
    d1[Decide: approve, edit, or escalate]
    a1[Act: post memo after log]
  end
  intake[Dispute opened] --> g1 --> r1 --> d1 --> a1
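The act step in the worked example has two properties worth making concrete: it refuses to run without a logged approval, and a retry does not double-post. The sketch below is one way to express that, assuming an in-memory dictionary as a stand-in for the ERP and an idempotency key derived from the dispute ID (which assumes one credit memo per dispute). `Decision` and `CreditMemoPoster` are hypothetical names.

```python
from __future__ import annotations

from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Decision:
    dispute_id: str
    approved_by: str        # named expert, e.g. the billing operations lead
    credit_amount: Decimal
    logged: bool            # True only once the decision is in the log


class CreditMemoPoster:
    """In-memory stand-in for the downstream system (assumption)."""

    def __init__(self) -> None:
        self.posted: dict[str, Decimal] = {}

    def post(self, decision: Decision) -> str:
        # Control point: no logged, named approval means no action.
        if not decision.logged or not decision.approved_by:
            raise PermissionError("no logged approval; refusing to act")
        # Idempotency key: the same dispute never posts twice.
        key = f"credit-memo:{decision.dispute_id}"
        if key in self.posted:
            return "already_posted"
        self.posted[key] = decision.credit_amount
        return "posted"
```

Calling `post` twice with the same decision returns `"posted"` and then `"already_posted"`, leaving a single memo on record. A real system would persist the key alongside the memo in one transaction; that detail is out of scope for a design-stage sketch.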

#Checklist: name the hurt step

Before you automate, answer:

1. Can you name the step where a wrong automatic action hurts a customer, revenue, or compliance?
2. Does that step have a named expert and a log requirement?
3. If the model is wrong on generate, is anything prevented from shipping until decide?
4. If confidence drops, is there a routing rule, not hope, that sends the case to a queue?

If the first answer is "I am not sure," that uncertainty is the step to design first. It becomes a control point.
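The last checklist question asks for a routing rule rather than hope. One possible shape for such a rule is sketched below. The 0.80 confidence floor, the queue names, and the queue owners are placeholders chosen for illustration, not values from this series.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class Queue(Enum):
    EXPERT_REVIEW = "expert_review"  # normal path: the expert decides
    EXCEPTION = "exception"          # low confidence or incomplete input


CONFIDENCE_FLOOR = 0.80  # hypothetical threshold; tune per workflow

# Hypothetical owners: every queue has a named role, never "the team".
QUEUE_OWNER = {
    Queue.EXPERT_REVIEW: "billing operations lead",
    Queue.EXCEPTION: "billing operations manager",
}


@dataclass(frozen=True)
class Routing:
    queue: Queue
    owner: str
    reason: str


def route_case(confidence: float, missing_fields: tuple[str, ...] = ()) -> Routing:
    """Send every case to a queue; none of the branches leads straight to act."""
    if missing_fields:
        queue = Queue.EXCEPTION
        reason = "missing fields: " + ", ".join(missing_fields)
    elif confidence < CONFIDENCE_FLOOR:
        queue = Queue.EXCEPTION
        reason = f"confidence {confidence:.2f} below floor {CONFIDENCE_FLOOR:.2f}"
    else:
        queue = Queue.EXPERT_REVIEW
        reason = "ready for expert decision"
    return Routing(queue, QUEUE_OWNER[queue], reason)
```

The design choice to notice is that both outcomes are queues with owners. The rule decides who looks at the case, not whether anyone does.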

#Build-this-week exercise

Pick one workflow — dispute, escalation, onboarding exception, reconciliation mismatch. One hour, one page.

1. List every step from trigger to done.
2. Label each step G, R, D, or A.
3. Circle every D and A. Those are control points. For each one, name the expert role, as described in "The expert in the loop is the control point."
4. Draw the boundary: what must never act without a logged decision?
5. Stop. Do not call a model API until the diagram exists.

Share the one-pager with the person who would debug a bad outcome on a Friday afternoon. If they cannot follow it, the design is not ready.
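If it helps to check the one-pager mechanically, the exercise can be captured as plain data with no model call involved. The sketch below assumes a hypothetical `Step` record and a `design_gaps` helper; the step names reuse the invoice dispute example.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional

LABELS = {"G": "generate", "R": "recommend", "D": "decide", "A": "act"}
CONTROL_LABELS = ("D", "A")


@dataclass(frozen=True)
class Step:
    name: str
    label: str                    # one of G, R, D, A
    expert: Optional[str] = None  # named role; required on control points


def control_points(steps: list[Step]) -> list[Step]:
    """The circled steps: every decide and act."""
    return [s for s in steps if s.label in CONTROL_LABELS]


def design_gaps(steps: list[Step]) -> list[str]:
    """List what is still missing before the design is ready to share."""
    gaps = []
    for s in steps:
        if s.label not in LABELS:
            gaps.append(f"{s.name}: unknown label {s.label!r}")
        elif s.label in CONTROL_LABELS and not s.expert:
            gaps.append(f"{s.name}: control point has no named expert")
    return gaps


workflow = [
    Step("Draft dispute summary", "G"),
    Step("Suggest route and credit range", "R"),
    Step("Approve, edit, or escalate", "D", expert="billing operations lead"),
    Step("Post credit memo", "A"),  # deliberately left without an expert
]
```

Here `design_gaps(workflow)` reports that "Post credit memo" has no named expert, which is the kind of gap the exercise is meant to surface before any build work starts.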

#When control points need software

Sketches in meetings do not survive scale. Queues fill up. Routing rules drift. Audit questions arrive. The final article in this series covers the minimum stack (exception routing, named queue owner, input snapshot, action log) and hands off to production patterns. See "Where control points become software."
