
The 5 AI Automation Patterns Actually Working in Production

Prompt engineering is dead. Pattern engineering is the new skill. The 5 reusable workflows driving successful AI automation in March 2026, with code examples you can copy.

#AI #Automation #Workflows #Production #Patterns
3/5/2026 · 26 min read · MrSven

Six months ago a startup founder asked me to review his AI agent. It was a single prompt that tried to do everything: analyze customer emails, categorize requests, check billing data, generate responses, and handle escalations.

The prompt was 2,400 tokens long. It worked perfectly in his test cases. In production it failed 40% of the time.

I asked him why he built it that way. He said he heard that great prompts create great agents.

That advice was wrong in 2024. It is dangerous in 2026.

The companies shipping production AI automation are not writing better prompts. They are engineering better patterns.

March 2026 marks the shift from prompt engineering to pattern engineering. The teams winning are not the ones with the best prompts. They are the ones with the best reusable workflows.

Here are the 5 patterns actually working in production, with code you can copy.

Pattern 1: ReAct - Reason and Act in Small Steps

ReAct alternates between reasoning and action in small steps. Instead of one giant prompt that does everything, you chain together small focused steps. Each step observes the result before deciding what to do next.

This pattern is ideal for triaging requests, routing workflows, and handling multi-step investigations.

When to Use It

You need ReAct when:

  • The workflow has conditional branches
  • Decisions depend on intermediate results
  • You need to observe before acting
  • Error recovery matters

Real World Example: Customer Support Triage

Warmly uses ReAct for AI-powered sales outreach. When a prospect views a calendar but does not book, their agent analyzes inaction patterns, generates personalized follow-ups, and sequences them based on prospect behavior.

Here is how to build a ReAct workflow for customer support triage:

from openai import OpenAI
import json

client = OpenAI()

class ReActAgent:
    def __init__(self, max_steps=5):
        self.max_steps = max_steps

    def run(self, initial_request):
        state = {
            "step": 0,
            "request": initial_request,
            "observations": [],
            "actions_taken": [],
            "classification": None,
            "resolution": None
        }

        while state["step"] < self.max_steps:
            state["step"] += 1

            # Step 1: Reason about what to do
            thought = self._think(state)
            print(f"[Step {state['step']}] Thought: {thought}")

            # Step 2: Act based on reasoning
            action_result = self._act(thought, state)
            state["actions_taken"].append(action_result)
            print(f"[Step {state['step']}] Action: {action_result}")

            # Step 3: Observe the result
            observation = self._observe(action_result)
            state["observations"].append(observation)
            print(f"[Step {state['step']}] Observation: {observation}")

            # Step 4: Check if we are done
            if self._is_done(state):
                state["resolution"] = self._resolve(state)
                break

        return state

    def _think(self, state):
        prompt = f"""You are a customer support triage agent. Think about the next step.

Current Request: {state['request']}

Previous Steps: {len(state['observations'])} taken
Previous Actions: {state['actions_taken'][-3:] if state['actions_taken'] else 'None'}

Possible Actions:
1. CLASSIFY - Determine request type (billing, technical, compliance, other)
2. INVESTIGATE_BILLING - Check customer billing data
3. INVESTIGATE_TECHNICAL - Check technical logs and errors
4. CHECK_POLICY - Review relevant policies
5. RESOLVE - Generate final resolution

Output your thought as a single sentence explaining which action to take and why.
If we already have enough information to resolve, choose RESOLVE."""

        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": prompt}],
            temperature=0.3
        )

        return response.choices[0].message.content.strip()

    def _act(self, thought, state):
        thought_lower = thought.lower()

        if "classify" in thought_lower and not state["classification"]:
            action = {
                "type": "CLASSIFY",
                "result": self._classify_request(state["request"])
            }
            state["classification"] = action["result"]
            return action

        elif "billing" in thought_lower and state["classification"] == "billing":
            return {
                "type": "INVESTIGATE_BILLING",
                "result": self._get_billing_data(state["request"])
            }

        elif "technical" in thought_lower and state["classification"] == "technical":
            return {
                "type": "INVESTIGATE_TECHNICAL",
                "result": self._get_technical_data(state["request"])
            }

        elif "resolve" in thought_lower:
            return {
                "type": "RESOLVE",
                "result": "Ready to resolve"
            }

        else:
            return {
                "type": "UNKNOWN",
                "result": "No action taken"
            }

    def _observe(self, action):
        if action["type"] == "CLASSIFY":
            return f"Classified as: {action['result']}"

        elif action["type"] == "INVESTIGATE_BILLING":
            return f"Billing data retrieved: {json.dumps(action['result'])[:100]}..."

        elif action["type"] == "INVESTIGATE_TECHNICAL":
            return f"Technical data retrieved: {len(action['result'])} errors found"

        else:
            return "No new information"

    def _is_done(self, state):
        # Done if we have classification and investigation data
        has_classification = state["classification"] is not None
        has_investigation = any(
            "INVESTIGATE" in a["type"]
            for a in state["actions_taken"]
        )

        return has_classification and has_investigation

    def _resolve(self, state):
        prompt = f"""Based on the investigation, generate a resolution.

Request: {state['request']}
Classification: {state['classification']}
Actions Taken: {state['actions_taken']}
Observations: {state['observations']}

Generate a helpful, clear resolution for the customer."""

        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": prompt}],
            temperature=0.7
        )

        return response.choices[0].message.content

    def _classify_request(self, request):
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{
                "role": "system",
                "content": "Classify customer requests as: billing, technical, compliance, or other. Return only the classification word."
            }, {
                "role": "user",
                "content": request
            }],
            temperature=0.1
        )
        return response.choices[0].message.content.strip()

    def _get_billing_data(self, request):
        # In production, call your billing API here
        return {
            "status": "active",
            "plan": "pro",
            "amount_due": 0,
            "last_payment": "2026-03-01"
        }

    def _get_technical_data(self, request):
        # In production, call your error tracking system here
        return [
            {"error": "API timeout", "count": 3, "severity": "warning"},
            {"error": "Login failed", "count": 1, "severity": "error"}
        ]


# Usage
agent = ReActAgent(max_steps=5)

result = agent.run(
    initial_request="I was charged $99 but my account says I'm on the free plan"
)

print(f"\nFinal Resolution: {result['resolution']}")
print(f"Steps taken: {result['step']}")

Why ReAct Works

The magic of ReAct is in the observation step. Each action gives you new information. The next decision is based on what you actually observed, not what you guessed.

If the classification is wrong, you can course-correct in the next step. If billing data looks odd, you can investigate further. The workflow adapts to reality instead of assuming.

This is why companies using ReAct report 30-50% lower error rates compared to single-prompt approaches.

Production Considerations

  • Add timeout limits on each step
  • Implement circuit breakers for API failures
  • Log every thought, action, and observation
  • Track token usage per step
  • Cache investigation results when possible
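The first two bullets can be sketched as a reusable wrapper around any step method. This is a minimal illustration under assumed names (`resilient_step` and its parameters are not part of the agent above), not a production-grade circuit breaker:

```python
import functools
import time

def resilient_step(max_attempts=3, backoff_s=0.5, breaker_threshold=5):
    """Retry a single agent step with exponential backoff, plus a simple
    circuit breaker that trips after repeated consecutive failures."""
    state = {"consecutive_failures": 0}

    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            if state["consecutive_failures"] >= breaker_threshold:
                raise RuntimeError(f"circuit open for {fn.__name__}: escalate to a human")
            last_error = None
            for attempt in range(max_attempts):
                try:
                    result = fn(*args, **kwargs)
                    state["consecutive_failures"] = 0  # any success resets the breaker
                    return result
                except Exception as exc:
                    last_error = exc
                    time.sleep(backoff_s * (2 ** attempt))  # 0.5s, 1s, 2s, ...
            state["consecutive_failures"] += 1
            raise last_error
        return wrapper
    return decorator


# Example: a flaky billing lookup that times out twice before succeeding
calls = {"count": 0}

@resilient_step(max_attempts=3, backoff_s=0)
def investigate_billing(customer_id):
    calls["count"] += 1
    if calls["count"] < 3:
        raise TimeoutError("billing API timeout")
    return {"status": "active", "amount_due": 0}

print(investigate_billing("cust_12345"))  # succeeds on the third attempt
```

The same decorator can wrap `_get_billing_data` or `_get_technical_data`; the breaker state keeps one persistently failing dependency from burning retries on every request.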

Pattern 2: Plan-and-Execute - Separate Planning from Doing

Plan-and-Execute splits workflow into two phases. First, generate a detailed plan. Then, execute each step sequentially. If the plan fails, you can retry individual steps instead of the entire workflow.

This pattern is ideal for report generation, research tasks, and any work where the steps are predictable but the details vary.

When to Use It

You need Plan-and-Execute when:

  • The workflow has multiple steps that must execute in order
  • Each step can be executed independently
  • You need to track progress through a known sequence
  • Failures in individual steps should not restart everything

Real World Example: Compliance Audit Preparation

Legal and compliance teams use Plan-and-Execute for audit preparation. The system first outlines what evidence is needed for each control, then systematically retrieves it from systems, logs, and documentation.

Here is how to build a Plan-and-Execute workflow:

from openai import OpenAI
import json
from datetime import datetime

client = OpenAI()

class PlanAndExecuteAgent:
    def __init__(self):
        self.plan = []
        self.execution_log = []

    def run(self, objective):
        # Phase 1: Plan
        print("Planning phase...")
        self.plan = self._generate_plan(objective)
        print(f"Generated plan with {len(self.plan)} steps:")
        for i, step in enumerate(self.plan, 1):
            print(f"  {i}. {step['description']}")

        # Phase 2: Execute
        print("\nExecution phase...")
        results = []
        for i, step in enumerate(self.plan, 1):
            print(f"\nExecuting step {i}/{len(self.plan)}: {step['description']}")
            result = self._execute_step(step, objective)
            results.append({
                "step": i,
                "description": step["description"],
                "result": result,
                "status": "completed"
            })
            self.execution_log.append({
                "timestamp": datetime.utcnow().isoformat(),
                "step": i,
                "description": step["description"],
                "result_summary": str(result)[:200]
            })

        # Phase 3: Synthesize
        print("\nSynthesizing results...")
        final_output = self._synthesize(objective, self.plan, results)

        return {
            "objective": objective,
            "plan": self.plan,
            "execution": results,
            "final_output": final_output
        }

    def _generate_plan(self, objective):
        prompt = f"""Generate a detailed plan to accomplish this objective:

Objective: {objective}

Output the plan as a JSON object with a "steps" key containing an array. Each step must have:
- description: What this step does
- input_required: What data this step needs
- output_expected: What this step produces
- dependencies: Which previous steps this depends on (array of step indices, or empty)

Keep steps small and focused. 5-10 steps is typical.

Example format:
{{
  "steps": [
    {{
      "description": "Retrieve customer data",
      "input_required": "customer_id",
      "output_expected": "customer_profile",
      "dependencies": []
    }},
    {{
      "description": "Analyze usage patterns",
      "input_required": "customer_profile",
      "output_expected": "usage_summary",
      "dependencies": [0]
    }}
  ]
}}

Return ONLY the JSON object. No markdown formatting."""

        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": prompt}],
            temperature=0.3,
            response_format={"type": "json_object"}
        )

        result = json.loads(response.choices[0].message.content)
        return result.get("steps", result)

    def _execute_step(self, step, objective):
        step_type = self._classify_step_type(step["description"])

        if step_type == "retrieve":
            return self._retrieve_data(step, objective)
        elif step_type == "analyze":
            return self._analyze_data(step, objective)
        elif step_type == "transform":
            return self._transform_data(step, objective)
        elif step_type == "verify":
            return self._verify_data(step, objective)
        else:
            return self._generic_execute(step, objective)

    def _classify_step_type(self, description):
        description_lower = description.lower()

        if any(word in description_lower for word in ["retrieve", "fetch", "get", "load", "query"]):
            return "retrieve"
        elif any(word in description_lower for word in ["analyze", "calculate", "compute", "measure"]):
            return "analyze"
        elif any(word in description_lower for word in ["transform", "convert", "format", "process"]):
            return "transform"
        elif any(word in description_lower for word in ["verify", "validate", "check", "confirm"]):
            return "verify"
        else:
            return "generic"

    def _retrieve_data(self, step, objective):
        # In production, this would call actual APIs
        print(f"  Retrieving data for: {step['input_required']}")

        if "customer" in objective.lower():
            return {
                "customer_id": "cust_12345",
                "name": "John Doe",
                "plan": "Pro",
                "signup_date": "2025-06-15",
                "status": "active"
            }
        elif "billing" in objective.lower():
            return {
                "total_charged": 299.00,
                "payment_history": ["2025-12-01", "2026-01-01", "2026-02-01"],
                "invoices": [
                    {"id": "INV-001", "amount": 99.00, "status": "paid"},
                    {"id": "INV-002", "amount": 99.00, "status": "paid"},
                    {"id": "INV-003", "amount": 101.00, "status": "pending"}
                ]
            }
        else:
            return {"data": "Mock data retrieved"}

    def _analyze_data(self, step, objective):
        print(f"  Analyzing: {step['output_expected']}")

        prompt = f"""Analyze this data and provide the requested output.

Step Description: {step['description']}
Objective: {objective}

Available data would be here in production.

Provide a concise analysis output."""

        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": prompt}],
            temperature=0.5,
            max_tokens=300
        )

        return {
            "analysis": response.choices[0].message.content,
            "metrics": {"items_processed": 5, "anomalies_found": 0}
        }

    def _transform_data(self, step, objective):
        print(f"  Transforming data to: {step['output_expected']}")
        return {"transformed": True, "format": step["output_expected"]}

    def _verify_data(self, step, objective):
        print(f"  Verifying: {step['description']}")
        return {"verified": True, "issues_found": 0}

    def _generic_execute(self, step, objective):
        print(f"  Executing: {step['description']}")
        return {"status": "completed", "notes": "Generic execution"}

    def _synthesize(self, objective, plan, results):
        prompt = f"""Synthesize the execution results into a final output.

Objective: {objective}

Plan:
{json.dumps(plan, indent=2)}

Execution Results:
{json.dumps(results, indent=2)}

Create a comprehensive summary that:
1. States what was accomplished
2. Lists key findings
3. Notes any issues or gaps
4. Provides actionable next steps

Format as a clear, professional report."""

        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": prompt}],
            temperature=0.5
        )

        return response.choices[0].message.content


# Usage
agent = PlanAndExecuteAgent()

result = agent.run(
    objective="Prepare a customer billing analysis for customer ID CUST-12345, including total charges, payment history, and any outstanding issues"
)

print(f"\n{'='*60}")
print("FINAL OUTPUT")
print('='*60)
print(result["final_output"])

Why Plan-and-Execute Works

The separation of planning and execution gives you three advantages:

  1. Visibility: You see the entire plan before executing anything. You can review and adjust before spending money on API calls.

  2. Resilience: If step 4 fails, you can retry just step 4. The other steps are complete and can be reused.

  3. Optimization: You can cache plans for similar objectives. The same audit type can reuse the plan structure.

Companies using Plan-and-Execute report 40-60% faster retry times when workflows fail. The plan tells you exactly where to pick up.

Production Considerations

  • Cache plans for common objectives
  • Implement step-level timeout and retry logic
  • Store execution results for audit trails
  • Add verification steps after critical operations
  • Use smaller models for planning to save costs
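The step-level retry bullet is the heart of the pattern, so here is a minimal resumable executor sketch. It is a standalone illustration, not part of the class above; `execute_step` stands in for the agent's `_execute_step`, and the status bookkeeping is an assumed scheme:

```python
def execute_with_resume(plan, execute_step, prior_results=None):
    """Run plan steps in order, skipping steps already completed in a prior
    run, so a retry resumes at the first failed step instead of step 1."""
    results = dict(prior_results or {})
    for i, step in enumerate(plan):
        if results.get(i, {}).get("status") == "completed":
            continue  # reuse the earlier result, no repeat API spend
        try:
            output = execute_step(step)
            results[i] = {"status": "completed", "output": output}
        except Exception as exc:
            results[i] = {"status": "failed", "error": str(exc)}
            break  # stop here; a later retry picks up at this step
    return results


# Example: the analyze step fails on the first run, then the retry resumes from it
plan = [{"description": "retrieve"}, {"description": "analyze"}, {"description": "report"}]
attempts = {"analyze": 0}

def execute_step(step):
    if step["description"] == "analyze":
        attempts["analyze"] += 1
        if attempts["analyze"] == 1:
            raise TimeoutError("model call timed out")
    return f"done: {step['description']}"

first = execute_with_resume(plan, execute_step)
second = execute_with_resume(plan, execute_step, prior_results=first)
print(second[2])  # {'status': 'completed', 'output': 'done: report'}
```

Persisting `results` between runs (a database row keyed by plan ID works) is what makes the faster-retry claim real: the retrieve step never runs twice.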

Pattern 3: Router-Specialist - Route to Domain Experts

Router-Specialist uses a router agent to classify incoming requests and route them to specialist agents. Each specialist is optimized for a specific domain. The router coordinates handoffs between specialists when needed.

This pattern is ideal for customer support, sales triage, and any workflow requiring domain-specific expertise.

When to Use It

You need Router-Specialist when:

  • Requests fall into distinct categories
  • Each category requires specialized knowledge
  • Quality matters more than speed
  • You can afford multiple model calls per request

Real World Example: Enterprise Customer Support

Typewise's AI Supervisor Engine uses this pattern. A supervisor agent classifies requests and routes to billing, technical, or compliance specialists. Each specialist has deep knowledge of their domain and access to relevant tools.

Here is how to build a Router-Specialist workflow:

from openai import OpenAI
from typing import Dict
import json

client = OpenAI()

class SpecialistAgent:
    def __init__(self, domain, system_prompt, tools=None):
        self.domain = domain
        self.system_prompt = system_prompt
        self.tools = tools or []

    def handle(self, request, context=None):
        messages = [
            {"role": "system", "content": self.system_prompt},
            {"role": "user", "content": f"Request: {request}"}
        ]

        if context:
            messages.append({"role": "user", "content": f"Context: {json.dumps(context, indent=2)}"})

        response = client.chat.completions.create(
            model="gpt-4o",
            messages=messages,
            temperature=0.5
        )

        return {
            "specialist": self.domain,
            "response": response.choices[0].message.content,
            "confidence": self._estimate_confidence(response)
        }

    def _estimate_confidence(self, response):
        # In production, use actual confidence metrics
        return 0.85


class RouterAgent:
    def __init__(self, specialists: Dict[str, SpecialistAgent]):
        self.specialists = specialists

    def route(self, request):
        # Step 1: Classify the request
        classification = self._classify(request)

        # Step 2: Get the right specialist
        specialist = self.specialists.get(classification)

        if not specialist:
            return {
                "classification": classification,
                "response": "I am not equipped to handle this type of request. Please escalate to human support.",
                "specialist": "none"
            }

        # Step 3: Handle the request
        result = specialist.handle(request)

        # Step 4: Check if we need another specialist
        needs_handoff = self._check_handoff_needed(classification, result)

        if needs_handoff:
            handoff_result = self._handle_handoff(request, classification, result)
            return handoff_result

        return {
            "classification": classification,
            "response": result["response"],
            "specialist": specialist.domain,
            "confidence": result["confidence"]
        }

    def _classify(self, request):
        prompt = f"""Classify this customer request into exactly one category:

{', '.join(self.specialists.keys())}

Request: {request}

Return ONLY the category name. No explanation."""

        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": prompt}],
            temperature=0.1
        )

        classification = response.choices[0].message.content.strip()

        # Fallback if classification is invalid
        if classification not in self.specialists:
            return "general"

        return classification

    def _check_handoff_needed(self, classification, result):
        # Check if the specialist flagged that another domain is needed
        response_lower = result["response"].lower()

        handoff_keywords = {
            "billing": ["payment", "subscription", "invoice", "refund"],
            "technical": ["bug", "error", "integration", "api"],
            "compliance": ["policy", "regulation", "legal", "privacy"]
        }

        current_keywords = handoff_keywords.get(classification, [])

        # Check if the response mentions keywords from other domains
        for domain, keywords in handoff_keywords.items():
            if domain == classification:
                continue
            if any(kw in response_lower for kw in keywords):
                return True

        return False

    def _handle_handoff(self, request, original_classification, original_result):
        # Find which other specialist is needed
        original_response = original_result["response"].lower()

        # Simple heuristic - in production, use LLM to determine
        for domain, keywords in {
            "billing": ["payment", "subscription"],
            "technical": ["bug", "error"],
            "compliance": ["policy", "legal"]
        }.items():
            if domain == original_classification:
                continue
            if any(kw in original_response for kw in keywords):
                target_specialist = self.specialists.get(domain)
                if target_specialist:
                    # Handoff with context from first specialist
                    handoff_result = target_specialist.handle(
                        request,
                        context={
                            "previous_specialist": original_classification,
                            "previous_analysis": original_result["response"]
                        }
                    )

                    return {
                        "classification": f"{original_classification} -> {domain}",
                        "response": f"Original request was about {original_classification}. {handoff_result['response']}",
                        "specialist": f"{original_classification} and {domain}",
                        "handoff": True
                    }

        return {
            "classification": original_classification,
            "response": original_result["response"],
            "specialist": original_classification,
            "handoff": False
        }


# Define specialists
billing_specialist = SpecialistAgent(
    domain="billing",
    system_prompt="""You are a billing specialist with expertise in:
- Subscription plans and pricing
- Payment processing and invoicing
- Refunds and credits
- Account upgrades and downgrades

Always check actual billing data before making commitments. Be clear about what can and cannot be done."""
)

technical_specialist = SpecialistAgent(
    domain="technical",
    system_prompt="""You are a technical support specialist with expertise in:
- Debugging and troubleshooting
- API integrations and errors
- System status and outages
- Feature functionality

Ask clarifying questions when needed. Provide specific, actionable solutions."""
)

compliance_specialist = SpecialistAgent(
    domain="compliance",
    system_prompt="""You are a compliance specialist with expertise in:
- Company policies and procedures
- Data privacy and security
- Regulatory requirements
- Terms of service

Be precise. Cite specific policies when relevant. Flag potential issues clearly."""
)

# Create router
router = RouterAgent({
    "billing": billing_specialist,
    "technical": technical_specialist,
    "compliance": compliance_specialist
})

# Test with different requests
test_requests = [
    "I was charged twice this month but only saw one service period",
    "The API is returning 500 errors when I try to create invoices",
    "I need to understand how my data is stored and who has access to it"
]

print("=" * 80)
print("ROUTER-SPECIALIST DEMO")
print("=" * 80)

for i, request in enumerate(test_requests, 1):
    print(f"\n--- Request {i} ---")
    print(f"Input: {request}")
    print()

    result = router.route(request)

    print(f"Classification: {result['classification']}")
    print(f"Specialist: {result['specialist']}")
    print(f"Handoff: {result.get('handoff', False)}")
    print(f"\nResponse:\n{result['response']}")
    print()

Why Router-Specialist Works

Each specialist has a focused system prompt. They do not need to be good at everything. They need to be great at their domain.

The router handles the complexity of coordination. Specialists stay simple. This separation makes the system easier to maintain and improve.

Companies using Router-Specialist report 25-40% higher quality responses compared to generalist agents. The domain expertise shows through.

Production Considerations

  • Track classification accuracy and refine router
  • Implement fallback to human for low confidence
  • Log all handoffs for analysis
  • A/B test different specialist prompts
  • Use smaller models for routing to save costs
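The human-fallback bullet can be sketched as a thin wrapper around `router.route`. This is an assumed design, not part of the classes above: the 0.7 threshold and the `human_queue` target are illustrative placeholders.

```python
def route_with_fallback(route_fn, request, threshold=0.7):
    """Call the router, but escalate to a human queue when the specialist's
    self-reported confidence falls below the threshold."""
    result = route_fn(request)
    confidence = result.get("confidence", 0.0)  # escalation results may omit it
    if confidence < threshold:
        return {
            "handled_by": "human_queue",
            "classification": result.get("classification", "unknown"),
            "reason": f"confidence {confidence:.2f} below threshold {threshold}",
        }
    return {"handled_by": result["specialist"], "response": result["response"]}


# Example with a stubbed router instead of live API calls
def stub_route(request):
    return {"classification": "billing", "specialist": "billing",
            "response": "Refund issued.", "confidence": 0.55}

print(route_with_fallback(stub_route, "I was charged twice"))
# escalates, because 0.55 < 0.7
```

Note that the `_estimate_confidence` placeholder above always returns 0.85; wire in a real signal (log-probabilities, a verifier model, or historical accuracy per category) before trusting the threshold.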

Pattern 4: Planner-Critic-Executor - Add a Review Layer

Planner-Critic-Executor adds a critic between planning and execution. The planner generates a plan. The critic reviews and validates it. The executor only runs approved plans. This pattern catches errors before they happen.

This pattern is ideal for contract drafting, financial reporting, code review, and any high-stakes work where errors are expensive.

When to Use It

You need Planner-Critic-Executor when:

  • Errors in execution are expensive or dangerous
  • Plans need validation before action
  • Quality gates are critical
  • You can afford extra tokens for the critic step

Real World Example: Financial Report Generation

Financial services firms use this pattern for quarterly reporting. The planner drafts the report structure. The critic validates calculations, checks for regulatory compliance, and flags anomalies. The executor only generates the final report after approval.

Here is how to build a Planner-Critic-Executor workflow:

from openai import OpenAI
from typing import List, Dict, Any
import json

client = OpenAI()

class PlannerCriticExecutor:
    def __init__(self):
        self.planner_prompt = """You are a planning specialist. Generate detailed plans to accomplish objectives.
Break down work into clear, executable steps. Consider edge cases and dependencies.
Output plans as structured JSON."""

        self.critic_prompt = """You are a quality control specialist. Review plans for:
1. Logical consistency and feasibility
2. Completeness and edge case coverage
3. Risk identification and mitigation
4. Compliance with requirements

Be critical. Flag issues. Suggest improvements. Reject risky plans."""

        self.executor_prompt = """You are an execution specialist. Execute approved plans precisely.
Follow each step in order. Use the tools available to you.
Report results clearly and accurately."""

    def run(self, objective, requirements=None):
        print(f"\n{'='*80}")
        print(f"Objective: {objective}")
        print(f"{'='*80}\n")

        # Step 1: Plan
        print("Step 1: Generating plan...")
        plan = self._plan(objective, requirements)
        print(f"Plan generated with {len(plan['steps'])} steps\n")

        # Step 2: Critique
        print("Step 2: Critiquing plan...")
        critique = self._critique(objective, plan)
        print(f"Critique: {critique['status']}\n")

        # Step 3: Handle critique results
        if critique["status"] == "approved":
            print("Step 3: Executing approved plan...")
            results = self._execute(plan)
        elif critique["status"] == "needs_revision":
            print("Step 3: Revising plan based on critique...")
            revised_plan = self._revise_plan(objective, plan, critique)
            print("Step 4: Critiquing revised plan...")
            revised_critique = self._critique(objective, revised_plan)

            if revised_critique["status"] == "approved":
                print("Step 5: Executing revised plan...")
                results = self._execute(revised_plan)
            else:
                print("Plan still needs revision. Escalating to human review.")
                return {
                    "status": "escalated",
                    "plan": plan,
                    "critique": critique,
                    "revised_plan": revised_plan,
                    "revised_critique": revised_critique
                }
        else:
            print("Plan rejected. Escalating to human review.")
            return {
                "status": "rejected",
                "plan": plan,
                "critique": critique
            }

        # Verify execution results (step number varies by path taken above)
        print("\nVerifying execution results...")
        verification = self._verify(objective, plan, results)

        print(f"\n{'='*80}")
        print("COMPLETE")
        print(f"{'='*80}\n")

        return {
            "status": "complete",
            "plan": plan,
            "critique": critique,
            "results": results,
            "verification": verification
        }

    def _plan(self, objective, requirements):
        prompt = f"""Generate a detailed plan to accomplish this objective:

Objective: {objective}

{f'Requirements: {requirements}' if requirements else ''}

Output the plan as JSON with this structure:
{{
  "steps": [
    {{
      "step_number": 1,
      "description": "What this step does",
      "action": "specific action to take",
      "expected_output": "what this produces",
      "dependencies": [],
      "risks": ["potential risks to consider"]
    }}
  ]
}}

Be specific. Include 5-10 steps. Consider what could go wrong."""

        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "system", "content": self.planner_prompt},
                     {"role": "user", "content": prompt}],
            temperature=0.3,
            response_format={"type": "json_object"}
        )

        return json.loads(response.choices[0].message.content)

    def _critique(self, objective, plan):
        prompt = f"""Review this plan:

Objective: {objective}

Plan:
{json.dumps(plan, indent=2)}

Critique the plan for:
1. Logical flow and feasibility
2. Completeness (are steps missing?)
3. Edge cases and error handling
4. Compliance with the objective
5. Risk level

Output JSON:
{{
  "status": "approved" | "needs_revision" | "rejected",
  "issues": ["list specific issues"],
  "suggestions": ["concrete improvements"],
  "confidence_score": 0.0-1.0,
  "risk_level": "low" | "medium" | "high"
}}

Be thorough. If unsure, recommend revision."""

        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "system", "content": self.critic_prompt},
                     {"role": "user", "content": prompt}],
            temperature=0.3,
            response_format={"type": "json_object"}
        )

        return json.loads(response.choices[0].message.content)

    def _revise_plan(self, objective, plan, critique):
        prompt = f"""Revise this plan based on critique:

Objective: {objective}

Original Plan:
{json.dumps(plan, indent=2)}

Critique:
{json.dumps(critique, indent=2)}

Revise the plan to address all issues. Output in the same JSON format."""

        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "system", "content": self.planner_prompt},
                     {"role": "user", "content": prompt}],
            temperature=0.3,
            response_format={"type": "json_object"}
        )

        return json.loads(response.choices[0].message.content)

    def _execute(self, plan):
        results = []

        for step in plan["steps"]:
            print(f"  Executing step {step['step_number']}: {step['description']}")

            # In production, this would execute actual actions
            # For demo, we simulate execution
            result = self._execute_step(step)
            results.append({
                "step": step["step_number"],
                "description": step["description"],
                "result": result,
                "status": "completed"
            })

        return results

    def _execute_step(self, step):
        # Simulated execution - in production call actual APIs
        return {
            "output": f"Executed: {step['action']}",
            "success": True,
            "duration_ms": 150
        }

    def _verify(self, objective, plan, results):
        prompt = f"""Verify these execution results against the plan:

Objective: {objective}

Plan:
{json.dumps(plan, indent=2)}

Results:
{json.dumps(results, indent=2)}

Check:
1. Did all steps complete successfully?
2. Do results match expected outputs?
3. Was the objective accomplished?

Output JSON:
{{
  "verification_status": "passed" | "failed" | "partial",
  "issues_found": ["any problems"],
  "objective_achieved": true | false
}}"""

        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": prompt}],
            temperature=0.2,
            response_format={"type": "json_object"}
        )

        return json.loads(response.choices[0].message.content)


# Usage
pce = PlannerCriticExecutor()

# Example: Financial report generation
result = pce.run(
    objective="Generate Q1 2026 revenue report including MRR, churn, and expansion revenue",
    requirements="Must include exact calculations, source data references, and compliance notes"
)

print("\n" + "="*80)
print("SUMMARY")
print("="*80)
print(f"Status: {result['status']}")
if result["status"] == "complete":
    print(f"Plan Steps: {len(result['plan']['steps'])}")
    print(f"Verification: {result['verification']['verification_status']}")
    print(f"Objective Achieved: {result['verification']['objective_achieved']}")

### Why Planner-Critic-Executor Works

The critic catches the planner's mistakes before execution. For high-stakes work, this is invaluable.

Consider a financial report. The planner might miss a regulatory requirement. The critic catches it. The plan gets revised. Only then does execution proceed. The cost of an extra API call is nothing compared to the cost of a regulatory violation.

Companies using this pattern report 60-80% fewer critical errors in high-stakes workflows.

### Production Considerations

- Use larger models for the critic to ensure thoroughness
- Implement escalation if the critic rejects multiple times
- Track what types of issues the critic catches
- Consider domain-specific critics (legal critic, technical critic)
- Cache plans that pass critic review for reuse
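Two of those points, the escalation cap and plan caching, take only a few lines. This is a minimal sketch, not part of the class above: `critique_fn` and `revise_fn` stand in for the `_critique` and `_revise_plan` calls, and `MAX_REVISIONS` and `plan_cache` are illustrative names.

```python
import hashlib
import json

MAX_REVISIONS = 2  # escalate to a human after this many failed revisions
plan_cache = {}    # objective key -> approved plan

def cache_key(objective, requirements):
    # Stable key so approved plans can be reused across runs
    raw = json.dumps({"objective": objective, "requirements": requirements}, sort_keys=True)
    return hashlib.sha256(raw.encode()).hexdigest()

def review_with_escalation(objective, requirements, plan, critique_fn, revise_fn):
    key = cache_key(objective, requirements)
    if key in plan_cache:
        # Skip the critic entirely for plans that already passed review
        return {"status": "approved", "plan": plan_cache[key], "cached": True}

    for attempt in range(MAX_REVISIONS + 1):
        critique = critique_fn(objective, plan)
        if critique["status"] == "approved":
            plan_cache[key] = plan
            return {"status": "approved", "plan": plan, "cached": False}
        if critique["status"] == "rejected" or attempt == MAX_REVISIONS:
            return {"status": "escalated", "plan": plan, "critique": critique}
        plan = revise_fn(objective, plan, critique)
```

The cache lookup happens before any API call, so a cache hit costs nothing; the escalation cap bounds your worst-case token spend per objective.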

## Pattern 5: Reflection-Loop - Think, Act, Reflect, Improve

Reflection-Loop adds a reflection step after execution. The system reviews what happened, learns from the outcome, and updates its approach for next time. This is continuous improvement built into the workflow.

This pattern is ideal for learning systems, adaptive workflows, and any application where performance improves over time.

### When to Use It

You need Reflection-Loop when:

- Workflows repeat with similar inputs
- You want continuous improvement
- Feedback is available after execution
- You can store and retrieve past reflections

### Real World Example: Code Review Automation

GitHub Copilot and similar tools use reflection loops. The system reviews code, generates suggestions, learns from which suggestions are accepted or rejected, and improves over time.

Here is how to build a Reflection-Loop workflow:

from openai import OpenAI
from typing import Dict, List, Any
import json
from datetime import datetime
import sqlite3

client = OpenAI()

class ReflectionLoop:
    def __init__(self, db_path="reflections.db"):
        self.db_path = db_path
        self._init_db()

    def _init_db(self):
        conn = sqlite3.connect(self.db_path)
        cursor = conn.cursor()

        cursor.execute("""
            CREATE TABLE IF NOT EXISTS reflections (
                id INTEGER PRIMARY KEY AUTOINCREMENT,
                workflow_type TEXT,
                input_hash TEXT,
                action_taken TEXT,
                outcome TEXT,
                reflection TEXT,
                improvements TEXT,
                timestamp TEXT
            )
        """)

        conn.commit()
        conn.close()

    def run(self, workflow_type, input_data):
        print(f"\n{'='*80}")
        print(f"Workflow: {workflow_type}")
        print(f"{'='*80}\n")

        # Step 1: Check for past reflections
        print("Step 1: Checking past reflections...")
        past_reflections = self._get_reflections(workflow_type, input_data)
        context = self._build_context(past_reflections)
        print(f"Found {len(past_reflections)} relevant reflections\n")

        # Step 2: Execute action
        print("Step 2: Executing action...")
        action_result = self._execute(workflow_type, input_data, context)
        print(f"Action completed\n")

        # Step 3: Reflect on outcome
        print("Step 3: Reflecting on outcome...")
        reflection = self._reflect(workflow_type, input_data, action_result)
        print(f"Reflection: {reflection['quality']}\n")

        # Step 4: Extract improvements
        if reflection["quality"] == "good":
            print("Step 4: Extracting improvements...")
            improvements = self._extract_improvements(workflow_type, input_data, action_result)
            print(f"Found {len(improvements)} improvements\n")
        else:
            improvements = []

        # Step 5: Store for next time
        print("Step 5: Storing reflection...")
        self._store_reflection(
            workflow_type=workflow_type,
            input_data=input_data,
            action_taken=action_result,
            reflection=reflection,
            improvements=improvements
        )
        print("Reflection stored\n")

        return {
            "action_result": action_result,
            "reflection": reflection,
            "improvements": improvements
        }

    def _get_reflections(self, workflow_type, input_data):
        conn = sqlite3.connect(self.db_path)
        cursor = conn.cursor()

        # Recency-based lookup for now; in production, embed the input
        # and retrieve semantically similar reflections instead

        cursor.execute("""
            SELECT reflection, improvements
            FROM reflections
            WHERE workflow_type = ? AND timestamp > datetime('now', '-30 days')
            ORDER BY timestamp DESC
            LIMIT 5
        """, (workflow_type,))

        results = cursor.fetchall()
        conn.close()

        return [
            {
                "reflection": json.loads(row[0]),
                "improvements": json.loads(row[1])
            }
            for row in results
        ]

    def _build_context(self, reflections):
        if not reflections:
            return ""

        context_parts = []
        for i, ref in enumerate(reflections, 1):
            if ref["improvements"]:
                context_parts.append(f"Insight {i}: {', '.join(ref['improvements'])}")

        return "\n".join(context_parts)

    def _execute(self, workflow_type, input_data, context):
        if workflow_type == "code_review":
            return self._execute_code_review(input_data, context)
        elif workflow_type == "email_response":
            return self._execute_email_response(input_data, context)
        else:
            return {"status": "unknown workflow"}

    def _execute_code_review(self, input_data, context):
        code = input_data.get("code", "")
        context_note = f"\n\nPast insights to apply:\n{context}" if context else ""

        prompt = f"""Review this code for:
1. Security vulnerabilities
2. Performance issues
3. Code style and best practices
4. Edge cases and error handling

Code:
```python
{code}
```
{context_note}

Provide specific, actionable feedback."""

        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": prompt}],
            temperature=0.3
        )

        return {
            "type": "code_review",
            "feedback": response.choices[0].message.content,
            "issues_found": 3,  # Would parse from actual response
            "severity": "medium"
        }

    def _execute_email_response(self, input_data, context):
        email = input_data.get("email", "")
        context_note = f"\n\nPast insights to apply:\n{context}" if context else ""

        prompt = f"""Draft a response to this customer email.

Email: {email}
{context_note}

Requirements:
- Be helpful and empathetic
- Address all concerns
- Provide clear next steps"""

        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": prompt}],
            temperature=0.7
        )

        return {
            "type": "email_response",
            "draft": response.choices[0].message.content,
            "tone": "professional"
        }
    

    def _reflect(self, workflow_type, input_data, action_result):
        prompt = f"""Reflect on this workflow execution:

Workflow: {workflow_type}

Input: {json.dumps(input_data, indent=2)[:500]}

Action Result: {json.dumps(action_result, indent=2)[:500]}

Reflect on:
1. Was the action appropriate for the input?
2. Was the quality good or could it be improved?
3. What should be done differently next time?

Output JSON:
{{
  "quality": "good" | "acceptable" | "poor",
  "assessment": "brief assessment of what worked and what didn't",
  "what_to_keep": ["what to do again"],
  "what_to_change": ["what to improve"]
}}"""

        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": prompt}],
            temperature=0.3,
            response_format={"type": "json_object"}
        )

        return json.loads(response.choices[0].message.content)

    def _extract_improvements(self, workflow_type, input_data, action_result):
        prompt = f"""Extract specific improvements from this execution:

Workflow: {workflow_type}

Input: {str(input_data)[:300]}

Result: {str(action_result)[:300]}

Extract 3-5 specific improvements as concise bullet points. Each should be actionable and reusable.

Output JSON: {{"improvements": ["list of improvement strings"]}}"""

        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": prompt}],
            temperature=0.3,
            response_format={"type": "json_object"}
        )

        result = json.loads(response.choices[0].message.content)
        return result.get("improvements", [])

    def _store_reflection(self, workflow_type, input_data, action_taken, reflection, improvements):
        conn = sqlite3.connect(self.db_path)
        cursor = conn.cursor()

        cursor.execute("""
            INSERT INTO reflections (
                workflow_type, input_hash, action_taken, outcome,
                reflection, improvements, timestamp
            ) VALUES (?, ?, ?, ?, ?, ?, ?)
        """, (
            workflow_type,
            str(hash(str(input_data))),
            json.dumps(action_taken),
            reflection.get("quality", "unknown"),
            json.dumps(reflection),
            json.dumps(improvements),
            datetime.utcnow().isoformat()
        ))

        conn.commit()
        conn.close()

# Usage
reflection_loop = ReflectionLoop()

# First run - no past reflections
print("\n" + "="*80)
print("FIRST RUN - NO PAST REFLECTIONS")
print("="*80)

result1 = reflection_loop.run(
    workflow_type="code_review",
    input_data={
        "code": """
def process_user_data(user_id):
    query = f"SELECT * FROM users WHERE id = {user_id}"
    result = db.execute(query)
    return result
"""
    }
)

# Second run - the stored reflection from run one is now context
print("\n" + "="*80)
print("SECOND RUN - WITH REFLECTIONS")
print("="*80)

result2 = reflection_loop.run(
    workflow_type="code_review",
    input_data={
        "code": """
def process_user_data(user_id):
    query = f"SELECT * FROM users WHERE id = {user_id}"
    result = db.execute(query)
    return result
"""
    }
)

print("\n" + "="*80)
print("COMPARISON")
print("="*80)
print(f"First run quality: {result1['reflection']['quality']}")
print(f"Second run quality: {result2['reflection']['quality']}")
print(f"Improvements extracted: {len(result2['improvements'])}")


### Why Reflection-Loop Works

The system learns from every execution. Mistakes become insights. Successful patterns become rules.

Companies using reflection loops report 20-40% improvement in quality over time. The first thousand executions build the foundation. The next ten thousand benefit from it.

### Production Considerations

- Store reflections in a database, not in-memory
- Use embeddings for semantic search of past reflections
- Periodically review and curate high-quality reflections
- Set up alerts when quality degrades
- Consider human feedback loops for critical workflows
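The embedding suggestion above can replace the recency-only SQL query. This is a sketch of the retrieval side only; it assumes you have already embedded each stored reflection (for example with an embeddings API) into plain float lists, and `top_k_reflections` is an illustrative helper, not part of the class above.

```python
import math

def cosine(a, b):
    # Cosine similarity between two embedding vectors
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k_reflections(query_vec, stored, k=5):
    """stored: list of (embedding, reflection) pairs; returns the k most similar."""
    ranked = sorted(stored, key=lambda item: cosine(query_vec, item[0]), reverse=True)
    return [reflection for _, reflection in ranked[:k]]
```

At small scale a linear scan like this is fine; once you have tens of thousands of reflections, move the vectors into a proper index or vector store.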

## How to Choose the Right Pattern

Not every workflow needs every pattern. Here is a decision framework:

| Pattern | Best For | Complexity | Token Cost | Resilience |
|---------|----------|------------|------------|------------|
| ReAct | Conditional workflows, triage | Medium | Low-Medium | High |
| Plan-and-Execute | Multi-step tasks, research | Medium | Low | Very High |
| Router-Specialist | Domain-specific work, support | High | Medium-High | Medium |
| Planner-Critic-Executor | High-stakes, compliance | Very High | High | Very High |
| Reflection-Loop | Repeating workflows, learning | High | Medium-High | Medium |

### Start Simple

If you are just starting, begin with ReAct. It gives you the most value for the least complexity. Move up the ladder as you gain experience.

### Combine Patterns

The most sophisticated systems combine patterns. You might use Router-Specialist to route to a Planner-Critic-Executor, with Reflection-Loop learning from the results.

The key is to start simple and add complexity only when you need it.
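As a first pass, the decision table can be encoded directly. The boolean flags below are illustrative shorthand for the "Best For" column, and the priority ordering is one reasonable reading of the table, not a hard rule.

```python
def choose_pattern(high_stakes=False, domain_routing=False,
                   repeats_with_feedback=False, fixed_steps=False):
    # Most demanding requirement wins; ReAct is the simple default
    if high_stakes:
        return "Planner-Critic-Executor"
    if domain_routing:
        return "Router-Specialist"
    if repeats_with_feedback:
        return "Reflection-Loop"
    if fixed_steps:
        return "Plan-and-Execute"
    return "ReAct"
```

The ordering encodes priority: compliance and high-stakes needs trump everything else, and ReAct remains the starting point when no stronger requirement applies.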

## The Production Checklist

Before you ship any of these patterns to production:

1. **Add Observability**
   - Log every step, decision, and action
   - Track token usage per step
   - Monitor latency and error rates
   - Set up alerts for failures

2. **Implement Guardrails**
   - Timeout limits on each step
   - Retry logic with exponential backoff
   - Circuit breakers for API failures
   - Fallback to human for low confidence

3. **Measure Performance**
   - Success rate per workflow
   - Average latency
   - Cost per execution
   - User satisfaction scores

4. **Version Control Prompts**
   - Track prompt changes
   - A/B test new versions
   - Roll back if quality degrades
   - Document why changes were made

5. **Plan for Scale**
   - Cache results when possible
   - Use smaller models for routing
   - Batch independent requests
   - Optimize for your most common workflows
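The retry guardrail from item 2 is small enough to sketch inline. This is a generic wrapper, not tied to any specific SDK; `MAX_RETRIES` and the backoff base are illustrative defaults you should tune for your workload.

```python
import time
import random

MAX_RETRIES = 3

def with_backoff(call_step, *args, **kwargs):
    """Retry a flaky workflow step with exponential backoff plus jitter."""
    for attempt in range(MAX_RETRIES):
        try:
            return call_step(*args, **kwargs)
        except Exception:
            if attempt == MAX_RETRIES - 1:
                raise  # out of retries: surface the error or escalate to a human
            # Backoff of ~1s, ~2s, ~4s... plus jitter to avoid thundering herds
            time.sleep((2 ** attempt) + random.random())
```

Pair this with per-step timeouts (most API clients accept a `timeout` argument) and a circuit breaker that stops retrying when an upstream service is clearly down.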

## The Bottom Line

The era of prompt engineering is ending. The era of pattern engineering is here.

The companies shipping production AI automation are not writing perfect prompts. They are building reusable workflows that work consistently, handle errors gracefully, and improve over time.

The patterns I have shared here are not theoretical. They are extracted from production systems handling millions of requests every day.

Pick one pattern. Implement it. Measure the results.

Then do it again.

The future of AI automation belongs to teams who can engineer systems, not teams who can write prompts.

Build workflows that work. That is how you win.

---

**Want production templates for these patterns?** I have ready-to-use implementations for ReAct, Plan-and-Execute, and Router-Specialist that you can drop into your codebase. Reply "templates" and I will send them over.
