Common Agentic Workflow Patterns

Large Language Models (LLMs) have transformed how we tackle complex tasks, enabling the creation of agents that operate autonomously. Unlike AI agents, which act without a predefined sequence of steps, agentic workflows use LLMs within structured, task-oriented processes. Designing effective LLM workflows requires understanding a few key patterns. In this tutorial, we'll explore four essential workflow patterns: Prompt Chaining, Orchestrator-Workers, Evaluator-Optimizer, and Parallelization.

Each of these patterns addresses unique challenges in building AI-powered applications. Let's explore their implementation and best practices. The following diagram illustrates the common agentic workflow patterns.

Common Agentic Workflow Patterns: Prompt Chaining, Orchestrator-Workers, Evaluator-Optimizer, and Parallelization.

Prompt Chaining Pattern

In Prompt Chaining, a complex task is broken into smaller steps. As in a sequential workflow, each step's output serves as the input for the next prompt. This pattern is commonly used when LLM calls depend on each other and the task can be easily decomposed into smaller subtasks. It offers several benefits:

  • Reduces hallucinations by focusing on smaller tasks.
  • Enables reasoning across multiple steps.

How to Implement Prompt Chaining?

To implement Prompt Chaining, you should follow these steps:

  • Decompose the Task: Identify the steps required to complete the task.
  • Design Prompts: Create specific prompts for each step.
  • Chain Outputs: Use the output of one prompt as the input for the next.

The following code snippet illustrates how to implement the Prompt Chaining pattern using two chained calls to the OpenAI API.

from openai import OpenAI

client = OpenAI()

def get_key_topics(text):
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": f"Extract key topics from this text: {text}"}],
    )
    return response.choices[0].message.content

def summarize_topics(topics):
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": f"Summarize these topics: {topics}"}],
    )
    return response.choices[0].message.content

text = "Your document text here."
# Step 1 extracts topics; its output becomes the input of step 2.
topics = get_key_topics(text)
summary = summarize_topics(topics)
print(summary)

Evaluator-Optimizer Pattern

The Evaluator-Optimizer Pattern introduces a feedback loop where an evaluator assesses the quality of an LLM's response, and an optimizer refines it iteratively. LLM outputs are not always optimal on the first attempt. By introducing an evaluator, we ensure that outputs meet predefined standards.

How to Implement Evaluator-Optimizer?

  • Define Evaluation Criteria: Establish metrics or rules for assessing output quality.
  • Build the Evaluator: Create an agent that evaluates outputs against the criteria.
  • Design the Optimizer: Develop an agent that refines outputs based on evaluator feedback.
  • Iterate: Repeat the evaluation and optimization process until the desired quality is achieved.

The following code snippet illustrates how to implement the Evaluator-Optimizer pattern by iterating between code generation and a security check.

from openai import OpenAI

client = OpenAI()

def evaluate_code(code_snippet):
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "user", "content": f"Check code security and CVEs in this code: {code_snippet}"}
        ],
    )
    return response.choices[0].message.content

def generate_code(feature, feedback=None):
    content = f"Generate code for: {feature}"
    if feedback:
        content = (
            f"Generate code for: {feature}. Make sure that the code is secure "
            f"and does not contain these issues: {feedback}"
        )
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": content}],
    )
    return response.choices[0].message.content

feature = "My new feature"

generated_code = generate_code(feature)
feedback = evaluate_code(generated_code)
# If the evaluator reports no issues, end the workflow; otherwise run another iteration.
generated_code = generate_code(feature, feedback)
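
The snippet above runs a single iteration. A minimal sketch of the full loop, building on the functions defined above, could look like the following; the PASS verdict convention and the iteration cap are assumptions for illustration, not part of the original example.

MAX_ITERATIONS = 3  # assumed cap so the loop always terminates

def evaluate_with_verdict(code_snippet):
    # Hypothetical variant of evaluate_code that asks for an explicit verdict.
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": (
                "Check code security and CVEs in this code. "
                f"Reply with exactly PASS if there are no issues: {code_snippet}"
            ),
        }],
    )
    return response.choices[0].message.content

generated_code = generate_code(feature)
for _ in range(MAX_ITERATIONS):
    feedback = evaluate_with_verdict(generated_code)
    if feedback.strip() == "PASS":
        break  # evaluator found no issues; stop iterating
    generated_code = generate_code(feature, feedback)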

Parallelization Pattern

The Parallelization Pattern divides a large task into smaller, independent units that can be processed simultaneously. It reduces latency by running tasks concurrently and handles large workloads by distributing them across multiple agents. It can also be used to compare results from different prompts or models and keep the best one.

How to Implement Parallelization?

  • Identify Parallelizable Tasks: Determine which tasks can be executed independently.
  • Distribute Workload: Assign tasks to multiple agents.
  • Synchronize Results: Collect and combine outputs from parallel tasks.

The following code snippet illustrates how to implement the Parallelization pattern using Python's ThreadPoolExecutor.

from concurrent.futures import ThreadPoolExecutor
from openai import OpenAI

client = OpenAI()

def translate_text(segment):
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": f"Translate this to French: {segment}"}],
    )
    return response.choices[0].message.content

document_segments = ["Hello, how are you?", "This is an AI tutorial.", "LLMs are powerful tools."]

# Translate all segments concurrently; map preserves the input order.
with ThreadPoolExecutor() as executor:
    translations = list(executor.map(translate_text, document_segments))

print(translations)
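
As mentioned above, parallelization also covers the comparison use case. Here is a minimal sketch building on the snippet above; the model pair and the manual comparison step are illustrative choices, not a recommendation.

def ask_model(model_name, prompt):
    response = client.chat.completions.create(
        model=model_name,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

prompt = "Explain prompt chaining in one sentence."
models = ["gpt-4o", "gpt-4o-mini"]  # illustrative pair of models

# Query both models concurrently on the same prompt.
with ThreadPoolExecutor() as executor:
    answers = list(executor.map(lambda m: ask_model(m, prompt), models))

# Compare the candidates (here we just print them; a judge LLM could pick the best).
for model_name, answer in zip(models, answers):
    print(f"{model_name}: {answer}")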

Orchestrator-Workers Pattern

As shown in the diagram above, the Orchestrator-Workers Pattern has three components: the orchestrator, which divides a task into subtasks and assigns them to workers; the workers, which are essentially LLM calls; and the synthesizer, which collects the workers' outputs and combines them into the final result. This pattern is useful for tasks that are not easily decomposable in advance, where the orchestrator must determine the subtasks dynamically.

How to Implement Orchestrator-Workers?

  • Build the Orchestrator: Create a central agent that coordinates task assignment.
  • Build the Workers: Create independent task handlers (typically LLM calls).
  • Build the Synthesizer: Create a component that combines the workers' outputs into the final result.

Below is a code snippet that demonstrates a minimal implementation of this workflow pattern using Celery.

from fastapi import FastAPI
from celery import Celery

app = FastAPI()
celery = Celery('tasks', broker='redis://localhost:6379/0')

@celery.task
def orchestrator(message):
    # Simulated LLM call that decides which workers to run
    return {
        "search": "internet"
    }

@celery.task
def synthesizer(outputs):
    # Simulated LLM call that combines worker outputs into a final answer
    return "synthesized output content"

@celery.task
def internet_search(query):
    return {
        "iPhone16": "$1000"
    }

@celery.task
def document_search(query):
    return {
        "iPhone16": "$1100"
    }

@app.post("/chat")
def chat(message: str):
    task_assignments = orchestrator.delay(message).get()
    # Based on the task assignments, call the appropriate worker
    if task_assignments.get("search") == "internet":
        outputs = internet_search.delay(message).get()
    else:
        outputs = document_search.delay(message).get()
    output = synthesizer.delay(outputs).get()
    return output
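
The Celery example stubs out the LLM calls. As a rough sketch of what a real orchestrator could look like, the snippet below asks the model for a machine-readable task list and dispatches the workers defined above; the JSON shape and the worker registry are illustrative assumptions, not a fixed API.

import json
from openai import OpenAI

client = OpenAI()

def plan_subtasks(message):
    # Ask the LLM to decompose the request into worker assignments.
    response = client.chat.completions.create(
        model="gpt-4o",
        response_format={"type": "json_object"},
        messages=[{
            "role": "user",
            "content": (
                "Split this request into subtasks. Respond as JSON: "
                '{"subtasks": [{"worker": "internet_search" or "document_search", '
                '"query": "..."}]}. Request: ' + message
            ),
        }],
    )
    return json.loads(response.choices[0].message.content)["subtasks"]

# Illustrative registry reusing the Celery workers defined above;
# calling a Celery task as a plain function runs it locally.
workers = {"internet_search": internet_search, "document_search": document_search}

def run_workflow(message):
    outputs = [workers[task["worker"]](task["query"]) for task in plan_subtasks(message)]
    return synthesizer(outputs)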

Wrapping Up

In this tutorial, we explored four essential workflow patterns: Prompt Chaining, Orchestrator-Workers, Evaluator-Optimizer, and Parallelization. As these examples show, you don't always need complex workflows or frameworks to build agentic systems; basic LLM calls are often enough to implement the desired functionality. When you build an agentic system, start with small iterations, refine frequently, and don't hesitate to combine patterns for better results.