One Small Change That Makes OpenAI Work Perfectly in ADK Parallel Pipelines

ADK `ParallelAgent` behaves differently across models: Google models handle sub-agents cleanly, while non-Google models, when invoked through LiteLLM, risk interleaved messages and broken outputs. Base image from ADK Parallel Agents documentation.

If you’re following Google ADK’s ParallelAgent documentation to build a multi-agent system with non-Google models like OpenAI (via LiteLLM or custom adapters), be warned: the standard approach can lead to serious headaches.

Conversation histories often get interleaved, message ordering can break, random API errors can occur, and subtle race conditions can surface during parallel execution.
🔺 Even if the workflow runs perfectly with Google-native models, non-Google LLMs need careful handling to avoid unreliable results.

In this article, we’ll explain why this happens, how ADK manages context internally, and why Google models handle parallelism seamlessly while OpenAI-based agents often fail.

Most importantly, we’ll show a clean, practical workaround using a custom GPTParallelAgent that isolates context for each sub-agent.

You can copy and paste the implementation directly into your multi-agent pipeline. 😉

⚠️ The Root Problem: Shared Context in Parallel Execution

In Google ADK, agents run using a shared ctx (context) object that stores the session state, conversation history, memory, tool bindings, and intermediate execution state. When multiple sub-agents run in parallel, ADK passes this same context instance to each of them.

With Gemini models, this works because Google’s infrastructure safely handles concurrent mutations. However, with OpenAI or LiteLLM models, it doesn’t.

If multiple agents append to the same conversation history at the same time, they concurrently mutate shared state, leading to mixed turns, out-of-order messages, and, sometimes, malformed payload errors from OpenAI. These failures are often subtle and non-deterministic, making them difficult to debug.

The problem isn’t parallelism itself — it’s shared mutable state.
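The failure mode is easy to reproduce with plain asyncio, independent of ADK. In this hypothetical sketch (the `worker` coroutine and the list-based "history" are stand-ins, not ADK APIs), two workers append their turns to one shared history; because the event loop interleaves them at every `await`, the turns come out mixed:

```python
import asyncio

async def worker(name: str, history: list):
    # Each worker writes a request turn, yields to the event loop
    # (simulating an LLM API round-trip), then writes its response.
    history.append(f"{name}: request")
    await asyncio.sleep(0)  # stand-in for a network call
    history.append(f"{name}: response")

async def main():
    shared_history = []
    # Both workers mutate the SAME list concurrently.
    await asyncio.gather(
        worker("agent_a", shared_history),
        worker("agent_b", shared_history),
    )
    return shared_history

print(asyncio.run(main()))
# → ['agent_a: request', 'agent_b: request',
#    'agent_a: response', 'agent_b: response']
```

The request and response turns of the two agents are interleaved rather than grouped. A real message history in this shape is exactly what produces out-of-order turns and malformed payloads when sent to the OpenAI API.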

🛠️ The Fix: Context Isolation Per Agent

The solution is simple:

Give every parallel agent its own copy of the context.

Instead of:

agent.run_async(ctx)

We do:

local_ctx = copy.copy(ctx)
agent.run_async(local_ctx)

Each agent runs in an isolated “branch” of history. No interleaving. No corruption.
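The same toy setup behaves correctly once each worker gets its own branch of the history. One detail worth noting: `copy.copy` is a shallow copy, so any mutable field such as a history list must itself be cloned for the branch to be truly private. The hypothetical `Context` dataclass below (a stand-in for ADK's invocation context, not its real API) makes that explicit via `__copy__`:

```python
import asyncio
import copy
from dataclasses import dataclass, field

@dataclass
class Context:
    # Hypothetical stand-in for ADK's invocation context.
    history: list = field(default_factory=list)

    def __copy__(self):
        # Branch the history so a clone cannot mutate the original.
        # (A plain shallow copy would still share the same list object.)
        return Context(history=list(self.history))

async def worker(name: str, ctx: Context):
    ctx.history.append(f"{name}: request")
    await asyncio.sleep(0)  # stand-in for an LLM call
    ctx.history.append(f"{name}: response")
    return ctx.history

async def main():
    shared = Context(history=["user: plan my trip"])
    branches = await asyncio.gather(
        *(worker(n, copy.copy(shared)) for n in ("agent_a", "agent_b"))
    )
    return shared.history, branches

original, branches = asyncio.run(main())
print(original)  # untouched: ['user: plan my trip']
print(branches)  # each branch holds only its own agent's turns
```

Each branch now contains the shared prefix plus one agent's coherent request/response pair, and the original history is never touched during parallel execution.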

🙌 The Workaround: GPTParallelAgent

Here is a production-ready drop-in replacement for ADK’s parallel behavior when using non-Google models.

You can copy this directly.

import asyncio
import copy
from typing import AsyncGenerator

from google.adk.agents import BaseAgent
from google.adk.events import Event


class GPTParallelAgent(BaseAgent):
    """
    Runs sub-agents in parallel but isolates session history to prevent
    OpenAI/LiteLLM interleaving errors.
    """

    def __init__(self, name: str, sub_agents: list, description: str = ""):
        # BaseAgent is a Pydantic model, so sub_agents is passed to the
        # parent constructor rather than assigned afterwards.
        super().__init__(name=name, description=description, sub_agents=sub_agents)

    async def _run_async_impl(self, ctx) -> AsyncGenerator[Event, None]:
        # Helper to run an agent and collect its events independently
        async def run_isolated(agent):
            # Clone the context so each agent has its own private history branch
            local_ctx = copy.copy(ctx)

            events = []
            async for event in agent.run_async(local_ctx):
                events.append(event)

            return events

        # Run all agents concurrently
        all_results = await asyncio.gather(
            *[run_isolated(agent) for agent in self.sub_agents]
        )

        # Flatten results and yield events upstream
        for agent_events in all_results:
            for event in agent_events:
                yield event

How to Use This in Your Project

This is super easy with the implementation above!

✍️ Step 1: Define your sub-agents

accommodation_agent = SomeLLMAgent(name="accommodation_selector")
activity_agent = SomeLLMAgent(name="activity_recommender")
transport_agent = SomeLLMAgent(name="transport_planner")

🤖 Step 2: Wrap Them in GPTParallelAgent

parallel_agent = GPTParallelAgent(
    name="parallel_agent",
    sub_agents=[
        accommodation_agent,
        activity_agent,
        transport_agent,
    ],
    description="Runs agents in isolation",
)

You can now run them in parallel, even with non-Google models — saving valuable execution time and significantly improving overall efficiency. 🚀
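If you want to sanity-check the isolation pattern without pulling in ADK or a live LLM, the same shape can be exercised with stubs. Here `StubAgent` and `Ctx` are hypothetical stand-ins for ADK's agent and context types, and `run_parallel` mirrors the `_run_async_impl` logic above:

```python
import asyncio
import copy

class Ctx:
    # Hypothetical stand-in for ADK's invocation context.
    def __init__(self, history=None):
        self.history = history or []

    def __copy__(self):
        # Branch the history list so clones stay private.
        return Ctx(list(self.history))

class StubAgent:
    # Minimal stand-in for an ADK sub-agent: mutates its context
    # and emits one event.
    def __init__(self, name):
        self.name = name

    async def run_async(self, ctx):
        ctx.history.append(f"{self.name}: thinking")
        await asyncio.sleep(0)  # simulate an LLM call
        yield f"{self.name}: done"

async def run_parallel(agents, ctx):
    # Same pattern as GPTParallelAgent._run_async_impl:
    # copy the context per agent, gather concurrently, flatten events.
    async def run_isolated(agent):
        local_ctx = copy.copy(ctx)
        return [e async for e in agent.run_async(local_ctx)]

    results = await asyncio.gather(*(run_isolated(a) for a in agents))
    return [event for events in results for event in events]

events = asyncio.run(
    run_parallel([StubAgent("a"), StubAgent("b")], Ctx(["user: hi"]))
)
print(events)  # → ['a: done', 'b: done']
```

Because `asyncio.gather` preserves input order, the flattened events come back grouped per agent even though the agents executed concurrently, and the original `Ctx` is never mutated.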

If you enjoyed this article, please subscribe to get my latest posts.
To get in touch, contact me on
LinkedIn or via ashmibanerjee.com.

GenAI usage disclosure: GenAI models were used to check for grammar inconsistencies in the blog and to refine the text for greater clarity. The authors accept full responsibility for the content presented in this blog.


One Small Change That Makes OpenAI Work Perfectly in ADK Parallel Pipelines was originally published in Google Developer Experts on Medium, where people are continuing the conversation by highlighting and responding to this story.
