Building “Yuh Hear Dem”: A Parliamentary AI with Google’s ADK and a Lesson in Agentic Design


Democracy thrives on transparency, but the raw data of governance—hours of parliamentary video, dense transcripts, and complex legislation—is often inaccessible to the very citizens it’s meant to serve. This was the challenge that sparked “Yuh Hear Dem,” our submission for the Google Agent Development Kit (ADK) Hackathon. The project began as a father-daughter mentoring journey into AI and evolved into a powerful tool for civic engagement in Barbados. The collaboration combined deep experience in backend AI architecture with a fresh perspective on user experience, guided by principles from education. This blend allowed us to build a system that is not only technically sophisticated but also genuinely accessible, transforming the way citizens interact with their government.

Asking a question

Visualising the knowledge graph

YouTube video provenance

This post details our technical journey, from the initial data pipeline to a crucial architectural pivot, all powered by Google’s Agent Development Kit (ADK), Gemini, and a Knowledge Graph backend.

The Problem: From Hours of Video to Actionable Insight

Parliamentary sessions in Barbados, like in many places, are published as long-form YouTube videos. Finding what a specific minister said about a particular topic, like the “sugar tax,” requires manually scrubbing through hours of footage. This creates a significant barrier to civic engagement.

Our goal was to transform this unstructured data into a structured, queryable format, allowing any citizen to ask a natural language question and get a direct, source-verified answer.

The Solution: An AI-Powered Parliamentary Research Assistant

“Yuh Hear Dem” (Bajan dialect for “Did you hear them?”) is a conversational agent that allows users to query parliamentary data. A user can ask, “What has been discussed about the sugar tax?” and receive a concise summary, direct quotes from MPs, and links to the exact moments in the source videos.

The system is built on a sophisticated Retrieval-Augmented Generation (RAG) pipeline that combines the semantic power of vector search with the structured precision of a knowledge graph.

The Technical Architecture: A Three-Stage Pipeline

Our system is built on a robust data processing and retrieval pipeline.

1. Ingest, Clean, Extract

The foundation of our system is a structured knowledge base built from raw, messy transcripts.

  • Ingest: We start by ingesting the full YouTube transcripts from hundreds of parliamentary session videos—over 1,200 hours of content.
  • Clean: The raw transcripts are often riddled with grammatical errors and misattributions. We use Gemini to clean and structure this text, correcting grammar, identifying speakers, and aligning the text with accurate video timestamps.
  • Extract: With clean, timestamped text, we use Gemini again to perform entity and relationship extraction. It identifies people, topics, bills, and the connections between them (e.g., “Minister X spoke about Bill Y”). This structured data, including over 33,000 nodes and 86,000 statements, is stored in MongoDB Atlas.

This process creates a rich, interconnected Knowledge Graph that forms the backbone of our agent’s “brain.”
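As a rough illustration, the Extract step boils down to a structured-output call to Gemini followed by a write to MongoDB Atlas. The sketch below uses the google-genai SDK; the prompt wording, output schema, database name, and collection names are illustrative stand-ins, not the project’s actual ones.

import json
import os

from google import genai
from google.genai import types
from pymongo import MongoClient

client = genai.Client()  # assumes GOOGLE_API_KEY is set in the environment
db = MongoClient(os.environ["MONGODB_URI"])["yuh_hear_dem"]  # hypothetical DB name

EXTRACTION_PROMPT = """\
From the cleaned, timestamped transcript segment below, extract entities
(people, topics, bills) and the relationships between them.
Return JSON with two arrays: "nodes" and "statements".

Transcript:
{segment}
"""

def extract_and_store(segment: str) -> None:
    """Run entity/relationship extraction on one transcript segment and persist it."""
    response = client.models.generate_content(
        model="gemini-2.0-flash",
        contents=EXTRACTION_PROMPT.format(segment=segment),
        config=types.GenerateContentConfig(response_mime_type="application/json"),
    )
    graph = json.loads(response.text)
    if graph.get("nodes"):
        db.nodes.insert_many(graph["nodes"])
    if graph.get("statements"):
        db.statements.insert_many(graph["statements"])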

2. Hybrid Retrieval with GraphRAG

When a user asks a question, the agent doesn’t just rely on a simple semantic search. It uses a hybrid retrieval strategy:

  • Vector Search: We run a vector search over MongoDB Atlas embeddings to find semantically similar transcript segments. This is great for broad, topic-based queries.
  • Knowledge Graph Search: We traverse the entities and relationships in our knowledge graph to find precise connections (e.g., Minister -> Topic -> Session). This excels at specific, factual queries.

The results are combined and ranked using a hybrid scoring model (GraphRAG), giving us the best of both worlds. Critically, every piece of information is grounded in video timestamps, allowing us to generate direct links to the source.
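To make the hybrid strategy concrete, here is a minimal sketch of the retrieval step with pymongo. The index name, field names, the text-index lookup on entity names, and the 0.6/0.4 weighting are assumptions for illustration; the actual GraphRAG scoring model is more involved.

import os
from pymongo import MongoClient

db = MongoClient(os.environ["MONGODB_URI"])["yuh_hear_dem"]  # hypothetical DB name

def hybrid_search(query: str, query_vector: list[float], limit: int = 10) -> list[dict]:
    """Blend Atlas vector search with a knowledge-graph lookup and rank the union."""
    # 1. Semantic recall: Atlas $vectorSearch over transcript-segment embeddings.
    vector_hits = list(db.segments.aggregate([
        {"$vectorSearch": {
            "index": "segment_embeddings",   # hypothetical index name
            "path": "embedding",
            "queryVector": query_vector,
            "numCandidates": 200,
            "limit": limit,
        }},
        {"$project": {"text": 1, "video_id": 1, "start_time": 1,
                      "score": {"$meta": "vectorSearchScore"}}},
    ]))

    # 2. Structured precision: match graph entities by name (assumes a text
    #    index on node names), then pull the statements (edges) they appear in.
    entity_ids = [e["_id"] for e in db.nodes.find({"$text": {"$search": query}}).limit(limit)]
    graph_hits = list(db.statements.find({"subject_id": {"$in": entity_ids}}).limit(limit))

    # 3. Hybrid scoring: a simple weighted blend, for illustration only.
    scored = ([{"doc": h, "score": 0.6 * h["score"]} for h in vector_hits] +
              [{"doc": h, "score": 0.4} for h in graph_hits])
    return sorted(scored, key=lambda s: s["score"], reverse=True)[:limit]

Because every segment carries its video_id and start_time, each ranked result can be turned directly into a timestamped YouTube link.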

3. The Agent Architecture Evolution: A Lesson in Pragmatism

Our journey with ADK taught us a valuable lesson about the current state of multi-agent frameworks.

The Original Vision: A Multi-Agent Pipeline

# Root Conversational Agent
ConversationalAgent(
    model="gemini-2.0-flash",
    sub_agents=[ResearchPipeline]
)

# Sequential Research Pipeline
ResearchPipeline(SequentialAgent):
├── ResearcherAgent(LlmAgent)
│   ├── Tools: [hybrid_search_turtle, authority_search_turtle, topic_search_turtle]
│   └── Role: Parliamentary data collection via MCP tools
├── ProvenanceAgent(BaseAgent)
│   ├── Custom Implementation: Video source enrichment
│   └── Role: Enrich entities with YouTube timestamps & transcripts
└── WriterAgent(LlmAgent)
    ├── Dynamic Instruction: Receives enriched turtle data
    └── Role: Synthesize findings into cited responses

Initially, we designed a classic multi-agent system using a SequentialAgent. The idea was to have a clear separation of concerns:

  • ConversationalAgent: The main entry point.
  • ResearchPipeline (SequentialAgent):
    • ResearcherAgent: Collects data from our knowledge graph.
    • ProvenanceAgent: Enriches the data with video sources and timestamps.
    • WriterAgent: Synthesizes the final response.
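For reference, the original wiring roughly maps onto ADK like this. It is a simplified sketch: the custom ProvenanceAgent is omitted, and the instructions are placeholders rather than our actual prompts. output_key is how an LlmAgent is supposed to write its result into session state for the agents that follow it.

from google.adk.agents import LlmAgent, SequentialAgent

researcher = LlmAgent(
    name="ResearcherAgent",
    model="gemini-2.0-flash",
    instruction="Collect parliamentary data relevant to the user's question.",
    output_key="turtle_results",  # intended to land in ctx.session.state
)

writer = LlmAgent(
    name="WriterAgent",
    model="gemini-2.0-flash",
    instruction="Synthesize {turtle_results} into a cited, timestamped answer.",
)

research_pipeline = SequentialAgent(
    name="ResearchPipeline",
    sub_agents=[researcher, writer],
)

root_agent = LlmAgent(
    name="ConversationalAgent",
    model="gemini-2.0-flash",
    sub_agents=[research_pipeline],
)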

The Roadblock: Session State Management

We quickly hit a wall. We found that ctx.session.state was not being reliably shared between the agents in our SequentialAgent pipeline. The ResearcherAgent would fetch data, but by the time the flow reached the ProvenanceAgent or WriterAgent, the state was often empty or corrupted.

# What SHOULD have worked:
ctx.session.state["turtle_results"] = raw_turtle_data  # ResearcherAgent
enriched_turtle = ctx.session.state.get("turtle_results", [])  # ProvenanceAgent ❌

# What we encountered:
# - Session state not reliably shared between agents in SequentialAgent
# - Context loss during agent handoffs  
# - Empty/corrupted state in downstream agents
# - Related to: https://github.com/google/adk-python/issues/1119

This appears to be a known challenge, tracked upstream in adk-python issue #1119. The roadblock became a critical learning moment: while the theory of multi-agent systems is powerful, the practical implementation can be fraught with state management complexities.

The Pivot: A Robust Single-Agent Solution

To solve this, we refactored our architecture into a single intelligent agent with a set of specialized function tools. This approach proved to be far more reliable and easier to debug.

The agent maintains context reliably, and the tools are called synchronously, ensuring data is passed correctly.

# Refactored Single-Agent Solution
self.agent = LlmAgent(
    name="YuhHearDem",
    model="gemini-2.5-flash-preview-05-20",
    planner=BuiltInPlanner(),
    tools=[
        FunctionTool(search_parliament_hybrid), # Was: ResearcherAgent
        FunctionTool(clear_session_graph),      # Memory management
        FunctionTool(get_session_graph_stats),  # Session insights
        FunctionTool(visualize_knowledge_graph) # Was: Custom visualization
    ]
)

This pragmatic pivot allowed us to achieve our desired modularity—with each tool handling a specific task—without the overhead and unreliability of inter-agent state management.
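With FunctionTool, each capability is just a typed Python function whose signature and docstring become the contract the model sees. Below is a minimal, self-contained sketch of one such tool; the in-memory dictionary is a stand-in for illustration, since the real session graphs live in MongoDB Atlas.

from google.adk.tools import FunctionTool

# Hypothetical in-memory stand-in for the per-session graph store.
_SESSION_GRAPHS: dict[str, dict] = {}

def get_session_graph_stats(session_id: str) -> dict:
    """Return node and edge counts for the knowledge graph built up in this session.

    Args:
        session_id: Identifier of the current conversation session.

    Returns:
        A dict with "nodes" and "edges" counts (zeros if the session has no graph yet).
    """
    graph = _SESSION_GRAPHS.get(session_id, {"nodes": [], "edges": []})
    return {"nodes": len(graph["nodes"]), "edges": len(graph["edges"])}

stats_tool = FunctionTool(get_session_graph_stats)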

The User Experience: Making AI Accessible

Technology is only as good as its interface. Our focus on educational design was instrumental here. The frontend was built to make the agent’s powerful capabilities accessible to everyone.

Key design principles included:

  • Progressive Disclosure: Information is presented in expandable cards, preventing cognitive overload. Users see a high-level summary first and can expand for details.
  • Visual Learning: We used D3.js to create interactive knowledge graphs, helping users visually understand the relationships between speakers, topics, and sessions.
  • Contextual Guidance: The agent uses the knowledge graph to generate relevant follow-up questions, guiding users on natural exploration paths.
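The contextual-guidance idea can be sketched in a few lines: once an answer mentions an entity, its one-hop neighbours in the graph become candidate follow-up questions. This is an illustrative sketch only; the statements collection and its field names are assumed from the schema described earlier, not taken from the production code.

def suggest_follow_ups(db, entity_id: str, limit: int = 3) -> list[str]:
    """Turn one-hop neighbours of an entity into natural follow-up questions."""
    edges = db.statements.find({"subject_id": entity_id}).limit(limit)
    return [f"What has been said about {edge['object_name']}?" for edge in edges]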

Conclusion and What’s Next

“Yuh Hear Dem” is more than just a technical demo; it’s a functioning tool for enhancing democratic transparency. Our journey taught us several key lessons:

  1. The Power of Hybrid RAG: Combining knowledge graphs and vector search provides superior retrieval accuracy.
  2. ADK’s Strengths: While multi-agent state sharing needs maturing, ADK’s single-agent with function tools model is incredibly robust for building complex, reliable AI systems.
  3. Pragmatism Over Purity: A simpler, more reliable architecture is often better than a theoretically “purer” but fragile one.
  4. Human-Centered Design is Key: An intuitive UI, grounded in learning principles, is essential for making powerful AI accessible and useful.

We invite you to explore the project yourself.

Our next steps involve expanding the data sources to include official legislative documents and exploring a return to a multi-agent architecture as the ADK framework evolves. For now, we’re proud to have built a tool that helps citizens hear what really matters.
