Browsing Tag
LLM
135 posts
Two Pre-Registered Benchmarks for Audit-Native RAG: RAB (EU AI Act 10/12/19) + LRB (Time-Travel Retrieval)
Most RAG demos answer “what’s the right chunk?” Very few can answer the two questions a regulator or…
Install last30days-skill Research the Last 30 Days of the Internet: Installing last30days on Hermes Agent
In this article, I’ll show you how to install last30days-skill on Hermes Agent. My Hermes Agent is on…
How to access AI from a blocked region? From 2022 to 2026, a Chinese developer’s perspective
Not long ago, I saw articles analyzing how people in China obtain US model API access at low prices through…
Three checks that separate an agent demo from a production agent
Shipping an agent demo takes an afternoon. Shipping one that survives a quarter in production is a different…
Gemma 4 12B Is Google’s Biggest Bet on Local Multimodal AI Yet
Google Just Made Your Laptop a Multimodal AI Workstation. Yesterday, Google dropped Gemma 4 12B, and if…
Building a Multimodal Indonesian Fake-News Detector with JAX, Flax, and Keras Kinetic on Cloud TPU
How I trained a Stance-Aware Cross-Encoder that classifies Indonesian news headlines against claims — starting on a free Colab TPU…
Extract Plain Text from Medium Posts for RAG and Search Indexes
Chunk clean article content for embeddings, summarization, and full-text search; skip nav, clap bars, and scripts…
Why output-stage PII masking is the wrong protective surface for data exfiltration in RAG
“The output filter runs after the LLM has already seen the confidential data. By then, three classes of…
How to Monitor AI Agents in Production
TL;DR: Monitoring AI agents in production requires distributed tracing: a single user request fans out into 10 or…
Exploring AI Workflow Orchestration: Comparing Weft, Python & Alternative Pipeline Approaches
A few weeks ago I started exploring something that made me rethink how we build AI workflows. Most…