Browsing Tag
llm-evaluation
2 posts
Gemma 4 as an LLM-as-a-Judge: Batch Responsible AI Evaluation on Cloud TPU v5e
TPU Batch Eval Pipeline for RAI-Checklist-CLI. Calibrated Trust (the governance framework I’ve been building for agentic AI systems)…
LLMs as Judges: Measuring Bias, Hinting Effects, and Tier Preferences
By Aashi Dutt and Sayak Paul (AI GDEs). If you’re using LLMs to evaluate other LLMs for tasks like ranking…