Beyond Black Box Scores: How Musubi Trains Custom AI for Trust and Safety Teams

Listen to this episode on: Spotify | Apple Podcasts

What do you do when off-the-shelf moderation scores aren’t good enough—and the alternative is paying human contractors to spend their days reviewing traumatizing content at scale?

In this episode of Just Now Possible, Teresa Torres talks with Nikki Marinsek (Data Scientist), Brian McCaffrey (Software Engineer), and Dan Means (Machine Learning Engineer) from Musubi, an AI-native trust and safety toolkit for content platforms. Musubi builds custom-trained ML models and LLM-powered moderation tools that adapt to each platform’s unique policies—from dating apps to social networks to AI inference endpoints.

They walk through the full journey: training the first prototype on tabular data, discovering their AI was sometimes catching things human moderators missed, and building a policy optimizer that uses agentic flows to help teams iterate on their moderation policies without needing a data scientist in the room.

You’ll hear how they balance latency, accuracy, and cost for clients handling hundreds of millions of actions per month, why pushing eval tools directly to customers is their core product strategy, and what’s next as they build flexible agentic orchestration for non-technical trust and safety teams.

Show Notes

Guests:

  • Nikki Marinsek, Data Scientist, Musubi
  • Brian McCaffrey, Software Engineer, Musubi
  • Dan Means, Machine Learning Engineer, Musubi

In this episode:

  • Why off-the-shelf moderation scores fail and how custom-trained models fix that
  • How Musubi combines traditional ML with LLMs for different moderation tasks
  • The discovery that AI sometimes catches things human moderators miss, and how to communicate that to clients
  • Using AI as a judge to referee disagreements between AI and human decisions
  • How Musubi onboards new customers with “reverse demos”
  • What custom model training actually means: fine-tuning, feature engineering, and reusable deployment pipelines
  • The policy optimizer: an agentic flow that helps customers iterate on their LLM moderation policies
  • Why pushing eval tools directly to customers is a core product strategy
  • How Musubi is building flexible orchestration workflows for non-technical trust and safety teams

Resources & Links:

  • Musubi — AI-powered trust and safety toolkit for content platforms
  • Maven AI Evals Course — The course Teresa took to learn about evals (get 35% off with Teresa’s affiliate link)

Chapters

00:00 Meet the Team
01:18 Why Everyone Wears Product
02:32 What Musubi Builds
04:51 AI for Human Moderation
09:59 Adversaries and Asymmetry
11:48 Early Days and Low Latency
13:35 First Prototype Slice
15:33 Traditional ML Meets LLMs
19:52 Benchmarking Against Humans
23:09 LLM as Judge and Policy Gaps
29:53 From Prototype to Platform
31:15 Customer Onboarding Reverse Demos
36:08 Custom Models Per Customer
38:05 Fine Tuning vs Training
39:14 Embedding Driven Classification
40:04 Cost and Latency Tradeoffs
43:21 Productizing Customization
49:16 Scaling Prototypes to Production
51:58 Golden Sets and Policy Loops
56:17 Coaching Customers Safely
01:02:06 Gamified Feedback Signals
01:06:19 Agentic Toolkit Roadmap
01:09:05 Workflow Orchestration Future
01:12:06 Wrap Up and Thanks

