🔥 NVIDIA quietly drops a bomb: an 8B router model that beats GPT-5
NVIDIA released Orchestrator-8B, a tiny routing model that decides when to answer a query on its own and when to call tools like search, code, APIs, or bigger LLMs. And it’s shockingly good: 37.1% on Humanity’s Last Exam vs GPT-5’s 35.1%, while being ~2.5× more efficient.
How it works:
• Trained on ToolScale, a huge synthetic dataset of multi-step tasks.
• Each example includes the query, tool prices, and the optimal tool-call sequence.
• The model learns to balance quality, speed, and cost instead of brute-forcing everything.
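To make the bullets above concrete, here is a rough Python sketch of what one training example and a cost-aware score could look like. Everything in it is an assumption for illustration: the field names, the tool names and prices, the `route_score` function, and its weights are hypothetical and are not taken from ToolScale or from NVIDIA's training code.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass
class RoutingExample:
    """Hypothetical shape of one example: query, tool prices, best call sequence."""
    query: str
    tool_prices: dict[str, float]   # assumed price per call, arbitrary units
    optimal_calls: list[str]        # assumed ordered list of tool names


def sequence_cost(example: RoutingExample) -> float:
    """Total price of the example's tool-call sequence."""
    return sum(example.tool_prices[tool] for tool in example.optimal_calls)


def route_score(quality: float, cost: float, latency_s: float,
                cost_weight: float = 0.1, latency_weight: float = 0.01) -> float:
    """Illustrative trade-off: reward quality, penalize cost and latency.

    The linear form and the weights are assumptions, not the published objective.
    """
    return quality - cost_weight * cost - latency_weight * latency_s


example = RoutingExample(
    query="What is the population of the capital of France?",
    tool_prices={"search": 1.0, "code": 2.0, "big_llm": 10.0},
    optimal_calls=["search", "code"],
)

cheap_route = route_score(quality=0.90, cost=sequence_cost(example), latency_s=5.0)
brute_force = route_score(quality=0.95, cost=10.0, latency_s=20.0)
print(cheap_route > brute_force)
```

Under these made-up numbers, the cheaper two-tool route scores higher than sending everything to the most expensive model, even though the expensive model is slightly more accurate. That is the kind of trade-off the post describes.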
Benchmarks:
Acr