⚡️ DeepSeek V3.2 (38.2%) is currently the best open-source model on the Cortex-AGI benchmark.
Gemini 3.0 Pro ranks first overall with 45.6%. Cortex-AGI measures how well AI models perform abstract, out-of-distribution reasoning on procedurally generated logic puzzles across 10 increasingly complex levels, without relying on memorization.
It also measures and compares the performance of proprietary models against open-source models under this rigorous setting.
AI Post ⚪️ | Our X 🏴