📈 Key Takeaways on GPT-5.2 Performance
• Top-tier performance in professional tasks: GPT-5.2 scored 70.9% on GDPVal, well ahead of Opus 4.5’s 60%, making it the strongest model on office-style, knowledge-work evaluations.
• Coding improvements, but not the leader: The model shows major gains in software development benchmarks, though it still trails Opus 4.5 slightly in pure coding depth and execution.
• Largest reduction in hallucinations to date: GPT-5.2 shows the strongest reliability improvements so far. While cross-model comparisons are tricky, its hallucination rate appears significantly
