OpenAI · o-series · Released Apr 16, 2025

OpenAI o4-mini

OpenAI's small reasoning model, optimized for fast, cost-efficient reasoning. It achieves 99.5% on AIME 2025 (with Python tool use) and excels at math, coding, and visual tasks, making it the best-performing model benchmarked here on AIME 2024 and 2025. It has been retired from ChatGPT, but remains available through the API.

text · code · reasoning · vision · tool-use

Arena ELO: 1,391
Input Price: $1.10 / M tokens
Output Price: $4.40 / M tokens
Speed: 114 tokens/s
Context: 200K tokens
Latency: 64,850 ms
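As a quick sketch of how the listed per-million-token prices translate into per-request cost, the helper below estimates dollars for a given token count at the $1.10/M input and $4.40/M output rates (the function name and example token counts are illustrative, not part of any official API):

```python
def o4_mini_cost(input_tokens: int, output_tokens: int,
                 input_per_m: float = 1.10,
                 output_per_m: float = 4.40) -> float:
    """Estimate API cost in USD from per-million-token prices."""
    return (input_tokens / 1_000_000 * input_per_m
            + output_tokens / 1_000_000 * output_per_m)

# e.g. a 10K-token prompt with a 2K-token response:
# 10_000/1e6 * 1.10 + 2_000/1e6 * 4.40 = 0.011 + 0.0088 ≈ $0.0198
```

At these rates, output tokens dominate cost for reasoning-heavy workloads, since they are priced at four times the input rate.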

Capability Assessment

SWE-Bench Pro: 68.1%
GPQA Diamond: 81.4%
MMMU Pro: 81.6%

Comparative Analysis

| Metric | OpenAI o4-mini | Claude Opus 4.6 | Gemini 3 Pro | OpenAI o3 |
|---|---|---|---|---|
| SWE-bench | 68.1% | 80.8% | 76.2% | 71.7% |
| AIME 2025 | 92.7% | 100.0% | 95.0% | 96.7% |
| GPQA Diamond | 81.4% | 91.3% | 91.9% | 87.7% |
| MMLU | 90.0% | 91.0% | 91.8% | 92.9% |
| Input $/M | $1.10 | $5 | $2 | $2 |
| Output $/M | $4.40 | $25 | $12 | $8 |

ARC Prize

ARC Prize Snapshot

Cost per task vs score across current reasoning levels.

Updated 3/10/2026, 2:38:07 AM

[Scatter chart: cost per task (log scale, x-axis) vs. score % (y-axis) for o4-mini Low, Medium, and High across the four ARC-AGI benchmarks]

ARC-AGI-1 Public

o4-mini (Low): 27.6% / $0.04
o4-mini (Medium): 50.2% / $0.13
o4-mini (High): 68.0% / $0.32

ARC-AGI-1 Semi-Private

o4-mini (Low): 21.3% / $0.04
o4-mini (Medium): 41.8% / $0.15
o4-mini (High): 58.7% / $0.41

ARC-AGI-2 Public

o4-mini (Low): 0.3% / $0.05
o4-mini (Medium): 2.2% / $0.24
o4-mini (High): 7.5% / $0.88

ARC-AGI-2 Semi-Private

o4-mini (Low): 1.7% / $0.05
o4-mini (Medium): 2.4% / $0.23
o4-mini (High): 6.1% / $0.86
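To compare the reasoning levels on a cost-efficiency basis, the sketch below computes score points per dollar from the ARC-AGI-1 Public figures above (the dict simply restates those numbers; the metric itself is an illustrative derived ratio, not one reported by ARC Prize):

```python
# ARC-AGI-1 Public: (score %, cost per task $) per o4-mini reasoning level
arc_agi_1_public = {
    "Low":    (27.6, 0.04),
    "Medium": (50.2, 0.13),
    "High":   (68.0, 0.32),
}

for level, (score, cost) in arc_agi_1_public.items():
    # Points of score obtained per dollar spent on a single task
    print(f"{level}: {score / cost:.1f} points/$")
```

On these numbers, Low is the most cost-efficient setting (~690 points/$ vs. ~212.5 points/$ for High), even though High scores highest in absolute terms.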

Source Material