OpenAI · GPT-5 family · Dec 11, 2025

GPT-5.2

OpenAI's flagship reasoning model. GPT-5.2 Thinking scores 80.0% on SWE-bench Verified and 55.6% on SWE-bench Pro (state of the art). GPT-5.2 Pro crosses the 90% threshold on ARC-AGI-1 at roughly 1/390th the cost of o3. Features a 400K-token context window with up to 128K output tokens.
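A quick sketch of what the context figures imply for prompt sizing, assuming (as is typical) that prompt and completion share the same 400K window:

```python
# Token-budget sketch for a 400K-context model with up to 128K output tokens.
# Assumes prompt + completion share one context window; the function name
# and API are illustrative, not part of any official SDK.
CONTEXT_WINDOW = 400_000
MAX_OUTPUT = 128_000

def max_prompt_tokens(reserved_output: int = MAX_OUTPUT) -> int:
    """Largest prompt that still leaves room for the reserved completion."""
    if not 0 <= reserved_output <= CONTEXT_WINDOW:
        raise ValueError("reserved_output must fit inside the context window")
    return CONTEXT_WINDOW - reserved_output

print(max_prompt_tokens())        # 272000 tokens of prompt headroom
print(max_prompt_tokens(16_000))  # 384000 when reserving only a short reply
```

Reserving the full 128K output leaves 272K tokens of prompt headroom; reserving less frees proportionally more room for input.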

text · code · reasoning · vision · tool-use · audio

Input Price: $1.75/M
Output Price: $14/M
Speed: 71 t/s
Context: 400K
Latency: 99,790 ms

Capability Assessment

SWE-bench Verified: 80.0%
Terminal-Bench 2.0: 47.6%
GPQA Diamond: 92.4%
MMMU Pro: 86.7%

Comparative Analysis

Metric         GPT-5.2   Claude Opus 4.6   Gemini 3 Pro   OpenAI o3
SWE-bench      80.0%     80.8%             76.2%          71.7%
AIME 2025      100.0%    100.0%            95.0%          96.7%
GPQA Diamond   92.4%     91.3%             91.9%          87.7%
MMLU           91.0%     91.0%             91.8%          92.9%
Input $/M      $1.75     $5                $2             $2
Output $/M     $14       $25               $12            $8
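The pricing rows above translate into per-request dollar costs as follows; this is a hypothetical calculation at the listed rates, not an official pricing tool, and the model names serve only as dictionary keys:

```python
# Request-cost comparison using the per-million-token prices from the
# table above (USD per 1M tokens): (input rate, output rate).
PRICES = {
    "GPT-5.2":         (1.75, 14.0),
    "Claude Opus 4.6": (5.0, 25.0),
    "Gemini 3 Pro":    (2.0, 12.0),
    "OpenAI o3":       (2.0, 8.0),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at the listed rates."""
    in_rate, out_rate = PRICES[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# Example: a 50K-token prompt with a 5K-token response.
for model in PRICES:
    print(f"{model}: ${request_cost(model, 50_000, 5_000):.4f}")
```

At that request shape, GPT-5.2 comes to about $0.16 per call versus roughly $0.38 for Claude Opus 4.6, since the input rate dominates on long prompts.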

ARC Prize

ARC Prize Snapshot

Cost per task vs. score across current reasoning levels.

Updated 3/10/2026, 2:38:07 AM

[Scatter chart: score (%) vs. cost per task (log scale), one point per GPT-5.2 reasoning level]

ARC-AGI-1 Public

GPT-5.2: 16.5% / $0.04 per task

ARC-AGI-1 Semi-Private

GPT-5.2: 12.3% / $0.05 per task

ARC-AGI-2 Public

GPT-5.2: 0.0% / $0.08 per task

ARC-AGI-2 Semi-Private

GPT-5.2: 0.8% / $0.08 per task

Source Material