OpenAI · GPT-5 · Dec 11, 2025
GPT-5.2
OpenAI's flagship reasoning model. GPT-5.2 Thinking scores 80% on SWE-bench Verified and a state-of-the-art 55.6% on SWE-bench Pro. GPT-5.2 Pro crosses the 90% threshold on ARC-AGI-1 at roughly 390× lower cost than o3. Features a 400K-token context window with up to 128K output tokens.
text · code · reasoning · vision · tool-use · audio
Arena ELO: —
Input Price: $1.75/M
Output Price: $14/M
Speed: 71 t/s
Context: 400K
Latency: 99,790 ms
Capability Assessment
SWE-bench Verified: 80.0%
Terminal-Bench 2.0: 47.6%
GPQA Diamond: 92.4%
MMMU Pro: 86.7%
Comparative Analysis
| Metric | GPT-5.2 | Claude Opus 4.6 | Gemini 3 Pro | OpenAI o3 |
|---|---|---|---|---|
| SWE-bench Verified | 80.0% | 80.8% | 76.2% | 71.7% |
| AIME 2025 | 100.0% | 100.0% | 95.0% | 96.7% |
| GPQA Diamond | 92.4% | 91.3% | 91.9% | 87.7% |
| MMLU | 91.0% | 91.0% | 91.8% | 92.9% |
| Input $/M | $1.75 | $5 | $2 | $2 |
| Output $/M | $14 | $25 | $12 | $8 |
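To make the per-million-token prices above concrete, here is a minimal sketch of per-request cost at the listed GPT-5.2 rates ($1.75/M input, $14/M output); the token counts in the example are illustrative, not from this page.

```python
# Per-request cost at the listed GPT-5.2 rates.
INPUT_PER_M = 1.75    # $ per 1M input tokens (from the table above)
OUTPUT_PER_M = 14.00  # $ per 1M output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at the listed per-million-token rates."""
    return (input_tokens / 1e6) * INPUT_PER_M + (output_tokens / 1e6) * OUTPUT_PER_M

# Example: the full 400K-token context with a maximum 128K-token reply.
print(f"${request_cost(400_000, 128_000):.2f}")  # → $2.49
```

So even a maximally long request stays under $3 at these rates, with output tokens dominating the bill.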
ARC Prize Snapshot
Cost per task vs score across current reasoning levels.
Updated 3/10/2026, 2:38:07 AM
ARC-AGI-1 Public
GPT-5.2: 16.5% / $0.04
ARC-AGI-1 Semi-Private
GPT-5.2: 12.3% / $0.05
ARC-AGI-2 Public
GPT-5.2: 0.0% / $0.08
ARC-AGI-2 Semi-Private
GPT-5.2: 0.8% / $0.08
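A hedged back-of-envelope on the 390× cost-reduction claim from the summary: if it applies to the $0.04 ARC-AGI-1 per-task figure listed here (an assumption; the page does not say which configuration the 390× comparison uses), the implied o3 cost is on the order of $15 per task.

```python
# Back-of-envelope: implied o3 cost per ARC-AGI-1 task, ASSUMING the 390x
# reduction applies to the $0.04/task rate in the snapshot above. The page
# does not confirm the two figures refer to the same model configuration.
gpt52_cost_per_task = 0.04   # $ per task, ARC-AGI-1 Public (from snapshot)
cost_reduction_factor = 390  # from the summary claim vs. o3

implied_o3_cost = gpt52_cost_per_task * cost_reduction_factor
print(f"${implied_o3_cost:.2f} per task")  # → $15.60 per task
```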