xAI / Grok 3 / Feb 17, 2025

Grok 3

xAI's heavyweight contender, trained on the 100K-GPU Colossus cluster. A 3T MoE model, it emerged as a serious frontier player with an MMLU score of 92.7% and an AIME 2025 score of 93.3%, suggesting that brute-force compute still has surprises left.

text, code, reasoning, vision

Arena ELO

1,411

Input Price

$3/M

Output Price

$15/M

Speed

69 t/s

Context

Latency

750ms

Capability Assessment

GPQA Diamond: 84.6%

Comparative Analysis

Metric	Grok 3	Claude Opus 4.6	Gemini 3 Pro	OpenAI o3
SWE-bench	—	80.8%	76.2%	71.7%
AIME 2025	93.3%	100.0%	95.0%	96.7%
GPQA Diamond	84.6%	91.3%	91.9%	87.7%
MMLU	92.7%	91.0%	91.8%	92.9%
Input $/M	$3	$5	$2	$2
Output $/M	$15	$25	$12	$8
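The per-million-token prices in the table translate into a per-request cost with simple arithmetic. The following is a minimal sketch, assuming a request is billed only for input and output tokens at the list prices shown above; the function name `request_cost` and the token counts are hypothetical, and real bills may differ (caching, reasoning tokens, or tiered pricing are not modeled).

```python
# List prices in USD per million tokens, copied from the comparison table.
PRICES_PER_MILLION = {
    "Grok 3": {"input": 3.0, "output": 15.0},
    "Claude Opus 4.6": {"input": 5.0, "output": 25.0},
    "Gemini 3 Pro": {"input": 2.0, "output": 12.0},
    "OpenAI o3": {"input": 2.0, "output": 8.0},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request at list prices (hypothetical helper)."""
    price = PRICES_PER_MILLION[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Illustrative workload: 10,000 input tokens and 2,000 output tokens.
    for name in PRICES_PER_MILLION:
        print(f"{name}: ${request_cost(name, 10_000, 2_000):.4f}")
```

Under these assumptions, output tokens dominate the cost whenever a response is longer than about one fifth of the prompt for Grok 3, since its output price is five times its input price.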

ARC Prize

ARC Prize Snapshot

Cost per task vs score across current reasoning levels.

Updated 3/10/2026, 2:38:07 AM

ARC-AGI-1 Public

Grok 3: 8.4% / $0.07

ARC-AGI-1 Semi-Private

Grok 3: 5.5% / $0.09

ARC-AGI-2 Public

Grok 3: 0.0% / $0.14

ARC-AGI-2 Semi-Private

Grok 3: 0.0% / $0.14

Source Material

Launch Post
