xAI/Grok 4/Jul 10, 2025

Grok 4

Grok 4 is xAI's frontier reasoning model, with always-on "Think" reasoning and an optional Heavy tier that runs five Grok 4 agents in parallel. It scores 88% on GPQA Diamond and approximately 70.8-75% on SWE-bench Verified, and it has a 260K-token context window.

text, code, reasoning, vision, tool-use

Arena ELO: 1,492
Input Price: $3/M tokens
Output Price: $15/M tokens
Speed: 45 tokens/s
Context: 260K tokens
Latency: 15,570 ms
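The Speed and Latency figures above can be combined into a rough wall-clock estimate for a single response. The sketch below assumes that the listed latency is the delay before the first output token and that output then streams at the listed rate; the page does not define either figure precisely, and the function name is hypothetical.

```python
def estimate_response_seconds(output_tokens: int,
                              latency_ms: float = 15570,
                              tokens_per_second: float = 45) -> float:
    """Rough response time: initial latency plus streaming time.

    Assumption: latency_ms is time to first token and tokens_per_second
    is a steady output rate. Defaults are the figures listed for Grok 4.
    """
    return latency_ms / 1000 + output_tokens / tokens_per_second


# A 900-token answer: 15.57 s of latency plus 20 s of streaming.
seconds = estimate_response_seconds(900)
print(f"{seconds:.2f} s")
```

Under these assumptions, the fixed latency dominates short answers, so the estimate is most sensitive to how the latency figure was measured.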

Capability Assessment

SWE-Bench Pro: 72.0%
GPQA Diamond: 88.0%
MMMU Pro: 76.5%

Comparative Analysis

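Metric       | Grok 4 | Claude Opus 4.6 | Gemini 3.1 Pro | Gemini 3 Pro
SWE-bench    | 72.0%  | 80.8%           | 80.6%          | 76.2%
AIME 2025    | 94.0%  | 100.0%          | 91.2%          | 95.0%
GPQA Diamond | 88.0%  | 91.3%           | 94.3%          | 91.9%
MMLU         | 91.0%, 92.6%, 91.8% (three values listed; the model without a score is not identified)
Input $/M    | $3     | $5              | $2             | $2
Output $/M   | $15    | $25             | $12            | $12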
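To make the price rows concrete, the following sketch computes the cost of one request for each model in the comparison. It assumes the listed prices are US dollars per million tokens and that input and output tokens are billed separately at flat rates; the function name, the dictionary, and the example workload are illustrative and not taken from any vendor SDK.

```python
# (input $/M tokens, output $/M tokens), copied from the comparison above.
PRICES_PER_MILLION = {
    "Grok 4": (3.0, 15.0),
    "Claude Opus 4.6": (5.0, 25.0),
    "Gemini 3.1 Pro": (2.0, 12.0),
    "Gemini 3 Pro": (2.0, 12.0),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in dollars for one request, assuming flat per-token pricing."""
    input_price, output_price = PRICES_PER_MILLION[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical workload: 100,000 input tokens and 10,000 output tokens.
for name in PRICES_PER_MILLION:
    print(f"{name}: ${request_cost(name, 100_000, 10_000):.2f}")
```

For this example workload the calculation gives $0.45 for Grok 4, $0.75 for Claude Opus 4.6, and $0.32 for each Gemini model. Real bills may differ if a provider applies tiered pricing, caching discounts, or charges for reasoning tokens, none of which are stated here.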

Source Material