CBRS · Private
Evidence: 96% Authoritative
Fact · Confirmed · Product · June 2, 2025

Qwen3-32B Reasoning Model Live on Cerebras at 2,400 Tokens/Sec

Cerebras deployed Alibaba's Qwen3-32B reasoning model at an output speed of 2,400 tokens/sec (40x faster than the best GPU result) with a 1.2-second time to first token. According to the company, this is the first reasoning model to run in real time on any hardware.
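The quoted figures can be turned into rough arithmetic. The sketch below is illustrative only: it assumes a simple latency model (time to first token plus steady-state streaming), and the 1,200-token response length is a hypothetical value, not something stated in the source. The GPU rate is merely what the "40x" ratio implies, not a separately reported measurement.

```python
# Figures quoted in the entry above.
CEREBRAS_TOKENS_PER_SEC = 2400.0
SPEEDUP_OVER_GPU = 40.0
TIME_TO_FIRST_TOKEN_SEC = 1.2


def implied_gpu_tokens_per_sec(fast_rate: float, speedup: float) -> float:
    """Baseline rate implied by a quoted speedup ratio."""
    return fast_rate / speedup


def response_time_sec(output_tokens: int, tokens_per_sec: float, ttft_sec: float) -> float:
    """Assumed model: time to first token plus streaming time at a constant rate."""
    return ttft_sec + output_tokens / tokens_per_sec


gpu_rate = implied_gpu_tokens_per_sec(CEREBRAS_TOKENS_PER_SEC, SPEEDUP_OVER_GPU)

# Hypothetical response length, chosen only for illustration.
example_tokens = 1200
total_time = response_time_sec(example_tokens, CEREBRAS_TOKENS_PER_SEC, TIME_TO_FIRST_TOKEN_SEC)

print(f"Implied GPU baseline: {gpu_rate:.0f} tokens/sec")
print(f"Estimated time for {example_tokens} tokens: {total_time:.2f} s")
```

Under these assumptions, the streaming portion of a response is short relative to the time to first token, which is why the 1.2-second figure matters as much as the throughput number.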

Evidence Strength

Backed by official company doc
Single publisher source
Includes official or primary source
Key Development
High-significance development (rated 8/10)
Confirmed — verified event

Insights

First tracked: May 15, 2025
Last updated: June 2, 2025

Sources

2 sources
