MOUNTAIN VIEW, CA — Google has officially released its most powerful AI model to date: Gemini 2.5 Pro. The standout feature is the new Deep Think reasoning mode, which allows the neural network to spend significantly more compute time on complex problems before generating a response.
New Performance Records
According to published benchmarks, Gemini 2.5 Pro has set new industry standards, surpassing competitors from OpenAI and Anthropic in key technical disciplines:
- MMLU-Pro (General Knowledge): 89.8%, the highest score of any publicly available model.
- GPQA Diamond (Graduate-level Science): 82.4%, beating Fable 5 (79.1%) and GPT-5.5 (76.3%).
- HumanEval+ (Coding): 94.1%, the highest score reported on this benchmark to date.
- MATH-500: 97.2%, near-perfect accuracy on advanced mathematical problems.
2 Million Token Context Window
Another headline-grabbing feature is the doubling of the context window to 2 million tokens. This means the model can ingest and reason over entire codebases, full-length books, hours of high-definition video, or months of conversation history in a single session. For enterprise customers, this enables the analysis of massive document corpora without the need for manual chunking.
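To make the 2 million token figure concrete, here is a rough Python sketch for checking whether a set of documents would fit in a single request. It is an illustration, not part of any Google SDK: the function names are hypothetical, the four-characters-per-token ratio is a common rule of thumb for English text rather than the model's real tokenizer, and the space reserved for the response is an arbitrary assumption.

```python
import math

CONTEXT_WINDOW_TOKENS = 2_000_000  # figure quoted in the article
CHARS_PER_TOKEN = 4                # assumption: rough heuristic, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Approximate the token count of a string from its character length."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def fits_in_context(documents, reserved_for_output=8_192):
    """Return (fits, estimated_total_tokens) for a list of document strings.

    reserved_for_output is a hypothetical allowance for the model's reply.
    """
    total = sum(estimate_tokens(doc) for doc in documents)
    budget = CONTEXT_WINDOW_TOKENS - reserved_for_output
    return total <= budget, total
```

By this estimate, roughly 8 million characters of text would fill the window on their own. An accurate count would require the provider's own token-counting tool.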
How Deep Think Works
Deep Think is Google’s answer to the “thinking” models shipped by rivals. Instead of responding instantly, the model decomposes problems, explores multiple reasoning paths, and verifies its own logic before producing an answer. Google claims Deep Think improves accuracy on complex multi-step problems by 15–25%, although response times can be 3–5x longer.
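Google has not published Deep Think's internals, so the following Python sketch shows only the general idea described above: generate several candidate answers, discard those that fail a check, and keep the one most often reached. The function name, the candidate list, and the verifier are all hypothetical, and this is a toy version of sample-and-verify voting rather than Google's method.

```python
from collections import Counter
from typing import Callable, Iterable, Optional


def answer_with_verification(
    candidates: Iterable[str],
    verify: Callable[[str], bool],
) -> Optional[str]:
    """Pick the most frequent candidate answer that passes verification.

    candidates stands in for the final answers of several independent
    reasoning paths; verify is a hypothetical checker for one answer.
    Returns None when no candidate passes.
    """
    verified = [c for c in candidates if verify(c)]
    if not verified:
        return None
    # Majority vote among the answers that survived the check.
    return Counter(verified).most_common(1)[0][0]


# Toy problem: find an integer root of x^2 - 5x + 6 = 0.
paths = ["2", "3", "4", "2"]  # made-up outputs of four reasoning paths
is_root = lambda a: int(a) ** 2 - 5 * int(a) + 6 == 0
best = answer_with_verification(paths, is_root)
```

Each extra candidate and each check costs additional compute, which is consistent with the longer response times the article mentions.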
The model is available immediately via the Gemini API, Google AI Studio, and Vertex AI. Consumer-facing features are being rolled out as part of the AI Plus subscription tier.
