Google launches Gemini 3.1 Pro, retaking AI crown with 2X+ reasoning performance boost


Late last year, Google briefly took the crown for the most powerful AI model in the world with the launch of Gemini 3 Pro, only to be surpassed within weeks by new models from OpenAI and Anthropic, as is common in the fiercely competitive AI race.

Now Google is back to retake the throne with an updated version of that flagship model: Gemini 3.1 Pro, positioned as a better baseline for tasks where a simple response is insufficient, targeting science, research, and engineering workflows that demand deep planning and synthesis.

Already, evaluations by third-party firm Artificial Analysis show that Google’s Gemini 3.1 Pro has leapt to the front of the pack and is once more the most powerful and performant AI model in the world.

A big leap in core reasoning

The most significant advancement in Gemini 3.1 Pro lies in its performance on rigorous logic benchmarks. Most notably, the model achieved a verified score of 77.1% on ARC-AGI-2.

This benchmark is designed to evaluate a model’s ability to solve entirely new logic patterns it has not encountered during training.

This result represents more than double the reasoning performance of the previous Gemini 3 Pro model.

Google Gemini 3.1 Pro benchmark chart. Credit: Google

Beyond abstract logic, internal benchmarks indicate that 3.1 Pro is highly competitive across specialized domains:

  • Scientific Knowledge: It scored 94.3% on GPQA Diamond.

  • Coding: It reached an Elo of 2887 on LiveCodeBench Pro and scored 80.6% on SWE-Bench Verified.

  • Multilingual Understanding: It achieved 92.6% on MMMLU.

These technical gains are not just incremental; they represent a refinement in how the model handles “thinking” tokens and long-horizon tasks, providing a more reliable foundation for developers building autonomous agents.

Improved vibe coding and 3D synthesis

Google is demonstrating the model’s utility through “intelligence applied,” shifting the focus from chat interfaces to functional outputs.

One of the most prominent features is the model’s ability to generate “vibe-coded” animated SVGs directly from text prompts. Because these are code-based rather than pixel-based, they remain scalable and keep file sizes tiny compared to traditional video, while offering far more detailed, presentable, and professional visuals for websites, presentations, and other enterprise applications.
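To make the “code-based rather than pixel-based” point concrete, here is a minimal sketch of an animated SVG built as plain text. It is a hand-written illustration, not output from Gemini 3.1 Pro; the function name `pulsing_circle_svg` and its parameters are hypothetical.

```python
import xml.etree.ElementTree as ET


def pulsing_circle_svg(size: int = 120, color: str = "#4285f4") -> str:
    """Return a tiny animated SVG: a circle whose radius pulses forever."""
    half = size // 2
    small, large = half // 3, half - 4
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 {size} {size}">\n'
        f'  <circle cx="{half}" cy="{half}" r="{small}" fill="{color}">\n'
        # SMIL <animate> cycles the radius small -> large -> small every 2 seconds.
        f'    <animate attributeName="r" values="{small};{large};{small}" '
        f'dur="2s" repeatCount="indefinite"/>\n'
        f'  </circle>\n'
        f'</svg>'
    )


svg = pulsing_circle_svg()
root = ET.fromstring(svg)            # raises ParseError if the markup is malformed
size_bytes = len(svg.encode("utf-8"))
print(f"{size_bytes} bytes of markup, scalable to any resolution via viewBox")
```

The whole animation is a few hundred bytes of text, and because the `viewBox` defines the coordinate system, it renders crisply at any display size.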

Other showcased applications include:

  • Complex System Synthesis: The model successfully configured a public telemetry stream to build a live aerospace dashboard visualizing the International Space Station’s orbit.

  • Interactive Design: In one demo, 3.1 Pro coded a complex 3D starling murmuration that users can manipulate via hand-tracking, accompanied by a generative audio score.

  • Creative Coding: The model translated the atmospheric themes of Emily Brontë’s Wuthering Heights into a functional, modern web design, demonstrating an ability to reason through tone and style rather than just literal text.

Enterprise impact and community reactions

Enterprise partners have already begun integrating the preview version of 3.1 Pro, reporting noticeable improvements in reliability and efficiency.

Vladislav Tankov, Director of AI at JetBrains, noted a 15% quality improvement over previous versions, stating the model is “stronger, faster… and more efficient, requiring fewer output tokens”. Other industry reactions include:

  • Databricks: CTO Hanlin Tang reported that the model achieved “best-in-class results” on OfficeQA, a benchmark for grounded reasoning across tabular and unstructured data.

  • Cartwheel: Co-founder Andrew Carr highlighted the model’s “substantially improved understanding of 3D transformations,” noting it resolved long-standing rotation order bugs in 3D animation pipelines.

  • Hostinger Horizons: Head of Product Dainius Kavoliunas observed that the model understands the “vibe” behind a prompt, translating intent into style-accurate code for non-developers.
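For readers unfamiliar with the rotation order bugs Carr mentions: 3D rotations do not commute, so applying the same two rotations in a different order lands a point somewhere else. The sketch below is a generic, dependency-free illustration of that fact, not code from Cartwheel's pipeline; all helper names are made up for the example.

```python
import math


def rot_x(angle):
    """3x3 matrix for a rotation about the X axis (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]


def rot_z(angle):
    """3x3 matrix for a rotation about the Z axis (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]


def matmul(a, b):
    """Matrix product a @ b, meaning b is applied to a vector first, then a."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def apply(m, v):
    """Apply matrix m to column vector v."""
    return tuple(sum(m[i][k] * v[k] for k in range(3)) for i in range(3))


quarter_turn = math.pi / 2
point = (0.0, 1.0, 0.0)

# Same two quarter turns, opposite order.
x_then_z = apply(matmul(rot_z(quarter_turn), rot_x(quarter_turn)), point)
z_then_x = apply(matmul(rot_x(quarter_turn), rot_z(quarter_turn)), point)

print("X then Z:", [round(c, 6) for c in x_then_z])
print("Z then X:", [round(c, 6) for c in z_then_x])
```

Rotating about X first sends the point up the Z axis, where the Z rotation leaves it alone; rotating about Z first sends it along the negative X axis, where the X rotation leaves it alone. An animation tool that stores angles in one order and replays them in another will show exactly this kind of mismatch.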

Pricing, licensing, and availability

For developers, the most striking aspect of the 3.1 Pro release is the “reasoning-to-dollar” ratio. When Gemini 3 Pro launched, it was positioned in the mid-high price range at $2.00 per million input tokens for standard prompts. Gemini 3.1 Pro maintains this exact pricing structure, effectively offering a massive performance upgrade at no additional cost to API users.

  • Input Price: $2.00 per 1M tokens for prompts up to 200k; $4.00 per 1M tokens for prompts over 200k.

  • Output Price: $12.00 per 1M tokens for prompts up to 200k; $18.00 per 1M tokens for prompts over 200k.

  • Context Caching: Billed at $0.20 to $0.40 per 1M tokens depending on prompt size, plus a storage fee of $4.50 per 1M tokens per hour.

  • Search Grounding: 5,000 prompts per month are free, followed by a charge of $14 per 1,000 search queries.
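As a rough guide to what these rates mean per request, the sketch below estimates the cost of a single call from the input and output prices listed above. It assumes the tier is chosen by prompt (input) token count and applies to the whole request, and it leaves out context caching and search grounding; the function name `estimate_cost` is hypothetical and is not part of any Google SDK.

```python
# Per-1M-token rates in USD, copied from the list above.
RATES = {
    "standard": {"input": 2.00, "output": 12.00},   # prompts up to 200k tokens
    "long": {"input": 4.00, "output": 18.00},       # prompts over 200k tokens
}
LONG_PROMPT_THRESHOLD = 200_000  # assumption: tier is selected by prompt size


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request, excluding caching and grounding."""
    tier = "long" if input_tokens > LONG_PROMPT_THRESHOLD else "standard"
    rates = RATES[tier]
    total = input_tokens * rates["input"] + output_tokens * rates["output"]
    return round(total / 1_000_000, 6)


short_request = estimate_cost(input_tokens=100_000, output_tokens=10_000)
long_request = estimate_cost(input_tokens=300_000, output_tokens=10_000)
print(f"100k-token prompt: ${short_request:.2f}")
print(f"300k-token prompt: ${long_request:.2f}")
```

Under these assumptions, a 100k-token prompt with 10k output tokens comes to about $0.32, while a 300k-token prompt with the same output comes to about $1.38, since both the input and output rates step up once the prompt crosses 200k tokens.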

For consumers, the model is rolling out in the Gemini app and NotebookLM with higher limits for Google AI Pro and Ultra subscribers.

Licensing implications

As a proprietary model offered through Vertex Studio in Google Cloud and the Gemini API, 3.1 Pro follows a standard commercial SaaS (Software as a Service) model rather than an open-source license.

For enterprise users, this provides “grounded reasoning” within the security perimeter of Vertex AI, allowing businesses to operate on their own data with confidence.

The “Preview” status allows Google to refine the model’s safety and performance before general availability, a common practice in high-stakes AI deployment.

By doubling down on core reasoning and specialized benchmarks like ARC-AGI-2, Google is signaling that the next phase of the AI race will be won by models that can think through a problem, not just predict the next word.




Disclaimer: This article is sourced from external platforms. OverBeta has not independently verified the information. Readers are advised to verify details before relying on them.
