Google has launched Gemini 3 Flash, a new “fast and cheap” model designed to deliver stronger performance at lower latency, and has made it the default model inside the Gemini app worldwide. The company is also setting Gemini 3 Flash as the default model powering AI Mode in Google Search, a move that signals how aggressively Google is standardizing its latest AI capabilities across consumer surfaces.
The release, reported by TechCrunch, positions Gemini 3 Flash as the successor to Gemini 2.5 Flash and a companion to higher-tier options like Gemini 3 Pro. Google’s strategy is clear: push a capable “everyday” model to the mass market while keeping an upgrade path for users who need stronger math and coding performance.
What Gemini 3 Flash is—and why “default” matters
Gemini 3 Flash is part of the Gemini 3 family released last month and is optimized for speed and cost efficiency. In practice, “Flash” models are intended to handle the bulk of daily AI interactions—summarization, quick reasoning tasks, multimodal understanding, and conversational help—without the heavier compute of premium models.
Making Flash the default inside the Gemini app is significant because defaults shape behavior. Most users never change model settings, meaning Google’s new baseline experience for millions of people will now be Gemini 3 Flash. The same logic applies to AI Mode in Search: by setting a faster model as default, Google can scale AI responses more broadly while controlling infrastructure costs and response times.
Benchmarks: Google claims big gains over Gemini 2.5 Flash
Google is emphasizing measurable improvements, pointing to benchmark performance that it says represents a substantial leap from Gemini 2.5 Flash and, in some tests, lands in the neighborhood of other frontier systems.
Humanity’s Last Exam: a jump in “no tools” performance
On the benchmark known as Humanity’s Last Exam—designed to test expertise across domains—Gemini 3 Flash scored 33.7% without tool use, according to the figures cited by TechCrunch. For context, the report compares that to 37.5% for Gemini 3 Pro, 11% for Gemini 2.5 Flash, and 34.5% for GPT-5.2. While benchmark results don’t always translate directly to real-world quality, the jump from 11% to 33.7% suggests a meaningful improvement in baseline reasoning and knowledge recall for the “Flash” tier.
MMMU-Pro: multimodal reasoning focus
Google also highlighted the MMMU-Pro benchmark, which tests multimodality and reasoning, where Gemini 3 Flash reportedly scored 81.2%—outscoring competitors in that comparison set. That’s a noteworthy claim because multimodal capability (understanding text alongside images, audio, or video) is increasingly central to consumer AI products, from “explain what’s in this photo” to “analyze this recording.”
Consumer rollout: Gemini app users get Flash by default
Google says Gemini 3 Flash is rolling out globally as the default model in the Gemini app, replacing Gemini 2.5 Flash. Users will still be able to manually select Gemini 3 Pro from the model picker, particularly for “math and coding questions,” per the report.
This split reflects a broader pattern across AI platforms: a fast default model for general usage, with a higher-powered model available when users want more deliberate reasoning or specialized output. For Google, it also clarifies the product lineup—most people want quick help, while power users can opt into heavier compute when it matters.
Multimodal features: video tips, sketches, and audio analysis
Google is pitching Gemini 3 Flash as better at identifying and responding to multimodal content. Examples cited include uploading a short video—such as a pickleball clip—and asking for tips, sketching a drawing and having the model guess what it is, or uploading an audio recording for analysis or quiz generation.
Those scenarios underscore where consumer AI is headed: not only answering questions, but interpreting the user’s world through media inputs. If Gemini 3 Flash can deliver those experiences quickly and reliably, it becomes a practical assistant rather than a novelty—especially when embedded into an app people already use.
Search implications: faster AI answers, broader adoption
By making Gemini 3 Flash the default in AI Mode in Search, Google is effectively betting that speed and cost efficiency are essential to scaling AI-generated results. Search is a high-volume product; even small increases in latency can degrade user experience, and compute-heavy models can become expensive at scale.
Google also said the model better understands the intent behind queries and can generate more visual answers using elements like images and tables. If that capability becomes common in AI Mode, it could shift how users interact with results—moving from link-first browsing toward more structured, synthesized responses, especially for how-to queries, comparisons, and information lookups.
Competitive context: pressure from frontier models
The timing and positioning suggest a competitive response to rapid iteration from rivals. Google’s benchmark comparisons explicitly reference GPT-5.2 and its own Gemini 3 Pro, framing Gemini 3 Flash as capable enough to be the default while still approaching frontier performance in certain measures.
For consumers, the practical question is whether “fast and cheap” comes with noticeable trade-offs. Google’s decision to keep Pro available for math and coding implies that, for complex tasks, the company expects some users will still prefer a more powerful model. But for everyday use—summaries, explanations, media understanding, and quick reasoning—Google is signaling that Flash is now good enough to represent the Gemini brand by default.
With Gemini 3 Flash now set as the standard experience in the Gemini app and AI Mode in Search, Google is pushing its latest AI model into the places where users ask questions most—turning model upgrades into an immediate, mainstream product change rather than a niche feature for enthusiasts.