Gemini 3.5 Flash launched today - quick breakdown for anyone running agent workloads

Google shipped 3.5 Flash at I/O 2026. The “budget” Flash model now beats 3.1 Pro on coding and tool-calling benchmarks.

Key numbers (from Google):

  • MCP Atlas (tool calling): 83.6% vs 3.1 Pro’s 78.2%
  • Terminal-Bench (coding): 76.2% vs 70.3%
  • Finance Agent v2: 57.9% vs 43.0%
  • 4x faster, ~40% cheaper than Pro
  • $1.50/M input, $9/M output, $0.15/M cached

Where it does NOT win:

  • Computer Use: not supported (GPT-5.5 only)
  • SWE-Bench Pro: Opus 4.7 still leads
  • Abstract reasoning: 3.1 Pro still edges it

My quick take on model routing:

  • Multi-tool agent loops → Flash
  • Heavy code refactoring → Opus 4.7
  • GUI automation → GPT-5.5

Anyone tested it on real agent workflows yet? Curious how the 4x speed claim holds up in practice.