Google released Gemini Embedding 2, its first natively multimodal embedding model. It produces 3,072-dimensional vectors, supports inputs up to 8,192 tokens, and embeds text, images, audio, and video in one model. It's currently free in public preview via the Gemini API.
We added it to our Embedding Leaderboard and ran it against all 17 existing models across 7 retrieval datasets.
TL;DR
- Gemini Embedding 2 takes #1 with 1605 Elo and a 59.5% win rate
- Fewer than 20 Elo points separate the top three: Gemini Embedding 2, zembed-1, and Voyage 4
- Strongest on scientific retrieval (70.6% on SciFact) and Arabic QA (59.6% on ARCD)
- Weakest on financial QA (50.6% on FiQA) - barely above a coin flip
- Beats its predecessor Gemini text-embedding-004 in 80% of matchups
What We Found
Leads the leaderboard, narrowly
Gemini Embedding 2 reaches 1605 Elo; zembed-1 sits at 1590 and Voyage 4 at 1586. The gap is real but narrow: with a different query set, the ordering could shift.
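To put that gap in perspective, the standard Elo expected-score formula converts a rating difference into a head-to-head win probability. A minimal sketch (the function name is ours, not from the leaderboard code):

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))

# The 1605 vs. 1586 gap at the top implies only a slight head-to-head edge:
print(round(elo_expected_score(1605, 1586), 3))  # ~0.527
```

A 19-point Elo lead translates to winning roughly 53% of head-to-head matchups, which is why the top-three ordering should be treated as provisional.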

Below #3, there's a visible drop. Jina v5 Small and OpenAI text-3-large form the next tier around 1563-1566.
Domain performance varies
The overall win rate hides variance across datasets.

SciFact is the strongest result: 71% win rate. ARCD (Arabic QA) is also strong at 60%.
FiQA (financial QA) is the weakest at 51%, barely above a coin flip. Financial retrieval rewards exact terminology and numeric patterns, and the model's generalist training doesn't fully capture that. MSMARCO also comes in below 50%, meaning that on general short-query retrieval it doesn't consistently beat the top tier.
Edges the top tier, pulls away from the mid-tier
Against zembed-1 and Voyage 4, Gemini Embedding 2 wins 54% of matchups. Against mid-tier models, the gap widens.

The clearest signal is the predecessor comparison: against text-embedding-004, Gemini Embedding 2 wins 80% of matchups (48-6). That's not a refinement. It's a different model class.
How We Tested
- Each model embedded the same corpus and queries
- Retrieved top-5 results per query, shown to an LLM judge
- Judge picks which model's results are more relevant
- 7 datasets: MSMARCO, FiQA, SciFact, DBpedia, ARCD, plus two internal sets
- Elo computed from 1,065 pairwise judgments
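The post doesn't specify the exact Elo variant, so here is a minimal sketch of one common approach: a sequential K-factor update over a stream of (winner, loser) judge verdicts. The model names and K value are illustrative assumptions.

```python
from collections import defaultdict

def elo_from_judgments(judgments, k=32, base=1500.0):
    """Sequential Elo over (winner, loser) pairs; updates are zero-sum."""
    ratings = defaultdict(lambda: base)
    for winner, loser in judgments:
        expected_win = 1.0 / (
            1.0 + 10 ** ((ratings[loser] - ratings[winner]) / 400)
        )
        delta = k * (1.0 - expected_win)  # surprise-weighted update
        ratings[winner] += delta
        ratings[loser] -= delta
    return dict(ratings)

# Toy example with hypothetical labels (not the leaderboard's real data):
judgments = [("model-a", "model-b")] * 4 + [("model-b", "model-a")]
ratings = elo_from_judgments(judgments)
```

Note that sequential Elo is sensitive to judgment order; leaderboards often average ratings over many shuffled orderings, or fit a Bradley-Terry model instead, to get order-independent scores.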
Recommendation
Gemini Embedding 2 is a reasonable default for new pipelines. It leads the leaderboard, beats most models clearly, and costs nothing during preview.
If you're already on zembed-1 or Voyage 4, there's no strong reason to switch. The top three are within noise of each other. At that tier, your chunking strategy or reranker matters more than which embedding model you pick.
We'll keep it on the leaderboard. As the preview ends and pricing is announced, we'll see whether the performance holds at scale.
See the full rankings on the Embedding Leaderboard.
