Gemini Embedding 2

Gemini Embedding 2 is Google's first natively multimodal embedding model built on the Gemini architecture. It embeds text (up to 8,192 tokens), images, video, audio, and documents into a single unified embedding space and supports 100+ languages. It is trained with Matryoshka Representation Learning, so the output dimension is flexible (3072, 1536, or 768). The model is available in Public Preview via the Gemini API. If you want to compare the best embedding models for your data, try Agentset.
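To make the flexible output dimensions concrete, here is a minimal sketch of Matryoshka-style truncation done on the client: keep the leading components of a full vector and rescale to unit length. The function name `truncateEmbedding` and the toy vector are illustrative assumptions, not part of the Gemini API, and the API may offer its own way to request a smaller size.

```typescript
// Sketch only: client-side Matryoshka-style truncation of an embedding.
// Assumption: the leading `dim` components of the full vector form a usable
// lower-dimensional embedding, which is the property MRL training aims for.

/** Keep the first `dim` components and rescale the result to unit L2 norm. */
function truncateEmbedding(full: number[], dim: number): number[] {
  if (!Number.isInteger(dim) || dim <= 0 || dim > full.length) {
    throw new RangeError(`dim must be an integer in 1..${full.length}`);
  }
  const head = full.slice(0, dim);
  const norm = Math.sqrt(head.reduce((sum, x) => sum + x * x, 0));
  // Leave an all-zero vector untouched to avoid dividing by zero.
  return norm === 0 ? head : head.map((x) => x / norm);
}

// Toy 4-dimensional vector standing in for a 3072-dimensional embedding.
const toy = [3, 4, 12, 0];
const reduced = truncateEmbedding(toy, 2);
console.log(reduced); // two components with unit L2 norm
```

Renormalizing after truncation keeps dot products comparable to cosine similarity, which matters if the vector store assumes unit-length vectors.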

Leaderboard Rank

of 18

ELO Rating: 1605
Win Rate: 59.5%
Accuracy (nDCG@10): 0.628 (#11)
Latency: 435ms (#18)

Model Information

Provider: Google
License: Proprietary
Price per 1M tokens: $0.000
Dimensions: 3072
Release Date: 2026-03-10
Model Name: gemini-embedding-2-preview
Total Evaluations: 1065

Performance Record

Wins: 634 (59.5%)
Losses: 316 (29.7%)
Ties: 115 (10.8%)
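The percentages in the performance record follow from the raw counts. The sketch below assumes each rate is the count divided by all evaluations (wins + losses + ties), which reproduces the figures shown; the `MatchRecord` type and `rates` helper are hypothetical names used for illustration.

```typescript
// Sketch: derive win, loss, and tie rates from a head-to-head record.
// Assumption: rate = count / (wins + losses + ties).

interface MatchRecord {
  wins: number;
  losses: number;
  ties: number;
}

function rates(r: MatchRecord): { total: number; winPct: string; lossPct: string; tiePct: string } {
  const total = r.wins + r.losses + r.ties;
  const pct = (n: number): string => ((100 * n) / total).toFixed(1);
  return { total, winPct: pct(r.wins), lossPct: pct(r.losses), tiePct: pct(r.ties) };
}

// Counts taken from the performance record on this page.
const overall = rates({ wins: 634, losses: 316, ties: 115 });
console.log(overall); // total 1065, winPct "59.5", lossPct "29.7", tiePct "10.8"
```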


Embedding Models Are Just One Piece of RAG

Agentset gives you a managed RAG pipeline with the top-ranked models and best practices baked in. No infrastructure to maintain, no embeddings to manage.


Trusted by teams building production RAG applications: 5M+ documents, 1,500+ teams, 99.9% uptime.

Performance Overview

ELO ratings by dataset

Gemini Embedding 2's ELO performance varies across different benchmark datasets, showing its strengths in specific domains.
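For readers unfamiliar with ELO ratings, the sketch below shows the textbook update rule for a single head-to-head comparison. This page does not state how the leaderboard computes its ratings, so the K-factor of 32, the 400-point scale, and the function names are assumptions for illustration only.

```typescript
// Sketch of the textbook Elo update. The K-factor (32) and the 400-point
// scale are conventional defaults, not values confirmed for this leaderboard.

/** Expected score of A against B under the standard logistic Elo model. */
function expectedScore(ratingA: number, ratingB: number): number {
  return 1 / (1 + Math.pow(10, (ratingB - ratingA) / 400));
}

/** New rating for A after one comparison; score is 1 (win), 0.5 (tie), or 0 (loss). */
function updateRating(ratingA: number, ratingB: number, score: number, k = 32): number {
  return ratingA + k * (score - expectedScore(ratingA, ratingB));
}

const evenMatch = expectedScore(1500, 1500); // 0.5
const afterWin = updateRating(1500, 1500, 1); // 1516
console.log(evenMatch, afterWin);
```

Under this rule a win against an equally rated model moves the rating by half the K-factor, while a win against a much weaker model moves it very little.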

[Chart: Gemini Embedding 2 - ELO by Dataset]

Detailed Metrics

Dataset breakdown

Performance metrics across different benchmark datasets, including accuracy and latency percentiles.
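As a reference for the accuracy metrics, here is a minimal sketch of nDCG@k and Recall@k for a single query. It assumes linear gain with a log2 rank discount; the benchmark's exact gain function and relevance labels are not stated on this page, and the document IDs are made up.

```typescript
// Sketch of nDCG@k and Recall@k for one query.
// Assumption: gain is rel / log2(rank + 1) with 1-based ranks.

/** Discounted cumulative gain over the first k relevance grades. */
function dcgAtK(relevances: number[], k: number): number {
  return relevances
    .slice(0, k)
    .reduce((sum, rel, i) => sum + rel / Math.log2(i + 2), 0);
}

/**
 * nDCG@k: DCG of the ranking divided by the DCG of the ideal ordering.
 * `ranked` is the relevance of each returned document in rank order;
 * `allRelevant` is the relevance grade of every relevant document for the query.
 */
function ndcgAtK(ranked: number[], allRelevant: number[], k: number): number {
  const ideal = [...allRelevant].sort((a, b) => b - a);
  const idcg = dcgAtK(ideal, k);
  return idcg === 0 ? 0 : dcgAtK(ranked, k) / idcg;
}

/** Recall@k: fraction of all relevant documents that appear in the top k. */
function recallAtK(rankedIds: string[], relevantIds: Set<string>, k: number): number {
  if (relevantIds.size === 0) return 0;
  const hits = rankedIds.slice(0, k).filter((id) => relevantIds.has(id)).length;
  return hits / relevantIds.size;
}

// Hypothetical query: 3 relevant documents exist, 2 of them appear in the top 5.
const rankedIds = ["d7", "d2", "d9", "d4", "d1"];
const relevantIds = new Set(["d7", "d4", "d8"]);
const rankedRelevance = rankedIds.map((id) => (relevantIds.has(id) ? 1 : 0));

const recall5 = recallAtK(rankedIds, relevantIds, 5); // 2 of 3 relevant documents found
const ndcg5 = ndcgAtK(rankedRelevance, [1, 1, 1], 5);
console.log(recall5, ndcg5);
```

Recall@k can never exceed k divided by the number of relevant documents, so a dataset with many relevant documents per query can show low recall alongside high nDCG. That may be one reason the two metrics diverge on some datasets here, though the page does not confirm it.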

FiQa

ELO: 1500
Win Rate: 50.6%
Record: 86W-73L-11T

Accuracy Metrics

nDCG@5: 0.843
nDCG@10: 0.835
Recall@5: 0.763
Recall@10: 0.816

Latency Distribution

Mean: 466ms
P50 (Median): 454ms
P90: 605ms

MSMARCO

ELO: 1500
Win Rate: 47.3%
Record: 80W-55L-34T

Accuracy Metrics

nDCG@5: 0.956
nDCG@10: 0.939
Recall@5: 0.122
Recall@10: 0.221

Latency Distribution

Mean: 441ms
P50 (Median): 446ms
P90: 584ms

SciFact

ELO: 1500
Win Rate: 70.6%
Record: 120W-37L-13T

Accuracy Metrics

nDCG@5: 0.871
nDCG@10: 0.871
Recall@5: 0.959
Recall@10: 0.959

Latency Distribution

Mean: 404ms
P50 (Median): 360ms
P90: 537ms

DBPedia

ELO: 1500
Win Rate: 59.4%
Record: 101W-50L-19T

Accuracy Metrics

nDCG@5: 0.788
nDCG@10: 0.792
Recall@5: 0.061
Recall@10: 0.120

Latency Distribution

Mean: 436ms
P50 (Median): 432ms
P90: 592ms

business reports

ELO: 1500
Win Rate: 64.1%
Record: 109W-55L-6T

Accuracy Metrics

nDCG@5: 0.091
nDCG@10: 0.084
Recall@5: 0.012
Recall@10: 0.020

Latency Distribution

Mean: 439ms
P50 (Median): 431ms
P90: 603ms

ARCD

ELO: 1500
Win Rate: 59.6%
Record: 99W-35L-32T

Accuracy Metrics

nDCG@5: 0.868
nDCG@10: 0.875
Recall@5: 0.940
Recall@10: 0.960

Latency Distribution

Mean: 410ms
P50 (Median): 359ms
P90: 586ms

PG

ELO: 1500
Win Rate: 78.0%
Record: 39W-11L-0T

Accuracy Metrics

nDCG@5: 0.000
nDCG@10: 0.000
Recall@5: 0.000
Recall@10: 0.000

Latency Distribution

Mean: 448ms
P50 (Median): 431ms
P90: 595ms

Build RAG in Minutes, Not Months

Agentset gives you a complete RAG API with top-ranked embedding models and smart retrieval built in. Upload your data, call the API, and get accurate results from day one.


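// Query an existing Agentset namespace and print the matching text.
import { Agentset } from "agentset";

// Create the client; how credentials are supplied is not shown in this snippet.
const agentset = new Agentset();
// "ns_1234" is a placeholder namespace ID.
const ns = agentset.namespace("ns_1234");

// Run a search over the documents in the namespace.
const results = await ns.search(
  "What is multi-head attention?"
);

// Print the text of each result.
for (const result of results) {
  console.log(result.text);
}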