
Gemini Embedding 2

Google's first natively multimodal embedding model, built on the Gemini architecture. It supports text (up to 8,192 tokens), images, video, audio, and documents in a single unified embedding space across 100+ languages. It is trained with Matryoshka Representation Learning, which allows flexible output dimensions (3072, 1536, or 768). Available in Public Preview via the Gemini API. To compare the best embedding models for your data, try Agentset.
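Matryoshka-trained embeddings are commonly shortened by keeping the leading components and rescaling to unit length. The sketch below assumes that convention applies to this model; `truncateEmbedding` is a hypothetical helper, not part of the Gemini API, and the input is a toy stand-in for a real 3072-dimensional vector.

```typescript
// Hypothetical helper: keep the first `dims` components of an embedding and
// rescale the result to unit length so cosine similarity stays meaningful.
function truncateEmbedding(embedding: number[], dims: number): number[] {
  if (dims <= 0 || dims > embedding.length) {
    throw new RangeError(`dims must be between 1 and ${embedding.length}`);
  }
  const head = embedding.slice(0, dims);
  const norm = Math.sqrt(head.reduce((sum, x) => sum + x * x, 0));
  // An all-zero prefix cannot be normalized; return it unchanged.
  return norm === 0 ? head : head.map((x) => x / norm);
}

// Toy stand-in for a full-size vector (assumption: not real model output).
const full: number[] = Array.from({ length: 3072 }, (_, i) => Math.sin(i + 1));
const half = truncateEmbedding(full, 1536);
const quarter = truncateEmbedding(full, 768);
console.log(half.length, quarter.length); // 1536 768
```

Smaller vectors reduce storage and search cost; how much retrieval quality is lost at each size is not reported on this page and would need to be measured on your own data.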

Leaderboard Rank: #1 of 18
ELO Rating: 1605 (#1)
Win Rate: 59.5% (#1)
Accuracy (nDCG@10): 0.628 (#11)
Latency: 435ms (#18)

Model Information

Provider: Google
License: Proprietary
Price per 1M tokens: $0.000
Dimensions: 3072
Release Date: 2026-03-10
Model Name: gemini-embedding-2-preview
Total Evaluations: 1065

Performance Record

Wins: 634 (59.5%)
Losses: 316 (29.7%)
Ties: 115 (10.8%)
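The percentages in this record match each outcome divided by all 1065 evaluations, with ties counted in the denominator. A minimal sketch of that arithmetic follows; `MatchRecord` and `outcomeShares` are illustrative names, not part of any leaderboard API.

```typescript
interface MatchRecord {
  wins: number;
  losses: number;
  ties: number;
}

interface OutcomeShares {
  total: number;
  winRate: string;
  lossRate: string;
  tieRate: string;
}

// Express each outcome as a share of all evaluations, to one decimal place.
function outcomeShares(record: MatchRecord): OutcomeShares {
  const total = record.wins + record.losses + record.ties;
  const pct = (n: number): string => ((100 * n) / total).toFixed(1);
  return {
    total,
    winRate: pct(record.wins),
    lossRate: pct(record.losses),
    tieRate: pct(record.ties),
  };
}

// Figures copied from the performance record on this page.
const overall = outcomeShares({ wins: 634, losses: 316, ties: 115 });
console.log(overall);
// { total: 1065, winRate: '59.5', lossRate: '29.7', tieRate: '10.8' }
```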

Embedding Models Are Just One Piece of RAG

Agentset gives you a managed RAG pipeline with the top-ranked models and best practices baked in. No infrastructure to maintain, no embeddings to manage.

Trusted by teams building production RAG applications

5M+ Documents
1,500+ Teams
99.9% Uptime

Performance Overview

ELO ratings by dataset

Gemini Embedding 2's ELO performance varies across different benchmark datasets, showing its strengths in specific domains.

Gemini Embedding 2 - ELO by Dataset
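For readers unfamiliar with ELO ratings, the standard update rule can be sketched as follows. The page does not describe how the leaderboard computes its ratings, so the K-factor of 32 and the treatment of a tie as half a win are assumptions taken from the conventional Elo system, not from the source.

```typescript
// Probability-like expected score of player A against player B (standard Elo).
function expectedScore(ratingA: number, ratingB: number): number {
  return 1 / (1 + Math.pow(10, (ratingB - ratingA) / 400));
}

// New rating after one comparison. score: 1 = win, 0.5 = tie, 0 = loss.
// k = 32 is an assumed K-factor; the leaderboard's actual value is not stated.
function updateRating(
  rating: number,
  opponent: number,
  score: 0 | 0.5 | 1,
  k: number = 32
): number {
  return rating + k * (score - expectedScore(rating, opponent));
}

// Two models that both start at 1500 are expected to split their matches evenly.
const afterWin = updateRating(1500, 1500, 1);
const afterTie = updateRating(1500, 1500, 0.5);
console.log(afterWin, afterTie); // 1516 1500
```

Under this rule a model gains more for beating a higher-rated opponent than a lower-rated one, which is why a rating can differ from what the raw win rate alone would suggest.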

Detailed Metrics

Dataset breakdown

Performance metrics across different benchmark datasets, including accuracy and latency percentiles.
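The accuracy figures below are nDCG@k and Recall@k. As a rough guide to what they measure, here is a sketch using the textbook definitions with linear gain; the benchmark's exact gain function and tie handling are not stated on this page, so treat those details as assumptions. All function names and the sample document IDs are illustrative.

```typescript
// rankedRelevance[i] is the graded relevance of the document at rank i + 1.
function dcgAtK(rankedRelevance: number[], k: number): number {
  return rankedRelevance
    .slice(0, k)
    .reduce((sum, rel, i) => sum + rel / Math.log2(i + 2), 0);
}

// Normalize by the DCG of the best possible ordering of all judged documents.
function ndcgAtK(rankedRelevance: number[], allRelevance: number[], k: number): number {
  const ideal = [...allRelevance].sort((a, b) => b - a);
  const idcg = dcgAtK(ideal, k);
  return idcg === 0 ? 0 : dcgAtK(rankedRelevance, k) / idcg;
}

// Fraction of all relevant documents that appear in the top k results.
function recallAtK(rankedIds: string[], relevantIds: Set<string>, k: number): number {
  if (relevantIds.size === 0) return 0;
  const hits = rankedIds.slice(0, k).filter((id) => relevantIds.has(id)).length;
  return hits / relevantIds.size;
}

// Made-up example: three relevant documents, one of them retrieved in the top 3.
const rankedIds = ["d3", "d1", "d7", "d2"];
const relevant = new Set(["d1", "d2", "d9"]);
const gains = rankedIds.map((id) => (relevant.has(id) ? 1 : 0));
console.log(ndcgAtK(gains, [1, 1, 1], 3).toFixed(3));
console.log(recallAtK(rankedIds, relevant, 3).toFixed(3));
```

Because Recall@k divides by the total number of relevant documents, a dataset with many relevant documents per query can show high nDCG alongside low recall at small k.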

FiQa

ELO 1500 | 50.6% WR | 86W-73L-11T

Accuracy Metrics

nDCG@5: 0.843
nDCG@10: 0.835
Recall@5: 0.763
Recall@10: 0.816

Latency Distribution

Mean: 466ms
P50 (Median): 454ms
P90: 605ms
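Each dataset reports mean, P50, and P90 latency. The sketch below shows one way such summaries can be derived from raw timings, using the nearest-rank percentile method; the page does not say which percentile method the benchmark uses, and the sample timings are invented for illustration.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1];
}

function mean(samples: number[]): number {
  if (samples.length === 0) throw new Error("no samples");
  return samples.reduce((sum, x) => sum + x, 0) / samples.length;
}

// Invented request timings in milliseconds (not benchmark data).
const timingsMs = [455, 360, 610, 430, 480, 410, 520, 450, 600, 470];
console.log(mean(timingsMs), percentile(timingsMs, 50), percentile(timingsMs, 90));
// 478.5 455 600
```

A P90 well above the median, as in the figures on this page, indicates a tail of slower requests that the mean alone would hide.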

MSMARCO

ELO 1500 | 47.3% WR | 80W-55L-34T

Accuracy Metrics

nDCG@5: 0.956
nDCG@10: 0.939
Recall@5: 0.122
Recall@10: 0.221

Latency Distribution

Mean: 441ms
P50 (Median): 446ms
P90: 584ms

SciFact

ELO 1500 | 70.6% WR | 120W-37L-13T

Accuracy Metrics

nDCG@5: 0.871
nDCG@10: 0.871
Recall@5: 0.959
Recall@10: 0.959

Latency Distribution

Mean: 404ms
P50 (Median): 360ms
P90: 537ms

DBPedia

ELO 1500 | 59.4% WR | 101W-50L-19T

Accuracy Metrics

nDCG@5: 0.788
nDCG@10: 0.792
Recall@5: 0.061
Recall@10: 0.120

Latency Distribution

Mean: 436ms
P50 (Median): 432ms
P90: 592ms

business reports

ELO 1500 | 64.1% WR | 109W-55L-6T

Accuracy Metrics

nDCG@5: 0.091
nDCG@10: 0.084
Recall@5: 0.012
Recall@10: 0.020

Latency Distribution

Mean: 439ms
P50 (Median): 431ms
P90: 603ms

ARCD

ELO 1500 | 59.6% WR | 99W-35L-32T

Accuracy Metrics

nDCG@5: 0.868
nDCG@10: 0.875
Recall@5: 0.940
Recall@10: 0.960

Latency Distribution

Mean: 410ms
P50 (Median): 359ms
P90: 586ms

PG

ELO 1500 | 78.0% WR | 39W-11L-0T

Accuracy Metrics

nDCG@5: 0.000
nDCG@10: 0.000
Recall@5: 0.000
Recall@10: 0.000

Latency Distribution

Mean: 448ms
P50 (Median): 431ms
P90: 595ms
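The headline accuracy (0.628) and latency (435ms) are consistent with an unweighted mean over the seven datasets listed above. The page does not state how the aggregates are computed, so the sketch below is a consistency check under that assumption, not a description of the leaderboard's method. The numbers are copied from the dataset breakdown; the variable names are illustrative.

```typescript
interface DatasetResult {
  name: string;
  ndcg10: number;
  meanLatencyMs: number;
}

// Per-dataset nDCG@10 and mean latency, copied from the breakdown on this page.
const datasets: DatasetResult[] = [
  { name: "FiQa", ndcg10: 0.835, meanLatencyMs: 466 },
  { name: "MSMARCO", ndcg10: 0.939, meanLatencyMs: 441 },
  { name: "SciFact", ndcg10: 0.871, meanLatencyMs: 404 },
  { name: "DBPedia", ndcg10: 0.792, meanLatencyMs: 436 },
  { name: "business reports", ndcg10: 0.084, meanLatencyMs: 439 },
  { name: "ARCD", ndcg10: 0.875, meanLatencyMs: 410 },
  { name: "PG", ndcg10: 0.0, meanLatencyMs: 448 },
];

// Unweighted (macro) average: every dataset counts equally.
function macroAverage(values: number[]): number {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

const avgNdcg10 = macroAverage(datasets.map((d) => d.ndcg10)).toFixed(3);
const avgLatencyMs = Math.round(macroAverage(datasets.map((d) => d.meanLatencyMs)));
console.log(avgNdcg10, avgLatencyMs); // 0.628 435
```

Under this reading, two datasets (business reports at 0.084 and PG at 0.000) pull the average well below the nDCG@10 of the other five, which all fall between 0.792 and 0.939.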

Build RAG in Minutes, Not Months

Agentset gives you a complete RAG API with top-ranked embedding models and smart retrieval built in. Upload your data, call the API, and get accurate results from day one.

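import { Agentset } from "agentset";

// Create a client and select the namespace to query.
const agentset = new Agentset();
const ns = agentset.namespace("ns_1234");

// Run a search against the namespace.
const results = await ns.search(
  "What is multi-head attention?"
);

// Print the text of each returned result.
for (const result of results) {
  console.log(result.text);
}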

Compare Models

See how it stacks up

Compare Gemini Embedding 2 with other top embeddings to understand the differences in performance, accuracy, and latency.