Gemini text-embedding-004

Supports a 3,000-token context length and task-type specification for retrieval and classification. This is a legacy model scheduled for deprecation on January 14, 2026; its replacement is gemini-embedding-001. If you want to compare the best embedding models for your data, try Agentset.
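To make "task-type specification" concrete, here is a minimal sketch of how an embedding request with a task type might be assembled. It assumes the public Gemini REST endpoint (`embedContent` under `v1beta`) and its documented task-type names; neither is stated on this page, and `buildEmbedRequest` is a hypothetical helper. The snippet only builds the URL and JSON body, so sending it (with an API key) is left to the caller.

```typescript
// Task-type names as documented for the Gemini embedding API (assumption:
// not listed on this page).
type TaskType =
  | "RETRIEVAL_QUERY"
  | "RETRIEVAL_DOCUMENT"
  | "SEMANTIC_SIMILARITY"
  | "CLASSIFICATION"
  | "CLUSTERING";

interface EmbedRequest {
  url: string;
  body: {
    model: string;
    content: { parts: { text: string }[] };
    taskType: TaskType;
  };
}

// Hypothetical helper: assembles the request without performing any network call.
function buildEmbedRequest(text: string, taskType: TaskType): EmbedRequest {
  const model = "models/text-embedding-004";
  return {
    // Assumed endpoint shape; check the current API reference before relying on it.
    url: `https://generativelanguage.googleapis.com/v1beta/${model}:embedContent`,
    body: { model, content: { parts: [{ text }] }, taskType },
  };
}

// Documents and queries get different task types in a retrieval setup.
const docRequest = buildEmbedRequest("Quarterly revenue grew 12%.", "RETRIEVAL_DOCUMENT");
const queryRequest = buildEmbedRequest("How did revenue change?", "RETRIEVAL_QUERY");
console.log(JSON.stringify(docRequest.body));
console.log(JSON.stringify(queryRequest.body));
```

Using one task type for indexed documents and another for queries is the usual pattern for retrieval; classification workloads would pass `CLASSIFICATION` instead.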

Leaderboard Rank: #18 of 18
ELO Rating: 1366 (#18)
Win Rate: 28.4% (#18)
Accuracy (nDCG@10): 0.538 (#16)
Latency: 16ms

Model Information

Provider: Google
License: Proprietary
Price per 1M tokens: $0.020
Dimensions: 768
Release Date: 2024-05-14
Model Name: text-embedding-004
Total Evaluations: 830
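The price and dimension figures above translate directly into cost and storage estimates. The sketch below is illustrative only: the corpus size is hypothetical, and the storage figure assumes vectors are kept as 32-bit floats with no index overhead.

```typescript
// Figures taken from the Model Information list above.
const PRICE_PER_MILLION_TOKENS_USD = 0.02;
const DIMENSIONS = 768;

// Assumption: each dimension is stored as a 32-bit float (4 bytes).
const BYTES_PER_FLOAT32 = 4;

/** Cost in USD of embedding `tokens` tokens at the listed price. */
function embeddingCostUsd(tokens: number): number {
  return (tokens / 1e6) * PRICE_PER_MILLION_TOKENS_USD;
}

/** Raw storage in bytes for `vectorCount` vectors, excluding index overhead. */
function vectorStorageBytes(vectorCount: number): number {
  return vectorCount * DIMENSIONS * BYTES_PER_FLOAT32;
}

// Hypothetical corpus: 10,000 chunks of 500 tokens each.
const chunkCount = 10000;
const tokensPerChunk = 500;

console.log(`cost: $${embeddingCostUsd(chunkCount * tokensPerChunk).toFixed(2)}`);
console.log(`storage: ${vectorStorageBytes(chunkCount)} bytes`);
```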

Performance Record

Wins: 236 (28.4%)
Losses: 541 (65.2%)
Ties: 53 (6.4%)
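The overall record can be reproduced by summing the per-dataset records listed in the dataset breakdown on this page. The sketch below does that arithmetic; it assumes the win rate is wins divided by all evaluations, with ties counted in the denominator, which matches the percentages shown.

```typescript
interface MatchRecord {
  wins: number;
  losses: number;
  ties: number;
}

// Per-dataset records copied from the dataset breakdown on this page.
const datasetRecords: Record<string, MatchRecord> = {
  "business reports": { wins: 52, losses: 107, ties: 1 },
  DBPedia: { wins: 24, losses: 126, ties: 10 },
  FiQa: { wins: 40, losses: 105, ties: 5 },
  SciFact: { wins: 70, losses: 83, ties: 7 },
  MSMARCO: { wins: 48, losses: 90, ties: 22 },
  ARCD: { wins: 2, losses: 30, ties: 8 },
};

/** Sum wins, losses, and ties across all datasets. */
function totalRecord(records: Record<string, MatchRecord>): MatchRecord {
  const total: MatchRecord = { wins: 0, losses: 0, ties: 0 };
  for (const name of Object.keys(records)) {
    total.wins += records[name].wins;
    total.losses += records[name].losses;
    total.ties += records[name].ties;
  }
  return total;
}

/** Share of `part` in `whole` as a percentage string with one decimal place. */
function percentOf(part: number, whole: number): string {
  return ((part / whole) * 100).toFixed(1);
}

const overall = totalRecord(datasetRecords);
const evaluations = overall.wins + overall.losses + overall.ties;

console.log(`evaluations: ${evaluations}`);
console.log(`win rate: ${percentOf(overall.wins, evaluations)}%`);
console.log(`loss rate: ${percentOf(overall.losses, evaluations)}%`);
console.log(`tie rate: ${percentOf(overall.ties, evaluations)}%`);
```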

Embedding Models Are Just One Piece of RAG

Agentset gives you a managed RAG pipeline with the top-ranked models and best practices baked in. No infrastructure to maintain, no embeddings to manage.


Trusted by teams building production RAG applications

Documents: 5M+
Teams: 1,500+
Uptime: 99.9%

Performance Overview

ELO ratings by dataset

Gemini text-embedding-004's results vary across benchmark datasets. Its win rate is highest on SciFact (43.8%) and lowest on ARCD (5.0%).
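This page does not say how its ELO ratings are computed. As a point of reference, the sketch below shows the standard Elo update for a single head-to-head comparison; the K-factor of 32 and the scoring of a tie as 0.5 are assumptions, not details taken from the leaderboard.

```typescript
/** Probability that A beats B under the standard Elo model. */
function expectedScore(ratingA: number, ratingB: number): number {
  return 1 / (1 + Math.pow(10, (ratingB - ratingA) / 400));
}

/**
 * Update both ratings after one comparison.
 * scoreA is 1 for a win by A, 0 for a loss, and 0.5 for a tie.
 * kFactor = 32 is an assumed value, not the leaderboard's setting.
 */
function updateElo(
  ratingA: number,
  ratingB: number,
  scoreA: number,
  kFactor: number = 32
): [number, number] {
  const delta = kFactor * (scoreA - expectedScore(ratingA, ratingB));
  return [ratingA + delta, ratingB - delta];
}

// Two models starting at the same rating; A wins one comparison.
const [winner, loser] = updateElo(1500, 1500, 1);
console.log(winner, loser);

// A tie between equally rated models leaves both ratings unchanged.
const [tiedA, tiedB] = updateElo(1500, 1500, 0.5);
console.log(tiedA, tiedB);
```

Because the update is zero-sum, a model that loses most of its comparisons, as the record on this page shows, drifts below the starting rating over time.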

Gemini text-embedding-004 - ELO by Dataset

Detailed Metrics

Dataset breakdown

Performance metrics across different benchmark datasets, including accuracy and latency percentiles.
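For readers unfamiliar with the accuracy metrics reported below, here is a minimal sketch of nDCG@k and Recall@k. It assumes linear gain with a log2 position discount and treats any document with gain above zero as relevant; the benchmark may use a different variant. The example data is made up.

```typescript
/** Discounted cumulative gain over the first k gains (linear gain, log2 discount). */
function dcgAtK(gains: number[], k: number): number {
  return gains
    .slice(0, k)
    .reduce((sum, gain, index) => sum + gain / Math.log2(index + 2), 0);
}

/** nDCG@k: DCG of the returned ranking divided by the DCG of an ideal ranking. */
function ndcgAtK(rankedIds: string[], relevance: Record<string, number>, k: number): number {
  const gains = rankedIds.map((id) => relevance[id] || 0);
  const idealGains = Object.keys(relevance)
    .map((id) => relevance[id])
    .sort((a, b) => b - a);
  const idealDcg = dcgAtK(idealGains, k);
  return idealDcg === 0 ? 0 : dcgAtK(gains, k) / idealDcg;
}

/** Recall@k: fraction of all relevant documents that appear in the top k. */
function recallAtK(rankedIds: string[], relevance: Record<string, number>, k: number): number {
  const relevantTotal = Object.keys(relevance).filter((id) => relevance[id] > 0).length;
  if (relevantTotal === 0) return 0;
  const hits = rankedIds.slice(0, k).filter((id) => (relevance[id] || 0) > 0).length;
  return hits / relevantTotal;
}

// Made-up query with 20 relevant documents; the model returns 10 of them.
const judgments: Record<string, number> = {};
for (let i = 0; i < 20; i++) judgments[`doc${i}`] = 1;
const ranking = Object.keys(judgments).slice(0, 10);

console.log(ndcgAtK(ranking, judgments, 10));
console.log(recallAtK(ranking, judgments, 10));
```

The example shows that the two metrics can diverge: a top-10 list made up entirely of relevant documents scores a perfect nDCG@10, yet Recall@10 stays low when the query has more relevant documents than fit in the list.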

business reports

ELO: 1500
Win Rate: 32.5%
Record: 52W-107L-1T

Accuracy Metrics

nDCG@5: 0.000
nDCG@10: 0.000
Recall@5: 0.000
Recall@10: 0.000

Latency Distribution

Mean: 15ms
P50 (Median): 15ms
P90: 15ms

DBPedia

ELO: 1500
Win Rate: 15.0%
Record: 24W-126L-10T

Accuracy Metrics

nDCG@5: 0.747
nDCG@10: 0.737
Recall@5: 0.057
Recall@10: 0.108

Latency Distribution

Mean: 14ms
P50 (Median): 14ms
P90: 14ms

FiQa

ELO: 1500
Win Rate: 26.7%
Record: 40W-105L-5T

Accuracy Metrics

nDCG@5: 0.744
nDCG@10: 0.730
Recall@5: 0.647
Recall@10: 0.752

Latency Distribution

Mean: 16ms
P50 (Median): 16ms
P90: 16ms

SciFact

ELO: 1500
Win Rate: 43.8%
Record: 70W-83L-7T

Accuracy Metrics

nDCG@5: 0.728
nDCG@10: 0.729
Recall@5: 0.813
Recall@10: 0.857

Latency Distribution

Mean: 15ms
P50 (Median): 15ms
P90: 15ms

MSMARCO

ELO: 1500
Win Rate: 30.0%
Record: 48W-90L-22T

Accuracy Metrics

nDCG@5: 0.932
nDCG@10: 0.918
Recall@5: 0.117
Recall@10: 0.208

Latency Distribution

Mean: 18ms
P50 (Median): 18ms
P90: 18ms

ARCD

ELO: 1500
Win Rate: 5.0%
Record: 2W-30L-8T

Accuracy Metrics

nDCG@5: 0.021
nDCG@10: 0.027
Recall@5: 0.040
Recall@10: 0.060

Latency Distribution

Mean: 15ms
P50 (Median): 15ms
P90: 15ms
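The latency figures above are a mean plus two percentiles. A small sketch of how such figures could be derived from raw timings follows; it uses the nearest-rank percentile definition, which is an assumption since the page does not say which method it uses, and the sample timings are invented.

```typescript
/** Arithmetic mean of the samples. */
function meanOf(samples: number[]): number {
  if (samples.length === 0) throw new Error("no samples");
  return samples.reduce((sum, value) => sum + value, 0) / samples.length;
}

/** Nearest-rank percentile: the smallest sample with at least p% of samples at or below it. */
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = samples.slice().sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1];
}

// Invented request timings in milliseconds, with one slow outlier.
const latenciesMs = [14, 15, 15, 15, 16, 15, 15, 14, 15, 40];

console.log(`mean: ${meanOf(latenciesMs)}ms`);
console.log(`p50: ${percentile(latenciesMs, 50)}ms`);
console.log(`p90: ${percentile(latenciesMs, 90)}ms`);
```

A single outlier pulls the mean up while leaving P50 and P90 nearly untouched, which is why percentiles are reported alongside the mean.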

Build RAG in Minutes, Not Months

Agentset gives you a complete RAG API with top-ranked embedding models and smart retrieval built in. Upload your data, call the API, and get accurate results from day one.


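import { Agentset } from "agentset";

// Create the client and select the namespace that holds your documents.
const agentset = new Agentset();
const ns = agentset.namespace("ns_1234");

// Run a search query against the namespace.
const results = await ns.search(
  "What is multi-head attention?"
);

// Print the text of each retrieved result.
for (const result of results) {
  console.log(result.text);
}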