
zembed-1

zembed-1 is a 4B-parameter multilingual embedding model distilled from the zerank-2 reranker. It supports dimension reduction (from 2048 down to 40) and quantization (from 32-bit down to binary), and performs strongly on finance, healthcare, and legal domains. To compare the best embedding models on your own data, try Agentset.
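As a rough illustration of what dimension reduction and binary quantization mean in practice, the sketch below truncates a vector to its first components, rescales it to unit length, and then keeps one sign bit per dimension. The function names are hypothetical and not part of any zembed-1 or Agentset API. The sketch also assumes that reduction is done by keeping the leading dimensions, which this page does not confirm.

```typescript
// Hypothetical helpers for illustration; not a zembed-1 or Agentset API.

// Keep the first `dims` components and rescale to unit length.
function truncateEmbedding(vec: number[], dims: number): number[] {
  const head = vec.slice(0, dims);
  const norm = Math.sqrt(head.reduce((sum, x) => sum + x * x, 0));
  return norm === 0 ? head : head.map((x) => x / norm);
}

// Binary quantization: one bit per dimension (1 if the value is positive),
// packed eight dimensions per byte.
function binarize(vec: number[]): Uint8Array {
  const out = new Uint8Array(Math.ceil(vec.length / 8));
  vec.forEach((x, i) => {
    if (x > 0) out[i >> 3] |= 1 << (i & 7);
  });
  return out;
}

// Hamming distance between two packed bit vectors of equal length.
function hammingDistance(a: Uint8Array, b: Uint8Array): number {
  let dist = 0;
  for (let i = 0; i < a.length; i++) {
    let x = a[i] ^ b[i];
    while (x) {
      dist += x & 1;
      x >>= 1;
    }
  }
  return dist;
}
```

Under these assumptions, a 2048-dimension vector stored as 32-bit floats takes 8,192 bytes, while its binary form takes 256 bytes.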

Leaderboard Rank: #1 of 17
ELO Rating: 1595 (#1)
Win Rate: 59.2% (#1)
Accuracy (nDCG@10): 0.619 (#13)
Latency: 250ms (#13)

Model Information

Provider: ZeroEntropy
License: CC BY-NC 4.0
Price per 1M tokens: $0.050
Dimensions: 2048
Release Date: 2026-03-02
Model Name: zembed-1
Total Evaluations: 1000

Performance Record

Wins: 592 (59.2%)
Losses: 319 (31.9%)
Ties: 89 (8.9%)
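The percentages above are consistent with dividing each count by the total number of evaluations, ties included. A minimal sketch of that arithmetic follows; the type and function names are illustrative only.

```typescript
// Illustrative names; the counts come from the performance record above.
interface MatchRecord {
  wins: number;
  losses: number;
  ties: number;
}

// Win rate as a fraction of all evaluations, with ties in the denominator.
function winRate(r: MatchRecord): number {
  const total = r.wins + r.losses + r.ties;
  return total === 0 ? 0 : r.wins / total;
}

const overall: MatchRecord = { wins: 592, losses: 319, ties: 89 };
console.log((winRate(overall) * 100).toFixed(1) + "%"); // prints "59.2%"
```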

Embedding Models Are Just One Piece of RAG

Agentset gives you a managed RAG pipeline with the top-ranked models and best practices baked in. No infrastructure to maintain, no embeddings to manage.

Trusted by teams building production RAG applications

5M+ Documents
1,500+ Teams
99.9% Uptime

Performance Overview

ELO ratings by dataset

zembed-1's ELO performance varies across different benchmark datasets, showing its strengths in specific domains.

zembed-1 - ELO by Dataset
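This page does not state how its ELO ratings are computed. For readers unfamiliar with the idea, the sketch below uses the standard Elo formulas for expected score and rating update. The K-factor of 32 and the function names are assumptions for illustration, not details taken from this leaderboard.

```typescript
// Standard Elo formulas; not confirmed to be this leaderboard's exact method.

// Probability-like expected score of A against B (0..1).
function expectedScore(ratingA: number, ratingB: number): number {
  return 1 / (1 + Math.pow(10, (ratingB - ratingA) / 400));
}

// New rating for A after one comparison.
// `score` is 1 for a win, 0.5 for a tie, 0 for a loss.
// The default K-factor of 32 is an assumed, commonly used value.
function updateElo(ratingA: number, ratingB: number, score: number, k = 32): number {
  return ratingA + k * (score - expectedScore(ratingA, ratingB));
}

// Two models starting at 1500: the winner of one comparison moves to 1516.
console.log(updateElo(1500, 1500, 1));
```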

Detailed Metrics

Dataset breakdown

Performance metrics across different benchmark datasets, including accuracy and latency percentiles.
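The accuracy metrics reported for each dataset are nDCG@k and Recall@k. The sketch below shows the textbook definitions of both, assuming graded relevance labels and a log2 discount; the exact variant used for this leaderboard is not stated on the page, and the function names are illustrative.

```typescript
// Textbook definitions; the leaderboard's exact variant is not documented here.

// Discounted cumulative gain over the top k results.
// `rels` holds the relevance grade of each retrieved document, in ranked order.
function dcgAtK(rels: number[], k: number): number {
  return rels.slice(0, k).reduce((sum, rel, i) => sum + rel / Math.log2(i + 2), 0);
}

// nDCG@k: DCG of the actual ranking divided by DCG of the ideal ranking.
// `allRels` holds the grades of every relevant document for the query.
function ndcgAtK(rankedRels: number[], allRels: number[], k: number): number {
  const ideal = [...allRels].sort((a, b) => b - a);
  const idcg = dcgAtK(ideal, k);
  return idcg === 0 ? 0 : dcgAtK(rankedRels, k) / idcg;
}

// Recall@k: share of all relevant documents that appear in the top k.
function recallAtK(rankedIds: string[], relevantIds: Set<string>, k: number): number {
  if (relevantIds.size === 0) return 0;
  const hits = rankedIds.slice(0, k).filter((id) => relevantIds.has(id)).length;
  return hits / relevantIds.size;
}
```

With these definitions, Recall@k can never exceed k divided by the number of relevant documents, so a query with many relevant documents can show low recall even when nDCG@k is high.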

PG

ELO 1500 | 55.0% WR | 22W-17L-1T

Accuracy Metrics

nDCG@5: 0.000
nDCG@10: 0.000
Recall@5: 0.000
Recall@10: 0.000

Latency Distribution

Mean: 250ms
P50 (Median): 250ms
P90: 250ms

business reports

ELO 1500 | 66.3% WR | 106W-49L-5T

Accuracy Metrics

nDCG@5: 0.000
nDCG@10: 0.000
Recall@5: 0.000
Recall@10: 0.000

Latency Distribution

Mean: 250ms
P50 (Median): 250ms
P90: 250ms

DBPedia

ELO 1500 | 51.9% WR | 83W-59L-18T

Accuracy Metrics

nDCG@5: 0.832
nDCG@10: 0.811
Recall@5: 0.062
Recall@10: 0.121

Latency Distribution

Mean: 250ms
P50 (Median): 250ms
P90: 250ms

FiQa

ELO 1500 | 69.4% WR | 111W-43L-6T

Accuracy Metrics

nDCG@5: 0.862
nDCG@10: 0.855
Recall@5: 0.668
Recall@10: 0.712

Latency Distribution

Mean: 250ms
P50 (Median): 250ms
P90: 250ms

SciFact

ELO 1500 | 65.0% WR | 104W-48L-8T

Accuracy Metrics

nDCG@5: 0.767
nDCG@10: 0.777
Recall@5: 0.888
Recall@10: 0.929

Latency Distribution

Mean: 250ms
P50 (Median): 250ms
P90: 250ms

MSMARCO

ELO 1500 | 41.9% WR | 67W-57L-36T

Accuracy Metrics

nDCG@5: 0.955
nDCG@10: 0.946
Recall@5: 0.123
Recall@10: 0.223

Latency Distribution

Mean: 250ms
P50 (Median): 250ms
P90: 250ms

ARCD

ELO 1500 | 61.9% WR | 99W-46L-15T

Accuracy Metrics

nDCG@5: 0.851
nDCG@10: 0.858
Recall@5: 0.920
Recall@10: 0.940

Latency Distribution

Mean: 250ms
P50 (Median): 250ms
P90: 250ms

Build RAG in Minutes, Not Months

Agentset gives you a complete RAG API with top-ranked embedding models and smart retrieval built in. Upload your data, call the API, and get accurate results from day one.

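import { Agentset } from "agentset";

// Create a client and select the namespace to query
// ("ns_1234" is the example namespace ID used on this page).
const agentset = new Agentset();
const ns = agentset.namespace("ns_1234");

// Run a search over the namespace.
const results = await ns.search(
  "What is multi-head attention?"
);

// Print the text of each result.
for (const result of results) {
  console.log(result.text);
}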