zembed-1

A 4B-parameter multilingual embedding model distilled from the zerank-2 reranker. It supports dimension reduction (from 2048 down to 40) and quantization (from 32-bit down to binary), and performs strongly on finance, healthcare, and legal domains. To compare the best embedding models on your own data, try Agentset.
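
The description above mentions dimension reduction and quantization without showing what they involve. The sketch below illustrates one common way to do both on the client side. It assumes the reduced dimensions are obtained by keeping a prefix of the full vector and re-normalizing, and that binary quantization stores one sign bit per dimension. Neither detail is confirmed by this page, and every function name here is hypothetical rather than part of any zembed-1 or Agentset API.

```typescript
// Hypothetical helpers: none of these names come from a zembed-1 or Agentset API.

/** L2-normalize a vector so cosine similarity reduces to a dot product. */
function l2Normalize(v: number[]): number[] {
  const norm = Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
  return norm === 0 ? v.slice() : v.map((x) => x / norm);
}

/** ASSUMPTION: reduce dimensions by keeping the first `dims` components, then re-normalize. */
function truncateEmbedding(v: number[], dims: number): number[] {
  if (dims < 1 || dims > v.length) {
    throw new RangeError(`dims must be between 1 and ${v.length}`);
  }
  return l2Normalize(v.slice(0, dims));
}

/** ASSUMPTION: binary quantization keeps one bit per dimension, set when the component is positive. */
function quantizeBinary(v: number[]): Uint8Array {
  const packed = new Uint8Array(Math.ceil(v.length / 8));
  v.forEach((x, i) => {
    if (x > 0) {
      const byteIndex = i >> 3;
      // Most significant bit first within each byte.
      packed[byteIndex] = (packed[byteIndex] ?? 0) | (1 << (7 - (i & 7)));
    }
  });
  return packed;
}

/** Hamming distance between two packed binary vectors (lower means more similar). */
function hammingDistance(a: Uint8Array, b: Uint8Array): number {
  if (a.length !== b.length) throw new RangeError("length mismatch");
  let distance = 0;
  for (let i = 0; i < a.length; i++) {
    let diff = (a[i] ?? 0) ^ (b[i] ?? 0);
    while (diff) {
      distance += diff & 1;
      diff >>= 1;
    }
  }
  return distance;
}
```

Under these assumptions, a 2048-dimension vector stored as 32-bit floats takes 8,192 bytes, while its binary form takes 256 bytes, a 32x reduction.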

Leaderboard Rank

of 18

ELO Rating: 1590
Win Rate: 59.2%
Accuracy (nDCG@10): 0.619 (rank #14)
Latency: 250ms (rank #13)

Model Information

Provider: ZeroEntropy
License: CC BY-NC 4.0
Price per 1M tokens: $0.050
Dimensions: 2048
Release Date: 2026-03-02
Model Name: zembed-1
Total Evaluations: 1000

Performance Record

Wins: 592 (59.2%)
Losses: 319 (31.9%)
Ties: 89 (8.9%)
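
The overall record can be cross-checked against the per-dataset records listed in the dataset breakdown on this page. The sketch below sums those records and computes the win rate as wins divided by all evaluations, ties included, which is the convention the published percentages are consistent with. The type and function names are illustrative only.

```typescript
// Win/loss/tie record for one dataset, as listed in the dataset breakdown.
interface MatchRecord {
  dataset: string;
  wins: number;
  losses: number;
  ties: number;
}

// Per-dataset records copied from this page.
const datasetRecords: MatchRecord[] = [
  { dataset: "PG", wins: 22, losses: 17, ties: 1 },
  { dataset: "business reports", wins: 106, losses: 49, ties: 5 },
  { dataset: "DBPedia", wins: 83, losses: 59, ties: 18 },
  { dataset: "FiQa", wins: 111, losses: 43, ties: 6 },
  { dataset: "SciFact", wins: 104, losses: 48, ties: 8 },
  { dataset: "MSMARCO", wins: 67, losses: 57, ties: 36 },
  { dataset: "ARCD", wins: 99, losses: 46, ties: 15 },
];

/** Add up wins, losses, and ties across datasets. */
function sumRecords(records: MatchRecord[]): MatchRecord {
  return records.reduce(
    (acc, r) => ({
      dataset: "overall",
      wins: acc.wins + r.wins,
      losses: acc.losses + r.losses,
      ties: acc.ties + r.ties,
    }),
    { dataset: "overall", wins: 0, losses: 0, ties: 0 }
  );
}

/** Win rate in percent; ties count toward the denominator. */
function winRatePercent(r: MatchRecord): number {
  const total = r.wins + r.losses + r.ties;
  return total === 0 ? 0 : (100 * r.wins) / total;
}

const overallRecord = sumRecords(datasetRecords);
// Totals: 592 wins, 319 losses, 89 ties; win rate prints as "59.2%".
console.log(overallRecord, winRatePercent(overallRecord).toFixed(1) + "%");
```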

Embedding Models Are Just One Piece of RAG

Agentset gives you a managed RAG pipeline with the top-ranked models and best practices baked in. No infrastructure to maintain, no embeddings to manage.

Trusted by teams building production RAG applications: 5M+ documents, 1,500+ teams, 99.9% uptime.

Performance Overview

ELO ratings by dataset

zembed-1's ELO rating varies by benchmark dataset, which shows the domains where the model is strongest.

zembed-1 - ELO by Dataset
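
This page does not say how its ELO ratings are computed. For readers unfamiliar with the idea, the sketch below shows the textbook Elo update for a single head-to-head comparison. The K-factor of 32, the 400-point scale, and scoring a tie as half a win are all assumptions taken from the standard formulation, not from this leaderboard, and the function names are illustrative.

```typescript
type Outcome = "win" | "loss" | "tie";

/** Expected score of A against B under the standard Elo model (400-point scale). */
function expectedScore(ratingA: number, ratingB: number): number {
  return 1 / (1 + Math.pow(10, (ratingB - ratingA) / 400));
}

/**
 * Return the updated rating after one comparison.
 * ASSUMPTION: K = 32 and a tie scores 0.5; the leaderboard may use different values.
 */
function updateRating(
  rating: number,
  opponentRating: number,
  outcome: Outcome,
  k: number = 32
): number {
  const score = outcome === "win" ? 1 : outcome === "tie" ? 0.5 : 0;
  return rating + k * (score - expectedScore(rating, opponentRating));
}

// Two models that both start at 1500: the winner gains 16 points with K = 32.
console.log(updateRating(1500, 1500, "win")); // 1516
```

A win against a higher-rated opponent moves the rating more than a win against a lower-rated one, because the expected score was lower to begin with.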

Detailed Metrics

Dataset breakdown

Performance metrics across different benchmark datasets, including accuracy and latency percentiles.
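
For reference, the accuracy metrics reported below can be computed for a single query as in the following sketch. It assumes the common definitions: DCG with linear gain and a log2 position discount, normalized by the ideal ordering of the judged documents, and recall as the fraction of relevant documents that appear in the top k. The page does not state which variants it uses, so treat the function names and the exact formulas as assumptions.

```typescript
/** Discounted cumulative gain over the first k relevance grades (linear gain, log2 discount). */
function dcgAtK(relevances: number[], k: number): number {
  return relevances
    .slice(0, k)
    .reduce((sum, rel, i) => sum + rel / Math.log2(i + 2), 0);
}

/**
 * nDCG@k for one query.
 * `ranked` holds the retrieved document IDs in rank order;
 * `qrels` maps document ID to relevance grade (0 = not relevant).
 */
function ndcgAtK(ranked: string[], qrels: Map<string, number>, k: number): number {
  const gains = ranked.map((id) => qrels.get(id) ?? 0);
  const ideal = Array.from(qrels.values()).sort((a, b) => b - a);
  const idcg = dcgAtK(ideal, k);
  return idcg === 0 ? 0 : dcgAtK(gains, k) / idcg;
}

/** Recall@k for one query: share of relevant documents found in the top k results. */
function recallAtK(ranked: string[], qrels: Map<string, number>, k: number): number {
  const relevant = Array.from(qrels.entries())
    .filter(([, rel]) => rel > 0)
    .map(([id]) => id);
  if (relevant.length === 0) return 0;
  const topK = new Set(ranked.slice(0, k));
  return relevant.filter((id) => topK.has(id)).length / relevant.length;
}
```

A dataset-level score would typically be the mean of these per-query values over all queries.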

PG

ELO: 1500, Win Rate: 55.0%, Record: 22W-17L-1T

Accuracy Metrics

nDCG@5: 0.000
nDCG@10: 0.000
Recall@5: 0.000
Recall@10: 0.000

Latency Distribution

Mean: 250ms
P50 (Median): 250ms
P90: 250ms

business reports

ELO: 1500, Win Rate: 66.3%, Record: 106W-49L-5T

Accuracy Metrics

nDCG@5: 0.000
nDCG@10: 0.000
Recall@5: 0.000
Recall@10: 0.000

Latency Distribution

Mean: 250ms
P50 (Median): 250ms
P90: 250ms

DBPedia

ELO: 1500, Win Rate: 51.9%, Record: 83W-59L-18T

Accuracy Metrics

nDCG@5: 0.832
nDCG@10: 0.811
Recall@5: 0.062
Recall@10: 0.121

Latency Distribution

Mean: 250ms
P50 (Median): 250ms
P90: 250ms

FiQa

ELO: 1500, Win Rate: 69.4%, Record: 111W-43L-6T

Accuracy Metrics

nDCG@5: 0.862
nDCG@10: 0.855
Recall@5: 0.668
Recall@10: 0.712

Latency Distribution

Mean: 250ms
P50 (Median): 250ms
P90: 250ms

SciFact

ELO: 1500, Win Rate: 65.0%, Record: 104W-48L-8T

Accuracy Metrics

nDCG@5: 0.767
nDCG@10: 0.777
Recall@5: 0.888
Recall@10: 0.929

Latency Distribution

Mean: 250ms
P50 (Median): 250ms
P90: 250ms

MSMARCO

ELO: 1500, Win Rate: 41.9%, Record: 67W-57L-36T

Accuracy Metrics

nDCG@5: 0.955
nDCG@10: 0.946
Recall@5: 0.123
Recall@10: 0.223

Latency Distribution

Mean: 250ms
P50 (Median): 250ms
P90: 250ms

ARCD

ELO: 1500, Win Rate: 61.9%, Record: 99W-46L-15T

Accuracy Metrics

nDCG@5: 0.851
nDCG@10: 0.858
Recall@5: 0.920
Recall@10: 0.940

Latency Distribution

Mean: 250ms
P50 (Median): 250ms
P90: 250ms
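
The latency figures in the breakdown above are a mean plus two percentiles. The sketch below shows how such a summary could be derived from raw per-request timings. It uses the nearest-rank percentile definition, which is an assumption: the page does not say which method it uses, and interpolating methods can give slightly different values. The function names are illustrative.

```typescript
/** Arithmetic mean of the latency samples, in the same unit as the input. */
function meanLatency(samples: number[]): number {
  if (samples.length === 0) throw new RangeError("no samples");
  return samples.reduce((sum, x) => sum + x, 0) / samples.length;
}

/**
 * Nearest-rank percentile: the smallest sample such that at least
 * p percent of all samples are less than or equal to it.
 */
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new RangeError("no samples");
  if (p <= 0 || p > 100) throw new RangeError("p must be in (0, 100]");
  const sorted = samples.slice().sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  const value = sorted[rank - 1];
  if (value === undefined) throw new RangeError("rank out of bounds");
  return value;
}

// Made-up timings in milliseconds, for illustration only.
const sampleTimings = [100, 200, 300, 400, 500];
console.log(meanLatency(sampleTimings)); // 300
console.log(percentile(sampleTimings, 50)); // 300
console.log(percentile(sampleTimings, 90)); // 500
```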

Build RAG in Minutes, Not Months

Agentset gives you a complete RAG API with top-ranked embedding models and smart retrieval built in. Upload your data, call the API, and get accurate results from day one.

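import { Agentset } from "agentset";

// Create the client and select a namespace ("ns_1234" is a placeholder ID).
const agentset = new Agentset();
const ns = agentset.namespace("ns_1234");

// Run a search over the documents in that namespace.
const results = await ns.search(
  "What is multi-head attention?"
);

// Print the text of each returned result.
for (const result of results) {
  console.log(result.text);
}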