
Qwen3 Embedding 4B

A mid-size, 4-billion-parameter model with strong multilingual capabilities across 100+ languages. It supports user-defined instructions for task-specific optimization in text retrieval, classification, and clustering applications. If you want to compare the best embedding models for your data, try Agentset.
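As a rough illustration of what a user-defined instruction looks like in practice, the sketch below prepends a task description to a query before it is embedded. The helper name `formatInstructedQuery` is hypothetical, and the `Instruct: ...\nQuery:...` layout is an assumption based on the convention shown in the Qwen3 Embedding model card; check the model card before relying on it.

```typescript
// Hypothetical helper: prefixes a query with a task instruction.
// The "Instruct: ...\nQuery:..." layout is an assumption taken from the
// convention in the Qwen3 Embedding model card, not from this page.
function formatInstructedQuery(taskDescription: string, query: string): string {
  return `Instruct: ${taskDescription}\nQuery:${query}`;
}

// Illustrative task description for a retrieval use case.
const retrievalTask =
  "Given a web search query, retrieve relevant passages that answer the query";

const instructed = formatInstructedQuery(
  retrievalTask,
  "What is multi-head attention?"
);

console.log(instructed);
```

Under this convention only queries carry the instruction; documents are usually embedded as plain text.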

Leaderboard Rank: #11 of 18
ELO Rating: 1482 (#11)
Win Rate: 44.6% (#10)
Accuracy (nDCG@10): 0.705 (#3)
Latency: 29ms (#9)

Model Information

Provider: Qwen
License: Open Source
Price per 1M tokens: $0.020
Dimensions: 2560
Release Date: 2025-06-06
Model Name: qwen3-embedding-4b
Total Evaluations: 830
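The price and dimension figures above translate directly into budget and storage estimates. The following is a minimal sketch of that arithmetic; the corpus size is hypothetical, and the 4 bytes per dimension assumes vectors are stored as 32-bit floats, which this page does not state.

```typescript
// Values taken from the model information listed above.
const PRICE_PER_MILLION_TOKENS_USD = 0.02;
const DIMENSIONS = 2560;

// Assumption: vectors are stored as 32-bit floats (4 bytes per dimension).
const BYTES_PER_FLOAT32 = 4;

// Cost of embedding a given number of tokens at the listed price.
function embeddingCostUsd(totalTokens: number): number {
  return (totalTokens / 1000000) * PRICE_PER_MILLION_TOKENS_USD;
}

// Raw storage for the vectors alone, excluding index overhead and metadata.
function vectorStorageBytes(vectorCount: number): number {
  return vectorCount * DIMENSIONS * BYTES_PER_FLOAT32;
}

// Hypothetical corpus: 100,000 chunks of 500 tokens each (50M tokens).
const chunkCount = 100000;
const tokensPerChunk = 500;

console.log(embeddingCostUsd(chunkCount * tokensPerChunk).toFixed(2)); // 1.00
console.log(vectorStorageBytes(chunkCount)); // 1024000000
```

At 2560 dimensions each float32 vector takes 10,240 bytes, so storage tends to dominate over embedding cost for large corpora.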

Performance Record

Wins: 370 (44.6%)
Losses: 398 (48.0%)
Ties: 62 (7.5%)
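The percentages in the performance record are consistent with dividing each count by the total number of evaluations, ties included (370 + 398 + 62 = 830). A small sketch of that calculation, with `recordShares` as an illustrative name:

```typescript
interface MatchRecord {
  wins: number;
  losses: number;
  ties: number;
}

// Returns each outcome as a percentage of all evaluations, ties included.
function recordShares(record: MatchRecord) {
  const total = record.wins + record.losses + record.ties;
  const pct = (count: number) => (total === 0 ? 0 : (100 * count) / total);
  return {
    total,
    winPct: pct(record.wins),
    lossPct: pct(record.losses),
    tiePct: pct(record.ties),
  };
}

const overall = recordShares({ wins: 370, losses: 398, ties: 62 });
console.log(overall.total, overall.winPct.toFixed(1)); // 830 44.6
```

The same formula reproduces the per-dataset win rates, for example 52 wins out of 160 evaluations on SciFact gives 32.5%.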

Embedding Models Are Just One Piece of RAG

Agentset gives you a managed RAG pipeline with the top-ranked models and best practices baked in. No infrastructure to maintain, no embeddings to manage.

Trusted by teams building production RAG applications

Documents: 5M+
Teams: 1,500+
Uptime: 99.9%

Performance Overview

ELO ratings by dataset

Qwen3 Embedding 4B's ELO performance varies across different benchmark datasets, showing its strengths in specific domains.

Qwen3 Embedding 4B - ELO by Dataset
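For readers unfamiliar with ELO ratings, the sketch below shows the standard Elo update after a single head-to-head comparison. It is a generic illustration rather than the leaderboard's actual procedure: the K-factor of 32 and the treatment of a tie as half a win are assumptions, since the page does not describe how its ratings are computed.

```typescript
// Standard Elo expected score for a model rated ratingA against ratingB.
function expectedScore(ratingA: number, ratingB: number): number {
  return 1 / (1 + Math.pow(10, (ratingB - ratingA) / 400));
}

// score: 1 for a win, 0.5 for a tie, 0 for a loss.
// kFactor = 32 is an assumed value; the leaderboard does not state one.
function updateRating(
  rating: number,
  opponentRating: number,
  score: number,
  kFactor: number = 32
): number {
  return rating + kFactor * (score - expectedScore(rating, opponentRating));
}

// Two equally rated models at 1500: the winner gains half the K-factor.
console.log(updateRating(1500, 1500, 1)); // 1516
// A tie between equally rated models leaves the rating unchanged.
console.log(updateRating(1500, 1500, 0.5)); // 1500
```

Because the gain depends on the opponent's rating, beating a higher-rated model moves the rating more than beating a lower-rated one.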

Detailed Metrics

Dataset breakdown

Performance metrics across different benchmark datasets, including accuracy and latency percentiles.
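The accuracy metrics reported for each dataset can be sketched as follows. This is a minimal version assuming linear gain (the relevance grade itself rather than 2^grade - 1) and a log2 position discount; the exact variant used by the benchmark is not stated on this page, and all function names are illustrative.

```typescript
// Discounted cumulative gain over the top k results.
// relevances: relevance grade of each retrieved document, in ranked order.
function dcgAtK(relevances: number[], k: number): number {
  return relevances
    .slice(0, k)
    .reduce((sum, rel, i) => sum + rel / Math.log2(i + 2), 0);
}

// nDCG@k: DCG of the actual ranking divided by DCG of the ideal ranking.
// allRelevances: grades of every relevant document known for the query.
function ndcgAtK(ranked: number[], allRelevances: number[], k: number): number {
  const ideal = allRelevances.slice().sort((a, b) => b - a);
  const idealDcg = dcgAtK(ideal, k);
  return idealDcg === 0 ? 0 : dcgAtK(ranked, k) / idealDcg;
}

// Recall@k: fraction of all relevant documents found in the top k.
function recallAtK(retrievedIds: string[], relevantIds: Set<string>, k: number): number {
  if (relevantIds.size === 0) return 0;
  const hits = retrievedIds.slice(0, k).filter((id) => relevantIds.has(id)).length;
  return hits / relevantIds.size;
}

// Hypothetical query with 20 relevant documents; the top 10 are all relevant.
const relevantIds = new Set(Array.from({ length: 20 }, (_, i) => `doc${i}`));
const retrievedIds = Array.from({ length: 10 }, (_, i) => `doc${i}`);
const rankedGrades = retrievedIds.map(() => 1);
const allGrades = Array.from(relevantIds, () => 1);

console.log(ndcgAtK(rankedGrades, allGrades, 10)); // 1
console.log(recallAtK(retrievedIds, relevantIds, 10)); // 0.5
```

The example shows how a ranking can score perfectly on nDCG@10 while Recall@10 stays low when a query has more relevant documents than k, which is one possible reason a dataset might report high nDCG alongside low recall.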

business reports

ELO: 1500 | Win Rate: 45.6% | Record: 73W-87L-0T

Accuracy Metrics

nDCG@5: 0.000
nDCG@10: 0.000
Recall@5: 0.000
Recall@10: 0.000

Latency Distribution

Mean: 29ms
P50 (Median): 29ms
P90: 29ms

DBPedia

ELO: 1500 | Win Rate: 45.0% | Record: 72W-80L-8T

Accuracy Metrics

nDCG@5: 0.799
nDCG@10: 0.787
Recall@5: 0.061
Recall@10: 0.119

Latency Distribution

Mean: 26ms
P50 (Median): 26ms
P90: 26ms

FiQa

ELO: 1500 | Win Rate: 45.3% | Record: 68W-77L-5T

Accuracy Metrics

nDCG@5: 0.838
nDCG@10: 0.836
Recall@5: 0.719
Recall@10: 0.839

Latency Distribution

Mean: 23ms
P50 (Median): 23ms
P90: 23ms

SciFact

ELO: 1500 | Win Rate: 32.5% | Record: 52W-100L-8T

Accuracy Metrics

nDCG@5: 0.666
nDCG@10: 0.697
Recall@5: 0.782
Recall@10: 0.891

Latency Distribution

Mean: 38ms
P50 (Median): 38ms
P90: 38ms

MSMARCO

ELO: 1500 | Win Rate: 52.5% | Record: 84W-43L-33T

Accuracy Metrics

nDCG@5: 0.974
nDCG@10: 0.954
Recall@5: 0.124
Recall@10: 0.224

Latency Distribution

Mean: 31ms
P50 (Median): 31ms
P90: 31ms

ARCD

ELO: 1500 | Win Rate: 52.5% | Record: 21W-11L-8T

Accuracy Metrics

nDCG@5: 0.857
nDCG@10: 0.864
Recall@5: 0.940
Recall@10: 0.960

Latency Distribution

Mean: 25ms
P50 (Median): 25ms
P90: 25ms

Build RAG in Minutes, Not Months

Agentset gives you a complete RAG API with top-ranked embedding models and smart retrieval built in. Upload your data, call the API, and get accurate results from day one.

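import { Agentset } from "agentset";

// Create a client and select the namespace that holds your documents.
// "ns_1234" is a placeholder namespace ID.
const agentset = new Agentset();
const ns = agentset.namespace("ns_1234");

// Run a semantic search against the namespace.
const results = await ns.search(
  "What is multi-head attention?"
);

// Print the text of each matching chunk.
for (const result of results) {
  console.log(result.text);
}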

Compare Models

See how it stacks up

Compare Qwen3 Embedding 4B with other top embeddings to understand the differences in performance, accuracy, and latency.