Jina Embeddings v5 Text Small

A 677M-parameter multilingual embedding model built on Qwen3-0.6B-Base, with a 32K-token context length and support for 119+ languages. It features four task-specific LoRA adapters (retrieval, text-matching, clustering, and classification) and uses Matryoshka learning, which allows embeddings to be reduced to as few as 32 dimensions. It achieves an average score of 67.0 on MMTEB, the best among sub-1B multilingual models. If you want to compare the best embedding models for your data, try Agentset.
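As a rough illustration of what Matryoshka-style dimension reduction usually means in practice, the sketch below keeps the first components of an embedding and re-normalizes the result. The function names `truncateEmbedding` and `cosineSimilarity` are hypothetical helpers, not part of any Jina or Agentset API, and the sketch assumes the common convention that the leading dimensions carry the most information.

```typescript
// Keep the first `dims` components of an embedding and re-normalize to unit
// length, so cosine similarity on the shortened vectors stays well defined.
// Assumption: the model was trained so that leading dimensions are usable alone.
function truncateEmbedding(embedding: number[], dims: number): number[] {
  if (!Number.isInteger(dims) || dims < 1 || dims > embedding.length) {
    throw new RangeError(`dims must be an integer in [1, ${embedding.length}]`);
  }
  const head = embedding.slice(0, dims);
  const norm = Math.sqrt(head.reduce((sum, x) => sum + x * x, 0));
  // A zero vector cannot be normalized; return it unchanged.
  return norm === 0 ? head : head.map((x) => x / norm);
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new RangeError("vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}
```

For example, a 1024-dimensional vector could be passed through `truncateEmbedding(vector, 32)` before indexing, trading some accuracy for a smaller index. How much accuracy is lost at each size is not stated on this page.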

Leaderboard Rank

of 18

ELO Rating: 1566
Win Rate: 54.7%
Accuracy (nDCG@10): 0.608 (#15)
Latency: 289ms (#16)

Model Information

Provider: Jina AI
License: CC BY-NC 4.0
Price per 1M tokens: $0.050
Dimensions: 1024
Release Date: 2026-02-18
Model Name: jina-embeddings-v5-text-small
Total Evaluations: 1120

Performance Record

Wins: 613 (54.7%)
Losses: 408 (36.4%)
Ties: 99 (8.8%)
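The percentages in the performance record follow directly from the counts: each one is the count divided by the 1120 total evaluations, rounded to one decimal place. A minimal sketch of that arithmetic, using a hypothetical `summarizeRecord` helper:

```typescript
interface MatchRecord {
  wins: number;
  losses: number;
  ties: number;
}

// Convert raw win/loss/tie counts into percentages of all evaluations,
// rounded to one decimal place. Ties count toward the total.
function summarizeRecord(record: MatchRecord) {
  const total = record.wins + record.losses + record.ties;
  const pct = (n: number): number =>
    total === 0 ? 0 : Math.round((n / total) * 1000) / 10;
  return {
    total,
    winRate: pct(record.wins),
    lossRate: pct(record.losses),
    tieRate: pct(record.ties),
  };
}

const overall = summarizeRecord({ wins: 613, losses: 408, ties: 99 });
console.log(overall);
```

Because each percentage is rounded independently, the three values sum to 99.9% rather than exactly 100%.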

Embedding Models Are Just One Piece of RAG

Agentset gives you a managed RAG pipeline with the top-ranked models and best practices baked in. No infrastructure to maintain, no embeddings to manage.

Trusted by teams building production RAG applications: 5M+ documents, 1,500+ teams, 99.9% uptime.

Performance Overview

ELO ratings by dataset

Jina Embeddings v5 Text Small's performance varies by benchmark dataset: in the breakdown below, its win rate ranges from 39.4% on ARCD to 75.0% on business reports.

Jina Embeddings v5 Text Small - ELO by Dataset
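This page does not say how its ELO ratings are computed. For readers unfamiliar with the idea, here is a minimal sketch of the standard Elo update for one head-to-head comparison. The K-factor of 32, the 400-point scale, and scoring a tie as 0.5 are conventional defaults assumed for illustration, not values confirmed by the leaderboard.

```typescript
// Outcome from model A's point of view: win = 1, tie = 0.5, loss = 0.
type Outcome = 1 | 0.5 | 0;

// Probability-like expected score of A against B under the standard Elo model.
function expectedScore(ratingA: number, ratingB: number): number {
  return 1 / (1 + Math.pow(10, (ratingB - ratingA) / 400));
}

// Return the updated [ratingA, ratingB] after one comparison.
// k controls how far a single result can move the ratings (assumed: 32).
function updateElo(
  ratingA: number,
  ratingB: number,
  scoreA: Outcome,
  k: number = 32
): [number, number] {
  const delta = k * (scoreA - expectedScore(ratingA, ratingB));
  // Whatever A gains, B loses, so the sum of ratings is preserved.
  return [ratingA + delta, ratingB - delta];
}
```

Under these assumptions, two models that both start at 1500 move to 1516 and 1484 after a single win, and a tie between equally rated models leaves both unchanged.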

Detailed Metrics

Dataset breakdown

Performance metrics across different benchmark datasets, including accuracy and latency percentiles.
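For reference, the accuracy metrics reported below can be sketched as follows. This is one common formulation (linear gain with a log2 position discount for nDCG); the page does not state which nDCG variant or relevance labels it uses, so treat the helper names and the exact formula as assumptions.

```typescript
// Discounted cumulative gain over the top k results.
// `relevances` holds the graded relevance of each retrieved item, in rank order.
function dcgAtK(relevances: number[], k: number): number {
  return relevances
    .slice(0, k)
    .reduce((sum, rel, i) => sum + rel / Math.log2(i + 2), 0);
}

// nDCG@k: DCG of the actual ranking divided by the DCG of the ideal ranking.
// `allRelevances` lists the relevance of every judged relevant document for the query.
function ndcgAtK(
  rankedRelevances: number[],
  allRelevances: number[],
  k: number
): number {
  const ideal = [...allRelevances].sort((a, b) => b - a);
  const idcg = dcgAtK(ideal, k);
  return idcg === 0 ? 0 : dcgAtK(rankedRelevances, k) / idcg;
}

// Recall@k: fraction of all relevant documents that appear in the top k.
function recallAtK(
  rankedIds: string[],
  relevantIds: Set<string>,
  k: number
): number {
  if (relevantIds.size === 0) return 0;
  const hits = rankedIds.slice(0, k).filter((id) => relevantIds.has(id)).length;
  return hits / relevantIds.size;
}
```

Note that Recall@k cannot exceed k divided by the number of relevant documents, so a query with many relevant documents can show high nDCG alongside low recall. That may be one reason some datasets below combine the two, though the page does not confirm it.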

PG

ELO: 1500
Win Rate: 63.3%
Record: 38W-21L-1T

Accuracy Metrics

nDCG@5: 0.000
nDCG@10: 0.000
Recall@5: 0.000
Recall@10: 0.000

Latency Distribution

Mean: 291ms
P50 (Median): 241ms
P90: 290ms

business reports

ELO: 1500
Win Rate: 75.0%
Record: 135W-42L-3T

Accuracy Metrics

nDCG@5: 0.000
nDCG@10: 0.000
Recall@5: 0.000
Recall@10: 0.000

Latency Distribution

Mean: 283ms
P50 (Median): 247ms
P90: 322ms

DBPedia

ELO: 1500
Win Rate: 43.3%
Record: 78W-83L-19T

Accuracy Metrics

nDCG@5: 0.823
nDCG@10: 0.805
Recall@5: 0.062
Recall@10: 0.123

Latency Distribution

Mean: 270ms
P50 (Median): 239ms
P90: 264ms

FiQa

ELO: 1500
Win Rate: 61.2%
Record: 104W-62L-4T

Accuracy Metrics

nDCG@5: 0.838
nDCG@10: 0.831
Recall@5: 0.677
Recall@10: 0.771

Latency Distribution

Mean: 300ms
P50 (Median): 241ms
P90: 419ms

SciFact

ELO: 1500
Win Rate: 59.4%
Record: 107W-66L-7T

Accuracy Metrics

nDCG@5: 0.703
nDCG@10: 0.734
Recall@5: 0.789
Recall@10: 0.898

Latency Distribution

Mean: 267ms
P50 (Median): 240ms
P90: 265ms

MSMARCO

ELO: 1500
Win Rate: 46.7%
Record: 84W-67L-29T

Accuracy Metrics

nDCG@5: 0.960
nDCG@10: 0.954
Recall@5: 0.122
Recall@10: 0.219

Latency Distribution

Mean: 273ms
P50 (Median): 239ms
P90: 313ms

ARCD

ELO: 1500
Win Rate: 39.4%
Record: 67W-67L-36T

Accuracy Metrics

nDCG@5: 0.842
nDCG@10: 0.842
Recall@5: 0.940
Recall@10: 0.940

Latency Distribution

Mean: 336ms
P50 (Median): 248ms
P90: 305ms
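The latency figures above report a mean alongside P50 and P90. A small sketch of how such summaries can be computed from raw timings is shown below. It uses the nearest-rank percentile definition, which is an assumption (the page does not say which method it uses), and the sample values are made up for illustration.

```typescript
// Arithmetic mean of the samples.
function mean(samples: number[]): number {
  if (samples.length === 0) throw new Error("no samples");
  return samples.reduce((sum, x) => sum + x, 0) / samples.length;
}

// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  if (p < 0 || p > 100) throw new RangeError("p must be between 0 and 100");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1];
}

// Hypothetical request timings in milliseconds, with one slow outlier.
const latenciesMs = [240, 241, 239, 250, 900, 245, 238, 242, 260, 243];

console.log("mean", mean(latenciesMs));
console.log("p50", percentile(latenciesMs, 50));
console.log("p90", percentile(latenciesMs, 90));
```

In this made-up sample the single 900ms outlier pulls the mean above P90. A few slow requests can have the same effect on real measurements, which is consistent with datasets above such as PG and ARCD where the mean is higher than P90.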

Build RAG in Minutes, Not Months

Agentset gives you a complete RAG API with top-ranked embedding models and smart retrieval built in. Upload your data, call the API, and get accurate results from day one.

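// Top-level await is used below, so this must run as an ES module.
import { Agentset } from "agentset";

// Create the client and select a namespace ("ns_1234" is the example ID shown here).
const agentset = new Agentset();
const ns = agentset.namespace("ns_1234");

// Run a search query against the namespace.
const results = await ns.search(
  "What is multi-head attention?"
);

// Print the text of each returned result.
for (const result of results) {
  console.log(result.text);
}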