
Cohere Rerank 4 Fast

A fast cross-encoder reranker for enterprise search and RAG, built for low-latency production workloads. It supports a context window of up to 32K tokens and multilingual retrieval across 100+ languages, with optional self-learning to adapt to your domain over time. To compare the best rerankers on your own data, try Agentset.

Leaderboard Rank: #7 of 12
ELO Rating: 1510 (#7)
Win Rate: 49.8% (#8)
Accuracy (nDCG@10): 0.094 (#6)
Latency: 447ms (#5)

Model Information

Provider: Cohere
License: Proprietary
Price per 1M tokens: $0.050
Release Date: 2025-12-11
Model Name: rerank-v4.0-fast
Total Evaluations: 3300
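
As a rough illustration of what the listed price means for a workload, the sketch below estimates reranking cost from token counts. Only the $0.050 per 1M tokens figure comes from this page. The helper `estimateRerankCostUsd`, the workload numbers, and the assumption that billed tokens equal query tokens plus document tokens for every query-document pair are all hypothetical; check the provider's billing rules before relying on it.

```typescript
// Listed price on this page: $0.050 per 1M tokens.
const PRICE_PER_MILLION_TOKENS_USD = 0.05;

// Hypothetical helper. ASSUMPTION: each candidate document is scored together
// with the query, and all of those tokens are billed at the listed price.
function estimateRerankCostUsd(
  queries: number,
  docsPerQuery: number,
  avgQueryTokens: number,
  avgDocTokens: number
): number {
  const tokensPerQuery = docsPerQuery * (avgQueryTokens + avgDocTokens);
  const totalTokens = queries * tokensPerQuery;
  return (totalTokens / 1_000_000) * PRICE_PER_MILLION_TOKENS_USD;
}

// Made-up workload: 10,000 queries, 50 candidates per query,
// 20-token queries and 300-token documents (160M tokens in total).
const estimatedCostUsd = estimateRerankCostUsd(10_000, 50, 20, 300);
console.log(`Estimated cost: $${estimatedCostUsd.toFixed(2)}`);
```

Under these assumptions the cost grows linearly with both the number of candidates per query and the document length, so trimming the candidate set before reranking is the most direct way to reduce it.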

Performance Record

Wins: 1643 (49.8%)
Losses: 1540 (46.7%)
Ties: 117 (3.5%)
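
The percentages in this record are consistent with each count divided by the total number of evaluations (3300), with ties included in the denominator. A minimal sketch that reproduces them from the counts above; the helper name `percentOfTotal` is illustrative only:

```typescript
// Counts taken from the Performance Record on this page.
const record = { wins: 1643, losses: 1540, ties: 117 };

// Share of all evaluations as a percentage, rounded to one decimal place.
function percentOfTotal(count: number, total: number): number {
  return Math.round((count / total) * 1000) / 10;
}

const totalEvaluations = record.wins + record.losses + record.ties;
const winRate = percentOfTotal(record.wins, totalEvaluations);
const lossRate = percentOfTotal(record.losses, totalEvaluations);
const tieRate = percentOfTotal(record.ties, totalEvaluations);

console.log(totalEvaluations, winRate, lossRate, tieRate);
```

Because ties count toward the total, the win rate can sit below 50% even though wins outnumber losses.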

Performance Overview

ELO ratings by dataset

Cohere Rerank 4 Fast's ELO rating varies by benchmark dataset, from 1429 on FiQa to 1603 on business reports, which shows where the model is relatively stronger or weaker.
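
This page does not say how its ELO ratings are computed. For orientation only, the sketch below shows the standard Elo expected-score and update formulas. The K-factor of 32, the scoring of ties as 0.5, and the 1600-rated opponent are assumptions for illustration, not details of this leaderboard.

```typescript
// Standard Elo expected score for a model rated `ratingA` against `ratingB`.
function expectedScore(ratingA: number, ratingB: number): number {
  return 1 / (1 + Math.pow(10, (ratingB - ratingA) / 400));
}

// One rating update. `score` is 1 for a win, 0.5 for a tie, 0 for a loss.
// ASSUMPTION: k = 32 is a common default, not a value taken from this page.
function updateRating(
  ratingA: number,
  ratingB: number,
  score: number,
  k: number = 32
): number {
  return ratingA + k * (score - expectedScore(ratingA, ratingB));
}

// The overall rating on this page (1510) against a hypothetical 1600-rated model.
const expectedVsStronger = expectedScore(1510, 1600);
const ratingAfterWin = updateRating(1510, 1600, 1);
console.log(expectedVsStronger.toFixed(3), ratingAfterWin.toFixed(1));
```

Under this formula a 90-point gap corresponds to an expected score a little above one third for the lower-rated model, and an upset win moves its rating up by more than half of K.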

[Chart: Cohere Rerank 4 Fast, ELO by dataset]

Detailed Metrics

Dataset breakdown

Performance metrics across different benchmark datasets, including accuracy and latency percentiles.
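
For readers unfamiliar with the accuracy metrics reported per dataset, here is a minimal sketch of nDCG@k and Recall@k. The page does not state which nDCG variant it uses; this version assumes the exponential gain `2^rel - 1` with a `log2` discount, and it builds the ideal ranking from the same candidate list, which presumes every relevant document is among the candidates. The sample inputs are made up.

```typescript
// DCG over the top-k relevance grades, given in ranked order (0 = not relevant).
function dcgAtK(relevances: number[], k: number): number {
  return relevances
    .slice(0, k)
    .reduce((sum, rel, i) => sum + (Math.pow(2, rel) - 1) / Math.log2(i + 2), 0);
}

// nDCG@k: DCG of the actual order divided by DCG of the ideal order.
// ASSUMPTION: the ideal order is the same list sorted by relevance.
function ndcgAtK(rankedRelevances: number[], k: number): number {
  const ideal = [...rankedRelevances].sort((a, b) => b - a);
  const idealDcg = dcgAtK(ideal, k);
  return idealDcg === 0 ? 0 : dcgAtK(rankedRelevances, k) / idealDcg;
}

// Recall@k: fraction of all relevant documents that appear in the top k.
function recallAtK(rankedIds: string[], relevantIds: Set<string>, k: number): number {
  if (relevantIds.size === 0) return 0;
  const hits = rankedIds.slice(0, k).filter((id) => relevantIds.has(id)).length;
  return hits / relevantIds.size;
}

// Made-up example: relevant documents land at ranks 2 and 4.
const exampleNdcg = ndcgAtK([0, 1, 0, 1], 10);
const exampleRecall = recallAtK(["a", "b", "c", "d"], new Set(["b", "d", "z"]), 4);
console.log(exampleNdcg.toFixed(3), exampleRecall.toFixed(3));
```

nDCG rewards placing relevant documents near the top, while Recall only checks whether they appear within the cutoff, so the two can diverge for the same ranking.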

business reports

ELO 1603 | 56.2% WR | 309W-231L-10T

Accuracy Metrics

nDCG@5: 0.000
nDCG@10: 0.000
Recall@5: 0.000
Recall@10: 0.000

Latency Distribution

Mean: 428ms
P50 (Median): 408ms
P90: 550ms

DBPedia

ELO 1580 | 41.4% WR | 228W-282L-40T

Accuracy Metrics

nDCG@5: 0.000
nDCG@10: 0.000
Recall@5: 0.000
Recall@10: 0.000

Latency Distribution

Mean: 297ms
P50 (Median): 297ms
P90: 309ms

MSMARCO

ELO 1501 | 45.1% WR | 248W-251L-51T

Accuracy Metrics

nDCG@5: 0.000
nDCG@10: 0.000
Recall@5: 0.000
Recall@10: 0.000

Latency Distribution

Mean: 403ms
P50 (Median): 382ms
P90: 486ms

PG

ELO 1474 | 41.6% WR | 229W-321L-0T

Accuracy Metrics

nDCG@5: 0.000
nDCG@10: 0.000
Recall@5: 0.000
Recall@10: 0.000

Latency Distribution

Mean: 492ms
P50 (Median): 439ms
P90: 650ms

arguana

ELO 1472 | 62.2% WR | 342W-203L-5T

Accuracy Metrics

nDCG@5: 0.351
nDCG@10: 0.425
Recall@5: 0.660
Recall@10: 0.880

Latency Distribution

Mean: 574ms
P50 (Median): 562ms
P90: 728ms

FiQa

ELO 1429 | 52.2% WR | 287W-252L-11T

Accuracy Metrics

nDCG@5: 0.135
nDCG@10: 0.138
Recall@5: 0.125
Recall@10: 0.130

Latency Distribution

Mean: 485ms
P50 (Median): 459ms
P90: 624ms
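
The latency distributions above report a mean, a median (P50), and a P90 per dataset. The page does not say how its percentiles are calculated; the sketch below uses the nearest-rank method as an assumption, and the sample values are made up, not measurements from this benchmark.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p% of all
// samples are less than or equal to it. ASSUMPTION: other methods interpolate.
function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) throw new Error("no samples");
  const sorted = [...samplesMs].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1];
}

function mean(samplesMs: number[]): number {
  return samplesMs.reduce((sum, x) => sum + x, 0) / samplesMs.length;
}

// Made-up request latencies in milliseconds.
const latencySamplesMs = [310, 295, 402, 388, 520, 301, 610, 350, 333, 420];

const meanMs = mean(latencySamplesMs);
const p50Ms = percentile(latencySamplesMs, 50);
const p90Ms = percentile(latencySamplesMs, 90);
console.log(meanMs.toFixed(1), p50Ms, p90Ms);
```

A mean above the median, as in most of the datasets listed here, usually indicates a tail of slower requests, which is why P90 is worth tracking alongside the average.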

Build RAG in Minutes, Not Months

Agentset gives you a complete RAG API with top-ranked rerankers and embedding models built in. Upload your data, call the API, and get accurate results from day one.

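import { Agentset } from "agentset";

// Create a client and select the namespace that holds your documents
// ("ns_1234" is a placeholder ID).
const agentset = new Agentset();
const ns = agentset.namespace("ns_1234");

// Run a search query against the namespace.
const results = await ns.search(
  "What is multi-head attention?"
);

// Print the text of each returned result.
for (const result of results) {
  console.log(result.text);
}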
