Cohere Rerank 4 Pro

Cohere Rerank 4 is a family of cross-encoder reranking models (Fast and Pro) optimized for enterprise search and RAG. It offers a 32K context window, strong performance on long and complex documents, and multilingual retrieval across 100+ languages. It is built for high-stakes domains such as finance, healthcare, manufacturing, and e-commerce, with self-learning to adapt to domain-specific data over time. To compare the best rerankers on your own data, try Agentset.

Leaderboard Rank

of 12

ELO Rating: 1629
Win Rate: 57.7%
Accuracy (nDCG@10): 0.095
Latency: 614ms

Model Information

Provider: Cohere
License: Proprietary
Price per 1M tokens: $0.050
Release Date: 2025-12-11
Model Name: rerank-v4.0-pro
Total Evaluations: 3300

Performance Record

Wins: 1903 (57.7%)
Losses: 1270 (38.5%)
Ties: 127 (3.8%)
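
The percentages in the performance record follow directly from the raw counts. Here is a minimal sketch of that arithmetic, assuming (as the figures suggest) that ties count in the denominator. The `MatchRecord` type and `rates` helper are illustrative names, not part of any Agentset or Cohere API.

```typescript
// Illustrative only: reproduces the win/loss/tie percentages from the raw counts.
interface MatchRecord {
  wins: number;
  losses: number;
  ties: number;
}

// Share of each outcome as a percentage of all evaluations, rounded to one decimal.
function rates(r: MatchRecord): { win: number; loss: number; tie: number } {
  const total = r.wins + r.losses + r.ties;
  const pct = (n: number): number => Math.round((n / total) * 1000) / 10;
  return { win: pct(r.wins), loss: pct(r.losses), tie: pct(r.ties) };
}

const overall: MatchRecord = { wins: 1903, losses: 1270, ties: 127 };
console.log(rates(overall)); // { win: 57.7, loss: 38.5, tie: 3.8 }
```

The same helper applied to a single dataset's record (for example 364 wins, 184 losses, 2 ties) gives that dataset's win rate.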

Rerankers Are Just One Piece of RAG

Agentset gives you a managed RAG pipeline with the top-ranked models and best practices baked in. No infrastructure to maintain, no reranking to configure.

Trusted by teams building production RAG applications: 5M+ documents, 1,500+ teams, 99.9% uptime.

Performance Overview

ELO ratings by dataset

Cohere Rerank 4 Pro's ELO rating varies across benchmark datasets, from 1491 on DBPedia to 1783 on arguana, showing where the model is strongest.

Cohere Rerank 4 Pro - ELO by Dataset
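
For readers unfamiliar with Elo-style ratings: each head-to-head comparison nudges the two models' ratings toward the observed outcome. The page does not state which Elo variant, K-factor, or starting rating the leaderboard uses, so the sketch below uses the textbook formula with an assumed K of 32 and hypothetical function names. It is not a description of Agentset's scoring.

```typescript
// Textbook Elo update. The K-factor of 32 and all names are assumptions for illustration.
type Outcome = 1 | 0.5 | 0; // win, tie, loss from model A's point of view

// Probability that A beats B under the standard logistic Elo model.
function expectedScore(ratingA: number, ratingB: number): number {
  return 1 / (1 + Math.pow(10, (ratingB - ratingA) / 400));
}

// Returns the new ratings [A, B] after one pairwise comparison.
function updateElo(
  ratingA: number,
  ratingB: number,
  outcome: Outcome,
  k = 32
): [number, number] {
  const delta = k * (outcome - expectedScore(ratingA, ratingB));
  return [ratingA + delta, ratingB - delta];
}

// Two models starting level: a win moves A up by K/2 and B down by the same amount.
const [newA, newB] = updateElo(1500, 1500, 1);
console.log(newA, newB); // 1516 1484
```

Under this model a higher rating means the model is expected to win more of its pairwise comparisons, which is why ELO and win rate tend to move together in the per-dataset figures.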

Detailed Metrics

Dataset breakdown

Performance metrics across different benchmark datasets, including accuracy and latency percentiles.
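
As a reminder of what the accuracy figures measure, here is a minimal sketch of nDCG@k and Recall@k for a single query. It assumes binary relevance labels and the linear-gain form of DCG. The page does not say which variant or labeling the benchmark uses, and the function names are illustrative.

```typescript
// Illustrative metric definitions; binary relevance and linear gain are assumptions.

// Discounted cumulative gain over the first k relevance labels.
function dcgAtK(relevances: number[], k: number): number {
  return relevances
    .slice(0, k)
    .reduce((sum, rel, i) => sum + rel / Math.log2(i + 2), 0);
}

// nDCG@k: DCG of the ranking divided by the DCG of an ideal ranking
// that places every relevant document first.
function ndcgAtK(ranked: string[], relevant: Set<string>, k: number): number {
  const gains = ranked.map((id) => (relevant.has(id) ? 1 : 0));
  const idealGains = Array.from({ length: relevant.size }, () => 1);
  const ideal = dcgAtK(idealGains, k);
  return ideal === 0 ? 0 : dcgAtK(gains, k) / ideal;
}

// Recall@k: fraction of all relevant documents that appear in the top k.
function recallAtK(ranked: string[], relevant: Set<string>, k: number): number {
  if (relevant.size === 0) return 0;
  const hits = ranked.slice(0, k).filter((id) => relevant.has(id)).length;
  return hits / relevant.size;
}

// Hypothetical reranked list: one of the two relevant documents is in the top 3.
const ranked = ["d3", "d1", "d7", "d2"];
const relevant = new Set(["d1", "d2"]);
console.log(ndcgAtK(ranked, relevant, 3).toFixed(3)); // "0.387"
console.log(recallAtK(ranked, relevant, 3)); // 0.5
```

nDCG rewards placing relevant documents near the top, while recall only asks whether they appear in the top k at all, so a model can score high on Recall@10 and noticeably lower on nDCG@10 for the same dataset.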

arguana

ELO: 1783 | Win rate: 66.2% | Record: 364W-184L-2T

Accuracy Metrics

nDCG@5: 0.353
nDCG@10: 0.439
Recall@5: 0.660
Recall@10: 0.920

Latency Distribution

Mean: 785ms
P50 (Median): 768ms
P90: 933ms

FiQa

ELO: 1702 | Win rate: 59.3% | Record: 326W-207L-17T

Accuracy Metrics

nDCG@5: 0.126
nDCG@10: 0.129
Recall@5: 0.130
Recall@10: 0.135

Latency Distribution

Mean: 610ms
P50 (Median): 585ms
P90: 817ms

business reports

ELO: 1658 | Win rate: 62.0% | Record: 341W-197L-12T

Accuracy Metrics

nDCG@5: 0.000
nDCG@10: 0.000
Recall@5: 0.000
Recall@10: 0.000

Latency Distribution

Mean: 529ms
P50 (Median): 498ms
P90: 675ms

MSMARCO

ELO: 1586 | Win rate: 52.0% | Record: 286W-209L-55T

Accuracy Metrics

nDCG@5: 0.000
nDCG@10: 0.000
Recall@5: 0.000
Recall@10: 0.000

Latency Distribution

Mean: 458ms
P50 (Median): 408ms
P90: 615ms

PG

ELO: 1553 | Win rate: 56.9% | Record: 313W-237L-0T

Accuracy Metrics

nDCG@5: 0.000
nDCG@10: 0.000
Recall@5: 0.000
Recall@10: 0.000

Latency Distribution

Mean: 760ms
P50 (Median): 720ms
P90: 896ms

DBPedia

ELO: 1491 | Win rate: 49.6% | Record: 273W-236L-41T

Accuracy Metrics

nDCG@5: 0.000
nDCG@10: 0.000
Recall@5: 0.000
Recall@10: 0.000

Latency Distribution

Mean: 541ms
P50 (Median): 489ms
P90: 729ms
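
The headline figures at the top of the page are consistent with an unweighted mean of the six per-dataset values listed above. This is inferred from the numbers rather than documented, so treat the sketch below as a consistency check, not as the leaderboard's actual aggregation code. The `DatasetResult` type and field names are illustrative.

```typescript
// Consistency check: unweighted means of the per-dataset figures listed above.
// That the leaderboard aggregates this way is an inference, not a documented fact.
interface DatasetResult {
  name: string;
  elo: number;
  ndcg10: number;
  meanLatencyMs: number;
}

const datasets: DatasetResult[] = [
  { name: "arguana", elo: 1783, ndcg10: 0.439, meanLatencyMs: 785 },
  { name: "FiQa", elo: 1702, ndcg10: 0.129, meanLatencyMs: 610 },
  { name: "business reports", elo: 1658, ndcg10: 0.0, meanLatencyMs: 529 },
  { name: "MSMARCO", elo: 1586, ndcg10: 0.0, meanLatencyMs: 458 },
  { name: "PG", elo: 1553, ndcg10: 0.0, meanLatencyMs: 760 },
  { name: "DBPedia", elo: 1491, ndcg10: 0.0, meanLatencyMs: 541 },
];

// Arithmetic mean of a non-empty list of numbers.
const mean = (values: number[]): number =>
  values.reduce((sum, v) => sum + v, 0) / values.length;

const summary = {
  elo: Math.round(mean(datasets.map((d) => d.elo))),
  ndcg10: mean(datasets.map((d) => d.ndcg10)).toFixed(3),
  latencyMs: Math.round(mean(datasets.map((d) => d.meanLatencyMs))),
};
console.log(summary); // { elo: 1629, ndcg10: '0.095', latencyMs: 614 }
```

Because four of the six datasets report nDCG@10 as 0.000, the 0.095 headline accuracy is dominated by those zeros rather than by the arguana and FiQa scores.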

Build RAG in Minutes, Not Months

Agentset gives you a complete RAG API with top-ranked rerankers and embedding models built in. Upload your data, call the API, and get accurate results from day one.

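import { Agentset } from "agentset";

// Create the client and select the namespace to search ("ns_1234" is a placeholder ID).
const agentset = new Agentset();
const ns = agentset.namespace("ns_1234");

// Run a search query against the namespace.
const results = await ns.search(
  "What is multi-head attention?"
);

// Print the text of each returned result.
for (const result of results) {
  console.log(result.text);
}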