DeepSeek R1

DeepSeek R1 offers a 163,840-token context window and emits its reasoning over retrieved documents between transparent <think> delimiters. Its MIT license permits fine-tuning on domain-specific retrieval tasks and full model customization. If you want to compare the best LLMs for your data, try Agentset.
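Because the reasoning is wrapped in <think> delimiters, an application can separate it from the final answer before showing or logging a response. The sketch below assumes the raw completion arrives as a single string with at most one <think>...</think> block ahead of the answer; the helper name `splitThinking` and the sample text are hypothetical, not part of any DeepSeek or Agentset API.

```typescript
// Hypothetical helper: split a raw completion into reasoning and final answer.
// Assumption: at most one <think>...</think> block appears in the output.
interface ParsedOutput {
  reasoning: string | null; // text inside <think>...</think>, or null if absent
  answer: string;           // everything outside the think block
}

function splitThinking(raw: string): ParsedOutput {
  // [\s\S] matches newlines too, so multi-line reasoning is captured.
  const match = raw.match(/<think>([\s\S]*?)<\/think>/);
  if (!match) {
    return { reasoning: null, answer: raw.trim() };
  }
  const reasoning = (match[1] ?? "").trim();
  // Remove the whole think block (first occurrence) and keep the rest.
  const answer = raw.replace(match[0], "").trim();
  return { reasoning, answer };
}

// Made-up sample output, for illustration only.
const sample =
  "<think>Document 2 covers the question; document 1 does not.</think>" +
  "The answer is given in document 2.";
const parsed = splitThinking(sample);
console.log(parsed.reasoning);
console.log(parsed.answer);
```

Keeping the reasoning in a separate field makes it easy to hide it from end users while still storing it for debugging retrieval quality.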

Leaderboard Rank: #14 of 16
ELO Rating: 1325 (#14)
Win Rate: 22.5% (#15)
Latency: 18272ms (#14)

Model Information

Provider: DeepSeek
License: Open Source (MIT)
Input Price per 1M: $0.30
Output Price per 1M: $1.20
Context Window: 164K
Release Date: 2025-01-20
Model Name: deepseek-r1
Total Evaluations: 1350
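The per-million-token prices listed above translate into a per-request cost with simple arithmetic. The sketch below uses those two prices; the function name `estimateCostUSD` and the token counts in the example are hypothetical. Whether reasoning tokens are billed as output is not stated on this page, so treat `outputTokens` as whatever the provider actually bills.

```typescript
// Prices copied from the Model Information list (USD per 1M tokens).
const INPUT_PRICE_PER_1M = 0.30;
const OUTPUT_PRICE_PER_1M = 1.20;

// Estimate the cost of one request in USD.
function estimateCostUSD(inputTokens: number, outputTokens: number): number {
  const inputCost = (inputTokens / 1e6) * INPUT_PRICE_PER_1M;
  const outputCost = (outputTokens / 1e6) * OUTPUT_PRICE_PER_1M;
  return inputCost + outputCost;
}

// Hypothetical RAG request: 20,000 tokens of prompt plus retrieved context,
// 1,500 billed output tokens.
const cost = estimateCostUSD(20000, 1500);
console.log(cost.toFixed(4)); // prints "0.0078"
```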

Performance Record

Wins: 304 (22.5%)

Losses: 885 (65.6%)

Ties: 161 (11.9%)
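The percentages in this record are shares of all 1350 evaluations, so ties count in the denominator (304 / 1350, not 304 / (304 + 885)). A minimal sketch of that arithmetic follows; the type and function names are illustrative only.

```typescript
interface MatchRecord {
  wins: number;
  losses: number;
  ties: number;
}

// Compute each outcome as a percentage of all evaluations (ties included).
function ratesPercent(r: MatchRecord) {
  const total = r.wins + r.losses + r.ties;
  const pct = (n: number): number => (total === 0 ? 0 : (100 * n) / total);
  return {
    total,
    winRate: pct(r.wins),
    lossRate: pct(r.losses),
    tieRate: pct(r.ties),
  };
}

const overall = ratesPercent({ wins: 304, losses: 885, ties: 161 });
console.log(
  overall.total,
  overall.winRate.toFixed(1),
  overall.lossRate.toFixed(1),
  overall.tieRate.toFixed(1)
); // prints: 1350 22.5 65.6 11.9
```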


LLMs Are Just One Piece of RAG

Agentset gives you a managed RAG pipeline with the top-ranked models and best practices baked in. No infrastructure to maintain, no LLM orchestration to manage.


Trusted by teams building production RAG applications: 5M+ documents, 1,500+ teams, 99.9% uptime.

Performance Overview

ELO ratings by dataset

DeepSeek R1's ELO varies widely by benchmark dataset: 1493 on SciFact, 1415 on PG, and 1068 on MSMARCO.

[Chart: DeepSeek R1 - ELO by Dataset]
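To get a feel for what a rating gap means, the classic Elo expected-score formula can be sketched as below. This page does not say which Elo variant, K-factor, or tie handling the leaderboard uses, so the formula is an assumption about the general method, and the 1525-rated opponent is hypothetical.

```typescript
// Classic Elo expected score for a model rated `a` against one rated `b`.
// A result of 0.5 means evenly matched; ties conventionally count as half a win.
function expectedScore(a: number, b: number): number {
  return 1 / (1 + Math.pow(10, (b - a) / 400));
}

// Equal ratings give an even match.
console.log(expectedScore(1325, 1325)); // prints 0.5

// DeepSeek R1's overall rating against a hypothetical model rated 200 points higher.
console.log(expectedScore(1325, 1525).toFixed(2)); // prints "0.24"
```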

Detailed Metrics

Dataset breakdown

Performance metrics for each benchmark dataset, including quality scores and latency statistics (mean, min, max).

SciFact

ELO: 1493
Win Rate: 26.0%
Record: 117W-241L-92T

Quality Metrics

Correctness: 4.97
Faithfulness: 4.97
Grounding: 4.97
Relevance: 5.00
Completeness: 4.80
Overall: 4.94
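The page does not say how the Overall score is derived, but the listed values are consistent with an unweighted mean of the five quality scores. The sketch below checks that against the SciFact numbers; the same check can be run on the other datasets. The function name is illustrative.

```typescript
// Unweighted arithmetic mean of a list of scores.
function meanScore(scores: number[]): number {
  if (scores.length === 0) {
    throw new Error("meanScore needs at least one score");
  }
  return scores.reduce((sum, x) => sum + x, 0) / scores.length;
}

// SciFact: Correctness, Faithfulness, Grounding, Relevance, Completeness.
const sciFact = [4.97, 4.97, 4.97, 5.0, 4.8];
console.log(meanScore(sciFact).toFixed(2)); // prints "4.94"
```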

Latency Distribution

Mean: 14826ms
Min: 7765ms
Max: 33129ms

PG

ELO: 1415
Win Rate: 24.2%
Record: 109W-320L-21T

Quality Metrics

Correctness: 4.87
Faithfulness: 4.87
Grounding: 4.87
Relevance: 4.93
Completeness: 4.60
Overall: 4.83

Latency Distribution

Mean: 23334ms
Min: 12280ms
Max: 85633ms

MSMARCO

ELO: 1068
Win Rate: 17.3%
Record: 78W-324L-48T

Quality Metrics

Correctness: 4.67
Faithfulness: 4.70
Grounding: 4.67
Relevance: 4.83
Completeness: 4.57
Overall: 4.69

Latency Distribution

Mean: 16654ms
Min: 9675ms
Max: 31255ms
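The headline numbers can be related back to the three datasets. Assuming the overall figures are simple averages of the per-dataset values (the page does not state the aggregation method), the sketch below reproduces the 1325 ELO exactly and lands within 1ms of the 18272ms latency; the small latency difference suggests the published figure was computed from unrounded data.

```typescript
interface DatasetStats {
  name: string;
  elo: number;
  meanLatencyMs: number;
}

// Per-dataset values taken from the breakdown above.
const datasets: DatasetStats[] = [
  { name: "SciFact", elo: 1493, meanLatencyMs: 14826 },
  { name: "PG", elo: 1415, meanLatencyMs: 23334 },
  { name: "MSMARCO", elo: 1068, meanLatencyMs: 16654 },
];

// Unweighted average; each dataset has the same number of evaluations (450).
function average(values: number[]): number {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

const avgElo = Math.round(average(datasets.map((d) => d.elo)));
const avgLatencyMs = Math.round(average(datasets.map((d) => d.meanLatencyMs)));
console.log(avgElo, avgLatencyMs); // prints: 1325 18271
```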

Build RAG in Minutes, Not Months

Agentset gives you a complete RAG API with top-ranked LLMs and smart retrieval built in. Upload your data, call the API, and get grounded answers from day one.


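import { Agentset } from "agentset";

// Create the client and select a namespace ("ns_1234" is a placeholder ID).
const agentset = new Agentset();
const ns = agentset.namespace("ns_1234");

// Search the documents stored in that namespace.
const results = await ns.search(
  "What is multi-head attention?"
);

// Print the text of each result.
for (const result of results) {
  console.log(result.text);
}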