Claude Opus 4.5
200K context window handles substantial retrieved documents with 4.97 grounding and faithfulness scores ensuring high fidelity to source material. Prompt caching feature optimizes performance for repeated retrieval patterns in RAG pipelines.
Model Information
- Provider
- Anthropic
- License
- Proprietary
- Input Price per 1M
- $5.00
- Output Price per 1M
- $25.00
- Context Window
- 200K
- Release Date
- 2025-11-24
- Model Name
- claude-opus-4-5-20251101
- Total Evaluations
- 810
Performance Record
Wins454 (56.0%)
Losses243 (30.0%)
Ties113 (14.0%)
Wins
Losses
Ties
Performance Overview
ELO ratings by dataset
Claude Opus 4.5's ELO performance varies across different benchmark datasets, showing its strengths in specific domains.
Claude Opus 4.5 - ELO by Dataset
Detailed Metrics
Dataset breakdown
Performance metrics across different benchmark datasets, including accuracy and latency percentiles.
MSMARCO
ELO 169673.7% WR199W-40L-31T
Quality Metrics
- Correctness
- 4.97
- Faithfulness
- 4.97
- Grounding
- 4.97
- Relevance
- 4.97
- Completeness
- 4.97
- Overall
- 4.97
Latency Distribution
- Mean
- 5992ms
- Min
- 2590ms
- Max
- 8072ms
SciFact
ELO 160659.3% WR160W-38L-72T
Quality Metrics
- Correctness
- 4.73
- Faithfulness
- 4.80
- Grounding
- 4.80
- Relevance
- 4.97
- Completeness
- 4.70
- Overall
- 4.80
Latency Distribution
- Mean
- 7276ms
- Min
- 4210ms
- Max
- 10496ms
PG
ELO 147735.2% WR95W-165L-10T
Quality Metrics
- Correctness
- 4.93
- Faithfulness
- 4.93
- Grounding
- 4.93
- Relevance
- 4.93
- Completeness
- 4.80
- Overall
- 4.91
Latency Distribution
- Mean
- 11489ms
- Min
- 7945ms
- Max
- 15934ms
Compare Models
See how it stacks up
Compare Claude Opus 4.5 with other top llms to understand the differences in performance, accuracy, and latency.