Claude Opus 4.5
200K context window handles substantial retrieved documents with 4.97 grounding and faithfulness scores ensuring high fidelity to source material. Prompt caching feature optimizes performance for repeated retrieval patterns in RAG pipelines.
Model Information
- Provider
- Anthropic
- License
- Proprietary
- Input Price per 1M
- $5.00
- Output Price per 1M
- $25.00
- Context Window
- 200K
- Release Date
- 2025-11-24
- Model Name
- claude-opus-4-5-20251101
- Total Evaluations
- 900
Performance Record
Wins475 (52.8%)
Losses290 (32.2%)
Ties135 (15.0%)
Wins
Losses
Ties
Performance Overview
ELO ratings by dataset
Claude Opus 4.5's ELO performance varies across different benchmark datasets, showing its strengths in specific domains.
Claude Opus 4.5 - ELO by Dataset
Detailed Metrics
Dataset breakdown
Performance metrics across different benchmark datasets, including accuracy and latency percentiles.
MSMARCO
ELO 170670.7% WR212W-51L-37T
Quality Metrics
- Correctness
- 4.97
- Faithfulness
- 4.97
- Grounding
- 4.97
- Relevance
- 4.97
- Completeness
- 4.97
- Overall
- 4.97
Latency Distribution
- Mean
- 5992ms
- Min
- 2590ms
- Max
- 8072ms
SciFact
ELO 158055.3% WR166W-48L-86T
Quality Metrics
- Correctness
- 4.70
- Faithfulness
- 4.80
- Grounding
- 4.77
- Relevance
- 4.97
- Completeness
- 4.70
- Overall
- 4.79
Latency Distribution
- Mean
- 7276ms
- Min
- 4210ms
- Max
- 10496ms
PG
ELO 144232.3% WR97W-191L-12T
Quality Metrics
- Correctness
- 5.00
- Faithfulness
- 5.00
- Grounding
- 5.00
- Relevance
- 5.00
- Completeness
- 4.93
- Overall
- 4.99
Latency Distribution
- Mean
- 11489ms
- Min
- 7945ms
- Max
- 15934ms
Compare Models
See how it stacks up
Compare Claude Opus 4.5 with other top llms to understand the differences in performance, accuracy, and latency.