Gemini 3 Pro Preview

Gemini 3 Pro Preview offers a 1M-token context window and a mandatory reasoning mode, so it analyzes retrieved content carefully before responding. Multimodal support enables RAG across text, images, audio, and video, with robust tool calling for dynamic retrieval. If you want to compare the best LLMs for your data, try Agentset.

Leaderboard Rank: #13 of 16
ELO Rating: 1509
Win Rate: 44.1%
Latency: 17904ms

Model Information

Provider: Google
License: Proprietary
Input Price per 1M: $2.00
Output Price per 1M: $12.00
Context Window: 1049K
Release Date: 2025-11-18
Model Name: gemini-3-pro-preview
Total Evaluations: 1350
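
To make the pricing above concrete, here is a minimal sketch that estimates the cost of a single request from the listed per-1M-token prices. The helper name `estimateCostUsd` and the token counts are hypothetical, chosen only for illustration, and the calculation ignores any tiered pricing or reasoning-token billing, which the table does not specify.

```typescript
// Prices taken from the Model Information list (USD per 1M tokens).
const INPUT_PRICE_PER_1M = 2.0;
const OUTPUT_PRICE_PER_1M = 12.0;

// Hypothetical helper: estimated cost of one request in USD.
function estimateCostUsd(inputTokens: number, outputTokens: number): number {
  return (
    (inputTokens / 1000000) * INPUT_PRICE_PER_1M +
    (outputTokens / 1000000) * OUTPUT_PRICE_PER_1M
  );
}

// Illustrative RAG request: 20,000 input tokens (prompt plus retrieved
// context) and 1,000 output tokens.
const cost = estimateCostUsd(20000, 1000);
console.log(cost.toFixed(4)); // 0.0520
```

At these rates the input side dominates only when the retrieved context is large; output tokens cost six times as much per token as input tokens.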

Performance Record

Wins: 596 (44.1%)
Losses: 581 (43.0%)
Ties: 173 (12.8%)
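
The percentages in this record follow directly from the counts and the 1350 total evaluations. The sketch below reproduces them; the `MatchRecord` type and `toPercent` helper are illustrative names, not part of any Agentset API.

```typescript
interface MatchRecord {
  wins: number;
  losses: number;
  ties: number;
}

// Counts from the Performance Record above.
const record: MatchRecord = { wins: 596, losses: 581, ties: 173 };
const total = record.wins + record.losses + record.ties;

// Share of all evaluations, formatted to one decimal place.
function toPercent(count: number, outOf: number): string {
  return ((count / outOf) * 100).toFixed(1);
}

console.log(total); // 1350
console.log(toPercent(record.wins, total)); // 44.1
console.log(toPercent(record.losses, total)); // 43.0
console.log(toPercent(record.ties, total)); // 12.8
```

Note that the win rate counts ties in the denominator, so wins and losses do not sum to 100%.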


LLMs Are Just One Piece of RAG

Agentset gives you a managed RAG pipeline with the top-ranked models and best practices baked in. No infrastructure to maintain, no LLM orchestration to manage.


Trusted by teams building production RAG applications

Documents: 5M+
Teams: 1,500+
Uptime: 99.9%

Performance Overview

ELO ratings by dataset

Gemini 3 Pro Preview's ELO performance varies across different benchmark datasets, showing its strengths in specific domains.
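
The page does not state how these ELO ratings are computed. As a rough guide to reading rating gaps, the sketch below applies the standard Elo expected-score formula, E = 1 / (1 + 10^((R_b - R_a) / 400)). Whether this leaderboard uses exactly that formula is an assumption, and both the `expectedScore` helper and the opponent rating of 1600 are hypothetical.

```typescript
// Standard Elo expected score for a model rated `ratingA` against `ratingB`.
// Assumption: ratings follow the classic Elo model with a scale of 400.
function expectedScore(ratingA: number, ratingB: number): number {
  return 1 / (1 + Math.pow(10, (ratingB - ratingA) / 400));
}

// Equal ratings give an expected score of exactly 0.5.
console.log(expectedScore(1509, 1509)); // 0.5

// Overall rating from this page (1509) against a hypothetical opponent at 1600.
console.log(expectedScore(1509, 1600).toFixed(3));
```

Under this model a tie is usually scored as half a win, so the expected score is not the same quantity as the win rate reported on this page.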

Gemini 3 Pro Preview - ELO by Dataset

Detailed Metrics

Dataset breakdown

Performance metrics across different benchmark datasets, including quality scores and latency statistics (mean, min, and max).

PG

ELO: 1551
Win Rate: 48.4%
Record: 218W-204L-28T

Quality Metrics

Correctness: 4.90
Faithfulness: 4.93
Grounding: 4.93
Relevance: 5.00
Completeness: 4.73
Overall: 4.90

Latency Distribution

Mean: 25137ms
Min: 13317ms
Max: 62299ms

SciFact

ELO: 1530
Win Rate: 39.3%
Record: 177W-184L-89T

Quality Metrics

Correctness: 4.93
Faithfulness: 4.97
Grounding: 4.97
Relevance: 4.97
Completeness: 4.83
Overall: 4.93

Latency Distribution

Mean: 14583ms
Min: 10135ms
Max: 21489ms

MSMARCO

ELO: 1446
Win Rate: 44.7%
Record: 201W-193L-56T

Quality Metrics

Correctness: 4.83
Faithfulness: 4.83
Grounding: 4.83
Relevance: 5.00
Completeness: 4.90
Overall: 4.88

Latency Distribution

Mean: 13990ms
Min: 7461ms
Max: 26343ms
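
As a consistency check, the per-dataset figures can be rolled up and compared with the headline numbers. The sketch below reads each dataset's summary as an ELO rating, a win rate, and a W-L-T record. How the site aggregates its headline values is not stated, so the unweighted means here are an assumption, and the `DatasetResult` type is an illustrative name.

```typescript
interface DatasetResult {
  name: string;
  elo: number;
  wins: number;
  losses: number;
  ties: number;
  meanLatencyMs: number;
}

// Values taken from the three dataset sections above.
const datasets: DatasetResult[] = [
  { name: "PG", elo: 1551, wins: 218, losses: 204, ties: 28, meanLatencyMs: 25137 },
  { name: "SciFact", elo: 1530, wins: 177, losses: 184, ties: 89, meanLatencyMs: 14583 },
  { name: "MSMARCO", elo: 1446, wins: 201, losses: 193, ties: 56, meanLatencyMs: 13990 },
];

const sum = (xs: number[]): number => xs.reduce((a, b) => a + b, 0);

const totalWins = sum(datasets.map((d) => d.wins));
const totalLosses = sum(datasets.map((d) => d.losses));
const totalTies = sum(datasets.map((d) => d.ties));

// Unweighted means across datasets (assumed aggregation method).
const meanElo = sum(datasets.map((d) => d.elo)) / datasets.length;
const meanLatencyMs = sum(datasets.map((d) => d.meanLatencyMs)) / datasets.length;

console.log(totalWins, totalLosses, totalTies); // 596 581 173
console.log(meanElo); // 1509
console.log(meanLatencyMs.toFixed(1)); // 17903.3
```

The totals match the overall 596-581-173 record and the mean ELO matches the headline 1509. The mean latency lands within 1ms of the headline 17904ms, which suggests a slightly different rounding or averaging method on the site.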

Build RAG in Minutes, Not Months

Agentset gives you a complete RAG API with top-ranked LLMs and smart retrieval built in. Upload your data, call the API, and get grounded answers from day one.


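import { Agentset } from "agentset";

// Create a client and select a namespace ("ns_1234" is a placeholder ID).
const agentset = new Agentset();
const ns = agentset.namespace("ns_1234");

// Run a search query against the namespace.
const results = await ns.search(
  "What is multi-head attention?"
);

// Print the text of each returned result.
for (const result of results) {
  console.log(result.text);
}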