Qwen3 30B A3B Thinking

Qwen3 30B A3B Thinking supports 119 languages, enabling multilingual RAG without translation overhead. Its thinking mode exposes the reasoning process over retrieved documents in <think> blocks, and the model is released under the Apache 2.0 license. If you want to compare the best LLMs for your data, try Agentset.
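Because a thinking model interleaves its reasoning with the final answer, downstream code typically separates the two before displaying or storing results. A minimal sketch, assuming the model wraps reasoning in the `<think>…</think>` tags described above (the `splitThinking` helper name is our own, not part of any SDK):

```typescript
// Split a thinking-model response into its reasoning trace and final answer.
// Assumes reasoning is wrapped in <think>...</think>, as described above.
function splitThinking(output: string): { reasoning: string; answer: string } {
  const match = output.match(/<think>([\s\S]*?)<\/think>/);
  const reasoning = match ? match[1].trim() : "";
  const answer = output.replace(/<think>[\s\S]*?<\/think>/, "").trim();
  return { reasoning, answer };
}

// Illustrative output shape, not a real model response:
const raw =
  "<think>Doc 2 states the attention head count.</think>The model uses 8 heads.";
const { reasoning, answer } = splitThinking(raw);
console.log(answer); // "The model uses 8 heads."
```

Keeping the reasoning trace alongside the answer is useful for auditing how the model used the retrieved documents.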

Leaderboard Rank
#15 of 16
ELO Rating
1250 (#15)
Win Rate
23.8% (#14)
Latency
12,313ms (#10)
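An ELO rating translates directly into an expected head-to-head win probability via the standard logistic formula, E_A = 1 / (1 + 10^((R_B − R_A) / 400)). A quick sketch using this model's overall rating of 1250 from the table above (the 1450 opponent rating is a hypothetical comparison point, not taken from this leaderboard):

```typescript
// Expected score of player A against player B under the standard ELO model:
// E_A = 1 / (1 + 10^((R_B - R_A) / 400))
function expectedScore(ratingA: number, ratingB: number): number {
  return 1 / (1 + Math.pow(10, (ratingB - ratingA) / 400));
}

// Qwen3 30B A3B Thinking (1250) against a hypothetical 1450-rated opponent:
console.log(expectedScore(1250, 1450).toFixed(3)); // "0.240"
```

A 200-point gap therefore predicts roughly a 24% win rate, which is in line with the 23.8% observed here.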

Model Information

Provider
Alibaba/Qwen
License
Open Source
Input Price per 1M
$0.05
Output Price per 1M
$0.34
Context Window
33K
Release Date
2025-08-28
Model Name
qwen3-30b-a3b-thinking-2507
Total Evaluations
1350
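The per-token prices above make per-request cost straightforward to estimate; note that a thinking model spends extra output tokens on its `<think>` trace, billed at the higher output rate. A rough sketch using the listed rates (the token counts are illustrative, not measured):

```typescript
// Estimate request cost in USD from the per-1M-token rates listed above.
const INPUT_PER_M = 0.05;  // $ per 1M input tokens
const OUTPUT_PER_M = 0.34; // $ per 1M output tokens

function requestCost(inputTokens: number, outputTokens: number): number {
  return (inputTokens / 1e6) * INPUT_PER_M + (outputTokens / 1e6) * OUTPUT_PER_M;
}

// Hypothetical RAG call: 10k tokens of retrieved context plus the question,
// 2k tokens of reasoning and answer.
console.log(requestCost(10_000, 2_000).toFixed(5)); // "0.00118"
```

At these rates, even reasoning-heavy requests cost a fraction of a cent each.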

Performance Record

Wins: 321 (23.8%)
Losses: 870 (64.4%)
Ties: 159 (11.8%)

LLMs Are Just One Piece of RAG

Agentset gives you a managed RAG pipeline with the top-ranked models and best practices baked in. No infrastructure to maintain, no LLM orchestration to manage.

Trusted by teams building production RAG applications

5M+
Documents
1,500+
Teams
99.9%
Uptime

Performance Overview

ELO ratings by dataset

Qwen3 30B A3B Thinking's ELO performance varies across different benchmark datasets, showing its strengths in specific domains.


Detailed Metrics

Dataset breakdown

Performance metrics across different benchmark datasets, including quality scores and latency distribution.

SciFact

ELO 1343 · 29.3% WR · 132W-215L-103T

Quality Metrics

Correctness
4.97
Faithfulness
4.97
Grounding
4.93
Relevance
5.00
Completeness
4.87
Overall
4.95

Latency Distribution

Mean
8384ms
Min
2185ms
Max
19414ms

MSMARCO

ELO 1305 · 27.6% WR · 124W-276L-50T

Quality Metrics

Correctness
4.90
Faithfulness
4.90
Grounding
4.87
Relevance
5.00
Completeness
4.80
Overall
4.89

Latency Distribution

Mean
12522ms
Min
1541ms
Max
49799ms

PG

ELO 1104 · 14.4% WR · 65W-379L-6T

Quality Metrics

Correctness
4.90
Faithfulness
4.87
Grounding
4.83
Relevance
4.90
Completeness
4.67
Overall
4.83

Latency Distribution

Mean
16030ms
Min
3483ms
Max
44237ms

Build RAG in Minutes, Not Months

Agentset gives you a complete RAG API with top-ranked LLMs and smart retrieval built in. Upload your data, call the API, and get grounded answers from day one.

import { Agentset } from "agentset";

const agentset = new Agentset();
const ns = agentset.namespace("ns_1234");

const results = await ns.search(
  "What is multi-head attention?"
);

for (const result of results) {
  console.log(result.text);
}

Compare Models

See how it stacks up

Compare Qwen3 30B A3B Thinking with other top LLMs to understand the differences in performance, accuracy, and latency.