CodeAgents

Models & benchmarks

Model matrix

Every model each coding agent supports, with context windows, pricing, and SWE-bench / HumanEval scores side by side.

Last verified: April 2026

Benchmark scores

Higher is better. SWE-bench Verified measures end-to-end task completion on real GitHub issues; HumanEval measures Python function synthesis.

Sources: official SWE-bench Verified leaderboard and provider-published HumanEval scores.

Supported models per agent

Claude Code

| Model | Provider | Context | Input / 1M | Output / 1M |
|---|---|---|---|---|
| Claude Sonnet 4.6 (default) | Anthropic | 1M | $3 | $15 |
| Claude Opus 4.7 | Anthropic | 1M | $5 | $25 |
| Claude Opus 4.6 | Anthropic | 1M | $5 | $25 |
| Claude Opus 4.5 | Anthropic | 200K | $5 | $25 |
| Claude Opus 4.1 | Anthropic | 200K | $15 | $75 |
| Claude Sonnet 4.5 | Anthropic | 200K | $3 | $15 |
| Claude Haiku 4.5 | Anthropic | 200K | $1 | $5 |

SWE-bench: 82% · HumanEval: 94%

Claude Code defaults to Sonnet 4.6 (1M context). Opus 4.7 is the strongest Anthropic model for complex agentic coding. Haiku 4.5 is the fast, cheap option. Older Opus 4 / Sonnet 4 / Haiku 3.5 are deprecated and being removed.
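Per-request cost from the table above is straightforward arithmetic: tokens divided by one million, times the per-1M rate. A minimal sketch using the rates from the table (the helper function and model keys are illustrative, not part of any official SDK):

```python
# Per-1M-token prices (input, output) in USD, from the table above.
CLAUDE_PRICES = {
    "claude-sonnet-4.6": (3.00, 15.00),
    "claude-opus-4.7": (5.00, 25.00),
    "claude-haiku-4.5": (1.00, 5.00),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """USD cost of one request: tokens / 1M * per-1M rate, input plus output."""
    in_rate, out_rate = CLAUDE_PRICES[model]
    return input_tokens / 1e6 * in_rate + output_tokens / 1e6 * out_rate

# 50K input + 8K output on Sonnet 4.6: 0.05 * $3 + 0.008 * $15 = $0.27
print(round(request_cost("claude-sonnet-4.6", 50_000, 8_000), 2))
```

The same arithmetic applies to every per-token table on this page; only the rates change.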

Cursor

| Model | Provider | Context | Input / 1M | Output / 1M |
|---|---|---|---|---|
| Composer-1 (in-house, default) | Cursor | 200K | Bundled | Bundled |
| Claude Sonnet 4.6 | Anthropic | 1M | Bundled | Bundled |
| Claude Opus 4.7 | Anthropic | 1M | Bundled | Bundled |
| Claude Haiku 4.5 | Anthropic | 200K | Bundled | Bundled |
| GPT-5 | OpenAI | 400K | Bundled | Bundled |
| GPT-5 Codex | OpenAI | 400K | Bundled | Bundled |
| Gemini 3.1 Pro | Google | 1M | Bundled | Bundled |
| Gemini 3 Flash | Google | 1M | Bundled | Bundled |

SWE-bench: 73.5% · HumanEval: 91%

Models bundled in the Cursor Pro subscription ($20/mo). Power users add their own API keys to bypass quota.

GitHub Copilot

| Model | Provider | Context | Input / 1M | Output / 1M |
|---|---|---|---|---|
| GPT-5 (default) | OpenAI | 400K | Bundled | Bundled |
| GPT-5 mini | OpenAI | 400K | Bundled | Bundled |
| Claude Sonnet 4.6 | Anthropic | 1M | Bundled | Bundled |
| Claude Opus 4.7 | Anthropic | 1M | Bundled | Bundled |
| Claude Haiku 4.5 | Anthropic | 200K | Bundled | Bundled |
| Gemini 3.1 Pro | Google | 1M | Bundled | Bundled |
| Gemini 3 Flash | Google | 1M | Bundled | Bundled |
| GPT-4.1 | OpenAI | 128K | Bundled | Bundled |

SWE-bench: 68.3% · HumanEval: 89%

The model picker is available on Pro, Business, and Enterprise plans; the free tier is limited to the default model.

Gemini CLI

| Model | Provider | Context | Input / 1M | Output / 1M |
|---|---|---|---|---|
| Gemini 3.1 Pro (Preview, default) | Google | 1M | $2 | $12 |
| Gemini 3 Flash (Preview) | Google | 1M | $0.50 | $3 |
| Gemini 3.1 Flash-Lite (Preview) | Google | 1M | $0.25 | $1.50 |
| Gemini 2.5 Pro | Google | 1M | $1.25 | $10 |
| Gemini 2.5 Flash | Google | 1M | $0.30 | $2.50 |

SWE-bench: 76.2% · HumanEval: 93%

Free tier via personal Google account: 60 requests/min, 1,000 requests/day on Gemini 3.1 Pro Preview. Gemini 3.1 Pro pricing tiers: $2/$12 for prompts ≤200K tokens, $4/$18 above.
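Tiered pricing means the cost of a request depends on prompt size. A sketch of the rule stated above, assuming the higher rate applies to the entire request once the prompt exceeds 200K tokens (rather than only to the marginal tokens; verify against Google's pricing page):

```python
def gemini_31_pro_cost(input_tokens: int, output_tokens: int) -> float:
    """USD cost under the two-tier pricing described above.

    Assumption: once the prompt exceeds 200K tokens, the higher
    $4 / $18 rates apply to the whole request, not just the overflow.
    """
    if input_tokens <= 200_000:
        in_rate, out_rate = 2.00, 12.00
    else:
        in_rate, out_rate = 4.00, 18.00
    return input_tokens / 1e6 * in_rate + output_tokens / 1e6 * out_rate

# A 150K-token prompt stays in the low tier; 300K crosses into the high tier.
print(round(gemini_31_pro_cost(150_000, 10_000), 2))  # 0.15 * $2 + 0.01 * $12 = $0.42
print(round(gemini_31_pro_cost(300_000, 10_000), 2))  # 0.30 * $4 + 0.01 * $18 = $1.38
```

Note the jump: doubling the prompt from 150K to 300K more than triples the cost, because the whole request moves to the higher tier.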

OpenCode

| Model | Provider | Context | Input / 1M | Output / 1M |
|---|---|---|---|---|
| Any provider via API key (default) | BYOK | Provider-defined | Provider rates | Provider rates |
| Claude Sonnet 4.6 | Anthropic | 1M | $3 | $15 |
| Claude Opus 4.7 | Anthropic | 1M | $5 | $25 |
| Claude Haiku 4.5 | Anthropic | 200K | $1 | $5 |
| GPT-5 | OpenAI | 400K | $5 | $20 |
| Gemini 3.1 Pro | Google | 1M | $2 | $12 |
| Local Ollama models | Ollama | 32K | Free | Free |

SWE-bench: 75% · HumanEval: 91%

OpenCode is a TUI shell, so pricing follows whichever provider key you supply. Local Ollama models are free but typically cap out around 32K context.
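Because OpenCode is bring-your-own-key, the table above effectively becomes a lookup you can query, e.g. to pick the cheapest model whose context window fits a given prompt. A sketch using the rows above; the 3:1 input:output blend used for ranking is an arbitrary illustrative choice, not anything OpenCode itself does:

```python
# (model, context window in tokens, input $/1M, output $/1M) from the table above.
MODELS = [
    ("Claude Sonnet 4.6", 1_000_000, 3.00, 15.00),
    ("Claude Opus 4.7",   1_000_000, 5.00, 25.00),
    ("Claude Haiku 4.5",    200_000, 1.00,  5.00),
    ("GPT-5",               400_000, 5.00, 20.00),
    ("Gemini 3.1 Pro",    1_000_000, 2.00, 12.00),
    ("Local Ollama",         32_000, 0.00,  0.00),
]

def cheapest_for(context_needed: int) -> str:
    """Cheapest listed model whose window fits, ranked by a 3:1 in:out blend."""
    fits = [m for m in MODELS if m[1] >= context_needed]
    return min(fits, key=lambda m: 0.75 * m[2] + 0.25 * m[3])[0]

print(cheapest_for(20_000))   # small prompts: free local Ollama wins
print(cheapest_for(500_000))  # beyond 400K, only the 1M-context models fit
```

Swap in your own blend ratio (or real usage logs) to make the ranking match your workload.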

Prices in USD per 1M tokens. Pricing changes frequently; verify on each provider's official page before relying on it.