LLM Comparison 2026: GPT-4o vs Claude vs Gemini vs Llama | A Versus B

A Versus B

LLM Comparison 2026

An independent, citation-backed comparison of 12 leading large language models — covering parameter count, context window, input/output modalities, license, and knowledge cutoff. Column structure mirrors Wikipedia's Comparison of large language models for easy cross-reference.

By Daniel Rozin, Founder & Editor-in-Chief·Last updated: May 22, 2026·Methodology

Key takeaways (as of May 2026)

Largest context windows: GPT-4.1 and Gemini 2.0 Flash / 2.5 Pro — all at 1M tokens.^[1]^[5]
Most open: DeepSeek-V3 (MIT, 671B MoE) and Llama 3.3 70B (community license, 70.6B) — both weights downloadable.^[10]^[7]
Multimodal leaders: GPT-4o (text/image/audio in+out) and Gemini 2.0 Flash (text/image/audio/video in, text/image out).^[1]^[5]
Parameters undisclosed: OpenAI (GPT-4 series) and Anthropic (Claude 3 series) have not published parameter counts. Cells read “Undisclosed” — not estimated figures.

Comparison table

All data as of May 2026.^[*] “Undisclosed” means the vendor has not published the value — not that it is unknown to us. Context window definitions vary; see methodology for details.

Model	Vendor	Parameters	Context window	Input modalities	Output modalities	License	Knowledge cutoff
GPT-4o	OpenAI	Undisclosed^[1]	128K	Text, Image, Audio	Text, Audio	Proprietary	Oct 2023
GPT-4.1	OpenAI	Undisclosed^[2]	1M	Text, Image	Text	Proprietary	Undisclosed
Claude 3.5 Sonnet	Anthropic	Undisclosed^[3]	200K	Text, Image	Text	Proprietary	Apr 2024
Claude 3.7 Sonnet	Anthropic	Undisclosed^[4]	200K	Text, Image	Text	Proprietary	Undisclosed
Gemini 2.0 Flash	Google DeepMind	Undisclosed^[5]	1M	Text, Image, Audio, Video	Text, Image	Proprietary	Aug 2024
Gemini 2.5 Pro	Google DeepMind	Undisclosed^[6]	1M	Text, Image, Audio, Video	Text	Proprietary	Undisclosed
Llama 3.3 70B	Meta AI	70.6B^[7]	128K	Text	Text	Llama 3 Community License (open weights)	Dec 2023
Mistral Large 2	Mistral AI	Undisclosed^[8]	128K	Text	Text	Proprietary	Undisclosed
Grok-3	xAI	Undisclosed^[9]	131K	Text, Image	Text	Proprietary	Undisclosed
DeepSeek-V3	DeepSeek	671B total / 37B active (MoE)^[10]	128K	Text	Text	MIT (open weights)	Undisclosed
Qwen2.5-72B	Alibaba Cloud	72.7B^[11]	128K	Text	Text	Qwen License (Apache 2.0 base with usage restrictions)	Undisclosed
Command R+	Cohere	Undisclosed^[12]	128K	Text	Text	Proprietary	Undisclosed

How we compile this table

Our full methodology covers column definitions, source tiers, the “Undisclosed” policy, recency policy, COI disclosure, and our correction process. Column structure intentionally mirrors Wikipedia's LLM comparison article for editor cross-reference.

Read the methodology →

Sources

GPT-4o (OpenAI). GPT-4o system card; OpenAI models API reference Accessed May 2026.
GPT-4.1 (OpenAI). OpenAI models API reference Accessed May 2026.
Claude 3.5 Sonnet (Anthropic). Anthropic models list Accessed May 2026.
Claude 3.7 Sonnet (Anthropic). Anthropic Claude 3.7 Sonnet announcement; Anthropic models list Accessed May 2026.
Gemini 2.0 Flash (Google DeepMind). Google AI Gemini model page Accessed May 2026.
Gemini 2.5 Pro (Google DeepMind). Google AI Gemini 2.5 Pro model page Accessed May 2026.
Llama 3.3 70B (Meta AI). Meta Llama 3.3 announcement; Llama 3.3 70B HuggingFace model card; Llama 3 technical report (arXiv 2407.21783) Accessed May 2026.
Mistral Large 2 (Mistral AI). Mistral AI models Accessed May 2026.
Grok-3 (xAI). xAI Grok-3 announcement Accessed May 2026.
DeepSeek-V3 (DeepSeek). DeepSeek-V3 technical report (arXiv 2412.19437); DeepSeek-V3 GitHub Accessed May 2026.
Qwen2.5-72B (Alibaba Cloud). Qwen2.5 technical report (arXiv 2412.15115); Qwen2.5-72B HuggingFace model card Accessed May 2026.
Command R+ (Cohere). Cohere Command R+ model page Accessed May 2026.
^* Data reflects information available from primary vendor sources as of May 2026. This is a fast-moving field; check the May 2026 dateModified stamp above and consult vendor documentation for the latest specifications.