OpenAI
OpenAI APIGPT-5.5
OpenAI’s current flagship. Strongest reasoning in the lineup — priced for high-value tasks rather than volume workloads.
GPT-5.4
Current mainstream frontier. Best OpenAI model for production workloads — reasoning, vision, and multimodal tasks.
GPT-5.4 mini
Frontier-level intelligence at mid-tier cost. Replaces GPT-4o as the go-to balanced option for most production use cases.
GPT-4o mini
Still the cheapest OpenAI option per token. Ideal for high-volume classification, summarisation, or simple routing tasks.
GPT-4.1
Best choice for document-heavy workflows. 1M token context window for long codebases, contracts, or multi-document retrieval.
GPT-4.1 mini
1M context at budget price. Great for batch processing large documents where you don’t need frontier quality.
Anthropic
Claude APIClaude Opus 4.8
Anthropic’s current flagship (released May 2026). Same price as Opus 4.7 with stronger reasoning and agentic performance.
Claude Opus 4.7
Most capable Claude model. Step-change improvement in agentic coding over Opus 4.6. For complex reasoning and autonomous agents.
Claude Sonnet 4.6
Best balance of speed and intelligence. Production-ready for coding, analysis, and complex reasoning. Extended thinking available.
Claude Haiku 4.5
Fastest Claude with near-frontier intelligence. Built for customer-facing apps where latency and cost per call matter most.
Side-by-side comparison
All models ranked by input token price. June 2026 — verify at provider pages before committing to a budget.
| Model | Input / 1M | Cached / 1M | Output / 1M | Context | |
|---|---|---|---|---|---|
| GPT-4o mini OpenAI | $0.15 | $0.075 | $0.60 | 128K | Calculate → |
| GPT-4.1 mini OpenAI | $0.40 | $0.10 | $1.60 | 1M | Calculate → |
| GPT-5.4 mini OpenAI | $0.75 | $0.075 | $4.50 | 128K | Calculate → |
| Claude Haiku 4.5 Anthropic | $1.00 | $0.10 | $5.00 | 200K | Calculate → |
| GPT-4.1 OpenAI | $2.00 | $0.50 | $8.00 | 1M | Calculate → |
| GPT-5.4 OpenAI | $2.50 | $0.25 | $15.00 | 128K | Calculate → |
| Claude Sonnet 4.6 Anthropic | $3.00 | $0.30 | $15.00 | 1M | Calculate → |
| Claude Opus 4.8 Anthropic | $5.00 | $0.50 | $25.00 | 1M | Calculate → |
| GPT-5.5 OpenAI | $5.00 | $0.50 | $30.00 | 256K | Calculate → |
| Claude Opus 4.7 Anthropic | $5.00 | $0.50 | $25.00 | 1M | Calculate → |
Frequently asked questions
What is a token?
Tokens are the units LLMs use to process text. One token is roughly 4 characters or ¾ of an English word. “Hello world” is about 3 tokens. A 1,000-word document is roughly 1,300 tokens.
What’s the difference between input and output tokens?
Input tokens are everything you send to the model — your prompt, system instructions, retrieved documents. Output tokens are what the model generates back. Output is typically 3–5× more expensive per token because it’s computationally heavier.
What is prompt caching?
Prompt caching lets you reuse a fixed prefix (e.g. a long system prompt or knowledge base) across calls. Cached tokens cost 50–90% less than uncached. Both OpenAI and Anthropic support it. High cache hit rates dramatically reduce costs for document-heavy workflows.
What is the batch API?
Both OpenAI and Anthropic offer a batch API where jobs are processed asynchronously (within 24 hours) at a 50% discount. Ideal for background tasks, bulk evaluation, or overnight processing pipelines that don’t need real-time responses.
How do I estimate my monthly token usage?
Multiply: (monthly active users) × (workflows per user) × (model calls per workflow) × (avg tokens per call). Don’t forget retries, which typically add 5–15% overhead. Use the full calculator to model this with low/expected/high scenarios.
Why is the real cost higher than just tokens?
Token costs are rarely the biggest line item in a production AI system. Fixed infrastructure, embeddings, web search, and OCR often dominate. The full-stack calculator models all of these and shows you what’s actually driving cost.
GPT-5.4 vs Claude Sonnet 4.6 — which is cheaper?
GPT-5.4 input is marginally cheaper ($2.50 vs $3.00/M) and both charge $15/M for output. Cached rates are also close ($0.25 vs $0.30/M) — so for most workloads the price difference is small, and quality on your specific task should decide. Note Anthropic adds a one-off 1.25× surcharge when writing the cache; OpenAI doesn’t.
When should I use a mini / Haiku model?
For classification, routing, short-form generation, or any task where a smaller model gets the job done. GPT-4o mini and Claude Haiku 4.5 are 5–30× cheaper than their flagship counterparts. Run a quality eval on your specific task before assuming you need the bigger model.