Updated March 2026
Claude API Pricing in 2026
Anthropic offers three Claude models ranging from $0.80 to $15 per million input tokens. This guide breaks down every cost, explains caching and batch discounts, and helps you choose the right model for your workload.
- Haiku 3.5: $0.80/MTok input
- Sonnet 4: $3.00/MTok input
- Opus 4: $15.00/MTok input
Complete Claude API Pricing Table
All prices are per 1 million tokens. Prompt caching write costs 25% more; cache reads cost 90% less. Batch API is 50% off standard prices.
| Model | Input | Output |
|---|---|---|
| Claude Opus 4 | $15.00 | $75.00 |
| Claude Sonnet 4 (Most Popular) | $3.00 | $15.00 |
| Claude Haiku 3.5 | $0.80 | $4.00 |
How Claude API Pricing Works
The Claude API uses token-based pricing. Every piece of text you send to the API (your prompt, system instructions, conversation history) is split into tokens—small chunks of text that the model processes. You pay separately for input tokens (what you send) and output tokens (what Claude generates back).
Output tokens are always more expensive than input tokens because generating text requires more computation than reading it. On Claude Sonnet 4, output costs 5x more than input ($15 vs $3 per million tokens). This means a long prompt with a short response is cheaper than a short prompt with a long response.
Prices are quoted per million tokens (MTok). To put that in perspective: one million tokens equals roughly 750,000 words, which is about 1,500 pages of standard text or roughly 10-15 full-length novels. A single customer support interaction typically uses 500-2,000 input tokens and 200-1,000 output tokens—a tiny fraction of a million.
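Using the rough rule of thumb above (one million tokens is about 750,000 words), a quick estimator can turn a word count into an approximate token count. This is a sketch only; actual tokenization varies by content:

```python
def estimate_tokens(word_count: int) -> int:
    """Rough token estimate from the rule of thumb:
    1,000,000 tokens ~= 750,000 words (about 1.33 tokens per word)."""
    return round(word_count * 1_000_000 / 750_000)

# A 1,500-word article is roughly 2,000 tokens
print(estimate_tokens(1_500))  # → 2000
```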
The cost formula
cost = (input tokens ÷ 1,000,000 × input price per MTok) + (output tokens ÷ 1,000,000 × output price per MTok)
Example: 2,000 input tokens + 500 output tokens on Sonnet 4 = (2,000 / 1M × $3) + (500 / 1M × $15) = $0.0135 per request.
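The formula above can be sketched as a small helper. The prices are passed in as parameters, so the same function works for any model in the table:

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price: float, output_price: float) -> float:
    """Cost in dollars for one request; prices are per million tokens (MTok)."""
    return (input_tokens / 1_000_000) * input_price \
         + (output_tokens / 1_000_000) * output_price

# The Sonnet 4 example from the text: 2,000 input + 500 output tokens
cost = request_cost(2_000, 500, input_price=3.00, output_price=15.00)
print(f"${cost:.4f}")  # → $0.0135
```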
There is no monthly subscription, no minimum spend, and no commitment. You pay only for the tokens you consume. Anthropic bills based on actual usage, making it easy to start small and scale as your application grows. You can also reduce costs significantly through prompt caching (save up to 90% on input) and the Batch API (50% off everything).
Quick Cost Examples
What common API requests actually cost, calculated with real pricing.
Chatbot conversation
500 input + 200 output tokens on Sonnet 4
<$0.01
per request
Summarise a 50-page document
~30,000 input + 1,000 output tokens on Sonnet 4
$0.10
per request
10,000 support tickets via Batch API
2,000 input + 500 output each on Haiku Batch (50% off)
$18.00
total
The batch figure is the combined cost for all 10,000 tickets ($18.00 total, about $0.0018 per ticket), not per request.
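The arithmetic behind that $18.00 figure, assuming Haiku 3.5 batch prices of half the standard $0.80 input / $4.00 output per MTok:

```python
# Haiku 3.5 Batch API prices: 50% off the standard $0.80 / $4.00 per MTok
BATCH_INPUT, BATCH_OUTPUT = 0.40, 2.00

tickets = 10_000
# Each ticket: 2,000 input + 500 output tokens
per_ticket = (2_000 / 1_000_000) * BATCH_INPUT + (500 / 1_000_000) * BATCH_OUTPUT
total = per_ticket * tickets
print(f"${per_ticket:.4f} per ticket, ${total:.2f} total")
# → $0.0018 per ticket, $18.00 total
```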
Choose the Right Claude Model
Each model sits at a different point on the cost-quality spectrum. Pick the cheapest one that meets your quality bar.
Claude Sonnet 4
$3 input / $15 output per MTok
The sweet spot. Handles complex coding, analysis, and content generation with quality close to Opus at a fraction of the price.
Best for:
- Production chatbots and assistants
- Code generation and review
- Content creation at scale
- RAG-powered Q&A systems
Claude Opus 4
$15 input / $75 output per MTok
Maximum intelligence for tasks where quality matters more than cost. Excels at multi-step reasoning, research synthesis, and architectural decisions.
Best for:
- Legal and scientific analysis
- Complex multi-step reasoning
- Agentic coding workflows
- High-stakes decision support
Claude Haiku 3.5
$0.80 input / $4 output per MTok
Fast and affordable. Purpose-built for high-volume workloads where speed and cost matter more than deep reasoning.
Best for:
- Classification and routing
- Entity extraction
- Content moderation
- Real-time data formatting
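The advice above ("pick the cheapest model that meets your quality bar") is easy to check numerically. This sketch compares a typical request across the three models using the prices from the table; the model identifier strings are illustrative labels, not official API model names:

```python
# Per-MTok prices from the pricing table above: (input, output)
PRICES = {
    "haiku-3.5":  (0.80, 4.00),
    "sonnet-4":   (3.00, 15.00),
    "opus-4":     (15.00, 75.00),
}

def cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request on the given model."""
    inp, out = PRICES[model]
    return (input_tokens * inp + output_tokens * out) / 1_000_000

# Compare a typical support request (2,000 input / 500 output tokens)
for model in PRICES:
    print(f"{model}: ${cost(model, 2_000, 500):.4f}")
# haiku-3.5 ≈ $0.0036, sonnet-4 ≈ $0.0135, opus-4 ≈ $0.0675
```

At this request size Opus costs about 19x more than Haiku, which is why routing easy requests to a cheaper model pays off quickly at volume.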
Save Money with Advanced Features
Two built-in features can dramatically reduce your Claude API bill.
Prompt Caching
Cache your system prompt and shared context across requests. Cached input tokens cost just 10% of the standard input price. A chatbot sending a 2,000-token system prompt with 100,000 requests/month on Sonnet saves approximately $540/month on input costs alone.
Learn about prompt caching pricing →
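The arithmetic behind the ~$540/month savings, using the cache-read price of 10% of standard input (and ignoring the one-time 25% cache-write premium, which is negligible at this volume):

```python
SONNET_INPUT = 3.00                 # $/MTok, standard input price
CACHE_READ = SONNET_INPUT * 0.10    # cache reads cost 10% of input price

prompt_tokens = 2_000               # cached system prompt size
requests = 100_000                  # requests per month
total_mtok = prompt_tokens * requests / 1_000_000  # 200 MTok/month

uncached = total_mtok * SONNET_INPUT  # $600.00 without caching
cached = total_mtok * CACHE_READ      # $60.00 with caching
print(f"saves ${uncached - cached:.2f}/month")  # → saves $540.00/month
```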
Batch API
Submit requests in bulk and get a flat 50% discount on both input and output tokens. Results are processed within a 24-hour window. Ideal for content generation, data processing, bulk classification, and evaluation pipelines that do not need real-time responses.
Learn about batch API pricing →
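Because the batch discount is a flat 50% on both input and output tokens, it can be applied on top of any standard cost. A minimal sketch:

```python
def with_batch_discount(standard_cost: float) -> float:
    """Batch API: flat 50% off both input and output tokens."""
    return standard_cost * 0.50

# Standard Sonnet 4 cost for 1M input + 200k output tokens:
standard = 1.0 * 3.00 + 0.2 * 15.00   # $6.00
print(f"batch: ${with_batch_discount(standard):.2f}")  # → batch: $3.00
```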
How Does Claude Compare?
See how Claude API pricing stacks up against the other major LLM providers, token-for-token.