For developers calling Claude programmatically. You write the system prompt, you choose the model, you configure caching, you decide what enters context. The consumer interfaces make many decisions for you; the API exposes all of them.
Per million tokens, as of April 2026, on standard pricing:
| Model | Input | Output | Released |
|---|---|---|---|
| Claude Haiku 4.5 | $1.00 | $5.00 | Oct 2025 |
| Claude Sonnet 4.6 | $3.00 | $15.00 | Feb 2026 |
| Claude Opus 4.6 | $5.00 | $25.00 | Feb 2026 |
| Claude Opus 4.7 | $5.00 | $25.00 | Apr 2026 |
Pricing verified against Anthropic's official documentation and several independent sources on April 25, 2026. Note that Claude Opus 4.7 ships with a new tokenizer that can produce up to 35% more tokens than 4.6 for the same input text on certain content types; the sticker price is unchanged from 4.6, but effective per-task cost can rise accordingly.
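The tokenizer change is easiest to see as arithmetic: identical sticker prices, but the worst-case token inflation flows straight through to per-task cost. A minimal sketch, using the Opus prices from the table above; the task's token counts are hypothetical:

```python
# Effective per-task cost on Opus 4.6 vs 4.7 when the new tokenizer
# emits up to 35% more tokens for the same text.
INPUT_PRICE = 5.00    # $/M input tokens, Opus 4.6 and 4.7 alike
OUTPUT_PRICE = 25.00  # $/M output tokens

def task_cost(input_tokens: float, output_tokens: float) -> float:
    return input_tokens / 1e6 * INPUT_PRICE + output_tokens / 1e6 * OUTPUT_PRICE

# Hypothetical task: 40k tokens in, 2k out under the 4.6 tokenizer.
cost_46 = task_cost(40_000, 2_000)
# Worst case on 4.7: the same text tokenizes to 35% more tokens.
cost_47 = task_cost(40_000 * 1.35, 2_000 * 1.35)

print(f"4.6: ${cost_46:.4f}  4.7 worst case: ${cost_47:.4f}")
# Same list price per token, but effective cost rises by the same 35%.
```

Since both prices scale linearly with token count, the effective per-task increase is exactly the tokenizer's inflation factor, regardless of the input/output mix.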
Three structural facts about this pricing:
API users have two large multiplicative discounts available, and they stack. The first is prompt caching: `"cache_control": {"type": "ephemeral"}` on static content (system prompt, tool definitions, document prefixes) bills cache writes at 1.25x the base input price and cache reads at 0.1x. On a 50,000-token system prompt sent across 20 requests, cost drops from approximately $3.00 to $0.47 on Sonnet 4.6. The second is the Message Batches API, which halves token prices for asynchronous workloads. Stacked, a batch request with a cache hit costs as little as 5% of standard non-cached pricing. For document-processing pipelines, bulk classification, and offline analysis, batches plus caching is the single biggest cost lever available.
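The caching figures above can be reproduced directly. A sketch of the arithmetic, assuming Anthropic's published multipliers at the time of writing (cache writes at 1.25x base, cache reads at 0.1x, batch requests at 0.5x):

```python
# Caching math: 50k-token system prompt, 20 requests, Sonnet 4.6 input
# at $3.00 per million tokens.
BASE = 3.00         # $/M input tokens, Sonnet 4.6
CACHE_WRITE = 1.25  # multiplier on the first (cache-miss) request
CACHE_READ = 0.10   # multiplier on subsequent cache hits
BATCH = 0.50        # Message Batches API multiplier

PROMPT_TOKENS = 50_000
REQUESTS = 20

uncached = REQUESTS * PROMPT_TOKENS / 1e6 * BASE
cached = (PROMPT_TOKENS / 1e6 * BASE * CACHE_WRITE                      # 1 write
          + (REQUESTS - 1) * PROMPT_TOKENS / 1e6 * BASE * CACHE_READ)   # 19 hits

print(f"uncached ${uncached:.2f} vs cached ${cached:.2f}")
# The discounts stack multiplicatively: a batched cache hit pays
# 0.5 * 0.1 = 5% of the standard non-cached rate.
print(f"batched cache hit: {BATCH * CACHE_READ:.0%} of list price")
```

The one cache write at 1.25x is quickly amortized: here it adds $0.19 once, against $0.15 saved on every subsequent request.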
Model routing is the single highest-leverage architectural decision. Routing simple tasks to Haiku and reserving Sonnet or Opus for tasks that genuinely require stronger reasoning can cut blended costs by 5x or more; per the table above, Opus input and output are each five times Haiku's price. A system that uses Opus for everything when Haiku would suffice 70% of the time is paying 5x more than necessary on the majority of its requests. Build a router: classification, extraction, simple summarisation, and routing itself go to Haiku; general application work goes to Sonnet; complex multi-step reasoning and agentic coding go to Opus.
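The split above can start as a plain lookup table before graduating to a classifier model. A minimal sketch; the task-type labels are illustrative, and the model tiers stand in for real API model identifiers, which should come from Anthropic's current model list:

```python
# Minimal model router following the tiering described above.
# "haiku"/"sonnet"/"opus" are tier placeholders, not real model IDs.
ROUTES = {
    "classification": "haiku",
    "extraction": "haiku",
    "summarisation": "haiku",
    "routing": "haiku",
    "general": "sonnet",
    "reasoning": "opus",
    "agentic_coding": "opus",
}

def route(task_type: str) -> str:
    """Pick the cheapest adequate tier; default unknown tasks to Sonnet."""
    return ROUTES.get(task_type, "sonnet")

print(route("extraction"))      # cheap tier for simple structured tasks
print(route("agentic_coding"))  # strongest tier for multi-step agent work
```

Defaulting unknown task types to the middle tier is a judgment call: it trades some cost for a safety margin, and misrouted tasks can be escalated on failure.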