Breaking

Some Customers Cut Anthropic, OpenAI Bills by Over 90% via Cheaper Models

June 25, 2026 at 09:30 EDT

Foundation Models
Sales & Business
Infra & Chips

A subset of large Anthropic and OpenAI customers is cutting API bills by more than 90% by scaling back use of flagship models and moving work to cheaper models or alternatives.

AI Cost Strategy · Enterprise

Enterprises Slash AI Bills by 90%+ by Switching to Cheaper Models

Major Anthropic and OpenAI customers are abandoning flagship-only setups for a "hybrid strategy": reserve top models for hard problems, route routine work to cheap ones. For high-volume, simple tasks, the savings top 90%.

90%+: cost cut on high-volume, simple tasks
3–5×: cheaper just by swapping Opus or Sonnet for Haiku
73.3%: Haiku 4.5 on SWE-bench, near Sonnet's 77.2%

Output price per 1M tokens: the gap is the story

Same task, very different bill.

Claude Opus: $25
Claude Sonnet 4.5: $15
Claude Haiku 4.5: price not shown
GPT-4o mini: $0.60
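To make the gap concrete, here is a minimal sketch of the arithmetic, using only the output prices listed above. The function names and the 200M-token monthly volume are illustrative assumptions, not figures from the reporting, and input-token costs are ignored.

```python
# Output prices in USD per 1M tokens, taken from the figures above.
# Claude Haiku 4.5 is left out because no price is shown for it.
OUTPUT_PRICE_PER_M = {
    "Claude Opus": 25.00,
    "Claude Sonnet 4.5": 15.00,
    "GPT-4o mini": 0.60,
}


def output_cost(model: str, output_tokens: int) -> float:
    """USD cost of generating `output_tokens` tokens with `model`."""
    return OUTPUT_PRICE_PER_M[model] * output_tokens / 1_000_000


def savings_pct(old_model: str, new_model: str) -> float:
    """Percentage saved on output tokens by switching models."""
    old = OUTPUT_PRICE_PER_M[old_model]
    new = OUTPUT_PRICE_PER_M[new_model]
    return (old - new) / old * 100


if __name__ == "__main__":
    tokens = 200_000_000  # hypothetical monthly output volume
    for model in OUTPUT_PRICE_PER_M:
        print(f"{model}: ${output_cost(model, tokens):,.2f}")
    pct = savings_pct("Claude Opus", "GPT-4o mini")
    print(f"Opus -> GPT-4o mini: {pct:.1f}% saved")
```

At that assumed volume, the same output costs about $5,000 on Opus and about $120 on GPT-4o mini, a reduction of roughly 97.6%, which is consistent with the "90%+" figure for work that can move entirely to a cheap model.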

The hybrid routing strategy

Incoming task: every request enters a router
→ Route by need: match the right model to each task
→ Result: cost down, quality held
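How far the bill falls under a hybrid setup depends on how much traffic the router can send to the cheap tier. The sketch below estimates a blended output price under that assumption. The function names and the traffic shares are hypothetical; only the $25 and $0.60 prices come from the figures above.

```python
def blended_price(cheap_share: float, cheap_price: float, flagship_price: float) -> float:
    """Average output price per 1M tokens when `cheap_share` of tokens
    go to the cheap model and the rest go to the flagship."""
    if not 0.0 <= cheap_share <= 1.0:
        raise ValueError("cheap_share must be between 0 and 1")
    return cheap_share * cheap_price + (1.0 - cheap_share) * flagship_price


def blended_savings_pct(cheap_share: float, cheap_price: float, flagship_price: float) -> float:
    """Percentage saved versus sending every token to the flagship."""
    blended = blended_price(cheap_share, cheap_price, flagship_price)
    return (flagship_price - blended) / flagship_price * 100


if __name__ == "__main__":
    # Hypothetical traffic shares; prices are USD per 1M output tokens.
    for share in (0.50, 0.90, 0.95):
        price = blended_price(share, 0.60, 25.00)
        pct = blended_savings_pct(share, 0.60, 25.00)
        print(f"{share:.0%} cheap: ${price:.2f} per 1M tokens, {pct:.1f}% saved")
```

Under these assumptions, routing 90% of output tokens to the cheap model gives a blended price of about $3.04 per 1M tokens, a saving of roughly 88%. Savings above 90% require an even larger cheap share, which fits the article's framing that the biggest cuts apply to high-volume, simple tasks.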

Cheap models are "good enough" for

Classification & extraction
High-volume, repetitive processing
Simple coding and RAG

Flagships still win for

Deep analysis & advanced coding
Complex agent tasks
Cases where cheap models degrade
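The split above could be expressed as a simple routing rule. This is a minimal sketch, not any customer's actual router: the task labels, the placeholder model names, and the `cheap_degraded` flag are all hypothetical, and a real system would need its own way of classifying tasks and detecting quality drops.

```python
from dataclasses import dataclass

# Hypothetical labels for the task types listed as "good enough" for cheap models.
CHEAP_TASKS = {
    "classification",
    "extraction",
    "bulk_processing",
    "simple_coding",
    "rag",
}

# Placeholder tier names, not real model identifiers.
CHEAP_MODEL = "cheap-model"
FLAGSHIP_MODEL = "flagship-model"


@dataclass
class Task:
    kind: str    # one of the hypothetical labels above, or anything else
    prompt: str


def route(task: Task, cheap_degraded: bool = False) -> str:
    """Pick a model tier for a task.

    Known-simple tasks go to the cheap tier unless the caller reports
    that cheap-model quality has degraded for this workload. Everything
    else, including unrecognized task kinds, goes to the flagship.
    """
    if task.kind in CHEAP_TASKS and not cheap_degraded:
        return CHEAP_MODEL
    return FLAGSHIP_MODEL


if __name__ == "__main__":
    print(route(Task("classification", "Label this support ticket")))
    print(route(Task("complex_agent", "Plan and run a multi-step migration")))
    print(route(Task("rag", "Answer from these documents"), cheap_degraded=True))
```

Defaulting unknown task kinds to the flagship is a deliberately conservative choice: it trades some savings for a lower risk of sending hard problems to a model that cannot handle them.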

The bigger picture

An AI price war is underway: OpenAI is weighing drastic price cuts while cheaper alternatives, including Chinese open-source models, gain ground. Moving away from flagship-only reliance is becoming an industry-wide trend.
