Serverless pricing · per 1M tokens · USD

Opus 4.8 vs Opus 4.7 vs GLM-5.2

Drag the sliders to see how cost scales across real workloads. Both Opus models share the same standard list price; GLM-5.2 is ~3.6× cheaper on input and ~5.7× cheaper on output. Objective, numbers-only comparison.

Cheaper on input: 3.6× ($5.00 → $1.40 per M input tokens)
Cheaper on output: 5.7× ($25.00 → $4.40 per M output tokens)
Cheaper per request: 4.0× (at the current slider mix)
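
As a rough check on the multipliers above, the sketch below recomputes them from the list prices on this page. The function names and the 10,000-input / 1,000-output token mix are illustrative assumptions, not the page's actual slider settings, and cached input is ignored here.

```python
# List prices in USD per 1M tokens, copied from the headline rates on this page.
OPUS = {"input": 5.00, "output": 25.00}
GLM = {"input": 1.40, "output": 4.40}


def price_ratio(kind: str) -> float:
    """How many times cheaper GLM-5.2 is than Opus for one token type."""
    return OPUS[kind] / GLM[kind]


def blended_ratio(input_tokens: int, output_tokens: int) -> float:
    """Per-request cost ratio for a given token mix (no cached input)."""
    opus = input_tokens * OPUS["input"] + output_tokens * OPUS["output"]
    glm = input_tokens * GLM["input"] + output_tokens * GLM["output"]
    return opus / glm


print(f"input:  {price_ratio('input'):.1f}x")   # input:  3.6x
print(f"output: {price_ratio('output'):.1f}x")  # output: 5.7x

# Hypothetical mix: 10,000 input tokens and 1,000 output tokens per request.
print(f"blended: {blended_ratio(10_000, 1_000):.1f}x")  # blended: 4.1x
```

With no cached input, the per-request multiplier always falls between the input and output multipliers, and it moves toward 5.7× as the share of output tokens grows.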

Headline rates

Opus 4.8: input $5.00 / M, output $25.00 / M, cached input $0.50 / M
Opus 4.7: input $5.00 / M, output $25.00 / M, cached input $0.50 / M
GLM-5.2: input $1.40 / M, output $4.40 / M, cached input $0.26 / M

Workload

Opus 4.7 and later use a new tokenizer that can emit up to 35% more tokens for the same text (per Anthropic's docs), so the effective Opus cost for a given piece of text can run up to 35% above what the headline rates suggest.
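
A minimal sketch of that adjustment follows. It treats the 35% figure as an upper bound and assumes, purely for illustration, that the inflation applies equally to input and output text; `effective_rate` is a hypothetical helper, and the actual inflation depends on the content being tokenized.

```python
def effective_rate(list_rate: float, token_inflation: float) -> float:
    """Cost per 1M 'old-tokenizer' tokens' worth of text.

    list_rate: USD per 1M tokens at list price.
    token_inflation: fractional increase in token count for the same text
        (0.35 means 35% more tokens).
    """
    return list_rate * (1.0 + token_inflation)


MAX_INFLATION = 0.35  # upper bound quoted on this page

# Worst case for the Opus headline rates ($5.00 input, $25.00 output).
print(f"input:  ${effective_rate(5.00, MAX_INFLATION):.2f} / M")   # input:  $6.75 / M
print(f"output: ${effective_rate(25.00, MAX_INFLATION):.2f} / M")  # output: $33.75 / M
```

The 35% is measured against the same text under the earlier tokenizer, so these numbers bound how far the Opus bill can drift from the headline rates; they say nothing about how GLM-5.2 tokenizes the same text.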

Cost for this workload

Opus 4.8 total: $0
Opus 4.7 total: $0
GLM-5.2 total: $0
Saved with GLM: $0
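
The totals above can be approximated offline with a sketch like the one below. The workload figures (tokens per request, request count) are hypothetical stand-ins for the slider values, and the sketch assumes cached tokens are billed separately from uncached input tokens; Opus 4.7 has the same list rates as Opus 4.8, so it is omitted.

```python
# USD per 1M tokens, from the headline rates on this page.
RATES = {
    "Opus 4.8": {"input": 5.00, "output": 25.00, "cached": 0.50},
    "GLM-5.2": {"input": 1.40, "output": 4.40, "cached": 0.26},
}


def request_cost(model: str, input_tokens: int, output_tokens: int,
                 cached_tokens: int = 0) -> float:
    """Cost in USD of one request; input_tokens excludes cached tokens."""
    r = RATES[model]
    micro_usd = (input_tokens * r["input"]
                 + output_tokens * r["output"]
                 + cached_tokens * r["cached"])
    return micro_usd / 1_000_000


# Hypothetical workload: 1M requests, each 10,000 input + 1,000 output tokens.
requests = 1_000_000
opus_total = request_cost("Opus 4.8", 10_000, 1_000) * requests
glm_total = request_cost("GLM-5.2", 10_000, 1_000) * requests

print(f"Opus 4.8 total: ${opus_total:,.0f}")              # Opus 4.8 total: $75,000
print(f"GLM-5.2 total:  ${glm_total:,.0f}")               # GLM-5.2 total:  $18,400
print(f"Saved with GLM: ${opus_total - glm_total:,.0f}")  # Saved with GLM: $56,600
```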

What a fixed budget buys

Holding spend constant at $1M, how many requests can each model serve?

Opus 4.8: 0
Opus 4.7: 0
GLM-5.2: 0
GLM vs Opus 4.8
GLM vs Opus 4.7
Extra requests (GLM vs Opus 4.8): 0
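
The fixed-budget view is the per-request cost inverted: requests served equals budget divided by cost per request, rounded down. The sketch below assumes a hypothetical request of 10,000 input and 1,000 output tokens with no cached input; the helper names are illustrative.

```python
BUDGET_USD = 1_000_000  # the $1M budget used on this page

# USD per 1M tokens (input, output), from the headline rates on this page.
RATES = {
    "Opus 4.8": (5.00, 25.00),
    "GLM-5.2": (1.40, 4.40),
}


def cost_per_request(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request at list price, ignoring cached input."""
    rate_in, rate_out = RATES[model]
    return (input_tokens * rate_in + output_tokens * rate_out) / 1_000_000


def requests_for_budget(model: str, input_tokens: int, output_tokens: int) -> int:
    """Whole requests that fit in the budget."""
    return int(BUDGET_USD / cost_per_request(model, input_tokens, output_tokens))


# Hypothetical request shape: 10,000 input tokens, 1,000 output tokens.
opus_reqs = requests_for_budget("Opus 4.8", 10_000, 1_000)
glm_reqs = requests_for_budget("GLM-5.2", 10_000, 1_000)

print(f"Opus 4.8: {opus_reqs:,}")               # Opus 4.8: 13,333,333
print(f"GLM-5.2:  {glm_reqs:,}")                # GLM-5.2:  54,347,826
print(f"Extra requests: {glm_reqs - opus_reqs:,}")  # Extra requests: 41,014,493
```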

Rates — Opus 4.8 & 4.7: $5 / $25 / $0.50 per M (in / out / cached), per Anthropic pricing. GLM-5.2: $1.40 / $4.40 / $0.26 per M, Together AI serverless. Objective comparison of public list prices. No personal or account-specific data.