Serverless pricing · per 1M tokens · USD

Opus 4.8 vs Opus 4.7 vs GLM-5.2

Drag the sliders to see how cost scales across real workloads. Both Opus models share the same standard list price; GLM-5.2 is ~3.6× cheaper on input and ~5.7× cheaper on output. Objective, numbers-only comparison.

Cheaper on input

3.6×

$5.00 → $1.40 per M input tokens

Cheaper on output

5.7×

$25.00 → $4.40 per M output tokens

Cheaper per request

4.0×

at the current slider mix

Headline rates

Opus 4.8

input: $5.00 / M

output: $25.00 / M

cached input: $0.50 / M

Opus 4.7

input: $5.00 / M

output: $25.00 / M

cached input: $0.50 / M

GLM-5.2

input: $1.40 / M

output: $4.40 / M

cached input: $0.26 / M
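The "times cheaper" figures are just ratios of the list prices. A minimal sketch of that arithmetic follows; the `RATES` table and the `price_ratio` helper are illustrative names, not part of any provider's API, and the prices are the ones listed above.

```python
from decimal import Decimal

# List prices in USD per 1M tokens, copied from the headline rates.
# RATES and price_ratio are hypothetical names used only for this sketch.
RATES = {
    "Opus 4.8": {"input": Decimal("5.00"), "output": Decimal("25.00"), "cached": Decimal("0.50")},
    "Opus 4.7": {"input": Decimal("5.00"), "output": Decimal("25.00"), "cached": Decimal("0.50")},
    "GLM-5.2": {"input": Decimal("1.40"), "output": Decimal("4.40"), "cached": Decimal("0.26")},
}


def price_ratio(baseline: str, other: str, kind: str) -> Decimal:
    """Return how many times cheaper `other` is than `baseline` for one token kind."""
    return RATES[baseline][kind] / RATES[other][kind]


for kind in ("input", "output", "cached"):
    ratio = price_ratio("Opus 4.8", "GLM-5.2", kind)
    print(f"{kind}: {ratio:.1f}x cheaper")
```

The input and output ratios round to the 3.6× and 5.7× shown in the summary cards. The cached-input gap is smaller, at roughly 1.9×.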

Workload

Input tokens / request 50,000

Output tokens / request 4,000

Cached input / request 0

Requests 1,000

Fixed budget $1,000,000

Opus 4.7 and later use a new tokenizer that can emit up to 35% more tokens for the same text (per Anthropic's docs), so the effective Opus cost for a given text can run higher than the headline rates above suggest.
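One way to bound the tokenizer effect is to scale the token counts before pricing them. The sketch below assumes the same inflation factor applies to both input and output tokens and treats 1.35 as a worst case. Neither assumption comes from the page, which only states "up to 35%". The function name `opus_request_cost` is hypothetical.

```python
from decimal import Decimal

MILLION = Decimal(1_000_000)
OPUS_INPUT = Decimal("5.00")    # USD per 1M input tokens (headline rate)
OPUS_OUTPUT = Decimal("25.00")  # USD per 1M output tokens (headline rate)


def opus_request_cost(input_tokens: int, output_tokens: int,
                      inflation: Decimal = Decimal("1.00")) -> Decimal:
    """Cost of one request with token counts scaled by `inflation`.

    ASSUMPTION: one uniform factor for input and output tokens.
    """
    scaled_input = input_tokens * inflation
    scaled_output = output_tokens * inflation
    return (scaled_input * OPUS_INPUT + scaled_output * OPUS_OUTPUT) / MILLION


# Default slider workload: 50,000 input and 4,000 output tokens per request.
headline = opus_request_cost(50_000, 4_000)
worst_case = opus_request_cost(50_000, 4_000, Decimal("1.35"))
print(f"headline:   ${headline:.4f} per request")
print(f"worst case: ${worst_case:.4f} per request")
```

Under these assumptions the per-request cost moves from $0.35 to at most $0.4725. The real inflation depends on the text and may be well below the cap.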

Cost for this workload

Opus 4.8 total: $350.00

Opus 4.7 total: $350.00

GLM-5.2 total: $87.60

Saved with GLM: $262.40
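The workload totals follow from the slider values and the list prices. The sketch below assumes cached input is a separate token count billed at the cached rate, in addition to the regular input tokens. The page does not spell out how the two sliders interact, and with the default of 0 cached tokens it makes no difference. `workload_cost` is an illustrative helper name.

```python
from decimal import Decimal

MILLION = Decimal(1_000_000)

# (input, output, cached input) list prices in USD per 1M tokens.
RATES = {
    "Opus 4.8": (Decimal("5.00"), Decimal("25.00"), Decimal("0.50")),
    "Opus 4.7": (Decimal("5.00"), Decimal("25.00"), Decimal("0.50")),
    "GLM-5.2": (Decimal("1.40"), Decimal("4.40"), Decimal("0.26")),
}


def workload_cost(model: str, input_tokens: int, output_tokens: int,
                  cached_tokens: int, requests: int) -> Decimal:
    """Total USD cost of `requests` identical requests at list price.

    ASSUMPTION: cached tokens are billed separately at the cached rate.
    """
    in_rate, out_rate, cached_rate = RATES[model]
    per_request = (input_tokens * in_rate
                   + output_tokens * out_rate
                   + cached_tokens * cached_rate) / MILLION
    return per_request * requests


# Default slider mix: 50,000 input, 4,000 output, 0 cached, 1,000 requests.
opus = workload_cost("Opus 4.8", 50_000, 4_000, 0, 1_000)
glm = workload_cost("GLM-5.2", 50_000, 4_000, 0, 1_000)
print(f"Opus 4.8 total: ${opus:.2f}")
print(f"GLM-5.2 total:  ${glm:.2f}")
print(f"Saved with GLM: ${opus - glm:.2f}")
print(f"Cheaper per request: {opus / glm:.1f}x")
```

At the default mix the per-request ratio comes out near 4.0×, between the input and output ratios, because the workload is input-heavy.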

What a fixed budget buys

Holding spend constant at $1M, how many requests can each model serve?

Opus 4.8: about 2.86M requests

Opus 4.7: about 2.86M requests

GLM-5.2: about 11.42M requests

GLM vs Opus 4.8: 4.0×

GLM vs Opus 4.7: 4.0×

Extra reqs (GLM vs 4.8): about 8.56M
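The fixed-budget view inverts the cost calculation: divide the budget by the per-request cost. A minimal sketch follows, assuming the default slider mix (50,000 input and 4,000 output tokens, no cached input) and counting whole requests only, rounded down. The helper names are illustrative.

```python
from decimal import Decimal

MILLION = Decimal(1_000_000)
BUDGET = Decimal(1_000_000)  # fixed budget in USD, from the slider default


def per_request(input_rate: Decimal, output_rate: Decimal,
                input_tokens: int = 50_000, output_tokens: int = 4_000) -> Decimal:
    """USD cost of one request at list price (cached input left out)."""
    return (input_tokens * input_rate + output_tokens * output_rate) / MILLION


def requests_for_budget(budget: Decimal, cost_per_request: Decimal) -> int:
    """Number of whole requests the budget covers.

    ASSUMPTION: partial requests are dropped (floor division).
    """
    return int(budget // cost_per_request)


opus_reqs = requests_for_budget(BUDGET, per_request(Decimal("5.00"), Decimal("25.00")))
glm_reqs = requests_for_budget(BUDGET, per_request(Decimal("1.40"), Decimal("4.40")))
print(f"Opus 4.8 / 4.7: {opus_reqs:,} requests")
print(f"GLM-5.2:        {glm_reqs:,} requests")
print(f"Extra requests with GLM: {glm_reqs - opus_reqs:,}")
```

Because both Opus models share the same list price, one calculation covers both.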

Rates: Opus 4.8 and 4.7 at $5 / $25 / $0.50 per M (in / out / cached), per Anthropic pricing. GLM-5.2 at $1.40 / $4.40 / $0.26 per M, per Together AI serverless pricing. Objective comparison of public list prices. No personal or account-specific data.