The easiest way to compare models isn’t the price list — it’s a real invoice. Here’s the usage export from one day of Claude Code: an ordinary day with an agent reading the repo, editing files and running tests. It added up to 405 million tokens and $167.24:
| Billing line | Tokens for the day | Price | Cost |
| --- | --- | --- | --- |
| Input (cache hit) | 395,374,336 | $0.30 / MTok | $118.61 |
| Input (cache miss) | 8,509,057 | $3.00 / MTok | $25.53 |
| Output | 1,540,240 | $15.00 / MTok | $23.10 |
| Total | 405,423,633 | | $167.24 |
One line is worth pausing on. Almost the entire volume — 395 million tokens out of 405 — is cache reads, not fresh input. Claude caches the repeated context: the system prompt, the contents of open files, the conversation history. Reading it costs $0.30 per million instead of $3.00 — ten times less. Without caching, the same day would have run about $1,235. So $167 is already a heavily discounted number, not the sticker price. The question from here is simple: what do those exact same tokens cost on DeepSeek?
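The invoice arithmetic is easy to check by hand. The sketch below recomputes the day's total and the no-cache counterfactual from the token counts and Claude prices in the table. Rounding each billing line to cents before summing is an assumption about how the invoice was produced; it happens to match the totals shown.

```python
from decimal import Decimal, ROUND_HALF_UP

MTOK = Decimal(1_000_000)

def line_cost(tokens: int, usd_per_mtok: str) -> Decimal:
    """Cost of one billing line, rounded to cents (assumed invoice rounding)."""
    cost = Decimal(tokens) / MTOK * Decimal(usd_per_mtok)
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

# Token counts from the usage export above.
CACHE_HIT, CACHE_MISS, OUTPUT = 395_374_336, 8_509_057, 1_540_240

# What was actually billed: cache reads at $0.30, fresh input at $3.00.
with_cache = (
    line_cost(CACHE_HIT, "0.30")
    + line_cost(CACHE_MISS, "3.00")
    + line_cost(OUTPUT, "15.00")
)

# Counterfactual: every input token billed at the cache-miss rate.
without_cache = line_cost(CACHE_HIT + CACHE_MISS, "3.00") + line_cost(OUTPUT, "15.00")

print(with_cache)     # 167.24
print(without_cache)  # 1234.75
```

The second figure is where "about $1,235" comes from: the 395 million cached tokens alone would cost roughly $1,186 at the fresh-input rate.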

The same day on DeepSeek V4 Pro

Same three lines, priced with deepseek-v4-pro — the top of the line, the closest match to Sonnet by class of work:
| Billing line | Tokens for the day | Price | Cost |
| --- | --- | --- | --- |
| Input (cache hit) | 395,374,336 | $0.004 / MTok | $1.58 |
| Input (cache miss) | 8,509,057 | $0.435 / MTok | $3.70 |
| Output | 1,540,240 | $0.87 / MTok | $1.34 |
| Total | 405,423,633 | | $6.62 |
$167.24 versus $6.62 — for the same work, that’s 25× cheaper.

The same day on DeepSeek V4 Flash

deepseek-v4-flash is the cheapest model in the line, and it’s fast; it’s what you reach for on a steady stream of simple tasks:
| Billing line | Tokens for the day | Price | Cost |
| --- | --- | --- | --- |
| Input (cache hit) | 395,374,336 | $0.01 / MTok | $3.95 |
| Input (cache miss) | 8,509,057 | $0.14 / MTok | $1.19 |
| Output | 1,540,240 | $0.28 / MTok | $0.43 |
| Total | 405,423,633 | | $5.57 |
Here the gap is even wider — roughly 30×.

The result in one table

| Model | Cost of the day | Difference |
| --- | --- | --- |
| Claude Sonnet | $167.24 | |
| DeepSeek V4 Pro | $6.62 | ~25× cheaper |
| DeepSeek V4 Flash | $5.57 | ~30× cheaper |
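To rerun the comparison on your own usage export, a small script along these lines may help. It is a sketch, not an official billing tool: the dictionary keys and the `day_cost` helper are names made up for this example, the prices are copied from the tables above, and each line is rounded to cents before summing so the totals match the ones shown.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")

# Tokens for the day, from the usage export.
USAGE = {"cache_hit": 395_374_336, "cache_miss": 8_509_057, "output": 1_540_240}

# USD per million tokens, copied from the tables above.
PRICES = {
    "Claude Sonnet":     {"cache_hit": "0.30",  "cache_miss": "3.00",  "output": "15.00"},
    "DeepSeek V4 Pro":   {"cache_hit": "0.004", "cache_miss": "0.435", "output": "0.87"},
    "DeepSeek V4 Flash": {"cache_hit": "0.01",  "cache_miss": "0.14",  "output": "0.28"},
}

def day_cost(usage, prices):
    """Sum the billing lines, rounding each line to cents first."""
    total = Decimal(0)
    for line, tokens in usage.items():
        cost = Decimal(tokens) * Decimal(prices[line]) / Decimal(1_000_000)
        total += cost.quantize(CENT, rounding=ROUND_HALF_UP)
    return total

totals = {model: day_cost(USAGE, p) for model, p in PRICES.items()}
baseline = totals["Claude Sonnet"]

for model, total in totals.items():
    # Ratio is relative to the Claude Sonnet invoice.
    print(f"{model:<18} ${total:>7.2f}  {baseline / total:.1f}x")
```

Swapping in your own token counts for `USAGE` is enough to see whether the gap holds for a workload with a different cache hit rate or longer outputs.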

Where the gap comes from

It comes from two things, and both favor DeepSeek.
  • Price per token. Fresh input on Sonnet is $3.00 per million; on Pro it’s $0.435, nearly seven times less. Output is $15.00 against $0.87, a gap of more than seventeen times. Of the two, output is where the saving shows most: the longer the model’s answers, the wider the invoices spread apart.
  • Price of a cache read. Cache carries the load here: 97.5% of the tokens. And even on cache reads DeepSeek is cheaper: $0.004 per million on Pro against $0.30 on Sonnet, 75 times less. For this day, cache reads cost $118.61 on Claude and $1.58 on Pro.
Stack one on top of the other, and on an identical set of tokens the gap reaches 25–30×.
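To see how much each billing line contributes to the gap, the difference between the Sonnet and Pro invoices can be split line by line. This is a rough sketch using the token counts and prices quoted in this article; it works on unrounded floats, so the figures may differ from the invoice by a cent.

```python
# Tokens for the day and USD-per-MTok prices, copied from the tables in this article.
LINES = {
    # line:       (tokens,      sonnet, v4_pro)
    "cache hit":  (395_374_336, 0.30,  0.004),
    "cache miss": (8_509_057,   3.00,  0.435),
    "output":     (1_540_240,   15.00, 0.87),
}

total_tokens = sum(tokens for tokens, _, _ in LINES.values())

rows = []
for line, (tokens, sonnet, pro) in LINES.items():
    saved = tokens / 1e6 * (sonnet - pro)  # dollars saved on this line
    rows.append((line, tokens / total_tokens, sonnet / pro, saved))

total_saved = sum(saved for *_, saved in rows)

for line, share, ratio, saved in rows:
    print(
        f"{line:<10} {share:6.1%} of tokens  {ratio:5.1f}x cheaper  "
        f"${saved:7.2f} saved ({saved / total_saved:.0%} of the gap)"
    )
```

On this day's numbers, cache reads account for roughly three quarters of the dollars saved, with fresh input and output splitting the rest about evenly.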
Prices are current as of testing and the vendors can change them. Current per-token rates are on the Pricing page and at www.ruapi.ai. What matters here is the scale of the difference between the models, not the exact dollar figures.

Which one to use

Cheaper isn’t the same as “always better.” It depends on the task.
  • deepseek-v4-pro: the everyday workhorse for code, refactoring, reasoning and agents. Quality stays close to the top models at a fraction of the cost. If Claude Code or a similar agent is running for you right now, this is the first thing to swap in.
  • deepseek-v4-flash: for volume, such as classification, tagging, short replies and rough drafts. Where price and speed matter more than top quality.
  • Claude: when you need to read images (DeepSeek has no vision) or top quality on the hardest tasks. One RuAPI key covers every model, so the models happily coexist in a single project.
DeepSeek models are text-only: they don’t read images, diagrams or screenshots. If you need vision, look at Claude, Gemini or GLM-5V. More on the line itself is on the DeepSeek API page.
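Because one key covers every model, the rules of thumb above can live in a small routing function. The sketch below is only an illustration: the task labels, the `pick_model` name and its flags are hypothetical, and the model IDs are the ones used in this article.

```python
# Hypothetical task labels for the high-volume, lower-stakes work described above.
FLASH_TASKS = {"classification", "tagging", "short_reply", "draft"}

def pick_model(task: str, has_images: bool = False, hardest: bool = False) -> str:
    """Map a task to a model ID following the rules of thumb above."""
    if has_images or hardest:
        # DeepSeek models are text-only, so anything with images goes to Claude.
        return "claude-sonnet-4-6"
    if task in FLASH_TASKS:
        # Price and speed matter more than top quality here.
        return "deepseek-v4-flash"
    # Default workhorse: code, refactoring, reasoning, agents.
    return "deepseek-v4-pro"

print(pick_model("refactoring"))                   # deepseek-v4-pro
print(pick_model("tagging"))                       # deepseek-v4-flash
print(pick_model("code_review", has_images=True))  # claude-sonnet-4-6
```

The returned string can be passed straight to the `model` parameter of a chat completion request.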

How to switch

If your project already talks to RuAPI over the OpenAI-compatible protocol, there’s one line to change — the model ID. The base_url and key stay the same:
```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_RUAPI_KEY",
    base_url="https://www.ruapi.ai/v1",
)

resp = client.chat.completions.create(
    model="deepseek-v4-pro",   # instead of claude-sonnet-4-6
    messages=[{"role": "user", "content": "Refactor this function and explain what you changed"}],
)
print(resp.choices[0].message.content)
```
Setup from scratch — the key and a first request in Python and curl — is in the Quickstart.
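After switching, it can be useful to check what a single request costs. The sketch below estimates that from a response's usage object. It assumes the gateway reports usage in the shape the OpenAI Python SDK exposes (`prompt_tokens`, `completion_tokens`, and optionally `prompt_tokens_details.cached_tokens`); whether RuAPI fills in the cached-token field for DeepSeek models is an assumption to verify against a real response. The `estimate_cost` helper is hypothetical, and the prices are the deepseek-v4-pro rates quoted in this article.

```python
from types import SimpleNamespace

# USD per million tokens for deepseek-v4-pro, from the table earlier in this article.
PRO_PRICES = {"cache_hit": 0.004, "cache_miss": 0.435, "output": 0.87}

def estimate_cost(usage, prices=PRO_PRICES) -> float:
    """Estimate the dollar cost of one request from its usage object."""
    details = getattr(usage, "prompt_tokens_details", None)
    # Assumption: cached tokens are reported as in the OpenAI SDK; treat missing as 0.
    cached = getattr(details, "cached_tokens", 0) or 0
    fresh = usage.prompt_tokens - cached
    return (
        cached * prices["cache_hit"]
        + fresh * prices["cache_miss"]
        + usage.completion_tokens * prices["output"]
    ) / 1_000_000

# Stand-in for resp.usage so the sketch runs without a network call.
fake_usage = SimpleNamespace(
    prompt_tokens=12_000,
    completion_tokens=800,
    prompt_tokens_details=SimpleNamespace(cached_tokens=10_000),
)
print(f"${estimate_cost(fake_usage):.6f}")
```

With a live client, `estimate_cost(resp.usage)` would take the place of the stand-in object. If the cached-token field is absent, the function falls back to billing every prompt token at the cache-miss rate, which overestimates rather than underestimates.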

Next

  • DeepSeek API: Pro and Flash, how they differ and what they do.
  • Claude models: lines, versions, and when Claude earns its price.
  • How billing works: pay per token used, with a log of every request.
  • Quickstart: base_url, key and your first request.