content field: instead of a string you pass an array of parts — text blocks {"type": "text"} and image blocks {"type": "image_url"}.
Image from a public URL
The simplest case — the image is already online at a direct URL.Python (OpenAI SDK)
Local file via base64
If the image isn’t publicly reachable (a screenshot, a photo on disk), encode it to base64 and pass it as a data URI in the formdata:image/jpeg;base64,....
Python (OpenAI SDK)
For PNG change the prefix to
data:image/png;base64,, for WebP use data:image/webp;base64,. The MIME type must match the file’s real format.Gotchas
Bigger images cost more tokens
Bigger images cost more tokens
An image counts as input tokens, and the higher the resolution the more expensive the request. If you don’t need the fine detail, downscale the image before sending (for example to 1024–1568 px on the long side).
Huge base64 payloads bloat the request
Huge base64 payloads bloat the request
A base64 string is about a third larger than the file itself and travels entirely inside the request body. For heavy images a public URL is safer than a multi-megabyte data URI — otherwise you can hit the request size limit.
Supported formats
Supported formats
Typically
jpeg, png, webp and gif. Convert exotic formats (HEIC, TIFF, SVG) to one of these beforehand.What’s next
- Quick Start — sign-up, key and your first request
- Errors and how to fix them — what to do on 400 / 413 / 415 and other codes