Torii Image Translator API
Integrate the power of Torii's API into your own applications.
Get Started in Minutes
Start translating images with just a few lines of code.
Sign Up & Get Your Key
Create a free account and generate your unique API key from your dashboard.
Purchase Credits
Our API uses a simple credit-based system. Top up your account to start translating.
Make Your First Request
Use our endpoints to perform translation, OCR, inpainting, or typesetting.
Limits & Rate Throttling
To ensure service reliability and fair resource distribution, the Torii API implements rate limiting and file size constraints on all requests.
Standard Rate Limit
Your API key has a default throttling limit configured to manage steady traffic:
1 request / second
Standard processing rate per API key.
100 requests
Spikes are queued and processed sequentially.
Image & File Limits
The API enforces constraints on uploaded media to ensure fast processing times:
30 MB
Maximum allowed file payload size.
50,000 px
Height & width limit per image.
50 Megapixels
Maximum total resolution limit.
Handling Limit Errors
When a limit is exceeded, the request is rejected. Rate-limit violations return HTTP 429 Too Many Requests or 503 Service Unavailable; size-limit violations return a client/validation error. We recommend implementing a retry mechanism with exponential backoff in your application to handle rate-limit responses gracefully.
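A retry loop with exponential backoff might look like the following sketch. The `send` callable is a placeholder for whatever function performs your HTTP request and returns `(status, body)`. The base delay, cap, retry count, and jitter are illustrative choices, not values prescribed by the API.

```python
import random
import time
from typing import Callable, Tuple

RETRYABLE_STATUSES = {429, 503}  # the rate-limit statuses named above


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Delay before retry number `attempt` (0-based): base * 2**attempt, capped."""
    return min(cap, base * (2 ** attempt))


def call_with_retry(send: Callable[[], Tuple[int, bytes]],
                    max_retries: int = 5,
                    sleep: Callable[[float], None] = time.sleep,
                    jitter: Callable[[], float] = random.random) -> Tuple[int, bytes]:
    """Call `send`, retrying on 429/503 until it succeeds or retries run out."""
    status, body = send()
    for attempt in range(max_retries):
        if status not in RETRYABLE_STATUSES:
            break
        # Up to 1 s of random jitter keeps parallel clients from retrying in lockstep.
        sleep(backoff_delay(attempt) + jitter())
        status, body = send()
    return status, body
```

Any other status, including a size-limit validation error, is returned immediately, since resending the same oversized file would fail again.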
API Reference
Endpoints
Typeset Endpoint
Render multiple textboxes onto a pre-cleaned image with custom fonts, colors, and alignments.
Credits Cost
Each request to the typeset endpoint costs 0.02 credits.
Build Your Request
Authentication
Authenticate your requests by including your API key in the Authorization header as a Bearer token.
Request Parameters (Form Data)
file file * required
The background image file.
text_boxes string (JSON array) * required
A JSON stringified array of textbox objects to render. Each object must contain:
x: top-left x-coordinate
y: top-left y-coordinate
width: width of the textbox
height: height of the textbox
text: text to be rendered
alignment: 'left', 'center', or 'right'
text_color: hex color for the fill (e.g., #ffffff)
stroke_color: hex color for the outline/stroke (e.g., #000000)
font string
The font to use for all textboxes.
min_font_size number
The minimum font size to render.
stroke_disabled boolean
Whether to globally disable text strokes (outlines).
Response Headers
success boolean
Whether the typeset operation was successful.
credits number
The amount of credits remaining.
Response Body
The response content is a JSON object.
{
  "image": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAQAAAC1HAwCAAAAC0lEQVR42mNk+A8AAQUBAScY42YAAAAASUVORK5CYII+..."
}
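The "image" value is a base64 Data URL, so it has to be decoded before it can be written to disk. A minimal sketch, using only the Python standard library (the helper name is illustrative):

```python
import base64


def decode_data_url(data_url: str) -> tuple[str, bytes]:
    """Split a base64 Data URL into (mime_type, raw_bytes)."""
    header, sep, payload = data_url.partition(",")
    if not sep or not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("not a base64 data URL")
    mime_type = header[len("data:"):-len(";base64")]
    return mime_type, base64.b64decode(payload)
```

The returned bytes can be written straight to a file opened in binary mode, for example `open("typeset.png", "wb").write(raw)`.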
Example Code
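The sketch below builds the non-file form fields for a typeset request from the parameters listed above. It stops short of sending anything, because the endpoint URL is not given in this section. The helper names are illustrative, and sending booleans as the strings "true"/"false" is an assumption about how the form data is parsed.

```python
import json

ALIGNMENTS = {"left", "center", "right"}


def make_text_box(x, y, width, height, text, alignment="center",
                  text_color="#000000", stroke_color="#ffffff"):
    """Build one textbox object with the keys listed under text_boxes."""
    if alignment not in ALIGNMENTS:
        raise ValueError(f"alignment must be one of {sorted(ALIGNMENTS)}")
    return {"x": x, "y": y, "width": width, "height": height, "text": text,
            "alignment": alignment, "text_color": text_color,
            "stroke_color": stroke_color}


def typeset_form_fields(boxes, font=None, min_font_size=None, stroke_disabled=None):
    """Return the non-file form fields; optional ones are omitted when None."""
    # text_boxes must be a JSON-stringified array, not a nested form structure.
    fields = {"text_boxes": json.dumps(boxes, ensure_ascii=False)}
    if font is not None:
        fields["font"] = font
    if min_font_size is not None:
        fields["min_font_size"] = str(min_font_size)
    if stroke_disabled is not None:
        # Assumption: booleans are sent as lowercase strings.
        fields["stroke_disabled"] = "true" if stroke_disabled else "false"
    return fields
```

These fields would be sent as multipart form data together with the background image under the `file` key and the Bearer token in the Authorization header.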
Translate Endpoint
Translate images or raw text between dozens of languages. Image translation automatically handles text detection, inpainting, and typesetting, and can be tuned with the optional parameters listed below.
Credits Cost
Each request to the translate endpoint costs at least 1 credit, but it may cost more depending on the chosen translation model, context and input/output character length.
Build Your Request
Authentication
Authenticate your requests by including your API key in the Authorization header as a Bearer token.
Request Parameters (Form Data)
These parameters should be sent as part of the multipart/form-data body, alongside the file.
file file * required
The image file to be translated.
target_lang string * required
The target language code for the translation.
translator string * required
The translation model to use.
font string * required
The font to be used for the translated text.
text_align string
The alignment of the translated text.
stroke_disabled boolean
Whether to disable the text stroke/outline (useful for some documents, since Torii tries to detect the stroke and color for every word).
min_font_size number
The minimum font size of the rendered translated text.
bubbles_only boolean
If true, only text inside detected speech bubbles is translated, plus any very long, high-confidence text found outside a bubble.
custom_prompt string * max 1000 chars
A custom prompt with instructions to guide the translation.
context string * max 10000 chars
Additional context to ground the translation, such as information about names, characters, events, dialogue, or history.
To start a context chain, pass the string "None" as the context for the first image. For each following image, pass the context returned by the previous response (under the key "context") to continue the chain.
If you don't want to start a context chain, omit this parameter or pass any starting context string other than "None". The string is then given to the model as regular context, without any special instructions.
x-byok-<provider> string
Optional. Your own API key for a Bring-Your-Own-Key (BYOK) provider. If supplied, requests use your provider API key and account. Image translation then costs a flat 1 credit (OCR/server costs) regardless of text length or model, and text translation costs 0.01 credits.
The parameter name must be x-byok- followed by one of the supported provider names:
x-byok-openrouter: OpenRouter API Key
x-byok-openai: OpenAI API Key
x-byok-google: Google Gemini API Key
x-byok-anthropic: Anthropic Claude API Key
x-byok-deepseek: DeepSeek API Key
x-byok-xai: xAI Grok API Key
x-byok-local: Self-hosted API Key (optional)
x-byok-local-url string
The base URL of your self-hosted OpenAI-compatible LLM endpoint (e.g. https://my-llm-server.com/v1). Required if using the self-hosted option.
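The BYOK naming rule can be captured in a small helper, sketched below, that builds the x-byok-* fields for one provider. The function name and error messages are illustrative. Leaving out x-byok-local when no key is given for a self-hosted endpoint is an assumption based on that key being marked optional.

```python
BYOK_PROVIDERS = {"openrouter", "openai", "google", "anthropic",
                  "deepseek", "xai", "local"}


def byok_fields(provider, api_key=None, local_url=None):
    """Build the x-byok-* form fields for one supported provider."""
    if provider not in BYOK_PROVIDERS:
        raise ValueError(f"unsupported BYOK provider: {provider}")
    fields = {}
    if provider == "local":
        if not local_url:
            raise ValueError("x-byok-local-url is required for the self-hosted option")
        fields["x-byok-local-url"] = local_url
        if api_key:  # assumption: the key may be left out for self-hosted endpoints
            fields["x-byok-local"] = api_key
    else:
        if not api_key:
            raise ValueError("an API key is required for hosted providers")
        fields[f"x-byok-{provider}"] = api_key
    return fields
```

The returned dictionary can be merged into the other translate form fields before the request is sent.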
Response Headers
success boolean
Whether the request was successful. If true, the response contains the translated or cleaned image, else the response contains an error message.
credits number
The amount of credits remaining in the account.
Response Body
The response content is a JSON object. For image translation, it contains the translated image as a Data URL, the inpainted image as a Data URL, and the detected text objects. The context key will be empty if a context chain has not been started (see the context parameter above).
{
  "image": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAQAAAC1HAwCAAAAC0lEQVR42mNk+A8AAQUBAScY42YAAAAASUVORK5CYII...",
  "inpainted": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAQAAAC1HAwCAAAAC0lEQVR42mNk+A8AAQUBAScY42YAAAAASUVORK5CYII...",
  "context": "...",
  "text": [
    {
      "x": 623.0,
      "y": 326.5,
      "width": 50.0,
      "height": 141.0,
      "text": "Tanjiro",
      "originalText": "炭治郎",
      "textAlign": "center",
      "strokeColor": "#f9f7f9",
      "lineWidth": 10,
      "fillColor": "#0e0c0f",
      "font": "36px WildWords",
      "addFontBackground": false,
      "addFontBorder": false,
      "addBackgroundColor": "#ffffff",
      "rotation": 0.0,
      "angle": 0.0,
      "layout": "h",
      "textDir": "ltr"
    }
  ]
}
For text translation requests, the response contains only the translated strings and the context:
{
  "text": [
    "Translated sentence one.",
    "Translated sentence two.",
    "Translated paragraph three."
  ],
  "context": "..."
}
Response Fields
image string
The base64-encoded Data URL of the final translated image. Only present for image translation requests (DataType: IMAGE).
inpainted string
The base64-encoded Data URL of the inpainted image (image with source text removed). Only present for image translation requests (DataType: IMAGE).
original string
Optional. The base64-encoded Data URL of the original source image. Only present for image translation requests (DataType: IMAGE) when a remote image URL is supplied and the user is authenticated.
context string
Optional. The updated translation context string. Only present if a context chain is active or has been updated.
text array
An array of translated contents. Always present. For image responses, this is an array of objects detailing bounding box coordinates and styling rules. For text responses, this is an array of raw translated strings.
text[].x / y number
The X and Y pixel coordinates of the center point of the text bounding box.
text[].width / height number
The dimensions (width and height) of the text box in pixels.
text[].text string
The final translated text content rendered inside the box.
text[].originalText string
The original detected source text.
text[].textAlign string
Text alignment style ("left", "center", "right").
text[].fillColor / strokeColor string
The text fill and outline/stroke colors in hexadecimal format (e.g. #ffffff).
text[].lineWidth number
The width of the text stroke/outline border in pixels.
text[].font string
CSS font property value specifying size and family (e.g. "36px WildWords").
text[].addFontBackground / addFontBorder boolean
Options indicating whether extra background fills or custom border outlines are applied behind/around the font.
text[].addBackgroundColor string
Hexadecimal background color for the text background if enabled.
text[].rotation / angle number
The text block's rotation in radians and degrees, respectively.
text[].layout string
The layout orientation of the text block ("h" for horizontal, "v" for vertical).
text[].textDir string
The layout reading direction of the text flow ("ltr" for left-to-right, "rtl" for right-to-left).
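The translate response reports each box by its center point, while the typeset endpoint's text_boxes expect the top-left corner, so reusing one as input for the other needs a small conversion. The sketch below shows one possible mapping. The helper names are illustrative, it ignores rotation, and it assumes the font string always has the "<size>px <family>" shape seen in the example.

```python
def to_text_box(item: dict) -> dict:
    """Convert one image-translation `text` entry into a typeset textbox."""
    return {
        # Translate reports the box center; typeset expects the top-left corner.
        "x": item["x"] - item["width"] / 2,
        "y": item["y"] - item["height"] / 2,
        "width": item["width"],
        "height": item["height"],
        "text": item["text"],
        "alignment": item["textAlign"],
        "text_color": item["fillColor"],
        "stroke_color": item["strokeColor"],
    }


def parse_font(font: str) -> tuple[float, str]:
    """Split a value such as '36px WildWords' into (size_in_px, family)."""
    size_part, _, family = font.partition(" ")
    return float(size_part.removesuffix("px")), family
```

This lets you edit the translated text and re-render it on the "inpainted" image with the typeset endpoint.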
Example Code
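The context chain described under the context parameter can be sketched as a loop that feeds each response's "context" into the next request. Here `translate_page` is a placeholder for your own function that sends one image plus a context string to the translate endpoint and returns the parsed JSON body; the endpoint URL is not given in this section, so the HTTP call itself is left out.

```python
from typing import Callable, Iterable, List


def translate_in_chain(pages: Iterable[bytes],
                       translate_page: Callable[[bytes, str], dict]) -> List[dict]:
    """Translate pages in order, passing each response's context to the next call."""
    context = "None"  # the literal string "None" starts a context chain
    results = []
    for page in pages:
        response = translate_page(page, context)
        results.append(response)
        # Design choice of this sketch: keep the previous context if a
        # response comes back without one.
        context = response.get("context") or context
    return results
```

Pages have to be processed sequentially for this to work, which also fits the default rate limit of 1 request per second.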
OCR Endpoint
A powerful OCR endpoint that returns highly detailed information, including text orientation, font sizes, and extracted colors (text, background, stroke) for each paragraph.
Credits Cost
Each request to the ocr endpoint costs 1 credit.
Authentication
Authenticate your requests by including your API key in the Authorization header as a Bearer token.
Request Parameters (Form Data)
file file * required
The image file to perform OCR on.
Response Headers
success boolean
Whether the request was successful. If true, the response contains the OCR result, else the response contains an error message.
credits number
The amount of credits remaining in the account.
Response Body
The response content is a JSON array of detailed paragraph objects.
[
  {
    "text": "Thank you very much!",
    "polygon": [[10, 320], [170, 320], [170, 370], [10, 370]],
    "fontsize": 24,
    "angle": 0.0,
    "alignment": "center",
    "direction": "left_to_right",
    "bg_color": [255, 255, 255],
    "text_color": [0, 0, 0],
    "stroke_color": [0, 0, 0],
    "has_dominant_bg_color": true,
    "confidence": 0.998,
    "language_details": {
      "language": "ENGLISH",
      "code": "en",
      "confidence": 0.999
    },
    "removed": false,
    "lines": [
      {
        "text": "Thank you",
        "polygon": [[11, 326], [165, 327], [165, 361], [11, 360]],
        "confidence": 0.998,
        "angle": 0.0,
        "direction": "left_to_right",
        "language_details": {
          "language": "ENGLISH",
          "code": "en",
          "confidence": 0.999
        },
        "removed": false,
        "words": [
          {
            "text": "Thank",
            "polygon": [[12, 327], [100, 327], [99, 361], [11, 360]],
            "confidence": 0.998,
            "direction": "left_to_right",
            "symbols": [
              {
                "text": "T",
                "polygon": [[12, 327], [25, 327], [25, 361], [12, 361]],
                "confidence": 0.999
              }
            ]
          }
        ]
      }
    ]
  }
]
Response Fields
paragraph array
The root response is an array of objects, where each object represents a detected paragraph of text.
paragraph[].text string
The full recognized text content of the paragraph.
paragraph[].polygon array
A list of 4 [x, y] points defining the oriented bounding box of the paragraph.
paragraph[].fontsize number
The estimated median font size of the text in pixels.
paragraph[].angle number
The rotation angle of the paragraph in degrees (0-360).
paragraph[].alignment string
The text alignment within the paragraph ("left", "center", "right").
paragraph[].direction string
The text flow direction of the paragraph (e.g., "left_to_right", "top_to_bottom").
paragraph[].bg_color / text_color / stroke_color array
The detected colors in [B, G, R] format.
paragraph[].language_details object
Detailed language detection results.
paragraph[].language_details.language string
The full name of the detected language (e.g., "ENGLISH", "JAPANESE").
paragraph[].language_details.code string
The ISO 639-1 language code (e.g., "en", "ja").
paragraph[].language_details.confidence number
The confidence score of the language detection (0 to 1).
paragraph[].removed boolean
Indicates if the paragraph was filtered out (e.g., detected as noise or furigana).
paragraph[].confidence number
The average confidence score of the recognition (0 to 1).
paragraph[].lines array
An array of detailed line objects within the paragraph.
paragraph[].lines[].direction string
The text flow direction (e.g., "left_to_right", "top_to_bottom").
paragraph[].lines[].language_details object
Detailed language detection for this specific line.
paragraph[].lines[].language_details.language string
Full name of the line-level language.
paragraph[].lines[].language_details.code string
ISO 639-1 code for the line.
paragraph[].lines[].language_details.confidence number
Detection confidence for the line.
paragraph[].lines[].removed boolean
Mainly used for detected furigana (small reading aids for Kanji).
paragraph[].lines[].words array
An array of word objects within the line.
paragraph[].lines[].words[].polygon array
A list of 4 [x, y] points defining the word's bounding box.
paragraph[].lines[].words[].symbols array
An array of individual character/symbol objects.
paragraph[].lines[].words[].symbols[].text string
The recognized character.
paragraph[].lines[].words[].symbols[].polygon array
A list of 4 [x, y] points defining the symbol's bounding box.
paragraph[].lines[].words[].symbols[].confidence number
The recognition confidence for this specific character.
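Two conversions come up often when consuming these fields: turning a [B, G, R] color into a hex string, and reducing a 4-point polygon to an axis-aligned box. A minimal sketch follows; the helper names are illustrative and the color channels are assumed to be integers from 0 to 255, as in the example response.

```python
def bgr_to_hex(color):
    """Convert a [B, G, R] triple into a '#rrggbb' string."""
    b, g, r = color
    return f"#{r:02x}{g:02x}{b:02x}"


def polygon_bounds(polygon):
    """Axis-aligned (x, y, width, height) box enclosing a list of [x, y] points."""
    xs = [point[0] for point in polygon]
    ys = [point[1] for point in polygon]
    return min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)
```

For rotated text the axis-aligned box is larger than the oriented polygon, so treat it as an approximation.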
Example Code
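Once the response body has been parsed, a typical first step is to drop filtered-out paragraphs and keep only confident text. The sketch below works on the parsed JSON array. The function name and the 0.5 confidence threshold are illustrative choices, not values recommended by the API.

```python
def visible_texts(paragraphs, min_confidence=0.5):
    """Return the text of paragraphs that are not removed and are confident enough."""
    return [
        paragraph["text"]
        for paragraph in paragraphs
        if not paragraph.get("removed", False)
        and paragraph.get("confidence", 0.0) >= min_confidence
    ]
```

With the example response above, `visible_texts(json.loads(body))` would keep the single paragraph, since it is not removed and has a confidence of 0.998.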
Inpaint Endpoint
Remove text or objects from images. Returns a clean image with the specified areas filled in seamlessly using our advanced inpainting models.
Credits Cost
Each request to the inpaint endpoint costs 0.02 credits.
Authentication
Authenticate your requests by including your API key in the Authorization header as a Bearer token.
Request Parameters (Form Data)
The request must contain the following files as multipart/form-data:
image file * required
The original image file.
mask file * required
The mask image file (white areas will be inpainted).
Response Headers
success boolean
Whether the request was successful.
credits number
The amount of credits remaining.
Response Body
The response content is a JSON object.
{
  "image": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAQAAAC1HAwCAAAAC0lEQVR42mNk+A8AAQUBAScY42YAAAAASUVORK5CYII=..."
}
Example Code
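The mask only has to mark the areas to inpaint in white. The sketch below writes such a mask as an 8-bit grayscale PNG using only the standard library, painting white rectangles on a black background. The function names are illustrative. That the mask should match the original image's dimensions, and that a grayscale PNG is accepted, are assumptions not confirmed in this section.

```python
import struct
import zlib


def _png_chunk(tag: bytes, data: bytes) -> bytes:
    """Length + tag + data + CRC, as the PNG format requires."""
    return (struct.pack(">I", len(data)) + tag + data
            + struct.pack(">I", zlib.crc32(tag + data) & 0xFFFFFFFF))


def mask_png(width: int, height: int, boxes) -> bytes:
    """8-bit grayscale PNG: black background, white inside each (x, y, w, h) box."""
    rows = [bytearray(width) for _ in range(height)]
    for x, y, w, h in boxes:
        # Clip each box to the image so out-of-range boxes are ignored safely.
        x0, y0 = max(0, int(x)), max(0, int(y))
        x1, y1 = min(width, int(x + w)), min(height, int(y + h))
        for row in rows[y0:y1]:
            row[x0:x1] = b"\xff" * max(0, x1 - x0)
    # Each scanline is prefixed with filter type 0 (no filtering).
    raw = b"".join(b"\x00" + bytes(row) for row in rows)
    # IHDR: width, height, bit depth 8, color type 0 (grayscale), defaults.
    header = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    return (b"\x89PNG\r\n\x1a\n" + _png_chunk(b"IHDR", header)
            + _png_chunk(b"IDAT", zlib.compress(raw)) + _png_chunk(b"IEND", b""))
```

The boxes could come from OCR polygons reduced to bounding rectangles. The resulting bytes would be uploaded under the `mask` field next to the original file under `image`.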
Credits Endpoint
Retrieve the current credit balance remaining in your account.
Credits Cost
Requests to the credits endpoint are completely free.
Authentication
Authenticate your requests by including your API key in the Authorization header as a Bearer token.
Response Headers
success boolean
Whether the request was successful.
Response Body
The response content is a JSON object.
{
"credits": 145.25
}
Example Code
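The sketch below builds an authenticated request and interprets the response body shown above. The URL is a placeholder, because the endpoint address is not stated in this section, and using GET is likewise an assumption. The helper that estimates how many requests a balance covers is illustrative only.

```python
import json
import urllib.request
from decimal import Decimal

# Hypothetical URL: replace with the real credits endpoint address.
CREDITS_URL = "https://api.example.com/credits"


def credits_request(api_key: str, url: str = CREDITS_URL) -> urllib.request.Request:
    """Build (but do not send) a request carrying the Bearer token."""
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {api_key}"})


def affordable_requests(body: str, cost_per_request: str) -> int:
    """How many requests of a given credit cost the balance in `body` covers."""
    # Decimal avoids floating-point surprises with costs such as 0.02.
    balance = Decimal(str(json.loads(body)["credits"]))
    return int(balance // Decimal(cost_per_request))
```

To send the request, pass it to `urllib.request.urlopen` and feed the decoded response body to `affordable_requests`, for example with a cost of "0.02" for typeset or inpaint requests.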
Manage Your API Key
Please sign in to generate and manage your API key.
Keep your API key secure. Do not share it publicly. You can only copy it once, but you can revoke it at any time and generate a new one.