Images · OCR + AI translation

Translate images. Photos of documents, signs, screenshots, anything.

Drop a .jpg, .png, .tiff, or .webp. Fily runs OCR, translates the recognized text, and gives you back a clean DOCX with the original layout reconstructed.

What is Images (OCR + AI)?

Image translation handles single-image inputs: phone photos of documents, screenshots of UI in another language, photos of signs or menus, scanned single pages. The pipeline is OCR-then-translate, with layout reconstruction adapted to the source.

Why Images (OCR + AI) is tricky for AI translation

  • Resolution and orientation vary: phone photos are often rotated, skewed, or under-lit.
  • Mixed content: a single photo can contain typed text, handwriting, and text that appears inside embedded pictures.
  • Background complexity: in street photos of signs, busy background patterns confuse OCR.
  • Single images differ from multi-page PDFs: image OCR pipelines often skip layout analysis and dump a wall of text.
  • Screenshots have specific layout: UI text, buttons, menus, in nested visual hierarchy that should survive.

How Fily handles Images (OCR + AI)

  • Pre-OCR normalization: deskew, contrast adjustment, orientation correction.
  • Dual-backend OCR with automatic failover (same engine as scanned PDFs).
  • Layout-aware reconstruction: text blocks with positions inform the output layout, not a flat dump.
  • Mixed-content handling: typed text and handwriting flagged separately in the QA report.
  • Screenshot mode: opt-in for UI screenshots — preserves visual hierarchy (titles, buttons, menus) as a structured DOCX.
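
The failover and layout-aware steps above can be sketched roughly as follows. This is an illustrative sketch only, not Fily's implementation: `TextBlock`, `OcrBackend`, `ocr_with_failover`, `group_into_lines`, and the `tolerance` parameter are all hypothetical names, and the line-grouping rule (blocks whose vertical positions differ by less than half a block height share a line) is an assumed heuristic.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class TextBlock:
    """One recognized block: its text, top-left corner, and height in pixels."""
    text: str
    x: float
    y: float
    height: float


# Hypothetical backend type: takes image bytes, returns positioned blocks.
OcrBackend = Callable[[bytes], List[TextBlock]]


def ocr_with_failover(image: bytes, backends: Sequence[OcrBackend]) -> List[TextBlock]:
    """Try each backend in order and return the first non-empty result."""
    errors = []
    for backend in backends:
        try:
            blocks = backend(image)
        except Exception as exc:  # a real system would catch narrower errors
            errors.append(exc)
            continue
        if blocks:
            return blocks
    raise RuntimeError(f"all OCR backends failed or returned nothing: {errors}")


def group_into_lines(blocks: Sequence[TextBlock], tolerance: float = 0.5) -> List[List[TextBlock]]:
    """Group blocks into visual lines, top to bottom, each sorted left to right.

    Assumed heuristic: a block joins the current line when its y is within
    tolerance * height of that line's first block.
    """
    lines: List[List[TextBlock]] = []
    for block in sorted(blocks, key=lambda b: (b.y, b.x)):
        if lines and abs(block.y - lines[-1][0].y) <= tolerance * lines[-1][0].height:
            lines[-1].append(block)
        else:
            lines.append([block])
    for line in lines:
        line.sort(key=lambda b: b.x)
    return lines


def render_lines(blocks: Sequence[TextBlock]) -> str:
    """Join grouped blocks into text that keeps the visual line structure."""
    return "\n".join(" ".join(b.text for b in line) for line in group_into_lines(blocks))
```

Passing a list of backends to `ocr_with_failover` and the result to `render_lines` yields one output line per visual line of the image, rather than a flat dump in whatever order the OCR engine emitted the blocks.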

Pipeline: image_qa_12step_v2@1.0.0

The Images (OCR + AI) workflow with Fily

1. Upload

Drop your .jpg / .png / .tiff / .webp (single or batch ZIP). Optional: glossary, TM, style guide.

2. Process

Fily runs the Images (OCR + AI) pipeline + 12 QA steps. Typical job: 10–20 minutes.

3. Download

A DOCX, ready to deliver. QA report (HTML) attached.

Common upload: a photo of a printed form in a foreign language that someone needs translated, a screenshot of a foreign-language UI to localize, or a photo of a single-page contract picked up in another country. Output is a DOCX with the recognized text, translated, in an approximately matching layout.

Ready to translate an Images (OCR + AI) file?

No card. No setup. Upload one file and see the output.