Umi-OCR

Offline desktop OCR for Windows and Linux that extracts text from screenshots, image batches, and scanned PDFs without requiring a network connection. It bundles multilingual offline engines (PaddleOCR / RapidOCR) and supports ignore regions, searchable PDF output, and CLI and HTTP interfaces for automation and integration.

Introduction

Most consumer OCR tools are either online services, which raise privacy concerns, or single-shot utilities with limited batch and automation support. This project targets users who need reliable offline text extraction workflows (screenshot capture, large-scale batch processing, and conversion of scanned PDFs into searchable documents) and who want to integrate OCR into scripts and services.

What Sets It Apart
  • Offline-first workflow: ships with local OCR engines (PaddleOCR-json and RapidOCR-json plugins) so recognition runs without sending images to third-party servers—useful for sensitive data and air-gapped environments.
  • Practical production features: supports screenshot OCR, bulk image recognition, conversion of scanned documents into dual-layer searchable PDFs, QR/barcode reading and generation, and an "ignore region" editor to exclude watermarks or headers from results.
  • Automation-friendly interfaces: provides a command-line mode and a lightweight HTTP API so you can embed OCR into pipelines, scheduled jobs, or other desktop/server workflows.
  • Low friction for end users: portable releases (no installer), multi-language UI, and an emphasis on Windows 7 x64 and Linux x64 compatibility for older systems.
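
As a rough illustration of the HTTP interface mentioned above, the Python sketch below sends a base64-encoded image to a locally running instance and joins the recognized lines. The endpoint URL, the "base64" request field, and the "data" / "text" response fields are assumptions made for this example and should be checked against the project's own API documentation before use.

```python
import base64
import json
import urllib.request

# Assumed local endpoint; confirm the host, port, and path in the project's docs.
OCR_URL = "http://127.0.0.1:1224/api/ocr"


def build_payload(image_bytes: bytes) -> bytes:
    """Wrap raw image bytes in a JSON body with an (assumed) "base64" field."""
    body = {"base64": base64.b64encode(image_bytes).decode("ascii")}
    return json.dumps(body).encode("utf-8")


def extract_text(response: dict) -> str:
    """Join recognized lines; return '' when no list of text blocks is present."""
    data = response.get("data")
    if not isinstance(data, list):
        # A non-list "data" value is assumed to be a status or error message.
        return ""
    return "\n".join(block.get("text", "") for block in data)


def recognize(image_path: str, url: str = OCR_URL) -> str:
    """Post one image file to the local OCR service and return its text."""
    with open(image_path, "rb") as f:
        payload = build_payload(f.read())
    request = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=30) as resp:
        return extract_text(json.load(resp))
```

With the desktop application running and its HTTP service enabled, a call such as `recognize("scan.png")` (a hypothetical file name) would return the recognized text as one string. Keeping request building and response parsing in separate functions makes both easy to adjust if the real field names differ.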

Who It's For

Great fit if you need offline, privacy-preserving OCR at scale (batch jobs, document digitization) or want easy desktop-to-automation handoff via CLI/HTTP. It’s suitable for power users, small teams, and admins who must process many images/PDFs without cloud dependencies.

Look elsewhere if you need cutting-edge research-grade recognition accuracy on specialized scripts (you may prefer custom-trained models or cloud OCR offerings), or if you require native macOS builds—the project primarily supports Windows and Linux.

Where It Fits

Compared with cloud OCR services, this project trades potential top-tier model accuracy for offline privacy, stability, and bulk-processing features. Compared with minimal single-shot screenshot utilities, it adds batch output formats, post-processing (layout-aware text ordering), and programmatic interfaces that make it usable in automated workflows.
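
To make "layout-aware text ordering" concrete, here is a minimal Python sketch of one common heuristic: group detected text boxes into rows by vertical position, then read each row left to right. This is a generic illustration, not the project's actual algorithm; the `Box` tuple layout and the `row_tolerance` parameter are hypothetical.

```python
from typing import List, Tuple

# Hypothetical simplified box: (text, x, y, height), with (x, y) the top-left corner.
Box = Tuple[str, float, float, float]


def reading_order(boxes: List[Box], row_tolerance: float = 0.5) -> List[str]:
    """Group boxes into rows by vertical position, then sort each row by x."""
    rows: List[List[Box]] = []
    for box in sorted(boxes, key=lambda b: b[2]):  # process top to bottom
        _, _, y, _ = box
        if rows:
            _, _, row_y, row_h = rows[-1][0]
            # Same row if the vertical offset is within a fraction of the
            # height of the first box in the current row.
            if abs(y - row_y) <= row_tolerance * row_h:
                rows[-1].append(box)
                continue
        rows.append([box])
    # Emit rows top to bottom, each row left to right.
    return [b[0] for row in rows for b in sorted(row, key=lambda b: b[1])]
```

For example, boxes detected in arbitrary order come back in reading order:

```python
boxes = [("world", 60, 11, 10), ("second", 0, 30, 10), ("Hello", 0, 10, 10)]
print(reading_order(boxes))  # ['Hello', 'world', 'second']
```

Real documents with multiple columns or rotated text need more than this single-column heuristic.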

Information

  • Website: github.com
  • Author: shiroi-sora
  • Published date: 2022/03/28