Deep Learning Optical Character Recognition (OCR) APIs
Cloudmersive

High‑accuracy, ML-powered OCR to convert scanned images and photos into text in seconds.

Product details

Overview

The Cloudmersive OCR API lets developers convert scanned images, PDFs, and document photos into high‑quality machine‑readable text using machine learning and image pre‑processing. It supports more than 90 languages and automatically deskews and unrotates images before recognition to improve accuracy on both scanned documents and photos. Separate endpoints are provided for standard scanned images, smartphone‑captured photos, and receipts, so each input type can be handled by the endpoint intended for it.

Features and Capabilities

  • Image OCR: Converts scanned images (e.g. JPEG, PNG) to text; supports three fault‑tolerant recognition modes: Basic, Normal, and Advanced.
  • Photo OCR: Handles smartphone photos of documents or receipts, applying deskewing, unrotation, and cropping to improve recognition.
  • PDF OCR: Extracts text from PDFs with scanned pages as lines or words, with detailed location metadata.
  • Words/Lines with Location Metadata: Retrieve precise coordinates of words or lines within documents or images, enabling layout-aware processing.
  • Multi-Language Support: Recognizes text in over 90 languages including English, Chinese (simplified & traditional), French, Spanish, German, Italian, and many more.
  • Preprocessing Utilities: Includes automatic image preprocessing such as unrotation and deskewing to enhance OCR accuracy.
  • Free Tier Access: Offers up to 600 API calls per month at no cost, with no expiration on the free tier.
  • Flexible Deployment Options: Available as a hosted cloud service or deployed to private cloud, on-premises, or major public cloud environments like Azure, AWS, or GCP.
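
To illustrate what layout-aware processing with word location metadata can look like, here is a minimal Python sketch that groups words into text lines using their coordinates. It does not call the Cloudmersive API, and the `Word` fields (`text`, `x`, `y`, `width`, `height`) and the `group_into_lines` helper are hypothetical names chosen for this example, not the API's actual response schema. Consult the vendor's API reference for the real field names.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """Hypothetical word record; field names are assumptions, not the API schema."""
    text: str
    x: int       # left edge, in pixels
    y: int       # top edge, in pixels
    width: int
    height: int


def group_into_lines(words, tolerance=0.5):
    """Group words into lines by vertical center, then order each line left to right.

    A word joins the current line when its vertical center is within
    `tolerance` * (height of the line's first word) of that first word's center.
    """
    def center(word):
        return word.y + word.height / 2

    lines = []
    for word in sorted(words, key=center):
        if lines:
            reference = lines[-1][0]
            if abs(center(word) - center(reference)) <= tolerance * reference.height:
                lines[-1].append(word)
                continue
        lines.append([word])

    return [
        " ".join(w.text for w in sorted(line, key=lambda w: w.x))
        for line in lines
    ]


# Example input: four words from a made-up receipt, in arbitrary order.
sample = [
    Word("Total", x=10, y=100, width=60, height=20),
    Word("$12.50", x=200, y=102, width=70, height=20),
    Word("Receipt", x=10, y=10, width=80, height=20),
    Word("#42", x=120, y=12, width=40, height=20),
]

for line in group_into_lines(sample):
    print(line)
```

Grouping by vertical center rather than by top edge tolerates small baseline differences between words of slightly different heights. A real pipeline would likely also need to handle multi-column layouts, which this sketch does not attempt.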