
Kensho Extract
Kensho Extract is a machine learning tool that converts unstructured documents into structured data. It extracts text, tables, and figures from PDFs and scanned files, enabling faster analysis, improved data quality, and seamless integration into enterprise workflows.
Vendor
Kensho

Product details
Kensho Extract
Kensho Extract is a machine learning-powered document processing tool that transforms unstructured PDFs and text files into structured, machine-readable formats. It enables fast, accurate extraction of text, tables, and key-value pairs, streamlining workflows for data analysis, enrichment, and integration across enterprise systems.
Features
- Optical Character Recognition (OCR) for scanned documents
- Hierarchical document structure recognition
- Advanced table extraction with support for merged cells and headers
- Figure and chart data extraction (e.g., bar charts)
- REST API for high-throughput document processing
- JSON output with detailed segmentation (titles, paragraphs, tables, footnotes)
- Support for multiple output models (hierarchical, general)
- UI for manual review and correction
- Integration with Kensho NERD and Link services
- Toolkit for markdown conversion, section organization, and visual formatting
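As a rough illustration of working with segmented JSON output like that described in the feature list, the sketch below groups segments by type. The field names (`segments`, `type`, `text`) and the sample payload are assumptions made for this example and do not reflect Kensho Extract's documented schema.

```python
import json
from collections import defaultdict

# Hypothetical payload: a flat list of typed segments. This shape is an
# assumption for illustration, not the actual Kensho Extract output format.
SAMPLE_OUTPUT = json.dumps({
    "segments": [
        {"type": "title", "text": "Annual Report"},
        {"type": "paragraph", "text": "Revenue grew during the year."},
        {"type": "table", "text": "Revenue | 120"},
        {"type": "footnote", "text": "Figures in USD millions."},
        {"type": "paragraph", "text": "Costs were flat."},
    ]
})


def group_segments(raw_json: str) -> dict[str, list[str]]:
    """Group segment texts by their declared type, keeping document order."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for segment in json.loads(raw_json)["segments"]:
        grouped[segment["type"]].append(segment["text"])
    return dict(grouped)


grouped = group_segments(SAMPLE_OUTPUT)
for segment_type, texts in grouped.items():
    print(segment_type, len(texts))
```

Grouping by type first makes it easy to send tables to one downstream process and running text to another, which is the kind of routing a segmented output is meant to support.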
Capabilities
- Convert complex PDFs into structured data for analysis
- Extract and standardize financial metrics, periods, and currencies
- Identify and isolate tables, figures, and key-value pairs
- Maintain page layout for multilingual translation workflows
- Enable downstream NLP and machine learning applications
- Automate data onboarding and reduce manual copy-paste errors
- Handle inconsistent formatting and messy layouts
- Support batch and real-time processing for large-scale operations
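To make the idea of standardizing financial values and currencies concrete, here is a minimal post-processing sketch that turns strings such as `$1.2M` or `(3,400)` into a numeric value and a currency code. The symbol and scale mappings, the pattern, and the function name `normalize_amount` are illustrative assumptions, not Kensho Extract's own normalization rules.

```python
import re
from decimal import Decimal

# Illustrative mappings only; a real pipeline would cover many more cases.
CURRENCY_SYMBOLS = {"$": "USD", "€": "EUR", "£": "GBP"}
SCALE_SUFFIXES = {"K": Decimal("1e3"), "M": Decimal("1e6"), "B": Decimal("1e9")}

# Optional opening parenthesis (accounting-style negative), optional currency
# symbol, digits with optional thousands separators and decimals, optional
# scale suffix, optional closing parenthesis.
AMOUNT_PATTERN = re.compile(
    r"^(?P<neg>\()?(?P<sym>[$€£])?(?P<num>[\d,]+(?:\.\d+)?)(?P<scale>[KMB])?\)?$"
)


def normalize_amount(raw: str):
    """Return (value, currency_code); currency_code is None if no symbol."""
    match = AMOUNT_PATTERN.match(raw.strip())
    if match is None:
        raise ValueError(f"unrecognized amount: {raw!r}")
    value = Decimal(match["num"].replace(",", ""))
    if match["scale"]:
        value *= SCALE_SUFFIXES[match["scale"]]
    if match["neg"]:
        value = -value
    return value, CURRENCY_SYMBOLS.get(match["sym"])


value, currency = normalize_amount("$1.2M")
negative_value, no_currency = normalize_amount("(3,400)")
```

`Decimal` is used instead of `float` so that extracted figures are not altered by binary rounding before they reach a database or spreadsheet.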
Benefits
- Saves time and resources in document processing
- Improves data quality and accessibility for analytics
- Reduces reliance on manual data entry and spreadsheet manipulation
- Enhances integration with internal databases and external knowledge bases
- Enables faster decision-making with structured insights
- Scales across teams and departments with minimal setup
- Supports compliance and audit workflows through accurate data extraction
- Boosts productivity in financial, legal, and research environments