Document Analyzer SDK Libraries

The LEADTOOLS Document Analyzer SDK Library intelligently identifies document components and zones in text-based office documents (DOC, DOCX, XLS, XLSX), PDFs, and document images (JPG, TIFF, PNG, PDF) for automated document processing and smart data extraction. The Document Analyzer automatically finds key phrases within structured and unstructured documents, even if the layouts differ completely between files. Additionally, the component performs deep analysis to further improve detection, ensuring that all data of interest is found. .NET (C#, VB), Java, and web developers building applications that process various types of forms and documents will benefit from integrating this easy-to-use API.

Intelligent Document Analyzer Component

LEAD’s investment in AI and machine learning is showcased in the Document Analyzer SDK, which automatically detects and extracts data from any type of structured or unstructured form, document, or image with simple rule-based configurations.

All Document Analyzer features are provided without the need for additional third-party tools or applications. Some of those features include:

  • Location search, including relative locations
  • Conditional search to match and filter the results
  • Partial and full match Regex support
  • Predefined rules for common data types such as SSN, ID number, Tax ID, address, email address, and more
  • Functions to add custom rulesets that find, collect, and act upon information of interest
  • Actions such as redact, highlight, and extract can be applied to data of interest
  • Handles various data formats, including tables, text flows, and data spanning multiple lines
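
To make the difference between partial and full regex matching concrete, and to show how an action such as redaction can be applied to matched data, here is a minimal sketch. It uses only the standard java.util.regex package and is not the LEADTOOLS API; the class name, method names, and the SSN pattern are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only: plain java.util.regex, not the LEADTOOLS API.
final class RegexRuleSketch {
    // Assumed rule: a US SSN written as 3-2-4 digits separated by hyphens.
    static final Pattern SSN = Pattern.compile("\\b\\d{3}-\\d{2}-\\d{4}\\b");

    // Full match: the entire input must be exactly one SSN.
    static boolean isFullMatch(String text) {
        return SSN.matcher(text).matches();
    }

    // Partial match: collect every SSN found anywhere in the input.
    static List<String> findAll(String text) {
        List<String> hits = new ArrayList<>();
        Matcher m = SSN.matcher(text);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    // Action: replace each match with a fixed mask.
    static String redact(String text) {
        return SSN.matcher(text).replaceAll("XXX-XX-XXXX");
    }

    public static void main(String[] args) {
        String line = "Applicant SSN: 123-45-6789, spouse SSN: 987-65-4321";
        System.out.println(isFullMatch(line)); // the line contains more than an SSN
        System.out.println(findAll(line));     // every SSN found in the line
        System.out.println(redact(line));      // the line with SSNs masked
    }
}
```

A full match suits validating a single extracted field, while a partial match suits scanning running text for data of interest.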

Smart Data Extraction

Harnessing the power of LEAD’s Forms Recognition and Processing libraries, the Document Analyzer intelligently extracts text, paragraphs, or any key-value pair from text-based office documents (DOC, DOCX, XLS, XLSX), PDFs, and document images (JPG, TIFF, PNG, PDF) based on rules. This smart data extraction automatically finds key phrases in structured and unstructured documents such as invoices, statements, bills of lading, and receipts, even if the layouts differ completely between files. Additionally, the component performs deep analysis to further improve detection, ensuring that all data of interest is found and nothing slips through.
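
The idea of finding a value by its key phrase, regardless of layout, can be sketched in a few lines. The example below is a simplified illustration in standard Java, not the LEADTOOLS API: the class and method names are hypothetical, and it assumes the value is a single whitespace-free token that follows the key phrase on the same line or the next one.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only: plain java.util.regex, not the LEADTOOLS API.
final class KeyValueSketch {
    // Returns the first token that follows the key phrase. An optional ':' or '#'
    // and any whitespace (including a line break) may sit between key and value.
    static Optional<String> valueAfter(String keyPhrase, String text) {
        Pattern p = Pattern.compile(
                Pattern.quote(keyPhrase) + "\\s*[:#]?\\s*(\\S+)",
                Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(text);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Two assumed layouts for the same data: inline and stacked.
        String inline = "Invoice Number: INV-1001\nTotal: 250.00";
        String stacked = "TOTAL\n250.00\nInvoice number\nINV-1001";
        System.out.println(valueAfter("Invoice Number", inline));
        System.out.println(valueAfter("Invoice Number", stacked));
    }
}
```

The same rule yields the same value from both layouts because it keys on the phrase rather than on a fixed position in the document.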

Analyze Any Input — Even Mixed Content

The Document Analyzer works on all types of input, including text-based files, image-based files, and files with mixed text and image content, through seamless integration with the LEADTOOLS proprietary OCR technology built with patented machine learning algorithms.

Confidence Ratings Provided

The Document Analyzer provides a confidence rating for each recognized value, so users can accept or decline values individually. A solution developer can use the rating to automatically accept or reject recognized values, with full control over the subsequent workflow.
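
One way a developer might act on such a rating is a simple three-way triage: accept above one threshold, reject below another, and send everything in between to manual review. The sketch below is an assumption-laden illustration, not the LEADTOOLS API: the Recognized type, the 0 to 100 confidence scale, and the thresholds are all hypothetical.

```java
// Hypothetical illustration only: none of these types come from the LEADTOOLS API.
final class ConfidenceTriageSketch {
    enum Decision { ACCEPT, REVIEW, REJECT }

    // Assumed result shape: a recognized value with a confidence from 0 to 100.
    static final class Recognized {
        final String field;
        final String value;
        final int confidence;

        Recognized(String field, String value, int confidence) {
            this.field = field;
            this.value = value;
            this.confidence = confidence;
        }
    }

    // Accept at or above acceptAt, reject below rejectBelow, otherwise ask a human.
    static Decision triage(Recognized r, int acceptAt, int rejectBelow) {
        if (r.confidence >= acceptAt) {
            return Decision.ACCEPT;
        }
        if (r.confidence < rejectBelow) {
            return Decision.REJECT;
        }
        return Decision.REVIEW;
    }

    public static void main(String[] args) {
        Recognized total = new Recognized("Total", "250.00", 72);
        // With assumed thresholds of 90 and 50, a rating of 72 goes to review.
        System.out.println(total.field + " -> " + triage(total, 90, 50));
    }
}
```

Keeping the thresholds as parameters lets each workflow decide how much manual review it can afford.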

Save Space in your Document Management System

Industries such as healthcare, finance, and insurance regularly process documents containing sensitive data, and a common pain point is manual data redaction and file storage. Manually redacting documents and storing both the redacted and unredacted files within a document management system can take up a lot of time and space. By leveraging the powerful machine vision libraries within the LEADTOOLS Document Analyzer, users need only store the unredacted files, and the system can automatically redact on the fly when a file is requested.
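
The storage pattern described above, keeping one unredacted copy and redacting at request time, can be sketched as follows. This is a text-only simplification in standard Java, not the LEADTOOLS API and not image-based redaction: the class, the in-memory map standing in for a document management system, the SSN rule, and the mayViewSensitive flag are all assumptions.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical illustration only: a text-based stand-in, not the LEADTOOLS API.
final class OnTheFlyRedactionSketch {
    // Assumed sensitive-data rule: a US SSN written as 3-2-4 digits.
    private static final Pattern SSN = Pattern.compile("\\b\\d{3}-\\d{2}-\\d{4}\\b");

    // Only the unredacted original is stored; no redacted copy is kept.
    private final Map<String, String> store = new HashMap<>();

    void save(String id, String text) {
        store.put(id, text);
    }

    // Returns the original to authorized callers; everyone else gets a copy
    // that is redacted at request time.
    String fetch(String id, boolean mayViewSensitive) {
        String original = store.get(id);
        if (original == null) {
            throw new IllegalArgumentException("Unknown document: " + id);
        }
        return mayViewSensitive
                ? original
                : SSN.matcher(original).replaceAll("[REDACTED]");
    }

    public static void main(String[] args) {
        OnTheFlyRedactionSketch dms = new OnTheFlyRedactionSketch();
        dms.save("claim-1", "Patient SSN 123-45-6789, claim approved.");
        System.out.println(dms.fetch("claim-1", false));
        System.out.println(dms.fetch("claim-1", true));
    }
}
```

Because redaction happens on each request, a change to the redaction rules takes effect immediately without reprocessing stored files.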

An Interface for Any User

The Document Analyzer is provided as a configuration-driven application for ease of use and as .NET and Java classes for maximum flexibility.

Easy to Integrate

LEADTOOLS handles the heavy lifting, eliminating months of R&D while giving you the best quality and performance available. This leaves you free to focus on other components of your application. Download the evaluation and start coding to get an idea of how much more streamlined your development will be using LEADTOOLS.

Document Analyzer SDK Platforms and Programming Interfaces

Projects that use LEADTOOLS Document Analyzer libraries can be deployed to web browsers and Windows devices.

Document Analyzer SDK libraries are available for

Start Coding With LEADTOOLS Document Analyzer

The LEADTOOLS download includes Document Analyzer libraries for Windows, Linux, and macOS, as well as all LEADTOOLS Recognition, Document, Medical, Vector, and Imaging technologies for all development and target platforms.