Document Analyzer SDK Libraries
The LEADTOOLS Document Analyzer SDK Library intelligently identifies document components and zones in text-based office documents (DOC, DOCX, XLS, XLSX), PDFs, and document images (JPG, TIFF, PNG, PDF) for automated document processing and smart data extraction. The Document Analyzer automatically finds key phrases within structured and unstructured documents, even if the layouts between files are completely different. Additionally, the component performs deep analysis to further improve detection, ensuring that nothing is missed and all data of interest is found. .NET (C#, VB), Java, and web developers building applications that process various types of forms and documents will benefit greatly by integrating this easy-to-use API.
Intelligent Document Analyzer Component
LEAD’s investment in AI and machine learning is showcased in the Document Analyzer SDK, which automatically detects and extracts data from any type of structured or unstructured form, document, or image with simple rule-based configurations.
All Document Analyzer features are provided without the need for additional third-party tools or applications. Some of those features include:
- Location search, including relative locations
- Conditional search to match and filter the results
- Regex support for partial and full matches
- Predefined rules for common data types such as SSN, ID number, Tax ID, address, email address, and more
- Functions to add custom rulesets that find, collect, and act upon information of interest
- Actions such as redact, highlight, and extract can be applied to data of interest
- Handles various data formats, including tables, text flows, and data across multiple lines
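To make the rule-and-action idea above concrete, here is a minimal Python sketch of regex rules paired with actions. It does not use the LEADTOOLS API. The names `Rule`, `RULES`, `find_matches`, and `apply_actions` are hypothetical, and the two patterns are simplified stand-ins for illustration, not the SDK's predefined rules.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    name: str     # label for the data of interest
    pattern: str  # regular expression that finds it
    action: str   # "redact" or "extract" in this sketch


# Hypothetical, simplified rules (not the SDK's predefined rulesets).
RULES = [
    Rule("ssn", r"\b\d{3}-\d{2}-\d{4}\b", "redact"),
    Rule("email", r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b", "extract"),
]


def find_matches(text, rules):
    """Return (rule, start, end) tuples for every match, ordered by position."""
    found = []
    for rule in rules:
        for m in re.finditer(rule.pattern, text):
            found.append((rule, m.start(), m.end()))
    return sorted(found, key=lambda item: item[1])


def apply_actions(text, rules):
    """Apply each rule's action; return the redacted text and extracted values."""
    matches = find_matches(text, rules)

    # Collect extracted values in document order.
    extracted = {}
    for rule, start, end in matches:
        if rule.action == "extract":
            extracted.setdefault(rule.name, []).append(text[start:end])

    # Redact from the end of the text so earlier offsets stay valid.
    redacted = text
    for rule, start, end in reversed(matches):
        if rule.action == "redact":
            redacted = redacted[:start] + "X" * (end - start) + redacted[end:]

    return redacted, extracted
```

Calling `apply_actions("Contact jane.doe@example.com, SSN 123-45-6789.", RULES)` masks the SSN and returns the email address under the `"email"` key. A production ruleset would also need to handle overlapping matches and location-based conditions, which this sketch leaves out.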
Smart Data Extraction
Harnessing the power of LEAD’s Forms Recognition and Processing libraries, the Document Analyzer intelligently extracts text, paragraphs, or any key-value pair from text-based office documents (DOC, DOCX, XLS, XLSX), PDFs, and document images (JPG, TIFF, PNG, PDF) based on rules. This smart data extraction automatically finds key phrases in structured and unstructured documents such as invoices, statements, bills of lading, and receipts, even if the layouts between files are completely different. Additionally, the component performs deep analysis to further improve detection, ensuring that all data of interest is found and nothing slips through.
Analyze Any Input — Even Mixed Content
The Document Analyzer works on all types of input, including text-based files, image-based files, and files with mixed text and image content, through seamless integration with the LEADTOOLS proprietary OCR technology, which is built with patented machine learning algorithms.
Confidence Ratings Provided
The Document Analyzer provides a confidence rating for each recognized value, so users can accept or decline values individually. A solution developer can use the rating to accept or reject recognized values automatically, with full control over the subsequent workflow.
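One way a developer might act on such ratings is to sort recognized values into accept, review, and reject buckets. The sketch below is a generic illustration rather than LEADTOOLS code. The `RecognizedValue` type, the `triage` function, the 0 to 100 scale, and both thresholds are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class RecognizedValue:
    field: str
    value: str
    confidence: float  # assumed 0-100 scale; the real scale may differ


def triage(values, accept_at=90.0, reject_below=50.0):
    """Split values into (accepted, needs_review, rejected) by confidence."""
    accepted, needs_review, rejected = [], [], []
    for v in values:
        if v.confidence >= accept_at:
            accepted.append(v)      # high confidence: accept automatically
        elif v.confidence < reject_below:
            rejected.append(v)      # low confidence: reject automatically
        else:
            needs_review.append(v)  # in between: leave for a person to decide
    return accepted, needs_review, rejected
```

Keeping a middle band for manual review lets the automatic paths stay strict while uncertain values still reach a person. The thresholds would need tuning against real documents.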
Save Space in your Document Management System
Documents with sensitive data are processed regularly in industries such as healthcare, finance, and insurance, and a common pain point is manual data redaction and file storage. Manually redacting documents and storing both the redacted and unredacted files within a document management system can take up a lot of time and space. By leveraging the powerful machine vision libraries within the LEADTOOLS Document Analyzer, users need to store only the unredacted files, and the system can automatically redact on the fly when a file is requested.
An Interface for Any User
The Document Analyzer is provided as a configuration-driven application for ease of use and as .NET and Java classes for the ultimate in flexibility.
Easy to Integrate
LEADTOOLS handles the heavy lifting, eliminating months of R&D while giving you the best quality and performance available. This leaves you free to focus on other components of your application. Download the evaluation and start coding to get an idea of how much more streamlined your development will be using LEADTOOLS.
Document Analyzer SDK Platforms and Programming Interfaces
Start Coding With LEADTOOLS Document Analyzer
Document Analyzer libraries are available for Windows, Linux, and macOS, as are all LEADTOOLS Recognition, Document, Medical, Vector, and Imaging technologies for all development and target platforms.