LEADTOOLS PDF SDK

PDF

LEADTOOLS toolkits include comprehensive PDF reading, writing and viewing technology with advanced capabilities such as extraction of text, hyperlinks, bookmarks and metadata, as well as updating, splitting and merging pages from existing PDF documents. Combined with LEADTOOLS' advanced rasterization and image display technology, these capabilities let developers enhance their applications with dynamic document viewing, editing and assembly features. Furthermore, programmers using .NET (C# & VB), C/C++, HTML5 and more can implement state-of-the-art OCR, ICR, OMR, Forms Recognition, Virtual Printing and scanning technologies within LEADTOOLS to create any type of document and medical imaging application that utilizes the PDF format.

Tested against thousands of PDF documents, the LEADTOOLS PDF SDK provides impeccable accuracy that tops many market-leading PDF reading applications. LEADTOOLS accounts for common errors and differences between PDF file versions, which gives programmers peace of mind, shortens their testing phase and helps them create the best PDF applications on the market.

Overview of LEADTOOLS PDF SDK Technology

PDF Document Features

  • Load and view any PDF document
  • Extract text (characters, words and lines), fonts, images, annotations, rectangles and hyperlinks with location and size
  • Full support for reading, editing and writing native PDF annotations
  • Full Unicode support including Chinese, Japanese, Arabic and Hebrew
  • Parse the document structure by reading PDF bookmarks (Table of Contents) and internal links (jumps)
  • Generate a raster image or thumbnail of any page

PDF File Features

  • Comprehensive multi-page support including:
    • Merge existing PDF files into a single PDF
    • Split a single PDF into multiple PDF files
    • Extract, delete, insert or replace any page in existing PDF files
  • Read, write and update the Table of Contents (TOC) of PDF files
  • Convert any existing PDF to PDF/A
  • Linearize (optimize for web viewing) any existing PDF
  • Encrypt/decrypt documents and convert to and from any PDF version
  • Read, write and update all PDF metadata such as author, title, subject and keywords
  • Convert (distill) PostScript to PDF with optimization for eBook, Screen and Prepress

PDF Annotations and Markup

LEADTOOLS supports reading, displaying, editing and writing native PDF annotations and markup that work seamlessly with Adobe Acrobat and other compliant PDF readers. Annotations are an important feature in document imaging because they allow users to communicate with each other by writing comments and drawing shapes on top of the document without making any permanent changes.

  • Support for all PDF annotation and markup objects
    • Comment
    • Highlight
    • Text
    • Arrow
    • Line
    • Review
    • Shapes
  • Convert PDF annotations to and from LEADTOOLS annotations for live editing
  • Options to control annotation rendering when loading PDF as raster with support for No Appearance Stream annotations
  • Fully functional sample application with source code that implements all of the PDF reading, writing, editing, and annotation features

OCR PDF Output

LEADTOOLS allows developers to easily convert any image into a searchable text PDF. Searchable text PDFs are generally smaller in size than the comparable raster image and the embedded text can be searched, indexed and edited as in a word processor.

  • Convert images to searchable text PDF files with as few as three lines of code using LEADTOOLS SDK OCR technology
  • Export text only or image over text to retain original formatting
  • Multiple PDF versions including 1.2 - 1.7 and PDF/A
  • Multiple compression options for images within the PDF including:
    • JPEG
    • JPEG2000
    • LZW
    • CCITT G3/G4
    • JBIG2
    • MRC
  • Convert entire file or only specified pages
  • Convert images from disk, memory, Internet or SharePoint
  • Preprocess images to improve readability, compression and recognition
  • Create and update PDF document metadata such as author, title and keywords
  • Protect sensitive data with encrypted PDF documents using RC4 40-bit and RC4 128-bit encryption
  • Control access to the PDF document with User and Owner passwords
  • Options to embed fonts in the PDF file
  • Options to create linearized PDF files for faster web viewing

Raster Image PDF Features

In addition to handling text-based PDF files, LEADTOOLS fully supports loading, saving and editing raster image PDFs. This includes rasterizing any text or image-based PDF into thumbnails or full size document images, as well as converting single and multi-page image formats such as JPEG and TIFF into image-based PDF files.

  • Convert any PDF file to and from over 150 supported raster image formats
  • Multiple PDF versions including 1.2 - 1.7 and PDF/A
  • Specify RGB or CMYK color space
  • Multiple compression options including:
    • JPEG
    • LZW
    • CCITT G3/G4
    • JBIG2
    • MRC
  • Encrypt and decrypt PDF documents using RC4 40-bit and RC4 128-bit encryption
  • Control access to the PDF document with User and Owner passwords
  • Process entire PDF file or specified set of pages
  • Load PDF from disk, memory, Internet and SharePoint

PDF Rasterization Options

At the heart of PDF-to-image conversion is the rasterization process. By nature, PDF documents are made up of vector objects such as text and 2D images. These objects have a relative location based on the physical, printed dimensions. This means that PDFs are dynamic documents that can be rasterized to any pixel dimension based on the DPI (dots per inch) while preserving a high-quality display. LEADTOOLS provides maximum flexibility when rasterizing PDF files and allows the developer to control the quality, size, color and more.

  • Load at any DPI to control overall quality and file size
  • Load at 1, 8, or 24 bits per pixel
  • Render fonts with 2-bit and 4-bit anti-aliasing, resulting in a more readable image
  • Rescale embedded graphics with 2-bit and 4-bit anti-aliasing to retain original image quality and reduce graininess
  • Automatically detect the best rasterization options by examining the contents of the PDF
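
The link between printed page size, DPI and pixel dimensions comes down to simple arithmetic. The following is a minimal Python sketch, not LEADTOOLS code: it assumes the PDF default of 72 points per inch, and the function name `raster_size` and the whole-byte row padding are illustrative assumptions.

```python
import math

POINTS_PER_INCH = 72  # PDF default user space: 1 point = 1/72 inch


def raster_size(width_pt, height_pt, dpi, bits_per_pixel):
    """Return (width_px, height_px, raw_bytes) for a page rasterized at `dpi`.

    Hypothetical helper for illustration only.
    """
    width_px = round(width_pt / POINTS_PER_INCH * dpi)
    height_px = round(height_pt / POINTS_PER_INCH * dpi)
    # Assumption: each row is padded up to a whole number of bytes.
    row_bytes = math.ceil(width_px * bits_per_pixel / 8)
    return width_px, height_px, row_bytes * height_px


# US Letter is 612 x 792 points (8.5 x 11 inches).
for dpi, bpp in [(96, 24), (300, 1), (300, 24)]:
    w, h, size = raster_size(612, 792, dpi, bpp)
    print(f"{dpi} DPI, {bpp} bpp: {w} x {h} px, {size} uncompressed bytes")
```

For a Letter page, 300 DPI yields 2550 x 3300 pixels. At 24 bits per pixel that is roughly 25 MB of uncompressed data, while the same page at 1 bit per pixel needs about 1 MB, which is why the DPI and bit depth chosen at load time dominate memory use and output file size.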

PDF Compression

Maintain quality while maximizing PDF compression with LEADTOOLS advanced image segmentation and compression technologies. The resulting compressed PDF can be loaded and viewed in any PDF viewer that supports standard PDF files. By storing complex mixed raster content (MRC), this process creates PDF files with better compression and quality than a standard raster PDF file.

  • Automatically segment the image with optimization options
  • Manually segment the image to take full control over file size and image quality optimization
  • Compression for different segment types can be automatically or manually selected
  • Multiple compression options including:
    • ZIP
    • LZW
    • CCITT G3/G4
    • JBIG2
    • JPEG
  • Automatic background detection
  • Compress single or multi-page PDF files
  • Native 32-bit and 64-bit binaries for compressing PDF files
  • Add PDF compression to single or multi-threaded applications
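
To illustrate why segmenting a page into layers can help, here is a toy Python sketch of the general mixed raster content idea. It is not the LEADTOOLS segmentation algorithm: the fixed threshold, the function names `split_layers` and `raw_bytes`, and the choice of a 100 DPI background are all assumptions made for the example.

```python
def split_layers(gray_rows, threshold=128):
    """Toy segmentation of a grayscale image (0 = black, 255 = white).

    Pixels darker than `threshold` are treated as text and recorded in a
    1-bit mask; in the background layer they are replaced with white.
    """
    mask = [[1 if p < threshold else 0 for p in row] for row in gray_rows]
    background = [[255 if p < threshold else p for p in row] for row in gray_rows]
    return mask, background


def raw_bytes(width_px, height_px, bits_per_pixel):
    """Uncompressed size of a bitmap, ignoring row padding."""
    return width_px * height_px * bits_per_pixel // 8


# A tiny 2 x 4 sample: light paper with a few dark "text" pixels.
sample = [[250, 250, 20, 250],
          [240, 30, 25, 245]]
mask, background = split_layers(sample)
print(mask)
print(background)

# Layer sizes for a US Letter page (8.5 x 11 inches) before any codec runs.
single_layer = raw_bytes(2550, 3300, 24)     # 300 DPI, 24-bit color
text_mask = raw_bytes(2550, 3300, 1)         # 300 DPI, 1-bit mask
soft_background = raw_bytes(850, 1100, 24)   # 100 DPI, 24-bit color
print(single_layer, text_mask + soft_background)
```

In this sketch the sharp text survives at full resolution in the 1-bit mask, while the smooth background tolerates a lower resolution. Each layer can then be passed to a codec suited to it, for example a bitonal codec such as JBIG2 or CCITT G4 for the mask and JPEG for the background.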

Explanation of PDF File Types

In general, PDF and PDF/A files can be categorized into two basic types: raster image and searchable text. Raster image PDFs consist of a complete raster image in a PDF wrapper and support multiple compression types including JPEG, CCITT G3/G4, JBIG2, and LZW. The greatest advantage of raster image PDFs is that they appear identical to the original document. On the other hand, searchable text PDFs are often smaller in size and the text can be searched and edited as in a word processor.

When converting from raster images to searchable text-based PDFs, the formatting of the original image is often modified. To alleviate this concern, LEAD has implemented a hybrid type of PDF known as "image over text". In image over text PDFs, the text is formatted as usual, but the original raster image is overlaid on top of the text. This maintains the look and formatting of the original raster image while still allowing the text content to be searched, selected, copied and pasted.

LEADTOOLS SDK Products that Include PDF

LEADTOOLS Document Imaging SDK

Develop powerful document imaging applications with LEADTOOLS Document Imaging. Features include comprehensive image annotation, specialized bitonal image display such as scale-to-gray and favor-black, and specialized bitonal image processing. Other features include performance and memory optimizations for bitonal images, document image clean-up including inverted text, border, hole-punch and line removal, and scanning with LEADTOOLS Fast TWAIN and WIA.

LEADTOOLS Recognition Imaging SDK

The LEADTOOLS Recognition Imaging SDK is a handpicked collection of LEADTOOLS SDK features designed to build end-to-end document imaging applications as part of an enterprise-level document automation solution that requires scanning, OCR, OMR, forms recognition and processing, archival, annotation and display functionality. This powerful set of tools utilizes LEAD's industry LEADing image processing technology to intelligently identify document features that can be used to recognize any type of scanned or faxed form image.

LEADTOOLS Document Imaging Suite SDK

The LEADTOOLS Document Imaging Suite SDK is a comprehensive collection of LEADTOOLS SDK features designed to build end-to-end document imaging applications within enterprise-level document automation solutions that require capture, OCR, OMR, forms recognition and processing, PDF, print capture, archival, annotation and display functionality. This powerful set of tools utilizes LEAD's industry LEADing image processing technology to intelligently identify document features that can be used to recognize any type of scanned or faxed form image.

LEADTOOLS Medical Imaging SDK

Develop powerful Medical Imaging applications with the LEADTOOLS Medical Imaging SDK. Features include comprehensive full DICOM dataset support, 8-16 bit extended grayscale image support, image annotation, specialized extended grayscale image display such as window level and LUT processing, and specialized medical image processing. Other features include lossless JPEG compression, and signed and unsigned image data processing.

LEADTOOLS PACS Imaging SDK

Develop robust PACS Imaging applications with LEADTOOLS PACS Imaging. Features include Medical Web Viewer Framework, high and low level PACS SCP and SCU functions and controls, secure PACS communication, comprehensive DICOM data set support, image annotation, extended grayscale image display such as window level and LUT processing, and specialized medical image processing. Other features include lossless JPEG compression, JPIP and signed and unsigned image data processing.

LEADTOOLS Medical Imaging Suite SDK

Develop powerful PACS and Medical imaging applications with LEADTOOLS Medical Imaging Suite. Features include Medical Web Viewer Framework, Medical 3D, Zero Footprint HTML5 DICOM Viewer, DICOM Multimedia codecs, high and low level PACS SCP and SCU functions and controls, secure PACS communication, Print to PACS, comprehensive DICOM data set support, image annotation, extended grayscale image display such as window level and LUT processing, and specialized medical image processing. Other features include lossless JPEG compression, JPIP, and signed and unsigned image data processing.

Platforms and Programming Interfaces
