Programming with LEADTOOLS PDF

The Portable Document Format (PDF) was developed by Adobe to allow electronic documents to be exchanged and viewed easily and reliably, independent of the environment in which they were created. The format supports compression, which can reduce large documents to a size that downloads quickly, and it reproduces documents faithfully when they are distributed over the web.

LEADTOOLS offers extensive support for reading and writing PDF documents. The following sections summarize that support, starting with the features specific to the Leadtools.Pdf assembly.

PDF File Features such as Merging and Extraction of Pages

The Leadtools.Pdf.PDFFile class allows you to perform file-level actions on existing PDF and PS files, such as merging files, extracting pages, and converting a PDF file to PDF/A.

The C# and VB PDF Features Demo shipped with LEADTOOLS provides a wizard-style user interface for performing these actions on existing PDF and PS files.

The following example will convert an existing PDF file to a PDF/A file:

// Create a PDFFile object from the input PDF file 
PDFFile inputFile = new PDFFile("Input.pdf"); 
// Convert it to PDF/A 
inputFile.ConvertToPDFA("OutputPDFA.pdf"); 

The following example will merge four PDF files:

// Create a PDFFile object from the first PDF file 
PDFFile firstFile = new PDFFile("1.pdf"); 
// Merge it with the second, third and fourth files 
firstFile.MergeWith(new string[] { "2.pdf", "3.pdf", "4.pdf" }, "Output.pdf"); 

PDF Document Object Parsing

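The Leadtools.Pdf.PDFDocument class encapsulates a PDF document on disk and gives access to its pages and the objects they contain, for example to render each page as a raster image or to parse the text on a page.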

The following example will convert a multi-page PDF document into a multi-page TIFF file:

// Load the input PDF document 
using(PDFDocument document = new PDFDocument("Input.pdf")) 
using(RasterCodecs codecs = new RasterCodecs()) 
{ 
   // Loop through all the pages in the document 
   for(int pageNumber = 1; pageNumber <= document.Pages.Count; pageNumber++) 
   { 
      // Render the page into a raster image 
      using(RasterImage image = document.GetPageImage(codecs, pageNumber)) 
      { 
         // Append to (or create if it does not exist) a TIFF file 
         codecs.Save(image, "Output.tif", RasterImageFormat.TifJpeg, 24, 1, 1, -1, CodecsSavePageMode.Append); 
      } 
   } 
} 

The following example will parse the text of a PDF file and save it to a text file on disk:

// Requires the System.IO and Leadtools.Pdf namespaces 
// Load the input PDF document 
using(PDFDocument document = new PDFDocument("Input.pdf")) 
// Create the output text file 
using(StreamWriter writer = File.CreateText("Page1.txt")) 
{ 
   // Parse the objects in all pages 
   document.ParsePages(PDFParsePagesOptions.Objects, 1, -1); 
   // Loop through all the pages 
   foreach(PDFDocumentPage page in document.Pages) 
   { 
      // Loop through the objects of this page 
      foreach(PDFObject obj in page.Objects) 
      { 
         // Is this a text object (character)? 
         if(obj.ObjectType == PDFObjectType.Text) 
         { 
            // Yes, write it to the output file 
            writer.Write(obj.Code); 
            // Check if we need to move to a new line 
            if(obj.TextProperties.IsEndOfLine) 
            { 
               writer.WriteLine(); 
            } 
         } 
      } 
      // End of page 
      writer.WriteLine(); 
   } 
} 

PDF as a Raster Image

LEADTOOLS supports getting information about, loading (rendering), and saving a PDF document as a raster image (Leadtools.RasterImage). Using the Leadtools.Codecs.RasterCodecs class, you can treat a PDF file just like any other image format such as TIFF or JPEG. You can query the size of a PDF page or its bits/pixel value, render a PDF page on the surface of an image, save any image to PDF on disk, or convert a PDF to TIFF or JPEG (or any other supported format) and vice versa.
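
The operations described above can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the file names are placeholders, and it assumes that a LEADTOOLS license has already been set for the process and that the PDF codec is available at run time.

```csharp
using System;
using Leadtools;
using Leadtools.Codecs;

class PdfAsRasterSketch
{
   static void Main()
   {
      // Assumption: the LEADTOOLS runtime license was set earlier in the process
      using (RasterCodecs codecs = new RasterCodecs())
      {
         // Query the page count and first-page size without loading any pixel data
         CodecsImageInfo info = codecs.GetInformation("Input.pdf", true);
         Console.WriteLine("Pages: {0}, size: {1} x {2}, bits/pixel: {3}",
            info.TotalPages, info.Width, info.Height, info.BitsPerPixel);

         // Render page 1 only, as a 24-bit BGR raster image
         using (RasterImage image = codecs.Load("Input.pdf", 24, CodecsLoadByteOrder.Bgr, 1, 1))
         {
            // Save the rendered page as JPEG (placeholder output name)
            codecs.Save(image, "Page1.jpg", RasterImageFormat.Jpeg, 24);
            // Save the same image back out as a raster (image-only) PDF
            codecs.Save(image, "Page1Raster.pdf", RasterImageFormat.RasPdf, 24);
         }
      }
   }
}
```

A PDF saved this way contains only the rendered image, so it is not searchable; the OCR route described later in this topic is the one that produces searchable text.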

Creating PDF Documents from Windows Metafiles

The LEADTOOLS Document Writers can be used to create a searchable multi-page PDF document from one or more Windows Enhanced Metafiles (EMF).

Creating PDF Documents from OCR Results

All the LEADTOOLS Optical Character Recognition (OCR) engines support outputting the final document as PDF. With OCR, you can convert a scanned TIFF or JPEG file to a searchable PDF, or extract the text from a raster PDF document.
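
As a hedged sketch of the scanned-image-to-searchable-PDF case, the one-call recognition route might look like the code below. The file names are placeholders. The engine type `OcrEngineType.LEAD` and the `Leadtools.Document.Writer` namespace are assumed to match version 20 and may be named differently in other versions. The sketch also assumes the license is already set and that the OCR runtime files are in their default location.

```csharp
using Leadtools;
using Leadtools.Document.Writer;
using Leadtools.Ocr;

class OcrToPdfSketch
{
   static void Main()
   {
      // Assumption: the LEADTOOLS runtime license was set earlier in the process
      // Assumption: OcrEngineType.LEAD is the engine name used by version 20
      using (IOcrEngine ocrEngine = OcrEngineManager.CreateEngine(OcrEngineType.LEAD, false))
      {
         // Start the engine with default codecs, document writer, work directory and runtime path
         ocrEngine.Startup(null, null, null, null);

         // Recognize every page of the scanned file and write a searchable PDF
         // (no zones file and no progress callback)
         ocrEngine.AutoRecognizeManager.Run("Scanned.tif", "Searchable.pdf", DocumentFormat.Pdf, null, null);

         ocrEngine.Shutdown();
      }
   }
}
```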

Creating Highly Compressed PDF Documents using MRC

The LEADTOOLS PDF Compressor supports saving files using Mixed Raster Content (MRC) technology. With the MRC engine, the compressor breaks a page or image down into smaller segments and saves each segment using the compression best suited to it. Compared to a standard raster PDF, the result is a PDF file that combines high compression with high quality.

Help Version 20.0.2018.1.19
© 1991-2018 LEAD Technologies, Inc. All Rights Reserved.