Efficiently Convert a Document to an Image

One of the new features in the latest LEADTOOLS V19 update is a refactored load algorithm that greatly reduces load times for document formats such as PDF, Microsoft Office 97-2013 formats (Word, Excel, and PowerPoint), and TXT. The speedup is directly related to the number of pages in the document: the more pages, the greater the gain.
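To see why the gain would scale with page count, here is a toy model. It rests on an assumption the post does not confirm: that the cost being saved is re-parsing the source file on every page load. `ToyDocumentLoader` and `ToyDemo` are hypothetical names for illustration only, and this is not how LEADTOOLS is implemented internally.

```csharp
using System;

// Hypothetical stand-in for a document loader; NOT the LEADTOOLS implementation.
public sealed class ToyDocumentLoader
{
    private string[] _cachedPages;   // parsed pages kept between Start and Stop
    private bool _optimized;

    // Number of times the (simulated) expensive parse step ran.
    public int ParseCount { get; private set; }

    public void StartOptimizedLoad() { _optimized = true; }

    public void StopOptimizedLoad()
    {
        _optimized = false;
        _cachedPages = null;
    }

    // Simulates the expensive step: parsing the whole document.
    // In this toy format, a form feed separates pages.
    private string[] Parse(string document)
    {
        ParseCount++;
        return document.Split('\f');
    }

    // Returns one page (1-based). Assumes every call between Start and Stop
    // uses the same source document.
    public string LoadPage(string document, int pageNumber)
    {
        string[] pages;
        if (_optimized)
        {
            if (_cachedPages == null)
                _cachedPages = Parse(document);
            pages = _cachedPages;
        }
        else
        {
            pages = Parse(document);
        }
        return pages[pageNumber - 1];
    }
}

public static class ToyDemo
{
    // Loads every page once and reports how many parses were needed.
    public static int CountParses(bool optimized, int pageCount)
    {
        var pageTexts = new string[pageCount];
        for (var i = 0; i < pageCount; i++)
            pageTexts[i] = "page " + (i + 1);
        var document = string.Join("\f", pageTexts);

        var loader = new ToyDocumentLoader();
        if (optimized) loader.StartOptimizedLoad();
        for (var page = 1; page <= pageCount; page++)
            loader.LoadPage(document, page);
        if (optimized) loader.StopOptimizedLoad();

        return loader.ParseCount;
    }
}
```

In this model a page-by-page loop parses the document once per page without the optimized bracket and once in total with it, so the work saved grows with the page count.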

Below is a C# 6 code snippet showing how to use the new feature. The new calls are StartOptimizedLoad and StopOptimizedLoad, which bracket the page loop.


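// Namespaces this snippet relies on:
// using System; using System.IO; using Leadtools; using Leadtools.Codecs;
private static void ConvertDocumentToImage(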
    string inputFile,
    string outputFile,
    RasterImageFormat outputFormat,
    int bitsPerPixel)
{
    if (!File.Exists(inputFile))
        throw new ArgumentException($"{inputFile} not found.", nameof(inputFile));

    if (bitsPerPixel != 0 && bitsPerPixel != 1 && bitsPerPixel != 2 && bitsPerPixel != 4 &&
        bitsPerPixel != 8 && bitsPerPixel != 16 && bitsPerPixel != 24 && bitsPerPixel != 32)
        throw new ArgumentOutOfRangeException(nameof(bitsPerPixel), bitsPerPixel, 
            $"Invalid {nameof(bitsPerPixel)} value");

    using (var codecs = new RasterCodecs())
    {
        codecs.Options.RasterizeDocument.Load.XResolution = 300;
        codecs.Options.RasterizeDocument.Load.YResolution = 300;

        // Marks the start of a series of loads from the same source file
        codecs.StartOptimizedLoad();
        try
        {
            var totalPages = codecs.GetTotalPages(inputFile);
            if (totalPages > 1 && !RasterCodecs.FormatSupportsMultipageSave(outputFormat))
                throw new NotSupportedException(
                    $"The {outputFormat} format does not support multiple pages.");

            // Load one page at a time and append it to the output file
            for (var pageNumber = 1; pageNumber <= totalPages; pageNumber++)
            {
                Console.WriteLine($"Loading and saving page {pageNumber}");
                using (var rasterImage = codecs.Load(inputFile, bitsPerPixel, CodecsLoadByteOrder.Bgr, pageNumber, pageNumber))
                    codecs.Save(rasterImage, outputFile, outputFormat, bitsPerPixel, 1, -1, 1, CodecsSavePageMode.Append);
            }
        }
        finally
        {
            // Marks the end of the loads for the source file, even if a page fails
            codecs.StopOptimizedLoad();
        }
    }
}
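
Because the listing saves every page with CodecsSavePageMode.Append, running it again against an output file left over from an earlier run would add the new pages to the old ones. A minimal sketch of a wrapper that clears the stale file first follows. `ConversionRunner` and `ConvertFresh` are hypothetical names, not part of LEADTOOLS, and the converter is passed in as a delegate so the sketch compiles on its own.

```csharp
using System;
using System.IO;

public static class ConversionRunner
{
    // Hypothetical helper: removes a stale output file, then invokes the converter.
    public static void ConvertFresh(
        string inputFile,
        string outputFile,
        Action<string, string> convert)
    {
        if (convert == null)
            throw new ArgumentNullException(nameof(convert));

        // With append-mode saving, pages from an earlier run would otherwise stay in the file.
        if (File.Exists(outputFile))
            File.Delete(outputFile);

        convert(inputFile, outputFile);
    }
}
```

From inside the class that holds the method above, a call might look like `ConversionRunner.ConvertFresh(@"C:\Docs\report.pdf", @"C:\Docs\report.tif", (input, output) => ConvertDocumentToImage(input, output, RasterImageFormat.Tif, 24));`, where both paths are placeholders.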

 

Git the Code

A Visual Studio 2017 Windows console project that includes the sample code above, set up to convert documents to TIFF, is available on GitHub to clone, fork, or download.

 

 

 
