New White Paper: Using LEADTOOLS PDF File Features to Enhance Google Drive Search

Posted on 2013-10-01 12:45:31 by Greg

In one of our first white papers, we wrote about how to use LEADTOOLS OCR to modify and improve your Google Drive search. Today we're publishing part two of that series and showing how to use the LEADTOOLS PDF SDK to improve Google Drive in a similar fashion.

The problem this white paper addresses is that Google Drive's search features do not look at the keywords metadata within PDF files. Many people rely on these keywords to catalog and keep track of their documents, so that information loses its usefulness once the files are synced to Google Drive. Click the link below to read this new white paper:

Continue Reading...
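
As a rough illustration of the keyword problem described above, here is a minimal Python sketch that turns a PDF's Keywords metadata string into plain text suitable for a search index. It is not code from the white paper and uses no LEADTOOLS or Google calls. The function name and the assumption that keywords are separated by commas or semicolons are ours.

```python
import re


def keywords_to_indexable_text(keywords_field: str) -> str:
    """Turn a PDF Keywords metadata string into a space-separated search string.

    Hypothetical helper: assumes keywords are separated by commas or semicolons.
    """
    parts = [p.strip() for p in re.split(r"[;,]", keywords_field)]
    seen = set()
    unique = []
    for part in parts:
        key = part.lower()
        # Skip empty entries and case-insensitive duplicates, keeping first spelling.
        if part and key not in seen:
            seen.add(key)
            unique.append(part)
    return " ".join(unique)


print(keywords_to_indexable_text("invoice, 2013; Acme Corp, Invoice"))
# prints "invoice 2013 Acme Corp"
```

The resulting string could then be attached to the file as searchable text when it is uploaded.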

New White Paper: Using LEADTOOLS OCR to Enhance Google Drive Search

Posted on 2013-05-31 09:38:22 by Greg

Back by popular demand are LEAD white papers! There are many ways that we here at LEAD get the word out on how to use LEADTOOLS to create real-world solutions, including this blog, CodeProject articles, support forums, code tips and the like. Lately we have had some requests for more in-depth examples and solutions using LEADTOOLS, which are best suited to a white paper format. So without further ado, here's a description of the first white paper in the upcoming series:

Using LEADTOOLS OCR to Enhance Google Drive Search
Google Drive is a wonderful service for storing, organizing and sharing files such as documents, photos and videos. However, TIFF and other raster image files can easily get lost because Drive's search cannot see the text inside them. Developers can use the LEADTOOLS OCR SDK to extract that text and then add it to the IndexableTextData for each item.
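
To make the idea concrete, here is a minimal Python sketch of the second half of that workflow: packaging already-recognized text into file metadata. It is not taken from the white paper. The OCR step is left out, so `ocr_text` is simply a string from whatever engine produced it. The `title` and `indexableText.text` field names follow the Drive API v2 file resource as we understand it, and `max_bytes` is a hypothetical safety limit rather than a documented value.

```python
def build_drive_metadata(title: str, ocr_text: str, max_bytes: int = 100_000) -> dict:
    """Build a metadata body that carries OCR text as indexable text.

    Assumptions: field names mirror the Drive API v2 file resource;
    max_bytes is an illustrative limit, not an official one.
    """
    # Collapse runs of whitespace and line breaks left over from recognition.
    text = " ".join(ocr_text.split())
    # Truncate by UTF-8 byte length, dropping any partially cut character.
    text = text.encode("utf-8")[:max_bytes].decode("utf-8", errors="ignore")
    return {"title": title, "indexableText": {"text": text}}


body = build_drive_metadata("scan.tif", "Total  due:\n 42 USD")
print(body["indexableText"]["text"])
# prints "Total due: 42 USD"
```

The returned dictionary would be sent along with the file upload or update request, after which the image should be findable by the words it contains.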

Continue Reading...

Native PDF Annotations in Version 18

Posted on 2013-03-27 16:53:23 by Greg

PDF is by far one of the most widely used and adopted file formats in the world. However, support is lacking among most SDKs, which generally only implement basic loading as raster and saving searchable text with OCR. Since version 17.5, LEADTOOLS has supported advanced PDF capabilities such as extraction of text, hyperlinks, bookmarks and metadata as well as updating, splitting and merging pages from existing PDF documents. In version 18, the LEADTOOLS PDF SDK is even more powerful and comprehensive with its new and improved support for native PDF annotations and drawing markups.

Native annotations are an important feature in document imaging because they allow users to communicate with each other by writing comments and drawing shapes on top of the document without making any permanent changes. Here are some of the most important PDF annotation features in LEADTOOLS Version 18:

Continue Reading...

New Video Presenting OCR SDK Updates in Version 18

Posted on 2013-03-19 14:51:56 by Greg

Our marketing department's media team has produced a new video highlighting the new and improved features within the LEADTOOLS OCR Engine in Version 18, including Faster Recognition, Zonal and Full Page Recognition, Image Over Text PDF, Tables and Cells, Dithered Text and Underlined Text Recognition. Take a couple of minutes to learn about these exciting features, and we hope you enjoy the video!

We plan on releasing more videos just like this one covering other major feature sets like HTML5, PACS, DICOM, Barcode and more. As always, we welcome any feedback on our new videos and would love to hear what you'd like to see next.

Continue Reading...

Enhanced OCR Noise Removal Coming Soon

Posted on 2013-03-12 17:02:22 by Greg

While making my rounds through the engineering department, the OCR team showed me some really impressive enhancements coming soon to the Advantage OCR Engine. They've accomplished a lot, but my personal favorite is what they've done to the engine's preprocessing. With much sweat, tears and coffee, they've fine-tuned the noise removal algorithm with impressive results. Other engines may have difficulty seeing between the lines (literally) when forms and documents use separator bars or boxes for individual characters. The LEADTOOLS Advantage OCR Engine does a superb job of intelligently removing the noise and returning only the text of interest, rather than getting hung up on bars, dashes, speckles and other types of noise that should simply be ignored.

Beyond the obvious benefit of improved accuracy, this is especially helpful for customers using forms recognition, where character separators are prevalent. On documents where it might previously have been necessary to use a separate zone for each character and piece the results together after recognition, only a single zone is now needed, since the separator bars and cells are no longer taken into account.

Continue Reading...
LEADTOOLS Blog
