Convert Images to Searchable PDF with OCR in C#

Posted on 2020-05-19 12:42:24 by Nick Villalobos

PDFs are used virtually everywhere and by everyone these days. Throughout most organizations, PDF documents are vital to business applications and workflows. Many organizations, such as insurance agencies, financial institutions, and legal practices, have standardized their document management systems on PDF because of the format’s portability and versatility.

How a PDF is consumed depends on its type. There are two main types of PDFs: image and searchable. For example, if you save a PDF from a word processor, it will most likely be a searchable PDF, and you can copy and paste the text within the document as you please. On the other hand, if you use a scanner to convert paper to PDF, the result will most likely be an image PDF, and you will not be able to search its text.

Even if you used a scanner to create an image PDF, or were sent one by someone else, there is still a way to make it searchable: optical character recognition (OCR), and OCR is what LEADTOOLS does best! Developers can easily build automated OCR solutions and achieve these image-to-searchable-PDF conversions with as little as five lines of code thanks to LEAD's powerful OCR libraries. These solutions save people and companies their two most valuable resources: time and money.

Categories: PDF

Tags: PDF, .NET

Create and Extract PDF Bookmarks in .NET

Posted on 2020-03-04 11:37:52 by Nick Villalobos

LEADTOOLS will be presenting at the Solutions Showcase at AIIM this year. This will be the second year in a row that we have had the opportunity to present and demonstrate specific LEADTOOLS features. With Version 21 on its way, we want to show off one of the newest additions to the SDK.

This showcase will demonstrate how to automatically redact values found through machine vision and key-value extraction. One goal we have with this feature is to show attendees how it can help offices go paperless and smooth out their day-to-day document needs. Powered by proprietary machine learning algorithms and the patented LEADTOOLS OCR Engine, this automatic redaction will work on all types of input documents, meaning there is no task this feature won’t be able to handle.

Categories: PDF

Protect Personal Information with LEADTOOLS

Posted on 2020-02-14 07:48:24 by Gabriel Smith

The protection of privacy is at the forefront of concerns for many organizations that need to distribute information. One way to protect it is to redact private information from an image or document before releasing it. Recently, a customer asked how to do this with LEADTOOLS.

Using LEADTOOLS, a user would load an image, or a document as an image, use the redaction annotation object to "black out" the text, and then burn the redaction into the image. Once the protected information is redacted, the image can be saved as a text-searchable PDF so that the remaining text can be indexed or searched.

Convert a Word Document to PDF

Posted on 2020-02-14 07:43:23 by Gabriel Smith

Following up on my last post, Convert an Image to Black & White PDF, I have updated the project to include a third method that converts document and vector formats to PDF. This conversion uses SVG as an intermediate format and does not require OCR, which results in perfect text accuracy.

Categories: PDF

Retrieve Digital Signature Information from PDFs Using LEADTOOLS

Posted on 2020-02-07 08:52:00 by Nick Villalobos

Digital signatures are one of the most advanced and secure types of electronic signature. They provide the highest level of assurance and are used to comply with legal and regulatory requirements. Digital signatures use a certificate-based digital ID to authenticate the signer's identity and bind each signature to the document using encryption. With the LEADTOOLS SDK, developers can use the LEADTOOLS PDF libraries to retrieve information about these digital signatures. The .NET and Java PDFSignature class contains the properties of a PDF digital signature.

LEADTOOLS Blog

LEADTOOLS Powered by Apryse, the Market Leading PDF SDK. All Rights Reserved.