Convert PDF to DOC / DOCX in C#, VB, and Java

There are many reasons why you might want to convert to or from a PDF document. Perhaps you need to make a PDF editable or text searchable. Or maybe you would prefer a PDF to be a DOC or DOCX file, but you don’t want to copy and paste the text or lose the original formatting. You might even need to batch convert a ton of PDFs in the same way. Good news! The LEADTOOLS Document Converter SDK is an easy-to-integrate tool that can handle all of your document and image conversion needs.

Behind the scenes, the Document Converter uses artificial intelligence to select the right combination of LEADTOOLS Raster, SVG, OCR, and Document Writer technologies to convert images and documents with both speed and precision.

Over the next few weeks, we will demonstrate the versatility of our document conversion technology by showing you how to convert to and from specific formats. Today’s blog focuses on how to convert PDFs to Word formats (DOC/DOCX).

Convert PDF to DOC or DOCX in C#

Easily convert PDFs to Word (DOC/DOCX) documents with LEADTOOLS. The following example is in C#, but LEADTOOLS supports several other languages, including Java and VB. First, add the using directives for the LEADTOOLS namespaces, plus System and System.IO for the console and file-system calls.

using System;
using System.IO;
using Leadtools;
using Leadtools.Codecs;
using Leadtools.Document.Converter;
using Leadtools.Document.Writer;
using Leadtools.Ocr;

Then, initialize the Document Converter.

namespace Convert_Files_with_Document_Converter 
{ 
 class Program 
 { 
  static void Main(string[] args) 
  { 
   string directory = @"C:\InputFileDirectory"; 
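   // SetLicense() is not shown here; it is assumed to be your own helper that registers the LEADTOOLS license
   SetLicense();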
   using (IOcrEngine ocrEngine = OcrEngineManager.CreateEngine(OcrEngineType.LEAD)) 
   using (DocumentConverter docConverter = new DocumentConverter()) 
   { 
    ocrEngine.Startup(null, null, null, @"C:\LEADTOOLS21\Bin\Common\OcrLEADRuntime"); 
    docConverter.SetOcrEngineInstance(ocrEngine, false); 
    // Pass DocumentFormat.Docx for DOCX output, or any other DocumentFormat value as needed
    ConvertToDocument(directory, docConverter, DocumentFormat.Doc); 
   } 
  } 
 } 
} 

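Finally, add the following method to the Program class. It converts every PDF in the input directory and writes the results to a "Converted" subfolder.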

static void ConvertToDocument(string directory, DocumentConverter docConverter, DocumentFormat docFormat) 
{ 
 string[] files = Directory.GetFiles(directory, "*.pdf"); 
 string outputDir = Path.Combine(directory, "Converted"); 
 if (!Directory.Exists(outputDir)) 
  Directory.CreateDirectory(outputDir); 
 foreach (string file in files) 
 { 
  Console.WriteLine($"Converting {file}..."); 
  string fileName = Path.GetFileNameWithoutExtension(file); 
  string ext = DocumentWriter.GetFormatFileExtension(docFormat); 
  string outFile = Path.Combine(outputDir, $"{fileName}.{ext}"); 
  DocumentConverterJobData jobData = DocumentConverterJobs.CreateJobData(file, outFile, docFormat); 
  jobData.JobName = "Convert to Document Job";
  DocumentConverterJob job = docConverter.Jobs.CreateJob(jobData); 
  docConverter.Jobs.RunJob(job); 
  if (job.Errors.Count > 0) 
   foreach (var error in job.Errors) 
    Console.WriteLine($"Error during conversion: {error.Error.Message}\n"); 
  else 
   Console.WriteLine($"Successfully converted {file} to {outFile}\n");
 } 
} 

For more information on this example, check out our full tutorial on the LEADTOOLS C# Document Converter.

Convert PDF to DOC or DOCX in Visual Basic

LEADTOOLS also supports Visual Basic .NET applications. Simply use a function like the one below to integrate LEADTOOLS Document Converter into your application.

Public Sub DocumentConverterExample() 
 Using documentConverter As New DocumentConverter() 
  Dim inFile As String = Path.Combine(ImagesPath.Path, "Leadtools.pdf") 
  Dim outFile As String = Path.Combine(ImagesPath.Path, "output.doc") 
  Dim format As DocumentFormat = DocumentFormat.Doc 
  Dim jobData As DocumentConverterJobData = DocumentConverterJobs.CreateJobData(inFile, outFile, format) 
  jobData.JobName = "conversion job" 
  Dim job As DocumentConverterJob = documentConverter.Jobs.CreateJob(jobData) 
  documentConverter.Jobs.RunJob(job) 
  If job.Status = DocumentConverterJobStatus.Success Then 
   Console.WriteLine("Success") 
  Else 
   Console.WriteLine("{0} Errors", job.Status) 
   For Each errorItem As DocumentConverterJobError In job.Errors 
    Console.WriteLine("  {0} at {1}: {2}", errorItem.Operation, errorItem.InputDocumentPageNumber, errorItem.Error.Message) 
   Next 
  End If 
 End Using 
End Sub

More information about this example can be found in the Document Converter class documentation.

Convert PDF to DOC or DOCX in Java

The LEADTOOLS Java Document Converter follows a similar approach. In Java, the application creates and runs a conversion job with a function like the one below.

static void ConvertToDocument(String inputFile, DocumentConverter docConverter, OcrEngine ocrEngine) 
{ 
 DocumentWriter docWriter = new DocumentWriter(); 
 ocrEngine.startup(new RasterCodecs(), docWriter, null, null); 

 String outputFile = "C:\\LEADTOOLS21\\Resources\\Images\\documentConverter.doc";

 docConverter.setDocumentWriterInstance(docWriter); 
 docConverter.setOcrEngineInstance(ocrEngine, true); 
 DocumentConverterJobData jobData = DocumentConverterJobs.createJobData(inputFile, outputFile, DocumentFormat.DOC);
 jobData.setJobName("DocumentConversion"); 

 DocumentConverterJob job = docConverter.getJobs().createJob(jobData); 
 docConverter.getJobs().runJob(job); 

 if (job.getErrors().size() > 0) 
  for (DocumentConverterJobError error : job.getErrors()) 
   System.out.println("\nError during conversion: " + error.getError().getMessage()); 
 else 
  System.out.println("Successfully converted file to " + outputFile); 
}
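
The Java function above converts a single file, while the C# example walks a whole directory and writes into a "Converted" subfolder. A minimal sketch of that batch bookkeeping in plain Java might look like the following. It uses only the standard library; the BatchPaths class and its listPdfs and outputPathFor helpers are hypothetical names made up for this illustration and are not part of LEADTOOLS.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper class for illustration only; not part of the LEADTOOLS API.
class BatchPaths {
    // Lists the files directly inside the directory that match "*.pdf", sorted by path.
    public static List<Path> listPdfs(Path directory) throws IOException {
        List<Path> pdfs = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(directory, "*.pdf")) {
            for (Path p : stream) {
                pdfs.add(p);
            }
        }
        Collections.sort(pdfs);
        return pdfs;
    }

    // Maps an input file to <input folder>/Converted/<base name>.<extension>.
    public static Path outputPathFor(Path inputFile, String extension) {
        String name = inputFile.getFileName().toString();
        int dot = name.lastIndexOf('.');
        String baseName = (dot > 0) ? name.substring(0, dot) : name;
        Path parent = inputFile.toAbsolutePath().getParent();
        return parent.resolve("Converted").resolve(baseName + "." + extension);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path pdf : listPdfs(dir)) {
            Path out = outputPathFor(pdf, "doc");
            // Make sure the "Converted" folder exists before any conversion writes to it.
            Files.createDirectories(out.getParent());
            // A real application would hand pdf.toString() and out.toString()
            // to its conversion function at this point.
            System.out.println(pdf + " -> " + out);
        }
    }
}
```

Keeping the path handling separate from the conversion call means the output extension can be switched between "doc" and "docx" in one place, alongside the matching document format.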

For more information on the Java Document Converter, check out our full tutorial on the LEADTOOLS Java Document Converter.

That’s all there is to converting files from PDF to Word formats, but why stop there? Check out all of the LEADTOOLS supported formats.

Download a Free Evaluation

Download the LEADTOOLS SDK for free, straight from our site. This trial is good for 60 days and comes with unlimited chat and email support.

Stay Tuned For More!

Stay tuned for more conversion examples that show how the LEADTOOLS Document Converter fits into workflows that convert PDF files to other document or image formats and back again. Need help in the meantime? Contact our support team for free technical support! For pricing or licensing questions, you can contact our sales team via email or call us at 704-332-5532.
