LEADTOOLS OCR (Leadtools.Forms.Ocr assembly)
LEAD Technologies, Inc

SaveXml(OcrXmlOutputOptions) Method

Converts the accumulated recognition results stored in the pages of this OCR document and returns them as XML data. Supported in .NET and WinRT.
Syntax
'Declaration
 
Overloads Function SaveXml( _
   ByVal options As OcrXmlOutputOptions _
) As String
'Usage
 
Dim instance As IOcrDocument
Dim options As OcrXmlOutputOptions
Dim value As String
 
value = instance.SaveXml(options)
function Leadtools.Forms.Ocr.IOcrDocument.SaveXml(OcrXmlOutputOptions)( 
   options 
)

Parameters

options
A combination of one or more OcrXmlOutputOptions enumeration members that specify the XML generation options.

Return Value

A System.String object containing the XML data.
Remarks

To save the output document as XML to a disk file or a .NET stream, use IOcrDocument.SaveXml(string fileName, OcrXmlOutputOptions options) and IOcrDocument.SaveXml(Stream stream, OcrXmlOutputOptions options).
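As a rough illustration of the two overloads named above, the following C# sketch saves the same document to a disk file and to an in-memory stream. The class name, method name and xmlFileName parameter are hypothetical and exist only for this sketch; the document passed in is assumed to already contain recognized pages.

```csharp
using System.IO;
using Leadtools.Forms.Ocr;

// Hypothetical helper class for illustration only, not part of LEADTOOLS
static class SaveXmlOverloadsSketch
{
   // ocrDocument is assumed to already contain recognized pages
   public static byte[] SaveToFileAndStream(IOcrDocument ocrDocument, string xmlFileName)
   {
      // Save the XML data to a disk file
      ocrDocument.SaveXml(xmlFileName, OcrXmlOutputOptions.None);

      // Save the same XML data to a .NET stream
      using (MemoryStream stream = new MemoryStream())
      {
         ocrDocument.SaveXml(stream, OcrXmlOutputOptions.None);

         // ToArray returns the bytes written so far and works even if the stream has been closed
         return stream.ToArray();
      }
   }
}
```

The sketch returns the raw bytes rather than rewinding the stream, because this topic does not state whether the stream overload leaves the stream open.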

Each IOcrPage object in the Pages collection of this IOcrDocument object holds its recognition data internally. This data is used by this method to generate the final output document.

A typical OCR operation using IOcrEngine involves starting up the engine, creating a new IOcrDocument object with the IOcrDocumentManager.CreateDocument method, adding pages to it, and performing either automatic or manual zoning. Once this is done, call the IOcrPage.Recognize method of each page to collect the recognition data and store it internally in the page. After the recognition data is collected, use the various IOcrDocument.Save methods to save the document to its final format, or IOcrDocument.SaveXml to save it as XML.

You can also use the IOcrPage.RecognizeText method to recognize and return the recognition data as a simple System.String object.

You can call IOcrDocument.SaveXml as many times as required. You can also continue to add and recognize pages (through the IOcrPage.Recognize method) after you save the document.

For each IOcrPage that has not been recognized (the user did not call Recognize and the page's IOcrPage.IsRecognized value is still false), the IOcrDocument inserts an empty page into the final document.
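To avoid empty pages in the output, each page's IsRecognized value can be checked before saving. The C# sketch below is one possible way to do this. The class and method names are hypothetical, and it assumes the Pages collection can be enumerated with foreach.

```csharp
using Leadtools.Forms.Ocr;

// Hypothetical helper class for illustration only, not part of LEADTOOLS
static class RecognizePendingPagesSketch
{
   public static string RecognizeMissingAndSaveXml(IOcrDocument ocrDocument)
   {
      // Assumption: the Pages collection supports foreach enumeration
      foreach (IOcrPage ocrPage in ocrDocument.Pages)
      {
         // Recognize only the pages that have no recognition data yet
         if (!ocrPage.IsRecognized)
            ocrPage.Recognize(null);
      }

      // Every page now holds recognition data, so no empty pages are inserted
      return ocrDocument.SaveXml(OcrXmlOutputOptions.None);
   }
}
```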

To get the low-level recognition data, including the recognized characters and their confidence values, use IOcrPage.GetRecognizedCharacters instead.

The IOcrDocument interface implements System.IDisposable, hence you must dispose the IOcrDocument object as soon as you are finished using it. Disposing an IOcrDocument object will free all the pages stored inside its IOcrDocument.Pages collection.

Example
Visual Basic
Imports System
Imports System.IO
Imports Leadtools.Forms.Ocr

Private Sub SaveAndProcessXmlExample()
      Dim tifFileName As String = Path.Combine(LEAD_VARS.ImagesDir, "Ocr1.tif")
      ' Create an instance of the engine
      Using ocrEngine As IOcrEngine = OcrEngineManager.CreateEngine(OcrEngineType.Professional, False)
         ' Start the engine using default parameters
         ocrEngine.Startup(Nothing, Nothing, Nothing, Nothing)

         ' Create an OCR document
         Using ocrDocument As IOcrDocument = ocrEngine.DocumentManager.CreateDocument()
            ' Add this image to the document
            Dim ocrPage As IOcrPage = ocrDocument.Pages.AddPage(tifFileName, Nothing)

            ' Recognize it
            ocrPage.Recognize(Nothing)

            ' Get the recognition data as XML
            Dim xml As String = ocrDocument.SaveXml(OcrXmlOutputOptions.None)

            ' Process the data by showing all the words
            Using reader As New System.IO.StringReader(xml)
               Dim doc As New System.Xml.XPath.XPathDocument(reader)
               Dim nav As System.Xml.XPath.XPathNavigator = doc.CreateNavigator()

               ' Select all the <word> elements
               Dim iter As System.Xml.XPath.XPathNodeIterator = nav.Select("//word")

               Console.WriteLine("Word found:")
               While iter.MoveNext()
                  Console.WriteLine(iter.Current.Value)
               End While
            End Using
         End Using

         ' Shutdown the engine
         ' Note: calling Dispose will also automatically shutdown the engine if it has been started
         ocrEngine.Shutdown()
      End Using
   End Sub

Public NotInheritable Class LEAD_VARS
   Public Const ImagesDir As String = "C:\Users\Public\Documents\LEADTOOLS Images"
End Class

C#
using System;
using System.IO;
using Leadtools.Forms.Ocr;

private void SaveAndProcessXmlExample()
   {
      string tifFileName = Path.Combine(LEAD_VARS.ImagesDir,"Ocr1.tif");
      // Create an instance of the engine
      using(IOcrEngine ocrEngine = OcrEngineManager.CreateEngine(OcrEngineType.Professional, false))
      {
         // Start the engine using default parameters
         ocrEngine.Startup(null, null, null, null);

         // Create an OCR document
         using(IOcrDocument ocrDocument = ocrEngine.DocumentManager.CreateDocument())
         {
            // Add this image to the document
            IOcrPage ocrPage = ocrDocument.Pages.AddPage(tifFileName, null);

            // Recognize it
            ocrPage.Recognize(null);

            // Get the recognition data as XML
            string xml = ocrDocument.SaveXml(OcrXmlOutputOptions.None);

            // Process the data by showing all the words
            using(System.IO.StringReader reader = new System.IO.StringReader(xml))
            {
               System.Xml.XPath.XPathDocument doc = new System.Xml.XPath.XPathDocument(reader);
               System.Xml.XPath.XPathNavigator nav = doc.CreateNavigator();

               // Select all the <word> elements
               System.Xml.XPath.XPathNodeIterator iter = nav.Select(@"//word");

               Console.WriteLine("Word found:");
               while(iter.MoveNext())
               {
                  Console.WriteLine(iter.Current.Value);
               }
            }
         }

         // Shutdown the engine
         // Note: calling Dispose will also automatically shutdown the engine if it has been started
         ocrEngine.Shutdown();
      }
   }

static class LEAD_VARS
{
   public const string ImagesDir = @"C:\Users\Public\Documents\LEADTOOLS Images";
}
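
C# (WinRT)
using System;
using System.Diagnostics;
using System.Threading.Tasks;
using System.Xml;
using Windows.Storage;
using Leadtools;
using Leadtools.Codecs;
using Leadtools.Forms.Ocr;

// Note: Tools (OcrEnginePath, AppInstallFolder) is a helper class that is not shown in this topic
private async Task SaveAndProcessXmlExample()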
{
   string tifFileName = @"Assets\Ocr1.tif";
   // Create an instance of the engine
   IOcrEngine ocrEngine = OcrEngineManager.CreateEngine(OcrEngineType.Advantage, false);

   // Start the engine using default parameters
   ocrEngine.Startup(null, null, String.Empty, Tools.OcrEnginePath);

   // Create an OCR document
   IOcrDocument ocrDocument = ocrEngine.DocumentManager.CreateDocument();

   // Add this image to the document
   IOcrPage ocrPage = null;
   using (RasterCodecs codecs = new RasterCodecs())
   {
      StorageFile loadFile = await Tools.AppInstallFolder.GetFileAsync(tifFileName);
      using (RasterImage image = await codecs.LoadAsync(LeadStreamFactory.Create(loadFile)))
         ocrPage = ocrDocument.Pages.AddPage(image, null);
   }

   // Recognize it
   ocrPage.Recognize(null);

   // Get the recognition data as XML
   string xml = ocrDocument.SaveXml(OcrXmlOutputOptions.None);

   // Process the data by showing all the words
   using (System.IO.StringReader reader = new System.IO.StringReader(xml))
   {
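      System.Xml.XmlReader xmlReader = System.Xml.XmlReader.Create(reader);

      xmlReader.MoveToContent();
      while (!xmlReader.EOF)
      {
         // XmlReader.Name is the element name itself ("word"), not an XPath expression
         if (xmlReader.NodeType == XmlNodeType.Element && xmlReader.Name == "word")
         {
            // Value is empty on an element node, so read the element's text content instead.
            // This assumes <word> elements contain only text. The call also advances
            // the reader past the closing </word> tag, so Read is not called here.
            Debug.WriteLine(xmlReader.ReadElementContentAsString());
         }
         else
         {
            xmlReader.Read();
         }
      }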
   }

   // Shutdown the engine
   ocrEngine.Shutdown();
}
Requirements

Target Platforms: Windows 7, Windows Vista SP1 or later, Windows XP SP3, Windows Server 2008 (Server Core not supported), Windows Server 2008 R2 (Server Core supported with SP1 or later), Windows Server 2003 SP2

See Also

Reference

IOcrDocument Interface
IOcrDocument Members
Overload List
Leadtools.Forms.DocumentWriters.DocumentFormat
IOcrDocumentManager Interface
IOcrDocument.Save
IOcrDocument.SaveXml
IOcrPage.Recognize
IOcrEngine Interface
OcrEngineManager Class
OcrEngineType Enumeration
Programming with the LEADTOOLS .NET OCR
Files to be Included with Your Application


© 2006-2012 All Rights Reserved. LEAD Technologies, Inc.

SaveXml(OcrXmlOutputOptions) requires an OCR module license and unlock key. For more information, refer to: Imaging Pro/Document/Medical Features