The LEADTOOLS Distributed Computing SDK is a complete job-processing framework that developers use to create powerful distributed applications. Using an existing network of servers and worker computers, whether on premises, in the cloud, or a combination of both, developers can build distributed, grid, or parallel computing applications such as:
- Optical Character Recognition (OCR)
- Barcode Recognition
- Forms Recognition and Processing
- Audio/video conversion and recompression
- Distributed graphics rendering
- Web crawlers
An application built with the LEADTOOLS Distributed Computing SDK can save significant time and money by using existing infrastructure to eliminate bottlenecks in processor-intensive, business-critical activities.
Overview of LEADTOOLS Distributed Computing SDK Technology
- Framework for creating distributed applications for any technology, not just imaging
- Seamlessly integrates with key LEADTOOLS features such as OCR, Barcode, Forms Recognition, Audio/Video Transcoding and Conversion, and Image Processing
- Use any computer as a worker to handle a portion of the overall job
- Worker configuration options include job type, CPU load, time of day and number of jobs
- Create an application that clients can access from any platform
- Includes demonstration applications with source code for OCR and Multimedia Conversion
Framework Components of the LEADTOOLS Distributed Computing SDK
The LEADTOOLS Distributed Computing SDK framework can be broken down into three individual components described below.
Clients can be any type of computer or mobile device. Communication between clients and the central server is facilitated with standard web services.
The central server acts as the primary interface between clients and worker machines. The primary responsibilities of the central server include:
- Host the web service used to communicate with clients.
- Manage worker machine settings such as job type, number of jobs, etc. Storing these settings in a central location makes it simple for administrators to make global changes within the application regardless of where worker machines are physically located.
- Manage the database that stores all jobs and the information related to each job.
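The job database described above can be pictured with a small sketch. This is not the SDK's actual schema or API: the table layout, column names, and the functions `open_store`, `add_job`, `claim_next_job`, and `finish_job` are all hypothetical, and SQLite stands in for whatever database a real deployment would use.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
SCHEMA = """
CREATE TABLE IF NOT EXISTS jobs (
    id       INTEGER PRIMARY KEY AUTOINCREMENT,
    job_type TEXT NOT NULL,
    payload  TEXT NOT NULL,
    status   TEXT NOT NULL DEFAULT 'pending',
    worker   TEXT
)
"""


def open_store(path=":memory:"):
    """Open (or create) the job store and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def add_job(conn, job_type, payload):
    """Record a new job submitted by a client and return its id."""
    cur = conn.execute(
        "INSERT INTO jobs (job_type, payload) VALUES (?, ?)",
        (job_type, payload),
    )
    conn.commit()
    return cur.lastrowid


def claim_next_job(conn, worker, job_type):
    """Hand the oldest pending job of one type to a worker, or return None.

    A multi-connection server would need to wrap the select and update
    in a single transaction; this sketch assumes one connection.
    """
    row = conn.execute(
        "SELECT id, payload FROM jobs "
        "WHERE status = 'pending' AND job_type = ? ORDER BY id LIMIT 1",
        (job_type,),
    ).fetchone()
    if row is None:
        return None
    conn.execute(
        "UPDATE jobs SET status = 'running', worker = ? WHERE id = ?",
        (worker, row[0]),
    )
    conn.commit()
    return row


def finish_job(conn, job_id):
    """Mark a job as completed."""
    conn.execute("UPDATE jobs SET status = 'done' WHERE id = ?", (job_id,))
    conn.commit()
```

Keeping job state in one place is what lets the server reassign work and report progress to clients without asking each worker directly.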
The worker machines perform the work of the distributed application. There is no limit to the number of worker machines used within the application, and workers can be hot-plugged or hot-swapped as needed without interrupting service for the clients. Each worker can take on as much or as little work as needed using customizable configuration settings such as:
- Job Type
- Maximum percent of CPU usage
- Number of CPU Cores
- Number of threads
- Number of jobs
- Time of day
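As a rough illustration of how settings like these might gate the work a machine accepts, the sketch below models a subset of them (job type, CPU usage, number of jobs, and time of day). The `WorkerSettings` class and its field names are hypothetical and are not taken from the SDK.

```python
from dataclasses import dataclass
from datetime import time


@dataclass
class WorkerSettings:
    """Hypothetical per-worker limits; names are illustrative only."""
    job_types: frozenset      # kinds of jobs this worker will take
    max_cpu_percent: float    # refuse new work at or above this CPU load
    max_jobs: int             # maximum number of concurrent jobs
    start: time               # start of the allowed time-of-day window
    end: time                 # end of the allowed window (exclusive)

    def _in_window(self, now):
        if self.start <= self.end:
            return self.start <= now < self.end
        # The window wraps past midnight, e.g. 22:00 to 06:00.
        return now >= self.start or now < self.end

    def can_accept(self, job_type, cpu_percent, running_jobs, now):
        """Return True only if every configured limit allows a new job."""
        return (
            job_type in self.job_types
            and cpu_percent < self.max_cpu_percent
            and running_jobs < self.max_jobs
            and self._in_window(now)
        )


# Example: an office desktop that only takes OCR jobs overnight.
overnight = WorkerSettings(frozenset({"ocr"}), 50.0, 2, time(22, 0), time(6, 0))
```

With settings like `overnight`, a desktop would stay free for its user during the day and join the farm only after hours.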
Benefits of Distributed Applications in the Cloud
The benefits of parallel processing are well established, but even the most advanced and powerful computers will encounter bottlenecks. For example, a computer with eight cores performing OCR on a 100-page document can only process up to eight pages at a time. Utilizing the cloud, it is possible to OCR and convert that same document in virtually the same time it takes to OCR and convert a single page, given enough worker machines on the network.
Additionally, older hardware and less powerful devices such as mobile phones can utilize cloud applications to accomplish tasks with the same speed and efficiency as a high-end server.
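The eight-core example above can be worked through with a little arithmetic. The sketch below is a simplified model: it assumes each core handles one page at a time, every page takes the same time, and network and scheduling overhead are ignored. The function name `processing_rounds` is made up for illustration.

```python
import math


def processing_rounds(pages, workers, cores_per_worker):
    """Number of page-sized time slices needed to finish a document.

    Simplified model: one page per core at a time, equal time per page,
    no transfer or scheduling overhead.
    """
    slots = workers * cores_per_worker
    return math.ceil(pages / slots)


# One eight-core machine: ceil(100 / 8) = 13 rounds of pages.
single_machine = processing_rounds(100, workers=1, cores_per_worker=8)

# Thirteen eight-core workers give 104 slots, so every page runs at once.
worker_farm = processing_rounds(100, workers=13, cores_per_worker=8)

print(single_machine, worker_farm)  # prints "13 1"
```

Under these assumptions, the farm finishes the document in about the time one page takes, while the single machine needs about thirteen times as long.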
The cloud provides huge potential for significant cost savings. Server hardware is significantly more expensive than desktop PCs.
- Avoid expensive server hardware - Several inexpensive desktops can be purchased and linked together to provide the same or better speed than a pricey server.
- More efficient use of existing hardware - Computers utilizing minimal CPU capacity can be used as worker machines without any noticeable performance degradation to the users' regular tasks.
- Lower hardware specifications - Clients connecting to the cloud require less processing power which means each user's machine is cheaper and has a longer lifetime.
- Inexpensive upgrades - Worker machines can be added and removed at any time.
Additionally, the LEADTOOLS Distributed Computing SDK makes it easy for developers to create the cloud for themselves, eliminating the need for third-party cloud computing services.
Distributed applications running across multiple machines are more dependable and easier to maintain than their counterparts running on a single machine. A well-designed cloud application will suffer no downtime during planned maintenance, hardware failures, virus infections, or even power outages and natural disasters. As long as there are still machines within the cloud, there is no disruption of service to the client. The central server and worker machines implement fail-safe measures that retry, restart and redistribute jobs.
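The retry-and-redistribute idea can be sketched in a few lines. This is only an illustration of the general technique, not the SDK's fail-safe logic: `run_with_failover` is a hypothetical function, workers are modeled as plain callables, and any exception is treated as a sign that the worker has gone offline.

```python
from collections import deque


def run_with_failover(jobs, workers, max_attempts=3):
    """Run each job on some healthy worker, requeueing it on failure.

    jobs:    list of hashable job payloads
    workers: dict mapping a worker name to a callable that may raise
    """
    pending = deque((job, 0) for job in jobs)
    healthy = list(workers)  # names of workers still considered alive
    results = {}
    turn = 0
    while pending:
        if not healthy:
            raise RuntimeError("no healthy workers left")
        job, attempts = pending.popleft()
        name = healthy[turn % len(healthy)]  # simple round-robin choice
        turn += 1
        try:
            results[job] = workers[name](job)
        except Exception:
            # Simplification: any error takes the worker out of rotation.
            healthy.remove(name)
            if attempts + 1 >= max_attempts:
                raise
            pending.append((job, attempts + 1))  # redistribute the job
    return results


def offline_worker(job):
    raise ConnectionError("worker offline")


def ocr_worker(job):
    return job.upper()  # stand-in for real processing


done = run_with_failover(["a", "b", "c"], {"w1": offline_worker, "w2": ocr_worker})
```

In this run, job `"a"` fails on `w1`, is put back on the queue, and completes on `w2`, so the caller still receives results for all three jobs.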
Since the primary work is done within the cloud, virtually any client is able to access the application. While the actual cloud must be implemented on machines running Windows, the clients can access the cloud using Macs, iPads, mobile devices or anything with a network or Internet connection.
Why Use LEADTOOLS Distributed Computing SDK?
LEADTOOLS Distributed Computing SDK can be used in any scenario where you want to perform some type of automated job processing on a farm of computers. Although this leaves a wide range of opportunities, consider the following scenarios and how the LEADTOOLS Distributed Computing SDK can be used to successfully implement a powerful and dynamic solution:
- My video is too large and takes too long to convert.
Decoding and encoding multimedia files can be both lengthy and processor-intensive. By using a cloud-based service for large multimedia files, the client can use the farm of worker machines to split large files, convert each piece separately, and re-multiplex the pieces in a fraction of the time required to convert the same file on a single machine.
- There are too many documents to process and OCR.
A cloud-based application can divide the workload among a farm of worker machines. Once the document conversion or text extraction is complete, the data can be sent back to the client, archived in a database, or handled in whatever way the application architecture requires. The process is flexible: some workers can be dedicated to document cleanup while others perform the OCR. If the files themselves are large, each document can be broken up into individual pages, processed separately, and pieced back together.
- I don't want to bog down my computer with mindless tasks.
The LEADTOOLS Distributed Computing SDK can be used for any distributed computing application and is ideal for tasks that require little or no user interaction and that spawn other tasks. For example, one can use a farm of worker machines to compute complex mathematical and statistical data or to crawl and index websites.
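The split, process, and reassemble pattern behind the video and OCR scenarios above can be sketched as follows. This is a minimal stand-in rather than SDK code: a local thread pool plays the role of the worker farm, and `recognize_page` and `process_document` are hypothetical names, with a trivial string transformation in place of real OCR.

```python
from concurrent.futures import ThreadPoolExecutor


def recognize_page(page):
    """Stand-in for per-page work such as OCR (here: trim and uppercase)."""
    return page.strip().upper()


def process_document(pages, worker_count=4):
    """Process pages in parallel and join the results in page order."""
    with ThreadPoolExecutor(max_workers=worker_count) as pool:
        # Executor.map returns results in input order, so the pages are
        # reassembled correctly even if they finish out of order.
        texts = list(pool.map(recognize_page, pages))
    return "\n".join(texts)


document = process_document(["  page one ", "page two", " page three"], worker_count=2)
```

The important design point is that each piece carries its position with it, so the final merge step does not depend on which worker finished first.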