All methods for uploading a document are in the DocumentFactory class. A document can be uploaded in chunks, which lets the application show a progress bar and optionally allow the user to abort the operation.
The application should first prompt the user for the location of the document and then obtain a file stream object with read access to the original file. At that point, the following actions are possible:
Create an instance of UploadDocumentOptions, populate it with any optional values, and then call the BeginUpload method. This method returns a temporary URL object created by DocumentFactory to identify this document in subsequent calls. This is a custom URL value that uses the following scheme:
leadcache://unique_guid_identifier. It is not meant to be used by anything other than the rest of the upload methods. The application must save this value in a local variable at this point.
Read a chunk from the source file into a byte array. The size of this chunk is up to the user: the larger the chunk, the shorter the whole upload operation, but too large a chunk might throttle the server connection. A value of 64 KB is a good minimum. Once the chunk is read, call the DocumentFactory.UploadDocument method, passing the URL value obtained from BeginUpload and the chunk of data.
Repeat until the file has been read and uploaded, then call EndUpload.
External annotations data can also be uploaded to the document at this time if desired. Call UploadAnnotations as many times as needed to upload the data in the same manner.
When uploading is finished, call the LoadFromUri or LoadFromCache method in the same way as when loading a document from a remote URL. The DocumentFactory class checks the value of the URL, identifies it as an uploaded document that has no physical file, and parses the document from the uploaded data instead. The data is stored in the cache and is disposed when the LEADDocument object is disposed and its cache items have expired.
At any point during the upload, DocumentFactory.AbortUploadDocument can be called to abort the operation. DocumentFactory then immediately deletes the data uploaded so far from the cache.
For an example, refer to DocumentFactory.BeginUpload.
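The begin, upload, end, and abort sequence described above can be sketched as a plain chunked-read loop. This is a minimal illustration, not the LEADTOOLS API: IUploadTarget, InMemoryUploadTarget, ChunkedUploader, and the upload:// URI are hypothetical stand-ins for the DocumentFactory methods, chosen so the sketch runs without the toolkit.

```csharp
using System;
using System.IO;

// Hypothetical stand-in for the DocumentFactory upload methods (not the real API).
public interface IUploadTarget
{
    Uri BeginUpload();
    void UploadChunk(Uri uploadUri, byte[] buffer, int offset, int length);
    void EndUpload(Uri uploadUri);
    void AbortUpload(Uri uploadUri);
}

// In-memory target used only to exercise the sketch.
public sealed class InMemoryUploadTarget : IUploadTarget
{
    private readonly MemoryStream _data = new MemoryStream();
    public int ChunkCount { get; private set; }
    public bool Completed { get; private set; }
    public bool Aborted { get; private set; }
    public long Length => _data.Length;

    // Hypothetical URI format; the real factory returns a leadcache:// value.
    public Uri BeginUpload() => new Uri("upload://session/" + Guid.NewGuid().ToString("N"));

    public void UploadChunk(Uri uploadUri, byte[] buffer, int offset, int length)
    {
        _data.Write(buffer, offset, length);
        ChunkCount++;
    }

    public void EndUpload(Uri uploadUri) => Completed = true;

    public void AbortUpload(Uri uploadUri)
    {
        _data.SetLength(0); // discard everything uploaded so far
        Aborted = true;
    }
}

public static class ChunkedUploader
{
    // 64 KB, the minimum chunk size suggested in the text.
    public const int DefaultChunkSize = 64 * 1024;

    // Uploads source in chunks. onProgress receives the running byte count;
    // shouldCancel is polled before each chunk. Returns null if the user cancelled.
    public static Uri Upload(Stream source, IUploadTarget target, int chunkSize,
                             Action<long> onProgress, Func<bool> shouldCancel)
    {
        Uri uploadUri = target.BeginUpload(); // keep this value for all later calls
        byte[] buffer = new byte[chunkSize];
        long total = 0;
        bool cancelled = false;
        try
        {
            int read;
            while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
            {
                if (shouldCancel != null && shouldCancel()) { cancelled = true; break; }
                target.UploadChunk(uploadUri, buffer, 0, read);
                total += read;
                onProgress?.Invoke(total);
            }
            if (!cancelled) target.EndUpload(uploadUri);
        }
        catch
        {
            target.AbortUpload(uploadUri); // clean up partial data on failure
            throw;
        }
        if (cancelled) { target.AbortUpload(uploadUri); return null; }
        return uploadUri;
    }
}
```

In a real application, the interface methods would presumably forward to BeginUpload, UploadDocument, EndUpload, and AbortUploadDocument, and the progress callback would drive the progress bar.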
During the upload process in UploadDocument, DocumentFactory saves each chunk into the cache. Therefore, the latest document data is available to all users of the system. If the system encounters an error and the process restarts, the upload operation can be resumed and uploading continues. This also ensures that a minimum amount of memory is needed when uploading large documents (only one chunk is held in memory at any time). This is the behavior when UploadDocumentOptions.EnableStreaming is false, the default value.
If EnableStreaming is set to true, DocumentFactory does not save the chunk data into the cache during document upload in UploadDocument. Instead, it creates an internal stream in memory and appends the chunks to it as they arrive. When EndUpload is called, the factory stores the data from the stream into the cache all at once. This may speed up uploading of a document, but at the expense of more memory being used (the whole document's data will be in memory). Note that in this mode an upload operation cannot be resumed if the system encounters an error and the process restarts. Therefore, it is recommended not to set EnableStreaming to true unless the system is designed for single-process or single-user operation.
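The trade-off between the two modes can be illustrated with a small simulation. This is only a sketch of the behavior described above, not how DocumentFactory is implemented: FakeCache and UploadSession are hypothetical names, and the cache is a plain dictionary.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical cache keyed by upload id; stands in for the real cache.
public sealed class FakeCache
{
    private readonly Dictionary<string, MemoryStream> _items = new Dictionary<string, MemoryStream>();
    public int WriteCount { get; private set; }

    public void Append(string key, byte[] data, int offset, int length)
    {
        if (!_items.TryGetValue(key, out MemoryStream item))
            _items[key] = item = new MemoryStream();
        item.Write(data, offset, length);
        WriteCount++;
    }

    public long LengthOf(string key) =>
        _items.TryGetValue(key, out MemoryStream item) ? item.Length : 0;
}

// Hypothetical upload session showing both storage strategies.
public sealed class UploadSession
{
    private readonly FakeCache _cache;
    private readonly string _key;
    private readonly MemoryStream _pending; // non-null only when streaming is enabled

    public UploadSession(FakeCache cache, string key, bool enableStreaming)
    {
        _cache = cache;
        _key = key;
        _pending = enableStreaming ? new MemoryStream() : null;
    }

    public void AddChunk(byte[] data, int offset, int length)
    {
        if (_pending != null)
            _pending.Write(data, offset, length);        // streaming: buffer in memory only
        else
            _cache.Append(_key, data, offset, length);   // default: persist each chunk immediately
    }

    public void End()
    {
        if (_pending == null) return;                    // default mode: nothing left to flush
        byte[] all = _pending.ToArray();
        _cache.Append(_key, all, 0, all.Length);         // streaming: one write at the end
    }
}
```

With streaming disabled, the cache already holds every chunk received before End is called, so a restarted process can pick up where it left off. With streaming enabled, the cache stays empty until End, so losing the session object loses the upload.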
Post upload operations can be performed on a document after it has been uploaded to the cache and before it is first loaded. These operations and their optional values can be set in the UploadDocumentOptions.PostUploadOperations dictionary.
The operations are performed when EndUpload is called.
The current version of LEADTOOLS contains support for the following post upload operations:
Uploaded PDF files can be checked for linearization and converted upon upload using the following:
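// Automatically linearize (optimize for fast web viewing) PDF files that are larger than 1 MB.
byte[] pdfData = ...; // PDF data to upload
string pdfPassword = null; // If the PDF is encrypted, set its password here
var uploadDocumentOptions = new UploadDocumentOptions();
uploadDocumentOptions.Cache = cache;
const int minimumLengthInBytes = 1024 * 1024;
// Request linearization for files above the threshold.
// Assumption: the operation key is LEADDocument.PostUploadOperation_LinearizePdf; verify the name against the LEADDocument reference.
uploadDocumentOptions.PostUploadOperations.Add(LEADDocument.PostUploadOperation_LinearizePdf, minimumLengthInBytes.ToString());
// The factory will not perform this operation unless we set the correct mime type:
uploadDocumentOptions.MimeType = "application/pdf";
uploadDocumentOptions.Password = pdfPassword;
// Now upload
Uri documentUri = DocumentFactory.BeginUpload(uploadDocumentOptions);
DocumentFactory.UploadDocument(cache, documentUri, pdfData, 0, pdfData.Length);
// Post-upload operations are performed when EndUpload is called
DocumentFactory.EndUpload(cache, documentUri);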
The factory performs the following actions on
If the PDF is encrypted, then a password is required to linearize it. This can be passed in UploadDocumentOptions.Password as shown above.
The metadata of LEADDocument can be examined to determine if this is a PDF document with linearized data using the following:
LEADDocument document = DocumentFactory.LoadFromCache(loadFromCacheOptions);
bool isLinearized = false;
// Check if the metadata contains the key
DocumentMetadata metadata = document.Metadata;
if (metadata.ContainsKey(LEADDocument.MetadataKey_IsLinearized))
   isLinearized = bool.Parse(metadata[LEADDocument.MetadataKey_IsLinearized]);
// Perform additional actions