Mastering the OCR for PDF Documents

In this article we take a closer look at how PDF documents are created and edited. As we mentioned in the “OCR: Make Your Documents Text-Searchable” post, the OCR process runs in multiple threads. The number of threads equals the number of CPU cores, and each thread processes one page at a time. The time OCR takes for a single page depends on several factors, such as the page content, the CPU model, and how heavily other applications are using the CPU.
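
The threading scheme described above can be illustrated with a minimal Python sketch. This is not TaxWorkFlow's actual code: the names `recognize_page` and `ocr_document` are hypothetical, and the recognizer is a placeholder that only upper-cases its input so the example stays self-contained.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def recognize_page(page):
    """Hypothetical stand-in for a real OCR call; returns the page text."""
    return page.upper()


def ocr_document(pages, recognize=recognize_page):
    """Run `recognize` over all pages using one worker thread per CPU core."""
    workers = os.cpu_count() or 1  # cpu_count() can return None
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each worker takes one page at a time; map() preserves page order.
        return list(pool.map(recognize, pages))


if __name__ == "__main__":
    print(ocr_document(["page one", "page two", "page three"]))
```

With a pool sized this way, a document with fewer pages than CPU cores leaves some threads idle, while a longer document is worked through in page order as threads become free. In CPython the threads only run truly in parallel if the recognizer does its heavy work outside the interpreter lock, which this sketch simply assumes.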

OCR: Make Your Documents Text-Searchable

Optical character recognition (OCR) is now available in TaxWorkFlow. This tool lets you convert scanned paper records to text. The technology has been developed and refined for more than 30 years, and today it works perfectly with electronic documents. You can read more about it on Wikipedia.

So what are the benefits of this technology?

