How document understanding helps bring order to unstructured data.
Documents are at the centre of many business processes. Scanned pages and PDFs are ubiquitous and contain large amounts of information represented as forms and tables.
Historically, this information could only be analysed and used after manual data re-entry, a process that is slow and prone to error, because traditional optical character recognition (OCR) systems could not analyse such data and preserve its inherent structure in their output.
Document understanding extends document intelligence beyond simple text by supporting the retrieval of structured data. Relying heavily on machine learning, it has proven key to automating structured data extraction and to making the extracted data readily accessible for subsequent processing and analysis.
"Understanding" a document first involves detecting its layout and key elements such as figures, tables, and forms. These elements are then processed separately to extract the underlying data relationships.
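The two-stage idea (detect elements, then process each type separately) can be sketched roughly as follows. This is only an illustration: the `Element` class, the `"table"`/`"form"`/`"figure"` labels, and the processor functions are hypothetical names, and the layout detection step itself is assumed to have already happened.

```python
from dataclasses import dataclass


@dataclass
class Element:
    """A detected layout element (hypothetical structure)."""
    kind: str        # assumed labels: "table", "form", "figure"
    content: object  # raw content handed over by the layout detector


def process_table(rows):
    # Treat the first row as the header and map each remaining row to a dict.
    header, *body = rows
    return [dict(zip(header, row)) for row in body]


def process_form(fields):
    # A form arrives as (key, value) tuples; turn it into key-value pairs.
    return dict(fields)


# One processor per element type; types without an entry are skipped.
PROCESSORS = {"table": process_table, "form": process_form}


def process_document(elements):
    results = []
    for element in elements:
        handler = PROCESSORS.get(element.kind)
        if handler is not None:
            results.append((element.kind, handler(element.content)))
    return results


elements = [
    Element("form", [("First name", "Alice")]),
    Element("table", [["Item", "Qty"], ["Pen", "2"]]),
    Element("figure", None),
]
results = process_document(elements)
print(results)
```

Keeping the processors in a lookup table means a new element type can be supported by adding one function, without touching the dispatch loop.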
Any embedded forms are parsed into sets of key-value pairs, each pair corresponding to a single form field. An example of a key-value pair is "First name"–"Alice". The sets of such linked data items can subsequently be inserted into a database, one row or document per form.
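As a minimal sketch of the "one row per form" idea, the snippet below stores parsed forms in an in-memory SQLite database. The `parsed_forms` data, the `forms` table, and its column names are assumptions made for illustration; a real form parser would supply the key-value pairs.

```python
import sqlite3

# Hypothetical parser output: one dict of key-value pairs per form.
parsed_forms = [
    {"First name": "Alice", "Last name": "Smith"},
    {"First name": "Bob", "Last name": "Jones"},
]


def store_forms(conn, forms):
    """Insert each parsed form as one row; return the number of rows inserted."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS forms (first_name TEXT, last_name TEXT)"
    )
    # Missing fields become NULL rather than raising an error.
    rows = [(form.get("First name"), form.get("Last name")) for form in forms]
    conn.executemany("INSERT INTO forms VALUES (?, ?)", rows)
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
inserted = store_forms(conn, parsed_forms)
print(conn.execute("SELECT first_name, last_name FROM forms").fetchall())
```

A document store would work just as well here: each dict could be inserted as a single document without first mapping keys to columns.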
The easiest way to incorporate document understanding into production workflows is to use existing cloud services. Major cloud providers each offer multiple machine learning-based services which include text and document intelligence. These offerings are summarised in the following table:
Document understanding is a key component of various emerging practical workflows and applications.
An example application of document understanding is invoice processing. Invoices are commonly sent as PDFs or paper documents that can be formatted in different ways but generally contain the same types of information, such as invoice date, amount due, and payment terms. By automatically recognising and extracting this information, cognitive invoicing systems facilitate invoice processing and reduce the associated costs.
By automating manual document activities, document understanding enables organisations to process documents more efficiently, reduce errors, and bring down costs. By helping extract the valuable information stored inside scanned and digital documents, it also supports search, discovery, and compliance control for these documents.
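To illustrate how differently formatted invoices can end up in one common shape, here is a rough sketch that maps extracted key-value pairs onto a few canonical fields. The alias table, the field names, the ISO date format, and the `$` currency prefix are all assumptions chosen for the example, not the behaviour of any particular invoicing system.

```python
from datetime import datetime
from decimal import Decimal

# Hypothetical label variants seen on invoices, mapped to canonical field names.
FIELD_ALIASES = {
    "invoice date": "invoice_date",
    "date": "invoice_date",
    "amount due": "amount_due",
    "total due": "amount_due",
    "payment terms": "payment_terms",
    "terms": "payment_terms",
}


def normalise_invoice(pairs):
    """Map extracted key-value pairs onto canonical invoice fields."""
    record = {}
    for label, value in pairs.items():
        # Ignore case, surrounding whitespace, and a trailing colon in labels.
        field = FIELD_ALIASES.get(label.strip().rstrip(":").lower())
        if field is None:
            continue  # label not recognised; drop it
        record[field] = value.strip()
    if "invoice_date" in record:
        # Assumes dates were extracted in ISO format (YYYY-MM-DD).
        record["invoice_date"] = datetime.strptime(
            record["invoice_date"], "%Y-%m-%d"
        ).date()
    if "amount_due" in record:
        # Assumes a "$" prefix and comma thousands separators.
        amount = record["amount_due"].replace(",", "").lstrip("$")
        record["amount_due"] = Decimal(amount)
    return record


invoice = normalise_invoice(
    {"Invoice Date:": "2024-03-01", "Total due": "$1,250.00", "Terms": "Net 30", "PO": "123"}
)
print(invoice)
```

Amounts are parsed into `Decimal` rather than `float` so that monetary values keep their exact decimal representation.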
The extracted structured data can be ingested by various downstream business applications, enabling smarter workflows and more advanced processing at scale.