OCR stands for Optical Character Recognition, a technology which distinguishes typed, printed or handwritten text characters inside digital images of physical documents, such as a scanned paper document. Once distinguished, these characters can then be converted into machine-readable text data.
OCR technology became popular in the 1990s and is now used widely in everyday scenarios for data entry. The digitised data can then be sent, stored, edited, mined, translated, displayed… the list is endless!
If you have been to a car park where your number plate has been automatically recognised, searched for a document online or submitted a written document at the bank, you’ve rubbed shoulders with OCR technology.
Before OCR technology was available, the only way to digitise printed documents was to re-type the text manually, a time-consuming process. As OCR technology has developed, manual data re-entry is no longer required and the accuracy of the digitised text has hugely improved. Now it touches some of the most important personal and business documentation: passport documents, invoices, bank statements, computerised receipts… you name it.
How does Optical Character Recognition work?
Step 1
Physical documents are scanned and converted into a two-colour (black and white) format. The scanned-in image is then analysed for light and dark areas. The dark areas are identified as characters that need to be recognised and the light areas are identified as the background.
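To make Step 1 a little more concrete, here is a minimal Python sketch of the black-and-white conversion. It is an illustration only, not the software any particular scanner uses: the tiny greyscale grid `GREY_SCAN`, the `binarise` function and the cut-off value of 128 are all assumptions made up for this example.

```python
# Hypothetical 4x4 greyscale scan: 0 is black, 255 is white.
GREY_SCAN = [
    [250, 245,  30, 240],
    [248,  25,  20, 251],
    [252, 249,  35, 247],
    [246, 244,  28, 250],
]


def binarise(pixels, threshold=128):
    """Return a grid of 1 (dark, possible character) and 0 (light, background).

    The fixed threshold of 128 is an assumption for this sketch; real
    scans often need a threshold chosen per image.
    """
    return [[1 if value < threshold else 0 for value in row] for row in pixels]


binary = binarise(GREY_SCAN)

# Show dark pixels as "#" and background pixels as "."
for row in binary:
    print("".join("#" if cell else "." for cell in row))
```

Run on this sample grid, the dark pixels form a rough figure "1", which is the kind of shape that gets passed on for recognition.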
Step 2
The dark areas are then further analysed using one of two algorithms to identify alphabetic letters or numeric digits:
- Pattern recognition. This is where the OCR program is given examples of text in various fonts and formats, which it compares against the scanned document to recognise characters.
- Feature detection. This is where the OCR program applies rules to determine the features of a specific letter or number. Rules often describe angled lines, crossed lines or curves to identify specific letters and numbers.
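The two approaches above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not how a production OCR engine is built: the 3x3 glyphs, the three-letter `TEMPLATES` alphabet, the `FEATURE_RULES` table and every function name are hypothetical and exist only for this example.

```python
# Hypothetical 3x3 example glyphs: "1" is a dark pixel, "0" is background.
TEMPLATES = {
    "I": ["010", "010", "010"],
    "L": ["100", "100", "111"],
    "T": ["111", "010", "010"],
}


def match_score(glyph, template):
    """Count the pixels on which the glyph and the template agree."""
    return sum(
        g == t
        for glyph_row, template_row in zip(glyph, template)
        for g, t in zip(glyph_row, template_row)
    )


def recognise_by_pattern(glyph):
    """Pattern recognition: pick the stored example that agrees on most pixels."""
    return max(TEMPLATES, key=lambda char: match_score(glyph, TEMPLATES[char]))


def features(glyph):
    """Describe a glyph by three simple strokes instead of raw pixels."""
    top_bar = glyph[0] == "111"
    bottom_bar = glyph[-1] == "111"
    left_stroke = all(row[0] == "1" for row in glyph)
    return (top_bar, bottom_bar, left_stroke)


# Hypothetical rules: (top bar, bottom bar, left stroke) -> character.
FEATURE_RULES = {
    (False, False, False): "I",
    (False, True, True): "L",
    (True, False, False): "T",
}


def recognise_by_features(glyph):
    """Feature detection: look up the character whose rule fits, else None."""
    return FEATURE_RULES.get(features(glyph))


noisy_t = ["111", "010", "011"]  # a "T" with one stray dark pixel
print(recognise_by_pattern(noisy_t))
print(recognise_by_features(noisy_t))
```

In this small example both methods read the smudged glyph as "T". Pattern recognition does so because the "T" example still agrees on the most pixels, and feature detection because the stray pixel does not change any of the three strokes the rules look at.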
Step 3
When a character or number has been identified, it is converted into an ASCII code (the American Standard Code for Information Interchange is a 7-bit character code in which each of the 128 possible values represents a unique character, used to represent text digitally). Here at Cull, we then proofread the output to ensure the highest levels of quality, and provide the digital version of the documents in your chosen programme or format.
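As a small illustration of the ASCII conversion in Step 3, the Python sketch below turns a handful of recognised characters into their ASCII codes and back into text. The `recognised` list is a made-up stand-in for the output of Step 2; `ord`, `format` and `bytes.decode` are standard Python built-ins.

```python
# Hypothetical output of the recognition step: one entry per character.
recognised = ["O", "C", "R"]

# Each character maps to one ASCII code, a number from 0 to 127.
codes = [ord(char) for char in recognised]
print(codes)  # [79, 67, 82]

# Every code fits in 7 bits, which is why ASCII is called a 7-bit code.
bit_patterns = [format(code, "07b") for code in codes]
print(bit_patterns)  # ['1001111', '1000011', '1010010']

# The codes can be stored or sent, then turned back into readable text.
text = bytes(codes).decode("ascii")
print(text)  # OCR
```

Note that it is the whole 7-bit pattern that identifies a character, not any single bit on its own.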
How can OCR Scanning help you?
Your digitised documents can now be sent, stored, edited, mined, translated, displayed… There is no need for time-consuming and inaccurate data re-entry. There is no need for endless searching for paper documents. There is no need to pay out for storage facilities.
Call our friendly team today to talk about how we can help you with OCR Scanning. We’re here to expertly guide you through digitising your documents.
Call our team today on 0151 638 6000.