Enhancing the searchability of page-image PDF documents using an aligned hidden layer from a truth text

The search accuracy achieved in a PDF image-plus-hidden-text (PDF-IT) document depends upon the accuracy of the optical character recognition (OCR) process that produced the searchable hidden text layer. In many cases, recognising words in a blurred area of a PDF page image may exceed the capabiliti...

Bibliographic Details
Main Authors: Knight, Ian A., Brailsford, David F.
Format: Conference or Workshop Item
Language: English
Published: 2016
Online Access: https://eprints.nottingham.ac.uk/45753/