Optical Character Recognition (OCR) Task

An Optical Character Recognition (OCR) Task is a visual entity recognition task that requires the recognition of the graphemes in a written text.



  • (Wikipedia, 2014) ⇒ http://en.wikipedia.org/wiki/Optical_character_recognition Retrieved:2014-9-28.
    • Optical character recognition, usually abbreviated to OCR, is the mechanical or electronic conversion of scanned or photographed images of typewritten or printed text into machine-encoded/computer-readable text. It is widely used as a form of data entry from some sort of original paper data source, whether passport documents, invoices, bank statements, receipts, business cards, mail, or any number of printed records. It is a common method of digitizing printed texts so that they can be electronically edited, searched, stored more compactly, displayed on-line, and used in machine processes such as machine translation, text-to-speech, key data extraction and text mining. OCR is a field of research in pattern recognition, artificial intelligence and computer vision.

      Early versions needed to be programmed with images of each character, and worked on one font at a time. "Intelligent" systems with a high degree of recognition accuracy for most fonts are now common. Some commercial systems are capable of reproducing formatted output that closely approximates the original scanned page including images, columns and other non-textual components.
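
The "one font at a time" approach described above can be illustrated with a toy template matcher. This is only a sketch under strong assumptions, not a description of any real OCR system: the 3x5 bitmap font in `TEMPLATES`, the fixed one-column gap between glyphs, and all function names are hypothetical and chosen for illustration. Each glyph is compared pixel by pixel with every stored character image, and the closest template wins.

```python
# Toy single-font OCR by template matching (illustrative sketch only).
# Hypothetical 3x5 bitmap font: each glyph is 5 rows of 3 pixels, '#' = ink.
TEMPLATES = {
    "I": ["###", ".#.", ".#.", ".#.", "###"],
    "L": ["#..", "#..", "#..", "#..", "###"],
    "O": ["###", "#.#", "#.#", "#.#", "###"],
    "T": ["###", ".#.", ".#.", ".#.", ".#."],
}
GLYPH_WIDTH = 3  # assumed fixed glyph width, with a 1-pixel gap between glyphs


def render(text):
    """Draw text with the template font, one blank column between glyphs."""
    return [".".join(TEMPLATES[ch][row] for ch in text) for row in range(5)]


def split_glyphs(image):
    """Cut a rendered line into fixed-width glyph bitmaps."""
    step = GLYPH_WIDTH + 1
    return [[row[x:x + GLYPH_WIDTH] for row in image]
            for x in range(0, len(image[0]), step)]


def hamming(a, b):
    """Count the pixels at which two equally sized bitmaps differ."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def recognize_glyph(glyph):
    """Return the character whose template is closest to the glyph."""
    return min(TEMPLATES, key=lambda ch: hamming(TEMPLATES[ch], glyph))


def recognize_line(image):
    """Recognize every glyph in a line image and join the characters."""
    return "".join(recognize_glyph(g) for g in split_glyphs(image))
```

For example, `recognize_line(render("TOLL"))` recovers the rendered string, and a glyph with a single flipped pixel is still mapped to its nearest template. A matcher of this kind fails as soon as the font, size, or spacing changes, which is the limitation the quoted passage attributes to early systems.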


  • (Strasburger, 2005) ⇒ Hans Strasburger. (2005). “Unfocussed Spatial Attention Underlies the Crowding Effect in Indirect Form Vision.” In: Journal of Vision, 5(11):8. doi:10.1167/5.11.8
    • QUOTE: In a comprehensive analysis, Pelli et al. (2004) have characterized crowding as a process of impaired feature integration occurring in the visual periphery, in contradistinction to (lateral) masking as occurring from impaired feature detection anywhere in the visual field. We have ourselves characterized the visual periphery — where the interesting cases of crowding occur (Strasburger et al., 1991) — as differing from the fovea by the architecture of feature integration (Strasburger & Rentschler, 1996). That argument was based on the differing dependence-on-eccentricity functions of contrast sensitivity for grating detection and for character recognition (Strasburger, 2003b; Strasburger, Gothe, & Lutz, 2000; Strasburger, Rentschler, & Harvey, 1994) and by showing that the difference between the two cannot be explained by a spatial scaling concept (M scaling, cortical-magnification scaling). We concluded that there must be architectural differences across the visual field — in particular between the fovea and the rest of the field — that concern feature integration not feature detection. In a hierarchy of task complexity ranging from
  1. The term “discrimination task” is sometimes used in a different meaning, implying the judgement of a quantity being larger or smaller than another (the corresponding psychometric function then goes from −1 to 1). This is not implied here, the intended meaning being that the observer can discriminate between two broadly different stimuli and thereby identify each. The term “identification task” is sometimes used for that case but is avoided here to reserve the concept of identification for those tasks where discrimination between a few cases will not solve the identification.