Patent Number: 8,442,319

Title: System and method for classifying connected groups of foreground pixels in scanned document images according to the type of marking

Abstract: Methods and systems classify markings on images in a document according to marking type. The document containing the images is supplied to a segmenter, which breaks the images into fragments of foreground pixel structures identified as likely to be of the same marking type. The segmenter obtains the fragments by finding connected components, extracting near-horizontal or near-vertical rule lines, and subdividing some connected components. The fragments are then supplied to a classifier, which provides a category score for each fragment; the classifier is trained from groundtruth images whose pixels are labeled according to known marking types. Once a fragment has been classified, the same label is assigned to all pixels in that fragment.
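
The flow described in the abstract (fragments from connected components, a per-fragment category score, then one label for every pixel in the fragment) can be illustrated with a minimal Python sketch. It is not the patented implementation: rule-line extraction and subdivision of components are omitted, and the function names, the categories `line_like` and `blob_like`, and the aspect-ratio scoring rule are hypothetical stand-ins for a classifier trained on groundtruth images.

```python
from collections import deque


def connected_components(image):
    """Group foreground pixels (value 1) into 8-connected components.

    Returns a list of fragments, each a list of (row, col) tuples.
    """
    rows = len(image)
    cols = len(image[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if not image[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited foreground pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components


def toy_scores(pixels):
    """Hypothetical scorer: a stand-in for the trained classifier.

    Scores a fragment by the aspect ratio of its bounding box.
    """
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    height = max(ys) - min(ys) + 1
    width = max(xs) - min(xs) + 1
    if width >= 4 * height or height >= 4 * width:
        return {"line_like": 0.9, "blob_like": 0.1}
    return {"line_like": 0.1, "blob_like": 0.9}


def label_pixels(image, score_fn=toy_scores):
    """Assign each fragment's top-scoring category to all of its pixels.

    Background pixels are left as None.
    """
    labels = [[None] * len(row) for row in image]
    for pixels in connected_components(image):
        scores = score_fn(pixels)
        best = max(scores, key=scores.get)
        for y, x in pixels:
            labels[y][x] = best
    return labels


# Usage: a long horizontal stroke and a small square, separated by background.
example = [
    [1, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
]
example_labels = label_pixels(example)
```

Labeling at the fragment level rather than pixel by pixel means the classifier runs once per fragment, and every pixel in a fragment is guaranteed a consistent label.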

Inventors: Sarkar; Prateek (Sunnyvale, CA), Saund; Eric (San Carlos, CA)

Assignee: Palo Alto Research Center Incorporated

International Classification: G06K 9/30 (20060101)

Expiration Date: 2021-05-14