Patent Number: 7,085,420

Title: Text detection in continuous tone image segments

Abstract: For encoding of mixed-mode images containing text and continuous-tone content, the pixels in the image that form the text content are detected and separated. Text detection classifies pixels as text or continuous-tone content by accumulating pixel counts for groups of contiguous, non-smooth pixels with the same color. Groups whose pixel count exceeds a threshold are classified as text. The text detection technique further reduces classification errors by testing whether the boundary dimensions and pixel density of a group are characteristic of long straight lines or large borders. The technique also searches the neighborhood of groups qualifying as text for pixels of the same color, so as to detect the pixels of isolated text marks such as dots, accents, or punctuation. The separated text and continuous-tone content can be encoded separately for efficient compression while preserving text quality, and the text is superimposed again on the continuous-tone content at decompression.
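
The classification steps named in the abstract can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the claimed method: the abstract does not define the smoothness test, the thresholds, or the line/border criteria, so the function name `detect_text`, the grayscale list-of-lists image format, the neighbor-difference smoothness test, and every parameter (`edge_thresh`, `min_count`, `max_extent`, `min_density`, `margin`) are hypothetical choices made for illustration.

```python
from collections import deque


def detect_text(img, edge_thresh=32, min_count=4, max_extent=15,
                min_density=0.2, margin=2):
    """Return a boolean mask marking pixels classified as text.

    img is a list of rows of grayscale ints. All thresholds are assumed
    values for illustration; the abstract does not specify them.
    """
    h, w = len(img), len(img[0])

    def non_smooth(y, x):
        # Assumption: a pixel is "non-smooth" if any 4-neighbor differs
        # from it by more than edge_thresh.
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                if abs(img[ny][nx] - img[y][x]) > edge_thresh:
                    return True
        return False

    rough = [[non_smooth(y, x) for x in range(w)] for y in range(h)]
    seen = [[False] * w for _ in range(h)]
    is_text = [[False] * w for _ in range(h)]
    text_groups = []  # (color, y0, x0, y1, x1) for each accepted group

    for y in range(h):
        for x in range(w):
            if not rough[y][x] or seen[y][x]:
                continue
            # Flood-fill one group of contiguous (8-connected),
            # non-smooth pixels that share the same color.
            color = img[y][x]
            seen[y][x] = True
            queue = deque([(y, x)])
            pixels = []
            while queue:
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and not seen[ny][nx] and rough[ny][nx]
                                and img[ny][nx] == color):
                            seen[ny][nx] = True
                            queue.append((ny, nx))

            # Only groups whose pixel count exceeds the threshold qualify.
            if len(pixels) <= min_count:
                continue

            ys = [p[0] for p in pixels]
            xs = [p[1] for p in pixels]
            bh = max(ys) - min(ys) + 1
            bw = max(xs) - min(xs) + 1
            density = len(pixels) / (bh * bw)
            # Assumed criteria: a long, very thin bounding box looks like
            # a straight line; a large, sparse one looks like a border.
            oversized = max(bh, bw) > max_extent
            if oversized and (min(bh, bw) <= 2 or density < min_density):
                continue

            for py, px in pixels:
                is_text[py][px] = True
            text_groups.append((color, min(ys), min(xs), max(ys), max(xs)))

    # Neighborhood search: pick up same-colored non-smooth pixels near
    # each text group (dots, accents, punctuation).
    for color, y0, x0, y1, x1 in text_groups:
        for y in range(max(0, y0 - margin), min(h, y1 + margin + 1)):
            for x in range(max(0, x0 - margin), min(w, x1 + margin + 1)):
                if rough[y][x] and img[y][x] == color:
                    is_text[y][x] = True
    return is_text
```

The sketch assumes the background really is continuous tone, so that adjacent background pixels rarely share exactly the same value; on a perfectly flat background, the ring of background pixels around a glyph would itself form a same-color group and could be misclassified. The returned mask could then be used to split the image into a text layer and a continuous-tone layer for separate encoding.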

Inventors: Mehrotra; Sanjeev (Kirkland, WA)

Assignee: Microsoft Corporation

International Classification: G06K 9/36 (20060101); G06K 9/34 (20060101)

Expiration Date: 8/01/2018