Patent Number: 6,307,962

Title: Document data compression system which automatically segments documents and generates compressed smart documents therefrom

Abstract: Data representing a compressed document, referred to as a smart document, is produced from a document page composed of an array of pixel signals having values representative of gray scale. The system initially subdivides the pixel signals of the document page into a matrix of blocks, and classifies blocks as active or non-active. The document page is segmented into macroblocks (segments) by grouping one or more adjacent active blocks. One or more regions of adjacent non-active blocks are then located and the prevalent value of pixel signals in each region is determined to provide background data. The macroblocks are classified as first or second macroblock types based upon the values of the pixel signals in each macroblock. A bit-map is produced representing the blocks in the matrix of the first macroblock type. The pixel signals in the blocks represented in the map are thresholded into a binary representation to provide a binary image. Data signals representing the majority and minority gray scale levels of each first type macroblock are determined. Position data is generated specifying the locations of the macroblocks in the matrix of the second macroblock type. Map, binary image, and pixel signals from the second type macroblocks are then encoded into corresponding data. The smart document is generated from the encoded data, the background data, the position data, and data signals representing the gray scale levels of each macroblock of the first type. A reproduction of the document page can be rendered from the smart document.
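Illustrative sketch: The segmentation steps in the abstract can be approximated in a few lines of Python. This is a minimal sketch under stated assumptions, not the patented method. The abstract does not give the activity criterion, the adjacency rule, or the test that separates the two macroblock types, so the gray-level range test (`min_range`), 4-connectivity, the distinct-level count (`max_levels`), and the midpoint threshold are all assumptions. All function and key names are hypothetical. The background is also simplified to a single prevalent value over all non-active blocks rather than one value per region, and the final encoding step is omitted.

```python
from collections import Counter, deque


def split_blocks(page, size):
    """Map (block_row, block_col) to the flat, row-major list of pixels in that block."""
    rows, cols = len(page), len(page[0])
    blocks = {}
    for r in range(0, rows, size):
        for c in range(0, cols, size):
            blocks[(r // size, c // size)] = [
                page[y][x]
                for y in range(r, min(r + size, rows))
                for x in range(c, min(c + size, cols))
            ]
    return blocks


def is_active(pixels, min_range=32):
    # Assumed criterion: a block is active when its gray-level spread is large.
    return max(pixels) - min(pixels) >= min_range


def group_macroblocks(active):
    """Group 4-connected active blocks (assumed adjacency rule) into macroblocks."""
    seen, macroblocks = set(), []
    for start in sorted(active):
        if start in seen:
            continue
        seen.add(start)
        queue, members = deque([start]), []
        while queue:
            r, c = queue.popleft()
            members.append((r, c))
            for nb in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if nb in active and nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        macroblocks.append(sorted(members))
    return macroblocks


def segment(page, size=2, max_levels=2):
    """Return background value plus per-macroblock data, before any encoding."""
    blocks = split_blocks(page, size)
    active = {pos for pos, px in blocks.items() if is_active(px)}

    # Simplification: one prevalent value over all non-active blocks.
    inactive = [p for pos, px in blocks.items() if pos not in active for p in px]
    background = Counter(inactive).most_common(1)[0][0] if inactive else None

    macroblocks = []
    for members in group_macroblocks(active):
        counts = Counter(p for pos in members for p in blocks[pos])
        if len(counts) <= max_levels:
            # Assumed "first type": few gray levels, so threshold to binary.
            (majority, _), (minority, _) = counts.most_common(2)
            threshold = (majority + minority) / 2
            binary = {
                pos: [1 if p < threshold else 0 for p in blocks[pos]]
                for pos in members
            }
            macroblocks.append({"type": "first", "blocks": members,
                                "majority": majority, "minority": minority,
                                "binary": binary})
        else:
            # Assumed "second type": many gray levels, keep position and pixels.
            macroblocks.append({"type": "second", "blocks": members,
                                "pixels": {pos: blocks[pos] for pos in members}})
    return {"background": background, "macroblocks": macroblocks}
```

Calling `segment(page)` on a list of rows of gray values returns the background value and one entry per macroblock. Entries of type "first" carry the majority and minority levels with the binary image, and entries of type "second" carry block positions with the raw pixels.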

Inventors: Parker; Kevin J. (Rochester, NY), Fung; Hei Tao (Rochester, NY)

Assignee: The University of Rochester

International Classification: G06T 9/00 (20060101); G06K 009/00

Expiration Date: 10/23/2018