Patent Number: 7,822,768

Title: System and method for automating data normalization using text analytics

Abstract: A system, method and program product for normalizing, sanitizing and disambiguating structured data. Structured data includes data stored in a database management system (DBMS), as well as labeled files (e.g., XML data). An automated data enhancement processing system is provided, comprising: a system for ingesting data structured in at least one predefined database format; and a set of text analytics processes that treat the ingested data as unstructured, and generate normalized data represented and indexed by consistent, structured metadata.
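
Illustrative sketch (not taken from the patent): the abstract describes ingesting structured records, treating them as unstructured text, and emitting consistent metadata. The Python below is one minimal way such a flow could look, under stated assumptions. The alias table, the function names (flatten, annotate, normalize) and the "organizations" metadata key are all hypothetical, and a simple dictionary lookup stands in for whatever text analytics the patent actually claims.

```python
import re

# Hypothetical alias table (an assumption, not from the patent):
# variant spellings mapped to one canonical form.
ALIASES = {
    "ibm": "International Business Machines",
    "i.b.m.": "International Business Machines",
    "international business machines": "International Business Machines",
}


def flatten(record):
    """Treat a structured record as unstructured text by joining its values."""
    return " ".join(str(v) for v in record.values() if v is not None)


def annotate(text):
    """Toy text-analytics step: find known aliases, emit canonical metadata."""
    found = set()
    lowered = text.lower()
    for alias, canonical in ALIASES.items():
        # Match the alias as a whole token, not inside a longer word.
        if re.search(r"(?<!\w)" + re.escape(alias) + r"(?!\w)", lowered):
            found.add(canonical)
    return {"organizations": sorted(found)}


def normalize(records):
    """Ingest records with differing schemas and return uniform metadata."""
    return [annotate(flatten(r)) for r in records]
```

Because each record is flattened before analysis, the same annotator handles rows whose column names differ (for example "name" in one table and "comment" in another), which is the point of ignoring the source schema in this sketch.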

Inventors: Maymir-Ducharme; Fred A. (Potomac, MD), Hehenberger; Michael (Westport, CT)

Assignee: International Business Machines Corporation

International Classification: G06F 7/00 (20060101)

Expiration Date: 2018-10-26