Patent Number: 8,798,984

Title: Method and system for confidence-weighted learning of factored discriminative language models

Abstract: A system and method for building a language model for a translation system are provided. The method includes providing a first relative ranking of first and second translations in a target language of a same source string in a source language, determining a second relative ranking of the first and second translations using weights of a language model, the language model including a weight for each of a set of n-gram features, and comparing the first and second relative rankings to determine whether they are in agreement. The method further includes, when the rankings are not in agreement, updating one or more of the weights in the language model as a function of a measure of confidence in the weight, the confidence being a function of previous observations of the n-gram feature in the method.
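
Illustrative sketch (not taken from the patent text): the Python below shows one way the loop described in the abstract could look. It compares a given ranking of two translations with the ranking implied by the model's n-gram weights, and updates weights only when the two disagree. Several details are assumptions made for illustration and are not specified in the abstract: the class name ConfidenceWeightedLM, the helper ngram_features, the use of unigram and bigram counts as features, and the step size base_rate / (1 + observations), which simply makes updates smaller for n-grams that have been observed more often.

```python
from collections import Counter


def ngram_features(tokens, max_order=2):
    """Count every n-gram of order 1..max_order in a token list (assumed feature set)."""
    feats = Counter()
    for order in range(1, max_order + 1):
        for i in range(len(tokens) - order + 1):
            feats[tuple(tokens[i:i + order])] += 1
    return feats


class ConfidenceWeightedLM:
    """Hypothetical discriminative language model with one weight per n-gram."""

    def __init__(self, base_rate=1.0):
        self.weights = {}   # n-gram -> weight
        self.seen = {}      # n-gram -> number of previous observations
        self.base_rate = base_rate

    def score(self, tokens):
        """Linear score: sum of weight * count over the n-grams of the translation."""
        feats = ngram_features(tokens)
        return sum(self.weights.get(f, 0.0) * c for f, c in feats.items())

    def update(self, better, worse):
        """Process one ranked pair; return True if the model already agreed.

        `better` is the translation ranked higher by the first (given) ranking.
        """
        fb, fw = ngram_features(better), ngram_features(worse)
        # Second ranking: the one implied by the current weights.
        agree = self.score(better) > self.score(worse)
        # Only n-grams whose counts differ between the two translations matter.
        diff = {f: fb[f] - fw[f] for f in set(fb) | set(fw) if fb[f] != fw[f]}
        if not agree:
            for f, d in diff.items():
                # Assumed rule: more previous observations -> higher confidence
                # in the weight -> smaller change to it.
                rate = self.base_rate / (1 + self.seen.get(f, 0))
                self.weights[f] = self.weights.get(f, 0.0) + rate * d
        # Record the observation whether or not an update happened (assumption).
        for f in diff:
            self.seen[f] = self.seen.get(f, 0) + 1
        return agree


if __name__ == "__main__":
    model = ConfidenceWeightedLM()
    preferred = "the cat sat".split()
    dispreferred = "cat the sat".split()
    print(model.update(preferred, dispreferred))  # rankings compared, weights adjusted if needed
    print(model.score(preferred), model.score(dispreferred))
```

The sketch keeps one observation counter per n-gram so that rarely seen features move quickly while frequently seen ones change slowly. A fuller confidence-weighted learner would typically track a per-weight variance instead of a raw count.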

Inventors: Cancedda; Nicola (Grenoble, FR), Ha-Thuc; Viet (Coralville, IA)

Assignee: Xerox Corporation

International Classification: G06F 17/20 (20060101); G06F 17/28 (20060101)

Expiration Date: 8/05/12018