Patent Number: 7,827,027

Title: Method and apparatus for bilingual word alignment, method and apparatus for training bilingual word alignment model

Abstract: The present invention provides a method and apparatus for bilingual word alignment, and a method and apparatus for training a bilingual word alignment model. The method for bilingual word alignment comprises: training a bilingual word alignment model using a word-aligned labeled bilingual corpus; word-aligning a plurality of bilingual sentence pairs in an unlabeled bilingual corpus using said bilingual word alignment model; determining whether the word alignment of each of said plurality of bilingual sentence pairs is correct, and if it is correct, adding the bilingual sentence pair to the labeled bilingual corpus and removing the bilingual sentence pair from the unlabeled bilingual corpus; retraining the bilingual word alignment model using the expanded labeled bilingual corpus; and re-word-aligning the remaining bilingual sentence pairs in the unlabeled bilingual corpus using the retrained bilingual word alignment model.
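
Illustrative sketch (not part of the patent record): the steps recited in the abstract can be read as one round of self-training. The Python below is a minimal sketch under stated assumptions, not the patented implementation. The names train, align and self_train, the relative-frequency model, and the confidence threshold used as the correctness check are all hypothetical; the abstract does not say how the model is built or how correctness is determined. Sentences are assumed to be non-empty token lists.

```python
from collections import defaultdict


def train(labeled):
    """Toy model (assumption): relative frequency of each target word
    given a source word, counted over the labeled alignment links."""
    counts = defaultdict(lambda: defaultdict(int))
    for src, tgt, links in labeled:
        for i, j in links:
            counts[src[i]][tgt[j]] += 1
    model = {}
    for s, row in counts.items():
        total = sum(row.values())
        model[s] = {t: c / total for t, c in row.items()}
    return model


def align(model, src, tgt):
    """Link every source word to its best-scoring target word.
    Returns the link set and the lowest link score as a confidence."""
    links, scores = set(), []
    for i, s in enumerate(src):
        row = model.get(s, {})
        j, p = max(((j, row.get(t, 0.0)) for j, t in enumerate(tgt)),
                   key=lambda pair: pair[1])
        links.add((i, j))
        scores.append(p)
    return links, min(scores)


def self_train(labeled, unlabeled, threshold=0.5):
    """One round: train, align, move accepted pairs, retrain, re-align."""
    labeled = list(labeled)
    model = train(labeled)
    remaining = []
    for src, tgt in unlabeled:
        links, confidence = align(model, src, tgt)
        # Hypothetical stand-in for "determining whether the word
        # alignment is correct": accept when every link is confident.
        if confidence >= threshold:
            labeled.append((src, tgt, links))
        else:
            remaining.append((src, tgt))
    model = train(labeled)  # retrain on the expanded labeled corpus
    realigned = [(src, tgt, align(model, src, tgt)[0])
                 for src, tgt in remaining]
    return model, labeled, realigned
```

A labeled entry is a triple (source tokens, target tokens, set of (source index, target index) links); an unlabeled entry is a pair of token lists. Calling self_train returns the retrained model, the expanded labeled corpus, and the re-aligned remaining pairs.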

Inventors: Wu; Hua (Don Cheng District, CN), Wang; Haifeng (Don Cheng District, CN), Liu; Zhanyi (Don Cheng District, CN)

Assignee: Kabushiki Kaisha Toshiba

International Classification: G06F 17/28 (20060101)

Expiration Date: 2019-11-02