-
Publication No.: US10185713B1
Publication date: 2019-01-22
Application No.: US14867932
Filing date: 2015-09-28
Applicant: Amazon Technologies, Inc.
Inventor: Michael Denkowski, Alon Lavie, Gregory Alan Hanneman, Austin Matthews, Matthew Ryan Fiorillo, Robert Thomas Olszewski, Christopher James Dyer, William Joseph Kaper, Alexandre Alexandrovich Klementiev, Gavin R. Jewell
Abstract: Technologies are disclosed herein for statistical machine translation. In particular, the disclosed technologies include extensions to conventional machine translation pipelines: the use of multiple domain-specific and non-domain-specific dynamic language translation models and language models; cluster-based language models; and large-scale discriminative training. Incremental update technologies are also disclosed for use in updating a machine translation system in four areas: word alignment; translation modeling; language modeling; and parameter estimation. A mechanism is also disclosed for training and utilizing a runtime machine translation quality classifier for estimating the quality of machine translations without the benefit of reference translations. The runtime machine translation quality classifier is generated in a manner to offset imbalances in the number of training instances in various classes, and to assign a greater penalty to the misclassification of lower-quality translations as higher-quality translations than to misclassification of higher-quality translations as lower-quality translations.
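Note: The abstract describes a quality classifier that offsets class imbalance and penalizes misclassifying lower-quality translations as higher-quality more heavily than the reverse. The following is a minimal sketch of one common way such behavior can be obtained (inverse-frequency class weights plus a minimum-expected-cost decision rule); it is not taken from the patent. The function names, the integer quality classes (higher means better), and the penalty values 3.0 and 1.0 are all hypothetical choices for illustration.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class inversely to its frequency in the training labels,
    so that rare quality classes are not swamped by common ones."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * counts[cls]) for cls in counts}


def asymmetric_cost(true_cls, pred_cls, over_penalty=3.0, under_penalty=1.0):
    """Cost of predicting pred_cls when the true class is true_cls.
    Classes are ordered integers (higher = better quality).
    Over-estimating quality costs more than under-estimating it.
    The penalty values are illustrative assumptions."""
    if pred_cls > true_cls:
        return over_penalty * (pred_cls - true_cls)
    return under_penalty * (true_cls - pred_cls)


def min_expected_cost_class(probs, **penalties):
    """Given a dict {class: probability}, return the class whose expected
    misclassification cost is lowest (ties go to the lower class)."""
    def expected_cost(pred_cls):
        return sum(p * asymmetric_cost(true_cls, pred_cls, **penalties)
                   for true_cls, p in probs.items())
    return min(sorted(probs), key=expected_cost)


if __name__ == "__main__":
    # Imbalanced toy labels: class 0 is far more frequent than 1 or 2.
    weights = balanced_class_weights([0, 0, 0, 0, 0, 0, 1, 1, 2, 2])
    print(weights)
    # The most probable class is 2, but the asymmetric cost makes the
    # decision rule fall back to the safer, lower-quality class.
    print(min_expected_cost_class({0: 0.4, 1: 0.0, 2: 0.6}))
```

With these assumed penalties, a translation that is 60% likely to be high quality is still labeled as the lowest class, because wrongly labeling it as high quality would be the costlier error.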
-
Publication No.: US10268684B1
Publication date: 2019-04-23
Application No.: US14868166
Filing date: 2015-09-28
Applicant: Amazon Technologies, Inc.
Inventor: Michael Denkowski, Alon Lavie, Gregory Alan Hanneman, Matthew Ryan Fiorillo, Laura Josephine Kieras, Robert Thomas Olszewski, William Joseph Kaper, Alexandre Alexandrovich Klementiev, Gavin Richard Jewell
Abstract: Technologies are disclosed herein for statistical machine translation. In particular, the disclosed technologies include extensions to conventional machine translation pipelines: the use of multiple domain-specific and non-domain-specific dynamic language translation models and language models; cluster-based language models; and large-scale discriminative training. Incremental update technologies are also disclosed for use in updating a machine translation system in four areas: word alignment; translation modeling; language modeling; and parameter estimation. A mechanism is also disclosed for training and utilizing a runtime machine translation quality classifier for estimating the quality of machine translations without the benefit of reference translations. The runtime machine translation quality classifier is generated in a manner to offset imbalances in the number of training instances in various classes, and to assign a greater penalty to the misclassification of lower-quality translations as higher-quality translations than to misclassification of higher-quality translations as lower-quality translations.
-
Publication No.: US09959271B1
Publication date: 2018-05-01
Application No.: US14868083
Filing date: 2015-09-28
Applicant: Amazon Technologies, Inc.
Inventor: Kartik Goyal, Alon Lavie, Michael Denkowski, Gregory Alan Hanneman, Matthew Ryan Fiorillo, Robert Thomas Olszewski, Ehud Hershkovich, William Joseph Kaper, Alexandre Alexandrovich Klementiev, Gavin R. Jewell
CPC classification number: G06F17/2818, G06F17/2854
Abstract: Technologies are disclosed herein for statistical machine translation. In particular, the disclosed technologies include extensions to conventional machine translation pipelines: the use of multiple domain-specific and non-domain-specific dynamic language translation models and language models; cluster-based language models; and large-scale discriminative training. Incremental update technologies are also disclosed for use in updating a machine translation system in four areas: word alignment; translation modeling; language modeling; and parameter estimation. A mechanism is also disclosed for training and utilizing a runtime machine translation quality classifier for estimating the quality of machine translations without the benefit of reference translations. The runtime machine translation quality classifier is generated in a manner to offset imbalances in the number of training instances in various classes, and to assign a greater penalty to the misclassification of lower-quality translations as higher-quality translations than to misclassification of higher-quality translations as lower-quality translations.
-