Multi-domain machine translation system with training data clustering and dynamic domain adaptation

    Publication No.: US10437933B1

    Publication date: 2019-10-08

    Application No.: US15238101

    Filing date: 2016-08-16

    Abstract: A machine translation system that clusters training data and performs dynamic domain adaptation is disclosed. An unsupervised domain clustering process identifies domains in general training data, which can include both in-domain and out-of-domain training data. Segments in the general training data are then assigned to the identified domains to create domain-specific training data. The domain-specific training data is used to create domain-specific language models, domain-specific translation models, and domain-specific model weights for each domain. At translation time, an input segment can be assigned to one of the domains, and the domain-specific model weights for that domain can then be used to translate the segment.
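The flow described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the abstract does not name a clustering algorithm, so a k-means-style loop over bag-of-words vectors with cosine similarity is assumed here, and every name in it (`cluster_segments`, `assign_domain`, `domain_weights`, the toy segments, and the weight values) is hypothetical. Training of the domain-specific language and translation models is omitted; the per-domain weights are stand-in constants.

```python
import math
from collections import Counter


def vectorize(segment):
    """Bag-of-words term counts for one segment (assumed representation)."""
    return Counter(segment.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def nearest(vector, centroids):
    """Index of the centroid most similar to the vector."""
    return max(range(len(centroids)), key=lambda d: cosine(vector, centroids[d]))


def cluster_segments(segments, seed_indices, iterations=5):
    """Unsupervised clustering: returns (centroids, domain id per segment).

    seed_indices picks the initial centroids, which keeps the sketch
    deterministic; a real system would choose them differently.
    """
    vectors = [vectorize(s) for s in segments]
    centroids = [vectors[i] for i in seed_indices]
    assignment = []
    for _ in range(iterations):
        assignment = [nearest(v, centroids) for v in vectors]
        for d in range(len(centroids)):
            members = [v for v, a in zip(vectors, assignment) if a == d]
            if members:
                # Summed counts serve as the centroid; cosine ignores scale.
                total = Counter()
                for v in members:
                    total.update(v)
                centroids[d] = total
    return centroids, assignment


def assign_domain(segment, centroids):
    """Translation-time step: assign an input segment to a domain."""
    return nearest(vectorize(segment), centroids)


# Toy general training data mixing two topics (illustrative only).
training_segments = [
    "the patient received a dose of the drug",
    "the court ruled on the contract dispute",
    "the drug reduced symptoms in the patient",
    "the contract was void under the court ruling",
]
centroids, assignment = cluster_segments(training_segments, seed_indices=[0, 1])

# Placeholder per-domain model weights; in the described system these
# would be produced from the domain-specific training data.
domain_weights = {
    0: {"language_model": 0.6, "translation_model": 0.4},
    1: {"language_model": 0.3, "translation_model": 0.7},
}

input_segment = "the patient took the drug"
domain = assign_domain(input_segment, centroids)
weights = domain_weights[domain]  # weights a decoder would use for this segment
```

With these toy segments the two medical sentences fall into one cluster and the two legal sentences into the other, so the input segment is routed to the medical domain's weights. The point of the design is that the domain is chosen per input segment at translation time, rather than fixed once for the whole system.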
