Neural machine translation systems with rare word processing

    Publication Number: US10133739B2

    Publication Date: 2018-11-20

    Application Number: US14921925

    Filing Date: 2015-10-23

    Applicant: GOOGLE LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for neural machine translation systems with rare word processing. One of the methods is a method of training a neural network translation system to track, for unknown words in target sentences, their source words in the corresponding source sentences, where the sentences are in a source language and a target language, respectively. The method includes deriving alignment data from a parallel corpus, the alignment data identifying, in each pair of source and target language sentences in the parallel corpus, aligned source and target words; annotating the sentences in the parallel corpus according to the alignment data and a rare word model to generate a training dataset of paired source and target language sentences; and training a neural network translation model on the training dataset.
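
The annotation step described above pairs a word aligner's output with a rare word model so that unknown target words can be traced back to their source words. A minimal Python sketch of one such annotation scheme follows, using indexed unknown tokens; the function name, token names, and indexing scheme are illustrative assumptions, not details taken from the patent.

```python
# Hypothetical sketch of the annotation step: rare source words become indexed
# unknown tokens, and rare target words are replaced with the token of their
# aligned source word so the system can learn to copy. All names and the
# indexing scheme are assumptions for illustration, not the patent's method.

def annotate_pair(src_tokens, tgt_tokens, alignment, src_vocab, tgt_vocab):
    """alignment: list of (src_idx, tgt_idx) pairs from a word aligner."""
    tgt_to_src = {t: s for s, t in alignment}

    # Replace rare source words with indexed unknown tokens (unk1, unk2, ...).
    unk_index = {}
    annotated_src = []
    for i, word in enumerate(src_tokens):
        if word in src_vocab:
            annotated_src.append(word)
        else:
            unk_index.setdefault(i, f"unk{len(unk_index) + 1}")
            annotated_src.append(unk_index[i])

    # Replace rare target words with the unknown token of their aligned source
    # word when one exists, otherwise with a generic placeholder.
    annotated_tgt = []
    for j, word in enumerate(tgt_tokens):
        if word in tgt_vocab:
            annotated_tgt.append(word)
        else:
            src_pos = tgt_to_src.get(j)
            annotated_tgt.append(unk_index.get(src_pos, "unk_null"))

    return annotated_src, annotated_tgt
```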

    Processing inputs using recurrent neural networks

    Publication Number: US10657435B1

    Publication Date: 2020-05-19

    Application Number: US14877096

    Filing Date: 2015-10-07

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing an input sequence using a recurrent neural network to generate an output for the input sequence. One of the methods includes receiving the input sequence; generating a doubled sequence comprising a first instance of the input sequence followed by a second instance of the input sequence; and processing the doubled sequence using the recurrent neural network to generate the output for the input sequence.
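
As described, the method concatenates the input sequence with itself and lets the recurrent network read the result, so that on the second pass every element is processed with the full sequence already held in the network's state. A minimal PyTorch sketch of this idea follows; the LSTM cell, the linear output head, and reading only the second pass are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Minimal sketch of the doubled-sequence idea: the input is concatenated with
# itself and the RNN reads the doubled sequence. Model shape and the choice to
# take outputs from the second pass are assumptions for illustration.

class DoubledSequenceModel(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.rnn = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, output_size)

    def forward(self, x):                      # x: (batch, seq_len, input_size)
        doubled = torch.cat([x, x], dim=1)     # first instance followed by second
        outputs, _ = self.rnn(doubled)
        second_pass = outputs[:, x.size(1):, :]  # outputs from the re-read
        return self.head(second_pass)
```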

    Training neural networks on partitioned training data

    Publication Number: US10380482B2

    Publication Date: 2019-08-13

    Application Number: US14877071

    Filing Date: 2015-10-07

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network. One of the methods includes obtaining partitioned training data for the neural network, wherein the partitioned training data comprises a plurality of training items each of which is assigned to a respective one of a plurality of partitions, wherein each partition is associated with a respective difficulty level; and training the neural network on each of the partitions in a sequence from a partition associated with an easiest difficulty level to a partition associated with a hardest difficulty level, wherein, for each of the partitions, training the neural network comprises: training the neural network on a sequence of training items that includes training items selected from the training items in the partition interspersed with training items selected from the training items in all of the partitions.
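
The schedule described above is a curriculum: partitions are visited from easiest to hardest, and within each stage items from the current partition are interspersed with items drawn from all partitions. A short Python sketch follows; the mixing ratio, step count, and train_step callback are illustrative assumptions, not parameters from the patent.

```python
import random

# Sketch of a curriculum-style schedule over difficulty partitions: stages run
# easiest to hardest, and each stage mixes current-partition items with items
# sampled from all partitions. Hyperparameters here are assumptions.

def train_on_partitions(model, partitions, train_step, steps_per_stage=1000,
                        mix_ratio=0.5, seed=0):
    """partitions: list of lists of training items, ordered easiest -> hardest."""
    rng = random.Random(seed)
    all_items = [item for part in partitions for item in part]
    for partition in partitions:                # easiest to hardest
        for _ in range(steps_per_stage):
            if rng.random() < mix_ratio:
                item = rng.choice(partition)    # item from the current partition
            else:
                item = rng.choice(all_items)    # interspersed item from any partition
            train_step(model, item)
```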
