Generating natural language descriptions of images

    公开(公告)号:US10417557B2

    公开(公告)日:2019-09-17

    申请号:US15856453

    申请日:2017-12-28

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating descriptions of input images. One of the methods includes obtaining an input image; processing the input image using a first neural network to generate an alternative representation for the input image; and processing the alternative representation for the input image using a second neural network to generate a sequence of a plurality of words in a target natural language that describes the input image.

    Generating parse trees of text segments using neural networks

    公开(公告)号:US10409908B2

    公开(公告)日:2019-09-10

    申请号:US14976121

    申请日:2015-12-21

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating parse trees for input text segments. One of the methods includes obtaining an input text segment, processing the input text segment using a first long short term memory (LSTM) neural network to convert the input text segment into an alternative representation for the input text segment, and processing the alternative representation for the input text segment using a second LSTM neural network to generate a linearized representation of a parse tree for the input text segment.

    Neural machine translation systems with rare word processing

    公开(公告)号:US10133739B2

    公开(公告)日:2018-11-20

    申请号:US14921925

    申请日:2015-10-23

    Applicant: GOOGLE LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for neural translation systems with rare word processing. One of the methods is a method training a neural network translation system to track the source in source sentences of unknown words in target sentences, in a source language and a target language, respectively and includes deriving alignment data from a parallel corpus, the alignment data identifying, in each pair of source and target language sentences in the parallel corpus, aligned source and target words; annotating the sentences in the parallel corpus according to the alignment data and a rare word model to generate a training dataset of paired source and target language sentences; and training a neural network translation model on the training dataset.

    TRAINING DISTILLED MACHINE LEARNING MODELS
    46.
    发明公开

    公开(公告)号:US20240144109A1

    公开(公告)日:2024-05-02

    申请号:US18399358

    申请日:2023-12-28

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a distilled machine learning model. One of the methods includes training a cumbersome machine learning model, wherein the cumbersome machine learning model is configured to receive an input and generate a respective score for each of a plurality of classes; and training a distilled machine learning model on a plurality of training inputs, wherein the distilled machine learning model is also configured to receive inputs and generate scores for the plurality of classes, comprising: processing each training input using the cumbersome machine learning model to generate a cumbersome target soft output for the training input; and training the distilled machine learning model to, for each of the training inputs, generate a soft output that matches the cumbersome target soft output for the training input.

    Training distilled machine learning models

    公开(公告)号:US11900232B2

    公开(公告)日:2024-02-13

    申请号:US17863733

    申请日:2022-07-13

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a distilled machine learning model. One of the methods includes training a cumbersome machine learning model, wherein the cumbersome machine learning model is configured to receive an input and generate a respective score for each of a plurality of classes; and training a distilled machine learning model on a plurality of training inputs, wherein the distilled machine learning model is also configured to receive inputs and generate scores for the plurality of classes, comprising: processing each training input using the cumbersome machine learning model to generate a cumbersome target soft output for the training input; and training the distilled machine learning model to, for each of the training inputs, generate a soft output that matches the cumbersome target soft output for the training input.

    Speech recognition with attention-based recurrent neural networks

    公开(公告)号:US11151985B2

    公开(公告)日:2021-10-19

    申请号:US16713298

    申请日:2019-12-13

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media for speech recognition. One method includes obtaining an input acoustic sequence, the input acoustic sequence representing an utterance, and the input acoustic sequence comprising a respective acoustic feature representation at each of a first number of time steps, processing the input acoustic sequence using a first neural network to convert the input acoustic sequence into an alternative representation for the input acoustic sequence, processing the alternative representation for the input acoustic sequence using an attention-based Recurrent Neural Network (RNN) to generate, for each position in an output sequence order, a set of substring scores that includes a respective substring score for each substring in a set of substrings; and generating a sequence of substrings that represent a transcription of the utterance.

    Classifying videos using neural networks

    公开(公告)号:US11074454B1

    公开(公告)日:2021-07-27

    申请号:US16410863

    申请日:2019-05-13

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for classifying videos using neural networks. One of the methods includes obtaining a temporal sequence of video frames, wherein the temporal sequence comprises a respective video frame from a particular video at each of a plurality time steps; for each time step of the plurality of time steps: processing the video frame at the time step using a convolutional neural network to generate features of the video frame; and processing the features of the video frame using an LSTM neural network to generate a set of label scores for the time step and classifying the video as relating to one or more of the topics represented by labels in the set of labels from the label scores for each of the plurality of time steps.

Patent Agency Ranking