SEQUENCE-TO-SEQUENCE PREDICTION USING A NEURAL NETWORK MODEL

    Publication number: US20190130273A1

    Publication date: 2019-05-02

    Application number: US15884125

    Filing date: 2018-01-30

    Abstract: A method for sequence-to-sequence prediction using a neural network model includes generating an encoded representation based on an input sequence using an encoder of the neural network model and predicting an output sequence based on the encoded representation using a decoder of the neural network model. The neural network model includes a plurality of model parameters learned according to a machine learning process. At least one of the encoder or the decoder includes a branched attention layer. Each branch of the branched attention layer includes an interdependent scaling node configured to scale an intermediate representation of the branch by a learned scaling parameter. The learned scaling parameter depends on one or more other learned scaling parameters of one or more other interdependent scaling nodes of one or more other branches of the branched attention layer.
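
The interdependent scaling described in this abstract can be illustrated with a minimal sketch. The abstract only states that each branch's scaling parameter depends on those of the other branches; the softmax normalization and the summation of scaled branches used here are assumptions chosen for illustration, and the function names are hypothetical.

```python
import math


def interdependent_scales(raw_params):
    """Turn raw learned values into scales that sum to 1.

    Assumption: softmax coupling. Changing one raw value changes
    every scale, which is one way to make the scales interdependent.
    """
    peak = max(raw_params)
    exps = [math.exp(p - peak) for p in raw_params]  # shift for stability
    total = sum(exps)
    return [e / total for e in exps]


def combine_branches(branch_outputs, raw_params):
    """Scale each branch's intermediate vector and sum the results.

    Assumption: branches are merged by summation after scaling.
    """
    scales = interdependent_scales(raw_params)
    combined = [0.0] * len(branch_outputs[0])
    for scale, branch in zip(scales, branch_outputs):
        for i, value in enumerate(branch):
            combined[i] += scale * value
    return combined
```

With two branches and equal raw parameters, each branch receives a scale of 0.5, so the combined vector is the element-wise mean of the two branch outputs.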

    Spatial Attention Model for Image Captioning
    Patent application

    Publication number: US20180143966A1

    Publication date: 2018-05-24

    Application number: US15817153

    Filing date: 2017-11-17

    Abstract: The technology disclosed presents a novel spatial attention model that uses current hidden state information of a decoder long short-term memory (LSTM) to guide attention and to extract spatial image features for use in image captioning. The technology disclosed also presents a novel adaptive attention model for image captioning that mixes visual information from a convolutional neural network (CNN) and linguistic information from an LSTM. At each timestep, the adaptive attention model automatically decides how heavily to rely on the image, as opposed to the linguistic model, to emit the next caption word. The technology disclosed further adds a new auxiliary sentinel gate to an LSTM architecture and produces a sentinel LSTM (Sn-LSTM). The sentinel gate produces a visual sentinel at each timestep, which is an additional representation, derived from the LSTM's memory, of long and short term visual and linguistic information.
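
One way to picture how the adaptive attention model "decides how heavily to rely on the image" is a single softmax over the image regions plus the visual sentinel. The sketch below assumes that formulation; the abstract does not give the equations, so the scoring inputs, the joint softmax, and the names `adaptive_context` and `beta` are illustrative assumptions.

```python
import math


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_context(region_features, region_scores, sentinel, sentinel_score):
    """Mix spatial image features with the visual sentinel.

    Assumption: the sentinel competes with the image regions in one
    softmax. Its weight, beta, is the share of attention given to the
    language side; the remaining weight goes to the image regions.
    """
    weights = softmax(list(region_scores) + [sentinel_score])
    beta = weights[-1]
    context = [beta * x for x in sentinel]
    for weight, feature in zip(weights[:-1], region_features):
        for i, value in enumerate(feature):
            context[i] += weight * value
    return context, beta
```

A `beta` near 1 means the next word is driven mostly by the sentinel (linguistic memory); a `beta` near 0 means it is driven mostly by the attended image regions.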

    ENGAGEMENT ESTIMATOR
    Patent application
    Status: Under examination (published)

    Publication number: US20170032280A1

    Publication date: 2017-02-02

    Application number: US15221541

    Filing date: 2016-07-27

    Inventor: Richard SOCHER

    CPC classification number: G06N3/0454 G06N3/0445

    Abstract: A machine learning system may be implemented as a set of trained models. A set of trained models, for example, a deep learning system, is disclosed wherein one or more types of media input may be analyzed to determine an associated engagement of the one or more types of media input.

    Systems and Methods for Reading Comprehension for a Question Answering Task

    Publication number: US20230419050A1

    Publication date: 2023-12-28

    Application number: US18463019

    Filing date: 2023-09-07

    CPC classification number: G06F40/40 G06F40/30

    Abstract: Embodiments described herein provide a pipelined natural language question answering system that improves a BERT-based system. Specifically, the natural language question answering system uses a pipeline of neural networks each trained to perform a particular task. The context selection network identifies premium context from context for the question. The question type network identifies the natural language question as a yes, no, or span question and a yes or no answer to the natural language question when the question is a yes or no question. The span extraction model determines an answer span to the natural language question when the question is a span question.
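
The three-stage pipeline in this abstract can be sketched as a small control-flow function. The three networks are passed in as plain callables; their names and signatures are hypothetical stand-ins, not the interfaces of the patented system.

```python
from typing import Callable, List


def answer_question(
    question: str,
    paragraphs: List[str],
    select_context: Callable[[str, List[str]], List[str]],
    classify_question: Callable[[str, List[str]], str],
    extract_span: Callable[[str, List[str]], str],
) -> str:
    """Run the stages in order: select context, type the question, extract."""
    # Stage 1: keep only the context judged relevant to the question.
    selected = select_context(question, paragraphs)
    # Stage 2: label the question "yes", "no", or "span".
    label = classify_question(question, selected)
    if label in ("yes", "no"):
        # For yes/no questions the label is already the answer.
        return label
    # Stage 3: only span questions reach the span extraction model.
    return extract_span(question, selected)
```

Keeping the stages as separate callables mirrors the abstract's point that each network is trained for one particular task.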

    PREDICTION-CORRECTION APPROACH TO ZERO SHOT LEARNING

    Publication number: US20210365740A1

    Publication date: 2021-11-25

    Application number: US17397677

    Filing date: 2021-08-09

    Abstract: Approaches to zero-shot learning include partitioning training data into first and second sets according to classes assigned to the training data, training a prediction module based on the first set to predict a cluster center based on a class label, training a correction module based on the second set and each of the class labels in the first set to generate a correction to a cluster center predicted by the prediction module, presenting a new class label for a new class to the prediction module to predict a new cluster center, presenting the new class label, the predicted new cluster center, and each of the class labels in the first set to the correction module to generate a correction for the predicted new cluster center, augmenting a classifier based on the corrected cluster center for the new class, and classifying input data into the new class using the classifier.
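
The inference-time part of this approach (predict a center for a new class, correct it, then augment the classifier) can be sketched as follows. The abstract does not say how the correction is applied or what kind of classifier is used; the additive correction and the nearest-center classifier below are assumptions, and both modules are represented by hypothetical callables.

```python
import math


def corrected_center(label, seen_labels, predict_center, correct_center):
    """Predict a cluster center for a new label, then apply a correction.

    Assumption: the correction is a vector added to the predicted center.
    """
    predicted = predict_center(label)
    correction = correct_center(label, predicted, seen_labels)
    return [p + c for p, c in zip(predicted, correction)]


def classify(point, centers):
    """Assign a point to the class with the closest cluster center.

    Assumption: a nearest-center classifier stands in for the
    augmented classifier mentioned in the abstract.
    """
    return min(centers, key=lambda name: math.dist(point, centers[name]))
```

Adding a new class is then a matter of storing its corrected center, for example `centers["fox"] = corrected_center("fox", list(centers), predict, correct)`, where `predict` and `correct` are the trained modules.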

    STRUCTURED TEXT TRANSLATION
    Patent application

    Publication number: US20210216728A1

    Publication date: 2021-07-15

    Application number: US17214691

    Filing date: 2021-03-26

    Abstract: Approaches for the translation of structured text include an embedding module for encoding and embedding source text in a first language, an encoder for encoding output of the embedding module, a decoder for iteratively decoding output of the encoder based on generated tokens in translated text from previous iterations, a beam module for constraining output of the decoder with respect to possible embedded tags to include in the translated text for a current iteration using a beam search, and a layer for selecting a token to be included in the translated text for the current iteration. The translated text is in a second language different from the first language. In some embodiments, the approach further includes scoring and pointer modules for selecting the token based on the output of the beam module or copied from the source text or reference text from a training pair best matching the source text.

    Interpretable Counting in Visual Question Answering

    Publication number: US20200175305A1

    Publication date: 2020-06-04

    Application number: US16781179

    Filing date: 2020-02-04

    Abstract: Approaches for interpretable counting for visual question answering include a digital image processor, a language processor, a scorer, and a counter. The digital image processor identifies objects in an image, maps the identified objects into an embedding space, generates a bounding box for each of the identified objects, and outputs the embedded objects paired with their bounding boxes. The language processor embeds a question into the embedding space. The scorer determines scores for the identified objects. Each respective score determines how well a corresponding one of the identified objects is responsive to the question. The counter determines a count of the objects in the digital image that are responsive to the question based on the scores. The count and a corresponding bounding box for each object included in the count are output. In some embodiments, the counter determines the count interactively based on interactions between counted and uncounted objects.
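
The interactive counting step can be pictured as a greedy loop in which each counted object changes the scores of the objects not yet counted. In the sketch below, the interaction is a hand-written overlap penalty (intersection over union) and the stopping rule is a fixed threshold; both are assumptions standing in for whatever learned interaction the patent uses, and the function names are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    width = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    height = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = width * height
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def count_objects(scored_boxes, threshold=0.5):
    """Count objects one at a time, returning the count and their boxes.

    scored_boxes: list of (score, box) pairs produced by a scorer.
    Assumption: counting an object scales down the score of every
    uncounted object by how much it overlaps the counted one.
    """
    remaining = list(scored_boxes)
    counted = []
    while remaining:
        best = max(range(len(remaining)), key=lambda i: remaining[i][0])
        score, box = remaining.pop(best)
        if score < threshold:
            break  # nothing left that is responsive enough to count
        counted.append(box)
        remaining = [(s * (1.0 - iou(box, b)), b) for s, b in remaining]
    return len(counted), counted
```

Because the function returns the boxes alongside the count, each counted object can be traced back to a region of the image, which is what makes the count interpretable.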
