-
Publication No.: US20200372319A1
Publication Date: 2020-11-26
Application No.: US16559196
Filing Date: 2019-09-03
Applicant: salesforce.com, inc.
Inventor: Lichao SUN , Kazuma HASHIMOTO , Jia LI , Richard SOCHER , Caiming XIONG
Abstract: A method for evaluating the robustness of one or more target neural network models using natural typos. The method includes receiving one or more natural typo generation rules associated with a first task and a first input document type, receiving a first target neural network model, and receiving a first document and its corresponding ground truth labels. The method further includes generating one or more natural typos for the first document based on the one or more natural typo generation rules, and providing, to the first target neural network model, a test document generated from the first document and the one or more natural typos as an input document to generate a first output. A robustness evaluation result of the first target neural network model is generated based on a comparison between the first output and the ground truth labels.
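The evaluation loop the abstract describes can be sketched as follows. The two typo rules below are illustrative keyboard-slip rules, not the patent's actual rule set, and the per-token model interface is an assumption for the example.

```python
import random

def swap_adjacent(word, rng):
    """Swap two adjacent characters (a common keyboard slip)."""
    if len(word) < 2:
        return word
    i = rng.randrange(len(word) - 1)
    return word[:i] + word[i + 1] + word[i] + word[i + 2:]

def drop_char(word, rng):
    """Drop one character from the word."""
    if len(word) < 2:
        return word
    i = rng.randrange(len(word))
    return word[:i] + word[i + 1:]

def generate_test_document(tokens, rules, typo_rate=0.3, seed=0):
    """Perturb each token with probability typo_rate using a random rule."""
    rng = random.Random(seed)
    return [rng.choice(rules)(tok, rng) if rng.random() < typo_rate else tok
            for tok in tokens]

def robustness_score(model, tokens, labels, rules, **kwargs):
    """Accuracy of the model's per-token predictions on the perturbed document."""
    test_doc = generate_test_document(tokens, rules, **kwargs)
    preds = model(test_doc)
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)
```

A lower `robustness_score` on the perturbed document than on the clean one indicates the target model is sensitive to natural typos.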
-
Publication No.: US20200285993A1
Publication Date: 2020-09-10
Application No.: US16653890
Filing Date: 2019-10-15
Applicant: salesforce.com, inc.
Inventor: Hao LIU , Richard SOCHER , Caiming XIONG
Abstract: Systems and methods are provided for efficient off-policy credit assignment (ECA) in reinforcement learning. ECA allows principled credit assignment for off-policy samples, and therefore improves sample efficiency and asymptotic performance. One aspect of ECA is to formulate the optimization of expected return as approximate inference, where the policy approximates a learned prior distribution, which leads to a principled way of utilizing off-policy samples. Other features are also provided.
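A generic sketch of principled off-policy reuse is self-normalized importance weighting: each off-policy trajectory's return is weighted by the ratio of current-policy to behavior-policy likelihood. The exact weighting ECA derives from its approximate-inference view differs; this is only a minimal stand-in.

```python
import numpy as np

def off_policy_return(logp_behavior, logp_current, returns):
    """Self-normalized importance-weighted estimate of expected return.

    logp_behavior / logp_current: per-trajectory log-likelihoods under the
    behavior policy that collected the samples and the current policy.
    """
    w = np.exp(np.asarray(logp_current) - np.asarray(logp_behavior))
    w = w / w.sum()  # self-normalization keeps the estimate bounded
    return float(w @ np.asarray(returns))
```

When the two policies agree, the estimate reduces to the plain average of the sampled returns.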
-
Publication No.: US20190130248A1
Publication Date: 2019-05-02
Application No.: US15881582
Filing Date: 2018-01-26
Applicant: salesforce.com, inc.
Inventor: Victor Zhong , Caiming XIONG , Richard SOCHER
Abstract: A computer-implemented method for dual sequence inference using a neural network model includes generating a codependent representation based on a first input representation of a first sequence and a second input representation of a second sequence using an encoder of the neural network model and generating an inference based on the codependent representation using a decoder of the neural network model. The neural network model includes a plurality of model parameters learned according to a machine learning process. The encoder includes a plurality of coattention layers arranged sequentially, each coattention layer being configured to receive a pair of layer input representations and generate one or more summary representations, and an output layer configured to receive the one or more summary representations from a last layer among the plurality of coattention layers and generate the codependent representation.
-
Publication No.: US20180129938A1
Publication Date: 2018-05-10
Application No.: US15421193
Filing Date: 2017-01-31
Applicant: salesforce.com, inc.
Inventor: Caiming XIONG , Victor ZHONG , Richard SOCHER
CPC classification number: G06N3/08 , G06N3/0445 , G06N3/0454 , G06N5/022 , G06N5/04
Abstract: The technology disclosed relates to an end-to-end neural network for question answering, referred to herein as “dynamic coattention network (DCN)”. Roughly described, the DCN includes an encoder neural network and a coattentive encoder that capture the interactions between a question and a document in a so-called “coattention encoding”. The DCN also includes a decoder neural network and highway maxout networks that process the coattention encoding to estimate start and end positions of a phrase in the document that responds to the question.
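A coattention encoding of the kind the abstract describes can be sketched with an affinity matrix between question and document words, attended in both directions. The dimensions and the exact concatenation below are illustrative, not the patent's precise formulation.

```python
import numpy as np

def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def coattention(D, Q):
    """D: (m, d) document encodings; Q: (n, d) question encodings.

    Returns an (m, 2d) coattention encoding built from the
    word-pair affinity matrix.
    """
    L = D @ Q.T                # (m, n) affinity between every word pair
    A_Q = softmax(L, axis=0)   # attention over document words, per question word
    A_D = softmax(L, axis=1)   # attention over question words, per document word
    C_Q = A_Q.T @ D            # (n, d) document summaries for each question word
    # Attend over both the question and its document summaries:
    C_D = A_D @ np.concatenate([Q, C_Q], axis=1)  # (m, 2d)
    return C_D
```

In the DCN, a decoder with highway maxout networks then reads such an encoding to predict answer start and end positions.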
-
Publication No.: US20200372341A1
Publication Date: 2020-11-26
Application No.: US16695494
Filing Date: 2019-11-26
Applicant: salesforce.com, inc.
Inventor: Akari ASAI , Kazuma HASHIMOTO , Richard SOCHER , Caiming XIONG
Abstract: Embodiments described herein provide a pipelined natural language question answering system that improves a BERT-based system. Specifically, the natural language question answering system uses a pipeline of neural networks, each trained to perform a particular task. The context selection network identifies premium context, from the available context, for the question. The question type network classifies the natural language question as a yes, no, or span question, and produces the yes or no answer when the question is a yes or no question. The span extraction model determines an answer span for the natural language question when the question is a span question.
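The control flow of such a pipeline can be sketched with three stage functions standing in for the trained networks; their names and signatures are assumptions for illustration.

```python
def answer(question, context, select_context, classify_question, extract_span):
    """Pipeline sketch over three stand-in stage functions.

    select_context: picks the premium context for the question.
    classify_question: returns (question_type, yes_no_answer).
    extract_span: extracts an answer span for span questions.
    """
    premium = select_context(question, context)
    question_type, yes_no_answer = classify_question(question, premium)
    if question_type in ("yes", "no"):
        return yes_no_answer              # answered directly by the type network
    return extract_span(question, premium)  # span question: delegate to extractor
```

Each stage only sees the output of the previous one, which is what lets the networks be trained for their particular tasks.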
-
Publication No.: US20190149834A1
Publication Date: 2019-05-16
Application No.: US15874515
Filing Date: 2018-01-18
Applicant: salesforce.com, inc.
Inventor: Yingbo Zhou , Luowei ZHOU , Caiming XIONG , Richard SOCHER
IPC: H04N19/46 , H04N19/44 , H04N19/60 , H04N19/187 , H04N19/132 , H04N19/33 , H04N19/126 , H04N21/488 , H04N21/81
CPC classification number: H04N19/46 , H04N19/126 , H04N19/132 , H04N19/187 , H04N19/33 , H04N19/44 , H04N19/60 , H04N21/4884 , H04N21/8126
Abstract: Systems and methods for dense captioning of a video include a multi-layer encoder stack configured to receive information extracted from a plurality of video frames, a proposal decoder coupled to the encoder stack and configured to receive one or more outputs from the encoder stack, a masking unit configured to mask the one or more outputs from the encoder stack according to one or more outputs from the proposal decoder, and a decoder stack coupled to the masking unit and configured to receive the masked one or more outputs from the encoder stack. The dense captioning is generated based on one or more outputs of the decoder stack. In some embodiments, the one or more outputs from the proposal decoder include a differentiable mask. In some embodiments, during training, error in the dense captioning is back propagated to the decoder stack, the encoder stack, and the proposal decoder.
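A differentiable mask can be built from sigmoids so that gradients flow back to the proposal's center and length; the particular parameterization below is an illustrative assumption, not the patent's exact construction.

```python
import numpy as np

def differentiable_mask(positions, center, length, sharpness=10.0):
    """Continuous gate over frame positions: ~1 inside the proposed event
    [center - length/2, center + length/2], ~0 outside. Built from two
    sigmoids, so it is differentiable in center and length.
    """
    left = 1.0 / (1.0 + np.exp(-sharpness * (positions - (center - length / 2))))
    right = 1.0 / (1.0 + np.exp(-sharpness * ((center + length / 2) - positions)))
    return left * right

def mask_encoder_outputs(enc, center, length):
    """enc: (T, d) encoder-stack outputs for T frames; returns the
    proposal-masked features fed to the caption decoder stack."""
    positions = np.arange(enc.shape[0], dtype=float)
    m = differentiable_mask(positions, center, length)
    return enc * m[:, None]
```

Because the gate is smooth rather than a hard 0/1 window, captioning error can be back-propagated through the mask into the proposal decoder.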
-
Publication No.: US20190130896A1
Publication Date: 2019-05-02
Application No.: US15851579
Filing Date: 2017-12-21
Applicant: salesforce.com, inc.
Inventor: Yingbo ZHOU , Caiming XIONG , Richard SOCHER
IPC: G10L15/06 , G10L15/24 , G10L13/033 , G10L13/04 , G10L15/20
Abstract: The disclosed technology teaches regularizing a deep end-to-end speech recognition model to reduce overfitting and improve generalization: synthesizing sample speech variations on original speech samples labelled with text transcriptions, and modifying a particular original speech sample to independently vary tempo and pitch of the original speech sample while retaining the labelled text transcription of the original speech sample, thereby producing multiple sample speech variations having multiple degrees of variation from the original speech sample. The disclosed technology includes training a deep end-to-end speech recognition model, on thousands to millions of original speech samples and the sample speech variations on the original speech samples, that outputs recognized text transcriptions corresponding to speech detected in the original speech samples and the sample speech variations. Additional sample speech variations include augmented volume, temporal alignment offsets and the addition of pseudo-random noise to the particular original speech sample.
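The augmentation step can be sketched as below. Note the hedge in `change_tempo`: naive resampling couples tempo and pitch, whereas the disclosed technology varies them independently (typically done with a phase vocoder); the helpers here are illustrative.

```python
import numpy as np

def change_tempo(wave, rate):
    """Resample the waveform to play faster (rate > 1) or slower (rate < 1).
    Naive resampling shifts pitch too; independent tempo/pitch control
    needs a phase-vocoder-style method."""
    n_out = int(len(wave) / rate)
    idx = np.linspace(0, len(wave) - 1, n_out)
    return np.interp(idx, np.arange(len(wave)), wave)

def add_noise(wave, snr_db, rng):
    """Add pseudo-random Gaussian noise at a target signal-to-noise ratio."""
    sig_pow = np.mean(wave ** 2)
    noise_pow = sig_pow / (10 ** (snr_db / 10))
    return wave + rng.normal(0.0, np.sqrt(noise_pow), len(wave))

def augment(wave, transcription, rng):
    """Yield (variation, label) pairs; every variation keeps the
    original labelled text transcription."""
    yield change_tempo(wave, rng.uniform(0.9, 1.1)), transcription
    yield add_noise(wave, snr_db=20, rng=rng), transcription
    yield wave * rng.uniform(0.8, 1.2), transcription  # volume augmentation
```

Each original sample thus expands into several labelled variations for training the end-to-end recognizer.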
-
Publication No.: US20190130206A1
Publication Date: 2019-05-02
Application No.: US15882220
Filing Date: 2018-01-29
Applicant: salesforce.com, inc.
Inventor: Alexander Richard Trott , Caiming XIONG , Richard SOCHER
Abstract: Approaches for interpretable counting for visual question answering include a digital image processor, a language processor, a scorer, and a counter. The digital image processor identifies objects in an image, maps the identified objects into an embedding space, generates bounding boxes for each of the identified objects, and outputs the embedded objects paired with their bounding boxes. The language processor embeds a question into the embedding space. The scorer determines scores for the identified objects. Each respective score determines how well a corresponding one of the identified objects is responsive to the question. The counter determines a count of the objects in the digital image that are responsive to the question based on the scores. The count and a corresponding bounding box for each object included in the count are output. In some embodiments, the counter determines the count interactively based on interactions between counted and uncounted objects.
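A greedy counting pass can be sketched as follows: take objects in score order and skip any box that overlaps an already-counted one. This non-maximum-suppression-style rule is a simple stand-in for the learned counted/uncounted interactions the abstract describes.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def interactive_count(boxes, scores, score_thresh=0.5, iou_thresh=0.5):
    """Count high-scoring objects greedily, skipping duplicates.

    Returns the count and the kept boxes, so each counted object
    remains interpretable via its bounding box.
    """
    counted = []
    for box, s in sorted(zip(boxes, scores), key=lambda t: -t[1]):
        if s < score_thresh:
            break  # remaining objects are not responsive enough
        if all(iou(box, c) < iou_thresh for c in counted):
            counted.append(box)
    return len(counted), counted
```

Returning the boxes alongside the count is what makes the result interpretable: each unit of the count is grounded in a visible region.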
-
Publication No.: US20200285705A1
Publication Date: 2020-09-10
Application No.: US16399871
Filing Date: 2019-04-30
Applicant: salesforce.com, inc.
Inventor: Stephan ZHENG , Wojciech KRYSCINSKI , Michael SHUM , Richard SOCHER , Caiming XIONG
Abstract: Approaches for determining a response for an agent in an undirected dialogue are provided. The approaches include a dialogue generating framework comprising an encoder neural network, a decoder neural network, and a language model neural network. The dialogue generating framework generates a sketch sentence response with at least one slot. The sketch sentence response is generated word by word and takes into account the undirected dialogue and agent traits of the agent making the response. The dialogue generating framework generates sentence responses by filling the slot with words from the agent traits. The dialogue generating framework ranks the sentence responses according to perplexity by passing the sentence responses through a language model and selects a final response which is a sentence response that has a lowest perplexity.
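The fill-and-rank step can be sketched with a toy unigram language model standing in for the framework's neural language model; the slot token and probability table are assumptions for the example.

```python
import math

def fill_slot(sketch, slot, trait_words):
    """One candidate response per trait word substituted into the slot."""
    return [[w if tok == slot else tok for tok in sketch] for w in trait_words]

def perplexity(sentence, unigram_probs, floor=1e-4):
    """Toy unigram-LM perplexity; unseen words get a small floor probability."""
    logp = sum(math.log(unigram_probs.get(w, floor)) for w in sentence)
    return math.exp(-logp / len(sentence))

def select_response(sketch, slot, trait_words, unigram_probs):
    """Rank the filled candidates by perplexity and return the lowest."""
    candidates = fill_slot(sketch, slot, trait_words)
    return min(candidates, key=lambda s: perplexity(s, unigram_probs))
```

The lowest-perplexity candidate is the one the language model finds most fluent, which is the selection rule the abstract describes.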
-
Publication No.: US20200184020A1
Publication Date: 2020-06-11
Application No.: US16264392
Filing Date: 2019-01-31
Applicant: salesforce.com, inc.
Inventor: Kazuma HASHIMOTO , Raffaella BUSCHIAZZO , James BRADBURY , Teresa MARSHALL , Caiming XIONG , Richard SOCHER
IPC: G06F17/28
Abstract: Approaches for the translation of structured text include an embedding module for encoding and embedding source text in a first language, an encoder for encoding output of the embedding module, a decoder for iteratively decoding output of the encoder based on generated tokens in translated text from previous iterations, a beam module for constraining output of the decoder with respect to possible embedded tags to include in the translated text for a current iteration using a beam search, and a layer for selecting a token to be included in the translated text for the current iteration. The translated text is in a second language different from the first language. In some embodiments, the approach further includes scoring and pointer modules for selecting the token based on the output of the beam module or copied from the source text or reference text from a training pair best matching the source text.
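The tag constraint can be sketched as a filter on beam candidates: plain words always pass, while tag tokens pass only if emitting them keeps the translated markup well-formed. The stack-based rule below is an illustrative simplification of the beam module's constraint.

```python
def legal_next_tags(open_stack, tag_inventory):
    """Tags the decoder may emit next: any open tag from the source's tag
    inventory, plus only the close tag matching the most recently
    opened element."""
    allowed = {f"<{t}>" for t in tag_inventory}
    if open_stack:
        allowed.add(f"</{open_stack[-1]}>")
    return allowed

def constrain_beam(candidates, open_stack, tag_inventory):
    """Filter (token, score) beam candidates against the legal tag set.

    Plain words always survive; tag tokens survive only when legal.
    """
    legal = legal_next_tags(open_stack, tag_inventory)
    return [(tok, score) for tok, score in candidates
            if not tok.startswith("<") or tok in legal]
```

Applying this filter at every decoding step guarantees the beam never produces mismatched or invented tags in the translated text.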