-
Publication Number: US20180144208A1
Publication Date: 2018-05-24
Application Number: US15817161
Filing Date: 2017-11-17
Applicant: salesforce.com, inc.
Inventor: Jiasen LU , Caiming XIONG , Richard SOCHER
Abstract: The technology disclosed presents a novel spatial attention model that uses current hidden state information of a decoder long short-term memory (LSTM) to guide attention and to extract spatial image features for use in image captioning. The technology disclosed also presents a novel adaptive attention model for image captioning that mixes visual information from a convolutional neural network (CNN) and linguistic information from an LSTM. At each timestep, the adaptive attention model automatically decides how heavily to rely on the image, as opposed to the linguistic model, to emit the next caption word. The technology disclosed further adds a new auxiliary sentinel gate to an LSTM architecture and produces a sentinel LSTM (Sn-LSTM). The sentinel gate produces a visual sentinel at each timestep, which is an additional representation, derived from the LSTM's memory, of long-term and short-term visual and linguistic information.
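A minimal PyTorch sketch of the adaptive-attention idea summarized above, not the patented implementation: an LSTM decoder step with an auxiliary sentinel gate whose visual sentinel competes with CNN spatial features for attention weight. Module names, tensor sizes, and the single-layer attention parameterization are illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SentinelLSTMStep(nn.Module):
    """One step of an LSTM augmented with an auxiliary sentinel gate."""
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.cell = nn.LSTMCell(input_size, hidden_size)
        self.gate_x = nn.Linear(input_size, hidden_size)
        self.gate_h = nn.Linear(hidden_size, hidden_size)

    def forward(self, x, state):
        h_prev, _ = state
        h, c = self.cell(x, state)
        g = torch.sigmoid(self.gate_x(x) + self.gate_h(h_prev))  # sentinel gate
        s = g * torch.tanh(c)                                    # visual sentinel
        return h, c, s

class AdaptiveAttention(nn.Module):
    """Mixes k spatial image features with the visual sentinel, guided by the decoder state."""
    def __init__(self, hidden_size, attn_size=64):
        super().__init__()
        self.w_v = nn.Linear(hidden_size, attn_size)
        self.w_s = nn.Linear(hidden_size, attn_size)
        self.w_h = nn.Linear(hidden_size, attn_size)
        self.w_a = nn.Linear(attn_size, 1)

    def forward(self, V, s, h):
        # V: (batch, k, hidden) projected CNN features; s, h: (batch, hidden)
        z_img = self.w_a(torch.tanh(self.w_v(V) + self.w_h(h).unsqueeze(1))).squeeze(-1)
        z_sen = self.w_a(torch.tanh(self.w_s(s) + self.w_h(h)))
        alpha = F.softmax(torch.cat([z_img, z_sen], dim=1), dim=1)   # (batch, k + 1)
        beta = alpha[:, -1:]                                         # weight on the sentinel
        context = (alpha[:, :-1].unsqueeze(-1) * V).sum(dim=1) + beta * s
        return context, beta

# Toy usage: one decoding step over 49 spatial features.
batch, k, hidden = 2, 49, 512
step = SentinelLSTMStep(input_size=hidden, hidden_size=hidden)
attend = AdaptiveAttention(hidden_size=hidden)
x = torch.randn(batch, hidden)                       # embedded previous word
state = (torch.zeros(batch, hidden), torch.zeros(batch, hidden))
V = torch.randn(batch, k, hidden)
h, c, s = step(x, state)
context, beta = attend(V, s, h)                      # beta ~ how little the model looks at the image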
-
Publication Number: US20180129931A1
Publication Date: 2018-05-10
Application Number: US15420801
Filing Date: 2017-01-31
Applicant: salesforce.com, inc.
Inventor: James BRADBURY , Stephen Joseph MERITY , Caiming XIONG , Richard SOCHER
IPC: G06N3/04
CPC classification number: G06N3/08 , G06F17/16 , G06F17/20 , G06F17/2715 , G06F17/2785 , G06F17/2818 , G06N3/04 , G06N3/0445 , G06N3/0454 , G06N3/10 , G10L15/16 , G10L15/18 , G10L15/1815 , G10L25/30
Abstract: The technology disclosed provides a quasi-recurrent neural network (QRNN) encoder-decoder model that alternates convolutional layers, which apply in parallel across timesteps, and minimalist recurrent pooling layers that apply in parallel across feature dimensions.
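A minimal sketch of a QRNN layer along these lines, assuming a causal (masked) convolution and elementwise "fo"-style pooling: the convolution applies in parallel across timesteps, while the pooling recurrence is elementwise and so parallelizes across feature dimensions. Sizes and gate choices are illustrative, not the patented encoder-decoder.

import torch
import torch.nn as nn
import torch.nn.functional as F

class QRNNLayer(nn.Module):
    """Convolution over timesteps (parallel in time) followed by minimalist recurrent pooling."""
    def __init__(self, in_size, hidden_size, kernel=2):
        super().__init__()
        self.kernel = kernel
        # one convolution produces candidate z, forget gate f, and output gate o
        self.conv = nn.Conv1d(in_size, 3 * hidden_size, kernel)

    def forward(self, x):
        # x: (batch, time, in_size); left-pad so the convolution is causal (masked)
        xp = F.pad(x.transpose(1, 2), (self.kernel - 1, 0))
        z, f, o = self.conv(xp).transpose(1, 2).chunk(3, dim=-1)
        z, f, o = torch.tanh(z), torch.sigmoid(f), torch.sigmoid(o)
        # fo-pooling: elementwise recurrence, parallel across feature dimensions
        c = torch.zeros_like(z[:, 0])
        hs = []
        for t in range(z.size(1)):
            c = f[:, t] * c + (1 - f[:, t]) * z[:, t]
            hs.append(o[:, t] * c)
        return torch.stack(hs, dim=1)

layer = QRNNLayer(in_size=32, hidden_size=64)
out = layer(torch.randn(8, 20, 32))   # (8, 20, 64)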
-
Publication Number: US20180121799A1
Publication Date: 2018-05-03
Application Number: US15421431
Filing Date: 2017-01-31
Applicant: salesforce.com, inc.
Inventor: Kazuma HASHIMOTO , Caiming XIONG , Richard SOCHER
CPC classification number: G06N3/04 , G06F17/20 , G06F17/2705 , G06F17/2715 , G06F17/274 , G06F17/277 , G06F17/2785 , G06N3/0445 , G06N3/0454 , G06N3/0472 , G06N3/063 , G06N3/08 , G06N3/084 , G10L15/16 , G10L15/18 , G10L25/30
Abstract: The technology disclosed provides a so-called “joint many-task neural network model” to solve a variety of increasingly complex natural language processing (NLP) tasks using growing depth of layers in a single end-to-end model. The model is successively trained by considering linguistic hierarchies, directly connecting word representations to all model layers, explicitly using predictions in lower tasks, and applying a so-called “successive regularization” technique to prevent catastrophic forgetting. Three examples of lower level model layers are a part-of-speech (POS) tagging layer, a chunking layer, and a dependency parsing layer. Two examples of higher level model layers are a semantic relatedness layer and a textual entailment layer. The model achieves state-of-the-art results on chunking, dependency parsing, semantic relatedness, and textual entailment.
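A toy sketch of the layered, shortcut-connected design for the two lowest tasks (POS tagging and chunking), plus a successive-regularization penalty; layer sizes, label counts, and the use of softmax label distributions as label embeddings are assumptions for illustration only.

import torch
import torch.nn as nn

class JMTSketch(nn.Module):
    """Stacked task layers; every layer also sees the word embeddings (shortcut connections)."""
    def __init__(self, vocab, emb=100, hidden=100, n_pos=45, n_chunk=23):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.pos_lstm = nn.LSTM(emb, hidden, batch_first=True, bidirectional=True)
        self.pos_out = nn.Linear(2 * hidden, n_pos)
        # chunking layer consumes word embeddings + POS-layer states + POS label distribution
        self.chunk_lstm = nn.LSTM(emb + 2 * hidden + n_pos, hidden, batch_first=True, bidirectional=True)
        self.chunk_out = nn.Linear(2 * hidden, n_chunk)

    def forward(self, tokens):
        e = self.embed(tokens)
        h_pos, _ = self.pos_lstm(e)
        pos_logits = self.pos_out(h_pos)
        chunk_in = torch.cat([e, h_pos, torch.softmax(pos_logits, -1)], dim=-1)
        h_chunk, _ = self.chunk_lstm(chunk_in)
        return pos_logits, self.chunk_out(h_chunk)

def successive_regularization(model, prev_params, delta=1e-2):
    """Penalize drift from the parameters learned for the previous task (anti-forgetting)."""
    return delta * sum(((p - q) ** 2).sum() for p, q in zip(model.parameters(), prev_params))

model = JMTSketch(vocab=1000)
prev = [p.detach().clone() for p in model.parameters()]
pos_logits, chunk_logits = model(torch.randint(0, 1000, (4, 12)))
reg = successive_regularization(model, prev)   # add this term to the chunking-task loss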
-
Publication Number: US20230419050A1
Publication Date: 2023-12-28
Application Number: US18463019
Filing Date: 2023-09-07
Applicant: salesforce.com, inc.
Inventor: Akari ASAI , Kazuma HASHIMOTO , Richard SOCHER , Caiming XIONG
IPC: G06F40/40
Abstract: Embodiments described herein provide a pipelined natural language question answering system that improves on a BERT-based system. Specifically, the natural language question answering system uses a pipeline of neural networks, each trained to perform a particular task. The context selection network identifies premium context for the question from the available context. The question type network classifies the natural language question as a yes, no, or span question, and provides the yes or no answer when the question is a yes or no question. The span extraction model determines an answer span to the natural language question when the question is a span question.
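A control-flow sketch of the three-stage pipeline, with each BERT-style network replaced by a small stand-in encoder; the module names, stub encoders, and label ordering (yes / no / span) are assumptions, not the claimed system.

import torch
import torch.nn as nn

class PipelinedQA(nn.Module):
    """Pipeline of three stages; each is a stand-in for a separately trained network."""
    def __init__(self, vocab=5000, dim=64):
        super().__init__()
        self.embed = nn.EmbeddingBag(vocab, dim)
        self.context_scorer = nn.Linear(2 * dim, 1)    # stage 1: pick premium context
        self.type_head = nn.Linear(2 * dim, 3)         # stage 2: yes / no / span
        self.span_head = nn.Linear(dim, 2)             # stage 3: start / end logits per token

    def encode(self, ids):
        return self.embed(ids.unsqueeze(0)).squeeze(0)

    def forward(self, question_ids, contexts_ids):
        q = self.encode(question_ids)
        # Stage 1: context selection -- keep the highest-scoring passage
        scores = torch.stack([self.context_scorer(torch.cat([q, self.encode(c)])) for c in contexts_ids])
        best = contexts_ids[int(scores.argmax())]
        # Stage 2: question type (and the yes/no answer itself for non-span questions)
        qtype = int(self.type_head(torch.cat([q, self.encode(best)])).argmax())
        if qtype == 0:
            return "yes"
        if qtype == 1:
            return "no"
        # Stage 3: span extraction over the selected context
        token_vecs = self.embed.weight[best]           # stand-in for contextual token vectors
        start_end = self.span_head(token_vecs)
        start, end = int(start_end[:, 0].argmax()), int(start_end[:, 1].argmax())
        return best[min(start, end): max(start, end) + 1]

qa = PipelinedQA()
question = torch.randint(0, 5000, (6,))
contexts = [torch.randint(0, 5000, (30,)) for _ in range(3)]
print(qa(question, contexts))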
-
Publication Number: US20210365740A1
Publication Date: 2021-11-25
Application Number: US17397677
Filing Date: 2021-08-09
Applicant: salesforce.com, inc.
Inventor: Lily HU , Caiming XIONG , Richard SOCHER
Abstract: Approaches to zero-shot learning include partitioning training data into first and second sets according to the classes assigned to the training data. A prediction module is trained on the first set to predict a cluster center from a class label, and a correction module is trained on the second set, together with each of the class labels in the first set, to generate a correction to a cluster center predicted by the prediction module. A new class label for a new class is presented to the prediction module to predict a new cluster center, and the new class label, the predicted new cluster center, and each of the class labels in the first set are presented to the correction module to generate a correction for the predicted new cluster center. A classifier is augmented based on the corrected cluster center for the new class, and input data is classified into the new class using the classifier.
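A compact sketch of the prediction/correction split under stated assumptions: class labels are available as label embeddings, the correction module conditions on a mean summary of the seen labels rather than on each label individually, and the classifier is a nearest-center rule. None of these choices are taken from the claims.

import torch
import torch.nn as nn

label_dim, feat_dim = 50, 128                          # assumed sizes of label and feature embeddings

predictor = nn.Sequential(nn.Linear(label_dim, 128), nn.ReLU(), nn.Linear(128, feat_dim))
# correction module sees the new label, the predicted center, and a summary of the seen labels
corrector = nn.Sequential(nn.Linear(label_dim + feat_dim + label_dim, 128), nn.ReLU(), nn.Linear(128, feat_dim))

def corrected_center(new_label, seen_labels):
    pred = predictor(new_label)
    seen_summary = seen_labels.mean(dim=0)             # crude stand-in for conditioning on each seen label
    return pred + corrector(torch.cat([new_label, pred, seen_summary]))

class NearestCenterClassifier:
    def __init__(self):
        self.centers, self.names = [], []
    def add_class(self, name, center):
        self.names.append(name)
        self.centers.append(center)
    def classify(self, x):
        d = torch.stack([(x - c).norm() for c in self.centers])
        return self.names[int(d.argmin())]

clf = NearestCenterClassifier()
seen = torch.randn(10, label_dim)                      # labels of the classes in the first set
new_label = torch.randn(label_dim)                     # unseen class, described only by its label
clf.add_class("new_class", corrected_center(new_label, seen).detach())
print(clf.classify(torch.randn(feat_dim)))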
-
Publication Number: US20210216728A1
Publication Date: 2021-07-15
Application Number: US17214691
Filing Date: 2021-03-26
Applicant: salesforce.com, inc.
Inventor: Kazuma HASHIMOTO , Raffaella BUSCHIAZZO , James BRADBURY , Teresa MARSHALL , Caiming XIONG , Richard SOCHER
IPC: G06F40/58
Abstract: Approaches for the translation of structured text include an embedding module for encoding and embedding source text in a first language, an encoder for encoding output of the embedding module, a decoder for iteratively decoding output of the encoder based on generated tokens in translated text from previous iterations, a beam module for constraining output of the decoder with respect to possible embedded tags to include in the translated text for a current iteration using a beam search, and a layer for selecting a token to be included in the translated text for the current iteration. The translated text is in a second language different from the first language. In some embodiments, the approach further includes scoring and pointer modules for selecting the token based on the output of the beam module, or for copying the token from the source text or from reference text of a training pair that best matches the source text.
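A toy illustration of the beam-constraint idea on tagged text: hypotheses may only emit a closing tag that matches the most recently opened tag. The tiny vocabulary, tag set, and scoring are invented for the example and stand in for the embedding/encoder/decoder stack described above.

import torch

vocab = ["hello", "world", "<b>", "</b>", "<i>", "</i>"]
OPEN = {"<b>": "</b>", "<i>": "</i>"}

def allowed_mask(prefix):
    """Boolean mask over the vocabulary given the tags already generated in this hypothesis."""
    stack = []
    for tok in prefix:
        if tok in OPEN:
            stack.append(OPEN[tok])
        elif tok in OPEN.values() and stack and stack[-1] == tok:
            stack.pop()
    mask = torch.ones(len(vocab), dtype=torch.bool)
    for i, tok in enumerate(vocab):
        if tok in OPEN.values() and (not stack or stack[-1] != tok):
            mask[i] = False                     # closing tag without a matching open tag
    return mask

def beam_step(hypotheses, logits, beam_size=2):
    """Expand each (prefix, score) pair with constrained logits and keep the best beam_size."""
    candidates = []
    for (prefix, score), logit in zip(hypotheses, logits):
        logit = logit.masked_fill(~allowed_mask(prefix), float("-inf"))
        logp = torch.log_softmax(logit, dim=-1)
        for i in torch.topk(logp, beam_size).indices:
            candidates.append((prefix + [vocab[int(i)]], score + float(logp[i])))
    return sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_size]

hyps = [(["<b>"], 0.0), (["hello"], 0.0)]
print(beam_step(hyps, torch.randn(2, len(vocab))))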
-
Publication Number: US20210142164A1
Publication Date: 2021-05-13
Application Number: US16716249
Filing Date: 2019-12-16
Applicant: salesforce.com, inc.
Inventor: Linqing LIU , Caiming XIONG
Abstract: Systems and methods are provided that employ knowledge distillation under a multi-task learning setting. In some embodiments, the systems and methods are implemented with a larger teacher model and a smaller student model, each of which comprise one or more shared layers and a plurality of task layers for performing multiple tasks. During training of the teacher model, its shared layers are initialized, and then the teacher model is multi-task refined. The teacher model predicts teacher logits. During training of the student model, its shared layers are initialized. Knowledge distillation is employed to transfer knowledge from the teacher model to the student model by the student model updating its shared layers and task layers, for example, according to the teacher logits of the teacher model. Other features are also provided.
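A minimal sketch of multi-task knowledge distillation along these lines: a larger teacher and a smaller student each have shared layers and one head per task, and the student trains against a blend of hard labels and the teacher's softened logits. Temperature, mixing weight, and network sizes are illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiTaskNet(nn.Module):
    """Shared encoder plus one small head per task (sizes are illustrative)."""
    def __init__(self, in_dim, shared_dim, task_classes):
        super().__init__()
        self.shared = nn.Sequential(nn.Linear(in_dim, shared_dim), nn.ReLU())
        self.heads = nn.ModuleList([nn.Linear(shared_dim, c) for c in task_classes])
    def forward(self, x, task):
        return self.heads[task](self.shared(x))

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Blend hard-label cross-entropy with a soft KL term against the teacher's logits."""
    soft = F.kl_div(F.log_softmax(student_logits / T, -1),
                    F.softmax(teacher_logits / T, -1),
                    reduction="batchmean") * T * T
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

tasks = [3, 5]                                   # two tasks with 3 and 5 classes
teacher = MultiTaskNet(in_dim=32, shared_dim=256, task_classes=tasks)   # larger teacher
student = MultiTaskNet(in_dim=32, shared_dim=64, task_classes=tasks)    # smaller student
opt = torch.optim.Adam(student.parameters())

x, y, task = torch.randn(8, 32), torch.randint(0, 3, (8,)), 0
with torch.no_grad():
    t_logits = teacher(x, task)                  # teacher assumed already multi-task refined
loss = distillation_loss(student(x, task), t_logits, y)
loss.backward()
opt.step()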
-
Publication Number: US20200175305A1
Publication Date: 2020-06-04
Application Number: US16781179
Filing Date: 2020-02-04
Applicant: salesforce.com, inc.
Inventor: Alexander Richard TROTT , Caiming XIONG , Richard SOCHER
IPC: G06K9/46 , G06N3/04 , G06K9/00 , G06N5/04 , G06F16/332
Abstract: Approaches for interpretable counting for visual question answering include a digital image processor, a language processor, a scorer, and a counter. The digital image processor identifies objects in an image, maps the identified objects into an embedding space, generates bounding boxes for each of the identified objects, and outputs the embedded objects paired with their bounding boxes. The language processor embeds a question into the embedding space. The scorer determines scores for the identified objects. Each respective score indicates how well the corresponding identified object is responsive to the question. The counter determines a count of the objects in the digital image that are responsive to the question based on the scores. The count and a corresponding bounding box for each object included in the count are output. In some embodiments, the counter determines the count interactively based on interactions between counted and uncounted objects.
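A rough sketch of the score-then-count idea: a scorer rates each detected object's relevance to the question embedding, and a greedy counter adds objects by score while skipping near-duplicates via IoU, so every counted object keeps an associated bounding box. The scoring network, threshold, and IoU rule are assumptions for illustration, not the claimed counter.

import torch
import torch.nn as nn

class ObjectScorer(nn.Module):
    """Scores how well each detected object (embedding + box) answers the question embedding."""
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(2 * dim + 4, 64), nn.ReLU(), nn.Linear(64, 1))
    def forward(self, obj_embs, boxes, q_emb):
        q = q_emb.expand(obj_embs.size(0), -1)
        return self.score(torch.cat([obj_embs, boxes, q], dim=-1)).squeeze(-1)

def interactive_count(scores, boxes, threshold=0.5, iou_limit=0.5):
    """Greedy, interpretable counting: add objects by score, skipping near-duplicate boxes."""
    def iou(a, b):
        x1, y1 = torch.max(a[:2], b[:2])
        x2, y2 = torch.min(a[2:], b[2:])
        inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)
        def area(box):
            return (box[2] - box[0]) * (box[3] - box[1])
        return inter / (area(a) + area(b) - inter + 1e-6)
    kept = []
    for i in torch.argsort(scores, descending=True):
        if torch.sigmoid(scores[i]) < threshold:
            break
        if all(iou(boxes[i], boxes[j]) < iou_limit for j in kept):
            kept.append(i)
    return len(kept), [boxes[i] for i in kept]      # the count plus the boxes behind it

dim, n_obj = 128, 6
scorer = ObjectScorer(dim)
objs = torch.randn(n_obj, dim)
boxes = torch.rand(n_obj, 4).sort(dim=-1).values    # toy boxes with x1 <= x2, y1 <= y2
q = torch.randn(1, dim)
count, counted_boxes = interactive_count(scorer(objs, boxes, q), boxes)
print(count)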
-
Publication Number: US20190213482A1
Publication Date: 2019-07-11
Application Number: US16355290
Filing Date: 2019-03-15
Applicant: salesforce.com, inc.
Inventor: Richard SOCHER , Caiming XIONG , Kai Sheng TAI
Abstract: A method of classifying three-dimensional (3D) data includes receiving the 3D data and processing it using a neural network that includes a plurality of subnetworks arranged in a sequence, with the data processed through each of the subnetworks in turn. Each subnetwork is configured to receive an output generated by the preceding subnetwork in the sequence, process that output through a plurality of parallel 3D convolution layer paths of varying convolution volume, process it through a parallel pooling path, and concatenate the outputs of the 3D convolution layer paths and the pooling path to generate the subnetwork's output representation. After the data has been processed through the subnetworks, the method processes the output of the last subnetwork in the sequence through a vertical pooling layer to generate an output, and classifies the received 3D data based upon that output.
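A PyTorch sketch of one such subnetwork (parallel 3D convolution paths of varying kernel size plus a pooling path, concatenated) stacked in sequence; the final pooling here is a simple global max over the volume standing in for the vertical pooling layer, and all channel counts and kernel sizes are illustrative assumptions.

import torch
import torch.nn as nn

class Subnetwork3D(nn.Module):
    """Parallel 3D conv paths of varying receptive field plus a pooling path, concatenated."""
    def __init__(self, in_ch, ch=16):
        super().__init__()
        self.path1 = nn.Conv3d(in_ch, ch, kernel_size=1)
        self.path3 = nn.Conv3d(in_ch, ch, kernel_size=3, padding=1)
        self.path5 = nn.Conv3d(in_ch, ch, kernel_size=5, padding=2)
        self.pool = nn.Sequential(nn.MaxPool3d(3, stride=1, padding=1),
                                  nn.Conv3d(in_ch, ch, kernel_size=1))
    def forward(self, x):
        return torch.cat([self.path1(x), self.path3(x), self.path5(x), self.pool(x)], dim=1)

class VoxelClassifier(nn.Module):
    def __init__(self, n_classes=10, n_sub=2):
        super().__init__()
        subs, ch = [], 1
        for _ in range(n_sub):
            subs.append(Subnetwork3D(ch))
            ch = 4 * 16                                 # each subnetwork concatenates 4 paths
        self.subnetworks = nn.Sequential(*subs)
        self.classify = nn.Linear(ch, n_classes)
    def forward(self, voxels):
        x = self.subnetworks(voxels)                    # (batch, ch, D, H, W)
        x = x.amax(dim=(2, 3, 4))                       # global pooling stand-in for vertical pooling
        return self.classify(x)

model = VoxelClassifier()
logits = model(torch.randn(2, 1, 16, 16, 16))           # a batch of occupancy grids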
-
Publication Number: US20190130312A1
Publication Date: 2019-05-02
Application Number: US15885727
Filing Date: 2018-01-31
Applicant: salesforce.com, inc.
Inventor: Caiming XIONG , Tianmin SHU , Richard SOCHER
Abstract: The disclosed technology reveals a hierarchical policy network, for use by a software agent, to accomplish an objective that requires execution of multiple tasks. A terminal policy, learned by training the agent on a terminal task set, serves as a base policy of an intermediate policy learned by training the agent on an intermediate task set; the terminal task set serves as the base task set of the intermediate task set. The intermediate policy in turn serves as a base policy of a top policy learned by training the agent on a top task set, with the intermediate task set serving as the base task set of the top task set. The agent is configurable to accomplish the objective by traversal of the hierarchical policy network. A current task in a current task set is executed by executing a previously-learned task selected from the corresponding base task set, governed by the corresponding base policy, or by performing a primitive action selected from a library of primitive actions.
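A plain-Python sketch of traversing such a hierarchy: each policy either delegates to a previously-learned task in its base task set (governed by its base policy) or emits a primitive action. The task names, the random choices standing in for learned action distributions, and the 50/50 delegation rule are invented for the example.

import random

PRIMITIVES = ["move_forward", "turn_left", "turn_right", "pick_up"]

class Policy:
    """A policy over a task set: at each step it either invokes a previously-learned task
    from its base task set (governed by the base policy) or performs a primitive action."""
    def __init__(self, task_set, base_policy=None):
        self.task_set = task_set
        self.base_policy = base_policy            # None for the terminal policy

    def execute(self, task, depth=0):
        print("  " * depth + f"executing: {task}")
        if self.base_policy is None or random.random() < 0.5:
            action = random.choice(PRIMITIVES)    # stand-in for a learned action distribution
            print("  " * (depth + 1) + f"primitive: {action}")
        else:
            sub_task = random.choice(self.base_policy.task_set)   # reuse a previously-learned task
            self.base_policy.execute(sub_task, depth + 1)

terminal = Policy(task_set=["find key", "open door"])
intermediate = Policy(task_set=["enter room"], base_policy=terminal)
top = Policy(task_set=["fetch item from room"], base_policy=intermediate)

random.seed(0)
top.execute("fetch item from room")               # traverses the hierarchical policy network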