-
Publication No.: US11580359B2
Publication Date: 2023-02-14
Application No.: US16664508
Filing Date: 2019-10-25
Applicant: salesforce.com, inc.
Inventor: Stephen Joseph Merity , Caiming Xiong , James Bradbury , Richard Socher
IPC: G06N3/04 , G06N3/084 , G06F40/284 , G06N3/08 , G06N7/00
Abstract: The technology disclosed provides a so-called “pointer sentinel mixture architecture” for neural network sequence models that has the ability to either reproduce a token from a recent context or produce a token from a predefined vocabulary. In one implementation, a pointer sentinel-LSTM architecture achieves state of the art language modeling performance of 70.9 perplexity on the Penn Treebank dataset, while using far fewer parameters than a standard softmax LSTM.
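The mixture described in this abstract can be illustrated with a minimal Python sketch. It assumes the gating scheme commonly associated with pointer sentinel models: one softmax is taken over the pointer scores for the context positions plus a single sentinel score, and the sentinel's share of the probability mass acts as the gate on the vocabulary distribution. The function name, argument names, and toy numbers are hypothetical and are not taken from the patent.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pointer_sentinel_mixture(vocab_logits, pointer_scores, sentinel_score, context_ids):
    """Mix a vocabulary distribution with a pointer distribution over recent context.

    vocab_logits:   one score per vocabulary token
    pointer_scores: one score per position in the recent context
    sentinel_score: extra score competing with the pointer scores (assumed gating scheme)
    context_ids:    vocabulary id of the token at each context position
    """
    p_vocab = softmax(vocab_logits)
    # Joint softmax over context positions and the sentinel.
    joint = softmax(list(pointer_scores) + [sentinel_score])
    gate = joint[-1]  # mass kept for the vocabulary softmax
    mixed = [gate * p for p in p_vocab]
    # The remaining mass is copied onto the tokens seen in the context.
    for token_id, mass in zip(context_ids, joint[:-1]):
        mixed[token_id] += mass
    return mixed, gate


# Toy example: token 1 appears twice in the recent context.
mixed, gate = pointer_sentinel_mixture(
    vocab_logits=[0.1, 0.2, 0.3, 0.4],
    pointer_scores=[2.0, 0.5],
    sentinel_score=1.0,
    context_ids=[1, 1],
)
```

Because the pointer and sentinel scores share one softmax, the mixed distribution sums to one without a separate normalization step.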
-
Publication No.: US11562142B2
Publication Date: 2023-01-24
Application No.: US17187608
Filing Date: 2021-02-26
Applicant: salesforce.com, inc.
Inventor: Erik Lennart Nijkamp , Caiming Xiong
IPC: G06F40/279 , G06F40/58
Abstract: A machine learning based model generates a feature representation of a text sequence, for example, a natural language sentence or phrase. The system trains the machine learning based model by receiving an input text sequence and perturbing the input text sequence by masking a subset of tokens. The machine learning based model is used to predict the masked tokens. A predicted text sequence is generated based on the predictions of the masked tokens. The system processes the predicted text sequence using the machine learning based model to determine whether a token was predicted or an original token. The parameters of the machine learning based model are adjusted to minimize an aggregate loss based on prediction of the correct word for a masked token and a classification of a word as original or replaced.
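The aggregate loss mentioned in this abstract can be sketched as the sum of two terms: a cross-entropy term for predicting the correct word at each masked position, and a binary cross-entropy term for classifying each token as original or replaced. The sketch below is a hedged illustration only; the function name, the `weight` parameter, and the equal default weighting are assumptions, not details from the patent.

```python
import math


def aggregate_loss(mlm_probs, target_ids, masked_positions,
                   replaced_probs, is_replaced, weight=1.0):
    """Combine a masked-token loss with an original-vs-replaced loss.

    mlm_probs[i]      predicted distribution over the vocabulary at position i
    target_ids[i]     id of the original token at position i
    masked_positions  positions that were masked in the input
    replaced_probs[i] predicted probability that token i was replaced
    is_replaced[i]    True if token i in the predicted sequence differs from the original
    weight            hypothetical trade-off between the two terms
    """
    # Cross-entropy of the correct word, averaged over masked positions.
    mlm = -sum(math.log(mlm_probs[i][target_ids[i]]) for i in masked_positions)
    mlm /= len(masked_positions)

    # Binary cross-entropy of the original/replaced classification.
    bce = 0.0
    for prob, replaced in zip(replaced_probs, is_replaced):
        bce -= math.log(prob) if replaced else math.log(1.0 - prob)
    bce /= len(is_replaced)

    return mlm + weight * bce


# Toy example with a two-token vocabulary and one masked position.
loss = aggregate_loss(
    mlm_probs=[[0.5, 0.5], [0.25, 0.75]],
    target_ids=[0, 1],
    masked_positions=[1],
    replaced_probs=[0.5, 0.5],
    is_replaced=[False, True],
)
```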
-
Publication No.: US11531821B2
Publication Date: 2022-12-20
Application No.: US16993257
Filing Date: 2020-08-13
Applicant: salesforce.com, inc.
Inventor: Tian Xie , Xinyi Yang , Caiming Xiong , Wenhao Liu , Huan Wang , Wenpeng Yin , Jin Qu
Abstract: A system performs conversations with users using chatbots customized for performing a set of tasks. The system may be a multi-tenant system that allows customization of the chatbots for each tenant. The system processes sentences that may include negation or coreferences. The system determines a confidence score for an input sentence using an intent detection model, for example, a neural network. The system modifies the sentence to generate a modified sentence, for example, by removing a negation or by replacing a pronoun with an entity. The system generates a confidence score for the modified sentence using the intent detection model. The system determines the intent of the sentence based on the confidence scores of the sentence and the modified sentence. The system performs tasks based on the determined intent and performs conversations with users based on the tasks.
-
Publication No.: US11487939B2
Publication Date: 2022-11-01
Application No.: US16549985
Filing Date: 2019-08-23
Applicant: salesforce.com, inc.
Inventor: Tong Niu , Caiming Xiong , Richard Socher
IPC: G06F40/284 , G06N3/08 , H03M7/42 , H03M7/30 , G06F40/40
Abstract: Embodiments described herein provide a fully unsupervised model for text compression. Specifically, the unsupervised model is configured to identify an optimal deletion path for each input sequence of text (e.g., a sentence), and words from the input sequence are gradually deleted along the deletion path. To identify the optimal deletion path, the unsupervised model may adopt a pretrained bidirectional language model (BERT) to score each candidate deletion based on the average perplexity of the resulting sentence, and perform a simple greedy look-ahead tree search to select the best deletion at each step.
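The idea of a deletion path can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: it uses plain greedy selection with no look-ahead tree search, and it substitutes a hypothetical toy scoring function (`toy_score`, with a made-up per-word cost table) for the BERT-based average perplexity that the abstract describes.

```python
def greedy_deletion_path(tokens, score, min_len=1):
    """Delete one word per step, always keeping the lowest-scoring result.

    tokens:  list of words in the input sentence
    score:   callable mapping a token list to a float (lower is better);
             stands in for the average perplexity of the resulting sentence
    min_len: stop once the sentence has been shortened to this many tokens
    """
    current = list(tokens)
    path = [current]
    while len(current) > min_len:
        # Every sentence reachable by deleting exactly one word.
        candidates = [current[:i] + current[i + 1:] for i in range(len(current))]
        current = min(candidates, key=score)
        path.append(current)
    return path


# Hypothetical scorer: filler words are "costly", so removing them lowers the average.
TOY_COST = {"very": 3.0, "really": 3.0}


def toy_score(tokens):
    return sum(TOY_COST.get(t, 1.0) for t in tokens) / len(tokens)


path = greedy_deletion_path(["the", "very", "big", "dog"], toy_score, min_len=2)
```

Each entry in `path` is one step shorter than the previous one, so the list as a whole gives progressively more compressed versions of the sentence.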
-
Publication No.: US11481636B2
Publication Date: 2022-10-25
Application No.: US16877325
Filing Date: 2020-05-18
Applicant: salesforce.com, inc.
Inventor: Govardana Sachithanandam Ramachandran , Ka Chun Au , Shashank Harinath , Wenhao Liu , Alexis Roos , Caiming Xiong
Abstract: An embodiment provided herein preprocesses the input samples to the classification neural network, e.g., by adding Gaussian noise to word/sentence representations, so that the function of the neural network satisfies the Lipschitz property, such that a small change in the input does not cause much change to the output if the input sample is in-distribution. A method induces properties in the feature representation of the neural network such that, for out-of-distribution examples, the feature representation magnitude is either close to zero or the feature representation is orthogonal to all class representations. A method generates examples that are structurally similar to in-domain examples but semantically out-of-domain, for use in out-of-domain classification training. A method prunes feature representation dimensions to mitigate the long-tail error of unused dimensions in out-of-domain classification. Using these techniques, the accuracy of both in-domain and out-of-distribution identification can be improved.
-
Publication No.: US20220277141A1
Publication Date: 2022-09-01
Application No.: US17187608
Filing Date: 2021-02-26
Applicant: salesforce.com, inc.
Inventor: Erik Lennart Nijkamp , Caiming Xiong
IPC: G06F40/279 , G06F40/58
Abstract: A machine learning based model generates a feature representation of a text sequence, for example, a natural language sentence or phrase. The system trains the machine learning based model by receiving an input text sequence and perturbing the input text sequence by masking a subset of tokens. The machine learning based model is used to predict the masked tokens. A predicted text sequence is generated based on the predictions of the masked tokens. The system processes the predicted text sequence using the machine learning based model to determine whether a token was predicted or an original token. The parameters of the machine learning based model are adjusted to minimize an aggregate loss based on prediction of the correct word for a masked token and a classification of a word as original or replaced.
-
Publication No.: US11409945B2
Publication Date: 2022-08-09
Application No.: US17027130
Filing Date: 2020-09-21
Applicant: salesforce.com, inc.
Inventor: Bryan McCann , Caiming Xiong , Richard Socher
IPC: G06F40/126 , G06N3/08 , G06N3/04 , G06F40/30 , G06F40/47 , G06F40/205 , G06F40/289 , G06F40/44 , G06F40/58
Abstract: A system is provided for natural language processing. In some embodiments, the system includes an encoder for generating context-specific word vectors for at least one input sequence of words. The encoder is pre-trained using training data for performing a first natural language processing task. A neural network performs a second natural language processing task on the at least one input sequence of words using the context-specific word vectors. The first natural language processing task is different from the second natural language processing task, and the neural network is trained separately from the encoder. In some embodiments, the first natural language processing task can be machine translation, and the second natural language processing task can be one of sentiment analysis, question classification, entailment classification, and question answering.
-
Publication No.: US11386327B2
Publication Date: 2022-07-12
Application No.: US15983782
Filing Date: 2018-05-18
Applicant: salesforce.com, inc.
Inventor: Huishuai Zhang , Caiming Xiong
Abstract: Embodiments for training a neural network are provided. A neural network is divided into a first block and a second block, and the parameters in the first block and second block are trained in parallel. To train the parameters, a gradient from a gradient mini-batch included in training data is generated. A curvature-vector product from a curvature mini-batch included in the training data is also generated. The gradient and the curvature-vector product generate a conjugate gradient. The conjugate gradient is used to determine a change in parameters in the first block in parallel with a change in parameters in the second block. The curvature matrix in the curvature-vector product includes zero values when the terms correspond to parameters from different blocks.
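The effect of the zero cross-block terms can be illustrated with a small Python sketch. It assumes one common reading of the abstract, namely that the parameter change d for a block is obtained by running conjugate gradient on the linear system C d = -g, where g is the gradient and C is reached only through curvature-vector products. The function names, matrices, and gradients below are hypothetical toy values.

```python
def make_matvec(matrix):
    """Return a function computing the curvature-vector product matrix @ v."""
    return lambda v: [sum(a * x for a, x in zip(row, v)) for row in matrix]


def conjugate_gradient(matvec, b, iters=50, tol=1e-24):
    """Solve C x = b for a symmetric positive-definite C given only matvec."""
    x = [0.0] * len(b)
    r = list(b)  # residual b - C x, with x starting at zero
    p = list(r)  # search direction
    rs = sum(v * v for v in r)
    for _ in range(iters):
        if rs < tol:
            break
        Cp = matvec(p)
        alpha = rs / sum(pi * ci for pi, ci in zip(p, Cp))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * ci for ri, ci in zip(r, Cp)]
        rs_new = sum(v * v for v in r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x


# Two parameter blocks with their own curvature matrices and gradients (toy values).
curv_1, grad_1 = [[4.0, 1.0], [1.0, 3.0]], [-1.0, -2.0]
curv_2, grad_2 = [[2.0]], [-4.0]

# Each block is solved on its own, so the two solves could run in parallel.
step_1 = conjugate_gradient(make_matvec(curv_1), [-g for g in grad_1])
step_2 = conjugate_gradient(make_matvec(curv_2), [-g for g in grad_2])

# The same answer comes from the full block-diagonal system, whose
# cross-block entries are zero.
full = [[4.0, 1.0, 0.0], [1.0, 3.0, 0.0], [0.0, 0.0, 2.0]]
step_full = conjugate_gradient(make_matvec(full), [1.0, 2.0, 4.0])
```

Because the cross-block entries are zero, the full solve and the per-block solves agree, which is what allows the blocks to be updated independently.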
-
Publication No.: US20220215195A1
Publication Date: 2022-07-07
Application No.: US17140987
Filing Date: 2021-01-04
Applicant: salesforce.com, inc.
Inventor: Mingfei Gao , Zeyuan Chen , Le Xue , Ran Xu , Caiming Xiong
IPC: G06K9/00 , G06F40/289 , G06F40/186
Abstract: An online system extracts information from non-fixed form documents. The online system receives an image of a form document and obtains a set of phrases and locations of the set of phrases on the form image. For at least one field, the online system determines key scores for the set of phrases. The online system identifies a set of candidate values for the field from the set of identified phrases and identifies a set of neighbors for each candidate value from the set of identified phrases. The online system determines neighbor scores, where a neighbor score for a candidate value and a respective neighbor is determined based on the key score for the neighbor and a spatial relationship of the neighbor to the candidate value. The online system selects a candidate value and a respective neighbor based on the neighbor score as the value and key for the field.
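The neighbor-scoring step in this abstract can be sketched as follows. The abstract states only that a neighbor score depends on the neighbor's key score and its spatial relationship to the candidate value; the specific combination used here (key score multiplied by an inverse-distance weight), the `Phrase` class, and the choice to treat every other phrase as a neighbor are assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Phrase:
    text: str
    x: float  # horizontal position on the form image
    y: float  # vertical position on the form image
    key_score: float = 0.0  # how likely this phrase is the key for the field


def spatial_weight(candidate, neighbor):
    """Hypothetical spatial term: closer neighbors get a weight nearer to 1."""
    distance = math.hypot(candidate.x - neighbor.x, candidate.y - neighbor.y)
    return 1.0 / (1.0 + distance)


def select_value_and_key(candidates, phrases):
    """Return (score, value, key) for the best candidate/neighbor pair."""
    best = None
    for cand in candidates:
        for neighbor in phrases:
            if neighbor is cand:
                continue
            score = neighbor.key_score * spatial_weight(cand, neighbor)
            if best is None or score > best[0]:
                best = (score, cand, neighbor)
    return best


# Toy form with two labels and two candidate values.
invoice_label = Phrase("Invoice No.", 0.0, 0.0, key_score=0.9)
invoice_value = Phrase("12345", 1.0, 0.0)
total_label = Phrase("Total", 0.0, 5.0, key_score=0.1)
total_value = Phrase("99.00", 1.0, 5.0)
phrases = [invoice_label, invoice_value, total_label, total_value]

score, value, key = select_value_and_key([invoice_value, total_value], phrases)
```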
-
Publication No.: US20220067534A1
Publication Date: 2022-03-03
Application No.: US17006570
Filing Date: 2020-08-28
Applicant: salesforce.com, inc.
Inventor: Junwen Bai , Weiran Wang , Yingbo Zhou , Caiming Xiong
Abstract: Embodiments described herein combine both masked reconstruction and predictive coding. Specifically, unlike contrastive learning, the mutual information between past states and future states is directly estimated. The context information can also be directly captured via shifted masked reconstruction: unlike standard masked reconstruction, the target reconstructed observations are shifted slightly towards the future to incorporate more predictability. The estimated mutual information and the shifted masked reconstruction loss can then be combined as the loss function to update the neural model.