-
Publication number: US11749264B2
Publication date: 2023-09-05
Application number: US17088206
Application date: 2020-11-03
Applicant: salesforce.com, inc.
Inventor: Chien-Sheng Wu , Chu Hong Hoi , Richard Socher , Caiming Xiong
CPC classification number: G10L15/1815 , G10L15/063 , G10L15/1822
Abstract: Embodiments described herein provide methods and systems for training task-oriented dialogue (TOD) language models. In some embodiments, a TOD language model may receive a TOD dataset including a plurality of dialogues and a model input sequence may be generated from the dialogues using a first token prefixed to each user utterance and a second token prefixed to each system response of the dialogues. In some embodiments, the first token or the second token may be randomly replaced with a mask token to generate a masked training sequence and a masked language modeling (MLM) loss may be computed using the masked training sequence. In some embodiments, the TOD language model may be updated based on the MLM loss.
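The sequence construction and masking step in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the token strings `[USR]`, `[SYS]`, and `[MASK]`, the whitespace tokenizer, and the function names are all assumptions made for the example.

```python
import random

# Hypothetical token strings; the abstract only calls them "first", "second" and "mask" tokens.
USR, SYS, MASK = "[USR]", "[SYS]", "[MASK]"

def build_sequence(dialogue):
    """Flatten (speaker, text) turns into one token list with a role token before each turn."""
    tokens = []
    for speaker, text in dialogue:
        tokens.append(USR if speaker == "user" else SYS)
        tokens.extend(text.split())  # whitespace split stands in for a real tokenizer
    return tokens

def mask_role_tokens(tokens, mask_prob, rng):
    """Randomly replace role tokens with MASK and record the original token as the MLM target."""
    masked, labels = [], []
    for tok in tokens:
        if tok in (USR, SYS) and rng.random() < mask_prob:
            masked.append(MASK)
            labels.append(tok)    # position that contributes to the MLM loss
        else:
            masked.append(tok)
            labels.append(None)   # position ignored by the loss
    return masked, labels

dialogue = [("user", "book a table"), ("system", "for how many people")]
tokens = build_sequence(dialogue)
masked, labels = mask_role_tokens(tokens, mask_prob=1.0, rng=random.Random(0))
```

With `mask_prob=1.0` every role token is masked, which makes the example deterministic; a training setup would use a smaller probability and feed `masked` and `labels` to the language model's MLM loss.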
-
Publication number: US20230237275A1
Publication date: 2023-07-27
Application number: US17830889
Application date: 2022-06-02
Applicant: salesforce.com, inc.
Inventor: Guangsen Wang , Samson Min Rong Tan , Shafiq Rayhan Joty , Gang Wu , Chu Hong Hoi , Ka Chun Au
IPC: G06F40/35 , G06F40/40 , H04L51/02 , G06F40/186
CPC classification number: G06F40/35 , G06F40/40 , H04L51/02 , G06F40/186
Abstract: Embodiments provide a software framework for evaluating and troubleshooting real-world task-oriented bot systems. Specifically, the evaluation framework includes a generator that infers dialog acts and entities from bot definitions and generates test cases for the system via model-based paraphrasing. The framework may also include a simulator for task-oriented dialog user simulation that supports both regression testing and end-to-end evaluation. The framework may also include a remediator to analyze and visualize the simulation results, remedy some of the identified issues, and provide actionable suggestions for improving the task-oriented dialog system.
-
Publication number: US11651158B2
Publication date: 2023-05-16
Application number: US16993256
Application date: 2020-08-13
Applicant: salesforce.com, inc.
Inventor: Xinyi Yang , Tian Xie , Caiming Xiong , Wenhao Liu , Huan Wang , Jin Qu , Soujanya Lanka , Chu Hong Hoi , Xugang Ye , Feihong Wu
IPC: G10L15/05 , G06F40/295 , G06F40/35 , G06N3/04 , H04L51/02
CPC classification number: G06F40/295 , G06F40/35 , G06N3/04 , H04L51/02
Abstract: A system performs conversations with users using chatbots customized for performing a set of tasks. The system may be a multi-tenant system that allows customization of the chatbots for each tenant. The system receives a task configuration that maps tasks to entity types and an entity configuration that specifies methods for determining entities of a particular entity type. The system receives a user utterance and determines the intent of the user using an intent detection model, for example, a neural network. The intent represents a task that the user is requesting. The system determines one or more entities corresponding to the task. The system performs tasks based on the determined intent and the entities and performs conversations with users based on the tasks.
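The relationship between the task configuration and the entity configuration described in this abstract might look like the sketch below. The task names, entity types, regular expressions, and the keyword-based `detect_intent` function are hypothetical; in particular, the keyword matcher is only a stand-in for the intent detection model, which the abstract says may be a neural network.

```python
import re

# Hypothetical task configuration: task name -> entity types the task needs.
TASK_CONFIG = {"check_order": ["order_id"], "reset_password": ["email"]}

# Hypothetical entity configuration: entity type -> method for finding it (a regex here).
ENTITY_CONFIG = {
    "order_id": re.compile(r"\b\d{6}\b"),
    "email": re.compile(r"[\w.]+@[\w.]+\.\w+"),
}

def detect_intent(utterance):
    """Keyword stand-in for the intent detection model."""
    text = utterance.lower()
    if "order" in text:
        return "check_order"
    if "password" in text:
        return "reset_password"
    return None

def extract_entities(utterance, task):
    """Look up the entity types for the task and apply each type's configured method."""
    found = {}
    for entity_type in TASK_CONFIG[task]:
        match = ENTITY_CONFIG[entity_type].search(utterance)
        if match:
            found[entity_type] = match.group(0)
    return found

utterance = "Where is my order 123456?"
intent = detect_intent(utterance)
entities = extract_entities(utterance, intent)
```

Keeping both configurations as plain data is one way a multi-tenant system could let each tenant customize tasks and entity extraction without changing the dialogue code.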
-
Publication number: US11615240B2
Publication date: 2023-03-28
Application number: US16581035
Application date: 2019-09-24
Applicant: salesforce.com, inc.
Inventor: Xuan Phi Nguyen , Shafiq Rayhan Joty , Chu Hong Hoi
IPC: G06F40/205
Abstract: Embodiments described herein provide an attention-based tree encoding mechanism. Specifically, the attention layer receives as input the pre-parsed constituency tree of a sentence and the lower-layer representations of all nodes. The attention layer then performs upward accumulation to encode the tree structure from leaves to the root in a bottom-up fashion. Afterwards, weighted aggregation is used to compute the final representations of non-terminal nodes.
-
Publication number: US11599792B2
Publication date: 2023-03-07
Application number: US16688104
Application date: 2019-11-19
Applicant: salesforce.com, inc.
Inventor: Junnan Li , Chu Hong Hoi
Abstract: A method provides learning with noisy labels. The method includes generating a first network of a machine learning model with a first set of parameter initial values, and generating a second network of the machine learning model with a second set of parameter initial values. First clean probabilities for samples in a training dataset are generated using the second network. A first labeled dataset and a first unlabeled dataset are generated from the training dataset based on the first clean probabilities. The first network is trained based on the first labeled dataset and first unlabeled dataset to update parameters of the first network.
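The dataset split described in this abstract can be illustrated with a short sketch. The clean probabilities below are hard-coded stand-ins for the second network's output, and the fixed threshold of 0.5 and the function name are assumptions; the abstract does not say how the probabilities are computed or thresholded.

```python
def split_by_clean_probability(samples, clean_probs, threshold=0.5):
    """Split (input, label) pairs into a labeled set and an unlabeled set.

    Samples whose clean probability reaches the (assumed) threshold keep their
    label; the rest are treated as unlabeled and their labels are discarded.
    """
    labeled, unlabeled = [], []
    for (x, y), p in zip(samples, clean_probs):
        if p >= threshold:
            labeled.append((x, y))
        else:
            unlabeled.append(x)
    return labeled, unlabeled

# Toy training set; in the described method the probabilities would come from
# the second network and the resulting split would be used to train the first.
samples = [("img0", "cat"), ("img1", "dog"), ("img2", "cat")]
clean_probs = [0.95, 0.10, 0.60]
labeled, unlabeled = split_by_clean_probability(samples, clean_probs)
```

The first network would then be trained on `labeled` and `unlabeled` together, for example with a semi-supervised objective.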
-
Publication number: US20220261651A1
Publication date: 2022-08-18
Application number: US17479565
Application date: 2021-09-20
Applicant: salesforce.com, inc.
Inventor: Gerald Woo , Doyen Sahoo , Chu Hong Hoi
Abstract: A multi-view contrastive relational learning framework is provided. In the multi-view contrastive relational learning framework, contrastive learning is augmented with a multi-view learning signal. The auxiliary views guide an encoder of the main view of the underlying time-series data by using an inter-sample similarity structure as a learning signal, so that the encoder learns representations that encode information from multiple views.
-
Publication number: US11334766B2
Publication date: 2022-05-17
Application number: US16778339
Application date: 2020-01-31
Applicant: salesforce.com, inc.
Inventor: Junnan Li , Chu Hong Hoi
Abstract: Systems and methods are provided for training object detectors of a neural network model with a mixture of label noise and bounding box noise. According to some embodiments, a learning framework is provided which jointly optimizes object labels, bounding box coordinates, and model parameters by performing alternating noise correction and model training. In some embodiments, to disentangle label noise and bounding box noise, a two-step noise correction method is employed. In some examples, the first step performs class-agnostic bounding box correction by minimizing classifier discrepancy and maximizing region objectness. In some examples, the second step uses dual detection heads for label correction and class-specific bounding box refinement.
-
Publication number: US20220067506A1
Publication date: 2022-03-03
Application number: US17005763
Application date: 2020-08-28
Applicant: salesforce.com, inc.
Inventor: Junnan Li , Chu Hong Hoi
Abstract: A learning mechanism with partially-labeled web images is provided that corrects noisy labels during learning. Specifically, the mechanism employs a momentum prototype that represents common characteristics of a specific class. One training objective is to minimize the difference between the normalized embedding of a training image sample and the momentum prototype of the corresponding class. Meanwhile, during the training process, the momentum prototype is used to generate a pseudo label for the training image sample, which can then be used to identify and remove out-of-distribution (OOD) samples to correct the noisy labels from the original partially-labeled training images. The momentum prototype for each class is in turn constantly updated based on the embeddings of new training samples and their pseudo labels.
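The interplay between pseudo labels and prototype updates can be sketched as below. The abstract gives no formulas, so the exponential-moving-average update, the momentum value of 0.9, the dot-product similarity, and the function names are assumptions made for illustration; OOD removal is left out.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def pseudo_label(embedding, prototypes):
    """Assign the class whose normalized prototype is most similar to the normalized embedding."""
    z = normalize(embedding)
    def similarity(cls):
        return sum(a * b for a, b in zip(z, normalize(prototypes[cls])))
    return max(prototypes, key=similarity)

def update_prototype(prototype, embedding, momentum=0.9):
    """Assumed moving-average update: keep most of the old prototype, blend in the new embedding."""
    z = normalize(embedding)
    return [momentum * p + (1.0 - momentum) * e for p, e in zip(prototype, z)]

prototypes = {"cat": [1.0, 0.0], "dog": [0.0, 1.0]}
sample = [3.0, 1.0]  # toy embedding of a new training image
label = pseudo_label(sample, prototypes)
prototypes[label] = update_prototype(prototypes[label], sample)
```

Only the prototype of the pseudo-labeled class moves, and only slightly, which is what lets it act as a slowly changing summary of the class.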
-
Publication number: US20230244925A1
Publication date: 2023-08-03
Application number: US17589595
Application date: 2022-01-31
Applicant: salesforce.com, inc.
Inventor: Wenzhuo Yang , Chu Hong Hoi , Kun Zhang
CPC classification number: G06N3/08 , G06K9/6284 , G06K9/6262
Abstract: Embodiments described herein provide a system and method for unsupervised anomaly detection. The system receives, via a communication interface, a dataset of instances that include anomalies. The system determines, via an inlier model, a set of noisy labels. The system trains a causality-based label-noise model based at least in part on the set of noisy labels and a set of high-confidence instances. The system determines an estimated proportion of anomalies in the dataset of instances. The system retrains the inlier model based on estimated inlier samples. The system iteratively retrains the inlier model and the trained causality-based label-noise model while the output from the corresponding retrained models has not converged within a convergence threshold. The system extracts the anomaly detection model from the iteratively trained causality-based label-noise model.
-
Publication number: US11640505B2
Publication date: 2023-05-02
Application number: US16863999
Application date: 2020-04-30
Applicant: salesforce.com, inc.
Inventor: Yifan Gao , Chu Hong Hoi , Shafiq Rayhan Joty , Chien-Sheng Wu
IPC: G06F40/289 , G06F16/332
Abstract: Embodiments described herein provide systems and methods for an Explicit Memory Tracker (EMT) that tracks each rule sentence to perform decision making and to generate follow-up clarifying questions. Specifically, the EMT first segments the regulation text into several rule sentences and allocates the segmented rule sentences into memory modules, and then feeds information regarding the user scenario and dialogue history into the EMT sequentially to update each memory module separately. At each dialogue turn, the EMT decides, based on the current memory status of the memory modules, whether further clarification is needed to come up with an answer to a user question. The EMT determines that further clarification is needed by identifying an underspecified rule sentence span by modulating token-level span distributions with sentence-level selection scores. The EMT extracts the underspecified rule sentence span and rephrases the underspecified rule sentence span to generate a follow-up question.
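One way to read "modulating token-level span distributions with sentence-level selection scores" is sketched below. The abstract does not define the operation, so multiplying each token probability by its sentence's score is an assumption, as are the function name and the toy numbers.

```python
def select_span(start_probs, end_probs, sentence_scores):
    """Return (sentence index, start token, end token) of the highest-scoring span.

    start_probs[s][i] and end_probs[s][i] are token-level probabilities within
    rule sentence s; sentence_scores[s] is that sentence's selection score.
    Modulation is assumed to be a simple product of score and probability.
    """
    best = None
    for s, score in enumerate(sentence_scores):
        starts = [score * p for p in start_probs[s]]
        ends = [score * p for p in end_probs[s]]
        for i, p_start in enumerate(starts):
            for j in range(i, len(ends)):  # a span must not end before it starts
                candidate = (p_start * ends[j], s, i, j)
                if best is None or candidate[0] > best[0]:
                    best = candidate
    return best[1:]

# Two rule sentences with two tokens each (toy values).
start_probs = [[0.9, 0.1], [0.3, 0.7]]
end_probs = [[0.1, 0.9], [0.4, 0.6]]
sentence_scores = [0.2, 0.8]
span = select_span(start_probs, end_probs, sentence_scores)
```

In this toy case the token-level probabilities alone would favor a span in the first sentence, but the higher selection score of the second sentence shifts the choice to it.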