-
Publication No.: US11568000B2
Publication Date: 2023-01-31
Application No.: US16736268
Filing Date: 2020-01-07
Applicant: salesforce.com, inc.
Inventor: Hung Le , Chu Hong Hoi
IPC: G06F16/9032 , G06F40/56 , G06N5/04 , G06N3/04
Abstract: A method for dialog state tracking includes decoding, by a fertility decoder, encoded dialog information associated with a dialog to generate fertilities for generating dialog states of the dialog. Each dialog state includes one or more domains. Each domain includes one or more slots. Each slot includes one or more slot tokens. The method further includes generating an input sequence to a state decoder based on the fertilities. The total number of occurrences of each slot token in the input sequence is based on its corresponding fertility. The method further includes encoding, by a state encoder, the input sequence to the state decoder, and decoding, by the state decoder, the encoded input sequence to generate a complete sequence of the dialog states.
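As a rough illustration of the fertility mechanism, the sketch below builds the state-decoder input by repeating each slot token according to its predicted fertility. The function name and the toy fertilities are illustrative assumptions, not the patented implementation.

```python
def build_state_decoder_input(fertilities):
    """Repeat each "domain-slot" token as many times as its fertility,
    i.e. the number of value tokens the state decoder must emit for it."""
    input_sequence = []
    for (domain, slot), fertility in fertilities.items():
        # The total count of each slot token in the input sequence equals
        # its fertility, so the state decoder can fill every position with
        # one value token in a single non-autoregressive pass.
        input_sequence.extend([f"{domain}-{slot}"] * fertility)
    return input_sequence

# Toy example: the "restaurant-food" value spans two tokens, "restaurant-area" one.
fertilities = {("restaurant", "food"): 2, ("restaurant", "area"): 1}
print(build_state_decoder_input(fertilities))
# -> ['restaurant-food', 'restaurant-food', 'restaurant-area']
```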
-
Publication No.: US20220374595A1
Publication Date: 2022-11-24
Application No.: US17531591
Filing Date: 2021-11-19
Applicant: salesforce.com, inc.
Inventor: Akhilesh Deepak Gotmare , Junnan Li , Shafiq Rayhan Joty , Chu Hong Hoi
IPC: G06F40/226 , G06F40/40 , G06F40/30 , G06F40/151
Abstract: Embodiments described herein provide a contrastive learning framework that leverages hard negative examples, mined globally from the entire training corpus for a given query, to improve the quality of code and natural-language representations. Specifically, similar examples from the training corpus are extracted and used as hard negatives in an online manner during training, while the minibatch construction is kept random.
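A minimal sketch of the global hard-negative mining described above, assuming corpus embeddings are precomputed and cosine similarity ranks candidates; all names and the temperature are illustrative assumptions:

```python
import torch
import torch.nn.functional as F

def mine_hard_negatives(query_emb, corpus_emb, positive_idx, k=5):
    """Pick the k corpus examples most similar to the query, excluding its
    known positive: similar-but-not-matching examples are hard negatives."""
    with torch.no_grad():
        sims = F.cosine_similarity(query_emb.unsqueeze(0), corpus_emb)
        sims[positive_idx] = -float("inf")  # never select the true positive
        return sims.topk(k).indices

def contrastive_loss(query_emb, pos_emb, hard_neg_embs, tau=0.05):
    """InfoNCE-style loss: pull the positive close, push hard negatives away."""
    candidates = torch.cat([pos_emb.unsqueeze(0), hard_neg_embs], dim=0)
    logits = F.cosine_similarity(query_emb.unsqueeze(0), candidates) / tau
    target = torch.zeros(1, dtype=torch.long)  # the positive sits at index 0
    return F.cross_entropy(logits.unsqueeze(0), target)
```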
-
Publication No.: US20220156591A1
Publication Date: 2022-05-19
Application No.: US17160896
Filing Date: 2021-01-28
Applicant: salesforce.com, inc.
Inventor: Junnan Li , Chu Hong Hoi
Abstract: Embodiments described herein provide an approach (referred to as the "Co-training" mechanism throughout this disclosure) that jointly learns two representations of the training data: class probabilities and low-dimensional embeddings. Specifically, two representations of each image sample are generated: a class probability produced by the classification head and a low-dimensional embedding produced by the projection head. The classification head is trained using memory-smoothed pseudo-labels, where pseudo-labels are smoothed by aggregating information from nearby samples in the embedding space. The projection head is trained using contrastive learning on a pseudo-label graph, where samples with similar pseudo-labels are encouraged to have similar embeddings.
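The two training signals can be sketched as below, under assumed shapes (B samples, C classes, M memory-bank entries) and an assumed smoothing weight alpha; nothing here is taken from the patent beyond what the abstract states:

```python
import torch
import torch.nn.functional as F

def memory_smoothed_pseudo_labels(probs, embs, memory_probs, memory_embs,
                                  alpha=0.9, tau=0.1):
    """Smooth each sample's class probabilities (B, C) by aggregating the
    probabilities of nearby memory-bank samples in the embedding space."""
    weights = F.softmax(embs @ memory_embs.t() / tau, dim=1)     # (B, M)
    return alpha * probs + (1 - alpha) * weights @ memory_probs  # (B, C)

def pseudo_label_graph(pseudo_labels, threshold=0.8):
    """Connect sample pairs whose pseudo-labels agree; contrastive learning
    then encourages connected samples to have similar embeddings."""
    agreement = pseudo_labels @ pseudo_labels.t()  # (B, B) pairwise agreement
    graph = (agreement >= threshold).float()
    graph.fill_diagonal_(1.0)  # every sample is connected to itself
    return graph
```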
-
Publication No.: US20220156530A1
Publication Date: 2022-05-19
Application No.: US17188232
Filing Date: 2021-03-01
Applicant: salesforce.com, inc.
Inventor: Anthony Meng Huat Tiong , Junnan Li , Chu Hong Hoi
Abstract: An interpolative centroid contrastive learning (ICCL) framework is disclosed for learning a more discriminative representation for tail classes. Specifically, data samples, such as natural images, are projected into a low-dimensional embedding space, and a class centroid for each class is created as the average embedding of the samples that belong to that class. Virtual training samples are then created by interpolating two images drawn from two samplers: a class-agnostic sampler, which returns images from both the head and tail classes with equal probability, and a class-aware sampler, which focuses more on tail-class images by sampling them with a higher probability than head-class images. An image from the class-agnostic sampler and an image from the class-aware sampler are interpolated to generate an interpolated image.
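A minimal sketch of the two samplers and the interpolation step; the inverse-class-frequency weighting and the mixup-style blend are assumptions, since the abstract only requires that tail-class images be sampled with higher probability:

```python
import random

def class_agnostic_sample(dataset):
    """Return any (image, label) pair with equal probability."""
    return random.choice(dataset)

def class_aware_sample(dataset, class_counts):
    """Weight images inversely to class frequency, so tail-class images
    are drawn with higher probability than head-class images."""
    weights = [1.0 / class_counts[label] for _, label in dataset]
    return random.choices(dataset, weights=weights, k=1)[0]

def interpolate(img_a, img_b, lam=0.5):
    """Blend two sampled images (assumed numeric arrays) into a virtual sample."""
    return lam * img_a + (1.0 - lam) * img_b
```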
-
Publication No.: US20220156507A1
Publication Date: 2022-05-19
Application No.: US17591121
Filing Date: 2022-02-02
Applicant: salesforce.com, inc.
Inventor: Junnan Li , Chu Hong Hoi
Abstract: The system and method are directed to prototypical contrastive learning (PCL). PCL explicitly encodes the hierarchical semantic structure of the dataset into the learned embedding space and prevents the network from exploiting low-level cues to solve the unsupervised learning task. PCL introduces prototypes as latent variables to help find the maximum-likelihood estimate of the network parameters in an expectation-maximization framework: it iteratively performs an E-step that finds prototypes via clustering and an M-step that optimizes the network on a contrastive loss.
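The E-step can be sketched with plain k-means standing in for the clustering (the abstract does not specify which clustering PCL uses); names and the iteration count are illustrative:

```python
import torch

def e_step(embeddings, k, iters=10):
    """Cluster the current embeddings to obtain k prototypes (simple k-means)."""
    prototypes = embeddings[torch.randperm(len(embeddings))[:k]].clone()
    for _ in range(iters):
        # Assign each embedding to its nearest prototype ...
        assignments = torch.cdist(embeddings, prototypes).argmin(dim=1)
        # ... then move each prototype to the mean of its members.
        for j in range(k):
            members = embeddings[assignments == j]
            if len(members) > 0:
                prototypes[j] = members.mean(dim=0)
    return prototypes, assignments
```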
-
Publication No.: US20220114464A1
Publication Date: 2022-04-14
Application No.: US17162967
Filing Date: 2021-01-29
Applicant: salesforce.com, inc.
Inventor: Wenzhuo Yang , Jia Li , Chu Hong Hoi , Caiming Xiong
Abstract: Embodiments described herein provide a two-stage, model-agnostic approach for generating counterfactual explanations via counterfactual feature selection and counterfactual feature optimization. Given a query instance, counterfactual feature selection picks a subset of feature columns and values that can potentially change the prediction, and counterfactual feature optimization then determines the best value for each selected feature to form a counterfactual example.
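The two stages can be sketched as below, assuming a `model.predict` interface, per-feature candidate values, and a scoring function such as closeness to the query; all three are assumptions, since the abstract leaves the approach model-agnostic:

```python
def counterfactual_feature_selection(model, query, candidate_values):
    """Stage 1: keep the features whose value change can flip the prediction."""
    original = model.predict(query)
    selected = []
    for feature, values in candidate_values.items():
        if any(model.predict({**query, feature: v}) != original for v in values):
            selected.append(feature)
    return selected

def counterfactual_feature_optimization(model, query, feature, values, score):
    """Stage 2: among the values that flip the prediction for a selected
    feature, pick the best one under the scoring function."""
    original = model.predict(query)
    flipping = [v for v in values
                if model.predict({**query, feature: v}) != original]
    return max(flipping, key=lambda v: score(query, {**query, feature: v}),
               default=None)
```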
-
Publication No.: US20210383188A1
Publication Date: 2021-12-09
Application No.: US17072485
Filing Date: 2020-10-16
Applicant: salesforce.com, inc.
Inventor: Pan Zhou , Chu Hong Hoi
Abstract: A method for generating a neural network includes initializing the neural network with a plurality of cells, each cell corresponding to a graph including one or more nodes, each node corresponding to a latent representation of a dataset. A plurality of gates are generated, where each gate independently determines whether an operation between two nodes is used. A first regularization is performed using the plurality of gates; the first regularization is one of a group-structured sparsity regularization and a path-depth-wise regularization. An optimization is performed on the neural network by adjusting its network parameters and gate parameters based on the sparsity regularization.
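A minimal sketch of gated operations with a group-structured sparsity penalty (the abstract's first option); the sigmoid gate parameterization and grouping the gates per edge are illustrative assumptions:

```python
import torch

class GatedCell(torch.nn.Module):
    """One cell: each edge between two nodes carries several candidate
    operations, each switched on or off by an independent gate."""
    def __init__(self, num_edges, ops_per_edge):
        super().__init__()
        self.gate_logits = torch.nn.Parameter(torch.zeros(num_edges, ops_per_edge))

    def gates(self):
        # sigmoid(logit) in (0, 1) decides how much each operation is used.
        return torch.sigmoid(self.gate_logits)

def group_sparsity_penalty(gates):
    """Group-structured sparsity: an L2 norm per edge (the group), summed
    L1-style across edges, pushing whole edges toward zero."""
    return gates.norm(dim=1).sum()
```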
-
Publication No.: US20210295091A1
Publication Date: 2021-09-23
Application No.: US16870621
Filing Date: 2020-05-08
Applicant: salesforce.com, inc.
Inventor: Junnan Li , Chu Hong Hoi
Abstract: The system and method are directed to prototypical contrastive learning (PCL). PCL explicitly encodes the hierarchical semantic structure of the dataset into the learned embedding space and prevents the network from exploiting low-level cues to solve the unsupervised learning task. PCL introduces prototypes as latent variables to help find the maximum-likelihood estimate of the network parameters in an expectation-maximization framework: it iteratively performs an E-step that finds prototypes via clustering and an M-step that optimizes the network on a contrastive loss.
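Complementing the E-step sketched under the related publication above, the M-step's contrastive objective can be sketched as an InfoNCE-style loss over prototypes; the normalization and temperature are assumptions:

```python
import torch
import torch.nn.functional as F

def m_step_loss(embeddings, prototypes, assignments, tau=0.1):
    """Pull each embedding toward its assigned prototype and away from the
    other prototypes (cross-entropy over prototype similarities)."""
    logits = (F.normalize(embeddings, dim=1)
              @ F.normalize(prototypes, dim=1).t()) / tau
    return F.cross_entropy(logits, assignments)
```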
-
Publication No.: US12072955B2
Publication Date: 2024-08-27
Application No.: US17532851
Filing Date: 2021-11-22
Applicant: salesforce.com, inc.
Inventor: Chen Xing , Wenhao Liu , Chu Hong Hoi , Nitish Shirish Keskar , Caiming Xiong
IPC: G06F18/214 , G06F18/21 , G06F40/00
CPC classification number: G06F18/2148 , G06F18/2163 , G06F40/00
Abstract: Embodiments are directed to pre-training a transformer model using more parameters for sophisticated patterns (PSP++). The transformer model is divided into a held-out model and a main model. A forward pass and a backward pass are performed on the held-out model, where the forward pass determines the self-attention hidden states of the held-out model and the backward pass determines the loss of the held-out model. A forward pass on the main model is performed to determine the self-attention hidden states of the main model. The self-attention hidden states of the main model are concatenated with the self-attention hidden states of the held-out model. A backward pass is performed on the main model to determine the loss of the main model. The parameters of the held-out model are updated to reflect the loss of the held-out model, and the parameters of the main model are updated to reflect the loss of the main model.
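One training step can be sketched in the order the abstract gives, under an assumed model interface in which each model returns its self-attention hidden states and its loss; the concatenation is assumed to happen inside the main model's forward:

```python
import torch

def psp_training_step(held_out, main, batch, opt_held_out, opt_main):
    # Held-out model: forward pass yields its self-attention hidden states,
    # backward pass yields the gradients of its loss.
    held_states, held_loss = held_out(batch)
    opt_held_out.zero_grad()
    held_loss.backward()
    opt_held_out.step()  # update held-out parameters on the held-out loss

    # Main model: forward pass, with its self-attention hidden states
    # concatenated with the (detached) held-out hidden states.
    main_states, main_loss = main(batch, extra_states=held_states.detach())
    opt_main.zero_grad()
    main_loss.backward()
    opt_main.step()      # update main parameters on the main loss
```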
-
Publication No.: US11798534B2
Publication Date: 2023-10-24
Application No.: US17162624
Filing Date: 2021-01-29
Applicant: salesforce.com, inc.
Inventor: Guangsen Wang , Chu Hong Hoi , Genta Indra Winata
IPC: G10L15/16 , G10L15/065 , G06N3/08 , G06N3/04 , G10L15/06
CPC classification number: G10L15/16 , G06N3/04 , G06N3/08 , G10L15/063 , G10L15/065
Abstract: Embodiments described herein provide an Adapt-and-Adjust (A2) mechanism for multilingual speech recognition models that combines adaptation and adjustment methods in integrated end-to-end training to improve the models' generalization and mitigate the long-tail issue. Specifically, the multilingual language model mBERT is utilized and converted into an autoregressive transformer decoder. In addition, a cross-attention module attending to the encoder is added on top of mBERT's self-attention layer in order to explore the acoustic space in addition to the text space. The joint training of the encoder and the mBERT decoder can bridge the semantic gap between speech and text.
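The added cross-attention can be sketched as a decoder layer whose queries come from the text states and whose keys/values come from the acoustic encoder output; the dimensions, the residual wiring, and the generic self-attention standing in for mBERT's are all assumptions:

```python
import torch

class DecoderLayerWithCrossAttention(torch.nn.Module):
    def __init__(self, d_model=768, n_heads=12):
        super().__init__()
        self.self_attn = torch.nn.MultiheadAttention(d_model, n_heads,
                                                     batch_first=True)
        # Added cross-attention into the acoustic space.
        self.cross_attn = torch.nn.MultiheadAttention(d_model, n_heads,
                                                      batch_first=True)
        self.norm1 = torch.nn.LayerNorm(d_model)
        self.norm2 = torch.nn.LayerNorm(d_model)

    def forward(self, text_states, acoustic_states, causal_mask=None):
        # A causal mask makes the originally bidirectional mBERT layer
        # behave as an autoregressive decoder.
        x, _ = self.self_attn(text_states, text_states, text_states,
                              attn_mask=causal_mask)
        text_states = self.norm1(text_states + x)
        # Queries from text; keys/values from the acoustic encoder output.
        x, _ = self.cross_attn(text_states, acoustic_states, acoustic_states)
        return self.norm2(text_states + x)
```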
-