-
Publication No.: US11562147B2
Publication Date: 2023-01-24
Application No.: US16929738
Application Date: 2020-07-15
Applicant: salesforce.com, inc.
Inventor: Yue Wang , Chu Hong Hoi , Shafiq Rayhan Joty
Abstract: A visual dialogue model receives image input and text input that includes a dialogue history between the model and a current utterance by a human user. The model generates a unified contextualized representation using a transformer encoder network, in which the unified contextualized representation includes a token level encoding of the image input and text input. The model generates an encoded visual dialogue input from the unified contextualized representation using visual dialogue encoding layers. The encoded visual dialogue input includes a position level encoding and a segment type encoding. The model generates an answer prediction from the encoded visual dialogue input using a first self-attention mask associated with discriminative settings or a second self-attention mask associated with generative settings. Dense annotation fine tuning may be performed to increase accuracy of the answer prediction. The model provides the answer prediction as a response to the current utterance of the human user.
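The two self-attention masks named in the abstract can be illustrated with a minimal sketch. The abstract does not spell out either mask layout, so the layouts here are assumptions: the discriminative mask is taken to be fully bidirectional, and the generative mask lets answer tokens see the context plus earlier answer tokens only. The function name `build_attention_mask` and its parameters are hypothetical.

```python
def build_attention_mask(context_len, answer_len, generative):
    """Return an n x n 0/1 mask; mask[i][j] == 1 means position i may attend to j.

    Assumed layouts (not taken from the patent text):
    - discriminative: every position attends to every position
    - generative: context attends to context only; each answer token
      attends to the context, to earlier answer tokens, and to itself
    """
    n = context_len + answer_len
    if not generative:
        return [[1] * n for _ in range(n)]

    mask = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if j < context_len:
                mask[i][j] = 1  # the context is visible to every position
            elif i >= context_len and j <= i:
                mask[i][j] = 1  # answer token sees itself and earlier answer tokens
    return mask


if __name__ == "__main__":
    for row in build_attention_mask(context_len=2, answer_len=2, generative=True):
        print(row)
```

Using one encoder with a switchable mask, rather than two separate networks, is the design point the abstract suggests; the sketch only shows how the mask alone could change which tokens are visible.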
-
Publication No.: US20220382527A1
Publication Date: 2022-12-01
Application No.: US17459968
Application Date: 2021-08-27
Applicant: salesforce.com, inc.
Inventor: Yue Wang , Weishi Wang , Shafiq Rayhan Joty , Chu Hong Hoi
Abstract: Embodiments described herein provide a code generation and understanding model that builds on a Transformer-based encoder-decoder framework. The code generation and understanding model is configured to derive generic representations for programming language (PL) and natural language (NL) in the code domain via pre-training on an unlabeled code corpus, and then to benefit many code-related downstream tasks through fine-tuning. Apart from the denoising sequence-to-sequence objectives widely adopted for pre-training on natural language, an identifier tagging and prediction pre-training objective is adopted to enable the model to better leverage the crucial token type information from PL, specifically the identifiers assigned by developers.
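As a rough illustration of what identifier tagging labels could look like, the sketch below marks each token of a Python snippet with 1 if it is a developer-assigned identifier and 0 otherwise. It uses Python's standard `tokenize` and `keyword` modules as a stand-in for whatever tokenizer the described model uses; the function name `tag_identifiers` is hypothetical, and built-in names such as `print` are counted as identifiers here by assumption.

```python
import io
import keyword
import tokenize

# Token types that carry layout only and are skipped in this sketch.
_SKIPPED = {
    tokenize.NEWLINE, tokenize.NL, tokenize.INDENT,
    tokenize.DEDENT, tokenize.ENDMARKER, tokenize.COMMENT,
}


def tag_identifiers(source):
    """Return (token, tag) pairs; tag is 1 for identifiers, 0 for everything else."""
    tags = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in _SKIPPED:
            continue
        # A NAME token that is not a reserved keyword is treated as an identifier.
        is_identifier = tok.type == tokenize.NAME and not keyword.iskeyword(tok.string)
        tags.append((tok.string, 1 if is_identifier else 0))
    return tags


if __name__ == "__main__":
    for token, tag in tag_identifiers("def add(a, b):\n    return a + b\n"):
        print(tag, token)
```

A sequence-labeling objective over such 0/1 tags is one plausible reading of "identifier tagging"; the abstract itself does not give the label scheme.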
-
Publication No.: US11640505B2
Publication Date: 2023-05-02
Application No.: US16863999
Application Date: 2020-04-30
Applicant: salesforce.com, inc.
Inventor: Yifan Gao , Chu Hong Hoi , Shafiq Rayhan Joty , Chien-Sheng Wu
IPC: G06F40/289 , G06F16/332
Abstract: Embodiments described herein provide systems and methods for an Explicit Memory Tracker (EMT) that tracks each rule sentence to perform decision making and to generate follow-up clarifying questions. Specifically, the EMT first segments the regulation text into several rule sentences and allocates the segmented rule sentences into memory modules, and then feeds information regarding the user scenario and dialogue history into the EMT sequentially to update each memory module separately. At each dialogue turn, the EMT decides, based on the current memory status of the memory modules, whether further clarification is needed to come up with an answer to a user question. The EMT determines that further clarification is needed by identifying an underspecified rule sentence span by modulating token-level span distributions with sentence-level selection scores. The EMT extracts the underspecified rule sentence span and rephrases the underspecified rule sentence span to generate a follow-up question.
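One way to read "modulating token-level span distributions with sentence-level selection scores" is as a per-sentence reweighting of token probabilities. The sketch below assumes a simple multiply-and-renormalize scheme, which the abstract does not confirm; the function names and the example numbers are hypothetical.

```python
def modulate_span_scores(token_probs, sentence_scores):
    """Weight each sentence's token-level probabilities by that sentence's score.

    token_probs[i][t]: probability that token t of rule sentence i starts the span.
    sentence_scores[i]: selection score of rule sentence i.
    Returns weights of the same shape, renormalized to sum to 1.
    """
    weighted = [
        [score * p for p in probs]
        for probs, score in zip(token_probs, sentence_scores)
    ]
    total = sum(sum(row) for row in weighted)
    if total == 0:
        raise ValueError("all modulated scores are zero")
    return [[w / total for w in row] for row in weighted]


def best_position(weights):
    """Return (sentence_index, token_index) of the largest weight."""
    positions = ((i, t) for i, row in enumerate(weights) for t in range(len(row)))
    return max(positions, key=lambda pos: weights[pos[0]][pos[1]])


if __name__ == "__main__":
    token_probs = [[0.7, 0.3], [0.4, 0.6]]  # two rule sentences, two tokens each
    sentence_scores = [0.1, 0.9]            # the second sentence is strongly selected
    print(best_position(modulate_span_scores(token_probs, sentence_scores)))
```

In this toy input the highest raw token probability sits in the first sentence, but after modulation the selected position moves to the second sentence, which is the effect the sentence-level scores are meant to have.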
-
Publication No.: US20220374595A1
Publication Date: 2022-11-24
Application No.: US17531591
Application Date: 2021-11-19
Applicant: salesforce.com, inc.
Inventor: Akhilesh Deepak Gotmare , Junnan Li , Shafiq Rayhan Joty , Chu Hong Hoi
IPC: G06F40/226 , G06F40/40 , G06F40/30 , G06F40/151
Abstract: Embodiments described herein provide a contrastive learning framework that leverages hard negative examples, mined globally from the entire training corpus for a given query, to improve the quality of code and natural language representations. Specifically, similar examples from the training corpus are extracted and used as hard negatives in an online manner during training while keeping the minibatch construction random.
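Global hard-negative mining can be sketched as a nearest-neighbor lookup over the whole corpus that excludes the query's true match. The version below is a minimal sketch assuming cosine similarity over precomputed embeddings and a brute-force search; the abstract does not name the similarity measure or the index structure, and `mine_hard_negatives` is a hypothetical name.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mine_hard_negatives(query_vec, corpus_vecs, positive_index, k):
    """Return indices of the k corpus items most similar to the query,
    skipping the known positive, so they can serve as hard negatives."""
    scored = [
        (cosine(query_vec, vec), idx)
        for idx, vec in enumerate(corpus_vecs)
        if idx != positive_index
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [idx for _, idx in scored[:k]]


if __name__ == "__main__":
    corpus = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]
    print(mine_hard_negatives([1.0, 0.0], corpus, positive_index=0, k=2))
```

Because the negatives are looked up per query rather than baked into the batch, the minibatch itself can still be sampled at random, which matches the constraint stated in the abstract.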
-
Publication No.: US20210173872A1
Publication Date: 2021-06-10
Application No.: US16869903
Application Date: 2020-05-08
Applicant: salesforce.com, inc.
Inventor: Samson Min Rong Tan , Shafiq Rayhan Joty
IPC: G06F16/9032 , G10L15/16 , G10L15/18 , G06F40/284
Abstract: Embodiments described herein provide systems and methods for generating an adversarial sample with inflectional perturbations for training a natural language processing (NLP) system. A natural language sentence is received at an inflection perturbation module. Tokens are generated from the natural language sentence. For each token whose part of speech is a verb, an adjective, or an adverb, an inflected form is determined. An adversarial sample of the natural language sentence is generated by detokenizing inflected forms of the tokens. The NLP system is trained using the adversarial sample.
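The tokenize, inflect, and detokenize steps can be sketched on a toy scale. Everything below is an assumption made for illustration: the input arrives already tokenized and POS-tagged, the inflection table `INFLECTIONS` is a tiny hand-written stand-in for a real morphological lexicon, the replacement is picked at random rather than by any search over model loss, and detokenizing is reduced to joining with spaces.

```python
import random

# Hypothetical miniature lexicon: base form -> alternative inflected forms.
INFLECTIONS = {
    "run": ["runs", "ran", "running"],
    "quick": ["quicker", "quickest"],
}

# Parts of speech named in the abstract as eligible for perturbation.
PERTURBABLE = {"VERB", "ADJ", "ADV"}


def perturb(tagged_tokens, rng):
    """Swap each eligible token for one of its inflected forms, then detokenize.

    tagged_tokens: list of (word, pos_tag) pairs.
    rng: a random.Random instance, passed in so runs can be reproduced.
    """
    out = []
    for word, pos in tagged_tokens:
        candidates = INFLECTIONS.get(word.lower(), []) if pos in PERTURBABLE else []
        out.append(rng.choice(candidates) if candidates else word)
    return " ".join(out)  # naive detokenization


if __name__ == "__main__":
    sentence = [("dogs", "NOUN"), ("run", "VERB"), ("quick", "ADJ")]
    print(perturb(sentence, random.Random(0)))
```

The perturbed sentences would then be added to the training data, which is the final step the abstract describes.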