-
Publication number: US12217033B2
Publication date: 2025-02-04
Application number: US18475103
Filing date: 2023-09-26
Applicant: Salesforce, Inc.
Inventor: Yue Wang, Weishi Wang, Shafiq Rayhan Joty, Chu Hong Hoi
Abstract: Embodiments described herein provide a code generation and understanding model built on a Transformer-based encoder-decoder framework. The model is configured to derive generic representations for programming language (PL) and natural language (NL) in the code domain via pre-training on an unlabeled code corpus, and then to benefit many code-related downstream tasks through fine-tuning. Apart from the denoising sequence-to-sequence objectives widely adopted for pre-training on natural language, an identifier tagging and prediction pre-training objective is adopted to enable the model to better leverage the crucial token type information from PL, specifically the identifiers assigned by developers.
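The identifier-tagging objective described above can be illustrated with a toy labeler: each code token receives a binary label marking whether it is a developer-assigned identifier, as opposed to a keyword, literal, or punctuation. This is a minimal sketch, not the patent's tokenizer or labeling procedure; the function name and the toy token list are assumptions.

```python
import keyword

def tag_identifiers(tokens):
    """Label each token 1 if it looks like a developer-assigned identifier,
    0 otherwise (keywords, operators, punctuation)."""
    labels = []
    for tok in tokens:
        # a Python-flavored heuristic: valid identifier syntax, but not a keyword
        is_ident = tok.isidentifier() and not keyword.iskeyword(tok)
        labels.append(1 if is_ident else 0)
    return labels

tokens = ["def", "add", "(", "a", ",", "b", ")", ":", "return", "a", "+", "b"]
print(tag_identifiers(tokens))  # → [0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1]
```

During pre-training, such labels would supervise a sequence-tagging head alongside the denoising objectives, pushing the encoder to distinguish identifiers from the fixed syntax of the language.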
-
Publication number: US20230376840A1
Publication date: 2023-11-23
Application number: US17896942
Filing date: 2022-08-26
Applicant: Salesforce, Inc.
Inventor: Hung Le, Yue Wang, Akhilesh Deepak Gotmare, Chu Hong Hoi
IPC: G06N20/00, G06F40/284, G06K9/62
CPC classification number: G06N20/00, G06F40/284, G06K9/6262
Abstract: Embodiments described herein provide a reinforcement learning-based framework that engages pretrained language models (LMs) for program synthesis tasks. Specifically, the framework adopts a training strategy that optimizes pretrained LMs for program synthesis in an actor-critic fashion.
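The actor-critic training strategy the abstract mentions can be sketched with a toy example: a categorical "actor" policy over a four-token program vocabulary is rewarded when it emits the token that "passes the unit tests", and a scalar "critic" value estimate serves as the baseline in the policy-gradient update. Everything here (vocabulary size, learning rates, the reward signal) is an illustrative assumption, not the patent's actual setup with pretrained LMs.

```python
import math
import random

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def train(steps=300, rewarded_token=2, seed=0):
    """Toy actor-critic loop: the actor's logits over 4 'program tokens'
    are updated with an advantage (reward minus critic baseline)."""
    rng = random.Random(seed)
    logits = [0.0] * 4          # actor parameters
    value = 0.0                 # critic's baseline estimate V(s)
    lr_actor, lr_critic = 0.5, 0.2
    for _ in range(steps):
        probs = softmax(logits)
        action = rng.choices(range(4), weights=probs)[0]
        reward = 1.0 if action == rewarded_token else 0.0  # "unit test" signal
        advantage = reward - value                         # critic as baseline
        for i in range(4):
            # gradient of log pi(action) w.r.t. logit i
            grad = (1.0 if i == action else 0.0) - probs[i]
            logits[i] += lr_actor * advantage * grad       # policy-gradient step
        value += lr_critic * (reward - value)              # critic regression step
    return max(range(4), key=lambda i: logits[i])

print(train())  # the actor learns to prefer the rewarded token → 2
```

The baseline subtraction is what distinguishes this from plain REINFORCE: updates shrink as the critic's estimate approaches the expected reward, reducing gradient variance.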
-
Publication number: US20240289606A1
Publication date: 2024-08-29
Application number: US18174547
Filing date: 2023-02-24
Applicant: Salesforce, Inc.
Inventor: Yue Wang, Hung Le, Akhilesh Deepak Gotmare, Junnan Li, Chu Hong Hoi
IPC: G06N3/08
CPC classification number: G06N3/08
Abstract: Embodiments described herein provide a mixture-of-encoder-decoder Transformer framework for multi-task pretraining and flexible finetuning on both code understanding and generation tasks. Specifically, the framework is built on multimodal encoder and decoder modules. During pre-training, the encoder-decoder framework is trained with multiple learning objectives, including a diverse set of self-supervised tasks, over two major stages of pretraining on unimodal and bimodal data.
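The two-stage, multi-objective pretraining described above can be sketched as a simple objective scheduler: stage one mixes self-supervised objectives over unimodal (code-only) data, and stage two adds bimodal (text-code) objectives on top. The objective names and uniform sampling below are placeholders, not the patent's exact task set or mixing weights.

```python
import random

# Placeholder task sets for the two pretraining stages; the real framework's
# objective names and proportions may differ.
STAGES = {
    "unimodal": ["span_denoising", "causal_lm"],
    "bimodal":  ["span_denoising", "causal_lm",
                 "text_code_matching", "text_code_generation"],
}

def sample_schedule(stage, steps, seed=0):
    """Return a per-step sequence of pretraining objectives for a stage."""
    rng = random.Random(seed)
    return [rng.choice(STAGES[stage]) for _ in range(steps)]

print(sample_schedule("unimodal", 4))
```

In a real training loop, each sampled objective would select the loss function and data batch (code-only versus paired text-code) for that optimization step.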
-
Publication number: US20240134773A1
Publication date: 2024-04-25
Application number: US18047175
Filing date: 2022-10-17
Applicant: Salesforce, Inc.
Inventor: Nghi Bui, Yue Wang, Chu Hong Hoi
CPC classification number: G06F11/3608, G06F8/65, G06N3/047, G06N3/084
Abstract: Embodiments described herein provide a unified debugging framework that adapts a pretrained programming language model for line-level debugging and repair. Specifically, the debugging framework follows the logic programmers use when debugging their code. For example, the framework first determines whether a function is buggy; if it is, the framework localizes the problematic line and provides a patch (repair).
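The detect-localize-repair flow described above can be sketched as a three-stage pipeline. The heuristic rules below stand in for the pretrained language model's three heads, and the specific bug pattern (`== None` instead of `is None`) is purely an illustrative assumption.

```python
def detect_bug(lines):
    """Stage 1 stand-in: classify whether the function contains a bug."""
    return any("== None" in ln for ln in lines)

def localize(lines):
    """Stage 2 stand-in: return the index of the first suspicious line."""
    for i, ln in enumerate(lines):
        if "== None" in ln:
            return i
    return -1

def repair(line):
    """Stage 3 stand-in: generate a patched version of the buggy line."""
    return line.replace("== None", "is None")

def debug(source):
    """Run detect -> localize -> repair, mirroring a programmer's workflow."""
    lines = source.splitlines()
    if not detect_bug(lines):
        return source          # not buggy: leave the function untouched
    i = localize(lines)
    lines[i] = repair(lines[i])
    return "\n".join(lines)

buggy = "def f(x):\n    if x == None:\n        return 0\n    return x"
print(debug(buggy))
```

Staging the problem this way lets each sub-task be supervised separately (function-level classification, line-level localization, line-level generation) while sharing one underlying model.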
-
Publication number: US20240020102A1
Publication date: 2024-01-18
Application number: US18475103
Filing date: 2023-09-26
Applicant: Salesforce, Inc.
Inventor: Yue Wang, Weishi Wang, Shafiq Rayhan Joty, Chu Hong Hoi
IPC: G06F8/41, G06F40/20, G06N3/084, G06F18/214, G06N3/047
CPC classification number: G06F8/427, G06F40/20, G06N3/084, G06F18/214, G06N3/047
Abstract: Embodiments described herein provide a code generation and understanding model built on a Transformer-based encoder-decoder framework. The model is configured to derive generic representations for programming language (PL) and natural language (NL) in the code domain via pre-training on an unlabeled code corpus, and then to benefit many code-related downstream tasks through fine-tuning. Apart from the denoising sequence-to-sequence objectives widely adopted for pre-training on natural language, an identifier tagging and prediction pre-training objective is adopted to enable the model to better leverage the crucial token type information from PL, specifically the identifiers assigned by developers.
-
Publication number: US20230376841A1
Publication date: 2023-11-23
Application number: US17896946
Filing date: 2022-08-26
Applicant: Salesforce, Inc.
Inventor: Hung Le, Yue Wang, Akhilesh Deepak Gotmare, Chu Hong Hoi
IPC: G06N20/00, G06F40/284, G06K9/62, G06F40/289
CPC classification number: G06N20/00, G06F40/284, G06K9/6256, G06F40/289
Abstract: Embodiments described herein provide a reinforcement learning-based framework that engages pretrained language models (LMs) for program synthesis tasks. Specifically, the framework adopts a training strategy that optimizes pretrained LMs for program synthesis in an actor-critic fashion.
-
Publication number: US20230376401A1
Publication date: 2023-11-23
Application number: US17896873
Filing date: 2022-08-26
Applicant: Salesforce, Inc.
Inventor: Yue Wang, Weishi Wang, Shafiq Rayhan Joty, Chu Hong Hoi
CPC classification number: G06F11/3624, G06F8/65
Abstract: Systems and methods for automatic program repair using neural network models are described. After a first buggy code patch is received, a first representation of the first buggy code patch is generated using a retriever encoder of a patch retriever. The patch retriever retrieves, based on the first representation, a first bug-fix code pair from a first plurality of bug-fix code pairs. A first augmented buggy code patch is generated based on the first buggy code patch and the first bug-fix code pair. A patch generator generates a fixed code patch based on the first augmented buggy code patch.
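The retrieve-then-generate flow in the abstract can be sketched with a toy retriever: bag-of-words cosine similarity stands in for the retriever encoder, a two-entry store stands in for the corpus of bug-fix code pairs, and the "augmented" input is the concatenation the patch generator would condition on. The separator token, the store contents, and the similarity function are all illustrative assumptions.

```python
import math
from collections import Counter

def encode(text):
    """Stand-in for the retriever encoder: a bag-of-words token count vector."""
    return Counter(text.split())

def cosine(a, b):
    num = sum(a[t] * b[t] for t in a)
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

# Toy stand-in for the stored plurality of (buggy, fixed) code pairs.
BUG_FIX_PAIRS = [
    ("if x == None :", "if x is None :"),
    ("for i in range ( len ( xs ) ) :", "for i , x in enumerate ( xs ) :"),
]

def retrieve(buggy):
    """Return the stored bug-fix pair whose buggy side is most similar."""
    q = encode(buggy)
    return max(BUG_FIX_PAIRS, key=lambda pair: cosine(q, encode(pair[0])))

def augment(buggy):
    """Build the augmented input the patch generator would condition on."""
    bug, fix = retrieve(buggy)
    return f"{buggy} <sep> {bug} <sep> {fix}"

print(augment("if y == None :"))
```

Conditioning the generator on a retrieved exemplar of how a similar bug was fixed is what distinguishes this setup from generating a patch from the buggy code alone.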
-