-
1.
Publication No.: US12217033B2
Publication Date: 2025-02-04
Application No.: US18475103
Application Date: 2023-09-26
Applicant: Salesforce, Inc.
Inventor: Yue Wang, Weishi Wang, Shafiq Rayhan Joty, Chu Hong Hoi
Abstract: Embodiments described herein provide a code generation and understanding model that builds on a Transformer-based encoder-decoder framework. The model is configured to derive generic representations for programming language (PL) and natural language (NL) in the code domain via pre-training on an unlabeled code corpus, and then to benefit many code-related downstream tasks through fine-tuning. Apart from the denoising sequence-to-sequence objectives widely adopted for pre-training on natural language, an identifier tagging and prediction pre-training objective is adopted to enable the model to better leverage the crucial token-type information from PL, specifically the identifiers assigned by developers.
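The identifier tagging objective lends itself to a short illustration. Below is a minimal sketch, assuming Python's standard tokenize module as a stand-in for the model's own subword tokenizer and AST analysis (the patent's approach; the function name here is hypothetical): each code token is labeled 1 if it is a developer-assigned identifier and 0 otherwise, producing the tag sequence such a model would be trained to predict.

# Minimal sketch of an identifier-tagging objective: label each code token
# as identifier (1) or not (0). Python's tokenize module is a stand-in for
# the model's own tokenizer and AST-based identifier extraction.
import io
import keyword
import tokenize

def identifier_tags(source: str) -> list[tuple[str, int]]:
    """Return (token, tag) pairs; tag is 1 for developer-assigned identifiers."""
    pairs = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            pairs.append((tok.string, 1))   # identifier chosen by the developer
        elif tok.type not in (tokenize.NL, tokenize.NEWLINE, tokenize.ENDMARKER,
                              tokenize.INDENT, tokenize.DEDENT):
            pairs.append((tok.string, 0))   # keyword, literal, or punctuation
    return pairs

print(identifier_tags("def add(a, b):\n    return a + b\n"))
# [('def', 0), ('add', 1), ('(', 0), ('a', 1), ..., ('return', 0), ('a', 1), ...]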
-
2.
Publication No.: US20230376401A1
Publication Date: 2023-11-23
Application No.: US17896873
Application Date: 2022-08-26
Applicant: Salesforce, Inc.
Inventor: Yue Wang, Weishi Wang, Shafiq Rayhan Joty, Chu Hong Hoi
CPC classification number: G06F11/3624, G06F8/65
Abstract: Systems and methods for automatic program repair using neural network models are described. After a first buggy code patch is received, a first representation of the first buggy code patch is generated using a retriever encoder of a patch retriever. The patch retriever retrieves, based on the first representation, a first bug-fix code pair from a first plurality of bug-fix code pairs. A first augmented buggy code patch is generated based on the first buggy code patch and the first bug-fix code pair. A patch generator generates a fixed code patch based on the first augmented buggy code patch.
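The retrieve-then-generate flow in this abstract can be sketched in a few lines. The sketch below is illustrative only: encode is a toy bag-of-characters stand-in for the trained retriever encoder, the two-pair corpus is fabricated for the example, and the patch generator step is left as a comment, since the patent's components are trained neural models.

# Minimal sketch of the retrieve-and-generate repair flow described above.
# The encoder, corpus, and similarity measure are hypothetical stand-ins.
import numpy as np

def encode(code: str) -> np.ndarray:
    """Toy stand-in for the retriever encoder: a bag-of-characters vector."""
    vec = np.zeros(128)
    for ch in code:
        vec[ord(ch) % 128] += 1.0
    return vec / (np.linalg.norm(vec) + 1e-9)

# A tiny fabricated corpus of (buggy, fixed) code pairs.
corpus = [
    ("if x = 0:", "if x == 0:"),
    ("return a +, b", "return a + b"),
]
corpus_vecs = np.stack([encode(bug) for bug, _ in corpus])

def retrieve(buggy: str) -> tuple[str, str]:
    """Return the bug-fix pair whose buggy side is most similar."""
    sims = corpus_vecs @ encode(buggy)          # cosine similarity (unit vectors)
    return corpus[int(np.argmax(sims))]

def augment(buggy: str, pair: tuple[str, str]) -> str:
    """Concatenate the retrieved pair onto the input, as the abstract describes."""
    bug, fix = pair
    return f"{buggy} <sep> {bug} <sep> {fix}"

buggy = "if y = 1:"
augmented = augment(buggy, retrieve(buggy))
print(augmented)
# `augmented` would then be fed to the patch generator (a seq2seq model)
# to produce the fixed code patch.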
-
3.
Publication No.: US20240020102A1
Publication Date: 2024-01-18
Application No.: US18475103
Application Date: 2023-09-26
Applicant: Salesforce, Inc.
Inventor: Yue Wang, Weishi Wang, Shafiq Rayhan Joty, Chu Hong Hoi
IPC: G06F8/41, G06F40/20, G06N3/084, G06F18/214, G06N3/047
CPC classification number: G06F8/427, G06F40/20, G06N3/084, G06F18/214, G06N3/047
Abstract: Embodiments described herein provide a code generation and understanding model that builds on a Transformer-based encoder-decoder framework. The model is configured to derive generic representations for programming language (PL) and natural language (NL) in the code domain via pre-training on an unlabeled code corpus, and then to benefit many code-related downstream tasks through fine-tuning. Apart from the denoising sequence-to-sequence objectives widely adopted for pre-training on natural language, an identifier tagging and prediction pre-training objective is adopted to enable the model to better leverage the crucial token-type information from PL, specifically the identifiers assigned by developers.
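For the denoising sequence-to-sequence objective mentioned in this abstract, a common instantiation is T5-style span corruption: random token spans are replaced with sentinel tokens in the encoder input, and the decoder is trained to emit the masked spans. The sketch below assumes that instantiation; the mask rate, span length, and sentinel format are illustrative choices, not values taken from the patent.

# Minimal sketch of a denoising seq2seq objective via span corruption.
# Hyperparameters and sentinel format are illustrative assumptions.
import random

def span_corrupt(tokens: list[str], mask_rate: float = 0.15,
                 span_len: int = 3, seed: int = 0) -> tuple[list[str], list[str]]:
    """Return (encoder input with sentinels, decoder target of masked spans)."""
    rng = random.Random(seed)
    n_spans = max(1, int(len(tokens) * mask_rate / span_len))
    starts = sorted(rng.sample(range(0, max(1, len(tokens) - span_len)), n_spans))
    source, target, i, sid = [], [], 0, 0
    for s in starts:
        if s < i:                      # skip spans overlapping a previous one
            continue
        source += tokens[i:s] + [f"<extra_id_{sid}>"]
        target += [f"<extra_id_{sid}>"] + tokens[s:s + span_len]
        i, sid = s + span_len, sid + 1
    source += tokens[i:]
    return source, target

toks = "def add ( a , b ) : return a + b".split()
src, tgt = span_corrupt(toks)
print(src)  # e.g. ['def', 'add', '<extra_id_0>', ..., 'b']
print(tgt)  # e.g. ['<extra_id_0>'] followed by the masked span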
-