SYSTEMS AND METHODS FOR CODE UNDERSTANDING AND GENERATION

    Publication (Announcement) No.: US20220382527A1

    Publication (Announcement) Date: 2022-12-01

    Application No.: US17459968

    Filing Date: 2021-08-27

    Abstract: Embodiments described herein provide a code generation and understanding model that builds on a Transformer-based encoder-decoder framework. The model is configured to derive generic representations for programming language (PL) and natural language (NL) in the code domain via pre-training on an unlabeled code corpus, and then to benefit many code-related downstream tasks through fine-tuning. In addition to the denoising sequence-to-sequence objectives widely adopted for pre-training on natural language, an identifier tagging and prediction pre-training objective is adopted to enable the model to better leverage the crucial token type information from PL, specifically the identifiers assigned by developers.
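
    The identifier tagging and prediction objective named in the abstract can be illustrated with a small data-preparation sketch. This is not the implementation claimed in the application; it is one plausible reading, under stated assumptions: Python's standard tokenizer stands in for the model's tokenizer, every non-keyword name counts as a developer-assigned identifier (so built-ins such as print are tagged too), and the function names and the <MASKn> sentinel format are hypothetical.

```python
import io
import keyword
import tokenize

# Token types that carry layout only and are skipped in this sketch.
_SKIP = {tokenize.NEWLINE, tokenize.NL, tokenize.INDENT,
         tokenize.DEDENT, tokenize.ENDMARKER, tokenize.COMMENT}


def tag_identifiers(source):
    """Return (token, tag) pairs; tag is 1 for identifiers, 0 otherwise.

    Assumption: any NAME token that is not a Python keyword is an identifier.
    """
    pairs = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in _SKIP:
            continue
        is_ident = tok.type == tokenize.NAME and not keyword.iskeyword(tok.string)
        pairs.append((tok.string, 1 if is_ident else 0))
    return pairs


def mask_identifiers(pairs):
    """Replace every identifier with a sentinel shared by all its occurrences.

    Returns (masked_tokens, target), where target lists each sentinel
    followed by the identifier it hides (hypothetical target format).
    """
    sentinels = {}
    masked = []
    for token, tag in pairs:
        if tag:
            sentinel = sentinels.setdefault(token, f"<MASK{len(sentinels)}>")
            masked.append(sentinel)
        else:
            masked.append(token)
    target = []
    for name, sentinel in sentinels.items():
        target.extend([sentinel, name])
    return masked, target


example = "def add(a, b):\n    return a + b\n"
pairs = tag_identifiers(example)
masked, target = mask_identifiers(pairs)
print(pairs)
print(" ".join(masked))
print(" ".join(target))
```

    In this sketch the tag sequence would serve as labels for a per-token tagging task, while the masked tokens and the target would form an input/output pair for a sequence-to-sequence prediction task. Reusing one sentinel for all occurrences of the same identifier is a design choice of the sketch, made so that the model would have to recover a name from its usage context.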
