SYSTEMS AND METHODS FOR CODE REPOSITORY EMBEDDING FOR TAGGING AND SUMMARIZATION TASKS USING ATTENTION ON MULTIPLE CODE DOMAINS

    Publication No.: US20230062297A1

    Publication date: 2023-03-02

    Application No.: US17821643

    Filing date: 2022-08-23

    Abstract: Systems and methods for code repository embedding using an attention mechanism for tagging and summarization are disclosed. According to one embodiment, a method for code repository embedding may include: (1) extracting, by a computer program executed by an electronic device, docstring embeddings, code embeddings, and dependency embeddings from scripts in a repository; (2) applying, by the computer program, a machine learning algorithm to each of the docstring embeddings, code embeddings, and dependency embeddings; (3) concatenating, by the computer program, outputs of the machine learning algorithm; (4) weighting, by the computer program, the concatenated outputs of the machine learning algorithm using an attention mechanism, resulting in a repository representation comprising an abstract vector; and (5) tagging or summarizing, by the computer program, the script using its repository representation.
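
Steps (2) through (4) of the abstract can be illustrated with a minimal sketch. It is not the claimed implementation: the abstract does not name the machine learning algorithm or the form of the attention mechanism, so the sketch assumes a single dense layer with tanh per domain and dot-product attention over scripts against a query vector. All names (`encode`, `embed_repository`, `query`), the dimensions, and the random inputs are hypothetical, and the weights are random rather than trained.

```python
import math
import random

DOMAINS = ("docstring", "code", "dependency")


def make_matrix(rows, cols, rng):
    """Random weight matrix; a stand-in for trained parameters (assumption)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def encode(vec, weights):
    """Assumed per-domain model: one dense layer followed by tanh."""
    return [math.tanh(sum(w * x for w, x in zip(row, vec))) for row in weights]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def embed_repository(scripts, encoders, query):
    """Return (repository vector, attention weights over scripts).

    scripts  : list of dicts mapping each domain to that script's embedding
    encoders : dict mapping each domain to its weight matrix
    query    : attention query vector, same length as a concatenated output
    """
    concatenated = []
    for script in scripts:
        parts = []
        for domain in DOMAINS:
            # Step (2): apply the per-domain model to each embedding.
            parts.extend(encode(script[domain], encoders[domain]))
        # Step (3): concatenate the three domain outputs for this script.
        concatenated.append(parts)

    # Step (4): score each script against the query, normalize the scores,
    # and take the weighted sum as the repository representation.
    scores = [sum(q * h for q, h in zip(query, row)) for row in concatenated]
    weights = softmax(scores)
    repo = [
        sum(w * row[i] for w, row in zip(weights, concatenated))
        for i in range(len(query))
    ]
    return repo, weights


# Hypothetical toy data: 5 scripts, 4-dimensional input embeddings per domain.
rng = random.Random(0)
IN_DIM, OUT_DIM = 4, 3
encoders = {d: make_matrix(OUT_DIM, IN_DIM, rng) for d in DOMAINS}
query = [rng.uniform(-1, 1) for _ in range(OUT_DIM * len(DOMAINS))]
scripts = [
    {d: [rng.uniform(-1, 1) for _ in range(IN_DIM)] for d in DOMAINS}
    for _ in range(5)
]

repo_vector, attention = embed_repository(scripts, encoders, query)
```

In this sketch the resulting `repo_vector` would be the input to a downstream tagging or summarization model for step (5), which is omitted here; `attention` indicates how much each script contributed to the repository representation.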
