GENERATIVE-DISCRIMINATIVE LANGUAGE MODELING FOR CONTROLLABLE TEXT GENERATION

    Publication No.: US20210374341A1

    Publication date: 2021-12-02

    Application No.: US17011939

    Filing date: 2020-09-03

    Abstract: The embodiments describe generative-discriminative (GeDi) language modeling for determining a next token in a text sequence. A class conditional language model and a positive control code determine a first class conditional probability for each token candidate. The class conditional language model and a negative control code determine a second class conditional probability for each token candidate. A logarithmic probability difference between the first class conditional probability and the second class conditional probability is determined for each token candidate. An unconditional language model determines an unconditional probability for each token candidate. A combined probability is determined by combining the unconditional probability and the logarithmic probability difference for each token candidate. The next token is selected from the token candidates based on the combined probabilities of the token candidates.
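
The token-selection step in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function name `gedi_next_token`, the `weight` parameter, the additive combination in log space, the greedy argmax selection, and the toy probability tables are all assumptions made for illustration. The abstract only states that the unconditional probability and the logarithmic probability difference are combined.

```python
import math

def gedi_next_token(uncond, pos, neg, weight=1.0):
    """Pick a next token from per-token probabilities (hypothetical helper).

    uncond: token -> probability from the unconditional language model
    pos:    token -> class conditional probability under the positive control code
    neg:    token -> class conditional probability under the negative control code
    weight: assumed scaling of the log-probability difference (not in the abstract)
    """
    scores = {}
    for tok, p_u in uncond.items():
        # Logarithmic probability difference between the two class conditionals.
        log_diff = math.log(pos[tok]) - math.log(neg[tok])
        # Assumed combination rule: add the weighted difference in log space.
        scores[tok] = math.log(p_u) + weight * log_diff

    # Normalize the combined scores into probabilities (softmax over candidates).
    top = max(scores.values())
    norm = sum(math.exp(s - top) for s in scores.values())
    combined = {tok: math.exp(s - top) / norm for tok, s in scores.items()}

    # Greedy selection; sampling from `combined` would also fit the description.
    return max(combined, key=combined.get), combined

# Toy candidate set with made-up probabilities.
uncond = {"great": 0.3, "awful": 0.5, "okay": 0.2}
pos = {"great": 0.6, "awful": 0.1, "okay": 0.3}
neg = {"great": 0.1, "awful": 0.7, "okay": 0.2}
token, combined = gedi_next_token(uncond, pos, neg)
```

In this toy example the unconditional model prefers "awful", but the log-probability difference shifts the combined distribution toward the candidate favored by the positive control code.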

    SYSTEMS AND METHODS FOR NATURAL LANGUAGE CODE SEARCH

    Publication No.: US20230109681A1

    Publication date: 2023-04-13

    Application No.: US17587984

    Filing date: 2022-01-28

    Abstract: Embodiments are directed to translating a natural language query into a code snippet in a programming language that semantically represents the query. The embodiments include a cascading neural network that includes an encoder network and a classifier network. The encoder network is faster but less accurate than the classifier network. The encoder network is trained using a contrastive learning framework to identify code candidates from a large set of code snippets. The classifier network is trained using a binary classifier to identify the code snippet that semantically represents the query from the code candidates.
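
The two-stage cascade described in this abstract can be sketched as follows. This is a toy illustration under stated assumptions, not the patented system: `embed`, `retrieve`, `classify`, and `search` are hypothetical names, a bag-of-words vector with cosine similarity stands in for the contrastively trained encoder network, and a token-overlap (Jaccard) score stands in for the slower binary classifier network.

```python
import math
import re
from collections import Counter

def embed(text):
    # Stand-in for the fast encoder network: a bag-of-words count vector.
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, snippets, k):
    # Stage 1: cheaply rank the whole collection and keep the top-k candidates.
    q = embed(query)
    return sorted(snippets, key=lambda s: cosine(q, embed(s)), reverse=True)[:k]

def classify(query, snippet):
    # Stand-in for the slower classifier network: a match score in [0, 1].
    q, s = set(embed(query)), set(embed(snippet))
    return len(q & s) / len(q | s) if q | s else 0.0

def search(query, snippets, k=2):
    # Stage 2: run the classifier only on the candidates from stage 1.
    candidates = retrieve(query, snippets, k)
    return max(candidates, key=lambda s: classify(query, s))

snippets = [
    "def read_file(path): return open(path).read()",
    "def sort_list(items): return sorted(items)",
    "def write_file(path, data): open(path, 'w').write(data)",
]
best = search("read file from path", snippets, k=2)
```

The design point is that the expensive pairwise scorer is applied to only `k` candidates rather than to every snippet in the collection.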

    Generative-discriminative language modeling for controllable text generation

    Publication No.: US11481552B2

    Publication date: 2022-10-25

    Application No.: US17011939

    Filing date: 2020-09-03

    Abstract: The embodiments describe generative-discriminative (GeDi) language modeling for determining a next token in a text sequence. A class conditional language model and a positive control code determine a first class conditional probability for each token candidate. The class conditional language model and a negative control code determine a second class conditional probability for each token candidate. A logarithmic probability difference between the first class conditional probability and the second class conditional probability is determined for each token candidate. An unconditional language model determines an unconditional probability for each token candidate. A combined probability is determined by combining the unconditional probability and the logarithmic probability difference for each token candidate. The next token is selected from the token candidates based on the combined probabilities of the token candidates.
