SPECULATIVE DECODING IN AUTOREGRESSIVE GENERATIVE ARTIFICIAL INTELLIGENCE MODELS

    Publication number: US20250148015A1

    Publication date: 2025-05-08

    Application number: US19012626

    Application date: 2025-01-07

    Abstract: Certain aspects of the present disclosure provide techniques and apparatus for generating a response to a query input in a generative artificial intelligence model. An example method generally includes receiving a plurality of sets of tokens generated based on an input prompt and a first generative artificial intelligence model, each set of tokens in the plurality of sets of tokens corresponding to a candidate response to the input prompt; selecting, using a second generative artificial intelligence model and recursive adjustment of a target distribution associated with the received plurality of sets of tokens, a set of tokens from the plurality of sets of tokens; and outputting the selected set of tokens as a response to the input prompt.
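
The abstract does not spell out how the target distribution is adjusted. The sketch below is one possible reading, assuming a rejection-sampling rule: each draft candidate is accepted with probability min(1, p/q), and after a rejection the draft probabilities are subtracted from the target distribution, which is then renormalized before the next candidate is tried. The names `select_token`, `draft_probs`, and `target_probs` are hypothetical, and the sketch covers a single token position rather than whole sets of tokens.

```python
import random


def normalize(weights):
    """Scale non-negative weights so they sum to 1."""
    total = sum(weights.values())
    return {tok: w / total for tok, w in weights.items()}


def select_token(candidates, draft_probs, target_probs, rng=random):
    """Try draft candidates in order; adjust the target after each rejection.

    Assumed rule (not taken from the patent text): accept a candidate with
    probability min(1, p/q); on rejection replace p by norm(max(p - q, 0)).
    Returns (token, accepted_from_draft).
    """
    residual = dict(target_probs)
    for tok in candidates:
        q = draft_probs.get(tok, 0.0)
        p = residual.get(tok, 0.0)
        if q > 0.0 and rng.random() < min(1.0, p / q):
            return tok, True
        # Rejected: remove the draft's probability mass and renormalize.
        adjusted = {t: max(w - draft_probs.get(t, 0.0), 0.0)
                    for t, w in residual.items()}
        if sum(adjusted.values()) > 0.0:
            residual = normalize(adjusted)
    # Every candidate was rejected: sample from the adjusted target.
    tokens = list(residual)
    fallback = rng.choices(tokens, weights=[residual[t] for t in tokens])[0]
    return fallback, False


draft = {"a": 0.5, "b": 0.5}
target = {"a": 0.0, "b": 1.0}
print(select_token(["a", "b"], draft, target, random.Random(0)))
```

In this toy case the target model gives "a" zero probability, so "a" is always rejected. The adjusted target then puts all of its mass on "b", which is always accepted.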

    HYBRID GENERATIVE ARTIFICIAL INTELLIGENCE MODELS

    Publication number: US20240362468A1

    Publication date: 2024-10-31

    Application number: US18543533

    Application date: 2023-12-18

    CPC classification number: G06N3/0475

    Abstract: Certain aspects of the present disclosure provide techniques and apparatus for generating a response to an input query using a generative artificial intelligence model. An example method generally includes receiving an input for processing. A prompt representing the received input is generated based on the received input, contextual information associated with the received prompt, and a prompt-generating artificial intelligence model. The generated prompt is output to a generative artificial intelligence model for processing. A response to the generated prompt is received from the generative artificial intelligence model and output as a response to the received input.
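
The flow in this abstract can be illustrated as a two-stage pipeline. The sketch below uses two stand-in callables in place of real models. `toy_prompt_model`, `toy_generative_model`, and the `location` context key are hypothetical and serve only to show the order of the steps.

```python
from typing import Callable, Dict


def answer(user_input: str,
           context: Dict[str, str],
           prompt_model: Callable[[str, Dict[str, str]], str],
           generative_model: Callable[[str], str]) -> str:
    """Build a prompt from the input and its context, then query the generator."""
    prompt = prompt_model(user_input, context)   # prompt-generating model
    response = generative_model(prompt)          # generative model
    return response                              # returned as the response to the input


def toy_prompt_model(user_input: str, context: Dict[str, str]) -> str:
    # Stand-in: prepend one piece of contextual information to the cleaned input.
    return f"[location={context.get('location', 'unknown')}] {user_input.strip()}"


def toy_generative_model(prompt: str) -> str:
    # Stand-in: echo the prompt so the data flow is visible.
    return f"response to: {prompt}"


print(answer(" weather? ", {"location": "Oslo"},
             toy_prompt_model, toy_generative_model))
```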

    SPECULATIVE DECODING IN AUTOREGRESSIVE GENERATIVE ARTIFICIAL INTELLIGENCE MODELS

    Publication number: US20240354345A1

    Publication date: 2024-10-24

    Application number: US18538912

    Application date: 2023-12-13

    CPC classification number: G06F16/9027 G06F40/284

    Abstract: Certain aspects of the present disclosure provide techniques and apparatus for generating a response to a query input in a generative artificial intelligence model. An example method generally includes receiving a plurality of sets of tokens generated based on an input prompt and a first generative artificial intelligence model, each set of tokens in the plurality of sets of tokens corresponding to a candidate response to the input prompt; selecting, using a second generative artificial intelligence model and recursive adjustment of a target distribution associated with the received plurality of sets of tokens, a set of tokens from the plurality of sets of tokens; and outputting the selected set of tokens as a response to the input prompt.

    SPECULATIVE DECODING IN AUTOREGRESSIVE GENERATIVE ARTIFICIAL INTELLIGENCE MODELS

    Publication number: US20240320433A1

    Publication date: 2024-09-26

    Application number: US18479672

    Application date: 2023-10-02

    CPC classification number: G06F40/284 G06F16/2246

    Abstract: Certain aspects of the present disclosure provide techniques and apparatus for generating a response to an input query using generative models. The method generally includes generating, based on an input query and a first generative model, a first plurality of sets of tokens. The first plurality of sets of tokens are output to a second generative model for verification. While waiting to receive an indication of a selected set of tokens from the first plurality of sets of tokens, a second plurality of sets of tokens are speculatively generated. The indication of a selected set of tokens from the first plurality of sets of tokens is received. Tokens from the second plurality of sets of tokens associated with the selected set of tokens are output to the second generative model for verification, and the selected set of tokens is output as a response to the input query.
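
The overlap of drafting and verification described in this abstract can be sketched with a worker thread standing in for the second model. Everything named here (`toy_draft`, `toy_verify`, `pipelined_step`, `generate`) is hypothetical. The toy verifier simply picks the last candidate, so the example shows only the scheduling, not a real acceptance rule.

```python
from concurrent.futures import ThreadPoolExecutor


def toy_draft(prefix):
    """Stand-in draft model: two candidate sets of two tokens each."""
    n = len(prefix)
    return [(f"t{n}a", f"t{n + 1}a"), (f"t{n}b", f"t{n + 1}b")]


def toy_verify(prefix, candidate_sets):
    """Stand-in verifier: returns the index of the selected candidate set."""
    return len(candidate_sets) - 1


def pipelined_step(prefix, first_sets, draft, verify, pool):
    # Send the first plurality of token sets to the verifier.
    pending = pool.submit(verify, prefix, first_sets)
    # While waiting, speculatively draft a continuation for every candidate.
    second_sets = {i: draft(prefix + cand) for i, cand in enumerate(first_sets)}
    selected = pending.result()  # indication of the selected set
    # Keep only the continuations that extend the selected set.
    return first_sets[selected], second_sets[selected]


def generate(prompt, draft, verify, rounds):
    prefix, output = tuple(prompt), []
    candidates = draft(prefix)
    with ThreadPoolExecutor(max_workers=1) as pool:
        for _ in range(rounds):
            chosen, candidates = pipelined_step(prefix, candidates,
                                                draft, verify, pool)
            prefix += chosen
            output.extend(chosen)  # the selected set is output as the response
    return output


print(generate(("q",), toy_draft, toy_verify, rounds=2))
```

Continuations drafted for candidates that the verifier does not select are discarded. That wasted work is the price of hiding the verification latency.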
