NEURAL DIRECTED ACYCLIC GRAPH (DAG) SCHEDULING VIA ONE-SHOT PRIORITY SAMPLING

    Publication number: US20240119301A1

    Publication date: 2024-04-11

    Application number: US18464996

    Filing date: 2023-09-11

    CPC classification number: G06N3/092

    Abstract: A processor-implemented method includes sampling, according to a priority sampling policy, a set of node priorities from a computation graph. Each node priority of the set of node priorities may be associated with a respective node on the computation graph. Additionally, each node may represent an operation of a task performed by an artificial neural network. The method also includes converting, via a list scheduling function, the node priorities to a schedule that associates each node of the computation graph with a processor of a group of processors of a device associated with the artificial neural network, the schedule associated with a makespan. The method further includes performing the task in accordance with the schedule.
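
    The conversion from node priorities to a schedule described in this abstract can be illustrated with a generic list scheduler. The following is a minimal sketch under stated assumptions, not the method claimed in the application: the names `sample_priorities` and `list_schedule`, the Gumbel-noise sampling, the per-node durations, and the "earliest available processor" rule are all hypothetical choices made for illustration.

```python
import heapq
import math
import random


def sample_priorities(scores, rng):
    """Hypothetical stand-in for a priority sampling policy.

    Perturbs each node's score with Gumbel noise, drawing all node
    priorities in a single pass (assumption; the abstract does not
    specify the sampling distribution).
    """
    out = {}
    for node, score in scores.items():
        u = max(rng.random(), 1e-12)  # avoid log(0)
        out[node] = score - math.log(-math.log(u))
    return out


def list_schedule(durations, preds, priorities, num_procs):
    """Greedy list scheduling: repeatedly take the ready node with the
    highest priority and place it on the processor where it can start
    earliest. Returns ({node: (processor, start_time)}, makespan)."""
    succs = {n: [] for n in durations}
    remaining = {n: len(preds.get(n, ())) for n in durations}
    for node, parents in preds.items():
        for p in parents:
            succs[p].append(node)

    # Max-heap on priority (negated for heapq's min-heap).
    ready = [(-priorities[n], n) for n in durations if remaining[n] == 0]
    heapq.heapify(ready)

    proc_free = [0.0] * num_procs  # time at which each processor is idle
    finish = {}
    schedule = {}
    while ready:
        _, node = heapq.heappop(ready)
        # A node cannot start before all of its predecessors finish.
        earliest = max((finish[p] for p in preds.get(node, ())), default=0.0)
        proc = min(range(num_procs), key=lambda i: max(proc_free[i], earliest))
        start = max(proc_free[proc], earliest)
        finish[node] = start + durations[node]
        proc_free[proc] = finish[node]
        schedule[node] = (proc, start)
        for s in succs[node]:
            remaining[s] -= 1
            if remaining[s] == 0:
                heapq.heappush(ready, (-priorities[s], s))

    makespan = max(finish.values(), default=0.0)
    return schedule, makespan


# Toy computation graph: c depends on a; d depends on a and b.
durations = {"a": 2, "b": 3, "c": 1, "d": 2}
preds = {"c": ["a"], "d": ["a", "b"]}
fixed_priorities = {"a": 4.0, "b": 3.0, "c": 2.0, "d": 1.0}
schedule, makespan = list_schedule(durations, preds, fixed_priorities, num_procs=2)

# Sampled priorities can be passed in the same way.
sampled = sample_priorities({n: 0.0 for n in durations}, random.Random(0))
sampled_schedule, sampled_makespan = list_schedule(durations, preds, sampled, 2)
```

    In this sketch the makespan is the finish time of the last node, so different sampled priorities can be compared by the makespan each one yields.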

    SPECULATIVE DECODING IN AUTOREGRESSIVE GENERATIVE ARTIFICIAL INTELLIGENCE MODELS

    Publication number: US20240354346A1

    Publication date: 2024-10-24

    Application number: US18538965

    Filing date: 2023-12-13

    CPC classification number: G06F16/9027 G06F40/284

    Abstract: Certain aspects of the present disclosure provide techniques and apparatus for generating a response to a query input in a generative artificial intelligence model. An example method generally includes receiving a plurality of sets of tokens generated based on an input prompt and a first generative artificial intelligence model, each set of tokens in the plurality of sets of tokens corresponding to a candidate response to the input prompt; selecting, using a second generative artificial intelligence model and recursive adjustment of a target distribution associated with the received plurality of sets of tokens, a set of tokens from the plurality of sets of tokens; and outputting the selected set of tokens as a response to the input prompt.
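
    The abstract does not spell out how the target distribution is adjusted. One known technique that matches the general description is recursive rejection sampling over several draft candidates, sketched below for a single token position. This is an illustrative assumption, not the claimed method: the function names, the single-position simplification, and the use of candidates drawn independently from the draft distribution `q` are all hypothetical.

```python
import random


def residual(target, q):
    """Adjusted target after a rejection: normalized positive part of
    (target - q). Falls back to the current target if nothing remains."""
    diff = [max(t - d, 0.0) for t, d in zip(target, q)]
    total = sum(diff)
    return [x / total for x in diff] if total > 0 else list(target)


def select_token(candidates, p, q, rng):
    """Pick one token from draft candidates (assumed drawn i.i.d. from q).

    Each candidate is accepted with probability min(1, target/q); on a
    rejection the target is adjusted and the next candidate is tried.
    If every candidate is rejected, a token is drawn from the final
    adjusted target. Returns (token, accepted_flag).
    """
    target = list(p)
    for x in candidates:
        if rng.random() < min(1.0, target[x] / q[x]):
            return x, True
        target = residual(target, q)
    token = rng.choices(range(len(target)), weights=target)[0]
    return token, False


# Toy example over a 3-token vocabulary.
rng = random.Random(0)
p = [0.7, 0.2, 0.1]  # distribution of the second (target) model
q = [0.2, 0.5, 0.3]  # distribution of the first (draft) model
drafts = rng.choices(range(3), weights=q, k=2)
token, accepted = select_token(drafts, p, q, rng)
```

    Under these assumptions the selected token is distributed according to `p`, which is why the adjustment is applied to the target rather than to the draft distribution.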

    SPECULATIVE DECODING IN AUTOREGRESSIVE GENERATIVE ARTIFICIAL INTELLIGENCE MODELS

    Publication number: US20250148015A1

    Publication date: 2025-05-08

    Application number: US19012626

    Filing date: 2025-01-07

    Abstract: Certain aspects of the present disclosure provide techniques and apparatus for generating a response to a query input in a generative artificial intelligence model. An example method generally includes receiving a plurality of sets of tokens generated based on an input prompt and a first generative artificial intelligence model, each set of tokens in the plurality of sets of tokens corresponding to a candidate response to the input prompt; selecting, using a second generative artificial intelligence model and recursive adjustment of a target distribution associated with the received plurality of sets of tokens, a set of tokens from the plurality of sets of tokens; and outputting the selected set of tokens as a response to the input prompt.

    SPECULATIVE DECODING IN AUTOREGRESSIVE GENERATIVE ARTIFICIAL INTELLIGENCE MODELS

    Publication number: US20240354345A1

    Publication date: 2024-10-24

    Application number: US18538912

    Filing date: 2023-12-13

    CPC classification number: G06F16/9027 G06F40/284

    Abstract: Certain aspects of the present disclosure provide techniques and apparatus for generating a response to a query input in a generative artificial intelligence model. An example method generally includes receiving a plurality of sets of tokens generated based on an input prompt and a first generative artificial intelligence model, each set of tokens in the plurality of sets of tokens corresponding to a candidate response to the input prompt; selecting, using a second generative artificial intelligence model and recursive adjustment of a target distribution associated with the received plurality of sets of tokens, a set of tokens from the plurality of sets of tokens; and outputting the selected set of tokens as a response to the input prompt.
