Abstract:
Certain aspects provide techniques and apparatuses for efficient operation of a machine learning model based on partitioning the machine learning model. An example method generally includes receiving a graph for a machine learning model. The graph for the machine learning model generally includes a plurality of subgraphs representing different portions of the machine learning model. The machine learning model is instantiated across a plurality of process domains associated with a same application based on the plurality of subgraphs in the graph for the machine learning model. An inference is generated based on executing the machine learning model across the plurality of process domains, and one or more actions are taken based on the generated inference.
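As an illustration only, the flow described above (receive a graph of subgraphs, instantiate them across process domains, then run an inference) might be sketched as follows. The `ProcessDomain` class, the `instantiate` and `infer` helpers, the domain names, and the assumption that the subgraphs form a simple linear chain are all hypothetical and not taken from the disclosure. A real process domain would be a separate execution context, not an in-process object.

```python
class ProcessDomain:
    """Hypothetical stand-in for a process domain belonging to one application."""

    def __init__(self, name):
        self.name = name
        self.loaded = {}  # subgraph name -> callable implementing that subgraph

    def load(self, subgraph_name, fn):
        self.loaded[subgraph_name] = fn

    def run(self, subgraph_name, value):
        return self.loaded[subgraph_name](value)


def instantiate(graph):
    """Place each subgraph in the process domain it names (assumed mapping)."""
    domains = {}
    for subgraph in graph:
        name = subgraph["domain"]
        if name not in domains:
            domains[name] = ProcessDomain(name)
        domains[name].load(subgraph["name"], subgraph["fn"])
    return domains


def infer(graph, domains, value):
    """Run the subgraphs in order, handing each output to the next domain."""
    for subgraph in graph:
        value = domains[subgraph["domain"]].run(subgraph["name"], value)
    return value


# Hypothetical three-part model split across two process domains.
graph = [
    {"name": "front", "domain": "domain_0", "fn": lambda x: x * 2},
    {"name": "middle", "domain": "domain_1", "fn": lambda x: x + 3},
    {"name": "back", "domain": "domain_0", "fn": lambda x: x - 1},
]

domains = instantiate(graph)
result = infer(graph, domains, 5)  # (5 * 2) + 3 - 1
```

In this sketch the partitioning decision is carried by the graph itself, so the same `infer` loop works however the subgraphs are distributed.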
Abstract:
Certain aspects of the present disclosure provide techniques and apparatus for generating a response to an input query using a generative artificial intelligence model. An example method generally includes receiving an input for processing. A prompt representing the received input is generated based on the received input, contextual information associated with the received input, and a prompt-generating artificial intelligence model. The generated prompt is output to a generative artificial intelligence model for processing. A response to the generated prompt is received from the generative artificial intelligence model and output as a response to the received input.
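The two-stage flow described above can be sketched roughly as follows. Both models are replaced by plain functions, and the names `prompt_model`, `generative_model`, and `respond`, along with the prompt format, are hypothetical placeholders, not part of the disclosure.

```python
def prompt_model(user_input, context):
    """Stand-in for the prompt-generating model (hypothetical).

    Combines the received input with contextual information into a prompt.
    """
    ctx = "; ".join(f"{key}={value}" for key, value in sorted(context.items()))
    return f"Context: {ctx}\nRequest: {user_input}"


def generative_model(prompt):
    """Stand-in for the generative model (hypothetical)."""
    return f"[response to a prompt of {len(prompt)} characters]"


def respond(user_input, context, make_prompt=prompt_model, generate=generative_model):
    """Generate a prompt, pass it to the generative model, return its response."""
    prompt = make_prompt(user_input, context)
    return generate(prompt)


answer = respond("summarize my notes", {"locale": "en-US", "app": "notes"})
```

Passing the two models as parameters keeps the pipeline independent of any particular model, which is the only design choice this sketch is meant to show.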
Abstract:
Certain aspects of the present disclosure support efficient implementation of common neuron models. In an aspect, a first memory layout can be allocated for parameters and state variables of instances of a first neuron model, and a second memory layout different from the first memory layout can be allocated for parameters and state variables of instances of a second neuron model having a different complexity than the first neuron model.
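One way to picture two different memory layouts is sketched below. The specific layouts are assumptions chosen for illustration: the simpler model stores a shared parameter once and keeps a single state array, while the more complex model stores a fixed-size record of parameters and state variables per instance. The class names and the field names `a`, `b`, `v`, `u`, and `leak` are hypothetical.

```python
from array import array


class SimpleModelLayout:
    """Hypothetical layout for a simple model: one shared parameter, one state per instance."""

    def __init__(self, n, leak):
        self.leak = leak                    # parameter shared by all instances
        self.v = array("f", [0.0] * n)      # one state variable per instance

    def floats_stored(self):
        return len(self.v) + 1              # per-instance states plus the shared parameter


class ComplexModelLayout:
    """Hypothetical layout for a complex model: a record of four floats per instance."""

    FIELDS = ("a", "b", "v", "u")           # two parameters and two state variables

    def __init__(self, n):
        self.data = array("f", [0.0] * (n * len(self.FIELDS)))

    def _offset(self, i, name):
        return i * len(self.FIELDS) + self.FIELDS.index(name)

    def get(self, i, name):
        return self.data[self._offset(i, name)]

    def set(self, i, name, value):
        self.data[self._offset(i, name)] = value

    def floats_stored(self):
        return len(self.data)


simple = SimpleModelLayout(100, leak=0.9)
complex_layout = ComplexModelLayout(100)
complex_layout.set(7, "u", 0.5)
```

Under these assumptions the simple layout stores 101 floats for 100 instances and the complex layout stores 400, which shows why a single layout sized for the most complex model would waste memory on the simpler one.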
Abstract:
Certain aspects of the present disclosure support assigning neurons and/or synapses to group tags, where each group tag has an associated set of parameters. Neurons or synapses in a population can be assigned a group tag. Changing a parameter associated with that group tag then changes the parameter for all neurons or synapses in the group.
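A minimal sketch of the group-tag idea follows, assuming each synapse holds a reference to its tag and reads parameters through it. The `GroupTag` and `Synapse` classes and the `learning_rate` parameter are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class GroupTag:
    """Hypothetical group tag holding the parameters shared by its members."""
    name: str
    params: dict = field(default_factory=dict)


@dataclass
class Synapse:
    """Hypothetical synapse that reads shared parameters through its group tag."""
    tag: GroupTag

    def param(self, key):
        # Look the value up on the tag at read time, so edits to the tag
        # are visible to every member of the group.
        return self.tag.params[key]


excitatory = GroupTag("excitatory", {"learning_rate": 0.01})
inhibitory = GroupTag("inhibitory", {"learning_rate": 0.05})

population = [Synapse(excitatory) for _ in range(3)] + [Synapse(inhibitory)]

# A single update to the tag changes the parameter for the whole group.
excitatory.params["learning_rate"] = 0.02

rates = [synapse.param("learning_rate") for synapse in population]
```

After the update, the three synapses tagged `excitatory` report the new value while the synapse tagged `inhibitory` is unaffected.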