WEIGHT OSCILLATION MITIGATION DURING MACHINE LEARNING

    Publication Number: US20250005452A1

    Publication Date: 2025-01-02

    Application Number: US18708948

    Filing Date: 2023-01-24

    Abstract: Certain aspects of the present disclosure provide techniques and apparatus for mitigating weight oscillation during quantization-aware training. In one example, a method includes identifying oscillation of a parameter of a machine learning model during quantization-aware training of the machine learning model, and applying an oscillation mitigation procedure during the quantization-aware training of the machine learning model in response to identifying the oscillation, the oscillation mitigation procedure comprising at least one of oscillation dampening or parameter freezing.
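
    The parameter-freezing branch can be pictured with a short sketch. This is an illustrative sketch only, not the patented procedure: it tracks how often each latent weight's quantized integer value flips between training steps and freezes weights whose flip frequency (an exponential moving average) exceeds a threshold. The decay and threshold values are hypothetical.

        import numpy as np

        class OscillationTracker:
            """Detect oscillating weights during quantization-aware training."""

            def __init__(self, num_weights, decay=0.99, freeze_threshold=0.02):
                self.prev_int = None                       # quantized values from the last step
                self.freq = np.zeros(num_weights)          # EMA of flip events per weight
                self.frozen = np.zeros(num_weights, dtype=bool)
                self.decay = decay
                self.freeze_threshold = freeze_threshold

            def update(self, latent_weights, scale):
                q = np.round(latent_weights / scale)       # position on the integer quantization grid
                if self.prev_int is not None:
                    flipped = (q != self.prev_int).astype(float)
                    self.freq = self.decay * self.freq + (1.0 - self.decay) * flipped
                    self.frozen |= self.freq > self.freeze_threshold
                self.prev_int = q
                return self.frozen                         # mask of weights to stop updating

    A training loop would skip gradient updates for weights flagged by the returned mask; the dampening alternative would instead add a penalty pulling oscillating latent weights toward their nearest quantized values.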

    ADAPTERS FOR QUANTIZATION

    Publication Number: US20230419087A1

    Publication Date: 2023-12-28

    Application Number: US18330990

    Filing Date: 2023-06-07

    CPC classification number: G06N3/0495 G06N3/044 G06N3/08

    Abstract: A processor-implemented method for adaptive quantization in an artificial neural network (ANN) includes receiving an ANN model. The ANN model has multiple channels of target activations. A quantization module is incorporated between a first linear layer of the ANN and a second linear layer of the ANN to generate an adapted ANN. The quantization module scales a first set of weights and biases of the first linear layer based on a learnable quantization module parameter and scales a second set of weights of the second linear layer based on an inverse of the learnable quantization module parameter.
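
    A minimal sketch of the scaling idea, assuming a linear -> ReLU -> linear block: the positive per-channel vector s stands in for the learnable quantization module parameter and is simply random here. Scaling the first layer by s and the second by 1/s leaves the block's function unchanged while reshaping the intermediate activations for quantization.

        import numpy as np

        def adapt_pair(W1, b1, W2, s):
            """Scale layer-1 weights/biases by s and layer-2 weights by 1/s."""
            W1_s = W1 * s[:, None]      # scale each output channel of the first layer
            b1_s = b1 * s
            W2_s = W2 / s[None, :]      # undo the scale on the second layer's input channels
            return W1_s, b1_s, W2_s

        rng = np.random.default_rng(0)
        W1, b1 = rng.normal(size=(8, 4)), rng.normal(size=8)
        W2 = rng.normal(size=(3, 8))
        s = np.exp(rng.normal(size=8))                   # strictly positive learnable scales
        x = rng.normal(size=4)

        reference = W2 @ np.maximum(W1 @ x + b1, 0.0)    # original block output
        W1_s, b1_s, W2_s = adapt_pair(W1, b1, W2, s)
        adapted = W2_s @ np.maximum(W1_s @ x + b1_s, 0.0)
        assert np.allclose(reference, adapted)           # function preserved before quantization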

    QUANTIZATION RANGE ESTIMATION FOR QUANTIZED TRAINING

    Publication Number: US20240144017A1

    Publication Date: 2024-05-02

    Application Number: US18548557

    Filing Date: 2022-04-18

    CPC classification number: G06N3/084 G06N3/0495

    Abstract: Certain aspects of the present disclosure provide techniques for efficient quantized learning. A tensor is received at a layer of a neural network, and a current tensor is generated at a first bitwidth based on the received tensor. One or more quantization parameter values are determined based on the current tensor. The current tensor is quantized to a lower bitwidth based on one or more quantization parameter values determined from a previous tensor generated during training of the neural network.
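
    A sketch of the range-estimation idea, under assumed details (symmetric int8-style fake quantization with a single scalar scale): the scale applied to the current tensor is the one estimated from the previous step's tensor, and the current tensor only refreshes the estimate for the next step.

        import numpy as np

        class DelayedRangeQuantizer:
            """Quantize step t with the range estimated at step t-1."""

            def __init__(self, bits=8):
                self.qmax = 2 ** (bits - 1) - 1
                self.scale = None                # scale derived from the previous tensor

            def __call__(self, x):
                if self.scale is not None:
                    q = np.clip(np.round(x / self.scale), -self.qmax - 1, self.qmax)
                    x_hat = q * self.scale       # fake-quantized tensor at the lower bitwidth
                else:
                    x_hat = x                    # first step: no previous range yet
                # update the range estimate from the current tensor for the next step
                self.scale = np.abs(x).max() / self.qmax
                return x_hat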

    OUTLIER ATTENUATION IN TRANSFORMER NEURAL NETWORKS

    Publication Number: US20240386239A1

    Publication Date: 2024-11-21

    Application Number: US18482196

    Filing Date: 2023-10-06

    Abstract: Certain aspects of the present disclosure provide techniques and apparatus for processing data using a transformer neural network. The method generally includes receiving an input for processing using a transformer neural network. An attention output is generated in the transformer neural network. Generally, the attention output may be generated such that outlier values for the attention output are attenuated in the transformer neural network. An output of the transformer neural network is generated based on the generated attention output.
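
    The abstract does not state the attenuation mechanism, so the sketch below shows one generic, hypothetical way to bound an attention output so that large-magnitude outliers are attenuated (a tanh soft clip with an assumed threshold); it is not the claimed technique.

        import numpy as np

        def softmax(z, axis=-1):
            z = z - z.max(axis=axis, keepdims=True)
            e = np.exp(z)
            return e / e.sum(axis=axis, keepdims=True)

        def attention_with_soft_clip(Q, K, V, clip=6.0):
            """Scaled dot-product attention with the output smoothly bounded to [-clip, clip]."""
            d = Q.shape[-1]
            attn = softmax(Q @ K.swapaxes(-1, -2) / np.sqrt(d))
            out = attn @ V
            return clip * np.tanh(out / clip)    # outliers are squashed toward +/- clip

        rng = np.random.default_rng(3)
        Q, K, V = (rng.normal(size=(5, 16)) for _ in range(3))
        print(attention_with_soft_clip(Q, K, V).shape)   # (5, 16)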

    DATA-AWARE LAYER DECOMPOSITION FOR NEURAL NETWORK COMPRESSION

    Publication Number: US20200293864A1

    Publication Date: 2020-09-17

    Application Number: US16299375

    Filing Date: 2019-03-12

    Abstract: Certain aspects of the present disclosure are directed to methods and apparatus for operating an artificial neural network using data-aware layer decomposition. One exemplary method generally includes receiving a first input signal at a first layer of the artificial neural network; generating a first output signal of the first layer based, at least in part, on a weight matrix of the first layer and the first input signal; decomposing the weight matrix; generating an approximate output signal of the first layer based, at least in part, on the decomposed weight matrix and the first input signal; generating an updated decomposed weight matrix by minimizing a difference between the generated first output signal of the first layer and the approximate output signal of the first layer; and operating the first layer of the artificial neural network using the updated decomposed weight matrix.
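
    A minimal sketch, assuming a rank-r factorization W ≈ U @ V refined on real input data X so that the layer outputs match, not just the weights; the alternating least-squares refinement is an illustrative stand-in for the claimed update step.

        import numpy as np

        def data_aware_decompose(W, X, rank, iters=5):
            """Return U (out x r), V (r x in) reducing ||X @ W.T - X @ (U @ V).T||_F."""
            # plain (data-agnostic) truncated SVD as the starting point
            Us, S, Vt = np.linalg.svd(W, full_matrices=False)
            U, V = Us[:, :rank], S[:rank, None] * Vt[:rank]
            Y = X @ W.T                                   # reference layer output on real data
            for _ in range(iters):
                U, _ = np.linalg.qr(U)                    # keep U column-orthonormal
                V = np.linalg.lstsq(X, Y @ U, rcond=None)[0].T    # best V given U
                U = np.linalg.lstsq(X @ V.T, Y, rcond=None)[0].T  # best U given V
            return U, V

        rng = np.random.default_rng(1)
        W = rng.normal(size=(16, 32))
        X = rng.normal(size=(200, 32))
        U, V = data_aware_decompose(W, X, rank=8)
        err = np.linalg.norm(X @ W.T - X @ (U @ V).T) / np.linalg.norm(X @ W.T)
        print(f"relative output error at rank 8: {err:.3f}")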

    PROCESSING VIDEO DATA USING DELTA QUANTIZATION

    Publication Number: US20240169708A1

    Publication Date: 2024-05-23

    Application Number: US18338184

    Filing Date: 2023-06-20

    CPC classification number: G06V10/776 G06V10/7715 G06V20/46 G06V10/82

    Abstract: Certain aspects of the present disclosure provide techniques and apparatus for delta quantization for video processing and other data streams with temporal content. An example method generally includes receiving image data including at least a first frame and a second frame, generating a first convolutional output based on the first frame using a machine learning model, generating a second convolutional output based on a difference between the first frame and the second frame using one or more quantizers of the machine learning model, generating a third convolutional output associated with the second frame as a combination of the first convolutional output and the second convolutional output, and performing image processing based on the first convolutional output and the third convolutional output.
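
    A sketch under simplifying assumptions (one linear convolution, single-channel frames, symmetric fake quantization standing in for the model's quantizers): because convolution is linear, convolving the quantized frame-to-frame difference and adding it to the first frame's output approximates the second frame's output cheaply.

        import numpy as np

        def conv2d(x, k):
            """Valid-mode single-channel 2-D correlation."""
            H, W = x.shape
            kh, kw = k.shape
            out = np.zeros((H - kh + 1, W - kw + 1))
            for i in range(out.shape[0]):
                for j in range(out.shape[1]):
                    out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
            return out

        def fake_quant(x, bits=8):
            """Symmetric fake quantization; small deltas need few bits."""
            qmax = 2 ** (bits - 1) - 1
            scale = max(np.abs(x).max() / qmax, 1e-12)
            return np.clip(np.round(x / scale), -qmax - 1, qmax) * scale

        rng = np.random.default_rng(2)
        kernel = rng.normal(size=(3, 3))
        frame1 = rng.normal(size=(32, 32))
        frame2 = frame1 + 0.05 * rng.normal(size=(32, 32))        # small temporal change

        out1 = conv2d(frame1, kernel)                             # first convolutional output
        delta_out = conv2d(fake_quant(frame2 - frame1), kernel)   # quantized-delta output
        out2 = out1 + delta_out                                   # third output for the second frame
        print("max error vs. full precision:", np.abs(out2 - conv2d(frame2, kernel)).max())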
