METHOD AND APPARATUS WITH NEURAL NETWORK OPTIMIZATION

    Publication number: US20240202527A1

    Publication date: 2024-06-20

    Application number: US18353432

    Application date: 2023-07-17

    CPC classification number: G06N3/082 G06N3/04

    Abstract: A method of processing data is performed by a computing device including processing hardware and storage hardware, the method including: converting, by the processing hardware, a neural network, stored in the storage hardware, from a first neural network format into a second neural network format; obtaining, by the processing hardware, information about hardware configured to perform a neural network operation for the neural network and obtaining partition information; dividing the neural network in the second neural network format into partitions, wherein the dividing is based on the information about the hardware and the partition information, wherein each partition includes a respective layer with an input thereto and an output thereof; optimizing each of the partitions based on a relationship between the input and the output of the corresponding layer; and converting the optimized partitions into the first neural network format.
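The flow in this abstract (convert, partition, optimize each partition, convert back) can be illustrated with a minimal sketch. Everything named here is an assumption for illustration: the abstract does not identify the two formats, the shape of the hardware or partition information, or the optimization applied. The sketch treats a list of dicts as the first format and `Layer` objects as the second, takes the partition size as the smaller of a hardware limit and a requested size, and uses a toy optimization that drops identity layers whose input and output shapes match.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Layer:
    """Hypothetical 'second format' representation of one layer."""
    name: str
    op: str
    input_shape: Tuple[int, ...]
    output_shape: Tuple[int, ...]


def to_intermediate(model: List[dict]) -> List[Layer]:
    # First format (plain dicts, an assumption) -> second format (Layer objects).
    return [Layer(d["name"], d["op"], tuple(d["in"]), tuple(d["out"])) for d in model]


def to_original(layers: List[Layer]) -> List[dict]:
    # Second format -> first format.
    return [
        {"name": l.name, "op": l.op, "in": list(l.input_shape), "out": list(l.output_shape)}
        for l in layers
    ]


def partition(layers: List[Layer], hardware_info: Dict[str, int],
              partition_info: Dict[str, int]) -> List[List[Layer]]:
    # Assumed rule: consecutive chunks whose size is capped by both the
    # hardware limit and the requested partition size.
    size = min(hardware_info["max_layers_per_device"],
               partition_info["layers_per_partition"])
    return [layers[i:i + size] for i in range(0, len(layers), size)]


def optimize(part: List[Layer]) -> List[Layer]:
    # Toy optimization based on the input/output relationship of each layer:
    # an identity layer whose output shape equals its input shape is removed.
    return [l for l in part
            if not (l.op == "identity" and l.input_shape == l.output_shape)]


def run_pipeline(model: List[dict], hardware_info: Dict[str, int],
                 partition_info: Dict[str, int]) -> List[List[dict]]:
    layers = to_intermediate(model)
    parts = partition(layers, hardware_info, partition_info)
    optimized = [optimize(p) for p in parts]
    # Each optimized partition is converted back to the first format.
    return [to_original(p) for p in optimized]
```

With a four-layer model, a hardware limit of two layers per device, and a requested partition size of three, this sketch yields two partitions, and an identity layer in the second partition is removed before conversion back.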

    APPARATUS AND METHOD WITH QUANTIZATION CONFIGURATOR

    Publication number: US20240185077A1

    Publication date: 2024-06-06

    Application number: US18320896

    Application date: 2023-05-19

    CPC classification number: G06N3/086

    Abstract: Apparatuses and methods for drawing a quantization configuration are disclosed. A method may include generating genes by cataloging possible combinations of a quantization precision and a calibration method for each of the layers of a pre-trained neural network, determining layer sensitivity for each of the layers based on combinations corresponding to the genes, determining priorities of the genes and selecting some of the genes based on the respective priority of the genes, generating progeny genes by performing crossover on the selected genes, calculating layer sensitivity for each of the layers corresponding to a combination of the crossover, and updating one or more of the genes using the progeny genes based on a comparison of layer sensitivity of the genes and layer sensitivity of the progeny genes.
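The genetic search described in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented method: the precision values, the calibration method names, the `layer_sensitivity` formula, the single-point crossover, and the replace-the-worst update rule are all hypothetical stand-ins, since the abstract does not specify them. A gene here is a tuple holding one (precision, calibration) pair per layer.

```python
import itertools
import random

PRECISIONS = [4, 8, 16]                   # assumed bit widths
CALIBRATIONS = ["minmax", "percentile"]   # assumed calibration method names
CHOICES = list(itertools.product(PRECISIONS, CALIBRATIONS))


def layer_sensitivity(layer_idx, choice):
    # Hypothetical stand-in for a measured sensitivity: fewer bits and
    # earlier layers are treated as more sensitive.
    bits, calib = choice
    penalty = 1.0 if calib == "minmax" else 0.8
    return penalty * (16 / bits) / (layer_idx + 1)


def gene_sensitivity(gene):
    # Total sensitivity of a gene, summed over its layers.
    return sum(layer_sensitivity(i, c) for i, c in enumerate(gene))


def crossover(a, b, rng):
    # Single-point crossover (an assumption); requires at least two layers.
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def evolve(num_layers, population=8, generations=5, seed=0):
    rng = random.Random(seed)
    genes = [tuple(rng.choice(CHOICES) for _ in range(num_layers))
             for _ in range(population)]
    for _ in range(generations):
        # Priority: lower total sensitivity ranks first; the top half is selected.
        genes.sort(key=gene_sensitivity)
        parents = genes[: population // 2]
        for _ in range(population // 2):
            a, b = rng.sample(parents, 2)
            child = crossover(a, b, rng)
            # Update: a progeny gene replaces the current worst gene
            # only if its sensitivity is strictly lower.
            worst = max(range(len(genes)), key=lambda i: gene_sensitivity(genes[i]))
            if gene_sensitivity(child) < gene_sensitivity(genes[worst]):
                genes[worst] = child
    return min(genes, key=gene_sensitivity)
```

Because this toy objective only minimizes sensitivity, it drifts toward the highest precision. A practical search would presumably weigh sensitivity against a cost such as model size or latency, which the abstract does not detail.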
