-
Publication No.: US20190354842A1
Publication Date: 2019-11-21
Application No.: US16413535
Application Date: 2019-05-15
Applicant: QUALCOMM Incorporated
Abstract: A method for quantizing a neural network includes modeling noise of parameters of the neural network. The method also includes assigning grid values to each realization of the parameters according to a concrete distribution that depends on a local fixed-point quantization grid and the modeled noise. The method further includes computing a fixed-point value representing parameters of a hard fixed-point quantized neural network.
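The abstract describes assigning noisy parameter realizations to grid values via a concrete (Gumbel-softmax) distribution and then producing a hard fixed-point value. A minimal sketch of that idea, with all names, the additive-Gaussian noise model, and the default hyperparameters being illustrative assumptions rather than the patent's method:

```python
import math
import random

def concrete_quantize(w, grid, noise_std=0.1, temperature=0.5, rng=None):
    """Assign a noisy realization of parameter w to quantization grid
    values via a concrete (Gumbel-softmax) relaxation, then harden to
    a single fixed-point grid value. Illustrative sketch only."""
    rng = rng or random.Random(0)
    # Model parameter noise as additive Gaussian (an assumption).
    w_noisy = w + rng.gauss(0.0, noise_std)
    # Logits favor grid points near the noisy realization.
    logits = [-(w_noisy - g) ** 2 for g in grid]
    # Concrete relaxation: softmax over Gumbel-perturbed logits.
    gumbels = [-math.log(-math.log(rng.random())) for _ in grid]
    scores = [(l + g) / temperature for l, g in zip(logits, gumbels)]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    probs = [e / z for e in exps]
    # Soft (expected) value for training; hard value for deployment.
    soft = sum(p * g for p, g in zip(probs, grid))
    hard = grid[max(range(len(grid)), key=lambda i: probs[i])]
    return soft, hard
```

The soft value keeps the assignment differentiable during training, while the hard value is what a fixed-point network would actually store.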
-
Publication No.: US20240195434A1
Publication Date: 2024-06-13
Application No.: US18556622
Application Date: 2022-05-31
Applicant: QUALCOMM Incorporated
Inventor: Matthias REISSER , Aleksei TRIASTCYN , Christos LOUIZOS
CPC classification number: H03M7/70 , G06F30/27 , G06N7/01 , H03M7/3057
Abstract: Certain aspects of the present disclosure provide techniques for performing federated learning, including receiving a global model from a federated learning server; determining an updated model based on the global model and local data; and sending the updated model to the federated learning server using relative entropy coding.
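The federated round described above (receive global model, update locally, send the update back) can be sketched as follows. The toy gradient step and the averaging server are stand-ins; in particular, the relative entropy coding of the update is abstracted away here, where a real implementation would transmit compressed samples rather than raw weights:

```python
def local_update(global_weights, local_data, lr=0.1):
    """One illustrative client step: gradient descent on squared error
    for a scalar linear model (stand-in for real local training)."""
    w = list(global_weights)
    for x, y in local_data:
        pred = w[0] * x
        grad = 2 * (pred - y) * x
        w[0] -= lr * grad
    return w

def federated_round(global_weights, clients):
    """Server side: collect each client's updated model and average.
    The relative-entropy-coded transmission is omitted in this sketch."""
    updates = [local_update(global_weights, data) for data in clients]
    n = len(updates)
    return [sum(u[i] for u in updates) / n
            for i in range(len(global_weights))]
```

Two clients each holding one sample of the line y = 2x pull a zero-initialized weight toward 2 over repeated rounds.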
-
Publication No.: US20230169350A1
Publication Date: 2023-06-01
Application No.: US18040111
Application Date: 2021-09-28
Applicant: QUALCOMM Incorporated
Inventor: Christos LOUIZOS , Hossein HOSSEINI , Matthias REISSER , Max WELLING , Joseph Binamira SORIAGA
IPC: G06N3/098
CPC classification number: G06N3/098
Abstract: Aspects described herein provide techniques for performing federated learning of a machine learning model, comprising: for each respective client of a plurality of clients and for each training round in a plurality of training rounds: generating a subset of model elements for the respective client based on sampling a gate probability distribution for each model element of a set of model elements for a global machine learning model; transmitting to the respective client: the subset of model elements; and a set of gate probabilities based on the sampling, wherein each gate probability of the set of gate probabilities is associated with one model element of the subset of model elements; receiving from each respective client of the plurality of clients a respective set of model updates; and updating the global machine learning model based on the respective set of model updates from each respective client of the plurality of clients.
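The per-client subset generation described above (sample a gate probability distribution per model element, then send the sampled subset plus its gate probabilities) can be sketched with independent Bernoulli gates. The names and the Bernoulli choice are assumptions for illustration:

```python
import random

def sample_submodel(model_elements, gate_probs, rng=None):
    """Sample a client-specific subset of model elements by drawing a
    Bernoulli gate per element, returning both the subset and the gate
    probabilities associated with the sampled elements. Sketch only."""
    rng = rng or random.Random(0)
    subset, subset_probs = [], []
    for elem, p in zip(model_elements, gate_probs):
        if rng.random() < p:          # element survives its gate
            subset.append(elem)
            subset_probs.append(p)
    return subset, subset_probs
```

The server would transmit `subset` and `subset_probs` to the client, then aggregate the returned updates into the global model.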
-
Publication No.: US20210089922A1
Publication Date: 2021-03-25
Application No.: US17030315
Application Date: 2020-09-23
Applicant: QUALCOMM Incorporated
Inventor: Yadong LU , Ying WANG , Tijmen Pieter Frederik BLANKEVOORT , Christos LOUIZOS , Matthias REISSER , Jilei HOU
Abstract: A method for compressing a deep neural network includes determining a pruning ratio for a channel and a mixed-precision quantization bit-width based on an operational budget of a device implementing the deep neural network. The method further includes quantizing a weight parameter of the deep neural network and/or an activation parameter of the deep neural network based on the quantization bit-width. The method also includes pruning the channel of the deep neural network based on the pruning ratio.
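The two compression operations named in the abstract, channel pruning at a given ratio and quantization at a given bit-width, can each be sketched in a few lines. The L1-norm pruning criterion and uniform symmetric quantizer below are common illustrative choices, not necessarily the patent's:

```python
def quantize(values, bits):
    """Uniform symmetric quantization of a weight list to `bits` bits
    (an illustrative quantizer, not the patent's exact scheme)."""
    levels = 2 ** (bits - 1) - 1
    scale = max(abs(v) for v in values) / levels or 1.0
    return [round(v / scale) * scale for v in values]

def prune_channels(channels, pruning_ratio):
    """Keep the channels with the largest L1 norm and drop the rest,
    according to the given pruning ratio."""
    keep = max(1, int(len(channels) * (1 - pruning_ratio)))
    ranked = sorted(channels, key=lambda ch: sum(abs(w) for w in ch),
                    reverse=True)
    return ranked[:keep]
```

A budget-driven search would choose `bits` per layer and `pruning_ratio` per channel group so the compressed network fits the device's operational budget.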
-
Publication No.: US20210073650A1
Publication Date: 2021-03-11
Application No.: US17016130
Application Date: 2020-09-09
Applicant: QUALCOMM Incorporated
Inventor: Matthias REISSER , Saurabh Kedar PITRE , Xiaochun ZHU , Edward Harris TEAGUE , Zhongze WANG , Max WELLING
Abstract: In one embodiment, a method of simulating an operation of an artificial neural network on a binary neural network processor includes receiving a binary input vector for a layer including a probabilistic binary weight matrix and performing vector-matrix multiplication of the input vector with the probabilistic binary weight matrix, wherein the multiplication results are modified by simulated binary-neural-processing hardware noise, to generate a binary output vector, where the simulation is performed in the forward pass of a training algorithm for a neural network model for the binary-neural-processing hardware.
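The simulated forward pass described above (binary input times a probabilistic binary weight matrix, with hardware noise injected into the multiplication results before binarizing the output) can be sketched as below. The {-1, +1} encoding, Gaussian noise model, and sign-threshold output are illustrative assumptions:

```python
import random

def noisy_binary_matvec(x, weight_probs, noise_std=0.5, rng=None):
    """Simulate one layer on a binary neural network processor:
    sample binary {-1, +1} weights from per-entry probabilities,
    perturb each dot product with Gaussian noise standing in for
    hardware noise, then binarize with a sign threshold. Sketch only."""
    rng = rng or random.Random(0)
    out = []
    for col_probs in weight_probs:   # one output neuron per column
        w = [1 if rng.random() < p else -1 for p in col_probs]
        acc = sum(xi * wi for xi, wi in zip(x, w))
        acc += rng.gauss(0.0, noise_std)   # simulated hardware noise
        out.append(1 if acc >= 0 else -1)
    return out
```

In training, this noisy simulation would run in the forward pass so the learned weight probabilities become robust to the hardware's analog noise.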
-
Publication No.: US20190354865A1
Publication Date: 2019-11-21
Application No.: US16417430
Application Date: 2019-05-20
Applicant: QUALCOMM Incorporated
Inventor: Matthias REISSER , Max WELLING , Efstratios GAVVES , Christos LOUIZOS
Abstract: A neural network may be configured to receive, during a training phase of the neural network, a first input at an input layer of the neural network. The neural network may determine, during the training phase, a first classification at an output layer of the neural network based on the first input. The neural network may adjust, during the training phase and based on a comparison between the determined first classification and an expected classification of the first input, weights for artificial neurons of the neural network based on a loss function. The neural network may output, during an operational phase of the neural network, a second classification determined based on a second input, the second classification being determined by processing the second input through the artificial neurons using the adjusted weights.
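The train-then-deploy loop in the abstract (compare the output-layer classification against the expected one, adjust weights via a loss function, then classify new inputs with the adjusted weights) can be sketched with a single linear unit. The squared-error loss and gradient step are generic illustrative choices:

```python
def train_step(weights, inputs, target, lr=0.1):
    """Training phase: compute a prediction, compare it to the expected
    value via squared-error loss, and adjust the weights by one gradient
    step. A generic sketch, not the patent's exact procedure."""
    pred = sum(w * x for w, x in zip(weights, inputs))
    grad = [2 * (pred - target) * x for x in inputs]
    return [w - lr * g for w, g in zip(weights, grad)]

def predict(weights, inputs):
    """Operational phase: classify a new input with adjusted weights."""
    score = sum(w * x for w, x in zip(weights, inputs))
    return 1 if score >= 0 else 0
```

Repeating `train_step` over labeled examples plays the role of the training phase; `predict` is the operational phase using the adjusted weights.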
-