-
Publication Number: US20200302298A1
Publication Date: 2020-09-24
Application Number: US16826472
Filing Date: 2020-03-23
Applicant: QUALCOMM Incorporated
Abstract: Various embodiments include methods, and neural network computing devices implementing the methods, for generating an approximation neural network that corrects for errors due to approximation operations. Various embodiments may include performing approximation operations on a weights tensor associated with a layer of a neural network to generate an approximation weights tensor, determining an expected output error of the layer in the neural network due to the approximation weights tensor, subtracting the expected output error from a bias parameter of the layer to determine an adjusted bias parameter, and substituting the adjusted bias parameter for the bias parameter in the layer. Such operations may be performed for all layers in a neural network to produce an approximation version of the neural network for execution on a resource-limited processor.
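A minimal sketch of the bias-correction step this abstract describes, assuming a fully connected layer and an estimate of the expected input E[x] (e.g., from calibration data); the function correct_bias and the rounding used as the "approximation operation" are illustrative assumptions, not the patent's implementation.

import numpy as np

def correct_bias(weights, approx_weights, bias, expected_input):
    """Adjust the bias so the layer's expected output is unchanged.

    expected_input: an estimate of E[x], for example from calibration data
    or from batch-norm statistics of the preceding layer.
    """
    # Expected output error introduced by approximating the weights:
    # E[(W_approx - W) x] = (W_approx - W) E[x]
    expected_error = (approx_weights - weights) @ expected_input
    # Subtract the expected error from the bias to compensate.
    return bias - expected_error

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 16))
W_approx = np.round(W * 4) / 4          # stand-in for any approximation operation
b = rng.normal(size=8)
mean_x = rng.normal(size=16)            # E[x] estimated from calibration data
b_adjusted = correct_bias(W, W_approx, b, mean_x)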
-
Publication Number: US20250111232A1
Publication Date: 2025-04-03
Application Number: US18476729
Filing Date: 2023-09-28
Applicant: QUALCOMM Incorporated
Inventor: Tycho VAN DER OUDERAA , Markus NAGEL , Marinus Willem VAN BAALEN , Tijmen Pieter Frederik BLANKEVOORT
IPC: G06N3/082
Abstract: An apparatus has one or more memories and one or more processors coupled to the memories. The processor(s) are configured to estimate a local curvature of a loss landscape of a neural network. The processor(s) are also configured to dynamically allocate parameters to be removed from the neural network based on the local curvature. The processor(s) are further configured to update the remaining weights of the neural network based on the parameters to be removed.
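The outline in this abstract (estimate local curvature, pick parameters to remove, update the remaining weights) can be illustrated with a classical Optimal-Brain-Surgeon-style step; the sketch below is only such an illustration under that assumption and is not claimed to be the patent's method.

import numpy as np

def obs_prune_one(weights, hessian):
    """Remove the weight with the lowest curvature-based saliency and update
    the remaining weights to compensate.

    weights: flat parameter vector; hessian: local curvature estimate
    (Hessian of the loss w.r.t. the weights) at the current solution.
    """
    h_inv = np.linalg.inv(hessian)
    # Saliency: approximate loss increase from removing weight q and
    # re-adjusting the rest.
    saliency = 0.5 * weights ** 2 / np.diag(h_inv)
    q = int(np.argmin(saliency))
    # Update the remaining weights to compensate for the removed one.
    delta = -weights[q] / h_inv[q, q] * h_inv[:, q]
    new_weights = weights + delta
    new_weights[q] = 0.0
    return new_weights, q

rng = np.random.default_rng(0)
w = rng.normal(size=5)
A = rng.normal(size=(5, 5))
H = A @ A.T + 1e-3 * np.eye(5)   # stand-in positive-definite curvature estimate
w_pruned, removed_index = obs_prune_one(w, H)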
-
Publication Number: US20230376272A1
Publication Date: 2023-11-23
Application Number: US18102582
Filing Date: 2023-01-27
Applicant: QUALCOMM Incorporated
Inventor: Marinus Willem VAN BAALEN , Jorn Wilhelmus Timotheus PETERS , Markus NAGEL , Tijmen Pieter Frederik BLANKEVOORT , Andrey KUZMIN
Abstract: A processor-implemented method for fast floating point simulations with learnable parameters includes receiving a single precision input. An integer quantization process is performed on the input. Each element of the input is scaled based on a scaling parameter to generate an m-bit floating point output, where m is an integer.
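One way to read this abstract is as simulated quantization to a low-bit floating-point format: integer-quantize each element on a grid set by its exponent and a scaling parameter, then rescale. The sketch below follows that reading; simulate_fp, the fixed scale value, and the per-element exponent handling are illustrative assumptions, not the patent's algorithm.

import numpy as np

def simulate_fp(x, mantissa_bits, scale=1.0):
    """Round single-precision values onto an m-bit-style floating-point grid.

    scale plays the role of the learnable scaling parameter mentioned in the
    abstract; here it is a fixed float for illustration only.
    """
    x = np.asarray(x, dtype=np.float32) / scale
    # Per-element exponent, so each value keeps `mantissa_bits` of precision.
    exponent = np.floor(np.log2(np.abs(x) + 1e-30))
    step = 2.0 ** (exponent - mantissa_bits)
    # Integer quantization on the mantissa grid, then rescale back.
    quantized = np.round(x / step) * step
    return (quantized * scale).astype(np.float32)

x = np.array([0.1234, -3.7, 250.0, 1e-4], dtype=np.float32)
print(simulate_fp(x, mantissa_bits=3))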
-
Publication Number: US20230139347A1
Publication Date: 2023-05-04
Application Number: US17976683
Filing Date: 2022-10-28
Applicant: QUALCOMM Incorporated
Inventor: Yelysei BONDARENKO , Markus NAGEL , Tijmen Pieter Frederik BLANKEVOORT
Abstract: A processor-implemented method for providing per-embedding-group activation quantization includes receiving sequential data at a first layer of a transformer neural network. The sequential data is processed via the first layer of the transformer neural network to generate an activation tensor. The activation tensor is split into multiple groups of embeddings. Each of the embedding groups has a different set of quantization parameters. Each embedding group is quantized separately based on its corresponding set of quantization parameters. The quantized embedding groups are multiplied with a set of weights to generate an output.
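A small sketch of per-embedding-group activation quantization as described above, assuming a (sequence_length, embedding_dim) activation tensor and symmetric per-group scales; the helper quantize_per_group and the chosen shapes are illustrative assumptions rather than the patent's code.

import numpy as np

def quantize_per_group(activations, num_groups, num_bits=8):
    """activations: (sequence_length, embedding_dim) tensor, split along the
    embedding dimension into num_groups groups, each quantized with its own scale."""
    groups = np.split(activations, num_groups, axis=-1)
    q_max = 2 ** (num_bits - 1) - 1
    quantized = []
    for g in groups:
        # Separate quantization parameters (here a symmetric scale) per group.
        scale = np.abs(g).max() / q_max + 1e-12
        quantized.append(np.clip(np.round(g / scale), -q_max - 1, q_max) * scale)
    return np.concatenate(quantized, axis=-1)

rng = np.random.default_rng(0)
acts = rng.normal(size=(4, 16)) * np.array([1.0] * 12 + [50.0] * 4)  # outlier dims
weights = rng.normal(size=(16, 8))
output = quantize_per_group(acts, num_groups=4) @ weights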
-
Publication Number: US20220245457A1
Publication Date: 2022-08-04
Application Number: US17456318
Filing Date: 2021-11-23
Applicant: QUALCOMM Incorporated
Inventor: Suraj SRINIVAS , Tijmen Pieter Frederik BLANKEVOORT , Andrey KUZMIN , Markus NAGEL , Marinus Willem VAN BAALEN , Andrii SKLIAR
Abstract: Various embodiments include methods and devices for neural network pruning. Embodiments may include receiving as an input a weight tensor for a neural network, increasing a level of sparsity of the weight tensor to generate a sparse weight tensor, updating the neural network using the sparse weight tensor to generate an updated weight tensor, decreasing a level of sparsity of the updated weight tensor to generate a dense weight tensor, increasing the level of sparsity of the dense weight tensor to generate a final sparse weight tensor, and using the neural network with the final sparse weight tensor to generate inferences. Some embodiments may include increasing a level of sparsity of a first sparse weight tensor to generate a second sparse weight tensor, updating the neural network using the second sparse weight tensor to generate a second updated weight tensor, and decreasing the level of sparsity of the second updated weight tensor.
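A toy walk-through of one increase-sparsity / update / decrease-sparsity cycle of the kind this abstract describes, using simple magnitude pruning on a weight tensor; the "update the network" step is only a stand-in, and none of this is claimed to be the patent's implementation.

import numpy as np

def increase_sparsity(weights, sparsity):
    """Zero out the smallest-magnitude fraction `sparsity` of the weights."""
    threshold = np.quantile(np.abs(weights), sparsity)
    return np.where(np.abs(weights) >= threshold, weights, 0.0)

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64))

w_sparse = increase_sparsity(w, sparsity=0.5)
# Update step (training the network with the sparse tensor) is a stand-in here;
# in practice it would be one or more optimization steps.
w_updated = w_sparse - 0.01 * np.sign(w_sparse)
# Decrease sparsity: restore pruned positions (here from the original tensor).
w_dense = np.where(w_updated != 0.0, w_updated, w)
# Increase sparsity again to obtain the final sparse tensor used for inference.
w_final = increase_sparsity(w_dense, sparsity=0.5)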
-
Publication Number: US20200302299A1
Publication Date: 2020-09-24
Application Number: US16826524
Filing Date: 2020-03-23
Applicant: QUALCOMM Incorporated
Abstract: Various embodiments include methods, and neural network computing devices implementing the methods, for performing quantization in neural networks. Various embodiments may include equalizing ranges of weight tensors or output channel weights within a first layer of the neural network by scaling each of the output channel weights of the first layer by a corresponding scaling factor, and scaling each of a second adjacent layer's corresponding input channel weights by applying an inverse of the corresponding scaling factor to the input channel weights. The corresponding scaling factor may be determined based on heuristics, equalization of dynamic ranges, equalization of range extrema (minima or maxima), differential learning using straight-through estimator (STE) methods and a local or global loss, or a black-box optimizer that minimizes a quantization error metric with respect to the scaling.
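A minimal sketch of the cross-layer scaling described above for two adjacent fully connected layers, using the dynamic-range-equalization heuristic the abstract lists; the function equalize_ranges and the choice of per-channel ranges are illustrative assumptions, not the patent's implementation.

import numpy as np

def equalize_ranges(w1, w2):
    """w1: (out_channels, in1); w2: (out2, out_channels), consuming w1's output.

    Scales output channel i of w1 by s_i and input channel i of w2 by 1/s_i,
    which leaves the composed linear map w2 @ w1 unchanged while equalizing
    the per-channel dynamic ranges of the two weight tensors.
    """
    r1 = np.abs(w1).max(axis=1)          # range of each output channel of w1
    r2 = np.abs(w2).max(axis=0)          # range of each input channel of w2
    s = np.sqrt(r2 / (r1 + 1e-12))       # both ranges become sqrt(r1 * r2)
    return w1 * s[:, None], w2 / s[None, :], s

rng = np.random.default_rng(0)
W1 = rng.normal(size=(8, 16)) * rng.uniform(0.1, 10.0, size=(8, 1))  # uneven ranges
W2 = rng.normal(size=(4, 8))
W1_eq, W2_eq, s = equalize_ranges(W1, W2)
x = rng.normal(size=16)
assert np.allclose(W2 @ (W1 @ x), W2_eq @ (W1_eq @ x))  # function is preserved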