-
Publication No.: US20240345883A1
Publication date: 2024-10-17
Application No.: US18681825
Filing date: 2022-09-22
Applicant: QUALCOMM Incorporated
Inventor: Ioannis NOUSIAS, Vishal Ganesh SHITOLE, Deeksha DIXIT, Sami KHAWAM, Ben VANDERGRIEND, Mark Ian Roy MUIR
IPC: G06F9/50
CPC classification number: G06F9/5027, G06F9/5061
Abstract: A method for accelerating machine learning on a computing device is described. The method includes partitioning neural network parameters and input data processed by a plurality of multiply-accumulate (MAC) units of a MAC array of the computing device. The method also includes interleaving MAC operations on the neural network parameters and the input data accessed according to a data sliding window and/or a stride N to compute an output during each cycle, in which N is greater than or equal to one.
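Illustrative note: the sliding-window and stride-N access pattern named in the abstract can be modeled in software roughly as follows. This is a minimal sketch under stated assumptions, not the claimed hardware method: it assumes a 1-D input, one kernel tap per MAC unit, and one output per modeled cycle. The function name `strided_mac_conv1d` and its parameters are hypothetical and do not come from the patent.

```python
def strided_mac_conv1d(inputs, weights, stride=1):
    """Software model (assumption): one output per 'cycle', one MAC unit per kernel tap."""
    if stride < 1:
        raise ValueError("stride N must be greater than or equal to one")
    taps = len(weights)
    outputs = []
    # Each iteration models one cycle; the data window start advances by N.
    for start in range(0, len(inputs) - taps + 1, stride):
        acc = 0
        # Each (weight, sample) pair models the work assigned to one MAC unit.
        for w, x in zip(weights, inputs[start:start + taps]):
            acc += w * x  # multiply-accumulate
        outputs.append(acc)
    return outputs


if __name__ == "__main__":
    print(strided_mac_conv1d([1, 2, 3, 4], [1, 2], stride=1))
    print(strided_mac_conv1d([1, 2, 3, 4], [1, 2], stride=2))
```

With stride 1 the window slides one sample per cycle; with stride 2 every other window position is skipped, so fewer outputs are produced from the same input.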
-
Publication No.: US20190235863A1
Publication date: 2019-08-01
Application No.: US16004335
Filing date: 2018-06-08
Applicant: QUALCOMM Incorporated
Inventor: Ioannis NOUSIAS, Mark IR MUIR, Sami KHAWAM
IPC: G06F9/30
CPC classification number: G06F9/3001, G06F9/30072
Abstract: According to various aspects, a sorting instruction described herein may advantageously be implemented using intrinsic properties of a reconfigurable computing engine. For example, the reconfigurable computing engine may comprise an arithmetic logic unit (ALU) or other suitable operational unit(s) that can perform one or more comparisons among a given plurality of inputs and output a plurality of select signals that at least indicate maximum and minimum values among the given plurality of inputs. In addition, the reconfigurable computing engine may comprise various multiplexers that make up an interconnect fabric coupled to the ALU or other suitable operational units, wherein the multiplexers may be arranged to receive the plurality of inputs and the plurality of select signals such that the plurality of multiplexers can be dynamically configured to perform the permutations to sort the plurality of inputs.
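Illustrative note: the division of labor described in the abstract (a comparison unit that emits select signals, and multiplexers that apply the permutation) can be mimicked in software with a rank-based sort. This is a minimal sketch under stated assumptions, not the patented instruction: it assumes every input is compared against every other input and that ties are broken by input position. The names `compare_stage`, `mux`, and `sort_with_muxes` are hypothetical.

```python
def compare_stage(values):
    """Model of the comparison unit (assumption): one select signal per output slot."""
    selects = [0] * len(values)
    for i, v in enumerate(values):
        # Rank = number of inputs that must precede input i (ties broken by index).
        rank = sum(1 for j, u in enumerate(values) if u < v or (u == v and j < i))
        selects[rank] = i  # output slot `rank` selects input i
    return selects


def mux(inputs, select):
    """Model of one multiplexer: forwards the input chosen by the select signal."""
    return inputs[select]


def sort_with_muxes(values):
    """Apply the select signals through one multiplexer per output slot."""
    selects = compare_stage(values)
    return [mux(values, s) for s in selects]


if __name__ == "__main__":
    data = [3, 1, 2, 1]
    print(compare_stage(data))
    print(sort_with_muxes(data))
```

In this model the first select signal points at the minimum input and the last at the maximum, which loosely mirrors the abstract's statement that the select signals at least indicate the maximum and minimum values.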
-