-
1.
公开(公告)号:US20210303267A1
公开(公告)日:2021-09-30
申请号:US17203591
申请日:2021-03-16
Applicant: STMICROELECTRONICS S.r.l.
Inventor: Xiao Kang JIAO , Fabio Giuseppe DE AMBROGGI , Loris LUISE
Abstract: A method includes retrieving a plurality of datasets from respective memory registers of a memory and storing the retrieved plurality of datasets in respective register portions of a first register. A dataset of data-processing coefficients are stored in a second register. First processing is applied using, as the first operand, a first sub-set of dataset elements stored in the first register, and using, as the second operand, the data-processing coefficients, obtaining a first result. Second processing is applied using, as the first operand, a second sub-set of dataset elements stored in the first register comprised in a second window having a size equal to the dataset size, and using, as the second operand, the replica of the dataset of data-processing coefficients, obtaining a second result. An output is generated based on the first and second results. The first and second processing may perform multiply accumulate (MAC) operations.
-
公开(公告)号:US20220414420A1
公开(公告)日:2022-12-29
申请号:US17360986
申请日:2021-06-28
Inventor: Loris LUISE , Surinder Pal SINGH , Fabio Giuseppe DE AMBROGGI
Abstract: Data structure and microcontroller architecture performing binary multiply-accumulate operations using multiple partial copies of weights. Destination-register location, source-register location, and weight-register location are received. Using the weight-register location, a sub-set of the weight bits is copied a select number of times based on a filter index value that is received. Each copy of the sub-set of weights is executed in parallel. Using the source-register location, a sub-set of the input bits is selected based on the size of the sub-set of weights, wherein the sub-set of input bits is shifted one bit from a previous sub-set of input bits. XOR operation is performed on each corresponding bit in the copy of the sub-set of weights with each corresponding bit in the selected sub-set of input bits. In a corresponding destination sub-location, output of each XOR operation is aggregated with each other and with current value of the corresponding destination sub-location.
-