-
11.
Publication No.: US20210141603A1
Publication date: 2021-05-13
Application No.: US17151115
Application date: 2021-01-15
Applicant: Samsung Electronics Co., Ltd.
Inventor: Ilia OVSIANNIKOV , Ali SHAFIEE ARDESTANI , Joseph HASSOUN , Lei WANG
Abstract: An N×N multiplier may include a N/2×N first multiplier, a N/2×N/2 second multiplier, and a N/2×N/2 third multiplier. The N×N multiplier receives two operands to multiply. The first, second and/or third multipliers are selectively disabled if an operand equals zero or has a small value. If the operands are both less than 2^(N/2), the second or the third multiplier are used to multiply the operands. If one operand is less than 2^(N/2) and the other operand is equal to or greater than 2^(N/2), the first multiplier is used or the second and third multipliers are used to multiply the operands. If both operands are equal to or greater than 2^(N/2), the first, second and third multipliers are used to multiply the operands.
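The operand-dependent gating described in this abstract can be sketched in software. The following Python is only an illustration under stated assumptions, not the patented circuit: operands are taken to be unsigned, the threshold is read as 2^(N/2), and the function name `gated_multiply` and the way work is split across the "first", "second" and "third" sub-multipliers are hypothetical choices that are consistent with the stated multiplier widths.

```python
def gated_multiply(a: int, b: int, n: int = 8):
    """Multiply two unsigned n-bit operands.

    Returns (product, active), where `active` lists which of the three
    sub-multipliers this sketch would enable. Assumed roles:
      first  : N/2 x N    multiplier
      second : N/2 x N/2  multiplier
      third  : N/2 x N/2  multiplier
    """
    half = n // 2
    limit = 1 << half          # 2^(N/2)
    mask = limit - 1
    assert 0 <= a < (1 << n) and 0 <= b < (1 << n)

    # A zero operand needs no multiplier at all.
    if a == 0 or b == 0:
        return 0, []

    # Both operands fit in N/2 bits: one N/2 x N/2 multiplier suffices.
    if a < limit and b < limit:
        return a * b, ["third"]

    # Exactly one operand fits in N/2 bits: feed it to the N/2 x N multiplier.
    if a < limit or b < limit:
        small, large = (a, b) if a < limit else (b, a)
        return small * large, ["first"]

    # Both operands are large: split `a` into halves and `b` where needed.
    a_hi, a_lo = a >> half, a & mask
    b_hi, b_lo = b >> half, b & mask
    first = a_hi * b           # N/2 x N
    second = a_lo * b_hi       # N/2 x N/2
    third = a_lo * b_lo        # N/2 x N/2
    product = (first << half) + (second << half) + third
    return product, ["first", "second", "third"]
```

In the mixed case the abstract also allows the second and third multipliers to be used instead of the first; the sketch picks the single N/2×N multiplier for brevity.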
-
Publication No.: US20230153586A1
Publication date: 2023-05-18
Application No.: US17578428
Application date: 2022-01-18
Applicant: Samsung Electronics Co., Ltd.
Inventor: Ling LI , Ali SHAFIEE ARDESTANI
CPC classification number: G06N3/063 , G06F5/01 , G06F7/5443
Abstract: A neural network accelerator includes 2^n multiplier circuits, 2^n shifter circuits and an adder tree circuit. Each respective multiplier circuit multiplies a first value by a second value to output a first product value. Each respective first value is represented by a first predetermined number of bits beginning at a most significant bit of the first value having a value equal to 1. Each respective second value is represented by a second predetermined number of bits, and each respective first product value is represented by a third predetermined number of bits. Each respective shifter circuit receives the first product value of a corresponding multiplier circuit and left shifts the corresponding product value by the first predetermined number of bits to form a respective second product value. The adder circuit adds each respective second product value to form a partial-sum value represented by a fourth predetermined number of bits.
-
Publication No.: US20230047025A1
Publication date: 2023-02-16
Application No.: US17969671
Application date: 2022-10-19
Applicant: Samsung Electronics Co., Ltd.
Inventor: Ilia OVSIANNIKOV , Ali SHAFIEE ARDESTANI , Lei WANG , Joseph H. HASSOUN
Abstract: A multichannel data packer includes a plurality of two-input multiplexers and a controller. The plurality of two-input multiplexers is arranged in 2^N rows and N columns in which N is an integer greater than 1. Each input of a multiplexer in a first column receives a respective bit stream of 2^N channels of bit streams. Each respective bit stream includes a bit-stream length based on data in the bit stream. The multiplexers in a last column output 2^N channels of packed bit streams each having a same bit-stream length. The controller controls the plurality of multiplexers so that the multiplexers in the last column output the 2^N channels of bit streams that each has the same bit-stream length.
-
Publication No.: US20220413805A1
Publication date: 2022-12-29
Application No.: US17407150
Application date: 2021-08-19
Applicant: Samsung Electronics Co., Ltd.
Inventor: Ling LI , Ali SHAFIEE ARDESTANI
Abstract: A method for performing a neural network operation. In some embodiments, the method includes: calculating a first plurality of products, each of the first plurality of products being the product of a weight and an activation; calculating a first partial sum, the first partial sum being the sum of the products; and compressing the first partial sum to form a first compressed partial sum.
-
Publication No.: US20220405559A1
Publication date: 2022-12-22
Application No.: US17463544
Application date: 2021-08-31
Applicant: Samsung Electronics Co., Ltd.
Inventor: Hamzah Ahmed Ali ABDELAZIZ , Ali SHAFIEE ARDESTANI , Joseph H. HASSOUN
Abstract: A neural network accelerator is disclosed that includes a multiplication unit, an adder-tree unit and an accumulator unit. The multiplication unit and the adder tree unit are configured to perform lattice-multiplication operations. The accumulator unit is coupled to an output of the adder tree to form dot-product values from the lattice-multiplication operations performed by the multiplication unit and the adder tree unit. The multiplication unit includes n multiplier units that perform lattice-multiplication-based operations and output product values. Each multiplier unit includes a plurality of multipliers. Each multiplier unit receives first and second multiplicands that each include a most significant nibble (MSN) and a least significant nibble (LSN). The multipliers in each multiplier unit receive different combinations of the MSNs and the LSNs of the multiplicands. The multiplication unit and the adder can provide mixed-precision dot-product computations.
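The nibble-wise (lattice) decomposition named in this abstract can be illustrated with a short Python sketch. It assumes unsigned 8-bit multiplicands, each split into a 4-bit MSN and a 4-bit LSN; the function names `split_nibbles` and `lattice_dot` are hypothetical, and the grouping of partial products before shifting is one plausible arrangement rather than the disclosed hardware.

```python
def split_nibbles(x: int):
    """Split an unsigned 8-bit value into (MSN, LSN)."""
    assert 0 <= x < 256
    return x >> 4, x & 0xF


def lattice_dot(a_vec, b_vec):
    """Dot product of two vectors of unsigned 8-bit values using only
    4-bit x 4-bit multiplications, in the style of lattice multiplication."""
    hh = hl = lh = ll = 0  # running sums for each nibble pairing
    for a, b in zip(a_vec, b_vec):
        a_msn, a_lsn = split_nibbles(a)
        b_msn, b_lsn = split_nibbles(b)
        # Each multiplier sees a different MSN/LSN combination.
        hh += a_msn * b_msn
        hl += a_msn * b_lsn
        lh += a_lsn * b_msn
        ll += a_lsn * b_lsn
    # Apply the positional weights once, after summing each group:
    # MSN*MSN carries 2^8, the cross terms carry 2^4, LSN*LSN carries 2^0.
    return (hh << 8) + ((hl + lh) << 4) + ll
```

Summing each nibble pairing before shifting means the shifts are applied once per group rather than once per element. If both inputs are known to be 4-bit, only the `ll` path is needed, which is one way to read the mixed-precision claim.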
-
Publication No.: US20220156568A1
Publication date: 2022-05-19
Application No.: US17521840
Application date: 2021-11-08
Applicant: Samsung Electronics Co., Ltd.
Inventor: Jong Hoon SHIN , Ali SHAFIEE ARDESTANI , Joseph H. HASSOUN
Abstract: A general matrix-matrix (GEMM) accelerator core includes first and second buffers, a control logic circuit, and a first processing element (PE). The first buffer receives a elements of a first matrix A of activation values. The second buffer receives b elements of a second matrix B of weight values. The control logic circuit replaces a zero-valued a element in a first column of the first buffer with a nonzero-valued a element that is within a maximum borrowing distance of a location of the zero-valued a element in the first column of the first buffer. The PE receives a elements from the first column of the first buffer including the nonzero-valued element a selected to replace the zero-valued a element and receives b elements from locations in the second buffer that correspond to locations in the first buffer from where the a elements have been received by the PE.
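The zero-replacement step in this abstract can be sketched as follows. This is a simplified model under several assumptions that the abstract does not confirm: the buffers are indexed as `buf[lane][column]`, borrowing looks only at later columns in the same lane, and a borrowed element contributes to the same accumulated output as the zero it replaces. The function name `feed_first_column` is hypothetical.

```python
def feed_first_column(a_buf, b_buf, max_dist):
    """Return the (a, b) pairs a PE would receive for column 0.

    For each lane, if the column-0 `a` element is zero, the nearest nonzero
    `a` element within `max_dist` columns of it (same lane) is used instead.
    The matching `b` element is read from the same location in `b_buf`.
    """
    pairs = []
    for lane, row in enumerate(a_buf):
        src = 0  # default: use the element already in column 0
        if row[0] == 0:
            last = min(max_dist, len(row) - 1)
            for col in range(1, last + 1):
                if row[col] != 0:
                    src = col  # borrow from this column
                    break
        pairs.append((row[src], b_buf[lane][src]))
    return pairs
```

A full model would also mark each borrowed element as consumed so that it is not processed again in a later cycle; that bookkeeping is omitted here.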
-
Publication No.: US20210319079A1
Publication date: 2021-10-14
Application No.: US17153871
Application date: 2021-01-20
Applicant: Samsung Electronics Co., Ltd.
Inventor: Hamzah Ahmed Ali ABDELAZIZ , Ali SHAFIEE ARDESTANI , Joseph H. HASSOUN
Abstract: A dot-product architecture and method are disclosed for calculating floating-point dot-products of two vectors. The architecture includes an array of multiplier units that each include an integer logic that multiplies integer values of corresponding elements of the two vectors; an exponent logic that adds exponent values of the corresponding elements of the two vectors to form unbiased exponent values; and a local shifter that forms a first shifted value by shifting a product-integer value by a number of bits in a predetermined direction based on a difference value between an unbiased exponent value corresponding to the product-integer value and a maximum unbiased exponent value for the array of multiplier units. An adder tree adds shifted values output from local shifters of the array of multiplier units to form an output, and an accumulator accumulates the output of the addition unit.
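The exponent-alignment scheme in this abstract can be sketched in a few lines of Python. The sketch assumes each vector element is given as a pair `(mantissa, exponent)` meaning `mantissa * 2**exponent`, with integer mantissas and already unbiased exponents; the function name `aligned_dot` is hypothetical, and right-shifting toward the maximum exponent is one reading of the "predetermined direction".

```python
def aligned_dot(a, b):
    """Dot product of two vectors of (mantissa, exponent) pairs.

    Returns (total, max_exp); the represented value is total * 2**max_exp.
    """
    # Integer logic multiplies mantissas; exponent logic adds exponents.
    products = [(ma * mb, ea + eb) for (ma, ea), (mb, eb) in zip(a, b)]

    # Maximum unbiased exponent across the whole array of multiplier units.
    max_exp = max(e for _, e in products)

    # Local shifters align every product to max_exp. Bits shifted out on the
    # right are discarded, so the result can be slightly less than exact.
    shifted = [p >> (max_exp - e) for p, e in products]

    # Adder tree: a plain integer sum of the aligned products.
    return sum(shifted), max_exp
```

For example, with `a = [(3, 2), (5, 0)]` and `b = [(2, 1), (4, 0)]` the products are 6·2^3 and 20·2^0; aligning the second to exponent 3 truncates 20 to 2, so the sketch returns `(8, 3)`, i.e. 64 instead of the exact 68. That gap shows the precision traded for a purely integer adder tree.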
-
Publication No.: US20210312325A1
Publication date: 2021-10-07
Application No.: US16898433
Application date: 2020-06-10
Applicant: Samsung Electronics Co., Ltd.
Inventor: Hamzah ABDELAZIZ , Joseph HASSOUN , Ali SHAFIEE ARDESTANI
Abstract: According to one general aspect, an apparatus may include a machine learning system. The machine learning system may include a precision determination circuit configured to: determine a precision level of data, and divide the data into a data subdivision. The machine learning system may exploit sparsity during the computation of each subdivision. The machine learning system may include a load balancing circuit configured to select a load balancing technique, wherein the load balancing technique includes alternately loading the computation circuit with at least a first data/weight subdivision combination and a second data/weight subdivision combination. The load balancing circuit may be configured to load a computation circuit with a selected data subdivision and a selected weight subdivision based, at least in part, upon the load balancing technique. The machine learning system may include a computation circuit configured to compute a partial computation result based, at least in part, upon the selected data subdivision and the weight subdivision.
-
19.
Publication No.: US20200150924A1
Publication date: 2020-05-14
Application No.: US16276582
Application date: 2019-02-14
Applicant: Samsung Electronics Co., Ltd.
Inventor: Ilia OVSIANNIKOV , Ali SHAFIEE ARDESTANI , Joseph HASSOUN , Lei WANG
Abstract: An N×N multiplier may include a N/2×N first multiplier, a N/2×N/2 second multiplier, and a N/2×N/2 third multiplier. The N×N multiplier receives two operands to multiply. The first, second and/or third multipliers are selectively disabled if an operand equals zero or has a small value. If the operands are both less than 2^(N/2), the second or the third multiplier are used to multiply the operands. If one operand is less than 2^(N/2) and the other operand is equal to or greater than 2^(N/2), the first multiplier is used or the second and third multipliers are used to multiply the operands. If both operands are equal to or greater than 2^(N/2), the first, second and third multipliers are used to multiply the operands.