Patent search ap:("Samsung Electronics Co. Ltd.") AND inv:"Ali SHAFIEE ARDESTANI"

11.

Invention application
SIGNED MULTIPLICATION USING UNSIGNED MULTIPLIER WITH DYNAMIC FINE-GRAINED OPERAND ISOLATION (Granted)

Publication (announcement) number: US20210141603A1

Publication (announcement) date: 2021-05-13

Application number: US17151115

Application date: 2021-01-15

Applicant: Samsung Electronics Co., Ltd.

Inventor: Ilia OVSIANNIKOV, Ali SHAFIEE ARDESTANI, Joseph HASSOUN, Lei WANG

IPC: G06F7/487, G06F9/30, G06F7/523

Abstract: An N×N multiplier may include an N/2×N first multiplier, an N/2×N/2 second multiplier, and an N/2×N/2 third multiplier. The N×N multiplier receives two operands to multiply. The first, second and/or third multipliers are selectively disabled if an operand equals zero or has a small value. If the operands are both less than 2^(N/2), the second or the third multiplier is used to multiply the operands. If one operand is less than 2^(N/2) and the other operand is equal to or greater than 2^(N/2), the first multiplier is used or the second and third multipliers are used to multiply the operands. If both operands are equal to or greater than 2^(N/2), the first, second and third multipliers are used to multiply the operands.
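
The operand-dependent selection described in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions: the abstract does not say how the three multipliers are wired, so the decomposition below (high half of one operand times the full other operand, plus two half-by-half products) is one plausible arrangement, not the patented circuit. The names `split_multiply`, `mul_half_by_full`, and `mul_half_by_half` are hypothetical, and only unsigned operands are modeled.

```python
def split_multiply(a, b, n=8):
    """Multiply two unsigned n-bit operands with three smaller multipliers.

    Returns (product, used), where `used` names the multipliers that were
    enabled. The wiring is an assumption made for illustration.
    """
    assert n % 2 == 0 and 0 <= a < (1 << n) and 0 <= b < (1 << n)
    half = n // 2
    limit = 1 << half          # 2^(N/2): operands below this fit in N/2 bits
    mask = limit - 1

    def mul_half_by_full(x, y):
        # Models the N/2 x N "first" multiplier.
        assert x < limit and y < (1 << n)
        return x * y

    def mul_half_by_half(x, y):
        # Models the N/2 x N/2 "second" and "third" multipliers.
        assert x < limit and y < limit
        return x * y

    if a == 0 or b == 0:
        return 0, ()                                  # all multipliers disabled
    if a < limit and b < limit:
        return mul_half_by_half(a, b), ("second",)    # both operands small
    if a < limit or b < limit:
        small, large = (a, b) if a < limit else (b, a)
        return mul_half_by_full(small, large), ("first",)  # one operand small
    # Both operands large: a*b = a_hi*b*2^(N/2) + a_lo*b_hi*2^(N/2) + a_lo*b_lo
    a_hi, a_lo = a >> half, a & mask
    b_hi, b_lo = b >> half, b & mask
    product = ((mul_half_by_full(a_hi, b) << half)
               + (mul_half_by_half(a_lo, b_hi) << half)
               + mul_half_by_half(a_lo, b_lo))
    return product, ("first", "second", "third")
```

In this model, a small operand pair such as `split_multiply(3, 5)` exercises a single half-width multiplier, which is the kind of case where the abstract says the other multipliers can be disabled.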

12.

Invention publication
ACCELERATE NEURAL NETWORKS WITH COMPRESSION AT DIFFERENT LEVELS (Under examination, published)

Publication (announcement) number: US20230153586A1

Publication (announcement) date: 2023-05-18

Application number: US17578428

Application date: 2022-01-18

Applicant: Samsung Electronics Co., Ltd.

Inventor: Ling LI, Ali SHAFIEE ARDESTANI

IPC: G06N3/063, G06F7/544, G06F5/01

CPC classification number: G06N3/063, G06F5/01, G06F7/5443

Abstract: A neural network accelerator includes 2^n multiplier circuits, 2^n shifter circuits and an adder tree circuit. Each respective multiplier circuit multiplies a first value by a second value to output a first product value. Each respective first value is represented by a first predetermined number of bits beginning at a most significant bit of the first value having a value equal to 1. Each respective second value is represented by a second predetermined number of bits, and each respective first product value is represented by a third predetermined number of bits. Each respective shifter circuit receives the first product value of a corresponding multiplier circuit and left shifts the corresponding product value by the first predetermined number of bits to form a respective second product value. The adder circuit adds each respective second product value to form a partial-sum value represented by a fourth predetermined number of bits.

13.

Invention application
HARDWARE CHANNEL-PARALLEL DATA COMPRESSION/DECOMPRESSION (Granted)

Publication (announcement) number: US20230047025A1

Publication (announcement) date: 2023-02-16

Application number: US17969671

Application date: 2022-10-19

Applicant: Samsung Electronics Co., Ltd.

Inventor: Ilia OVSIANNIKOV, Ali SHAFIEE ARDESTANI, Lei WANG, Joseph H. HASSOUN

IPC: H03M7/30, G06F9/30, G06F9/38, H04L5/02, H03M7/40

Abstract: A multichannel data packer includes a plurality of two-input multiplexers and a controller. The plurality of two-input multiplexers is arranged in 2^N rows and N columns in which N is an integer greater than 1. Each input of a multiplexer in a first column receives a respective bit stream of 2^N channels of bit streams. Each respective bit stream includes a bit-stream length based on data in the bit stream. The multiplexers in a last column output 2^N channels of packed bit streams each having a same bit-stream length. The controller controls the plurality of multiplexers so that the multiplexers in the last column output the 2^N channels of bit streams that each has the same bit-stream length.

14.

Invention application
PARTIAL SUM COMPRESSION (Granted)

Publication (announcement) number: US20220413805A1

Publication (announcement) date: 2022-12-29

Application number: US17407150

Application date: 2021-08-19

Applicant: Samsung Electronics Co., Ltd.

Inventor: Ling LI, Ali SHAFIEE ARDESTANI

IPC: G06F7/544, G06F7/523, G06F7/50, G06F7/556, G06N3/063

Abstract: A method for performing a neural network operation. In some embodiments, the method includes: calculating a first plurality of products, each of the first plurality of products being the product of a weight and an activation; calculating a first partial sum, the first partial sum being the sum of the products; and compressing the first partial sum to form a first compressed partial sum.

15.

Invention application
MIXED-PRECISION NEURAL NETWORK ACCELERATOR TILE WITH LATTICE FUSION (Granted)

Publication (announcement) number: US20220405559A1

Publication (announcement) date: 2022-12-22

Application number: US17463544

Application date: 2021-08-31

Applicant: Samsung Electronics Co., Ltd.

Inventor: Hamzah Ahmed Ali ABDELAZIZ, Ali SHAFIEE ARDESTANI, Joseph H. HASSOUN

IPC: G06N3/063, G06F9/50, G06F7/544, G06F7/523, G06F7/50

Abstract: A neural network accelerator is disclosed that includes a multiplication unit, an adder-tree unit and an accumulator unit. The multiplication unit and the adder tree unit are configured to perform lattice-multiplication operations. The accumulator unit is coupled to an output of the adder tree to form dot-product values from the lattice-multiplication operations performed by the multiplication unit and the adder tree unit. The multiplication unit includes n multiplier units that perform lattice-multiplication-based operations and output product values. Each multiplier unit includes a plurality of multipliers. Each multiplier unit receives first and second multiplicands that each include a most significant nibble (MSN) and a least significant nibble (LSN). The multipliers in each multiplier unit receive different combinations of the MSNs and the LSNs of the multiplicands. The multiplication unit and the adder can provide mixed-precision dot-product computations.
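
The nibble-based multiplication in this abstract can be sketched in software as follows. This is an illustrative model, not the disclosed hardware: it assumes unsigned 8-bit multiplicands, four nibble-by-nibble products per element pair, and a final shift-and-add over the accumulated products. The function name `lattice_dot` is hypothetical.

```python
def lattice_dot(xs, ws):
    """Dot product of unsigned 8-bit vectors built from 4-bit x 4-bit products."""
    ll = lm = ml = mm = 0  # running sums for each MSN/LSN combination
    for x, w in zip(xs, ws):
        assert 0 <= x < 256 and 0 <= w < 256
        x_msn, x_lsn = x >> 4, x & 0xF   # most / least significant nibble
        w_msn, w_lsn = w >> 4, w & 0xF
        ll += x_lsn * w_lsn
        lm += x_lsn * w_msn
        ml += x_msn * w_lsn
        mm += x_msn * w_msn
    # Each partial sum is weighted by the position of its nibbles:
    # MSN*MSN carries 2^8, the cross terms carry 2^4, LSN*LSN carries 2^0.
    return (mm << 8) + ((lm + ml) << 4) + ll
```

When every operand fits in 4 bits, only the `ll` term is nonzero, which gives one intuition for how the same nibble multipliers could serve more than one precision.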

16.

Invention application
DUAL-SPARSE NEURAL PROCESSING UNIT WITH MULTI-DIMENSIONAL ROUTING OF NON-ZERO VALUES (Granted)

Publication (announcement) number: US20220156568A1

Publication (announcement) date: 2022-05-19

Application number: US17521840

Application date: 2021-11-08

Applicant: Samsung Electronics Co., Ltd.

Inventor: Jong Hoon SHIN, Ali SHAFIEE ARDESTANI, Joseph H. HASSOUN

IPC: G06N3/063, G06F7/544

Abstract: A general matrix-matrix (GEMM) accelerator core includes first and second buffers, a control logic circuit, and a first processing element (PE). The first buffer receives "a" elements of a first matrix A of activation values. The second buffer receives "b" elements of a second matrix B of weight values. The control logic circuit replaces a zero-valued "a" element in a first column of the first buffer with a nonzero-valued "a" element that is within a maximum borrowing distance of a location of the zero-valued "a" element in the first column of the first buffer. The PE receives "a" elements from the first column of the first buffer, including the nonzero-valued "a" element selected to replace the zero-valued "a" element, and receives "b" elements from locations in the second buffer that correspond to locations in the first buffer from where the "a" elements have been received by the PE.

17.

Invention application
SUPPORTING FLOATING POINT 16 (FP16) IN DOT PRODUCT ARCHITECTURE (Granted)

Publication (announcement) number: US20210319079A1

Publication (announcement) date: 2021-10-14

Application number: US17153871

Application date: 2021-01-20

Applicant: Samsung Electronics Co., Ltd.

Inventor: Hamzah Ahmed Ali ABDELAZIZ, Ali SHAFIEE ARDESTANI, Joseph H. HASSOUN

IPC: G06F17/16, G06F7/544

Abstract: A dot-product architecture and method are disclosed for calculating floating-point dot-products of two vectors. The architecture includes an array of multiplier units that each include an integer logic that multiplies integer values of corresponding elements of the two vectors; an exponent logic that adds exponent values of the corresponding elements of the two vectors to form unbiased exponent values; and a local shifter that forms a first shifted value by shifting a product-integer value by a number of bits in a predetermined direction based on a difference value between an unbiased exponent value corresponding to the product-integer value and a maximum unbiased exponent value for the array of multiplier units. An adder tree adds shifted values output from local shifters of the array of multiplier units to form an output, and an accumulator accumulates the output of the addition unit.
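
The exponent-alignment step in this abstract can be modeled roughly as below. This is a simplified sketch, not the disclosed architecture: each element is assumed to be a hypothetical `(mantissa, exponent)` pair standing for `mantissa * 2**exponent` with a non-negative integer mantissa, the "predetermined direction" is assumed to be a right shift, and sign handling, FP16 bit fields, rounding, and the accumulator are left out. The name `aligned_dot` is hypothetical.

```python
def aligned_dot(xs, ws):
    """Dot product of (mantissa, exponent) pairs with alignment to the max exponent.

    Returns (total, max_exp), standing for total * 2**max_exp.
    """
    # Integer logic multiplies mantissas; exponent logic adds exponents.
    products = [(mx * mw, ex + ew) for (mx, ex), (mw, ew) in zip(xs, ws)]
    if not products:
        return 0, 0
    max_exp = max(e for _, e in products)
    # Local shifter: shift each product right by its distance from the maximum
    # exponent, so all products share one scale (low bits may be truncated).
    shifted = [p >> (max_exp - e) for p, e in products]
    # Stand-in for the adder tree: sum the aligned products.
    return sum(shifted), max_exp
```

Because the right shift truncates, the result is exact only when no set bits are shifted out; products far below the maximum exponent contribute little or nothing, which is the usual trade-off of aligning to the largest exponent.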

18.

Invention application
MIXED-PRECISION NEURAL PROCESSING UNIT (NPU) USING SPATIAL FUSION WITH LOAD BALANCING (Granted)

Publication (announcement) number: US20210312325A1

Publication (announcement) date: 2021-10-07

Application number: US16898433

Application date: 2020-06-10

Applicant: Samsung Electronics Co., Ltd.

Inventor: Hamzah ABDELAZIZ, Joseph HASSOUN, Ali SHAFIEE ARDESTANI

IPC: G06N20/00, H04L29/08

Abstract: According to one general aspect, an apparatus may include a machine learning system. The machine learning system may include a precision determination circuit configured to: determine a precision level of data, and divide the data into a data subdivision. The machine learning system may exploit sparsity during the computation of each subdivision. The machine learning system may include a load balancing circuit configured to select a load balancing technique, wherein the load balancing technique includes alternately loading the computation circuit with at least a first data/weight subdivision combination and a second data/weight subdivision combination. The load balancing circuit may be configured to load a computation circuit with a selected data subdivision and a selected weight subdivision based, at least in part, upon the load balancing technique. The machine learning system may include a computation circuit configured to compute a partial computation result based, at least in part, upon the selected data subdivision and the weight subdivision.

19.

Invention application
SIGNED MULTIPLICATION USING UNSIGNED MULTIPLIER WITH DYNAMIC FINE-GRAINED OPERAND ISOLATION (Under examination, published)

Publication (announcement) number: US20200150924A1

Publication (announcement) date: 2020-05-14

Application number: US16276582

Application date: 2019-02-14

Applicant: Samsung Electronics Co., Ltd.

Inventor: Ilia OVSIANNIKOV, Ali SHAFIEE ARDESTANI, Joseph HASSOUN, Lei WANG

IPC: G06F7/487, G06F9/30

Abstract: An N×N multiplier may include an N/2×N first multiplier, an N/2×N/2 second multiplier, and an N/2×N/2 third multiplier. The N×N multiplier receives two operands to multiply. The first, second and/or third multipliers are selectively disabled if an operand equals zero or has a small value. If the operands are both less than 2^(N/2), the second or the third multiplier is used to multiply the operands. If one operand is less than 2^(N/2) and the other operand is equal to or greater than 2^(N/2), the first multiplier is used or the second and third multipliers are used to multiply the operands. If both operands are equal to or greater than 2^(N/2), the first, second and third multipliers are used to multiply the operands.
