Patent search: ap:("Samsung Electronics Co., Ltd.") AND inv:"Jong Hoon SHIN"

1.

Invention publication
RUNTIME RECONFIGURABLE COMPRESSION FORMAT CONVERSION WITH BIT-PLANE GRANULARITY [Under examination, published]

Publication (announcement) number: US20240162917A1

Publication (announcement) date: 2024-05-16

Application number: US18096557

Application date: 2023-01-12

Applicant: Samsung Electronics Co., Ltd.

Inventor: Jong Hoon SHIN, Ardavan PEDRAM, Joseph HASSOUN

IPC: H03M7/30, H03M7/42

CPC classification number: H03M7/6088, H03M7/42

Abstract: A runtime bit-plane data-format optimizer for a processing element includes a sparsity-detector and a compression-converter. The sparsity-detector selects a bit-plane compression-conversion format during a runtime of the processing element using a performance model that is based on a first sparsity pattern of first bit-plane data stored in a memory exterior to the processing element and a second sparsity pattern of second bit-plane data that is to be stored in a memory within the processing element. The second sparsity pattern is based on a runtime configuration of the processing element. The first bit-plane data is stored using a first bit-plane compression format and the second bit-plane data is to be stored using a second bit-plane compression format. The compression-conversion circuit converts the first bit-plane compression format of the first data to be the second bit-plane compression format of the second data.

2.

Invention application
DEPTHWISE-CONVOLUTION IMPLEMENTATION ON A NEURAL PROCESSING CORE [Granted]

Publication (announcement) number: US20220405558A1

Publication (announcement) date: 2022-12-22

Application number: US17401298

Application date: 2021-08-12

Applicant: Samsung Electronics Co., Ltd.

Inventor: Jong Hoon SHIN, Ali SHAFIEE ARDESTANI, Joseph H. HASSOUN

IPC: G06N3/063, G06F15/80

Abstract: A core of neural processing units is configured to efficiently process a depthwise convolution by maximizing spatial feature-map locality using adder trees. Data paths of activations and weights are inverted, and 2-to-1 multiplexers are every 2/9 multipliers along a row of multipliers. During a depthwise convolution operation, the core is operated using a RS×HW dataflow to maximize the locality of feature maps. For a normal convolution operation, the data paths of activations and weights may be configured for a normal convolution configuration and in which multiplexers are idle.

3.

Invention publication
HYBRID-SPARSE NPU WITH FINE-GRAINED STRUCTURED SPARSITY [Under examination, published]

Publication (announcement) number: US20240095505A1

Publication (announcement) date: 2024-03-21

Application number: US17980541

Application date: 2022-11-03

Applicant: Samsung Electronics Co., Ltd.

Inventor: Jong Hoon SHIN, Ardavan PEDRAM, Joseph HASSOUN

IPC: G06N3/063, G06N3/08

CPC classification number: G06N3/063, G06N3/08

Abstract: A neural processing unit is disclosed that supports dual-sparsity modes. A weight buffer is configured to store weight values in an arrangement selected from a structured weight sparsity arrangement or a random weight sparsity arrangement. A weight multiplexer array is configured to output one or more weight values stored in the weight buffer as first operand values based on the selected weight sparsity arrangement. An activation buffer is configured to store activation values. An activation multiplexer array includes inputs to the activation multiplexer array that are coupled to the activation buffer, and is configured to output one or more activation values stored in the activation buffer as second operand values in which each respective second operand value and a corresponding first operand value form an operand value pair. A multiplier array is configured to output a product value for each operand value pair.

4.

Invention application
DUAL-SPARSE NEURAL PROCESSING UNIT WITH MULTI-DIMENSIONAL ROUTING OF NON-ZERO VALUES [Granted]

Publication (announcement) number: US20220156568A1

Publication (announcement) date: 2022-05-19

Application number: US17521840

Application date: 2021-11-08

Applicant: Samsung Electronics Co., Ltd.

Inventor: Jong Hoon SHIN, Ali SHAFIEE ARDESTANI, Joseph H. HASSOUN

IPC: G06N3/063, G06F7/544

Abstract: A general matrix-matrix (GEMM) accelerator core includes first and second buffers, a control logic circuit, and a first processing element (PE). The first buffer receives a elements of a first matrix A of activation values. The second buffer receives b elements of a second matrix B of weight values. The control logic circuit replaces a zero-valued a element in a first column of the first buffer with a nonzero-valued a element that is within a maximum borrowing distance of a location of the zero-valued a element in the first column of the first buffer. The PE receives a elements from the first column of the first buffer including the nonzero-valued element a selected to replace the zero-valued a element and receives b elements from locations in the second buffer that correspond to locations in the first buffer from where the a elements have been received by the PE.

5.

Invention publication
RUNTIME RECONFIGURABLE COMPRESSION FORMAT CONVERSION [Under examination, published]

Publication (announcement) number: US20240162916A1

Publication (announcement) date: 2024-05-16

Application number: US18096551

Application date: 2023-01-12

Applicant: Samsung Electronics Co., Ltd.

Inventor: Jong Hoon SHIN, Ardavan PEDRAM, Joseph HASSOUN

IPC: H03M7/30

CPC classification number: H03M7/3059, H03M7/6011, H03M7/6094

Abstract: A runtime data-format optimizer for a processing element includes a sparsity-detector and a compression-converter. The sparsity-detector selects a first compression-conversion format during a runtime of the processing element based on a performance model that is based on a first sparsity pattern of first data stored in a first memory that is exterior to the processing element and a second sparsity pattern of second data that is to be stored in a second memory within the processing element. The second sparsity pattern is based on a runtime configuration of the processing element. The first data is stored in the first memory using a first compression format and the second data is to be stored in the second memory using a second compression format. The compression-conversion circuit converts the first compression format of the first data to be the second compression format of the second data based on the first compression-conversion format.

6.

Invention application
WEIGHT-SPARSE NEURAL PROCESSING UNIT WITH MULTI-DIMENSIONAL ROUTING OF NON-ZERO VALUES [Granted]

Publication (announcement) number: US20220156569A1

Publication (announcement) date: 2022-05-19

Application number: US17521846

Application date: 2021-11-08

Applicant: Samsung Electronics Co., Ltd.

Inventor: Jong Hoon SHIN, Ali SHAFIEE ARDESTANI, Joseph H. HASSOUN

IPC: G06N3/063, G06F7/544

Abstract: A general matrix-matrix (GEMM) accelerator core includes first and second buffers, and a processing element (PE). The first buffer receives a elements of a matrix A of activation values. The second buffer receives b elements of a matrix B of weight values. The matrix B is preprocessed with a nonzero-valued b element replacing a zero-valued b element in a first row of the second buffer based on the zero-valued b element being in the first row of the second buffer. Metadata is generated that includes movement information of the nonzero-valued b element to replace the zero-valued b element. The PE receives b elements from a first row of the second buffer and a elements from the first buffer from locations in the first buffer that correspond to locations in the second buffer from where the b elements have been received by the PE as indicated by the metadata.

7.

Invention publication
WEIGHT-SPARSE NPU WITH FINE-GRAINED STRUCTURED SPARSITY [Under examination, published]

Publication (announcement) number: US20240119270A1

Publication (announcement) date: 2024-04-11

Application number: US17980544

Application date: 2022-11-03

Applicant: Samsung Electronics Co., Ltd.

Inventor: Jong Hoon SHIN, Ardavan PEDRAM, Joseph HASSOUN

IPC: G06N3/063, G06N3/08

CPC classification number: G06N3/063, G06N3/08

Abstract: A neural processing unit is reconfigurable to process a fine-grain structured sparsity weight arrangement selected from N:M=1:4, 2:4, 2:8 and 4:8 fine-grain structured weight sparsity arrangements. A weight buffer stores weight values and a weight multiplexer array outputs one or more weight values stored in the weight buffer as first operand values based on a selected fine-grain structured sparsity weight arrangement. An activation buffer stores activation values and an activation multiplexer array outputs one or more activation values stored in the activation buffer as second operand values based on the selected fine-grain structured weight sparsity in which each respective second operand value and a corresponding first operand value forms an operand value pair. A multiplier array outputs a product value for each operand value pair.
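The N:M arrangements named in this abstract can be illustrated with a short sketch. It assumes that N:M means at most N nonzero weights in every consecutive group of M weights, and that a compressed group keeps each retained value together with its position in the group (the kind of index a multiplexer could use to pick the matching activation). The function names `satisfies_n_m` and `compress_n_m` and the zero-padding scheme are hypothetical and are not taken from the patent.

```python
from typing import List, Sequence, Tuple


def satisfies_n_m(weights: Sequence[float], n: int, m: int) -> bool:
    """Return True if every consecutive group of m weights has at most n nonzeros."""
    if len(weights) % m != 0:
        raise ValueError("weight count must be a multiple of m")
    for start in range(0, len(weights), m):
        group = weights[start:start + m]
        if sum(1 for w in group if w != 0) > n:
            return False
    return True


def compress_n_m(weights: Sequence[float], n: int, m: int) -> List[List[Tuple[int, float]]]:
    """Pack each group of m weights into exactly n (index, value) pairs.

    Groups with fewer than n nonzeros are padded with (0, 0.0); this padding
    convention is an assumption made for the sketch.
    """
    if not satisfies_n_m(weights, n, m):
        raise ValueError("weights do not follow the requested N:M arrangement")
    packed = []
    for start in range(0, len(weights), m):
        group = weights[start:start + m]
        kept = [(i, w) for i, w in enumerate(group) if w != 0]
        kept += [(0, 0.0)] * (n - len(kept))
        packed.append(kept)
    return packed
```

For example, `compress_n_m([0, 3, 0, 5, 1, 0, 0, 0], 2, 4)` keeps two (index, value) pairs for each group of four weights, so the stored weight count is halved.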

8.

Invention publication
EXTREME SPARSE DEEP LEARNING EDGE INFERENCE ACCELERATOR [Under examination, published]

Publication (announcement) number: US20240095519A1

Publication (announcement) date: 2024-03-21

Application number: US17989675

Application date: 2022-11-17

Applicant: Samsung Electronics Co., Ltd.

Inventor: Ardavan PEDRAM, Ali SHAFIEE ARDESTANI, Jong Hoon SHIN, Joseph H. HASSOUN

IPC: G06N3/08, H03M7/30

CPC classification number: G06N3/08, H03M7/3066

Abstract: A neural network inference accelerator includes first and second neural processing units (NPUs) and a sparsity management unit. The first NPU receives activation and weight tensors based on an activation sparsity density and a weight sparsity density both being greater than a predetermined sparsity density. The second NPU receives activation and weight tensors based on at least one of the activation sparsity density and the weight sparsity density being less than or equal to the predetermined sparsity density. The sparsity management unit controls transfer of the activation tensor and the weight tensor based on the activation sparsity density and the weight sparsity density with respect to the predetermined sparsity density.
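The routing rule in this abstract can be sketched as a simple comparison against the predetermined threshold. The sketch assumes that "sparsity density" is the fraction of nonzero elements in a tensor, which the abstract does not define; the names `nonzero_density` and `select_npu` are hypothetical.

```python
from typing import Sequence


def nonzero_density(values: Sequence[float]) -> float:
    """Fraction of nonzero elements (an assumed reading of 'sparsity density')."""
    if not values:
        raise ValueError("tensor must not be empty")
    return sum(1 for v in values if v != 0) / len(values)


def select_npu(activation_density: float, weight_density: float, threshold: float) -> str:
    """Pick the NPU that should receive the activation and weight tensors.

    The first NPU is chosen only when both densities exceed the threshold;
    if either is less than or equal to it, the second NPU is chosen.
    """
    if activation_density > threshold and weight_density > threshold:
        return "first"
    return "second"
```

With a threshold of 0.5, `select_npu(0.6, 0.7, 0.5)` selects the first NPU, while `select_npu(0.5, 0.7, 0.5)` selects the second because the activation density only equals the threshold.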

9.

Invention publication
STRUCTURED SPARSE MEMORY HIERARCHY FOR DEEP LEARNING [Under examination, published]

Publication (announcement) number: US20240095518A1

Publication (announcement) date: 2024-03-21

Application number: US17988739

Application date: 2022-11-16

Applicant: Samsung Electronics Co., Ltd.

Inventor: Ardavan PEDRAM, Jong Hoon SHIN, Joseph H. HASSOUN

IPC: G06N3/08

CPC classification number: G06N3/08

Abstract: A memory system and a method are disclosed for training a neural network model. A decompressor unit decompresses an activation tensor to a first predetermined sparsity density based on the activation tensor being compressed, and decompresses a weight tensor to a second predetermined sparsity density based on the weight tensor being compressed. A buffer unit receives the activation tensor at the first predetermined sparsity density and the weight tensor at the second predetermined sparsity density. A neural processing unit receives the activation tensor and the weight tensor from the buffer unit and computes a result for the activation tensor and the weight tensor based on the first predetermined sparsity density of the activation tensor and based on the second predetermined sparsity density of the weight tensor.

10.

Invention application
SRAM-SHARING FOR RECONFIGURABLE NEURAL PROCESSING UNITS [Granted]

Publication (announcement) number: US20220405557A1

Publication (announcement) date: 2022-12-22

Application number: US17400094

Application date: 2021-08-11

Applicant: Samsung Electronics Co., Ltd.

Inventor: Jong Hoon SHIN, Ali SHAFIEE ARDESTANI, Joseph H. HASSOUN

IPC: G06N3/063

Abstract: A system and a method are disclosed for processing input feature map (IFM) data of a current layer of a neural network model using an array of reconfigurable neural processing units (NPUs) and storing output feature map (OFM) data of the next layer of the neural network model at a location that does not involve a data transfer between memories of the NPUs according to the subject matter disclosed herein. The reconfigurable NPUs may be used to improve NPU utilization in a neural processing system.
