-
Publication No.: US11423251B2
Publication Date: 2022-08-23
Application No.: US16733314
Filing Date: 2020-01-03
Applicant: Samsung Electronics Co., Ltd.
Inventor: Dinesh Kumar Yadav , Ankur Deshwal , Saptarsi Das , Junwoo Jang , Sehwan Lee
Abstract: A method of performing convolution in a neural network with a variable dilation rate is provided. The method includes receiving a size of a first kernel and a dilation rate, determining a size of one or more disintegrated kernels based on the size of the first kernel, a baseline architecture of a memory, and the dilation rate, and determining an address of one or more blocks of an input image based on the dilation rate and one or more parameters associated with a size of the input image and the memory. Thereafter, the one or more blocks of the input image and the one or more disintegrated kernels are fetched from the memory, and an output image is obtained based on convolution of each of the one or more disintegrated kernels with the one or more blocks of the input image.
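The dilation relationship this method builds on can be illustrated with a short sketch. This is a minimal single-channel example, assuming valid padding, in which a dilated convolution is expressed as a standard convolution with a zero-inserted kernel; the patent's kernel-disintegration and block-addressing scheme is not reproduced, and the function names are illustrative.

```python
import numpy as np

def dilate_kernel(kernel: np.ndarray, rate: int) -> np.ndarray:
    """Insert (rate - 1) zeros between kernel elements along each axis."""
    kh, kw = kernel.shape
    dilated = np.zeros(((kh - 1) * rate + 1, (kw - 1) * rate + 1), dtype=kernel.dtype)
    dilated[::rate, ::rate] = kernel
    return dilated

def conv2d_valid(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Plain single-channel cross-correlation with 'valid' output size."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    return np.array([[np.sum(image[i:i + kh, j:j + kw] * kernel)
                      for j in range(ow)] for i in range(oh)])

def dilated_conv2d(image: np.ndarray, kernel: np.ndarray, rate: int) -> np.ndarray:
    """Dilated convolution expressed as a standard convolution with the zero-inserted kernel."""
    return conv2d_valid(image, dilate_kernel(kernel, rate))

image = np.arange(64, dtype=np.float32).reshape(8, 8)
kernel = np.ones((3, 3), dtype=np.float32)
print(dilated_conv2d(image, kernel, rate=2).shape)  # (4, 4): a 3x3 kernel at rate 2 spans 5x5
```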
-
Publication No.: US10885433B2
Publication Date: 2021-01-05
Application No.: US16107717
Filing Date: 2018-08-21
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventor: Joonho Song , Sehwan Lee , Junwoo Jang
Abstract: A neural network apparatus configured to perform a deconvolution operation includes a memory configured to store a first kernel; and a processor configured to: obtain, from the memory, the first kernel; calculate a second kernel by adjusting an arrangement of matrix elements comprised in the first kernel; generate sub-kernels by dividing the second kernel; perform a convolution operation between an input feature map and the sub-kernels using a convolution operator; and generate an output feature map, as a deconvolution of the input feature map, by merging results of the convolution operation.
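The divide-and-merge idea can be sketched in one dimension. The example below, assuming a stride of 2 and an even-length kernel, computes a transposed ("de")convolution both directly and via per-phase sub-kernels whose ordinary-convolution outputs are interleaved; the 2D kernel-rearrangement step described in the abstract is not reproduced, and all names are illustrative.

```python
import numpy as np

def transposed_conv1d_naive(x: np.ndarray, k: np.ndarray, stride: int = 2) -> np.ndarray:
    """Reference transposed convolution: scatter each input sample scaled by the whole kernel."""
    out = np.zeros((len(x) - 1) * stride + len(k), dtype=x.dtype)
    for i, value in enumerate(x):
        out[i * stride:i * stride + len(k)] += value * k
    return out

def transposed_conv1d_subkernels(x: np.ndarray, k: np.ndarray, stride: int = 2) -> np.ndarray:
    """Same result via per-phase sub-kernels: ordinary convolutions whose outputs are interleaved."""
    out = np.zeros((len(x) - 1) * stride + len(k), dtype=x.dtype)
    for phase in range(stride):
        sub_kernel = k[phase::stride]                    # sub-kernel feeding this output phase
        out[phase::stride] = np.convolve(x, sub_kernel)  # lengths match under the stated assumptions
    return out

x = np.array([1.0, 2.0, 3.0])
k = np.array([0.5, 1.0, -1.0, 0.25])
assert np.allclose(transposed_conv1d_naive(x, k), transposed_conv1d_subkernels(x, k))
```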
-
Publication No.: US20200026980A1
Publication Date: 2020-01-23
Application No.: US16552945
Filing Date: 2019-08-27
Applicant: Samsung Electronics Co., Ltd.
Inventor: Ilia Ovsiannikov , Ali Shafiee Ardestani , Joseph H. Hassoun , Lei Wang , Sehwan Lee , JoonHo Song , Jun-Woo Jang , Yibing Michelle Wang , Yuecheng Li
Abstract: A neural processor. In some embodiments, the processor includes a first tile, a second tile, a memory, and a bus. The bus may be connected to the memory, the first tile, and the second tile. The first tile may include: a first weight register, a second weight register, an activations buffer, a first multiplier, and a second multiplier. The activations buffer may be configured to include: a first queue connected to the first multiplier and a second queue connected to the second multiplier. The first queue may include a first register and a second register adjacent to the first register, the first register being an output register of the first queue. The first tile may be configured: in a first state: to multiply, in the first multiplier, a first weight by an activation from the output register of the first queue, and in a second state: to multiply, in the first multiplier, the first weight by an activation from the second register of the first queue.
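A behavioural sketch of the two operating states is given below. It assumes the second state is used to take the activation behind the output register, for example to skip past a zero; the class name, the zero-skip policy, and the retirement of two queue entries in one step are illustrative assumptions, not taken from the claims.

```python
from collections import deque

class TileLane:
    """One weight register and one activations queue feeding a single multiplier."""

    def __init__(self, weight: float, activations: list[float]):
        self.weight = weight             # first weight register
        self.queue = deque(activations)  # queue[0] plays the role of the output register

    def multiply_first_state(self) -> float:
        """First state: weight times the activation in the output register of the queue."""
        return self.weight * self.queue[0]

    def multiply_second_state(self) -> float:
        """Second state: weight times the activation in the register adjacent to the output register."""
        return self.weight * self.queue[1]

    def step(self) -> float:
        # Illustrative control: when the output register holds a zero, use the adjacent
        # register instead and retire both queue entries in the same step.
        if self.queue[0] == 0.0 and len(self.queue) > 1:
            product = self.multiply_second_state()
            self.queue.popleft()
            self.queue.popleft()
        else:
            product = self.multiply_first_state()
            self.queue.popleft()
        return product

lane = TileLane(weight=2.0, activations=[3.0, 0.0, 5.0, 1.0])
print([lane.step() for _ in range(3)])   # [6.0, 10.0, 2.0]
```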
-
Publication No.: US20190392287A1
Publication Date: 2019-12-26
Application No.: US16446610
Filing Date: 2019-06-19
Applicant: Samsung Electronics Co., Ltd.
Inventor: Ilia Ovsiannikov , Ali Shafiee Ardestani , Joseph H. Hassoun , Lei Wang , Sehwan Lee , JoonHo Song , Jun-Woo Jang , Yibing Michelle Wang , Yuecheng Li
Abstract: A neural processor. In some embodiments, the processor includes a first tile, a second tile, a memory, and a bus. The bus may be connected to the memory, the first tile, and the second tile. The first tile may include: a first weight register, a second weight register, an activations buffer, a first multiplier, and a second multiplier. The activations buffer may be configured to include: a first queue connected to the first multiplier and a second queue connected to the second multiplier. The first queue may include a first register and a second register adjacent to the first register, the first register being an output register of the first queue. The first tile may be configured: in a first state: to multiply, in the first multiplier, a first weight by an activation from the output register of the first queue, and in a second state: to multiply, in the first multiplier, the first weight by an activation from the second register of the first queue.
-
Publication No.: US12086700B2
Publication Date: 2024-09-10
Application No.: US16552619
Filing Date: 2019-08-27
Applicant: Samsung Electronics Co., Ltd.
Inventor: Ilia Ovsiannikov , Ali Shafiee Ardestani , Joseph H. Hassoun , Lei Wang , Sehwan Lee , JoonHo Song , Jun-Woo Jang , Yibing Michelle Wang , Yuecheng Li
CPC classification number: G06N3/04 , G06F17/153 , G06F17/16 , G06N3/08 , G06T9/002 , G06F9/3001
Abstract: A neural processor. In some embodiments, the processor includes a first tile, a second tile, a memory, and a bus. The bus may be connected to the memory, the first tile, and the second tile. The first tile may include: a first weight register, a second weight register, an activations buffer, a first multiplier, and a second multiplier. The activations buffer may be configured to include: a first queue connected to the first multiplier and a second queue connected to the second multiplier. The first queue may include a first register and a second register adjacent to the first register, the first register being an output register of the first queue. The first tile may be configured: in a first state: to multiply, in the first multiplier, a first weight by an activation from the output register of the first queue, and in a second state: to multiply, in the first multiplier, the first weight by an activation from the second register of the first queue.
-
Publication No.: US11971823B2
Publication Date: 2024-04-30
Application No.: US17317339
Filing Date: 2021-05-11
Applicant: Samsung Electronics Co., Ltd.
Inventor: Yoojin Kim , Channoh Kim , Hyun Sun Park , Sehwan Lee , Jun-Woo Jang
IPC: G06F12/0862 , G06F1/04 , G06F12/0804 , G06F17/15 , G06F17/16 , G06N3/04 , G06N3/045 , G06N3/063 , G06N3/08
CPC classification number: G06F12/0862 , G06F12/0804 , G06F1/04 , G06F2212/1021 , G06N3/04
Abstract: A computing method and device with data sharing are provided. The method includes loading, by a loader, input data of an input feature map stored in a memory in loading units according to a loading order, storing, by a buffer controller, the loaded input data in a reuse buffer at an address rotationally allocated according to the loading order, and transmitting, by each of a plurality of senders, to an executer, respective input data corresponding to each output data of respective convolution operations among the input data stored in the reuse buffer, wherein portions of the transmitted respective input data overlap each other.
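The rotational address allocation and overlapping sends can be sketched as a small circular buffer. The example assumes a 1D input, a kernel size of 3, and a stride of 1; names such as ReuseBuffer and send_window are illustrative and not taken from the patent.

```python
class ReuseBuffer:
    def __init__(self, size: int):
        self.size = size
        self.slots = [None] * size
        self.loaded = 0                       # total elements loaded so far

    def load(self, value: float) -> None:
        """Buffer controller: store at the address rotationally allocated by the loading order."""
        self.slots[self.loaded % self.size] = value
        self.loaded += 1

    def send_window(self, start: int, kernel_size: int) -> list[float]:
        """Sender: gather the (overlapping) inputs needed for one output position."""
        return [self.slots[(start + offset) % self.size] for offset in range(kernel_size)]

input_feature_map = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
kernel_size, buffer = 3, ReuseBuffer(size=4)

for value in input_feature_map[:4]:           # loader: fill the buffer in loading order
    buffer.load(value)

# Windows for output positions 0 and 1 overlap in two elements but are loaded only once.
print(buffer.send_window(0, kernel_size))     # [1.0, 2.0, 3.0]
print(buffer.send_window(1, kernel_size))     # [2.0, 3.0, 4.0]
```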
-
Publication No.: US11783162B2
Publication Date: 2023-10-10
Application No.: US16552619
Filing Date: 2019-08-27
Applicant: Samsung Electronics Co., Ltd.
Inventor: Ilia Ovsiannikov , Ali Shafiee Ardestani , Joseph H. Hassoun , Lei Wang , Sehwan Lee , JoonHo Song , Jun-Woo Jang , Yibing Michelle Wang , Yuecheng Li
CPC classification number: G06N3/04 , G06F17/153 , G06F17/16 , G06N3/08 , G06T9/002 , G06F9/3001
Abstract: A neural processor. In some embodiments, the processor includes a first tile, a second tile, a memory, and a bus. The bus may be connected to the memory, the first tile, and the second tile. The first tile may include: a first weight register, a second weight register, an activations buffer, a first multiplier, and a second multiplier. The activations buffer may be configured to include: a first queue connected to the first multiplier and a second queue connected to the second multiplier. The first queue may include a first register and a second register adjacent to the first register, the first register being an output register of the first queue. The first tile may be configured: in a first state: to multiply, in the first multiplier, a first weight by an activation from the output register of the first queue, and in a second state: to multiply, in the first multiplier, the first weight by an activation from the second register of the first queue.
-
Publication No.: US11663473B2
Publication Date: 2023-05-30
Application No.: US17112041
Filing Date: 2020-12-04
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventor: Joonho Song , Sehwan Lee , Junwoo Jang
CPC classification number: G06N3/08 , G06F17/153 , G06N3/04 , G06N3/045
Abstract: A neural network apparatus configured to perform a deconvolution operation includes a memory configured to store a first kernel; and a processor configured to: obtain, from the memory, the first kernel; calculate a second kernel by adjusting an arrangement of matrix elements comprised in the first kernel; generate sub-kernels by dividing the second kernel; perform a convolution operation between an input feature map and the sub-kernels using a convolution operator; and generate an output feature map, as a deconvolution of the input feature map, by merging results of the convolution operation.
-
Publication No.: US20220036243A1
Publication Date: 2022-02-03
Application No.: US17147858
Filing Date: 2021-01-13
Applicant: Samsung Electronics Co., Ltd.
Inventor: Saptarsi Das , Sabitha Kusuma , Arnab Roy , Ankur Deshwal , Kiran Kolar Chandrasekharan , Sehwan Lee
Abstract: An apparatus includes a global memory and a systolic array. The global memory is configured to store and provide an input feature map (IFM) vector stream from an IFM tensor and a kernel vector stream from a kernel tensor. The systolic array is configured to receive the IFM vector stream and the kernel vector stream from the global memory. The systolic array is on-chip together with the global memory. The systolic array includes a plurality of processing elements (PEs) each having a plurality of vector units, each of the plurality of vector units being configured to perform a dot-product operation on at least one IFM vector of the IFM vector stream and at least one kernel vector of the kernel vector stream per unit clock cycle to generate a plurality of output feature maps (OFMs).
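A functional sketch of one vector unit is shown below. It assumes the IFM and kernel streams are already tiled into equal-length vectors, with one dot product performed per "cycle"; the systolic movement of data between PEs is not modelled, and the function names are illustrative.

```python
import numpy as np

def vector_unit(ifm_stream: list[np.ndarray], kernel_stream: list[np.ndarray]) -> float:
    """Accumulate one OFM value as a sequence of per-cycle dot products."""
    accumulator = 0.0
    for ifm_vec, kernel_vec in zip(ifm_stream, kernel_stream):
        accumulator += float(np.dot(ifm_vec, kernel_vec))   # one dot product per clock cycle
    return accumulator

# Example: a 1x1 convolution over 8 input channels, streamed as two 4-wide vectors.
ifm_stream = [np.array([1.0, 2.0, 3.0, 4.0]), np.array([5.0, 6.0, 7.0, 8.0])]
kernel_stream = [np.array([0.5, 0.5, 0.5, 0.5]), np.array([1.0, 0.0, 1.0, 0.0])]
print(vector_unit(ifm_stream, kernel_stream))   # 5.0 + 12.0 = 17.0
```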
-
Publication No.: US10909418B2
Publication Date: 2021-02-02
Application No.: US16884232
Filing Date: 2020-05-27
Inventor: Sehwan Lee , Leesup Kim , Hyeonuk Kim , Jaehyeong Sim , Yeongjae Choi
Abstract: A processor-implemented neural network method includes: obtaining, from a memory, data of an input feature map and kernels having a binary-weight, wherein the kernels are to be processed in a layer of a neural network; decomposing each of the kernels into a first type sub-kernel reconstructed with weights of a same sign, and a second type sub-kernel for correcting a difference between a respective kernel, among the kernels, and the first type sub-kernel; performing a convolution operation by using the input feature map and the first type sub-kernels and the second type sub-kernels decomposed from each of the kernels; and obtaining an output feature map by combining results of the convolution operation.
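The sign-based decomposition can be sketched as follows, assuming binary weights in {+1, -1}: the first-type sub-kernel is taken here as the all-positive kernel and the second-type sub-kernel as the correction, which is nonzero only where the original weight is -1. Hardware-level reuse of the first sub-kernel's partial sums is not modelled, and all names are illustrative.

```python
import numpy as np

def conv2d_valid(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Plain single-channel cross-correlation with 'valid' output size."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    return np.array([[np.sum(image[i:i + kh, j:j + kw] * kernel)
                      for j in range(ow)] for i in range(oh)])

def decompose(kernel: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    first = np.ones_like(kernel)    # first-type sub-kernel: weights of the same sign
    second = kernel - first         # second-type sub-kernel: corrects the difference
    return first, second

rng = np.random.default_rng(0)
image = rng.standard_normal((6, 6))
kernel = rng.choice([-1.0, 1.0], size=(3, 3))   # binary-weight kernel

first, second = decompose(kernel)
direct = conv2d_valid(image, kernel)
combined = conv2d_valid(image, first) + conv2d_valid(image, second)
assert np.allclose(direct, combined)            # merging the two convolutions recovers the output
```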