-
Publication number: US12164593B2
Publication date: 2024-12-10
Application number: US17374988
Filing date: 2021-07-13
Applicant: Samsung Electronics Co., Ltd.
Inventor: Peng Gu, Krishna Malladi, Hongzhong Zheng, Dimin Niu
IPC: G06F17/16, G06F12/0802, G06F12/0877, G06N3/008, G06N3/045, G06N3/063, G06N3/08
Abstract: A general matrix-matrix multiplication (GEMM) dataflow accelerator circuit is disclosed that includes a smart 3D stacking DRAM architecture. The accelerator circuit includes a memory bank, a peripheral lookup table stored in the memory bank, and a first vector buffer to store a first vector that is used as a row address into the lookup table. The circuit includes a second vector buffer to store a second vector that is used as a column address into the lookup table, and lookup table buffers to receive and store lookup table entries from the lookup table. The circuit further includes adders to sum a first product and a second product, and an output buffer to store the sum. The lookup table buffers determine a product of the first vector and the second vector without performing a multiply operation. The embodiments include a hierarchical lookup architecture to reduce latency. Accumulation results are propagated in a systolic manner.
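The lookup-based multiplication described in this abstract can be illustrated in software. The following is a minimal sketch, not the patented circuit: it assumes unsigned 4-bit operands, and the names `build_lut`, `lut_dot`, and `lut_gemm` are hypothetical. One operand serves as the row address and the other as the column address into a precomputed product table, and an accumulator plays the role of the adders. The hierarchical lookup and systolic propagation are not modeled.

```python
def build_lut(bits=4):
    """Precompute every product of two unsigned `bits`-wide operands.

    The table is filled once, ahead of time (assumption: 4-bit operands).
    """
    size = 1 << bits
    return [[row * col for col in range(size)] for row in range(size)]


def lut_dot(first_vector, second_vector, lut):
    """Dot product using table lookups and additions only."""
    total = 0
    for row_address, col_address in zip(first_vector, second_vector):
        # The product is read from the table; no multiply is performed here.
        total += lut[row_address][col_address]
    return total


def lut_gemm(a, b, lut):
    """Matrix-matrix product built from lookup-based dot products."""
    b_columns = list(zip(*b))
    return [[lut_dot(row, col, lut) for col in b_columns] for row in a]


lut = build_lut(4)
print(lut_gemm([[1, 2], [3, 4]], [[5, 6], [7, 8]], lut))
```

In this sketch the table size grows as 2^(2 x bits), which is one reason a lookup approach suits low-precision operands.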
-
Publication number: US12130884B2
Publication date: 2024-10-29
Application number: US17374988
Filing date: 2021-07-13
Applicant: Samsung Electronics Co., Ltd.
Inventor: Peng Gu, Krishna Malladi, Hongzhong Zheng, Dimin Niu
IPC: G06F17/16, G06F12/0802, G06F12/0877, G06N3/008, G06N3/045, G06N3/063, G06N3/08
CPC classification number: G06F17/16, G06F12/0802, G06F12/0877, G06N3/008, G06N3/045, G06N3/063, G06F2212/1024, G06F2212/1036, G06F2212/22, G06N3/08
Abstract: A general matrix-matrix multiplication (GEMM) dataflow accelerator circuit is disclosed that includes a smart 3D stacking DRAM architecture. The accelerator circuit includes a memory bank, a peripheral lookup table stored in the memory bank, and a first vector buffer to store a first vector that is used as a row address into the lookup table. The circuit includes a second vector buffer to store a second vector that is used as a column address into the lookup table, and lookup table buffers to receive and store lookup table entries from the lookup table. The circuit further includes adders to sum a first product and a second product, and an output buffer to store the sum. The lookup table buffers determine a product of the first vector and the second vector without performing a multiply operation. The embodiments include a hierarchical lookup architecture to reduce latency. Accumulation results are propagated in a systolic manner.
-
Publication number: US11940922B2
Publication date: 2024-03-26
Application number: US18081488
Filing date: 2022-12-14
Applicant: Samsung Electronics Co., Ltd.
Inventor: Mu-Tien Chang, Krishna T. Malladi, Dimin Niu, Hongzhong Zheng
IPC: G06F12/0875, G06F13/12, G06F13/16, G06F9/30
CPC classification number: G06F12/0875, G06F13/124, G06F13/1636, G06F13/1689, G06F9/3001, G06F9/30098, G06F2212/452
Abstract: A method of processing in-memory commands in a high-bandwidth memory (HBM) system includes sending a function-in-HBM instruction to the HBM by an HBM memory controller of a GPU. A logic component of the HBM receives the FIM instruction and coordinates the instruction's execution using the controller, an ALU, and an SRAM located on the logic component.
-
Publication number: US11934669B2
Publication date: 2024-03-19
Application number: US16942641
Filing date: 2020-07-29
Applicant: Samsung Electronics Co., Ltd.
Inventor: Dimin Niu, Shuangchen Li, Bob Brennan, Krishna T. Malladi, Hongzhong Zheng
IPC: G06F3/06, G06F15/78, G11C11/4096
CPC classification number: G06F3/0631, G06F3/0604, G06F3/067, G06F15/7821, G11C11/4096
Abstract: A processor includes a plurality of memory units, each of the memory units including a plurality of memory cells, wherein each of the memory units is configurable to operate as memory, as a computation unit, or as a hybrid memory-computation unit.
-
Publication number: US11334284B2
Publication date: 2022-05-17
Application number: US16195732
Filing date: 2018-11-19
Applicant: Samsung Electronics Co., Ltd.
Inventor: Andrew Zhenwen Chang, Jongmin Gim, Hongzhong Zheng
Abstract: A database offloading engine. In some embodiments, the database offloading engine includes a vectorized adder including a plurality of read-modify-write circuits, a plurality of sum buffers respectively connected to the read-modify-write circuits, a key address table, and a control circuit. The control circuit may be configured to receive a first key and a corresponding first value; to search the key address table for the first key; and, in response to finding, in the key address table, an address corresponding to the first key, to route the address and the first value to a read-modify-write circuit, of the plurality of read-modify-write circuits, corresponding to the address.
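The control flow in this abstract resembles a grouped sum: a key is looked up in the key address table, and the resulting address and value are routed to the matching read-modify-write circuit. The following is a minimal software sketch under stated assumptions. The class name `OffloadAggregator`, the `lanes` parameter, the address-modulo-lane routing, and the allocation of a new address when a key is not found are all hypothetical; the abstract only describes the case where the key is found.

```python
class OffloadAggregator:
    """Software sketch of key-routed accumulation (not the patented circuit)."""

    def __init__(self, lanes=4):
        self.lanes = lanes
        self.key_address_table = {}  # key -> address
        # One sum buffer per read-modify-write lane.
        self.sum_buffers = [dict() for _ in range(lanes)]
        self.next_address = 0

    def push(self, key, value):
        address = self.key_address_table.get(key)
        if address is None:
            # Assumption: a missing key is assigned the next free address.
            address = self.next_address
            self.key_address_table[key] = address
            self.next_address += 1
        # Assumption: the lane is chosen by address modulo the lane count.
        buffer = self.sum_buffers[address % self.lanes]
        # Read-modify-write: read the running sum, add, write it back.
        buffer[address] = buffer.get(address, 0) + value

    def result(self):
        return {
            key: self.sum_buffers[address % self.lanes][address]
            for key, address in self.key_address_table.items()
        }


agg = OffloadAggregator()
for key, value in [("a", 1), ("b", 2), ("a", 3)]:
    agg.push(key, value)
print(agg.result())
```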
-
Publication number: US11269811B2
Publication date: 2022-03-08
Application number: US16595441
Filing date: 2019-10-07
Applicant: Samsung Electronics Co., Ltd.
Inventor: Dongyan Jiang, Qiang Peng, Hongzhong Zheng
IPC: G06F3/06, G06F16/174, G06F11/14, G06F16/22, G06F16/215
Abstract: A memory system is disclosed. The memory system may include a Big Hash Table and a Little Hash Table. The memory system may also include an Overflow Region and a Translation Table to map a logical address to a Physical Line Identifier (PLID), which may include a region identifier and a physical address.
-
Publication number: US20210406202A1
Publication date: 2021-12-30
Application number: US17469769
Filing date: 2021-09-08
Applicant: Samsung Electronics Co., Ltd.
Inventor: Krishna T. Malladi, Hongzhong Zheng, Dimin Niu, Peng Gu
Abstract: A high bandwidth memory (HBM) system includes a first HBM+ card. The first HBM+ card includes a plurality of HBM+ cubes. Each HBM+ cube has a logic die and a memory die. The first HBM+ card also includes an HBM+ card controller coupled to each of the plurality of HBM+ cubes and configured to interface with a host, a pin connection configured to connect to the host, and a fabric connection configured to connect to at least one HBM+ card.
-
Publication number: US11151006B2
Publication date: 2021-10-19
Application number: US16150239
Filing date: 2018-10-02
Applicant: Samsung Electronics Co., Ltd.
Inventor: Dimin Niu, Krishna Malladi, Hongzhong Zheng
Abstract: According to one general aspect, an apparatus may include a plurality of stacked integrated circuit dies that include a memory cell die and a logic die. The memory cell die may be configured to store data at a memory address. The logic die may include an interface to the stacked integrated circuit dies that is configured to communicate memory accesses between the memory cell die and at least one external device. The logic die may include a reliability circuit configured to ameliorate data errors within the memory cell die. The reliability circuit may include a spare memory configured to store data, and an address table configured to map a memory address associated with an error to the spare memory. The reliability circuit may be configured to determine whether a memory access is associated with an error and, if so, to complete the memory access with the spare memory.
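The remapping behavior of the reliability circuit can be sketched in software. This is an illustration under stated assumptions, not the patented design: the class name `ReliabilitySketch`, the `mark_faulty` method, and the first-free-slot allocation policy are hypothetical. An address table maps addresses associated with an error to spare slots, and every access checks that table before touching the main memory.

```python
class ReliabilitySketch:
    """Redirects accesses to faulty addresses into a small spare memory."""

    def __init__(self, memory_size, spare_size):
        self.memory = [0] * memory_size
        self.spare = [0] * spare_size
        self.address_table = {}  # faulty address -> spare slot index

    def mark_faulty(self, address):
        """Record that `address` is associated with an error."""
        if address in self.address_table:
            return
        if len(self.address_table) >= len(self.spare):
            raise RuntimeError("spare memory exhausted")
        # Assumption: spare slots are handed out in order of first use.
        self.address_table[address] = len(self.address_table)

    def write(self, address, value):
        slot = self.address_table.get(address)
        if slot is None:
            self.memory[address] = value
        else:
            self.spare[slot] = value  # complete the access with the spare

    def read(self, address):
        slot = self.address_table.get(address)
        if slot is None:
            return self.memory[address]
        return self.spare[slot]


circuit = ReliabilitySketch(memory_size=8, spare_size=2)
circuit.mark_faulty(3)
circuit.write(3, 42)
print(circuit.read(3), circuit.memory[3])
```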
-
Publication number: US11126354B2
Publication date: 2021-09-21
Application number: US16735688
Filing date: 2020-01-06
Applicant: Samsung Electronics Co., Ltd.
Inventor: Dongyan Jiang, Hongzhong Zheng
IPC: G06F3/06, G06F12/0831, G06F9/46, G06F12/1009, G06F12/0868, G06F13/28
Abstract: A transaction manager for use with memory is described. The transaction manager can include a write data buffer to store outstanding write requests, a read data multiplexer to select between data read from the memory and the write data buffer, a command queue and a priority queue to store requests for the memory, and a transaction table to track outstanding write requests, each write request associated with a state that is Invalid, Modified, or Forwarded.
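The read data multiplexer in this abstract selects between data read from memory and data held in the write data buffer. The following is a minimal sketch of that selection only. The class name `TransactionManagerSketch` and the `drain` method are hypothetical, and the command queue, priority queue, and the Invalid/Modified/Forwarded states are deliberately not modeled because the abstract does not define their transitions.

```python
class TransactionManagerSketch:
    """Models only the write data buffer and the read-side selection."""

    def __init__(self, memory):
        self.memory = memory          # backing store, a list indexed by address
        self.write_data_buffer = {}   # address -> value for outstanding writes

    def write(self, address, value):
        # The write stays outstanding until drain() commits it.
        self.write_data_buffer[address] = value

    def read(self, address):
        # Read multiplexer: prefer the outstanding write over memory contents.
        if address in self.write_data_buffer:
            return self.write_data_buffer[address]
        return self.memory[address]

    def drain(self):
        """Assumption: all outstanding writes are committed in one step."""
        for address, value in self.write_data_buffer.items():
            self.memory[address] = value
        self.write_data_buffer.clear()


memory = [0, 0, 0, 0]
manager = TransactionManagerSketch(memory)
manager.write(1, 7)
print(manager.read(1), memory[1])
manager.drain()
print(manager.read(1), memory[1])
```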
-
Publication number: US20210271594A1
Publication date: 2021-09-02
Application number: US17322805
Filing date: 2021-05-17
Applicant: Samsung Electronics Co., Ltd.
Inventor: Krishna T. Malladi, Jongmin Gim, Hongzhong Zheng
Abstract: A pseudo main memory system. The system includes a memory adapter circuit for performing memory augmentation using compression, deduplication, and/or error correction. The memory adapter circuit is connected to a memory, and employs the memory augmentation methods to increase the effective storage capacity of the memory. The memory adapter circuit is also connected to a memory bus and implements an NVDIMM-F or modified NVDIMM-F interface for connecting to the memory bus.