Patent search ap:("ADVANCED MICRO DEVICES INC.") AND inv:"Jian HUANG"

1.

Invention application
MATRIX MULTIPLICATION UNIT WITH FLEXIBLE PRECISION OPERATIONS [Granted]

Publication (announcement) number: US20210089304A1

Publication (announcement) date: 2021-03-25

Application number: US16581252

Application date: 2019-09-24

Applicant: ADVANCED MICRO DEVICES, INC.

Inventor: Bin HE, Michael MANTOR, Jiasheng CHEN, Jian HUANG

IPC: G06F9/30, G06F17/16, G06F9/54, G06F9/38

Abstract: A processing unit such as a graphics processing unit (GPU) includes a plurality of vector signal processors (VSPs) that include multiply/accumulate elements. The processing unit also includes a plurality of registers associated with the plurality of VSPs. First portions of first and second matrices are fetched into the plurality of registers prior to a first round that includes a plurality of iterations. The multiply/accumulate elements perform matrix multiplication and accumulation on different combinations of subsets of the first portions of the first and second matrices in the plurality of iterations prior to fetching second portions of the first and second matrices into the plurality of registers for a second round. The accumulated results of multiplying the first portions of the first and second matrices are written into an output buffer in response to completing the plurality of iterations.
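The round-and-iteration scheme in this abstract can be illustrated with a minimal software sketch. It is not the patented hardware design: the function name `matmul_in_rounds`, the `portion` and `subset` sizes, and the choice to treat a portion as a block of rows of the first matrix and a block of columns of the second are all assumptions made for illustration. Each round fetches one portion of each matrix once, runs several iterations over different subset combinations, and writes the accumulated results to the output buffer only after the iterations finish.

```python
def matmul_in_rounds(a, b, portion=4, subset=2):
    """Multiply a (M x K) by b (K x N); returns (product, number_of_rounds).

    Hypothetical sketch: `portion` is how many rows/columns are fetched per
    round, `subset` is how many of them one iteration works on.
    """
    m, k, n = len(a), len(b), len(b[0])
    out = [[0] * n for _ in range(m)]  # stands in for the output buffer
    rounds = 0
    for r0 in range(0, m, portion):
        for c0 in range(0, n, portion):
            # Fetch one portion of each matrix into "registers", once per round.
            a_regs = a[r0:r0 + portion]
            b_regs = [[b[x][c] for x in range(k)]
                      for c in range(c0, min(c0 + portion, n))]
            rounds += 1
            acc = {}  # accumulators held until the round completes
            # Iterations: each pairs one row subset with one column subset.
            for i0 in range(0, len(a_regs), subset):
                for j0 in range(0, len(b_regs), subset):
                    for i in range(i0, min(i0 + subset, len(a_regs))):
                        for j in range(j0, min(j0 + subset, len(b_regs))):
                            total = 0
                            for x in range(k):  # multiply/accumulate
                                total += a_regs[i][x] * b_regs[j][x]
                            acc[(i, j)] = total
            # Write results only after all iterations of the round are done.
            for (i, j), value in acc.items():
                out[r0 + i][c0 + j] = value
    return out, rounds
```

In this sketch the fetched portions are reused across every iteration of a round, which is the data-reuse idea the abstract describes; the real unit's register layout and precision handling are not modeled.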

2.

Invention application
ARITHMETIC LOGIC UNIT REGISTER SEQUENCING [Granted]

Publication (announcement) number: US20220171621A1

Publication (announcement) date: 2022-06-02

Application number: US17574026

Application date: 2022-01-12

Applicant: ADVANCED MICRO DEVICES, INC.

Inventor: Bin HE, Jiasheng CHEN, Jian HUANG

IPC: G06F9/30, G06F7/57, G06F9/48

Abstract: A graphics processing unit (GPU) sequences provision of operands to a set of operand registers, thereby allowing the GPU to share at least one of the operand registers between processing. The GPU includes a plurality of arithmetic logic units (ALUs) with at least one of the ALUs configured to perform double precision operations. The GPU further includes a set of operand registers configured to store single precision operands. For a plurality of executing threads that request double precision operations, the GPU stores the corresponding operands at the operand registers. Over a plurality of execution cycles, the GPU sequences transfer of operands from the set of operand registers to a designated double precision operand register. During each execution cycle, the double-precision ALU executes a double precision operation using the operand stored at the double precision operand register.
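The operand sequencing described in this abstract can be sketched in software as follows. This is a hedged illustration only: the names `to_words`, `from_words`, and `run_double_precision` are hypothetical, and the assumption that each double precision operand occupies two 32-bit single precision registers is mine, not a statement from the record. One operand per cycle is moved into a single shared double precision register, and the ALU operates on that register.

```python
import struct


def to_words(value):
    """Split a 64-bit double into two 32-bit words (low, high)."""
    bits = struct.unpack("<Q", struct.pack("<d", value))[0]
    return bits & 0xFFFFFFFF, bits >> 32


def from_words(low, high):
    """Reassemble a 64-bit double from two 32-bit words."""
    return struct.unpack("<d", struct.pack("<Q", (high << 32) | low))[0]


def run_double_precision(thread_operands, op):
    """Apply `op` to one operand per thread, one thread per cycle."""
    # Assumption: operands are stored as pairs of 32-bit register values.
    operand_regs = [to_words(v) for v in thread_operands]
    results = []
    for low, high in operand_regs:       # one execution cycle per thread
        dp_reg = from_words(low, high)   # transfer into the shared DP register
        results.append(op(dp_reg))       # the double precision ALU executes
    return results
```

For example, `run_double_precision([1.5, -2.25, 3.0], lambda x: x * 2.0)` doubles each thread's operand in turn while only ever holding one value in the shared register.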

3.

Invention application
ARITHEMETIC LOGIC UNIT REGISTER SEQUENCING [Granted]

Publication (announcement) number: US20210157581A1

Publication (announcement) date: 2021-05-27

Application number: US16696108

Application date: 2019-11-26

Applicant: ADVANCED MICRO DEVICES, INC.

Inventor: Bin HE, Jiasheng CHEN, Jian HUANG

IPC: G06F9/30, G06F9/48, G06F7/57

Abstract: A graphics processing unit (GPU) sequences provision of operands to a set of operand registers, thereby allowing the GPU to share at least one of the operand registers between processing. The GPU includes a plurality of arithmetic logic units (ALUs) with at least one of the ALUs configured to perform double precision operations. The GPU further includes a set of operand registers configured to store single precision operands. For a plurality of executing threads that request double precision operations, the GPU stores the corresponding operands at the operand registers. Over a plurality of execution cycles, the GPU sequences transfer of operands from the set of operand registers to a designated double precision operand register. During each execution cycle, the double-precision ALU executes a double precision operation using the operand stored at the double precision operand register.

4.

Invention publication
MATRIX MULTIPLICATION UNIT WITH FLEXIBLE PRECISION OPERATIONS [Under examination, published]

Publication (announcement) number: US20240111530A1

Publication (announcement) date: 2024-04-04

Application number: US18243264

Application date: 2023-09-07

Applicant: ADVANCED MICRO DEVICES, INC.

Inventor: Bin HE, Michael MANTOR, Jiasheng CHEN, Jian HUANG

IPC: G06F9/30, G06F9/38, G06F9/54, G06F17/16

CPC classification number: G06F9/30036, G06F9/30101, G06F9/3877, G06F9/544, G06F17/16

Abstract: A processing unit such as a graphics processing unit (GPU) includes a plurality of vector signal processors (VSPs) that include multiply/accumulate elements. The processing unit also includes a plurality of registers associated with the plurality of VSPs. First portions of first and second matrices are fetched into the plurality of registers prior to a first round that includes a plurality of iterations. The multiply/accumulate elements perform matrix multiplication and accumulation on different combinations of subsets of the first portions of the first and second matrices in the plurality of iterations prior to fetching second portions of the first and second matrices into the plurality of registers for a second round. The accumulated results of multiplying the first portions of the first and second matrices are written into an output buffer in response to completing the plurality of iterations.

5.

Invention application
DEDICATED VECTOR SUB-PROCESSOR SYSTEM [Granted]

Publication (announcement) number: US20210157588A1

Publication (announcement) date: 2021-05-27

Application number: US16697660

Application date: 2019-11-27

Applicant: ADVANCED MICRO DEVICES, INC.

Inventor: Jiasheng CHEN, Bin HE, Jian HUANG, Michael MANTOR

IPC: G06F9/30, G06F9/48

Abstract: A processor includes a plurality of vector sub-processors (VSPs) and a plurality of memory banks dedicated to respective VSPs. A first memory bank corresponding to a first VSP includes a first plurality of high vector general purpose register (VGPR) banks and a first plurality of low VGPR banks corresponding to the first plurality of high VGPR banks. The first memory bank further includes a plurality of operand gathering components that store operands from respective high VGPR banks and low VGPR banks. The operand gathering components are assigned to individual threads while the threads are executed by the first VSP.
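The bank organization in this abstract can be modeled loosely as a data structure. Everything below is a hypothetical sketch: the class names `MemoryBank` and `OperandGatherer`, the one-gatherer-per-bank-pair layout, and the representation of a VGPR bank as a dictionary from register number to value are assumptions for illustration, not details taken from the record.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class OperandGatherer:
    """Holds operands from one high/low VGPR bank pair for one thread."""
    bank_index: int
    thread_id: Optional[int] = None
    operands: list = field(default_factory=list)


@dataclass
class MemoryBank:
    """Memory bank dedicated to a single VSP (hypothetical layout)."""
    high_banks: list  # each entry: dict of register number -> value
    low_banks: list   # low bank i corresponds to high bank i
    gatherers: list = field(default_factory=list)

    def __post_init__(self):
        # Assumption: one gathering component per high/low bank pair.
        self.gatherers = [OperandGatherer(i) for i in range(len(self.high_banks))]

    def assign(self, thread_id):
        """Assign a free gatherer to a thread for the duration of its execution."""
        for gatherer in self.gatherers:
            if gatherer.thread_id is None:
                gatherer.thread_id = thread_id
                return gatherer
        raise RuntimeError("no free operand gatherer")

    def gather(self, gatherer, reg):
        """Store the operands read from the gatherer's high and low banks."""
        i = gatherer.bank_index
        gatherer.operands = [self.high_banks[i][reg], self.low_banks[i][reg]]
        return gatherer.operands
```

The point of the sketch is only that each thread gets its own gathering component backed by memory private to one VSP, so threads on different VSPs never contend for the same bank.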
