-
Publication No.: US20210158131A1
Publication Date: 2021-05-27
Application No.: US16698236
Filing Date: 2019-11-27
Applicant: Amazon Technologies, Inc.
Inventor: Animesh Jain, Yizhi Liu, Hongbin Zheng, Jeffrey T. Huynh, Haichen Li, Drazen Borkovic, Jindrich Zejda, Richard John Heaton, Randy Renfu Huang, Zhi Chen, Yida Wang
Abstract: Methods and apparatuses for hierarchical partitioning of operators of a neural network for execution on an acceleration engine are provided. Neural networks are built in machine learning frameworks using neural network operators. The neural network operators are compiled into executable code for the acceleration engine. Development of new framework-level operators can outpace the capability to map the newly developed operators onto the acceleration engine. To enable neural networks to be executed on an acceleration engine, hierarchical partitioning can be used to partition the operators of the neural network. The hierarchical partitioning can identify operators that are supported by a compiler for execution on the acceleration engine, operators to be compiled for execution on a host processor, and operators to be executed on the machine learning framework.
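The abstract describes a three-tier split of operators. Below is a minimal Python sketch of that idea, assuming a hypothetical Operator type and two support predicates supplied by the caller; it illustrates the partitioning scheme only and is not the patent's actual implementation.

```python
# Minimal sketch of hierarchical operator partitioning. The Operator type
# and the two support predicates are hypothetical stand-ins.
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Operator:
    name: str      # framework-level operator name, e.g. "conv2d"
    op_type: str   # type string consulted by the support predicates


def partition_operators(
    ops: List[Operator],
    accelerator_supports: Callable[[Operator], bool],
    host_supports: Callable[[Operator], bool],
) -> Tuple[List[Operator], List[Operator], List[Operator]]:
    """Split operators into accelerator, host-CPU, and framework tiers."""
    accel_ops: List[Operator] = []
    host_ops: List[Operator] = []
    framework_ops: List[Operator] = []
    for op in ops:
        if accelerator_supports(op):
            accel_ops.append(op)        # compiled for the acceleration engine
        elif host_supports(op):
            host_ops.append(op)         # compiled for the host processor
        else:
            framework_ops.append(op)    # left to the ML framework runtime
    return accel_ops, host_ops, framework_ops


# Example usage with toy predicates.
ops = [Operator("conv1", "conv2d"), Operator("sort1", "topk"), Operator("new1", "custom")]
accel, host, fw = partition_operators(
    ops,
    accelerator_supports=lambda op: op.op_type == "conv2d",
    host_supports=lambda op: op.op_type == "topk",
)
print([o.name for o in accel], [o.name for o in host], [o.name for o in fw])
# ['conv1'] ['sort1'] ['new1']
```

In this sketch, the first tier would be handed to the accelerator compiler backend, the second compiled for the host processor, and the remainder left to run inside the machine learning framework.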
-
Publication No.: US11782706B1
Publication Date: 2023-10-10
Application No.: US17361992
Filing Date: 2021-06-29
Applicant: Amazon Technologies, Inc.
Inventor: Ron Diamant, Hongbin Zheng, Drazen Borkovic, Haichen Li
Abstract: In one example, a method comprises: receiving input codes, wherein the input codes represent a computational dataflow graph; traversing the computational dataflow graph to identify single-entry-single-exit (SESE) subgraphs of the computational dataflow graph, wherein each SESE subgraph has a sequence of nodes comprising a root node and a child node and representing a sequence of element-wise operators, wherein the root node receives a single input tensor, and wherein the child node outputs a single output tensor; determining a merged operator for each SESE subgraph; and generating executable instructions for the computational dataflow graph to be executed by a hardware accelerator having a first execution unit and a second execution unit, wherein the executable instructions comprise first executable instructions for the merged operators targeted at the first execution unit, and second executable instructions for other operators of the computational dataflow graph targeted at the second execution unit.
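As an illustration of the SESE-subgraph idea, the following Python sketch scans a small dataflow graph for maximal chains of element-wise operators with a single input tensor at the root and a single output tensor at the leaf; each returned chain is a candidate to be replaced by one merged operator. The graph encoding and the ELEMENTWISE set are assumptions made for this example, not the patent's representation.

```python
# Sketch of detecting single-entry-single-exit (SESE) chains of
# element-wise operators in a computational dataflow graph.
from collections import defaultdict

ELEMENTWISE = {"add", "mul", "relu", "sigmoid", "exp"}  # assumed op set


def find_sese_chains(nodes, edges):
    """nodes: {node_id: op_type}; edges: iterable of (src, dst) pairs."""
    succ, pred = defaultdict(list), defaultdict(list)
    for src, dst in edges:
        succ[src].append(dst)
        pred[dst].append(src)

    def chainable(n):
        # element-wise node with a single input tensor and a single output tensor
        return nodes[n] in ELEMENTWISE and len(pred[n]) <= 1 and len(succ[n]) <= 1

    chains = []
    for n in nodes:
        # start a chain only where no chainable predecessor could extend it upward
        if not chainable(n):
            continue
        if pred[n] and chainable(pred[n][0]):
            continue
        chain = [n]
        while succ[chain[-1]] and chainable(succ[chain[-1]][0]):
            chain.append(succ[chain[-1]][0])
        if len(chain) > 1:  # only multi-operator chains are worth merging
            chains.append(chain)
    return chains


# Example: relu -> exp chain feeding a matmul; the element-wise chain
# would become one merged operator.
nodes = {"a": "relu", "b": "exp", "c": "matmul"}
edges = [("a", "b"), ("b", "c")]
print(find_sese_chains(nodes, edges))  # [['a', 'b']]
```

In a compiler built along these lines, each merged chain would be lowered to instructions targeting the first execution unit, while the remaining operators (such as the matmul in the example) would target the second execution unit.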
-
Publication No.: US12045611B1
Publication Date: 2024-07-23
Application No.: US18231024
Filing Date: 2023-08-07
Applicant: Amazon Technologies, Inc.
Inventor: Ron Diamant, Hongbin Zheng, Drazen Borkovic, Haichen Li
Abstract: In one example, a method comprises: receiving input codes, wherein the input codes represent a computational dataflow graph; traversing the computational dataflow graph to identify single-entry-single-exit (SESE) subgraphs of the computational dataflow graph, wherein each SESE subgraph has a sequence of nodes comprising a root node and a child node and representing a sequence of element-wise operators, wherein the root node receives a single input tensor, and wherein the child node outputs a single output tensor; determining a merged operator for each SESE subgraph; and generating executable instructions for the computational dataflow graph to be executed by a hardware accelerator having a first execution unit and a second execution unit, wherein the executable instructions comprise first executable instructions for the merged operators targeted at the first execution unit, and second executable instructions for other operators of the computational dataflow graph targeted at the second execution unit.
-
Publication No.: US10884707B1
Publication Date: 2021-01-05
Application No.: US16455201
Filing Date: 2019-06-27
Applicant: Amazon Technologies, Inc.
Inventor: Haichen Li, Ron Diamant, Jeffrey T. Huynh, Yu Zhou, Se jong Oh
Abstract: Provided are systems and methods for transposing a tensor using processing element array operations. In some cases, it may be necessary to transpose elements of a tensor to perform a matrix operation. The tensor may be decomposed into blocks of data elements having dimensions consistent with the dimensions of a systolic array. An identity multiplication may be performed on each block of data elements loaded into a systolic array and the multiplication products summed in column partitions of a results buffer. The data elements in the column partitions of the results buffer can then be mapped to row partitions of a buffer memory for further processing.
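The transpose-by-identity idea can be demonstrated numerically. The NumPy sketch below decomposes a matrix into PE_DIM x PE_DIM blocks, performs an identity multiplication on each block, and writes each block's column partitions out as row partitions of the destination buffer. The block size, buffer layout, and the final `.T` remapping step are assumptions chosen to mimic the dataflow the abstract describes, not the hardware's actual mechanics.

```python
# NumPy sketch of transposing a tensor block-by-block via an identity
# multiplication. PE_DIM and the buffer remapping are illustrative assumptions.
import numpy as np

PE_DIM = 4  # assumed systolic-array dimension (PE_DIM x PE_DIM)


def transpose_via_identity(tensor: np.ndarray) -> np.ndarray:
    rows, cols = tensor.shape
    assert rows % PE_DIM == 0 and cols % PE_DIM == 0, "pad to a multiple of PE_DIM first"
    identity = np.eye(PE_DIM, dtype=tensor.dtype)
    out = np.empty((cols, rows), dtype=tensor.dtype)
    for r in range(0, rows, PE_DIM):
        for c in range(0, cols, PE_DIM):
            block = tensor[r:r + PE_DIM, c:c + PE_DIM]
            # Identity multiplication on the "PE array": the products summed
            # per column partition simply reproduce the block.
            result = identity @ block
            # Mapping the column partitions of the results buffer to row
            # partitions of the destination buffer realizes the transpose.
            out[c:c + PE_DIM, r:r + PE_DIM] = result.T
    return out


x = np.arange(32, dtype=np.float32).reshape(4, 8)
assert np.array_equal(transpose_via_identity(x), x.T)
```

Here the identity multiplication leaves each block's values unchanged; the transpose comes entirely from reading the results buffer's column partitions back out as rows of the buffer memory.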
-
Publication No.: US12182688B2
Publication Date: 2024-12-31
Application No.: US16698236
Filing Date: 2019-11-27
Applicant: Amazon Technologies, Inc.
Inventor: Animesh Jain, Yizhi Liu, Hongbin Zheng, Jeffrey T. Huynh, Haichen Li, Drazen Borkovic, Jindrich Zejda, Richard John Heaton, Randy Renfu Huang, Zhi Chen, Yida Wang
Abstract: Methods and apparatuses for hierarchical partitioning of operators of a neural network for execution on an acceleration engine are provided. Neural networks are built in machine learning frameworks using neural network operators. The neural network operators are compiled into executable code for the acceleration engine. Development of new framework-level operators can outpace the capability to map the newly developed operators onto the acceleration engine. To enable neural networks to be executed on an acceleration engine, hierarchical partitioning can be used to partition the operators of the neural network. The hierarchical partitioning can identify operators that are supported by a compiler for execution on the acceleration engine, operators to be compiled for execution on a host processor, and operators to be executed on the machine learning framework.
-
Publication No.: US11347480B2
Publication Date: 2022-05-31
Application No.: US17122136
Filing Date: 2020-12-15
Applicant: Amazon Technologies, Inc.
Inventor: Haichen Li, Ron Diamant, Jeffrey T. Huynh, Yu Zhou, Se jong Oh
Abstract: Provided are integrated circuits and methods for transposing a tensor using processing element array operations. In some cases, it may be necessary to transpose elements of a tensor to perform a matrix operation. The tensor may be decomposed into blocks of data elements having dimensions consistent with the dimensions of a systolic array. An identity multiplication may be performed on each block of data elements loaded into a systolic array and the multiplication products summed in column partitions of a results buffer. The data elements in the column partitions of the results buffer can then be mapped to row partitions of a buffer memory for further processing.
-
Publication No.: US20210096823A1
Publication Date: 2021-04-01
Application No.: US17122136
Filing Date: 2020-12-15
Applicant: Amazon Technologies, Inc.
Inventor: Haichen Li, Ron Diamant, Jeffrey T. Huynh, Yu Zhou, Se jong Oh
Abstract: Provided are integrated circuits and methods for transposing a tensor using processing element array operations. In some cases, it may be necessary to transpose elements of a tensor to perform a matrix operation. The tensor may be decomposed into blocks of data elements having dimensions consistent with the dimensions of a systolic array. An identity multiplication may be performed on each block of data elements loaded into a systolic array and the multiplication products summed in column partitions of a results buffer. The data elements in the column partitions of the results buffer can then be mapped to row partitions of a buffer memory for further processing.
-