Patent search ap:("Amazon Technologies Page Inc.") AND inv:"Taemin Kim"

1.

Granted invention patent
Synchronization of concurrent computation engines (In force)

Publication No.: US11061654B1

Publication date: 2021-07-13

Application No.: US16217797

Filing date: 2018-12-12

Applicant: Amazon Technologies, Inc.

Inventor: Drazen Borkovic, Jindrich Zejda, Taemin Kim, Ron Diamant

IPC: G06F9/44, G06F8/41, G06F9/30, G06N3/02, G06F12/1081

Abstract: Provided are systems and methods for synchronizing program code execution for a plurality of execution engines in an integrated circuit device. In some cases, the operation of one execution engine may be dependent on the operation of another execution engine. To accommodate this dependency, the instructions for the first execution engine can include a set-event instruction and the instructions for the second execution engine can include a wait-on-event instruction. The wait-on-event instruction can cause the second execution engine to wait for the first execution engine to reach the set-event instruction. In this way, the two execution engines can be synchronized around the data or resource dependency.
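The set-event / wait-on-event pairing described in this abstract can be illustrated in software. The following is only a minimal sketch, assuming two Python threads stand in for the two execution engines and a `threading.Event` stands in for the hardware event; the names `run_engines`, `first_engine`, `second_engine`, and `buffer` are hypothetical and do not come from the patent.

```python
import threading

def run_engines():
    """Two threads stand in for two execution engines (hypothetical model)."""
    event = threading.Event()   # stands in for the hardware event
    buffer = {}                 # shared resource carrying the data dependency
    log = []

    def first_engine():
        buffer["x"] = 42                 # produce the data the second engine needs
        log.append("first: wrote x")
        event.set()                      # plays the role of the set-event instruction

    def second_engine():
        event.wait()                     # plays the role of the wait-on-event instruction
        log.append(f"second: read x={buffer['x']}")

    # Start the waiting engine first to show that it blocks until the event is set.
    t2 = threading.Thread(target=second_engine)
    t1 = threading.Thread(target=first_engine)
    t2.start()
    t1.start()
    t1.join()
    t2.join()
    return log

if __name__ == "__main__":
    for line in run_engines():
        print(line)
```

Because the second thread cannot pass `event.wait()` until the first has called `event.set()`, the read always follows the write regardless of start order.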

2.

Granted invention patent
Performing hardware operator fusion (In force)

Publication No.: US11809981B1

Publication date: 2023-11-07

Application No.: US16698753

Filing date: 2019-11-27

Applicant: Amazon Technologies, Inc.

Inventor: Animesh Jain, Tobias Joseph Kastulus Edler von Koch, Yizhi Liu, Taemin Kim, Jindrich Zejda, Yida Wang, Vinod Sharma, Richard John Heaton, Randy Renfu Huang

IPC: G06N3/063, G06F9/30, G06F9/54

CPC classification number: G06N3/063, G06F9/30007, G06F9/545

Abstract: A method of generating executable instructions for a computing system is provided. The method comprises: receiving a first set of instructions including a kernel of a first operator and a kernel of a second operator, the kernel of the first operator including instructions of the first operator and write instructions to a virtual data node, the kernel of the second operator including instructions of the second operator and read instructions to the virtual data node; determining, based on a mapping between the write instructions and read instructions, instructions of data transfer operations between the first operator and the second operator; and generating a second set of instructions representing a fused operator of the first operator and the second operator, the second set of instructions including the instructions of the first operator, the instructions of the second operator, and the instructions of the data transfer operations.
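The fusion step in this abstract can be sketched roughly as follows. This is an illustrative toy, not the patented method: instructions are modeled as tuples, and the encodings `("vwrite", src, idx)` and `("vread", dst, idx)` for writes to and reads from the virtual data node, the `"copy"` data-transfer instruction, and the function name `fuse` are all assumptions made for the example.

```python
def fuse(kernel_a, kernel_b):
    """Fuse two kernels that communicate through a virtual data node.

    Hypothetical encoding: ("vwrite", src, idx) writes location `src` to
    virtual-node slot `idx`; ("vread", dst, idx) reads slot `idx` into `dst`.
    """
    # Map each virtual-node slot to the location that wrote it.
    writes = {ins[2]: ins[1] for ins in kernel_a if ins[0] == "vwrite"}

    # Keep the first operator's own instructions, dropping virtual writes.
    fused = [ins for ins in kernel_a if ins[0] != "vwrite"]

    for ins in kernel_b:
        if ins[0] == "vread":
            _, dst, idx = ins
            # Matching write/read pair becomes a direct data transfer.
            fused.append(("copy", writes[idx], dst))
        else:
            fused.append(ins)
    return fused

if __name__ == "__main__":
    kernel_a = [("add", "a0", "in0", "in1"), ("vwrite", "a0", 0)]
    kernel_b = [("vread", "b0", 0), ("relu", "b1", "b0")]
    for ins in fuse(kernel_a, kernel_b):
        print(ins)
```

The fused list contains the first operator's instructions, the derived transfer, and the second operator's instructions, with the virtual node itself eliminated.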

3.

Granted invention patent
Compile-time scheduling (In force)

Publication No.: US11003429B1

Publication date: 2021-05-11

Application No.: US16266915

Filing date: 2019-02-04

Applicant: Amazon Technologies, Inc.

Inventor: Jindrich Zejda, Jeffrey T. Huynh, Tobias Joseph Kastulus Edler von Koch, Drazen Borkovic, Taemin Kim

IPC: G06F8/41, G06F16/901, G06F15/80

Abstract: Scheduling of the operations of an integrated circuit device such as a hardware accelerator, including scheduling of movement of data into and out of the accelerator, can be performed by a compiler that produces program code for the accelerator. The compiler can produce a graph that represents operations to be performed by the accelerator. Using the graph, the compiler can determine estimated execution times for the operations represented by each node in the graph. The compiler can schedule operations by determining an estimated execution time for a set of dependent operations that depend from an operation. The compiler can then select an operation that has a shortest estimated execution time from among a set of operations and which has a set of dependent operations that has a longest estimated execution time as compared to other sets of dependent operations.
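One possible reading of the selection rule in this abstract is sketched below. The abstract does not say how the two criteria are combined, so this example assumes the operation's own estimated time is compared first and the dependent set's time breaks ties; the graph layout, the `times` table, and the names `dependents_time` and `select_next` are hypothetical.

```python
def dependents_time(graph, times, op):
    """Total estimated time of every operation that transitively depends on `op`."""
    seen, stack = set(), list(graph.get(op, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, []))
    return sum(times[node] for node in seen)

def select_next(graph, times, ready):
    """Pick the ready operation with the shortest own time; among ties,
    prefer the one whose dependent set takes longest (assumed ordering)."""
    return min(ready, key=lambda op: (times[op], -dependents_time(graph, times, op)))

if __name__ == "__main__":
    # graph maps an operation to the operations that depend on it.
    graph = {"load_a": ["matmul"], "load_b": ["add"],
             "matmul": ["store"], "add": [], "store": []}
    times = {"load_a": 2, "load_b": 2, "matmul": 10, "add": 1, "store": 3}
    print(select_next(graph, times, ["load_b", "load_a"]))
```

Here both loads take the same estimated time, but `load_a` unblocks the longer chain (`matmul` then `store`), so it is selected first.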

4.

Granted invention patent
Loop-oriented neural network compilation (In force)

Publication No.: US11144291B1

Publication date: 2021-10-12

Application No.: US16698320

Filing date: 2019-11-27

Applicant: Amazon Technologies, Inc.

Inventor: Hongbin Zheng, Preston Pengra Briggs, Tobias Joseph Kastulus Edler von Koch, Taemin Kim, Randy Renfu Huang

IPC: G06F9/44, G06F8/41, G06N3/04

Abstract: Methods of accelerating the execution of neural networks are disclosed. A description of a neural network may be received. A plurality of operators may be identified based on the description of the neural network. A plurality of symbolic models associated with the plurality of operators may be generated. For each symbolic model, a nested loop associated with an operator may be identified, a loop order may be defined, and a set of data dependencies may be defined. A set of inter-operator dependencies may be extracted based on the description of the neural network. The plurality of symbolic models and the set of inter-operator dependencies may be analyzed to identify a combinable pair of nested loops. The combinable pair of nested loops may be combined to form a combined nested loop.
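The idea of a symbolic model per operator, and of combining a pair of nested loops, might be sketched as below. This is a simplified illustration under stated assumptions: the `LoopNest` record, the `combinable` test (identical loop order and bounds plus a producer/consumer tensor in common), and the `combine` function are hypothetical, and the sketch assumes the consumer reads each element at the same iteration that the producer writes it.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LoopNest:
    """Hypothetical symbolic model of one operator's nested loop."""
    name: str
    order: tuple       # loop variables, outermost first
    bounds: tuple      # trip count per loop variable
    reads: frozenset   # tensors read
    writes: frozenset  # tensors written

def combinable(producer, consumer):
    """Assumed criterion: same loop structure and an inter-operator dependency."""
    same_shape = (producer.order == consumer.order
                  and producer.bounds == consumer.bounds)
    dependent = bool(producer.writes & consumer.reads)
    return same_shape and dependent

def combine(producer, consumer):
    """Merge two combinable nests into one nest over the shared loops."""
    internal = producer.writes & consumer.reads   # values passed between the two
    return LoopNest(
        name=f"{producer.name}+{consumer.name}",
        order=producer.order,
        bounds=producer.bounds,
        reads=(producer.reads | consumer.reads) - internal,
        writes=producer.writes | consumer.writes,
    )

if __name__ == "__main__":
    scale = LoopNest("scale", ("i", "j"), (4, 8), frozenset({"x"}), frozenset({"t"}))
    relu = LoopNest("relu", ("i", "j"), (4, 8), frozenset({"t"}), frozenset({"y"}))
    if combinable(scale, relu):
        print(combine(scale, relu))
```

A real compiler would need a finer dependence analysis than set intersection, but the sketch shows how per-operator models and inter-operator dependencies jointly decide whether two nests can share one loop.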

5.

Granted invention patent
Synchronization of computation engines with non-blocking instructions (In force)

Publication No.: US10761822B1

Publication date: 2020-09-01

Application No.: US16217858

Filing date: 2018-12-12

Applicant: Amazon Technologies, Inc.

Inventor: Drazen Borkovic, Jindrich Zejda, Taemin Kim, Ron Diamant

IPC: G06F11/36, G06F17/10, G06F3/048, G06F13/40, G06F17/50, G06F9/30, G06F12/00, G06F8/41, G06N3/02, G06F12/1081, G06F12/06, G06F12/0888, G06F8/34, G06F9/50, G06F9/455

Abstract: Provided are systems and methods for generating program code for an integrated circuit, where instructions in the code synchronize computation engines that support non-blocking instructions. In various examples, a computing device can receive an input data set including operations to be performed by an integrated circuit device and dependencies between the operations. The input data set can include a non-blocking instruction, and an operation that requires that the non-blocking instruction be completed. The computing device can generate instructions for performing the operation including a particular instruction to wait for a value to be set in a register of the integrated circuit device. The computing device can further generate program code including the non-blocking instruction and the instructions for performing the operation, wherein the non-blocking instruction is configured to set the value in the register.
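The code-generation flow in this abstract can be sketched as a toy generator. This is an assumption-laden illustration only: the input format (`ops` as name and non-blocking flag, `deps` as a mapping from an operation to the non-blocking instruction it needs), the emitted tuple encodings `"on_complete_set"` and `"wait_reg"`, and the function name `generate` are all hypothetical.

```python
def generate(ops, deps):
    """Emit a toy program with register-based synchronization.

    ops:  list of (name, is_non_blocking) in program order
    deps: {operation: non-blocking instruction that must complete first}
    """
    program, reg_of = [], {}
    for name, non_blocking in ops:
        if name in deps:
            # Wait until the register assigned to the dependency holds 1.
            program.append(("wait_reg", reg_of[deps[name]], 1))
        if non_blocking:
            # Assign a register and configure the instruction to set it on completion.
            reg_of[name] = len(reg_of)
            program.append((name, "on_complete_set", reg_of[name], 1))
        else:
            program.append((name,))
    return program

if __name__ == "__main__":
    ops = [("dma_load", True), ("matmul", False)]
    deps = {"matmul": "dma_load"}
    for ins in generate(ops, deps):
        print(ins)
```

The sketch assumes every dependency names a non-blocking instruction that appears earlier in `ops`; a fuller generator would validate that and recycle registers.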
