Patent search ap:("Amazon Technologies Page Inc.") AND inv:"Drazen Borkovic"

31.

Granted patent
Neural network layer-by-layer debugging (status: in force)

Publication number: US11308396B2

Publication date: 2022-04-19

Application number: US16455329

Filing date: 2019-06-27

Applicant: Amazon Technologies, Inc.

Inventor: Jindrich Zejda, Jeffrey T. Huynh, Drazen Borkovic, Se jong Oh, Ron Diamant, Randy Renfu Huang

IPC: G06N3/08, G06F9/38

Abstract: Techniques are disclosed for debugging a neural network execution on a target processor. A reference processor may generate a plurality of first reference tensors for the neural network. The neural network may be repeatedly reduced to produce a plurality of lengths. For each of the lengths, a compiler converts the neural network into first machine instructions, the target processor executes the first machine instructions to generate a first device tensor, and the debugger program determines whether the first device tensor matches a first reference tensor. A shortest length is identified for which the first device tensor does not match the first reference tensor. Tensor output is enabled for a lower-level intermediate representation of the shortest neural network, and the neural network is converted into second machine instructions, which are executed by the target processor to generate a second device tensor.
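The search for the shortest failing length described in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: layers are modeled as plain Python functions on a scalar, `run_prefix` stands in for both the reference processor and the compile-and-execute step on the target processor, and the candidate lengths are simply every prefix length. The patent itself may select lengths differently.

```python
from typing import Callable, Optional, Sequence

# Hypothetical stand-in: a "layer" is a function on a scalar "tensor".
Layer = Callable[[float], float]


def run_prefix(layers: Sequence[Layer], length: int, x: float) -> float:
    """Execute only the first `length` layers and return the output."""
    for layer in layers[:length]:
        x = layer(x)
    return x


def shortest_mismatch(
    reference: Sequence[Layer],
    device: Sequence[Layer],
    x: float,
    tol: float = 1e-6,
) -> Optional[int]:
    """Return the smallest prefix length whose device output differs
    from the reference output, or None if every length matches."""
    mismatching = [
        n
        for n in range(1, len(reference) + 1)
        if abs(run_prefix(reference, n, x) - run_prefix(device, n, x)) > tol
    ]
    return min(mismatching) if mismatching else None


# Toy four-layer network; the "device" copy has a fault injected in layer 3.
reference_layers = [
    lambda v: v + 1.0,
    lambda v: v * 2.0,
    lambda v: v - 3.0,
    lambda v: v * v,
]
device_layers = list(reference_layers)
device_layers[2] = lambda v: v - 3.5  # injected fault (assumption for the demo)

first_bad_length = shortest_mismatch(reference_layers, device_layers, x=1.0)
print(first_bad_length)
```

In this toy setup the shortest mismatching length points at the faulty layer, which is where the abstract's second step (enabling tensor output for a lower-level intermediate representation) would then be applied.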

32.

Patent application
NEURAL NETWORK OPERATION REORDERING FOR PARALLEL EXECUTION (status: in force)

Publication number: US20210247984A1

Publication date: 2021-08-12

Application number: US17243415

Filing date: 2021-04-28

Applicant: Amazon Technologies, Inc.

Inventor: Jeffrey T. Huynh, Drazen Borkovic, Jindrich Zejda, Randy Renfu Huang, Ron Diamant

IPC: G06F9/38, G06F9/50, G06N3/08, G06N3/04

Abstract: Techniques are disclosed for reordering operations of a neural network to improve runtime efficiency. In some examples, a compiler receives a description of the neural network comprising a plurality of operations. The compiler may determine which execution engine of a plurality of execution engines is to perform each of the plurality of operations. The compiler may determine an order of performance associated with the plurality of operations. The compiler may identify a runtime inefficiency based on the order of performance and a hardware usage for each of the plurality of operations. An operation may be reordered to reduce the runtime inefficiency. Instructions may be compiled based on the plurality of operations, which include the reordered operation.
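As a rough illustration of the reordering idea in this abstract, the sketch below assigns each operation to an engine, measures total runtime for a given program order, and swaps adjacent operations when that shortens the runtime without violating dependencies. The `Op` record, the in-order-per-engine timing model, the engine names, and the greedy adjacent-swap heuristic are all assumptions chosen for brevity, not the method claimed in the patent.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Op:
    name: str
    engine: str               # execution engine chosen for this operation
    duration: int             # hypothetical cost in cycles
    deps: Tuple[str, ...] = ()


def makespan(order: List[Op]) -> int:
    """Total runtime when each engine runs its operations in program order."""
    engine_free: Dict[str, int] = {}
    finish: Dict[str, int] = {}
    for op in order:
        ready = max((finish[d] for d in op.deps), default=0)
        start = max(engine_free.get(op.engine, 0), ready)
        finish[op.name] = start + op.duration
        engine_free[op.engine] = finish[op.name]
    return max(finish.values(), default=0)


def is_valid(order: List[Op]) -> bool:
    """True if every operation appears after all of its dependencies."""
    seen = set()
    for op in order:
        if any(d not in seen for d in op.deps):
            return False
        seen.add(op.name)
    return True


def reorder(order: List[Op]) -> List[Op]:
    """Greedily swap adjacent operations while that reduces the makespan."""
    best = list(order)
    improved = True
    while improved:
        improved = False
        for i in range(len(best) - 1):
            candidate = best[:]
            candidate[i], candidate[i + 1] = candidate[i + 1], candidate[i]
            if is_valid(candidate) and makespan(candidate) < makespan(best):
                best = candidate
                improved = True
    return best


program = [
    Op("A", "pe", 4),
    Op("B", "act", 2, deps=("A",)),  # stalls the "act" engine until A is done
    Op("C", "act", 3),
    Op("D", "pe", 1, deps=("C",)),
]
reordered = reorder(program)
print(makespan(program), [op.name for op in reordered], makespan(reordered))
```

Here the inefficiency is that `B` blocks the `act` engine while waiting on `A`, even though the independent operation `C` could run in that gap; moving `C` ahead of `B` removes the stall.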

33.

Granted patent
Synchronization of computation engines with non-blocking instructions (status: in force)

Publication number: US10761822B1

Publication date: 2020-09-01

Application number: US16217858

Filing date: 2018-12-12

Applicant: Amazon Technologies, Inc.

Inventor: Drazen Borkovic, Jindrich Zejda, Taemin Kim, Ron Diamant

IPC: G06F11/36, G06F17/10, G06F3/048, G06F13/40, G06F17/50, G06F9/30, G06F12/00, G06F8/41, G06N3/02, G06F12/1081, G06F12/06, G06F12/0888, G06F8/34, G06F9/50, G06F9/455

Abstract: Provided are systems and methods for generating program code for an integrated circuit, where instructions in the code synchronize computation engines that support non-blocking instructions. In various examples, a computing device can receive an input data set including operations to be performed by an integrated circuit device and dependencies between the operations. The input data set can include a non-blocking instruction, and an operation that requires that the non-blocking instruction be completed. The computing device can generate instructions for performing the operation including a particular instruction to wait for a value to be set in a register of the integrated circuit device. The computing device can further generate program code including the non-blocking instruction and the instructions for performing the operation, wherein the non-blocking instruction is configured to set the value in the register.
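The code-generation step in this abstract can be sketched as a small generator that tags each non-blocking instruction with a register to set on completion and emits a wait instruction before any operation that depends on it. The instruction mnemonics (`DMA_LOAD`, `WAIT`, and so on), the register naming, and the `Operation` record are hypothetical; the abstract does not specify an instruction format.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Operation:
    name: str                  # hypothetical instruction text
    non_blocking: bool = False
    deps: Tuple[str, ...] = ()


def generate_program(ops: List[Operation]) -> List[str]:
    """Emit instructions in input order, inserting register-based waits."""
    register_of: Dict[str, int] = {}  # non-blocking op name -> register index
    program: List[str] = []
    for op in ops:
        # Wait for each non-blocking dependency to set its register.
        for dep in op.deps:
            if dep in register_of:
                program.append(f"WAIT r{register_of[dep]} == 1")
        if op.non_blocking:
            # Allocate a fresh register that the instruction sets when done.
            reg = len(register_of)
            register_of[op.name] = reg
            program.append(f"{op.name} [on completion: r{reg} := 1]")
        else:
            program.append(op.name)
    return program


ops = [
    Operation("DMA_LOAD weights", non_blocking=True),
    Operation("ACTIVATE bias"),  # independent, so it runs without waiting
    Operation("MATMUL weights", deps=("DMA_LOAD weights",)),
]
program = generate_program(ops)
for line in program:
    print(line)
```

Only the dependent operation pays for the synchronization: the independent `ACTIVATE` is emitted with no wait, so it can overlap with the outstanding load.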
