-
Publication No.: US12008469B1
Publication Date: 2024-06-11
Application No.: US17009517
Application Date: 2020-09-01
Applicant: Amazon Technologies, Inc.
Inventor: Thiam Khean Hah , Randy Renfu Huang , Richard John Heaton , Ron Diamant , Vignesh Vivekraja
Abstract: A single neural network model can be used by each computing engine (CE) in a neural network processor to perform convolution operations in parallel for one or more stacks of convolutional layers. An input feature map can be divided into N chunks to be processed by N CEs, respectively. Each CE can process a last portion of a respective chunk to generate respective shared states to be used by a subsequent CE. A first CE uses pre-computed states to generate a first portion of an output feature map, while other CEs use shared states computed by a preceding CE to generate respective portions of the output feature map.
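The chunking scheme can be pictured with a plain 1-D valid convolution — a minimal NumPy sketch, not the patent's implementation: the serial loop and the direct reads of boundary samples stand in for parallel CEs exchanging shared states, and all names are illustrative.

```python
import numpy as np

def conv1d_chunked(x, kernel, n_engines):
    """Valid 1-D convolution split into n_engines chunks.

    Each chunk's last (k - 1) input samples are exactly the "shared
    state" the next chunk needs; here every worker simply reads them
    from x, whereas the patent has each CE hand them to its successor.
    """
    k = len(kernel)
    out_len = len(x) - k + 1
    out_idx = np.array_split(np.arange(out_len), n_engines)
    parts = []
    for idx in out_idx:
        seg = x[idx[0]: idx[-1] + k]   # includes the halo / shared state
        parts.append(np.convolve(seg, kernel, mode="valid"))
    return np.concatenate(parts)

# The chunked result matches the monolithic convolution:
x = np.random.rand(64)
w = np.random.rand(5)
assert np.allclose(conv1d_chunked(x, w, 4), np.convolve(x, w, mode="valid"))
```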
-
Publication No.: US11948352B2
Publication Date: 2024-04-02
Application No.: US16831060
Application Date: 2020-03-26
Applicant: Amazon Technologies, Inc.
Inventor: Patricio Kaplan , Randy Renfu Huang
CPC classification number: G06V10/955 , G06N3/063 , G06N3/084 , G06N5/046 , G06N20/00 , G06V10/764 , G06V10/82
Abstract: The exchange of weight gradients among the processing nodes can introduce a substantial bottleneck to the training process. Instead of remaining idle during the weight-gradient exchange, a processing node can update its own set of weights for the next iteration of the training process using its local weight gradients. The next iteration of training can be started using these speculative weights until the exchange completes and a global weights update is available. If the speculative weights are close enough to the weight values from the global weights update, the processing node can continue training using the results computed from the speculative weights, reducing the overall training time.
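A minimal sketch of the speculative-weights idea, assuming a synchronous `all_reduce` callable and illustrative `lr`/`tol` values (the patent's closeness test is not spelled out in the abstract):

```python
import numpy as np

def speculative_step(w, local_grad, all_reduce, lr=0.01, tol=1e-3):
    """One training step using speculative weights (hypothetical sketch)."""
    w_spec = w - lr * local_grad          # update from local gradients only;
                                          # the next iteration starts from w_spec
    global_grad = all_reduce(local_grad)  # meanwhile the exchange completes
    w_global = w - lr * global_grad
    if np.max(np.abs(w_spec - w_global)) <= tol:
        return w_spec, True    # close enough: keep the speculative results
    return w_global, False     # diverged: redo the iteration from w_global

# e.g. with two simulated nodes whose gradients nearly agree:
grads = [np.array([0.20, -0.10]), np.array([0.22, -0.09])]
mean_reduce = lambda _: sum(grads) / len(grads)
w, kept = speculative_step(np.zeros(2), grads[0], mean_reduce)
assert kept  # speculative work is retained, hiding the exchange latency
```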
-
Publication No.: US11687761B2
Publication Date: 2023-06-27
Application No.: US16216485
Application Date: 2018-12-11
Applicant: Amazon Technologies, Inc.
Inventor: Randy Renfu Huang , Richard John Heaton , Andrea Olgiati , Ron Diamant
IPC: G06N3/045 , G06N3/04 , G06N3/08 , G06F18/214
CPC classification number: G06N3/045 , G06F18/214 , G06N3/04 , G06N3/08
Abstract: Systems and methods for performing improper input data detection are described. In one example, a system comprises: hardware circuits configured to receive input data and to perform computations of a neural network based on the input data to generate computation outputs; and an improper input detection circuit configured to: determine a relationship between the computation outputs of the hardware circuits and reference outputs; determine that the input data are improper based on the relationship; and perform an action based on determining that the input data are improper.
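One way to picture the detection circuit in software — the L2 distance, the threshold, and the raised exception are all assumptions standing in for the patent's "relationship" and "action":

```python
import numpy as np

def check_input(computation_outputs, reference_outputs, threshold=0.5):
    """Flag improper input data (illustrative sketch).

    The "relationship" here is an assumed L2 distance between the
    hardware circuits' computation outputs and stored reference outputs.
    """
    distance = np.linalg.norm(computation_outputs - reference_outputs)
    if distance > threshold:
        raise ValueError("input data judged improper")  # the "action"
    return distance
```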
-
Publication No.: US20230196113A1
Publication Date: 2023-06-22
Application No.: US18112036
Application Date: 2023-02-21
Applicant: Amazon Technologies, Inc.
Inventor: Sudipta Sengupta , Randy Renfu Huang , Ron Diamant , Vignesh Vivekraja
IPC: G06N3/04
Abstract: Methods and systems for training a neural network are provided. In one example, an apparatus comprises a memory that stores instructions; and a hardware processor configured to execute the instructions to: control a neural network processor to perform a loss gradient operation to generate data gradients; after the loss gradient operation completes, control the neural network processor to perform a forward propagation operation to generate intermediate outputs; control the neural network processor to perform a backward propagation operation based on the data gradients and the intermediate outputs to generate weight gradients; receive the weight gradients from the neural network processor; and update weights of a neural network based on the weight gradients.
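The claimed ordering — loss gradient first, then a forward propagation that regenerates the intermediate outputs, then backward propagation — can be sketched against a hypothetical `processor` handle (all method names are assumptions):

```python
def train_iteration(processor, batch, labels, weights, lr=0.01):
    # `processor` is a hypothetical handle to the neural network processor.
    data_grads = processor.loss_gradient(batch, labels, weights)
    # Only after the loss-gradient operation completes is forward
    # propagation run, regenerating the intermediate outputs rather
    # than holding them in memory across the whole iteration.
    intermediates = processor.forward(batch, weights)
    weight_grads = processor.backward(data_grads, intermediates)
    # The host receives the weight gradients and updates the weights.
    return [w - lr * g for w, g in zip(weights, weight_grads)]
```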
-
Publication No.: US20230153620A1
Publication Date: 2023-05-18
Application No.: US18154576
Application Date: 2023-01-13
Applicant: Amazon Technologies, Inc.
Inventor: Randy Renfu Huang , Ron Diamant , Richard John Heaton
Abstract: A computer-implemented method includes receiving a neural network model that includes a tensor operation, dividing the tensor operation into a set of sub-operations, and generating instructions for performing a plurality of sub-operations of the set of sub-operations on respective computing engines of a plurality of computing engines on a same integrated circuit device or on different integrated circuit devices. Each sub-operation of the set of sub-operations generates a portion of a final output of the tensor operation. An inference is made based on a result of a sub-operation of the plurality of sub-operations, or based on results of the plurality of sub-operations.
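A minimal sketch of the division step using a matrix multiplication, with a thread pool standing in for the computing engines; each row-block sub-operation produces a disjoint portion of the final output:

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def matmul_subops(a, b, n_engines=4):
    """Divide C = A @ B into row-block sub-operations (sketch)."""
    blocks = np.array_split(a, n_engines, axis=0)
    with ThreadPoolExecutor(max_workers=n_engines) as pool:
        parts = list(pool.map(lambda blk: blk @ b, blocks))
    return np.vstack(parts)  # portions reassemble into the final output

a, b = np.random.rand(8, 6), np.random.rand(6, 5)
assert np.allclose(matmul_subops(a, b), a @ b)
```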
-
Publication No.: US11610128B2
Publication Date: 2023-03-21
Application No.: US16836421
Application Date: 2020-03-31
Applicant: Amazon Technologies, Inc.
Inventor: Sudipta Sengupta , Randy Renfu Huang , Ron Diamant , Vignesh Vivekraja
Abstract: Methods and systems for training a neural network are provided. In one example, an apparatus comprises a memory that stores instructions; and a hardware processor configured to execute the instructions to: control a neural network processor to perform a loss gradient operation to generate data gradients; after the loss gradient operation completes, control the neural network processor to perform a forward propagation operation to generate intermediate outputs; control the neural network processor to perform a backward propagation operation based on the data gradients and the intermediate outputs to generate weight gradients; receive the weight gradients from the neural network processor; and update weights of a neural network based on the weight gradients.
-
Publication No.: US11567778B2
Publication Date: 2023-01-31
Application No.: US17243415
Application Date: 2021-04-28
Applicant: Amazon Technologies, Inc.
Inventor: Jeffrey T. Huynh , Drazen Borkovic , Jindrich Zejda , Randy Renfu Huang , Ron Diamant
Abstract: Techniques are disclosed for reordering operations of a neural network to improve runtime efficiency. In some examples, a compiler receives a description of the neural network comprising a plurality of operations. The compiler may determine which execution engine of a plurality of execution engines is to perform each of the plurality of operations. The compiler may determine an order of performance associated with the plurality of operations. The compiler may identify a runtime inefficiency based on the order of performance and a hardware usage for each of the plurality of operations. An operation may be reordered to reduce the runtime inefficiency. Instructions may be compiled based on the plurality of operations, which include the reordered operation.
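A greedy toy version of the reordering pass — the patent's cost model based on hardware usage is richer than this engine-alternation heuristic, and the data structures are assumptions:

```python
def reorder(ops, deps):
    """Emit a dependency-respecting order that prefers alternating engines.

    ops:  list of (name, engine) pairs in original program order.
    deps: dict mapping an op name to the set of names it depends on.
    """
    done, order, prev_engine = set(), [], None
    remaining = list(ops)
    while remaining:
        ready = [op for op in remaining if deps.get(op[0], set()) <= done]
        # Prefer an op on a different engine than the previous one,
        # so no engine sits idle while work for it waits in the queue.
        pick = next((op for op in ready if op[1] != prev_engine), ready[0])
        remaining.remove(pick)
        done.add(pick[0])
        order.append(pick)
        prev_engine = pick[1]
    return order

ops = [("load1", "dma"), ("load2", "dma"), ("mm1", "pe"), ("mm2", "pe")]
deps = {"mm1": {"load1"}, "mm2": {"load2"}}
print(reorder(ops, deps))  # interleaves: load1, mm1, load2, mm2
```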
-
Publication No.: US11561833B1
Publication Date: 2023-01-24
Application No.: US16021866
Application Date: 2018-06-28
Applicant: Amazon Technologies, Inc.
Inventor: Richard John Heaton , Randy Renfu Huang , Drazen Borkovic , Jindrich Zejda
Abstract: Techniques for operating a computing system to perform neural network operations are disclosed. In one example, a method comprises receiving a neural network model, determining a sequence of neural network operations based on data dependency in the neural network model, and determining a set of instructions to map the sequence of neural network operations to the processing resources of the neural network processor. The method further comprises determining, based on a set of memory access operations included in the set of instructions, a first set of memory references associated with a first location of an external memory to store the input data and a second set of memory references associated with a second location of the external memory to store the output data, and generating an instruction file including the set of instructions, the first set of memory references and the second set of memory references.
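The instruction file's shape can be sketched as follows; the address layout, stride, and textual instruction encoding are illustrative assumptions, not the patent's format:

```python
from dataclasses import dataclass

@dataclass
class InstructionFile:
    instructions: list[str]
    input_refs: list[int]    # first set: external-memory locations for input data
    output_refs: list[int]   # second set: external-memory locations for output data

def build_instruction_file(op_sequence, in_base=0x1000, out_base=0x8000, stride=0x100):
    # Hypothetical layout: one external-memory slot per mapped operation.
    instructions = [f"exec {op}" for op in op_sequence]
    input_refs = [in_base + i * stride for i in range(len(op_sequence))]
    output_refs = [out_base + i * stride for i in range(len(op_sequence))]
    return InstructionFile(instructions, input_refs, output_refs)
```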
-
Publication No.: US11461662B1
Publication Date: 2022-10-04
Application No.: US16829887
Application Date: 2020-03-25
Applicant: Amazon Technologies, Inc.
Inventor: Hongbin Zheng , Randy Renfu Huang , Richard John Heaton
Abstract: Techniques for reducing a compilation time for compiling a neural network are disclosed. A description of a neural network is received by a compiler. A plurality of operators are identified based on the description of the neural network. A plurality of subgraphs are formed, each including one or more operators. For each subgraph, a performance factor is calculated based on a compute usage and a memory usage associated with the operators included in the subgraph. The performance factor is compared to a threshold. Based on the comparison, either the subgraph is classified as a compute bound subgraph and a set of memory optimizations are suppressed or the subgraph is classified as a memory bound subgraph and a set of compute optimizations are suppressed.
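The performance factor is not spelled out in the abstract; a roofline-style ratio of compute time to memory time is one plausible sketch:

```python
def classify_subgraph(flops, bytes_moved, peak_flops_s, peak_bytes_s):
    """Classify a subgraph so one set of optimizations can be suppressed.

    The exact performance factor is an assumption: here it compares the
    subgraph's compute time against its memory time on the target hardware.
    """
    compute_time = flops / peak_flops_s
    memory_time = bytes_moved / peak_bytes_s
    factor = compute_time / memory_time
    # factor > 1: compute dominates, so skip memory optimizations;
    # otherwise memory dominates, so skip compute optimizations.
    return "compute-bound" if factor > 1.0 else "memory-bound"
```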
-
Publication No.: US20210304008A1
Publication Date: 2021-09-30
Application No.: US16831060
Application Date: 2020-03-26
Applicant: Amazon Technologies, Inc.
Inventor: Patricio Kaplan , Randy Renfu Huang
Abstract: The exchange of weight gradients among the processing nodes can introduce a substantial bottleneck to the training process. Instead of remaining idle during the weight-gradient exchange, a processing node can update its own set of weights for the next iteration of the training process using its local weight gradients. The next iteration of training can be started using these speculative weights until the exchange completes and a global weights update is available. If the speculative weights are close enough to the weight values from the global weights update, the processing node can continue training using the results computed from the speculative weights, reducing the overall training time.