-
Publication No.: US11983128B1
Publication Date: 2024-05-14
Application No.: US18067109
Filing Date: 2022-12-16
Applicant: Amazon Technologies, Inc.
Inventor: Kun Xu , Ron Diamant , Ilya Minkin , Mohammad El-Shabani , Raymond S. Whiteside , Uday Shilton Udayaselvam
CPC classification number: G06F13/30 , G06F13/1621 , G06F13/1642
Abstract: Techniques to reduce overhead in a direct memory access (DMA) engine can include processing descriptors from a descriptor queue to obtain a striding configuration to generate tensorized memory descriptors. The striding configuration can include, for each striding dimension, a stride and a repetition number indicating a number of times to repeat striding in the corresponding striding dimension. One or more sets of tensorized memory descriptors can be generated based on the striding configuration. Data transfers are then performed based on the generated tensorized memory descriptors.
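The striding expansion described in the abstract can be sketched as follows. This is a minimal illustrative model, not the patented implementation: the function name, the `(stride, repeat)` pair representation, and the `(address, length)` descriptor tuples are all assumptions made for clarity.

```python
def tensorize_descriptors(base_addr, length, striding_config):
    """Expand one template descriptor into a set of tensorized
    memory descriptors.

    striding_config is a list of (stride, repeat) pairs, one per
    striding dimension: `stride` is the address offset applied at
    each step, and `repeat` is the number of times striding is
    repeated in that dimension.
    """
    addrs = [base_addr]
    for stride, repeat in striding_config:
        # Each dimension multiplies the descriptor count by `repeat`,
        # offsetting every existing address by i * stride.
        addrs = [a + i * stride for i in range(repeat) for a in addrs]
    # Each resulting address becomes one memory descriptor.
    return [(a, length) for a in addrs]

# Two striding dimensions: 2 steps of 0x20, then 2 steps of 0x100,
# turning a single descriptor into a 2x2 set of four.
descriptors = tensorize_descriptors(0x1000, 16, [(0x20, 2), (0x100, 2)])
```

A hardware DMA engine would walk an equivalent nested loop directly rather than materializing the list, which is what lets one queue entry describe many transfers.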
-
Publication No.: US11507378B1
Publication Date: 2022-11-22
Application No.: US17188548
Filing Date: 2021-03-01
Applicant: Amazon Technologies, Inc.
Inventor: Ron Diamant , Sundeep Amirineni , Mohammad El-Shabani , Sagar Sonar , Kenneth Wayne Patton
Abstract: In one example, an integrated circuit comprises: a memory configured to store a first mapping between a first opcode and first control information and a second mapping between the first opcode and second control information; a processing engine configured to perform processing operations based on the control information; and a controller configured to: at a first time, provide the first opcode to the memory to, based on the first mapping stored in the memory, fetch the first control information for the processing engine, to enable the processing engine to perform a first processing operation based on the first control information; and at a second time, provide the first opcode to the memory to, based on the second mapping stored in the memory, fetch the second control information for the processing engine, to enable the processing engine to perform a second processing operation based on the second control information.
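The opcode-remapping idea above can be modeled with a small sketch. The class, its method names, and the dictionary-based control information are hypothetical stand-ins for the hardware memory and controller described in the abstract.

```python
class InstructionMemory:
    """Toy model of the memory that maps an opcode to control
    information. The mapping can be reprogrammed, so the same
    opcode fetches different control information at different
    times."""

    def __init__(self):
        self._table = {}

    def program(self, opcode, control_info):
        # Install (or replace) the mapping for this opcode.
        self._table[opcode] = control_info

    def fetch(self, opcode):
        # The processing engine performs whatever operation the
        # currently mapped control information specifies.
        return self._table[opcode]

mem = InstructionMemory()
mem.program(0x1, {"op": "matmul", "precision": "fp16"})
first = mem.fetch(0x1)   # first mapping: fp16 operation
mem.program(0x1, {"op": "matmul", "precision": "fp32"})
second = mem.fetch(0x1)  # same opcode now yields different control info
```

The point of the indirection is that a short opcode stream stays fixed while the behavior behind each opcode is swapped by reprogramming the mapping memory.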
-
Publication No.: US11461622B2
Publication Date: 2022-10-04
Application No.: US16457268
Filing Date: 2019-06-28
Applicant: Amazon Technologies, Inc.
Inventor: Samuel Jacob , Ilya Minkin , Mohammad El-Shabani
Abstract: Embodiments include techniques for enabling execution of N inferences on an execution engine of a neural network device. Instruction code for a single inference is stored in a memory that is accessible by a DMA engine, the instruction code forming a regular code block. A NOP code block and a reset code block for resetting an instruction DMA queue are stored in the memory. The instruction DMA queue is generated such that, when it is executed by the DMA engine, it causes the DMA engine to copy, for each of N inferences, both the regular code block and an additional code block to an instruction buffer. The additional code block is the NOP code block for the first N−1 inferences and is the reset code block for the Nth inference. When the reset code block is executed by the execution engine, the instruction DMA queue is reset.
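The queue construction for N inferences can be illustrated schematically. The function and the list-of-strings representation of code blocks are assumptions for the sketch; the real queue holds DMA copy descriptors, not instruction text.

```python
def build_dma_queue(n, regular_block, nop_block, reset_block):
    """Build the instruction DMA queue for n inferences.

    Each inference copies the regular code block plus one extra
    block into the instruction buffer: a NOP block for the first
    n-1 inferences, and a reset block (which re-arms the DMA
    queue for the next batch) for the nth inference.
    """
    queue = []
    for i in range(n):
        extra = reset_block if i == n - 1 else nop_block
        queue.append(regular_block + extra)
    return queue

# Three inferences: the reset block appears only on the last one.
queue = build_dma_queue(3, ["REGULAR"], ["NOP"], ["RESET"])
```

Because the regular and NOP blocks are identical for every inference, only the final entry differs, which is what allows the queue itself to be reset and reused without driver involvement.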
-
Publication No.: US10664282B1
Publication Date: 2020-05-26
Application No.: US16266731
Filing Date: 2019-02-04
Applicant: Amazon Technologies, Inc.
Inventor: Ilya Minkin , Ron Diamant , Mohammad El-Shabani , Dana Michelle Vantrease
Abstract: Methods for repeated execution of program code by an execution engine are provided. In order to execute large programs, the instruction buffer of an execution engine may be refilled many times with program code to complete one execution of the program. At completion of program execution, the program code needed to begin re-execution of the program is no longer in the instruction buffer. A runtime driver program can load instructions into the instruction buffer, or can cause instructions to be loaded. Once the instructions are loaded, the execution engine may be able to re-execute the instructions without needing further assistance from the runtime driver.
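The buffer-refill-and-re-execute behavior can be sketched with a toy model. The class, its attributes, and the chunked loop are illustrative assumptions; a real engine fetches DMA-copied instructions from hardware buffers.

```python
class ExecutionEngine:
    """Toy model of an engine whose instruction buffer holds only
    `buffer_size` instructions, so a long program runs by refilling
    the buffer in chunks from program memory."""

    def __init__(self, buffer_size):
        self.buffer_size = buffer_size
        self.program = []
        self.executed = []

    def load(self, program):
        # Done once by the runtime driver.
        self.program = list(program)

    def run(self):
        # Refill the buffer chunk by chunk until the program
        # completes; no driver assistance is needed, so calling
        # run() again re-executes the whole program.
        for start in range(0, len(self.program), self.buffer_size):
            buffer = self.program[start:start + self.buffer_size]
            self.executed.extend(buffer)

engine = ExecutionEngine(buffer_size=2)
engine.load(["i0", "i1", "i2", "i3", "i4"])
engine.run()  # first execution: buffer refilled three times
engine.run()  # re-execution without reloading from the driver
```

The key property matches the abstract: once the driver has loaded (or arranged loading of) the instructions, subsequent executions proceed without it.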
-