Accelerated quantized multiply-and-add operations

    Publication number: US10678508B2

    Publication date: 2020-06-09

    Application number: US15934681

    Filing date: 2018-03-23

    Abstract: Disclosed herein are techniques for accelerating convolution operations or other matrix multiplications in applications such as neural networks. A computer-implemented method includes receiving low-precision inputs for a convolution operation from a storage device, and subtracting a low-precision value representing a high-precision zero value from the low-precision inputs to generate difference values, where the low-precision inputs are asymmetrically quantized from high-precision inputs. The method also includes performing multiplication and summation operations on the difference values to generate a sum of products, and generating a high-precision output by scaling the sum of products with a scaling factor.
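    The flow in the abstract (subtract zero points, integer multiply-accumulate, then rescale) can be sketched in NumPy. This is a minimal illustration of asymmetric-quantization arithmetic, not the patented hardware: the function name, the treatment of both operands symmetrically, and the specific dtypes are assumptions.

```python
import numpy as np

def quantized_matmul(a_q, b_q, a_zero, b_zero, a_scale, b_scale):
    # Subtract the low-precision zero points (the low-precision values that
    # represent high-precision zero) to form signed difference values.
    a_diff = a_q.astype(np.int32) - a_zero
    b_diff = b_q.astype(np.int32) - b_zero
    # Integer multiply-and-accumulate produces the sum of products.
    acc = a_diff @ b_diff
    # Scale the integer sum of products back to a high-precision output.
    return acc * (a_scale * b_scale)
```

    With inputs quantized as q = round(x / scale) + zero, the result matches the floating-point product of the original values up to quantization error.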

    Configuration of a deep vector engine using an opcode table, control table, and datapath table

    Publication number: US12271732B1

    Publication date: 2025-04-08

    Application number: US17937333

    Filing date: 2022-09-30

    Abstract: A technique to program a compute channel having multiple computational circuit blocks coupled in series in a pipeline can include receiving a machine instruction for the compute channel. The machine instruction is decoded to obtain an opcode, and the opcode can be used as an index to access an opcode entry in an opcode table. The opcode entry contains a pointer to a microoperation, and the pointer can be used to access a microoperation represented by a control entry in a control table and a datapath configuration entry in a datapath table. The microoperation can then be issued to the compute channel by configuring the compute channel with the control entry and the datapath configuration entry.
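    The two-level table lookup described above can be sketched as a table-driven decoder. The table contents, the instruction encoding (opcode in the high byte), and the entry fields are illustrative assumptions; the patent specifies only the opcode-to-pointer-to-entry indirection.

```python
# Hypothetical tables for a pipelined compute channel.  Each opcode maps to a
# pointer; the pointer selects a microoperation, represented by a control
# entry plus a datapath configuration entry at the same index.
OPCODE_TABLE = {0x01: 0, 0x02: 1}
CONTROL_TABLE = [
    {"stages_enabled": 0b0011},  # control entry for microoperation 0
    {"stages_enabled": 0b1111},  # control entry for microoperation 1
]
DATAPATH_TABLE = [
    {"alu_op": "add"},           # datapath configuration for microoperation 0
    {"alu_op": "mul"},           # datapath configuration for microoperation 1
]

def decode_and_issue(machine_instruction):
    # Decode the machine instruction to obtain the opcode
    # (assumed here to occupy the high byte).
    opcode = machine_instruction >> 8
    # The opcode indexes the opcode table, yielding a pointer used to access
    # both the control entry and the datapath configuration entry.
    ptr = OPCODE_TABLE[opcode]
    return CONTROL_TABLE[ptr], DATAPATH_TABLE[ptr]
```

    The indirection lets many opcodes share one microoperation, and lets a microoperation be redefined by editing the control and datapath tables without touching the opcode table.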

    Increasing performance of computational array accelerators

    Publication number: US12182691B1

    Publication date: 2024-12-31

    Application number: US17249900

    Filing date: 2021-03-17

    Abstract: To improve performance of a computational array, the architecture of the array can be modified to allow the processing engines of a column to operate in parallel and the clock frequency of the array to be increased. The processing engines of each column of the array can be grouped into a series of row groups. The processing engines of each row group can be loaded with input values, and computations on the input values can be carried out in parallel to generate the column output. One or more flip-flop stages can be inserted into the computational logic of each of the processing engines. The computational logic can then be distributed across the flip-flop stages to reduce the propagation delay between flip-flop stages of the processing engine, hence allowing the clock frequency of the array to be increased.
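    The row-group arithmetic is easy to check in software: partitioning a column into groups, computing per-group partial sums (concurrently, in hardware), and summing the partials gives the same column output as a single serial accumulation. This sketch only models the arithmetic equivalence; the flip-flop pipelining is a timing optimization with no functional analogue here.

```python
def column_output(weights, inputs, group_size):
    # Partition the column's processing engines into row groups.  In the
    # hardware each group computes its partial sum in parallel; the list
    # comprehension stands in for that parallelism.
    partials = [
        sum(w * x for w, x in zip(weights[g:g + group_size],
                                  inputs[g:g + group_size]))
        for g in range(0, len(weights), group_size)
    ]
    # The per-group partial sums combine into the column output.
    return sum(partials)
```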

    Resizable scratchpad memory
    Invention grant

    Publication number: US12045475B1

    Publication date: 2024-07-23

    Application number: US17457502

    Filing date: 2021-12-03

    Abstract: Techniques for implementing a dynamically resizable memory region for alternative use in a memory are described. The techniques may include using two concurrent address maps corresponding to two address ranges for a memory represented as an array of memory blocks. The first address range can be mapped to the memory with starting addresses of the memory blocks incrementing sequentially along each row. The second address range can be mapped to the memory with starting addresses of the memory blocks incrementing sequentially along each column. When an access request is received having a target address belonging to the first address range, the target address is provided as the memory address to access the memory. When an access request is received having a target address belonging to the second address range, the target address is translated by address translation logic into a memory address to access the memory.
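    The two concurrent address maps amount to a transpose of the block index. A sketch of the translation logic, assuming a row-major physical layout of rows × cols blocks and treating the first range as a pass-through (base addresses and parameter names are assumptions):

```python
def to_memory_address(target, first_base, second_base, rows, cols, block_size):
    size = rows * cols * block_size
    if first_base <= target < first_base + size:
        # First address range: block starting addresses increment
        # sequentially along each row, matching the physical layout,
        # so the target maps through unchanged.
        return target - first_base
    if second_base <= target < second_base + size:
        # Second address range: block starting addresses increment
        # sequentially along each column, so the block index is transposed
        # before forming the physical memory address.
        offset = target - second_base
        block, within = divmod(offset, block_size)
        col, row = divmod(block, rows)
        return (row * cols + col) * block_size + within
    raise ValueError("target address outside both mapped ranges")
```

    Because both maps cover the same physical blocks, carving out the resizable region is just a matter of how much of each range software chooses to use.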

    Data selection circuit
    Invention grant

    Publication number: US11868875B1

    Publication date: 2024-01-09

    Application number: US16127170

    Filing date: 2018-09-10

    CPC classification number: G06N3/065 G06N3/049 G11C11/54

    Abstract: Provided are systems and methods for operating a neural network processor, wherein the processor includes an input selector circuit that can be configured to select the data that will be input into the processor's computational array. In various implementations, the selector circuit can determine, for a row of the array, whether the row input will be the output from a buffer memory or data that the input selector circuit has selected for a different row. The row can receive an input feature map from a set of input data or an input feature map that was selected for inputting into a different row, such that the input feature map is input into more than one row at a time. The selector circuit can also include a delay circuit, so that the duplicated input feature map can be input into the computational array later than the original input feature map.
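    The selector's routing decision can be modeled per row: each row either takes its own buffer-memory stream or replays another row's selected stream after a delay, so one input feature map feeds more than one row. This is a behavioral sketch only; the function name, the `duplicate_from` encoding, and the `None` padding for delay cycles are assumptions.

```python
def route_row_inputs(buffer_streams, duplicate_from, delay):
    # buffer_streams[i]: the stream the buffer memory would feed into row i.
    # duplicate_from[i]: None if row i takes its own buffer output, or the
    # index of the row whose selected input is duplicated into row i.
    routed = {}
    for row, src in enumerate(duplicate_from):
        if src is None:
            routed[row] = list(buffer_streams[row])
        else:
            # The duplicated input feature map enters this row `delay`
            # cycles after the original enters the source row.
            routed[row] = [None] * delay + list(buffer_streams[src])
    return routed
```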

    Memory access for multiple circuit components

    Publication number: US11775430B1

    Publication date: 2023-10-03

    Application number: US17000842

    Filing date: 2020-08-24

    CPC classification number: G06F12/08 G06N3/063 G11C11/418 G11C11/419

    Abstract: Disclosed herein are techniques for performing memory access. In one embodiment, an integrated circuit includes a port and an access engine. The integrated circuit is coupled with a memory device. The access engine is configured to: receive, from an access requester device, a request to access data stored at a memory device; and based on receiving the request: provide, via the port, a sequential access of a plurality of portions of the data to the access requester device; and access the plurality of portions of the data in a parallel form at the memory device for the access requester device. The sequential access can include a sequential write access or a sequential read access of the plurality of portions of the data.
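    The serial-port / parallel-memory split can be sketched at the byte level: one wide memory access moves all portions of the data at once, while the port carries them one portion at a time. Function names and the use of byte strings are illustrative assumptions.

```python
def read_sequentially(memory_word, port_width):
    # The access engine fetches every portion of the data from the memory
    # device in parallel (a single wide read), then presents the portions
    # to the access requester one at a time through the port.
    return [memory_word[i:i + port_width]
            for i in range(0, len(memory_word), port_width)]

def write_sequentially(portions):
    # The requester pushes portions through the port one at a time; the
    # access engine assembles them and commits them to the memory device
    # in parallel (a single wide write).
    return b"".join(portions)
```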

    Registers for restricted memory
    Invention grant

    Publication number: US10678479B1

    Publication date: 2020-06-09

    Application number: US16204943

    Filing date: 2018-11-29

    Abstract: Provided are integrated circuits and methods for operating integrated circuits. An integrated circuit can include a plurality of memory banks and an execution engine including a set of execution components. Each execution component can be associated with a respective memory bank, and can read from and write to only the respective memory bank. The integrated circuit can further include a set of registers each associated with a respective memory bank from the plurality of memory banks. The integrated circuit can further be operable to load to or store from the set of registers in parallel, and load to or store from the set of registers serially. A parallel operation followed by a serial operation enables data to be moved from many memory banks into one memory bank. A serial operation followed by a parallel operation enables data to be moved from one memory bank into many memory banks.
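    The many-banks-to-one-bank movement described above (parallel load followed by serial store) can be sketched with plain lists standing in for banks and registers; the function name and argument layout are assumptions.

```python
def gather_to_one_bank(banks, registers, src_addr, dst_bank, dst_addr):
    # Parallel load: each register reads from its associated memory bank
    # (the only bank its execution component can access) at src_addr.
    for i, bank in enumerate(banks):
        registers[i] = bank[src_addr]
    # Serial store: the registers are written one after another into a
    # single bank, gathering data from many banks into one.
    for i, value in enumerate(registers):
        banks[dst_bank][dst_addr + i] = value
```

    Running the two phases in the opposite order (serial load from one bank, then parallel store) would scatter data from one bank out to many, the other movement the abstract describes.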
