-
Publication No.: US11231987B1
Publication Date: 2022-01-25
Application No.: US16456256
Application Date: 2019-06-28
Applicant: Amazon Technologies, Inc.
Inventor: Benita Bose, Ron Diamant, Georgy Zorik Machulsky, Alex Levin
Abstract: A debugging tool, such as may take the form of a software daemon running in the background, can provide for the monitoring of utilization of access mechanisms, such as Direct Memory Access (DMA) mechanisms, for purposes such as debugging and performance improvement. Debugging tools can obtain and provide DMA utilization data, as may include statistics, graphs, predictive analytics, or other such information. The data can help to pinpoint issues that have arisen, or may arise, in the system, and to take appropriate remedial or preventative action. Data from related DMAs can be aggregated intelligently, helping to identify bottlenecks where the individual DMA data might not. A debugging tool can store state information as snapshots, which may be beneficial if the system is in a state where current data is not accessible. The statistics and predictive analytics can also be leveraged to optimize system performance.
-
Publication No.: US20210173656A1
Publication Date: 2021-06-10
Application No.: US16707857
Application Date: 2019-12-09
Applicant: Amazon Technologies, Inc.
Inventor: Ron Diamant
Abstract: In one example, a hardware accelerator comprises: a programmable hardware instruction decoder programmed to store a plurality of opcodes; a programmable instruction schema mapping table implemented as a content addressable memory (CAM) and programmed to map the plurality of opcodes to a plurality of definitions of operands in a plurality of instructions; a hardware execution engine; and a controller configured to: receive an instruction that includes a first opcode of the plurality of opcodes; control the hardware instruction decoder to extract the first opcode from the instruction; obtain, from the instruction schema mapping table and based on the first opcode, a first definition of a first operand; and forward the instruction and the first definition to the hardware execution engine to control the hardware execution engine to extract the first operand from the instruction based on the first definition, and execute the instruction based on the first operand.
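The decode flow in this abstract can be sketched in software. This is only an illustration under assumptions: the 8-bit opcode field, the operand names, and the `(bit_offset, width)` form of an operand definition are all hypothetical, and a Python dict stands in for the content addressable memory.

```python
# Hypothetical schema table: opcode -> {operand name: (bit_offset, width)}.
# A dict stands in for the CAM; the field layouts below are assumptions.
schema_table = {
    0x01: {"src": (8, 8), "dst": (16, 8)},
    0x02: {"imm": (8, 16)},
}

def extract_opcode(instruction: int) -> int:
    """Instruction decoder: assume the opcode sits in the low 8 bits."""
    return instruction & 0xFF

def extract_operands(instruction: int, definition: dict) -> dict:
    """Execution engine: pull each operand out using its definition."""
    return {
        name: (instruction >> offset) & ((1 << width) - 1)
        for name, (offset, width) in definition.items()
    }

def dispatch(instruction: int, table: dict):
    """Controller: decode the opcode, look up the schema, forward both."""
    opcode = extract_opcode(instruction)
    definition = table[opcode]  # raises KeyError for an unprogrammed opcode
    return opcode, extract_operands(instruction, definition)

add_like = 0x01 | (0x2A << 8) | (0x07 << 16)
load_imm = 0x02 | (0x1234 << 8)
decoded_add = dispatch(add_like, schema_table)
decoded_imm = dispatch(load_imm, schema_table)
```

Because the operand layout lives in the table rather than in the decoder, supporting a new instruction format in this sketch only requires adding a table entry.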
-
Publication No.: US10990408B1
Publication Date: 2021-04-27
Application No.: US16582573
Application Date: 2019-09-25
Applicant: Amazon Technologies, Inc.
Inventor: Ron Diamant, Akshay Balasubramanian, Sundeep Amirineni
Abstract: Methods for place-and-route aware data pipelining for an integrated circuit device are provided. In large integrated circuits, the physical distance a data signal must travel between a signal source in a master circuit block partition and a signal destination in a servant circuit block partition can exceed the distance the signal can travel in a single clock cycle. To maintain timing requirements of the integrated circuit, a longest physical distance and signal delay for a datapath between master and servant circuit block partitions can be determined and pipelining registers added. Datapaths of master circuit block partitions further away from the servant circuit block can have more pipelining registers added within the master circuit block than datapaths of master circuit block partitions that are closer to the servant circuit block.
-
Publication No.: US10983754B2
Publication Date: 2021-04-20
Application No.: US16891010
Application Date: 2020-06-02
Applicant: Amazon Technologies, Inc.
Inventor: Dana Michelle Vantrease, Randy Huang, Ron Diamant, Thomas Elmer, Sundeep Amirineni
Abstract: Disclosed herein are techniques for accelerating convolution operations or other matrix multiplications in applications such as neural networks. In one example, an apparatus comprises a first circuit, a second circuit, and a third circuit. The first circuit is configured to: receive first values in a first format, the first values being generated from one or more asymmetric quantization operations of second values in a second format, and generate difference values based on subtracting a third value from each of the first values, the third value representing a zero value in the first format. The second circuit is configured to generate a sum of products in the first format using the difference values. The third circuit is configured to convert the sum of products from the first format to the second format based on scaling the sum of products with a scaling factor.
-
Publication No.: US10831693B1
Publication Date: 2020-11-10
Application No.: US16145122
Application Date: 2018-09-27
Applicant: Amazon Technologies, Inc.
Inventor: Randy Renfu Huang, Ron Diamant
Abstract: Provided are integrated circuit devices and methods for operating integrated circuit devices. In various examples, an integrated circuit device can include a master port operable to send transactions to target components of the device. The master port can have point-to-point connections with each of the targets. The master port can be configured with a first address range for a first target, a second address range for a second target, and a multicast address range for both the first and second targets. When the master port receives a request with an address that is in the multicast address range, the master port can generate, for the one request, a transaction for each of the first and second targets.
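The multicast behavior described in this abstract can be modeled in a few lines. This is a minimal sketch under assumptions: the `MasterPort` class, the target names, the address ranges, and the choice to forward an offset within the matched range are all hypothetical and are not taken from the patent.

```python
class MasterPort:
    """Toy model of a master port with unicast and multicast address ranges."""

    def __init__(self):
        # Each entry: (start, end_exclusive, [target names]).
        self.ranges = []

    def configure(self, start, size, targets):
        """Map the range [start, start + size) to one or more targets."""
        self.ranges.append((start, start + size, list(targets)))

    def route(self, address, data):
        """Return one (target, offset, data) transaction per mapped target."""
        for start, end, targets in self.ranges:
            if start <= address < end:
                offset = address - start  # assumption: targets see an offset
                return [(target, offset, data) for target in targets]
        raise ValueError(f"address {address:#x} is not mapped")

port = MasterPort()
port.configure(0x0000, 0x1000, ["target0"])             # first address range
port.configure(0x1000, 0x1000, ["target1"])             # second address range
port.configure(0x2000, 0x1000, ["target0", "target1"])  # multicast range

# A single request in the multicast range fans out into two transactions.
transactions = port.route(0x2010, b"\x01")
```

The requester issues one write; the fan-out happens inside the port, which is the property the abstract highlights.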
-
Publication No.: US10740265B1
Publication Date: 2020-08-11
Application No.: US16144910
Application Date: 2018-09-27
Applicant: Amazon Technologies, Inc.
Inventor: Kun Xu, Ron Diamant
Abstract: Methods and apparatus for performing memory access are provided. In one example, an apparatus comprises a hardware processor, a memory, and a bus interface. The hardware processor is configured to: receive, from a host device and via the bus interface, a packet including a host input address, the host input address being defined based on a first host address space operated by the host device; determine, based on the host input address, a host relative address, the host relative address being relative to a first host base address of the first host address space; determine, based on the host relative address, a target device address of the memory; and access the memory at the target device address on behalf of the host device.
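The two-step translation in this abstract (host input address to host relative address, then to target device address) can be illustrated as follows. The abstract does not say how the relative address maps onto device memory, so adding it to a device base address is an assumption, as are the function name, the window size check, and the example addresses.

```python
def translate(host_input_addr: int, host_base: int, window_size: int,
              device_base: int) -> int:
    """Map a host address to a device memory address.

    Assumption: the host relative address is added to a per-window
    device base address. The real mapping may differ.
    """
    # Step 1: make the address relative to the host base address.
    host_relative = host_input_addr - host_base
    if not 0 <= host_relative < window_size:
        raise ValueError("address falls outside the host address space window")
    # Step 2: derive the target device address from the relative address.
    return device_base + host_relative

# Hypothetical 64 KiB host window at 0x8000_0000 backed by device memory at 0x2000.
device_addr = translate(0x8000_1040, host_base=0x8000_0000,
                        window_size=0x1_0000, device_base=0x2000)
```

Working from a relative address means the device-side mapping in this sketch does not change if the host relocates its address space; only `host_base` does.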
-
Publication No.: US10733498B1
Publication Date: 2020-08-04
Application No.: US16215405
Application Date: 2018-12-10
Applicant: Amazon Technologies, Inc.
Inventor: Ron Diamant, Sundeep Amirineni, Mohammad El-Shabani
Abstract: Methods and systems for supporting parametric function computations in hardware circuits are proposed. In one example, a system comprises a hardware mapping table, a control circuit, and arithmetic circuits. The control circuit is configured to: in a first mode of operation, forward a set of parameters of a non-parametric function associated with an input value from the hardware mapping table to the arithmetic circuits to compute a first approximation of the non-parametric function at the input value; and in a second mode of operation, based on information indicating whether the input value is in a first input range or in a second input range from the hardware mapping table, forward a first parameter or a second parameter of a parametric function to the arithmetic circuits to compute, respectively, a second approximation or a third approximation of the parametric function at the input value.
-
Publication No.: US10691850B1
Publication Date: 2020-06-23
Application No.: US16219205
Application Date: 2018-12-13
Applicant: Amazon Technologies, Inc.
Inventor: Lev Makovsky, Adi Habusha, Ron Diamant
IPC: G06F30/327, G06F7/02, G06N20/00, G06F119/06
Abstract: A power analysis system for an integrated circuit device design can use machine learning to determine an estimated power consumption of the design. In various examples, the system can generate workloads for a power projection tool, which can include less than all the data of a full suite of power projection tests. The results from the power projection tool can be used to train a machine learning data model. From the results, the data model can learn the functions of the design by grouping together cells that are triggered together by the same signals. The data model can also learn estimated power consumption for each of the functions. The output of the data model can then be used to configure a design testing tool, which can run tests on the design. The output of the tests can then be used to compute an estimated overall power consumption for the design.
-
Publication No.: US10678508B2
Publication Date: 2020-06-09
Application No.: US15934681
Application Date: 2018-03-23
Applicant: Amazon Technologies, Inc.
Inventor: Dana Michelle Vantrease, Randy Huang, Ron Diamant, Thomas Elmer, Sundeep Amirineni
Abstract: Disclosed herein are techniques for accelerating convolution operations or other matrix multiplications in applications such as neural networks. A computer-implemented method includes receiving low-precision inputs for a convolution operation from a storage device, and subtracting a low-precision value representing a high-precision zero value from the low-precision inputs to generate difference values, where the low-precision inputs are asymmetrically quantized from high-precision inputs. The method also includes performing multiplication and summation operations on the difference values to generate a sum of products, and generating a high-precision output by scaling the sum of products with a scaling factor.
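The arithmetic in this abstract (subtract the zero-point code, multiply and sum in integers, then rescale) can be shown with a small dot product. This is a sketch under assumptions: the 8-bit unsigned code range, the helper names, the scales and zero points, and the treatment of weights as asymmetrically quantized in the same way as inputs are all illustrative choices, not details from the patent.

```python
def quantize(values, scale, zero_point):
    """Asymmetric quantization to unsigned 8-bit codes.

    q = round(x / scale) + zero_point, clamped to [0, 255];
    zero_point is the low-precision code that represents real 0.0.
    """
    return [max(0, min(255, round(x / scale) + zero_point)) for x in values]

def quantized_dot(q_inputs, q_weights, in_zero, w_zero, in_scale, w_scale):
    """Dot product computed on low-precision codes, returned as a float."""
    # Subtract the code that represents the high-precision zero value.
    in_diff = [q - in_zero for q in q_inputs]
    w_diff = [q - w_zero for q in q_weights]
    # Integer multiplication and summation on the difference values.
    sum_of_products = sum(a * b for a, b in zip(in_diff, w_diff))
    # Scale the sum of products back to the high-precision domain.
    return sum_of_products * (in_scale * w_scale)

inputs = [0.5, -1.0, 2.0]
weights = [1.0, 0.25, -0.5]
q_in = quantize(inputs, scale=0.5, zero_point=10)
q_w = quantize(weights, scale=0.25, zero_point=128)
approx = quantized_dot(q_in, q_w, in_zero=10, w_zero=128,
                       in_scale=0.5, w_scale=0.25)
exact = sum(x * w for x, w in zip(inputs, weights))
```

The example values are chosen to be exactly representable, so `approx` matches `exact` here; in general the result differs from the floating-point dot product by the quantization error.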
-
Publication No.: US20200175170A1
Publication Date: 2020-06-04
Application No.: US16786742
Application Date: 2020-02-10
Applicant: Amazon Technologies, Inc.
Inventor: Ron Diamant, Alex Levin, Ihab Bishara
IPC: G06F21/57, H04L9/08, G06F9/4401
Abstract: Methods and apparatus are disclosed for securing executable code for execution with a processor using a trusted platform module (TPM). In one example of the disclosed technology, a method of decrypting executable code for execution includes measuring values stored in a CPU boot ROM, measuring second values for executable code stored in non-volatile memory, and storing the resulting measurement value in a TPM platform configuration register (PCR). The PCR value is used to unseal a key stored in non-volatile memory of the TPM, which key is used to decrypt executable code for execution. Security can be further enhanced by destroying the values stored in the PCR by performing additional measurement operations with the TPM PCR used to generate the measurement value.
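The measure, unseal, and destroy sequence in this abstract can be simulated in software. This is a toy model, not a real TPM interface: the `ToyTpm` class, the byte strings, and the key are hypothetical, and sealing is reduced to comparing the current PCR against a stored expected value. The extend rule used here, new PCR = SHA-256(old PCR || digest of measured data), follows the standard TPM extend operation.

```python
import hashlib

def extend(pcr: bytes, data: bytes) -> bytes:
    """PCR extend: new = SHA-256(old PCR || SHA-256(data))."""
    return hashlib.sha256(pcr + hashlib.sha256(data).digest()).digest()

class ToyTpm:
    """Software stand-in for one PCR plus one sealed key (illustrative only)."""

    def __init__(self):
        self.pcr = bytes(32)  # PCR starts at all zeros
        self._sealed = None

    def measure(self, data: bytes) -> None:
        self.pcr = extend(self.pcr, data)

    def seal(self, key: bytes, expected_pcr: bytes) -> None:
        self._sealed = (expected_pcr, key)

    def unseal(self) -> bytes:
        expected_pcr, key = self._sealed
        if self.pcr != expected_pcr:
            raise PermissionError("PCR does not match the sealed policy")
        return key

BOOT_ROM = b"boot rom contents"       # hypothetical measured values
FIRMWARE = b"second-stage code image"

# Provisioning: seal the key against the PCR value of a known-good boot.
tpm = ToyTpm()
tpm.seal(b"code-decryption-key", extend(extend(bytes(32), BOOT_ROM), FIRMWARE))

# Boot: measure both stages, unseal the key, then destroy the PCR value.
tpm.measure(BOOT_ROM)
tpm.measure(FIRMWARE)
key = tpm.unseal()
tpm.measure(b"post-unseal lock")  # extra measurement invalidates the PCR
```

After the final measurement the PCR no longer matches the sealed policy, so later code in the same boot cannot unseal the key again.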