Debugging of memory operations
    61.
    Granted patent

    Publication No.: US11231987B1

    Publication date: 2022-01-25

    Application No.: US16456256

    Filing date: 2019-06-28

    Abstract: A debugging tool, which may take the form of a software daemon running in the background, can monitor the utilization of access mechanisms, such as Direct Memory Access (DMA) mechanisms, for purposes such as debugging and performance improvement. Debugging tools can obtain and provide DMA utilization data, which may include statistics, graphs, predictive analytics, or other such information. The data can help to pinpoint issues that have arisen, or may arise, in the system, and to take appropriate remedial or preventative action. Data from related DMAs can be aggregated intelligently, helping to identify bottlenecks that the individual DMA data might not reveal. A debugging tool can store state information as snapshots, which may be beneficial if the system is in a state where current data is not accessible. The statistics and predictive analytics can also be leveraged to optimize system performance.
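
    As a rough illustration of aggregating data from related DMAs, the sketch below sums per-DMA bandwidth samples by a shared resource and flags any resource that is close to saturation. The DMA names, the grouping by link, the capacities, and the 90% threshold are all hypothetical and are not taken from the patent.

```python
from collections import defaultdict

def find_bottlenecks(samples, group_of, capacity, threshold=0.9):
    """Return the groups whose combined bandwidth reaches threshold * capacity.

    samples:  {dma_name: observed bandwidth in GB/s}       (hypothetical data)
    group_of: {dma_name: shared resource the DMA uses}     (hypothetical grouping)
    capacity: {group_name: capacity of the resource, GB/s}
    """
    totals = defaultdict(float)
    for dma, bandwidth in samples.items():
        totals[group_of[dma]] += bandwidth
    return sorted(g for g, total in totals.items() if total >= threshold * capacity[g])

# No single DMA looks busy on its own, but three of them share one link.
samples = {"dma0": 3.0, "dma1": 3.5, "dma2": 3.0, "dma3": 1.0}
group_of = {"dma0": "link_a", "dma1": "link_a", "dma2": "link_a", "dma3": "link_b"}
capacity = {"link_a": 10.0, "link_b": 10.0}

bottlenecks = find_bottlenecks(samples, group_of, capacity)
print(bottlenecks)
```

    Each DMA here uses at most 35% of its link, so per-DMA statistics alone would not raise a flag; only the aggregated view shows that link_a is nearly full.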

    Hardware accelerator having reconfigurable instruction set and reconfigurable decoder

    Publication No.: US20210173656A1

    Publication date: 2021-06-10

    Application No.: US16707857

    Filing date: 2019-12-09

    Inventor: Ron Diamant

    Abstract: In one example, a hardware accelerator comprises: a programmable hardware instruction decoder programmed to store a plurality of opcodes; a programmable instruction schema mapping table implemented as a content addressable memory (CAM) and programmed to map the plurality of opcodes to a plurality of definitions of operands in a plurality of instructions; a hardware execution engine; and a controller configured to: receive an instruction that includes a first opcode of the plurality of opcodes; control the hardware instruction decoder to extract the first opcode from the instruction; obtain, from the instruction schema mapping table and based on the first opcode, a first definition of a first operand; and forward the instruction and the first definition to the hardware execution engine to control the hardware execution engine to extract the first operand from the instruction based on the first definition, and execute the instruction based on the first operand.
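
    The decode flow in this abstract can be modelled in software as a table lookup. The sketch below is only an illustration: the 8-bit opcode field, the OperandDef layout (bit offset and width), and the dictionary standing in for the content addressable memory are assumptions, not details from the patent.

```python
from dataclasses import dataclass

OPCODE_BITS = 8  # assumption: the opcode occupies the low 8 bits of the instruction

@dataclass(frozen=True)
class OperandDef:
    """Hypothetical operand definition: where an operand sits in the instruction word."""
    name: str
    offset: int  # bit offset of the operand field
    width: int   # width of the operand field in bits

class SchemaTable:
    """Software stand-in for the programmable instruction schema mapping table."""
    def __init__(self):
        self._entries = {}

    def program(self, opcode, operand_defs):
        # Reprogramming an entry changes how instructions with this opcode are read.
        self._entries[opcode] = tuple(operand_defs)

    def lookup(self, opcode):
        return self._entries[opcode]

def decode(instruction, table):
    """Extract the opcode, fetch its operand definitions, then extract the operands."""
    opcode = instruction & ((1 << OPCODE_BITS) - 1)
    operand_defs = table.lookup(opcode)
    operands = {
        d.name: (instruction >> d.offset) & ((1 << d.width) - 1)
        for d in operand_defs
    }
    return opcode, operands

table = SchemaTable()
table.program(0x01, [OperandDef("src", 8, 8), OperandDef("dst", 16, 8)])
table.program(0x02, [OperandDef("imm", 8, 16)])

result_a = decode(0x01 | (0x2A << 8) | (0x07 << 16), table)
result_b = decode(0x02 | (0x1234 << 8), table)
print(result_a, result_b)
```

    The same bit positions are read as two 8-bit operands under opcode 0x01 and as one 16-bit immediate under opcode 0x02, which is the kind of flexibility a programmable schema table allows.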

    Place and route aware data pipelining

    Publication No.: US10990408B1

    Publication date: 2021-04-27

    Application No.: US16582573

    Filing date: 2019-09-25

    Abstract: Methods for place-and-route aware data pipelining for an integrated circuit device are provided. In large integrated circuits, the physical distance a data signal must travel between a signal source in a master circuit block partition and a signal destination in a servant circuit block partition can exceed the distance the signal can travel in a single clock cycle. To maintain timing requirements of the integrated circuit, a longest physical distance and signal delay for a datapath between master and servant circuit block partitions can be determined and pipelining registers added. Datapaths of master circuit block partitions further away from the servant circuit block can have more pipelining registers added within the master circuit block than datapaths of master circuit block partitions that are closer to the servant circuit block.
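
    A back-of-the-envelope version of the register count can be sketched as follows. It assumes a simplified timing model in which a path is split into equal segments, each fitting in one clock period, and it ignores setup time, clock skew, and register delay. The partition names and delay figures are made up for illustration.

```python
import math

def pipeline_registers_needed(path_delay_ns, clock_period_ns):
    """Registers needed so that no path segment exceeds one clock period.

    Simplified model (an assumption): a path with delay d is cut into
    ceil(d / T) segments, which takes one register fewer than segments.
    """
    if path_delay_ns <= 0 or clock_period_ns <= 0:
        raise ValueError("delay and clock period must be positive")
    return max(0, math.ceil(path_delay_ns / clock_period_ns) - 1)

# Hypothetical signal delays from three master partitions to one servant block.
delays_ns = {"master_near": 0.75, "master_mid": 2.0, "master_far": 3.5}
clock_period_ns = 1.0

registers = {
    name: pipeline_registers_needed(delay, clock_period_ns)
    for name, delay in delays_ns.items()
}
print(registers)
```

    The partition farthest from the servant block ends up with the most registers, matching the relationship the abstract describes.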

    Accelerated quantized multiply-and-add operations

    Publication No.: US10983754B2

    Publication date: 2021-04-20

    Application No.: US16891010

    Filing date: 2020-06-02

    Abstract: Disclosed herein are techniques for accelerating convolution operations or other matrix multiplications in applications such as neural networks. In one example, an apparatus comprises a first circuit, a second circuit, and a third circuit. The first circuit is configured to: receive first values in a first format, the first values being generated from one or more asymmetric quantization operations of second values in a second format, and generate difference values based on subtracting a third value from each of the first values, the third value representing a zero value in the first format. The second circuit is configured to generate a sum of products in the first format using the difference values. The third circuit is configured to convert the sum of products from the first format to the second format based on scaling the sum of products with a scaling factor.
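
    The three stages can be walked through in a few lines of Python. This is a sketch under assumptions: the first format is taken to be unsigned 8-bit, both inputs and weights are quantized the same way, and the scales and zero points are example values chosen so the arithmetic is exact. None of these specifics come from the patent.

```python
def quantize(values, scale, zero_point):
    """Asymmetric quantization to uint8: q = round(x / scale) + zero_point, clamped."""
    return [min(255, max(0, round(v / scale) + zero_point)) for v in values]

def quantized_dot(q_x, zero_x, scale_x, q_w, zero_w, scale_w):
    # Stage 1: subtract the value that represents zero in the low-precision format.
    diff_x = [q - zero_x for q in q_x]
    diff_w = [q - zero_w for q in q_w]
    # Stage 2: integer sum of products on the difference values.
    acc = sum(a * b for a, b in zip(diff_x, diff_w))
    # Stage 3: convert back to the high-precision format with one scaling factor.
    return acc * (scale_x * scale_w)

x = [0.5, -1.0, 2.0]      # example high-precision inputs
w = [1.0, 0.5, -0.25]     # example high-precision weights

scale_x, zero_x = 1 / 64, 64     # assumed quantization parameters
scale_w, zero_w = 1 / 128, 100

q_x = quantize(x, scale_x, zero_x)
q_w = quantize(w, scale_w, zero_w)

approx = quantized_dot(q_x, zero_x, scale_x, q_w, zero_w, scale_w)
exact = sum(a * b for a, b in zip(x, w))
print(q_x, q_w, approx, exact)
```

    Subtracting the zero points first means the multiply-accumulate runs entirely on small integers, and floating-point work is limited to a single multiplication at the end.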

    Multicast master
    65.
    Granted patent

    Publication No.: US10831693B1

    Publication date: 2020-11-10

    Application No.: US16145122

    Filing date: 2018-09-27

    Abstract: Provided are integrated circuit devices and methods for operating integrated circuit devices. In various examples, an integrated circuit device can include a master port operable to send transactions to target components of the device. The master port can have point-to-point connections with each of the targets. The master port can be configured with a first address range for a first target, a second address range for a second target, and a multicast address range for both the first and second targets. When the master port receives a request with an address that is in the multicast address range, the master port can generate, for the one request, a transaction for each of the first and second targets.
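
    The address decoding described here can be sketched as a small range table. The address values, target names, and the choice to pass a range-relative offset to each target are assumptions made for illustration only.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AddressRange:
    base: int
    size: int
    targets: tuple  # names of the targets served by this range

    def contains(self, addr):
        return self.base <= addr < self.base + self.size

class MasterPort:
    """Toy model of a master port with unicast and multicast address ranges."""
    def __init__(self, ranges):
        self.ranges = list(ranges)

    def route(self, addr, data):
        """Return one (target, offset, data) transaction per target of the matching range."""
        for r in self.ranges:
            if r.contains(addr):
                offset = addr - r.base  # assumption: targets see range-relative offsets
                return [(target, offset, data) for target in r.targets]
        raise ValueError(f"address {addr:#x} is not mapped")

port = MasterPort([
    AddressRange(0x1000, 0x1000, ("target0",)),
    AddressRange(0x2000, 0x1000, ("target1",)),
    AddressRange(0x3000, 0x1000, ("target0", "target1")),  # multicast range
])

unicast = port.route(0x2004, 0xAB)
multicast = port.route(0x3010, 0xCD)
print(unicast, multicast)
```

    A single write to the multicast range fans out into two transactions, so the requester does not have to issue the same write twice.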

    PCI-based bus system having peripheral device address translation based on base address register (BAR) index

    Publication No.: US10740265B1

    Publication date: 2020-08-11

    Application No.: US16144910

    Filing date: 2018-09-27

    Inventors: Kun Xu, Ron Diamant

    Abstract: Methods and apparatus for performing memory access are provided. In one example, an apparatus comprises a hardware processor, a memory, and a bus interface. The hardware processor is configured to: receive, from a host device and via the bus interface, a packet including a host input address, the host input address being defined based on a first host address space operated by the host device; determine, based on the host input address, a host relative address, the host relative address being relative to a first host base address of the first host address space; determine, based on the host relative address, a target device address of the memory; and access the memory at the target device address on behalf of the host device.
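
    The two-step translation (host input address to host relative address, then to target device address) might look like the sketch below. The BarWindow structure and all base addresses and sizes are hypothetical. The abstract does not say how the mapping is stored, so a simple per-BAR table is assumed.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BarWindow:
    """Hypothetical per-BAR mapping from a host address window to device memory."""
    host_base: int    # base of the window in the host address space
    size: int         # window size in bytes
    device_base: int  # where the window starts in device memory

def translate(host_addr, bars):
    """Return (bar_index, device_address) for a host input address."""
    for index, bar in enumerate(bars):
        if bar.host_base <= host_addr < bar.host_base + bar.size:
            relative = host_addr - bar.host_base   # host relative address
            return index, bar.device_base + relative
    raise ValueError(f"host address {host_addr:#x} does not fall in any BAR window")

bars = [
    BarWindow(host_base=0xF000_0000, size=0x1000, device_base=0x0000_8000),
    BarWindow(host_base=0xF800_0000, size=0x1_0000, device_base=0x0010_0000),
]

bar_index, device_addr = translate(0xF800_0040, bars)
print(bar_index, hex(device_addr))
```

    Because only the relative offset is carried into the device address, the host can relocate a window in its own address space without the device-side layout changing.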

    Parametric mathematical function approximation in integrated circuits

    Publication No.: US10733498B1

    Publication date: 2020-08-04

    Application No.: US16215405

    Filing date: 2018-12-10

    Abstract: Methods and systems for supporting parametric function computations in hardware circuits are proposed. In one example, a system comprises a hardware mapping table, a control circuit, and arithmetic circuits. The control circuit is configured to: in a first mode of operation, forward a set of parameters of a non-parametric function associated with an input value from the hardware mapping table to the arithmetic circuits to compute a first approximation of the non-parametric function at the input value; and in a second mode of operation, based on information indicating whether the input value is in a first input range or in a second input range from the hardware mapping table, forward a first parameter or a second parameter of a parametric function to the arithmetic circuits to compute, respectively, a second approximation or a third approximation of the parametric function at the input value.
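
    One way to picture the two modes is sketched below. It assumes that the non-parametric function is tanh approximated by per-bucket linear segments, and that the parametric function is a leaky or parametric ReLU whose parameter depends on the sign of the input. Both choices, along with the table size and input range, are illustrative assumptions rather than details from the patent.

```python
import math

LO, HI, BUCKETS = -4.0, 4.0, 64
STEP = (HI - LO) / BUCKETS

def build_table(fn):
    """Mode 1 table: for each input bucket, store a base point, value, and slope."""
    table = []
    for i in range(BUCKETS):
        x0 = LO + i * STEP
        y0 = fn(x0)
        slope = (fn(x0 + STEP) - y0) / STEP
        table.append((x0, y0, slope))
    return table

def approx_nonparametric(x, table):
    """Mode 1: look up the bucket's parameters, then evaluate y0 + slope * (x - x0)."""
    index = min(BUCKETS - 1, max(0, int((x - LO) / STEP)))
    x0, y0, slope = table[index]
    return y0 + slope * (x - x0)

def approx_parametric(x, alpha):
    """Mode 2: the input range (negative or not) selects which parameter is forwarded."""
    parameter = alpha if x < 0 else 1.0
    return parameter * x

tanh_table = build_table(math.tanh)
tanh_estimate = approx_nonparametric(0.3, tanh_table)
prelu_negative = approx_parametric(-2.0, alpha=0.1)
prelu_positive = approx_parametric(3.0, alpha=0.1)
print(tanh_estimate, math.tanh(0.3), prelu_negative, prelu_positive)
```

    In both modes the arithmetic stage is the same multiply-and-add; only the source of its parameters differs, which is what lets one set of arithmetic circuits serve both kinds of function.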

    Power projection using machine learning

    Publication No.: US10691850B1

    Publication date: 2020-06-23

    Application No.: US16219205

    Filing date: 2018-12-13

    Abstract: A power analysis system for an integrated circuit device design can use machine learning to determine an estimated power consumption of the design. In various examples, the system can generate workloads for a power projection tool, which can include less than all the data of a full suite of power projection tests. The results from the power projection tool can be used to train a machine learning data model. From the results, the data model can learn the functions of the design by grouping together cells that are triggered together by the same signals. The data model can also learn estimated power consumption for each of the functions. The output of the data model can then be used to configure a design testing tool, which can run tests on the design. The output of the tests can then be used to compute an estimated overall power consumption for the design.

    Accelerated quantized multiply-and-add operations

    Publication No.: US10678508B2

    Publication date: 2020-06-09

    Application No.: US15934681

    Filing date: 2018-03-23

    Abstract: Disclosed herein are techniques for accelerating convolution operations or other matrix multiplications in applications such as neural networks. A computer-implemented method includes receiving low-precision inputs for a convolution operation from a storage device, and subtracting a low-precision value representing a high-precision zero value from the low-precision inputs to generate difference values, where the low-precision inputs are asymmetrically quantized from high-precision inputs. The method also includes performing multiplication and summation operations on the difference values to generate a sum of products, and generating a high-precision output by scaling the sum of products with a scaling factor.
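
    The reason the subtraction step allows a single scaling factor at the end can be written out briefly. The symbols below are assumptions introduced for illustration: $q$ denotes a low-precision value, $z$ the low-precision value representing zero, and $s$ the quantization scale, with subscripts $x$ and $w$ for inputs and weights.

```latex
% Asymmetric quantization relates high- and low-precision values by
x_i \approx s_x \,(q_{x,i} - z_x), \qquad w_i \approx s_w \,(q_{w,i} - z_w).

% Substituting into the sum of products factors out the scales:
\sum_i x_i w_i \;\approx\; s_x s_w \sum_i (q_{x,i} - z_x)\,(q_{w,i} - z_w).
```

    The inner sum uses only low-precision difference values, and the product $s_x s_w$ plays the role of the scaling factor applied once to the accumulated result.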

    Maintaining keys for trusted boot code
    70.
    Patent application

    Publication No.: US20200175170A1

    Publication date: 2020-06-04

    Application No.: US16786742

    Filing date: 2020-02-10

    Abstract: Methods and apparatus are disclosed for securing executable code for execution with a processor using a trusted platform module (TPM). In one example of the disclosed technology, a method of decrypting executable code for execution includes measuring values stored in a CPU boot ROM, measuring second values for executable code stored in non-volatile memory, and storing the resulting measurement value in a TPM platform configuration register (PCR). The PCR value is used to unseal a key stored in non-volatile memory of the TPM, and that key is used to decrypt executable code for execution. Security can be further enhanced by destroying the values stored in the PCR by performing additional measurement operations with the TPM PCR used to generate the measurement value.
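
    The measure, unseal, and destroy sequence can be imitated with a toy model built on hashlib. ToyTpm is a hypothetical class for illustration, not a real TPM interface. It assumes a single SHA-256 PCR that starts at all zeros and is updated with the usual extend rule, new PCR = H(old PCR || H(data)).

```python
import hashlib

def extend(pcr, data):
    """PCR extend: fold a new measurement into the running register value."""
    return hashlib.sha256(pcr + hashlib.sha256(data).digest()).digest()

class ToyTpm:
    """Toy stand-in for a TPM with one PCR and one sealed key (not a real API)."""
    def __init__(self):
        self.pcr = bytes(32)
        self._sealed = None

    def measure(self, data):
        self.pcr = extend(self.pcr, data)

    def seal(self, key, expected_pcr):
        self._sealed = (expected_pcr, key)

    def unseal(self):
        expected_pcr, key = self._sealed
        if self.pcr != expected_pcr:
            raise PermissionError("PCR value does not match the sealing policy")
        return key

boot_rom = b"boot rom contents"            # placeholder measurement inputs
firmware = b"encrypted firmware image"

# Provisioning: seal the key to the PCR value a good boot is expected to produce.
tpm = ToyTpm()
tpm.seal(b"decryption-key", extend(extend(bytes(32), boot_rom), firmware))

# Boot: measure both stages, then unseal the key.
tpm.measure(boot_rom)
tpm.measure(firmware)
key = tpm.unseal()

# An additional measurement changes the PCR, so the key cannot be unsealed again.
tpm.measure(b"end of boot")
try:
    tpm.unseal()
    still_unsealable = True
except PermissionError:
    still_unsealable = False
print(key, still_unsealable)
```

    Because extend is one-way, software that runs later in the boot cannot restore the earlier PCR value, which is why the extra measurement effectively locks the key away until the next reset.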
