TEMPERATURE-BASED ADJUSTMENTS FOR IN-MEMORY MATRIX MULTIPLICATION

    Publication No.: US20200380063A1

    Publication date: 2020-12-03

    Application No.: US16428903

    Filing date: 2019-05-31

    Abstract: Techniques for performing in-memory matrix multiplication, taking into account temperature variations in the memory, are disclosed. In one example, the matrix multiplication memory uses ohmic multiplication and current summing to perform the dot products involved in matrix multiplication. One downside to this analog form of multiplication is that temperature affects the accuracy of the results. Thus techniques are provided herein to compensate for the effects of temperature increases on the accuracy of in-memory matrix multiplications. According to the techniques, portions of input matrices are classified as effective or ineffective. Effective portions are mapped to low temperature regions of the in-memory matrix multiplier and ineffective portions are mapped to high temperature regions of the in-memory matrix multiplier. The matrix multiplication is then performed.
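
The mapping step in this abstract can be illustrated with a minimal sketch. The abstract does not say how a portion is judged effective, so the per-block score, the `threshold` parameter, and the names `map_blocks_to_regions`, `block_scores`, and `region_temps` are all hypothetical. The sketch only shows the placement idea: effective blocks take the coolest regions and ineffective blocks take the hottest.

```python
from typing import Dict


def map_blocks_to_regions(block_scores: Dict[str, float],
                          region_temps: Dict[str, float],
                          threshold: float) -> Dict[str, str]:
    """Place matrix blocks on multiplier regions by temperature.

    block_scores: hypothetical "effectiveness" score per block (assumption).
    region_temps: measured temperature per region of the multiplier.
    threshold:    blocks scoring at or above this are treated as effective.
    """
    if len(block_scores) > len(region_temps):
        raise ValueError("need at least one region per block")

    # Classify blocks, highest score first within each class.
    by_score = sorted(block_scores, key=lambda b: -block_scores[b])
    effective = [b for b in by_score if block_scores[b] >= threshold]
    ineffective = [b for b in by_score if block_scores[b] < threshold]

    coolest_first = sorted(region_temps, key=lambda r: region_temps[r])

    mapping: Dict[str, str] = {}
    # Effective blocks consume regions from the cool end.
    for block, region in zip(effective, coolest_first):
        mapping[block] = region
    # Ineffective blocks consume regions from the hot end.
    for block, region in zip(ineffective, reversed(coolest_first)):
        mapping[block] = region
    return mapping


blocks = {"A": 0.9, "B": 0.1, "C": 0.7}
temps = {"r0": 55.0, "r1": 40.0, "r2": 70.0, "r3": 45.0}
placement = map_blocks_to_regions(blocks, temps, threshold=0.5)
```

Because there is at least one region per block, the two passes never claim the same region.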

    RUNTIME EXTENSION FOR NEURAL NETWORK TRAINING WITH HETEROGENEOUS MEMORY

    Publication No.: US20200042859A1

    Publication date: 2020-02-06

    Application No.: US16194958

    Filing date: 2018-11-19

    Abstract: Systems, apparatuses, and methods for managing buffers in a neural network implementation with heterogeneous memory are disclosed. A system includes a neural network coupled to a first memory and a second memory. The first memory is a relatively low-capacity, high-bandwidth memory while the second memory is a relatively high-capacity, low-bandwidth memory. During a forward propagation pass of the neural network, a run-time manager monitors the usage of the buffers for the various layers of the neural network. During a backward propagation pass of the neural network, the run-time manager determines how to move the buffers between the first and second memories based on the monitored buffer usage during the forward propagation pass. As a result, the run-time manager is able to reduce memory access latency for the layers of the neural network during the backward propagation pass.
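
One way to picture the run-time manager's decision is the sketch below. The abstract does not give the placement policy, so the policy here is an assumption: the backward pass visits layers in reverse order, so the buffers used last in the forward pass are kept in the fast memory first, up to its capacity. The function name `plan_fast_memory` and the log format are hypothetical.

```python
from typing import List, Tuple


def plan_fast_memory(forward_log: List[Tuple[str, int]],
                     fast_capacity: int) -> List[str]:
    """Choose buffers to hold in the fast memory for the backward pass.

    forward_log:   (buffer_name, size) pairs in the order the forward pass
                   used them, as recorded by the monitor.
    fast_capacity: capacity of the low-capacity, high-bandwidth memory,
                   in the same units as the sizes.
    """
    resident: List[str] = []
    used = 0
    # Walk the log backwards: these buffers are needed soonest.
    for name, size in reversed(forward_log):
        if name in resident:
            continue  # already placed via a later use
        if used + size <= fast_capacity:
            resident.append(name)
            used += size
    return resident


log = [("act1", 4), ("act2", 6), ("act3", 3)]
plan = plan_fast_memory(log, fast_capacity=10)
```

Buffers left out of the plan would stay in the high-capacity memory until the backward pass approaches their layer.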

    Software Only Intra-Compute Unit Redundant Multithreading for GPUs
    Patent application (status: granted)

    Publication No.: US20140368513A1

    Publication date: 2014-12-18

    Application No.: US13920574

    Filing date: 2013-06-18

    Abstract: A system, method, and computer program product execute a first and a second work-item and compare the signature variable of the first work-item to the signature variable of the second work-item. The first and the second work-items are mapped to an identifier via software. This mapping ensures that the first and second work-items execute exactly the same code on exactly the same data without changes to the underlying hardware. By executing the first and second work-items independently, the underlying computation of the first and second work-items can be verified. Moreover, system performance is not substantially affected because the execution results of the first and second work-items are compared only at specified comparison points.
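
The pairing idea can be sketched as a sequential simulation. This is not GPU code and not the patent's implementation: the `work_item // 2` mapping, the use of the result itself as the signature, and the name `run_redundant` are assumptions made for illustration.

```python
from typing import Callable, Dict, List, Sequence


def run_redundant(kernel: Callable[[int], int],
                  inputs: Sequence[int]) -> List[int]:
    """Run each logical work-item twice and compare the two signatures."""
    signatures: Dict[int, int] = {}
    results: List[int] = [0] * len(inputs)

    # Launch twice as many work-items as there are inputs.
    for work_item in range(2 * len(inputs)):
        # Software mapping: items 2k and 2k+1 share logical identifier k,
        # so both run the same code on the same data.
        logical_id = work_item // 2
        value = kernel(inputs[logical_id])

        if work_item % 2 == 0:
            signatures[logical_id] = value      # first copy records
        elif signatures[logical_id] != value:   # comparison point
            raise RuntimeError(f"mismatch in work-item pair {logical_id}")
        else:
            results[logical_id] = value
    return results


squares = run_redundant(lambda x: x * x, [1, 2, 3])
```

The comparison runs once per pair rather than after every instruction, which mirrors the abstract's point about comparing only at specified comparison points.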

    Determining the Vulnerability of Multi-Threaded Program Code to Soft Errors
    Patent application (status: granted)

    Publication No.: US20140331207A1

    Publication date: 2014-11-06

    Application No.: US14266131

    Filing date: 2014-04-30

    Abstract: The described embodiments include a program code testing system that determines the vulnerability of multi-threaded program code to soft errors. For multi-threaded program code, two or more threads from the program code may access shared architectural structures while the program code is being executed. The program code testing system determines accesses of architectural structures made by the two or more threads of the multi-threaded program code and uses the determined accesses to determine a time for which the program code is exposed to soft errors. From this time, the program code testing system determines a vulnerability of the program code to soft errors.
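
A rough sketch of turning access records into an exposure time follows. The abstract does not define how the time is computed, so the rule used here is an assumption: a value in a shared structure counts as exposed from the moment it is written until its last read before the next write. The names `exposure_fraction` and `per_thread` are hypothetical.

```python
from typing import Dict, List, Tuple


def exposure_fraction(per_thread: Dict[str, List[Tuple[float, str]]],
                      total_time: float) -> float:
    """Fraction of run time during which one shared structure is exposed.

    per_thread: for each thread, (time, kind) accesses to the structure,
                with kind either "write" or "read".
    """
    # Merge the accesses of all threads into one timeline.
    merged = sorted(ev for events in per_thread.values() for ev in events)

    exposed = 0.0
    last_write = None
    last_read = None
    for time, kind in merged:
        if kind == "write":
            # Close the previous interval: write -> last read of that value.
            if last_write is not None and last_read is not None:
                exposed += last_read - last_write
            last_write, last_read = time, None
        elif kind == "read" and last_write is not None:
            last_read = time
    # Close the final interval.
    if last_write is not None and last_read is not None:
        exposed += last_read - last_write
    return exposed / total_time


trace = {
    "t0": [(0, "write"), (5, "read"), (10, "write")],
    "t1": [(2, "read"), (6, "write"), (12, "read")],
}
fraction = exposure_fraction(trace, total_time=20)
```

A value that is overwritten without being read contributes nothing, since an error in it could not reach the program's output under this rule.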

    Hardware Based Redundant Multi-Threading Inside a GPU for Improved Reliability
    Patent application (status: granted)

    Publication No.: US20140181587A1

    Publication date: 2014-06-26

    Application No.: US13724968

    Filing date: 2012-12-21

    CPC classification number: G06F11/0778 G06F11/1482

    Abstract: A system and method for verifying computation output using computer hardware are provided. Instances of computation are generated and processed on hardware-based processors. As instances of computation are processed, each instance of computation receives a load accessible to other instances of computation. Instances of output are generated by processing the instances of computation. The instances of output are verified against each other in a hardware based processor to ensure accuracy of the output.

    HOST-LEVEL ERROR DETECTION AND FAULT CORRECTION

    Publication No.: US20250004873A1

    Publication date: 2025-01-02

    Application No.: US18678596

    Filing date: 2024-05-30

    Abstract: A processing system includes a processing device coupled to a memory configured to check for and correct faults in requested data. In response to correcting the faults of the requested data, the memory sends the corrected data and unused check bits to the processing device as a plurality of fetch returns. The memory also sends a parity fetch based on the corrected data and one or more operations to the processing device. After receiving the plurality of fetch returns and the unused check bits, the processing device checks each fetch return for faults based on the unused check bits. In response to determining that a fetch return includes a fault, the processing device erases the fetch return and reconstructs the fetch return based on one or more other received fetch returns and the parity fetch.
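
The erase-and-reconstruct step can be sketched as follows. The abstract says only that the parity fetch is based on the corrected data and "one or more operations"; bytewise XOR is assumed here because it is the common choice for single-erasure recovery. The helper names are hypothetical, and the check-bit test that identifies the faulty fetch return is not modeled: the caller supplies `faulty_index`.

```python
from functools import reduce
from typing import List


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(fetch_returns: List[bytes]) -> bytes:
    """Parity fetch: XOR of all fetch returns (assumed operation)."""
    return reduce(xor_bytes, fetch_returns)


def reconstruct(fetch_returns: List[bytes], parity: bytes,
                faulty_index: int) -> bytes:
    """Rebuild the erased fetch return from the others plus the parity."""
    survivors = [r for i, r in enumerate(fetch_returns) if i != faulty_index]
    return reduce(xor_bytes, survivors, parity)


data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = make_parity(data)

received = list(data)
received[1] = b"\x00\x00"          # fetch return 1 arrives corrupted
repaired = reconstruct(received, parity, faulty_index=1)
```

XOR parity recovers exactly one erased fetch return per parity fetch; a second simultaneous fault would need a stronger code.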
