Abstract:
A method and system for memory attack mitigation in a memory device includes receiving, at a memory controller, an allocation of a page in memory. One or more device controllers detects an aggressor-victim set within the memory. Based upon the detection, an address of the allocated page is identified for further action.
Abstract:
Techniques for performing in-memory matrix multiplication, taking into account temperature variations in the memory, are disclosed. In one example, the matrix multiplication memory uses ohmic multiplication and current summing to perform the dot products involved in matrix multiplication. One downside to this analog form of multiplication is that temperature affects the accuracy of the results. Thus techniques are provided herein to compensate for the effects of temperature increases on the accuracy of in-memory matrix multiplications. According to the techniques, portions of input matrices are classified as effective or ineffective. Effective portions are mapped to low temperature regions of the in-memory matrix multiplier and ineffective portions are mapped to high temperature regions of the in-memory matrix multiplier. The matrix multiplication is then performed.
Abstract:
Systems, apparatuses, and methods for managing buffers in a neural network implementation with heterogeneous memory are disclosed. A system includes a neural network coupled to a first memory and a second memory. The first memory is a relatively low-capacity, high-bandwidth memory while the second memory is a relatively high-capacity, low-bandwidth memory. During a forward propagation pass of the neural network, a run-time manager monitors the usage of the buffers for the various layers of the neural network. During a backward propagation pass of the neural network, the run-time manager determines how to move the buffers between the first and second memories based on the monitored buffer usage during the forward propagation pass. As a result, the run-time manager is able to reduce memory access latency for the layers of the neural network during the backward propagation pass.
Abstract:
A system, method and computer program product to execute a first and a second work-item, and compare the signature variable of the first work-item to the signature variable of the second work-item. The first and the second work-items are mapped to an identifier via software. This mapping ensures that the first and second work-items execute exactly the same data for exactly the same code without changes to the underlying hardware. By executing the first and second work-items independently, the underlying computation of the first and second work-item can be verified. Moreover, system performance is not substantially affected because the execution results of the first and second work-items are compared only at specified comparison points.
Abstract:
The described embodiments include a program code testing system that determines the vulnerability of multi-threaded program code to soft errors. For multi-threaded program code, two to more threads from the program code may access shared architectural structures while the program code is being executed. The program code testing system determines accesses of architectural structures made by the two or more threads of the multi-threaded program code and uses the determined accesses to determine a time for which the program code is exposed to soft errors. From this time, the program code testing system determines a vulnerability of the program code to soft errors.
Abstract:
A system and method for verifying computation output using computer hardware are provided. Instances of computation are generated and processed on hardware-based processors. As instances of computation are processed, each instance of computation receives a load accessible to other instances of computation. Instances of output are generated by processing the instances of computation. The instances of output are verified against each other in a hardware based processor to ensure accuracy of the output.
Abstract:
A processing system includes a processing device coupled to a memory configured to check for and correct faults in requested data. In response to correcting the faults of the requested data, the memory sends the corrected data and unused check bits to the processing device as a plurality of fetch returns. The memory also sends a parity fetch based on the corrected data and one or more operations to the processing device. After receiving the plurality of fetch returns and the unused check bits, the processing device checks each fetch return for faults based on the unused check bits. In response to determining that a fetch return includes a fault, the processing device erases the fetch return and reconstructs the fetch return based on one or more other received fetch returns and the parity fetch.
Abstract:
A memory system uses error detection codes to detect when errors have occurred in a region of memory. A count of the number of errors is kept and a notification is output in response to the number of errors satisfying a threshold value. The notification is an indication to a host (e.g., a program accessing or managing a machine learning system) that the threshold number of errors have been detected in the region of memory. As long as the number of errors that have been detected in the region of memory remains under the threshold number no notification need be output to the host.
Abstract:
A processing system employs techniques for enhancing dynamic random access memory (DRAM) page retirement to facilitate identification and retirement of pages affected by multi-page DRAM faults. In response to detecting an uncorrectable error at a first page of DRAM, the processing system identifies a second page of the DRAM for potential retirement based on one or more of physical proximity to the first page, inclusion in a range of addresses stored at a fault map that tracks addresses of DRAM pages having detected faults, and predicting a set of pages to check for faults based on misses at a translation lookaside buffer (TLB).
Abstract:
An exemplary computing device includes a plurality of circuits and/or a plurality of in-situ monitors configured to generate outputs that indicate one or more operating conditions of the circuits. The computing device also includes a system management unit configured to detect a potentially faulty voltage-to-frequency ratio implemented by one of the circuits based at least in part on one or more of the outputs. The system management unit is also configured to modify the potentially faulty voltage-to-frequency ratio based at least in part on one or more of the outputs. Various other devices, systems, and methods are also disclosed.