Abstract:
A method of partitioning a data cache comprising a plurality of sets, the plurality of sets comprising a plurality of ways, is provided. Responsive to a stack data request, the method stores a cache line associated with the stack data in one of a plurality of designated ways of the data cache, wherein the plurality of designated ways is configured to store all requested stack data.
Abstract:
In some embodiments, a method of managing cache memory includes identifying a group of cache lines in a cache memory, based on a correlation between the cache lines. The method also includes tracking evictions of cache lines in the group from the cache memory and, in response to a determination that a criterion regarding eviction of cache lines in the group from the cache memory is satisfied, selecting one or more (e.g., all) remaining cache lines in the group for eviction.
Abstract:
A method of way prediction for a data cache having a plurality of ways is provided. Responsive to an instruction to access a stack data block, the method accesses identifying information associated with a plurality of most recently accessed ways of a data cache to determine whether the stack data block resides in one of the plurality of most recently accessed ways of the data cache, wherein the identifying information is accessed from a subset of an array of identifying information corresponding to the plurality of most recently accessed ways; and when the stack data block resides in one of the plurality of most recently accessed ways of the data cache, the method accesses the stack data block from the data cache.
Abstract:
A processing system includes one or more processing units to perform operations and one or more sensors to measure a temperature concurrently with the one or more processing units performing the operations. The processing system also includes a controller to receive feedback indicating the temperature and to determine a peak temperature and a thermal time constant for heating of the processing system based on a comparison of the measured temperature to a first temperature that is predicted based on the peak temperature and a previously determined thermal time constant for heating. Some embodiments of the controller can control a performance state of the processing system based on the peak temperature and the thermal time constant for heating of the processing system.
Abstract:
A system and method for efficient predicting and processing of memory access dependencies. A computing system includes control logic that marks a detected load instruction as a first type responsive to predicting the load instruction has high locality and is a candidate for store-to-load (STL) data forwarding. The control logic marks the detected load instruction as a second type responsive to predicting the load instruction has low locality and is not a candidate for STL data forwarding. The control logic processes a load instruction marked as the first type as if the load instruction is dependent on an older store operation. The control logic processes a load instruction marked as the second type as if the load instruction is independent on any older store operation.
Abstract:
The described embodiments include a computing device with one or more entities (processor cores, processors, etc.). In some embodiments, during operation, a thermal power management unit in the computing device uses a linear prediction to compute a predicted duration of a next idle period for an entity based on the duration of one or more previous idle periods for the entity. Based on the predicted duration of the next idle period, the thermal power management unit configures the entity to operate in a corresponding idle state.
Abstract:
A level of cache memory receives modified data from a higher level of cache memory. A set of cache lines with an index associated with the modified data is identified. The modified data is stored in the set in a cache line with an eviction priority that is at least as high as an eviction priority, before the modified data is stored, of an unmodified cache line with a highest eviction priority among unmodified cache lines in the set.
Abstract:
A processing device includes a plurality of components and a system management unit to selectively schedule an application phase to one of the plurality of components based on one or more comparisons of predictions of a plurality of thermal impacts of executing the application phase on each of the plurality of components. The predictions may be generated based on a thermal history associated with the application phase, thermal sensitivities of the plurality of components, or a layout of the plurality of components in the processing device.
Abstract:
A circuit includes a plurality of synchronizers to adapt a signal from a first clock domain to a second clock domain. Each synchronizer of the plurality of synchronizers includes a synchronizer input to receive the signal from the first clock domain and a synchronizer output to provide the signal as adapted to the second clock domain. The circuit also includes a multiplexer (mux) that includes a plurality of mux inputs and a mux output. Each mux input is coupled to the synchronizer output of a respective synchronizer of the plurality of synchronizers. The mux output provides the signal, as adapted to the second clock domain, from the synchronizer output of a selected synchronizer of the plurality of synchronizers.
Abstract:
Embodiments are described for methods and systems for mapping virtual memory pages to physical memory pages by analyzing a sequence of memory-bound accesses to the virtual memory pages, determining a degree of contiguity between the accessed virtual memory pages, and mapping sets of the accessed virtual memory pages to respective single physical memory pages. Embodiments are also described for a method for increasing locality of memory accesses to DRAM in virtual memory systems by analyzing a pattern of virtual memory accesses to identify contiguity of accessed virtual memory pages, predicting contiguity of the accessed virtual memory pages based on the pattern, and mapping the identified and predicted contiguous virtual memory pages to respective single physical memory pages.