Abstract:
Prefetching functionality on a logic die stacked with memory is described herein. A device includes a logic chip stacked with a memory chip. The logic chip includes a control block, an in-stack prefetch request handler and a memory controller. The control block receives memory requests from an external source and determines availability of the requested data in the in-stack prefetch request handler. If the data is available, the control block sends the requested data to the external source. If the data is not available, the control block obtains the requested data via the memory controller. The in-stack prefetch request handler includes a prefetch controller, a prefetcher and a prefetch buffer. The prefetcher monitors the memory requests and based on observed patterns, issues additional prefetch requests to the memory controller.
Abstract:
A method and apparatus are described for performing sprinting in a processor. An analyzer in the processor may monitor thermal capacity remaining in the processor while not sprinting. When the remaining thermal capacity is sufficient to support sprinting, the analyzer may perform sprinting of a new workload when a benefit derived by sprinting the new workload exceeds a threshold and does not cause the remaining thermal capacity in the processor to be exhausted. The analyzer may perform sprinting of the new workload in accordance with sprinting parameters determined for the new workload. The analyzer may continue to monitor the remaining thermal capacity while not sprinting when the benefit derived by sprinting the new workload does not exceed the threshold.
Abstract:
A data processing system includes a plurality of processor resources, a manager, and a power distributor. Each of the plurality of data processor cores is operable at a selected one of a plurality of performance states. The manager assigns each of a plurality of program elements to one of the plurality of processor resources, and synchronizing the program elements using barriers. The power distributor is coupled to the manager and to the plurality of processor resources, and assigns a performance state to each of the plurality of processor resources within an overall power budget, and in response to detecting that a program element assigned to a first processor resource is at a barrier, increases the performance state of a second processor resource that is not at the barrier within the overall power budget.
Abstract:
A method and apparatus are described for performing sprinting in a processor. An analyzer in the processor may monitor thermal capacity remaining in the processor while not sprinting. When the remaining thermal capacity is sufficient to support sprinting, the analyzer may perform sprinting of a new workload when a benefit derived by sprinting the new workload exceeds a threshold and does not cause the remaining thermal capacity in the processor to be exhausted. The analyzer may perform sprinting of the new workload in accordance with sprinting parameters determined for the new workload. The analyzer may continue to monitor the remaining thermal capacity while not sprinting when the benefit derived by sprinting the new workload does not exceed the threshold.
Abstract:
A system and method are presented. Some embodiments include a processing unit, at least one memory coupled to the processing unit, and at least one cache coupled to the processing unit and divided into a series of blocks, wherein at least one of the series of cache blocks includes data identified as being in a modified state. The modified state data is flushed by writing the data to the at least one memory based on a write back policy and the aggressiveness of the policy is based on at least one factor including the number of idle cores, the proximity of the last cache flush, the activity of the thread associated with the data, and which cores are idle and if the idle core is associated with the data.
Abstract:
A system and method are presented. Some embodiments include a processing unit, at least one memory coupled to the processing unit, and at least one cache coupled to the processing unit and divided into a series of blocks, wherein at least one of the series of cache blocks includes data identified as being in a modified state. The modified state data is flushed by writing the data to the at least one memory based on a write back policy and the aggressiveness of the policy is based on at least one factor including the number of idle cores, the proximity of the last cache flush, the activity of the thread associated with the data, and which cores are idle and if the idle core is associated with the data.