Abstract:
A die-stacked memory device incorporates a reconfigurable logic device to provide implementation flexibility in performing various data manipulation operations and other memory operations that use data stored in the die-stacked memory device or that result in data that is to be stored in the die-stacked memory device. One or more configuration files representing corresponding logic configurations for the reconfigurable logic device can be stored in a configuration store at the die-stacked memory device, and a configuration controller can program a reconfigurable logic fabric of the reconfigurable logic device using a selected one of the configuration files. Due to the integration of the logic dies and the memory dies, the reconfigurable logic device can perform various data manipulation operations with higher bandwidth and lower latency and power consumption compared to devices external to the die-stacked memory device.
Abstract:
An accelerator device includes a first processing unit to access a structure of a graph dataset, and a second processing unit coupled with the first processing unit to perform computations based on data values in the graph dataset.
Abstract:
The described embodiments include a computing device with one or more entities (processor cores, processors, etc.). In some embodiments, during operation, a thermal power management unit in the computing device uses a linear prediction to compute a predicted duration of a next idle period for an entity based on the duration of one or more previous idle periods for the entity. Based on the predicted duration of the next idle period, the thermal power management unit configures the entity to operate in a corresponding idle state.
Abstract:
A die-stacked memory device incorporates a reconfigurable logic device to provide implementation flexibility in performing various data manipulation operations and other memory operations that use data stored in the die-stacked memory device or that result in data that is to be stored in the die-stacked memory device. One or more configuration files representing corresponding logic configurations for the reconfigurable logic device can be stored in a configuration store at the die-stacked memory device, and a configuration controller can program a reconfigurable logic fabric of the reconfigurable logic device using a selected one of the configuration files. Due to the integration of the logic dies and the memory dies, the reconfigurable logic device can perform various data manipulation operations with higher bandwidth and lower latency and power consumption compared to devices external to the die-stacked memory device.
Abstract:
A system includes an atomic processing engine (APE) coupled to an interconnect. The interconnect is to couple to one or more processor cores. The APE receives a plurality of commands from the one or more processor cores through the interconnect. In response to a first command, the APE performs a first plurality of operations associated with the first command. The first plurality of operations references multiple memory locations, at least one of which is shared between two or more threads executed by the one or more processor cores.
Abstract:
A method and apparatus for exiting a low power state based on a prior prediction is disclosed. An integrated circuit (IC) includes a functional unit configured to, during operation, cycle between intervals of an active state and intervals of an idle state. The IC also include a power management unit configured to place the functional unit in a low power state responsive to the functional unit entering the idle state. The power management unit is further configured to preemptively cause the functional unit to exit the low power state at a predetermined time after entering the low power. The predetermined time is based on a prediction of idle state duration made prior to entering the low power state. The prediction may be generated by a prediction unit, based on a history of durations of intervals in which the functional unit was in the idle state.
Abstract:
A multi-core data processor includes multiple data processor cores and a circuit. The multiple data processor cores each include a power state controller having a first input for receiving an idle signal, a second input for receiving a release signal, a third input for receiving a control signal, and an output for providing a current power state. In response to the idle signal, the power state controller causes a corresponding data processor core to enter an idle state. In response to the release signal, the power state controller changes the current power state from the idle state to an active state in dependence on the control signal. The circuit is coupled to each of the multiple data processor cores for providing the control signal in response to current power states in the multiple data processor cores.
Abstract:
A method and apparatus for idle phase prediction in integrated circuits is disclosed. In one embodiment, an integrated circuit (IC) includes a functional unit configured to cycle between intervals of an active state and an idle state. The IC further includes a prediction unit configured to record a history of idle state durations for a plurality of intervals of the idle state. Based on the history of idle state durations, the prediction unit is configured to generate a prediction of the duration of the next interval of the idle state. The prediction may be used by a power management unit to, among other uses, determine whether to place the functional unit in a low power (e.g., sleep) state.
Abstract:
A die-stacked memory device implements an integrated coherency manager to offload cache coherency protocol operations for the devices of a processing system. The die-stacked memory device includes a set of one or more stacked memory dies and a set of one or more logic dies. The one or more logic dies implement hardware logic providing a memory interface and the coherency manager. The memory interface operates to perform memory accesses in response to memory access requests from the coherency manager and the one or more external devices. The coherency manager comprises logic to perform coherency operations for shared data stored at the stacked memory dies. Due to the integration of the logic dies and the memory dies, the coherency manager can access shared data stored in the memory dies and perform related coherency operations with higher bandwidth and lower latency and power consumption compared to the external devices.
Abstract:
The described embodiments include a computing device with one or more entities (processor cores, processors, etc.). In some embodiments, during operation, a thermal power management unit in the computing device uses a linear prediction to compute a predicted duration of a next idle period for an entity based on the duration of one or more previous idle periods for the entity. Based on the predicted duration of the next idle period, the thermal power management unit configures the entity to operate in a corresponding idle state.