Abstract:
A system, method, and computer program product are provided for a memory device system. One or more memory dies and at least one logic die are disposed in a package and communicatively coupled. The logic die comprises a processing device configurable to manage virtual memory and operate in an operating mode. The operating mode is selected from a set of operating modes comprising a slave operating mode and a host operating mode.
Abstract:
A system and method for efficiently powering down banks in a cache memory for reducing power consumption. A computing system includes a cache array and a corresponding cache controller. The cache array includes multiple banks, each comprising multiple cache sets. In response to a request to power down a first bank of the multiple banks in the cache array, the cache controller selects a cache line of a given type in the first bank and determines whether a respective locality of reference for the selected cache line exceeds a threshold. If the threshold is exceeded, then the selected cache line is migrated to a second bank in the cache array. If the threshold is not exceeded, then the selected cache line is written back to lower-level memory.
Abstract:
Embodiments disclose a system and method for reducing virtual address translation latency in a wide execution engine that implements virtual memory. One example method describes a method comprising receiving a wavefront, classifying the wavefront into a subset based on classification criteria selected to reduce virtual address translation latency associated with a memory support structure, and scheduling the wavefront for processing based on the classifying.
Abstract:
Described is a system and method for a multi-level memory hierarchy. Each level is based on different attributes including, for example, power, capacity, bandwidth, reliability, and volatility. In some embodiments, the different levels of the memory hierarchy may use an on-chip stacked dynamic random access memory, (providing fast, high-bandwidth, low-energy access to data) and an off-chip non-volatile random access memory, (providing low-power, high-capacity storage), in order to provide higher-capacity, lower power, and higher-bandwidth performance. The multi-level memory may present a unified interface to a processor so that specific memory hardware and software implementation details are hidden. The multi-level memory enables the illusion of a single-level memory that satisfies multiple conflicting constraints. A comparator receives a memory address from the processor, processes the address and reads from or writes to the appropriate memory level. In some embodiments, the memory architecture is visible to the software stack to optimize memory utilization.
Abstract:
A die-stacked memory device incorporates a data translation controller at one or more logic dies of the device to provide data translation services for data to be stored at, or retrieved from, the die-stacked memory device. The data translation operations implemented by the data translation controller can include compression/decompression operations, encryption/decryption operations, format translations, wear-leveling translations, data ordering operations, and the like. Due to the tight integration of the logic dies and the memory dies, the data translation controller can perform data translation operations with higher bandwidth and lower latency and power consumption compared to operations performed by devices external to the die-stacked memory device.
Abstract:
The described embodiments comprise a selection mechanism that selects a resource from a set of resources in a computing device for performing an operation. In some embodiments, the selection mechanism performs a lookup in a table selected from a set of tables to identify a resource from the set of resources. When the resource is not available for performing the operation and until another resource is selected for performing the operation, the selection mechanism identifies a next resource in the table and selects the next resource for performing the operation when the next resource is available for performing the operation.
Abstract:
Methods, systems and computer readable storage mediums for more efficient and flexible scheduling of tasks on an asymmetric processing system having at least one host processor and one or more slave processors, are disclosed. An example embodiment includes, determining a data access requirement of a task, comparing the data access requirement to respective local memories of the one or more slave processors selecting a slave processor from the one or more slave processors based upon the comparing, and running the task on the selected slave processor.
Abstract:
A system, method, and memory device embodying some aspects of the present invention for remapping external memory addresses and internal memory locations in stacked memory are provided. The stacked memory includes one or more memory layers configured to store data. The stacked memory also includes a logic layer connected to the memory layer. The logic layer has an Input/Output (I/O) port configured to receive read and write commands from external devices, a memory map configured to maintain an association between external memory addresses and internal memory locations, and a controller coupled to the I/O port, memory map, and memory layers, configured to store data received from external devices to internal memory locations.
Abstract:
The described embodiments comprise a selection mechanism that selects a resource from a set of resources in a computing device for performing an operation. In some embodiments, the selection mechanism is configured to perform a lookup in a table selected from a set of tables to identify a resource from the set of resources. When the identified resource is not available for performing the operation and until a resource is selected for performing the operation, the selection mechanism is configured to identify a next resource in the table and select the next resource for performing the operation when the next resource is available for performing the operation.
Abstract:
Methods, systems and computer readable storage mediums for more efficient and flexible scheduling of tasks on an asymmetric processing system having at least one host processor and one or more slave processors, are disclosed. An example embodiment includes, determining a data access requirement of a task, comparing the data access requirement to respective local memories of the one or more slave processors selecting a slave processor from the one or more slave processors based upon the comparing, and running the task on the selected slave processor.