Abstract:
A processing device implementing stride-based translation lookaside buffer (TLB) prefetching with adaptive offset is disclosed. A processing device of the disclosure includes a data prefetcher to generate a data prefetch address based on a linear address, a stride, or a prefetch distance, the data prefetch address associated with a data prefetch request, and a TLB prefetch address computation component to generate a TLB prefetch address based on the linear address, the stride, the prefetch distance, or an adaptive offset. The processing device also includes a cross page detection component to determine that the data prefetch address or the TLB prefetch address crosses a page boundary associated with the linear address, and to cause a TLB prefetch request to be written to a TLB request queue, the TLB prefetch request for translation of a linear page number (LPN) based on the data prefetch address or the TLB prefetch address.
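A minimal Python sketch of the flow described above, assuming 4 KiB pages and modeling the TLB request queue as a plain list; the function and variable names are illustrative, not the patent's:

```python
PAGE_SHIFT = 12  # assume 4 KiB pages

def page_number(addr):
    """Linear page number (LPN) of a linear address."""
    return addr >> PAGE_SHIFT

def generate_prefetches(linear_addr, stride, distance, adaptive_offset, tlb_queue):
    # data prefetch address from the linear address, stride, and prefetch distance
    data_pf_addr = linear_addr + stride * distance
    # TLB prefetch address additionally applies the adaptive offset
    tlb_pf_addr = data_pf_addr + adaptive_offset
    # cross-page detection: enqueue a TLB prefetch request whenever either
    # address leaves the page of the demand linear address
    base_lpn = page_number(linear_addr)
    for addr in (data_pf_addr, tlb_pf_addr):
        lpn = page_number(addr)
        if lpn != base_lpn and lpn not in tlb_queue:
            tlb_queue.append(lpn)  # request translation of this LPN
    return data_pf_addr, tlb_pf_addr

queue = []
generate_prefetches(0x1234, stride=0x200, distance=8, adaptive_offset=0x800, tlb_queue=queue)
# queue now holds [2]: both prefetch addresses fall on page 2, past the demand page
```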
Abstract:
A method, an apparatus, and a non-transitory computer readable medium for tracking prefetches generated by a stride prefetcher are presented. A stride is the difference between consecutive addresses in an address stream. Responsive to a prefetcher table entry for an address stream locking on a stride, prefetch suppression logic is updated, and prefetches from the prefetcher table entry are suppressed when suppression is enabled for that entry. A prefetch request is issued from the prefetcher table entry when suppression is not enabled for it.
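A toy Python sketch of a single prefetcher table entry; the lock-on-repeat policy and the external suppression_logic callback are assumptions, since the abstract does not specify how suppression becomes enabled:

```python
class StrideEntry:
    """One prefetcher table entry for an address stream (illustrative)."""
    def __init__(self):
        self.last_addr = None
        self.stride = None
        self.locked = False
        self.suppress = False

    def observe(self, addr, suppression_logic):
        """Train on one address; update suppression once the entry locks."""
        if self.last_addr is not None:
            stride = addr - self.last_addr  # difference between consecutive addresses
            self.locked = stride == self.stride and stride != 0
            self.stride = stride
        self.last_addr = addr
        if self.locked:
            # the 'prefetch suppression logic is updated' step; the
            # enable criterion is delegated to the callback here
            self.suppress = suppression_logic(self)

    def issue(self):
        """Return a prefetch address, or None when suppressed or untrained."""
        if self.locked and not self.suppress:
            return self.last_addr + self.stride
        return None

entry = StrideEntry()
for a in (100, 164, 228, 292):
    entry.observe(a, suppression_logic=lambda e: False)
print(entry.issue())  # 356: locked on stride 64, suppression not enabled
```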
Abstract:
A processor comprising: an instruction processing pipeline, configured to receive a sequence of instructions for execution, said sequence comprising at least one instruction including a flow control instruction which terminates the sequence; a hash generator, configured to generate a hash associated with execution of the sequence of instructions; a memory configured to securely receive a reference signature corresponding to a hash of a verified corresponding sequence of instructions; verification logic configured to determine a correspondence between the hash and the reference signature; and authorization logic configured to selectively produce a signal, in dependence on a degree of correspondence of the hash with the reference signature.
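As a software analogy of the hardware just claimed, a short Python sketch using SHA-256 as the hash generator and a constant-time comparison as the verification logic; the choice of SHA-256 is purely an assumption, and exact equality stands in for the claimed degree of correspondence:

```python
import hashlib
import hmac

def authorize_sequence(instruction_bytes, reference_signature):
    """Hash an executed instruction sequence (up to and including its
    terminating flow control instruction) and compare against a securely
    provisioned reference signature."""
    digest = hashlib.sha256(instruction_bytes).digest()  # the hash generator
    # verification logic: constant-time comparison of hash vs. reference
    match = hmac.compare_digest(digest, reference_signature)
    return match  # the authorization signal

reference = hashlib.sha256(b"\x55\x48\x89\xe5\xc3").digest()  # hypothetical code bytes
print(authorize_sequence(b"\x55\x48\x89\xe5\xc3", reference))  # True
```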
Abstract:
A data prefetcher in a microprocessor having a cache memory receives memory accesses, each to an address within a memory block. The access addresses may increase or decrease non-monotonically as a function of time. As the accesses are received, the prefetcher maintains the largest and smallest access addresses, counts of changes to each, and a history of the recently accessed cache lines implicated by the access addresses within the memory block. From the counts the prefetcher determines a predominant access direction, and from the history a predominant access pattern. It then prefetches into the cache memory, in the predominant access direction and according to the predominant access pattern, cache lines of the memory block that the history indicates have not been recently accessed.
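A compact Python sketch of the bookkeeping, assuming 64-byte lines and a 64-line memory block; the pattern-matching step over the history is reduced here to "skip lines already seen", a simplification of the claimed predominant access pattern:

```python
LINE_SHIFT = 6  # assume 64-byte cache lines

class BlockTracker:
    """Per-memory-block state kept by the prefetcher (illustrative)."""
    def __init__(self, block_base, block_lines=64):
        self.base = block_base
        self.lines = block_lines
        self.min_addr = None
        self.max_addr = None
        self.min_changes = 0  # count of changes to the smallest address
        self.max_changes = 0  # count of changes to the largest address
        self.history = set()  # recently accessed line indices in the block

    def access(self, addr):
        if self.min_addr is None or addr < self.min_addr:
            self.min_addr = addr
            self.min_changes += 1
        if self.max_addr is None or addr > self.max_addr:
            self.max_addr = addr
            self.max_changes += 1
        self.history.add((addr - self.base) >> LINE_SHIFT)

    def direction(self):
        # predominant access direction from the two change counts
        return +1 if self.max_changes >= self.min_changes else -1

    def prefetch_candidates(self):
        """Lines not recently accessed, ordered in the predominant direction."""
        order = range(self.lines) if self.direction() > 0 else reversed(range(self.lines))
        return [i for i in order if i not in self.history]
```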
Abstract:
Methods and systems for prefetching data for a processor are provided. A system is configured for, and a method includes, selecting one of a first prefetching control logic and a second prefetching control logic of the processor as a candidate feature, capturing a performance metric of the processor over an inactive sample period when the candidate feature is inactive, capturing the performance metric over an active sample period when the candidate feature is active, comparing the performance metric for the active and inactive sample periods, and setting the status of the candidate feature to enabled when the metric in the active period indicates improvement over the metric in the inactive period, and to disabled when the metric in the inactive period indicates improvement over the metric in the active period.
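A minimal Python sketch of the sample-and-compare loop; the Feature stub, the measure_perf callback, and "higher metric means improvement" are all assumptions:

```python
class Feature:
    """Stand-in for one prefetching control logic block of the processor."""
    def __init__(self):
        self.active = False

    def set_active(self, on):
        self.active = on

def tune_feature(feature, measure_perf, period):
    """Sample a performance metric with the candidate feature inactive,
    then active, and keep whichever setting measured better."""
    feature.set_active(False)
    perf_inactive = measure_perf(period)   # inactive sample period
    feature.set_active(True)
    perf_active = measure_perf(period)     # active sample period
    enabled = perf_active > perf_inactive  # assumes higher metric = improvement
    feature.set_active(enabled)
    return enabled

f = Feature()
# a fake metric that rewards the active setting, for demonstration only
print(tune_feature(f, lambda period: 2.0 if f.active else 1.0, period=1000))  # True
```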
Abstract:
Examples of enabling cache read optimization for mobile memory devices are described. One or more access commands may be received, from a host, at a memory device. The one or more access commands may instruct the memory device to access at least two data blocks. The memory device may generate pre-fetch information for the at least two data blocks based at least in part on an order of accessing the at least two data blocks.
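One way to realize such pre-fetch information in Python: record, for each block, which block most often follows it in the host's access order; the successor-frequency table is an assumption about what the pre-fetch information contains:

```python
from collections import defaultdict

class PrefetchHints:
    """Order-based pre-fetch information kept by the memory device (sketch)."""
    def __init__(self):
        self.follows = defaultdict(lambda: defaultdict(int))
        self.last_block = None

    def on_access(self, block):
        """Observe one access command's target block."""
        if self.last_block is not None:
            self.follows[self.last_block][block] += 1
        self.last_block = block

    def prefetch_for(self, block):
        """Block most likely to be read next, or None if unknown."""
        successors = self.follows.get(block)
        if not successors:
            return None
        return max(successors, key=successors.get)

hints = PrefetchHints()
for b in (7, 9, 7, 9, 7, 12):
    hints.on_access(b)
print(hints.prefetch_for(7))  # 9: block 9 followed block 7 twice, block 12 once
```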
Abstract:
Embodiments of the present invention disclose a data cache apparatus, a data storage system, and a data storage method. The data cache apparatus is coupled to a data processing device through a data interface. The data cache apparatus includes a controller and a memory coupled to the controller. The memory is configured to cache hot data from a hard disk that is connected to the data cache apparatus. The controller is configured to read data from or write data into the memory according to a data reading/writing request of the data processing device.
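A small Python sketch of the controller's read and write paths; the LRU replacement and write-through policies are assumptions, since the abstract only says hot data from the hard disk is cached:

```python
from collections import OrderedDict

class HotDataCache:
    """Controller plus memory of the data cache apparatus (illustrative)."""
    def __init__(self, capacity, disk):
        self.capacity = capacity
        self.disk = disk          # dict-like stand-in for the hard disk
        self.mem = OrderedDict()  # the memory caching hot data

    def read(self, key):
        if key in self.mem:
            self.mem.move_to_end(key)  # hit: refresh recency
            return self.mem[key]
        value = self.disk[key]         # miss: fetch from the hard disk
        self._install(key, value)
        return value

    def write(self, key, value):
        self.disk[key] = value         # write-through, an assumed policy
        self._install(key, value)

    def _install(self, key, value):
        self.mem[key] = value
        self.mem.move_to_end(key)
        if len(self.mem) > self.capacity:
            self.mem.popitem(last=False)  # evict the least recently used entry
```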
Abstract:
A method, computer readable medium, and apparatus for read-ahead prediction of subsequent requests to send data between a client and a server coupled via a network include receiving, at a traffic management device, a request for a part of at least one of a data file and metadata. The traffic management device selects from two or more of a sequential prediction engine, an expert prediction engine, and a learning prediction engine to predict a read-ahead of the at least one of the data file and metadata. Based on the selecting, the traffic management device determines one or more additional read-ahead parts of the at least one of the data file and metadata.
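A Python sketch of the selection step, with a sequential engine as the only worked example; the choose callback and the (offset, length) request shape are assumptions:

```python
class SequentialEngine:
    """Sequential prediction engine: assume the client keeps reading
    contiguously and predict the next n parts."""
    def __init__(self, n=2):
        self.n = n

    def predict(self, offset, length):
        return [(offset + (i + 1) * length, length) for i in range(self.n)]

def read_ahead(request, engines, choose):
    """The traffic management device picks one of two or more engines
    and asks it for additional read-ahead parts."""
    offset, length = request
    engine = choose(request, engines)  # selection policy left abstract
    return engine.predict(offset, length)

parts = read_ahead((0, 4096), [SequentialEngine()], choose=lambda req, es: es[0])
print(parts)  # [(4096, 4096), (8192, 4096)]
```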
Abstract:
A scatter/gather technique optimizes unstructured streaming memory accesses, providing off-chip bandwidth efficiency by accessing only useful data at a fine granularity, and off-loading memory access overhead by supporting address calculation, data shuffling, and format conversion.
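The software analogue of the two operations, in Python; real implementations do the address calculation, shuffling, and format conversion in hardware, so this only illustrates the access pattern:

```python
def gather(memory, base, indices, scale=1):
    """Gather: read only the useful elements at base + index * scale."""
    return [memory[base + i * scale] for i in indices]

def scatter(memory, base, indices, values, scale=1):
    """Scatter: write each value back to its computed address."""
    for i, v in zip(indices, values):
        memory[base + i * scale] = v

mem = list(range(32))
picked = gather(mem, base=0, indices=[3, 11, 19])   # [3, 11, 19]
scatter(mem, base=0, indices=[3, 11, 19], values=[v * 10 for v in picked])
print(mem[3], mem[11], mem[19])  # 30 110 190
```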
Abstract:
The disclosed embodiments provide a system that filters pre-fetch requests to reduce pre-fetching overhead. During operation, the system executes an instruction that involves a memory reference that is directed to a cache line in a cache. Upon determining that the memory reference will miss in the cache, the system determines whether the instruction frequently leads to cache misses. If so, the system issues a pre-fetch request for one or more additional cache lines. Otherwise, no pre-fetch request is sent. Filtering pre-fetch requests based on instructions' likelihood to miss reduces pre-fetching overhead while preserving the performance benefits of pre-fetching.
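A minimal Python sketch of the filter; tracking misses per program counter (PC) against a fixed threshold is an assumed realization of "frequently leads to cache misses":

```python
from collections import defaultdict

class MissFilter:
    """Issue a pre-fetch only for instructions that miss often."""
    def __init__(self, threshold=4):
        self.misses = defaultdict(int)  # per-PC miss counter
        self.threshold = threshold

    def should_prefetch(self, pc):
        """Call on a cache miss; True once this PC misses frequently."""
        self.misses[pc] += 1
        return self.misses[pc] >= self.threshold

f = MissFilter(threshold=2)
print(f.should_prefetch(0x400123))  # False: first miss for this instruction
print(f.should_prefetch(0x400123))  # True: now considered a frequent misser
```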