Abstract:
A system for working with shared memory includes a plurality of contexts, each having executable processes that write and read data; a ring buffer in the shared memory for writing and reading of data by the contexts; and a software primitive that manages access attempts by the contexts to the ring buffer. Each context, upon writing to the ring buffer, is allocated an amount of space up to the maximum available at that moment. The software primitive guarantees consistency of the data written to the ring buffer and permits simultaneous writing into the buffer by multiple contexts. After finishing writing to the buffer, a context updates the state of the buffer by decrementing the count of active writers and/or by shifting the permitting pointers used to communicate with writers and readers. A context can read from the buffer only data that is marked as valid for reading by the context that wrote it.
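Since the abstract does not spell out the reservation and publication mechanics, the following is a minimal C sketch, assuming C11 atomics, a power-of-two capacity, a single reader, and in-order commits; all names (ring_t, ring_write, ring_read) are invented for illustration, not taken from the patent.

    /* Multi-writer ring buffer sketch: writers atomically reserve
     * space, copy their payload, then publish it, so a reader only
     * ever sees data its writer has marked valid. */
    #include <stdatomic.h>
    #include <stdint.h>

    #define RING_CAP 1024              /* must be a power of two */

    typedef struct {
        _Atomic uint64_t reserve;      /* next byte a writer may claim  */
        _Atomic uint64_t commit;       /* bytes published (valid to read) */
        _Atomic uint64_t read;         /* bytes consumed by the reader  */
        uint8_t          buf[RING_CAP];
    } ring_t;

    /* Writer: claim `len` bytes, copy the payload, then publish.
     * Writers publish in reservation order, so `commit` never
     * exposes a gap of unwritten bytes. */
    static int ring_write(ring_t *r, const void *src, uint64_t len)
    {
        uint64_t start = atomic_load(&r->reserve);
        for (;;) {
            uint64_t used = start - atomic_load(&r->read);
            if (used + len > RING_CAP)
                return -1;             /* not enough free space now */
            if (atomic_compare_exchange_weak(&r->reserve, &start,
                                             start + len))
                break;                 /* reservation succeeded */
        }
        for (uint64_t i = 0; i < len; i++)
            r->buf[(start + i) & (RING_CAP - 1)] =
                ((const uint8_t *)src)[i];
        /* Wait until earlier reservations are published, then publish
         * ours: the "mark valid for reading" step from the abstract. */
        while (atomic_load(&r->commit) != start)
            ;                          /* spin: in-order commit */
        atomic_store(&r->commit, start + len);
        return 0;
    }

    /* Reader: consume only bytes at or below `commit`. */
    static uint64_t ring_read(ring_t *r, void *dst, uint64_t max)
    {
        uint64_t head = atomic_load(&r->read);
        uint64_t avail = atomic_load(&r->commit) - head;
        uint64_t n = avail < max ? avail : max;
        for (uint64_t i = 0; i < n; i++)
            ((uint8_t *)dst)[i] = r->buf[(head + i) & (RING_CAP - 1)];
        atomic_store(&r->read, head + n);
        return n;
    }

The in-order commit is one simple way to realize "marked as valid"; a design could instead track per-slot validity flags so late writers do not delay earlier readers.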
Abstract:
A memory system is disclosed. The memory system includes first and second memory devices, and a memory controller configured to selectively enable one of the memory devices, the memory controller having a first line coupled to the first and second memory devices and a second line coupled to the first and second memory devices. The first memory device is configured to provide a notification to the memory controller on the first line and the second memory device is configured to provide a notification to the memory controller on the second line. The first memory device is further configured not to load the first line and the second memory device is further configured not to load the second line when the memory controller is writing to the enabled memory device.
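To make the signaling concrete, here is a small C model, purely illustrative and not the patented circuit, of the scheme: each device reports on its own notification line, and both devices release (do not load) their lines while the controller is writing to the enabled device. All types and names are invented for this sketch.

    #include <stdbool.h>
    #include <stdio.h>

    typedef enum { DEV0, DEV1 } dev_id;

    typedef struct {
        bool busy;              /* device wants to signal "busy" */
    } device_t;

    typedef struct {
        device_t dev[2];
        dev_id   enabled;       /* which device the controller selected */
        bool     writing;       /* controller is driving a write now    */
    } memsys_t;

    /* Level the controller samples on notification line `id`. While a
     * write to the enabled device is in progress, neither device loads
     * its line, so the line reads as released (pulled up). */
    static bool line_level(const memsys_t *s, dev_id id)
    {
        if (s->writing)
            return true;             /* lines released during the write */
        return !s->dev[id].busy;     /* low level = busy notification   */
    }

    int main(void)
    {
        memsys_t s = { .dev = { { true }, { false } },
                       .enabled = DEV0, .writing = false };
        printf("line0=%d line1=%d\n", line_level(&s, DEV0),
               line_level(&s, DEV1));
        s.writing = true;            /* write begins: lines release */
        printf("line0=%d line1=%d\n", line_level(&s, DEV0),
               line_level(&s, DEV1));
        return 0;
    }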
Abstract:
Systems, processors, and methods for efficiently handling concurrent store and load operations within a processor. A processor comprises a load-store unit (LSU) with a banked level-one (L1) data cache. When a store operation is ready to write data to the L1 data cache, the store operation will skip the write to any banks that have a conflict with a concurrent load operation. A partial write of the store operation will be performed to those banks of the L1 data cache that do not have a conflict with a concurrent load operation. For every attempt to write the store operation, a corresponding store mask will be updated to indicate which portions of the store operation were successfully written to the L1 data cache.
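The bank-skipping behavior can be illustrated with a short C sketch; the bank count, the one-byte-per-bank layout, and all names here are assumptions made for the example, not details from the patent.

    /* A store writes only the L1 banks that do not conflict with a
     * concurrent load; a store mask records which banks have been
     * written so the store can be retried until complete. */
    #include <stdint.h>
    #include <stdio.h>

    #define NUM_BANKS 8

    typedef struct {
        uint8_t data[NUM_BANKS];   /* one byte per bank, for simplicity */
    } cache_line_t;

    /* Attempt a store of `src` into `line`. `need` marks banks the
     * store targets; `load_busy` marks banks a concurrent load is
     * reading this cycle. Returns the updated store mask. */
    static uint8_t try_store(cache_line_t *line, const uint8_t *src,
                             uint8_t need, uint8_t load_busy,
                             uint8_t done)
    {
        uint8_t writable = need & ~done & ~load_busy;
        for (int b = 0; b < NUM_BANKS; b++)
            if (writable & (1u << b))
                line->data[b] = src[b];   /* partial write, free banks */
        return done | writable;           /* banks completed so far */
    }

    int main(void)
    {
        cache_line_t line = {0};
        uint8_t src[NUM_BANKS] = {1, 2, 3, 4, 5, 6, 7, 8};
        uint8_t need = 0xFF, done = 0;
        /* First attempt: a load occupies banks 2 and 5. */
        done = try_store(&line, src, need, (1u << 2) | (1u << 5), done);
        /* Second attempt: no conflicts; the store completes. */
        done = try_store(&line, src, need, 0, done);
        printf("store mask = 0x%02x\n", done);   /* 0xff when finished */
        return 0;
    }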
Abstract:
The disclosure is directed to a system and method of cache management for a data storage system. According to various embodiments, the cache management system includes a hinting driver and a priority controller. The hinting driver generates pointers based upon data packets intercepted from data transfer requests being processed by a host controller of the data storage system. The priority controller determines whether the data packets are associated with at least a first (high) priority level or a second (normal or low) priority level based upon the pointers generated by the hinting driver. High-priority data packets are stored in cache memory regardless of whether they satisfy a threshold heat quotient (i.e., a selected level of data transfer activity).
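A possible reading of the admission policy is sketched in C below; the priority levels, heat counter, and threshold value are invented for illustration and are not the vendor's driver API.

    /* Hint-based cache admission: high-priority packets are cached
     * unconditionally; others only when their access "heat" crosses
     * a threshold. */
    #include <stdbool.h>
    #include <stdint.h>

    enum priority { PRIO_HIGH, PRIO_NORMAL, PRIO_LOW };

    typedef struct {
        uint64_t      lba;    /* target block of the intercepted request */
        enum priority prio;   /* derived from the hinting driver pointer */
        uint32_t      heat;   /* recent access count for this block      */
    } io_hint_t;

    #define HEAT_THRESHOLD 16 /* illustrative "heat quotient" cutoff */

    /* Priority controller policy: should this packet go to cache? */
    static bool cache_admit(const io_hint_t *h)
    {
        if (h->prio == PRIO_HIGH)
            return true;                   /* cached regardless of heat */
        return h->heat >= HEAT_THRESHOLD;  /* normal/low: heat-gated    */
    }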
Abstract:
A hybrid storage system is described having a mixture of different types of storage devices, comprising rotational drives, flash devices, SDRAM, and SRAM. The rotational drives are used as the main storage, providing the lowest cost per unit of storage. Flash memory is used as a higher-level cache for the rotational drives. Methods are provided for managing multiple levels of cache for this storage system, with a very fast Level 1 cache consisting of volatile memory (SRAM or SDRAM) and a non-volatile Level 2 cache using an array of flash devices. A method of distributing data across the rotational drives to make caching more efficient is described, along with efficient techniques for flushing data from the L1 and L2 caches to the rotational drives that take advantage of concurrent flash-device operations and concurrent rotational-drive operations while favoring sequential accesses on the rotational drives over relatively slower random accesses. The methods provided here may be extended to systems with more than two cache levels.
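One of the flushing ideas, turning cache flushes into mostly sequential drive writes, can be sketched in C as follows; the batch layout and the sorting approach are assumptions for the example, not the patented method.

    /* Order a flush batch so each rotational drive receives an
     * ascending-LBA run of writes instead of slower random ones;
     * per-drive runs can then be issued concurrently. */
    #include <stdint.h>
    #include <stdlib.h>

    typedef struct {
        int      drive;   /* which rotational drive holds this block */
        uint64_t lba;     /* target logical block address            */
        void    *data;    /* dirty payload from the L1/L2 cache      */
    } dirty_entry_t;

    static int by_drive_then_lba(const void *a, const void *b)
    {
        const dirty_entry_t *x = a, *y = b;
        if (x->drive != y->drive)
            return x->drive - y->drive;
        return (x->lba > y->lba) - (x->lba < y->lba);
    }

    static void order_flush(dirty_entry_t *batch, size_t n)
    {
        qsort(batch, n, sizeof *batch, by_drive_then_lba);
    }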
Abstract:
The present application describes embodiments of techniques for picking a data array lookup request for execution in a data array pipeline a variable number of cycles behind a corresponding tag array lookup request that is concurrently executing in a tag array pipeline. Some embodiments of a method for picking the data array lookup request include picking the data array lookup request for execution in a data array pipeline of a cache concurrently with execution of a tag array lookup request in a tag array pipeline of the cache. The data array lookup request is picked for execution in response to resources of the data array pipeline becoming available after picking the tag array lookup request for execution. Some embodiments of the method may be implemented in a cache.
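The variable lag between the two picks can be modeled with a tiny C sketch; the cycle-by-cycle model and all names are invented for illustration.

    /* A tag lookup issues first; the matching data array lookup is
     * picked on whatever later cycle the data pipeline has a free
     * slot, rather than at a fixed offset behind the tag pick. */
    #include <stdbool.h>
    #include <stdio.h>

    typedef struct {
        int tag_pick_cycle;    /* cycle the tag lookup was picked    */
        int data_pick_cycle;   /* -1 until the data lookup is picked */
    } lookup_req_t;

    /* Called once per cycle: pick the data lookup as soon as the
     * data pipeline reports a free slot after the tag pick. */
    static void try_pick_data(lookup_req_t *req, int cycle,
                              bool data_slot_free)
    {
        if (req->data_pick_cycle < 0 && data_slot_free &&
            cycle > req->tag_pick_cycle)
            req->data_pick_cycle = cycle;
    }

    int main(void)
    {
        lookup_req_t req = { .tag_pick_cycle = 0, .data_pick_cycle = -1 };
        bool free_slots[] = { false, false, false, true, true };
        for (int c = 0; c < 5; c++)
            try_pick_data(&req, c, free_slots[c]);
        /* The lag is variable: here the data pick trails the tag by 3. */
        printf("lag = %d cycles\n",
               req.data_pick_cycle - req.tag_pick_cycle);
        return 0;
    }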
Abstract:
A processor includes a processing unit, a cache memory, and a central request queue. The central request queue is operable to receive a prefetch load request for a cache line to be loaded into the cache memory, receive a demand load request for the cache line from the processing unit, merge the prefetch load request and the demand load request to generate a promoted load request specifying the processing unit as a requestor, receive the cache line associated with the promoted load request, and forward the cache line to the processing unit.
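The merge step might look like the following C sketch; the queue representation and names are assumptions, not the patented structure.

    /* Promoting a prefetch: when a demand load hits an in-flight
     * prefetch for the same line, the two merge into one request
     * naming the processing unit as requestor, so the returning
     * cache line is forwarded to that unit. */
    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    typedef enum { REQ_PREFETCH, REQ_DEMAND } req_kind;

    typedef struct {
        uint64_t line_addr;   /* cache-line address being fetched      */
        req_kind kind;        /* prefetch, or demand (promoted) request */
        int      requestor;   /* unit to forward the fill to, or -1    */
    } load_req_t;

    /* Central request queue: merge a demand load into a matching
     * in-flight prefetch instead of enqueuing a duplicate request. */
    static bool merge_demand(load_req_t *q, size_t n,
                             uint64_t line_addr, int unit)
    {
        for (size_t i = 0; i < n; i++) {
            if (q[i].line_addr == line_addr &&
                q[i].kind == REQ_PREFETCH) {
                q[i].kind = REQ_DEMAND;   /* promoted load request     */
                q[i].requestor = unit;    /* forward fill to this unit */
                return true;
            }
        }
        return false;   /* no match: caller enqueues a new demand load */
    }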
Abstract:
An apparatus having a memory and a circuit is disclosed. The memory may (i) assert a first signal in response to detecting a conflict between at least two addresses, including a first address, requesting access to a block at a first time, (ii) generate a second signal in response to a cache miss caused by the first address requesting access to the block at a second time, and (iii) store a line fetched in response to the cache miss in another block by adjusting the first address by an offset. The second time is generally after the first time. The circuit may (i) generate the offset in response to the assertion of the first signal and (ii) present the offset in a third signal to the memory in response to the assertion of the second signal corresponding to reception of the first address at the second time. The offset is generally associated with the first address.
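The offset-based remapping can be sketched in C as below; the per-set offset table and the index arithmetic are assumptions chosen for the example.

    /* After a conflict is seen on a set, later misses for the
     * offending address are filled into a different block by adding
     * a remembered offset to the index derived from that address. */
    #include <stdint.h>

    #define NUM_SETS 256

    typedef struct {
        uint32_t offset[NUM_SETS];  /* per-set remap offset (0 = none) */
    } remap_circuit_t;

    /* First signal: a conflict on `addr`'s set records an offset. */
    static void on_conflict(remap_circuit_t *c, uint32_t addr)
    {
        uint32_t set = addr % NUM_SETS;
        c->offset[set] = 1;   /* simplest choice: shift to neighbor set */
    }

    /* Second signal: a later miss on `addr` is filled at the adjusted
     * index, steering the fetched line into another block. */
    static uint32_t fill_index(const remap_circuit_t *c, uint32_t addr)
    {
        uint32_t set = addr % NUM_SETS;
        return (set + c->offset[set]) % NUM_SETS;
    }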
Abstract:
The systems and methods described herein may provide a flush-retire instruction for retiring “bad” cache locations (e.g., locations associated with persistent errors) to prevent their allocation for any further accesses, and a flush-unretire instruction for unretiring cache locations previously retired. These instructions may be implemented as hardware instructions of a processor. They may be executable by processes executing in a hyper-privileged state, without the need to quiesce any other processes. The flush-retire instruction may atomically flush a cache line implicated by a detected cache error and set a lock bit to disable subsequent allocation of the corresponding cache location. The flush-unretire instruction may atomically flush an identified cache line (if valid) and clear the lock bit to re-enable subsequent allocation of the cache location. Various bits in the encodings of these instructions may identify the cache location to be retired or unretired in terms of the physical cache structure.
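Although the real feature is a pair of hardware instructions, their lock-bit semantics can be modeled in C; the structures below are assumptions for illustration.

    /* Each cache location carries a lock bit that, when set, removes
     * the location from allocation. The real instructions perform the
     * flush and the bit update atomically; this model shows only the
     * state transitions. */
    #include <stdbool.h>

    typedef struct {
        bool valid;   /* line currently holds data              */
        bool locked;  /* retired: never allocate this location  */
    } cache_loc_t;

    /* flush-retire: flush the line (if valid) and lock the location
     * so the allocator skips it from now on. */
    static void flush_retire(cache_loc_t *loc)
    {
        loc->valid = false;   /* flush: write back / invalidate */
        loc->locked = true;   /* disable subsequent allocation  */
    }

    /* flush-unretire: flush (if valid) and clear the lock bit so the
     * location is eligible for allocation again. */
    static void flush_unretire(cache_loc_t *loc)
    {
        loc->valid = false;
        loc->locked = false;
    }

    /* Allocator check: retired locations are never chosen. */
    static bool can_allocate(const cache_loc_t *loc)
    {
        return !loc->locked;
    }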
Abstract:
A distributed caching system for storing and serving information modeled as a graph of nodes and edges, where each edge defines an association or relationship between the nodes it connects.
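A minimal C sketch of the cached data model, with every type and field invented for illustration since the abstract specifies no layout:

    #include <stdint.h>

    typedef struct {
        uint64_t id;          /* globally unique node identifier */
    } node_t;

    typedef struct {
        uint64_t from_id;     /* source node                         */
        uint64_t to_id;       /* target node                         */
        uint32_t assoc_type;  /* kind of relationship the edge names */
    } edge_t;

    /* A cache shard owns a partition of the graph; lookups for other
     * partitions are routed to the shard that caches them. */
    typedef struct {
        node_t  *nodes;
        edge_t  *edges;
        uint32_t shard_id;    /* which partition this cache serves */
    } graph_cache_shard_t;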