摘要:
A method for reducing memory latency in a multi-node architecture. In one embodiment, a speculative read request is issued to a home node before results of a cache coherence protocol are determined. The home node initiates a read to memory to complete the speculative read request. Results of a cache coherence protocol may be determined by a coherence agent to resolve cache coherency after the speculative read request is issued.
摘要:
A method and apparatus for a mechanism for handling explicit writeback in a cache coherent multi-node architecture is described. In one embodiment, the invention is a method. The method includes receiving a read request relating to a first line of data in a coherent memory system. The method further includes receiving a write request relating to the first line of data at about the same time as the read request is received. The method further includes detecting that the read request and the write request both relate to the first line. The method also includes determining which request of the read and write request should proceed first. Additionally, the method includes completing the request of the read and write request which should proceed first.
摘要:
An approach for pipelining ordered input/output transactions to coherent memory in a distributed memory, cache coherent, multi-processor system. A prefetch engine prefetches data from the distributed, coherent memory in response to a transaction from an input/output bus directed to the distributed, coherent memory. An input/output coherent cache buffer receives the prefetched data and is kept coherent with the distributed, coherent memory and with other caching agents in the system.
摘要:
Embodiments of the invention provide a programmable FSA building block, having a number of programmable registers and associated logic implemented therein, that provide the capability of contextually evaluating complex REs of arbitrary size against multiple data streams. Embodiments of the invention provide fully programmable hardware in which all of the states of an RE are instantiated and all of the states are fully connected. For one embodiment, the building blocks have a fixed number of states to facilitate implementation on a chip. For such an embodiment, an RE having an excessive number of states is implemented on two or more FSA building blocks and the FSA building blocks are then stitched together to effect evaluation of the RE. For one embodiment, two or more REs having a number of states less than the fixed number of states of a building block may be implemented with a single building block.
摘要:
A method and apparatus for a mechanism for handling i/o transactions with known transaction length to coherent memory in a cache coherent multi-node architecture is described. In one embodiment, the invention is a method. The method includes receiving a request for a current copy of a data line. The method further includes finding the data line within a cache-coherent multi-node system. The method also includes copying the data line without disturbing a state associated with the data line. The method also includes providing a copy of the data line in response to the request. The method also includes determining if the data line is a last data line of a transaction based on a known transaction length of the transaction.
摘要:
A method and apparatus are described for providing an implicit write-back in a distributed shared memory environment implementing a snoop based architecture. A requesting node submits a single read request to a snoop based architecture controller switch. The switch recognizes that another node other than the requesting node and the home node for the desired data has a copy of the data. The switch directs the request to the responding node that is not the home node. The responding node, having modified the data, provides a single response back to the switch that causes the switch to both update the data at the home node and answer the requesting node. The updating of the data at the home node is done without receiving an explicit write instruction from the requesting node.
摘要:
According to one embodiment, a method is disclosed. The method includes receiving a first request from a first node in a multi-node computer system to invalidate a first cache line at a second node. The method also includes receiving a second request from the second node to invalidate the first cache line at the first node and detecting the concurrent requests at conflict detection circuitry.
摘要:
A memory subsystem and method are disclosed in which instruction code read-prefetching is implemented in the memory subsystem itself. A single-line read-prefetch buffer is implemented in the memory subsystem. A memory controller includes an address buffer for read-prefetches, and a memory datapath includes a data buffer for read-prefetches. Smart read-prefetching is used in which only code(instruction) reads are prefetched, taking advantage of the sequentiality of code (instruction) type data as well as the page mode feature of dynamic random access memories.
摘要:
A method for reducing memory latency in a multi-node architecture. In one embodiment, a speculative read request is issued to a home node before results of a cache coherence protocol are determined. The home node initiates a read to memory to complete the speculative read request. Results of a cache coherence protocol may be determined by a coherence agent to resolve cache coherency after the speculative read request is issued.
摘要:
A method for reducing memory latency in a multi-node architecture. In one embodiment, a speculative read request is issued to a home node before results of a cache coherence protocol are determined. The home node initiates a read to memory to complete the speculative read request. Results of a cache coherence protocol may be determined by a coherence agent to resolve cache coherency after the speculative read request is issued.