Abstract:
A computer system has a memory controller that includes read buffers coupled to a plurality of memory channels. The memory controller advantageously eliminates the inter-channel skew caused by memory modules being located at different distances from the memory controller. The memory controller preferably includes a channel interface and synchronization logic circuit for each memory channel. This circuit includes read and write buffers, as well as load and unload pointers for the read buffer. Unload pointer logic generates the unload pointer and load pointer logic generates the load pointer. The pointers preferably are free-running pointers that increment in accordance with two different clock signals. The load pointer increments in accordance with a clock that is generated by the memory controller but routed out to, and back from, the memory modules. The unload pointer increments in accordance with a clock generated by the computer system itself. Because the trace length of each memory channel may differ, the time that it takes for a memory module to provide read data back to the memory controller may differ for each channel. The “skew” is defined as the difference in time between when data arrives on the earliest channel and when data arrives on the latest channel. During system initialization, the pointers are synchronized. After initialization, the pointers are used to load and unload the read buffers in such a way that the effects of inter-channel skew are eliminated.
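As a rough illustration of the scheme described above, the following Python sketch simulates per-channel read buffers whose load pointers advance as each channel's (skewed) returned clock delivers data, while a common unload pointer, offset at initialization by the worst-case skew, reads all channels in lockstep. The buffer depth, skew values, and all names are illustrative assumptions, not details drawn from the abstract.

BUFFER_DEPTH = 8  # entries per read buffer (illustrative)

class Channel:
    def __init__(self, skew_cycles):
        self.skew = skew_cycles            # flight time of this channel's trace
        self.buffer = [None] * BUFFER_DEPTH
        self.load_ptr = 0                  # advances on the returned channel clock

    def deliver(self, cycle, data_stream):
        # A word launched at cycle i arrives back at cycle i + skew.
        i = cycle - self.skew
        if 0 <= i < len(data_stream):
            self.buffer[self.load_ptr % BUFFER_DEPTH] = data_stream[i]
            self.load_ptr += 1

# Three channels with different trace lengths (skews of 1-3 cycles).
channels = [Channel(s) for s in (1, 3, 2)]
stream = [f"word{i}" for i in range(6)]

# The unload pointer runs on the system clock; it starts once the
# worst-case skew has elapsed, so every buffer already holds the word
# about to be read.
worst_skew = max(ch.skew for ch in channels)
unload_ptr = 0
for cycle in range(len(stream) + worst_skew):
    for ch in channels:
        ch.deliver(cycle, stream)
    if cycle >= worst_skew:
        row = [ch.buffer[unload_ptr % BUFFER_DEPTH] for ch in channels]
        print(f"cycle {cycle}: aligned read {row}")   # same word on every channel
        unload_ptr += 1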
Abstract:
A processor with on-chip memory including a plurality of physical memory banks is disclosed, along with a method, and corresponding apparatus, for enabling multi-access to the plurality of physical memory banks. The method comprises selecting a subset of multiple access requests to be executed in at least one clock cycle over at least one of a number of access ports connected to the plurality of physical memory banks, the selected access requests being addressed to different physical memory banks among the plurality of memory banks, and scheduling the selected subset of access requests, each over a separate access port.
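The selection step can be pictured with the short Python sketch below, which greedily picks at most one pending request per bank, up to the number of ports, and issues each over its own port. The request fields and the greedy policy are assumptions made for illustration only.

from collections import namedtuple

Request = namedtuple("Request", ["req_id", "bank", "addr"])

def select_subset(pending, num_ports):
    """Greedily pick at most num_ports requests that hit distinct banks."""
    chosen, banks_used = [], set()
    for req in pending:
        if req.bank not in banks_used:
            chosen.append(req)
            banks_used.add(req.bank)
            if len(chosen) == num_ports:
                break
    return chosen

def schedule(pending, num_ports):
    """One clock cycle: issue each selected request on its own port."""
    subset = select_subset(pending, num_ports)
    for port, req in enumerate(subset):
        print(f"port {port}: issue req {req.req_id} to bank {req.bank}")
    # Requests not selected this cycle stay pending for the next cycle.
    return [r for r in pending if r not in subset]

pending = [Request(0, bank=1, addr=0x10), Request(1, bank=1, addr=0x20),
           Request(2, bank=3, addr=0x30), Request(3, bank=2, addr=0x40)]
pending = schedule(pending, num_ports=2)   # issues requests 0 and 2
pending = schedule(pending, num_ports=2)   # issues requests 1 and 3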
Abstract:
A packet processor provides for rule matching of packets in a network architecture. The packet processor includes a lookup cluster complex having a number of lookup engines and respective on-chip memory units. The on-chip memory stores rules for matching against packet data. A lookup front-end receives lookup requests from a host, and processes these lookup requests to generate key requests for forwarding to the lookup engines. As a result of the rule matching, the lookup engine returns a response message indicating whether a match is found. The lookup front-end further processes the response message and provides a corresponding response to the host.
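The front-end flow reads roughly as in the following Python sketch; the key-extraction and response-merging details are assumptions, since the abstract does not specify them.

def extract_keys(lookup_request):
    """Split a host lookup request into one key request per table."""
    return [{"key": k, "table": t} for t, k in enumerate(lookup_request["keys"])]

def lookup_engine(key_request, rules):
    """Match one key against the on-chip rules; return a response message."""
    match = rules.get(key_request["key"])
    return {"found": match is not None, "rule": match}

def front_end(lookup_request, rules):
    # Fan the key requests out to the engines, then merge the engine
    # responses into a single response for the host.
    responses = [lookup_engine(kr, rules) for kr in extract_keys(lookup_request)]
    return next((r for r in responses if r["found"]),
                {"found": False, "rule": None})

rules = {"10.0.0.1/http": "permit"}
print(front_end({"keys": ["10.0.0.1/http", "10.0.0.2/ssh"]}, rules))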
Abstract:
In one embodiment, a system includes memory ports distributed into subsets identified by a subset index, where each memory port has an individual wait time based on a respective workload. The system further comprises a first address hashing unit configured to receive a read request including a virtual memory address that is associated with a replication factor and refers to graph data. The first address hashing unit translates the replication factor into a corresponding subset index based on the virtual memory address, and converts the virtual memory address to a hardware-based memory address referring to graph data in the memory ports within a subset indicated by the corresponding subset index. The system further comprises a memory replication controller configured to direct read requests for the hardware-based address to the memory port, within the subset indicated by the corresponding subset index, that has the lowest individual wait time.
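A compact Python sketch of this selection follows; the modulo hash, the interleaved port layout, and the wait-time metric are all illustrative assumptions standing in for details the abstract leaves open.

NUM_PORTS = 8

def translate(virtual_addr, replication_factor):
    """Hash the virtual address into one of the replicated subsets and
    strip the subset bits to form a hardware-based address."""
    num_subsets = NUM_PORTS // replication_factor
    subset_index = virtual_addr % num_subsets       # stand-in for the hash
    hw_addr = virtual_addr // num_subsets
    return subset_index, hw_addr

def direct_read(virtual_addr, replication_factor, wait_times):
    subset_index, hw_addr = translate(virtual_addr, replication_factor)
    # Ports of a subset are assumed interleaved across the port array.
    ports = range(subset_index, NUM_PORTS, NUM_PORTS // replication_factor)
    best = min(ports, key=lambda p: wait_times[p])  # lowest individual wait time
    return best, hw_addr

wait_times = [5, 1, 7, 2, 0, 9, 3, 4]               # e.g. per-port queue depth
port, addr = direct_read(virtual_addr=0x2A, replication_factor=2,
                         wait_times=wait_times)
print(f"read hw addr {addr:#x} via port {port}")    # picks port 6 over port 2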
Abstract:
A packet processor provides for rule matching of packets in a network architecture. The packet processor includes a lookup cluster complex having a number of lookup engines and respective on-chip memory units. The on-chip memory stores rules for matching against packet data. Each of the lookup engines receives a key request associated with a packet and determines a subset of the rules to match against the packet data. As a result of the rule matching, the lookup engine returns a response message indicating whether a match is found.
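The per-engine step can be sketched in Python as below: narrow the stored rules to the subset this key can hit, then match packet fields against those candidates. The hash-bucket scheme and rule format are assumptions made for illustration, not the patent's encoding.

import zlib

NUM_BUCKETS = 4

def match_key(key, packet_fields, rule_table):
    # Determine the subset of stored rules this key can hit.
    bucket = zlib.crc32(key.encode()) % NUM_BUCKETS
    for rule in rule_table.get(bucket, []):
        if all(packet_fields.get(f) == v for f, v in rule["match"].items()):
            return {"found": True, "rule_id": rule["id"]}   # response message
    return {"found": False, "rule_id": None}

rule_table = {b: [] for b in range(NUM_BUCKETS)}
bucket = zlib.crc32(b"flow-A") % NUM_BUCKETS
rule_table[bucket].append({"id": 7, "match": {"dport": 80}})
print(match_key("flow-A", {"dport": 80}, rule_table))       # match found
print(match_key("flow-B", {"dport": 22}, rule_table))       # no match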
Abstract:
In one embodiment, a system comprises multiple memory ports distributed into multiple subsets, each subset identified by a subset index and each memory port having an individual wait time. The system further comprises a first address hashing unit configured to receive a read request including a virtual memory address that is associated with a replication factor and refers to graph data. The first address hashing unit translates the replication factor into a corresponding subset index based on the virtual memory address, and converts the virtual memory address to a hardware-based memory address that refers to graph data in the memory ports within a subset indicated by the corresponding subset index. The system further comprises a memory replication controller configured to direct read requests for the hardware-based address to the memory port, within the subset indicated by the corresponding subset index, that has the lowest individual wait time.
Abstract:
A method and apparatus are disclosed for optimizing IPsec processing by providing execution units with windowing data during prefetch and by managing the coherency of security association data through controlled security association accesses. Providing execution units with windowing data allows initial parallel processing of IPsec packets. The security association access ordering apparatus serializes access to the dynamic section of security association data according to packet arrival order, while otherwise allowing parallel processing of the IPsec packet by multiple execution units in a security processor.
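A simplified Python sketch of the ordering idea follows: packets are processed in parallel, but updates to a security association's dynamic section (here a sequence number) commit strictly in arrival order. The ticket mechanism and field names are assumptions, not the patent's apparatus.

import threading

class SecurityAssociation:
    def __init__(self):
        self.static = {"key": "0xdeadbeef"}   # read-only during processing
        self.seq = 0                          # dynamic section, ordered access
        self.next_ticket = 0
        self.cond = threading.Condition()

    def update_dynamic(self, ticket):
        # Block until every earlier-arriving packet has committed.
        with self.cond:
            while ticket != self.next_ticket:
                self.cond.wait()
            self.seq += 1
            self.next_ticket += 1
            self.cond.notify_all()
            return self.seq

def process_packet(sa, ticket):
    _ = sa.static["key"]                      # parallel phase: static data only
    seq = sa.update_dynamic(ticket)           # serialized phase: dynamic section
    print(f"packet {ticket} committed with seq {seq}")

sa = SecurityAssociation()
threads = [threading.Thread(target=process_packet, args=(sa, t)) for t in range(4)]
for th in reversed(threads):                  # start out of order on purpose
    th.start()
for th in threads:
    th.join()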
Abstract:
An efficient system and method are disclosed for managing reads and writes on a bi-directional bus so as to optimize bus performance while avoiding bus contention and read/write starvation. In particular, by intelligently managing reads and writes on a bi-directional bus, bus latency can be reduced while still ensuring no bus contention or read/write starvation. This is accomplished by utilizing bus streaming control logic, separate queues for reads and writes, and a simple 2-to-1 mux.
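A toy Python model of the arbitration idea appears below: same-direction transfers are batched to avoid bus turnarounds, but the batch is capped so neither queue starves, and a dead cycle is inserted whenever the 2-to-1 mux flips direction. The MAX_STREAM cap and the one-cycle turnaround penalty are illustrative assumptions.

from collections import deque

MAX_STREAM = 4        # cap per direction; prevents read/write starvation
TURNAROUND = 1        # idle cycle when the mux flips direction

def run_bus(reads, writes):
    reads, writes = deque(reads), deque(writes)
    direction, cycle = "read", 0
    while reads or writes:
        queue = reads if direction == "read" else writes
        for _ in range(MAX_STREAM):
            if not queue:
                break
            print(f"cycle {cycle}: {direction} {queue.popleft()}")
            cycle += 1
        other = writes if direction == "read" else reads
        if other:                       # flip the 2-to-1 mux
            direction = "write" if direction == "read" else "read"
            cycle += TURNAROUND         # dead cycle avoids bus contention
    return cycle

run_bus(reads=[f"R{i}" for i in range(6)], writes=["W0", "W1"])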