Abstract:
A predictor circuit permits advanced execution of instructions depending for their data on previous instructions by predicting such dependencies based on previous mis-speculations detected at the final stages of processing. Synchronization of dependent instructions is provided by a table creating entries for each instance of potential dependency. Table entries are created and deleted dynamically to limit total memory requirements.
Abstract:
A computer system has a memory controller that includes read buffers coupled to a plurality of memory channels. The memory controller advantageously eliminates the inter-channel skew caused by memory modules being located at different distances from the memory controller. The memory controller preferably includes a channel interface and synchronization logic circuit for each memory channel. This circuit includes read and write buffers and load and unload pointers for the read buffer. Unload pointer logic generates the unload pointer and load pointer logic generates the load pointer. The pointers preferably are free-running pointers that increment in accordance with two different clock signals. The load pointer increments in accordance with a clock generated by the memory controller but that has been routed out to and back from the memory modules. The unload pointer increments in accordance with a clock generated by the computer system itself Because the trace length of each memory channel may differ, the time that it takes for a memory module to provide read data back to the memory controller may differ for each channel. The “skew” is defined as the difference in time between when the data arrives on the earliest channel and when data arrives on the latest channel. During system initialization, the pointers are synchronized. After initialization, the pointers are used to load and unload the read buffers in such a way that the effects of inner-channel skew is eliminated.
Abstract:
A computer system has a memory controller that includes read buffers coupled to a plurality of memory channels. The memory controller advantageously eliminates the inter-channel skew caused by memory modules being located at different distances from the memory controller. The memory controller preferably includes a channel interface and synchronization logic circuit for each memory channel. This circuit includes read and write buffers and load and unload pointers for the read buffer. Unload pointer logic generates the unload pointer and load pointer logic generates the load pointer. The pointers preferably are free-running pointers that increment in accordance with two different clock signals. The load pointer increments in accordance with a clock generated by the memory controller but that has been routed out to and back from the memory modules. The unload pointer increments in accordance with a clock generated by the computer system itself. Because the trace length of each memory channel may differ, the time that it takes for a memory module to provide read data back to the memory controller may differ for each channel. The “skew” is defined as the difference in time between when the data arrives on the earliest channel and when data arrives on the latest channel. During system initialization, the pointers are synchronized. After initialization, the pointers are used to load and unload the read buffers in such a way that the effects of inner-channel skew is eliminated.
Abstract:
A system comprising a communications link between processors configured to transmit packets between transmitting and receiving processors. The communications link comprises a conduction path for each bit in the packet and the paths are grouped into separate bundles and routed along different paths. A forwarded clock signal is sent with each bundle. The processors operate with a clock frequency that is roughly three times as fast as the clock frequency of the forwarded clock signal. Data is transmitted on both rising and falling edges of the clock. The receiving processor comprises a recovery circuit to which it pulls the asynchronous data into the processor clock domain. The recovery circuit comprises a delay locked loop circuit configured to create a delayed copy of the clock signal with clock edges that are aligned with the center of the data window for the transmitted data.