Abstract:
Apparatus and methods are provided for improving yield and frequency performance of integrated circuit processors, such as multiple-core processors. In an example, an apparatus can include a plurality of clock buffers, each clock buffer of the plurality of clock buffers configured to receive a first clock signal and distribute a plurality of second clock signals, a one-time programmable locate critical path mechanism configured provide a plurality of indications to enable or disable a delay of each clock buffer of the plurality of clock buffers, and a power management control circuit configured to over-ride one or more of the plurality of indications in a first non-test mode of operation of the apparatus and to not over-ride the one or more indications in a second non-test mode of operation of the apparatus.
Abstract:
An apparatus and method for pairing store operations. For example, one embodiment of a processor comprises: a grouping eligibility checker to evaluate a plurality of store instructions based on a set of grouping rules to determine whether two or more of the plurality of store instructions are eligible for grouping; and a dispatcher to simultaneously dispatch a first group of store instructions of the plurality of store instructions determined to be eligible for grouping by the grouping eligibility checker.
Abstract:
A processor includes a cache hierarchy and an execution unit. The cache hierarchy includes a lower level cache and a higher level cache. The execution unit includes logic to issue a memory operation to access the cache hierarchy. The lower level cache includes logic to determine that a requested cache line of the memory operation is unavailable in the lower level cache, determine that a line fill buffer of the lower level cache is full, and initiate prefetching of the requested cache line from the higher level cache based upon the determination that the line fill buffer of the lower level cache is full. The line fill buffer is to forward miss requests to the higher level cache.
Abstract:
Apparatus and methods are provided for improving yield and frequency performance of integrated circuit processors, such as multiple-core processors. In an example, an apparatus can include a plurality of clock buffers, each clock buffer of the plurality of clock buffers configured to receive a first clock signal and distribute a plurality of second clock signals, a one-time programmable locate critical path mechanism configured provide a plurality of indications to enable or disable a delay of each clock buffer of the plurality of clock buffers, and a power management control circuit configured to over-ride one or more of the plurality of indications in a first non-test mode of operation of the apparatus and to not over-ride the one or more indications in a second non-test mode of operation of the apparatus.
Abstract:
A processor includes a cache hierarchy and an execution unit. The cache hierarchy includes a lower level cache and a higher level cache. The execution unit includes logic to issue a memory operation to access the cache hierarchy. The lower level cache includes logic to determine that a requested cache line of the memory operation is unavailable in the lower level cache, determine that a line fill buffer of the lower level cache is full, and initiate prefetching of the requested cache line from the higher level cache based upon the determination that the line fill buffer of the lower level cache is full. The line fill buffer is to forward miss requests to the higher level cache.