Abstract:
Systems, apparatuses, and methods for efficient program flow prediction. After receiving a current fetch address, a first predictor performs a lookup of a first table. When the lookup results in a miss and the first table has no available entries, the first predictor overwrites a given entry of the first table with the received fetch address, in response to detecting a strength value for the given entry is below a threshold. Otherwise, in response to detecting no entries of the first table have a strength value below the threshold, the first predictor allocates an entry in the second table for the received fetch address. When an indication of a target address for the received fetch address is a return address for a function call, a third predictor allocates an entry of a third table with the received fetch address.
Abstract:
In an embodiment, an apparatus includes a plurality of memories configured to store respective data in a plurality of branch prediction entries. Each branch prediction entry corresponds to at least one of a plurality of branch instructions. The apparatus also includes a control circuit configured to store first data associated with a first branch instruction into a corresponding branch prediction entry in at least one memory of the plurality of memories. The control circuit is further configured to select a first memory of the plurality of memories, to disconnect the first memory from a power supply in response to a detection of a first power mode signal, and to cease storing data in the plurality of memories in response to the detection of the first power mode signal.
Abstract:
Systems, processors, and methods for determining when to enter loop buffer mode early for loops in an instruction stream. A processor waits until a branch history register has saturated before entering loop buffer mode for a loop if the processor has not yet determined the loop has an unpredictable exit. However, if the loop has an unpredictable exit, then the loop is allowed to enter loop buffer mode early. While in loop buffer mode, the loop is dispatched from a loop buffer, and the front-end of the processor is powered down until the loop terminates.
Abstract:
A processor includes a mechanism for disabling a memory array of a branch prediction unit. The processor may include a next fetch prediction unit that may include a number of entries. Each entry may correspond to a next instruction fetch group and may store an indication of whether or not the corresponding the next fetch group includes a conditional branch instruction. In response to an indication that the next fetch group does not include a conditional branch instruction, the fetch prediction unit may be configured to disable, in a next instruction execution cycle, the memory array of the branch prediction unit.
Abstract:
Techniques are disclosed relating to signature-based instruction prefetching. In some embodiments, processor pipeline circuitry executes a computer program that includes control transfer instructions, such that the execution follows a taken path through the computer program. First signature prefetch table circuitry indicates prefetch addresses for signatures generated using a first signature generation technique and second signature prefetch table circuitry indicates prefetch addresses for signatures generated using a second, different signature generation technique. Signature prefetch circuitry, in response to a prefetch training event, determines a first signature according to the first technique and a second signature according to the second technique and selects one but not both of the first and second signature prefetch tables to train using the first signature or the second signature.
Abstract:
Systems, apparatuses, and methods for efficient handling of subroutine epilogues. When an indirect control transfer instruction corresponding to a procedure return for a subroutine is identified, the return address and a signature are retrieved from one or more of a return address stack and the memory stack. An authenticator generates a signature based on at least a portion of the retrieved return address. While the signature is being generated, instruction processing speculatively continues. No instructions are permitted to commit yet. The generated signature is later compared to a copy of the signature generated earlier during the corresponding procedure call. A mismatch causes an exception.
Abstract:
A system and method for efficiently protecting branch prediction information. In various embodiments, a computing system includes at least one processor with a branch predictor storing branch target addresses and security tags in a table. The security tag includes one or more components of machine context. When the branch predictor receives a portion of a first program counter of a first branch instruction, and hits on a first table entry during an access, the branch predictor reads out a first security tag. The branch predictor compares one or more components of machine context of the first security tag to one or more components of machine context of the first branch instruction. When there is at least one mismatch, the branch prediction information of the first table entry is not used. Additionally, there is no updating of any branch prediction training information of the first table entry.
Abstract:
In an embodiment, an apparatus includes a plurality of memories configured to store respective data in a plurality of branch prediction entries. Each branch prediction entry corresponds to at least one of a plurality of branch instructions. The apparatus also includes a control circuit configured to store first data associated with a first branch instruction into a corresponding branch prediction entry in at least one memory of the plurality of memories. The control circuit is further configured to select a first memory of the plurality of memories, to disconnect the first memory from a power supply in response to a detection of a first power mode signal, and to cease storing data in the plurality of memories in response to the detection of the first power mode signal.
Abstract:
In an embodiment, an indirect branch predictor generates indirect branch predictions based on one or more register values. The register values may be the contents of registers on which the indirect branch instruction is directly or indirectly dependent for generating the branch target address, for example. In an embodiment, at least one of the registers may be a source for a load instruction, and the indirect branch may be dependent (directly or indirectly) on the target of the load. In an embodiment, the indirect branch predictor may be one of at least two indirect branch predictors in a processor. The other indirect branch predictor may be based on a fetch address, or PC, associated with the indirect branch instruction. The other indirect branch predictor may generate a first predicted target address, and the indirect branch predictor may generate a second predicted target address for the same indirect branch instruction.
Abstract:
In an embodiment, an apparatus includes a plurality of memories configured to store respective data in a plurality of branch prediction entries. Each branch prediction entry corresponds to at least one of a plurality of branch instructions. The apparatus also includes a control circuit configured to store first data associated with a first branch instruction into a corresponding branch prediction entry in at least one memory of the plurality of memories. The control circuit is further configured to select a first memory of the plurality of memories, to disconnect the first memory from a power supply in response to a detection of a first power mode signal, and to cease storing data in the plurality of memories in response to the detection of the first power mode signal.