Abstract:
A memory device including an array of memory cells and a method for copying information within the memory device. Each memory cell includes a first memory sub-cell and a second memory sub-cell. Each memory cell also includes a device that copies information from the first memory sub-cell into the second memory sub-cell. Each memory cell may include a static random access memory (SRAM) cell and may utilize tri-state inverters to make overwriting information easier and reduce power consumption. Each memory cell may also include a second copy device that allows information to be copied from the second memory sub-cell to the first memory sub-cell. The memory device may be provided in a register file of a microprocessor to copy information from an architectural branch register (ABR) file to a speculative branch register (SBR) file.
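For illustration only, the following is a minimal behavioral sketch in C of the copy operation the abstract describes, not the SRAM/tri-state-inverter circuit itself; the type and function names (dual_cell_t, copy_first_to_second, copy_second_to_first) are assumptions made for the sketch, with the first sub-cell standing in for architectural (ABR) state and the second for speculative (SBR) state.

#include <stdint.h>
#include <stdio.h>

#define NUM_BRANCH_REGS 8

/* Behavioral model of one memory cell: two sub-cells plus copy devices. */
typedef struct {
    uint64_t first_sub_cell;   /* architectural branch register state */
    uint64_t second_sub_cell;  /* speculative branch register state   */
} dual_cell_t;

/* Copy device: first sub-cell -> second sub-cell (ABR -> SBR). */
static void copy_first_to_second(dual_cell_t *c) { c->second_sub_cell = c->first_sub_cell; }

/* Second copy device: second sub-cell -> first sub-cell (SBR -> ABR). */
static void copy_second_to_first(dual_cell_t *c) { c->first_sub_cell = c->second_sub_cell; }

int main(void) {
    dual_cell_t regs[NUM_BRANCH_REGS] = {0};

    regs[3].first_sub_cell  = 0x1000;  /* committed architectural value */
    regs[3].second_sub_cell = 0x2000;  /* stale speculative value       */

    /* Restore speculative state from architectural state across the file,
     * e.g. after a misprediction. */
    for (int i = 0; i < NUM_BRANCH_REGS; i++)
        copy_first_to_second(&regs[i]);
    printf("SBR[3] = 0x%llx\n", (unsigned long long)regs[3].second_sub_cell);

    /* The optional second copy device supports the reverse direction. */
    regs[3].second_sub_cell = 0x3000;
    copy_second_to_first(&regs[3]);
    printf("ABR[3] = 0x%llx\n", (unsigned long long)regs[3].first_sub_cell);
    return 0;
}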
Abstract:
A branch prediction system is described for use within a microprocessor having an instruction cache capable of storing two or more instructions per cache line. Each entry of a branch prediction table (BPT) includes a value identifying whether at least one other instruction within a common cache line contains a branch. The value is referred to herein as a multiple-B bit value. The multiple-B bit value is examined by branch prediction logic while one branch prediction is being performed to determine whether a second branch prediction can be initiated for another branch within the same cache line. In one implementation, the multiple-B bit of one BPT entry is examined following a hit. A branch prediction for the entry generating a hit is initiated. Simultaneously, the BPT is reaccessed to search for an entry corresponding to another instruction within the same cache line if the multiple-B bit for the first entry was set. If the second entry is found, a secondary branch prediction is initiated. Eventually, the first branch prediction is output. If the first branch prediction is Not Taken, then the second branch prediction is output during the next clock cycle. If the first branch prediction is Taken, then the second branch prediction may be aborted as it is not needed. Method and apparatus embodiments of the invention are described.
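As a rough software rendering in C of the flow just described: the entry layout, the slot-indexed toy BPT, and field names such as multiple_b are assumptions made for the sketch, not the patented hardware.

#include <stdbool.h>
#include <stdio.h>

#define SLOTS_PER_LINE 4   /* instructions per cache line (assumed) */

/* Hypothetical BPT entry: a prediction plus the multiple-B bit described in
 * the abstract (at least one other branch shares this cache line). */
typedef struct {
    bool valid;
    bool predict_taken;
    bool multiple_b;
    unsigned target;
} bpt_entry_t;

/* Toy BPT indexed by instruction slot within one cache line. */
static bpt_entry_t bpt[SLOTS_PER_LINE] = {
    [1] = { .valid = true, .predict_taken = false, .multiple_b = true,  .target = 0x40 },
    [3] = { .valid = true, .predict_taken = true,  .multiple_b = false, .target = 0x80 },
};

int main(void) {
    unsigned first_slot = 1;
    bpt_entry_t *first = &bpt[first_slot];
    if (!first->valid)
        return 0;                         /* BPT miss: no prediction */

    /* While the first prediction completes, the multiple-B bit tells us
     * whether it is worth reaccessing the BPT for a second branch. */
    bpt_entry_t *second = NULL;
    if (first->multiple_b) {
        for (unsigned s = first_slot + 1; s < SLOTS_PER_LINE; s++) {
            if (bpt[s].valid) { second = &bpt[s]; break; }
        }
    }

    if (first->predict_taken) {
        printf("first branch Taken -> 0x%x (second prediction aborted)\n", first->target);
    } else if (second) {
        printf("first Not Taken; second prediction next cycle -> %s 0x%x\n",
               second->predict_taken ? "Taken" : "Not Taken", second->target);
    }
    return 0;
}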
Abstract:
A processor and method reduce instruction fetch penalty in the execution of a program sequence of instructions by means of a branch predict instruction that is inserted into the program at a location preceding the branch. The branch predict instruction has an opcode that specifies the branch as likely to be taken or not taken and that also specifies a target address of the branch. A block of target instructions, starting at the target address, is prefetched into the instruction cache of the processor so that the instructions are available for execution before the point in the program where the branch is encountered. The opcode also specifies an indication of the size of the block of target instructions and a trace vector of a path in the program sequence that leads to the target from the branch predict instruction, for better utilization of limited memory bandwidth.
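A hedged sketch in C of the information such a branch predict instruction might convey and how a prefetch engine could consume it; the struct branch_predict_hint_t, the trace-vector check, and the icache_prefetch stub are illustrative assumptions rather than the actual encoding.

#include <stdint.h>
#include <stdio.h>

#define CACHE_LINE_BYTES 64

/* Assumed decoding of the branch-predict opcode's fields: taken/not-taken
 * hint, branch target, size of the target block to prefetch, and a trace
 * vector describing the path that leads to the branch. */
typedef struct {
    int      predict_taken;   /* static hint: likely taken?                 */
    uint64_t target_address;  /* start of the block to prefetch             */
    uint32_t block_size;      /* bytes of target instructions to prefetch   */
    uint32_t trace_vector;    /* bitmask of the path reaching this branch   */
} branch_predict_hint_t;

/* Stand-in for an instruction-cache prefetch request. */
static void icache_prefetch(uint64_t addr) {
    printf("prefetch i-cache line at 0x%llx\n", (unsigned long long)addr);
}

/* Prefetch the target block only when the hint says the branch is likely
 * taken, so limited memory bandwidth is not wasted. */
static void process_hint(const branch_predict_hint_t *h, uint32_t current_path) {
    if (!h->predict_taken)
        return;
    /* The trace vector lets hardware skip the prefetch when execution is not
     * actually on the path that reaches this branch (assumption). */
    if ((h->trace_vector & current_path) != h->trace_vector)
        return;
    for (uint32_t off = 0; off < h->block_size; off += CACHE_LINE_BYTES)
        icache_prefetch(h->target_address + off);
}

int main(void) {
    branch_predict_hint_t h = { 1, 0x400000, 128, 0x3 };
    process_hint(&h, 0x7);   /* current path covers the hint's trace vector */
    return 0;
}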
Abstract:
A branch operation is processed using a branch predict instruction and an associated branch instruction. The branch predict instruction indicates a predicted direction, a target address, and an instruction address for the associated branch instruction. When the branch predict instruction is detected, the target address is stored at an entry indicated by the associated branch instruction address and a prefetch request is triggered to the target address. The branch predict instruction may also include hint information for managing the storage and use of the branch prediction information.
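One possible software model of this flow, in C, using an assumed direct-mapped table keyed by the associated branch instruction address; on_branch_predict, lookup_branch, and the entry layout are hypothetical.

#include <stdint.h>
#include <stdio.h>

#define TABLE_SIZE 64

/* Hypothetical prediction-table entry keyed by the associated branch's
 * instruction address, as the abstract describes. */
typedef struct {
    int      valid;
    uint64_t branch_addr;     /* address of the associated branch instruction */
    uint64_t target_addr;     /* predicted target stored by the predict insn  */
    int      predict_taken;   /* predicted direction carried by the hint      */
} predict_entry_t;

static predict_entry_t table[TABLE_SIZE];

static void icache_prefetch(uint64_t addr) {
    printf("prefetch target 0x%llx\n", (unsigned long long)addr);
}

/* On detecting a branch predict instruction: record the target at the entry
 * indexed by the associated branch address and trigger a prefetch. */
static void on_branch_predict(uint64_t branch_addr, uint64_t target, int taken) {
    predict_entry_t *e = &table[(branch_addr / 4) % TABLE_SIZE];
    e->valid = 1;
    e->branch_addr = branch_addr;
    e->target_addr = target;
    e->predict_taken = taken;
    icache_prefetch(target);
}

/* Later, when fetch reaches the associated branch, consult the stored target. */
static const predict_entry_t *lookup_branch(uint64_t branch_addr) {
    predict_entry_t *e = &table[(branch_addr / 4) % TABLE_SIZE];
    return (e->valid && e->branch_addr == branch_addr) ? e : NULL;
}

int main(void) {
    on_branch_predict(0x1040, 0x2000, 1);
    const predict_entry_t *e = lookup_branch(0x1040);
    if (e && e->predict_taken)
        printf("redirect fetch to 0x%llx\n", (unsigned long long)e->target_addr);
    return 0;
}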
Abstract:
A processor architecture is disclosed with an instruction set having a predict instruction, the predict instruction providing static prediction information and a statically predicted target address to the processor for a branch instruction. The processor decodes a predict instruction to obtain an associated pair of addresses comprising a predicted target address and a referenced instruction address, and it fetches a predicted target instruction having an instruction address matching the predicted target address when a fetched and decoded branch instruction has an instruction address matching the referenced instruction address.
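A small sketch in C, assuming the decoded address pair is simply recorded and consulted when a branch with a matching instruction address is fetched and decoded; predict_pair_t and next_fetch_after_branch are names invented for the example.

#include <stdint.h>
#include <stddef.h>
#include <stdio.h>

#define MAX_PAIRS 16

/* Address pair obtained by decoding a predict instruction (assumed layout):
 * the address of the branch it refers to, and the statically predicted target. */
typedef struct {
    uint64_t referenced_addr;   /* instruction address of the branch    */
    uint64_t predicted_target;  /* statically predicted target address  */
} predict_pair_t;

static predict_pair_t pairs[MAX_PAIRS];
static size_t num_pairs;

/* Decoding a predict instruction records its address pair. */
static void decode_predict(uint64_t referenced_addr, uint64_t predicted_target) {
    if (num_pairs < MAX_PAIRS)
        pairs[num_pairs++] = (predict_pair_t){ referenced_addr, predicted_target };
}

/* When a fetched and decoded branch matches a referenced address, the
 * processor fetches the instruction at the predicted target next. */
static uint64_t next_fetch_after_branch(uint64_t branch_addr, uint64_t fallthrough) {
    for (size_t i = 0; i < num_pairs; i++)
        if (pairs[i].referenced_addr == branch_addr)
            return pairs[i].predicted_target;
    return fallthrough;
}

int main(void) {
    decode_predict(0x1100, 0x2200);     /* predict insn names the branch at 0x1100 */
    uint64_t next = next_fetch_after_branch(0x1100, 0x1104);
    printf("fetch next from 0x%llx\n", (unsigned long long)next);
    return 0;
}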
Abstract:
A microprocessor is provided that includes first and second Instruction Fetch Units (IFUs) coupled to one another. The first IFU implements a first Instruction Set Architecture (ISA), and the second IFU implements a second ISA. The microprocessor further includes a shared branch prediction unit coupled to the first and second IFUs that stores prediction-related information. A method of performing branch prediction is also provided. According to this method, an instruction pointer is provided to a branch prediction unit that stores information shared by the first and second IFUs. The instruction pointer is generated by whichever of the first and second IFUs is active. A determination is made whether the instruction corresponding to the instruction pointer provided to the branch prediction unit is a branch instruction and, if so, whether the branch is predicted taken. If the branch instruction is predicted taken, the target address corresponding to the branch instruction is provided to the first and second IFUs.
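A behavioral sketch in C of a single prediction structure consulted on behalf of whichever fetch unit is active; the table layout, the indexing, and the ifu_t/bpu_predict names are assumptions rather than the described design.

#include <stdint.h>
#include <stdio.h>

#define BPU_ENTRIES 32

/* Shared branch-prediction state used by both fetch units (layout assumed). */
typedef struct {
    int      valid;
    uint64_t ip;       /* instruction pointer of the branch */
    int      taken;    /* predicted direction               */
    uint64_t target;   /* predicted target address          */
} bpu_entry_t;

/* Minimal stand-in for an instruction fetch unit. */
typedef struct { uint64_t next_ip; } ifu_t;

static bpu_entry_t bpu[BPU_ENTRIES];

/* Look up the instruction pointer supplied by whichever IFU is active; if the
 * instruction is a branch predicted taken, hand the target to both IFUs. */
static void bpu_predict(uint64_t ip, ifu_t *ifu_a, ifu_t *ifu_b) {
    bpu_entry_t *e = &bpu[(ip / 4) % BPU_ENTRIES];
    if (e->valid && e->ip == ip && e->taken) {
        ifu_a->next_ip = e->target;
        ifu_b->next_ip = e->target;
        printf("predicted taken at 0x%llx -> 0x%llx\n",
               (unsigned long long)ip, (unsigned long long)e->target);
    }
}

int main(void) {
    ifu_t ifu_isa1 = {0}, ifu_isa2 = {0};
    bpu[(0x2000 / 4) % BPU_ENTRIES] = (bpu_entry_t){ 1, 0x2000, 1, 0x3000 };

    /* The active IFU (say, the first-ISA unit) generates the instruction pointer. */
    bpu_predict(0x2000, &ifu_isa1, &ifu_isa2);
    printf("both IFUs redirected to 0x%llx\n", (unsigned long long)ifu_isa1.next_ip);
    return 0;
}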
Abstract:
Allocation circuitry for allocating entries within a set-associative cache memory is disclosed. The set-associative cache memory comprises N ways, each way having M entries, with corresponding entries in each of the N ways constituting a set of entries. The allocation circuitry has a first circuit that identifies related data units by determining a probability that the related data units will be read successively from the cache memory. A second circuit within the allocation circuitry allocates the corresponding entries in each of the ways to the related data units, so that related data units are stored in a common set of entries. Accordingly, the related data units are output simultaneously from the set-associative cache memory and are thus concurrently available for processing. The invention may find application in allocating entries of a common set in a branch prediction table (BPT) to branch prediction information for related branch instructions.
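A toy model in C of the allocation policy, assuming "related" simply means the entries are presented together (for example, branches from one cache line); the array layout and the allocate_related/read_set helpers are illustrative only.

#include <stdint.h>
#include <stdio.h>

#define NUM_WAYS 4
#define NUM_SETS 16

/* One BPT entry; corresponding entries across the ways form a set. */
typedef struct {
    int      valid;
    uint64_t branch_addr;
    uint64_t target;
} bpt_entry_t;

static bpt_entry_t bpt[NUM_WAYS][NUM_SETS];

/* Allocate prediction entries for branches judged related (e.g. likely to be
 * read in succession, such as branches in one cache line) to the same set,
 * one per way, so a single set read yields all of them at once. */
static void allocate_related(unsigned set, const uint64_t *branch_addrs,
                             const uint64_t *targets, unsigned count) {
    for (unsigned way = 0; way < NUM_WAYS && way < count; way++)
        bpt[way][set] = (bpt_entry_t){ 1, branch_addrs[way], targets[way] };
}

/* Reading one set makes every related entry concurrently available. */
static void read_set(unsigned set, bpt_entry_t out[NUM_WAYS]) {
    for (unsigned way = 0; way < NUM_WAYS; way++)
        out[way] = bpt[way][set];
}

int main(void) {
    uint64_t branches[2] = { 0x1004, 0x100c };   /* two branches in one cache line */
    uint64_t targets[2]  = { 0x2000, 0x3000 };
    allocate_related(5, branches, targets, 2);

    bpt_entry_t set_out[NUM_WAYS];
    read_set(5, set_out);
    for (unsigned w = 0; w < NUM_WAYS; w++)
        if (set_out[w].valid)
            printf("way %u: branch 0x%llx -> 0x%llx\n", w,
                   (unsigned long long)set_out[w].branch_addr,
                   (unsigned long long)set_out[w].target);
    return 0;
}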
Abstract:
A loop branch prediction system is provided to predict a final iteration of a loop and resteer an associated fetch module to an appropriate target address. The loop prediction system includes a counter and an end of loop (EOL) module. In one mode, the counter tracks loop branches in process. When a termination condition is detected, the counter switches to a second mode to track the number of loop branches still to be issued. The EOL module compares the number of loop branches still to be issued with one or more threshold values and generates a resteer signal when a match is detected.
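A possible software rendering in C of the two counting modes and the end-of-loop comparison; the mode names, the single threshold, and the resteer hook are assumptions made for the sketch.

#include <stdio.h>

/* Two counting modes described in the abstract (names assumed). */
typedef enum { COUNT_IN_FLIGHT, COUNT_REMAINING } loop_mode_t;

typedef struct {
    loop_mode_t mode;
    int count;          /* loop branches in process, or still to be issued */
    int threshold;      /* EOL module's resteer threshold                  */
} loop_predictor_t;

/* While the loop runs, track loop branches in process. */
static void on_loop_branch_issued(loop_predictor_t *p) {
    if (p->mode == COUNT_IN_FLIGHT)
        p->count++;
}

/* When the termination condition is detected, switch modes: the count now
 * means "loop branches still to be issued before the final iteration". */
static void on_termination_detected(loop_predictor_t *p, int branches_remaining) {
    p->mode = COUNT_REMAINING;
    p->count = branches_remaining;
}

/* EOL module: compare against the threshold and resteer fetch on a match. */
static int eol_check_and_resteer(loop_predictor_t *p, unsigned long exit_target) {
    if (p->mode == COUNT_REMAINING && p->count == p->threshold) {
        printf("resteer fetch to exit target 0x%lx\n", exit_target);
        return 1;
    }
    if (p->mode == COUNT_REMAINING && p->count > 0)
        p->count--;
    return 0;
}

int main(void) {
    loop_predictor_t p = { COUNT_IN_FLIGHT, 0, 0 };
    for (int i = 0; i < 5; i++)
        on_loop_branch_issued(&p);      /* loop branches entering the pipeline */
    on_termination_detected(&p, 3);     /* three loop branches still to issue  */
    while (!eol_check_and_resteer(&p, 0x4000))
        ;                               /* resteer fires on the final iteration */
    return 0;
}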
Abstract:
A processor that includes an execution pipeline that executes a programmed flow of instructions is provided. The processor also includes an instruction pointer generator configured to generate an instruction pointer, and a branch prediction circuit configured to receive the instruction pointer. In response to the instruction pointer, the branch prediction circuit determines whether the instruction corresponding to the instruction pointer includes a branch that is predicted taken and, if so, provides to the execution pipeline at least one target instruction corresponding to that instruction.
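A minimal sketch in C, assuming the prediction circuit caches the first instruction at the predicted target so it can be handed directly to the execution pipeline; the entry layout and the pipeline_deliver stub are hypothetical.

#include <stdint.h>
#include <stdio.h>

#define BP_ENTRIES 16

/* Assumed prediction entry that caches the target instruction itself, so it
 * can be supplied straight to the execution pipeline. */
typedef struct {
    int      valid;
    uint64_t ip;            /* instruction pointer of the predicted-taken branch */
    uint32_t target_insn;   /* first instruction at the predicted target         */
} bp_entry_t;

static bp_entry_t bp[BP_ENTRIES];

/* Stand-in for handing an instruction to the execution pipeline. */
static void pipeline_deliver(uint32_t insn) {
    printf("pipeline receives target instruction 0x%08x\n", insn);
}

/* In response to the generated instruction pointer, decide whether the
 * corresponding instruction is a branch predicted taken and, if so, supply
 * at least one target instruction to the pipeline. */
static void predict_and_supply(uint64_t ip) {
    bp_entry_t *e = &bp[(ip / 4) % BP_ENTRIES];
    if (e->valid && e->ip == ip)
        pipeline_deliver(e->target_insn);
}

int main(void) {
    bp[(0x1000 / 4) % BP_ENTRIES] = (bp_entry_t){ 1, 0x1000, 0xDEADBEEF };
    predict_and_supply(0x1000);   /* instruction pointer from the IP generator */
    return 0;
}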