Branch prediction in high-performance processor

    Publication No.: US06076158A

    Publication date: 2000-06-13

    Application No.: US86354

    Filing date: 1993-07-01

    IPC classification: G06F9/315 G06F9/32 G06F9/38

    Abstract: A CPU of the RISC type employs a standardized, fixed instruction size, and permits only simplified memory access data width and addressing modes limited to register-to-register operations and register load/store operations. Byte manipulation instructions include the facility for doing in-register byte extract, insert and masking, along with non-aligned load and store instructions. The load/locked and store/conditional instructions permit the implementation of atomic byte writes. By providing a conditional move instruction, many short branches can be eliminated altogether. A conditional move instruction tests a register and moves a second register to a third if the condition is met; this function can be substituted for short branches and thus maintain the sequentiality of the instruction stream. Performance can be sped up by predicting the target of a branch and prefetching the new instruction based upon this prediction; a branch prediction rule is followed that requires all forward branches to be predicted not-taken and all backward branches to be predicted as taken. Another embodiment uses unused bits in the standard-sized instruction to provide a hint of the expected target address for jump and jump to subroutine instructions or the like. The target can thus be prefetched before the actual address has been calculated and placed in a register. In addition, the unused displacement part of the jump instruction can contain a field to define the actual type of jump, i.e., jump, jump to subroutine, return from subroutine, and thus place a predicted target address in a stack to allow prefetching before the instruction has been executed. The processor can employ a variable memory page size, so that the entries in a translation buffer for implementing virtual addressing can be optimally used. A granularity hint is added to the page table entry to define the page size for this entry. An additional feature is a prefetch instruction, which moves a block of data to a faster-access cache in the memory hierarchy before the data block is to be used.
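
The static prediction rule described in this abstract can be sketched in a few lines. This is only an illustration, not the patented hardware: the function names, the byte addresses, and the 4-byte instruction size are assumptions made for the example. The one thing taken from the abstract is the rule itself, that forward branches are predicted not-taken and backward branches are predicted taken.

```python
def predict_taken(branch_pc: int, target_pc: int) -> bool:
    """Static rule: a backward branch is predicted taken, a forward one not taken.

    Addresses are hypothetical byte addresses. A branch to its own address
    is treated as backward here, which is an assumption of this sketch.
    """
    return target_pc <= branch_pc


def prefetch_address(branch_pc: int, target_pc: int, insn_size: int = 4) -> int:
    """Address to prefetch next under the static rule.

    insn_size=4 is an assumed fixed instruction size in bytes.
    """
    if predict_taken(branch_pc, target_pc):
        return target_pc              # predicted taken: fetch the branch target
    return branch_pc + insn_size      # predicted not taken: fall through
```

A loop-closing branch at `0x1010` that jumps back to `0x1000` is predicted taken, so the prefetcher would go to `0x1000`; a forward skip from `0x1000` to `0x1010` is predicted not taken, so fetching continues at `0x1004`.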

    Prefetch instruction for improving performance in reduced instruction set processor

    Publication No.: US5778423A

    Publication date: 1998-07-07

    Application No.: US547630

    Filing date: 1990-06-29

    Abstract: A high-performance CPU of the RISC (reduced instruction set) type employs a standardized, fixed instruction size, and permits only simplified memory access data width and addressing modes. The instruction set is limited to register-to-register operations and register load/store operations. Byte manipulation instructions, included to permit use of previously-established data structures, include the facility for doing in-register byte extract, insert and masking, along with non-aligned load and store instructions. The provision of load/locked and store/conditional instructions permits the implementation of atomic byte writes. By providing a conditional move instruction, many short branches can be eliminated altogether. A conditional move instruction tests a register and moves a second register to a third if the condition is met; this function can be substituted for short branches and thus maintain the sequentiality of the instruction stream. Performance can be sped up by predicting the target of a branch and prefetching the new instruction based upon this prediction; a branch prediction rule is followed that requires all forward branches to be predicted not-taken and all backward branches (as is common for loops) to be predicted as taken. Another performance improvement makes use of unused bits in the standard-sized instruction to provide a hint of the expected target address for jump and jump to subroutine instructions or the like. The target can thus be prefetched before the actual address has been calculated and placed in a register. In addition, the unused displacement part of the jump instruction can contain a field to define the actual type of jump, i.e., jump, jump to subroutine, return from subroutine, and thus place a predicted target address in a stack to allow prefetching before the instruction has been executed. The processor can employ a variable memory page size, so that the entries in a translation buffer for implementing virtual addressing can be optimally used. A granularity hint is added to the page table entry to define the page size for this entry. An additional feature is a prefetch instruction, which moves a block of data to a faster-access cache in the memory hierarchy before the data block is to be used.
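
To make the effect of the prefetch instruction concrete, here is a toy two-level memory model. It is a sketch under stated assumptions, not the mechanism claimed in the patent: the class `TwoLevelMemory`, the block size of four words, and the miss counter are all invented for illustration. It shows only the idea from the abstract, that a block is moved to the faster level before it is used.

```python
BLOCK_SIZE = 4  # words per block; an assumed value for the example


class TwoLevelMemory:
    """Toy model: a slow backing store plus a faster cache of whole blocks."""

    def __init__(self, backing):
        self.backing = list(backing)
        self.cache = {}   # block number -> list of words
        self.misses = 0   # loads that had to wait for the backing store

    def _fill(self, block):
        start = block * BLOCK_SIZE
        self.cache[block] = self.backing[start:start + BLOCK_SIZE]

    def prefetch(self, addr):
        """Hypothetical prefetch: move the block holding addr into the cache."""
        block = addr // BLOCK_SIZE
        if block not in self.cache:
            self._fill(block)

    def load(self, addr):
        """Ordinary load; counts a miss when the block is not yet cached."""
        block = addr // BLOCK_SIZE
        if block not in self.cache:
            self.misses += 1
            self._fill(block)
        return self.cache[block][addr % BLOCK_SIZE]
```

In this model, a `prefetch(8)` issued ahead of time means a later `load(9)` finds its block already cached and records no miss, whereas a load from a block that was never prefetched does.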

    Byte-compare operation for high-performance processor
    Type: Granted invention patent
    Status: Expired

    Publication No.: US5995746A

    Publication date: 1999-11-30

    Application No.: US661196

    Filing date: 1996-06-10

    IPC classification: G06F9/305 G06F9/30

    Abstract: A high-performance CPU of the RISC (reduced instruction set) type employs a standardized, fixed instruction size, and permits only simplified memory access data width and addressing modes. The instruction set is limited to register-to-register operations and register load/store operations. Byte manipulation instructions, included to permit use of previously-established data structures, include the facility for doing in-register byte extract, insert and masking, along with non-aligned load and store instructions. The provision of load/locked and store/conditional instructions permits the implementation of atomic byte writes. By providing a conditional move instruction, many short branches can be eliminated altogether. A conditional move instruction tests a register and moves a second register to a third if the condition is met; this function can be substituted for short branches and thus maintain the sequentiality of the instruction stream.
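
The in-register byte operations and the conditional move named in this abstract can be illustrated with plain integer arithmetic. This is a rough sketch and not the instruction semantics of any particular processor: the 64-bit register width, the function names, and the choice of "equal to zero" as the tested condition are assumptions made for the example.

```python
MASK64 = (1 << 64) - 1  # assumed 64-bit register width


def extract_byte(reg: int, index: int) -> int:
    """Return byte `index` (0 = least significant) of a register value."""
    return (reg >> (8 * index)) & 0xFF


def mask_byte(reg: int, index: int) -> int:
    """Clear byte `index` of the register, leaving the other bytes intact."""
    return reg & ~(0xFF << (8 * index)) & MASK64


def insert_byte(reg: int, index: int, value: int) -> int:
    """Replace byte `index` of the register with the low 8 bits of `value`."""
    return mask_byte(reg, index) | ((value & 0xFF) << (8 * index))


def cmov_eq_zero(test: int, src: int, dest: int) -> int:
    """Conditional move: the result is src if test == 0, otherwise dest.

    Straight-line code can use this in place of a short branch around
    a single register move.
    """
    return src if test == 0 else dest
```

For example, `insert_byte(0x1122334455667788, 1, 0xAB)` yields `0x112233445566AB88`, and `cmov_eq_zero(0, 5, 9)` yields `5` without any change in control flow.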

    Branch performance in high speed processor
    Type: Granted invention patent
    Status: Expired

    Publication No.: US6167509A

    Publication date: 2000-12-26

    Application No.: US243559

    Filing date: 1994-05-16

    Abstract: A high-performance CPU of the RISC (reduced instruction set) type employs a standardized, fixed instruction size, and permits only simplified memory access data width and addressing modes. The instruction set is limited to register-to-register operations and register load/store operations. Performance can be sped up by predicting the target of a branch and prefetching the new instruction based upon this prediction; a branch prediction rule is followed that requires all forward branches to be predicted not-taken and all backward branches (as is common for loops) to be predicted as taken. Another performance improvement makes use of unused bits in the standard-sized instruction to provide a hint of the expected target address for jump and jump to subroutine instructions or the like. The target can thus be prefetched before the actual address has been calculated and placed in a register. In addition, the unused displacement part of the jump instruction can contain a field to define the actual type of jump, i.e., jump, jump to subroutine, return from subroutine, and thus place a predicted target address in a stack to allow prefetching before the instruction has been executed.
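
One way to picture the jump-type field and the prediction stack is the small model below. It is a hedged sketch, not the circuit described in the patent: the `JumpType` enumeration, the `ReturnPredictor` class, the 4-byte instruction size, and the way the target hint is passed in are all assumptions for illustration. It shows a jump to subroutine pushing the expected return address, and a return popping it as the predicted target.

```python
from enum import Enum


class JumpType(Enum):
    JUMP = 0
    JUMP_TO_SUBROUTINE = 1
    RETURN = 2


class ReturnPredictor:
    """Toy prediction stack driven by the jump-type field."""

    def __init__(self, insn_size: int = 4):
        self.stack = []
        self.insn_size = insn_size  # assumed fixed instruction size in bytes

    def predict(self, pc: int, jump_type: JumpType, hint_target=None):
        """Return the predicted target address, or None if there is none."""
        if jump_type is JumpType.JUMP_TO_SUBROUTINE:
            # Remember where the matching return is expected to go.
            self.stack.append(pc + self.insn_size)
            return hint_target
        if jump_type is JumpType.RETURN:
            return self.stack.pop() if self.stack else None
        return hint_target  # plain jump: rely on the target hint alone
```

Because the predicted return address comes from the stack, it is available for prefetching before the register holding the real return address has been read.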

    Virtual to physical address translation scheme with granularity hint for identifying subsequent pages to be accessed
    Type: Granted invention patent
    Status: Expired

    Publication No.: US5454091A

    Publication date: 1995-09-26

    Application No.: US111284

    Filing date: 1993-08-24

    IPC classification: G06F9/34 G06F12/10

    Abstract: A high-performance central processing unit (CPU) of the reduced instruction set (RISC) type employs a standardized, fixed instruction size, and permits only simplified memory access data width and addressing modes. The instruction set is limited to register-to-register operations and register load/store operations. The processor can employ a variable memory page size, so that the entries in a translation buffer for implementing virtual addressing can be optimally used. A granularity hint is added to the page table entry to define the page size for this entry.
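
The abstract does not say how the granularity hint is encoded, so the following sketch has to assume an encoding: a 2-bit hint meaning that the entry covers 8 to the power of the hint base pages, with an 8 KB base page. Both numbers, and the function names, are assumptions chosen for illustration. The sketch shows how one translation-buffer entry with a larger hint can cover addresses that would otherwise need many entries.

```python
BASE_PAGE_SIZE = 8 * 1024  # assumed base page size in bytes


def page_size(granularity_hint: int) -> int:
    """Page size for an entry, assuming a 2-bit hint meaning 8**hint base pages."""
    if not 0 <= granularity_hint <= 3:
        raise ValueError("granularity hint must fit in 2 bits")
    return BASE_PAGE_SIZE * 8 ** granularity_hint


def tlb_hit(entry_vaddr: int, granularity_hint: int, vaddr: int) -> bool:
    """True if vaddr lies in the region covered by one translation-buffer entry."""
    size = page_size(granularity_hint)
    return entry_vaddr // size == vaddr // size
```

Under these assumptions, an entry at address 0 with hint 0 covers only the first 8 KB, while the same entry with hint 1 covers the first 64 KB.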

    Apparatus and method for control of asynchronous program interrupt events in a data processing system
    Type: Granted invention patent
    Status: Expired

    Publication No.: US5148544A

    Publication date: 1992-09-15

    Application No.: US704710

    Filing date: 1991-05-17

    IPC classification: G06F9/30 G06F9/48

    CPC classification: G06F9/4812 G06F9/30076

    Abstract: In a data processing system having a kernel mode (i.e., for executing privileged instructions) and a user mode of operation, apparatus for responding to interrupt conditions includes a first register, subject to the control of the currently executing program, for enabling the generation of a mode-related interrupt signal, a second register for indicating the presence of a pending mode-related interrupt condition, and a third register for requesting that a mode-related interrupt be entered in the second register. The mode of operation and the enable and pending interrupt condition registers are monitored, and when the signals in the two registers have the appropriate relationship, an interrupt signal is generated to which a control program will respond. The contents of the first register can be controlled by the currently executing program, which can control the enabling signal for the currently executing mode. The pending interrupt condition and the request registers may be accessed only from the privileged mode of operation.
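
The interplay of the three registers can be modeled roughly as below. The abstract leaves the "appropriate relationship" between the enable and pending registers unspecified, so this sketch assumes the simplest one: an interrupt is signaled when the bit for the current mode is set in both registers. The class `ModeInterruptUnit`, the mode numbering, and the use of one flag per mode are likewise assumptions for illustration only.

```python
KERNEL, USER = 0, 1  # assumed mode numbering for the example


class ModeInterruptUnit:
    """Toy model of the enable, pending, and request registers."""

    def __init__(self):
        self.enable = {KERNEL: False, USER: False}    # first register
        self.pending = {KERNEL: False, USER: False}   # second register

    def set_enable(self, current_mode: int, value: bool) -> None:
        """The running program may change the enable bit of its own mode only."""
        self.enable[current_mode] = value

    def request(self, current_mode: int, target_mode: int) -> None:
        """Third register: privileged code enters an interrupt as pending."""
        if current_mode != KERNEL:
            raise PermissionError("request register is kernel-only")
        self.pending[target_mode] = True

    def interrupt_signal(self, current_mode: int) -> bool:
        """Assumed relationship: pending and enabled for the current mode."""
        return self.enable[current_mode] and self.pending[current_mode]
```

In this model a kernel-mode request for a user-mode interrupt stays latent until user-mode code sets its own enable bit, at which point the interrupt signal is raised.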

    Method and apparatus for lowering bus clock frequency in a complex integrated data processing system
    Type: Granted invention patent
    Status: In force

    Publication No.: US07093153B1

    Publication date: 2006-08-15

    Application No.: US10284763

    Filing date: 2002-10-30

    IPC classification: G06F1/08

    Abstract: A data processing system (100) comprises a system bus (120), a plurality of devices (110, 150, 160, 170) coupled to the system bus (120), a bus monitor circuit (140), and a clock generator (130). The plurality of devices (110, 150, 160, 170) includes at least one bus master (110, 150) which is capable of performing accesses on the system bus (120). The bus monitor circuit (140) is coupled to the at least one bus master (110, 150), and has an output for providing a bus idle signal to indicate that no bus master is attempting to perform an access on the system bus (120). The clock generator (130) has an output coupled to at least one of the plurality of devices (110, 150, 160, 170) and provides a bus clock signal having a first frequency when the bus idle signal is inactive and having a second frequency lower than the first frequency when the bus idle signal is active.
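
The selection logic described in this abstract reduces to a small decision, sketched below. The function names and the two example frequencies (100 MHz and 25 MHz) are assumptions; the abstract requires only that the second frequency be lower than the first.

```python
def bus_idle(master_requests) -> bool:
    """Bus idle signal: active when no bus master is attempting an access."""
    return not any(master_requests)


def bus_clock_hz(master_requests, first_hz: int = 100_000_000,
                 second_hz: int = 25_000_000) -> int:
    """Select the bus clock frequency; the default values are assumptions."""
    if second_hz >= first_hz:
        raise ValueError("second frequency must be lower than the first")
    return second_hz if bus_idle(master_requests) else first_hz
```

With two bus masters and neither requesting the bus, `bus_clock_hz([False, False])` selects the lower frequency; as soon as one of them requests an access, the higher frequency is selected.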
