Control of branch prediction for zero-overhead loop

    Publication number: US11663007B2

    Publication date: 2023-05-30

    Application number: US17492068

    Filing date: 2021-10-01

    Applicant: Arm Limited

    CPC classification number: G06F9/30065 G06F9/325 G06F9/3846

    Abstract: In response to decoding a zero-overhead loop control instruction of an instruction set architecture, processing circuitry sets at least one loop control parameter for controlling execution of one or more iterations of a program loop body of a zero-overhead loop. Based on the at least one loop control parameter, loop control circuitry controls execution of the one or more iterations of the program loop body of the zero-overhead loop, the program loop body excluding the zero-overhead loop control instruction. Branch prediction disabling circuitry detects whether the processing circuitry is executing the program loop body of the zero-overhead loop associated with the zero-overhead loop control instruction, and dependent on detecting that the processing circuitry is executing the program loop body of the zero-overhead loop, disables branch prediction circuitry. This reduces power consumption during a zero-overhead loop when the branch prediction circuitry is unlikely to provide a benefit.
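    The power-saving mechanism described in this abstract can be sketched as a small simulator. This is an illustrative model only, not Arm's design; the class and method names are invented. A loop controller records the loop parameters set by a hypothetical loop-control instruction and gates the branch predictor off while the loop body executes, re-enabling it on loop exit.

```python
class BranchPredictor:
    """Toy predictor that counts how often it is actually consulted."""
    def __init__(self):
        self.lookups = 0
        self.enabled = True

    def predict(self, pc):
        if not self.enabled:
            return None          # gated off inside the loop body: no lookup
        self.lookups += 1
        return pc + 4            # trivially predict fall-through

class LoopController:
    """Models a zero-overhead loop: start/end PCs and an iteration count."""
    def __init__(self, predictor):
        self.predictor = predictor
        self.active = False

    def loop_start(self, start_pc, end_pc, count):
        # Set by the (hypothetical) zero-overhead loop control instruction.
        self.start_pc, self.end_pc, self.count = start_pc, end_pc, count
        self.active = True
        self.predictor.enabled = False   # disable prediction in the body

    def step(self, pc):
        """Return the next PC, handling the implicit loop-back branch."""
        if self.active and pc == self.end_pc:
            self.count -= 1
            if self.count > 0:
                return self.start_pc       # loop back; no prediction needed
            self.active = False
            self.predictor.enabled = True  # re-enable on loop exit
        return pc + 4
```

    Running a three-iteration loop through this model shows the predictor performing zero lookups while the body executes, which is the power saving the abstract claims.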

    Dynamic branch hints using branches-to-nowhere conditional branch

    Publication number: US09851973B2

    Publication date: 2017-12-26

    Application number: US13997828

    Filing date: 2012-03-30

    Abstract: A processor includes an execution pipeline having one or more execution units to execute instructions and a branch prediction unit coupled to the execution units. The branch prediction unit includes a branch history table to store prior branch predictions, a branch predictor, in response to a conditional branch instruction, to predict a branch target address of the conditional branch instruction based on the branch history table, and address match logic to compare the predicted branch target address with an address of a next instruction executed immediately following the conditional branch instruction. The address match logic is to cause the execution pipeline to be flushed if the predicted branch target address does not match the address of the next instruction to be executed.
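    The address-match check at the heart of this abstract can be expressed compactly. The sketch below is illustrative (names are mine, not the patent's): the predicted branch target is compared with the address of the instruction actually executed next, and a mismatch triggers a pipeline flush and a redirect to the correct path.

```python
def must_flush(predicted_target, actual_next_pc):
    """Address-match logic: flush when the prediction was wrong."""
    return predicted_target != actual_next_pc

class Pipeline:
    def __init__(self):
        self.flushes = 0

    def retire_branch(self, predicted_target, actual_next_pc):
        """Return the PC fetch should continue from."""
        if must_flush(predicted_target, actual_next_pc):
            self.flushes += 1       # squash wrong-path instructions
            return actual_next_pc   # redirect fetch to the correct path
        return predicted_target     # prediction confirmed, keep going
```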

    SYSTEMS AND METHODS FOR CONVERTING TYPED CODE

    公开(公告)号:US20160210129A1

    公开(公告)日:2016-07-21

    申请号:US15083167

    申请日:2016-03-28

    Applicant: Facebook, Inc.

    Abstract: The techniques provided implement automatic data type annotation in dynamically-typed source code. A codebase, which may comprise a plurality of source code files, is scanned at a global level. The resulting scanned data may describe characteristics of the codebase, including variable and function usage. Based on inferences drawn from the scanning, data types are determined for different variables, expressions, or functions to facilitate conversion from dynamically-typed source code to statically-typed source code. For example, if a function is called once with a parameter value of data type A (e.g., class A), and another time with a parameter value of data type B (e.g., class B), a conversion tool may annotate the parameter variable in the declaration of the function with a data type D (e.g., class D) when data type D is identified as a common ancestor (e.g., superclass) of both data type A and data type B.
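    The common-ancestor inference in the example above can be sketched in a few lines, using Python's method resolution order as a stand-in for the class hierarchy. This is a hypothetical helper for illustration, not Facebook's conversion tool.

```python
def common_ancestor(type_a, type_b):
    """Return the most specific class that is a superclass of both."""
    for candidate in type_a.__mro__:     # walked most-specific first
        if candidate in type_b.__mro__:
            return candidate
    return object

# Toy hierarchy mirroring the abstract: A and B share superclass D.
class D: pass
class A(D): pass
class B(D): pass

def annotate_parameter(observed_types):
    """Infer one annotation from all value types seen at call sites."""
    inferred = observed_types[0]
    for t in observed_types[1:]:
        inferred = common_ancestor(inferred, t)
    return inferred
```

    Calling the function with values of types A and B yields the annotation D, exactly the case the abstract describes; unrelated types fall back to the root type.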

    5. Microprocessor with multiple operating modes dynamically configurable by a device driver based on currently running applications (granted invention, in force)

    Publication number: US08566565B2

    Publication date: 2013-10-22

    Application number: US12170591

    Filing date: 2008-07-10

    Abstract: A computing system includes a microprocessor that receives values for configuring operating modes thereof. A device driver monitors which software applications currently running on the microprocessor are in a predetermined list and responsively dynamically writes the values to the microprocessor to configure its operating modes. Examples of the operating modes the device driver may configure relate to the following: data prefetching; branch prediction; instruction cache eviction; instruction execution suspension; sizes of the cache memories, reorder buffer, and store/load/fill queues; hashing algorithms related to data forwarding and branch target address cache indexing; number of instructions translated, formatted, and issued per clock cycle; load delay mechanism; speculative page table walks; instruction merging; out-of-order execution extent; caching of non-temporal hinted data; and serial or parallel access of an L2 cache and processor bus in response to an instruction cache miss.
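    The driver-side logic can be sketched as a small monitor. Everything here is invented for illustration (application names, mode fields, the register-write interface): when a listed application is seen running, the corresponding mode values are written to the processor; otherwise defaults are restored.

```python
# Hypothetical predetermined list: app name -> operating-mode values.
KNOWN_APPS = {
    "video_encoder": {"data_prefetch": 0, "branch_prediction": 1},
    "database":      {"data_prefetch": 1, "branch_prediction": 1},
}

DEFAULT_MODES = {"data_prefetch": 1, "branch_prediction": 1}

class Microprocessor:
    """Stand-in for hardware that accepts mode-configuration writes."""
    def __init__(self):
        self.modes = dict(DEFAULT_MODES)

    def write_config(self, values):
        self.modes.update(values)

def driver_monitor(cpu, running_apps):
    """Apply the config of the first recognized running application,
    falling back to defaults when none is listed."""
    for app in running_apps:
        if app in KNOWN_APPS:
            cpu.write_config(KNOWN_APPS[app])
            return app
    cpu.write_config(DEFAULT_MODES)
    return None
```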

    6. Overlay instruction accessing unit and overlay instruction accessing method (granted invention, in force)

    Publication number: US08286151B2

    Publication date: 2012-10-09

    Application number: US12239070

    Filing date: 2008-09-26

    Abstract: The present invention provides an overlay instruction accessing unit and method, and a method and apparatus for compressing and storing a program. The overlay instruction accessing unit is used to execute a program stored in a memory in the form of a plurality of compressed program segments, and comprises: a buffer; a processing unit for issuing an instruction reading request, reading an instruction from the buffer, and executing the instruction; and a decompressing unit for reading a requested compressed instruction segment from the memory in response to the instruction reading request of the processing unit, decompressing the compressed instruction segment, and storing the decompressed instruction segment in the buffer, wherein, while the processing unit is executing an instruction segment, the decompressing unit reads from the memory, according to the storage address of a compressed program segment to be invoked given in a header corresponding to the instruction segment, the corresponding compressed instruction segment, decompresses it, and stores the decompressed instruction segment in the buffer for later use by the processing unit.
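    The overlap between execution and decompression can be sketched as follows. The segment/header layout is invented for illustration (the patent stores a storage address in the header; here the header simply names the next segment), and zlib stands in for whatever compression scheme the hardware uses.

```python
import zlib

def make_segment(code, next_segment=None):
    """Compressed program segment with a header naming the segment to
    prefetch next (None when there is no follow-on segment)."""
    return {"next": next_segment, "data": zlib.compress(code)}

class OverlayUnit:
    def __init__(self, memory):
        self.memory = memory    # segment name -> compressed segment
        self.buffer = {}        # segment name -> decompressed code

    def prefetch(self, name):
        if name is not None and name not in self.buffer:
            self.buffer[name] = zlib.decompress(self.memory[name]["data"])

    def execute(self, name):
        self.prefetch(name)                        # demand fetch if needed
        self.prefetch(self.memory[name]["next"])   # overlap: fetch the next
        return self.buffer[name]                   # "run" the segment
```

    After executing the first segment, its successor is already decompressed in the buffer, so the processing unit never waits for it.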

    7. Structure for predictive decoding (granted invention, in force)

    Publication number: US08095777B2

    Publication date: 2012-01-10

    Application number: US11933774

    Filing date: 2007-11-01

    Abstract: A design structure embodied in a machine readable medium used in a design process includes an apparatus for predictive decoding, the apparatus including register logic for fetching an instruction; predictor logic containing predictor information including prior instruction execution characteristics; logic for obtaining predictor information for the fetched instruction from the predictor; and decode logic for generating a selected one of a plurality of decode operation streams corresponding to the fetched instruction, wherein the decode operation stream is selected based on the predictor information.
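    Predictor-steered decode selection can be modeled in a few lines. The decode streams and the predictor encoding below are invented for illustration only: per-instruction execution history selects which of several decode operation streams is generated for a fetched instruction.

```python
# Two hypothetical decode operation streams for the same instruction.
DECODE_STREAMS = {
    "fast": ["uop_simple"],                # common case seen before
    "slow": ["uop_check", "uop_full"],     # safe, fully general case
}

class PredictiveDecoder:
    def __init__(self):
        self.predictor = {}   # instruction address -> prior behaviour

    def decode(self, addr):
        """Select a decode stream from the predictor information."""
        hint = self.predictor.get(addr, "slow")   # default: safe stream
        return DECODE_STREAMS[hint]

    def update(self, addr, took_fast_path):
        """Record execution characteristics for the next decode."""
        self.predictor[addr] = "fast" if took_fast_path else "slow"
```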

    8. Branch Prediction Mechanisms Using Multiple Hash Functions (invention application, in force)

    Publication number: US20090265533A1

    Publication date: 2009-10-22

    Application number: US12493768

    Filing date: 2009-06-29

    CPC classification number: G06F9/3846 G06F9/3848

    Abstract: In one embodiment, the branch prediction mechanism includes a first storage including a first plurality of locations for storing a first set of partial prediction information. The branch prediction mechanism also includes a second storage including a second plurality of locations for storing a second set of partial prediction information. Further, the branch prediction mechanism includes a control unit that performs a first hash function on input branch information to generate a first index for accessing a selected location within the first storage. The control unit also performs a second hash function on the input branch information to generate a second index for accessing a selected location within the second storage. Lastly, the control unit further provides a prediction value based on corresponding partial prediction information in the selected locations of the first and the second storages.
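    The two-table arrangement reads naturally as code. The sketch below (hash functions, table sizes, and counter widths are invented; this is a skewed-predictor-style illustration, not the patent's exact embodiment) indexes each table with a different hash of the branch information and combines the two partial counters into one prediction.

```python
TABLE_SIZE = 1024

def hash1(pc, history):
    # First hash function over the input branch information.
    return (pc ^ history) % TABLE_SIZE

def hash2(pc, history):
    # A different mixing of the same inputs, so the two tables alias
    # differently and a collision in one is usually absent in the other.
    return (pc ^ (history << 3) ^ (pc >> 5)) % TABLE_SIZE

class TwoHashPredictor:
    def __init__(self):
        self.t1 = [1] * TABLE_SIZE   # 2-bit counters, weakly not-taken
        self.t2 = [1] * TABLE_SIZE

    def predict(self, pc, history):
        """Combine the two partial predictions into one value."""
        total = self.t1[hash1(pc, history)] + self.t2[hash2(pc, history)]
        return total >= 4            # taken when the counters lean taken

    def update(self, pc, history, taken):
        for table, idx in ((self.t1, hash1(pc, history)),
                           (self.t2, hash2(pc, history))):
            table[idx] = min(3, table[idx] + 1) if taken else max(0, table[idx] - 1)
```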

    9. Speculative execution for Java hardware accelerator (granted invention, expired)

    Publication number: US07243350B2

    Publication date: 2007-07-10

    Application number: US10259704

    Filing date: 2002-09-27

    CPC classification number: G06F9/3846 G06F9/30174

    Abstract: Conditional branch bytecodes are processed by a Virtual Machine Interpreter (VMI) hardware accelerator that utilizes a branch prediction scheme to determine whether to speculatively process bytecodes while waiting for the CPU to return a condition control variable. The VMI assumes the branch condition will be fulfilled if a conditional branch bytecode calls for a backward jump and that the branch condition will not be fulfilled if a conditional branch bytecode calls for a forward jump. Alternatively, the VMI makes an assumption only if a conditional branch bytecode calls for a backward jump or the VMI assumes that the branch condition will be fulfilled whenever it processes a conditional branch bytecode. The VMI only speculatively processes bytecodes that are easily reversible, and suspends speculative processing of bytecodes upon encountering a bytecode that is not easily reversible. If a VMI assumption is invalidated, any speculatively processed bytecodes are reversed.
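    The first heuristic in the abstract is the classic backward-taken/forward-not-taken rule, which can be written as a stand-alone function (bytecode details are simplified away; the function names are mine).

```python
def speculate_taken(branch_pc, target_pc):
    """Guess a conditional branch's outcome before the CPU returns the
    condition variable: backward jumps (loops) are assumed taken,
    forward jumps (early exits) assumed not taken."""
    return target_pc < branch_pc

def choose_speculative_pc(branch_pc, target_pc, fallthrough_pc):
    """PC from which to speculatively process bytecodes."""
    if speculate_taken(branch_pc, target_pc):
        return target_pc
    return fallthrough_pc
```

    A branch at address 100 jumping back to 40 is assumed taken (a loop), while one jumping forward to 160 is assumed not taken, so speculation continues at the fall-through address.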

    10. Branch prediction control (invention application, in force)

    Publication number: US20060271770A1

    Publication date: 2006-11-30

    Application number: US11139984

    Filing date: 2005-05-31

    CPC classification number: G06F9/3844 G06F9/3846 G06F9/3861

    Abstract: A processor 2 incorporates a branch prediction mechanism 14, 18, 20 which acts to predict branch outcomes for predicted type branch instructions. The processor also supports non-predicted type branch instructions, which are ignored by the branch prediction mechanisms 14, 18, 20 and are not subject to prediction. The degradation of the prediction mechanisms' overall performance caused by mispredictions is reduced by using non-predicted type branch instructions to represent and control branch operations that are known to be likely to mispredict.
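    The selection policy implied by this abstract can be sketched as a small compiler-side helper. The instruction mnemonics and the threshold are invented for illustration: branches known to mispredict often are emitted in the non-predicted form so they neither pollute nor are polluted by the prediction state.

```python
# Hypothetical encodings for the two branch instruction types.
PREDICTED, NON_PREDICTED = "B.pred", "B.nopred"

def select_branch_encoding(misprediction_rate, threshold=0.4):
    """Emit the non-predicted form when misprediction is likely, keeping
    the hard-to-predict branch out of the history tables entirely."""
    if misprediction_rate > threshold:
        return NON_PREDICTED
    return PREDICTED

def should_consult_predictor(encoding):
    """Front-end view: non-predicted branches bypass the predictor."""
    return encoding == PREDICTED
```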
