Accessing a multibank register file using a thread identifier
    1.
    Granted patent
    Accessing a multibank register file using a thread identifier (status: in force)

    Publication No.: US08458446B2

    Publication date: 2013-06-04

    Application No.: US12570682

    Filing date: 2009-09-30

    IPC classification: G06F9/30

    Abstract: A processor includes an instruction fetch unit configured to issue instructions for execution, where the instructions are selected from a number of threads, where each given instruction has a corresponding thread identifier, and where at least some of the instructions specify operand(s) via register identifiers. A register file stores operands usable by the instructions, and may include several banks, each corresponding to a register identifier and including several entries corresponding to the several threads, wherein the entries are configured to store data values. In response to receiving a request to read a particular register identifier for a given thread identifier, the register file may be configured to decode the given thread identifier to retrieve entries from the banks that correspond to the given thread identifier. The register file may further select, from among the retrieved entries, a data value corresponding to the particular register identifier to be output.
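
    The read path in this abstract can be illustrated with a small Python sketch. It is a toy software model, not the patented circuit: the class name, method names, and the list-of-lists layout are assumptions made for illustration. It shows the two steps the abstract describes, first decoding the thread identifier to pull one entry from every bank, then selecting among those entries by register identifier.

```python
class MultibankRegisterFile:
    """Toy model (hypothetical names): one bank per register identifier,
    and one entry per thread inside each bank."""

    def __init__(self, num_registers, num_threads):
        self.num_threads = num_threads
        # banks[reg_id][thread_id] holds a data value
        self.banks = [[0] * num_threads for _ in range(num_registers)]

    def write(self, thread_id, reg_id, value):
        self.banks[reg_id][thread_id] = value

    def read(self, thread_id, reg_id):
        if not 0 <= thread_id < self.num_threads:
            raise ValueError("unknown thread identifier")
        # Step 1: decode the thread identifier into a one-hot select.
        one_hot = [t == thread_id for t in range(self.num_threads)]
        # Step 2: retrieve, from every bank, the entry for that thread.
        retrieved = [
            next(value for value, selected in zip(bank, one_hot) if selected)
            for bank in self.banks
        ]
        # Step 3: select among the retrieved entries by register identifier.
        return retrieved[reg_id]
```

    In this model the thread decode happens before the register select, so the final selection only has to choose among one entry per bank.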

    MULTIPORTED REGISTER FILE FOR MULTITHREADED PROCESSORS AND PROCESSORS EMPLOYING REGISTER WINDOWS
    2.
    Patent application
    MULTIPORTED REGISTER FILE FOR MULTITHREADED PROCESSORS AND PROCESSORS EMPLOYING REGISTER WINDOWS (status: in force)

    Publication No.: US20110078414A1

    Publication date: 2011-03-31

    Application No.: US12570682

    Filing date: 2009-09-30

    IPC classification: G06F9/30

    Abstract: A processor includes an instruction fetch unit configured to issue instructions for execution, where the instructions are selected from a number of threads, where each given instruction has a corresponding thread identifier, and where at least some of the instructions specify operand(s) via register identifiers. A register file stores operands usable by the instructions, and may include several banks, each corresponding to a register identifier and including several entries corresponding to the several threads, wherein the entries are configured to store data values. In response to receiving a request to read a particular register identifier for a given thread identifier, the register file may be configured to decode the given thread identifier to retrieve entries from the banks that correspond to the given thread identifier. The register file may further select, from among the retrieved entries, a data value corresponding to the particular register identifier to be output.

    MEMORY WITH WRITE PORT CONFIGURED FOR DOUBLE PUMP WRITE
    3.
    Patent application
    MEMORY WITH WRITE PORT CONFIGURED FOR DOUBLE PUMP WRITE (status: in force)

    Publication No.: US20090231935A1

    Publication date: 2009-09-17

    Application No.: US12049798

    Filing date: 2008-03-17

    IPC classification: G11C7/10 G11C7/22

    Abstract: A memory with a write port configured for double-pump writes. The memory includes first and second memory locations each having one or more bit cells, and one or more bit lines each coupled to corresponding ones of the bit cells. A write port is coupled to each of the bit lines. Selection circuitry, responsive to a first clock edge, latches first data from a first data path through the write port, and responsive to a second clock edge, latches second data from a second data path through the write port. A first pulse is generated during a first phase of the clock signal to cause writing of the first data into the first memory location. A second pulse is generated during a second phase of the clock signal to cause writing of the second data into the second memory location.
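
    The double-pump behavior can be sketched as a behavioral Python model. This is only an illustration under stated assumptions: the class and method names are hypothetical, and the sketch assumes the first edge is the rising edge (high phase) and the second is the falling edge (low phase), which the abstract does not specify. It shows one shared write port completing two writes within a single clock cycle.

```python
class DoublePumpMemory:
    """Behavioral toy model (hypothetical names) of a single write port
    that performs two writes per clock cycle."""

    def __init__(self, num_locations):
        self.cells = [0] * num_locations
        self.port_latch = None  # the one shared write port

    def clock_cycle(self, first, second):
        """first and second are (address, data) pairs arriving on two
        separate data paths. Returns a log of the writes performed."""
        log = []
        # Assumption: first edge = rising (high phase),
        # second edge = falling (low phase).
        for phase, (address, data) in (("high", first), ("low", second)):
            # Clock edge: selection circuitry latches data into the port.
            self.port_latch = data
            # Write pulse generated during this phase commits the data.
            self.cells[address] = self.port_latch
            log.append((phase, address, data))
        return log
```

    A single call to `clock_cycle` therefore updates two locations, one per clock phase, through the same port.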

    Unified high-frequency out-of-order pick queue with support for triggering early issue of speculative instructions
    4.
    Granted patent
    Unified high-frequency out-of-order pick queue with support for triggering early issue of speculative instructions (status: in force)

    Publication No.: US09058180B2

    Publication date: 2015-06-16

    Application No.: US12493743

    Filing date: 2009-06-29

    Abstract: Systems and methods for efficient picking of instructions for out-of-order issue and execution in a processor. In one embodiment, a processor comprises a unified pick queue that is dynamically allocated. Each entry is configured to store age and dependency information relative to other decoded instructions. Also, each entry stores a picked field, which when asserted indicates the decoded instruction has already been picked for out-of-order issue and execution. When asserted, a trigger field indicates a result of a corresponding decoded instruction will be available a predetermined number of clock cycles afterward. A younger instruction dependent on a result of an older instruction is ready to be picked before the result of the older instruction is available. In this case, the older instruction has asserted picked and trigger fields.
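
    The readiness rule in this abstract can be sketched in Python. This is a simplified model rather than the patented logic: the `Entry` fields, the helper names, the "larger age means older" convention, and the oldest-first pick policy are all assumptions for illustration. It shows a dependent instruction becoming pickable as soon as its producer has both the picked and trigger fields asserted.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    name: str
    age: int                                  # assumption: larger = older
    deps: set = field(default_factory=set)    # names of older producers
    picked: bool = False
    trigger: bool = False


def ready(entry, queue):
    """An unpicked entry is ready when every producer still in the queue
    has both its picked and trigger fields asserted."""
    if entry.picked:
        return False
    return all(
        queue[d].picked and queue[d].trigger
        for d in entry.deps
        if d in queue
    )


def pick_oldest_ready(queue):
    """Pick the oldest ready entry (hypothetical policy), mark it picked."""
    candidates = [e for e in queue.values() if ready(e, queue)]
    if not candidates:
        return None
    chosen = max(candidates, key=lambda e: e.age)
    chosen.picked = True
    return chosen
```

    In this sketch the caller asserts `trigger` on a picked producer once its result is a fixed number of cycles away, which is what lets the consumer issue before the result exists.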

    DEPENDENCY MATRIX FOR THE DETERMINATION OF LOAD DEPENDENCIES
    5.
    Patent application
    DEPENDENCY MATRIX FOR THE DETERMINATION OF LOAD DEPENDENCIES (status: in force)

    Publication No.: US20100332806A1

    Publication date: 2010-12-30

    Application No.: US12495025

    Filing date: 2009-06-30

    IPC classification: G06F9/30

    Abstract: Systems and methods for identification of dependent instructions on speculative load operations in a processor. A processor allocates entries of a unified pick queue for decoded and renamed instructions. Each entry of a corresponding dependency matrix is configured to store a dependency bit for each other instruction in the pick queue. The processor speculates that loads will hit in the data cache, hit in the TLB, and not have a read-after-write (RAW) hazard. For each unresolved load, the pick queue tracks dependent instructions via dependency vectors based upon the dependency matrix. If a load speculation is found to be incorrect, dependent instructions in the pick queue are reset to allow for subsequent picking, and dependent instructions in flight are canceled. On completion of a load miss, dependent operations are re-issued. On resolution of a TLB miss or RAW hazard, the original load is replayed and dependent operations are issued again from the pick queue.
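
    A rough Python sketch of the recovery step follows. It rests on assumptions that the abstract does not confirm: the matrix is modeled as a list of boolean rows where `matrix[i][j]` means entry `i` depends on entry `j`, the dependency vector is computed transitively, and any dependent whose picked bit is set is treated as being in flight. The function names are hypothetical.

```python
def dependents_of(matrix, load):
    """Build a dependency vector for an unresolved load.
    matrix[i][j] is True when entry i depends on entry j.
    Assumption: dependents are followed transitively."""
    n = len(matrix)
    vector = [False] * n
    frontier = [load]
    while frontier:
        producer = frontier.pop()
        for i in range(n):
            if matrix[i][producer] and not vector[i]:
                vector[i] = True
                frontier.append(i)
    return vector


def handle_misspeculation(matrix, picked, load):
    """On a wrong load speculation, cancel dependents that were already
    picked (modeled as in flight) and clear their picked bits so the
    pick queue can select them again later."""
    vector = dependents_of(matrix, load)
    canceled = [i for i, dep in enumerate(vector) if dep and picked[i]]
    for i in canceled:
        picked[i] = False
    return canceled
```

    With entry 0 as the load, entry 1 depending on it, and entry 2 depending on entry 1, the vector marks entries 1 and 2 while leaving unrelated entries untouched.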

    UNIFIED HIGH-FREQUENCY OUT-OF-ORDER PICK QUEUE WITH SUPPORT FOR SPECULATIVE INSTRUCTIONS
    6.
    Patent application
    UNIFIED HIGH-FREQUENCY OUT-OF-ORDER PICK QUEUE WITH SUPPORT FOR SPECULATIVE INSTRUCTIONS (status: in force)

    Publication No.: US20100332804A1

    Publication date: 2010-12-30

    Application No.: US12493743

    Filing date: 2009-06-29

    IPC classification: G06F9/30

    Abstract: Systems and methods for efficient picking of instructions for out-of-order issue and execution in a processor. In one embodiment, a processor comprises a unified pick queue that is dynamically allocated. Each entry is configured to store age and dependency information relative to other decoded instructions. Also, each entry stores a picked field, which when asserted indicates the decoded instruction has already been picked for out-of-order issue and execution. When asserted, a trigger field indicates a result of a corresponding decoded instruction will be available a predetermined number of clock cycles afterward. A younger instruction dependent on a result of an older instruction is ready to be picked before the result of the older instruction is available. In this case, the older instruction has asserted picked and trigger fields.

    Dependency matrix for the determination of load dependencies
    7.
    Granted patent
    Dependency matrix for the determination of load dependencies (status: in force)

    Publication No.: US09262171B2

    Publication date: 2016-02-16

    Application No.: US12495025

    Filing date: 2009-06-30

    Abstract: Systems and methods for identification of dependent instructions on speculative load operations in a processor. A processor allocates entries of a unified pick queue for decoded and renamed instructions. Each entry of a corresponding dependency matrix is configured to store a dependency bit for each other instruction in the pick queue. The processor speculates that loads will hit in the data cache, hit in the TLB, and not have a read-after-write (RAW) hazard. For each unresolved load, the pick queue tracks dependent instructions via dependency vectors based upon the dependency matrix. If a load speculation is found to be incorrect, dependent instructions in the pick queue are reset to allow for subsequent picking, and dependent instructions in flight are canceled. On completion of a load miss, dependent operations are re-issued. On resolution of a TLB miss or RAW hazard, the original load is replayed and dependent operations are issued again from the pick queue.

    Memory with write port configured for double pump write
    8.
    Granted patent
    Memory with write port configured for double pump write (status: in force)

    Publication No.: US07778105B2

    Publication date: 2010-08-17

    Application No.: US12049798

    Filing date: 2008-03-17

    IPC classification: G11C7/00 G11C8/00

    Abstract: A memory with a write port configured for double-pump writes. The memory includes first and second memory locations each having one or more bit cells, and one or more bit lines each coupled to corresponding ones of the bit cells. A write port is coupled to each of the bit lines. Selection circuitry, responsive to a first clock edge, latches first data from a first data path through the write port, and responsive to a second clock edge, latches second data from a second data path through the write port. A first pulse is generated during a first phase of the clock signal to cause writing of the first data into the first memory location. A second pulse is generated during a second phase of the clock signal to cause writing of the second data into the second memory location.

    THREAD FAIRNESS ON A MULTI-THREADED PROCESSOR WITH MULTI-CYCLE CRYPTOGRAPHIC OPERATIONS
    9.
    Patent application
    THREAD FAIRNESS ON A MULTI-THREADED PROCESSOR WITH MULTI-CYCLE CRYPTOGRAPHIC OPERATIONS (status: in force)

    Publication No.: US20110276783A1

    Publication date: 2011-11-10

    Application No.: US12773278

    Filing date: 2010-05-04

    IPC classification: G06F9/38

    Abstract: Systems and methods for efficient execution of operations in a multi-threaded processor. Each thread may include a blocking instruction. A blocking instruction blocks other threads from utilizing hardware resources for an appreciable amount of time. One example of a blocking type instruction is a Montgomery multiplication cryptographic instruction. Each thread can operate in a thread-based mode that allows the insertion of stall cycles during the execution of blocking instructions, during which other threads may utilize the previously blocked hardware resources. At times when multiple threads are scheduled to execute blocking instructions, the thread-based mode may be changed to increase throughput for these multiple threads. For example, the mode may be changed to disallow the insertion of stall cycles. Therefore, the time for sequential operation of the blocking instructions corresponding to the multiple threads may be reduced.
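
    The mode switch in this abstract can be illustrated with a small Python cost model. Everything numeric here is an assumption: the function names, the `stall_every` and `stall_len` parameters, and the cycle counts are hypothetical and chosen only to show why disallowing stall cycles shortens the sequential run of blocking instructions when several threads have one scheduled.

```python
def select_mode(num_blocking_threads):
    """Allow stall insertion unless more than one thread is scheduled
    to execute a blocking instruction."""
    if num_blocking_threads <= 1:
        return "stalls_allowed"
    return "stalls_disallowed"


def sequential_cycles(op_cycles, num_blocking_threads, mode,
                      stall_every=4, stall_len=1):
    """Cycles to run the blocking instructions back to back.
    Assumption: in stalls_allowed mode, stall_len stall cycles are
    inserted after every stall_every cycles of the blocking operation."""
    per_op = op_cycles
    if mode == "stalls_allowed":
        per_op += (op_cycles // stall_every) * stall_len
    return per_op * num_blocking_threads
```

    Under these made-up parameters, two 16-cycle blocking operations take 40 cycles with stalls inserted and 32 cycles without, which is the throughput gain the abstract points to.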

    Processor and method for managing execution of an instruction which determine subsequent to dispatch if an instruction is subject to serialization
    10.
    Granted patent
    Processor and method for managing execution of an instruction which determine subsequent to dispatch if an instruction is subject to serialization (status: lapsed)

    Publication No.: US5678016A

    Publication date: 1997-10-14

    Application No.: US512741

    Filing date: 1995-08-08

    IPC classification: G06F9/312 G06F9/38

    Abstract: A method and apparatus are disclosed for managing the execution of a floating-point store instruction within a data processing system including a memory and a superscalar processor having a number of floating-point registers (FPRs). According to the present invention, multiple instructions are dispatched for execution by the processor, including a floating-point store instruction having as an operand the content of a particular FPR. A determination is made whether the particular FPR is a destination register for results of a second instruction which precedes the store instruction in program order. If so, a determination is made whether the second instruction must complete before subsequent instructions can be successfully dispatched. In response to a determination that the second instruction must be completed prior to successfully dispatching subsequent instructions, the floating-point instruction is cancelled and redispatched after the completion of the second instruction. In response to a determination that the second instruction need not be completed prior to successfully dispatching subsequent instructions, execution of the floating-point store instruction is initiated by computing the destination address within memory into which the operand of the floating-point store instruction is to be stored, thereby minimizing the delay in executing a floating-point store instruction.
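
    The decision sequence in this abstract can be written as a short Python sketch. The function name, the dictionary keys, and the base-plus-offset address calculation are assumptions for illustration only. The case where no older instruction targets the store's source register is not covered by the abstract, so the sketch assumes the store simply proceeds.

```python
def handle_fp_store(store_src_fpr, base, offset, older_instructions):
    """Decide what to do with a dispatched floating-point store.

    older_instructions: pending instructions that precede the store in
    program order, each a dict with 'dest_fpr' and 'serializing' keys
    (hypothetical representation).
    Returns (action, address)."""
    producer = None
    for instr in older_instructions:
        if instr["dest_fpr"] == store_src_fpr:
            producer = instr  # keep the most recent matching producer

    if producer is not None and producer["serializing"]:
        # The producer must complete before later instructions can be
        # dispatched: cancel the store and redispatch it afterwards.
        return ("cancel_and_redispatch", None)

    # Otherwise start the store early by computing its target address.
    # Assumption: the address is formed as base + offset.
    return ("compute_address", base + offset)
```

    The early address computation in the second branch is what the abstract credits with reducing the store's latency.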
