INSTRUCTION BOUNDARY PREDICTION FOR VARIABLE LENGTH INSTRUCTION SET
    21.
    Invention Application
    Status: In force

    Publication No.: US20140281246A1

    Publication Date: 2014-09-18

    Application No.: US13836374

    Filing Date: 2013-03-15

    IPC Classification: G06F12/08

    Abstract: A system, processor, and method are disclosed for predicting with high accuracy and retaining the instruction boundaries of previously executed instructions in order to decode variable-length instructions. In at least one embodiment, a disclosed processor includes an instruction fetch unit, an instruction cache, a boundary byte predictor, and an instruction decoder. In some embodiments, the instruction fetch unit provides an instruction address and the instruction cache produces an instruction tag and instruction cache content corresponding to the instruction address. The instruction decoder, in some embodiments, includes boundary byte logic to determine an instruction boundary in the instruction cache content.
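
    As a rough illustration of the boundary-byte idea (not taken from the patent; the structure, sizes, and names below are assumptions), the sketch keeps one boundary bit per byte of a cache line. The bits are filled in the first time a line is decoded and reused on later fetches so decode can start directly at instruction boundaries.

        # Minimal sketch of a boundary-byte predictor for variable-length
        # instructions. Purely illustrative; sizes and interfaces are assumed.
        class BoundaryBytePredictor:
            def __init__(self, line_size=64):
                self.line_size = line_size
                self.table = {}  # cache line address -> list of boundary bits

            def record(self, line_addr, instr_lengths):
                """After a line is decoded once, remember where instructions start."""
                bits = [0] * self.line_size
                pos = 0
                for length in instr_lengths:
                    if pos >= self.line_size:
                        break
                    bits[pos] = 1          # a new instruction starts at this byte
                    pos += length
                self.table[line_addr] = bits

            def predict(self, line_addr):
                """Return predicted instruction-start offsets, or None if unknown."""
                bits = self.table.get(line_addr)
                if bits is None:
                    return None
                return [i for i, b in enumerate(bits) if b]

        predictor = BoundaryBytePredictor()
        predictor.record(0x1000, [3, 5, 2, 6])   # lengths seen on first decode
        print(predictor.predict(0x1000))          # -> [0, 3, 8, 10]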


PROCESSING SYSTEM USING VIRTUAL NETWORK INTERFACE CONTROLLER ADDRESSING AS FLOW CONTROL METADATA
    22.
    Invention Application
    Status: In force

    Publication No.: US20140056141A1

    Publication Date: 2014-02-27

    Application No.: US13593707

    Filing Date: 2012-08-24

    IPC Classification: H04L12/24 H04L12/50

    Abstract: In a processing system comprising a plurality of processing nodes coupled via a switching fabric, a method includes implementing a flow control property for a data flow in the switching fabric based on an addressing property of an address of a virtual network interface controller associated with the data flow. A switching fabric includes a plurality of ports, each port coupleable to a corresponding processing node, and switching logic coupled to the plurality of ports. The switching fabric further includes flow control logic to implement a flow control property for a data flow in the switching logic based on an addressing property of an address of a virtual network interface controller associated with the data flow.
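
    To make the idea concrete, here is a small sketch (not from the patent; the bit layout and class names are assumptions) in which a priority class is encoded directly in the low bits of a virtual NIC address, so the fabric can derive a flow-control property from the address alone.

        # Illustrative only: derive a flow-control class from a virtual NIC
        # (vNIC) address. The field layout below is an assumption, not the
        # patent's actual encoding.
        PRIORITY_BITS = 0x3   # pretend the two low address bits carry a class

        def flow_class(vnic_addr: int) -> str:
            classes = {0: "bulk", 1: "default", 2: "latency-sensitive", 3: "control"}
            return classes[vnic_addr & PRIORITY_BITS]

        # The switching logic could then pick a queue or rate limit per class.
        for addr in (0x02AA00000010, 0x02AA00000013):
            print(hex(addr), "->", flow_class(addr))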


DYNAMIC OPTIMIZATION FOR CONDITIONAL COMMIT
    23.
    Invention Application
    Status: Under examination (published)

    Publication No.: US20120079245A1

    Publication Date: 2012-03-29

    Application No.: US12890638

    Filing Date: 2010-09-25

    IPC Classification: G06F9/312 G06F9/38 G06F9/30

    Abstract: An apparatus and method is described herein for conditionally committing and/or speculatively checkpointing transactions, which potentially results in dynamic resizing of transactions. During dynamic optimization of binary code, transactions are inserted to provide memory ordering safeguards, which enables a dynamic optimizer to optimize code more aggressively. The conditional commit enables efficient execution of the dynamically optimized code while attempting to prevent transactions from running out of hardware resources, and the speculative checkpoints enable quick and efficient recovery upon abort of a transaction. Processor hardware is adapted to support dynamic resizing of the transactions, such as by including decoders that recognize a conditional commit instruction, a speculative checkpoint instruction, or both, and is further adapted to perform operations that support conditional commit or speculative checkpointing in response to decoding such instructions.
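
    A rough sketch of the conditional-commit pattern follows (illustrative Python, not the patent's ISA; the resource model and commented instruction hooks are assumptions): an optimized loop periodically tests whether speculative hardware resources are running low and, if so, commits the current transaction and opens a new one instead of aborting.

        # Illustrative model of a conditional commit inside a dynamically
        # optimized loop. Resource tracking is faked; real hardware would
        # expose it via a conditional-commit instruction.
        import random

        HW_SPECULATIVE_ENTRIES = 64     # assumed size of speculative buffering

        def run_optimized_region(iterations):
            used = 0                    # speculative entries consumed so far
            commits = 0
            # begin_transaction()
            for i in range(iterations):
                used += random.randint(1, 4)        # work performed this iteration
                # Conditional commit point: if resources are nearly exhausted,
                # commit now and start a fresh transaction rather than abort later.
                if used > HW_SPECULATIVE_ENTRIES - 4:
                    # commit_transaction(); begin_transaction()
                    commits += 1
                    used = 0
            # commit_transaction()
            return commits + 1

        print("transactions used:", run_optimized_region(200))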


Compressing and accessing a microcode ROM
    24.
    Granted Patent
    Status: In force

    Publication No.: US08099587B2

    Publication Date: 2012-01-17

    Application No.: US11186240

    Filing Date: 2005-07-20

    IPC Classification: G06F9/00

    Abstract: An arrangement is provided for compressing a microcode ROM ("uROM") in a processor and for efficiently accessing the compressed uROM. A clustering-based approach may be used to compress a uROM effectively. The approach groups similar columns of microcode into different clusters and identifies unique patterns within each cluster. Only the unique patterns identified in each cluster are stored in a pattern storage. Indices, which help map the address of a microcode word ("uOP") to be fetched from the uROM to the unique patterns required for that uOP, may be stored in an index storage. Typically it takes longer to fetch a uOP from a compressed uROM than from an uncompressed uROM. The compressed uROM may be designed so that the process of fetching a uOP (or uOPs) is fully pipelined to reduce the access latency.
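
    The sketch below shows the flavor of such clustering-based compression (illustrative Python; the fixed column grouping stands in for similarity-based clustering and is an assumption, not the patented encoding): columns of the microcode array are split into groups, the distinct bit patterns in each group are stored once, and each uOP keeps only small indices into those pattern tables.

        # Toy clustering-based ROM compression: columns are split into fixed
        # groups, unique patterns per group are deduplicated, and each word
        # stores indices only.
        def compress(rom_words, group_width=4):
            width = len(rom_words[0])
            groups = [range(i, min(i + group_width, width))
                      for i in range(0, width, group_width)]
            pattern_tables = [dict() for _ in groups]   # pattern -> pattern index
            index_storage = []
            for word in rom_words:
                indices = []
                for g, cols in enumerate(groups):
                    pattern = tuple(word[c] for c in cols)
                    idx = pattern_tables[g].setdefault(pattern, len(pattern_tables[g]))
                    indices.append(idx)
                index_storage.append(indices)
            tables = [list(t.keys()) for t in pattern_tables]
            return tables, index_storage, groups

        def fetch(addr, tables, index_storage, groups):
            # Two-step access: read the indices, then the patterns they select.
            indices = index_storage[addr]
            word = []
            for g, idx in enumerate(indices):
                word.extend(tables[g][idx])
            return word

        rom = [[1,0,1,0, 1,1,1,1], [1,0,1,0, 0,0,0,0], [0,0,0,0, 1,1,1,1]]
        tables, idxs, groups = compress(rom)
        assert fetch(1, tables, idxs, groups) == rom[1]
        print("unique patterns per group:", [len(t) for t in tables])

    The two-step fetch (index read, then pattern read) is where the extra latency comes from, which is why the abstract notes that the access path may be fully pipelined.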


APPARATUS, METHOD, AND SYSTEM FOR IMPROVING POWER, PERFORMANCE EFFICIENCY BY COUPLING A FIRST CORE TYPE WITH A SECOND CORE TYPE
    25.
    Invention Application
    Status: Under examination (published)

    Publication No.: US20110320766A1

    Publication Date: 2011-12-29

    Application No.: US12826107

    Filing Date: 2010-06-29

    IPC Classification: G06F9/30 G06F15/76

    Abstract: An apparatus and method is described herein for coupling a processor core of a first type with a co-designed core of a second type. Execution of program code on the first core is monitored, and hot sections of the program code are identified. Those hot sections are optimized for execution on the co-designed core, such that upon subsequently encountering those hot sections, the optimized hot sections are executed on the co-designed core. When the co-designed core is executing optimized hot code, the first processor core may be in a low-power state to save power or may execute other code in parallel. Furthermore, multiple threads of cold code may be pipelined on the first core while multiple threads of hot code are pipelined on the co-designed core to achieve maximum performance.
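
    A minimal sketch of the monitoring-and-dispatch loop is shown below (illustrative Python; the threshold, counters, and the optimize/dispatch stand-ins are assumptions): the first core counts executions per code region, and once a region crosses a hotness threshold its optimized version is run on the co-designed core.

        # Toy hot-region monitor: regions executed often enough get an
        # "optimized" version that is dispatched to the co-designed core.
        from collections import Counter

        HOT_THRESHOLD = 3            # assumed; real hardware/firmware would tune this
        exec_counts = Counter()
        optimized = {}               # region id -> optimized code (stand-in)

        def run_region(region_id):
            if region_id in optimized:
                return f"co-designed core runs optimized region {region_id}"
            exec_counts[region_id] += 1
            if exec_counts[region_id] >= HOT_THRESHOLD:
                optimized[region_id] = f"opt({region_id})"   # optimize for 2nd core
            return f"first core runs region {region_id}"

        for _ in range(5):
            print(run_region("loop_A"))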


Method and apparatus for compression, decompression, and execution of program code
    26.
    Granted Patent
    Status: Expired

    Publication No.: US06343354B1

    Publication Date: 2002-01-29

    Application No.: US09552304

    Filing Date: 2000-04-19

    IPC Classification: G06F12/00

    Abstract: During a compression phase, memory (20) is divided into cache line blocks (500). Each cache line block is compressed and modified by replacing the address destinations of address-indirection instructions with compressed address destinations. Each cache line block is also modified so that a flow-indirection instruction is the last instruction in the cache line. The compressed cache line blocks (500) are stored in a memory (858). During decompression, a cache line (500) is accessed based on the instruction pointer (902) value, decompressed, and stored in the cache. The cache tag is determined from the instruction pointer (902) value.
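
    As a rough illustration (Python, not the patent's hardware; the compression scheme and names are assumptions, and the address-patching and flow-indirection steps are omitted), the sketch below stores code as compressed cache-line blocks plus a map from instruction-pointer block numbers to compressed data, and decompresses a block into the cache only when the instruction pointer first touches it.

        # Toy model: program code is kept compressed per cache-line block and
        # decompressed on demand when the instruction pointer reaches it.
        import zlib

        LINE_SIZE = 64

        def compress_program(code_bytes):
            blocks = {}
            for offset in range(0, len(code_bytes), LINE_SIZE):
                blocks[offset // LINE_SIZE] = zlib.compress(
                    code_bytes[offset:offset + LINE_SIZE])
            return blocks

        cache = {}                       # block number -> decompressed line

        def fetch(ip, compressed_blocks):
            block_no = ip // LINE_SIZE   # also used to form the cache tag
            if block_no not in cache:    # miss: decompress the stored block
                cache[block_no] = zlib.decompress(compressed_blocks[block_no])
            return cache[block_no][ip % LINE_SIZE]

        program = bytes(range(256)) * 2
        blocks = compress_program(program)
        assert fetch(200, blocks) == program[200]
        print("blocks stored:", len(blocks), "cached after one fetch:", len(cache))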


Method and apparatus for code translation optimization
    27.
    Granted Patent
    Status: Expired

    Publication No.: US5805895A

    Publication Date: 1998-09-08

    Application No.: US709422

    Filing Date: 1996-06-09

    IPC Classification: G06F9/45 G06F9/455 G06F9/445

    Abstract: A native microprocessor (20) accesses a foreign block of computer code. An initial block scope defining translation parameters is assigned to the block (106). The block of "foreign" code is translated to "native" code (108). An optimization efficiency is calculated for the translated block (110). A rescheduling criterion is established based on the optimization efficiency (112). The block of native code is executed (114). On subsequent accesses of the block, when the rescheduling criterion is met (116), the block scope is redefined (118).
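
    The sketch below captures this retranslation loop in miniature (illustrative Python; the efficiency metric, threshold, and scope-growth rule are assumptions): a block is translated under an initial scope, its optimization efficiency is measured, and later executions widen the scope and retranslate once the rescheduling criterion is met.

        # Toy model of scope-driven retranslation of a foreign code block.
        def translate(block, scope):
            # Stand-in translator: pretend larger scopes expose more optimization.
            efficiency = min(1.0, 0.4 + 0.1 * scope)
            return {"native": f"native({block}, scope={scope})",
                    "efficiency": efficiency}

        def run_block(block, executions, reschedule_threshold=0.8):
            scope = 1
            translation = translate(block, scope)
            for n in range(executions):
                # ... execute translation["native"] here ...
                needs_rescope = translation["efficiency"] < reschedule_threshold
                if needs_rescope and n > 0:      # rescheduling criterion met
                    scope += 1                   # redefine the block scope
                    translation = translate(block, scope)
            return scope, translation["efficiency"]

        print(run_block("foreign_block_0", executions=6))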


Method and system for efficient instruction execution in a data processing system having multiple prefetch units
    28.
    Granted Patent
    Status: Expired

    Publication No.: US5737576A

    Publication Date: 1998-04-07

    Application No.: US754595

    Filing Date: 1996-11-20

    IPC Classification: G06F9/38 G06F12/00

    Abstract: In a data processing system, a plurality of prefetch elements are provided for prefetching instructions from a group of memory arrays coupled to each prefetch element. A plurality of instruction words are sequentially stored in each group of memory arrays coupled to each prefetch element. In response to a selected prefetch element receiving a prefetch token, the selected prefetch element sequentially recalls instruction words from the group of memory arrays coupled to the selected prefetch element. Thereafter, the selected prefetch element transfers the sequence of instruction words to a central processing unit at a rate of one instruction word per cycle time. In response to a forthcoming conditional branch instruction, a plurality of prefetch elements may initiate instruction fetching so that the proper instruction may be executed during the cycle time immediately following the conditional branch instruction. By coupling a group of memory banks to each prefetch element, and limiting the location of branch instructions to the last memory bank in the group, the number of prefetch elements required to implement a data processing system with substantially similar performance to the prior-art architecture is reduced. In an alternative embodiment, video memories are utilized to store instruction words and provide them to the CPU at the rate of one instruction word per cycle time.
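
    Below is a very small model of the token-passing part of this scheme (illustrative Python; the bank contents and hand-off rule are assumptions, and the dual fetch around a conditional branch described in the abstract is left out): each prefetch element owns a group of banks and, while it holds the token, delivers one instruction word per cycle before handing the token on.

        # Toy model: prefetch elements each own a group of memory banks and
        # stream one instruction word per cycle while they hold the token.
        class PrefetchElement:
            def __init__(self, words):
                self.words = list(words)     # contents of this element's banks

            def stream(self):
                for w in self.words:         # one word per cycle time
                    yield w

        def execute(elements):
            token = 0                        # which prefetch element holds the token
            trace = []
            while token < len(elements):
                for word in elements[token].stream():
                    trace.append(word)
                token += 1                   # hand the token to the next element
            return trace

        groups = [PrefetchElement(["i0", "i1", "i2", "branch"]),
                  PrefetchElement(["i4", "i5"])]
        print(execute(groups))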


Method and system for managing cache memory utilizing multiple hash functions
    29.
    Granted Patent
    Status: Expired

    Publication No.: US5659699A

    Publication Date: 1997-08-19

    Application No.: US353005

    Filing Date: 1994-12-09

    IPC Classification: G06F12/08 G06F12/10

    CPC Classification: G06F12/0864

    Abstract: In a data processing system, a tag memory is divided into a first tag memory portion and a second tag memory portion. Next, an address for recalling requested data is generated by a central processing unit. Thereafter, first and second tag memory addresses are concurrently computed, where the two addresses have bits that differ in value at a selected corresponding bit location. In response to the value of the bit at the selected location, the first tag memory address is coupled to either the first or the second tag memory portion, and, concurrently, the second tag memory address is coupled to the other tag memory portion. Next, tag data is concurrently recalled from both tag memory portions utilizing the first and second tag memory addresses. A search tag is generated in response to the memory address from the CPU. Thereafter, the search tag and the recalled tag data from the first and second tag memory portions are concurrently compared. If either comparison results in a match, a "hit" is indicated. In response to the indication of a hit, the requested data is recalled from the data portion of the cache memory system utilizing the recalled tag data that matched the search tag.
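
    A compact sketch of the two-hash lookup follows (illustrative Python; the hash functions, sizes, and the choice of steering bit are assumptions): two indices are computed from the address, a selected bit steers which index goes to which tag-memory half, and both halves are probed in the same step and compared against the search tag.

        # Toy cache directory using two hash functions; the tag memory is
        # split into two halves that are probed concurrently.
        TAG_SETS = 8                      # entries per tag-memory half (assumed)

        def hash1(addr):
            return (addr >> 6) % TAG_SETS

        def hash2(addr):
            return ((addr >> 6) ^ (addr >> 9)) % TAG_SETS

        tag_mem = [[None] * TAG_SETS, [None] * TAG_SETS]     # two halves

        def steer(addr):
            # The low bit of the first computed index selects which half
            # receives which tag-memory address (both are probed in parallel).
            a, b = hash1(addr), hash2(addr)
            return ((0, a), (1, b)) if a & 1 == 0 else ((1, a), (0, b))

        def install(addr):
            (half, idx), _ = steer(addr)
            tag_mem[half][idx] = addr >> 6                    # store the tag

        def lookup(addr):
            search_tag = addr >> 6
            return any(tag_mem[half][idx] == search_tag for half, idx in steer(addr))

        install(0x12C0)
        print(lookup(0x12C0), lookup(0x5000))                 # True False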


APPARATUS, METHOD, AND SYSTEM FOR PROVIDING A DECISION MECHANISM FOR CONDITIONAL COMMITS IN AN ATOMIC REGION
    30.
    Invention Application
    Status: In force

    Publication No.: US20130318507A1

    Publication Date: 2013-11-28

    Application No.: US13893238

    Filing Date: 2013-05-13

    IPC Classification: G06F11/36

    Abstract: An apparatus and method is described herein for conditionally committing and/or speculatively checkpointing transactions, which potentially results in dynamic resizing of transactions. During dynamic optimization of binary code, transactions are inserted to provide memory ordering safeguards, which enables a dynamic optimizer to optimize code more aggressively. The conditional commit enables efficient execution of the dynamically optimized code while attempting to prevent transactions from running out of hardware resources, and the speculative checkpoints enable quick and efficient recovery upon abort of a transaction. Processor hardware is adapted to support dynamic resizing of the transactions, such as by including decoders that recognize a conditional commit instruction, a speculative checkpoint instruction, or both, and is further adapted to perform operations that support conditional commit or speculative checkpointing in response to decoding such instructions.
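
    This abstract is shared with entry 23 above; to complement the conditional-commit sketch given there, here is a small illustration of the speculative-checkpoint side (Python, purely illustrative; the state model and abort condition are assumptions): architectural state is snapshotted at a checkpoint and, on an abort, execution rolls back to the snapshot instead of to the start of the whole region.

        # Toy speculative checkpoint: snapshot state, run speculatively, and
        # roll back to the snapshot (not to region start) on an abort.
        import copy

        def run_with_checkpoints(work_items, abort_on):
            state = {"regs": {"acc": 0}, "mem": []}
            checkpoint = copy.deepcopy(state)            # speculative checkpoint
            for item in work_items:
                if item == abort_on:                     # e.g. a conflict detected
                    state = copy.deepcopy(checkpoint)    # cheap recovery point
                    continue                             # resume after rollback
                state["regs"]["acc"] += item
                state["mem"].append(item)
                checkpoint = copy.deepcopy(state)        # advance the checkpoint
            return state

        print(run_with_checkpoints([1, 2, 3, 4], abort_on=3))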
