Method and apparatus for executing vector instructions with merging behavior

    Publication No.: US11573801B1

    Publication date: 2023-02-07

    Application No.: US17489301

    Filing date: 2021-09-29

    Abstract: A processor includes a register file and control logic that detects multiple different sets of sequential zero bits of a register in the register file, wherein each of the multiple different sets has a bit length that corresponds to a partial instruction width, and operates at a first partial instruction width or a second partial instruction width with the register file depending on the number of sets of zero bits detected in the register. In certain examples, the control logic causes operation at the first instruction width, which avoids merging of a first bit length of data in the register, and operation at the second instruction width, which avoids merging of a second bit length of data in the register. In some examples, a register rename map table includes multiple zero bits that identify the detected multiple different sets of bits of sequential zeros.
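The zero-segment detection described in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions: the 512-bit register size, the 128-bit segment size, and the function names `zero_segment_flags` and `operating_width` are hypothetical and do not come from the patent, and the width-selection rule shown (the narrowest width that covers all non-zero segments) is one plausible reading of "depending on the number of sets of zero bits detected".

```python
REG_BITS = 512   # assumed full register width (not stated in the abstract)
SEG_BITS = 128   # assumed partial instruction width (not stated in the abstract)


def zero_segment_flags(value: int, reg_bits: int = REG_BITS,
                       seg_bits: int = SEG_BITS) -> list[bool]:
    """Return one flag per segment, lowest segment first.

    A flag is True when every bit of that segment is zero, which models the
    "zero bits" a rename map entry could hold for the register.
    """
    mask = (1 << seg_bits) - 1
    return [((value >> (i * seg_bits)) & mask) == 0
            for i in range(reg_bits // seg_bits)]


def operating_width(flags: list[bool], seg_bits: int = SEG_BITS) -> int:
    """Pick the narrowest width whose upper segments are all known to be zero.

    Assumption: when the upper segments are zero, a narrower operation does
    not need to merge with old upper data, so it can run at that width.
    """
    live = len(flags)
    while live > 1 and flags[live - 1]:
        live -= 1
    return live * seg_bits


flags = zero_segment_flags(1 << 200)   # only the second 128-bit segment is non-zero
width = operating_width(flags)         # 256 under the assumptions above
```

In hardware the flags would be computed when the register is written and stored alongside the rename mapping; the model above recomputes them from the value only to keep the example short.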

    Bit width reconfiguration using a shadow-latch configured register file

    Publication No.: US11544065B2

    Publication date: 2023-01-03

    Application No.: US16585817

    Filing date: 2019-09-27

    Abstract: A processor includes a front-end with an instruction set that operates at a first bit width and a floating point unit coupled to receive the instruction set in the processor that operates at the first bit width. The floating point unit operates at a second bit width and, based upon a bit width assessment of the instruction set provided to the floating point unit, the floating point unit employs a shadow-latch configured floating point register file to perform bit width reconfiguration. The shadow-latch configured floating point register file includes a plurality of regular latches and a plurality of shadow latches for storing data that is to be either read from or written to the shadow latches. The bit width reconfiguration enables the floating point unit that operates at the second bit width to operate on the instruction set received at the first bit width.

    Faster sparse flush recovery by creating groups that are marked based on an instruction type

    Publication No.: US10776123B2

    Publication date: 2020-09-15

    Application No.: US16207548

    Filing date: 2018-12-03

    Abstract: Systems, apparatuses, and methods for performing efficient processor pipeline flush recovery are disclosed. A processor core includes a retire queue for storing information of outstanding instructions. When the retire queue logic detects that a pipeline flush condition occurs, the logic creates one or more groups of entries in the retire queue. The logic begins the groups with an entry storing information for a youngest outstanding instruction, and creates other groups in a contiguous manner after creating this first group. The logic marks with a first indication a given group when the given group includes one or more instructions of a given type. The logic marks with a second indication the given group when the given group does not include an instruction of the given type. The logic sends to flush recovery logic information of one or more entries in only groups marked with the first indication.
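The grouping scheme in this abstract can be sketched as a short software model. The names `Entry`, `needs_recovery`, and `groups_to_recover`, and the group size of 4, are hypothetical choices for illustration; the abstract does not say which instruction type triggers marking or how large a group is.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    seq: int               # program-order sequence number
    needs_recovery: bool   # hypothetical stand-in for "instruction of a given type"


def groups_to_recover(flushed_entries: list[Entry],
                      group_size: int = 4) -> list[list[Entry]]:
    """Group flushed retire-queue entries and keep only the marked groups.

    `flushed_entries` is ordered oldest to youngest. Groups are formed
    contiguously starting at the youngest entry, as the abstract describes.
    A group is "marked" (first indication) when any entry in it has the
    given type; unmarked groups are skipped entirely.
    """
    youngest_first = list(reversed(flushed_entries))
    selected = []
    for start in range(0, len(youngest_first), group_size):
        group = youngest_first[start:start + group_size]
        if any(e.needs_recovery for e in group):
            selected.append(group)
    return selected


queue = [Entry(seq=i, needs_recovery=(i == 6)) for i in range(8)]
marked = groups_to_recover(queue)
# Only the youngest group (seq 7, 6, 5, 4) is sent to flush recovery logic.
```

The benefit modeled here is that recovery work scales with the number of marked groups rather than with the total number of flushed entries.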

    Faster sparse flush recovery
    Patent application

    Publication No.: US20200174796A1

    Publication date: 2020-06-04

    Application No.: US16207548

    Filing date: 2018-12-03

    Abstract: Systems, apparatuses, and methods for performing efficient processor pipeline flush recovery are disclosed. A processor core includes a retire queue for storing information of outstanding instructions. When the retire queue logic detects that a pipeline flush condition occurs, the logic creates one or more groups of entries in the retire queue. The logic begins the groups with an entry storing information for a youngest outstanding instruction, and creates other groups in a contiguous manner after creating this first group. The logic marks with a first indication a given group when the given group includes one or more instructions of a given type. The logic marks with a second indication the given group when the given group does not include an instruction of the given type. The logic sends to flush recovery logic information of one or more entries in only groups marked with the first indication.

    Accelerated reversal of speculative state changes and resource recovery
    Granted patent (in force)

    Publication No.: US09575763B2

    Publication date: 2017-02-21

    Application No.: US13918863

    Filing date: 2013-06-14

    CPC classification number: G06F9/384 G06F9/3842 G06F9/3859 G06F9/3861

    Abstract: A method includes undoing, in reverse program order, changes in a state of a processing device caused by speculative instructions previously dispatched for execution in the processing device and concurrently deallocating resources previously allocated to the speculative instructions in response to interruption of dispatch of instructions due to a flush of the speculative instructions. A processor device comprises a retire queue to store entries for instructions that are awaiting retirement and a finite state machine. The finite state machine is to interrupt dispatch of instructions in response to a flush of speculative instructions previously dispatched for execution in the processing device and to undo, in reverse program order, changes in a state of the processing device caused by the speculative instructions while concurrently deallocating resources previously allocated to the speculative instructions.
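A minimal software model of the reverse-order undo described in this abstract is given below. It assumes, for illustration only, that the speculative state is a register rename map and that the allocated resources are physical registers on a free list; `RetireEntry`, `flush_recover`, and their fields are hypothetical names. Hardware would perform the undo and the deallocation concurrently, which the model approximates by doing both in the same loop step.

```python
from dataclasses import dataclass


@dataclass
class RetireEntry:
    arch_reg: str   # architectural register written by the instruction
    new_phys: int   # physical register allocated at dispatch
    old_phys: int   # previous mapping, saved so it can be restored


def flush_recover(rename_map: dict[str, int], free_list: list[int],
                  retire_queue: list[RetireEntry], flush_index: int) -> None:
    """Undo every entry at or after `flush_index`, youngest first.

    Each step restores the older mapping (undoing the state change) and
    returns the speculatively allocated register to the free list
    (deallocating the resource).
    """
    while len(retire_queue) > flush_index:
        entry = retire_queue.pop()                   # youngest outstanding entry
        rename_map[entry.arch_reg] = entry.old_phys  # undo the state change
        free_list.append(entry.new_phys)             # recover the resource


rename_map = {"r1": 12}
free_list: list[int] = []
retire_queue = [RetireEntry("r1", new_phys=11, old_phys=10),
                RetireEntry("r1", new_phys=12, old_phys=11)]
flush_recover(rename_map, free_list, retire_queue, flush_index=0)
# rename_map is back to {"r1": 10}; free_list holds 12 then 11.
```

Walking youngest first matters when several flushed instructions write the same architectural register: each step exposes the mapping that the next older entry expects to overwrite.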


    Methods and apparatus for providing mask register optimization for vector operations

    Publication No.: US12223324B2

    Publication date: 2025-02-11

    Application No.: US17957604

    Filing date: 2022-09-30

    Abstract: A data processing system includes a vector data processing unit that includes a shared scheduler queue configured to store, in the same queue, at least one entry that includes at least a mask type instruction and another entry that includes at least a vector type instruction. Shared pipeline control logic controls a vector data path or a mask data path based on the type of instruction picked from the same queue. In some examples, the at least one mask type instruction and the at least one vector type instruction each include a source operand having a corresponding shared source register bit field that indexes into both a mask register file and a vector register file. The shared pipeline control logic uses a mask register file or a vector register file depending on whether bits of the shared source register bit field identify a mask source register or a vector source register.
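The shared source register field in this abstract can be pictured with a small decoding sketch. The encoding is an assumption made purely for illustration: field values 0 to 31 are taken to name vector registers and values 32 to 39 to name mask registers. The abstract does not give the field width or the actual encoding, and `select_register`, `NUM_VECTOR_REGS`, and `NUM_MASK_REGS` are hypothetical names.

```python
NUM_VECTOR_REGS = 32   # assumed vector register count
NUM_MASK_REGS = 8      # assumed mask register count


def select_register(field: int, vector_rf: list[int],
                    mask_rf: list[int]) -> tuple[str, int]:
    """Decode one shared source field into (register file name, value).

    Assumed encoding: the low range of the field indexes the vector
    register file and the range directly above it indexes the mask
    register file, so one field can name a register in either file.
    """
    if 0 <= field < NUM_VECTOR_REGS:
        return ("vector", vector_rf[field])
    mask_index = field - NUM_VECTOR_REGS
    if 0 <= mask_index < NUM_MASK_REGS:
        return ("mask", mask_rf[mask_index])
    raise ValueError(f"field {field} does not name a register")


vector_rf = [100 + i for i in range(NUM_VECTOR_REGS)]
mask_rf = [200 + i for i in range(NUM_MASK_REGS)]
kind, value = select_register(33, vector_rf, mask_rf)   # second mask register
```

With a single field covering both files, the scheduler queue entry can keep one operand format for mask and vector instructions, which is what lets both instruction types share a queue.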

    Thread forward progress and/or quality of service

    Publication No.: US12204935B2

    Publication date: 2025-01-21

    Application No.: US17390149

    Filing date: 2021-07-30

    Abstract: Methods, systems, and apparatuses provide support for thread forward progress in a processing system and improve quality of service. One system includes a processor; a bus coupled to the processor; a memory coupled to the processor via the bus; and a floating point unit coupled to the processor via the bus, wherein the floating point unit comprises hardware control logic operative to: store for each thread, by a scheduler of the floating point unit, a counter; increase, by the scheduler, a value of the counter corresponding to a thread when at least one source-ready operation exists for the thread; compare, by the scheduler, the value of the counter to a predetermined threshold; and make other threads ineligible to be picked by the scheduler when the counter is greater than or equal to the predetermined threshold.
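The counter mechanism in this abstract can be modeled in a few lines. The class name `ForwardProgressScheduler` and its methods are hypothetical. Two behaviors are assumptions that the abstract does not state: the counter is reset when its thread is picked, and if several threads reach the threshold at once they all stay eligible.

```python
class ForwardProgressScheduler:
    """Toy model of per-thread starvation counters in a scheduler."""

    def __init__(self, thread_ids: list[int], threshold: int) -> None:
        self.threshold = threshold
        self.counters = {t: 0 for t in thread_ids}

    def tick(self, ready_threads: list[int]) -> None:
        """Increase the counter of each thread that has a source-ready op."""
        for t in ready_threads:
            self.counters[t] += 1

    def eligible(self) -> set[int]:
        """Return the threads the picker may choose from this cycle.

        Once any counter reaches the threshold, all other threads become
        ineligible so that the waiting thread makes forward progress.
        """
        starving = {t for t, c in self.counters.items() if c >= self.threshold}
        return starving if starving else set(self.counters)

    def pick(self, thread: int) -> None:
        """Record a pick. Assumption: picking a thread resets its counter."""
        self.counters[thread] = 0


sched = ForwardProgressScheduler([0, 1], threshold=3)
for _ in range(3):
    sched.tick(ready_threads=[1])   # thread 1 is ready but keeps losing the pick
only_thread_1 = sched.eligible()    # {1}
sched.pick(1)
both_again = sched.eligible()       # {0, 1}
```

The threshold trades throughput for fairness: a low value forces frequent switches to waiting threads, while a high value lets a dominant thread run longer before the scheduler intervenes.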
