Apparatus and method of improved extract instructions
    71.
    Granted invention patent
    Status: in force

    Publication No.: US09588764B2

    Publication date: 2017-03-07

    Application No.: US13976998

    Filing date: 2011-12-23

    IPC classification: G06F9/30

    Abstract: An apparatus is described that includes instruction execution circuitry to execute first, second, third, and fourth instructions. The first and second instructions each select a first group of input vector elements from one of multiple first non-overlapping sections of respective first and second input vectors. Each of the multiple first non-overlapping sections has the same bit width as the first group. The third and fourth instructions each select a second group of input vector elements from one of multiple second non-overlapping sections of respective third and fourth input vectors. The second group has a second bit width that is larger than the bit width of the first group. Each of the multiple second non-overlapping sections has the same bit width as the second group. The apparatus includes masking layer circuitry to mask the first and second groups at a first granularity and a second granularity.
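
The selection step described in this abstract can be illustrated with a minimal Python sketch. It is not the patented circuitry: the function name `extract_section`, the list-based vector model, the per-element write mask, and the zero fill for masked-off elements are all assumptions made for illustration.

```python
def extract_section(src, section_index, group_size, write_mask=None):
    """Return the group_size-element section number section_index of src.

    src is modeled as a list of elements split into non-overlapping
    sections, each exactly as wide as the extracted group.
    write_mask (hypothetical) has one bit per result element; elements
    whose bit is 0 are zeroed in this sketch.
    """
    if len(src) % group_size != 0:
        raise ValueError("vector length must be a multiple of group_size")
    if not 0 <= section_index < len(src) // group_size:
        raise IndexError("section_index out of range")
    start = section_index * group_size
    group = list(src[start:start + group_size])
    if write_mask is None:
        return group
    # Apply the per-element mask: keep element i only if bit i is set.
    return [v if (write_mask >> i) & 1 else 0 for i, v in enumerate(group)]
```

For a 16-element vector, `extract_section(list(range(16)), 2, 4)` selects the third 4-element section, and `extract_section(list(range(16)), 1, 8)` selects the upper half, mirroring the two group widths the abstract distinguishes.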


    Systems, apparatuses, and methods for zeroing of bits in a data element
    73.
    Granted invention patent
    Status: in force

    Publication No.: US09207942B2

    Publication date: 2015-12-08

    Application No.: US13840669

    Filing date: 2013-03-15

    IPC classification: G06F9/30 G06F9/00

    Abstract: Embodiments of systems, methods, and apparatuses for executing a VPBZHI instruction are described. Execution of a VPBZHI instruction causes, for each data element of a second source, a zeroing of the bits higher (more significant) than a starting point in that data element. The starting point is defined by the contents of a data element in a first source. The resultant data elements are stored in the corresponding data element positions of a destination.
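
A minimal Python sketch of the per-element zeroing described above. The function name `zero_high_bits`, the 32-bit default element width, the choice to clear bit positions at and above the starting point, and the saturation of oversized starting points are assumptions for illustration, not details taken from the patent.

```python
def zero_high_bits(start_points, values, width=32):
    """Clear, per element, every bit at position >= the starting point.

    start_points models the first source and values the second source;
    both are lists of equal length with one entry per data element.
    """
    element_mask = (1 << width) - 1
    result = []
    for start, value in zip(start_points, values):
        n = min(start, width)        # assumption: saturate at element width
        keep = (1 << n) - 1          # ones in bit positions 0 .. n-1
        result.append(value & keep & element_mask)
    return result
```

With starting points `[4, 0, 32]` and source values `[0xFF, 0xFF, 0xFFFFFFFF]`, the first element keeps only its low four bits, the second is cleared entirely, and the third is left unchanged.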


    APPARATUS AND METHOD TO REVERSE AND PERMUTE BITS IN A MASK REGISTER
    74.
    Invention patent application
    Status: in force

    Publication No.: US20150006847A1

    Publication date: 2015-01-01

    Application No.: US13929563

    Filing date: 2013-06-27

    IPC classification: G06F9/30

    Abstract: An apparatus and method are described for performing a bit reversal and permutation on mask values. For example, a processor is described that executes an instruction to perform the operations of: reading a plurality of mask bits stored in a source mask register, the mask bits being associated with vector data elements of a vector register; and performing a bit reversal operation to copy each mask bit from the source mask register to a destination mask register, wherein the bit reversal operation causes the bits from the source mask register to be reversed within the destination mask register, resulting in a symmetric mirror image of the original bit arrangement.
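
The bit reversal described above can be sketched in a few lines of Python. The function name `reverse_mask_bits` and the 16-bit default mask width are assumptions for illustration; the abstract does not fix a register width.

```python
def reverse_mask_bits(mask, num_bits=16):
    """Copy each bit i of mask to bit (num_bits - 1 - i) of the result."""
    out = 0
    for i in range(num_bits):
        if (mask >> i) & 1:
            out |= 1 << (num_bits - 1 - i)
    return out
```

Applying the function twice returns the original mask, which is the mirror-image property the abstract mentions.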


    VECTOR FREQUENCY COMPRESS INSTRUCTION
    75.
    Invention patent application
    Status: in force

    Publication No.: US20140317377A1

    Publication date: 2014-10-23

    Application No.: US13993058

    Filing date: 2011-12-30

    IPC classification: G06F9/30

    Abstract: A processor core is described that includes a hardware decode unit to decode a vector frequency compress instruction that includes a source operand and a destination operand. The source operand specifies a source vector register that includes a plurality of source data elements, including one or more runs of identical data elements that are each to be compressed in a destination vector register as a value and run length pair. The destination operand identifies the destination vector register. The processor core also includes an execution engine unit to execute the decoded vector frequency compress instruction, which causes, for each source data element, a value to be copied into the destination vector register to indicate that source data element's value. One or more runs of source data elements equal to a predetermined compression value are encoded in the destination vector register as that value followed by a run length for that run.
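
One possible reading of this encoding, as a minimal Python sketch. The function name `frequency_compress`, the default compression value of 0, and the decision to leave all other elements uncompressed are assumptions; the abstract does not specify them.

```python
def frequency_compress(src, compress_value=0):
    """Run-length encode only the runs of compress_value in src.

    Every other element is copied through unchanged. Each run of
    compress_value becomes the pair (compress_value, run_length).
    """
    out = []
    i = 0
    while i < len(src):
        if src[i] == compress_value:
            run = 1
            while i + run < len(src) and src[i + run] == compress_value:
                run += 1
            out.extend([compress_value, run])
            i += run
        else:
            out.append(src[i])
            i += 1
    return out
```

Under these assumptions, `[5, 0, 0, 0, 7, 0, 9]` becomes `[5, 0, 3, 7, 0, 1, 9]`: a run of length one still costs two output elements, so the scheme only pays off when the compressed value tends to occur in longer runs.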


    INSTRUCTION AND LOGIC TO PROVIDE VECTOR HORIZONTAL COMPARE FUNCTIONALITY
    77.
    Invention patent application
    Status: in force

    Publication No.: US20140258683A1

    Publication date: 2014-09-11

    Application No.: US13977733

    Filing date: 2011-11-30

    IPC classification: G06F9/30

    Abstract: Instructions and logic provide vector horizontal compare functionality. Some embodiments, responsive to an instruction specifying a destination operand, a size of the vector elements, a source operand, and a mask corresponding to a portion of the vector element data fields in the source operand, read values from the data fields of the specified size in the source operand that correspond to the mask and compare the values for equality. In some embodiments, responsive to a detection of inequality, a trap may be taken. In some alternative embodiments, a flag may be set. In other alternative embodiments, a mask field may be set to a masked state for the corresponding unequal value(s). In some embodiments, responsive to all unmasked data fields of the source operand being equal to a particular value, that value may be broadcast to all data fields of the specified size in the destination operand.
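
The compare-and-broadcast behavior can be sketched in Python as follows. The function name `horizontal_compare`, the use of the first unmasked element as the reference value, and the returned pair (broadcast result, mask of unequal positions) are assumptions for illustration; the abstract lists a trap, a flag, or a mask update as alternative responses to inequality, and this sketch models only the mask variant.

```python
def horizontal_compare(src, mask):
    """Compare all unmasked elements of src for equality.

    Bit i of mask selects element i. Returns (dest, unequal_mask):
    dest is the common value broadcast to every position when all
    selected elements match, otherwise None; unequal_mask has a bit
    set for each selected element that differs from the reference.
    """
    selected = [v for i, v in enumerate(src) if (mask >> i) & 1]
    if not selected:
        return None, 0               # nothing selected, nothing to compare
    reference = selected[0]          # assumption: first unmasked element
    unequal_mask = 0
    for i, v in enumerate(src):
        if (mask >> i) & 1 and v != reference:
            unequal_mask |= 1 << i
    if unequal_mask == 0:
        return [reference] * len(src), 0
    return None, unequal_mask
```

For `src = [7, 7, 3, 7]`, a mask of `0b1011` skips the odd element and yields a broadcast of 7, while a mask of `0b1111` reports position 2 as unequal.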


    INSTRUCTION FOR ELEMENT OFFSET CALCULATION IN A MULTI-DIMENSIONAL ARRAY
    79.
    Invention patent application
    Status: in force

    Publication No.: US20140201497A1

    Publication date: 2014-07-17

    Application No.: US13976004

    Filing date: 2011-12-23

    IPC classification: G06F9/30

    Abstract: An apparatus having functional unit logic circuitry is described. The functional unit logic circuitry has a first register to store a first input vector operand having an element for each dimension of a multi-dimensional data structure. Each element of the first vector operand specifies the size of its respective dimension. The functional unit has a second register to store a second input vector operand specifying coordinates of a particular segment of the multi-dimensional structure. The functional unit also has logic circuitry to calculate an address offset for the particular segment relative to an address of an origin segment of the multi-dimensional structure.
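
A minimal Python sketch of the offset calculation. The function name `element_offset`, the `element_size` parameter, and the layout in which the first dimension is contiguous (stride 1) are assumptions for illustration; the abstract does not state the memory layout.

```python
def element_offset(dim_sizes, coords, element_size=1):
    """Offset of the segment at coords relative to the origin segment.

    dim_sizes models the first vector operand (one size per dimension)
    and coords the second (one coordinate per dimension).
    """
    if len(dim_sizes) != len(coords):
        raise ValueError("one coordinate is required per dimension")
    offset = 0
    stride = 1
    for size, coord in zip(dim_sizes, coords):
        if not 0 <= coord < size:
            raise IndexError("coordinate outside its dimension")
        offset += coord * stride     # contribution of this dimension
        stride *= size               # next dimension steps over this one
    return offset * element_size
```

For dimension sizes `[4, 3, 2]` and coordinates `[1, 2, 1]`, the offset is 1 + 2*4 + 1*12 = 21 elements from the origin.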


    METHODS, APPARATUS, INSTRUCTIONS, AND LOGIC TO PROVIDE VECTOR ADDRESS CONFLICT DETECTION FUNCTIONALITY
    80.
    Invention patent application
    Status: in force

    Publication No.: US20140189308A1

    Publication date: 2014-07-03

    Application No.: US13731006

    Filing date: 2012-12-29

    IPC classification: G06F9/30

    Abstract: Instructions and logic provide SIMD address conflict detection functionality. Some embodiments include processors with a register having a variable plurality of data fields, each of the data fields to store an offset for a data element in a memory. A destination register has corresponding data fields, each to store a variable second plurality of bits holding a conflict mask that has a mask bit for each offset. Responsive to decoding a vector conflict instruction, execution units compare the offset in each data field with every less significant data field to determine whether they hold a matching offset and, in the corresponding conflict masks in the destination register, set any mask bits corresponding to a less significant data field with a matching offset. Vector address conflict detection can be used with variable-sized elements and to generate conflict masks to resolve dependencies in gather-modify-scatter SIMD operations.
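
The comparison against less significant fields can be sketched in Python. The function name `vector_conflict` and the list-of-integers model of the registers are assumptions for illustration.

```python
def vector_conflict(offsets):
    """Build one conflict mask per data field.

    Bit j of masks[i] is set when field j is less significant than
    field i (j < i) and holds the same offset.
    """
    masks = []
    for i, offset in enumerate(offsets):
        mask = 0
        for j in range(i):           # only less significant fields
            if offsets[j] == offset:
                mask |= 1 << j
        masks.append(mask)
    return masks
```

For offsets `[10, 20, 10, 10, 20]` the masks are `[0, 0, 0b1, 0b101, 0b10]`. A field with a zero mask has no earlier duplicate, so in a gather-modify-scatter loop those fields can be processed together while the others wait for the fields named in their masks.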
