摘要:
Instructions and logic provide conversions between a mask register and a general purpose register or memory. Some embodiments, responsive to an instruction specifying: a destination operand, a mask length corresponding to a number of mask data fields, and a source operand; values are read from data fields in the source operand, corresponding to the specified mask length, and stored to corresponding data fields in the destination operand specified by the instruction, wherein one of the source or the destination operands is a mask register. Values indicative of masked vector elements may be stored to any data fields in the destination operand other than the number of data fields corresponding to the specified mask length. For some embodiments, the other one of the source or the destination operands may be a general purpose register or a memory location.
摘要:
Vector blend and permute functionality are provided, responsive to instructions specifying: a destination vector register comprising fields to store vector elements, a first vector register, a vector element size, a second vector register, and a third operand. Indices are read from fields in the second register. Each index has a first selector portion and a second selector portion. Corresponding unmasked vector elements are stored to fields of the destination register, wherein each vector element, responsive to the respective first selector portion having a first value, is copied to an intermediate vector from a corresponding data field of the first register, and responsive to the respective first selector portion having a second value, is copied to the intermediate vector from a corresponding data field of the third operand. Then unmasked data fields of the destination are replaced by data fields in the intermediate vector indexed by the corresponding second selector portions.
摘要:
A method is described that includes reading a first read mask from a first register. The method also includes reading a first vector operand from a second register or memory location. The method also includes applying the read mask against the first vector operand to produce a set of elements for operation. The method also includes performing an operation of the set elements. The method also includes creating an output vector by producing multiple instances of the operation's result. The method also includes reading a first write mask from a third register, the first write mask being different than the first read mask. The method also includes applying the write mask against the output vector to create a resultant vector. The method also includes writing the resultant vector to a destination register.
摘要:
Embodiments of systems, apparatuses, and methods for performing in a computer processor mask bit compression in response to a single mask bit compression instruction that includes a source writemask register operand, a destination writemask register operand, and an opcode are described.
摘要:
Receive packed data operation mask comparison instruction indicating first packed data operation mask having first packed data operation mask bits and second packed data operation mask having second packed data operation mask bits. Each packed data operation mask bit of first mask corresponds to a packed data operation mask bit of second mask in corresponding position. Modify first flag to first value if bitwise AND of each packed data operation mask bit of first mask with each corresponding packed data operation mask bit of second mask is zero. Otherwise modify first flag to second value. Modify second flag to third value if bitwise AND of each packed data operation mask bit of first mask with bitwise NOT of each corresponding packed data operation mask bit of second mask is zero. Otherwise modify second flag to fourth value.
摘要:
Instructions and logic provide conversions between a mask register and a general purpose register or memory. Some embodiments, responsive to an instruction specifying: a destination operand, a mask length corresponding to a number of mask data fields, and a source operand; values are read from data fields in the source operand, corresponding to the specified mask length, and stored to corresponding data fields in the destination operand specified by the instruction, wherein one of the source or the destination operands is a mask register. Values indicative of masked vector elements may be stored to any data fields in the destination operand other than the number of data fields corresponding to the specified mask length. For some embodiments, the other one of the source or the destination operands may be a general purpose register or a memory location.
摘要:
A processing core implemented on a semiconductor chip is described having first execution unit logic circuitry that includes first comparison circuitry to compare each element in a first input vector against every element of a second input vector. The processing core also has second execution logic circuitry that includes second comparison circuitry to compare a first input value against every data element of an input vector.
摘要:
An apparatus and method are described for permuting data elements with masking. For example, a method according to one embodiment includes the following operations: reading values from a mask data structure to determine whether masking is implemented for each data element of a destination operand; if masking not implemented for a particular data element, then selecting data elements from the destination operand and a second source operand based on index values stored in a first source operand to be copied to data element positions within the destination operand, wherein any one of the data elements from either the destination operand and the second source operand may be copied to any one of the data element positions within the destination operand; if masking is implemented for a particular data element of the destination operand, then performing a designated masking operation with respect to that particular data element.
摘要:
Receive packed data operation mask comparison instruction indicating first packed data operation mask having first packed data operation mask bits and second packed data operation mask having second packed data operation mask bits. Each packed data operation mask bit of first mask corresponds to a packed data operation mask bit of second mask in corresponding position. Modify first flag to first value if bitwise AND of each packed data operation mask bit of first mask with each corresponding packed data operation mask bit of second mask is zero. Otherwise modify first flag to second value. Modify second flag to third value if bitwise AND of each packed data operation mask bit of first mask with bitwise NOT of each corresponding packed data operation mask bit of second mask is zero. Otherwise modify second flag to fourth value.
摘要:
A loop remainder mask instruction indicates a current iteration count of a loop as a first operand, an iteration limit of a loop as a second operand, and a destination. The loop contains iterations and each iteration includes a data element of the array. A processor receives the loop remainder mask instruction, decodes the instruction for execution, and stores a result of the execution in the destination. The result indicates a number of data elements of the array past an end of a preceding portion of the array that are to be handled separately from the preceding portion, the end of the preceding portion being where the current iteration count is recorded.