-
Publication number: US20230376313A1
Publication date: 2023-11-23
Application number: US17747919
Application date: 2022-05-18
Applicant: Intel Corporation
Inventor: Menachem Adelman , Amit Gradstein , Cristina Anderson , Marius Cornea-Hasegan
CPC classification number: G06F9/30145 , G06F9/30021 , G06F7/22 , G06F7/483
Abstract: Techniques for instructions for min-max operations are described. An example apparatus comprises decoder circuitry to decode a single instruction, the single instruction to include fields for identifiers of a first source operand, a second source operand, and a destination operand, a field for an immediate operand, and a field for an opcode, the opcode to indicate execution circuitry is to perform a min-max operation, and execution circuitry to execute the decoded instruction according to the opcode to perform the min-max operation to determine a particular operation of five or more minimum and maximum operations in accordance with a value of the immediate operand, perform the determined particular operation on the identified first source operand and the identified second source operand to return a result, and store the result into the identified destination operand. Other examples are described and claimed.
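The abstract describes an immediate operand that selects one of five or more minimum/maximum variants. The Python sketch below illustrates that selection step only. The immediate encoding, the six variants chosen, and every function name are assumptions made for illustration; none of them is taken from the patent.

```python
import math

def _propagate_nan(op):
    # Wrap a two-input operation so a NaN in either input yields NaN.
    def wrapped(a, b):
        if math.isnan(a) or math.isnan(b):
            return math.nan
        return op(a, b)
    return wrapped

def _ignore_nan(op):
    # Wrap a two-input operation so a single NaN input is ignored.
    def wrapped(a, b):
        if math.isnan(a):
            return b
        if math.isnan(b):
            return a
        return op(a, b)
    return wrapped

def _min_magnitude(a, b):
    # Pick the input with the smaller absolute value; ties fall back to min.
    if abs(a) < abs(b):
        return a
    if abs(b) < abs(a):
        return b
    return min(a, b)

def _max_magnitude(a, b):
    # Pick the input with the larger absolute value; ties fall back to max.
    if abs(a) > abs(b):
        return a
    if abs(b) > abs(a):
        return b
    return max(a, b)

# Hypothetical immediate encoding (an assumption, not the patent's encoding).
OPERATIONS = {
    0: _propagate_nan(min),
    1: _propagate_nan(max),
    2: _propagate_nan(_min_magnitude),
    3: _propagate_nan(_max_magnitude),
    4: _ignore_nan(min),
    5: _ignore_nan(max),
}

def min_max(src1, src2, imm):
    # "Decode" the immediate into one operation, then apply it to the sources.
    operation = OPERATIONS[imm]
    return operation(src1, src2)
```

In this sketch a single entry point, `min_max`, covers all six variants, which mirrors the idea of one opcode whose behavior is chosen by the immediate rather than one opcode per variant.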
-
Publication number: US09235415B2
Publication date: 2016-01-12
Application number: US14533474
Application date: 2014-11-05
Applicant: Intel Corporation
Inventor: Cristina Anderson , Mark Buxton , Doron Orenstein , Robert Valentine
CPC classification number: G06F9/30032 , G06F7/76 , G06F9/30036 , G06F9/30047 , G06F9/30145 , G06F9/30167 , G06F9/30181 , G06F12/0875 , G06F2212/452
Abstract: In one embodiment, the present invention includes logic to receive a permute instruction, first and second source operands, and control values, and to perform a permute operation based on an operation between at least two of the control values so that selected portions of the first and second source operands or a predetermined value can be stored into elements of a destination. Multiple permute instructions may be combined to perform efficient table lookups. Other embodiments are described and claimed.
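The abstract says that each destination element receives a selected portion of the first source, of the second source, or a predetermined value, and that permutes can be combined into table lookups. The Python sketch below shows one way such a two-source permute could behave. The control layout (a 4-bit index, a source-select bit, and a zero bit) is a hypothetical encoding chosen for illustration, and the sketch does not model the "operation between at least two of the control values" that the abstract mentions.

```python
INDEX_MASK = 0x0F   # hypothetical: low 4 bits hold the element index
SELECT_BIT = 0x10   # hypothetical: set means "read from the second source"
ZERO_BIT = 0x80     # hypothetical: set means "write the predetermined value"

def permute2(src1, src2, control, fill=0):
    # Build the destination one element at a time from the control values.
    n = len(src1)
    dest = []
    for c in control:
        if c & ZERO_BIT:
            dest.append(fill)
        else:
            source = src2 if c & SELECT_BIT else src1
            dest.append(source[(c & INDEX_MASK) % n])
    return dest

def table_lookup(table, indices):
    # Split an 8-entry table across two 4-element sources and look up
    # several indices with a single permute.
    src1, src2 = table[:4], table[4:8]
    control = [(i & 3) | (SELECT_BIT if i >= 4 else 0) for i in indices]
    return permute2(src1, src2, control)
```

For example, `table_lookup([10, 11, 12, 13, 14, 15, 16, 17], [7, 0, 4, 3])` reads entries 7, 0, 4, and 3 of the table in one pass.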
-
Publication number: US20150058603A1
Publication date: 2015-02-26
Application number: US14533474
Application date: 2014-11-05
Applicant: Intel Corporation
Inventor: Cristina Anderson , Mark Buxton , Doron Orenstein , Robert Valentine
CPC classification number: G06F9/30032 , G06F7/76 , G06F9/30036 , G06F9/30047 , G06F9/30145 , G06F9/30167 , G06F9/30181 , G06F12/0875 , G06F2212/452
Abstract: In one embodiment, the present invention includes logic to receive a permute instruction, first and second source operands, and control values, and to perform a permute operation based on an operation between at least two of the control values so that selected portions of the first and second source operands or a predetermined value can be stored into elements of a destination. Multiple permute instructions may be combined to perform efficient table lookups. Other embodiments are described and claimed.
-
Publication number: US12229554B2
Publication date: 2025-02-18
Application number: US17463405
Application date: 2021-08-31
Applicant: Intel Corporation
Inventor: Alexander Heinecke , Menachem Adelman , Robert Valentine , Zeev Sperber , Amit Gradstein , Mark Charney , Evangelos Georganas , Dhiraj Kalamkar , Christopher Hughes , Cristina Anderson
Abstract: Techniques for performing BF16 FMA in response to an instruction are described. In some examples, an instruction has fields for an opcode, an identification of location of a packed data source/destination operand (a first source), an identification of a location of a second packed data source operand, an identification of a location of a third packed data source operand, and an identification of location of a packed data source/destination operand, wherein the opcode is to indicate operand ordering and that execution circuitry is to, per data element position, perform a BF16 value fused multiply-accumulate operation using the first, second, and third source operands and store a result in a corresponding data element position of the source/destination operand.
-
Publication number: US12204903B2
Publication date: 2025-01-21
Application number: US17359522
Application date: 2021-06-26
Applicant: Intel Corporation
Inventor: Venkateswara Madduri , Cristina Anderson , Robert Valentine , Mark Charney , Vedvyas Shanbhogue
IPC: G06F9/30
Abstract: Techniques for matrix multiplication are described. In some examples, a single instruction having a format of fields for an opcode, one or more fields to indicate a location of a source/destination operand, one or more fields to indicate a location of a first source operand, and one or more fields to indicate a location of a second source operand is used, wherein the opcode is to indicate that execution circuitry is to: multiply values from corresponding data elements of the first and second sources, add a first subset of the multiplied values to a first value from the source/destination operand and store in a first data element position of the source/destination operand, and add a second subset of the multiplied values to a second value from the source/destination operand and store in a second data element position of the source/destination operand.
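The multiply-then-accumulate-by-subset behavior in the abstract can be sketched in a few lines of Python. The function name, the use of plain integers, and the grouping of adjacent products into equal-sized subsets are assumptions for illustration; the abstract does not state element types or how the subsets are formed.

```python
def multiply_accumulate_subsets(src_dst, src1, src2, group=2):
    # Multiply corresponding elements of the two sources.
    products = [a * b for a, b in zip(src1, src2)]
    # Add each subset of `group` adjacent products to the matching
    # accumulator element of the source/destination operand.
    return [
        acc + sum(products[i * group:(i + 1) * group])
        for i, acc in enumerate(src_dst)
    ]
```

With `src_dst = [100, 200]`, `src1 = [1, 2, 3, 4]`, and `src2 = [5, 6, 7, 8]`, the products are 5, 12, 21, and 32; the first pair is added to 100 and the second pair to 200.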
-
Publication number: US10768896B2
Publication date: 2020-09-08
Application number: US15850636
Application date: 2017-12-21
Applicant: Intel Corporation
Inventor: Cristina Anderson , Elmoustapha Ould-Ahmed-Vall , Marius Cornea-Hasegan , Robert Valentine , Mark Charney , Jesus Corbal , Venkateswara Madduri
Abstract: An apparatus and method for performing a reciprocal. For example, one embodiment of a processor comprises: a decoder to decode a reciprocal instruction to generate a decoded reciprocal instruction; a source register to store at least one packed input data element; a destination register to store a result data element; and reciprocal execution circuitry to execute the decoded reciprocal instruction, the reciprocal execution circuitry to use a first portion of the packed input data element as an index to a data structure containing a plurality of sets of coefficients to identify a first set of coefficients from the plurality of sets, the reciprocal execution circuitry to generate a reciprocal of the packed input data element using a combination of the coefficients and a second portion of the packed input data element.
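The table-driven scheme in the abstract (one portion of the input indexes a set of coefficients, the remaining portion is combined with them) can be illustrated with a small Python sketch. The table size, the quadratic Taylor coefficients, and all names below are assumptions chosen for illustration; the patent's actual coefficient sets and bit split are not given in the abstract.

```python
import math

TABLE_BITS = 6                 # hypothetical: 64 coefficient sets
TABLE_SIZE = 1 << TABLE_BITS

def _build_table():
    # One (c0, c1, c2) set per sub-interval of the mantissa range [0.5, 1):
    # Taylor coefficients of 1/m around the lower edge of the sub-interval.
    table = []
    for i in range(TABLE_SIZE):
        lo = 0.5 + i / (2 * TABLE_SIZE)
        table.append((1.0 / lo, -1.0 / lo**2, 1.0 / lo**3))
    return table

COEFFS = _build_table()

def approx_reciprocal(x):
    if x == 0.0 or math.isinf(x) or math.isnan(x):
        raise ValueError("sketch handles finite, nonzero inputs only")
    # abs(x) == m * 2**e with m in [0.5, 1)
    m, e = math.frexp(abs(x))
    scaled = (m - 0.5) * 2 * TABLE_SIZE
    idx = int(scaled)                          # first portion: table index
    r = (scaled - idx) / (2 * TABLE_SIZE)      # second portion: offset in sub-interval
    c0, c1, c2 = COEFFS[idx]
    approx = c0 + r * (c1 + r * c2)            # evaluate the quadratic
    # 1/abs(x) == (1/m) * 2**(-e); restore the sign of x.
    return math.copysign(math.ldexp(approx, -e), x)
```

With 64 sub-intervals and a quadratic per sub-interval, the relative error of this sketch stays below about 4e-6; a larger table or higher-degree polynomial would trade storage or multiplies for accuracy.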
-
Publication number: US10664237B2
Publication date: 2020-05-26
Application number: US15850673
Application date: 2017-12-21
Applicant: Intel Corporation
Inventor: Cristina Anderson , Elmoustapha Ould-Ahmed-Vall , Marius Cornea-Hasegan , Robert Valentine , Mark Charney , Jesus Corbal , Venkateswara Madduri
Abstract: An apparatus and method for performing a reciprocal square root. For example, one embodiment of a processor comprises: a decoder to decode a reciprocal square root instruction to generate a decoded reciprocal square root instruction; a source register to store at least one packed input data element; a destination register to store a result data element; and reciprocal square root execution circuitry to execute the decoded reciprocal square root instruction, the reciprocal square root execution circuitry to use a first portion of the packed input data element as an index to a data structure containing a plurality of sets of coefficients to identify a first set of coefficients from the plurality of sets, the reciprocal square root execution circuitry to generate a reciprocal square root of the packed input data element using a combination of the coefficients and a second portion of the packed input data element.