-
公开(公告)号:US12086594B2
公开(公告)日:2024-09-10
申请号:US18239106
申请日:2023-08-28
Applicant: Intel Corporation
Inventor: Robert C. Valentine , Jesus Corbal San Adrian , Roger Espasa Sans , Robert D. Cavin , Bret L. Toll , Santiago Galan Duran , Jeffrey G. Wiedemeier , Sridhar Samudrala , Milind Baburao Girkar , Edward Thomas Grochowski , Jonathan Cannon Hall , Dennis R. Bradford , Elmoustapha Ould-Ahmed-Vall , James C Abel , Mark Charney , Seth Abraham , Suleyman Sair , Andrew Thomas Forsyth , Lisa Wu , Charles Yount
IPC: G06F9/30 , G06F9/34 , H01L29/66 , H01L29/775 , H01L29/78 , H01L29/786
CPC classification number: G06F9/30145 , G06F9/3001 , G06F9/30014 , G06F9/30025 , G06F9/30032 , G06F9/30036 , G06F9/30047 , G06F9/30149 , G06F9/30181 , G06F9/30185 , G06F9/30192 , G06F9/34 , H01L29/66553 , H01L29/775 , H01L29/7831 , H01L29/78696 , G06F9/30018 , H01L29/66
Abstract: A vector friendly instruction format and execution thereof. According to one embodiment of the invention, a processor is configured to execute an instruction set. The instruction set includes a vector friendly instruction format. The vector friendly instruction format has a plurality of fields including a base operation field, a modifier field, an augmentation operation field, and a data element width field, wherein the first instruction format supports different versions of base operations and different augmentation operations through placement of different values in the base operation field, the modifier field, the alpha field, the beta field, and the data element width field, and wherein only one of the different values may be placed in each of the base operation field, the modifier field, the alpha field, the beta field, and the data element width field on each occurrence of an instruction in the first instruction format in instruction streams.
-
公开(公告)号:US11210096B2
公开(公告)日:2021-12-28
申请号:US17004711
申请日:2020-08-27
Applicant: Intel Corporation
Inventor: Robert C. Valentine , Jesus Corbal San Adrian , Roger Espasa Sans , Robert D. Cavin , Bret L. Toll , Santiago Galan Duran , Jeffrey G. Wiedemeier , Sridhar Samudrala , Milind Baburao Girkar , Edward Thomas Grochowski , Jonathan Cannon Hall , Dennis R. Bradford , Elmoustapha Ould-Ahmed-Vall , James C Abel , Mark Charney , Seth Abraham , Suleyman Sair , Andrew Thomas Forsyth , Lisa Wu , Charles Yount
Abstract: A vector friendly instruction format and execution thereof. According to one embodiment of the invention, a processor is configured to execute an instruction set. The instruction set includes a vector friendly instruction format. The vector friendly instruction format has a plurality of fields including a base operation field, a modifier field, an augmentation operation field, and a data element width field, wherein the first instruction format supports different versions of base operations and different augmentation operations through placement of different values in the base operation field, the modifier field, the alpha field, the beta field, and the data element width field, and wherein only one of the different values may be placed in each of the base operation field, the modifier field, the alpha field, the beta field, and the data element width field on each occurrence of an instruction in the first instruction format in instruction streams.
-
公开(公告)号:US09606847B2
公开(公告)日:2017-03-28
申请号:US14575614
申请日:2014-12-18
Applicant: Intel Corporation
Inventor: Jesus San Adrian Corbal , Dennis R. Bradford , Rohan Sharma
CPC classification number: G06F11/07 , G06F11/0706 , G06F11/0751
Abstract: In accordance with embodiments disclosed herein, there is provided systems and methods for detecting and reporting errors in a machine check environment. A processing device includes an error monitoring module, which detects an error corresponding to data associated with execution of an instruction by the processing device and determines whether the error occurs on portion of the data that affects a result of the instruction. The processing device further enables error detection when it is determined that the error occurs on the portion of the data that affects the result of the execution of the instruction.
-
公开(公告)号:US12073214B2
公开(公告)日:2024-08-27
申请号:US17952001
申请日:2022-09-23
Applicant: Intel Corporation
Inventor: Jesus Corbal , Robert Valentine , Roman S. Dubtsov , Nikita A. Shustrov , Mark J. Charney , Dennis R. Bradford , Milind B. Girkar , Edward T. Grochowski , Thomas D. Fletcher , Warren E. Ferguson
CPC classification number: G06F9/3001 , G06F7/483 , G06F7/5443 , G06F9/30036 , G06F9/30109 , G06F9/30112 , G06F9/3893
Abstract: Embodiments of systems, apparatuses, and methods for chained fused multiply add. In some embodiments, an apparatus includes a decoder to decode a single instruction having an opcode, a destination field representing a destination operand, a first source field representing a plurality of packed data source operands of a first type that have packed data elements of a first size, a second source field representing a plurality of packed data source operands that have packed data elements of a second size, and a field for a memory location that stores a scalar value. A register file having a plurality of packed data registers includes registers for the plurality of packed data source operands that have packed data elements of a first size, the source operands that have packed data elements of a second size, and the destination operand. Execution circuitry executes the decoded single instruction to perform iterations of packed fused multiply accumulate operations by multiplying packed data elements of the sources of the first type by sub-elements of the scalar value, and adding results of these multiplications to an initial value in a first iteration and a result from a previous iteration in subsequent iterations.
-
公开(公告)号:US11487541B2
公开(公告)日:2022-11-01
申请号:US17107134
申请日:2020-11-30
Applicant: Intel Corporation
Inventor: Jesus Corbal , Robert Valentine , Roman S. Dubtsov , Nikita A. Shustrov , Mark J. Charney , Dennis R. Bradford , Milind B. Girkar , Edward T. Grochowski , Thomas D. Fletcher , Warren E. Ferguson
Abstract: Embodiments of systems, apparatuses, and methods for chained fused multiply add. In some embodiments, an apparatus includes a decoder to decode a single instruction having an opcode, a destination field representing a destination operand, a first source field representing a plurality of packed data source operands of a first type that have packed data elements of a first size, a second source field representing a plurality of packed data source operands that have packed data elements of a second size, and a field for a memory location that stores a scalar value. A register file having a plurality of packed data registers includes registers for the plurality of packed data source operands that have packed data elements of a first size, the source operands that have packed data elements of a second size, and the destination operand. Execution circuitry executes the decoded single instruction to perform iterations of packed fused multiply accumulate operations by multiplying packed data elements of the sources of the first type by sub-elements of the scalar value, and adding results of these multiplications to an initial value in a first iteration and a result from a previous iteration in subsequent iterations.
-
公开(公告)号:US10146535B2
公开(公告)日:2018-12-04
申请号:US15299420
申请日:2016-10-20
Applicant: Intel Corporation
Inventor: Jesus Corbal , Robert Valentine , Roman S. Dubtsov , Nikita A. Shustrov , Mark J. Charney , Dennis R. Bradford , Milind B. Girkar , Edward T. Grochowski , Thomas D. Fletcher , Warren E. Ferguson
Abstract: Embodiments of systems, apparatuses, and methods for chained fused multiply add. In some embodiments, an apparatus includes a decoder to decode a single instruction having an opcode, a destination field representing a destination operand, a first source field representing a plurality of packed data source operands of a first type that have packed data elements of a first size, a second source field representing a plurality of packed data source operands that have packed data elements of a second size, and a field for a memory location that stores a scalar value. A register file having a plurality of packed data registers includes registers for the plurality of packed data source operands that have packed data elements of a first size, the source operands that have packed data elements of a second size, and the destination operand. Execution circuitry executes the decoded single instruction to perform iterations of packed fused multiply accumulate operations by multiplying packed data elements of the sources of the first type by sub-elements of the scalar value, and adding results of these multiplications to an initial value in a first iteration and a result from a previous iteration in subsequent iterations.
-
公开(公告)号:US09842046B2
公开(公告)日:2017-12-12
申请号:US13631378
申请日:2012-09-28
Applicant: Intel Corporation
Inventor: Andrew T. Forsyth , Dennis R. Bradford , Jonathan C. Hall
CPC classification number: G06F12/00 , G06F3/06 , G06F3/0608 , G06F3/0641 , G06F9/30 , G06F9/30018 , G06F9/30036 , G06F9/30043 , G06F11/1453 , G06F12/0246
Abstract: A method of an aspect includes receiving an instruction indicating a first source packed memory indices, a second source packed data operation mask, and a destination storage location. Memory indices of the packed memory indices are compared with one another. One or more sets of duplicate memory indices are identified. Data corresponding to each set of duplicate memory indices is loaded only once. The loaded data corresponding to each set of duplicate memory indices is replicated for each of the duplicate memory indices in the set. A packed data result in the destination storage location in response to the instruction. The packed data result includes data elements from memory locations that are indicated by corresponding memory indices of the packed memory indices when not blocked by corresponding elements of the packed data operation mask.
-
公开(公告)号:US09696992B2
公开(公告)日:2017-07-04
申请号:US14581815
申请日:2014-12-23
Applicant: INTEL CORPORATION
Inventor: Jesus Corbal San Adrian , Robert N. Hanek , Warren E. Ferguson , Taraneh Bahrami , Avi A. Tevet , Dennis R. Bradford , Michael Ferry , Jingwei Zhang
CPC classification number: G06F9/3001 , G06F9/30014 , G06F9/30076 , G06F9/30145 , G06F9/30181 , G06F9/3861 , G06F9/4552 , G06F9/45525
Abstract: An apparatus and method for performing a check on inputs to a mathematical instruction and selecting a default sequence efficiently managing the architectural state of a processor. For example, one embodiment of a processor comprises: an arithmetic logic unit (ALU) to perform a plurality of mathematical operations using one or more source operands; instruction check logic to evaluate the source operands for a current mathematical instruction and to determine, based on the evaluation, whether to execute a default sequence of operations including executing the current mathematical instruction by the ALU or to jump to an alternate sequence of operations adapted to provide a result for the mathematical instruction having particular types of source operands more efficiently than the default sequence of operations.
-
公开(公告)号:US10853065B2
公开(公告)日:2020-12-01
申请号:US16169456
申请日:2018-10-24
Applicant: Intel Corporation
Inventor: Jesus Corbal , Robert Valentine , Roman S. Dubtsov , Nikita A. Shustrov , Mark J. Charney , Dennis R. Bradford , Milind B. Girkar , Edward T. Grochowski , Thomas D. Fletcher , Warren E. Ferguson
Abstract: Embodiments of systems, apparatuses, and methods for chained fused multiply add. In some embodiments, an apparatus includes a decoder to decode a single instruction having an opcode, a destination field representing a destination operand, a first source field representing a plurality of packed data source operands of a first type that have packed data elements of a first size, a second source field representing a plurality of packed data source operands that have packed data elements of a second size, and a field for a memory location that stores a scalar value. A register file having a plurality of packed data registers includes registers for the plurality of packed data source operands that have packed data elements of a first size, the source operands that have packed data elements of a second size, and the destination operand. Execution circuitry executes the decoded single instruction to perform iterations of packed fused multiply accumulate operations by multiplying packed data elements of the sources of the first type by sub-elements of the scalar value, and adding results of these multiplications to an initial value in a first iteration and a result from a previous iteration in subsequent iterations.
-
公开(公告)号:US10795680B2
公开(公告)日:2020-10-06
申请号:US16289506
申请日:2019-02-28
Applicant: Intel Corporation
Inventor: Robert C. Valentine , Jesus Corbal San Adrian , Roger Espasa Sans , Robert D. Cavin , Bret L. Toll , Santiago Galan Duran , Jeffrey G. Wiedemeier , Sridhar Samudrala , Milind Baburao Girkar , Edward Thomas Grochowski , Jonathan Cannon Hall , Dennis R. Bradford , Elmoustapha Ould-Ahmed-Vall , James C. Abel , Mark Charney , Seth Abraham , Suleyman Sair , Andrew Thomas Forsyth , Lisa Wu , Charles Yount
Abstract: A vector friendly instruction format and execution thereof. According to one embodiment of the invention, a processor is configured to execute an instruction set. The instruction set includes a vector friendly instruction format. The vector friendly instruction format has a plurality of fields including a base operation field, a modifier field, an augmentation operation field, and a data element width field, wherein the first instruction format supports different versions of base operations and different augmentation operations through placement of different values in the base operation field, the modifier field, the alpha field, the beta field, and the data element width field, and wherein only one of the different values may be placed in each of the base operation field, the modifier field, the alpha field, the beta field, and the data element width field on each occurrence of an instruction in the first instruction format in instruction streams.
-
-
-
-
-
-
-
-
-