-
Publication No.: US11003455B2
Publication date: 2021-05-11
Application No.: US16398183
Filing date: 2019-04-29
Applicant: Intel Corporation
Inventor: Andrew T. Forsyth , Brian J. Hickmann , Jonathan C. Hall , Christopher J. Hughes
IPC: G06F9/38 , G06F9/30 , G06F12/0875 , G06F12/1027 , G06F15/80 , G06F13/42
Abstract: According to one embodiment, a processor includes an instruction decoder to decode a first instruction to gather data elements from memory, the first instruction having a first operand specifying a first storage location and a second operand specifying a first memory address storing a plurality of data elements. The processor further includes an execution unit coupled to the instruction decoder, in response to the first instruction, to read contiguous a first and a second of the data elements from a memory location based on the first memory address indicated by the second operand, and to store the first data element in a first entry of the first storage location and a second data element in a second entry of a second storage location corresponding to the first entry of the first storage location.
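The behavior described in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions: memory is modeled as a Python list, the per-lane `indices` operand and the names `gather_pair`, `dst0`, and `dst1` are hypothetical, and the abstract itself only states that two contiguous elements are read from one address and written to corresponding entries of two storage locations.

```python
def gather_pair(memory, base, indices):
    """Model of a gather that reads two contiguous elements per lane.

    Assumption: each lane i has its own address base + indices[i]; the
    abstract only describes a single address and a single pair of elements.
    """
    dst0 = []  # first storage location (e.g. all "x" components)
    dst1 = []  # second storage location (e.g. all "y" components)
    for idx in indices:
        addr = base + idx
        # One access fetches both neighbours; they land in the same entry
        # position of two different destinations.
        first, second = memory[addr], memory[addr + 1]
        dst0.append(first)
        dst1.append(second)
    return dst0, dst1


memory = [10, 11, 20, 21, 30, 31, 40, 41]
xs, ys = gather_pair(memory, 0, [0, 4, 2])
print(xs, ys)
```

Reading the pair in one access and splitting it across two destinations is what turns an array-of-structures layout into a structure-of-arrays layout without a separate shuffle step.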
-
Publication No.: US10133577B2
Publication date: 2018-11-20
Application No.: US13997791
Filing date: 2012-12-19
Applicant: Intel Corporation
Inventor: Jesus Corbal , Dennis R. Bradford , Jonathan C. Hall , Thomas D. Fletcher , Brian J. Hickmann , Dror Markovich , Amit Gradstein
Abstract: A processor includes an instruction schedule and dispatch (schedule/dispatch) unit to receive a single instruction multiple data (SIMD) instruction to perform an operation on multiple data elements stored in a storage location indicated by a first source operand. The instruction schedule/dispatch unit is to determine a first of the data elements that will not be operated to generate a result written to a destination operand based on a second source operand. The processor further includes multiple processing elements coupled to the instruction schedule/dispatch unit to process the data elements of the SIMD instruction in a vector manner, and a power management unit coupled to the instruction schedule/dispatch unit to reduce power consumption of a first of the processing elements configured to process the first data element.
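As a rough illustration of the idea in this abstract, the sketch below models a masked SIMD add in software and reports which lanes could have their processing elements powered down. The function name `masked_add`, the merge-masking behavior (masked-off lanes keep the old destination value), and the use of a list of lane numbers to stand in for power gating are all assumptions made for the example, not details taken from the patent.

```python
def masked_add(a, b, mask, dest):
    """Model a masked SIMD add and list the lanes that do no useful work.

    Assumption: a lane whose mask bit is 0 keeps its old destination value,
    so the processing element for that lane is a power-gating candidate.
    """
    active = [bool(m) for m in mask]
    # Lanes that will not contribute to the result.
    gated = [lane for lane, on in enumerate(active) if not on]
    result = [x + y if on else old
              for x, y, old, on in zip(a, b, dest, active)]
    return result, gated


result, gated = masked_add([1, 2, 3, 4], [10, 20, 30, 40],
                           [1, 0, 1, 0], [0, 0, 0, 0])
print(result, gated)
```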
-
Publication No.: US09715432B2
Publication date: 2017-07-25
Application No.: US14581859
Filing date: 2014-12-23
Applicant: INTEL CORPORATION
Inventor: Ramon Matas , Roger Gramunt , Chung-Lun Chan , Benjamin C. Chaffin , Aditya Kesiraju , Jonathan C. Hall , Jesus Corbal
CPC classification number: G06F11/141 , G06F9/30036 , G06F9/30072 , G06F9/38 , G06F9/3859 , G06F9/3865
Abstract: Exemplary aspects are directed toward resolving fault suppression in hardware in a way that does not incur a performance hit. For example, when multiple instructions are executing simultaneously, a mask can specify which elements need not be executed. If the mask is disabled, those elements do not need to be executed. A determination is then made as to whether a fault happens in one of the elements that have been disabled. If there is a fault in one of the elements that has been disabled, a state machine re-fetches the instructions in a special mode. More specifically, the state machine determines if the fault is on a disabled element, and if the fault is on a disabled element, then the state machine specifies that the fault should be ignored. If there was no mask during the first execution and an error is present during execution, then the element is re-run with the mask to see if the error is a “real” fault.
-
Publication No.: US09645826B2
Publication date: 2017-05-09
Application No.: US14975292
Filing date: 2015-12-18
Applicant: Intel Corporation
Inventor: Andrew T. Forsyth , Brian J. Hickmann , Jonathan C. Hall , Christopher J. Hughes
IPC: G06F12/10 , G06F9/38 , G06F9/30 , G06F12/0875 , G06F12/1027 , G06F15/80 , G06F13/42
CPC classification number: G06F9/3853 , G06F9/30018 , G06F9/30036 , G06F9/30043 , G06F9/30098 , G06F9/30105 , G06F9/30145 , G06F9/3804 , G06F9/3824 , G06F9/3836 , G06F9/3887 , G06F12/0875 , G06F12/1027 , G06F13/4282 , G06F15/8007 , G06F2212/1016 , G06F2212/452 , G06F2212/68
Abstract: According to one embodiment, a processor includes an instruction decoder to decode a first instruction to gather data elements from memory, the first instruction having a first operand specifying a first storage location and a second operand specifying a first memory address storing a plurality of data elements. The processor further includes an execution unit coupled to the instruction decoder, in response to the first instruction, to read contiguous a first and a second of the data elements from a memory location based on the first memory address indicated by the second operand, and to store the first data element in a first entry of the first storage location and a second data element in a second entry of a second storage location corresponding to the first entry of the first storage location.
-
Publication No.: US09842046B2
Publication date: 2017-12-12
Application No.: US13631378
Filing date: 2012-09-28
Applicant: Intel Corporation
Inventor: Andrew T. Forsyth , Dennis R. Bradford , Jonathan C. Hall
CPC classification number: G06F12/00 , G06F3/06 , G06F3/0608 , G06F3/0641 , G06F9/30 , G06F9/30018 , G06F9/30036 , G06F9/30043 , G06F11/1453 , G06F12/0246
Abstract: A method of an aspect includes receiving an instruction indicating a first source packed memory indices, a second source packed data operation mask, and a destination storage location. Memory indices of the packed memory indices are compared with one another. One or more sets of duplicate memory indices are identified. Data corresponding to each set of duplicate memory indices is loaded only once. The loaded data corresponding to each set of duplicate memory indices is replicated for each of the duplicate memory indices in the set. A packed data result is stored in the destination storage location in response to the instruction. The packed data result includes data elements from memory locations that are indicated by corresponding memory indices of the packed memory indices when not blocked by corresponding elements of the packed data operation mask.
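The duplicate-index handling in this abstract can be sketched in software as follows. This is a minimal model under assumptions: memory is a Python list, the name `gather_dedup` and the `fallback` value written to masked-off lanes are hypothetical, and the load counter exists only to show that each distinct index is read once.

```python
def gather_dedup(memory, indices, mask, fallback=0):
    """Masked gather that loads each distinct index only once.

    Assumption: lanes blocked by the mask receive `fallback`; the abstract
    does not say what a blocked lane holds.
    """
    loaded = {}   # index -> value already fetched from memory
    loads = 0     # number of actual memory reads performed
    result = []
    for idx, enabled in zip(indices, mask):
        if not enabled:
            result.append(fallback)
            continue
        if idx not in loaded:
            loaded[idx] = memory[idx]   # first occurrence: real load
            loads += 1
        result.append(loaded[idx])      # duplicates reuse the loaded value
    return result, loads


memory = [100, 101, 102, 103, 104]
result, loads = gather_dedup(memory, [3, 1, 3, 3, 1, 4], [1, 1, 1, 0, 1, 1])
print(result, loads)
```

In this example six lanes are requested, one is masked off, and only three memory reads are needed because indices 3 and 1 each repeat.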
-
Publication No.: US09626193B2
Publication date: 2017-04-18
Application No.: US14976216
Filing date: 2015-12-21
Applicant: Intel Corporation
Inventor: Andrew T. Forsyth , Brian J. Hickmann , Jonathan C. Hall , Christopher J. Hughes
IPC: G06F12/00 , G06F9/38 , G06F9/30 , G06F12/0875 , G06F12/1027 , G06F15/80 , G06F13/42
CPC classification number: G06F9/3853 , G06F9/30018 , G06F9/30036 , G06F9/30043 , G06F9/30098 , G06F9/30105 , G06F9/30145 , G06F9/3804 , G06F9/3824 , G06F9/3836 , G06F9/3887 , G06F12/0875 , G06F12/1027 , G06F13/4282 , G06F15/8007 , G06F2212/1016 , G06F2212/452 , G06F2212/68
Abstract: According to one embodiment, a processor includes an instruction decoder to decode a first instruction to gather data elements from memory, the first instruction having a first operand specifying a first storage location and a second operand specifying a first memory address storing a plurality of data elements. The processor further includes an execution unit coupled to the instruction decoder, in response to the first instruction, to read contiguous a first and a second of the data elements from a memory location based on the first memory address indicated by the second operand, and to store the first data element in a first entry of the first storage location and a second data element in a second entry of a second storage location corresponding to the first entry of the first storage location.
-
Publication No.: US09626192B2
Publication date: 2017-04-18
Application No.: US14975327
Filing date: 2015-12-18
Applicant: Intel Corporation
Inventor: Andrew T. Forsyth , Brian J. Hickmann , Jonathan C. Hall , Christopher J. Hughes
IPC: G06F12/00 , G06F9/38 , G06F9/30 , G06F12/0875 , G06F12/1027 , G06F15/80 , G06F13/42
CPC classification number: G06F9/3853 , G06F9/30018 , G06F9/30036 , G06F9/30043 , G06F9/30098 , G06F9/30105 , G06F9/30145 , G06F9/3804 , G06F9/3824 , G06F9/3836 , G06F9/3887 , G06F12/0875 , G06F12/1027 , G06F13/4282 , G06F15/8007 , G06F2212/1016 , G06F2212/452 , G06F2212/68
Abstract: According to one embodiment, a processor includes an instruction decoder to decode a first instruction to gather data elements from memory, the first instruction having a first operand specifying a first storage location and a second operand specifying a first memory address storing a plurality of data elements. The processor further includes an execution unit coupled to the instruction decoder, in response to the first instruction, to read contiguous a first and a second of the data elements from a memory location based on the first memory address indicated by the second operand, and to store the first data element in a first entry of the first storage location and a second data element in a second entry of a second storage location corresponding to the first entry of the first storage location.
-
Publication No.: US10007620B2
Publication date: 2018-06-26
Application No.: US15282841
Filing date: 2016-09-30
Applicant: Intel Corporation
Inventor: Seth H. Pugsley , Christopher B. Wilkerson , Roger Gramunt , Jonathan C. Hall , Prabhat Jain
IPC: G06F12/00 , G06F12/121 , G06F12/0864 , G06F12/0804 , G06F12/0811 , G06F12/084 , G06F12/0842 , G06F12/0862
CPC classification number: G06F12/126 , G06F1/3275 , G06F12/128 , G06F2212/1021 , G06F2212/70 , Y02D10/14
Abstract: A processor includes a set associative cache and a cache controller. The cache controller makes an initial association between first and second groups of sampled sets in the cache and first and second cache replacement policies. Follower sets in the cache are initially associated with the more conservative of the two policies. Following cache line insertions in a first epoch, the associations between the groups of sampled sets and cache replacement policies are swapped for the next epoch. If the less conservative policy outperforms the more conservative policy during two consecutive epochs, the follower sets are associated with the less conservative policy for the next epoch. Subsequently, if the more conservative policy outperforms the less conservative policy during any epoch, the follower sets are again associated with the more conservative policy. Performance may be measured based on the number of cache misses associated with each policy.
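The epoch-based selection rule in this abstract can be sketched as a small state machine. The code below is an illustrative model only: the class name `PolicySelector`, the group names, and the label "aggressive" for the less conservative policy are hypothetical, and treating a tie in miss counts as "the less conservative policy did not outperform" is an assumption the abstract does not address.

```python
class PolicySelector:
    """Tracks which replacement policy the follower sets should use."""

    def __init__(self):
        # Initial association of the two sampled-set groups with policies.
        self.group_policy = {"group0": "conservative", "group1": "aggressive"}
        self.follower_policy = "conservative"
        self.aggressive_streak = 0  # consecutive epochs won by "aggressive"

    def end_epoch(self, misses):
        """misses maps each group name to its miss count for the epoch."""
        by_policy = {self.group_policy[g]: m for g, m in misses.items()}
        if by_policy["aggressive"] < by_policy["conservative"]:
            self.aggressive_streak += 1
        else:
            # Assumption: ties count against the less conservative policy.
            self.aggressive_streak = 0
        # Followers switch only after two consecutive wins, and fall back
        # as soon as the streak is broken.
        if self.aggressive_streak >= 2:
            self.follower_policy = "aggressive"
        else:
            self.follower_policy = "conservative"
        # Swap the group/policy associations for the next epoch.
        g0, g1 = self.group_policy["group0"], self.group_policy["group1"]
        self.group_policy = {"group0": g1, "group1": g0}
        return self.follower_policy


selector = PolicySelector()
print(selector.end_epoch({"group0": 50, "group1": 40}))
print(selector.end_epoch({"group0": 30, "group1": 45}))
print(selector.end_epoch({"group0": 20, "group1": 60}))
```

Swapping the associations every epoch means a policy has to win on both groups of sampled sets before the followers adopt it, which guards against one group simply seeing an easier access pattern.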
-
Publication No.: US20180095895A1
Publication date: 2018-04-05
Application No.: US15282841
Filing date: 2016-09-30
Applicant: Intel Corporation
Inventor: Seth H. Pugsley , Christopher B. Wilkerson , Roger Gramunt , Jonathan C. Hall , Prabhat Jain
IPC: G06F12/121 , G06F12/0864
CPC classification number: G06F12/121 , G06F1/3275 , G06F12/0804 , G06F12/0811 , G06F12/084 , G06F12/0842 , G06F12/0862 , G06F12/0864 , G06F2212/1021 , G06F2212/1028 , G06F2212/6042 , G06F2212/69 , G06F2212/70
Abstract: A processor includes a set associative cache and a cache controller. The cache controller makes an initial association between first and second groups of sampled sets in the cache and first and second cache replacement policies. Follower sets in the cache are initially associated with the more conservative of the two policies. Following cache line insertions in a first epoch, the associations between the groups of sampled sets and cache replacement policies are swapped for the next epoch. If the less conservative policy outperforms the more conservative policy during two consecutive epochs, the follower sets are associated with the less conservative policy for the next epoch. Subsequently, if the more conservative policy outperforms the less conservative policy during any epoch, the follower sets are again associated with the more conservative policy. Performance may be measured based on the number of cache misses associated with each policy.
-
Publication No.: US09658856B2
Publication date: 2017-05-23
Application No.: US14976228
Filing date: 2015-12-21
Applicant: Intel Corporation
Inventor: Andrew T. Forsyth , Brian J. Hickmann , Jonathan C. Hall , Christopher J. Hughes
IPC: G06F12/10 , G06F9/38 , G06F9/30 , G06F12/0875 , G06F12/1027 , G06F15/80 , G06F13/42
CPC classification number: G06F9/3853 , G06F9/30018 , G06F9/30036 , G06F9/30043 , G06F9/30098 , G06F9/30105 , G06F9/30145 , G06F9/3804 , G06F9/3824 , G06F9/3836 , G06F9/3887 , G06F12/0875 , G06F12/1027 , G06F13/4282 , G06F15/8007 , G06F2212/1016 , G06F2212/452 , G06F2212/68
Abstract: According to one embodiment, a processor includes an instruction decoder to decode a first instruction to gather data elements from memory, the first instruction having a first operand specifying a first storage location and a second operand specifying a first memory address storing a plurality of data elements. The processor further includes an execution unit coupled to the instruction decoder, in response to the first instruction, to read contiguous a first and a second of the data elements from a memory location based on the first memory address indicated by the second operand, and to store the first data element in a first entry of the first storage location and a second data element in a second entry of a second storage location corresponding to the first entry of the first storage location.