1.
Publication Number: US10042687B2
Publication Date: 2018-08-07
Application Number: US15231251
Application Date: 2016-08-08
Applicant: Advanced Micro Devices, Inc.
Inventor: Daniel I. Lowell , Manish Gupta
Abstract: Techniques for performing redundant multi-threading (“RMT”) include the use of an RMT compare instruction by two program instances (“work-items”). The RMT compare instruction specifies a value from each work-item to be compared. Upon executing the RMT compare instructions, the work-items transmit the values to a hardware comparator unit. The hardware comparator unit compares the received values and performs an error action if the values do not match. The error action may include sending an error code in a return value back to the work-items that requested the comparison or emitting a trap signal. Optionally, the work-items also send addresses for comparison to the comparator unit. If the addresses and values match, then the comparator stores the value at the specified address. If either or both of the values or the addresses do not match, then the comparator performs an error action.
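The comparison flow described in this abstract can be illustrated with a small software model. The sketch below is not from the patent; the names (`ComparatorUnit`, `rmt_compare`) and the compare-identifier handshake are assumptions made for illustration only.

```python
# Hypothetical software model of the hardware comparator unit described above;
# names and the compare-id handshake are illustrative, not from the patent.

MATCH = 0
MISMATCH_ERROR = 1


class ComparatorUnit:
    """Collects one (address, value) pair from each of two redundant
    work-items, compares them, and either commits the store or reports an
    error code back to the requesters."""

    def __init__(self, memory):
        self.memory = memory   # backing store shared by the work-items
        self.pending = {}      # compare id -> (address, value) of the first arrival

    def rmt_compare(self, compare_id, address, value):
        if compare_id not in self.pending:
            # First work-item to arrive: buffer its operands until its twin arrives.
            self.pending[compare_id] = (address, value)
            return None
        other_address, other_value = self.pending.pop(compare_id)
        if address == other_address and value == other_value:
            # Both redundant copies agree: perform the store exactly once.
            self.memory[address] = value
            return MATCH
        # Values and/or addresses diverged: signal the error action.
        return MISMATCH_ERROR


# Usage: two redundant work-items request the same comparison.
memory = {}
unit = ComparatorUnit(memory)
unit.rmt_compare(compare_id=7, address=0x100, value=42)          # first work-item
code = unit.rmt_compare(compare_id=7, address=0x100, value=42)   # second work-item
assert code == MATCH and memory[0x100] == 42
```

Buffering the first work-item's operands until its twin arrives mirrors the idea that the store is committed only once, and only after both redundant copies agree.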
2.
Publication Number: US20200342327A1
Publication Date: 2020-10-29
Application Number: US16397283
Application Date: 2019-04-29
Applicant: Advanced Micro Devices, Inc.
Inventor: Shi Dong , Daniel I. Lowell
Abstract: An electronic device is described that includes a sparsity monitor and a processor configured to execute training iterations during a training process for a neural network, each training iteration processing a separate instance of training data through the neural network. During operation, the sparsity monitor acquires, during a monitoring interval in each of one or more monitoring periods, intermediate data output by at least some intermediate nodes of the neural network during training iterations that occur during each monitoring interval. The sparsity monitor then generates, based at least in part on the intermediate data, one or more values representing sparsity characteristics for the intermediate data. The sparsity monitor next sends, to the processor, the one or more values representing the sparsity characteristics, and the processor controls one or more aspects of executing subsequent training iterations based at least in part on the values representing the sparsity characteristics.
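As a rough illustration of the monitoring scheme in this abstract, the following Python sketch samples intermediate activations during a monitoring interval of each period and reports a simple sparsity value. The class name, the zero-fraction metric, and the period/interval parameters are assumptions, not the patent's specifics.

```python
import numpy as np

# Hypothetical sketch of the sparsity-monitoring loop described above;
# names and the chosen sparsity metric are illustrative.

class SparsityMonitor:
    """Samples intermediate activations during a monitoring interval and
    reports simple sparsity characteristics back to the training loop."""

    def __init__(self, period, interval):
        self.period = period      # training iterations per monitoring period
        self.interval = interval  # iterations at the start of each period to sample
        self._samples = []

    def observe(self, iteration, activations):
        # Only acquire intermediate data during the monitoring interval.
        if iteration % self.period < self.interval:
            self._samples.append(float(np.mean(activations == 0.0)))

    def report(self):
        # One value summarizing sparsity over the interval (mean zero fraction).
        sparsity = float(np.mean(self._samples)) if self._samples else 0.0
        self._samples.clear()
        return sparsity


# Usage inside a simplified training loop: the "processor" side decides how to
# run subsequent iterations based on the reported sparsity.
monitor = SparsityMonitor(period=100, interval=10)
for it in range(300):
    activations = np.maximum(np.random.randn(1024), 0.0)   # stand-in for ReLU outputs
    monitor.observe(it, activations)
    if it % monitor.period == monitor.period - 1:
        s = monitor.report()
        use_sparse_kernels = s > 0.5   # example control decision
```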
3.
Publication Number: US10365996B2
Publication Date: 2019-07-30
Application Number: US15331270
Application Date: 2016-10-21
Applicant: Advanced Micro Devices, Inc.
Inventor: Manish Gupta , David A. Roberts , Mitesh R. Meswani , Vilas Sridharan , Steven Raasch , Daniel I. Lowell
Abstract: Techniques for selecting one of a plurality of heterogeneous memory units for placement of blocks of data (e.g., memory pages), based on both reliability and performance, are disclosed. A “cost” for each data block/memory unit combination is determined, based on the frequency of access of the data block, the latency of the memory unit, and, optionally, an architectural vulnerability factor (which represents the level of exposure of a particular memory data value to memory faults such as bit flips). A memory unit is selected for the data block for which the determined cost is the lowest, out of all memory units considered, and the data block is placed into that memory unit.
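A minimal sketch of the cost-based placement decision follows, assuming a simple weighted combination of access frequency, latency, and an optional AVF term. The exact weighting and the `placement_cost`/`select_memory_unit` names are illustrative, not taken from the patent.

```python
# Hypothetical cost model for the placement decision described above; the
# weighting of the reliability term is an assumption.

def placement_cost(access_frequency, latency, avf_exposure=0.0, reliability_weight=1.0):
    """Lower is better: frequently accessed blocks prefer low-latency units,
    and vulnerable data (high AVF) is penalized on fault-prone memory."""
    return access_frequency * latency + reliability_weight * avf_exposure


def select_memory_unit(block, memory_units):
    """Pick the memory unit with the lowest cost for this data block."""
    return min(
        memory_units,
        key=lambda unit: placement_cost(
            block["access_frequency"],
            unit["latency"],
            avf_exposure=block.get("avf", 0.0) * unit.get("fault_rate", 0.0),
        ),
    )


# Usage with two heterogeneous memory units (illustrative numbers).
units = [
    {"name": "fast_mem", "latency": 1.0, "fault_rate": 0.2},
    {"name": "slow_mem", "latency": 3.0, "fault_rate": 0.1},
]
hot_block = {"access_frequency": 1000, "avf": 0.5}
print(select_memory_unit(hot_block, units)["name"])   # frequently accessed -> "fast_mem"
```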
4.
Publication Number: US11803734B2
Publication Date: 2023-10-31
Application Number: US15849617
Application Date: 2017-12-20
Applicant: Advanced Micro Devices, Inc.
Inventor: Daniel I. Lowell , Sergey Voronov , Mayank Daga
Abstract: Methods, devices, systems, and instructions for adaptive quantization in an artificial neural network (ANN) calculate a distribution of ANN information; select a quantization function from a set of quantization functions based on the distribution; apply the quantization function to the ANN information to generate quantized ANN information; load the quantized ANN information into the ANN; and generate an output based on the quantized ANN information. Some examples recalculate the distribution of ANN information and reselect the quantization function from the set of quantization functions based on the resampled distribution if the output does not sufficiently correlate with a known correct output. In some examples, the ANN information includes a set of training data. In some examples, the ANN information includes a plurality of link weights.
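The adaptive-quantization loop can be sketched as follows. The candidate quantizers (uniform and logarithmic), the distribution statistic used for selection, the correlation threshold, and all function names are assumptions for illustration; the patent does not specify these.

```python
import numpy as np

# Hypothetical sketch of the adaptive-quantization loop described above;
# quantizer choices, selection heuristic, and threshold are assumptions.

def uniform_quantize(x, bits=8):
    scale = max(float(np.max(np.abs(x))), 1e-8) / (2 ** (bits - 1) - 1)
    return np.round(x / scale) * scale


def log_quantize(x, bits=8):
    # Quantize log-magnitudes, which spends more levels near zero.
    signs = np.sign(x)
    mags = np.log1p(np.abs(x))
    return signs * np.expm1(uniform_quantize(mags, bits))


QUANTIZERS = {"uniform": uniform_quantize, "log": log_quantize}


def select_quantizer(values):
    # Heavy-tailed distributions (many outliers beyond 3 sigma) get the log quantizer.
    spread = np.std(values)
    tails = np.mean(np.abs(values) > 3 * spread) if spread > 0 else 0.0
    return "log" if tails > 0.01 else "uniform"


def adaptive_quantize(ann_info, reference_output, run_ann, threshold=0.99):
    """ann_info stands in for the ANN information (e.g., link weights)."""
    values = ann_info
    for _ in range(3):                          # bounded number of re-selections
        name = select_quantizer(values)
        quantized = QUANTIZERS[name](values)
        output = run_ann(quantized)             # load into the ANN and generate output
        corr = np.corrcoef(output, reference_output)[0, 1]
        if corr >= threshold:                   # output correlates well enough
            return quantized, name
        values = np.random.choice(ann_info, size=ann_info.size)  # resample and retry
    return quantized, name


# Usage with a toy "network" that multiplies weights by a fixed input.
rng = np.random.default_rng(0)
w = rng.standard_normal(1000)
x_in = rng.standard_normal(1000)
run = lambda wq: wq * x_in
wq, chosen = adaptive_quantize(w, run(w), run)
```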
5.
Publication Number: US20190188557A1
Publication Date: 2019-06-20
Application Number: US15849617
Application Date: 2017-12-20
Applicant: Advanced Micro Devices, Inc.
Inventor: Daniel I. Lowell , Sergey Voronov , Mayank Daga
Abstract: Methods, devices, systems, and instructions for adaptive quantization in an artificial neural network (ANN) calculate a distribution of ANN information; select a quantization function from a set of quantization functions based on the distribution; apply the quantization function to the ANN information to generate quantized ANN information; load the quantized ANN information into the ANN; and generate an output based on the quantized ANN information. Some examples recalculate the distribution of ANN information and reselect the quantization function from the set of quantization functions based on the resampled distribution if the output does not sufficiently correlate with a known correct output. In some examples, the ANN information includes a set of training data. In some examples, the ANN information includes a plurality of link weights.
6.
Publication Number: US10013240B2
Publication Date: 2018-07-03
Application Number: US15188304
Application Date: 2016-06-21
Applicant: Advanced Micro Devices, Inc.
Inventor: Daniel I. Lowell
CPC classification number: G06F8/30 , G06F8/41 , G06F8/454 , G06F11/1629 , G06F2201/805
Abstract: A first processing element is configured to execute a first thread and one or more second processing elements are configured to execute one or more second threads that are redundant to the first thread. The first thread and the one or more second threads are to selectively bypass one or more comparisons of results of operations performed by the first thread and the one or more second threads depending on whether an event trigger for the comparison has occurred a configurable number of times since a previous comparison of previously encoded values of the results. In some cases the comparison can be performed based on hashed (or encoded) values of the results of a current operation and one or more previous operations.
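A minimal Python sketch of the selective-comparison idea, assuming a running CRC32 as the hash over results and a simple trigger counter; the class and method names are hypothetical and not drawn from the patent.

```python
import zlib

# Hypothetical sketch of the selective-comparison scheme described above;
# the running CRC and trigger counting are illustrative stand-ins.

class RedundantThreadState:
    """Accumulates an encoded (hashed) signature of a thread's results and
    only compares against its redundant twin every `interval` trigger events."""

    def __init__(self, interval):
        self.interval = interval   # configurable number of triggers between compares
        self.triggers = 0
        self.signature = 0         # running hash over results since the last compare

    def record_result(self, result):
        # Fold the result of the current operation into the running signature.
        self.signature = zlib.crc32(result.to_bytes(8, "little", signed=True),
                                    self.signature)

    def should_compare(self):
        # Bypass the comparison until the event trigger has fired `interval` times.
        self.triggers += 1
        if self.triggers < self.interval:
            return False
        self.triggers = 0
        return True

    def compare_and_reset(self, other):
        match = self.signature == other.signature
        self.signature = other.signature = 0
        return match


# Usage: two redundant threads compare hashed results every 4th trigger.
a, b = RedundantThreadState(interval=4), RedundantThreadState(interval=4)
for i in range(8):
    a.record_result(i * 3)
    b.record_result(i * 3)
    if a.should_compare():            # the same configuration drives both threads
        assert a.compare_and_reset(b)
```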
7.
Publication Number: US20180039531A1
Publication Date: 2018-02-08
Application Number: US15231251
Application Date: 2016-08-08
Applicant: Advanced Micro Devices, Inc.
Inventor: Daniel I. Lowell , Manish Gupta
CPC classification number: G06F11/0763 , G06F9/30021 , G06F9/30101 , G06F9/3851 , G06F9/3861 , G06F9/3887 , G06F11/0721 , G06F11/0784
Abstract: Techniques for performing redundant multi-threading (“RMT”) include the use of an RMT compare instruction by two program instances (“work-items”). The RMT compare instruction specifies a value from each work-item to be compared. Upon executing the RMT compare instructions, the work-items transmit the values to a hardware comparator unit. The hardware comparator unit compares the received values and performs an error action if the values do not match. The error action may include sending an error code in a return value back to the work-items that requested the comparison or emitting a trap signal. Optionally, the work-items also send addresses for comparison to the comparator unit. If the addresses and values match, then the comparator stores the value at the specified address. If either or both of the values or the addresses do not match, then the comparator performs an error action.
8.
Publication Number: US20170277441A1
Publication Date: 2017-09-28
Application Number: US15331270
Application Date: 2016-10-21
Applicant: Advanced Micro Devices, Inc.
Inventor: Manish Gupta , David A. Roberts , Mitesh R. Meswani , Vilas Sridharan , Steven Raasch , Daniel I. Lowell
IPC: G06F3/06
CPC classification number: G06F12/02
Abstract: Techniques for selecting one of a plurality of heterogeneous memory units for placement of blocks of data (e.g., memory pages), based on both reliability and performance, are disclosed. A “cost” for each data block/memory unit combination is determined, based on the frequency of access of the data block, the latency of the memory unit, and, optionally, an architectural vulnerability factor (which represents the level of exposure of a particular memory data value to memory faults such as bit flips). A memory unit is selected for the data block for which the determined cost is the lowest, out of all memory units considered, and the data block is placed into that memory unit.
9.
Publication Number: US20240054332A1
Publication Date: 2024-02-15
Application Number: US18496411
Application Date: 2023-10-27
Applicant: Advanced Micro Devices, Inc.
Inventor: Daniel I. Lowell , Sergey Voronov , Mayank Daga
Abstract: Methods, devices, systems, and instructions for adaptive quantization in an artificial neural network (ANN) calculate a distribution of ANN information; select a quantization function from a set of quantization functions based on the distribution; apply the quantization function to the ANN information to generate quantized ANN information; load the quantized ANN information into the ANN; and generate an output based on the quantized ANN information. Some examples recalculate the distribution of ANN information and reselect the quantization function from the set of quantization functions based on the resampled distribution if the output does not sufficiently correlate with a known correct output. In some examples, the ANN information includes a set of training data. In some examples, the ANN information includes a plurality of link weights.
10.
Publication Number: US11562248B2
Publication Date: 2023-01-24
Application Number: US16397283
Application Date: 2019-04-29
Applicant: Advanced Micro Devices, Inc.
Inventor: Shi Dong , Daniel I. Lowell
Abstract: An electronic device is described that includes a sparsity monitor and a processor configured to execute training iterations during a training process for a neural network, each training iteration processing a separate instance of training data through the neural network. During operation, the sparsity monitor acquires, during a monitoring interval in each of one or more monitoring periods, intermediate data output by at least some intermediate nodes of the neural network during training iterations that occur during each monitoring interval. The sparsity monitor then generates, based at least in part on the intermediate data, one or more values representing sparsity characteristics for the intermediate data. The sparsity monitor next sends, to the processor, the one or more values representing the sparsity characteristics, and the processor controls one or more aspects of executing subsequent training iterations based at least in part on the values representing the sparsity characteristics.