-
1.
Publication No.: US09971391B2
Publication Date: 2018-05-15
Application No.: US14757903
Filing Date: 2015-12-23
Applicant: Intel Corporation
Inventors: Devadatta Bodas, Meenakshi Arunachalam, Ilya Sharapov, Charles R. Yount, Scott B. Huck, Ramakrishna Huggahalli, Justin J. Song, Brian J. Griffith, Muralidhar Rajappa, Lingdan (Linda) Zeng
CPC Classes: G06F1/3206, G06F1/324, G06F11/3428, Y02D10/126
Abstract: A method of assessing the energy efficiency of a high-performance computing (HPC) system is shown, including: selecting a plurality of HPC workloads to run on a system under test (SUT) under one or more power constraints, wherein the SUT includes a plurality of HPC nodes in the HPC system; executing the plurality of HPC workloads on the SUT; and generating a benchmark metric for the SUT based on a baseline configuration for each selected HPC workload and a plurality of measured performance-per-power values for each executed workload at each selected power constraint.
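The benchmark described in this abstract can be illustrated with a minimal sketch. This is not the patented method itself; the function name, the per-workload data layout, and the choice of a geometric mean as the aggregate are all assumptions made for illustration, consistent with the abstract's idea of normalizing measured performance-per-power values against a per-workload baseline.

```python
# Hypothetical sketch: aggregate measured performance-per-power values,
# normalized against each workload's baseline, into one benchmark score.
from math import prod

def benchmark_metric(measurements, baselines):
    """measurements: {workload: {power_cap_watts: measured_performance}}.
    baselines: {workload: performance-per-watt of the baseline configuration}.
    Returns a single score (geometric mean of normalized perf/power ratios)."""
    ratios = []
    for workload, runs in measurements.items():
        for power_cap, perf in runs.items():
            perf_per_watt = perf / power_cap
            ratios.append(perf_per_watt / baselines[workload])
    return prod(ratios) ** (1.0 / len(ratios))
```

For example, a workload that matches its baseline at one power cap and beats it by 20% at another would score sqrt(1.2), i.e. about 1.095.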
-
2.
Publication No.: US20220309349A1
Publication Date: 2022-09-29
Application No.: US17839010
Filing Date: 2022-06-13
Applicant: Intel Corporation
Inventors: Meenakshi Arunachalam, Arun Tejusve Raghunath Rajan, Deepthi Karkada, Adam Procter, Vikram Saletore
IPC Classes: G06N3/08, G06F1/3203, G06K9/62, G06F1/324, G06N3/063, G06F1/3206, G06N3/04, G06V10/94
Abstract: Methods, apparatus, systems, and articles of manufacture for distributed training of a neural network are disclosed. An example apparatus includes a neural network trainer to select a plurality of training data items from a training data set based on a toggle rate of each item in the training data set. A neural network parameter memory is to store neural network training parameters. A neural network processor is to generate training data results from distributed training over multiple nodes of the neural network using the selected training data items and the neural network training parameters. The neural network trainer is to synchronize the training data results and to update the neural network training parameters.
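The toggle-rate-based selection step can be sketched as follows. This is a hypothetical illustration, not the claimed apparatus: both function names are invented, and the specific definition of toggle rate (fraction of bit positions that flip between consecutive 32-bit data words, a common proxy for dynamic switching power) is an assumption.

```python
# Hypothetical sketch: select training items by toggle rate, i.e. how often
# bits flip between consecutive data words of each item.

def toggle_rate(words):
    """Fraction of bit positions that flip between consecutive 32-bit words."""
    flips = sum(bin(a ^ b).count("1") for a, b in zip(words, words[1:]))
    return flips / (32 * max(len(words) - 1, 1))

def select_items(dataset, budget):
    """Keep the `budget` items with the lowest toggle rates (lowest expected
    switching activity, hence lowest dynamic power during training)."""
    return sorted(dataset, key=toggle_rate)[:budget]
```

A selection policy could equally prefer high-toggle items; the abstract only says selection is based on the toggle rate, so the sort direction here is a guess.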
-
3.
Publication No.: US11029971B2
Publication Date: 2021-06-08
Application No.: US16259608
Filing Date: 2019-01-28
Applicant: Intel Corporation
Inventors: Meenakshi Arunachalam, Kushal Datta, Vikram Saletore, Vishal Verma, Deepthi Karkada, Vamsi Sripathi, Rahul Khanna, Mohan Kumar
Abstract: Systems, apparatuses, and methods may provide for technology that identifies a first set of compute nodes and a second set of compute nodes, wherein the first set of compute nodes execute more slowly than the second set of compute nodes. The technology may also automatically determine a compute node configuration that results in a relatively low difference in completion time between the first set of compute nodes and the second set of compute nodes with respect to a neural network workload. In an example, the technology applies the compute node configuration to an execution of the neural network workload on one or more nodes in the first set of compute nodes and one or more nodes in the second set of compute nodes.
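One simple way to realize the configuration step this abstract describes is proportional work allocation. The sketch below is an assumption-laden illustration, not the patented technique: the function name is invented, and "configuration" is reduced to per-node batch sizes sized so that slow and fast nodes finish a training step at roughly the same time.

```python
# Hypothetical sketch: split a global batch across nodes in proportion to
# measured throughput, so per-step completion times roughly equalize.

def balance_batches(throughputs, global_batch):
    """throughputs: {node: samples/sec}. Returns {node: local batch size}.
    Completion time per node ~ batch / throughput, so proportional shares
    make the slow and fast sets finish at nearly the same time."""
    total = sum(throughputs.values())
    return {node: round(global_batch * rate / total)
            for node, rate in throughputs.items()}
```

With one node at 300 samples/sec and one at 100, a global batch of 400 splits 300/100, giving each node about a one-second step.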
-
4.
Publication No.: US11966843B2
Publication Date: 2024-04-23
Application No.: US17839010
Filing Date: 2022-06-13
Applicant: Intel Corporation
Inventors: Meenakshi Arunachalam, Arun Tejusve Raghunath Rajan, Deepthi Karkada, Adam Procter, Vikram Saletore
IPC Classes: G06N3/08, G06F1/3203, G06F1/3206, G06F18/214, G06N3/063, G06V10/774, G06V10/82, G06V10/94, G06N3/048
CPC Classes: G06N3/08, G06F1/3203, G06F1/3206, G06F18/214, G06N3/063, G06V10/774, G06V10/82, G06V10/94, G06N3/048
Abstract: Methods, apparatus, systems, and articles of manufacture for distributed training of a neural network are disclosed. An example apparatus includes a neural network trainer to select a plurality of training data items from a training data set based on a toggle rate of each item in the training data set. A neural network parameter memory is to store neural network training parameters. A neural network processor is to generate training data results from distributed training over multiple nodes of the neural network using the selected training data items and the neural network training parameters. The neural network trainer is to synchronize the training data results and to update the neural network training parameters.
-
5.
Publication No.: US20190080233A1
Publication Date: 2019-03-14
Application No.: US15704668
Filing Date: 2017-09-14
Applicant: Intel Corporation
Abstract: Systems, apparatuses, and methods may provide for technology that conducts a first timing measurement of blockage timing during a first window of training of a neural network. The blockage timing measures a time that processing is impeded at layers of the neural network during the first window of the training due to synchronization of one or more synchronizing parameters of the layers. Based upon the first timing measurement, the technology is to determine whether to modify a synchronization barrier policy to include a synchronization barrier to impede synchronization of one or more synchronizing parameters of one of the layers during a second window of the training. The technology is further to impede the synchronization of the one or more synchronizing parameters of the one of the layers during the second window if the synchronization barrier policy is modified to include the synchronization barrier.
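The measure-then-decide loop in this abstract can be sketched briefly. This is a hedged illustration only: the function name, the threshold-based decision rule, and the representation of the policy as a set of layers are all assumptions; the abstract itself does not specify how the first timing measurement maps to the policy change.

```python
# Hypothetical sketch: after measuring per-layer blockage time in a first
# training window, add a synchronization barrier for any layer whose stalls
# exceeded a threshold, impeding its sync during the second window.

def update_barrier_policy(blockage_by_layer, threshold, policy=None):
    """blockage_by_layer: {layer: seconds training stalled waiting on that
    layer's parameter synchronization during the first window}.
    Returns the set of layers whose sync is impeded in the second window."""
    policy = set(policy or ())
    for layer, blocked in blockage_by_layer.items():
        if blocked > threshold:
            policy.add(layer)  # barrier: defer this layer's sync next window
    return policy
```

A layer that stalled training for 0.5 s in the first window would land behind a barrier for the second window, while a layer that stalled only 0.05 s would keep synchronizing freely.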
-
6.
Publication No.: US10922610B2
Publication Date: 2021-02-16
Application No.: US15704668
Filing Date: 2017-09-14
Applicant: Intel Corporation
Abstract: Systems, apparatuses, and methods may provide for technology that conducts a first timing measurement of blockage timing during a first window of training of a neural network. The blockage timing measures a time that processing is impeded at layers of the neural network during the first window of the training due to synchronization of one or more synchronizing parameters of the layers. Based upon the first timing measurement, the technology is to determine whether to modify a synchronization barrier policy to include a synchronization barrier to impede synchronization of one or more synchronizing parameters of one of the layers during a second window of the training. The technology is further to impede the synchronization of the one or more synchronizing parameters of the one of the layers during the second window if the synchronization barrier policy is modified to include the synchronization barrier.
-
7.
Publication No.: US20190155620A1
Publication Date: 2019-05-23
Application No.: US16259608
Filing Date: 2019-01-28
Applicant: Intel Corporation
Inventors: Meenakshi Arunachalam, Kushal Datta, Vikram Saletore, Vishal Verma, Deepthi Karkada, Vamsi Sripathi, Rahul Khanna, Mohan Kumar
Abstract: Systems, apparatuses, and methods may provide for technology that identifies a first set of compute nodes and a second set of compute nodes, wherein the first set of compute nodes execute more slowly than the second set of compute nodes. The technology may also automatically determine a compute node configuration that results in a relatively low difference in completion time between the first set of compute nodes and the second set of compute nodes with respect to a neural network workload. In an example, the technology applies the compute node configuration to an execution of the neural network workload on one or more nodes in the first set of compute nodes and one or more nodes in the second set of compute nodes.
-
8.
Publication No.: US20170185132A1
Publication Date: 2017-06-29
Application No.: US14757903
Filing Date: 2015-12-23
Applicant: Intel Corporation
Inventors: Devadatta Bodas, Meenakshi Arunachalam, Ilya Sharapov, Charles R. Yount, Scott B. Huck, Ramakrishna Huggahalli, Justin J. Song, Brian J. Griffith, Muralidhar Rajappa, Lingdan (Linda) Zeng
CPC Classes: G06F1/3206, G06F1/324, G06F11/3428, Y02D10/126
Abstract: A method of assessing the energy efficiency of a high-performance computing (HPC) system is shown, including: selecting a plurality of HPC workloads to run on a system under test (SUT) under one or more power constraints, wherein the SUT includes a plurality of HPC nodes in the HPC system; executing the plurality of HPC workloads on the SUT; and generating a benchmark metric for the SUT based on a baseline configuration for each selected HPC workload and a plurality of measured performance-per-power values for each executed workload at each selected power constraint.