-
Publication No.: US11068458B2
Publication Date: 2021-07-20
Application No.: US16202082
Application Date: 2018-11-27
Applicant: Advanced Micro Devices, Inc.
Inventor: Mohamed Assem Ibrahim , Onur Kayiran , Yasuko Eckert
IPC: G06F16/22 , G06F16/901
Abstract: A portion of a graph dataset is generated for each computing node in a distributed computing system by, for each subject vertex in a graph, recording for the computing node an offset for the subject vertex, where the offset references a first position in an edge array for the computing node, and for each edge of a set of edges coupled with the subject vertex in the graph, calculating an edge value for the edge based on a connected vertex identifier identifying a vertex coupled with the subject vertex via the edge. When the edge value is assigned to the first position, the edge value is determined by a first calculation, and when the edge value is assigned to a position subsequent to the first position, the edge value is determined by a second calculation. In the computing node, the edge value is recorded in the edge array.
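
The offset-plus-edge-array layout described in this abstract can be sketched as follows. The abstract does not say what the two calculations are, so this sketch assumes a delta encoding: the first edge value is the neighbour ID minus the subject vertex ID, and each later value is the neighbour ID minus the previous neighbour ID. The function name `build_partition` and the dictionary input are hypothetical, chosen only for illustration.

```python
def build_partition(adjacency):
    """Build an offset table and edge array for one computing node.

    adjacency maps each subject vertex ID to the IDs of the vertices it is
    connected to. The two edge-value formulas below are ASSUMPTIONS (a delta
    encoding); the abstract only says that two different calculations exist.
    """
    offsets = {}
    edge_array = []
    for vertex in sorted(adjacency):
        # The offset references the first edge-array position for this vertex.
        offsets[vertex] = len(edge_array)
        previous = None
        for neighbour in sorted(adjacency[vertex]):
            if previous is None:
                value = neighbour - vertex    # assumed "first calculation"
            else:
                value = neighbour - previous  # assumed "second calculation"
            edge_array.append(value)
            previous = neighbour
    return offsets, edge_array


offsets, edges = build_partition({0: [2, 5, 6], 3: [1, 4]})
```

Under this assumed encoding, later edge values for a vertex stay small when neighbour IDs are close together, which is one plausible reason to treat the first position differently from the rest.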
-
Publication No.: US20210182213A1
Publication Date: 2021-06-17
Application No.: US16716165
Application Date: 2019-12-16
Applicant: Advanced Micro Devices, Inc.
Inventor: Jieming Yin , Yasuko Eckert , Subhash Sethumurugan
IPC: G06F12/122
Abstract: Systems, apparatuses, and methods for implementing cache line re-reference interval prediction using a physical page address are disclosed. When a cache line is accessed, a controller retrieves a re-reference interval counter value associated with the line. If the counter is less than a first threshold, then the address of the cache line is stored in a small re-use page buffer. If the counter is greater than a second threshold, then the address is stored in a large re-use page buffer. When a new cache line is inserted in the cache, if its address is stored in the small re-use page buffer, then the controller assigns a high priority to the line to cause it to remain in the cache to be re-used. If a match is found in the large re-use page buffer, then the controller assigns a low priority to the line to bias it towards eviction.
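
A minimal sketch of the two-buffer scheme in this abstract, written as a software model rather than hardware. The class name `ReusePredictor`, the 4 KiB page size, the unbounded Python sets standing in for the page buffers, and the `"default"` priority for unmatched lines are all assumptions made for illustration; the abstract specifies only the two thresholds, the two buffers, and the high and low priorities.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB physical pages


class ReusePredictor:
    """Toy model of re-reference interval prediction keyed by page address."""

    def __init__(self, small_threshold, large_threshold):
        self.small_threshold = small_threshold
        self.large_threshold = large_threshold
        # Real buffers are finite; unbounded sets are a simplification.
        self.small_reuse_pages = set()
        self.large_reuse_pages = set()

    def on_access(self, address, counter):
        """Record the page of an accessed line based on its interval counter."""
        page = address >> PAGE_SHIFT
        if counter < self.small_threshold:
            self.small_reuse_pages.add(page)
        elif counter > self.large_threshold:
            self.large_reuse_pages.add(page)

    def insertion_priority(self, address):
        """Pick a priority for a newly inserted line from its page address."""
        page = address >> PAGE_SHIFT
        if page in self.small_reuse_pages:
            return "high"     # expected to be re-used soon; keep in cache
        if page in self.large_reuse_pages:
            return "low"      # biased towards eviction
        return "default"      # assumption: no match leaves the policy unchanged


predictor = ReusePredictor(small_threshold=2, large_threshold=6)
predictor.on_access(0x1000, counter=1)
predictor.on_access(0x5000, counter=7)
```

Checking the small re-use buffer first when a page appears in both is a choice made here, not something the abstract states.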
-
Publication No.: US10970120B2
Publication Date: 2021-04-06
Application No.: US16019374
Application Date: 2018-06-26
Applicant: Advanced Micro Devices, Inc.
Inventor: Nicholas Malaya , Yasuko Eckert
Abstract: Methods and systems for opportunistic load balancing in deep neural networks (DNNs) using metadata. Representative computational costs are captured, obtained or determined for a given architectural, functional or computational aspect of a DNN system. The representative computational costs are implemented as metadata for the given architectural, functional or computational aspect of the DNN system. In an implementation, the computed computational cost is implemented as the metadata. A scheduler detects whether there are neurons in subsequent layers that are ready to execute. The scheduler uses the metadata and neuron availability to schedule and load balance across compute resources and available resources.
-
Publication No.: US10938709B2
Publication Date: 2021-03-02
Application No.: US16224739
Application Date: 2018-12-18
Applicant: Advanced Micro Devices, Inc.
Inventor: Mohamed Assem Ibrahim , Onur Kayiran , Yasuko Eckert , Jieming Yin
IPC: H04L12/761 , H04L12/781 , H04L12/715 , H04L12/931 , H04L12/729 , H04L12/733
Abstract: A method includes receiving, from an origin computing node, a first communication addressed to multiple destination computing nodes in a processor interconnect fabric, measuring a first set of one or more communication metrics associated with a transmission path to one or more of the multiple destination computing nodes, and for each of the destination computing nodes, based on the set of communication metrics, selecting between a multicast transmission mode and unicast transmission mode as a transmission mode for transmitting the first communication to the destination computing node.
-
Publication No.: US10452437B2
Publication Date: 2019-10-22
Application No.: US15192784
Application Date: 2016-06-24
Applicant: Advanced Micro Devices, Inc.
Inventor: Abhinandan Majumdar , Brian J. Kocoloski , Leonardo Piga , Wei Huang , Yasuko Eckert
Abstract: Systems, apparatuses, and methods for performing temperature-aware task scheduling and proactive power management. A SoC includes a plurality of processing units and a task queue storing pending tasks. The SoC calculates a thermal metric for each pending task to predict an amount of heat the pending task will generate. The SoC also determines a thermal gradient for each processing unit to predict a rate at which the processing unit's temperature will change when executing a task. The SoC also monitors a thermal margin of how far each processing unit is from reaching its thermal limit. The SoC minimizes non-uniform heat generation on the SoC by scheduling pending tasks from the task queue to the processing units based on the thermal metrics for the pending tasks, the thermal gradients of each processing unit, and the thermal margin available on each processing unit.
-
Publication No.: US10282295B1
Publication Date: 2019-05-07
Application No.: US15825880
Application Date: 2017-11-29
Applicant: Advanced Micro Devices, Inc.
Inventor: William L. Walker , Michael W. Boyer , Yasuko Eckert , Gabriel H. Loh
IPC: G06F12/08 , G06F12/0817 , G06F12/0831 , G06F12/0811 , G06F12/128
Abstract: A method includes monitoring, at a cache coherence directory, states of cachelines stored in a cache hierarchy of a data processing system using a plurality of entries of the cache coherence directory. Each entry of the cache coherence directory is associated with a corresponding cache page of a plurality of cache pages, and each cache page represents a corresponding set of contiguous cachelines. The method further includes selectively evicting cachelines from a first cache of the cache hierarchy based on cacheline utilization densities of cache pages represented by the corresponding entries of the plurality of entries of the cache coherence directory.
-
Publication No.: US20190123648A1
Publication Date: 2019-04-25
Application No.: US16130136
Application Date: 2018-09-13
Applicant: Advanced Micro Devices, Inc.
Inventor: Wei Huang , Yasuko Eckert , Xudong An , Muhammad Shoaib Bin Altaf , Jieming Yin
CPC classification number: H02M3/1582 , G05F1/468 , G05F1/56 , G05F1/575 , H02M2001/0022
Abstract: The described embodiments include an apparatus that controls voltages for an integrated circuit chip having a set of circuits. The apparatus includes a switching voltage regulator separate from the integrated circuit chip and two or more low dropout (LDO) regulators fabricated on the integrated circuit chip. The switching voltage regulator provides an output voltage that is received as an input voltage by each of the two or more LDO regulators, and each of the two or more LDO regulators provides a local output voltage, each local output voltage received as a local input voltage by a different subset of the circuits in the set of circuits. During operation, a controller sets an operating point for each of the subsets of circuits based on a combined power efficiency for the subsets of the circuits and the LDO regulators, each operating point including a corresponding frequency and voltage.
-
Publication No.: US09921635B2
Publication Date: 2018-03-20
Application No.: US14068207
Application Date: 2013-10-31
Applicant: Advanced Micro Devices, Inc.
Inventor: Yasuko Eckert , Manish Arora
IPC: G06F1/32
CPC classification number: G06F1/3228 , G06F1/32 , G06F1/3203 , G06F1/3206 , G06F1/3231 , G06F1/324 , G06F1/3246 , Y02D10/126 , Y02D10/173
Abstract: An approach is described herein that includes a method for power management of a device. In one example, the method includes sampling duration characteristics for a plurality of past idle events for a predetermined interval of time and determining whether to transition a device to a powered-down state based on the sampled duration characteristics. In another example, the method includes determining whether an average idle time for a plurality of past idle events exceeds an energy break-even point threshold. If the average idle time for the plurality of past idle events exceeds the energy break-even point threshold, a device is immediately transitioned to a powered-down state upon receipt of a next idle event. If the average idle time for the plurality of past idle events does not exceed the energy break-even point threshold, transition of the device to the powered-down state is delayed.
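
The second example in this abstract (compare the average past idle time with an energy break-even threshold) can be sketched as below. The class name `IdlePowerPolicy`, the fixed-length history window, the microsecond units, and the fixed delay value are hypothetical; the abstract says only that the transition is immediate when the average exceeds the threshold and delayed otherwise.

```python
from collections import deque


class IdlePowerPolicy:
    """Toy model of the average-idle-time power-down rule."""

    def __init__(self, break_even_us, history=8, delay_us=50.0):
        self.break_even_us = break_even_us
        self.delay_us = delay_us
        # Keep only the most recent idle durations (assumed window size).
        self.past_idle_us = deque(maxlen=history)

    def record_idle(self, duration_us):
        """Store the duration of an idle event that has just ended."""
        self.past_idle_us.append(duration_us)

    def power_down_delay(self):
        """Return 0.0 to power down immediately, else the delay to wait."""
        if not self.past_idle_us:
            return self.delay_us  # assumption: be conservative with no history
        average = sum(self.past_idle_us) / len(self.past_idle_us)
        return 0.0 if average > self.break_even_us else self.delay_us


policy = IdlePowerPolicy(break_even_us=100.0, history=4)
for duration in (300.0, 250.0, 400.0):
    policy.record_idle(duration)
delay = policy.power_down_delay()  # average exceeds break-even, so no delay
```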
-
Publication No.: US09851777B2
Publication Date: 2017-12-26
Application No.: US14146591
Application Date: 2014-01-02
Applicant: Advanced Micro Devices, Inc.
Inventor: Manish Arora , Indrani Paul , Yasuko Eckert , Nuwan S. Jayasena , Srilatha Manne , Madhu Saravana Sibi Govindan , William L. Bircher
IPC: G06F1/32
CPC classification number: G06F1/3287 , G06F1/3225 , Y02D10/171 , Y02D50/20
Abstract: Power gating decisions can be made based on measures of cache dirtiness. Analyzer logic can selectively power gate a component of a processor system based on a cache dirtiness of one or more caches associated with the component. The analyzer logic may power gate the component when the cache dirtiness exceeds a threshold and may maintain the component in an idle state when the cache dirtiness does not exceed the threshold. Idle time prediction logic may be used to predict a duration of an idle time of the component. The analyzer logic may then selectively power gate the component based on the cache dirtiness and the predicted idle time.
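
The decision rule stated in this abstract can be sketched as follows. The abstract does not define how dirtiness is measured or how it is combined with the predicted idle time, so this sketch assumes dirtiness is the fraction of dirty cache lines and that a predicted idle time below a minimum vetoes gating. Both function names and the `min_idle_us` parameter are hypothetical.

```python
def cache_dirtiness(dirty_lines, total_lines):
    """Assumed metric: fraction of cache lines that are dirty."""
    if total_lines <= 0:
        raise ValueError("total_lines must be positive")
    return dirty_lines / total_lines


def should_power_gate(dirtiness, threshold, predicted_idle_us=None,
                      min_idle_us=0.0):
    """Return True to power gate, False to keep the component idle.

    Follows the abstract's stated direction: gate when dirtiness exceeds
    the threshold. The idle-time veto is an ASSUMPTION about how the two
    inputs are combined.
    """
    if dirtiness <= threshold:
        return False
    if predicted_idle_us is not None and predicted_idle_us < min_idle_us:
        return False
    return True


gate = should_power_gate(cache_dirtiness(48, 64), threshold=0.5,
                         predicted_idle_us=500.0, min_idle_us=200.0)
```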
-
Publication No.: US09658663B2
Publication Date: 2017-05-23
Application No.: US14862044
Application Date: 2015-09-22
Applicant: Advanced Micro Devices, Inc.
Inventor: Wei Huang , Manish Arora , Yasuko Eckert , Indrani Paul
CPC classification number: G06F1/206 , G06F1/3206 , G06F1/3234 , G06F1/324 , G06F1/3296 , G06F11/3024 , G06F11/3058
Abstract: A three-dimensional (3-D) processor stack includes a plurality of processor cores implemented in a plurality of layers. A controller is to selectively throttle one or more of a plurality of processor cores in response to detecting a thermal event. The controller selectively throttles the one or more of the plurality of processor cores based on values of thermal couplings between the plurality of layers and based on measures of criticality of threads executing on the plurality of processor cores.