-
Publication No.: US11436016B2
Publication Date: 2022-09-06
Application No.: US16703833
Filing Date: 2019-12-04
Applicant: Advanced Micro Devices, Inc.
Inventor: Anthony T. Gutierrez, Bradford M. Beckmann, Marcus Nathaniel Chow
Abstract: A technique for determining whether a register value should be written to an operand cache or whether the register value should remain in and not be evicted from the operand cache is provided. The technique includes executing an instruction that accesses an operand that comprises the register value, performing one or both of a lookahead technique and a prediction technique to determine whether the register value should be written to an operand cache or whether the register value should remain in and not be evicted from the operand cache, and based on the determining, updating the operand cache.
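The lookahead idea in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Instr` record, the `reused_within` helper, and the fixed `window` size are hypothetical names chosen here, and the rule "cache a register value only if it is read again soon, before being overwritten" is an assumed reading of the lookahead technique.

```python
from typing import List, NamedTuple, Tuple


class Instr(NamedTuple):
    """Toy instruction: one destination register and its source registers."""
    dst: str
    srcs: Tuple[str, ...]


def reused_within(reg: str, upcoming: List[Instr], window: int) -> bool:
    """Return True if `reg` is read within the next `window` instructions
    before it is overwritten (assumed criterion for keeping it cached)."""
    for instr in upcoming[:window]:
        if reg in instr.srcs:
            return True   # value is read again soon: worth caching
        if instr.dst == reg:
            return False  # overwritten before any read: the old value is dead
    return False


# Hypothetical instruction stream used only for illustration.
program = [
    Instr("v2", ("v0", "v1")),
    Instr("v3", ("v2", "v0")),
    Instr("v0", ("v3",)),
    Instr("v4", ("v0", "v2")),
]
```

For example, after `program[0]` writes `v2`, `reused_within("v2", program[1:], 2)` reports that the value is read by the next instruction, so under this assumed policy it would be written to the operand cache.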
-
Publication No.: US12210398B2
Publication Date: 2025-01-28
Application No.: US18346380
Filing Date: 2023-07-03
Applicant: Advanced Micro Devices, Inc.
Inventor: Vedula Venkata Srikant Bharadwaj, Shomit N. Das, Anthony T. Gutierrez, Vignesh Adhinarayanan
IPC: G06F1/32, G06F1/26, G06F1/324, G06F1/3287, G06F1/3296, G06F9/50
Abstract: Systems, methods, devices, and computer-implemented instructions for processor power management implemented in a compiler. In some implementations, a characteristic of code is determined. An instruction based on the determined characteristic is inserted into the code. The code and inserted instruction are compiled to generate compiled code. The compiled code is output.
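The four steps in this abstract (determine a characteristic, insert an instruction, compile, output) can be sketched as a toy compiler pass. Everything concrete below is an assumption made for illustration: the abstract does not say which characteristic is measured or which instruction is inserted, so the memory-operation fraction, the `power_hint` pseudo-instruction, and the 0.5 threshold are all hypothetical.

```python
from typing import List

# Assumed opcode classes for this toy intermediate representation.
MEMORY_OPS = {"load", "store"}


def memory_fraction(block: List[str]) -> float:
    """Characteristic (assumed): fraction of instructions that touch memory."""
    if not block:
        return 0.0
    mem = sum(1 for line in block if line.split()[0] in MEMORY_OPS)
    return mem / len(block)


def insert_power_hint(block: List[str], threshold: float = 0.5) -> List[str]:
    """Prepend a hypothetical `power_hint` instruction chosen from the
    measured characteristic; the rest of the block is left unchanged."""
    level = "low" if memory_fraction(block) >= threshold else "high"
    return [f"power_hint {level}"] + block


memory_bound = ["load r0, [a]", "load r1, [b]", "add r2, r0, r1", "store [c], r2"]
compute_bound = ["mul r0, r1, r2", "add r0, r0, r3"]
```

In this sketch a block dominated by loads and stores receives `power_hint low`, on the assumption that a memory-bound region gains little from a high core frequency; a real compiler would lower the hint to whatever power-management instruction the target actually provides.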
-
Publication No.: US11726546B2
Publication Date: 2023-08-15
Application No.: US17033000
Filing Date: 2020-09-25
Applicant: Advanced Micro Devices, Inc.
Inventor: Vedula Venkata Srikant Bharadwaj, Shomit N. Das, Anthony T. Gutierrez, Vignesh Adhinarayanan
IPC: G06F1/00, G06F1/3287, G06F9/50, G06F1/3296, G06F1/324
CPC classification number: G06F1/3287, G06F1/324, G06F1/3296, G06F9/50
Abstract: Systems, methods, devices, and computer-implemented instructions for processor power management implemented in a compiler. In some implementations, a characteristic of code is determined. An instruction based on the determined characteristic is inserted into the code. The code and inserted instruction are compiled to generate compiled code. The compiled code is output.
-
Publication No.: US11249765B2
Publication Date: 2022-02-15
Application No.: US16109567
Filing Date: 2018-08-22
Applicant: Advanced Micro Devices, Inc.
Inventor: Anthony T. Gutierrez
Abstract: Techniques for improving performance of accelerated processing devices (“APDs”) when exceptions occur are provided. In APDs, the very large number of parallel processing execution units, and the complexity of the hardware used to execute a large number of work-items in parallel, means that APDs typically stall when an exception occurs (unlike in central processing units (“CPUs”), which are able to execute speculatively and out-of-order). However, the techniques provided herein allow at least some execution to occur past exceptions. Execution past an exception generating instruction occurs by executing instructions that would not lead to a corruption while skipping those that would lead to a corruption. After the exception has been satisfied, execution occurs in a replay mode in which the potentially exception-generating instruction is executed and in which instructions that did not execute in the exception-wait mode are executed. A mask and counter are used to control execution in replay mode.
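A rough Python sketch of the exception-wait and replay modes described in this abstract follows. The abstract does not define which instructions "would lead to a corruption", so the rule used here is an assumption: an instruction is deferred if it reads a register whose value is still pending, or if it would overwrite a register that a deferred instruction reads or writes. The names `Instr`, `build_replay_mask`, and `replay_order` are hypothetical.

```python
from typing import List, NamedTuple, Set, Tuple


class Instr(NamedTuple):
    dst: str
    srcs: Tuple[str, ...]


def build_replay_mask(window: List[Instr], faulting: int) -> List[bool]:
    """Exception-wait mode: mark (True) every instruction that must be
    deferred to replay mode; unmarked instructions may run past the fault."""
    mask = [False] * len(window)
    mask[faulting] = True
    pending: Set[str] = {window[faulting].dst}      # values not yet produced
    needed: Set[str] = set(window[faulting].srcs)   # inputs replay still needs
    for i in range(faulting + 1, len(window)):
        instr = window[i]
        unsafe = (
            any(src in pending for src in instr.srcs)  # would read a missing value
            or instr.dst in pending                    # would reorder two writes
            or instr.dst in needed                     # would clobber a replay input
        )
        if unsafe:
            mask[i] = True
            pending.add(instr.dst)
            needed.update(instr.srcs)
    return mask


def replay_order(mask: List[bool]) -> List[int]:
    """Replay mode: run deferred instructions in program order; the counter
    tracks how many remain and ends replay when it reaches zero."""
    counter = sum(mask)
    order = []
    for i, deferred in enumerate(mask):
        if counter == 0:
            break
        if deferred:
            order.append(i)
            counter -= 1
    return order


window = [
    Instr("v1", ("v0",)),        # 0: faulting instruction (e.g. a load)
    Instr("v2", ("v1", "v3")),   # 1: depends on the fault, deferred
    Instr("v4", ("v5", "v6")),   # 2: independent, runs during the wait
    Instr("v3", ("v4", "v5")),   # 3: would clobber an input of 1, deferred
    Instr("v7", ("v5", "v6")),   # 4: independent, runs during the wait
]
```

With instruction 0 faulting, instructions 2 and 4 execute while the exception is serviced, and replay mode then executes instructions 0, 1, and 3 in program order.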
-
Publication No.: US11150899B2
Publication Date: 2021-10-19
Application No.: US15948795
Filing Date: 2018-04-09
Applicant: Advanced Micro Devices, Inc.
Inventor: Anthony T. Gutierrez, Sergey Blagodurov, Scott A. Moe, Xianwei Zhang, Jieming Yin, Matthew D. Sinclair
Abstract: An electronic device includes a controller functional block and a computational functional block. During operation, while the computational functional block executes a test portion of a workload at at least one precision level, the controller functional block monitors a behavior of the computational functional block. Based on the behavior of the computational functional block while executing the test portion of the workload at the at least one precision level, the controller functional block selects a given precision level from among a set of two or more precision levels at which the computational functional block is to execute a remaining portion of the workload. The controller functional block then configures the computational block to execute the remaining portion of the workload at the given precision level.
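The selection loop in this abstract can be approximated in software as follows. This is a sketch under stated assumptions, not the claimed hardware: the monitored "behavior" is taken to be the relative error against a double-precision reference, the workload is a simple running sum, and `run_sum`, `select_precision`, and the `fp16`/`fp32`/`fp64` level names are hypothetical. Reduced precision is emulated with Python's standard `struct` formats `e` (half) and `f` (single).

```python
import struct
from typing import Callable, Dict, List


def _round_to(fmt: str) -> Callable[[float], float]:
    """Emulate a narrower float format by packing and unpacking the value."""
    return lambda x: struct.unpack(fmt, struct.pack(fmt, x))[0]


ROUNDERS: Dict[str, Callable[[float], float]] = {
    "fp16": _round_to("e"),
    "fp32": _round_to("f"),
    "fp64": lambda x: x,
}
LEVELS = ["fp16", "fp32", "fp64"]  # ordered from cheapest to most precise


def run_sum(values: List[float], level: str) -> float:
    """Toy workload: accumulate values, rounding after every operation."""
    r = ROUNDERS[level]
    acc = 0.0
    for v in values:
        acc = r(acc + r(v))
    return acc


def select_precision(test_portion: List[float], rel_tolerance: float) -> str:
    """Run the test portion at each level and return the cheapest level whose
    relative error against the fp64 result is within the tolerance."""
    reference = run_sum(test_portion, "fp64")
    for level in LEVELS:
        error = abs(run_sum(test_portion, level) - reference)
        if error <= rel_tolerance * abs(reference):
            return level
    return LEVELS[-1]


test_portion = [0.1] * 100
```

A loose tolerance such as `select_precision(test_portion, 0.05)` lets the remaining workload run at the lowest level, while a tolerance of `0.0` forces full precision. The values are kept small because `struct.pack("e", ...)` raises `OverflowError` outside the half-precision range.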
-
Publication No.: US20210173650A1
Publication Date: 2021-06-10
Application No.: US16703833
Filing Date: 2019-12-04
Applicant: Advanced Micro Devices, Inc.
Inventor: Anthony T. Gutierrez, Bradford M. Beckmann, Marcus Nathaniel Chow
Abstract: A technique for determining whether a register value should be written to an operand cache or whether the register value should remain in and not be evicted from the operand cache is provided. The technique includes executing an instruction that accesses an operand that comprises the register value, performing one or both of a lookahead technique and a prediction technique to determine whether the register value should be written to an operand cache or whether the register value should remain in and not be evicted from the operand cache, and based on the determining, updating the operand cache.
-
Publication No.: US12190174B2
Publication Date: 2025-01-07
Application No.: US16425881
Filing Date: 2019-05-29
Applicant: Advanced Micro Devices, Inc.
Inventor: Alexandru Dutu, Sergey Blagodurov, Anthony T. Gutierrez, Matthew D. Sinclair, David A. Wood, Bradford M. Beckmann
Abstract: A technique for synchronizing workgroups is provided. Multiple workgroups execute a wait instruction that specifies a condition variable and a condition. A workgroup scheduler stops execution of a workgroup that executes a wait instruction and an advanced controller begins monitoring the condition variable. In response to the advanced controller detecting that the condition is met, the workgroup scheduler determines whether there is a high contention scenario, which occurs when the wait instruction is part of a mutual exclusion synchronization primitive and is detected by determining that there is a low number of updates to the condition variable prior to detecting that the condition has been met. In a high contention scenario, the workgroup scheduler wakes up one workgroup and schedules another workgroup to be woken up at a time in the future. In a non-contention scenario, more than one workgroup can be woken up at the same time.
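The wake-up policy in this abstract can be sketched as a small planning function. The abstract gives no concrete numbers, so the `low_update_threshold` and `stagger` parameters and the `plan_wakeups` name are assumptions for illustration; the plan is expressed as `(delay, workgroup_id)` pairs rather than real scheduler actions.

```python
from typing import List, Tuple


def plan_wakeups(
    waiters: List[int],
    updates_seen: int,
    low_update_threshold: int = 2,   # assumed cut-off for "a low number of updates"
    stagger: int = 100,              # assumed delay before the second wake-up
) -> List[Tuple[int, int]]:
    """Return (delay, workgroup_id) pairs once the wait condition is met."""
    if not waiters:
        return []
    if updates_seen <= low_update_threshold:
        # Few updates before the condition was met suggests a mutex-style
        # primitive (high contention): wake one workgroup now and schedule
        # one more for later instead of waking everyone.
        plan = [(0, waiters[0])]
        if len(waiters) > 1:
            plan.append((stagger, waiters[1]))
        return plan
    # Non-contention scenario: all waiting workgroups can be woken together.
    return [(0, wg) for wg in waiters]
```

For example, `plan_wakeups([3, 5, 7], updates_seen=1)` wakes workgroup 3 immediately and defers workgroup 5, whereas `plan_wakeups([3, 5, 7], updates_seen=10)` wakes all three at once.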
-
Publication No.: US11947487B2
Publication Date: 2024-04-02
Application No.: US17852306
Filing Date: 2022-06-28
Applicant: Advanced Micro Devices, Inc.
Inventor: Johnathan Robert Alsop, Karthik Ramu Sangaiah, Anthony T. Gutierrez
IPC: G06F15/82
CPC classification number: G06F15/825
Abstract: Methods and systems are disclosed for performing dataflow execution by an accelerated processing unit (APU). Techniques disclosed include decoding information from one or more dataflow instructions. The decoded information is associated with dataflow execution of a computational task. Techniques disclosed further include configuring, based on the decoded information, dataflow circuitry, and, then, executing the dataflow execution of the computational task using the dataflow circuitry.
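The decode, configure, execute sequence in this abstract can be mimicked in a few lines of Python. This is only a software analogy under assumed conventions: the textual instruction format (`op dst, lhs, rhs`), the two supported operations, and the `decode`, `configure`, and `execute` helpers are hypothetical, and a list of nodes stands in for the dataflow circuitry.

```python
import operator
from typing import Callable, Dict, List, Tuple

# Assumed set of functional units available to the toy dataflow fabric.
OPS: Dict[str, Callable[[float, float], float]] = {
    "add": operator.add,
    "mul": operator.mul,
}

Node = Tuple[Callable[[float, float], float], str, str, str]


def decode(instr: str) -> Tuple[str, str, str, str]:
    """Decode 'mul t0, a, b' into ('mul', 't0', 'a', 'b')."""
    opcode, rest = instr.split(maxsplit=1)
    dst, lhs, rhs = (tok.strip() for tok in rest.split(","))
    return opcode, dst, lhs, rhs


def configure(instrs: List[str]) -> List[Node]:
    """Build one node per decoded instruction (stand-in for configuring
    circuitry); instructions are assumed to be in dependency order."""
    return [(OPS[op], dst, lhs, rhs) for op, dst, lhs, rhs in map(decode, instrs)]


def execute(graph: List[Node], inputs: Dict[str, float]) -> Dict[str, float]:
    """Push input values through the configured nodes and return all values."""
    values = dict(inputs)
    for fn, dst, lhs, rhs in graph:
        values[dst] = fn(values[lhs], values[rhs])
    return values


graph = configure(["mul t0, a, b", "add out, t0, c"])
result = execute(graph, {"a": 2, "b": 3, "c": 4})
```

Here `result["out"]` holds `a * b + c`. The point of the split is that `configure` runs once per task while `execute` can be repeated on new inputs without decoding again.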
-
Publication No.: US20230418782A1
Publication Date: 2023-12-28
Application No.: US17852306
Filing Date: 2022-06-28
Applicant: Advanced Micro Devices, Inc.
Inventor: Johnathan Robert Alsop, Karthik Ramu Sangaiah, Anthony T. Gutierrez
IPC: G06F15/82
CPC classification number: G06F15/825
Abstract: Methods and systems are disclosed for performing dataflow execution by an accelerated processing unit (APU). Techniques disclosed include decoding information from one or more dataflow instructions. The decoded information is associated with dataflow execution of a computational task. Techniques disclosed further include configuring, based on the decoded information, dataflow circuitry, and, then, executing the dataflow execution of the computational task using the dataflow circuitry.
-
Publication No.: US20230350485A1
Publication Date: 2023-11-02
Application No.: US18346380
Filing Date: 2023-07-03
Applicant: Advanced Micro Devices, Inc.
Inventor: Vedula Venkata Srikant Bharadwaj, Shomit Das, Anthony T. Gutierrez, Vignesh Adhinarayanan
IPC: G06F1/3287, G06F9/50, G06F1/3296, G06F1/324
CPC classification number: G06F1/3287, G06F9/50, G06F1/3296, G06F1/324
Abstract: Systems, methods, devices, and computer-implemented instructions for processor power management implemented in a compiler. In some implementations, a characteristic of code is determined. An instruction based on the determined characteristic is inserted into the code. The code and inserted instruction are compiled to generate compiled code. The compiled code is output.