-
Publication No.: US12190174B2
Publication date: 2025-01-07
Application No.: US16425881
Filing date: 2019-05-29
Applicant: Advanced Micro Devices, Inc.
Inventor: Alexandru Dutu, Sergey Blagodurov, Anthony T. Gutierrez, Matthew D. Sinclair, David A. Wood, Bradford M. Beckmann
Abstract: A technique for synchronizing workgroups is provided. Multiple workgroups execute a wait instruction that specifies a condition variable and a condition. A workgroup scheduler stops execution of a workgroup that executes a wait instruction and an advanced controller begins monitoring the condition variable. In response to the advanced controller detecting that the condition is met, the workgroup scheduler determines whether there is a high contention scenario, which occurs when the wait instruction is part of a mutual exclusion synchronization primitive and is detected by determining that there is a low number of updates to the condition variable prior to detecting that the condition has been met. In a high contention scenario, the workgroup scheduler wakes up one workgroup and schedules another workgroup to be woken up at a time in the future. In a non-contention scenario, more than one workgroup can be woken up at the same time.
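The wake-up decision described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `WaitQueue` class, the `wake_policy` function, and the `low_update_threshold` value are hypothetical names and numbers chosen for illustration, and the hardware roles (workgroup scheduler, advanced controller) are collapsed into plain Python.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class WaitQueue:
    """Workgroups stalled on one condition variable (hypothetical model)."""
    waiters: deque = field(default_factory=deque)
    updates_seen: int = 0  # updates to the condition variable observed so far

    def record_update(self):
        self.updates_seen += 1


def wake_policy(queue, low_update_threshold=2):
    """Return (wake_now, wake_later) once the monitored condition is met.

    Assumption: "a low number of updates" is modeled as
    updates_seen <= low_update_threshold; the abstract gives no number.
    """
    if not queue.waiters:
        return [], []
    high_contention = queue.updates_seen <= low_update_threshold
    if high_contention:
        # Mutex-like case: wake one workgroup, defer a second to a later time.
        wake_now = [queue.waiters.popleft()]
        wake_later = [queue.waiters.popleft()] if queue.waiters else []
    else:
        # Non-contention case: several workgroups may wake together.
        wake_now = list(queue.waiters)
        queue.waiters.clear()
        wake_later = []
    queue.updates_seen = 0
    return wake_now, wake_later
```

With three waiters and a single recorded update, the sketch wakes the first workgroup immediately and defers the second, leaving the third stalled; with many recorded updates it wakes all three at once.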
-
Publication No.: US12131026B2
Publication date: 2024-10-29
Application No.: US18090916
Filing date: 2022-12-29
Applicant: Advanced Micro Devices, Inc.
Inventor: Alexandru Dutu, Nuwan S Jayasena, Niti Madan
IPC: G06F3/06
CPC classification number: G06F3/061, G06F3/0659, G06F3/0673
Abstract: Adaptive scheduling of memory requests and processing-in-memory requests is described. In accordance with the described techniques, a memory controller receives a plurality of processing-in-memory requests and a plurality of non-processing-in-memory requests from a host. The memory controller schedules an order of execution for the plurality of processing-in-memory requests and the plurality of non-processing-in-memory requests based at least in part on a processing-in-memory request stall threshold and a non-processing-in-memory request stall threshold. In response to a system switching (e.g., from executing processing-in-memory requests to executing non-processing-in-memory requests or from executing non-processing-in-memory requests to executing processing-in-memory requests), the memory controller modifies the processing-in-memory request stall threshold and the non-processing-in-memory request stall threshold. The memory controller continues scheduling an order of execution for subsequent requests received from the host using the modified stall thresholds.
-
Publication No.: US11809902B2
Publication date: 2023-11-07
Application No.: US17031424
Filing date: 2020-09-24
Applicant: Advanced Micro Devices, Inc.
Inventor: Alexandru Dutu, Marcus Nathaniel Chow, Matthew D. Sinclair, Bradford M. Beckmann, David A. Wood
CPC classification number: G06F9/4881, G06F9/3838, G06F9/545
Abstract: Techniques for executing workgroups are provided. The techniques include executing, for a first workgroup of a first kernel dispatch, a workgroup dependency instruction that includes an indication to prioritize execution of a second workgroup of a second kernel dispatch, and in response to the workgroup dependency instruction, dispatching the second workgroup of the second kernel dispatch prior to dispatching a third workgroup of the second kernel dispatch, wherein no workgroup dependency instruction including an indication to prioritize execution of the third workgroup has been executed.
-
Publication No.: US11481250B2
Publication date: 2022-10-25
Application No.: US16024244
Filing date: 2018-06-29
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Alexandru Dutu, Matthew David Sinclair, Bradford Beckmann, David A. Wood
Abstract: A first workgroup is preempted in response to threads in the first workgroup executing a first wait instruction including a first value of a signal and a first hint indicating a type of modification for the signal. The first workgroup is scheduled for execution on a processor core based on a first context after preemption in response to the signal having the first value. A second workgroup is scheduled for execution on the processor core based on a second context in response to preempting the first workgroup and in response to the signal having a second value. A third context is prefetched into registers of the processor core based on the first hint and the second value. The first context is stored in a first portion of the registers and the second context is prefetched into a second portion of the registers prior to preempting the first workgroup.
-
Publication No.: US20220091880A1
Publication date: 2022-03-24
Application No.: US17031424
Filing date: 2020-09-24
Applicant: Advanced Micro Devices, Inc.
Inventor: Alexandru Dutu, Marcus Nathaniel Chow, Matthew D. Sinclair, Bradford M. Beckmann, David A. Wood
Abstract: Techniques for executing workgroups are provided. The techniques include executing, for a first workgroup of a first kernel dispatch, a workgroup dependency instruction that includes an indication to prioritize execution of a second workgroup of a second kernel dispatch, and in response to the workgroup dependency instruction, dispatching the second workgroup of the second kernel dispatch prior to dispatching a third workgroup of the second kernel dispatch, wherein no workgroup dependency instruction including an indication to prioritize execution of the third workgroup has been executed.
-
Publication No.: US20240419330A1
Publication date: 2024-12-19
Application No.: US18211544
Filing date: 2023-06-19
Applicant: Advanced Micro Devices, Inc.
Inventor: Alexandru Dutu, Sooraj Puthoor
IPC: G06F3/06
Abstract: Scheduling processing-in-memory transactions in systems with multiple memory controllers is described. In accordance with the described techniques, an addressing system segments operations of a transaction into multiple microtransactions, where each microtransaction includes a subset of the transaction operations that are scheduled by a corresponding one of the multiple memory controllers. Each transaction, and its associated microtransactions, is assigned a transaction identifier based on a current counter value maintained at the multiple memory controllers, and the multiple memory controllers schedule execution of microtransactions based on associated transaction identifiers to ensure atomic execution of operations for a transaction without interruption by operations of a different transaction.
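The segmentation and ID-based ordering described in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the claimed design: `AddressingSystem`, `controller_for`, `segment`, and `schedule` are hypothetical names, the address-to-controller mapping is a simple modulo interleave invented for the example, and a single shared counter stands in for the counter value that the abstract says is maintained at the memory controllers.

```python
from collections import defaultdict


class AddressingSystem:
    """Hypothetical model: splits transactions into per-controller microtransactions."""

    def __init__(self, num_controllers):
        self.num_controllers = num_controllers
        self.counter = 0  # simplification: one counter instead of one per controller

    def controller_for(self, address):
        # Assumed interleaving; real address mappings are more involved.
        return address % self.num_controllers

    def segment(self, operations):
        """Return a list of (txn_id, controller, ops) microtransactions."""
        txn_id = self.counter
        self.counter += 1
        per_controller = defaultdict(list)
        for op, address in operations:
            per_controller[self.controller_for(address)].append((op, address))
        return [(txn_id, ctrl, ops) for ctrl, ops in sorted(per_controller.items())]


def schedule(pending):
    """Order one controller's pending microtransactions by transaction ID.

    All operations of a lower-ID transaction are issued before any operation
    of a higher-ID one, so the two are never interleaved.
    """
    ordered = sorted(pending, key=lambda micro: micro[0])
    return [op for _, _, ops in ordered for op in ops]
```

In this sketch, two transactions that both touch controller 0 are issued there strictly in transaction-ID order even if their microtransactions arrive in the opposite order.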
-
Publication No.: US12131199B2
Publication date: 2024-10-29
Application No.: US17029935
Filing date: 2020-09-23
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Alexandru Dutu, Matthew David Sinclair, Bradford Beckmann, David A. Wood
CPC classification number: G06F9/522, G06F9/3005, G06F9/461, G06F11/3024, G06F11/3476, G06F11/3495, G06N20/00
Abstract: A processing system monitors and synchronizes parallel execution of workgroups (WGs). One or more of the WGs perform (e.g., periodically or in response to a trigger such as an indication of oversubscription) a waiting atomic instruction. In response to a comparison between an atomic value produced as a result of the waiting atomic instruction and an expected value, WGs that fail to produce a correct atomic value are identified as being in a waiting state (e.g., waiting for a synchronization variable). Execution of WGs in the waiting state is prevented (e.g., by a context switch) until corresponding synchronization variables are released.
-
Publication No.: US20230130969A1
Publication date: 2023-04-27
Application No.: US17512662
Filing date: 2021-10-27
Applicant: Advanced Micro Devices, Inc.
Inventor: Alexandru Dutu, Vaibhav Ramakrishnan Ramachandran, Michael W. Boyer
Abstract: A memory includes two or more portions of memory circuitry and two or more processor in memory (PIM) functional blocks, each PIM functional block associated with a respective portion of the memory circuitry. In operation, at least one other PIM functional block other than a particular PIM functional block copies data from a source location accessible to the other PIM functional block. The other PIM functional block then provides the data to the particular PIM functional block. The particular PIM functional block acquires and stores the data in a destination location accessible to the particular PIM functional block. The particular PIM functional block next performs one or more PIM operations using the data.
-
Publication No.: US20220206851A1
Publication date: 2022-06-30
Application No.: US17138819
Filing date: 2020-12-30
Applicant: Advanced Micro Devices, Inc.
Inventor: Alexandru Dutu
Abstract: A method and processing apparatus are provided for executing a program. The processing apparatus comprises memory and a processor. The processor is configured to dispatch a parent work group of a program to be executed and execute a spawn work group instruction to enable a child work group of the parent work group to be executed. The processor is also configured to dispatch the child work group for execution when a sufficient amount of resources are determined to be available to execute the child work group and execute the child work group on one or more compute units. The spawn work group instruction comprises a pointer to a synchronization variable, and the processor is also configured to execute a join workgroup instruction which comprises the pointer to the synchronization variable in the spawn work group instruction.
-
Publication No.: US20210096909A1
Publication date: 2021-04-01
Application No.: US16588872
Filing date: 2019-09-30
Applicant: Advanced Micro Devices, Inc.
Inventor: Alexandru Dutu, Matthew D. Sinclair, Bradford M. Beckmann, David A. Wood
Abstract: A technique for synchronizing workgroups is provided. The technique comprises detecting that one or more non-executing workgroups are ready to execute, placing the one or more non-executing workgroups into one or more ready queues based on the synchronization status of the one or more workgroups, detecting that computing resources are available for execution of one or more ready workgroups, and scheduling for execution one or more ready workgroups from the one or more ready queues in an order that is based on the relative priority of the ready queues.
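The priority-ordered ready queues described in this abstract can be sketched as below. The `ReadyQueues` class, its method names, and the synchronization-status labels used in the usage note are hypothetical; the abstract does not name the statuses or say how priorities are assigned, so the sketch simply takes a caller-supplied list ordered from highest to lowest priority.

```python
from collections import deque


class ReadyQueues:
    """Hypothetical model: one FIFO ready queue per synchronization status."""

    def __init__(self, statuses_by_priority):
        # Assumption: statuses are listed from highest to lowest priority.
        self.order = list(statuses_by_priority)
        self.queues = {status: deque() for status in self.order}

    def mark_ready(self, workgroup, sync_status):
        """Place a non-executing workgroup that became ready into its queue."""
        self.queues[sync_status].append(workgroup)

    def schedule(self, free_slots):
        """Pick up to free_slots workgroups, draining higher-priority queues first."""
        scheduled = []
        for status in self.order:
            queue = self.queues[status]
            while queue and len(scheduled) < free_slots:
                scheduled.append(queue.popleft())
        return scheduled
```

For example, with statuses ordered as `["lock_holder", "barrier_released", "new"]` (illustrative labels only), a call to `schedule(3)` returns ready lock holders first, then workgroups released from a barrier, and leaves newly ready workgroups for the next call if the slots run out.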