-
Publication number: US11481250B2
Publication date: 2022-10-25
Application number: US16024244
Application date: 2018-06-29
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Alexandru Dutu, Matthew David Sinclair, Bradford Beckmann, David A. Wood
Abstract: A first workgroup is preempted in response to threads in the first workgroup executing a first wait instruction including a first value of a signal and a first hint indicating a type of modification for the signal. The first workgroup is scheduled for execution on a processor core based on a first context after preemption in response to the signal having the first value. A second workgroup is scheduled for execution on the processor core based on a second context in response to preempting the first workgroup and in response to the signal having a second value. A third context is prefetched into registers of the processor core based on the first hint and the second value. The first context is stored in a first portion of the registers and the second context is prefetched into a second portion of the registers prior to preempting the first workgroup.
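Illustrative sketch (not from the patent text): the abstract's scheduling idea can be modeled in a few lines of Python. Each preempted workgroup records the signal value it waits for and a hint about how the signal is modified; the scheduler resumes workgroups whose value matches the current signal and uses the hint to guess which context to prefetch next. The names `Waiter`, `predict_next_value`, `pick_ready`, `pick_prefetch`, and the hint strings "increment" and "decrement" are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Waiter:
    """A preempted workgroup and the arguments of its wait instruction."""
    workgroup: str
    wait_value: int   # signal value the workgroup is waiting for
    hint: str         # assumed hint names: "increment", "decrement", or other


def predict_next_value(current: int, hint: str) -> Optional[int]:
    """Guess the next signal value from the hint; None if it cannot be predicted."""
    if hint == "increment":
        return current + 1
    if hint == "decrement":
        return current - 1
    return None


def pick_ready(waiters: List[Waiter], signal_value: int) -> List[str]:
    """Workgroups whose awaited value equals the current signal value."""
    return [w.workgroup for w in waiters if w.wait_value == signal_value]


def pick_prefetch(waiters: List[Waiter], signal_value: int) -> List[str]:
    """Workgroups whose awaited value is the predicted next signal value."""
    return [
        w.workgroup
        for w in waiters
        if predict_next_value(signal_value, w.hint) == w.wait_value
    ]


waiters = [
    Waiter("wg0", 1, "increment"),
    Waiter("wg1", 2, "increment"),
    Waiter("wg2", 3, "increment"),
]
ready = pick_ready(waiters, signal_value=2)        # candidates to schedule now
prefetch = pick_prefetch(waiters, signal_value=2)  # contexts worth prefetching
```

In this model the signal currently holds 2, so the workgroup waiting on 2 is ready to run and the one waiting on 3 is the prefetch candidate. Actual hardware would move register contexts rather than return names.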
-
Publication number: US20240220315A1
Publication date: 2024-07-04
Application number: US18091443
Application date: 2022-12-30
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Suchita Pati, Shaizeen Aga, Nuwan Jayasena, Matthew David Sinclair
CPC classification number: G06F9/4881, G06F9/52
Abstract: A processing system includes a scheduling mechanism for producing data for fine-grained reordering of workgroups of a kernel to produce blocks of data, such as for communication across devices to enable overlapping of a producer computation with an all-reduce communication across the network. This scheduling mechanism enables a first parallel processor to schedule and execute a set of workgroups of a producer operation to generate data for transmission to a second parallel processor in a desired traffic pattern. At the same time, the second parallel processor schedules and executes a different set of workgroups of the producer operation to generate data for transmission in a desired traffic pattern to a third parallel processor or back to the first parallel processor.
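Illustrative sketch (not from the patent text): one way to picture the reordering is to stagger the order in which each device runs its workgroups, so that the block a device must send first is also the block it produces first. The sketch assumes a ring pattern in which, at step s, device d sends chunk (d - s) mod N, and that consecutive workgroup IDs produce the same chunk; both are assumptions, as are the names `chunk_order` and `workgroup_order`.

```python
from typing import List


def chunk_order(device: int, num_devices: int) -> List[int]:
    """Order in which a device needs its output chunks under the assumed ring pattern."""
    return [(device - step) % num_devices for step in range(num_devices)]


def workgroup_order(device: int, num_devices: int, wgs_per_chunk: int) -> List[int]:
    """Workgroup IDs reordered so each chunk is produced in the order it is sent."""
    order: List[int] = []
    for chunk in chunk_order(device, num_devices):
        first = chunk * wgs_per_chunk
        order.extend(range(first, first + wgs_per_chunk))
    return order


# Two devices, two workgroups per chunk: each device starts with a different chunk.
order_dev0 = workgroup_order(device=0, num_devices=2, wgs_per_chunk=2)
order_dev1 = workgroup_order(device=1, num_devices=2, wgs_per_chunk=2)
```

Because every device starts on a different chunk, the transfers can begin while the remaining workgroups of the producer operation are still executing.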
-
Publication number: US12131199B2
Publication date: 2024-10-29
Application number: US17029935
Application date: 2020-09-23
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Alexandru Dutu, Matthew David Sinclair, Bradford Beckmann, David A. Wood
CPC classification number: G06F9/522, G06F9/3005, G06F9/461, G06F11/3024, G06F11/3476, G06F11/3495, G06N20/00
Abstract: A processing system monitors and synchronizes parallel execution of workgroups (WGs). One or more of the WGs perform (e.g., periodically or in response to a trigger such as an indication of oversubscription) a waiting atomic instruction. In response to a comparison between an atomic value produced as a result of the waiting atomic instruction and an expected value, WGs that fail to produce a correct atomic value are identified as being in a waiting state (e.g., waiting for a synchronization variable). Execution of WGs in the waiting state is prevented (e.g., by a context switch) until corresponding synchronization variables are released.
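Illustrative sketch (not from the patent text): the waiting-atomic idea can be modeled as a monitor that compares the value read by an atomic operation with the value the workgroup expects. On a mismatch the workgroup is recorded as waiting instead of spinning, and a later write to the same synchronization variable releases the waiters whose expected value now matches. The class `SyncMonitor` and its methods are hypothetical names; real hardware would context-switch the workgroup rather than keep a Python list.

```python
from typing import Dict, List, Tuple


class SyncMonitor:
    """Toy model of monitoring workgroups that wait on synchronization variables."""

    def __init__(self) -> None:
        self.memory: Dict[str, int] = {}
        # address -> list of (workgroup, expected value) currently waiting
        self.waiting: Dict[str, List[Tuple[str, int]]] = {}

    def waiting_atomic_load(self, wg: str, addr: str, expected: int) -> bool:
        """Return True if the value matches; otherwise mark the workgroup as waiting."""
        value = self.memory.get(addr, 0)
        if value == expected:
            return True
        self.waiting.setdefault(addr, []).append((wg, expected))
        return False

    def store(self, addr: str, value: int) -> List[str]:
        """Write the variable and return the workgroups released by this write."""
        self.memory[addr] = value
        released: List[str] = []
        still_waiting: List[Tuple[str, int]] = []
        for wg, expected in self.waiting.get(addr, []):
            if expected == value:
                released.append(wg)
            else:
                still_waiting.append((wg, expected))
        self.waiting[addr] = still_waiting
        return released


monitor = SyncMonitor()
first_try = monitor.waiting_atomic_load("wg0", "flag", expected=1)  # flag is 0
released = monitor.store("flag", 1)                                  # releases wg0
second_try = monitor.waiting_atomic_load("wg0", "flag", expected=1)
```

While a workgroup sits in the waiting list, the scheduler in this model is free to run other workgroups, which is the point of preventing execution of waiting workgroups under oversubscription.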