-
Publication No.: US20240087077A1
Publication Date: 2024-03-14
Application No.: US17944542
Filing Date: 2022-09-14
Applicant: Intel Corporation
Inventor: Joydeep Ray, Abhishek R. Appu, Prathamesh Raghunath Shinde, John Wiegert
Abstract: Embodiments described herein provide a technique to merge partial cache line writes to a cache memory. One embodiment provides a graphics processor comprising a graphics core, a cache coupled with the graphics core, and memory access circuitry to process memory access messages received from the graphics core. The memory access circuitry includes partial cache line write merge circuitry configured to merge a first partial write to a cache line of the cache with a second partial write to the cache line of the cache.
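The merge described in this abstract can be illustrated in software. The sketch below is not the patented circuitry: the 64-byte line size, the per-byte mask representation, the `PartialWrite` type, and the rule that the later write wins on overlapping bytes are all assumptions made for illustration.

```python
from dataclasses import dataclass

CACHE_LINE_BYTES = 64  # assumed line size; the abstract does not specify one


@dataclass
class PartialWrite:
    """Hypothetical model of a partial write to one cache line."""
    line_addr: int   # cache-line-aligned address
    byte_mask: int   # bit i set => byte i of the line is written
    data: bytes      # full-line buffer; only masked bytes are meaningful


def merge_partial_writes(first: PartialWrite, second: PartialWrite) -> PartialWrite:
    """Merge two partial writes to the same line (assumed: later write wins)."""
    if first.line_addr != second.line_addr:
        raise ValueError("writes target different cache lines")
    merged = bytearray(first.data)
    for i in range(CACHE_LINE_BYTES):
        if (second.byte_mask >> i) & 1:
            merged[i] = second.data[i]
    # The merged write covers the union of both byte masks.
    return PartialWrite(first.line_addr,
                        first.byte_mask | second.byte_mask,
                        bytes(merged))
```

With this model, two writes covering bytes 0-1 and bytes 1-2 of the same line collapse into a single write covering bytes 0-2, so the cache sees one update instead of two.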
-
Publication No.: US20220413899A1
Publication Date: 2022-12-29
Application No.: US17358882
Filing Date: 2021-06-25
Applicant: Intel Corporation
Inventor: Vasanth Ranganathan, James Valerio, Joydeep Ray, Abhishek R. Appu, Alan Curtis, Prathamesh Raghunath Shinde, Brandon Fliflet, Ben J. Ashbaugh, John Wiegert
Abstract: An apparatus to facilitate barrier state save and restore for preemption in a graphics environment is disclosed. The apparatus includes processing resources to execute a plurality of execution threads that are comprised in a thread group (TG) and mid-thread preemption barrier save and restore hardware circuitry to: initiate an exception handling routine in response to a mid-thread preemption event, the exception handling routine to cause a barrier signaling event to be issued; receive indication of a valid designated thread status for a thread of a thread group (TG) in response to the barrier signaling event; and in response to receiving the indication of the valid designated thread status for the thread of the TG, cause, by the thread of the TG having the valid designated thread status, a barrier save routine and a barrier restore routine to be initiated for named barriers of the TG.
-
Publication No.: US12299766B2
Publication Date: 2025-05-13
Application No.: US17484066
Filing Date: 2021-09-24
Applicant: Intel Corporation
Inventor: Joydeep Ray, Prathamesh Raghunath Shinde, Ben J. Ashbaugh, Wei-Yu Chen, Abhishek R. Appu, Vasanth Ranganathan, Dmitry Yurievich Babokin, Ankur N. Shah
Abstract: Systems and methods for supporting generic pointers in hardware of a graphics processing unit (GPU) are provided. In various examples, a GPU includes multiple sub-cores each having a processing resource and a load/store pipeline. The processing resource is operable to receive a memory access message including a pointer and a memory type identifier indicative of the pointer representing a generic pointer. The processing resource is further operable to output a load or store operation to the load/store pipeline based on the memory access message, including computing an address for the load or store operation by adding a base address of a named memory type of a plurality of named memory types referenced by the generic pointer to an offset into a memory of the named memory type. The load/store pipeline is operable to, responsive to receipt of the load or store operation, access the memory at the address.
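The address computation in this abstract (base address of the named memory type plus an offset) can be sketched as follows. The three memory type names, the encoding of the type in the top two bits of a 64-bit pointer, and the `resolve_generic_pointer` helper are hypothetical; the abstract does not say how the hardware encodes or decodes a generic pointer.

```python
from enum import Enum


class MemType(Enum):
    """Hypothetical named memory types a generic pointer may reference."""
    GLOBAL = 0
    SHARED_LOCAL = 1
    PRIVATE = 2


# Assumed encoding: top 2 bits select the memory type, the rest is the offset.
TAG_SHIFT = 62
OFFSET_MASK = (1 << TAG_SHIFT) - 1


def resolve_generic_pointer(ptr: int, base_addresses: dict) -> tuple:
    """Return (memory type, address), where address = base of that type + offset."""
    mem_type = MemType(ptr >> TAG_SHIFT)
    offset = ptr & OFFSET_MASK
    return mem_type, base_addresses[mem_type] + offset
```

For example, with a shared-local base of `0x8000`, a generic pointer tagged `SHARED_LOCAL` with offset `0x40` resolves to address `0x8040`, which is what the load/store pipeline would then access.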
-
Publication No.: US20240160478A1
Publication Date: 2024-05-16
Application No.: US17987185
Filing Date: 2022-11-15
Applicant: Intel Corporation
Inventor: Jiasheng Chen, Chunhui Mei, Ben J. Ashbaugh, Naveen Matam, Joydeep Ray, Timothy Bauer, Guei-Yuan Lueh, Vasanth Ranganathan, Prashant Chaudhari, Vikranth Vemulapalli, Nishanth Reddy Pendluru, Piotr Reiter, Jain Philip, Marek Rudniewski, Christopher Spencer, Parth Damani, Prathamesh Raghunath Shinde, John Wiegert, Fataneh Ghodrat
IPC: G06F9/50, G06F12/0875
CPC classification number: G06F9/5016, G06F12/0875, G06F2212/452
Abstract: An apparatus to facilitate increasing processing resources in processing cores of a graphics environment is disclosed. The apparatus includes a plurality of processing resources to execute one or more execution threads; a plurality of message arbiter-processing resource (MA-PR) routers, wherein a respective MA-PR router of the plurality of MA-PR routers corresponds to a pair of processing resources of the plurality of processing resources and is to arbitrate routing of a thread control message from a message arbiter between the pair of processing resources; a plurality of local shared cache (LSC) sequencers to provide an interface between at least one LSC of the processing core and the plurality of processing resources; and a plurality of instruction caches (ICs) to store instructions of the one or more execution threads, wherein a respective IC of the plurality of ICs interfaces with a portion of the plurality of processing resources.
-
Publication No.: US20250130848A1
Publication Date: 2025-04-24
Application No.: US18934573
Filing Date: 2024-11-01
Applicant: Intel Corporation
Inventor: Vasanth Ranganathan, James Valerio, Joydeep Ray, Abhishek R. Appu, Alan Curtis, Prathamesh Raghunath Shinde, Brandon Fliflet, Ben J. Ashbaugh, John Wiegert
Abstract: An apparatus to facilitate barrier state save and restore for preemption in a graphics environment is disclosed. The apparatus includes processing resources to execute a plurality of execution threads that are comprised in a thread group (TG) and mid-thread preemption barrier save and restore hardware circuitry to: initiate an exception handling routine in response to a mid-thread preemption event, the exception handling routine to cause a barrier signaling event to be issued; receive indication of a valid designated thread status for a thread of a thread group (TG) in response to the barrier signaling event; and in response to receiving the indication of the valid designated thread status for the thread of the TG, cause, by the thread of the TG having the valid designated thread status, a barrier save routine and a barrier restore routine to be initiated for named barriers of the TG.
-
Publication No.: US20240211403A1
Publication Date: 2024-06-27
Application No.: US18086441
Filing Date: 2022-12-21
Applicant: Intel Corporation
Inventor: Abhishek R. Appu, Joydeep Ray, Karthik Vaidyanathan, Sreedhar Chalasani, Eric Liskay, Prathamesh Raghunath Shinde, Vasanth Ranganathan, Michael J. Norris, Rajasekhar Pantangi, Altug Koker
IPC: G06F12/0837, G06F12/0811
CPC classification number: G06F12/0837, G06F12/0811
Abstract: One embodiment provides a graphics processor comprising memory access circuitry configured to generate a virtual address for pixel data at a pixel coordinate on a surface in memory to facilitate the caching of the pixel data in a cache memory before the actual memory address of the pixel coordinate is able to be determined.
-
Publication No.: US20220413854A1
Publication Date: 2022-12-29
Application No.: US17358859
Filing Date: 2021-06-25
Applicant: Intel Corporation
Inventor: Joydeep Ray, Supratim Pal, Prathamesh Raghunath Shinde, Ben J. Ashbaugh, Changwon Rhee, Hong Jiang, FangWen Fu
Abstract: An apparatus to facilitate 64-bit two-dimensional (2D) block load with transpose is disclosed. The apparatus includes a processor comprising processing resources; and load store pipeline hardware circuitry coupled to the processing resources, the load store pipeline hardware circuitry to receive a 64-bit two-dimensional (2D) block load message with transpose from the processing resources. The load store pipeline hardware circuitry comprising a load store pipeline sequencer to map rows of a block of memory corresponding to the 64-bit 2D block load message with transpose to 64-bit standard load messages; and load store pipeline return circuitry to: sequentially number general register files (GRFs) used for returning elements of the block of memory accessed by the 64-bit standard load messages to the processing resources; and return, to the processing resources, the sequentially numbered GRFs in response to the 64-bit 2D block load message with transpose.
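A rough software analogy of the flow in this abstract: each row of the block becomes one standard load, and the returned elements are placed into sequentially numbered registers so that each register holds one column of the block. The flat element-indexed `memory` list, the `pitch` parameter, and the column-per-GRF layout are assumptions for illustration; the abstract does not describe the actual register layout.

```python
def block_load_transpose(memory, base, pitch, width, height):
    """Sketch of a 2D block load with transpose.

    memory: flat list of 64-bit elements (assumed, element-indexed)
    base:   index of the block's top-left element
    pitch:  elements between the starts of consecutive rows
    Returns a dict mapping sequential GRF numbers to lists of elements.
    """
    # Sequencer step: one standard load per row of the block.
    rows = [memory[base + r * pitch: base + r * pitch + width]
            for r in range(height)]
    # Return step: GRF c collects element c of every row, i.e. column c.
    return {c: [rows[r][c] for r in range(height)] for c in range(width)}
```

For a 3-row by 2-column block, the two returned registers each hold three elements, one from each row, which is the transposed view of the block.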
-
Publication No.: US12164952B2
Publication Date: 2024-12-10
Application No.: US17358882
Filing Date: 2021-06-25
Applicant: Intel Corporation
Inventor: Vasanth Ranganathan, James Valerio, Joydeep Ray, Abhishek R. Appu, Alan Curtis, Prathamesh Raghunath Shinde, Brandon Fliflet, Ben J. Ashbaugh, John Wiegert
Abstract: An apparatus to facilitate barrier state save and restore for preemption in a graphics environment is disclosed. The apparatus includes processing resources to execute a plurality of execution threads that are comprised in a thread group (TG) and mid-thread preemption barrier save and restore hardware circuitry to: initiate an exception handling routine in response to a mid-thread preemption event, the exception handling routine to cause a barrier signaling event to be issued; receive indication of a valid designated thread status for a thread of a thread group (TG) in response to the barrier signaling event; and in response to receiving the indication of the valid designated thread status for the thread of the TG, cause, by the thread of the TG having the valid designated thread status, a barrier save routine and a barrier restore routine to be initiated for named barriers of the TG.
-
Publication No.: US20240281249A1
Publication Date: 2024-08-22
Application No.: US18170808
Filing Date: 2023-02-17
Applicant: Intel Corporation
Inventor: Abhishek R. Appu, Altug Koker, Joydeep Ray, Karthik Vaidyanathan, Sreedhar Chalasani, Eric Liskay, Prathamesh Raghunath Shinde, Vasanth Ranganathan, Michael J. Norris, Rajasekhar Pantangi
CPC classification number: G06F9/30043, G06F9/30047, G06F9/546
Abstract: One embodiment provides a graphics processor comprising memory access circuitry configured to receive a message from an instruction execution resource and determine a destination for the message, the destination one of shared function circuitry of a graphics core or a set of memory banks within the graphics core. The memory access circuitry then routes the message to the shared function circuitry in response to a determination that the message is directed to the shared function circuitry or routes the message to a message sequencer associated with the instruction execution resource in response to a determination that the message is directed to the set of memory banks.
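The routing decision in this abstract can be modeled as a simple two-way dispatch. In the sketch below, the `Message` and `Destination` types, the use of Python lists as queues, and the indexing of sequencers by a `source_id` field are hypothetical stand-ins for the hardware structures.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Destination(Enum):
    SHARED_FUNCTION = auto()  # shared function circuitry of the graphics core
    MEMORY_BANKS = auto()     # set of memory banks within the graphics core


@dataclass
class Message:
    source_id: int            # hypothetical id of the issuing execution resource
    destination: Destination
    payload: str


def route(msg, shared_function_queue, sequencers):
    """Route msg to shared function circuitry or to its source's sequencer."""
    if msg.destination is Destination.SHARED_FUNCTION:
        shared_function_queue.append(msg)
        return "shared_function"
    # Memory-bank traffic goes to the sequencer tied to the issuing resource.
    sequencers[msg.source_id].append(msg)
    return f"sequencer[{msg.source_id}]"
```

The point of the split is that memory-bank messages stay associated with the execution resource that issued them, while shared-function messages take a separate path.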
-
Publication No.: US20240104025A1
Publication Date: 2024-03-28
Application No.: US17951914
Filing Date: 2022-09-23
Applicant: Intel Corporation
Inventor: Biju George, Zamshed I. Chowdhury, Prathamesh Raghunath Shinde, Chunhui Mei, Fangwen Fu
IPC: G06F12/123, G06F12/0862
CPC classification number: G06F12/123, G06F12/0862, G06F2212/1021
Abstract: A prefetch-aware LRU cache replacement policy is described. An example of an apparatus includes one or more processors including a graphics processor, the graphics processor including a load store cache having multiple cache lines (CLs), each including bits for a cache line level (CL level) and one or more sectors for data storage; wherein the graphics processor is to receive one or more data elements for storage in the cache; set a CL level to track each CL receiving data, including setting CL level 1 for a CL receiving data in response to a miss in the cache and setting CL level 2 for a CL receiving prefetched data in response to a prefetch request; and, upon determining that space is required in the cache to store data, apply a cache replacement policy, the policy being based at least in part on set CL levels for the CLs.
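A small software model may help show how CL levels can feed into victim selection. The abstract only says the policy is based at least in part on the CL levels; the specific rule used here (evict the least recently used line at a preferred level, by default the prefetched level, and fall back to plain LRU) is an assumption, as are the `PrefetchAwareLRU` class and its method names.

```python
from collections import OrderedDict

LEVEL_MISS = 1      # line filled in response to a cache miss
LEVEL_PREFETCH = 2  # line filled in response to a prefetch request


class PrefetchAwareLRU:
    """Toy fully associative cache that tracks a CL level per line."""

    def __init__(self, capacity, evict_first=LEVEL_PREFETCH):
        self.capacity = capacity
        self.evict_first = evict_first   # assumed preference, not from the source
        self.lines = OrderedDict()       # tag -> CL level, LRU entry first

    def hit(self, tag):
        """Mark a line as most recently used."""
        self.lines.move_to_end(tag)

    def _evict(self):
        # Prefer the LRU line at the preferred level; otherwise plain LRU.
        for tag, level in self.lines.items():
            if level == self.evict_first:
                del self.lines[tag]
                return tag
        tag, _ = self.lines.popitem(last=False)
        return tag

    def fill(self, tag, prefetch=False):
        """Insert a line and set its CL level; return the evicted tag or None."""
        evicted = None
        if tag in self.lines:
            del self.lines[tag]
        elif len(self.lines) >= self.capacity:
            evicted = self._evict()
        self.lines[tag] = LEVEL_PREFETCH if prefetch else LEVEL_MISS
        return evicted
```

Under this assumed rule, a prefetched line can be evicted ahead of an older demand-filled line even if the prefetched line was touched more recently, which is where the behavior departs from plain LRU.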