-
Publication No.: US20210125581A1
Publication Date: 2021-04-29
Application No.: US17062871
Filing Date: 2020-10-05
Applicant: Intel Corporation
Inventor: Joydeep Ray , Altug Koker , Balaji Vembu , Murali Ramadoss , Guei-Yuan Lueh , James A. Valerio , Prasoonkumar Surti , Abhishek R. Appu , Vasanth Ranganathan , Kalyan K. Bhiravabhatla , Arthur D. Hunter, JR. , Wei-Yu Chen , Subramaniam M. Maiyuran
IPC: G09G5/36 , G06F12/0875 , G06F9/46 , G09G5/00
Abstract: A mechanism is described for facilitating use of a shared local memory for register spilling/filling relating to graphics processors at computing devices. A method of embodiments, as described herein, includes reserving one or more spaces of a shared local memory (SLM) to perform one or more of spilling and filling relating to registers associated with a graphics processor of a computing device.
-
Publication No.: US10956330B2
Publication Date: 2021-03-23
Application No.: US16727127
Filing Date: 2019-12-26
Applicant: Intel Corporation
Inventor: Chandrasekaran Sakthivel , Prasoonkumar Surti , John C. Weast , Sara S. Baghsorkhi , Justin E. Gottschlich , Abhishek R. Appu , Nicolas C. Galoppo Von Borries , Joydeep Ray , Narayan Srinivasa , Feng Chen , Ben J. Ashbaugh , Rajkishore Barik , Tsung-Han Lin , Kamal Sinha , Eriko Nurvitadhi , Balaji Vembu , Altug Koker
IPC: G06F12/0837 , G06N3/08 , G06N20/00 , G06T1/20 , G06F12/0815 , G06N3/063 , G06N3/04
Abstract: In an example, an apparatus comprises a plurality of processing unit cores, a plurality of cache memory modules associated with the plurality of processing unit cores, and a machine learning model communicatively coupled to the plurality of processing unit cores, wherein the plurality of cache memory modules share cache coherency data with the machine learning model. Other embodiments are also disclosed and claimed.
-
Publication No.: US20210081774A1
Publication Date: 2021-03-18
Application No.: US17083080
Filing Date: 2020-10-28
Applicant: Intel Corporation
Inventor: Rajkishore Barik , Elmoustapha Ould-Ahmed-Vall , Xiaoming Chen , Dhawal Srivastava , Anbang Yao , Kevin Nealis , Eriko Nurvitadhi , Sara S. Baghsorkhi , Balaji Vembu , Tatiana Shpeisman , Ping T. Tang
Abstract: One embodiment provides for a general-purpose graphics processing unit including a scheduler to schedule multiple matrix operations for execution by a general-purpose graphics processing unit. The multiple matrix operations are determined based on a single machine learning compute instruction. The single machine learning compute instruction is a convolution instruction and the multiple matrix operations are associated with a convolution operation.
-
Publication No.: US20210065329A1
Publication Date: 2021-03-04
Application No.: US17036950
Filing Date: 2020-09-29
Applicant: Intel Corporation
Inventor: Joydeep Ray , Altug Koker , Abhishek R. Appu , Balaji Vembu
Abstract: A mechanism is described for facilitating dynamic merging of atomic operations in computing devices. A method of embodiments, as described herein, includes facilitating detecting atomic messages and a plurality of slot addresses. The method further includes comparing one or more slot addresses of the plurality of slot addresses with other slot addresses of the plurality of slot addresses to seek one or more matched slot addresses, where the one or more matched slot addresses are merged into one or more merged groups. The method may further include generating one or more merged atomic operations based on and corresponding to the one or more merged groups.
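The merging step described in this abstract can be illustrated with a minimal sketch. It assumes the atomic messages are all atomic adds and that each message is a `(slot_address, operand)` pair; the abstract specifies neither, and the function name `merge_atomic_adds` is hypothetical.

```python
from collections import defaultdict

def merge_atomic_adds(messages):
    """Group atomic-add messages by slot address and merge each group.

    `messages` is a list of (slot_address, operand) pairs. The tuple layout
    and the choice of atomic add are assumptions made for illustration.
    """
    groups = defaultdict(list)
    for address, operand in messages:
        # Messages whose slot addresses match land in the same merged group.
        groups[address].append(operand)
    # One merged atomic add per group: the sum of that group's operands.
    return {address: sum(operands) for address, operands in groups.items()}

merged = merge_atomic_adds([(0x10, 1), (0x20, 5), (0x10, 3), (0x20, 2), (0x30, 7)])
```

Five incoming messages collapse into three merged operations here, one per distinct slot address, which is the kind of reduction in memory traffic the abstract points at.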
-
Publication No.: US10936214B2
Publication Date: 2021-03-02
Application No.: US16441338
Filing Date: 2019-06-14
Applicant: Intel Corporation
Inventor: Travis T. Schluessler , Prasoonkumar Surti , Aravindh V. Anantaraman , Abhishek R. Appu , Joydeep Ray , Altug Koker , Balaji Vembu
IPC: G06F3/06 , G06F1/3234 , G06F1/3225 , G11C11/406 , G11C11/4074
Abstract: Briefly, in accordance with one or more embodiments, an apparatus comprises a memory comprising one or more physical memory chips, and a processor to implement a working set monitor to monitor a working set resident in the one or more physical memory chips. The working set monitor is to adjust a number of the physical memory chips that are powered on based on a size of the working set.
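As a rough illustration of sizing powered-on memory to the working set, the sketch below computes how many chips must stay on. The function name, the equal-capacity assumption, and the `min_chips` floor are all hypothetical; the abstract only states that the chip count is adjusted based on working-set size.

```python
def chips_to_power(working_set_bytes, chip_capacity_bytes, total_chips, min_chips=1):
    """Return how many memory chips should stay powered on.

    Assumes every chip has the same capacity and that at least `min_chips`
    must remain on; neither detail comes from the abstract.
    """
    # Integer ceiling division: chips needed to hold the whole working set.
    needed = -(-working_set_bytes // chip_capacity_bytes)
    # Clamp to the physically available range.
    return max(min_chips, min(total_chips, needed))

powered = chips_to_power(working_set_bytes=2500, chip_capacity_bytes=1000, total_chips=8)
```

A real monitor would also have to migrate resident data off a chip before powering it down; that step is outside this sketch.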
-
Publication No.: US10915608B2
Publication Date: 2021-02-09
Application No.: US16126060
Filing Date: 2018-09-10
Applicant: Intel Corporation
Inventor: Balaji Vembu , Vidhya Krishnan , Sandeep Sodhi , Sreekanth Mavila , Altug Koker , Aditya Navale , Scott Janus , Changliang Wang
IPC: H04L9/00 , G06F21/12 , G06T15/00 , H04N21/254 , G06F9/48 , H04L9/08 , G06F21/60 , G06T1/20 , G06T1/60
Abstract: Apparatus and method for scalable content protection. For example, one embodiment of an apparatus comprises: cryptographic management circuitry to securely store one or more keys associated with one or more media apps/applications; a plurality of processing engines, each processing engine comprising circuitry to process media content of the one or more media apps/applications; and a scheduler to schedule processing of the media content by the processing engines; wherein the cryptographic management circuitry is to restore a first cryptographic state including a first key associated with a first media app/application and/or first media content responsive to a request to process the first media content on a first processing engine.
-
Publication No.: US10908939B2
Publication Date: 2021-02-02
Application No.: US15420376
Filing Date: 2017-01-31
Applicant: Intel Corporation
Inventor: Balaji Vembu , Altug Koker , David Puffer , Murali Ramadoss , Bryan R. White , Hema C. Nalluri , Aditya Navale
Abstract: An apparatus and method are described for fine-grained sharing of graphics processing resources. For example, one embodiment of a graphics processing apparatus comprises: a plurality of command buffers to store work elements from a plurality of virtual machines or applications, each work element indicating a command to be processed by graphics hardware and data identifying the virtual machine or application which generated the work element; a plurality of doorbell registers or memory regions, each doorbell register or memory region associated with a particular virtual machine or application, a virtual machine or application to store an indication in its doorbell register or memory region when it has stored a work element to a command buffer; and a work scheduler to read a work element from a command buffer responsive to detecting an indication in a doorbell register, the work scheduler to combine work elements from multiple virtual machines or applications in a submission to a graphics engine, the graphics engine to execute a work element using the data identifying a virtual machine or application associated with the work element, wherein different graphics engines are configured to simultaneously execute workloads belonging to different virtual machines or applications.
-
Publication No.: US10908905B2
Publication Date: 2021-02-02
Application No.: US16599239
Filing Date: 2019-10-11
Applicant: Intel Corporation
Inventor: Joydeep Ray , Altug Koker , Balaji Vembu , Abhishek R. Appu , Kamal Sinha , Prasoonkumar Surti , Kiran C. Veernapu
Abstract: Methods and apparatus relating to techniques for avoiding cache lookup for cold cache. In an example, an apparatus comprises logic, at least partially comprising hardware logic, to determine a first number of threads to be scheduled for each context of a plurality of contexts in a multi-context processing system, allocate a second number of streaming multiprocessors (SMs) to the respective plurality of contexts, and dispatch threads from the plurality of contexts only to the streaming multiprocessor(s) allocated to the respective plurality of contexts. Other embodiments are also disclosed and claimed.
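The allocate-then-dispatch flow in this abstract might look like the sketch below. The proportional allocation policy, the round-robin dispatch, and the names `allocate_sms` and `dispatch` are assumptions for illustration; the abstract only says that SMs are allocated per context and that threads go only to their context's SMs.

```python
def allocate_sms(threads_per_context, num_sms):
    """Assign disjoint sets of SM indices to contexts.

    Policy (an assumption, not from the abstract): every context gets one SM,
    then each remaining SM goes to the context with the most threads per SM.
    """
    if not threads_per_context:
        return {}
    if num_sms < len(threads_per_context):
        raise ValueError("need at least one SM per context")
    counts = {ctx: 1 for ctx in threads_per_context}
    for _ in range(num_sms - len(counts)):
        busiest = max(counts, key=lambda c: threads_per_context[c] / counts[c])
        counts[busiest] += 1
    allocation, next_sm = {}, 0
    for ctx, n in counts.items():
        allocation[ctx] = list(range(next_sm, next_sm + n))
        next_sm += n
    return allocation

def dispatch(ctx, thread_id, allocation):
    """Pick an SM for a thread, restricted to the SMs owned by its context."""
    sms = allocation[ctx]
    return sms[thread_id % len(sms)]  # simple round-robin within the context

allocation = allocate_sms({"A": 300, "B": 100}, num_sms=4)
```

Because each context's threads stay on a fixed set of SMs, the caches attached to those SMs are not repeatedly refilled by another context's data.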
-
Publication No.: US10891773B2
Publication Date: 2021-01-12
Application No.: US15482677
Filing Date: 2017-04-07
Applicant: Intel Corporation
Inventor: Joydeep Ray , Abhishek R. Appu , Pattabhiraman K , Balaji Vembu , Altug Koker , Niranjan L. Cooray , Josh B. Mastronarde
IPC: G06T15/00 , G06F9/455 , G06T1/60 , G09G5/36 , G09G5/00 , G09G5/393 , G06F9/48 , G06F9/50 , G06T15/04 , G06T15/80 , G06T17/10 , G06T17/20
Abstract: An apparatus and method are described for allocating local memories to virtual machines. For example, one embodiment of an apparatus comprises: a command streamer to queue commands from a plurality of virtual machines (VMs) or applications, the commands to be distributed from the command streamer and executed by graphics processing resources of a graphics processing unit (GPU); a tile cache to store graphics data associated with the plurality of VMs or applications as the commands are executed by the graphics processing resources; and tile cache allocation hardware logic to allocate a first portion of the tile cache to a first VM or application and a second portion of the tile cache to a second VM or application; the tile cache allocation hardware logic to further allocate a first region in system memory to store spill-over data when the first portion of the tile cache and/or the second portion of the tile cache becomes full.
-
Publication No.: US20200334200A1
Publication Date: 2020-10-22
Application No.: US16869223
Filing Date: 2020-05-07
Applicant: Intel Corporation
Inventor: Altug Koker , Prasoonkumar Surti , David Puffer , Subramaniam Maiyuran , Guei-Yuan Lueh , Abhishek R. Appu , Joydeep Ray , Balaji Vembu , Tomer Bar-On , Andrew T. Lauritzen , Hugues Labbe , John G. Gierach , Gabor Liktor
Abstract: In an example, an apparatus comprises a plurality of execution units, and a first memory communicatively coupled to the plurality of execution units, wherein the first shared memory is shared by the plurality of execution units and a copy engine to copy context state data from at least a first of the plurality of execution units to the first shared memory. Other embodiments are also disclosed and claimed.