-
Publication number: US20180067763A1
Publication date: 2018-03-08
Application number: US15255553
Application date: 2016-09-02
Applicant: Intel Corporation
Inventor: Yuting Yang, Yuenian Yang, Julia A. Gould, Guei-Yuan Lueh
CPC classification number: G06F9/4881, G06F9/52, G06T1/20, G06T2200/28
Abstract: A processing apparatus is described. The apparatus includes a plurality of execution threads having a first thread space configuration including a first plurality of rows of execution threads to process data in parallel, wherein each thread in a row is dependent on a top neighbor thread in a preceding row, partition logic to partition the plurality of execution threads into a plurality of banks, wherein each bank includes one or more of the first plurality of rows of execution threads and transform logic to transform the first thread space configuration to a second thread space configuration including a second plurality of rows of execution threads to enable the plurality of execution threads in each of the plurality of banks to operate in parallel.
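The partition-and-transform idea in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes that banks hold equal numbers of consecutive rows, that banks do not depend on one another, and that the second thread space places the banks side by side. The function name `transform_thread_space` and its parameters are hypothetical.

```python
def transform_thread_space(width, height, rows_per_bank):
    """Map a width x height thread space to a wider, shorter one.

    Assumption (not from the abstract): rows are grouped into banks of
    `rows_per_bank` consecutive rows, and the banks are laid out side by
    side in the second thread space.
    """
    if rows_per_bank <= 0 or height % rows_per_bank != 0:
        raise ValueError("height must be a positive multiple of rows_per_bank")

    num_banks = height // rows_per_bank
    mapping = {}
    for y in range(height):
        # bank: which group of rows this row belongs to
        # local_row: row index inside that bank
        bank, local_row = divmod(y, rows_per_bank)
        for x in range(width):
            # Each bank occupies its own block of columns, so a thread's
            # top neighbor inside a bank stays directly above it.
            mapping[(x, y)] = (bank * width + x, local_row)

    new_shape = (width * num_banks, rows_per_bank)
    return mapping, new_shape


if __name__ == "__main__":
    mapping, shape = transform_thread_space(width=2, height=4, rows_per_bank=2)
    print(shape)
    print(mapping[(1, 3)])
```

Under these assumptions, the top-neighbor dependency inside a bank is preserved as a top-neighbor dependency in the second thread space, while each row of the second space is wider and so exposes more threads that can run in parallel.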
-
Publication number: US10552211B2
Publication date: 2020-02-04
Application number: US15255553
Application date: 2016-09-02
Applicant: Intel Corporation
Inventor: Yuting Yang, Yuenian Yang, Julia A. Gould, Guei-Yuan Lueh
Abstract: A processing apparatus is described. The apparatus includes a plurality of execution threads having a first thread space configuration including a first plurality of rows of execution threads to process data in parallel, wherein each thread in a row is dependent on a top neighbor thread in a preceding row, partition logic to partition the plurality of execution threads into a plurality of banks, wherein each bank includes one or more of the first plurality of rows of execution threads and transform logic to transform the first thread space configuration to a second thread space configuration including a second plurality of rows of execution threads to enable the plurality of execution threads in each of the plurality of banks to operate in parallel.
-
Publication number: US09824414B2
Publication date: 2017-11-21
Application number: US14565240
Application date: 2014-12-09
Applicant: INTEL CORPORATION
Inventor: Julia A. Gould, Haihua Wu
CPC classification number: G06T1/20, G06F9/3009, G06F9/3851, G06F9/4881, H04N19/70
Abstract: Techniques to dispatch threads of a graphics kernel for execution to increase the interval between dependent threads and their associated threads are disclosed. The dispatch interval may be increased by dispatching associated threads, followed by threads without any dependencies, followed by threads dependent on the earlier dispatched associated threads. As such, the interval between dependent threads and their associated threads can be increased, leading to increased parallelism.
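The three-phase dispatch order described in this abstract can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented method: dependencies are assumed to be one level deep (an associated thread does not itself depend on another thread), and the function name `dispatch_order` and the `depends_on` mapping are hypothetical.

```python
def dispatch_order(threads, depends_on):
    """Return a dispatch order: associated, then independent, then dependent.

    threads: list of thread ids in their original order.
    depends_on: dict mapping a dependent thread id to the set of
        "associated" thread ids it must wait for.
    Assumption: dependencies are one level deep.
    """
    # Threads that wait on at least one other thread.
    dependent = [t for t in threads if depends_on.get(t)]
    # Threads that some dependent thread is waiting for.
    associated_ids = {a for t in dependent for a in depends_on[t]}

    first = [t for t in threads
             if t in associated_ids and not depends_on.get(t)]
    middle = [t for t in threads
              if t not in associated_ids and not depends_on.get(t)]

    # Independent threads in the middle widen the gap between each
    # associated thread and the threads that depend on it.
    return first + middle + dependent


if __name__ == "__main__":
    order = dispatch_order(
        ["a", "b", "c", "d", "e"],
        {"b": {"a"}, "d": {"c"}},
    )
    print(order)  # ['a', 'c', 'e', 'b', 'd']
```

In the original order, thread "b" immediately follows the thread it depends on. In the reordered list, the unrelated thread "e" is dispatched in between, so "a" has more time to finish before "b" is issued.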
-
Publication number: US10332232B2
Publication date: 2019-06-25
Application number: US15817978
Application date: 2017-11-20
Applicant: INTEL CORPORATION
Inventor: Julia A. Gould, Haihua Wu
Abstract: Techniques to dispatch threads of a graphics kernel for execution to increase the interval between dependent threads and their associated threads are disclosed. The dispatch interval may be increased by dispatching associated threads, followed by threads without any dependencies, followed by threads dependent on the earlier dispatched associated threads. As such, the interval between dependent threads and their associated threads can be increased, leading to increased parallelism.
-
Publication number: US09952901B2
Publication date: 2018-04-24
Application number: US14564199
Application date: 2014-12-09
Applicant: Intel Corporation
Inventor: Haihua Wu, Julia A. Gould, Li-An Tang
CPC classification number: G06F9/4893, Y02D10/24
Abstract: Described herein are technologies related to enforcing thread dependency using a hybrid scoreboard. Encoded video information that includes a plurality of threads is received, a first set and a second set of threads from the plurality of threads are determined, the first and second sets of threads are assigned to hardware and software, respectively, and thread dependency in the first and second sets of threads is enforced.
-
Publication number: US09589311B2
Publication date: 2017-03-07
Application number: US14133096
Application date: 2013-12-18
Applicant: INTEL CORPORATION
Inventor: Julia A. Gould, Haihua Wu
CPC classification number: G06T1/20, G06F9/5066
Abstract: Techniques to saturate a graphics processing unit (GPU) with independent threads from multiple kernels are described. An apparatus may include a graphics processing unit driver for a graphics processing unit having a first partition including a first plurality of execution units and a second partition including a second plurality of execution units, the graphics processing unit driver to dispatch one or more threads of a first kernel to the first partition and to dispatch one or more threads of a second kernel to the second partition to increase a utilization of the plurality of execution units and avoid hardware resource competition.
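The per-partition dispatch described in this abstract can be sketched in Python. This is a simplified model, not the driver logic from the patent: it assumes one kernel per partition, paired in order, and assumes round-robin placement of threads on the execution units inside each partition. The function name `dispatch_to_partitions` and the data layout are hypothetical.

```python
def dispatch_to_partitions(kernels, partitions):
    """Assign each kernel's threads to execution units of one partition.

    kernels: dict mapping a kernel name to a list of its thread ids.
    partitions: list of lists of execution unit (EU) ids, one list per
        partition; kernels and partitions are paired in order.
    Assumption: threads are placed round-robin within a partition.
    """
    if len(kernels) > len(partitions):
        raise ValueError("need at least one partition per kernel")

    assignment = {}
    for (kernel, threads), eus in zip(kernels.items(), partitions):
        if not eus:
            raise ValueError("a partition must contain at least one EU")
        for i, thread in enumerate(threads):
            # Threads of one kernel never leave that kernel's partition,
            # so the two kernels do not compete for the same EUs.
            assignment[(kernel, thread)] = eus[i % len(eus)]
    return assignment


if __name__ == "__main__":
    result = dispatch_to_partitions(
        {"kernel0": [0, 1, 2], "kernel1": [0, 1]},
        [[0, 1], [2, 3]],
    )
    for key, eu in result.items():
        print(key, "-> EU", eu)
```

Keeping each kernel inside its own partition is what separates the two kernels' hardware use in this model. How the real driver sizes partitions or balances load is not stated in the abstract.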