Mechanism to Increase Thread Parallelism in a Graphics Processor

    Publication number: US20180067763A1

    Publication date: 2018-03-08

    Application number: US15255553

    Filing date: 2016-09-02

    CPC classification number: G06F9/4881 G06F9/52 G06T1/20 G06T2200/28

    Abstract: A processing apparatus is described. The apparatus includes: a plurality of execution threads having a first thread space configuration that includes a first plurality of rows of execution threads to process data in parallel, wherein each thread in a row is dependent on a top neighbor thread in a preceding row; partition logic to partition the plurality of execution threads into a plurality of banks, wherein each bank includes one or more of the first plurality of rows of execution threads; and transform logic to transform the first thread space configuration to a second thread space configuration, which includes a second plurality of rows of execution threads, to enable the plurality of execution threads in each of the plurality of banks to operate in parallel.
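
    The abstract does not say how the second thread space is laid out. As a minimal sketch of one possible reading, the Python below places the banks side by side, so that the k-th row of every bank ends up in the same row of the new space. The function name `transform_thread_space`, the `bank_rows` parameter, and the assumption that no dependency has to be honored across a bank boundary are all illustrative and do not come from the patent.

```python
def transform_thread_space(height, width, bank_rows):
    """Map (row, col) in the original space to (row, col) in a wider, shorter space.

    Assumption (not from the abstract): rows are grouped into banks of
    `bank_rows` consecutive rows, and banks are independent of each other.
    """
    if bank_rows <= 0 or height % bank_rows != 0:
        raise ValueError("height must be a positive multiple of bank_rows")

    mapping = {}
    for row in range(height):
        # bank index and the row's position inside that bank
        bank, local_row = divmod(row, bank_rows)
        for col in range(width):
            # each bank occupies its own block of `width` columns
            mapping[(row, col)] = (local_row, bank * width + col)
    return mapping


if __name__ == "__main__":
    # 4 rows x 2 columns split into 2 banks of 2 rows each
    for src, dst in sorted(transform_thread_space(4, 2, 2).items()):
        print(src, "->", dst)
```

    Under this layout a thread's top neighbor inside its bank is still directly above it in the new space, while threads from different banks sit in the same row and can be scheduled together.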

    Mechanism to increase thread parallelism in a graphics processor

    Publication number: US10552211B2

    Publication date: 2020-02-04

    Application number: US15255553

    Filing date: 2016-09-02

    Abstract: A processing apparatus is described. The apparatus includes: a plurality of execution threads having a first thread space configuration that includes a first plurality of rows of execution threads to process data in parallel, wherein each thread in a row is dependent on a top neighbor thread in a preceding row; partition logic to partition the plurality of execution threads into a plurality of banks, wherein each bank includes one or more of the first plurality of rows of execution threads; and transform logic to transform the first thread space configuration to a second thread space configuration, which includes a second plurality of rows of execution threads, to enable the plurality of execution threads in each of the plurality of banks to operate in parallel.

    Thread dispatching for graphics processors

    Publication number: US10332232B2

    Publication date: 2019-06-25

    Application number: US15817978

    Filing date: 2017-11-20

    Abstract: Techniques to dispatch threads of a graphics kernel for execution to increase the interval between dependent threads and their associated threads are disclosed. The dispatch interval may be increased by dispatching associated threads, followed by threads without any dependencies, followed by threads dependent on the earlier dispatched associated threads. As such, the interval between dependent threads and their associated threads can be increased, leading to increased parallelism.
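
    The three-phase ordering in this abstract can be illustrated with a short Python sketch. The function name `dispatch_order`, the dictionary representation of dependencies, and the restriction to a single level of dependency (no thread is both a dependent and an associated thread) are assumptions made for illustration, not details from the patent.

```python
def dispatch_order(threads, depends_on):
    """Return a dispatch order that spreads dependents away from their associated threads.

    threads:    thread ids in their original order
    depends_on: dict mapping a dependent thread id to its associated thread id
    Assumption: dependencies are one level deep.
    """
    associated = set(depends_on.values())
    dependents = set(depends_on)
    if associated & dependents:
        raise ValueError("this sketch only handles one level of dependency")

    # Phase 1: threads that other threads wait on
    first = [t for t in threads if t in associated]
    # Phase 2: threads with no dependency relation at all
    middle = [t for t in threads if t not in associated and t not in dependents]
    # Phase 3: threads that wait on a phase 1 thread
    last = [t for t in threads if t in dependents]
    return first + middle + last


if __name__ == "__main__":
    order = dispatch_order(["a", "b", "c", "d", "e"], {"b": "a", "d": "c"})
    print(order)  # ['a', 'c', 'e', 'b', 'd']
```

    In the original order, "b" is dispatched immediately after "a". In the reordered list two other threads are dispatched between them, which gives "a" more time to finish before "b" needs its result.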

    Power efficient hybrid scoreboard method

    Publication number: US09952901B2

    Publication date: 2018-04-24

    Application number: US14564199

    Filing date: 2014-12-09

    CPC classification number: G06F9/4893 Y02D10/24

    Abstract: Described herein are technologies related to enforcing thread dependency using a hybrid scoreboard. Encoded video information that includes a plurality of threads is received; a first set and a second set of threads from the plurality of threads are determined; the first and second sets of threads are assigned to hardware and software, respectively; and thread dependencies in the first and second sets of threads are enforced.

    Independent thread saturation of graphics processing units
    Type: granted invention patent (in force)

    Publication number: US09589311B2

    Publication date: 2017-03-07

    Application number: US14133096

    Filing date: 2013-12-18

    CPC classification number: G06T1/20 G06F9/5066

    Abstract: Techniques to saturate a graphics processing unit (GPU) with independent threads from multiple kernels are described. An apparatus may include a graphics processing unit driver for a graphics processing unit having a first partition including a first plurality of execution units and a second partition including a second plurality of execution units, the graphics processing unit driver to dispatch one or more threads of a first kernel to the first partition and to dispatch one or more threads of a second kernel to the second partition to increase a utilization of the plurality of execution units and avoid hardware resource competition.
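
    As a rough illustration of the dispatch scheme in this abstract, the Python sketch below pairs each kernel with its own partition of execution units and spreads that kernel's threads across the partition. The function name `dispatch_to_partitions`, the round-robin placement inside a partition, and the data shapes are assumptions; the abstract only states that threads of different kernels go to different partitions.

```python
from itertools import cycle


def dispatch_to_partitions(partitions, kernels):
    """Assign each kernel's threads to execution units of a dedicated partition.

    partitions: list of lists of execution unit ids, one list per partition
    kernels:    dict mapping a kernel name to a list of its thread ids
    Assumption: kernel i is paired with partition i, and threads are placed
    round-robin inside the partition.
    """
    if len(partitions) != len(kernels):
        raise ValueError("this sketch expects one partition per kernel")
    if any(len(units) == 0 for units in partitions):
        raise ValueError("every partition needs at least one execution unit")

    assignment = {}
    for units, (kernel, threads) in zip(partitions, kernels.items()):
        unit_cycle = cycle(units)  # walk the partition's units repeatedly
        for thread in threads:
            assignment[(kernel, thread)] = next(unit_cycle)
    return assignment


if __name__ == "__main__":
    placed = dispatch_to_partitions([[0, 1], [2, 3]], {"k0": [0, 1, 2], "k1": [0, 1]})
    for key, unit in placed.items():
        print(key, "-> EU", unit)
```

    Because the two kernels never share an execution unit in this sketch, their threads do not compete for the same unit's resources, which is the effect the abstract attributes to the partitioned dispatch.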

