EFFICIENT DEPENDENCY DETECTION FOR CONCURRENT BINNING GPU WORKLOADS

    Publication No.: US20200027189A1

    Publication date: 2020-01-23

    Application No.: US16042172

    Filing date: 2018-07-23

    Abstract: Methods, systems, and devices for dependency detection of a graphical processor unit (GPU) workload at a device are described. The method relates to generating a resource packet for a first GPU workload of a set of GPU workloads, the resource packet including a list of resources, identifying a first resource from the list of resources, retrieving a GPU address from a first memory location associated with the first resource, determining whether a dependency of the first resource exists between the first GPU workload and a second GPU workload from the set of GPU workloads based on the retrieving of the GPU address, and processing, when the dependency exists, the first resource after waiting for a duration to lapse.
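
The flow in this abstract can be sketched in a few lines of Python. This is only an illustrative model, not the patented implementation: the names `ResourcePacket`, `address_table`, `in_flight`, and `wait_s` are hypothetical, and a fixed sleep stands in for "waiting for a duration to lapse".

```python
import time
from dataclasses import dataclass


@dataclass
class ResourcePacket:
    """Hypothetical resource packet: one workload plus the resources it touches."""
    workload_id: int
    resources: list


def find_dependencies(packet, address_table, in_flight):
    """Return (resource, other_workload) pairs that share a GPU address.

    address_table maps a resource name to the GPU address stored for it;
    in_flight maps a GPU address to the workload currently using it.
    Both tables are assumptions made for this sketch.
    """
    deps = []
    for res in packet.resources:
        gpu_addr = address_table[res]      # retrieve the GPU address for this resource
        owner = in_flight.get(gpu_addr)    # which workload, if any, uses that address
        if owner is not None and owner != packet.workload_id:
            deps.append((res, owner))
    return deps


def process(packet, address_table, in_flight, wait_s=0.0):
    """Process each resource, waiting first when a dependency was detected."""
    dependent = {res for res, _ in find_dependencies(packet, address_table, in_flight)}
    log = []
    for res in packet.resources:
        if res in dependent:
            time.sleep(wait_s)             # let the wait duration lapse
            log.append((res, "waited"))
        else:
            log.append((res, "immediate"))
        in_flight[address_table[res]] = packet.workload_id
    return log
```

Comparing retrieved GPU addresses, rather than resource names, is what lets two workloads that refer to the same memory under different handles be recognized as dependent.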

    ZERO PIXEL CULLING FOR GRAPHICS PROCESSING
    Type: invention patent application
    Status: in force

    Publication No.: US20170024926A1

    Publication date: 2017-01-26

    Application No.: US14805088

    Filing date: 2015-07-21

    Abstract: A graphics processing unit (GPU) may include a triangle setup engine (TSE) configured to determine coordinates of a triangle, rotate coordinates of the triangle based on an angle. To rotate the coordinates, the TSE generates coordinates of the triangle in a rotated domain, and determines coordinates of a bounding box in the rotated domain based on the coordinates of the triangle in the rotated domain. The TSE determines a first plurality of parallel scanlines in the rotated domain, and a second plurality of parallel scanlines in the rotated domain. The first and second pluralities of scanlines are perpendicular. The TSE determines whether the bounding box coordinates are located within two adjacent scanlines. If the bounding box coordinates are located within the two adjacent scanlines, the TSE removes the triangle from the scene.
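
A minimal sketch of the culling test described above, under stated assumptions: scanlines sit at integer multiples of `spacing` along both rotated axes, sample points lie where the two scanline families cross, and "within two adjacent scanlines" means the bounding box extent lies strictly between them. The function names are hypothetical and this is not the TSE hardware logic itself.

```python
import math


def rotate(points, angle_rad):
    """Rotate 2D points about the origin into the rotated domain."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [(x * c - y * s, x * s + y * c) for x, y in points]


def between_adjacent_scanlines(lo, hi, spacing=1.0):
    """True when no scanline (at k * spacing) falls inside [lo, hi]."""
    return math.ceil(lo / spacing) > math.floor(hi / spacing)


def should_cull(triangle, angle_rad, spacing=1.0):
    """Cull when the rotated bounding box fits between adjacent scanlines
    along either of the two perpendicular scanline directions."""
    rotated = rotate(triangle, angle_rad)
    us = [p[0] for p in rotated]
    vs = [p[1] for p in rotated]
    return (between_adjacent_scanlines(min(us), max(us), spacing)
            or between_adjacent_scanlines(min(vs), max(vs), spacing))
```

Under these assumptions the test is conservative: a bounding box that contains no scanline crossing cannot contain a sample point, so the triangle contributes zero pixels and can be dropped.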

    Cache memory system and method using dynamically allocated dirty mask space
    Type: granted invention patent
    Status: in force

    Publication No.: US09342461B2

    Publication date: 2016-05-17

    Application No.: US13687761

    Filing date: 2012-11-28

    Abstract: A cache memory system includes a cache memory including a plurality of cache memory lines and a dirty buffer including a plurality of dirty masks. A cache controller is configured to allocate one of the dirty masks to each of the cache memory lines when a write to the respective cache memory line is not a full write to that cache memory line. Each of the dirty masks indicates dirty states of data units in one of the cache memory lines. The cache controller may include a dirty buffer index which stores an identification (ID) information that associates the dirty masks with the cache memory lines to which the dirty masks are allocated. A cache line may include a fully dirty flag indicating when each byte in that cache line is dirty, so that a dirty mask does not need to be allocated for that cache line.
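
The allocation policy in this abstract can be modeled roughly as follows. It is a sketch under assumptions, not the claimed design: the class `DirtyMaskCache` and its fields are hypothetical, masks track dirtiness per byte, and running out of masks simply raises an error where a real controller would presumably write back or evict.

```python
class DirtyMaskCache:
    """Toy model: a small pool of dirty masks shared by many cache lines."""

    def __init__(self, num_lines, line_size, num_masks):
        self.line_size = line_size
        self.fully_dirty = [False] * num_lines   # one flag per cache line
        self.masks = [None] * num_masks          # the dirty buffer
        self.free_masks = list(range(num_masks))
        self.mask_of_line = {}                   # dirty buffer index: line -> mask ID

    def _release(self, line):
        mask_id = self.mask_of_line.pop(line, None)
        if mask_id is not None:
            self.masks[mask_id] = None
            self.free_masks.append(mask_id)

    def write(self, line, offset, length):
        """Record a write of `length` bytes at `offset` within `line`."""
        if offset == 0 and length == self.line_size:
            # Full write: the flag is enough, so no mask is needed.
            self.fully_dirty[line] = True
            self._release(line)
            return
        if self.fully_dirty[line]:
            return
        mask_id = self.mask_of_line.get(line)
        if mask_id is None:
            if not self.free_masks:
                raise RuntimeError("no free dirty mask (eviction not modeled)")
            mask_id = self.free_masks.pop()
            self.masks[mask_id] = [False] * self.line_size
            self.mask_of_line[line] = mask_id
        for i in range(offset, offset + length):
            self.masks[mask_id][i] = True

    def dirty_bytes(self, line):
        """Return the dirty byte offsets of `line`."""
        if self.fully_dirty[line]:
            return list(range(self.line_size))
        mask_id = self.mask_of_line.get(line)
        if mask_id is None:
            return []
        return [i for i, dirty in enumerate(self.masks[mask_id]) if dirty]
```

Because masks are allocated only for partially written lines, the dirty buffer can hold far fewer masks than there are cache lines.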

    MEMORY MANAGEMENT USING DYNAMICALLY ALLOCATED DIRTY MASK SPACE
    Type: invention patent application
    Status: in force

    Publication No.: US20140149685A1

    Publication date: 2014-05-29

    Application No.: US13687761

    Filing date: 2012-11-28

    Abstract: Systems and methods related to a memory system including a cache memory are disclosed. The cache memory system includes a cache memory including a plurality of cache memory lines and a dirty buffer including a plurality of dirty masks. A cache controller is configured to allocate one of the dirty masks to each of the cache memory lines when a write to the respective cache memory line is not a full write to that cache memory line. Each of the dirty masks indicates dirty states of data units in one of the cache memory lines. The cache controller stores an identification (ID) information that associates the dirty masks with the cache memory lines to which the dirty masks are allocated.

    SLICED GRAPHICS PROCESSING UNIT (GPU) ARCHITECTURE IN PROCESSOR-BASED DEVICES

    Publication No.: US20240221279A1

    Publication date: 2024-07-04

    Application No.: US18609624

    Filing date: 2024-03-19

    CPC classification number: G06T15/005

    Abstract: A sliced graphics processing unit (GPU) architecture in processor-based devices is disclosed. In some aspects, a GPU based on a sliced GPU architecture includes multiple hardware slices. The GPU further includes a sliced low-resolution Z buffer (LRZ) that is communicatively coupled to each hardware slice of the plurality of hardware slices, and that comprises a plurality of LRZ regions. Each hardware slice is configured to store, in an LRZ region corresponding exclusively to the hardware slice among the plurality of LRZ regions, a pixel tile assigned to the hardware slice.
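
The exclusive-region idea can be illustrated with a small model. Everything here is an assumption made for illustration: the class `SlicedLRZ`, the round-robin `assign_slice` policy, and the representation of a pixel tile as a plain value are not taken from the disclosure.

```python
class SlicedLRZ:
    """Toy LRZ split into one region per hardware slice."""

    def __init__(self, num_slices):
        self.regions = [dict() for _ in range(num_slices)]

    def store(self, slice_id, tile_id, tile_data):
        """A slice writes only into the region that belongs to it."""
        self.regions[slice_id][tile_id] = tile_data

    def owner_of(self, tile_id):
        """Return the slice whose region holds the tile, or None."""
        for slice_id, region in enumerate(self.regions):
            if tile_id in region:
                return slice_id
        return None


def assign_slice(tile_x, tile_y, tiles_per_row, num_slices):
    """Hypothetical round-robin assignment of pixel tiles to slices."""
    return (tile_y * tiles_per_row + tile_x) % num_slices
```

Keeping each slice's tiles in its own region means slices never write to the same LRZ region, which is the property the abstract emphasizes.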

    Zero pixel culling for graphics processing

    Publication No.: US09959665B2

    Publication date: 2018-05-01

    Application No.: US14805088

    Filing date: 2015-07-21

    Abstract: A graphics processing unit (GPU) may include a triangle setup engine (TSE) configured to determine coordinates of a triangle, rotate coordinates of the triangle based on an angle. To rotate the coordinates, the TSE generates coordinates of the triangle in a rotated domain, and determines coordinates of a bounding box in the rotated domain based on the coordinates of the triangle in the rotated domain. The TSE determines a first plurality of parallel scanlines in the rotated domain, and a second plurality of parallel scanlines in the rotated domain. The first and second pluralities of scanlines are perpendicular. The TSE determines whether the bounding box coordinates are located within two adjacent scanlines. If the bounding box coordinates are located within the two adjacent scanlines, the TSE removes the triangle from the scene.

    ADAPTIVE MEMORY ADDRESS SCANNING BASED ON SURFACE FORMAT FOR GRAPHICS PROCESSING
    Type: invention patent application
    Status: under examination (published)

    Publication No.: US20160321774A1

    Publication date: 2016-11-03

    Application No.: US14699806

    Filing date: 2015-04-29

    CPC classification number: G06T1/20 G06T1/60 G06T11/40

    Abstract: This disclosure describes an adaptive memory address scanning technique that defines an address scanning pattern, to be used for a particular surface, based on one or more properties of the surface. In addition, a number, shape, and arrangement of sub-primitives of a surface to process in parallel may be determined. In one example of the disclosure, a memory accessing method for graphics processing comprises, determining, by a graphics processing unit (GPU), properties of a surface, determining, by the GPU, a memory address scanning technique based on the determined properties of the surface, and performing, by the GPU, at least one of a read or a write of data associated with the surface in a memory based on the determined memory address scanning technique.
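
As a rough illustration of choosing a scan pattern from surface properties, the sketch below picks between a row-major scan and a tile-by-tile scan. The `Surface` fields, the two pattern names, and the selection rule are all hypothetical; the disclosure does not limit itself to these properties or patterns.

```python
from dataclasses import dataclass


@dataclass
class Surface:
    """Hypothetical surface description used only for this sketch."""
    width: int
    height: int
    tiled: bool      # assumed property: is the surface stored in tiles?
    tile: int = 2    # assumed tile edge length in pixels


def choose_scan_pattern(surface):
    """Select a scan pattern from the surface's properties."""
    return "tiled" if surface.tiled else "linear"


def scan_order(surface):
    """Return pixel coordinates in the order they would be visited."""
    if choose_scan_pattern(surface) == "linear":
        return [(x, y) for y in range(surface.height) for x in range(surface.width)]
    order = []
    for ty in range(0, surface.height, surface.tile):
        for tx in range(0, surface.width, surface.tile):
            # Visit one tile completely before moving to the next.
            for y in range(ty, min(ty + surface.tile, surface.height)):
                for x in range(tx, min(tx + surface.tile, surface.width)):
                    order.append((x, y))
    return order
```

Visiting a tiled surface tile by tile keeps consecutive accesses inside one tile, which is the kind of locality an adaptive scan is meant to exploit.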
