Patent search ap:("QUALCOMM INCORPORATED") AND inv:"Chun YU" Page 1

1.

Invention publication
PERFORMING MATRIX MULTIPLICATION IN A STREAMING PROCESSOR (Status: under examination, published)

Publication (announcement) number: US20240037183A1

Publication (announcement) date: 2024-02-01

Application number: US18487918

Application date: 2023-10-16

Applicant: QUALCOMM Incorporated

Inventor: Yun DU , Gang ZHONG , Fei WEI , Yibin ZHANG , Jing HAN , Hongjiang SHANG , Elina KAMENETSKAYA , Minjie HUANG , Alexei Vladimirovich BOURD , Chun YU , Andrew Evan GRUBER , Eric DEMERS

IPC: G06F17/16 , G06F7/57

CPC classification number: G06F17/16 , G06F7/57

Abstract: The present disclosure relates to methods and apparatus for compute processing. For example, disclosed techniques facilitate improving performance of matrix multiplication in streaming processor. Aspects of the present disclosure can execute, with a load control unit, a first load instruction to load a set of input data of an input matrix from a first memory to a second memory. Aspects of the present disclosure can also execute, with the load control unit, a second load instruction to load a set of weight data of a weight matrix from the first memory to the second memory. Additionally, aspects of the present disclosure can perform, with an ALU component, a matrix multiplication operation using the set of input data and the set of weight data to generate an output matrix. Further, aspects of the present disclosure can store the output matrix at a general purpose register accessible to the ALU component.
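The sequence in this abstract (two loads from a first memory to a second memory, a multiply, then a store to a general purpose register) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the dictionaries standing in for the memories and the register file, and the names load, matmul, and run_matmul, are hypothetical and do not come from the patent.

```python
# Hypothetical software model of the flow described in the abstract.
# Dictionaries stand in for the first memory, second memory, and GPR.

def load(first_memory, second_memory, key):
    """Model one load instruction: copy a data set from first to second memory."""
    second_memory[key] = [row[:] for row in first_memory[key]]


def matmul(a, b):
    """Plain triple-loop matrix product standing in for the ALU component."""
    inner = len(b)
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions do not match")
    cols = len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


def run_matmul(first_memory):
    second_memory = {}
    gpr = {}  # stands in for the general purpose register
    load(first_memory, second_memory, "input")   # first load instruction
    load(first_memory, second_memory, "weight")  # second load instruction
    gpr["output"] = matmul(second_memory["input"], second_memory["weight"])
    return gpr


first_memory = {"input": [[1, 2], [3, 4]], "weight": [[5, 6], [7, 8]]}
result = run_matmul(first_memory)["output"]
print(result)
```

The sketch only mirrors the order of operations named in the abstract; it says nothing about how the hardware tiles, schedules, or pipelines the work.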

2.

Invention application
CONCURRENT BINNING AND RENDERING (Status: under examination, published)

Publication (announcement) number: US20200020067A1

Publication (announcement) date: 2020-01-16

Application number: US16035372

Application date: 2018-07-13

Applicant: QUALCOMM Incorporated

Inventor: Jian LIANG , Tao WANG , Chun YU , Andrew Evan GRUBER , Donghyun KIM , Nigel POOLE , Tzun-Wei LEE , Shambhoo KHANDELWAL

IPC: G06T1/20 , G06T15/00 , G06T1/60

Abstract: A method, an apparatus, and a computer-readable medium may be configured to perform a binning pass for a first frame. The apparatus may be configured to perform a rendering pass for the first frame in parallel with the binning pass. The apparatus may be configured to enhance efficiency in performing a binning pass and a rendering pass for tile-based rendering, such that the binning pass and rendering pass are performed concurrently. The apparatus may be configured to perform the binning pass using a first hardware pipeline, and may be configured to perform the rendering pass using a second hardware pipeline.

3.

Invention application
METHODS AND APPARATUS FOR CONSTANT DATA STORAGE (Status: granted)

Publication (announcement) number: US20220414814A1

Publication (announcement) date: 2022-12-29

Application number: US17356434

Application date: 2021-06-23

Applicant: QUALCOMM Incorporated

Inventor: Yun DU , Andrew Evan GRUBER , Chihong ZHANG , Jian JIANG , Gang ZHONG , Baoguang YANG , Yang XIA , Chun YU , Eric DEMERS

IPC: G06T1/20 , G06T1/60

Abstract: The present disclosure relates to methods and devices for graphics processing including an apparatus, e.g., a GPU. The apparatus may generate a table including a plurality of entries to store data associated with at least one of a constant value or an immediate value. The apparatus may also process, upon generating the table, first data including at least one of a constant value or an immediate value. Further, the apparatus may store, in the generated table, at least one of the constant value or the immediate value of the first data. The apparatus may also transmit, upon storing at least one of the constant value or the immediate value in the table, the table including the stored at least one of the constant value or the immediate value of the first data.

4.

Invention application
OUT OF ORDER WAVE SLOT RELEASE FOR A TERMINATED WAVE (Status: granted)

Publication (announcement) number: US20210209717A1

Publication (announcement) date: 2021-07-08

Application number: US16734252

Application date: 2020-01-03

Applicant: QUALCOMM INCORPORATED

Inventor: Yun DU , Chun YU , Andrew Evan GRUBER , Zilin YING , Baoguang YANG

IPC: G06T1/20 , G06T11/00

Abstract: Methods, systems, and devices for image processing are described. A device may determine, based on a test operation, to terminate a first wave associated with a first slot of a set of slots. The device may update a terminated wave bit associated with the first slot based on the determination to terminate the first wave. In some aspects, the device may update a number of invocations field associated with the first wave based on the determination to terminate the first wave. The device may release the first slot based on updating the terminated wave bit and the number of invocations field. In some examples, the device may output the number of invocations field to a rendering backend of the device based on the terminated wave bit.
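The bookkeeping described in this abstract (set a terminated wave bit, update the number of invocations field, release the slot, report the field to the rendering backend) can be sketched as follows. This is a simplified model under stated assumptions: the class and method names are hypothetical, the backend is a plain list, and setting the invocation count to zero on termination is an illustrative choice that the abstract does not specify.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WaveSlot:
    wave_id: Optional[int] = None  # None means the slot is free
    terminated: bool = False       # the "terminated wave bit"
    num_invocations: int = 0       # the "number of invocations field"


class SlotTable:
    """Hypothetical model of a set of wave slots."""

    def __init__(self, count):
        self.slots = [WaveSlot() for _ in range(count)]
        self.backend_log = []  # stands in for the rendering backend

    def allocate(self, wave_id, num_invocations):
        """Place a wave in the first free slot and return its index."""
        for index, slot in enumerate(self.slots):
            if slot.wave_id is None:
                slot.wave_id = wave_id
                slot.terminated = False
                slot.num_invocations = num_invocations
                return index
        return None

    def apply_test(self, index, test_passed):
        """Terminate and release the wave in a slot if its test failed."""
        slot = self.slots[index]
        if test_passed or slot.wave_id is None:
            return False
        slot.terminated = True
        slot.num_invocations = 0  # assumption: a terminated wave reports zero
        self.backend_log.append((slot.wave_id, slot.num_invocations))
        slot.wave_id = None  # release without waiting for older waves
        return True


table = SlotTable(2)
older = table.allocate(wave_id=10, num_invocations=64)
newer = table.allocate(wave_id=11, num_invocations=64)
released = table.apply_test(newer, test_passed=False)
reused = table.allocate(wave_id=12, num_invocations=64)
```

In this run the newer wave's slot is released and reused while the older wave still occupies its slot, which is the out-of-order behavior the title refers to.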

5.

Invention application
METHODS AND APPARATUS TO PERFORM MATRIX MULTIPLICATION IN A STREAMING PROCESSOR (Status: granted)

Publication (announcement) number: US20210200836A1

Publication (announcement) date: 2021-07-01

Application number: US17137226

Application date: 2020-12-29

Applicant: QUALCOMM Incorporated

Inventor: Yun DU , Gang ZHONG , Fei WEI , Yibin ZHANG , Jing HAN , Hongjiang SHANG , Elina KAMENETSKAYA , Minjie HUANG , Alexei Vladimirovich BOURD , Chun YU , Andrew Evan GRUBER , Eric DEMERS

IPC: G06F17/16 , G06F7/57

Abstract: The present disclosure relates to methods and apparatus for compute processing. For example, disclosed techniques facilitate improving performance of matrix multiplication in streaming processor. Aspects of the present disclosure can execute, with a load control unit, a first load instruction to load a set of input data of an input matrix from a first memory to a second memory. Aspects of the present disclosure can also execute, with the load control unit, a second load instruction to load a set of weight data of a weight matrix from the first memory to the second memory. Additionally, aspects of the present disclosure can perform, with an ALU component, a matrix multiplication operation using the set of input data and the set of weight data to generate an output matrix. Further, aspects of the present disclosure can store the output matrix at a general purpose register accessible to the ALU component.

6.

Invention application
GENERAL PURPOSE REGISTER AND WAVE SLOT ALLOCATION IN GRAPHICS PROCESSING (Status: under examination, published)

Publication (announcement) number: US20200312006A1

Publication (announcement) date: 2020-10-01

Application number: US16364829

Application date: 2019-03-26

Applicant: QUALCOMM Incorporated

Inventor: Yun DU , Andrew Evan GRUBER , Chun YU , Chihong ZHANG , Hongjiang SHANG , Zilin YING , Fei WEI

IPC: G06T15/04 , G06F9/38 , G06F9/54

Abstract: Example techniques are described for generating graphics content by obtaining texture operation instructions corresponding to a texture operation, in response to determining at least one of insufficient general purpose register space is available for the texture operation or insufficient wave slots are available for the texture operation, generating an indication that the texture operation corresponds to a deferred wave, executing the texture operation, sending, to a texture processor, initial texture sample instructions corresponding to the texture operation that was executed, and receiving texture mapped data corresponding to the initial texture sample instructions.

7.

Invention publication
RUN-TIME MECHANISM FOR OPTIMAL SHADER (Status: under examination, published)

Publication (announcement) number: US20230377240A1

Publication (announcement) date: 2023-11-23

Application number: US17664033

Application date: 2022-05-18

Applicant: QUALCOMM Incorporated

Inventor: Yun DU , Eric DEMERS , Andrew Evan GRUBER , Chun YU , Chihong ZHANG , Baoguang YANG , Yuehai DU , Gang ZHONG , Avinash SEETHARAMAIAH , Jonnala Gadda NAGENDRA KUMAR

IPC: G06T15/00 , G06T1/60

CPC classification number: G06T15/005 , G06T1/60

Abstract: Aspects presented herein relate to methods and devices for graphics processing including an apparatus, e.g., a GPU. The apparatus may receive a set of draw call instructions corresponding to a graphics workload, where the set of draw call instructions is associated with at least one run-time parameter. The apparatus may also obtain a first shader program associated with storing data in a system memory and at least one second shader program associated with storing data in a constant memory. Further, the apparatus may execute the first shader program or the at least one second shader program based on whether the at least one run-time parameter is less than or equal to a size of the constant memory. The apparatus may also update or maintain a configuration of a shader processor or a streaming processor based on executing the first shader program or the at least one second shader program.
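The run-time choice described in this abstract can be reduced to a single comparison. The sketch below is illustrative only: the function name and return labels are hypothetical, and the mapping (use the constant-memory variant when the parameter fits, otherwise the system-memory variant) is an assumed reading, since the abstract states only that the choice depends on the comparison.

```python
def select_shader(run_time_parameter, constant_memory_size):
    """Pick a shader variant from a run-time parameter (hypothetical names).

    Assumption: when the parameter fits in constant memory, the variant that
    stores data in constant memory is used; otherwise the variant that stores
    data in system memory is used.
    """
    if run_time_parameter <= constant_memory_size:
        return "second_shader_constant_memory"
    return "first_shader_system_memory"


small = select_shader(run_time_parameter=256, constant_memory_size=1024)
exact = select_shader(run_time_parameter=1024, constant_memory_size=1024)
large = select_shader(run_time_parameter=4096, constant_memory_size=1024)
```

The boundary case uses "less than or equal to", matching the wording of the abstract.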

8.

Invention application
METHODS AND APPARATUS TO FACILITATE A DEDICATED BINDLESS STATE PROCESSOR (Status: granted)

Publication (announcement) number: US20230019763A1

Publication (announcement) date: 2023-01-19

Application number: US17758219

Application date: 2020-01-31

Applicant: QUALCOMM Incorporated

Inventor: Yun DU , Andrew Evan GRUBER , Chun YU , Chihong ZHANG , Thomas Edwin FRISINGER , Richard HAMMERSTONE , Zilin YING , Heng QI , Quanquan XU , Sheng GU

IPC: G06T1/60

Abstract: The present disclosure relates to methods and apparatus for graphics processing. For example, disclosed techniques facilitate improving bindless state processing at a graphics processor. Aspects of the present disclosure can receive, at a graphics processor, a shader program including a preamble section and a main instructions section. Aspects of the present disclosure can also execute, with a scalar processor dedicated to processing preamble sections, instructions of the preamble section to implement a bindless mechanism for loading constant data associated with the shader program. Additionally, aspects of the present disclosure can distribute the main instructions section and the constant data to a streaming processor for executing the shader program.

9.

Invention application
METHODS AND APPARATUS FOR WAVE SLOT RETIREMENT PROCEDURES (Status: granted)

Publication (announcement) number: US20220357983A1

Publication (announcement) date: 2022-11-10

Application number: US17315205

Application date: 2021-05-07

Applicant: QUALCOMM Incorporated

Inventor: Yun DU , Andrew Evan GRUBER , Zilin YING , Gang ZHONG , Baoguang YANG , Yang YU , Yang XIA , Ravindra KUMAR , Chun YU , Eric DEMERS

IPC: G06F9/48 , G06F12/0875 , G06T1/20

Abstract: The present disclosure relates to methods and devices for graphics processing including an apparatus, e.g., a GPU. The apparatus may receive a plurality of workloads based on a workload order, each of the plurality of workloads being received in the workload order including at least a first workload and a second workload. The apparatus may also allocate one or more workloads of the plurality of workloads to one or more wave slots. Additionally, the apparatus may execute the one or more allocated workloads at the one or more wave slots, such that at least the first workload is executed at the first wave slot and the second workload is executed at the second wave slot. The apparatus may also allocate at least one other workload of the plurality of workloads to at least one previously-allocated wave slot of the one or more wave slots.

10.

Invention publication
RUNTIME MECHANISM TO OPTIMIZE SHADER EXECUTION FLOW (Status: under examination, published)

Publication (announcement) number: US20240046543A1

Publication (announcement) date: 2024-02-08

Application number: US17817815

Application date: 2022-08-05

Applicant: QUALCOMM Incorporated

Inventor: Yun DU , Eric DEMERS , Andrew Evan GRUBER , Chun YU , Baoguang YANG , Chihong ZHANG , Yuehai DU , Avinash SEETHARAMAIAH , Jonnala Gadda NAGENDRA KUMAR , Gang ZHONG , Zilin YING , Fei WEI

IPC: G06T15/00 , G06T15/80

CPC classification number: G06T15/005 , G06T15/80

Abstract: This disclosure provides systems, devices, apparatus, and methods, including computer programs encoded on storage media, for runtime optimization of the shader execution flow. A graphics processor may obtain instruction execution data associated with a graphics workload, the instruction execution data including graphics data for a set of shader operations. The graphics processor may configure, at a first iteration, at least one predication value based on the instruction execution data including the graphics data for the set of shader operations. The graphics processor may adjust, at a second iteration, an execution flow of the graphics workload based on the configured at least one predication value, the execution flow of the graphics workload including the set of shader operations. The graphics processor may execute or refrain from executing, at the second iteration, each of the set of shader operations based on the adjusted execution flow of the graphics workload.
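The two-iteration pattern in this abstract (configure predication values in a first iteration, then execute or skip each shader operation in a second) can be sketched in software. The rule used here to set a predicate, namely that an operation which left its input unchanged in the first iteration is skipped in the second, is purely an assumption for illustration; the abstract does not say how the values are derived. All names are hypothetical.

```python
def first_iteration(ops, value):
    """Run every operation once and record a predicate per operation."""
    predicates = {}
    for name, op in ops:
        result = op(value)
        # Assumed rule: predicate is True only if the operation had an effect.
        predicates[name] = result != value
        value = result
    return predicates, value


def second_iteration(ops, predicates, value):
    """Execute or refrain from executing each operation based on its predicate."""
    executed = []
    for name, op in ops:
        if predicates.get(name, True):
            value = op(value)
            executed.append(name)
    return executed, value


ops = [
    ("scale", lambda v: v * 2),
    ("add_zero", lambda v: v + 0),  # has no effect, so it gets predicated off
    ("offset", lambda v: v + 1),
]
predicates, first_value = first_iteration(ops, 3)
executed, second_value = second_iteration(ops, predicates, 3)
```

Both iterations produce the same value, but the second one runs only the operations whose predicates were set.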
