-
Publication No.: US20210200836A1
Publication Date: 2021-07-01
Application No.: US17137226
Filing Date: 2020-12-29
Applicant: QUALCOMM Incorporated
Inventor: Yun DU , Gang ZHONG , Fei WEI , Yibin ZHANG , Jing HAN , Hongjiang SHANG , Elina KAMENETSKAYA , Minjie HUANG , Alexei Vladimirovich BOURD , Chun YU , Andrew Evan GRUBER , Eric DEMERS
Abstract: The present disclosure relates to methods and apparatus for compute processing. For example, disclosed techniques facilitate improving performance of matrix multiplication in a streaming processor. Aspects of the present disclosure can execute, with a load control unit, a first load instruction to load a set of input data of an input matrix from a first memory to a second memory. Aspects of the present disclosure can also execute, with the load control unit, a second load instruction to load a set of weight data of a weight matrix from the first memory to the second memory. Additionally, aspects of the present disclosure can perform, with an ALU component, a matrix multiplication operation using the set of input data and the set of weight data to generate an output matrix. Further, aspects of the present disclosure can store the output matrix at a general purpose register accessible to the ALU component.
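Illustrative sketch (not part of the patent record): the data flow named in the abstract above can be modeled in a few lines of Python. Every name here (`load`, `alu_matmul`, `first_memory`, `second_memory`, `gpr`) is hypothetical, and the dictionaries only stand in for the memories and the general purpose register; the abstract does not specify instruction formats, memory types, or matrix sizes.

```python
# Hypothetical software model of the flow in the abstract; not the patented hardware.

def load(first_memory, second_memory, key):
    """Stand-in for a load instruction: copy one matrix from first to second memory."""
    second_memory[key] = [row[:] for row in first_memory[key]]

def alu_matmul(inputs, weights):
    """Stand-in for the ALU step: plain row-by-column matrix multiplication."""
    rows, inner, cols = len(inputs), len(weights), len(weights[0])
    return [
        [sum(inputs[i][k] * weights[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(rows)
    ]

# Assumed example data; the abstract gives no concrete values.
first_memory = {"input": [[1, 2], [3, 4]], "weight": [[5, 6], [7, 8]]}
second_memory = {}
gpr = {}  # stands in for the general purpose register

load(first_memory, second_memory, "input")    # first load instruction
load(first_memory, second_memory, "weight")   # second load instruction
gpr["output"] = alu_matmul(second_memory["input"], second_memory["weight"])
print(gpr["output"])  # [[19, 22], [43, 50]]
```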
-
Publication No.: US20240037183A1
Publication Date: 2024-02-01
Application No.: US18487918
Filing Date: 2023-10-16
Applicant: QUALCOMM Incorporated
Inventor: Yun DU , Gang ZHONG , Fei WEI , Yibin ZHANG , Jing HAN , Hongjiang SHANG , Elina KAMENETSKAYA , Minjie HUANG , Alexei Vladimirovich BOURD , Chun YU , Andrew Evan GRUBER , Eric DEMERS
Abstract: The present disclosure relates to methods and apparatus for compute processing. For example, disclosed techniques facilitate improving performance of matrix multiplication in a streaming processor. Aspects of the present disclosure can execute, with a load control unit, a first load instruction to load a set of input data of an input matrix from a first memory to a second memory. Aspects of the present disclosure can also execute, with the load control unit, a second load instruction to load a set of weight data of a weight matrix from the first memory to the second memory. Additionally, aspects of the present disclosure can perform, with an ALU component, a matrix multiplication operation using the set of input data and the set of weight data to generate an output matrix. Further, aspects of the present disclosure can store the output matrix at a general purpose register accessible to the ALU component.
-
Publication No.: US20230394738A1
Publication Date: 2023-12-07
Application No.: US18035507
Filing Date: 2020-11-09
Applicant: QUALCOMM Incorporated
Inventor: Yibin ZHANG , Zilin YING , Yun DU , Heng QI , Jiexia YU , Yang YU , Andrew Evan GRUBER , Jian LIANG , Tao WANG , Alexei Vladimirovich BOURD , Gang ZHONG , Minjie HUANG
IPC: G06T15/00
CPC classification number: G06T15/005
Abstract: The present disclosure relates to methods and apparatus for graphics processing, e.g., a GPU. The apparatus may receive an image including a plurality of pixels associated with one or more workgroups and one or more pixel tiles, each of the workgroups and the pixel tiles including one or more pixels of the plurality of pixels. The apparatus may determine whether the one or more workgroups are misaligned with the one or more pixel tiles. The apparatus may determine a conversion order of the one or more workgroups when the one or more workgroups are misaligned with the one or more pixel tiles, the conversion order corresponding to a common multiple of one of the one or more workgroups and one of the one or more pixel tiles. The apparatus may convert each of the one or more workgroups based on the conversion order of the one or more workgroups.
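Illustrative sketch (not part of the patent record): the abstract above does not define how misalignment is tested or how the conversion order is derived from the common multiple. The Python below is only one possible reading along a single dimension, assuming that a workgroup and a pixel tile are aligned when one width evenly divides the other, and taking the least common multiple as the shared span. The function names and the example widths are hypothetical.

```python
from math import gcd

def is_misaligned(workgroup_width, tile_width):
    """Assumed test: misaligned when neither width evenly divides the other."""
    return workgroup_width % tile_width != 0 and tile_width % workgroup_width != 0

def common_multiple(workgroup_width, tile_width):
    """Least common multiple of the two widths, in pixels."""
    return workgroup_width * tile_width // gcd(workgroup_width, tile_width)

# Assumed example widths; the abstract gives no concrete sizes.
print(is_misaligned(6, 4))    # True
print(is_misaligned(8, 4))    # False
print(common_multiple(6, 4))  # 12
```

Under these assumptions, a 6-pixel workgroup and a 4-pixel tile share a boundary every 12 pixels, so two workgroups and three tiles cover the same span.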
-