Patent search ap:("QUALCOMM Incorporated") AND inv:"Lin Chen" Page 2

11.

发明申请
LOAD SCHEME FOR SHARED REGISTER IN GPU 有权
Title translation: GPU中共享注册表的加载方案

公开(公告)号：US20150379680A1

公开(公告)日：2015-12-31

申请号：US14316391

申请日：2014-06-26

Applicant: QUALCOMM Incorporated

Inventor： Yun Du , Andrew Evan Gruber , Lin Chen

IPC: G06T1/60 , G06T7/40 , G06T15/80

CPC classification number: G06T1/60 , G06T15/80 , G09G5/363 , G09G2352/00 , G09G2360/06

Abstract: Techniques are described for determining whether data of a variable for each of a plurality of graphics items is same. If determined that the data is the same, the techniques store the data in a storage location of a specialized shared general purpose register that is associated with the variable.

Abstract translation: 描述了用于确定多个图形项目中的每一个的变量的数据是否相同的技术。如果确定数据相同，则该技术将数据存储在与变量相关联的专用共享通用寄存器的存储位置中。

12.

发明申请
UTILIZING PIPELINE REGISTERS AS INTERMEDIATE STORAGE 有权
Title translation: 使用管道注册器作为中间存储

公开(公告)号：US20150324196A1

公开(公告)日：2015-11-12

申请号：US14275047

申请日：2014-05-12

Applicant: QUALCOMM Incorporated

Inventor： Lin Chen , Yun Du , Sumesh Udayakumaran , Chihong Zhang , Andrew Evan Gruber

IPC: G06F9/30 , G06F9/38

CPC classification number: G06F9/3012 , G06F9/30032 , G06F9/3017 , G06F9/3869 , G06F9/3875

Abstract: In one example, a method includes responsive to receiving, by a processing unit, one or more instructions requesting that a first value be moved from a first general purpose register (GPR) to a third GPR and that a second value be moved from a second GPR to a fourth GPR, copying, by an initial logic unit and during a first clock cycle, the first value to an initial pipeline register, copying, by the initial logic and during a second clock cycle, the second value to the initial pipeline register, copying, by a final logic unit and during a third clock cycle, the first value from a final pipeline register to the third GPR, and copying, by the final logic unit and during a fourth clock cycle, the second value from the final pipeline register to the fourth GPR.

Abstract translation: 在一个示例中，一种方法包括响应于由处理单元接收一个或多个请求将第一值从第一通用寄存器（GPR）移动到第三GPR的指令，并且第二值从第二个 GPR到第四个GPR，由初始逻辑单元和在第一时钟周期期间将第一个值复制到初始流水线寄存器，通过初始逻辑复制第二个时钟周期，将第二个值复制到初始流水线寄存器，由最终逻辑单元和在第三时钟周期期间将第一值从最终流水线寄存器复制到第三GPR，并且由最终逻辑单元复制并在第四时钟周期期间从最终管道复制第二值注册到第四个GPR。

13.

发明申请
TECHNIQUES FOR SERIALIZED EXECUTION IN A SIMD PROCESSING SYSTEM 审中-公开
Title translation: SIMD处理系统中串行执行的技术

公开(公告)号：US20150317157A1

公开(公告)日：2015-11-05

申请号：US14268215

申请日：2014-05-02

Applicant: QUALCOMM Incorporated

Inventor： Andrew Evan Gruber , Lin Chen , Yun Du , Alexei Vladimirovich Bourd

IPC: G06F9/30

CPC classification number: G06F9/3851 , G06F9/3887

Abstract: A SIMD processor may be configured to determine one or more active threads from a plurality of threads, select one active thread from the one or more active threads, and perform a divergent operation on the selected active thread. The divergent operation may be a serial operation.

Abstract translation: SIMD处理器可以被配置为从多个线程确定一个或多个活动线程，从一个或多个活动线程中选择一个活动线程，并对所选择的活动线程执行发散操作。发散操作可以是串行操作。

14.

发明授权
Dynamic shader instruction nullification for graphics processing 有权

公开(公告)号：US10430912B2

公开(公告)日：2019-10-01

申请号：US15432170

申请日：2017-02-14

Applicant: QUALCOMM Incorporated

Inventor： Andrew Evan Gruber , Lin Chen

IPC: G06T1/00 , G06T1/20 , G06T1/60 , G06T15/00

Abstract: A GPU may be configured to detect and nullify unnecessary instructions. Nullifying unnecessary instructions include overwriting a detected unnecessary instruction with a no operation (NOP) instruction. In another example, nullifying unnecessary instructions may include writing a value to a 1-bit instruction memory. Each bit of the 1-bit instruction memory may be associated with a particular instruction of the draw call. If the 1-bit instruction memory has a true value (e.g., 1), the GPU is configured to not execute the particular instruction.

15.

发明申请
GENERAL PURPOSE REGISTER ALLOCATION IN STREAMING PROCESSOR 审中-公开

公开(公告)号：US20180165092A1

公开(公告)日：2018-06-14

申请号：US15379195

申请日：2016-12-14

Applicant: QUALCOMM Incorporated

Inventor： Yun Du , Liang Han , Lin Chen , Chihong Zhang , Hongjiang Shang , Jing Wu , Zilin Ying , Chun Yu , Guofang Jiao , Andrew Gruber , Eric Demers

IPC: G06F9/30 , G06F9/38

Abstract: Systems and techniques are disclosed for general purpose register dynamic allocation based on latency associated with of instructions in processor threads. A streaming processor can include a general purpose registers configured to stored data associated with threads, and a thread scheduler configured to receive allocation information for the general purpose registers, the information describing general purpose registers that are to be assigned as persistent general purpose registers (pGPRs) and volatile general purpose registers (vGPRs). The plurality of general purpose registers can be allocated according to the received information. The streaming processor can include the general purpose registers allocated according to the received information, the allocated based on execution latencies of instructions included in the threads.

16.

发明授权
Emulation of fused multiply-add operations 有权

公开(公告)号：US09645792B2

公开(公告)日：2017-05-09

申请号：US14461890

申请日：2014-08-18

Applicant: QUALCOMM Incorporated

Inventor： Pramod Vasant Argade , Andrew Evan Gruber , Chiente Ho , Stewart Griffin Hall , Lin Chen

IPC: G06F7/544 , G06F7/483

CPC classification number: G06F7/5443 , G06F5/01 , G06F7/483 , G06F7/57

Abstract: At least one processor may emulate a fused multiply-add operation for a first operand, a second operand, and a third operand. The at least one processor may determine an intermediate value based at least in part on multiplying the first operand with the second operand, determine at least one of an upper intermediate value or a lower intermediate value, wherein determining the upper intermediate value comprises rounding, towards zero, the intermediate value by a specified number of bits, and wherein determining the lower intermediate value comprises subtracting the intermediate value by the upper intermediate value, determine an upper value and a lower value based at least in part on adding or subtracting the third operand to one of the upper intermediate value or the lower intermediate value, and determine an emulated fused multiply-add result by adding the upper value and the lower value.

17.

发明申请
GPU DIVERGENCE BARRIER 有权

公开(公告)号：US20150095914A1

公开(公告)日：2015-04-02

申请号：US14043562

申请日：2013-10-01

Applicant: QUALCOMM Incorporated

Inventor： Chunhui Mei , Alexei Vladimirovich Bourd , Lin Chen

IPC: G06F9/48 , G06F9/52 , G06T1/20

CPC classification number: G06F9/4843 , G06F9/3887 , G06F9/522 , G06T1/20

Abstract: A device includes a memory, and at least one programmable processor configured to determine, for each warp of a plurality of warps, whether a Boolean expression is true for a corresponding thread of each warp, pause execution of each warp having a corresponding thread for which the expression is true, determine a number of active threads for each of the plurality of warps for which the expression is true, sort the plurality of warps for which the expression is true based on the number of active threads in each of the plurality of warps, swap thread data of an active thread of a first warp of the plurality of warps with thread data of an inactive thread of a second warp of the plurality of warps, and resume execution of the at least one of the plurality of warps for which the expression is true.

Abstract translation: 一种设备包括存储器，以及至少一个可编程处理器，其被配置为针对多个经线的每个翘曲确定布线表达式对于每个翘曲的相应线程是否为真，每个经线的暂停执行具有相应的线程，表达式是真实的，确定表达式为真的多个经线中的每一个的多个活动线程，基于多个经线中的每一个中的活动线程的数量对表达式为真的多个经线进行排序通过多个经纱中的第二扭曲的无效线程的线程数据交换多个经纱中的第一翘曲的活动线程的线程数据，并且恢复多个经线中的至少一个经线的执行，表达是真实的。

18.

发明授权
Shuffler circuit for lane shuffle in SIMD architecture 有权

公开(公告)号：US10592468B2

公开(公告)日：2020-03-17

申请号：US15209057

申请日：2016-07-13

Applicant: QUALCOMM Incorporated

Inventor： Liang Han , Xiangdong Jin , Lin Chen , Yun Du , Alexei Vladimirovich Bourd

IPC: G06F15/80 , G06F9/30 , G06F13/40 , G06F9/38

Abstract: Techniques are described to perform a shuffle operation. Rather than using an all-lane to all-lane cross bar, a shuffler circuit having a smaller cross bar is described. The shuffler circuit performs the shuffle operation piecewise by reordering data received from processing lanes and outputting the reordered data.

19.

发明授权
Techniques for serialized execution in a SIMD processing system 有权

公开(公告)号：US10133572B2

公开(公告)日：2018-11-20

申请号：US14268215

申请日：2014-05-02

Applicant: QUALCOMM Incorporated

Inventor： Andrew Evan Gruber , Lin Chen , Yun Du , Alexei Vladimirovich Bourd

IPC: G06F9/30 , G06F9/38

Abstract: A SIMD processor may be configured to determine one or more active threads from a plurality of threads, select one active thread from the one or more active threads, and perform a divergent operation on the selected active thread. The divergent operation may be a serial operation.

20.

发明授权
Constant multiplication with texture unit of graphics processing unit 有权

公开(公告)号：US10089708B2

公开(公告)日：2018-10-02

申请号：US15141519

申请日：2016-04-28

Applicant: QUALCOMM Incorporated

Inventor： Andrew Evan Gruber , Lin Chen , Liang Li , Chunhui Mei

IPC: G06T15/80 , G06T15/04 , G06T1/20 , G06T1/60 , G06T15/00

Abstract: A texture unit of a graphics processing unit (GPU) may receive a texture data. The texture unit may receive the texture data from the memory. The texture unit may also multiply, by a multiplier circuit of the texture unit, the texture data by at least one constant, where the constant is not associated with a filtering operation, and where the texture data comprises at least one texel. The texture unit may also output, by the texture unit, a result of multiplying the texture data by the at least one constant.

Search Results

Country/Region

Patent validity

Application date

Publication (announcement) day

applicant

The country/region where the applicant is located

Inventor

IPC

IPC Department

IPC class

IPC subclass

IPC group

IPC team

Appearance classification