Patent search ap:("QUALCOMM Incorporated") AND inv:"Eric Wayne Mahurin" Page 2

11.

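Granted invention patent
Hybrid convolution operation (In force)

Publication (announcement) number: US12093800B2

Publication (announcement) date: 2024-09-17

Application number: US17165648

Application date: 2021-02-02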

Applicant: QUALCOMM Incorporated

Inventor: Eric Wayne Mahurin

IPC: G06N3/04 , G06N3/0464

CPC classification number: G06N3/04 , G06N3/0464

Abstract: A device includes one or more processors configured to retrieve a first block of data, the data corresponding to an array of values arranged along at least a first dimension and a second dimension, to retrieve at least a portion of a second block of the data, and to perform a first hybrid convolution operation that applies a filter across the first block and at least the portion of the second block to generate output data. The output data includes a first accumulated block and at least a portion of a second accumulated block. The one or more processors are also configured to store the first accumulated block as first output data. The portion of the second block is adjacent to the first block along the first dimension and the portion of the second accumulated block is adjacent to the first accumulated block along the second dimension.
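
One possible reading of this abstract, sketched in Python under stated assumptions: along the first dimension (rows) each tile reads a few extra input rows from the neighboring block, and along the second dimension (columns) each tile produces a few extra partial output columns that spill into the neighboring accumulated block. The function names, block sizes, and the "valid along rows, full along columns" convention are illustrative assumptions, not taken from the patent.

```python
def direct_conv(x, k):
    """Reference result: 'valid' along rows (dim 1), 'full' along columns (dim 2)."""
    H, W, kh, kw = len(x), len(x[0]), len(k), len(k[0])
    out = [[0] * (W + kw - 1) for _ in range(H - kh + 1)]
    for i in range(H - kh + 1):
        for c in range(W):
            for a in range(kh):
                for b in range(kw):
                    out[i][c + b] += k[a][b] * x[i + a][c]
    return out


def hybrid_blocked_conv(x, k, bh, bw):
    """Same result computed tile by tile (bh x bw blocks, hypothetical sizes)."""
    H, W, kh, kw = len(x), len(x[0]), len(k), len(k[0])
    out_rows = H - kh + 1
    out = [[0] * (W + kw - 1) for _ in range(out_rows)]
    for r0 in range(0, out_rows, bh):
        rows = min(bh, out_rows - r0)
        for c0 in range(0, W, bw):
            cols = min(bw, W - c0)
            # Input: the first block plus kh - 1 rows of the block below it
            # (the "portion of a second block", adjacent along dimension 1).
            tile = [x[r][c0:c0 + cols] for r in range(r0, r0 + rows + kh - 1)]
            # Output: the first accumulated block plus kw - 1 spill columns
            # (the "portion of a second accumulated block", along dimension 2).
            acc = [[0] * (cols + kw - 1) for _ in range(rows)]
            for i in range(rows):
                for c in range(cols):
                    for a in range(kh):
                        for b in range(kw):
                            acc[i][c + b] += k[a][b] * tile[i + a][c]
            # Accumulate the tile into the global output; spill columns add
            # onto the partial sums that the next tile will also contribute to.
            for i in range(rows):
                for j in range(cols + kw - 1):
                    out[r0 + i][c0 + j] += acc[i][j]
    return out
```

With integer inputs the blocked result can be compared exactly against `direct_conv` for any block size, which is a convenient check that the row halo and the column spill cover every filter tap exactly once.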

12.

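Invention patent application
MULTI-THREAD POWER LIMITING VIA SHARED LIMIT (In force)

Publication (announcement) number: US20210240251A1

Publication (announcement) date: 2021-08-05

Application number: US16829942

Application date: 2020-03-25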

Applicant: QUALCOMM Incorporated

Inventor: Eric Wayne Mahurin , Vijay Kiran Kalyanam

IPC: G06F1/3287 , G01R21/133 , H03H17/04

Abstract: Systems and methods for multi-thread power limiting via a shared limit estimate power consumed in a processing core on a thread-by-thread basis by counting how many power events occur in each thread. Power consumed by each thread is approximated based on the number of power events that have occurred. Power consumed by individual threads is compared to a shared power limit derived from a sum of the power consumed by all threads. Threads that are above the shared power limit are stalled while threads below the shared power limit are allowed to continue without throttling. In this fashion, the most power intensive threads are throttled to stay below the shared power limit while still maintaining performance.
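
The stall decision described in this abstract can be sketched roughly as follows. The function name, the fixed energy-per-event weight, the core budget, and the choice of the per-thread mean as the shared limit are all assumptions made for illustration; the abstract only says the limit is derived from the sum over all threads.

```python
def threads_to_stall(event_counts, energy_per_event, core_budget):
    """Return the set of thread ids to stall in this interval.

    event_counts: dict mapping thread id -> number of power events counted.
    energy_per_event, core_budget: hypothetical tuning parameters.
    """
    # Approximate per-thread power from the number of counted events.
    power = {tid: n * energy_per_event for tid, n in event_counts.items()}
    total = sum(power.values())
    if not power or total <= core_budget:
        return set()  # under budget: nobody is throttled
    # Assumption: the shared limit is the mean per-thread power.
    shared_limit = total / len(power)
    return {tid for tid, p in power.items() if p > shared_limit}
```

For example, with event counts `{0: 10, 1: 2, 2: 3}`, one unit of energy per event, and a budget of 12, only thread 0 exceeds the shared limit of 5 and is stalled, while the lighter threads keep running.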

13.

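Granted invention patent
Proactive clock gating system to mitigate supply voltage droops (In force)

Publication (announcement) number: US10860051B2

Publication (announcement) date: 2020-12-08

Application number: US16563563

Application date: 2019-09-06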

Applicant: QUALCOMM Incorporated

Inventor: Vijay Kiran Kalyanam , Eric Wayne Mahurin

IPC: G06F1/08 , H03C3/09 , H03L7/08 , H03L7/07 , H03K19/00

Abstract: A clock gating system (CGS) includes a digital power estimator configured to generate indications of a predicted energy consumption per cycle of a clock signal and a maximum energy consumption per cycle of the clock signal. The CGS further includes a voltage-clock gate (VCG) circuit coupled to the digital power estimator. The VCG circuit is configured to gate and un-gate the clock signal based on the indications prior to occurrence of a voltage droop event and using hardware voltage model circuitry of the VCG circuit. The VCG circuit is further configured to gate the clock signal based on an undershoot phase associated with the voltage droop event and to un-gate the clock signal based on an overshoot phase associated with the voltage droop event.

14.

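Granted invention patent
SIMD instructions for multi-stage cube networks (In force)

Publication (announcement) number: US10459723B2

Publication (announcement) date: 2019-10-29

Application number: US14804190

Application date: 2015-07-20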

Applicant: QUALCOMM Incorporated

Inventor: Eric Wayne Mahurin

IPC: G06F9/30 , G06F15/80

Abstract: Systems and methods relate to performing data movement operations using single instruction multiple data (SIMD) instructions. A first SIMD instruction comprises a first input data vector having a number N of two or more data elements in corresponding N SIMD lanes and a control vector having N control elements in the corresponding N SIMD lanes. A first multi-stage cube network is controllable by the first SIMD instruction, and includes movement elements, with one movement element per SIMD lane, per stage. A movement element selects between one of two data elements based on a corresponding control element and moves the data elements across the stages of the first multi-stage cube network by a zero distance or power-of-two distance between adjacent stages to generate a first output data vector. A second multi-stage cube network can be used in conjunction to generate all possible data movement operations of the input data vector.
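
A minimal software model of one such network, assuming N is a power of two, that stage s pairs lane i with lane i XOR 2^s, and that bit s of each lane's control element drives that lane's movement element in stage s. The pairing rule, the control encoding, and the name `cube_network` are assumptions for illustration, not details confirmed by the abstract.

```python
def cube_network(data, control):
    """Route data through log2(N) stages of per-lane two-way selectors.

    data:    list of N elements, one per SIMD lane (N a power of two, N >= 2).
    control: list of N integers; bit s selects, for that lane in stage s,
             between the lane's own value (0) and the value 2**s lanes away (1).
    """
    n = len(data)
    if n < 2 or n & (n - 1):
        raise ValueError("number of lanes must be a power of two >= 2")
    if len(control) != n:
        raise ValueError("need one control element per lane")
    stages = n.bit_length() - 1
    for s in range(stages):
        dist = 1 << s
        # One movement element per lane per stage: pick the value at zero
        # distance or at a power-of-two distance from the previous stage.
        data = [data[i ^ dist] if (control[i] >> s) & 1 else data[i]
                for i in range(n)]
    return data
```

In this model an all-zero control vector passes the data through unchanged, and setting every stage bit in every lane reverses a four-element vector. A single network of this kind cannot realize every permutation, which is consistent with the abstract's remark that a second network is used in conjunction.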

15.

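Granted invention patent
Coprocessor for out-of-order loads (In force)

Publication (announcement) number: US09678758B2

Publication (announcement) date: 2017-06-13

Application number: US14499044

Application date: 2014-09-26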

Applicant: QUALCOMM Incorporated

Inventor: Lucian Codrescu , Christopher Edward Koob , Eric Wayne Mahurin , Suresh Kumar Venkumahanti

IPC: G06F9/312 , G06F15/76 , G06F15/80 , G06F9/38 , G06F9/30

CPC classification number: G06F9/3877 , G06F9/30036 , G06F9/30043 , G06F9/3814 , G06F9/3824 , G06F9/3836 , G06F9/3887 , G06F15/8053

Abstract: Systems and methods for implementing certain load instructions, such as vector load instructions, by cooperation of a main processor and a coprocessor. The load instructions which are identified by the main processor for offloading to the coprocessor are committed in the main processor without receiving corresponding load data. Post-commit, the load instructions are processed in the coprocessor, such that latencies incurred in fetching the load data are hidden from the main processor. By implementing an out-of-order load data buffer associated with an in-order instruction buffer, the coprocessor is also configured to avoid stalls due to long latencies which may be involved in fetching the load data from levels of memory hierarchy, such as L2, L3, L4 caches, main memory, etc.
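
The pairing of an in-order instruction buffer with an out-of-order load data buffer can be modeled roughly as below. The class name, the use of tags to match data to instructions, and the method names are hypothetical; the sketch only illustrates that data may arrive in any order while instructions still complete in program order.

```python
from collections import deque


class LoadCoprocessor:
    """Toy model: loads complete in program order, data arrives in any order."""

    def __init__(self):
        self.instr_fifo = deque()  # in-order instruction buffer (tags)
        self.load_data = {}        # out-of-order load data buffer, keyed by tag

    def offload(self, tag):
        # The main processor has already committed the load; only queue it here.
        self.instr_fifo.append(tag)

    def data_arrived(self, tag, value):
        # Memory responses (L2, main memory, ...) may return in any order.
        self.load_data[tag] = value

    def drain(self):
        # Complete loads from the head of the FIFO while their data is present.
        done = []
        while self.instr_fifo and self.instr_fifo[0] in self.load_data:
            tag = self.instr_fifo.popleft()
            done.append((tag, self.load_data.pop(tag)))
        return done
```

If loads A, B, and C are offloaded and the data for C and B arrives first, `drain()` returns nothing until A's data arrives, after which all three complete in order. The late responses were buffered rather than blocking the memory pipeline.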

16.

Invention patent application
PARALLELIZATION OF SCALAR OPERATIONS BY VECTOR PROCESSORS USING DATA-INDEXED ACCUMULATORS IN VECTOR REGISTER FILES, AND RELATED CIRCUITS, METHODS, AND COMPUTER-READABLE MEDIA (Under examination, published)

Publication (announcement) number: US20160026607A1

Publication (announcement) date: 2016-01-28

Application number: US14486326

Application date: 2014-09-15

Applicant: QUALCOMM Incorporated

Inventor: Lucian Codrescu , Eric Wayne Mahurin

IPC: G06F15/82 , G06F9/30

CPC classification number: G06F15/82 , G06F9/3001 , G06F9/30098 , G06F9/30109 , G06F9/3012

Abstract: Parallelization of scalar operations by vector processors using data-indexed accumulators in vector register files, related circuits, methods, and computer-readable media are disclosed. In one aspect, a vector processor comprises a vector register file providing a plurality of write ports and a plurality of vector registers each providing a plurality of accumulators. The vector processor receives an input data vector. For each of the plurality of write ports, the vector processor executes vector operation(s) for accessing an input data value of the input data vector, and determining, based on the input data value, a register index for a vector register among the plurality of vector registers, and an accumulator index for an accumulator among the plurality of accumulators of the vector register. Based on the register index, a register value is retrieved from the register index, and a scalar operation is performed based on the register value and the accumulator index.
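
A histogram is a natural example of a data-indexed accumulation. The sketch below is a sequential software model only: it assumes the register index and accumulator index come from dividing the input value by the number of accumulators per register, and that the scalar operation is an increment. Those choices, the function name, and the default sizes are illustrative assumptions; the hardware described would process several input values in parallel, one per write port.

```python
def histogram_in_register_file(values, num_registers=4, accumulators_per_register=8):
    """Count occurrences of each value using a modeled vector register file."""
    regs = [[0] * accumulators_per_register for _ in range(num_registers)]
    capacity = num_registers * accumulators_per_register
    for v in values:
        if not 0 <= v < capacity:
            raise ValueError(f"value {v} has no accumulator")
        # The input data value itself selects the register and the accumulator.
        reg_index, acc_index = divmod(v, accumulators_per_register)
        reg_value = regs[reg_index]   # retrieve the register value
        reg_value[acc_index] += 1     # scalar operation on one accumulator
    return regs
```

For instance, the inputs `[0, 1, 1, 9, 31]` with the default sizes leave a count of 2 in register 0, accumulator 1, and a count of 1 in register 3, accumulator 7.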
