Abstract:
Systems and methods relate to efficient memory operations. A single instruction multiple data (SIMD) gather operation is implemented with a gather result buffer located within or in close proximity to memory, to receive or gather multiple data elements from multiple orthogonal locations in a memory, and once the gather result buffer is complete, the gathered data is transferred to a processor register. A SIMD copy operation is performed by executing two or more instructions for copying multiple data elements from multiple orthogonal source addresses to corresponding multiple destination addresses within the memory, without an intermediate copy to a processor register. Thus, the memory operations are performed in a background mode without direction by the processor.
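To illustrate the idea (not the patented hardware itself), a minimal Python sketch follows, assuming a simple memory model: element fetches fill a buffer that sits logically next to memory, and only the completed buffer is moved to the processor "register" in a single transfer. The names gather_near_memory and NUM_LANES are invented for this sketch.

```python
# Behavioral sketch of a gather result buffer located near memory.
NUM_LANES = 8

def gather_near_memory(memory, indices):
    assert len(indices) == NUM_LANES
    gather_result_buffer = [None] * NUM_LANES    # sits "next to" memory
    for lane, idx in enumerate(indices):         # filled in the background, element by element
        gather_result_buffer[lane] = memory[idx]
    # only once the buffer is complete is it transferred to the processor register
    simd_register = list(gather_result_buffer)
    return simd_register

memory = list(range(100, 200))
print(gather_near_memory(memory, [3, 17, 42, 5, 99, 0, 64, 8]))
```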
Abstract:
Systems and methods for implementing certain load instructions, such as vector load instructions, by cooperation of a main processor and a coprocessor. The load instructions which are identified by the main processor for offloading to the coprocessor are committed in the main processor without receiving corresponding load data. Post-commit, the load instructions are processed in the coprocessor, such that latencies incurred in fetching the load data are hidden from the main processor. By implementing an out-of-order load data buffer associated with an in-order instruction buffer, the coprocessor is also configured to avoid stalls due to long latencies which may be involved in fetching the load data from levels of the memory hierarchy, such as L2, L3, L4 caches, main memory, etc.
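A behavioral sketch of this division of labor, under assumed simplifications (Python objects standing in for hardware buffers; names such as LoadCoprocessor and retire_ready are invented here): the main processor commits and enqueues loads in order, load data may return in any order into a separate buffer, and results retire strictly in program order.

```python
from collections import deque

class LoadCoprocessor:
    def __init__(self):
        self.instr_buffer = deque()   # in-order instruction buffer
        self.data_buffer = {}         # out-of-order load data buffer, keyed by tag

    def offload(self, tag, address):
        self.instr_buffer.append((tag, address))  # main processor commits and moves on

    def data_returned(self, tag, value):
        self.data_buffer[tag] = value             # load data may arrive in any order

    def retire_ready(self):
        retired = []
        # retire strictly in program order, only once data has arrived
        while self.instr_buffer and self.instr_buffer[0][0] in self.data_buffer:
            tag, _ = self.instr_buffer.popleft()
            retired.append((tag, self.data_buffer.pop(tag)))
        return retired

cp = LoadCoprocessor()
cp.offload(0, 0x100); cp.offload(1, 0x200)
cp.data_returned(1, 42)          # younger load's data arrives first
print(cp.retire_ready())         # [] -- older load still outstanding
cp.data_returned(0, 7)
print(cp.retire_ready())         # [(0, 7), (1, 42)]
```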
Abstract:
Matrix multiply operations may use a reduced result matrix to increase the speed and accuracy of the operation. In one example, each higher precision row/column is decomposed into multiple component rows/columns of the base type that can be combined as weighted sums to form the original higher precision row/column. In another example, the decomposition may be independent for each input matrix and decompose to any multiple of the base type. In another example, the base type for each input matrix could be different. In another example, after decomposition, a matrix operation is performed (e.g., matrix multiply, convolutional layer, or another matrix operation) on decomposed base type input matrices to yield a result matrix that contains components of the higher precision results. The results may be combined to obtain higher-precision results.
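The decomposition can be sketched concretely with NumPy, under the assumption of a bfloat16-like base type emulated by truncating fp32 mantissas (the function names are illustrative, not from the disclosure): each higher precision matrix becomes a coarse component plus a residual, and the higher precision product is recovered as a sum of base-type products.

```python
import numpy as np

def to_base_type(x):
    # emulate a bf16-like base type by keeping only the top 7 mantissa bits of fp32
    as_int = x.astype(np.float32).view(np.uint32)
    return (as_int & 0xFFFF0000).view(np.float32)

def decompose(m):
    hi = to_base_type(m)
    lo = to_base_type(m.astype(np.float32) - hi)
    return hi, lo                        # m is approximately hi + lo

rng = np.random.default_rng(0)
a, b = rng.standard_normal((4, 4)), rng.standard_normal((4, 4))
a_hi, a_lo = decompose(a)
b_hi, b_lo = decompose(b)

# four base-type multiplies combined (unit weights here) recover the higher precision result
approx = a_hi @ b_hi + a_hi @ b_lo + a_lo @ b_hi + a_lo @ b_lo
print(np.max(np.abs(approx - a @ b)))    # small residual vs. the exact product
```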
Abstract:
Systems and methods relate to a mixed-width single instruction multiple data (SIMD) instruction which has at least a source vector operand comprising data elements of a first bit-width and a destination vector operand comprising data elements of a second bit-width, wherein the second bit-width is either half of or twice the first bit-width. Correspondingly, one of the source or destination vector operands is expressed as a pair of registers, a first register and a second register. The other vector operand is expressed as a single register. Data elements of the first register correspond to even-numbered data elements of the other vector operand expressed as a single register, and data elements of the second register correspond to odd-numbered data elements of the other vector operand expressed as a single register.
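A minimal sketch of the register-pairing convention, using Python lists as stand-in registers and an invented widening_multiply operation: the double-width destination is produced as a register pair whose first register holds results for even-numbered source elements and whose second register holds results for odd-numbered source elements.

```python
def widening_multiply(src_a, src_b):
    # src_a, src_b: single registers of narrow elements (same length)
    dest_first  = [a * b for a, b in zip(src_a[0::2], src_b[0::2])]  # even-numbered lanes
    dest_second = [a * b for a, b in zip(src_a[1::2], src_b[1::2])]  # odd-numbered lanes
    return dest_first, dest_second       # pair of wide-element registers

print(widening_multiply([1, 2, 3, 4], [10, 20, 30, 40]))
# ([10, 90], [40, 160])
```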
Abstract:
Systems and methods relate to performing data movement operations using single instruction multiple data (SIMD) instructions. A first SIMD instruction comprises a first input data vector having a number N (two or more) of data elements in corresponding N SIMD lanes and a control vector having N control elements in the corresponding N SIMD lanes. A first multi-stage cube network is controllable by the first SIMD instruction, and includes movement elements, with one movement element per SIMD lane, per stage. A movement element selects one of two data elements based on a corresponding control element and moves the data elements across the stages of the first multi-stage cube network by a zero distance or a power-of-two distance between adjacent stages to generate a first output data vector. A second multi-stage cube network can be used in conjunction with the first to generate all possible data movement operations of the input data vector.
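A behavioral sketch of a single such network follows, assuming N = 8 lanes and log2(N) stages (the function cube_network and its control layout are invented for illustration): at stage s, each lane's movement element either keeps its own element (zero distance) or takes the element from the lane at distance 2**s, as selected by one control bit.

```python
def cube_network(data, control_bits):
    n = len(data)
    stages = n.bit_length() - 1           # log2(n) stages
    out = list(data)
    for s in range(stages):
        nxt = list(out)
        for lane in range(n):
            partner = lane ^ (1 << s)     # power-of-two distance between adjacent stages
            if control_bits[s][lane]:
                nxt[lane] = out[partner]  # movement element picks the partner's element
            # else: zero-distance move (keep own element)
        out = nxt
    return out

data = list("ABCDEFGH")
ctrl = [[1] * 8, [0] * 8, [0] * 8]        # swap adjacent pairs in stage 0 only
print(cube_network(data, ctrl))           # ['B', 'A', 'D', 'C', 'F', 'E', 'H', 'G']
```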
Abstract:
A processor-implemented method for multiplication approximation includes receiving inputs to be processed using an artificial intelligence (AI) compute engine. The inputs have a first precision. The AI compute engine is configured for processing in a second precision different from the first precision. A first parameter for the inputs and a second parameter for the AI compute engine are defined. The first parameter and the second parameter respectively indicate a first portion of the first precision and a second portion of the second precision to use for computation by the AI compute engine. The inputs and the compute engine parameters are respectively adapted according to the first parameter and the second parameter to generate a first representation and a second representation. An approximation of an AI workload for the inputs is generated based on the first representation and the second representation.
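One way to picture "using only a portion of each precision" is the following toy sketch, with invented names and bit-widths (16-bit inputs, 8-bit engine parameters): each operand is truncated to its most significant bits before multiplying, so the product of the two reduced representations approximates the full-precision multiply.

```python
def truncate(value, total_bits, kept_bits):
    drop = total_bits - kept_bits
    return (value >> drop) << drop        # keep only the most-significant portion

def approx_multiply(x, w, x_bits=16, w_bits=8, x_kept=10, w_kept=6):
    x_rep = truncate(x, x_bits, x_kept)   # first representation (adapted input)
    w_rep = truncate(w, w_bits, w_kept)   # second representation (adapted parameter)
    return x_rep * w_rep

x, w = 40001, 201
print(x * w, approx_multiply(x, w))       # exact vs. approximated product
```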
Abstract:
An aspect of the disclosure relates to a data processing system, including: an input medium configured to include a first set of blocks of data including a first set of blocks of compressed data and a first set of metadata, respectively; an output medium configured to include a first set of blocks of decompressed data each having a predetermined number of decompressed elements; and a set of single instruction multiple data (SIMD) processors configured to: access the first set of blocks of data from the input medium, respectively; decompress the first set of blocks of compressed data to generate the first set of blocks of decompressed data based on the first set of metadata, respectively; and provide the first set of blocks of decompressed data to the output medium, respectively.
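As an illustration only, the sketch below assumes one concrete compression scheme (zero-value compression, where each block's metadata is a bitmask of non-zero positions); the abstract itself does not specify the scheme. Each block decompresses independently to a fixed number of elements, so one SIMD processor could handle one block.

```python
BLOCK_ELEMS = 8

def decompress_block(compressed_values, metadata_mask):
    out = [0] * BLOCK_ELEMS
    vi = 0
    for pos in range(BLOCK_ELEMS):
        if metadata_mask & (1 << pos):        # metadata says this slot holds a stored value
            out[pos] = compressed_values[vi]
            vi += 1
    return out

blocks = [([5, 7], 0b00010001), ([9], 0b10000000)]
print([decompress_block(vals, meta) for vals, meta in blocks])
# [[5, 0, 0, 0, 7, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 9]]
```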
Abstract:
Various embodiments include methods and devices for processing a neural network by an artificial intelligence (AI) processor. Embodiments may include receiving AI processor operating condition information, dynamically adjusting an AI quantization level for a segment of a neural network in response to the operating condition information, and processing the segment of the neural network using the adjusted AI quantization level.
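A toy sketch of the dynamic adjustment, with an invented operating condition (device temperature) and invented helper names: the bit-width chosen for a segment depends on the condition reading, and the segment's activations are then quantized at that level.

```python
import numpy as np

def select_bits(temperature_c):
    # hotter device -> more aggressive quantization for this segment (assumed policy)
    return 4 if temperature_c > 70 else 8

def quantize(x, bits):
    levels = 2 ** bits - 1
    scale = np.max(np.abs(x)) / levels
    return np.round(x / scale) * scale

segment_activations = np.linspace(-1.0, 1.0, 9)
bits = select_bits(temperature_c=75)
print(bits, quantize(segment_activations, bits))
```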
Abstract:
A clock gating system (CGS) includes a digital power estimator configured to generate indications of a predicted energy consumption per cycle of a clock signal and a maximum energy consumption per cycle of the clock signal. The CGS further includes a voltage-clock gate (VCG) circuit coupled to the digital power estimator. The VCG circuit is configured to gate and un-gate the clock signal based on the indications prior to occurrence of a voltage droop event and using hardware voltage model circuitry of the VCG circuit. The VCG circuit is further configured to gate the clock signal based on an undershoot phase associated with the voltage droop event and to un-gate the clock signal based on an overshoot phase associated with the voltage droop event.
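Reduced to a behavioral decision function (thresholds and names invented here, not taken from the disclosure), the gating policy might look like the sketch below: gate pre-emptively when predicted per-cycle energy approaches the maximum, keep the clock gated during the undershoot phase of a droop, and un-gate during the overshoot phase.

```python
def clock_enabled(predicted_energy, max_energy, droop_phase):
    if droop_phase == "undershoot":
        return False                          # gate the clock during undershoot
    if droop_phase == "overshoot":
        return True                           # un-gate the clock during overshoot
    # no droop event: gate proactively when predicted demand nears the maximum
    return predicted_energy < 0.9 * max_energy

for cycle in [(50, 100, None), (95, 100, None), (60, 100, "undershoot"), (60, 100, "overshoot")]:
    print(cycle, "->", "run" if clock_enabled(*cycle) else "gated")
```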
Abstract:
A clock domain crossing interface is described. The clock domain crossing interface includes a transmit clock domain and a receive clock domain using a different clock from the transmit clock domain. The clock domain crossing interface also includes a first-in-first-out (FIFO) buffer coupled between the transmit clock domain and the receive clock domain. The FIFO buffer is configured to store ordered transactions sent from the transmit clock domain to the receive clock domain. The clock domain crossing interface further includes a transmit clock domain event transfer block to notify the receive clock domain of a new transaction pushed onto the FIFO buffer in the transmit clock domain. The clock domain crossing interface also includes a receive clock domain event transfer block to notify the transmit clock domain of a new transaction pulled from the FIFO buffer in the receive clock domain.
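A behavioral sketch (not RTL, and with invented class and method names) of the described interface: a shared FIFO plus two event-transfer paths, one notifying the receive domain of each push and one notifying the transmit domain of each pull.

```python
from collections import deque

class ClockDomainCrossingFifo:
    def __init__(self):
        self.fifo = deque()
        self.push_events = 0     # tx -> rx: "a new transaction was pushed"
        self.pull_events = 0     # rx -> tx: "a transaction was pulled"

    def tx_push(self, transaction):          # runs in the transmit clock domain
        self.fifo.append(transaction)
        self.push_events += 1                # event transfer block notifies the receive domain

    def rx_pull(self):                       # runs in the receive clock domain
        if self.push_events == 0:
            return None                      # nothing announced yet
        self.push_events -= 1
        self.pull_events += 1                # event transfer block notifies the transmit domain
        return self.fifo.popleft()

cdc = ClockDomainCrossingFifo()
cdc.tx_push("write A"); cdc.tx_push("write B")
print(cdc.rx_pull(), cdc.rx_pull(), cdc.rx_pull())   # write A write B None
```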