Memory storage format for supporting machine learning acceleration

    Publication No.: US12165237B2

    Publication Date: 2024-12-10

    Application No.: US17946753

    Filing Date: 2022-09-16

    Abstract: A processor-implemented method for a memory storage format to accelerate machine learning (ML) on a computing device is described. The method includes receiving an image in a first layer storage format of a neural network. The method also includes assigning addresses to image pixels of each of three channels of the first layer storage format for accessing the image pixels in a blocked ML storage acceleration format. The method further includes storing the image pixels in the blocked ML storage acceleration format according to the assigned addresses of the image pixels. The method also includes accelerating inference video processing of the image according to the assigned addresses for the image pixels corresponding to the blocked ML storage acceleration format.
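The abstract describes mapping each pixel of a three-channel image to an address in a blocked layout so that tiles of pixels are contiguous in memory. The patent does not publish its address formula, so the following is a minimal sketch of one plausible blocked addressing scheme; the function names, the tile size `block=4`, and the tile-major ordering are all our illustrative assumptions, not the patented format.

```python
def blocked_address(c, y, x, width, channels=3, block=4):
    """Map (channel, row, col) to a linear address in a blocked layout.

    Hypothetical scheme: pixels are grouped into block x block tiles,
    and each tile stores its three channel planes back to back, so a
    tile's pixels for one channel are contiguous in memory.
    """
    blocks_per_row = width // block
    tile = (y // block) * blocks_per_row + (x // block)
    within = (y % block) * block + (x % block)
    return (tile * channels + c) * block * block + within


def to_blocked(image, block=4):
    """Repack a [channel][row][col] image into a flat blocked buffer."""
    channels = len(image)
    height = len(image[0])
    width = len(image[0][0])
    buf = [0] * (channels * height * width)
    for c in range(channels):
        for y in range(height):
            for x in range(width):
                buf[blocked_address(c, y, x, width, channels, block)] = image[c][y][x]
    return buf
```

With a 4x4 image and `block=4` the whole image is a single tile, so the layout degenerates to planar channel storage; larger images interleave tiles, which is what lets an accelerator stream one tile at a time.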

    Providing self-resetting multi-producer multi-consumer semaphores in distributed processor-based systems

    Publication No.: US11144368B2

    Publication Date: 2021-10-12

    Application No.: US16443954

    Filing Date: 2019-06-18

    Abstract: Providing self-resetting multi-producer multi-consumer semaphores in distributed processor-based systems is disclosed. In one aspect, a synchronization management circuit provides a semaphore including a counting semaphore value indicator, a current wait count indicator, and a target wait count indicator. When a consumer completes a wait operation, the synchronization management circuit adjusts the value of the current wait count indicator towards the value of the target wait count indicator, and compares the value of the current wait count indicator to the value of the target wait count indicator. If the value of the current wait count indicator has reached the value of the target wait count indicator, the synchronization management circuit infers that all consumers have observed the semaphore, and accordingly resets both the counting semaphore value indicator and the current wait count indicator to an initial wait value to place the semaphore in its initial state for reuse.
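The reset logic the abstract walks through (count completed waits toward a target, then restore the semaphore's initial state once every consumer has observed it) can be sketched in software. This is a minimal single-process analogue using a condition variable; the class and field names are ours, and the patent describes a hardware synchronization management circuit, not this Python code.

```python
import threading


class SelfResettingSemaphore:
    """Illustrative sketch of a self-resetting multi-consumer semaphore."""

    def __init__(self, initial_value, target_wait_count):
        self._initial_value = initial_value
        self._value = initial_value        # counting semaphore value indicator
        self._current_waits = 0            # current wait count indicator
        self._target_waits = target_wait_count  # target wait count indicator
        self._cond = threading.Condition()

    def signal(self, count=1):
        """Producer side: raise the counting semaphore value."""
        with self._cond:
            self._value += count
            self._cond.notify_all()

    def wait(self):
        """Consumer side: complete one wait operation."""
        with self._cond:
            while self._value <= 0:
                self._cond.wait()
            self._value -= 1
            self._current_waits += 1
            # Once the current wait count reaches the target, every
            # consumer has observed the semaphore, so reset both the
            # value and the wait count to their initial state for reuse.
            if self._current_waits >= self._target_waits:
                self._value = self._initial_value
                self._current_waits = 0
```

The point of the self-reset is that no separate coordination round is needed between reuses: the last consumer to complete its wait restores the initial state as a side effect.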

    Providing flexible matrix processors for performing neural network convolution in matrix-processor-based devices

    Publication No.: US10936943B2

    Publication Date: 2021-03-02

    Application No.: US16117952

    Filing Date: 2018-08-30

    Abstract: Providing flexible matrix processors for performing neural network convolution in matrix-processor-based devices is disclosed. In this regard, a matrix-processor-based device provides a central processing unit (CPU) and a matrix processor. The matrix processor reorganizes a plurality of weight matrices and a plurality of input matrices into swizzled weight matrices and swizzled input matrices, respectively, that have regular dimensions natively supported by the matrix processor. The matrix-processor-based device then performs a convolution operation using the matrix processor to perform matrix multiplication/accumulation operations for the regular dimensions of the weight matrices and the input matrices, and further uses the CPU to execute instructions for handling the irregular dimensions of the weight matrices and the input matrices (e.g., by executing a series of nested loops, as a non-limiting example). The matrix-processor-based device thus provides efficient hardware acceleration by taking advantage of dimensional regularity, while maintaining the flexibility to handle different variations of convolution.
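The regular/irregular split the abstract describes can be illustrated with a tiled matrix multiply: pad ("swizzle") the operands up to the tile shape a matrix processor natively supports, process whole tiles, and trim the padding afterward. This is a software sketch only; the tile size and the zero-padding strategy are our assumptions, and the actual device dispatches tile work to a hardware matrix processor while a CPU handles the irregular remainder.

```python
def pad_to_tile(matrix, tile):
    """Pad a matrix so both dimensions are multiples of `tile` (a stand-in
    for reorganizing operands into natively supported regular dimensions)."""
    rows, cols = len(matrix), len(matrix[0])
    pad_rows = (-rows) % tile
    pad_cols = (-cols) % tile
    out = [row + [0] * pad_cols for row in matrix]
    out += [[0] * (cols + pad_cols) for _ in range(pad_rows)]
    return out


def tiled_matmul(a, b, tile=4):
    """Multiply a (m x k) by b (k x n) one tile at a time."""
    m, n = len(a), len(b[0])
    ap, bp = pad_to_tile(a, tile), pad_to_tile(b, tile)
    mp, kp, np_ = len(ap), len(bp), len(bp[0])
    c = [[0] * np_ for _ in range(mp)]
    for i0 in range(0, mp, tile):        # each (i0, j0, l0) block below is
        for j0 in range(0, np_, tile):   # what one matrix-processor
            for l0 in range(0, kp, tile):  # multiply/accumulate would cover
                for i in range(i0, i0 + tile):
                    for j in range(j0, j0 + tile):
                        c[i][j] += sum(ap[i][l] * bp[l][j]
                                       for l in range(l0, l0 + tile))
    return [row[:n] for row in c[:m]]    # trim padding from the result
```

Zero-padding trades a little wasted arithmetic for fully regular tile shapes; the patent instead keeps the irregular fringe on the CPU, which avoids that waste at the cost of a second code path.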

    Providing efficient floating-point operations using matrix processors in processor-based systems

    Publication No.: US10747501B2

    Publication Date: 2020-08-18

    Application No.: US16118099

    Filing Date: 2018-08-30

    Abstract: Providing efficient floating-point operations using matrix processors in processor-based systems is disclosed. In this regard, a matrix-processor-based device provides a matrix processor comprising a positive partial sum accumulator and a negative partial sum accumulator. As the matrix processor processes pairs of floating-point operands, the matrix processor calculates an intermediate product based on a first floating-point operand and a second floating-point operand and determines a sign of the intermediate product. Based on the sign, the matrix processor normalizes the intermediate product with a partial sum fraction of the positive partial sum accumulator or the negative partial sum accumulator, then adds the intermediate product to the positive sum accumulator or the negative sum accumulator. After processing all pairs of floating-point operands, the matrix processor subtracts the negative partial sum accumulator from the positive partial sum accumulator to generate a final sum, then renormalizes the final sum a single time.
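The high-level idea in the abstract (route each product to a positive or negative partial sum accumulator by sign, subtract once at the end, renormalize once) can be sketched as a dot product. This is only an illustration of the accumulation strategy: the hardware operates on fraction and exponent fields and performs normalization explicitly, whereas Python floats here stand in for both accumulators.

```python
def dual_accumulator_dot(a, b):
    """Dot product using separate positive and negative partial sums.

    Sketch of the described strategy: each intermediate product is added
    to one of two accumulators by sign, and a single final subtraction
    produces the result (which hardware would then renormalize once).
    """
    pos_sum = 0.0  # positive partial sum accumulator
    neg_sum = 0.0  # negative partial sum accumulator (stores magnitudes)
    for x, y in zip(a, b):
        product = x * y            # intermediate product
        if product >= 0:           # sign of the intermediate product
            pos_sum += product
        else:
            neg_sum += -product
    # Single subtraction after all operand pairs are processed.
    return pos_sum - neg_sum
```

Keeping same-sign values in each accumulator means every addition is an addition of like signs, which is what lets the hardware defer cancellation handling and renormalization to one final step instead of performing them per product.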
