-
Publication Number: US20230195419A1
Publication Date: 2023-06-22
Application Number: US17554024
Application Date: 2021-12-17
Applicant: Arm Limited
Inventor: Dibakar Gope , Jesse Garrett Beu , Milos Milosavljevic
CPC classification number: G06F7/5443 , G06N3/04 , G06F2207/4824
Abstract: A neural network system, method and apparatus are provided. A truth table matrix, an index vector and an input data tensor are read from a memory. At least a portion of the input data tensor is flattened into an input data vector. A scatter accumulate instruction is executed on the index vector and the input data vector to generate an intermediate vector. The truth table matrix and the intermediate vector are then multiplied to generate an output data vector.
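The pipeline in this abstract can be modeled in a few lines of NumPy; the shapes, function names, and sample values below are illustrative assumptions, not taken from the patent:

```python
import numpy as np

def scatter_accumulate(index_vec, data_vec, out_len):
    """Accumulate data_vec entries into an intermediate vector at the
    positions given by index_vec; duplicate indices accumulate."""
    intermediate = np.zeros(out_len)
    np.add.at(intermediate, index_vec, data_vec)  # scatter-accumulate
    return intermediate

def truth_table_layer(truth_table, index_vec, input_tensor):
    """Flatten the input tensor into a vector, scatter-accumulate it
    via index_vec, then multiply by the truth table matrix."""
    x = input_tensor.reshape(-1)                  # flatten to a vector
    inter = scatter_accumulate(index_vec, x, truth_table.shape[1])
    return truth_table @ inter                    # output data vector

# Tiny example with made-up shapes.
T = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
idx = np.array([0, 2, 2, 1])
x_in = np.array([[1.0, 2.0], [3.0, 4.0]])
y = truth_table_layer(T, idx, x_in)
```

Here `np.add.at` stands in for the scatter accumulate instruction: unlike plain fancy-index assignment, it sums every contribution that maps to the same index.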
-
Publication Number: US20220308999A1
Publication Date: 2022-09-29
Application Number: US17215435
Application Date: 2021-03-29
Applicant: Arm Limited
Inventor: Joshua Randall , Jamshed Jalal , Tushar P. Ringe , Jesse Garrett Beu
IPC: G06F12/0831 , G06F12/084 , G06F12/0891
Abstract: An apparatus comprises snoop filter storage circuitry to store snoop filter entries corresponding to addresses and comprising sharer information. Control circuitry selects which sharers, among a plurality of sharers capable of holding cached data, should be issued with snoop requests corresponding to a target address, based on the sharer information of the snoop filter entry corresponding to the target address. The control circuitry is capable of setting a given snoop filter entry corresponding to a given address to an imprecise encoding in which the sharer information provides an imprecise description of which sharers hold cached data corresponding to the given address, and the given snoop filter entry comprises at least one sharer count value indicative of a number of sharers holding cached data corresponding to said address.
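The precise/imprecise distinction can be sketched as a small Python class; the capacity threshold, field names, and the choice to snoop all sharers under the imprecise encoding are illustrative assumptions:

```python
class SnoopFilterEntry:
    """Sketch of a snoop filter entry that tracks sharers exactly until
    capacity is exceeded, then falls back to an imprecise encoding that
    keeps only a sharer count."""
    def __init__(self, num_sharers, precise_capacity=4):
        self.precise = True
        self.sharers = set()        # exact sharer IDs while precise
        self.sharer_count = 0       # count kept under imprecise encoding
        self.num_sharers = num_sharers
        self.capacity = precise_capacity

    def add_sharer(self, sharer_id):
        if self.precise:
            self.sharers.add(sharer_id)
            if len(self.sharers) > self.capacity:
                # Too many sharers to track exactly: switch to the
                # imprecise encoding, retaining only a count.
                self.precise = False
                self.sharer_count = len(self.sharers)
                self.sharers = None
        else:
            self.sharer_count = min(self.sharer_count + 1, self.num_sharers)

    def snoop_targets(self):
        """Select which sharers should be issued snoop requests."""
        if self.precise:
            return sorted(self.sharers)   # snoop exactly the tracked sharers
        # Imprecise: over-approximate by snooping every possible sharer.
        return list(range(self.num_sharers))
```

The sharer count still carries useful information even when the sharer set is imprecise, e.g. for deciding when an entry can be invalidated.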
-
Publication Number: US20210089889A1
Publication Date: 2021-03-25
Application Number: US16836117
Application Date: 2020-03-31
Applicant: Arm Limited
Inventor: Dibakar Gope , Jesse Garrett Beu , Paul Nicholas Whatmough , Matthew Mattina
IPC: G06N3/08
Abstract: The present disclosure advantageously provides a mixed precision computation (MPC) unit for executing one or more mixed-precision layers of an artificial neural network (ANN). The MPC unit includes a multiplier circuit configured to input a pair of operands and output a product, a first adder circuit coupled to the multiplier circuit, a second adder circuit, coupled to the first adder circuit, configured to input a pair of operands, an accumulator circuit, coupled to the multiplier circuit and the first adder circuit, configured to output an accumulated value, and a controller, coupled to the multiplier circuit, the first adder circuit, the second adder circuit and the accumulator circuit, configured to input a mode control signal. The controller has a plurality of operating modes including a high precision mode, a low precision add mode and a low precision multiply mode.
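A behavioral model of the three operating modes might look as follows; the exact datapath wiring per mode is an assumption for illustration, not the patent's circuit:

```python
from enum import Enum

class Mode(Enum):
    HIGH_PRECISION = 0     # one wide multiply-accumulate
    LOW_PRECISION_ADD = 1  # second adder contributes an extra sum
    LOW_PRECISION_MUL = 2  # datapath reused for a second product

def mpc_step(mode, acc, a, b, c=0, d=0):
    """One step of a hypothetical mixed-precision compute unit:
    multiplier -> first adder -> accumulator, with the second adder
    engaged in the low precision add mode."""
    if mode is Mode.HIGH_PRECISION:
        return acc + a * b                 # standard MAC
    if mode is Mode.LOW_PRECISION_ADD:
        return acc + (a * b) + (c + d)     # second adder sums c and d
    if mode is Mode.LOW_PRECISION_MUL:
        return acc + (a * b) + (c * d)     # two low-precision products
    raise ValueError(mode)
```

The mode control signal in the abstract corresponds to the `mode` argument here: one unit serves high-precision layers and low-precision layers without duplicating the multiplier.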
-
Publication Number: US10678506B2
Publication Date: 2020-06-09
Application Number: US15665715
Application Date: 2017-08-01
Applicant: ARM Limited
Inventor: Alejandro Martinez Vicente , Jesse Garrett Beu , Mbou Eyole , Timothy Hayes
Abstract: An apparatus and a method of operating the apparatus are provided for performing a comparison operation to match a given sequence of values within an input vector. Instruction decoder circuitry is responsive to a string match instruction specifying a segment of an input vector to generate control signals to control the data processing circuitry to perform a comparison operation. The comparison operation determines a comparison value indicative of whether each input element of a required set of consecutive input elements of the segment has a value which matches a respective value in consecutive reference elements of the reference data item. A plurality of comparison operations may be performed to determine a match vector corresponding to the segment of the input vector to indicate the start position of the substring in the input vector. A string match instruction, as well as simulator virtual machine implementations, are also provided.
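A scalar reference model of the comparison the instruction performs can be written in a few lines; this is a functional sketch, not the vectorized hardware implementation:

```python
def string_match(input_vec, reference):
    """For each start position in input_vec, check whether the
    consecutive input elements match the consecutive reference
    elements, and set the corresponding bit in the match vector
    (a scalar model of the vector string match operation)."""
    n, m = len(input_vec), len(reference)
    match = [0] * n
    for start in range(n - m + 1):
        if all(input_vec[start + j] == reference[j] for j in range(m)):
            match[start] = 1  # the substring begins at this element
    return match
```

The resulting match vector indicates the start position of the substring within the input vector, which is what the instruction's per-segment comparison values combine to produce.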
-
Publication Number: US20230367843A1
Publication Date: 2023-11-16
Application Number: US17743705
Application Date: 2022-05-13
Applicant: Arm Limited
Inventor: Joshua Randall , Jesse Garrett Beu , Krishnendra Nathella , Tuan Quang Ta
Abstract: A data processing method and processor instructions are provided that leverage scatter operations to efficiently merge vector and matrix indices, as compared to standard matrix and vector operations, as well as merge other arithmetic results, lists of numbers, etc.
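One way a scatter operation can merge sorted index vectors is via a presence bitmap; this NumPy sketch is an assumed illustration of the general idea, not the patented instruction sequence:

```python
import numpy as np

def merge_indices(idx_a, idx_b, universe):
    """Merge two sorted index vectors by scattering ones into a
    presence bitmap and gathering the set positions back out,
    instead of a sequential compare-and-step merge loop."""
    present = np.zeros(universe, dtype=bool)
    present[idx_a] = True           # scatter the first index vector
    present[idx_b] = True           # scatter the second index vector
    return np.flatnonzero(present)  # sorted, deduplicated union
```

Because the two scatters have no cross-element dependencies, they vectorize naturally, which is the efficiency contrast with a standard element-by-element merge.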
-
Publication Number: US11567870B2
Publication Date: 2023-01-31
Application Number: US17215435
Application Date: 2021-03-29
Applicant: Arm Limited
Inventor: Joshua Randall , Jamshed Jalal , Tushar P. Ringe , Jesse Garrett Beu
IPC: G06F12/08 , G06F12/0831 , G06F12/0891 , G06F12/084
Abstract: An apparatus comprises snoop filter storage circuitry to store snoop filter entries corresponding to addresses and comprising sharer information. Control circuitry selects which sharers, among a plurality of sharers capable of holding cached data, should be issued with snoop requests corresponding to a target address, based on the sharer information of the snoop filter entry corresponding to the target address. The control circuitry is capable of setting a given snoop filter entry corresponding to a given address to an imprecise encoding in which the sharer information provides an imprecise description of which sharers hold cached data corresponding to the given address, and the given snoop filter entry comprises at least one sharer count value indicative of a number of sharers holding cached data corresponding to the given address.
-
Publication Number: US20210089888A1
Publication Date: 2021-03-25
Application Number: US16836110
Application Date: 2020-03-31
Applicant: Arm Limited
Inventor: Dibakar Gope , Jesse Garrett Beu , Paul Nicholas Whatmough , Matthew Mattina
Abstract: The present disclosure advantageously provides a system including a memory, a processor, and a circuitry to execute one or more mixed precision layers of an artificial neural network (ANN), each mixed precision layer including high-precision weight filters and low precision weight filters. The circuitry is configured to perform one or more calculations on an input feature map having a plurality of input channels (cin) using the high precision weight filters to create a high precision output feature map having a first number of output channels (k), perform one or more calculations on the input feature map using the low precision weight filters to create a low precision output feature map having a second number of output channels (cout−k), and concatenate the high precision output feature map and the low precision output feature map to create a unified output feature map having a plurality of output channels (cout).
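The channel split and concatenation can be sketched with a 1x1-convolution layer in NumPy; the toy quantization scheme and shapes are assumptions for illustration:

```python
import numpy as np

def mixed_precision_conv1x1(x, w_high, w_low):
    """Sketch of a mixed-precision layer: k output channels come from
    high precision filters, the remaining cout - k channels from
    filters quantized to low precision, and the two partial feature
    maps are concatenated into one output feature map."""
    # x: (cin, H, W); w_high: (k, cin); w_low: (cout - k, cin)
    w_low_q = np.round(w_low * 4) / 4        # toy 2-fractional-bit quantization
    high = np.tensordot(w_high, x, axes=([1], [0]))   # (k, H, W)
    low = np.tensordot(w_low_q, x, axes=([1], [0]))   # (cout - k, H, W)
    return np.concatenate([high, low], axis=0)        # (cout, H, W)
```

Downstream layers see a single feature map with `cout` channels, so the precision split stays internal to the layer.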
-
Publication Number: US20220405597A1
Publication Date: 2022-12-22
Application Number: US17349780
Application Date: 2021-06-16
Applicant: Arm Limited
Inventor: Urmish Ajit Thakker , Jesse Garrett Beu , Dibakar Gope , Mark John O'Connor
Abstract: Example methods, apparatuses, and/or articles of manufacture are disclosed that may be implemented, in whole or in part, using one or more computing devices to adapt a computing device to classify physical features in a deployment environment. In a particular implementation, computing resources may be selectively de-allocated from at least one of one or more elements of a computing architecture based, at least in part, on assessed impacts to the one or more elements of the computing architecture.
-
Publication Number: US11151039B2
Publication Date: 2021-10-19
Application Number: US16821271
Application Date: 2020-03-17
Applicant: Arm Limited
Inventor: Joshua Randall , Jesse Garrett Beu
IPC: G06F12/0815 , G06F12/0895 , G06F12/14 , G06F12/0884 , G06F12/02
Abstract: An apparatus is provided for receiving requests from a plurality of processing units, at least some of which may have associated cache storage. A snoop unit implements a cache coherency protocol when a request received by the apparatus identifies a cacheable memory address. Snoop filter storage is provided comprising an N-way set associative storage structure with a plurality of entries. Each entry stores coherence data for an associated address range identifying a memory block, and the coherence data is used to determine which cache storages need to be subjected to a snoop operation when implementing the cache coherency protocol in response to a received request. The snoop filter storage stores coherence data for memory blocks of at least a plurality P of different size granularities, and is organised as a plurality of at least P banks that are accessible in parallel, where each bank has entries within each of the N-ways of the snoop filter storage. The snoop control circuitry controls access to the snoop filter storage, and is responsive to a received address to create a group of indexes, the group of indexes comprising an index for each different size granularity amongst the P different size granularities, and each index in the group being constrained so as to identify an entry in a different bank of the snoop filter storage. The snoop control circuitry uses the group of indexes to perform a lookup operation in parallel within the snoop filter storage in order to determine, taking into account each of the different size granularities, whether an entry stores coherence data for the received address.
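The index-group construction can be sketched as follows; fixing the bank by granularity position is a simplifying assumption that satisfies the abstract's constraint (each index lands in a different bank), not necessarily the patented mapping:

```python
def index_group(address, num_sets_per_bank, granularity_bits):
    """Sketch of forming a group of snoop-filter indexes: one index per
    memory-block size granularity, each constrained to a different bank
    so that all P lookups can proceed in parallel."""
    group = []
    for bank, g in enumerate(granularity_bits):
        block = address >> g              # drop offset bits for this granularity
        set_index = block % num_sets_per_bank
        # The bank is fixed per granularity, so the P indexes in the
        # group never collide on a bank.
        group.append((bank, set_index))
    return group
```

With one index per granularity and one bank per index, a single parallel lookup answers whether any granularity's entry covers the received address.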
-
Publication Number: US20210056422A1
Publication Date: 2021-02-25
Application Number: US16855681
Application Date: 2020-04-22
Applicant: Arm Limited
Inventor: Urmish Ajit Thakker , Jin Tao , Ganesh Suryanarayan Dasika , Jesse Garrett Beu
Abstract: The present disclosure advantageously provides a system and a method for skipping recurrent neural network (RNN) state updates using a skip predictor. Sequential input data are received and divided into sequences of input data values, each input data value being associated with a different time step for a pre-trained RNN model. At each time step, the hidden state vector for a prior time step is received from the pre-trained RNN model, and a determination, based on the input data value and the hidden state vector for at least one prior time step, is made whether to provide or not provide the input data value associated with the time step to the pre-trained RNN model for processing. When the input data value is not provided, the pre-trained RNN model does not update its hidden state vector. Importantly, the skip predictor is trained without retraining the pre-trained RNN model.
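The control flow of the skip predictor wrapping a frozen RNN can be sketched in a few lines; the toy cell and threshold predictor below are illustrative stand-ins, not the trained models:

```python
def run_with_skip(rnn_cell, skip_predictor, inputs, h0):
    """Sketch of skipping RNN state updates: at each time step a
    separate predictor decides, from the input value and the prior
    hidden state, whether to run the RNN cell or to carry the hidden
    state forward unchanged. The RNN cell itself is never retrained."""
    h = h0
    for x in inputs:
        if skip_predictor(x, h):
            continue                  # skip: hidden state not updated
        h = rnn_cell(x, h)            # process: normal state update
    return h

# Toy components: the cell accumulates the input; the predictor skips
# inputs whose magnitude falls below a threshold.
cell = lambda x, h: h + x
predictor = lambda x, h: abs(x) < 0.5
```

Because skipped steps bypass the cell entirely, the saving scales with the fraction of time steps the predictor classifies as uninformative.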