-
31.
Publication Number: US20190190538A1
Publication Date: 2019-06-20
Application Number: US15846110
Filing Date: 2017-12-18
Applicant: Facebook, Inc.
Inventor: Jong Soo Park , Nadav Rotem , Mikhail Smelyanskiy , Abdulkadir Utku Diril
CPC classification number: H03M7/6011 , G06N3/02 , H03M7/70
Abstract: A system may include a memory device that stores parameters of a layer of a neural network that have been compressed. The system may also include a special-purpose hardware processing unit programmed to, for the layer of the neural network: (1) receive the compressed parameters from the memory device, (2) decompress the compressed parameters, and (3) apply the decompressed parameters in an arithmetic operation of the layer of the neural network. Various other methods, systems, and accelerators are also disclosed.
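The decompress-then-apply flow above can be sketched in software. This is a minimal Python sketch, not the patent's hardware implementation: it assumes a simple int8 quantization scheme as the compression format, and all function names are hypothetical.

```python
import numpy as np

def compress_params(weights, num_bits=8):
    # hypothetical compression: quantize float weights to int8 plus one scale
    qmax = 2 ** (num_bits - 1) - 1
    scale = float(np.max(np.abs(weights))) / qmax
    q = np.round(weights / scale).astype(np.int8)
    return q, scale

def decompress_params(q, scale):
    # recover approximate float weights from the compressed form
    return q.astype(np.float32) * scale

def layer_forward(x, compressed):
    # (1) receive compressed parameters, (2) decompress them,
    # (3) apply them in the layer's arithmetic operation (here, a matmul)
    q, scale = compressed
    w = decompress_params(q, scale)
    return x @ w

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 3)).astype(np.float32)
x = rng.standard_normal((2, 4)).astype(np.float32)
y = layer_forward(x, compress_params(w))
```

Keeping parameters compressed in memory and decompressing per layer trades a little arithmetic for reduced memory bandwidth, which is the motivation the abstract describes.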
-
32.
Publication Number: US20210182196A1
Publication Date: 2021-06-17
Application Number: US16717998
Filing Date: 2019-12-17
Applicant: Facebook, Inc.
Inventor: Olivia Wu , Abdulkadir Utku Diril , Krishnakumar Narayanan Nair , Aravind Kalaiah , Anup Ramesh Kadkol , Pankaj Kansal
IPC: G06F12/0813 , G06F13/16 , G06N3/02
Abstract: A system comprises a processor coupled to a plurality of memory units. Each of the plurality of memory units includes a request processing unit and a plurality of memory banks. Each request processing unit includes a plurality of decomposition units and a crossbar switch, the crossbar switch communicatively connecting each of the plurality of decomposition units to each of the plurality of memory banks. The processor includes a plurality of processing elements and a communication network communicatively connecting the plurality of processing elements to the plurality of memory units. At least a first processing element of the plurality of processing elements includes a control logic unit and a matrix compute engine. The control logic unit is configured to access the plurality of memory units using a dynamically programmable distribution scheme.
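The "dynamically programmable distribution scheme" can be illustrated with a toy model: a programmable function maps addresses onto memory units (here, block interleaving, chosen as an assumption; the class names and parameters are hypothetical, not from the patent).

```python
class MemoryUnit:
    """Toy memory unit holding an address -> value store."""
    def __init__(self):
        self.store = {}

def make_distribution(num_units, block_size):
    # programmable scheme: interleave fixed-size address blocks across units
    def unit_for(addr):
        return (addr // block_size) % num_units
    return unit_for

def write(units, unit_for, addr, value):
    units[unit_for(addr)].store[addr] = value

def read(units, unit_for, addr):
    return units[unit_for(addr)].store[addr]

units = [MemoryUnit() for _ in range(4)]
unit_for = make_distribution(num_units=4, block_size=8)
for a in range(32):
    write(units, unit_for, a, a * 10)
```

Because the mapping is a parameter rather than fixed wiring, a processing element can re-program it to match a workload's access pattern, which is the flexibility the abstract claims.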
-
33.
Publication Number: US11023240B1
Publication Date: 2021-06-01
Application Number: US16692899
Filing Date: 2019-11-22
Applicant: Facebook, Inc.
Inventor: Nadav Rotem , Jong Soo Park , Zhaoxia Deng , Abdulkadir Utku Diril , Mikhail Smelyanskiy , Roman Dzhabarov , James Hegeman
Abstract: The disclosed computer-implemented method may include receiving an input value and a floating-point scaling factor and determining (1) an integer scaling factor based on the floating-point scaling factor, (2) a pre-scaling adjustment value representative of a number of places by which to shift a binary representation of the input value prior to a scaling operation, and (3) a post-scaling adjustment value representative of a number of places by which to shift the binary representation of the input value following the scaling operation. The method may further include calculating a scaled result value by (1) shifting rightwards the binary representation of the input value by the pre-scaling adjustment value, (2) scaling the shifted binary representation of the input value by the integer scaling factor, and (3) shifting rightwards the shifted and scaled binary value by the post-scaling adjustment value. Various other methods, systems, and computer-readable media are also disclosed.
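The shift/multiply/shift pipeline above amounts to replacing a float multiply with integer fixed-point arithmetic. A minimal Python sketch of one plausible decomposition (using `math.frexp`; the function names and the 15-bit multiplier width are assumptions, not the patent's exact scheme):

```python
import math

def scaling_params(fscale, mult_bits=15, pre_shift=0):
    # decompose a positive float scale so that
    #   x * fscale  ~=  ((x >> pre_shift) * M) >> post_shift
    m, e = math.frexp(fscale)            # fscale = m * 2**e, 0.5 <= m < 1
    M = round(m * (1 << mult_bits))      # integer scaling factor
    post_shift = mult_bits - e - pre_shift
    assert post_shift >= 0, "scale too large for this pre_shift"
    return pre_shift, M, post_shift

def scale_int(x, fscale, mult_bits=15, pre_shift=0):
    pre, M, post = scaling_params(fscale, mult_bits, pre_shift)
    # (1) pre-scaling right shift, (2) integer scaling,
    # (3) post-scaling right shift
    return ((x >> pre) * M) >> post
```

A nonzero `pre_shift` discards low bits before the multiply, which keeps the intermediate product within a fixed register width for large inputs at the cost of some precision.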
-
34.
Publication Number: US20210124794A1
Publication Date: 2021-04-29
Application Number: US16667791
Filing Date: 2019-10-29
Applicant: Facebook, Inc.
Inventor: Krishnakumar Narayanan Nair , Olivia Wu , Ehsan Khish Ardestani Zadeh , Abdulkadir Utku Diril , Thomas Mark Ulrich , Yuchen Hao , Rakesh Komuravelli , Aravind Kalaiah
Abstract: A system comprises a data input vector unit, a weight input vector unit, and a plurality of calculation units of a matrix processor unit. The data input vector unit is configured to concurrently receive elements of different rows of a first and second data matrix. The weight input vector unit is configured to receive a combined weight vector and at least in part concurrently provide obtained weight elements of a first and second weight matrix to a corresponding first and second group of calculation units. Each calculation unit of the first and second group of calculation units is configured to multiply elements from the data input vector unit with elements of the corresponding weight matrix from the weight input vector unit and sum together multiplication results of the corresponding calculation unit to at least in part determine a corresponding element in a first or second convolution result matrix.
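The combined-weight-vector idea can be sketched in NumPy: one vector packs two weight matrices, which are unpacked and routed to two groups of "calculation units," each performing the multiply-and-sum for one result element. This is a software toy of the datapath, with hypothetical names and a 3x3 kernel size as an assumption.

```python
import numpy as np

def dual_conv_element(tile_a, tile_b, combined_weights, k=3):
    # unpack one combined weight vector into two k x k weight matrices
    w1 = combined_weights[: k * k].reshape(k, k)   # for the first group
    w2 = combined_weights[k * k :].reshape(k, k)   # for the second group
    # each group multiplies its data elements with its weight matrix and
    # sums the products into one convolution-result element
    return float(np.sum(tile_a * w1)), float(np.sum(tile_b * w2))

combined = np.arange(18.0)     # two 3x3 kernels packed into one vector
ones = np.ones((3, 3))
out1, out2 = dual_conv_element(ones, ones, combined)
```

Sharing one weight input vector across two concurrent convolutions is what lets the matrix processor keep both groups of calculation units busy from a single weight fetch.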
-
35.
Publication Number: US10699190B1
Publication Date: 2020-06-30
Application Number: US15911120
Filing Date: 2018-03-04
Applicant: Facebook, Inc.
Inventor: Nadav Rotem , Abdulkadir Utku Diril , Mikhail Smelyanskiy , Jong Soo Park , Christopher Dewan
Abstract: The disclosed computer-implemented method for efficiently updating neural networks may include (i) identifying a neural network that comprises sets of interconnected nodes represented at least in part by a plurality of matrices and that is trained on a training computing device and executes on at least one endpoint device, (ii) constraining a training session for the neural network to reduce the size in memory of the difference between the previous values of the matrices prior to the training session and the new values of the matrices after the training session, (iii) creating a delta update for the neural network that describes the difference between the previous values and the new values, and (iv) updating the neural network on the endpoint device to the new state by sending the delta update from the training computing device to the endpoint computing device. Various other methods, systems, and computer-readable media are also disclosed.
-
Publication Number: US10579383B1
Publication Date: 2020-03-03
Application Number: US15992793
Filing Date: 2018-05-30
Applicant: Facebook, Inc.
Inventor: Nadav Rotem , Jong Soo Park , Zhaoxia Deng , Abdulkadir Utku Diril , Mikhail Smelyanskiy , Roman Dzhabarov , James Wesley Hegeman
Abstract: The disclosed computer-implemented method may include receiving an input value and a floating-point scaling factor and determining (1) an integer scaling factor based on the floating-point scaling factor, (2) a pre-scaling adjustment value representative of a number of places by which to shift a binary representation of the input value prior to a scaling operation, and (3) a post-scaling adjustment value representative of a number of places by which to shift the binary representation of the input value following the scaling operation. The method may further include calculating a scaled result value by (1) shifting rightwards the binary representation of the input value by the pre-scaling adjustment value, (2) scaling the shifted binary representation of the input value by the integer scaling factor, and (3) shifting rightwards the shifted and scaled binary value by the post-scaling adjustment value. Various other methods, systems, and computer-readable media are also disclosed.
-
Publication Number: US10553207B2
Publication Date: 2020-02-04
Application Number: US15857990
Filing Date: 2017-12-29
Applicant: Facebook, Inc.
Inventor: Nadav Rotem , Abdulkadir Utku Diril , Mikhail Smelyanskiy , Jong Soo Park , James Kenneth Reed
Abstract: The disclosed method may include (1) determining whether a next operation of a plurality of operations of a computational model is dependent upon a Boolean predication value, (2) based on the next operation not being dependent on the Boolean predication value, performing the next operation, where a state of the computational model is updated as a result of performing the next operation, and (3) based on the next operation being dependent on the Boolean predication value, performing at least one of (a) allowing, based on the Boolean predication value being a first value, the next operation to update the state of the computational model, and (b) preventing, based on the Boolean predication value being a second value different from the first value, the next operation from updating the state of the computational model. Various other methods and systems are also disclosed.
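The predication logic above can be modeled as a tiny interpreter: each operation either always updates the model state, or is gated on a named Boolean predication value. All names here are hypothetical; this is a software illustration, not the disclosed system itself.

```python
def run_model(ops, state):
    # ops: list of (update_fn, predicate_name_or_None)
    for update, pred in ops:
        if pred is not None and not state.get(pred, False):
            continue            # second value: prevent the state update
        state = update(state)   # not predicated, or first value: allow it
    return state

ops = [
    (lambda s: {**s, "x": s["x"] + 1}, None),   # unconditional operation
    (lambda s: {**s, "p": s["x"] > 1}, None),   # computes the predicate
    (lambda s: {**s, "x": s["x"] * 10}, "p"),   # predicated on "p"
]
```

Gating only the *state update* (rather than branching) keeps the operation stream uniform, which suits hardware pipelines that dislike control-flow divergence.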
-
38.
Publication Number: US10482156B2
Publication Date: 2019-11-19
Application Number: US15857918
Filing Date: 2017-12-29
Applicant: Facebook, Inc.
Inventor: Abdulkadir Utku Diril , Jong Soo Park , Nadav Rotem , Mikhail Smelyanskiy
Abstract: A special-purpose, hardware-based accelerator may include an input subsystem configured to receive first and second vectors as operands of a full dot-product operation. The accelerator may also include a sparsity-aware dot-product engine communicatively coupled to the input subsystem and configured to perform adaptive dot-product processing by: (1) identifying, within the first and second vectors, at least one zero-value element and (2) executing, in response to identifying the zero-value element, a reduced dot-product operation that excludes, relative to the full dot-product operation, at least one mathematical operation in which the zero-value element is an operand. The accelerator may also include an output subsystem that is communicatively coupled to the sparsity-aware dot-product engine and configured to send a result of the reduced dot-product operation to a storage subsystem. Various other accelerators, computing systems, and methods are also disclosed.
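The reduced dot-product operation can be sketched directly: scan both operand vectors, and elide every multiply-accumulate in which either element is zero. A minimal Python model of the sparsity-aware engine (the returned elision count is added here for illustration):

```python
def sparse_dot(a, b):
    # full dot product, reduced by skipping multiplications
    # in which a zero-value element is an operand
    total, elided = 0, 0
    for x, y in zip(a, b):
        if x == 0 or y == 0:
            elided += 1        # zero operand: elide this multiplication
        else:
            total += x * y
    return total, elided
```

The result is bit-identical to the full dot product, since the skipped terms contribute zero; the savings come from not spending cycles (or energy) computing them.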
-
39.
Publication Number: US20190187775A1
Publication Date: 2019-06-20
Application Number: US15846117
Filing Date: 2017-12-18
Applicant: Facebook, Inc.
Inventor: Nadav Rotem , Jong Soo Park , Mikhail Smelyanskiy , Abdulkadir Utku Diril
Abstract: A computer-implemented method for dynamically managing the power usage and/or performance of an artificial intelligence (AI) hardware accelerator may include (1) receiving an instruction stream that includes one or more instructions for performing at least one AI-specific computing task, (2) identifying a plurality of special-purpose, hardware-based functional units configured to perform AI-specific computing tasks, (3) predicting, based on an analysis of at least a portion of the instruction stream, a power-usage requirement for at least one of the functional units when executing the instruction stream, and then (4) modifying, based on the power-usage requirement, the power supplied to at least one of the functional units. Various other methods and systems are also disclosed.
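Steps (3) and (4) can be illustrated with a toy model: count which functional units a window of the instruction stream exercises, predict per-unit power demand, and scale the supply when the prediction exceeds a budget. The unit names and wattages below are invented for illustration only.

```python
# hypothetical watts drawn by each functional unit while active
UNIT_POWER = {"matmul": 3.0, "vector": 1.5, "scalar": 0.5}

def predict_power(instruction_stream):
    # scan a window of (opcode, functional_unit) pairs and predict
    # each unit's power-usage requirement for that window
    demand = {u: 0.0 for u in UNIT_POWER}
    for _opcode, unit in instruction_stream:
        demand[unit] += UNIT_POWER[unit]
    return demand

def modify_supply(demand, budget):
    # scale supplied power down proportionally when the predicted
    # total exceeds the accelerator's power budget
    total = sum(demand.values())
    if total <= budget:
        return demand
    factor = budget / total
    return {u: p * factor for u, p in demand.items()}

window = [("mm", "matmul"), ("mm", "matmul"), ("add", "vector")]
demand = predict_power(window)
```

Predicting from the instruction stream (rather than reacting to measured draw) lets the power adjustment happen before the demanding instructions execute.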
-
40.
Publication Number: US20190179869A1
Publication Date: 2019-06-13
Application Number: US15839229
Filing Date: 2017-12-12
Applicant: Facebook, Inc.
Inventor: Jong Soo Park , Nadav Rotem , Mikhail Smelyanskiy , Abdulkadir Utku Diril
IPC: G06F17/16 , G06F7/523 , G06F15/18 , G06N3/02 , G06F12/0875
CPC classification number: G06F17/16 , G06F7/523 , G06F12/0875 , G06F17/156 , G06N3/02 , G06N20/00
Abstract: A special-purpose hardware accelerator may include a cache configured to store an input matrix related to performing a convolution operation and a matrix-multiplication subsystem pre-configured with matrix-transform coefficients for performing matrix-transform operations. The matrix-multiplication subsystem may perform the convolution operation by (1) reading the input matrix from the cache, (2) transforming the input matrix via matrix multiplication, (3) transforming, via matrix multiplication, a parameter matrix that includes convolution parameters for performing the convolution operation, (4) applying the transformed parameter matrix to the transformed input matrix via an element-wise multiplication operation, and then (5) performing an inverse-transformation operation on the results of the element-wise multiplication operation to create an output matrix for the convolution operation. Various other systems and methods are also disclosed.
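The transform / element-wise multiply / inverse-transform pipeline the abstract describes matches the well-known Winograd convolution. A 1-D F(2,3) sketch in NumPy with the standard published transform coefficients (the patent's 2-D tiles apply the same matrices along rows and columns; this simplification is an assumption for brevity):

```python
import numpy as np

# published Winograd F(2,3) transform coefficients
BT = np.array([[1,  0, -1,  0],
               [0,  1,  1,  0],
               [0, -1,  1,  0],
               [0,  1,  0, -1]], dtype=float)   # input transform
G = np.array([[1.0,  0.0, 0.0],
              [0.5,  0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0,  0.0, 1.0]])                # parameter (filter) transform
AT = np.array([[1, 1,  1,  0],
               [0, 1, -1, -1]], dtype=float)    # inverse transform

def winograd_f23(d, g):
    # convolve a 3-tap filter g over a 4-element input tile d:
    # transform both operands, multiply element-wise, inverse-transform
    return AT @ ((G @ g) * (BT @ d))
```

The payoff is fewer multiplications than direct convolution (4 instead of 6 here), at the cost of the extra additions inside the pre-configured transforms.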
-