Lowering hardware for neural networks

    Publication No.: US11256977B2

    Publication date: 2022-02-22

    Application No.: US15857909

    Filing date: 2017-12-29

    Applicant: Facebook, Inc.

    Abstract: A disclosed computing system may include a special-purpose hardware device having an input subsystem, a linearization subsystem, and a matrix multiplication unit. The input subsystem may facilitate on-the-fly convolution lowering within a neural network convolution layer by directing input volume patches to logical unit(s) of the device. The linearization subsystem may be configured to receive a patch from the input subsystem and to linearize the patch by arranging elements of the patch as a portion of a data matrix row. The matrix multiplication unit of the device may be configured to receive the data matrix from the linearization subsystem and to apply a filter matrix to the data matrix via a matrix multiplication operation. Various other methods, systems, and computer-readable media are also disclosed.
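
The lowering step described in this abstract can be illustrated in software. The following is a minimal sketch, assuming a channels-first input volume held in nested Python lists; the function names (`lower_patches`, `flatten_filters`, `conv_by_lowering`) are hypothetical and are not taken from the patent, which describes a hardware device rather than this code.

```python
def lower_patches(volume, k, stride=1):
    """Linearize every k x k patch (spanning all channels) into one data-matrix row."""
    channels, height, width = len(volume), len(volume[0]), len(volume[0][0])
    rows = []
    for y in range(0, height - k + 1, stride):
        for x in range(0, width - k + 1, stride):
            rows.append([volume[c][y + dy][x + dx]
                         for c in range(channels)
                         for dy in range(k)
                         for dx in range(k)])
    return rows


def flatten_filters(filters):
    """Build the filter matrix: one column per filter, same element order as the rows."""
    columns = [[w for channel in f for row in channel for w in row] for f in filters]
    return [list(col) for col in zip(*columns)]  # shape: (channels*k*k) x num_filters


def matmul(a, b):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def conv_by_lowering(volume, filters, k, stride=1):
    """Convolution expressed as (data matrix) x (filter matrix)."""
    return matmul(lower_patches(volume, k, stride), flatten_filters(filters))


volume = [[[1, 2, 3], [4, 5, 6], [7, 8, 9]]]   # 1 channel, 3 x 3
filters = [[[[1, 1], [1, 1]]]]                 # 1 filter, 1 channel, 2 x 2
result = conv_by_lowering(volume, filters, k=2)
```

Each row of `result` corresponds to one patch position and each column to one filter, so a single matrix multiplication replaces the nested convolution loops.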

    Hardware accelerator pre-configured with coefficients for matrix-transform operations

    Publication No.: US10372787B2

    Publication date: 2019-08-06

    Application No.: US15839229

    Filing date: 2017-12-12

    Applicant: Facebook, Inc.

    Abstract: A special-purpose hardware accelerator may include a cache configured to store an input matrix related to performing a convolution operation and a matrix-multiplication subsystem pre-configured with matrix-transform coefficients for performing matrix-transform operations. The matrix-multiplication subsystem may perform the convolution operation by (1) reading the input matrix from the cache, (2) transforming the input matrix via matrix multiplication, (3) transforming, via matrix multiplication, a parameter matrix that includes convolution parameters for performing the convolution operation, (4) applying the transformed parameter matrix to the transformed input matrix via an element-wise multiplication operation, and then (5) performing an inverse-transformation operation on the results of the element-wise multiplication operation to create an output matrix for the convolution operation. Various other systems and methods are also disclosed.
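
Steps (2) through (5) of this abstract follow the general shape of transform-domain convolution. The abstract does not name a specific transform, so the sketch below swaps in the well-known one-dimensional Winograd F(2,3) minimal-filtering coefficients as an assumed example; the constant and function names are illustrative only.

```python
# Fixed ("pre-configured") transform coefficients for Winograd F(2,3):
# 2 outputs from a 4-element input tile and a 3-tap filter.
B_T = [[1, 0, -1, 0], [0, 1, 1, 0], [0, -1, 1, 0], [0, 1, 0, -1]]  # input transform
G = [[1, 0, 0], [0.5, 0.5, 0.5], [0.5, -0.5, 0.5], [0, 0, 1]]      # parameter transform
A_T = [[1, 1, 1, 0], [0, 1, -1, -1]]                               # inverse transform


def matvec(matrix, vector):
    """Multiply a nested-list matrix by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def transformed_conv(tile, taps):
    """Convolve via transform, element-wise multiply, and inverse transform."""
    transformed_input = matvec(B_T, tile)        # step (2)
    transformed_params = matvec(G, taps)         # step (3)
    product = [x * y for x, y in zip(transformed_input, transformed_params)]  # step (4)
    return matvec(A_T, product)                  # step (5)


def direct_conv(tile, taps):
    """Reference sliding-window result for comparison."""
    return [sum(tile[i + j] * taps[j] for j in range(3)) for i in range(2)]


tile = [1, 2, 3, 4]
taps = [1, 1, 1]
outputs = transformed_conv(tile, taps)
```

The element-wise stage uses four multiplications where the direct form uses six, which is the usual motivation for fixing the transform coefficients ahead of time.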

    Systems and methods for employing predication in computational models

    Publication No.: US11264011B2

    Publication date: 2022-03-01

    Application No.: US16749328

    Filing date: 2020-01-22

    Applicant: Facebook, Inc.

    Abstract: The disclosed method may include (1) determining whether a next operation of a plurality of operations of an artificial neural network (ANN) is dependent upon a Boolean predication value based on a representative value for a weight or an input of a node of the ANN, (2) based on the next operation not being dependent on the Boolean predication value, allowing the next operation to update a state of the ANN, and (3) based on the next operation being dependent on the Boolean predication value, performing at least one of (a) allowing, based on the Boolean predication value being a first value, the next operation to update the state of the ANN, and (b) preventing, based on the Boolean predication value being a second value different from the first value, the next operation from updating the state of the ANN. Various other methods and systems are also disclosed.
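
The control flow in this abstract can be sketched as a loop that gates state updates on a Boolean predicate. This is a minimal illustration under stated assumptions: the threshold rule in `predicate_from`, the dictionary layout of each operation, and the scalar `state` are all hypothetical simplifications, not details from the patent.

```python
def predicate_from(representative_value, threshold=0.0):
    """Assumed rule: the predicate is True when the representative value is significant."""
    return abs(representative_value) > threshold


def run(operations, state, predicate):
    """Apply operations in order, skipping predicated ones when the predicate is False."""
    for op in operations:
        if not op["predicated"] or predicate:
            state = op["apply"](state)   # allowed to update the state
        # otherwise the operation is prevented from updating the state
    return state


operations = [
    {"predicated": False, "apply": lambda s: s + 1},   # always runs
    {"predicated": True, "apply": lambda s: s * 10},   # runs only if predicate is True
]

state_when_true = run(operations, 0, predicate_from(0.7))
state_when_false = run(operations, 0, predicate_from(0.0))
```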

    Systems and methods for protecting neural network weights

    Publication No.: US10719613B1

    Publication date: 2020-07-21

    Application No.: US15903162

    Filing date: 2018-02-23

    Applicant: Facebook, Inc.

    Abstract: The disclosed computer-implemented method may include (i) identifying a neural network that comprises an interconnected set of nodes organized in a set of layers represented by a plurality of matrices that each comprise a plurality of weights, where each weight represents a connection between a node in the interconnected set of nodes that resides in one layer in the set of layers and an additional node in the set of interconnected nodes that resides in a different layer in the set of layers, (ii) encrypting, using an encryption cipher, the plurality of weights, (iii) detecting that execution of the neural network has been initiated, and (iv) decrypting, using the encryption cipher, the plurality of weights in response to detecting that the execution of the neural network has been initiated. Various other methods, systems, and computer-readable media are also disclosed.
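
The encrypt-at-rest, decrypt-at-execution flow can be sketched as follows. The abstract does not specify a cipher, so this example assumes a toy XOR stream cipher whose keystream comes from SHA-256 in counter mode; it is a stand-in for illustration only and is not suitable for real protection. All function names are hypothetical.

```python
import hashlib
import struct


def _keystream(key, length):
    """Toy keystream: SHA-256 over key || counter (illustrative, not production crypto)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def encrypt_weights(weights, key):
    """Serialize weights as little-endian doubles and XOR them with the keystream."""
    raw = struct.pack(f"<{len(weights)}d", *weights)
    return bytes(a ^ b for a, b in zip(raw, _keystream(key, len(raw))))


def decrypt_weights(blob, key):
    """Reverse the XOR and unpack the doubles."""
    raw = bytes(a ^ b for a, b in zip(blob, _keystream(key, len(blob))))
    return list(struct.unpack(f"<{len(raw) // 8}d", raw))


def execute(blob, key, inputs):
    """Decrypt only once execution is initiated, then compute one node's weighted sum."""
    weights = decrypt_weights(blob, key)
    return sum(w * x for w, x in zip(weights, inputs))


key = b"example-key"                 # hypothetical key material
weights = [0.5, -1.25, 3.0]
blob = encrypt_weights(weights, key)
output = execute(blob, key, [2.0, 4.0, 1.0])
```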

    LOWERING HARDWARE FOR NEURAL NETWORKS
    Patent application

    Publication No.: US20190205735A1

    Publication date: 2019-07-04

    Application No.: US15857909

    Filing date: 2017-12-29

    Applicant: Facebook, Inc.

    Abstract: A disclosed computing system may include a special-purpose hardware device having an input subsystem, a linearization subsystem, and a matrix multiplication unit. The input subsystem may facilitate on-the-fly convolution lowering within a neural network convolution layer by directing input volume patches to logical unit(s) of the device. The linearization subsystem may be configured to receive a patch from the input subsystem and to linearize the patch by arranging elements of the patch as a portion of a data matrix row. The matrix multiplication unit of the device may be configured to receive the data matrix from the linearization subsystem and to apply a filter matrix to the data matrix via a matrix multiplication operation. Various other methods, systems, and computer-readable media are also disclosed.

    Systems and methods for optimizing power usage for systems within quality-of-service constraints

    Publication No.: US10948966B1

    Publication date: 2021-03-16

    Application No.: US15914362

    Filing date: 2018-03-07

    Applicant: Facebook, Inc.

    IPC classification: G06F1/32 G06F1/3234

    Abstract: The disclosed computer-implemented method may include (i) identifying an artificial neural network that processes each input to the artificial neural network in a fixed number of operations, (ii) performing an analysis on the artificial neural network to determine an execution metric that represents the fixed number of operations performed by the artificial neural network to process each input, (iii) determining a quality-of-service metric for an executing system that executes the artificial neural network, and (iv) optimizing power consumption of the executing system by configuring, based on the execution metric and the quality-of-service metric, a processing throughput of at least one physical processor of the executing system, thereby causing the executing system to execute the artificial neural network at a rate that satisfies the quality-of-service metric while limiting the power consumption of the executing system. Various other methods, systems, and computer-readable media are also disclosed.
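
One way to picture steps (ii) through (iv) is to count the fixed work per input and then pick the lowest processor frequency that still meets the required rate. The sketch below assumes a fully connected network, a throughput-style quality-of-service target, and a discrete list of available clock frequencies; none of these specifics come from the patent, and the function names are hypothetical.

```python
def count_operations(layer_sizes):
    """Execution metric: multiply-accumulates per input for a fully connected network."""
    return sum(n_in * n_out for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


def pick_frequency(ops_per_input, inputs_per_second, ops_per_cycle, available_hz):
    """Return the lowest available frequency that satisfies the throughput target."""
    required_hz = ops_per_input * inputs_per_second / ops_per_cycle
    for hz in sorted(available_hz):
        if hz >= required_hz:
            return hz
    raise ValueError("no available frequency satisfies the quality-of-service target")


ops = count_operations([4, 8, 2])            # 4*8 + 8*2 operations per input
chosen_hz = pick_frequency(
    ops_per_input=ops,
    inputs_per_second=1000,                  # assumed quality-of-service target
    ops_per_cycle=1,                         # assumed processor capability
    available_hz=[10_000, 50_000, 100_000],  # assumed frequency steps
)
```

Because the operation count per input is fixed, the required rate is known ahead of time, and running any faster than `chosen_hz` would spend power without improving the quality-of-service outcome.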

    In-memory processing based on combining output currents

    Publication No.: US10777251B1

    Publication date: 2020-09-15

    Application No.: US16408331

    Filing date: 2019-05-09

    Applicant: Facebook, Inc.

    Abstract: A first value is stored in a first memory cell. A first component output current, from a first electronic component, is provided based on the stored first value, wherein the first component output current is proportional to a place value represented by the first value. A second value is stored in a second memory cell. A second component output current, from a second electronic component, is provided based on the stored second value, wherein the second component output current is proportional to a place value represented by the second value. A combined current of at least the first component output current and the second component output current is detected, wherein the combined current corresponds to a sum of at least the first value and the second value.
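
The arithmetic behind this abstract can be modeled numerically: each cell contributes a current scaled by its binary place value, and the summed current encodes the total. This is a behavioral sketch only, assuming one bit per cell and a hypothetical unit current of one microampere; it does not model the actual circuit.

```python
UNIT_CURRENT_AMPS = 1e-6  # hypothetical current for place value 2**0


def component_current(stored_bit, place):
    """Current from one component, proportional to the place value of its stored bit."""
    return stored_bit * (2 ** place) * UNIT_CURRENT_AMPS


def combined_current(cells):
    """Currents on a shared line add, so the total is the sum of the components."""
    return sum(component_current(bit, place) for bit, place in cells)


def decode(current):
    """Convert a detected current back to the integer it represents."""
    return round(current / UNIT_CURRENT_AMPS)


three = [(1, 0), (1, 1)]            # binary 11, as (bit, place) pairs
five = [(1, 0), (0, 1), (1, 2)]     # binary 101
total = decode(combined_current(three + five))
```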

    Dynamic power management for artificial intelligence hardware accelerators

    Publication No.: US10671147B2

    Publication date: 2020-06-02

    Application No.: US15846117

    Filing date: 2017-12-18

    Applicant: Facebook, Inc.

    Abstract: A computer-implemented method for dynamically managing the power usage and/or performance of an artificial intelligence (AI) hardware accelerator may include (1) receiving an instruction stream that includes one or more instructions for performing at least one AI-specific computing task, (2) identifying a plurality of special-purpose, hardware-based functional units configured to perform AI-specific computing tasks, (3) predicting, based on an analysis of at least a portion of the instruction stream, a power-usage requirement for at least one of the functional units when executing the instruction stream, and then (4) modifying, based on the power-usage requirement, the power supplied to at least one of the functional units. Various other methods and systems are also disclosed.
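
The four steps above can be sketched as a look-ahead over the instruction stream that decides which functional units need power. The opcode names, the unit names, and the simple on/gated policy are assumptions made for illustration; the abstract does not prescribe a particular prediction method.

```python
from collections import Counter

# Hypothetical mapping from opcodes to the functional unit that executes them.
OPCODE_TO_UNIT = {
    "MATMUL": "matrix_unit",
    "RELU": "activation_unit",
    "POOL": "pooling_unit",
}


def predict_usage(instruction_window):
    """Count how many upcoming instructions each functional unit will execute."""
    counts = Counter(OPCODE_TO_UNIT[op] for op in instruction_window)
    return {unit: counts.get(unit, 0) for unit in OPCODE_TO_UNIT.values()}


def plan_power(usage):
    """Assumed policy: power units with upcoming work, gate the rest."""
    return {unit: ("on" if count > 0 else "gated") for unit, count in usage.items()}


window = ["MATMUL", "MATMUL", "RELU"]   # portion of the instruction stream
usage = predict_usage(window)
plan = plan_power(usage)
```

A real accelerator would likely use finer-grained levels than on and gated, but the same structure applies: analyze a window of instructions, estimate per-unit demand, then adjust the supplied power.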