-
Publication No.: US11264011B2
Publication Date: 2022-03-01
Application No.: US16749328
Filing Date: 2020-01-22
Applicant: Facebook, Inc.
Inventor: Nadav Rotem, Abdulkadir Utku Diril, Mikhail Smelyanskiy, Jong Soo Park, James Kenneth Reed
Abstract: The disclosed method may include (1) determining whether a next operation of a plurality of operations of an artificial neural network (ANN) is dependent upon a Boolean predication value based on a representative value for a weight or an input of a node of the ANN, (2) based on the next operation not being dependent on the Boolean predication value, allowing the next operation to update a state of the ANN, and (3) based on the next operation being dependent on the Boolean predication value, performing at least one of (a) allowing, based on the Boolean predication value being a first value, the next operation to update the state of the ANN, and (b) preventing, based on the Boolean predication value being a second value different from the first value, the next operation from updating the state of the ANN. Various other methods and systems are also disclosed.
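The predication scheme in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the names `run_predicated` and `representative_is_nonzero`, the `(fn, depends_on_predicate)` pairing, and the choice of "largest magnitude" as the representative value are all assumptions made for illustration.

```python
def representative_is_nonzero(values, threshold=1e-6):
    # Hypothetical choice of "representative value": the largest magnitude
    # among a node's weights or inputs. The abstract does not specify one.
    return max(abs(v) for v in values) > threshold


def run_predicated(ops, state, predicate):
    """Apply operations in order, skipping predicated ones when predicate is False.

    ops is a list of (fn, depends_on_predicate) pairs; fn maps state -> new state.
    """
    for fn, depends_on_predicate in ops:
        if depends_on_predicate and not predicate:
            continue  # predicated off: this operation must not update the state
        state = fn(state)
    return state


ops = [
    (lambda s: s + 1, False),   # unconditional operation
    (lambda s: s * 10, True),   # operation guarded by the predicate
]
print(run_predicated(ops, 1, representative_is_nonzero([0.0, 0.5])))  # 20
print(run_predicated(ops, 1, representative_is_nonzero([0.0, 0.0])))  # 2
```

With a `True` predicate both operations update the state; with a `False` predicate only the unconditional one does.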
-
Publication No.: US20210349694A1
Publication Date: 2021-11-11
Application No.: US16869288
Filing Date: 2020-05-07
Applicant: Facebook, Inc.
Inventor: Thomas Mark Ulrich, Abdulkadir Utku Diril, Zhao Wang
Abstract: A device (e.g., integrated circuit chip) includes a first operand register, a second operand register, a multiplication unit, and a hardware logic component. The first operand register is configured to store a first operand value. The second operand register is configured to store a second operand value. The multiplication unit is configured to at least multiply the first operand value with the second operand value. The hardware logic component is configured to detect whether a zero value is provided and in response to a detection that the zero value is being provided: cause an update of at least the first operand register to be disabled, and cause a result of a multiplication of the first operand value with the second operand value to be a zero-value result.
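A software model may help make the zero-gating behavior concrete. The class below is a rough sketch under assumptions: `ZeroGatedMultiplier`, its fields, and the `register_updates` counter are hypothetical names, and the sketch gates both operand registers although the abstract only requires that at least the first be held.

```python
class ZeroGatedMultiplier:
    """Toy model of a multiplier that skips register updates on zero operands."""

    def __init__(self):
        self.a = 0                 # first operand register
        self.b = 0                 # second operand register
        self.register_updates = 0  # counts how often the registers were written

    def multiply(self, a, b):
        if a == 0 or b == 0:
            # Zero detected: leave the operand registers untouched (no toggling)
            # and force the result to zero without using the multiplier.
            return 0
        self.a, self.b = a, b
        self.register_updates += 1
        return self.a * self.b


m = ZeroGatedMultiplier()
print(m.multiply(3, 4))      # 12
print(m.multiply(0, 7))      # 0, registers still hold 3 and 4
print(m.register_updates)    # 1
```

In hardware the point of holding the registers is to avoid switching activity; the counter here only stands in for that effect.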
-
Publication No.: US20210334072A1
Publication Date: 2021-10-28
Application No.: US16855927
Filing Date: 2020-04-22
Applicant: Facebook, Inc.
Inventor: Rakesh Komuravelli, Krishnakumar Narayanan Nair, Abdulkadir Utku Diril, Ehsan Khish Ardestani Zadeh, Yuchen Hao, Martin Schatz, Thomas Mark Ulrich, Olivia Wu, Anup Ramesh Kadkol, Amin Firoozshahian
Abstract: A processor system comprises a plurality of dot product processor units and element-wise multiplication units. The dot product processor units perform a depthwise convolution of a data matrix with a separate depthwise convolution weight matrix for each data matrix channel. Each dot product processor unit performs at least a portion of the depthwise convolution for one or more data matrix channels. The element-wise multiplication units perform multiplication operations of a pointwise convolution. Each element-wise multiplication unit applies to each depthwise convolution partial result element received from one or more of the dot product processor units a corresponding data element from each of a plurality of pointwise convolution weight filters to determine element-wise multiplication unit results. The processor system sums together different groups of data elements from the element-wise multiplication unit results to at least in part calculate different data elements of a result of the pointwise convolution.
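The two stages named in this abstract, a per-channel depthwise convolution followed by a pointwise (1x1) convolution, can be sketched as a plain reference computation. This shows only the arithmetic, not the hardware partitioning across processor units; the function names, the nested-list layout `data[channel][y][x]`, and the valid-padding, stride-1 choice are assumptions.

```python
def depthwise_conv(data, kernels):
    """Convolve each channel with its own kernel (valid padding, stride 1)."""
    out = []
    for channel, kernel in zip(data, kernels):
        kh, kw = len(kernel), len(kernel[0])
        oh = len(channel) - kh + 1
        ow = len(channel[0]) - kw + 1
        out.append([
            [sum(channel[y + i][x + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for x in range(ow)]
            for y in range(oh)
        ])
    return out


def pointwise_conv(data, filters):
    """Apply 1x1 filters: multiply each channel element by a weight, sum over channels."""
    h, w = len(data[0]), len(data[0][0])
    return [
        [[sum(filt[c] * data[c][y][x] for c in range(len(data)))
          for x in range(w)]
         for y in range(h)]
        for filt in filters
    ]


data = [
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],   # channel 0
    [[1, 1, 1], [1, 1, 1], [1, 1, 1]],   # channel 1
]
kernels = [[[1, 1], [1, 1]], [[1, 0], [0, 1]]]
partial = depthwise_conv(data, kernels)
print(partial)  # [[[12, 16], [24, 28]], [[2, 2], [2, 2]]]
print(pointwise_conv(partial, [[1, 1], [1, -1]]))
# [[[14, 18], [26, 30]], [[10, 14], [22, 26]]]
```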
-
Publication No.: US20210319076A1
Publication Date: 2021-10-14
Application No.: US16843645
Filing Date: 2020-04-08
Applicant: Facebook, Inc.
Inventor: Rakesh Komuravelli, Krishnakumar Narayanan Nair, Abdulkadir Utku Diril, Ehsan Khish Ardestani Zadeh, Yuchen Hao, Martin Schatz, Thomas Mark Ulrich, Olivia Wu, Anup Ramesh Kadkol, Amin Firoozshahian
Abstract: A processor system comprises a plurality of processing elements. Each processing element includes a corresponding convolution processor unit configured to perform a portion of a groupwise convolution. The corresponding convolution processor unit determines multiplication results by multiplying each data element of a portion of data elements in a convolution data matrix with a corresponding data element in a corresponding groupwise convolution weight matrix. The portion of data elements in the convolution data matrix that are multiplied belong to different channels and different groups. For each specific channel of the different channels, the corresponding convolution processor unit sums together at least some of the multiplication results belonging to the same specific channel to determine a corresponding channel convolution result data element. The processing elements sum together a portion of the channel convolution result data elements from a group of different convolution processor units to determine a groupwise convolution result data element.
-
Publication No.: US11138292B1
Publication Date: 2021-10-05
Application No.: US16414703
Filing Date: 2019-05-16
Applicant: FACEBOOK, INC.
Inventor: Krishnakumar Nair, Abdulkadir Utku Diril, Dheevatsa Mudigere, Ehsan Khish Ardestani Zadeh, Olivia Wu, Yuchen Hao
Abstract: An electronic circuit performs depthwise convolution of an input matrix with a kernel matrix to generate an output matrix. In each of a plurality of rounds of operations, a row of kernel matrix elements is selected for the round of operations, and applied to the input matrix to obtain an intermediate data array corresponding to the selected row of kernel elements. The electronic circuit includes a plurality of subcircuits operable in parallel to generate, in each operation, a set of intermediate data elements in the intermediate data array. Each subcircuit generates a respective intermediate data element that is the sum of a respective row of the input matrix elements weighted by a set of weight elements including the selected row of kernel elements and at least one zero element. The selected row of kernel elements is successively shifted among the set of weight elements in the round of operations.
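One way to read the row-by-row scheme in this abstract is sketched below: a kernel row is padded with zeros to the width of an input row, shifted one position at a time, and used as the weights of a weighted sum; the contributions of all kernel rows are then accumulated. This is an interpretation for illustration only. The function names are hypothetical, and the loop over shifts runs sequentially where the circuit would use parallel subcircuits.

```python
def row_contribution(input_row, kernel_row):
    """Weighted sums of one input row for every shift of a zero-padded kernel row."""
    width, k = len(input_row), len(kernel_row)
    out = []
    for shift in range(width - k + 1):
        # Weight set: the kernel row at offset `shift`, zeros everywhere else.
        weights = [0] * shift + list(kernel_row) + [0] * (width - k - shift)
        out.append(sum(x * w for x, w in zip(input_row, weights)))
    return out


def depthwise_by_rows(matrix, kernel):
    """Accumulate the per-kernel-row intermediate arrays into the output matrix."""
    kh = len(kernel)
    out_h = len(matrix) - kh + 1
    out_w = len(matrix[0]) - len(kernel[0]) + 1
    out = [[0] * out_w for _ in range(out_h)]
    for i, kernel_row in enumerate(kernel):   # one round per kernel row
        for y in range(out_h):
            for x, v in enumerate(row_contribution(matrix[y + i], kernel_row)):
                out[y][x] += v
    return out


print(row_contribution([1, 2, 3], [1, 1]))   # [3, 5]
print(depthwise_by_rows([[1, 2, 3], [4, 5, 6], [7, 8, 9]], [[1, 1], [1, 1]]))
# [[12, 16], [24, 28]]
```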
-
Publication No.: US20210294875A1
Publication Date: 2021-09-23
Application No.: US16826697
Filing Date: 2020-03-23
Applicant: Facebook, Inc.
Inventor: Rakesh Komuravelli, Krishnakumar Narayanan Nair, Abdulkadir Utku Diril, Ehsan Khish Ardestani Zadeh, Yuchen Hao, Martin Schatz, Thomas Mark Ulrich, Olivia Wu, Anup Ramesh Kadkol, Amin Firoozshahian
Abstract: A processor system comprises a hardware channel convolution processor unit and dot product processor unit. The channel convolution processor unit is configured to perform depthwise convolution, including by multiplying each data element of a first group of data elements of a convolution data matrix with a corresponding data element of a second group of data elements of a plurality of depthwise convolution weight matrices and summing together, for each specific channel, multiplication results corresponding to the specific channel to determine one corresponding result data element in a corresponding channel convolution result matrix to calculate a portion of depthwise convolution results. The dot product processor unit is configured to perform pointwise convolution, including applying pointwise weight matrices to the portion of depthwise convolution results to determine a portion of separable convolution results while at least another portion of the depthwise convolution results is being calculated by the processor system.
-
Publication No.: US20210125044A1
Publication Date: 2021-04-29
Application No.: US16667700
Filing Date: 2019-10-29
Applicant: Facebook, Inc.
Inventor: Yuchen Hao, Krishnakumar Narayanan Nair, Ehsan Khish Ardestani Zadeh, Rakesh Komuravelli, Abdulkadir Utku Diril, Thomas Mark Ulrich
Abstract: A first group of elements is element-wise multiplied with a second group of elements using a plurality of multipliers belonging to a matrix multiplication hardware unit. Results of the plurality of multipliers are added together using a hierarchical tree of adders belonging to the matrix multiplication hardware unit and a final result of the hierarchical tree of adders or any of a plurality of intermediate results of the hierarchical tree of adders is selectively provided for use in determining an output result matrix. A control unit is used to instruct the matrix multiplication hardware unit to perform a plurality of different matrix multiplications in parallel by using a combined matrix that includes elements of a plurality of different operand matrices and utilize one or more selected ones of the intermediate results of the hierarchical tree of adders for use in determining the output result matrix that includes different groups of elements representing different multiplication results corresponding to different ones of the different operand matrices.
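The idea of tapping intermediate results of an adder tree can be sketched in a few lines. The example packs several short dot products into one wide multiply-and-reduce pass and reads the tree level whose width equals the number of packed operands. Names such as `adder_tree_levels` and `packed_dot_products` are hypothetical, and the sketch assumes equal-length vectors whose combined length is a power of two.

```python
def adder_tree_levels(values):
    """Return every level of a binary adder tree.

    levels[0] is the input and levels[-1] holds the single final sum.
    """
    if not values or len(values) & (len(values) - 1):
        raise ValueError("length must be a power of two")
    levels = [list(values)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([prev[i] + prev[i + 1] for i in range(0, len(prev), 2)])
    return levels


def packed_dot_products(a_vectors, b_vectors):
    """Compute several equal-length dot products in one pass through the tree."""
    count = len(a_vectors)
    a = [x for vec in a_vectors for x in vec]   # combined operand A
    b = [x for vec in b_vectors for x in vec]   # combined operand B
    products = [x * y for x, y in zip(a, b)]    # element-wise multipliers
    for level in adder_tree_levels(products):
        if len(level) == count:
            return level  # intermediate level: one partial sum per packed operand
    raise ValueError("number of vectors must be a power of two")


print(adder_tree_levels([1, 2, 3, 4]))   # [[1, 2, 3, 4], [3, 7], [10]]
print(packed_dot_products([[1, 2, 3, 4], [5, 6, 7, 8]],
                          [[1, 1, 1, 1], [2, 0, 0, 1]]))   # [10, 18]
```

Reading the final level instead would give the single sum over the whole combined vector, which corresponds to one large multiplication rather than several small ones.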
-
Publication No.: US10834385B1
Publication Date: 2020-11-10
Application No.: US16004982
Filing Date: 2018-06-11
Applicant: Facebook, Inc.
Inventor: Abdulkadir Utku Diril
IPC: H04N7/12, H04N19/105, G06K9/00, H04N19/172, G06K9/62, G06N20/00
Abstract: A computer-implemented method for encoding videos using reference objects may include identifying, by a computing device, a video to be encoded. The method may also include identifying, by the computing device, a set of objects that appear in the video as reference images for video encoding. In addition, the method may include training a machine learning algorithm to detect an object from the set of objects. Furthermore, the method may include encoding each frame of the video using the trained machine learning algorithm. Various other methods, systems, and computer-readable media are also disclosed.
-
Publication No.: US10719613B1
Publication Date: 2020-07-21
Application No.: US15903162
Filing Date: 2018-02-23
Applicant: Facebook, Inc.
Inventor: Nadav Rotem, Abdulkadir Utku Diril, Mikhail Smelyanskiy, Jong Soo Park, Roman Levenstein
Abstract: The disclosed computer-implemented method may include (i) identifying a neural network that comprises an interconnected set of nodes organized in a set of layers represented by a plurality of matrices that each comprise a plurality of weights, where each weight represents a connection between a node in the interconnected set of nodes that resides in one layer in the set of layers and an additional node in the set of interconnected nodes that resides in a different layer in the set of layers, (ii) encrypting, using an encryption cipher, the plurality of weights, (iii) detecting that execution of the neural network has been initiated, and (iv) decrypting, using the encryption cipher, the plurality of weights in response to detecting that the execution of the neural network has been initiated. Various other methods, systems, and computer-readable media are also disclosed.
-
Publication No.: US20190205735A1
Publication Date: 2019-07-04
Application No.: US15857909
Filing Date: 2017-12-29
Applicant: Facebook, Inc.
Inventor: Mikhail Smelyanskiy, Abdulkadir Utku Diril, Jong Soo Park, Nadav Rotem
Abstract: A disclosed computing system may include a special-purpose hardware device having an input subsystem, a linearization subsystem, and a matrix multiplication unit. The input subsystem may facilitate on-the-fly convolution lowering within a neural network convolution layer by directing input volume patches to logical unit(s) of the device. The linearization subsystem may be configured to receive a patch from the input subsystem and to linearize the patch by arranging elements of the patch as a portion of a data matrix row. The matrix multiplication unit of the device may be configured to receive the data matrix from the linearization subsystem and to apply a filter matrix to the data matrix via a matrix multiplication operation. Various other methods, systems, and computer-readable media are also disclosed.
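Convolution lowering of the kind described here is commonly known as im2col: each input patch is flattened into one row of a data matrix, and the convolution becomes a matrix multiplication with a filter matrix. The sketch below shows that arithmetic for a single-channel image in software; it does not model the on-the-fly hardware path, and the function names are hypothetical.

```python
def linearize_patches(image, kh, kw):
    """Flatten every kh x kw patch of a single-channel image into one matrix row."""
    oh = len(image) - kh + 1
    ow = len(image[0]) - kw + 1
    return [
        [image[y + i][x + j] for i in range(kh) for j in range(kw)]
        for y in range(oh) for x in range(ow)
    ]


def matmul(a, b):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def conv_by_lowering(image, filters):
    """Convolve by building the data matrix and multiplying by the filter matrix."""
    kh, kw = len(filters[0]), len(filters[0][0])
    data_matrix = linearize_patches(image, kh, kw)
    flat = [[f[i][j] for i in range(kh) for j in range(kw)] for f in filters]
    filter_matrix = [list(col) for col in zip(*flat)]   # one column per filter
    return matmul(data_matrix, filter_matrix)


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
filters = [[[1, 0], [0, 1]], [[1, 1], [1, 1]]]
print(linearize_patches(image, 2, 2))
# [[1, 2, 4, 5], [2, 3, 5, 6], [4, 5, 7, 8], [5, 6, 8, 9]]
print(conv_by_lowering(image, filters))
# [[6, 12], [8, 16], [12, 24], [14, 28]]
```

Each output row corresponds to one patch position and each column to one filter.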
-