Abstract:
Certain aspects of the present disclosure provide a processor, comprising: a configurable nonlinear activation function circuit configured to: determine, based on a selected nonlinear activation function, a set of parameters for the nonlinear activation function; and generate output data based on application of the set of parameters for the nonlinear activation function, wherein: the configurable nonlinear activation function circuit comprises at least one nonlinear approximator comprising at least two successive linear approximators, and each linear approximator of the at least two successive linear approximators is configured to approximate a linear function using one or more function parameters of the set of parameters.
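A minimal Python sketch of one plausible reading of this scheme, in which the successive linear approximators form a piecewise-linear table indexed by the selected activation function: the breakpoints, the 16-segment tables, and the names make_segments and apply_activation are illustrative assumptions, not details from the disclosure.

import math

def make_segments(fn, lo=-4.0, hi=4.0, n=16):
    # Fit n chord segments (slope, intercept) to fn over [lo, hi].
    step = (hi - lo) / n
    segs = []
    for i in range(n):
        x0, x1 = lo + i * step, lo + (i + 1) * step
        slope = (fn(x1) - fn(x0)) / (x1 - x0)
        segs.append((x0, x1, slope, fn(x0) - slope * x0))
    return segs

# Parameter sets selected per activation function (illustrative tables).
PARAMS = {
    "sigmoid": make_segments(lambda x: 1.0 / (1.0 + math.exp(-x))),
    "tanh": make_segments(math.tanh),
}

def apply_activation(name, x):
    # Select the parameter set for the chosen function, then evaluate
    # the linear segment (slope * x + intercept) that covers x.
    segs = PARAMS[name]
    x = min(max(x, segs[0][0]), segs[-1][1] - 1e-9)  # saturate at table edges
    for x0, x1, slope, intercept in segs:
        if x0 <= x < x1:
            return slope * x + intercept

print(apply_activation("sigmoid", 0.5))  # ~0.62, close to true sigmoid(0.5)

Selecting a different function name simply selects a different parameter set, which is the configurability the abstract describes.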
Abstract:
Certain aspects of the present disclosure provide methods and apparatus for producing pseudo-random numbers with a programmable probability distribution that can be utilized for filtering (dropping or passing) neuron spikes. The present disclosure provides a simpler, smaller, and lower-power circuit than that typically used. It can be programmed to produce any of a variety of non-uniformly distributed sequences of numbers. These sequences can approximate true probability distributions while maintaining sufficient pseudo-randomness to still be considered random in a probabilistic sense. This circuit can be an integral part of a filter block within an ASIC chip emulating an artificial nervous system.
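A minimal Python sketch under stated assumptions: a 16-bit LFSR stands in for the low-cost pseudo-random source, a programmed cumulative table (inverse-transform sampling) stands in for the programmable distribution, and spike filtering compares a draw against a programmed pass probability. The tap positions, table format, and the names lfsr16, programmed_sample, and filter_spikes are illustrative, not from the disclosure.

def lfsr16(state):
    # 16-bit Fibonacci LFSR (taps 16, 14, 13, 11): a tiny
    # hardware-style uniform pseudo-random source.
    bit = ((state >> 0) ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
    return (state >> 1) | (bit << 15)

def programmed_sample(state, cdf):
    # Map a uniform draw through a programmed cumulative table to get a
    # non-uniformly distributed value (inverse-transform sampling).
    state = lfsr16(state)
    u = state / 0xFFFF
    for value, cum in enumerate(cdf):
        if u <= cum:
            return state, value
    return state, len(cdf) - 1

def filter_spikes(spikes, pass_prob, state=0xACE1):
    # Pass each spike when the next pseudo-random draw falls below the
    # programmed probability; drop it otherwise.
    passed = []
    for s in spikes:
        state = lfsr16(state)
        if state / 0xFFFF < pass_prob:
            passed.append(s)
    return passed

# Program a skewed distribution: P(0)=0.7, P(1)=0.2, P(2)=0.1.
state, v = programmed_sample(0xACE1, cdf=[0.7, 0.9, 1.0])
print(v)
print(filter_spikes(list(range(10)), pass_prob=0.5))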
Abstract:
A method of exploiting activation sparsity in deep neural networks is described. The method includes retrieving an activation tensor and a weight tensor where the activation tensor is a sparse activation tensor. The method also includes generating a compressed activation tensor comprising non-zero activations of the activation tensor, where the compressed activation tensor has fewer columns than the activation tensor. The method further includes processing the compressed activation tensor and the weight tensor to generate an output tensor.
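A minimal NumPy sketch of the idea, assuming the layer computes activations @ weights: dropping the all-zero activation columns (and the matching weight rows) yields the same output tensor with fewer multiplies. The function name sparse_matmul and the tensor shapes are illustrative assumptions.

import numpy as np

def sparse_matmul(activations, weights):
    nz = np.any(activations != 0, axis=0)  # columns with any non-zero entry
    compressed = activations[:, nz]        # fewer columns than the input tensor
    return compressed @ weights[nz, :]     # skip multiplies against zeros

rng = np.random.default_rng(0)
act = rng.random((4, 8))
act[:, [1, 3, 6]] = 0.0                    # inject column sparsity
w = rng.random((8, 5))

# The compressed path matches the dense result.
assert np.allclose(sparse_matmul(act, w), act @ w)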
Abstract:
A method of constraining data represented in a deep neural network is described. The method includes determining an initial shifting specified to convert a fixed-point input value to a floating-point output value. The method also includes determining an additional shifting specified to constrain a dynamic range during conversion of the fixed-point input value to the floating-point output value. The method further includes performing both the initial shifting and the additional shifting together to form a dynamic range-constrained, normalized floating-point output value.
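A minimal Python sketch, assuming a Qm.n fixed-point input and power-of-two scaling: the format-conversion shift and the range-constraining shift are fused into one combined shift rather than applied in two passes. The Q1.15 example values and the name fixed_to_constrained_float are illustrative assumptions, not the disclosed method.

def fixed_to_constrained_float(x_fixed, frac_bits, range_shift):
    # Convert a Qm.n fixed-point integer to float while constraining its
    # dynamic range with an extra power-of-two scaling, in a single step.
    total_shift = frac_bits + range_shift  # initial + additional shifting
    return x_fixed / (1 << total_shift)    # one combined scale, not two

# Q1.15 sample (0x4000 = 0.5), with 2 extra bits of range constraint.
print(fixed_to_constrained_float(0x4000, frac_bits=15, range_shift=2))  # 0.125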
Abstract:
Certain aspects of the present disclosure are directed to methods and apparatus for circular floating point addition. An example method generally includes obtaining a first floating point number represented by a first significand and a first exponent, obtaining a second floating point number represented by a second significand and a second exponent, and adding the first floating point number and the second floating point number using a circular accumulator device.
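A minimal Python sketch of wraparound accumulation, assuming integer significands and a fixed accumulator width: the significands are aligned to a common exponent and summed modulo 2^N, so overflow wraps around the accumulator rather than saturating. The 32-bit width, the alignment scheme, and the name circular_add are assumptions, not the disclosed circuit.

ACC_BITS = 32
MASK = (1 << ACC_BITS) - 1

def circular_add(sig_a, exp_a, sig_b, exp_b):
    # Align the smaller-exponent significand to the larger exponent,
    # then accumulate modulo 2^N (circular, wraparound addition).
    if exp_a < exp_b:
        sig_a, exp_a, sig_b, exp_b = sig_b, exp_b, sig_a, exp_a
    aligned_b = sig_b >> (exp_a - exp_b)   # shift to the common exponent
    acc = (sig_a + aligned_b) & MASK       # sum wraps instead of saturating
    return acc, exp_a

sig, exp = circular_add(0x00F0_0000, 4, 0x0000_0F00, 8)
print(hex(sig), exp)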