PROCESSING OF ASYMMETRICALLY QUANTIZED INPUT AND KERNEL COEFFICIENTS IN NEURAL NETWORK PROCESSOR

    Publication number: US20240329929A1

    Publication date: 2024-10-03

    Application number: US18127528

    Filing date: 2023-03-28

    Applicant: Apple Inc.

    CPC classification number: G06F7/523 G06F7/50

    Abstract: Embodiments relate to performing a multiply-accumulate operation on asymmetrically quantized input data and kernel data in a neural processor. Instead of adjusting the input data at a multiply-accumulator to account for the asymmetric quantization of the input data, an adjusted bias for the multiply-accumulate operation is computed beforehand and stored in the multiply-accumulator. Kernel coefficients derived from the kernel data, on the other hand, are adjusted at the multiply-accumulator to account for the asymmetric quantization. In this way, the computational complexity associated with asymmetric quantization may be reduced while increasing the efficiency of the convolution operations at the neural processor.
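
    The bias adjustment described in this abstract rests on a simple identity: sum((x - zx) * (w - zw)) + b equals sum(x * (w - zw)) + (b - zx * sum(w - zw)). The following is a minimal Python sketch of that identity, not the patented circuit. The function names, the zero points zx and zw, and the sample values are all hypothetical and chosen only for illustration.

```python
def mac_adjusting_input(x_q, w_q, zx, zw, bias):
    """Reference: subtract both zero points inside the multiply-accumulate loop."""
    acc = bias
    for x, w in zip(x_q, w_q):
        acc += (x - zx) * (w - zw)
    return acc


def adjusted_bias(w_q, zx, zw, bias):
    """Precompute bias' = bias - zx * sum(w - zw); it depends only on the kernel."""
    return bias - zx * sum(w - zw for w in w_q)


def mac_with_adjusted_bias(x_q, w_q, zw, bias_adj):
    """Only kernel coefficients are adjusted in the loop; inputs are used raw."""
    acc = bias_adj
    for x, w in zip(x_q, w_q):
        acc += x * (w - zw)
    return acc


x_q = [12, 200, 37, 90]   # quantized inputs (hypothetical values)
w_q = [5, 130, 250, 17]   # quantized kernel coefficients (hypothetical values)
zx, zw, bias = 128, 120, 1000

reference = mac_adjusting_input(x_q, w_q, zx, zw, bias)
fast = mac_with_adjusted_bias(x_q, w_q, zw, adjusted_bias(w_q, zx, zw, bias))
```

    Because the adjusted bias depends only on the kernel and the input zero point, it can be computed once per kernel and reused for every input patch, which is presumably where the saving comes from.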

    Multi-mode planar engine for neural processor

    Publication number: US12229657B2

    Publication date: 2025-02-18

    Application number: US16596439

    Filing date: 2019-10-08

    Applicant: Apple Inc.

    Abstract: Embodiments relate to a neural processor that includes a plurality of neural engine circuits and one or more planar engine circuits. The plurality of neural engine circuits can perform convolution operations of input data of the neural engine circuits with one or more kernels to generate outputs. The planar engine circuit is coupled to the plurality of neural engine circuits. The planar engine circuit generates an output from input data that corresponds to output of the neural engine circuits or a version of input data of the neural processor. The planar engine circuit can be configured in multiple modes. In a pooling mode, the planar engine circuit reduces a spatial size of a version of the input data. In an elementwise mode, the planar engine circuit performs an elementwise operation on the input data. In a reduction mode, the planar engine circuit reduces the rank of a tensor.
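
    To make the three modes named in this abstract concrete, here is a rough Python sketch on small nested lists. It only illustrates what each mode computes; the function names, the choice of max pooling, and the choice of a sum for the reduction are assumptions, not details taken from the patent.

```python
def pooling_mode(plane, size=2):
    """Max-pool a 2-D plane with a size x size window and matching stride."""
    rows, cols = len(plane), len(plane[0])
    return [
        [
            max(plane[r + dr][c + dc] for dr in range(size) for dc in range(size))
            for c in range(0, cols - size + 1, size)
        ]
        for r in range(0, rows - size + 1, size)
    ]


def elementwise_mode(a, b, op=lambda x, y: x + y):
    """Apply a binary operation to same-shaped planes, element by element."""
    return [[op(x, y) for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def reduction_mode(plane):
    """Collapse a 2-D plane (rank 2) into a vector of row sums (rank 1)."""
    return [sum(row) for row in plane]


plane = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]

pooled = pooling_mode(plane)             # spatial size shrinks from 4 x 4 to 2 x 2
doubled = elementwise_mode(plane, plane) # shape is unchanged
row_sums = reduction_mode(plane)         # rank drops from 2 to 1
```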

    Reduction mode of planar engine in neural processor

    Publication number: US11537864B2

    Publication date: 2022-12-27

    Application number: US16695782

    Filing date: 2019-11-26

    Applicant: Apple Inc.

    Abstract: Embodiments relate to a neural processor that includes one or more neural engine circuits and planar engine circuits. The neural engine circuits can perform convolution operations of input data with one or more kernels to generate outputs. The planar engine circuit is coupled to the plurality of neural engine circuits. A planar engine circuit can be configured in multiple modes. In a reduction mode, the planar engine circuit may process values arranged in one or more dimensions of the input to generate a reduced value. The reduced values across multiple input data may be accumulated. The planar engine circuit may program a filter circuit as a reduction tree to gradually reduce the data into a reduced value. The reduction operation reduces the size of one or more dimensions of a tensor.
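
    The reduction tree and the accumulation across multiple inputs mentioned in this abstract can be sketched in software as a pairwise, level-by-level reduction. This is an illustrative Python sketch under assumed names (tree_reduce, accumulate_reductions); the actual filter circuit is not described at this level of detail in the abstract.

```python
def tree_reduce(values, op):
    """Reduce a non-empty list level by level, pairing neighbours at each level."""
    if not values:
        raise ValueError("tree_reduce needs at least one value")
    level = list(values)
    while len(level) > 1:
        nxt = [op(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:          # an odd element passes through to the next level
            nxt.append(level[-1])
        level = nxt
    return level[0]


def accumulate_reductions(batches, op):
    """Reduce each batch with the tree, then accumulate the per-batch results."""
    total = None
    for batch in batches:
        reduced = tree_reduce(batch, op)
        total = reduced if total is None else op(total, reduced)
    return total


def add(a, b):
    return a + b


total_sum = tree_reduce([1, 2, 3, 4, 5], add)
largest = tree_reduce([3, 9, 2], max)
accumulated = accumulate_reductions([[1, 2], [3, 4, 5]], add)
```

    A tree of this shape needs about log2(n) levels for n values, which is one plausible reason to reduce "gradually" rather than in a single serial chain.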

    Broadcasting mode of planar engine for neural processor

    Publication number: US12124943B2

    Publication date: 2024-10-22

    Application number: US18120218

    Filing date: 2023-03-10

    Applicant: Apple Inc.

    CPC classification number: G06N3/063 G06F7/78 G06F9/542 G06N3/084 G06N20/10

    Abstract: Embodiments relate to a neural processor that includes one or more neural engine circuits and planar engine circuits. The neural engine circuits can perform convolution operations of input data with one or more kernels to generate outputs. The planar engine circuit is coupled to the plurality of neural engine circuits. A planar engine circuit can be configured in multiple modes. In an elementwise mode, the planar engine circuit may combine two tensors by performing operations element by element. The planar engine circuit may support elementwise operations on two tensors that have different sizes and ranks. The planar engine circuit may perform a broadcasting operation that duplicates one or more values across one or more channels so that the smaller tensor matches the size of the larger tensor.
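
    The broadcasting step in this abstract can be pictured as copying a smaller tensor across the channel dimension before an elementwise operation. The Python sketch below assumes a [channel][row][col] layout and hypothetical function names; it shows the arithmetic effect only, not how the hardware avoids materializing the copies.

```python
def broadcast_channels(small, num_channels):
    """Duplicate a single-channel plane across num_channels channels."""
    return [[row[:] for row in small] for _ in range(num_channels)]


def elementwise_with_broadcast(large, small, op):
    """large: [channel][row][col]; small: [row][col], shared by every channel."""
    expanded = broadcast_channels(small, len(large))
    return [
        [[op(x, y) for x, y in zip(rl, rs)] for rl, rs in zip(cl, cs)]
        for cl, cs in zip(large, expanded)
    ]


large = [[[1, 2], [3, 4]],
         [[5, 6], [7, 8]]]      # rank 3: 2 channels of 2 x 2 values
small = [[10, 20], [30, 40]]    # rank 2: no channel dimension
result = elementwise_with_broadcast(large, small, lambda x, y: x + y)
```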

    BROADCASTING MODE OF PLANAR ENGINE FOR NEURAL PROCESSOR

    Publication number: US20230206051A1

    Publication date: 2023-06-29

    Application number: US18120218

    Filing date: 2023-03-10

    Applicant: Apple Inc.

    CPC classification number: G06N3/063 G06N3/084 G06F7/78 G06F9/542 G06N20/10

    Abstract: Embodiments relate to a neural processor that includes one or more neural engine circuits and planar engine circuits. The neural engine circuits can perform convolution operations of input data with one or more kernels to generate outputs. The planar engine circuit is coupled to the plurality of neural engine circuits. A planar engine circuit can be configured in multiple modes. In an elementwise mode, the planar engine circuit may combine two tensors by performing operations element by element. The planar engine circuit may support elementwise operations on two tensors that have different sizes and ranks. The planar engine circuit may perform a broadcasting operation that duplicates one or more values across one or more channels so that the smaller tensor matches the size of the larger tensor.

    REDUCTION OPERATION WITH RETENTION IN NEURAL NETWORK PROCESSOR

    Publication number: US20230121448A1

    Publication date: 2023-04-20

    Application number: US17505426

    Filing date: 2021-10-19

    Applicant: Apple Inc.

    Abstract: Embodiments of the present disclosure relate to a reduction operation in a neural processor circuit where results of the reduction operation are retained for multiple post-processing operations. The neural processor circuit includes neural engine circuits and a planar engine circuit coupled to the neural engine circuits. At least one neural engine circuit performs a convolution operation to generate output data. The planar engine circuit includes a filter circuit and a line buffer coupled to the filter circuit. The filter circuit performs a reduction operation for each patch of a tensor from the output data to generate a respective reduced value associated with a corresponding channel of the tensor. The line buffer stores reduced values each being associated with a respective channel of the tensor. The line buffer retains the reduced values for a defined number of operating cycles as indicated by a refresh flag defining resetting of the line buffer.
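
    One way to read the retention idea in this abstract is as a small cache of per-channel reduced values that several later operations can read until a refresh flag clears it. The Python sketch below follows that reading. The LineBuffer class, the use of a sum as the reduction, and the two post-processing steps are assumptions made for illustration and are not taken from the patent.

```python
class LineBuffer:
    """Holds one reduced value per channel until a refresh flag resets it."""

    def __init__(self):
        self.values = {}

    def store(self, channel, value, refresh=False):
        if refresh:               # refresh flag set: drop retained values first
            self.values.clear()
        self.values[channel] = value

    def read(self, channel):
        return self.values[channel]


def reduce_patch(patch):
    """Reduction over one patch; a sum is assumed here for illustration."""
    return sum(sum(row) for row in patch)


buf = LineBuffer()
patches = {0: [[1, 2], [3, 4]], 1: [[5, 6], [7, 8]]}   # channel -> patch
for channel, patch in patches.items():
    buf.store(channel, reduce_patch(patch))

# Two post-processing operations reuse the retained values without re-reducing.
means = {c: buf.read(c) / 4 for c in patches}
scaled = {c: buf.read(c) * 2 for c in patches}
```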

    Multi-Mode Planar Engine For Neural Processor

    Publication number: US20210103803A1

    Publication date: 2021-04-08

    Application number: US16596439

    Filing date: 2019-10-08

    Applicant: Apple Inc.

    Abstract: Embodiments relate to a neural processor that includes a plurality of neural engine circuits and one or more planar engine circuits. The plurality of neural engine circuits can perform convolution operations of input data of the neural engine circuits with one or more kernels to generate outputs. The planar engine circuit is coupled to the plurality of neural engine circuits. The planar engine circuit generates an output from input data that corresponds to output of the neural engine circuits or a version of input data of the neural processor. The planar engine circuit can be configured in multiple modes. In a pooling mode, the planar engine circuit reduces a spatial size of a version of the input data. In an elementwise mode, the planar engine circuit performs an elementwise operation on the input data. In a reduction mode, the planar engine circuit reduces the rank of a tensor.

    Broadcasting mode of planar engine for neural processor

    Publication number: US11630991B2

    Publication date: 2023-04-18

    Application number: US16781824

    Filing date: 2020-02-04

    Applicant: Apple Inc.

    Abstract: Embodiments relate to a neural processor that includes one or more neural engine circuits and planar engine circuits. The neural engine circuits can perform convolution operations of input data with one or more kernels to generate outputs. The planar engine circuit is coupled to the plurality of neural engine circuits. A planar engine circuit can be configured in multiple modes. In an elementwise mode, the planar engine circuit may combine two tensors by performing operations element by element. The planar engine circuit may support elementwise operations on two tensors that have different sizes and ranks. The planar engine circuit may perform a broadcasting operation that duplicates one or more values across one or more channels so that the smaller tensor matches the size of the larger tensor.

    Ternary mode of planar engine for neural processor

    Publication number: US11604975B2

    Publication date: 2023-03-14

    Application number: US16844964

    Filing date: 2020-04-09

    Applicant: Apple Inc.

    Abstract: A neural processor includes one or more neural engine circuits and a planar engine circuit. The neural engine circuits can perform convolution operations of first input data with one or more kernels to generate a first output. The planar engine circuit receives second input data that corresponds to a version of the first input data. The planar engine circuit also receives third input data that includes fourth input data and fifth input data stored together in a dimension of the third input data. The planar engine circuit performs a first elementwise operation between a version of the second input data and a version of the fourth input data to generate intermediate data. The planar engine circuit performs a second elementwise operation between the intermediate data and a version of the fifth input data to generate a second output.
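
    The two chained elementwise operations in this abstract amount to computing op2(op1(second, fourth), fifth), where the fourth and fifth operands arrive packed in one tensor. The Python sketch below assumes the two operands are stored back to back along a single dimension and uses a multiply followed by an add as example operations; the layout, the function name, and the operations are illustrative assumptions.

```python
def ternary_mode(second, third, op1, op2):
    """Compute op2(op1(second, fourth), fifth) element by element.

    `third` packs the fourth and fifth operands back to back in one dimension,
    so it is twice as long as `second`.
    """
    n = len(second)
    if len(third) != 2 * n:
        raise ValueError("third must hold two operands of len(second) each")
    fourth, fifth = third[:n], third[n:]
    intermediate = [op1(a, b) for a, b in zip(second, fourth)]
    return [op2(t, c) for t, c in zip(intermediate, fifth)]


second = [1, 2, 3]
third = [10, 20, 30, 100, 200, 300]   # fourth = [10, 20, 30], fifth = [100, 200, 300]
out = ternary_mode(second, third, lambda a, b: a * b, lambda t, c: t + c)
```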

    TERNARY MODE OF PLANAR ENGINE FOR NEURAL PROCESSOR

    Publication number: US20210319290A1

    Publication date: 2021-10-14

    Application number: US16844964

    Filing date: 2020-04-09

    Applicant: Apple Inc.

    Abstract: A neural processor includes one or more neural engine circuits and a planar engine circuit. The neural engine circuits can perform convolution operations of first input data with one or more kernels to generate a first output. The planar engine circuit receives second input data that corresponds to a version of the first input data. The planar engine circuit also receives third input data that includes fourth input data and fifth input data stored together in a dimension of the third input data. The planar engine circuit performs a first elementwise operation between a version of the second input data and a version of the fourth input data to generate intermediate data. The planar engine circuit performs a second elementwise operation between the intermediate data and a version of the fifth input data to generate a second output.
