Method and system of performing convolution in neural networks with variable dilation rate

    公开(公告)号:US11423251B2

    公开(公告)日:2022-08-23

    申请号:US16733314

    申请日:2020-01-03

    Abstract: A method of performing convolution in a neural network with variable dilation rate is provided. The method includes receiving a size of a first kernel and a dilation rate, determining at least one of size of one or more disintegrated kernels based on the size of the first kernel, a baseline architecture of a memory and the dilation rate, determining an address of one or more blocks of an input image based on the dilation rate, and one or more parameters associated with a size of the input image and the memory. Thereafter, the one or more blocks of the input image and the one or more disintegrated kernels are fetched from the memory, and an output image is obtained based on convolution of each of the one or more disintegrated kernels and the one or more blocks of the input image.

    Method and apparatus with neural network performing deconvolution

    公开(公告)号:US10885433B2

    公开(公告)日:2021-01-05

    申请号:US16107717

    申请日:2018-08-21

    Abstract: A neural network apparatus configured to perform a deconvolution operation includes a memory configured to store a first kernel; and a processor configured to: obtain, from the memory, the first kernel; calculate a second kernel by adjusting an arrangement of matrix elements comprised in the first kernel; generate sub-kernels by dividing the second kernel; perform a convolution operation between an input feature map and the sub-kernels using a convolution operator; and generate an output feature map, as a deconvolution of the input feature map, by merging results of the convolution operation.

    NEURAL PROCESSOR
    23.
    发明申请
    NEURAL PROCESSOR 审中-公开

    公开(公告)号:US20200026980A1

    公开(公告)日:2020-01-23

    申请号:US16552945

    申请日:2019-08-27

    Abstract: A neural processor. In some embodiments, the processor includes a first tile, a second tile, a memory, and a bus. The bus may be connected to the memory, the first tile, and the second tile. The first tile may include: a first weight register, a second weight register, an activations buffer, a first multiplier, and a second multiplier. The activations buffer may be configured to include: a first queue connected to the first multiplier and a second queue connected to the second multiplier. The first queue may include a first register and a second register adjacent to the first register, the first register being an output register of the first queue. The first tile may be configured: in a first state: to multiply, in the first multiplier, a first weight by an activation from the output register of the first queue, and in a second state: to multiply, in the first multiplier, the first weight by an activation from the second register of the first queue.

    NEURAL PROCESSOR
    24.
    发明申请
    NEURAL PROCESSOR 审中-公开

    公开(公告)号:US20190392287A1

    公开(公告)日:2019-12-26

    申请号:US16446610

    申请日:2019-06-19

    Abstract: A neural processor. In some embodiments, the processor includes a first tile, a second tile, a memory, and a bus. The bus may be connected to the memory, the first tile, and the second tile. The first tile may include: a first weight register, a second weight register, an activations buffer, a first multiplier, and a second multiplier. The activations buffer may be configured to include: a first queue connected to the first multiplier and a second queue connected to the second multiplier. The first queue may include a first register and a second register adjacent to the first register, the first register being an output register of the first queue. The first tile may be configured: in a first state: to multiply, in the first multiplier, a first weight by an activation from the output register of the first queue, and in a second state: to multiply, in the first multiplier, the first weight by an activation from the second register of the first queue.

    APPARATUS WITH ACCELERATED MACHINE LEARNING PROCESSING

    公开(公告)号:US20220036243A1

    公开(公告)日:2022-02-03

    申请号:US17147858

    申请日:2021-01-13

    Abstract: An apparatus includes a global memory and a systolic array. The global memory is configured to store and provide an input feature map (IFM) vector stream from an IFM tensor and a kernel vector stream from a kernel tensor. The systolic array is configured to receive the IFM vector stream and the kernel vector stream from the global memory. The systolic array is on-chip together with the global memory. The systolic array includes a plurality of processing elements (PEs) each having a plurality of vector units, each of the plurality of vector units being configured to perform a dot-product operation on at least one IFM vector of the IFM vector stream and at least one kernel vector of the kernel vector stream per unit clock cycle to generate a plurality of output feature maps (OFMs).

    Neural network method and apparatus

    公开(公告)号:US10909418B2

    公开(公告)日:2021-02-02

    申请号:US16884232

    申请日:2020-05-27

    Abstract: A processor-implemented neural network method includes: obtaining, from a memory, data of an input feature map and kernels having a binary-weight, wherein the kernels are to be processed in a layer of a neural network; decomposing each of the kernels into a first type sub-kernel reconstructed with weights of a same sign, and a second type sub-kernel for correcting a difference between a respective kernel, among the kernels, and the first type sub-kernel; performing a convolution operation by using the input feature map and the first type sub-kernels and the second type sub-kernels decomposed from each of the kernels; and obtaining an output feature map by combining results of the convolution operation.

Patent Agency Ranking