APPARATUS AND METHOD FOR TRAINING LOW BIT-PRECISION DEEP NEURAL NETWORK

    Publication Number: US20220222523A1

    Publication Date: 2022-07-14

    Application Number: US17206164

    Filing Date: 2021-03-19

    Abstract: Disclosed herein are an apparatus and method for training a low-bit-precision deep neural network. The apparatus includes an input unit configured to receive training data to train the deep neural network, and a training unit configured to train the deep neural network using training data, wherein the training unit includes a training module configured to perform training using first precision, a representation form determination module configured to determine a representation form for internal data generated during an operation procedure for the training and determine a position of a decimal point of the internal data so that a permissible overflow bit in a dynamic fixed-point system varies randomly, and a layer-wise precision determination module configured to determine precision of each layer during an operation in each of a feed-forward stage and an error propagation stage and automatically change the precision of a corresponding layer based on the result of determination.
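    The core idea of the representation form determination module — a dynamic fixed-point format whose permissible overflow margin varies randomly — can be sketched as follows. This is an illustrative reading of the abstract, not the patented implementation; the function name, bit widths, and the uniform draw of the overflow margin are assumptions.

    ```python
    import numpy as np

    def dfp_quantize(x, total_bits=8, max_overflow_bits=2, rng=None):
        """Quantize x to a dynamic fixed-point format whose permissible
        overflow margin is drawn at random (hypothetical sketch)."""
        rng = rng or np.random.default_rng()
        overflow_bits = int(rng.integers(0, max_overflow_bits + 1))
        max_abs = float(np.max(np.abs(x))) + 1e-12
        # integer bits that cover the data, plus the random overflow margin
        int_bits = int(np.ceil(np.log2(max_abs))) + overflow_bits
        frac_bits = total_bits - 1 - int_bits  # position of the decimal point
        scale = 2.0 ** frac_bits
        lo, hi = -(2 ** (total_bits - 1)), 2 ** (total_bits - 1) - 1
        q = np.clip(np.round(x * scale), lo, hi)
        return q / scale, frac_bits
    ```

    Randomizing the overflow margin moves the decimal point around between steps, so occasional large values clip instead of forcing a permanently coarse scale.
    
    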

    APPARATUS AND METHOD FOR TRAINING DEEP NEURAL NETWORK

    Publication Number: US20210056427A1

    Publication Date: 2021-02-25

    Application Number: US16988737

    Filing Date: 2020-08-10

    Abstract: Disclosed herein are an apparatus and method for training a deep neural network. An apparatus for training a deep neural network including N layers, each having multiple neurons, includes an error propagation processing unit configured to, when an error occurs in an N-th layer in response to initiation of training of the deep neural network, determine an error propagation value for an arbitrary layer based on the error occurring in the N-th layer and directly propagate the error propagation value to the arbitrary layer, a weight gradient update processing unit configured to update a forward weight for the arbitrary layer based on a feed-forward value input to the arbitrary layer and the error propagation value in response to the error propagation value, and a feed-forward processing unit configured to, when update of the forward weight is completed, perform a feed-forward operation in the arbitrary layer using the forward weight.
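    The direct propagation scheme described above resembles direct feedback alignment: the N-th layer's error is sent straight to each layer through a fixed random matrix rather than chained backward layer by layer. The sketch below is a minimal illustration under that assumption; the network sizes, learning rate, and matrix names are invented for the example.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical 4-8-8-3 network: the output error is carried straight
    # to each hidden layer through a fixed random matrix B[l] instead of
    # being propagated through the chain of forward weights.
    sizes = [4, 8, 8, 3]
    W = [rng.normal(0.0, 0.3, (sizes[l + 1], sizes[l])) for l in range(3)]
    B = [rng.normal(0.0, 0.3, (sizes[l + 1], sizes[-1])) for l in range(2)]

    def train_step(x, target, lr=0.05):
        h = [x]                                  # feed-forward stage
        for Wl in W:
            h.append(np.tanh(Wl @ h[-1]))
        e = h[-1] - target                       # error at the N-th layer
        # direct propagation: each hidden layer receives B[l] @ e at once
        deltas = [(Bl @ e) * (1.0 - hl ** 2) for Bl, hl in zip(B, h[1:-1])]
        deltas.append(e * (1.0 - h[-1] ** 2))
        for l, d in enumerate(deltas):           # weight-gradient updates
            W[l] -= lr * np.outer(d, h[l])
        return float(0.5 * e @ e)
    ```

    Because every layer's error term is available as soon as `e` is computed, the weight-gradient updates need not wait for a sequential backward pass.
    
    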

    CONJUGATE GRADIENT ACCELERATION APPARATUS USING BAND MATRIX COMPRESSION IN DEPTH FUSION TECHNOLOGY

    Publication Number: US20240078282A1

    Publication Date: 2024-03-07

    Application Number: US18139952

    Filing Date: 2023-04-27

    CPC classification number: G06F17/16 G06T7/521 H04N19/174

    Abstract: Disclosed is a conjugate gradient acceleration apparatus using band matrix compression in depth fusion technology, including a band matrix conversion unit configured to convert an adjacency matrix for correcting depth data acquired from data of an image sensor through deep learning based on depth information acquired from a depth sensor into a band matrix using rows of the adjacency matrix as addresses of query points and columns of the adjacency matrix as the nearest neighbors at the query points, a band matrix compression unit configured to mark an index on each band in order to compress the band matrix and to compress data, a memory unit configured to store tile data of the band matrix, and a band matrix calculation unit configured to perform computation of the band matrix and a transposed band matrix or computation of a symmetric band matrix with respect to the band matrix and a vector.
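    The compute pattern the calculation unit needs — multiplying a band matrix or its transpose by a vector using only the stored bands — can be sketched in a few lines. This is an illustrative software stand-in for the hardware datapath; the dictionary-of-diagonals layout is an assumption, chosen because each band already carries its offset as an index.

    ```python
    import numpy as np

    def compress_band(A, bw):
        """Keep only the diagonals of A within bandwidth bw, keyed by their
        offset -- a simple stand-in for the per-band index the patent marks."""
        return {k: np.diagonal(A, k).copy() for k in range(-bw, bw + 1)}

    def band_matvec(bands, x, transpose=False):
        """y = A @ x (or A.T @ x) computed only from the stored bands; this
        is the kernel a conjugate-gradient solver calls repeatedly."""
        y = np.zeros_like(x, dtype=float)
        for k, d in bands.items():
            off = -k if transpose else k  # diagonal k of A.T is diagonal -k of A
            m = len(d)
            if off >= 0:
                y[:m] += d * x[off:off + m]     # d[i] = A[i, i+off]
            else:
                y[-off:-off + m] += d * x[:m]   # d[i] = A[i-off, i]
        return y
    ```

    For a symmetric band matrix the transposed product is identical, so only the upper (or lower) bands need to be stored — which is presumably why the abstract treats the symmetric case as an alternative to computing both A and its transpose.
    
    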

    ENERGY-EFFICIENT RETRAINING METHOD OF GENERATIVE NEURAL NETWORK FOR DOMAIN-SPECIFIC OPTIMIZATION

    Publication Number: US20230098672A1

    Publication Date: 2023-03-30

    Application Number: US17574501

    Filing Date: 2022-01-12

    Abstract: Disclosed is an energy-efficient retraining method of a generative neural network for domain-specific optimization, including (a) retraining, by a mobile device, a pretrained generative neural network model on some data of a new user dataset, (b) comparing, by the mobile device, the pretrained generative neural network model and the retrained generative neural network model with each other, layer by layer, in terms of the relative change rate of weights, (c) selecting, by the mobile device, specific layers having a high relative change rate of weights, among the layers of the pretrained generative neural network model, as layers to be retrained, and (d) performing, by the mobile device, weight update for only the layers selected in step (c), wherein only some of all layers are selected and trained in the retraining process, which requires a large amount of computation, whereby rapid retraining is performed on the mobile device.

    3D POINT CLOUD-BASED DEEP LEARNING NEURAL NETWORK ACCELERATION APPARATUS AND METHOD

    Publication Number: US20230376756A1

    Publication Date: 2023-11-23

    Application Number: US18199995

    Filing Date: 2023-05-22

    CPC classification number: G06N3/08

    Abstract: Disclosed is a 3D point cloud-based deep learning neural network acceleration apparatus including a depth image input unit configured to receive a depth image, a depth data storage unit configured to store depth data derived from the depth image, a sampling unit configured to sample the depth image in units of a sampling window having a predetermined first size, a grouping unit configured to generate a grouping window having a predetermined second size and to group inner 3D point data by grouping window, and a convolution computation unit configured to separate point feature data and group feature data, among channel-direction data of 3D point data constituting the depth image, to perform convolution computation with respect to the point feature data and the group feature data, to sum the results of convolution computation for each group formed by the grouping unit, and to derive the final result.
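    The convolution computation unit's dataflow — split the channel-direction data into point features and group features, convolve each part, then sum the results per grouping window — can be read as the following sketch. The channel split point, the 1x1-convolution-as-matmul formulation, and all names are assumptions for illustration.

    ```python
    import numpy as np

    def grouped_point_conv(points, group_ids, w_point, w_group):
        """Split each 3D point's channel-direction data into point features
        and group features, run a 1x1 convolution (a per-point matmul) on
        each part, then sum the results within each grouping window."""
        c_pt = w_point.shape[1]                 # point-feature channels
        point_feat = points[:, :c_pt]
        group_feat = points[:, c_pt:]
        out = point_feat @ w_point.T + group_feat @ w_group.T
        n_groups = int(group_ids.max()) + 1     # one row per grouping window
        acc = np.zeros((n_groups, out.shape[1]))
        np.add.at(acc, group_ids, out)          # sum convolution results by group
        return acc
    ```

    Separating the two feature streams lets the shared group features be convolved once per group rather than once per point in a hardware implementation, though this sketch keeps the math per-point for clarity.
    
    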

    FLOATING-POINT COMPUTATION APPARATUS AND METHOD USING COMPUTING-IN-MEMORY

    Publication Number: US20230195420A1

    Publication Date: 2023-06-22

    Application Number: US17741509

    Filing Date: 2022-05-11

    Abstract: Disclosed herein are a floating-point computation apparatus and method using Computing-in-Memory (CIM). The floating-point computation apparatus performs a multiply-and-accumulation operation on pieces of input neuron data represented in a floating-point format, and includes a data preprocessing unit configured to separate and extract an exponent and a mantissa from each of the pieces of input neuron data, an exponent processing unit configured to perform CIM on input neuron exponents, which are exponents separated and extracted from the input neuron data, and a mantissa processing unit configured to perform a high-speed computation on input neuron mantissas, separated and extracted from the input neuron data, wherein the exponent processing unit determines a mantissa shift size for a mantissa computation and transfers the mantissa shift size to the mantissa processing unit, and the mantissa processing unit normalizes a result of the mantissa computation and transfers a normalization value to the exponent processing unit.
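    The split between the exponent path (which picks the mantissa shift sizes) and the mantissa path (which multiplies and accumulates aligned integers) can be illustrated with a small software model. This is a sketch of the general exponent/mantissa-separated MAC idea, not the patented CIM circuit; the 10-bit mantissa width and the omission of the final normalization step are simplifying assumptions.

    ```python
    import math

    def fp_mac(pairs, mant_bits=10):
        """Multiply-and-accumulate on floating-point pairs via separate
        exponent and mantissa paths (illustrative fixed-width sketch)."""
        terms = []
        for a, b in pairs:
            ma, ea = math.frexp(a)   # a = ma * 2**ea with 0.5 <= |ma| < 1
            mb, eb = math.frexp(b)
            # mantissa path: integer multiply; exponent path: add exponents
            ia = round(ma * (1 << mant_bits))
            ib = round(mb * (1 << mant_bits))
            terms.append((ia * ib, ea + eb - 2 * mant_bits))
        # exponent unit: pick the max exponent, hence the shift per mantissa
        e_max = max(e for _, e in terms)
        acc = sum(m >> (e_max - e) for m, e in terms)  # align, then accumulate
        return acc * 2.0 ** e_max
    ```

    In the apparatus the two paths run concurrently: the exponent unit hands shift sizes to the mantissa unit, and the mantissa unit returns a normalization value so the exponent of the accumulated result can be adjusted, a round trip this sketch folds into the final scaling.
    
    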
