NEURAL ARCHITECTURE SEARCH VIA SIMILARITY-BASED OPERATOR RANKING

    Publication number: US20220092381A1

    Publication date: 2022-03-24

    Application number: US17058624

    Filing date: 2020-09-18

    IPC classification: G06N3/04 G06N3/08

    Abstract: Neural architecture search (NAS) has received considerable attention. The supernet-based differentiable approach is popular because it effectively shares weights and leads to a more efficient search. However, the mismatch between architecture and weights caused by weight sharing persists, and the coupling effects among different operators are neglected. To alleviate these problems, embodiments of an effective NAS methodology based on similarity-based operator ranking are presented herein. With the aim of approximating each layer's output in the supernet, a similarity-based operator ranking derived from statistical random comparison is used. In one or more embodiments, the operator that likely causes the least change to the feature distribution discrepancy is then pruned. In one or more embodiments, a fair sampling process may be used to mitigate the Matthew effect among operators that frequently occurs in previous supernet approaches.
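
    The ranking step can be pictured with a minimal PyTorch sketch (illustrative only, not the claimed implementation). It scores each candidate operator in a supernet layer by how much the layer's mixed output changes when that operator is left out, using an L2 distance as an assumed stand-in for the feature distribution discrepancy; drawing the inputs from uniformly (fairly) sampled batches would follow the fair-sampling idea.

    import torch

    def score_operator_removal(candidate_ops, x):
        # candidate_ops: list of nn.Module candidates for one supernet layer
        # x: a batch of inputs drawn by fair (uniform) sampling
        with torch.no_grad():
            outs = torch.stack([op(x) for op in candidate_ops])  # [num_ops, N, C, H, W]
            full_mix = outs.mean(dim=0)                           # layer's aggregate output
            impacts = []
            for i in range(len(candidate_ops)):
                keep = [j for j in range(len(candidate_ops)) if j != i]
                mix_without_i = outs[keep].mean(dim=0)
                # L2 distance as an assumed proxy for the feature distribution discrepancy
                impacts.append(torch.norm(full_mix - mix_without_i).item())
        return impacts  # prune candidate_ops[argmin(impacts)]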

    AUTOMATIC CHANNEL PRUNING VIA GRAPH NEURAL NETWORK BASED HYPERNETWORK

    Publication number: US20230084203A1

    Publication date: 2023-03-16

    Application number: US17846555

    Filing date: 2022-06-22

    Applicant: Baidu USA, LLC

    IPC classification: G06N3/08 G06N3/04

    Abstract: Model pruning is used to trim large neural networks, such as convolutional neural networks (CNNs), to reduce computation overheads. Existing model pruning methods rely mainly on heuristic rules or local relationships between CNN layers. A novel hypernetwork based on a graph neural network is disclosed for generating and evaluating pruned networks. A graph is first constructed according to the information flow of channels and layers in a CNN, with channels and layers represented as nodes and information flows represented as edges. A graph neural network is applied to aggregate both local and global dependencies across all channels and layers of the CNN, resulting in informative node embeddings. With such embeddings, pruned CNNs, including their architectures and weights, may be effectively generated and evaluated.
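
    As a rough illustration of the graph construction and aggregation, the following sketch (PyTorch; the layer sizes and message-passing form are assumptions, not the disclosed hypernetwork) runs a small message-passing network over a channel/layer adjacency matrix and emits per-node embeddings plus a keep probability for each channel node.

    import torch
    import torch.nn as nn

    class ChannelGraphGNN(nn.Module):
        # Nodes represent channels and layers; edges represent information flow.
        def __init__(self, in_dim, hid_dim, num_rounds=2):
            super().__init__()
            self.proj = nn.Linear(in_dim, hid_dim)
            self.msg = nn.Linear(hid_dim, hid_dim)
            self.score = nn.Linear(hid_dim, 1)
            self.num_rounds = num_rounds

        def forward(self, node_feats, adj):
            # node_feats: [num_nodes, in_dim]; adj: [num_nodes, num_nodes] adjacency matrix
            h = torch.relu(self.proj(node_feats))
            deg = adj.sum(dim=-1, keepdim=True).clamp(min=1.0)
            for _ in range(self.num_rounds):
                h = torch.relu(self.msg(adj @ h / deg) + h)  # neighbor average + residual
            keep_prob = torch.sigmoid(self.score(h)).squeeze(-1)
            # h would feed the hypernetwork that generates pruned weights;
            # keep_prob can be thresholded to decide which channels to keep
            return h, keep_prob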

    RANK SELECTION IN TENSOR DECOMPOSITION BASED ON REINFORCEMENT LEARNING FOR DEEP NEURAL NETWORKS

    Publication number: US20210241094A1

    Publication date: 2021-08-05

    Application number: US16979522

    Filing date: 2019-11-26

    IPC classification: G06N3/08 G06N3/04

    Abstract: Tensor decomposition can be advantageous for compressing deep neural networks (DNNs). In many applications of DNNs, reducing the number of parameters and the computation workload helps accelerate inference in deployment. Modern DNNs comprise multiple layers with multi-array weights, for which tensor decomposition is a natural way to perform compression: the weight tensors in convolutional or fully-connected layers are decomposed with specified tensor ranks (e.g., canonical ranks, tensor-train ranks). Conventional tensor decomposition for DNNs involves selecting ranks manually, which requires tedious human effort to fine-tune performance. Accordingly, presented herein are rank selection embodiments, inspired by reinforcement learning, that automatically select ranks in tensor decomposition. Experimental results validate that the learning-based rank selection embodiments significantly outperform hand-crafted rank selection heuristics on a number of tested datasets, effectively compressing deep neural networks while maintaining comparable accuracy.
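
    One hedged way to picture the learning-based rank selection is a simple REINFORCE loop over a per-layer categorical policy, sketched below; the candidate rank list, the evaluate_fn callback, and the reward shape (accuracy minus a size penalty) are illustrative assumptions rather than the disclosed method.

    import torch
    import torch.nn as nn

    class RankPolicy(nn.Module):
        # One categorical distribution over candidate ranks per layer.
        def __init__(self, num_layers, candidate_ranks):
            super().__init__()
            self.logits = nn.Parameter(torch.zeros(num_layers, len(candidate_ranks)))
            self.candidate_ranks = candidate_ranks

        def sample(self):
            dist = torch.distributions.Categorical(logits=self.logits)
            idx = dist.sample()                                  # one rank index per layer
            ranks = [self.candidate_ranks[i] for i in idx.tolist()]
            return ranks, dist.log_prob(idx).sum()

    def reinforce_step(policy, optimizer, evaluate_fn, size_penalty=1e-6):
        # evaluate_fn(ranks) -> (accuracy, num_params) after decomposing with these ranks
        ranks, log_prob = policy.sample()
        accuracy, num_params = evaluate_fn(ranks)
        reward = accuracy - size_penalty * num_params            # assumed reward trade-off
        loss = -reward * log_prob                                # REINFORCE gradient estimator
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return ranks, reward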

    DEEP LEARNING MODEL EMBODIMENTS AND TRAINING EMBODIMENTS FOR FASTER TRAINING

    Publication number: US20210110213A1

    Publication date: 2021-04-15

    Application number: US16600148

    Filing date: 2019-10-11

    Applicant: Baidu USA, LLC

    Abstract: Presented herein are embodiments of deep learning models and of methods for training them. In one or more embodiments, a compact deep learning model comprises fewer layers and therefore requires fewer floating-point operations (FLOPs). Also presented herein are embodiments of a new learning rate function, which adaptively changes the learning rate between two linear functions. In one or more embodiments, half-precision floating-point training combined with a larger batch size may also be employed to aid the training process.
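
    A schedule that moves between two linear functions could look like the following sketch; the warm-up fraction and end value are assumptions, chosen only to show a ramp along one linear segment followed by a decay along another. Half-precision training with a larger batch size would typically be layered on top, for example via an automatic mixed-precision wrapper such as torch.cuda.amp.

    def two_linear_lr(step, total_steps, peak_lr, warmup_frac=0.1, end_lr=0.0):
        # First linear segment: ramp from 0 to peak_lr over the warm-up steps.
        warmup_steps = int(warmup_frac * total_steps)
        if step < warmup_steps:
            return peak_lr * step / max(warmup_steps, 1)
        # Second linear segment: decay from peak_lr to end_lr over the remaining steps.
        frac = (step - warmup_steps) / max(total_steps - warmup_steps, 1)
        return peak_lr + (end_lr - peak_lr) * frac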

    ROBUST AND EFFICIENT BLIND SUPER-RESOLUTION USING VARIATIONAL KERNEL AUTOENCODER

    Publication number: US20240185386A1

    Publication date: 2024-06-06

    Application number: US18556653

    Filing date: 2021-09-30

    IPC classification: G06T3/4076 G06T3/4046

    CPC classification: G06T3/4076 G06T3/4046

    Abstract: Image super-resolution (SR) refers to the process of recovering high-resolution (HR) images from low-resolution (LR) inputs. Blind image SR is a more challenging task in which unknown blurring kernels characterize the degradation process from HR to LR. In the present disclosure, embodiments of a variational autoencoder (VAE) are leveraged to train a kernel autoencoder for more accurate degradation representation and more efficient kernel estimation. In one or more embodiments, a kernel-agnostic loss is used to learn more robust kernel features in the latent space from LR inputs without using ground-truth kernel references. In addition, attention-based adaptive pooling is introduced to improve kernel estimation accuracy, and spatially non-uniform kernel features are passed into SR restoration, providing additional tolerance to kernel estimation errors. Extensive experiments on synthetic and real-world images show that embodiments of the presented model outperform state-of-the-art methods significantly, raising the peak signal-to-noise ratio (PSNR) considerably.
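
    A minimal sketch of the kernel-encoding idea is given below, assuming a small convolutional encoder with attention-based adaptive pooling and a VAE-style reparameterized latent; the channel sizes and layer choices are illustrative, not the disclosed architecture.

    import torch
    import torch.nn as nn

    class KernelEncoder(nn.Module):
        # Infers a latent blur-kernel representation directly from the LR image.
        def __init__(self, in_ch=3, feat=64, latent=32):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(in_ch, feat, 3, padding=1), nn.ReLU(),
                nn.Conv2d(feat, feat, 3, padding=1), nn.ReLU())
            self.attn = nn.Conv2d(feat, 1, 1)                    # attention logits per position
            self.to_mu = nn.Linear(feat, latent)
            self.to_logvar = nn.Linear(feat, latent)

        def forward(self, lr_img):
            f = self.features(lr_img)                            # [N, feat, H, W]
            w = torch.softmax(self.attn(f).flatten(2), dim=-1)   # [N, 1, H*W]
            pooled = (f.flatten(2) * w).sum(dim=-1)              # attention-weighted pooling
            mu, logvar = self.to_mu(pooled), self.to_logvar(pooled)
            z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization trick
            return z, mu, logvar                                 # z would condition the SR restorer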

    DEEP RESIDUAL NETWORK FOR COLOR FILTER ARRAY IMAGE DENOISING

    Publication number: US20210241429A1

    Publication date: 2021-08-05

    Application number: US16981866

    Filing date: 2020-01-23

    Abstract: Described herein are embodiments of a deep residual network dedicated to color filter array mosaic patterns. A mosaic stride convolution layer is introduced to match the mosaic pattern of a multispectral filter array (MSFA) or a color filter array raw image. Embodiments of data augmentation using MSFA shifting and dynamic noise are applied to make the model robust to different noise levels. Embodiments of network optimization criteria may be created by using the noise standard deviation to normalize the L1 loss function. Comprehensive experiments demonstrate that embodiments of the disclosed deep residual network outperform state-of-the-art denoising algorithms in the MSFA field.
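
    Two of the ingredients admit a short sketch: a first convolution whose stride equals the mosaic period (assumed to be 2 here, as for a Bayer-like pattern), so that every filter always sees the same filter-array phase, and an L1 loss normalized by the noise standard deviation. The kernel size and channel count below are assumptions for illustration.

    import torch
    import torch.nn as nn

    class MosaicStrideConv(nn.Module):
        # Stride equal to the mosaic period keeps each filter aligned with one CFA phase.
        def __init__(self, out_ch=64, period=2):
            super().__init__()
            self.conv = nn.Conv2d(1, out_ch, kernel_size=2 * period,
                                  stride=period, padding=period // 2)

        def forward(self, raw):                                  # raw: [N, 1, H, W] mosaic image
            return self.conv(raw)

    def noise_normalized_l1(pred, target, noise_std):
        # L1 loss divided by the noise standard deviation so that samples with
        # different noise levels contribute on a comparable scale.
        return (torch.abs(pred - target) / noise_std).mean()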

    CURSOR-BASED ADAPTIVE QUANTIZATION FOR DEEP NEURAL NETWORKS

    Publication number: US20210232890A1

    Publication date: 2021-07-29

    Application number: US16966834

    Filing date: 2019-09-24

    IPC classification: G06N3/04 G06N3/08

    Abstract: Deep neural network (DNN) model quantization may be used to reduce storage and computation burdens by decreasing the bit width. Presented herein are novel cursor-based adaptive quantization embodiments. In embodiments, a multiple-bit quantization mechanism is formulated as a differentiable architecture search (DAS) process with a continuous cursor that represents a possible quantization bit width. In embodiments, the cursor-based DAS adaptively searches for a quantization bit width for each layer. The DAS process may be accelerated via an alternative approximate optimization process, which is designed for the mixed quantization scheme of a DNN model. In embodiments, a new loss function is used in the search process to simultaneously optimize accuracy and parameter size of the model. In the quantization step, the two integers closest to the cursor may be adopted together as the bit widths for quantizing the DNN, reducing quantization noise and avoiding the local convergence problem.
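
    The cursor mechanism can be pictured roughly as follows; the symmetric fake-quantizer and the linear blend of the two neighboring bit widths are assumptions made for illustration, and a straight-through estimator would be needed in practice for gradients through the rounding.

    import torch

    def fake_quantize(w, bits):
        # Uniform symmetric fake-quantization of a weight tensor to 'bits' bits (bits >= 2).
        qmax = 2 ** (bits - 1) - 1
        scale = (w.abs().max() / qmax).clamp(min=1e-8)
        return torch.round(w / scale).clamp(-qmax, qmax) * scale

    def cursor_quantize(w, cursor):
        # The two integers closest to the continuous cursor are both applied,
        # blended by the cursor's fractional position (differentiable w.r.t. the cursor).
        lo = int(torch.floor(cursor))
        hi = lo + 1
        frac = cursor - lo
        return (1 - frac) * fake_quantize(w, lo) + frac * fake_quantize(w, hi)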

    VIDEO ACTION SEGMENTATION BY MIXED TEMPORAL DOMAIN ADAPTION

    Publication number: US20210174093A1

    Publication date: 2021-06-10

    Application number: US16706590

    Filing date: 2019-12-06

    Applicant: Baidu USA, LLC

    Abstract: Embodiments herein treat action segmentation as a domain adaptation (DA) problem and reduce the domain discrepancy by performing unsupervised DA with auxiliary unlabeled videos. To reduce domain discrepancy along both the spatial and temporal directions, embodiments of a Mixed Temporal Domain Adaptation (MTDA) approach are presented that jointly align frame-level and video-level embedded feature spaces across domains. In one or more embodiments, a domain attention mechanism is further integrated to focus on aligning the frame-level features with higher domain discrepancy, leading to more effective domain adaptation. Comprehensive experimental results validate that embodiments outperform previous state-of-the-art methods. Embodiments can adapt models effectively by using auxiliary unlabeled videos, leading to further applications in large-scale problems, such as video surveillance and human activity analysis.
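
    The joint frame-level and video-level alignment with domain attention might be sketched as below; the gradient-reversal adversarial heads and the entropy-based attention weights are common choices assumed here for illustration, not necessarily the disclosed formulation.

    import torch
    import torch.nn as nn

    class GradReverse(torch.autograd.Function):
        # Gradient reversal layer used for adversarial domain alignment.
        @staticmethod
        def forward(ctx, x):
            return x.view_as(x)
        @staticmethod
        def backward(ctx, grad):
            return -grad

    class MTDAHead(nn.Module):
        # Frame-level and video-level domain classifiers; frames whose domain
        # prediction is most uncertain (largest discrepancy) receive larger attention.
        def __init__(self, feat_dim):
            super().__init__()
            self.frame_disc = nn.Linear(feat_dim, 2)
            self.video_disc = nn.Linear(feat_dim, 2)

        def forward(self, frame_feats):                          # frame_feats: [N, T, feat_dim]
            frame_logits = self.frame_disc(GradReverse.apply(frame_feats))
            p = torch.softmax(frame_logits, dim=-1)
            entropy = -(p * torch.log(p + 1e-8)).sum(dim=-1)     # domain-discrepancy proxy
            attn = 1.0 + entropy                                 # domain attention weights
            video_feat = (frame_feats * attn.unsqueeze(-1)).mean(dim=1)
            video_logits = self.video_disc(GradReverse.apply(video_feat))
            return frame_logits, video_logits, attn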