-
Publication No.: US20220092381A1
Publication date: 2022-03-24
Application No.: US17058624
Filing date: 2020-09-18
Inventors: Baopu LI, Yanwen FAN, Zhihong PAN, Teng XI, Gang ZHANG
Abstract: Network architecture search (NAS) has received a lot of attention. The supernet-based differentiable approach is popular because it can effectively share weights and lead to a more efficient search. However, the mismatch between the architecture and weights caused by weight sharing still exists. Moreover, the coupling effects among different operators are also neglected. To alleviate these problems, embodiments of an effective NAS methodology using similarity-based operator ranking are presented herein. With the aim of approximating each layer's output in the supernet, a similarity-based operator ranking based on statistical random comparison is used. In one or more embodiments, the operator that is likely to cause the least change to the feature distribution discrepancy is then pruned. In one or more embodiments, a fair sampling process may be used to mitigate the operators' Matthew effect that occurred frequently in previous supernet approaches.
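One way to picture the fair sampling idea is to draw operators without replacement, so that every candidate operator in a layer is visited equally often. The sketch below illustrates that assumption only; it is not the procedure claimed in the application, and `fair_operator_rounds` and the operator names are hypothetical.

```python
import random
from collections import Counter


def fair_operator_rounds(operators, num_rounds, seed=0):
    """Yield operator choices so that each operator appears once per round.

    Assumption: "fair" is read here as uniform sampling without replacement
    within a round. The abstract does not spell out the exact scheme.
    """
    rng = random.Random(seed)
    for _ in range(num_rounds):
        order = list(operators)
        rng.shuffle(order)  # random order, but every operator is visited
        for op in order:
            yield op


# Hypothetical candidate operators for one supernet layer.
ops = ["conv3x3", "conv5x5", "skip", "maxpool"]
counts = Counter(fair_operator_rounds(ops, num_rounds=10))
```

Because each round is a permutation, no operator can accumulate more training steps than the others, which is the imbalance the Matthew effect describes.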
-
Publication No.: US20210241421A1
Publication date: 2021-08-05
Application No.: US16960304
Filing date: 2019-07-05
Inventors: Zhihong PAN, Baopu LI, Yingze BAO, Hsuchun CHENG
Abstract: Described herein are systems and embodiments for multispectral image demosaicking using deep panchromatic image guided residual interpolation. Embodiments of a ResNet-based deep learning model are disclosed to reconstruct the full-resolution panchromatic image from a multispectral filter array (MSFA) mosaic image. In one or more embodiments, the reconstructed deep panchromatic image (DPI) is deployed as the guide to recover the full-resolution multispectral image using a two-pass guided residual interpolation methodology. Experimental results demonstrate that the disclosed method embodiments outperform some state-of-the-art conventional and deep learning demosaicking methods both qualitatively and quantitatively.
-
Publication No.: US20210241094A1
Publication date: 2021-08-05
Application No.: US16979522
Filing date: 2019-11-26
Inventors: Zhiyu CHENG, Baopu LI, Yanwen FAN, Yingze BAO
Abstract: Tensor decomposition can be advantageous for compressing deep neural networks (DNNs). In many applications of DNNs, reducing the number of parameters and the computation workload helps to accelerate inference in deployment. Modern DNNs comprise multiple layers with multi-array weights, where tensor decomposition is a natural way to perform compression: the weight tensors in convolutional layers or fully-connected layers are decomposed with specified tensor ranks (e.g., canonical ranks, tensor train ranks). Conventional tensor decomposition with DNNs involves selecting ranks manually, which requires tedious human effort to fine-tune the performance. Accordingly, presented herein are rank selection embodiments, which are inspired by reinforcement learning, to automatically select ranks in tensor decomposition. Experimental results validate that the learning-based rank selection embodiments significantly outperform hand-crafted rank selection heuristics on a number of tested datasets, for the purpose of effectively compressing deep neural networks while maintaining comparable accuracy.
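To make the role of the rank concrete, the following sketch counts parameters before and after decomposition. It covers only the arithmetic of a plain low-rank matrix factorization and a canonical (CP) decomposition of a convolution kernel; the layer sizes and the rank of 64 are illustrative assumptions, and the reinforcement-learning rank selection itself is not shown.

```python
def dense_params(rows, cols):
    """Parameters in an uncompressed fully-connected weight matrix."""
    return rows * cols


def low_rank_params(rows, cols, rank):
    """Parameters when W (rows x cols) is replaced by A (rows x rank) @ B (rank x cols)."""
    return rank * (rows + cols)


def cp_conv_params(c_out, c_in, k_h, k_w, rank):
    """Parameters in a rank-`rank` CP decomposition of a 4-way conv kernel.

    CP stores one factor matrix per tensor mode, each with `rank` columns.
    """
    return rank * (c_out + c_in + k_h + k_w)


# Illustrative (assumed) layer shapes and rank.
fc_full = dense_params(4096, 4096)           # 16,777,216
fc_compressed = low_rank_params(4096, 4096, 64)  # 524,288
conv_full = 256 * 256 * 3 * 3                # 589,824
conv_compressed = cp_conv_params(256, 256, 3, 3, 64)  # 33,152
```

The parameter count grows linearly with the rank, so the chosen rank directly sets the trade-off between compression and accuracy that manual tuning would otherwise have to find.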
-
Publication No.: US20240185386A1
Publication date: 2024-06-06
Application No.: US18556653
Filing date: 2021-09-30
Inventors: Zhihong PAN, Baopu LI, Dongliang HE, Wenhao WU, Tianwei LIN
IPC classification: G06T3/4076, G06T3/4046
CPC classification: G06T3/4076, G06T3/4046
Abstract: Image super-resolution (SR) refers to the process of recovering high-resolution (HR) images from low-resolution (LR) inputs. Blind image SR is a more challenging task which involves unknown blurring kernels and characterizes the degradation process from HR to LR. In the present disclosure, embodiments of a variational autoencoder (VAE) are leveraged to train a kernel autoencoder for more accurate degradation representation and more efficient kernel estimation. In one or more embodiments, a kernel-agnostic loss is used to learn more robust kernel features in the latent space from LR inputs without using ground-truth kernel references. In addition, attention-based adaptive pooling is introduced to improve kernel estimation accuracy, and spatially non-uniform kernel features are passed into SR restoration, resulting in additional kernel estimation error tolerance. Extensive experiments on synthetic and real-world images show that embodiments of the presented model significantly outperform state-of-the-art methods, with the peak signal-to-noise ratio (PSNR) raised considerably.
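For reference, the PSNR metric mentioned above follows the standard definition 10 * log10(MAX^2 / MSE). The helper below is a minimal sketch over flat pixel lists; the function name `psnr` and the default peak value of 255 (8-bit images) are assumptions for illustration, not details from the application.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: no noise to measure
    return 10.0 * math.log10(max_value ** 2 / mse)


# Each pixel is off by one grey level, so the MSE is 1.
score = psnr([10.0, 20.0], [11.0, 19.0])
```

Higher values mean the restored image is closer to the reference, which is why a raised PSNR is reported as an improvement.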
-
Publication No.: US20210241429A1
Publication date: 2021-08-05
Application No.: US16981866
Filing date: 2020-01-23
Inventors: Zhihong PAN, Baopu LI, Hsuchun CHENG, Yingze BAO
Abstract: Described herein are embodiments of a deep residual network dedicated to color filter array mosaic patterns. A mosaic stride convolution layer is introduced to match the mosaic pattern of a multispectral filter array (MSFA) or a color filter array raw image. Embodiments of data augmentation using MSFA shifting and dynamic noise are applied to make the model robust to different noise levels. Embodiments of network optimization criteria may be created by using the noise standard deviation to normalize the L1 loss function. Comprehensive experiments demonstrate that embodiments of the disclosed deep residual network outperform state-of-the-art denoising algorithms in the MSFA field.
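The noise-normalized L1 criterion can be sketched as the mean absolute error divided by the noise standard deviation. This is one plausible reading of the abstract rather than the claimed formula; the function name `noise_normalized_l1` and the small `eps` guard against division by zero are assumptions.

```python
def noise_normalized_l1(prediction, target, noise_std, eps=1e-8):
    """Mean absolute error scaled by the noise level of the training sample.

    Assumption: normalization is a plain division by the noise standard
    deviation; `eps` is added only to avoid dividing by zero.
    """
    if len(prediction) != len(target) or not target:
        raise ValueError("inputs must be non-empty and of equal length")
    l1 = sum(abs(p - t) for p, t in zip(prediction, target)) / len(target)
    return l1 / (noise_std + eps)


# The same absolute error counts for more when the injected noise was small.
loss_low_noise = noise_normalized_l1([1.0, 3.0], [0.0, 1.0], noise_std=0.5)
loss_high_noise = noise_normalized_l1([1.0, 3.0], [0.0, 1.0], noise_std=2.0)
```

Scaling this way keeps samples with heavy noise from dominating the loss, which fits the stated goal of robustness across noise levels.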
-
Publication No.: US20210232890A1
Publication date: 2021-07-29
Application No.: US16966834
Filing date: 2019-09-24
Inventors: Baopu LI, Yanwen FAN, Zhiyu CHENG, Yingze BAO
Abstract: Deep neural network (DNN) model quantization may be used to reduce storage and computation burdens by decreasing the bit width. Presented herein are novel cursor-based adaptive quantization embodiments. In embodiments, a multi-bit quantization mechanism is formulated as a differentiable architecture search (DAS) process with a continuous cursor that represents a possible quantization bit. In embodiments, the cursor-based DAS adaptively searches for a quantization bit for each layer. The DAS process may be accelerated via an alternative approximate optimization process, which is designed for a mixed quantization scheme of a DNN model. In embodiments, a new loss function is used in the search process to simultaneously optimize accuracy and parameter size of the model. In a quantization step, the closest two integers to the cursor may be adopted as the bits to quantize the DNN together to reduce the quantization noise and avoid the local convergence problem.
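The final quantization step can be illustrated with a continuous cursor such as 2.3, whose two closest integers are 2 and 3. The sketch below quantizes a weight list at both bit widths and blends the results by how close the cursor is to each integer. The uniform min-max quantizer and the linear blend are assumptions chosen for illustration, and `uniform_quantize` and `cursor_quantize` are hypothetical names.

```python
import math


def uniform_quantize(weights, bits):
    """Snap weights to 2**bits evenly spaced levels between their min and max."""
    low, high = min(weights), max(weights)
    if high == low:
        return list(weights)
    step = (high - low) / (2 ** bits - 1)
    return [low + round((w - low) / step) * step for w in weights]


def cursor_quantize(weights, cursor):
    """Quantize with the two integer bit widths nearest the cursor and blend.

    Assumption: the blend weight is the cursor's fractional part, so a cursor
    of 2.3 mixes 70% of the 2-bit result with 30% of the 3-bit result.
    """
    if cursor < 1:
        raise ValueError("cursor must be at least 1 bit")
    lower_bits = math.floor(cursor)
    frac = cursor - lower_bits
    q_low = uniform_quantize(weights, lower_bits)
    q_high = uniform_quantize(weights, lower_bits + 1)
    return [(1.0 - frac) * a + frac * b for a, b in zip(q_low, q_high)]


layer_weights = [-1.0, -0.2, 0.3, 1.0]
blended = cursor_quantize(layer_weights, cursor=2.3)
```

Because the blend varies smoothly with the cursor, the output stays differentiable with respect to the cursor away from integer values, which is what lets a gradient-based search move it.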
-
Publication No.: US20230084203A1
Publication date: 2023-03-16
Application No.: US17846555
Filing date: 2022-06-22
Applicant: Baidu USA, LLC
Inventors: Baopu LI, Qiuling SUO, Yuchen BIAN
Abstract: Model pruning is used to trim large neural networks, like convolutional neural networks (CNNs), to reduce computation overheads. Existing model pruning methods mainly rely on heuristic rules or local relationships of CNN layers. A novel hypernetwork based on a graph neural network is disclosed for generating and evaluating pruned networks. A graph is first constructed according to the information flow of channels and layers in a CNN, with channels and layers represented as nodes and information flows represented as edges. A graph neural network is applied to aggregate both local and global dependencies across all channels and layers of the CNN, resulting in informative node embeddings. With such embeddings, pruned CNNs, including their architectures and weights, may be effectively generated and evaluated.
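The graph construction and aggregation steps can be sketched with a toy three-layer network. Each layer becomes a node, each information flow becomes an edge, and one round of mean aggregation mixes a node's feature with its neighbours' features. This is an untrained stand-in for a graph neural network layer, not the disclosed hypernetwork; the layer names, the one-dimensional features, and both helper functions are hypothetical.

```python
def build_layer_graph(layer_names, flows):
    """Build an adjacency list with layers as nodes and flows as edges."""
    adjacency = {name: [] for name in layer_names}
    for src, dst in flows:
        adjacency[src].append(dst)
        adjacency[dst].append(src)  # treated as undirected for aggregation
    return adjacency


def mean_aggregate(adjacency, features):
    """One message-passing round: average each node with its neighbours."""
    updated = {}
    for node, neighbours in adjacency.items():
        group = [features[node]] + [features[n] for n in neighbours]
        dim = len(features[node])
        updated[node] = [sum(vec[i] for vec in group) / len(group) for i in range(dim)]
    return updated


# Toy chain conv1 -> conv2 -> conv3, with channel count as the node feature.
graph = build_layer_graph(["conv1", "conv2", "conv3"],
                          [("conv1", "conv2"), ("conv2", "conv3")])
embeddings = mean_aggregate(graph, {"conv1": [16.0], "conv2": [32.0], "conv3": [64.0]})
```

Repeating the aggregation lets information travel further along the graph, which is how local steps can come to reflect global dependencies.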
-
Publication No.: US20210110213A1
Publication date: 2021-04-15
Application No.: US16600148
Filing date: 2019-10-11
Applicant: Baidu USA, LLC
Inventors: Baopu LI, Zhiyu CHENG, Yingze BAO
Abstract: Presented herein are embodiments for training deep learning models. In one or more embodiments, a compact deep learning model comprises fewer layers, which require fewer floating-point operations (FLOPs). Also presented herein are embodiments of a new learning rate function, which can adaptively change the learning rate between two linear functions. In one or more embodiments, combinations of half-precision floating-point format training together with a larger batch size in the training process may also be employed to aid the training process.
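The abstract does not define the learning rate function, so the sketch below shows only one possible interpretation: a schedule built from two linear segments, a ramp up to a peak followed by a linear decay. The function name `two_segment_lr`, its parameters, and the example values are all assumptions for illustration.

```python
def two_segment_lr(step, total_steps, peak_step, base_lr, peak_lr, final_lr):
    """Piecewise-linear schedule: base_lr -> peak_lr -> final_lr.

    Assumption: the "two linear functions" are a rising and a falling
    segment that meet at `peak_step`.
    """
    if not 0 < peak_step < total_steps:
        raise ValueError("peak_step must lie strictly inside the schedule")
    step = min(max(step, 0), total_steps)  # clamp to the valid range
    if step <= peak_step:
        return base_lr + (peak_lr - base_lr) * step / peak_step
    remaining = total_steps - peak_step
    return peak_lr + (final_lr - peak_lr) * (step - peak_step) / remaining


# Hypothetical 100-step run that peaks at step 20.
schedule = [two_segment_lr(s, 100, 20, 0.0, 0.4, 0.0) for s in range(101)]
```

A larger batch size usually tolerates a higher peak rate, so a schedule of this shape pairs naturally with the large-batch training the abstract mentions.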
-
Publication No.: US20210174093A1
Publication date: 2021-06-10
Application No.: US16706590
Filing date: 2019-12-06
Applicant: Baidu USA, LLC
Inventors: Baopu LI, Min-Hung CHEN, Yingze BAO
Abstract: Embodiments herein treat action segmentation as a domain adaptation (DA) problem and reduce the domain discrepancy by performing unsupervised DA with auxiliary unlabeled videos. In one or more embodiments, to reduce domain discrepancy in both the spatial and temporal directions, embodiments of a Mixed Temporal Domain Adaptation (MTDA) approach are presented to jointly align frame-level and video-level embedded feature spaces across domains. In one or more embodiments, the approach is further integrated with a domain attention mechanism to focus on aligning the frame-level features with higher domain discrepancy, leading to more effective domain adaptation. Comprehensive experimental results validate that embodiments outperform previous state-of-the-art methods. Embodiments can adapt models effectively by using auxiliary unlabeled videos, leading to further applications to large-scale problems, such as video surveillance and human activity analysis.
-