-
Publication No.: US20240214592A1
Publication Date: 2024-06-27
Application No.: US18458507
Filing Date: 2023-08-30
Applicant: Tencent America LLC
Inventors: Yang SUI, Ding DING, Xiaozhong XU, Shan LIU
IPC Classification: H04N19/42, H04N19/119, H04N19/167, H04N19/176
CPC Classification: H04N19/42, H04N19/119, H04N19/167, H04N19/176
Abstract: Methods and apparatuses for decoding a compressed image using a neural image compression network may be provided. The method may include generating long-range context model parameters associated with a high-resolution compressed image, the long-range context model parameters corresponding to a first area. The method may also include splitting the generated long-range context model parameters into a first number of context parameter blocks. The method may also include, for each block in the first number of context parameter blocks, predicting respective context features using a long-range context model and the respective context parameter block, wherein the long-range context model uses a corner-to-center latent decoding strategy or an edge-to-center latent decoding strategy to decode latents associated with the high-resolution compressed image. The high-resolution compressed image may then be reconstructed based on the predicted context features.
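The block splitting and the corner-to-center ordering mentioned in the abstract above can be pictured with a small sketch. The abstract does not define either operation precisely, so everything here is an assumption made for illustration: the helper names `split_into_blocks` and `corner_to_center_order` are hypothetical, the parameters are modelled as a plain 2D list, and "corner-to-center" is read as "visit positions in order of increasing distance from the nearest corner". It is not the patented procedure.

```python
def split_into_blocks(grid, block_h, block_w):
    """Split a 2D list into row-major (top, left, block) tuples.

    Assumption: the "context parameter blocks" are rectangular tiles.
    Edge tiles are smaller when the grid size is not a multiple of the block size.
    """
    rows, cols = len(grid), len(grid[0])
    blocks = []
    for top in range(0, rows, block_h):
        for left in range(0, cols, block_w):
            block = [row[left:left + block_w] for row in grid[top:top + block_h]]
            blocks.append((top, left, block))
    return blocks


def corner_to_center_order(height, width):
    """List (row, col) positions, corners first and the center last.

    Assumption: order is by Manhattan distance to the nearest corner;
    ties keep row-major order because Python's sort is stable.
    """
    def distance_to_nearest_corner(pos):
        r, c = pos
        return min(r, height - 1 - r) + min(c, width - 1 - c)

    positions = [(r, c) for r in range(height) for c in range(width)]
    return sorted(positions, key=distance_to_nearest_corner)


if __name__ == "__main__":
    grid = [[4 * r + c for c in range(4)] for r in range(4)]
    for top, left, block in split_into_blocks(grid, 2, 2):
        print((top, left), block)
    print(corner_to_center_order(3, 3))
```

In a real decoder the order would determine which already-decoded latents are available as context for the next one; the sketch only produces the visiting order.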
-
Publication No.: US20240212215A1
Publication Date: 2024-06-27
Application No.: US18458595
Filing Date: 2023-08-30
Applicant: Tencent America LLC
Inventors: Yang SUI, Ding DING, Xiaozhong XU, Shan LIU
CPC Classification: G06T9/00, G06T7/13, G06V10/44, G06T2207/20164
Abstract: Methods and apparatuses for decoding a compressed image using a neural image compression network are provided. The method may include generating context parameters associated with a compressed image, the context parameters corresponding to a first area. The method may also include determining that a long-range global dependency exists between a first latent and a second latent in a long-range global area within the compressed image, and predicting a plurality of context features using a transformer-based long-range context prediction model. The method may then include reconstructing the compressed image based on the predicted plurality of context features.
-
Publication No.: US20220405979A1
Publication Date: 2022-12-22
Application No.: US17826806
Filing Date: 2022-05-27
Applicant: Tencent America LLC
Abstract: Aspects of the disclosure provide a method, an apparatus, and a non-transitory computer-readable storage medium for video decoding. The apparatus includes processing circuitry that reconstructs blocks of an image that is to be reconstructed from a coded video bitstream. The processing circuitry decodes first deblocking information in the coded video bitstream including a first deblocking parameter of a deep neural network (DNN) in a video decoder. The first deblocking parameter of the DNN is an updated parameter that has been previously determined by a content adaptive training process. The processing circuitry determines the DNN for a first boundary region comprising a subset of samples in the reconstructed blocks based on the first deblocking parameter included in the first deblocking information. The processing circuitry deblocks the first boundary region comprising the subset of samples in the reconstructed blocks based on the determined DNN corresponding to the first deblocking parameter.
-
Publication No.: US20240333950A1
Publication Date: 2024-10-03
Application No.: US18616793
Filing Date: 2024-03-26
Applicant: TENCENT AMERICA LLC
Inventors: Ding DING, Xiaozhong XU, Shan LIU
IPC Classification: H04N19/44, H04N19/176, H04N19/88
CPC Classification: H04N19/44, H04N19/176, H04N19/88
Abstract: A method and apparatus comprising computer code configured to cause a processor or processors to receive a video bitstream comprising a current block in a current picture and to reconstruct the current block by transforming the current block with a neural network comprising a plurality of upsample modules and activation modules, wherein at least one of the upsample modules includes a convolution layer and a pixel shuffle layer, and at least one of the activation modules includes a LeakyReLU function and a convolution function.
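For readers unfamiliar with the two building blocks named in the abstract above, the following is a minimal pure-Python sketch of a pixel shuffle rearrangement and a LeakyReLU activation. It assumes the channel ordering commonly used by deep-learning frameworks (each group of r*r input channels fills one r-by-r output patch) and a default negative slope of 0.01; the abstract specifies neither, and the function names are illustrative only.

```python
def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) nested list into (C, H*r, W*r).

    Assumption: out[c][h*r + i][w*r + j] = x[c*r*r + i*r + j][h][w].
    """
    in_channels = len(x)
    if in_channels % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    height, width = len(x[0]), len(x[0][0])
    out_channels = in_channels // (r * r)
    out = [[[0] * (width * r) for _ in range(height * r)]
           for _ in range(out_channels)]
    for c in range(out_channels):
        for h in range(height):
            for w in range(width):
                for i in range(r):
                    for j in range(r):
                        src = x[c * r * r + i * r + j][h][w]
                        out[c][h * r + i][w * r + j] = src
    return out


def leaky_relu(value, negative_slope=0.01):
    """Pass non-negative values through; scale negative ones by the slope."""
    return value if value >= 0 else negative_slope * value


if __name__ == "__main__":
    # Four 1x1 channels become one 2x2 channel when r = 2.
    tensor = [[[1]], [[2]], [[3]], [[4]]]
    print(pixel_shuffle(tensor, 2))
    print(leaky_relu(-4.0, negative_slope=0.5))
```

Pixel shuffle trades channels for spatial resolution without interpolation, which is why it is usually preceded by a convolution that produces the extra channels.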
-
Publication No.: US20240291980A1
Publication Date: 2024-08-29
Application No.: US18656413
Filing Date: 2024-05-06
Applicant: TENCENT AMERICA LLC
Inventors: Ding DING, Roman CHERNYAK, Wei JIANG, Wei WANG, Shan LIU
IPC Classification: H04N19/117, H04N19/136, H04N19/176, H04N19/42
CPC Classification: H04N19/117, H04N19/136, H04N19/176, H04N19/42
Abstract: Aspects of the disclosure provide methods and apparatuses for video decoding and encoding. The apparatus includes processing circuitry configured to receive an image/video comprising one or more blocks and metadata for a machine task associated with the image/video. The metadata specifies neural network post-filtering characteristics for machine consumption. The processing circuitry decodes a first post-filtering parameter in the image/video corresponding to the one or more blocks to be reconstructed. The first post-filtering parameter applies to a block in the one or more blocks and has been updated by a post-filtering module in a post-filtering neural network (NN) that is trained based on a training dataset and the metadata. The processing circuitry determines the post-filtering NN in a video decoder corresponding to the one or more blocks based on the first post-filtering parameter, and decodes the block based on the determined post-filtering NN corresponding to the block and the metadata.
-
Publication No.: US20240236346A1
Publication Date: 2024-07-11
Application No.: US18455968
Filing Date: 2023-08-25
Applicant: Tencent America LLC
Inventors: Ding DING, Xiaozhong XU, Shan LIU
IPC Classification: H04N19/436, H04N19/103, H04N19/132, H04N19/61
CPC Classification: H04N19/436, H04N19/103, H04N19/132, H04N19/61
Abstract: Methods and apparatuses for neural network based image compression may be provided. The method may include receiving a compressed input image; generating a first prediction of the input image using a first combination of one or more first convolutional nets, a first activation function, and the compressed input image, wherein the generating includes at least upsampling an output image from the one or more first convolutional nets and performing a tensor transform based on the upsampled output image; and decoding the compressed input image using the generated first prediction.
-
Publication No.: US20230316048A1
Publication Date: 2023-10-05
Application No.: US18122645
Filing Date: 2023-03-16
Applicant: Tencent America LLC
Inventors: Ding DING, Xiaozhong XU, Shan LIU
IPC Classification: G06N3/0455
CPC Classification: G06N3/0455
Abstract: In some examples, an apparatus for image/video processing includes processing circuitry. The processing circuitry determines, from a coded bitstream that carries a compressed image, a value of a parameter for tuning a compression rate of the compressed image. The compressed image is generated by a neural network based encoder according to the value of the parameter. The processing circuitry inputs the value of the parameter to a multi-rate compression domain computer vision task decoder. The multi-rate compression domain computer vision task decoder includes one or more neural networks for performing a computer vision task from compressed images according to corresponding values of the parameter that are used for generating the compressed images. The multi-rate compression domain computer vision task decoder generates a computer vision task result according to the compressed image in the coded bitstream and the value of the parameter.
-
Publication No.: US20230186081A1
Publication Date: 2023-06-15
Application No.: US17952865
Filing Date: 2022-09-26
Applicant: Tencent America LLC
CPC Classification: G06N3/08, G06N3/0454, G06T9/002
Abstract: Iterative content-adaptive online training for end-to-end (E2E) neural image compression (NIC) using a neural network, performed by at least one processor, is provided, including receiving an input image to an E2E NIC framework; fine-tuning the E2E NIC framework based on the input image; computing parameter updates using a first neural network of the fine-tuned E2E NIC framework; enhancing the fine-tuned E2E NIC framework based on a second neural network, the second neural network being a post-enhancement network; and generating an updated E2E NIC framework based on the enhanced E2E NIC framework and the parameter updates.
-
Publication No.: US20220400272A1
Publication Date: 2022-12-15
Application No.: US17825339
Filing Date: 2022-05-26
Applicant: TENCENT AMERICA LLC
IPC Classification: H04N19/42, H04N19/147, H04N19/186, H04N19/30, H04N19/50
Abstract: A method and apparatus for neural network based cross component prediction with scaling factors during encoding or decoding of an image frame or a video sequence, which may include training a deep neural network (DNN) cross component prediction (CCP) model with one or more scaling factors, wherein the one or more scaling factors are learned by optimizing a rate-distortion loss based on an input video sequence comprising a luma component, and reconstructing a chroma component based on the luma component using the trained DNN CCP model with the one or more scaling factors for chroma prediction. The trained DNN CCP model may be updated for chroma prediction of the input video sequence using the one or more scaling factors, and chroma prediction of the input video sequence may be performed using the updated DNN CCP model with the one or more scaling factors.
-
Publication No.: US20220353528A1
Publication Date: 2022-11-03
Application No.: US17729978
Filing Date: 2022-04-26
Applicant: Tencent America LLC
IPC Classification: H04N19/593, H04N19/176, G06N3/08
Abstract: Aspects of the disclosure provide a method, an apparatus, and a non-transitory computer-readable storage medium for video decoding. The apparatus can include processing circuitry. The processing circuitry is configured to decode first neural network update information in a coded bitstream for a first neural network in the video decoder. The first neural network is configured with first pretrained parameters. The first neural network update information corresponds to a first block in an image to be reconstructed and indicates a first replacement parameter corresponding to a first pretrained parameter in the first pretrained parameters. The processing circuitry is configured to update the first neural network in the video decoder based on the first replacement parameter. The processing circuitry can decode the first block based on the updated first neural network for the first block.
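The per-block parameter replacement described in the abstract above can be sketched as a dictionary overlay. This is only an illustration under stated assumptions: parameters are modelled as a name-to-value mapping, the update information is assumed to arrive already decoded as a mapping of replacement values, and the names `apply_replacements`, `conv1.weight`, and `conv1.bias` are hypothetical. The abstract does not say how parameters are addressed or signalled.

```python
def apply_replacements(pretrained, replacements):
    """Return a copy of `pretrained` with the signalled entries replaced.

    Assumption: a replacement may only target a parameter that already
    exists in the pretrained set; anything else is treated as an error.
    The pretrained mapping itself is left untouched, so the next block
    can start again from the original parameters.
    """
    unknown = set(replacements) - set(pretrained)
    if unknown:
        raise KeyError(f"no such pretrained parameter(s): {sorted(unknown)}")
    updated = dict(pretrained)
    updated.update(replacements)
    return updated


if __name__ == "__main__":
    pretrained = {"conv1.weight": 1.5, "conv1.bias": 0.0}
    # Hypothetical update decoded from the bitstream for one block.
    block_update = {"conv1.bias": 0.25}
    print(apply_replacements(pretrained, block_update))
    print(pretrained)
```

Returning a copy rather than mutating in place is a design choice of the sketch, made so that block-specific updates do not leak into other blocks.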