BLOCK-BASED LONG-RANGE CONTEXT MODEL IN NEURAL IMAGE COMPRESSION

    Publication No.: US20240214592A1

    Publication Date: 2024-06-27

    Application No.: US18458507

    Filing Date: 2023-08-30

    Abstract: Methods and apparatuses for decoding a compressed image using a neural image compression network may be provided. The method may include generating long-range context model parameters associated with a high-resolution compressed image, the long-range context model parameters corresponding to a first area. The method may also include splitting the generated long-range context model parameters into a first number of context parameter blocks. The method may also include, for each block in the first number of context parameter blocks, predicting respective context features using a long-range context model and the respective context parameter block, wherein the long-range context model uses a corner-to-center latent decoding strategy or an edge-to-center latent decoding strategy to decode latents associated with the high-resolution compressed image. The high-resolution compressed image may then be reconstructed based on the predicted context features.
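
    The block splitting and corner-to-center ordering described in this abstract can be illustrated with a minimal sketch. It is not the patented method: the function names `split_into_blocks` and `corner_to_center_order` are hypothetical, the context parameters are modeled as a plain 2D list, and "corner-to-center" is assumed here to mean visiting positions in order of increasing Manhattan distance to the nearest block corner.

```python
def split_into_blocks(params, block_h, block_w):
    """Split a 2D list into non-overlapping blocks, scanned in raster order.

    Blocks on the bottom or right edge may be smaller than block_h x block_w.
    """
    rows, cols = len(params), len(params[0])
    blocks = []
    for top in range(0, rows, block_h):
        for left in range(0, cols, block_w):
            block = [row[left:left + block_w] for row in params[top:top + block_h]]
            blocks.append(block)
    return blocks


def corner_to_center_order(h, w):
    """Return the (row, col) positions of an h x w block, corners first.

    Assumption: positions are ranked by Manhattan distance to the nearest
    corner, so the four corners come first and the center comes last.
    """
    def dist_to_nearest_corner(pos):
        r, c = pos
        return min(r, h - 1 - r) + min(c, w - 1 - c)

    positions = [(r, c) for r in range(h) for c in range(w)]
    # sorted() is stable, so positions at equal distance keep raster order.
    return sorted(positions, key=dist_to_nearest_corner)


if __name__ == "__main__":
    grid = [[r * 4 + c for c in range(4)] for r in range(4)]
    for block in split_into_blocks(grid, 2, 2):
        print(block)
    print(corner_to_center_order(3, 3))
```

    An edge-to-center variant could be sketched the same way by ranking positions on their distance to the nearest block edge instead of the nearest corner.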

    LONG-RANGE CONTEXT MODEL IN NEURAL IMAGE COMPRESSION

    Publication No.: US20240212215A1

    Publication Date: 2024-06-27

    Application No.: US18458595

    Filing Date: 2023-08-30

    IPC Classification: G06T9/00 G06T7/13 G06V10/44

    Abstract: Methods and apparatuses for decoding a compressed image using a neural image compression network are provided. The method may include generating context parameters associated with a compressed image, the context parameters corresponding to a first area. The method may also include determining that a long-range global dependency exists between a first latent and a second latent in a long-range global area within the compressed image, and predicting a plurality of context features using a transformer-based long-range context prediction model. The method may then include reconstructing the compressed image based on the predicted plurality of context features.

    CONTENT-ADAPTIVE ONLINE TRAINING METHOD AND APPARATUS FOR DEBLOCKING IN BLOCK-WISE IMAGE COMPRESSION

    Publication No.: US20220405979A1

    Publication Date: 2022-12-22

    Application No.: US17826806

    Filing Date: 2022-05-27

    Abstract: Aspects of the disclosure provide a method, an apparatus, and a non-transitory computer-readable storage medium for video decoding. The apparatus includes processing circuitry that reconstructs blocks of an image that is to be reconstructed from a coded video bitstream. The processing circuitry decodes first deblocking information in the coded video bitstream including a first deblocking parameter of a deep neural network (DNN) in a video decoder. The first deblocking parameter of the DNN is an updated parameter that has been previously determined by a content-adaptive training process. The processing circuitry determines the DNN for a first boundary region comprising a subset of samples in the reconstructed blocks based on the first deblocking parameter included in the first deblocking information. The processing circuitry deblocks the first boundary region comprising the subset of samples in the reconstructed blocks based on the determined DNN corresponding to the first deblocking parameter.

    CONTENT-ADAPTIVE ONLINE TRAINING METHOD AND APPARATUS FOR POST-FILTERING

    Publication No.: US20240291980A1

    Publication Date: 2024-08-29

    Application No.: US18656413

    Filing Date: 2024-05-06

    Abstract: Aspects of the disclosure provide methods and apparatuses for video decoding and encoding. The apparatus includes processing circuitry configured to receive an image/video comprising one or more blocks and metadata for a machine task associated with the image/video. The metadata specifies neural network post-filtering characteristics for machine consumption. The processing circuitry decodes a first post-filtering parameter in the image/video corresponding to the one or more blocks to be reconstructed. The first post-filtering parameter applies to a block in the one or more blocks and has been updated by a post-filtering module in a post-filtering neural network (NN) that is trained based on a training dataset and the metadata. The processing circuitry determines the post-filtering NN in a video decoder corresponding to the one or more blocks based on the first post-filtering parameter, and decodes the block based on the determined post-filtering NN corresponding to the block and the metadata.

    MULTI-RATE COMPUTER VISION TASK NEURAL NETWORKS IN COMPRESSION DOMAIN

    Publication No.: US20230316048A1

    Publication Date: 2023-10-05

    Application No.: US18122645

    Filing Date: 2023-03-16

    IPC Classification: G06N3/0455

    CPC Classification: G06N3/0455

    Abstract: In some examples, an apparatus for image/video processing includes processing circuitry. The processing circuitry determines, from a coded bitstream that carries a compressed image, a value of a parameter for tuning a compression rate of the compressed image. The compressed image is generated by a neural network based encoder according to the value of the parameter. The processing circuitry inputs the value of the parameter to a multi-rate compression domain computer vision task decoder. The multi-rate compression domain computer vision task decoder includes one or more neural networks for performing a computer vision task from compressed images according to corresponding values of the parameter that are used for generating the compressed images. The multi-rate compression domain computer vision task decoder generates a computer vision task result according to the compressed image in the coded bitstream and the value of the parameter.

    CONTENT-ADAPTIVE ONLINE TRAINING FOR DNN-BASED CROSS COMPONENT PREDICTION WITH SCALING FACTORS

    Publication No.: US20220400272A1

    Publication Date: 2022-12-15

    Application No.: US17825339

    Filing Date: 2022-05-26

    Abstract: A method and apparatus are provided for neural network based cross component prediction with scaling factors during encoding or decoding of an image frame or a video sequence. The method may include training a deep neural network (DNN) cross component prediction (CCP) model with at least one or more scaling factors, wherein the at least one or more scaling factors are learned by optimizing a rate-distortion loss based on an input video sequence comprising a luma component, and reconstructing a chroma component based on the luma component using the trained DNN CCP model with the at least one or more scaling factors for chroma prediction. The trained DNN CCP model may be updated for chroma prediction of the input video sequence using the one or more scaling factors, and chroma prediction of the input video sequence may be performed using the updated DNN CCP model with the one or more scaling factors.

    BLOCK-WISE CONTENT-ADAPTIVE ONLINE TRAINING IN NEURAL IMAGE COMPRESSION

    Publication No.: US20220353528A1

    Publication Date: 2022-11-03

    Application No.: US17729978

    Filing Date: 2022-04-26

    Abstract: Aspects of the disclosure provide a method, an apparatus, and a non-transitory computer-readable storage medium for video decoding. The apparatus can include processing circuitry. The processing circuitry is configured to decode first neural network update information in a coded bitstream for a first neural network in the video decoder. The first neural network is configured with first pretrained parameters. The first neural network update information corresponds to a first block in an image to be reconstructed and indicates a first replacement parameter corresponding to a first pretrained parameter in the first pretrained parameters. The processing circuitry is configured to update the first neural network in the video decoder based on the first replacement parameter. The processing circuitry can decode the first block based on the updated first neural network for the first block.
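
    The per-block parameter replacement described in this abstract can be illustrated with a minimal sketch. It is an assumption-laden toy, not the disclosed decoder: the function name `apply_block_update`, the parameter names, and the representation of the network as a dictionary of named scalar parameters are all hypothetical, and bitstream parsing is omitted.

```python
def apply_block_update(pretrained, update):
    """Return a per-block copy of the parameters with replacements applied.

    pretrained: dict mapping parameter name -> pretrained value.
    update:     dict mapping parameter name -> replacement value, standing in
                for the update information decoded for one block (assumption).
    The pretrained dict is left untouched so the next block can start from it.
    """
    unknown = set(update) - set(pretrained)
    if unknown:
        raise KeyError(f"update refers to unknown parameters: {sorted(unknown)}")
    updated = dict(pretrained)   # shallow copy of the pretrained parameters
    updated.update(update)       # overwrite only the signaled parameters
    return updated


if __name__ == "__main__":
    pretrained = {"bias_0": 0.10, "bias_1": -0.20}   # hypothetical values
    block_update = {"bias_1": -0.25}                 # hypothetical update for one block
    print(apply_block_update(pretrained, block_update))
```

    Keeping the pretrained dictionary unmodified reflects one possible design choice in which each block's update is applied relative to the same pretrained baseline; the abstract itself does not state how updates for successive blocks interact.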