NEURAL NETWORK WITH TRANSFORMER BASED VIDEO CODING TOOL

    Publication No.: US20250119556A1

    Publication Date: 2025-04-10

    Application No.: US18889977

    Filing Date: 2024-09-19

    Abstract: A method of processing video data includes receiving a picture; and filtering a current block of the picture, through a neural network and based on local correlations of proximate samples and distant, non-local correlations of non-proximate samples relative to the current block, to generate a filtered current block. The neural network comprises one or more backbone blocks and one or more transformer blocks. Each of the one or more transformer blocks is associated with a backbone block of the one or more backbone blocks. At least one of the backbone blocks is configured to capture the local correlations, relative to the current block and the proximate samples of the current block, and at least one of the transformer blocks is configured to generate features, based on applying an attention mechanism, that capture the distant, non-local correlations, relative to the current block and the non-proximate samples, in the picture for processing.
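
    The split between local and non-local correlations described in this abstract can be illustrated with a minimal, hedged sketch. It is not the claimed network: the 3x3 mean standing in for a backbone block, the scalar self-attention with identity query/key/value projections standing in for a transformer block, and the names `local_features`, `attention_features`, `filter_block`, and `mix` are all assumptions made for illustration.

```python
import math


def local_features(img):
    """Backbone stand-in (assumption): 3x3 mean over proximate samples, edges clamped."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    acc += img[yy][xx]
            out[y][x] = acc / 9.0
    return out


def attention_features(img):
    """Transformer stand-in (assumption): scalar self-attention over every sample.

    Each sample acts as query, key, and value (identity projections), so the
    output at one position can depend on samples anywhere in the picture.
    """
    tokens = [v for row in img for v in row]
    out = []
    for q in tokens:
        scores = [q * k for k in tokens]
        m = max(scores)  # subtract the maximum for a numerically stable softmax
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append(sum(wt * v for wt, v in zip(weights, tokens)) / total)
    w = len(img[0])
    return [out[i:i + w] for i in range(0, len(out), w)]


def filter_block(img, mix=0.5):
    """Blend local and non-local features; `mix` is a hypothetical weight."""
    loc = local_features(img)
    att = attention_features(img)
    return [[(1 - mix) * l + mix * a for l, a in zip(lr, ar)]
            for lr, ar in zip(loc, att)]
```

    On a row such as `[[0.0, 0.0, 0.0, 0.0, 1.0]]`, the local path leaves the first sample at zero because the bright sample lies outside its 3x3 window, while the attention path responds to it. That is the behavior the abstract attributes to the transformer blocks.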

    LOW COMPLEXITY NN-BASED IN LOOP FILTER ARCHITECTURES WITH SEPARABLE CONVOLUTION

    Publication No.: US20240414378A1

    Publication Date: 2024-12-12

    Application No.: US18738842

    Filing Date: 2024-06-10

    Abstract: Example techniques for filtering video data are described. An example device for at least one of encoding or decoding video data includes one or more memories configured to store the video data and one or more processors. The one or more processors are configured to receive a picture of video data and reconstruct the picture of video data. The one or more processors are also configured to apply a neural network (NN)-based filter to the reconstructed picture of video data. The NN-based filter includes a unified filter. The unified filter includes a head block, a transition block, one or more backbone blocks, and a tail block. At least one of the head block, the transition block, the one or more backbone blocks, or the tail block includes a Canonical Polyadic (CP) decomposition with separable convolution.
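
    To give a sense of why a CP decomposition with separable convolution lowers complexity, the sketch below compares weight counts. The abstract does not give a layer layout, so this assumes the common four-factor form (1x1 pointwise, k x 1 per-channel, 1 x k per-channel, 1x1 pointwise); the function names and the channel and rank values are hypothetical, and biases are ignored.

```python
def full_conv_params(c_in, c_out, k):
    """Weights in a dense k x k convolution mapping c_in to c_out channels."""
    return c_in * c_out * k * k


def cp_separable_params(c_in, c_out, k, rank):
    """Weights in an assumed rank-`rank` CP factorization of the same layer."""
    pointwise_in = c_in * rank    # 1x1 convolution: c_in -> rank channels
    vertical = rank * k           # k x 1 convolution, one filter per channel
    horizontal = rank * k         # 1 x k convolution, one filter per channel
    pointwise_out = rank * c_out  # 1x1 convolution: rank -> c_out channels
    return pointwise_in + vertical + horizontal + pointwise_out


if __name__ == "__main__":
    # Hypothetical sizes, not taken from the patent.
    print(full_conv_params(64, 64, 3))         # 36864
    print(cp_separable_params(64, 64, 3, 32))  # 4288
```

    With these illustrative sizes the factorized layer holds roughly one ninth of the weights of the dense 3x3 layer. The actual saving depends on the rank chosen for each block.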

    NN-BASED IN LOOP FILTER ARCHITECTURES WITH SEPARABLE CONVOLUTION AND SWITCHING ORDER OF DECOMPOSITION

    Publication No.: US20240414377A1

    Publication Date: 2024-12-12

    Application No.: US18738612

    Filing Date: 2024-06-10

    Abstract: A device for decoding video data receives a picture of video data; reconstructs a block of the picture of video data to generate a reconstructed block; applies a neural network (NN)-based filter process to the reconstructed block to generate a filtered block, wherein the NN-based filter process includes a first backbone block process followed by a second backbone block process, wherein the first backbone block process comprises a first M×N convolution followed by a first N×M convolution, and the second backbone block process comprises a second N×M convolution followed by a second M×N convolution, wherein N and M are different integer values; determines a decoded block of video data based on the filtered block; and outputs a decoded version of the picture, wherein the decoded version of the picture comprises the decoded block of video data.
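
    The alternating order of decomposition can be sketched as follows, assuming M = 3 and N = 1, single-channel data, zero padding, and all-ones kernels. These are illustrative assumptions only: a real filter would use learned multi-channel weights, and `conv2d`, `V`, `H`, and `nn_filter` are hypothetical names.

```python
def conv2d(img, kernel):
    """Zero-padded, same-size 2-D correlation with an odd-sized kernel."""
    h, w = len(img), len(img[0])
    kh, kw = len(kernel), len(kernel[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    yy, xx = y + i - kh // 2, x + j - kw // 2
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kernel[i][j] * img[yy][xx]
            out[y][x] = acc
    return out


V = [[1.0], [1.0], [1.0]]  # M x N kernel with M = 3, N = 1 (3 rows, 1 column)
H = [[1.0, 1.0, 1.0]]      # N x M kernel (1 row, 3 columns)


def nn_filter(block):
    """Two backbone block processes with the decomposition order switched."""
    x = conv2d(conv2d(block, V), H)  # first block: M x N, then N x M
    x = conv2d(conv2d(x, H), V)      # second block: N x M, then M x N
    return x
```

    In this linear, single-channel sketch the two orders produce the same result, since separable kernels along different axes commute. The channel mixing and activations of an actual network are not modeled here, and with them the order would generally affect the output.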
