-
Publication No.: US12132919B2
Publication date: 2024-10-29
Application No.: US17987844
Filing date: 2022-11-15
Inventors: Yang Yang, Hoang Cong Minh Le, Yinhao Zhu, Reza Pourreza, Amir Said, Yizhe Zhang, Taco Sebastiaan Cohen
IPC classification: H04N19/124, H04N19/119, H04N19/147, H04N19/17, H04N19/436
CPC classification: H04N19/436, H04N19/119, H04N19/124, H04N19/147, H04N19/17
Abstract: A processor-implemented method for image compression using an artificial neural network (ANN) includes receiving, at an encoder of the ANN, an image and a spatial segmentation map corresponding to the image. The spatial segmentation map indicates one or more regions of interest. The encoder compresses the image according to a controllable spatial bit allocation. The controllable spatial bit allocation is based on a learned quantization bin size.
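The spatial bit allocation in this abstract can be illustrated with a minimal sketch, assuming a 2-D latent and a binary region-of-interest mask. The function name `quantize_with_roi` and the fixed bin sizes `fine_bin` and `coarse_bin` are hypothetical; the abstract describes a learned bin size, which is replaced here by constants purely for illustration.

```python
def quantize_with_roi(latent, roi_mask, fine_bin=0.5, coarse_bin=2.0):
    """Quantize a 2-D latent, using a smaller bin inside regions of interest.

    fine_bin and coarse_bin are illustrative constants standing in for the
    learned quantization bin size mentioned in the abstract.
    """
    reconstructed = []
    for row, mask_row in zip(latent, roi_mask):
        out_row = []
        for value, in_roi in zip(row, mask_row):
            # A smaller bin means finer quantization, hence more bits.
            step = fine_bin if in_roi else coarse_bin
            out_row.append(round(value / step) * step)
        reconstructed.append(out_row)
    return reconstructed


if __name__ == "__main__":
    latent = [[0.7, 0.7]]
    roi_mask = [[1, 0]]  # first position is inside a region of interest
    print(quantize_with_roi(latent, roi_mask))
```

The same input value is reconstructed more accurately inside the region of interest, at the cost of a larger symbol alphabet there.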
-
Publication No.: US11490083B2
Publication date: 2022-11-01
Application No.: US17166639
Filing date: 2021-02-03
Inventors: Amir Said, Reza Pourreza
IPC classification: H04N19/124, H04N19/176, G06N20/00, H04N19/186, H04N19/18
Abstract: A video encoder may determine a set of quantization offset parameters for a group of scaled transform coefficients for a block of video data based on side information associated with the block of video data. The video encoder may further quantize the group of scaled transform coefficients for the block of video data to generate quantized transform coefficients for the block of video data based at least in part on the set of quantization offset parameters. The video encoder may further generate an encoded video bitstream based at least in part on the quantized transform coefficients for the block of video data.
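As a rough illustration of quantization with an offset parameter, the sketch below uses a conventional scalar quantizer whose rounding offset is looked up from side information. The table `OFFSETS`, its keys, and its values are assumptions made for this example and are not taken from the patent, which derives the offsets by its own method.

```python
import math

# Hypothetical offsets keyed by side information (here, a prediction-mode label).
OFFSETS = {"intra": 1 / 3, "inter": 1 / 6}


def quantize_block(coeffs, step, side_info):
    """Quantize scaled transform coefficients with a side-information-dependent offset."""
    offset = OFFSETS[side_info]
    levels = []
    for c in coeffs:
        # A larger offset rounds magnitudes up more often.
        magnitude = math.floor(abs(c) / step + offset)
        levels.append(int(math.copysign(magnitude, c)))
    return levels


if __name__ == "__main__":
    block = [7.0, -7.0, 1.0]
    print(quantize_block(block, step=4.0, side_info="intra"))
    print(quantize_block(block, step=4.0, side_info="inter"))
```

With the same step size, the choice of offset alone changes which coefficients round up, which is the lever the abstract describes adjusting.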
-
Publication No.: US11943460B2
Publication date: 2024-03-26
Application No.: US17573568
Filing date: 2022-01-11
Inventors: Yadong Lu, Yang Yang, Yinhao Zhu, Amir Said, Reza Pourreza, Taco Sebastiaan Cohen
IPC classification: H04N19/42, H04N19/124, H04N19/13, H04N19/136, H04N19/30, H04N19/36
CPC classification: H04N19/42, H04N19/124, H04N19/13, H04N19/136, H04N19/30
Abstract: A computer-implemented method for operating an artificial neural network (ANN) includes receiving an input by the ANN. The ANN generates a latent representation of the input. The latent representation is communicated according to a bit rate based on a learned latent scaling parameter. The latent scaling parameter is learned based on a channel index and a tradeoff parameter value that corresponds to a value that balances the bit rate and a distortion.
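One way to picture latent scaling is as a per-channel multiplier applied before rounding and undone after decoding. The sketch below assumes that reading; the table `SCALE_TABLE`, its tradeoff labels, and its values are hypothetical stand-ins for parameters that the abstract says are learned from the channel index and the tradeoff value.

```python
# Hypothetical scaling factors, indexed by tradeoff label and then by channel.
SCALE_TABLE = {
    "low_rate": [1.0, 0.5],
    "high_rate": [4.0, 2.0],
}


def encode_latent(latent, tradeoff):
    """Scale each channel, then round to integer symbols for entropy coding."""
    scales = SCALE_TABLE[tradeoff]
    return [[round(v * s) for v in channel] for channel, s in zip(latent, scales)]


def decode_latent(symbols, tradeoff):
    """Undo the per-channel scaling on the decoder side."""
    scales = SCALE_TABLE[tradeoff]
    return [[q / s for q in channel] for channel, s in zip(symbols, scales)]


if __name__ == "__main__":
    latent = [[1.3, -0.4], [2.6, 0.2]]
    for tradeoff in SCALE_TABLE:
        symbols = encode_latent(latent, tradeoff)
        print(tradeoff, symbols, decode_latent(symbols, tradeoff))
```

Larger scales spread values over more integer symbols, which raises the bit rate and lowers distortion, so a single model can serve several operating points.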
-
Publication No.: US11831909B2
Publication date: 2023-11-28
Application No.: US17198813
Filing date: 2021-03-11
Abstract: Techniques are described for processing video data, such as by performing learned bidirectional coding using a unidirectional coding system and an interpolated reference frame. For example, a process can include obtaining a first reference frame and a second reference frame. The process can include generating a third reference frame at least in part by performing interpolation between the first reference frame and the second reference frame. The process can include performing unidirectional inter-prediction on an input frame based on the third reference frame, such as by estimating motion between an input frame and the third reference frame, and generating a warped frame at least in part by warping one or more pixels of the third reference frame based on the estimated motion. The process can include generating, based on the warped frame and a predicted residual, a reconstructed frame representing the input frame, the reconstructed frame including a bidirectionally-predicted frame.
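The sequence of steps in this abstract (interpolate, estimate motion, warp, add residual) can be sketched on 1-D "frames". This is a classical stand-in under stated assumptions: averaging replaces the learned interpolation, an exhaustive integer-shift search replaces learned motion estimation, and the residual is left unquantized. All function names are hypothetical.

```python
def interpolate(ref_a, ref_b):
    """Stand-in interpolation: per-pixel average of the two reference frames."""
    return [(a + b) / 2 for a, b in zip(ref_a, ref_b)]


def warp(frame, shift):
    """Shift a 1-D frame right by `shift` pixels, clamping at the borders."""
    n = len(frame)
    return [frame[min(max(i - shift, 0), n - 1)] for i in range(n)]


def estimate_shift(target, ref, max_shift=2):
    """Pick the integer shift with the smallest sum of absolute differences."""
    def sad(shift):
        return sum(abs(t - w) for t, w in zip(target, warp(ref, shift)))
    return min(range(-max_shift, max_shift + 1), key=sad)


def code_frame(target, ref_a, ref_b):
    """Unidirectional prediction of `target` from the interpolated reference."""
    ref_mid = interpolate(ref_a, ref_b)
    shift = estimate_shift(target, ref_mid)
    warped = warp(ref_mid, shift)
    residual = [t - w for t, w in zip(target, warped)]  # unquantized in this sketch
    recon = [w + r for w, r in zip(warped, residual)]
    return shift, residual, recon


if __name__ == "__main__":
    past = [0, 0, 2, 0, 0, 0]
    future = [0, 0, 6, 0, 0, 0]
    current = [0, 0, 0, 4, 0, 0]
    print(code_frame(current, past, future))
```

Because the interpolated frame already blends information from both references, a single-reference predictor applied to it yields a frame that depends on both.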
-
Publication No.: US11405626B2
Publication date: 2022-08-02
Application No.: US17091570
Filing date: 2020-11-06
Inventors: Adam Waldemar Golinski, Yang Yang, Reza Pourreza, Guillaume Konrad Sautiere, Ties Jehan Van Rozendaal, Taco Sebastiaan Cohen
IPC classification: H04N19/42, H04N19/137, G06N3/08, H04N19/85, H04N19/172
Abstract: Techniques are described herein for coding video content using recurrent-based machine learning tools. A device can include a neural network system including encoder and decoder portions. The encoder portion can generate output data for the current time step of operation of the neural network system based on an input video frame for a current time step of operation of the neural network system, reconstructed motion estimation data from a previous time step of operation, reconstructed residual data from the previous time step of operation, and recurrent state data from at least one recurrent layer of a decoder portion of the neural network system from the previous time step of operation. A decoder portion of the neural network system can generate, based on the output data and recurrent state data from the previous time step of operation, a reconstructed video frame for the current time step of operation.
-
Publication No.: US11798197B2
Publication date: 2023-10-24
Application No.: US17200694
Filing date: 2021-03-12
Inventors: Hoang Cong Minh Le, Reza Pourreza, Yang Yang, Yinhao Zhu, Amir Said, Yizhe Zhang, Taco Sebastiaan Cohen
CPC classification: G06T9/002, G06N3/08, G06T3/4046
Abstract: A method of image compression includes receiving an image. Multiple quantized latent representations are generated to represent features of the image. Each of the quantized latent representations has a different resolution and is generated at staggered timings. Each of the later generated quantized latent representations is conditioned on each of the prior generated quantized latent representations. The multiple quantized latent representations are decoded to reconstruct the image.
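A coarse-to-fine pyramid gives a simple picture of latents at different resolutions where the later one depends on the earlier one. The sketch below is an assumption-laden stand-in for the learned method: it works on a 1-D signal of even length, uses pair averaging and sample repetition instead of neural networks, and has only two levels. All function names are hypothetical.

```python
def downsample(signal):
    """Halve the resolution by averaging adjacent pairs (even length assumed)."""
    return [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]


def upsample(signal):
    """Double the resolution by repeating each sample."""
    return [v for v in signal for _ in range(2)]


def encode(signal):
    """Produce a coarse latent first, then a fine latent conditioned on it."""
    coarse = [round(v) for v in downsample(signal)]
    prediction = upsample(coarse)
    # The fine latent only carries what the coarse latent failed to predict.
    fine = [round(s - p) for s, p in zip(signal, prediction)]
    return coarse, fine


def decode(coarse, fine):
    """Reconstruct from both latents, coarse level first."""
    return [p + f for p, f in zip(upsample(coarse), fine)]


if __name__ == "__main__":
    coarse, fine = encode([1.0, 3.0, 6.0, 6.0])
    print(coarse, fine, decode(coarse, fine))
```

The conditioning shows up in `encode`: the fine latent is computed relative to a prediction formed from the already-quantized coarse latent, so the decoder can repeat the same prediction.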
-
Publication No.: US11638025B2
Publication date: 2023-04-25
Application No.: US17207244
Filing date: 2021-03-19
Inventors: Reza Pourreza, Amir Said, Yang Yang, Yinhao Zhu, Taco Sebastiaan Cohen
IPC classification: H04N19/51, H04N19/172, H04N19/137, H04N19/107, H04N19/593, G06N3/08
Abstract: Systems and techniques are described for encoding and/or decoding data based on motion estimation that applies variable-scale warping. An encoding device can receive an input frame and a reference frame that depict a scene at different times. The encoding device can generate an optical flow identifying movements in the scene between the two frames. The encoding device can generate a weight map identifying how finely or coarsely the reference frame can be warped for input frame prediction. The encoding device can generate encoded video data based on the optical flow and the weight map. A decoding device can generate a reconstructed optical flow and a reconstructed weight map from the encoded data. A decoding device can generate a prediction frame by warping the reference frame based on the reconstructed optical flow and the reconstructed weight map. The decoding device can generate a reconstructed input frame based on the prediction frame.
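A minimal way to read "variable-scale warping" is as a per-pixel blend between a sharp warped reference and a blurred warped reference, with the weight map choosing between them. The sketch below assumes exactly two scales, 1-D frames, integer flow, and a 3-tap box blur; none of these choices come from the patent, and the function names are hypothetical.

```python
def blur(frame):
    """3-tap box blur with clamped borders, giving the coarse-scale reference."""
    n = len(frame)
    return [
        (frame[max(i - 1, 0)] + frame[i] + frame[min(i + 1, n - 1)]) / 3
        for i in range(n)
    ]


def warp(frame, flow):
    """Fetch each output pixel from position i - flow[i], clamped to the frame."""
    n = len(frame)
    return [frame[min(max(i - flow[i], 0), n - 1)] for i in range(n)]


def predict(ref, flow, weights):
    """Blend fine and coarse warps; weight 1 trusts the sharp reference fully."""
    fine = warp(ref, flow)
    coarse = warp(blur(ref), flow)
    return [w * f + (1 - w) * c for w, f, c in zip(weights, fine, coarse)]


if __name__ == "__main__":
    reference = [0, 3, 0, 0]
    flow = [0, 1, 1, 1]        # most content moves one pixel to the right
    weights = [1, 1, 1, 0]     # last pixel: motion is unreliable, use coarse scale
    print(predict(reference, flow, weights))
```

Where the weight is low, the prediction falls back to a smoothed value, which tends to be a safer guess when the flow cannot be trusted.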
-
Publication No.: US20210243442A1
Publication date: 2021-08-05
Application No.: US17166639
Filing date: 2021-02-03
Inventors: Amir Said, Reza Pourreza
IPC classification: H04N19/124, H04N19/176, H04N19/18, H04N19/186, G06N20/00
Abstract: A video encoder may determine a set of quantization offset parameters for a group of scaled transform coefficients for a block of video data based on side information associated with the block of video data. The video encoder may further quantize the group of scaled transform coefficients for the block of video data to generate quantized transform coefficients for the block of video data based at least in part on the set of quantization offset parameters. The video encoder may further generate an encoded video bitstream based at least in part on the quantized transform coefficients for the block of video data.