-
Publication No.: US20240121398A1
Publication Date: 2024-04-11
Application No.: US18458006
Filing Date: 2023-08-29
Inventors: Noor Fathima Khanum MOHAMED GHOUSE, Jens PETERSEN, Tianlin XU, Guillaume Konrad SAUTIERE, Auke Joris WIGGERS
IPC Classification: H04N19/137, H04N19/147, H04N19/162
CPC Classification: H04N19/137, H04N19/147, H04N19/162
Abstract: Systems and techniques are described for processing image data using a residual model that can be configured with an adjustable number of sampling steps. For example, a process can include obtaining a latent representation of an image and processing, using a decoder of a machine learning model, the latent representation of the image to generate an initial reconstructed image. The process can further include processing, using the residual model, the initial reconstructed image and noise data to predict a plurality of predictions of a residual over a number of sampling steps. The residual represents a difference between the image and the initial reconstructed image. The process can include obtaining, from the plurality of predictions of the residual, a final residual representing the difference between the image and the initial reconstructed image. The process can further include combining the initial reconstructed image and the residual to generate a final reconstructed image.
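The data flow in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: `decode`, `residual_model`, and `reconstruct` are hypothetical names, and both models are replaced by toy arithmetic (the toy residual target of 0.1 times each pixel is an assumption chosen only so the loop has something to converge to). The sketch shows an initial reconstruction, a residual estimate that starts from noise and is refined over an adjustable number of sampling steps, and the final combination.

```python
import random


def decode(latent):
    """Hypothetical decoder stand-in: maps a latent to an initial reconstruction."""
    return [2.0 * z for z in latent]


def residual_model(initial, current):
    """Hypothetical residual model stand-in performing one sampling step.

    Pulls the current residual estimate halfway toward a toy target
    (0.1 * pixel); a real model would be a learned network.
    """
    return [c + 0.5 * (0.1 * x - c) for x, c in zip(initial, current)]


def reconstruct(latent, num_steps, seed=0):
    rng = random.Random(seed)
    initial = decode(latent)
    # Noise data: the starting point for the residual estimate.
    estimate = [rng.gauss(0.0, 1.0) for _ in initial]
    predictions = []
    for _ in range(num_steps):  # adjustable number of sampling steps
        estimate = residual_model(initial, estimate)
        predictions.append(estimate)
    # Take the last prediction as the final residual (zero if no steps ran).
    final_residual = predictions[-1] if predictions else [0.0] * len(initial)
    final_image = [x + r for x, r in zip(initial, final_residual)]
    return initial, predictions, final_image


if __name__ == "__main__":
    initial, predictions, final_image = reconstruct([1.0, 2.0], num_steps=4)
    print(initial, len(predictions), final_image)
```

In this toy, raising `num_steps` moves the final image closer to the target, which mirrors the quality-versus-compute trade-off that an adjustable step count suggests.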
-
Publication No.: US20240364925A1
Publication Date: 2024-10-31
Application No.: US18636126
Filing Date: 2024-04-15
Inventors: Hoang Cong Minh LE, Qiqi HOU, Farzad FARHADZADEH, Amir SAID, Auke Joris WIGGERS, Guillaume Konrad SAUTIERE, Reza POURREZA
IPC Classification: H04N19/597, H04N19/137, H04N19/436
CPC Classification: H04N19/597, H04N19/137, H04N19/436
Abstract: Systems and techniques are described herein for processing video data. For example, a machine-learning based stereo video coding system can obtain video data including at least a right-view image of a right view of a scene and a left-view image of a left view of the scene. The machine-learning based stereo video coding system can compress the right-view image and the left-view image in parallel to generate a latent representation of the right-view image and the left-view image. The right-view image and the left-view image can be compressed in parallel based on inter-view information between the right-view image and the left-view image, determined using one or more parallel autoencoders.
-
Publication No.: US20240323415A1
Publication Date: 2024-09-26
Application No.: US18188070
Filing Date: 2023-03-22
Inventors: David Wilson ROMERO GUZMAN, Gabriele CESA, Guillaume Konrad SAUTIERE, Yunfan ZHANG, Taco Sebastiaan COHEN, Auke Joris WIGGERS
IPC Classification: H04N19/42, G06T3/40, H04N19/182
CPC Classification: H04N19/42, G06T3/4046, H04N19/182
Abstract: Certain aspects of the present disclosure provide techniques and apparatus for encoding content using a neural network. An example method generally includes encoding video content into a latent space representation through an encoder implemented by a first machine learning model. A code is generated by upsampling the latent space representation of the video content. A prior is calculated based on a conditional probability of obtaining the upsampled latent space representation conditioned by the latent space representation of the video content. A compressed version of the video content is generated based on a probabilistic model implemented by a second machine learning model, the generated code, and the calculated prior, and the compressed version of the video content is output for transmission.
-
Publication No.: US20210281867A1
Publication Date: 2021-09-09
Application No.: US17091570
Filing Date: 2020-11-06
Inventors: Adam Waldemar GOLINSKI, Yang YANG, Reza POURREZA, Guillaume Konrad SAUTIERE, Ties Jehan VAN ROZENDAAL, Taco Sebastiaan COHEN
IPC Classification: H04N19/42, H04N19/137, H04N19/172, H04N19/85, G06N3/08
Abstract: Techniques are described herein for coding video content using recurrent-based machine learning tools. A device can include a neural network system including encoder and decoder portions. The encoder portion can generate output data for the current time step of operation of the neural network system based on an input video frame for a current time step of operation of the neural network system, reconstructed motion estimation data from a previous time step of operation, reconstructed residual data from the previous time step of operation, and recurrent state data from at least one recurrent layer of a decoder portion of the neural network system from the previous time step of operation. A decoder portion of the neural network system can generate, based on the output data and recurrent state data from the previous time step of operation, a reconstructed video frame for the current time step of operation.
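The per-time-step data flow in this abstract can be traced with a small Python sketch. It is an illustration under stated assumptions, not the patented system: `encode_step`, `decode_step`, and `code_video` are hypothetical names, the neural networks are replaced by integer arithmetic, and the recurrent state is assumed to be a plain dictionary holding the previous reconstruction and motion. What it does show is the feedback loop: the encoder consumes the current frame plus the previous step's reconstructed motion, reconstructed residual, and decoder state, while the decoder consumes only the encoder output and its own previous state.

```python
def encode_step(frame, prev_motion, prev_residual, prev_state):
    """Toy encoder: codes the error of a motion-extrapolated prediction.

    prev_residual is accepted to mirror the inputs listed in the abstract
    but is not used by this toy arithmetic.
    """
    prediction = [r + m for r, m in zip(prev_state["recon"], prev_motion)]
    return [round(f - p) for f, p in zip(frame, prediction)]


def decode_step(code, prev_state):
    """Toy decoder: uses only the code and its own previous recurrent state."""
    prediction = [r + m for r, m in zip(prev_state["recon"], prev_state["motion"])]
    recon = [p + c for p, c in zip(prediction, code)]
    motion = [n - o for n, o in zip(recon, prev_state["recon"])]
    residual = list(code)
    new_state = {"recon": recon, "motion": motion}
    return recon, motion, residual, new_state


def code_video(frames):
    size = len(frames[0])
    state = {"recon": [0] * size, "motion": [0] * size}
    motion, residual = [0] * size, [0] * size
    codes, recons = [], []
    for frame in frames:
        code = encode_step(frame, motion, residual, state)
        # Decoder outputs are fed back to the encoder at the next time step.
        recon, motion, residual, state = decode_step(code, state)
        codes.append(code)
        recons.append(recon)
    return codes, recons


if __name__ == "__main__":
    codes, recons = code_video([[10, 20], [12, 21], [14, 22]])
    print(codes)
    print(recons)
```

Because the toy encoder and decoder form the same prediction from the same fed-back quantities, integer frames are reconstructed exactly, and the code shrinks once the motion becomes steady.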