DIFFUSION-BASED DATA COMPRESSION
    1.
    Published patent application

    Publication No.: US20240121398A1

    Publication date: 2024-04-11

    Application No.: US18458006

    Filing date: 2023-08-29

    Abstract: Systems and techniques are described for processing image data using a residual model that can be configured with an adjustable number of sampling steps. For example, a process can include obtaining a latent representation of an image and processing, using a decoder of a machine learning model, the latent representation of the image to generate an initial reconstructed image. The process can further include processing, using the residual model, the initial reconstructed image and noise data to generate a plurality of predictions of a residual over a number of sampling steps. The residual represents a difference between the image and the initial reconstructed image. The process can include obtaining, from the plurality of predictions of the residual, a final residual representing the difference between the image and the initial reconstructed image. The process can further include combining the initial reconstructed image and the final residual to generate a final reconstructed image.
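
The decoding flow in this abstract can be sketched roughly as follows. This is a minimal illustration, not the claimed implementation: the names `reconstruct`, `toy_decoder`, and `toy_residual_model` are hypothetical, images are flattened lists of floats, and the choice to take the last prediction as the final residual is an assumption (the abstract only says the final residual is obtained from the plurality of predictions).

```python
import random
from typing import Callable, List, Sequence

Image = List[float]  # flattened image; a stand-in for a real tensor type


def reconstruct(
    latent: Sequence[float],
    decoder: Callable[[Sequence[float]], Image],
    residual_model: Callable[[Image, Image, int], Image],
    num_steps: int,
    seed: int = 0,
) -> Image:
    """Decode a latent, then add a residual predicted over num_steps steps."""
    if num_steps < 1:
        raise ValueError("num_steps must be at least 1")
    rng = random.Random(seed)

    # Decoder output: the initial reconstructed image.
    initial = decoder(latent)

    # Noise data that the residual model refines step by step.
    state = [rng.gauss(0.0, 1.0) for _ in initial]

    # One residual prediction per sampling step, counting down to step 0.
    predictions = []
    for step in reversed(range(num_steps)):
        state = residual_model(initial, state, step)
        predictions.append(state)

    # Assumption: the last prediction serves as the final residual.
    final_residual = predictions[-1]
    return [x + r for x, r in zip(initial, final_residual)]


def toy_decoder(latent: Sequence[float]) -> Image:
    """Hypothetical decoder: the latent is already the image."""
    return [float(v) for v in latent]


def toy_residual_model(initial: Image, state: Image, step: int) -> Image:
    """Hypothetical model: move the state toward a fixed residual of 0.1.

    The move covers 1/(step + 1) of the remaining gap, so step 0 lands on
    the target regardless of how many steps came before.
    """
    target = 0.1
    return [s + (target - s) / (step + 1) for s in state]


if __name__ == "__main__":
    print(reconstruct([1.0, 2.0], toy_decoder, toy_residual_model, num_steps=4))
```

Because `num_steps` is an ordinary argument, the same model can be run with fewer steps for faster decoding or more steps for a more refined residual, which is the adjustable trade-off the abstract describes.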

    VIDEO COMPRESSION USING RECURRENT-BASED MACHINE LEARNING SYSTEMS

    Publication No.: US20210281867A1

    Publication date: 2021-09-09

    Application No.: US17091570

    Filing date: 2020-11-06

    Abstract: Techniques are described herein for coding video content using recurrent-based machine learning tools. A device can include a neural network system including encoder and decoder portions. The encoder portion can generate output data for a current time step of operation of the neural network system based on an input video frame for the current time step, reconstructed motion estimation data from a previous time step of operation, reconstructed residual data from the previous time step of operation, and recurrent state data from at least one recurrent layer of the decoder portion of the neural network system from the previous time step of operation. The decoder portion of the neural network system can generate, based on the output data and the recurrent state data from the previous time step of operation, a reconstructed video frame for the current time step of operation.
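
The per-time-step data flow described in this abstract can be sketched as a loop that carries state from one frame to the next. This is only an illustration under stated assumptions: `CodecState`, `code_video`, `toy_encoder`, and `toy_decoder` are hypothetical names, frames are flattened lists of floats, and having the decoder return the next state is a design choice made here, not something the abstract specifies.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Frame = List[float]  # flattened frame; a stand-in for a real tensor type


@dataclass
class CodecState:
    """Data carried over from the previous time step."""
    motion: Frame     # reconstructed motion estimation data
    residual: Frame   # reconstructed residual data
    recurrent: Frame  # recurrent state from the decoder's recurrent layer


Encoder = Callable[[Frame, Frame, Frame, Frame], Frame]
Decoder = Callable[[Frame, Frame], Tuple[Frame, CodecState]]


def code_video(frames: List[Frame], encoder: Encoder, decoder: Decoder,
               initial_state: CodecState) -> List[Frame]:
    """Run encoder and decoder once per frame, threading state through."""
    state = initial_state
    reconstructed = []
    for frame in frames:
        # Encoder inputs: current frame plus the three previous-step inputs.
        code = encoder(frame, state.motion, state.residual, state.recurrent)
        # Decoder inputs: encoder output plus previous recurrent state.
        # Assumption: the decoder also returns the state for the next step.
        recon, state = decoder(code, state.recurrent)
        reconstructed.append(recon)
    return reconstructed


def toy_encoder(frame: Frame, motion: Frame, residual: Frame,
                recurrent: Frame) -> Frame:
    """Hypothetical encoder: code the difference from the recurrent state."""
    return [f - r for f, r in zip(frame, recurrent)]


def toy_decoder(code: Frame, recurrent: Frame) -> Tuple[Frame, CodecState]:
    """Hypothetical decoder: add the code back onto the recurrent state."""
    recon = [c + r for c, r in zip(code, recurrent)]
    next_state = CodecState(motion=[0.0] * len(recon), residual=code,
                            recurrent=recon)
    return recon, next_state


if __name__ == "__main__":
    zeros = [0.0, 0.0]
    start = CodecState(motion=zeros, residual=zeros, recurrent=zeros)
    print(code_video([[1.0, 2.0], [3.0, 4.0]], toy_encoder, toy_decoder, start))
```

In this toy setup the recurrent state simply holds the previous reconstruction, so the loop is lossless. A learned system would instead quantize and entropy-code the encoder output.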