EFFICIENT SELF-ATTENTION FOR VIDEO PROCESSING

    Publication No.: US20220301311A1

    Publication date: 2022-09-22

    Application No.: US17696797

    Filing date: 2022-03-16

    Abstract: A processor-implemented method for processing a video includes receiving the video as an input at an artificial neural network (ANN). The video includes a sequence of frames. A set of features of a current frame of the video and a prior frame of the video are extracted. The set of features includes a set of support features for a set of pixels of the prior frame to be aligned with a set of reference features of the current frame. A similarity between a support feature for each pixel in the set of pixels of the set of support features of the prior frame and a corresponding reference feature of the current frame is computed. An attention map is generated based on the similarity. An output including a reconstruction of the current frame is generated based on the attention map.
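
Sketch (not part of the patent text): the abstract describes computing a similarity between support features of the prior frame and a reference feature of the current frame, then deriving an attention map from it. A minimal illustration follows, assuming dot-product similarity and a softmax normalization; neither choice is stated in the abstract, and the function names `attention_weights` and `align` are hypothetical.

```python
import math

def attention_weights(reference, supports):
    """Softmax over dot-product similarities between one reference feature
    and a list of support features (all plain lists of floats).
    Dot product and softmax are assumptions made for this sketch."""
    scores = [sum(r * s for r, s in zip(reference, sup)) for sup in supports]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    return [e / total for e in exps]

def align(reference, supports):
    """Blend the support features using the attention weights."""
    weights = attention_weights(reference, supports)
    dim = len(reference)
    return [sum(w * sup[d] for w, sup in zip(weights, supports))
            for d in range(dim)]

# One reference feature (current frame) and three support features (prior frame).
reference = [1.0, 0.0]
supports = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
weights = attention_weights(reference, supports)
aligned = align(reference, supports)
```

Here the support feature most similar to the reference receives the largest weight, so `aligned` is dominated by it. A real system would likely restrict the supports to a local window around each pixel, which is one plausible reading of "efficient" in the title.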

    MODIFYING VIDEO CONTENT
    Document type: Invention application

    Publication No.: US20250166133A1

    Publication date: 2025-05-22

    Application No.: US18596543

    Filing date: 2024-03-05

    Abstract: Systems and techniques are described herein for modifying video data. For instance, a method for modifying video data is provided. The method may include obtaining first tokens based on a first frame of video data, wherein each of the first tokens comprises a feature vector corresponding to a respective location within the first frame of video data; obtaining second tokens based on a second frame of video data, wherein each of the second tokens comprises a feature vector corresponding to a respective location within the second frame of video data; determining a destination token from among the first tokens; determining candidate tokens from among the second tokens based on respective relationships between the candidate tokens and the destination token; merging the candidate tokens with the destination token resulting in modified second tokens; and processing the modified second tokens using a diffusion model.
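
Sketch (not part of the patent text): the token-merging step in the abstract can be illustrated as follows. The abstract does not say how the "respective relationships" are measured or how merging is done, so this sketch assumes cosine similarity with a threshold for candidate selection and simple averaging for the merge. `merge_tokens` and `threshold` are hypothetical names, and the final diffusion-model step is omitted.

```python
import math

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def merge_tokens(destination, second_tokens, threshold=0.9):
    """Return modified second tokens. Each candidate (similarity to the
    destination >= threshold) is replaced by the mean of itself and the
    destination token; other tokens are copied unchanged."""
    modified = []
    for token in second_tokens:
        if cosine(token, destination) >= threshold:
            modified.append([(t + d) / 2 for t, d in zip(token, destination)])
        else:
            modified.append(list(token))
    return modified

first_tokens = [[1.0, 0.0], [0.0, 1.0]]                  # tokens of the first frame
second_tokens = [[0.9, 0.1], [0.0, 1.0], [1.0, 0.0]]    # tokens of the second frame
destination = first_tokens[0]                            # assumed choice of destination
modified = merge_tokens(destination, second_tokens)
```

Tokens in the second frame that closely match the destination are pulled toward it, while unrelated tokens pass through untouched.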

    SKIP CONVOLUTIONS FOR EFFICIENT VIDEO PROCESSING

    Publication No.: US20250119561A1

    Publication date: 2025-04-10

    Application No.: US18984662

    Filing date: 2024-12-17

    Abstract: A method for video processing via an artificial neural network includes receiving a video stream as an input at the artificial neural network. A residual is computed based on a difference between a first feature of a current frame of the video stream and a second feature of a previous frame of the video stream. One or more portions of the current frame of the video stream are processed based on the residual. Additionally, processing is skipped for one or more portions of the current frame of the video based on the residual.
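
Sketch (not part of the patent text): residual-gated skipping, as outlined in the abstract, might look roughly like this. The sketch assumes a frame is already split into portions, uses the largest absolute difference as the residual measure, and skips a portion when that value is at or below a threshold. `process_stream`, `expensive_op`, and `threshold` are hypothetical names.

```python
def process_stream(frames, expensive_op, threshold=0.1):
    """frames: list of frames, each a list of portions (lists of floats).
    Returns per-frame outputs and the number of portions actually processed."""
    outputs, processed = [], 0
    previous, cached = None, None
    for frame in frames:
        current = []
        for i, portion in enumerate(frame):
            if previous is None:
                residual = float("inf")  # first frame: process everything
            else:
                # Residual between this portion and the same portion of the
                # previous frame (max absolute difference, an assumption).
                residual = max(abs(a - b) for a, b in zip(portion, previous[i]))
            if residual > threshold:
                current.append(expensive_op(portion))
                processed += 1
            else:
                current.append(cached[i])  # skip: reuse the earlier result
        outputs.append(current)
        previous, cached = frame, current
    return outputs, processed

frames = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[1.0, 2.0], [3.5, 4.0]],  # only the second portion changed
]
outputs, processed = process_stream(frames, sum)
```

With these two frames, three of the four portions are processed and one is reused. Because each residual is taken against the immediately preceding frame, many small changes could accumulate unnoticed; comparing against the last processed input instead would be one way to bound that.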

    VIDEO PROCESSING USING DELTA DISTILLATION
    Document type: Invention publication

    Publication No.: US20230154169A1

    Publication date: 2023-05-18

    Application No.: US18054274

    Filing date: 2022-11-10

    Abstract: Certain aspects of the present disclosure provide techniques and apparatus for processing video content using an artificial neural network. An example method generally includes receiving a video data stream including at least a first frame and a second frame. First features are extracted from the first frame using a teacher neural network. A difference between the first frame and the second frame is determined. Second features are extracted from at least the difference between the first frame and the second frame using a student neural network. A feature map for the second frame is generated based on a summation of the first features and the second features. An inference is generated for at least the second frame of the video data stream based on the generated feature map for the second frame.
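
Sketch (not part of the patent text): the data flow in the abstract (teacher on the first frame, student on the frame difference, summation of the two) can be shown with toy stand-ins. Both `teacher` and `student` below are hypothetical fixed linear maps rather than neural networks; they are chosen so that the summed feature map can be checked against running the teacher on the second frame directly.

```python
def teacher(frame):
    """Stand-in for an expensive feature extractor (a fixed linear map)."""
    return [2.0 * x for x in frame]

def student(delta):
    """Stand-in for a cheaper network applied to the frame difference.
    Here it mirrors the teacher exactly; a trained student would only
    approximate it."""
    return [2.0 * d for d in delta]

def features_for_second_frame(first_frame, second_frame):
    first_features = teacher(first_frame)
    delta = [b - a for a, b in zip(first_frame, second_frame)]
    second_features = student(delta)
    # Feature map for the second frame: summation of both feature sets.
    return [f + s for f, s in zip(first_features, second_features)]

first_frame = [1.0, 2.0, 3.0]
second_frame = [1.0, 2.5, 3.0]
feature_map = features_for_second_frame(first_frame, second_frame)
```

For a linear extractor the identity f(x2) = f(x1) + f(x2 - x1) holds exactly, which is why the toy result equals `teacher(second_frame)`. For nonlinear networks the student has to be trained to approximate the change in features.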

    PROCESSING VIDEO DATA USING DELTA QUANTIZATION

    Publication No.: US20240169708A1

    Publication date: 2024-05-23

    Application No.: US18338184

    Filing date: 2023-06-20

    CPC classification number: G06V10/776 G06V10/7715 G06V20/46 G06V10/82

    Abstract: Certain aspects of the present disclosure provide techniques and apparatus for delta quantization for video processing and other data streams with temporal content. An example method generally includes receiving image data including at least a first frame and a second frame, generating a first convolutional output based on the first frame using a machine learning model, generating a second convolutional output based on a difference between the first frame and the second frame using one or more quantizers of the machine learning model, generating a third convolutional output associated with the second frame as a combination of the first convolutional output and the second convolutional output, and performing image processing based on the first convolutional output and the third convolutional output.
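
Sketch (not part of the patent text): the three convolutional outputs named in the abstract can be traced on a toy 1D signal. This sketch assumes a uniform quantizer with a fixed step and combines the outputs by addition; the abstract specifies neither. `conv1d`, `quantize`, `delta_quantized_output`, and `step` are hypothetical names.

```python
def conv1d(signal, kernel):
    """Valid-mode 1D correlation over a list of floats."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def quantize(values, step=0.5):
    """Uniform quantizer: round each value to the nearest multiple of step."""
    return [round(v / step) * step for v in values]

def delta_quantized_output(first_frame, second_frame, kernel, step=0.5):
    # First output: convolution of the first frame.
    first_out = conv1d(first_frame, kernel)
    # Second output: convolution of the quantized frame difference.
    delta = quantize([b - a for a, b in zip(first_frame, second_frame)], step)
    second_out = conv1d(delta, kernel)
    # Third output: combination (here, a sum) for the second frame.
    third_out = [x + y for x, y in zip(first_out, second_out)]
    return first_out, third_out

first_frame = [1.0, 2.0, 3.0, 4.0]
second_frame = [1.0, 2.5, 3.0, 3.0]
kernel = [1.0, -1.0]
first_out, third_out = delta_quantized_output(first_frame, second_frame, kernel)
```

The example differences are exact multiples of `step`, so quantization loses nothing and `third_out` matches a direct convolution of the second frame. In general the match is only approximate, with the error governed by the step size.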

CONDITIONAL COMPUTATION FOR CONTINUAL LEARNING

    Publication No.: US20210150345A1

    Publication date: 2021-05-20

    Application No.: US17097811

    Filing date: 2020-11-13

    Abstract: Various aspects provide methods for learning, such as continual learning, that support task-incremental learning using a multi-head classification architecture. Various aspects may enable conditional computing to support multi-head classification. Various aspects provide methods for learning, such as continual learning, that support class-incremental learning using a single-head classification architecture. Various aspects may enable conditional computing to support single-head classification by predicting the task associated with a given test input and selecting an associated classification head based at least in part on the task prediction.
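
Sketch (not part of the patent text): the last sentence of the abstract describes predicting the task for a test input and then selecting the matching classification head. A minimal illustration follows, assuming a nearest-prototype task predictor and toy heads; `predict_task`, `classify`, `prototypes`, and `heads` are all hypothetical and do not come from the patent.

```python
def predict_task(features, prototypes):
    """Hypothetical task predictor: pick the task whose prototype is
    nearest to the input features (squared Euclidean distance)."""
    def dist(task):
        return sum((f - p) ** 2 for f, p in zip(features, prototypes[task]))
    return min(prototypes, key=dist)

def classify(features, prototypes, heads):
    """Predict the task, then run only the head associated with it."""
    task = predict_task(features, prototypes)
    return task, heads[task](features)

prototypes = {"digits": [0.0, 0.0], "letters": [10.0, 10.0]}
heads = {
    "digits": lambda f: "even" if sum(f) < 1.0 else "odd",
    "letters": lambda f: "vowel" if f[0] > f[1] else "consonant",
}
result = classify([9.0, 8.0], prototypes, heads)
```

Only the selected head runs for a given input, which is the conditional part of the computation; the other heads are never evaluated.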
