PROCESSING VIDEO CONTENT USING GATED TRANSFORMER NEURAL NETWORKS

    Publication No.: US20230090941A1

    Publication Date: 2023-03-23

    Application No.: US17933840

    Filing Date: 2022-09-20

    Abstract: Certain aspects of the present disclosure provide techniques and apparatus for processing a video stream using a machine learning model. An example method generally includes generating a first group of tokens from a first frame of the video stream and a second group of tokens from a second frame of the video stream. A first set of tokens associated with features to be reused from the first frame and a second set of tokens associated with features to be computed from the second frame are identified based on a comparison of tokens from the first group of tokens to corresponding tokens in the second group of tokens. A feature output is generated for portions of the second frame corresponding to the second set of tokens. Features associated with the first set of tokens are combined with the generated feature output into a representation of the second frame.
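
The gating step described in the abstract can be illustrated with a minimal sketch. The abstract only states that tokens are "compared"; the L2 distance, the fixed `threshold`, the function name `gated_frame_features`, and the `compute_features` callable (a stand-in for whatever transformer computation produces features) are all assumptions made here for illustration and are not taken from the patent.

```python
import numpy as np

def gated_frame_features(prev_tokens, curr_tokens, prev_features,
                         compute_features, threshold=0.1):
    """Hypothetical sketch: reuse features for unchanged tokens, recompute the rest.

    prev_tokens, curr_tokens: arrays of shape (num_tokens, token_dim)
    prev_features:            array of shape (num_tokens, feature_dim)
    compute_features:         callable mapping (k, token_dim) -> (k, feature_dim)
    """
    # Compare each token of the second frame with the corresponding token
    # of the first frame (assumed metric: L2 distance).
    distance = np.linalg.norm(curr_tokens - prev_tokens, axis=-1)

    # Tokens that changed more than the threshold form the "second set"
    # (features to be computed); the rest form the "first set" (reused).
    recompute = distance > threshold

    # Start from the first frame's features, then overwrite only the
    # positions whose tokens changed.
    combined = prev_features.copy()
    if recompute.any():
        combined[recompute] = compute_features(curr_tokens[recompute])
    return combined, recompute


# Usage: four tokens, only token 2 changes between frames.
prev_tokens = np.zeros((4, 3))
curr_tokens = prev_tokens.copy()
curr_tokens[2] += 1.0
prev_features = np.full((4, 2), 7.0)

combined, recompute = gated_frame_features(
    prev_tokens, curr_tokens, prev_features,
    compute_features=lambda t: np.ones((t.shape[0], 2)),
)
```

In this toy setup the stand-in `compute_features` acts on each token independently, so skipping unchanged tokens is exact. In a real transformer, attention mixes information across tokens, so how reused and recomputed features interact would depend on details the abstract does not give.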
