Pose Empowered RGB-Flow Net
    Type: Invention publication

    Publication No.: US20230419538A1

    Publication date: 2023-12-28

    Application No.: US18464912

    Filing date: 2023-09-11

    Applicant: Google LLC

    Abstract: A method includes receiving video data that includes a series of frames of image data. Here, the video data is representative of an actor performing an activity. The method also includes processing the video data to generate a spatial input stream including a series of spatial images representative of spatial features of the actor performing the activity, a temporal input stream representative of motion of the actor performing the activity, and a pose input stream including a series of images representative of a pose of the actor performing the activity. Using at least one neural network, the method also includes processing the temporal input stream, the spatial input stream, and the pose input stream. The method also includes classifying, by the at least one neural network, the activity based on the temporal input stream, the spatial input stream, and the pose input stream.

    Pose empowered RGB-flow net
    Type: Invention grant

    Publication No.: US11776156B2

    Publication date: 2023-10-03

    Application No.: US17303969

    Filing date: 2021-06-11

    Applicant: Google LLC

    Abstract: A method includes receiving video data that includes a series of frames of image data. Here, the video data is representative of an actor performing an activity. The method also includes processing the video data to generate a spatial input stream including a series of spatial images representative of spatial features of the actor performing the activity, a temporal input stream representative of motion of the actor performing the activity, and a pose input stream including a series of images representative of a pose of the actor performing the activity. Using at least one neural network, the method also includes processing the temporal input stream, the spatial input stream, and the pose input stream. The method also includes classifying, by the at least one neural network, the activity based on the temporal input stream, the spatial input stream, and the pose input stream.

    Pose Empowered RGB-Flow Net
    Type: Invention application

    Publication No.: US20210390733A1

    Publication date: 2021-12-16

    Application No.: US17303969

    Filing date: 2021-06-11

    Applicant: Google LLC

    Abstract: A method includes receiving video data that includes a series of frames of image data. Here, the video data is representative of an actor performing an activity. The method also includes processing the video data to generate a spatial input stream including a series of spatial images representative of spatial features of the actor performing the activity, a temporal input stream representative of motion of the actor performing the activity, and a pose input stream including a series of images representative of a pose of the actor performing the activity. Using at least one neural network, the method also includes processing the temporal input stream, the spatial input stream, and the pose input stream. The method also includes classifying, by the at least one neural network, the activity based on the temporal input stream, the spatial input stream, and the pose input stream.
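
The abstract above describes classifying an activity from three input streams (spatial, temporal, and pose) but does not say how the streams are combined. As a rough illustration only, the sketch below assumes each stream has already produced per-class logits and combines them by averaging softmax probabilities (late fusion). The function names, the label set, and the logit values are all hypothetical and are not taken from the patent.

```python
import math
from typing import Dict, List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_and_classify(stream_logits: Dict[str, Sequence[float]],
                      labels: Sequence[str]) -> str:
    """Average per-stream class probabilities and return the top label.

    Averaging is an assumed fusion rule chosen for simplicity; the patent
    abstract only states that all three streams feed the classification.
    """
    probs = [softmax(logits) for logits in stream_logits.values()]
    n_classes = len(labels)
    fused = [sum(p[i] for p in probs) / len(probs) for i in range(n_classes)]
    best_index = max(range(n_classes), key=fused.__getitem__)
    return labels[best_index]


# Hypothetical per-stream logits for three made-up activity classes.
labels = ["walking", "jumping", "waving"]
logits = {
    "spatial": [2.0, 0.5, 0.1],   # appearance features favor "walking"
    "temporal": [0.3, 1.8, 0.2],  # motion features favor "jumping"
    "pose": [0.2, 2.5, 0.4],      # pose features favor "jumping"
}
prediction = fuse_and_classify(logits, labels)
print(prediction)
```

In this toy input the spatial stream alone would pick a different class than the fused result, which is the usual motivation for combining complementary streams.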

    VIDEO LOCALIZATION USING ARTIFICIAL INTELLIGENCE

    公开(公告)号:US20240371164A1

    公开(公告)日:2024-11-07

    申请号:US18652703

    申请日:2024-05-01

    Applicant: Google LLC

    Abstract: Methods and systems for video localization using artificial intelligence are provided herein. A set of video embeddings representing features of one or more video frames of a media item and a set of textual embeddings corresponding to an event associated with the media item are obtained. Fused video-textual data is generated based on the set of video embeddings and the set of textual embeddings. The fused video-textual data indicates features of the video frames of the media item and textual data pertaining to the media item. The fused video-textual data is provided as an input to an artificial intelligence (AI) model trained to perform multiple video localization tasks with respect to media items of a platform. One or more outputs of the AI model are obtained. A segment of the media item that depicts the event is determined based on the one or more outputs of the AI model.
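
The abstract above outlines a pipeline: fuse video and text embeddings, pass the fused data to a model, and derive a segment from the model outputs. The sketch below is a toy illustration under heavy assumptions: fusion is plain concatenation, the trained AI model is replaced by a dot-product stand-in, and the segment is taken as the longest run of frames whose score meets a threshold. None of these choices, names, or values come from the patent.

```python
from typing import List, Optional, Sequence, Tuple


def fuse(video_embeddings: Sequence[Sequence[float]],
         text_embedding: Sequence[float]) -> List[List[float]]:
    """Append the text embedding to every frame embedding (assumed fusion)."""
    return [list(frame) + list(text_embedding) for frame in video_embeddings]


def score_frame(fused_row: Sequence[float], video_dim: int) -> float:
    """Hypothetical stand-in for the trained model: frame-text dot product."""
    frame, text = fused_row[:video_dim], fused_row[video_dim:]
    return sum(f * t for f, t in zip(frame, text))


def best_segment(scores: Sequence[float],
                 threshold: float) -> Optional[Tuple[int, int]]:
    """Return inclusive (start, end) of the longest run with score >= threshold."""
    best: Optional[Tuple[int, int]] = None
    start: Optional[int] = None
    # A sentinel score below any threshold closes a run that reaches the end.
    for i, s in enumerate(list(scores) + [float("-inf")]):
        if s >= threshold and start is None:
            start = i
        elif s < threshold and start is not None:
            if best is None or (i - start) > (best[1] - best[0] + 1):
                best = (start, i - 1)
            start = None
    return best


# Made-up 2-dimensional embeddings for four frames and one event description.
video_embeddings = [[0.1, 0.0], [0.9, 0.8], [1.0, 0.7], [0.2, 0.1]]
text_embedding = [1.0, 1.0]

fused = fuse(video_embeddings, text_embedding)
scores = [score_frame(row, video_dim=2) for row in fused]
segment = best_segment(scores, threshold=1.0)
print(segment)
```

Here frames 1 and 2 score above the threshold, so the selected segment spans those two frame indices; a real system would map frame indices to timestamps.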
