-
Publication No.: US10049279B2
Publication date: 2018-08-14
Application No.: US15267621
Filing date: 2016-09-16
Abstract: A method of predicting action labels for a video stream includes receiving the video stream and calculating an optical flow of consecutive frames of the video stream. An attention map is generated from the current frame of the video stream and the calculated optical flow. An action label is predicted for the current frame based on the optical flow, a previous hidden state and the attention map.
-
Publication No.: US09830709B2
Publication date: 2017-11-28
Application No.: US15249280
Filing date: 2016-08-26
CPC classification: G06T7/11, G06K9/00335, G06K9/00718, G06N3/0445, G06N3/0454, G06N3/08, G06T7/0081, G06T2207/10004, G06T2207/20084
Abstract: A method of processing data within a convolutional attention recurrent neural network (RNN) includes generating a current multi-dimensional attention map. The current multi-dimensional attention map indicates areas of interest in a first frame from a sequence of spatio-temporal data. The method further includes receiving a multi-dimensional feature map. The method also includes convolving the current multi-dimensional attention map and the multi-dimensional feature map to obtain a multi-dimensional hidden state and a next multi-dimensional attention map. The method identifies a class of interest in the first frame based on the multi-dimensional hidden state and training data.
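The recurrent step in this abstract can be pictured with a small sketch. It is only an illustration under stated assumptions, not the patented network: the attention map gates the feature map elementwise, a 3x3 convolution followed by tanh yields the hidden state, and a second convolution followed by a spatial softmax yields the next attention map. The function names, the kernels, the gating and the nonlinearities are all hypothetical choices that the abstract does not specify.

```python
import math

def conv2d_same(grid, kernel):
    """3x3 'same' cross-correlation with zero padding (pure Python)."""
    h, w = len(grid), len(grid[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    y, x = i + di, j + dj
                    if 0 <= y < h and 0 <= x < w:
                        acc += kernel[di + 1][dj + 1] * grid[y][x]
            out[i][j] = acc
    return out

def spatial_softmax(grid):
    """Normalize a 2D map so that all entries are positive and sum to 1."""
    peak = max(v for row in grid for v in row)
    exps = [[math.exp(v - peak) for v in row] for row in grid]
    total = sum(sum(row) for row in exps)
    return [[e / total for e in row] for row in exps]

def attention_step(attention, features, k_hidden, k_attn):
    """One hypothetical recurrent step: returns (hidden state, next attention map)."""
    # Assumption: the attention map gates the features elementwise.
    attended = [[a * f for a, f in zip(a_row, f_row)]
                for a_row, f_row in zip(attention, features)]
    # Assumption: hidden state = tanh(conv(attended features)).
    hidden = [[math.tanh(v) for v in row]
              for row in conv2d_same(attended, k_hidden)]
    # Assumption: next attention = softmax(conv(hidden state)).
    next_attention = spatial_softmax(conv2d_same(hidden, k_attn))
    return hidden, next_attention

# Identity kernel: passes each input value through unchanged.
IDENTITY = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
```

Calling `attention_step` once per frame, and feeding each returned attention map into the next call, gives the recurrence. A classifier over the hidden state would be needed to identify the class of interest and is left out here.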
-
Publication No.: US12067777B2
Publication date: 2024-08-20
Application No.: US17654986
Filing date: 2022-03-15
Inventors: Hanul Kim, Mihir Jain, Juntae Lee, Sungrack Yun, Fatih Murat Porikli
Abstract: Certain aspects of the present disclosure provide a method of processing video data. In one example, the method includes receiving input video data; sampling a first subset of clips from the input video data; providing the first subset of clips to a first component of a machine learning model to generate first output; sampling a second subset of clips from the input video data, wherein the second subset of clips comprises fewer clips than the first subset of clips; providing the second subset of clips to a second component of the machine learning model to generate a second output; aggregating the first output from the first component of the machine learning model with the second output from the second component of the machine learning model to generate aggregated output; and determining a characteristic of the input video data based on the aggregated output.
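The two-subset sampling scheme in this abstract might look roughly like the following sketch. Everything concrete in it is an assumption made for illustration: frames are reduced to single float values, clips are sampled at evenly spaced positions, `first_component` and `second_component` are hypothetical stand-ins for the two parts of the model, and the aggregation is a plain average of the two outputs.

```python
from statistics import mean

def sample_clips(num_frames, clip_len, num_clips):
    """Return evenly spaced (start, end) frame ranges, end exclusive."""
    if clip_len > num_frames or num_clips < 1:
        raise ValueError("clip_len must fit in the video and num_clips must be >= 1")
    span = num_frames - clip_len
    if num_clips == 1:
        starts = [span // 2]  # a single clip is taken from the middle
    else:
        starts = [round(i * span / (num_clips - 1)) for i in range(num_clips)]
    return [(s, s + clip_len) for s in starts]

def first_component(clip):
    """Hypothetical first component: average frame value of the clip."""
    return mean(clip)

def second_component(clip):
    """Hypothetical second component: peak frame value of the clip."""
    return max(clip)

def aggregate_video_score(frames, clip_len=4, n_first=8, n_second=2):
    """Run each component on its own clip subset and average the two outputs."""
    if n_second >= n_first:
        raise ValueError("the second subset must contain fewer clips than the first")
    first = [first_component(frames[a:b])
             for a, b in sample_clips(len(frames), clip_len, n_first)]
    second = [second_component(frames[a:b])
              for a, b in sample_clips(len(frames), clip_len, n_second)]
    # Assumption: equal-weight aggregation of the two outputs.
    return 0.5 * mean(first) + 0.5 * mean(second)
```

A characteristic of the video could then be determined from the aggregated value, for example by comparing it with a threshold; that final step is left to the caller.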
-
Publication No.: US20220301310A1
Publication date: 2022-09-22
Application No.: US17654986
Filing date: 2022-03-15
Inventors: Hanul Kim, Mihir Jain, Juntae Lee, Sungrack Yun, Fatih Murat Porikli
Abstract: Certain aspects of the present disclosure provide a method of processing video data. In one example, the method includes receiving input video data; sampling a first subset of clips from the input video data; providing the first subset of clips to a first component of a machine learning model to generate first output; sampling a second subset of clips from the input video data, wherein the second subset of clips comprises fewer clips than the first subset of clips; providing the second subset of clips to a second component of the machine learning model to generate a second output; aggregating the first output from the first component of the machine learning model with the second output from the second component of the machine learning model to generate aggregated output; and determining a characteristic of the input video data based on the aggregated output.
-
Publication No.: US11842540B2
Publication date: 2023-12-12
Application No.: US17219460
Filing date: 2021-03-31
Abstract: Systems and techniques are provided for performing holistic video understanding. For example, a process can include obtaining a first video and determining, using a machine learning model decision engine, a first machine learning model from a set of machine learning models to use for processing at least a portion of the first video. The first machine learning model can be determined based on one or more characteristics of at least the portion of the first video. The process can include processing at least the portion of the first video using the first machine learning model.
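As a rough illustration of the model-selection idea in this abstract, the sketch below picks one of two models from a single characteristic of the video. It is a toy under stated assumptions: the decision engine here is a fixed threshold on inter-frame motion rather than a learned model, frames are flat lists of pixel intensities, and `image_model`, `video_model` and `motion_threshold` are hypothetical names.

```python
def motion_score(frames):
    """Mean absolute pixel change between consecutive frames."""
    if len(frames) < 2:
        return 0.0
    diffs = []
    for prev, cur in zip(frames, frames[1:]):
        diffs.append(sum(abs(c - p) for p, c in zip(prev, cur)) / len(cur))
    return sum(diffs) / len(diffs)

def image_model(frames):
    """Hypothetical cheap model for mostly static content."""
    return "static scene"

def video_model(frames):
    """Hypothetical costlier model for content with motion."""
    return "dynamic scene"

MODELS = {"image_model": image_model, "video_model": video_model}

def select_model(frames, motion_threshold=0.1):
    """Stand-in decision engine: choose a model name from the motion level."""
    if motion_score(frames) > motion_threshold:
        return "video_model"
    return "image_model"

def process_video(frames, motion_threshold=0.1):
    """Select a model for this portion of video, then run it."""
    name = select_model(frames, motion_threshold)
    return name, MODELS[name](frames)
```

The same pattern extends to more than two models or to several characteristics; only `select_model` would change.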
-
Publication No.: US11256964B2
Publication date: 2022-02-22
Application No.: US16599078
Filing date: 2019-10-10
Inventors: Kyle Jordan Brown, Mihir Jain, Ahmed Kamel Sadek
Abstract: A method for predicting a future action of agents in a scene includes assigning a fidelity level to agents observed in the scene. The method also includes recursively predicting future actions of the agents by traversing the scene. A different forward prediction model is used at each recursion level. The method further includes controlling an action of an ego agent based on the predicted future actions of the agents.
-
Publication No.: US10776628B2
Publication date: 2020-09-15
Application No.: US16152301
Filing date: 2018-10-04
Abstract: A method for processing a sequence of frames includes receiving a sequence of frames and multiple action proposals for the sequence of frames. The method also includes generating a representation of the sequence of frames and pooling the representation around each of the action proposals. The method further includes classifying the action proposals based on the pooled representations and controlling a device based on the classifying.
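The pool-then-classify flow in this abstract can be sketched as follows. The details are assumptions made for illustration: the representation is one feature vector per frame, each proposal is a temporal `(start, end)` frame range, pooling is a plain average, and the classifier is a hypothetical linear scorer given by `class_weights`. The device-control step is not modeled.

```python
def pool_proposal(features, start, end):
    """Average the per-frame feature vectors inside [start, end)."""
    window = features[start:end]
    if not window:
        raise ValueError("proposal covers no frames")
    dim = len(window[0])
    return [sum(f[d] for f in window) / len(window) for d in range(dim)]

def classify(pooled, class_weights):
    """Return the label whose weight vector has the largest dot product."""
    scores = {
        label: sum(w * x for w, x in zip(weights, pooled))
        for label, weights in class_weights.items()
    }
    return max(scores, key=scores.get)

def classify_proposals(features, proposals, class_weights):
    """Pool the shared representation around each proposal and label it."""
    return [
        classify(pool_proposal(features, start, end), class_weights)
        for start, end in proposals
    ]
```

Because the representation is computed once and only pooled per proposal, the cost of adding more proposals stays small compared with re-encoding the frames for each one.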
-