-
Publication No.: US11663827B2
Publication date: 2023-05-30
Application No.: US17863445
Filing date: 2022-07-13
Applicant: Google LLC
Inventor: Sudheendra Vijayanarasimhan, Alexis Bienvenu, David Ross, Timothy Novikoff, Arvind Balasubramanian
IPC: G06V20/40, G06F3/0484, G06N3/04, G06V20/30
Abstract: A computer-implemented method includes receiving a video that includes multiple frames. The method further includes identifying a start time and an end time of each action in the video based on application of one or more of an audio classifier, an RGB classifier, and a motion classifier. The method further includes identifying video segments from the video that include frames between the start time and the end time for each action in the video. The method further includes generating a confidence score for each of the video segments based on a probability that a corresponding action corresponds to one or more of a set of predetermined actions. The method further includes selecting a subset of the video segments based on the confidence score for each of the video segments.
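Note: the final selection step in this abstract can be illustrated with a minimal sketch. The abstract does not say how the subset is chosen from the confidence scores, so the threshold-plus-top-k rule below is an assumption, and `Segment`, `select_segments`, `min_confidence`, and `max_segments` are hypothetical names, not taken from the patent. The classifiers that produce the start times, end times, and scores are not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """A candidate video segment (hypothetical structure, for illustration)."""
    start: float       # start time of the action, in seconds
    end: float         # end time of the action, in seconds
    action: str        # label of the detected action
    confidence: float  # probability that the action is one of the predetermined actions


def select_segments(segments, min_confidence=0.5, max_segments=3):
    """Keep segments at or above a threshold, then the top-scoring few.

    Assumed selection rule: the abstract only says the subset is chosen
    "based on the confidence score".
    """
    kept = [s for s in segments if s.confidence >= min_confidence]
    kept.sort(key=lambda s: s.confidence, reverse=True)
    top = kept[:max_segments]
    # Return the chosen segments in playback order.
    return sorted(top, key=lambda s: s.start)


candidates = [
    Segment(0.0, 2.5, "jump", 0.91),
    Segment(4.0, 6.0, "wave", 0.32),
    Segment(7.5, 9.0, "blow out candles", 0.78),
]
selected = select_segments(candidates, min_confidence=0.5, max_segments=2)
```

Here the "wave" segment falls below the assumed threshold, so only the other two segments remain.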
-
Publication No.: US20220351516A1
Publication date: 2022-11-03
Application No.: US17863445
Filing date: 2022-07-13
Applicant: Google LLC
Inventor: Sudheendra Vijayanarasimhan, Alexis Bienvenu, David Ross, Timothy Novikoff, Arvind Balasubramanian
IPC: G06V20/40, G06F3/0484, G06N3/04, G06V20/30
Abstract: A computer-implemented method includes receiving a video that includes multiple frames. The method further includes identifying a start time and an end time of each action in the video based on application of one or more of an audio classifier, an RGB classifier, and a motion classifier. The method further includes identifying video segments from the video that include frames between the start time and the end time for each action in the video. The method further includes generating a confidence score for each of the video segments based on a probability that a corresponding action corresponds to one or more of a set of predetermined actions. The method further includes selecting a subset of the video segments based on the confidence score for each of the video segments.
-
Publication No.: US11074454B1
Publication date: 2021-07-27
Application No.: US16410863
Filing date: 2019-05-13
Applicant: Google LLC
Inventor: Sudheendra Vijayanarasimhan, George Dan Toderici, Yue Hei Ng, Matthew John Hausknecht, Oriol Vinyals, Rajat Monga
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for classifying videos using neural networks. One of the methods includes obtaining a temporal sequence of video frames, wherein the temporal sequence comprises a respective video frame from a particular video at each of a plurality of time steps; for each time step of the plurality of time steps: processing the video frame at the time step using a convolutional neural network to generate features of the video frame, and processing the features of the video frame using an LSTM neural network to generate a set of label scores for the time step; and classifying the video as relating to one or more of the topics represented by labels in the set of labels from the label scores for each of the plurality of time steps.
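Note: the last step of this abstract, classifying the video from the label scores of every time step, can be sketched as follows. The abstract does not state how the per-step scores are combined, so averaging followed by a threshold is an assumption; `classify_video` and `threshold` are hypothetical names. The convolutional and LSTM networks are not modeled, and the scores below are made-up inputs standing in for their output.

```python
def classify_video(step_scores, labels, threshold=0.5):
    """Combine per-time-step label scores into video-level labels.

    step_scores: one list of scores per time step, aligned with `labels`.
    Assumed rule: average each label's score over time and keep labels
    whose average reaches `threshold`.
    """
    if not step_scores:
        return []
    n_steps = len(step_scores)
    mean_scores = [
        sum(step[i] for step in step_scores) / n_steps
        for i in range(len(labels))
    ]
    return [label for label, m in zip(labels, mean_scores) if m >= threshold]


labels = ["sports", "music", "cooking"]
# Illustrative scores for three time steps (not real model output).
step_scores = [
    [0.9, 0.2, 0.1],
    [0.7, 0.4, 0.0],
    [0.8, 0.3, 0.2],
]
predicted = classify_video(step_scores, labels)
```

Other aggregation rules, such as taking the score at the final time step or weighting later steps more heavily, would fit the same interface.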
-
Publication No.: US11045949B2
Publication date: 2021-06-29
Application No.: US16823947
Filing date: 2020-03-19
Applicant: Google LLC
Inventor: Sudheendra Vijayanarasimhan, Eric Jang, Peter Pastor Sampedro, Sergey Levine
Abstract: Deep machine learning methods and apparatus related to manipulation of an object by an end effector of a robot. Some implementations relate to training a semantic grasping model to predict a measure that indicates whether motion data for an end effector of a robot will result in a successful grasp of an object; and to predict an additional measure that indicates whether the object has desired semantic feature(s). Some implementations are directed to utilization of the trained semantic grasping model to servo a grasping end effector of a robot to achieve a successful grasp of an object having desired semantic feature(s).
-
Publication No.: US10235428B2
Publication date: 2019-03-19
Application No.: US15195105
Filing date: 2016-06-28
Applicant: Google LLC
Inventor: Balakrishnan Varadarajan, Sudheendra Vijayanarasimhan, Sanketh Shetty, Nisarg Dilipkumar Kothari, Nicholas Delmonico Rizzolo
Abstract: Techniques identify time-sensitive content and present the time-sensitive content to communication devices of users interested or potentially interested in the time-sensitive content. A content management component analyzes video or audio content, and extracts information from the content and determines whether the content is time-sensitive content, such as recent news-related content, based on analysis of the content and extracted information. The content management component evaluates user-related information and the extracted information, and determines whether a user(s) is likely to be interested in the time-sensitive content based on the evaluation results. The content management component sends a notification to the communication device(s) of the user(s) in response to determining the user(s) is likely to be interested in the time-sensitive content.
-
Publication No.: US20180239964A1
Publication date: 2018-08-23
Application No.: US15959858
Filing date: 2018-04-23
Applicant: Google LLC
Inventor: Sanketh Shetty, Tomas Izo, Min-Hsuan Tsai, Sudheendra Vijayanarasimhan, Apostol Natsev, Sami Abu-El-Haija, George Dan Toderici, Susanna Ricco, Balakrishnan Varadarajan, Nicola Muscettola, WeiHsin Gu, Weilong Yang, Nitin Khandelwal, Phuong Le
CPC classification number: G06K9/00718, G06F16/7834, G06K9/00744, G06K9/00751, G06K9/00765, G06K2209/27
Abstract: A computer-implemented method for selecting representative frames for videos is provided. The method includes receiving a video and identifying a set of features for each of the frames of the video. The features include frame-based features and semantic features. The semantic features identify likelihoods of semantic concepts being present as content in the frames of the video. A set of video segments for the video is subsequently generated. Each video segment includes a chronological subset of frames from the video, and each frame is associated with at least one of the semantic features. The method generates a score for each frame of the subset of frames for each video segment based at least on the semantic features, and selects a representative frame for each video segment based on the scores of the frames in the video segment. The representative frame represents and summarizes the video segment.
-
Publication No.: US09953222B2
Publication date: 2018-04-24
Application No.: US14848216
Filing date: 2015-09-08
Applicant: Google LLC
Inventor: Sanketh Shetty, Tomas Izo, Min-Hsuan Tsai, Sudheendra Vijayanarasimhan, Apostol Natsev, Sami Abu-El-Haija, George Dan Toderici, Susanna Ricco, Balakrishnan Varadarajan, Nicola Muscettola, WeiHsin Gu, Weilong Yang, Nitin Khandelwal, Phuong Le
CPC classification number: G06K9/00718, G06F17/30787, G06K9/00744, G06K9/00751, G06K9/00765, G06K2209/27
Abstract: A computer-implemented method for selecting representative frames for videos is provided. The method includes receiving a video and identifying a set of features for each of the frames of the video. The features include frame-based features and semantic features. The semantic features identify likelihoods of semantic concepts being present as content in the frames of the video. A set of video segments for the video is subsequently generated. Each video segment includes a chronological subset of frames from the video, and each frame is associated with at least one of the semantic features. The method generates a score for each frame of the subset of frames for each video segment based at least on the semantic features, and selects a representative frame for each video segment based on the scores of the frames in the video segment. The representative frame represents and summarizes the video segment.
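Note: the per-segment selection in this abstract can be sketched in a few lines. The abstract says only that each frame's score is based at least on its semantic features, so scoring a frame by its strongest semantic likelihood is an assumption, and `frame_score` and `pick_representatives` are hypothetical names. Feature extraction and segmentation are not modeled; the likelihoods below are made-up inputs.

```python
def frame_score(semantic):
    """Score a frame from its semantic features.

    semantic: dict mapping a concept name to the likelihood that the
    concept appears in the frame. Assumed scoring rule: the strongest
    likelihood in the frame.
    """
    return max(semantic.values(), default=0.0)


def pick_representatives(segments):
    """Return the index of the highest-scoring frame in each segment.

    segments: list of segments, each a chronological list of
    (frame_index, semantic_features) pairs. Empty segments are skipped.
    """
    representatives = []
    for frames in segments:
        if not frames:
            continue
        best_index, _ = max(frames, key=lambda f: frame_score(f[1]))
        representatives.append(best_index)
    return representatives


# Two segments of two frames each (illustrative likelihoods).
segments = [
    [(0, {"dog": 0.2, "beach": 0.1}), (1, {"dog": 0.9, "beach": 0.3})],
    [(2, {"dog": 0.1, "beach": 0.6}), (3, {"dog": 0.05, "beach": 0.4})],
]
reps = pick_representatives(segments)
```

A fuller scoring function could also weigh the frame-based features the abstract mentions; they are left out here to keep the sketch short.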