-
Publication No.: US12141199B2
Publication Date: 2024-11-12
Application No.: US17548859
Filing Date: 2021-12-13
Applicant: Google LLC
Inventor: Balakrishnan Varadarajan , George Dan Toderici , Apostol Natsev , Nitin Khandelwal , Sudheendra Vijayanarasimhan , Weilong Yang , Sanketh Shetty
IPC: G06K9/62 , G06F16/78 , G06F16/783 , G06F18/214 , G06F18/22 , G06F18/2413 , G06V20/40 , G06V20/70 , H04N5/265
Abstract: A system and methodology provide for annotating videos with entities and associated probabilities of existence of the entities within video frames. A computer-implemented method identifies an entity from a plurality of entities identifying characteristics of video items. The computer-implemented method selects a set of features correlated with the entity based on a value of a feature of a plurality of features, determines a classifier for the entity using the set of features, and determines an aggregation calibration function for the entity based on the set of features. The computer-implemented method selects a video frame from a video item, where the video frame has associated features, and determines a probability of existence of the entity based on the associated features using the classifier and the aggregation calibration function.
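Illustrative sketch (not taken from the patent): the abstract names four steps, namely feature selection, a per-entity classifier, a calibration function, and a per-frame probability. The Python below walks through those steps under loud assumptions: features are selected by absolute Pearson correlation with the entity label, the "classifier" is a stand-in linear scorer whose weights are those correlations, and the calibration function is a plain sigmoid with hypothetical `slope` and `offset` parameters. All function names are invented for illustration.

```python
import math
from typing import Dict, List, Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_features(rows: List[List[float]], labels: List[float],
                    threshold: float) -> List[int]:
    """Keep feature indices whose correlation with the entity label is strong."""
    num_features = len(rows[0])
    return [f for f in range(num_features)
            if abs(pearson([r[f] for r in rows], labels)) >= threshold]


def train_classifier(rows: List[List[float]], labels: List[float],
                     selected: List[int]) -> Dict[int, float]:
    """Stand-in classifier (assumption): one weight per selected feature."""
    return {f: pearson([r[f] for r in rows], labels) for f in selected}


def raw_score(weights: Dict[int, float], frame_features: Sequence[float]) -> float:
    """Uncalibrated classifier score for a single frame."""
    return sum(w * frame_features[f] for f, w in weights.items())


def calibrate(score: float, slope: float = 1.0, offset: float = 0.0) -> float:
    """Assumed calibration: a sigmoid mapping a raw score into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-(slope * score + offset)))


def entity_probability(weights: Dict[int, float], frame_features: Sequence[float],
                       slope: float = 1.0, offset: float = 0.0) -> float:
    """Probability that the entity is present in a frame with these features."""
    return calibrate(raw_score(weights, frame_features), slope, offset)
```

A caller would run `select_features` and `train_classifier` once per entity on labeled training frames, then call `entity_probability` for each frame of a new video. In practice the calibration parameters would be fitted rather than fixed.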
-
Publication No.: US20220207873A1
Publication Date: 2022-06-30
Application No.: US17548859
Filing Date: 2021-12-13
Applicant: Google LLC
Inventor: Balakrishnan Varadarajan , George Dan Toderici , Apostol Natsev , Nitin Khandelwal , Sudheendra Vijayanarasimhan , Weilong Yang , Sanketh Shetty
Abstract: A system and methodology provide for annotating videos with entities and associated probabilities of existence of the entities within video frames. A computer-implemented method identifies an entity from a plurality of entities identifying characteristics of video items. The computer-implemented method selects a set of features correlated with the entity based on a value of a feature of a plurality of features, determines a classifier for the entity using the set of features, and determines an aggregation calibration function for the entity based on the set of features. The computer-implemented method selects a video frame from a video item, where the video frame has associated features, and determines a probability of existence of the entity based on the associated features using the classifier and the aggregation calibration function.
-
Publication No.: US20180025228A1
Publication Date: 2018-01-25
Application No.: US15722756
Filing Date: 2017-10-02
Applicant: Google LLC
Inventor: Balakrishnan Varadarajan , George Dan Toderici , Apostol Natsev , Nitin Khandelwal , Sudheendra Vijayanarasimhan , Weilong Yang , Sanketh Shetty
CPC classification number: G06K9/00718 , G06F16/783 , G06F16/7867 , G06K9/52 , G06K9/6201 , G06K9/6256 , G06K9/627 , H04N5/265
Abstract: A system and methodology provide for annotating videos with entities and associated probabilities of existence of the entities within video frames. A computer-implemented method identifies an entity from a plurality of entities identifying characteristics of video items. The computer-implemented method selects a set of features correlated with the entity based on a value of a feature of a plurality of features, determines a classifier for the entity using the set of features, and determines an aggregation calibration function for the entity based on the set of features. The computer-implemented method selects a video frame from a video item, where the video frame has associated features, and determines a probability of existence of the entity based on the associated features using the classifier and the aggregation calibration function.
-
Publication No.: US12118036B2
Publication Date: 2024-10-15
Application No.: US17352067
Filing Date: 2021-06-18
Applicant: Google LLC
Inventor: Yi Shen , Xiangrong Chen , Min-hsuan Tsai , Yun Shi , Tianpeng Jin , Zheng Sun , Weilong Yang , Jingbin Wang , Carolyn Au , James Futrell
IPC: G06F16/738 , G06F16/783 , G06F18/22 , G06T7/20 , G06T7/90 , G06V20/40 , G11B27/031
CPC classification number: G06F16/739 , G06F16/7837 , G06F16/785 , G06F16/786 , G06F18/22 , G06T7/20 , G06T7/90 , G06V20/47 , G11B27/031 , G06T2207/10016
Abstract: Systems and methods of automatically extracting summaries of video content are described herein. A data processing system can access, from a video database, a first video content element including a first plurality of frames. The data processing system can select an intervallic subset of the first plurality of frames of the first video content element. The data processing system can calculate, for each of a plurality of further subsets comprising a predetermined number of frames from the intervallic subset, a score for the further subset. The data processing system can identify, from the plurality of further subsets, a further subset having a highest score. The data processing system can select a portion of the first video content element comprising the frames of the further subset having the highest score. The data processing system can generate a second video content element comprising the selected portion of the first video content element.
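Illustrative sketch (not taken from the patent): the abstract describes sampling frames at an interval, scoring every window of a fixed number of sampled frames, and keeping the portion of the video covered by the best window. The Python below is one possible reading, assuming per-frame scores are already available and that a window's score is the sum of its sampled frame scores. The names `best_window`, `summarize`, `step`, and `window` are hypothetical.

```python
from typing import List, Sequence, Tuple


def best_window(frame_scores: Sequence[float], step: int,
                window: int) -> Tuple[int, int]:
    """Return (first, last) original frame indices of the top-scoring window.

    Every `step`-th frame forms the intervallic subset; each run of `window`
    consecutive sampled frames is a candidate "further subset".
    """
    sampled = list(range(0, len(frame_scores), step))
    if len(sampled) < window:
        raise ValueError("not enough sampled frames for one window")

    best_start, best_score = 0, float("-inf")
    for i in range(len(sampled) - window + 1):
        # Assumed scoring rule: sum of the sampled frames' scores.
        score = sum(frame_scores[j] for j in sampled[i:i + window])
        if score > best_score:
            best_start, best_score = i, score

    return sampled[best_start], sampled[best_start + window - 1]


def summarize(frames: Sequence, frame_scores: Sequence[float],
              step: int, window: int) -> List:
    """Build the summary clip: all original frames spanned by the best window."""
    first, last = best_window(frame_scores, step, window)
    return list(frames[first:last + 1])
```

The summary includes the unsampled frames between the first and last sampled frame, so the extracted clip plays back continuously.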
-
Publication No.: US20240212246A1
Publication Date: 2024-06-27
Application No.: US18400629
Filing Date: 2023-12-29
Applicant: Google LLC
Inventor: Tianhao Zhang , Weilong Yang , Honglak Lee , Hung-Yu Tseng , Irfan Aziz Essa , Lu Jiang
Abstract: A method for generating an output image from an input image and an input text instruction that specifies a location and a modification of an edit applied to the input image using a neural network is described. The neural network includes an image encoder, an image decoder, and an instruction attention network. The method includes receiving the input image and the input text instruction; extracting, from the input image, an input image feature that represents features of the input image using the image encoder; generating a spatial feature and a modification feature from the input text instruction using the instruction attention network; generating an edited image feature from the input image feature, the spatial feature and the modification feature; and generating the output image from the edited image feature using the image decoder.
-
Publication No.: US20210312186A1
Publication Date: 2021-10-07
Application No.: US17352067
Filing Date: 2021-06-18
Applicant: Google LLC
Inventor: Yi Shen , Xiangrong Chen , Min-hsuan Tsai , Yun Shi , Tianpeng Jin , Zheng Sun , Weilong Yang , Jingbin Wang
IPC: G06K9/00 , G06T7/90 , G06F16/738 , G06F16/783 , G06K9/62 , G06T7/20 , G11B27/031
Abstract: Systems and methods of automatically extracting summaries of video content are described herein. A data processing system can access, from a video database, a first video content element including a first plurality of frames. The data processing system can select an intervallic subset of the first plurality of frames of the first video content element. The data processing system can calculate, for each of a plurality of further subsets comprising a predetermined number of frames from the intervallic subset, a score for the further subset. The data processing system can identify, from the plurality of further subsets, a further subset having a highest score. The data processing system can select a portion of the first video content element comprising the frames of the further subset having the highest score. The data processing system can generate a second video content element comprising the selected portion of the first video content element.
-
Publication No.: US11562518B2
Publication Date: 2023-01-24
Application No.: US17340671
Filing Date: 2021-06-07
Applicant: Google LLC
Inventor: Tianhao Zhang , Weilong Yang , Honglak Lee , Hung-Yu Tseng , Irfan Aziz Essa , Lu Jiang
Abstract: A method for generating an output image from an input image and an input text instruction that specifies a location and a modification of an edit applied to the input image using a neural network is described. The neural network includes an image encoder, an image decoder, and an instruction attention network. The method includes receiving the input image and the input text instruction; extracting, from the input image, an input image feature that represents features of the input image using the image encoder; generating a spatial feature and a modification feature from the input text instruction using the instruction attention network; generating an edited image feature from the input image feature, the spatial feature and the modification feature; and generating the output image from the edited image feature using the image decoder.
-
Publication No.: US20210383584A1
Publication Date: 2021-12-09
Application No.: US17340671
Filing Date: 2021-06-07
Applicant: Google LLC
Inventor: Tianhao Zhang , Weilong Yang , Honglak Lee , Hung-Yu Tseng , Irfan Aziz Essa , Lu Jiang
Abstract: A method for generating an output image from an input image and an input text instruction that specifies a location and a modification of an edit applied to the input image using a neural network is described. The neural network includes an image encoder, an image decoder, and an instruction attention network. The method includes receiving the input image and the input text instruction; extracting, from the input image, an input image feature that represents features of the input image using the image encoder; generating a spatial feature and a modification feature from the input text instruction using the instruction attention network; generating an edited image feature from the input image feature, the spatial feature and the modification feature; and generating the output image from the edited image feature using the image decoder.
-
Publication No.: US20200082173A1
Publication Date: 2020-03-12
Application No.: US16687118
Filing Date: 2019-11-18
Applicant: Google LLC
Inventor: Balakrishnan Varadarajan , George Dan Toderici , Apostol Natsev , Nitin Khandelwal , Sudheendra Vijayanarasimhan , Weilong Yang , Sanketh Shetty
Abstract: A system and methodology provide for annotating videos with entities and associated probabilities of existence of the entities within video frames. A computer-implemented method identifies an entity from a plurality of entities identifying characteristics of video items. The computer-implemented method selects a set of features correlated with the entity based on a value of a feature of a plurality of features, determines a classifier for the entity using the set of features, and determines an aggregation calibration function for the entity based on the set of features. The computer-implemented method selects a video frame from a video item, where the video frame has associated features, and determines a probability of existence of the entity based on the associated features using the classifier and the aggregation calibration function.
-
Publication No.: US12014542B2
Publication Date: 2024-06-18
Application No.: US17120525
Filing Date: 2020-12-14
Applicant: Google LLC
Inventor: Sanketh Shetty , Tomas Izo , Min-Hsuan Tsai , Sudheendra Vijayanarasimhan , Apostol Natsev , Sami Abu-El-Haija , George Dan Toderici , Susana Ricco , Balakrishnan Varadarajan , Nicola Muscettola , WeiHsin Gu , Weilong Yang , Nitin Khandelwal , Phuong Le
IPC: G06K9/00 , G06F16/783 , G06V20/40
CPC classification number: G06V20/41 , G06F16/7834 , G06V20/46 , G06V20/47 , G06V20/49 , G06V2201/10
Abstract: A computer-implemented method for selecting representative frames for videos is provided. The method includes receiving a video and identifying a set of features for each of the frames of the video. The features include frame-based features and semantic features. The semantic features identify likelihoods of semantic concepts being present as content in the frames of the video. A set of video segments for the video is subsequently generated. Each video segment includes a chronological subset of frames from the video, and each frame is associated with at least one of the semantic features. The method generates a score for each frame of the subset of frames for each video segment based at least on the semantic features, and selects a representative frame for each video segment based on the scores of the frames in the video segment. The representative frame represents and summarizes the video segment.
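Illustrative sketch (not taken from the patent): the abstract describes splitting a video into chronological segments, scoring each frame from its semantic features, and picking the best-scoring frame per segment. The Python below assumes fixed-length segments and a weighted sum of concept likelihoods as the frame score. Both choices, and all names used, are hypothetical simplifications, and the frame-based features mentioned in the abstract are omitted.

```python
from typing import Dict, List


def split_segments(num_frames: int, segment_length: int) -> List[range]:
    """Chronological, non-overlapping segments of frame indices."""
    return [range(start, min(start + segment_length, num_frames))
            for start in range(0, num_frames, segment_length)]


def frame_score(semantic: Dict[str, float], weights: Dict[str, float]) -> float:
    """Assumed scoring rule: weighted sum of semantic-concept likelihoods."""
    return sum(weights.get(concept, 0.0) * likelihood
               for concept, likelihood in semantic.items())


def representative_frames(semantic_features: List[Dict[str, float]],
                          weights: Dict[str, float],
                          segment_length: int) -> List[int]:
    """Return one frame index per segment: the highest-scoring frame."""
    representatives = []
    for segment in split_segments(len(semantic_features), segment_length):
        best = max(segment,
                   key=lambda i: frame_score(semantic_features[i], weights))
        representatives.append(best)
    return representatives
```

Each entry of `semantic_features` maps a concept name to the likelihood that the concept appears in that frame. Ties within a segment resolve to the earliest frame, because `max` returns the first maximal element.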