-
Publication number: US12189921B2
Publication date: 2025-01-07
Application number: US18233823
Filing date: 2023-08-14
Applicant: Google LLC
Inventor: Matthias Grundmann , Jokubas Zukerman , Marco Paglia , Kenneth Conley , Karthik Raveendran , Reed Morse
IPC: G06F3/0482 , G06F3/0485 , G06F3/04883 , G11B27/028 , G11B27/029 , H04N5/262
Abstract: The technology disclosed herein includes a user interface for viewing and combining media items into a video. An example method includes presenting a user interface that displays media items in a first portion of the user interface; receiving user input in the first portion that comprises a selection of a first media item; upon receiving the user input, adding the first media item to a set of selected media items in a second portion of the user interface, and presenting a selectable control element in the second portion of the user interface, wherein the control element enables a user to initiate an operation pertaining to the creation of the video based on the set of selected media items, and creating the video based on video content of the set of selected media items.
-
22.
Publication number: US11694087B2
Publication date: 2023-07-04
Application number: US17947816
Filing date: 2022-09-19
Applicant: Google LLC
Inventor: Valentin Bazarevsky , Yury Kartynnik , Andrei Vakunov , Karthik Raveendran , Matthias Grundmann
IPC: G06N20/10 , G06N3/084 , G06N3/04 , G06N3/08 , G06V40/16 , G06F18/21 , G06V10/764 , G06V10/82 , G06V10/44
CPC classification number: G06N3/084 , G06F18/217 , G06N3/04 , G06N3/08 , G06V10/454 , G06V10/764 , G06V10/82 , G06V40/165 , G06V40/171
Abstract: A computing system is disclosed including a convolutional neural network configured to receive an input that describes a facial image and generate a facial object recognition output that describes one or more facial feature locations with respect to the facial image. The convolutional neural network can include a plurality of convolutional blocks. At least one of the convolutional blocks can include one or more separable convolutional layers configured to apply a depthwise convolution and a pointwise convolution during processing of an input to generate an output. The depthwise convolution can be applied with a kernel size that is greater than 3×3. At least one of the convolutional blocks can include a residual shortcut connection from its input to its output.
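Note: The block described in the abstract above (a depthwise convolution with a kernel larger than 3×3, followed by a pointwise convolution, with a residual shortcut) can be illustrated with a minimal pure-Python sketch. The function names, the H × W × C nested-list layout, zero padding, and the absence of activations or normalization are all assumptions made for illustration; none of them are taken from the patent.

```python
# Illustrative sketch only: names, data layout, and zero padding are
# assumptions, not details from the patent.

def depthwise_conv(x, kernels):
    """Apply one k x k kernel per channel with zero 'same' padding.

    x: H x W x C nested lists; kernels: C kernels, each k x k with k odd.
    """
    h, w, c = len(x), len(x[0]), len(x[0][0])
    k = len(kernels[0])
    r = k // 2
    out = [[[0.0] * c for _ in range(w)] for _ in range(h)]
    for i in range(h):
        for j in range(w):
            for ch in range(c):
                acc = 0.0
                for di in range(k):
                    for dj in range(k):
                        ii, jj = i + di - r, j + dj - r
                        # Positions outside the image contribute zero.
                        if 0 <= ii < h and 0 <= jj < w:
                            acc += x[ii][jj][ch] * kernels[ch][di][dj]
                out[i][j][ch] = acc
    return out


def pointwise_conv(x, weights):
    """1 x 1 convolution: mix channels at each pixel. weights: C_out x C_in."""
    return [[[sum(wt * v for wt, v in zip(row, px)) for row in weights]
             for px in line] for line in x]


def separable_residual_block(x, kernels, weights):
    """Depthwise then pointwise convolution, plus a shortcut from input to output.

    The shortcut requires C_out == C_in so the element-wise sum is defined.
    """
    y = pointwise_conv(depthwise_conv(x, kernels), weights)
    return [[[a + b for a, b in zip(px, py)] for px, py in zip(lx, ly)]
            for lx, ly in zip(x, y)]
```

With a 5×5 kernel, each depthwise output pixel draws on a 25-pixel neighborhood within its own channel, and the pointwise step is the only place where channels are mixed.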
-
23.
Publication number: US20230017459A1
Publication date: 2023-01-19
Application number: US17947816
Filing date: 2022-09-19
Applicant: Google LLC
Inventor: Valentin Bazarevsky , Yury Kartynnik , Andrei Vakunov , Karthik Raveendran , Matthias Grundmann
Abstract: A computing system is disclosed including a convolutional neural network configured to receive an input that describes a facial image and generate a facial object recognition output that describes one or more facial feature locations with respect to the facial image. The convolutional neural network can include a plurality of convolutional blocks. At least one of the convolutional blocks can include one or more separable convolutional layers configured to apply a depthwise convolution and a pointwise convolution during processing of an input to generate an output. The depthwise convolution can be applied with a kernel size that is greater than 3×3. At least one of the convolutional blocks can include a residual shortcut connection from its input to its output.
-
Publication number: US20220270290A1
Publication date: 2022-08-25
Application number: US17745125
Filing date: 2022-05-16
Applicant: Google LLC
Inventor: Jianing Wei , Matthias Grundmann
Abstract: The present disclosure provides systems and methods for calibration-free instant motion tracking useful, for example, for rendering virtual content in augmented reality settings. In particular, a computing system can iteratively augment image frames that depict a scene to insert virtual content at an anchor region within the scene, including situations in which the anchor region moves relative to the scene. To do so, the computing system can estimate, for each of a number of sequential image frames: a rotation of an image capture system that captures the image frames; and a translation of the anchor region relative to an image capture system, thereby providing sufficient information to determine where and at what orientation to render the virtual content within the image frame.
-
Publication number: US11221737B1
Publication date: 2022-01-11
Application number: US16741091
Filing date: 2020-01-13
Applicant: Google LLC
Inventor: Matthias Grundmann , Jokubas Zukerman , Marco Paglia , Kenneth Conley , Karthik Raveendran , Reed Morse
IPC: G06F3/0482 , G06F3/0485 , G11B27/028 , G11B27/029 , G06F3/0488 , H04N5/262
Abstract: The technology disclosed herein includes a user interface for viewing and combining media items into a video. An example method includes presenting a user interface facilitating a creation of a video from a plurality of media items, wherein the user interface displays video content of the first and second media items in a first portion; receiving user input in the first portion of the user interface, wherein the user input comprises a selection of the first media item; updating the user interface to comprise a control element and a second portion, and adding the first media item to a set of selected media items, wherein the second portion displays image content of the set of selected media items and the control element enables a user to initiate the creation of the video; and creating the video based on video content of the set of selected media items.
-
Publication number: US11120835B2
Publication date: 2021-09-14
Application number: US16222437
Filing date: 2018-12-17
Applicant: Google LLC
Inventor: Sharadh Ramaswamy , Matthias Grundmann , Kenneth Conley
IPC: G11B27/034 , G11B27/28 , G11B27/34 , H04N21/237 , H04N21/845 , H04N21/8549
Abstract: A computer-implemented method includes determining interesting moments in a video. The method further includes generating video segments based on the interesting moments, wherein each of the video segments includes at least one of the interesting moments from the video. The method further includes generating a collage from the video segments, where the collage includes at least two windows and wherein each window includes one of the video segments.
-
Publication number: US20200211288A1
Publication date: 2020-07-02
Application number: US16620264
Filing date: 2019-10-07
Applicant: Google LLC
Inventor: Bryan Woods , Jianingwei Wei , Sundeep Vaddadi , Cheng Yang , Konstantine Tsotsos , Keith Schaefer , Leon Wong , Keir Banks Mierle , Matthias Grundmann
IPC: G06T19/00 , G06T19/20 , G06F3/0481
Abstract: In a general aspect, a method can include receiving data defining an augmented reality (AR) environment including a representation of a physical environment, and changing tracking of an AR object within the AR environment between region-tracking mode and plane-tracking mode.
-
Publication number: US20190005334A1
Publication date: 2019-01-03
Application number: US16125045
Filing date: 2018-09-07
Applicant: Google LLC
Inventor: Matthias Grundmann , Alexandra Ivanna Hawkins , Sergey Ioffe
Abstract: Methods, systems, and media for summarizing a video with video thumbnails are provided. In some embodiments, the method comprises: receiving a plurality of video frames corresponding to the video and associated information associated with each of the plurality of video frames; extracting, for each of the plurality of video frames, a plurality of features; generating candidate clips that each includes at least a portion of the received video frames based on the extracted plurality of features and the associated information; calculating, for each candidate clip, a clip score based on the extracted plurality of features from the video frames associated with the candidate clip; calculating, between adjacent candidate clips, a transition score based at least in part on a comparison of video frame features between frames from the adjacent candidate clips; selecting a subset of the candidate clips based at least in part on the clip score and the transition score associated with each of the candidate clips; and automatically generating an animated video thumbnail corresponding to the video that includes a plurality of video frames selected from each of the subset of candidate clips.
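Note: The selection step in the abstract above (choosing a subset of candidate clips from per-clip scores and transition scores) can be sketched as a small dynamic program. This is one possible reading, not the patented method. It assumes a fixed number of clips to select, an additive objective, and a hypothetical `transition[i][j]` table giving the score for cutting from clip `i` to a later clip `j`. The abstract itself only mentions transition scores between adjacent candidate clips.

```python
# Illustrative sketch only: the additive objective, the fixed subset size,
# and the transition[i][j] table are assumptions, not details from the patent.

def select_clips(clip_scores, transition, num_select):
    """Pick num_select clips, in temporal order, maximizing the sum of
    clip scores plus transition scores between consecutively chosen clips.

    Returns (chosen_indices, total_score).
    """
    n = len(clip_scores)
    if not 1 <= num_select <= n:
        raise ValueError("num_select must be between 1 and len(clip_scores)")
    neg_inf = float("-inf")
    # best[m][j]: best total when m + 1 clips are chosen and the last is clip j.
    best = [[neg_inf] * n for _ in range(num_select)]
    back = [[-1] * n for _ in range(num_select)]
    for j in range(n):
        best[0][j] = clip_scores[j]
    for m in range(1, num_select):
        for j in range(m, n):
            for i in range(m - 1, j):
                cand = best[m - 1][i] + transition[i][j] + clip_scores[j]
                if cand > best[m][j]:
                    best[m][j] = cand
                    back[m][j] = i
    last = num_select - 1
    end = max(range(last, n), key=lambda j: best[last][j])
    # Walk the back-pointers to recover the chosen clips in order.
    chosen = [end]
    for m in range(last, 0, -1):
        chosen.append(back[m][chosen[-1]])
    chosen.reverse()
    return chosen, best[last][end]
```

For example, with clip scores `[1.0, 5.0, 2.0, 4.0]` and a strongly negative transition score from clip 1 to clip 3, selecting two clips avoids that cut and returns clips 1 and 2, even though clips 1 and 3 have the highest individual scores.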
-
Publication number: US10074015B1
Publication date: 2018-09-11
Application number: US15098024
Filing date: 2016-04-13
Applicant: Google LLC
Inventor: Matthias Grundmann , Alexandra Ivanna Hawkins , Sergey Ioffe
CPC classification number: G06K9/00765 , G06K9/00751 , G06K9/623
Abstract: Methods, systems, and media for summarizing a video with video thumbnails are provided. In some embodiments, the method comprises: receiving a plurality of video frames corresponding to the video and associated information associated with each of the plurality of video frames; extracting, for each of the plurality of video frames, a plurality of features; generating candidate clips that each includes at least a portion of the received video frames based on the extracted plurality of features and the associated information; calculating, for each candidate clip, a clip score based on the extracted plurality of features from the video frames associated with the candidate clip; calculating, between adjacent candidate clips, a transition score based at least in part on a comparison of video frame features between frames from the adjacent candidate clips; selecting a subset of the candidate clips based at least in part on the clip score and the transition score associated with each of the candidate clips; and automatically generating an animated video thumbnail corresponding to the video that includes a plurality of video frames selected from each of the subset of candidate clips.
-
Publication number: US20240412334A1
Publication date: 2024-12-12
Application number: US18735050
Filing date: 2024-06-05
Applicant: Google LLC
Inventor: Raman Sarokin , Yu-Hui Chen , Juhyun Lee , Jiuqiang Tang , Chuo-Ling Chang , Andrei Kulik , Matthias Grundmann
Abstract: Systems, methods, devices, and related techniques for accelerating execution of diffusion models or of other neural networks that involve similar operations. Some aspects include accelerating inference computations in neural networks, including inference computations utilized in denoising (also referred to as “diffusion”) neural networks.
-