-
Publication No.: US20250138704A1
Publication Date: 2025-05-01
Application No.: US19011541
Filing Date: 2025-01-06
Applicant: Google LLC
Inventor: Matthias Grundmann , Jokubas Zukerman , Marco Paglia , Kenneth Conley , Karthik Raveendran , Reed Morse
IPC: G06F3/0482 , G06F3/0485 , G06F3/04883 , G11B27/028 , G11B27/029 , H04N5/262
Abstract: An example method includes presenting a user interface facilitating a creation of a video from an image associated with a first media item of a plurality of media items, wherein the first media item comprises the image and a video clip that are captured concurrently, receiving user input via the user interface, wherein the user input comprises a selection of a selectable control element presented in the user interface, and upon receiving the user input, presenting the video clip of the first media item in the user interface, wherein the video clip of the first media item is played in the user interface and comprises video content from before and after the image is captured.
-
Publication No.: US20230384911A1
Publication Date: 2023-11-30
Application No.: US18233823
Filing Date: 2023-08-14
Applicant: Google LLC
Inventor: Matthias Grundmann , Jokubas Zukerman , Marco Paglia , Kenneth Conley , Karthik Raveendran , Reed Morse
IPC: G06F3/0482 , G11B27/029 , G06F3/0485 , G06F3/04883 , H04N5/262 , G11B27/028
CPC classification number: G06F3/0482 , G11B27/029 , G06F3/0485 , G06F3/04883 , H04N5/2628 , G11B27/028
Abstract: The technology disclosed herein includes a user interface for viewing and combining media items into a video. An example method includes presenting a user interface that displays media items in a first portion of the user interface; receiving user input in the first portion that comprises a selection of a first media item; upon receiving the user input, adding the first media item to a set of selected media items in a second portion of the user interface, and presenting a selectable control element in the second portion of the user interface, wherein the control element enables a user to initiate an operation pertaining to the creation of the video based on the set of selected media items, and creating the video based on video content of the set of selected media items.
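The selection flow in the abstract above can be pictured as a small state model. The following is only a minimal sketch under stated assumptions: the names `MediaItem`, `VideoComposer`, `select`, `control_visible`, and `create_video` are hypothetical and do not come from the patent, and frames are stood in for by plain strings.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MediaItem:
    """Hypothetical media item: an id plus placeholder frame labels."""
    item_id: str
    clip_frames: List[str]


@dataclass
class VideoComposer:
    """Hypothetical model of the two UI portions described in the abstract."""
    library: List[MediaItem]                                   # first portion
    selected: List[MediaItem] = field(default_factory=list)    # second portion

    def select(self, item_id: str) -> None:
        # User input in the first portion adds the item to the selected set.
        item = next(i for i in self.library if i.item_id == item_id)
        if item not in self.selected:
            self.selected.append(item)

    @property
    def control_visible(self) -> bool:
        # Assumption: the control element appears once something is selected.
        return bool(self.selected)

    def create_video(self) -> List[str]:
        # Assumption: the video is the selected clips joined in selection order.
        return [f for item in self.selected for f in item.clip_frames]
```

In this sketch, calling `select` twice with different ids and then `create_video` returns the frames of both clips in the order they were selected.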
-
Publication No.: US11783496B2
Publication Date: 2023-10-10
Application No.: US17527463
Filing Date: 2021-11-16
Applicant: Google LLC
Inventor: Valentin Bazarevsky , Fan Zhang , Andrei Vakunov , Andrei Tkachenka , Matthias Grundmann
CPC classification number: G06T7/251 , G06T7/75 , G06V40/28 , G06T2207/20081 , G06T2207/30196
Abstract: Example aspects of the present disclosure are directed to computing systems and methods for hand tracking using a machine-learned system for palm detection and key-point localization of hand landmarks. In particular, example aspects of the present disclosure are directed to a multi-model hand tracking system that performs both palm detection and hand landmark detection. Given a sequence of image frames, for example, the hand tracking system can detect one or more palms depicted in each image frame. For each palm detected within an image frame, the machine-learned system can determine a plurality of hand landmark positions of a hand associated with the palm. The system can perform key-point localization to determine precise three-dimensional coordinates for the hand landmark positions. In this manner, the machine-learned system can accurately track a hand depicted in the sequence of images using the precise three-dimensional coordinates for the hand landmark positions.
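The two-stage structure described in the abstract above (palm detection per frame, then landmark localization per detected palm) can be sketched as follows. This is an illustrative outline only, not the patented implementation: `track_hands`, `fake_palm_detector`, and `fake_landmark_model` are hypothetical names, and the two stub models stand in for the machine-learned components.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[float, float, float, float]   # assumed layout: x, y, width, height
Point3D = Tuple[float, float, float]      # assumed layout: x, y, z


def track_hands(
    frames: Sequence[object],
    detect_palms: Callable[[object], List[Box]],
    locate_landmarks: Callable[[object, Box], List[Point3D]],
) -> List[List[List[Point3D]]]:
    """For each frame, detect palms, then localize 3D landmarks per palm."""
    results = []
    for frame in frames:
        hands = [locate_landmarks(frame, box) for box in detect_palms(frame)]
        results.append(hands)
    return results


def fake_palm_detector(frame: object) -> List[Box]:
    # Stub: always reports one palm in a fixed region of the frame.
    return [(0.25, 0.25, 0.5, 0.5)]


def fake_landmark_model(frame: object, box: Box) -> List[Point3D]:
    # Stub: returns a single landmark at the box centre with depth 0.
    x, y, w, h = box
    return [(x + w / 2, y + h / 2, 0.0)]
```

Passing the detector and landmark model in as callables keeps the pipeline independent of any particular model; a real system would substitute trained networks for the stubs.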
-
Publication No.: US11182909B2
Publication Date: 2021-11-23
Application No.: US16709128
Filing Date: 2019-12-10
Applicant: Google LLC
Inventor: Valentin Bazarevsky , Fan Zhang , Andrei Vakunov , Andrei Tkachenka , Matthias Grundmann
Abstract: Example aspects of the present disclosure are directed to computing systems and methods for hand tracking using a machine-learned system for palm detection and key-point localization of hand landmarks. In particular, example aspects of the present disclosure are directed to a multi-model hand tracking system that performs both palm detection and hand landmark detection. Given a sequence of image frames, for example, the hand tracking system can detect one or more palms depicted in each image frame. For each palm detected within an image frame, the machine-learned system can determine a plurality of hand landmark positions of a hand associated with the palm. The system can perform key-point localization to determine precise three-dimensional coordinates for the hand landmark positions. In this manner, the machine-learned system can accurately track a hand depicted in the sequence of images using the precise three-dimensional coordinates for the hand landmark positions.
-
Publication No.: US20200250852A1
Publication Date: 2020-08-06
Application No.: US16717603
Filing Date: 2019-12-17
Applicant: Google LLC
Inventor: Jianing Wei , Matthias Grundmann
Abstract: The present disclosure provides systems and methods for calibration-free instant motion tracking useful, for example, for rendering virtual content in augmented reality settings. In particular, a computing system can iteratively augment image frames that depict a scene to insert virtual content at an anchor region within the scene, including situations in which the anchor region moves relative to the scene. To do so, the computing system can estimate, for each of a number of sequential image frames: a rotation of an image capture system that captures the image frames; and a translation of the anchor region relative to an image capture system, thereby providing sufficient information to determine where and at what orientation to render the virtual content within the image frame.
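To illustrate why a camera rotation plus an anchor translation is enough to place virtual content, the sketch below positions one vertex of a virtual object in camera coordinates and projects it to the image. Everything here is an assumption made for illustration: the function names are hypothetical, the rotation is taken to map camera coordinates to scene coordinates (so its transpose is applied to the object), and the pinhole projection with a chosen focal length is a stand-in, not something the abstract specifies.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
Mat3 = List[List[float]]


def transpose(m: Mat3) -> Mat3:
    """Transpose of a 3x3 matrix (the inverse, for a rotation matrix)."""
    return [[m[r][c] for r in range(3)] for c in range(3)]


def apply(m: Mat3, v: Vec3) -> Vec3:
    """Matrix-vector product for a 3x3 matrix."""
    return tuple(sum(m[r][c] * v[c] for c in range(3)) for r in range(3))


def place_vertex(camera_rotation: Mat3, anchor_translation: Vec3, vertex: Vec3) -> Vec3:
    """Orient a model vertex by the inverse camera rotation, then move it
    to the anchor position expressed relative to the camera."""
    rotated = apply(transpose(camera_rotation), vertex)
    return tuple(rotated[i] + anchor_translation[i] for i in range(3))


def project(point: Vec3, focal_length: float) -> Tuple[float, float]:
    """Assumed pinhole projection with the principal point at the origin."""
    x, y, z = point
    return (focal_length * x / z, focal_length * y / z)
```

With an identity rotation and the anchor two units in front of the camera, a vertex one unit to the right of the anchor lands at `(1.0, 0.0, 2.0)` in camera coordinates; repeating this per frame with updated estimates keeps the content attached to the anchor region.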
-
Publication No.: US12272096B2
Publication Date: 2025-04-08
Application No.: US18335614
Filing Date: 2023-06-15
Applicant: Google LLC
Inventor: Jianing Wei , Matthias Grundmann
Abstract: The present disclosure provides systems and methods for calibration-free instant motion tracking useful, for example, for rendering virtual content in augmented reality settings. In particular, a computing system can iteratively augment image frames that depict a scene to insert virtual content at an anchor region within the scene, including situations in which the anchor region moves relative to the scene. To do so, the computing system can estimate, for each of a number of sequential image frames: a rotation of an image capture system that captures the image frames; and a translation of the anchor region relative to an image capture system, thereby providing sufficient information to determine where and at what orientation to render the virtual content within the image frame.
-
Publication No.: US11487407B1
Publication Date: 2022-11-01
Application No.: US17536350
Filing Date: 2021-11-29
Applicant: Google LLC
Inventor: Matthias Grundmann , Jokubas Zukerman , Marco Paglia , Kenneth Conley , Karthik Raveendran , Reed Morse
IPC: G06F3/0482 , G06F3/04883 , H04N5/262 , G06F3/0485 , G11B27/028 , G11B27/029
Abstract: The technology disclosed herein includes a user interface for viewing and combining media items into a video. An example method includes presenting a user interface that displays media items in a first portion of the user interface; receiving user input in the first portion that comprises a selection of a first media item; upon receiving the user input, adding the first media item to a set of selected media items and updating the user interface to comprise a control element and a second portion, wherein the first and second portions are concurrently displayed and are each scrollable along a different axis, and the second portion displays image content of the set and the control element enables a user to initiate the creation of the video based on the set of selected media items; and creating the video based on video content of the set of selected media items.
-
Publication No.: US11449714B2
Publication Date: 2022-09-20
Application No.: US16668303
Filing Date: 2019-10-30
Applicant: Google LLC
Inventor: Valentin Bazarevsky , Yury Kartynnik , Andrei Vakunov , Karthik Raveendran , Matthias Grundmann
Abstract: A computing system is disclosed including a convolutional neural network configured to receive an input that describes a facial image and generate a facial object recognition output that describes one or more facial feature locations with respect to the facial image. The convolutional neural network can include a plurality of convolutional blocks. At least one of the convolutional blocks can include one or more separable convolutional layers configured to apply a depthwise convolution and a pointwise convolution during processing of an input to generate an output. The depthwise convolution can be applied with a kernel size that is greater than 3×3. At least one of the convolutional blocks can include a residual shortcut connection from its input to its output.
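The block structure named in the abstract above (a depthwise convolution with a kernel larger than 3×3, a pointwise convolution, and a residual shortcut) can be sketched in plain Python on nested lists. This is a teaching sketch under assumptions, not the patented network: the function names are hypothetical, padding is zero "same" padding with stride 1, kernels are applied without flipping (as deep learning frameworks usually do), and activations and normalization are omitted.

```python
from typing import List

Tensor = List[List[List[float]]]  # layout: [channel][row][column]


def depthwise_conv(x: Tensor, kernels: Tensor) -> Tensor:
    """Apply one k x k kernel per channel, zero 'same' padding, stride 1."""
    out = []
    for channel, kernel in zip(x, kernels):
        h, w, k = len(channel), len(channel[0]), len(kernel)
        pad = k // 2
        plane = [[0.0] * w for _ in range(h)]
        for i in range(h):
            for j in range(w):
                acc = 0.0
                for di in range(k):
                    for dj in range(k):
                        r, c = i + di - pad, j + dj - pad
                        if 0 <= r < h and 0 <= c < w:  # outside counts as zero
                            acc += channel[r][c] * kernel[di][dj]
                plane[i][j] = acc
        out.append(plane)
    return out


def pointwise_conv(x: Tensor, weights: List[List[float]]) -> Tensor:
    """1 x 1 convolution: mix channels with weights[out_channel][in_channel]."""
    h, w = len(x[0]), len(x[0][0])
    return [
        [[sum(wt[c] * x[c][i][j] for c in range(len(x))) for j in range(w)]
         for i in range(h)]
        for wt in weights
    ]


def residual_separable_block(x: Tensor, dw_kernels: Tensor,
                             pw_weights: List[List[float]]) -> Tensor:
    """Depthwise then pointwise convolution, plus a shortcut from the input.
    Assumes the pointwise layer keeps the channel count unchanged."""
    y = pointwise_conv(depthwise_conv(x, dw_kernels), pw_weights)
    return [
        [[y[c][i][j] + x[c][i][j] for j in range(len(x[c][i]))]
         for i in range(len(x[c]))]
        for c in range(len(x))
    ]
```

Splitting the layer this way costs roughly k·k weights per channel for the depthwise step plus C_in·C_out for the pointwise step, which is why enlarging the depthwise kernel to 5×5 is comparatively cheap.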
-
Publication No.: US11341676B2
Publication Date: 2022-05-24
Application No.: US16717603
Filing Date: 2019-12-17
Applicant: Google LLC
Inventor: Jianing Wei , Matthias Grundmann
Abstract: The present disclosure provides systems and methods for calibration-free instant motion tracking useful, for example, for rendering virtual content in augmented reality settings. In particular, a computing system can iteratively augment image frames that depict a scene to insert virtual content at an anchor region within the scene, including situations in which the anchor region moves relative to the scene. To do so, the computing system can estimate, for each of a number of sequential image frames: a rotation of an image capture system that captures the image frames; and a translation of the anchor region relative to an image capture system, thereby providing sufficient information to determine where and at what orientation to render the virtual content within the image frame.
-
Publication No.: US20210174519A1
Publication Date: 2021-06-10
Application No.: US16709128
Filing Date: 2019-12-10
Applicant: Google LLC
Inventor: Valentin Bazarevsky , Fan Zhang , Andrei Vakunov , Andrei Tkachenka , Matthias Grundmann
Abstract: Example aspects of the present disclosure are directed to computing systems and methods for hand tracking using a machine-learned system for palm detection and key-point localization of hand landmarks. In particular, example aspects of the present disclosure are directed to a multi-model hand tracking system that performs both palm detection and hand landmark detection. Given a sequence of image frames, for example, the hand tracking system can detect one or more palms depicted in each image frame. For each palm detected within an image frame, the machine-learned system can determine a plurality of hand landmark positions of a hand associated with the palm. The system can perform key-point localization to determine precise three-dimensional coordinates for the hand landmark positions. In this manner, the machine-learned system can accurately track a hand depicted in the sequence of images using the precise three-dimensional coordinates for the hand landmark positions.