Image Generation with Minimal Denoising Diffusion Steps

    Publication No.: US20250157008A1

    Publication Date: 2025-05-15

    Application No.: US18949522

    Filing Date: 2024-11-15

    Applicant: Google LLC

    Abstract: Provided is a one-step text-to-image generative model, which represents a fusion of GAN and diffusion model elements. In particular, despite the promising outcomes of prior diffusion GAN hybrid models, achieving one-step sampling and extending their utility to text-to-image generation remains a complex challenge. The present disclosure provides a number of innovative techniques to enhance diffusion GAN models, resulting in an ultra-fast text-to-image model capable of producing high-quality images in a single sampling step.

    Object Pose Estimation and Tracking Using Machine Learning

    Publication No.: US20220191542A1

    Publication Date: 2022-06-16

    Application No.: US17122292

    Filing Date: 2020-12-15

    Applicant: Google LLC

    Abstract: A method includes receiving a video comprising images representing an object, and determining, using a machine learning model, based on a first image of the images, and for each respective vertex of vertices of a bounding volume for the object, first two-dimensional (2D) coordinates of the respective vertex. The method also includes tracking, from the first image to a second image of the images, a position of each respective vertex along a plane underlying the bounding volume, and determining, for each respective vertex, second 2D coordinates of the respective vertex based on the position of the respective vertex along the plane. The method further includes determining, for each respective vertex, (i) first three-dimensional (3D) coordinates of the respective vertex based on the first 2D coordinates and (ii) second 3D coordinates of the respective vertex based on the second 2D coordinates.

    RESOURCE-EFFICIENT DIFFUSION MODELS

    Publication No.: US20250165756A1

    Publication Date: 2025-05-22

    Application No.: US18949875

    Filing Date: 2024-11-15

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating a data item by performing a single-step denoising process using a diffusion model neural network. For example, the data items can be images, videos, audio waveforms, sensor outputs, and so on.

    Real-time pose estimation for unseen objects

    Publication No.: US11436755B2

    Publication Date: 2022-09-06

    Application No.: US16988683

    Filing Date: 2020-08-09

    Applicant: Google LLC

    Abstract: Example embodiments allow for fast, efficient determination of bounding box vertices or other pose information for objects based on images of a scene that may contain the objects. An artificial neural network or other machine learning algorithm is used to generate, from an input image, a heat map and a number of pairs of displacement maps. The location of a peak within the heat map is then used to extract, from the displacement maps, the two-dimensional displacement, from the location of the peak within the image, of vertices of a bounding box that contains the object. This bounding box can then be used to determine the pose of the object within the scene. The artificial neural network can be configured to generate intermediate segmentation maps, coordinate maps, or other information about the shape of the object so as to improve the estimated bounding box.
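
The decoding step described in this abstract (find the heat-map peak, then read each vertex's 2D displacement at that location) can be illustrated with a minimal sketch. This is not the patented implementation: the function name `decode_box_vertices`, the list-of-lists map layout, and the convention that each pair is an (x-displacement map, y-displacement map) for one vertex are all assumptions made for illustration.

```python
def decode_box_vertices(heatmap, displacement_pairs):
    """Hypothetical decoder: heat-map peak plus per-vertex displacements.

    heatmap: 2D list of floats, indexed [y][x].
    displacement_pairs: list of (dx_map, dy_map) tuples, one pair per
        bounding-box vertex, each map the same shape as the heat map
        (assumed layout, not taken from the patent).
    Returns a list of (x, y) image coordinates, one per vertex.
    """
    height, width = len(heatmap), len(heatmap[0])
    # Locate the heat-map peak (assumed to mark the object's location).
    peak_y, peak_x = max(
        ((y, x) for y in range(height) for x in range(width)),
        key=lambda p: heatmap[p[0]][p[1]],
    )
    # Each vertex is the peak location shifted by the displacement
    # stored at the peak in that vertex's pair of maps.
    return [
        (peak_x + dx_map[peak_y][peak_x], peak_y + dy_map[peak_y][peak_x])
        for dx_map, dy_map in displacement_pairs
    ]


def _zeros(height, width):
    return [[0.0] * width for _ in range(height)]


# Toy input: a 4x5 heat map with its peak at (x=3, y=2) and 8 vertex pairs.
heatmap = _zeros(4, 5)
heatmap[2][3] = 1.0
displacement_pairs = []
for k in range(8):
    dx_map, dy_map = _zeros(4, 5), _zeros(4, 5)
    dx_map[2][3] = float(k)  # vertex k lies k pixels to the right of the peak
    dy_map[2][3] = -1.0      # and one pixel above it
    displacement_pairs.append((dx_map, dy_map))

vertices = decode_box_vertices(heatmap, displacement_pairs)
```

In this sketch only the displacement values at the peak are read, so the rest of each map is ignored. A real system would typically also handle multiple peaks, one per detected object.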
