Learnable cost volume for determining pixel correspondence

    Publication number: US11790550B2

    Publication date: 2023-10-17

    Application number: US17292647

    Application date: 2020-07-08

    Applicant: Google LLC

    CPC classification number: G06T7/593 G06T7/215 G06T2207/10012 G06T2207/20081

    Abstract: A method includes obtaining a first plurality of feature vectors associated with a first image and a second plurality of feature vectors associated with a second image. The method also includes generating a plurality of transformed feature vectors by transforming each respective feature vector of the first plurality of feature vectors by a kernel matrix trained to define an elliptical inner product space. The method additionally includes generating a cost volume by determining, for each respective transformed feature vector of the plurality of transformed feature vectors, a plurality of inner products, wherein each respective inner product of the plurality of inner products is between the respective transformed feature vector and a corresponding candidate feature vector of a corresponding subset of the second plurality of feature vectors. The method further includes determining, based on the cost volume, a pixel correspondence between the first image and the second image.
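The steps in this abstract can be illustrated with a minimal sketch on a one-dimensional scanline. It is an illustration under stated assumptions, not the patented implementation: the function names, the `max_disp` search window, and the hand-picked symmetric positive-definite `KERNEL` are all hypothetical (the abstract says the kernel matrix is trained), and picking the highest score per pixel is one simple way to read a correspondence off the cost volume.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def mat_vec(matrix: Matrix, vec: Vector) -> Vector:
    """Return matrix @ vec for a small dense matrix stored as nested lists."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def dot(a: Vector, b: Vector) -> float:
    """Plain Euclidean inner product."""
    return sum(x * y for x, y in zip(a, b))


def cost_volume(feats1: List[Vector], feats2: List[Vector],
                kernel: Matrix, max_disp: int) -> List[List[float]]:
    """volume[i][k] = <kernel @ feats1[i], feats2[i + d]> with d = k - max_disp.

    Candidates that fall outside the second image get -inf so that they
    are never selected.
    """
    n = len(feats2)
    volume = []
    for i, feature in enumerate(feats1):
        transformed = mat_vec(kernel, feature)  # transformed feature vector
        row = []
        for d in range(-max_disp, max_disp + 1):
            j = i + d
            row.append(dot(transformed, feats2[j]) if 0 <= j < n else float("-inf"))
        volume.append(row)
    return volume


def correspondence(volume: List[List[float]], max_disp: int) -> List[int]:
    """For each pixel, return the displacement with the highest score."""
    return [max(range(len(row)), key=row.__getitem__) - max_disp for row in volume]


# Hypothetical kernel: symmetric and strictly diagonally dominant with a
# positive diagonal, hence positive definite. A trained kernel would replace it.
KERNEL = [[2.0, 0.5, 0.0],
          [0.5, 1.5, 0.5],
          [0.0, 0.5, 1.5]]

# Toy features: the first two pixels of image 1 match image 2 shifted by one.
FEATS2 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
FEATS1 = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, 1.0]]

VOLUME = cost_volume(FEATS1, FEATS2, KERNEL, max_disp=1)
FLOW = correspondence(VOLUME, max_disp=1)
```

Because the kernel is applied once per feature vector of the first image, each cost entry is then an ordinary dot product against a candidate vector of the second image, which matches the order of operations described in the abstract.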

    Fine-Grained Controllable Video Generation

    Publication number: US20250166135A1

    Publication date: 2025-05-22

    Application number: US18951203

    Application date: 2024-11-18

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for controllable video generation. One of the methods includes receiving a text prompt that specifies an object; receiving a control input that comprises an image that depicts a particular instance of the object; and generating a video that comprises a respective video frame at each of a plurality of time steps in the video and that depicts the particular instance of the object. Generating the video includes, at each of the plurality of time steps: obtaining a text prompt embedding; obtaining a control input embedding; and generating the respective video frame at the time step using a video generation neural network while the video generation neural network is conditioned on the text prompt embedding and on the control input embedding.

    Machine Learning Models for Image Interpolation

    Publication number: US20240265490A1

    Publication date: 2024-08-08

    Application number: US18436509

    Application date: 2024-02-08

    Applicant: Google LLC

    CPC classification number: G06T3/4007 G06T3/18 G06T7/248 G06T2207/20081

    Abstract: Provided is a computer system that includes one or more processors and one or more non-transitory computer-readable media that collectively store a machine-learned image interpolation model. The machine-learned image interpolation model is configured to: extract, for each of multiple different scales, a respective set of feature values from each of a pair of input images; generate, for each of the multiple different scales, a respective flow estimate for each of the pair of input images that indicates a respective flow from the interpolation time to the respective capture time; warp, for each of the multiple different scales, the respective set of feature values for each of the pair of input images according to the respective flow estimate to generate respective warped sets of features; and generate an interpolated image based on the respective warped sets of features for the pair of input images and the multiple different scales.
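The extract, flow, and warp steps in this abstract can be sketched on one-dimensional scanlines. This is only an illustration under stated assumptions: raw intensities stand in for learned features, average pooling stands in for the feature extractor, the flows are supplied by hand rather than estimated by a model, and averaging the finest warped level stands in for the learned fusion. All function names are hypothetical.

```python
from typing import List

Signal = List[float]


def downsample(signal: Signal) -> Signal:
    """Halve the resolution by averaging adjacent pairs of samples."""
    return [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal) - 1, 2)]


def pyramid(signal: Signal, levels: int) -> List[Signal]:
    """Return the signal at `levels` scales, finest first."""
    out = [signal]
    for _ in range(levels - 1):
        out.append(downsample(out[-1]))
    return out


def warp(features: Signal, flow: Signal) -> Signal:
    """Backward warp: sample `features` at x + flow[x], linearly interpolated.

    Sample positions are clamped to the valid range at the borders.
    """
    n = len(features)
    out = []
    for x, f in enumerate(flow):
        pos = min(max(x + f, 0.0), n - 1.0)
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        w = pos - lo
        out.append((1 - w) * features[lo] + w * features[hi])
    return out


def warp_pyramid(image: Signal, flow: Signal, levels: int) -> List[Signal]:
    """Warp every scale of `image`; flow magnitudes shrink with resolution."""
    warped = []
    scales = zip(pyramid(image, levels), pyramid(flow, levels))
    for level, (feats, level_flow) in enumerate(scales):
        scale = 0.5 ** level
        warped.append(warp(feats, [v * scale for v in level_flow]))
    return warped


# Toy input: a bump at x=2 in the first frame and at x=4 in the second.
IMG0 = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
IMG1 = [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]

# Hand-supplied flows from the midpoint time to each capture time.
FLOW_TO_0 = [-1.0] * 8
FLOW_TO_1 = [1.0] * 8

WARPED0 = warp_pyramid(IMG0, FLOW_TO_0, levels=2)
WARPED1 = warp_pyramid(IMG1, FLOW_TO_1, levels=2)

# Stand-in fusion: average the finest warped level of both inputs.
MIDDLE = [(a + b) / 2 for a, b in zip(WARPED0[0], WARPED1[0])]
```

In this toy case both warped frames place the bump at x=3, so the averaged result shows the bump halfway between its two observed positions.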

    Learnable Cost Volume for Determining Pixel Correspondence

    Publication number: US20220189051A1

    Publication date: 2022-06-16

    Application number: US17292647

    Application date: 2020-07-08

    Applicant: Google LLC

    Abstract: A method includes obtaining a first plurality of feature vectors associated with a first image and a second plurality of feature vectors associated with a second image. The method also includes generating a plurality of transformed feature vectors by transforming each respective feature vector of the first plurality of feature vectors by a kernel matrix trained to define an elliptical inner product space. The method additionally includes generating a cost volume by determining, for each respective transformed feature vector of the plurality of transformed feature vectors, a plurality of inner products, wherein each respective inner product of the plurality of inner products is between the respective transformed feature vector and a corresponding candidate feature vector of a corresponding subset of the second plurality of feature vectors. The method further includes determining, based on the cost volume, a pixel correspondence between the first image and the second image.
