-
Publication No.: US20230064328A1
Publication Date: 2023-03-02
Application No.: US17459964
Filing Date: 2021-08-27
Applicant: Google LLC
Inventor: Colvin Pitts , Yukun Zhu , Xuhui Jia
Abstract: A computing system and method that can be used for safe and privacy preserving video representations of participants in a videoconference. In particular, the present disclosure provides a general pipeline for generating reconstructions of videoconference participants based on semantic statuses and/or activity statuses of the participants. The systems and methods of the present disclosure allow for videoconferences that convey necessary or meaningful information of participants through presentation of generalized representations of participants while filtering unnecessary or unwanted information from the representations by leveraging machine-learning models.
-
Publication No.: US20250166136A1
Publication Date: 2025-05-22
Application No.: US18957367
Filing Date: 2024-11-22
Applicant: Google LLC
Inventor: Mark Jeffrey Matthews , Prafull Sharma , Dmitry Lagun , Xuhui Jia , Yuanzhen Li , Varun Jampani , William Tafel Freeman
Abstract: Provided are systems and methods for controlling material attributes such as roughness, metallic, albedo, and transparency in real images. This method leverages the generative prior of text-to-image models known for their photorealistic capabilities, offering an alternative to traditional rendering pipelines. As one example, the technology can be used to alter the appearance of an object in an image, making it appear more metallic or changing its roughness to create a more matte or glossy finish. This can be particularly useful in various fields where the ability to manipulate the appearance of products in images can be a powerful tool.
-
Publication No.: US20250166135A1
Publication Date: 2025-05-22
Application No.: US18951203
Filing Date: 2024-11-18
Applicant: Google LLC
Inventor: Yu-Chuan Su , Hsin-Ping Huang , Ming-Hsuan Yang , Deqing Sun , Lu Jiang , Yukun Zhu , Xuhui Jia
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for controllable video generation. One of the methods includes receiving a text prompt that specifies an object; receiving a control input that comprises an image that depicts a particular instance of the object; generating a video that comprises a respective video frame at each of a plurality of time steps in the video and that depicts the particular instance of the object. Generating the video includes, at each of the plurality of time steps: obtaining a text prompt embedding; obtaining a control input embedding; and generating the respective video frame at the time step using a video generation neural network while the video generation neural network is conditioned on the text prompt embedding and on the control input embedding.
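The generation loop described in this abstract can be sketched as follows. This is a minimal illustration only: `toy_frame_model` is a hypothetical stand-in for the conditioned video generation neural network, and the function names and shapes are assumptions, not the patent's implementation.

```python
import numpy as np

def generate_video(text_embed, control_embed, num_steps, frame_model):
    """Generate one frame per time step, conditioning each frame on the
    text-prompt embedding and the control-input (reference image) embedding."""
    frames = []
    for t in range(num_steps):
        # frame_model stands in for the conditioned video generation network.
        frames.append(frame_model(t, text_embed, control_embed))
    return np.stack(frames)

def toy_frame_model(t, text_embed, control_embed):
    # Hypothetical placeholder: a constant frame shifted by the embeddings.
    bias = float(text_embed.mean() + control_embed.mean())
    return np.full((2, 2), t + bias)

video = generate_video(np.zeros(4), np.zeros(4),
                       num_steps=3, frame_model=toy_frame_model)
```

The key structural point from the claim language survives even in this toy form: both embeddings are computed once, and every frame at every time step is generated while the network is conditioned on both.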
-
Publication No.: US20240205278A1
Publication Date: 2024-06-20
Application No.: US18591787
Filing Date: 2024-02-29
Applicant: Google LLC
Inventor: Colvin Pitts , Yukun Zhu , Xuhui Jia
CPC classification number: H04L65/403 , G06T11/00 , G06V20/41 , G06V40/20 , H04L63/105 , H04N5/272
Abstract: A computing system and method that can be used for safe and privacy preserving video representations of participants in a videoconference. In particular, the present disclosure provides a general pipeline for generating reconstructions of videoconference participants based on semantic statuses and/or activity statuses of the participants. The systems and methods of the present disclosure allow for videoconferences that convey necessary or meaningful information of participants through presentation of generalized representations of participants while filtering unnecessary or unwanted information from the representations by leveraging machine-learning models.
-
Publication No.: US20240119307A1
Publication Date: 2024-04-11
Application No.: US18474934
Filing Date: 2023-09-26
Applicant: Google LLC
Inventor: Hong-You Chen , Boqing Gong , Mingda Zhang , Hang Qi , Xuhui Jia , Li Zhang
IPC: G06N3/098
CPC classification number: G06N3/098
Abstract: The embodiments are directed towards providing personalized federated learning (PFL) models via sharable federated basis models. A model architecture and learning algorithm for PFL models is disclosed. The embodiments learn a set of basis models, which can be combined layer by layer to form a personalized model for each client using specifically learned combination coefficients. The set of basis models are shared with each client of a set of the clients. Thus, the set of basis models is common to each client of the set of clients. However, each client may generate a unique PFL based on their specifically learned combination coefficients. The unique combination of coefficients for each client may be encoded in a separate personalized vector for each of the clients.
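The layer-by-layer combination of shared basis models can be sketched in a few lines. This is a simplified sketch under assumed shapes (one weight matrix per layer per basis model); the function name `personalize` and the toy dimensions are illustrative, not taken from the patent.

```python
import numpy as np

def personalize(basis_layers, coeffs):
    """Combine the shared basis models layer by layer into one personalized
    model using a client's specifically learned combination coefficients."""
    personalized = []
    for layer_idx, bases in enumerate(basis_layers):
        c = coeffs[layer_idx]
        # Per-layer weighted sum of the K basis weight matrices.
        w = sum(ci * wi for ci, wi in zip(c, bases))
        personalized.append(w)
    return personalized

# Toy setup: K = 2 basis models sharing a single 2x2 linear layer.
basis_layers = [[np.eye(2), np.ones((2, 2))]]
client_coeffs = np.array([[0.5, 0.5]])  # this client's personalized vector
model = personalize(basis_layers, client_coeffs)
```

Note how the design keeps communication cheap: the basis models are common to every client, so only the small per-client coefficient vector is unique.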
-
Publication No.: US11949724B2
Publication Date: 2024-04-02
Application No.: US17459964
Filing Date: 2021-08-27
Applicant: Google LLC
Inventor: Colvin Pitts , Yukun Zhu , Xuhui Jia
CPC classification number: H04L65/403 , G06T11/00 , G06V20/41 , G06V40/20 , H04L63/105 , H04N5/272
Abstract: A computing system and method that can be used for safe and privacy preserving video representations of participants in a videoconference. In particular, the present disclosure provides a general pipeline for generating reconstructions of videoconference participants based on semantic statuses and/or activity statuses of the participants. The systems and methods of the present disclosure allow for videoconferences that convey necessary or meaningful information of participants through presentation of generalized representations of participants while filtering unnecessary or unwanted information from the representations by leveraging machine-learning models.
-
Publication No.: US20230359865A1
Publication Date: 2023-11-09
Application No.: US18044842
Filing Date: 2020-09-16
Applicant: Google LLC
Inventor: Zhuoran Shen , Raviteja Vemulapalli , Irwan Bello , Xuhui Jia , Ching-Hui Chen
Abstract: The present disclosure provides systems, methods, and computer program products for modeling dependencies throughout a network using a global-self attention model with a content attention layer and a positional attention layer that operate in parallel. The model receives input data comprising content values and context positions. The content attention layer generates one or more output features for each context position based on a global attention operation applied to the content values independent of the context positions. The positional attention layer generates an attention map for each of the context positions based on one or more content values of the respective context position and associated neighboring positions. Output is determined based on the output features generated by the content attention layer and the attention map generated for each context position by the positional attention layer. The model improves efficiency and can be used throughout a deep network.
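The two parallel branches described in the abstract can be sketched as below. This is a loose illustration under stated assumptions: the merge by summation, the dot-product scoring, and the stand-in position embeddings are choices made for the sketch, not details disclosed here.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def global_self_attention(values, pos_embed):
    """Content branch: global attention over all content values, independent
    of position. Positional branch: an attention map per context position
    built from position embeddings. Branch outputs are summed here (one
    possible way to merge parallel branches)."""
    d = values.shape[-1]
    content_out = softmax(values @ values.T / np.sqrt(d)) @ values
    pos_out = softmax(values @ pos_embed.T / np.sqrt(d)) @ pos_embed
    return content_out + pos_out

values = np.arange(12.0).reshape(4, 3)  # 4 context positions, 3-dim values
pos = np.full((4, 3), 0.1)              # stand-in position embeddings
out = global_self_attention(values, pos)
```

Because the content branch attends globally in a single matrix product rather than within local windows, this style of layer can be used throughout a deep network instead of only at low-resolution stages.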
-
Publication No.: US20230343073A1
Publication Date: 2023-10-26
Application No.: US17729878
Filing Date: 2022-04-26
Applicant: Google LLC
IPC: G06V10/774 , G06V10/82 , G06V10/764 , G06V10/74 , G06V10/42 , G06V10/44 , G06N3/08
CPC classification number: G06V10/774 , G06V10/82 , G06V10/764 , G06V10/761 , G06V10/42 , G06V10/44 , G06N3/084
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing novel category discovery. One of the methods includes generating first local feature tensors from a first training image; obtaining previous local feature tensors generated from a previous training image; generating a first similarity tensor representing a similarity between the first local feature tensors and the previous local feature tensors; obtaining a second similarity tensor for a second training image; processing, using a neural network, the first training image to generate a first training output representing a class prediction for the first training image; obtaining a second training output representing a class prediction for the second training image; and generating an update to the neural network from (i) a similarity between the first similarity tensor and the second similarity tensor and (ii) a similarity between the first training output and the second training output.
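The similarity-tensor construction and the two-part update signal can be sketched as follows. This is a minimal sketch assuming cosine similarity for the feature comparison and mean squared error for both agreement terms; the real method's similarity measures and loss are not specified in the abstract.

```python
import numpy as np

def similarity_tensor(feats_a, feats_b):
    """Cosine similarity between every pair of local feature vectors
    from two images."""
    a = feats_a / np.linalg.norm(feats_a, axis=1, keepdims=True)
    b = feats_b / np.linalg.norm(feats_b, axis=1, keepdims=True)
    return a @ b.T

def consistency_loss(sim1, sim2, out1, out2):
    # (i) agreement between the two similarity tensors and (ii) agreement
    # between the two class predictions, both as mean squared error here.
    return np.mean((sim1 - sim2) ** 2) + np.mean((out1 - out2) ** 2)

first = np.array([[1.0, 0.0], [0.0, 1.0]])   # local features, first image
prev = np.array([[1.0, 1.0]])                # local features, previous image
sim = similarity_tensor(first, prev)
pred = np.array([0.9, 0.1])                  # class prediction (toy)
loss = consistency_loss(sim, sim, pred, pred)
```

When the two similarity tensors and the two class predictions agree exactly, the loss is zero, which is the fixed point the training update pushes toward.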
-
Publication No.: US20230281979A1
Publication Date: 2023-09-07
Application No.: US18006078
Filing Date: 2020-08-03
Applicant: Xuhui JIA , Raviteja VEMULAPALLI , Yukun ZHU , Bradley Ray GREEN , Bardia DOOSTI , Ching-Hui CHEN , Google LLC
Inventor: Xuhui Jia , Raviteja Vemulapalli , Bradley Ray Green , Bardia Doosti , Ching-Hui Chen
IPC: G06V10/82 , G06V10/776
CPC classification number: G06V10/82 , G06V10/776
Abstract: Systems and methods of the present disclosure are directed to a method for training a machine-learned visual attention model. The method can include obtaining image data that depicts a head of a person and an additional entity. The method can include processing the image data with an encoder portion of the visual attention model to obtain latent head and entity encodings. The method can include processing the latent encodings with the visual attention model to obtain a visual attention value and processing the latent encodings with a machine-learned visual location model to obtain a visual location estimation. The method can include training the models by evaluating a loss function that evaluates differences between the visual location estimation and a pseudo visual location label derived from the image data and between the visual attention value and a ground truth visual attention label.
-
Publication No.: US20230222628A1
Publication Date: 2023-07-13
Application No.: US17572923
Filing Date: 2022-01-11
Applicant: Google LLC
Inventor: Yang Zhao , Yu-Chuan Su , Chun-Te Chu , Yandong Li , Marius Renn , Yukun Zhu , Xuhui Jia , Bradley Ray Green
CPC classification number: G06T5/001 , G06V40/168 , G06T2207/30201 , G06T2207/20081 , G06T2207/20084
Abstract: Systems and methods for training a restoration model can leverage training for two sub-tasks to train the restoration model to generate realistic and identity-preserved outputs. The systems and methods can balance the training of the generation task and the reconstruction task to ensure the generated outputs preserve the identity of the original subject while generating realistic outputs. The systems and methods can further leverage a feature quantization model and skip connections to improve the model output and overall training.
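The feature quantization step mentioned in the abstract can be sketched with a nearest-codebook lookup, which is a common formulation of feature quantization; the function name and toy codebook below are assumptions for illustration, not the patent's specific design.

```python
import numpy as np

def quantize_features(features, codebook):
    """Replace each feature vector with its nearest codebook entry."""
    # Squared Euclidean distance from every feature to every code.
    d = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    idx = d.argmin(axis=1)
    return codebook[idx], idx

feats = np.array([[0.1, 0.0], [0.9, 1.1]])   # continuous features (toy)
codes = np.array([[0.0, 0.0], [1.0, 1.0]])   # learned codebook (toy)
quantized, idx = quantize_features(feats, codes)
```

Snapping features to a discrete codebook constrains the decoder to a learned vocabulary of face features, which is one way such a component can help keep generated outputs realistic while reconstruction losses keep them identity-preserving.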