Patent search ap:("Google LLC") AND inv:"Yukun Zhu" Page 1

1.

Invention publication
Multi-Stage Machine Learning Model Synthesis for Efficient Inference
Status: Pending (published)

Publication (announcement) number: US20230297852A1

Publication (announcement) date: 2023-09-21

Application number: US18007379

Application date: 2021-07-29

Applicant: Google LLC

Inventor: Li Zhang, Andrew Gerald Howard, Brendan Wesley Jou, Yukun Zhu, Mingda Zhang, Andrey Zhmoginov

IPC: G06N5/022

CPC classification number: G06N5/022

Abstract: Example implementations of the present disclosure combine efficient model design and dynamic inference. With a standalone lightweight model, unnecessary computation on easy examples is avoided, and the information extracted by the lightweight model also guides the synthesis of a specialist network from the basis models. Extensive experiments on ImageNet show that a proposed example BasisNet is particularly effective for image classification, and a BasisNet-MV3 achieves 80.3% top-1 accuracy with 290M MAdds without early termination.
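
The two-stage idea in this abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the patented implementation: the models are single linear layers, the specialist is a convex combination of basis weight matrices, and the names `basis_inference`, `light_w`, `coeff_w`, `basis_ws`, and `exit_threshold` are hypothetical.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def basis_inference(x, light_w, coeff_w, basis_ws, exit_threshold=0.9):
    """Toy two-stage inference: lightweight model first, synthesized specialist second.

    x        : input features, shape (D,)
    light_w  : lightweight classifier weights, shape (C, D)   (assumed linear)
    coeff_w  : maps lightweight output to K mixing logits, shape (K, C)
    basis_ws : K basis weight matrices, shape (K, C, D)
    """
    # Stage 1: the lightweight model produces an initial prediction.
    light_probs = softmax(light_w @ x)
    if light_probs.max() >= exit_threshold:
        # Easy example: terminate early and skip the specialist entirely.
        return int(light_probs.argmax()), "early_exit"
    # The lightweight model's output also determines the mixing coefficients.
    alphas = softmax(coeff_w @ light_probs)
    # Stage 2: synthesize specialist weights as a weighted sum of the bases.
    specialist_w = np.tensordot(alphas, basis_ws, axes=1)  # shape (C, D)
    return int(softmax(specialist_w @ x).argmax()), "specialist"
```

Setting `exit_threshold` above 1.0 disables early termination, so every input goes through the synthesized specialist.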

2.

Invention publication
AUDIO-VISUAL HEARING AID
Status: Pending (published)

Publication (announcement) number: US20230267942A1

Publication (announcement) date: 2023-08-24

Application number: US17601042

Application date: 2020-10-01

Applicant: Google LLC

Inventor: Anatoly Efros, Noam Etzion-Rosenberg, Tal Remez, Oran Lang, Inbar Mosseri, Israel Or Weinstein, Benjamin Schlesinger, Michael Rubinstein, Ariel Ephrat, Yukun Zhu, Stella Laurenzo, Amit Pitaru, Yossi Matias

IPC: G10L21/0208, G10L25/57

CPC classification number: G10L21/0208, G10L25/57, G10L2021/02087

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for audio-visual speech separation. A method includes: receiving, by a user device, a first indication of one or more first speakers visible in a current view recorded by a camera of the user device; in response, generating a respective isolated speech signal for each of the one or more first speakers that isolates speech of the first speaker in the current view, and sending the isolated speech signals for each of the one or more first speakers to a listening device operatively coupled to the user device; receiving, by the user device, a second indication of one or more second speakers visible in the current view recorded by the camera of the user device; and, in response, generating and sending a respective isolated speech signal for each of the one or more second speakers to the listening device.

3.

Invention application
NEURAL ARCHITECTURE SEARCH FOR DENSE IMAGE PREDICTION TASKS
Status: Pending (published)

Publication (announcement) number: US20190370648A1

Publication (announcement) date: 2019-12-05

Application number: US16425900

Application date: 2019-05-29

Applicant: Google LLC

Inventor: Barret Zoph, Jonathon Shlens, Yukun Zhu, Maxwell Donald Emmet Collins, Liang-Chieh Chen, Gerhard Florian Schroff, Hartwig Adam, Georgios Papandreou

IPC: G06N3/08, G06N3/04, G06K9/62, G06N20/00, G06F17/15

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for determining neural network architectures. One of the methods includes obtaining training data for a dense image prediction task; and determining an architecture for a neural network configured to perform the dense image prediction task, comprising: searching a space of candidate architectures to identify one or more best performing architectures using the training data, wherein each candidate architecture in the space of candidate architectures comprises (i) the same first neural network backbone that is configured to receive an input image and to process the input image to generate a plurality of feature maps and (ii) a different dense prediction cell configured to process the plurality of feature maps and to generate an output for the dense image prediction task; and determining the architecture for the neural network based on the best performing candidate architectures.
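
The search structure described in this abstract (one shared backbone, a different dense prediction cell per candidate, ranking by performance on training data) can be sketched on a toy problem. This is only an illustration under stated assumptions: the "backbone" is a fixed rescaling, a "cell" is one elementwise op per feature map followed by a per-pixel sum, the space is searched exhaustively, and all names (`SCALES`, `OPS`, `backbone`, `make_cell`, `score`, `search`) are hypothetical.

```python
from itertools import product

# Hypothetical toy search space: one op per backbone feature map.
SCALES = (1.0, 0.1, 0.01)
OPS = {"identity": lambda v: v, "double": lambda v: 2 * v, "negate": lambda v: -v}

def backbone(image):
    """Shared backbone stand-in: turns an input 'image' into several feature maps."""
    return [[p * s for p in image] for s in SCALES]

def make_cell(ops):
    """Build a candidate dense prediction cell from a tuple of op names."""
    def cell(feature_maps):
        # Apply one op per feature map, then sum per pixel for a dense output.
        return [sum(OPS[op](fm[i]) for op, fm in zip(ops, feature_maps))
                for i in range(len(feature_maps[0]))]
    return cell

def score(ops, data):
    """Negative mean squared per-pixel error on the training data (higher is better)."""
    cell = make_cell(ops)
    sq_err, count = 0.0, 0
    for image, target in data:
        pred = cell(backbone(image))
        sq_err += sum((p - t) ** 2 for p, t in zip(pred, target))
        count += len(target)
    return -sq_err / count

def search(data, top_k=3):
    """Enumerate every candidate cell and return the top_k best-scoring ones."""
    candidates = product(OPS, repeat=len(SCALES))
    return sorted(candidates, key=lambda ops: score(ops, data), reverse=True)[:top_k]
```

Exhaustive enumeration only works because this toy space has 27 candidates; a realistic space would need a sampling or learned search strategy, which this sketch does not attempt.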

4.

Invention grant
Audio-visual hearing aid
Status: In force

Publication (announcement) number: US12073844B2

Publication (announcement) date: 2024-08-27

Application number: US17601042

Application date: 2020-10-01

Applicant: Google LLC

Inventor: Anatoly Efros, Noam Etzion-Rosenberg, Tal Remez, Oran Lang, Inbar Mosseri, Israel Or Weinstein, Benjamin Schlesinger, Michael Rubinstein, Ariel Ephrat, Yukun Zhu, Stella Laurenzo, Amit Pitaru, Yossi Matias

IPC: G10L21/0208, G10L17/00, G10L21/0272, G10L25/57

CPC classification number: G10L21/0208, G10L17/00, G10L21/0272, G10L25/57, G10L2021/02087

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for audio-visual speech separation. A method includes: receiving, by a user device, a first indication of one or more first speakers visible in a current view recorded by a camera of the user device; in response, generating a respective isolated speech signal for each of the one or more first speakers that isolates speech of the first speaker in the current view, and sending the isolated speech signals for each of the one or more first speakers to a listening device operatively coupled to the user device; receiving, by the user device, a second indication of one or more second speakers visible in the current view recorded by the camera of the user device; and, in response, generating and sending a respective isolated speech signal for each of the one or more second speakers to the listening device.

5.

Invention publication
Training a Restoration Model for Balanced Generation and Reconstruction
Status: Pending (published)

Publication (announcement) number: US20230222628A1

Publication (announcement) date: 2023-07-13

Application number: US17572923

Application date: 2022-01-11

Applicant: Google LLC

Inventor: Yang Zhao, Yu-Chuan Su, Chun-Te Chu, Yandong Li, Marius Renn, Yukun Zhu, Xuhui Jia, Bradley Ray Green

IPC: G06T5/00, G06V40/16

CPC classification number: G06T5/001, G06V40/168, G06T2207/30201, G06T2207/20081, G06T2207/20084

Abstract: Systems and methods for training a restoration model can leverage training for two sub-tasks to train the restoration model to generate realistic and identity-preserved outputs. The systems and methods can balance the training of the generation task and the reconstruction task to ensure the generated outputs preserve the identity of the original subject while generating realistic outputs. The systems and methods can further leverage a feature quantization model and skip connections to improve the model output and overall training.

6.

Invention application
FINE-GRAINED CONTROLLABLE VIDEO GENERATION
Status: In force

Publication (announcement) number: US20250166135A1

Publication (announcement) date: 2025-05-22

Application number: US18951203

Application date: 2024-11-18

Applicant: Google LLC

Inventor: Yu-Chuan Su, Hsin-Ping Huang, Ming-Hsuan Yang, Deqing Sun, Lu Jiang, Yukun Zhu, Xuhui Jia

IPC: G06T5/60, G06T7/00, G06T7/246, G06T11/20, G06T11/60

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for controllable video generation. One of the methods includes receiving a text prompt that specifies an object; receiving a control input that comprises an image that depicts a particular instance of the object; and generating a video that comprises a respective video frame at each of a plurality of time steps in the video and that depicts the particular instance of the object. Generating the video includes, at each of the plurality of time steps: obtaining a text prompt embedding; obtaining a control input embedding; and generating the respective video frame at the time step using a video generation neural network while the video generation neural network is conditioned on the text prompt embedding and on the control input embedding.

7.

Invention application
MEDIA TREND DETECTION AND MAINTENANCE AT A CONTENT SHARING PLATFORM
Status: In force

Publication (announcement) number: US20250111675A1

Publication (announcement) date: 2025-04-03

Application number: US18900467

Application date: 2024-09-27

Applicant: Google LLC

Inventor: Hui Miao, Chun-Te Chu, Mingyan Gao, Huanfen Yao, Ting Liu, Long Zhao, Liangzhe Yuan, Yukun Zhu, Vinay Kumar Bettadapura, Ye Jin

IPC: G06V20/40, G06V10/74, G06V10/75, G06V10/762, G06V10/80

Abstract: Methods and systems for media trend detection and maintenance are provided herein. A set of media items each having common media characteristics is identified. A set of pose values is determined for each respective media item of the set of media items. Each pose value is associated with a particular predefined pose for objects depicted by the set of media items. A set of distance scores is calculated. Each distance score represents a distance between the respective set of pose values determined for a media item and a respective set of pose values determined for an additional media item. A coherence score is determined for the set of media items based on the calculated set of distance scores. Responsive to a determination that the coherence score satisfies one or more coherence criteria, a determination is made that the set of media items corresponds to a media trend of a platform.
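
The pipeline in this abstract (pose values per item, pairwise distance scores, a coherence score, a threshold check) can be sketched as follows. The abstract does not specify the distance metric, the coherence formula, or the criteria, so the Euclidean distance, the `1 / (1 + mean distance)` mapping, and the `min_coherence` threshold below are assumptions chosen only to make the flow concrete; the function names are hypothetical.

```python
import math
from itertools import combinations

def distance(a, b):
    """Euclidean distance between two equal-length sets of pose values (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def coherence_score(pose_sets):
    """Map the mean pairwise distance into (0, 1]; 1.0 means identical pose values."""
    dists = [distance(a, b) for a, b in combinations(pose_sets, 2)]
    if not dists:
        return 0.0  # fewer than two media items: no basis for calling it a trend
    return 1.0 / (1.0 + sum(dists) / len(dists))

def is_trend(pose_sets, min_coherence=0.8):
    """Single assumed coherence criterion: the score must reach min_coherence."""
    return coherence_score(pose_sets) >= min_coherence
```

For example, `is_trend([[0.1, 0.9], [0.1, 0.9], [0.1, 0.9]])` is true because every pairwise distance is zero, while two items whose pose values are far apart fall below the threshold.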

8.

Invention application
AUDIO-VISUAL HEARING AID
Status: In force

Publication (announcement) number: US20240428816A1

Publication (announcement) date: 2024-12-26

Application number: US18797400

Application date: 2024-08-07

Applicant: Google LLC

Inventor: Anatoly Efros, Noam Etzion-Rosenberg, Tal Remez, Oran Lang, Inbar Mosseri, Israel Or Weinstein, Benjamin Schlesinger, Michael Rubinstein, Ariel Ephrat, Yukun Zhu, Stella Laurenzo, Amit Pitaru, Yossi Matias

IPC: G10L21/0208, G10L17/00, G10L21/0272, G10L25/57

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for audio-visual speech separation. A method includes: receiving, by a user device, a first indication of one or more first speakers visible in a current view recorded by a camera of the user device; in response, generating a respective isolated speech signal for each of the one or more first speakers that isolates speech of the first speaker in the current view, and sending the isolated speech signals for each of the one or more first speakers to a listening device operatively coupled to the user device; receiving, by the user device, a second indication of one or more second speakers visible in the current view recorded by the camera of the user device; and, in response, generating and sending a respective isolated speech signal for each of the one or more second speakers to the listening device.

9.

Invention grant
Safe and privacy preserving video representation
Status: In force

Publication (announcement) number: US11949724B2

Publication (announcement) date: 2024-04-02

Application number: US17459964

Application date: 2021-08-27

Applicant: Google LLC

Inventor: Colvin Pitts, Yukun Zhu, Xuhui Jia

IPC: G06T11/00, G06V20/40, G06V40/20, H04L9/40, H04L65/403, H04N5/272

CPC classification number: H04L65/403, G06T11/00, G06V20/41, G06V40/20, H04L63/105, H04N5/272

Abstract: A computing system and method that can be used for safe and privacy preserving video representations of participants in a videoconference. In particular, the present disclosure provides a general pipeline for generating reconstructions of videoconference participants based on semantic statuses and/or activity statuses of the participants. The systems and methods of the present disclosure allow for videoconferences that convey necessary or meaningful information of participants through presentation of generalized representations of participants while filtering unnecessary or unwanted information from the representations by leveraging machine-learning models.

10.

Invention application
Self-Supervised Learning of Photo Quality Using Implicitly Preferred Photos in Temporal Clusters
Status: In force

Publication (announcement) number: US20230113131A1

Publication (announcement) date: 2023-04-13

Application number: US17909579

Application date: 2020-03-05

Applicant: Shawn O'Banion, Wenhuan WEI, Yukun ZHU, Google LLC

Inventor: Shawn Ryan O'Banion, Wenhuan Wei, Yukun Zhu

IPC: G06V20/70, G06V10/98, G06V10/771, G06V10/75, G06V10/74, G06V10/82

Abstract: The present disclosure is directed to systems and methods for performing automated labeling of images. Labeled images can be used to train machine-learned models to infer image attributes such as quality for suggesting user actions.
