Patent search ap:("Google LLC") AND inv:"Inbar Mosseri" Page 1

1.

Granted invention patent
Audio-visual hearing aid (Active)

Publication (announcement) number: US12073844B2

Publication (announcement) date: 2024-08-27

Application number: US17601042

Application date: 2020-10-01

Applicant: Google LLC

Inventor: Anatoly Efros , Noam Etzion-Rosenberg , Tal Remez , Oran Lang , Inbar Mosseri , Israel Or Weinstein , Benjamin Schlesinger , Michael Rubinstein , Ariel Ephrat , Yukun Zhu , Stella Laurenzo , Amit Pitaru , Yossi Matias

IPC: G10L21/0208 , G10L17/00 , G10L21/0272 , G10L25/57

CPC classification number: G10L21/0208 , G10L17/00 , G10L21/0272 , G10L25/57 , G10L2021/02087

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for audio-visual speech separation. A method includes: receiving, by a user device, a first indication of one or more first speakers visible in a current view recorded by a camera of the user device, in response, generating a respective isolated speech signal for each of the one or more first speakers that isolates speech of the first speaker in the current view and sending the isolated speech signals for each of the one or more first speakers to a listening device operatively coupled to the user device, receiving, by the user device, a second indication of one or more second speakers visible in the current view recorded by the camera of the user device, and in response generating and sending a respective isolated speech signal for each of the one or more second speakers to the listening device.

2.

Granted invention patent
Audio-visual speech separation (Active)

Publication (announcement) number: US11894014B2

Publication (announcement) date: 2024-02-06

Application number: US17951002

Application date: 2022-09-22

Applicant: Google LLC

Inventor: Inbar Mosseri , Michael Rubinstein , Ariel Ephrat , William Freeman , Oran Lang , Kevin William Wilson , Tali Dekel , Avinatan Hassidim

IPC: G10L25/57 , G10L15/16 , G10L21/10 , G10L21/18 , G06V20/40 , G06V40/16 , G10L15/25 , G06F18/214 , G10L17/18

CPC classification number: G10L25/57 , G06F18/214 , G06V20/41 , G06V40/161 , G10L15/16 , G10L15/25 , G10L17/18 , G10L21/10 , G10L21/18

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for audio-visual speech separation. A method includes: obtaining, for each frame in a stream of frames from a video in which faces of one or more speakers have been detected, a respective per-frame face embedding of the face of each speaker; processing, for each speaker, the per-frame face embeddings of the face of the speaker to generate visual features for the face of the speaker; obtaining a spectrogram of an audio soundtrack for the video; processing the spectrogram to generate an audio embedding for the audio soundtrack; combining the visual features for the one or more speakers and the audio embedding for the audio soundtrack to generate an audio-visual embedding for the video; determining a respective spectrogram mask for each of the one or more speakers; and determining a respective isolated speech spectrogram for each speaker.
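
The last two steps of this abstract, going from per-speaker spectrogram masks to isolated speech spectrograms, can be illustrated with a minimal sketch. It assumes that a mask is applied by element-wise multiplication with the mixture spectrogram, which the abstract does not state. The function name `apply_masks`, the speaker keys, and the toy values are hypothetical.

```python
from typing import Dict, List

Spectrogram = List[List[float]]  # rows = time frames, columns = frequency bins


def apply_masks(mixture: Spectrogram,
                masks: Dict[str, Spectrogram]) -> Dict[str, Spectrogram]:
    """Return one isolated spectrogram per speaker.

    Assumption (not from the abstract): each mask is applied by
    element-wise multiplication with the mixture spectrogram.
    """
    isolated = {}
    for speaker, mask in masks.items():
        # Every mask must have the same shape as the mixture.
        same_shape = len(mask) == len(mixture) and all(
            len(mask_row) == len(mix_row)
            for mask_row, mix_row in zip(mask, mixture)
        )
        if not same_shape:
            raise ValueError(f"mask for {speaker} does not match the mixture shape")
        isolated[speaker] = [
            [m * x for m, x in zip(mask_row, mix_row)]
            for mask_row, mix_row in zip(mask, mixture)
        ]
    return isolated


# Toy 2x2 mixture and two hypothetical speaker masks.
mixture = [[4.0, 2.0], [1.0, 3.0]]
masks = {
    "speaker_a": [[1.0, 0.0], [0.5, 0.0]],
    "speaker_b": [[0.0, 1.0], [0.5, 1.0]],
}
isolated = apply_masks(mixture, masks)
```

In this toy case the two masks sum to one in every cell, so the isolated spectrograms add back up to the mixture; a real system need not enforce that.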

3.

Invention application
AUDIO-VISUAL SPEECH SEPARATION (Active)

Publication (announcement) number: US20230122905A1

Publication (announcement) date: 2023-04-20

Application number: US17951002

Application date: 2022-09-22

Applicant: Google LLC

Inventor: Inbar Mosseri , Michael Rubinstein , Ariel Ephrat , William Freeman , Oran Lang , Kevin William Wilson , Tali Dekel , Avinatan Hassidim

IPC: G10L21/10 , G10L15/16 , G10L21/18 , G06V20/40 , G06V40/16 , G10L15/25 , G06F18/214

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for audio-visual speech separation. A method includes: obtaining, for each frame in a stream of frames from a video in which faces of one or more speakers have been detected, a respective per-frame face embedding of the face of each speaker; processing, for each speaker, the per-frame face embeddings of the face of the speaker to generate visual features for the face of the speaker; obtaining a spectrogram of an audio soundtrack for the video; processing the spectrogram to generate an audio embedding for the audio soundtrack; combining the visual features for the one or more speakers and the audio embedding for the audio soundtrack to generate an audio-visual embedding for the video; determining a respective spectrogram mask for each of the one or more speakers; and determining a respective isolated speech spectrogram for each speaker.

4.

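Granted invention patent
Generating cartoon images from photos (Active)

Publication (announcement) number: US10529115B2

Publication (announcement) date: 2020-01-07

Application number: US15921207

Application date: 2018-03-14

Applicant: Google LLC

Inventor: Aaron Sarna , Dilip Krishnan , Forrester Cole , Inbar Mosseri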

IPC: G06T13/80

Abstract: A system and method for generating cartoon images from photos are described. The method includes receiving an image of a user, determining a template for a cartoon avatar, determining an attribute needed for the template, processing the image with a classifier trained for classifying the attribute included in the image, determining a label generated by the classifier for the attribute, determining a cartoon asset for the attribute based on the label, and rendering the cartoon avatar personifying the user using the cartoon asset.
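
The flow in this abstract (template, required attributes, classifier label, cartoon asset) can be sketched as a simple lookup. This is only an illustration: the attribute names, the asset catalogue, the file names, and the stand-in classifiers below are all hypothetical and do not come from the patent.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical template: the attributes this cartoon avatar needs.
TEMPLATE_ATTRIBUTES: List[str] = ["hair_color", "glasses"]

# Hypothetical asset catalogue keyed by (attribute, label).
ASSETS: Dict[Tuple[str, str], str] = {
    ("hair_color", "brown"): "hair_brown.svg",
    ("hair_color", "black"): "hair_black.svg",
    ("glasses", "yes"): "glasses_round.svg",
    ("glasses", "no"): "no_glasses.svg",
}


def select_assets(image: bytes,
                  classifiers: Dict[str, Callable[[bytes], str]]) -> Dict[str, str]:
    """Pick one cartoon asset per template attribute from classifier labels."""
    chosen = {}
    for attribute in TEMPLATE_ATTRIBUTES:
        # Label produced by the classifier for this attribute.
        label = classifiers[attribute](image)
        # Asset chosen from the label.
        chosen[attribute] = ASSETS[(attribute, label)]
    return chosen


# Stand-in classifiers that return fixed labels; real ones would be trained models.
stub_classifiers = {
    "hair_color": lambda image: "brown",
    "glasses": lambda image: "yes",
}
avatar_assets = select_assets(b"photo bytes", stub_classifiers)
```

Rendering the avatar from the selected assets is left out, since the abstract gives no detail on that step.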

5.

Invention publication
Text-Based Real Image Editing with Diffusion Models (Pending, published)

Publication (announcement) number: US20240355017A1

Publication (announcement) date: 2024-10-24

Application number: US18302508

Application date: 2023-04-18

Applicant: Google LLC

Inventor: Shiran Elyahu Zada , Bahjat Kawar , Oran Lang , Omer Tov , Huiwen Chang , Tali Dekel , Inbar Mosseri

IPC: G06T11/60 , G06T3/40

CPC classification number: G06T11/60 , G06T3/4053

Abstract: Methods and systems for editing an image are disclosed herein. The method includes receiving an input image and a target text, the target text indicating a desired edit for the input image and obtaining, by the computing system, a target text embedding based on the target text. The method also includes obtaining, by the computing system, an optimized text embedding based on the target text embedding and the input image and fine-tuning, by the computing system, a diffusion model based on the optimized text embedding. The method can further include interpolating, by the computing system, the target text embedding and the optimized text embedding to obtain an interpolated embedding and generating, by the computing system, an edited image including the desired edit using the diffusion model based on the input image and the interpolated embedding.
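
The interpolation step between the target text embedding and the optimized text embedding can be sketched as follows. The abstract does not say which form of interpolation is used, so this sketch assumes plain linear interpolation; the function name `interpolate_embeddings` and the weight `eta` are hypothetical.

```python
from typing import List


def interpolate_embeddings(target: List[float],
                           optimized: List[float],
                           eta: float) -> List[float]:
    """Linearly blend two embeddings of equal length.

    Assumption: eta = 1.0 returns the target embedding and
    eta = 0.0 returns the optimized embedding.
    """
    if len(target) != len(optimized):
        raise ValueError("embeddings must have the same length")
    return [eta * t + (1.0 - eta) * o for t, o in zip(target, optimized)]


# Toy two-dimensional embeddings.
target_embedding = [1.0, 0.0]
optimized_embedding = [0.0, 2.0]
halfway = interpolate_embeddings(target_embedding, optimized_embedding, 0.5)
```

Under this assumption, a larger `eta` moves the result toward the target text and a smaller one keeps it closer to the embedding optimized for the input image.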

6.

Invention publication
AUDIO-VISUAL HEARING AID (Pending, published)

Publication (announcement) number: US20230267942A1

Publication (announcement) date: 2023-08-24

Application number: US17601042

Application date: 2020-10-01

Applicant: Google LLC

Inventor: Anatoly Efros , Noam Etzion-Rosenberg , Tal Remez , Oran Lang , Inbar Mosseri , Israel Or Weinstein , Benjamin Schlesinger , Michael Rubinstein , Ariel Ephrat , Yukun Zhu , Stella Laurenzo , Amit Pitaru , Yossi Matias

IPC: G10L21/0208 , G10L25/57

CPC classification number: G10L21/0208 , G10L25/57 , G10L2021/02087

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for audio-visual speech separation. A method includes: receiving, by a user device, a first indication of one or more first speakers visible in a current view recorded by a camera of the user device, in response, generating a respective isolated speech signal for each of the one or more first speakers that isolates speech of the first speaker in the current view and sending the isolated speech signals for each of the one or more first speakers to a listening device operatively coupled to the user device, receiving, by the user device, a second indication of one or more second speakers visible in the current view recorded by the camera of the user device, and in response generating and sending a respective isolated speech signal for each of the one or more second speakers to the listening device.

7.

Invention application
Deep Saliency Prior (Active)

Publication (announcement) number: US20230015117A1

Publication (announcement) date: 2023-01-19

Application number: US17856370

Application date: 2022-07-01

Applicant: Google LLC

Inventor: Kfir Aberman , David Edward Jacobs , Kai Jochen Kohlhoff , Michael Rubinstein , Yossi Gandelsman , Junfeng He , Inbar Mosseri , Yael Pritch Knaan

IPC: G06T7/194 , G06V40/20 , G06T7/11 , G06T3/00 , G06T11/00

Abstract: Techniques for tuning an image editing operator for reducing a distractor in raw image data are presented herein. The image editing operator can access the raw image data and a mask. The mask can indicate a region of interest associated with the raw image data. The image editing operator can process the raw image data and the mask to generate processed image data. Additionally, a trained saliency model can process at least the processed image data within the region of interest to generate a saliency map that provides saliency values. Moreover, a saliency loss function can compare the saliency values provided by the saliency map for the processed image data within the region of interest to one or more target saliency values. Subsequently, the one or more parameter values of the image editing operator can be modified based at least in part on the saliency loss function.
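
The saliency loss in this abstract compares saliency values inside the region of interest with target values. A minimal sketch, assuming a mean squared error against a single scalar target (the abstract only says "compare" and allows one or more targets); the function name `saliency_loss` and the toy values are hypothetical.

```python
from typing import List


def saliency_loss(saliency_map: List[List[float]],
                  mask: List[List[int]],
                  target: float) -> float:
    """Mean squared difference between masked saliency values and a target.

    Assumption: squared error is used here purely for illustration.
    Only pixels where the mask is non-zero contribute to the loss.
    """
    squared_errors = [
        (s - target) ** 2
        for s_row, m_row in zip(saliency_map, mask)
        for s, m in zip(s_row, m_row)
        if m
    ]
    if not squared_errors:
        raise ValueError("mask selects no pixels")
    return sum(squared_errors) / len(squared_errors)


# Toy 2x2 saliency map; the mask marks the left column as the region of interest.
saliency_map = [[0.5, 1.0], [0.25, 0.0]]
mask = [[1, 0], [1, 0]]
loss = saliency_loss(saliency_map, mask, target=0.0)
```

A target of 0.0 corresponds to asking for the masked region to stop attracting attention. Updating the editing operator's parameters from this loss is not shown.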

8.

Invention application
AUDIO-VISUAL HEARING AID (Active)

Publication (announcement) number: US20240428816A1

Publication (announcement) date: 2024-12-26

Application number: US18797400

Application date: 2024-08-07

Applicant: Google LLC

Inventor: Anatoly Efros , Noam Etzion-Rosenberg , Tal Remez , Oran Lang , Inbar Mosseri , Israel Or Weinstein , Benjamin Schlesinger , Michael Rubinstein , Ariel Ephrat , Yukun Zhu , Stella Laurenzo , Amit Pitaru , Yossi Matias

IPC: G10L21/0208 , G10L17/00 , G10L21/0272 , G10L25/57

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for audio-visual speech separation. A method includes: receiving, by a user device, a first indication of one or more first speakers visible in a current view recorded by a camera of the user device, in response, generating a respective isolated speech signal for each of the one or more first speakers that isolates speech of the first speaker in the current view and sending the isolated speech signals for each of the one or more first speakers to a listening device operatively coupled to the user device, receiving, by the user device, a second indication of one or more second speakers visible in the current view recorded by the camera of the user device, and in response generating and sending a respective isolated speech signal for each of the one or more second speakers to the listening device.

9.

Granted invention patent
Facial image editing and enhancement using a personalized prior (Active)

Publication (announcement) number: US11721007B2

Publication (announcement) date: 2023-08-08

Application number: US17982842

Application date: 2022-11-08

Applicant: Google LLC

Inventor: Kfir Aberman , Yotam Nitzan , Orly Liba , Yael Pritch Knaan , Qiurui He , Inbar Mosseri , Yossi Gandelsman , Michal Yarom

IPC: G06T5/00 , G06T5/50 , G06T3/40

CPC classification number: G06T5/50 , G06T3/40 , G06T5/001 , G06T2207/20081 , G06T2207/20084

Abstract: Systems and methods for identifying a personalized prior within a generative model's latent vector space based on a set of images of a given subject. In some examples, the present technology may further include using the personalized prior to confine the inputs of a generative model to a latent vector space associated with the given subject, such that when the model is tasked with editing an image of the subject (e.g., to perform inpainting to fill in masked areas, improve resolution, or deblur the image), the subject's identifying features will be reflected in the images the model produces.

10.

Invention publication
FACIAL IMAGE EDITING AND ENHANCEMENT USING A PERSONALIZED PRIOR (Pending, published)

Publication (announcement) number: US20230222636A1

Publication (announcement) date: 2023-07-13

Application number: US17982842

Application date: 2022-11-08

Applicant: Google LLC

Inventor: Kfir Aberman , Yotam Nitzan , Orly Liba , Yael Pritch Knaan , Qiurui He , Inbar Mosseri , Yossi Gandelsman , Michal Yarom

IPC: G06T5/50 , G06T3/40 , G06T5/00

CPC classification number: G06T5/50 , G06T3/40 , G06T5/001 , G06T2207/20081 , G06T2207/20084

Abstract: Systems and methods for identifying a personalized prior within a generative model's latent vector space based on a set of images of a given subject. In some examples, the present technology may further include using the personalized prior to confine the inputs of a generative model to a latent vector space associated with the given subject, such that when the model is tasked with editing an image of the subject (e.g., to perform inpainting to fill in masked areas, improve resolution, or deblur the image), the subject's identifying features will be reflected in the images the model produces.
