Patent search ap:("GOOGLE LLC") AND inv:"Dominik Roblek" Page 1

1.

Patent application
Self-Supervised Audio Representation Learning for Mobile Devices (Status: in force)

Publication number: US20230085596A1

Publication date: 2023-03-16

Application number: US17986477

Filing date: 2022-11-14

Applicant: Google LLC

Inventors: Beat Gfeller, Dominik Roblek, Félix de Chaumont Quitry, Marco Tagliasacchi

IPC: G10L19/035, G06N20/00, G10L19/038, G10L25/18

Abstract: Systems and methods for training a machine-learned model are provided. A method can include obtaining an unlabeled audio signal, sampling the unlabeled audio signal to select one or more sampled slices, inputting the one or more sampled slices into a machine-learned model, receiving, as an output of the machine-learned model, one or more determined characteristics associated with the audio signal, determining a loss function for the machine-learned model based at least in part on a difference between the one or more determined characteristics and one or more corresponding ground truth characteristics of the audio signal, and training the machine-learned model from end to end based at least in part on the loss function. The one or more determined characteristics can include one or more reconstructed portions of the audio signal temporally adjacent to the one or more sampled slices or an estimated distance between two sampled slices.
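
The "estimated distance between two sampled slices" task in the abstract above can be illustrated with a minimal sketch. It is not the patented implementation: the function names `sample_slice_pair` and `squared_error`, the non-overlap constraint, and the squared-error loss are all assumptions made for illustration. The point it shows is that the ground-truth target (the gap between slices) comes for free from the sampling step, so no labels are needed.

```python
import random


def sample_slice_pair(signal, slice_len, rng):
    """Pick two non-overlapping slices and return them with their temporal gap.

    Hypothetical helper: the gap (in samples) serves as the ground-truth
    characteristic that a model would be trained to estimate.
    """
    n = len(signal)
    if n < 2 * slice_len:
        raise ValueError("signal too short for two non-overlapping slices")
    # Leave room after the first slice for a full second slice.
    first = rng.randrange(0, n - 2 * slice_len + 1)
    second = rng.randrange(first + slice_len, n - slice_len + 1)
    gap = second - (first + slice_len)
    return signal[first:first + slice_len], signal[second:second + slice_len], gap


def squared_error(predicted_gap, true_gap):
    """Assumed loss: squared difference between estimate and ground truth."""
    return (predicted_gap - true_gap) ** 2


# Usage: a toy "signal" whose values equal their own indices.
slice_a, slice_b, true_gap = sample_slice_pair(list(range(100)), 10, random.Random(0))
loss = squared_error(predicted_gap=0, true_gap=true_gap)
```

In a real training loop the prediction would come from the model applied to the two slices; here it is a placeholder value.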

2.

Patent application
GENERATING AUDIO WAVEFORMS USING ENCODER AND DECODER NEURAL NETWORKS (Status: in force)

Publication number: US20230013370A1

Publication date: 2023-01-19

Application number: US17856292

Filing date: 2022-07-01

Applicant: Google LLC

Inventors: Yunpeng Li, Marco Tagliasacchi, Dominik Roblek, Félix de Chaumont Quitry, Beat Gfeller, Hannah Raphaelle Muckenhirn, Victor Ungureanu, Oleg Rybakov, Karolis Misiunas, Zalán Borsos

IPC: G10L19/022, G06N3/04

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for processing an input audio waveform using a generator neural network to generate an output audio waveform. In one aspect, a method comprises: receiving an input audio waveform; processing the input audio waveform using an encoder neural network to generate a set of feature vectors representing the input audio waveform; and processing the set of feature vectors representing the input audio waveform using a decoder neural network to generate an output audio waveform that comprises a respective output audio sample for each of a plurality of output time steps.

3.

Granted patent
Aggregation of related media content (Status: in force)

Publication number: US10770112B2

Publication date: 2020-09-08

Application number: US16266522

Filing date: 2019-02-04

Applicant: Google LLC

Inventors: Yossi Matias, Matthew Sharifi, Thomas Bugnon, Dominik Roblek, Annie Chen

IPC: G11B27/031, G11B27/034, G11B27/10, G11B27/28, G06F16/44, G06K9/00, G11B27/30, H04N5/232, H04N5/04

Abstract: Systems and methods for media aggregation are disclosed herein. The system includes a media system that can transform media items into one aggregated media item. A synchronization component synchronizes media items with respect to time. The synchronized media items can be analyzed and transformed into an aggregated media item for storage and/or display. In one implementation, the aggregated media item is capable of being displayed in multiple ways to create an enhanced and customizable viewing and/or listening experience.

4.

Granted patent
Speaker identification using a text-independent model and a text-dependent model (Status: in force)

Publication number: US10255922B1

Publication date: 2019-04-09

Application number: US15191892

Filing date: 2016-06-24

Applicant: Google LLC

Inventors: Matthew Sharifi, Dominik Roblek

IPC: G10L17/24, G10L17/04, G10L15/02, G10L15/22

Abstract: In some implementations, a single registration utterance that includes a hotword and an introduction declaration is received. A user is registered, including training a text-dependent speaker identification model using the hotword of the single registration utterance and training a text-independent speaker identification model using the introduction declaration of the single registration utterance. An authentication utterance by the user that includes the hotword and a voice command that is different from the introduction declaration is received. The user is authenticated, including processing the hotword of the authentication utterance using the text-dependent speaker identification model and processing the voice command using the text-independent speaker identification model. Access to an access-controlled personal resource of the user is provided without requiring the user to submit any further authentication information other than the single registration utterance by the user that includes the hotword and the introduction declaration to the speech-enabled home device.

5.

Patent application
PERSONALIZED ENTITY REPOSITORY (Status: in force)

Publication number: US20250024237A1

Publication date: 2025-01-16

Application number: US18900067

Filing date: 2024-09-27

Applicant: GOOGLE LLC

Inventors: Matthew Sharifi, Jorge Pereira, Dominik Roblek, Julian Odell, Cong Li, David Petrou

IPC: H04W4/60, G06F16/23, G06F16/2457, G06F16/248, G06F16/587, G06F16/907, G06F16/9535, G06V20/62, H04L67/50, H04W4/029, H04W4/18

Abstract: Systems and methods are provided for a personalized entity repository. For example, a computing device comprises a personalized entity repository having fixed sets of entities from an entity repository stored at a server, a processor, and memory storing instructions that cause the computing device to identify fixed sets of entities that are relevant to a user based on context associated with the computing device, rank the fixed sets by relevancy, and update the personalized entity repository using selected sets determined based on the rank and on set usage parameters applicable to the user. In another example, a method includes generating fixed sets of entities from an entity repository, including location-based sets and topic-based sets, and providing a subset of the fixed sets to a client, the client requesting the subset based on the client's location and on items identified in content generated for display on the client.

6.

Granted patent
Generating audio waveforms using encoder and decoder neural networks (Status: in force)

Publication number: US12190896B2

Publication date: 2025-01-07

Application number: US17856292

Filing date: 2022-07-01

Applicant: Google LLC

Inventors: Yunpeng Li, Marco Tagliasacchi, Dominik Roblek, Félix de Chaumont Quitry, Beat Gfeller, Hannah Raphaelle Muckenhirn, Victor Ungureanu, Oleg Rybakov, Karolis Misiunas, Zalán Borsos

IPC: G10L19/022, G06N3/045

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for processing an input audio waveform using a generator neural network to generate an output audio waveform. In one aspect, a method comprises: receiving an input audio waveform; processing the input audio waveform using an encoder neural network to generate a set of feature vectors representing the input audio waveform; and processing the set of feature vectors representing the input audio waveform using a decoder neural network to generate an output audio waveform that comprises a respective output audio sample for each of a plurality of output time steps.

7.

Granted patent
Compressing audio waveforms using neural networks and vector quantizers (Status: in force)

Publication number: US11990148B2

Publication date: 2024-05-21

Application number: US18106094

Filing date: 2023-02-06

Applicant: Google LLC

Inventors: Neil Zeghidour, Marco Tagliasacchi, Dominik Roblek

IPC: G10L19/038, G06N3/045, G06N3/08, G10L19/00, G10L25/30

CPC classification number: G10L19/038, G06N3/045, G06N3/08, G10L25/30, G10L2019/0002

Abstract: Methods, systems and apparatus, including computer programs encoded on computer storage media. One of the methods includes receiving an audio waveform that includes a respective audio sample for each of a plurality of time steps, processing the audio waveform using an encoder neural network to generate a plurality of feature vectors representing the audio waveform, generating a respective coded representation of each of the plurality of feature vectors using a plurality of vector quantizers that are each associated with a respective codebook of code vectors, wherein the respective coded representation of each feature vector identifies a plurality of code vectors, including a respective code vector from the codebook of each vector quantizer, that define a quantized representation of the feature vector, and generating a compressed representation of the audio waveform by compressing the respective coded representation of each of the plurality of feature vectors.
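
The abstract above describes coding each feature vector with one code vector from each of several codebooks. It does not say how the quantizers are combined; the sketch below assumes a residual scheme, in which each quantizer codes what the previous ones left over and the quantized vector is the sum of the chosen code vectors. The names `nearest_index` and `residual_quantize` and the toy codebooks are hypothetical, and the final compression of the indices (for example by entropy coding) is not shown.

```python
def nearest_index(codebook, vector):
    """Index of the code vector closest to `vector` (squared Euclidean distance)."""
    def sq_dist(code):
        return sum((c - v) ** 2 for c, v in zip(code, vector))
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i]))


def residual_quantize(vector, codebooks):
    """Assumed residual scheme: one index per codebook, quantized vector = sum of codes."""
    residual = list(vector)
    quantized = [0.0] * len(vector)
    indices = []
    for codebook in codebooks:
        idx = nearest_index(codebook, residual)
        code = codebook[idx]
        indices.append(idx)
        quantized = [q + c for q, c in zip(quantized, code)]
        # The next quantizer only sees what is still unexplained.
        residual = [r - c for r, c in zip(residual, code)]
    return indices, quantized


# Usage with two toy codebooks (coarse, then fine).
codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[0.0, 0.0], [0.25, -0.25]],
]
indices, quantized = residual_quantize([1.2, 0.8], codebooks)
# indices is [1, 1]; quantized is [1.25, 0.75]
```

The coded representation of the feature vector is the list of indices, which is what a later compression step would operate on.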

8.

Patent application
AUTOMATED MINING OF REAL-WORLD AUDIO TRAINING DATA (Status: in force)

Publication number: US20230033103A1

Publication date: 2023-02-02

Application number: US17769624

Filing date: 2019-11-18

Applicant: Google LLC

Inventor: Dominik Roblek

IPC: G10L15/06, G10L15/08, H04R1/40, H04R3/00, G10L15/02

Abstract: Methods, systems, and apparatus for generating labeled training examples for machine learning. In one aspect, a method includes receiving sets of audio recordings by a user device. For each set of audio recordings, each audio recording in the set is recorded over a respective separate microphone in the user device during a particular time interval, and each particular time interval is different for each set of audio recordings. For each set of audio recordings, a detector determines whether an audio recording in the set of audio recordings includes a particular audio feature, and whether another one of the audio recordings does not include the particular audio feature. For each set of audio recordings determined to include an audio recording that includes the particular audio feature and to include another audio recording that does not include the particular audio feature, a labeled training example is generated.
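
The selection rule in the abstract above (keep only sets where the detector fires on one microphone's recording but not on another's) can be sketched as follows. The abstract does not say what the generated example contains, so this sketch makes a loud assumption: each recording the detector missed is emitted with a positive label, on the reasoning that another microphone captured the feature during the same interval. The function name `mine_training_examples` and the toy threshold detector are hypothetical.

```python
def mine_training_examples(recording_sets, detector):
    """Return (recording, label) pairs from sets where the detector disagrees.

    Assumption (not stated in the abstract): recordings the detector missed
    are labeled True because a sibling recording from the same time interval
    was detected as containing the feature.
    """
    examples = []
    for recordings in recording_sets:
        hits = [detector(recording) for recording in recordings]
        # Keep the set only if at least one recording hit and at least one missed.
        if any(hits) and not all(hits):
            for recording, hit in zip(recordings, hits):
                if not hit:
                    examples.append((recording, True))
    return examples


# Usage with a toy detector that fires when any sample exceeds 0.5.
def toy_detector(recording):
    return max(recording) > 0.5


sets = [
    [[0.9, 0.1], [0.3, 0.2]],  # one microphone hits, one misses: kept
    [[0.1], [0.2]],            # no microphone hits: skipped
    [[0.8], [0.9]],            # every microphone hits: skipped
]
examples = mine_training_examples(sets, toy_detector)
# examples is [([0.3, 0.2], True)]
```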

9.

Patent application
COMPRESSING AUDIO WAVEFORMS USING NEURAL NETWORKS AND VECTOR QUANTIZERS (Status: in force)

Publication number: US20230019128A1

Publication date: 2023-01-19

Application number: US17856856

Filing date: 2022-07-01

Applicant: Google LLC

Inventors: Neil Zeghidour, Marco Tagliasacchi, Dominik Roblek

IPC: G10L19/038, G10L25/30, G06N3/04, G06N3/08

Abstract: Methods, systems and apparatus, including computer programs encoded on computer storage media. One of the methods includes receiving an audio waveform that includes a respective audio sample for each of a plurality of time steps, processing the audio waveform using an encoder neural network to generate a plurality of feature vectors representing the audio waveform, generating a respective coded representation of each of the plurality of feature vectors using a plurality of vector quantizers that are each associated with a respective codebook of code vectors, wherein the respective coded representation of each feature vector identifies a plurality of code vectors, including a respective code vector from the codebook of each vector quantizer, that define a quantized representation of the feature vector, and generating a compressed representation of the audio waveform by compressing the respective coded representation of each of the plurality of feature vectors.

10.

Patent application
Training Keyword Spotters (Status: in force)

Publication number: US20220262345A1

Publication date: 2022-08-18

Application number: US17662021

Filing date: 2022-05-04

Applicant: Google LLC

Inventors: Matthew Sharifi, Kevin Kilgour, Dominik Roblek, James Lin

IPC: G10L15/06, G06N3/04, G06N3/08, G10L13/00, G10L15/16, G10L15/22

Abstract: A method of training a custom hotword model includes receiving a first set of training audio samples. The method also includes generating, using a speech embedding model configured to receive the first set of training audio samples as input, a corresponding hotword embedding representative of a custom hotword for each training audio sample of the first set of training audio samples. The speech embedding model is pre-trained on a different set of training audio samples with a greater number of training audio samples than the first set of training audio samples. The method further includes training the custom hotword model to detect a presence of the custom hotword in audio data. The custom hotword model is configured to receive, as input, each corresponding hotword embedding and to classify, as output, each corresponding hotword embedding as corresponding to the custom hotword.
