Patent search ap:("GOOGLE LLC") AND inv:"Dominik Roblek" Page 4

31.

Granted invention patent
Compressing audio waveforms using neural networks and vector quantizers (In force)

Publication (announcement) number: US11600282B2

Publication (announcement) date: 2023-03-07

Application number: US17856856

Application date: 2022-07-01

Applicant: Google LLC

Inventor: Neil Zeghidour, Marco Tagliasacchi, Dominik Roblek

IPC: G10L19/038, G10L25/30, G10L19/00, G06N3/08, G06N3/04

Abstract: Methods, systems and apparatus, including computer programs encoded on computer storage media. One of the methods includes receiving an audio waveform that includes a respective audio sample for each of a plurality of time steps, processing the audio waveform using an encoder neural network to generate a plurality of feature vectors representing the audio waveform, generating a respective coded representation of each of the plurality of feature vectors using a plurality of vector quantizers that are each associated with a respective codebook of code vectors, wherein the respective coded representation of each feature vector identifies a plurality of code vectors, including a respective code vector from the codebook of each vector quantizer, that define a quantized representation of the feature vector, and generating a compressed representation of the audio waveform by compressing the respective coded representation of each of the plurality of feature vectors.
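The coding step in this abstract, one code vector chosen from each quantizer's codebook, can be illustrated with a minimal sketch. It assumes a residual scheme in which each quantizer codes what the previous ones left over and the quantized representation is the sum of the chosen code vectors; the abstract itself does not state how the code vectors are combined, and the function names and toy codebooks are hypothetical.

```python
from typing import List, Sequence

Vector = Sequence[float]


def nearest_index(vec: Vector, codebook: Sequence[Vector]) -> int:
    """Index of the code vector closest to vec (squared Euclidean distance)."""
    def dist(code: Vector) -> float:
        return sum((a - b) ** 2 for a, b in zip(vec, code))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))


def encode(feature: Vector, codebooks: Sequence[Sequence[Vector]]) -> List[int]:
    """Pick one code index per codebook.

    Assumption: each stage quantizes the residual left by earlier stages.
    """
    residual = list(feature)
    indices = []
    for codebook in codebooks:
        i = nearest_index(residual, codebook)
        indices.append(i)
        residual = [r - c for r, c in zip(residual, codebook[i])]
    return indices


def decode(indices: Sequence[int], codebooks: Sequence[Sequence[Vector]]) -> List[float]:
    """Rebuild the quantized feature vector as the sum of the chosen code vectors."""
    out = [0.0] * len(codebooks[0][0])
    for i, codebook in zip(indices, codebooks):
        out = [o + c for o, c in zip(out, codebook[i])]
    return out
```

The list of indices returned by `encode` plays the role of the coded representation of one feature vector; a further compression stage, not shown here, would operate on those indices.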

32.

Invention application
MULTI-TASK ADAPTER NEURAL NETWORKS (In force)

Publication (announcement) number: US20220383112A1

Publication (announcement) date: 2022-12-01

Application number: US17764005

Application date: 2020-09-23

Applicant: Google LLC

Inventor: Marco Tagliasacchi, Félix de Chaumont Quitry, Dominik Roblek

IPC: G06N3/08, G06N3/04, G10L25/30

Abstract: A system including a multi-task adapter neural network for performing multiple machine learning tasks is described. The adapter neural network is configured to receive a shared input for the machine learning tasks, and process the shared input to generate, for each of the machine learning tasks, a respective predicted output. The adapter neural network includes (i) a shared encoder configured to receive the shared input and to process the shared input to extract shared feature representations for the machine learning tasks, and (ii) multiple task-adapter encoders, each of the task-adapter encoders being associated with a respective machine learning task in the machine learning tasks and configured to: receive the shared input, receive the shared feature representations from the shared encoder, and process the shared input and the shared feature representations to generate the respective predicted output for the respective machine learning task.

33.

Invention application
AGGREGATION OF RELATED MEDIA CONTENT (In force)

Publication (announcement) number: US20220277773A1

Publication (announcement) date: 2022-09-01

Application number: US17745252

Application date: 2022-05-16

Applicant: Google LLC

Inventor: Yossi Matias, Matthew Sharifi, Thomas Bugnon, Dominik Roblek, Annie Chen

IPC: G11B27/031, G11B27/034, G11B27/10, G11B27/28, G06V20/40, G06F16/44, G11B27/30, H04N5/232

Abstract: Systems and methods for media aggregation are disclosed herein. The system includes a media system that can transform media items into one aggregated media item. A synchronization component synchronizes media items with respect to time. The synchronized media items can be analyzed and transformed into an aggregated media item for storage and/or display. In one implementation, the aggregated media item is capable of being displayed in multiple ways to create an enhanced and customizable viewing and/or listening experience.

34.

Invention application
SEGMENT-BASED SPEAKER VERIFICATION USING DYNAMICALLY GENERATED PHRASES (In force)

Publication (announcement) number: US20210295850A1

Publication (announcement) date: 2021-09-23

Application number: US17303928

Application date: 2021-06-10

Applicant: Google LLC

Inventor: Dominik Roblek, Matthew Sharifi

IPC: G10L17/24, G10L17/04, G10L15/02

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for verifying an identity of a user. The methods, systems, and apparatus include actions of receiving a request for a verification phrase for verifying an identity of a user. Additional actions include, in response to receiving the request for the verification phrase for verifying the identity of the user, identifying subwords to be included in the verification phrase and in response to identifying the subwords to be included in the verification phrase, obtaining a candidate phrase that includes at least some of the identified subwords as the verification phrase. Further actions include providing the verification phrase as a response to the request for the verification phrase for verifying the identity of the user.

35.

Invention application
Training Keyword Spotters (In force)

Publication (announcement) number: US20210183367A1

Publication (announcement) date: 2021-06-17

Application number: US16717518

Application date: 2019-12-17

Applicant: Google LLC

Inventor: Matthew Sharifi, Kevin Kilgour, Dominik Roblek, James Lin

IPC: G10L15/06, G10L15/22, G10L15/16, G10L13/04, G06N3/04, G06N3/08

Abstract: A method of training a custom hotword model includes receiving a first set of training audio samples. The method also includes generating, using a speech embedding model configured to receive the first set of training audio samples as input, a corresponding hotword embedding representative of a custom hotword for each training audio sample of the first set of training audio samples. The speech embedding model is pre-trained on a different set of training audio samples with a greater number of training audio samples than the first set of training audio samples. The method further includes training the custom hotword model to detect a presence of the custom hotword in audio data. The custom hotword model is configured to receive, as input, each corresponding hotword embedding and to classify, as output, each corresponding hotword embedding as corresponding to the custom hotword.

36.

Invention application
Self-Supervised Audio Representation Learning for Mobile Devices (In force)

Publication (announcement) number: US20210056980A1

Publication (announcement) date: 2021-02-25

Application number: US16548146

Application date: 2019-08-22

Applicant: Google LLC

Inventor: Beat Gfeller, Dominik Roblek, Félix de Chaumont Quitry, Marco Tagliasacchi

IPC: G10L19/035, G10L25/18, G10L19/038, G06N20/00

Abstract: Systems and methods for training a machine-learned model are provided. A method can include obtaining an unlabeled audio signal, sampling the unlabeled audio signal to select one or more sampled slices, inputting the one or more sampled slices into a machine-learned model, receiving, as an output of the machine-learned model, one or more determined characteristics associated with the audio signal, determining a loss function for the machine-learned model based at least in part on a difference between the one or more determined characteristics and one or more corresponding ground truth characteristics of the audio signal, and training the machine-learned model from end to end based at least in part on the loss function. The one or more determined characteristics can include one or more reconstructed portions of the audio signal temporally adjacent to the one or more sampled slices or an estimated distance between two sampled slices.
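The "estimated distance between two sampled slices" task needs no labels because the ground truth comes from the sampling itself. The sketch below shows only that data-preparation step, under the assumption that distance is measured in samples between slice start positions; the function name and parameters are hypothetical, and the model and loss are omitted.

```python
import random
from typing import List, Sequence, Tuple


def sample_slice_pair(
    signal: Sequence[float], slice_len: int, rng: random.Random
) -> Tuple[List[float], List[float], int]:
    """Draw two slices from an unlabeled signal and return them with their gap.

    The gap (in samples, between slice starts) is the self-supervised target
    a model would be trained to estimate. Slices are allowed to overlap here.
    """
    if len(signal) <= slice_len:
        raise ValueError("signal must be longer than one slice")
    max_start = len(signal) - slice_len
    # Two distinct start positions, ordered so the gap is positive.
    start_a, start_b = sorted(rng.sample(range(max_start + 1), 2))
    slice_a = list(signal[start_a:start_a + slice_len])
    slice_b = list(signal[start_b:start_b + slice_len])
    return slice_a, slice_b, start_b - start_a
```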

37.

Granted invention patent
Object detection using neural network systems (In force)

Publication (announcement) number: US10467493B2

Publication (announcement) date: 2019-11-05

Application number: US15650790

Application date: 2017-07-14

Applicant: Google LLC

Inventor: Dominik Roblek, Christian Szegedy, Jacek Slawosz Jurewicz

IPC: G06K9/32, G06T7/11, G06K9/46, G06K9/66, G06N3/08, G06K9/62

Abstract: Systems, methods, and apparatus, including computer programs encoded on a computer storage medium. In one aspect, a system includes initial neural network layers configured to: receive an input image, and process the input image to generate a plurality of first feature maps that characterize the input image; a location generating convolutional neural network layer configured to perform a convolution on the representation of the first plurality of feature maps to generate data defining a respective location of each of a predetermined number of bounding boxes in the input image, wherein each bounding box identifies a respective first region of the input image; and a confidence score generating convolutional neural network layer configured to perform a convolution on the representation of the first plurality of feature maps to generate a confidence score for each of the predetermined number of bounding boxes in the input image.

38.

Granted invention patent
Frequency based audio analysis using neural networks (In force)

Publication (announcement) number: US10460747B2

Publication (announcement) date: 2019-10-29

Application number: US15151362

Application date: 2016-05-10

Applicant: Google LLC

Inventor: Dominik Roblek, Matthew Sharifi

IPC: G06N3/04, G06N3/08, G10L25/30, G06F11/07

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for frequency based audio analysis using neural networks. One of the methods includes training a neural network that includes a plurality of neural network layers on training data, wherein the neural network is configured to receive frequency domain features of an audio sample and to process the frequency domain features to generate a neural network output for the audio sample, wherein the neural network comprises (i) a convolutional layer that is configured to map frequency domain features to logarithmic scaled frequency domain features, wherein the convolutional layer comprises one or more convolutional layer filters, and (ii) one or more other neural network layers having respective layer parameters that are configured to process the logarithmic scaled frequency domain features to generate the neural network output.

39.

Granted invention patent
Audio data classification (In force)

Publication (announcement) number: US10424321B1

Publication (announcement) date: 2019-09-24

Application number: US13932158

Application date: 2013-07-01

Applicant: Google LLC

Inventor: Matthew Sharifi, Dominik Roblek

IPC: G10L25/48, G06F17/00, G10H1/00, G10L25/81

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for analyzing an audio sample to determine whether the audio sample includes music audio data. One or more detectors, including a spectral fluctuation detector, a peak repetition detector, and a beat pitch detector, may analyze the audio sample and generate a score that represents whether the audio sample includes music audio data. One or more of the scores may be combined to determine whether the audio sample includes music audio data or non-music audio data.
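The abstract says detector scores "may be combined" but does not say how. One simple possibility, shown purely as an assumption, is a weighted average compared against a threshold; the detector keys, weights, and threshold below are hypothetical.

```python
from typing import Mapping


def combine_scores(scores: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Weighted average of whichever detector scores are available."""
    total_weight = sum(weights[name] for name in scores)
    if total_weight == 0:
        raise ValueError("at least one detector must have a non-zero weight")
    return sum(scores[name] * weights[name] for name in scores) / total_weight


def is_music(
    scores: Mapping[str, float],
    weights: Mapping[str, float],
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the patent
) -> bool:
    """Label the sample as music when the combined score reaches the threshold."""
    return combine_scores(scores, weights) >= threshold
```

Because `combine_scores` iterates over the scores that are present, a sample for which only one or two detectors produced a score can still be classified.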

40.

Granted invention patent
Segment content displayed on a computing device into regions based on pixels of a screenshot image that captures the content (In force)

Publication (announcement) number: US10147197B2

Publication (announcement) date: 2018-12-04

Application number: US15839797

Application date: 2017-12-12

Applicant: Google LLC

Inventor: Dominik Roblek, David Petrou, Matthew Sharifi

IPC: G06T7/30, G06F17/30, G06T7/90, G06F3/0484, G06F3/0488

Abstract: Methods and apparatus directed to segmenting content displayed on a computing device into regions. The segmenting of content displayed on the computing device into regions is accomplished via analysis of pixels of a “screenshot image” that captures at least a portion of (e.g., all of) the displayed content. Individual pixels of the screenshot image may be analyzed to determine one or more regions of the screenshot image and to optionally assign a corresponding semantic type to each of the regions. Some implementations are further directed to generating, based on one or more of the regions, interactive content to provide for presentation to the user via the computing device.
