-
Publication No.: US11568879B2
Publication Date: 2023-01-31
Application No.: US17303928
Application Date: 2021-06-10
Applicant: Google LLC
Inventor: Dominik Roblek, Matthew Sharifi
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for verifying an identity of a user. The methods, systems, and apparatus include actions of receiving a request for a verification phrase for verifying an identity of a user. Additional actions include, in response to receiving the request for the verification phrase for verifying the identity of the user, identifying subwords to be included in the verification phrase and in response to identifying the subwords to be included in the verification phrase, obtaining a candidate phrase that includes at least some of the identified subwords as the verification phrase. Further actions include providing the verification phrase as a response to the request for the verification phrase for verifying the identity of the user.
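A minimal sketch of the selection step described in this abstract, under stated assumptions: the identified subwords and the pool of candidate phrases are given as plain strings, a subword counts as "included" when it appears as a substring, and the candidate covering the most subwords is returned. The function name `pick_verification_phrase` and the sample data are hypothetical and not taken from the patent.

```python
def pick_verification_phrase(subwords, candidates):
    """Return the candidate phrase that contains the most identified subwords.

    Hypothetical helper: substring matching stands in for whatever subword
    matching a real system would use.
    """
    if not candidates:
        return None

    def coverage(phrase):
        text = phrase.lower()
        return sum(1 for s in subwords if s.lower() in text)

    best = max(candidates, key=coverage)
    # A phrase that covers none of the subwords is not a useful answer.
    return best if coverage(best) > 0 else None


# Illustrative data only.
subwords = ["pea", "nut", "ter"]
candidates = ["hello world", "peanut butter", "open sesame"]
phrase = pick_verification_phrase(subwords, candidates)
```

Only "at least some" of the subwords need to appear in the chosen phrase, which is why the sketch maximizes coverage rather than requiring every subword to match.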
-
Publication No.: US11501787B2
Publication Date: 2022-11-15
Application No.: US16548146
Application Date: 2019-08-22
Applicant: Google LLC
Inventor: Beat Gfeller, Dominik Roblek, Félix de Chaumont Quitry, Marco Tagliasacchi
IPC: G10L19/035, G06N20/00, G10L19/038, G10L25/18
Abstract: Systems and methods for training a machine-learned model are provided. A method can include obtaining an unlabeled audio signal, sampling the unlabeled audio signal to select one or more sampled slices, inputting the one or more sampled slices into a machine-learned model, receiving, as an output of the machine-learned model, one or more determined characteristics associated with the audio signal, determining a loss function for the machine-learned model based at least in part on a difference between the one or more determined characteristics and one or more corresponding ground truth characteristics of the audio signal, and training the machine-learned model from end to end based at least in part on the loss function. The one or more determined characteristics can include one or more reconstructed portions of the audio signal temporally adjacent to the one or more sampled slices or an estimated distance between two sampled slices.
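The "estimated distance between two sampled slices" variant can be illustrated with a short sketch. It assumes the signal is a plain list of samples, that the ground truth is the normalized distance between the two slice start positions, and that the loss is a squared error; the names `sample_slice_pair` and `gap_loss` are hypothetical, and the model itself is left out.

```python
import random


def sample_slice_pair(signal, slice_len, rng):
    """Sample two slices from an unlabeled signal.

    The label comes for free from the sampling itself: it is the distance
    between the two start positions, normalized to [0, 1] (an assumption).
    """
    max_start = len(signal) - slice_len
    a = rng.randint(0, max_start)
    b = rng.randint(0, max_start)
    gap = abs(a - b) / max_start if max_start else 0.0
    return signal[a:a + slice_len], signal[b:b + slice_len], gap


def gap_loss(predicted_gap, true_gap):
    """Squared error between the model's estimate and the known gap."""
    return (predicted_gap - true_gap) ** 2


rng = random.Random(0)
signal = [float(i % 7) for i in range(100)]  # stand-in for audio samples
s1, s2, gap = sample_slice_pair(signal, 10, rng)
```

No manual labels are involved: the supervision signal is derived entirely from where the slices were cut.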
-
Publication No.: US11335380B2
Publication Date: 2022-05-17
Application No.: US17009934
Application Date: 2020-09-02
Applicant: Google LLC
Inventor: Yossi Matias, Matthew Sharifi, Thomas Bugnon, Dominik Roblek, Annie Chen
IPC: G11B27/031, G11B27/034, G11B27/10, G11B27/28, G06F16/44, G06K9/00, G11B27/30, H04N5/232, H04N5/04
Abstract: Systems and methods for media aggregation are disclosed herein. The system includes a media system that can transform media items into one aggregated media item. A synchronization component synchronizes media items with respect to time. The synchronized media items can be analyzed and transformed into an aggregated media item for storage and/or display. In one implementation, the aggregated media item is capable of being displayed in multiple ways to create an enhanced and customizable viewing and/or listening experience.
-
Publication No.: US11056120B2
Publication Date: 2021-07-06
Application No.: US16675420
Application Date: 2019-11-06
Applicant: Google LLC
Inventor: Dominik Roblek, Matthew Sharifi
Abstract: A method includes obtaining enrollment audio data representing a particular user speaking an enrollment phrase, and in response to receiving a request to verify an identity of an unverified user, prompting the unverified user to speak a verification utterance. The method also includes receiving verification audio data representing the unverified user speaking the verification utterance and determining whether the unverified user speaking the verification phrase includes the particular user who spoke the enrollment phrase based on the enrollment audio data and the verification audio data. The method also includes verifying the identity of the unverified user as the particular user.
-
Publication No.: US11003987B2
Publication Date: 2021-05-11
Application No.: US15151374
Application Date: 2016-05-10
Applicant: Google LLC
Inventor: Dominik Roblek, Matthew Sharifi
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for audio processing using neural networks. One of the systems includes multiple neural network layers, wherein the neural network system is configured to receive time domain features of an audio sample and to process the time domain features to generate a neural network output for the audio sample, the plurality of neural network layers comprising: a frequency-transform (F-T) layer that is configured to apply a transformation defined by a set of F-T layer parameters that transforms a window of time domain features into frequency domain features; and one or more other neural network layers having respective layer parameters, wherein the one or more neural network layers are configured to process frequency domain features to generate a neural network output.
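As a rough illustration of a frequency-transform (F-T) layer, the sketch below treats the layer as a matrix of parameters applied to a window of time domain features. Initializing that matrix to the discrete Fourier transform is an assumption made here for illustration; the abstract only says the transformation is defined by a set of F-T layer parameters. The class name `FTLayer` is hypothetical.

```python
import cmath


def dft_matrix(n):
    """n x n DFT matrix: entry (k, t) is exp(-2*pi*i*k*t / n)."""
    return [[cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)]
            for k in range(n)]


class FTLayer:
    """Linear layer mapping a window of time domain features to frequency
    domain features. The weights are ordinary layer parameters, so a
    training procedure could adjust them like any other layer."""

    def __init__(self, window):
        self.weights = dft_matrix(window)  # assumed initialization

    def forward(self, x):
        return [sum(w * v for w, v in zip(row, x)) for row in self.weights]


layer = FTLayer(4)
spectrum = layer.forward([1.0, 0.0, -1.0, 0.0])
```

With this initialization the layer reproduces the DFT of the window exactly; the remaining network layers would then consume `spectrum` as their frequency domain input.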
-
Publication No.: US20200226187A1
Publication Date: 2020-07-16
Application No.: US16241704
Application Date: 2019-01-07
Applicant: Google LLC
Inventor: Matthew Sharifi, Jorge Pereira, Dominik Roblek, Julian Odell, Cong Li, David Petrou
IPC: G06F16/9535, G06K9/32, G06F16/587, G06F16/907
Abstract: Systems and methods are provided for a personalized entity repository. For example, a computing device comprises a personalized entity repository having fixed sets of entities from an entity repository stored at a server, a processor, and memory storing instructions that cause the computing device to identify fixed sets of entities that are relevant to a user based on context associated with the computing device, rank the fixed sets by relevancy, and update the personalized entity repository using selected sets determined based on the rank and on set usage parameters applicable to the user. In another example, a method includes generating fixed sets of entities from an entity repository, including location-based sets and topic-based sets, and providing a subset of the fixed sets to a client, the client requesting the subset based on the client's location and on items identified in content generated for display on the client.
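The rank-and-select step can be sketched as follows. The sketch assumes each fixed set has a name and a size, that relevancy scores are already computed from device context, and that the only set usage parameter is a storage budget. All names (`select_sets`, `budget_bytes`, the sample sets) are hypothetical.

```python
def select_sets(fixed_sets, relevancy, budget_bytes):
    """Rank fixed sets by relevancy, then keep the highest-ranked sets
    that fit within the user's storage budget (greedy selection)."""
    ranked = sorted(fixed_sets,
                    key=lambda s: relevancy.get(s["name"], 0.0),
                    reverse=True)
    chosen, used = [], 0
    for s in ranked:
        if used + s["size"] <= budget_bytes:
            chosen.append(s["name"])
            used += s["size"]
    return chosen


# Illustrative data only: one location-based set and two topic-based sets.
fixed_sets = [
    {"name": "topic:sports", "size": 30},
    {"name": "location:zurich", "size": 40},
    {"name": "topic:music", "size": 50},
]
relevancy = {"location:zurich": 0.9, "topic:music": 0.7, "topic:sports": 0.4}
selected = select_sets(fixed_sets, relevancy, budget_bytes=100)
```

Because the sets are fixed on the server side, the client only has to decide which whole sets to hold, not which individual entities.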
-
Publication No.: US10199069B1
Publication Date: 2019-02-05
Application No.: US14842506
Application Date: 2015-09-01
Applicant: Google LLC
Inventor: Yossi Matias, Matthew Sharifi, Thomas Bugnon, Dominik Roblek, Annie Chen
IPC: G11B27/031, G06F17/30, G11B27/30, H04N5/232, G06K9/00
Abstract: Systems and methods for media aggregation are disclosed herein. The system includes a media system that can transform media items into one aggregated media item. A synchronization component synchronizes media items with respect to time. The synchronized media items can be analyzed and transformed into an aggregated media item for storage and/or display. In one implementation, the aggregated media item is capable of being displayed in multiple ways to create an enhanced and customizable viewing and/or listening experience.
-
Publication No.: US12136412B2
Publication Date: 2024-11-05
Application No.: US17662021
Application Date: 2022-05-04
Applicant: Google LLC
Inventor: Matthew Sharifi, Kevin Kilgour, Dominik Roblek, James Lin
Abstract: A method of training a custom hotword model includes receiving a first set of training audio samples. The method also includes generating, using a speech embedding model configured to receive the first set of training audio samples as input, a corresponding hotword embedding representative of a custom hotword for each training audio sample of the first set of training audio samples. The speech embedding model is pre-trained on a different set of training audio samples with a greater number of training audio samples than the first set of training audio samples. The method further includes training the custom hotword model to detect a presence of the custom hotword in audio data. The custom hotword model is configured to receive, as input, each corresponding hotword embedding and to classify, as output, each corresponding hotword embedding as corresponding to the custom hotword.
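A toy sketch of the second stage described above: a small classifier trained on hotword embeddings produced by a frozen, pre-trained speech embedding model. The embedding model is not implemented here; the 2-D vectors stand in for its output, and logistic regression is an assumed choice of classifier, not one named in the abstract. The function names are hypothetical.

```python
import math


def train_hotword_classifier(embeddings, labels, epochs=200, lr=0.5):
    """Fit a logistic-regression head on fixed embeddings with plain SGD."""
    dim = len(embeddings[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, y in zip(embeddings, labels):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-z))
            g = p - y  # gradient of the log loss with respect to z
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b


def is_hotword(w, b, embedding):
    """Classify an embedding as the custom hotword (True) or not (False)."""
    return sum(wi * xi for wi, xi in zip(w, embedding)) + b > 0.0


# Stand-in embeddings: two hotword samples (label 1), two other samples (label 0).
embeddings = [[1.0, 0.9], [0.9, 1.1], [-1.0, -0.8], [-0.9, -1.2]]
labels = [1, 1, 0, 0]
w, b = train_hotword_classifier(embeddings, labels)
```

The point of the two-stage design is that the large embedding model is trained once on a large corpus, so the custom head can be fit from only a handful of samples.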
-
Publication No.: US20230379678A1
Publication Date: 2023-11-23
Application No.: US18227751
Application Date: 2023-07-28
Applicant: GOOGLE LLC
Inventor: Matthew Sharifi, Jorge Pereira, Dominik Roblek, Julian Odell, Cong Li, David Petrou
IPC: H04W4/60, G06F16/248, G06F16/9535, G06F16/2457, H04W4/029, G06F16/907, G06F16/587, G06V20/62, H04L67/50, H04W4/18, G06F16/23
CPC classification number: H04W4/60, G06F16/248, G06F16/9535, G06F16/24578, H04W4/029, G06F16/907, G06F16/587, G06V20/62, H04L67/535, H04W4/18, G06F16/23, G06F16/235, G06F16/2358
Abstract: Systems and methods are provided for a personalized entity repository. For example, a computing device comprises a personalized entity repository having fixed sets of entities from an entity repository stored at a server, a processor, and memory storing instructions that cause the computing device to identify fixed sets of entities that are relevant to a user based on context associated with the computing device, rank the fixed sets by relevancy, and update the personalized entity repository using selected sets determined based on the rank and on set usage parameters applicable to the user. In another example, a method includes generating fixed sets of entities from an entity repository, including location-based sets and topic-based sets, and providing a subset of the fixed sets to a client, the client requesting the subset based on the client's location and on items identified in content generated for display on the client.
-
Publication No.: US11756530B2
Publication Date: 2023-09-12
Application No.: US17640579
Application Date: 2020-09-25
Applicant: GOOGLE LLC
Inventor: Marco Tagliasacchi, Mihajlo Velimirovic, Matthew Sharifi, Dominik Roblek, Christian Frank, Beat Gfeller
IPC: G10L15/06, G10L21/013, G10L25/30, G10L25/90
CPC classification number: G10L15/063, G10L21/013, G10L25/30, G10L25/90
Abstract: Example embodiments relate to techniques for training artificial neural networks or other machine-learning encoders to accurately predict the pitch of input audio samples in a semitone or otherwise logarithmically-scaled pitch space. An example method may include generating, from a sample of audio data, two training samples by applying two different pitch shifts to the sample of audio training data. This can be done by converting the sample of audio data into the frequency domain and then shifting the transformed data. These known shifts are then compared to the predicted pitches generated by applying the two training samples to the encoder. The encoder is then updated based on the comparison, such that the relative pitch output by the encoder is improved with respect to accuracy. One or more audio samples, labeled with absolute pitch values, can then be used to calibrate the relative pitch values generated by the trained encoder.
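The comparison between known shifts and predicted pitches can be sketched with a toy example. It assumes a single frequency stands in for an audio sample, that pitch is measured in semitones, and that the loss is the absolute difference between the predicted pitch difference and the known shift difference. `ideal_encoder` is a hypothetical stand-in for a trained encoder, used only to show that a perfect encoder yields zero loss.

```python
import math


def shift_frequency(freq_hz, semitones):
    """Shift a frequency by a number of semitones (12 per octave)."""
    return freq_hz * 2.0 ** (semitones / 12.0)


def ideal_encoder(freq_hz, ref_hz=440.0):
    """Stand-in encoder: pitch in semitones relative to an arbitrary reference."""
    return 12.0 * math.log2(freq_hz / ref_hz)


def relative_pitch_loss(pred1, pred2, shift1, shift2):
    """Penalize mismatch between predicted pitch difference and known shift difference."""
    return abs((pred1 - pred2) - (shift1 - shift2))


base = 220.0        # unlabeled sample: its absolute pitch is never used
k1, k2 = 3.0, -2.0  # two known pitch shifts, in semitones
p1 = ideal_encoder(shift_frequency(base, k1))
p2 = ideal_encoder(shift_frequency(base, k2))
loss = relative_pitch_loss(p1, p2, k1, k2)
```

The loss depends only on pitch differences, so the reference `ref_hz` cancels out. That is why a separate calibration step with a few absolutely labeled samples is still needed to fix the offset.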