Patent search: ap:("Google LLC") AND inv:"Diego Melendo Casado" (page 5)

41.

Patent application
AUTOMATICALLY DETERMINING LANGUAGE FOR SPEECH RECOGNITION OF SPOKEN UTTERANCE RECEIVED VIA AN AUTOMATED ASSISTANT INTERFACE (In force)

Publication No.: US20210280177A1

Publication date: 2021-09-09

Application No.: US17328400

Filing date: 2021-05-24

Applicant: Google LLC

Inventors: Pu-sen Chao, Diego Melendo Casado, Ignacio Lopez Moreno, William Zhang

IPC classification: G10L15/197, G10L15/00, G10L15/22, G10L15/30, G10L15/08, G10L15/14, G10L15/18, G10L13/00

Abstract: Determining a language for speech recognition of a spoken utterance received via an automated assistant interface for interacting with an automated assistant. Implementations can enable multilingual interaction with the automated assistant, without requiring a user to explicitly designate a language to be utilized for each interaction. Implementations determine a user profile that corresponds to audio data that captures a spoken utterance, and utilize language(s), and optionally corresponding probabilities, assigned to the user profile in determining a language for speech recognition of the spoken utterance. Some implementations select only a subset of languages, assigned to the user profile, to utilize in speech recognition of a given spoken utterance of the user. Some implementations perform speech recognition in each of multiple languages assigned to the user profile, and utilize criteria to select only one of the speech recognitions as appropriate for generating and providing content that is responsive to the spoken utterance.
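
The selection flow in this abstract can be illustrated with a minimal Python sketch. Everything named here is hypothetical: `UserProfile`, `candidate_languages`, `select_recognition`, and the `recognize(audio, lang)` callback are stand-ins, and the scoring rule (recognizer confidence weighted by the profile probability) is an assumption, since the abstract only says that "criteria" are used.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class UserProfile:
    # Hypothetical structure: language code -> probability assigned to the profile.
    languages: Dict[str, float] = field(default_factory=dict)


def candidate_languages(profile: UserProfile, max_candidates: int = 2) -> List[str]:
    """Select only a subset of the profile's languages, highest probability first."""
    ranked = sorted(profile.languages.items(), key=lambda kv: kv[1], reverse=True)
    return [lang for lang, _ in ranked[:max_candidates]]


def select_recognition(
    profile: UserProfile,
    recognize: Callable[[bytes, str], Tuple[str, float]],
    audio: bytes,
    max_candidates: int = 2,
) -> Tuple[str, str]:
    """Recognize in each candidate language and keep exactly one result.

    `recognize` is a stand-in returning (transcript, confidence). The score
    used here is an assumed criterion, not one stated in the abstract.
    """
    best = None
    for lang in candidate_languages(profile, max_candidates):
        transcript, confidence = recognize(audio, lang)
        score = confidence * profile.languages[lang]
        if best is None or score > best[0]:
            best = (score, lang, transcript)
    if best is None:
        raise ValueError("profile has no languages assigned")
    return best[1], best[2]
```

With this weighting, a low-probability language can still win when its recognizer is much more confident, which is one plausible way to support switching languages without an explicit setting.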

42.

Granted patent
Modality learning on mobile devices (In force)

Publication No.: US10831366B2

Publication date: 2020-11-10

Application No.: US15393676

Filing date: 2016-12-29

Applicant: Google LLC

Inventors: Yu Ouyang, Diego Melendo Casado, Mohammadinamul Hasan Sheik, Francoise Beaufays, Dragan Zivkovic, Meltem Oktem

IPC classification: G06F3/0488, G06F3/16, G06F1/16, G06F3/023, G06F40/166, G06F40/289, G10L15/22

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for cross input modality learning in a mobile device are disclosed. In one aspect, a method includes activating a first modality user input mode in which user inputs by way of a first modality are recognized using a first modality recognizer; and receiving a user input by way of the first modality. The method includes obtaining, as a result of the first modality recognizer recognizing the user input, a transcription that includes a particular term; and generating an input context data structure that references at least the particular term. The method further includes transmitting, by the first modality recognizer, the input context data structure to a second modality recognizer for use in updating a second modality recognition model associated with the second modality recognizer.
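
As a rough illustration of the hand-off described in this abstract, the Python sketch below passes an input context structure from one recognizer to another. The class names, the word-count "model", and the rule for which term counts as the "particular term" (any word missing from a known-terms set) are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass(frozen=True)
class InputContext:
    """Hypothetical input context data structure referencing the particular term."""
    term: str
    modality: str


class KeyboardRecognizer:
    """Stand-in second modality recognizer; its model is a simple term counter."""

    def __init__(self) -> None:
        self.vocabulary: Dict[str, int] = {}

    def update_model(self, context: InputContext) -> None:
        self.vocabulary[context.term] = self.vocabulary.get(context.term, 0) + 1


class VoiceRecognizer:
    """Stand-in first modality recognizer that forwards contexts to a peer."""

    def __init__(self, peer: KeyboardRecognizer) -> None:
        self.peer = peer

    def recognize(self, transcription: str, known_terms: Set[str]) -> str:
        # A real recognizer would decode audio; here the transcription is given.
        for term in transcription.split():
            if term not in known_terms:  # assumed rule for "particular term"
                self.peer.update_model(InputContext(term=term, modality="voice"))
        return transcription
```

After `VoiceRecognizer(kb).recognize("call Milpitas now", {"call", "now"})`, the keyboard side has learned the one unfamiliar word and could, for example, stop autocorrecting it.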

43.

Patent application
ADAPTIVE INTERFACE IN A VOICE-BASED NETWORKED SYSTEM (Pending, published)

Publication No.: US20190318724A1

Publication date: 2019-10-17

Application No.: US15973466

Filing date: 2018-05-07

Applicant: Google LLC

Inventors: Pu-sen Chao, Diego Melendo Casado, Ignacio Lopez Moreno

IPC classification: G10L15/14, G10L15/02, G10L15/18

Abstract: The present disclosure relates generally to determining a language for speech recognition of a spoken utterance, received via an automated assistant interface, for interacting with an automated assistant. The system can enable multilingual interaction with the automated assistant, without requiring a user to explicitly designate a language to be utilized for each interaction. Selection of a speech recognition model for a particular language can be based on one or more interaction characteristics exhibited during a dialog session between a user and an automated assistant. Such interaction characteristics can include anticipated user input types, anticipated user input durations, a duration for monitoring for a user response, and/or an actual duration of a provided user response.

44.

Patent application
MULTI-USER AUTHENTICATION ON A DEVICE (Pending, published)

Publication No.: US20180308472A1

Publication date: 2018-10-25

Application No.: US15956493

Filing date: 2018-04-18

Applicant: Google LLC

Inventors: Ignacio Lopez Moreno, Diego Melendo Casado

IPC classification: G10L15/08, G10L17/06, G06F21/32

CPC classification: G10L17/06, G06F17/30764, G06F21/32, G06K9/00362, G10L15/07, G10L15/08, G10L15/22, G10L15/265, G10L17/005

Abstract: In some implementations, an utterance is determined to include a particular user speaking a hotword based at least on a first set of samples of the particular user speaking the hotword. In response to determining that an utterance includes a particular user speaking a hotword based at least on a first set of samples of the particular user speaking the hotword, at least a portion of the utterance is stored as a new sample. A second set of samples of the particular user speaking the utterance is obtained, where the second set of samples includes the new sample and less than all the samples in the first set of samples. A second utterance is determined to include the particular user speaking the hotword based at least on the second set of samples of the user speaking the hotword.
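
One simple way to picture the rolling sample set is a fixed-size queue, sketched below in Python. The `HotwordSamples` class and the `matches` callback are hypothetical, and dropping the oldest sample is an assumed policy: the abstract only requires that the second set contain the new sample and fewer than all of the first set.

```python
from collections import deque
from typing import Callable, Iterable, List


class HotwordSamples:
    """Hypothetical store of a user's hotword samples with a fixed capacity."""

    def __init__(self, initial_samples: Iterable[str], max_samples: int) -> None:
        # A deque with maxlen discards the oldest item when a new one is appended
        # at full capacity (assumed eviction policy).
        self.samples = deque(initial_samples, maxlen=max_samples)

    def accept(self, utterance: str, matches: Callable[[str, List[str]], bool]) -> bool:
        """Verify the utterance against the current set; on success, store it.

        `matches` is a stand-in for the speaker/hotword verification step.
        """
        if not matches(utterance, list(self.samples)):
            return False
        self.samples.append(utterance)
        return True
```

When the store is already full, each accepted utterance yields a new set that contains the new sample and omits one sample from the previous set, so later verifications track gradual changes in the user's voice.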

45.

Patent application
HOTWORD DETECTION ON MULTIPLE DEVICES (Pending, published)

Publication No.: US20180286406A1

Publication date: 2018-10-04

Application No.: US15952434

Filing date: 2018-04-13

Applicant: Google LLC

Inventors: Diego Melendo Casado, Alexander H. Gruenstein, Jakob N. Foerster

IPC classification: G10L15/30, G10L15/22, G10L25/78, H04L29/08, G10L15/16, G10L15/08

CPC classification: G10L15/30, G10L15/16, G10L15/22, G10L25/78, G10L2015/088, G10L2015/223, H04L67/10, H05K999/99

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for hotword detection on multiple devices are disclosed. In one aspect, a method includes the actions of receiving audio data that corresponds to an utterance. The actions further include determining that the utterance likely includes a particular, predefined hotword. The actions further include transmitting (i) data indicating that the computing device likely received the particular, predefined hotword, (ii) data identifying the computing device, and (iii) data identifying a group of nearby computing devices that includes the computing device. The actions further include receiving an instruction to commence speech recognition processing on the audio data. The actions further include, in response to receiving the instruction to commence speech recognition processing on the audio data, processing at least a portion of the audio data using an automated speech recognizer on the computing device.
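
The three transmitted items map naturally onto a small report record, sketched below in Python. The abstract does not say how the recipient decides which device should commence recognition, so the arbiter here (highest score within each group) is purely an assumed policy, and `HotwordReport`, `hotword_score`, and `choose_device` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class HotwordReport:
    device_id: str        # (ii) data identifying the computing device
    group_id: str         # (iii) data identifying the group of nearby devices
    hotword_score: float  # (i) assumed encoding of "likely received the hotword"


def choose_device(reports: Iterable[HotwordReport]) -> Dict[str, str]:
    """Hypothetical arbiter: per group, pick the device with the highest score.

    The chosen device would then receive the instruction to commence
    speech recognition on its buffered audio.
    """
    winners: Dict[str, HotwordReport] = {}
    for report in reports:
        current = winners.get(report.group_id)
        if current is None or report.hotword_score > current.hotword_score:
            winners[report.group_id] = report
    return {group: report.device_id for group, report in winners.items()}
```

Grouping by `group_id` keeps devices in different rooms or households from suppressing one another, while ensuring only one nearby device responds.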

46.

Granted patent
Multi-stage hotword detection (In force)

Publication No.: US10008207B2

Publication date: 2018-06-26

Application No.: US15233090

Filing date: 2016-08-10

Applicant: Google LLC

Inventors: Jakob Nicolaus Foerster, Alexander H. Gruenstein, Diego Melendo Casado

IPC classification: G10L15/16, G10L17/10, G10L15/20, G10L15/22, G10L17/02, G10L17/18, G10L17/24, G10L15/08, G10L25/30, G10L15/12, G10L17/00

CPC classification: G10L17/10, G10L15/12, G10L15/16, G10L15/20, G10L15/22, G10L17/00, G10L17/02, G10L17/18, G10L17/24, G10L25/30, G10L2015/088, G10L2015/223

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for multi-stage hotword detection are disclosed. In one aspect, a method includes the actions of receiving, by a second stage hotword detector of a multi-stage hotword detection system that includes at least a first stage hotword detector and the second stage hotword detector, audio data that corresponds to an initial portion of an utterance. The actions further include determining a likelihood that the initial portion of the utterance includes a hotword. The actions further include determining that the likelihood that the initial portion of the utterance includes the hotword satisfies a threshold. The actions further include, in response to determining that the likelihood satisfies the threshold, transmitting a request for the first stage hotword detector to cease providing additional audio data that corresponds to one or more subsequent portions of the utterance.
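
The control flow between the two stages can be sketched in Python as a producer that streams audio chunks and a consumer that may ask it to stop. `FirstStage`, `second_stage`, `score`, and `satisfies` are hypothetical names. Because the abstract says only that the likelihood "satisfies a threshold", the direction of that test is left to the caller-supplied `satisfies` predicate.

```python
from typing import Callable, Iterable, Iterator, List


class FirstStage:
    """Stand-in first stage detector that streams audio chunks until told to stop."""

    def __init__(self, chunks: Iterable[str]) -> None:
        self._chunks = list(chunks)
        self.streaming = True

    def stream(self) -> Iterator[str]:
        for chunk in self._chunks:
            if not self.streaming:
                break
            yield chunk

    def cease(self) -> None:
        # Models the request to stop providing subsequent portions.
        self.streaming = False


def second_stage(
    first: FirstStage,
    score: Callable[[str], float],
    satisfies: Callable[[float], bool],
) -> List[str]:
    """Score the initial portion; if the threshold test passes, request a stop."""
    received = []
    for index, chunk in enumerate(first.stream()):
        received.append(chunk)
        if index == 0 and satisfies(score(chunk)):
            first.cease()
    return received
```

Only the initial portion is scored here, so the second stage decides early whether the remaining audio needs to be sent at all.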

47.

Granted patent
Hotword detection on multiple devices (In force)

Publication No.: US09972320B2

Publication date: 2018-05-15

Application No.: US15278269

Filing date: 2016-09-28

Applicant: Google LLC

Inventors: Diego Melendo Casado, Alexander H. Gruenstein, Jakob Nicolaus Foerster

IPC classification: G10L21/00, G10L15/30, G10L25/78, H04L29/08, G10L15/08, G10L15/16

CPC classification: G10L15/30, G10L15/16, G10L15/22, G10L25/78, G10L2015/088, G10L2015/223, H04L67/10

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for hotword detection on multiple devices are disclosed. In one aspect, a method includes the actions of receiving audio data that corresponds to an utterance. The actions further include determining that the utterance likely includes a particular, predefined hotword. The actions further include transmitting (i) data indicating that the computing device likely received the particular, predefined hotword, (ii) data identifying the computing device, and (iii) data identifying a group of nearby computing devices that includes the computing device. The actions further include receiving an instruction to commence speech recognition processing on the audio data. The actions further include, in response to receiving the instruction to commence speech recognition processing on the audio data, processing at least a portion of the audio data using an automated speech recognizer on the computing device.

48.

Granted patent
Recognizing speech in the presence of additional audio (In force)

Publication No.: US11942083B2

Publication date: 2024-03-26

Application No.: US17303139

Filing date: 2021-05-21

Applicant: Google LLC

Inventors: Diego Melendo Casado, Ignacio Lopez Moreno, Javier Gonzalez-Dominguez

IPC classification: G10L15/00, G06F3/16, G10L15/20, G10L15/22, G10L17/06, G10L21/034, G10L25/84, H03G3/30, G10L15/26, G10L17/00

CPC classification: G10L15/20, G06F3/165, G06F3/167, G10L15/222, G10L17/06, G10L21/034, G10L25/84, H03G3/3005, G10L15/26, G10L17/00

Abstract: The technology described in this document can be embodied in a computer-implemented method that includes receiving, at a processing system, a first signal including an output of a speaker device and an additional audio signal. The method also includes determining, by the processing system, based at least in part on a model trained to identify the output of the speaker device, that the additional audio signal corresponds to an utterance of a user. The method further includes initiating a reduction in an audio output level of the speaker device based on determining that the additional audio signal corresponds to the utterance of the user.
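
The final step of this abstract, lowering the playback level once user speech is detected, reduces to a small decision function. The Python sketch below assumes a hypothetical `is_user_utterance` callback standing in for the trained model's decision, and the `ducked_level` default of 0.2 is an arbitrary illustrative value.

```python
from typing import Callable, Sequence


def update_output_level(
    level: float,
    mixed_frame: Sequence[float],
    is_user_utterance: Callable[[Sequence[float]], bool],
    ducked_level: float = 0.2,
) -> float:
    """Return the playback level to use after inspecting one captured frame.

    `mixed_frame` holds the speaker output plus any additional audio.
    `is_user_utterance` is a stand-in for the model-based determination that
    the additional audio is a user's utterance.
    """
    if is_user_utterance(mixed_frame):
        # Reduce the level, but never raise it above what is already playing.
        return min(level, ducked_level)
    return level
```

Taking the minimum keeps the function from increasing the volume when playback is already quieter than the ducked level.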

49.

Published application
Modality Learning on Mobile Devices (Pending, published)

Publication No.: US20240086063A1

Publication date: 2024-03-14

Application No.: US18517825

Filing date: 2023-11-22

Applicant: Google LLC

Inventors: Yu Ouyang, Diego Melendo Casado, Mohammadinamul Hasan Sheik, Francoise Beaufays, Dragan Zivkovic, Meltem Oktem

IPC classification: G06F3/04886, G06F1/16, G06F3/023, G06F3/04883, G06F3/16, G06F40/166, G06F40/289

CPC classification: G06F3/04886, G06F1/1626, G06F3/0233, G06F3/04883, G06F3/167, G06F40/166, G06F40/289, G06F2203/0381, G10L15/22

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for cross input modality learning in a mobile device are disclosed. In one aspect, a method includes activating a first modality user input mode in which user inputs by way of a first modality are recognized using a first modality recognizer; and receiving a user input by way of the first modality. The method includes obtaining, as a result of the first modality recognizer recognizing the user input, a transcription that includes a particular term; and generating an input context data structure that references at least the particular term. The method further includes transmitting, by the first modality recognizer, the input context data structure to a second modality recognizer for use in updating a second modality recognition model associated with the second modality recognizer.

50.

Granted patent
Multi-user authentication on a device (In force)

Publication No.: US11727918B2

Publication date: 2023-08-15

Application No.: US17375573

Filing date: 2021-07-14

Applicant: GOOGLE LLC

Inventors: Ignacio Lopez Moreno, Diego Melendo Casado

IPC classification: G10L15/08, G06F21/32, G10L17/06, G06F16/635, G10L15/22, G10L17/00, G06V40/10, G10L15/07, G10L15/26

CPC classification: G10L15/08, G06F16/636, G06F21/32, G06V40/10, G10L15/07, G10L15/22, G10L17/00, G10L17/06, G10L15/26, G10L2015/088

Abstract: In some implementations, a set of audio recordings capturing utterances of a user is received by a first speech-enabled device. Based on the set of audio recordings, the first speech-enabled device generates a first user voice recognition model for use in subsequently recognizing a voice of the user at the first speech-enabled device. Further, a particular user account associated with the first voice recognition model is determined, and an indication that a second speech-enabled device that is associated with the particular user account is received. In response to receiving the indication, the set of audio recordings is provided to the second speech-enabled device. Based on the set of audio recordings, the second speech-enabled device generates a second user voice recognition model for use in subsequently recognizing the voice of the user at the second speech-enabled device.
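
The enrollment hand-off in this abstract can be sketched in Python as an account-level store of recordings that is replayed to each newly associated device. `SpeechDevice`, `AccountRegistry`, and the `build_model` callback are hypothetical. Where the recordings are kept and how the association is signaled are not stated in the abstract, so both are assumptions here.

```python
from typing import Any, Callable, Dict, List, Sequence


class SpeechDevice:
    """Stand-in speech-enabled device that builds its own voice model per account."""

    def __init__(self, name: str, build_model: Callable[[Sequence[str]], Any]) -> None:
        self.name = name
        self._build_model = build_model
        self.voice_models: Dict[str, Any] = {}

    def enroll(self, account: str, recordings: Sequence[str]) -> None:
        self.voice_models[account] = self._build_model(recordings)


class AccountRegistry:
    """Hypothetical store that keeps each account's enrollment recordings."""

    def __init__(self) -> None:
        self._recordings: Dict[str, List[str]] = {}

    def enroll_first(self, account: str, device: SpeechDevice, recordings: Sequence[str]) -> None:
        device.enroll(account, recordings)
        self._recordings[account] = list(recordings)

    def add_device(self, account: str, device: SpeechDevice) -> None:
        # Models the indication that another device is associated with the account:
        # the same recordings are provided and the device builds its own model.
        device.enroll(account, self._recordings[account])
```

Sharing the raw recordings rather than the finished model lets each device generate a model suited to its own recognizer, so the user does not have to repeat enrollment phrases on every device.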
