Patent search: ap:("GOOGLE LLC") AND inv:"Gabor Simko", page 1

1.

Patent application
DETECTING CONTINUING CONVERSATIONS WITH COMPUTING DEVICES (status: in force)

Publication number: US20220414333A1

Publication date: 2022-12-29

Application number: US17902543

Filing date: 2022-09-02

Applicant: GOOGLE LLC

Inventors: Nathan David Howard, Gabor Simko, Andrei Giurgiu, Behshad Behzadi, Marcin M. Nowak-Przygodzki

IPC classification: G06F40/284, G06F16/903, G06F16/901, G06N5/02, G10L15/22, G10L25/51, G10L15/08

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for detecting a continued conversation are disclosed. In one aspect, a method includes the actions of receiving first audio data of a first utterance. The actions further include obtaining a first transcription of the first utterance. The actions further include receiving second audio data of a second utterance. The actions further include obtaining a second transcription of the second utterance. The actions further include determining whether the second utterance includes a query directed to a query processing system based on analysis of the second transcription and the first transcription or a response to the first query. The actions further include configuring a data routing component to provide the second transcription of the second utterance to the query processing system as a second query or to bypass routing the second transcription.
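The routing decision described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how the transcriptions are analysed, so the `shares_content_word` heuristic, the function names, and the example strings are all hypothetical stand-ins rather than the patented method.

```python
from typing import Callable, Optional

# Signature of the (unspecified) analysis step: it sees the first
# transcription, the second transcription, and the system's response.
Analyzer = Callable[[str, str, str], bool]


def shares_content_word(first: str, second: str, response: str) -> bool:
    """Hypothetical heuristic: the follow-up reuses a longer word from the context."""
    context = set((first + " " + response).lower().split())
    return any(len(word) > 3 and word in context for word in second.lower().split())


def route_second_utterance(
    first: str, second: str, response: str, is_directed: Analyzer
) -> Optional[str]:
    """Return the second transcription as a new query, or None to bypass routing."""
    return second if is_directed(first, second, response) else None


first_query = "where is the eiffel tower"
system_response = "the eiffel tower is in paris"

routed = route_second_utterance(
    first_query, "when was the tower built", system_response, shares_content_word
)
bypassed = route_second_utterance(
    first_query, "pass me the salt please", system_response, shares_content_word
)
```

Passing the analysis step in as a callable keeps the routing logic separate from whatever model actually makes the decision.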

2.

Granted patent
Joint endpointing and automatic speech recognition (status: in force)

Publication number: US11475880B2

Publication date: 2022-10-18

Application number: US16809403

Filing date: 2020-03-04

Applicant: Google LLC

Inventors: Shuo-yiin Chang, Rohit Prakash Prabhavalkar, Gabor Simko, Tara N. Sainath, Bo Li, Yangzhang He

IPC classification: G10L15/16, G10L15/02, G10L15/14, G10L15/28, G10L15/08

Abstract: A method includes receiving audio data of an utterance and processing the audio data to obtain, as output from a speech recognition model configured to jointly perform speech decoding and endpointing of utterances: partial speech recognition results for the utterance; and an endpoint indication indicating when the utterance has ended. While processing the audio data, the method also includes detecting, based on the endpoint indication, the end of the utterance. In response to detecting the end of the utterance, the method also includes terminating the processing of any subsequent audio data received after the end of the utterance was detected.
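A minimal sketch of the control flow this abstract describes: a single model emits partial results together with an endpoint indication, and processing stops once the endpoint is seen. The `toy_joint_model` generator and the `<eos>` marker are hypothetical placeholders for a real speech recognition model, which the abstract does not specify.

```python
from typing import Iterable, Iterator, List, Tuple


def toy_joint_model(chunks: Iterable[str]) -> Iterator[Tuple[str, bool]]:
    """Hypothetical stand-in for a joint ASR/endpointing model.

    For each audio chunk it yields the partial transcript so far and a flag
    that is True once the utterance has ended.
    """
    words: List[str] = []
    for chunk in chunks:
        if chunk == "<eos>":  # assumed end-of-utterance marker
            yield " ".join(words), True
        else:
            words.append(chunk)
            yield " ".join(words), False


def recognize_until_endpoint(chunks: Iterable[str]) -> Tuple[str, int]:
    """Process chunks until the endpoint indication fires, then stop."""
    transcript = ""
    processed = 0
    for partial, is_endpoint in toy_joint_model(chunks):
        processed += 1
        transcript = partial
        if is_endpoint:
            break  # audio after the endpoint is never processed
    return transcript, processed


stream = ["what", "time", "is", "it", "<eos>", "background", "noise"]
result = recognize_until_endpoint(stream)
```

Because the model is consumed lazily, the two chunks after the endpoint are never read, which mirrors the "terminating the processing of any subsequent audio data" step.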

3.

Granted patent
End of query detection (status: in force)

Publication number: US10593352B2

Publication date: 2020-03-17

Application number: US16001140

Filing date: 2018-06-06

Applicant: Google LLC

Inventors: Gabor Simko, Maria Carolina Parada San Martin, Sean Matthew Shannon

IPC classification: G10L17/00, G10L25/78, G10L15/22, G10L15/187, G10L15/065, G10L15/18

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for detecting an end of a query are disclosed. In one aspect, a method includes the actions of receiving audio data that corresponds to an utterance spoken by a user. The actions further include applying, to the audio data, an end-of-query model. The actions further include determining a confidence score that reflects a likelihood that the utterance is a complete utterance. The actions further include comparing the confidence score to a confidence score threshold. The actions further include determining whether the utterance is likely complete or likely incomplete. The actions further include providing, for output, an instruction to (i) maintain a microphone that is receiving the utterance in an active state or (ii) deactivate the microphone that is receiving the utterance.
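The thresholding step in this abstract can be illustrated with a short sketch. The threshold value of 0.8, the choice of `>=` for the comparison, and the names `MicInstruction` and `mic_instruction` are assumptions made for the example; the abstract states only that a confidence score is compared to a threshold.

```python
from enum import Enum


class MicInstruction(Enum):
    KEEP_ACTIVE = "maintain microphone in active state"
    DEACTIVATE = "deactivate microphone"


def mic_instruction(confidence: float, threshold: float = 0.8) -> MicInstruction:
    """Map an end-of-query confidence score to a microphone instruction.

    `confidence` is the likelihood that the utterance is complete, as it
    would be produced by an end-of-query model (not implemented here).
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be in [0, 1]")
    likely_complete = confidence >= threshold  # assumed comparison rule
    return MicInstruction.DEACTIVATE if likely_complete else MicInstruction.KEEP_ACTIVE


after_pause = mic_instruction(0.93)  # utterance likely complete
mid_sentence = mic_instruction(0.35)  # utterance likely incomplete
```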

4.

Granted patent
End of query detection (status: in force)

Publication number: US11551709B2

Publication date: 2023-01-10

Application number: US16778222

Filing date: 2020-01-31

Applicant: Google LLC

Inventors: Gabor Simko, Maria Carolina Parada San Martin, Sean Matthew Shannon

IPC classification: G10L25/78, G10L15/22, G10L15/187, G10L15/065, G10L15/18

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for detecting an end of a query are disclosed. In one aspect, a method includes the actions of receiving audio data that corresponds to an utterance spoken by a user. The actions further include applying, to the audio data, an end-of-query model. The actions further include determining a confidence score that reflects a likelihood that the utterance is a complete utterance. The actions further include comparing the confidence score to a confidence score threshold. The actions further include determining whether the utterance is likely complete or likely incomplete. The actions further include providing, for output, an instruction to (i) maintain a microphone that is receiving the utterance in an active state or (ii) deactivate the microphone that is receiving the utterance.

5.

Granted patent
Utterance classifier (status: in force)

Publication number: US11545147B2

Publication date: 2023-01-03

Application number: US16401349

Filing date: 2019-05-02

Applicant: Google LLC

Inventors: Nathan David Howard, Gabor Simko, Maria Carolina Parada San Martin, Ramkarthik Kalyanasundaram, Guru Prakash Arumugam, Srinivas Vasudevan

IPC classification: G10L15/08, G10L15/22, G06F3/16, G10L15/16, G10L15/18, G10L15/30, G10L17/00

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for classification using neural networks are disclosed. One method includes receiving audio data corresponding to an utterance; obtaining a transcription of the utterance; generating a representation of the audio data; generating a representation of the transcription of the utterance; providing (i) the representation of the audio data and (ii) the representation of the transcription of the utterance to a classifier that, based on a given representation of the audio data and a given representation of the transcription of the utterance, is trained to output an indication of whether the utterance associated with the given representation is likely directed to an automated assistant or is likely not directed to an automated assistant; receiving, from the classifier, an indication of whether the utterance corresponding to the received audio data is likely directed to the automated assistant or is likely not directed to the automated assistant; and selectively instructing the automated assistant based at least on that indication.
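The data flow in this abstract (two representations fed to one classifier, followed by a selective instruction) can be sketched as below. The abstract describes a trained neural network classifier; the single logistic unit used here, along with the weights, the 0.5 threshold, and the function names, is a deliberately simplified and hypothetical substitute that only shows how the inputs are combined.

```python
import math
from typing import Sequence


def directed_score(
    audio_repr: Sequence[float],
    text_repr: Sequence[float],
    weights: Sequence[float],
    bias: float = 0.0,
) -> float:
    """Score how likely the utterance is directed to the assistant.

    The audio and transcription representations are concatenated and passed
    through one logistic unit (a stand-in for the trained classifier).
    """
    features = list(audio_repr) + list(text_repr)
    if len(features) != len(weights):
        raise ValueError("weights must match the concatenated feature length")
    logit = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-logit))


def assistant_action(score: float, threshold: float = 0.5) -> str:
    """Selectively instruct the assistant based on the classifier output."""
    return "forward to assistant" if score >= threshold else "ignore utterance"


# Hypothetical two-dimensional representations and weights.
score = directed_score([1.0, 0.0], [0.0, 1.0], weights=[2.0, -1.0, 0.5, 1.0])
action = assistant_action(score)
```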

6.

Granted patent
Unified endpointer using multitask and multidomain learning (status: in force)

Publication number: US10929754B2

Publication date: 2021-02-23

Application number: US16711172

Filing date: 2019-12-11

Applicant: Google LLC

Inventors: Shuo-yiin Chang, Bo Li, Gabor Simko, Maria Carolina Parada San Martin, Sean Matthew Shannon

IPC classification: G10L15/16, G06N3/08, G06N3/04, G06N20/20, G06K9/62, G06N5/04

Abstract: A method for training an endpointer model includes obtaining training data that includes short-form speech utterances and long-form speech utterances. The method also includes providing a short-form speech utterance as input to a shared neural network, the shared neural network configured to learn shared hidden representations suitable for both voice activity detection (VAD) and end-of-query (EOQ) detection. The method also includes generating, using a VAD classifier, a sequence of predicted VAD labels and determining a VAD loss by comparing the sequence of predicted VAD labels to a corresponding sequence of reference VAD labels. The method also includes generating, using an EOQ classifier, a sequence of predicted EOQ labels and determining an EOQ loss by comparing the sequence of predicted EOQ labels to a corresponding sequence of reference EOQ labels. The method also includes training, using a cross-entropy criterion, the endpointer model based on the VAD loss and the EOQ loss.
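The two-loss objective named in this abstract can be illustrated with a small sketch. It assumes the VAD and EOQ classifiers each output per-frame class probabilities and that the two cross-entropy losses are combined as a weighted sum; the weighting scheme, the function names, and the sample values are assumptions, since the abstract says only that training is based on both losses.

```python
import math
from typing import Sequence


def cross_entropy(predicted: Sequence[Sequence[float]], reference: Sequence[int]) -> float:
    """Mean negative log-probability assigned to the reference label per frame."""
    if len(predicted) != len(reference) or not reference:
        raise ValueError("need one reference label per predicted frame")
    return -sum(math.log(p[y]) for p, y in zip(predicted, reference)) / len(reference)


def endpointer_loss(
    vad_probs: Sequence[Sequence[float]],
    vad_labels: Sequence[int],
    eoq_probs: Sequence[Sequence[float]],
    eoq_labels: Sequence[int],
    vad_weight: float = 1.0,  # assumed task weights
    eoq_weight: float = 1.0,
) -> float:
    """Combine the VAD loss and the EOQ loss into one training objective."""
    vad_loss = cross_entropy(vad_probs, vad_labels)
    eoq_loss = cross_entropy(eoq_probs, eoq_labels)
    return vad_weight * vad_loss + eoq_weight * eoq_loss


# Two frames, two classes per task; every prediction is maximally uncertain,
# so each task contributes ln(2) to the total.
uniform = [[0.5, 0.5], [0.5, 0.5]]
loss = endpointer_loss(uniform, [0, 1], uniform, [1, 1])
```

In a real training setup both classifiers would sit on top of the shared network, so the gradient of this combined loss updates the shared hidden representations for both tasks.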

7.

Granted patent
Detecting continuing conversations with computing devices (status: in force)

Publication number: US11893350B2

Publication date: 2024-02-06

Application number: US17902543

Filing date: 2022-09-02

Applicant: GOOGLE LLC

Inventors: Nathan David Howard, Gabor Simko, Andrei Giurgiu, Behshad Behzadi, Marcin M. Nowak-Przygodzki

IPC classification: G06F40/284, G06F16/903, G06F16/901, G06N5/02, G10L15/22, G10L25/51, G10L15/08

CPC classification: G06F40/284, G06F16/9024, G06F16/90335, G06N5/02, G10L15/08, G10L15/22, G10L25/51

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for detecting a continued conversation are disclosed. In one aspect, a method includes the actions of receiving first audio data of a first utterance. The actions further include obtaining a first transcription of the first utterance. The actions further include receiving second audio data of a second utterance. The actions further include obtaining a second transcription of the second utterance. The actions further include determining whether the second utterance includes a query directed to a query processing system based on analysis of the second transcription and the first transcription or a response to the first query. The actions further include configuring a data routing component to provide the second transcription of the second utterance to the query processing system as a second query or to bypass routing the second transcription.

8.

Patent application
UTTERANCE CLASSIFIER (status: in force)

Publication number: US20220293101A1

Publication date: 2022-09-15

Application number: US17804657

Filing date: 2022-05-31

Applicant: Google LLC

Inventors: Nathan David Howard, Gabor Simko, Maria Carolina Parada San Martin, Ramkarthik Kalyanasundaram, Guru Prakash Arumugam, Srinivas Vasudevan

IPC classification: G10L15/22, G06F3/16, G10L15/16, G10L15/18, G10L15/30

Abstract: A method includes receiving a spoken utterance that includes a plurality of words, and generating, using a neural network-based utterance classifier comprising a stack of multiple Long Short-Term Memory (LSTM) layers, a respective textual representation for each word of the plurality of words of the spoken utterance. The neural network-based utterance classifier is trained on negative training examples of spoken utterances not directed toward an automated assistant server. The method further includes determining, using the respective textual representation generated for each word of the plurality of words of the spoken utterance, that the spoken utterance is either directed toward the automated assistant server or not directed toward the automated assistant server, and, when the spoken utterance is directed toward the automated assistant server, generating instructions that cause the automated assistant server to generate a response to the spoken utterance.

9.

Patent application
END OF QUERY DETECTION (status: under examination, published)

Publication number: US20200168242A1

Publication date: 2020-05-28

Application number: US16778222

Filing date: 2020-01-31

Applicant: Google LLC

Inventors: Gabor Simko, Maria Carolina Parada San Martin, Sean Matthew Shannon

IPC classification: G10L25/78, G10L15/18, G10L15/065, G10L15/187, G10L15/22

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for detecting an end of a query are disclosed. In one aspect, a method includes the actions of receiving audio data that corresponds to an utterance spoken by a user. The actions further include applying, to the audio data, an end-of-query model. The actions further include determining a confidence score that reflects a likelihood that the utterance is a complete utterance. The actions further include comparing the confidence score to a confidence score threshold. The actions further include determining whether the utterance is likely complete or likely incomplete. The actions further include providing, for output, an instruction to (i) maintain a microphone that is receiving the utterance in an active state or (ii) deactivate the microphone that is receiving the utterance.

10.

Granted patent
Utterance classifier (status: in force)

Publication number: US10311872B2

Publication date: 2019-06-04

Application number: US15659016

Filing date: 2017-07-25

Applicant: Google LLC

Inventors: Nathan David Howard, Gabor Simko, Maria Carolina Parada San Martin, Ramkarthik Kalyanasundaram, Guru Prakash Arumugam, Srinivas Vasudevan

IPC classification: G10L15/08, G10L15/22, G10L15/16, G10L15/30, G10L15/18

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for classification using neural networks are disclosed. One method includes receiving audio data corresponding to an utterance; obtaining a transcription of the utterance; generating a representation of the audio data; generating a representation of the transcription of the utterance; providing (i) the representation of the audio data and (ii) the representation of the transcription of the utterance to a classifier that, based on a given representation of the audio data and a given representation of the transcription of the utterance, is trained to output an indication of whether the utterance associated with the given representation is likely directed to an automated assistant or is likely not directed to an automated assistant; receiving, from the classifier, an indication of whether the utterance corresponding to the received audio data is likely directed to the automated assistant or is likely not directed to the automated assistant; and selectively instructing the automated assistant based at least on that indication.