-
Publication number: US20150364129A1
Publication date: 2015-12-17
Application number: US14313490
Application date: 2014-06-24
Applicant: Google Inc.
Inventor: Javier Gonzalez-Dominguez , Ignacio L. Moreno , David P. Eustis
CPC classification number: G10L15/005 , G10L15/183 , G10L15/32
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for language identification. In some implementations, speech data for an utterance is received and provided to (i) a language identification module and (ii) multiple speech recognizers that are each configured to recognize speech in a different language. From the language identification module, language identification scores corresponding to different languages are received, the language identification scores each indicating a likelihood that the utterance is speech in the corresponding language. A language model confidence score that indicates a level of confidence that a language model has in a transcription of the utterance in a language corresponding to the language model is received. A language is selected based on the language identification scores and the language model confidence scores.
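The selection step described in this abstract can be illustrated with a minimal sketch. It assumes the two kinds of score are combined by linear interpolation; the abstract only states that the language is selected "based on" both, so the weighting scheme, the function name `select_language`, and the parameter `lid_weight` are hypothetical.

```python
from typing import Dict


def select_language(
    lid_scores: Dict[str, float],
    lm_confidences: Dict[str, float],
    lid_weight: float = 0.5,
) -> str:
    """Return the language with the highest combined score.

    lid_scores:     language identification score per language.
    lm_confidences: language model confidence score per language, one from
                    each per-language recognizer.
    lid_weight:     interpolation weight (an assumption for illustration,
                    not taken from the patent).
    """
    if not 0.0 <= lid_weight <= 1.0:
        raise ValueError("lid_weight must be between 0 and 1")

    # Only languages that have both kinds of score can be compared.
    candidates = sorted(lid_scores.keys() & lm_confidences.keys())
    if not candidates:
        raise ValueError("no language has both scores")

    def combined(lang: str) -> float:
        return (lid_weight * lid_scores[lang]
                + (1.0 - lid_weight) * lm_confidences[lang])

    return max(candidates, key=combined)


if __name__ == "__main__":
    lid = {"en": 0.6, "es": 0.4}
    lm = {"en": 0.2, "es": 0.9}
    print(select_language(lid, lm))
```

With `lid_weight=1.0` the sketch reduces to picking the top language identification score alone, which makes the contribution of the language model confidence easy to compare.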
-
Publication number: US09922645B2
Publication date: 2018-03-20
Application number: US15460342
Application date: 2017-03-16
Applicant: Google Inc.
CPC classification number: G10L15/20 , G06F3/165 , G06F3/167 , G10L15/222 , G10L15/265 , G10L17/00 , G10L17/06 , G10L21/034 , G10L25/84 , H03G3/3005
Abstract: The technology described in this document can be embodied in a computer-implemented method that includes receiving, at a processing system, a first signal including an output of a speaker device and an additional audio signal. The method also includes determining, by the processing system, based at least in part on a model trained to identify the output of the speaker device, that the additional audio signal corresponds to an utterance of a user. The method further includes initiating a reduction in an audio output level of the speaker device based on determining that the additional audio signal corresponds to the utterance of the user.
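The control flow in this abstract (detect a user utterance on top of the device's own playback, then lower the output level) can be sketched as follows. The abstract relies on a model trained to identify the speaker output; the residual-energy check below is only a toy stand-in for that model, and every name and threshold here is hypothetical.

```python
from typing import Sequence


def residual_energy(mic: Sequence[float], playback: Sequence[float]) -> float:
    """Mean squared difference between the microphone frame and the known playback."""
    if len(mic) != len(playback) or not mic:
        raise ValueError("frames must be non-empty and the same length")
    return sum((m - p) ** 2 for m, p in zip(mic, playback)) / len(mic)


def is_user_utterance(mic: Sequence[float], playback: Sequence[float],
                      threshold: float = 0.01) -> bool:
    """Toy stand-in for the trained model: flag audio not explained by playback."""
    return residual_energy(mic, playback) > threshold


def next_output_level(mic: Sequence[float], playback: Sequence[float],
                      current_level: float, ducked_level: float = 0.2) -> float:
    """Lower the output level when additional audio looks like a user utterance."""
    if is_user_utterance(mic, playback):
        return min(current_level, ducked_level)
    return current_level


if __name__ == "__main__":
    playback = [0.1, -0.2, 0.3, -0.1]
    with_speech = [p + 0.5 for p in playback]
    print(next_output_level(playback, playback, current_level=1.0))
    print(next_output_level(with_speech, playback, current_level=1.0))
```

A real system would replace `is_user_utterance` with the trained model and would typically smooth the level change over time rather than switching in one step.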
-
Publication number: US09601116B2
Publication date: 2017-03-21
Application number: US15093309
Application date: 2016-04-07
Applicant: Google Inc.
CPC classification number: G10L15/20 , G06F3/165 , G06F3/167 , G10L15/222 , G10L15/265 , G10L17/00 , G10L17/06 , G10L21/034 , G10L25/84 , H03G3/3005
Abstract: The technology described in this document can be embodied in a computer-implemented method that includes receiving, at a processing system, a first signal including an output of a speaker device and an additional audio signal. The method also includes determining, by the processing system, based at least in part on a model trained to identify the output of the speaker device, that the additional audio signal corresponds to an utterance of a user. The method further includes initiating a reduction in an audio output level of the speaker device based on determining that the additional audio signal corresponds to the utterance of the user.
-
Publication number: US20170186424A1
Publication date: 2017-06-29
Application number: US15460342
Application date: 2017-03-16
Applicant: Google Inc.
IPC: G10L15/20 , G10L21/034 , G10L25/84
CPC classification number: G10L15/20 , G06F3/165 , G06F3/167 , G10L15/222 , G10L15/265 , G10L17/00 , G10L17/06 , G10L21/034 , G10L25/84 , H03G3/3005
Abstract: The technology described in this document can be embodied in a computer-implemented method that includes receiving, at a processing system, a first signal including an output of a speaker device and an additional audio signal. The method also includes determining, by the processing system, based at least in part on a model trained to identify the output of the speaker device, that the additional audio signal corresponds to an utterance of a user. The method further includes initiating a reduction in an audio output level of the speaker device based on determining that the additional audio signal corresponds to the utterance of the user.
-
Publication number: US20150235637A1
Publication date: 2015-08-20
Application number: US14181345
Application date: 2014-02-14
Applicant: Google Inc.
Inventor: Diego Melendo Casado , Ignacio L. Moreno , Javier Gonzalez-Dominguez
CPC classification number: G10L15/20 , G06F3/165 , G06F3/167 , G10L15/222 , G10L15/265 , G10L17/00 , G10L17/06 , G10L21/034 , G10L25/84 , H03G3/3005
Abstract: The technology described in this document can be embodied in a computer-implemented method that includes receiving, at a processing system, a first signal including an output of a speaker device and an additional audio signal. The method also includes determining, by the processing system, based at least in part on a model trained to identify the output of the speaker device, that the additional audio signal corresponds to an utterance of a user. The method further includes initiating a reduction in an audio output level of the speaker device based on determining that the additional audio signal corresponds to the utterance of the user.
-
Publication number: US20160225373A1
Publication date: 2016-08-04
Application number: US15093309
Application date: 2016-04-07
Applicant: Google Inc.
CPC classification number: G10L15/20 , G06F3/165 , G06F3/167 , G10L15/222 , G10L15/265 , G10L17/00 , G10L17/06 , G10L21/034 , G10L25/84 , H03G3/3005
Abstract: The technology described in this document can be embodied in a computer-implemented method that includes receiving, at a processing system, a first signal including an output of a speaker device and an additional audio signal. The method also includes determining, by the processing system, based at least in part on a model trained to identify the output of the speaker device, that the additional audio signal corresponds to an utterance of a user. The method further includes initiating a reduction in an audio output level of the speaker device based on determining that the additional audio signal corresponds to the utterance of the user.
-
Publication number: US09318112B2
Publication date: 2016-04-19
Application number: US14181345
Application date: 2014-02-14
Applicant: Google Inc.
Inventor: Diego Melendo Casado , Ignacio L. Moreno , Javier Gonzalez-Dominguez
CPC classification number: G10L15/20 , G06F3/165 , G06F3/167 , G10L15/222 , G10L15/265 , G10L17/00 , G10L17/06 , G10L21/034 , G10L25/84 , H03G3/3005
Abstract: The technology described in this document can be embodied in a computer-implemented method that includes receiving, at a processing system, a first signal including an output of a speaker device and an additional audio signal. The method also includes determining, by the processing system, based at least in part on a model trained to identify the output of the speaker device, that the additional audio signal corresponds to an utterance of a user. The method further includes initiating a reduction in an audio output level of the speaker device based on determining that the additional audio signal corresponds to the utterance of the user.
-
Publication number: US20160035344A1
Publication date: 2016-02-04
Application number: US14817302
Application date: 2015-08-04
Applicant: Google Inc.
Inventor: Javier Gonzalez-Dominguez , Hasim Sak , Ignacio Lopez Moreno
IPC: G10L15/00
CPC classification number: G10L15/005 , G06N3/0445 , G06N3/084 , G10L15/16
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for identifying the language of a spoken utterance. One of the methods includes receiving a plurality of audio frames that collectively represent at least a portion of a spoken utterance; processing the plurality of audio frames using a long short term memory (LSTM) neural network to generate a respective language score for each of a plurality of languages, wherein the respective language score for each of the plurality of languages represents a likelihood that the spoken utterance was spoken in the language; and classifying the spoken utterance as being spoken in one of the plurality of languages using the language scores.
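The pipeline in this abstract (audio frames in, one score per language out, then a classification) can be sketched with a very small LSTM written in plain Python. This is an illustration under stated assumptions, not the patented system: the weights are random and untrained, biases are omitted, the scores are a softmax over the final hidden state, and the class name `TinyLSTMLanguageID` is hypothetical.

```python
import math
import random
from typing import Dict, List, Sequence


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix: List[List[float]], vec: Sequence[float]) -> List[float]:
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]


class TinyLSTMLanguageID:
    """Single-layer LSTM followed by a softmax over languages (untrained)."""

    def __init__(self, n_features: int, n_hidden: int,
                 languages: Sequence[str], seed: int = 0) -> None:
        rng = random.Random(seed)

        def mat(rows: int, cols: int) -> List[List[float]]:
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                    for _ in range(rows)]

        self.languages = list(languages)
        self.n_hidden = n_hidden
        # One weight matrix per gate, applied to the concatenation [x; h]:
        # i = input gate, f = forget gate, o = output gate, g = candidate.
        self.gates = {name: mat(n_hidden, n_features + n_hidden)
                      for name in "ifog"}
        self.w_out = mat(len(self.languages), n_hidden)

    def scores(self, frames: Sequence[Sequence[float]]) -> Dict[str, float]:
        """Run the LSTM over all frames and return one score per language."""
        h = [0.0] * self.n_hidden
        c = [0.0] * self.n_hidden
        for x in frames:
            z = list(x) + h
            i = [sigmoid(a) for a in matvec(self.gates["i"], z)]
            f = [sigmoid(a) for a in matvec(self.gates["f"], z)]
            o = [sigmoid(a) for a in matvec(self.gates["o"], z)]
            g = [math.tanh(a) for a in matvec(self.gates["g"], z)]
            c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
            h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
        logits = matvec(self.w_out, h)
        peak = max(logits)
        exps = [math.exp(l - peak) for l in logits]
        total = sum(exps)
        return {lang: e / total for lang, e in zip(self.languages, exps)}

    def classify(self, frames: Sequence[Sequence[float]]) -> str:
        """Pick the language with the highest score."""
        s = self.scores(frames)
        return max(s, key=s.get)


if __name__ == "__main__":
    model = TinyLSTMLanguageID(n_features=3, n_hidden=4,
                               languages=["en", "es", "fr"])
    frames = [[0.1, 0.2, 0.3], [0.0, -0.1, 0.4]]
    print(model.scores(frames))
    print(model.classify(frames))
```

Because the weights are random, the predicted language is meaningless; the sketch only shows how per-language scores are produced from a frame sequence and then used for classification.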