-
Publication No.: US09899024B1
Publication date: 2018-02-20
Application No.: US15393770
Application date: 2016-12-29
Applicant: Google Inc.
Inventor: Dimitri Kanevsky , Golan Pundak
IPC: G10L15/04 , G10L15/00 , G10L21/00 , G10L17/00 , G10L15/22 , G10L15/02 , G10L15/187 , G10L25/63 , G09B19/04
CPC classification number: G10L15/22 , G09B5/04 , G09B19/04 , G10L15/02 , G10L15/187 , G10L25/63 , G10L2015/025 , G10L2015/223
Abstract: Methods, systems, and apparatus are described for inducing a user of a speech recognition system to adjust their own behavior. For example, in one implementation, a speech recognition system that allows children to control electronic devices can improve the child's speech development, by encouraging the child to speak more clearly. To do so, the speech recognition system can generate a phonetic representation of a term spoken by the child, and can determine whether the phonetic representation matches a particular canonical pronunciation of the particular term that is deemed age-appropriate for the child. Upon determining that the particular canonical pronunciation that matches the phonetic representation of the term spoken by the child is not age-appropriate, the speech recognition system can select and implement a variety of remediation strategies for inducing the child to repeat the term using a pronunciation that is considered age-appropriate.
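The decision flow in this abstract (match the spoken phonemes to a canonical pronunciation, check whether that pronunciation is age-appropriate, then either proceed or remediate) can be illustrated with a minimal sketch. Everything named here is an assumption for illustration: the `LEXICON` contents, the `max_age` field, exact-match comparison of phoneme sequences, and the three outcome strings are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class CanonicalPronunciation:
    phonemes: Tuple[str, ...]
    # Hypothetical field: oldest age at which this pronunciation is still
    # deemed appropriate. None means appropriate at any age.
    max_age: Optional[int] = None


# Hypothetical lexicon: each term maps to several canonical pronunciations.
LEXICON = {
    "car": [
        CanonicalPronunciation(("k", "aa", "r")),
        CanonicalPronunciation(("k", "aa"), max_age=3),
    ],
}


def match_pronunciation(term: str, phonemes: Sequence[str]) -> Optional[CanonicalPronunciation]:
    """Return the canonical pronunciation that exactly matches, if any."""
    for candidate in LEXICON.get(term, []):
        if candidate.phonemes == tuple(phonemes):
            return candidate
    return None


def is_age_appropriate(pron: CanonicalPronunciation, child_age: int) -> bool:
    return pron.max_age is None or child_age <= pron.max_age


def respond(term: str, phonemes: Sequence[str], child_age: int) -> str:
    """Choose an outcome for one spoken term (outcome names are illustrative)."""
    matched = match_pronunciation(term, phonemes)
    if matched is None:
        return "no_match"
    if is_age_appropriate(matched, child_age):
        return "execute_command"
    # Stand-in for selecting one of several remediation strategies.
    return "prompt_repeat"
```

In this sketch a five-year-old who says "k aa" for "car" gets a prompt to repeat the word, while a two-year-old saying the same thing has the command carried out.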
-
Publication No.: US20170094049A1
Publication date: 2017-03-30
Application No.: US14869223
Application date: 2015-09-29
Applicant: Google Inc.
Inventor: Dimitri Kanevsky , Marcel M.M. Yung
CPC classification number: H04M1/72577 , G06F21/32 , G06F2221/2139 , H04W12/02 , H04W12/08
Abstract: A computing device receives voice input that includes first voice input from a first user and second voice input from a second user. The computing device may determine, based at least in part on the received voice input, a change in possession of the computing device. The computing device may determine, based at least in part on the first voice input and the second voice input, delegation of the computing device from the first user to the second user. The computing device may, in response to determining the change in possession of the computing device and the delegation of the computing device, change at least a level of access to functionality of the computing device from a first level of access to a second level of access.
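The abstract conditions the access change on two separate determinations, a change in possession and a delegation. A minimal sketch of that gating logic follows. The `AccessLevel` values and the function name are hypothetical, and the two determinations are passed in as booleans because the abstract does not say how they are derived from the voice input.

```python
from enum import Enum


class AccessLevel(Enum):
    # Hypothetical levels; the abstract only speaks of a first and second level.
    FULL = "full"
    LIMITED = "limited"


def next_access_level(
    current: AccessLevel,
    possession_changed: bool,
    delegated: bool,
    delegated_level: AccessLevel = AccessLevel.LIMITED,
) -> AccessLevel:
    """Switch levels only when both determinations hold; otherwise keep the current one."""
    if possession_changed and delegated:
        return delegated_level
    return current
```

Requiring both flags means a handover without a spoken delegation, or a delegation phrase without a handover, leaves the access level unchanged in this sketch.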
-
Publication No.: US20160180214A1
Publication date: 2016-06-23
Application No.: US14577301
Application date: 2014-12-19
Applicant: Google Inc.
CPC classification number: G06N3/08 , G06N3/0454 , G10L15/063 , G10L2015/088
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network. One of the methods includes training a neural network using sharp discrepancy learning by providing training data to the neural network, calculating a gradient using a sharp discrepancy output layer objective function to classify the neural network parameters for correct and incorrect network model states, and training the neural network using the gradient to determine a probability that data received by the neural network has features similar to key features of one or more keywords or key phrases.
-
Publication No.: US20170076749A1
Publication date: 2017-03-16
Application No.: US14856270
Application date: 2015-09-16
Applicant: Google Inc.
Inventor: Dimitri Kanevsky , Golan Pundak
CPC classification number: G11B20/10527 , G06F3/16 , G06F3/165 , G10L17/00 , G10L21/0202 , G10L21/0208 , G10L21/028 , G10L21/0364 , G10L25/51 , G10L25/84 , G11B2020/10546 , H04M3/56 , H04M3/568
Abstract: In general, the subject matter described in this disclosure can be embodied in methods, systems, and program products for identifying that a first audio stream includes first, second, and third sources of audio. A computing system identifies that a second audio stream includes the first, second, and third sources of audio. The computing system determines that the first and second sources of audio are part of a first conversation. The computing system generates a third audio stream that combines the first source of audio from the first audio stream, the first source of audio from the second audio stream, the second source of audio from the first audio stream, and the second source of audio from the second audio stream, and diminishes the third source of audio from the first audio stream, and the third source of audio from the second audio stream.
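The mixing step described above can be sketched as follows, under strong simplifying assumptions: each input stream is taken to be already separated into per-source sample lists, the sources belonging to the conversation are already known, and "diminishing" is modeled as multiplying by a fixed gain. The function name, the dictionary layout, and the `attenuation` parameter are all hypothetical.

```python
from typing import Dict, List, Sequence, Set


def mix_conversation(
    streams: Sequence[Dict[str, List[float]]],
    conversation: Set[str],
    attenuation: float = 0.1,
) -> List[float]:
    """Build one output stream from several input streams.

    Sources in `conversation` are added at full gain from every stream;
    all other sources are scaled by `attenuation` (an assumed constant).
    """
    length = max(
        (len(samples) for stream in streams for samples in stream.values()),
        default=0,
    )
    mixed = [0.0] * length
    for stream in streams:
        for source, samples in stream.items():
            gain = 1.0 if source in conversation else attenuation
            for i, value in enumerate(samples):
                mixed[i] += gain * value
    return mixed
```

With two streams that each contain sources "a", "b", and "c", and a conversation of {"a", "b"}, the output combines "a" and "b" from both streams and keeps "c" only at the reduced gain.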
-
Publication No.: US09911420B1
Publication date: 2018-03-06
Application No.: US15474289
Application date: 2017-03-30
Applicant: Google Inc.
Inventor: Dimitri Kanevsky , Golan Pundak
IPC: G10L15/00 , G10L15/26 , G10L15/04 , G10L21/00 , G10L17/00 , G10L15/22 , G10L15/187 , G10L25/63 , G10L15/02 , G09B19/04 , G09B5/04
CPC classification number: G10L15/22 , G09B5/04 , G09B19/04 , G10L15/02 , G10L15/187 , G10L25/63 , G10L2015/025 , G10L2015/223
Abstract: Methods, systems, and apparatus are described for inducing a user of a speech recognition system to adjust their own behavior. For example, in one implementation, a speech recognition system that allows children to control electronic devices can improve the child's speech development, by encouraging the child to speak more clearly. To do so, the speech recognition system can generate a phonetic representation of a term spoken by the child, and can determine whether the phonetic representation matches a particular canonical pronunciation of the particular term that is deemed age-appropriate for the child. Upon determining that the particular canonical pronunciation that matches the phonetic representation of the term spoken by the child is not age-appropriate, the speech recognition system can select and implement a variety of remediation strategies for inducing the child to repeat the term using a pronunciation that is considered age-appropriate.
-
Publication No.: US09826083B2
Publication date: 2017-11-21
Application No.: US14869223
Application date: 2015-09-29
Applicant: Google Inc.
Inventor: Dimitri Kanevsky , Marcel M. M. Yung
CPC classification number: H04M1/72577 , G06F21/32 , G06F2221/2139 , H04W12/02 , H04W12/08
Abstract: A computing device receives voice input that includes first voice input from a first user and second voice input from a second user. The computing device may determine, based at least in part on the received voice input, a change in possession of the computing device. The computing device may determine, based at least in part on the first voice input and the second voice input, delegation of the computing device from the first user to the second user. The computing device may, in response to determining the change in possession of the computing device and the delegation of the computing device, change at least a level of access to functionality of the computing device from a first level of access to a second level of access.
-
Publication No.: US20170300976A1
Publication date: 2017-10-19
Application No.: US15174668
Application date: 2016-06-06
Applicant: Google Inc.
Inventor: Ayse Seza Dogruöz , Natalia Ponomareva , Christoph Urs Oehler , Dimitri Kanevsky
IPC: G06Q30/02
CPC classification number: G06Q30/0269 , G06F17/275 , G06Q30/0241 , G06Q50/01
Abstract: Methods, systems, and media for language identification of a media content item based on comments are provided. In some embodiments, the method comprises: obtaining a plurality of comments associated with a media content item; selecting a subset of the plurality of comments based on one or more criteria; assigning, for each comment in the subset of the plurality of comments, a vector of language probabilities, wherein each component of the vector is assigned a language probability that indicates the likelihood that the comment includes content in a language from a plurality of languages; combining the vector of language probabilities for each comment in the subset of the plurality of comments to generate a combined language vector; identifying a language associated with the media content item based on the combined language vector; and performing an action based on the identified language.
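The pipeline in this abstract (select a subset of comments, assign each a vector of language probabilities, combine the vectors, pick a language) can be sketched as below. Several pieces are assumptions: the `LANGUAGES` tuple, the minimum-length selection criterion, the per-comment `probs` vectors (taken as given, as if produced by some classifier), and the use of a plain average as the combining rule. The abstract says only that the vectors are combined.

```python
from typing import Dict, List, Optional, Sequence

# Hypothetical language set; index i of every probability vector refers to LANGUAGES[i].
LANGUAGES = ("en", "de", "tr")


def select_comments(comments: Sequence[Dict], min_length: int = 10) -> List[Dict]:
    """Assumed selection criterion: drop comments too short to be informative."""
    return [c for c in comments if len(c["text"]) >= min_length]


def combine(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Assumed combining rule: component-wise mean of the per-comment vectors."""
    count = len(vectors)
    return [sum(v[i] for v in vectors) / count for i in range(len(LANGUAGES))]


def identify_language(comments: Sequence[Dict], min_length: int = 10) -> Optional[str]:
    subset = select_comments(comments, min_length)
    if not subset:
        return None
    combined = combine([c["probs"] for c in subset])
    best = max(range(len(LANGUAGES)), key=lambda i: combined[i])
    return LANGUAGES[best]
```

A short comment such as "ok" is filtered out before combining in this sketch, so a single confident but uninformative vector does not dominate the result.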
-
Publication No.: US09570074B2
Publication date: 2017-02-14
Application No.: US14557751
Application date: 2014-12-02
Applicant: Google Inc.
Inventor: Dimitri Kanevsky , Golan Pundak
IPC: G10L15/04 , G10L15/00 , G10L21/00 , G10L17/00 , G10L15/22 , G09B19/04 , G10L15/02 , G10L15/187 , G10L25/63
CPC classification number: G10L15/22 , G09B5/04 , G09B19/04 , G10L15/02 , G10L15/187 , G10L25/63 , G10L2015/025 , G10L2015/223
Abstract: Methods, systems, and apparatus are described for inducing a user of a speech recognition system to adjust their own behavior. For example, in one implementation, a speech recognition system that allows children to control electronic devices can improve the child's speech development, by encouraging the child to speak more clearly. To do so, the speech recognition system can generate a phonetic representation of a term spoken by the child, and can determine whether the phonetic representation matches a particular canonical pronunciation of the particular term that is deemed age-appropriate for the child. Upon determining that the particular canonical pronunciation that matches the phonetic representation of the term spoken by the child is not age-appropriate, the speech recognition system can select and implement a variety of remediation strategies for inducing the child to repeat the term using a pronunciation that is considered age-appropriate.
-
Publication No.: US20160155437A1
Publication date: 2016-06-02
Application No.: US14557751
Application date: 2014-12-02
Applicant: Google Inc.
Inventor: Dimitri Kanevsky , Golan Pundak
CPC classification number: G10L15/22 , G09B5/04 , G09B19/04 , G10L15/02 , G10L15/187 , G10L25/63 , G10L2015/025 , G10L2015/223
Abstract: Methods, systems, and apparatus are described for inducing a user of a speech recognition system to adjust their own behavior. For example, in one implementation, a speech recognition system that allows children to control electronic devices can improve the child's speech development, by encouraging the child to speak more clearly. To do so, the speech recognition system can generate a phonetic representation of a term spoken by the child, and can determine whether the phonetic representation matches a particular canonical pronunciation of the particular term that is deemed age-appropriate for the child. Upon determining that the particular canonical pronunciation that matches the phonetic representation of the term spoken by the child is not age-appropriate, the speech recognition system can select and implement a variety of remediation strategies for inducing the child to repeat the term using a pronunciation that is considered age-appropriate.