Speech and Noise Models for Speech Recognition

    Publication No.: US20120022860A1

    Publication date: 2012-01-26

    Application No.: US13250777

    Filing date: 2011-09-30

    IPC classification: G10L21/02

    CPC classification: G10L15/20 G10L21/0208

    Abstract: An audio signal generated by a device based on audio input from a user may be received. The audio signal may include at least a user audio portion that corresponds to one or more user utterances recorded by the device. A user speech model associated with the user may be accessed, and a determination may be made that background audio in the audio signal is below a defined threshold. In response to determining that the background audio in the audio signal is below the defined threshold, the accessed user speech model may be adapted based on the audio signal to generate an adapted user speech model that models speech characteristics of the user. Noise compensation may be performed on the received audio signal using the adapted user speech model to generate a filtered audio signal with reduced background audio compared to the received audio signal.
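
The flow in this abstract (check the background level, adapt the user speech model only when the signal is quiet, then compensate for noise) can be sketched roughly as follows. This is a minimal illustration, not the patented method: the per-band power frames, the quietest-frame background estimate, the interpolation rate, the separate `noise_model` argument, and the Wiener-style gain are all assumptions introduced here.

```python
from typing import List, Tuple

Frame = List[float]  # per-band power for one short frame (hypothetical feature layout)


def background_level(frames: List[Frame]) -> float:
    """Assumed estimate: mean power of the quietest frame in the signal."""
    return min(sum(f) / len(f) for f in frames)


def adapt_speech_model(model: Frame, frames: List[Frame], rate: float = 0.5) -> Frame:
    """Move the per-band model means toward the averages observed in this signal."""
    observed = [sum(band) / len(frames) for band in zip(*frames)]
    return [(1 - rate) * m + rate * o for m, o in zip(model, observed)]


def compensate(frames: List[Frame], speech_model: Frame, noise_model: Frame) -> List[Frame]:
    """Scale each band by a Wiener-style gain S / (S + N)."""
    gains = [s / (s + n) if s + n > 0 else 0.0
             for s, n in zip(speech_model, noise_model)]
    return [[g * x for g, x in zip(gains, f)] for f in frames]


def process(frames: List[Frame], speech_model: Frame, noise_model: Frame,
            threshold: float) -> Tuple[List[Frame], Frame]:
    """Adapt the speech model only when background audio is below the threshold."""
    if background_level(frames) < threshold:
        speech_model = adapt_speech_model(speech_model, frames)
    return compensate(frames, speech_model, noise_model), speech_model


quiet = [[4.0, 2.0], [0.0, 0.0]]
filtered, adapted = process(quiet, [4.0, 3.0], [1.0, 2.0], threshold=1.0)
```

Gating adaptation on the background level keeps noisy recordings from leaking into the user model; a signal that fails the check is still filtered, but with the unchanged model.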

    Speech and noise models for speech recognition

    Publication No.: US08249868B2

    Publication date: 2012-08-21

    Application No.: US13250777

    Filing date: 2011-09-30

    IPC classification: G10L15/20

    CPC classification: G10L15/20 G10L21/0208

    Abstract: An audio signal generated by a device based on audio input from a user may be received. The audio signal may include at least a user audio portion that corresponds to one or more user utterances recorded by the device. A user speech model associated with the user may be accessed, and a determination may be made that background audio in the audio signal is below a defined threshold. In response to determining that the background audio in the audio signal is below the defined threshold, the accessed user speech model may be adapted based on the audio signal to generate an adapted user speech model that models speech characteristics of the user. Noise compensation may be performed on the received audio signal using the adapted user speech model to generate a filtered audio signal with reduced background audio compared to the received audio signal.

    3. Speech and noise models for speech recognition
    Type: Granted patent
    Legal status: In force

    Publication No.: US08234111B2

    Publication date: 2012-07-31

    Application No.: US12814665

    Filing date: 2010-06-14

    IPC classification: G10L15/20

    CPC classification: G10L15/20 G10L21/0208

    Abstract: An audio signal generated by a device based on audio input from a user may be received. The audio signal may include at least a user audio portion that corresponds to one or more user utterances recorded by the device. A user speech model associated with the user may be accessed, and a determination may be made that background audio in the audio signal is below a defined threshold. In response to determining that the background audio in the audio signal is below the defined threshold, the accessed user speech model may be adapted based on the audio signal to generate an adapted user speech model that models speech characteristics of the user. Noise compensation may be performed on the received audio signal using the adapted user speech model to generate a filtered audio signal with reduced background audio compared to the received audio signal.

    4. GEOTAGGED ENVIRONMENTAL AUDIO FOR ENHANCED SPEECH RECOGNITION ACCURACY
    Type: Patent application
    Legal status: In force

    Publication No.: US20120022870A1

    Publication date: 2012-01-26

    Application No.: US13250843

    Filing date: 2011-09-30

    IPC classification: H04W64/00 G10L15/00

    CPC classification: G10L21/0208 G10L15/20

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing speech recognition accuracy. In one aspect, a method includes receiving geotagged audio signals that correspond to environmental audio recorded by multiple mobile devices in multiple geographic locations, receiving an audio signal that corresponds to an utterance recorded by a particular mobile device, determining a particular geographic location associated with the particular mobile device, and generating a noise model for the particular geographic location using a subset of the geotagged audio signals, where noise compensation is performed on the audio signal that corresponds to the utterance using the noise model that has been generated for the particular geographic location.
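
One way to picture the location-specific noise model is as an average over the geotagged clips recorded near the device. The sketch below assumes each clip has been reduced to a per-band noise power spectrum, picks the subset by a great-circle distance radius, and uses plain spectral subtraction for the compensation step; the `GeotaggedClip` type, the 1 km radius, and the subtraction rule are illustrative assumptions, not details taken from the patent.

```python
import math
from typing import List, NamedTuple, Optional


class GeotaggedClip(NamedTuple):
    lat: float
    lon: float
    spectrum: List[float]  # hypothetical per-band noise power


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def noise_model_for(lat: float, lon: float, clips: List[GeotaggedClip],
                    radius_km: float = 1.0) -> Optional[List[float]]:
    """Average the spectra of clips within radius_km; None if no clip is nearby."""
    nearby = [c for c in clips if haversine_km(lat, lon, c.lat, c.lon) <= radius_km]
    if not nearby:
        return None
    return [sum(band) / len(nearby) for band in zip(*(c.spectrum for c in nearby))]


def subtract_noise(frame: List[float], noise_model: List[float]) -> List[float]:
    """Spectral subtraction with a floor at zero (one assumed compensation rule)."""
    return [max(x - n, 0.0) for x, n in zip(frame, noise_model)]


clips = [
    GeotaggedClip(37.0, -122.0, [1.0, 3.0]),
    GeotaggedClip(37.001, -122.0, [3.0, 5.0]),
    GeotaggedClip(40.0, -74.0, [100.0, 100.0]),  # far away, excluded from the subset
]
model = noise_model_for(37.0, -122.0, clips)
```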

    5. ACOUSTIC MODEL ADAPTATION USING GEOGRAPHIC INFORMATION
    Type: Patent application
    Legal status: In force

    Publication No.: US20110295590A1

    Publication date: 2011-12-01

    Application No.: US12787568

    Filing date: 2010-05-26

    IPC classification: G06F17/20

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing speech recognition accuracy. In one aspect, a method includes receiving an audio signal that corresponds to an utterance recorded by a mobile device, determining a geographic location associated with the mobile device, adapting one or more acoustic models for the geographic location, and performing speech recognition on the audio signal using the one or more acoustic models that are adapted for the geographic location.
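
As a rough illustration of adapting an acoustic model for a location, the sketch below buckets coordinates into grid cells and applies MAP-style mean adaptation, (tau * base + sum of local vectors) / (tau + n), to a single mean vector. Real acoustic models are far richer than one mean vector, and the abstract does not say which adaptation technique is used; the grid size, the relevance factor `tau`, and all function names are assumptions made for this example.

```python
import math
from typing import Dict, List, Tuple

Cell = Tuple[int, int]


def cell_key(lat: float, lon: float, size_deg: float = 0.1) -> Cell:
    """Bucket a coordinate into a square grid cell (hypothetical region scheme)."""
    return (math.floor(lat / size_deg), math.floor(lon / size_deg))


def map_adapt_mean(base_mean: List[float], local_vectors: List[List[float]],
                   tau: float = 2.0) -> List[float]:
    """MAP-style mean update: more local data pulls further away from the base model."""
    n = len(local_vectors)
    if n == 0:
        return list(base_mean)
    sums = [sum(dim) for dim in zip(*local_vectors)]
    return [(tau * b + s) / (tau + n) for b, s in zip(base_mean, sums)]


def model_for_location(lat: float, lon: float, base_mean: List[float],
                       vectors_by_cell: Dict[Cell, List[List[float]]]) -> List[float]:
    """Return the base mean adapted with feature vectors recorded in the same cell."""
    return map_adapt_mean(base_mean, vectors_by_cell.get(cell_key(lat, lon), []))


vectors_by_cell = {cell_key(37.42, -122.08): [[3.0, 1.0], [3.0, 3.0]]}
local_model = model_for_location(37.45, -122.05, [0.0, 0.0], vectors_by_cell)
```

A location with no recorded data falls back to the unadapted base model, which keeps recognition usable in regions that have not been observed yet.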

    6. GEOTAGGED ENVIRONMENTAL AUDIO FOR ENHANCED SPEECH RECOGNITION ACCURACY
    Type: Patent application
    Legal status: In force

    Publication No.: US20120296643A1

    Publication date: 2012-11-22

    Application No.: US13564636

    Filing date: 2012-08-01

    IPC classification: G10L21/02

    CPC classification: G10L21/0208 G10L15/20

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing speech recognition accuracy. In one aspect, a method includes receiving an audio signal that corresponds to an utterance recorded by a mobile device, determining a geographic location associated with the mobile device, identifying a set of geotagged audio signals that correspond to environmental audio associated with the geographic location, weighting each geotagged audio signal of the set of geotagged audio signals based on metadata associated with the respective geotagged audio signal, and using the set of weighted geotagged audio signals to perform noise compensation on the audio signal that corresponds to the utterance.
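
The metadata-based weighting step might look like the following weighted average. The abstract does not specify which metadata fields drive the weights, so the distance and age fields, the 30-day half-life, and the `TaggedClip` type here are purely hypothetical choices for illustration.

```python
from typing import List, NamedTuple


class TaggedClip(NamedTuple):
    spectrum: List[float]  # hypothetical per-band noise power
    distance_km: float     # assumed metadata: distance from the utterance location
    age_days: float        # assumed metadata: time since the clip was recorded


def clip_weight(clip: TaggedClip, half_life_days: float = 30.0) -> float:
    """Closer and more recent clips get larger weights."""
    proximity = 1.0 / (1.0 + clip.distance_km)
    recency = 0.5 ** (clip.age_days / half_life_days)
    return proximity * recency


def weighted_noise_estimate(clips: List[TaggedClip]) -> List[float]:
    """Weighted mean of the clip spectra, one value per band."""
    if not clips:
        raise ValueError("at least one geotagged clip is required")
    weights = [clip_weight(c) for c in clips]
    total = sum(weights)
    if total == 0:
        raise ValueError("all clip weights are zero")
    bands = len(clips[0].spectrum)
    return [sum(w * c.spectrum[b] for w, c in zip(weights, clips)) / total
            for b in range(bands)]


clips = [
    TaggedClip([2.0], distance_km=0.0, age_days=0.0),   # weight 1.0
    TaggedClip([6.0], distance_km=1.0, age_days=30.0),  # weight 0.5 * 0.5 = 0.25
]
estimate = weighted_noise_estimate(clips)
```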

    7. Geotagged and weighted environmental audio for enhanced speech recognition accuracy
    Type: Granted patent
    Legal status: In force

    Publication No.: US08175872B2

    Publication date: 2012-05-08

    Application No.: US13250843

    Filing date: 2011-09-30

    IPC classification: G10L21/02 G10L15/00

    CPC classification: G10L21/0208 G10L15/20

    Abstract: Enhancing noisy speech recognition accuracy by receiving geotagged audio signals that correspond to environmental audio recorded by multiple mobile devices in multiple geographic locations, receiving an audio signal that corresponds to an utterance recorded by a particular mobile device, determining a particular geographic location associated with the particular mobile device, selecting a subset of geotagged audio signals and weighting each geotagged audio signal of the subset based on whether the respective audio signal was manually uploaded or automatically updated, and generating a noise model for the particular geographic location using the subset of weighted geotagged audio signals, where noise compensation is performed on the audio signal that corresponds to the utterance using the noise model that has been generated for the particular geographic location.

    8. Predictive pre-recording of audio for voice input
    Type: Granted patent
    Legal status: In force

    Publication No.: US08504185B2

    Publication date: 2013-08-06

    Application No.: US13563504

    Filing date: 2012-07-31

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for providing predictive pre-recording of audio for voice input. In one aspect, a method includes obtaining sensor data from one or more sensors of a mobile device while the mobile device is operating in an inactive state, determining that a user of the mobile device is interacting with the mobile device based on the sensor data, invoking voice input functionality of the mobile device in response to determining that the user of the mobile device is interacting with the mobile device, detecting a voice input, and activating the mobile device in response to detecting the voice input.
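
The sequence in this abstract reads naturally as a small state machine: inactive, then listening once the sensors suggest interaction, then active once a voice input is detected. The sketch below is one possible reading; the state names, the sensor keys (`proximity_near`, `tilt_deg`), and the 45-degree rule are hypothetical and not taken from the patent.

```python
from enum import Enum, auto
from typing import Any, Dict


class State(Enum):
    INACTIVE = auto()   # device idle, voice input not running
    LISTENING = auto()  # voice input functionality invoked
    ACTIVE = auto()     # device activated after a voice input


def user_is_interacting(sensor: Dict[str, Any]) -> bool:
    """Hypothetical rule: device is near the face and tilted upward."""
    return bool(sensor.get("proximity_near", False)) and sensor.get("tilt_deg", 0.0) > 45.0


def step(state: State, sensor: Dict[str, Any], voice_detected: bool) -> State:
    """Advance the device state by one observation."""
    if state is State.INACTIVE and user_is_interacting(sensor):
        return State.LISTENING
    if state is State.LISTENING and voice_detected:
        return State.ACTIVE
    return state


state = State.INACTIVE
state = step(state, {"proximity_near": True, "tilt_deg": 60.0}, voice_detected=False)
state = step(state, {}, voice_detected=True)
```

Voice detected while the device is still inactive is ignored in this sketch, because the voice input functionality has not been invoked yet.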

    9. Acoustic model adaptation using geographic information
    Type: Granted patent
    Legal status: In force

    Publication No.: US08219384B2

    Publication date: 2012-07-10

    Application No.: US13250690

    Filing date: 2011-09-30

    IPC classification: G06F17/20

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing speech recognition accuracy. In one aspect, a method includes receiving an audio signal that corresponds to an utterance recorded by a mobile device, determining a geographic location associated with the mobile device, adapting one or more acoustic models for the geographic location, and performing speech recognition on the audio signal using the one or more acoustic models that are adapted for the geographic location.

    10. Predictive pre-recording of audio for voice input
    Type: Granted patent
    Legal status: In force

    Publication No.: US08195319B2

    Publication date: 2012-06-05

    Application No.: US13250533

    Filing date: 2011-09-30

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for providing predictive pre-recording of audio for voice input. In one aspect, a method includes establishing, as input data, state data that references a state of a mobile device and sensor data that is sensed by one or more sensors of the mobile device, applying a rule or a probabilistic model to the input data, inferring, based on applying the rule or the probabilistic model to the input data, that a user of the mobile device is likely to initiate voice input, and invoking one or more functionalities of the mobile device in response to inferring that the user is likely to initiate voice input.
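
To make the "probabilistic model" option concrete, the sketch below scores the input data with a small logistic model and starts buffering audio once the score crosses a threshold. The feature names, weights, bias, threshold, and the choice of a bounded ring buffer as the invoked functionality are all assumptions for illustration; the abstract does not specify any of them.

```python
import math
from collections import deque
from typing import Deque, Dict

# Hypothetical weights over state and sensor features; not from the patent.
WEIGHTS = {"screen_on": 1.5, "near_face": 2.0, "raised": 1.0}
BIAS = -3.0


def voice_input_probability(features: Dict[str, float]) -> float:
    """Logistic score that the user is about to initiate voice input."""
    z = BIAS + sum(w * float(features.get(name, 0.0)) for name, w in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))


class PreRecorder:
    """Start buffering audio as soon as voice input looks likely."""

    def __init__(self, max_frames: int = 50, threshold: float = 0.7) -> None:
        self.buffer: Deque[bytes] = deque(maxlen=max_frames)  # keeps only the newest frames
        self.threshold = threshold
        self.recording = False

    def update(self, features: Dict[str, float], audio_frame: bytes) -> None:
        """Check the inference, then store the frame if pre-recording is on."""
        if not self.recording and voice_input_probability(features) >= self.threshold:
            self.recording = True
        if self.recording:
            self.buffer.append(audio_frame)


recorder = PreRecorder(max_frames=2)
recorder.update({}, b"frame-a")  # probability too low, frame is dropped
recorder.update({"screen_on": 1, "near_face": 1, "raised": 1}, b"frame-b")
```

A rule-based variant would replace `voice_input_probability` with a boolean check over the same input data; the buffering logic would stay the same.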
