Detecting a user's voice activity using dynamic probabilistic models of speech features
    1.
    Granted invention patent
    Detecting a user's voice activity using dynamic probabilistic models of speech features (status: in force)

    Publication number: US09378755B2

    Publication date: 2016-06-28

    Application number: US14502795

    Filing date: 2014-09-30

    Applicant: Apple Inc.

    CPC classification number: G10L25/84 G10L15/20 G10L19/04 G10L25/27 G10L25/78

    Abstract: A method of detecting voice activity starts by generating probabilistic models that respectively model features of speech dynamically over time. The probabilistic models may model each feature as dependent on a past feature and a current state. The features of speech may include a nonstationary signal presence feature, a periodicity feature, and a sparsity feature. A noise suppressor may then perform noise suppression on an acoustic signal to generate a nonstationary signal presence signal and a noise-suppressed acoustic signal. An LPC module may then perform residual analysis on the noise-suppressed data signal to generate a periodicity signal and a sparsity signal. An inference generator receives the probabilistic models and receives, in real time, the nonstationary signal presence signal, the periodicity signal, and the sparsity signal. The inference generator may then generate, in real time, an estimate of voice activity based on the probabilistic models, the nonstationary signal presence signal, the periodicity signal, and the sparsity signal. Other embodiments are also described.
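
    Illustrative sketch (not taken from the patent): the abstract describes models in which each feature depends on a past feature and a current state, and an inference step that turns the three feature signals into a real-time voice activity estimate. One way this could look is a two-state forward filter in which each feature follows a Gaussian first-order autoregression per state. The Gaussian form, the state names, the transition probabilities, and every number in PARAMS are assumptions made only for illustration.

```python
import math

# Hypothetical parameters (a, b, sigma) per state and feature, so that
# feature_t ~ Normal(a * feature_{t-1} + b, sigma^2). Not from the patent.
PARAMS = {
    "speech": {
        "presence": (0.5, 0.45, 0.15),
        "periodicity": (0.5, 0.45, 0.15),
        "sparsity": (0.5, 0.45, 0.15),
    },
    "silence": {
        "presence": (0.5, 0.05, 0.15),
        "periodicity": (0.5, 0.05, 0.15),
        "sparsity": (0.5, 0.05, 0.15),
    },
}

# Hypothetical state transition probabilities: TRANSITION[previous][current].
TRANSITION = {
    "speech": {"speech": 0.9, "silence": 0.1},
    "silence": {"speech": 0.1, "silence": 0.9},
}


def gaussian_pdf(x, mean, sigma):
    """Density of Normal(mean, sigma^2) evaluated at x."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def vad_step(belief, prev_features, features):
    """One forward-filter update of the belief over the two states."""
    posterior = {}
    for state, feature_params in PARAMS.items():
        # Predict: propagate the previous belief through the transitions.
        prior = sum(belief[s] * TRANSITION[s][state] for s in belief)
        # Update: each feature is conditioned on its past value and the state.
        likelihood = 1.0
        for name, (a, b, sigma) in feature_params.items():
            mean = a * prev_features[name] + b
            likelihood *= gaussian_pdf(features[name], mean, sigma)
        posterior[state] = prior * likelihood
    total = sum(posterior.values())
    return {state: p / total for state, p in posterior.items()}


def run_vad(frames):
    """Return P(speech) for every frame after the first."""
    belief = {"speech": 0.5, "silence": 0.5}
    estimates = []
    prev = frames[0]
    for features in frames[1:]:
        belief = vad_step(belief, prev, features)
        estimates.append(belief["speech"])
        prev = features
    return estimates
```

    Here each frame is a dict with the keys "presence", "periodicity", and "sparsity", standing in for the three signals named in the abstract. A real system would presumably learn the parameters from data rather than fix them by hand.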

    Robust speech recognition in the presence of echo and noise using multiple signals for discrimination

    Publication number: US09672821B2

    Publication date: 2017-06-06

    Application number: US14835588

    Filing date: 2015-08-25

    Applicant: APPLE INC.

    CPC classification number: G10L15/20 G10L15/16 G10L2021/02082

    Abstract: Systems and methods are described for a speech recognition system having a speech processor that is trained to recognize speech by considering (1) a raw microphone signal that includes an echo signal and (2) different types of echo information signals from an echo cancellation system (and optionally different types of ambient noise suppression signals from a noise suppressor). The different types of echo information signals may include those used for echo cancellation and those having echo information. The speech recognition system may convert the raw microphone signal and the different types of echo information signals (and the optional noise suppression signals) into spectral features in the form of vectors, and may use a concatenator to combine the feature vectors into a total vector (for a period of time) that is used both to train the speech processor and, during use of the speech processor, to recognize speech.
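
    Illustrative sketch (not taken from the patent): the conversion and concatenation step in the abstract can be pictured as computing one spectral feature vector per input signal and joining them, in a fixed order, into a single total vector for the frame. The log-magnitude DFT features, the bin count, and the signal names used in the usage note are assumptions made only for illustration.

```python
import cmath
import math


def log_spectrum(frame, n_bins):
    """Log-magnitude of the first n_bins DFT bins of one frame (naive DFT)."""
    n = len(frame)
    features = []
    for k in range(n_bins):
        acc = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(frame)
        )
        # Small offset avoids log(0) for silent frames.
        features.append(math.log(abs(acc) + 1e-9))
    return features


def total_vector(signals, n_bins=4):
    """Concatenate per-signal spectral feature vectors into one vector.

    `signals` maps a (hypothetical) signal name to one frame of samples;
    the dict's insertion order fixes the layout of the total vector.
    """
    vector = []
    for frame in signals.values():
        vector.extend(log_spectrum(frame, n_bins))
    return vector
```

    For example, calling total_vector with three frames keyed "mic", "echo_reference", and "echo_estimate" (hypothetical names for the raw microphone signal and two echo information signals) and n_bins=4 yields a 12-element vector. The same layout would have to be used both when training the speech processor and when recognizing speech.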

    ROBUST SPEECH RECOGNITION IN THE PRESENCE OF ECHO AND NOISE USING MULTIPLE SIGNALS FOR DISCRIMINATION
    4.
    Invention patent application
    ROBUST SPEECH RECOGNITION IN THE PRESENCE OF ECHO AND NOISE USING MULTIPLE SIGNALS FOR DISCRIMINATION (status: in force)

    Publication number: US20160358602A1

    Publication date: 2016-12-08

    Application number: US14835588

    Filing date: 2015-08-25

    Applicant: APPLE INC.

    CPC classification number: G10L15/20 G10L15/16 G10L2021/02082

    Abstract: Systems and methods are described for a speech recognition system having a speech processor that is trained to recognize speech by considering (1) a raw microphone signal that includes an echo signal and (2) different types of echo information signals from an echo cancellation system (and optionally different types of ambient noise suppression signals from a noise suppressor). The different types of echo information signals may include those used for echo cancellation and those having echo information. The speech recognition system may convert the raw microphone signal and the different types of echo information signals (and the optional noise suppression signals) into spectral features in the form of vectors, and may use a concatenator to combine the feature vectors into a total vector (for a period of time) that is used both to train the speech processor and, during use of the speech processor, to recognize speech.

    DETECTING A USER'S VOICE ACTIVITY USING DYNAMIC PROBABILISTIC MODELS OF SPEECH FEATURES
    6.
    Invention patent application
    DETECTING A USER'S VOICE ACTIVITY USING DYNAMIC PROBABILISTIC MODELS OF SPEECH FEATURES (status: in force)

    Publication number: US20150348572A1

    Publication date: 2015-12-03

    Application number: US14502795

    Filing date: 2014-09-30

    Applicant: Apple Inc.

    CPC classification number: G10L25/84 G10L15/20 G10L19/04 G10L25/27 G10L25/78

    Abstract: A method of detecting voice activity starts by generating probabilistic models that respectively model features of speech dynamically over time. The probabilistic models may model each feature as dependent on a past feature and a current state. The features of speech may include a nonstationary signal presence feature, a periodicity feature, and a sparsity feature. A noise suppressor may then perform noise suppression on an acoustic signal to generate a nonstationary signal presence signal and a noise-suppressed acoustic signal. An LPC module may then perform residual analysis on the noise-suppressed data signal to generate a periodicity signal and a sparsity signal. An inference generator receives the probabilistic models and receives, in real time, the nonstationary signal presence signal, the periodicity signal, and the sparsity signal. The inference generator may then generate, in real time, an estimate of voice activity based on the probabilistic models, the nonstationary signal presence signal, the periodicity signal, and the sparsity signal. Other embodiments are also described.
