Audio output masking for improved automatic speech recognition

    Publication No.: US09704478B1

    Publication date: 2017-07-11

    Application No.: US14094591

    Filing date: 2013-12-02

    CPC classification number: G10L21/0232 G10L15/00 G10L2021/02082

    Abstract: Features are disclosed for filtering portions of an output audio signal in order to improve automatic speech recognition on an input signal which may include a representation of the output signal. A signal that includes audio content can be received, and a frequency or band of frequencies can be selected to be filtered from the signal. The frequency band may correspond to a desired frequency band for speech recognition. An input signal can be obtained comprising audio data corresponding to a user utterance and presentation of the output signal. Automatic speech recognition can be performed on the input signal. In some cases, an acoustic model trained for use with such frequency band filtering may be used to perform speech recognition.
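
As a rough illustration of the band filtering described in this abstract, the sketch below removes a frequency band from an output signal by zeroing DFT bins. It is a minimal sketch, not the patented method: the function names (`mask_band`, `band_energy`), the naive DFT, and the choice of hard bin zeroing are all assumptions made for illustration.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform (O(n^2)); adequate for short demo signals."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse DFT, returning the real part of each sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def bin_frequency(k, n, sample_rate):
    """Frequency in Hz of bin k; bins above n/2 mirror the negative frequencies."""
    return min(k, n - k) * sample_rate / n


def mask_band(signal, sample_rate, low_hz, high_hz):
    """Zero every DFT bin whose frequency lies in [low_hz, high_hz] (hypothetical helper)."""
    n = len(signal)
    spectrum = dft(signal)
    for k in range(n):
        if low_hz <= bin_frequency(k, n, sample_rate) <= high_hz:
            spectrum[k] = 0
    return idft(spectrum)


def band_energy(signal, sample_rate, low_hz, high_hz):
    """Sum of squared DFT magnitudes inside [low_hz, high_hz]."""
    n = len(signal)
    spectrum = dft(signal)
    return sum(abs(spectrum[k]) ** 2 for k in range(n)
               if low_hz <= bin_frequency(k, n, sample_rate) <= high_hz)


if __name__ == "__main__":
    sample_rate, n = 8000, 160
    # Stand-in for output audio: one tone outside and one inside the masked band.
    output = [math.sin(2 * math.pi * 500 * t / sample_rate)
              + math.sin(2 * math.pi * 2000 * t / sample_rate) for t in range(n)]
    masked = mask_band(output, sample_rate, 1500, 2500)
    print(band_energy(masked, sample_rate, 1500, 2500))  # close to zero
```

With the band removed from the playback signal, any energy a microphone later picks up in that band is more likely to come from the user's utterance than from the device's own output, which is the effect the abstract relies on.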

    Beamformer design using constrained convex optimization in three-dimensional space
    Granted invention patent (status: in force)

    Publication No.: US09591404B1

    Publication date: 2017-03-07

    Application No.: US14040138

    Filing date: 2013-09-27

    Abstract: Embodiments of systems and methods are described for determining weighting coefficients based at least in part on using convex optimization subject to one or more constraints to approximate a three-dimensional beampattern. In some implementations, the approximated three-dimensional beampattern comprises a main lobe that includes a look direction for which waveforms detected by a sensor array are not suppressed and a side lobe that includes other directions for which waveforms detected by the microphone array are suppressed. The one or more constraints can include a constraint that suppression of waveforms received by the sensor array from the side lobe is greater than a threshold. In some implementations, the threshold can be dependent on at least one of an angular direction of the waveform and a frequency of the waveform.
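
One common way to pose such a design as a convex program is sketched below. This is an illustrative formulation under stated assumptions, not the formulation claimed in the patent: the weight vector $\mathbf{w}(f)$ for an $M$-element array, the steering vector $\mathbf{d}(\theta,\phi,f)$, the look direction $(\theta_0,\phi_0)$, the $S$ sampled side-lobe directions, the bound $\epsilon$, and the minimum-norm objective are all symbols introduced here.

```latex
\begin{aligned}
\min_{\mathbf{w}(f)\in\mathbb{C}^{M}} \quad & \lVert \mathbf{w}(f) \rVert_2^2 \\
\text{subject to} \quad & \mathbf{w}(f)^{H}\,\mathbf{d}(\theta_0,\phi_0,f) = 1, \\
& \bigl|\mathbf{w}(f)^{H}\,\mathbf{d}(\theta_s,\phi_s,f)\bigr| \le \epsilon(\theta_s,\phi_s,f),
\quad s = 1,\dots,S .
\end{aligned}
```

The equality constraint keeps the look direction unsuppressed, and each inequality bounds the array response in one side-lobe direction. Requiring suppression greater than a threshold is equivalent to requiring the response magnitude to stay below a bound, and letting $\epsilon$ vary with direction and frequency mirrors the angle- and frequency-dependent threshold mentioned in the abstract. The objective is convex, the equality is linear, and each magnitude bound is a second-order cone constraint, so the problem as written is convex.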

    Method and system for beam selection in microphone array beamformers
    Granted invention patent (status: in force)

    Publication No.: US09432769B1

    Publication date: 2016-08-30

    Application No.: US14447498

    Filing date: 2014-07-30

    Abstract: Embodiments of systems and methods are described for determining which of a plurality of beamformed audio signals to select for signal processing. In some embodiments, a plurality of audio input signals are received from a microphone array comprising a plurality of microphones. A plurality of beamformed audio signals are determined based on the plurality of input audio signals, each beamformed audio signal corresponding to a direction. A plurality of signal features may be determined for each beamformed audio signal. Smoothed features may be determined for each beamformed audio signal based on at least a portion of the plurality of signal features. The beamformed audio signal corresponding to the maximum smoothed feature may be selected for further processing.
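
The selection step described in this abstract can be sketched as follows. This is a minimal sketch under assumptions: it uses a single scalar feature per beam per frame (for example, an energy or signal-to-noise estimate), and exponential smoothing with a hypothetical factor `alpha`; the abstract does not say which features or which smoothing method are used.

```python
def select_beam(feature_frames, alpha=0.9):
    """Pick the beam whose exponentially smoothed feature is largest.

    feature_frames: list of frames; each frame is a list holding one
    feature value per beam (an assumed, simplified feature layout).
    alpha: smoothing factor in [0, 1); larger values react more slowly.
    Returns (index of the selected beam, final smoothed values).
    """
    if not feature_frames:
        raise ValueError("need at least one frame of features")
    # Initialise the smoothed features with the first frame.
    smoothed = list(feature_frames[0])
    for frame in feature_frames[1:]:
        smoothed = [alpha * s + (1 - alpha) * x for s, x in zip(smoothed, frame)]
    best = max(range(len(smoothed)), key=smoothed.__getitem__)
    return best, smoothed


if __name__ == "__main__":
    # Three beams; beam 2 has a one-frame spike that smoothing should ignore.
    frames = [
        [1.0, 2.0, 0.5],
        [1.0, 2.0, 9.0],
        [1.0, 2.0, 0.5],
    ]
    best, smoothed = select_beam(frames)
    print(best, smoothed)
```

Smoothing keeps a brief spike in one beam from causing the selection to jump away from the beam that is consistently strongest.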

    Dereverberation and noise reduction

    Publication No.: US12272369B1

    Publication date: 2025-04-08

    Application No.: US17578737

    Filing date: 2022-01-19

    Abstract: A system configured to improve audio processing by performing dereverberation and noise reduction during a communication session. In some examples, the system may include a deep neural network (DNN) configured to perform speech enhancement, which is located after an Acoustic Echo Cancellation (AEC) component. For example, the DNN may process isolated audio data output by the AEC component to jointly mitigate additive noise and reverberation. In other examples, the system may include a DNN configured to perform acoustic interference cancellation, which may jointly mitigate additive noise, reverberation, and residual echo, removing the need to perform residual echo suppression processing. The DNN is configured to process complex-valued spectrograms corresponding to the isolated audio data and/or estimated echo data generated by the AEC component.
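
To make the phrase "complex-valued spectrograms" concrete, the sketch below computes a short-time Fourier transform of the AEC outputs and stacks real and imaginary parts into per-bin feature channels. It covers only input preparation and is an assumption-laden illustration: the function names (`stft`, `dnn_input`), the Hann window, the frame and hop sizes, and the four-channel layout are all hypothetical, and the network itself is not shown.

```python
import cmath
import math


def stft(signal, frame_len=8, hop=4):
    """Complex spectrogram: one list of frame_len // 2 + 1 complex bins per frame."""
    # Periodic Hann window (an assumed choice; the abstract names no window).
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = [signal[start + n] * window[n] for n in range(frame_len)]
        frames.append([
            sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len))
            for k in range(frame_len // 2 + 1)
        ])
    return frames


def dnn_input(isolated, echo_estimate, frame_len=8, hop=4):
    """Stack both complex spectrograms as four real channels per time-frequency bin.

    Channel order (hypothetical): isolated real, isolated imaginary,
    echo-estimate real, echo-estimate imaginary.
    """
    iso = stft(isolated, frame_len, hop)
    echo = stft(echo_estimate, frame_len, hop)
    return [
        [(a.real, a.imag, b.real, b.imag) for a, b in zip(frame_a, frame_b)]
        for frame_a, frame_b in zip(iso, echo)
    ]


if __name__ == "__main__":
    isolated = [math.sin(0.7 * t) for t in range(16)]        # stand-in for AEC output
    echo_estimate = [0.3 * math.cos(0.4 * t) for t in range(16)]  # stand-in for echo estimate
    features = dnn_input(isolated, echo_estimate)
    print(len(features), len(features[0]), len(features[0][0]))
```

Keeping the real and imaginary parts, rather than magnitudes alone, preserves phase information, which is what distinguishes a complex-valued spectrogram input from a magnitude-only one.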

    Beamforming for a wearable computer

    Publication No.: US10863270B1

    Publication date: 2020-12-08

    Application No.: US16361808

    Filing date: 2019-03-22

    Abstract: A wearable computer is configured to use beamforming techniques to isolate a user's speech from extraneous audio signals occurring within a physical environment. A microphone array of the wearable computer may generate audio signal data from an utterance from a user's mouth. A motion sensor(s) of the wearable computer may generate motion data from movement of the wearable computer. This motion data may be used to determine a direction vector pointing from the wearable computer to the user's mouth, and a beampattern may be defined that has a beampattern direction in substantial alignment with the determined direction vector to focus the microphone array on the user's mouth for speech isolation.
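
Once a direction vector toward the user's mouth is available, one simple way to point a beam along it is delay-and-sum steering. The sketch below computes the per-microphone delays for that approach. It is a minimal sketch under assumptions: the abstract does not say how motion data is converted into the vector, so the vector is taken as an input; the far-field plane-wave model, the speed-of-sound constant, and the function names are choices made here, not details from the patent.

```python
import math

SPEED_OF_SOUND = 343.0  # metres per second; assumed room-temperature value


def normalize(vector):
    """Scale a 3-D vector to unit length."""
    norm = math.sqrt(sum(c * c for c in vector))
    if norm == 0:
        raise ValueError("direction vector must be non-zero")
    return [c / norm for c in vector]


def steering_delays(mic_positions, direction):
    """Delays (seconds) that time-align a plane wave arriving from `direction`.

    mic_positions: list of (x, y, z) coordinates in metres.
    direction: vector pointing from the device toward the mouth.
    A microphone further along `direction` hears the wavefront earlier,
    so it receives a larger delay; the farthest microphone gets zero.
    """
    unit = normalize(direction)
    projections = [sum(p * u for p, u in zip(pos, unit)) for pos in mic_positions]
    reference = min(projections)
    return [(proj - reference) / SPEED_OF_SOUND for proj in projections]


if __name__ == "__main__":
    mics = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0)]
    print(steering_delays(mics, (2.0, 0.0, 0.0)))
```

Summing the microphone signals after applying these delays reinforces sound arriving along the direction vector. A mouth close to a wearable device may violate the far-field assumption, in which case distance-dependent delays would be needed instead.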
