Enhancement of Multichannel Audio
    1.
    Patent application
    Status: in force

    Publication No.: US20120310635A1

    Publication date: 2012-12-06

    Application No.: US13571344

    Filing date: 2012-08-10

    Applicant: Hannes Muesch

    Inventor: Hannes Muesch

    IPC classification: G10L21/02

    Abstract: The invention relates to audio signal processing. More specifically, the invention relates to enhancing multichannel audio, such as television audio, by applying a gain to the audio that has been smoothed between portions of the audio. The invention relates to methods, apparatus for performing such methods, and to software stored on a computer-readable medium for causing a computer to perform such methods.

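    The abstract says only that a gain is smoothed between portions of the audio before it is applied. A minimal sketch of that idea follows, assuming a one-pole smoother across per-segment target gains. The function names, the `alpha` constant, and the fixed segment length are hypothetical and are not taken from the patent.

```python
def smooth_gains(target_gains, alpha=0.5):
    """One-pole smoothing of per-segment gains (alpha is an assumed constant)."""
    smoothed = []
    prev = target_gains[0] if target_gains else 1.0
    for g in target_gains:
        # Move only part of the way toward each new target gain.
        prev = alpha * prev + (1.0 - alpha) * g
        smoothed.append(prev)
    return smoothed


def apply_gains(channels, gains, segment_len):
    """Scale every channel of a multichannel signal by its segment's gain.

    `gains` must be non-empty; samples past the last segment reuse the last gain.
    """
    out = []
    for ch in channels:
        out.append([x * gains[min(i // segment_len, len(gains) - 1)]
                    for i, x in enumerate(ch)])
    return out
```

    Applying the same smoothed gain to every channel is a choice made for this sketch, and the abstract does not specify how channels are treated.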

    AUDIO SIGNAL PROCESSING SYSTEM AND AUDIO SIGNAL PROCESSING METHOD
    2.
    Patent application
    Status: in force

    Publication No.: US20120095755A1

    Publication date: 2012-04-19

    Application No.: US13330100

    Filing date: 2011-12-19

    IPC classification: G10L21/02

    Abstract: An audio signal processing system including a time-frequency conversion unit which converts an audio signal in the time domain into the frequency domain in frame units so as to calculate a frequency spectrum of the audio signal, a spectral change calculation unit which calculates an amount of change between a frequency spectrum of a first frame and a frequency spectrum of a second frame before the first frame based on the frequency spectrum of the first frame and the frequency spectrum of the second frame, and a judgment unit which judges the type of the noise which is included in the audio signal of the first frame in accordance with the amount of spectral change.

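    The three units named in the abstract (time-frequency conversion, spectral change calculation, judgment) can be sketched as below. This is an illustration under stated assumptions, not the patented method: the naive DFT, the normalised absolute-difference measure, the 0.3 threshold, and the "stationary" / "non-stationary" labels are all hypothetical choices.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Naive DFT magnitude for bins 0..N/2 (fine for short illustrative frames)."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame)))
            for k in range(n // 2 + 1)]


def spectral_change(curr_spec, prev_spec):
    """Normalised sum of absolute bin-wise differences between two spectra."""
    diff = sum(abs(c - p) for c, p in zip(curr_spec, prev_spec))
    norm = sum(prev_spec) + sum(curr_spec) + 1e-12
    return diff / norm


def judge_noise_type(curr_frame, prev_frame, threshold=0.3):
    """Label the noise from the amount of spectral change (labels are assumed)."""
    change = spectral_change(magnitude_spectrum(curr_frame),
                             magnitude_spectrum(prev_frame))
    return "non-stationary" if change > threshold else "stationary"
```

    With this measure, two identical frames give a change of 0, and two frames with non-overlapping spectra give a change close to 1.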

Voiced/unvoiced classification of speech for excitation codebook selection in CELP speech decoding during frame erasures
    3.
    Granted patent
    Status: expired

    Publication No.: US5732389A

    Publication date: 1998-03-24

    Application No.: US482708

    Filing date: 1995-06-07

    CPC classification: G10L25/93 G10L2025/932

    Abstract: A CELP speech decoder includes a first portion comprising an adaptive codebook and a second portion comprising a fixed codebook. The CS-ACELP decoder generates a speech excitation signal selectively based on output signals from said first and second portions when said decoder fails to receive reliably at least a portion of a current frame of compressed speech information. The decoder does this by classifying the speech signal to be generated as periodic (voiced) or non-periodic (unvoiced) and then generating an excitation signal based on this classification. If the speech signal is classified as periodic, the excitation signal is generated based on the output signal from the first portion and not on the output signal from the second portion. If the speech signal is classified as non-periodic, the excitation signal is generated based on the output signal from said second portion and not on the output signal from said first portion.

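    The selection rule in the abstract (periodic: adaptive codebook only; non-periodic: fixed codebook only) can be illustrated with the sketch below. The abstract does not say how the periodic/non-periodic decision is made, so the normalised-correlation classifier, its 0.5 threshold, the pseudo-random stand-in for a fixed-codebook entry, and all function names are assumptions made for illustration.

```python
import random


def is_periodic(past_excitation, pitch_lag, threshold=0.5):
    """Assumed classifier: normalised correlation of the signal with itself at pitch_lag.

    pitch_lag must be at least 1 and shorter than the history.
    """
    x = past_excitation[pitch_lag:]
    y = past_excitation[:-pitch_lag]
    num = sum(a * b for a, b in zip(x, y))
    den = (sum(a * a for a in x) * sum(b * b for b in y)) ** 0.5
    return den > 0 and num / den > threshold


def conceal_excitation(past_excitation, pitch_lag, frame_len, fixed_gain=0.1, seed=0):
    """Build an excitation for an erased frame from one codebook only."""
    if is_periodic(past_excitation, pitch_lag):
        # Adaptive-codebook contribution only: repeat the past excitation at the pitch lag.
        buf = list(past_excitation)
        for _ in range(frame_len):
            buf.append(buf[-pitch_lag])
        return "adaptive", buf[-frame_len:]
    # Fixed-codebook contribution only: a pseudo-random vector stands in for a codebook entry.
    rng = random.Random(seed)
    return "fixed", [fixed_gain * rng.uniform(-1.0, 1.0) for _ in range(frame_len)]
```

    The returned tag records which portion of the decoder supplied the excitation, which makes the either/or behaviour described in the abstract easy to check.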

    Method and device for discriminating voiced and unvoiced sounds
    4.
    Granted patent
    Status: expired

    Publication No.: US5664052A

    Publication date: 1997-09-02

    Application No.: US48034

    Filing date: 1993-04-14

    Abstract: A method and a device for discriminating a voiced sound from an unvoiced sound or background noise in speech signals are disclosed. Each block or frame of input speech signals is divided into plural sub-blocks and the standard deviation, effective value or the peak value is detected in a detection unit for detecting statistical characteristics from one sub-block to another. A bias detection unit detects a bias on the time scale of the standard deviation, effective value or the peak value to decide whether the speech signals are voiced or unvoiced from one block to another.

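    A minimal sketch of the sub-block idea follows, using the effective (RMS) value as the per-sub-block statistic. The abstract does not define the bias measure, so the ratio of arithmetic mean to geometric mean used here, the 1.5 threshold, the default of 8 sub-blocks, and the rule that a strongly uneven distribution counts as voiced are all assumptions of this sketch.

```python
import math


def sub_block_rms(block, num_sub_blocks):
    """Effective (RMS) value of each equal-length sub-block."""
    size = len(block) // num_sub_blocks
    return [math.sqrt(sum(x * x for x in block[i * size:(i + 1) * size]) / size)
            for i in range(num_sub_blocks)]


def temporal_bias(values, eps=1e-9):
    """Arithmetic mean over geometric mean: about 1.0 when flat, larger when uneven."""
    arith = sum(values) / len(values)
    geo = math.exp(sum(math.log(v + eps) for v in values) / len(values))
    return arith / geo


def is_voiced(block, num_sub_blocks=8, threshold=1.5):
    """Decide per block whether the RMS values are biased along the time axis."""
    return temporal_bias(sub_block_rms(block, num_sub_blocks)) > threshold
```

    A block whose energy is spread evenly over its sub-blocks yields a ratio near 1 and is treated as unvoiced or noise, while a block with energy concentrated in a few sub-blocks yields a larger ratio.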

    NEURAL TEMPORAL BEAMFORMER FOR NOISE REDUCTION IN SINGLE-CHANNEL AUDIO SIGNALS

    Publication No.: US20240257827A1

    Publication date: 2024-08-01

    Application No.: US18160278

    Filing date: 2023-01-26

    Abstract: This disclosure provides methods, devices, and systems for audio signal processing. The present implementations more specifically relate to multi-frame beamforming using neural network supervision. In some aspects, a speech enhancement system may include a linear filter, a deep neural network (DNN), a voice activity detector (VAD), and an IFC calculator. The DNN infers a probability of speech (pDNN) in a current frame of a single-channel audio signal based on a neural network model. The VAD determines whether speech is present or absent in the current audio frame based on the probability of speech pDNN. The IFC calculator may estimate an IFC vector based on the output of the DNN (such as the probability of speech pDNN) and the output of the VAD (such as an indication of whether speech is present in the current frame). The linear filter uses the IFC vector to suppress noise in the current audio frame.

    Voice Activity Detector for Audio Signals
    7.
    Patent application
    Status: in force

    Publication No.: US20150243300A1

    Publication date: 2015-08-27

    Application No.: US14701622

    Filing date: 2015-05-01

    Inventor: Hannes Muesch

    IPC classification: G10L25/78 G10L19/012

    Abstract: According to one aspect, a method for detecting voice activity is disclosed, the method including receiving a frame of an input audio signal, the input audio signal having a sample rate; dividing the frame into a plurality of subbands based on the sample rate, the plurality of subbands including at least a lowest subband and a highest subband; filtering the lowest subband with a moving average filter to reduce an energy of the lowest subband; estimating a noise level for each of the plurality of subbands; calculating a signal to noise ratio value for each of the plurality of subbands; and determining a speech activity level of the frame based on an average of the calculated signal to noise ratio values and a weighted average of an energy of each of the plurality of subbands. Other aspects include audio decoders that decode audio that was encoded using the methods described herein.

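    The later steps of the abstract (per-subband noise estimate, per-subband signal to noise ratio, and a score built from the average SNR and a weighted average energy) can be sketched as follows. Everything concrete here is an assumption: the 1 kHz subband width, the minimum-tracking noise floor, the dB scale, the equal mixing weight, and the class and function names. The moving-average filtering of the lowest subband is left out because the abstract does not say in which domain it operates, and the subband energies are taken as given inputs.

```python
import math


def subband_count(sample_rate_hz, band_width_hz=1000.0):
    """Assumed rule: one subband per band_width_hz up to Nyquist, at least two."""
    return max(2, int((sample_rate_hz / 2) // band_width_hz))


class SubbandVad:
    """Tracks a noise floor per subband and scores each frame's speech activity."""

    def __init__(self, num_subbands, noise_rise=1.05, snr_weight=0.5):
        self.noise = [None] * num_subbands  # per-subband noise energy estimate
        self.noise_rise = noise_rise        # assumed slow upward drift of the floor
        self.snr_weight = snr_weight        # assumed mix between the two cues

    def _update_noise(self, energies):
        for i, e in enumerate(energies):
            if self.noise[i] is None:
                self.noise[i] = e
            else:
                # Follow drops at once, rise slowly: a simple minimum-tracking floor.
                self.noise[i] = min(e, self.noise[i] * self.noise_rise)

    def activity(self, energies, band_weights):
        """Score from the mean subband SNR (dB) and a weighted mean energy (dB)."""
        self._update_noise(energies)
        eps = 1e-12
        snrs = [10.0 * math.log10((e + eps) / (n + eps))
                for e, n in zip(energies, self.noise)]
        mean_snr = sum(snrs) / len(snrs)
        weighted_energy = (sum(w * e for w, e in zip(band_weights, energies))
                           / sum(band_weights))
        energy_db = 10.0 * math.log10(weighted_energy + eps)
        return self.snr_weight * mean_snr + (1.0 - self.snr_weight) * energy_db
```

    In use, one `SubbandVad` instance would be kept per stream and `activity` called once per frame, so that the noise floor carries over between frames; a frame much louder than the tracked floor then scores higher than a frame at the floor.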

    Enhancement of multichannel audio
    8.
    Granted patent
    Status: in force

    Publication No.: US08271276B1

    Publication date: 2012-09-18

    Application No.: US13463600

    Filing date: 2012-05-03

    Applicant: Hannes Muesch

    Inventor: Hannes Muesch

    IPC classification: G10L19/14

    Abstract: The invention relates to audio signal processing. More specifically, the invention relates to enhancing multichannel audio, such as television audio, by applying a gain to the audio that has been smoothed between segments of the audio. The invention relates to methods, apparatus for performing such methods, and to software stored on a computer-readable medium for causing a computer to perform such methods.


    Method and device for discriminating voiced and unvoiced sounds
    9.
    Granted patent
    Status: expired

    Publication No.: US5809455A

    Publication date: 1998-09-15

    Application No.: US753347

    Filing date: 1996-11-25

    Abstract: A method and a device for discriminating a voiced sound from an unvoiced sound or background noise in speech signals are disclosed. Each block or frame of input speech signals is divided into plural sub-blocks and the standard deviation, effective value or the peak value is detected in a detection unit for detecting statistical characteristics from one sub-block to another. A bias detection unit detects a bias on the time scale of the standard deviation, effective value or the peak value to decide whether the speech signals are voiced or unvoiced from one block to another.
