Abstract:
A method for continuously estimating reverberation decay comprising receiving a sequence of audio data samples, determining whether a plateau is present in the sequence of audio data samples, and generating one or more reverberation parameters from the sequence of audio data samples if it is determined that the plateau is present.
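As a rough illustration of that structure only, the sketch below declares a plateau when the slope of recent frame energies is nearly flat, and generates a simple decay-slope parameter once a plateau is found; the window size, slope threshold, and function names are assumptions, not the claimed method.

import numpy as np

def plateau_present(frame_energies_db, window=10, max_slope_db=0.5):
    # Illustrative check: declare a plateau when the energy trend over the
    # most recent `window` frames is nearly flat (fitted slope below
    # max_slope_db per frame). Window and threshold values are placeholders.
    if len(frame_energies_db) < window:
        return False
    recent = np.asarray(frame_energies_db[-window:], dtype=float)
    slope = np.polyfit(np.arange(window), recent, 1)[0]
    return abs(slope) < max_slope_db

def reverberation_parameters(frame_energies_db):
    # Mirror the abstract's structure: generate a parameter only when a
    # plateau is present. Here the parameter is a straight-line fit to the
    # energy decay, whose slope (dB per frame) could be mapped to an
    # RT60-style decay time.
    if not plateau_present(frame_energies_db):
        return None
    energies = np.asarray(frame_energies_db, dtype=float)
    decay_slope = np.polyfit(np.arange(len(energies)), energies, 1)[0]
    return {"decay_slope_db_per_frame": decay_slope}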
Abstract:
A system for audio processing comprising: an initial background statistical model system configured to generate an initial background statistical model using a predetermined sample size of audio data; a parameter computation system configured to generate parametric data for the audio data, including cepstral and energy parameters; a background statistics computation system configured to generate preliminary background statistics for determining whether speech has been detected; a first speech detection system configured to determine whether speech was present in the initial sample of audio data; an adaptive background statistical model system configured to provide an adaptive background statistical model for use in continuous processing of audio data for speech detection; a parameter computation system configured to calculate cepstral parameters, energy parameters, and other suitable parameters for speech detection; a speech/non-speech classification system configured to classify individual frames as speech frames or non-speech frames based on the computed parameters and the adaptive background statistical model data; a background statistics update system configured to update the background statistical model based on detected speech and non-speech frames; and a second speech detection system configured to perform speech detection processing and to generate a suitable indicator for use in processing audio data that is determined to include speech signals.
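A minimal sketch of the classify-and-update loop this describes might look like the following, assuming each frame is summarized by a feature vector of cepstral coefficients and log energy; the diagonal Gaussian background model, distance threshold, and adaptation rate are illustrative choices, not the disclosed statistics.

import numpy as np

class BackgroundModel:
    # Running Gaussian model of background (non-speech) frame features such as
    # cepstral coefficients and log energy. The model form, threshold, and
    # adaptation rate alpha are assumptions for illustration only.
    def __init__(self, init_frames, alpha=0.05, speech_threshold=3.0):
        feats = np.asarray(init_frames, dtype=float)   # shape (n_frames, n_features)
        self.mean = feats.mean(axis=0)
        self.var = feats.var(axis=0) + 1e-6
        self.alpha = alpha                             # background adaptation rate
        self.threshold = speech_threshold              # normalized-distance threshold

    def classify(self, frame_features):
        f = np.asarray(frame_features, dtype=float)
        # Normalized distance of the frame from the background model.
        d = np.sqrt(np.mean((f - self.mean) ** 2 / self.var))
        is_speech = d > self.threshold
        if not is_speech:
            # Update the background statistics only on non-speech frames.
            self.mean = (1 - self.alpha) * self.mean + self.alpha * f
            self.var = (1 - self.alpha) * self.var + self.alpha * (f - self.mean) ** 2
        return is_speech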
Abstract:
A selective audio source enhancement system includes a processor and a memory, and a pre-processing unit configured to receive audio data including a target audio signal, and to perform sub-band domain decomposition of the audio data to generate buffered outputs. In addition, the system includes a target source detection unit configured to receive the buffered outputs, and to generate a target presence probability corresponding to the target audio signal, as well as a spatial filter estimation unit configured to receive the target presence probability, and to transform frames buffered in each sub-band into a higher-resolution frequency domain. The system also includes a spectral filtering unit configured to retrieve a multichannel image of the target audio signal and noise signals associated with the target audio signal, and an audio synthesis unit configured to extract an enhanced mono signal corresponding to the target audio signal from the multichannel image.
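The sketch below compresses that pipeline into a single-channel approximation: an STFT stands in for the sub-band decomposition, a crude SNR-based soft mask stands in for the target presence probability, and the multichannel spatial stages are omitted entirely; all thresholds and names are assumptions rather than the described units.

import numpy as np
from scipy.signal import stft, istft

def enhance_target(x, fs, noise_floor_percentile=20):
    # Sub-band decomposition (an STFT standing in for the pre-processing unit).
    f, t, X = stft(x, fs=fs, nperseg=512)
    mag = np.abs(X)

    # Crude stand-in for the target presence probability: compare each bin's
    # magnitude against a per-band noise floor taken as a percentile over time.
    noise_floor = np.percentile(mag, noise_floor_percentile, axis=1, keepdims=True)
    snr = mag / (noise_floor + 1e-12)
    presence = snr ** 2 / (1.0 + snr ** 2)        # soft presence in [0, 1)

    # Spectral filtering: scale each bin by its presence estimate.
    Y = X * presence

    # Audio synthesis: back to an enhanced mono time-domain signal.
    _, y = istft(Y, fs=fs, nperseg=512)
    return y[: len(x)]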
Abstract:
A system for processing audio data comprising: a linear demixing system configured to receive a plurality of sub-band audio channels and to generate an audio output and a noise output; a spatial likelihood system coupled to the linear demixing system, the spatial likelihood system configured to receive the audio output and the noise output and to generate a spatial likelihood function; a sequential Gaussian mixture model system coupled to the spatial likelihood system, the sequential Gaussian mixture model system configured to generate a plurality of model parameters; a Bayesian probability estimator system configured to receive the plurality of model parameters and a speech/noise presence probability and to generate a noise power spectral density and spectral gains; and a spectral filtering system configured to receive the spectral gains and to apply the spectral gains to noisy input mixtures.
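As a hedged illustration of the last two stages only, the fragment below computes Wiener-style spectral gains from a noise power spectral density and a speech-presence probability and applies them to a noisy spectrum; the gain floor and blending rule are assumptions, not the claimed Bayesian estimator.

import numpy as np

def spectral_gains(noisy_psd, noise_psd, speech_presence_prob, gain_floor=0.1):
    # Wiener-style gains blended by a speech-presence probability; a simplified
    # stand-in for the Bayesian estimator / spectral filtering stages.
    snr = np.maximum(noisy_psd / (noise_psd + 1e-12) - 1.0, 0.0)  # rough a-priori SNR
    wiener = snr / (1.0 + snr)
    # The presence probability pulls the gain toward the floor when speech is unlikely.
    gains = speech_presence_prob * wiener + (1.0 - speech_presence_prob) * gain_floor
    return np.clip(gains, gain_floor, 1.0)

def apply_gains(noisy_spectrum, gains):
    # Spectral filtering: apply per-bin gains to the noisy input mixture.
    return noisy_spectrum * gains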
Abstract:
Traditionally, echo cancellation has employed linear adaptive filters to cancel echoes in a two-way communication system. The rate of adaptation is often dynamic and varies over time. Disclosed are novel rates of adaptation that perform well in the presence of background noise, during double talk, and with echo path changes. Additionally, the echo or residual echo can be further suppressed with non-linear processing performed in a joint frequency-time domain.
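A minimal sketch of a normalized LMS canceller with one possible dynamic adaptation rate is shown below; the rule for shrinking the step size when the residual dominates the far-end energy is an illustrative double-talk heuristic, not the disclosed rates of adaptation, and no non-linear residual echo suppression is included.

import numpy as np

def nlms_echo_canceller(far_end, mic, filter_len=256, base_mu=0.5, eps=1e-8):
    # Normalized LMS adaptive filter estimating the echo path from the
    # far-end reference; the mic signal minus the echo estimate is returned.
    far_end = np.asarray(far_end, dtype=float)
    mic = np.asarray(mic, dtype=float)
    w = np.zeros(filter_len)
    out = np.zeros(len(mic))
    for n in range(filter_len, len(mic)):
        x = far_end[n - filter_len:n][::-1]       # most recent far-end samples
        echo_est = w @ x
        e = mic[n] - echo_est                     # residual after echo removal
        x_energy = x @ x + eps
        # Dynamic rate: shrink mu when the residual is large relative to the
        # reference energy, which often indicates double talk or a path change.
        mu = base_mu / (1.0 + (e * e) / x_energy)
        w += mu * e * x / x_energy
        out[n] = e
    return out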
Abstract:
Methods for processing a multichannel audio signal that includes transient noise signals are provided. The method includes buffering the multichannel audio signal in a subband domain and estimating a transient noise likelihood for the subband frames. A probability of transient noise is determined for the buffered subband frames, and a multichannel spatial filter is applied to decompose the subband frames into a transient-attenuated target source estimate and a noise estimate from which the target source signal has been cancelled. A spectral filter is applied to the target source frame to enhance it, and subband frames determined to have a probability of transient noise greater than a first threshold and a probability of target source less than a second threshold are muted.
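The fragment below sketches only the first and last of those steps: a crude per-frame transient likelihood from an energy jump against a short median history, and muting of frames whose transient probability exceeds a first threshold while the target probability stays below a second; the thresholds, history length, and likelihood rule are placeholder assumptions.

import numpy as np

def transient_probability(frame_energies, history=8):
    # Crude transient likelihood: energy jump relative to the median of the
    # previous `history` frames, squashed into [0, 1].
    e = np.asarray(frame_energies, dtype=float)
    probs = np.zeros(len(e))
    for i in range(history, len(e)):
        baseline = np.median(e[i - history:i]) + 1e-12
        ratio = e[i] / baseline
        probs[i] = 1.0 - 1.0 / ratio if ratio > 1.0 else 0.0
    return probs

def mute_transient_frames(subband_frames, transient_prob, target_prob,
                          transient_thresh=0.8, target_thresh=0.2):
    # Zero out frames judged to be transient-dominated and target-free.
    frames = np.array(subband_frames, dtype=float, copy=True)  # (n_frames, n_bands)
    transient_prob = np.asarray(transient_prob, dtype=float)
    target_prob = np.asarray(target_prob, dtype=float)
    mute_mask = (transient_prob > transient_thresh) & (target_prob < target_thresh)
    frames[mute_mask, :] = 0.0
    return frames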
Abstract:
Systems and methods provide input and output mode control for audio processing on a user device. Audio processing may be configured by monitoring audio activity on a device having at least one microphone and a digital audio processing unit, collecting information from the monitoring of the activity, including an identification of at least one application utilizing audio processing, and determining a context for the audio processing, the context including at least one context resource having associated metadata. An audio configuration is determined based on the application and the determined context, and an action is performed to control the audio processing mode. User controls providing additional mode control may be displayed automatically based on the current application and determined context.
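By way of illustration only, mapping an identified application and a determined context to an audio configuration could be sketched as a small rule table like the one below; the application names, context fields, and configuration keys are invented for the example.

from dataclasses import dataclass

@dataclass
class AudioContext:
    application: str          # e.g. "voip", "music"; hypothetical identifiers
    headset_connected: bool
    in_meeting: bool

def select_audio_config(ctx: AudioContext) -> dict:
    # Toy rule table mapping application plus context to a processing mode.
    if ctx.application == "voip":
        return {"mode": "communication",
                "echo_cancellation": True,
                "noise_suppression": not ctx.headset_connected}
    if ctx.application == "music":
        return {"mode": "media",
                "echo_cancellation": False,
                "noise_suppression": False}
    # Default: conservative processing for unrecognized applications.
    return {"mode": "default", "echo_cancellation": True, "noise_suppression": True}

# Example: select_audio_config(AudioContext("voip", headset_connected=True, in_meeting=False))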
Abstract:
A system for detecting motion comprising a first speaker, a first microphone separated from the first speaker by a distance D1, a sound generator, an echo parameter measurement device and an echo parameter monitor, wherein the echo parameter monitor stores two or more sequential echo parameters and generates a motion indicator if the two or more sequential echo parameters indicate a change in an acoustic echo path that exceeds a predetermined threshold.
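A small sketch of the monitoring logic, assuming the echo parameter is a scalar such as an estimated echo-path gain or delay, might look like this; keeping only two sequential measurements and comparing their spread against the threshold are assumptions for the example.

from collections import deque

class EchoMotionDetector:
    # Tracks sequential echo parameters and flags motion when consecutive
    # measurements change by more than a predetermined threshold.
    def __init__(self, threshold, history=2):
        self.threshold = threshold
        self.params = deque(maxlen=history)

    def update(self, echo_parameter):
        self.params.append(echo_parameter)
        if len(self.params) < self.params.maxlen:
            return False
        # Motion is indicated when the acoustic echo path changes by more
        # than the threshold across the stored sequential measurements.
        change = max(self.params) - min(self.params)
        return change > self.threshold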
Abstract:
An audio driver equipped with a distortion compensation unit corrects for detected distortion and includes a digital-to-analog converter (DAC), an amplifier, and an output driver that drives a loudspeaker. Between the output driver and the loudspeaker, the audio driver can include a series resistor and a differential amplifier to measure the voltage across the resistor. A distortion detection unit can use the detected voltage to determine whether distortion, such as rub and buzz distortion, is present. The distortion detection unit can comprise an analog-to-digital converter (ADC) to digitize the voltage data, an FFT to transform the voltage data into frequency information, a root-mean-square (RMS) module that measures the energy at each frequency, and an analysis module that looks for the distortion signature in the energy spectrum.
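One illustrative way the analysis step could be sketched is shown below, taking an FFT of the sensed voltage and comparing energy at high-order harmonics of the drive tone against the fundamental, which is where rub and buzz tends to appear; the harmonic range and ratio threshold are placeholders, not the actual detection criteria.

import numpy as np

def detect_rub_and_buzz(sense_voltage, fs, drive_freq, n_harmonics=10,
                        ratio_threshold=0.05):
    # FFT of the windowed sense voltage and RMS energy near each frequency.
    v = np.asarray(sense_voltage, dtype=float)
    spectrum = np.abs(np.fft.rfft(v * np.hanning(len(v))))
    freqs = np.fft.rfftfreq(len(v), d=1.0 / fs)

    def energy_near(f, bw=10.0):
        band = (freqs > f - bw) & (freqs < f + bw)
        return np.sqrt(np.mean(spectrum[band] ** 2)) if band.any() else 0.0

    fundamental = energy_near(drive_freq) + 1e-12
    # Energy across high-order harmonics relative to the drive tone.
    harmonics = [energy_near(k * drive_freq) for k in range(5, n_harmonics + 1)]
    return (np.mean(harmonics) / fundamental) > ratio_threshold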