Abstract:
A multi-channel input signal having at least three original channels is represented by a parameter representation of the multi-channel signal. For a first channel pair, a first balance parameter, a first coherence parameter, or a first inter-channel time difference parameter is calculated; for a second channel pair, a second balance parameter, a second coherence parameter, or a second inter-channel time difference parameter is calculated. This set of parameters is the parameter representation of the original signals. The two channels of the first channel pair are different from the two channels of the second channel pair. Furthermore, each channel of the two channel pairs is one of the original channels or a weighted combination of the original channels, and the first channel pair and the second channel pair together include information on the three original channels. For multi-channel reconstruction purposes, the parameters are used in addition to down-mixing information to generate a selectable number of output channels in a scalable fashion.
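The parameter calculation described in this abstract can be illustrated with a minimal sketch. It assumes that the balance parameter is an energy ratio in dB and that the coherence parameter is a normalized cross-correlation at zero lag; the abstract does not fix either definition. The helper names and the choice of channel pairs (left/right, and front sum/center) are hypothetical.

```python
import math

def energy(x):
    """Sum of squared samples."""
    return sum(v * v for v in x)

def balance_db(a, b, eps=1e-12):
    """Assumed definition: energy ratio between two channels, in dB."""
    return 10.0 * math.log10((energy(a) + eps) / (energy(b) + eps))

def coherence(a, b, eps=1e-12):
    """Assumed definition: normalized cross-correlation at zero lag."""
    cross = sum(x * y for x, y in zip(a, b))
    return cross / math.sqrt((energy(a) + eps) * (energy(b) + eps))

# Three original channels (toy data).
left = [1.0, 0.5, -0.25, 0.0]
right = [0.5, 0.25, -0.125, 0.0]
center = [0.0, 1.0, 0.0, -1.0]

# First pair: two original channels.
# Second pair: a weighted combination (left + right) against the center.
# Together the two pairs carry information on all three originals.
front = [l + r for l, r in zip(left, right)]

params = {
    "balance_lr_db": balance_db(left, right),
    "coherence_lr": coherence(left, right),
    "balance_front_center_db": balance_db(front, center),
}
```

Because `left` is exactly twice `right` here, the first balance comes out near 6 dB and the first coherence near 1.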
Abstract:
An audio signal having at least two channels can be efficiently down-mixed into a downmix signal and a residual signal when the down-mixing rule depends on a spatial parameter that is derived from the audio signal and post-processed by a limiter. The limiter applies a limit to the derived spatial parameter in order to avoid instabilities during the up-mixing or down-mixing process. With a down-mixing rule that depends dynamically on parameters describing an interrelation between the audio channels, the energy of the residual signal can be kept as small as possible, which is advantageous for coding efficiency. Post-processing the spatial parameter with a limiter before it is used in the down-mixing avoids instabilities in the down- or up-mixing that could otherwise disturb the spatial perception of the encoded or decoded audio signal.
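The role of the limiter can be shown with a small sketch. The down-mixing rule used here (mid/side with a least-squares prediction coefficient `alpha`) is an assumption chosen for illustration, not the rule of the abstract, and `max_abs` is a hypothetical limit value. When the two channels are nearly out of phase, the mid signal has almost no energy and the unlimited coefficient becomes very large; clamping it keeps the down-mix and up-mix well conditioned at the cost of a somewhat larger residual.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def limit(value, max_abs):
    """Clamp the spatial parameter to [-max_abs, max_abs]."""
    return max(-max_abs, min(max_abs, value))

def downmix(left, right, max_abs=2.0, eps=1e-12):
    """Return (mid, residual, alpha) using a limited prediction coefficient."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    # Least-squares predictor of side from mid, then limited.
    alpha = limit(dot(side, mid) / (dot(mid, mid) + eps), max_abs)
    residual = [s - alpha * m for s, m in zip(side, mid)]
    return mid, residual, alpha

def upmix(mid, residual, alpha):
    """Invert downmix(): rebuild side, then left and right."""
    side = [e + alpha * m for e, m in zip(residual, mid)]
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right

# Nearly out-of-phase channels: the unlimited coefficient would be about 49.
left = [1.0, -0.5, 0.25]
right = [-0.99, 0.51, -0.24]
mid, residual, alpha = downmix(left, right)
rec_left, rec_right = upmix(mid, residual, alpha)
```

In this sketch reconstruction stays exact for any value of `alpha`, because the residual absorbs whatever the limited predictor leaves behind.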
Abstract:
A synthesizer for generating a decorrelation signal using an input signal is operative on a plurality of subband signals, wherein a subband signal includes a sequence of at least two subband samples, the sequence of the subband samples representing a bandwidth of the subband signal, which is smaller than a bandwidth of the input signal. The synthesizer includes a filter stage for filtering each subband signal using a reverberation filter to obtain a plurality of reverberated subband signals, wherein the plurality of reverberated subband signals together represent the decorrelation signal. This decorrelation signal is used for reconstructing a signal based on a parametrically encoded stereo signal consisting of a mono signal and a coherence measure.
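As an illustration of filtering each subband signal separately, the sketch below applies a Schroeder all-pass section to every subband. The all-pass structure, the gain, and the per-subband delays are assumptions swapped in for the unspecified reverberation filter of the abstract; the function names are hypothetical.

```python
def allpass_reverb(subband, gain=0.5, delay=2):
    """Schroeder all-pass: y[n] = -g*x[n] + x[n-d] + g*y[n-d]."""
    out = []
    for n, x in enumerate(subband):
        delayed_in = subband[n - delay] if n >= delay else 0.0
        delayed_out = out[n - delay] if n >= delay else 0.0
        out.append(-gain * x + delayed_in + gain * delayed_out)
    return out

def decorrelate(subbands, gain=0.5, base_delay=2):
    """Filter every subband signal on its own.

    Assumption: each subband gets a slightly different delay so that
    the reverberated subbands do not share one common echo pattern.
    """
    return [
        allpass_reverb(sb, gain, base_delay + k)
        for k, sb in enumerate(subbands)
    ]

# Two toy subband signals, each a short sequence of subband samples.
subbands = [[1.0, 0.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0, 0.0]]
reverberated = decorrelate(subbands)
impulse_response = allpass_reverb([1.0, 0.0, 0.0, 0.0, 0.0])
```

The arithmetic also works for complex-valued subband samples, which is how such subband signals are often represented.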
Abstract:
An audio object coder for generating an encoded object signal using a plurality of audio objects includes a downmix information generator for generating downmix information indicating a distribution of the plurality of audio objects into at least two downmix channels, an audio object parameter generator for generating object parameters for the audio objects, and an output interface for generating the encoded object signal using the downmix information and the object parameters. An audio synthesizer uses the downmix information for generating output data usable for creating a plurality of output channels of a predefined audio output configuration.
Abstract:
The invention provides methods and devices for stereo encoding and decoding using complex prediction in the frequency domain. In one embodiment, a decoding method, for obtaining an output stereo signal from an input stereo signal encoded by complex prediction coding and comprising first frequency-domain representations of two input channels, comprises the upmixing steps of: (i) computing a second frequency-domain representation of a first input channel; and (ii) computing an output channel on the basis of the first and second frequency-domain representations of the first input channel, the first frequency-domain representation of the second input channel and a complex prediction coefficient. The method comprises applying independent bandwidth limits for the input channels.
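Upmixing step (ii) can be sketched per frequency bin as follows. The sketch assumes that the first input channel is a downmix, the second is a prediction residual, and that the second frequency-domain representation of the downmix (for example an MDST estimate) has already been computed in step (i) and is passed in. The sign convention and the function name are assumptions, not taken from the abstract.

```python
def upmix_complex_prediction(dmx_re, dmx_im, residual, alpha):
    """Rebuild left/right spectra from downmix, residual and alpha.

    dmx_re:   first frequency-domain representation of the downmix
    dmx_im:   second representation of the downmix (assumed given)
    residual: first representation of the second input channel
    alpha:    complex prediction coefficient
    """
    left, right = [], []
    for d_re, d_im, e in zip(dmx_re, dmx_im, residual):
        # Assumed convention: side = residual + Re(alpha * complex downmix).
        side = e + (alpha * complex(d_re, d_im)).real
        left.append(d_re + side)
        right.append(d_re - side)
    return left, right

left, right = upmix_complex_prediction(
    dmx_re=[1.0, 2.0],
    dmx_im=[0.0, 4.0],
    residual=[0.1, -0.2],
    alpha=0.5 + 0.25j,
)
```

With a purely real `alpha` the second representation drops out of the computation, which is why a real-valued prediction mode does not need step (i).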
Abstract:
An apparatus for generating a high frequency audio signal includes an analyzer for analyzing an input signal to determine transient information adaptively. Additionally, a spectral converter is provided for converting the input signal into an input spectral representation. A spectral processor processes the input spectral representation to generate a processed spectral representation including values for higher frequencies than the input spectral representation. A time converter is configured for converting the processed spectral representation to a time representation, wherein the spectral converter or the time converter is controllable to perform a frequency domain oversampling for a first portion of the input signal having the transient information associated and to not perform the frequency domain oversampling for a second portion of the input signal not having the associated transient information.
Abstract:
The invention provides methods and devices for stereo encoding and decoding using complex prediction in the frequency domain. In one embodiment, a decoding method, for obtaining an output stereo signal from an input stereo signal encoded by complex prediction coding and comprising first frequency-domain representations of two input channels, comprises the upmixing steps of: (i) computing a second frequency-domain representation of a first input channel; and (ii) computing an output channel on the basis of the first and second frequency-domain representations of the first input channel, the first frequency-domain representation of the second input channel and a complex prediction coefficient. The method comprises performing frequency-domain modifications selectively before or after upmixing.
Abstract:
An audio signal decoder configured to provide a decoded audio signal representation on the basis of an encoded audio signal representation including a sampling frequency information, an encoded time warp information and an encoded spectrum representation includes a time warp calculator and a warp decoder. The time warp calculator is configured to adapt a mapping rule for mapping codewords of the encoded time warp information onto decoded time warp values describing the decoded time warp information in dependence on the sampling frequency information. The warp decoder is configured to provide the decoded audio signal representation on the basis of the encoded spectrum representation and in dependence on the decoded time warp information.
Abstract:
The present invention relates to coding of audio signals, and in particular to high frequency reconstruction methods including a frequency domain harmonic transposer. A system and method for generating a high frequency component of a signal from a low frequency component of the signal is described. The system comprises an analysis filter bank (501) comprising an analysis transformation unit (601) having a frequency resolution of Δf; and an analysis window (611) having a duration of D_A; the analysis filter bank (501) being configured to provide a set of analysis subband signals from the low frequency component of the signal; a nonlinear processing unit (502, 650) configured to determine a set of synthesis subband signals based on a portion of the set of analysis subband signals, wherein the portion of the set of analysis subband signals is phase shifted by a transposition order T; and a synthesis filter bank (504) comprising a synthesis transformation unit (602) having a frequency resolution of QΔf; and a synthesis window (612) having a duration of D_S; the synthesis filter bank (504) being configured to generate the high frequency component of the signal from the set of synthesis subband signals; wherein Q is a frequency resolution factor with Q≧1 and smaller than the transposition order T; and wherein the value of the product of the frequency resolution Δf and the duration D_A of the analysis filter bank is selected based on the frequency resolution factor Q.
Abstract:
For a multi-channel reconstruction of audio signals based on at least one base channel, an energy measure is used for compensating energy losses due to a predictive upmix. The energy measure can be applied in the encoder or the decoder. Furthermore, a decorrelated signal is added to output channels generated by an energy-loss introducing upmix procedure. The energy of the decorrelated signal is smaller than or equal to an energy error introduced by the predictive upmix. Thus, problems occurring for prediction-based upmix methods, such as when upmixing signals that are coded with High Frequency Reconstruction techniques, are solved, so that the correct correlation between the upmixed channels is obtained or the upmix is adapted to arbitrary downmixes.
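The energy loss of a predictive upmix and its compensation can be sketched as follows. The sketch assumes a single base channel, a least-squares prediction coefficient, and an energy measure defined as the ratio of original to predicted energy, computed on the encoder side where the original channel is available. These definitions and the function names are assumptions for illustration.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def predictive_upmix(base, target, eps=1e-12):
    """Least-squares prediction of the target channel from the base channel."""
    c = dot(target, base) / (dot(base, base) + eps)
    return [c * b for b in base]

def compensation_gain(target, predicted, eps=1e-12):
    """Assumed energy measure: gain that restores the original energy."""
    return math.sqrt(dot(target, target) / (dot(predicted, predicted) + eps))

base = [1.0, 0.0, 1.0, 0.0]      # base (downmix) channel
target = [1.0, 1.0, 1.0, 1.0]    # original channel to reconstruct

predicted = predictive_upmix(base, target)
# Prediction loses the part of the target that is orthogonal to the base.
energy_error = dot(target, target) - dot(predicted, predicted)

gain = compensation_gain(target, predicted)
compensated = [gain * p for p in predicted]
```

The value `energy_error` is also the upper bound that the abstract places on the energy of an added decorrelated signal; filling part of the gap with such a signal instead of pure gain is what restores the correlation between the upmixed channels.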