Abstract:
When decompressing an HOA data frame representation, a gain control (15, 151) is applied for each channel signal before it is perceptually encoded (16). The gain values are transferred in a differential manner as side information. However, for starting decoding of such streamed compressed HOA data frame representation absolute gain values are required, which should be coded with a minimum number of bits. For determining such lowest integer number (βe) of bits the HOA data frame representation (C(k)) is rendered in spatial domain to virtual loudspeaker signals lying on a unit sphere, followed by normalization of the HOA data frame representation (C(k)). Then the lowest integer number of bits is set to βe = ⌈log2(⌈log2(√KMAX · O)⌉ + 1)⌉.
Abstract:
Encoding of Higher Order Ambisonics (HOA) signals commonly results in high data rates. For data rate reduction, a method (100) for encoding direction information for frames of an input HOA signal comprises determining (s101) active candidate directions (MDIR(k)) among predefined global directions having global direction indices, dividing (s102) the input HOA signal into frequency subbands (f1, ..., fF), determining (s103) for each frequency subband active subband directions among the active candidate directions, assigning (s104) a relative direction index to each direction per subband, assembling (s105) direction information for the frame, the direction information comprising the active candidate directions (MDIR(k)), for each subband and each active candidate direction a bit indicating whether or not the active candidate direction is an active subband direction for the respective frequency subband, and for each frequency subband the relative direction indices of active subband directions in the second set of subband directions, and transmitting (s106) the assembled direction information.
Abstract:
When compressing an HOA data frame representation, a gain control (15, 151) is applied for each channel signal before it is perceptually encoded (16). The gain values are transferred in a differential manner as side information. However, for starting decoding of such streamed compressed HOA data frame representation absolute gain values are required, which should be coded with a minimum number of bits. For determining such lowest integer number (βe) of bits the HOA data frame representation (C(k)) is rendered in spatial domain to virtual loudspeaker signals lying on a unit sphere, followed by normalisation of the HOA data frame representation (C(k)). Then the lowest integer number of bits is set to βe = ⌈log2(⌈log2(√KMAX · O)⌉ + 1)⌉.
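The closing formula can be evaluated directly. The following is a minimal sketch, assuming KMAX and O are already known; the function name and the example values 1.5 and 49 are illustrative assumptions and are not taken from the abstract, which does not state how KMAX or O are obtained.

```python
import math


def lowest_bit_count(k_max: float, o: int) -> int:
    """Evaluate beta_e = ceil(log2(ceil(log2(sqrt(KMAX) * O)) + 1)).

    k_max and o stand for KMAX and O in the formula; both are inputs here
    (assumption: the caller has determined them elsewhere).
    """
    scale = math.sqrt(k_max) * o
    if k_max <= 0 or scale < 1:
        # Below 1 the inner logarithm is negative and the outer one may be
        # undefined, so this sketch rejects such inputs.
        raise ValueError("sqrt(KMAX) * O must be at least 1")
    inner = math.ceil(math.log2(scale))   # inner ceiling of the formula
    return math.ceil(math.log2(inner + 1))  # outer ceiling: number of bits


# Illustrative values only: sqrt(1.5) * 49 is about 60.0, so inner = 6
# and the result is ceil(log2(7)) = 3.
print(lowest_bit_count(1.5, 49))  # prints 3
```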
Abstract:
For an efficient encoding of subband configuration data the first, penultimate and last subband groups are treated differently than the other subband groups. Further, subband group bandwidth difference values are used in the encoding. The number of subband groups NSB is coded using a fixed number of bits representing NSB−1. The bandwidth value BSB[1] of the first subband group is coded using a unary code representing BSB[1]−1. No bandwidth value BSB[g] is coded for the last subband group g=NSB. For subband groups g=2, ..., NSB−2 bandwidth difference values ΔBSB[g]=BSB[g]−BSB[g−1] are coded using a unary code, and the bandwidth difference value ΔBSB[NSB−1] for subband group g=NSB−1 is coded using a fixed number of bits.
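The coding rules above can be sketched as a small bit-string writer. This is an illustration under stated assumptions, not the claimed encoder: the unary convention (n ones followed by a zero), the field widths `nsb_bits` and `last_diff_bits`, the restriction to NSB ≥ 3, and non-negative bandwidth differences are all assumptions, since the abstract does not specify them.

```python
def unary(n: int) -> str:
    """Unary code for n >= 0: n ones then a terminating zero (assumed convention)."""
    if n < 0:
        raise ValueError("unary code needs a non-negative value")
    return "1" * n + "0"


def fixed(n: int, width: int) -> str:
    """Fixed-width binary code for n, most significant bit first."""
    if not 0 <= n < (1 << width):
        raise ValueError("value does not fit in the fixed-width field")
    return format(n, f"0{width}b")


def encode_subband_config(bsb, nsb_bits=5, last_diff_bits=3):
    """Encode bandwidths bsb[0..NSB-1] (BSB[1..NSB] in the text's 1-based indexing)."""
    nsb = len(bsb)
    if nsb < 3:
        raise ValueError("this sketch only handles NSB >= 3")
    bits = [fixed(nsb - 1, nsb_bits)]      # NSB-1, fixed number of bits
    bits.append(unary(bsb[0] - 1))         # BSB[1]-1, unary
    for g in range(2, nsb - 1):            # groups g = 2 .. NSB-2, unary differences
        bits.append(unary(bsb[g - 1] - bsb[g - 2]))
    # Penultimate group g = NSB-1: difference in a fixed number of bits.
    bits.append(fixed(bsb[nsb - 2] - bsb[nsb - 3], last_diff_bits))
    # The last group g = NSB is not coded at all.
    return "".join(bits)


print(encode_subband_config([1, 1, 2, 4, 8]))  # prints 001000010010
```

For the illustrative bandwidths `[1, 1, 2, 4, 8]` the fields are `00100` (NSB−1 = 4), `0` (BSB[1]−1 = 0), `0` and `10` (differences 0 and 1), and `010` (penultimate difference 2); the final bandwidth 8 is never written.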
Abstract:
A method for compressing an HOA signal being an input HOA representation with input time frames (C(k)) of HOA coefficient sequences comprises spatial HOA encoding of the input time frames and subsequent perceptual encoding and source encoding. Each input time frame is decomposed (802) into a frame of predominant sound signals (XPS(k−1)) and a frame of an ambient HOA component (C̃AMB(k−1)). The ambient HOA component (C̃AMB(k−1)) comprises, in a layered mode, first HOA coefficient sequences of the input HOA representation (cn(k−1)) in lower positions and second HOA coefficient sequences (cAMB,n(k−1)) in remaining higher positions. The second HOA coefficient sequences are part of an HOA representation of a residual between the input HOA representation and the HOA representation of the predominant sound signals.
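The layered-mode layout of the ambient component can be illustrated as follows. This is a rough sketch under assumptions: the function name, the parameter `n_low` (how many lower positions carry input coefficient sequences), and the use of a plain sample-wise difference as the residual are hypothetical choices for illustration; the abstract does not give these details.

```python
def layered_ambient(input_hoa, predominant_hoa, n_low):
    """Build an ambient frame in a layered mode (illustrative sketch).

    input_hoa:        list of coefficient sequences of the input frame
    predominant_hoa:  HOA representation of the predominant sounds, same shape
    n_low:            number of lower positions copied from the input (assumption)
    """
    ambient = []
    for n, (c_in, c_ps) in enumerate(zip(input_hoa, predominant_hoa)):
        if n < n_low:
            # Lower positions: coefficient sequences of the input itself.
            ambient.append(list(c_in))
        else:
            # Higher positions: residual between input and predominant part.
            ambient.append([a - b for a, b in zip(c_in, c_ps)])
    return ambient


frame = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
predominant = [[0.5, 0.5], [1.0, 1.0], [2.0, 2.0]]
print(layered_ambient(frame, predominant, n_low=1))
# prints [[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]]
```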
Abstract:
Higher Order Ambisonics (HOA) has two representations: spatial domain and coefficient domain. The invention generates from a coefficient domain representation a mixed spatial/coefficient domain representation, wherein the number of said HOA signals can be variable. A vector of coefficient domain signals is separated into a vector of coefficient domain signals having a constant number of HOA coefficients and a vector of coefficient domain signals having a variable number of HOA coefficients. The constant-number HOA coefficients vector is transformed to a corresponding spatial domain signal vector. In order to facilitate high-quality coding without creating signal discontinuities, the variable-number HOA coefficients vector of coefficient domain signals is adaptively normalized and multiplexed with the vector of spatial domain signals.
Abstract:
Higher Order Ambisonics (HOA) represents three-dimensional sound. HOA provides high spatial resolution and facilitates analysis of the sound field with respect to dominant sound sources. The invention aims to identify independent dominant sound sources constituting the sound field, and to track their temporal trajectories. Known applications search for all potential candidates for dominant sound source directions by looking at the directional power distribution of the original HOA representation, whereas in the invention all components which are correlated with the signals of previously found sound sources are removed. This avoids erroneously detecting many sound sources instead of the one correct source when that source's contributions to the sound field are highly directionally dispersed.
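Removing components correlated with an already found source can be sketched with a least-squares projection. This is one plausible way to do it, assumed here for illustration; the abstract does not state how the correlated components are removed. The function name is hypothetical, and the sketch handles a single source signal only.

```python
def remove_correlated(channels, source):
    """Subtract from each channel the part that is correlated with `source`.

    For each channel c, the least-squares gain g = <c, s> / <s, s> is
    computed and g * s is subtracted, leaving a residual that is
    orthogonal to (uncorrelated with) the source signal s.
    """
    energy = sum(s * s for s in source)
    if energy == 0.0:
        # A silent source has nothing to remove.
        return [list(ch) for ch in channels]
    residual = []
    for ch in channels:
        gain = sum(c * s for c, s in zip(ch, source)) / energy
        residual.append([c - gain * s for c, s in zip(ch, source)])
    return residual


source = [1.0, -1.0, 1.0, -1.0]
channels = [
    [2.0, -2.0, 2.0, -2.0],   # pure scaled copy of the source
    [1.0, 1.0, 1.0, 1.0],     # uncorrelated with the source
    [3.0, -1.0, 3.0, -1.0],   # mixture of both
]
print(remove_correlated(channels, source))
# prints [[0.0, 0.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0], [1.0, 1.0, 1.0, 1.0]]
```

After the removal, a direction search on the residual no longer sees the dispersed contributions of the first source, which is the effect the abstract describes.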
Abstract:
Spherical microphone arrays capture a three-dimensional sound field (P(Ωc, t)) for generating an Ambisonics representation (Anm(t)), where the pressure distribution on the surface of the sphere is sampled by the capsules of the array. The impact of the microphones on the captured sound field is removed using the inverse microphone transfer function. Equalizing the transfer function of the microphone array is problematic because the reciprocal of the transfer function causes high gains for small values in the transfer function, and these small values are affected by transducer noise. The present principles minimize that noise by using Wiener filter processing (34) in the frequency domain, which processing is automatically controlled (33) per wave number by the signal-to-noise ratio of the microphone array.
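The difference between a plain inverse and an SNR-controlled inverse can be shown with the textbook Wiener deconvolution gain, W = H* / (|H|² + 1/SNR). This is a generic sketch, assumed for illustration; it is not claimed to be the specific filter of the abstract, and the function name and example values are hypothetical.

```python
def wiener_inverse(h: complex, snr: float) -> complex:
    """Textbook Wiener deconvolution gain for one frequency or wave number.

    h:   value of the transfer function at this wave number
    snr: linear signal-to-noise ratio (power ratio), must be positive
    """
    if snr <= 0:
        raise ValueError("snr must be positive")
    return h.conjugate() / (abs(h) ** 2 + 1.0 / snr)


# Small transfer-function value with moderate SNR: the plain reciprocal
# would apply a gain of 100, while the Wiener gain stays below 1.
print(abs(1 / 0.01), abs(wiener_inverse(0.01 + 0j, snr=100.0)))

# With very high SNR the Wiener gain approaches the plain reciprocal 1/h.
print(abs(wiener_inverse(2.0 + 0j, snr=1e12)))
```

The 1/SNR term in the denominator bounds the gain where the transfer function is small, which is where the plain reciprocal would amplify transducer noise.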