Abstract:
In general, techniques are described for obtaining audio rendering information in a bitstream. A device configured to render higher order ambisonic coefficients comprising a processor and a memory may perform the techniques. The processor may be configured to obtain sparseness information indicative of a sparseness of a matrix used to render the higher order ambisonic coefficients to a plurality of speaker feeds. The memory may be configured to store the sparseness information.
Abstract:
In general, techniques are described for performing a vector-based synthesis with respect to higher order ambisonic coefficients (or, in other words, spherical harmonic coefficients). A device comprising a processor may be configured to perform the techniques. The processor may perform the vector-based synthesis with respect to spherical harmonic coefficients to generate decomposed representations of the plurality of spherical harmonic coefficients and determine distinct and background directional information from the directional information. The processor may then reduce an order of the directional information associated with the background audio objects to generate transformed background directional information, and apply compensation to increase values of the transformed directional information to preserve an overall energy of the sound field.
Abstract:
In general, techniques are described for specifying spherical harmonic coefficients in a bitstream. A device comprising one or more processors may perform the techniques. The processors may be configured to identify, from the bitstream, a plurality of hierarchical elements describing a sound field that are included in the bitstream. The processors may further be configured to parse the bitstream to determine the identified plurality of hierarchical elements.
Abstract:
In general, techniques are described for transforming spherical harmonic coefficients. A device comprising one or more processors may perform the techniques. The processors may be configured to parse the bitstream to determine transformation information describing how the sound field was transformed to reduce a number of the plurality of hierarchical elements that provide information relevant in describing the sound field. The processors may further be configured to, when reproducing the sound field based on those of the plurality of hierarchical elements that provide information relevant in describing the sound field, transform the sound field based on the transformation information to reverse the transformation performed to reduce the number of the plurality of hierarchical elements.
Abstract:
In general, techniques are described for mapping virtual speakers to physical speakers, having first adjusted the position of one of the virtual speakers based on a relative position of the one of the virtual speakers to one of the physical speakers. A device comprising one or more processors may perform the techniques. The one or more processors may be configured to determine a difference in position between one of a plurality of physical speakers and one of a plurality of virtual speakers arranged in a geometry, and adjust a position of the one of the plurality of virtual speakers within the geometry based on the determined difference in position and prior to mapping the plurality of virtual speakers to the plurality of physical speakers.
Abstract:
This disclosure describes techniques for coding of higher-order ambisonics audio data comprising at least one higher-order ambisonic (HOA) coefficient corresponding to a spherical harmonic basis function having an order greater than one. This disclosure describes techniques for adjusting HOA soundfields to potentially improve spatial alignment of the acoustic elements to the visual component in a mixed audio/video reproduction scenario. In one example, a device for rendering an HOA audio signal includes one or more processors configured to render the HOA audio signal over one or more speakers based on one or more field of view (FOV) parameters of a reference screen and one or more FOV parameters of a viewing window.
Abstract:
In general, techniques are described for mapping virtual speakers to physical speakers, having first adjusted the position of one of the virtual speakers based on a relative position of the one of the virtual speakers to one of the physical speakers. A device comprising one or more processors may perform the techniques. The one or more processors may be configured to determine a difference in position between one of a plurality of physical speakers and one of a plurality of virtual speakers arranged in a geometry, and adjust a position of the one of the plurality of virtual speakers within the geometry based on the determined difference in position and prior to mapping the plurality of virtual speakers to the plurality of physical speakers.
Abstract:
In general, techniques are described for obtaining audio rendering information in a bitstream. A device configured to render higher order ambisonic coefficients comprising a processor and a memory may perform the techniques. The processor may be configured to obtain sparseness information indicative of a sparseness of a matrix used to render the higher order ambisonic coefficients to a plurality of speaker feeds. The memory may be configured to store the sparseness information.
Abstract:
Systems and techniques for rendering audio data are generally disclosed. An example device for rendering a higher order ambition (HOA) audio signal includes a memory configured to store the HOA audio signal, and one or more processors coupled to the memory. The one or more processors are configured to perform a loudness compensation process as part of generating an effect matrix. The one or more processors are further configured to render the HOA audio signal based on the effect matrix.
Abstract:
In general, techniques are described by which to perform spatial masking with respect to spherical harmonic coefficients. As one example, an audio encoding device comprising a processor may perform various aspects of the techniques. The processor may be configured to perform spatial analysis based on the spherical harmonic coefficients describing a three-dimensional sound field to identify a spatial masking threshold. The processor may further be configured to render the multi-channel audio data from the plurality of spherical harmonic coefficients, and compress the multi-channel audio data based on the identified spatial masking threshold to generate a bitstream.