Abstract:
In a particular aspect, a multimedia device includes one or more sensors configured to generate first sensor data and second sensor data. The first sensor data is indicative of a first position at a first time and the second sensor data is indicative of a second position at a second time. The multimedia device further includes a processor coupled to the one or more sensors. The processor is configured to generate a first version of a spatialized audio signal, determine a cumulative value based on an offset, the first position, and the second position, and generate a second version of the spatialized audio signal based on the cumulative value.
Abstract:
In general, techniques are directed to intermediate compression of higher order ambisonic audio data. For example, a device comprising one or more processors and a memory may be configured to perform the techniques. The memory may be configured to store intermediately formatted audio data generated as a result of an intermediate compression of higher order ambisonic audio data. The one or more processors may be configured to process the intermediately formatted audio data.
Abstract:
In general, techniques are described for indicating reuse of a syntax element that indicates a quantization mode used when compressing a vector. A device comprising a processor and a memory may perform the techniques. The processor may be configured to obtain a bitstream comprising a vector in a spherical harmonics domain. The bitstream may further comprise an indicator for whether to reuse, from a previous frame, at least one syntax element indicative of a quantization mode used when compressing the vector. The memory may be configured to store the bitstream.
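As a rough illustration of the reuse indicator described above, the following Python sketch decodes a hypothetical per-frame header. The field layout (a one-bit reuse flag, followed by a two-bit quantization mode only when the flag is 0) and the names `BitReader`, `parse_frame_header`, and `decode_modes` are assumptions made for illustration; they are not the bitstream syntax of the described techniques.

```python
class BitReader:
    """Reads bits MSB-first from a bytes object."""

    def __init__(self, data):
        self.data = data
        self.pos = 0  # current bit position

    def read(self, nbits):
        value = 0
        for _ in range(nbits):
            byte = self.data[self.pos // 8]
            bit = (byte >> (7 - self.pos % 8)) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value


def parse_frame_header(reader, previous_mode):
    """Return the quantization mode for the current frame.

    Assumed layout: 1-bit reuse flag; if the flag is 0, a 2-bit mode follows.
    """
    reuse = reader.read(1)
    if reuse:
        if previous_mode is None:
            raise ValueError("reuse signalled but no previous frame exists")
        return previous_mode  # no mode bits are spent in this frame
    return reader.read(2)


def decode_modes(data, num_frames):
    """Decode the quantization mode of each frame in sequence."""
    reader = BitReader(data)
    modes = []
    previous = None
    for _ in range(num_frames):
        previous = parse_frame_header(reader, previous)
        modes.append(previous)
    return modes
```

In this sketch, a frame that sets the reuse flag costs one bit instead of three, which is the kind of saving such an indicator is meant to provide when the mode is unchanged between frames.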
Abstract:
In general, techniques are described for obtaining audio rendering information in a bitstream. A device configured to render higher order ambisonic coefficients comprising a processor and a memory may perform the techniques. The processor may be configured to obtain sparseness information indicative of a sparseness of a matrix used to render the higher order ambisonic coefficients to a plurality of speaker feeds. The memory may be configured to store the sparseness information.
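One way to picture how sparseness information might be used: a renderer that knows the rendering matrix is mostly zeros can store and apply only the non-zero gains. The Python sketch below assumes sparseness is measured as the fraction of zero entries and uses hypothetical helper names (`sparseness`, `to_sparse_rows`, `render_sparse`); the abstract does not specify either.

```python
def sparseness(matrix):
    """Fraction of matrix entries that are exactly zero (assumed definition)."""
    total = sum(len(row) for row in matrix)
    zeros = sum(1 for row in matrix for gain in row if gain == 0)
    return zeros / total


def to_sparse_rows(matrix):
    """Keep only (column index, gain) pairs for the non-zero gains of each row."""
    return [[(j, g) for j, g in enumerate(row) if g != 0] for row in matrix]


def render_sparse(sparse_rows, hoa_sample):
    """Compute one speaker feed per row from a vector of HOA coefficients."""
    return [sum(g * hoa_sample[j] for j, g in row) for row in sparse_rows]


# Rows are speakers, columns are ambisonic coefficients (illustrative values).
matrix = [
    [0.5, 0.0, 0.0, 0.5],
    [0.5, 0.0, 0.0, -0.5],
]
feeds = render_sparse(to_sparse_rows(matrix), [1.0, 0.2, 0.3, 0.4])
```

Here half of the gains are zero, so the sparse form performs two multiplications per speaker feed instead of four.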
Abstract:
Systems and techniques for rendering audio data are generally disclosed. An example device for rendering a higher order ambisonic (HOA) audio signal includes a memory configured to store the HOA audio signal, and one or more processors coupled to the memory. The one or more processors are configured to perform a loudness compensation process as part of generating an effect matrix. The one or more processors are further configured to render the HOA audio signal based on the effect matrix.
Abstract:
In one example, a device for retrieving audio data includes one or more processors configured to receive availability data representative of a plurality of available adaptation sets, the available adaptation sets including a scene-based audio adaptation set and one or more object-based audio adaptation sets, receive selection data identifying which of the scene-based audio adaptation set and the one or more object-based audio adaptation sets are to be retrieved, and provide instruction data to a streaming client to cause the streaming client to retrieve data for each of the adaptation sets identified by the selection data, and a memory configured to store the retrieved data for the audio adaptation sets.
Abstract:
In general, techniques are described for indicating reuse of a syntax element indicating a vector quantization codebook used in compressing a vector. A device comprising a processor and a memory may perform the techniques. The processor may be configured to obtain a bitstream comprising a vector in a spherical harmonics domain. The bitstream may further comprise an indicator for whether to reuse, from a previous frame, a syntax element indicative of a vector quantization codebook used when compressing the vector. The memory may be configured to store the bitstream.
Abstract:
In general, techniques are described for indicating reusability of an index that determines a Huffman codebook used to code data associated with a vector in a spherical harmonics domain. The bitstream may comprise an indicator for whether to reuse, from a previous frame, at least one syntax element indicative of the index. The memory may be configured to store the bitstream.
Abstract:
In general, techniques are described for obtaining decomposed versions of spherical harmonic coefficients. A device comprising a processor and a memory may be configured to perform the techniques. The processor may obtain a non-zero set of coefficients of a vector representative of a distinct component of a sound field. The vector may have been decomposed from a plurality of spherical harmonic coefficients that describe the sound field. The processor may also obtain one of a plurality of configuration modes by which to extract the non-zero set of coefficients of the vector, where the one of the configuration modes indicates that the coefficients include all of the coefficients except for at least one of the coefficients. The processor may further extract the coefficients of the vector based on the obtained one of the configuration modes. The memory may be configured to store the non-zero set of the coefficients of the vector.
Abstract:
In general, techniques are described by which to perform spatial masking with respect to spherical harmonic coefficients. As one example, an audio encoding device comprising a processor may perform various aspects of the techniques. The processor may be configured to perform spatial analysis based on a plurality of spherical harmonic coefficients describing a three-dimensional sound field to identify a spatial masking threshold. The processor may further be configured to render multi-channel audio data from the plurality of spherical harmonic coefficients, and compress the multi-channel audio data based on the identified spatial masking threshold to generate a bitstream.