Abstract:
In general, techniques are described for compressing higher order ambisonics (HOA) audio data. A device comprising one or more processors may be configured to perform the techniques. The one or more processors may be configured to obtain a plurality of spherical harmonic coefficients from a plurality of near field compensated spherical harmonic coefficients by, at least in part, counterbalancing application of a near field compensation filter to the plurality of spherical harmonic coefficients.
Abstract:
A method for feature extraction by an electronic device is described. The method includes processing speech using a physiological cochlear model. The method also includes analyzing sections of an output of the physiological cochlear model. The method further includes extracting a place-based analysis vector and a time-based analysis vector for each section. The method additionally includes determining one or more features from each analysis vector.
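As a rough illustration of the sectioning step, the sketch below assumes the cochlear model output is available as a 2-D array indexed by place (rows) and time (columns). The function name `extract_features`, the fixed `section_len`, and the choice of mean and standard deviation as features are all hypothetical; the abstract does not say how the analysis vectors or features are computed.

```python
import numpy as np

def extract_features(cochlear_output, section_len):
    """Split a (places, time_samples) array into sections and summarize each.

    Assumption: averaging over time gives a place-based vector and averaging
    over place gives a time-based vector. This is illustrative only.
    """
    n_places, n_samples = cochlear_output.shape
    features = []
    # Walk over non-overlapping sections; a trailing partial section is dropped.
    for start in range(0, n_samples - section_len + 1, section_len):
        section = cochlear_output[:, start:start + section_len]
        place_vec = section.mean(axis=1)  # one value per place
        time_vec = section.mean(axis=0)   # one value per time sample
        features.append({
            "place_mean": float(place_vec.mean()),
            "place_std": float(place_vec.std()),
            "time_mean": float(time_vec.mean()),
            "time_std": float(time_vec.std()),
        })
    return features
```

Each section yields one feature record, so a downstream model would see one feature set per section of speech.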
Abstract:
In general, techniques are described for identifying a codebook to be used when compressing spatial components of a sound field. A device comprising one or more processors may be configured to perform the techniques. The one or more processors may be configured to identify a Huffman codebook to use when compressing a spatial component of a plurality of spatial components based on an order of the spatial component relative to remaining ones of the plurality of spatial components, the spatial component generated by performing a vector based synthesis with respect to a plurality of spherical harmonic coefficients.
Abstract:
In general, techniques are described for obtaining an indication of whether spherical harmonic coefficients are representative of a synthetic audio object. In accordance with the techniques, a device comprising one or more processors may be configured to obtain an indication of whether spherical harmonic coefficients representative of a sound field are generated from a synthetic audio object.
Abstract:
In general, techniques are described by which to perform spatial masking with respect to spherical harmonic coefficients. As one example, an audio encoding device comprising a processor may perform various aspects of the techniques. The processor may be configured to perform spatial analysis based on the spherical harmonic coefficients describing a three-dimensional sound field to identify a spatial masking threshold. The processor may further be configured to render multi-channel audio data from the spherical harmonic coefficients, and compress the multi-channel audio data based on the identified spatial masking threshold to generate a bitstream.
Abstract:
An example device includes a memory configured to store a plurality of representations of a soundfield, each representation of the soundfield comprising a different set of ambisonic coefficients representative of the same soundfield at concurrent periods of time. The device also includes a processor, coupled to the memory, and the processor is configured to perform audio playback based on a field of view and on a particular representation of the soundfield from the plurality of representations.
Abstract:
In general, disclosed is a device that includes one or more processors, coupled to a memory, configured to perform an energy analysis with respect to one or more audio objects, in the ambisonics domain, in a first time segment. The one or more processors are also configured to compute a similarity measure between the one or more audio objects, in the ambisonics domain, in the first time segment, and the one or more audio objects, in the ambisonics domain, in a second time segment. In addition, the one or more processors are configured to reorder the one or more audio objects, in the ambisonics domain, in the first time segment relative to the one or more audio objects, in the ambisonics domain, in the second time segment, to generate one or more reordered audio objects in the first time segment.
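One way to picture the energy analysis, similarity measure, and reorder is the minimal sketch below. It assumes each audio object in a segment is flattened to one row vector, that both segments hold the same number of objects, and that cosine similarity with a greedy, loudest-first assignment is an acceptable matching rule. None of these choices comes from the abstract, and `reorder_first_segment` is a hypothetical name.

```python
import numpy as np

def reorder_first_segment(seg1, seg2):
    """Reorder rows of seg1 so that row j best matches row j of seg2.

    seg1, seg2: arrays of shape (n_objects, n_values), same n_objects.
    Returns (reordered_seg1, order), where order[j] is the index in seg1 of
    the object placed at position j.
    """
    eps = 1e-12
    energy = np.sum(seg1 ** 2, axis=1)  # energy analysis per object
    unit1 = seg1 / (np.linalg.norm(seg1, axis=1, keepdims=True) + eps)
    unit2 = seg2 / (np.linalg.norm(seg2, axis=1, keepdims=True) + eps)
    sim = np.abs(unit1 @ unit2.T)       # sim[i, j]: seg1 object i vs seg2 object j

    order = np.full(len(seg1), -1, dtype=int)
    taken = set()
    # Greedy assignment: the most energetic objects pick their match first.
    for i in np.argsort(-energy):
        j = next(int(c) for c in np.argsort(-sim[i]) if int(c) not in taken)
        taken.add(j)
        order[j] = int(i)
    return seg1[order], order
```

A greedy pass is not guaranteed to find the globally best pairing; an optimal assignment method could be substituted if that matters.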
Abstract:
In general, various aspects of the techniques are described for selecting audio streams based on motion. A device comprising a processor and a memory may be configured to perform the techniques. The processor may be configured to obtain a current location of the device, and obtain capture locations. Each of the capture locations may identify a location at which a respective one of the audio streams is captured. The processor may also be configured to select, based on the current location and the capture locations, a subset of the audio streams, where the subset has fewer audio streams than the full set of audio streams. The processor may further be configured to reproduce, based on the subset of the audio streams, a soundfield. The memory may be configured to store the subset of the audio streams.
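The selection step can be sketched as a nearest-neighbor choice. The example below assumes locations are plain coordinate tuples and that "select based on the current location and the capture locations" means keeping the `k` streams captured closest to the device. The abstract does not state the selection rule, so the distance criterion, the function name `select_streams`, and the parameter `k` are illustrative assumptions.

```python
import math

def select_streams(current_location, capture_locations, k):
    """Return indices of the k audio streams captured nearest the device.

    k must be smaller than the number of streams, so the result is a
    proper subset as the abstract requires.
    """
    if not 0 < k < len(capture_locations):
        raise ValueError("k must satisfy 0 < k < number of streams")
    # Rank stream indices by Euclidean distance to the current location.
    ranked = sorted(
        range(len(capture_locations)),
        key=lambda i: math.dist(current_location, capture_locations[i]),
    )
    return sorted(ranked[:k])

# Example: four capture points, keep the two nearest the origin.
nearest = select_streams((0, 0), [(5, 5), (1, 0), (0, 2), (10, 0)], 2)
```

As the device moves, calling the function again with the updated location would yield a new subset.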
Abstract:
In general, techniques are described by which to perform spatial relation coding using virtual higher order ambisonic coefficients. A device comprising a memory and a processor may perform the techniques. The memory may be configured to store audio data, the audio data representative of a zero-ordered higher order ambisonic (HOA) coefficient and one or more greater-than-zero-ordered HOA coefficients. The processor may be configured to obtain, based on the one or more greater-than-zero-ordered HOA coefficients, a virtual zero-ordered HOA coefficient. The processor may also be configured to obtain, based on the virtual zero-ordered HOA coefficient, one or more parameters from which to synthesize the one or more greater-than-zero-ordered HOA coefficients. The processor may further be configured to generate a bitstream that includes a first indication representative of the zero-ordered HOA coefficient, and a second indication representative of the one or more parameters.
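To make the idea of a virtual zero-ordered coefficient concrete, the sketch below treats the three first-order coefficients (called `x`, `y`, `z` here) as a vector per sample. It takes that vector's magnitude as the virtual zero-ordered coefficient and two angles as the parameters. This mapping is an assumption chosen because it is exactly invertible; the abstract does not give the formula, normalization conventions are ignored, and the function names are hypothetical.

```python
import numpy as np

def derive_parameters(x, y, z):
    """Return (virtual_w, theta, phi) per sample from first-order coefficients."""
    virtual_w = np.sqrt(x ** 2 + y ** 2 + z ** 2)
    safe = np.where(virtual_w > 0, virtual_w, 1.0)   # avoid division by zero
    theta = np.arccos(np.clip(z / safe, -1.0, 1.0))  # polar angle
    phi = np.arctan2(y, x)                           # azimuth
    return virtual_w, theta, phi

def synthesize_first_order(w, theta, phi):
    """Rebuild first-order coefficients from a zero-ordered value and angles."""
    x = w * np.sin(theta) * np.cos(phi)
    y = w * np.sin(theta) * np.sin(phi)
    z = w * np.cos(theta)
    return x, y, z
```

In this sketch only the angles would need to accompany the zero-ordered coefficient in the bitstream. Synthesizing from the virtual value reproduces the inputs exactly, while synthesizing from the transmitted zero-ordered coefficient would be an approximation.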