Abstract:
Provided are an apparatus and a method for integrally encoding and decoding a speech signal and an audio signal. The encoding apparatus may include: an input signal analyzer to analyze a characteristic of an input signal; a first conversion encoder to convert the input signal to a frequency-domain signal and to encode it when the input signal is an audio characteristic signal; a Linear Predictive Coding (LPC) encoder to perform LPC encoding of the input signal when the input signal is a speech characteristic signal; and a bitstream generator to generate a bitstream using an output signal of the first conversion encoder and an output signal of the LPC encoder.
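The analyzer-plus-dispatch structure above can be illustrated with a toy sketch. The zero-crossing-rate heuristic, the threshold, and all names below are assumptions standing in for the patent's (unspecified) input signal analyzer, not the claimed method:

```python
import numpy as np

def classify_and_encode(frame, transform_encode, lpc_encode, zcr_threshold=0.1):
    """Toy dispatcher: a zero-crossing-rate heuristic (an assumption, not
    the patent's analyzer) routes the frame to a frequency-domain coder or
    an LPC coder."""
    # Fraction of adjacent samples whose sign differs.
    zcr = np.mean(np.abs(np.diff(np.sign(frame))) > 0)
    if zcr > zcr_threshold:
        return transform_encode(frame)   # treated as an audio characteristic signal
    return lpc_encode(frame)             # treated as a speech characteristic signal
```

In a real codec the classifier would be far more elaborate (e.g. spectral and tonality features); this only shows the two-branch encoding path feeding one bitstream generator.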
Abstract:
The present invention relates to a method and an apparatus for processing a signal, used to effectively reproduce an audio signal, and more particularly to a method and an apparatus for filtering input audio signals with low computational complexity. To this end, provided is a method for processing an audio signal, including: receiving an input audio signal; receiving truncated subband filter coefficients for filtering each subband signal of the input audio signal, the truncated subband filter coefficients being at least a portion of subband filter coefficients obtained from binaural room impulse response (BRIR) filter coefficients for binaural filtering of the input audio signal, the lengths of the truncated subband filter coefficients being determined based on filter order information obtained at least partially from characteristic information extracted from the corresponding subband filter coefficients, and the truncated subband filter coefficients being constituted by at least one FFT filter coefficient on which a fast Fourier transform (FFT) with a predetermined block size in the corresponding subband has been performed; performing the fast Fourier transform of the subband signal based on a predetermined subframe size in the corresponding subband; generating a filtered subframe by multiplying the fast-Fourier-transformed subframe by the FFT filter coefficients; inverse fast Fourier transforming the filtered subframe; and generating a filtered subband signal by overlap-adding the at least one inverse-fast-Fourier-transformed subframe. Also provided is an apparatus for processing an audio signal using the same.
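The per-subband fast convolution described above amounts to block-wise FFT filtering with overlap-add. The function below is a minimal sketch under the assumption that the truncated filter is no longer than the block size; the names and the exact zero-padding scheme are illustrative, not the patent's procedure:

```python
import numpy as np

def subband_fast_convolution(subband_signal, trunc_filter, block_size):
    """Block-wise FFT filtering with overlap-add for one subband.
    Assumes len(trunc_filter) <= block_size so a 2*block_size FFT
    avoids circular-convolution wrap-around."""
    fft_size = 2 * block_size
    filt = np.fft.rfft(trunc_filter, fft_size)   # precomputed FFT filter coefficients
    out = np.zeros(len(subband_signal) + len(trunc_filter) - 1)
    for start in range(0, len(subband_signal), block_size):
        frame = subband_signal[start:start + block_size]
        spec = np.fft.rfft(frame, fft_size)      # FFT of the subframe
        seg = np.fft.irfft(spec * filt, fft_size)  # multiply, then inverse FFT
        end = min(start + fft_size, len(out))
        out[start:end] += seg[:end - start]      # overlap-add the filtered subframes
    return out
```

The result equals a direct linear convolution, but each subframe only costs two FFTs and one spectral multiply, which is the low-complexity point of the abstract.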
Abstract:
Disclosed is an object-based audio content generating/playing apparatus. The apparatus may include an object audio signal obtaining unit to obtain a plurality of object audio signals by recording a plurality of sound source signals, a recording space information obtaining unit to obtain recording space information with respect to a recording space of the plurality of sound source signals, a sound source location information obtaining unit to obtain location information of the plurality of sound source signals, and an encoding unit to generate object-based audio content by encoding at least one of the plurality of object audio signals, the recording space information, and the sound source location information, thereby enabling the object-based audio content to be played using at least one of a Wave Field Synthesis (WFS) scheme and a multi-channel surround scheme regardless of the audience's reproduction environment.
Abstract:
A Unified Speech and Audio Codec (USAC) that may process a window sequence based on mode switching is provided. The USAC may perform encoding or decoding by overlapping between frames based on a folding point when mode switching occurs. The USAC may process a different window sequence for each situation when performing encoding or decoding, and thereby may improve coding efficiency.
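The frame overlap at a folding point can be illustrated with a power-complementary sine-window cross-fade between the outgoing and incoming frames. This is only a sketch: the actual USAC transition windows and folding behavior are defined in the standard, and the names here are assumptions:

```python
import numpy as np

def fold_point_crossfade(prev_tail, next_head):
    """Overlap the outgoing frame's tail and the incoming frame's head
    around a folding point using the sine-window pair, which satisfies
    w_out**2 + w_in**2 == 1 (the power-complementary condition used by
    MDCT-style overlap)."""
    n = len(prev_tail)
    t = (np.arange(n) + 0.5) / n
    w_out = np.cos(0.5 * np.pi * t)   # fades 1 -> 0 across the overlap
    w_in = np.sin(0.5 * np.pi * t)    # fades 0 -> 1 across the overlap
    return prev_tail * w_out + next_head * w_in
```

In the codec, choosing a different window shape per mode-switch situation (long-to-short, transform-to-LPC, etc.) is what the abstract means by processing a different window sequence for each situation.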
Abstract:
An audio rendering method and an electronic device performing the same are disclosed. The disclosed audio rendering method includes determining an air absorption attenuation amount of an audio signal based on a recording distance included in metadata of the audio signal and a source distance between a sound source of the audio signal and a listener; and rendering the audio signal based on the air absorption attenuation amount.
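One plausible reading of the abstract is that absorption should be applied only over the extra distance beyond the recording distance, since the recording already includes absorption up to that point. The sketch below assumes that reading and a hypothetical per-band absorption coefficient array in dB per metre; none of this is the patent's exact formula:

```python
import numpy as np

def air_absorption_gain(recording_distance, source_distance, alpha_db_per_m):
    """Per-band linear gains for air absorption over the extra propagation
    distance (source distance minus recording distance). alpha_db_per_m is
    a hypothetical per-band absorption coefficient array in dB/m."""
    extra = source_distance - recording_distance      # negative extra -> boost
    atten_db = np.asarray(alpha_db_per_m) * extra     # dB attenuation per band
    return 10.0 ** (-atten_db / 20.0)                 # convert dB to linear gain
```

Rendering would then multiply each frequency band of the audio signal by the corresponding gain before spatialization.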
Abstract:
Provided is an encoding apparatus for integrally encoding and decoding a speech signal and an audio signal, which may include: an input signal analyzer to analyze a characteristic of an input signal; a stereo encoder to downmix the input signal to a mono signal when the input signal is a stereo signal, and to extract stereo sound image information; a frequency band expander to expand a frequency band of the input signal; a sampling rate converter to convert a sampling rate; a speech signal encoder to encode the input signal using a speech encoding module when the input signal is a speech characteristic signal; an audio signal encoder to encode the input signal using an audio encoding module when the input signal is an audio characteristic signal; and a bitstream generator to generate a bitstream.
Abstract:
A method of rendering object-based audio and an electronic device for performing the method are disclosed. The method includes identifying metadata of the object-based audio, determining whether the metadata includes a parameter set for an atmospheric absorption effect for each distance, and, when the metadata includes the parameter set, rendering the object-based audio using a distance between the object-based audio and a listener obtained from the metadata and the atmospheric absorption effect of medium attenuation based on the parameter set.
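When metadata tabulates an absorption parameter set per distance, the renderer needs parameters for the listener's actual distance, which generally falls between tabulated entries. The sketch below assumes simple linear interpolation between the two nearest tabulated distances; the table layout and interpolation rule are assumptions, not the patent's specification:

```python
import numpy as np

def absorption_from_table(distances, param_table, query_distance):
    """Pick medium-attenuation parameters for the current listener distance
    by linearly interpolating a per-distance parameter table. Rows of
    param_table align with entries of distances (both hypothetical)."""
    distances = np.asarray(distances, dtype=float)
    param_table = np.asarray(param_table, dtype=float)
    # Interpolate each parameter column independently over distance.
    return np.array([np.interp(query_distance, distances, param_table[:, k])
                     for k in range(param_table.shape[1])])
```

Outside the tabulated range, `np.interp` clamps to the nearest row, which is a reasonable default for a renderer.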
Abstract:
Provided is an acoustic signal processing device for a spatially extended sound source and a method thereof. The acoustic signal processing device includes a memory configured to store instructions, and a processor electrically connected to the memory and configured to execute the instructions. When the instructions are executed by the processor, the processor performs a plurality of operations, and the plurality of operations includes transforming an object provided as a spatially extended sound source into a cuboid in a virtual reality (VR) space, obtaining coordinates of the cuboid, and determining a position of a sound source of the object based on the coordinates of the cuboid and coordinates of a user in the VR space.
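The abstract states that the source position is determined from the cuboid coordinates and the user coordinates, without fixing the rule. One common choice for an axis-aligned box, used purely as an illustrative assumption here, is the point of the cuboid closest to the listener, obtained by clamping the listener coordinates to the box bounds:

```python
import numpy as np

def source_position_on_cuboid(box_min, box_max, listener):
    """One plausible source position for a spatially extended source
    modelled as an axis-aligned cuboid: the cuboid point nearest the
    listener. The nearest-point rule is an assumption, not the patent's
    stated method."""
    return np.clip(np.asarray(listener, dtype=float),
                   np.asarray(box_min, dtype=float),
                   np.asarray(box_max, dtype=float))
```

If the listener is inside the cuboid, this returns the listener position itself, which matches the intuition that an extended source surrounds a listener standing within it.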
Abstract:
A method and apparatus for performing binaural rendering of an audio signal are provided. The method includes identifying an object-based input signal and metadata that includes distance information indicating a distance to the object, generating a binaural filter based on the metadata using a binaural room impulse response, obtaining a binaural filter to which a low-pass filter (LPF) is applied using a frequency response control based on the distance information, and generating a binaural-rendered output signal by performing a convolution of the input signal and the binaural filter to which the LPF is applied.
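The pipeline above can be sketched end to end: low-pass the binaural filter with a cutoff controlled by distance, then convolve. The distance-to-cutoff mapping and the one-pole LPF below are illustrative assumptions; the patent does not specify either:

```python
import numpy as np

def distance_lpf_binaural(input_sig, brir, distance, fs=48000.0):
    """Apply a one-pole low-pass to the binaural filter, with a cutoff
    that falls as distance grows (hypothetical control curve), then
    convolve the input with the filtered BRIR."""
    cutoff = max(1000.0, 20000.0 / (1.0 + distance))  # assumed frequency response control
    a = np.exp(-2.0 * np.pi * cutoff / fs)            # one-pole smoothing coefficient
    filt = np.empty_like(np.asarray(brir, dtype=float))
    y = 0.0
    for i, x in enumerate(brir):                      # IIR low-pass over the BRIR taps
        y = (1.0 - a) * x + a * y
        filt[i] = y
    return np.convolve(input_sig, filt)               # binaural-rendered output
```

A production renderer would do this per ear and per band (and likely in the frequency domain), but the structure -- metadata-driven filter shaping followed by convolution -- matches the abstract.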