Abstract:
A communications device is presented for providing bidirectional audio communications between a near-end user and a far-end user via a bidirectional communications channel. The communications device includes an adaptive echo canceller receiving a near-end audio signal and a far-end audio signal and providing an echo-canceled near-end audio signal for transmission to the far-end user via the communications channel. The adaptive echo canceller includes a first bank of analysis filters for filtering the near-end audio signal, a second bank of analysis filters for filtering the far-end audio signal, and a bank of synthesis filters for filtering sub-band echo-canceled signals generated within the adaptive echo canceller. The first and second filter banks have a frequency response optimized to reduce echo residual gain.
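The abstract does not say which adaptation rule the echo canceller uses, so the following is only a minimal sketch of the general idea of adaptive echo cancellation. It assumes a normalized LMS (NLMS) update and operates on the full-band signal; the analysis and synthesis filter banks described above are omitted, and in a sub-band design a filter of this kind would presumably run once per sub-band. The function name and the `taps`, `mu`, and `eps` parameters are illustrative choices, not taken from the source.

```python
def nlms_echo_cancel(far, near, taps=8, mu=0.5, eps=1e-8):
    """Remove an estimate of the far-end echo from the near-end signal.

    far:  far-end samples (the signal that leaks back as echo)
    near: near-end samples (local signal plus echo)
    Returns the echo-canceled near-end samples.
    NLMS is an assumed adaptation rule, not one named in the abstract.
    """
    weights = [0.0] * taps   # current estimate of the echo path
    history = [0.0] * taps   # most recent far-end samples, newest first
    out = []
    for x, d in zip(far, near):
        history = [x] + history[:-1]
        # Predicted echo from the current echo-path estimate.
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        error = d - echo_estimate  # echo-canceled output sample
        # Normalize the step size by the energy of the far-end history.
        energy = sum(h * h for h in history) + eps
        weights = [w + mu * error * h / energy
                   for w, h in zip(weights, history)]
        out.append(error)
    return out
```

With a noise-free, two-tap simulated echo path, the residual in this sketch decays toward zero as the weights converge; real rooms need far longer filters, which is one motivation for working in sub-bands.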
Abstract:
A spatial element is added to communications, including over telephone conference calls heard through headphones or a stereo speaker setup. Functions are created to modify signals from different callers to create the illusion that the callers are speaking from different parts of the room.
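As a rough illustration of how a per-caller function can place voices at different positions, the sketch below uses constant-power stereo panning. This is an assumed stand-in: the abstract does not specify the functions it uses, and a real system might rely on head-related transfer functions instead. The names `pan_gains` and `spatialize` and the azimuth convention are hypothetical.

```python
import math

def pan_gains(azimuth_deg):
    """Constant-power (left, right) gains for an azimuth in [-90, 90] degrees."""
    # Map -90 (full left) .. +90 (full right) onto 0 .. pi/2.
    theta = (azimuth_deg + 90.0) / 180.0 * (math.pi / 2.0)
    return math.cos(theta), math.sin(theta)

def spatialize(callers):
    """Mix mono caller signals into one stereo pair.

    callers: list of (samples, azimuth_deg) tuples, samples being floats.
    Returns (left, right) sample lists as long as the longest caller signal.
    """
    length = max(len(samples) for samples, _ in callers)
    left = [0.0] * length
    right = [0.0] * length
    for samples, azimuth in callers:
        gain_left, gain_right = pan_gains(azimuth)
        for i, s in enumerate(samples):
            left[i] += gain_left * s
            right[i] += gain_right * s
    return left, right
```

Assigning each caller a distinct azimuth, for example -60, 0, and 60 degrees for three participants, separates the voices across the stereo image.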
Abstract:
An encoder performs context-adaptive arithmetic encoding of transform coefficient data. For example, an encoder switches between coding of direct levels of quantized transform coefficient data and run-level coding of run lengths and levels of quantized transform coefficient data. The encoder can determine when to switch between coding modes based on a pre-determined switch point or by counting consecutive coefficients having a predominant value (e.g., zero). A decoder performs corresponding context-adaptive arithmetic decoding.
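The mode switch can be sketched as follows. This example covers only the symbol-generation step: it emits direct levels until a chosen number of consecutive zeros has been counted, then switches to run-level pairs. The arithmetic coding stage and its context modeling are left out entirely, and the threshold value, symbol tuples, and function names are assumptions made for illustration.

```python
def to_symbols(coeffs, zero_run_threshold=3):
    """Turn quantized coefficients into direct-level, then run-level symbols."""
    symbols = []
    zeros_seen = 0
    i = 0
    # Phase 1: code each level directly while counting consecutive zeros.
    while i < len(coeffs) and zeros_seen < zero_run_threshold:
        level = coeffs[i]
        symbols.append(("level", level))
        zeros_seen = zeros_seen + 1 if level == 0 else 0
        i += 1
    # Phase 2: code (run of zeros, nonzero level) pairs.
    run = 0
    for level in coeffs[i:]:
        if level == 0:
            run += 1
        else:
            symbols.append(("run_level", run, level))
            run = 0
    if run:
        # Zeros left over after the last nonzero level.
        symbols.append(("trailing_zeros", run))
    return symbols

def from_symbols(symbols):
    """Invert to_symbols, as a corresponding decoder would."""
    coeffs = []
    for sym in symbols:
        if sym[0] == "level":
            coeffs.append(sym[1])
        elif sym[0] == "run_level":
            coeffs.extend([0] * sym[1])
            coeffs.append(sym[2])
        else:
            coeffs.extend([0] * sym[1])
    return coeffs
```

Because the decoder sees the same levels as the encoder, it could count zeros in the same way and detect the switch point without extra signaling; this sketch sidesteps that by tagging each symbol with its mode.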
Abstract:
An audio encoder encodes side information into a compressed audio bitstream. The side information contains encoding parameters used by the encoder for one or more encoding techniques, such as a noise-mask-ratio curve used for rate control. A transcoder uses the encoder-generated side information to transcode the audio from the original compressed bitstream, which has an initial bit rate, into a second bitstream with a new bit rate. Because the side information is derived from the original audio, the transcoder can better maintain audio quality during transcoding. The side information also allows the transcoder to re-encode from an intermediate decoding/encoding stage, making transcoding faster and less complex.
Abstract:
An audio encoder and decoder use architectures and techniques that improve the efficiency of multi-channel audio coding and decoding. The described strategies include various techniques and tools, which can be used in combination or independently. For example, an audio encoder performs a pre-processing multi-channel transform on multi-channel audio data, varying the transform so as to control quality. The encoder groups multiple windows from different channels into one or more tiles and outputs tile configuration information, which allows the encoder to isolate transients that appear in a particular channel with small windows, but use large windows in other channels. Using a variety of techniques, the encoder performs flexible multi-channel transforms that effectively take advantage of inter-channel correlation. An audio decoder performs corresponding processing and decoding. In addition, the decoder performs a post-processing multi-channel transform for any of multiple different purposes.
Abstract:
The disclosed architecture employs signal processing techniques to provide audio perception only, or audio perception that matches the visual perception. The architecture also provides spatial audio reproduction for multiparty teleconferencing such that the teleconferencing participants perceive themselves as if they were sitting in the same room. The solution is based on the premise that people perceive sounds as a reconstructed wavefront, and hence the wavefronts are used to provide the spatial perceptual cues. The differences between the spatial perceptual cues derived from the reconstructed wavefront of sound waves and those derived from the ideal wavefront form an objective metric for spatial perceptual quality and provide a means of evaluating overall system performance. Additionally, compensation filters are employed to improve the spatial perceptual quality of stereophonic systems by optimizing the objective metric.
Abstract:
An audio encoder encodes a combined channel (e.g., a sum channel) for a group of plural physical audio channels. The encoder determines plural parameters for representing individual physical channels of the group as modified versions of the encoded combined channel. The plural parameters comprise ratios of power in each individual channel to power in the combined channel (e.g., a ratio of the power of a right channel to the power of the combined channel, and a ratio of the power of the left channel to the power of the combined channel). The plural parameters can include a complex parameter. The combined channel and the plural parameters facilitate reconstruction at the audio decoder of source channels. An audio decoder performs a forward complex transform on the multi-channel audio data and reconstructs plural channels from the multi-channel audio data. The decoder can maintain second-order statistics for the source channels.
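The power-ratio parameters can be illustrated with a small time-domain sketch. It forms a sum channel, computes the ratio of each channel's power to the combined channel's power, and rebuilds each channel by scaling the combined channel so that its power matches the original. The complex parameter and the complex transform mentioned above are not modeled, so inter-channel phase and correlation are lost here. All function names are hypothetical.

```python
def power(x):
    """Mean power of a block of samples."""
    return sum(s * s for s in x) / len(x)

def encode_params(left, right):
    """Return the sum channel and the per-channel power ratios."""
    combined = [l + r for l, r in zip(left, right)]
    p_combined = power(combined)
    if p_combined == 0.0:
        # Silent combined channel: no meaningful ratio, so send zeros.
        return combined, 0.0, 0.0
    return combined, power(left) / p_combined, power(right) / p_combined

def reconstruct(combined, ratio_left, ratio_right):
    """Scale the combined channel so each output has the original power."""
    gain_left = ratio_left ** 0.5
    gain_right = ratio_right ** 0.5
    return ([gain_left * s for s in combined],
            [gain_right * s for s in combined])
```

Scaling by the square root of a power ratio preserves each channel's power, which is one of the second-order statistics the abstract says the decoder can maintain.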
Abstract:
The described techniques and tools include ones for mapping digital media data (e.g., audio, video, still images, and/or text, among others) in a given format to a transport or file container format useful for encoding the data on optical disks such as digital video disks (DVDs). A digital media universal elementary stream can be used to map digital media streams (e.g., an audio stream, a video stream, or an image) into any arbitrary transport or file container, including optical disk formats and other transports such as broadcast streams and wireless transmissions. The information needed to decode any given frame of the digital media in the stream can be carried in each coded frame. A digital media universal elementary stream includes stream components called chunks. An implementation of a digital media universal elementary stream arranges data for a media stream in frames, the frames having one or more chunks.
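The frame-and-chunk arrangement can be pictured with a small serialization sketch. The abstract does not give the byte layout, so the chunk header used here (a 1-byte type followed by a 4-byte big-endian payload length) is purely hypothetical, as are the function names and the type values in the usage note.

```python
import struct

HEADER = ">BI"                        # hypothetical: 1-byte type, 4-byte length
HEADER_SIZE = struct.calcsize(HEADER)  # 5 bytes, no padding in big-endian mode

def pack_chunk(chunk_type, payload):
    """Serialize one chunk as header plus payload bytes."""
    return struct.pack(HEADER, chunk_type, len(payload)) + payload

def pack_frame(chunks):
    """Serialize a frame as the concatenation of its chunks.

    chunks: list of (chunk_type, payload_bytes) tuples, at least one.
    """
    return b"".join(pack_chunk(t, p) for t, p in chunks)

def unpack_frame(data):
    """Split a serialized frame back into (chunk_type, payload) tuples."""
    chunks = []
    offset = 0
    while offset < len(data):
        chunk_type, length = struct.unpack_from(HEADER, data, offset)
        offset += HEADER_SIZE
        chunks.append((chunk_type, data[offset:offset + length]))
        offset += length
    return chunks
```

For example, a frame might carry one chunk holding stream properties and another holding coded audio, so that a decoder receiving only that frame has what it needs to decode it.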