Abstract:
Described are techniques to use adaptive learning to control bandwidth or rate of transmission of a computer on a network. Congestion observations such as packet delay and packet loss are used to compute a congestion signal. The congestion signal is correlated with information about actual congestion on the network, and the transmission rate is adjusted according to the degree of correlation. Transmission rate may not adjust when packet delay or packet loss is not strongly correlated with actual congestion. The congestion signal is adaptively learned. For instance, the relative effects of loss and delay on the congestion signal may change over time. Moreover, an operating congestion level may be minimized by adaptive adjustment.
Abstract:
An audio encoder/decoder uses a combination of an overlap windowing transform and block transform that have reversible implementations to provide a reversible, integer-integer form of a lapped transform. The reversible lapped transform permits both lossy and lossless transform domain coding of an audio signal having variable subframe sizes.
Abstract:
An encoder performs context-adaptive arithmetic encoding of transform coefficient data. For example, an encoder switches between coding of direct levels of quantized transform coefficient data and run-level coding of run lengths and levels of quantized transform coefficient data. The encoder can determine when to switch between coding modes based on a pre-determined switch point or by counting consecutive coefficients having a predominant value (e.g., zero). A decoder performs corresponding context-adaptive arithmetic decoding.
Abstract:
An audio encoder performs frequency extension coding that comprises determining one or more shape parameters using a displacement vector that corresponds to a displacement of an even number (e.g., an even number of sub-bands between a sub-band in a baseband frequency range and a sub-band in an extended-band frequency range). The shape parameters can be determined on a per-audio-block basis. Restricting a displacement to an even number (in frequency extension coding or in other signal modulation schemes) can improve the quality of reconstructed audio. An audio encoder also can perform frequency extension coding that comprises determining one or more scale parameters at one or more audio blocks, and determining one or more anchor points for interpolating the one or more scale parameters.
Abstract:
An audio encoder receives multi-channel audio data comprising a group of plural source channels and performs channel extension coding, which comprises encoding a combined channel for the group and determining plural parameters for representing individual source channels of the group as modified versions of the encoded combined channel. The encoder also performs frequency extension coding. The frequency extension coding can comprise, for example, partitioning frequency bands in the multi-channel audio data into a baseband group and an extended band group, and coding audio coefficients in the extended band group based on audio coefficients in the baseband group. The encoder also can perform other kinds of transforms. An audio decoder performs corresponding decoding and/or additional processing tasks, such as a forward complex transform.
Abstract:
An audio encoder performs adaptive entropy encoding of audio data. For example, an audio encoder switches between variable dimension vector Huffman coding of direct levels of quantized audio data and run-level coding of run lengths and levels of quantized audio data. The encoder can use, for example, context-based arithmetic coding for coding run lengths and levels. The encoder can determine when to switch between coding modes by counting consecutive coefficients having a predominant value (e.g., zero). An audio decoder performs corresponding adaptive entropy decoding.
Abstract:
An audio encoder/decoder performs band partitioning for vector quantization encoding of spectral holes and missing high frequencies that result from quantization when encoding at low bit rates. The encoder/decoder determines a band structure for spectral holes based on two threshold parameters: a minimum hole size threshold and a maximum band size threshold. Spectral holes wider than the minimum hole size threshold are partitioned evenly into bands not exceeding the maximum band size threshold in size. Such hole filling bands are configured up to a preset number of hole filling bands. The bands for missing high frequencies are then configured by dividing the high frequency region into bands having binary-increasing, linearly-increasing or arbitrarily-configured band sizes up to a maximum overall number of bands.
Abstract:
Construction and use of forward error correction codes is provided. A systematic MDS FEC code is obtained having a property wherein any set of contiguous or non-contiguous r packets can be lost during a data transmission of k data packets and r encoded packets and the original k packets can be recovered unambiguously. The systematic MDS FEC code is transformed into a (k+r, k) systematic MDS FEC code that guarantees at least one of the encoded packets is a parity packet. The starting systematic MDS FEC code may be Cauchy-based, and the transformation code derived from the starting Cauchy-based MDS FEC code allows for very efficient initialization, encoding and decoding operations.
Abstract:
A system and method for correcting errors and losses occurring during a receiver-driven layered multicast (RLM) of real-time media over a heterogeneous packet network such as the Internet. This is accomplished by augmenting RLM with one or more layers of error correction information. This allows each receiver to separately optimize the quality of received audio and video information by subscribing to at least one error correction layer. Ideally, each source layer in a RLM would have one or more multicasted error correction data streams (i.e., layers) associated therewith. Each of the error correction layers would contain information that can be used to replace lost packets from the associated source layer. More than one error correction layer is proposed as some of the error correction packets contained in the data stream needed to replace the packets lost in the associated source stream may themselves be lost in transmission. A preferred process for generating the error correction streams involves the use of a unique adaptation of the Forward Error Correction (FEC) techniques. This process encodes the transmission data using a linear transform which adds redundant elements. The redundancy permits losses to be corrected because any of the original data elements can be derived from any of the encoded elements. Thus, as long as enough of the encoded data elements are received so as to equal the number of the original data elements, it is possible to derive all the original elements.
Abstract:
Techniques and tools for adjusting quality and bit rate of multiple chunks of media delivered over a network are described. For example, each of the multiple chunks is encoded as multiple layers (e.g., a base layer and multiple embedded residual layers) for fine-grained scalability at different rate/quality points. A server stores the encoded data for the layers of chunks as well as curve information that parameterizes rate-distortion curves for the chunks. The server sends the curve information to a client. For the multiple chunks, the client uses the curve information to determine rate-distortion preferences for the respective chunks, then sends feedback indicating the rate-distortion preferences to the server. For each of the multiple chunks, the server, based at least in part upon the feedback, selects one or more scalable layers of the chunk to deliver to the client.