Abstract:
An apparatus and method for encoding video frames are provided. The video frames are divided into blocks for encoding. Encoding of the video blocks uses motion detection, motion estimation, and adaptive compression to obtain the desired compression for a particular bit rate. Adaptive compression includes intra compression (without regard to other frames) and inter compression (with regard to other frames). Intra compression, inter compression with motion detection, and inter compression with motion estimation are performed on a block-by-block basis, as needed. Segmentation is provided to compare encoding of a block with encoding of its sub-blocks, and to select the best block size for encoding.
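The block-versus-sub-block comparison described above might be sketched as follows. This is only an illustration under stated assumptions: the cost measure (sum of absolute deviations from the block mean plus a fixed per-block header), the constant HEADER_COST, and the names block_cost, split4, and choose_block_size are all hypothetical and are not taken from the abstract.

```python
HEADER_COST = 8  # hypothetical fixed overhead for signalling one block


def block_cost(block):
    """Toy cost proxy: deviation from the block mean plus a per-block header."""
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    return sum(abs(p - mean) for p in pixels) + HEADER_COST


def split4(block):
    """Split a square block into its four quadrant sub-blocks."""
    h = len(block) // 2
    w = len(block[0]) // 2
    return [
        [row[:w] for row in block[:h]],
        [row[w:] for row in block[:h]],
        [row[:w] for row in block[h:]],
        [row[w:] for row in block[h:]],
    ]


def choose_block_size(block, min_size=2):
    """Return ("whole", cost) or ("split", cost), whichever is cheaper."""
    whole = block_cost(block)
    if len(block) <= min_size:
        return ("whole", whole)
    # Cost of the best encoding of each sub-block, found recursively.
    split_cost = sum(choose_block_size(b, min_size)[1] for b in split4(block))
    if split_cost < whole:
        return ("split", split_cost)
    return ("whole", whole)


flat = [[10] * 4 for _ in range(4)]
quadrants = [
    [0, 0, 100, 100],
    [0, 0, 100, 100],
    [50, 50, 200, 200],
    [50, 50, 200, 200],
]
print(choose_block_size(flat)[0])       # uniform block: kept whole
print(choose_block_size(quadrants)[0])  # four distinct regions: split
```

A real encoder would presumably compare actual bit counts or a rate-distortion measure rather than this toy proxy.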
Abstract:
Frequency segmentation is important to the quality of encoding spectral data. Segmentation involves breaking the spectral data into units called sub-bands or vectors. Homogeneous segmentation may be suboptimal. Various features are described for providing segmentation that depends on the intensity of the spectral data. Finer segmentation is provided for regions of greater spectral variance and coarser segmentation is provided for more homogeneous regions. Sub-bands that have similar characteristics may be merged with very little effect on quality, whereas sub-bands with highly variable data may be better represented if a sub-band is split. Various methods are described for measuring tonality, energy, or shape of a sub-band. These measurements are discussed in light of deciding when to split or merge sub-bands to provide variable frequency segmentation.
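One way to picture variance-driven segmentation is a recursive split that keeps subdividing a sub-band while its variance stays above a threshold. The sketch below is an assumption-laden illustration, not the described method: it uses plain variance as the only measurement, covers splitting but not merging, and the names variance, segment, var_threshold, and min_width are hypothetical.

```python
def variance(xs):
    """Population variance of a list of coefficients."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


def segment(spectrum, start, end, var_threshold, min_width):
    """Return sub-bands as (start, end) index pairs, end exclusive.

    A sub-band is halved while its variance exceeds var_threshold and
    both halves would still be at least min_width coefficients wide.
    """
    width = end - start
    if width < 2 * min_width or variance(spectrum[start:end]) <= var_threshold:
        return [(start, end)]
    mid = start + width // 2
    return (segment(spectrum, start, mid, var_threshold, min_width)
            + segment(spectrum, mid, end, var_threshold, min_width))


# Homogeneous low half, highly variable high half (made-up values).
spectrum = [1.0] * 8 + [0, 9, 1, 8, 0, 9, 1, 8]
bands = segment(spectrum, 0, len(spectrum), var_threshold=1.0, min_width=2)
print(bands)
```

With this input the homogeneous half stays a single coarse sub-band while the variable half is divided down to the minimum width.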
Abstract:
For encoding of mixed-mode images containing text and continuous-tone content, the pixels in the image that form the text content are detected and separated. Text detection classifies pixels as text or continuous tone content by accumulating pixel counts for groups of contiguous, non-smooth pixels with the same color. Groups whose pixel count exceeds a threshold are classified as text. The text detection technique further reduces classification errors by testing for boundary dimensions and pixel density of the group characteristic of long straight lines or large borders. The text detection technique further searches the neighborhood of groups qualifying as text for pixels of the same color, so as to also detect pixels for isolated text marks like dots, accents or punctuation. The separated text and continuous-tone content can be encoded separately for efficient compression while preserving text quality, and the text again superimposed on the continuous tone content at decompression.
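The pixel-count step of the text detection can be sketched with a flood fill over same-colored neighbors. The following is a minimal illustration under assumptions that are not in the abstract: smooth pixels are pre-marked as None, contiguity means 4-connectivity, and the function name find_text_pixels is hypothetical. The boundary-dimension, density, and neighborhood-search refinements are omitted.

```python
from collections import deque


def find_text_pixels(image, threshold):
    """Return the set of (row, col) pixels in same-color groups larger than threshold.

    image is a list of rows; None marks a smooth pixel (assumed to be
    classified beforehand), any other value is a color.
    """
    height, width = len(image), len(image[0])
    seen = [[False] * width for _ in range(height)]
    text = set()
    for y in range(height):
        for x in range(width):
            if seen[y][x] or image[y][x] is None:
                continue
            color = image[y][x]
            group = []
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                group.append((cy, cx))
                # Visit the four direct neighbors that share this color.
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not seen[ny][nx] and image[ny][nx] == color):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(group) > threshold:
                text.update(group)
    return text


image = [
    ["k", "k", "k", None],
    [None, "k", None, "r"],
    [None, None, None, None],
]
print(sorted(find_text_pixels(image, threshold=3)))
```

Here the four connected "k" pixels exceed the threshold and are classified as text, while the single "r" pixel is not.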
Abstract:
An audio encoder receives multi-channel audio data comprising a group of plural source channels and performs channel extension coding, which comprises encoding a combined channel for the group and determining plural parameters for representing individual source channels of the group as modified versions of the encoded combined channel. The encoder also performs frequency extension coding. The frequency extension coding can comprise, for example, partitioning frequency bands in the multi-channel audio data into a baseband group and an extended band group, and coding audio coefficients in the extended band group based on audio coefficients in the baseband group. The encoder also can perform other kinds of transforms. An audio decoder performs corresponding decoding and/or additional processing tasks, such as a forward complex transform.
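As a rough illustration of channel extension coding, the sketch below forms a combined channel by averaging the source channels and keeps one scale parameter per channel so that each reconstructed channel matches the energy of its source. The averaging, the choice of an energy-matching scale as the only parameter, and the function names are assumptions made for this example; the abstract does not specify them.

```python
import math


def energy(xs):
    """Sum of squared samples."""
    return sum(x * x for x in xs)


def channel_extension_encode(channels):
    """Return (combined channel, one scale parameter per source channel)."""
    n = len(channels[0])
    combined = [sum(ch[i] for ch in channels) / len(channels) for i in range(n)]
    e = energy(combined)
    # Scale chosen so the scaled combined channel has the source channel's energy.
    scales = [math.sqrt(energy(ch) / e) if e > 0 else 0.0 for ch in channels]
    return combined, scales


def channel_extension_decode(combined, scales):
    """Rebuild each channel as a scaled version of the combined channel."""
    return [[s * x for x in combined] for s in scales]


left = [1.0, 2.0, 3.0, 4.0]
right = [2.0, 4.0, 6.0, 8.0]
combined, scales = channel_extension_encode([left, right])
decoded = channel_extension_decode(combined, scales)
print(scales)
```

Because the two example channels differ only by gain, the scaled combined channel recovers them up to rounding; for less correlated channels only the energy would be preserved.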
Abstract:
The subject disclosure is directed towards partitioning a file into chunks that satisfy a chunk size restriction, such as maximum and minimum chunk sizes, using a sliding window. For file positions within the chunk size restriction, a signature representative of a window fingerprint is compared with a target pattern, with a chunk boundary candidate identified if matched. Other signatures and patterns are then checked to determine a highest ranking signature (corresponding to a lowest numbered Rule) to associate with that chunk boundary candidate, or to set an actual boundary if the highest ranked signature is matched. If the maximum chunk size is reached without matching the highest ranked signature, the chunking mechanism regresses to set the boundary based on the candidate with the next highest ranked signature (if there are no candidates, the boundary is set at the maximum chunk size). Also described is setting chunk boundaries based upon pattern detection (e.g., runs of zeros).
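The ranked-rule chunking with regression might look roughly like the sketch below. Everything concrete in it is an assumption for illustration: the toy window fingerprint, the nested bit masks in RULE_MASKS that stand in for signatures and target patterns, the window length, the choice to keep the earliest candidate of a given rank, and all function names.

```python
WINDOW = 4                       # hypothetical sliding-window length in bytes
RULE_MASKS = [0x3F, 0x1F, 0x0F]  # Rule 1 (highest rank) first; more bits = rarer match


def fingerprint(data, pos):
    """Toy fingerprint of the WINDOW bytes that end just before pos."""
    h = 0
    for b in data[max(0, pos - WINDOW):pos]:
        h = (h * 31 + b) & 0xFFFF
    return h


def rank(fp):
    """Lowest-numbered rule whose pattern matches, or None if none match."""
    for rule, mask in enumerate(RULE_MASKS, start=1):
        if fp & mask == 0:
            return rule
    return None


def chunk_boundaries(data, min_size, max_size):
    """Return chunk end offsets; each chunk respects min_size and max_size."""
    boundaries = []
    start, n = 0, len(data)
    while start < n:
        limit = min(start + max_size, n)
        best_rule, best_pos = None, None
        cut = limit  # default: maximum chunk size, or end of data
        for pos in range(start + min_size, limit + 1):
            rule = rank(fingerprint(data, pos))
            if rule == 1:
                cut = pos  # highest ranked signature: actual boundary
                break
            if rule is not None and (best_rule is None or rule < best_rule):
                best_rule, best_pos = rule, pos  # remember best candidate
        else:
            # Maximum reached without a Rule 1 match: regress to the best candidate.
            if best_pos is not None and limit == start + max_size:
                cut = best_pos
        boundaries.append(cut)
        start = cut
    return boundaries


data = bytes((i * 37 + 11) % 251 for i in range(2000))
cuts = chunk_boundaries(data, min_size=16, max_size=64)
sizes = [b - a for a, b in zip([0] + cuts[:-1], cuts)]
print(len(cuts), min(sizes[:-1]), max(sizes))
```

A production chunker would use a true rolling hash so that each window position costs constant time; the direct recomputation here only keeps the example short.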
Abstract:
A low computational power digital audio player achieves beat-continuous transitioning between digital audio pieces based on beat metadata, which can be generated via offline processing on a higher computational power computer or via background or idle processing on the digital audio player. The digital audio player produces playlists of beat-matching-compatible songs based on the metadata, or pick lists of songs that are beat-matching compatible with a currently playing song. By facilitating selection of songs with beat-matching-compatible tempos based on metadata, the beat-continuous transitions can be achieved without altering the beat tempo of digital audio pieces, or with simple resampling.
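Building a pick list from tempo metadata could be as simple as the following sketch. The metadata layout (a title and a bpm field per song), the 3% resampling tolerance, and the function names are hypothetical choices for illustration only.

```python
def compatible(bpm_a, bpm_b, max_resample=0.03):
    """True if bpm_b is within the allowed resampling ratio of bpm_a."""
    return abs(bpm_b / bpm_a - 1.0) <= max_resample


def pick_list(current, library, max_resample=0.03):
    """Titles of songs whose tempo is beat-matching compatible with the current song."""
    return [
        song["title"]
        for song in library
        if song["title"] != current["title"]
        and compatible(current["bpm"], song["bpm"], max_resample)
    ]


# Made-up metadata, as might be produced by offline beat analysis.
library = [
    {"title": "A", "bpm": 120.0},
    {"title": "B", "bpm": 122.0},
    {"title": "C", "bpm": 128.0},
    {"title": "D", "bpm": 117.0},
]
print(pick_list(library[0], library))
```

Because only stored metadata is compared, the selection itself needs no audio analysis on the player.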
Abstract:
A video encoding system encodes video streams for multiple bit rate video streaming using an approach that permits the encoded bit rate to vary subject to peak bit rate and average bit rate constraints for higher quality streams, while a bottom bit rate stream is encoded to achieve a constant chunk rate. The video encoding system also dynamically decides an encoding resolution for segments of the multiple bit rate video streams that varies with video complexity so as to achieve a better visual experience for multiple bit rate streaming.
Abstract:
A scalable audio codec encodes an input audio signal as a base layer at a high compression ratio and one or more residual signals as an enhancement layer of a compressed bitstream, which permits a lossless or near lossless reconstruction of the input audio signal at decoding. The scalable audio codec uses perceptual transform coding to encode the base layer. The residual is calculated in a transform domain, which includes a frequency and possibly also multi-channel transform of the input audio. For lossless reconstruction, the frequency and multi-channel transforms are reversible.
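The base-plus-residual idea with a reversible multi-channel transform can be illustrated on integer samples. This sketch makes several assumptions that go beyond the abstract: the base layer is a plain uniform quantizer rather than a perceptual transform coder, no frequency transform is applied, the multi-channel transform is an integer mid/side lifting step, and all function names and the step size are hypothetical.

```python
def ms_forward(left, right):
    """Reversible integer mid/side transform (lifting form)."""
    side = left - right
    mid = right + (side >> 1)
    return mid, side


def ms_inverse(mid, side):
    """Exact inverse of ms_forward."""
    right = mid - (side >> 1)
    left = right + side
    return left, right


def encode(samples, step):
    """Return (base layer, residual): coarse quantization plus what it lost."""
    base = [(x + step // 2) // step for x in samples]
    residual = [x - q * step for x, q in zip(samples, base)]
    return base, residual


def decode(base, step, residual=None):
    """Lossy reconstruction from the base alone, lossless with the residual."""
    coarse = [q * step for q in base]
    if residual is None:
        return coarse
    return [c + e for c, e in zip(coarse, residual)]


left = [100, -37, 5, 0]
right = [98, -40, 7, 1]
pairs = [ms_forward(l, r) for l, r in zip(left, right)]
mid = [m for m, _ in pairs]
side = [s for _, s in pairs]

STEP = 16  # hypothetical quantizer step size
mid_base, mid_res = encode(mid, STEP)
side_base, side_res = encode(side, STEP)

mid_out = decode(mid_base, STEP, mid_res)
side_out = decode(side_base, STEP, side_res)
restored = [ms_inverse(m, s) for m, s in zip(mid_out, side_out)]
print(restored == list(zip(left, right)))
```

Decoding the base layer alone gives an approximation within half a quantizer step, while adding the residual restores the transform-domain values exactly, which the reversible transform then maps back to the original samples.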
Abstract:
An audio decoder provides a combination of decoding components including components implementing base band decoding, spectral peak decoding, frequency extension decoding and channel extension decoding techniques. The audio decoder decodes a compressed bitstream structured by a bitstream syntax scheme to permit the various decoding components to extract the appropriate parameters for their respective decoding technique.