Abstract:
Methods and apparatus for processing video data that is divided into frames are presented. In one aspect, this relates to a method for processing video data that is divided into frames. The video data includes a current frame, which has an associated current macroblock, and an adjacent frame, which has an associated adjacent macroblock. The method involves obtaining an uncompressed current block that is part of the current macroblock and an adjacent block that is part of the adjacent macroblock, and calculating a distance between the uncompressed current block and the adjacent block. It is determined whether the distance between the uncompressed current block and the adjacent block is acceptable. If the distance is unacceptable, then the motion between the uncompressed current block and the adjacent block is estimated, and the uncompressed current block is adaptively compressed.
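As an illustration only, the decision flow might look like the following Python sketch. The abstract does not fix a distance measure, a threshold, or a compression scheme, so a mean-squared pixel distance and a fixed threshold are assumed here, and estimate_motion and compress are hypothetical callbacks:

    import numpy as np

    # Hypothetical acceptance threshold; the method leaves the criterion open.
    DISTANCE_THRESHOLD = 100.0

    def block_distance(current_block: np.ndarray, adjacent_block: np.ndarray) -> float:
        # Mean-squared distance between two pixel blocks of equal shape.
        diff = current_block.astype(np.int32) - adjacent_block.astype(np.int32)
        return float(np.mean(diff * diff))

    def process_block(current_block, adjacent_block, estimate_motion, compress):
        # If the distance is acceptable, the block need not be re-coded;
        # otherwise estimate motion and adaptively compress the current block.
        if block_distance(current_block, adjacent_block) <= DISTANCE_THRESHOLD:
            return None
        motion_vector = estimate_motion(current_block, adjacent_block)
        return compress(current_block, motion_vector)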
Abstract:
The present invention relates, in one aspect, to a method for processing video data that is divided into frames. The video data includes a current frame, which has an associated current macroblock, and an adjacent frame, which has an associated adjacent macroblock. The method involves obtaining an uncompressed current block that is part of the current macroblock and an adjacent block that is part of the adjacent macroblock, and calculating a distance between the uncompressed current block and the adjacent block. It is determined whether the distance between the uncompressed current block and the adjacent block is acceptable. If the distance is unacceptable, then the current block is adaptively compressed.
Abstract:
A method of interactively providing a number of client computers with a dynamically selectable and scalable range of multimedia data over a diverse computer network including local area networks (LANs) and wide area networks (WANs) such as the internet. The multimedia data provided by a server to the client computers includes a base layer and one or more enhancement layers. Enhancement layers can be spatial and/or temporal in nature. Depending on the implementation, the server may also provide information about the multimedia data to the client computers. The server splits the multimedia data for streaming via multiple multicast group (MMG) addresses. Information about the portion of the multimedia data carried by each MMG is broadcast to the client computers. Armed with this information, client computers can intelligently join and leave MMGs as needed. In some embodiments, the client computers provide feedback about the usage of and/or need for the multimedia data, enabling the server to right-size, e.g., grow and/or prune, the multimedia data for network efficiency. With right-sizing, the content of the base layer may be increased or decreased with corresponding growing and pruning of the enhancement layers. Enhancement layers may also be grown and/or pruned independently of the base layer, i.e., without a corresponding change in the base layer.
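A minimal client-side sketch of the join/leave behavior, in Python, assuming standard IPv4 multicast sockets; the layer names, group addresses, and layer-to-group mapping are hypothetical stand-ins for the information the server broadcasts about each MMG:

    import socket
    import struct

    # Hypothetical layer-to-group map announced by the server.
    LAYER_GROUPS = {"base": "239.1.1.1", "enhance1": "239.1.1.2", "enhance2": "239.1.1.3"}

    def _mreq(group_addr: str) -> bytes:
        # struct ip_mreq: multicast group address + local interface (any).
        return struct.pack("4s4s", socket.inet_aton(group_addr),
                           socket.inet_aton("0.0.0.0"))

    def join_group(sock: socket.socket, group_addr: str) -> None:
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, _mreq(group_addr))

    def leave_group(sock: socket.socket, group_addr: str) -> None:
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_DROP_MEMBERSHIP, _mreq(group_addr))

    def tune_layers(sock, wanted_layers, joined):
        # Join the groups carrying layers the client can use; leave the rest.
        for layer, addr in LAYER_GROUPS.items():
            if layer in wanted_layers and layer not in joined:
                join_group(sock, addr)
                joined.add(layer)
            elif layer not in wanted_layers and layer in joined:
                leave_group(sock, addr)
                joined.discard(layer)

A client would create a UDP socket (socket.socket(socket.AF_INET, socket.SOCK_DGRAM)), bind to the streaming port, and call tune_layers whenever its bandwidth or display needs change.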
Abstract:
A multimedia compression system for generating frame rate scaleable data in the case of video, and, more generally, universally scaleable data. Universally scaleable data is scaleable across all of the relevant characteristics of the data. In the case of video, these characteristics include frame rate, resolution, and quality. The scaleable data generated by the compression system comprises multiple additive layers for each characteristic across which the data is scaleable. In the case of video, the frame rate layers are additive temporal layers, the resolution layers are additive base and enhancement layers, and the quality layers are additive index planes of embedded codes. Various techniques can be used for generating each of these layers (e.g., Laplacian pyramid decomposition or wavelet decomposition for generating the resolution layers; tree structured vector quantization or tree structured scalar quantization for generating the quality layers). The compression system further provides for embedded inter-frame compression in the context of frame rate scalability, and non-redundant layered multicast network delivery of the scaleable data.
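The additive resolution layers can be illustrated with a toy Laplacian pyramid in Python. Nearest-neighbour resampling stands in for whatever filters an implementation would actually use, and image sides are assumed divisible by 2**levels:

    import numpy as np

    def upsample2(img: np.ndarray) -> np.ndarray:
        # Nearest-neighbour 2x upsample (a stand-in for real interpolation filters).
        return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)

    def laplacian_layers(image: np.ndarray, levels: int = 3):
        # Split an image into a coarse base layer plus additive enhancement
        # layers (residuals), as in a Laplacian pyramid decomposition.
        layers = []
        current = image.astype(np.float32)
        for _ in range(levels):
            coarse = current[::2, ::2]                   # 2x downsample
            layers.append(current - upsample2(coarse))   # residual = enhancement layer
            current = coarse
        return current, layers[::-1]                     # base, then coarse-to-fine layers

    def reconstruct(base: np.ndarray, layers) -> np.ndarray:
        # Adding each enhancement layer back recovers successively finer resolutions;
        # decoding any prefix of the layer list still yields a usable picture.
        img = base
        for residual in layers:
            img = upsample2(img) + residual
        return img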
Abstract:
An efficient transmission protocol for transmitting multimedia streams from a server to a client computer over a diverse computer network including local area networks (LANs) and wide area networks (WANs) such as the internet. The client computer includes a playout buffer, and the transmission rate is dynamically matched to the available bandwidth capacity of the network connection between the server and the client computer. If a playtime of the playout buffer, which is one measure of the number of data packets currently in the playout buffer, drops below a dynamically computed Decrease_Bandwidth (DEC_BW) threshold, then the transmission rate is decreased by sending a DEC_BW message to the server. Conversely, if the number of packets remaining in the playout buffer rises above a dynamically computed Upper Increase_Bandwidth (INC_BW) threshold and does not drop below a Lower INC_BW threshold for at least an INC_BW wait period, then the transmission rate is incremented. The transmission rate can be selected from among a predetermined set of discrete bandwidth values or from within a continuous range of bandwidth values. In one variation, in addition to responding to changes in network connection capacity, the client computer also determines an average client computational capacity. Accordingly, if the average client computational capacity is less than the network capacity, the lower of the two capacities is the determining one, thereby avoiding a playout buffer overrun.
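A sketch of the client-side threshold logic in Python; the fixed threshold values, the send_msg callback, and the sampling interface are hypothetical, since the abstract computes the DEC_BW and INC_BW thresholds dynamically:

    import time

    class RateController:
        # Watches the playout buffer's playtime and asks the server to step
        # the transmission rate down or up.
        def __init__(self, send_msg, dec_bw=1.0, inc_bw_upper=4.0,
                     inc_bw_lower=3.0, inc_wait=5.0):
            self.send_msg = send_msg          # callback that messages the server
            self.dec_bw = dec_bw              # DEC_BW threshold, seconds of playtime
            self.inc_bw_upper = inc_bw_upper  # Upper INC_BW threshold
            self.inc_bw_lower = inc_bw_lower  # Lower INC_BW threshold
            self.inc_wait = inc_wait          # INC_BW wait period, seconds
            self.above_since = None

        def on_buffer_sample(self, playtime: float) -> None:
            now = time.monotonic()
            if playtime < self.dec_bw:
                self.send_msg("DEC_BW")       # buffer draining: slow the server down
                self.above_since = None
                return
            if playtime > self.inc_bw_upper and self.above_since is None:
                self.above_since = now        # crossed Upper INC_BW: start the wait
            elif playtime < self.inc_bw_lower:
                self.above_since = None       # dropped below Lower INC_BW: abort the wait
            if self.above_since is not None and now - self.above_since >= self.inc_wait:
                self.send_msg("INC_BW")       # held up for the wait period: speed up
                self.above_since = None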
Abstract:
An image compression system includes a vectorizer and a hierarchical vector quantization table that outputs embedded code. The vectorizer converts an image into image vectors representing respective blocks of image pixels. The table provides computation-free transformation and compression of the image vectors. Table design can be divided into codebook design and fill-in procedures for each stage. Codebook design for the preliminary stages uses a splitting generalized Lloyd algorithm (LBG/GLA) with a perceptually weighted distortion measure. Codebook design for the final stage uses a greedily-grown and then entropy-pruned tree-structure variation of GLA with an entropy-constrained distortion measure. Table fill-in for all stages uses an unweighted proximity measure for assigning inputs to codebook vectors. Transformations and compression are fast because they are computation free. The hierarchical, multi-stage character of the table allows it to operate with low memory requirements. The embedded output allows convenient scalability suitable for collaborative video applications over heterogeneous networks.
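To make the computation-free idea concrete, here is a toy Python sketch in which each stage is a lookup table combining two codes from the previous stage. The tables here are plain dictionaries for illustration; in the system described above they would be designed with GLA, and the final index would be an embedded tree-structured code:

    def encode_block(pixels, stage_tables):
        # Hierarchical table-lookup VQ: a block of pixel values in, a single
        # code index out, with no arithmetic beyond indexing.
        # len(pixels) must equal 2 ** len(stage_tables).
        codes = list(pixels)                  # stage 0: raw pixel values act as codes
        for table in stage_tables:            # each stage halves the number of codes
            codes = [table[(codes[i], codes[i + 1])]
                     for i in range(0, len(codes), 2)]
        return codes[0]                       # final-stage (embedded) code index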
Abstract:
A multimedia compression system for generating frame rate scaleable data in the case of video and, more generally, universally scaleable data. Universally scaleable data is scaleable across all of the relevant characteristics of the data (e.g., frame rate, resolution, and quality for video). The scaleable data generated by the compression system includes multiple additive layers for each characteristic across which the data is scaleable. For video, the frame rate layers are additive temporal layers, the resolution layers are additive base and enhancement layers, and the quality layers are additive index planes of embedded codes. Various techniques can be used for generating these layers (e.g., Laplacian pyramid decomposition or wavelet decomposition for generating the resolution layers; tree structured vector quantization or tree structured scalar quantization for generating the quality layers). The system further provides for embedded inter-frame compression in the context of frame rate scalability, and non-redundant layered multicast network delivery of the scaleable data.
Abstract:
A multimedia compression system for generating frame rate scalable data in the case of video, and, more generally, universally scalable data. Universally scalable data is scalable across all of the relevant characteristics of the data. In the case of video, these characteristics include frame rate, resolution, and quality. The scalable data generated by the compression system comprises multiple additive layers for each characteristic across which the data is scalable. In the case of video, the frame rate layers are additive temporal layers, the resolution layers are additive base and enhancement layers, and the quality layers are additive index planes of embedded codes. Various techniques can be used for generating each of these layers (e.g., Laplacian pyramid decomposition or wavelet decomposition for generating the resolution layers; tree structured vector quantization or tree structured scalar quantization for generating the quality layers). The compression system further provides for embedded inter-frame compression in the context of frame rate scalability, and non-redundant layered multicast network delivery of the scalable data.
Abstract:
The production of synchronization scripts and associated annotated multimedia streams for servers and client computers coupled to each other by a diverse computer network which includes local area networks (LANs) and/or wide area networks (WANs) such as the internet. Annotated multimedia streams can include a compressed video stream for display in a video window, an accompanying compressed audio stream, and annotations. Synchronization scripts include annotation streams for synchronizing the display of video streams with annotations, e.g., displayable events such as textual/graphical data in the form of HTML pages with Java applets to be displayed in one or more event windows. The producer includes a capture module and an author module for capturing video streams and generating annotation streams, respectively. The capture module compresses the video stream using a suitable compression format. Annotation streams include annotation frames which either provide pointer(s) to the event(s) of interest or include displayable data embedded within the annotation stream. Accordingly, each annotation frame includes either an event locator or event data. In addition, each annotation frame includes an event time marker which corresponds to the time stamp(s) of associated video frame(s) within the video stream. Examples of embedded displayable data include ticker tape data carried within the annotation stream. Examples of event locators to displayable events include URL addresses pointing to HTML web pages. The video/audio streams and annotation streams are stored in stream server(s) for subsequent retrieval by client computer(s) in a coordinated manner, so that the client computer(s) is able to synchronously display the video frames and displayable event(s) in a video window and event window(s), respectively. In one implementation, annotation streams include a flipper stream for locating HTML pages and a ticker stream which includes ticker (tape) data.
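The annotation-frame layout described above might be modeled as follows in Python; the field names are illustrative, not taken from the patent:

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class AnnotationFrame:
        # One frame of an annotation stream. Exactly one of event_locator
        # (e.g., a URL to an HTML page) or event_data (e.g., embedded ticker
        # tape text) is set; event_time_marker corresponds to the time
        # stamp(s) of the associated video frame(s).
        event_time_marker: float
        event_locator: Optional[str] = None   # e.g., "http://example.com/slide1.html"
        event_data: Optional[str] = None      # e.g., ticker tape text

        def __post_init__(self):
            if (self.event_locator is None) == (self.event_data is None):
                raise ValueError("set exactly one of event_locator or event_data")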
Abstract:
Client computer(s) retrieve and display synchronized annotated multimedia streams from servers dispersed over a diverse computer network which includes local area networks (LANs) and/or wide area networks (WANs) such as the internet. Multimedia streams provided to the client computer(s) can include a compressed video stream for display in a video window and an accompanying compressed audio stream. Annotations, i.e., displayable events, include textual/graphical data in the form of HTML pages with Java applets to be displayed in one or more event windows. The video/audio and annotation streams are produced and then stored in stream server(s). Annotation streams include annotation frames which either provide pointer(s) to the event(s) of interest or include displayable data embedded within the annotation stream. Accordingly, each annotation frame includes either an event locator or event data. In addition, each annotation frame includes an event time marker which corresponds to the time stamp(s) of associated video frame(s) within the video stream. Examples of embedded displayable data include ticker tape data embedded within the annotation stream. Examples of event locators to displayable events include URL addresses pointing to HTML web pages. Video/audio streams and annotation streams are provided by the stream server(s) to the client computer(s) in a coordinated manner, so that the client computer(s) is able to synchronously display the video frames and displayable event(s) in a video window and event window(s), respectively.
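A minimal sketch of the client-side synchronization loop in Python, reusing the hypothetical AnnotationFrame above and assuming video frames expose a timestamp attribute and both streams arrive sorted by time:

    def play_synchronized(video_frames, annotation_frames, show_video, show_event):
        # Display video frames and fire annotation events whose time markers
        # have come due, keeping the event window(s) in step with the video window.
        pending = list(annotation_frames)
        for frame in video_frames:
            while pending and pending[0].event_time_marker <= frame.timestamp:
                show_event(pending.pop(0))    # e.g., load the URL in an event window
            show_video(frame)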