Abstract:
A motion video signal encoder maximizes image quality without exceeding the transmission bandwidth available to carry the encoded motion video signal by comparing encoded frames of the motion video signal to a desired frame size. If the size of the encoded frames differs from the desired size, quantization is adjusted to produce encoded frames closer in size to the desired size. In addition, a cumulative bandwidth balance records an accumulated amount of available bandwidth. The cumulative bandwidth balance is adjusted as time elapses, adding to the available bandwidth, and as each frame is encoded, thereby consuming bandwidth. If the cumulative bandwidth balance deviates from a predetermined range, quantization is adjusted as needed either to improve image quality to more completely consume the available bandwidth or to reduce image quality and thereby consume less bandwidth. Rapid changes in the amount of change or motion in the motion video signal are detected by comparing the amount of change between two consecutive frames with the amount of change between the next two consecutive frames. Quantization is precompensated according to the measured rapid change. Conditional replenishment is improved by dividing macroblocks into quadrants and measuring differences between corresponding quadrants of macroblocks. As a result, sensitivity to changes along the edges and corners of macroblocks is increased. In addition, sensitivity to changes in a particular macroblock is increased when an adjacent macroblock contains sufficient change to be encoded and is therefore not a candidate for conditional replenishment.
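The rate-control loop described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name, the one-step quantizer adjustments, and the balance thresholds are all assumptions made for clarity.

```python
# Hypothetical sketch of the described rate control: frame sizes are compared
# to a target, and a cumulative bandwidth balance nudges the quantization
# step. Constants and step sizes are illustrative only.

def update_quantizer(q, frame_bits, target_bits, balance,
                     low=-8000, high=8000):
    """Return an adjusted quantization step and updated bandwidth balance."""
    balance -= frame_bits           # encoding a frame consumes bandwidth
    balance += target_bits          # the elapsed frame interval replenishes it
    if frame_bits > target_bits:    # frame too large: coarsen quantization
        q += 1
    elif frame_bits < target_bits:  # frame too small: refine quantization
        q -= 1
    if balance < low:               # falling behind: reduce image quality
        q += 1
    elif balance > high:            # surplus bandwidth: improve image quality
        q -= 1
    return max(1, q), balance
```

A frame that overshoots its target coarsens the quantizer for the next frame, while a sustained bandwidth surplus refines it, matching the two adjustment mechanisms in the abstract.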
Abstract:
A software-based encoder is provided for an end-to-end scalable video delivery system that operates over heterogeneous networks. The encoder utilizes a scalable video compression algorithm based on a Laplacian pyramid decomposition to generate an embedded information stream. The encoder decimates a highest-resolution original image, e.g., 640×480 pixels, to produce an intermediate 320×240 pixel image, which is decimated to produce an intermediate 160×120 pixel image that is compressed to form an encodable base layer 160×120 pixel image. A decompressed base layer image is also up-sampled to produce an up-sampled 640×480 pixel image that is subtracted from the original 640×480 pixel image to yield an error image. At the receiving end, the decoder extracts from the embedded stream different streams at different spatial and temporal resolutions. Because decoding requires only additions and look-ups from a small stored table, decoding occurs in real time.
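The pyramid construction above can be sketched with toy filters. This is an assumed illustration: plain 2×2 averaging and pixel replication stand in for whatever decimation and up-sampling filters the system actually uses, and small images stand in for 640×480 frames.

```python
# Illustrative Laplacian-pyramid path: decimate twice to form the base layer,
# then up-sample the base back to full size and subtract to form the error
# image. Filters here are simple stand-ins, not the system's actual ones.

def decimate(img):
    """Halve resolution by averaging 2x2 blocks of the image."""
    return [[(img[2*r][2*c] + img[2*r][2*c+1] +
              img[2*r+1][2*c] + img[2*r+1][2*c+1]) // 4
             for c in range(len(img[0]) // 2)]
            for r in range(len(img) // 2)]

def upsample(img):
    """Double resolution by pixel replication."""
    out = []
    for row in img:
        wide = [p for p in row for _ in (0, 1)]
        out.append(wide)
        out.append(list(wide))
    return out

def error_image(original, base):
    """Subtract a twice up-sampled base layer from the original image."""
    up = upsample(upsample(base))
    return [[o - u for o, u in zip(orow, urow)]
            for orow, urow in zip(original, up)]
```

For a flat image the up-sampled base reconstructs the original exactly, so the error image is all zeros; real images leave a high-frequency residual that the enhancement layers carry.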
Abstract:
A method and system for forwarding information, such as data files, to a recipient across disparate or incompatible communication networks, without being constrained by incompatible user devices. The sender sends information such as a data file to an intended recipient via a messaging server. The messaging server communicates with the intended recipient using basic communication tools that are generally compatible regardless of the network to which the recipient subscribes. The messaging server stores the information, then creates and sends a notification message informing the intended recipient that she has information to be retrieved. The notification message includes a unique access address associated with the message, at which the recipient can retrieve the information. Different unique access addresses are associated with different messages.
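The store-and-notify flow can be sketched as below. This is an assumed design, not the patent's implementation: the class name, the retrieval URL, and the token scheme are all hypothetical.

```python
# Minimal sketch of the store-and-notify flow: the server stores each message
# under a unique access token and sends the recipient a retrieval address
# instead of the file itself. Endpoint URL and token format are assumptions.

import secrets

class MessagingServer:
    def __init__(self, base_url="https://msg.example.com/get/"):
        self.base_url = base_url   # hypothetical retrieval endpoint
        self.store = {}

    def accept(self, recipient, payload):
        """Store a message and return its unique access address."""
        token = secrets.token_urlsafe(16)      # unique per message
        self.store[token] = (recipient, payload)
        return self.base_url + token           # address for the notification

    def retrieve(self, token):
        """Return the stored payload for a presented access token."""
        return self.store[token][1]
```

Because the notification carries only an address, any device that can follow a link can retrieve the information, which is the device-independence the abstract claims.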
Abstract:
A method of interactively providing a number of client computers with a dynamically selectable and scalable range of multimedia data over a diverse computer network including local area networks (LANs) and wide area networks (WANs) such as the internet. The multimedia data provided by a server to the client computers includes a base layer and one or more enhancement layers. Enhancement layers can be spatial and/or temporal in nature. Depending on the implementation, the server may also provide information about the multimedia data to the client computers. The server splits the multimedia data for streaming via multiple multicast group (MMG) addresses. Information about the portion of the multimedia data carried by each MMG is broadcast to the client computers. Armed with the information about the multimedia data, client computers can intelligently join and leave MMGs as needed. In some embodiments, the client computers provide feedback about the usage and/or need for the multimedia data, enabling the server to right-size, e.g., grow and/or prune, the multimedia data for network efficiency. Enhancement layers may also be grown and/or pruned independently of the base layer, i.e., without a corresponding change in the base layer.
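The client-side join/leave decision can be sketched as follows. This is a hedged illustration, not the patented method: the group addresses, layer names, rates, and greedy budget rule are all assumptions.

```python
# Sketch of a client selecting multicast groups from the server's advertised
# layer information: join the base layer first, then enhancement layers while
# a bandwidth budget remains. Addresses and rates are illustrative.

def select_groups(layers, budget_kbps):
    """layers: list of (group_addr, layer_name, rate_kbps), base layer first.
    Return the multicast group addresses the client should join."""
    joined, used = [], 0
    for addr, name, rate in layers:
        if used + rate <= budget_kbps:
            joined.append(addr)
            used += rate
        else:
            break   # layers are additive: stop at the first that doesn't fit
    return joined

layers = [("239.1.1.1", "base", 128),
          ("239.1.1.2", "temporal-1", 128),
          ("239.1.1.3", "spatial-1", 256)]
```

A client whose available bandwidth shrinks simply leaves the highest enhancement groups, which mirrors the "intelligently join and leave MMGs" behavior in the abstract.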
Abstract:
A system for classifying image elements comprising means for converting an image into a series of vectors and a hierarchical lookup table that classifies the vectors. The lookup table implements a pre-computed discrete cosine transform (DCT) to enhance classification accuracy. The hierarchical lookup table includes four stages: the first three constitute a preliminary section; the fourth stage constitutes the final section. Each stage has a respective stage table. The method for designing each stage table comprises a codebook design procedure and a table fill-in procedure. Codebook design for the preliminary stages strives to minimize a classification-sensitive proximity measure; codebook design for the final stage attempts to minimize the Bayes risk of misclassification. Table fill-in for the first stage involves generating all possible input combinations, concatenating each possible input combination to define a concatenated vector, applying a DCT to convert the concatenated vector to the spatial frequency domain, finding the closest first-stage codebook vector, and assigning to the address the index associated with that codebook vector. Table fill-in for subsequent stages involves decoding each possible input combination to obtain spatial frequency domain vectors, applying an inverse DCT to convert the inputs to pixel domain vectors, concatenating the pixel domain vectors to obtain a higher dimension pixel domain vector, applying a DCT to obtain a spatial frequency domain vector, finding the closest same-stage codebook vector, and assigning the codebook vector index to the input combination.
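At run time, classification with such a hierarchy reduces to repeated pair lookups. The sketch below shows only that lookup structure; the toy table contents are assumptions, since the real tables are filled in with the DCT-based design procedure described above.

```python
# Simplified hierarchical table lookup: each stage maps a pair of lower-stage
# indices to one index, and the final stage emits a class label. The toy
# tables below are illustrative, not designed by the patented procedure.

def classify(pixels, stage_tables):
    """pixels: flat list whose length is 2 ** len(stage_tables)."""
    indices = list(pixels)                       # stage-0 inputs are pixels
    for table in stage_tables:
        indices = [table[(indices[i], indices[i + 1])]
                   for i in range(0, len(indices), 2)]
    return indices[0]                            # single final-stage output

# Toy two-stage tables for 4 binary pixels: stage 1 sums pixel pairs, and
# the final stage maps the summed pair to a class label.
stage1 = {(a, b): a + b for a in (0, 1) for b in (0, 1)}
final = {(a, b): "text" if a + b >= 2 else "photo"
         for a in range(3) for b in range(3)}
```

All the transform and codebook work is spent at design time filling in the tables, so classification itself needs no arithmetic beyond indexing.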
Abstract:
An image compression system includes a vectorizer and a hierarchical vector quantization table that outputs embedded code. The vectorizer converts an image into image vectors representing respective blocks of image pixels. The table provides computation-free transformation and compression of the image vectors. Table design can be divided into codebook design and fill-in procedures for each stage. Codebook design for the preliminary stages uses a splitting generalized Lloyd algorithm (LBG/GLA) with a perceptually weighted distortion measure. Codebook design for the final stage uses a greedily-grown and then entropy-pruned tree-structured variation of GLA with an entropy-constrained distortion measure. Table fill-in for all stages uses an unweighted proximity measure for assigning inputs to codebook vectors. Transformation and compression are fast because they are computation-free. The hierarchical, multi-stage character of the table allows it to operate with low memory requirements. The embedded output allows convenient scalability suitable for collaborative video applications over heterogeneous networks.
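The embedded-output property of a tree-structured codebook can be illustrated with a toy example. The centroid values and the greedy descent below are assumptions for illustration; the real final-stage codebook is the entropy-pruned tree described above.

```python
# Illustrative embedded code from a tree-structured codebook: each code is a
# bit path down a binary tree of centroids, so any prefix of the code decodes
# to a coarser approximation. The toy tree values are assumptions.

tree = {"": 128, "0": 64, "1": 192,            # node path -> centroid value
        "00": 32, "01": 96, "10": 160, "11": 224}

def encode(value, depth=2):
    """Greedy descent: at each level pick the child centroid closer to the input."""
    path = ""
    for _ in range(depth):
        left, right = tree[path + "0"], tree[path + "1"]
        path += "0" if abs(value - left) <= abs(value - right) else "1"
    return path

def decode(code):
    """Look up the centroid for a full code or any prefix of it."""
    return tree[code]
```

Truncating a code to any prefix still decodes to a valid (coarser) centroid, which is what makes the output stream scalable for heterogeneous receivers.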
Abstract:
A transmission method for video image data using an embedded bit stream in a hierarchical table-lookup vector quantizer comprises the steps of encoding an image using hierarchical vector quantization and an embedding process to obtain an embedded bit stream for lossless transmission. The bit stream is selectively truncated and decoded to obtain a reconstructed image.
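One way such a stream can be made truncation-tolerant is to interleave codes by index plane, so the i-th bit of every vector's code is sent before any (i+1)-th bit. The layout below is an assumed illustration of that idea, not the patent's specific embedding process.

```python
# Assumed index-plane layout for an embedded stream: interleave the codes so
# truncating at any plane boundary still leaves a decodable coarse code for
# every vector in the image.

def interleave(codes):
    """codes: equal-length bit strings, one per vector -> plane-ordered stream."""
    depth = len(codes[0])
    return "".join(code[p] for p in range(depth) for code in codes)

def truncate_and_split(stream, n_vectors, planes_kept):
    """Keep the first planes_kept index planes and regroup per-vector codes."""
    kept = stream[:n_vectors * planes_kept]
    return ["".join(kept[p * n_vectors + v] for p in range(planes_kept))
            for v in range(n_vectors)]
```

Keeping all planes recovers the original codes exactly; keeping fewer yields shortened codes that a tree-structured decoder can still reconstruct at lower quality.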
Abstract:
A multimedia compression system for generating frame rate scalable data in the case of video and, more generally, universally scalable data. Universally scalable data is scalable across all of the relevant characteristics of the data. In the case of video, these characteristics include frame rate, resolution, and quality. The scalable data generated by the compression system is comprised of multiple additive layers for each characteristic across which the data is scalable. In the case of video, the frame rate layers are additive temporal layers, the resolution layers are additive base and enhancement layers, and the quality layers are additive index planes of embedded codes. Various techniques can be used for generating each of these layers (e.g., Laplacian pyramid decomposition or wavelet decomposition for generating the resolution layers; tree-structured vector quantization or tree-structured scalar quantization for generating the quality layers). The compression system further provides for embedded inter-frame compression in the context of frame rate scalability, and non-redundant layered multicast network delivery of the scalable data.
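The additive-layer property can be shown with a toy reconstruction. The values and function below are illustrative assumptions; they show only that a receiver can stop after any prefix of layers and still obtain a valid signal.

```python
# Toy illustration of additive layers: the base layer plus any prefix of the
# enhancement layers reconstructs the signal at a corresponding fidelity.
# Layer contents here are illustrative numbers, not real coded data.

def reconstruct(base, enhancements, n_layers):
    """Sum the base layer with the first n_layers enhancement layers."""
    out = list(base)
    for layer in enhancements[:n_layers]:
        out = [o + e for o, e in zip(out, layer)]
    return out
```

Because each layer only adds refinement, dropping the tail layers never invalidates what was already received, which is what permits non-redundant layered multicast delivery.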
Abstract:
A cost-effective method for generating and delivering scalable multimedia content targeted at specific end user(s) via client computers coupled to servers by a diverse computer network which includes local area networks (LANs) and/or wide area networks (WANs) such as the internet. In one embodiment, in which the server is billed for network bandwidth consumed, upon receiving an end user request for multimedia content the server computes the likelihood of patronage. Indicators useful for estimating the likelihood of patronage include regularity of patronage, income history, creditworthiness, age, hobbies, occupation, and marital status. A cost-effective bandwidth is selected for delivering the requested content. Such an arrangement is advantageous because the content is delivered to the end user at a bandwidth corresponding to the probability of consummating a sale.
Abstract:
The production of synchronization scripts and associated annotated multimedia streams for servers and client computers coupled to each other by a diverse computer network which includes local area networks (LANs) and/or wide area networks (WANs) such as the internet. Annotated multimedia streams can include a compressed video stream for display in a video window, an accompanying compressed audio stream, and annotations. Synchronization scripts include annotation streams for synchronizing the display of video streams with annotations, e.g., displayable events such as textual/graphical data in the form of HTML pages with Java applets to be displayed in one or more event windows. The producer includes a capture module and an author module for capturing video streams and generating annotation streams, respectively. The capture module compresses the video stream using a suitable compression format. Annotation streams include annotation frames which either provide pointer(s) to the event(s) of interest or include displayable data embedded within the annotation stream. Accordingly, each annotation frame includes either an event locator or event data. In addition, each annotation frame includes an event time marker which corresponds to the time stamp(s) of associated video frame(s) within the video stream. Embedded displayable data include ticker tape data embedded within the annotation stream. Examples of event locators to displayable events include URL addresses pointing to HTML web pages. The video/audio streams and annotation streams are stored in stream server(s) for subsequent retrieval by client computer(s) in a coordinated manner, so that the client computer(s) is able to synchronously display the video frames and displayable event(s) in a video window and event window(s), respectively. In one implementation, annotation streams include a flipper stream for locating HTML pages and a ticker stream which includes ticker (tape) data.
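The event-time-marker matching at playback can be sketched as follows. The data shapes and URLs below are hypothetical; the sketch shows only how markers keyed to video time stamps drive event display.

```python
# Minimal sketch of playback synchronization: each annotation frame carries an
# event time marker matching a video frame time stamp, and events fire once
# playback reaches their marker. URLs and times below are hypothetical.

import bisect

def due_events(annotations, playback_time):
    """annotations: list of (time_marker, event) sorted by time_marker.
    Return the events whose markers are <= the current playback time."""
    times = [t for t, _ in annotations]
    i = bisect.bisect_right(times, playback_time)
    return [event for _, event in annotations[:i]]

# A toy flipper stream: event locators (URLs) keyed to video time stamps.
flipper = [(0.0, "http://example.com/intro.html"),
           (12.5, "http://example.com/chart.html")]
```

A client polling this function against the video clock would flip the event window to each HTML page as its marker is reached, keeping the event window in step with the video window.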