Abstract:
In one embodiment, a longitudinal camera array is rotated through a capture cylinder, with each camera in the array capturing multiple images as the array rotates. The cameras can look outward along the radials of the cylinder or, alternatively, tangentially to the cylinder. The longitudinal camera array allows the surrounding scene to be captured from multiple different planes that are substantially parallel to the ends of the capture cylinder, allowing for more accurate subsequent rendering of the scene. A view of the scene can subsequently be rendered by determining the location and viewing direction of an observer and then, for each pixel to be displayed, selecting one or more of the laterally and longitudinally adjacent captured images, as well as one or more pixels within each selected image, from which to determine the display value for that pixel.
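The selection of laterally and longitudinally adjacent captured images can be illustrated with a minimal Python sketch. It is not the method of the embodiment: the function name `select_capture_images`, the evenly spaced capture angles, the evenly spaced cameras along the cylinder axis, and the bilinear weighting are all assumptions made for illustration.

```python
def select_capture_images(angle_deg, height, num_angles, num_cameras, camera_spacing):
    """Return {(angle_index, camera_index): weight} for the four nearest captures.

    Assumptions (not from the abstract): capture angles are evenly spaced
    around the cylinder, cameras are evenly spaced along its axis, and the
    neighbors are blended with bilinear weights.
    """
    if num_angles < 2 or num_cameras < 2:
        raise ValueError("need at least two capture angles and two cameras")

    # Lateral position, in units of the angular spacing between captures.
    a = (angle_deg % 360) / (360 / num_angles)
    a0 = int(a) % num_angles
    a1 = (a0 + 1) % num_angles          # wraps around the cylinder
    fa = a - int(a)

    # Longitudinal position, clamped to the extent of the camera array.
    h = min(max(height / camera_spacing, 0.0), num_cameras - 1)
    h0 = min(int(h), num_cameras - 2)
    h1 = h0 + 1
    fh = h - h0

    return {
        (a0, h0): (1 - fa) * (1 - fh),
        (a1, h0): fa * (1 - fh),
        (a0, h1): (1 - fa) * fh,
        (a1, h1): fa * fh,
    }


weights = select_capture_images(45, 1.0, num_angles=4, num_cameras=3, camera_spacing=1.0)
```

Here a viewing direction of 45 degrees falls halfway between the captures at 0 and 90 degrees, so the two lateral neighbors at the observer's height share the weight equally.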
Abstract:
Described is a communication mechanism that provides push-to-talk functionality for mobile and desktop computing environments. Mobile and desktop computers are configured as client computers in a client/server architecture. Some of the client computers are configured to handle multiple push-to-talk sessions simultaneously. If multiple streams from different sessions are active at the same time, the client computer may determine which of these overlapped streams to record and then record them for later playback. A server handles the registration of the client computers, manages the multiple sessions for each of the client computers, and performs a floor control process so that each push-to-talk session operates in a half-duplex mode.
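The floor control idea, in which at most one participant may talk in a session at any time, can be sketched as follows. This is only an illustrative Python sketch: the class `FloorControl` and its `request`/`release` methods are hypothetical names, and the abstract does not say how the server arbitrates competing requests.

```python
class FloorControl:
    """Grants the floor of each session to at most one client at a time.

    Hypothetical sketch: first-come-first-served arbitration is an
    assumption, not something the abstract specifies.
    """

    def __init__(self):
        self._holder = {}  # session id -> client id currently holding the floor

    def request(self, session, client):
        """Return True if the client now holds the floor of the session."""
        holder = self._holder.get(session)
        if holder is None or holder == client:
            self._holder[session] = client
            return True
        return False  # someone else is talking: half-duplex, so deny

    def release(self, session, client):
        """Give up the floor; only the current holder can release it."""
        if self._holder.get(session) == client:
            del self._holder[session]
            return True
        return False


floor = FloorControl()
floor.request("session-1", "alice")   # granted
floor.request("session-1", "bob")     # denied while alice holds the floor
floor.request("session-2", "bob")     # granted: sessions are independent
```

Because floors are tracked per session, a client that participates in several sessions can receive overlapping streams from different sessions even though each individual session remains half-duplex.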
Abstract:
Described are the generation, coding, and transmission of an effective video form: scalable portrait video. As an extension of bi-level video, portrait video is composed of more gray levels and therefore possesses higher visual quality, while it maintains a low bit rate and low computational cost. Portrait video is scalable in that each video with a higher level always contains all the information of the video with a lower level. The bandwidths of 2-4 level portrait videos fit into the 20-40 Kbps range that GPRS and CDMA 1X can stably provide. Therefore, portrait video is very promising for video broadcast and communication on 2.5G wireless networks. With portrait video technology, this system and method is the first to enable two-way video conferencing on Pocket PCs and Handheld PCs.
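The scalability property, that a higher-level video contains all the information of a lower-level one, can be illustrated with a small Python sketch. The threshold values and the function `quantize` are assumptions chosen for illustration; the abstract does not state how the gray levels are generated.

```python
def quantize(pixels, thresholds):
    """Map each gray value to the number of thresholds it meets or exceeds."""
    return [sum(p >= t for t in thresholds) for p in pixels]


pixels = [10, 70, 130, 200]

# Assumed thresholds: the 4-level set keeps the bi-level threshold (128)
# and adds one threshold on each side of it.
bi_level = quantize(pixels, [128])
four_level = quantize(pixels, [64, 128, 192])

# The bi-level frame is recoverable from the 4-level frame alone.
recovered = [q // 2 for q in four_level]
```

With nested thresholds of this kind, dropping the least significant bit of the 4-level index yields the bi-level frame, which is one simple way to obtain the containment the abstract describes.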
Abstract:
Interactive multi-view video presents new types of video capture systems, video formats, video compression algorithms, and services. Many video cameras are allocated to capture an event from various related locations and directions. The captured videos are compressed in control PCs and are sent to a server in real-time. Users can subscribe to a new type of service that allows users to connect to the servers and receive multi-view videos interactively. In one embodiment of the invention, an automatic pattern-free calibration tool is employed to calibrate the multiple cameras. In contrast with a pattern-based method which uses the correspondences between image points and pattern points, the pattern-free calibration method is based on the correspondences between image points from different views.
Abstract:
The described systems and methods can be used to estimate the global kernel density of distributed data, without prior information about the data, using a gossip-based method. In the gossip-based method, a node in a distributed network periodically selects a random node in the network and exchanges kernels with it. After the exchange, both the initiating node and the target node use the received kernels to update their local estimates. In addition, a data reduction method can be used to optimize the size of the kernel set at each node.
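A gossip exchange of this kind can be sketched in Python as follows. This is a simplified illustration under stated assumptions: each kernel is a `(sample_id, center)` pair, kernels are Gaussian with a fixed bandwidth, nodes merge by taking the union of their kernel sets, and the data reduction step is omitted. The function names are hypothetical.

```python
import math
import random


def exchange(a, b):
    """One gossip exchange: both nodes end up with the union of their kernels."""
    merged = a | b
    return set(merged), set(merged)


def gossip_round(nodes, rng):
    """Each node in turn picks a random peer and exchanges kernels with it."""
    for i in range(len(nodes)):
        j = rng.choice([k for k in range(len(nodes)) if k != i])
        nodes[i], nodes[j] = exchange(nodes[i], nodes[j])


def density(kernels, x, bandwidth=1.0):
    """Local Gaussian kernel density estimate at x from a node's kernel set."""
    norm = len(kernels) * bandwidth * math.sqrt(2 * math.pi)
    return sum(math.exp(-0.5 * ((x - c) / bandwidth) ** 2) for _, c in kernels) / norm


# Three nodes, each starting with one local sample (sample_id, center).
nodes = [{(0, 1.0)}, {(1, 2.0)}, {(2, 5.0)}]
rng = random.Random(0)
for _ in range(3):
    gossip_round(nodes, rng)
local_estimate = density(nodes[0], 2.0)
```

Without data reduction, the kernel set at each node grows toward the full global set, which is why the abstract pairs the gossip step with a method for bounding the size of the kernel set.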
Abstract:
A streaming media codec may include a collection of media stream processing modules arranged into a processing graph. One or more of the modules may perform a Fourier-related transform, and a significant fraction of media stream processing may occur post-transform. The media stream may be considered as a sequence of processing blocks, and post-transform processing blocks contain transform coefficients. Such transform coefficients are amenable to classification into processing classes. Some processing classes may require significantly less processing effort than others by post-transform processing modules. Such transform coefficient classes may be efficiently specified, for example, with coefficient bounding rectangles, and the specification provided to one or more post-transform streaming media processing modules to enable the modules to allocate their processing resources more effectively. Streaming media processing modules making effective use of the transform coefficient class information, and streaming media codecs that incorporate them, are called transform coefficient bounding (TCB) enhanced.
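A coefficient bounding rectangle and its use by a post-transform module can be sketched as below. This Python sketch makes several assumptions not stated in the abstract: the two classes are "zero" and "possibly nonzero", the rectangle is anchored at the top-left (low-frequency) corner of the block, and the post-transform step is a simple truncating quantizer. The function names are hypothetical.

```python
def bounding_rect(block):
    """Return (rows, cols) of the smallest top-left-anchored rectangle
    that contains every nonzero coefficient of the block."""
    rows = cols = 0
    for r, row in enumerate(block):
        for c, value in enumerate(row):
            if value != 0:
                rows = max(rows, r + 1)
                cols = max(cols, c + 1)
    return rows, cols


def quantize_in_rect(block, rect, step):
    """Quantize only the coefficients inside the rectangle; everything
    outside is known to be zero and is skipped."""
    rows, cols = rect
    out = [[0] * len(row) for row in block]
    for r in range(rows):
        for c in range(cols):
            out[r][c] = int(block[r][c] / step)  # truncates toward zero
    return out


block = [
    [8, 4, 0, 0],
    [2, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
rect = bounding_rect(block)
quantized = quantize_in_rect(block, rect, step=2)
```

In this example the module visits 4 of the 16 coefficients, which is the kind of saving the class specification is meant to enable.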
Abstract:
Described is fast motion estimation based upon epipolar geometry, which can be used in compressing multi-view video. An epipolar line is computed based on a point (e.g., a centroid point) in a macroblock to be predicted, and a temporary starting point in an image is determined, such as a median predicted search center. A search starting point is further determined based on the temporary starting point and the epipolar line, e.g., a point on the epipolar line corresponding to an intersecting line that is projected orthogonally from the temporary point to the epipolar line. A motion estimation mechanism searches the search space to produce a motion vector. The search may be conducted starting at the search starting point in a reduced search area located around the epipolar line, e.g., a local diamond search and/or rotated unsymmetrical rood-pattern search.
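The search starting point described here is the orthogonal projection of the temporary starting point onto the epipolar line. A minimal Python sketch of that geometry follows; the fundamental matrix `F`, the convention that the line in the reference image is `F` applied to the homogeneous point, and the function names are assumptions for illustration.

```python
def epipolar_line(F, point):
    """Return (a, b, c) with a*u + b*v + c = 0, the epipolar line of `point`.

    Assumes F maps a homogeneous point (x, y, 1) in the current view to
    its epipolar line in the reference view.
    """
    x, y = point
    return tuple(F[i][0] * x + F[i][1] * y + F[i][2] for i in range(3))


def project_onto_line(line, point):
    """Orthogonally project `point` onto the line a*u + b*v + c = 0."""
    a, b, c = line
    u, v = point
    d = (a * u + b * v + c) / (a * a + b * b)
    return (u - a * d, v - b * d)


# Fundamental matrix of a rectified pair with purely horizontal
# displacement, so epipolar lines are horizontal (an assumed example).
F = [
    [0, 0, 0],
    [0, 0, -1],
    [0, 1, 0],
]
centroid = (8, 5)        # point in the macroblock to be predicted
temporary = (10, 7)      # e.g. a median predicted search center
line = epipolar_line(F, centroid)
start = project_onto_line(line, temporary)
```

For this rectified example the epipolar line is the row v = 5, so the temporary point (10, 7) is moved straight down onto that row and the search starts at (10, 5).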
Abstract:
In a first exemplary media implementation, one or more processor-accessible media include processor-executable instructions that, when executed, direct a device to perform actions including: comparing an accuracy indicator to at least one threshold, the accuracy indicator corresponding to a reference macroblock selected for a target macroblock; ascertaining a refinement case from multiple refinement cases based on the comparing, each refinement case of the multiple refinement cases defining multiple test points in relation to the reference macroblock; and analyzing the ascertained refinement case with regard to the target macroblock. In a second exemplary media implementation, one or more media include processor-executable instructions that, when executed, direct a device to perform actions including: determining if two chrominance sums have magnitudes that are each less than a first product and four luminance sums have magnitudes that are each less than a second product; and if so, forwarding all zero values for a macroblock.
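The check in the second implementation can be sketched in Python as below. The abstract does not say what the two products are; this sketch assumes each is a constant factor multiplied by a quantization parameter `qp`, and the factor values and the function name are hypothetical.

```python
def is_all_zero_macroblock(chroma_sums, luma_sums, qp,
                           chroma_factor=4, luma_factor=8):
    """Return True if the macroblock can be forwarded as all zeros.

    Assumption: the "first product" and "second product" are
    chroma_factor * qp and luma_factor * qp; the factors are
    illustrative values, not taken from the source.
    """
    if len(chroma_sums) != 2 or len(luma_sums) != 4:
        raise ValueError("expected two chrominance sums and four luminance sums")
    first_product = chroma_factor * qp
    second_product = luma_factor * qp
    return (all(abs(s) < first_product for s in chroma_sums)
            and all(abs(s) < second_product for s in luma_sums))


# With qp = 2 the limits are 8 (chrominance) and 16 (luminance).
skip = is_all_zero_macroblock([3, -5], [10, -12, 7, 0], qp=2)
```

When the check passes, the transform and quantization of the macroblock can be skipped, since the result is predicted to be all zeros.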
Abstract:
Systems and methods for video communication are described. In one aspect, network bandwidth conditions are estimated. Bi-level or full-color video is then transmitted over the network at transmission bit rates that are controlled as a function of the estimated bandwidth conditions. To this end, network bandwidth capability is periodically probed to identify similar, additional, or decreased bandwidth capabilities as compared to the estimated bandwidth conditions. Decisions to hold, decrease, or increase the video transmission bit rate are made based on the estimated bandwidth conditions in view of the probing operations. When the transmission bit rate is increased or decreased, the transmission bit rate is calculated to target an upper or lower bit rate, both of which are indicated by the estimated bandwidth conditions. Bi-level video communication is switched to full-color video transmission, or vice versa, when the video transmission bit rate respectively reaches the upper bit rate or the lower bit rate.
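The hold, increase, or decrease decision and the switch between bi-level and full-color video can be sketched as follows. This Python sketch is illustrative only: the probe outcomes `"more"`, `"same"`, and `"less"`, the fixed additive step, and the function names are assumptions, and the way the upper and lower bit rates are derived from the bandwidth estimate is not modeled.

```python
def next_bit_rate(current, lower, upper, probe, step):
    """Move the bit rate toward the upper or lower target, or hold it.

    `probe` is the assumed outcome of a bandwidth probe:
    "more" (additional capacity), "less" (reduced capacity), or "same".
    """
    if probe == "more":
        return min(upper, current + step)
    if probe == "less":
        return max(lower, current - step)
    return current


def select_mode(mode, rate, lower, upper):
    """Switch video mode when the bit rate reaches a bound."""
    if mode == "bi-level" and rate >= upper:
        return "full-color"
    if mode == "full-color" and rate <= lower:
        return "bi-level"
    return mode


# Bit rates in Kbps; the bounds are assumed example values.
rate = next_bit_rate(30, lower=20, upper=40, probe="more", step=10)
mode = select_mode("bi-level", rate, lower=20, upper=40)
```

Clamping the bit rate to the two targets means the mode switch is triggered exactly when a target is reached, which matches the behavior the abstract describes.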