Abstract:
Techniques are provided for dynamically adapting the view from a conference endpoint that includes a presentation apparatus, such as a whiteboard. A first signal is received that includes a video signal derived from a video camera that is viewing a room during a conference session in which a person is presenting information on a presentation apparatus. During the conference session, switching is performed between the first signal and a second signal, representing content being displayed on the presentation apparatus, for output and transmission to other conference endpoints of the conference session. The determination as to whether to supply the first signal (for a normal view of the conference room) or the second signal may be based on a position determination of the presenter, or may instead be based on an external view selection command received from another conference endpoint participating in the conference session.
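A minimal sketch of the switching decision described above, assuming hypothetical inputs: `presenter_x` (the presenter's estimated horizontal position), `whiteboard_x_range` (the region in front of the whiteboard), and an optional `external_command` from a remote endpoint. None of these names come from the abstract.

```python
def select_output_signal(camera_signal, whiteboard_signal,
                         presenter_x, whiteboard_x_range,
                         external_command=None):
    """Choose which signal to output and transmit to the other endpoints.

    An external view-selection command from another endpoint, when
    present, overrides the position-based decision.
    """
    if external_command is not None:
        return camera_signal if external_command == "room" else whiteboard_signal
    lo, hi = whiteboard_x_range
    # Presenter standing in front of the whiteboard: send its content.
    if lo <= presenter_x <= hi:
        return whiteboard_signal
    return camera_signal
```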
Abstract:
Techniques are provided for establishing a videoconference session between participants at different endpoints, where each endpoint includes at least one computing device and one or more displays. A plurality of video streams is received at an endpoint, and each video stream is classified as at least one of a people view and a data view. The classified views are analyzed to determine one or more regions of interest for each of the classified views, where at least one region of interest has a size smaller than a size of the classified view. Synthesized views of at least some of the video streams are generated, wherein the synthesized views include at least one view including a region of interest, and views including the synthesized views are rendered at one or more displays of an endpoint device.
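A rough sketch of the classify, find-regions-of-interest, and synthesize pipeline. The stream metadata, the fixed crop coordinates, and the dictionary-based view description are all illustrative assumptions; the abstract does not specify how classification or region detection is implemented.

```python
from dataclasses import dataclass

@dataclass
class Stream:
    name: str
    kind: str    # "people" or "data" (classification result)
    width: int
    height: int

def regions_of_interest(stream):
    """Return (x, y, w, h) regions, at least one smaller than the view."""
    if stream.kind == "people":
        # e.g. a detected face/torso box; coordinates are placeholders.
        return [(stream.width // 4, stream.height // 4,
                 stream.width // 2, stream.height // 2)]
    # A data view might be kept whole, or cropped to active content.
    return [(0, 0, stream.width, stream.height)]

def synthesize_views(streams):
    """Build a render list: one synthesized view per region of interest."""
    views = []
    for s in streams:
        for roi in regions_of_interest(s):
            views.append({"source": s.name, "crop": roi})
    return views

# Example: two classified streams rendered at an endpoint's displays.
streams = [Stream("cam1", "people", 1280, 720),
           Stream("slides", "data", 1920, 1080)]
print(synthesize_views(streams))
```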
Abstract:
Video frames are captured at one or more cameras during a video conference session, where each video frame includes a digital image with a plurality of pixels. Depth values associated with each pixel are determined in at least one video frame, where each depth value represents a distance of a portion of the digital image represented by at least one corresponding pixel from the one or more cameras that capture the at least one video frame. Luminance values of pixels are adjusted within captured video frames based upon the depth values determined for the pixels so as to achieve relighting of the video frames as the video frames are displayed during the video conference session.
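A minimal NumPy sketch of the relighting step: nearer pixels are brightened relative to distant ones. The linear gain curve and the `near`/`far` bounds are assumptions; the abstract says only that luminance is adjusted based on depth.

```python
import numpy as np

def relight(luma, depth, near=0.5, far=4.0):
    """luma: HxW luminance plane (uint8); depth: HxW distances in meters."""
    # Normalize depth to [0, 1]: 0 = nearest to camera, 1 = farthest.
    d = np.clip((depth - near) / (far - near), 0.0, 1.0)
    gain = 1.0 + 0.5 * (1.0 - d)   # up to +50% brightness for near pixels
    return np.clip(luma * gain, 0, 255).astype(np.uint8)
```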
Abstract:
A method for transition control in a videoconference comprises receiving a plurality of video streams from a plurality of cameras, displaying a first video stream of the plurality of video streams, detecting a stream selection event for display of a second video stream of the plurality of video streams, determining a transition category for a transition from the first video stream to the second video stream, and selecting a display transition based on the transition category for displaying the transition from the first video stream to the second video stream.
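A sketch of how a stream-selection event could map to a display transition through a transition category. The category names, the transition types, and the attributes used to categorize are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class VideoStream:
    room: str
    is_content: bool = False

# Hypothetical category -> transition mapping.
TRANSITIONS = {
    "same_room": "pan",          # cameras share a scene: glide across it
    "cross_site": "cross_fade",  # different rooms: blend between views
    "content_switch": "cut",     # switching to shared content: hard cut
}

def transition_category(current, selected):
    if selected.is_content:
        return "content_switch"
    return "same_room" if current.room == selected.room else "cross_site"

def select_display_transition(current, selected):
    return TRANSITIONS[transition_category(current, selected)]

# Example: selecting a second camera in the same room yields a pan.
print(select_display_transition(VideoStream("A"), VideoStream("A")))
```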
Abstract:
A video coder includes a forward coder and a reconstruction module determining a motion compensated predicted picture from one or more previously decoded pictures in a multi-picture store. The reconstruction module includes a reference picture predictor that uses only previously decoded pictures to determine one or more predicted reference pictures. The predicted reference picture(s) are used for motion compensated prediction. The reference picture predictor may include optical flow analysis that uses a current decoded picture and that may use one or more previously decoded pictures together with affine motion analysis and image warping to determine at least a portion of at least one of the reference pictures.
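A rough sketch, using OpenCV, of one way to build a predicted reference picture from previously decoded pictures: estimate optical flow between the two most recent decoded pictures, fit a global affine motion model, and warp the most recent picture forward by that motion. The grid sampling and single-step extrapolation are simplifying assumptions, not the coder's specified method.

```python
import cv2
import numpy as np

def predict_reference(prev2, prev1):
    """prev2, prev1: previously decoded grayscale pictures (HxW uint8)."""
    # Dense optical flow from the older to the newer decoded picture.
    flow = cv2.calcOpticalFlowFarneback(prev2, prev1, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    h, w = prev1.shape
    # Fit a global affine model to the flow field, sampled on a grid.
    ys, xs = np.mgrid[0:h:16, 0:w:16].reshape(2, -1)
    src = np.stack([xs, ys], axis=1).astype(np.float32)
    dst = (src + flow[ys, xs]).astype(np.float32)
    A, _ = cv2.estimateAffine2D(src, dst)
    # Extrapolate the same motion one step forward: warping the newest
    # decoded picture by A yields the predicted reference used for
    # motion compensated prediction.
    return cv2.warpAffine(prev1, A, (w, h))
```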
Abstract:
Clock synchronization for an acoustic echo canceller (AEC) with a speaker and a microphone connected over a digital link may be provided. A clock difference may be estimated by analyzing the speaker signal and the microphone signal in the digital domain. The clock synchronization may be performed with a combination of hardware and software, in two stages: first, coarse synchronization in hardware, then fine synchronization in software with, for example, a re-sampler.
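A hedged sketch of the software stage: estimate the speaker/microphone clock-rate mismatch from the digital-domain signals, then compensate with a fractional re-sampler. Estimating drift from the trend in per-block cross-correlation delays is an assumption; the abstract does not fix the estimation method.

```python
import numpy as np

def estimate_rate_ratio(speaker, mic, block=4096):
    """Estimate mic_clock / speaker_clock from per-block delay drift."""
    n = min(len(speaker), len(mic)) // block
    delays = []
    for i in range(n):
        s = speaker[i * block:(i + 1) * block]
        m = mic[i * block:(i + 1) * block]
        xc = np.correlate(m, s, mode="full")
        delays.append(np.argmax(xc) - (block - 1))
    # A linear trend in the block delays indicates a rate mismatch.
    slope = np.polyfit(np.arange(n), delays, 1)[0]
    return 1.0 + slope / block

def resample_to_match(mic, ratio):
    """Fine correction: fractional resampling by linear interpolation."""
    t = np.arange(int(len(mic) / ratio)) * ratio
    return np.interp(t, np.arange(len(mic)), mic)
```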