Abstract:
A videoconferencing endpoint includes at least one processor, a number of microphones, and at least one camera. The endpoint can receive audio information and visual motion information during a teleconferencing session. The audio information includes one or more angles, with respect to the microphones, from a location at the teleconferencing session. The audio information is evaluated automatically to determine at least one candidate angle corresponding to a possible location of an active talker. The candidate angle can be analyzed further with respect to the motion information to determine whether the candidate angle correctly corresponds to a person who is speaking during the teleconferencing session.
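The cross-check described above can be illustrated with a minimal sketch. The function name, angle representation, and tolerance value are assumptions for illustration, not details from the abstract:

```python
# Hypothetical sketch: confirm an audio-derived candidate angle against
# visual motion detections before declaring an active talker.
# The 10-degree tolerance is an assumed value, not from the source.

def confirm_active_talker(candidate_angle_deg, motion_angles_deg,
                          tolerance_deg=10.0):
    """Return True if any detected motion lies within the tolerance of
    the candidate angle, suggesting the audio source is a person speaking
    rather than, say, a loudspeaker or reflection."""
    return any(abs(candidate_angle_deg - m) <= tolerance_deg
               for m in motion_angles_deg)
```

A candidate angle with no nearby motion would be rejected, so the camera is not steered toward a non-human sound source.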
Abstract:
A conference apparatus reduces or eliminates noise in audio for endpoints in a conference. Endpoints in the conference are designated as a primary talker and as secondary talkers. Audio for the endpoints is processed with speech detectors to characterize the audio as speech or non-speech and to determine energy levels of the audio. As the audio is written to buffers and then read from the buffers, decisions are made for the gain settings of faders applied to the read audio of the endpoints being combined in the speech-selective mix. In addition, the conference apparatus can mitigate the effects of a possible speech collision that may occur during the conference between endpoints.
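The fader decision could be sketched as follows. The specific gain values and the data layout are assumptions for illustration; the abstract only states that gains are decided from the speech/non-speech classification and energy levels:

```python
def fader_gains(endpoints, primary_id):
    """Decide per-endpoint fader gains for a speech-selective mix.

    endpoints: dict mapping endpoint id -> (is_speech, energy) as reported
    by that endpoint's speech detector. The primary talker passes at full
    gain; secondary talkers pass only while their detector flags speech;
    everything else is faded out so its noise never enters the mix.
    The 0.5 secondary gain is an assumed value."""
    gains = {}
    for eid, (is_speech, energy) in endpoints.items():
        if eid == primary_id:
            gains[eid] = 1.0
        elif is_speech and energy > 0.0:
            gains[eid] = 0.5   # attenuated secondary talker
        else:
            gains[eid] = 0.0   # non-speech: excluded from the mix
    return gains
```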
Abstract:
Example hybrid topologies of a conferencing system are disclosed. An example of a hybrid topology may comprise a plurality of endpoints and a central entity. Each of said plurality of endpoints may provide its primary video stream and audio stream to said central entity. The central entity provides the primary speaker stream and the mixed audio stream to each of said plurality of endpoint participants. In addition, some of said plurality of endpoints establish low-bandwidth/low-resolution media streams with others of said plurality of endpoint participants for non-speaker video.
Abstract:
An endpoint optimizes bandwidth by initiating a peer-to-peer conference with a plurality of remote devices, generating a first quality list comprising a first device of the plurality of remote devices from which to receive a first data stream at a first quality level, transmitting a request to the first device to receive the first data stream at the first quality level, determining that a second device of the plurality of remote devices is not a member of the first quality list, and in response to determining that the second device of the plurality of remote devices is not a member of the first quality list, transmitting a request to the second device to receive a second data stream at a second quality level.
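The two-tier request logic reduces to a simple membership test on the quality list. The quality labels below are placeholders for the abstract's "first" and "second" quality levels:

```python
def build_requests(quality_list, all_devices):
    """Request the first (higher) quality from devices on the quality
    list and the second (lower) quality from every other peer, mirroring
    the two-tier scheme: only selected peers cost full bandwidth."""
    return {dev: ("first-quality" if dev in quality_list
                  else "second-quality")
            for dev in all_devices}
```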
Abstract:
A videoconferencing system has a videoconferencing unit that uses portable devices as peripherals for the system. The portable devices obtain near-end audio and send the audio to the videoconferencing unit via a wireless connection. In turn, the videoconferencing unit sends the near-end audio from the loudest portable device along with near-end video to the far-end. The portable devices can control the videoconferencing unit and can initially establish the videoconference by connecting with the far-end and then transferring operations to the videoconferencing unit. To deal with acoustic coupling between the unit's loudspeaker and the portable device's microphone, the unit uses an echo canceller that is compensated for differences in the clocks used in the A/D and D/A converters of the loudspeaker and microphone.
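Selecting the loudest portable device can be sketched as an RMS-energy comparison over each device's most recent audio frame. The frame layout and function name are assumptions; the abstract does not specify how loudness is measured:

```python
import math

def loudest_device(frames):
    """frames: dict mapping device id -> list of PCM samples for the
    current frame. Return the id with the highest RMS energy; the unit
    would forward this device's near-end audio to the far end."""
    def rms(samples):
        return math.sqrt(sum(s * s for s in samples) / len(samples))
    return max(frames, key=lambda d: rms(frames[d]))
```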
Abstract:
A videoconferencing unit comprises a display screen configured to display a video data stream comprising images of a far end participant. A processor is adapted to decode the video data stream and generate a modified region of the video data stream. The modified region of the video data stream is displayed on the display screen at a location where images of eyes of the far end participant are displayed on the display screen. A camera is configured with a lens to capture images of a near end participant through the modified region of the video data stream, with at least a portion of the lens positioned within the modified region of the video data stream.
Abstract:
A system and method are disclosed for adapting a continuous presence videoconferencing layout according to interactions between conferees. Using regions of interest found in video images, the images of conferees may be dynamically arranged as displayed by endpoints. Arrangements may be responsive to various metrics, including the position of conferees in a room and dominant conferees in the videoconference. Video images may be manipulated as part of the arrangement, including cropping and mirroring the video image. As interactions between conferees change, the layout may be automatically rearranged responsive to the changed interactions.
Abstract:
A novel universal bridge (UB) can handle and conduct multimedia multipoint conferences between a plurality of MREs and LEPs without using an MRM, an MCU, or a gateway. Further, a UB can be configured to allocate and release resources dynamically according to the current needs of each conferee and the session.
Abstract:
A multipoint communication system uses Internet protocol trunking to facilitate communication between media control units (for sending and receiving multipoint communication signals between end-point devices), a media gateway (for translating between non-Internet protocol multipoint communication signals and Internet protocol communication signals), and a controller (for establishing and controlling a multipoint communication session between the end-point devices). In addition, a multimedia gateway (for use in a multipoint communication system) is described that incorporates an interactive voice response unit through which users of non-Internet protocol devices (connected to the multimedia gateway) interact to establish a communication session with a multipoint communication system.
Abstract:
In accordance with the present invention, a system and method for computing a location of an acoustic source is disclosed. The method includes steps of processing a plurality of microphone signals in frequency space to search a plurality of candidate acoustic source locations for a maximum normalized signal energy. The method uses phase-delay look-up tables to efficiently determine phase delays for a given frequency bin number k based upon a candidate source location and a microphone location, thereby reducing system memory requirements. Furthermore, the method compares a maximum signal energy for each frequency bin number k with a threshold energy Et(k) to improve accuracy in locating the acoustic source.
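The search described above can be illustrated with a simplified frequency-domain sketch. The data layout, table shape, and thresholding rule below are assumptions made for illustration; the abstract's key points are the precomputed phase-delay look-up table (avoiding per-iteration geometry computation) and the per-bin energy threshold:

```python
import cmath

def locate_source(spectra, phase_table, threshold):
    """Simplified steered-response search in frequency space.

    spectra: list over microphones of complex spectra (one value per
    frequency bin k). phase_table[loc][mic][k] holds the precomputed
    phase delay (radians) for candidate location loc, so alignment is a
    table lookup rather than a geometry calculation. Bins whose steered
    energy falls below threshold[k] are discarded for accuracy.
    Returns the index of the candidate location with maximum energy."""
    best_loc, best_energy = None, -1.0
    for loc, delays in enumerate(phase_table):
        energy = 0.0
        for k in range(len(spectra[0])):
            # Phase-align each microphone's bin-k component via the table.
            steered = sum(spectra[m][k] * cmath.exp(-1j * delays[m][k])
                          for m in range(len(spectra)))
            e = abs(steered) ** 2
            if e >= threshold[k]:       # reject weak bins
                energy += e
        if energy > best_energy:
            best_loc, best_energy = loc, energy
    return best_loc
```

In the test below, the second microphone's signal is inverted (a half-cycle delay), so only the candidate location whose table entry compensates that delay sums coherently.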