Abstract:
Disclosed herein are methods, systems, and devices for improved audio, video, and data conferencing. The present invention provides a conferencing system comprising a plurality of endpoints communicating data including audio data and control data according to a communication protocol. A local conference endpoint may control or be controlled by a remote conference endpoint. Data comprising control signals may be exchanged between the local endpoint and remote endpoint via various communication protocols. In other embodiments, the present invention provides an improved bridge architecture for controlling functions of conference endpoints, including functions of the bridge.
Abstract:
In accordance with the present invention, a system and method for computing a location of an acoustic source is disclosed. The method includes steps of processing a plurality of microphone signals in frequency space to search a plurality of candidate acoustic source locations for a maximum normalized signal energy. The method uses phase-delay look-up tables to efficiently determine phase delays for a given frequency bin number k based upon a candidate source location and a microphone location, thereby reducing system memory requirements. Furthermore, the method compares a maximum signal energy for each frequency bin number k with a threshold energy Et(k) to improve accuracy in locating the acoustic source.
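The frequency-domain search described above can be illustrated with a short sketch. It is not the disclosed implementation: the per-candidate delay table, the normalization by total microphone energy, and the per-bin voting used to combine bins are all assumptions made for illustration, as are the names `build_delay_table`, `locate`, and `SPEED_OF_SOUND`.

```python
import cmath
import math
from collections import Counter

SPEED_OF_SOUND = 343.0  # m/s; assumed value for illustration

def build_delay_table(candidates, mics):
    """table[c][m] = propagation delay (s) from candidate c to microphone m.

    Storing one delay per (candidate, microphone) pair lets the phase for any
    bin k be derived as 2*pi*k*bin_hz*delay instead of being stored per bin.
    """
    return [[math.dist(c, m) / SPEED_OF_SOUND for m in mics] for c in candidates]

def locate(spectra, delay_table, bin_hz, thresholds):
    """Return the index of the most likely candidate location, or None.

    spectra[m][k] is the complex spectrum of microphone m at bin k, and
    thresholds[k] plays the role of the threshold energy Et(k).
    """
    n_mics = len(spectra)
    votes = Counter()
    # Bin 0 (DC) carries no phase information, so the search starts at k = 1.
    for k in range(1, len(thresholds)):
        total = sum(abs(spectra[m][k]) ** 2 for m in range(n_mics))
        if total == 0.0:
            continue
        best_c, best_e = None, 0.0
        for c, delays in enumerate(delay_table):
            # Undo the propagation delay assumed for this candidate, then sum.
            steered = sum(
                spectra[m][k] * cmath.exp(2j * math.pi * k * bin_hz * delays[m])
                for m in range(n_mics)
            )
            # Normalized energy lies in [0, 1]; 1 means perfect alignment.
            energy = abs(steered) ** 2 / (n_mics * total)
            if energy > best_e:
                best_c, best_e = c, energy
        # Only bins whose maximum energy exceeds the threshold cast a vote.
        if best_c is not None and best_e > thresholds[k]:
            votes[best_c] += 1
    return votes.most_common(1)[0][0] if votes else None
```

In this sketch a bin contributes only when its best normalized energy exceeds its threshold, which mirrors the comparison against Et(k); how the disclosed method combines the per-bin results is not stated in the abstract.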
Abstract:
A system and method for adjusting a video bit rate (VBR) over a network includes reducing the VBR if the network incurs a packet loss (PL) that is greater than a PL threshold; increasing the VBR if the PL is less than or equal to the PL threshold over a maximum integer number of time intervals; and increasing the maximum integer number of time intervals if the PL is greater than the PL threshold at the increased VBR. In addition, if the PL over each consecutive time interval is less than or equal to the PL threshold, the VBR is increased over consecutive time intervals until a maximum video bit rate is restored.
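The adjustment rules above can be sketched as a small controller that is updated once per time interval. This is a minimal illustration under stated assumptions: the fixed step size, the minimum bit rate, and the names `BitRateController`, `wait_intervals`, and `update` are hypothetical and do not come from the disclosure.

```python
from dataclasses import dataclass

@dataclass
class BitRateController:
    max_vbr: int             # maximum video bit rate (kbit/s)
    min_vbr: int             # assumed floor for the bit rate (kbit/s)
    step: int                # assumed fixed adjustment step (kbit/s)
    pl_threshold: float      # packet loss threshold (fraction, e.g. 0.02)
    wait_intervals: int = 1  # loss-free intervals required before an increase
    vbr: int = 0             # current bit rate; 0 means "start at max_vbr"
    clean_intervals: int = 0
    raised: bool = False     # True while running at a recently increased rate

    def __post_init__(self):
        if self.vbr == 0:
            self.vbr = self.max_vbr

    def update(self, packet_loss: float) -> int:
        """Apply one interval's packet loss measurement and return the new VBR."""
        if packet_loss > self.pl_threshold:
            # Loss at an increased rate: wait longer before the next increase.
            if self.raised:
                self.wait_intervals += 1
            self.vbr = max(self.min_vbr, self.vbr - self.step)
            self.clean_intervals = 0
            self.raised = False
        else:
            self.clean_intervals += 1
            if self.vbr < self.max_vbr and self.clean_intervals >= self.wait_intervals:
                self.vbr = min(self.max_vbr, self.vbr + self.step)
                self.clean_intervals = 0
                self.raised = True
        return self.vbr
```

With `wait_intervals` left at 1, consecutive loss-free intervals raise the rate step by step until `max_vbr` is restored; each loss that follows an increase lengthens the wait before the next attempt.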
Abstract:
Disclosed are methods and systems for multipoint videoconferencing. A Media Relay MCU (MRM) receives compressed media (audio, video, and/or data) from a plurality of endpoints participating in a videoconferencing session. For a given endpoint, the MRM selects which of the other endpoints to display in a continuous presence (CP) layout at the given endpoint. The MRM transmits the compressed media from the selected endpoints to the given endpoint to be presented in the CP layout. The MRM does not decode the compressed media.
Abstract:
A wireless LAN can be used to support audio communication sessions between wireless communication devices and wired communication devices, both configured to operate according to the Internet Protocol. Both the wired and wireless communication devices generate and transmit frames of voice information over the LAN to each other. In the process of generating these frames, they place a timestamp in each frame, which a receiving communication device uses to determine when the frame should be played relative to all of the other frames of voice information it receives. At times these communication devices can place incorrect timestamp values in the frames of audio information, which can degrade the quality of the communication experience for a user. I propose to correct any incorrect timestamp values by first recognizing that a timestamp value is incorrect and then rounding the value to the nearest frame boundary.
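The correction step can be sketched as follows. This is a minimal illustration, assuming that timestamps are integer tick counts, that every frame spans a fixed number of ticks, and that a timestamp is "incorrect" when it does not fall on a frame boundary. The names `FRAME_TICKS`, `is_on_frame_boundary`, and `correct_timestamp` are hypothetical, and the value 160 (20 ms of audio at an 8 kHz clock) is chosen only as an example.

```python
FRAME_TICKS = 160  # assumed: 20 ms frames at an 8 kHz timestamp clock

def is_on_frame_boundary(timestamp: int, frame_ticks: int = FRAME_TICKS) -> bool:
    """A timestamp is treated as correct when it is a multiple of the frame length."""
    return timestamp % frame_ticks == 0

def correct_timestamp(timestamp: int, frame_ticks: int = FRAME_TICKS) -> int:
    """Round a misaligned timestamp to the nearest frame boundary."""
    if is_on_frame_boundary(timestamp, frame_ticks):
        return timestamp
    # Integer rounding to the nearest multiple; exact ties round up.
    return ((timestamp + frame_ticks // 2) // frame_ticks) * frame_ticks
```

For example, with 160-tick frames a timestamp of 325 is rounded down to 320, while 479 is rounded up to 480.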
Abstract:
A method and apparatus for detecting a singing frequency in a signal processing system using two neural networks is disclosed. The first (a hit neural network) monitors the maximum spectral peak FFT bin as it changes with time. The second (a change neural network) monitors the monotonically increasing behavior. The inputs to the neural networks are the maximum spectral magnitude bin and its rate of change in time. The output is an indication of whether howling is likely to occur and the corresponding singing frequency. Once the singing frequency is identified, it can be suppressed using any one of many available techniques, such as notch filters. Several improvements of the base method or apparatus are also disclosed, in which additional neural networks are used to detect more than one singing frequency.
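The two network inputs named above, the maximum spectral magnitude bin and its rate of change, can be computed from a sequence of magnitude spectra as sketched below. The neural networks themselves are not shown. The function name `peak_features`, the use of a frame-to-frame difference as the rate of change, and the choice of 0.0 for the first frame are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

def peak_features(
    frames: Sequence[Sequence[float]],
) -> List[Tuple[int, float, float]]:
    """For each frame, return (peak bin, peak magnitude, change since last frame).

    Each frame is a magnitude spectrum (one value per FFT bin). The rate of
    change is the difference between consecutive peak magnitudes, even if the
    peak has moved to a different bin (an assumption of this sketch).
    """
    features = []
    prev_mag = None
    for mags in frames:
        peak_bin = max(range(len(mags)), key=lambda k: mags[k])
        peak_mag = mags[peak_bin]
        rate = 0.0 if prev_mag is None else peak_mag - prev_mag
        features.append((peak_bin, peak_mag, rate))
        prev_mag = peak_mag
    return features
```

A peak that stays in the same bin while its magnitude keeps growing from frame to frame is the pattern the hit and change networks are described as monitoring.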
Abstract:
Dynamically adapting a continuous presence (CP) layout in a videoconference enhances a videoconferencing experience by providing optimum visibility to regions of interest within the CP layout and ignoring regions of no interest. Based on the CP layout, a CP video image can be built, in which a conferee at a receiving endpoint can observe, simultaneously, several other participants' sites in the conference. For example, more screen space within the CP layout is devoted to presenting the participants in the conference and little or no screen space is used to present an empty seat, an empty room, or an unused portion of a room. Aspect ratios of segments of the CP layout (e.g., landscape vs. portrait) can be adjusted to optimally present the regions of interest. The CP layout can be adjusted as regions of interest change depending on the dynamics of the videoconference.
Abstract:
An association of videoconferencing services is disclosed that enables two or more videoconferences to be generated with each videoconference running independently from the others and having its own conferees. The association is achieved by having at least one conferee (a traveler) that can “move” or “travel” from one videoconference to another in the association. The one or more travelers belong to the association and not to any particular videoconference. In exemplary embodiments, the traveler can choose between the various associated videoconferences by making a selection to a multipoint control unit (MCU) that controls the associated videoconferences.
Abstract:
A method for efficiently accessing pertinent information retrieved from a videoconferencing system call log is disclosed. System call logs typically contain a chronological list of raw information pertaining to inbound and outbound videoconferencing calls. The method uses input from the user at an endpoint to correlate and sort this chronological information so that the information required at the current time is displayed. Videoconferencing systems are typically shared-use or community-type devices, and the method of this disclosure allows for more user-friendly access to pertinent information. Auto-population of a speed dial list, or association of a smart tag with the retrieved information, is another possible feature to aid the end user. This method will allow a business to more efficiently use a limited number of videoconferencing systems among diverse groups of users with diverse calling needs.
Abstract:
The system disclosed herein seeks to solve the problem of unnatural, dizzying camera motion when moving a camera between two preset positions, e.g., head-and-shoulders shots of two videoconference participants. In particular, the system described herein attempts to mimic the camera movement that a professional camera operator would use in a professional video production. A preferred method of moving the camera between two positions is to first zoom out, away from the first position, pan across to the next position, then zoom in. This gives viewers of the camera content an idea of the spatial relationship between the two camera positions and also avoids the aesthetically undesirable effect of panning through irrelevant visual background at high zoom ratios. Disclosed herein is a technique of producing this desired behavior in the context of an automatically or semi-automatically controlled video camera, such as those used in conjunction with videoconferencing equipment.
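The zoom-out, pan, zoom-in sequence can be sketched as a list of waypoints between two presets. This is an illustration only: the `Preset` fields, the `plan_move` function, and the default wide zoom ratio of 1.0 are hypothetical, and a real controller would also need speed and timing profiles that the abstract does not specify.

```python
from typing import List, NamedTuple

class Preset(NamedTuple):
    pan: float   # degrees
    tilt: float  # degrees
    zoom: float  # zoom ratio; 1.0 is assumed to be the widest view

def plan_move(start: Preset, end: Preset, wide_zoom: float = 1.0) -> List[Preset]:
    """Return waypoints for a three-phase move from start to end."""
    # Never zoom in during the "zoom out" phase: travel at the widest of the
    # requested wide zoom and the two preset zooms.
    travel_zoom = min(wide_zoom, start.zoom, end.zoom)
    return [
        Preset(start.pan, start.tilt, travel_zoom),  # 1. zoom out in place
        Preset(end.pan, end.tilt, travel_zoom),      # 2. pan/tilt at wide zoom
        end,                                         # 3. zoom in on the target
    ]
```

Because the pan happens only at the wide zoom setting, the camera never sweeps across background detail at a high zoom ratio.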