Abstract:
Systems and methods are described for automatically framing participants in a video conference using a single camera of a video conferencing system. A camera of a video conferencing system may capture video images of a conference room. A processor of the video conferencing system may identify a potential region of interest within a video image of the captured video images, the potential region of interest including an identified participant. Feature detection may be executed on the potential region of interest, and a region of interest may be computed based on the executed feature detection. The processor may then automatically frame the identified participant within the computed region of interest, the automatic framing including at least one of cropping the video image to match the computed region of interest and rescaling the video image to a desired resolution.
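The crop-and-rescale step described above can be sketched as follows. This is an illustrative sketch only, not the patented implementation: the ROI format `(x, y, w, h)`, the nearest-neighbor rescaling, and the default output resolution are all assumptions for the example.

```python
import numpy as np

def frame_participant(frame, roi, out_size=(1280, 720)):
    """Crop a video frame to a computed region of interest, then rescale.

    frame: H x W x 3 image array.
    roi: (x, y, w, h) hypothetical ROI produced by feature detection.
    out_size: (width, height) of the desired output resolution.
    """
    x, y, w, h = roi
    crop = frame[y:y + h, x:x + w]
    out_w, out_h = out_size
    # Nearest-neighbor rescale to the desired resolution via index maps.
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return crop[rows][:, cols]

frame = np.zeros((1080, 1920, 3), dtype=np.uint8)
framed = frame_participant(frame, (600, 200, 640, 360))
print(framed.shape)  # (720, 1280, 3)
```

In practice the rescaling would use a higher-quality interpolation (bilinear or bicubic), but the structure (crop to ROI, then resample to the target resolution) is the same.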
Abstract:
A system and method for initiating conference calls with external devices are disclosed. Call participants are sent a conference invitation and conference information regarding the designated conference call. This conference information is stored on the participant's external device. When a participant arrives at a conference call location having a conferencing device, the conferencing device is capable of communicating with the external device, initiating communications and exchanging conference information. If the participant is verified and/or authorized, the conference system may send the IP address of the conferencing device to the external device to initiate the conference call. In one embodiment, the conferencing device uses an ultrasound acoustic communication band to initiate the call with the external device on a semi-automated basis. An acoustic signature comprising a pilot sequence for communications synchronization may be generated to facilitate the call. Audible and aesthetic acoustic protocols may also be employed.
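A pilot sequence in a near-ultrasonic band can be sketched as below. This is a minimal illustration, not the patented signaling scheme: the 48 kHz sample rate, the 19 kHz carrier, on-off keying, and the symbol duration are all assumptions for the example.

```python
import numpy as np

FS = 48_000       # sample rate in Hz; assumption
PILOT_HZ = 19_000  # carrier in a near-ultrasonic band; assumption

def pilot_sequence(bits, symbol_s=0.01):
    """On-off-keyed pilot: each bit gates a 19 kHz tone for one symbol.

    A receiver can correlate against the known bit pattern to
    synchronize communications before exchanging conference info.
    """
    n = int(FS * symbol_s)
    t = np.arange(n) / FS
    symbol = np.sin(2 * np.pi * PILOT_HZ * t)
    return np.concatenate([symbol * b for b in bits])

sig = pilot_sequence([1, 0, 1, 1])
print(sig.shape)  # (1920,)
```

A carrier near 19 kHz is inaudible to most adults while remaining within the passband of typical speakerphone hardware, which is why such bands are attractive for device-to-device signaling.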
Abstract:
Systems and methods are described for dynamically suppressing non-linear distortion for a device, such as a speakerphone. A device may receive a signal, where the device has non-linear distortion at a predetermined frequency. The received signal may be analyzed to compute a tone strength parameter and a band level. The received signal may be filtered such that a spectrum of the received signal is dynamically limited by reducing suppression of the non-linear distortion when the tone strength parameter is in a lower portion of a predetermined range and increasing suppression of the non-linear distortion when the tone strength parameter is in an upper portion of the predetermined range, the predetermined range of the tone strength parameter corresponding to a loudness range of the device.
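The tone-strength-dependent suppression can be sketched as a gain curve: little attenuation in the lower portion of the range, increasing attenuation toward the upper portion. The range endpoints and the 12 dB maximum attenuation below are illustrative parameters, not values from the source.

```python
import numpy as np

def suppression_gain(tone_strength, lo=0.2, hi=0.8, max_atten_db=12.0):
    """Map a tone strength parameter in [lo, hi] to a linear gain.

    Below lo: no suppression (gain 1.0); above hi: full suppression
    (max_atten_db of attenuation); linear in dB in between.
    lo, hi, and max_atten_db are illustrative, not the patent's values.
    """
    frac = np.clip((tone_strength - lo) / (hi - lo), 0.0, 1.0)
    return 10 ** (-frac * max_atten_db / 20)

print(round(suppression_gain(0.1), 3))  # 1.0 (no suppression)
print(round(suppression_gain(0.9), 3))  # 0.251 (full 12 dB attenuation)
```

The gain would be applied to the band containing the predetermined distortion frequency, so that loud tonal content, which is most likely to excite the non-linearity, is attenuated while quiet content passes unchanged.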
Abstract:
A telecommunications device (1) comprises a base part (2), a cap part (4), a body part (3), a control panel (5), an upper light ring (9), a lower light ring (8), a plurality of sound microphones and a plurality of sound speakers. The cap part (4) is concave in shape. The body part (3) is frusto-conical in shape and tapers outwardly away from the cap part (4) towards the base part (2). The body part (3) and the cap part (4) cover the sound microphones and the sound speakers. The control panel (5) is inclined relative to the plane of the support surface (6) and relative to the plane of the base part (2). The control panel (5) protrudes upwardly from the upper edge of the body part (3) over the concave cap part (4), and the control panel (5) protrudes downwardly from the lower edge of the body part (3). The control panel (5) controls operation of the sound microphones and the sound speakers. The upper light ring (9) is located between the cap part (4) and the body part (3). The lower light ring (8) is located between the body part (3) and the base part (2). When the device (1) is in an active state in which the device (1) is capable of capturing sound and/or rendering sound, each of the light rings (8, 9) emits light to indicate the active state of the device (1).
Abstract:
Improved audio data processing methods and systems are provided. Some implementations involve dividing frequency domain audio data into a plurality of subbands and determining amplitude modulation signal values for each of the plurality of subbands. A band-pass filter may be applied to the amplitude modulation signal values in each subband, to produce band-pass filtered amplitude modulation signal values for each subband. The band-pass filter may have a central frequency that exceeds an average cadence of human speech. A gain may be determined for each subband based, at least in part, on a function of the amplitude modulation signal values and the band-pass filtered amplitude modulation signal values. The determined gain may be applied to each subband.
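The per-subband processing can be sketched as below, using a difference of two one-pole low-pass filters as the band-pass (centered above the roughly 4 Hz cadence of speech). The envelope frame rate, the filter corner frequencies, and the gain function are all assumptions for the example, not the patented design.

```python
import numpy as np

FS_FRAMES = 100.0  # frame rate of the amplitude modulation signal (Hz); assumption

def onepole_lp(x, fc):
    """First-order low-pass over a frame-rate envelope signal."""
    a = np.exp(-2 * np.pi * fc / FS_FRAMES)
    y = np.empty_like(x)
    acc = 0.0
    for i, v in enumerate(x):
        acc = a * acc + (1 - a) * v
        y[i] = acc
    return y

def subband_gains(am, fc_lo=6.0, fc_hi=20.0, strength=0.5):
    """Per-frame gain for one subband from its amplitude modulation signal.

    Band-pass (difference of low-passes) with a central frequency above
    the ~4 Hz average cadence of speech; the gain attenuates frames where
    fast modulation dominates the envelope. Parameters are illustrative.
    """
    bp = onepole_lp(am, fc_hi) - onepole_lp(am, fc_lo)
    ratio = np.abs(bp) / (am + 1e-9)
    return np.clip(1.0 - strength * ratio, 0.0, 1.0)

gains = subband_gains(np.ones(50))
```

A slowly varying envelope (speech-like modulation) yields little band-pass energy and a gain near 1.0, while rapid flutter in the envelope produces band-pass energy and a reduced gain for that subband.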
Abstract:
Systems and methods are described for determining the orientation of an external audio device in a video conference, which may be used to provide a congruent multimodal representation of the conference. A camera of a video conferencing system may be used to detect a potential location of an external audio device within a room in which the video conferencing system is providing a video conference. Within the detected potential location, a visual pattern associated with the external audio device may be identified. Using the identified visual pattern, the video conferencing system may estimate an orientation of the external audio device, the orientation being used by the video conferencing system to provide spatial audio-video congruence to a far-end audience.
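One simple way to recover an orientation from an identified visual pattern is to locate two known keypoints of the pattern in the image and measure the angle of the line between them. The two-marker pattern and the pixel-coordinate interface below are hypothetical, for illustration only.

```python
import math

def device_orientation(left_px, right_px):
    """Estimate the in-plane rotation of a device from two pattern keypoints.

    left_px, right_px: (x, y) image positions of hypothetical left and
    right markers on the device's visual pattern. The angle of the marker
    axis approximates the device's rotation in the image plane, in degrees.
    """
    dx = right_px[0] - left_px[0]
    dy = right_px[1] - left_px[1]
    return math.degrees(math.atan2(dy, dx))

print(device_orientation((900, 500), (1000, 500)))  # 0.0 (level)
```

With the device's orientation known, the system can map each of its audio channels to the correct side of the on-screen image, so that a talker heard from the left is also seen on the left by the far-end audience.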