Abstract:
A loudspeaker is driven with a loudspeaker signal to generate sound, and the sound is converted into one or more microphone signals by one or more microphones. The microphone signals are concurrently transformed into far-field beam signals and near-field beam signals. The far-field beam signals and the near-field beam signals are concurrently processed to produce one or more far-field output signals and one or more near-field output signals, respectively. Echo is detected and canceled in the far-field beam signals and in the near-field beam signals. When the echo is not detected above a threshold, the one or more far-field output signals are outputted. When the echo is detected above the threshold, the one or more near-field output signals are outputted. A signal based on the outputted signals is transmitted.
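A minimal sketch of the echo-gated output selection described above, assuming per-frame processing with NumPy; the normalized-correlation echo estimate and the threshold value are illustrative assumptions, since the abstract does not specify how echo is detected:

```python
import numpy as np

ECHO_THRESHOLD = 0.1  # illustrative value; the abstract does not fix one

def detect_echo_level(loudspeaker_frame: np.ndarray, beam_frame: np.ndarray) -> float:
    """Estimate residual echo as the normalized correlation between the
    loudspeaker signal and a beam signal (one simple proxy among many)."""
    num = abs(np.dot(loudspeaker_frame, beam_frame))
    den = np.linalg.norm(loudspeaker_frame) * np.linalg.norm(beam_frame) + 1e-12
    return num / den

def select_output(loudspeaker_frame, far_field_outputs, near_field_outputs):
    """Output the far-field signals unless the echo estimate exceeds the
    threshold, in which case fall back to the near-field outputs."""
    level = max(detect_echo_level(loudspeaker_frame, s) for s in far_field_outputs)
    return far_field_outputs if level <= ECHO_THRESHOLD else near_field_outputs
```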
Abstract:
A processing system can include a tracking microphone array; audio tracker circuitry connected to the tracking microphone array to track an audio source based on an audio input from the array; communication microphones; and a processor. The processor can include audio circuitry to receive an audio input from the communication microphones and apply acoustic echo cancellation (AEC) and/or acoustic echo suppression (AES) processing to that input. The processor can further include calculating circuitry to calculate a ratio of signal power after and before the AEC and/or AES processing, and control circuitry to generate an acoustic echo presence indication based on the calculated ratio. The processor can transmit, via transmitting circuitry, the acoustic echo presence indication to the audio tracker circuitry over a data communication channel between the processor and the audio tracker circuitry.
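A sketch of the power-ratio test that drives the indication, assuming frame-wise power measured before and after the AEC/AES stage; the ratio threshold is an illustrative assumption:

```python
import numpy as np

RATIO_THRESHOLD = 0.5  # illustrative; a low after/before ratio means much echo was removed

def echo_presence_indication(frame_before_aec: np.ndarray,
                             frame_after_aec: np.ndarray) -> bool:
    """Return True (echo present) when the after/before power ratio shows
    that the AEC/AES stage cancelled substantial energy."""
    power_before = float(np.mean(frame_before_aec ** 2)) + 1e-12
    power_after = float(np.mean(frame_after_aec ** 2))
    return (power_after / power_before) < RATIO_THRESHOLD
```

The resulting boolean stands in for the indication the processor transmits to the audio tracker circuitry over the data communication channel.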
Abstract:
A microphone array includes one or more front-facing microphones disposed on a front surface of a collaboration endpoint and a plurality of secondary microphones disposed on a second surface of the collaboration endpoint. Sound signals received at each of the one or more front-facing microphones and the plurality of secondary microphones are converted into microphone signals. When the sound signals have a frequency below a threshold frequency, an output signal is generated from the microphone signals generated by the one or more front-facing microphones and the plurality of secondary microphones. When the sound signals have a frequency at or above the threshold frequency, an output signal is generated from the microphone signals generated by only the one or more front-facing microphones.
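A frequency-domain sketch of the described crossover, assuming an FFT split at the threshold frequency and equal-weight summing of microphones; the 1 kHz default is an illustrative assumption:

```python
import numpy as np

def crossover_output(front_mics, secondary_mics, sample_rate, f_threshold=1000.0):
    """Combine all microphones below f_threshold but only the front-facing
    microphones at or above it. Inputs are lists of equal-length 1-D arrays."""
    all_mics = front_mics + secondary_mics
    n = len(front_mics[0])
    freqs = np.fft.rfftfreq(n, d=1.0 / sample_rate)
    low_band = freqs < f_threshold

    spec_all = sum(np.fft.rfft(m) for m in all_mics) / len(all_mics)
    spec_front = sum(np.fft.rfft(m) for m in front_mics) / len(front_mics)

    # Low band from every microphone, high band from the front-facing ones only.
    return np.fft.irfft(np.where(low_band, spec_all, spec_front), n=n)
```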
Abstract:
The disclosed technology relates to a microphone array. The array comprises a plurality of microphones, each microphone having a horn portion. Each microphone of the array further comprises an instrument disposed at a distal end of the horn portion. Each instrument of the array is configured to convert sound waves into an electrical signal. The microphone array further comprises a beamforming signal processing circuit electrically coupled to each instrument and configured to create a plurality of beam signals based on the respective electrical signals.
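A minimal delay-and-sum sketch of the beamforming circuit's job, assuming known microphone positions and steering angles in a plane; the geometry, angles, and sound speed are illustrative assumptions, as the abstract does not prescribe a beamforming method:

```python
import numpy as np

def delay_and_sum_beams(mic_signals, mic_positions, angles_deg,
                        sample_rate, speed_of_sound=343.0):
    """Create one beam signal per steering angle from the per-instrument
    electrical signals via integer-sample delay-and-sum."""
    beams = []
    for angle in np.deg2rad(angles_deg):
        direction = np.array([np.cos(angle), np.sin(angle)])
        summed = np.zeros(len(mic_signals[0]))
        for sig, pos in zip(mic_signals, mic_positions):
            # Time-of-flight difference along the steering direction.
            delay = int(round(np.dot(pos, direction) / speed_of_sound * sample_rate))
            summed += np.roll(sig, -delay)  # circular shift; adequate for a sketch
        beams.append(summed / len(mic_signals))
    return beams
```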
Abstract:
A processing system can include a processor that includes circuitry. The circuitry can be configured to: receive far-end and near-end audio signals; detect silence events and voice activities from the audio signals; determine whether an audio event in the audio signals is an interference event or a speaker event based on the detected silence events and voice activities, and further based on localized acoustic source data and faces or motion detected from an image of the environment; and generate a mute or unmute indication based on whether the audio event is the interference event or the speaker event. The system can include a near-end microphone array to output the near-end audio signals, one or more far-end microphones to output the far-end audio signals, and one or more cameras to capture the image of the environment.
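A sketch of the classification and mute logic, reducing the cues the abstract lists to booleans; the specific rule (voice activity outside a silence window, plus a localized source and a detected face or motion, means a speaker event) is an illustrative assumption:

```python
from dataclasses import dataclass

@dataclass
class AudioEvent:
    voice_activity: bool     # voice activity detected in the audio signals
    in_silence_window: bool  # a silence event was detected around the audio event
    source_localized: bool   # localized acoustic source data available
    face_or_motion: bool     # faces or motion detected in the camera image

def classify_event(ev: AudioEvent) -> str:
    """Label the event a 'speaker' or 'interference' event from the cues."""
    if (ev.voice_activity and not ev.in_silence_window
            and ev.source_localized and ev.face_or_motion):
        return "speaker"
    return "interference"

def mute_indication(ev: AudioEvent) -> str:
    """Generate the mute/unmute indication from the classification."""
    return "unmute" if classify_event(ev) == "speaker" else "mute"
```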
Abstract:
At a microphone array, a soundfield is detected to produce a set of microphone signals, each from a corresponding microphone in the microphone array. The set of microphone signals represents the soundfield. The detected soundfield is decomposed into a set of sub-soundfield signals based on the set of microphone signals. Each sub-soundfield signal is processed such that it is separately dereverberated using the other sub-soundfield signals to remove reverberation from it, producing a set of processed sub-soundfield signals. The set of processed sub-soundfield signals is mixed into a mixed output signal.
Abstract:
At a microphone array, a soundfield is detected to produce a set of microphone signals, each from a corresponding microphone in the microphone array. The set of microphone signals represents the soundfield. The detected soundfield is decomposed into a set of sub-soundfield signals based on the set of microphone signals. Each sub-soundfield signal is separately dereverberated to remove reverberation from it, producing a set of processed sub-soundfield signals. The set of processed sub-soundfield signals is mixed into a mixed output signal.
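A sketch of the decompose/dereverberate/mix pipeline shared by the two abstracts above, assuming the sub-soundfields are already available as rows of an array, and using a simple least-squares cancellation that treats the other sub-soundfields as reverberation references (the abstracts do not prescribe the dereverberation filter):

```python
import numpy as np

def dereverberate_one(target: np.ndarray, references: np.ndarray) -> np.ndarray:
    """Subtract from `target` the component linearly predictable from the
    other sub-soundfield signals (rows of `references`), via least squares."""
    coeffs, *_ = np.linalg.lstsq(references.T, target, rcond=None)
    return target - references.T @ coeffs

def process_soundfield(sub_soundfields: np.ndarray) -> np.ndarray:
    """sub_soundfields: (num_beams, num_samples) decomposed signals.
    Dereverberate each one against the others, then mix to a single output."""
    processed = [
        dereverberate_one(sub_soundfields[i], np.delete(sub_soundfields, i, axis=0))
        for i in range(sub_soundfields.shape[0])
    ]
    return np.mean(processed, axis=0)  # simple equal-weight mix
```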
Abstract:
A video conference endpoint determines a position of a best audio pick-up region for placement of a sound source relative to a microphone whose receive pattern is configured to capture sound signals from the best region. The endpoint captures an image of a scene that encompasses the best region and displays the image of the scene. The endpoint also generates an image representative of the best region and displays it as an overlay on the scene image.
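A sketch of the overlay step, assuming the best pick-up region maps to an axis-aligned rectangle in image coordinates and is marked with a translucent alpha blend; the coordinates, color, and blend weight are illustrative assumptions:

```python
import numpy as np

def overlay_best_region(scene: np.ndarray, region_box, alpha=0.4,
                        color=(0, 255, 0)):
    """Blend a translucent highlight over the best audio pick-up region.
    scene: (H, W, 3) uint8 image; region_box: (x0, y0, x1, y1) pixels."""
    out = scene.astype(float)
    x0, y0, x1, y1 = region_box
    out[y0:y1, x0:x1] = (1 - alpha) * out[y0:y1, x0:x1] + alpha * np.array(color)
    return out.astype(np.uint8)
```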
Abstract:
A telepresence video conference endpoint device includes spaced-apart microphone arrays, each configured to transduce sound into corresponding sound signals. A processor receives the sound signals from the arrays and determines a direction-of-arrival (DOA) of sound at each array based on the set of sound signals from that array, determines whether each array is blocked or unblocked based on the DOA determined for that array, selects an array from among the arrays based on whether each array is determined to be blocked or unblocked, and performs subsequent sound processing based on one or more of the sound signals from the selected array.
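A sketch of the blocked-array test and selection, assuming each array reports a single DOA and that a blocked array manifests as a DOA outside an expected frontal sector; the sector bounds and the fallback to the first array are illustrative assumptions, since the abstract leaves the blockage test open:

```python
def array_is_blocked(doa_deg: float, valid_sector=(-60.0, 60.0)) -> bool:
    """Treat a DOA outside the expected frontal sector as a sign that the
    array is blocked (one heuristic among many)."""
    return not (valid_sector[0] <= doa_deg <= valid_sector[1])

def select_array(doas_deg):
    """Return the index of the first unblocked array; fall back to array 0
    if every array appears blocked."""
    for i, doa in enumerate(doas_deg):
        if not array_is_blocked(doa):
            return i
    return 0
```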