Abstract:
A processing system can include a processor that includes circuitry. The circuitry can be configured to: receive far-end and near-end audio signals; detect silence events and voice activities from the audio signals; determine whether an audio event in the audio signals is an interference event or a speaker event based on the detected silence events and voice activities, and further based on localized acoustic source data and faces or motion detected from an image of an environment; and generate a mute or unmute indication based on whether the audio event is the interference event or the speaker event. The system can include a near-end microphone array to output the near-end audio signals, one or more far-end microphones to output the far-end audio signals, and one or more cameras to capture the image of the environment.
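The decision described in this abstract can be illustrated with a minimal sketch. The abstract does not state the actual decision rule, so everything here is an assumption made for illustration: the names `Observation`, `classify_event`, and `mute_indication`, the 15-degree tolerance, and the rule that a voiced near-end source counts as a speaker only when its localized direction lines up with a detected face. Far-end signals and motion cues are carried in the data structure but not used by this simplified rule. Angles are assumed to lie within the camera's field of view, so no wrap-around handling is included.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Observation:
    """Hypothetical per-frame inputs; field names are illustrative only."""
    near_end_voice: bool                # voice activity detected in the near-end signal
    near_end_silence: bool              # silence event detected in the near-end signal
    far_end_voice: bool                 # voice activity detected in the far-end signal (unused here)
    source_angle_deg: Optional[float]   # localized acoustic source direction, if any
    face_angles_deg: Sequence[float]    # directions of faces detected in the image
    motion_detected: bool               # motion detected in the image (unused here)


def classify_event(obs: Observation, tolerance_deg: float = 15.0) -> str:
    """Return "speaker", "interference", or "none" under the assumed rule."""
    # No audio event to classify during silence or when nothing was localized.
    if obs.near_end_silence or obs.source_angle_deg is None:
        return "none"
    # Assumed rule: the source is a speaker if it is voiced and a face lies
    # within tolerance_deg of the localized source direction.
    near_face = any(
        abs(obs.source_angle_deg - angle) <= tolerance_deg
        for angle in obs.face_angles_deg
    )
    if obs.near_end_voice and near_face:
        return "speaker"
    return "interference"


def mute_indication(event: str) -> Optional[str]:
    """Map an event to a mute or unmute indication; None means no change."""
    return {"speaker": "unmute", "interference": "mute"}.get(event)
```

For example, a voiced source localized at 10 degrees with a face detected at 12 degrees would be classified as a speaker event and yield an unmute indication, while the same source with no face nearby would be treated as interference and yield a mute indication.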
Abstract:
A processing system can include a tracking microphone array; audio tracker circuitry connected to the tracking microphone array to track an audio source based on an audio input from the array; communication microphones; and a processor. The processor can include audio circuitry to receive an audio input from the communication microphones and process the audio input by applying one or more of acoustic echo cancellation (AEC) and acoustic echo suppression (AES) processing. The processor can further include calculating circuitry to calculate a ratio of the signal power after the AEC and/or AES processing to the signal power before that processing, and control circuitry to generate an acoustic echo presence indication based on the calculated ratio. The processor can transmit, via transmitting circuitry, the acoustic echo presence indication to the audio tracker circuitry via a data communication channel between the processor and the audio tracker circuitry.
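The power-ratio calculation can be sketched as follows. This is only an illustration under stated assumptions: the function names, the mean-square power estimate, the 0.5 threshold, and the rule that a low after-to-before ratio indicates echo (because the AEC/AES stage removed a large share of the signal power) are not taken from the abstract, which says only that the indication is based on the ratio.

```python
from typing import Sequence


def signal_power(frame: Sequence[float]) -> float:
    """Mean-square power of one frame of samples (0.0 for an empty frame)."""
    if not frame:
        return 0.0
    return sum(sample * sample for sample in frame) / len(frame)


def echo_power_ratio(before: Sequence[float],
                     after: Sequence[float],
                     floor: float = 1e-12) -> float:
    """Ratio of signal power after AEC/AES processing to power before it.

    If the input frame is effectively silent, return 1.0 so that silence
    is not mistaken for strong echo attenuation (an assumed convention).
    """
    power_before = signal_power(before)
    if power_before <= floor:
        return 1.0
    return signal_power(after) / power_before


def echo_present(before: Sequence[float],
                 after: Sequence[float],
                 threshold: float = 0.5) -> bool:
    """Assumed rule: echo is indicated when processing removed enough power
    that the after-to-before ratio falls below the threshold."""
    return echo_power_ratio(before, after) < threshold
```

Under these assumptions, a frame whose amplitude drops by a factor of ten after processing has a power ratio of about 0.01 and would raise the echo presence indication, while a frame that passes through unchanged has a ratio of 1.0 and would not.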