Abstract:
A mobile device capable of automatically starting and ending the recording of an audio signal captured by at least one microphone is presented. The mobile device can adjust a number of parameters related to audio logging based on context information of the audio input signal.
Abstract:
A personalized (i.e., speaker-derivable) bandwidth extension is provided in which the model used for bandwidth extension is personalized (e.g., tailored) to each specific user. A training phase is performed to generate a bandwidth extension model that is personalized to a user. The model may be subsequently used in a bandwidth extension phase during a phone call involving the user. The bandwidth extension phase, using the personalized bandwidth extension model, will be activated when a higher band (e.g., wideband) is not available and the call is taking place on a lower band (e.g., narrowband).
Abstract:
A system which performs social interaction analysis for a plurality of participants includes a processor. The processor is configured to determine a similarity between a first spatially filtered output and each of a plurality of second spatially filtered outputs. The processor is configured to determine the social interaction between the participants based on the similarities between the first spatially filtered output and each of the second spatially filtered outputs and display an output that is representative of the social interaction between the participants. The first spatially filtered output is received from a fixed microphone array, and the second spatially filtered outputs are received from a plurality of steerable microphone arrays each corresponding to a different participant.
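The similarity step can be illustrated with a minimal sketch. The abstract does not say which similarity measure is used, so the zero-lag normalized cross-correlation below is an assumption, as are the function names and the dictionary keyed by participant name.

```python
import math


def normalized_correlation(a, b):
    """Zero-lag normalized cross-correlation of two equal-length sequences.

    Assumed similarity measure; the result lies in [-1, 1].
    """
    if len(a) != len(b):
        raise ValueError("signals must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a silent signal is treated as dissimilar to everything
    return dot / (norm_a * norm_b)


def similarities(fixed_output, steerable_outputs):
    """Compare the fixed-array output with each participant's steerable output.

    steerable_outputs is a hypothetical mapping: participant name -> samples.
    """
    return {
        name: normalized_correlation(fixed_output, samples)
        for name, samples in steerable_outputs.items()
    }


def most_similar_participant(fixed_output, steerable_outputs):
    """Return the participant whose steerable output best matches the fixed output."""
    scores = similarities(fixed_output, steerable_outputs)
    return max(scores, key=scores.get)
```

A practical system would likely compare short frames over time and search over lags, since the arrays are not sample-aligned; the sketch only shows the comparison of one fixed output against many steerable outputs.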
Abstract:
Disclosed is an application interface that takes into account the user's gaze direction relative to who is speaking in an interactive multi-participant environment where audio-based contextual information and/or visual-based semantic information is being presented. Among these various implementations, two different types of microphone array devices (MADs) may be used. The first type of MAD is a steerable microphone array (a.k.a. a steerable array) which is worn by a user in a known orientation with regard to the user's eyes, and wherein multiple users may each wear a steerable array. The second type of MAD is a fixed-location microphone array (a.k.a. a fixed array) which is placed in the same acoustic space as the users (one or more of which are using steerable arrays).
Abstract:
A system that tracks a social interaction between a plurality of participants includes a fixed beamformer that is adapted to output a first spatially filtered output and configured to receive a plurality of second spatially filtered outputs from a plurality of steerable beamformers. Each steerable beamformer outputs a respective one of the second spatially filtered outputs associated with a different one of the participants. The system also includes a processor capable of determining a similarity between the first spatially filtered output and each of the second spatially filtered outputs. The processor determines the social interaction between the participants based on the similarity between the first spatially filtered output and each of the second spatially filtered outputs.
Abstract:
A method for signal level matching by an electronic device is described. The method includes capturing a plurality of audio signals from a plurality of microphones. The method also includes determining a difference signal based on an inter-microphone subtraction. The difference signal includes multiple harmonics. The method also includes determining whether a harmonicity of the difference signal exceeds a harmonicity threshold. The method also includes preserving the harmonics to determine an envelope. The method further includes applying the envelope to a noise-suppressed signal.
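The first steps of this method (inter-microphone subtraction followed by a harmonicity test) can be sketched as follows. The abstract does not define the harmonicity measure, so the peak of the normalized autocorrelation over a range of candidate pitch lags is an assumption, as are the function names, the lag range, and the default threshold. The envelope determination and its application to the noise-suppressed signal are not shown.

```python
def difference_signal(mic_a, mic_b):
    """Sample-wise inter-microphone subtraction of two aligned channels."""
    return [a - b for a, b in zip(mic_a, mic_b)]


def harmonicity(signal, min_lag, max_lag):
    """Peak normalized autocorrelation over candidate pitch lags.

    Assumed measure: close to 1 for a strongly periodic signal,
    0.0 for silence.
    """
    energy = sum(x * x for x in signal)
    if energy == 0.0:
        return 0.0
    best = 0.0
    for lag in range(min_lag, max_lag + 1):
        acc = sum(signal[n] * signal[n - lag] for n in range(lag, len(signal)))
        best = max(best, acc / energy)
    return best


def exceeds_harmonicity_threshold(mic_a, mic_b, threshold=0.5,
                                  min_lag=4, max_lag=16):
    """Return True when the difference signal looks harmonic enough.

    threshold, min_lag and max_lag are illustrative values only.
    """
    diff = difference_signal(mic_a, mic_b)
    return harmonicity(diff, min_lag, max_lag) > threshold
```

For a periodic test frame captured more strongly on one microphone than the other, the difference signal keeps the periodic structure and the test returns True; for two identical channels the difference is zero and it returns False.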
Abstract:
Apparatus and methods for audio noise attenuation are disclosed. An audio signal analyzer can determine whether an input audio signal received from a microphone device includes a noise signal having identifiable content. If there is a noise signal having identifiable content, a content source is accessed to obtain a copy of the noise signal. An audio canceller can generate a processed audio signal, having an attenuated noise signal, based on comparing the copy of the noise signal to the input audio signal. Additionally or alternatively, data may be communicated on a communication channel to a separate media device to receive at least a portion of the copy of the noise signal from the separate media device, or to receive content-identification data corresponding to the content source.
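As a rough illustration of attenuating a noise signal once a copy of it is available, the sketch below estimates how strongly the copy appears in the microphone input and subtracts the scaled copy. It assumes the copy is already time-aligned with the input and reaches the microphone with a single gain; the abstract does not specify the comparison, and a real canceller would more plausibly use an adaptive filter. All names are hypothetical.

```python
def estimate_gain(reference, observed):
    """Least-squares gain of the reference copy within the observed input."""
    ref_energy = sum(r * r for r in reference)
    if ref_energy == 0.0:
        return 0.0  # nothing to cancel when the reference is silent
    return sum(r * o for r, o in zip(reference, observed)) / ref_energy


def attenuate_known_noise(observed, reference):
    """Subtract the scaled reference copy from the observed input.

    Assumes both sequences are time-aligned and of equal length.
    """
    if len(observed) != len(reference):
        raise ValueError("signals must have the same length")
    gain = estimate_gain(reference, observed)
    return [o - gain * r for o, r in zip(observed, reference)]
```

When the wanted signal is uncorrelated with the reference over the frame, the estimated gain matches the level of the noise in the input and the subtraction leaves the wanted signal.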