Abstract:
A method for encoding three-dimensional audio by a wireless communication device is disclosed. The wireless communication device detects an indication of a plurality of localizable audio sources. The wireless communication device also records a plurality of audio signals associated with the plurality of localizable audio sources. The wireless communication device also encodes the plurality of audio signals.
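As a rough illustration of the flow this abstract describes (detect localizable sources, record their signals, encode them), the following Python sketch assumes the device already exposes per-source beamformed channels; the energy threshold, the uniform quantizer standing in for a codec, and all function names are illustrative and not taken from the abstract.

```python
import numpy as np

def detect_localizable_sources(beams, energy_threshold=1e-3):
    """Flag beamformed channels whose energy suggests a localizable source.

    `beams` is an (n_beams, n_samples) array of beamformed audio; the
    threshold is an illustrative placeholder, not a value from the abstract.
    """
    energies = np.mean(beams ** 2, axis=1)
    return np.flatnonzero(energies > energy_threshold)

def encode_sources(beams, source_indices, bits=16):
    """Record and encode the signals of the detected sources (simple uniform
    quantization stands in for whatever codec the device actually uses)."""
    scale = 2 ** (bits - 1) - 1
    encoded = {}
    for idx in source_indices:
        signal = np.clip(beams[idx], -1.0, 1.0)
        encoded[int(idx)] = np.round(signal * scale).astype(np.int16)
    return encoded

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    beams = rng.normal(0.0, 0.01, size=(4, 16000))        # 4 beams, 1 s at 16 kHz
    beams[1] += 0.5 * np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)
    sources = detect_localizable_sources(beams)
    payload = encode_sources(beams, sources)
    print("localizable sources:", sources, "encoded:", list(payload))
```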
Abstract:
Systems, methods, and apparatus for projecting an estimated direction of arrival of sound onto a plane that does not include the estimated direction are described.
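A minimal sketch of one way such a projection might be computed, assuming the estimated DOA is available as a 3-D unit vector and the plane is given by its normal; the function name and the re-normalization step are illustrative choices, not details from the abstract.

```python
import numpy as np

def project_doa_onto_plane(doa, plane_normal):
    """Project an estimated DOA vector onto a plane given by its normal.

    Removes the component of `doa` along `plane_normal` and re-normalizes,
    yielding the in-plane direction (e.g., for a 2-D display of a 3-D estimate).
    """
    n = np.asarray(plane_normal, dtype=float)
    n /= np.linalg.norm(n)
    d = np.asarray(doa, dtype=float)
    d /= np.linalg.norm(d)
    in_plane = d - np.dot(d, n) * n
    norm = np.linalg.norm(in_plane)
    if norm < 1e-9:                      # DOA is (nearly) perpendicular to the plane
        return None
    return in_plane / norm

if __name__ == "__main__":
    doa = [0.3, 0.4, 0.866]              # estimated 3-D arrival direction
    print(project_doa_onto_plane(doa, plane_normal=[0.0, 0.0, 1.0]))
```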
Abstract:
A method of selectively authorizing access includes obtaining, at an authentication device, first information corresponding to first synthetic biometric data. The method also includes obtaining, at the authentication device, first common synthetic data and second biometric data. The method further includes generating, at the authentication device, second common synthetic data based on the first information and the second biometric data. The method also includes selectively authorizing, by the authentication device, access based on a comparison of the first common synthetic data and the second common synthetic data.
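The abstract does not say how the common synthetic data is generated or compared; the sketch below uses an HMAC as a stand-in generator and an exact (constant-time) match, which presumes the biometric sample has already been reduced to a stable, repeatable template. All identifiers here are hypothetical.

```python
import hashlib
import hmac

def generate_common_synthetic(info: bytes, biometric: bytes) -> bytes:
    """Derive 'common synthetic data' from information about the first
    synthetic biometric and a captured biometric sample. An HMAC is used
    purely as a placeholder for the unspecified generation function."""
    return hmac.new(info, biometric, hashlib.sha256).digest()

def authorize(first_common_synthetic: bytes, info: bytes, second_biometric: bytes) -> bool:
    """Selectively authorize access by comparing the stored first common
    synthetic data against the regenerated second common synthetic data."""
    second_common_synthetic = generate_common_synthetic(info, second_biometric)
    return hmac.compare_digest(first_common_synthetic, second_common_synthetic)

if __name__ == "__main__":
    info = b"first-synthetic-biometric-info"
    enrolled = generate_common_synthetic(info, b"fingerprint-template-v1")
    print(authorize(enrolled, info, b"fingerprint-template-v1"))   # True
    print(authorize(enrolled, info, b"someone-else"))              # False
```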
Abstract:
A method of processing audio may include receiving, by a computing device, a plurality of real-time audio signals output by a plurality of microphones communicatively coupled to the computing device. The computing device may output to a display a graphical user interface (GUI) that presents audio information associated with the received audio signals. The one or more received audio signals may be processed based on a user input associated with the audio information presented via the GUI to generate one or more processed audio signals. The one or more processed audio signals may be output to one or more output devices, such as speakers, headsets, and the like.
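One way the "process based on user input" step could look, assuming the GUI exposes per-microphone level meters plus gain and mute controls; no actual GUI toolkit is shown, and the function names and default values are illustrative only.

```python
import numpy as np

def level_meters(mic_signals):
    """RMS level per microphone: the kind of audio information a GUI might display."""
    return np.sqrt(np.mean(mic_signals ** 2, axis=1))

def mix_with_user_settings(mic_signals, gains_db, muted):
    """Apply per-microphone gain/mute choices (as a user might make via the GUI)
    and mix the result down to one processed output signal."""
    gains = 10.0 ** (np.asarray(gains_db, dtype=float) / 20.0)
    gains[np.asarray(muted, dtype=bool)] = 0.0
    processed = (mic_signals * gains[:, None]).sum(axis=0)
    return np.clip(processed, -1.0, 1.0)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    mics = rng.normal(0.0, 0.1, size=(3, 4800))          # 3 mics, 100 ms at 48 kHz
    print("levels:", level_meters(mics))
    out = mix_with_user_settings(mics, gains_db=[0.0, -6.0, 3.0], muted=[False, True, False])
    print("processed frame shape:", out.shape)
```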
Abstract:
A method for speech modeling by an electronic device is described. The method includes obtaining a real-time noise reference based on a noisy speech signal. The method also includes obtaining a real-time noise dictionary based on the real-time noise reference. The method further includes obtaining a first speech dictionary and a second speech dictionary. The method additionally includes reducing residual noise based on the real-time noise dictionary and the first speech dictionary to produce a residual noise-suppressed speech signal at a first modeling stage. The method also includes generating a reconstructed speech signal based on the residual noise-suppressed speech signal and the second speech dictionary at a second modeling stage.
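The abstract does not specify the decomposition, but dictionary-based two-stage modeling is often realized with non-negative factorizations; the sketch below assumes magnitude spectrograms and NMF-style multiplicative updates, with a Wiener-style mask for the residual-noise stage. Dictionary shapes, iteration counts, and all names are assumptions.

```python
import numpy as np

def nnls_activations(W, V, iters=100, eps=1e-9):
    """Non-negative activations H for a fixed dictionary W (multiplicative
    updates, Euclidean cost), approximating V ~= W @ H."""
    H = np.full((W.shape[1], V.shape[1]), 0.1)
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
    return H

def two_stage_speech_model(V_noisy, D_noise, D_speech1, D_speech2, eps=1e-9):
    """Stage 1: suppress residual noise with [noise | first speech] dictionaries.
    Stage 2: re-model the cleaned magnitudes with the second speech dictionary."""
    # Stage 1: decompose onto noise + speech atoms and keep the speech share.
    W1 = np.hstack([D_noise, D_speech1])
    H1 = nnls_activations(W1, V_noisy)
    noise_part = D_noise @ H1[:D_noise.shape[1]]
    speech_part = D_speech1 @ H1[D_noise.shape[1]:]
    mask = speech_part / (speech_part + noise_part + eps)   # Wiener-style mask
    V_suppressed = mask * V_noisy
    # Stage 2: reconstruct speech magnitudes from the second dictionary.
    H2 = nnls_activations(D_speech2, V_suppressed)
    return D_speech2 @ H2

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    F, T, K = 64, 40, 8                                      # bins, frames, atoms
    D_noise, D_s1, D_s2 = (np.abs(rng.normal(size=(F, K))) for _ in range(3))
    V_noisy = np.abs(rng.normal(size=(F, T)))
    print(two_stage_speech_model(V_noisy, D_noise, D_s1, D_s2).shape)
```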
Abstract:
A method for multi-channel echo cancellation and noise suppression is described. One of multiple echo estimates is selected for non-linear echo cancellation. Echo notch masking is performed on a noise-suppressed signal based on an echo direction of arrival (DOA) to produce an echo-suppressed signal. Non-linear echo cancellation is performed on the echo-suppressed signal based, at least in part, on the selected echo estimate.
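A loose sketch of the three operations named in the abstract, with illustrative stand-ins: correlation for selecting among echo estimates, a fixed-width gain notch around the echo DOA for the masking step, and magnitude spectral subtraction as the non-linear cancellation. None of these specific choices come from the abstract.

```python
import numpy as np

def select_echo_estimate(candidates, reference):
    """Pick the echo estimate most correlated with the reference signal."""
    corrs = [abs(np.corrcoef(c, reference)[0, 1]) for c in candidates]
    return candidates[int(np.argmax(corrs))]

def echo_notch_mask(frames, frame_doas_deg, echo_doa_deg, notch_width_deg=15.0, floor=0.1):
    """Attenuate frames whose estimated DOA falls inside a notch around the
    echo DOA, producing an echo-suppressed signal."""
    gains = np.where(np.abs(frame_doas_deg - echo_doa_deg) < notch_width_deg, floor, 1.0)
    return frames * gains[:, None]

def nonlinear_echo_cancellation(signal, echo_estimate, oversubtract=1.5, floor=0.05):
    """Spectral-subtraction-style non-linear cancellation of the selected estimate."""
    S, E = np.fft.rfft(signal), np.fft.rfft(echo_estimate)
    mag = np.maximum(np.abs(S) - oversubtract * np.abs(E), floor * np.abs(S))
    return np.fft.irfft(mag * np.exp(1j * np.angle(S)), n=len(signal))

if __name__ == "__main__":
    rng = np.random.default_rng(5)
    mic = rng.normal(size=1024)
    candidates = [rng.normal(size=1024), mic + 0.2 * rng.normal(size=1024)]
    est = select_echo_estimate(candidates, mic)
    frames = rng.normal(size=(10, 256))
    suppressed = echo_notch_mask(frames, np.linspace(0, 180, 10), echo_doa_deg=90.0)
    cleaned = nonlinear_echo_cancellation(mic, est)
    print(suppressed.shape, cleaned.shape)
```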
Abstract:
A personalized (i.e., speaker-derivable) bandwidth extension is provided in which the model used for bandwidth extension is personalized (e.g., tailored) to each specific user. A training phase is performed to generate a bandwidth extension model that is personalized to a user. The model may be subsequently used in a bandwidth extension phase during a phone call involving the user. The bandwidth extension phase, using the personalized bandwidth extension model, will be activated when a higher band (e.g., wideband) is not available and the call is taking place on a lower band (e.g., narrowband).
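The activation rule is stated directly in the abstract; the extension itself is sketched here with toy spectral folding shaped by a single per-user gain that is assumed to come from the training phase. A real personalized model would be far richer, so treat this only as a sketch under those assumptions.

```python
import numpy as np

def should_extend(call_band: str, wideband_available: bool) -> bool:
    """Activate bandwidth extension only when the call runs on a lower band
    (e.g., narrowband) and a higher band is not available."""
    return call_band == "narrowband" and not wideband_available

def extend_bandwidth(nb_signal, user_highband_gain):
    """Toy spectral-folding extension: zero-stuff to double the sample rate
    (which mirrors the baseband into the high band) and shape that image with
    a per-user gain assumed to come from the training phase."""
    wb = np.zeros(2 * len(nb_signal))
    wb[::2] = nb_signal                        # 8 kHz -> 16 kHz by zero insertion
    spec = np.fft.rfft(wb)
    cut = len(spec) // 2
    spec[cut:] *= user_highband_gain           # personalized high-band shaping
    return np.fft.irfft(spec, n=len(wb))       # amplitude normalization omitted

if __name__ == "__main__":
    nb = np.sin(2 * np.pi * 300 * np.arange(8000) / 8000.0)   # 1 s narrowband tone
    if should_extend("narrowband", wideband_available=False):
        print("extended length:", len(extend_bandwidth(nb, user_highband_gain=0.3)))
```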
Abstract:
In general, techniques are described for image generation for a collaborative sound system. A headend device comprising a processor may perform these techniques. The processor may be configured to determine a location of a mobile device participating in a collaborative surround sound system as a speaker of a plurality of speakers of the collaborative surround sound system. The processor may further be configured to generate an image that depicts the location of the mobile device that is participating in the collaborative surround sound system relative to the plurality of other speakers of the collaborative surround sound system.
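A coarse, text-based stand-in for the generated image, assuming 2-D room coordinates for the mobile device and the other speakers; the character-grid rendering and coordinate range are illustrative and are not the image-generation method of the abstract.

```python
import numpy as np

def render_layout(mobile_xy, speaker_xys, size=21):
    """Render a coarse top-down picture of the mobile device's position
    relative to the other speakers in the collaborative system."""
    grid = np.full((size, size), ".", dtype="<U1")

    def to_cell(xy):
        # Map room coordinates in [-1, 1] x [-1, 1] to grid cells.
        col = int(round((xy[0] + 1) / 2 * (size - 1)))
        row = int(round((1 - (xy[1] + 1) / 2) * (size - 1)))
        return row, col

    for xy in speaker_xys:
        r, c = to_cell(xy)
        grid[r, c] = "S"                      # other speakers
    r, c = to_cell(mobile_xy)
    grid[r, c] = "M"                          # participating mobile device
    return "\n".join("".join(row) for row in grid)

if __name__ == "__main__":
    print(render_layout(mobile_xy=(0.2, -0.4),
                        speaker_xys=[(-0.8, 0.8), (0.8, 0.8), (-0.8, -0.8), (0.8, -0.8)]))
```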
Abstract:
A system which performs social interaction analysis for a plurality of participants includes a processor. The processor is configured to determine a similarity between a first spatially filtered output and each of a plurality of second spatially filtered outputs. The processor is configured to determine the social interaction between the participants based on the similarities between the first spatially filtered output and each of the second spatially filtered outputs and display an output that is representative of the social interaction between the participants. The first spatially filtered output is received from a fixed microphone array, and the second spatially filtered outputs are received from a plurality of steerable microphone arrays each corresponding to a different participant.
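One plausible reading of the similarity computation, assuming the spatially filtered outputs are time-domain signals and cosine similarity is an acceptable proxy; participant names and signal values are hypothetical.

```python
import numpy as np

def cosine_similarity(a, b, eps=1e-12):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps))

def social_interaction(fixed_output, steerable_outputs):
    """Similarity between the fixed-array output and each participant's
    steerable-array output; higher values suggest the participant is engaged
    with the source the fixed array is capturing."""
    return {name: cosine_similarity(fixed_output, out)
            for name, out in steerable_outputs.items()}

if __name__ == "__main__":
    rng = np.random.default_rng(3)
    talker = rng.normal(size=8000)                          # dominant talker
    fixed = talker + 0.1 * rng.normal(size=8000)            # fixed microphone array
    steerable = {
        "participant_A": talker + 0.3 * rng.normal(size=8000),   # steered at the talker
        "participant_B": rng.normal(size=8000),                   # steered elsewhere
    }
    scores = social_interaction(fixed, steerable)
    print({name: round(score, 2) for name, score in scores.items()})
```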
Abstract:
A method for detecting voice activity by an electronic device is described. The method includes detecting near end speech based on a near end voiced speech detector and at least one single channel voice activity detector. The near end voiced speech detector is associated with a harmonic statistic based on a speech pitch histogram.
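A simplified stand-in for the described detector: the harmonic statistic is taken as the normalized autocorrelation peak inside a pitch range that a speech pitch histogram might suggest, combined with an energy-based single-channel VAD. The thresholds and the 80-300 Hz range are assumptions, not values from the abstract.

```python
import numpy as np

def harmonic_statistic(frame, fs, pitch_range_hz=(80.0, 300.0)):
    """Normalized autocorrelation peak inside the pitch range suggested by a
    speech pitch histogram; large values indicate voiced (harmonic) speech."""
    frame = frame - np.mean(frame)
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    ac /= ac[0] + 1e-12
    lo = int(fs / pitch_range_hz[1])          # shortest expected pitch period
    hi = int(fs / pitch_range_hz[0])          # longest expected pitch period
    return float(np.max(ac[lo:hi]))

def energy_vad(frame, threshold=1e-4):
    """Single-channel energy-based voice activity decision."""
    return float(np.mean(frame ** 2)) > threshold

def near_end_speech(frame, fs, harmonic_threshold=0.4):
    """Flag near-end speech when the harmonic statistic and the single-channel
    VAD both indicate activity."""
    return harmonic_statistic(frame, fs) > harmonic_threshold and energy_vad(frame)

if __name__ == "__main__":
    fs = 16000
    t = np.arange(0, 0.032, 1.0 / fs)                       # one 32 ms frame
    voiced = 0.3 * np.sin(2 * np.pi * 150 * t) \
        + 0.01 * np.random.default_rng(4).normal(size=t.size)
    print(near_end_speech(voiced, fs))                      # expected: True
```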