Abstract:
A device includes a memory, a receiver, a processor, and a display. The memory is configured to store a speaker model. The receiver is configured to receive an input audio signal. The processor is configured to determine a first confidence level associated with a first portion of the input audio signal based on the speaker model. The processor is also configured to determine a second confidence level associated with a second portion of the input audio signal based on the speaker model. The display is configured to present a graphical user interface associated with the first confidence level or associated with the second confidence level.
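As a rough illustration of scoring two portions of an input against one stored speaker model, the sketch below is a minimal example under stated assumptions. The abstract does not say how the speaker model is represented or how a confidence level is computed. Here the model is assumed to be a single reference feature vector, and the confidence is assumed to be a cosine similarity rescaled to the range 0 to 1. The names `cosine_confidence` and `portion_confidences` are hypothetical.

```python
import math


def cosine_confidence(features, speaker_model):
    """Map the cosine similarity between two vectors to a confidence in [0, 1].

    Assumption: the speaker model is one reference feature vector and each
    audio portion has already been reduced to a vector of the same length.
    """
    dot = sum(f * m for f, m in zip(features, speaker_model))
    norm = math.sqrt(sum(f * f for f in features)) * math.sqrt(
        sum(m * m for m in speaker_model)
    )
    if norm == 0:
        return 0.0  # no usable evidence in an all-zero vector
    return 0.5 * (1.0 + dot / norm)


def portion_confidences(portions, speaker_model):
    """Return one confidence level per portion of the input audio signal."""
    return [cosine_confidence(p, speaker_model) for p in portions]
```

Calling `portion_confidences([first_portion, second_portion], model)` would give the two values that a graphical user interface could then present, for example as a per-portion indicator.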
Abstract:
A method for enhancing an audio signal by an electronic device is described. The method includes determining formant peaks based on an audio signal. The method also includes generating formant peak models. Generating formant peak models includes individually modeling each formant peak. The method further includes generating a global envelope based on the formant peak models.
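The three steps named above (find the formant peaks, model each peak individually, combine the models into a global envelope) could be sketched as follows. This is an illustration only, not the claimed method. It assumes the input is a list of spectral magnitudes, that a peak is a simple local maximum, that each peak model is a Gaussian-shaped bump of fixed width, and that the global envelope is the bin-wise maximum over the peak models. The function names and the `width` parameter are hypothetical.

```python
import math


def find_peaks(spectrum):
    """Return the indices of local maxima (assumed stand-in for formant peaks)."""
    return [
        i
        for i in range(1, len(spectrum) - 1)
        if spectrum[i] > spectrum[i - 1] and spectrum[i] >= spectrum[i + 1]
    ]


def peak_model(center, height, width, n_bins):
    """Model one peak on its own as a Gaussian-shaped bump over all bins."""
    return [
        height * math.exp(-0.5 * ((k - center) / width) ** 2)
        for k in range(n_bins)
    ]


def global_envelope(spectrum, width=2.0):
    """Combine the individual peak models by taking the bin-wise maximum."""
    peaks = find_peaks(spectrum)
    if not peaks:
        return [0.0] * len(spectrum)
    models = [peak_model(p, spectrum[p], width, len(spectrum)) for p in peaks]
    return [max(values) for values in zip(*models)]
```

Modeling each peak separately keeps one peak's shape from distorting another's. A practical system would likely estimate peak positions and bandwidths more carefully than a local-maximum search does.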
Abstract:
Systems, methods, and apparatus for pitch trajectory analysis are described. Such techniques may be used to remove vocals and/or vibrato from an audio mixture signal. For example, such a technique may be used to pre-process the signal before an operation to decompose the mixture signal into individual instrument components.
Abstract:
A method for restoring a processed speech signal by an electronic device is described. The method includes obtaining at least one audio signal. The method also includes performing bin-wise voice activity detection based on the at least one audio signal. The method further includes restoring the processed speech signal based on the bin-wise voice activity detection.
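One way to picture bin-wise voice activity detection followed by restoration is the sketch below. It is a hedged illustration, since the abstract gives neither the detection rule nor the restoration rule. Here a frequency bin is assumed to be voice-active when its magnitude exceeds a multiple of a per-bin noise-floor estimate, and restoration is assumed to mean raising an over-suppressed bin back toward a fraction of the original magnitude. The names `binwise_vad` and `restore_frame`, and the parameters `snr_threshold` and `floor_gain`, are hypothetical.

```python
def binwise_vad(frame_mags, noise_floor, snr_threshold=2.0):
    """Return one flag per frequency bin: True where the bin looks voice-active.

    Assumption: a bin is active when its magnitude exceeds
    snr_threshold times the estimated noise floor for that bin.
    """
    return [
        mag > snr_threshold * floor
        for mag, floor in zip(frame_mags, noise_floor)
    ]


def restore_frame(processed_mags, original_mags, vad_flags, floor_gain=0.5):
    """Restore voice-active bins of a processed frame; leave other bins as-is.

    Assumption: in active bins the processed magnitude is not allowed to
    fall below floor_gain times the original magnitude.
    """
    restored = []
    for proc, orig, active in zip(processed_mags, original_mags, vad_flags):
        if active:
            restored.append(max(proc, floor_gain * orig))
        else:
            restored.append(proc)
    return restored
```

Deciding activity per bin rather than per frame means speech energy can be recovered in the bins where it was removed without re-admitting noise in the remaining bins of the same frame.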