Abstract:
A method for redundant frame coding by an electronic device is described. The method includes determining an adaptive codebook energy and a fixed codebook energy based on a frame. The method also includes coding a redundant version of the frame based on the adaptive codebook energy and the fixed codebook energy. The method further includes sending a subsequent frame.
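A minimal sketch of the energy comparison described above, assuming the encoder already has the adaptive-codebook and fixed-codebook excitation contributions for the frame; the mode names and the 2:1 energy ratio are illustrative assumptions, not taken from the source.

```python
import numpy as np

def energy(x):
    """Sum of squared samples."""
    return float(np.dot(x, x))

def choose_redundant_coding(adaptive_contrib, fixed_contrib):
    """Pick a (hypothetical) redundant-frame coding mode from the two
    codebook energies: when the adaptive (pitch) codebook energy dominates,
    keep pitch/gain parameters in the partial copy; otherwise keep only a
    generic partial copy."""
    e_acb = energy(adaptive_contrib)   # adaptive codebook energy
    e_fcb = energy(fixed_contrib)      # fixed codebook energy
    if e_acb > 2.0 * e_fcb:
        return "partial_copy_with_pitch"
    return "partial_copy_generic"

# usage: the contributions would come from the encoder's excitation synthesis
acb = np.random.randn(160) * 0.8
fcb = np.random.randn(160) * 0.3
mode = choose_redundant_coding(acb, fcb)
```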
Abstract:
A method for determining pitch pulse period signal boundaries by an electronic device is described. The method includes obtaining a signal. The method also includes determining a first averaged curve based on the signal. The method further includes determining at least one first averaged curve peak position based on the first averaged curve and a threshold. The method additionally includes determining pitch pulse period signal boundaries based on the at least one first averaged curve peak position. The method also includes synthesizing a speech signal.
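One possible reading of the averaged-curve approach, sketched below; the moving-average window, the relative threshold, and the choice of midpoints between peaks as boundaries are assumptions for illustration only.

```python
import numpy as np

def averaged_curve(signal, window=16):
    """First averaged curve: moving average of the signal magnitude."""
    kernel = np.ones(window) / window
    return np.convolve(np.abs(signal), kernel, mode="same")

def peak_positions(curve, threshold_ratio=0.6):
    """Local maxima of the averaged curve that exceed a threshold,
    here a fixed fraction of the curve's global peak."""
    thresh = threshold_ratio * curve.max()
    peaks = []
    for n in range(1, len(curve) - 1):
        if curve[n] >= thresh and curve[n] >= curve[n - 1] and curve[n] > curve[n + 1]:
            peaks.append(n)
    return peaks

def pitch_pulse_boundaries(signal):
    """Pitch pulse period signal boundaries placed midway between
    consecutive averaged-curve peak positions."""
    peaks = peak_positions(averaged_curve(signal))
    return [(a + b) // 2 for a, b in zip(peaks, peaks[1:])]
```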
Abstract:
A method includes determining a first modeled high-band signal based on a low-band excitation signal of an audio signal, where the audio signal includes a high-band portion and a low-band portion. The method also includes determining scaling factors based on energy of sub-frames of the first modeled high-band signal and energy of corresponding sub-frames of the high-band portion of the audio signal. The method includes applying the scaling factors to a modeled high-band excitation signal to determine a scaled high-band excitation signal and determining a second modeled high-band signal based on the scaled high-band excitation signal. The method includes determining gain parameters based on the second modeled high-band signal and the high-band portion of the audio signal.
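A sketch of how the sub-frame scaling factors could be computed and applied, assuming the first modeled high-band signal and the high-band portion of the audio signal are available as sample arrays; the energy-ratio form of the factors is an assumption.

```python
import numpy as np

def subframe_scaling_factors(modeled_hb, target_hb, num_subframes=4):
    """One scaling factor per sub-frame: square root of the target
    high-band energy over the modeled high-band energy, so the scaled
    excitation follows the high-band energy envelope."""
    factors = []
    for m, t in zip(np.array_split(modeled_hb, num_subframes),
                    np.array_split(target_hb, num_subframes)):
        e_model = np.dot(m, m) + 1e-12   # avoid division by zero
        e_target = np.dot(t, t)
        factors.append(np.sqrt(e_target / e_model))
    return factors

def apply_scaling(hb_excitation, factors):
    """Apply each factor to the corresponding sub-frame of the
    modeled high-band excitation signal."""
    parts = np.array_split(hb_excitation, len(factors))
    return np.concatenate([g * p for g, p in zip(factors, parts)])
```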
Abstract:
A device includes a receiver configured to receive an audio frame of an audio stream. The audio frame includes information that indicates a coded bandwidth of the audio frame. The device also includes a decoder configured to generate first decoded speech associated with the audio frame and to determine an output mode of the decoder based at least in part on the information that indicates the coded bandwidth. A bandwidth mode indicated by the output mode of the decoder is different than a bandwidth mode indicated by the information that indicates the coded bandwidth. The decoder is further configured to output second decoded speech based on the first decoded speech. The second decoded speech is generated according to the output mode of the decoder.
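An illustrative output-mode policy consistent with the description, in which the decoder's output bandwidth can differ from a frame's coded bandwidth; the hold-the-wider-mode rule and the mode labels are assumptions, not the claimed method.

```python
def select_output_mode(coded_bandwidth, previous_output_mode):
    """Keep the decoder's output bandwidth stable even when an individual
    frame is coded at a narrower bandwidth; such a frame would then be
    extended (e.g., by bandwidth extension) up to the output mode."""
    order = ["NB", "WB", "SWB", "FB"]   # narrowband .. fullband
    if order.index(coded_bandwidth) >= order.index(previous_output_mode):
        return coded_bandwidth          # output follows the coded bandwidth
    return previous_output_mode         # otherwise hold the wider output mode

mode = select_output_mode("WB", "SWB")  # -> "SWB", differs from the coded bandwidth
```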
Abstract:
Techniques are described for performing adaptive noise suppression to improve handling of both speech signals and music signals at least up to super wideband (SWB) bandwidths. The techniques include identifying a context or environment in which audio data is captured, and adaptively changing a level of noise suppression applied to the audio data prior to bandwidth compressing (e.g., encoding) based on the context. For a valid speech context, an audio pre-processor may set a first level of noise suppression that is relatively aggressive in order to suppress noise (including music) in the speech signals. For a valid music context, the audio pre-processor may set a second level of noise suppression that is less aggressive in order to leave the music signals undistorted. In this way, a vocoder at a transmitter-side wireless communication device may properly encode both speech and music signals with minimal distortion.
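A small sketch of the context-dependent suppression levels, with purely illustrative attenuation values.

```python
def max_suppression_db(context):
    """Map the detected capture context to a noise-suppression depth
    (values are illustrative): aggressive for a speech context, gentle
    for a music context so the music signal is left largely undistorted."""
    levels = {"speech": 18.0, "music": 6.0}   # attenuation applied to noise, in dB
    return levels.get(context, 12.0)          # default for unknown contexts

# usage: the audio pre-processor would configure its suppressor with this depth
depth = max_suppression_db("music")           # -> 6.0 dB, the less aggressive setting
```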
Abstract:
A device includes a de-jitter buffer configured to receive a packet, the packet including first data and second data. The first data includes a partial copy of first frame data corresponding to a first frame of a sequence of frames. The second data corresponds to a second frame of the sequence of frames. The device also includes an analyzer configured to, in response to receiving the packet, generate a first frame receive timestamp associated with the first data. The analyzer is also configured to, in response to receiving the packet, generate a second frame receive timestamp associated with the second data. The first frame receive timestamp indicates a first time that is earlier than a second time indicated by the second frame receive timestamp.
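A sketch of how the two receive timestamps could be generated from one packet, assuming the frame spacing is known and the partial copy corresponds to an earlier frame in the sequence; the field names and the 20 ms spacing are hypothetical.

```python
def enqueue_packet(packet, arrival_time_ms, frame_duration_ms=20):
    """Generate receive timestamps for both payloads of one packet: the
    partial copy belongs to an earlier frame in the sequence, so its
    timestamp is offset backwards by the frame spacing."""
    offset = (packet["second_frame_seq"] - packet["first_frame_seq"]) * frame_duration_ms
    first_ts = arrival_time_ms - offset      # earlier time for the partial copy
    second_ts = arrival_time_ms
    return {"partial_copy_ts": first_ts, "primary_ts": second_ts}

packet = {"first_frame_seq": 7, "second_frame_seq": 9,
          "first_data": b"...", "second_data": b"..."}
stamps = enqueue_packet(packet, arrival_time_ms=1000)  # partial_copy_ts=960 < primary_ts=1000
```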
Abstract:
A method of processing an audio signal includes determining an average signal-to-noise ratio for the audio signal over time. The method includes determining a formant-sharpening factor based on the determined average signal-to-noise ratio. The method also includes applying a filter that is based on the determined formant-sharpening factor to a codebook vector that is based on information from the audio signal.
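A sketch of an SNR-dependent formant-sharpening factor and one common CELP-style sharpening filter applied to a codebook vector; the SNR-to-factor mapping, the filter form A(z/g_num)/A(z/g_den), and all constants are illustrative assumptions, not the source's design.

```python
import numpy as np
from scipy.signal import lfilter

def sharpening_factor(avg_snr_db, lo=0.6, hi=0.8, snr_lo=10.0, snr_hi=35.0):
    """Map the long-term average SNR to a formant-sharpening factor:
    more sharpening at low SNR, less at high SNR (illustrative mapping)."""
    t = np.clip((avg_snr_db - snr_lo) / (snr_hi - snr_lo), 0.0, 1.0)
    return hi - t * (hi - lo)

def sharpen_codebook_vector(code, lpc, gamma):
    """Filter the codebook vector with A(z/g_num)/A(z/g_den), g_den > g_num,
    which emphasizes the formant regions described by the LPC polynomial
    'lpc' (with lpc[0] == 1)."""
    g_num, g_den = 0.75 * gamma, gamma
    powers = np.arange(len(lpc))
    num = lpc * (g_num ** powers)   # A(z/g_num)
    den = lpc * (g_den ** powers)   # A(z/g_den)
    return lfilter(num, den, code)

# usage with a toy minimum-phase LPC polynomial and a toy codebook vector
lpc = np.array([1.0, -1.6, 0.64])
code = np.random.randn(64)
out = sharpen_codebook_vector(code, lpc, sharpening_factor(avg_snr_db=15.0))
```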
Abstract:
An apparatus includes a network interface configured to receive, via a circuit-switched network, a packet. The packet includes a primary coding of a first audio frame, redundant coding of a second audio frame, and one or more bits that indicate signaling information. The signaling information corresponds to a decode operation of at least one of the primary coding or the redundant coding. The apparatus further includes a decoder configured to decode a portion of the packet based on the signaling information.
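A sketch of parsing such a packet under an assumed layout (one signaling byte, then the redundant coding of the earlier frame, then the primary coding of the current frame); the bit assignments are hypothetical, not the actual packet format.

```python
def parse_packet(payload: bytes):
    """Split an assumed packet layout: a signaling byte whose top bit flags
    whether a redundant copy is present and whose low bits give its length,
    followed by the redundant coding, followed by the primary coding."""
    signaling = payload[0]
    has_redundant = bool(signaling & 0x80)
    redundant_len = signaling & 0x3F
    offset = 1
    redundant = payload[offset:offset + redundant_len] if has_redundant else b""
    primary = payload[offset + len(redundant):]
    return {"signaling": signaling, "redundant": redundant, "primary": primary}
```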
Abstract:
A device includes a receiver configured to receive an audio frame of an audio stream. The device also includes a decoder configured to generate first decoded speech associated with the audio frame and to determine a count of audio frames classified as being associated with band limited content. The decoder is further configured to output second decoded speech based on the first decoded speech. The second decoded speech may be generated according to an output mode of the decoder. The output mode may be selected based at least in part on the count of audio frames.
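A sketch of selecting the output mode from a running count of frames classified as band limited; the counter reset, the threshold, and the mode names are illustrative assumptions.

```python
class OutputModeSelector:
    """Track how many consecutive frames were classified as band limited
    and switch the decoder's output mode only after the count crosses a
    threshold (values are illustrative, not from the source)."""

    def __init__(self, threshold=20):
        self.count = 0
        self.threshold = threshold
        self.mode = "wideband"

    def update(self, frame_is_band_limited):
        # count consecutive band-limited frames, reset on any other frame
        self.count = self.count + 1 if frame_is_band_limited else 0
        if self.count >= self.threshold:
            self.mode = "band_limited_output"   # e.g., stop high-band synthesis
        elif self.count == 0:
            self.mode = "wideband"
        return self.mode
```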