CODING AND DECODING OF PULSE AND RESIDUAL PARTS OF AN AUDIO SIGNAL

    Publication No.: US20240177724A1

    Publication date: 2024-05-30

    Application No.: US18406351

    Filing date: 2024-01-08

    Inventor: Goran MARKOVIC

    Abstract: An audio encoder for encoding an audio signal comprising a pulse portion and a stationary portion, comprises: a pulse extractor configured for extracting the pulse portion from the audio signal, further comprising a pulse coder for encoding the extracted pulse portion to acquire an encoded pulse portion; wherein the pulse extractor is configured to determine a spectrogram of the audio signal to extract the pulse portion, wherein the spectrogram has higher time resolution than the signal encoder; a signal encoder configured for encoding a residual signal derived from the audio signal to acquire an encoded residual signal, the residual signal being derived from the audio signal so that the pulse portion is reduced or eliminated from the audio signal; and an output interface configured for outputting the encoded pulse portion and the encoded residual signal to provide an encoded signal.
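
    Illustrative sketch (not the patented method): the split into a pulse portion and a residual can be pictured with a much simpler stand-in. The code below replaces the high-time-resolution spectrogram with short-block energies in the time domain, and the names split_pulse_residual, hop and factor, as well as the median-based threshold, are assumptions made only for this example.

```python
from statistics import median
from typing import List, Sequence, Tuple


def split_pulse_residual(
    signal: Sequence[float], hop: int = 4, factor: float = 8.0
) -> Tuple[List[float], List[float]]:
    """Split a signal into (pulse, residual) using short-block energy.

    Assumption: a block counts as "pulse" when its energy exceeds
    `factor` times the median block energy. Short blocks stand in for
    the high time resolution mentioned in the abstract.
    """
    # Energy of each short block of `hop` samples.
    energies = [
        sum(s * s for s in signal[i:i + hop])
        for i in range(0, len(signal), hop)
    ]
    # Guard against an all-silent signal, where the median is zero.
    threshold = factor * max(median(energies), 1e-12) if energies else 0.0

    pulse = [0.0] * len(signal)
    for k, energy in enumerate(energies):
        if energy > threshold:
            start = k * hop
            # Move the whole block into the pulse portion.
            pulse[start:start + hop] = signal[start:start + hop]

    # The residual is the input with the pulse portion removed.
    residual = [s - p for s, p in zip(signal, pulse)]
    return pulse, residual


demo = [0.1] * 16
demo[8] = 5.0  # a single strong transient
demo_pulse, demo_residual = split_pulse_residual(demo)
```

    In this sketch the two parts always sum back to the input, so a pulse coder and a signal encoder could process them separately; how the patent actually derives and codes them is not specified beyond the abstract.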

    High resolution audio coding for improving package loss concealment

    Publication No.: US11749290B2

    Publication date: 2023-09-05

    Application No.: US17373148

    Filing date: 2021-07-12

    Inventor: Yang Gao

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing long-term prediction (LTP) are described. One example of the methods includes determining a pitch gain and a pitch lag of an input audio signal for at least a predetermined number of frames. It is determined that the pitch gain of the input audio signal has exceeded a predetermined threshold and that a change of the pitch lag of the input audio signal has been within a predetermined range for at least the predetermined number of frames. In response to determining that the pitch gain of the input audio signal has exceeded the predetermined threshold and that the change of the third pitch lag has been within the predetermined range for at least the predetermined number of frames, a pitch gain is set for a current frame of the input audio signal.
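
    Illustrative sketch: the two conditions in the abstract (pitch gain above a threshold, pitch-lag change within a range, both over a minimum number of frames) can be written as a small check. The function names, the default values and the choice to reuse the latest gain estimate are assumptions for this example; the abstract does not say which value the pitch gain is set to.

```python
from typing import Sequence


def pitch_is_stable(
    gains: Sequence[float],
    lags: Sequence[int],
    gain_threshold: float,
    max_lag_change: int,
    min_frames: int,
) -> bool:
    """True when the last `min_frames` frames satisfy both conditions."""
    if min_frames < 1 or len(gains) != len(lags) or len(gains) < min_frames:
        return False
    recent_gains = gains[-min_frames:]
    recent_lags = lags[-min_frames:]
    # Condition 1: every recent pitch gain exceeds the threshold.
    if any(g <= gain_threshold for g in recent_gains):
        return False
    # Condition 2: frame-to-frame lag change stays within the range.
    return all(
        abs(b - a) <= max_lag_change
        for a, b in zip(recent_lags, recent_lags[1:])
    )


def select_pitch_gain(
    gains: Sequence[float],
    lags: Sequence[int],
    fallback_gain: float = 0.0,   # hypothetical default
    gain_threshold: float = 0.5,  # hypothetical default
    max_lag_change: int = 2,      # hypothetical default
    min_frames: int = 3,          # hypothetical default
) -> float:
    """Pick a pitch gain for the current frame.

    Assumption: when the pitch is stable, reuse the latest estimate;
    otherwise fall back to `fallback_gain`.
    """
    if pitch_is_stable(gains, lags, gain_threshold, max_lag_change, min_frames):
        return gains[-1]
    return fallback_gain
```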

    Deep neural network based audio processing method, device and storage medium

    Publication No.: US11270688B2

    Publication date: 2022-03-08

    Application No.: US16930337

    Filing date: 2020-07-16

    Abstract: A deep neural network based audio processing method is provided. The method includes: obtaining a deep neural network based speech extraction model; receiving an audio input object having a speech portion and a non-speech portion, wherein the audio input object includes one or more audio data frames each having a set of audio data samples sampled at a predetermined sampling interval and represented in time domain data format; obtaining a user audiogram and a set of user gain compensation coefficients associated with the user audiogram; and inputting the audio input object and the set of user gain compensation coefficients into the trained speech extraction model to obtain an audio output result represented in time domain data format outputted by the trained speech extraction model, wherein the non-speech portion of the audio input object is at least partially attenuated in or removed from the audio output result.

    DEEP NEURAL NETWORK BASED AUDIO PROCESSING METHOD, DEVICE AND STORAGE MEDIUM

    Publication No.: US20210074266A1

    Publication date: 2021-03-11

    Application No.: US16930337

    Filing date: 2020-07-16

    Abstract: A deep neural network based audio processing method is provided. The method includes: obtaining a deep neural network based speech extraction model; receiving an audio input object having a speech portion and a non-speech portion, wherein the audio input object includes one or more audio data frames each having a set of audio data samples sampled at a predetermined sampling interval and represented in time domain data format; obtaining a user audiogram and a set of user gain compensation coefficients associated with the user audiogram; and inputting the audio input object and the set of user gain compensation coefficients into the trained speech extraction model to obtain an audio output result represented in time domain data format outputted by the trained speech extraction model, wherein the non-speech portion of the audio input object is at least partially attenuated in or removed from the audio output result.

    TEXT CATEGORIZATION USING NATURAL LANGUAGE PROCESSING

    Publication No.: US20200175228A1

    Publication date: 2020-06-04

    Application No.: US16784551

    Filing date: 2020-02-07

    Abstract: A method performed by a device may include identifying a plurality of samples of textual content; performing tokenization of the plurality of samples to generate a respective plurality of tokenized samples; performing embedding of the plurality of tokenized samples to generate a sample matrix; determining groupings of attributes of the sample matrix using a convolutional neural network; determining context relationships between the groupings of attributes using a bidirectional long short term memory (LSTM) technique; selecting predicted labels for the plurality of samples using a model, wherein the model selects, for a particular sample of the plurality of samples, a predicted label of the predicted labels from a plurality of labels based on respective scores of the particular sample with regard to the plurality of labels and based on a nonparametric paired comparison of the respective scores; and providing information identifying the predicted labels.

    Method and apparatus for recovering lost frames

    Publication No.: US10311885B2

    Publication date: 2019-06-04

    Application No.: US15817296

    Filing date: 2017-11-20

    Abstract: A method for recovering a lost frame in a received audio signal includes: obtaining an initial high-frequency band signal of a current lost frame in the received audio signal; calculating a ratio R, wherein the ratio R is a ratio of a high frequency excitation energy of a previous frame of the current lost frame to a high frequency excitation energy of the current lost frame; obtaining a global gain of the current lost frame according to the ratio R and a global gain of the previous frame of the current lost frame; and recovering a high-frequency band signal of the current lost frame according to the initial high-frequency band signal of the current lost frame and the global gain of the current lost frame. The method can be used in an audio signal decoding process for low-loss recovery of lost frames of the audio signal.
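
    Illustrative sketch: the steps in the abstract (energy ratio R, global gain of the lost frame, scaling of the initial high-frequency band signal) can be laid out as follows. The abstract does not give the formula that maps R and the previous global gain to the new gain; the square-root scaling used here, and all function and parameter names, are assumptions for this example only.

```python
import math
from typing import List, Sequence, Tuple


def excitation_energy(excitation: Sequence[float]) -> float:
    """Sum of squared samples of a high-frequency excitation signal."""
    return sum(s * s for s in excitation)


def energy_ratio(
    prev_excitation: Sequence[float],
    cur_excitation: Sequence[float],
    eps: float = 1e-12,
) -> float:
    """Ratio R: previous-frame energy over current-lost-frame energy."""
    # eps avoids division by zero when the current excitation is silent.
    return excitation_energy(prev_excitation) / max(
        excitation_energy(cur_excitation), eps
    )


def recover_high_band(
    initial_high_band: Sequence[float],
    prev_excitation: Sequence[float],
    cur_excitation: Sequence[float],
    prev_global_gain: float,
) -> Tuple[List[float], float]:
    """Return (recovered high-band signal, global gain of the lost frame)."""
    r = energy_ratio(prev_excitation, cur_excitation)
    # Assumed mapping: scale the previous gain by sqrt(R), since R is an
    # energy ratio and the gain is applied to amplitudes.
    gain = prev_global_gain * math.sqrt(r)
    return [gain * s for s in initial_high_band], gain


demo_band, demo_gain = recover_high_band(
    initial_high_band=[0.1, -0.2],
    prev_excitation=[2.0, 2.0],
    cur_excitation=[1.0, 1.0],
    prev_global_gain=0.5,
)
```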