-
公开(公告)号:US11862183B2
公开(公告)日:2024-01-02
申请号:US17368390
申请日:2021-07-06
Inventor: Jongmo Sung , Seung Kwon Beack , Mi Suk Lee , Tae Jin Lee , Woo-taek Lim , Inseon Jang
IPC: G10L19/032
CPC classification number: G10L19/032
Abstract: An audio signal encoding and decoding method using a neural network model, a method of training the neural network model, and an encoder and decoder performing the methods are disclosed. The encoding method includes computing the first feature information of an input signal using a recurrent encoding model, computing an output signal from the first feature information using a recurrent decoding model, calculating a residual signal by subtracting the output signal from the input signal, computing the second feature information of the residual signal using a nonrecurrent encoding model, and converting the first feature information and the second feature information to a bitstream.
-
公开(公告)号:US11664037B2
公开(公告)日:2023-05-30
申请号:US17326035
申请日:2021-05-20
Applicant: Electronics and Telecommunications Research Institute , The Trustees of Indiana University
Inventor: Woo-taek Lim , Seung Kwon Beack , Jongmo Sung , Mi Suk Lee , Tae Jin Lee , Inseon Jang , Minje Kim , Haici Yang
IPC: G10L19/032 , G10L21/0272
CPC classification number: G10L19/032 , G10L21/0272
Abstract: Methods of encoding and decoding a speech signal using a neural network model that recognizes sound sources, and encoding and decoding apparatuses for performing the methods are provided. A method of encoding a speech signal includes identifying an input signal for a plurality of sound sources; generating a latent signal by encoding the input signal; obtaining a plurality of sound source signals by separating the latent signal for each of the plurality of sound sources; determining a number of bits used for quantization of each of the plurality of sound source signals according to a type of each of the plurality of sound sources; quantizing each of the plurality of sound source signals based on the determined number of bits; and generating a bitstream by combining the plurality of quantized sound source signals.
-
3.
公开(公告)号:US11205442B2
公开(公告)日:2021-12-21
申请号:US16562110
申请日:2019-09-05
Inventor: Young Ho Jeong , Sang Won Suh , Tae Jin Lee , Woo-taek Lim , Hui Yong Kim
Abstract: Provided is a sound event recognition method that may improve a sound event recognition performance using a correlation between difference sound signal feature parameters based on a neural network, in detail, that may extract a sound signal feature parameter from a sound signal including a sound event, and recognize the sound event included in the sound signal by applying a convolutional neural network (CNN) trained using the sound signal feature parameter.
-
公开(公告)号:US11562757B2
公开(公告)日:2023-01-24
申请号:US17377157
申请日:2021-07-15
Inventor: Seung Kwon Beack , Jongmo Sung , Mi Suk Lee , Tae Jin Lee , Woo-taek Lim , Inseon Jang , Jin Soo Choi
IPC: G10L19/06 , G10L19/032
Abstract: An audio signal encoding method performed by an encoder includes identifying a time-domain audio signal in a unit of blocks, quantizing a linear prediction coefficient extracted from a combined block in which a current original block of the audio signal and a previous original block chronologically adjacent to the current original block using frequency-domain linear predictive coding (LPC), generating a temporal envelope by dequantizing the quantized linear prediction coefficient, extracting a residual signal from the combined block based on the temporal envelope, quantizing the residual signal by one of time-domain quantization and frequency-domain quantization, and transforming the quantized residual signal and the quantized linear prediction coefficient into a bitstream.
-
公开(公告)号:US11133015B2
公开(公告)日:2021-09-28
申请号:US16180298
申请日:2018-11-05
Inventor: Seung Kwon Beack , Woo-taek Lim , Jongmo Sung , Mi Suk Lee , Tae Jin Lee , Hui Yong Kim
IPC: G10L19/04 , G10L25/30 , G10L19/008
Abstract: A method of predicting a channel parameter of an original signal from a downmix signal is disclosed. The method may include generating an input feature map to be used to predict a channel parameter of the original signal based on a downmix signal of an original signal, determining an output feature map including a predicted parameter to be used to predict the channel parameter by applying the input feature map to a neural network, generating a label map including information associated with the channel parameter of the original signal, and predicting the channel parameter of the original signal by comparing the output feature map and the label map.
-
公开(公告)号:US10540988B2
公开(公告)日:2020-01-21
申请号:US16196356
申请日:2018-11-20
Inventor: Woo-taek Lim
Abstract: Disclosed is a sound event detecting method including receiving an audio signal, transforming the audio signal into a two-dimensional (2D) signal, extracting a feature map by training a convolutional neural network (CNN) using the 2D signal, pooling the feature map based on a frequency, and determining whether a sound event occurs with respect to each of at least one time interval based on a result of the pooling.
-
7.
公开(公告)号:US11823688B2
公开(公告)日:2023-11-21
申请号:US17390753
申请日:2021-07-30
Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE , The Trustees of Indiana University
Inventor: Woo-taek Lim , Seung Kwon Beack , Jongmo Sung , Tae Jin Lee , Inseon Jang , Minje Kim
IPC: G10L19/008 , G06N3/04 , G10L19/032 , G10L19/24
CPC classification number: G10L19/008 , G06N3/04 , G10L19/032
Abstract: Disclosed are a method of encoding and decoding an audio signal and an encoder and a decoder performing the method. The method of encoding an audio signal includes identifying an input signal, and generating a bitstring of each encoding layer by applying, to the input signal, an encoding model including a plurality of successive encoding layers that encodes the input signal, in which a current encoding layer among the encoding layers is trained to generate a bitstring of the current encoding layer by encoding an encoded signal which is a signal encoded in a previous encoding layer and quantizing an encoded signal which is a signal encoded in the current encoding layer.
-
公开(公告)号:US11783844B2
公开(公告)日:2023-10-10
申请号:US17527351
申请日:2021-11-16
Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE , Gwangju Institute of Science and Technology
Inventor: Woo-taek Lim , Seung Kwon Beack , Jongmo Sung , Tae Jin Lee , Inseon Jang , Jong Won Shin , Soojoong Hwang , Youngju Cheon , Sangwook Han
Abstract: Disclosed are methods of encoding and decoding an audio signal using side information, and an encoder and a decoder for performing the methods. The method of encoding an audio signal using side information includes identifying an input signal, the input signal being an original audio signal, extracting side information from the input signal using a learning model trained to extract side information from a feature vector of the input signal, encoding the input signal, and generating a bitstream by combining the encoded input signal and the side information.
-
公开(公告)号:US11580999B2
公开(公告)日:2023-02-14
申请号:US17331416
申请日:2021-05-26
Inventor: Seung Kwon Beack , Jongmo Sung , Mi Suk Lee , Tae Jin Lee , Woo-taek Lim , Inseon Jang
IPC: G10L19/022 , G10L19/06 , G10L19/16 , G10L19/035
Abstract: An audio signal encoding method performed by an encoder includes identifying an audio signal of a time domain in units of a block, generating a combined block by combining i) a current original block of the audio signal and ii) a previous original block chronologically adjacent to the current original block, extracting a first residual signal of a frequency domain from the combined block using linear predictive coding of a time domain, overlapping chronologically adjacent first residual signals among first residual signals converted into a time domain, and quantizing a second residual signal of a time domain extracted from the overlapped first residual signal by converting the second residual signal of the time domain into a frequency domain using linear predictive coding of a frequency domain.
-
公开(公告)号:US11545163B2
公开(公告)日:2023-01-03
申请号:US16729112
申请日:2019-12-27
Inventor: Seung Kwon Beack , Woo-taek Lim , Tae Jin Lee
IPC: G10L19/032 , G10L25/30
Abstract: A loss function of a signal including an audio signal is determined. A loss function determining system for an audio signal is provided. A loss function is determined by: determining a reference quantization index by quantizing an original input signal; inputting the original input signal to a neural network classifier and applying an activation function to an output layer of the neural network classifier; and determining a total loss function for the neural network classifier using an output of the activation function and the reference quantization index.
-
-
-
-
-
-
-
-
-