-
Publication No.: US20190180142A1
Publication date: 2019-06-13
Application No.: US16203668
Filing date: 2018-11-29
Inventor: Woo-taek LIM , Seung Kwon BEACK
IPC: G06K9/62 , G06N3/02 , G10L19/008 , G06F16/683
Abstract: Disclosed are an apparatus and a method for extracting a sound source from a multi-channel audio signal. A sound source extracting method includes transforming a multi-channel audio signal into two-dimensional (2D) data, extracting a plurality of feature maps by inputting the 2D data into a convolutional neural network (CNN) including at least one layer, and extracting a sound source from the multi-channel audio signal using the feature maps.
-
12.
Publication No.: US20230245666A1
Publication date: 2023-08-03
Application No.: US18102472
Filing date: 2023-01-27
Inventor: Seung Kwon BEACK , Jongmo SUNG , Tae Jin LEE , Woo-taek LIM , Inseon JANG , Byeongho CHO
IPC: G10L19/035 , G10L19/038 , G10L19/00
CPC classification number: G10L19/035 , G10L19/038 , G10L19/0017
Abstract: Provided are an encoding method, an encoding device, a decoding method, and a decoding device using a scalar quantization and a vector quantization. The encoding method includes converting an input signal of a time domain into a frequency domain, generating a first residual signal from an input signal of a frequency domain by using a scale factor, performing a scalar quantization of the first residual signal, generating a second residual signal from the scalar-quantized first residual signal, performing a lossless encoding of the scalar-quantized first residual signal, performing a vector quantization of the second residual signal, and transmitting a bitstream including the lossless-encoded first residual signal and the vector-quantized second residual signal.
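The two quantization stages in this abstract can be sketched as follows. This is a minimal illustration, not the method claimed in the application: it assumes the first residual is the spectrum divided by the scale factor and that the second residual is the error left by scalar quantization. The time-to-frequency transform and the lossless coding step are omitted, and the names `scalar_quantize`, `nearest_codeword`, `encode_frame`, `step`, and `codebook` are hypothetical.

```python
def scalar_quantize(values, step):
    """Map each value to the index of the nearest multiple of `step`."""
    return [round(v / step) for v in values]


def nearest_codeword(vector, codebook):
    """Return the index of the codebook entry with the smallest squared error."""
    def squared_error(entry):
        return sum((a - b) ** 2 for a, b in zip(vector, entry))
    return min(range(len(codebook)), key=lambda i: squared_error(codebook[i]))


def encode_frame(spectrum, scale_factor, step, codebook):
    # Assumption: the first residual is the spectrum normalized by the scale factor.
    first_residual = [x / scale_factor for x in spectrum]
    sq_indices = scalar_quantize(first_residual, step)
    # Assumption: the second residual is what scalar quantization failed to capture.
    second_residual = [r - q * step for r, q in zip(first_residual, sq_indices)]
    vq_index = nearest_codeword(second_residual, codebook)
    return sq_indices, vq_index
```

In a full codec the scalar indices would go through a lossless coder and be multiplexed with the vector index into the bitstream.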
-
13.
Publication No.: US20230039546A1
Publication date: 2023-02-09
Application No.: US17711908
Filing date: 2022-04-01
Applicant: Electronics and Telecommunications Research Institute , Gwangju Institute of Science and Technology
Inventor: Inseon JANG , Seung Kwon BEACK , Jongmo SUNG , Tae Jin LEE , Woo-taek LIM , Jongwon SHIN , Youngju CHEON , Sangwook HAN , Soojoong HWANG
IPC: G10L19/038 , G06N3/04
Abstract: An audio encoding/decoding apparatus and method using vector quantized residual error features are disclosed. An audio signal encoding method includes outputting a bitstream of a main codec by encoding an original signal, decoding the bitstream of the main codec, determining a residual error feature vector from a feature vector of a decoded signal and a feature vector of the original signal, and outputting a bitstream of additional information by encoding the residual error feature vector.
-
Publication No.: US20220335963A1
Publication date: 2022-10-20
Application No.: US17670172
Filing date: 2022-02-11
Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE , INDUSTRY-ACADEMIC COOPERATION FOUNDATION, YONSEI UNIVERSITY
Inventor: Inseon JANG , Seung Kwon BEACK , Jongmo SUNG , Tae Jin LEE , Woo-taek LIM , Hong-Goo KANG , Jihyun LEE , Chanwoo LEE , Hyungseob LIM
IPC: G10L19/038 , G10L25/30
Abstract: An audio signal encoding and decoding method using a neural network model, and an encoder and decoder for performing the same are disclosed. A method of encoding an audio signal using a neural network model may include identifying an input signal, generating a quantized latent vector by inputting the input signal into a neural network model encoding the input signal, and generating a bitstream corresponding to the quantized latent vector, wherein the neural network model may include i) a feature extraction layer generating a latent vector by extracting a feature of the input signal, ii) a plurality of downsampling blocks downsampling the latent vector, and iii) a plurality of quantization blocks performing quantization of a downsampled latent vector.
-
Publication No.: US20220005488A1
Publication date: 2022-01-06
Application No.: US17368484
Filing date: 2021-07-06
Inventor: Jongmo SUNG , Seung Kwon BEACK , Mi Suk LEE , Tae Jin LEE , Woo-taek LIM , Inseon JANG
IPC: G10L19/032 , G10L19/16 , H04B17/309 , G06N3/08
Abstract: The encoding method includes computing the first feature information of an input signal using a recurrent encoding model, quantizing the first feature information and producing the first feature bitstream, computing the first output signal from the quantized first feature information using a recurrent decoding model, computing the second feature information of the input signal using a nonrecurrent encoding model, quantizing the second feature information and producing the second feature bitstream, computing the second output signal from the quantized second feature information using a nonrecurrent decoding model, determining an encoding mode based on the input signal, the first and second output signals, and the first and second feature bitstreams, and outputting an overall bitstream by multiplexing an encoding mode bit and one of the first feature bitstream and the second feature bitstream depending on the encoding mode.
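The mode decision and multiplexing step in this abstract might look like the sketch below. The abstract does not say how the mode is chosen, so the cost used here (squared error plus a weighted bit count) is purely an assumption, as are the names `select_mode` and `rate_weight`. Bitstreams are modelled as strings of '0' and '1' for readability.

```python
def squared_error(reference, candidate):
    """Sum of squared differences between two equal-length signals."""
    return sum((r - c) ** 2 for r, c in zip(reference, candidate))


def select_mode(input_signal, outputs, bitstreams, rate_weight=0.01):
    """Pick the cheaper of two candidate codecs and prepend a mode bit.

    outputs[i] is the decoded signal and bitstreams[i] the feature
    bitstream of candidate i (0: recurrent model, 1: nonrecurrent model).
    """
    # Assumed cost: distortion plus a penalty proportional to the bit count.
    costs = [
        squared_error(input_signal, out) + rate_weight * len(bits)
        for out, bits in zip(outputs, bitstreams)
    ]
    mode = 0 if costs[0] <= costs[1] else 1
    # Overall bitstream: one mode bit followed by the chosen feature bitstream.
    return str(mode) + bitstreams[mode]
```

A decoder would read the first bit to decide which decoding model receives the remaining bits.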
-
Publication No.: US20210174252A1
Publication date: 2021-06-10
Application No.: US16927691
Filing date: 2020-07-13
Applicant: Electronics and Telecommunications Research Institute , Kyungpook National University Industry-Academic Cooperation Foundation
Inventor: Young Ho JEONG , Soo Young PARK , Sang Won SUH , Woo-taek LIM , Minhan KIM , Seokjin LEE
Abstract: Disclosed are an apparatus and a method for augmenting training data using a notch filter. The method may include obtaining original data, and obtaining training data having a modified frequency component from the original data by filtering the original data using a filter configured to remove a component of a predetermined frequency band.
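Notch-filter augmentation of this kind can be sketched with a second-order (biquad) notch. The filter design below follows the widely used Audio EQ Cookbook formulas, which is a swapped-in choice and not necessarily the filter used in the application. The function names and the default `q` value are hypothetical.

```python
import math


def notch_coefficients(notch_hz, sample_rate, q=5.0):
    """Biquad notch coefficients (Audio EQ Cookbook form), normalized so a[0] == 1."""
    w0 = 2.0 * math.pi * notch_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = [1.0 / a0, -2.0 * math.cos(w0) / a0, 1.0 / a0]
    a = [1.0, -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0]
    return b, a


def apply_biquad(samples, b, a):
    """Run a direct-form I biquad over a list of samples."""
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x0 in samples:
        y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out


def augment_with_notch(samples, sample_rate, notch_hz):
    """Return a copy of `samples` with the band around `notch_hz` removed."""
    b, a = notch_coefficients(notch_hz, sample_rate)
    return apply_biquad(samples, b, a)
```

Calling `augment_with_notch` with a different `notch_hz` for each copy of a clip yields several training examples whose spectra differ only in the removed band.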
-
17.
Publication No.: US20200312350A1
Publication date: 2020-10-01
Application No.: US16561824
Filing date: 2019-09-05
Inventor: Woo-taek LIM , Sang Won SUH , Young Ho JEONG
Abstract: A sound event detection method includes receiving a sound signal and determining and outputting whether a sound event is present in the sound signal by applying a trained neural network to the received sound signal, and performing post-processing of the output to reduce an error in the determination, wherein the neural network is trained to early stop at an optimal epoch based on a different threshold for each of at least one sound event present in a pre-processed sound signal. That is, the sound event detection method may detect an optimal epoch to stop training by applying different characteristics for respective sound events and improve the sound event detection performance based on the optimal epoch.
-
Publication No.: US20200211576A1
Publication date: 2020-07-02
Application No.: US16729112
Filing date: 2019-12-27
Inventor: Seung Kwon BEACK , Woo-taek LIM , Tae Jin LEE
IPC: G10L19/032 , G10L25/30 , G06N3/08
Abstract: A loss function determining method and a loss function determining system for an audio signal are disclosed. A method of determining a loss function capable of being defined when a neural network is used to reconstruct an audio signal is provided.
-
Publication No.: US20250006210A1
Publication date: 2025-01-02
Application No.: US18747007
Filing date: 2024-06-18
Applicant: Electronics and Telecommunications Research Institute , UIF (University Industry Foundation), Yonsei University
Inventor: Woo-taek LIM , Inseon JANG , Seung Kwon BEACK , Hong-Goo KANG , Byeong Hyeon KIM , Jihyun LEE , Hyungseob LIM
IPC: G10L19/08
Abstract: A method of encoding/decoding a speech signal and a device for performing the same are provided. The method includes outputting, based on a first input speech signal of a previous timepoint and a second input speech signal of a current timepoint, a predicted signal that predicts the second input speech signal from the first input speech signal and obtaining, based on the second input speech signal and the predicted signal, a residual signal by removing a correlation between the first input speech signal and the second input speech signal from the second input speech signal.
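The prediction-and-residual idea in this abstract can be illustrated with a toy predictor. The sketch assumes a fixed-gain copy of the previous frame as the predictor, which stands in for whatever model the application actually uses. The decoder side is also an assumption, since the abstract only describes how the residual is obtained. All function names and the `gain` parameter are hypothetical.

```python
def predict_frame(previous_frame, gain=0.9):
    """Toy predictor: assume the current frame resembles a scaled previous frame."""
    return [gain * s for s in previous_frame]


def encode_residual(previous_frame, current_frame):
    """Residual = current frame minus its prediction from the previous frame."""
    predicted = predict_frame(previous_frame)
    return [c - p for c, p in zip(current_frame, predicted)]


def decode_frame(previous_frame, residual):
    """Rebuild the current frame by adding the residual back to the prediction."""
    predicted = predict_frame(previous_frame)
    return [p + r for p, r in zip(predicted, residual)]
```

When consecutive frames are strongly correlated, the residual has less energy than the frame itself, which is what makes it cheaper to encode.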
-
Publication No.: US20240233738A9
Publication date: 2024-07-11
Application No.: US18358646
Filing date: 2023-07-25
Applicant: Electronics and Telecommunications Research Institute , Gwangju Institute of Science and Technology
Inventor: Inseon JANG , Seung Kwon BEACK , Tae Jin LEE , Jongmo SUNG , Woo-taek LIM , Byeongho CHO , Jongwon SHIN
IPC: G10L19/02
CPC classification number: G10L19/02
Abstract: Provided is an encoding apparatus including a memory configured to store instructions and a processor electrically connected to the memory and configured to execute the instructions, wherein the processor may be configured to perform a plurality of operations, when the instructions are executed by the processor, wherein the plurality of operations may include obtaining an input audio signal, generating an embedded audio signal by embedding signal components of a second frequency band of the input audio signal in a first frequency band of the input audio signal, generating additional information associated with the first frequency band and the second frequency band, generating an encoded audio signal by encoding the embedded audio signal, and formatting the encoded audio signal and the additional information into a bitstream.
-