-
Publication number: US20180247642A1
Publication date: 2018-08-30
Application number: US15697923
Application date: 2017-09-07
Inventor: Hyun Woo KIM, Ho Young JUNG, Jeon Gue PARK, Yun Keun LEE
CPC classification number: G10L15/16, G06N3/08, G06N3/084, G10L15/02, G10L15/04, G10L21/04, G10L25/84, G10L2015/025, G10L2015/027
Abstract: The present invention relates to a method and apparatus for improving spontaneous speech recognition performance. The present invention is directed to providing a method and apparatus for improving spontaneous speech recognition performance by extracting a phase feature as well as a magnitude feature of a voice signal transformed to the frequency domain, detecting a syllabic nucleus on the basis of a deep neural network using a multi-frame output, determining a speaking rate by dividing the number of syllabic nuclei by a voice section interval detected by a voice detector, calculating a length variation or an overlap factor according to the speaking rate, and performing cepstrum length normalization or time scale modification with a voice length appropriate for an acoustic model.
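The speaking-rate step described in this abstract reduces to a ratio of syllabic nuclei to detected voice duration. The following is a minimal sketch of that ratio, not the patented implementation: the function names, the `target_rate` parameter, and the use of a plain ratio as the length-variation factor are assumptions made for illustration, since the abstract does not give the actual formula.

```python
def speaking_rate(num_syllabic_nuclei: int, voiced_seconds: float) -> float:
    """Syllabic nuclei per second of detected voice activity."""
    if voiced_seconds <= 0:
        raise ValueError("voiced_seconds must be positive")
    return num_syllabic_nuclei / voiced_seconds


def length_variation(rate: float, target_rate: float) -> float:
    """Hypothetical time-scale factor relative to an assumed target rate.

    Assumption: the acoustic model expects speech near target_rate, so a
    factor above 1 means the utterance should be stretched and a factor
    below 1 means it should be compressed.
    """
    if rate <= 0 or target_rate <= 0:
        raise ValueError("rates must be positive")
    return rate / target_rate


# 18 nuclei detected in 3 seconds of voiced speech.
rate = speaking_rate(num_syllabic_nuclei=18, voiced_seconds=3.0)
factor = length_variation(rate, target_rate=4.0)
print(rate, factor)
```

The factor could then drive either cepstrum length normalization or time scale modification; which one applies, and how, is left to the patent text.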
-
12.
Publication number: US20180166071A1
Publication date: 2018-06-14
Application number: US15607880
Application date: 2017-05-30
Inventor: Sung Joo LEE, Jeon Gue PARK, Yun Keun LEE, Hoon CHUNG
Abstract: Provided are a method of automatically classifying a speaking rate and a speech recognition system using the method. The speech recognition system using automatic speaking rate classification includes a speech recognizer configured to extract word lattice information by performing speech recognition on an input speech signal, a speaking rate estimator configured to estimate word-specific speaking rates using the word lattice information, a speaking rate normalizer configured to normalize a word-specific speaking rate into a normal speaking rate when the word-specific speaking rate deviates from a preset range, and a rescoring section configured to rescore the speech signal whose speaking rate has been normalized.
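The estimator and normalizer in this abstract can be pictured as a per-word rate check against a preset range. The sketch below is an illustration under stated assumptions, not the system's actual design: the `Word` record, the choice of phones per second as the rate unit, and the `low`, `high`, and `normal` parameters are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float     # start time in seconds (e.g. from a word lattice)
    end: float       # end time in seconds
    num_phones: int  # assumed rate unit: phones per second


def word_rate(word: Word) -> float:
    """Word-specific speaking rate in phones per second."""
    duration = word.end - word.start
    if duration <= 0:
        raise ValueError("word duration must be positive")
    return word.num_phones / duration


def stretch_factor(rate: float, low: float, high: float, normal: float) -> float:
    """Duration scaling that maps an out-of-range word to the normal rate.

    Words whose rate lies inside [low, high] are left unchanged (factor 1.0).
    """
    if low <= rate <= high:
        return 1.0
    return rate / normal


words = [Word("hello", 0.0, 0.5, 4), Word("everybody", 0.5, 1.0, 10)]
factors = [stretch_factor(word_rate(w), low=5.0, high=15.0, normal=10.0) for w in words]
print(factors)
```

Here the first word falls inside the range and is untouched, while the second is spoken at twice the assumed normal rate and would be stretched before rescoring.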
-
13.
Publication number: US20150221303A1
Publication date: 2015-08-06
Application number: US14595238
Application date: 2015-01-13
Inventor: Jeom Ja KANG, Hyung Bae JEON, Yun Keun LEE, Ho Young JUNG
Abstract: Provided are a discussion learning system that enables discussion learning to proceed on the basis of a speech recognition system without an instructor, and a method using the same. The discussion learning system includes a learning content providing server configured to provide a discussion environment, extract the speech of learners joining a discussion, and generate speech information based on the extracted speech, and a speech recognition server configured to perform speech recognition for each of the learners based on the speech information, determine the progress of the discussion based on a result of the speech recognition, and provide the learning content providing server with interpretation information for smoothly continuing the discussion.
-
14.
Publication number: US20200184310A1
Publication date: 2020-06-11
Application number: US16711317
Application date: 2019-12-11
Inventor: Hoon CHUNG, Jeon Gue PARK, Yun Keun LEE
Abstract: Provided is an apparatus and method for reducing the number of deep neural network model parameters, the apparatus including a memory in which a program for DNN model parameter reduction is stored, and a processor configured to execute the program, wherein the processor represents hidden layers of the model of the DNN using a full-rank decomposed matrix, uses training that is employed with a sparsity constraint for converting a diagonal matrix value to zero, and determines a rank of each of the hidden layers of the model of the DNN according to a degree of the sparsity constraint.
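One way to read this abstract is that a layer is written as U · diag(d) · V, a sparsity penalty drives entries of d to zero, and the surviving entries set the layer's rank. The sketch below illustrates that idea only: soft-thresholding (the proximal step for an L1 penalty) is an assumed stand-in for the unspecified sparsity constraint, and all function names and the 6 x 4 layer size are hypothetical.

```python
def soft_threshold(values, strength):
    """Shrink each diagonal value toward zero by `strength`.

    Values whose magnitude is at most `strength` become zero.
    """
    shrunk = []
    for v in values:
        magnitude = max(abs(v) - strength, 0.0)
        shrunk.append(magnitude if v >= 0 else -magnitude)
    return shrunk


def effective_rank(diagonal):
    """Number of diagonal entries that survived the sparsity step."""
    return sum(1 for v in diagonal if v != 0.0)


def parameter_count(rows, cols, rank):
    """Parameters in U (rows x rank), the diagonal (rank), and V (rank x cols)."""
    return rows * rank + rank + rank * cols


rows, cols = 6, 4                 # a dense 6 x 4 layer has 24 weights
diag = [3.0, 1.2, 0.4, -0.1]      # full-rank diagonal before sparsification
for strength in (0.0, 0.5, 2.0):
    rank = effective_rank(soft_threshold(diag, strength))
    print(strength, rank, parameter_count(rows, cols, rank))
```

With no sparsity the decomposition keeps rank 4 and costs more parameters than the dense layer; a stronger constraint lowers the rank and the parameter count with it, which matches the abstract's statement that the rank depends on the degree of the sparsity constraint.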
-
Publication number: US20190318228A1
Publication date: 2019-10-17
Application number: US16260637
Application date: 2019-01-29
Inventor: Hyun Woo KIM, Ho Young JUNG, Jeon Gue PARK, Yun Keun LEE
Abstract: Provided are an apparatus and method for a statistical memory network. The apparatus includes a stochastic memory, an uncertainty estimator configured to estimate uncertainty information of external input signals from the input signals and provide the uncertainty information of the input signals, a writing controller configured to generate parameters for writing in the stochastic memory using the external input signals and the uncertainty information and generate additional statistics by converting statistics of the external input signals, a writing probability calculator configured to calculate a probability of a writing position of the stochastic memory using the parameters for writing, and a statistic updater configured to update stochastic values composed of an average and a variance of signals in the stochastic memory using the probability of a writing position, the parameters for writing, and the additional statistics.
-
Publication number: US20180165578A1
Publication date: 2018-06-14
Application number: US15478342
Application date: 2017-04-04
Inventor: Hoon CHUNG, Jeon Gue PARK, Sung Joo LEE, Yun Keun LEE
CPC classification number: G06N3/04, G06N3/0481, G06N3/063
Abstract: Provided are an apparatus and method for compressing a deep neural network (DNN). The DNN compression method includes receiving a matrix of a hidden layer or an output layer of a DNN, calculating a matrix representing a nonlinear structure of the hidden layer or the output layer, and decomposing the matrix of the hidden layer or the output layer using a constraint imposed by the matrix representing the nonlinear structure.
-
Publication number: US20170206894A1
Publication date: 2017-07-20
Application number: US15187581
Application date: 2016-06-20
Inventor: Byung Ok KANG, Jeon Gue PARK, Hwa Jeon SONG, Yun Keun LEE, Eui Sok CHUNG
CPC classification number: G10L15/16, G10L15/063, G10L15/07, G10L2015/022, G10L2015/0636
Abstract: A speech recognition apparatus based on a deep-neural-network (DNN) sound model includes a memory and a processor. As the processor executes a program stored in the memory, the processor generates sound-model state sets corresponding to a plurality of pieces of set training speech data included in multi-set training speech data, generates a multi-set state cluster from the sound-model state sets, and sets the multi-set training speech data as an input node and the multi-set state cluster as output nodes so as to learn a DNN structured parameter.
-