-
1.
Publication number: US10079022B2
Publication date: 2018-09-18
Application number: US15193216
Application date: 2016-06-27
Inventor: Dong-Hyun Kim
IPC: G01L15/00 , G10L15/04 , G10L15/14 , G10L15/26 , G10L15/28 , G10L15/18 , G10L15/20 , G10L17/00 , G10L15/30 , G10L15/02 , G10L15/183
CPC classification number: G10L15/30 , G10L15/02 , G10L15/183
Abstract: A voice recognition terminal, a voice recognition server, and a voice recognition method for performing personalized voice recognition. The voice recognition terminal includes a feature extraction unit for extracting feature data from an input voice signal, an acoustic score calculation unit for calculating acoustic model scores using the feature data, and a communication unit for transmitting the acoustic model scores and state information to a voice recognition server in units of one or more frames, and receiving transcription data from the voice recognition server, wherein the transcription data is recognized using a calculated path of a language network when the voice recognition server calculates the path of the language network using the acoustic model scores.
-
2.
Publication number: US12206894B2
Publication date: 2025-01-21
Application number: US17479811
Application date: 2021-09-20
Inventor: Dong-Hyun Kim , Ji-Hoon Do , Youn-Hee Kim , Se-Yoon Jeong , Hyoung-Jin Kwon , Jong-Ho Kim , Joo-Young Lee , Jin-Soo Choi , Tae-Jin Lee
IPC: H04N19/593 , G06N3/04 , H04N19/11 , H04N19/136 , H04N19/176 , H04N19/184 , H04N19/436
Abstract: Disclosed herein are a method, an apparatus, and a storage medium for image encoding/decoding. An intra-prediction mode for the target block is derived, and intra-prediction for the target block that uses the derived intra-prediction mode is performed. The intra-prediction mode for the target block is derived using an artificial neural network, and an MPM list for the target block is derived using information about the target block, pieces of information about blocks adjacent to the target block, and the artificial neural network. The artificial neural network outputs one or more available intra-prediction modes. Further, the artificial neural network outputs match probabilities for one or more candidate intra-prediction modes, and each of the match probabilities for the candidate intra-prediction modes indicates a probability that the corresponding candidate intra-prediction mode matches the intra-prediction mode for the target block.
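Illustrative sketch (not taken from the patent): one way match probabilities could be turned into an MPM list is to rank the candidate modes by probability and keep the top few. The abstract does not state this ranking rule; the function name `build_mpm_list`, the list size of 3, and the probability values below are all assumptions made for illustration, and the neural network itself is replaced by a hard-coded dictionary.

```python
def build_mpm_list(match_probs, size=3):
    """Return the `size` candidate modes with the highest match probability.

    match_probs maps a candidate intra-prediction mode index to the
    probability that it matches the target block's mode. In the abstract
    these values come from an artificial neural network; here they are
    supplied directly (assumption).
    """
    ranked = sorted(match_probs, key=match_probs.get, reverse=True)
    return ranked[:size]


# Hypothetical network output for five candidate modes.
probs = {0: 0.05, 1: 0.40, 18: 0.10, 50: 0.30, 34: 0.15}
mpm_list = build_mpm_list(probs)
print(mpm_list)
```

Ranking by probability is only one plausible reading; a real codec would also need a rule for ties and for modes that are unavailable at picture boundaries.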
-
3.
Publication number: US20210279912A1
Publication date: 2021-09-09
Application number: US17192480
Application date: 2021-03-04
Inventor: Joo-Young Lee , Se-Yoon Jeong , Hyoung-Jin Kwon , Dong-Hyun Kim , Youn-Hee Kim , Jong-Ho Kim , Tae-Jin Lee , Jin-Soo Choi
Abstract: An encoding apparatus extracts features of an image by applying multiple padding operations and multiple downscaling operations to an image represented by data and transmits feature information indicating the features to a decoding apparatus. The multiple padding operations and the multiple downscaling operations are applied to the image in an order in which one padding operation is applied and thereafter one downscaling operation corresponding to the padding operation is applied. A decoding method receives feature information from an encoding apparatus, and generates a reconstructed image by applying multiple upscaling operations and multiple trimming operations to an image represented by the feature information. The multiple upscaling operations and the multiple trimming operations are applied to the image in an order in which one upscaling operation is applied and thereafter one trimming operation corresponding to the upscaling operation is applied.
-
4.
Publication number: US12190548B2
Publication date: 2025-01-07
Application number: US17482067
Application date: 2021-09-22
Inventor: Ji-Hoon Do , Hyoung-Jin Kwon , Dong-Hyun Kim , Youn-Hee Kim , Joo-Young Lee , Se-Yoon Jeong , Jin-Soo Choi , Tae-Jin Lee , Jee-Hoon Kim , Dong-Gyu Sim , Seoung-Jun Oh , Min-Hun Lee , Yun-Gu Lee , Han-Sol Choi , Kwang-Hwan Kim
IPC: G06T9/00 , G06N3/02 , G06T3/4046 , G06V10/40
Abstract: There are provided an apparatus, method, system, and recording medium for performing selective encoding/decoding on feature information. An encoding apparatus generates residual feature information. The encoding apparatus transmits the residual feature information to a decoding apparatus through a residual feature map bitstream. The residual feature information is the difference between feature information extracted from an original image and feature information extracted from a reconstructed image. Feature information of the reconstructed image is generated using the reconstructed image. Reconstructed feature information is generated using the feature information of the reconstructed image and reconstructed residual feature information.
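Illustrative sketch (not taken from the patent): the residual relationship in the abstract can be written as residual = features(original) - features(reconstructed) on the encoder side, and reconstructed features = features(reconstructed) + residual on the decoder side. The feature extractor below (row means) is a hypothetical stand-in for whatever network the patent uses, and bitstream coding of the residual is omitted, so the residual is passed through losslessly.

```python
def extract_features(image):
    """Hypothetical feature extractor: the mean of each image row."""
    return [sum(row) / len(row) for row in image]


def residual_features(original, reconstructed):
    """Encoder side: features of the original minus features of the reconstruction."""
    f_orig = extract_features(original)
    f_rec = extract_features(reconstructed)
    return [a - b for a, b in zip(f_orig, f_rec)]


def restore_features(reconstructed, residual):
    """Decoder side: features of the reconstruction plus the received residual."""
    f_rec = extract_features(reconstructed)
    return [b + r for b, r in zip(f_rec, residual)]


original = [[10, 12], [20, 24]]
reconstructed = [[9, 12], [20, 22]]  # e.g. the image after lossy coding

residual = residual_features(original, reconstructed)
restored = restore_features(reconstructed, residual)
print(residual, restored)
```

Because the decoder can recompute the features of the reconstructed image on its own, only the residual has to be transmitted.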
-
5.
Publication number: US12154302B2
Publication date: 2024-11-26
Application number: US17542002
Application date: 2021-12-03
Inventor: Joo-Young Lee , Se-Yoon Jeong , Hyoung-Jin Kwon , Dong-Hyun Kim , Youn-Hee Kim , Jong-Ho Kim , Ji-Hoon Do , Jin-Soo Choi , Tae-Jin Lee
IPC: G06T9/00
Abstract: Disclosed herein are a method, an apparatus and a storage medium for image encoding/decoding using a binary mask. An encoding method includes generating a latent vector using an input image, generating a selected latent vector component set using a binary mask, and generating a main bitstream by performing entropy encoding on the selected latent vector component set. A decoding method includes generating a selected latent vector component set including one or more selected latent vector components by performing entropy decoding on a main bitstream and generating the latent vector in which the one or more selected latent vector components are relocated by relocating the selected latent vector component set in the latent vector.
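Illustrative sketch (not taken from the patent): the selection and relocation steps in the abstract can be pictured on a flat list. Entropy encoding and decoding are omitted, the latent values are made up, and two points are assumptions: that the decoder already knows the binary mask, and that unselected positions are filled with 0.0.

```python
def select_components(latent, mask):
    """Encoder side: keep only the latent components whose mask bit is 1."""
    return [value for value, bit in zip(latent, mask) if bit]


def relocate_components(selected, mask, fill=0.0):
    """Decoder side: put the selected components back at their masked positions.

    Positions where the mask bit is 0 receive `fill` (assumption).
    """
    remaining = iter(selected)
    return [next(remaining) if bit else fill for bit in mask]


latent = [0.7, -1.2, 0.0, 3.4, 0.1]
mask = [1, 1, 0, 1, 0]

selected = select_components(latent, mask)   # what would be entropy-coded
restored = relocate_components(selected, mask)
print(selected, restored)
```

Only the three selected components would enter the main bitstream in this toy case; the other two are dropped and replaced by the fill value after relocation.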
-
6.
Publication number: US11769276B2
Publication date: 2023-09-26
Application number: US17192480
Application date: 2021-03-04
Inventor: Joo-Young Lee , Se-Yoon Jeong , Hyoung-Jin Kwon , Dong-Hyun Kim , Youn-Hee Kim , Jong-Ho Kim , Tae-Jin Lee , Jin-Soo Choi
IPC: G06T9/00 , G06T3/40 , G06V10/764 , G06V10/82 , G06V10/44
CPC classification number: G06T9/002 , G06T3/4046 , G06V10/454 , G06V10/764 , G06V10/82
Abstract: An encoding apparatus extracts features of an image by applying multiple padding operations and multiple downscaling operations to an image represented by data and transmits feature information indicating the features to a decoding apparatus. The multiple padding operations and the multiple downscaling operations are applied to the image in an order in which one padding operation is applied and thereafter one downscaling operation corresponding to the padding operation is applied. A decoding method receives feature information from an encoding apparatus, and generates a reconstructed image by applying multiple upscaling operations and multiple trimming operations to an image represented by the feature information. The multiple upscaling operations and the multiple trimming operations are applied to the image in an order in which one upscaling operation is applied and thereafter one trimming operation corresponding to the upscaling operation is applied.
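Illustrative sketch (not taken from the patent): the ordering described in the abstract, pad then downscale on the encoder side and upscale then trim on the decoder side, can be shown on a 1-D signal. Everything concrete here is an assumption: the factor of 2, padding by repeating the last sample, downscaling by pair averaging, upscaling by sample repetition, and passing the pre-padding lengths to the decoder as `sizes`. The patent applies learned operations to images.

```python
def encode(signal, stages):
    """Apply `stages` rounds of (pad, then downscale by 2)."""
    x = list(signal)
    sizes = []  # length before each padding step, needed for trimming later
    for _ in range(stages):
        sizes.append(len(x))
        if len(x) % 2:
            x.append(x[-1])  # padding: make the length divisible by 2
        x = [(x[i] + x[i + 1]) / 2 for i in range(0, len(x), 2)]  # downscale
    return x, sizes


def decode(features, sizes):
    """Apply rounds of (upscale by 2, then trim), undoing the stages in reverse."""
    x = list(features)
    for n in reversed(sizes):
        x = [v for v in x for _ in range(2)]  # upscale: repeat each sample
        x = x[:n]                             # trim: drop what padding added
    return x


signal = [1, 2, 3, 4, 5]
features, sizes = encode(signal, stages=2)
reconstructed = decode(features, sizes)
print(features, sizes, reconstructed)
```

Pairing each trim with its upscale keeps the reconstructed length equal to the input length even when intermediate lengths are odd; the sample values themselves are only approximated.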
-
7.
Publication number: US11665363B2
Publication date: 2023-05-30
Application number: US17535926
Application date: 2021-11-26
Inventor: Hyoung-Jin Kwon , Ji-Hoon Do , Dong-Hyun Kim , Youn-Hee Kim , Jong-Ho Kim , Joo-Young Lee , Se-Yoon Jeong , Jin-Soo Choi , Tae-Jin Lee , Jee-Hoon Kim , Dong-Gyu Sim , Seoung-Jun Oh , Min-Hun Lee , Yun-Gu Lee , Han-Sol Choi , Kwang-Hwan Kim
IPC: H04N19/503 , H04N19/11
CPC classification number: H04N19/503 , H04N19/11
Abstract: Disclosed herein are a method, apparatus, system, and computer-readable recording medium for image compression. An encoding apparatus performs preprocessing of feature map information, frame packing, frame classification, and encoding. A decoding apparatus performs decoding, frame depacking, and postprocessing in order to reconstruct feature map information. By encoding the feature map information, inter-prediction and intra-block prediction for a frame are performed. The encoding apparatus provides the decoding apparatus with a feature map information bitstream for reconstructing the feature map information along with an image information bitstream.
-
8.
Publication number: US10216729B2
Publication date: 2019-02-26
Application number: US14914390
Application date: 2014-04-30
Inventor: Sang-Hun Kim , Ki-Hyun Kim , Ji-Hyun Wang , Dong-Hyun Kim , Seung Yun , Min-Kyu Lee , Dam-Heo Lee , Mu-Yeol Choi
IPC: G06F17/28 , H04M1/60 , G10L13/033 , G10L15/00
Abstract: A user terminal, hands-free device and method for hands-free automatic interpretation service. The user terminal includes an interpretation environment initialization unit, an interpretation intermediation unit, and an interpretation processing unit. The interpretation environment initialization unit performs pairing with a hands-free device in response to a request from the hands-free device, and initializes an interpretation environment. The interpretation intermediation unit sends interpretation results obtained by interpreting a user's voice information received from the hands-free device to a counterpart terminal, and receives interpretation results obtained by interpreting a counterpart's voice information from the counterpart terminal. The interpretation processing unit synthesizes the interpretation results of the counterpart into a voice form based on the initialized interpretation environment when the interpretation results are received from the counterpart terminal, and sends the synthesized voice information to the hands-free device.
-
9.
Publication number: US09601112B2
Publication date: 2017-03-21
Application number: US14256386
Application date: 2014-04-18
Inventor: Dong-Hyun Kim
CPC classification number: G10L15/07 , G10L2015/228
Abstract: An embodiment of the present invention relates to a speech recognition system and method using incremental device-based acoustic model adaptation. The speech recognition system comprises a model selection module selecting an acoustic model of a multi-model tree by verifying and categorizing a device key transmitted from a user device; a model management module generating and incrementally adapting the multi-model tree by categorizing voice data based on a user device; and a speech recognition module performing speech recognition by receiving the acoustic model selected by the model selection module and transmitting data whose reliability exceeds a predetermined threshold value to the model management module.
-
10.
Publication number: US09514751B2
Publication date: 2016-12-06
Application number: US14224427
Application date: 2014-03-25
Inventor: Dong-Hyun Kim
IPC: G10L15/183 , G10L15/30 , G10L15/187 , G10L15/22
CPC classification number: G10L15/30 , G10L15/183 , G10L15/187 , G10L2015/228
Abstract: Described herein is a speech recognition device comprising: a communication module receiving speech data corresponding to speech input from a speech recognition terminal and multi-sensor data corresponding to the input environment of the speech; a model selection module selecting a language and acoustic model corresponding to the multi-sensor data among a plurality of language and acoustic models classified according to the speech input environment on the basis of previous multi-sensor data; and a speech recognition module controlling the communication module to apply a feature vector extracted from the speech data to the language and acoustic model and transmit a speech recognition result for the speech data to the speech recognition terminal.