APPARATUS AND METHOD FOR VERIFYING UTTERANCE IN SPEECH RECOGNITION SYSTEM

    Publication No.: US20170200458A1

    Publication date: 2017-07-13

    Application No.: US15186286

    Filing date: 2016-06-17

    Abstract: Provided are an apparatus and method for verifying an utterance based on multi-event detection information in a natural language speech recognition system. The apparatus includes a noise processor configured to process noise of an input speech signal, a feature extractor configured to extract features of speech data obtained through the noise processing, an event detector configured to detect events of a plurality of speech features occurring in the speech data using the noise-processed data and data of the extracted features, a decoder configured to perform speech recognition using a plurality of preset speech recognition models for the extracted feature data, and an utterance verifier configured to calculate confidence measurement values in units of words and sentences using information on the plurality of events detected by the event detector and a preset utterance verification model, and to perform utterance verification according to the calculated confidence measurement values.
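The abstract above does not say how the word-level and sentence-level confidence values are computed. The following is only an illustrative sketch of one way such a verifier could be wired up: the `Word` type, the event labels, the penalty table, the averaging rule, and the 0.5 threshold are all assumptions made for this example and are not taken from the patent.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Word:
    text: str
    posterior: float                  # decoder posterior in [0, 1] (assumed to be available)
    events: frozenset = frozenset()   # hypothetical event labels overlapping this word


# Hypothetical per-event penalties; a real system would use a trained verification model.
EVENT_PENALTY = {"noise": 0.25, "breath": 0.125, "repetition": 0.25}


def word_confidence(word):
    """Word-level confidence: decoder posterior minus penalties for detected events."""
    penalty = sum(EVENT_PENALTY.get(e, 0.0) for e in word.events)
    return max(0.0, word.posterior - penalty)


def verify_utterance(words, threshold=0.5):
    """Return (word scores, sentence score, accepted) for a recognized sentence."""
    scores = [word_confidence(w) for w in words]
    sentence_score = mean(scores) if scores else 0.0
    return scores, sentence_score, sentence_score >= threshold


if __name__ == "__main__":
    hypothesis = [Word("turn", 0.75), Word("on", 0.75, frozenset({"noise"}))]
    print(verify_utterance(hypothesis))
```

Subtracting a fixed penalty per event is the simplest possible combination rule; it is used here only to show where the event detector's output would enter the confidence calculation.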

    MOBILE COMMUNICATION TERMINAL AND OPERATING METHOD THEREOF
    Invention application (status: in force)

    Publication No.: US20140221043A1

    Publication date: 2014-08-07

    Application No.: US14018068

    Filing date: 2013-09-04

    CPC classification number: H04M1/72519 G10L15/25 H04M2250/52 H04M2250/74

    Abstract: Provided is a mobile communication terminal including: a camera module which captures an image of a set area; a microphone module which, when a sound including a voice of a user is input, extracts a sound level corresponding to the sound and a sound generating position; and a control module which estimates a position of a lip of the user from the image, extracts a voice level from the sound level corresponding to the position of the lip of the user and a voice generating position from the sound generating position, and recognizes the voice of the user based on at least one of the voice level and the voice generating position.
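The selection step described above, picking out the user's voice from the detected sounds by using the lip position estimated from the camera image, can be sketched as a nearest-source search. This is a minimal illustration under stated assumptions: the `SoundSource` type, the shared 2-D coordinate frame for image and sound positions, and the `max_distance` cutoff are hypothetical and do not come from the patent.

```python
import math
from dataclasses import dataclass


@dataclass
class SoundSource:
    level_db: float    # sound level reported by the microphone module (assumed unit)
    position: tuple    # (x, y) sound generating position, assumed to share the image frame


def extract_voice(sources, lip_position, max_distance=50.0):
    """Return the source closest to the estimated lip position, or None.

    Sources farther than max_distance from the lips are treated as non-voice sounds.
    """
    best, best_dist = None, max_distance
    for source in sources:
        dist = math.dist(source.position, lip_position)
        if dist <= best_dist:
            best, best_dist = source, dist
    return best


if __name__ == "__main__":
    sources = [SoundSource(60.0, (100.0, 100.0)), SoundSource(70.0, (300.0, 40.0))]
    voice = extract_voice(sources, lip_position=(105.0, 98.0))
    print(voice)  # the selected source carries the voice level and voice generating position
```

The returned source's `level_db` and `position` play the roles of the voice level and voice generating position, either of which could then be passed to a recognizer.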

