摘要:
A speech detection apparatus and method are provided. The speech detection apparatus and method determine whether a frame is speech or not using feature information extracted from an input signal. The speech detection apparatus may estimate a situation related to an input frame and determine which feature information is required for speech detection for the input frame in the estimated situation. The speech detection apparatus may detect a speech signal using dynamic feature information that may be more suitable to the situation of a particular frame, instead of using the same feature information for each and every frame.
摘要:
A speech detection apparatus and method are provided. The speech detection apparatus and method determine whether a frame is speech or not using feature information extracted from an input signal. The speech detection apparatus may estimate a situation related to an input frame and determine which feature information is required for speech detection for the input frame in the estimated situation. The speech detection apparatus may detect a speech signal using dynamic feature information that may be more suitable to the situation of a particular frame, instead of using the same feature information for each and every frame.
摘要:
Provided are a method and apparatus for providing a mobile voice web service in a mobile terminal. The method includes analyzing a web history of a user from web search logs of the user and generating a voice access list based on the analysis results, and performing voice recognition by dynamically generating a voice recognition syntax according to the generated voice access list. Accordingly, by limiting syntax required for voice recognition by generating a syntax suitable for a web context of the user, efficient voice recognition, which can be performed in a terminal not a server, can be implemented.
摘要:
A method of and apparatus for detecting noise are provided. The method of detecting noise includes: receiving an input of a voice frame and converting the voice frame into a filter bank vector; converting the converted filter bank vector into band data; calculating a weight Gaussian mixture model (GMM) for each band by using the converted band data; and detecting noise in the voice frame based on the calculation result.
摘要:
Provided are a method and apparatus for providing a mobile voice web service in a mobile terminal. The method includes analyzing a web history of a user from web search logs of the user and generating a voice access list based on the analysis results, and performing voice recognition by dynamically generating a voice recognition syntax according to the generated voice access list. Accordingly, by limiting syntax required for voice recognition by generating a syntax suitable for a web context of the user, efficient voice recognition, which can be performed in a terminal not a server, can be implemented.
摘要:
A method of and apparatus for detecting noise are provided. The method of detecting noise includes: receiving an input of a voice frame and converting the voice frame into a filter bank vector; converting the converted filter bank vector into band data; calculating a weight Gaussian mixture model (GMM) for each band by using the converted band data; and detecting noise in the voice frame based on the calculation result.
摘要:
An apparatus for speech recognition includes: a first confidence score calculator calculating a first confidence score using a ratio between a likelihood of a keyword model for feature vectors per frame of a speech signal and a likelihood of a Filler model for the feature vectors; a second confidence score calculator calculating a second confidence score by comparing a Gaussian distribution trace of the keyword model per frame of the speech signal with a Gaussian distribution trace sample of a stored corresponding keyword of the keyword model; and a determination module determining a confidence of a result using the keyword model in accordance with a position determined by the first and second confidence scores on a confidence coordinate system.
摘要:
An apparatus for speech recognition includes: a first confidence score calculator calculating a first confidence score using a ratio between a likelihood of a keyword model for feature vectors per frame of a speech signal and a likelihood of a Filler model for the feature vectors; a second confidence score calculator calculating a second confidence score by comparing a Gaussian distribution trace of the keyword model per frame of the speech signal with a Gaussian distribution trace sample of a stored corresponding keyword of the keyword model; and a determination module determining a confidence of a result using the keyword model in accordance with a position determined by the first and second confidence scores on a confidence coordinate system.
摘要:
An apparatus and method for recognizing voice. The apparatus includes a feature vector extraction unit dividing an input voice signal into predetermined unit regions, and extracting feature vectors corresponding to each of the unit regions; a predicted node extraction unit extracting a list of second nodes whose travels to a first node corresponding to the extracted feature vectors are predicted, with reference to a network of one or more nodes; a single waveform similarity calculation unit calculating degrees of single waveform similarity of the first node and the second nodes of the list by substituting the extracted feature vectors into single waveform probability distributions that constitute voice signals corresponding to the second nodes; a multiple waveform similarity calculation unit calculating degrees of multiple waveform similarity by substituting the extracted feature vectors into multiple waveform probability distributions that constitute single waveform probability distributions usable to calculate the degrees of single waveform similarity in a preset range; and an output unit outputting a function-performing signal corresponding to a multiple waveform probability distribution that enables calculation of a highest of the calculated degrees of multiple waveform similarity.
摘要:
Provided are a method and apparatus for automatically completing a text input using speech recognition. The method includes: receiving a first part of a text from a user through a text input device; recognizing a speech of the user, which corresponds to the text; and completing a remaining part of the text based on the first part of the text and the recognized speech. Therefore, accuracy of the text input and convenience of the speech recognition can be ensured, and a non-input part of the text can be easily input based on the input part of the text and the recognized speech at a high speed.