Abstract:
A speech recognition apparatus, includes a reliability estimating unit configured to estimate reliability of a time-frequency segment from an input voice signal; and a reliability reflecting unit configured to reflect the reliability of the time-frequency segment to a normalized cepstrum feature vector extracted from the input speech signal and a cepstrum average vector included for each state of an HMM in decoding. Further, the speech recognition apparatus includes a cepstrum transforming unit configured to transform the cepstrum feature vector and the average vector through a discrete cosine transformation matrix and calculate a transformed cepstrum vector. Furthermore, the speech recognition apparatus includes an output probability calculating unit configured to calculate an output probability value of time-frequency segments of the input speech signal by applying the transformed cepstrum vector to the cepstrum feature vector and the average vector.
Abstract:
An apparatus for evaluating the performance of speech recognition includes a speech database for storing N-number of test speech signals for evaluation. A speech recognizer is located in an actual environment and executes the speech recognition of the test speech signals reproduced using a loud speaker from the speech database in the actual environment to produce speech recognition results. A performance evaluation module evaluates the performance of the speech recognition by comparing correct recognition results answers with the speech recognition results.