Abstract:
A speech recognition apparatus includes a reliability estimating unit configured to estimate the reliability of a time-frequency segment from an input speech signal, and a reliability reflecting unit configured to reflect the reliability of the time-frequency segment in a normalized cepstrum feature vector extracted from the input speech signal and in a cepstrum average vector included for each state of an HMM in decoding. The apparatus further includes a cepstrum transforming unit configured to transform the cepstrum feature vector and the average vector through a discrete cosine transformation matrix and calculate a transformed cepstrum vector. The apparatus also includes an output probability calculating unit configured to calculate an output probability value of time-frequency segments of the input speech signal by applying the transformed cepstrum vector to the cepstrum feature vector and the average vector.
Abstract:
A silicon nanowire includes metal nanoclusters formed on its surface at a high density. The metal nanoclusters improve the electrical and optical characteristics of the silicon nanowire, which can therefore be used in various electrical devices such as a lithium battery, a solar cell, a biosensor, or a memory device.
Abstract:
A microphone-array-based speech recognition system using blind source separation (BSS) and a target speech extraction method in the system are provided. The speech recognition system performs independent component analysis (ICA) to separate mixed signals input through a plurality of microphones into sound-source signals, extracts one target speech spoken for speech recognition from the separated sound-source signals by using a Gaussian mixture model (GMM) or a hidden Markov model (HMM), and automatically recognizes the desired speech from the extracted target speech. Accordingly, a high speech recognition rate can be obtained even in a noisy environment.
Abstract:
An apparatus for evaluating the performance of speech recognition includes a speech database for storing N test speech signals for evaluation. A speech recognizer is located in an actual environment and performs speech recognition on the test speech signals, which are reproduced from the speech database through a loudspeaker in that environment, to produce speech recognition results. A performance evaluation module evaluates the performance of the speech recognition by comparing the correct recognition answers with the speech recognition results.
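The comparison performed by the performance evaluation module can be sketched as follows. This is a minimal illustration, assuming that each test signal has exactly one reference answer and that a result counts as correct only on an exact string match; the function name `evaluate_recognition` and the tuple layout of the error list are hypothetical and do not come from the abstract.

```python
def evaluate_recognition(correct_answers, recognition_results):
    """Compare reference answers with recognizer outputs.

    Returns (accuracy, errors), where errors lists
    (index, reference, hypothesis) for every mismatch.
    Exact string matching is an assumption of this sketch.
    """
    if len(correct_answers) != len(recognition_results):
        raise ValueError("answers and results must have the same length")

    # Collect every test signal whose result differs from its answer.
    errors = [
        (i, ref, hyp)
        for i, (ref, hyp) in enumerate(zip(correct_answers, recognition_results))
        if ref != hyp
    ]

    n = len(correct_answers)
    accuracy = (n - len(errors)) / n if n else 0.0
    return accuracy, errors


if __name__ == "__main__":
    answers = ["yes", "no", "stop"]   # hypothetical reference answers
    results = ["yes", "no", "go"]     # hypothetical recognizer outputs
    print(evaluate_recognition(answers, results))
```

A real evaluation would more likely report a word error rate based on edit distance; exact matching is used here only to keep the sketch short.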
Abstract:
The present invention relates to an apparatus and method for recognizing content using an audio signal. The content recognition apparatus includes a query fingerprint extraction unit for forming frames having a preset frame length for an audio signal, and generating frame-based feature vectors for respective frames, thus extracting a query fingerprint. A reference fingerprint DB stores reference fingerprints to be compared with the query fingerprint and pieces of content information corresponding to the reference fingerprints. A fingerprint matching unit determines a reference fingerprint matching the query fingerprint. In this case, the query fingerprint extraction unit forms the frames while varying a frame shift size that is an interval between start points of neighboring frames in a partial section. According to the present invention, there can be provided a content recognition apparatus and method which can maintain the accuracy and reliability of matching while promptly providing results.
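The idea of varying the frame shift size in a partial section can be sketched as follows. This is only an illustration under stated assumptions: positions are measured in samples, a single partial section uses a smaller shift, and the names `frame_starts`, `section`, and `section_shift` are hypothetical. The abstract does not say how the section or the shift sizes are chosen.

```python
def frame_starts(num_samples, frame_len, default_shift,
                 section=None, section_shift=None):
    """Return start indices of fixed-length frames.

    Frames whose start lies inside `section` (a half-open
    [begin, end) range in samples) advance by `section_shift`;
    all other frames advance by `default_shift`.
    """
    if default_shift <= 0 or (section is not None and
                              (section_shift is None or section_shift <= 0)):
        raise ValueError("shift sizes must be positive")

    starts = []
    pos = 0
    # Only emit frames that fit completely inside the signal.
    while pos + frame_len <= num_samples:
        starts.append(pos)
        in_section = section is not None and section[0] <= pos < section[1]
        pos += section_shift if in_section else default_shift
    return starts


if __name__ == "__main__":
    # Uniform shift of 20 samples.
    print(frame_starts(100, 20, 20))
    # Denser frames (shift 10) between samples 20 and 60.
    print(frame_starts(100, 20, 20, section=(20, 60), section_shift=10))
```

A smaller shift inside the section yields more frames, and therefore more feature vectors, for that part of the signal, while the frame length itself stays at the preset value.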
Abstract:
Embodiments of the invention include a non-volatile memory device manufactured using ion implantation, and a method of manufacturing the same. A dielectric layer may be formed on a semiconductor substrate, and an ion implantation layer, which may be used as a charge trapping site, may be formed by ion implantation with Si or Ge. Then, an annealing process may be performed. Subsequently, a process for forming a transistor on the dielectric layer may be performed.
Abstract:
Disclosed is a method of generating a search network for voice recognition, the method including: generating a pronunciation transduction weighted finite state transducer by implementing a pronunciation transduction rule representing a phenomenon of pronunciation transduction between recognition units as a weighted finite state transducer; and composing the pronunciation transduction weighted finite state transducer and one or more weighted finite state transducers.
Abstract:
Disclosed herein are an apparatus and method for creating an acoustic model. The apparatus includes a binary tree creation unit, an information creation unit, and a binary tree reduction unit. The binary tree creation unit creates a binary tree by repeatedly merging a plurality of Gaussian components for each Hidden Markov Model (HMM) state of an acoustic model based on a distance measure reflecting a variation in likelihood score. The information creation unit creates information about the largest size of the acoustic model in accordance with a platform including a speech recognizer. The binary tree reduction unit reduces the binary tree in accordance with the information about the largest size of the acoustic model.
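The repeated merging of Gaussian components can be sketched as follows. This is a simplified illustration, not the method of the disclosure: it assumes one-dimensional Gaussians stored as (weight, mean, variance) tuples, merges by moment matching, and uses the weighted increase in log-variance as a stand-in for a distance measure that reflects the change in likelihood score. It also merges greedily until a size limit is met instead of storing the full binary tree and cutting it afterwards. All function names are hypothetical.

```python
import math


def merge(a, b):
    """Moment-matched merge of two (weight, mean, variance) Gaussians."""
    (wa, ma, va), (wb, mb, vb) = a, b
    w = wa + wb
    m = (wa * ma + wb * mb) / w
    v = (wa * (va + (ma - m) ** 2) + wb * (vb + (mb - m) ** 2)) / w
    return (w, m, v)


def merge_cost(a, b):
    """Assumed distance: weighted log-variance increase caused by merging."""
    w, _, v = merge(a, b)
    return 0.5 * (w * math.log(v)
                  - a[0] * math.log(a[2])
                  - b[0] * math.log(b[2]))


def reduce_components(components, max_size):
    """Greedily merge the cheapest pair until at most max_size remain."""
    if max_size < 1:
        raise ValueError("max_size must be at least 1")
    comps = list(components)
    while len(comps) > max_size:
        pairs = ((i, j) for i in range(len(comps))
                 for j in range(i + 1, len(comps)))
        i, j = min(pairs, key=lambda p: merge_cost(comps[p[0]], comps[p[1]]))
        merged = merge(comps[i], comps[j])
        # Drop the two merged components and append their parent.
        comps = [c for k, c in enumerate(comps) if k not in (i, j)] + [merged]
    return comps


if __name__ == "__main__":
    state = [(0.25, 0.0, 1.0), (0.25, 0.1, 1.0), (0.5, 5.0, 1.0)]
    print(reduce_components(state, max_size=2))
```

Here `max_size` plays the role of the largest model size allowed by the target platform, expressed as a number of components per HMM state, which is again an assumption of the sketch.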
Abstract:
An apparatus for speech recognition based on source separation and identification includes: a sound source separator for separating mixed signals, which are input to two or more microphones, into sound source signals by using independent component analysis (ICA), and estimating direction information of the separated sound source signals; and a speech recognizer for calculating normalized log likelihood probabilities of the separated sound source signals. The apparatus further includes a speech signal identifier for identifying a sound source corresponding to a user's speech signal by using both the estimated direction information and reliability information based on the normalized log likelihood probabilities.