Abstract:
A speech recognition apparatus, method, and computer program product in which noise is subtracted from an input speech signal by a plurality of spectral subtractions having differing rates of noise subtraction to produce plural noise-subtracted signals, at least one speech feature is extracted from the noise-subtracted signals, and the extracted feature is compared with a standard speech pattern obtained beforehand to recognize the speech signal based on a result of the comparison. In addition, features can be extracted from at least one of the noise-subtracted signals and also from the input speech signal for comparison with the standard speech pattern. Plural features can be combined into a single feature for the comparison.
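The idea of running several spectral subtractions at different rates and combining the results can be sketched as follows. This is a minimal illustration, not the patented method: the function names, the rate values in `alphas`, the spectral floor, and the use of the subtracted power spectra themselves as "features" are all assumptions made for the example (a real front end would typically derive cepstral features from each spectrum).

```python
from typing import List, Sequence


def spectral_subtract(power: Sequence[float], noise: Sequence[float],
                      alpha: float, floor: float = 0.01) -> List[float]:
    """Subtract alpha * noise from each power-spectrum bin.

    A small spectral floor (an assumed safeguard) keeps bins non-negative.
    """
    return [max(p - alpha * n, floor * p) for p, n in zip(power, noise)]


def multi_rate_features(power: Sequence[float], noise: Sequence[float],
                        alphas: Sequence[float] = (0.5, 1.0, 2.0)) -> List[float]:
    """Run one subtraction per rate and concatenate the results
    into a single combined feature vector."""
    combined: List[float] = []
    for alpha in alphas:
        combined.extend(spectral_subtract(power, noise, alpha))
    return combined


# Toy three-bin power spectrum and noise estimate (made-up values).
power = [4.0, 9.0, 1.0]
noise = [1.0, 1.0, 1.0]
features = multi_rate_features(power, noise)
```

The combined vector has one block of bins per subtraction rate, so a downstream comparison against a standard pattern sees both lightly and heavily noise-reduced views of the same frame.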
Abstract:
A noise estimation unit estimates a noise signal in an input signal. A section decision unit distinguishes a target signal section from a noise signal section in the input signal. A noise suppression unit suppresses the noise signal in the input signal based on a first suppression coefficient. A noise excess suppression unit suppresses the noise signal in the input signal based on a second suppression coefficient, which is larger than the first suppression coefficient. A switching unit switches between an output signal from the noise suppression unit and an output signal from the noise excess suppression unit based on a decision result of the section decision unit.
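The switching arrangement described above might look like the sketch below. The abstract does not say which output is used in which section, so this example assumes the excess-suppressed output is selected in noise sections and the normally suppressed output in target sections. The energy-ratio section decision, the coefficient values, and all function names are likewise hypothetical.

```python
from typing import List, Sequence


def suppress(frame_power: Sequence[float], noise_power: Sequence[float],
             coeff: float) -> List[float]:
    """Subtract coeff * noise from each bin, clamping at zero."""
    return [max(p - coeff * n, 0.0) for p, n in zip(frame_power, noise_power)]


def is_target_section(frame_power: Sequence[float], noise_power: Sequence[float],
                      ratio: float = 2.0) -> bool:
    """Assumed section decision: the frame counts as target signal when its
    energy exceeds `ratio` times the estimated noise energy."""
    return sum(frame_power) > ratio * sum(noise_power)


def process_frame(frame_power: Sequence[float], noise_power: Sequence[float],
                  first_coeff: float = 1.0, second_coeff: float = 3.0) -> List[float]:
    """Compute both outputs and switch between them on the section decision."""
    normal = suppress(frame_power, noise_power, first_coeff)
    excess = suppress(frame_power, noise_power, second_coeff)
    return normal if is_target_section(frame_power, noise_power) else excess


noise_estimate = [1.0, 1.0]
speech_out = process_frame([10.0, 8.0], noise_estimate)  # target section
noise_out = process_frame([1.5, 1.0], noise_estimate)    # noise section
```

With a larger second coefficient, residual noise in noise-only frames is driven to zero, while target frames are only mildly attenuated.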
Abstract:
A recognizing target vocabulary comparing unit calculates a compared likelihood of recognizing target vocabulary, i.e., a compared likelihood of registered vocabulary, by using the time series of the amount of characteristics of an input speech. An environment adaptive noise model comparing unit compares the time series of the amount of characteristics with one recognizing standard pattern or with two or more combined recognizing standard patterns one-by-one to obtain a likelihood that respective environment adaptive noise models coincide with the time series of the amount of characteristics. A rejection determining unit determines whether or not the input signal is noise by comparing the likelihood obtained by the recognizing target vocabulary comparing unit with the likelihood obtained by the environment adaptive noise model comparing unit. A noise model adapting unit calculates a first likelihood by comparing the time series of the amount of characteristics with recognizing standard patterns and orders the recognizing standard patterns in accordance with the size of the first likelihood. Thus, the environment adaptive noise model matches the real environment, and rejection determination can be performed with high accuracy for a noise input.
Abstract:
A recognizing target vocabulary comparing unit calculates a compared likelihood of recognizing target vocabulary, i.e., a compared likelihood of registered vocabulary, by using the time series of the amount of characteristics of an input speech. An environment adaptive noise model comparing unit calculates a compared likelihood of a noise model adaptive to a noise environment, i.e., a compared likelihood of environmental noise. A rejection determining unit compares the likelihood of the registered vocabulary with the likelihood of the environmental noise, and determines whether or not the input speech is noise. When it is determined that the input speech is noise, a noise model adapting unit adaptively updates the environment adaptive noise model by using the input speech. Thus, the environment adaptive noise model matches the real environment, and rejection determination can be performed with high accuracy for a noise input.
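The compare-then-adapt loop in this abstract can be illustrated with a small sketch. It assumes, purely for illustration, that the noise model is a single diagonal Gaussian over feature vectors, that adaptation is an exponential smoothing of its mean, and that the vocabulary likelihood arrives as a precomputed score; `NoiseModel`, `reject_and_adapt`, and the `rate` parameter are hypothetical names, not taken from the source.

```python
import math
from typing import Sequence


class NoiseModel:
    """Hypothetical environment adaptive noise model: one diagonal Gaussian."""

    def __init__(self, mean: Sequence[float], var: float = 1.0, rate: float = 0.5):
        self.mean = list(mean)
        self.var = var
        self.rate = rate  # assumed smoothing factor used when adapting

    def log_likelihood(self, frames: Sequence[Sequence[float]]) -> float:
        """Average per-frame log-likelihood of the feature time series."""
        total = 0.0
        for frame in frames:
            for x, m in zip(frame, self.mean):
                total += -0.5 * (math.log(2 * math.pi * self.var)
                                 + (x - m) ** 2 / self.var)
        return total / len(frames)

    def adapt(self, frames: Sequence[Sequence[float]]) -> None:
        """Move the mean toward the average of the rejected input."""
        n = len(frames)
        for d in range(len(self.mean)):
            avg = sum(frame[d] for frame in frames) / n
            self.mean[d] = (1 - self.rate) * self.mean[d] + self.rate * avg


def reject_and_adapt(vocab_ll: float, frames: Sequence[Sequence[float]],
                     model: NoiseModel) -> bool:
    """Reject the input as noise when the noise model scores higher than the
    registered vocabulary; on rejection, adapt the noise model to the input."""
    rejected = model.log_likelihood(frames) > vocab_ll
    if rejected:
        model.adapt(frames)
    return rejected
```

Because adaptation only runs on rejected inputs, the noise model drifts toward whatever the deployment environment actually sounds like, without being pulled toward accepted speech.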
Abstract:
A recognizing target vocabulary comparing unit calculates a compared likelihood of recognizing target vocabulary, i.e., a compared likelihood of registered vocabulary, by using the time series of the amount of characteristics of an input speech. An environment adaptive noise model comparing unit compares the time series of the amount of characteristics with one recognizing standard pattern or with two or more combined recognizing standard patterns one-by-one to obtain a likelihood that respective environment adaptive noise models coincide with the time series of the amount of characteristics. A rejection determining unit determines whether or not the input signal is noise by comparing the likelihood obtained by the recognizing target vocabulary comparing unit with the likelihood obtained by the environment adaptive noise model comparing unit. A noise model adapting unit calculates a first likelihood by comparing the time series of the amount of characteristics with recognizing standard patterns and orders the recognizing standard patterns in accordance with the size of the first likelihood. Thus, the environment adaptive noise model matches the real environment, and rejection determination can be performed with high accuracy for a noise input.
Abstract:
A recognizing target vocabulary comparing unit calculates a compared likelihood of recognizing target vocabulary, i.e., a compared likelihood of registered vocabulary, by using the time series of the amount of characteristics of an input speech. An environment adaptive noise model comparing unit obtains a likelihood that respective recognizing-unit standard patterns coincide with a time series of the amount of characteristics representing the characteristics of the input speech. A first rejection determining unit determines whether or not the input signal is noise on the basis of the likelihood of coincidence obtained by the recognizing target vocabulary comparing unit, and a second rejection determining unit determines whether or not the input signal determined to be noise by the first rejection determining unit is noise, on the basis of the likelihood of coincidence obtained by the recognizing target vocabulary comparing unit and the likelihood of coincidence obtained by the environment adaptive noise model comparing unit. An optimal phoneme series comparing unit determines a likelihood that respective recognizing-unit standard patterns coincide with the time series of the amount of characteristics. When the first or second rejection determining unit determines the input signal to be noise, an environment adaptive recognizing unit selecting unit adaptively updates the order of selection of the recognizing-unit standard patterns on the basis of the likelihood of coincidence obtained by the optimal phoneme series comparing unit. Thus, the environment adaptive noise model matches the real environment, and rejection determination can be performed with high accuracy for a noise input.
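The two-stage rejection and the reordering of recognizing-unit standard patterns can be sketched roughly as below. The abstract does not give the decision rules, so the threshold test in the first stage, the margin test in the second stage, and the default values are assumptions; the function names and the pattern labels in the usage lines are hypothetical.

```python
from typing import Dict, List


def first_rejection(vocab_ll: float, threshold: float) -> bool:
    """Stage 1 (assumed rule): flag as possible noise when the vocabulary
    likelihood alone falls below a threshold."""
    return vocab_ll < threshold


def second_rejection(vocab_ll: float, noise_ll: float, margin: float) -> bool:
    """Stage 2 (assumed rule): confirm noise when the noise model beats the
    vocabulary likelihood by more than a margin."""
    return noise_ll - vocab_ll > margin


def is_noise(vocab_ll: float, noise_ll: float,
             threshold: float = -5.0, margin: float = 0.0) -> bool:
    """Only inputs flagged by stage 1 are re-examined by stage 2."""
    if not first_rejection(vocab_ll, threshold):
        return False
    return second_rejection(vocab_ll, noise_ll, margin)


def reorder_patterns(pattern_ll: Dict[str, float]) -> List[str]:
    """Order recognizing-unit patterns by likelihood, best match first."""
    return sorted(pattern_ll, key=pattern_ll.get, reverse=True)


# Made-up scores for three hypothetical recognizing-unit patterns.
order = reorder_patterns({"a": -4.0, "sil": -1.0, "m": -2.5})
```

In this sketch the selection order would be refreshed with `reorder_patterns` whenever `is_noise` returns true, so the patterns that best explain recent noise are tried first.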
Abstract:
A recognizing target vocabulary comparing unit calculates a compared likelihood of a recognizing target vocabulary, i.e., a compared likelihood of a registered vocabulary, by using the time series of the amount of characteristics of an input speech. An environment adaptive noise model comparing unit calculates a compared likelihood of a noise model adaptive to a noise environment, i.e., a compared likelihood of environmental noise. A rejection determining unit compares the likelihood of the registered vocabulary with the likelihood of the environmental noise, and determines whether or not the input speech is noise. When it is determined that the input speech is noise, a noise model adapting unit adaptively updates the environment adaptive noise model by using the input speech. Thus, the environment adaptive noise model matches the real environment, and rejection determination can be performed with high accuracy for a noise input.