Abstract:
An acoustic model registration apparatus, a talker recognition apparatus, an acoustic model registration method, and an acoustic model registration processing program are provided, each of which reliably prevents an acoustic model with a low talker recognition capability from being registered. When a talker makes N utterances and the utterance sounds are input through the microphone 1, the sound feature quantity extraction part 4 extracts sound feature quantities that indicate the acoustic features of the input utterance sounds, one sound feature quantity per utterance. The talker model generation part 5 generates a talker model based on the extracted sound feature quantities for the N utterances. The collation part 6 calculates the degree of similarity between each sound feature quantity of the N utterances and the generated talker model. Only when all N calculated degrees of similarity are equal to or greater than the threshold value does the similarity verifying part 9 direct that the generated talker model be registered in the talker model database as a talker model for talker recognition.
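The registration gate described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the abstract does not say how the talker model or the degree of similarity is computed, so the mean-vector model, the cosine similarity, and every function name below are assumptions made for the example.

```python
import numpy as np

def build_talker_model(features):
    """Hypothetical talker model: the mean of the per-utterance feature vectors."""
    return np.mean(np.asarray(features, dtype=float), axis=0)

def similarity(feature, model):
    """Hypothetical degree of similarity: cosine similarity to the model."""
    feature = np.asarray(feature, dtype=float)
    denom = np.linalg.norm(feature) * np.linalg.norm(model)
    return float(feature @ model / denom) if denom > 0 else 0.0

def try_register(features, threshold, database, talker_id):
    """Register the model only if every one of the N utterances matches it."""
    model = build_talker_model(features)
    scores = [similarity(f, model) for f in features]
    if all(score >= threshold for score in scores):
        database[talker_id] = model  # all N similarities reached the threshold
        return True
    return False  # at least one utterance disagrees with the model
```

A single outlying utterance is enough to block registration, which is the behaviour the abstract attributes to the similarity verifying part 9.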
Abstract:
A contents presenting system includes: an analyzing unit which collects and analyzes a user's conversation to output an analysis result; a contents acquiring unit which acquires contents from a contents database based on the analysis result; and a contents presenting unit which presents the acquired contents to the user. Since the analysis result of the user's conversation includes a factor representing the environment in which the user is talking, determining contents based on the analysis result makes it possible to provide contents that are suited to the environment in which the user is present.
Abstract:
A quantization error correcting device corrects quantization error included in audio information at the time of decoding. The audio information is divided into a plurality of frequency bands and compression-encoded for each frequency band with a bit allocation determined based on audible frequency characteristics. The device includes: a detecting unit for detecting, based on bit allocation information indicating the bit allocation and on encoded values of the compression-encoded audio information, a range of quantization error, that is, the range in which the audio information value before compression-encoding corresponding to the encoded value exists; and an outputting unit for outputting a decoded value corresponding to one of the encoded values based on the detected range of quantization error and the ranges of quantization error of other, correlated encoded values.
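One way to picture the two units is the sketch below. It assumes a uniform quantizer over [-full_scale, full_scale) and treats the immediately adjacent encoded values as the "correlated" ones; neither assumption comes from the abstract, and the function names are hypothetical.

```python
def error_range(code, bits, full_scale=1.0):
    """Range of original values that a uniform quantizer maps to `code`.

    Assumption: `bits` bits cover [-full_scale, full_scale) uniformly.
    """
    step = 2.0 * full_scale / (1 << bits)
    low = -full_scale + code * step
    return low, low + step

def decode_with_neighbors(codes, bits_per_value, index):
    """Decode codes[index] using the error ranges of adjacent values.

    Assumption: the estimate is the mean of the neighbours' range midpoints,
    clamped so it never leaves the detected range of the value itself.
    """
    low, high = error_range(codes[index], bits_per_value[index])
    neighbors = [i for i in (index - 1, index + 1) if 0 <= i < len(codes)]
    if not neighbors:
        return (low + high) / 2.0  # no context: fall back to the midpoint
    midpoints = [sum(error_range(codes[i], bits_per_value[i])) / 2.0
                 for i in neighbors]
    estimate = sum(midpoints) / len(midpoints)
    return min(max(estimate, low), high)
```

With a coarsely quantized value between two finely quantized neighbours, the decoded value follows the neighbours instead of jumping to the middle of its own wide range, while still staying inside that range.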
Abstract:
A harmonic tone generator produces a harmonics signal even for input audio signals of small amplitude. Conversion of a digitized audio signal in accordance with a predetermined non-linear function is also performed for an audio signal of small amplitude. According to the second aspect of the invention, the level difference between the digital audio signal level at the present sampling time and the audio signal level at the preceding sampling time is detected, and the detected level difference is converted to an output value in accordance with a predetermined non-linear function by a non-linear converting circuit. The converted output value is accumulated. According to the third aspect of the invention, the detected level difference is converted to a function conversion output in accordance with a predetermined function by a non-linear converting circuit. The gain of an amplifier that amplifies the audio signal at the present sampling time is changed in accordance with the function conversion output.
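The second aspect (difference, non-linear conversion, accumulation) can be sketched as below. The abstract does not name the non-linear function, so the signed square root used here is only a placeholder assumption, chosen because it boosts small differences; the function names are likewise hypothetical.

```python
import math

def signed_sqrt(difference):
    """Placeholder non-linear function (an assumption, not from the source)."""
    return math.copysign(math.sqrt(abs(difference)), difference)

def harmonics_from_differences(samples, nonlinear=signed_sqrt):
    """Accumulate the non-linearly converted sample-to-sample differences."""
    output = []
    accumulated = 0.0
    previous = 0.0  # assumed level before the first sample
    for level in samples:
        difference = level - previous      # level difference detection
        accumulated += nonlinear(difference)  # non-linear conversion, then accumulation
        output.append(accumulated)
        previous = level
    return output
```

For example, `harmonics_from_differences([0.0, 0.25, 0.25, 0.0])` gives `[0.0, 0.5, 0.5, 0.0]`: the small step of 0.25 is expanded to 0.5 by the placeholder function.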
Abstract:
A sound echo machine serving as an acoustic signal processing unit of the present invention comprises an adder to which an input signal is fed, and a delay circuit that delays the signal from the adder by a certain time and repeatedly feeds it back to the adder to generate an echo sound. It further comprises an input signal level detector that detects the level of the input signal and sends it to a frequency oscillator, whose oscillating frequency varies in accordance with the detected signal level; the oscillator output is fed to the delay circuit to modulate the delay time at a predetermined cycle. The machine can thereby create an acoustic field in which a listener feels as if reflected sounds of various levels are arriving from various directions. In addition, a sound effecter serving as an acoustic signal processing unit comprises a plurality of acoustic signal processing sections, a plurality of attenuators each connected to one of these acoustic signal processing sections, and an adder for summing the signals from these attenuators. It further comprises a signal mixing ratio control section that monitors the input acoustic signal level and determines a signal mixing ratio among the respective output signals of the acoustic signal processing sections in accordance with the monitored level, so that even a simple structure can provide a specific sound effect.
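The echo machine in the first half of this abstract can be approximated digitally as a feedback delay line whose delay time is modulated by a level-controlled oscillator. The sketch below is an assumption-laden illustration: the one-pole envelope follower, the sine oscillator, and all parameter names and default values are hypothetical and do not come from the source.

```python
import math

def level_modulated_echo(samples, sample_rate=8000, base_delay=400, depth=80,
                         feedback=0.5, base_freq=0.5, freq_per_level=4.0):
    """Feedback echo; delays are in samples, oscillator frequencies in Hz."""
    max_delay = base_delay + depth
    buffer = [0.0] * (max_delay + 1)  # circular delay line
    write = 0
    phase = 0.0
    level = 0.0
    output = []
    for x in samples:
        # Input signal level detector (assumed one-pole envelope follower).
        level = 0.99 * level + 0.01 * abs(x)
        # Oscillator frequency follows the detected level.
        phase += 2.0 * math.pi * (base_freq + freq_per_level * level) / sample_rate
        # The oscillator output modulates the delay time.
        delay = int(round(base_delay + depth * math.sin(phase)))
        delay = max(1, min(delay, max_delay))
        delayed = buffer[(write - delay) % len(buffer)]
        y = x + feedback * delayed  # adder: input plus fed-back delayed signal
        buffer[write] = y           # adder output enters the delay circuit
        write = (write + 1) % len(buffer)
        output.append(y)
    return output
```

With `depth=0` the modulation is switched off and the function reduces to a plain repeating echo, which is a convenient way to check the feedback path on its own.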
Abstract:
An operator recognition device is provided that excludes from registration data, such as HMM data, whose characteristic amount easily causes recognition errors when recognizing an operator, thereby reducing the possibility of recognition errors and providing stable recognition performance. When registering HMM data for use in recognition processing, a speaker recognition device 100 rejects HMM data of a password whose spoken voice component has a characteristic amount similar to that indicated by already registered HMM data, and thus does not allow the registration of HMM data that is estimated to cause recognition errors easily during the recognition process.
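The rejection rule can be illustrated with the sketch below. Real HMM data is far richer than a single vector; here each password is reduced to one hypothetical "characteristic amount" vector and compared by cosine similarity, both of which are simplifying assumptions for the example rather than details from the abstract.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two characteristic-amount vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def register_if_distinct(name, candidate, registered, max_similarity=0.9):
    """Refuse a password that is too similar to one already registered."""
    for existing in registered.values():
        if cosine_similarity(candidate, existing) >= max_similarity:
            return False  # likely to be confused during recognition
    registered[name] = np.asarray(candidate, dtype=float)
    return True
```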
Abstract:
A speaker recognition system (1) includes a speaker model registration device (10) which registers a speaker model for speaker recognition in the speaker recognition system. The speaker model registration device includes acquisition means (13) for acquiring n+α utterances (where n is an integer not smaller than 2 and α is an integer not smaller than 1); calculation means (20) for calculating a speaker model by using n of the acquired utterances as utterances for registration; correlation means (30) for correlating the calculated speaker model by using the remaining α acquired utterances as correlation utterances; and registration means (40) for registering, as the speaker model for speaker recognition, those of the correlated speaker models whose correlation result satisfies a predetermined reference.
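The split into n registration utterances and α correlation utterances can be sketched as follows. The mean-vector model, the cosine score, the choice of the first n utterances for registration, and the rule that every held-out utterance must reach the reference are all assumptions made for this illustration; the abstract specifies none of them.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def register_with_holdout(utterances, n, reference, database, speaker_id):
    """Build a model from n utterances and check it against the other alpha."""
    features = np.asarray(utterances, dtype=float)
    alpha = len(features) - n
    if n < 2 or alpha < 1:
        raise ValueError("need n >= 2 registration and alpha >= 1 correlation utterances")
    model = features[:n].mean(axis=0)  # hypothetical speaker model
    scores = [cosine_similarity(f, model) for f in features[n:]]
    if min(scores) >= reference:       # assumed form of the predetermined reference
        database[speaker_id] = model
        return True
    return False
```

Because the correlation utterances are not used to build the model, the check measures how well the model generalizes to new speech from the same speaker.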
Abstract:
The present invention is directed to providing a data selecting apparatus and a navigation apparatus capable of easily and promptly selecting one piece of data from a plurality of pieces of data. A navigation apparatus 100 has: a display controller 111 for obtaining name data and genre information of each piece of point data from a map data storing unit 105, generating display data for displaying names of the point data, arranged by genre information at the same hierarchical level, on the basis of the obtained name data and genre information, and performing display control on the generated display data; and an operating unit 106 used for selecting the genre to which the point data to be selected by the user belongs and for selecting the name of one piece of point data from the selected genre. The display controller 111 either generates display data for displaying the genre selected by the operating unit 106 and performs display control on that display data, or generates display data at the time the name of point data is selected by the operating unit 106 and performs display control on that display data, in conjunction with the selecting operation executed by using the operating unit 106.
Abstract:
A speech recognition apparatus and a speech recognition method are provided for reducing events such as erroneous recognition and failed recognition and for improving recognition efficiency. The speech recognition apparatus generates a word model based on a dictionary memory and a sub-word sound model, and matches the word model with a speech input signal in accordance with a predetermined algorithm to perform speech recognition on the speech input signal. The apparatus comprises: main matching means, operative when matching the word model with the speech input signal along a processing path indicated by the algorithm, for limiting the processing path based on a course command to select the word model that most closely approximates the speech input signal; local template storing means for typifying local sound features of spoken speech in advance and storing them as local templates; and local matching means for matching each component section of the speech input signal with the local templates stored in the local template storing means to definitely determine a sound feature for each component section, and for generating the course command in accordance with the result of the definite determination.
Abstract:
A speech synthesizing method which synthesizes speech naturally is disclosed. Standardized frame power values of an n-th frame are calculated when the frame power values at the head and tail frames in a phoneme are standardized. An average of the power values sampled from the power frequency characteristics in the n-th frame at a predetermined frequency interval is set as a mean frame power value. A sum of squares of the signal levels in one frame of a frequency signal from a sound source is calculated as a frame power correction value. A speech envelope signal is calculated as a function whose variables are the standardized frame power values, the frame power correction value, and the mean frame power value. The amplitude level of a speech waveform signal supplied from a vocal tract filter is adjusted according to the level of the speech envelope signal.
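Two of the quantities named in this abstract are defined concretely enough to sketch. The helper names below are hypothetical, and sampling every `interval`-th bin of a discrete power spectrum is an assumed reading of "at a predetermined frequency interval". The envelope function itself is not given in the abstract and is deliberately left out.

```python
def frame_power_correction(frame):
    """Sum of squares of the signal levels in one frame."""
    return sum(level * level for level in frame)

def mean_frame_power(power_spectrum, interval):
    """Average of the power values sampled every `interval` bins."""
    if interval < 1:
        raise ValueError("interval must be a positive integer")
    sampled = power_spectrum[::interval]
    if not sampled:
        raise ValueError("power spectrum is empty")
    return sum(sampled) / len(sampled)
```

For example, `frame_power_correction([1.0, -2.0, 2.0])` is `9.0`, and `mean_frame_power([4.0, 1.0, 2.0, 9.0, 6.0], 2)` averages the bins 4.0, 2.0 and 6.0 to give `4.0`.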