Method and apparatus for speech recognition
    1.
    Granted invention patent
    Method and apparatus for speech recognition (expired)

    Publication No.: US4817159A

    Publication date: 1989-03-28

    Application No.: US616836

    Filing date: 1984-06-04

    IPC classification: G10L15/00 G10L5/00

    CPC classification: G10L15/00

    Abstract: Speech parameters (P_h and P_l) for consonant classification and recognition are derived by separating a speech signal into low and high frequency bands, obtaining the first time derivative in each band, and taking the min-max difference (power dip) of that derivative in each band to give P_h and P_l. The distribution of P_h and P_l in a two-dimensional discriminant diagram classifies the consonant phoneme.
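The parameter derivation in the abstract above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the band splitting has already been done and that each band is available as a per-frame power envelope, and it uses a simple first difference as a stand-in for the time derivative. The function names and the sample envelopes are hypothetical.

```python
def first_difference(envelope):
    """Discrete stand-in for the first time derivative of a band envelope."""
    return [b - a for a, b in zip(envelope, envelope[1:])]


def power_dip(envelope):
    """Min-max difference of the first derivative over the segment."""
    diff = first_difference(envelope)
    if not diff:
        return 0.0
    return max(diff) - min(diff)


def dip_parameters(low_band, high_band):
    """Return (P_h, P_l) for one consonant segment (assumed pre-segmented)."""
    return power_dip(high_band), power_dip(low_band)


# Hypothetical per-frame envelopes for the two bands.
low = [10.0, 9.0, 4.0, 3.0, 8.0, 10.0]
high = [5.0, 5.0, 4.0, 4.0, 5.0, 5.0]
p_h, p_l = dip_parameters(low, high)
```

The pair (p_h, p_l) is the point that would be placed on the two-dimensional discriminant diagram; the decision boundaries of that diagram are not given in the abstract, so none are assumed here.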


    Voice recognition method for recognizing a word in speech
    2.
    Granted invention patent
    Voice recognition method for recognizing a word in speech (expired)

    Publication No.: US5692097A

    Publication date: 1997-11-25

    Application No.: US347089

    Filing date: 1994-11-23

    CPC classification: G10L15/12

    Abstract: An inter-frame similarity between an input voice and a standard patterned word is calculated for each frame and for each standard patterned word, and a posterior probability similarity is produced by subtracting a constant value from each inter-frame similarity. The constant value is determined by analyzing voice data obtained from specified persons, so that the posterior probability similarities are positive when a word in the input voice matches the standard patterned word and negative when it does not. Thereafter, for each standard patterned word, an accumulated similarity is calculated by accumulating the posterior probability similarities over the frames of the input voice according to a continuous dynamic programming matching operation. Finally, the standard patterned word whose accumulated similarity is the maximum among the accumulated similarities is output as the recognized word of the input voice.
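The scoring scheme in the abstract above can be sketched roughly as follows. This is only an illustration under stated assumptions: the inter-frame similarities are given as a matrix, the constant is supplied by the caller, and the dynamic programming path allows only "stay" and "advance by one template frame" moves with a free starting point. The actual path constraints of the patent are not stated in the abstract, and all names and numbers here are hypothetical.

```python
def accumulated_similarity(frame_sims, constant):
    """Best accumulated score of a template ending at any input frame.

    frame_sims[t][j] is the similarity between input frame t and
    template frame j. Subtracting `constant` makes matching frames
    score positive and mismatching frames score negative.
    """
    n_template = len(frame_sims[0])
    neg_inf = float("-inf")
    prev = [neg_inf] * n_template
    best = neg_inf
    for row in frame_sims:
        cur = []
        for j, sim in enumerate(row):
            score = sim - constant
            if j == 0:
                # Free start: a word may begin at any input frame.
                cur.append(score + max(0.0, prev[0]))
            else:
                # Stay on template frame j, or advance from j - 1.
                cur.append(score + max(prev[j], prev[j - 1]))
        best = max(best, cur[-1])  # score of paths ending the template here
        prev = cur
    return best


# Hypothetical similarities for two template words, two input frames.
similarities = {
    "A": [[0.9, 0.1], [0.2, 0.8]],
    "B": [[0.3, 0.2], [0.1, 0.4]],
}
scores = {w: accumulated_similarity(s, 0.5) for w, s in similarities.items()}
recognized = max(scores, key=scores.get)
```

Because mismatching frames contribute negative values, a template that does not occur in the input accumulates a negative score, which is what lets the maximum over templates act as the recognition decision.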


    Method of speech recognition
    3.
    Granted invention patent
    Method of speech recognition (expired)

    Publication No.: US5309547A

    Publication date: 1994-05-03

    Application No.: US897131

    Filing date: 1992-06-11

    IPC classification: G10L15/10 G10L5/00

    CPC classification: G10L15/10

    Abstract: A method of speech recognition includes the steps of analyzing input speech every frame and deriving feature parameters from the input speech, generating an input vector from the feature parameters of a plurality of frames, and periodically calculating partial distances between the input vector and partial standard patterns while shifting the frame one by one. Each standard pattern corresponds to one recognition-object word and is composed of partial standard patterns, each of which represents a part of that word. The partial distances are accumulated into distances between the input speech and the standard patterns, one distance per recognition-object word. The distances are compared with each other, and the minimum distance is selected when the input speech ends. The recognition-object word corresponding to the minimum distance is taken as the recognition result.


    Method of speech recognition
    4.
    Granted invention patent
    Method of speech recognition (expired)

    Publication No.: US5345536A

    Publication date: 1994-09-06

    Application No.: US808692

    Filing date: 1991-12-17

    IPC classification: G10L15/10 G10L5/06

    CPC classification: G10L15/10

    Abstract: A set of "m" feature parameters is generated every frame from reference speech which is spoken by at least one speaker and which represents recognition-object words, where "m" denotes a preset integer. A set of "n" types of standard patterns is generated in advance on the basis of speech data of a plurality of speakers, where "n" denotes a preset integer. Matching between the feature parameters of the reference speech and each of the standard patterns is executed to generate, every frame, a vector of "n" reference similarities between the feature parameters of the reference speech and the standard patterns. The reference similarity vectors of the respective frames are arranged into temporal sequences, one for each recognition-object word. These reference similarity vector sequences are registered in advance as dictionary similarity vector sequences. Input speech to be recognized is analyzed to generate "m" feature parameters from the input speech. Matching between the feature parameters of the input speech and the standard patterns is executed to generate, every frame, a vector of "n" input-speech similarities between the feature parameters of the input speech and the standard patterns. The input-speech similarity vectors of the respective frames are arranged into a temporal sequence. The input-speech similarity vector sequence is collated with the dictionary similarity vector sequences to recognize the input speech.


    Apparatus for speech recognition
    5.
    Granted invention patent
    Apparatus for speech recognition (expired)

    Publication No.: US4885791A

    Publication date: 1989-12-05

    Application No.: US920785

    Filing date: 1986-10-20

    IPC classification: G10L15/00

    CPC classification: G10L15/00

    Abstract: Disclosed is a speech recognition apparatus comprising: a speech analysis portion for extracting parameters necessary for determination of spoken words; a speech period detecting portion for extracting one or more combinations of speech periods using the parameters; and a structure analysis portion for detecting feature points indicative of the phoneme structure of each word and for determining a word through computation of similarity to proposed words in accordance with the presence and absence of the feature points. Erroneous recognition due to noise introduction or the like can therefore be reduced, because the speech period detecting portion detects one or more combinations of proposed speech periods. By extracting only the necessary number of extraction points, which contribute to the distinction between words, with reference to the analysis procedure provided for each word, the sharpness of determination is improved. More stable operation than conventional apparatus has been achieved in connection with time base expansion/compression. A small number of parameters obtained through speech analysis is used to reduce the amount of computation, and these parameters are stable against differences in phonemes due to differences between speakers.


    Apparatus for speech recognition
    6.
    Granted invention patent
    Apparatus for speech recognition (expired)

    Publication No.: US4736429A

    Publication date: 1988-04-05

    Application No.: US618368

    Filing date: 1984-06-07

    IPC classification: G10L15/10 G10L15/04 G10L5/00

    CPC classification: G10L15/04

    Abstract: Apparatus for speech recognition, having each phoneme as a fundamental recognition unit, recognizes input speech by discriminating phonemes in the input speech. The apparatus comprises a memory for storing phoneme standard patterns of phonemes or phoneme groups; a spectrum analyzer for obtaining parameters indicative of the input speech signal spectrum; a similarity calculator for calculating, with a statistical distance measure, the degree of similarity between the output of the spectrum analyzer and the standard patterns stored in the memory; a segmentation portion for segmenting by using time-dependent low- and high-frequency power variations of the input speech signal and results from the similarity calculator; and a phoneme discriminator for recognizing phonemes by using the results from the similarity calculator.


Method of and apparatus for speech recognition wherein decisions are made based on phonemes
    7.
    Granted invention patent
    Method of and apparatus for speech recognition wherein decisions are made based on phonemes (expired)

    Publication No.: US5131043A

    Publication date: 1992-07-14

    Application No.: US441225

    Filing date: 1989-11-20

    IPC classification: G10L15/00

    CPC classification: G10L15/10

    Abstract: Linear prediction coefficients of a speech signal including unknown words are derived for each of successive periodic frame intervals. For every frame over the duration of an individual phoneme of the speech signal, the degree of similarity between stored coefficients of known words and the derived coefficients of the unknown words is calculated, so that the degree of similarity is available at the end of the individual phoneme. Phoneme segmentation data are derived in response to the speech signal and combined with the calculated degree of similarity over the individual phoneme to derive phoneme strings of the speech signal. The derived and stored phoneme strings are compared to indicate the words stored in a word dictionary having the greatest similarity with the derived phoneme strings.
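The final dictionary comparison in the abstract above can be sketched as follows. The abstract does not say how similarity between phoneme strings is measured, so this sketch swaps in the ratio from Python's standard `difflib.SequenceMatcher` purely as a stand-in. The dictionary entries, phoneme symbols, and function names are hypothetical.

```python
from difflib import SequenceMatcher


def phoneme_similarity(derived, stored):
    """Stand-in similarity between two phoneme strings (lists of symbols)."""
    return SequenceMatcher(None, derived, stored).ratio()


def best_word(derived, dictionary):
    """Return the dictionary word whose stored string is most similar."""
    return max(
        dictionary,
        key=lambda word: phoneme_similarity(derived, dictionary[word]),
    )


# Hypothetical word dictionary of stored phoneme strings.
word_dictionary = {
    "tokyo": ["t", "o", "k", "y", "o"],
    "kyoto": ["k", "y", "o", "t", "o"],
}

# Hypothetical derived string with one misrecognized phoneme (g for k).
derived_string = ["t", "o", "g", "y", "o"]
recognized = best_word(derived_string, word_dictionary)
```

A similarity-based lookup of this kind tolerates isolated phoneme errors, since a single wrong symbol lowers the score of the correct word only slightly.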


    Method and circuit arrangement for shaping a signal waveform
    8.
    Granted invention patent
    Method and circuit arrangement for shaping a signal waveform (expired)

    Publication No.: US4367441A

    Publication date: 1983-01-04

    Application No.: US206826

    Filing date: 1980-11-14

    IPC classification: H03K5/007 H03K5/08

    CPC classification: H03K5/082

    Abstract: An input signal is applied to a clamping circuit, where the level of the input signal is clamped at a desired base line voltage in response to a clamping control signal. The clamping control signal is a pulse train signal, and pulses are applied to the clamping circuit only when the level of the output signal of the clamping circuit is within a given range. Therefore, when the level rises slowly, the level of the input signal is intermittently clamped so that the output signal level is maintained close to the base line voltage. When the rate of rise exceeds a given value, clamping is not performed, so that the output level follows the input level. To determine whether the level of the output signal is within the given range, a reference voltage higher than the base line voltage is used when shaping a two-level signal. If the input signal is a three-level signal, another reference voltage, lower than the base line voltage, is additionally used.
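The clamping behavior described in the abstract above can be illustrated with an idealized discrete-time model. This is a sketch under assumptions, not the circuit itself: the clamp is modeled as an instantaneous offset correction, a clamp pulse is available at every sample, and the "given range" for the two-level case is taken to be any output level at or below the reference voltage. All names and sample values are hypothetical.

```python
def shape_two_level(samples, base=0.0, reference=1.0):
    """Idealized gated clamp for a two-level signal.

    The output is the input plus a stored offset. A clamp pulse is
    applied only while the output is at or below `reference`; it resets
    the offset so that the output sits at `base`.
    """
    offset = 0.0
    out = []
    for x in samples:
        y = x + offset
        if y <= reference:
            # Output within range: clamp to the base line.
            offset = base - x
            y = base
        # Otherwise no pulse is applied and the output follows the input.
        out.append(y)
    return out


# Hypothetical input: slow upward drift, a fast pulse, then drift again.
drifting_input = [0.0, 0.2, 0.4, 0.6, 3.6, 3.6, 0.8, 1.0]
shaped = shape_two_level(drifting_input)
```

In this model the slow drift never moves the output past the reference, so it is clamped away step by step, while the fast edge jumps past the reference in a single step and is passed through with its height measured from the last clamped level.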


    Voice recognition method
    9.
    Granted invention patent
    Voice recognition method (expired)

    Publication No.: US5241649A

    Publication date: 1993-08-31

    Application No.: US628987

    Filing date: 1990-12-17

    Applicant: Katsuyuki Niyada

    Inventor: Katsuyuki Niyada

    IPC classification: G10L15/08 G10L15/10

    CPC classification: G10L15/10

    Abstract: In a voice recognition method, a d-by-J dimensioned reference voice pattern is prepared for each target word, where J denotes a predetermined number of frames and d denotes a predetermined number of characterizing parameters per frame. A spoken input word is partitioned between its start and end points into J frames, and d characteristic parameters are extracted for each frame to form a d-by-J dimensioned input time-series vector. The resemblance between the input vector and each of the reference voice patterns is then calculated using a statistical distance scale, and the spoken word is identified with the reference pattern providing the highest resemblance. The method requires fewer calculations and yet attains a high recognition rate through the normalization of the input voice word for both spectrum and time.
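The fixed-length matching in the abstract above can be sketched as follows. Several simplifications are assumed and are not taken from the patent: time normalization is done by picking J evenly spaced frames from an already-analyzed frame sequence, and the statistical distance is a Mahalanobis distance with a diagonal covariance (per-dimension variances). The reference patterns, values, and names are hypothetical.

```python
def normalize_to_j_frames(frames, j):
    """Pick J evenly spaced frames and flatten them into one d*J vector."""
    n = len(frames)
    if j == 1:
        picked = [frames[0]]
    else:
        picked = [frames[round(k * (n - 1) / (j - 1))] for k in range(j)]
    return [value for frame in picked for value in frame]


def diagonal_mahalanobis(vector, mean, variance):
    """Squared distance with per-dimension variances (assumed diagonal)."""
    return sum((x - m) ** 2 / v for x, m, v in zip(vector, mean, variance))


def recognize(frames, references, j):
    """Return the word whose reference pattern is closest to the input."""
    vector = normalize_to_j_frames(frames, j)
    return min(
        references,
        key=lambda w: diagonal_mahalanobis(vector, *references[w]),
    )


# Hypothetical input: 5 frames with d = 2 parameters each, J = 3.
input_frames = [[1.0, 0.0], [1.0, 0.0], [0.5, 0.5], [0.0, 1.0], [0.0, 1.0]]
reference_patterns = {
    "yes": ([1.0, 0.0, 0.5, 0.5, 0.0, 1.0], [1.0] * 6),
    "no": ([0.0, 1.0, 0.5, 0.5, 1.0, 0.0], [1.0] * 6),
}
word = recognize(input_frames, reference_patterns, j=3)
```

Because every word is reduced to the same d-by-J length, one distance evaluation per reference pattern replaces frame-by-frame alignment, which is consistent with the abstract's claim of fewer calculations.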


    Electronic engraving and recording system
    10.
    Granted invention patent
    Electronic engraving and recording system (expired)

    Publication No.: US3950608A

    Publication date: 1976-04-13

    Application No.: US444678

    Filing date: 1974-02-21

    IPC classification: B42D15/10

    Abstract: An electronic engraving and recording system comprising a television camera for picking up an image of an object and converting this image into an electrical signal, means for generating control signals on the basis of the synchronizing signal used in the television camera, memory means for storing the electrical signal under control of the control signals, means for engraving and recording the image according to the signal read out from the memory means under control of the control signals, and monitoring display means for displaying the visible image of the object in response to the application of the signal read out from the memory means. An image of an object can be simply engraved and recorded on a card within a short period of time without requiring any photographic original of the object.
    This invention relates to electronic engraving and recording systems, and more particularly to a system of the kind described above which can electronically engrave and record an image of an object on a sheet, such as a card of suitable material, which has a flat and smooth surface and is highly resistant to wear.
    Various attempts have heretofore been made to identify the true user of an identification card such as a credit card, ID card, bank card, cash dispenser card, oil card, key card, consultation ticket, communication ticket, or license card. For instance, in one of the prior attempts, a photographic print of the face or other features of a user is affixed to a base plate of a card. In another prior art attempt, a photographic print of the face or other features of a user is utilized as an original to produce a printing plate, which is used to print a picture of the face or other features of the user on a base plate of a card.
    These methods have, however, had various defects. For example, the former method of affixing a photographic picture of, for example, the face of a user is unsatisfactory from the standpoint of preventing possible forgery. That is, if the card were stolen or lost, the card may be used illicitly by replacing the photograph affixed to the card, and the true user of the card may suffer unexpected damage. Another defect of this prior art method resides in the fact that the thickness of the photograph-bearing portion of the card is increased by the thickness of the photograph, so that when, for example, the card is magnetically verified by a suitable mechanical apparatus, an inconvenience is frequently encountered in the mechanical handling of the card. In the latter method, which resorts to printing, an individual printing plate must be prepared for producing a single card. Thus, this latter method is also defective in that the manufacturing cost of such a card increases considerably. Further, either of these prior art methods is defective in that the photographic picture or printed picture of the face of the user on the base plate of the card is inferior in durability, and therefore tends to be worn away during prolonged use of the card to such an extent that its identifying function is finally lost, making it difficult to certify the identity of the user during the valid term of the card.
    It is therefore an object of the present invention to provide an electronic engraving and recording system which eliminates the necessity for preparing a photographic original of an object such as a person.
    Another object of the present invention is to provide an electronic engraving and recording system in which a visible image, which is a magnified image of an image to be engraved on a card and has exactly the same magnification with respect to the length and width, is displayed on a visible image display means so that such image can be easily monitored.
    The electronic engraving and recording system according to the present invention includes image pickup means for picking up an image of an object such as a person whose picture is to be engraved on a card, signal storage means for storing a digital signal obtained by converting an analog signal representative of a still picture of the image picked up by the pickup means, means for reading out the digital signal stored in the storage means and converting it into an analog signal, visible image display means or monitoring means for displaying a visible image in response to the application of the analog signal from the D-A converting means, and engraving means for engraving the image corresponding to the still picture of the object on a card which is highly resistant to wear.
