VOICE EMPHASIZING DEVICE AND VOICE EMPHASIZING METHOD
    1.
    Patent application
    Legal status: in force

    Publication No.: US20100070283A1

    Publication date: 2010-03-18

    Application No.: US12447775

    Filing date: 2008-09-29

    IPC classification: G10L21/00

    Abstract: A voice emphasizing device emphasizes in a speech a “strained rough voice” at a position where a speaker or user of the speech intends to generate emphasis or musical expression. Thereby, the voice emphasizing device can provide the position with emphasis of anger, excitement, tension, or an animated way of speaking, or musical expression of Enka (Japanese ballad), blues, rock, or the like. As a result, rich vocal expression can be achieved. The voice emphasizing device includes: an emphasis utterance section detection unit (12) detecting, from an input speech waveform, an emphasis section that is a time duration having a waveform intended by the speaker or user to be converted; and a voice emphasizing unit (13) increasing fluctuation of an amplitude envelope of the waveform in the detected emphasis section.
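
    As a rough illustration of the last step in the abstract, the sketch below increases the fluctuation of the amplitude envelope inside an already-detected emphasis section by applying a periodic gain. The function name `emphasize_section`, the sinusoidal modulation, and the `mod_freq_hz` and `depth` defaults are assumptions made for illustration; the abstract does not say how the fluctuation is produced or how the section is detected.

```python
import math


def emphasize_section(samples, start, end, sample_rate,
                      mod_freq_hz=80.0, depth=0.4):
    """Return a copy of `samples` with a periodic gain applied on [start, end).

    mod_freq_hz and depth are illustrative values, not taken from the patent.
    """
    out = list(samples)
    for n in range(max(start, 0), min(end, len(out))):
        t = (n - start) / sample_rate  # seconds since the section started
        gain = 1.0 + depth * math.sin(2.0 * math.pi * mod_freq_hz * t)
        out[n] = samples[n] * gain     # envelope now swings by +/- depth
    return out


# Usage: a constant-amplitude signal with an emphasis section at samples 100..199.
flat = [1.0] * 300
rough = emphasize_section(flat, 100, 200, sample_rate=8000)
```

    Samples outside the section are returned unchanged, so only the detected time span acquires the added envelope fluctuation.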

    Method and apparatus for speech recognition
    2.
    Granted patent
    Legal status: lapsed

    Publication No.: US4817159A

    Publication date: 1989-03-28

    Application No.: US616836

    Filing date: 1984-06-04

    IPC classification: G10L15/00 G10L5/00

    CPC classification: G10L15/00

    Abstract: Speech parameters (P_h and P_l) are derived for consonant classification and recognition by separating a speech signal into low and high frequency bands, then obtaining in each band the time first-derivative, from which the min-max differences (power dips) P_h and P_l are obtained. The distribution of P_h and P_l in a two-dimensional discriminant diagram classifies the consonant phoneme.
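
    The power-dip parameter described in the abstract can be sketched as follows. This is a minimal sketch that assumes the band splitting has already been done and that each band is given as a per-frame power sequence; the function names `power_dip` and `dip_features` are hypothetical, and the discriminant boundaries used for the final classification are not given in the abstract, so they are left out.

```python
def power_dip(band_power):
    """Min-max difference of the first time-derivative of a band-power sequence."""
    if len(band_power) < 2:
        raise ValueError("need at least two frames to take a derivative")
    # First difference between consecutive frames approximates the time derivative.
    deriv = [b - a for a, b in zip(band_power, band_power[1:])]
    return max(deriv) - min(deriv)


def dip_features(low_band_power, high_band_power):
    """Return (P_l, P_h), the point to be placed on the two-dimensional diagram."""
    return power_dip(low_band_power), power_dip(high_band_power)


# Usage: a dip in the low band and a flat high band.
p_l, p_h = dip_features([10, 9, 4, 3, 8, 10], [5, 5, 5, 5, 5, 5])
```

    A sharp fall followed by a sharp rise yields a large derivative range, which is why the min-max difference serves as a measure of the dip.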

    VEHICLE CONTROL DEVICE AND VEHICLE CONTROL METHOD
    3.
    Patent application
    Legal status: in force

    Publication No.: US20110178680A1

    Publication date: 2011-07-21

    Application No.: US13075232

    Filing date: 2011-03-30

    IPC classification: G06F7/00 B60T7/12 B62D6/00

    Abstract: A vehicle control device (10) is provided that can predict a driving operation of a driver earlier to respond to the driving operation quickly. The vehicle control device (10) includes: a posture measuring unit (11) to measure a posture indicating a state of at least one of the buttock region, the upper pelvic region, and the driver's leg opposite to the other leg with which the driver operates a brake or an accelerator; a posture change detection unit (12) to detect a change in the measured posture; a preparatory movement identification unit (13) to identify whether the posture change is caused by the driver's preparatory movement spontaneously made before the brake or accelerator operation, based on whether the detected posture change satisfies a predetermined condition; and a vehicle control unit (14) to control the vehicle when it is identified that the posture change has been caused by the preparatory movement.

    Method and apparatus for producing acoustic model
    4.
    Granted patent
    Legal status: in force

    Publication No.: US06842734B2

    Publication date: 2005-01-11

    Application No.: US09879932

    Filing date: 2001-06-14

    IPC classification: G10L15/06 G10L15/20

    Abstract: In an acoustic model producing apparatus, a plurality of noise samples are categorized into clusters so that the number of clusters is smaller than the number of noise samples. A noise sample is selected from each of the clusters, and the selected noise samples are set as second noise samples for training. Untrained acoustic models stored on a storage unit are then trained using the second noise samples, thereby producing trained acoustic models for speech recognition.
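
    The noise-selection step in the abstract (cluster the noise samples, then pick one sample per cluster) might look like the sketch below. The abstract names neither the clustering method nor the selection rule, so plain k-means with a deterministic start and "sample nearest to the cluster centroid" are both assumptions, as are the function names and the use of short feature vectors to stand in for noise samples.

```python
def sqdist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)), key=lambda c: sqdist(point, centroids[c]))


def cluster(points, k, iters=20):
    """Plain k-means, seeded with the first k points so results are repeatable."""
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    centroids = [list(p) for p in points[:k]]
    for _ in range(iters):
        assign = [nearest(p, centroids) for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return [nearest(p, centroids) for p in points], centroids


def select_training_noise(points, k):
    """Return the index of one representative noise sample per non-empty cluster."""
    assign, centroids = cluster(points, k)
    chosen = []
    for c in range(k):
        members = [i for i, a in enumerate(assign) if a == c]
        if members:
            chosen.append(min(members, key=lambda i: sqdist(points[i], centroids[c])))
    return chosen


# Usage: six noise feature vectors reduced to two training noise samples.
noise = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0), (10.0, 10.0), (10.1, 10.0), (10.2, 10.0)]
training_idx = select_training_noise(noise, 2)
```

    Training then uses only the selected samples, which keeps the amount of noise-added training data small while still covering each kind of noise.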

    Method and apparatus for retrieving a video and audio scene using an index generated by speech recognition
    5.
    Granted patent
    Legal status: in force

    Publication No.: US06728673B2

    Publication date: 2004-04-27

    Application No.: US10434119

    Filing date: 2003-05-09

    IPC classification: G10L19/00

    Abstract: A video retrieval data generation apparatus includes an extractor that is configured to extract a characteristic pattern from a voice signal synchronous with a video signal. The video retrieval data generation apparatus also includes an index generator that is configured to set the voice signal for a voice period as a processing target. The index generator is further configured to prepare standard voice patterns of a subword corresponding to a plurality of subwords, detect, for each subword, a characteristic pattern similar to a standard voice pattern at each of the voice periods, and generate, for each subword, an index containing time synchronization information corresponding to a position where the similar characteristic pattern is detected. The video retrieval data generation apparatus also includes a multiplexer that is configured to multiplex video signals, voice signals and indexes to output in a data stream format.

    Method of speech recognition
    6.
    Granted patent
    Legal status: lapsed

    Publication No.: US5345536A

    Publication date: 1994-09-06

    Application No.: US808692

    Filing date: 1991-12-17

    IPC classification: G10L15/10 G10L5/06

    CPC classification: G10L15/10

    Abstract: A set of "m" feature parameters is generated every frame from reference speech which is spoken by at least one speaker and which represents recognition-object words, where "m" denotes a preset integer. A set of "n" types of standard patterns is previously generated on the basis of speech data of a plurality of speakers, where "n" denotes a preset integer. Matching between the feature parameters of the reference speech and each of the standard patterns is executed to generate a vector of "n" reference similarities between the feature parameters of the reference speech and each of the standard patterns every frame. The reference similarity vectors of respective frames are arranged into temporal sequences corresponding to the recognition-object words respectively. The reference similarity vector sequences are previously registered as dictionary similarity vector sequences. Input speech to be recognized is analyzed to generate "m" feature parameters from the input speech. Matching between the feature parameters of the input speech and the standard patterns is executed to generate a vector of "n" input-speech similarities between the feature parameters of the input speech and the standard patterns every frame. The input-speech similarity vectors of respective frames are arranged into a temporal sequence. The input-speech similarity vector sequence is collated with the dictionary similarity vector sequences to recognize the input speech.

    Method and apparatus for retrieving a video and audio scene using an index generated by speech recognition
    7.
    Granted patent
    Legal status: lapsed

    Publication No.: US06611803B1

    Publication date: 2003-08-26

    Application No.: US09600881

    Filing date: 2000-08-14

    IPC classification: G10L19/00

    Abstract: A video retrieval apparatus includes a retrieval data generator that is configured to extract a characteristic pattern from a voice signal synchronous with a video signal to generate an index for video retrieval. The video retrieval apparatus also includes a retrieval processor that is configured to input a key word from a retriever and collate the key word with the index to retrieve a desired video. The retrieval data generator includes a multiplexor that is configured to multiplex video signals, voice signals and indexes to output in data stream format. The retrieval processor includes a demultiplexor that is configured to demultiplex the multiplexed data stream into the video signals, the voice signals and the indexes. A video reproduction apparatus may collate a visual pattern of the key word with visual pattern data of the video signal at the time a person vocalizes a sound, as the index for retrieval.
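
    To make the "collate the key word with the index" step concrete, here is a small sketch. It assumes the index is a mapping from each subword to the times (in seconds) at which that subword was detected, and that the key word has already been split into subwords; the index layout, the function name `find_keyword`, the `max_gap` tolerance, and the greedy matching rule are all hypothetical, since the abstract does not describe the collation procedure.

```python
def find_keyword(index, subwords, max_gap=0.5):
    """Return start times where `subwords` occur in order in the index.

    Each subword must be detected after the previous one and no more than
    `max_gap` seconds later. `max_gap` is an illustrative tolerance.
    """
    if not subwords:
        return []
    hits = []
    for start in index.get(subwords[0], []):
        t = start
        matched = True
        for sw in subwords[1:]:
            later = [u for u in index.get(sw, []) if t < u <= t + max_gap]
            if not later:
                matched = False
                break
            t = min(later)  # greedy: take the earliest acceptable detection
        if matched:
            hits.append(start)
    return hits


# Usage: subword detections with time stamps, then a three-subword key word.
index = {"ka": [1.0, 5.0], "me": [1.2, 9.0], "ra": [1.4]}
scene_times = find_keyword(index, ["ka", "me", "ra"])
```

    The returned times can then be used to seek into the demultiplexed video and voice streams at the matching scene.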

    Voice recognition method for recognizing a word in speech
    8.
    Granted patent
    Legal status: lapsed

    Publication No.: US5692097A

    Publication date: 1997-11-25

    Application No.: US347089

    Filing date: 1994-11-23

    CPC classification: G10L15/12

    Abstract: An inter-frame similarity between an input voice and a standard patterned word is calculated for each of frames and for each of standard patterned words, and a posterior probability similarity is produced by subtracting a constant value from each of the inter-frame similarities. The constant value is determined by analyzing voice data obtained from specified persons to set the posterior probability similarities to positive values when a word existing in the input voice matches with the standard patterned word and to set the posterior probability similarities to negative values when a word existing in the input voice does not match with the standard patterned word. Thereafter, an accumulated similarity having an accumulated value obtained by accumulating values of the posterior probability similarities according to a continuous dynamic programming matching operation for the frames of the input voice is calculated for each of the standard patterned words. Thereafter, a particular standard patterned word relating to an accumulated similarity having a maximum value among the accumulated similarities is output as a recognized word of the input voice.
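
    The procedure in the abstract (subtract a constant from each inter-frame similarity, accumulate along a continuous dynamic programming path, pick the word with the largest accumulated value) can be sketched as below. The recurrence, its three allowed predecessors, the free start at any input frame, and the names `accumulated_similarity` and `recognize` are assumptions for illustration; the abstract does not give the path constraints or how the constant is estimated.

```python
def accumulated_similarity(sim, bias):
    """Best accumulated score of one word template against the input.

    sim[i][j] is the similarity between input frame i and template frame j.
    Subtracting `bias` makes matching frames positive and others negative.
    """
    neg_inf = float("-inf")
    n_tpl = len(sim[0])
    prev = [neg_inf] * n_tpl
    best = neg_inf
    for row in sim:
        cur = [neg_inf] * n_tpl
        for j in range(n_tpl):
            local = row[j] - bias
            if j == 0:
                # Continuous DP: a word may start at any input frame.
                cur[j] = local + max(0.0, prev[0])
            else:
                cur[j] = local + max(prev[j - 1], prev[j], cur[j - 1])
        best = max(best, cur[-1])  # a word may also end at any input frame
        prev = cur
    return best


def recognize(sims_by_word, bias):
    """Return the word whose template gives the largest accumulated similarity."""
    return max(sims_by_word, key=lambda w: accumulated_similarity(sims_by_word[w], bias))


# Usage: similarity matrices (3 input frames x 2 template frames) for two words.
sims = {
    "yes": [[0.9, 0.2], [0.8, 0.9], [0.1, 0.9]],
    "no": [[0.2, 0.1], [0.3, 0.2], [0.1, 0.4]],
}
word = recognize(sims, bias=0.5)
```

    Because mismatching frames contribute negative values, a template that does not occur in the input ends with a negative total, while a matching one ends positive.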

    Method of speech recognition
    9.
    Granted patent
    Legal status: lapsed

    Publication No.: US5309547A

    Publication date: 1994-05-03

    Application No.: US897131

    Filing date: 1992-06-11

    IPC classification: G10L15/10 G10L5/00

    CPC classification: G10L15/10

    Abstract: A method of speech recognition includes the steps of analyzing input speech every frame and deriving feature parameters from the input speech, generating an input vector from the feature parameters of a plurality of frames, and periodically calculating partial distances between the input vector and partial standard patterns while shifting the frame one by one. Standard patterns correspond to recognition-object words respectively, and each of the standard patterns is composed of the partial standard patterns which represent parts of the corresponding recognition-object word respectively. The partial distances are accumulated into distances between the input speech and the standard patterns. The distances correspond to the recognition-object words respectively. The distances are compared with each other, and a minimum distance of the distances is selected when the input speech ends. One of the recognition-object words which corresponds to the minimum distance is decided to be a recognition result.

    Voice emphasizing device and voice emphasizing method
    10.
    Granted patent
    Legal status: in force

    Publication No.: US08311831B2

    Publication date: 2012-11-13

    Application No.: US12447775

    Filing date: 2008-09-29

    IPC classification: G10L13/06

    Abstract: A voice emphasizing device emphasizes in a speech a “strained rough voice” at a position where a speaker or user of the speech intends to generate emphasis or musical expression. Thereby, the voice emphasizing device can provide the position with emphasis of anger, excitement, tension, or an animated way of speaking, or musical expression of Enka (Japanese ballad), blues, rock, or the like. As a result, rich vocal expression can be achieved. The voice emphasizing device includes: an emphasis utterance section detection unit (12) detecting, from an input speech waveform, an emphasis section that is a time duration having a waveform intended by the speaker or user to be converted; and a voice emphasizing unit (13) increasing fluctuation of an amplitude envelope of the waveform in the detected emphasis section.
