1. Method for analyzing speech involving detecting the formants by division into time frames using linear prediction
    Type: Granted invention patent
    Status: No longer in force

    Publication No.: US06289305B1

    Publication date: 2001-09-11

    Application No.: US08129077

    Filing date: 1994-03-02

    Applicant: Jaan Kaja

    Inventor: Jaan Kaja

    IPC classification: G10L19/04

    CPC classification: G10L25/48 G10L25/15

    Abstract: A process for speech analysis, more specifically an automatic process for analyzing continuous speech. The waveform of the speech is described by the resonant frequencies (formants) that arise in the speech organs. The process determines suitable frequencies for the formants from an utterance by dividing the utterance into time frames and analyzing each frame by linear prediction, in order to determine the roots of the denominator polynomial and thereby the frequency values for each frame. The utterance is divided into voiced regions, and in each voiced region the centers of the vowel sounds are established to obtain a number of starting points. Tracks are formed from the starting points by sorting the roots from frame to frame so that old and new roots are linked together. Factors of merit are calculated for the tracks relative to the formants, and the tracks are assigned to formants in accordance with the factors of merit. The factors of merit take into account each track's bandwidth, continuity, and relation to the formants. The process achieves a global optimisation by delaying the formant allocation until a complete voiced region has been analyzed. Linking the tracks together in this way makes it possible to control the additional or false resonances that arise in association with linear prediction.
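
The abstract gives no formulas, so the following is only a minimal sketch of two of its steps under stated assumptions. It uses the common textbook mapping from a complex root of the linear-prediction denominator polynomial to a resonance frequency and bandwidth, and a greedy nearest-frequency rule as a stand-in for the patent's frame-to-frame sorting of roots. The function names `pole_to_resonance`, `frame_candidates`, and `link_roots` are hypothetical and do not come from the patent.

```python
import cmath
import math


def pole_to_resonance(pole, sample_rate):
    """Map one complex root of the LPC denominator to (frequency_hz, bandwidth_hz).

    Assumption: the standard conversion, where the root's angle gives the
    frequency and its radius gives the bandwidth.
    """
    frequency_hz = cmath.phase(pole) * sample_rate / (2.0 * math.pi)
    bandwidth_hz = -math.log(abs(pole)) * sample_rate / math.pi
    return frequency_hz, bandwidth_hz


def frame_candidates(poles, sample_rate):
    """Return the formant candidates of one frame, sorted by frequency.

    Only roots with a positive imaginary part are kept, so each
    complex-conjugate pair contributes a single candidate.
    """
    upper = [p for p in poles if p.imag > 0.0]
    return sorted(pole_to_resonance(p, sample_rate) for p in upper)


def link_roots(previous_hz, current_hz):
    """Link each old root to the nearest unused new root (greedy assumption).

    Returns, for every frequency in previous_hz, the index of its partner
    in current_hz, or None when no new root is left.
    """
    unused = set(range(len(current_hz)))
    links = []
    for f_prev in previous_hz:
        if not unused:
            links.append(None)
            continue
        best = min(unused, key=lambda i: abs(current_hz[i] - f_prev))
        unused.remove(best)
        links.append(best)
    return links
```

A real implementation would first obtain the roots from the prediction coefficients of each frame and would score whole tracks with the factors of merit before assigning them to formants; neither step is shown here.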


2. Speech synthesis with weighted parameters at phoneme boundaries
    Type: Granted invention patent
    Status: No longer in force

    Publication No.: US5659664A

    Publication date: 1997-08-19

    Application No.: US468640

    Filing date: 1995-06-06

    Applicant: Jaan Kaja

    Inventor: Jaan Kaja

    IPC classification: C10L9/02 G10L13/04 G10L5/04

    CPC classification: G10L13/07 G10L13/04 G10L25/15

    Abstract: The invention relates to a method and an arrangement for speech synthesis and provides an automatic mechanism for simulating human speech. The method provides a number of control parameters for controlling a speech synthesis device. The invention solves the problem of coarticulation by using an interpolation mechanism. The control parameters are stored in a matrix or a sequence list for each polyphone. The behaviour of each parameter over time is defined around each phoneme boundary, and polyphones are joined by forming a weighted mean of the curves defined by their two associated matrices or sequence lists. The invention also provides an arrangement for carrying out the method.
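
As an illustration of the weighted mean described in the abstract, the sketch below blends two parameter curves that overlap around one phoneme boundary. The linear weight ramp is an assumption, since the abstract does not state which weighting function is used, and the function name `blend_at_boundary` is hypothetical.

```python
def blend_at_boundary(left_curve, right_curve):
    """Join two parameter curves sampled over the same boundary region.

    left_curve comes from the polyphone that ends at the boundary and
    right_curve from the polyphone that starts there. The weight moves
    linearly from the left curve to the right curve (an assumed ramp).
    """
    if len(left_curve) != len(right_curve):
        raise ValueError("curves must cover the same number of time steps")
    n = len(left_curve)
    if n == 1:
        return [0.5 * (left_curve[0] + right_curve[0])]
    blended = []
    for i, (left, right) in enumerate(zip(left_curve, right_curve)):
        weight = i / (n - 1)  # 0.0 at the first step, 1.0 at the last
        blended.append((1.0 - weight) * left + weight * right)
    return blended
```

For example, blending a constant curve at 100 with a constant curve at 200 over three time steps gives a smooth transition through the midpoint instead of a jump at the boundary.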


3. Method for synthesizing voiceless consonants
    Type: Granted invention patent
    Status: In force

    Publication No.: US6112178A

    Publication date: 2000-08-29

    Application No.: US147466

    Filing date: 1999-03-05

    Applicant: Jaan Kaja

    Inventor: Jaan Kaja

    IPC classification: G10L13/07 G10L13/06

    CPC classification: G10L13/07

    Abstract: A method for synthesizing speech using concatenation and Hanning windows, in which a synthetic waveform is formed by concatenating suitably selected parts of recorded human speech, the selected parts being windowed out with a Hanning window and copied into suitably selected locations in the synthetic waveform. The method is adapted to synthesize unvoiced consonants and includes the step of palindromically copying suitably selected parts of the recorded human speech to form a synthesized waveform for the unvoiced consonant using concatenation. The method may be used for diphone or polyphone synthesis. The advantage of this palindromic synthesis method is that, once the copying direction has been reversed a second time, there is either no repetition of identical blocks or the time difference between repetitions is markedly larger than with known methods, thus minimizing unwanted periodic artifacts in the synthesized speech.

    PCT details: PCT No. PCT/SE97/01004; Sec. 371 date 1999-03-05; Sec. 102(e) date 1999-03-05; PCT filed 1997-06-09; PCT Pub. No. WO98/00835; PCT Pub. date 1998-01-08.
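
The palindromic copying named in the abstract of this entry can be sketched as follows. This is a rough illustration, not the patented procedure: it assumes half-overlapping blocks, a periodic Hanning window, and that blocks are visited forward and then backward while the samples inside each block keep their original time order. The names `hann_window`, `palindromic_order`, and `synthesize_unvoiced` are hypothetical.

```python
import math


def hann_window(length):
    """Periodic Hanning window; half-overlapped copies of it sum to one."""
    return [0.5 * (1.0 - math.cos(2.0 * math.pi * n / length))
            for n in range(length)]


def palindromic_order(num_blocks, count):
    """Return `count` block indices running forward, then backward, and so on."""
    if num_blocks < 1:
        raise ValueError("need at least one source block")
    if num_blocks == 1:
        return [0] * count
    period = 2 * (num_blocks - 1)
    order = []
    for k in range(count):
        pos = k % period
        order.append(pos if pos < num_blocks else period - pos)
    return order


def synthesize_unvoiced(source, block_len, count):
    """Overlap-add `count` windowed source blocks in palindromic order."""
    if block_len < 2 or block_len % 2:
        raise ValueError("block_len must be an even number of samples")
    if len(source) < block_len:
        raise ValueError("source is shorter than one block")
    hop = block_len // 2
    num_blocks = (len(source) - block_len) // hop + 1
    window = hann_window(block_len)
    output = [0.0] * ((count - 1) * hop + block_len) if count > 0 else []
    for k, block in enumerate(palindromic_order(num_blocks, count)):
        src = block * hop  # where the block starts in the recording
        dst = k * hop      # where it lands in the synthetic waveform
        for n in range(block_len):
            output[dst + n] += window[n] * source[src + n]
    return output
```

With four source blocks the visiting order is 0, 1, 2, 3, 2, 1, 0, 1, 2, so a given block recurs only after the direction has changed, which is the property the abstract credits with reducing periodic artifacts.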