Method and system for configurable allocation of sound segments for use in concatenative text-to-speech voice synthesis
    21.
    Patent application
    Status: Under examination (published)

    Publication No.: US20070073542A1

    Publication date: 2007-03-29

    Application No.: US11234690

    Filing date: 2005-09-23

    IPC classification: G10L13/08

    CPC classification: G10L13/07 G10L13/047

    Abstract: Embodiments of the present invention provide a method, system and computer program product for synthesizing concatenative speech by allocating speech segments based upon their frequency of access during speech synthesis and storing frequently used speech segments in memory where they can be easily and quickly accessed. Speech data is recorded in separate files from which individual speech units are identified. The method and system of the present invention analyze the frequency of access of each speech unit during synthesis and use this data to sort the speech units according to their frequency of access. Those speech units that are accessed more frequently than others are loaded into memory where they can be accessed quickly during subsequent speech synthesis. Other speech units that are not used as frequently can be stored on a data storage disk. The invention can also dynamically adapt to changes in the frequency of speech unit access by moving units from memory to disk or vice versa depending upon their frequency of access or to account for a change in the user's system requirements.
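
The allocation idea in this abstract can be illustrated with a minimal sketch. It assumes a simple greedy policy: rank units by access count and keep the most frequently used ones in memory until a byte budget is exhausted. The function name, the `memory_budget` parameter and the greedy policy are illustrative assumptions, not details taken from the patent.

```python
from collections import Counter

def allocate_units(access_log, unit_sizes, memory_budget):
    """Split speech units into an in-memory list and an on-disk list.

    access_log    : iterable of unit ids, one entry per access during synthesis
    unit_sizes    : dict mapping unit id -> size in bytes
    memory_budget : bytes available for resident units (hypothetical parameter)
    """
    counts = Counter(access_log)
    # Most frequently accessed first; ties are broken by unit id for determinism.
    ranked = sorted(unit_sizes, key=lambda u: (-counts[u], u))
    in_memory, on_disk, used = [], [], 0
    for unit in ranked:
        if used + unit_sizes[unit] <= memory_budget:
            in_memory.append(unit)
            used += unit_sizes[unit]
        else:
            on_disk.append(unit)
    return in_memory, on_disk

sizes = {"a": 40, "b": 40, "c": 40, "d": 40}
resident, spilled = allocate_units(["a", "b", "a", "c", "a", "b"], sizes, 80)
```

Calling the function again with a newer access log or a different budget yields a new split, which is one simple way to model the dynamic re-allocation the abstract mentions.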

    Method of synthesis for a steady sound signal

    Publication No.: US20060178873A1

    Publication date: 2006-08-10

    Application No.: US10527945

    Filing date: 2003-08-05

    Applicant: Ercan Gigi

    Inventor: Ercan Gigi

    IPC classification: G10L11/04

    CPC classification: G10L13/07 G10L13/08 G10L21/01

    Abstract: The present invention relates to a method of synthesizing a first sound signal based on a second sound signal, the first sound signal having a required first fundamental frequency and the second sound signal having a second fundamental frequency, the method comprising the steps of: a) determining required pitch bell locations in the time domain of the first sound signal, the pitch bell locations being distanced by one period of the first fundamental frequency; b) providing pitch bells by windowing the second sound signal on pitch bell locations in the time domain of the second sound signal, the pitch bell locations being distanced by one period of the second fundamental frequency; c) randomly selecting a pitch bell from the provided pitch bells for each of the required pitch bell locations; and d) performing an overlap and add operation on the selected pitch bells for synthesizing the first signal.
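
Steps a) to d) can be sketched in a few lines of plain Python. The sketch assumes integer pitch periods measured in samples, a Hann window spanning two source periods, and a seeded random generator; these choices and all function names are illustrative assumptions rather than details from the patent.

```python
import math
import random

def hann(n):
    """Symmetric Hann window of length n (n >= 2)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def extract_pitch_bells(source, source_period):
    """Step b: window the source at marks spaced one source period apart.
    Each bell spans two source periods and is centred on its mark."""
    window = hann(2 * source_period + 1)
    bells = []
    for mark in range(source_period, len(source) - source_period, source_period):
        segment = source[mark - source_period: mark + source_period + 1]
        bells.append([s * w for s, w in zip(segment, window)])
    return bells

def synthesize_steady(source, source_period, target_period, n_out, seed=0):
    """Build n_out samples whose pitch period is target_period."""
    rng = random.Random(seed)  # seeded only to make the sketch repeatable
    bells = extract_pitch_bells(source, source_period)
    out = [0.0] * n_out
    for mark in range(0, n_out, target_period):   # step a: required locations
        bell = rng.choice(bells)                  # step c: random selection
        for k, value in enumerate(bell):          # step d: overlap and add
            i = mark - source_period + k
            if 0 <= i < n_out:
                out[i] += value
    return out

source = [math.sin(2 * math.pi * i / 50) for i in range(500)]
steady = synthesize_steady(source, source_period=50, target_period=40, n_out=400)
```

Here a source with a 50-sample period is resynthesized at a 40-sample period, so the output has a higher fundamental frequency than the input.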

    Waveform synthesis
    23.
    Granted patent
    Status: Lapsed

    Publication No.: US07069217B2

    Publication date: 2006-06-27

    Application No.: US09043171

    Filing date: 1997-01-09

    IPC classification: G10L13/06

    CPC classification: G10L13/07

    Abstract: A synthesizer is disclosed in which a speech waveform is synthesized by selecting a synthetic starting waveform segment and then generating a sequence of further segments. The further waveform segments are generated based jointly upon the value of the immediately-preceding segment and upon a model of the dynamics of an actual sound similar to that being generated. In particular, a method of synthesizing a voiced speech sound is disclosed, comprising calculating each new output value from the previous output value using data modeling the evolution, over a short time interval, of the voiced speech sound to be synthesized. This sequential generation of waveform segments enables a synthesized sequence of speech waveforms to be generated of any duration. In addition, a low-dimensional state space representation of speech signals is used in which successive pitch pulse cycles are superimposed to estimate the progression of the cyclic speech signal within each cycle.

    Speech synthesis apparatus using pitch marks, control method therefor, and computer-readable memory
    24.
    Granted patent
    Status: In force

    Publication No.: US07054806B1

    Publication date: 2006-05-30

    Application No.: US09262852

    Filing date: 1999-03-05

    Applicant: Masayuki Yamada

    Inventor: Masayuki Yamada

    IPC classification: G10L11/04

    Abstract: The distance between the first two pitch marks of a voiced portion of speech data to be processed is calculated. The difference between the adjacent inter-pitch-mark distances is calculated. The respective calculation results are stored and managed in a file.
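
The calculation described here amounts to a difference encoding of pitch-mark positions. A minimal sketch follows, assuming pitch marks are given as increasing sample indices. Keeping the position of the first mark so that the sequence can be rebuilt is an assumption of this sketch, as are the function names.

```python
def encode_pitch_marks(marks):
    """Return the first inter-mark distance and the differences between
    adjacent inter-mark distances. Requires at least two marks."""
    distances = [b - a for a, b in zip(marks, marks[1:])]
    first_distance = distances[0]
    deltas = [b - a for a, b in zip(distances, distances[1:])]
    return first_distance, deltas

def decode_pitch_marks(start, first_distance, deltas):
    """Rebuild mark positions from a start position (assumed to be stored
    alongside the encoded values) and the encoded distances."""
    marks = [start, start + first_distance]
    distance = first_distance
    for delta in deltas:
        distance += delta
        marks.append(marks[-1] + distance)
    return marks

marks = [100, 180, 262, 343, 425]
first_distance, deltas = encode_pitch_marks(marks)
restored = decode_pitch_marks(marks[0], first_distance, deltas)
```

Because neighbouring pitch periods in voiced speech tend to be similar, the differences are small numbers, which is one plausible motivation for storing them instead of absolute positions.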

    Method and device for co-articulated concatenation of audio segments
    25.
    Granted patent
    Status: In force

    Publication No.: US07047194B1

    Publication date: 2006-05-16

    Application No.: US09763149

    Filing date: 1999-08-19

    Applicant: Christoph Buskies

    Inventor: Christoph Buskies

    IPC classification: G10L15/00

    CPC classification: G10L13/07

    Abstract: The invention provides a method, apparatus, and a computer program stored on a data carrier that generates synthesized acoustical data by concatenating audio segments of sounds to reproduce a sequence of concatenated sounds/phones. The invention has an inventory of sounds and each sound has three bands (FIG. 1b) including an initial co-articulation band, a solo articulation band and a final co-articulation band. The invention selects audio segments that end or begin with a co-articulation band and a solo articulation band of one sound. The instance of concatenation is defined by the co-articulation band and the solo articulation band of the one sound.

    Feature-domain concatenative speech synthesis
    26.
    Granted patent
    Status: In force

    Publication No.: US07035791B2

    Publication date: 2006-04-25

    Application No.: US09901031

    Filing date: 2001-07-10

    Applicant: Dan Chazan, Ron Hoory

    Inventor: Dan Chazan, Ron Hoory

    IPC classification: G10L11/04

    CPC classification: G10L13/07 G10L25/18

    Abstract: A method for speech synthesis includes receiving an input speech signal containing a set of speech segments, and estimating spectral envelopes of the input speech signal in a succession of time intervals during each of the speech segments. The spectral envelopes are integrated over a plurality of window functions in a frequency domain so as to determine elements of feature vectors corresponding to the speech segments. An output speech signal is reconstructed by concatenating the feature vectors corresponding to a sequence of the speech segments.

    Speech synthesis using concatenation of speech waveforms
    27.
    Patent application
    Status: In force

    Publication No.: US20060059000A1

    Publication date: 2006-03-16

    Application No.: US10527951

    Filing date: 2003-08-08

    Applicant: Ercan Gigi

    Inventor: Ercan Gigi

    IPC classification: G10L13/00

    CPC classification: G10L13/07

    Abstract: The invention relates to a method of synthesizing a speech signal, the speech signal having at least a first speech unit and a second speech unit, the method comprising the steps of: providing a first speech unit signal, the first speech unit signal having an end interval; providing a second speech unit signal, the second speech unit signal having a front interval; appending at least some of the periods of the end interval in inverted order at the end of the first speech unit signal to provide a fade-out interval; appending at least some of the periods of the front interval in inverted order at the beginning of the second speech unit signal to provide a fade-in interval; and superposing the end and fade-in intervals and the fade-out and front intervals.
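
One possible reading of these steps is sketched below. It assumes a fixed integer pitch period in samples, that "inverted order" means the periods are reordered while the samples inside each period keep their direction, and that the superposition uses linear cross-fade weights. The abstract does not specify any of these points, so they are assumptions of the sketch, as are the function names.

```python
def split_periods(samples, period):
    """Cut a list of samples into consecutive chunks of one period each."""
    return [samples[i:i + period] for i in range(0, len(samples), period)]

def reversed_periods(samples, period):
    """Same periods in inverted order; samples inside a period are untouched."""
    return [s for chunk in reversed(split_periods(samples, period)) for s in chunk]

def concatenate_units(unit_a, unit_b, period, n_periods):
    """Join two units by overlapping (end + fade-out) with (fade-in + front)."""
    span = period * n_periods
    end_interval = unit_a[-span:]
    front_interval = unit_b[:span]
    fade_out = reversed_periods(end_interval, period)   # appended after unit_a
    fade_in = reversed_periods(front_interval, period)  # prepended to unit_b
    tail_a = end_interval + fade_out
    head_b = fade_in + front_interval
    n = len(tail_a)
    overlap = []
    for i in range(n):
        w = i / (n - 1)  # assumed linear cross-fade weight, 0 -> 1
        overlap.append((1 - w) * tail_a[i] + w * head_b[i])
    return unit_a[:-span] + overlap + unit_b[span:]

unit_a = [1.0] * 40
unit_b = [1.0] * 40
joined = concatenate_units(unit_a, unit_b, period=5, n_periods=2)
```

With this alignment the joined signal has the combined length of the two units, and the end interval of the first unit is superposed on the fade-in interval of the second while the fade-out interval is superposed on the front interval.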

    System and method for converting text-to-voice

    Publication No.: US06990450B2

    Publication date: 2006-01-24

    Application No.: US09818331

    Filing date: 2001-03-27

    IPC classification: G10L13/08

    CPC classification: G10L13/07 G10L13/04

    Abstract: A method for converting text to concatenated voice by utilizing a digital voice library and a set of playback rules is provided. Multiple voice recordings correspond to a single speech item and represent various inflections of that single speech item. The method includes determining syllable count and impact value for each speech item in a sequence of speech items. A desired inflection for each speech item is determined based on the syllable count and the impact value and further based on a set of playback rules. A sequence of voice recordings is determined by determining a voice recording for each speech item based on the desired inflection and based on the available voice recordings that correspond to the particular speech item. Voice data are generated based on a sequence of voice recordings by concatenating adjacent recordings in the sequence of voice recordings.

    Method for controlling duration in speech synthesis
    29.
    Patent application
    Status: In force

    Publication No.: US20060004578A1

    Publication date: 2006-01-05

    Application No.: US10527779

    Filing date: 2003-08-05

    Applicant: Ercan Gigi

    Inventor: Ercan Gigi

    IPC classification: G10L13/06

    CPC classification: G10L13/07 G10L21/04

    Abstract: The present invention relates to a method of synthesizing a speech signal, comprising: assigning a first identifier to a first class of intervals of an original speech signal and assigning a second identifier to a second class of intervals of the original speech signal; windowing the original speech signal to provide a number of pitch bells; processing the pitch bells having the first identifier assigned thereto for modifying a duration of the speech signal; and performing an overlap and add operation on the processed pitch bells.
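
A minimal sketch of the selective duration change follows. It assumes the pitch bells have already been extracted and labeled, that the "processing" of bells carrying the first identifier is simple repetition by an integer factor, and that bells are overlap-added at a fixed hop of one period. The labels `"S"` and `"F"`, the integer factor and the function names are illustrative assumptions.

```python
def stretch_bells(bells, labels, factor, stretch_label="S"):
    """Repeat every bell carrying stretch_label `factor` times;
    bells with any other label are passed through once."""
    out = []
    for bell, label in zip(bells, labels):
        repeats = factor if label == stretch_label else 1
        out.extend([bell] * repeats)
    return out

def overlap_add(bells, period):
    """Place bells one period apart and sum the overlapping samples."""
    if not bells:
        return []
    length = (len(bells) - 1) * period + len(bells[-1])
    out = [0] * length
    for index, bell in enumerate(bells):
        start = index * period
        for k, value in enumerate(bell):
            out[start + k] += value
    return out

bells = [[1, 1, 1, 1], [2, 2, 2, 2]]
labels = ["S", "F"]          # "S": stretchable interval, "F": fixed interval
longer = overlap_add(stretch_bells(bells, labels, factor=3), period=2)
```

Only the interval labeled `"S"` grows, which mirrors the idea of lengthening some classes of intervals (for example steady vowels) while leaving others untouched.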

    Speech synthesis system
    30.
    Patent application
    Status: In force

    Publication No.: US20050149330A1

    Publication date: 2005-07-07

    Application No.: US11070301

    Filing date: 2005-03-03

    Applicant: Nobuyuki Katae

    Inventor: Nobuyuki Katae

    IPC classification: G10L13/06 G10L13/00

    CPC classification: G10L13/07 G10L13/06

    Abstract: A speech synthesizing system that produces speech of improved voice quality by selecting the combination of speech segments most suitable for a synthesis speech unit sequence. The speech synthesizing system comprises: a speech segment storage section where speech segments are stored; a speech segment selection information storage section that stores speech segment selection information, including combinations of the stored speech segments for an arbitrary speech unit sequence and appropriateness information representing the appropriateness of those combinations; a speech segment selecting section for selecting the combination of speech segments most suitable for a synthesis parameter according to the speech segment selection information stored in the speech segment selection information storage section; and a waveform generating section for generating speech waveform data from the combination of speech segments selected by the speech segment selecting section.
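
The selection step can be sketched as a table lookup. The sketch assumes the selection information is a dictionary keyed by speech unit sequence, with each entry holding candidate segment combinations and a numeric appropriateness score. The fallback to a default segment per unit when no entry exists is an assumption of this sketch, as are all names and the numeric scoring.

```python
def select_segments(unit_sequence, selection_info, default_inventory):
    """Pick the stored segment combination with the highest appropriateness
    for this unit sequence, or fall back to one default segment per unit."""
    candidates = selection_info.get(tuple(unit_sequence), [])
    if candidates:
        best_combo, _score = max(candidates, key=lambda c: c[1])
        return list(best_combo)
    # Assumed fallback: no stored information for this sequence.
    return [default_inventory[unit] for unit in unit_sequence]

# Hypothetical data: segment ids and scores are made up for illustration.
selection_info = {
    ("k", "a"): [(("k_01", "a_07"), 0.2), (("k_03", "a_02"), 0.9)],
}
default_inventory = {"k": "k_00", "a": "a_00"}

chosen = select_segments(["k", "a"], selection_info, default_inventory)
fallback = select_segments(["a", "k"], selection_info, default_inventory)
```

The returned segment ids would then be handed to a waveform generation stage, which is outside the scope of this sketch.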
