System and method for pronunciation modeling
    1.
    Granted patent
    Status: Active

    Publication No.: US08862470B2

    Publication date: 2014-10-14

    Application No.: US13302380

    Filing date: 2011-11-22

    IPC classification: G10L15/187 G10L15/183

    Abstract: Systems, computer-implemented methods, and tangible computer-readable media for generating a pronunciation model. The method includes identifying a generic model of speech composed of phonemes, identifying a family of interchangeable phonemic alternatives for a phoneme in the generic model of speech, labeling the family of interchangeable phonemic alternatives as referring to the same phoneme, and generating a pronunciation model which substitutes each family for each respective phoneme. In one aspect, the generic model of speech is a vocal tract length normalized acoustic model. Interchangeable phonemic alternatives can represent a same phoneme for different dialectal classes. An interchangeable phonemic alternative can include a string of phonemes.
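
    The family substitution described in this abstract can be pictured with a minimal sketch. The phoneme symbols, the `FAMILIES` table, and the helper names are hypothetical illustrations; the abstract does not specify a data structure or a phoneme inventory.

```python
from itertools import product

# Hypothetical families of interchangeable alternatives. Each alternative is a
# tuple because an alternative may itself be a string of several phonemes.
FAMILIES = {
    "ay": [("ay",), ("aa",)],
    "t": [("t",), ("dx",)],
    "er": [("er",), ("ah", "r")],
}


def family_for(phoneme):
    """Return the family labeled as this phoneme (itself if none is listed)."""
    return FAMILIES.get(phoneme, [(phoneme,)])


def build_pronunciation_model(lexicon):
    """Substitute each phoneme of each word with its family of alternatives."""
    return {word: [family_for(p) for p in phones] for word, phones in lexicon.items()}


def expand(word_model):
    """List every concrete phoneme sequence that the word's families allow."""
    return [
        tuple(ph for alternative in choice for ph in alternative)
        for choice in product(*word_model)
    ]


# Assumed generic pronunciation for one word, for illustration only.
generic_lexicon = {"writer": ["r", "ay", "t", "er"]}
model = build_pronunciation_model(generic_lexicon)
variants = expand(model["writer"])
print(len(variants))  # 1 * 2 * 2 * 2 = 8 variants
```

    Keeping each family as a list under the original phoneme label means all alternatives are still treated as the same phoneme, which is the labeling step the abstract describes.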

    SYSTEM AND METHOD FOR SYNTHETIC VOICE GENERATION AND MODIFICATION
    2.
    Patent application
    Status: Active

    Publication No.: US20120035933A1

    Publication date: 2012-02-09

    Application No.: US12852164

    Filing date: 2010-08-06

    IPC classification: G10L13/00

    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for generating a synthetic voice. A system configured to practice the method combines a first database of a first text-to-speech voice and a second database of a second text-to-speech voice to generate a combined database, selects from the combined database, based on a policy, voice units of a phonetic category for the synthetic voice to yield selected voice units, and synthesizes speech based on the selected voice units. The system can synthesize speech without parameterizing the first text-to-speech voice and the second text-to-speech voice. A policy can define, for a particular phonetic category, from which text-to-speech voice to select voice units. The combined database can include multiple text-to-speech voices from different speakers. The combined database can include voices of a single speaker speaking in different styles. The combined database can include voices of different languages.
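
    As a rough illustration of the policy-based selection in this abstract, the sketch below merges two toy unit databases and picks units per phonetic category. The `VoiceUnit` fields, the voice names, and the first-match lookup are assumptions; a real unit-selection synthesizer would rank candidates with target and join costs, which this sketch omits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceUnit:
    voice: str     # which text-to-speech voice the unit was recorded for
    category: str  # phonetic category, e.g. "vowel" or "consonant"
    phone: str     # the phone this unit realizes


def combine(first_db, second_db):
    """Build the combined database by pooling the units of both voices."""
    return list(first_db) + list(second_db)


def select_units(combined, policy, targets):
    """For each (phone, category) target, take a unit from the voice the policy names."""
    selected = []
    for phone, category in targets:
        wanted_voice = policy[category]
        match = next(
            (u for u in combined if u.voice == wanted_voice and u.phone == phone),
            None,
        )
        if match is None:
            raise LookupError(f"no unit for {phone!r} in voice {wanted_voice!r}")
        selected.append(match)
    return selected


phones = [("k", "consonant"), ("ae", "vowel"), ("t", "consonant")]
voice_a = [VoiceUnit("A", cat, ph) for ph, cat in phones]
voice_b = [VoiceUnit("B", cat, ph) for ph, cat in phones]

# Hypothetical policy: consonants from voice A, vowels from voice B.
policy = {"consonant": "A", "vowel": "B"}
chosen = select_units(combine(voice_a, voice_b), policy, phones)
print([u.voice for u in chosen])  # ['A', 'B', 'A']
```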

    SYSTEM AND METHOD FOR PRONUNCIATION MODELING
    3.
    Patent application
    Status: Active

    Publication No.: US20100145707A1

    Publication date: 2010-06-10

    Application No.: US12328407

    Filing date: 2008-12-04

    IPC classification: G10L13/06

    Abstract: Disclosed herein are systems, computer-implemented methods, and tangible computer-readable media for generating a pronunciation model. The method includes identifying a generic model of speech composed of phonemes, identifying a family of interchangeable phonemic alternatives for a phoneme in the generic model of speech, labeling the family of interchangeable phonemic alternatives as referring to the same phoneme, and generating a pronunciation model which substitutes each family for each respective phoneme. In one aspect, the generic model of speech is a vocal tract length normalized acoustic model. Interchangeable phonemic alternatives can represent a same phoneme for different dialectal classes. An interchangeable phonemic alternative can include a string of phonemes.

    System and method for speech personalization by need
    4.
    Granted patent
    Status: Active

    Publication No.: US09002713B2

    Publication date: 2015-04-07

    Application No.: US12480864

    Filing date: 2009-06-09

    Abstract: Disclosed herein are systems, computer-implemented methods, and tangible computer-readable storage media for speaker recognition personalization. The method recognizes speech received from a speaker interacting with a speech interface using a set of allocated resources, the set of allocated resources including bandwidth, processor time, memory, and storage. The method records metrics associated with the recognized speech, and after recording the metrics, modifies at least one of the allocated resources in the set of allocated resources commensurate with the recorded metrics. The method recognizes additional speech from the speaker using the modified set of allocated resources. Metrics can include a speech recognition confidence score, processing speed, dialog behavior, requests for repeats, negative responses to confirmations, and task completions. The method can further store a speaker personalization profile having information for the modified set of allocated resources and recognize speech associated with the speaker based on the speaker personalization profile.
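
    The record-then-adjust loop in this abstract might look roughly like the sketch below. The field names, the 0.5 confidence threshold, and the doubling rule are all hypothetical; the abstract only says that resources are modified commensurate with the recorded metrics and does not give a formula.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Resources:
    bandwidth_kbps: int
    cpu_ms: int
    memory_mb: int
    storage_mb: int


@dataclass(frozen=True)
class Metrics:
    confidence: float      # speech recognition confidence score
    repeat_requests: int   # how often the speaker was asked to repeat
    task_completed: bool


def adjust(resources, metrics, low_confidence=0.5):
    """Give a struggling speaker more processor time and memory (assumed rule)."""
    struggling = (
        metrics.confidence < low_confidence
        or metrics.repeat_requests > 0
        or not metrics.task_completed
    )
    if not struggling:
        return resources
    # replace() returns a modified copy, so the earlier allocation is preserved.
    return replace(
        resources,
        cpu_ms=resources.cpu_ms * 2,
        memory_mb=resources.memory_mb * 2,
    )


baseline = Resources(bandwidth_kbps=64, cpu_ms=100, memory_mb=256, storage_mb=512)
poor_session = Metrics(confidence=0.3, repeat_requests=2, task_completed=False)
good_session = Metrics(confidence=0.9, repeat_requests=0, task_completed=True)

profile = {"speaker-1": adjust(baseline, poor_session)}  # stored per speaker
print(profile["speaker-1"].cpu_ms)  # 200
```

    Storing the adjusted allocation under a speaker key stands in for the speaker personalization profile mentioned at the end of the abstract.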

    System and method for generalized preselection for unit selection synthesis
    5.
    Granted patent
    Status: Active

    Publication No.: US08805687B2

    Publication date: 2014-08-12

    Application No.: US12563654

    Filing date: 2009-09-21

    IPC classification: G10L13/06

    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for unit selection synthesis. The method causes a computing device to add a supplemental phoneset to a speech synthesizer front end having an existing phoneset, modify a unit preselection process based on the supplemental phoneset, preselect units from the supplemental phoneset and the existing phoneset based on the modified unit preselection process, and generate speech based on the preselected units. The supplemental phoneset can be a variation of the existing phoneset, can include a word boundary feature, can include a cluster feature where initial consonant clusters and some word boundaries are marked with diacritics, can include a function word feature which marks units as originating from a function word or a content word, and/or can include a pre-vocalic or post-vocalic feature. The speech synthesizer front end can incorporate the supplemental phoneset as an extra feature.

    System and method for synthetic voice generation and modification
    6.
    Granted patent
    Status: Active

    Publication No.: US08731932B2

    Publication date: 2014-05-20

    Application No.: US12852164

    Filing date: 2010-08-06

    IPC classification: G10L13/00 G10L13/08

    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for generating a synthetic voice. A system configured to practice the method combines a first database of a first text-to-speech voice and a second database of a second text-to-speech voice to generate a combined database, selects from the combined database, based on a policy, voice units of a phonetic category for the synthetic voice to yield selected voice units, and synthesizes speech based on the selected voice units. The system can synthesize speech without parameterizing the first text-to-speech voice and the second text-to-speech voice. A policy can define, for a particular phonetic category, from which text-to-speech voice to select voice units. The combined database can include multiple text-to-speech voices from different speakers. The combined database can include voices of a single speaker speaking in different styles. The combined database can include voices of different languages.

    System and method for pronunciation modeling
    7.
    Granted patent
    Status: Active

    Publication No.: US08073693B2

    Publication date: 2011-12-06

    Application No.: US12328407

    Filing date: 2008-12-04

    IPC classification: G10L15/02

    Abstract: Systems, computer-implemented methods, and tangible computer-readable media for generating a pronunciation model. The method includes identifying a generic model of speech composed of phonemes, identifying a family of interchangeable phonemic alternatives for a phoneme in the generic model of speech, labeling the family of interchangeable phonemic alternatives as referring to the same phoneme, and generating a pronunciation model which substitutes each family for each respective phoneme. In one aspect, the generic model of speech is a vocal tract length normalized acoustic model. Interchangeable phonemic alternatives can represent a same phoneme for different dialectal classes. An interchangeable phonemic alternative can include a string of phonemes.

    SYSTEM AND METHOD FOR IMPROVING SYNTHESIZED SPEECH INTERACTIONS OF A SPOKEN DIALOG SYSTEM
    8.
    Patent application
    Status: Active

    Publication No.: US20090112596A1

    Publication date: 2009-04-30

    Application No.: US11929542

    Filing date: 2007-10-30

    IPC classification: G10L13/00

    CPC classification: G10L13/027

    Abstract: A system and method are disclosed for synthesizing speech based on a selected speech act. A method includes modifying synthesized speech of a spoken dialogue system by (1) receiving a user utterance, (2) analyzing the user utterance to determine an appropriate speech act, and (3) generating a response of a type associated with the appropriate speech act, wherein linguistic variables in the response are selected based on the appropriate speech act.

    SYSTEM AND METHOD FOR ENRICHING TEXT-TO-SPEECH SYNTHESIS WITH AUTOMATIC DIALOG ACT TAGS
    9.
    Patent application
    Status: Under examination (published)

    Publication No.: US20130066632A1

    Publication date: 2013-03-14

    Application No.: US13232630

    Filing date: 2011-09-14

    IPC classification: G10L13/08

    CPC classification: G10L13/10

    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for modifying the prosody of synthesized speech based on an associated speech act. A system configured according to the method embodiment (1) receives text, (2) performs an analysis of the text to determine and assign a speech act label to the text, and (3) converts the text to speech, where the speech prosody is based on the speech act label. The analysis performed compares the text to a corpus of previously tagged utterances to find a close match, determines a confidence score from a correlation of the text and the close match, and, if the confidence score is above a threshold value, retrieves the speech act label of the close match and assigns it to the text.
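
    The match-and-threshold analysis in this abstract can be sketched as follows. The tiny corpus, the label names, the 0.6 threshold, and the use of `difflib.SequenceMatcher` as the similarity measure are assumptions made for illustration; the abstract does not say how the correlation is computed.

```python
from difflib import SequenceMatcher

# Hypothetical corpus of previously tagged utterances: (text, speech act label).
CORPUS = [
    ("could you repeat that please", "request"),
    ("thank you very much", "thanking"),
    ("what time does the store open", "question"),
]


def assign_speech_act(text, corpus, threshold=0.6):
    """Return (label, score); label is None unless the best match beats the threshold."""
    best_label, best_score = None, 0.0
    for utterance, label in corpus:
        # Character-level similarity in [0, 1], standing in for the "correlation".
        score = SequenceMatcher(None, text.lower(), utterance.lower()).ratio()
        if score > best_score:
            best_label, best_score = label, score
    if best_score > threshold:
        return best_label, best_score
    return None, best_score


label, score = assign_speech_act("Thank you very much", CORPUS)
print(label)  # thanking

unmatched_label, _ = assign_speech_act("zzz", CORPUS)
print(unmatched_label)  # None
```

    A text-to-speech front end could then choose prosody settings from the returned label and fall back to neutral prosody when the label is `None`.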

    System and method for adapting automatic speech recognition pronunciation by acoustic model restructuring

    Publication No.: US08548807B2

    Publication date: 2013-10-01

    Application No.: US12480848

    Filing date: 2009-06-09

    IPC classification: G10L15/04

    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker resulting in collected speech and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally, the method includes recognizing, via a processor, additional speech from the target speaker using the custom speech model.
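
    The weighted-sum restructuring in this abstract can be illustrated with a deliberately small sketch. It assumes each native acoustic model is a one-dimensional Gaussian density and that weights come from how often each plausible phoneme appears in the lattice for a dictionary phoneme; both are simplifying assumptions, since the abstract does not state the model form or how the weights are derived.

```python
import math
from collections import Counter


def gaussian(mean, variance):
    """Toy stand-in for a native acoustic model: a 1-D Gaussian density."""
    def density(x):
        return math.exp(-((x - mean) ** 2) / (2 * variance)) / math.sqrt(
            2 * math.pi * variance
        )
    return density


def lattice_weights(plausible_phonemes):
    """Turn lattice counts for one dictionary phoneme into weights that sum to 1."""
    counts = Counter(plausible_phonemes)
    total = sum(counts.values())
    return {phoneme: count / total for phoneme, count in counts.items()}


def custom_model(weights, native_models):
    """Represent a dictionary phoneme as a weighted sum of native phoneme models."""
    def density(x):
        return sum(w * native_models[p](x) for p, w in weights.items())
    return density


# Hypothetical native models and lattice evidence for the dictionary phoneme "th".
native = {"th": gaussian(0.0, 1.0), "s": gaussian(2.0, 1.0)}
weights = lattice_weights(["th", "s", "s", "s"])
th_for_speaker = custom_model(weights, native)

print(weights)  # {'th': 0.25, 's': 0.75}
```

    The dictionary entry for a word still lists "th"; only the acoustic model behind that symbol changes, which matches the abstract's statement that the pronouncing dictionary itself does not change.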