Methods and systems for adaptation of synthetic speech in an environment
    11.
    Granted invention patent
    Legal status: In force

    Publication No.: US08571871B1

    Publication date: 2013-10-29

    Application No.: US13633231

    Filing date: 2012-10-02

    Applicant: Google Inc.

    CPC classification number: G10L13/033 G10L21/003

    Abstract: Methods and systems for adaptation of synthetic speech in an environment are described. In an example, a device, which may include a text-to-speech (TTS) module, may be configured to determine characteristics of an environment of the device. The device also may be configured to determine, based on the one or more characteristics of the environment, speech parameters that characterize a voice output of the text-to-speech module. Further, the device may be configured to process a text to obtain the voice output corresponding to the text based on the speech parameters to account for the one or more characteristics of the environment.
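
The flow in this abstract (measure the environment, derive speech parameters, synthesize with them) can be illustrated with a minimal Python sketch. Everything named here is hypothetical: the `SpeechParams` fields, the 40 dB and 80 dB thresholds, the scaling ranges, and the `synthesize` stand-in are assumptions for illustration, not details taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class SpeechParams:
    volume: float  # linear gain, 1.0 = default loudness
    rate: float    # speaking-rate multiplier, 1.0 = default speed
    pitch: float   # pitch multiplier, 1.0 = default pitch


def params_for_environment(noise_db: float) -> SpeechParams:
    """Map a measured ambient noise level to TTS speech parameters.

    The thresholds and scaling ranges are illustrative assumptions.
    """
    quiet_db, loud_db = 40.0, 80.0
    # Normalise the noise level to [0, 1] between the two thresholds.
    t = (noise_db - quiet_db) / (loud_db - quiet_db)
    t = max(0.0, min(1.0, t))
    return SpeechParams(
        volume=1.0 + 1.0 * t,  # up to 2x louder in noise
        rate=1.0 - 0.2 * t,    # up to 20% slower in noise
        pitch=1.0 + 0.1 * t,   # slightly higher pitch in noise
    )


def synthesize(text: str, params: SpeechParams) -> dict:
    """Stand-in for a TTS engine call; returns the request it would make."""
    return {"text": text, "volume": params.volume,
            "rate": params.rate, "pitch": params.pitch}
```

A caller would pass a microphone-derived noise estimate to `params_for_environment` and forward the result to the real TTS engine in place of `synthesize`.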

    Method and system for building text-to-speech voice from diverse recordings
    13.
    Granted invention patent
    Legal status: In force

    Publication No.: US09542927B2

    Publication date: 2017-01-10

    Application No.: US14540088

    Filing date: 2014-11-13

    Applicant: Google Inc.

    CPC classification number: G10L13/02 G10L13/06 G10L25/03

    Abstract: A method and system are disclosed for building a speech database for a text-to-speech (TTS) synthesis system from multiple speakers recorded under diverse conditions. For a plurality of utterances of a reference speaker, a set of reference-speaker vectors may be extracted, and for each of a plurality of utterances of a colloquial speaker, a respective set of colloquial-speaker vectors may be extracted. A matching procedure, carried out under a transform that compensates for speaker differences, may be used to match each colloquial-speaker vector to a reference-speaker vector. Each colloquial-speaker vector may then be replaced with its matched reference-speaker vector. The matching and replacing can be carried out separately for each set of colloquial-speaker vectors. A conditioned set of speaker vectors can then be constructed by aggregating all the replaced speaker vectors. The conditioned set of speaker vectors can be used to train the TTS system.
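
The match-and-replace step can be sketched in Python as a nearest-neighbour search. This is a simplified illustration under stated assumptions: the abstract does not say how the transform is estimated or which distance is used, so the per-speaker `transform` callables and the Euclidean metric below are hypothetical choices.

```python
import math


def match_and_replace(colloquial, reference, transform):
    """Replace each colloquial-speaker vector with its nearest
    reference-speaker vector, measured after applying `transform`.
    """
    replaced = []
    for vec in colloquial:
        mapped = transform(vec)  # compensate for speaker differences
        nearest = min(reference, key=lambda ref: math.dist(mapped, ref))
        replaced.append(nearest)
    return replaced


def build_conditioned_set(colloquial_sets, reference, transforms):
    """Run match-and-replace per colloquial set, then aggregate the results."""
    conditioned = []
    for vectors, transform in zip(colloquial_sets, transforms):
        conditioned.extend(match_and_replace(vectors, reference, transform))
    return conditioned
```

Each colloquial set gets its own transform, mirroring the abstract's statement that matching and replacing run separately per set before aggregation.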

    Devices and Methods for a Universal Vocoder Synthesizer
    14.
    Invention patent application
    Legal status: In force

    Publication No.: US20160005392A1

    Publication date: 2016-01-07

    Application No.: US14632890

    Filing date: 2015-02-26

    Applicant: Google Inc.

    Abstract: A device may receive an input indicative of acoustic feature parameters associated with speech. The device may determine a modulated noise representation for noise pertaining to one or more of an aspirate or a fricative in the speech based on the acoustic feature parameters. The aspirate may be associated with a characteristic of an exhalation of at least a threshold amount of breath. The fricative may be associated with a characteristic of airflow between two or more vocal tract articulators. The device may also provide an audio signal indicative of a synthetic audio pronunciation of the speech based on the modulated noise representation.
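
One common way to obtain a modulated noise signal is to amplitude-modulate white noise with an envelope that pulses at the pitch period. The Python sketch below shows that generic technique only; the abstract does not specify the modulation scheme, so the raised-cosine envelope and the `f0_hz`, `depth`, and `seed` parameters are assumptions.

```python
import math
import random


def modulated_noise(num_samples, sample_rate, f0_hz, depth, seed=0):
    """Return white noise amplitude-modulated at the pitch period.

    `depth` in [0, 1] sets how strongly the envelope pulses with the
    fundamental frequency `f0_hz`; 0 gives unmodulated noise.
    """
    rng = random.Random(seed)
    samples = []
    for n in range(num_samples):
        noise = rng.uniform(-1.0, 1.0)
        # Raised-cosine envelope that peaks once per pitch period.
        phase = 2.0 * math.pi * f0_hz * n / sample_rate
        envelope = 1.0 - depth * 0.5 * (1.0 - math.cos(phase))
        samples.append(noise * envelope)
    return samples
```

With `depth=0.0` the output is plain noise, which loosely corresponds to an unvoiced fricative; a larger `depth` gives the breathy, pitch-synchronous noise associated with aspiration in voiced speech.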

    Method and system for non-parametric voice conversion
    15.
    Granted invention patent
    Legal status: In force

    Publication No.: US09183830B2

    Publication date: 2015-11-10

    Application No.: US14069510

    Filing date: 2013-11-01

    Applicant: Google Inc.

    Abstract: A method and system are disclosed for non-parametric speech conversion. A text-to-speech (TTS) synthesis system may include hidden Markov model (HMM) based speech modeling for synthesizing output speech. A converted HMM may be initially set to a source HMM trained with a voice of a source speaker. A parametric representation of speech may be extracted from speech of a target speaker to generate a set of target-speaker vectors. A matching procedure, carried out under a transform that compensates for speaker differences, may be used to match each HMM state of the source HMM to a target-speaker vector. The HMM states of the converted HMM may be replaced with the matched target-speaker vectors. Transforms may be applied to further adapt the converted HMM to the voice of the target speaker. The converted HMM may be used to synthesize speech with voice characteristics of the target speaker.

    Method and System for Cross-Lingual Voice Conversion
    16.
    Invention patent application
    Legal status: In force

    Publication No.: US20150127349A1

    Publication date: 2015-05-07

    Application No.: US14069492

    Filing date: 2013-11-01

    Applicant: Google Inc.

    Abstract: A method and system are disclosed for cross-lingual voice conversion. A speech-to-speech system may include hidden Markov model (HMM) based speech modeling for both recognizing input speech and synthesizing output speech. A cross-lingual HMM may be initially set to an output HMM trained with a voice of an output speaker in an output language. An auxiliary HMM may be trained with a voice of an auxiliary speaker in an input language. A matching procedure, carried out under a transform that compensates for speaker differences, may be used to match each HMM state of the output HMM to an HMM state of the auxiliary HMM. The HMM states of the cross-lingual HMM may be replaced with the matched states. Transforms may be applied to adapt the cross-lingual HMM to the voices of the auxiliary speaker and of an input speaker. The cross-lingual HMM may be used for speech synthesis.

    Methods and Systems for Automated Generation of Nativized Multi-Lingual Lexicons
    17.
    Invention patent application
    Legal status: In force

    Publication No.: US20150095018A1

    Publication date: 2015-04-02

    Application No.: US14283586

    Filing date: 2014-05-21

    Applicant: Google Inc.

    Abstract: An input signal that includes linguistic content in a first language may be received by a computing device. The linguistic content may include text or speech. The computing device may associate the linguistic content in the first language with one or more phonemes from a second language. The computing device may also determine a phonemic representation of the linguistic content in the first language based on use of the one or more phonemes from the second language. The phonemic representation may be indicative of a pronunciation of the linguistic content in the first language according to speech sounds of the second language.

    Methods and systems for automated generation of nativized multi-lingual lexicons
    18.
    Granted invention patent
    Legal status: In force

    Publication No.: US08768704B1

    Publication date: 2014-07-01

    Application No.: US14053052

    Filing date: 2013-10-14

    Applicant: Google Inc.

    Abstract: An input signal that includes linguistic content in a first language may be received by a computing device. The linguistic content may include text or speech. Based on an acoustic feature comparison between a plurality of first-language speech sounds and a plurality of second-language speech sounds, the computing device may associate the linguistic content in the first language with one or more phonemes from a second language. The computing device may also determine a phonemic representation of the linguistic content in the first language based on use of the one or more phonemes from the second language. The phonemic representation may be indicative of a pronunciation of the linguistic content in the first language according to speech sounds of the second language.
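
The acoustic feature comparison can be illustrated as a nearest-neighbour lookup between two phoneme inventories. The Python sketch below is a toy under stated assumptions: the inventories, the two-dimensional feature values (loosely modelled on first and second formants in Hz), and the Euclidean distance are all hypothetical and not taken from the patent.

```python
import math

# Hypothetical acoustic features per phoneme; the numbers are illustrative.
FIRST_LANGUAGE = {"y": (250.0, 1900.0), "a": (750.0, 1300.0)}
SECOND_LANGUAGE = {"i": (280.0, 2250.0), "u": (310.0, 870.0),
                   "ah": (710.0, 1100.0)}


def nearest_phoneme(features, inventory):
    """Return the phoneme in `inventory` acoustically closest to `features`."""
    return min(inventory, key=lambda p: math.dist(features, inventory[p]))


def nativize(phonemes, source=FIRST_LANGUAGE, target=SECOND_LANGUAGE):
    """Rewrite a first-language phoneme sequence with second-language phonemes."""
    return [nearest_phoneme(source[p], target) for p in phonemes]
```

Here `nativize(["y", "a"])` maps each first-language vowel to the acoustically closest second-language vowel, giving a pronunciation expressed entirely in the second language's speech sounds.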

    Devices and Methods for a Speech-Based User Interface
    19.
    Invention patent application
    Legal status: Under examination (published)

    Publication No.: US20160336003A1

    Publication date: 2016-11-17

    Application No.: US14711264

    Filing date: 2015-05-13

    Applicant: Google Inc.

    CPC classification number: G10L13/033 G06F3/167 G10L13/10 G10L2021/0135

    Abstract: A device may identify a plurality of sources for outputs that the device is configured to provide. The plurality of sources may include at least one of a particular application in the device, an operating system of the device, a particular area within a display of the device, or a particular graphical user interface object. The device may also assign a set of distinct voices to respective sources of the plurality of sources. The device may also receive a request for speech output. The device may also select a particular source that is associated with the requested speech output. The device may also generate speech having particular voice characteristics of a particular voice assigned to the particular source.
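
The source-to-voice assignment described here amounts to a lookup table consulted on each speech request. A minimal Python sketch follows; the `VoiceRouter` class, the source names, and the voice labels are hypothetical, and the returned string stands in for a call to a real TTS engine.

```python
class VoiceRouter:
    """Assigns a distinct voice to each output source and selects that
    voice when speech is requested on behalf of the source."""

    def __init__(self, sources, voices):
        if len(set(voices)) < len(sources):
            raise ValueError("need one distinct voice per source")
        # Pair sources with voices in order; extra voices stay unused.
        self._voice_for = dict(zip(sources, voices))

    def speak(self, source, text):
        voice = self._voice_for[source]
        # A real device would hand `text` and `voice` to its TTS engine.
        return f"[{voice}] {text}"
```

For example, a router built with sources `["os", "mail_app"]` and voices `["alto", "bass"]` would render mail notifications in the `bass` voice, letting a listener tell by ear which source is speaking.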

    Devices and Methods for Use of Phase Information in Speech Processing Systems
    20.
    Invention patent application
    Legal status: In force

    Publication No.: US20160005391A1

    Publication date: 2016-01-07

    Application No.: US14631583

    Filing date: 2015-02-25

    Applicant: Google Inc.

    CPC classification number: G10L13/02 G10L13/08 G10L25/75

    Abstract: A device may receive a speech signal. The device may determine acoustic feature parameters for the speech signal. The acoustic feature parameters may include phase data. The device may determine circular space representations for the phase data based on an alignment of the phase data with given axes of the circular space representations. The device may map the phase data to linguistic features based on the circular space representations. The linguistic features may be associated with linguistic content that includes phonemic content or text content. The device may provide a synthetic audio pronunciation of the linguistic content based on the mapping.
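
Phase is an angle, so values near -π and +π are numerically far apart yet describe nearly the same phase. A standard way to handle this is to represent each phase as a point on the unit circle. The Python sketch below shows that general idea only; reading the abstract's "given axes" as the cosine and sine axes is an assumption, and the function names are hypothetical.

```python
import math


def to_circular(phase):
    """Represent a phase angle (radians) as a point on the unit circle."""
    return (math.cos(phase), math.sin(phase))


def from_circular(point):
    """Recover the phase angle, in [-pi, pi], from its circular form."""
    x, y = point
    return math.atan2(y, x)


def circular_distance(a, b):
    """Chord length between two phases on the unit circle."""
    return math.dist(to_circular(a), to_circular(b))
```

Under this representation, phases of -3.13 and +3.13 radians are close together, which makes the data better suited to a statistical mapping onto linguistic features than raw wrapped angles.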
