Patent search ap:("Google Inc.") AND inv:"Byungha Chun" Page 1

1.

Invention patent application
MULTILINGUAL PROSODY GENERATION (status: in force)

Publication (announcement) number: US20160071512A1

Publication (announcement) date: 2016-03-10

Application number: US14942300

Application date: 2015-11-16

Applicant: Google Inc.

Inventor: Javier Gonzalvo Fructuoso, Andrew W. Senior, Byungha Chun

IPC: G10L13/10, G10L13/07, G10L25/30, G10L13/08

CPC classification number: G10L13/10, G06F17/289, G10L13/07, G10L13/08, G10L13/086, G10L25/30

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for multilingual prosody generation. In some implementations, data indicating a set of linguistic features corresponding to a text is obtained. Data indicating the linguistic features and data indicating the language of the text are provided as input to a neural network that has been trained to provide output indicating prosody information for multiple languages. The neural network can be a neural network having been trained using speech in multiple languages. Output indicating prosody information for the linguistic features is received from the neural network. Audio data representing the text is generated using the output of the neural network.
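
The input arrangement described in this abstract (linguistic features plus an indication of the text's language, fed to one network) can be sketched as follows. This is only an illustration under stated assumptions, not the patented implementation: the language inventory `LANGUAGES`, the one-hot encoding, the helper names `build_input` and `forward`, and the single tanh hidden layer are all hypothetical choices; the abstract does not specify any of them.

```python
import math

# Hypothetical language inventory; the abstract does not list languages.
LANGUAGES = ["en", "es", "ko"]


def build_input(linguistic_features, language):
    """Concatenate linguistic features with a one-hot language indicator.

    The one-hot encoding is an assumption made for this sketch.
    """
    if language not in LANGUAGES:
        raise ValueError(f"unknown language: {language}")
    one_hot = [1.0 if lang == language else 0.0 for lang in LANGUAGES]
    return list(linguistic_features) + one_hot


def forward(x, hidden_weights, output_weights):
    """Toy network: one tanh hidden layer, then a linear output layer.

    Biases are omitted for brevity. The outputs stand in for prosody
    values; which values those are is not stated in the abstract.
    """
    hidden = [math.tanh(sum(w * v for w, v in zip(row, x)))
              for row in hidden_weights]
    return [sum(w * h for w, h in zip(row, hidden))
            for row in output_weights]


# Usage: two made-up linguistic feature values for a Spanish text.
x = build_input([0.2, 1.0], "es")
prosody = forward(x, [[0.1] * 5, [-0.1] * 5], [[1.0, 0.5]])
```

Because the language indicator is part of the input vector, a single set of weights can serve every language in the inventory, which is the property the abstract attributes to training on speech in multiple languages.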

2.

Granted invention patent
Multilingual prosody generation (status: in force)

Publication (announcement) number: US09195656B2

Publication (announcement) date: 2015-11-24

Application number: US14143627

Application date: 2013-12-30

Applicant: Google Inc.

Inventor: Javier Gonzalvo Fructuoso, Andrew W. Senior, Byungha Chun

IPC: G10L13/08, G06F17/28, G10L13/10

CPC classification number: G10L13/10, G06F17/289, G10L13/07, G10L13/08, G10L13/086, G10L25/30

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for multilingual prosody generation. In some implementations, data indicating a set of linguistic features corresponding to a text is obtained. Data indicating the linguistic features and data indicating the language of the text are provided as input to a neural network that has been trained to provide output indicating prosody information for multiple languages. The neural network can be a neural network having been trained using speech in multiple languages. Output indicating prosody information for the linguistic features is received from the neural network. Audio data representing the text is generated using the output of the neural network.

3.

Invention patent application
Devices and Methods for Use of Phase Information in Speech Processing Systems (status: in force)

Publication (announcement) number: US20160005391A1

Publication (announcement) date: 2016-01-07

Application number: US14631583

Application date: 2015-02-25

Applicant: Google Inc.

Inventor: Ioannis Agiomyrgiannakis, Byungha Chun

IPC: G10L13/02

CPC classification number: G10L13/02, G10L13/08, G10L25/75

Abstract: A device may receive a speech signal. The device may determine acoustic feature parameters for the speech signal. The acoustic feature parameters may include phase data. The device may determine circular space representations for the phase data based on an alignment of the phase data with given axes of the circular space representations. The device may map the phase data to linguistic features based on the circular space representations. The linguistic features may be associated with linguistic content that includes phonemic content or text content. The device may provide a synthetic audio pronunciation of the linguistic content based on the mapping.
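
One common way to give phase data a circular representation is to project each phase angle onto two axes of the unit circle. The sketch below illustrates that general idea only; the abstract does not say which axes or which mapping the patent uses, and the function names `phase_to_circular`, `circular_to_phase`, and `circular_distance` are hypothetical.

```python
import math


def phase_to_circular(phase):
    """Project a phase angle (radians) onto the x and y axes of the unit circle."""
    return (math.cos(phase), math.sin(phase))


def circular_to_phase(x, y):
    """Recover a phase angle in [-pi, pi] from its circular representation."""
    return math.atan2(y, x)


def circular_distance(p, q):
    """Euclidean (chord) distance between two phases on the unit circle."""
    px, py = phase_to_circular(p)
    qx, qy = phase_to_circular(q)
    return math.hypot(px - qx, py - qy)


# Two phases on either side of the wrap-around point at +/- pi.
a = math.pi - 0.01
b = -math.pi + 0.01
raw_gap = abs(a - b)                 # large, although the angles are nearly equal
circle_gap = circular_distance(a, b)  # small, reflecting their true closeness
```

The raw difference between `a` and `b` is close to 2π even though the two angles are almost identical, while the circular representation keeps them close together. That wrap-around behavior is the usual motivation for modeling phase on a circle rather than on a line.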

4.

Granted invention patent
Multilingual prosody generation (status: in force)

Publication (announcement) number: US09905220B2

Publication (announcement) date: 2018-02-27

Application number: US14942300

Application date: 2015-11-16

Applicant: Google Inc.

Inventor: Javier Gonzalvo Fructuoso, Andrew W. Senior, Byungha Chun

IPC: G10L13/08, G10L13/10, G06F17/28, G10L13/07, G10L25/30

CPC classification number: G10L13/10, G06F17/289, G10L13/07, G10L13/08, G10L13/086, G10L25/30

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for multilingual prosody generation. In some implementations, data indicating a set of linguistic features corresponding to a text is obtained. Data indicating the linguistic features and data indicating the language of the text are provided as input to a neural network that has been trained to provide output indicating prosody information for multiple languages. The neural network can be a neural network having been trained using speech in multiple languages. Output indicating prosody information for the linguistic features is received from the neural network. Audio data representing the text is generated using the output of the neural network.

5.

Invention patent application
SPEECH SYNTHESIS MODEL SELECTION (status: under examination, published)

Publication (announcement) number: US20160343366A1

Publication (announcement) date: 2016-11-24

Application number: US14716063

Application date: 2015-05-19

Applicant: Google Inc.

Inventor: Javier Gonzalvo Fructuoso, Byungha Chun

IPC: G10L13/027, G10L13/08, G10L13/047

CPC classification number: G10L13/08, G10L13/047

Abstract: In some implementations, a text-to-speech system may perform a mapping of acoustic frames to linguistic model clusters in a pre-selection process for unit selection synthesis. An architecture may leverage data-driven models, such as neural networks that are trained using recorded speech samples, to effectively map acoustic frames to linguistic model clusters during synthesis. This architecture may allow for improved handling and synthesis of combinations of unseen linguistic features.

6.

Invention patent application
MULTILINGUAL PROSODY GENERATION (status: in force)

Publication (announcement) number: US20150186359A1

Publication (announcement) date: 2015-07-02

Application number: US14143627

Application date: 2013-12-30

Applicant: Google Inc.

Inventor: Javier Gonzalvo Fructuoso, Andrew W. Senior, Byungha Chun

IPC: G06F17/28

CPC classification number: G10L13/10, G06F17/289, G10L13/07, G10L13/08, G10L13/086, G10L25/30

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for multilingual prosody generation. In some implementations, data indicating a set of linguistic features corresponding to a text is obtained. Data indicating the linguistic features and data indicating the language of the text are provided as input to a neural network that has been trained to provide output indicating prosody information for multiple languages. The neural network can be a neural network having been trained using speech in multiple languages. Output indicating prosody information for the linguistic features is received from the neural network. Audio data representing the text is generated using the output of the neural network.
