Patent search ap:("Nvidia Corporation") AND inv:"Wei Ping" Page 1

1.

Invention Grant
Unsupervised alignment for text to speech synthesis using neural networks (Status: active)

Publication (announcement) number: US11869483B2

Publication (announcement) date: 2024-01-09

Application number: US17496636

Application date: 2021-10-07

Applicant: Nvidia Corporation

Inventor: Kevin Shih , Jose Rafael Valle Gomes da Costa , Rohan Badlani , Adrian Lancucki , Wei Ping , Bryan Catanzaro

IPC: G10L13/00 , G10L13/08 , G10L13/10 , G10L13/047 , G10L25/90 , G06N3/045 , G06N3/08 , G10L13/033

CPC classification number: G10L13/047 , G06N3/045 , G06N3/08 , G10L13/0335 , G10L13/08 , G10L25/90

Abstract: Generation of synthetic speech from an input text sequence may be difficult when durations of individual phonemes forming the input text sequence are unknown. A predominantly parallel process may model speech rhythm as a separate generative distribution such that phoneme duration may be sampled at inference. Additional information such as pitch or energy may also be sampled to provide improved diversity for synthetic speech generation.

2.

Invention Publication
UNSUPERVISED ALIGNMENT FOR TEXT TO SPEECH SYNTHESIS USING NEURAL NETWORKS (Status: under examination, published)

Publication (announcement) number: US20230402028A1

Publication (announcement) date: 2023-12-14

Application number: US18457221

Application date: 2023-08-28

Applicant: Nvidia Corporation

Inventor: Kevin Shih , Jose Rafael Valle Gomes da Costa , Rohan Badlani , Adrian Lancucki , Wei Ping , Bryan Catanzaro

IPC: G10L13/047 , G10L13/033 , G10L13/08 , G06N3/08 , G06N3/045 , G10L25/90

CPC classification number: G10L13/047 , G10L13/0335 , G10L13/08 , G06N3/08 , G06N3/045 , G10L25/90

Abstract: Generation of synthetic speech from an input text sequence may be difficult when durations of individual phonemes forming the input text sequence are unknown. A predominantly parallel process may model speech rhythm as a separate generative distribution such that phoneme duration may be sampled at inference. Additional information such as pitch or energy may also be sampled to provide improved diversity for synthetic speech generation.

3.

Invention Publication
UNSUPERVISED ALIGNMENT FOR TEXT TO SPEECH SYNTHESIS USING NEURAL NETWORKS (Status: under examination, published)

Publication (announcement) number: US20230419947A1

Publication (announcement) date: 2023-12-28

Application number: US18449969

Application date: 2023-08-15

Applicant: Nvidia Corporation

Inventor: Kevin Shih , Jose Rafael Valle Gomes da Costa , Rohan Badlani , Adrian Lancucki , Wei Ping , Bryan Catanzaro

IPC: G10L13/047 , G10L25/90 , G06N3/045 , G06N3/08 , G10L13/033 , G10L13/08

CPC classification number: G10L13/047 , G10L25/90 , G10L13/08 , G06N3/08 , G10L13/0335 , G06N3/045

Abstract: Generation of synthetic speech from an input text sequence may be difficult when durations of individual phonemes forming the input text sequence are unknown. A predominantly parallel process may model speech rhythm as a separate generative distribution such that phoneme duration may be sampled at inference. Additional information such as pitch or energy may also be sampled to provide improved diversity for synthetic speech generation.

4.

Invention Grant
Unsupervised alignment for text to speech synthesis using neural networks (Status: active)

Publication (announcement) number: US11769481B2

Publication (announcement) date: 2023-09-26

Application number: US17496569

Application date: 2021-10-07

Applicant: Nvidia Corporation

Inventor: Kevin Shih , Jose Rafael Valle Gomes da Costa , Rohan Badlani , Adrian Lancucki , Wei Ping , Bryan Catanzaro

IPC: G10L13/00 , G10L13/10 , G10L13/06 , G10L13/07 , G10L13/047 , G10L25/90 , G06N3/045 , G06N3/08 , G10L13/033 , G10L13/08

CPC classification number: G10L13/047 , G06N3/045 , G06N3/08 , G10L13/0335 , G10L13/08 , G10L25/90

Abstract: Generation of synthetic speech from an input text sequence may be difficult when durations of individual phonemes forming the input text sequence are unknown. A predominantly parallel process may model speech rhythm as a separate generative distribution such that phoneme duration may be sampled at inference. Additional information such as pitch or energy may also be sampled to provide improved diversity for synthetic speech generation.

5.

Invention Application
UNSUPERVISED ALIGNMENT FOR TEXT TO SPEECH SYNTHESIS USING NEURAL NETWORKS (Status: active)

Publication (announcement) number: US20230113950A1

Publication (announcement) date: 2023-04-13

Application number: US17496569

Application date: 2021-10-07

Applicant: Nvidia Corporation

Inventor: Kevin Shih , Jose Rafael Valle Gomes da Costa , Rohan Badlani , Adrian Lancucki , Wei Ping , Bryan Catanzaro

IPC: G10L13/047 , G10L25/90

Abstract: Generation of synthetic speech from an input text sequence may be difficult when durations of individual phonemes forming the input text sequence are unknown. A predominantly parallel process may model speech rhythm as a separate generative distribution such that phoneme duration may be sampled at inference. Additional information such as pitch or energy may also be sampled to provide improved diversity for synthetic speech generation.

6.

Invention Application
UNSUPERVISED ALIGNMENT FOR TEXT TO SPEECH SYNTHESIS USING NEURAL NETWORKS (Status: active)

Publication (announcement) number: US20230110905A1

Publication (announcement) date: 2023-04-13

Application number: US17496636

Application date: 2021-10-07

Applicant: Nvidia Corporation

Inventor: Kevin Shih , Jose Rafael Valle Gomes da Costa , Rohan Badlani , Adrian Lancucki , Wei Ping , Bryan Catanzaro

IPC: G10L13/08 , G10L13/047 , G10L13/033 , G06N3/08 , G06N3/04

Abstract: Generation of synthetic speech from an input text sequence may be difficult when durations of individual phonemes forming the input text sequence are unknown. A predominantly parallel process may model speech rhythm as a separate generative distribution such that phoneme duration may be sampled at inference. Additional information such as pitch or energy may also be sampled to provide improved diversity for synthetic speech generation.

7.

Invention Publication
NEURAL NETWORK-BASED LANGUAGE RESTRICTION (Status: under examination, published)

Publication (announcement) number: US20240095447A1

Publication (announcement) date: 2024-03-21

Application number: US17846866

Application date: 2022-06-22

Applicant: Nvidia Corporation

Inventor: Wei Ping , Boxin Wang , Chaowei Xiao , Mohammad Shoeybi , Mostofa Patwary , Anima Anandkumar , Bryan Catanzaro

IPC: G06F40/279 , G06F40/205 , G06F40/55

CPC classification number: G06F40/279 , G06F40/205 , G06F40/55

Abstract: Apparatuses, systems, and techniques are presented to identify and prevent generation of restricted content. In at least one embodiment, one or more neural networks are used to identify restricted content based only on the restricted content.

8.

Invention Application
SYNTHESIZING VIDEO FROM AUDIO USING ONE OR MORE NEURAL NETWORKS (Status: active)

Publication (announcement) number: US20230035306A1

Publication (announcement) date: 2023-02-02

Application number: US17382027

Application date: 2021-07-21

Applicant: Nvidia Corporation

Inventor: Ming-Yu Liu , Koki Nagano , Yeongho Seol , Jose Rafael Valle Gomes da Costa , Jaewoo Seo , Ting-Chun Wang , Arun Mallya , Sameh Khamis , Wei Ping , Rohan Badlani , Kevin Jonathan Shih , Bryan Catanzaro , Simon Yuen , Jan Kautz

IPC: G06T13/40 , H04N19/597 , G06N3/04 , G10L13/04 , G06T9/00 , G06T17/10 , G06T13/20

Abstract: Apparatuses, systems, and techniques are presented to generate media content. In at least one embodiment, a first neural network is used to generate first video information based, at least in part, upon voice information corresponding to one or more users, and a second neural network is used to generate second video information corresponding to the one or more users based, at least in part, upon the first video information and one or more images corresponding to the one or more users.
