-
Publication number: US11763799B2
Publication date: 2023-09-19
Application number: US17554547
Filing date: 2021-12-17
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventor: Sangjun Park , Kyoungbo Min , Kihyun Choo , Seungdo Choi
IPC: G10L13/08 , G10L15/14 , G10L15/06 , G10L13/047 , G10L13/10
CPC classification number: G10L13/047 , G10L13/10
Abstract: An electronic apparatus and a controlling method thereof are provided. The electronic apparatus includes a microphone; a memory configured to store a text-to-speech (TTS) model and a plurality of evaluation texts; and a processor configured to: obtain a first reference vector of a user speech spoken by a user based on the user speech being received through the microphone, generate a plurality of candidate reference vectors based on the first reference vector, obtain a plurality of synthesized sounds by inputting the plurality of candidate reference vectors and the plurality of evaluation texts to the TTS model, identify at least one synthesized sound of the plurality of synthesized sounds based on a similarity between characteristics of the plurality of synthesized sounds and the user speech, and store a second reference vector of the at least one synthesized sound in the memory as a reference vector corresponding to the user for the TTS model.
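Illustrative sketch (not part of the patent record): the selection loop described in this abstract could look roughly like the Python below. Everything named here is an assumption made for illustration: the Gaussian perturbation used to generate candidates, the cosine similarity measure, the averaging over evaluation texts, and the `synthesize` callable, which stands in for a TTS model that returns a feature vector for a synthesized sound. The patent abstract does not specify any of these choices.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def make_candidates(first_ref: Vector, count: int, scale: float,
                    rng: random.Random) -> List[Vector]:
    """Assumed candidate generation: the first reference vector itself
    plus (count - 1) copies perturbed with Gaussian noise."""
    candidates = [list(first_ref)]
    for _ in range(count - 1):
        candidates.append([x + rng.gauss(0.0, scale) for x in first_ref])
    return candidates


def select_reference(user_features: Vector,
                     first_ref: Vector,
                     evaluation_texts: Sequence[str],
                     synthesize: Callable[[Vector, str], Vector],
                     count: int = 8,
                     scale: float = 0.1,
                     seed: int = 0) -> Tuple[Vector, float]:
    """Return the candidate whose synthesized sounds are, on average,
    most similar to the user's speech features, with its mean score."""
    if not evaluation_texts:
        raise ValueError("at least one evaluation text is required")
    rng = random.Random(seed)
    best_vec: Vector = list(first_ref)
    best_score = -math.inf
    for cand in make_candidates(first_ref, count, scale, rng):
        # `synthesize` is a hypothetical stand-in for the TTS model.
        scores = [cosine(synthesize(cand, text), user_features)
                  for text in evaluation_texts]
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_vec, best_score = cand, mean_score
    return best_vec, best_score
```

In this sketch the returned vector plays the role of the "second reference vector" that would be stored for the user.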
-
Publication number: US11830473B2
Publication date: 2023-11-28
Application number: US17037023
Filing date: 2020-09-29
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventor: Jesus Monge Alvarez , Holly Francois , Hosang Sung , Seungdo Choi , Kihyun Choo , Sangjun Park
IPC: G10L13/027 , G10L13/047 , G10L15/06 , G10L15/187 , G10L13/06
CPC classification number: G10L13/027 , G10L13/047 , G10L13/06 , G10L15/063 , G10L15/187 , G10L2015/0635
Abstract: A system for synthesising expressive speech includes: an interface configured to receive an input text for conversion to speech; a memory; and at least one processor coupled to the memory. The processor is configured to generate, using an expressivity characterisation module, a plurality of expression vectors, wherein each expression vector is a representation of prosodic information in a reference audio style file, and synthesise expressive speech from the input text, using an expressive acoustic model comprising a deep convolutional neural network that is conditioned by at least one of the plurality of expression vectors.
-
Publication number: US20210225358A1
Publication date: 2021-07-22
Application number: US17037023
Filing date: 2020-09-29
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventor: Jesus MONGE ALVAREZ , Holly Francois , Hosang Sung , Seungdo Choi , Kihyun Choo , Sangjun Park
IPC: G10L13/027 , G10L13/047 , G10L13/06 , G10L15/06 , G10L15/187
Abstract: A system for synthesising expressive speech includes: an interface configured to receive an input text for conversion to speech; a memory; and at least one processor coupled to the memory. The processor is configured to generate, using an expressivity characterisation module, a plurality of expression vectors, wherein each expression vector is a representation of prosodic information in a reference audio style file, and synthesise expressive speech from the input text, using an expressive acoustic model comprising a deep convolutional neural network that is conditioned by at least one of the plurality of expression vectors.
-
Publication number: US12277923B2
Publication date: 2025-04-15
Application number: US17990358
Filing date: 2022-11-18
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventor: Seungdo Choi , Kyoungbo Min , Sooyeon Park
IPC: G10K11/178 , H04R1/10
Abstract: An electronic apparatus includes an inner microphone provided on a first surface of the electronic apparatus; an outer microphone disposed on a second surface opposite the first surface; and a processor configured to: receive a voice signal of a counterpart and a voice signal of a wearer of the electronic apparatus that are input through the inner microphone and the outer microphone, based on a size of the voice signal of the wearer input through the inner microphone being greater than or equal to a predetermined threshold, remove the voice signal of the wearer input through the outer microphone based on the voice signal of the wearer input through the inner microphone, and amplify the voice signal of the counterpart input through the outer microphone and from which the voice signal of the wearer is removed and output the amplified voice signal, wherein the size of the voice signal of the wearer input through the inner microphone is greater than a size of the voice signal of the wearer input through the outer microphone.
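Illustrative sketch (not part of the patent record): the threshold-gated removal and amplification described in this abstract can be pictured per audio frame as below. The RMS level measure, the fixed `leak_gain` scaling of the inner-microphone signal, and the simple subtraction are assumptions chosen to keep the example short; a real device would more likely use an adaptive filter, and the abstract does not state which method is used.

```python
import math
from typing import List, Sequence


def rms(frame: Sequence[float]) -> float:
    """Root-mean-square level of one frame (assumed 'size' measure)."""
    if not frame:
        return 0.0
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def process_frame(inner: Sequence[float],
                  outer: Sequence[float],
                  threshold: float,
                  leak_gain: float,
                  amp_gain: float) -> List[float]:
    """Remove the wearer's voice from the outer-microphone frame when the
    inner-microphone level reaches the threshold, then amplify the rest.

    leak_gain is a hypothetical constant (< 1) modelling how much of the
    wearer's voice, as picked up by the inner microphone, also appears
    at the outer microphone.
    """
    if rms(inner) >= threshold:
        # Wearer is speaking: subtract the scaled inner-microphone signal.
        cleaned = [o - leak_gain * i for o, i in zip(outer, inner)]
    else:
        # Wearer is silent: pass the outer-microphone frame through.
        cleaned = list(outer)
    return [amp_gain * x for x in cleaned]
```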
-
Publication number: US11942077B2
Publication date: 2024-03-26
Application number: US17949741
Filing date: 2022-09-21
Applicant: Samsung Electronics Co., Ltd.
Inventor: Kyoungbo Min , Seungdo Choi , Doohwa Hong
CPC classification number: G10L15/063 , G10L13/00 , G10L15/16
Abstract: An electronic device for providing a text-to-speech (TTS) service and an operating method therefor are provided. The operating method of the electronic device includes obtaining target voice data based on an utterance input of a specific speaker, determining a number of learning steps of the target voice data, based on data features including a data amount of the target voice data, generating a target model by training a pre-trained model pre-trained to convert text into an audio signal, by using the target voice data as training data, based on the determined number of learning steps, generating output data obtained by converting input text into an audio signal, by using the generated target model, and outputting the generated output data.
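Illustrative sketch (not part of the patent record): one simple way to derive a number of learning steps from the amount of target voice data is a clamped linear rule, shown below. The function name, the steps-per-second rate, and the minimum and maximum bounds are all hypothetical values chosen for illustration; the abstract only says that the step count depends on data features including the data amount.

```python
def determine_learning_steps(data_seconds: float,
                             steps_per_second: float = 10.0,
                             min_steps: int = 200,
                             max_steps: int = 5000) -> int:
    """Map the duration of the target voice data to a fine-tuning step
    count, clamped to [min_steps, max_steps].

    All constants are assumed for illustration and are not taken from
    the patent.
    """
    if data_seconds < 0:
        raise ValueError("data_seconds must be non-negative")
    raw_steps = int(round(data_seconds * steps_per_second))
    return max(min_steps, min(max_steps, raw_steps))
```

The clamp reflects a common motivation for such rules: too few steps on a small dataset may underfit the target speaker, while too many may overfit it.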
-
Publication number: US11475878B2
Publication date: 2022-10-18
Application number: US17081251
Filing date: 2020-10-27
Applicant: Samsung Electronics Co., Ltd.
Inventor: Kyoungbo Min , Seungdo Choi , Doohwa Hong
Abstract: An electronic device for providing a text-to-speech (TTS) service and an operating method therefor are provided. The operating method of the electronic device includes obtaining target voice data based on an utterance input of a specific speaker, determining a number of learning steps of the target voice data, based on data features including a data amount of the target voice data, generating a target model by training a pre-trained model pre-trained to convert text into an audio signal, by using the target voice data as training data, based on the determined number of learning steps, generating output data obtained by converting input text into an audio signal, by using the generated target model, and outputting the generated output data.
-
Publication number: US20220148562A1
Publication date: 2022-05-12
Application number: US17554547
Filing date: 2021-12-17
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventor: Sangjun PARK , Kyoungbo Min , Kihyun Choo , Seungdo Choi
IPC: G10L13/047 , G10L13/10
Abstract: An electronic apparatus and a controlling method thereof are provided. The electronic apparatus includes a microphone; a memory configured to store a text-to-speech (TTS) model and a plurality of evaluation texts; and a processor configured to: obtain a first reference vector of a user speech spoken by a user based on the user speech being received through the microphone, generate a plurality of candidate reference vectors based on the first reference vector, obtain a plurality of synthesized sounds by inputting the plurality of candidate reference vectors and the plurality of evaluation texts to the TTS model, identify at least one synthesized sound of the plurality of synthesized sounds based on a similarity between characteristics of the plurality of synthesized sounds and the user speech, and store a second reference vector of the at least one synthesized sound in the memory as a reference vector corresponding to the user for the TTS model.