-
1.
Publication number: US20230335112A1
Publication date: 2023-10-19
Application number: US18213929
Application date: 2023-06-26
Applicant: Samsung Electronics Co., Ltd.
Inventor: Junesig SUNG , Taehoon KIM , Nikos ELLINAS , Pirros TSIAKOULIS , Hyoungmin PARK
IPC: G10L13/10
CPC classification number: G10L13/10
Abstract: According to certain embodiments, an electronic device comprises: a memory storing therein instructions; and a processor electrically connected to the memory and configured to execute the instructions, wherein, when the instructions are executed by the processor, the processor receives training data comprising a plurality of phonemes; determines a prosody value for each one of the plurality of phonemes in the training data; clusters the plurality of phonemes based on the prosody value for each one of the plurality of phonemes in the training data, thereby resulting in a plurality of prosody clusters; extracts a phoneme sequence corresponding to a text in the training data; extracts a prosody cluster index sequence corresponding to an utterance of the text by selecting one of the plurality of prosody clusters based on prosody values of the utterance of the text; and generates a text-to-speech (TTS) model based on the phoneme sequence and the prosody cluster index sequence.
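The clustering and indexing steps in the abstract above can be illustrated with a minimal sketch. It assumes each phoneme has a single scalar prosody value (for example, mean pitch in Hz) and uses a plain 1-D k-means; the abstract does not name a clustering algorithm, and the function names `kmeans_1d`, `nearest`, and `cluster_index_sequence` are hypothetical.

```python
from typing import List, Sequence


def nearest(centers: Sequence[float], value: float) -> int:
    """Index of the cluster center closest to a prosody value."""
    return min(range(len(centers)), key=lambda i: abs(centers[i] - value))


def kmeans_1d(values: Sequence[float], k: int, iters: int = 20) -> List[float]:
    """Cluster scalar prosody values; returns the cluster centers, sorted.

    Assumption: k-means stands in for whatever clustering the patent uses.
    """
    ordered = sorted(values)
    # Deterministic init: spread the initial centers across the sorted values.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        buckets: List[List[float]] = [[] for _ in range(k)]
        for v in values:
            buckets[nearest(centers, v)].append(v)
        # Keep the previous center when a cluster receives no points.
        centers = [sum(b) / len(b) if b else c for b, c in zip(buckets, centers)]
    return sorted(centers)


def cluster_index_sequence(centers: Sequence[float],
                           utterance_prosody: Sequence[float]) -> List[int]:
    """Map each phoneme's prosody value in an utterance to a cluster index."""
    return [nearest(centers, v) for v in utterance_prosody]


# Hypothetical per-phoneme pitch values (Hz) from training data.
training_pitch = [100, 105, 110, 200, 210, 205, 300, 310]
centers = kmeans_1d(training_pitch, k=3)
print(cluster_index_sequence(centers, [102, 298, 207]))
```

In this sketch the resulting index sequence would be paired with the phoneme sequence of the same utterance as input for TTS model training; the model training itself is outside the scope of the example.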
-
2.
Publication number: US20220301544A1
Publication date: 2022-09-22
Application number: US17654881
Application date: 2022-03-15
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventor: Cheol RYU , Kwanghoon KIM , Junesig SUNG
IPC: G10L13/08 , G10L15/187 , G10L15/22 , G10L13/04 , G10L15/30
Abstract: According to an embodiment, an electronic device comprises: a memory and at least one processor operatively connected with the memory. The at least one processor is configured to: in response to a voice assistant application being executed, identify a pronunciation variant for which an amount of sound source data stored in the memory is less than a specified value among a plurality of pronunciation variants, identify a subject based on the identified pronunciation variant, obtain a question text corresponding to a word including the identified pronunciation variant among a plurality of words included in the subject, output a question speech corresponding to the question text, and receive an utterance after outputting the question speech.
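The selection logic described in the abstract above can be sketched as follows. This is an illustration only: the data layout (per-variant sample counts, subjects mapped to words and their pronunciation variants), the function names, and the question template are all assumptions, not details taken from the patent.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical layout: subject -> word -> pronunciation variants in that word.
Subjects = Dict[str, Dict[str, List[str]]]


def rarest_variant(sample_counts: Dict[str, int], min_samples: int) -> Optional[str]:
    """Return the variant with the fewest stored samples below the threshold."""
    scarce = [v for v, n in sample_counts.items() if n < min_samples]
    if not scarce:
        return None
    # Ties are broken alphabetically so the result is deterministic.
    return min(scarce, key=lambda v: (sample_counts[v], v))


def find_prompt_word(subjects: Subjects, variant: str) -> Optional[Tuple[str, str]]:
    """Find a (subject, word) pair whose word contains the given variant."""
    for subject, words in subjects.items():
        for word, variants in words.items():
            if variant in variants:
                return subject, word
    return None


# Hypothetical stored-sample counts per pronunciation variant.
counts = {"AA": 40, "ZH": 3, "OY": 7}
subjects: Subjects = {
    "weather": {"sunny": ["S", "AH", "N", "IY"]},
    "hobbies": {"leisure": ["L", "IY", "ZH", "ER"]},
}

variant = rarest_variant(counts, min_samples=10)
if variant is not None:
    match = find_prompt_word(subjects, variant)
    if match is not None:
        subject, word = match
        # Assumed template; the device would synthesize this as question speech.
        print(f"Let's talk about {subject}. Could you say the word '{word}'?")
```

The utterance received after the question would then add sound source data for the under-covered variant.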
-
3.
Publication number: US20230395060A1
Publication date: 2023-12-07
Application number: US18235124
Application date: 2023-08-17
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventor: Sangjun PARK , Kihyun CHOO , Hyoungmin PARK , Junesig SUNG
IPC: G10L13/047 , G10L13/10
CPC classification number: G10L13/047 , G10L13/10
Abstract: An electronic apparatus, a terminal apparatus, and a controlling method thereof. The electronic apparatus includes an input interface; and a processor including a prosody module configured to extract an acoustic feature and a vocoder module configured to generate a speech waveform, wherein the processor is configured to: receive a text input using the input interface; identify a first acoustic feature from the text input using the prosody module, wherein the first acoustic feature corresponds to a first sampling rate; generate a modified acoustic feature corresponding to a modified sampling rate different from the first sampling rate, based on the identified first acoustic feature; and generate a plurality of vocoder learning models by training the vocoder module based on the first acoustic feature and the modified acoustic feature.
-
4.
Publication number: US20230267925A1
Publication date: 2023-08-24
Application number: US17712699
Application date: 2022-04-04
Applicant: Samsung Electronics Co., Ltd.
Inventor: Junesig SUNG , Shinjae KANG , Chakladar SUBHOJIT
IPC: G10L15/183 , G10L25/60 , G10L15/16 , G10L13/10 , G10L25/90 , G10L15/02 , G10L25/18 , G10L15/22 , G10L15/06
CPC classification number: G10L15/183 , G10L13/10 , G10L15/02 , G10L15/063 , G10L15/16 , G10L15/22 , G10L25/18 , G10L25/60 , G10L25/90 , G10L2015/025
Abstract: An electronic device is provided. The electronic device includes a processor, and a memory operatively connected to the processor, wherein the memory stores instructions which, when executed, cause the processor to generate multiple sound sources for a designated text including at least one designated word, based on a personalized-text-to-speech model constructed with a designated user voice, and perform deep learning of a personalized automatic speech recognition model by using the multiple generated sound sources.
-
5.
Publication number: US20220301542A1
Publication date: 2022-09-22
Application number: US17830574
Application date: 2022-06-02
Applicant: Samsung Electronics Co., Ltd.
Inventor: Junesig SUNG , Kwanghoon KIM , Hyoungmin PARK
IPC: G10L13/02
Abstract: An electronic device includes a memory storing instructions and a processor configured to execute the instructions. When the instructions are executed by the processor, the processor records a speech of a user corresponding to a text and obtains recorded data in which the text and the speech of the user are matched, stores an intermediate model trained based on a portion of the recorded data while training a speech model to generate a personalized text-to-speech (P-TTS) model corresponding to the user, generates an intermediate result from the training using the intermediate model and provides the generated intermediate result to the user, and receives feedback from the user on the intermediate result. Other example embodiments, in addition to the foregoing example embodiment, are also applicable.
-
6.
Publication number: US20170110113A1
Publication date: 2017-04-20
Application number: US15293879
Application date: 2016-10-14
Applicant: Samsung Electronics Co., Ltd.
Inventor: Junesig SUNG , Gunu JHO , Jaecheol BAE , Gwanghoon KIM , Hana KO , Sora BAE , Eunzu YUN , Hongil CHO
IPC: G10L13/08 , G10L13/04 , G10L13/033
CPC classification number: G10L13/086 , G10L13/0335 , G10L13/04 , G10L13/06
Abstract: An electronic device is provided. The electronic device includes a processor and a memory electrically connected to the processor. The memory stores a super-clustered common acoustic data set and instructions to allow the processor to acquire at least one text, select information associated with a speech into which the acquired text is transformed, when the selected information is first information, select at least one of first paths, load elements of the super-clustered common acoustic data set based on the selected first paths, and generate a first acoustic signal based on the elements of the super-clustered common acoustic data set, and when the selected information is second information, select at least one of second paths, load elements of the super-clustered common acoustic data set based on the at least one second path, and generate a second acoustic signal based on the elements of the super-clustered common acoustic data set.