ELECTRONIC DEVICE AND METHOD OF GENERATING TEXT-TO-SPEECH MODEL FOR PROSODY CONTROL OF THE ELECTRONIC DEVICE

    Publication number: US20230335112A1

    Publication date: 2023-10-19

    Application number: US18213929

    Application date: 2023-06-26

    CPC classification number: G10L13/10

    Abstract: According to certain embodiments, an electronic device comprises: a memory storing therein instructions; and a processor electrically connected to the memory and configured to execute the instructions, wherein, when the instructions are executed by the processor, the processor receives training data comprising a plurality of phonemes; determines a prosody value for each one of the plurality of phonemes in the training data; clusters the plurality of phonemes based on the prosody value for each one of the plurality of phonemes in the training data, thereby resulting in a plurality of prosody clusters; extracts a phoneme sequence corresponding to a text in the training data; extracts a prosody cluster index sequence corresponding to an utterance of the text by selecting one of the plurality of prosody clusters based on prosody values of the utterance of the text; and generates a text-to-speech (TTS) model based on the phoneme sequence and the prosody cluster index sequence.
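
The clustering and index-extraction steps in this abstract can be illustrated with a minimal sketch. The abstract does not name a clustering algorithm or a prosody feature, so the code below assumes a plain one-dimensional k-means over a single per-phoneme value (for example mean pitch in Hz). The function names `cluster_prosody` and `to_cluster_indices` are hypothetical and do not come from the patent.

```python
from typing import List, Sequence


def cluster_prosody(values: Sequence[float], k: int, iters: int = 20) -> List[float]:
    """Assumed 1-D k-means over per-phoneme prosody values; returns sorted centroids."""
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and the number of values")
    ordered = sorted(values)
    # Deterministic initialisation: evenly spaced picks from the sorted values.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        buckets: List[List[float]] = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centroids[c]))
            buckets[nearest].append(v)
        # An empty bucket keeps its previous centroid.
        updated = [sum(b) / len(b) if b else centroids[i] for i, b in enumerate(buckets)]
        if updated == centroids:
            break
        centroids = updated
    return sorted(centroids)


def to_cluster_indices(utterance_values: Sequence[float], centroids: Sequence[float]) -> List[int]:
    """Map each prosody value of an utterance to the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda c: abs(v - centroids[c]))
        for v in utterance_values
    ]


# Illustrative values only: per-phoneme pitch in Hz from hypothetical training data.
centroids = cluster_prosody([100, 105, 110, 200, 210, 220], k=2)
index_sequence = to_cluster_indices([102, 215, 108], centroids)
```

In this sketch the resulting index sequence would be paired with the phoneme sequence of the same text as training input for the TTS model; how the model consumes the pair is not specified in the abstract.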

    ELECTRONIC DEVICE INCLUDING PERSONALIZED TEXT TO SPEECH MODULE AND METHOD FOR CONTROLLING THE SAME

    Publication number: US20220301544A1

    Publication date: 2022-09-22

    Application number: US17654881

    Application date: 2022-03-15

    Abstract: According to an embodiment, an electronic device comprises: a memory and at least one processor operatively connected with the memory. The at least one processor is configured to: in response to a voice assistant application being executed, identify a pronunciation variant for which an amount of sound source data stored in the memory is less than a specified value among a plurality of pronunciation variants, identify a subject based on the identified pronunciation variant, obtain a question text corresponding to a word including the identified pronunciation variant among a plurality of words included in the subject, output a question speech corresponding to the question text, and receive an utterance after outputting the question speech.
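
The selection logic in this abstract can be sketched as follows. The data layout is an assumption made for illustration: stored audio is summarised as seconds per pronunciation variant, and each subject maps words to the variants they contain. The names `find_sparse_variant` and `plan_question` are hypothetical and are not taken from the patent.

```python
from typing import Dict, List, Optional, Tuple


def find_sparse_variant(seconds_by_variant: Dict[str, float], min_seconds: float) -> Optional[str]:
    """Return the variant with the least stored audio if it is below min_seconds, else None."""
    if not seconds_by_variant:
        return None
    variant = min(seconds_by_variant, key=seconds_by_variant.get)
    return variant if seconds_by_variant[variant] < min_seconds else None


def plan_question(
    variant: str, subjects: Dict[str, Dict[str, List[str]]]
) -> Optional[Tuple[str, str]]:
    """Pick the first (subject, word) pair whose word contains the given variant."""
    for subject, words in subjects.items():
        for word, variants in words.items():
            if variant in variants:
                return subject, word
    return None


# Illustrative data only: seconds of recorded audio per pronunciation variant.
stored = {"ae": 12.0, "th": 1.5, "r": 30.0}
subjects = {
    "weather": {"thunder": ["th", "r"], "rain": ["r"]},
    "food": {"apple": ["ae"]},
}

variant = find_sparse_variant(stored, min_seconds=5.0)
target = plan_question(variant, subjects) if variant else None
```

A question text for the chosen word would then be synthesized as question speech, and the user's answer would supply new audio for the under-represented variant.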

    ELECTRONIC APPARATUS, TERMINAL APPARATUS AND CONTROLLING METHOD THEREOF

    Publication number: US20230395060A1

    Publication date: 2023-12-07

    Application number: US18235124

    Application date: 2023-08-17

    CPC classification number: G10L13/047 G10L13/10

    Abstract: An electronic apparatus, a terminal apparatus, and a controlling method thereof. The electronic apparatus includes an input interface; and a processor including a prosody module configured to extract an acoustic feature and a vocoder module configured to generate a speech waveform, wherein the processor is configured to: receive a text input using the input interface; identify a first acoustic feature from the text input using the prosody module, wherein the first acoustic feature corresponds to a first sampling rate; generate a modified acoustic feature corresponding to a modified sampling rate different from the first sampling rate, based on the identified first acoustic feature; and generate a plurality of vocoder learning models by training the vocoder module based on the first acoustic feature and the modified acoustic feature.

    ELECTRONIC DEVICE AND PERSONALIZED TEXT-TO-SPEECH MODEL GENERATION METHOD OF THE ELECTRONIC DEVICE

    Publication number: US20220301542A1

    Publication date: 2022-09-22

    Application number: US17830574

    Application date: 2022-06-02

    Abstract: An electronic device includes a memory storing instructions and a processor configured to execute the instructions. When the instructions are executed by the processor, the processor records a speech of a user corresponding to a text and obtains recorded data in which the text and the speech of the user are matched, stores an intermediate model trained based on a portion of the recorded data while training a speech model to generate a personalized text-to-speech (P-TTS) model corresponding to the user, generates an intermediate result from the training using the intermediate model and provides the generated intermediate result to the user, and receives feedback from the user on the intermediate result. Other example embodiments, in addition to the foregoing example embodiment, are also applicable.
