Voice visualization system for English learning, and method therefor

    Publication No.: US12118898B2

    Publication date: 2024-10-15

    Application No.: US18260606

    Filing date: 2022-01-27

    Applicant: Gi Hun Lee

    Inventor: Gi Hun Lee

    Abstract: A speech visualization system according to the present invention includes: a speech signal input unit for receiving speech signals of sentences with English pronunciations; a speech information analysis unit for analyzing speech information with frequencies, energy, and time of the speech signals and the text corresponding to the speech signals to divide the speech information into at least one or more segments; a speech information classification unit for classifying the segments of the speech information into flow units and each flow unit into at least one or more sub flow units each having at least one or more words; a visualization property assignment unit for assigning visualization properties for speech visualization to the analyzed and classified speech information; and a visualization processing unit for performing visualization processing based on the assigned visualization properties to generate speech visualization data.
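
The classification and property-assignment steps in this abstract could be sketched roughly as follows. This is only an illustration: the abstract does not say how flow units are detected, so the pause-based grouping, the `Word` record, the two pause thresholds, and the property names (`x`, `width`, `y`, `weight`) are all assumptions made for the example, not the patented method.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """Hypothetical per-word analysis result (text aligned with time, frequency, energy)."""
    text: str
    start: float     # seconds
    end: float       # seconds
    pitch_hz: float  # representative frequency
    energy: float    # normalized to 0..1


def split_on_pauses(words, min_pause):
    """Group consecutive words; a silence of at least min_pause starts a new group."""
    groups = []
    for w in words:
        if groups and w.start - groups[-1][-1].end < min_pause:
            groups[-1].append(w)
        else:
            groups.append([w])
    return groups


def classify(words, flow_pause=0.30, sub_pause=0.10):
    """Return flow units, each a list of sub flow units, each a list of words.

    Assumption: long pauses separate flow units, shorter pauses separate sub flow units.
    """
    return [split_on_pauses(flow, sub_pause)
            for flow in split_on_pauses(words, flow_pause)]


def assign_properties(word):
    """Map analysis values to hypothetical drawing properties."""
    return {
        "text": word.text,
        "x": word.start,                 # horizontal position follows time
        "width": word.end - word.start,  # duration
        "y": word.pitch_hz,              # vertical position follows pitch
        "weight": word.energy,           # stroke weight follows energy
    }


def visualize(words):
    """Produce nested visualization data: flow units > sub flow units > words."""
    return [[[assign_properties(w) for w in sub] for sub in flow]
            for flow in classify(words)]
```

Calling `visualize` on a list of `Word` records yields a nested list that a renderer could draw directly; a real system would derive the `Word` values from the audio and its transcript rather than take them as input.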

    Automatic speech sensitivity adjustment feature

    Publication No.: US12106751B2

    Publication date: 2024-10-01

    Application No.: US16555845

    Filing date: 2019-08-29

    Abstract: An automatic speech sensitivity adjustment feature is provided. The described sensitivity feature can enable an automatic system adjustment of a sensitivity level based on the number and type of determined speech errors. The sensitivity level determines how sensitive the sensitivity feature will be when indicating speech errors. The sensitivity feature can receive audio input comprising one or more spoken words and determine speech errors for the audio input using at least a sensitivity level. The sensitivity feature can determine whether an amount and type of the speech errors requires an adjustment to the sensitivity level. The sensitivity feature can adjust the sensitivity level to a second sensitivity level based on the amount and type of the speech errors, where the second sensitivity level is a different level than the sensitivity level. The sensitivity feature can re-determine the speech errors for the audio input using at least the second sensitivity level.
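
The determine, check, adjust, and re-determine loop described here might look something like the sketch below. The severity scores, the `THRESHOLDS` table, and the limits `max_total` and `max_per_type` are hypothetical stand-ins; the abstract does not specify how errors are scored or what triggers an adjustment.

```python
from collections import Counter

# Assumption: each candidate error is a (type, severity) pair with severity in 0..1,
# and a higher sensitivity level flags lower-severity errors.
THRESHOLDS = {1: 0.8, 2: 0.5, 3: 0.2}


def determine_errors(candidates, level):
    """Keep the candidate errors that are severe enough for this sensitivity level."""
    return [c for c in candidates if c[1] >= THRESHOLDS[level]]


def needs_adjustment(errors, max_total=3, max_per_type=2):
    """Decide from the amount and type of errors whether the level is too sensitive."""
    per_type = Counter(error_type for error_type, _ in errors)
    return len(errors) > max_total or any(n > max_per_type for n in per_type.values())


def analyze(candidates, level):
    """Determine errors, adjust the level once if needed, then re-determine."""
    errors = determine_errors(candidates, level)
    if level > 1 and needs_adjustment(errors):
        level -= 1  # the "second sensitivity level"
        errors = determine_errors(candidates, level)
    return level, errors
```

This version only ever lowers the level by one step, which matches the single adjustment the abstract describes; raising the level when too few errors are found would be a symmetric extension.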

    Methods and systems for confusion reduction for compressed acoustic models

    Publication No.: US12067978B2

    Publication date: 2024-08-20

    Application No.: US17335663

    Filing date: 2021-06-01

    Abstract: Methods and systems are disclosed herein for improvements relating to compressed automatic speech recognition (ASR) systems. The ASR system may comprise a compressed acoustic engine and an adaptive decoder. The adaptive decoder may be dynamically compiled based on characteristics of the compressed acoustic engine and a current state of the application device. In some embodiments, a dynamic command list is used to manage context-specific commands. Two or more commands recognized by the adaptive decoder may be confusable due to compression of the ASR system. Alternate commands may be determined that are semantically equivalent but phonetically different than the confusable commands to reduce classification error of the adaptive decoder. An alternate command may replace one or more of the confusable commands in the adaptive decoder. In some embodiments, a user interface is displayed to a user of the ASR system to select the alternate command for replacement in the decoder.
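
The idea of swapping a confusable command for a semantically equivalent alternate can be illustrated with a small sketch. It makes two loud assumptions: spelling similarity (via Python's `difflib.SequenceMatcher`) stands in for phonetic similarity, and the `alternates` table of equivalent phrasings is supplied by hand. A real system would compare phoneme sequences or acoustic-model outputs instead.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a, b):
    """Spelling similarity in 0..1, used here as a crude proxy for phonetic similarity."""
    return SequenceMatcher(None, a, b).ratio()


def confusable_pairs(commands, threshold=0.7):
    """Return command pairs that are similar enough to be confused."""
    return [(a, b) for a, b in combinations(commands, 2)
            if similarity(a, b) >= threshold]


def reduce_confusion(commands, alternates, threshold=0.7):
    """Replace one command of each confusable pair with a sufficiently distinct alternate.

    alternates maps a command to hypothetical, semantically equivalent phrasings.
    """
    commands = list(commands)
    for _, b in confusable_pairs(commands, threshold):
        if b not in commands:
            continue  # already replaced while handling an earlier pair
        others = [c for c in commands if c != b]
        options = [alt for alt in alternates.get(b, [])
                   if all(similarity(alt, c) < threshold for c in others)]
        if options:
            commands[commands.index(b)] = options[0]
    return commands
```

For example, with the commands `"call mom"`, `"call tom"`, and `"hang up"` and the alternate `"phone tom"` registered for `"call tom"`, the second command would be replaced because it is too close to the first. The abstract also mentions letting the user pick the alternate through a user interface; here the first acceptable option is taken automatically.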

    Real-time name mispronunciation detection

    Publication No.: US12020683B2

    Publication date: 2024-06-25

    Application No.: US17513335

    Filing date: 2021-10-28

    Abstract: A real-time name mispronunciation detection feature can enable a user to receive instant feedback anytime they have mispronounced another person's name in an online meeting. The feature can receive audio input of a speaker and obtain a transcript of the audio input; identify a name from text of the transcript based on names of meeting participants; and extract a portion of the audio input corresponding to the name identified from the text of the transcript. The feature can obtain a reference pronunciation for the name using a user identifier associated with the name; and can obtain a pronunciation score for the name based on a comparison between the reference pronunciation for the name and the portion of the audio input corresponding to the name. The feature can then determine whether the pronunciation score is below a threshold; and in response, notify the speaker of a pronunciation error.
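
The identify, look up, score, and notify steps in this abstract can be sketched as below. The sketch assumes a hypothetical upstream recognizer has already turned the audio into `(word, observed_phonemes)` pairs, and it scores pronunciation with `difflib.SequenceMatcher` over phoneme lists. The scoring method, the data shapes, and the 0.8 threshold are all illustrative choices, not details taken from the patent.

```python
from difflib import SequenceMatcher


def pronunciation_score(reference, observed):
    """Similarity in 0..1 between two phoneme sequences (an illustrative scoring choice)."""
    return SequenceMatcher(None, reference, observed).ratio()


def check_names(transcript, participants, references, threshold=0.8):
    """Return (name, score) for every participant name that scored below threshold.

    transcript:   list of (word, observed_phonemes) from a hypothetical recognizer
    participants: lower-cased participant name -> user identifier
    references:   user identifier -> reference phoneme list
    """
    alerts = []
    for word, observed in transcript:
        user_id = participants.get(word.lower())
        if user_id is None:
            continue  # not the name of a meeting participant
        score = pronunciation_score(references[user_id], observed)
        if score < threshold:
            alerts.append((word, score))  # the caller would notify the speaker here
    return alerts
```

Keying the reference pronunciation on a user identifier rather than on the name itself follows the abstract, and it lets two participants with the same spelling keep different reference pronunciations.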