Speech recognition apparatus and method

    Publication No.: US11183174B2

    Publication date: 2021-11-23

    Application No.: US16351612

    Filing date: 2019-03-13

    Inventor: Ki Soo Kwon

    Abstract: A processor-implemented method of personalizing a speech recognition model includes: obtaining statistical information of first scaling vectors combined with a base model for speech recognition; obtaining utterance data of a user; and generating a personalized speech recognition model by modifying a second scaling vector combined with the base model based on the utterance data of the user and the statistical information.
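The abstract does not say how the statistical information constrains the user's scaling vector. As a rough illustration only, the sketch below assumes the statistics are a per-element mean and standard deviation of the first scaling vectors, and that the second scaling vector is updated by one gradient step on the user's utterance data and then clamped to within `k` standard deviations of the mean. The function name `personalize_scaling_vector`, the parameters `lr` and `k`, and the clamping rule are all hypothetical and are not taken from the patent.

```python
def personalize_scaling_vector(scale, grad, mean, std, lr=0.1, k=2.0):
    """One hypothetical personalization step for a scaling vector.

    scale: current (second) scaling vector for this user
    grad:  gradient of a loss on the user's utterance data w.r.t. scale
    mean, std: assumed per-element statistics of the first scaling vectors
    """
    updated = []
    for s, g, m, sd in zip(scale, grad, mean, std):
        candidate = s - lr * g                 # plain gradient step
        low, high = m - k * sd, m + k * sd     # range implied by the statistics
        updated.append(min(max(candidate, low), high))  # clamp into that range
    return updated
```

Clamping is only one possible reading. A penalty term that pulls the vector toward the mean would be another way to use the same statistics.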

    Apparatus and method with speech recognition and learning

    Publication No.: US11437023B2

    Publication date: 2022-09-06

    Application No.: US16736895

    Filing date: 2020-01-08

    Abstract: A processor-implemented speech recognition method includes: applying, to an input layer of a neural network, a frame of a speech sequence; obtaining an output of a hidden layer of the neural network corresponding to the frame; calculating a statistical value of at least one previous output of the hidden layer corresponding to at least one previous frame of the speech sequence; normalizing the output based on the statistical value; applying the normalized output to a subsequent layer of the neural network; and recognizing the speech sequence based on the applying of the normalized output.
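The frame-by-frame normalization described in this abstract can be sketched as follows. The abstract only says "a statistical value" of previous hidden-layer outputs. The sketch assumes this means a per-unit mean and variance over all previous frames, and it uses a single `tanh` hidden layer as a stand-in for the network. The names `hidden_layer`, `normalize_with_history`, and `run_sequence` are hypothetical.

```python
import math

def hidden_layer(frame, weights):
    """Stand-in hidden layer: tanh of a weighted sum per hidden unit."""
    return [math.tanh(sum(w * x for w, x in zip(row, frame))) for row in weights]

def normalize_with_history(current, history, eps=1e-5):
    """Normalize the current output with statistics of previous outputs only."""
    if not history:
        return list(current)  # first frame: no previous outputs to draw on
    n = len(history)
    normalized = []
    for j, value in enumerate(current):
        past = [h[j] for h in history]
        mean = sum(past) / n
        var = sum((p - mean) ** 2 for p in past) / n
        normalized.append((value - mean) / math.sqrt(var + eps))
    return normalized

def run_sequence(frames, weights):
    """Process a speech sequence frame by frame, keeping past hidden outputs."""
    history, outputs = [], []
    for frame in frames:
        h = hidden_layer(frame, weights)
        outputs.append(normalize_with_history(h, history))  # goes to the next layer
        history.append(h)
    return outputs
```

Because the statistics come only from earlier frames, each frame can be normalized as it arrives, without waiting for the whole sequence.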

    Speech signal recognition system and method

    Publication No.: US10607597B2

    Publication date: 2020-03-31

    Application No.: US15916512

    Filing date: 2018-03-09

    Abstract: A speech signal recognition method, apparatus, and system. The speech signal recognition method may include obtaining by or from a terminal an output of a personalization layer, with respect to a speech signal provided by a user of the terminal, having been implemented by input of the speech signal to the personalization layer, the personalization layer being previously trained based on speech features of the user, implementing a global model by providing the obtained output of the personalization layer to the global model, the global model being configured to output a phonemic signal indicating a phoneme included in the speech signal through the global model being previously trained based on speech features common to a plurality of users, and re-training the personalization layer based on the phonemic signal output from the global model, where the personalization layer and the global model collectively represent an acoustic model.
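A toy version of the personalization-layer and global-model split might look like the sketch below. It rests on several assumptions that are not in the abstract: the personalization layer is a per-feature scale and bias, the global model is a frozen linear softmax classifier over phonemes, and "re-training based on the phonemic signal" means using the global model's most likely phoneme as a pseudo-label for one gradient step on the personalization layer only. All function names are hypothetical.

```python
import math

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]

def personalization_layer(x, scale, bias):
    """User-specific layer (assumed form: element-wise scale and bias)."""
    return [s * xi + b for xi, s, b in zip(x, scale, bias)]

def global_model(h, W):
    """Frozen global model: one row of W per phoneme class."""
    return softmax([sum(w * hi for w, hi in zip(row, h)) for row in W])

def retrain_step(x, scale, bias, W, lr=0.1):
    """Update only the personalization layer, using the global model's output."""
    h = personalization_layer(x, scale, bias)
    probs = global_model(h, W)
    label = max(range(len(probs)), key=probs.__getitem__)  # pseudo-label
    # Cross-entropy gradient w.r.t. logits, then back through the frozen W.
    dlogits = [p - (1.0 if k == label else 0.0) for k, p in enumerate(probs)]
    dh = [sum(dlogits[k] * W[k][j] for k in range(len(W))) for j in range(len(h))]
    new_scale = [s - lr * d * xi for s, d, xi in zip(scale, dh, x)]
    new_bias = [b - lr * d for b, d in zip(bias, dh)]
    return new_scale, new_bias, label
```

In this sketch `W` is never modified, which mirrors the abstract's split between a global model trained on many users and a layer adapted per user.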
