Electronic device and method for controlling thereof

    Publication number: US11848004B2

    Publication date: 2023-12-19

    Application number: US17850096

    Filing date: 2022-06-27

    CPC classification number: G10L13/10 G10L13/047 G10L13/06

    Abstract: A method for controlling an electronic device includes obtaining a text, obtaining, by inputting the text into a first neural network model, acoustic feature information corresponding to the text and alignment information in which each frame of the acoustic feature information is matched with each phoneme included in the text, identifying an utterance speed of the acoustic feature information based on the alignment information, identifying a reference utterance speed for each phoneme included in the acoustic feature information based on the text and the acoustic feature information, obtaining utterance speed adjustment information based on the utterance speed of the acoustic feature information and the reference utterance speed for each phoneme, and obtaining, based on the utterance speed adjustment information, speech data corresponding to the text by inputting the acoustic feature information into a second neural network model.
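
The speed-related steps in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the neural network models are left out, the alignment is assumed to be a plain list in which entry i is the index of the phoneme matched to frame i, and the function names, the frame shift, and the ratio used as "adjustment information" are all hypothetical choices made for illustration.

```python
from collections import Counter


def frames_per_phoneme(alignment):
    """Count how many frames are matched to each phoneme index.

    Assumption: alignment[i] is the phoneme index for frame i.
    Phonemes that receive no frames do not appear in the result.
    """
    counts = Counter(alignment)
    return [counts[p] for p in sorted(counts)]


def phoneme_speeds(alignment, frame_shift_s):
    """Per-phoneme utterance speed in phonemes per second.

    frame_shift_s is the assumed duration of one frame in seconds.
    """
    return [1.0 / (n * frame_shift_s) for n in frames_per_phoneme(alignment)]


def speed_adjustments(speeds, reference_speeds):
    """Hypothetical adjustment factor per phoneme: reference / observed.

    A value below 1.0 means the phoneme is faster than its reference.
    """
    return [ref / cur for cur, ref in zip(speeds, reference_speeds)]


if __name__ == "__main__":
    alignment = [0, 0, 0, 1, 1, 2, 2, 2, 2, 2]  # 10 frames, 3 phonemes
    speeds = phoneme_speeds(alignment, frame_shift_s=0.01)
    print(speed_adjustments(speeds, reference_speeds=[20.0, 20.0, 20.0]))
```

In this sketch the reference speeds are supplied by hand; in the abstract they are identified from the text and the acoustic feature information.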

    Electronic apparatus and controlling method thereof

    Publication number: US11763799B2

    Publication date: 2023-09-19

    Application number: US17554547

    Filing date: 2021-12-17

    CPC classification number: G10L13/047 G10L13/10

    Abstract: An electronic apparatus and a controlling method thereof are provided. The electronic apparatus includes a microphone; a memory configured to store a text-to-speech (TTS) model and a plurality of evaluation texts; and a processor configured to: obtain a first reference vector of a user speech spoken by a user based on the user speech being received through the microphone, generate a plurality of candidate reference vectors based on the first reference vector, obtain a plurality of synthesized sounds by inputting the plurality of candidate reference vectors and the plurality of evaluation texts to the TTS model, identify at least one synthesized sound of the plurality of synthesized sounds based on a similarity between characteristics of the plurality of synthesized sounds and the user speech, and store a second reference vector of the at least one synthesized sound in the memory as a reference vector corresponding to the user for the TTS model.
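
The candidate-and-select loop described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the claimed method: the TTS model is omitted, each synthesized sound is assumed to be already summarized as one feature vector (for example, averaged over the evaluation texts), candidates are generated by adding Gaussian noise, and similarity is measured with cosine similarity. All function names and parameters are hypothetical.

```python
import math
import random


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def make_candidates(reference, n, scale, rng):
    """Generate n candidate vectors by perturbing the first reference vector.

    Assumption: Gaussian noise with standard deviation `scale`.
    """
    return [[v + rng.gauss(0.0, scale) for v in reference] for _ in range(n)]


def select_reference(candidates, synth_features, user_features):
    """Pick the candidate whose synthesized sound is most similar to the user.

    synth_features[i] is the assumed feature vector of the sound that a TTS
    model produced from candidates[i].
    """
    scores = [cosine_similarity(f, user_features) for f in synth_features]
    best = max(range(len(candidates)), key=scores.__getitem__)
    return candidates[best], scores[best]


if __name__ == "__main__":
    rng = random.Random(0)
    candidates = make_candidates([0.2, 0.5, 0.1], n=4, scale=0.05, rng=rng)
    # Stand-in for TTS output features: reuse the candidates themselves.
    chosen, score = select_reference(candidates, candidates, [0.2, 0.5, 0.1])
    print(chosen, score)
```

The selected vector plays the role of the second reference vector that the abstract stores for the user.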

    Electronic device and method for controlling thereof

    Publication number: US20220406293A1

    Publication date: 2022-12-22

    Application number: US17850096

    Filing date: 2022-06-27

    Abstract: A method for controlling an electronic device includes obtaining a text, obtaining, by inputting the text into a first neural network model, acoustic feature information corresponding to the text and alignment information in which each frame of the acoustic feature information is matched with each phoneme included in the text, identifying an utterance speed of the acoustic feature information based on the alignment information, identifying a reference utterance speed for each phoneme included in the acoustic feature information based on the text and the acoustic feature information, obtaining utterance speed adjustment information based on the utterance speed of the acoustic feature information and the reference utterance speed for each phoneme, and obtaining, based on the utterance speed adjustment information, speech data corresponding to the text by inputting the acoustic feature information into a second neural network model.

    Adaptive time/frequency-based audio encoding and decoding apparatuses and methods
    Granted invention patent (in force)

    Publication number: US08862463B2

    Publication date: 2014-10-14

    Application number: US14041324

    Filing date: 2013-09-30

    CPC classification number: G10L19/12 G10L19/02 G10L19/20

    Abstract: Adaptive time/frequency-based audio encoding and decoding apparatuses and methods. The encoding apparatus includes a transformation & mode determination unit to divide an input audio signal into a plurality of frequency-domain signals and to select a time-based encoding mode or a frequency-based encoding mode for each respective frequency-domain signal, an encoding unit to encode each frequency-domain signal in the respective encoding mode, and a bitstream output unit to output encoded data, division information, and encoding mode information for each respective frequency-domain signal. In the apparatuses and methods, acoustic characteristics and a voicing model are simultaneously applied to a frame, which is an audio compression processing unit. As a result, a compression method effective for both music and voice can be produced, and the compression method can be used for mobile terminals that require audio compression at a low bit rate.
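
The per-band mode selection in this abstract can be sketched as below. The abstract does not state the selection criterion, so this example swaps in spectral flatness as a stand-in: a peaky band is sent to the time-based mode and a flat band to the frequency-based mode. The band edges, the threshold, and all function names are hypothetical, and the input is assumed to be a list of non-negative per-bin power values.

```python
import math


def spectral_flatness(power):
    """Geometric mean divided by arithmetic mean of the band's power values."""
    eps = 1e-12  # avoids log(0) and division by zero
    log_mean = sum(math.log(p + eps) for p in power) / len(power)
    arith_mean = sum(power) / len(power)
    return math.exp(log_mean) / (arith_mean + eps)


def choose_modes(spectrum, edges, threshold=0.5):
    """Split the spectrum into bands and pick an encoding mode for each.

    edges lists the bin boundaries, so band k covers edges[k]:edges[k + 1].
    Each record holds the division information and the chosen mode, which
    mirrors the side information the abstract says is written to the bitstream.
    """
    records = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        band = spectrum[lo:hi]
        # Assumed rule: low flatness (peaky band) -> time-based mode.
        mode = "time" if spectral_flatness(band) < threshold else "frequency"
        records.append({"band": (lo, hi), "mode": mode})
    return records


if __name__ == "__main__":
    spectrum = [1.0, 1.0, 1.0, 1.0, 100.0, 0.01, 0.01, 0.01]
    for record in choose_modes(spectrum, edges=[0, 4, 8]):
        print(record)
```

A real encoder would then encode each band with the chosen mode and emit the encoded data alongside these records.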

