AUTOMATIC SPEAKER IDENTIFICATION USING SPEECH RECOGNITION FEATURES

    Publication No.: US20190378517A1

    Publication date: 2019-12-12

    Application No.: US16448788

    Filing date: 2019-06-21

    Abstract: Features are disclosed for automatically identifying a speaker. Artifacts of automatic speech recognition (“ASR”) and/or other automatically determined information may be processed against individual user profiles or models. Scores may be determined reflecting the likelihood that individual users made an utterance. The scores can be based on, e.g., individual components of Gaussian mixture models (“GMMs”) that score best for frames of audio data of an utterance. A user associated with the highest likelihood score for a particular utterance can be identified as the speaker of the utterance. Information regarding the identified user can be provided to components of a spoken language processing system, separate applications, etc.

    Bounded access to critical data
    23.
    Granted patent

    Publication No.: US10320757B1

    Publication date: 2019-06-11

    Application No.: US14298181

    Filing date: 2014-06-06

    Abstract: A secure repository receives and stores user data, and shares the user data with trusted client devices. The user data may be shared individually or as part of bundled data relating to multiple users, but in either case, the secure repository associates specific data with specific users. This association is maintained by the trusted client devices, even after the data is altered by processing on the client device. If a user requests a purge of their data, the system deletes and/or disables that data on both the repository and the client devices, as well as deleting and/or disabling processed data derived from that user's data, unless a determination has been made that the processed data no longer contains confidential information.
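
    Illustrative sketch (not taken from the patent): the purge rule in the abstract can be pictured as a store that tags every record with the users it relates to, carries that tag onto derived records, and on a purge request deletes everything tagged with the user except derived records already determined to be non-confidential. The names `Record`, `Store`, `derive`, and `purge_user` are hypothetical, and a single in-memory store stands in for both the repository and the client devices.

```python
from dataclasses import dataclass


@dataclass
class Record:
    record_id: str
    user_ids: frozenset        # users this record is associated with
    derived: bool = False      # True for data produced by processing
    confidential: bool = True  # cleared only after an explicit determination


class Store:
    """Toy in-memory store; hypothetical stand-in for repository and clients."""

    def __init__(self):
        self.records = {}

    def add(self, record):
        self.records[record.record_id] = record

    def derive(self, source_id, new_id):
        # Processed data keeps the user association of its source.
        src = self.records[source_id]
        rec = Record(new_id, src.user_ids, derived=True)
        self.records[new_id] = rec
        return rec

    def purge_user(self, user_id):
        # Delete the user's data and derived data, except derived records
        # that were determined to hold no confidential information.
        doomed = [
            rid for rid, r in self.records.items()
            if user_id in r.user_ids and (not r.derived or r.confidential)
        ]
        for rid in doomed:
            del self.records[rid]
        return doomed
```

    In this sketch a derived record stays confidential by default, so it is purged unless something explicitly clears the flag.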

    Automatic speaker identification using speech recognition features
    25.
    Granted patent
    Status: in force

    Publication No.: US09558749B1

    Publication date: 2017-01-31

    Application No.: US13957257

    Filing date: 2013-08-01

    Abstract: Features are disclosed for automatically identifying a speaker. Artifacts of automatic speech recognition (“ASR”) and/or other automatically determined information may be processed against individual user profiles or models. Scores may be determined reflecting the likelihood that individual users made an utterance. The scores can be based on, e.g., individual components of Gaussian mixture models (“GMMs”) that score best for frames of audio data of an utterance. A user associated with the highest likelihood score for a particular utterance can be identified as the speaker of the utterance. Information regarding the identified user can be provided to components of a spoken language processing system, separate applications, etc.
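
    Illustrative sketch (not taken from the patent): one way to read the scoring step is that ASR reports, for each audio frame, the index of the best-scoring GMM component, and each user profile holds its own parameters for those components. The utterance is then scored per user by summing frame log-likelihoods, and the top-scoring user is reported. The profile layout (`means`, `vars`), the diagonal covariances, and all function names are assumptions made for this example.

```python
import numpy as np


def diag_gaussian_logpdf(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mean) ** 2 / var)


def score_user(frames, best_components, profile):
    """Sum of per-frame log-likelihoods under one user's component parameters."""
    total = 0.0
    for x, k in zip(frames, best_components):
        total += diag_gaussian_logpdf(x, profile["means"][k], profile["vars"][k])
    return total


def identify_speaker(frames, best_components, profiles):
    """Return (best_user, scores) where scores maps each user to a log-likelihood."""
    scores = {
        user: score_user(frames, best_components, profile)
        for user, profile in profiles.items()
    }
    return max(scores, key=scores.get), scores


# Toy data: two users, two components, two-dimensional features.
profiles = {
    "alice": {"means": np.array([[0.0, 0.0], [1.0, 1.0]]), "vars": np.ones((2, 2))},
    "bob": {"means": np.array([[3.0, 3.0], [4.0, 4.0]]), "vars": np.ones((2, 2))},
}
frames = np.array([[0.1, -0.1], [0.9, 1.2]])
best_components = [0, 1]  # per-frame component indices, assumed to come from ASR

speaker, scores = identify_speaker(frames, best_components, profiles)
```

    The frames here sit close to the "alice" means, so that profile receives the higher total score.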

    Enhanced endpoint detection for speech recognition
    26.
    Granted patent
    Status: in force

    Publication No.: US09437186B1

    Publication date: 2016-09-06

    Application No.: US13921671

    Filing date: 2013-06-19

    Abstract: Determining the end of an utterance for purposes of automatic speech recognition (ASR) may be improved with a system that provides early results and/or incorporates semantic tagging. Early ASR results of an incoming utterance may be prepared based at least in part on an estimated endpoint and processed by a natural language understanding (NLU) process while final results, based at least in part on a final endpoint, are determined. If the early results match the final results, the early NLU results are already prepared for early execution. The endpoint may also be determined based at least in part on the content of the utterance, as represented by semantic tagging output from ASR processing. If the tagging indicates completion of a logical statement, an endpoint may be declared, or a threshold for silent frames prior to declaring an endpoint may be adjusted.
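
    Illustrative sketch (not taken from the patent): the threshold adjustment in the last sentence can be pictured as a silence counter whose limit drops once the semantic tags indicate a complete statement. The per-frame input format, the function name, and the threshold values of 50 and 20 frames are assumptions chosen for this example.

```python
def find_endpoint(frames, base_threshold=50, reduced_threshold=20):
    """Return the index of the frame at which an endpoint is declared, or None.

    frames: iterable of (is_silent, statement_complete) pairs, one per audio
    frame. statement_complete is assumed to come from semantic tagging of the
    ASR output so far.
    """
    silent_run = 0
    for i, (is_silent, statement_complete) in enumerate(frames):
        # Count consecutive silent frames; any speech resets the run.
        silent_run = silent_run + 1 if is_silent else 0
        # Require less trailing silence once the statement looks complete.
        threshold = reduced_threshold if statement_complete else base_threshold
        if silent_run >= threshold:
            return i
    return None
```

    With five speech frames followed by silence, a completed statement ends the utterance after 20 silent frames, while an incomplete one waits for the full 50.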

    SPEECH MODEL RETRIEVAL IN DISTRIBUTED SPEECH RECOGNITION SYSTEMS
    27.
    Patent application
    Status: in force

    Publication No.: US20140163977A1

    Publication date: 2014-06-12

    Application No.: US13712891

    Filing date: 2012-12-12

    CPC classification number: G10L15/32 G10L15/22 G10L15/30

    Abstract: Features are disclosed for managing the use of speech recognition models and data in automated speech recognition systems. Models and data may be retrieved asynchronously and used as they are received or after an utterance is initially processed with more general or different models. Once received, the models and statistics can be cached. Statistics needed to update models and data may also be retrieved asynchronously so that they may be used to update the models and data as they become available. The updated models and data may be immediately used to re-process an utterance, or saved for use in processing subsequently received utterances. User interactions with the automated speech recognition system may be tracked in order to predict when a user is likely to utilize the system. Models and data may be pre-cached based on such predictions.
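
    Illustrative sketch (not taken from the patent): the retrieval pattern can be pictured with `asyncio`. A user-specific model is fetched in the background while the utterance gets a first pass with a general model; once the model arrives it is cached and used to re-process the utterance. `ModelCache`, `recognize`, and the callable-based model interface are hypothetical names introduced for this example.

```python
import asyncio


class ModelCache:
    """Caches user-specific models returned by an async fetch function."""

    def __init__(self, fetch):
        self._fetch = fetch  # async callable: user_id -> model (hypothetical)
        self._cache = {}

    async def get(self, user_id):
        if user_id not in self._cache:
            self._cache[user_id] = await self._fetch(user_id)
        return self._cache[user_id]

    async def precache(self, user_ids):
        # Warm the cache for users predicted to use the system soon.
        await asyncio.gather(*(self.get(u) for u in user_ids))


async def recognize(audio, user_id, cache, general_model):
    """Return (first_pass, personalized) recognition results."""
    # Start retrieving the user-specific model without waiting for it.
    retrieval = asyncio.create_task(cache.get(user_id))
    # Initial processing with the general model (an async callable here).
    first_pass = await general_model(audio)
    # Re-process the utterance once the user-specific model is available.
    user_model = await retrieval
    return first_pass, user_model(audio)
```

    A second utterance from the same user finds the model already cached, so no further fetch is issued. This sketch does not deduplicate concurrent fetches for the same user.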
