SPEAKER IDENTIFICATION AND UNSUPERVISED SPEAKER ADAPTATION TECHNIQUES

    Publication number: US20190051309A1

    Publication date: 2019-02-14

    Application number: US16155662

    Filing date: 2018-10-09

    Applicant: Apple Inc.

    CPC classification number: G10L17/26 G10L15/1822 G10L15/26 G10L17/04 G10L17/06

    Abstract: Systems and processes for generating a speaker profile for use in performing speaker identification for a virtual assistant are provided. One example process can include receiving an audio input including user speech and determining whether a speaker of the user speech is a predetermined user based on a speaker profile for the predetermined user. In response to determining that the speaker of the user speech is the predetermined user, the user speech can be added to the speaker profile and operation of the virtual assistant can be triggered. In response to determining that the speaker of the user speech is not the predetermined user, the user speech can be added to an alternate speaker profile and operation of the virtual assistant may not be triggered. In some examples, contextual information can be used to verify results produced by the speaker identification process.
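    Illustrative sketch (not taken from the patent): the decision flow described in the abstract above can be mimicked in a few lines of Python. The feature vectors, the cosine-similarity scoring, the 0.8 threshold, and the names `SpeakerProfile` and `handle_utterance` are all assumptions chosen for illustration. The abstract does not say how the comparison is made, and the contextual verification step is left out here.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

class SpeakerProfile:
    """Hypothetical profile: a list of stored utterance feature vectors."""
    def __init__(self, samples=None):
        self.samples = list(samples or [])

    def centroid(self):
        n = len(self.samples)
        return [sum(col) / n for col in zip(*self.samples)]

    def score(self, features):
        # An empty profile cannot match anything.
        return cosine(self.centroid(), features) if self.samples else 0.0

def handle_utterance(features, user_profile, alternate_profile, threshold=0.8):
    """Return True if the assistant should be triggered."""
    if user_profile.score(features) >= threshold:
        user_profile.samples.append(features)    # adapt the known user's profile
        return True
    alternate_profile.samples.append(features)   # keep the other speaker's speech
    return False

user = SpeakerProfile([[1.0, 0.0], [0.9, 0.1]])
other = SpeakerProfile()
triggered_a = handle_utterance([0.95, 0.05], user, other)  # close to the profile
triggered_b = handle_utterance([0.0, 1.0], user, other)    # far from the profile
```

    In this toy run the first utterance is accepted and enlarges the user's profile, while the second is routed to the alternate profile and does not trigger the assistant.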

    VOICE IDENTIFICATION IN DIGITAL ASSISTANT SYSTEMS

    Publication number: US20200380980A1

    Publication date: 2020-12-03

    Application number: US16815984

    Filing date: 2020-03-11

    Applicant: Apple Inc.

    Abstract: Systems and processes for operating an intelligent automated assistant are provided. An example method includes receiving, from one or more external electronic devices, a plurality of speaker profiles for a plurality of users; receiving a natural language speech input; determining, based on comparing the natural language speech input to the plurality of speaker profiles: a first likelihood that the natural language speech input corresponds to a first user of the plurality of users; and a second likelihood that the natural language speech input corresponds to a second user of the plurality of users; determining whether the first likelihood and the second likelihood are within a first threshold; and in accordance with determining that the first likelihood and the second likelihood are not within the first threshold: providing a response to the natural language speech input, the response being personalized for the first user.
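    Illustrative sketch (not taken from the patent): one way to read the threshold test in the abstract above is as a margin check between the two best-scoring users. The function name `pick_speaker`, the 0.2 margin, the reading of "within a threshold" as an absolute difference, and the choice to treat the higher-scoring user as the "first user" are all assumptions made for this example.

```python
def pick_speaker(likelihoods, margin=0.2):
    """Return the user to personalize for, or None if the result is ambiguous.

    likelihoods maps a user name to the likelihood that the speech input
    came from that user; at least two users are expected.
    """
    ranked = sorted(likelihoods.items(), key=lambda kv: kv[1], reverse=True)
    (first_user, first), (_, second) = ranked[0], ranked[1]
    if abs(first - second) <= margin:
        # The two likelihoods are within the threshold: too close to call.
        return None
    return first_user

clear_case = pick_speaker({"alice": 0.9, "bob": 0.3})
ambiguous_case = pick_speaker({"alice": 0.55, "bob": 0.5})
```

    When the function returns None, a caller could fall back to a non-personalized response or ask who is speaking; the abstract only states what happens in the unambiguous case.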

    AUTOMATIC ACCENT DETECTION

    Type: Invention application

    Status: Under examination (published)

    Publication number: US20160358600A1

    Publication date: 2016-12-08

    Application number: US14846650

    Filing date: 2015-09-04

    Applicant: Apple Inc.

    Abstract: Systems and processes for automatic accent detection are provided. In accordance with one example, a method includes, at an electronic device with one or more processors and memory, receiving a user input, determining a first similarity between a representation of the user input and a first acoustic model of a plurality of acoustic models, and determining a second similarity between the representation of the user input and a second acoustic model of the plurality of acoustic models. The method further includes determining whether the first similarity is greater than the second similarity. In accordance with a determination that the first similarity is greater than the second similarity, the first acoustic model may be selected; and in accordance with a determination that the first similarity is not greater than the second similarity, the second acoustic model may be selected.
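    Illustrative sketch (not taken from the patent): the selection rule in the abstract above reduces to a comparison of two similarity scores. The model dictionaries, the accent labels, and the negative squared-distance similarity are hypothetical stand-ins, since the abstract does not define the representation or the similarity measure.

```python
def neg_sq_distance(representation, model):
    """Toy similarity: larger (closer to zero) means more similar."""
    return -sum((x - m) ** 2 for x, m in zip(representation, model["mean"]))

def select_acoustic_model(representation, first_model, second_model,
                          similarity=neg_sq_distance):
    first = similarity(representation, first_model)
    second = similarity(representation, second_model)
    # Per the abstract: pick the first model only if it is strictly more similar.
    return first_model if first > second else second_model

us = {"name": "en-US", "mean": [0.0, 1.0]}   # hypothetical accent models
gb = {"name": "en-GB", "mean": [1.0, 0.0]}

near_first = select_acoustic_model([0.1, 0.9], us, gb)
near_second = select_acoustic_model([0.9, 0.2], us, gb)
tie = select_acoustic_model([0.5, 0.5], us, gb)
```

    Note that a tie goes to the second model, because the first is selected only when its similarity is strictly greater.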

    TRAINING SPEAKER RECOGNITION MODELS FOR DIGITAL ASSISTANTS

    Publication number: US20190272831A1

    Publication date: 2019-09-05

    Application number: US15997174

    Filing date: 2018-06-04

    Applicant: Apple Inc.

    Abstract: Techniques for training a speaker recognition model used for interacting with a digital assistant are provided. In some examples, user authentication information is obtained at a first time. At a second time, a user utterance representing a user request is received. A voice print is generated from the user utterance. A determination is made as to whether a plurality of conditions are satisfied. The plurality of conditions includes a first condition that the user authentication information corresponds to one or more authentication credentials assigned to a registered user of an electronic device. The plurality of conditions further includes a second condition that the first time and the second time are not separated by more than a predefined time period. In accordance with a determination that the plurality of conditions are satisfied, a speaker profile assigned to the registered user is updated based on the voice print.
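    Illustrative sketch (not taken from the patent): the two conditions named in the abstract above can be expressed as a simple gate in front of the profile update. The function name `maybe_update_profile`, the five-minute window, and the list-based profile are assumptions for illustration; the abstract mentions a plurality of conditions and may include others.

```python
from datetime import datetime, timedelta

def maybe_update_profile(profile, voice_print, credentials_ok,
                         auth_time, utterance_time,
                         max_gap=timedelta(minutes=5)):
    """Append voice_print to profile only if both conditions hold."""
    # Condition 1: the authentication matched the registered user's credentials.
    # Condition 2: authentication and utterance are close together in time.
    close_in_time = abs(utterance_time - auth_time) <= max_gap
    if credentials_ok and close_in_time:
        profile.append(voice_print)
        return True
    return False

profile = []
unlocked_at = datetime(2018, 6, 4, 9, 0, 0)
updated_soon = maybe_update_profile(
    profile, [0.2, 0.7], True, unlocked_at, unlocked_at + timedelta(minutes=2))
updated_late = maybe_update_profile(
    profile, [0.3, 0.6], True, unlocked_at, unlocked_at + timedelta(hours=1))
```

    Only the utterance spoken shortly after a successful authentication ends up in the profile; the later one is ignored.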

    CONTEXT-BASED ENDPOINT DETECTION

    Type: Invention application

    Status: Under examination (published)

    Publication number: US20160358598A1

    Publication date: 2016-12-08

    Application number: US14846667

    Filing date: 2015-09-04

    Applicant: Apple Inc.

    CPC classification number: G10L15/04 G10L17/02 G10L25/87 G10L2025/783

    Abstract: The present disclosure generally relates to context-based endpoint detection in user speech input. A method for identifying an endpoint of a spoken request by a user may include receiving user input of natural language speech including one or more words; identifying at least one context associated with the user input; generating a probability, based on the at least one context associated with the user input, that a location in the user input is an endpoint; determining whether the probability is greater than a threshold; and in accordance with a determination that the probability is greater than the threshold, identifying the location in the user input as the endpoint.
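    Illustrative sketch (not taken from the patent): the abstract above describes scoring candidate locations with a context-dependent probability and accepting the first one that exceeds a threshold. The pause-based probability rule, the context labels "command" and "dictation", and the 0.5 threshold below are invented for this example and are not described in the source.

```python
def toy_probability(pause_seconds, context):
    """Hypothetical rule: dictation tolerates longer pauses before ending."""
    scale = 2.0 if context == "dictation" else 0.8
    return min(1.0, pause_seconds / scale)

def find_endpoint(pauses, context, threshold=0.5):
    """Return the first location whose endpoint probability exceeds threshold.

    pauses is a time-ordered list of (location_seconds, pause_seconds) pairs.
    """
    for location, pause in pauses:
        if toy_probability(pause, context) > threshold:
            return location
    return None  # no location qualified as an endpoint

pauses = [(1.2, 0.3), (2.5, 0.6), (4.0, 1.5)]
command_end = find_endpoint(pauses, "command")
dictation_end = find_endpoint(pauses, "dictation")
```

    The same audio yields an earlier endpoint in the "command" context than in the "dictation" context, which is the kind of context sensitivity the abstract describes.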

    SPEAKER IDENTIFICATION AND UNSUPERVISED SPEAKER ADAPTATION TECHNIQUES

    Type: Invention application

    Status: Under examination (published)

    Publication number: US20160093304A1

    Publication date: 2016-03-31

    Application number: US14835169

    Filing date: 2015-08-25

    Applicant: Apple Inc.

    CPC classification number: G10L17/26 G10L15/1822 G10L15/26 G10L17/04 G10L17/06

    Abstract: Systems and processes for generating a speaker profile for use in performing speaker identification for a virtual assistant are provided. One example process can include receiving an audio input including user speech and determining whether a speaker of the user speech is a predetermined user based on a speaker profile for the predetermined user. In response to determining that the speaker of the user speech is the predetermined user, the user speech can be added to the speaker profile and operation of the virtual assistant can be triggered. In response to determining that the speaker of the user speech is not the predetermined user, the user speech can be added to an alternate speaker profile and operation of the virtual assistant may not be triggered. In some examples, contextual information can be used to verify results produced by the speaker identification process.
