Method and Apparatus for Automatic Speaker-Based Speech Clustering
    2.
    Patent application
    Method and Apparatus for Automatic Speaker-Based Speech Clustering (legal status: in force)

    Publication number: US20140358541A1

    Publication date: 2014-12-04

    Application number: US13907364

    Filing date: 2013-05-31

    IPC classification: G10L15/06

    CPC classification: G10L15/063 G10L17/04

    Abstract: Reliable speaker-based clustering of speech utterances allows improved speaker recognition and speaker-based speech segmentation. According to at least one example embodiment, an iterative bottom-up speaker-based clustering approach employs voiceprints of speech utterances, such as i-vectors. At each iteration, a clustering confidence score in terms of Silhouette Width Criterion (SWC) values is evaluated, and a pair of nearest clusters is merged into a single cluster. The pair of nearest clusters merged is determined based on a similarity score indicative of similarity between voiceprints associated with different clusters. A final clustering pattern is then determined as a set of clusters associated with an iteration corresponding to the highest clustering confidence score evaluated. The SWC used may further be a modified SWC enabling detection of an early stop of the iterative approach.
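
The iterative procedure described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: voiceprints are plain NumPy vectors, the similarity score is cosine distance with average linkage, and the standard (unmodified) Silhouette Width Criterion is used, so there is no early stop. The function names `cosine_distance_matrix`, `silhouette_width`, and `cluster_voiceprints` are hypothetical.

```python
import numpy as np


def cosine_distance_matrix(X):
    """Pairwise cosine distances between row vectors (assumed voiceprints)."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    return 1.0 - Xn @ Xn.T


def silhouette_width(D, labels):
    """Mean silhouette value; singleton clusters contribute 0 by convention."""
    ids = np.unique(labels)
    if len(ids) < 2:
        return -1.0  # assumption: treat the undefined single-cluster case as worst
    scores = np.zeros(len(labels))
    for i in range(len(labels)):
        own = labels == labels[i]
        n_own = own.sum()
        if n_own == 1:
            continue
        # a: mean distance to the other members of the same cluster
        a = D[i, own].sum() / (n_own - 1)
        # b: smallest mean distance to any other cluster
        b = min(D[i, labels == c].mean() for c in ids if c != labels[i])
        denom = max(a, b)
        if denom > 0:
            scores[i] = (b - a) / denom
    return float(scores.mean())


def cluster_voiceprints(X):
    """Bottom-up clustering; returns the labeling with the highest silhouette."""
    D = cosine_distance_matrix(X)
    labels = np.arange(len(X))  # start with one cluster per utterance
    best_score, best_labels = -np.inf, labels.copy()
    while len(np.unique(labels)) > 1:
        # Evaluate the confidence score for the current clustering.
        score = silhouette_width(D, labels)
        if score > best_score:
            best_score, best_labels = score, labels.copy()
        # Find and merge the nearest pair of clusters (average linkage).
        ids = np.unique(labels)
        best_pair, best_d = None, np.inf
        for p in range(len(ids)):
            for q in range(p + 1, len(ids)):
                d = D[np.ix_(labels == ids[p], labels == ids[q])].mean()
                if d < best_d:
                    best_d, best_pair = d, (ids[p], ids[q])
        labels[labels == best_pair[1]] = best_pair[0]
    return best_labels, best_score
```

Calling `cluster_voiceprints(X)` on an array with one voiceprint per row returns one integer label per utterance plus the silhouette score of the selected iteration. The exhaustive pair search is quadratic in the number of clusters at every step, which is acceptable for a sketch but would need a cached linkage matrix for larger collections.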


    Method and Apparatus for Automated Speaker Parameters Adaptation in a Deployed Speaker Verification System
    3.
    Patent application
    Method and Apparatus for Automated Speaker Parameters Adaptation in a Deployed Speaker Verification System (legal status: in force)

    Publication number: US20140244257A1

    Publication date: 2014-08-28

    Application number: US13776502

    Filing date: 2013-02-25

    IPC classification: G10L17/06

    Abstract: Typical speaker verification systems usually employ speakers' audio data collected during an enrollment phase when users enroll with the system and provide respective voice samples. Due to technical, business, or other constraints, the enrollment data may not be large enough or rich enough to encompass different inter-speaker and intra-speaker variations. According to at least one embodiment, a method and apparatus employing classifier adaptation based on field data in a deployed voice-based interactive system comprise: collecting representations of voice characteristics, in association with corresponding speakers, the representations being generated by the deployed voice-based interactive system; updating parameters of the classifier, used in speaker recognition, based on the representations collected; and employing the classifier, with the corresponding parameters updated, in performing speaker recognition.
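
The collect, update, and employ steps named in this abstract can be illustrated with a toy example. This is only a sketch under loud assumptions: the abstract does not say which classifier or which parameters are adapted, so the code substitutes a very simple one (a per-speaker mean voiceprint scored by cosine similarity) and adapts it by folding field vectors into a running mean. The class `AdaptiveVerifier` and all of its methods and the default threshold are hypothetical.

```python
import numpy as np
from collections import defaultdict


class AdaptiveVerifier:
    """Toy verifier: one mean voiceprint per speaker, cosine-similarity scoring."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold      # assumed decision threshold
        self.means = {}                 # speaker id -> mean voiceprint
        self.counts = {}                # speaker id -> vectors behind the mean
        self.field = defaultdict(list)  # field data collected after deployment

    def enroll(self, speaker, vectors):
        """Build the initial model from enrollment voiceprints."""
        vectors = np.asarray(vectors, dtype=float)
        self.means[speaker] = vectors.mean(axis=0)
        self.counts[speaker] = len(vectors)

    def collect(self, speaker, vector):
        """Store a field voiceprint in association with its speaker."""
        self.field[speaker].append(np.asarray(vector, dtype=float))

    def adapt(self):
        """Update each enrolled speaker's mean with the collected field data."""
        for speaker, vecs in self.field.items():
            if speaker not in self.means:
                continue  # ignore data for speakers that never enrolled
            n_old, n_new = self.counts[speaker], len(vecs)
            total = self.means[speaker] * n_old + np.sum(vecs, axis=0)
            self.means[speaker] = total / (n_old + n_new)
            self.counts[speaker] = n_old + n_new
        self.field.clear()

    def score(self, speaker, vector):
        """Cosine similarity between the speaker model and a test voiceprint."""
        m = self.means[speaker]
        v = np.asarray(vector, dtype=float)
        return float(m @ v / (np.linalg.norm(m) * np.linalg.norm(v)))

    def verify(self, speaker, vector):
        """Accept the identity claim if the score reaches the threshold."""
        return self.score(speaker, vector) >= self.threshold
```

A deployed system would call `collect` whenever a speaker's identity is known for an interaction, run `adapt` periodically, and keep using `verify` with the updated parameters. How the field data is labeled with the right speaker is outside the scope of this sketch.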


    Fast speaker recognition scoring using I-vector posteriors and probabilistic linear discriminant analysis
    4.
    Granted patent
    Fast speaker recognition scoring using I-vector posteriors and probabilistic linear discriminant analysis (legal status: in force)

    Publication number: US09373330B2

    Publication date: 2016-06-21

    Application number: US14454169

    Filing date: 2014-08-07

    CPC classification: G10L17/06

    Abstract: A method for performing speaker recognition comprises: estimating respective uncertainties of acoustic coverage of respective speech utterance(s) by first and second speakers, the acoustic coverage representing respective sounds used by the speakers when speaking; representing the respective uncertainties of acoustic coverage in a manner that allows for efficient memory usage by discarding dependencies between uncertainties of different sounds for the speakers; representing the respective uncertainties of acoustic coverage in a manner that allows for efficient computation by representing an inverse of the respective uncertainties of acoustic coverage and then discarding the dependencies between the uncertainties of different sounds for the speakers; and computing a score between the speech utterance(s) by the speakers in a manner that leverages the respective uncertainties of the acoustic coverage during the comparison, the score being indicative of a likelihood that the speakers are the same speaker.
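
One way to read "discarding dependencies" is as keeping only the diagonal of an uncertainty (covariance) matrix, or of its inverse. The sketch below illustrates that reading and a simplified uncertainty-aware score. It is not the patented scoring method: it assumes a per-dimension two-covariance Gaussian model in which each voiceprint is x_i = y + e_i, with speaker factor y ~ N(0, b) and noise e_i ~ N(0, w + u_i), where u_i is the utterance's own diagonal uncertainty. All function and parameter names are hypothetical.

```python
import numpy as np


def diag_of_covariance(cov):
    """Keep only per-dimension variances: D numbers instead of D*D."""
    return np.diag(cov).copy()


def diag_of_precision(cov):
    """Invert the full matrix first, then keep only the diagonal of the inverse."""
    return np.diag(np.linalg.inv(cov)).copy()


def llr_diagonal(x1, u1, x2, u2, b, w):
    """Log-likelihood ratio: same speaker versus different speakers.

    x1, x2: voiceprints; u1, u2: their diagonal uncertainties;
    b, w: assumed between- and within-speaker variances per dimension.
    """
    x1, x2 = np.asarray(x1, dtype=float), np.asarray(x2, dtype=float)
    t1 = b + w + u1  # total variance of x1 in each dimension
    t2 = b + w + u2  # total variance of x2 in each dimension
    # Same speaker: (x1, x2) share y, so each dimension has the 2x2
    # covariance [[t1, b], [b, t2]] with this determinant.
    det = t1 * t2 - b ** 2
    q_same = (t2 * x1 ** 2 - 2 * b * x1 * x2 + t1 * x2 ** 2) / det
    # Different speakers: x1 and x2 are independent.
    q_diff = x1 ** 2 / t1 + x2 ** 2 / t2
    # The 2*pi normalization terms cancel in the ratio.
    ll_same = -0.5 * (np.log(det) + q_same)
    ll_diff = -0.5 * (np.log(t1) + np.log(t2) + q_diff)
    return float(np.sum(ll_same - ll_diff))
```

Under this model, a larger uncertainty for either utterance pulls the score toward zero, so short or acoustically sparse utterances carry less evidence either way. Note that `diag_of_precision` generally differs from the reciprocal of `diag_of_covariance`, which is why the order of inverting and discarding off-diagonal terms matters.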


    FAST SPEAKER RECOGNITION SCORING USING I-VECTOR POSTERIORS AND PROBABILISTIC LINEAR DISCRIMINANT ANALYSIS
    5.
    Patent application
    FAST SPEAKER RECOGNITION SCORING USING I-VECTOR POSTERIORS AND PROBABILISTIC LINEAR DISCRIMINANT ANALYSIS (legal status: in force)

    Publication number: US20160042739A1

    Publication date: 2016-02-11

    Application number: US14454169

    Filing date: 2014-08-07

    IPC classification: G10L17/06

    CPC classification: G10L17/06

    Abstract: A method for performing speaker recognition comprises: estimating respective uncertainties of acoustic coverage of respective speech utterance(s) by first and second speakers, the acoustic coverage representing respective sounds used by the speakers when speaking; representing the respective uncertainties of acoustic coverage in a manner that allows for efficient memory usage by discarding dependencies between uncertainties of different sounds for the speakers; representing the respective uncertainties of acoustic coverage in a manner that allows for efficient computation by representing an inverse of the respective uncertainties of acoustic coverage and then discarding the dependencies between the uncertainties of different sounds for the speakers; and computing a score between the speech utterance(s) by the speakers in a manner that leverages the respective uncertainties of the acoustic coverage during the comparison, the score being indicative of a likelihood that the speakers are the same speaker.


    Method and apparatus for automatic speaker-based speech clustering
    8.
    Granted patent
    Method and apparatus for automatic speaker-based speech clustering (legal status: in force)

    Publication number: US09368109B2

    Publication date: 2016-06-14

    Application number: US13907364

    Filing date: 2013-05-31

    CPC classification: G10L15/063 G10L17/04

    Abstract: Reliable speaker-based clustering of speech utterances allows improved speaker recognition and speaker-based speech segmentation. According to at least one example embodiment, an iterative bottom-up speaker-based clustering approach employs voiceprints of speech utterances, such as i-vectors. At each iteration, a clustering confidence score in terms of Silhouette Width Criterion (SWC) values is evaluated, and a pair of nearest clusters is merged into a single cluster. The pair of nearest clusters merged is determined based on a similarity score indicative of similarity between voiceprints associated with different clusters. A final clustering pattern is then determined as a set of clusters associated with an iteration corresponding to the highest clustering confidence score evaluated. The SWC used may further be a modified SWC enabling detection of an early stop of the iterative approach.
