Method and Apparatus for Efficient I-Vector Extraction
    1.
    Patent application
    Method and Apparatus for Efficient I-Vector Extraction (status: granted)

    Publication No.: US20140222428A1

    Publication date: 2014-08-07

    Application No.: US13856992

    Filing date: 2013-04-04

    IPC classification: G10L17/00

    CPC classification: G10L17/02

    Abstract: Most speaker recognition systems use i-vectors, which are compact representations of speaker voice characteristics. Typical i-vector extraction procedures are complex in terms of computations and memory usage. According to an embodiment, a method and corresponding apparatus for speaker identification comprise determining a representation for each component of a variability operator, representing statistical inter- and intra-speaker variability of voice features with respect to a background statistical model, in terms of a linear operator common to all components of the variability operator and having a first dimension larger than a second dimension of the components of the variability operator; computing statistical voice characteristics of a particular speaker using the determined representations; and employing the statistical voice characteristics of the particular speaker in performing speaker recognition. Computing the voice characteristics by using the determined representations results in a significant reduction in memory usage and a possible increase in execution speed.
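
For orientation, the conventional extraction that the abstract calls costly can be sketched as below. This is the textbook total-variability posterior mean, w = (I + sum_c N_c T_c' Sigma_c^-1 T_c)^-1 sum_c T_c' Sigma_c^-1 F_c, and not the factorized representation claimed in the patent. The function name and the argument names (`T`, `sigma_inv`, `N`, `F`) are illustrative assumptions, as is the choice of diagonal covariances for the background model.

```python
import numpy as np

def extract_ivector(T, sigma_inv, N, F):
    """Posterior mean of the i-vector (baseline sketch, hypothetical names).

    T         : (C, D, M) per-component blocks of the variability matrix
    sigma_inv : (C, D) inverse diagonal covariances of the background model
    N         : (C,) zero-order statistics (soft frame counts per component)
    F         : (C, D) centered first-order statistics per component
    """
    C, D, M = T.shape
    precision = np.eye(M)   # posterior precision, starts from the N(0, I) prior
    rhs = np.zeros(M)
    for c in range(C):
        weighted = T[c] * sigma_inv[c][:, None]     # Sigma_c^-1 T_c, shape (D, M)
        precision += N[c] * (T[c].T @ weighted)     # N_c T_c' Sigma_c^-1 T_c
        rhs += weighted.T @ F[c]                    # T_c' Sigma_c^-1 F_c
    return np.linalg.solve(precision, rhs)
```

The sketch stores all C x D x M entries of `T` and rebuilds an M x M precision matrix for every utterance, which is the memory and computation burden the abstract refers to.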

    Method and apparatus for efficient i-vector extraction
    2.
    Granted patent
    Method and apparatus for efficient i-vector extraction (status: granted)

    Publication No.: US09406298B2

    Publication date: 2016-08-02

    Application No.: US13856992

    Filing date: 2013-04-04

    IPC classification: G10L17/00 G10L17/02

    CPC classification: G10L17/02

    Abstract: Most speaker recognition systems use i-vectors, which are compact representations of speaker voice characteristics. Typical i-vector extraction procedures are complex in terms of computations and memory usage. According to an embodiment, a method and corresponding apparatus for speaker identification comprise determining a representation for each component of a variability operator, representing statistical inter- and intra-speaker variability of voice features with respect to a background statistical model, in terms of a linear operator common to all components of the variability operator and having a first dimension larger than a second dimension of the components of the variability operator; computing statistical voice characteristics of a particular speaker using the determined representations; and employing the statistical voice characteristics of the particular speaker in performing speaker recognition. Computing the voice characteristics by using the determined representations results in a significant reduction in memory usage and a possible increase in execution speed.

    Method and Apparatus for Efficient I-Vector Extraction
    3.
    Patent application
    Method and Apparatus for Efficient I-Vector Extraction (status: under examination, published)

    Publication No.: US20140222423A1

    Publication date: 2014-08-07

    Application No.: US13762213

    Filing date: 2013-02-07

    IPC classification: G10L17/02

    CPC classification: G10L17/02

    Abstract: Most speaker recognition systems use i-vectors, which are compact representations of speaker voice characteristics. Typical i-vector extraction procedures are complex in terms of computations and memory usage. According to an embodiment, a method and corresponding apparatus for speaker identification comprise determining a representation for each component of a variability operator, representing statistical inter- and intra-speaker variability of voice features with respect to a background statistical model, in terms of an orthogonal operator common to all components of the variability operator and having a first dimension larger than a second dimension of the components of the variability operator; computing statistical voice characteristics of a particular speaker using the determined representations; and employing the statistical voice characteristics of the particular speaker in performing speaker recognition. Computing the voice characteristics by using the determined representations results in a significant reduction in memory usage and a substantial increase in execution speed.

    Fast speaker recognition scoring using I-vector posteriors and probabilistic linear discriminant analysis
    4.
    Granted patent
    Fast speaker recognition scoring using I-vector posteriors and probabilistic linear discriminant analysis (status: granted)

    Publication No.: US09373330B2

    Publication date: 2016-06-21

    Application No.: US14454169

    Filing date: 2014-08-07

    CPC classification: G10L17/06

    Abstract: A method for performing speaker recognition comprises: estimating respective uncertainties of acoustic coverage of respective speech utterance(s) by first and second speakers, the acoustic coverage representing respective sounds used by the speakers when speaking; representing the respective uncertainties of acoustic coverage in a manner that allows for efficient memory usage by discarding dependencies between uncertainties of different sounds for the speakers; representing the respective uncertainties of acoustic coverage in a manner that allows for efficient computation by representing an inverse of the respective uncertainties of acoustic coverage and then discarding the dependencies between the uncertainties of different sounds for the speakers; and computing a score between the speech utterance(s) by the speakers in a manner that leverages the respective uncertainties of the acoustic coverage during the comparison, the score being indicative of a likelihood that the speakers are the same speaker.
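
One way to read "discarding dependencies" is as a diagonal approximation of an uncertainty matrix, which can be sketched as below. This reading is an assumption, not something the abstract states: the sketch treats the uncertainty as a posterior covariance, keeps only its diagonal for storage, and separately keeps only the diagonal of its inverse (the precision) for computation. The function name `diagonal_uncertainty` is hypothetical.

```python
import numpy as np

def diagonal_uncertainty(precision):
    """Diagonal approximations of an uncertainty matrix (illustrative sketch).

    precision : (M, M) symmetric positive-definite posterior precision matrix
    Returns (cov_diag, prec_diag), each of shape (M,).
    """
    covariance = np.linalg.inv(precision)
    # Memory-oriented form: keep the M variances, drop the off-diagonal covariances.
    cov_diag = np.diag(covariance).copy()
    # Computation-oriented form: work with the inverse first, then drop its off-diagonals.
    prec_diag = np.diag(precision).copy()
    return cov_diag, prec_diag
```

Either form stores M values per utterance instead of M x M. The two are not inverses of each other in general, since the diagonal of an inverse differs from the inverse of a diagonal unless the matrix is already diagonal.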

    FAST SPEAKER RECOGNITION SCORING USING I-VECTOR POSTERIORS AND PROBABILISTIC LINEAR DISCRIMINANT ANALYSIS
    5.
    Patent application
    FAST SPEAKER RECOGNITION SCORING USING I-VECTOR POSTERIORS AND PROBABILISTIC LINEAR DISCRIMINANT ANALYSIS (status: granted)

    Publication No.: US20160042739A1

    Publication date: 2016-02-11

    Application No.: US14454169

    Filing date: 2014-08-07

    IPC classification: G10L17/06

    CPC classification: G10L17/06

    Abstract: A method for performing speaker recognition comprises: estimating respective uncertainties of acoustic coverage of respective speech utterance(s) by first and second speakers, the acoustic coverage representing respective sounds used by the speakers when speaking; representing the respective uncertainties of acoustic coverage in a manner that allows for efficient memory usage by discarding dependencies between uncertainties of different sounds for the speakers; representing the respective uncertainties of acoustic coverage in a manner that allows for efficient computation by representing an inverse of the respective uncertainties of acoustic coverage and then discarding the dependencies between the uncertainties of different sounds for the speakers; and computing a score between the speech utterance(s) by the speakers in a manner that leverages the respective uncertainties of the acoustic coverage during the comparison, the score being indicative of a likelihood that the speakers are the same speaker.
