Granted invention patent
US06278972B1 System and method for segmentation and recognition of speech signals
Status: in force
- Title: System and method for segmentation and recognition of speech signals
- Title (Chinese): 用于语音信号的分割和识别的系统和方法
- Application No.: US09225891
- Filing date: 1999-01-04
- Publication No.: US06278972B1
- Publication date: 2001-08-21
- Inventors: Ning Bi, Chienchung Chang
- Applicants: Ning Bi, Chienchung Chang
- Main classification: G01L1504
- IPC classification: G01L1504
Abstract:
A system and method for forming a segmented speech signal from an input speech signal having a plurality of frames. The input speech signal is converted from a time domain signal to a frequency domain signal having a plurality of speech frames, wherein each speech frame in the frequency domain signal is represented by at least one spectral value associated with the speech frame. A spectral difference value is then determined for each pair of adjacent frames in the frequency domain signal, wherein the spectral difference value for each pair of adjacent frames is representative of a difference between the at least one spectral value associated with each frame in the pair of adjacent frames. An initial cluster boundary is set between each pair of adjacent frames in the frequency domain signal, and a variance value is assigned to each cluster in the frequency domain signal, wherein the variance value for each cluster is equal to one of the determined spectral difference values. Next, a plurality of cluster merge parameters is calculated, wherein each of the cluster merge parameters is associated with a pair of adjacent clusters in the frequency domain signal. A minimum cluster merge parameter is selected from the plurality of cluster merge parameters. A merged cluster is then formed by canceling a cluster boundary between the clusters associated with the minimum merge parameter and assigning a merged variance value to the merged cluster, wherein the merged variance value is representative of the variance values assigned to the clusters associated with the minimum merge parameter. The process is repeated in order to form a plurality of merged clusters, and the segmented speech signal is formed in accordance with the plurality of merged clusters.
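The merge loop described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not give the formulas, so several choices here are assumptions made only for illustration. The spectral difference is taken as the Euclidean distance between adjacent frame spectra, the merge parameter as the sum of the two adjacent clusters' variances, the merged variance as the size-weighted mean of the two variances, and the stopping rule as a hypothetical `target_clusters` count. The function names (`frame_spectrum`, `spectral_differences`, `segment`) are likewise hypothetical.

```python
import cmath
import math


def frame_spectrum(frame):
    """Magnitude spectrum of one time-domain frame, using a naive DFT."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def spectral_differences(spectra):
    """Assumed measure: Euclidean distance between adjacent frame spectra."""
    return [
        math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))
        for a, b in zip(spectra, spectra[1:])
    ]


def segment(spectra, target_clusters):
    """Merge adjacent clusters until `target_clusters` remain.

    Returns a list of (start, end) frame index ranges, end exclusive.
    """
    if len(spectra) < 2:
        return [(0, len(spectra))] if spectra else []

    diffs = spectral_differences(spectra)
    # Initial boundary between every pair of frames: one cluster per frame.
    # Assumption: cluster i takes the difference to the next frame as its
    # variance; the last frame reuses the last available difference.
    clusters = [
        {"start": i, "end": i + 1, "var": diffs[min(i, len(diffs) - 1)]}
        for i in range(len(spectra))
    ]

    while len(clusters) > max(target_clusters, 1):
        # Assumed merge parameter: sum of the two clusters' variances.
        params = [a["var"] + b["var"] for a, b in zip(clusters, clusters[1:])]
        i = params.index(min(params))  # pair with the minimum merge parameter
        a, b = clusters[i], clusters[i + 1]
        na, nb = a["end"] - a["start"], b["end"] - b["start"]
        # Cancel the boundary and assign a merged variance
        # (assumed: size-weighted mean of the two variances).
        merged = {
            "start": a["start"],
            "end": b["end"],
            "var": (na * a["var"] + nb * b["var"]) / (na + nb),
        }
        clusters[i:i + 2] = [merged]

    return [(c["start"], c["end"]) for c in clusters]


if __name__ == "__main__":
    # Toy spectra: three frames near [1, 0], then three frames near [0, 1].
    toy = [[1.0, 0.0], [1.0, 0.1], [1.0, 0.0],
           [0.0, 1.0], [0.2, 1.0], [0.0, 1.0]]
    print(segment(toy, 2))
```

Under these assumptions, the largest spectral jump in the toy input sits between frames 2 and 3, so that boundary is the last one to survive the merging. A real system would derive the spectra from windowed speech frames (for example with `frame_spectrum`) and would use the merge parameter and stopping criterion defined in the patent claims rather than the placeholders above.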