Neural network voice activity detection employing running range normalization
摘要:
A “running range normalization” method includes computing running estimates of the range of values of features useful for voice activity detection (VAD) and normalizing the features by mapping them to a desired range. Running range normalization includes computation of running estimates of the minimum and maximum values of VAD features and normalizing the feature values by mapping the original range to a desired range. Smoothing coefficients are optionally selected to directionally bias a rate of change of at least one of the running estimates of the minimum and maximum values. The normalized VAD feature parameters are used to train a machine learning algorithm to detect voice activity and to use the trained machine learning algorithm to isolate or enhance the speech component of the audio data.
信息查询
0/0