EFFICIENT SPEECH TO SPIKES CONVERSION PIPELINE FOR A SPIKING NEURAL NETWORK
Abstract:
Methods, systems, and apparatus, including computer programs encoded on computer storage media, are described for converting audio into spikes that are input to a spiking neural network configured to recognize speech based on the spikes. In some aspects, a method includes obtaining audio data and generating frequency domain audio signals that represent the audio data by converting the audio data into a frequency domain. The frequency domain audio signals are mapped into a set of Mel-frequency bands to obtain Mel-scale frequency audio signals. A log transformation is performed on the Mel-scale frequency audio signals to obtain log-Mel signals. Spike input for a spiking neural network (SNN) model is generated by converting the log-Mel signals to a series of spikes. The spike input is provided as an input to the SNN model.
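The steps named in the abstract (frequency domain conversion, Mel-band mapping, log transformation, spike generation) can be illustrated with a minimal NumPy sketch. The abstract does not say how the log-Mel signals are converted to spikes, so the Bernoulli rate coding used here is an assumption chosen for illustration, not the claimed method. The function names (`mel_filterbank`, `audio_to_spikes`) and all parameter values (16 kHz sample rate, 512-point FFT, 160-sample hop, 40 Mel bands, 4 spike time steps per frame) are likewise hypothetical.

```python
import numpy as np


def hz_to_mel(f):
    # Common HTK-style Hz-to-Mel formula (an assumption; other variants exist).
    return 2595.0 * np.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular Mel filters, shape (n_mels, n_fft // 2 + 1)."""
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_mels + 2)
    hz_pts = mel_to_hz(mel_pts)
    fft_freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    fb = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        lo, ctr, hi = hz_pts[i], hz_pts[i + 1], hz_pts[i + 2]
        rising = (fft_freqs - lo) / (ctr - lo)
        falling = (hi - fft_freqs) / (hi - ctr)
        fb[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb


def audio_to_spikes(audio, sample_rate=16000, n_fft=512, hop=160,
                    n_mels=40, n_steps=4, seed=0):
    """Return a binary spike array of shape (n_frames, n_steps, n_mels)."""
    audio = np.asarray(audio, dtype=np.float64)

    # 1. Frequency domain: windowed frames -> power spectrum.
    n_frames = 1 + (len(audio) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = audio[idx] * np.hanning(n_fft)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2

    # 2. Map the frequency bins into Mel-frequency bands.
    mel = power @ mel_filterbank(n_mels, n_fft, sample_rate).T

    # 3. Log transformation (small offset avoids log(0)).
    log_mel = np.log(mel + 1e-6)

    # 4. Spike generation (assumed scheme): scale log-Mel values to [0, 1]
    #    and treat each as a per-step firing probability.
    lo = log_mel.min()
    rates = (log_mel - lo) / (log_mel.max() - lo + 1e-12)
    rng = np.random.default_rng(seed)
    spikes = rng.random((n_frames, n_steps, n_mels)) < rates[:, None, :]
    return spikes.astype(np.uint8)


# Usage: one second of a 440 Hz tone sampled at 16 kHz.
t = np.arange(16000) / 16000.0
spikes = audio_to_spikes(np.sin(2.0 * np.pi * 440.0 * t))
print(spikes.shape, spikes.mean())
```

The resulting array would then be provided as the input to the SNN model. A deterministic encoding such as threshold crossing or latency coding could replace step 4 without changing the earlier stages.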