Abstract:
Provided are methods and systems for passive training for automatic speech recognition. An example method includes utilizing a first, speaker-independent model to detect a spoken keyword or a key phrase in spoken utterances. While the first model is in use, a second, speaker-dependent model is passively trained to detect the spoken keyword or the key phrase, using, at least in part, the same spoken utterances. The second model may utilize deep neural network (DNN) or convolutional neural network (CNN) techniques. In response to completion of the training, a switch is made from utilizing the first model to utilizing the second model to detect the spoken keyword or the key phrase in spoken utterances. While the second model is in use, parameters associated with it are updated using the spoken utterances in response to detecting the keyword or the key phrase in those utterances. User authentication functionality may be provided.
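The detect, passively train, switch, and update sequence described above can be sketched as a small state machine. This is an illustrative sketch only, not the claimed implementation: the class `PassiveTrainer`, the helper functions, the averaged feature template standing in for the DNN or CNN model, and the values of `threshold`, `needed`, and `update_rate` are all assumptions introduced here.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Sequence


def mean_vector(rows: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise mean of equally sized feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Map squared Euclidean distance to a score in (0, 1]."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return 1.0 / (1.0 + d2)


@dataclass
class PassiveTrainer:
    # Hypothetical stand-in for the first, speaker-independent model.
    generic_template: List[float]
    threshold: float = 0.5     # assumed detection threshold
    needed: int = 3            # assumed number of utterances that completes training
    update_rate: float = 0.1   # assumed adaptation step after the switch
    collected: List[List[float]] = field(default_factory=list)
    # Hypothetical stand-in for the second, speaker-dependent model.
    personal_template: Optional[List[float]] = None

    @property
    def active_model(self) -> str:
        if self.personal_template is None:
            return "speaker_independent"
        return "speaker_dependent"

    def process(self, utterance: Sequence[float]) -> bool:
        """Return True if the keyword is detected in this utterance."""
        if self.personal_template is None:
            # Phase 1: detect with the first model, train the second passively.
            detected = similarity(utterance, self.generic_template) >= self.threshold
            if detected:
                self.collected.append(list(utterance))
                if len(self.collected) >= self.needed:
                    # Training complete: switch to the second model.
                    self.personal_template = mean_vector(self.collected)
            return detected
        # Phase 2: detect with the second model and keep updating it.
        detected = similarity(utterance, self.personal_template) >= self.threshold
        if detected:
            r = self.update_rate
            self.personal_template = [
                (1 - r) * p + r * u
                for p, u in zip(self.personal_template, utterance)
            ]
        return detected
```

In this sketch only utterances in which the keyword was detected feed the second model, which is one reading of "at least in part"; a real system would extract acoustic features and train a neural network rather than average vectors.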
Abstract:
Systems and methods for assisting automatic speech recognition (ASR) are provided. An example system includes a buffer operable to store sensor data. The sensor data includes an acoustic signal representing at least one captured sound. The system includes a processor communicatively coupled to the buffer and operable to store received sensor data in the buffer. The received sensor data is analyzed to produce new parameters associated with the sensor data. The buffered sensor data is processed based at least on the new parameters. The processing may include separating clean voice from noise in the acoustic signal. The processor is further operable to provide at least the processed sensor data (for example, the clean voice) to an ASR system operable to receive and process that data at a speed faster than real time. The new parameters may also be provided to the ASR system.
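The buffer, analyze, process, and provide flow described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed system: the class `AsrAssistBuffer`, the `noise_floor` parameter, and the amplitude-shrinking step used in place of real voice and noise separation are all hypothetical. Handing the whole buffered backlog to the ASR callback in one call is one way the ASR side could consume data faster than it was captured.

```python
from collections import deque
from typing import Callable, Deque, Dict, List, Sequence

Frame = List[float]
Params = Dict[str, float]


def _shrink(sample: float, floor: float) -> float:
    """Reduce a sample's magnitude by the noise floor, clamping at zero."""
    mag = max(abs(sample) - floor, 0.0)
    return mag if sample >= 0 else -mag


class AsrAssistBuffer:
    """Hypothetical buffer that stores frames and re-processes them later."""

    def __init__(self, max_frames: int = 100) -> None:
        self.frames: Deque[Frame] = deque(maxlen=max_frames)

    def store(self, frame: Sequence[float]) -> None:
        """Store received sensor data (one acoustic frame) in the buffer."""
        self.frames.append(list(frame))

    def analyze(self) -> Params:
        """Produce new parameters from the buffered data.

        Assumption: the quietest buffered frame approximates the noise
        floor. Requires at least one non-empty frame in the buffer.
        """
        levels = [sum(abs(s) for s in f) / len(f) for f in self.frames]
        return {"noise_floor": min(levels)}

    def process(self, params: Params) -> List[Frame]:
        """Process the buffered frames using the new parameters."""
        floor = params["noise_floor"]
        return [[_shrink(s, floor) for s in f] for f in self.frames]

    def flush_to_asr(self, asr: Callable[[List[Frame], Params], None]) -> None:
        """Hand all processed frames, plus the parameters, to the ASR at once."""
        params = self.analyze()
        asr(self.process(params), params)
        self.frames.clear()
```

Because the parameters are estimated after the data has been buffered, earlier frames benefit from information that was not available when they were captured, which is the reason for processing the buffered copy rather than the live stream.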