Abstract:
Techniques are disclosed for reliably masking speech commands directed to one or more computing devices to prevent the speech commands from being rendered. In some embodiments, each of the one or more computing devices includes components configured to generate acoustic data from ambient sound waves, process the acoustic data to identify a speech command sequence, and mask the speech command sequence from being rendered. At least some of the systems and methods disclosed herein monitor inbound audio at a fine-grained level of detail. Working at this level of granularity enables the systems and methods described herein to detect potential speech commands early within the user's utterance and to discriminate quickly between true speech commands and other user utterances. These early detection and discrimination features, in turn, enable some embodiments to manage potential communication disruptions (e.g., jitter and/or latency) by modifying rates of audio prior to rendering.
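The masking step described in this abstract can be pictured with a small Python sketch. It assumes the audio has already been split into short frames and that some detector (not shown, and not specified by the abstract) has flagged which frames belong to a speech command. The function name, the frame representation, and the choice to mask with silence are all illustrative assumptions.

```python
# Minimal sketch, assuming per-frame command flags come from a separate detector.
# mask_commands and the list-of-samples frame format are hypothetical.

def mask_commands(frames, command_flags):
    """Return a copy of frames in which flagged frames are replaced by silence."""
    masked = []
    for frame, is_command in zip(frames, command_flags):
        if is_command:
            # Replace the command audio with zero-valued samples of equal length
            masked.append([0] * len(frame))
        else:
            masked.append(list(frame))
    return masked
```

Working on short frames is what allows a flag to be raised early in an utterance, so that only the command portion is silenced before rendering.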
Abstract:
Technologies are described herein that allow a user to wake up a computing device operating in a low-power state and to be verified by speaking a single wake phrase. Wake phrase recognition is performed by a low-power engine. In some embodiments, the low-power engine may also perform speaker verification. In other embodiments, the mobile device wakes up after a wake phrase is recognized and a component other than the low-power engine performs speaker verification on a portion of the audio input comprising the wake phrase. More than one wake phrase may be associated with a particular user, and separate users may be associated with different wake phrases. Different wake phrases may cause the device to transition from a low-power state to various active states.
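The relationship between users, wake phrases, and target states in this abstract can be sketched as a lookup table. The Python example below is illustrative only: phrase recognition and speaker verification are reduced to already-computed inputs, and the user names, phrases, state names, and the 0.8 threshold are hypothetical.

```python
# Minimal sketch; recognition and verification are stand-ins, not real engines.
LOW_POWER = "low_power"

# Hypothetical table: (user, wake phrase) -> active state to enter
WAKE_TABLE = {
    ("alice", "hello device"): "full_active",
    ("alice", "check mail"): "mail_only",
    ("bob", "wake up"): "full_active",
}

def handle_audio(phrase, speaker_scores, threshold=0.8):
    """Return the next device state.

    phrase: recognized wake phrase text, or None if nothing was recognized.
    speaker_scores: mapping of user -> similarity score for the same audio.
    """
    for (user, wake_phrase), state in WAKE_TABLE.items():
        # The device wakes only if the phrase matches AND the speaker is verified
        if phrase == wake_phrase and speaker_scores.get(user, 0.0) >= threshold:
            return state
    return LOW_POWER
```

For example, `handle_audio("check mail", {"alice": 0.9})` selects the state registered for that user and phrase, while the same phrase spoken by an unverified speaker leaves the device in the low-power state.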
Abstract:
An apparatus for applying dynamic quantization of a neural network is described herein. The apparatus includes a scaling unit and a quantizing unit. The scaling unit is to calculate initial desired scale factors of a plurality of inputs, weights, and a bias and apply the input scale factor to a summation node. Also, the scaling unit is to determine a scale factor for a multiplication node based on the desired scale factors of the inputs and select a scale factor for an activation function and an output node. The quantizing unit is to dynamically requantize the neural network by traversing a graph of the neural network.
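As a rough illustration of the scale-factor bookkeeping described in this abstract, the Python sketch below assumes symmetric linear quantization, in which a real value x is stored as round(x * scale). Under that assumption, a product of a quantized input and a quantized weight carries a scale equal to the product of the two operand scales, and the bias must be quantized at that same scale before it is added at the summation node. The function names and the 16-bit width are illustrative assumptions, not details taken from the abstract.

```python
# Minimal sketch, assuming symmetric linear quantization: q = round(x * scale).
# All names and the 16-bit default width are hypothetical.

def desired_scale(values, bits=16):
    """Largest scale that keeps every value inside the signed integer range."""
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return 1.0
    return (2 ** (bits - 1) - 1) / max_abs

def quantize(values, scale):
    """Quantize each real value to an integer at the given scale."""
    return [int(round(v * scale)) for v in values]

def quantized_affine(inputs, weights, bias, bits=16):
    """Compute sum(inputs * weights) + bias using integer arithmetic."""
    in_scale = desired_scale(inputs, bits)
    w_scale = desired_scale(weights, bits)
    # Multiplication node: the product's scale is the product of operand scales
    mul_scale = in_scale * w_scale
    q_in = quantize(inputs, in_scale)
    q_w = quantize(weights, w_scale)
    # Summation node: the bias is quantized at the same scale as the products
    q_bias = int(round(bias * mul_scale))
    acc = sum(a * b for a, b in zip(q_in, q_w)) + q_bias
    # Dequantize the integer accumulator back to a real value
    return acc / mul_scale
```

Choosing scales per tensor from the observed value range is one simple policy; a graph traversal such as the one the abstract describes would repeat this choice node by node.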
Abstract:
A disclosed speech processor includes a front end to receive a speech input and generate a feature vector indicative of a portion of the speech input and a Gaussian mixture model (GMM) circuit to receive the feature vector, model any one of a plurality of GMM speech recognition algorithms, and generate a GMM score for the feature vector based on the GMM speech recognition algorithm modeled. In at least one embodiment, the GMM circuit includes a common compute block to generate a feature vector sum indicative of a weighted sum of squared differences between the feature vector and a mixture component of the GMM speech recognition algorithm. In at least one embodiment, the GMM speech recognition algorithm being modeled includes a plurality of Gaussian mixture components and the common compute block is operable to generate feature vector scores corresponding to each of the plurality of mixture components.
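The weighted sum of squared differences mentioned in this abstract corresponds to the exponent term of a diagonal-covariance Gaussian. A minimal Python sketch follows, assuming diagonal covariances and a log-sum-exp combination of the per-component scores. Both are common choices in GMM acoustic scoring but are assumptions here, since the abstract does not specify them. All function and parameter names are hypothetical.

```python
import math

def weighted_sq_diff(feature, mean, inv_var):
    """Weighted sum of squared differences between a feature vector and one mixture mean."""
    return sum(w * (x - m) ** 2 for x, m, w in zip(feature, mean, inv_var))

def component_log_score(feature, mean, inv_var, log_const):
    """Log score of one component; log_const is assumed to fold in the log
    mixture weight and the Gaussian normalization term."""
    return log_const - 0.5 * weighted_sq_diff(feature, mean, inv_var)

def gmm_log_score(feature, components):
    """Combine per-component log scores with a numerically stable log-sum-exp.

    components: iterable of (mean, inv_var, log_const) tuples.
    """
    scores = [component_log_score(feature, m, iv, c) for m, iv, c in components]
    top = max(scores)
    return top + math.log(sum(math.exp(s - top) for s in scores))
```

Because `weighted_sq_diff` is the same computation for every mixture component, it is the natural piece to share across components, which is consistent with the common compute block the abstract describes.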