Abstract:
Apparatuses and methods are described to identify desired audio. A first input of an apparatus is configured to receive a main signal. A second input of the apparatus is configured to receive a reference signal. A normalizer is configured to normalize a compressed main signal by a compressed reference signal to create a normalized main signal. A single channel normalized voice threshold comparator is configured to receive as an input the normalized main signal and to output a desired voice activity detection signal.
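As a rough illustration of the normalizer and threshold comparator described above, the following Python sketch works on per-frame power values. The logarithmic compression law, the subtraction used as normalization in the log domain, the threshold value, and all function names are assumptions made for illustration; the abstract does not specify them.

```python
import math

def compress(power, floor=1e-12):
    """Log-compress a short-term power value (assumed compression law)."""
    return math.log10(max(power, floor))

def normalized_vad(main_power, ref_power, threshold=0.5):
    """Return one 0/1 flag per frame.

    A frame is flagged as desired voice when the compressed main signal,
    normalized by the compressed reference signal, exceeds the threshold.
    In the log domain, normalization is a subtraction (a division in the
    linear domain).
    """
    flags = []
    for m, r in zip(main_power, ref_power):
        normalized = compress(m) - compress(r)
        flags.append(1 if normalized > threshold else 0)
    return flags
```

For example, `normalized_vad([1.0, 100.0, 1.0], [1.0, 1.0, 1.0])` flags only the middle frame, where the main channel is much stronger than the reference.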
Abstract:
In one embodiment, a head worn computer comprises a housing. The housing includes at least one spring applying inward force to a user's head, a plurality of regions, each of which can be a flexible region or a stiff region, and two end regions. The housing is configured to wrap around a portion of the user's head such that the two end regions are located at the front of the user's head, on opposite sides of the user's head. In another embodiment, a method configures the above-mentioned head worn computer to a user.
Abstract:
A method of, and corresponding headset computer for, performing instant speech translation includes establishing a local network including a link between a first and a second headset computer, in which preferred language settings of each headset computer are exchanged; transmitting captured speech in a first language from the first headset computer to a network-based speech recognition service to recognize and transcribe the captured speech as text; receiving the text at the first headset computer; broadcasting the text over the local network to at least the second headset computer; receiving the text at the second headset computer; transmitting the received text from the second headset computer to a network-based text translation service to translate the text to a text in a second language; receiving the text in the second language at the second headset computer from the network-based text translation service; and displaying the translated text at the second headset computer.
Abstract:
Frequency domain signal extraction methods and apparatuses include receiving a reference signal, which contains mostly undesired audio and is substantially void of desired audio. The reference signal is decomposed into at least two reference frequency components. The at least two reference frequency components are filtered with at least two adaptive filters to form at least two filtered reference frequency components. The filtered reference frequency components are recombined in an IFFT component to produce a filtered reference signal. A delayed signal is input to an adder. The delayed signal contains desired audio and undesired audio. The filtered reference signal is subtracted from the delayed signal to form an output signal containing desired audio. The output signal is decomposed into at least two frequency components. The filtering is adapted with the at least two frequency components. The filtering is inhibited intermittently with the at least two adaptive filters to prevent cancellation of the desired audio. In some implementations, frequency sub-bands are employed. In some implementations, an acoustic element with a cardioid beam pattern is used to acquire the reference signal.
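The signal flow described above can be sketched block by block in plain Python. This is only a minimal sketch under stated assumptions: it uses a naive DFT instead of a real FFT, one complex weight per frequency bin as the "adaptive filter", a normalized LMS update, and an externally supplied `inhibit` flag. None of these choices, nor the names `dft`, `idft`, and `process_block`, come from the abstract.

```python
import cmath

def dft(x):
    """Naive discrete Fourier transform (stand-in for an FFT component)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    """Naive inverse DFT returning the real part (stand-in for the IFFT)."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]

def process_block(delayed_main, reference, weights, inhibit, mu=0.5, eps=1e-9):
    """Process one block and return (output, updated_weights).

    delayed_main: block containing desired and undesired audio
    reference:    block containing mostly undesired audio
    weights:      one complex weight per frequency bin (assumed filter form)
    inhibit:      when True, adaptation is skipped for this block
    """
    R = dft(reference)                                   # decompose reference
    filtered = idft([w * r for w, r in zip(weights, R)]) # filter and recombine
    output = [d - f for d, f in zip(delayed_main, filtered)]  # adder
    if not inhibit:
        E = dft(output)                                  # decompose output
        # Normalized LMS update per bin (an assumed adaptation rule).
        weights = [w + mu * r.conjugate() * e / (abs(r) ** 2 + eps)
                   for w, r, e in zip(weights, R, E)]
    return output, weights
```

In this sketch, setting `inhibit` while desired audio is present keeps the weights from adapting toward the desired audio, which is the purpose the abstract gives for inhibiting the filtering.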
Abstract:
A system performs stable control of moving devices (such as a helicopter or robot) with attached camera(s), providing live imagery back to a head-mounted computer (HMC). The HMC controls the moving device. The HMC user specifies a desired path or location for the moving device. Camera images enable the user-specified instructions to be followed accurately and the device's position to be maintained thereafter. A method of controlling a moving device with a headset computer includes analyzing, at the headset computer, at least one image received from the moving device to form an indication of change in position of the moving device. The method displays to a user of the headset computer the indication of change in position of the moving device. The method can additionally include enabling the user to control the moving device.
Abstract:
A device and method to detect desired audio includes a ratio calculator. The ratio calculator calculates a ratio between a primary acoustic signal and a reference acoustic signal. The primary acoustic signal contains desired audio and undesired audio, and the reference acoustic signal contains mostly undesired audio and is substantially void of desired audio. A long-term mean value calculator is coupled to the ratio calculator. The long-term mean value calculator maintains an average of the ratio. A comparator is coupled to the ratio calculator and the long-term mean value calculator. The comparator compares the ratio with the average. Desired audio is detected when the ratio is greater than the average by a threshold amount.
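The ratio calculator, long-term mean value calculator, and comparator described above might be sketched as follows. The exponential moving average used for the long-term mean, the smoothing factor `alpha`, the default threshold, and the function name are assumptions for illustration and are not taken from the abstract.

```python
def detect_desired_audio(primary, reference, threshold=2.0, alpha=0.9, eps=1e-12):
    """Return one boolean per frame: True where desired audio is detected.

    primary, reference: per-frame signal levels (for example, power).
    A frame is flagged when the primary/reference ratio exceeds its
    long-term average by more than `threshold`.
    """
    mean = None
    flags = []
    for p, r in zip(primary, reference):
        ratio = p / (r + eps)           # ratio calculator
        if mean is None:
            mean = ratio                # seed the average with the first frame
        flags.append(ratio > mean + threshold)          # comparator
        mean = alpha * mean + (1.0 - alpha) * ratio     # long-term mean (assumed EMA)
    return flags
```

Note that in this sketch the average is updated on every frame, including detected ones; a practical design might freeze the average while desired audio is present, which the abstract does not address.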
Abstract:
Systems and methods are described to automatically balance acoustic channel sensitivity. A long-term power level of a main acoustic signal is calculated to obtain an averaged main acoustic signal. Segments of the main acoustic signal are excluded from the averaged main acoustic signal using a desired voice activity detection signal. A long-term power level of a reference acoustic signal is calculated to obtain an averaged reference acoustic signal. Segments of the reference acoustic signal are excluded from the averaged reference acoustic signal using a desired voice activity detection signal. An amplitude correction signal is created using the averaged main acoustic signal and the averaged reference acoustic signal.
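A minimal sketch of this balancing scheme is shown below. It assumes that the long-term power level is a plain mean of squared samples, that the voice activity detection signal is a per-sample 0/1 flag, and that the amplitude correction is a gain applied to the reference channel; these details and the function names are illustrative assumptions, not taken from the abstract.

```python
import math

def long_term_power(samples, vad):
    """Mean power over samples where the VAD flag is 0; None if no such samples."""
    total = 0.0
    count = 0
    for s, v in zip(samples, vad):
        if v:
            continue                    # exclude segments with desired voice
        total += s * s
        count += 1
    return total / count if count else None

def amplitude_correction(main, reference, vad):
    """Gain that, applied to the reference channel, matches its level to the main channel."""
    p_main = long_term_power(main, vad)
    p_ref = long_term_power(reference, vad)
    if p_main is None or p_ref is None or p_ref == 0.0:
        return 1.0                      # no usable estimate: leave the channel unchanged
    return math.sqrt(p_main / p_ref)
```

Excluding voice-active segments means the two averages reflect only the shared background noise, so the resulting gain reflects channel sensitivity rather than the talker's position.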
Abstract:
Systems and methods are described to create a desired voice activity detection signal. A main acoustic signal and a plurality of reference acoustic signals are compressed. The compressed main acoustic signal is normalized by the plurality of compressed reference acoustic signals to create a plurality of normalized compressed main acoustic signals. The plurality of normalized compressed main acoustic signals is processed with a plurality of single channel normalized voice threshold comparators to form a plurality of normalized desired voice activity detection signals. One of the plurality of normalized desired voice activity detection signals is selected from the plurality of normalized desired voice activity detection signals to output as the desired voice activity detection signal.
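The multi-reference arrangement can be sketched in the same style. The abstract does not say how one detection signal is selected from the plurality, so the selection rule here (use the decision associated with the loudest reference in each frame) is purely a hypothetical choice, as are the log compression, the threshold, and the function names.

```python
import math

def compress(power, floor=1e-12):
    """Log-compress a short-term power value (assumed compression law)."""
    return math.log10(max(power, floor))

def multi_reference_vad(main_power, ref_powers, threshold=0.5):
    """Return one 0/1 flag per frame.

    main_power: per-frame power of the main acoustic signal
    ref_powers: list of per-frame power sequences, one per reference signal
    """
    output = []
    for i, m in enumerate(main_power):
        cm = compress(m)
        refs = [compress(ref[i]) for ref in ref_powers]
        # One single-channel normalized threshold comparator per reference.
        decisions = [1 if cm - cr > threshold else 0 for cr in refs]
        # Hypothetical selector: trust the comparator fed by the loudest reference.
        selected = max(range(len(refs)), key=lambda k: refs[k])
        output.append(decisions[selected])
    return output
```

With this rule, a frame is flagged only if the main signal stands out even against the strongest reference, which makes the selection conservative; other selection rules are equally compatible with the abstract.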
Abstract:
A remote control microdisplay device uses hand and head movement and voice commands to control the parameters of a field of view for the microdisplay within a larger virtual display area associated with a host application.