Abstract:
A method and apparatus for recognizing an activity of a monitored individual in an environment are described. The method includes: receiving a first acoustic signal; performing audio feature extraction on the first acoustic signal in a first temporal window; classifying the first acoustic signal by determining a location of the monitored individual in the environment based on the features extracted from the first acoustic signal in the first temporal window; receiving a second acoustic signal; performing audio feature extraction on the second acoustic signal in a second temporal window; and classifying the second acoustic signal by determining an activity of the monitored individual at that location based on the features extracted from the second acoustic signal in the second temporal window.
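The two-stage classification described in the abstract above (location first, then activity at that location) can be sketched as follows. This is only an illustration: the abstract does not name a classifier or a feature set, so the nearest-centroid rule, the feature vectors, and the names `nearest_label` and `recognize_activity` are all assumptions, and audio feature extraction is taken as already done.

```python
import math


def nearest_label(features, centroids):
    """Return the label whose centroid is closest (Euclidean) to `features`.

    Nearest-centroid matching is an assumed stand-in for whatever
    classifier the method actually uses.
    """
    return min(centroids, key=lambda label: math.dist(features, centroids[label]))


def recognize_activity(first_features, second_features,
                       location_centroids, activity_centroids_by_location):
    """Hypothetical two-stage recognition.

    Stage 1: features from the first temporal window decide the location.
    Stage 2: features from the second temporal window decide the activity,
             using only the activities known for that location.
    """
    location = nearest_label(first_features, location_centroids)
    activity = nearest_label(second_features,
                             activity_centroids_by_location[location])
    return location, activity


# Made-up two-dimensional feature centroids, for illustration only.
location_centroids = {"kitchen": [1.0, 0.0], "bathroom": [0.0, 1.0]}
activity_centroids_by_location = {
    "kitchen": {"cooking": [0.9, 0.1], "dishwashing": [0.1, 0.9]},
    "bathroom": {"showering": [0.5, 0.5]},
}
result = recognize_activity([0.8, 0.2], [0.2, 0.8],
                            location_centroids, activity_centroids_by_location)
```

Restricting the second stage to the activities of the detected location is one plausible reading of "an activity of the monitored individual at that location"; other designs are possible.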
Abstract:
A recording is usually a mixture of signals from several sound sources. The directions of the dominant sources in the recording may be known or determined using a source localization algorithm. To isolate or focus on a target source, multiple beamformers may be used. In one embodiment, each beamformer points to a direction of a dominant source and the outputs from the beamformers are processed to focus on the target source. Depending on whether the beamformer pointing to the target source has an output that is larger than the outputs of the other beamformers, a reference signal or a scaled output of the beamformer pointing to the target source can be used to determine the signal corresponding to the target source. The scaling factor may depend on the ratio of the output of the beamformer pointing to the target source to the maximum value of the outputs of the other beamformers.
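The selection rule in the abstract above can be sketched for a single time-frequency bin. The abstract does not say which branch uses the reference signal, nor the exact form of the scaling factor, so this sketch assumes the reference is used when the target beamformer dominates and that the scaling factor equals the ratio itself. The function name `focus_on_target` and its parameters are hypothetical, and all inputs are taken to be non-negative magnitudes.

```python
def focus_on_target(target_out, other_outs, reference):
    """Estimate the target-source magnitude for one time-frequency bin.

    target_out: output magnitude of the beamformer pointing to the target.
    other_outs: non-empty list of output magnitudes of the other beamformers.
    reference:  magnitude of the reference signal for this bin.

    Assumption (not stated in the abstract): the reference signal is used
    when the target beamformer has the largest output; otherwise the target
    beamformer output is attenuated by the ratio target_out / max(other_outs).
    """
    max_other = max(other_outs)
    if target_out > max_other:
        return reference
    if max_other == 0:
        # Every beamformer output is zero, so there is nothing to scale.
        return 0.0
    scale = target_out / max_other  # lies in [0, 1] in this branch
    return scale * target_out


dominant = focus_on_target(2.0, [1.0, 0.5], reference=3.0)
masked = focus_on_target(1.0, [2.0, 0.5], reference=3.0)
```

In the first call the target beamformer dominates, so the reference value is returned; in the second it is attenuated by the ratio 1.0 / 2.0.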
Abstract:
The disclosure relates to a method including: monitoring events captured by sensors of a communication system, the monitoring being performed during a monitoring time interval that takes into account a reference time of day at which a first electronic device of the communication system is in a reference location; and generating an alert according to a similarity between a reference pattern of events and the monitored events, the alert being generated at a time different from the reference time of day. The disclosure also relates to a corresponding electronic device, communication system, computer readable program product and computer readable storage medium.
Abstract:
The present disclosure relates to a method for processing an input signal comprising an audio component and to the corresponding electronic device, non-transitory computer readable program product and computer readable storage medium. According to an embodiment of the present disclosure, the method comprises: obtaining a set of time parameters from a time-frequency transformation of the audio component of the input signal, said audio component being a mixture of audio signals comprising at least one first audio signal of a first audio source; determining at least one motion feature of said first audio source from a visual sequence corresponding to the first audio signal; obtaining a weight vector of the set of time parameters based on the motion feature; and determining a time-frequency transformation of the first audio signal based on the weight vector.
Abstract:
A plenoptic camera and an associated method are provided. The camera has an array of sensors for generating digital images. The images have associated audio signals. The array of sensors is configured to capture digital images associated with a default spatial coordinate and is also configured to receive control input from a processor to change focus from the default spatial coordinate to a new spatial coordinate based on the occurrence of an event at the new spatial coordinate.
Abstract:
A method for synchronizing two versions (4, 6) of multimedia content, each version (4, 6) comprising a plurality of video frames, comprises the steps of: a) extracting (20) audio fingerprints from each version (4, 6) of the multimedia content; b) determining (24, 26, 28) at least two temporal matching periods between both versions (4, 6) using the extracted audio fingerprints; and c) mapping (30, 32, 34) the video frames of both versions (4, 6) using the determined temporal matching periods.
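Step c) of the abstract above, mapping video frames once the temporal matching periods are known, can be illustrated with a small sketch. It assumes that both versions share the same frame rate and that each matching period is expressed as a start frame in each version plus a length in frames. The `MatchingPeriod` type and the `map_frame` function are hypothetical names, and fingerprint extraction and matching (steps a and b) are not shown.

```python
from typing import NamedTuple, Optional, Sequence


class MatchingPeriod(NamedTuple):
    """A span of frames found to match in both versions (assumed layout)."""
    start_a: int  # first matching frame in version A
    start_b: int  # corresponding first frame in version B
    length: int   # number of consecutive matching frames


def map_frame(frame_a: int, periods: Sequence[MatchingPeriod]) -> Optional[int]:
    """Map a frame index of version A to version B.

    Returns None when the frame lies outside every matching period,
    for example in a scene that exists in only one version.
    """
    for period in periods:
        offset = frame_a - period.start_a
        if 0 <= offset < period.length:
            return period.start_b + offset
    return None


# Version B lacks frames 100-149 of version A (a hypothetical removed scene).
periods = [MatchingPeriod(0, 0, 100), MatchingPeriod(150, 100, 50)]
mapped = [map_frame(f, periods) for f in (10, 120, 160)]
```

Here frame 120 of version A has no counterpart in version B, which is why at least two matching periods are needed to describe the alignment on both sides of the removed scene.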
Abstract:
The present principles generally relate to audio apparatus, methods, and computer program products and in particular, to improvements that adjust the sound level or levels of one or more audio outputs of an audio system based on the determined origin and/or direction of propagation of a detected human voice in a location. Such an adjustment may be to decrease, mute, or increase the sound level of an audio output producing sound in the direction of the origin of the voice. A sound level produced by other audio outputs may be unchanged.
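A minimal sketch of the adjustment described in the abstract above, assuming each audio output is characterized by a single direction angle in degrees and that the voice origin has already been localized as an angle. The function name `adjust_levels`, the angle representation, and the multiplicative `factor` are all assumptions made for illustration.

```python
def adjust_levels(levels, speaker_angles, voice_angle, factor=0.5):
    """Scale the level of the output that points closest to the voice origin.

    levels:         current sound level of each audio output.
    speaker_angles: direction of each output in degrees (assumed model).
    voice_angle:    estimated direction of the detected voice in degrees.
    factor:         multiplier for the selected output; 0.0 mutes it,
                    values below 1.0 decrease it, values above 1.0 increase it.

    All other outputs are returned unchanged.
    """
    def angular_distance(a, b):
        d = abs(a - b) % 360.0
        return min(d, 360.0 - d)

    nearest = min(range(len(levels)),
                  key=lambda i: angular_distance(speaker_angles[i], voice_angle))
    adjusted = list(levels)
    adjusted[nearest] = levels[nearest] * factor
    return adjusted


# Three outputs at 0, 120 and 240 degrees; a voice is detected at 350 degrees.
new_levels = adjust_levels([1.0, 1.0, 1.0], [0.0, 120.0, 240.0], 350.0)
```

The wrap-around in `angular_distance` matters here: the output at 0 degrees is only 10 degrees away from a voice at 350 degrees, so it is the one that gets attenuated.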
Abstract:
A method is proposed for encoding at least two signals. The method comprises: mixing the at least two signals in a mixture; computing a map Z representative of locations of the at least two signals in a time-frequency plane; obtaining sampling locations of the map Z; sampling the map Z at the sampling locations, the sampling delivering a first list of values Z_Ω; and transmitting the mixture of the at least two signals and a second list of values based on the first list of values Z_Ω.
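The map computation and sampling steps of the abstract above can be sketched under strong assumptions. The abstract does not define the map Z or how the sampling locations are chosen, so this sketch assumes that Z stores, for each time-frequency bin, the index of the source with the largest magnitude, and that the sampling locations form a regular grid. The functions `dominant_source_map` and `sample_map` are hypothetical, and the time-frequency magnitudes are taken as given.

```python
def dominant_source_map(mags):
    """Build a map Z where Z[f][t] is the index of the strongest source.

    mags[s][f][t] is the magnitude of source s at frequency bin f and
    time frame t. Using the dominant-source index as the map is an
    assumption; the abstract only says Z represents signal locations.
    """
    n_src, n_f, n_t = len(mags), len(mags[0]), len(mags[0][0])
    return [[max(range(n_src), key=lambda s: mags[s][f][t])
             for t in range(n_t)]
            for f in range(n_f)]


def sample_map(Z, step):
    """Sample Z on a regular grid (assumed sampling scheme).

    Returns the sampling locations and the list of sampled values,
    the latter playing the role of the first list of values Z_Ω.
    """
    locations = [(f, t)
                 for f in range(0, len(Z), step)
                 for t in range(0, len(Z[0]), step)]
    values = [Z[f][t] for f, t in locations]
    return locations, values


# Two sources on a 2 x 2 time-frequency grid with made-up magnitudes.
mags = [
    [[1.0, 0.0], [0.0, 1.0]],  # source 0
    [[0.0, 1.0], [1.0, 0.0]],  # source 1
]
Z = dominant_source_map(mags)
locations, z_omega = sample_map(Z, step=1)
```

A decoder that receives the mixture and the sampled values could then interpolate Z between sampling locations; how the second list is derived from Z_Ω is not specified in the abstract and is left out here.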
Abstract:
A method and a system (20) of audio source separation are described. The method comprises: receiving (10) an audio mixture and at least one text query associated with the audio mixture; retrieving (11) at least one audio sample from an auxiliary audio database; evaluating (12) the retrieved audio samples; and separating (13) the audio mixture into a plurality of audio sources using the audio samples. The corresponding system (20) comprises a receiving unit (21) and a processor (22) configured to implement the method.