Abstract:
Some implementations provide a method for identifying a speaker. The method determines the position and orientation of a second device based on data from a first device configured to capture the position and orientation of the second device. The second device includes several microphones for capturing sound, and its position and orientation are movable. The method assigns an object as a representation of a known user. The object has a movable position. The method receives a position of the object, which corresponds to a position of the known user. The method processes the captured sound to identify a sound originating from the direction of the object, where that direction is relative to the position and the orientation of the second device. The method identifies the sound originating from the direction of the object as belonging to the known user.
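As an illustration only (the function name, planar coordinate frame, and angular tolerance below are assumptions, not details from the abstract), matching a sound's estimated direction of arrival against the bearing of the tracked object might be sketched as:

```python
import math

def identify_speaker(device_pos, device_yaw_deg, object_pos,
                     sound_doa_deg, tolerance_deg=10.0):
    """Label a sound as belonging to the known user when its direction of
    arrival (relative to the device's position and orientation) matches
    the direction of the object representing that user.

    device_pos, object_pos: (x, y) coordinates in a shared frame.
    device_yaw_deg: device orientation in degrees.
    sound_doa_deg: direction of arrival estimated from the microphones,
    in degrees relative to the device's forward axis.
    """
    dx = object_pos[0] - device_pos[0]
    dy = object_pos[1] - device_pos[1]
    # Bearing of the object in the world frame, then relative to the device.
    object_bearing = math.degrees(math.atan2(dy, dx))
    relative_bearing = (object_bearing - device_yaw_deg + 180.0) % 360.0 - 180.0
    # Smallest angular difference between the sound's DOA and the object.
    diff = abs((sound_doa_deg - relative_bearing + 180.0) % 360.0 - 180.0)
    return diff <= tolerance_deg
```

A sound arriving within the tolerance cone around the object's bearing would be attributed to that user; everything else is left unlabeled.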
Abstract:
In general, techniques are described for capturing multi-channel audio data. A device comprising one or more processors may be configured to implement the techniques. The processors may analyze captured audio data to identify audio objects, and analyze video data captured concurrently with the audio data to identify video objects. The processors may then associate at least one of the audio objects with at least one of the video objects, and generate multi-channel audio data from the audio data based on the association of the at least one of the audio objects with the at least one of the video objects.
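One simple way to associate audio objects with video objects is to compare their estimated directions; a minimal greedy sketch (the function name, the id-to-direction dictionaries, and the angle threshold are assumptions for illustration) could look like:

```python
def associate_objects(audio_objects, video_objects, max_angle_diff=15.0):
    """Greedily pair audio objects with video objects by comparing their
    estimated directions (degrees). Each audio object is matched to the
    closest unmatched video object within max_angle_diff.

    audio_objects / video_objects: dicts mapping object id -> direction.
    Returns a list of (audio_id, video_id) pairs.
    """
    pairs = []
    unmatched_video = dict(video_objects)
    for a_id, a_dir in audio_objects.items():
        best = min(unmatched_video.items(),
                   key=lambda kv: abs(kv[1] - a_dir),
                   default=None)
        if best is not None and abs(best[1] - a_dir) <= max_angle_diff:
            pairs.append((a_id, best[0]))
            del unmatched_video[best[0]]
    return pairs
```

Associated pairs could then drive the placement of each audio object when the multi-channel output is generated; unmatched audio objects would fall back to a default rendering.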
Abstract:
Methods and apparatuses for providing tangible control of sound are described as embodied in a system that includes a sound transducer array and a touch-surface-enabled display table. The array may include a group of transducers (multiple speakers and/or microphones) configured to perform spatial processing of signals so that sound rendering (in configurations where the array includes multiple speakers) or sound pick-up (in configurations where the array includes multiple microphones) has spatial patterns (or sound projection patterns) focused in certain directions while reducing disturbances from other directions. Users may directly adjust parameters related to the sound projection patterns by exercising one or more commands on the touch surface, and may refine those commands according to the visual feedback shown as the display on the touch surface changes.
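The directional pick-up pattern of a microphone array is classically obtained with delay-and-sum beamforming; as a rough sketch only (the abstract does not specify the beamforming method, and the function name and planar geometry are assumptions), the per-microphone steering delays might be computed as:

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate speed of sound in air

def steering_delays(mic_positions, azimuth_deg):
    """Per-microphone delays (seconds) that steer a delay-and-sum beam
    toward the given azimuth for a planar array. Delaying each channel
    by its value and summing focuses pick-up in that direction.

    mic_positions: list of (x, y) coordinates in metres.
    """
    ux = math.cos(math.radians(azimuth_deg))
    uy = math.sin(math.radians(azimuth_deg))
    # Projection of each mic onto the look direction; mics farther along
    # the direction receive the wavefront earlier, so they get more delay.
    proj = [x * ux + y * uy for x, y in mic_positions]
    ref = max(proj)
    return [(ref - p) / SPEED_OF_SOUND for p in proj]
```

A touch gesture that rotates the displayed beam would simply recompute these delays for the new azimuth, giving the user immediate visual and audible feedback.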
Abstract:
A method for encoding three dimensional audio by a wireless communication device is disclosed. The wireless communication device detects an indication of a plurality of localizable audio sources. The wireless communication device also records a plurality of audio signals associated with the plurality of localizable audio sources. The wireless communication device also encodes the plurality of audio signals.
Abstract:
In general, techniques are described for forming a collaborative sound system. A headend device comprising one or more processors may perform the techniques. The processors may be configured to identify mobile devices that each include a speaker and that are available to participate in a collaborative surround sound system. The processors may configure the collaborative surround sound system to utilize the speaker of each of the mobile devices as one or more virtual speakers of the system, and then render audio signals from an audio source such that, when the audio signals are played by the speakers of the mobile devices, the audio playback appears to originate from the one or more virtual speakers of the collaborative surround sound system. The processors may then transmit the rendered audio signals to the mobile devices participating in the collaborative surround sound system.
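Placing a virtual speaker between two physical speakers is commonly done with constant-power amplitude panning; as a hedged sketch (the abstract does not name the panning law, and the function name and angle convention are assumptions), the gains for a pair of mobile-device speakers might be:

```python
import math

def panning_gains(virtual_angle_deg, left_angle_deg, right_angle_deg):
    """Constant-power gains that place a virtual speaker between two
    physical speakers (e.g. two mobile devices) at the given azimuths.
    Returns (gain_left, gain_right) with gain_left**2 + gain_right**2 == 1.
    """
    span = right_angle_deg - left_angle_deg
    frac = (virtual_angle_deg - left_angle_deg) / span  # 0 at left, 1 at right
    frac = min(max(frac, 0.0), 1.0)
    theta = frac * math.pi / 2.0
    # Sine/cosine panning keeps total radiated power constant as the
    # virtual source moves between the two speakers.
    return math.cos(theta), math.sin(theta)
```

The headend would scale each device's rendered signal by its gain before transmission, so the summed playback appears to come from the virtual speaker position.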
Abstract:
In general, techniques are described for performing constrained dynamic amplitude panning in collaborative sound systems. A headend device comprising one or more processors may perform the techniques. The processors may be configured to identify, for a mobile device participating in a collaborative surround sound system, a specified location of a virtual speaker of the collaborative surround sound system and determine a constraint that impacts playback of audio signals rendered from an audio source by the mobile device. The processors may be further configured to perform dynamic spatial rendering of the audio source subject to the determined constraint, rendering audio signals that reduce the impact of the determined constraint during playback of the audio signals by the mobile device.
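One way such a constraint (e.g. a battery- or speaker-power limit on one device) could be honored is to clip that device's panning gain and redistribute the lost signal power among devices with headroom; this is a sketch of that idea only (function name, equal-share redistribution, and per-device limits are assumptions, not the patented method):

```python
def apply_constraint(gains, limits):
    """Clip each device's panning gain to its constraint and redistribute
    the lost signal power among devices that still have headroom, so the
    virtual speaker's overall level is approximately preserved.

    gains, limits: per-device linear gain values of equal length.
    """
    out = [min(g, l) for g, l in zip(gains, limits)]
    deficit = sum(g * g for g in gains) - sum(g * g for g in out)
    if deficit > 0:
        free = [i for i in range(len(out)) if out[i] < limits[i]]
        for i in free:
            # Give each unconstrained device an equal share of the lost
            # power, up to its own remaining headroom.
            headroom = limits[i] ** 2 - out[i] ** 2
            add = min(headroom, deficit / len(free))
            out[i] = (out[i] ** 2 + add) ** 0.5
    return out
```

A constrained device thus plays more quietly while its neighbors compensate, which matches the abstract's goal of reducing the constraint's impact during playback.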
Abstract:
A device comprising one or more processors is configured to determine a plurality of segments for each of a plurality of binaural room impulse response filters, wherein each of the plurality of binaural room impulse response filters comprises a residual room response segment and at least one direction-dependent segment for which a filter response depends on a location within a sound field; transform each of at least one direction-dependent segment of the plurality of binaural room impulse response filters to a domain corresponding to a domain of a plurality of hierarchical elements to generate a plurality of transformed binaural room impulse response filters, wherein the plurality of hierarchical elements describe a sound field; and perform a fast convolution of the plurality of transformed binaural room impulse response filters and the plurality of hierarchical elements to render the sound field.
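The fast convolution referred to here is typically realized in the frequency domain; as a minimal single-channel sketch (pairing one transformed BRIR segment with one hierarchical-element channel, with the function name and FFT sizing as assumptions), it might look like:

```python
import numpy as np

def fast_convolve(brir, signal):
    """FFT-based (frequency-domain) convolution of one transformed BRIR
    segment with one hierarchical-element (e.g. spherical-harmonic)
    channel. Equivalent to np.convolve(brir, signal) but O(N log N).
    """
    n = len(brir) + len(signal) - 1
    # Round up to a power of two so the circular convolution of the
    # zero-padded inputs equals the linear convolution.
    nfft = 1 << (n - 1).bit_length()
    out = np.fft.irfft(np.fft.rfft(brir, nfft) * np.fft.rfft(signal, nfft), nfft)
    return out[:n]
```

In the described device, one such convolution per direction-dependent segment and hierarchical element (plus the shared residual room response) would be summed to render the binaural sound field.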
Abstract:
A device comprising one or more processors is configured to apply adaptively determined weights to a plurality of channels of an audio signal to generate a plurality of adaptively weighted channels of the audio signal. The processors are further configured to combine at least two of the plurality of adaptively weighted channels of the audio signal to generate a combined signal. The processors are further configured to apply a binaural room impulse response filter to the combined signal to generate a binaural audio signal.
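The point of weighting and combining before filtering is that only one BRIR convolution is needed instead of one per channel; a bare sketch of that structure (function name and array layout are assumptions) is:

```python
import numpy as np

def downmix_then_filter(channels, weights, brir):
    """Apply an adaptively determined weight to each channel, sum the
    weighted channels into one combined signal, then convolve the result
    with a single BRIR, so one convolution replaces one per channel.

    channels: 2-D array, one row per channel.
    weights: one weight per channel.
    brir: 1-D impulse response.
    """
    combined = np.zeros(channels.shape[1])
    for ch, w in zip(channels, weights):
        combined += w * ch  # weight, then accumulate into the downmix
    return np.convolve(combined, brir)
```

Running this once per ear (with the left and right BRIRs) would yield the binaural audio signal the abstract describes.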
Abstract:
A method for encoding multiple directional audio signals using an integrated codec by a wireless communication device is disclosed. The wireless communication device records a plurality of directional audio signals. The wireless communication device also generates a plurality of audio signal packets based on the plurality of directional audio signals. At least one of the audio signal packets includes an averaged signal. The wireless communication device further transmits the plurality of audio signal packets.
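One plausible reading of a packet that "includes an averaged signal" is a downmix-plus-residual layout, from which every directional channel can be reconstructed; this sketch is an assumption for illustration (the function names and packet layout are not specified by the abstract):

```python
def pack_directional_signals(frames):
    """Build one packet from synchronized frames of several directional
    signals: the averaged signal plus per-channel residuals, from which
    every original channel can be reconstructed losslessly.

    frames: list of equal-length sample lists, one per directional signal.
    """
    n = len(frames)
    avg = [sum(s[i] for s in frames) / n for i in range(len(frames[0]))]
    residuals = [[s[i] - avg[i] for i in range(len(avg))] for s in frames]
    return {"average": avg, "residuals": residuals}

def unpack(packet):
    """Recover the original directional signals from a packet."""
    avg = packet["average"]
    return [[r[i] + avg[i] for i in range(len(avg))]
            for r in packet["residuals"]]
```

A receiver that only needs a mono rendering could decode just the averaged signal, while a full decoder adds the residuals back to recover each directional channel.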