-
11.
Publication No.: US10567185B2
Publication date: 2020-02-18
Application No.: US15546925
Filing date: 2016-02-03
Applicant: DOLBY LABORATORIES LICENSING CORPORATION
Inventor: Richard J. Cartwright, Glenn N. Dickins
Abstract: Some aspects of the present disclosure involve the recording, processing and playback of audio data corresponding to conferences, such as teleconferences. In some teleconference implementations, the audio experience heard when a recording of the conference is played back may be substantially different from the audio experience of an individual conference participant during the original teleconference. In some implementations, the recorded audio data may include at least some audio data that was not available during the teleconference. In some examples, the spatial characteristics of the played-back audio data may be different from those of the audio heard by participants of the teleconference.
-
Publication No.: US10439951B2
Publication date: 2019-10-08
Application No.: US15460490
Filing date: 2017-03-16
Applicant: Dolby Laboratories Licensing Corporation
Inventor: Hannes Muesch, Richard J. Cartwright
IPC: H04L12/875, H04L12/26, H04L12/841, H04L12/863, H04L29/06
Abstract: Disclosed are a method and an apparatus operative to process packets of media received from a network. The apparatus includes a receiver unit, a jitter buffer data structure, a playback head defining a point in the jitter buffer data structure from which the ordered queue of packets is to be played back, and at least one prototype head. Each prototype head has a predetermined latency assigned thereto and defines a point in the jitter buffer data structure from which the ordered queue of packets is played back with said latency. A processor is operable to determine a measure of conversational quality associated with the ordered queue of packets being played back by each prototype head. Also described are a head selector operable to compare the measures of conversational quality associated with the ordered queue of packets being played back by each prototype head and to select the prototype head with the highest measure of conversational quality, and a playback unit coupled to the playback head.
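Illustrative sketch (not from the patent): the head-selection idea in this abstract can be approximated as follows. The abstract does not say how conversational quality is measured, so the `quality` score below, its weights, and the names `PrototypeHead` and `select_head` are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class PrototypeHead:
    latency_ms: float       # predetermined latency assigned to this head
    late_packets: int = 0   # packets that would have missed their playout time
    total_packets: int = 0

    def observe(self, network_delay_ms: float) -> None:
        """Record whether a packet with this delay arrives in time for this head."""
        self.total_packets += 1
        if network_delay_ms > self.latency_ms:
            self.late_packets += 1

    def quality(self, loss_weight: float = 100.0, latency_weight: float = 0.01) -> float:
        """Hypothetical quality score (higher is better).

        Penalizes the fraction of late packets and the added latency.
        The weights are illustrative and not taken from the patent.
        """
        if self.total_packets == 0:
            return 0.0
        late_fraction = self.late_packets / self.total_packets
        return -(loss_weight * late_fraction + latency_weight * self.latency_ms)


def select_head(heads, delays_ms):
    """Feed observed packet delays to every prototype head, then pick the best one."""
    for delay in delays_ms:
        for head in heads:
            head.observe(delay)
    return max(heads, key=lambda h: h.quality())


if __name__ == "__main__":
    heads = [PrototypeHead(latency) for latency in (20, 40, 100, 200)]
    best = select_head(heads, delays_ms=[20, 25, 30, 35, 90])
    print("selected latency (ms):", best.latency_ms)
```

With this scoring, a head with too little latency is penalized for late packets and a head with too much latency is penalized for delay, so the selector settles on a value in between.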
-
Publication No.: US10321256B2
Publication date: 2019-06-11
Application No.: US15547043
Filing date: 2016-02-02
Applicant: DOLBY LABORATORIES LICENSING CORPORATION
Inventor: Glenn N. Dickins, Richard J. Cartwright
Abstract: Systems, methods, and computer program products for creating an object-based audio signal from an audio input are described. The audio input includes one or more audio channels that are recorded to collectively define an audio scene. The one or more audio channels are captured from a respective one or more spatially separated microphones disposed in a stable spatial configuration. A system receives the audio input. The system performs spatial analysis on the one or more audio channels to identify one or more audio objects within the audio scene. The system determines contextual information relating to the one or more audio objects. The system defines respective audio streams including audio data relating to at least one of the identified one or more audio objects. The system then outputs an object-based audio signal including the audio streams and the contextual information.
-
Publication No.: US09883314B2
Publication date: 2018-01-30
Application No.: US15323724
Filing date: 2015-07-01
Applicant: Dolby Laboratories Licensing Corporation
Inventor: David Gunawan, Glenn N. Dickins, Richard J. Cartwright
CPC classification number: H04S7/30, H04R2420/01, H04R2430/25, H04S3/002, H04S2400/11, H04S2400/15, H04S2420/05, H04S2420/11
Abstract: A method for altering an audio signal of interest in a multi-channel soundfield representation of an audio environment, the method including the steps of: (a) extracting the signal of interest from the soundfield representation; (b) determining a residual soundfield signal; (c) inputting a further associated audio signal, which is associated with the signal of interest; (d) transforming the associated audio signal into a corresponding associated soundfield signal compatible with the residual soundfield; and (e) combining the residual soundfield signal with the associated soundfield signal to produce an output soundfield signal.
-
15.
Publication No.: US09565314B2
Publication date: 2017-02-07
Application No.: US14431247
Filing date: 2013-09-25
Applicant: Dolby Laboratories Licensing Corporation
Inventor: Richard J. Cartwright, David S. McGrath, Glenn N. Dickins
CPC classification number: H04M3/561, H04M3/568, H04M2203/509, H04R5/04, H04S7/302
Abstract: The present document relates to audio conference systems. In particular, the present document relates to the mapping of soundfields within an audio conference system. A conference multiplexer (110, 175, 210, 400) configured to place a first input soundfield signal (402) originating from a first soundfield endpoint (120, 170) within a 2D or 3D conference scene (300) to be rendered to a listener (301) is described. The first input soundfield signal (402) is indicative of a soundfield captured by the first soundfield endpoint (120, 170). The conference multiplexer (110, 175, 210, 400) is configured to set up the conference scene (300) comprising a plurality of talker locations (321, 322, 332, 331) at different angles (323, 333) with respect to the listener (301); provide a first sector (325); wherein the first sector (325) has a first angular width (324); wherein the first angular width (324) is greater than zero; and transform the first input soundfield signal (402) into a first output soundfield signal (403), such that for the listener (301) the first output soundfield signal (403) appears to be emanating from one or more virtual talker locations (321, 322) within the first sector (325).
-
Publication No.: US12136432B2
Publication date: 2024-11-05
Application No.: US17781632
Filing date: 2020-12-08
Applicant: DOLBY LABORATORIES LICENSING CORPORATION
Inventor: Benjamin Alexander Jancovich, Timothy Alan Port, Andrew P. Reilly, Richard J. Cartwright
IPC: G10L21/0232, G06F3/16, G10L15/22, G10L25/51, H03G3/32, H03G9/02, H04M9/08, H04R1/08, H04R1/40, H04R3/00, H04R3/02, H04R3/04, H04R3/12, H04R29/00
Abstract: Noise compensation method comprising: (a) receiving a content stream including content audio data; (b) receiving first microphone signals from a first device; (c) detecting ambient noise from a noise source location in or near the audio environment; (d) causing a first wireless signal to be transmitted from the first device to a second device, the first wireless signal including instructions for the second device to record an audio segment; (e) receiving a second wireless signal from the second device; (f) determining a content stream audio segment time interval for a content stream audio segment; (g) receiving a third wireless signal from the second device, including a recorded audio segment captured via a second device microphone; (h) determining a second device ambient noise signal at the second device location; and (i) implementing a noise compensation method for the content audio data based, at least in part, on the second device ambient noise signal.
-
Publication No.: US12112750B2
Publication date: 2024-10-08
Application No.: US17630895
Filing date: 2020-07-28
Applicant: Dolby Laboratories Licensing Corporation
Inventor: Mark R. P. Thomas, Richard J. Cartwright
CPC classification number: G10L15/22, G06F3/167, G10L15/08, G10L21/0264, H04R1/406, H04R3/005, H04S7/303, G10L2015/223, H04R2430/21
Abstract: A method for estimating a user's location in an environment may involve receiving output signals from each microphone of a plurality of microphones in the environment. At least two microphones of the plurality of microphones may be included in separate devices at separate locations in the environment and the output signals may correspond to a current utterance of a user. The method may involve determining multiple current acoustic features from the output signals of each microphone and applying a classifier to the multiple current acoustic features. Applying the classifier may involve applying a model trained on previously-determined acoustic features derived from a plurality of previous utterances made by the user in a plurality of user zones in the environment. The method may involve determining, based at least in part on output from the classifier, an estimate of the user zone in which the user is currently located.
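Illustrative sketch (not from the patent): the abstract describes training a model on acoustic features from previous utterances in known user zones, then classifying a current utterance. The abstract does not name the classifier or the features, so the nearest-centroid classifier below, the per-microphone level features, and the zone names are assumptions chosen only to show the data flow.

```python
import math
from collections import defaultdict


def train_centroids(examples):
    """Average the feature vectors of previous utterances per user zone.

    `examples` is a list of (zone_label, feature_vector) pairs. Here each
    feature vector holds one hypothetical level value per microphone.
    """
    sums = {}
    counts = defaultdict(int)
    for zone, feats in examples:
        if zone not in sums:
            sums[zone] = [0.0] * len(feats)
        sums[zone] = [s + f for s, f in zip(sums[zone], feats)]
        counts[zone] += 1
    return {zone: [s / counts[zone] for s in sums[zone]] for zone in sums}


def estimate_zone(centroids, feats):
    """Return the zone whose centroid is closest to the current features."""
    return min(centroids, key=lambda zone: math.dist(centroids[zone], feats))


if __name__ == "__main__":
    previous_utterances = [
        ("couch", [0.9, 0.2, 0.1]),
        ("couch", [0.8, 0.3, 0.1]),
        ("kitchen", [0.1, 0.2, 0.9]),
        ("kitchen", [0.2, 0.1, 0.8]),
    ]
    model = train_centroids(previous_utterances)
    print(estimate_zone(model, [0.7, 0.3, 0.2]))
```

A nearest-centroid model is used here only because it fits in a few lines; any classifier trained on labeled per-zone features would fill the same role.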
-
Publication No.: US11770666B2
Publication date: 2023-09-26
Application No.: US17397887
Filing date: 2021-08-09
Applicant: DOLBY LABORATORIES LICENSING CORPORATION
Inventor: Richard J. Cartwright, David S. McGrath, Glenn N. Dickins
CPC classification number: H04S1/005, H04M3/568, H04R3/12, H04R5/033, H04R5/04, H04S7/30, H04S7/304, H04S2400/01, H04S2400/03, H04S2400/11, H04S2420/01, H04S2420/11
Abstract: A computer implemented system for rendering captured audio soundfields to a listener comprises apparatus to deliver the audio soundfields to the listener. The delivery apparatus delivers the audio soundfields to the listener with first and second audio elements perceived by the listener as emanating from first and second virtual source locations, respectively, and with the first audio element and/or the second audio element delivered to the listener from a third virtual source location. The first virtual source location and the second virtual source location are perceived by the listener as being located to the front of the listener, and the third virtual source location is located to the rear or the side of the listener.
-
Publication No.: US20210035563A1
Publication date: 2021-02-04
Application No.: US16936673
Filing date: 2020-07-23
Applicant: Dolby Laboratories Licensing Corporation
Inventor: Richard J. Cartwright, Christopher Graham Hines
Abstract: In some embodiments, methods and systems for training an acoustic model, where the training includes a training loop (including at least one epoch) following a data preparation phase. During the training loop, training data are augmented to generate augmented training data. During each epoch of the training loop, at least some of the augmented training data is used to train the model. The augmented training data used during each epoch may be generated by differently augmenting (e.g., augmenting using a different set of augmentation parameters) at least some of the training data. In some embodiments, the augmentation is performed in the frequency domain, with the training data organized into frequency bands. The acoustic model may be of a type employed (when trained) to perform speech analytics (e.g., wakeword detection, voice activity detection, speech recognition, or speaker recognition) and/or noise suppression.
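Illustrative sketch (not from the patent): the per-epoch augmentation described in this abstract might look like the loop below, where each epoch draws a new set of augmentation parameters for banded frequency-domain data. The random per-band gain, its 6 dB range, and the names `augment`, `training_loop`, and `train_step` are assumptions for illustration; the abstract does not specify which augmentations are used.

```python
import random


def augment(example, rng, max_gain_db=6.0):
    """Apply an independent random gain to each frequency band of one example.

    `example` is a list of frames; each frame is a list of band powers.
    The gain range is an illustrative assumption.
    """
    num_bands = len(example[0])
    gains = [10 ** (rng.uniform(-max_gain_db, max_gain_db) / 10)
             for _ in range(num_bands)]
    return [[power * gain for power, gain in zip(frame, gains)]
            for frame in example]


def training_loop(training_data, num_epochs, train_step, seed=0):
    """Run `num_epochs` epochs, re-augmenting the training data in each one."""
    rng = random.Random(seed)
    for epoch in range(num_epochs):
        # New augmentation parameters are drawn every epoch, so successive
        # epochs train on differently augmented copies of the same data.
        augmented = [augment(example, rng) for example in training_data]
        train_step(epoch, augmented)


if __name__ == "__main__":
    data = [[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]]  # one example, two frames, three bands
    training_loop(data, num_epochs=2,
                  train_step=lambda epoch, batch: print(epoch, batch))
```

Augmenting inside the loop, rather than once during data preparation, is what lets every epoch use a different augmented version without storing multiple copies of the dataset.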
-
Publication No.: US10516782B2
Publication date: 2019-12-24
Application No.: US15548245
Filing date: 2016-02-03
Applicant: DOLBY LABORATORIES LICENSING CORPORATION
Inventor: Richard J. Cartwright, Shen Huang
Abstract: Various disclosed implementations involve processing and/or playback of a recording of a conference involving a plurality of conference participants. Some implementations disclosed herein involve receiving audio data corresponding to a recording of at least one conference involving a plurality of conference participants. The audio data may include conference participant speech data from multiple endpoints, recorded separately and/or conference participant speech data from a single endpoint corresponding to multiple conference participants and including spatial information for each conference participant of the multiple conference participants. A search of the audio data may be based on one or more search parameters. The search may be a concurrent search for multiple features of the audio data. Instances of conference participant speech may be rendered to at least two different virtual conference participant positions of a virtual acoustic space.