-
Publication No.: US20240160849A1
Publication date: 2024-05-16
Application No.: US18550429
Application date: 2022-04-27
Applicant: Dolby Laboratories Licensing Corporation
Inventor: Andrea FANELLI , Mingqing YUN , Satej Suresh PANKEY , Nicholas Laurence ENGEL , Poppy Anne Carrie CRUM
IPC: G06F40/30
CPC classification number: G06F40/30
Abstract: Embodiments are disclosed for speaker diarization supporting episodical content. In an embodiment, a method comprises: receiving media data including one or more utterances; dividing the media data into a plurality of blocks; identifying segments of each block of the plurality of blocks associated with a single speaker; extracting embeddings for the identified segments in accordance with a machine learning model, wherein extracting embeddings for identified segments further comprises statistically combining extracted embeddings for identified segments that correspond to a respective continuous utterance associated with a single speaker; clustering the embeddings for the identified segments into clusters; and assigning a speaker label to each of the embeddings for the identified segments in accordance with a result of the clustering. In some embodiments, a voiceprint is used to identify a speaker and the speaker identity for a speaker label.
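Illustrative sketch (not part of the patent record): the steps named in the abstract above, combining embeddings per continuous utterance, clustering, and assigning speaker labels, might look roughly like the following. The averaging rule, the greedy cosine-similarity clustering, the `threshold` value, and all function names are assumptions made for illustration; the abstract does not specify the statistic or the clustering method, and embeddings are assumed to be non-zero vectors already produced by some model.

```python
from statistics import fmean


def combine_embeddings(segment_embeddings):
    """Average the embeddings of segments from one continuous utterance.

    Assumption: the "statistical combination" is a per-dimension mean.
    """
    return [fmean(dim) for dim in zip(*segment_embeddings)]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    return dot / (norm_a * norm_b)


def assign_speaker_labels(embeddings, threshold=0.8):
    """Greedy clustering: join the most similar cluster at or above
    `threshold`, otherwise open a new cluster. Returns one label per
    embedding (hypothetical stand-in for the clustering step)."""
    centroids, members, labels = [], [], []
    for emb in embeddings:
        best, best_sim = None, threshold
        for idx, centroid in enumerate(centroids):
            sim = cosine(emb, centroid)
            if sim >= best_sim:
                best, best_sim = idx, sim
        if best is None:
            centroids.append(list(emb))
            members.append([emb])
            labels.append(len(centroids) - 1)
        else:
            members[best].append(emb)
            centroids[best] = combine_embeddings(members[best])
            labels.append(best)
    return labels


# Three utterances; the first consists of two single-speaker segments.
utterances = [[[1.0, 0.0], [0.9, 0.1]], [[0.0, 1.0]], [[1.0, 0.1]]]
combined = [combine_embeddings(u) for u in utterances]
labels = assign_speaker_labels(combined)
```

In this toy input the first and third utterances point in nearly the same direction, so they end up sharing a label, while the second opens its own cluster.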
-
Publication No.: US20210211825A1
Publication date: 2021-07-08
Application No.: US17263125
Application date: 2019-07-05
Applicant: Dolby Laboratories Licensing Corporation
Inventor: McGregor Steele JOYNER , Alex BRANDMEYER , Scott DALY , Jeffrey Ross BAKER , Andrea FANELLI , Poppy Anne Carrie CRUM
Abstract: An apparatus and method of generating personalized HRTFs. The system is prepared by calculating a model for HRTFs described as the relationship between a finite example set of input data, namely anthropometric measures and demographic information for a set of individuals, and a corresponding set of output data, namely HRTFs numerically simulated using a high-resolution database of 3D scans of the same set of individuals. At the time of use, the system queries the user for their demographic information, and then from a series of images of the user, the system detects and measures various anthropometric characteristics. The system then applies the prepared model to the anthropometric and demographic data as part of generating a personalized HRTF. In this manner, the personalized HRTF can be generated with more convenience than by performing a high-resolution scan or an acoustic measurement of the user, and with less computational complexity than by numerically simulating their HRTF.
-
Publication No.: US20240048932A1
Publication date: 2024-02-08
Application No.: US18455565
Application date: 2023-08-24
Applicant: Dolby Laboratories Licensing Corporation
Inventor: McGregor Steele JOYNER , Alex BRANDMEYER , Scott DALY , Jeffrey Ross BAKER , Andrea FANELLI , Poppy Anne Carrie CRUM
CPC classification number: H04S7/301 , G06T7/11 , G06T7/70 , H04S1/002 , H04S7/303 , G06V40/10 , G06F18/214 , G06T2207/20081 , G06T2207/20084 , G06T2207/20132 , G06T2207/30196 , H04S2400/15 , H04S2420/01
Abstract: An apparatus and method of generating personalized HRTFs. The system is prepared by calculating a model for HRTFs described as the relationship between a finite example set of input data, namely anthropometric measures and demographic information for a set of individuals, and a corresponding set of output data, namely HRTFs numerically simulated using a high-resolution database of 3D scans of the same set of individuals. At the time of use, the system queries the user for their demographic information, and then from a series of images of the user, the system detects and measures various anthropometric characteristics. The system then applies the prepared model to the anthropometric and demographic data as part of generating a personalized HRTF. In this manner, the personalized HRTF can be generated with more convenience than by performing a high-resolution scan or an acoustic measurement of the user, and with less computational complexity than by numerically simulating their HRTF.
-
Publication No.: US20230071021A1
Publication date: 2023-03-09
Application No.: US17792886
Application date: 2021-01-22
Applicant: Dolby Laboratories Licensing Corporation
Inventor: Andrea FANELLI , Evan David GITTERMAN , Nathan Carl SWEDLOW , Alex BRANDMEYER , McGregor Steele JOYNER , Poppy Anne Carrie CRUM
Abstract: A system for determining a direction of gaze of a user, comprising a set of electrodes arranged on earpieces, each electrode comprising a patch of compressible and electrically conducting foam material. The system further includes circuitry connected to the electrodes and configured to receive a set of voltage signals from a set of electrodes arranged on an audio endpoint worn by a user, multiplex said voltage signals into an input signal, remove a predicted central voltage from said input signal, to provide a detrended signal, and determine said gaze direction based on said detrended signal. Such conducting foam materials provide satisfactory bio-sensing performance for a wide range of compression levels and over time. In the case of on-ear headphones, the foam electrodes may be integrated in the cuffs with little or no effect on the comfort level.
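Illustrative sketch (not part of the patent record): the detrending step described in the abstract above, removing a predicted central voltage from the input signal before deciding gaze direction, could be approximated as follows. The abstract does not say how the central voltage is predicted; the moving-average predictor, the `window` and `threshold` values, the polarity-to-direction mapping, and the function names are all assumptions for illustration, and the multiplexing of electrode signals is omitted.

```python
def detrend(samples, window=4):
    """Subtract a predicted central voltage from each sample.

    Assumption: the prediction is the mean of up to `window` previous
    samples; the first sample is predicted as itself (zero residual).
    """
    detrended = []
    for i, value in enumerate(samples):
        history = samples[max(0, i - window):i]
        predicted = sum(history) / len(history) if history else value
        detrended.append(value - predicted)
    return detrended


def gaze_direction(detrended_value, threshold=0.05):
    """Map a detrended voltage to a coarse direction.

    Assumption: positive deflection means "right", negative "left".
    """
    if detrended_value > threshold:
        return "right"
    if detrended_value < -threshold:
        return "left"
    return "center"


# A steady baseline followed by a positive deflection.
samples = [1.0, 1.0, 1.0, 1.0, 1.5]
residual = detrend(samples)
direction = gaze_direction(residual[-1])
```

With this input the baseline samples produce zero residual, and the final sample stands out by 0.5 against the predicted baseline of 1.0.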
-
Publication No.: US20220366919A1
Publication date: 2022-11-17
Application No.: US17762709
Application date: 2020-09-22
Applicant: Dolby Laboratories Licensing Corporation
Inventor: Dirk Jeroen BREEBAART , Alex BRANDMEYER , Poppy Anne Carrie CRUM , McGregor Steele JOYNER , David MCGRATH , Andrea FANELLI , Rhonda J. WILSON
IPC: G10L19/008 , H04S7/00
Abstract: Encoding/decoding techniques where multiple transform parameter sets are encoded together with a rendered playback presentation of an input audio content. The multiple transform parameters are used on the decoder side to transform the playback presentation to provide a personalized binaural playback presentation optimized for an individual listener with respect to their hearing profile. This may be achieved by selection or combination of the data present in the metadata streams.
-