Post-conference playback system having higher perceived quality than originally heard in the conference

    Publication No.: US10567185B2

    Publication date: 2020-02-18

    Application No.: US15546925

    Filing date: 2016-02-03

    Abstract: Some aspects of the present disclosure involve the recording, processing and playback of audio data corresponding to conferences, such as teleconferences. In some teleconference implementations, the audio experience heard when a recording of the conference is played back may be substantially different from the audio experience of an individual conference participant during the original teleconference. In some implementations, the recorded audio data may include at least some audio data that was not available during the teleconference. In some examples, the spatial characteristics of the played-back audio data may be different from those of the audio heard by participants of the teleconference.

    Jitter buffer apparatus and method
    Granted invention patent

    Publication No.: US10439951B2

    Publication date: 2019-10-08

    Application No.: US15460490

    Filing date: 2017-03-16

    Abstract: Disclosed are a method and an apparatus for processing packets of media received from a network. The apparatus includes a receiver unit, a jitter buffer data structure, a playback head defining the point in the jitter buffer data structure from which the ordered queue of packets is to be played back, and at least one prototype head. Each prototype head has a predetermined latency assigned to it and defines a point in the jitter buffer data structure from which the ordered queue of packets is played back with that latency. A processor is operable to determine a measure of conversational quality associated with the ordered queue of packets being played back by each prototype head. Also described are a head selector, operable to compare these measures of conversational quality and select the prototype head with the highest measure, and a playback unit coupled to the playback head.
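The head-selection idea in this abstract can be illustrated with a minimal sketch. It is not the patented method: the abstract does not say how conversational quality is measured, so the `quality` score below (a weighted penalty on late packets and on added latency), the 20 ms frame duration, and all names are assumptions chosen only to make the example runnable.

```python
from dataclasses import dataclass

FRAME_MS = 20.0  # assumed duration of the media carried by one packet


@dataclass
class Packet:
    seq: int           # sequence number, defines playout order
    arrival_ms: float  # arrival time at the receiver


def late_fraction(packets, latency_ms):
    """Fraction of packets that miss their playout deadline for one head."""
    if not packets:
        return 0.0
    ordered = sorted(packets, key=lambda p: p.seq)
    first = ordered[0]
    late = 0
    for p in ordered:
        # A head with the given latency plays packet p this long after
        # the first packet arrived.
        deadline = first.arrival_ms + latency_ms + (p.seq - first.seq) * FRAME_MS
        if p.arrival_ms > deadline:
            late += 1
    return late / len(ordered)


def quality(packets, latency_ms, loss_weight=100.0, delay_weight=0.05):
    """Hypothetical quality score: higher is better.

    Late packets and extra latency are both penalised; the weights are
    illustrative assumptions, not values from the patent.
    """
    return -(loss_weight * late_fraction(packets, latency_ms)
             + delay_weight * latency_ms)


def select_head(packets, prototype_latencies_ms):
    """Return the prototype latency whose playout scores highest."""
    return max(prototype_latencies_ms, key=lambda lat: quality(packets, lat))
```

In this sketch a very short latency loses packets to jitter and a very long one adds conversational delay, so the selector tends to settle on the smallest latency that still absorbs the observed jitter.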

    Adaptive audio construction
    Granted invention patent

    Publication No.: US10321256B2

    Publication date: 2019-06-11

    Application No.: US15547043

    Filing date: 2016-02-02

    Abstract: Systems, methods, and computer program products for creating an object-based audio signal from an audio input are described. The audio input includes one or more audio channels that are recorded to collectively define an audio scene. The one or more audio channels are captured from a respective one or more spatially separated microphones disposed in a stable spatial configuration. A system receives the audio input. The system performs spatial analysis on the one or more audio channels to identify one or more audio objects within the audio scene. The system determines contextual information relating to the one or more audio objects. The system defines respective audio streams including audio data relating to at least one of the identified one or more audio objects. The system then outputs an object-based audio signal including the audio streams and the contextual information.

    Spatial multiplexing in a soundfield teleconferencing system
    Granted invention patent (in force)

    Publication No.: US09565314B2

    Publication date: 2017-02-07

    Application No.: US14431247

    Filing date: 2013-09-25

    CPC classification number: H04M3/561 H04M3/568 H04M2203/509 H04R5/04 H04S7/302

    Abstract: The present document relates to audio conference systems. In particular, the present document relates to the mapping of soundfields within an audio conference system. A conference multiplexer (110, 175, 210, 400) configured to place a first input soundfield signal (402) originating from a first soundfield endpoint (120, 170) within a 2D or 3D conference scene (300) to be rendered to a listener (301) is described. The first input soundfield signal (402) is indicative of a soundfield captured by the first soundfield endpoint (120, 170). The conference multiplexer (110, 175, 210, 400) is configured to set up the conference scene (300) comprising a plurality of talker locations (321, 322, 332, 331) at different angles (323, 333) with respect to the listener (301); provide a first sector (325); wherein the first sector (325) has a first angular width (324); wherein the first angular width (324) is greater than zero; and transform the first input soundfield signal (402) into a first output soundfield signal (403), such that for the listener (301) the first output soundfield signal (403) appears to be emanating from one or more virtual talker locations (321, 322) within the first sector (325).
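As a rough illustration of placing a soundfield inside a sector of non-zero angular width, the sketch below linearly compresses source azimuths from the full circle into the sector. The linear mapping, the function name `map_to_sector`, and the degree-based convention are assumptions made for this example; the abstract does not specify how the transform is computed.

```python
def map_to_sector(azimuth_deg, sector_center_deg, sector_width_deg):
    """Map a source azimuth from the full circle into a narrower sector.

    Assumed convention: azimuths are in degrees, and the whole input
    circle [-180, 180) is squeezed linearly into
    [center - width/2, center + width/2).
    """
    if sector_width_deg <= 0:
        raise ValueError("sector width must be greater than zero")
    # Wrap the input angle into [-180, 180).
    wrapped = (azimuth_deg + 180.0) % 360.0 - 180.0
    # Scale by the ratio of sector width to the full circle, then
    # shift to the sector centre.
    return sector_center_deg + wrapped / 360.0 * sector_width_deg
```

For example, with a sector centred at 30 degrees and 40 degrees wide, a talker directly in front of the capturing endpoint (0 degrees) lands at the sector centre, while talkers to the sides are spread across the sector rather than collapsed to a single point.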

    PER-EPOCH DATA AUGMENTATION FOR TRAINING ACOUSTIC MODELS

    Publication No.: US20210035563A1

    Publication date: 2021-02-04

    Application No.: US16936673

    Filing date: 2020-07-23

    Abstract: Some embodiments provide methods and systems for training an acoustic model, where the training includes a training loop (including at least one epoch) following a data preparation phase. During the training loop, training data are augmented to generate augmented training data. During each epoch of the training loop, at least some of the augmented training data are used to train the model. The augmented training data used during each epoch may be generated by differently augmenting (e.g., augmenting using a different set of augmentation parameters) at least some of the training data. In some embodiments, the augmentation is performed in the frequency domain, with the training data organized into frequency bands. The acoustic model may be of a type employed (when trained) to perform speech analytics (e.g., wakeword detection, voice activity detection, speech recognition, or speaker recognition) and/or noise suppression.
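The per-epoch augmentation described here can be sketched as a training loop that redraws augmentation parameters at the start of every epoch. This is only an illustration under stated assumptions: each training example is taken to be a list of per-band energies, the augmentation is a random per-band gain of up to 6 dB, and `train_step` is a hypothetical callback standing in for the actual model update.

```python
import random


def augment(band_energies, rng, max_gain_db=6.0):
    """Apply an independent random gain (in dB) to each frequency band."""
    gains_db = [rng.uniform(-max_gain_db, max_gain_db) for _ in band_energies]
    # Energies scale with 10^(dB / 10).
    return [e * 10 ** (g / 10.0) for e, g in zip(band_energies, gains_db)]


def train(training_data, num_epochs, train_step, seed=0):
    """Run a training loop that augments the data differently each epoch.

    training_data: list of examples, each a list of banded energies.
    train_step:    hypothetical callback (example, epoch) that would
                   update the acoustic model.
    """
    rng = random.Random(seed)
    for epoch in range(num_epochs):
        # New augmentation parameters are drawn inside the loop, so each
        # epoch sees a different augmented version of the same data.
        augmented = [augment(example, rng) for example in training_data]
        for example in augmented:
            train_step(example, epoch)
```

Drawing the parameters inside the loop, rather than once during data preparation, is what distinguishes this arrangement from augmenting the data set a single time before training begins.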

    Conference searching and playback of search results

    Publication No.: US10516782B2

    Publication date: 2019-12-24

    Application No.: US15548245

    Filing date: 2016-02-03

    Abstract: Various disclosed implementations involve processing and/or playback of a recording of a conference involving a plurality of conference participants. Some implementations disclosed herein involve receiving audio data corresponding to a recording of at least one conference involving a plurality of conference participants. The audio data may include conference participant speech data from multiple endpoints, recorded separately and/or conference participant speech data from a single endpoint corresponding to multiple conference participants and including spatial information for each conference participant of the multiple conference participants. A search of the audio data may be based on one or more search parameters. The search may be a concurrent search for multiple features of the audio data. Instances of conference participant speech may be rendered to at least two different virtual conference participant positions of a virtual acoustic space.
