Patent search: cpc:"G10L21/12"

1.

Invention publication
METHODS AND SYSTEMS FOR COMPUTER-GENERATED VISUALIZATION OF SPEECH (status: under examination, published)

Publication (announcement) number: US20240087591A1

Publication (announcement) date: 2024-03-14

Application number: US18451736

Application date: 2023-08-17

Applicant: SomniQ, Inc.

Inventor: Rikko Sakaguchi, Hidenori Ishikawa

IPC: G10L21/12, G10L21/10, G10L21/14, G10L25/93

CPC classification number: G10L21/12, G10L21/10, G10L21/14, G10L25/93

Abstract: Methods, systems and apparatuses for computer-generated visualization of speech are described herein. An example method of computer-generated visualization of speech including at least one segment includes: generating a graphical representation of an object corresponding to a segment of the speech; and displaying the graphical representation of the object on a screen of a computing device. Generating the graphical representation includes: representing a duration of the respective segment by a length of the object and representing intensity of the respective segment by a width of the object; and placing, in the graphical representation, a space between adjacent objects.

2.

Invention publication
METHOD AND SYSTEM FOR LEARNING AND USING LATENT-SPACE REPRESENTATIONS OF AUDIO SIGNALS FOR AUDIO CONTENT-BASED RETRIEVAL (status: under examination, published)

Publication (announcement) number: US20230317097A1

Publication (announcement) date: 2023-10-05

Application number: US18142165

Application date: 2023-05-02

Applicant: Distributed Creation Inc.

Inventor: Alejandro KORETZKY, Naveen Sasalu RAJASHEKHARAPPA

IPC: G10L25/54, G06F16/65, G06F3/16, G06N3/08, G10L21/12, G10L21/14, G10L25/30, G06F18/214

CPC classification number: G10L25/54, G06F16/65, G06F3/165, G06N3/08, G10L21/12, G10L21/14, G10L25/30, G06F18/214

Abstract: A method and system are provided for extracting features from digital audio signals which exhibit variations in pitch, timbre, decay, reverberation, and other psychoacoustic attributes and learning, from the extracted features, an artificial neural network model for generating contextual latent-space representations of digital audio signals. A method and system are also provided for learning an artificial neural network model for generating consistent latent-space representations of digital audio signals in which the generated latent-space representations are comparable for the purposes of determining psychoacoustic similarity between digital audio signals. A method and system are also provided for extracting features from digital audio signals and learning, from the extracted features, an artificial neural network model for generating latent-space representations of digital audio signals which take care of selecting salient attributes of the signals that represent psychoacoustic differences between the signals.

3.

Invention grant
Methods and systems for computer-generated visualization of speech (status: in force)

Publication (announcement) number: US11735204B2

Publication (announcement) date: 2023-08-22

Application number: US17404873

Application date: 2021-08-17

Applicant: SomniQ, Inc.

Inventor: Rikko Sakaguchi, Hidenori Ishikawa

IPC: G10L21/12, G10L25/93, G10L21/14, G10L21/10

CPC classification number: G10L21/12, G10L21/10, G10L21/14, G10L25/93

Abstract: Methods, systems and apparatuses for computer-generated visualization of speech are described herein. An example method of computer-generated visualization of speech including at least one segment includes: generating a graphical representation of an object corresponding to a segment of the speech; and displaying the graphical representation of the object on a screen of a computing device. Generating the graphical representation includes: representing a duration of the respective segment by a length of the object and representing intensity of the respective segment by a width of the object; and placing, in the graphical representation, a space between adjacent objects.

4.

Invention application
INTERACTIVE KEYWORD CLOUD (status: in force)

Publication (announcement) number: US20180046433A1

Publication (announcement) date: 2018-02-15

Application number: US15790442

Application date: 2017-10-23

Applicant: Patient Prism LLC

Inventor: Michael G. Spiessbach, Amol Nirgudkar

IPC: G06F3/16, G06F17/22, H04M3/51, G06F17/24, H04L29/08, G06F3/0481, G10L15/08, G06F3/0482, H04M3/42

CPC classification number: G06F3/165, G06F3/04812, G06F3/0482, G06F17/2235, G06F17/241, G06Q30/0201, G10L15/08, G10L21/12, G10L2015/088, H04L67/02, H04L67/146, H04L67/22, H04M3/42221, H04M3/5175, H04M2203/303, H04M2203/305, H04M2203/403

Abstract: Merchant/consumer calls may be recorded and evaluated according to a variety of criteria. The call recordings and analyses thereof, as well as consumer tracking information, may be displayed in a user interface of a web-based online portal for convenience in evaluating the use and efficacy of marketing channels as well as the quality of merchant/consumer interactions. In an aspect, the user interface provides a representation of a variety of telephone calls as an interactive keyword cloud that presents business-value-specific keywords targeted for detection during such telephone calls. The keyword cloud may depict keywords in a range of colors, sizes, and relative positioning to connote varied degrees of significance, such as a relative rate of occurrence of keywords in the represented telephone calls. Each keyword in the keyword cloud may contain a hyperlink to related content such as a listing of telephone calls containing the keyword.

5.

Invention grant
Call visualization (status: in force)

Publication (announcement) number: US09826090B2

Publication (announcement) date: 2017-11-21

Application number: US15377428

Application date: 2016-12-13

Applicant: Patient Prism LLC

Inventor: Michael G. Spiessbach, Amol Nirgudkar

IPC: H04M3/42, H04M3/51, G06Q30/02

CPC classification number: G06F3/165, G06F3/04812, G06F3/0482, G06F17/2235, G06F17/241, G06Q30/0201, G10L15/08, G10L21/12, G10L2015/088, H04L67/02, H04L67/146, H04L67/22, H04M3/42221, H04M3/5175, H04M2203/303, H04M2203/305, H04M2203/403

Abstract: Merchant/consumer calls may be recorded and evaluated according to a variety of criteria. The call recordings and analyses thereof, as well as consumer tracking information, may be displayed in a user interface of a web-based online portal for convenience in evaluating the use and efficacy of marketing channels as well as the quality of merchant/consumer interactions. In an aspect, the user interface provides call visualization in the form of audio data from a telephone call displayed as a waveform on a call timeline. The call may be (automatically or manually) annotated with various business-value-specific keywords spoken during the telephone call, and markers for these keywords can be presented on the call timeline to visually indicate the keyword and the time during the call when the keyword was spoken. A business value for the call may be determined based at least in part on keywords spoken during the call.

6.

Invention grant
Hybrid audio representations for editing audio content (status: in force)

Publication (announcement) number: US09666208B1

Publication (announcement) date: 2017-05-30

Application number: US14968876

Application date: 2015-12-14

Applicant: Adobe Systems Incorporated

Inventor: Michael Rubin, James A. Moorer

IPC: G10L15/26, G10L21/12, G10L25/87, G10L21/10, G06F3/16, G06F3/0481, G06T11/60

CPC classification number: G10L21/12, G06F3/0481, G06F3/167, G06F17/211, G06T11/60, G06T2200/24, G10L15/04, G10L15/26, G10L21/10, G10L25/87

Abstract: The present disclosure includes a hybrid waveform system that displays a hybrid waveform to a user. In general, the hybrid waveform system provides a hybrid waveform to a user that uses converted readable text and waveforms to represent an audio segment. By providing a user with a hybrid waveform, the hybrid waveform system offers users a number of benefits, such as providing an audio display that enables a user to quickly ascertain context information and audio information typically missing from audio transcriptions.

7.

Invention application
Audio File Metadata Event Labeling and Data Analysis (status: under examination, published)

Publication (announcement) number: US20160277577A1

Publication (announcement) date: 2016-09-22

Application number: US15076572

Application date: 2016-03-21

Applicant: TopBox, LLC

Inventor: Jeffrey Stephen Yentis, Christopher Lee Tranquill, Brian Keith Timmons, Ryan Andrew Studer, Micheal Dean Dobson

IPC: H04M3/51, G10L21/12, G06F3/16, G06F17/24, G06F3/0482, G06F3/0484, G06F3/0481, H04M3/42, G06F17/30

CPC classification number: G10L21/12, G06F3/04817, G06F3/0482, G06F3/04842, G06F3/165, G06F16/3326, G06F16/685, G06F17/24, G06F17/241, G06Q10/063, H04M3/42221, H04M3/5175

Abstract: An interaction management system receives audio files of interactions between customers and customer service agents and client provided metadata from a client. The interaction management system provides an interface for creating enhanced metadata based on the received audio file and client provided metadata using a capture interface. The capture interface allows a user to label the audio file with event labels and sentiment labels at particular time stamps in the audio file. The interaction management system saves the captured metadata in an interaction file associated with the client provided audio file to be presented back to the user as a visual sequential representation of the captured data.

8.

Invention application
Multi-Party Conversation Analyzer and Logger (status: under examination, published)

Publication (announcement) number: US20160217807A1

Publication (announcement) date: 2016-07-28

Application number: US15082959

Application date: 2016-03-28

Applicant: Securus Technologies, Inc.

Inventor: Jay Loring Gainsboro, Lee Davis Weinstein

IPC: G10L25/63, G10L21/12, G10L15/18, H04M3/22

CPC classification number: H04M3/568, G10L15/1807, G10L17/00, G10L21/12, G10L25/63, H04M3/2281, H04M3/42221, H04M2201/41

Abstract: A multi-party conversation analyzer and logger uses a variety of techniques including spectrographic voice analysis, absolute loudness measurements, directional microphones, and telephonic directional separation to determine the number of parties who take part in a conversation, and segment the conversation by speaking party. In one aspect, the invention monitors telephone conversations in real time to detect conditions of interest (for instance, calls to non-allowed parties or calls of a prohibited nature from prison inmates). In another aspect, automated prosody measurement algorithms are used in conjunction with speaker segmentation to extract emotional content of the speech of participants within a particular conversation, and speaker interactions and emotions are displayed in graphical form. A conversation database is generated which contains conversation recordings, and derived data such as transcription text, derived emotions, alert conditions, and correctness probabilities associated with derived data. Investigative tools allow flexible queries of the conversation database.

9.

Invention application
Multi-party conversation analyzer & logger (status: in force)

Publication (announcement) number: US20070071206A1

Publication (announcement) date: 2007-03-29

Application number: US11475541

Application date: 2006-06-26

Applicant: Jay Gainsboro, Lee Weinstein

Inventor: Jay Gainsboro, Lee Weinstein

IPC: H04M9/00, H04M1/60

CPC classification number: G10L25/63, G10L15/1807, G10L17/00, G10L21/12, H04M3/2281, H04M2201/41

Abstract: A multi-party conversation analyzer and logger uses a variety of techniques including spectrographic voice analysis, absolute loudness measurements, directional microphones, and telephonic directional separation to determine the number of parties who take part in a conversation, and segment the conversation by speaking party. In one aspect, the invention monitors telephone conversations in real time to detect conditions of interest (for instance, calls to non-allowed parties or calls of a prohibited nature from prison inmates). In another aspect, automated prosody measurement algorithms are used in conjunction with speaker segmentation to extract emotional content of the speech of participants within a particular conversation, and speaker interactions and emotions are displayed in graphical form. A conversation database is generated which contains conversation recordings, and derived data such as transcription text, derived emotions, alert conditions, and correctness probabilities associated with derived data. Investigative tools allow flexible queries of the conversation database.

10.

Invention application
MULTI-SPEAKER SPEECH RECOGNITION CORRECTION SYSTEM (status: under examination, published)

Publication (announcement) number: US20180182396A1

Publication (announcement) date: 2018-06-28

Application number: US15823937

Application date: 2017-11-28

Applicant: SORIZAVA CO., LTD.

Inventor: Munhak AN

IPC: G10L15/26, G06F17/28, G10L15/08, G10L15/32, G10L21/0272, G10L21/12

CPC classification number: G10L15/26, G06F17/279, G06F17/28, G06F17/30746, G10L15/08, G10L15/32, G10L21/0272, G10L21/12

Abstract: The present invention relates to a multi-speaker speech recognition correction system for determining a speaker of an utterance with a simple method and easily correcting speech-recognized text during speech recognition for a plurality of speakers. According to the present invention, when speech signals are input to a multi-speaker speech recognition system from a plurality of microphones which are each provided to a corresponding one of a plurality of speakers, the multi-speaker speech recognition correction system may detect a speech session from a time point at which input of each of the speech signals is started to a time point at which the input of the speech signal is stopped, and a speech recognizer may convert only the detected speech sessions into text so that a speaker of an utterance can be identified by a simple method and speech recognition can be carried out at a low cost.
