Speaker identity and content de-identification

Granted invention patent

US11580951B2 Speaker identity and content de-identification (in force)


Patent title: Speaker identity and content de-identification
Application number: US17452563

Filing date: 2021-10-27
Publication number: US11580951B2

Publication date: 2023-02-14
Inventors: Aris Gkoulalas-Divanis, Xu Wang, Paul R. Bastide, Rohit Ranchal
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Applicant address: Armonk, NY, US
Patentee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Current patentee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Current patentee address: Armonk, NY, US
Agency: Sherman IP LLP
Agents: Kenneth L. Sherman; Hemavathy Perumal
Main classification: G06F16/35
IPC classifications: G06F16/35; G10L13/00; G10L15/18; G10L15/26; G10L17/02

Abstract:

One embodiment of the invention provides a method for speaker identity and content de-identification under privacy guarantees. The method comprises receiving input indicative of privacy protection levels to enforce, extracting features from a speech recorded in a voice recording, recognizing and extracting textual content from the speech, parsing the textual content to recognize privacy-sensitive personal information about an individual, generating de-identified textual content by anonymizing the personal information to an extent that satisfies the privacy protection levels and conceals the individual's identity, and mapping the de-identified textual content to a speaker who delivered the speech. The method further comprises generating a synthetic speaker identity based on other features that are dissimilar from the features to an extent that satisfies the privacy protection levels, and synthesizing a new speech waveform based on the synthetic speaker identity to deliver the de-identified textual content. The new speech waveform conceals the speaker's identity.
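
The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: it assumes speech recognition has already produced a transcript and that voice features are a plain list of floats, and it stops before waveform synthesis. The names PrivacyLevels, PII_PATTERNS, deidentify_text, and synthetic_identity are hypothetical, the two regular expressions stand in for a real personal-information recognizer, and Euclidean distance is only an assumed measure of feature dissimilarity.

```python
import math
import random
import re
from dataclasses import dataclass

# Hypothetical PII patterns; a real recognizer would cover far more categories
# (names, addresses, dates, identifiers) than these two regexes.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


@dataclass
class PrivacyLevels:
    """Assumed shape of the 'privacy protection levels' input."""
    categories: tuple    # PII categories that must be anonymized
    min_distance: float  # required dissimilarity between feature vectors


def deidentify_text(text, levels):
    """Replace each recognized PII span with a category placeholder."""
    for category in levels.categories:
        text = PII_PATTERNS[category].sub(f"[{category}]", text)
    return text


def synthetic_identity(features, levels, seed=0):
    """Return a feature vector at least min_distance from the original.

    Assumption: dissimilarity is Euclidean distance. The original vector is
    shifted along a random direction by 1.1 times the required distance.
    """
    if not features:
        raise ValueError("features must be non-empty")
    rng = random.Random(seed)
    direction = [rng.gauss(0.0, 1.0) for _ in features]
    norm = math.sqrt(sum(d * d for d in direction))
    scale = 1.1 * levels.min_distance / norm  # 10% margin over the threshold
    return [f + scale * d for f, d in zip(features, direction)]
```

Calling deidentify_text("Call me at 555-123-4567 or write to jane.doe@example.com.", PrivacyLevels(("EMAIL", "PHONE"), 2.0)) returns "Call me at [PHONE] or write to [EMAIL]." A text-to-speech step, omitted here, would then render the de-identified text in a voice derived from the synthetic feature vector.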


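IPC classification:

G	Physics
G06	Computing; calculating or counting
G06F	Electric digital data processing (computer systems based on specific computational models: see G06N)
G06F16/00	Information retrieval; database structures; file system structures
G06F16/30	. Unstructured textual data (document management systems: see G06F 16/93)
G06F16/35	.. Clustering; classification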