-
1.
Publication No.: US10592611B2
Publication Date: 2020-03-17
Application No.: US15332766
Filing Date: 2016-10-24
Inventors: Jesse Vig, Harish Arsikere, Margaret H. Szymanski, Luke R. Plurkowski, Kyle D. Dent, Daniel G. Bobrow, Daniel Davies, Eric Saund
Abstract: Embodiments of the present invention provide a system for automatically extracting conversational structure from a voice record based on lexical and acoustic features. The system also aggregates business-relevant statistics and entities from a collection of spoken conversations, and it may infer a coarse-level conversational structure from fine-level activities identified in the extracted acoustic features. By extracting structure from both lexical and acoustic features, the system improves significantly over previous systems: it can extract conversational structure at a larger scale and a finer level of detail, and it can feed an analytics and business intelligence platform, e.g., for customer service phone calls. During operation, the system obtains a voice record, extracts a lexical feature using automatic speech recognition (ASR), and extracts an acoustic feature. It then determines, via machine learning and based on the extracted lexical and acoustic features, a coarse-level structure of the conversation.
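The abstract describes a pipeline: obtain a voice record, extract a lexical feature with ASR, extract an acoustic feature, and then classify the coarse-level structure with machine learning. The sketch below is a minimal illustration of that kind of pipeline, not the patent's actual implementation; the Turn fields, the phase labels, and the use of TF-IDF plus scikit-learn logistic regression are assumptions made for the example.

```python
# Minimal sketch of a lexical + acoustic conversational-structure classifier.
# The Turn fields, phase labels, and model choice are illustrative assumptions,
# not details taken from the patent.
from dataclasses import dataclass
from typing import List

import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression


@dataclass
class Turn:
    """One speaker turn in a voice record."""
    text: str          # lexical evidence: ASR transcript of the turn
    duration_s: float  # acoustic evidence: length of the turn in seconds
    pause_s: float     # acoustic evidence: silence preceding the turn
    energy: float      # acoustic evidence: normalized mean signal energy


# Hypothetical coarse-level phases of a customer service call.
PHASES = ["opening", "problem_description", "resolution", "closing"]


def acoustic_matrix(turns: List[Turn]) -> np.ndarray:
    return np.array([[t.duration_s, t.pause_s, t.energy] for t in turns])


def combine(lexical, acoustic) -> np.ndarray:
    # Concatenate lexical (TF-IDF) and acoustic features for each turn.
    return np.hstack([lexical.toarray(), acoustic])


def train(turns: List[Turn], labels: List[str]):
    vectorizer = TfidfVectorizer()
    X = combine(vectorizer.fit_transform(t.text for t in turns),
                acoustic_matrix(turns))
    classifier = LogisticRegression(max_iter=1000).fit(X, labels)
    return vectorizer, classifier


def predict_phases(turns: List[Turn], vectorizer, classifier) -> List[str]:
    X = combine(vectorizer.transform(t.text for t in turns),
                acoustic_matrix(turns))
    return list(classifier.predict(X))


if __name__ == "__main__":
    # Toy turns standing in for ASR transcripts plus acoustic measurements.
    train_turns = [
        Turn("hello thanks for calling how can i help", 3.0, 0.1, 0.7),
        Turn("my internet keeps dropping every evening", 5.0, 0.4, 0.6),
        Turn("i have reset your modem it should be stable now", 6.0, 0.3, 0.5),
        Turn("thanks for your help goodbye", 2.0, 0.2, 0.6),
    ]
    train_labels = list(PHASES)  # one toy example per coarse-level phase
    vectorizer, classifier = train(train_turns, train_labels)

    test_turns = [Turn("hi i am calling because my connection is down", 4.0, 0.2, 0.65)]
    print(predict_phases(test_turns, vectorizer, classifier))
```

In a real system the acoustic features and the classifier would be far richer, but the sketch shows the general shape: lexical and acoustic evidence combined per turn before a coarse-level label is assigned.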
-
2.
Publication No.: US20180113854A1
Publication Date: 2018-04-26
Application No.: US15332766
Filing Date: 2016-10-24
Inventors: Jesse Vig, Harish Arsikere, Margaret H. Szymanski, Luke R. Plurkowski, Kyle D. Dent, Daniel G. Bobrow, Daniel Davies, Eric Saund
CPC Classes: G06F17/279, G06F17/277, G10L15/26, G10L25/48, H04M3/51, H04M2201/40, H04M2203/357
Abstract: Embodiments of the present invention provide a system for automatically extracting conversational structure from a voice record based on lexical and acoustic features. The system also aggregates business-relevant statistics and entities from a collection of spoken conversations, and it may infer a coarse-level conversational structure from fine-level activities identified in the extracted acoustic features. By extracting structure from both lexical and acoustic features, the system improves significantly over previous systems: it can extract conversational structure at a larger scale and a finer level of detail, and it can feed an analytics and business intelligence platform, e.g., for customer service phone calls. During operation, the system obtains a voice record, extracts a lexical feature using automatic speech recognition (ASR), and extracts an acoustic feature. It then determines, via machine learning and based on the extracted lexical and acoustic features, a coarse-level structure of the conversation.
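The abstract also notes that the system aggregates business-relevant statistics from a collection of spoken conversations to feed an analytics platform. Complementing the classification sketch under the first record, the fragment below is a hedged illustration of such aggregation; the per-call phase labels and the chosen statistics (turn counts and total time per phase) are assumptions, not details from the patent.

```python
# Hypothetical aggregation of coarse-level phase statistics across many calls.
# The phase labels and the statistics chosen here are illustrative assumptions.
from collections import defaultdict
from typing import Dict, List, Tuple

# Each call is a list of (phase_label, turn_duration_seconds) pairs, as might
# be produced by a structure classifier like the sketch under the first record.
Call = List[Tuple[str, float]]


def aggregate_phase_stats(calls: List[Call]) -> Dict[str, Dict[str, float]]:
    """Total turn count and talk time per coarse-level phase over all calls."""
    totals = defaultdict(lambda: {"turns": 0, "seconds": 0.0})
    for call in calls:
        for phase, seconds in call:
            totals[phase]["turns"] += 1
            totals[phase]["seconds"] += seconds
    return dict(totals)


if __name__ == "__main__":
    calls = [
        [("opening", 3.0), ("problem_description", 12.5), ("resolution", 20.0), ("closing", 2.5)],
        [("opening", 2.0), ("problem_description", 30.0), ("closing", 3.0)],
    ]
    for phase, stats in aggregate_phase_stats(calls).items():
        print(f"{phase}: {stats['turns']} turns, {stats['seconds']:.1f} s total")
```

Statistics of this kind (time spent per phase, calls missing a resolution phase, and so on) are the sort of business-relevant aggregates the abstract says could feed a business intelligence platform.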
-