Invention Application
- Patent Title: METHOD FOR SEGMENTING COMMUNICATION TRANSCRIPTS USING UNSUPERVISED AND SEMI-SUPERVISED TECHNIQUES
- Patent Title (中): 使用不间断和半监督技术分隔通信转录的方法
- Application No.: US11931806
- Application Date: 2007-10-31
- Publication No.: US20090112588A1
- Publication Date: 2009-04-30
- Inventor: Krishna Kummamuru, Deepak S. Padmanabhan, Shourya Roy, L. Venkata Subramaniam
- Applicant: Krishna Kummamuru, Deepak S. Padmanabhan, Shourya Roy, L. Venkata Subramaniam
- Applicant Address: US NY Armonk
- Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
- Current Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
- Current Assignee Address: US NY Armonk
- Main IPC: G10L15/06
- IPC: G10L15/06

Abstract:
A method is provided for forming discrete segment clusters of one or more sequential sentences from a corpus of communication transcripts of transactional communications. The method comprises: dividing the communication transcripts of the corpus into a first set of sentences spoken by a caller and a second set of sentences spoken by a responder; generating a specified number of sentence clusters by grouping the first and second sets of sentences according to a measure of lexical similarity using an unsupervised partitional clustering method; generating a collection of sequences of sentence types by assigning a distinct sentence type to each sentence cluster and representing each sentence of each communication transcript of the corpus with the sentence type assigned to the sentence cluster into which the sentence is grouped; and generating a specified number of discrete segment clusters by successively merging sentence clusters according to a proximity-based measure between the sentence types assigned to the sentence clusters within sequences of the collection.
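
The four steps named in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. Several choices are assumptions made only for illustration: the transcript format (lists of `(speaker, sentence)` pairs), Jaccard word overlap as the lexical similarity, a small k-medoids routine as the partitional clustering method, clustering each speaker's sentences separately, and a raw adjacency count as the proximity-based measure. All function names and the sample data are hypothetical.

```python
from collections import Counter

def jaccard(a, b):
    """Lexical similarity between two sets of words (assumed measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def k_medoids(sentences, k, iters=10):
    """Tiny partitional clustering; returns one cluster index per sentence."""
    bags = [set(s.lower().split()) for s in sentences]
    medoids = list(range(min(k, len(bags))))  # deterministic seed: first k sentences
    labels = []
    for _ in range(iters):
        # Assign every sentence to its most similar medoid.
        labels = [max(range(len(medoids)),
                      key=lambda c: jaccard(bag, bags[medoids[c]]))
                  for bag in bags]
        # Re-pick each medoid as the member most similar to its cluster.
        new = []
        for c, old in enumerate(medoids):
            members = [i for i, lab in enumerate(labels) if lab == c]
            new.append(max(members,
                           key=lambda i: sum(jaccard(bags[i], bags[j]) for j in members))
                       if members else old)
        if new == medoids:
            break
        medoids = new
    return labels

def to_type_sequences(transcripts, k):
    """Steps 1-3: split by speaker, cluster, and relabel each sentence by type."""
    types = {}
    for speaker in ("caller", "responder"):
        sents = sorted({s for t in transcripts for who, s in t if who == speaker})
        for sent, lab in zip(sents, k_medoids(sents, k)):
            types[(speaker, sent)] = f"{speaker[0]}{lab}"  # e.g. "c0", "r1"
    return [[types[turn] for turn in t] for t in transcripts]

def merge_types(sequences, n_segments):
    """Step 4: repeatedly merge the two clusters whose types are most often adjacent."""
    group = {t: t for seq in sequences for t in seq}
    while len(set(group.values())) > n_segments:
        adj = Counter()
        for seq in sequences:
            ids = [group[t] for t in seq]
            adj.update(frozenset(p) for p in zip(ids, ids[1:]) if p[0] != p[1])
        if not adj:  # nothing left that is adjacent to anything else
            break
        keep, drop = sorted(max(sorted(adj, key=sorted), key=adj.get))
        group = {t: keep if g == drop else g for t, g in group.items()}
    return [[group[t] for t in seq] for seq in sequences]

# Hypothetical two-call corpus, used only to exercise the sketch.
transcripts = [
    [("responder", "thank you for calling how may i help you"),
     ("caller", "i want to book a car"),
     ("responder", "which city do you want the car in"),
     ("caller", "i want the car in boston"),
     ("responder", "thank you for calling goodbye")],
    [("responder", "thank you for calling how may i help"),
     ("caller", "i want to book a car for monday"),
     ("responder", "which city do you want it in"),
     ("caller", "in seattle"),
     ("responder", "thank you goodbye")],
]
type_sequences = to_type_sequences(transcripts, k=2)
segment_sequences = merge_types(type_sequences, n_segments=2)
```

In this sketch, runs of consecutive sentences that share a merged label in `segment_sequences` play the role of the discrete segments. A production system would likely replace the adjacency count with a normalized proximity score and choose the cluster counts from the data.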