Publication No.: US11630958B2
Publication Date: 2023-04-18
Application No.: US17336881
Filing Date: 2021-06-02
Applicant: Microsoft Technology Licensing, LLC
Inventor: Royi Ronen, Yarin Kuper, Tomer Rosenthal, Abedelkader Asi, Erez Altus, Rona Shaanan
IPC: G06F40/30, G06F40/166, G06F40/117, G06F40/284, G06N20/00, G10L15/26, G10L15/22, G06F16/34, G06F40/279
Abstract: The disclosure herein describes determining topics of communication transcripts using trained summarization models. A first communication transcript associated with a first communication is obtained and divided into a first set of communication segments. A first set of topic descriptions is generated based on the first set of communication segments by analyzing each communication segment of the first set of communication segments with a generative language model. A summarization model is trained using the first set of communication segments and the associated first set of topic descriptions as training data. The trained summarization model is then applied to a second communication transcript and, based on applying the trained summarization model to the second communication transcript, a second set of topic descriptions of the second communication transcript is generated. Training the summarization model on the output of the generative language model enables efficient, accurate generation of topic descriptions from communication transcripts.
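
The data flow in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: every name below is hypothetical, the segmentation rule (two lines per segment) is an assumption, `toy_generative_model` is a word-frequency stand-in for a real generative language model, and `ToySummarizationModel` is a nearest-neighbor stand-in for a real trainable summarization model. Only the sequence of steps (segment, label with the generative model, train, apply to a second transcript) follows the abstract.

```python
from collections import Counter
from typing import List, Tuple


def split_into_segments(transcript: str, lines_per_segment: int = 2) -> List[str]:
    """Divide a transcript into groups of lines (assumed segmentation rule)."""
    lines = [ln.strip() for ln in transcript.splitlines() if ln.strip()]
    return [
        " ".join(lines[i:i + lines_per_segment])
        for i in range(0, len(lines), lines_per_segment)
    ]


def bag_of_words(text: str) -> Counter:
    """Lowercased word counts with surrounding punctuation stripped."""
    return Counter(w.strip(".,?!:").lower() for w in text.split())


def toy_generative_model(segment: str) -> str:
    """Hypothetical stand-in for the generative language model:
    describe a segment by its most frequent word longer than 4 letters."""
    counts = Counter({w: c for w, c in bag_of_words(segment).items() if len(w) > 4})
    return counts.most_common(1)[0][0] if counts else "misc"


class ToySummarizationModel:
    """Hypothetical stand-in for the summarization model: it memorizes
    (segment, topic) pairs and predicts the topic of the training
    segment that shares the most words with the query."""

    def __init__(self) -> None:
        self.examples: List[Tuple[Counter, str]] = []

    def train(self, segments: List[str], topics: List[str]) -> None:
        self.examples = [(bag_of_words(s), t) for s, t in zip(segments, topics)]

    def predict(self, segment: str) -> str:
        if not self.examples:
            raise ValueError("model has not been trained")
        query = bag_of_words(segment)
        # Counter & Counter keeps the minimum count of each shared word.
        best = max(self.examples, key=lambda ex: sum((query & ex[0]).values()))
        return best[1]


first_transcript = """Alice: The budget review is due Friday.
Bob: I will send the budget numbers.
Alice: Next, the hiring plan.
Bob: Hiring two engineers is approved."""

second_transcript = """Carol: Did finance sign off on the budget?
Dan: Yes, the budget is final.
Carol: How is hiring going?
Dan: Hiring interviews start Monday."""

# Step 1: segment the first transcript and label each segment with the
# (stand-in) generative model to build training data.
train_segments = split_into_segments(first_transcript)
train_topics = [toy_generative_model(s) for s in train_segments]

# Step 2: train the (stand-in) summarization model on those pairs.
model = ToySummarizationModel()
model.train(train_segments, train_topics)

# Step 3: apply the trained model to a second transcript.
second_topics = [model.predict(s) for s in split_into_segments(second_transcript)]
print(train_topics, second_topics)
```

In this sketch the generative model is called only while building training data; the second transcript is handled entirely by the trained model, which mirrors the efficiency argument in the abstract's final sentence.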