Publication No.: US20230120940A1
Publication Date: 2023-04-20
Application No.: US17589693
Filing Date: 2022-01-31
Applicant: salesforce.com, inc.
Inventor: Liang Qiu, Chien-Sheng Wu, Wenhao Liu, Caiming Xiong
IPC: G10L15/06, G10L15/183, G10L15/05
Abstract: Embodiments described herein propose an approach for unsupervised structure extraction in task-oriented dialogues. Specifically, a Slot Boundary Detection (SBD) module is adopted, for which utterances from training domains are tagged with the conventional BIO schema but without the slot names. A transformer-based classifier is trained to detect the boundaries of potential slot tokens in the test domain. Because the number of dialogue states is usually unknown, it is more reasonable to assume that the number of slots is given when analyzing a dialogue system. The detected tokens are therefore clustered into as many groups as there are slots. The dialogue state is represented by a vector recording the number of times each slot has been modified. The slot values are tracked through each dialogue session in the corpus, and utterances are labeled with their dialogue states accordingly. The semantic structure is portrayed by computing the transition frequencies among the unique states.
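The last two steps of the abstract (state vectors and transition frequencies) can be illustrated with a minimal sketch. It is not the claimed implementation. It assumes slot boundary detection and clustering have already run, so each utterance arrives as a list of hypothetical `(slot_index, value)` pairs, and it assumes that a slot counts as "modified" whenever its tracked value is set for the first time or changes. The function names `label_states` and `transition_frequencies` are illustrative only.

```python
from collections import Counter


def label_states(dialogue, num_slots):
    """Label each utterance with a state vector of per-slot modification counts."""
    counts = [0] * num_slots   # how many times each slot has been modified so far
    current = {}               # slot index -> most recently tracked value
    states = []
    for turn in dialogue:      # turn: list of (slot_index, value) pairs
        for slot, value in turn:
            # Assumption: a new or changed value counts as one modification.
            if current.get(slot) != value:
                current[slot] = value
                counts[slot] += 1
        states.append(tuple(counts))
    return states


def transition_frequencies(dialogues, num_slots):
    """Count transitions between consecutive states over all dialogue sessions."""
    transitions = Counter()
    for dialogue in dialogues:
        states = label_states(dialogue, num_slots)
        for src, dst in zip(states, states[1:]):
            transitions[(src, dst)] += 1
    return transitions


# Two toy sessions with two slot clusters (0 and 1); values are made up.
corpus = [
    [[], [(0, "italian")], [(0, "italian"), (1, "north")], [(0, "chinese")]],
    [[], [(0, "thai")], [(1, "south")]],
]
for (src, dst), freq in sorted(transition_frequencies(corpus, 2).items()):
    print(src, "->", dst, freq)
```

In this sketch the unique state tuples act as graph nodes and the counts as edge weights. Dividing each count by the total number of transitions leaving its source state would give transition probabilities instead of raw frequencies.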