Patent search ap:("CENTRAL CHINA NORMAL UNIVERSITY") AND inv:"Baolin YI" Page 1

1.

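Invention application
MULTI-SPEAKER OVERLAPPING VOICE DETECTION METHOD AND SYSTEM THEREOF (status: in force)

Publication (announcement) number: US20250046336A1

Publication (announcement) date: 2025-02-06

Application number: US18788092

Application date: 2024-07-29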

Applicant: CENTRAL CHINA NORMAL UNIVERSITY

Inventor: Zengzhao CHEN, Zhifeng Wang, Baolin YI, Jiangbo Shu, Shengming WANG, Xinxing Jiang

IPC: G10L25/78, G10L21/0272, G10L25/30

Abstract: Disclosed are a multi-speaker overlapping voice detection method and system. The method includes: obtaining a voice to be detected and removing silence from it; extracting a feature of the voice after silence removal to obtain its voice feature; and inputting the voice feature into an overlapping voice detection model, which outputs the number of overlapping speakers in the voice to be detected. The overlapping voice detection model is obtained by supervised training on the voice features of sample voices and their corresponding overlapping-speaker-number labels; it extracts a speaker embedding from the voice feature and classifies the number of overlapping speakers based on that embedding.
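
The three steps named in the abstract (silence removal, feature extraction, classification) can be sketched roughly as below. This is only an illustration, not the patented method: the energy-threshold silence removal, the per-frame energy feature, the parameter values, and all function names are assumptions, and the trained overlapping voice detection model is represented by an arbitrary callable supplied by the caller.

```python
from typing import Callable, List, Sequence


def remove_silence(samples: Sequence[float], frame_len: int = 4,
                   threshold: float = 0.01) -> List[float]:
    """Keep only frames whose mean energy reaches `threshold`.

    Assumption: a simple energy threshold stands in for whatever
    silence-removal step the patent actually uses.
    """
    kept: List[float] = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / len(frame)
        if energy >= threshold:
            kept.extend(frame)
    return kept


def frame_energies(samples: Sequence[float], frame_len: int = 4) -> List[float]:
    """Toy voice feature (assumption): mean energy of each frame."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / len(samples[i:i + frame_len])
        for i in range(0, len(samples), frame_len)
    ]


def detect_overlap_count(samples: Sequence[float],
                         model: Callable[[List[float]], int],
                         frame_len: int = 4,
                         threshold: float = 0.01) -> int:
    """Run the pipeline: silence removal -> feature extraction -> model.

    `model` is a hypothetical placeholder for the trained classifier that
    maps a voice feature to an overlapping speaker number.
    """
    voiced = remove_silence(samples, frame_len, threshold)
    features = frame_energies(voiced, frame_len)
    return model(features)
```

In use, a caller would pass the raw samples and a trained classifier, for example `detect_overlap_count(samples, trained_model)`, where `trained_model` is whatever supervised model has been fitted on labeled sample voices.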

2.

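Invention publication
CONSTRUCTION METHOD AND SYSTEM OF DESCRIPTIVE MODEL OF CLASSROOM TEACHING BEHAVIOR EVENTS (status: under examination, published)

Publication (announcement) number: US20230334862A1

Publication (announcement) date: 2023-10-19

Application number: US18011847

Application date: 2021-09-07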

Applicant: CENTRAL CHINA NORMAL UNIVERSITY

Inventor: Sannyuya LIU, Zengzhao CHEN, Zhicheng DAI, Shengming WANG, Xiuling HE, Baolin YI

IPC: G06V20/40, G10L25/78, G10L25/57, G06V10/774, G06V40/20, G06Q50/20

CPC classification number: G06V20/44, G10L25/78, G10L25/57, G06V20/49, G06V20/41, G06V10/774, G06V40/20, G06Q50/205

Abstract: The present invention discloses a construction method and system for a descriptive model of classroom teaching behavior events. The construction method includes the following steps: acquiring classroom teaching video data to be trained; dividing the classroom teaching video data into multiple events according to the teacher's utterances by using voice activity detection; performing multi-modal recognition on all events by using multiple artificial intelligence technologies to divide the events into sub-events in multiple dimensions; and establishing an event descriptive model from the sub-events to describe the teacher's various teaching behavior events in the classroom. Dividing the classroom video according to voice preserves, to the greatest extent, the completeness of the teacher's non-verbal behavior within each event. In addition, a descriptive model that uniformly describes all events is established by extracting the commonality between different events; it not only describes the teacher's various teaching behaviors but also reflects the correlation between events, so that the events are no longer isolated.
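
The event-division step (splitting a recording into events by the teacher's utterances) might look roughly like the sketch below. It assumes per-frame speech/non-speech flags are already available from some voice activity detector; the function name, the frame duration, and the rule that short pauses are merged into one event are illustrative assumptions, not details taken from the patent.

```python
from typing import List, Optional, Sequence, Tuple


def split_into_events(speech_flags: Sequence[bool],
                      frame_sec: float = 1.0,
                      max_gap: int = 1) -> List[Tuple[float, float]]:
    """Group per-frame speech flags into (start_sec, end_sec) events.

    Speech frames separated by at most `max_gap` non-speech frames are
    merged into one event (an assumed rule for this sketch).
    """
    events: List[Tuple[float, float]] = []
    start: Optional[int] = None
    last_speech = -1
    for i, is_speech in enumerate(speech_flags):
        if not is_speech:
            continue
        if start is None:
            start = i
        elif i - last_speech - 1 > max_gap:
            # The pause was too long: close the current event, open a new one.
            events.append((start * frame_sec, (last_speech + 1) * frame_sec))
            start = i
        last_speech = i
    if start is not None:
        events.append((start * frame_sec, (last_speech + 1) * frame_sec))
    return events
```

Each returned interval could then be passed to the multi-modal recognition stage as one event; how sub-events are derived from it is outside the scope of this sketch.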
