-
Publication number: US20210074315A1
Publication date: 2021-03-11
Application number: US17101048
Filing date: 2020-11-23
Applicant: AT&T Intellectual Property I, L.P.
Inventor: Dimitrios DIMITRIADIS, Eric ZAVESKY, Matthew BURLICK
Abstract: Voice activity in a media signal is detected in an augmented, multi-tier classifier architecture. For instance, a first voice activity indicator, detected in a first modality for a human subject, is received from a first classifier. Then, the system can receive, from a second classifier, a second voice activity indicator detected in a second modality for the human subject, wherein the first voice activity indicator and the second voice activity indicator are based on the human subject at a same time, and wherein the first modality and the second modality are different. The system then concatenates, via a third classifier, the first voice activity indicator and the second voice activity indicator with original features of the human subject, to yield a classifier output, and determines voice activity based on the classifier output.
-
Publication number: US20180182415A1
Publication date: 2018-06-28
Application number: US15894245
Filing date: 2018-02-12
Applicant: AT&T Intellectual Property I, L.P.
Inventor: Dimitrios DIMITRIADIS, Eric ZAVESKY, Matthew BURLICK
CPC classification number: G10L25/78, G06K9/00335, G10L15/24, G10L25/84
Abstract: Disclosed herein are systems, methods, and computer-readable storage media for detecting voice activity in a media signal in an augmented, multi-tier classifier architecture. A system configured to practice the method can receive, from a first classifier, a first voice activity indicator detected in a first modality for a human subject. Then, the system can receive, from a second classifier, a second voice activity indicator detected in a second modality for the human subject, wherein the first voice activity indicator and the second voice activity indicator are based on the human subject at a same time, and wherein the first modality and the second modality are different. The system can concatenate, via a third classifier, the first voice activity indicator and the second voice activity indicator with original features of the human subject, to yield a classifier output, and determine voice activity based on the classifier output.
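The multi-tier flow described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the claimed implementation: the abstract does not name the modalities or the classifier types, so the audio and visual modalities, the energy and motion heuristics, the logistic third tier, and every function name, weight, and threshold below are hypothetical.

```python
import math
from typing import List, Sequence


def _sigmoid(z: float) -> float:
    """Squash a real-valued score into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def audio_classifier(audio_features: Sequence[float]) -> float:
    """Hypothetical first classifier: scores mean energy of audio features."""
    energy = sum(x * x for x in audio_features) / len(audio_features)
    return _sigmoid(energy - 0.5)


def visual_classifier(visual_features: Sequence[float]) -> float:
    """Hypothetical second classifier: scores mean magnitude of visual motion."""
    motion = sum(abs(x) for x in visual_features) / len(visual_features)
    return _sigmoid(motion - 0.5)


def build_augmented_vector(
    audio_features: Sequence[float], visual_features: Sequence[float]
) -> List[float]:
    """Concatenate both indicators with the original features.

    Both feature sets are assumed to describe the same subject at the same time.
    """
    first_indicator = audio_classifier(audio_features)
    second_indicator = visual_classifier(visual_features)
    return [first_indicator, second_indicator, *audio_features, *visual_features]


def third_classifier(
    augmented: Sequence[float], weights: Sequence[float], bias: float
) -> float:
    """Hypothetical third classifier: logistic model over the augmented vector."""
    if len(augmented) != len(weights):
        raise ValueError("weights must match the augmented vector length")
    return _sigmoid(bias + sum(w * x for w, x in zip(weights, augmented)))


def detect_voice_activity(
    audio_features: Sequence[float],
    visual_features: Sequence[float],
    weights: Sequence[float],
    bias: float,
    threshold: float = 0.5,  # assumed decision threshold
) -> bool:
    """Return True when the third classifier's output reaches the threshold."""
    augmented = build_augmented_vector(audio_features, visual_features)
    return third_classifier(augmented, weights, bias) >= threshold
```

In this sketch the third tier sees the raw features alongside the two first-tier indicators, so in principle it could learn when to trust each modality rather than only combining their scores. In practice the weights and bias would come from training, which is outside the scope of the abstract.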
-