METHOD, APPARATUS, DEVICE, AND STORAGE MEDIUM FOR SPEAKER CHANGE POINT DETECTION

    Publication No.: US20240331706A1

    Publication Date: 2024-10-03

    Application No.: US18741427

    Application Date: 2024-06-12

    IPC Classification: G10L17/04

    CPC Classification: G10L17/04

    Abstract: A method, apparatus, device, and storage medium for speaker change point detection, the method including: acquiring target voice data to be detected; extracting, from the target voice data, an acoustic feature characterizing acoustic information of the target voice data; encoding the acoustic feature to obtain speaker characterization vectors of the target voice data; integrating and firing the speaker characterization vectors of the target voice data based on a continuous integrate-and-fire (CIF) mechanism, to obtain a sequence of speaker characterizations bounded by speaker change points in the target voice data; and determining the speaker change points according to the sequence of speaker characterizations. This method can effectively improve the accuracy of speaker change point detection in interaction-type target voice data.
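
The integrate-and-fire step in the abstract above can be illustrated with a minimal sketch of a generic CIF loop. This is not taken from the patent: the function name `cif`, the per-frame `weights`, and the firing `threshold` of 1.0 are all assumptions made for illustration, and the abstract does not say how the weights are computed.

```python
def cif(frames, weights, threshold=1.0):
    """Generic continuous integrate-and-fire over per-frame vectors.

    frames:  list of equal-length vectors, one per frame (assumed encoder output)
    weights: list of non-negative floats, one per frame (assumed to be given)
    Returns (fired_vectors, fire_frame_indices).
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    dim = len(frames[0])
    acc_w = 0.0              # weight accumulated since the last firing
    acc_v = [0.0] * dim      # weighted sum of frames since the last firing
    fired, boundaries = [], []
    for t, (h, a) in enumerate(zip(frames, weights)):
        a = float(a)
        # A frame with a large weight may cross the threshold more than once.
        while acc_w + a >= threshold:
            used = threshold - acc_w   # part of this frame's weight that completes the unit
            fired.append([v + used * x for v, x in zip(acc_v, h)])
            boundaries.append(t)
            a -= used                  # the remainder carries over to the next unit
            acc_w = 0.0
            acc_v = [0.0] * dim
        acc_w += a
        acc_v = [v + a * x for v, x in zip(acc_v, h)]
    return fired, boundaries


frames = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
fired, boundaries = cif(frames, [0.5, 0.5, 0.25, 0.75])
print(boundaries)  # frame indices at which a unit was fired
```

If the weights are taken to measure how strongly the speaker representation changes at each frame, the indices in `boundaries` can be read as candidate change points. That interpretation is an assumption of this sketch, not a statement of the patented method.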

    General speech enhancement method and apparatus using multi-source auxiliary information

    Publication No.: US12094484B2

    Publication Date: 2024-09-17

    Application No.: US18360838

    Application Date: 2023-07-28

    Applicant: ZHEJIANG LAB

    Abstract: The present disclosure discloses a general speech enhancement method and apparatus using multi-source auxiliary information. The method includes the following steps: S1: building a training data set; S2: using the training data set to learn network parameters of a model, and building a speech enhancement model; S3: building a sound source information database by pre-collection or on-site collection; S4: acquiring an input of the speech enhancement model; and S5: taking a noisy original signal as the main input of the speech enhancement model, taking auxiliary sound signals of a target source group and auxiliary sound signals of an interference source group as side inputs of the speech enhancement model for speech enhancement, and obtaining an enhanced speech signal.

    VIRTUAL AGENT TRANSPARENT USER AUTHENTICATION

    Publication No.: US20240232308A1

    Publication Date: 2024-07-11

    Application No.: US18152671

    Application Date: 2023-01-10

    Abstract: Certain aspects of the present disclosure provide techniques for receiving audio data comprising a user voice command; determining a task to be completed by a remote service based on the user voice command; determining that a reference voice print associated with the user is stored in a user account; authenticating the user by determining that a sample voice print based on the user voice command matches the reference voice print associated with the user; storing authentication evidence associated with the task; and providing proof of user authentication to the remote service in order to initiate the task with the remote service.
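
The voice print matching step in the abstract above can be sketched as a similarity test between two embedding vectors. The abstract does not say how a match is decided, so the use of cosine similarity, the function names, and the `threshold` value of 0.8 are hypothetical choices for illustration only.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def voice_print_matches(sample, reference, threshold=0.8):
    """Assumed decision rule: accept when similarity reaches the threshold."""
    return cosine_similarity(sample, reference) >= threshold


reference = [0.2, 0.9, 0.1]   # stored in the user account (made-up values)
sample = [0.25, 0.85, 0.05]   # derived from the voice command (made-up values)
print(voice_print_matches(sample, reference))
```

In practice the threshold would be tuned on labeled data to trade off false accepts against false rejects; the value here is a placeholder.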

    System and method for detecting fraudsters

    Publication No.: US12020711B2

    Publication Date: 2024-06-25

    Application No.: US17166525

    Application Date: 2021-02-03

    Applicant: Nice Ltd.

    Abstract: A system and method may classify a plurality of interactions, by: obtaining a plurality of voiceprints of the plurality of interactions, wherein each voiceprint of the plurality of voiceprints represents a speaker participating in an interaction of the plurality of interactions; calculating, for each interaction, a plurality of scores, wherein each score of the plurality of scores is indicative of a similarity between the voiceprint of the interaction and one voiceprint of a set of benchmark voiceprints; calculating, for each interaction, statistics of the scores; and determining that a plurality of interactions pertain to a single cluster of interactions based on statistics of the scores of the interactions in the cluster.
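
The score-and-statistics pipeline in the abstract above can be sketched as follows. The abstract names neither the similarity measure, the statistics, nor the clustering rule, so cosine similarity, mean and population standard deviation, and the tolerance-based `same_cluster` test are all assumptions of this sketch.

```python
import math
import statistics


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def score_statistics(voiceprint, benchmarks):
    """Score one interaction's voiceprint against every benchmark, then summarize."""
    scores = [cosine_similarity(voiceprint, b) for b in benchmarks]
    return statistics.mean(scores), statistics.pstdev(scores)


def same_cluster(stats_a, stats_b, tol=0.05):
    """Hypothetical rule: two interactions cluster together when every statistic agrees within tol."""
    return all(abs(x - y) <= tol for x, y in zip(stats_a, stats_b))


benchmarks = [[1.0, 0.0], [0.0, 1.0]]                 # made-up benchmark voiceprints
call_a = score_statistics([1.0, 0.0], benchmarks)
call_b = score_statistics([2.0, 0.0], benchmarks)     # same direction as call_a
call_c = score_statistics([1.0, 1.0], benchmarks)
print(same_cluster(call_a, call_b), same_cluster(call_a, call_c))
```

Comparing interactions through their score statistics rather than directly means two calls can be grouped without ever comparing their voiceprints to each other, which is the property the abstract emphasizes.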

    Mobile Terminal And Hub Apparatus For Use In A Video Communication System

    Publication No.: US20240163397A1

    Publication Date: 2024-05-16

    Application No.: US18424706

    Application Date: 2024-01-26

    Inventor: Mario Ferrari

    Abstract: A hub apparatus (20) is intended for use in a video communication system comprising the hub apparatus (20) and a plurality of mobile terminals (10a-10d) configured to be wirelessly connectable to the hub apparatus (20). The hub apparatus (20) comprises: a receiving unit (24) configured to receive, from each mobile terminal (10) of the plurality of mobile terminals (10a-10d), a video stream, a current speaker indicator indicating whether the user of the mobile terminal is speaking, and association information which associates the current speaker indicator transmitted by the mobile terminal with the video stream transmitted from that mobile terminal (10); and a generation unit (40) operatively connected to said receiving unit (24) and configured to generate an output video communication stream (6) based on the plurality of video streams, the plurality of current speaker indicators, and the plurality of association information received from each mobile terminal (10) of the plurality of mobile terminals (10a-10d).
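
The role of the association information in the abstract above can be sketched as a small selection routine on the hub side. Everything here is hypothetical: the `TerminalUpdate` fields, the rule of showing only streams whose associated indicator reports speech, and the fallback to all streams when nobody is speaking are assumptions, since the abstract does not say how the output stream is composed.

```python
from dataclasses import dataclass


@dataclass
class TerminalUpdate:
    terminal_id: str           # e.g. "10a"
    stream_id: str             # identifies the video stream sent by this terminal
    is_speaking: bool          # current speaker indicator
    indicator_stream_id: str   # association info: the stream the indicator refers to


def select_output_streams(updates):
    """Return the stream ids to include in the output video communication stream."""
    known_streams = {u.stream_id for u in updates}
    # Trust an indicator only if its association points at a stream the hub actually received.
    active = [
        u.indicator_stream_id
        for u in updates
        if u.is_speaking and u.indicator_stream_id in known_streams
    ]
    # Assumed fallback: when nobody is speaking, show every received stream.
    return active if active else [u.stream_id for u in updates]


updates = [
    TerminalUpdate("10a", "video-a", False, "video-a"),
    TerminalUpdate("10b", "video-b", True, "video-b"),
    TerminalUpdate("10c", "video-c", False, "video-c"),
]
print(select_output_streams(updates))
```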