-
Publication number: US11646018B2
Publication date: 2023-05-09
Application number: US16829705
Application date: 2020-03-25
Applicant: PINDROP SECURITY, INC.
Inventor: Vinay Maddali , David Looney , Kailash Patil
IPC: G10L15/197 , G10L15/18 , G10L15/02 , G10L15/04 , G10L15/06 , G10L15/22 , G10L25/84 , G10L25/21 , H04M3/51
CPC classification number: G10L15/197 , G10L15/02 , G10L15/04 , G10L15/063 , G10L15/1822 , G10L15/22 , G10L25/21 , G10L25/84 , H04M3/5183 , H04M2203/558
Abstract: Embodiments described herein provide for automatically classifying the types of devices that place calls to a call center. A call center system can detect whether an incoming call originated from a voice assistant device using trained classification models received from a call analysis service. Embodiments described herein provide for methods and systems in which a computer executes machine learning algorithms that programmatically train (or otherwise generate) global or tailored classification models based on the various types of features of an audio signal and call data. A classification model is deployed to one or more call centers, where the model is used by call center computers executing classification processes for determining whether incoming telephone calls originated from a voice assistant device, such as Amazon Alexa® and Google Home®, or another type of device (e.g., cellular/mobile phone, landline phone, VoIP).
-
Publication number: US20250095662A1
Publication date: 2025-03-20
Application number: US18883681
Application date: 2024-09-12
Applicant: Pindrop Security, Inc.
Inventor: David Looney , Nikolay Gaubitch
IPC: G10L19/018 , G10L25/30
Abstract: Embodiments disclosed herein include software processes executed by a computer for encoding and decoding watermarks for a speech signal in a call signal communicated via telephony channels. An encoder uses Linear Predictive Coding (LPC) to analyze the call signal's spectral envelope and embeds the watermark into the LPC log-spectrum of the speech signal of the call signal. The encoder may reduce the watermark's strength at a formant peak of the speech signal, balancing the watermark's robustness and detectability. A deep decoder includes a neural network architecture trained on watermarked and watermark-free speech signals having various types of degradation to extract a feature vector of a call signal and compute a watermark detection score for one or more frames or for the call signal. At inference time, the deep decoder detects the watermark when the watermark detection score satisfies a detection threshold.
-
Publication number: US20230015189A1
Publication date: 2023-01-19
Application number: US17953156
Application date: 2022-09-26
Applicant: Pindrop Security, Inc.
Inventor: David Looney , Nikolay D. Gaubitch
Abstract: A computer may train a single-class machine learning model using normal speech recordings. The machine learning model or any other model may estimate the normal range of parameters of a physical speech production model based on the normal speech recordings. For example, the computer may use a source-filter model of speech production, where voiced speech is represented by a pulse train and unvoiced speech by random noise, and a combination of the pulse train and the random noise is passed through an auto-regressive filter that emulates the human vocal tract. The computer leverages the fact that intentional modification of human voice introduces errors to the source-filter model or any other physical model of speech production. The computer may identify anomalies in the physical model to generate a voice modification score for an audio signal. The voice modification score may indicate a degree of abnormality of human voice in the audio signal.
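The source-filter terms in this abstract can be illustrated with a minimal sketch. It is not the patented method: it assumes a pulse-train excitation, a hypothetical second-order auto-regressive filter, and the textbook autocorrelation method with the Levinson-Durbin recursion to recover the filter. All function names and the coefficients 1.3 and -0.6 are illustrative assumptions. The normalized prediction error it returns is only one conceivable input to an anomaly score; the abstract does not say how the voice modification score is computed.

```python
def synthesize_voiced(ar_coeffs, pitch_period, n_samples):
    """Pulse train passed through an all-pole (auto-regressive) filter."""
    out = []
    for n in range(n_samples):
        # Excitation: one unit pulse per pitch period (voiced speech).
        y = 1.0 if n % pitch_period == 0 else 0.0
        # y[n] = e[n] + sum_k a_k * y[n-k]
        for k, a in enumerate(ar_coeffs, start=1):
            if n - k >= 0:
                y += a * out[n - k]
        out.append(y)
    return out


def autocorrelation(x, max_lag):
    """Autocorrelation of x for lags 0..max_lag."""
    return [sum(x[n] * x[n - lag] for n in range(lag, len(x)))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Predictor coefficients a_1..a_p such that x[n] ~ sum_k a_k * x[n-k]."""
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err                      # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k                 # remaining prediction error energy
    return a[1:], err


def fit_source_filter(signal, order):
    """Return (AR coefficients, prediction error energy / signal energy)."""
    r = autocorrelation(signal, order)
    coeffs, err = levinson_durbin(r, order)
    return coeffs, err / r[0]


# Hypothetical stable vocal-tract filter (pole radius about 0.77).
true_coeffs = [1.3, -0.6]
voiced = synthesize_voiced(true_coeffs, pitch_period=80, n_samples=4000)
estimated, residual_ratio = fit_source_filter(voiced, order=2)
print(estimated, residual_ratio)
```

Because the synthetic signal was generated by the same kind of model that is fitted to it, the estimated coefficients land close to the ones used for synthesis. A signal that departs from the model would generally be fitted less well, which is the kind of discrepancy the abstract describes exploiting.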
-
Publication number: US11495244B2
Publication date: 2022-11-08
Application number: US16375785
Application date: 2019-04-04
Applicant: PINDROP SECURITY, INC.
Inventor: David Looney , Nikolay D. Gaubitch
Abstract: A computer may train a single-class machine learning model using normal speech recordings. The machine learning model or any other model may estimate the normal range of parameters of a physical speech production model based on the normal speech recordings. For example, the computer may use a source-filter model of speech production, where voiced speech is represented by a pulse train and unvoiced speech by random noise, and a combination of the pulse train and the random noise is passed through an auto-regressive filter that emulates the human vocal tract. The computer leverages the fact that intentional modification of human voice introduces errors to the source-filter model or any other physical model of speech production. The computer may identify anomalies in the physical model to generate a voice modification score for an audio signal. The voice modification score may indicate a degree of abnormality of human voice in the audio signal.
-
Publication number: US20240311474A1
Publication date: 2024-09-19
Application number: US18598595
Application date: 2024-03-07
Applicant: PINDROP SECURITY, INC.
Inventor: Nikolay Gaubitch , David Looney
CPC classification number: G06F21/554 , G06N20/00 , G10L25/18 , G10L25/51 , G10L25/69 , G06F2221/034
Abstract: Embodiments include a computing device that executes software routines and/or one or more machine-learning architectures including obtaining training audio signals having corresponding training impulse responses associated with reverberation degradation, training a machine-learning model of a presentation attack detection engine to generate one or more acoustic parameters by executing the presentation attack detection engine using the training impulse responses of the training audio signals and a loss function, obtaining an audio signal having an acoustic impulse response associated with reverberation degradation caused by one or more rooms, generating the one or more acoustic parameters for the audio signal by executing the machine-learning model using the audio signal as input, and generating an attack score for the audio signal based upon the one or more parameters generated by the machine-learning model.
-
Publication number: US12087319B1
Publication date: 2024-09-10
Application number: US17079082
Application date: 2020-10-23
Applicant: PINDROP SECURITY, INC.
Inventor: David Looney , Nikolay Gaubitch
Abstract: Embodiments described herein provide for end-to-end joint determination of degradation parameter scores for certain types of degradation. Degradation parameters include parameters describing additive noise and multiplicative noise, such as Signal-to-Noise Ratio (SNR), reverberation time (T60), and Direct-to-Reverberant Ratio (DRR). Various neural network architectures are described such that the inherent interplay between the degradation parameters is considered in both the degradation parameter score and degradation score determination. The neural network architectures are trained according to computer-generated audio datasets.
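For reference, the three degradation parameters named in this abstract have standard signal-processing definitions, sketched below. This is not the patented neural network, and the abstract does not say how ground-truth values are obtained for the computer-generated datasets. The sketch assumes the clean signal, the noise, and the room impulse response are each known separately, as they would be in a simulation. The function names, the -5 dB to -25 dB fitting range, and the synthetic exponential impulse response are assumptions for illustration.

```python
import math


def snr_db(clean, noise):
    """Signal-to-Noise Ratio in dB from separately known clean and noise samples."""
    p_signal = sum(x * x for x in clean) / len(clean)
    p_noise = sum(n * n for n in noise) / len(noise)
    return 10.0 * math.log10(p_signal / p_noise)


def drr_db(impulse_response, direct_samples):
    """Direct-to-Reverberant Ratio: energy of the first samples vs. the tail."""
    direct = sum(h * h for h in impulse_response[:direct_samples])
    reverberant = sum(h * h for h in impulse_response[direct_samples:])
    return 10.0 * math.log10(direct / reverberant)


def t60_seconds(impulse_response, sample_rate):
    """Reverberation time from the Schroeder energy decay curve.

    The time taken to fall from -5 dB to -25 dB (a 20 dB drop) is
    extrapolated to a 60 dB drop by multiplying by three.
    """
    n = len(impulse_response)
    edc = [0.0] * n
    total = 0.0
    for i in range(n - 1, -1, -1):         # backward integration of energy
        total += impulse_response[i] ** 2
        edc[i] = total
    edc_db = [10.0 * math.log10(e / edc[0]) if e > 0.0 else float("-inf")
              for e in edc]
    start = next(i for i, d in enumerate(edc_db) if d <= -5.0)
    stop = next(i for i, d in enumerate(edc_db) if d <= -25.0)
    return 3.0 * (stop - start) / sample_rate


# Synthetic impulse response whose energy decays by 60 dB in 0.5 s.
sample_rate = 8000
target_t60 = 0.5
decay = 3.0 * math.log(10.0) / (target_t60 * sample_rate)
room_ir = [math.exp(-decay * n) for n in range(sample_rate)]

print(t60_seconds(room_ir, sample_rate))
print(drr_db(room_ir, direct_samples=20))
print(snr_db([1.0, -1.0] * 100, [0.1, -0.1] * 100))
```

In a simulated dataset these quantities can be computed directly because the clean signal, noise, and impulse response are available; the networks described in the abstract are instead meant to score them from the degraded audio.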
-