Abstract:
In a method of diarization of audio data, audio data is segmented into a plurality of utterances. Each utterance is represented as an utterance model representative of a plurality of feature vectors. The utterance models are clustered. A plurality of speaker models are constructed from the clustered utterance models. A hidden Markov model is constructed from the plurality of speaker models. A sequence of identified speaker models is decoded.
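A minimal Python sketch of such a pipeline, under heavy assumptions not taken from the abstract: each utterance is a fixed-length block of feature frames modeled by its mean vector, clustering is a greedy distance threshold, speaker models are cluster centroids, and the HMM decoding uses unit-variance Gaussian emissions with a transition matrix that favors staying with the current speaker.

import numpy as np

def segment(features, frames_per_utterance=100):
    # Split a (num_frames, dim) feature matrix into fixed-length utterances.
    return [features[i:i + frames_per_utterance]
            for i in range(0, len(features), frames_per_utterance)]

def utterance_models(utterances):
    # Represent each utterance by the mean of its feature vectors.
    return np.stack([u.mean(axis=0) for u in utterances])

def cluster(models, threshold=3.0):
    # Greedy clustering: join an utterance to the nearest centroid if close enough.
    centroids, labels = [], []
    for m in models:
        if centroids:
            dists = [np.linalg.norm(m - c) for c in centroids]
            j = int(np.argmin(dists))
            if dists[j] < threshold:
                labels.append(j)
                centroids[j] = (centroids[j] + m) / 2.0
                continue
        labels.append(len(centroids))
        centroids.append(m.copy())
    return np.stack(centroids), labels

def viterbi_decode(models, speakers, stay_prob=0.9):
    # Decode the most likely speaker sequence over the utterance models.
    n, k = len(models), len(speakers)
    switch = (1.0 - stay_prob) / max(k - 1, 1)
    log_trans = np.log(np.full((k, k), switch) + np.eye(k) * (stay_prob - switch))
    log_emit = -0.5 * ((models[:, None, :] - speakers[None, :, :]) ** 2).sum(-1)
    score, back = log_emit[0].copy(), np.zeros((n, k), dtype=int)
    for t in range(1, n):
        cand = score[:, None] + log_trans
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0) + log_emit[t]
    path = [int(score.argmax())]
    for t in range(n - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return path[::-1]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = np.concatenate([rng.normal(0, 1, (300, 13)),     # synthetic speaker A
                            rng.normal(5, 1, (300, 13))])    # synthetic speaker B
    utts = segment(feats)
    models = utterance_models(utts)
    speakers, _ = cluster(models)
    print(viterbi_decode(models, speakers))                  # e.g. [0, 0, 0, 1, 1, 1]

In a real diarization system the utterance models would typically be Gaussian mixtures and the greedy threshold would be replaced by a principled clustering criterion; the sketch only illustrates the segment-model-cluster-decode flow.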
Abstract:
Systems and methods for analyzing digital recordings of the human voice in order to find characteristics unique to an individual. A biometrics engine may use an analytics service in a contact center to supply audio streams based on configured rules and providers for biometric detection. The analytics service may provide call audio data and attributes to connected engines based on a provider-defined set of selection rules. The connected providers send call audio data and attributes through the analytics service. The engines are notified when a new call is available for processing and can then retrieve chunks of audio data and call attributes by polling an analytics service interface. A mathematical model of the human vocal tract in the call audio data is created and/or matched against existing models. The result is analogous to a fingerprint, i.e., a pattern unique to an individual to within some level of probability.
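A rough Python sketch of the polling loop described here. The analytics-service interface (next_new_call, get_attributes, get_chunk) and the modeling and matching steps are invented placeholders, not the product's actual API.

import time

class BiometricsEngine:
    def __init__(self, service):
        self.service = service            # connection to the analytics service
        self.known_models = {}            # caller identity -> vocal-tract model

    def run(self, poll_interval=1.0):
        while True:
            call_id = self.service.next_new_call()        # new-call notification
            if call_id is None:
                time.sleep(poll_interval)                 # nothing to process yet
                continue
            attrs = self.service.get_attributes(call_id)  # e.g. ANI, agent id, queue
            # Retrieve chunks of audio until the service signals the end of the call.
            audio = b"".join(iter(lambda: self.service.get_chunk(call_id), b""))
            self.process(call_id, attrs, audio)

    def process(self, call_id, attrs, audio):
        model = self.build_vocal_tract_model(audio)       # placeholder modeling step
        match = self.match_against_known(model)           # one-to-many comparison
        print(call_id, attrs, "match:", match)

    def build_vocal_tract_model(self, audio):
        return hash(audio)                                # stand-in for a real model

    def match_against_known(self, model):
        return self.known_models.get(model)               # None means "no match"

A real engine would register with the service, honor the provider-defined selection rules, and compare vocal-tract models probabilistically rather than by exact lookup.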
Abstract:
A method and system for using conversational biometrics and speaker identification and/or verification to filter voice streams during mixed mode communication. The method includes receiving an audio stream of a communication between participants. Additionally, the method includes filtering the audio stream of the communication into separate audio streams, one for each of the participants. Each of the separate audio streams contains portions of the communication attributable to a respective participant. Furthermore, the method includes outputting the separate audio streams to a storage system.
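A simplified Python sketch of the filtering step, assuming a speaker-identification front end has already labeled each fixed-size frame of the mixed recording with a participant identity; the labels, frame size, and .npy storage are assumptions for illustration.

import numpy as np

def filter_streams(samples, labels, frame_len):
    # Route each labeled frame of the mixed stream to its participant's stream.
    streams = {}
    for i, who in enumerate(labels):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        streams.setdefault(who, []).append(frame)
    return {who: np.concatenate(frames) for who, frames in streams.items()}

def store(streams, storage_dir="."):
    # Output each participant's separate stream to the storage system (.npy files here).
    for who, audio in streams.items():
        np.save(f"{storage_dir}/participant_{who}.npy", audio)

if __name__ == "__main__":
    mixed = np.random.randn(8000).astype(np.float32)              # 1 s of fake audio
    labels = ["agent" if i % 2 else "caller" for i in range(10)]  # stand-in speaker IDs
    store(filter_streams(mixed, labels, frame_len=800))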
Abstract:
One-to-many comparisons of callers' words and/or voice prints with known words and/or voice prints to identify any substantial matches between them. When a customer communicates with a particular entity, such as a customer service center, the system makes a recording of the real-time call including both the customer's and agent's voices. The system segments the recording to extract different words, such as words of anger. The system may also segment at least a portion of the customer's voice to create a tone profile, and it formats the segmented words and tone profiles for network transmission to a server. The server compares the customer's words and/or tone profiles with multiple known words and/or tone profiles stored on a database to determine any substantial matches. The identification of any matches may be used for a variety of purposes, such as providing representative feedback or customer follow-up.
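An illustrative Python sketch of the server-side one-to-many comparison, with an assumed watch-word list and a deliberately crude tone profile (mean energy and zero-crossing rate); none of these feature choices come from the abstract.

import numpy as np

KNOWN_WORDS = {"furious", "lawsuit", "cancel"}          # assumed watch-word list

def segment_words(transcript):
    # Stand-in for word segmentation of the recognized customer speech.
    return [w.strip(".,!?").lower() for w in transcript.split()]

def tone_profile(samples):
    # Deliberately crude tone profile: mean energy and zero-crossing rate.
    energy = float(np.mean(samples ** 2))
    zcr = float(np.mean(np.abs(np.diff(np.sign(samples)))) / 2.0)
    return np.array([energy, zcr])

def match(words, profile, known_profiles, tol=0.1):
    # One-to-many comparison against the stored words and tone profiles.
    word_hits = sorted(set(words) & KNOWN_WORDS)
    tone_hits = [name for name, p in known_profiles.items()
                 if np.linalg.norm(profile - p) < tol]
    return word_hits, tone_hits

if __name__ == "__main__":
    audio = np.random.randn(16000).astype(np.float32)
    profile = tone_profile(audio)
    known = {"angry_baseline": profile + 0.01}           # fake database entry
    print(match(segment_words("I am furious, cancel my account!"), profile, known))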
Abstract:
Arrangements described herein include identifying a voice communication session established between a first communication device and a second communication device and, based on that session, identifying a plurality of contacts who potentially may be the second user, i.e., the user of the second communication device. A list including at least a name of each of those contacts is presented to a first user using the first communication device.
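A hedged Python sketch of building and presenting the candidate list, assuming the contacts are matched to the second device's number (for example, several family members sharing one line); the Contact record and lookup scheme are assumptions, not the arrangement's actual data model.

from dataclasses import dataclass

@dataclass
class Contact:
    name: str
    phone: str

CONTACTS = [Contact("Ana Ruiz", "+1-555-0100"),
            Contact("Ben Ruiz", "+1-555-0100"),     # same household line
            Contact("Cara Lee", "+1-555-0123")]

def candidate_second_users(session_remote_number):
    # Contacts who potentially may be the second user on this voice session.
    return [c for c in CONTACTS if c.phone == session_remote_number]

def present_list(candidates):
    # Present at least a name of each candidate to the first user.
    for c in candidates:
        print(c.name)

present_list(candidate_second_users("+1-555-0100"))     # -> Ana Ruiz, Ben Ruiz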
Abstract:
A home gateway system has a transceiver (70) capable of establishing a wireless local loop connection (72). A voice processing system (74) is coupled to the transceiver (70). The voice processing system (74) is capable of storing a message from an incoming call. A caller identification processing system (76) is coupled to the transceiver (70). The caller identification processing system (76) determines a telephone number of the incoming call and routes the incoming call to the voice processing system (74) if the telephone number belongs to a screened group of telephone numbers.
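The routing decision can be sketched in Python as follows; component names mirror the abstract, but the data structures and logic are illustrative only.

SCREENED_NUMBERS = {"+1-555-7777", "+1-555-8888"}    # assumed screened group

class VoiceProcessingSystem:
    def __init__(self):
        self.messages = []

    def take_message(self, caller, message):
        self.messages.append((caller, message))      # store the message from the call

def route_incoming_call(caller_number, voice_system):
    # Caller-ID processing: divert screened numbers to the voice processing system.
    if caller_number in SCREENED_NUMBERS:
        voice_system.take_message(caller_number, "<recorded audio>")
        return "routed to voice processing"
    return "ring through to handset"

vps = VoiceProcessingSystem()
print(route_incoming_call("+1-555-7777", vps))       # -> routed to voice processing
print(route_incoming_call("+1-555-0100", vps))       # -> ring through to handset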
Abstract:
Technology for crime control includes receiving a voucher identifier for a mobile phone credit voucher purchased under duress by a victim and generating a request for a legal order directing a telecommunication service provider to obtain certain information about use of the voucher. Approval for the legal order is received, and the legal order and the voucher identifier are transmitted by a law enforcement agency computer system via a network to a computer system of the telecommunication service provider. A phone number associated with a mobile phone to which a credit associated with the voucher identifier was applied, along with a recording of a telephone call to or from that number, is received via the network from the telecommunication service provider computer system. The law enforcement agency computer system then performs an automated analysis of the call using a voice recognition process.
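A rough Python sketch of this data flow from the law enforcement agency system's side; the message shapes, the transport, and the voice-recognition step are all placeholder assumptions.

from dataclasses import dataclass

@dataclass
class LegalOrder:
    voucher_id: str
    approved: bool = False

def request_legal_order(voucher_id):
    # Generate the request and (as a placeholder) mark approval as received.
    return LegalOrder(voucher_id, approved=True)

def send_to_provider(order):
    # Transmit the approved order and voucher identifier over the network.
    # Placeholder response: the phone number the credit was applied to and a
    # recording of a call to or from that number.
    assert order.approved
    return {"phone_number": "+1-555-0199", "recording": b"<call audio>"}

def analyze_call(recording):
    # Automated analysis of the call by a (placeholder) voice recognition step.
    return {"speech_detected": len(recording) > 0}

order = request_legal_order(voucher_id="VCHR-EXAMPLE")
response = send_to_provider(order)
print(analyze_call(response["recording"]))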
Abstract:
In many scenarios, speaker verification systems can be given single-channel audio containing recordings of multiple speakers. To perform accurate speaker verification, a system can isolate the speech of a single speaker. In one embodiment, a method, and corresponding system, of speaker verification includes extracting a target speaker's speech, using a known speaker voiceprint, from an audio recording that includes the target speaker's speech and the known speaker's speech. The known speaker voiceprint can correspond to the known speaker. Extracting the target speaker's speech can include determining portions of the audio recording where the known speaker voiceprint matches the known speaker's speech above a particular threshold, and extracting the target speaker's speech from the other portions of the audio recording. In this manner, speaker verification is performed on the target speaker's speech without interference from the known speaker's speech, allowing for more accurate verification.
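A minimal Python sketch of the extraction step, assuming a front end has already produced per-frame speaker embeddings: frames whose cosine similarity to the known speaker's voiceprint exceeds a threshold are attributed to the known speaker, and the remaining frames are kept as the target speaker's speech. The embedding dimensionality and threshold are arbitrary assumptions.

import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def extract_target_frames(frame_embeddings, known_voiceprint, threshold=0.7):
    # Keep the frames NOT attributed to the known speaker; the rest are dropped.
    return [i for i, e in enumerate(frame_embeddings)
            if cosine(e, known_voiceprint) <= threshold]

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    known, target = rng.normal(size=64), rng.normal(size=64)
    # Synthetic recording: 5 frames of the known speaker, then 5 of the target.
    frames = ([known + 0.05 * rng.normal(size=64) for _ in range(5)] +
              [target + 0.05 * rng.normal(size=64) for _ in range(5)])
    print(extract_target_frames(frames, known))          # typically [5, 6, 7, 8, 9]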
Abstract:
Contact center agents often work in close proximity to other agents. As a primary agent is engaged in a call, a neighboring agent's speech may be picked up by the primary agent's microphone. Contact centers using automated speech recognition systems may monitor the agent's speech for key terms and, if detected, respond accordingly. Determining that the primary agent spoke a key term when the true speaker of the key term is a neighboring agent may cause errors or other problems. Characterizing at least the primary agent's voice and then, once a key term is detected, determining whether it was the primary agent who spoke the key term may help to reduce such errors. Additionally, computational requirements may be reduced, as non-key terms may be quickly discarded and, optionally, key terms determined not to have been spoken by the primary agent may also be discarded without further processing.
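A hedged Python sketch of the attribution check: the primary agent's voice is characterized as a reference embedding, and a flagged key-term segment is compared against it before any further processing. The key-term list, embeddings, and threshold are assumptions, not details from the abstract.

import numpy as np

KEY_TERMS = {"refund", "supervisor", "cancel"}           # assumed key-term list

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def handle_recognized_term(term, segment_embedding, primary_agent_embedding,
                           threshold=0.75):
    if term not in KEY_TERMS:
        return "discard: not a key term"                 # cheap early rejection
    if cosine(segment_embedding, primary_agent_embedding) < threshold:
        return "discard: spoken by a neighboring agent"  # attribution check failed
    return f"escalate: primary agent said '{term}'"

rng = np.random.default_rng(2)
primary, neighbor = rng.normal(size=64), rng.normal(size=64)
print(handle_recognized_term("refund", primary + 0.05 * rng.normal(size=64), primary))
print(handle_recognized_term("refund", neighbor, primary))
print(handle_recognized_term("hello", neighbor, primary))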