Abstract:
A system, device, and method for generating an audio output includes a master computing device and a plurality of client computing devices. Each client computing device includes a microphone to record audio signals. The client computing devices generate audio data based on the audio signals and transmit the audio data to the master computing device. The master computing device generates a final, higher quality audio output as a function of the audio data received from the collection of participating client computing devices.
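The combining step at the master device can be illustrated with a minimal sketch. The abstract does not say how the final output is computed, so plain averaging is an assumption here, as are the function name `combine_client_audio` and the premise that all client streams are already time-aligned and share one sample rate.

```python
def combine_client_audio(client_frames):
    """Average time-aligned samples from several clients into one output.

    client_frames: list of sample lists, one per client device.
    Averaging is an assumed combining function chosen for illustration;
    the abstract only says the output is "a function of" the audio data.
    """
    if not client_frames:
        return []
    # Truncate to the shortest stream so every index has one sample per client.
    length = min(len(frames) for frames in client_frames)
    count = len(client_frames)
    return [sum(frames[i] for frames in client_frames) / count
            for i in range(length)]
```

Averaging several recordings of the same source tends to reduce uncorrelated noise, which is one plausible reading of "higher quality"; a real system would also need alignment and gain normalization.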
Abstract:
A system and method for automatic call segmentation including steps and means for automatically detecting boundaries between utterances in the call transcripts; automatically classifying utterances into target call sections; automatically partitioning the call transcript into call segments; and outputting a segmented call transcript. A training method and apparatus for training the system to perform automatic call segmentation includes steps and means for providing at least one training transcript with annotated call sections; normalizing the at least one training transcript; and performing statistical analysis on the at least one training transcript.
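The classify-then-partition flow can be sketched as follows. This is only an illustration: the keyword table `SECTION_CUES` is a hypothetical stand-in for the statistically trained classifier the abstract describes, the section names are invented, and utterance boundaries are assumed to be detected already.

```python
from itertools import groupby

# Hypothetical keyword cues standing in for a trained statistical classifier.
SECTION_CUES = {
    "greeting": ("hello", "thank you for calling"),
    "closing": ("goodbye", "have a nice day"),
}

def classify(utterance):
    """Assign an utterance to a call section; default to 'body'."""
    text = utterance.lower()
    for section, cues in SECTION_CUES.items():
        if any(cue in text for cue in cues):
            return section
    return "body"

def segment_call(utterances):
    """Partition a transcript into runs of consecutive same-section utterances."""
    labeled = [(classify(u), u) for u in utterances]
    return [(section, [u for _, u in run])
            for section, run in groupby(labeled, key=lambda pair: pair[0])]
```

Grouping only consecutive utterances keeps the segments in call order, so a section that recurs later in the call appears as a separate segment.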
Abstract:
A small baseline audio sample is captured when a person initially calls in, and the sample is held only for the duration of the call. For each subsequent transfer, a comparison is made to the baseline established from the initial call. At the end of the call, the voice sample is discarded, so no resources need to be maintained. Speaker verification and VoIP technologies are used to persist the customer's verification information as service representative hand-offs occur.
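The per-call lifecycle of the baseline (capture, compare on each transfer, discard at the end) can be sketched as below. The class `CallSession`, the use of feature vectors, cosine similarity as the comparison, and the 0.8 threshold are all assumptions for illustration; the abstract does not name a comparison method.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

class CallSession:
    """Holds a caller's baseline voice features for one call only."""

    def __init__(self, baseline, threshold=0.8):  # threshold is an assumed value
        self._baseline = list(baseline)
        self._threshold = threshold

    def verify_on_transfer(self, sample):
        """Compare a new sample to the baseline captured at call start."""
        if self._baseline is None:
            raise RuntimeError("call has ended; baseline was discarded")
        return cosine_similarity(self._baseline, sample) >= self._threshold

    def end_call(self):
        """Discard the baseline so nothing persists after the call."""
        self._baseline = None
```

Keeping the baseline inside the session object, rather than in a database, mirrors the point that no voice data needs to be maintained between calls.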
Abstract:
A method and system for using conversational biometrics and speaker identification and/or verification to filter voice streams during mixed mode communication. The method includes receiving an audio stream of a communication between participants. Additionally, the method includes filtering the audio stream of the communication into separate audio streams, one for each of the participants. Each of the separate audio streams contains portions of the communication attributable to a respective participant. Furthermore, the method includes outputting the separate audio streams to a storage system.
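The filtering step, once each portion of audio has been attributed to a participant, reduces to routing chunks into per-participant streams. The sketch below assumes speaker identification has already labeled each chunk; the function name `split_by_participant` and the `(speaker, chunk)` input format are hypothetical.

```python
from collections import defaultdict

def split_by_participant(labeled_chunks):
    """Route (speaker, chunk) pairs into one ordered stream per participant.

    labeled_chunks: iterable of (speaker_id, audio_chunk) pairs, assumed to
    come from an upstream speaker identification step.
    """
    streams = defaultdict(list)
    for speaker, chunk in labeled_chunks:
        streams[speaker].append(chunk)  # preserves the original time order
    return dict(streams)
```

Each resulting list could then be written to the storage system as a separate audio stream.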
Abstract:
Methods and systems populate a speech signature database with unique speech signatures that are associated with one or more speaker identities and are further associated with one or more mobile stations and/or telephone numbers. Real-time voice signals are compared to the speech signatures in the speech signature database. When a match is found, the mobile station from which the voice signal originated is located in real-time. Further, the associations in the speech signature database are leveraged to find other relevant mobile stations or users, to generate additional associations, and to locate associated users and mobile stations.
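A minimal sketch of the database lookup follows. It assumes speech signatures are fixed-length feature vectors compared by Euclidean distance, which the abstract does not specify; `SignatureRecord`, `match_speaker`, and the `max_distance` cutoff are hypothetical names and values.

```python
import math
from dataclasses import dataclass, field

@dataclass
class SignatureRecord:
    speaker: str
    signature: tuple                           # assumed feature vector
    numbers: set = field(default_factory=set)  # associated telephone numbers

def match_speaker(records, voice_features, max_distance=0.5):
    """Return the closest record within max_distance, or None if no match."""
    best = min(records,
               key=lambda r: math.dist(r.signature, voice_features),
               default=None)
    if best is None or math.dist(best.signature, voice_features) > max_distance:
        return None
    return best
```

On a match, the record's `numbers` set gives the associated stations that a locating step could then act on.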
Abstract:
A method, system and computer program product for alerting a participant when a topic of interest is being discussed and/or a speaker of interest is speaking during a conference call. A participant to a conference call identifies the topics and/or speakers of interest, which are stored for future use along with the participant's contact information. When a participant's identified topic of interest is being discussed and/or a participant's identified speaker of interest is speaking during a conference call, the participant will be alerted to that fact, such as via the means specified in the participant's contact information.
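The matching of stored interests against the live call can be sketched as below. The sketch assumes the call is available as transcribed segments with a known speaker, and it uses simple substring matching for topics; both are illustrative choices, and `find_alerts` and the dictionary layout are hypothetical.

```python
def find_alerts(interests, segment_speaker, segment_text):
    """Return contact info for every participant who should be alerted.

    interests: list of dicts with 'contact', 'topics', and 'speakers' keys,
    as stored when participants registered their interests.
    """
    text = segment_text.lower()
    alerts = []
    for participant in interests:
        topic_hit = any(t.lower() in text for t in participant["topics"])
        speaker_hit = segment_speaker in participant["speakers"]
        if topic_hit or speaker_hit:
            alerts.append(participant["contact"])
    return alerts
```

The returned contact entries would feed whatever delivery channel each participant specified.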
Abstract:
A method and apparatus for providing access to teleconference services using voice recognition technology to receive information on packet networks such as Voice over Internet Protocol (VoIP) and Service over Internet Protocol (SoIP) networks are disclosed. In one embodiment, the service provider enables a caller to enter access information for accessing a conference service using at least one natural language response.
Abstract:
A method for grouping voice messages includes extracting a voice signature from a voice message and tagging the voice message with an identification associated with the voice signature. The method also includes grouping the voice message based on the identification.
Abstract:
Fraudulent callers that masquerade as legitimate callers in order to discover details of bank accounts or other accounts are an increasing problem. In order to detect possible fraudsters and prevent them from obtaining such details, a method and system are proposed that transform the recorded speech of a batch of incoming calls into strings of phonemes or text. Thereafter, similar speech patterns, such as distinct similar phrases or wording, in the recorded speech are determined, and calls having similar speech patterns, and preferably also similar acoustic properties, are grouped together and identified as being from the same fraudulent caller. As a result, transactions initiated by the fraudulent caller can be stopped, and preferably a voiceprint of the fraudulent caller's speech is generated and stored in a database for further use.
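The grouping of calls by similar wording can be sketched with word trigrams and Jaccard similarity. This is a swapped-in, simplified technique: the abstract does not name a similarity measure, the 0.3 threshold is an assumed value, the acoustic comparison is omitted, and the calls are assumed to be transcribed to text already.

```python
def trigrams(text):
    """Set of consecutive three-word sequences in a transcript."""
    words = text.lower().split()
    return {tuple(words[i:i + 3]) for i in range(len(words) - 2)}

def jaccard(a, b):
    """Overlap of two sets as |intersection| / |union|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def group_calls(transcripts, threshold=0.3):
    """Greedily group call IDs whose transcripts share many word trigrams."""
    groups = []  # each entry: (trigrams of the first member, [call ids])
    for call_id, text in transcripts.items():
        grams = trigrams(text)
        for representative, members in groups:
            if jaccard(grams, representative) >= threshold:
                members.append(call_id)
                break
        else:
            # No existing group was similar enough, so start a new one.
            groups.append((grams, [call_id]))
    return [members for _, members in groups]
```

Groups with more than one member are candidates for review as calls from the same caller; comparing each call only to a group's first member keeps the sketch short but is cruder than a full clustering.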
Abstract:
A method and apparatus for performing active speaker selection in teleconferencing applications illustratively comprises a microphone array module, a speaker recognition system, a user interface, and a speech signal selection module. The microphone array module separates the speech signal from each active speaker from those of other active speakers, providing a plurality of individual speakers' speech signals. The speaker recognition system identifies each currently active speaker using conventional speaker recognition/identification techniques. These identities are then transmitted to a remote teleconferencing location for display to remote participants via a user interface. The remote participants may then select one of the identified speakers, and the speech signal selection module then selects for transmission the speech signal associated with the selected identified speaker, thereby enabling the participants at the remote location to listen to the selected speaker and disregard the speech from other active speakers.