Abstract:
Systems, methods, and media for the automated removal of private information are provided herein. In an example implementation, a method for automatic removal of private information may include: receiving a transcript of communication data; applying a private information rule to the transcript in order to identify private information in the transcript; tagging the identified private information with a tag comprising an identification of the private information; applying a complicate rule to the tagged transcript in order to evaluate a compliance of the transcript with privacy standards; removing the identified private information from the transcript to produce a redacted transaction; and storing the redacted transcript.
Abstract:
By formulizing a specific company's internal knowledge and terminology, the ontology programming accounts for linguistic meaning to surface relevant and important content for analysis. The ontology is built on the premise that meaningful terms are detected in the corpus and then classified according to specific semantic concepts, or entities. Once the main terms are defined, direct relations or linkages can be formed between these terms and their associated entities. Then, the relations are grouped into themes, which are groups or abstracts that contain synonymous relations. The disclosed ontology programming adapts to the language used in a specific domain, including linguistic patterns and properties, such as word order, relationships between terms, and syntactical variations. The ontology programming automatically trains itself to understand the domain or environment of the communication data by processing and analyzing a defined corpus of communication data.
Abstract:
A rogue base station detection system that establishes a communication session with a suspected base station, and verifies whether the base station is rogue or innocent by testing which advanced communication features are supported by the base station. The detection system holds a definition of one or more communication features that are supported by innocent base stations and not by rogue base stations. During a communication session with a suspected base station, the detection system requests the base station to activate these communication features. If the base station does not support the features in question, it is likely to be rogue.
Abstract:
Methods, systems, and computer readable media for automated transcription model adaptation includes obtaining audio data from a plurality of audio files. The audio data is transcribed to produce at least one audio file transcription which represents a plurality of transcription alternatives for each audio file. Speech analytics are applied to each audio file transcription. A best transcription is selected from the plurality of transcription alternatives for each audio file. Statistics from the selected best transcription are calculated. An adapted model is created from the calculated statistics.
Abstract:
In a method of diarization of audio data, audio data is segmented into a plurality of utterances. Each utterance is represented as an utterance model representative of a plurality of feature vectors. The utterance models are clustered. A plurality of speaker models are constructed from the clustered utterance models. A hidden Markov model is constructed of the plurality of speaker models. A sequence of identified speaker models is decoded.
Abstract:
Methods and systems for automated generation of malicious traffic signatures, for use in Intrusion Detection Systems (IDS). A rule generation system formulates IDS rules based on traffic analysis results obtained from a network investigation system. The rule generation system then automatically configures the IDS to apply the rules. An analysis process in the network investigation system comprises one or more metadata filters that are indicative of malicious traffic. An operator of the rule generation system is provided with a user interface that is capable of displaying the network traffic filtered in accordance with such filters.
Abstract:
Method of automated anomaly detection includes obtaining a corpus of interaction data. Regular interaction data is identified from the corpus of interaction data with a processor. New interaction data is received. The processor compares the new interaction data to the identified regular interaction data The processor identities anomalies in the new interaction data.
Abstract:
Methods and systems for tracking mobile communication terminals based on their identifiers. The disclosed techniques identify cellular terminals and Wireless Local Area Network (WLAN) terminals that are likely to be carried by the same individual, or cellular and WLAN identifiers that belong to the same multi-mode terminal. A correlation system is connected to a cellular network and to a WLAN. The system receives location coordinates of cellular identifiers used by mobile terminals in the cellular network, and location coordinates of WLAN identifiers used by mobile terminals in the WLAN. Based on the location coordinates, the system is able to construct routes that are traversed by the terminals having the various cellular and WLAN identifiers. The system attempts to find correlations in time and space between the routes.
Abstract:
Systems and methods of diarization using linguistic labeling include receiving a set of diarized textual transcripts. A least one heuristic is automatedly applied to the diarized textual transcripts to select transcripts likely to be associated with an identified group of speakers. The selected transcripts are analyzed to create at least one linguistic model. The linguistic model is applied to transcripted audio data to label a portion of the transcripted audio data as having been spoken by the identified group of speakers. Still further embodiments of diarization using linguistic labeling may serve to label agent speech and customer speech in a recorded and transcripted customer service interaction.
Abstract:
Methods and systems for filtering communication packets using a multi-stage filtering system that receives a large volume of communication packets from a communication network that filters the packets in two or more successive stages. The system comprises at least one front-end filtering unit and multiple back-end filtering units. Typically although not necessarily, the front-end filtering unit filters the packets based on layer-2 to layer-4 attributes of the packets. The back-end filtering units, on the other hand, filter the packets based on content extracted from the packet payloads. The back-end filtering units may perform filtering, for example, based on keyword spotting, application classification, malware detection and other content-related criteria. The front-end filtering unit typically performs filtering at the individual packet level and/or at the level of request-response transactions. The back-end filtering units, on the other hand, typically perform filtering at the level of entire reconstructed packet flows.