Abstract:
A method for generating a language model for an organization includes: receiving, by a processor, organization-specific training data; receiving, by the processor, generic training data; computing, by the processor, a plurality of similarities between the generic training data and the organization-specific training data; assigning, by the processor, a plurality of weights to the generic training data in accordance with the computed similarities; combining, by the processor, the generic training data with the organization-specific training data in accordance with the weights to generate customized training data; training, by the processor, a customized language model using the customized training data; and outputting, by the processor, the customized language model, the customized language model being configured to compute the likelihood of phrases in a medium.
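The weighting and combining steps can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not name a similarity measure or a model type, so bag-of-words cosine similarity and a weighted unigram model are stand-ins, and the function names `build_customized_data` and `train_unigram_model` are hypothetical.

```python
from collections import Counter
import math


def bag_of_words(text):
    """Lowercased word counts for one sentence."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_customized_data(org_sentences, generic_sentences):
    """Return (sentence, weight) pairs.

    Assumption: organization-specific sentences get weight 1.0, and each
    generic sentence is weighted by its similarity to the whole
    organization-specific corpus.
    """
    org_profile = Counter()
    for sentence in org_sentences:
        org_profile.update(bag_of_words(sentence))
    customized = [(sentence, 1.0) for sentence in org_sentences]
    for sentence in generic_sentences:
        customized.append((sentence, cosine(bag_of_words(sentence), org_profile)))
    return customized


def train_unigram_model(weighted_sentences):
    """Weighted unigram model: each word count is scaled by its sentence weight."""
    counts = Counter()
    for sentence, weight in weighted_sentences:
        for word in sentence.lower().split():
            counts[word] += weight
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: count / total for word, count in counts.items()}
```

Generic sentences that share vocabulary with the organization's data contribute more to the final counts, while unrelated generic sentences are down-weighted rather than discarded.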
Abstract:
A method including: receiving, on a computer system, a text search query, the query including one or more query words; generating, on the computer system, for each query word in the query, one or more anchor segments within a plurality of speech recognition processed audio files, the one or more anchor segments identifying possible locations containing the query word; post-processing, on the computer system, the one or more anchor segments, the post-processing including: expanding the one or more anchor segments; sorting the one or more anchor segments; and merging overlapping ones of the one or more anchor segments; and searching, on the computer system, the post-processed one or more anchor segments for instances of at least one of the one or more query words using a constrained grammar.
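The post-processing steps (expand, sort, merge) can be sketched for the anchor segments of a single audio file. The representation is an assumption: each segment is a `(start, end)` pair in seconds, and `margin` and `audio_length` are hypothetical parameters that the abstract does not specify.

```python
def post_process_segments(segments, margin, audio_length):
    """Expand, sort, and merge anchor segments given as (start, end) pairs."""
    # Expand each segment by `margin` on both sides, clamped to the audio bounds.
    expanded = [
        (max(0.0, start - margin), min(audio_length, end + margin))
        for start, end in segments
    ]
    # Sort by start time so overlapping segments become adjacent.
    expanded.sort()
    # Merge each segment into the previous one when they overlap or touch.
    merged = []
    for start, end in expanded:
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


if __name__ == "__main__":
    print(post_process_segments([(5.0, 6.0), (1.0, 2.0), (6.5, 7.0)], 0.5, 10.0))
    # prints [(0.5, 2.5), (4.5, 7.5)]
```

The merged segments are the regions that the later constrained-grammar search would examine; that search step is not shown here.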
Abstract:
A method for predicting a speech recognition quality of a phrase comprising at least one word includes: receiving, on a computer system including a processor and memory storing instructions, the phrase; computing, on the computer system, a set of features comprising one or more features corresponding to the phrase; providing the phrase to a prediction model on the computer system and receiving a predicted recognition quality value based on the set of features; and returning the predicted recognition quality value.
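The receive, featurize, predict, and return flow can be sketched as below. Everything specific is an assumption made for illustration: the abstract names neither the features nor the model, so the two length-based features, the logistic form, and the coefficients in `WEIGHTS` and `BIAS` are hypothetical placeholders rather than trained values.

```python
import math

# Hypothetical coefficients, chosen only so that the example runs.
WEIGHTS = {"num_words": 0.4, "num_chars": 0.1}
BIAS = -1.5


def extract_features(phrase):
    """Compute a small feature set for a phrase of at least one word."""
    words = phrase.split()
    if not words:
        raise ValueError("phrase must contain at least one word")
    return {
        "num_words": float(len(words)),
        "num_chars": float(sum(len(word) for word in words)),
    }


def predict_quality(phrase):
    """Return a predicted recognition quality value in the open interval (0, 1)."""
    features = extract_features(phrase)
    score = BIAS + sum(WEIGHTS[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-score))
```

With these placeholder coefficients, longer phrases score higher than very short ones; a real prediction model would be fitted to measured recognition results.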
Abstract:
A method for tracking known topics in a plurality of interactions includes: extracting, by a processor, a plurality of fragments from the plurality of interactions; initializing, by the processor, a collection of tracked topics to an empty collection; computing, by the processor, a similarity between each fragment of the fragments and each of the known topics; and adding, by the processor, a known topic of the known topics to the tracked topics in response to the similarity between a fragment and the known topic exceeding a threshold value.
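The tracking loop can be sketched as follows, assuming that fragments and known topics are plain strings and that similarity is the Jaccard overlap of their word sets. The abstract does not specify the similarity measure, so `jaccard` is a stand-in, and `track_known_topics` is a hypothetical name.

```python
def jaccard(a, b):
    """Jaccard similarity between the word sets of two strings."""
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    if not set_a or not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def track_known_topics(fragments, known_topics, threshold):
    """Return the known topics that some fragment matches above the threshold."""
    tracked = []  # the collection of tracked topics starts empty
    for topic in known_topics:
        # Add the topic once if any fragment is similar enough to it.
        if any(jaccard(fragment, topic) > threshold for fragment in fragments):
            tracked.append(topic)
    return tracked


if __name__ == "__main__":
    fragments = ["cancel my account", "billing error on invoice"]
    print(track_known_topics(fragments, ["cancel account", "shipping delay"], 0.5))
    # prints ['cancel account']
```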
Abstract:
A method for determining a cause of events detected in a plurality of interactions includes: identifying, on a processor, a plurality of elements in the interactions; detecting, on the processor, a plurality of sequences of elements in the interactions; mining, on the processor, the plurality of sequences for generating a set of supported patterns; computing, on the processor, association rules from the set of supported patterns; and returning the computed association rules.
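The mining and rule-computation steps can be sketched under simplifying assumptions: patterns are contiguous runs of elements, the support of a pattern is the number of sequences that contain it, and each rule predicts the last element of a pattern from the elements before it. The abstract does not fix any of these choices, and the function names and the `max_len` parameter are hypothetical.

```python
from collections import Counter


def mine_supported_patterns(sequences, min_support, max_len=3):
    """Return contiguous patterns that occur in at least min_support sequences."""
    support = Counter()
    for seq in sequences:
        seen = set()  # count each pattern at most once per sequence
        for n in range(1, max_len + 1):
            for i in range(len(seq) - n + 1):
                seen.add(tuple(seq[i:i + n]))
        support.update(seen)
    return {p: count for p, count in support.items() if count >= min_support}


def association_rules(patterns, min_confidence):
    """Build (antecedent, consequent, confidence) rules from supported patterns."""
    rules = []
    for pattern, count in patterns.items():
        if len(pattern) < 2:
            continue
        antecedent, consequent = pattern[:-1], pattern[-1]
        # A prefix is contained in every sequence that contains the full
        # pattern, so it is always present among the supported patterns.
        confidence = count / patterns[antecedent]
        if confidence >= min_confidence:
            rules.append((antecedent, consequent, confidence))
    return rules
```

For example, if every sequence containing `transfer` also continues with `escalation`, the rule `("transfer",) -> "escalation"` is returned with confidence 1.0.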
Abstract:
Methods, systems, and computer program products are disclosed for automatically performing sentiment analysis on texts, such as telephone call transcripts and electronic written communications. Disclosed techniques include, inter alia, lexicon training, handling of negations and shifters, pruning of lexicons, confidence calculation for token orientation, supervised customization, lexicon mixing, and adaptive segmentation.
Abstract:
A method for generating a dialogue tree for an automated self-help system of a contact center from a plurality of recorded interactions between customers and agents of the contact center includes: computing, by a processor, a plurality of feature vectors, each feature vector corresponding to one of the recorded interactions; computing, by the processor, similarities between pairs of the feature vectors; grouping, by the processor, similar feature vectors based on the computed similarities into groups of interactions; rating, by the processor, feature vectors within each group of interactions based on one or more criteria, wherein the criteria include at least one of interaction time, success rate, and customer satisfaction; and outputting, by the processor, a dialogue tree in accordance with the rated feature vectors for configuring the automated self-help system.
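The grouping and rating steps can be sketched as below. This is a partial illustration: it stops before constructing the dialogue tree itself, uses a greedy threshold grouping on cosine similarity as a stand-in for whatever grouping the method uses, and rates by a single criterion. The function names and the `satisfaction` scores are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def group_interactions(vectors, threshold):
    """Greedily group interaction indices whose vectors resemble a group's first member."""
    groups = []
    for index, vector in enumerate(vectors):
        for group in groups:
            if cosine(vector, vectors[group[0]]) >= threshold:
                group.append(index)
                break
        else:
            groups.append([index])  # no similar group found, start a new one
    return groups


def rate_groups(groups, satisfaction):
    """Order each group's interactions from highest to lowest satisfaction score."""
    return [
        sorted(group, key=lambda i: satisfaction[i], reverse=True)
        for group in groups
    ]
```

The top-rated interaction in each group would then be a natural candidate path when assembling the tree.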
Abstract:
A method, system, and computer program product for unsupervised automated generation of lexicons in a specified target domain, the lexicons comprising tokens having domain-specific sentiment orientation, by: selecting a seed set of tokens from a source lexicon; generating a candidate set of tokens from a text corpus in the target domain based on a similarity parameter with the seed set; calculating a sentiment score for each of the tokens in the candidate set; and automatically updating the source lexicon based on the candidate set.
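The four steps (seed selection, candidate generation, scoring, update) can be sketched as follows. The specifics are assumptions: the source lexicon maps tokens to polarities in [-1, 1], similarity to the seed set is approximated by same-sentence co-occurrence, and a candidate's score is the mean polarity of the seed tokens it co-occurs with. The parameters `seed_min_strength` and `min_cooccurrence` are hypothetical.

```python
from collections import defaultdict


def expand_lexicon(source_lexicon, corpus, seed_min_strength=0.5, min_cooccurrence=2):
    """Return an updated copy of source_lexicon with scored domain candidates."""
    # Step 1: seed set = tokens with a strong polarity in the source lexicon.
    seed = {t: p for t, p in source_lexicon.items() if abs(p) >= seed_min_strength}

    # Step 2: collect, for each non-lexicon token, the polarities of the
    # seed tokens that appear in the same sentence.
    cooccurring = defaultdict(list)
    for sentence in corpus:
        tokens = set(sentence.lower().split())
        seed_polarities = [seed[t] for t in tokens if t in seed]
        for token in tokens:
            if token not in source_lexicon:
                cooccurring[token].extend(seed_polarities)

    # Steps 3 and 4: score candidates with enough evidence and add them.
    updated = dict(source_lexicon)
    for token, polarities in cooccurring.items():
        if len(polarities) >= min_cooccurrence:
            updated[token] = sum(polarities) / len(polarities)
    return updated
```

A practical version would at least filter stop words, since frequent function words co-occur with seed tokens of both polarities.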