Weak Supervised Abnormal Entity Detection

    Publication No.: US20210256211A1

    Publication date: 2021-08-19

    Application No.: US16789804

    Filing date: 2020-02-13

    Abstract: A mechanism is provided to implement an abnormal entity detection mechanism that facilitates detecting abnormal entities in real-time response systems through weak supervision. For each first intent from an entity labeled workspace that matches a second intent in labeled chat logs, when the entity score associated with each first entity or second entity is above a predefined significance level the first entity or the second entity is recorded. For each first intent from the entity labeled workspace that matches the second intent in the labeled chat logs: responsive to the first entity being recorded and the second entity failing to be recorded, that first entity is removed from the training data as being mistakenly included; or, responsive to the second entity being recorded and the first entity failing to be recorded, that second entity is added as a potential business case to the training data.
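The comparison described in this abstract can be sketched as follows. This is a minimal illustration and not the patented implementation: the data shape (a mapping from intent to per-entity scores), the function name `detect_abnormal_entities`, and the default threshold of 0.5 are all assumptions made for the example.

```python
from typing import Dict, Set, Tuple

# Hypothetical data shape: intent name -> {entity name: entity score}
Scores = Dict[str, Dict[str, float]]


def detect_abnormal_entities(
    workspace: Scores,
    chat_logs: Scores,
    significance: float = 0.5,  # assumed threshold, not from the abstract
) -> Tuple[Dict[str, Set[str]], Dict[str, Set[str]]]:
    """Return (entities to remove, entities to add), keyed by intent."""
    to_remove: Dict[str, Set[str]] = {}
    to_add: Dict[str, Set[str]] = {}
    # Only intents present in both the workspace and the chat logs are compared.
    for intent in workspace.keys() & chat_logs.keys():
        # "Record" every entity whose score exceeds the significance level.
        ws_recorded = {e for e, s in workspace[intent].items() if s > significance}
        log_recorded = {e for e, s in chat_logs[intent].items() if s > significance}
        # Recorded in the workspace only: treated as mistakenly included.
        to_remove[intent] = ws_recorded - log_recorded
        # Recorded in the chat logs only: treated as a potential business case.
        to_add[intent] = log_recorded - ws_recorded
    return to_remove, to_add


workspace = {"book_flight": {"city": 0.9, "pet": 0.8}}
chat_logs = {"book_flight": {"city": 0.95, "airline": 0.7, "pet": 0.1}}
to_remove, to_add = detect_abnormal_entities(workspace, chat_logs)
```

In this toy input, `pet` is significant only in the workspace and `airline` only in the chat logs, so they land in the removal and addition sets respectively.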

    System and method for enhanced chatflow application

    Publication No.: US10719770B2

    Publication date: 2020-07-21

    Application No.: US15279250

    Filing date: 2016-09-28

    Abstract: Embodiments provide a computer implemented method of training an enhanced chatflow system, comprising: ingesting a corpus of information comprising at least one user input node corresponding to a user question and at least one expert-designed variation for each user input node; matching one or more user inputs to one or more corresponding dialog nodes using regular expressions and delimiters; ingesting one or more usage logs from a deployed dialog system, each usage log comprising at least one user input node; for each user input node: designating the node as a class; storing the node in a dialog node repository; designating each of the at least one variations as training examples for the designated class; converting the classes and the training examples into feature vector representations; training one or more classifiers and one or more classification objectives using the feature vector representations.

    SYSTEM AND METHOD FOR ENHANCED CHATFLOW APPLICATION

    Publication No.: US20180091457A1

    Publication date: 2018-03-29

    Application No.: US15279248

    Filing date: 2016-09-28

    CPC classification number: H04L51/16 H04L51/18

    Abstract: Embodiments provide a computer implemented method, in a data processing system comprising a processor and a memory comprising instructions which are executed by the processor to cause the processor to train an enhanced chatflow system, the method comprising: ingesting a corpus of information comprising at least one user input node corresponding to a user question and at least one variation for each user input node; for each user input node: designating the node as a class; storing the node in a dialog node repository; designating each of the at least one variations as training examples for the designated class; converting the classes and the training examples into feature vector representations; training one or more training classifiers using the one or more feature vector representations of the classes; and training classification objectives using the one or more feature vector representations of the training examples.

    Weak supervised abnormal entity detection

    Publication No.: US11423227B2

    Publication date: 2022-08-23

    Application No.: US16789804

    Filing date: 2020-02-13

    Abstract: A mechanism is provided to implement an abnormal entity detection mechanism that facilitates detecting abnormal entities in real-time response systems through weak supervision. For each first intent from an entity labeled workspace that matches a second intent in labeled chat logs, when the entity score associated with each first entity or second entity is above a predefined significance level the first entity or the second entity is recorded. For each first intent from the entity labeled workspace that matches the second intent in the labeled chat logs: responsive to the first entity being recorded and the second entity failing to be recorded, that first entity is removed from the training data as being mistakenly included; or, responsive to the second entity being recorded and the first entity failing to be recorded, that second entity is added as a potential business case to the training data.

    Weighting features for an intent classification system

    Publication No.: US10977445B2

    Publication date: 2021-04-13

    Application No.: US16265618

    Filing date: 2019-02-01

    Abstract: A computer-implemented method includes obtaining a training data set including a plurality of training examples. The method includes generating, for each training example, multiple feature vectors corresponding, respectively, to multiple feature types. The method includes applying weighting factors to feature vectors corresponding to a subset of the feature types. The weighting factors are determined based on one or more of: a number of training examples, a number of classes associated with the training data set, an average number of training examples per class, a language of the training data set, a vocabulary size of the training data set, or a commonality of the vocabulary with a public corpus. The method includes concatenating the feature vectors of a particular training example to form an input vector and providing the input vector as training data to a machine-learning intent classification model to train the model to determine intent based on text input.
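The weight-then-concatenate step in this abstract can be illustrated with a short sketch. It assumes the weighting factors have already been computed elsewhere (the abstract lists what they may depend on but not a formula), and the feature-type names and the helper `build_input_vector` are hypothetical.

```python
from typing import Dict, List


def build_input_vector(
    features: Dict[str, List[float]],
    weights: Dict[str, float],
) -> List[float]:
    """Scale a subset of feature vectors, then concatenate all of them.

    Feature types missing from `weights` are left unscaled (factor 1.0).
    Types are concatenated in sorted-name order so the layout is stable
    across training examples.
    """
    input_vector: List[float] = []
    for feature_type in sorted(features):
        factor = weights.get(feature_type, 1.0)
        input_vector.extend(factor * x for x in features[feature_type])
    return input_vector


# Hypothetical feature types for one training example.
example = {"ngram": [1.0, 0.0], "embedding": [0.5, 0.25]}
vector = build_input_vector(example, {"embedding": 2.0})
```

Here only the `embedding` features are weighted, which mirrors the abstract's "subset of the feature types"; the resulting vector would then be passed to whatever intent classifier is being trained.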

    CROSS-DOMAIN MULTI-TASK LEARNING FOR TEXT CLASSIFICATION

    Publication No.: US20200251100A1

    Publication date: 2020-08-06

    Application No.: US16265740

    Filing date: 2019-02-01

    Abstract: A method includes providing input text to a plurality of multi-task learning (MTL) models corresponding to a plurality of domains. Each MTL model is trained to generate an embedding vector based on the input text. The method further includes providing the input text to a domain identifier that is trained to generate a weight vector based on the input text. The weight vector indicates a classification weight for each domain of the plurality of domains. The method further includes scaling each embedding vector based on a corresponding classification weight of the weight vector to generate a plurality of scaled embedding vectors, generating a feature vector based on the plurality of scaled embedding vectors, and providing the feature vector to an intent classifier that is trained to generate, based on the feature vector, an intent classification result associated with the input text.
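The scaling step in this abstract can be sketched as below. The abstract does not say how the scaled embedding vectors are turned into a single feature vector, so the element-wise sum used here is an assumption (concatenation would be another plausible reading); the function name `combine_embeddings` is likewise hypothetical.

```python
from typing import List


def combine_embeddings(
    embeddings: List[List[float]],
    weights: List[float],
) -> List[float]:
    """Scale each domain's embedding by its weight, then sum element-wise.

    `embeddings[i]` stands in for the output of the i-th domain's MTL model
    and `weights[i]` for the domain identifier's weight for that domain.
    """
    if len(embeddings) != len(weights):
        raise ValueError("need exactly one weight per domain embedding")
    scaled = [[w * x for x in emb] for emb, w in zip(embeddings, weights)]
    # Assumed combination rule: element-wise sum of the scaled vectors.
    return [sum(column) for column in zip(*scaled)]


feature_vector = combine_embeddings([[1.0, 2.0], [3.0, 4.0]], [0.5, 0.5])
```

The resulting feature vector is what would be handed to the intent classifier in the final step the abstract describes.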

    PRETRAINING OF SPLIT LAYER PORTIONS FOR MULTILINGUAL MODEL

    Publication No.: US20240193377A1

    Publication date: 2024-06-13

    Application No.: US18063788

    Filing date: 2022-12-09

    CPC classification number: G06F40/58

    Abstract: A method, computer system, and a computer program product for training a machine learning model are provided. A machine learning model may be split into a lower portion and an upper portion. The lower portion includes at least one layer. The upper portion includes at least one layer. The lower portion may be pre-trained via a generator task and via alternating between inputting of monolingual text data and multilingual text data. The upper portion may be pre-trained via a discriminator task. The pre-trained lower portion may be joined to the pre-trained upper portion to form a trained multilingual machine learning model.
