-
Publication No.: US11853712B2
Publication date: 2023-12-26
Application No.: US17303728
Filing date: 2021-06-07
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventor: Haode Qi, Lin Pan, Abhishek Shah, Ladislav Kunc, Saloni Potdar
Abstract: A method, computer system, and computer program product for multi-lingual chatlog training are provided. The embodiment may include receiving, by a processor, a plurality of data related to conversational data in multiple languages. The embodiment may also include assigning an intent label to each conversational data. The embodiment may further include assigning a language label to each conversational data. The embodiment may also include pairing the plurality of the data related to the conversational data according to the intent label and the language label. The embodiment may further include training a machine learning model using a multi-lingual and multi-intent conversational data pairing. The embodiment may also include training the machine learning model using a single language and multi-intent conversational data pairing.
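The pairing step in this abstract could be sketched roughly as follows. The abstract does not say how pairs are formed, so this is only one possible reading: every unordered pair of utterances is placed in a cross-lingual group or a single-language group according to the language labels, and each pair carries a flag saying whether the intent labels match. The names `Utterance` and `build_pairs` and the sample utterances are hypothetical.

```python
from itertools import combinations
from typing import NamedTuple


class Utterance(NamedTuple):
    text: str
    intent: str    # intent label (assumed to be assigned already)
    language: str  # language label (assumed to be assigned already)


def build_pairs(utterances):
    """Split all unordered pairs into cross-lingual and single-language groups.

    Each pair is (text_a, text_b, same_intent), so both groups contain
    same-intent and different-intent pairs (the "multi-intent" part).
    """
    cross_lingual, single_language = [], []
    for a, b in combinations(utterances, 2):
        pair = (a.text, b.text, a.intent == b.intent)
        if a.language == b.language:
            single_language.append(pair)
        else:
            cross_lingual.append(pair)
    return cross_lingual, single_language


# Hypothetical labeled chat-log utterances.
examples = [
    Utterance("reset my password", "password_reset", "en"),
    Utterance("restablecer mi contraseña", "password_reset", "es"),
    Utterance("where is my order", "order_status", "en"),
    Utterance("dónde está mi pedido", "order_status", "es"),
]
cross_lingual, single_language = build_pairs(examples)
```

A training loop could then draw from `cross_lingual` in one phase and from `single_language` in another, which mirrors the two training steps named in the abstract without claiming to reproduce them.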
-
Publication No.: US20220391600A1
Publication date: 2022-12-08
Application No.: US17303728
Filing date: 2021-06-07
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventor: Haode Qi, Lin Pan, Abhishek Shah, Ladislav Kunc, Saloni Potdar
Abstract: A method, computer system, and computer program product for multi-lingual chatlog training are provided. The embodiment may include receiving, by a processor, a plurality of data related to conversational data in multiple languages. The embodiment may also include assigning an intent label to each conversational data. The embodiment may further include assigning a language label to each conversational data. The embodiment may also include pairing the plurality of the data related to the conversational data according to the intent label and the language label. The embodiment may further include training a machine learning model using a multi-lingual and multi-intent conversational data pairing. The embodiment may also include training the machine learning model using a single language and multi-intent conversational data pairing.
-
Publication No.: US11423333B2
Publication date: 2022-08-23
Application No.: US16829055
Filing date: 2020-03-25
Applicant: International Business Machines Corporation
Inventor: Haode Qi, Ming Tan, Ladislav Kunc, Saloni Potdar
Abstract: Mechanisms are provided for optimizing an automated machine learning (AutoML) operation to configure parameters of a machine learning model. AutoML logic is configured based on an initial default value and initial range for sampling of a parameter of the machine learning (ML) model and an initial AutoML process is executed on the ML model based on a plurality of datasets comprising a plurality of domains of data elements, utilizing the initially configured AutoML logic. For each domain, a cross-dataset default value and cross-dataset value range are derived from results of the execution of the initial AutoML process. For each domain, an entry is stored in a data structure, the entry storing the derived cross-dataset default value and cross-dataset value range for the domain. The AutoML logic performs a subsequent AutoML process on a new dataset based on one or more entries of the data structure.
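The per-domain data structure described in this abstract might look like the sketch below. The abstract does not state how the cross-dataset default and range are derived, so the median and the min/max of the best values found on each dataset are assumptions chosen purely for illustration. The function names, the parameter, and the sample numbers are all hypothetical.

```python
from statistics import median


def derive_domain_entries(best_values_by_domain):
    """Build one entry per domain from the best parameter values that an
    initial AutoML pass found on each dataset of that domain.

    Assumption: default = median of the best values, range = (min, max).
    """
    entries = {}
    for domain, values in best_values_by_domain.items():
        entries[domain] = {
            "default": median(values),
            "range": (min(values), max(values)),
        }
    return entries


def configure_for(domain, entries, fallback):
    """Pick the stored entry for a new dataset's domain, else the initial config."""
    return entries.get(domain, fallback)


# Hypothetical results: best value of one integer parameter per dataset.
best_values = {
    "banking": [50, 100, 200],
    "travel": [10, 20, 30, 40],
}
initial_config = {"default": 100, "range": (1, 1000)}

entries = derive_domain_entries(best_values)
banking_config = configure_for("banking", entries, initial_config)
unknown_config = configure_for("retail", entries, initial_config)
```

The point of the narrower stored range is that a subsequent AutoML run on a new dataset from a known domain can sample from a smaller, better-centered interval than the initial one.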
-
Publication No.: US20210304055A1
Publication date: 2021-09-30
Application No.: US16829055
Filing date: 2020-03-25
Applicant: International Business Machines Corporation
Inventor: Haode Qi, Ming Tan, Ladislav Kunc, Saloni Potdar
Abstract: Mechanisms are provided for optimizing an automated machine learning (AutoML) operation to configure parameters of a machine learning model. AutoML logic is configured based on an initial default value and initial range for sampling of a parameter of the machine learning (ML) model and an initial AutoML process is executed on the ML model based on a plurality of datasets comprising a plurality of domains of data elements, utilizing the initially configured AutoML logic. For each domain, a cross-dataset default value and cross-dataset value range are derived from results of the execution of the initial AutoML process. For each domain, an entry is stored in a data structure, the entry storing the derived cross-dataset default value and cross-dataset value range for the domain. The AutoML logic performs a subsequent AutoML process on a new dataset based on one or more entries of the data structure.
-
Publication No.: US11120225B2
Publication date: 2021-09-14
Application No.: US16267951
Filing date: 2019-02-05
Applicant: International Business Machines Corporation
Inventor: Ming Tan, Ladislav Kunc, Yang Yu, Haoyu Wang, Saloni Potdar
Abstract: An online version of a sentence representation generation module is updated by training a first sentence representation generation module using first labeled data of a first corpus. After training the first sentence representation generation module using the first labeled data, a second corpus of second labeled data is obtained. The second corpus is distinct from the first corpus. A subset of the first labeled data is identified based on similarities between the first corpus and the second corpus. A second sentence representation generation module is trained using the second labeled data of the second corpus and the subset of the first labeled data.
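The subset-selection step could be illustrated as below. The abstract does not name a similarity measure, so this sketch substitutes a simple token-overlap score: a first-corpus example is kept when at least half of its tokens also occur in the second corpus. The function names, the threshold, and the sample sentences are hypothetical.

```python
def tokens(text):
    """Lowercased whitespace tokens as a set."""
    return set(text.lower().split())


def select_similar(first_labeled, second_labeled, threshold=0.5):
    """Keep first-corpus examples whose tokens mostly appear in the second corpus.

    Assumption: similarity = fraction of an example's tokens found in the
    second corpus vocabulary (a stand-in for whatever measure is really used).
    """
    second_vocab = set().union(*(tokens(text) for text, _ in second_labeled))
    selected = []
    for text, label in first_labeled:
        toks = tokens(text)
        overlap = len(toks & second_vocab) / len(toks) if toks else 0.0
        if overlap >= threshold:
            selected.append((text, label))
    return selected


# Hypothetical labeled corpora.
first_corpus = [
    ("book a flight to paris", "travel"),
    ("what is my card balance", "banking"),
    ("cancel my flight", "travel"),
]
second_corpus = [
    ("change my flight date", "travel"),
    ("book a hotel", "travel"),
]

subset = select_similar(first_corpus, second_corpus)
# Training data for the second module: second corpus plus the selected subset.
training_set = second_corpus + subset
```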
-
Publication No.: US20210141860A1
Publication date: 2021-05-13
Application No.: US16679464
Filing date: 2019-11-11
Applicant: International Business Machines Corporation
Inventor: Panos Karagiannis, Ladislav Kunc, Saloni Potdar, Haoyu Wang, Navneet N. Rao
Abstract: Provided is a method, system, and computer program product for context-dependent spellchecking. The method comprises receiving context data to be used in spell checking. The method further comprises receiving a user input. The method further comprises identifying an out-of-vocabulary (OOV) word in the user input. An initial suggestion pool of candidate words is identified based, at least in part, on the context data. The method then comprises using a noisy channel approach to evaluate a probability that one or more of the candidate words of the initial suggestion pool is an intended word and should be used as a candidate for replacement of the OOV word. The method further comprises selecting one or more candidate words for replacement of the OOV word. The method further comprises outputting the one or more candidates.
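A noisy channel ranking of the kind mentioned in this abstract scores each candidate w for an out-of-vocabulary word x by P(x | w) * P(w). The sketch below is a minimal illustration under stated assumptions, not the claimed method: the prior P(w) is the word's relative frequency in the context data, the channel term P(x | w) is approximated by exp(-edit distance), and the suggestion pool is every context word within two edits. The function names and the sample context are hypothetical.

```python
import math
from collections import Counter


def edit_distance(a, b):
    """Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def rank_candidates(oov, context_text, max_distance=2):
    """Rank context words as replacements for an out-of-vocabulary word."""
    counts = Counter(context_text.lower().split())
    total = sum(counts.values())
    scored = []
    for word, count in counts.items():
        distance = edit_distance(oov, word)
        if distance > max_distance:
            continue  # not in the suggestion pool
        prior = count / total          # P(w), estimated from context data
        channel = math.exp(-distance)  # assumed stand-in for P(x | w)
        scored.append((word, prior * channel))
    return sorted(scored, key=lambda item: item[1], reverse=True)


context = "check my account balance and transfer money from my savings account"
suggestions = rank_candidates("acount", context)
```

Because the prior comes from the context data, the same misspelling can resolve to different words in different deployments, which is the context-dependent behavior the abstract describes.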
-
Publication No.: US20200089773A1
Publication date: 2020-03-19
Application No.: US16131940
Filing date: 2018-09-14
Applicant: International Business Machines Corporation
Inventor: Yang Yu, Ladislav Kunc, Saloni Potdar
Abstract: A method, system and computer program product are provided for implementing dynamic confidence rescaling for modularity in automatic user intent detection systems. User intents are identified using separately trained models with corresponding training data. Natural language processing (NLP) and statistical analysis are applied on the training data to classify the training data into groups and modules. A confidence rescaling algorithm is used for combining the modules. The dynamic confidence rescaling uses statistical information computed about each module being combined to identify user intents with enhanced accuracies in comparison to baseline models without confidence rescaling.
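To make the idea of rescaling confidences across separately trained modules concrete, here is a small sketch. The abstract does not give the rescaling formula, so a z-score against each module's own confidence statistics is used as an assumed stand-in; the module names, statistics, and predictions are hypothetical.

```python
from statistics import mean, pstdev


def module_stats(confidences):
    """Mean and population standard deviation of a module's confidences."""
    return mean(confidences), pstdev(confidences)


def rescale(confidence, stats):
    """Express a raw confidence relative to its module's typical output."""
    mu, sigma = stats
    return (confidence - mu) / (sigma or 1.0)  # guard against zero spread


def combine(predictions, stats_by_module):
    """Return the (intent, rescaled score) with the highest rescaled score.

    predictions maps module name -> (intent, raw confidence).
    """
    rescaled = [
        (intent, rescale(conf, stats_by_module[module]))
        for module, (intent, conf) in predictions.items()
    ]
    return max(rescaled, key=lambda item: item[1])


# Hypothetical statistics: module "a" is habitually overconfident.
stats_by_module = {
    "a": module_stats([0.90, 0.95, 0.85]),
    "b": module_stats([0.50, 0.60, 0.40]),
}
predictions = {
    "a": ("billing", 0.88),   # high raw score, but below this module's norm
    "b": ("shipping", 0.62),  # lower raw score, but unusually high for "b"
}
best_intent, best_score = combine(predictions, stats_by_module)
```

Comparing raw confidences would pick "billing" here; after rescaling, "shipping" wins because 0.62 is well above what module "b" usually produces.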
-
Publication No.: US20180089584A1
Publication date: 2018-03-29
Application No.: US15279250
Filing date: 2016-09-28
Applicant: International Business Machines Corporation
Inventor: Raimo Bakis, Ladislav Kunc, David Nahamoo, Lazaros Polymenakos, John Zakos
Abstract: Embodiments provide a computer implemented method, in a data processing system comprising a processor and a memory comprising instructions which are executed by the processor to cause the processor to train an enhanced chatflow system, the method comprising: ingesting, using a rule-based module, a corpus of information comprising at least one user input node corresponding to a user question and at least one expert-designed variation for each user input node; matching, using the rule-based module, one or more user inputs to one or more corresponding dialog nodes using regular expressions and delimiters; ingesting, using a statistical matching module, one or more usage logs from a deployed dialog system, each usage log comprising at least one user input node; for each user input node: designating the node as a class; storing the node in a dialog node repository; designating each of the at least one variations as training examples for the designated class; converting the classes and the training examples into feature vector representations; training one or more classifiers using the one or more feature vector representations of the classes; training classification objectives using the one or more feature vector representations of the training examples; and incorporating the training of the classifiers and the classification objectives into the enhanced chatflow system.