Natural language processing payload generation

    Publication No.: US11556705B2

    Publication date: 2023-01-17

    Application No.: US17083510

    Application date: 2020-10-29

    Abstract: An input text that is also transmitted to a text processing service (e.g., a cloud based text processing service) is received. Characterizing information (e.g., contiguous parts of speech, terms used per part of speech, payload length, etc.) is extracted from the input text. A text payload is generated using the characterizing information. A performance test is run on the text payload. The performance test can include performing at least one selected from a group consisting of: sentiment analysis on the text payload, entity analysis on the text payload, content classification on the text payload, and syntax analysis on the text payload. The performance test can yield a processing time required to perform the performance test. Memory and processing power resource allocation to the text processing service can be altered based on the processing time of the performance test.
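
The flow in this abstract (extract characterizing information, generate a payload from it, time an analysis run on the payload) can be sketched roughly as follows. This is a minimal illustration and not the patented method: word counts stand in for part-of-speech statistics, and `toy_sentiment` is a hypothetical placeholder for a real text processing service. All function names are assumptions made for the example.

```python
import random
import time
from collections import Counter


def characterize(text):
    """Extract simple characterizing information from an input text.

    Assumption: payload length in words and per-term counts stand in for
    the richer features (e.g. parts of speech) named in the abstract.
    """
    words = text.split()
    return {"length": len(words),
            "term_counts": Counter(w.lower() for w in words)}


def generate_payload(info, seed=0):
    """Sample a synthetic payload with the same length and term distribution."""
    terms = list(info["term_counts"].keys())
    if not terms:
        return ""
    weights = list(info["term_counts"].values())
    rng = random.Random(seed)
    return " ".join(rng.choices(terms, weights=weights, k=info["length"]))


def run_performance_test(payload, analyze):
    """Run one analysis on the payload and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = analyze(payload)
    return result, time.perf_counter() - start


def toy_sentiment(text):
    # Hypothetical stand-in for a sentiment analysis call.
    positive, negative = {"good", "great"}, {"bad", "poor"}
    words = text.lower().split()
    return sum(w in positive for w in words) - sum(w in negative for w in words)


info = characterize("good service but bad latency good")
payload = generate_payload(info)
score, seconds = run_performance_test(payload, toy_sentiment)
```

The measured `seconds` value is the kind of signal the abstract says can drive changes to memory and processing allocation; how that decision is made is not specified here and is left out of the sketch.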

    Method and System for Unlabeled Data Selection Using Failed Case Analysis

    Publication No.: US20210326719A1

    Publication date: 2021-10-21

    Application No.: US16850985

    Application date: 2020-04-16

    Abstract: A method, system, and a computer program product automatically select training data for updating a model by applying human-annotated training data to a model to generate results that are evaluated to identify correct case results and false case results that are categorized into error type categories for use in building error models corresponding to the error type categories, where each error model is built from at least failed case results belonging to a corresponding error type, and where unlabeled data samples are applied to each error model to compute an error likelihood for each unlabeled data sample with respect to each error type category, thereby enabling the selection and display of unlabeled data samples for annotation by a subject matter expert based on a computed error likelihood for the one or more unlabeled data samples in a specified error type category meeting or exceeding an error threshold requirement.
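
The selection step in this abstract can be illustrated with a small sketch: build one error model per error type from failed cases, score each unlabeled sample against a chosen error type, and keep the samples whose score meets a threshold. The "error model" below is only a vocabulary of tokens seen in failed cases, and the likelihood is a token-overlap ratio; both are assumptions chosen to keep the example self-contained, not the models the patent describes.

```python
from collections import defaultdict


def build_error_models(failed_cases):
    """Build one toy error model (a token vocabulary) per error type.

    failed_cases: iterable of (text, error_type) pairs.
    """
    models = defaultdict(set)
    for text, error_type in failed_cases:
        models[error_type].update(text.lower().split())
    return dict(models)


def error_likelihood(sample, vocab):
    """Fraction of the sample's tokens that also occur in the failed cases."""
    tokens = sample.lower().split()
    if not tokens:
        return 0.0
    return sum(t in vocab for t in tokens) / len(tokens)


def select_for_annotation(unlabeled, models, error_type, threshold):
    """Return unlabeled samples whose likelihood meets or exceeds the threshold,
    highest likelihood first."""
    vocab = models[error_type]
    scored = [(error_likelihood(s, vocab), s) for s in unlabeled]
    return [s for score, s in sorted(scored, reverse=True) if score >= threshold]


failed = [("not bad at all", "negation"),
          ("never good", "negation"),
          ("sick beat", "slang")]
models = build_error_models(failed)
unlabeled = ["not good at all", "lovely weather today", "sick ride"]
to_annotate = select_for_annotation(unlabeled, models, "negation", 0.5)
```

In this toy run, only the sample that resembles the "negation" failures is surfaced for annotation by a subject matter expert.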

    Model training using a teacher-student learning paradigm

    Publication No.: US11526802B2

    Publication date: 2022-12-13

    Application No.: US16451693

    Application date: 2019-06-25

    Abstract: A method and a system for model training are provided. The method can include training a first classifier, a second classifier, and a third classifier with subsets of a labeled dataset. The method can also include predicting a pseudo labeled dataset from an unlabeled dataset using the first classifier, the second classifier, and the third classifier. The method further includes assigning a role to the first classifier, to the second classifier, and to the third classifier. The method can further include selecting a teaching sample dataset from the pseudo labeled dataset based on the role assigned to the third classifier, wherein the third classifier is assigned a role of a student. The method can also include retraining the third classifier with the teaching sample dataset in conjunction with a subset of the labeled dataset.
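
A rough sketch of the teacher-student round described above, under stated assumptions: the three classifiers are toy one-dimensional nearest-centroid models, the third is given the student role, and a pseudo-labeled sample is chosen as a teaching sample when the two teachers agree on its label. The abstract does not spell out the selection criterion, so the agreement rule, `CentroidClassifier`, and `teach_student` are all illustrative choices.

```python
class CentroidClassifier:
    """Toy 1-D nearest-centroid classifier (illustrative stand-in)."""

    def fit(self, data):
        # data: list of (feature, label) pairs
        sums, counts = {}, {}
        for x, y in data:
            sums[y] = sums.get(y, 0.0) + x
            counts[y] = counts.get(y, 0) + 1
        self.centroids = {y: sums[y] / counts[y] for y in sums}
        return self

    def predict(self, x):
        return min(self.centroids, key=lambda y: abs(x - self.centroids[y]))


def teach_student(labeled_subsets, unlabeled, student_index=2):
    """Run one teaching round and return (student, teaching_samples)."""
    # Train one classifier per labeled subset.
    classifiers = [CentroidClassifier().fit(s) for s in labeled_subsets]
    # Predict pseudo labels for every unlabeled sample with all classifiers.
    pseudo = [(x, [c.predict(x) for c in classifiers]) for x in unlabeled]
    # Assign roles: one student, the remaining two act as teachers.
    teacher_ids = [i for i in range(len(classifiers)) if i != student_index]
    # Assumed rule: keep samples on which both teachers agree.
    teaching = [(x, labels[teacher_ids[0]]) for x, labels in pseudo
                if labels[teacher_ids[0]] == labels[teacher_ids[1]]]
    # Retrain the student on teaching samples plus its labeled subset.
    student = classifiers[student_index]
    student.fit(teaching + list(labeled_subsets[student_index]))
    return student, teaching


subsets = [[(0.0, "a"), (10.0, "b")],
           [(1.0, "a"), (9.0, "b")],
           [(4.0, "a"), (6.0, "b")]]
student, teaching = teach_student(subsets, unlabeled=[2.0, 8.0])
```

Roles could be rotated so that each classifier takes a turn as the student; the sketch shows a single round only.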

    Method and system for unlabeled data selection using failed case analysis

    Publication No.: US11443209B2

    Publication date: 2022-09-13

    Application No.: US16850985

    Application date: 2020-04-16

    Abstract: A method, system, and a computer program product automatically select training data for updating a model by applying human-annotated training data to a model to generate results that are evaluated to identify correct case results and false case results that are categorized into error type categories for use in building error models corresponding to the error type categories, where each error model is built from at least failed case results belonging to a corresponding error type, and where unlabeled data samples are applied to each error model to compute an error likelihood for each unlabeled data sample with respect to each error type category, thereby enabling the selection and display of unlabeled data samples for annotation by a subject matter expert based on a computed error likelihood for the one or more unlabeled data samples in a specified error type category meeting or exceeding an error threshold requirement.
