Systems and methods for named entity recognition

    公开(公告)号:US11436481B2

    公开(公告)日:2022-09-06

    申请号:US16134957

    申请日:2018-09-18

    Abstract: A method for natural language processing includes receiving, by one or more processors, an unstructured text input. An entity classifier is used to identify entities in the unstructured text input. The identifying the entities includes generating, using a plurality of sub-classifiers of a hierarchical neural network classifier of the entity classifier, a plurality of lower-level entity identifications associated with the unstructured text input. The identifying the entities further includes generating, using a combiner of the hierarchical neural network classifier, a plurality of higher-level entity identifications associated with the unstructured text input based on the plurality of lower-level entity identifications. Identified entities are provided based on the plurality of higher-level entity identifications.

    Systems and Methods for Out-of-Distribution Classification

    公开(公告)号:US20210150365A1

    公开(公告)日:2021-05-20

    申请号:US16877325

    申请日:2020-05-18

    Abstract: An embodiment provided herein preprocesses the input samples to the classification neural network, e.g., by adding Gaussian noise to word/sentence representations to make the function of the neural network satisfy Lipschitz property such that a small change in the input does not cause much change to the output if the input sample is in-distribution. Method to induce properties in the feature representation of neural network such that for out-of-distribution examples the feature representation magnitude is either close to zero or the feature representation is orthogonal to all class representations. Method to generate examples that are structurally similar to in-domain and semantically out-of domain for use in out-of-domain classification training. Method to prune feature representation dimension to mitigate long tail error of unused dimension in out-of-domain classification. Using these techniques, the accuracy of both in-domain and out-of-distribution identification can be improved.

Patent Agency Ranking