Semantic linkage qualification of ontologically related entities

    Publication No.: US11481561B2

    Publication date: 2022-10-25

    Application No.: US16940625

    Application date: 2020-07-28

    Abstract: Aspects of the present disclosure include determining, by a processor, an ontology, the ontology comprising a plurality of ontological relationships, receiving, by the processor, a plurality of passages, determining, by the processor, a target set of co-occurring entities comprising a first entity and a second entity, determining a first passage in the plurality of passages that includes the first entity and the second entity, determining, from the ontology, a first ontological relationship between the first entity and the second entity, analyzing the first passage to determine a congruency score for the first ontological relationship, and generating a relationship annotation between the first entity and the second entity in the first passage based on the congruency score being within a threshold.

    SEMANTIC LINKAGE QUALIFICATION OF ONTOLOGICALLY RELATED ENTITIES

    Publication No.: US20220036009A1

    Publication date: 2022-02-03

    Application No.: US16940625

    Application date: 2020-07-28

    Abstract: Aspects of the present disclosure include determining, by a processor, an ontology, the ontology comprising a plurality of ontological relationships, receiving, by the processor, a plurality of passages, determining, by the processor, a target set of co-occurring entities comprising a first entity and a second entity, determining a first passage in the plurality of passages that includes the first entity and the second entity, determining, from the ontology, a first ontological relationship between the first entity and the second entity, analyzing the first passage to determine a congruency score for the first ontological relationship, and generating a relationship annotation between the first entity and the second entity in the first passage based on the congruency score being within a threshold.

    PERFORMANCE CHARACTERISTICS OF CARTRIDGE ARTIFACTS OVER TEXT PATTERN CONSTRUCTS

    Publication No.: US20220012411A1

    Publication date: 2022-01-13

    Application No.: US16925537

    Application date: 2020-07-10

    Abstract: Embodiments of the present invention are directed to evaluating the performance characteristics of annotator configurations against text pattern constructs in unstructured text. In a non-limiting embodiment of the invention, unstructured text is received by a processor. A text pattern construct is identified in the unstructured text and a first performance characteristic of an annotator is determined based on the text pattern construct. The text pattern construct is converted to a natural language text and a second performance characteristic of the annotator is determined based on the natural language text. A delta is determined between the first performance characteristic and the second performance characteristic. An alternative annotator configuration is identified for a portion of the unstructured text comprising the text pattern construct.

    Sliding window to detect entities in corpus using natural language processing

    Publication No.: US11222165B1

    Publication date: 2022-01-11

    Application No.: US16996394

    Application date: 2020-08-18

    IPC classification: G06F40/166 G06F40/279

    Abstract: According to one or more embodiments of the present invention, an input request to a natural language processing (NLP) system is optimized. A window-size is selected for annotating an input corpus. The corpus is divided into partitions of the window-size, each partition processed separately. Further, a first set of entities is identified in a first partition, and a second set of entities in a second partition. Further, a third partition containing a first segment and a second segment is determined. The first segment overlaps the first partition, and the second segment overlaps the second partition. The method further includes identifying a third set of entities in the third partition. In response to the third set of entities being distinct from a set of entities from the first segment and the second segment, the window-size is adjusted. The input request for the NLP system is generated using the adjusted window-size.
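The window-size adjustment described in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the patent: the toy vocabulary-lookup annotator `find_entities`, the helper names `boundary_misses` and `choose_window`, the choice to centre the third partition on each boundary, and the doubling rule used to adjust the window.

```python
from typing import List, Set, Tuple

Entity = Tuple[str, int]  # (surface text, absolute start offset in the corpus)


def find_entities(text: str, offset: int, vocab: Set[str]) -> Set[Entity]:
    """Toy annotator (hypothetical): report every vocabulary term found in text."""
    found: Set[Entity] = set()
    for term in vocab:
        start = text.find(term)
        while start != -1:
            found.add((term, offset + start))
            start = text.find(term, start + 1)
    return found


def boundary_misses(corpus: str, window: int, vocab: Set[str]) -> Set[Entity]:
    """Entities seen in boundary-straddling windows but not in the plain partitions."""
    parts: List[Tuple[int, str]] = [
        (i, corpus[i:i + window]) for i in range(0, len(corpus), window)
    ]
    base: Set[Entity] = set()
    for off, text in parts:
        base |= find_entities(text, off, vocab)

    half = window // 2
    straddle: Set[Entity] = set()
    for off, _ in parts[1:]:
        # Third partition: overlaps the end of one partition and the start of the next.
        start = off - half
        straddle |= find_entities(corpus[start:start + window], start, vocab)
    return straddle - base


def choose_window(corpus: str, window: int, vocab: Set[str], max_window: int) -> int:
    """Double the window (an assumed policy) until no entity is split by a boundary."""
    while window < max_window and boundary_misses(corpus, window, vocab):
        window *= 2
    return min(window, max_window)


corpus = "aspirin treats headache quickly"
vocab = {"aspirin", "headache"}
adjusted = choose_window(corpus, 10, vocab, max_window=64)
```

With a window of 10 characters, "headache" is cut by a partition boundary, so only the straddling window sees it; that mismatch is what triggers the adjustment in this sketch.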

    RELEVANCE APPROXIMATION OF PASSAGE EVIDENCE

    Publication No.: US20210406294A1

    Publication date: 2021-12-30

    Application No.: US16910159

    Application date: 2020-06-24

    Abstract: Aspects of the invention include receiving a search query from a user computing device; retrieving a set of passages based on the search query, wherein each passage contains passage evidence and an annotation embedded as metadata; scoring each annotation and each passage evidence, where each annotation score is based on a feature vector of the annotation and the search query, and where each passage evidence score is based on a feature vector of the passage evidence and the search query; ranking each passage based on a passage evidence score and a score of one annotation contained in the passage; and returning a ranked list of the passages to the user computing device.
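As a rough illustration of the scoring and ranking steps in this abstract, the sketch below scores passage evidence and annotations against the query and sorts the passages. The abstract does not say how feature vectors are built or how the two scores are combined, so the bag-of-words vectors, cosine similarity, the use of the best-scoring annotation, and the simple sum are all assumptions; the names `Passage`, `vec`, `cosine`, and `rank` are hypothetical.

```python
import math
from collections import Counter
from dataclasses import dataclass
from typing import List


def vec(text: str) -> Counter:
    """Assumed feature vector: lower-cased bag of words."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


@dataclass
class Passage:
    evidence: str           # the passage text
    annotations: List[str]  # annotations embedded as metadata


def rank(query: str, passages: List[Passage]) -> List[Passage]:
    """Order passages by evidence score plus the score of one (the best) annotation."""
    q = vec(query)

    def score(p: Passage) -> float:
        evidence_score = cosine(vec(p.evidence), q)
        annotation_score = max(
            (cosine(vec(a), q) for a in p.annotations), default=0.0
        )
        return evidence_score + annotation_score  # assumed combination rule

    return sorted(passages, key=score, reverse=True)


p1 = Passage("aspirin dosage for adults", ["dosage"])
p2 = Passage("weather report", ["forecast"])
ranked = rank("aspirin dosage", [p2, p1])
```

Here `p1` shares terms with the query in both its evidence and its annotation, so it is ranked ahead of `p2`.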

    PROPAGATION OF ANNOTATION METADATA TO OVERLAPPING ANNOTATIONS OF SYNONYMOUS TYPE

    Publication No.: US20210081496A1

    Publication date: 2021-03-18

    Application No.: US16574167

    Application date: 2019-09-18

    IPC classification: G06F17/27 G06F16/33

    Abstract: Aspects of the invention include systems and methods for the propagation of annotation metadata to overlapping annotations of a synonymous type. A non-limiting example computer-implemented method includes performing a comparison of a set of annotations to detect a subset of annotations that are candidates of being synonymous based on a first analysis. Whether a first annotation of the subset of annotations is synonymous with a second annotation of the subset of annotations is determined based on a second analysis. Distinct annotation metadata of the first annotation are cross-propagated with annotation metadata of the second annotation based on the second analysis.
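The two-stage check and cross-propagation in this abstract might look roughly like the sketch below. The abstract does not specify either analysis, so this sketch assumes span overlap as the first analysis and a lookup in a hand-written table of synonymous types as the second; `Annotation`, `SYNONYMOUS_TYPES`, `overlaps`, `synonymous`, and `cross_propagate` are hypothetical names, and existing metadata keys are never overwritten.

```python
from dataclasses import dataclass, field
from itertools import combinations
from typing import Dict, List

# Hypothetical table of type pairs treated as synonymous.
SYNONYMOUS_TYPES = {frozenset({"Drug", "Medication"})}


@dataclass
class Annotation:
    begin: int
    end: int
    type: str
    metadata: Dict[str, object] = field(default_factory=dict)


def overlaps(a: Annotation, b: Annotation) -> bool:
    """First analysis (assumed): the two spans share at least one character."""
    return a.begin < b.end and b.begin < a.end


def synonymous(a: Annotation, b: Annotation) -> bool:
    """Second analysis (assumed): same type, or types listed as synonymous."""
    return a.type == b.type or frozenset({a.type, b.type}) in SYNONYMOUS_TYPES


def cross_propagate(annotations: List[Annotation]) -> List[Annotation]:
    """Copy metadata keys that one annotation has and its synonym lacks, both ways."""
    for a, b in combinations(annotations, 2):
        if overlaps(a, b) and synonymous(a, b):
            for key, value in list(a.metadata.items()):
                b.metadata.setdefault(key, value)
            for key, value in list(b.metadata.items()):
                a.metadata.setdefault(key, value)
    return annotations


drug = Annotation(0, 7, "Drug", {"code": "X1"})
med = Annotation(0, 7, "Medication", {"negated": False})
symptom = Annotation(20, 25, "Symptom", {"severity": "mild"})
cross_propagate([drug, med, symptom])
```

After the call, `drug` and `med` carry both the `code` and `negated` keys, while `symptom` is left unchanged because it neither overlaps nor shares a synonymous type.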

    HANDLING FORM DATA ERRORS ARISING FROM NATURAL LANGUAGE PROCESSING

    Publication No.: US20220028502A1

    Publication date: 2022-01-27

    Application No.: US16934061

    Application date: 2020-07-21

    Abstract: Aspects include receiving a document and classifying at least a subset of the document as having a first type of data. Features are extracted from the document. The extracting includes initiating processing of the at least a subset of the document by a first processing engine that was previously trained to extract features from the first type of data. The extracting also includes initiating processing of a remaining portion of the document not included in the at least a subset of the document by a second processing engine that was previously trained to extract features from a second type of data. The first type of data is different than the second type of data. Features are received from one or both of the first processing engine and the second processing engine. The received features are stored as features of the document.