TRANSFORMING TABLES IN DOCUMENTS INTO KNOWLEDGE GRAPHS USING NATURAL LANGUAGE PROCESSING

    Publication No.: US20250036594A1

    Publication Date: 2025-01-30

    Application No.: US18226502

    Filing Date: 2023-07-26

    Abstract: A machine learning text classification pipeline infers natural language syntax and semantics from tabular data in a text document to prepare that data for graph analytics. In an embodiment, a computer infers a respective classification of each column in a table in a text document that contains natural language. Based on the classifications of the columns in the table, the vertices of a knowledge graph are automatically generated. Based on those column classifications and automatic analysis of a particular document portion that does not contain the table, the edges of the knowledge graph are automatically generated. The knowledge graph may be generated and operated as a property graph. Pipeline subsystems include column classification, edge type identification, and subject/object detection, which together provide enough semantic enrichment and context sensitivity to generate a more accurate knowledge graph faster than the state of the art.
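
The three stages named in the abstract (column classification, edge type identification, subject/object detection) can be illustrated with a toy Python sketch. It is not the patented method: the abstract describes machine learning classifiers, and simple keyword rules stand in for them here. The names `HEADER_TYPES`, `classify_column`, `infer_edge_type`, `build_graph`, and `PropertyGraph` are all hypothetical, as is the simplification that the first entity column is the subject and the remaining entity columns are objects.

```python
from dataclasses import dataclass, field

# Hypothetical stand-in for a learned column classifier: header keyword -> entity type.
HEADER_TYPES = {"employee": "Person", "company": "Organization", "city": "Location"}


@dataclass
class PropertyGraph:
    vertices: dict = field(default_factory=dict)  # vertex id -> {"label": str, "props": dict}
    edges: list = field(default_factory=list)     # (subject id, edge type, object id)


def classify_column(header):
    """Return an assumed entity type for a column, or 'Attribute' if none matches."""
    return HEADER_TYPES.get(header.strip().lower(), "Attribute")


def infer_edge_type(context):
    """Guess an edge type from document text outside the table (keyword rules only)."""
    text = context.lower()
    if "works for" in text or "employed" in text:
        return "WORKS_FOR"
    if "located in" in text:
        return "LOCATED_IN"
    return "RELATED_TO"


def build_graph(headers, rows, context):
    """Build vertices from column classes and edges from classes plus outside text."""
    classes = [classify_column(h) for h in headers]
    entity_cols = [i for i, c in enumerate(classes) if c != "Attribute"]
    edge_type = infer_edge_type(context)
    graph = PropertyGraph()
    for row in rows:
        # One vertex per distinct cell value in each entity column.
        ids = {}
        for i in entity_cols:
            ids[i] = f"{classes[i]}:{row[i]}"
            graph.vertices.setdefault(ids[i], {"label": classes[i], "props": {}})
        if not entity_cols:
            continue
        # Assumed subject/object rule: first entity column is the subject.
        subject = ids[entity_cols[0]]
        # Attribute columns become properties on the subject vertex.
        for i, c in enumerate(classes):
            if c == "Attribute":
                graph.vertices[subject]["props"][headers[i]] = row[i]
        # Remaining entity columns become objects of an edge from the subject.
        for i in entity_cols[1:]:
            edge = (subject, edge_type, ids[i])
            if edge not in graph.edges:
                graph.edges.append(edge)
    return graph


# Example input: a small table plus a sentence from outside the table.
headers = ["Employee", "Company", "Salary"]
rows = [["Ada", "Acme", "100"], ["Bob", "Acme", "90"]]
graph = build_graph(headers, rows, "Table 1 lists who works for each company.")
print(sorted(graph.vertices))
print(graph.edges)
```

Storing attribute columns as vertex properties, rather than as separate vertices, follows the abstract's remark that the knowledge graph may be operated as a property graph. A real pipeline would replace both keyword lookups with trained models.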
