-
Publication No.: US20210232908A1
Publication date: 2021-07-29
Application No.: US16751755
Filing date: 2020-01-24
Applicant: Adobe Inc.
Inventor: Yikun Xian, Tak Yeon Lee, Sungchul Kim, Ryan Rossi, Handong Zhao
IPC: G06N3/08
Abstract: Systems, methods, and non-transitory computer-readable media are disclosed for dynamically determining schema labels for columns regardless of information availability within the columns. For example, the disclosed systems can identify a column that contains an arbitrary amount of information (e.g., a header-only column, a cell-only column, or a whole column). Additionally, the disclosed systems can generate a vector embedding for an arbitrary input column by selectively using a header neural network and/or a cell neural network based on whether the column includes a header label and/or whether the column includes a populated column cell. Furthermore, the disclosed systems can compare the column vector embedding to schema vector embeddings of candidate schema labels in a d-dimensional space to determine a schema label for the column.
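
The selection logic in this abstract can be sketched in a few lines. This is a minimal illustration and not the patented implementation: the function names (`embed_column`, `nearest_schema_label`, `label_column`), the toy encoders, and the 2-dimensional schema embeddings are all hypothetical, and averaging the two embeddings and using cosine similarity are assumptions, since the abstract does not say how the header and cell embeddings are combined or compared.

```python
import math
from typing import Callable, Dict, List, Optional, Sequence

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def embed_column(
    header: Optional[str],
    cells: Sequence[str],
    header_net: Callable[[str], Vector],
    cell_net: Callable[[Sequence[str]], Vector],
) -> Vector:
    """Embed a column using whichever encoders apply to the available data."""
    has_header = bool(header and header.strip())
    populated = [c for c in cells if c and c.strip()]
    if not has_header and not populated:
        raise ValueError("column has neither a header label nor a populated cell")
    parts: List[Vector] = []
    if has_header:
        parts.append(header_net(header))
    if populated:
        parts.append(cell_net(populated))
    # Assumption: average the available embeddings element-wise.
    return [sum(vals) / len(parts) for vals in zip(*parts)]


def nearest_schema_label(column_vec: Vector, schema_vecs: Dict[str, Vector]) -> str:
    """Pick the candidate schema label whose embedding is most similar."""
    return max(schema_vecs, key=lambda label: cosine(column_vec, schema_vecs[label]))


# Toy stand-ins for the trained header and cell neural networks (hypothetical).
def toy_header_net(header: str) -> Vector:
    return [1.0, 0.0] if "mail" in header.lower() else [0.0, 1.0]


def toy_cell_net(cells: Sequence[str]) -> Vector:
    return [1.0, 0.0] if all("@" in c for c in cells) else [0.0, 1.0]


# Toy schema embeddings in a 2-dimensional space (hypothetical).
SCHEMA: Dict[str, Vector] = {"email": [1.0, 0.0], "name": [0.0, 1.0]}


def label_column(header: Optional[str], cells: Sequence[str]) -> str:
    vec = embed_column(header, cells, toy_header_net, toy_cell_net)
    return nearest_schema_label(vec, SCHEMA)


if __name__ == "__main__":
    print(label_column("E-mail", []))                   # header-only column
    print(label_column(None, ["a@example.org"]))        # cell-only column
    print(label_column("Full name", ["Ada", "Grace"]))  # whole column
```

Passing the encoders in as callables keeps the selection step independent of any particular network architecture.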
-
Publication No.: US20220100714A1
Publication date: 2022-03-31
Application No.: US17036453
Filing date: 2020-09-29
Applicant: ADOBE INC.
Inventor: Handong Zhao, Yikun Xian, Sungchul Kim, Tak Yeon Lee, Nikhil Belsare, Shashi Kant Rai, Vasanthi Holtcamp, Thomas Jacobs, Duy-Trung T. Dinh, Caroline Jiwon Kim
Abstract: Systems and methods for lifelong schema matching are described. The systems and methods include receiving data comprising a plurality of information categories, classifying each information category according to a schema comprising a plurality of classes, wherein the classification is performed by a neural network classifier trained based on a lifelong learning technique using a plurality of exemplar training sets, wherein each of the exemplar training sets includes a plurality of examples corresponding to one of the classes, and wherein the examples are selected based on a metric indicating how well each of the examples represents the corresponding class, and adding the data to a database based on the classification, wherein the database is organized according to the schema.
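
One way to read the exemplar-selection step is sketched below. The abstract only says that examples are chosen by a metric indicating how well they represent their class; using squared Euclidean distance to the class mean as that metric is an assumption made here for illustration, and the names `select_exemplars` and `build_exemplar_sets` are hypothetical.

```python
from typing import Dict, List

Vector = List[float]


def select_exemplars(features: List[Vector], k: int) -> List[int]:
    """Return indices of the k examples closest to the class mean.

    Assumption: "representativeness" is measured as squared Euclidean
    distance to the mean feature vector of the class (smaller is better).
    """
    if not features or k <= 0:
        return []
    dim = len(features[0])
    mean = [sum(f[i] for f in features) / len(features) for i in range(dim)]

    def sq_dist(idx: int) -> float:
        return sum((x - m) ** 2 for x, m in zip(features[idx], mean))

    return sorted(range(len(features)), key=sq_dist)[:k]


def build_exemplar_sets(
    features_by_class: Dict[str, List[Vector]], k: int
) -> Dict[str, List[Vector]]:
    """Build one exemplar set per schema class, keeping k examples each."""
    return {
        label: [feats[i] for i in select_exemplars(feats, k)]
        for label, feats in features_by_class.items()
    }


if __name__ == "__main__":
    feats = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [10.0, 10.0]]
    # Class mean is [3.25, 3.25], so indices 2 and 1 are the closest.
    print(select_exemplars(feats, 2))
```

In a lifelong-learning setup, exemplar sets like these would typically be replayed alongside data for newly added classes so the classifier does not forget earlier ones; the replay step itself is not shown.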
-
Publication No.: US20210264244A1
Publication date: 2021-08-26
Application No.: US16796681
Filing date: 2020-02-20
Applicant: Adobe Inc.
Inventor: Yikun Xian, Tak Yeon Lee, Sungchul Kim, Ryan Rossi, Handong Zhao
IPC: G06N3/08, G06F16/22, G06F16/901, G06F16/248, G06F16/2457, G06N5/02
Abstract: Systems, methods, and non-transitory computer-readable media are disclosed for generating explanatory paths for column annotations determined using a knowledge graph and a deep representation learning model. For instance, the disclosed systems can utilize a knowledge graph to generate an explanatory path for a column label determination from a deep representation learning model. For example, the disclosed systems can identify a column and determine a label for the column using a knowledge graph (e.g., a representation of a knowledge graph) that includes encodings of columns, column features, relational edges, and candidate labels. Then, the disclosed systems can determine a set of candidate paths between the column and the determined label for the column within the knowledge graph. Moreover, the disclosed systems can generate an explanatory path by ranking and selecting paths from the set of candidate paths using a greedy ranking and/or diversified ranking approach.
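
The path enumeration and the two ranking modes can be sketched as follows. This is an illustrative reading, not the claimed method: the graph contents, node names, and edge scores are invented; scoring a path by the product of its edge scores is an assumption; and the diversified variant uses a simple overlap penalty chosen here for illustration.

```python
from typing import Dict, List, Set

Graph = Dict[str, Dict[str, float]]  # node -> {neighbor: edge score in (0, 1]}


def candidate_paths(graph: Graph, source: str, target: str, max_hops: int = 3) -> List[List[str]]:
    """Enumerate simple paths from source to target with at most max_hops edges."""
    paths: List[List[str]] = []

    def dfs(node: str, path: List[str]) -> None:
        if node == target:
            paths.append(path)
            return
        if len(path) - 1 >= max_hops:
            return
        for nxt in graph.get(node, {}):
            if nxt not in path:  # keep paths simple (no repeated nodes)
                dfs(nxt, path + [nxt])

    dfs(source, [source])
    return paths


def path_score(graph: Graph, path: List[str]) -> float:
    """Assumption: a path's score is the product of its edge scores."""
    score = 1.0
    for a, b in zip(path, path[1:]):
        score *= graph[a][b]
    return score


def greedy_rank(graph: Graph, paths: List[List[str]], k: int) -> List[List[str]]:
    """Take the k highest-scoring paths."""
    return sorted(paths, key=lambda p: path_score(graph, p), reverse=True)[:k]


def diversified_rank(graph: Graph, paths: List[List[str]], k: int, penalty: float = 0.5) -> List[List[str]]:
    """Pick paths one at a time, discounting intermediate nodes already used."""
    chosen: List[List[str]] = []
    seen: Set[str] = set()
    remaining = list(paths)
    while remaining and len(chosen) < k:
        def adjusted(p: List[str]) -> float:
            overlap = sum(1 for n in p[1:-1] if n in seen)
            return path_score(graph, p) * (penalty ** overlap)

        best = max(remaining, key=adjusted)
        chosen.append(best)
        seen.update(best[1:-1])
        remaining.remove(best)
    return chosen


# Hypothetical knowledge graph: a column, two column features, a similar column, a label.
GRAPH: Graph = {
    "col": {"has_at_sign": 0.9, "header_mail": 0.6},
    "has_at_sign": {"email": 0.9, "similar_col": 0.8},
    "similar_col": {"email": 0.9},
    "header_mail": {"email": 0.7},
}

if __name__ == "__main__":
    paths = candidate_paths(GRAPH, "col", "email")
    print(greedy_rank(GRAPH, paths, 2))
    print(diversified_rank(GRAPH, paths, 2))
```

On this toy graph, the greedy ranking returns two paths that both pass through `has_at_sign`, while the diversified ranking replaces the second with the path through `header_mail`.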
-
Publication No.: US11995048B2
Publication date: 2024-05-28
Application No.: US17036453
Filing date: 2020-09-29
Applicant: ADOBE INC.
Inventor: Handong Zhao, Yikun Xian, Sungchul Kim, Tak Yeon Lee, Nikhil Belsare, Shashi Kant Rai, Vasanthi Holtcamp, Thomas Jacobs, Duy-Trung T Dinh, Caroline Jiwon Kim
IPC: G06F16/00, G06F16/21, G06F18/2115, G06F18/214, G06F18/2431, G06N3/08, G06V30/262
CPC classification number: G06F16/213, G06F18/2115, G06F18/2148, G06F18/2431, G06N3/08, G06V30/274
Abstract: Systems and methods for lifelong schema matching are described. The systems and methods include receiving data comprising a plurality of information categories, classifying each information category according to a schema comprising a plurality of classes, wherein the classification is performed by a neural network classifier trained based on a lifelong learning technique using a plurality of exemplar training sets, wherein each of the exemplar training sets includes a plurality of examples corresponding to one of the classes, and wherein the examples are selected based on a metric indicating how well each of the examples represents the corresponding class, and adding the data to a database based on the classification, wherein the database is organized according to the schema.
-
Publication No.: US11645523B2
Publication date: 2023-05-09
Application No.: US16796681
Filing date: 2020-02-20
Applicant: Adobe Inc.
Inventor: Yikun Xian, Tak Yeon Lee, Sungchul Kim, Ryan Rossi, Handong Zhao
IPC: G06F16/22, G06F16/2457, G06F16/248, G06F16/901, G06N3/08, G06N5/02
CPC classification number: G06N3/08, G06F16/221, G06F16/248, G06F16/24578, G06F16/9024, G06N5/02
Abstract: Systems, methods, and non-transitory computer-readable media are disclosed for generating explanatory paths for column annotations determined using a knowledge graph and a deep representation learning model. For instance, the disclosed systems can utilize a knowledge graph to generate an explanatory path for a column label determination from a deep representation learning model. For example, the disclosed systems can identify a column and determine a label for the column using a knowledge graph (e.g., a representation of a knowledge graph) that includes encodings of columns, column features, relational edges, and candidate labels. Then, the disclosed systems can determine a set of candidate paths between the column and the determined label for the column within the knowledge graph. Moreover, the disclosed systems can generate an explanatory path by ranking and selecting paths from the set of candidate paths using a greedy ranking and/or diversified ranking approach.
-
Publication No.: US11562234B2
Publication date: 2023-01-24
Application No.: US16751755
Filing date: 2020-01-24
Applicant: Adobe Inc.
Inventor: Yikun Xian, Tak Yeon Lee, Sungchul Kim, Ryan Rossi, Handong Zhao
IPC: G06N3/08
Abstract: Systems, methods, and non-transitory computer-readable media are disclosed for dynamically determining schema labels for columns regardless of information availability within the columns. For example, the disclosed systems can identify a column that contains an arbitrary amount of information (e.g., a header-only column, a cell-only column, or a whole column). Additionally, the disclosed systems can generate a vector embedding for an arbitrary input column by selectively using a header neural network and/or a cell neural network based on whether the column includes a header label and/or whether the column includes a populated column cell. Furthermore, the disclosed systems can compare the column vector embedding to schema vector embeddings of candidate schema labels in a d-dimensional space to determine a schema label for the column.
-