-
Publication number: US20230004977A1
Publication date: 2023-01-05
Application number: US17363515
Filing date: 2021-06-30
Applicant: Oracle International Corporation
Inventor: Miroslav Cepek, Iraklis Psaroudakis, Nina Corvelo Benz
Abstract: In an embodiment, a computer stores a bipartite graph that consists of a source subgraph and a target subgraph. Each vertex in the bipartite graph represents an entity. The source subgraph and the target subgraph are connected by many similarity edges. Each similarity edge indicates an original amount of similarity between the entity of a source vertex in the source subgraph and the entity of a target vertex in the target subgraph. For each similarity edge, the computer determines: a set of neighbor source vertices that are reachable from the source vertex of the similarity edge by traversing at most a source radius count of source edges in the source subgraph, a set of neighbor target vertices that are reachable from the target vertex of the similarity edge by traversing at most a target radius count of target edges in the target subgraph, and various amounts based on graph topology. For each similarity edge, the computer calculates a new amount of similarity based on those various amounts.
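The abstract does not state which topology-based amounts are computed or how they are combined, so the following is only a minimal sketch under stated assumptions. It finds the neighbors within a radius by breadth-first search, then blends the original similarity with a hypothetical "support" amount: the fraction of neighbor source vertices that have a similarity edge into the target neighborhood. The function names, the `weight` parameter, and the blending formula are illustrative assumptions, not taken from the patent.

```python
from collections import deque


def neighbors_within(adj, start, radius):
    """Return vertices reachable from `start` by at most `radius` edges (excluding `start`)."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        vertex, depth = frontier.popleft()
        if depth == radius:
            continue  # do not expand beyond the radius
        for nxt in adj.get(vertex, ()):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, depth + 1))
    seen.discard(start)
    return seen


def rescore(source_adj, target_adj, similarity,
            source_radius=1, target_radius=1, weight=0.5):
    """Compute a new similarity per edge (assumed formula, for illustration only)."""
    new_similarity = {}
    for (s, t), original in similarity.items():
        near_s = neighbors_within(source_adj, s, source_radius)
        near_t = neighbors_within(target_adj, t, target_radius)
        # Hypothetical topology amount: share of source neighbors that have
        # a similarity edge to some vertex in the target neighborhood.
        if near_s:
            supported = sum(
                1 for a in near_s if any((a, b) in similarity for b in near_t)
            )
            support = supported / len(near_s)
        else:
            support = 0.0
        new_similarity[(s, t)] = (1 - weight) * original + weight * support
    return new_similarity


source_adj = {"a": ["b"], "b": ["a"]}
target_adj = {"x": ["y"], "y": ["x"]}
similarity = {("a", "x"): 0.6, ("b", "y"): 0.8}
rescored = rescore(source_adj, target_adj, similarity)
```

In this toy graph each similarity edge is corroborated by the other one, so both scores move toward 1.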
-
Publication number: US20240330130A1
Publication date: 2024-10-03
Application number: US18740689
Filing date: 2024-06-12
Applicant: Oracle International Corporation
Inventor: Miroslav Cepek, Iraklis Psaroudakis, Rhicheek Patra, Timothy Trovatelli
CPC classification number: G06F11/1476, G06N3/04, G06V30/18181
Abstract: Herein is machine learning for anomalous graph detection based on graph embedding, shuffling, comparison, and unsupervised training techniques that can characterize an unfamiliar graph. In an embodiment, a computer obtains many known vectors that respectively represent known graphs. A new vector is generated that represents a new graph that contains multiple vertices. The new vector may contain an arithmetic aggregation of vertex vectors that respectively represent multiple vertices and/or a vector that represents a virtual vertex that is connected to the multiple vertices by respective virtual edges. In the many known vectors, some similar vectors that are similar to the new vector are identified. The new graph is automatically characterized based on a subset of the known graphs that the similar vectors represent.
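As a rough illustration of the flow described in this abstract, the sketch below builds a graph vector as an arithmetic aggregation of vertex vectors, finds the most similar known vectors, and characterizes the new graph from the graphs those vectors represent. The choice of the mean as the aggregation, cosine similarity as the comparison, a majority vote over the `k` nearest labels, and all names and sample data are assumptions made for this example; the abstract does not specify them.

```python
import math
from collections import Counter


def graph_vector(vertex_vectors):
    """Aggregate vertex vectors into one graph vector (assumed: element-wise mean)."""
    count = len(vertex_vectors)
    return [sum(column) / count for column in zip(*vertex_vectors)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def characterize(new_vector, known, k=3):
    """Label the new graph by majority vote over its k most similar known graphs."""
    ranked = sorted(known, key=lambda item: cosine(new_vector, item[0]), reverse=True)
    labels = [label for _, label in ranked[:k]]
    return Counter(labels).most_common(1)[0][0]


# Hypothetical known graph vectors with labels.
known = [
    ([1.0, 0.0], "normal"),
    ([0.9, 0.1], "normal"),
    ([0.0, 1.0], "anomalous"),
    ([0.1, 0.9], "anomalous"),
]
new_vector = graph_vector([[1.0, 0.0], [0.8, 0.2]])
label = characterize(new_vector, known)
```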
-
Publication number: US20240370500A1
Publication date: 2024-11-07
Application number: US18773452
Filing date: 2024-07-15
Applicant: Oracle International Corporation
Inventor: Aras Mumcuyan, Iraklis Psaroudakis, Miroslav Cepek, Rhicheek Patra
IPC: G06F16/903, G06F18/2113, G06F18/214, G06F18/22, G06F40/30, G06N3/045, G06N3/08, G06N5/04
Abstract: Techniques are described herein for a Name Matching Engine that integrates two Machine Learning (ML) module options. The first ML module is a feature-engineered classifier that boosts text-based name matching techniques with a binary classifier ML model. The feature-engineered classifier comprises a first stage of text-based candidate finding, and a second stage in which a binary classifier model predicts whether each string, of the candidate match list, is a match or not. The binary classifier model is based on features from two or more of: a name feature level, a word feature level, a character feature level, and an initial feature level. The second ML module of the Name Matching Engine comprises an end-to-end Recurrent Neural Network (RNN) model that directly accepts name strings as a sequence of n-grams and generates learned text embeddings. The text embeddings of matching name strings are close to each other in the feature space.
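The two-stage shape of the feature-engineered classifier can be sketched as follows. Stage one finds candidates with a text-based measure, and stage two derives one feature per level (name, word, character, initial) that a trained binary classifier would consume. The specific measure (Jaccard overlap of character trigrams), the threshold, the individual features, and all function names are assumptions chosen for illustration; the abstract does not specify them, and no classifier is trained here.

```python
def char_ngrams(text, n=3):
    """Set of character n-grams of the lowercased, space-padded text."""
    padded = f" {text.lower()} "
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def jaccard(a, b):
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def find_candidates(query, names, threshold=0.3):
    """Stage 1: keep names whose trigram overlap with the query meets the threshold."""
    query_grams = char_ngrams(query)
    return [name for name in names
            if jaccard(query_grams, char_ngrams(name)) >= threshold]


def features(query, candidate):
    """Stage 2 input: one illustrative feature per level named in the abstract."""
    query_words = query.lower().split()
    candidate_words = candidate.lower().split()
    return {
        "name_exact": float(query.lower() == candidate.lower()),         # name level
        "word_overlap": jaccard(set(query_words), set(candidate_words)), # word level
        "char_overlap": jaccard(char_ngrams(query), char_ngrams(candidate)),  # character level
        "initials_match": float(
            [w[0] for w in query_words] == [w[0] for w in candidate_words]
        ),                                                               # initial level
    }


candidates = find_candidates("Jon Smith", ["John Smith", "Jane Doe", "Jon Smyth"])
feature_rows = [features("Jon Smith", name) for name in candidates]
```

Each row in `feature_rows` would then be passed to a binary classifier that predicts match or no match for the corresponding candidate.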
-
Publication number: US20210287069A1
Publication date: 2021-09-16
Application number: US16989306
Filing date: 2020-08-10
Applicant: ORACLE INTERNATIONAL CORPORATION
Inventor: Aras Mumcuyan, Iraklis Psaroudakis, Miroslav Cepek, Rhicheek Patra
IPC: G06N3/04, G06F16/903, G06K9/62, G06N5/04, G06F40/30
Abstract: Techniques are described herein for a Name Matching Engine that integrates two Machine Learning (ML) module options. The first ML module is a feature-engineered classifier that boosts text-based name matching techniques with a binary classifier ML model. The feature-engineered classifier comprises a first stage of text-based candidate finding, and a second stage in which a binary classifier model predicts whether each string, of the candidate match list, is a match or not. The binary classifier model is based on features from two or more of: a name feature level, a word feature level, a character feature level, and an initial feature level. The second ML module of the Name Matching Engine comprises an end-to-end Recurrent Neural Network (RNN) model that directly accepts name strings as a sequence of n-grams and generates learned text embeddings. The text embeddings of matching name strings are close to each other in the feature space.
-
Publication number: US12079282B2
Publication date: 2024-09-03
Application number: US16989306
Filing date: 2020-08-10
Applicant: ORACLE INTERNATIONAL CORPORATION
Inventor: Aras Mumcuyan, Iraklis Psaroudakis, Miroslav Cepek, Rhicheek Patra
IPC: G06F40/00, G06F16/903, G06F18/2113, G06F18/214, G06F18/22, G06F40/30, G06N3/045, G06N3/08, G06N5/04
CPC classification number: G06F16/90344, G06F18/2113, G06F18/214, G06F18/22, G06F40/30, G06N3/045, G06N3/08, G06N5/04
Abstract: Techniques are described herein for a Name Matching Engine that integrates two Machine Learning (ML) module options. The first ML module is a feature-engineered classifier that boosts text-based name matching techniques with a binary classifier ML model. The feature-engineered classifier comprises a first stage of text-based candidate finding, and a second stage in which a binary classifier model predicts whether each string, of the candidate match list, is a match or not. The binary classifier model is based on features from two or more of: a name feature level, a word feature level, a character feature level, and an initial feature level. The second ML module of the Name Matching Engine comprises an end-to-end Recurrent Neural Network (RNN) model that directly accepts name strings as a sequence of n-grams and generates learned text embeddings. The text embeddings of matching name strings are close to each other in the feature space.
-
Publication number: US12050522B2
Publication date: 2024-07-30
Application number: US17577711
Filing date: 2022-01-18
Applicant: Oracle International Corporation
Inventor: Miroslav Cepek, Iraklis Psaroudakis, Rhicheek Patra, Timothy Trovatelli
CPC classification number: G06F11/1476, G06N3/04, G06V30/18181
Abstract: Herein is machine learning for anomalous graph detection based on graph embedding, shuffling, comparison, and unsupervised training techniques that can characterize an unfamiliar graph. In an embodiment, a computer obtains many known vectors that respectively represent known graphs. A new vector is generated that represents a new graph that contains multiple vertices. The new vector may contain an arithmetic aggregation of vertex vectors that respectively represent multiple vertices and/or a vector that represents a virtual vertex that is connected to the multiple vertices by respective virtual edges. In the many known vectors, some similar vectors that are similar to the new vector are identified. The new graph is automatically characterized based on a subset of the known graphs that the similar vectors represent.
-
Publication number: US20230229570A1
Publication date: 2023-07-20
Application number: US17577711
Filing date: 2022-01-18
Applicant: Oracle International Corporation
Inventor: Miroslav Cepek, Iraklis Psaroudakis, Rhicheek Patra, Timothy Trovatelli
CPC classification number: G06F11/1476, G06V30/18181, G06N3/04
Abstract: Herein is machine learning for anomalous graph detection based on graph embedding, shuffling, comparison, and unsupervised training techniques that can characterize an unfamiliar graph. In an embodiment, a computer obtains many known vectors that respectively represent known graphs. A new vector is generated that represents a new graph that contains multiple vertices. The new vector may contain an arithmetic aggregation of vertex vectors that respectively represent multiple vertices and/or a vector that represents a virtual vertex that is connected to the multiple vertices by respective virtual edges. In the many known vectors, some similar vectors that are similar to the new vector are identified. The new graph is automatically characterized based on a subset of the known graphs that the similar vectors represent.