-
Publication No.: US20210326636A1
Publication Date: 2021-10-21
Application No.: US16850735
Filing Date: 2020-04-16
Applicant: International Business Machines Corporation
Inventor: Nandana Mihindukulasooriya, Ruchi Mahindru, Md Faisal Mahbub Chowdhury, Yu Deng, Alfio Massimiliano Gliozzo, Sarthak Dash, Nicolas Rodolfo Fauceglia, Gaetano Rossiello
Abstract: One embodiment of the invention provides a method for terminology ranking for use in natural language processing. The method comprises receiving a list of terms extracted from a corpus, where the list comprises a ranking of the terms based on frequencies of the terms across the corpus. The method further comprises accessing a domain ontology associated with the corpus, and re-ranking the list based on the domain ontology. The resulting re-ranked list comprises a different ranking of the terms based on relevance of the terms using knowledge from the domain ontology. The method further comprises generating clusters of terms via a trained model adapted to the corpus, and boosting a rank of at least one term of the re-ranked list based on the clusters to increase a relevance of the at least one term using knowledge from the trained model.
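Illustration (not part of the patent record): the re-ranking flow in the abstract above could be sketched roughly as follows. This is a minimal sketch under stated assumptions. The function name rerank_terms, the multiplicative weights ontology_weight and cluster_boost, and the rule "boost a term that shares a cluster with an ontology term" are all hypothetical choices made for this example. The clusters are passed in as plain sets, standing in for the output of the trained model that the abstract mentions.

```python
def rerank_terms(term_freqs, ontology_terms, clusters,
                 ontology_weight=2.0, cluster_boost=1.5):
    """Re-rank frequency-ranked terms using an ontology, then boost by cluster.

    All weights and rules here are illustrative assumptions, not the
    patented method.
    """
    # Step 1: start from corpus frequency; up-weight terms found in the ontology.
    scores = {}
    for term, freq in term_freqs.items():
        score = float(freq)
        if term in ontology_terms:
            score *= ontology_weight
        scores[term] = score

    # Step 2: boost non-ontology terms that share a cluster with an ontology term.
    for cluster in clusters:
        if any(t in ontology_terms for t in cluster):
            for t in cluster:
                if t in scores and t not in ontology_terms:
                    scores[t] *= cluster_boost

    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda t: (-scores[t], t))


# Toy data (invented for illustration only).
term_freqs = {"page": 50, "server": 30, "hypervisor": 10, "click": 9, "vm": 8}
ontology_terms = {"server", "hypervisor"}
clusters = [{"hypervisor", "vm"}, {"page", "click"}]

reranked = rerank_terms(term_freqs, ontology_terms, clusters)
print(reranked)
```

In this toy run, "server" overtakes "page" because it appears in the ontology, and "vm" overtakes "click" because it shares a cluster with the ontology term "hypervisor".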
-
Publication No.: US20230009946A1
Publication Date: 2023-01-12
Application No.: US17373269
Filing Date: 2021-07-12
Applicant: International Business Machines Corporation
IPC: G06F16/2452, G06N7/00, G06N5/02, G06N20/00, G06F16/21, G06F16/901
Abstract: Systems, devices, computer-implemented methods, and/or computer program products that facilitate generative relation linking for question answering over knowledge bases. In one example, a system can comprise a processor that executes computer executable components stored in memory. The computer executable components can comprise a relation linking component. The relation linking component can map relations identified in a natural language question to corresponding relations of a knowledge base using a generative model.
-
Publication No.: US20210109995A1
Publication Date: 2021-04-15
Application No.: US16600774
Filing Date: 2019-10-14
Applicant: International Business Machines Corporation
Inventor: Nandana Mihindukulasooriya, Robert G. Farrell, Nicolas Rodolfo Fauceglia, Alfio Massimiliano Gliozzo
IPC: G06F17/27, G06F16/335, G06F16/901
Abstract: Systems and techniques that facilitate spurious relationship filtration from external knowledge graphs based on distributional semantics of an input corpus are provided. In one or more embodiments, a context component can generate a context-based word embedding of one or more first terms in a document collection. The embedding can yield vector representations of the one or more first terms. The one or more first terms can correspond to knowledge terms in one or more first nodes of a knowledge graph. In one or more embodiments, a filtering component can filter out a relationship between the one or more first nodes and a second node of the knowledge graph based on a similarity value being less than a threshold. The similarity value can be a function of the vector representations of the one or more first terms. In various embodiments, cosine similarity can be used to compute the similarity value.
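Illustration (not part of the patent record): the threshold test in the abstract above could look roughly like the sketch below. It makes a simplifying assumption that each knowledge-graph node maps to one term with one embedding vector, and that the similarity value is the cosine similarity between the two endpoint terms of an edge. The names cosine_similarity and filter_edges, the toy two-dimensional vectors, and the threshold of 0.5 are hypothetical. Edges whose terms have no embedding are kept, which is also an assumption of this sketch.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_u * norm_v)


def filter_edges(edges, embeddings, threshold):
    """Drop edges whose endpoint terms are less similar than the threshold."""
    kept = []
    for head, relation, tail in edges:
        if head not in embeddings or tail not in embeddings:
            kept.append((head, relation, tail))  # assumption: no evidence, keep
            continue
        similarity = cosine_similarity(embeddings[head], embeddings[tail])
        if similarity >= threshold:
            kept.append((head, relation, tail))
    return kept


# Toy corpus-derived embeddings (invented for illustration only).
embeddings = {
    "python": [0.9, 0.1],
    "programming": [0.8, 0.2],
    "snake": [0.1, 0.9],
}
edges = [
    ("python", "related_to", "programming"),
    ("python", "related_to", "snake"),
]

kept_edges = filter_edges(edges, embeddings, threshold=0.5)
print(kept_edges)
```

With these toy vectors the "python"/"snake" edge falls below the threshold and is filtered out, which mirrors the idea of removing relationships that the input corpus does not support.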
-
Publication No.: US12072841B2
Publication Date: 2024-08-27
Application No.: US18054984
Filing Date: 2022-11-14
Applicant: International Business Machines Corporation
Inventor: Gaetano Rossiello, Md Faisal Mahbub Chowdhury, Alfio Massimiliano Gliozzo, Nandana Mihindukulasooriya, Michael Robert Glass
CPC classification number: G06F16/16, G06F16/148, G06N20/00
Abstract: One or more systems, devices, computer program products and/or computer-implemented methods of use provided herein relate to a process for generating the classification of files to allow for file system organization and/or query augmentation. A system can comprise a memory that stores computer executable components, and a processor that executes the computer executable components stored in the memory, wherein the computer executable components can comprise a generating component that generates a keyphrase based on a context derived from evaluation of an input file, wherein the generating component employs a public repository of files annotated with a plurality of keyphrases, including the keyphrase, to generate the keyphrase based on the context, and an execution component that classifies the input file based on the keyphrase. In one or more embodiments, the input file can comprise a query, and classification of the input file can comprise augmenting the query based on the keyphrase.
-
Publication No.: US11868716B2
Publication Date: 2024-01-09
Application No.: US17462327
Filing Date: 2021-08-31
Applicant: International Business Machines Corporation
Inventor: Srinivas Ravishankar, Pavan Kapanipathi Bangalore, Ibrahim Abdelaziz, Nandana Mihindukulasooriya, Dinesh Garg, Salim Roukos, Alexander Gray
IPC: G06F40/20, G06F16/2452, G06N3/08, G06F16/901, G06N3/042
CPC classification number: G06F40/20, G06F16/24522, G06F16/9024, G06N3/042, G06N3/08
Abstract: One or more computer processors parse a received natural language question into an abstract meaning representation (AMR) graph. The one or more computer processors enrich the AMR graph into an extended AMR graph. The one or more computer processors transform the extended AMR graph into a query graph utilizing a path-based approach, wherein the query graph is a directed edge-labeled graph. The one or more computer processors generate one or more answers to the natural language question through one or more queries created utilizing the query graph.
-
Publication No.: US20240160607A1
Publication Date: 2024-05-16
Application No.: US18054984
Filing Date: 2022-11-14
Applicant: International Business Machines Corporation
Inventor: Gaetano Rossiello, Md Faisal Mahbub Chowdhury, Alfio Massimiliano Gliozzo, Nandana Mihindukulasooriya, Michael Robert Glass
CPC classification number: G06F16/16, G06F16/148, G06N20/00
Abstract: One or more systems, devices, computer program products and/or computer-implemented methods of use provided herein relate to a process for generating the classification of files to allow for file system organization and/or query augmentation. A system can comprise a memory that stores computer executable components, and a processor that executes the computer executable components stored in the memory, wherein the computer executable components can comprise a generating component that generates a keyphrase based on a context derived from evaluation of an input file, wherein the generating component employs a public repository of files annotated with a plurality of keyphrases, including the keyphrase, to generate the keyphrase based on the context, and an execution component that classifies the input file based on the keyphrase. In one or more embodiments, the input file can comprise a query, and classification of the input file can comprise augmenting the query based on the keyphrase.
-
Publication No.: US11526688B2
Publication Date: 2022-12-13
Application No.: US16850735
Filing Date: 2020-04-16
Applicant: International Business Machines Corporation
Inventor: Nandana Mihindukulasooriya, Ruchi Mahindru, Md Faisal Mahbub Chowdhury, Yu Deng, Alfio Massimiliano Gliozzo, Sarthak Dash, Nicolas Rodolfo Fauceglia, Gaetano Rossiello
Abstract: One embodiment of the invention provides a method for terminology ranking for use in natural language processing. The method comprises receiving a list of terms extracted from a corpus, where the list comprises a ranking of the terms based on frequencies of the terms across the corpus. The method further comprises accessing a domain ontology associated with the corpus, and re-ranking the list based on the domain ontology. The resulting re-ranked list comprises a different ranking of the terms based on relevance of the terms using knowledge from the domain ontology. The method further comprises generating clusters of terms via a trained model adapted to the corpus, and boosting a rank of at least one term of the re-ranked list based on the clusters to increase a relevance of the at least one term using knowledge from the trained model.
-
Publication No.: US20220129770A1
Publication Date: 2022-04-28
Application No.: US17079202
Filing Date: 2020-10-23
Applicant: International Business Machines Corporation
Inventor: Nandana Mihindukulasooriya, Gaetano Rossiello, Alfio Massimiliano Gliozzo, Pavan Kapanipathi Bangalore, Salim Roukos
Abstract: A computer-implemented method according to one embodiment includes identifying a natural language query; translating the natural language query into an intermediate representation; converting the intermediate representation into one or more query triples; and performing relation linking between each of the one or more query triples and a plurality of knowledge base triples.
-
Publication No.: US12106230B2
Publication Date: 2024-10-01
Application No.: US17079202
Filing Date: 2020-10-23
Applicant: International Business Machines Corporation
Inventor: Nandana Mihindukulasooriya, Gaetano Rossiello, Alfio Massimiliano Gliozzo, Pavan Kapanipathi Bangalore, Salim Roukos
CPC classification number: G06N5/04, G06F16/334, G06N20/00
Abstract: A computer-implemented method according to one embodiment includes identifying a natural language query; translating the natural language query into an intermediate representation; converting the intermediate representation into one or more query triples; and performing relation linking between each of the one or more query triples and a plurality of knowledge base triples.
-
Publication No.: US11755843B2
Publication Date: 2023-09-12
Application No.: US17323584
Filing Date: 2021-05-18
Applicant: International Business Machines Corporation
Inventor: Nandana Mihindukulasooriya, Robert G. Farrell, Nicolas Rodolfo Fauceglia, Alfio Massimiliano Gliozzo
IPC: G06F40/30, G06F16/335, G06F16/901, G06F40/211, G06F40/247, G06F40/268, G06F40/284, G06F40/295
CPC classification number: G06F40/30, G06F16/335, G06F16/9024, G06F40/211, G06F40/247, G06F40/268, G06F40/284, G06F40/295
Abstract: Systems and techniques that facilitate spurious relationship filtration from external knowledge graphs based on distributional semantics of an input corpus are provided. In one or more embodiments, a context component can generate a context-based word embedding of one or more first terms in a document collection. The embedding can yield vector representations of the one or more first terms. The one or more first terms can correspond to knowledge terms in one or more first nodes of a knowledge graph. In one or more embodiments, a filtering component can filter out a relationship between the one or more first nodes and a second node of the knowledge graph based on a similarity value being less than a threshold. The similarity value can be a function of the vector representations of the one or more first terms. In various embodiments, cosine similarity can be used to compute the similarity value.