-
Publication No.: US20210133216A1
Publication Date: 2021-05-06
Application No.: US16735262
Filing Date: 2020-01-06
Applicant: Microsoft Technology Licensing, LLC
Inventor: Dmitriy MEYERZON , Nikita VORONKOV , John Michael WINN , John GUIVER , Hadi Abbass KOTAICH
IPC: G06F16/28 , G06F16/2457 , G06F16/2455 , G06N5/02 , G06F16/901 , G06F16/93
Abstract: Examples described herein generally relate to a computer system including a knowledge graph storing a plurality of entities. The computer system generates an Aho Corasick trie including an entity name for each of the plurality of entities in the knowledge graph. The computer system compares a document viewed by a user to a plurality of templates defining potential entity names to identify extracts of the document matching at least one of the plurality of templates. The computer system applies the document to the Aho Corasick trie to determine potential entity names within the document that each match a respective one of the plurality of entities in the knowledge graph. The computer system annotates one or more matching entity names within the document with information from the knowledge graph for the respective ones of the plurality of entities to show, for example, a topic card providing information about the respective entities.
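To illustrate the matching step in the abstract above, here is a minimal sketch of an Aho-Corasick automaton built from entity names and applied to a document. It is not the patented implementation: the function names `build_automaton` and `find_entities`, the sample entity names, and the case-sensitive matching are all assumptions made for illustration.

```python
from collections import deque


def build_automaton(names):
    """Build an Aho-Corasick automaton (trie plus failure links) from names.

    Hypothetical helper: returns three parallel lists indexed by state id.
    """
    goto = [{}]   # goto[state][char] -> next state
    fail = [0]    # fail[state] -> state for the longest proper suffix
    out = [[]]    # out[state] -> names that end at this state

    # Phase 1: insert every name into the trie.
    for name in names:
        state = 0
        for ch in name:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(name)

    # Phase 2: breadth-first pass to compute failure links.
    queue = deque(goto[0].values())  # depth-1 states fail to the root
    while queue:
        state = queue.popleft()
        for ch, child in goto[state].items():
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[child] = goto[f].get(ch, 0)
            # Inherit names that end at the suffix state.
            out[child] = out[child] + out[fail[child]]
            queue.append(child)
    return goto, fail, out


def find_entities(text, automaton):
    """Return (start_index, name) for every entity name found in text."""
    goto, fail, out = automaton
    state = 0
    matches = []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for name in out[state]:
            matches.append((i - len(name) + 1, name))
    return matches


# Assumed sample data standing in for knowledge-graph entity names.
automaton = build_automaton(["Project Falcon", "Falcon", "Contoso"])
matches = find_entities("Contoso funds Project Falcon.", automaton)
```

The automaton scans the document once regardless of how many entity names it holds, which is the usual reason to choose this structure when the name list is large. Overlapping names such as "Project Falcon" and "Falcon" are both reported, so a real annotator would still need a policy for choosing among them.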
-
Publication No.: US20220019579A1
Publication Date: 2022-01-20
Application No.: US16933888
Filing Date: 2020-07-20
Applicant: Microsoft Technology Licensing, LLC
Inventor: Dmitriy MEYERZON , Omar Zia KHAN , Hui LI , Vladimir V. GVOZDEV , John M. WINN , John GUIVER , Ivan KOROSTELEV , Matteo VENANZI , Alexander Armin SPENGLER , Pavel MYSHKOV , Elena POCHERNINA , Martin KUKLA , Yordan Kirilov ZAYKOV
IPC: G06F16/2458 , G06N3/04 , G06N5/04 , G06F16/28
Abstract: Examples described herein generally relate to a computer system including a knowledge graph storing a plurality of entities. A mining of a set of enterprise source documents within an enterprise intranet is performed, by a plurality of knowledge mining toolkits, to determine a plurality of entity names. The plurality of entity names are linked based on entity metadata by traversing various relationships between people, files, sites, groups, associated with entities. An entity record is generated within a knowledge graph for a mined entity name from the linked entity names based on an entity schema and ones of the set of enterprise source documents associated with the mined entity name. The entity record includes attributes aggregated from the ones of the set of enterprise source documents associated with the mined entity name.
-
Publication No.: US20230076773A1
Publication Date: 2023-03-09
Application No.: US17493819
Filing Date: 2021-10-04
Applicant: Microsoft Technology Licensing, LLC
Inventor: Elena POCHERNINA , John WINN , Matteo VENANZI , Ivan KOROSTELEV , Pavel MYSHKOV , Samuel Alexander WEBSTER , Yordan Kirilov ZAYKOV , Nikita VORONKOV , Dmitriy MEYERZON , Marius Alexandru BUNESCU , Alexander Armin SPENGLER , Vladimir GVOZDEV , Thomas P. MINKA , Anthony Arnold WIESER , Sanil RAJPUT , John GUIVER
IPC: G06F40/30 , G06F16/901 , G06F16/903
Abstract: In various examples there is a computer-implemented method of database construction. The method comprises storing a knowledge graph comprising nodes connected by edges, each node representing a topic, and accessing a topic type hierarchy comprising a plurality of types of topics, the topic type hierarchy having been computed from a corpus of text documents. One or more text documents are accessed, and the method involves labelling a plurality of the nodes with one or more labels, each label denoting a topic type from the topic type hierarchy, either by using a deep language model or, for an individual one of the nodes representing a given topic, by searching the accessed text documents for matches to at least one template, the template being a sequence of words containing the given topic and a placeholder for a topic type. The knowledge graph comprising the plurality of labelled nodes is then stored.
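The template-based labelling route described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the `{topic}` and `{type}` placeholder syntax, the function `label_topic`, the sample templates, and the flat set of known types (standing in for the topic type hierarchy) are all hypothetical and do not come from the patent.

```python
import re


def label_topic(topic, documents, templates, known_types):
    """Return the topic types found for `topic` by template matching.

    Each template is a word sequence with a {topic} slot and one {type} slot
    (assumed syntax). A candidate type is kept only if it is in known_types.
    """
    labels = set()
    for template in templates:
        # Split on the placeholders, keeping them, so that literal text and
        # the topic itself can be regex-escaped separately.
        parts = re.split(r"(\{topic\}|\{type\})", template)
        pattern = "".join(
            re.escape(topic) if part == "{topic}"
            else r"(?P<type>[A-Za-z]+)" if part == "{type}"
            else re.escape(part)
            for part in parts
        )
        for doc in documents:
            for match in re.finditer(pattern, doc, flags=re.IGNORECASE):
                candidate = match.group("type").lower()
                if candidate in known_types:
                    labels.add(candidate)
    return labels


# Assumed sample inputs for illustration only.
templates = ["{topic} is a {type}", "the {type} {topic}"]
known_types = {"project", "product", "team"}
documents = [
    "Falcon is a project started in 2019.",
    "We shipped the product Falcon last year.",
]
labels = label_topic("Falcon", documents, templates, known_types)
```

Restricting candidates to known types filters out accidental matches such as "Falcon is a success". A fuller version would map each matched word to a node in the type hierarchy instead of a flat set.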
-
Publication No.: US20210110278A1
Publication Date: 2021-04-15
Application No.: US16601050
Filing Date: 2019-10-14
Applicant: Microsoft Technology Licensing, LLC
Inventor: Dmitriy MEYERZON , Jeffrey WIGHT , Andrei Razvan POPOV , Andrei-Alin CORODESCU , Omar FARUK , Jan-Ove KARLBERG , Åge Andre KVALNES , Helge Grenager SOLHEIM , Thuy DUONG , Simon Thoresen HULT , Ivan KOROSTELEV , Matteo VENANZI , John GUIVER , John Michael WINN , Vladimir V. GVOZDEV , Nikita VORONKOV , Chia-Jiun TAN , Alexander Armin SPENGLER
IPC: G06N5/02 , G06F16/901 , G06K9/62 , G06F16/93
Abstract: Examples described herein generally relate to a computer system for generating a knowledge graph storing a plurality of entities and to displaying a topic page for an entity in the knowledge graph. The computer system performs a mining of source documents within an enterprise intranet to determine a plurality of entity names. The computer system generates an entity record within the knowledge graph for a mined entity name based on an entity schema and the source documents. The entity record includes attributes aggregated from the source documents. The computer system receives a curation action on the entity record from a first user. The computer system updates the entity record based on the curation action. The computer system displays an entity page including at least a portion of the attributes to a second user based on permissions of the second user to view the source documents.
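The last step of this abstract, showing a viewer only the attributes backed by source documents they may read, might look like the sketch below. The `Attribute` record, the `visible_attributes` function, and the document identifiers are hypothetical, and the rule "visible if at least one source document is readable" is an assumption, not something the abstract states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribute:
    """One attribute of an entity record, with the documents it was mined from."""
    name: str
    value: str
    source_docs: frozenset


def visible_attributes(attributes, readable_docs):
    """Keep attributes with at least one source document the viewer can read.

    Assumed policy for illustration; a real system may apply stricter rules.
    """
    return [a for a in attributes if a.source_docs & readable_docs]


# Assumed sample entity record and viewer permissions.
record = [
    Attribute("owner", "Search team", frozenset({"doc1", "doc2"})),
    Attribute("budget", "confidential figure", frozenset({"doc3"})),
    Attribute("status", "active", frozenset({"doc2"})),
]
shown = visible_attributes(record, readable_docs={"doc1"})
```

Keeping the source document identifiers on each attribute is what makes this kind of per-viewer trimming possible after attributes from many documents have been aggregated into one record.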
-
Publication No.: US20220019622A1
Publication Date: 2022-01-20
Application No.: US16933947
Filing Date: 2020-07-20
Applicant: Microsoft Technology Licensing, LLC
Inventor: Dmitriy MEYERZON , Omar Zia KHAN , Hui LI , John M. WINN , John GUIVER , Ivan KOROSTELEV , Matteo VENANZI , Alexander Armin SPENGLER , Pavel MYSHKOV , Elena POCHERNINA , Martin KUKLA , Yordan Kirilov ZAYKOV , Junyi CHAI , Noura FARRA , Sravya NARALA
IPC: G06F16/901 , G06F16/93
Abstract: Mining of a set of enterprise source documents within an enterprise intranet is performed, by a plurality of knowledge mining toolkits, to determine a plurality of entity names. A plurality of entity records are generated within a knowledge graph for mined entity names from the entity names based on an entity schema and ones of the set of enterprise source documents associated with the mined entity names. Pattern recognition is applied to an active document using an enterprise named entity recognition (ENER) system to identify potential entity names within the document that match a respective one of a plurality of entity records in the knowledge graph. One or more matching entity names are annotated within the document with information from the knowledge graph for the respective ones of the plurality of entity records. The annotated information is displayed with the active document.
-
Publication No.: US20190213484A1
Publication Date: 2019-07-11
Application No.: US15898211
Filing Date: 2018-02-15
Applicant: Microsoft Technology Licensing, LLC
Inventor: John Michael WINN , John GUIVER , Samuel Alexander WEBSTER , Yordan Kirilov ZAYKOV , Maciej KUKLA , Daniel FABIAN
CPC classification number: G06N5/022 , G06F16/334 , G06N7/005 , G06N20/00
Abstract: In various examples there is a knowledge base construction and/or maintenance system for use with a probabilistic knowledge base. The system has a probabilistic generative model comprising a process for generating text or other formatted data from the knowledge base. The system has an inference component configured to generate inference results, by carrying out inference using inference algorithms, run on the probabilistic generative model, in either a forward direction whereby text or other formatted data is generated, or a reverse direction whereby text or other formatted data is observed and at least one unobserved variable of the probabilistic generative model is inferred. The inference component is configured to update the knowledge base using at least some of the inference results.
-