-
1.
Publication number: US20140279732A1
Publication date: 2014-09-18
Application number: US13826188
Application date: 2013-03-14
Applicant: XEROX CORPORATION
Inventor: Kirk J. Ocke, Dale E. Gaucas, Michael D. Shepherd
IPC: G06N99/00
CPC classification number: G06N99/005
Abstract: The present disclosure provides for categorization of lists of words. The method comprises querying DBpedia to find the resources related to the given list of words. Once the resources are found, the corresponding media Wikipedia categories can be retrieved, as well as their ancestors, generating a graph of categories. A number of graph analysis algorithms can then be applied to the graph, each returning a selected category. For each algorithm a classifier is trained to decide whether the output of the algorithm is indeed the “best” category. An ensemble weighted majority voting can then be used to select the best category based on votes cast by each classifier. The disclosure demonstrates a more accurate selection of the best category and can include an ensemble majority rated voting algorithm comprising all voting members initially casting one vote; i.e., highest frequency, most frequently occurring word, least common ancestor and centrality measures.
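The voting step described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function name `weighted_majority_vote`, the algorithm labels, the example categories, and the idea of passing each classifier's verdict as a boolean are all assumptions made for illustration. Each graph algorithm proposes one category, its classifier either accepts or rejects that proposal, and accepted proposals are tallied with per-algorithm weights that start at one vote each.

```python
from collections import defaultdict


def weighted_majority_vote(candidates, accepted, weights=None):
    """Pick a category by weighted majority vote (illustrative sketch).

    candidates: dict mapping algorithm name -> category that algorithm selected
    accepted:   dict mapping algorithm name -> bool, the verdict of the
                classifier trained for that algorithm (hypothetical input)
    weights:    dict mapping algorithm name -> vote weight; every member
                starts with one vote when no weights are given
    """
    if weights is None:
        weights = {name: 1.0 for name in candidates}

    tally = defaultdict(float)
    for name, category in candidates.items():
        # Only algorithms whose classifier endorses the output get to vote.
        if accepted.get(name, False):
            tally[category] += weights[name]

    if not tally:
        return None  # no classifier endorsed any candidate
    # Ties resolve to the category that entered the tally first.
    return max(tally, key=tally.get)


# Hypothetical outputs of the four algorithm families named in the abstract.
candidates = {
    "highest_frequency": "Fruits",
    "most_frequent_word": "Apples",
    "least_common_ancestor": "Plants",
    "centrality": "Fruits",
}
accepted = {
    "highest_frequency": True,
    "most_frequent_word": False,
    "least_common_ancestor": True,
    "centrality": True,
}
best = weighted_majority_vote(candidates, accepted)
print(best)  # Fruits
```

With equal weights, "Fruits" wins two votes to one. Raising the weight of a single algorithm (for example after it proves more reliable on training data) can change the outcome, which is the point of weighting the ensemble.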
-
2.
Publication number: US20170177813A1
Publication date: 2017-06-22
Application number: US14978611
Application date: 2015-12-22
Applicant: XEROX CORPORATION
Inventor: Jinhui Yao, Jing Zhou, Michael D. Shepherd, Lina Fu, Faming Li, Dennis F. Quebe, Jr., Jennie Echols, Xuejin Wen
CPC classification number: G16H50/20
Abstract: A computer system configured to improve health outcomes and reduce medical service costs includes a memory storing a computer program and a processor that executes the computer program. The computer program receives a medical inquiry, extracts a keyword using natural language processing (NLP), selects a category of concern indicated by the medical inquiry from a library using the keyword, determines leading factors contributing to the category of concern based on a statistical model analysis, selects analytic modules from a library that receive at least one of the leading factors as an input parameter or produce at least one of the leading factors as an output parameter, and generates a recommendation including a listing of the selected analytic modules and/or a constructed workflow including at least two of the selected analytic modules chained together via respective input parameters and output parameters of the at least two selected analytic modules.
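The module selection and chaining described in the abstract can be sketched as follows. This is an illustrative assumption, not the system's actual design: the `AnalyticModule` class, the functions `select_modules` and `chain_pairs`, and the example module and parameter names are all hypothetical. A module is selected when a leading factor appears among its inputs or outputs, and two selected modules can be chained when an output parameter of one is an input parameter of the other.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnalyticModule:
    """Hypothetical library entry: a named module with parameter sets."""
    name: str
    inputs: frozenset
    outputs: frozenset


def select_modules(library, leading_factors):
    """Keep modules that take or produce at least one leading factor."""
    factors = set(leading_factors)
    return [m for m in library if factors & (m.inputs | m.outputs)]


def chain_pairs(modules):
    """List (producer, consumer) name pairs linked by a shared parameter."""
    return [
        (a.name, b.name)
        for a in modules
        for b in modules
        if a is not b and a.outputs & b.inputs
    ]


# Hypothetical library and leading factors, for illustration only.
library = [
    AnalyticModule("readmission_risk", frozenset({"age", "hba1c"}), frozenset({"risk_score"})),
    AnalyticModule("intervention_ranker", frozenset({"risk_score"}), frozenset({"intervention"})),
    AnalyticModule("billing_summary", frozenset({"claims"}), frozenset({"cost_report"})),
]
selected = select_modules(library, {"hba1c", "risk_score"})
workflow = chain_pairs(selected)
print([m.name for m in selected])
print(workflow)
```

Here `billing_summary` is dropped because it touches no leading factor, and the two remaining modules form a two-step workflow through the shared `risk_score` parameter.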
-
3.
Publication number: US09171267B2
Publication date: 2015-10-27
Application number: US13826188
Application date: 2013-03-14
Applicant: Xerox Corporation
Inventor: Kirk J. Ocke, Dale E. Gaucas, Michael D. Shepherd
CPC classification number: G06N99/005
Abstract: The present disclosure provides for categorization of lists of words. The method comprises querying DBpedia to find the resources related to the given list of words. Once the resources are found, the corresponding media Wikipedia categories can be retrieved, as well as their ancestors, generating a graph of categories. A number of graph analysis algorithms can then be applied to the graph, each returning a selected category. For each algorithm a classifier is trained to decide whether the output of the algorithm is indeed the “best” category. An ensemble weighted majority voting can then be used to select the best category based on votes cast by each classifier. The disclosure demonstrates a more accurate selection of the best category and can include an ensemble majority rated voting algorithm comprising all voting members initially casting one vote; i.e., highest frequency, most frequently occurring word, least common ancestor and centrality measures.
-
-