System for categorizing lists of words of arbitrary origin
    1.
    Invention application
    System for categorizing lists of words of arbitrary origin (legal status: in force)

    Publication number: US20140279732A1

    Publication date: 2014-09-18

    Application number: US13826188

    Filing date: 2013-03-14

    CPC classification number: G06N99/005

    Abstract: The present disclosure provides for categorization of lists of words. The method comprises querying DBpedia to find the resources related to the given list of words. Once the resources are found, the corresponding media Wikipedia categories can be retrieved, as well as their ancestors, generating a graph of categories. A number of graph analysis algorithms can then be applied to the graph, each returning a selected category. For each algorithm, a classifier is trained to decide whether the output of the algorithm is indeed the “best” category. Ensemble weighted majority voting can then be used to select the best category based on the votes cast by each classifier. The disclosure demonstrates a more accurate selection of the best category and can include an ensemble majority rated voting algorithm in which all voting members (highest frequency, most frequently occurring word, least common ancestor, and centrality measures) initially cast one vote.
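
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the DBpedia query is replaced by two small hypothetical in-memory tables (`WORD_CATEGORIES`, `PARENTS`), the trained classifiers are replaced by optional placeholder predicates that accept every candidate by default, and the function names, the total-distance rule for the least common ancestor, and the use of degree as the centrality measure are all assumptions made for illustration.

```python
from collections import Counter, defaultdict, deque

# Hypothetical toy data standing in for DBpedia resource/category lookups.
WORD_CATEGORIES = {"apple": ["Fruits"], "banana": ["Fruits"], "carrot": ["Vegetables"]}
PARENTS = {"Fruits": ["Edible plants"], "Vegetables": ["Edible plants"],
           "Edible plants": ["Food"], "Food": []}


def ancestors(category):
    """Map the category and each of its ancestors to its BFS distance."""
    dist = {category: 0}
    queue = deque([category])
    while queue:
        node = queue.popleft()
        for parent in PARENTS.get(node, []):
            if parent not in dist:
                dist[parent] = dist[node] + 1
                queue.append(parent)
    return dist


def highest_frequency(words):
    """Pick the direct category attached to the most words."""
    counts = Counter(c for w in words for c in WORD_CATEGORIES.get(w, []))
    return counts.most_common(1)[0][0]


def least_common_ancestor(words):
    """Pick the category shared by all words with the smallest total distance."""
    per_word = []
    for w in words:
        best = {}
        for c in WORD_CATEGORIES.get(w, []):
            for anc, d in ancestors(c).items():
                best[anc] = min(d, best.get(anc, d))
        per_word.append(best)
    common = set(per_word[0]).intersection(*per_word[1:])
    return min(sorted(common), key=lambda a: sum(p[a] for p in per_word))


def degree_centrality(words):
    """Pick the node with the most edges in the induced category graph."""
    nodes = set()
    for w in words:
        for c in WORD_CATEGORIES.get(w, []):
            nodes.update(ancestors(c))
    degree = Counter()
    for child in nodes:
        for parent in PARENTS.get(child, []):
            degree[child] += 1
            degree[parent] += 1
    return max(sorted(degree), key=degree.get)


def weighted_majority_vote(votes, weights):
    """Sum the weight of each voter behind its category; None means no vote."""
    tally = defaultdict(float)
    for name, category in votes.items():
        if category is not None:
            tally[category] += weights.get(name, 1.0)
    return max(sorted(tally), key=tally.get) if tally else None


ALGORITHMS = {"highest_frequency": highest_frequency,
              "least_common_ancestor": least_common_ancestor,
              "degree_centrality": degree_centrality}


def categorize(words, weights=None, classifiers=None):
    """Run every algorithm, let its (placeholder) classifier accept or reject
    the candidate, then combine the accepted candidates by weighted vote."""
    weights = weights or {}
    classifiers = classifiers or {}
    votes = {}
    for name, algorithm in ALGORITHMS.items():
        candidate = algorithm(words)
        accept = classifiers.get(name, lambda ws, cat: True)
        votes[name] = candidate if accept(words, candidate) else None
    return weighted_majority_vote(votes, weights)
```

With every voter starting at weight 1.0, as the abstract describes, `categorize(["apple", "banana", "carrot"])` selects the category backed by most voters. Passing a larger weight for one voter, or a classifier that rejects a voter's candidate, shifts the outcome accordingly.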

    System for categorizing lists of words of arbitrary origin
    3.
    Granted invention patent
    System for categorizing lists of words of arbitrary origin (legal status: in force)

    Publication number: US09171267B2

    Publication date: 2015-10-27

    Application number: US13826188

    Filing date: 2013-03-14

    CPC classification number: G06N99/005

    Abstract: The present disclosure provides for categorization of lists of words. The method comprises querying DBpedia to find the resources related to the given list of words. Once the resources are found, the corresponding media Wikipedia categories can be retrieved, as well as their ancestors, generating a graph of categories. A number of graph analysis algorithms can then be applied to the graph, each returning a selected category. For each algorithm, a classifier is trained to decide whether the output of the algorithm is indeed the “best” category. Ensemble weighted majority voting can then be used to select the best category based on the votes cast by each classifier. The disclosure demonstrates a more accurate selection of the best category and can include an ensemble majority rated voting algorithm in which all voting members (highest frequency, most frequently occurring word, least common ancestor, and centrality measures) initially cast one vote.