Document-based synonym generation
    1.
    Granted invention patent
    Document-based synonym generation (in force)

    Publication number: US07890521B1

    Publication date: 2011-02-15

    Application number: US12027559

    Filing date: 2008-02-07

    CPC classification number: G06F17/30867 G06F17/2795

    Abstract: One embodiment of the present invention provides a system that automatically generates synonyms for words from documents. During operation, this system determines co-occurrence frequencies for pairs of words in the documents. The system also determines closeness scores for pairs of words in the documents, wherein a closeness score indicates whether a pair of words are located so close to each other that the words are likely to occur in the same sentence or phrase. Finally, the system determines whether pairs of words are synonyms based on the determined co-occurrence frequencies and the determined closeness scores. While making this determination, the system can additionally consider correlations between words in a title or an anchor of a document and words in the document as well as word-form scores for pairs of words in the documents.
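
The two signals named in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the function names `pair_statistics` and `candidate_synonyms`, the token window, the thresholds, and the decision rule that treats words as likely synonyms when they share documents often but rarely sit next to each other. The title/anchor correlations and word-form scores mentioned in the abstract are not modeled.

```python
from collections import defaultdict
from itertools import combinations


def pair_statistics(docs, window=3):
    """Map each word pair to (co-occurrence count, closeness).

    Co-occurrence count: number of documents containing both words.
    Closeness: fraction of those documents in which the two words
    appear within `window` tokens of each other (assumed definition).
    """
    cooc = defaultdict(int)
    near = defaultdict(int)
    for tokens in docs:
        positions = defaultdict(list)
        for index, token in enumerate(tokens):
            positions[token].append(index)
        for a, b in combinations(sorted(positions), 2):
            cooc[(a, b)] += 1
            if any(abs(i - j) <= window
                   for i in positions[a] for j in positions[b]):
                near[(a, b)] += 1
    return {pair: (count, near[pair] / count) for pair, count in cooc.items()}


def candidate_synonyms(docs, min_cooc=2, max_closeness=0.5, window=3):
    """Keep pairs that co-occur often but are rarely close (assumed rule)."""
    stats = pair_statistics(docs, window)
    return sorted(pair for pair, (count, closeness) in stats.items()
                  if count >= min_cooc and closeness <= max_closeness)


# Hypothetical toy corpus, already tokenized.
DOCS = [
    "the car was parked outside while the owner looked for another automobile dealer".split(),
    "an automobile is expensive to insure so many people share one car instead".split(),
    "new york is large".split(),
    "she moved to new york".split(),
]

print(candidate_synonyms(DOCS))
```

In this toy corpus, "new" and "york" share two documents but are always adjacent, so their closeness is high and the pair is rejected as a phrase rather than a synonym pair.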

    Document-based synonym generation
    2.
    Granted invention patent
    Document-based synonym generation (in force)

    Publication number: US08392413B1

    Publication date: 2013-03-05

    Application number: US13352126

    Filing date: 2012-01-17

    CPC classification number: G06F17/30867 G06F17/2795

    Abstract: One embodiment of the present invention provides a system that automatically generates synonyms for words from documents. During operation, this system determines co-occurrence frequencies for pairs of words in the documents. The system also determines closeness scores for pairs of words in the documents, wherein a closeness score indicates whether a pair of words are located so close to each other that the words are likely to occur in the same sentence or phrase. Finally, the system determines whether pairs of words are synonyms based on the determined co-occurrence frequencies and the determined closeness scores. While making this determination, the system can additionally consider correlations between words in a title or an anchor of a document and words in the document as well as word-form scores for pairs of words in the documents.

    Document-based synonym generation
    3.
    Granted invention patent
    Document-based synonym generation (in force)

    Publication number: US08161041B1

    Publication date: 2012-04-17

    Application number: US13024731

    Filing date: 2011-02-10

    CPC classification number: G06F17/30867 G06F17/2795

    Abstract: One embodiment of the present invention provides a system that automatically generates synonyms for words from documents. During operation, this system determines co-occurrence frequencies for pairs of words in the documents. The system also determines closeness scores for pairs of words in the documents, wherein a closeness score indicates whether a pair of words are located so close to each other that the words are likely to occur in the same sentence or phrase. Finally, the system determines whether pairs of words are synonyms based on the determined co-occurrence frequencies and the determined closeness scores. While making this determination, the system can additionally consider correlations between words in a title or an anchor of a document and words in the document as well as word-form scores for pairs of words in the documents.

    Refining search results
    4.
    Granted invention patent
    Refining search results (in force)

    Publication number: US08738596B1

    Publication date: 2014-05-27

    Application number: US13310901

    Filing date: 2011-12-05

    CPC classification number: G06F17/3053 G06F17/30424 G06F17/30867

    Abstract: A computer-implemented method for processing query information includes receiving data representative of a search query from a user search session. The method also includes identifying a plurality of search results based upon the search query. Each search result is associated with a plurality of user characteristics and data that represents requestor behavior relative to previously submitted queries associated with the respective search result. The method also includes ordering the plurality of user characteristics based upon the data that represents requestor behavior relative to previously submitted queries and the respective search result. The method also includes adjusting the ordered plurality of user characteristics based upon at least one predefined compatibility associated with the user characteristics. The method also includes ranking the search results based upon the adjusted plurality of user characteristics.
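
One way to read the order, adjust, and rank steps is sketched below. The abstract does not say what the user characteristics or the compatibility values are, so this sketch assumes the characteristic is the requester's language, the behavior data is a click count per language, and the compatibility table gives a weight between two languages. The name `rank_results` and all data are hypothetical.

```python
def rank_results(behavior, requester_trait, compatibility):
    """Rank result ids for a requester with the given trait.

    behavior: {result_id: {trait: click_count}} from earlier queries.
    compatibility: {(requester_trait, other_trait): weight in [0, 1]}.
    Both structures are assumptions made for this sketch.
    """
    def weight(trait):
        # A trait is fully compatible with itself; unknown pairs count as 0.
        if trait == requester_trait:
            return 1.0
        return compatibility.get((requester_trait, trait), 0.0)

    scores = {}
    for result_id, clicks_by_trait in behavior.items():
        # Step 1: order the traits by the recorded behavior data.
        # (Kept to mirror the abstract; the sum below does not depend on it.)
        ordered = sorted(clicks_by_trait.items(),
                         key=lambda item: item[1], reverse=True)
        # Step 2: adjust each trait's contribution by its compatibility.
        adjusted = [(trait, clicks * weight(trait)) for trait, clicks in ordered]
        # Step 3: score the result from the adjusted values.
        scores[result_id] = sum(value for _, value in adjusted)
    # Highest score first; ties broken by result id for determinism.
    return sorted(scores, key=lambda result_id: (-scores[result_id], result_id))


# Hypothetical click data per requester language.
BEHAVIOR = {"a": {"en": 10, "de": 2}, "b": {"en": 3, "de": 20}}
COMPATIBILITY = {("de", "en"): 0.5}

print(rank_results(BEHAVIOR, "de", COMPATIBILITY))
print(rank_results(BEHAVIOR, "en", COMPATIBILITY))
```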

    Refining search results
    5.
    Granted invention patent

    Publication number: US09418104B1

    Publication date: 2016-08-16

    Application number: US13620528

    Filing date: 2012-09-14

    CPC classification number: G06F17/3053 G06F17/30424 G06F17/30867

    Abstract: A computer-implemented method for processing query information includes receiving data representative of a search query from a user search session. The method also includes identifying a plurality of search results based upon the search query. Each search result is associated with a plurality of user characteristics and data that represents requestor behavior relative to previously submitted queries associated with the respective search result. The method also includes ordering the plurality of user characteristics based upon the data that represents requestor behavior relative to previously submitted queries and the respective search result. The method also includes adjusting the ordered plurality of user characteristics based upon at least one predefined compatibility associated with the user characteristics. The method also includes ranking the search results based upon the adjusted plurality of user characteristics.

    Locally Significant Search Queries
    6.
    Invention patent application
    Locally Significant Search Queries (in force)

    Publication number: US20140172843A1

    Publication date: 2014-06-19

    Application number: US13161836

    Filing date: 2011-06-16

    CPC classification number: G06F17/3087

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for servicing search queries. In one aspect, a method includes determining that a general search query is a locally significant query for a user location that is associated with the user general search query. In turn, a local search query is generated using the general search query and a location phrase representing the user location. A set of general search results responsive to the general search query and a set of local search results responsive to the local search query are requested. A final set of search results responsive to the search query is selected. The final set of search results includes at least one search result that is included in the set of local search results, and is not included in a pre-specified quantity of highest ranking search results from the set of general search results. Data that cause presentation of the final set of search results are provided.
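
The query rewriting and result selection described in this abstract can be sketched as follows. The function names, the `top_n` and `final_size` parameters, and the policy of placing the local result directly after the top general results are assumptions for illustration. How a query is judged locally significant is not covered here.

```python
def build_local_query(general_query, location_phrase):
    """Form the local query from the general query plus a location phrase."""
    return f"{general_query} {location_phrase}"


def select_final_results(general, local, top_n=3, final_size=5):
    """Merge two ranked result lists.

    The final list keeps the top_n general results and adds the best local
    result that is not already among them, so at least one such local
    result is shown whenever one exists.
    """
    top_general = general[:top_n]
    final = list(top_general)
    # Local results absent from the highest-ranking general results.
    extras = [result for result in local if result not in top_general]
    if extras:
        final.append(extras[0])
    # Fill the remaining slots with lower-ranked general results.
    for result in general[top_n:]:
        if len(final) >= final_size:
            break
        if result not in final:
            final.append(result)
    return final[:final_size]


# Hypothetical ranked result identifiers.
GENERAL = ["g1", "g2", "g3", "g4", "g5"]
LOCAL = ["g2", "l1", "l2"]

print(build_local_query("pizza", "san jose"))
print(select_final_results(GENERAL, LOCAL))
```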

    STATISTICAL STEMMING
    7.
    Invention patent application
    STATISTICAL STEMMING (in force)

    Publication number: US20130173250A1

    Publication date: 2013-07-04

    Application number: US13710055

    Filing date: 2012-12-10

    CPC classification number: G06F17/2872 G06F17/2755

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating suffix rewriting rules. A method includes obtaining a plurality of canonical suffix-rewriting rules each associated with one or more words, generating a suffix tree from the words, selecting a minimum colored subset of the nodes and leaves in the suffix tree, and generating a plurality of final suffix-rewriting rules from the nodes in the minimum colored subset. Another method includes receiving applicable and non-applicable words for a suffix-rewriting rule, generating a suffix tree from the applicable words and the non-applicable words, selecting a minimum colored subset of the nodes and leaves in the suffix tree, and generating a plurality of suffix-rewriting rules, wherein each rule corresponds to a node in the minimum colored subset with a valid status.
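
The second method in this abstract, which narrows a suffix-rewriting rule using applicable and non-applicable words, can be approximated with a much simpler greedy procedure. The sketch below is not the suffix-tree and minimum-colored-subset selection the abstract describes. It only picks, for each applicable word, the shortest suffix that no non-applicable word shares. The function names and the word lists are hypothetical.

```python
def minimal_suffix_contexts(applicable, non_applicable):
    """Return the shortest suffixes that separate applicable words.

    Greedy stand-in for the suffix-tree selection: each chosen suffix
    matches at least one applicable word and no non-applicable word.
    """
    contexts = set()
    for word in applicable:
        for length in range(1, len(word) + 1):
            suffix = word[-length:]
            if not any(other.endswith(suffix) for other in non_applicable):
                contexts.add(suffix)
                break
    return sorted(contexts)


def apply_rule(word, contexts, old_suffix, new_suffix):
    """Rewrite old_suffix to new_suffix only for words matching a context."""
    if word.endswith(old_suffix) and any(word.endswith(c) for c in contexts):
        return word[: len(word) - len(old_suffix)] + new_suffix
    return word


# Hypothetical training words for the rule "-ies" -> "-y".
CONTEXTS = minimal_suffix_contexts(
    applicable=["parties", "cities", "ponies"],
    non_applicable=["series", "species"],
)

print(CONTEXTS)
print(apply_rule("parties", CONTEXTS, "ies", "y"))
print(apply_rule("series", CONTEXTS, "ies", "y"))
```

Each returned context plays the role of one narrowed rule: "-ies" becomes "-y" only when the word also ends in one of the learned longer suffixes.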

    Locally significant search queries
    8.
    Granted invention patent
    Locally significant search queries (in force)

    Publication number: US09348925B2

    Publication date: 2016-05-24

    Application number: US13161836

    Filing date: 2011-06-16

    CPC classification number: G06F17/3087

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for servicing search queries. In one aspect, a method includes determining that a general search query is a locally significant query for a user location that is associated with the user general search query. In turn, a local search query is generated using the general search query and a location phrase representing the user location. A set of general search results responsive to the general search query and a set of local search results responsive to the local search query are requested. A final set of search results responsive to the search query is selected. The final set of search results includes at least one search result that is included in the set of local search results, and is not included in a pre-specified quantity of highest ranking search results from the set of general search results. Data that cause presentation of the final set of search results are provided.

    Refining search results
    9.
    Granted invention patent
    Refining search results (in force)

    Publication number: US08498974B1

    Publication date: 2013-07-30

    Application number: US12551052

    Filing date: 2009-08-31

    CPC classification number: G06F17/3053 G06F17/30424 G06F17/30867

    Abstract: A computer-implemented method for processing query information includes receiving data representative of a search query from a user search session. The method also includes identifying a plurality of search results based upon the search query. Each search result is associated with a plurality of user characteristics and data that represents requestor behavior relative to previously submitted queries associated with the respective search result. The method also includes ordering the plurality of user characteristics based upon the data that represents requestor behavior relative to previously submitted queries and the respective search result. The method also includes adjusting the ordered plurality of user characteristics based upon at least one predefined compatibility associated with the user characteristics. The method also includes ranking the search results based upon the adjusted plurality of user characteristics.

    Statistical stemming
    10.
    Granted invention patent
    Statistical stemming (in force)

    Publication number: US08352247B2

    Publication date: 2013-01-08

    Application number: US13453473

    Filing date: 2012-04-23

    CPC classification number: G06F17/2872 G06F17/2755

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating suffix rewriting rules. A method includes obtaining a plurality of canonical suffix-rewriting rules each associated with one or more words, generating a suffix tree from the words, selecting a minimum colored subset of the nodes and leaves in the suffix tree, and generating a plurality of final suffix-rewriting rules from the nodes in the minimum colored subset. Another method includes receiving applicable and non-applicable words for a suffix-rewriting rule, generating a suffix tree from the applicable words and the non-applicable words, selecting a minimum colored subset of the nodes and leaves in the suffix tree, and generating a plurality of suffix-rewriting rules, wherein each rule corresponds to a node in the minimum colored subset with a valid status.
