Locating meaningful stopwords or stop-phrases in keyword-based retrieval systems

    Publication No.: US09817920B1

    Publication date: 2017-11-14

    Application No.: US14628692

    Filing date: 2015-02-23

    Applicant: GOOGLE INC.

    Abstract: A stopword detection component detects stopwords (also stop-phrases) in search queries input to keyword-based information retrieval systems. Potential stopwords are initially identified by comparing the terms in the search query to a list of known stopwords. Context data is then retrieved based on the search query and the identified stopwords. In one implementation, the context data includes documents retrieved from a document index. In another implementation, the context data includes categories relevant to the search query. Sets of retrieved context data are compared to one another to determine if they are substantially similar. If the sets of context data are substantially similar, this fact may be used to infer that the removal of the potential stopword(s) is not material to the search. If the sets of context data are not substantially similar, the potential stopword can be considered material to the search and should not be removed from the query.
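
The comparison step in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the stopword list, the toy document index, the AND-style `retrieve` function, the Jaccard similarity measure, and the 0.8 threshold are all assumptions chosen for the example. The abstract only requires that the sets of context data be "substantially similar".

```python
from typing import Callable, List, Set

# Hypothetical list of known stopwords; a real system would use a larger one.
KNOWN_STOPWORDS = {"the", "a", "an", "of", "in", "to"}

# Toy document index (an assumption for illustration only).
TOY_INDEX = {
    "doc1": "the who live at leeds",
    "doc2": "world health organization who report",
    "doc3": "history of leeds",
}


def retrieve(terms: List[str]) -> Set[str]:
    """Return ids of toy documents that contain every term (AND semantics)."""
    return {
        doc_id
        for doc_id, text in TOY_INDEX.items()
        if all(term in text.split() for term in terms)
    }


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Set overlap in [0, 1]; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def material_stopwords(
    query: str,
    retrieve_fn: Callable[[List[str]], Set[str]] = retrieve,
    threshold: float = 0.8,
) -> List[str]:
    """Return the potential stopwords whose removal changes the context data."""
    terms = query.lower().split()
    baseline = retrieve_fn(terms)  # context data for the full query
    material = []
    for i, term in enumerate(terms):
        if term not in KNOWN_STOPWORDS:
            continue  # not a potential stopword
        reduced = terms[:i] + terms[i + 1:]  # query without this stopword
        if jaccard(baseline, retrieve_fn(reduced)) < threshold:
            material.append(term)  # not substantially similar: keep the word
    return material


print(material_stopwords("the who"))           # ['the']
print(material_stopwords("history of leeds"))  # []
```

In the toy index, dropping "the" from "the who" pulls in an unrelated document, so "the" is treated as material; dropping "of" from "history of leeds" leaves the result set unchanged, so it can be removed.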

    Efficient query rewriting
    2.
    Granted patent (status: in force)

    Publication No.: US09165033B1

    Publication date: 2015-10-20

    Application No.: US14154024

    Filing date: 2014-01-13

    Applicant: Google Inc.

    CPC classification number: G06F17/30448 G06F17/30457 G06F17/30554

    Abstract: Methods and systems for efficient query rewriting and the like are described here. One such described method comprises: offline mapping frequently-seen search queries to rewritten queries that may be better for searching; offline caching the mapping in a cache memory; and upon receiving a search query from a user similar to one of the mapped search queries, obtaining a corresponding rewritten query from the mapping in the cache memory based on predetermined conditions, and issuing a search of the rewritten query to the backend data system in order to avoid having to issue a search query to the backend data system twice while the user is online.
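
The offline-mapping and online-lookup flow described in this abstract might look roughly like the sketch below. Everything concrete here is an assumption made for illustration: the `normalize` function stands in for the abstract's notion of a "similar" query, `toy_rewrite` stands in for whatever rewriting logic is used offline, and the "predetermined conditions" are not modeled because the abstract does not specify them.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical rewrite rules used by the toy offline rewriter.
SYNONYMS = {"nyc": "new york city"}


def normalize(query: str) -> str:
    """Lowercase and collapse whitespace; a stand-in for query similarity."""
    return " ".join(query.lower().split())


def toy_rewrite(query: str) -> str:
    """Toy offline rewriter: expands known abbreviations term by term."""
    return " ".join(SYNONYMS.get(term, term) for term in query.split())


def build_rewrite_cache(
    frequent_queries: Iterable[str],
    rewrite: Callable[[str], str] = toy_rewrite,
) -> Dict[str, str]:
    """Offline step: map frequently seen queries to their rewritten forms."""
    cache = {}
    for query in frequent_queries:
        key = normalize(query)
        rewritten = rewrite(key)
        if rewritten != key:  # only store queries that actually change
            cache[key] = rewritten
    return cache


def search(
    query: str,
    cache: Dict[str, str],
    backend: Callable[[str], List[str]],
) -> List[str]:
    """Online step: look up the rewrite, then hit the backend exactly once."""
    key = normalize(query)
    return backend(cache.get(key, key))


backend_calls: List[str] = []


def toy_backend(query: str) -> List[str]:
    """Hypothetical backend that records each query it receives."""
    backend_calls.append(query)
    return [f"result for {query}"]


cache = build_rewrite_cache(["nyc  Weather", "python docs"])
results = search("NYC weather", cache, toy_backend)
print(results)        # ['result for new york city weather']
print(backend_calls)  # ['new york city weather']
```

The point of the design is visible in `backend_calls`: because the rewrite was computed offline, the online request issues a single backend search rather than one for the original query and a second for the rewritten one.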

    Locating meaningful stopwords or stop-phrases in keyword-based retrieval systems
    3.
    Granted patent (status: in force)

    Publication No.: US08965919B1

    Publication date: 2015-02-24

    Application No.: US14143161

    Filing date: 2013-12-30

    Applicant: Google Inc.

    Abstract: A stopword detection component detects stopwords (also stop-phrases) in search queries input to keyword-based information retrieval systems. Potential stopwords are initially identified by comparing the terms in the search query to a list of known stopwords. Context data is then retrieved based on the search query and the identified stopwords. In one implementation, the context data includes documents retrieved from a document index. In another implementation, the context data includes categories relevant to the search query. Sets of retrieved context data are compared to one another to determine if they are substantially similar. If the sets of context data are substantially similar, this fact may be used to infer that the removal of the potential stopword(s) is not material to the search. If the sets of context data are not substantially similar, the potential stopword can be considered material to the search and should not be removed from the query.

    Interleaving search results
    4.
    Granted patent (status: in force)

    Publication No.: US09002817B2

    Publication date: 2015-04-07

    Application No.: US14279763

    Filing date: 2014-05-16

    Applicant: Google Inc.

    CPC classification number: G06F17/30864 G06F17/30 G06F17/3053

    Abstract: Methods, systems, and computer program products are provided for interleaving search results. A method includes presenting multiple first search results received from a first search engine. The first search results satisfy a search query directed to the first search engine and are presented in an order. A second search result from a second search engine is inserted at a position between two otherwise adjacent first search results. The second search result is received from a second search engine in response to the search query.
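
The insertion step in this abstract can be illustrated with a short sketch. The function name `interleave`, the list-of-strings representation of results, and the way the position is chosen are assumptions for the example; the abstract only states that the second engine's result is placed between two otherwise adjacent results from the first engine.

```python
from typing import List


def interleave(
    first_results: List[str],
    second_result: str,
    position: int,
) -> List[str]:
    """Insert second_result between first_results[position-1] and [position].

    The order of the first engine's results is preserved. The position must
    fall strictly inside the list so that the inserted result has a first
    engine result on both sides.
    """
    if not 1 <= position <= len(first_results) - 1:
        raise ValueError("position must lie between two adjacent first results")
    return first_results[:position] + [second_result] + first_results[position:]


first = ["a1", "a2", "a3"]  # ordered results from the first search engine
merged = interleave(first, "b1", position=1)
print(merged)  # ['a1', 'b1', 'a2', 'a3']
```

The input list is not modified, so the original ordering from the first engine remains available if the merged list needs to be rebuilt with a different position.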

    Interleaving search results
    5.
    Patent application (status: in force)

    Publication No.: US20140365458A1

    Publication date: 2014-12-11

    Application No.: US14279763

    Filing date: 2014-05-16

    Applicant: Google Inc.

    CPC classification number: G06F17/30864 G06F17/30 G06F17/3053

    Abstract: Methods, systems, and computer program products are provided for interleaving search results. A method includes presenting multiple first search results received from a first search engine. The first search results satisfy a search query directed to the first search engine and are presented in an order. A second search result from a second search engine is inserted at a position between two otherwise adjacent first search results. The second search result is received from a second search engine in response to the search query.

    Locating meaningful stopwords or stop-phrases in keyword-based retrieval systems
    6.
    Granted patent (status: in force)

    Publication No.: US08626787B1

    Publication date: 2014-01-07

    Application No.: US13922968

    Filing date: 2013-06-20

    Applicant: Google Inc.

    Abstract: A stopword detection component detects stopwords (also stop-phrases) in search queries input to keyword-based information retrieval systems. Potential stopwords are initially identified by comparing the terms in the search query to a list of known stopwords. Context data is then retrieved based on the search query and the identified stopwords. In one implementation, the context data includes documents retrieved from a document index. In another implementation, the context data includes categories relevant to the search query. Sets of retrieved context data are compared to one another to determine if they are substantially similar. If the sets of context data are substantially similar, this fact may be used to infer that the removal of the potential stopword(s) is not material to the search. If the sets of context data are not substantially similar, the potential stopword can be considered material to the search and should not be removed from the query.
