GENERATING AND PRESENTING DEEP LINKS
    91.
    Type: Patent application
    Status: Under examination (published)

    Publication number: US20130110815A1

    Publication date: 2013-05-02

    Application number: US13283632

    Filing date: 2011-10-28

    CPC classification number: G06F16/9566 G06F16/9535

    Abstract: Concepts and technologies are described herein for generating and presenting deep links. In accordance with the concepts and technologies disclosed herein, a search engine is configured to generate deep links associated with a site. A site is identified by the search engine, and the site is analyzed by the search engine with data relating to searches of and/or usage of the site. The search engine identifies links or other resources contained in, associated with, or referenced by the site, generates deep links corresponding to the resources, and associates the deep links with the site. If a site having indexed deep links is identified in search results, the search engine identifies one or more deep links associated with the site and presents the deep links with the search results to provide a searcher with relevant resources that may not satisfy the search query submitted by the searcher.
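
The flow in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: the `click_log` input, the "most visited pages" heuristic, and the function names `generate_deep_links` and `present_results` are all assumptions made for illustration.

```python
from collections import Counter
from urllib.parse import urlparse


def generate_deep_links(site, click_log, max_links=3):
    """Return the most-visited resources under `site`, excluding its home page.

    `click_log` is a hypothetical iterable of visited URLs standing in for
    the search and usage data mentioned in the abstract.
    """
    host = urlparse(site).netloc
    counts = Counter(
        url for url in click_log
        if urlparse(url).netloc == host and urlparse(url).path not in ("", "/")
    )
    return [url for url, _ in counts.most_common(max_links)]


def present_results(results, deep_link_index):
    """Pair each search result URL with any deep links indexed for it."""
    return [(url, deep_link_index.get(url, [])) for url in results]


click_log = [
    "https://example.com/docs",
    "https://example.com/docs",
    "https://example.com/pricing",
    "https://example.com/",
    "https://other.org/about",
]
deep_link_index = {
    "https://example.com": generate_deep_links("https://example.com", click_log)
}
page = present_results(["https://example.com", "https://other.org"], deep_link_index)
```

Here the deep links are computed ahead of time and only looked up when results are presented, which mirrors the abstract's split between associating deep links with a site and showing them alongside search results.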

    Search service administration web service protocol
    92.
    Type: Granted patent
    Status: In force

    Publication number: US08364795B2

    Publication date: 2013-01-29

    Application number: US12766703

    Filing date: 2010-04-23

    CPC classification number: H04L63/08 G06F17/30867

    Abstract: The embodiments described herein generally relate to a method and system for enabling a client to configure and control the crawling function available through a crawl configuration Web service. A client is able to configure and control the crawling function by defining the URL space of the crawl. Such space may be defined by configuring the starting point(s) and other properties of the crawl. The client further configures the crawling function by creating and configuring a content source and/or a crawl rule. Further, a client defines authentication information applicable to the crawl to enable the discovery and retrieval of electronic documents requiring authentication and/or authorization information for access thereof. A protocol governs the format, structure and syntax (using a Web Services Description Language schema) of messages for communicating to and from the Web crawler through an application programming interface on a server hosting the crawler application.
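
As a rough illustration of how start addresses and crawl rules together define a crawl's URL space, consider the Python sketch below. The class names, the glob-style rule patterns, and the "first matching rule wins" policy are assumptions; the actual protocol exchanges WSDL-described messages that are not reproduced here.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase


@dataclass
class CrawlRule:
    pattern: str   # glob-style URL pattern (hypothetical format)
    include: bool  # True to crawl matching URLs, False to skip them


@dataclass
class ContentSource:
    name: str
    start_addresses: list
    rules: list = field(default_factory=list)
    credentials: dict = field(default_factory=dict)  # authentication info for the crawl

    def in_url_space(self, url):
        """First matching rule wins; otherwise the URL must extend a start address."""
        for rule in self.rules:
            if fnmatchcase(url, rule.pattern):
                return rule.include
        return any(url.startswith(start) for start in self.start_addresses)


source = ContentSource(
    name="intranet",
    start_addresses=["https://intranet.example/"],
    rules=[CrawlRule("https://intranet.example/private/*", include=False)],
    credentials={"user": "crawler", "password": "placeholder"},
)
```

With this configuration, `source.in_url_space("https://intranet.example/docs/a.html")` is true, while anything under `/private/` or on another host falls outside the crawl.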

    QUERY SUGGESTIONS USING REPLACEMENT SUBSTITUTIONS AND AN ADVANCED QUERY SYNTAX
    93.
    Type: Patent application
    Status: Under examination (published)

    Publication number: US20120117102A1

    Publication date: 2012-05-10

    Application number: US12939958

    Filing date: 2010-11-04

    CPC classification number: G06F16/3322

    Abstract: Query suggestion and other features are provided that include using an advanced query syntax, but are not so limited. A computer-implemented query service of an embodiment operates to provide advanced query translations and suggestions based in part on a query rewriting algorithm that uses mappings and an advanced query syntax. A query method of one embodiment operates to provide one or more advanced queries that include one or more replacement queries that contain advanced query syntax. The method of an embodiment can automatically execute a rewritten query and/or present the rewritten query to the user as a query suggestion. Other embodiments are also disclosed.
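
One way to picture mapping-based query rewriting is the Python sketch below. The mappings and the `author:` and `filetype:` operators are hypothetical examples of an advanced query syntax, not operators named in the patent.

```python
import re

# Hypothetical mappings from plain-text patterns to advanced query syntax.
MAPPINGS = [
    (re.compile(r"\bby (\w+)\b", re.IGNORECASE), r'author:"\1"'),
    (re.compile(r"\b(pdf|docx|pptx)s?\b", re.IGNORECASE), r"filetype:\1"),
]


def suggest_rewrites(query):
    """Return one rewritten query for each mapping that changes the query."""
    suggestions = []
    for pattern, replacement in MAPPINGS:
        rewritten = pattern.sub(replacement, query)
        if rewritten != query:
            suggestions.append(rewritten)
    return suggestions


suggestions = suggest_rewrites("budget report by alice")
```

A caller could either execute a suggestion directly or show it to the user as a query suggestion, the two options the abstract mentions.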

    Index optimization for ranking using a linear model
    94.
    Type: Granted patent
    Status: In force

    Publication number: US08171031B2

    Publication date: 2012-05-01

    Application number: US12690100

    Filing date: 2010-01-19

    CPC classification number: G06F17/30864

    Abstract: Technologies are described herein for providing a more efficient approach to ranking search results. An illustrative technology reduces an amount of ranking data analyzed at query time. In the technology, a term is selected, at index time, from a master index. The term corresponds to a number of documents greater than a threshold. A set of documents that includes the term is selected based on the master index. A rank is determined for each document in the set of documents that contains the term. Each document in the set of documents that contains the term is assigned to a top document list or a bottom document list based on the rank. Predefined values of at least part of the rank are stored in the top document list for documents in the top document list and are not stored in the bottom document list for documents in the bottom document list.
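
The index-time split described above can be sketched in Python as follows. The precomputed `rank` dictionary, the `top_size` cutoff, and the function name are assumptions; the abstract does not specify how the rank is computed or where the top/bottom boundary lies.

```python
def split_posting_lists(master_index, rank, threshold, top_size):
    """Build top/bottom document lists for terms with more than `threshold` documents.

    master_index: dict mapping term -> list of document ids
    rank: dict mapping document id -> precomputed rank score (an assumption)
    """
    optimized = {}
    for term, docs in master_index.items():
        if len(docs) <= threshold:
            continue  # infrequent terms are left in the master index only
        ordered = sorted(docs, key=lambda d: rank[d], reverse=True)
        top = [(d, rank[d]) for d in ordered[:top_size]]  # rank values stored
        bottom = ordered[top_size:]                       # ids only, no rank values
        optimized[term] = {"top": top, "bottom": bottom}
    return optimized


master_index = {"the": ["d1", "d2", "d3", "d4"], "zebra": ["d2"]}
rank = {"d1": 0.2, "d2": 0.9, "d3": 0.5, "d4": 0.1}
optimized = split_posting_lists(master_index, rank, threshold=2, top_size=2)
```

Only the top list carries rank values, so a query-time ranker has less ranking data to read for very common terms.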

    Index optimization for ranking using a linear model
    95.
    Type: Granted patent
    Status: In force

    Publication number: US08161036B2

    Publication date: 2012-04-17

    Application number: US12147666

    Filing date: 2008-06-27

    CPC classification number: G06F17/30657

    Abstract: Technologies are described herein for providing a more efficient approach to ranking search results. One method reduces an amount of ranking data analyzed at query time. In the method, a term is selected, at index time, from a master index. The term corresponds to a number of documents greater than a threshold. A set of documents that includes the term is selected based on the master index. A rank is determined for each document in the set of documents that contains the term. Each document in the set of documents that contains the term is assigned to a high ranking index or a low ranking index based on the simple rank.

    SEARCH SERVICE ADMINISTRATION WEB SERVICE PROTOCOL
    96.
    Type: Patent application
    Status: In force

    Publication number: US20110145218A1

    Publication date: 2011-06-16

    Application number: US12766703

    Filing date: 2010-04-23

    CPC classification number: H04L63/08 G06F17/30867

    Abstract: The embodiments described herein generally relate to a method and system for enabling a client to configure and control the crawling function available through a crawl configuration Web service. A client is able to configure and control the crawling function by defining the URL space of the crawl. Such space may be defined by configuring the starting point(s) and other properties of the crawl. The client further configures the crawling function by creating and configuring a content source and/or a crawl rule. Further, a client defines authentication information applicable to the crawl to enable the discovery and retrieval of electronic documents requiring authentication and/or authorization information for access thereof. A protocol governs the format, structure and syntax (using a Web Services Description Language schema) of messages for communicating to and from the Web crawler through an application programming interface on a server hosting the crawler application.

    GENERATING SEARCH RESULT SUMMARIES
    97.
    Type: Patent application
    Status: In force

    Publication number: US20110066611A1

    Publication date: 2011-03-17

    Application number: US12947649

    Filing date: 2010-11-16

    CPC classification number: G06F17/30867 G06F17/30719

    Abstract: Embodiments are configured to provide a summary of information associated with one or more search results. In an embodiment, a system includes a summary generator that can be configured to provide a summary of information including one or more snippets associated with a search term or search terms. The system includes a ranking component that can be used to rank snippets and the ranked snippets can be used when generating a summary that includes one or more ranked snippets. In one embodiment, the system can be configured to include one or more filters that can be used to filter snippets and the filtered snippets can be used when generating a summary. Other embodiments are available.
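
A small Python sketch of the snippet pipeline (extract, filter, rank, join) is shown below. Treating sentences as snippets, ranking by term hit count, and filtering by a minimum word count are all assumptions chosen for illustration rather than details from the patent.

```python
import re


def generate_summary(text, terms, max_snippets=2, min_words=4):
    """Pick sentences that mention the search terms, filter short ones, rank by hits."""
    terms = [t.lower() for t in terms]
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    scored = []
    for position, sentence in enumerate(sentences):
        words = re.findall(r"\w+", sentence.lower())
        hits = sum(words.count(t) for t in terms)
        if hits and len(words) >= min_words:  # filter step
            scored.append((-hits, position, sentence))
    scored.sort()  # most hits first, ties broken by document order
    return " ... ".join(sentence for _, _, sentence in scored[:max_snippets])


document = (
    "Crawlers fetch pages. "
    "The index maps terms to pages for fast search. "
    "Search ranking orders pages by relevance to the search terms. "
    "Index it."
)
summary = generate_summary(document, ["search", "index"])
```

The last sentence of `document` mentions a search term but is dropped by the length filter, which shows the filter acting before ranking.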

    System and method for batched indexing of network documents
    98.
    Type: Granted patent
    Status: Lapsed

    Publication number: US07644107B2

    Publication date: 2010-01-05

    Application number: US10956891

    Filing date: 2004-09-30

    CPC classification number: G06F17/30861

    Abstract: A process takes advantage of a structure of a server hosting a network site that includes a change log stored in a database to batch index documents for search queries. The content of the site is batched and shipped in bulk from the server to an indexer. The change log keeps track of the changes to the content of the site. The indexer incrementally requests updates to the index using the change log and batches the changes so that the bandwidth usage and processor overhead costs are reduced.
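
The change-log-driven, batched update loop can be sketched in Python as below. The entry layout (`seq`, `op`, `doc_id`, `content`), the integer resume token, and the function names are hypothetical; the sketch also assumes the change log is ordered by `seq`.

```python
def fetch_changes(change_log, since_token, batch_size):
    """Return up to `batch_size` entries newer than `since_token`, plus the new token."""
    batch = [entry for entry in change_log if entry["seq"] > since_token][:batch_size]
    new_token = batch[-1]["seq"] if batch else since_token
    return batch, new_token


def apply_batch(index, batch):
    """Apply one batch of changes to the index in a single pass."""
    for entry in batch:
        if entry["op"] == "delete":
            index.pop(entry["doc_id"], None)
        else:  # "add" or "update"
            index[entry["doc_id"]] = entry["content"]


def incremental_crawl(index, change_log, token, batch_size=2):
    """Drain the change log in batches; return the token to resume from next time."""
    while True:
        batch, token = fetch_changes(change_log, token, batch_size)
        if not batch:
            return token
        apply_batch(index, batch)


change_log = [
    {"seq": 1, "op": "add", "doc_id": "a", "content": "alpha"},
    {"seq": 2, "op": "add", "doc_id": "b", "content": "beta"},
    {"seq": 3, "op": "update", "doc_id": "a", "content": "alpha v2"},
    {"seq": 4, "op": "delete", "doc_id": "b"},
]
index = {}
token = incremental_crawl(index, change_log, token=0)
```

Because the indexer keeps only a token, a later run asks for entries after that token instead of re-reading the whole site, which is where the bandwidth and processing savings in the abstract come from.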

    SPECIFYING RELEVANCE RANKING PREFERENCES UTILIZING SEARCH SCOPES
    99.
    Type: Patent application
    Status: In force

    Publication number: US20090187550A1

    Publication date: 2009-07-23

    Application number: US12015514

    Filing date: 2008-01-17

    CPC classification number: G06F17/3053

    Abstract: A mechanism for expressing a user preference for a set of documents based on user knowledge about the document corpora. The user preference input to the system can be positive, negative, or both. A set of documents that can be identified with a query can define a search scope definition. The search scope is mapped into an input ranking feature for a ranking function. The search scope definition is employed as a soft preference ranking feature, and thus, used to bias ranking via relevance feedback. The mechanism facilitates increasing or decreasing the final ranking score of a document based on whether the document falls into the user scope. The ranking weight can be configured ad hoc by the user or, when relevance judgments are available, determined with machine learning techniques that find the optimal weights to optimize ranking.
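
The idea of a search scope acting as a soft preference can be sketched in Python as follows. The additive combination of a base score with a weighted binary scope feature is an assumption, as are the function and parameter names; the abstract does not give the form of the ranking function.

```python
def rank_with_scope(documents, base_scores, in_scope, weight):
    """Add `weight` to the score of every in-scope document, then sort by score.

    in_scope: predicate standing in for the query that defines the search scope
    weight:   positive to promote the scope, negative to demote it
    """
    scored = []
    for doc in documents:
        scope_feature = 1.0 if in_scope(doc) else 0.0
        scored.append((base_scores[doc] + weight * scope_feature, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored]


documents = ["wiki/a", "blog/b", "wiki/c"]
base_scores = {"wiki/a": 0.5, "blog/b": 0.7, "wiki/c": 0.3}


def is_wiki(doc):
    return doc.startswith("wiki/")


promoted = rank_with_scope(documents, base_scores, is_wiki, weight=0.3)
demoted = rank_with_scope(documents, base_scores, is_wiki, weight=-0.3)
```

Out-of-scope documents are never excluded, only outscored, which is what makes the preference soft rather than a hard filter.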

    RANKING SEARCH RESULTS USING AUTHOR EXTRACTION
    100.
    Type: Patent application
    Status: Under examination (published)

    Publication number: US20090182723A1

    Publication date: 2009-07-16

    Application number: US11972613

    Filing date: 2008-01-10

    CPC classification number: G06F16/38

    Abstract: Architecture that extracts author information from general documents and uses the author information for search results ranking. The architecture performs automatic author value extraction and makes the extracted value available at index time for subsequent use at query processing and results ranking. Machine learning (e.g., a perceptron algorithm) is employed, and a set of input features for the perceptron algorithm is utilized for author value extraction. The extracted author value is converted into a feature for input to a ranking function that generates a ranking score for each document. The input features can also be weighted according to weighting criteria.
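
To make the perceptron-based extraction step concrete, here is a minimal Python sketch. The three binary features, the tiny training set, and the function names are invented for illustration; the patent's actual feature set is not given in the abstract.

```python
def score(weights, bias, features):
    """Linear score of a candidate's feature vector."""
    return bias + sum(w * x for w, x in zip(weights, features))


def train_perceptron(examples, epochs=10):
    """examples: list of (feature_vector, label) pairs with label +1 or -1."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in examples:
            if label * score(weights, bias, features) <= 0:  # misclassified
                weights = [w + label * x for w, x in zip(weights, features)]
                bias += label
    return weights, bias


def extract_author(candidates, weights, bias):
    """candidates: list of (text, feature_vector); return the best positive-scoring text."""
    best_text, best_score = None, 0.0
    for text, features in candidates:
        s = score(weights, bias, features)
        if s > best_score:
            best_text, best_score = text, s
    return best_text


# Hypothetical features per candidate span:
# [preceded by "by", appears near the top of the document, looks like a capitalized name]
training = [
    ([1, 1, 1], +1),
    ([0, 0, 1], -1),
    ([1, 0, 1], +1),
    ([0, 1, 0], -1),
]
weights, bias = train_perceptron(training)
author = extract_author(
    [("Jane Doe", [1, 1, 1]), ("Annual Report", [0, 1, 1]), ("page 3", [0, 0, 0])],
    weights,
    bias,
)
```

The extracted value would then be stored at index time and turned into one input feature of the ranking function, a step this sketch leaves out.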
