Incremental Maintenance of Inverted Indexes for Approximate String Matching
    111.
    Patent application
    Status: Active

    Publication number: US20120323870A1

    Publication date: 2012-12-20

    Application number: US13595270

    Filing date: 2012-08-27

    CPC classification number: G06F17/30631 G06F17/30336 G06F17/30622

    Abstract: In embodiments of the disclosed technology, indexes, such as inverted indexes, are updated only as necessary to guarantee answer precision within predefined thresholds which are determined with little cost in comparison to the updates of the indexes themselves. With the present technology, a batch of daily updates can be processed in a matter of minutes, rather than a few hours for rebuilding an index, and a query may be answered with assurances that the results are accurate or within a threshold of accuracy.

    SEMANTICALLY AGGREGATED INDEX IN AN INDEXER-AGNOSTIC INDEX BUILDING SYSTEM
    112.
    Patent application
    Status: Active

    Publication number: US20120179684A1

    Publication date: 2012-07-12

    Application number: US13005425

    Filing date: 2011-01-12

    CPC classification number: G06F17/30631 G06F17/30235 G06F17/3071

    Abstract: A computer program product for an indexer-agnostic index building system includes a computer readable storage medium to store a computer readable program, wherein the computer readable program, when executed on a computer, causes the computer to perform operations for creating a semantically aggregated index. The operations include: extracting documents from a data source, wherein each document includes a data object; distributing the documents to a plurality of processing nodes within the system; for each node: indexing the data objects for each document into fields using semantic rules; and grouping indexed data objects for related fields by: classifying the documents into logical groups based on the semantic rules; and creating a searchable index shard for related logical groups.
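
    The operations listed in this abstract can be pictured with a minimal sketch. Everything named here is an assumption made for illustration: the rule table FIELD_RULES, the grouping field GROUP_FIELD, the document layout, and the round-robin distribution are not taken from the patent, and the "nodes" are simulated sequentially in one process.

```python
from collections import defaultdict

# Hypothetical semantic rules: which raw keys of a data object become which
# index fields, and which index field decides a document's logical group.
FIELD_RULES = {"title": "subject", "body": "text", "dept": "department"}
GROUP_FIELD = "department"


def index_document(doc):
    """Index a document's data object into fields using the semantic rules."""
    data = doc["data"]
    return {field: data[key] for key, field in FIELD_RULES.items() if key in data}


def process_node(docs):
    """Work of one node: index documents, classify them, build one shard per group."""
    shards = defaultdict(list)
    for doc in docs:
        fields = index_document(doc)
        group = fields.get(GROUP_FIELD, "unclassified")
        shards[group].append((doc["id"], fields))
    return dict(shards)


def build(docs, num_nodes=2):
    """Distribute documents round-robin, then run each node's work in turn."""
    batches = [docs[i::num_nodes] for i in range(num_nodes)]
    return [process_node(batch) for batch in batches]


docs = [
    {"id": 1, "data": {"title": "Q1 report", "body": "revenue", "dept": "sales"}},
    {"id": 2, "data": {"title": "Spec", "body": "design", "dept": "eng"}},
    {"id": 3, "data": {"title": "Q2 report", "body": "forecast", "dept": "sales"}},
]
nodes = build(docs, num_nodes=2)
```

    Each element of nodes maps a logical group to its shard, so a query restricted to one group only needs to touch the shards carrying that group's name.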

    Incremental Maintenance of Inverted Indexes for Approximate String Matching
    113.
    Patent application
    Status: Lapsed

    Publication number: US20100318519A1

    Publication date: 2010-12-16

    Application number: US12481693

    Filing date: 2009-06-10

    CPC classification number: G06F17/30631 G06F17/30336 G06F17/30622

    Abstract: In embodiments of the disclosed technology, indexes, such as inverted indexes, are updated only as necessary to guarantee answer precision within predefined thresholds which are determined with little cost in comparison to the updates of the indexes themselves. With the present technology, a batch of daily updates can be processed in a matter of minutes, rather than a few hours for rebuilding an index, and a query may be answered with assurances that the results are accurate or within a threshold of accuracy.

    INDEXING AND QUERYING DATA STORES USING CONCATENATED TERMS
    114.
    Patent application
    Status: Active

    Publication number: US20100185629A1

    Publication date: 2010-07-22

    Application number: US12350977

    Filing date: 2009-01-09

    CPC classification number: G06F17/30634 G06F17/30631 G06F17/30657

    Abstract: Tools and techniques for indexing and querying data stores using concatenated terms are provided. These tools may receive input queries that include at least two query terms. The query terms are correlated respectively with fields contained within records within a data store, with these fields being populated with respective field values. The query terms are arranged according to an indexing priority according to which the fields are ranked within an indexing table, which is associated with the data store. The tools then concatenate the query terms as arranged according to the indexing priority. In turn, the tools search the index table for any entries that are responsive to the concatenated query terms.
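
    The arrange, concatenate, and search steps can be sketched as follows. The field names, the priority order in INDEX_PRIORITY, the separator, and the choice to index every prefix of a record's key are all assumptions for illustration; the abstract does not specify them.

```python
# Hypothetical field ranking; the abstract only says that fields are ranked
# by an indexing priority, not which fields exist.
INDEX_PRIORITY = ["country", "city", "street"]
SEP = "\x1f"  # unit separator, assumed never to occur inside field values


def concat_key(terms):
    """Arrange the supplied field values by indexing priority and join them."""
    ordered = [terms[field] for field in INDEX_PRIORITY if field in terms]
    return SEP.join(ordered)


def build_index(records):
    """Map each prefix of a record's concatenated key to the matching record ids."""
    index = {}
    for rec_id, rec in records.items():
        values = [rec[field] for field in INDEX_PRIORITY]
        for n in range(1, len(values) + 1):
            index.setdefault(SEP.join(values[:n]), []).append(rec_id)
    return index


def search(index, query_terms):
    """Look up the concatenated query terms in the index table."""
    return index.get(concat_key(query_terms), [])


records = {
    1: {"country": "US", "city": "Austin", "street": "Main"},
    2: {"country": "US", "city": "Austin", "street": "Oak"},
    3: {"country": "US", "city": "Boston", "street": "Main"},
}
index = build_index(records)
hits = search(index, {"city": "Austin", "country": "US"})
```

    The query terms may arrive in any order because concat_key reorders them. In this sketch a query that skips a higher-priority field (for example country and street without city) finds nothing, since only key prefixes are indexed.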

    Number-range search system and method
    115.
    Granted patent
    Status: Active

    Publication number: US07693824B1

    Publication date: 2010-04-06

    Application number: US10690401

    Filing date: 2003-10-20

    Abstract: A system and method are disclosed for generating numerical index terms for numbers encountered in documents indexed by a search engine. The numerical index terms include information about the indexed number (e.g., fieldname, characteristic, sign) and each digit, or a subset of the digits, of the number (e.g., position, value). Also disclosed are a system and method of processing number-range search queries having one or more number ranges and generating expressions (e.g., Boolean expression tree) of numerical index terms based on a boundary number associated with the number range. An expression is used to control the search of a document index so as to identify documents that contain numbers that satisfy the expression.
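
    One way to picture per-digit index terms and a boundary-driven Boolean expression is the sketch below. It is a simplification under stated assumptions: only non-negative integers of at most WIDTH digits, zero-padded; only a "greater than boundary" range; sign and characteristic are left out; and the term layout (field, position, digit) is hypothetical rather than the patent's encoding.

```python
WIDTH = 6  # assumed fixed digit count; numbers are zero-padded to this width


def index_terms(field, number):
    """One term per digit: (field, position, digit); position 0 is most significant."""
    digits = str(number).zfill(WIDTH)
    return {(field, pos, d) for pos, d in enumerate(digits)}


def greater_than_expr(field, boundary):
    """Build a Boolean expression tree (nested tuples) matching numbers > boundary."""
    b = str(boundary).zfill(WIDTH)
    branches = []
    for pos, bd in enumerate(b):
        # Digits before `pos` equal the boundary's, digit at `pos` is larger.
        prefix_equal = [("term", (field, j, b[j])) for j in range(pos)]
        larger = [("term", (field, pos, str(d))) for d in range(int(bd) + 1, 10)]
        if larger:
            branches.append(("and", prefix_equal + [("or", larger)]))
    return ("or", branches)


def evaluate(expr, terms):
    """Evaluate an expression tree against the term set of one document."""
    op, arg = expr
    if op == "term":
        return arg in terms
    results = (evaluate(sub, terms) for sub in arg)
    return all(results) if op == "and" else any(results)


docs = {"a": 42, "b": 250, "c": 100, "d": 101}
expr = greater_than_expr("price", 100)
matches = [name for name, n in docs.items() if evaluate(expr, index_terms("price", n))]
```

    Here the expression is evaluated document by document for clarity; a search engine would instead resolve each term against its posting list and combine the lists with the same AND/OR structure.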

    DATABASE AND INDEX ORGANIZATION FOR ENHANCED DOCUMENT RETRIEVAL
    116.
    Patent application
    Status: Active

    Publication number: US20090319524A1

    Publication date: 2009-12-24

    Application number: US12471748

    Filing date: 2009-05-26

    Abstract: A customized, specialty-oriented database and index of a subject matter area, and methods for constructing and using such a database, are provided. Selection and indexing of articles are done by experts in the topic with which the database is concerned. As a result, articles are indexed in a manner that allows facile, rapid retrieval of highly relevant articles with few or no false positives, and with much reduced database maintenance cost through frugal limitation of the number of documents in the database, the number of terms in a Master Index, and the number of codes assigned to each document. A thesaurus allows indexing and search in accordance with terminology familiar to different anticipated groups of users (e.g., doctors, patients, nurses, technicians, and the like). Key article collections and rapid access to documents therein are also provided.

    Progressive reference system, method and apparatus
    117.
    Patent application
    Status: Under examination (published)

    Publication number: US20090106206A1

    Publication date: 2009-04-23

    Application number: US12284706

    Filing date: 2008-09-24

    Abstract: A written document in electronic format (hereinafter referred to as a “work,” which includes stories, novels, education texts, biographies, compilations, collections, anthologies, tracts, and any other traditional format for relatively extensive texts) provides access to reference, bibliography and/or definition material through an electronic software capability associated with the work. Depending upon reader access information or characteristics (e.g., age, grade, proficiency, or position within the work or any other identifiable reader characteristic or access limitation), any request for reference material, definitions, explanations, translations, or other material provided in the associated software capability is automatically limited by system acknowledgement of the reader access information or characteristics. As the reader's access information or characteristics change, the quality and/or quantity and/or format of requested information with respect to a work changes.

    INDEX FOR DATA RETRIEVAL AND DATA STRUCTURING
    118.
    Patent application
    Status: Lapsed

    Publication number: US20080319954A1

    Publication date: 2008-12-25

    Application number: US12202299

    Filing date: 2008-08-31

    Inventor: Volker BOETTIGER

    Abstract: A system for generating an index for a retrieval of data provided by at least one document is disclosed. The method and system comprise selecting data within the at least one document, assigning a category to the selected data, and assigning a timestamp to the selected data. The method and system further include storing the selected data, the category, the timestamp and a location indication of the selected data as an entry of the index. The present invention therefore provides an effective and universally adaptive tool for contextual structuring and retrieval of data distributed over a plurality of electronic documents.
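
    The four stored items (selected data, category, timestamp, location indication) suggest a simple record type. The sketch below is illustrative only: the class IndexEntry, the helper names, and the "document#offset" location format are assumptions, not details from the patent.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class IndexEntry:
    data: str            # the selected data
    category: str        # category assigned to the selection
    timestamp: datetime  # when the selection was made (assumed meaning)
    location: str        # assumed format: "document#character-offset"


def add_entry(index, document, text, start, end, category, now=None):
    """Select text[start:end] from a document and store it as an index entry."""
    stamp = now if now is not None else datetime.now(timezone.utc)
    entry = IndexEntry(text[start:end], category, stamp, f"{document}#{start}")
    index.append(entry)
    return entry


def by_category(index, category):
    """Retrieve all entries carrying the given category, across documents."""
    return [entry for entry in index if entry.category == category]


index = []
text = "Invoice 4711 due 2008-09-30"
entry = add_entry(index, "mail.txt", text, 8, 12, "invoice-number",
                  now=datetime(2008, 8, 31, tzinfo=timezone.utc))
```

    Because every entry keeps its own location indication, by_category can gather related selections from many documents while still pointing back to where each one came from.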

    System and Method for Providing a Trustworthy Inverted Index to Enable Searching of Records
    119.
    Patent application
    Status: Lapsed

    Publication number: US20080059420A1

    Publication date: 2008-03-06

    Application number: US11466173

    Filing date: 2006-08-22

    CPC classification number: G06F17/30631 G06F21/64

    Abstract: A trustworthy inverted index system processes records to identify features for indexing, generates posting lists corresponding to features in a dictionary, maintains in a storage cache a tail of at least one of the posting lists to minimize random I/Os to the index, determines a desired number of the posting lists based on a desired level of insertion performance, a query performance, or a size of the storage cache, and reads a posting list corresponding to a search feature in a query to identify records that comprise the search feature. The system maps the features in the dictionary to the desired number of posting lists. The system uses a jump pointer to point from one entry to the next in the posting lists based on increasing values of entries in the posting lists.
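
    The idea of mapping many dictionary features onto a smaller, chosen number of posting lists can be sketched as below. This covers only that mapping and append-only insertion; the storage cache and jump pointers are not modeled. NUM_LISTS, the CRC32-based mapping, and the (record id, feature) entry layout are assumptions for illustration.

```python
import zlib

NUM_LISTS = 4  # assumed; the abstract derives this number from performance targets and cache size


def list_for(feature):
    """Map a dictionary feature to one of the shared posting lists."""
    return zlib.crc32(feature.encode("utf-8")) % NUM_LISTS


def insert(lists, record_id, features):
    """Append-only insertion: entries are added at the tail and never rewritten."""
    for feature in sorted(set(features)):
        lists[list_for(feature)].append((record_id, feature))


def lookup(lists, feature):
    """Read the one posting list for the feature and keep only its own entries."""
    return [rid for rid, f in lists[list_for(feature)] if f == feature]


lists = [[] for _ in range(NUM_LISTS)]
insert(lists, 1, ["audit", "memo"])
insert(lists, 2, ["memo"])
insert(lists, 3, ["audit", "invoice"])
memo_records = lookup(lists, "memo")
```

    Fewer lists mean fewer list tails to keep cached during insertion, at the cost of scanning entries for unrelated features at query time, which is the trade-off between insertion and query performance that the abstract mentions.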

    METHOD AND SYSTEM FOR SECURING APPLICATION INFORMATION IN SYSTEM-WIDE SEARCH ENGINES
    120.
    Patent application
    Status: Active

    Publication number: US20080033954A1

    Publication date: 2008-02-07

    Application number: US11462937

    Filing date: 2006-08-07

    CPC classification number: G06F21/33 G06F17/30613 G06F17/30631 G06F17/30867

    Abstract: A system for securing application information in a shared, system-wide search service. Each application can register a security filtering module that is to be used at search time to filter data associated with that application. When a user performs a search, initial, unfiltered search results are obtained based on the contents of the shared search index. The unfiltered search results are organized by application, and previously registered filter modules are called to perform user specific, per-application filtering on the initial results. The filter modules cause data to which the user issuing the search request does not have access to be removed from the search results, on a per application basis. Those of the initial search results that are determined in this way to not be accessible to the user issuing the search request are removed, resulting in a set of filtered search results that are presented to the user. The filtered search results thus contain indications only of data that is accessible to the user. In this way, the system-wide search service filters search results to remove indications of data which match the search criteria provided by the user, but to which the user does not have access, based on a conveniently extensible, per-application search result filtering process.
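
    The register-then-filter flow described here can be sketched in a few lines. The function names, the shape of an index item, the substring match, and the policy of hiding results from applications with no registered filter are assumptions for illustration, not details from the patent.

```python
from collections import defaultdict

_filters = {}  # application name -> filter callable(user, hits) -> visible hits


def register_filter(app, filter_fn):
    """Called by an application to register its security filtering module."""
    _filters[app] = filter_fn


def search(index, query, user):
    """Search the shared index, then apply per-application, user-specific filtering."""
    # Initial, unfiltered results from the shared index.
    hits = [item for item in index if query in item["text"]]
    # Organize the unfiltered results by application.
    by_app = defaultdict(list)
    for hit in hits:
        by_app[hit["app"]].append(hit)
    visible = []
    for app, app_hits in by_app.items():
        filter_fn = _filters.get(app)
        if filter_fn is None:
            continue  # assumption: no registered filter means nothing is shown
        visible.extend(filter_fn(user, app_hits))
    return visible


index = [
    {"app": "mail", "owner": "ann", "text": "budget draft"},
    {"app": "mail", "owner": "bob", "text": "budget final"},
    {"app": "wiki", "owner": "ann", "text": "budget notes"},
]
register_filter("mail", lambda user, hits: [h for h in hits if h["owner"] == user])
results = search(index, "budget", "ann")
```

    Adding a new application only requires one more register_filter call; the search function itself does not change, which is the extensibility the abstract points to.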
