Semantically aggregated index in an indexer-agnostic index building system
    51.
    Granted patent
    Legal status: active

    Publication number: US09104749B2

    Publication date: 2015-08-11

    Application number: US13005425

    Filing date: 2011-01-12

    CPC classification number: G06F17/30631 G06F17/30235 G06F17/3071

    Abstract: A computer program product for an indexer-agnostic index building system includes a computer readable storage medium to store a computer readable program, wherein the computer readable program, when executed on a computer, causes the computer to perform operations for creating a semantically aggregated index. The operations include: extracting documents from a data source, wherein each document includes a data object; distributing the documents to a plurality of processing nodes within the system; for each node: indexing the data objects for each document into fields using semantic rules; and grouping indexed data objects for related fields by: classifying the documents into logical groups based on the semantic rules; and creating a searchable index shard for related logical groups.

    METHOD AND SYSTEMS FOR FLEXIBLE AND SCALABLE DATABASES
    52.
    Patent application
    Legal status: active

    Publication number: US20150213116A1

    Publication date: 2015-07-30

    Application number: US14680178

    Filing date: 2015-04-07

    Abstract: Methods and systems for utilizing a database are disclosed. The methods and systems determine a key representative of a storage location of first RDF data in a NoSQL database. In addition, the methods and systems read the first RDF data in the NoSQL database using the key. The methods and systems also write second RDF data derived from the first RDF data into a second database stored in memory. The methods and systems may also modify the second RDF data, and write third RDF data derived from the modified second RDF data into the NoSQL database.

    SYSTEM AND METHOD OF MANAGING CAPACITY OF SEARCH INDEX PARTITIONS
    53.
    Patent application
    Legal status: active

    Publication number: US20150074080A1

    Publication date: 2015-03-12

    Application number: US14539542

    Filing date: 2014-11-12

    Applicant: Open Text SA

    Abstract: A search system can maintain a search index of metadata and text for objects in one or more repositories or distributed across a network. The search index can be divided into partitions with a partition assigned a first capacity utilization threshold and a second capacity utilization threshold. If the capacity utilization of the partition is below the first threshold, the system can add, update and delete information in the partition. If the capacity utilization of the partition is above the first threshold, the system can update and delete information in the partition, but cannot add information for new objects to the partition. If the capacity utilization of the partition is above the second threshold, the system can enter a rebalancing mode in which it seeks to rebalance capacity utilization between partitions. The behavior of the system can change depending upon the size of a partition relative to its configurable thresholds.
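
    The two-threshold behavior described in this abstract can be sketched as a small state function. This is only an illustration, not the patented implementation: the names `PartitionMode`, `partition_mode` and `can_add_new_objects` are hypothetical, utilization is assumed to be a fraction between 0 and 1, the second threshold is assumed to be at least as large as the first, and a value exactly equal to a threshold is assumed to count as not yet above it.

```python
from enum import Enum


class PartitionMode(Enum):
    READ_WRITE = "read_write"    # add, update and delete are allowed
    UPDATE_ONLY = "update_only"  # update and delete only; no new objects
    REBALANCE = "rebalance"      # system seeks to rebalance partitions


def partition_mode(utilization, first_threshold, second_threshold):
    """Map a partition's capacity utilization to its operating mode."""
    if first_threshold > second_threshold:
        raise ValueError("first_threshold must not exceed second_threshold")
    if utilization > second_threshold:
        return PartitionMode.REBALANCE
    if utilization > first_threshold:
        return PartitionMode.UPDATE_ONLY
    # Assumption: a value exactly at the first threshold still accepts new objects.
    return PartitionMode.READ_WRITE


def can_add_new_objects(mode):
    """Only a partition below its first threshold accepts new objects."""
    return mode is PartitionMode.READ_WRITE
```

    With thresholds of 0.80 and 0.90, for example, a partition at 0.85 would still accept updates and deletes but would refuse new objects.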

    Securing application information in system-wide search engines
    54.
    Granted patent
    Legal status: active

    Publication number: US08938474B2

    Publication date: 2015-01-20

    Application number: US11462937

    Filing date: 2006-08-07

    CPC classification number: G06F21/33 G06F17/30613 G06F17/30631 G06F17/30867

    Abstract: A system for securing application information in a shared, system-wide search service. Each application can register a security filtering module that is to be used at search time to filter data associated with that application. When a user performs a search, initial, unfiltered search results are obtained based on the contents of the shared search index. The unfiltered search results are organized by application, and previously registered filter modules are called to perform user specific, per-application filtering on the initial results. The filter modules cause data to which the user issuing the search request does not have access to be removed from the search results, on a per application basis. Those of the initial search results that are determined in this way to not be accessible to the user issuing the search request are removed, resulting in a set of filtered search results that are presented to the user. The filtered search results thus contain indications only of data that is accessible to the user. In this way, the system-wide search service filters search results to remove indications of data which match the search criteria provided by the user, but to which the user does not have access, based on a conveniently extensible, per-application search result filtering process.
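
    The register-then-filter flow in this abstract might look roughly like the following sketch. Everything here is hypothetical: the class `SystemSearch`, its methods, and the filter signature `filter_fn(user, item_ids)` are invented for illustration, matching is reduced to a case-insensitive substring test, and applications that registered no filter are assumed to pass through unfiltered (the abstract does not say what happens in that case).

```python
from collections import defaultdict


class SystemSearch:
    def __init__(self):
        self._filters = {}   # application name -> filter callable
        self._index = []     # (application name, item id, text)

    def register_filter(self, app, filter_fn):
        """filter_fn(user, item_ids) returns the item ids the user may see."""
        self._filters[app] = filter_fn

    def add(self, app, item_id, text):
        self._index.append((app, item_id, text))

    def search(self, user, term):
        # Step 1: unfiltered results from the shared index, organized by application.
        by_app = defaultdict(list)
        for app, item_id, text in self._index:
            if term.lower() in text.lower():
                by_app[app].append(item_id)
        # Step 2: each registered filter removes items the user cannot access.
        results = []
        for app, item_ids in by_app.items():
            filter_fn = self._filters.get(app)
            # Assumption: applications without a filter are returned unfiltered.
            allowed = filter_fn(user, item_ids) if filter_fn else item_ids
            results.extend((app, item_id) for item_id in allowed)
        return results
```

    Keeping the access check inside each application's own callable is what makes the scheme extensible: the shared search service never needs to understand any application's permission model.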

    Systems and methods for efficiently storing index data on an electronic device
    55.
    Granted patent
    Legal status: active

    Publication number: US08930371B1

    Publication date: 2015-01-06

    Application number: US12164913

    Filing date: 2008-06-30

    CPC classification number: G06F17/30631

    Abstract: A method for efficiently storing index data on an electronic device may include storing index data in data pages within an indexing data structure on the electronic device. The method may also include providing at least one directory for the indexing data structure. The method may also include dynamically modifying how many directory levels are provided for the indexing data structure in response to changes to the data pages within the indexing data structure.

    MESSAGE INDEX SUBDIVIDED BASED ON TIME INTERVALS
    56.
    Patent application
    Legal status: active

    Publication number: US20140359029A1

    Publication date: 2014-12-04

    Application number: US13935088

    Filing date: 2013-07-03

    Abstract: During a storage technique, multiple messages (such as emails) associated with a user of a communication application are received. Then, the multiple messages are stored in a message table associated with the user and the multiple messages are indexed in an index associated with the user. This index may be divided into multiple divisions if a total number of messages stored in the message table exceeds a threshold value, where each division corresponds to messages received during a different time interval.
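
    As a rough illustration of the threshold-triggered split described in this abstract, the sketch below divides a per-user index by receive time once the message count exceeds a threshold. The function name `divide_index`, the fixed-width interval, and the `"all"` key for the undivided case are assumptions made for the example; the abstract does not specify how interval boundaries are chosen.

```python
from collections import defaultdict


def divide_index(messages, threshold, interval_seconds):
    """messages: list of (message_id, received_timestamp) pairs.

    Returns a dict mapping a division key to the message ids it indexes.
    """
    if len(messages) <= threshold:
        # At or below the threshold: keep a single undivided index.
        return {"all": [message_id for message_id, _ in messages]}
    divisions = defaultdict(list)
    for message_id, received in messages:
        # Key each division by the start of its time interval.
        interval_start = int(received // interval_seconds) * interval_seconds
        divisions[interval_start].append(message_id)
    return dict(divisions)
```

    A search limited to a date range would then only need to consult the divisions whose interval overlaps that range.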

    Index partition maintenance over monotonically addressed document sequences
    57.
    Granted patent
    Legal status: active

    Publication number: US08738673B2

    Publication date: 2014-05-27

    Application number: US12875615

    Filing date: 2010-09-03

    CPC classification number: G06F17/30233 G06F17/30584 G06F17/30631

    Abstract: Provided are techniques for partitioning a physical index into one or more physical partitions; assigning each of the one or more physical partitions to a node in a cluster of nodes; for each received document, assigning an assigned-doc-ID comprising an integer document identifier; and, in response to assigning the assigned-doc-ID to a document, determining a cut-off of assignment of new documents to a current virtual-index-epoch comprising a first set of physical partitions and placing the new documents into a new virtual-index-epoch comprising a second set of physical partitions by inserting each new document to a specific one of the physical partitions in the second set using one or more functions that direct the placement based on one of the assigned-doc-ID, a field value derived from a set of fields obtained from the document, and a combination of the assigned-doc-ID and the field value.

    Multi-level version format
    58.
    Granted patent
    Legal status: active

    Publication number: US08676771B2

    Publication date: 2014-03-18

    Application number: US13492814

    Filing date: 2012-06-09

    Abstract: A system and method for maintaining version information. An identifier (“ID”) that identifies a collection of associated files is obtained. An index is generated that specifies the contents of the collection of associated files. The ID may be saved along with the index in a target version file to convey version information about the collection of associated files. Subsequently, the index may be extracted from the target version file to compare with a corresponding index extracted from a reference version file. The result of the comparison may be used to determine whether the contents of the collection of associated files match a reference.
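
    One way to picture the version-file comparison in this abstract is the sketch below. It is an assumption-laden example rather than the patented format: the index is taken to be a mapping from file name to SHA-256 digest, the version file is represented as a JSON string, and the names `build_version_record` and `matches_reference` are hypothetical.

```python
import hashlib
import json


def build_version_record(collection_id, files):
    """files: mapping of file name -> bytes. Returns a JSON version record."""
    # Assumed index format: one content digest per file in the collection.
    index = {name: hashlib.sha256(data).hexdigest() for name, data in files.items()}
    return json.dumps({"id": collection_id, "index": index}, sort_keys=True)


def matches_reference(target_record, reference_record):
    """Compare the indexes extracted from a target and a reference record."""
    target_index = json.loads(target_record)["index"]
    reference_index = json.loads(reference_record)["index"]
    return target_index == reference_index
```

    Because only the extracted indexes are compared, two collections with different IDs can still be reported as matching when their contents are identical.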

    Analytics Data Indexing System and Methods
    59.
    Patent application
    Legal status: active

    Publication number: US20140052732A1

    Publication date: 2014-02-20

    Application number: US13219462

    Filing date: 2011-08-26

    CPC classification number: G06F17/30631

    Abstract: Provided is a method for updating index data. The method includes receiving index data, including an index value indicative of user activity on a network site and an index time corresponding to a time used for calculating the index value, receiving an update index time corresponding to a time used for updating the index data, determining an updated index value using an exponential decay of the index value from the index time to the update index time, wherein the updated index value comprises a decayed value of the index value corresponding to the update time, and storing updated index data including the updated index value and the update index time.
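
    The decay step in this abstract can be written as updated = value * exp(-rate * (update_time - index_time)). The sketch below is a minimal worked version of that formula. The abstract does not say how the decay rate is chosen, so the `half_life` parameter and the function name `update_index` are assumptions made for the example.

```python
import math


def update_index(index_value, index_time, update_time, half_life):
    """Return (updated_value, update_time) after exponential decay."""
    elapsed = update_time - index_time
    if elapsed < 0:
        raise ValueError("update_time must not precede index_time")
    # Assumed parameterization: the value halves every `half_life` time units.
    decay_rate = math.log(2) / half_life
    updated_value = index_value * math.exp(-decay_rate * elapsed)
    return updated_value, update_time
```

    Storing the update time alongside the decayed value means the next update can be computed from that pair alone, without replaying earlier activity.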

    REAL TIME CONTENT SEARCHING IN SOCIAL NETWORK

    Publication number: US20130246390A1

    Publication date: 2013-09-19

    Application number: US13866095

    Filing date: 2013-04-19

    Abstract: Indexing and retrieving real time content in a social networking system is disclosed. A user-term index includes user-term partitions, each user-term partition comprising temporal databases. As a post is received from a user, a user identifier, a post identifier, and a post are extracted. An object store communicatively coupled to a temporal database for recently received content is queried to determine whether terms in the post have already been stored. A term identifier is stored in the user-term index with the user and post identifiers. A forward index stores the post by post identifier. Responsive to a search query, the user-term index is searched by the user's connections and the terms. A real time search engine compiles the results of the user-term index query and retrieves the stored posts from the forward index. The search results may then be ranked and cached before presentation to the searching user.
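
    The interplay of the user-term index and the forward index in this abstract can be approximated with two in-memory maps. This is a heavily simplified sketch: the class `RealTimeIndex` is hypothetical, terms are stored directly instead of as term identifiers, partitions, temporal databases, ranking and caching are left out, and a post is assumed to match only when it contains every query term.

```python
from collections import defaultdict


class RealTimeIndex:
    def __init__(self):
        self.user_term = defaultdict(set)  # (user_id, term) -> post ids
        self.forward = {}                  # post_id -> post text

    def add_post(self, user_id, post_id, text):
        # Forward index: store the post by its identifier.
        self.forward[post_id] = text
        # User-term index: record each term with the user and post identifiers.
        for term in set(text.lower().split()):
            self.user_term[(user_id, term)].add(post_id)

    def search(self, connections, terms):
        """Return (post_id, text) for posts by connections containing every term."""
        hits = set()
        for user_id in connections:
            per_term = [self.user_term.get((user_id, t.lower()), set()) for t in terms]
            if per_term:
                hits |= set.intersection(*per_term)
        # Retrieve the stored posts from the forward index.
        return [(post_id, self.forward[post_id]) for post_id in sorted(hits)]
```

    Keying the index by user as well as term lets a query touch only the entries belonging to the searcher's connections instead of scanning every post that contains the term.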
