PRELIMINARY RANKER FOR SCORING MATCHING DOCUMENTS
    91.
    Invention application
    Status: Under examination (published)

    Publication number: US20160378769A1

    Publication date: 2016-12-29

    Application number: US15186226

    Filing date: 2016-06-17

    IPC classification: G06F17/30

    Abstract: The technology described herein provides for preliminary ranking of matching documents for a search query. A preliminary ranker uses score tables for scoring each matching document based on its relevance to a search query. The score table for a document stores pre-computed data used to derive a frequency of terms and other information in the document. The preliminary ranker uses the score table for each matching document and the terms from the search query to determine a score for each matching document. The lowest scoring documents are removed from further consideration by a final ranker.
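
The scoring-and-pruning step in this abstract can be sketched in Python as follows. This is only an illustration under stated assumptions: the score table is taken to be a per-document map from term to a precomputed term frequency, the score is the plain sum of those frequencies over the query terms, and the names `build_score_table`, `preliminary_rank` and `keep` are hypothetical. The application itself does not fix any of these details in the abstract.

```python
from collections import Counter


def build_score_table(text):
    """Precompute per-term frequencies for one document (assumed table layout)."""
    return Counter(text.lower().split())


def preliminary_rank(score_tables, query, keep):
    """Score each matching document from its table and keep the best `keep`.

    score_tables maps a document id to its Counter of term frequencies.
    Documents outside the top `keep` are dropped before any final ranking.
    """
    terms = query.lower().split()
    scores = {
        doc_id: sum(table[term] for term in terms)  # Counter yields 0 if absent
        for doc_id, table in score_tables.items()
    }
    # Sort by descending score; break ties by document id for determinism.
    ranked = sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
    return ranked[:keep]


docs = {
    "a": "index search index ranking",
    "b": "search query",
    "c": "cooking recipes",
}
tables = {doc_id: build_score_table(text) for doc_id, text in docs.items()}
survivors = preliminary_rank(tables, "index search", keep=2)
```

Here document "c" scores lowest and is removed, so only "a" and "b" would be passed on to a final ranker.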

    Message index subdivided based on time intervals
    94.
    Granted invention patent
    Status: In force

    Publication number: US09514217B2

    Publication date: 2016-12-06

    Application number: US13935088

    Filing date: 2013-07-03

    IPC classification: G06F15/16 G06F17/30 H04L12/58

    Abstract: During a storage technique, multiple messages (such as emails) associated with a user of a communication application are received. Then, the multiple messages are stored in a message table associated with the user and the multiple messages are indexed in an index associated with the user. This index may be divided into multiple divisions if a total number of messages stored in the message table exceeds a threshold value, where each division corresponds to messages received during a different time interval.
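
A rough Python sketch of the threshold-triggered division described in this abstract is given below. It assumes a simple in-memory message table, a term-to-row index, and one division per calendar year; the class name `UserMessageStore` and all of its members are hypothetical and are not taken from the patent.

```python
from collections import defaultdict
from datetime import datetime


class UserMessageStore:
    """Per-user message table plus a term index that may be split by year."""

    def __init__(self, threshold):
        self.threshold = threshold   # max messages before the index is divided
        self.messages = []           # message table: (received, text) rows
        self.divided = False
        self.index = {}              # division key -> {term -> set of row ids}

    def _division_key(self, received):
        # Assumption: one division per calendar year once the index is divided.
        return received.year if self.divided else "all"

    def _index_message(self, row_id, received, text):
        key = self._division_key(received)
        division = self.index.setdefault(key, defaultdict(set))
        for term in text.lower().split():
            division[term].add(row_id)

    def add(self, received, text):
        self.messages.append((received, text))
        if not self.divided and len(self.messages) > self.threshold:
            # Threshold exceeded: rebuild the index as per-interval divisions.
            self.divided = True
            self.index = {}
            for row_id, (when, body) in enumerate(self.messages):
                self._index_message(row_id, when, body)
        else:
            self._index_message(len(self.messages) - 1, received, text)


store = UserMessageStore(threshold=2)
store.add(datetime(2012, 5, 1), "hello world")
store.add(datetime(2012, 6, 1), "hello again")
store.add(datetime(2013, 1, 1), "new year hello")  # third message triggers the split
```

After the third message the single index is replaced by two divisions, keyed 2012 and 2013, so a search restricted to one year only has to consult that year's division.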

    DYNAMIC THRESHOLD GATES FOR INDEXING QUEUES
    95.
    Invention application
    Status: In force

    Publication number: US20160259785A1

    Publication date: 2016-09-08

    Application number: US14635093

    Filing date: 2015-03-02

    IPC classification: G06F17/30

    Abstract: Electronic files are selectively assigned to a plurality of different indexing queues by one or more dynamic throughput threshold gates based on characteristics of the different indexing queues as well as the static file characteristics associated with each of the files. The files are then indexed. Upon detecting a change in a dynamic characteristic of one or more indexed files, the throughput threshold gate(s) are then modified to obtain, maintain or modify a desired throughput for one or more of the indexing queues.
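
The gate-and-adjust cycle in this abstract might look roughly like the Python sketch below. It assumes a single gate that routes on file size (one possible static file characteristic) between two queues named "fast" and "slow", and that the gate is tuned by comparing an observed indexing rate with a target rate. The class `ThresholdGate`, the queue names and the fixed adjustment `step` are all hypothetical.

```python
from collections import deque


class ThresholdGate:
    """Routes files to an indexing queue by size; the limit can be re-tuned."""

    def __init__(self, size_limit):
        self.size_limit = size_limit  # bytes; files at or below go to "fast"

    def route(self, file_size):
        return "fast" if file_size <= self.size_limit else "slow"

    def adjust(self, observed_rate, target_rate, step):
        """Move the limit toward the desired throughput of the fast queue."""
        if observed_rate < target_rate:
            # Fast queue is falling behind: admit fewer (smaller) files.
            self.size_limit = max(0, self.size_limit - step)
        elif observed_rate > target_rate:
            # Fast queue has headroom: admit larger files.
            self.size_limit += step


queues = {"fast": deque(), "slow": deque()}
gate = ThresholdGate(size_limit=1000)
for name, size in [("a.txt", 500), ("b.pdf", 900), ("c.iso", 5000)]:
    queues[gate.route(size)].append(name)

# Observed throughput is below target, so the gate tightens by 200 bytes.
gate.adjust(observed_rate=5, target_rate=10, step=200)
```

With the limit lowered from 1000 to 800, a later 900-byte file would be routed to the slow queue instead of the fast one.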

    Managing time series databases
    97.
    Granted invention patent
    Status: In force

    Publication number: US09361329B2

    Publication date: 2016-06-07

    Application number: US14105660

    Filing date: 2013-12-13

    IPC classification: G06F17/30

    Abstract: A method for building indices for a time sequence in a time series database includes dividing, using a processing device, a time sequence in the time series database into a plurality of subsequences based on a sliding window; building spatial indices for the plurality of subsequences, the spatial indices being used for defining spatial locations of subsequences in the plurality of subsequences in the time sequence; and building content indices for the plurality of subsequences, the content indices being used for defining content ranges of subsequences in the plurality of subsequences.
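
As a minimal Python sketch of the two index types named in this abstract, the code below assumes that a spatial index entry is the half-open offset range of a subsequence and that a content index entry is the (minimum, maximum) of its values. The function names `build_indices` and `candidates`, and the `stride` parameter, are hypothetical choices made for the illustration.

```python
def build_indices(sequence, window, stride):
    """Split a sequence with a sliding window and index each subsequence.

    Returns two parallel lists:
      spatial: (start, end) half-open offsets of each subsequence
      content: (min, max) value range of each subsequence
    """
    spatial, content = [], []
    for start in range(0, len(sequence) - window + 1, stride):
        chunk = sequence[start:start + window]
        spatial.append((start, start + window))
        content.append((min(chunk), max(chunk)))
    return spatial, content


def candidates(spatial, content, low, high):
    """Locations of subsequences whose value range overlaps [low, high]."""
    return [
        position
        for position, (lo, hi) in zip(spatial, content)
        if lo <= high and hi >= low
    ]


series = [1, 3, 2, 8, 9, 7, 2, 1]
spatial, content = build_indices(series, window=4, stride=2)
hits = candidates(spatial, content, low=9, high=10)
```

A range query can then skip any subsequence whose content range cannot contain the requested values, and use the spatial index to locate the remaining ones in the original sequence.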

    Search device, search method and recording medium
    98.
    Granted invention patent
    Status: In force

    Publication number: US09292508B2

    Publication date: 2016-03-22

    Application number: US14137319

    Filing date: 2013-12-20

    Inventor: Katsuhiko Satoh

    IPC classification: G06F17/00 G06F17/30

    CPC classification: G06F17/30011 G06F17/30619

    Abstract: A search device comprises a memory device for storing document data containing search target character strings to which delimiting characters are appended at both ends; an acquirer for acquiring keywords; a generator for generating a search character string by appending delimiting characters to both ends of the keywords; a designator for designating appearance positions where partial strings extracted from the search character string appear in the search target character string of the document data; a determiner for determining the frequency with which partial strings common to the partial strings of the search character string appear with a positional relationship similar to the search character string in the search target character string; an evaluator for evaluating the degree of similarity between the search target character string and the search character string; and an output device for outputting the search target character string.
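
One way to picture the pipeline in this abstract is the Python sketch below. It rests on several assumptions that the abstract does not state: the delimiting character is "#", the partial strings are two-character substrings (bigrams), and the similarity is the share of the search string's bigrams that line up in the target at one consistent offset. The names `DELIM`, `bigrams` and `similarity` are hypothetical.

```python
from collections import Counter

DELIM = "#"  # assumed delimiting character


def bigrams(text):
    """All two-character partial strings of text, in order."""
    return [text[i:i + 2] for i in range(len(text) - 1)]


def similarity(target, keyword):
    """Fraction of the keyword's bigrams that align in the target at one offset."""
    target_string = DELIM + target + DELIM    # delimiters appended at both ends
    search_string = DELIM + keyword + DELIM
    search_grams = bigrams(search_string)

    # Appearance positions of every bigram in the search target string.
    positions = {}
    for i, gram in enumerate(bigrams(target_string)):
        positions.setdefault(gram, []).append(i)

    # Count common bigrams per relative offset; a shared offset means the
    # bigrams keep the same positional relationship as in the search string.
    offsets = Counter()
    for j, gram in enumerate(search_grams):
        for i in positions.get(gram, []):
            offsets[i - j] += 1

    if not offsets:
        return 0.0
    return max(offsets.values()) / len(search_grams)


exact = similarity("search", "search")
partial = similarity("research", "search")
unrelated = similarity("xyz", "search")
```

In this sketch the delimiters make word boundaries count: "research" matches six of the seven bigrams of "#search#" because the leading "#s" is absent, so it scores slightly below an exact match.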

    Method, device and computer program for identifying items having high frequency of occurrence among items included in a text data stream
    99.
    Granted invention patent
    Status: In force

    Publication number: US09292439B2

    Publication date: 2016-03-22

    Application number: US13799951

    Filing date: 2013-03-13

    Abstract: A method, device and computer program for efficiently identifying items having a high frequency of occurrence among items included in a large-volume text data stream. Identification information for identifying an item and a count of items are stored in a higher level of memory, and only identification information is stored in a lower level. Text data stream input is received and divided into buckets. For an item included in a bucket: if its identification information is stored in the higher level of memory, the count of the item is incremented; if it is stored in the lower level, the identification information is transferred to the higher level of memory with the initial count; and if it is not stored on any level, the identification information is newly stored in the higher level with the initial count.
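
The three cases in this abstract can be sketched in Python as below. Two points are assumptions added for the illustration and are not stated in the abstract: the initial count is taken to be 1, and at the end of each bucket any item whose count is below `min_count` is demoted to the lower level, keeping only its identifier. The function name `frequent_items` and its parameters are hypothetical.

```python
def frequent_items(stream, bucket_size, min_count):
    """Track item counts across two storage levels, one bucket at a time.

    higher: item -> count (stands in for the higher level of memory)
    lower:  identifiers only (stands in for the lower level)
    """
    higher = {}
    lower = set()
    for start in range(0, len(stream), bucket_size):
        for item in stream[start:start + bucket_size]:
            if item in higher:
                higher[item] += 1        # already counted: increment
            elif item in lower:
                lower.discard(item)      # promote with the initial count
                higher[item] = 1
            else:
                higher[item] = 1         # unseen: store with the initial count
        # Assumed demotion rule (not in the abstract): low counts lose their
        # counter and keep only their identifier in the lower level.
        for item in [i for i, count in higher.items() if count < min_count]:
            del higher[item]
            lower.add(item)
    return higher, lower


stream = "a b a c a b a d".split()
higher, lower = frequent_items(stream, bucket_size=4, min_count=2)
```

Because demotion discards the counter, the counts of rarely seen items are approximate; only items that stay in the higher level keep an exact running count.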

    PREVIEWING PARSED RAW DATA USING A GRAPHICAL USER INTERFACE
    100.
    Invention application
    Status: In force

    Publication number: US20160055214A1

    Publication date: 2016-02-25

    Application number: US14929332

    Filing date: 2015-10-31

    Applicant: Splunk Inc.

    Abstract: Embodiments are directed towards previewing results generated from indexing raw data before the corresponding index data is added to an index store. Raw data may be received from a preview data source. After an initial set of configuration information is established, the preview data may be submitted to an index processing pipeline. A previewing application may generate preview results based on the preview index data and the configuration information. The preview results may enable previewing how the data is being processed by the indexing application. If the preview results are not acceptable, the configuration information may be modified. The preview application enables modification of the configuration information until the generated preview results are acceptable. If the configuration information is acceptable, the preview data may be processed and indexed in one or more index stores.
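
The preview-then-commit loop in this abstract can be illustrated with the short Python sketch below. It is not Splunk's API: the functions `parse`, `preview` and `commit`, and the configuration keys `line_breaker` and `field_sep`, are hypothetical, and the index store is modeled as a plain list.

```python
def parse(raw, config):
    """Split raw text into events and fields according to the configuration."""
    events = []
    for line in raw.split(config["line_breaker"]):
        if line:
            events.append(line.split(config["field_sep"]))
    return events


def preview(raw, config, limit=3):
    """Return the first parsed events without touching any index store."""
    return parse(raw, config)[:limit]


def commit(raw, config, index_store):
    """Parse with the accepted configuration and add the events to the store."""
    index_store.extend(parse(raw, config))


raw = "a,1\nb,2\n"
index_store = []

config = {"line_breaker": "\n", "field_sep": ";"}
first_try = preview(raw, config)       # fields are not split: wrong separator

config["field_sep"] = ","              # modify the configuration and retry
second_try = preview(raw, config)

commit(raw, config, index_store)       # accepted: now index for real
```

The index store stays empty during both previews and is only written once the configuration has been accepted.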
