-
Publication No.: US20150339344A1
Publication date: 2015-11-26
Application No.: US14815884
Filing date: 2015-07-31
Applicant: Splunk Inc.
Inventors: Alice Emily Neels, Archana Sulochana Ganapathi, Marc Vincent Robichaud, Stephen Phillip Sorkin, Steve Yu Zhang
IPC classification: G06F17/30
CPC classification: G06F17/30395, G06F3/0482, G06F17/248, G06F17/30283, G06F17/30424, G06F17/30528, G06F17/30554, G06F17/30867
Abstract: Embodiments include generating data models that may give semantic meaning for unstructured or structured data that may include data generated and/or received by search engines, including a time series engine. A method includes generating a data model for data stored in a repository. Generating the data model includes generating an initial query string, executing the initial query string on the data, generating an initial result set based on the initial query string being executed on the data, determining one or more candidate fields from one or more results of the initial result set, generating a candidate data model based on the one or more candidate fields, iteratively modifying the candidate data model until the candidate data model models the data, and using the candidate data model as the data model.
-
Publication No.: US20150058375A1
Publication date: 2015-02-26
Application No.: US14530680
Filing date: 2014-10-31
Applicant: Splunk Inc.
Inventors: Steve Yu Zhang, Stephen P. Sorkin
CPC classification: G06F17/3053, G06F17/30194, G06F17/30312, G06F17/30353, G06F17/30386, G06F17/30477, G06F17/30483, G06F17/30486, G06F17/30528, G06F17/30545, G06F17/30551, G06F17/30554, G06F17/30675, G06F17/30864, G06F17/30867, G06F17/30973, G06F17/30991, H04L41/0604, H04L41/22, H04L67/1097
Abstract: A method, system, and processor-readable storage medium are directed towards generating a report derived from data, such as event data, stored on a plurality of distributed nodes. In one embodiment, the analysis is generated using a “divide and conquer” algorithm, such that each distributed node analyzes locally stored event data while an aggregating node combines these analysis results to generate the report. In one embodiment, each distributed node also transmits a list of event data references associated with the analysis result to the aggregating node. The aggregating node may then generate a global ordered list of data references based on the list of event data references received from each distributed node. Subsequently, in response to a user selection of a range of global event data, the report may dynamically retrieve event data from one or more distributed nodes for display according to the global order.
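The divide-and-conquer flow in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `EventRef`, `analyze_locally`, `aggregate`, and `fetch_range` are hypothetical, the "analysis" is assumed to be a simple count per field value, and the global order is assumed to be by timestamp.

```python
import heapq
from collections import Counter
from typing import NamedTuple


class EventRef(NamedTuple):
    """Hypothetical event data reference; tuples compare by timestamp first."""
    timestamp: int  # assumed sort key for the global order
    node_id: str    # node that stores the event
    offset: int     # position of the event in that node's local store


def analyze_locally(node_id, events, field):
    """Runs on each distributed node: partial counts plus sorted references."""
    counts = Counter(event[field] for _, event in events)
    refs = sorted(EventRef(ts, node_id, i) for i, (ts, _) in enumerate(events))
    return counts, refs


def aggregate(partials):
    """Runs on the aggregating node: combine counts, merge reference lists."""
    total = Counter()
    for counts, _ in partials:
        total.update(counts)
    # Each per-node list is already sorted, so a k-way merge gives the global order.
    global_refs = list(heapq.merge(*(refs for _, refs in partials)))
    return total, global_refs


def fetch_range(global_refs, stores, start, stop):
    """Retrieve only the events inside a user-selected slice of the global order."""
    return [stores[ref.node_id][ref.offset][1] for ref in global_refs[start:stop]]
```

Here `stores` maps a node id to that node's list of `(timestamp, event)` pairs. Only the references travel to the aggregating node up front; the event bodies are fetched on demand for the selected range.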
-
Publication No.: US20140317111A1
Publication date: 2014-10-23
Application No.: US14266838
Filing date: 2014-05-01
Applicant: Splunk Inc.
IPC classification: G06F17/30
CPC classification: G06F17/3053, G06F17/30194, G06F17/30312, G06F17/30353, G06F17/30386, G06F17/30477, G06F17/30483, G06F17/30486, G06F17/30528, G06F17/30545, G06F17/30551, G06F17/30554, G06F17/30675, G06F17/30864, G06F17/30867, G06F17/30973, G06F17/30991, H04L41/0604, H04L41/22, H04L67/1097
Abstract: A method, system, and processor-readable storage medium are directed towards generating a report derived from data, such as event data, stored on a plurality of distributed nodes. In one embodiment, the analysis is generated using a “divide and conquer” algorithm, such that each distributed node analyzes locally stored event data while an aggregating node combines these analysis results to generate the report. In one embodiment, each distributed node also transmits a list of event data references associated with the analysis result to the aggregating node. The aggregating node may then generate a global ordered list of data references based on the list of event data references received from each distributed node. Subsequently, in response to a user selection of a range of global event data, the report may dynamically retrieve event data from one or more distributed nodes for display according to the global order.
-
Publication No.: US20140074817A1
Publication date: 2014-03-13
Application No.: US13662369
Filing date: 2012-10-26
Applicant: SPLUNK INC.
Inventors: Alice Emily Neels, Archana Sulochana Ganapathi, Marc Vincent Robichaud, Stephen Phillip Sorkin, Steve Yu Zhang
IPC classification: G06F17/30
CPC classification: G06F17/30395, G06F3/0482, G06F17/248, G06F17/30283, G06F17/30424, G06F17/30528, G06F17/30554, G06F17/30867
Abstract: Embodiments are directed towards generating data models that may give semantic meaning for unstructured data or structured data that may include data generated and/or received by search engines, including a time series engine. Data models also may be generated to provide semantic meaning to structured data. A data model may be composed of hierarchical data model objects analogous to an object-oriented programming class hierarchy. Users may employ a data modeling application to produce reports using search objects that may be part of, or associated with, the data model. The data modeling application may employ the search object and the data model to generate a query string for searching a data repository to produce a result set. A data modeling application may map the result set data to data model objects that may be used to generate reports.
-
Publication No.: US20130054660A1
Publication date: 2013-02-28
Application No.: US13660874
Filing date: 2012-10-25
Applicant: Splunk Inc.
Inventor: Steve Yu Zhang
IPC classification: G06F17/18
CPC classification: G06F17/18, G06F7/22, G06F7/483, G06F7/544, G06F17/30536, G06K9/6222
Abstract: A method, system, and processor-readable storage medium are directed towards calculating approximate order statistics on a collection of real numbers. In one embodiment, the collection of real numbers is processed to create a digest comprising a hierarchy of buckets. Each bucket is assigned a real number N having P digits of precision and ordinality O. The hierarchy is defined by grouping buckets into levels, where each level contains all buckets of a given ordinality. Each individual bucket in the hierarchy defines a range of numbers: all numbers that, after being truncated to that bucket's P digits of precision, are equal to that bucket's N. Each bucket additionally maintains a count of how many numbers have fallen within that bucket's range. Approximate order statistics may then be calculated by traversing the hierarchy and performing an operation on some or all of the ranges and counts associated with each bucket.
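The bucket idea in this abstract can be illustrated with a minimal Python sketch. It is a simplification rather than the patented method: it keeps a single flat level of buckets instead of a hierarchy grouped by ordinality, it assumes that "P digits of precision" means P significant decimal digits, and the function names `truncate`, `build_digest`, and `approx_quantile` are hypothetical.

```python
from collections import Counter
from decimal import Decimal, ROUND_DOWN
from math import ceil


def truncate(x, p):
    """Truncate x toward zero to p significant digits (the bucket's N)."""
    d = Decimal(str(x))
    if d == 0:
        return Decimal(0)
    # adjusted() is the exponent of the most significant digit.
    quantum = Decimal(1).scaleb(d.adjusted() - p + 1)
    return d.quantize(quantum, rounding=ROUND_DOWN)


def build_digest(values, p):
    """Map each bucket value N to the count of numbers that truncate to N."""
    return Counter(truncate(v, p) for v in values)


def approx_quantile(digest, q):
    """Walk buckets in ascending order until the cumulative count reaches the rank."""
    total = sum(digest.values())
    rank = max(1, ceil(q * total))
    seen = 0
    for n in sorted(digest):
        seen += digest[n]
        if seen >= rank:
            return n  # lower bound of the bucket that contains the rank
    raise ValueError("empty digest")
```

For example, `approx_quantile(build_digest(range(1, 101), 1), 0.5)` walks the buckets 1 through 9, then 10, 20, 30, 40, and stops at 50. The answer is only as precise as the bucket width, which is the trade-off for storing counts instead of every number.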
-
Publication No.: US12072939B1
Publication date: 2024-08-27
Application No.: US17589712
Filing date: 2022-01-31
Applicant: Splunk Inc.
Inventors: Alexandros Batsakis, Nir Frenkel, Nitilaksha Halakatti, Balaji Rao, Anish Shrigondekar, Ruochen Zhang, Steve Yu Zhang
IPC classification: G06F16/00, G06F16/23, G06F16/2458, G06F16/903, G06F16/9032
CPC classification: G06F16/90335, G06F16/23, G06F16/2471, G06F16/9032
Abstract: A data intake and query system can generate local data enrichment objects and receive federated data enrichment objects from another data intake and query system. In response to receiving a query, the data intake and query system can determine whether the query is a subquery of a federated query. If the query is a subquery, the data intake and query system can use the federated data enrichment objects to execute the query.
-
Publication No.: US11914562B1
Publication date: 2024-02-27
Application No.: US18166326
Filing date: 2023-02-08
Applicant: SPLUNK INC.
IPC classification: G06F16/22, G06F16/245, G06F16/248, G06F16/27, G06F16/901
CPC classification: G06F16/2228, G06F16/245, G06F16/248, G06F16/278, G06F16/901
Abstract: A method and system for managing searches of a data set that is partitioned based on a plurality of events. A structure of a search query may be analyzed to determine if logical computational actions performed on the data set are reducible. Data in each partition is analyzed to determine if at least a portion of the data in the partition is reducible. In response to a subsequent or reoccurring search request, intermediate summaries of reducible data and reducible search computations may be aggregated for each partition. Next, a search result may be generated based on at least one of the aggregated intermediate summaries, the aggregated reducible search computations, and a query of ad hoc non-reducible data arranged in at least one of the plurality of partitions for the data set.
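One way to picture a reducible computation is an average, which can be rebuilt from per-partition (count, sum) pairs. The Python sketch below is an illustration under that assumption, not the patented system: `Summary`, `PartitionedStore`, and `roll_up` are hypothetical names, and "ad hoc" data is modeled simply as values that have not yet been folded into a cached summary.

```python
from dataclasses import dataclass


@dataclass
class Summary:
    """Intermediate summary sufficient to compute an average."""
    count: int = 0
    total: float = 0.0

    def merge(self, other):
        return Summary(self.count + other.count, self.total + other.total)


def summarize(values):
    return Summary(len(values), sum(values))


class PartitionedStore:
    def __init__(self):
        self.summarized = {}  # partition -> cached Summary
        self.adhoc = {}       # partition -> values not yet summarized

    def add(self, partition, value):
        self.adhoc.setdefault(partition, []).append(value)

    def roll_up(self, partition):
        """Fold a partition's pending values into its cached summary."""
        fresh = summarize(self.adhoc.pop(partition, []))
        cached = self.summarized.get(partition, Summary())
        self.summarized[partition] = cached.merge(fresh)

    def average(self):
        """Combine cached summaries with a scan of the remaining ad hoc data."""
        combined = Summary()
        for cached in self.summarized.values():
            combined = combined.merge(cached)
        for values in self.adhoc.values():
            combined = combined.merge(summarize(values))
        return combined.total / combined.count if combined.count else None
```

A repeated search then touches raw values only in partitions that have not been rolled up; everything else is answered from the cached summaries.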
-
Publication No.: US11836146B1
Publication date: 2023-12-05
Application No.: US17163236
Filing date: 2021-01-29
Applicant: SPLUNK INC.
Inventors: Jay A. Pathak, Steve Yu Zhang
IPC classification: G06F16/20, G06F16/2458, G06F16/22, G06F16/248, G06F16/2457
CPC classification: G06F16/2477, G06F16/2228, G06F16/248, G06F16/24573
Abstract: A computer-implemented method of determining indexed fields at query time comprises indexing time-stamped events ingested from a plurality of source types. The time-stamped searchable events comprise portions of raw data. The method also comprises generating an index containing each keyword in the time-stamped searchable events and an associated location reference of a respective event in which the keyword appears. Further, the method comprises generating a fields metadata file identifying indexed fields in the time-stamped searchable events for each source type. The fields metadata file comprises reference values for accessing indexed fields associated with each source type from the index. The method also comprises accessing the fields metadata file to identify the indexed fields associated with each source type prior to executing a query.
-
Publication No.: US11604779B1
Publication date: 2023-03-14
Application No.: US17316444
Filing date: 2021-05-10
Applicant: Splunk Inc.
IPC classification: G06F16/22, G06F16/245, G06F16/248, G06F16/27, G06F16/901
Abstract: A method and system for managing searches of a data set that is partitioned based on a plurality of events. A structure of a search query may be analyzed to determine if logical computational actions performed on the data set are reducible. Data in each partition is analyzed to determine if at least a portion of the data in the partition is reducible. In response to a subsequent or reoccurring search request, intermediate summaries of reducible data and reducible search computations may be aggregated for each partition. Next, a search result may be generated based on at least one of the aggregated intermediate summaries, the aggregated reducible search computations, and a query of ad hoc non-reducible data arranged in at least one of the plurality of partitions for the data set.
-
Publication No.: US11436222B2
Publication date: 2022-09-06
Application No.: US16591432
Filing date: 2019-10-02
Applicant: Splunk Inc.
IPC classification: G06F17/30, G06F16/2453, G06F16/2458, G06F16/22
Abstract: Embodiments of the present disclosure provide techniques for using an inverted index in a pipelined search query. A field searchable data store is provided that comprises a plurality of event records, each event record comprising a time-stamped portion of raw machine data. Responsive to the receipt of an incoming search query, the search engine accesses an inverted index, wherein each entry in the inverted index comprises at least one field name, a corresponding at least one field value and a reference value associated with each field name and value pair that identifies a location in the data store where an associated event record is stored. Once the inverted index is accessed, it can be used to identify and search a subset of the plurality of event records, wherein the subset comprises one or more event records with corresponding reference values in the inverted index.
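The inverted index described in this abstract can be illustrated with a minimal Python sketch. It is an illustration only: `build_index` and `search` are hypothetical names, the reference value is assumed to be the event's position in an in-memory list, and the pipelined part of the query is left out.

```python
from collections import defaultdict


def build_index(events):
    """Map each (field name, field value) pair to the references of matching events."""
    index = defaultdict(list)
    for ref, event in enumerate(events):
        for name, value in event["fields"].items():
            index[(name, value)].append(ref)
    return index


def search(events, index, **criteria):
    """Use the index to pick the subset of event records, then return their raw text."""
    postings = [set(index.get(pair, ())) for pair in criteria.items()]
    if postings:
        refs = sorted(set.intersection(*postings))
    else:
        refs = range(len(events))  # no criteria: every event qualifies
    return [events[ref]["raw"] for ref in refs]
```

Each event here is assumed to be a dict with a `raw` string and a `fields` dict, so `search(events, index, method="GET", status="500")` reads only the records whose references appear under both pairs, without scanning the rest of the store.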