-
Publication No.: US20190258635A1
Publication Date: 2019-08-22
Application No.: US16398044
Filing Date: 2019-04-29
Applicant: Splunk Inc.
Inventor: Sourav Pal, Arindam Bhattacharjee, Asha Andrade
IPC: G06F16/2453, G06F16/242, G06F16/22, G06F16/901, G06F16/2458
Abstract: Systems and methods are described for determining a quantity of records generated by a processing task of a query executed in a data intake and query system. The system receives a query and identifies a processing task of the query and a quantity of records to be processed according to the query. The system determines the number of records generated by the processing task based on the number of records to be processed and a record generation estimate. The system can allocate compute resources or determine a query execution time for at least a portion of the query based on the determined quantity of records generated.
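The arithmetic this abstract describes can be sketched as follows. This is only an illustration, not the patented method: it assumes the record generation estimate is a per-input-record multiplier, and the function names and the `records_per_worker` capacity figure are hypothetical.

```python
import math


def estimate_generated_records(records_to_process: int, generation_estimate: float) -> int:
    """Scale the input record count by a per-record generation estimate (assumed multiplier)."""
    if records_to_process < 0 or generation_estimate < 0:
        raise ValueError("counts and estimates must be non-negative")
    return math.ceil(records_to_process * generation_estimate)


def workers_needed(generated_records: int, records_per_worker: int) -> int:
    """Hypothetical sizing rule: one worker per `records_per_worker` generated records."""
    if records_per_worker <= 0:
        raise ValueError("records_per_worker must be positive")
    return max(1, math.ceil(generated_records / records_per_worker))


generated = estimate_generated_records(1000, 2.5)
workers = workers_needed(generated, records_per_worker=1000)
```

A task that fans each input record out into several output records gets an estimate above 1, while a filtering task gets an estimate below 1.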
-
Publication No.: US11580107B2
Publication Date: 2023-02-14
Application No.: US16398038
Filing Date: 2019-04-29
Applicant: Splunk Inc.
Inventor: Sourav Pal, Arindam Bhattacharjee, Asha Andrade, Nikhil Roy
IPC: G06F16/00, G06F16/2455, G06F9/50, G06F16/22
Abstract: Systems and methods are described for exporting bucket data from one or more buckets to one or more worker nodes. The system can identify bucket data, from buckets stored in a data intake and query system, that is to be processed by one or more worker nodes. The system can allocate one or more execution resources, such as a processing pipeline, to process and export the bucket data from the buckets. The system can assign bucket data corresponding to individual buckets to the execution resource based on a bucket distribution policy. The indexer can export the bucket data to the worker nodes for further processing based on the bucket data-execution resource assignment.
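As a rough illustration of a bucket distribution policy, the sketch below assigns buckets to execution resources in round-robin order. Round-robin is an assumption chosen for simplicity; the abstract does not say which policy is used, and the bucket and pipeline identifiers are hypothetical.

```python
from collections import defaultdict


def assign_buckets_round_robin(bucket_ids, pipeline_ids):
    """Map each bucket to a pipeline in round-robin order (assumed policy)."""
    if not pipeline_ids:
        raise ValueError("at least one pipeline is required")
    assignment = defaultdict(list)
    for index, bucket in enumerate(bucket_ids):
        pipeline = pipeline_ids[index % len(pipeline_ids)]
        assignment[pipeline].append(bucket)
    return dict(assignment)


assignment = assign_buckets_round_robin(["b1", "b2", "b3"], ["p1", "p2"])
```

Other policies, such as balancing by bucket size, would change only the rule that picks `pipeline` for each bucket.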
-
Publication No.: US11442935B2
Publication Date: 2022-09-13
Application No.: US16397930
Filing Date: 2019-04-29
Applicant: Splunk Inc.
Inventor: Sourav Pal, Arindam Bhattacharjee, Asha Andrade
IPC: G06F16/00, G06F16/2453, G06F16/2455, G06F9/50, G06F16/2458
Abstract: Systems and methods are described for determining a record generation estimate related to a particular processing task. The system obtains a sample set of data that includes multiple records. The system applies a processing task, such as a transform or regular expression rule, to the sample set of data and determines how many records are generated by the processing task. Based on the number of records generated, the system determines a record generation estimate. The system can use the record generation estimate to allocate compute resources or determine a query execution time for at least a portion of the query.
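The sampling step can be sketched as below, assuming the estimate is simply the ratio of generated records to sampled records. The `split_on_commas` task and the sample data are hypothetical stand-ins for a real transform or regular expression rule.

```python
import re


def split_on_commas(record: str) -> list:
    """Hypothetical processing task: a regex rule that splits one record into several."""
    return re.split(r"\s*,\s*", record)


def record_generation_estimate(sample_records, task) -> float:
    """Average number of records the task generates per sampled input record."""
    if not sample_records:
        raise ValueError("sample set must not be empty")
    generated = sum(len(task(record)) for record in sample_records)
    return generated / len(sample_records)


sample = ["a,b,c", "d", "e, f"]
estimate = record_generation_estimate(sample, split_on_commas)
```

Here the three sampled records yield six records in total, so the estimate is 2.0 generated records per input record.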
-
Publication No.: US11704313B1
Publication Date: 2023-07-18
Application No.: US17074236
Filing Date: 2020-10-19
Applicant: Splunk Inc.
Inventor: Asha Andrade, Tingting Bao, Vanco Buca, Weichao Duan, Anuradha Pariti, Xiaowei Wang
IPC: G06F16/2453
CPC classification number: G06F16/24535, G06F16/24532, G06F16/24537
Abstract: The disclosed implementations include a method performed by a data intake and query system. The method includes receiving a search query at a search head, the search query including a branching operation between sets of data, generating a first subquery and a second subquery corresponding to the sets of data for execution by a search node, generating instructions for an intermediary node to combine partial results of the first subquery and the second subquery and instructions to concurrently communicate the subqueries to a search node, and executing the query by providing the instructions for the intermediary node to the intermediary node and the subqueries to the search node, the intermediary node receiving sets of partial search results for the subqueries, performing at least a portion of the branching operation on the partial results, and communicating the combined results to another intermediary node or the search head.
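The flow in this abstract can be approximated with the sketch below. It assumes the branching operation is a union of two event sets merged by timestamp, models the search node as a plain callable, and uses threads to send both subqueries concurrently. All names and the event layout are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def intermediary_union(partials_a, partials_b):
    """Intermediary-node step: combine partial results of both subqueries (assumed union)."""
    return sorted(partials_a + partials_b, key=lambda event: event["time"])


def execute_branching_query(search_node, subquery_a, subquery_b):
    """Dispatch both subqueries concurrently, then combine their partial results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        future_a = pool.submit(search_node, subquery_a)
        future_b = pool.submit(search_node, subquery_b)
        return intermediary_union(future_a.result(), future_b.result())


# Hypothetical search node backed by an in-memory dictionary of datasets.
DATASETS = {
    "web": [{"time": 1, "src": "web"}, {"time": 4, "src": "web"}],
    "app": [{"time": 2, "src": "app"}],
}


def fake_search_node(subquery):
    return list(DATASETS[subquery])


combined = execute_branching_query(fake_search_node, "web", "app")
```

In a multi-tier layout, the combined list would be forwarded to another intermediary node or to the search head rather than returned directly.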
-
Publication No.: US20190310977A1
Publication Date: 2019-10-10
Application No.: US16398038
Filing Date: 2019-04-29
Applicant: Splunk Inc.
Inventor: Sourav Pal, Arindam Bhattacharjee, Asha Andrade, Nikhil Roy
IPC: G06F16/2455, G06F16/22, G06F9/50
Abstract: Systems and methods are described for exporting bucket data from one or more buckets to one or more worker nodes. The system can identify bucket data, from buckets stored in a data intake and query system, that is to be processed by one or more worker nodes. The system can allocate one or more execution resources, such as a processing pipeline, to process and export the bucket data from the buckets. The system can assign bucket data corresponding to individual buckets to the execution resource based on a bucket distribution policy. The indexer can export the bucket data to the worker nodes for further processing based on the bucket data-execution resource assignment.
-
Publication No.: US20190258632A1
Publication Date: 2019-08-22
Application No.: US16397930
Filing Date: 2019-04-29
Applicant: Splunk Inc.
Inventor: Sourav Pal, Arindam Bhattacharjee, Asha Andrade
IPC: G06F16/2453, G06F16/2458, G06F9/50, G06F16/2455
Abstract: Systems and methods are described for determining a record generation estimate related to a particular processing task. The system obtains a sample set of data that includes multiple records. The system applies a processing task, such as a transform or regular expression rule, to the sample set of data and determines how many records are generated by the processing task. Based on the number of records generated, the system determines a record generation estimate. The system can use the record generation estimate to allocate compute resources or determine a query execution time for at least a portion of the query.
-
Publication No.: US20180089306A1
Publication Date: 2018-03-29
Application No.: US15665279
Filing Date: 2017-07-31
Applicant: Splunk Inc.
Inventor: Sourav Pal, Arindam Bhattacharjee, Asha Andrade
IPC: G06F17/30
CPC classification number: G06F16/3334, G06F16/24535, G06F16/24554, G06F16/2465, G06F16/3349
Abstract: Systems and methods for a data index and query system that utilize a query acceleration data store. An example method includes receiving a query identifying a set of data to be processed and a manner of processing the set of data. A query processing scheme for obtaining and processing the set of data is defined. First partial results of the query stored in a data store are identified, with the first partial results corresponding to a first portion of the set of data. One or more partitions are dynamically allocated to obtain a second portion of the set of data from different data sources. The second portion of the set of data is processed to obtain second partial results. The first partial results and second partial results are combined. The query is executed based on the query processing scheme.
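The combination of stored and freshly computed partial results can be sketched as follows. The sketch assumes the query is a count grouped by a field, which makes partial results additive; the cached counts, the event layout, and the function names are hypothetical.

```python
from collections import Counter


def count_by_key(events, key):
    """Process a portion of the data set into partial results (counts per key value)."""
    return Counter(event[key] for event in events)


def accelerated_count(cached_partial, uncached_events, key):
    """Combine cached first partial results with second partial results computed now."""
    fresh_partial = count_by_key(uncached_events, key)
    return cached_partial + fresh_partial


# First partial results, as if read from a query acceleration data store.
cached = Counter({"error": 5, "info": 2})
# Second portion of the data set, as if fetched from other data sources.
new_events = [{"level": "error"}, {"level": "warn"}]
total = accelerated_count(cached, new_events, "level")
```

Only aggregations whose partial results can be merged, such as counts and sums, fit this simple addition; other aggregations would need a different combine step.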
-
Publication No.: US11599541B2
Publication Date: 2023-03-07
Application No.: US16398044
Filing Date: 2019-04-29
Applicant: Splunk Inc.
Inventor: Sourav Pal, Arindam Bhattacharjee, Asha Andrade
IPC: G06F16/00, G06F16/2453, G06F16/2458, G06F16/22, G06F16/901, G06F16/242
Abstract: Systems and methods are described for determining a quantity of records generated by a processing task of a query executed in a data intake and query system. The system receives a query and identifies a processing task of the query and a quantity of records to be processed according to the query. The system determines the number of records generated by the processing task based on the number of records to be processed and a record generation estimate. The system can allocate compute resources or determine a query execution time for at least a portion of the query based on the determined quantity of records generated.
-
Publication No.: US11416528B2
Publication Date: 2022-08-16
Application No.: US15665279
Filing Date: 2017-07-31
Applicant: Splunk Inc.
Inventor: Sourav Pal, Arindam Bhattacharjee, Asha Andrade
IPC: G06F16/33, G06F16/2458, G06F16/2453, G06F16/2455
Abstract: Systems and methods for a data index and query system that utilize a query acceleration data store. An example method includes receiving a query identifying a set of data to be processed and a manner of processing the set of data. A query processing scheme for obtaining and processing the set of data is defined. First partial results of the query stored in a data store are identified, with the first partial results corresponding to a first portion of the set of data. One or more partitions are dynamically allocated to obtain a second portion of the set of data from different data sources. The second portion of the set of data is processed to obtain second partial results. The first partial results and second partial results are combined. The query is executed based on the query processing scheme.