Determining Records Generated by a Processing Task of a Query

    Publication No.: US20190258635A1

    Publication date: 2019-08-22

    Application No.: US16398044

    Filing date: 2019-04-29

    Applicant: Splunk Inc.

    Abstract: Systems and methods are described for determining a quantity of records generated by a processing task of a query executed in a data intake and query system. The system receives a query and identifies a processing task of the query and a quantity of records to be processed according to the query. The system determines the number of records generated by the processing task based on the number of records to be processed and a record generation estimate. The system can allocate compute resources or determine a query execution time for at least a portion of the query based on the determined quantity of records generated.
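
    The calculation described in this abstract can be sketched as follows. This is only an illustration, not the patented method: it assumes the record generation estimate is a simple records-out per record-in ratio, and the function names and the `records_per_worker` capacity figure are hypothetical.

```python
import math


def estimate_records_generated(records_to_process: int, generation_estimate: float) -> int:
    """Project how many records a processing task will emit.

    Assumption: generation_estimate is the average number of output
    records per input record (e.g. 2.5 means each input yields 2.5 outputs).
    """
    if records_to_process < 0 or generation_estimate < 0:
        raise ValueError("inputs must be non-negative")
    return math.ceil(records_to_process * generation_estimate)


def workers_needed(records_generated: int, records_per_worker: int) -> int:
    """Hypothetical allocation rule: one worker per records_per_worker records."""
    if records_per_worker <= 0:
        raise ValueError("records_per_worker must be positive")
    # Always allocate at least one worker, even for an empty result.
    return max(1, math.ceil(records_generated / records_per_worker))


projected = estimate_records_generated(1000, 2.5)
workers = workers_needed(projected, records_per_worker=1000)
```

    The same projected count could feed an execution-time estimate instead of a worker count; the abstract names both uses without specifying a formula for either.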

    Bucket data distribution for exporting data to worker nodes

    Publication No.: US11580107B2

    Publication date: 2023-02-14

    Application No.: US16398038

    Filing date: 2019-04-29

    Applicant: Splunk Inc.

    Abstract: Systems and methods are described for exporting bucket data from one or more buckets to one or more worker nodes. The system can identify bucket data, from different buckets stored in a data intake and query system, that is to be processed by one or more worker nodes. The system can allocate one or more execution resources, such as a processing pipeline, to process and export the bucket data from the buckets. The system can assign bucket data corresponding to individual buckets to the execution resources based on a bucket distribution policy. The indexer can export the bucket data to the worker nodes for further processing based on the bucket data-execution resource assignment.
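
    The abstract does not say what the bucket distribution policy is. As one possible illustration, the sketch below assumes a greedy "largest bucket to the least-loaded pipeline" policy; the function name, the bucket-size inputs, and the policy itself are all hypothetical.

```python
import heapq
from typing import Dict, List


def assign_buckets(bucket_sizes: Dict[str, int], num_pipelines: int) -> Dict[int, List[str]]:
    """Assign each bucket to one of num_pipelines execution resources.

    Assumed policy: visit buckets from largest to smallest and give each
    one to the pipeline with the smallest total load so far.
    """
    if num_pipelines <= 0:
        raise ValueError("num_pipelines must be positive")
    # Min-heap of (current load, pipeline id).
    heap = [(0, pid) for pid in range(num_pipelines)]
    assignment: Dict[int, List[str]] = {pid: [] for pid in range(num_pipelines)}
    # Sort by size descending, then by name so the result is deterministic.
    for bucket, size in sorted(bucket_sizes.items(), key=lambda kv: (-kv[1], kv[0])):
        load, pid = heapq.heappop(heap)
        assignment[pid].append(bucket)
        heapq.heappush(heap, (load + size, pid))
    return assignment


plan = assign_buckets({"b1": 50, "b2": 30, "b3": 20, "b4": 10}, num_pipelines=2)
```

    Other policies (round-robin, or grouping buckets by time range) would fit the abstract equally well; the point is only that the assignment is computed before the data is exported to the worker nodes.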

    Determining a record generation estimate of a processing task

    Publication No.: US11442935B2

    Publication date: 2022-09-13

    Application No.: US16397930

    Filing date: 2019-04-29

    Applicant: Splunk Inc.

    Abstract: Systems and methods are described for determining a record generation estimate related to a particular processing task. The system obtains a sample set of data that includes multiple records. The system applies a processing task, such as a transform or regular expression rule, to the sample set of data and determines how many records are generated by the processing task. Based on the number of records generated, the system determines a record generation estimate. The system can use the record generation estimate to allocate compute resources or determine a query execution time for at least a portion of a query.
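
    A minimal sketch of the sampling step described above, assuming the estimate is the ratio of records generated to records sampled. The abstract does not define the estimate's form, so that ratio, the function names, and the semicolon-splitting rule are illustrative assumptions.

```python
import re
from typing import Callable, Iterable, List


def record_generation_estimate(sample: List[str], task: Callable[[str], Iterable[str]]) -> float:
    """Apply task to every sampled record and return outputs per input record."""
    if not sample:
        raise ValueError("sample must contain at least one record")
    generated = sum(len(list(task(record))) for record in sample)
    return generated / len(sample)


def split_on_semicolons(record: str) -> List[str]:
    """Hypothetical regular expression rule: one output record per ';'-separated part."""
    return [part for part in re.split(r";\s*", record) if part]


sample = ["a; b; c", "d", "e; f"]
estimate = record_generation_estimate(sample, split_on_semicolons)
```

    Here three sampled records produce six output records, so the estimate is 2.0 outputs per input. A filtering task would instead yield an estimate below 1.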

    Parallel branch operation using intermediary nodes

    Publication No.: US11704313B1

    Publication date: 2023-07-18

    Application No.: US17074236

    Filing date: 2020-10-19

    Applicant: Splunk Inc.

    CPC classification number: G06F16/24535 G06F16/24532 G06F16/24537

    Abstract: The disclosed implementations include a method performed by a data intake and query system. The method includes receiving a search query at a search head, the search query including a branching operation between sets of data, generating a first subquery and a second subquery corresponding to the sets of data for execution by a search node, generating instructions for an intermediary node to combine partial results of the first subquery and the second subquery and instructions to concurrently communicate the subqueries to a search node, and executing the query by providing the instructions for the intermediary node to the intermediary node and the subqueries to the search node, the intermediary node receiving sets of partial search results for the subqueries, performing at least a portion of the branching operation on the partial results, and communicating the combined results to another intermediary node or the search head.

    BUCKET DATA DISTRIBUTION FOR EXPORTING DATA TO WORKER NODES

    Publication No.: US20190310977A1

    Publication date: 2019-10-10

    Application No.: US16398038

    Filing date: 2019-04-29

    Applicant: Splunk Inc.

    Abstract: Systems and methods are described for exporting bucket data from one or more buckets to one or more worker nodes. The system can identify bucket data, from different buckets stored in a data intake and query system, that is to be processed by one or more worker nodes. The system can allocate one or more execution resources, such as a processing pipeline, to process and export the bucket data from the buckets. The system can assign bucket data corresponding to individual buckets to the execution resources based on a bucket distribution policy. The indexer can export the bucket data to the worker nodes for further processing based on the bucket data-execution resource assignment.

    Determining a Record Generation Estimate of a Processing Task

    Publication No.: US20190258632A1

    Publication date: 2019-08-22

    Application No.: US16397930

    Filing date: 2019-04-29

    Applicant: Splunk Inc.

    Abstract: Systems and methods are described for determining a record generation estimate related to a particular processing task. The system obtains a sample set of data that includes multiple records. The system applies a processing task, such as a transform or regular expression rule, to the sample set of data and determines how many records are generated by the processing task. Based on the number of records generated, the system determines a record generation estimate. The system can use the record generation estimate to allocate compute resources or determine a query execution time for at least a portion of a query.

    QUERY ACCELERATION DATA STORE

    Type: Patent application

    Publication No.: US20180089306A1

    Publication date: 2018-03-29

    Application No.: US15665279

    Filing date: 2017-07-31

    Applicant: Splunk Inc.

    Abstract: Systems and methods for a data index and query system that utilize a query acceleration data store. An example method includes receiving a query identifying a set of data to be processed and a manner of processing the set of data. A query processing scheme for obtaining and processing the set of data is defined. First partial results of the query stored in a data store are identified, with the first partial results corresponding to a first portion of the set of data. One or more partitions are dynamically allocated to obtain a second portion of the set of data from different data sources. The second portion of the set of data is processed to obtain second partial results. The first partial results and second partial results are combined. The query is executed based on the query processing scheme.
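
    The combination of stored and freshly computed partial results can be illustrated with a small sketch. It assumes the data set is partitioned by day, that the query is a per-host event count, and that new partial results are written back to the store; none of these details come from the abstract, and all names are hypothetical.

```python
from collections import Counter
from typing import Callable, Dict, List


def count_by_host(events: List[dict]) -> Counter:
    """The 'manner of processing' assumed here: count events per host."""
    return Counter(event["host"] for event in events)


def accelerated_count(
    days: List[str],
    store: Dict[str, Counter],
    load_day: Callable[[str], List[dict]],
) -> Counter:
    """Combine stored partial results with results computed for uncovered days."""
    total: Counter = Counter()
    for day in days:
        if day not in store:
            # Second portion: fetch from the data source and process it now.
            # Writing the result back to the store is an assumption of this sketch.
            store[day] = count_by_host(load_day(day))
        # First portion (or the just-computed partial result) comes from the store.
        total += store[day]
    return total


store = {"2017-07-30": Counter({"web-1": 2})}
source = {"2017-07-31": [{"host": "web-1"}, {"host": "web-2"}]}
result = accelerated_count(["2017-07-30", "2017-07-31"], store, lambda day: source[day])
```

    Only the day missing from the store is read from the source, which is the saving the abstract attributes to the query acceleration data store.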

    Determining records generated by a processing task of a query

    Publication No.: US11599541B2

    Publication date: 2023-03-07

    Application No.: US16398044

    Filing date: 2019-04-29

    Applicant: Splunk Inc.

    Abstract: Systems and methods are described for determining a quantity of records generated by a processing task of a query executed in a data intake and query system. The system receives a query and identifies a processing task of the query and a quantity of records to be processed according to the query. The system determines the number of records generated by the processing task based on the number of records to be processed and a record generation estimate. The system can allocate compute resources or determine a query execution time for at least a portion of the query based on the determined quantity of records generated.

    Query acceleration data store

    Type: Granted patent

    Publication No.: US11416528B2

    Publication date: 2022-08-16

    Application No.: US15665279

    Filing date: 2017-07-31

    Applicant: Splunk Inc.

    Abstract: Systems and methods for a data index and query system that utilize a query acceleration data store. An example method includes receiving a query identifying a set of data to be processed and a manner of processing the set of data. A query processing scheme for obtaining and processing the set of data is defined. First partial results of the query stored in a data store are identified, with the first partial results corresponding to a first portion of the set of data. One or more partitions are dynamically allocated to obtain a second portion of the set of data from different data sources. The second portion of the set of data is processed to obtain second partial results. The first partial results and second partial results are combined. The query is executed based on the query processing scheme.
