Field extraction rules from clustered data samples

    Publication number: US11216491B2

    Publication date: 2022-01-04

    Application number: US15143563

    Filing date: 2016-04-30

    Applicant: Splunk Inc.

    Abstract: The operation of an automatic data input and query system is controlled by well-defined control data. Certain control data may relate to data schemas and direct operations performed by the system to extract fields from machine data. Automatic methods may determine proper field extraction control information by analyzing a sample of data from a source, breaking the sample data into event segments, classifying the segments into groups based on a measure of similarity, determining an operable extraction rule for each group, and storing the resulting extraction model. Data patterns known by the system can be leveraged to perform the event breaking and field identification for the classifying. Embodiments may provide a user interface to view, interact with, and approve the computer-generated extraction model.
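
The pipeline this abstract describes (break the sample into events, group them by similarity, derive one rule per group, store the model) can be sketched as follows. This is a minimal illustration, not the patented method: the one-event-per-line breaking, the token-shape similarity key, and the names `break_events`, `signature`, `build_rule`, and `build_model` are all assumptions introduced here.

```python
import re
from collections import defaultdict

# Hypothetical tokenizer: digit runs, word runs, and everything else (literals).
TOKEN = re.compile(r"(?P<num>[0-9]+)|(?P<word>[A-Za-z_]+)|(?P<lit>[^0-9A-Za-z_]+)")


def break_events(sample: str) -> list[str]:
    """Event breaking; assumes one event per non-empty line."""
    return [line for line in sample.splitlines() if line.strip()]


def signature(event: str) -> str:
    """Similarity key: numbers become 'N', words become 'W', literals are kept."""
    shape = {"num": "N", "word": "W"}
    return "".join(shape.get(m.lastgroup, m.group()) for m in TOKEN.finditer(event))


def build_rule(event: str) -> re.Pattern:
    """Derive a regex with one capture group per variable token."""
    capture = {"num": r"([0-9]+)", "word": r"([A-Za-z_]+)"}
    parts = [capture.get(m.lastgroup, re.escape(m.group())) for m in TOKEN.finditer(event)]
    return re.compile("^" + "".join(parts) + "$")


def build_model(sample: str) -> dict[str, re.Pattern]:
    """Group events by signature and keep one extraction rule per group."""
    groups = defaultdict(list)
    for event in break_events(sample):
        groups[signature(event)].append(event)
    return {sig: build_rule(members[0]) for sig, members in groups.items()}


sample = "2016-04-30 login user=alice\n2016-05-01 login user=bob\nGET /index 200\n"
model = build_model(sample)
for sig, rule in model.items():
    print(sig, "->", rule.pattern)
```

Grouping by an exact signature is the simplest possible similarity measure; a real system would more plausibly use a distance threshold so that near-identical events still share a group.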

    Extraction rule validation
    Granted invention patent

    Publication number: US11086890B1

    Publication date: 2021-08-10

    Application number: US16264525

    Filing date: 2019-01-31

    Applicant: SPLUNK INC.

    Abstract: Embodiments of the present invention are directed to validating extraction rules. In embodiments, a set of events for which field extraction is desired is obtained. Thereafter, an extraction rule is applied to the set of events to extract fields of the events. The application of the extraction rule can be monitored to determine that the applied extraction rule is invalid. Based on the applied extraction rule being invalid, a new extraction rule can be generated to apply to the set of events.
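
The apply, monitor, and regenerate loop in this abstract might look roughly like the sketch below. The abstract does not say how a rule is judged invalid, so the match-ratio threshold is an assumption, and `is_valid`, `validate_or_regenerate`, and the caller-supplied `generate_rule` callback are hypothetical names.

```python
import re
from typing import Callable


def is_valid(rule: re.Pattern, events: list[str], min_match_ratio: float = 1.0) -> bool:
    """Assumed validity test: the rule must match at least this share of events."""
    if not events:
        return True
    matched = sum(1 for event in events if rule.search(event))
    return matched / len(events) >= min_match_ratio


def validate_or_regenerate(
    rule: re.Pattern,
    events: list[str],
    generate_rule: Callable[[list[str]], re.Pattern],
) -> re.Pattern:
    """Keep the rule if it is still valid; otherwise ask for a new one."""
    if is_valid(rule, events):
        return rule
    return generate_rule(events)


# The source format changed from "status=200" to "status: 200".
events = ["status: 200", "status: 404"]
stale_rule = re.compile(r"status=(\d+)")
new_rule = validate_or_regenerate(
    stale_rule,
    events,
    lambda evts: re.compile(r"status[=:]\s*(\d+)"),  # stand-in rule generator
)
print(new_rule.pattern)
```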

    Field Extraction Rules from Clustered Data Samples

    Publication number: US20170286525A1

    Publication date: 2017-10-05

    Application number: US15143563

    Filing date: 2016-04-30

    Applicant: Splunk Inc.

    CPC classification number: G06F16/287 G06F16/2477

    Abstract: The operation of an automatic data input and query system is controlled by well-defined control data. Certain control data may relate to data schemas and direct operations performed by the system to extract fields from machine data. Automatic methods may determine proper field extraction control information by analyzing a sample of data from a source, breaking the sample data into event segments, classifying the segments into groups based on a measure of similarity, determining an operable extraction rule for each group, and storing the resulting extraction model. Data patterns known by the system can be leveraged to perform the event breaking and field identification for the classifying. Embodiments may provide a user interface to view, interact with, and approve the computer-generated extraction model.

    EXTRACTION RULE GENERATION USING CLUSTERING

    Publication number: US20220083572A1

    Publication date: 2022-03-17

    Application number: US17539143

    Filing date: 2021-11-30

    Applicant: Splunk Inc.

    Abstract: Determining a set of extraction rules includes clustering event segments into at least a first group of event segments, and determining, using first field data in the first group of event segments, a first set of extraction rules for extracting the first field data from each event segment of the first group of event segments. A determination is made that the first set of extraction rules fails to successfully extract all of the first field data. Responsive to the determination, the event segments are re-clustered into at least a second group of event segments and a third group of event segments until a successful set of extraction rules is identified. The successful set of extraction rules is stored in computer memory.
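
One way to picture the re-clustering loop in this abstract: try a coarse grouping first, and fall back to a finer one whenever the derived rules fail on some segment. The sketch below assumes an ordered list of clustering keys (coarse to fine) and a simple token-shape rule builder; `rule_from`, `token_count`, `token_shape`, and `find_rules` are hypothetical names, not taken from the patent.

```python
import re
from collections import defaultdict

# Hypothetical tokenizer: digit runs, word runs, and literal separators.
TOKEN = re.compile(r"(?P<num>[0-9]+)|(?P<word>[A-Za-z_]+)|(?P<lit>[^0-9A-Za-z_]+)")


def rule_from(segment: str) -> re.Pattern:
    """Build a regex that captures each number or word of one segment."""
    capture = {"num": r"([0-9]+)", "word": r"([A-Za-z_]+)"}
    parts = [capture.get(m.lastgroup, re.escape(m.group())) for m in TOKEN.finditer(segment)]
    return re.compile("^" + "".join(parts) + "$")


def token_count(segment: str) -> int:
    """Coarse clustering key: number of whitespace-separated tokens."""
    return len(segment.split())


def token_shape(segment: str) -> str:
    """Fine clustering key: 'N' for numbers, 'W' for words, literals kept."""
    shape = {"num": "N", "word": "W"}
    return "".join(shape.get(m.lastgroup, m.group()) for m in TOKEN.finditer(segment))


def find_rules(segments, keys):
    """Re-cluster with each key in turn until every group's rule matches all members."""
    for key in keys:
        groups = defaultdict(list)
        for segment in segments:
            groups[key(segment)].append(segment)
        clustered = list(groups.values())
        rules = [rule_from(group[0]) for group in clustered]
        if all(rule.match(s) for rule, group in zip(rules, clustered) for s in group):
            return list(zip(rules, clustered))
    return None  # no clustering produced a fully successful rule set


segments = ["alice 200", "bob 404", "GET index"]
result = find_rules(segments, [token_count, token_shape])
for rule, group in result:
    print(rule.pattern, group)
```

Here the coarse key puts all three segments in one group, the rule built from `alice 200` fails on `GET index`, and the finer key then splits the data into two groups whose rules both succeed.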

    Technology add-on control console

    Publication number: US11249710B2

    Publication date: 2022-02-15

    Application number: US15088106

    Filing date: 2016-03-31

    Applicant: Splunk Inc.

    Abstract: The operation of an automatic data input and query system is controlled by well-defined control data. The system exposes user interfaces enabling an administrator to interact with control data to modify the ongoing operation of the system. Certain control data determines the collection and treatment of data from various technology sources. A robust control interface is provided enabling the efficient and reliable adding on of new technology data sources. Once established, control data for a new technology data source may be packaged in a form for archiving or distribution. The system may support the export and import of such packages. Such packages may be created independently of the system.

    Enhanced data extraction via efficient extraction rule matching

    Publication number: US11977523B1

    Publication date: 2024-05-07

    Application number: US16859203

    Filing date: 2020-04-27

    Applicant: SPLUNK INC.

    CPC classification number: G06F16/211

    Abstract: Embodiments of the present invention are directed to facilitating data extraction via efficient extraction rule matching. Generally, an extraction rule can be determined to match an event based on a two-step process. Initially, a determination can be made that a set of fixed substrings associated with the extraction rule matches fixed substrings of the event. Based on the fixed substring match, a determination can be made that a set of fields associated with the extraction rule matches fields of the event. In such a case, the extraction rule can be deemed to match the event and used to extract values from the event.
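
The two-step match in this abstract amounts to a cheap substring prefilter followed by the full field match. The sketch below illustrates that idea under assumptions: the `ExtractionRule` dataclass, its field names, and the `extract` helper are hypothetical, and a named-group regex stands in for whatever field matching the patent actually uses.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExtractionRule:
    fixed_substrings: tuple  # literal text every matching event must contain
    pattern: re.Pattern      # full pattern with one named group per field


def extract(rule: ExtractionRule, event: str) -> Optional[dict]:
    """Return the extracted fields, or None if the rule does not match."""
    # Step 1: cheap prefilter on fixed substrings.
    if not all(sub in event for sub in rule.fixed_substrings):
        return None
    # Step 2: full field match, run only for events that pass step 1.
    match = rule.pattern.search(event)
    return match.groupdict() if match else None


rule = ExtractionRule(
    fixed_substrings=("user=", "status="),
    pattern=re.compile(r"user=(?P<user>\w+) status=(?P<status>\d+)"),
)

print(extract(rule, "user=alice status=200"))
print(extract(rule, "GET /index"))
print(extract(rule, "user= status="))
```

The second event is rejected by the substring check alone, so the regex never runs; the third passes the prefilter but fails the field match. With many rules and many events, most rule and event pairs would be expected to exit at step 1.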
