-
Publication No.: US12182174B1
Publication date: 2024-12-31
Application No.: US18147639
Filing date: 2022-12-28
Applicant: SPLUNK Inc.
Inventor: Francis Beckert , Kristal Curtis , Om Rajyaguru , Abraham Starosta , Poonam Yadav
IPC: G06F16/24 , G06F16/248 , G06F16/28 , G06F16/957
Abstract: A search assistant engine is described that integrates with a data intake and query system and provides an intuitive user interface to assist a user in searching and evaluating indexed event data. Additionally, the search assistant engine provides logic to intelligently provide data to the user through the user interface, such as determining fields of events likely to be of interest based on determining a mutual information score for each field and determining groups of related fields based on determining a mutual information score for each field grouping. Some implementations utilize machine learning techniques in certain analyses, such as when clustering events and determining an event template for each cluster. Additionally, the search assistant engine may import terms or characters from user interaction into predetermined search query templates to generate a tailored search query for the user.
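The field-scoring idea in this abstract can be sketched as follows. The abstract does not say what each field's mutual information is measured against, so this sketch assumes a hypothetical target field named `outcome`; the function names and the sample events are likewise illustrative, not taken from the patent.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length sequences of discrete values."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log2(p(x, y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi

def rank_fields(events, target):
    """Score every field except `target` by its MI with `target`, highest first."""
    fields = sorted({k for e in events for k in e if k != target})
    ys = [e.get(target) for e in events]
    scores = {f: mutual_information([e.get(f) for e in events], ys) for f in fields}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical events: `status` determines `outcome`, `host` is unrelated to it.
events = [
    {"status": "500", "host": "a", "outcome": "error"},
    {"status": "500", "host": "b", "outcome": "error"},
    {"status": "200", "host": "a", "outcome": "ok"},
    {"status": "200", "host": "b", "outcome": "ok"},
]
ranked = rank_fields(events, "outcome")
```

Here `status` scores 1 bit and `host` scores 0, so `status` would be surfaced as the field more likely to be of interest.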
-
Publication No.: US12050507B1
Publication date: 2024-07-30
Application No.: US17582995
Filing date: 2022-01-24
Applicant: Splunk, Inc.
Inventor: Abraham Starosta , Francis Beckert , Chandrima Sarkar
IPC: G06F11/07 , G06F16/2455 , G06F16/2458
CPC classification number: G06F11/0781 , G06F16/24561 , G06F16/2471
Abstract: A computerized method is disclosed for automated handling of data ingestion anomalies. The method features training a data model based on a first volume of data associated with a first time period. Thereafter, using the data model, a predictive analysis is conducted on a second volume of data associated with a second time period subsequent to the first time period to produce a predicted data ingestion volume. Afterward, a correlative analysis between the predicted data ingestion volume and an actual data ingestion volume during the second time period is conducted to produce a prediction error. A notification is generated based on the prediction error.
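A minimal sketch of the train, predict, compare, and notify sequence described in this abstract. The "data model" here is only a per-interval average, and the comparison is a relative difference of totals standing in for the abstract's correlative analysis; both are simplifying assumptions, as are the function names, the 25% tolerance, and the sample volumes.

```python
from statistics import mean

def train_model(history):
    # Stand-in "data model": the average per-interval volume in the first time period.
    return {"expected_volume": mean(history)}

def predict_volume(model, num_intervals):
    # Predicted ingestion volume for each interval of the second time period.
    return [model["expected_volume"]] * num_intervals

def prediction_error(predicted, actual):
    # Relative difference between predicted and actual total volume.
    p, a = sum(predicted), sum(actual)
    return abs(a - p) / p

def maybe_notify(error, tolerance=0.25):
    # Hypothetical rule: notify only when the error exceeds the tolerance.
    if error > tolerance:
        return f"Ingestion anomaly: volume deviates {error:.0%} from prediction"
    return None

first_period = [100, 110, 90, 100]   # volumes observed during the first time period
second_period = [100, 40, 30, 30]    # volumes observed during the second time period

model = train_model(first_period)
predicted = predict_volume(model, len(second_period))
error = prediction_error(predicted, second_period)
message = maybe_notify(error)
```

With these numbers the predicted total is 400 against an actual total of 200, giving an error of 0.5 and a notification.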
-
Publication No.: US12216527B1
Publication date: 2025-02-04
Application No.: US17583056
Filing date: 2022-01-24
Applicant: Splunk, Inc.
Inventor: Abraham Starosta , Francis Beckert , Chandrima Sarkar
Abstract: A computerized method is disclosed for automated handling of data ingestion anomalies. The method features operations of detecting a data ingestion anomaly and determining a cause for the data ingestion anomaly. The causal determination may be conducted by at least (i) determining features of an anomalous data ingestion volume, (ii) training a second data model, after a first data model being used to detect the data ingestion anomaly, with data sets consistent with the determined features, (iii) applying the second data model to predict whether a data ingestion sub-volume is anomalous, (iv) obtaining system state information during ingestion of the anomalous data ingestion sub-volume, and (v) determining the cause of the anomalous data ingestion volume based on the system state information.
-
Publication No.: US12111874B1
Publication date: 2024-10-08
Application No.: US18147641
Filing date: 2022-12-28
Applicant: SPLUNK Inc.
Inventor: Francis Beckert , Kristal Curtis , Om Rajyaguru , Abraham Starosta , Poonam Yadav
IPC: G06F16/9535 , G06F16/2457 , G06F16/248
CPC classification number: G06F16/9535 , G06F16/24578 , G06F16/248
Abstract: Implementations of this disclosure provide a search assistant engine that integrates with a data intake and query system and provides an intuitive user interface to assist a user in searching and evaluating indexed event data. Additionally, the search assistant engine provides logic to intelligently provide data to the user through the user interface, such as determining fields of events likely to be of interest based on determining a mutual information score for each field and determining groups of related fields based on determining a mutual information score for each field grouping. Some implementations utilize machine learning techniques in certain analyses, such as when clustering events and determining an event template for each cluster. Additionally, the search assistant engine may import terms or characters from user interaction into predetermined search query templates to generate a tailored search query for the user.
-
Publication No.: US12158880B1
Publication date: 2024-12-03
Application No.: US17978153
Filing date: 2022-10-31
Applicant: SPLUNK, INC.
Inventor: Kristal Curtis , William Deaderick , Tanner Gilligan , Joseph Ross , Abraham Starosta , Sichen Zhong
IPC: G06F16/22 , G06F16/242 , G06F16/2458 , G06F16/28
Abstract: Implementations of this disclosure provide an anomaly detection system and methods of performing anomaly detection on a time-series dataset. The anomaly detection may include utilization of a forecasting machine learning algorithm to obtain a prediction of points of the dataset and comparing the predicted value of a point in the dataset with the actual value to determine an error value associated with that point. Additionally, the anomaly detection may include determination of a sensitivity threshold that impacts whether points within the dataset associated with certain error values are flagged as anomalies. The forecasting machine learning algorithm may implement a seasonality component determination process that accounts for seasonality or patterns in the dataset. A search query statement may be automatically generated through importing the sensitivity threshold into a predetermined search query statement that implements that forecasting machine learning algorithm.
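The interplay of forecast, per-point error, and sensitivity threshold in this abstract can be illustrated with a small sketch. The forecaster below is a simple seasonal mean (each point is predicted from earlier points at the same phase of the cycle), which is an assumed stand-in for the patent's forecasting machine learning algorithm; the threshold rule and all names are also assumptions.

```python
from statistics import mean

def seasonal_forecast(series, period):
    """Predict each point as the mean of earlier points at the same seasonal phase."""
    preds = []
    for i in range(len(series)):
        prior = series[i % period:i:period]   # earlier values sharing this phase
        preds.append(mean(prior) if prior else None)
    return preds

def detect_anomalies(series, period, sensitivity):
    """Flag indices whose forecast error exceeds sensitivity * mean error."""
    preds = seasonal_forecast(series, period)
    errors = {i: abs(series[i] - p) for i, p in enumerate(preds) if p is not None}
    threshold = sensitivity * mean(list(errors.values()))
    return [i for i, e in errors.items() if e > threshold]

# Hypothetical series with a period of 3 and one spike at index 8.
series = [10, 20, 30, 11, 19, 30, 10, 21, 90, 10, 20, 30]

strict = detect_anomalies(series, period=3, sensitivity=3.0)   # flags only the spike
loose = detect_anomalies(series, period=3, sensitivity=1.0)    # also flags index 11
```

Lowering the sensitivity value flags index 11 as well, because the spike at index 8 distorts the later forecast for that phase. This is the sense in which the threshold decides which error values count as anomalies.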
-
Publication No.: US12008046B1
Publication date: 2024-06-11
Application No.: US17837931
Filing date: 2022-06-10
Applicant: Splunk, Inc.
Inventor: Kristal Curtis , William Deaderick , Abraham Starosta
IPC: G06F16/903 , H04L41/069
CPC classification number: G06F16/90335 , H04L41/069
Abstract: A computerized method is disclosed that includes operations of obtaining a data set, selecting candidate parameter pairs to be analyzed, wherein the candidate parameter pairs include a window length and a sensitivity multiplier, and wherein the window length is a number of data points, performing an anomaly detection process for each candidate parameter pair including importing each candidate parameter pair into a predetermined search query thereby generating a set of populated predetermined search queries, wherein the predetermined search query is configured to perform the anomaly detection process, executing each search query of the set of populated predetermined search queries on the data set to obtain a set of anomaly detection results, and scoring each anomaly detection result by applying a set of heuristics to the set of the anomaly detection results, and generating an auto-tuned search query by selecting a first candidate parameter pair based on a score of each of the set of anomaly detection results and importing the first candidate parameter pair into the predetermined search query.
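The auto-tuning loop in this abstract (populate a query template with each candidate pair, execute, score, keep the best) might look roughly like the sketch below. The template string is a hypothetical placeholder and not real query-language syntax; the executor, the median-based detection rule, and the scoring heuristic (prefer results that flag close to 5% of points) are likewise assumptions made for illustration.

```python
from itertools import product
from statistics import median

# Hypothetical predetermined query with slots for the two tuned parameters.
QUERY_TEMPLATE = "detect_anomalies window={window} multiplier={multiplier}"

def run_query(query, data):
    """Stand-in executor: parse the populated template and flag outliers."""
    params = dict(tok.split("=") for tok in query.split()[1:])
    window, mult = int(params["window"]), float(params["multiplier"])
    flagged = []
    for i in range(window, len(data)):
        recent = data[i - window:i]
        med = median(recent)
        mad = median(abs(v - med) for v in recent) or 1.0  # avoid a zero spread
        if abs(data[i] - med) > mult * mad:
            flagged.append(i)
    return flagged

def score(flagged, n, target_rate=0.05):
    """Assumed heuristic: reject empty results, prefer a flag rate near target_rate."""
    if not flagged:
        return float("-inf")
    return -abs(len(flagged) / n - target_rate)

def auto_tune(data, windows, multipliers):
    """Return the template populated with the best-scoring (window, multiplier) pair."""
    def pair_score(pair):
        query = QUERY_TEMPLATE.format(window=pair[0], multiplier=pair[1])
        return score(run_query(query, data), len(data))
    best = max(product(windows, multipliers), key=pair_score)
    return QUERY_TEMPLATE.format(window=best[0], multiplier=best[1])

data = [10, 11, 10, 12, 11, 10, 50, 11, 10, 12,
        11, 10, 11, 12, 10, 11, 10, 12, 11, 10]
tuned = auto_tune(data, windows=[3, 5], multipliers=[2.0, 5.0])
```

On this sample the pair (3, 2.0) flags exactly one of the 20 points, which matches the assumed 5% target, so it is the pair imported into the final query.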
-
Publication No.: US20250021767A1
Publication date: 2025-01-16
Application No.: US18228654
Filing date: 2023-07-31
Applicant: Splunk Inc.
Inventor: Vedant Dharnidharka , Robert Riachi , Abraham Starosta , Alexander Sasha Stojanovic , Julien Didier Jean Veron Vialard , Rong Tan Wang , Poonam Yadav , Om Rajyaguru
IPC: G06F40/40 , G06F16/9032 , G06F40/211 , G06F40/30
Abstract: Implementations of this disclosure provide a machine learning model training system that receives user input being a natural language description of a search query, and packages and transmits the natural language description as a prompt to a plurality of large learning models (LLMs). The model training system also receives responses from the plurality of LLMs being translations of the natural language descriptions to an executable search query and displays the translations to a user via a graphical user interface. The model training system receives user feedback via the graphical user interface that corresponds to indications as to whether each translation is correct, syntactically and/or semantically, and, in some examples, an indication of which response was preferred. The model training system also generates training data from the user input, translations generated by the plurality of LLMs, and user feedback, and subsequently initiates training of an LLM using the training data.
-
Publication No.: US12056169B1
Publication date: 2024-08-06
Application No.: US17513670
Filing date: 2021-10-28
Applicant: SPLUNK Inc.
Inventor: Abhinav Mishra , Giovanni Mola , Ram Sriharsha , Abraham Starosta , Zhaohui Wang
CPC classification number: G06F16/334 , G06F16/35 , G06N20/00
Abstract: A computerized method is disclosed that includes operations of training a machine learning model using a labeled training set of data, wherein the machine learning model is configured to classify domain name server (DNS) records, obtaining DNS record data including at least a first DNS Txt record, applying the trained machine learning model to the first DNS Txt record to classify the first DNS Txt record and responsive to the classification of the first DNS Txt record, generating a flag for a system administrator. The trained machine learning model may classify the first DNS Txt record using logistic regression. In some instances, applying the trained machine learning model to the first DNS Txt record includes performing a tokenizing operation on the first DNS Txt record to generate a tokenized first DNS Txt record.
-
Publication No.: US11663176B2
Publication date: 2023-05-30
Application No.: US16945229
Filing date: 2020-07-31
Applicant: Splunk Inc.
Inventor: Ram Sriharsha , Zhaohui Wang , Kristal Curtis , Abraham Starosta
CPC classification number: G06F16/213 , G06F16/252 , G06F16/258 , G06K9/6231 , G06K9/6257 , G06N3/08
Abstract: Systems and methods are described for training an artificial intelligence model to extract one or more data fields from a log. For example, the artificial intelligence model may be a neural network. The neural network may be trained using training data obtained by iterating through a plurality of logs using active learning, and selecting a subset of the logs in the plurality to be labeled by a user. For example, the selected subset of logs may be logs that are not similar to other logs already labeled by a user. The user may be prompted to label the selected subset of logs to identify one or more data fields to extract. Once the selected subset of logs are labeled, these labeled logs can be used as the training data to train the neural network.
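The selection step in this abstract, choosing logs that are unlike those already labeled, can be sketched as below. Token-set Jaccard similarity is an assumed similarity measure, the function names and sample logs are hypothetical, and the neural network training that would consume the labeled logs is deliberately left out.

```python
def tokens(log):
    """Lowercased whitespace tokens of a log line, as a set."""
    return set(log.lower().split())

def jaccard(a, b):
    """Jaccard similarity of two token sets (1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def select_for_labeling(unlabeled, labeled, k=2):
    """Pick up to k logs least similar to anything already labeled or already picked."""
    covered = [tokens(log) for log in labeled]
    candidates = list(unlabeled)
    picks = []
    while candidates and len(picks) < k:
        # The most novel candidate has the lowest maximum similarity to covered logs.
        best = min(
            candidates,
            key=lambda log: max((jaccard(tokens(log), c) for c in covered), default=0.0),
        )
        picks.append(best)
        covered.append(tokens(best))
        candidates.remove(best)
    return picks

labeled = ["GET /index.html 200"]
unlabeled = [
    "GET /about.html 200",
    "ERROR disk full on /dev/sda1",
    "GET /index.html 404",
]
to_label = select_for_labeling(unlabeled, labeled, k=1)
```

The error line shares no tokens with the labeled request log, so it is the one offered to the user for labeling.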