-
Publication No.: US11907227B1
Publication date: 2024-02-20
Application No.: US17591511
Application date: 2022-02-02
Applicant: Splunk Inc.
Inventor: Zhaohui Wang , Ryan Gannon , Xiao Lin , Abhinav Mishra , Chandrima Sarkar , Ram Sriharsha
IPC: G06F16/00 , G06F16/2455 , G06F16/22 , G06F16/2458
CPC classification number: G06F16/24568 , G06F16/22 , G06F16/2462 , G06F16/24552
Abstract: A computerized method is disclosed that includes receiving a data stream and performing changepoint detection that results in a detection of changepoints in the data stream. The changepoint detection includes: maintaining, in a buffer of size L, a listing of starting indices for each run within the data stream, wherein each index of the listing has a run length probability representing a likelihood of being a changepoint; receiving a new data point within the data stream and adding a new index to the buffer, resulting in the buffer having size L+1; calculating a posterior run length probability that the new data point is a changepoint; removing the index from the listing that has the lowest run length probability, thereby returning the buffer to size L; and, responsive to determining that the removed index does not correspond to the new data point, identifying a changepoint associated with the new data point.
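The buffer bookkeeping described in this abstract can be sketched as follows. This is an illustrative sketch only, not the patented implementation: the class name `RunStartBuffer`, the `score` callable and its signature, and the choice to skip eviction while the buffer is still filling are all assumptions. The abstract does not specify the probability model, so it is passed in as a placeholder.

```python
from typing import Callable, Dict

# Hypothetical scorer signature: (run_start_index, new_index, new_value) -> unnormalized probability.
Scorer = Callable[[int, int, float], float]


class RunStartBuffer:
    """Fixed-size listing of candidate run-start indices (illustrative only)."""

    def __init__(self, capacity: int, score: Scorer) -> None:
        self.capacity = capacity          # the buffer size L from the abstract
        self.score = score                # placeholder probability model (assumption)
        self.probs: Dict[int, float] = {}  # run-start index -> run length probability

    def update(self, index: int, value: float) -> bool:
        """Process one data point; return True if a changepoint is flagged at `index`."""
        # Add the new index, so the buffer may temporarily hold L + 1 entries.
        self.probs[index] = 0.0

        # Recompute and normalize the posterior probability of every candidate.
        raw = {start: self.score(start, index, value) for start in self.probs}
        total = sum(raw.values()) or 1.0
        self.probs = {start: p / total for start, p in raw.items()}

        # Assumption: nothing is evicted (and nothing flagged) until the buffer is full.
        if len(self.probs) <= self.capacity:
            return False

        # Evict the least probable candidate, returning the buffer to size L.
        evicted = min(self.probs, key=self.probs.get)
        del self.probs[evicted]

        # If the new index survived the eviction, treat the new point as a changepoint.
        return evicted != index
```

With a toy scorer that favors the newest index only when the value jumps above 5, the stream `[1, 1, 1, 9, 1]` and `capacity=2` flag only the fourth point. A real scorer would come from a run length model, which the abstract leaves unspecified.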
-
Publication No.: US11995052B1
Publication date: 2024-05-28
Application No.: US17591528
Application date: 2022-02-02
Applicant: Splunk Inc.
Inventor: Zhaohui Wang , Ryan Gannon , Xiao Lin , Chandrima Sarkar
IPC: G06F16/215
CPC classification number: G06F16/215
Abstract: A computerized method for detecting categorical drift within an incoming data stream is disclosed. An error threshold is computed based on a first set of training data samples selected to detect categorical drift occurring in a data stream. Thereafter, probability distributions associated with content of first and second data samples of the data stream are computed. Analytics are conducted to compute a difference between content of the first probability distribution, which is based on a first data point of the first data sample, and content of the second probability distribution, which is based on a first data point of the second data sample. After computing the difference, a determination is made as to whether categorical drift has been detected.
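The flow in this abstract (threshold from training samples, per-sample probability distributions, a difference, then a decision) can be sketched as below. This is a minimal sketch under stated assumptions, not the patented method: total variation distance as the difference measure, the largest distance between consecutive training samples as the error threshold, and all function names are choices made here for illustration. The abstract specifies none of them.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence


def distribution(sample: Sequence[Hashable]) -> Dict[Hashable, float]:
    """Empirical probability of each category in one data sample."""
    if not sample:
        raise ValueError("sample must not be empty")
    counts = Counter(sample)
    return {category: n / len(sample) for category, n in counts.items()}


def total_variation(p: Dict[Hashable, float], q: Dict[Hashable, float]) -> float:
    """Assumed difference measure: half the summed absolute probability gaps."""
    categories = set(p) | set(q)
    return 0.5 * sum(abs(p.get(c, 0.0) - q.get(c, 0.0)) for c in categories)


def error_threshold(training_samples: Sequence[Sequence[Hashable]]) -> float:
    """Assumed rule: largest distance seen between consecutive training samples."""
    if len(training_samples) < 2:
        raise ValueError("need at least two training samples")
    dists = [distribution(s) for s in training_samples]
    return max(total_variation(a, b) for a, b in zip(dists, dists[1:]))


def has_drift(first: Sequence[Hashable], second: Sequence[Hashable], threshold: float) -> bool:
    """Flag drift when the two samples differ by more than the threshold."""
    return total_variation(distribution(first), distribution(second)) > threshold
```

For example, a threshold learned from samples that only reshuffle categories `a` and `b` would flag a later sample consisting entirely of an unseen category `c`, while leaving a reordered sample with the same category proportions unflagged.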
-