System and method for changepoint detection in streaming data

    公开(公告)号:US11907227B1

    公开(公告)日:2024-02-20

    申请号:US17591511

    申请日:2022-02-02

    Applicant: Splunk, Inc.

    CPC classification number: G06F16/24568 G06F16/22 G06F16/2462 G06F16/24552

    Abstract: A computerized method is disclosed including operations of receiving a data stream, performing a changepoint detection resulting in a detection of changepoints in the data stream including: maintaining a listing of starting indices for each run within the data stream in a buffer of size L wherein each index of the listing has a run length probability representing a likelihood of being a changepoint, receiving a new data point within the data stream and adding a new index to the buffer resulting in the buffer having size L+1, calculating a posterior run length probability that the new data point is a changepoint, and removing an index from the listing that has a lowest run length probability thereby returning the buffer to size L, and responsive to determining the index removed from the listing does not correspond to the new data point, identifying a changepoint associated with the new data point.

    System and method for categorical drift detection

    公开(公告)号:US11995052B1

    公开(公告)日:2024-05-28

    申请号:US17591528

    申请日:2022-02-02

    Applicant: Splunk Inc.

    CPC classification number: G06F16/215

    Abstract: A computerized method for detection of categorical drift within an incoming data stream. Herein, an error threshold is computed based on a first set of training data samples selected to detect categorical drift occurring for a data stream. Thereafter, probability distributions associated with content of a first and second data samples of the data stream are computed. Analytics are conducted to compute a difference between content of the first probability distribution that is based on a first data point of the first data sample and content of the second probability distribution that is based on a first data point of the second data sample. After computing the difference, that categorical drift is determined whether categorical drift detection has been conducted.

Patent Agency Ranking