System and Method for Machine Learning Driven Automated Incident Prevention for Distributed Systems

    Publication No.: US20240121254A1

    Publication date: 2024-04-11

    Application No.: US17960204

    Filing date: 2022-10-05

    Applicant: Xiaohui Gu

    Inventor: Xiaohui Gu

    IPC classification: H04L9/40 G06N5/022

    Abstract: An unsupervised pattern extraction system and method for extracting incident and root cause patterns from various kinds of machine data, such as system-level metric values, system call traces, and semi-structured or free-form text log data, and for performing holistic root cause analysis for distributed systems. The system utilizes Natural Language Processing and machine learning techniques to extract incident and root cause information from received incident reports and other system data. The system consists of both real-time data collection (104) and analytics functions (200). Previously reported incident data is used to discover remediation techniques, so that prior remediation efforts can be reused to automatically classify and correct incidents. The system may then annotate a remediation data file with the technique applied. The system uses known remediation techniques for identified categories to predict and prevent future issues.

    System and Method for Online Unsupervised Event Pattern Extraction and Holistic Root Cause Analysis for Distributed Systems

    Publication No.: US20190324831A1

    Publication date: 2019-10-24

    Application No.: US15937362

    Filing date: 2018-03-27

    Applicant: Xiaohui Gu

    Inventor: Xiaohui Gu

    IPC classification: G06F11/07 G06N99/00

    Abstract: An unsupervised pattern extraction system and method for extracting patterns of interest to users from various kinds of data, such as system-level metric values, system call traces, and semi-structured or free-form text log data, and for performing holistic root cause analysis for distributed systems. The distributed system includes a plurality of computer machines or smart devices. The system consists of both real-time data collection and analytics functions. The analytics functions automatically extract event patterns and recognize recurrent events in real time by analyzing collected data streams from different sources. A root cause analysis component analyzes the extracted events and identifies both correlation and causality relationships among different components to pinpoint the root cause of a networked-system anomaly. Furthermore, an anomaly impact prediction component estimates the impact scope of the detected anomaly and raises early alarms about impending service outages or application performance degradations based on the identified correlation and causality relationships.
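
    The recurrent-event recognition described in this abstract can be illustrated with a minimal sketch. It is not the patented method: it assumes a simple stand-in technique in which variable-looking tokens (numbers, hex ids) in log lines are masked so that lines sharing a template are counted together. The names `to_template` and `recurrent_patterns`, the masking rule, and the `min_count` threshold are all hypothetical.

```python
import re
from collections import Counter

# Assumption: numbers, dotted numbers (e.g. IPs), and hex ids are variable fields.
_VARIABLE = re.compile(r"\b(?:0x[0-9a-fA-F]+|\d+(?:\.\d+)*)\b")


def to_template(line: str) -> str:
    """Replace variable-looking tokens with a placeholder."""
    return _VARIABLE.sub("<*>", line.strip())


def recurrent_patterns(lines, min_count=2):
    """Return templates seen at least min_count times, with their counts."""
    counts = Counter(to_template(line) for line in lines)
    return {tpl: n for tpl, n in counts.items() if n >= min_count}


if __name__ == "__main__":
    logs = [
        "conn 12 timed out after 30 ms",
        "conn 7 timed out after 45 ms",
        "disk full on node 3",
    ]
    print(recurrent_patterns(logs))
```

    Masking before counting is what lets two lines with different connection ids count as the same event; the root cause and impact prediction components are not modeled here.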

    Systems and methods for optimal component composition in a stream processing system
    3. Granted patent (status: in force)

    Publication No.: US08286153B2

    Publication date: 2012-10-09

    Application No.: US12061284

    Filing date: 2008-04-02

    IPC classification: G06F9/45

    CPC classification: H04L12/4641

    Abstract: A system and method are provided for optimizing component composition in a distributed stream-processing environment having a plurality of nodes capable of being associated with one or more of a plurality of stream processing components. The system includes an adaptive composition probing (ACP) module and a hierarchical state manager. The ACP module probes a subset of the plurality of stream processing components to determine the optimal component composition in response to a stream processing request. The hierarchical state manager manages local and global information for use by said ACP module in determining the optimal component composition.
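
    The idea of probing only a subset of candidate components can be sketched as follows. This is a simplified illustration under stated assumptions, not the ACP module itself: a composition is taken to be one component per processing function, `probe` is a caller-supplied function returning a measured cost (lower is better), and every function has at least one candidate. The names `compose_by_probing`, `candidates`, and `probes_per_function` are hypothetical, and the hierarchical state manager is not modeled.

```python
import random


def compose_by_probing(candidates, probe, probes_per_function=2, seed=None):
    """Pick one component per function by probing a random subset.

    candidates: dict mapping function name -> non-empty list of component ids.
    probe: callable(component_id) -> measured cost, lower is better.
    """
    rng = random.Random(seed)
    composition = {}
    for function, components in candidates.items():
        # Probe at most probes_per_function candidates instead of all of them.
        k = min(probes_per_function, len(components))
        sampled = rng.sample(components, k)
        composition[function] = min(sampled, key=probe)
    return composition


if __name__ == "__main__":
    cost = {"f1": 5, "f2": 3, "j1": 1}
    plan = compose_by_probing({"filter": ["f1", "f2"], "join": ["j1"]}, cost.get)
    print(plan)
```

    Limiting the number of probes trades optimality for lower probing overhead: with fewer probes than candidates, the selected composition is the best among those sampled, not necessarily the global best.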

    Method and system for indexing and serializing data
    4. Granted patent (status: lapsed)

    Publication No.: US07752192B2

    Publication date: 2010-07-06

    Application No.: US11681486

    Filing date: 2007-03-02

    IPC classification: G06F17/30

    CPC classification: G06F17/30911

    Abstract: The present invention provides a computer implemented method, an apparatus, and a computer usable program product for indexing data. A controller identifies a set of data to be indexed, wherein a set of data structure trees represents the set of data. The controller merges the set of data structure trees to form a unified tree, wherein the unified tree contains a node for each unit of data in the set of data. The controller assigns an identifier to the node for each unit of data in the set of data that describes the node within the unified tree. The controller then serializes the unified tree to form a set of sequential series that represents the set of data structure trees, wherein the set of sequential series forms an index for the set of data.
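
    The merge, identify, and serialize steps in this abstract can be sketched as below. The abstract does not specify the tree representation, the merge rule, or the identifier scheme, so this sketch assumes nested dictionaries, unifies nodes that share a label under the same parent, and uses preorder numbers as identifiers. The names `merge_trees` and `serialize` are hypothetical.

```python
def merge_trees(trees):
    """Merge nested-dict trees into one unified tree.

    Assumption: nodes with the same label under the same parent are one node.
    """
    unified = {}
    for tree in trees:
        _merge_into(unified, tree)
    return unified


def _merge_into(target, source):
    for label, children in source.items():
        _merge_into(target.setdefault(label, {}), children)


def serialize(unified):
    """Preorder walk returning (node_id, parent_id, label) records.

    Assumption: ids are preorder numbers starting at 1; parent_id 0 means root level.
    """
    records = []

    def walk(node, parent_id):
        for label in sorted(node):
            node_id = len(records) + 1
            records.append((node_id, parent_id, label))
            walk(node[label], node_id)

    walk(unified, 0)
    return records


if __name__ == "__main__":
    t1 = {"book": {"title": {}, "author": {}}}
    t2 = {"book": {"author": {"name": {}}, "year": {}}}
    for record in serialize(merge_trees([t1, t2])):
        print(record)
```

    Because each record carries its parent's identifier, the flat sequence is enough to rebuild the unified tree, which is what lets the sequence serve as an index.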

    Systems and methods for optimal component composition in a stream processing system
    5. Granted patent (status: lapsed)

    Publication No.: US07562355B2

    Publication date: 2009-07-14

    Application No.: US11068785

    Filing date: 2005-03-01

    IPC classification: G06F9/45

    CPC classification: H04L12/4641

    Abstract: A system and method are provided for optimizing component composition in a distributed stream-processing environment having a plurality of nodes capable of being associated with one or more of a plurality of stream processing components. The system includes an adaptive composition probing (ACP) module and a hierarchical state manager. The ACP module probes a subset of the plurality of stream processing components to determine the optimal component composition in response to a stream processing request. The hierarchical state manager manages local and global information for use by said ACP module in determining the optimal component composition.

    METHOD AND APPARATUS FOR PROVIDING LOAD DIFFUSION IN DATA STREAM CORRELATIONS
    6. Patent application (status: in force)

    Publication No.: US20080168179A1

    Publication date: 2008-07-10

    Application No.: US12054207

    Filing date: 2008-03-24

    IPC classification: G06F15/16

    Abstract: A computer implemented method, apparatus, and computer usable program code for performing load diffusion to process data stream pairs. A data stream pair is received for correlation. The data stream pair is partitioned into portions to meet correlation constraints for correlating data in the data stream pair to form a partitioned data stream pair. The partitioned data stream pair is sent to a set of nodes for correlation processing to perform the load diffusion.
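
    One simple way to partition a stream pair so that correlation stays correct is sketched below. It is an assumption for illustration, not the partitioning scheme claimed in the application: tuples are keyed, correlation is assumed to be an equality match on the key, and both streams are hash-partitioned with the same function so that matching tuples always land on the same node. The names `node_for` and `diffuse` are hypothetical.

```python
import zlib
from collections import defaultdict


def node_for(key: str, num_nodes: int) -> int:
    """Stable hash of the correlation key onto a node index."""
    return zlib.crc32(key.encode("utf-8")) % num_nodes


def diffuse(stream_a, stream_b, num_nodes):
    """Partition two streams of (key, value) tuples across nodes.

    Returns a dict: node index -> {"a": [...], "b": [...]}.
    Tuples with equal keys from either stream go to the same node.
    """
    plan = defaultdict(lambda: {"a": [], "b": []})
    for side, stream in (("a", stream_a), ("b", stream_b)):
        for key, value in stream:
            plan[node_for(key, num_nodes)][side].append((key, value))
    return dict(plan)


if __name__ == "__main__":
    a = [("x", 1), ("y", 2)]
    b = [("x", 10), ("y", 20)]
    for node, portions in sorted(diffuse(a, b, num_nodes=3).items()):
        print(node, portions)
```

    Using the same hash for both streams is what satisfies the correlation constraint in this sketch: no pair of matching tuples is split across nodes, so each node can correlate its portions independently.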

    Model-based self-optimizing distributed information management
    7. Granted patent (status: in force)

    Publication No.: US07720841B2

    Publication date: 2010-05-18

    Application No.: US11538525

    Filing date: 2006-10-04

    IPC classification: G06F13/14

    Abstract: Disclosed are a method, information processing system, and computer readable medium for managing data collection in a distributed processing system. The method includes dynamically collecting at least one statistical query pattern associated with a selected group of information processing nodes. The statistical query pattern is dynamically collected from a plurality of information processing nodes in a distributed processing system. At least one operating attribute distribution associated with an operating attribute that has been queried for the selected group is dynamically monitored. The selected group is dynamically configured, based on the query pattern and the operating attribute distribution, to periodically push a set of attributes associated with each information processing node in the selected group.
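
    The decision of which attributes a node group should push periodically can be illustrated with a hypothetical rule. The abstract does not give the cost model, so this sketch assumes one: an attribute is pushed when it is queried at least `min_ratio` times as often as it changes, with the per-period change rate standing in for the monitored attribute distribution. The name `attributes_to_push` and both input dictionaries are assumptions.

```python
def attributes_to_push(query_rate, change_rate, min_ratio=1.0):
    """Select attributes worth pushing periodically.

    query_rate: dict attribute -> queries per monitoring period.
    change_rate: dict attribute -> value changes per monitoring period.
    Assumed rule: push when queries > 0 and queries >= min_ratio * changes.
    """
    selected = []
    for attr, queries in query_rate.items():
        changes = change_rate.get(attr, 0.0)
        if queries > 0 and queries >= min_ratio * changes:
            selected.append(attr)
    return sorted(selected)


if __name__ == "__main__":
    queries = {"cpu": 10, "mem": 1, "disk": 0}
    changes = {"cpu": 2, "mem": 5}
    print(attributes_to_push(queries, changes))
```

    Under this assumed rule, frequently queried but slowly changing attributes are pushed, while rarely queried ones are left to be fetched on demand.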

    SYSTEMS AND METHODS FOR PREDICTIVE FAILURE MANAGEMENT
    8. Patent application (status: in force)

    Publication No.: US20080250265A1

    Publication date: 2008-10-09

    Application No.: US11696795

    Filing date: 2007-04-05

    IPC classification: G06F11/00

    Abstract: A system and method for using continuous failure predictions for proactive failure management in distributed cluster systems includes a sampling subsystem configured to continuously monitor and collect operation states of different system components. An analysis subsystem is configured to build classification models to perform on-line failure predictions. A failure prevention subsystem is configured to take preventive actions on failing components based on failure warnings generated by the analysis subsystem.
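
    The sample, classify, and warn pipeline in this abstract can be sketched with a deliberately simple classifier. The application does not name a model in this abstract, so a nearest-centroid classifier is used here purely as a stand-in. The function names, the two-feature state vectors, and the labels "normal" and "failing" are all hypothetical.

```python
from statistics import mean


def train_centroids(samples):
    """samples: list of (feature_vector, label). Returns label -> centroid."""
    by_label = {}
    for features, label in samples:
        by_label.setdefault(label, []).append(features)
    return {
        label: [mean(column) for column in zip(*rows)]
        for label, rows in by_label.items()
    }


def predict(centroids, features):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def dist2(centroid):
        return sum((x - y) ** 2 for x, y in zip(features, centroid))

    return min(centroids, key=lambda label: dist2(centroids[label]))


def failing_components(centroids, latest_states):
    """latest_states: component -> feature vector. Returns components predicted 'failing'."""
    return sorted(
        name for name, features in latest_states.items()
        if predict(centroids, features) == "failing"
    )


if __name__ == "__main__":
    history = [
        ([0.2, 0.3], "normal"), ([0.3, 0.2], "normal"),
        ([0.9, 0.95], "failing"), ([0.95, 0.9], "failing"),
    ]
    model = train_centroids(history)
    print(failing_components(model, {"node-a": [0.25, 0.25], "node-b": [0.92, 0.9]}))
```

    The list returned by `failing_components` plays the role of the failure warnings; a prevention step would then act on those components, which is left out of this sketch.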

    System and method for peer-to-peer multi-party voice-over-IP services
    9. Patent application (status: in force)

    Publication No.: US20070211703A1

    Publication date: 2007-09-13

    Application No.: US11372634

    Filing date: 2006-03-10

    IPC classification: H04L12/66

    Abstract: A system, method, and computer program product for establishing multi-party VoIP conference audio calls in a distributed, peer-to-peer network where any number of nodes are able to arbitrarily and asynchronously start or stop producing audio output to be mixed into a single composite audio stream that is distributed to all nodes. A single distribution tree is used that has optimal communications characteristics to distribute the composite audio signal to all nodes. An audio mixing tree is established and maintained by adaptively and dynamically adding and merging intermediate mixing nodes operating between user nodes and the root of the single distribution tree. The intermediate mixing nodes and the root of the single distribution tree are all hosted, in an exemplary embodiment, on user nodes that are endpoints of the distribution tree.

    Method and apparatus for providing load diffusion in data stream correlations
    10. Patent application (status: lapsed)

    Publication No.: US20070016560A1

    Publication date: 2007-01-18

    Application No.: US11183149

    Filing date: 2005-07-15

    Applicant: Xiaohui Gu, Philip Yu

    Inventor: Xiaohui Gu, Philip Yu

    IPC classification: G06F17/30

    Abstract: A computer implemented method, apparatus, and computer usable program code for performing load diffusion to process data stream pairs. A data stream pair is received for correlation. The data stream pair is partitioned into portions to meet correlation constraints for correlating data in the data stream pair to form a partitioned data stream pair. The partitioned data stream pair is sent to a set of nodes for correlation processing to perform the load diffusion.
