Detection of misbehaving components for large scale distributed systems
Abstract:
A method or apparatus for monitoring a system by detecting misbehaving components in the system is presented. A computing device receives historical data points based on a set of monitored signals of a system. The system has components that are monitored through the set of monitored signals. For each monitored component, the computing device performs unsupervised machine learning on the historical data points to identify the expected states and state transitions of that component. The computing device identifies one or more steady components based on the identified states of the monitored components. The computing device also receives real-time data points based on monitoring the set of signals from the system. For each identified steady component, the computing device examines the received real-time data points for deviations from the expected states and state transitions of the steady component. The computing device reports anomalies in the system based on the detected deviations.
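The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation: the abstract does not name a learning algorithm or define "steady", so the sketch makes assumptions. States are learned with a simple one-dimensional gap-based clustering, a component is treated as steady when one state accounts for at least 90% of its historical points, and a deviation is either a value outside every learned state or a transition never seen in the history. All function names, parameters (`gap`, `tol`, `min_share`), and sample data are hypothetical.

```python
from collections import Counter


def learn_states(values, gap=1.0):
    """Assumed clustering: sorted neighbours further apart than `gap` start a new state."""
    ordered = sorted(values)
    clusters = [[ordered[0]]]
    for v in ordered[1:]:
        if v - clusters[-1][-1] > gap:
            clusters.append([v])
        else:
            clusters[-1].append(v)
    return [(c[0], c[-1]) for c in clusters]  # (low, high) bounds per state


def state_of(value, states, tol=0.5):
    """Return the index of the state containing `value`, or None if no state matches."""
    for idx, (low, high) in enumerate(states):
        if low - tol <= value <= high + tol:
            return idx
    return None


def learn_model(history, gap=1.0, tol=0.5):
    """Learn the expected states and the set of observed state transitions."""
    states = learn_states(history, gap)
    seq = [state_of(v, states, tol) for v in history]
    return {"states": states, "transitions": set(zip(seq, seq[1:])),
            "seq": seq, "tol": tol}


def is_steady(model, min_share=0.9):
    """Assumed criterion: one state covers at least `min_share` of the history."""
    counts = Counter(model["seq"])
    return max(counts.values()) / len(model["seq"]) >= min_share


def detect(model, realtime):
    """Flag real-time points in an unknown state or following an unseen transition."""
    anomalies, prev = [], None
    for i, v in enumerate(realtime):
        s = state_of(v, model["states"], model["tol"])
        if s is None:
            anomalies.append((i, v, "unknown state"))
            continue  # keep the last known state as the transition origin
        if prev is not None and (prev, s) not in model["transitions"]:
            anomalies.append((i, v, "unexpected transition"))
        prev = s
    return anomalies


def monitor(history_by_component, realtime_by_component):
    """Learn a model per component, then check only the steady components."""
    report = {}
    for name, history in history_by_component.items():
        model = learn_model(history)
        if is_steady(model):
            report[name] = detect(model, realtime_by_component.get(name, []))
    return report


# Hypothetical sample data: cpu_a is stable, cpu_b cycles through three levels.
history = {
    "cpu_a": [10.0, 10.2, 9.9, 10.1, 10.0, 10.3, 9.8, 10.1, 10.0, 10.2],
    "cpu_b": [10, 50, 10, 50, 90, 10, 50, 90, 10, 50],
}
realtime = {"cpu_a": [10.1, 10.0, 25.0, 10.2], "cpu_b": [10, 90, 50]}

report = monitor(history, realtime)
print(report)  # {'cpu_a': [(2, 25.0, 'unknown state')]}
```

In this sketch only `cpu_a` is judged steady, so only its real-time points are examined. `cpu_b` is skipped because no single state dominates its history. A production system would likely replace the gap-based clustering with a more robust unsupervised method and tune the thresholds per signal.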