Root cause detection and monitoring for storage systems

    Publication number: US09898357B1

    Publication date: 2018-02-20

    Application number: US14751047

    Application date: 2015-06-25

    CPC classification number: G06F11/079 G06F11/0727 G06F11/3452 G06F2201/815

    Abstract: Notification routines are described for implementation by a monitoring service. As part of an exemplary notification routine, a faulty storage volume is correlated at multiple logical storage levels of a storage system with other faulty storage volumes. The correlation pattern can follow a tree-based decision format, where each faulty storage volume is sequentially compared at a lower logical storage level. Advantageously, once a common logical storage component of a group of storage volumes is identified, a notification is issued about the group of faulty storage volumes sharing the common logical storage component. Additionally, notifications can be issued according to a severity level of the group of faulty storage volumes. In some embodiments, before issuing the notification, the group of faulty storage volumes can be compared to a time allowed for the group of faulty storage volumes to be at fault.
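
The tree-based correlation described in this abstract can be illustrated with a minimal Python sketch. It is not the patented routine: the level names (`data_center`, `rack`, `host`), the `min_group_size` parameter, and the function name `correlate_faulty_volumes` are all assumptions made for illustration. The sketch buckets faulty volumes by component at each level, descends while at least `min_group_size` volumes still share a component, and reports each group at the lowest level where it still shares one.

```python
from collections import defaultdict

# Hypothetical logical storage levels, ordered from highest to lowest.
LEVELS = ["data_center", "rack", "host"]


def correlate_faulty_volumes(volumes, levels=LEVELS, min_group_size=2):
    """Group faulty volumes by the lowest-level component they share.

    Each volume is a dict with an "id" key plus one key per level.
    Returns a list of ((level, component), [volume ids]) tuples.
    """
    groups = []

    def descend(members, depth, shared):
        if depth < len(levels):
            # Bucket the current group by its component at the next level down.
            buckets = defaultdict(list)
            for volume in members:
                buckets[volume[levels[depth]]].append(volume)
            qualifying = {
                component: bucket
                for component, bucket in buckets.items()
                if len(bucket) >= min_group_size
            }
            if qualifying:
                # Simplification of this sketch: volumes that fall into a
                # bucket smaller than min_group_size are treated as isolated
                # faults and are not reported with the group.
                for component, bucket in qualifying.items():
                    descend(bucket, depth + 1, (levels[depth], component))
                return
        # No lower level groups these volumes, so report the last shared one.
        if shared is not None:
            groups.append((shared, sorted(v["id"] for v in members)))

    descend(volumes, 0, None)
    return groups


faulty = [
    {"id": "v1", "data_center": "A", "rack": "r1", "host": "h1"},
    {"id": "v2", "data_center": "A", "rack": "r1", "host": "h1"},
    {"id": "v3", "data_center": "A", "rack": "r2", "host": "h5"},
    {"id": "v4", "data_center": "A", "rack": "r2", "host": "h6"},
    {"id": "v5", "data_center": "B", "rack": "r9", "host": "h9"},
]
for component, ids in correlate_faulty_volumes(faulty):
    print(component, ids)
```

In this example one notification would cover the two volumes on host `h1` and another would cover the two volumes in rack `r2`; the lone volume in data center `B` shares nothing with the others and forms no group. Severity levels and the allowed time at fault mentioned in the abstract are left out of the sketch.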

    Autonomous host deployment in managed deployment systems

    Publication number: US10110502B1

    Publication date: 2018-10-23

    Application number: US14611961

    Application date: 2015-02-02

    Abstract: Autonomous host deployment may be implemented in managed deployment environments in order to deploy resources at resource host(s) when a deployment authority is unavailable. Upon startup of a resource host, a determination may be made as to whether a remote deployment state authority is available. If the deployment state authority is unavailable, a deployment state for a resource host and/or resources hosted at a resource host may be identified. Different resources at a resource host and the resource host itself may have different deployment states identified. In some embodiments, deployment state information may be locally maintained and accessed to determine the deployment state. The resource host may perform operations to deploy the resource host and/or resources according to the identified deployment state.
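
The startup decision in this abstract can be sketched as a simple fallback, shown here in Python. Everything named below is hypothetical: `fetch_remote_state` and `load_local_state` stand in for whatever the remote deployment state authority and the locally maintained state store actually are, and the state layout (a `host` entry plus a `resources` mapping) is an assumption chosen to show that the host and each resource can carry different deployment states.

```python
def unavailable_authority():
    """Stand-in for a remote deployment state authority that cannot be reached."""
    raise ConnectionError("deployment state authority unavailable")


def resolve_deployment_state(fetch_remote_state, load_local_state):
    """Return (source, state), preferring the remote authority when reachable."""
    try:
        return "remote", fetch_remote_state()
    except ConnectionError:
        # Authority unavailable: fall back to locally maintained state.
        return "local", load_local_state()


def plan_deployment(state):
    """List (target, deployment_state) pairs for the host and each resource."""
    plan = [("host", state.get("host", "unknown"))]
    for name, resource_state in sorted(state.get("resources", {}).items()):
        plan.append((name, resource_state))
    return plan


# Hypothetical local record: the host and its resources have different states.
local_record = {
    "host": "production",
    "resources": {"volume-a": "production", "volume-b": "testing"},
}

source, state = resolve_deployment_state(unavailable_authority, lambda: local_record)
print(source, plan_deployment(state))
```

The host would then carry out whatever operations each listed state requires; those operations are not specified in the abstract and are not modeled here.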

    Root cause detection and monitoring for storage systems

    Publication number: US10223189B1

    Publication date: 2019-03-05

    Application number: US14751036

    Application date: 2015-06-25

    Abstract: Suppression routines are described for implementation by a monitoring service. The monitoring service uses collected data to identify faulty storage volumes. Advantageously, in some cases, the monitoring service can notify an operator of the storage system that certain storage volumes are faulty. In some embodiments, these notifications are to be suppressed because not all notifications of faulty volumes are necessary. Suppression rules can indicate that a faulty storage volume is at fault because it is a test volume, associated with a large power outage, or some other learned event from storage command metrics. The monitoring service can suppress notifications about these known system issues, among others.
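
The suppression rules named in this abstract (test volumes, a large power outage) can be sketched as predicates applied before a notification is sent. The Python below is an illustrative assumption, not the claimed routine: the `FaultyVolume` fields, the `power_outage_sites` context key, and the rule names are all invented for the example, and learned rules derived from storage command metrics are not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class FaultyVolume:
    volume_id: str
    tags: set = field(default_factory=set)
    data_center: str = ""


def is_test_volume(volume, context):
    # Hypothetical convention: test volumes carry a "test" tag.
    return "test" in volume.tags


def in_power_outage(volume, context):
    # Hypothetical context key listing sites with a known power outage.
    return volume.data_center in context.get("power_outage_sites", set())


# Rules are checked in order; the first match gives the suppression reason.
SUPPRESSION_RULES = [
    ("test volume", is_test_volume),
    ("power outage", in_power_outage),
]


def filter_notifications(volumes, context, rules=SUPPRESSION_RULES):
    """Split faulty volumes into ids to notify about and (id, reason) suppressed."""
    to_notify, suppressed = [], []
    for volume in volumes:
        reason = next((name for name, rule in rules if rule(volume, context)), None)
        if reason is None:
            to_notify.append(volume.volume_id)
        else:
            suppressed.append((volume.volume_id, reason))
    return to_notify, suppressed


volumes = [
    FaultyVolume("v1", tags={"test"}),
    FaultyVolume("v2", data_center="dc-east"),
    FaultyVolume("v3", data_center="dc-west"),
]
notify, suppressed = filter_notifications(volumes, {"power_outage_sites": {"dc-east"}})
print(notify, suppressed)
```

Keeping the rules in an ordered list makes it straightforward to append further rules later without changing the filtering loop.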

    Root cause detection and monitoring for storage systems

    Publication number: US10282245B1

    Publication date: 2019-05-07

    Application number: US14751028

    Application date: 2015-06-25

    Abstract: A storage system includes a monitoring service that identifies root causes of storage system issues using relationships. The monitoring service can use thresholds associated with the relationships to detect the root causes. Relationships can be based on correlation relationships between the different levels of the storage system. In various embodiments, relationships can also be based on events that affect multiple storage volumes or on short-term events. Once a relationship is identified, a threshold for that relationship is generated or updated. The monitoring service can make that threshold accessible to other components of the monitoring service or an operator of the storage system to be used in detecting root causes.
