Abstract:
Cluster membership in a distributed computer system is determined by establishing which other nodes each node is in communication with and distributing that connectivity information through the nodes of the system. Accordingly, each node can determine an optimized new cluster based upon the connectivity information. Specifically, each node has information about which nodes it is in communication with and similar information for each other node of the system. Therefore, each node has complete information regarding the interconnectivity of all nodes which are directly or indirectly connected. Each node applies optimization criteria to this connectivity information to determine an optimal new cluster. Data representing the optimal new cluster is broadcast by each node. In addition, the optimal new clusters determined by the various nodes are collected by each node. Thus, each node has data representing the proposed new cluster which each respective node perceives to be optimal. Each node uses this data to elect a new cluster from the various proposed new clusters. For example, the new cluster represented by more proposed new clusters than any other is elected as the new cluster. Since each node receives the same proposed new clusters from the potential member nodes of the new cluster, the new cluster membership is reached unanimously. In addition, since each node has more complete information regarding the potential member nodes of the new cluster, the resulting new cluster consistently has a relatively optimal configuration.
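The propose-then-elect scheme described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented method: the abstract does not state the optimization criteria, so the sketch assumes "largest fully connected group" as the criterion, and the names `propose_cluster`, `elect_cluster`, and the `links` mapping are hypothetical.

```python
from collections import Counter
from itertools import combinations


def fully_connected(group, links):
    # Every pair in the group must hear each other in both directions.
    return all(b in links[a] and a in links[b] for a, b in combinations(group, 2))


def propose_cluster(links):
    # ASSUMED criterion: the largest fully connected group wins. Candidates are
    # scanned in sorted order so every node with the same data picks the same group.
    nodes = sorted(links)
    for size in range(len(nodes), 0, -1):
        for group in combinations(nodes, size):
            if fully_connected(group, links):
                return frozenset(group)
    return frozenset()


def elect_cluster(proposals):
    # The cluster named by the most proposals is elected; ties are broken
    # deterministically (an assumption, the abstract does not cover ties).
    counts = Counter(proposals)
    return max(counts, key=lambda c: (counts[c], len(c), sorted(c)))


# Hypothetical connectivity: D can hear A, but A cannot hear D.
links = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B"}, "D": {"A"}}
proposal = propose_cluster(links)
elected = elect_cluster([proposal, proposal, proposal, frozenset({"D"})])
```

Because every node runs the same deterministic functions over the same collected proposals, each one arrives at the same elected membership, which is the property the abstract calls unanimity.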
Abstract:
A distributed computer system and method for determining cluster membership in a distributed computer system. A plurality of computers configurable as cluster nodes are coupled through one or more public and/or private communications networks. Cluster management software running on the plurality of computers is configured to group various ones of the computers into a cluster. Weighting values are assigned to each node, such as by relative processing power. Each fully connected subset of nodes is grouped into a possible cluster configuration. The weighting value of each subset is calculated. The membership in the cluster is chosen based on the subset with the optimum weighting value among all the possible cluster configurations. The maximum weighting value may be adjusted if it is greater than or equal to the sum of the weighting values of all other nodes in the current cluster configuration. In that case, the maximum weighting value may be adjusted to a value below the sum of the weighting values of all other nodes in the current cluster configuration.
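As a rough illustration of the weighting scheme, the sketch below enumerates fully connected subsets, sums their weights, and caps a dominant node's weight. The function names, the integer weights, and the choice of "sum of the others minus one" as the adjusted value are assumptions made for the example; the abstract only says the value is lowered below that sum.

```python
from itertools import combinations


def cap_dominant_weight(weights):
    # If the heaviest node weighs at least as much as all other nodes combined,
    # lower it to just below that sum (ASSUMED: integer weights, sum minus 1).
    top = max(weights, key=weights.get)
    rest = sum(w for node, w in weights.items() if node != top)
    adjusted = dict(weights)
    if rest > 0 and weights[top] >= rest:
        adjusted[top] = rest - 1
    return adjusted


def best_subset(weights, links):
    # Score every fully connected subset and keep the heaviest one.
    nodes = sorted(weights)
    best, best_weight = frozenset(), 0
    for size in range(1, len(nodes) + 1):
        for group in combinations(nodes, size):
            if all(b in links[a] and a in links[b] for a, b in combinations(group, 2)):
                total = sum(weights[n] for n in group)
                if total > best_weight:
                    best, best_weight = frozenset(group), total
    return best, best_weight


# Hypothetical cluster: A is powerful but isolated, B and C can reach each other.
weights = {"A": 10, "B": 3, "C": 4}
links = {"A": set(), "B": {"C"}, "C": {"B"}}
uncapped = best_subset(weights, links)
capped = best_subset(cap_dominant_weight(weights), links)
```

In this example the isolated node A would win on its own with the original weights, while after the adjustment the pair B and C outweighs it, which shows why the cap is useful.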
Abstract:
A computing system develops time/date values by using a free-running counter to measure and accumulate increments of time. The increments of time are converted from the resolution of the free-running counter to that used for the time and date values by dividing by a conversion variable, and are then used to update the time/date value. The accuracy of the time/date value is monitored by periodically comparing the rate of the free-running counter to the rate of a more accurate, external clock. The ratio of these two rates is used to adjust the conversion variable. The conversion variable reflects any differences between (1) the rate of change of the increments of time used for developing the time/date value and (2) the external clock. Its use therefore either slows down or speeds up the rate of change of the time/date value so that it more closely tracks the external clock.
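The conversion-variable idea can be sketched as follows. This is a simplified model under stated assumptions: the class name `SoftClock`, the attribute `ticks_per_unit` (standing in for the conversion variable), and the calibration rule (set the variable to the observed counter ticks per external time unit) are illustrative choices, not details taken from the abstract.

```python
class SoftClock:
    """Time value driven by a free-running counter (illustrative sketch)."""

    def __init__(self, ticks_per_unit):
        # The conversion variable: counter ticks per unit of time/date value.
        self.ticks_per_unit = ticks_per_unit
        self.time_units = 0.0

    def advance(self, counter_ticks):
        # Convert counter resolution to time resolution by dividing.
        self.time_units += counter_ticks / self.ticks_per_unit

    def calibrate(self, counter_ticks, external_units):
        # ASSUMED rule: the ratio of the counter rate to the external clock
        # rate over the same interval becomes the new conversion variable.
        self.ticks_per_unit = counter_ticks / external_units


clock = SoftClock(ticks_per_unit=1000.0)
clock.advance(1_000_000)              # nominally 1000 time units
clock.calibrate(1_001_000, 1000.0)    # counter actually runs 0.1% fast
clock.advance(1001)                   # now counts as exactly one unit
```

After calibration the same number of raw ticks yields a slightly smaller time increment, so the time value slows down to track the external clock.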
Abstract:
Multiple nodes can concurrently gain membership in a cluster of nodes of a distributed computer system by broadcasting reconfiguration messages to all nodes of the distributed computer system. In response to a reconfiguration request resulting from a node petitioning to join a cluster or a node leaving the cluster, each node determines to which nodes of the distributed computer system it is connected, i.e., which nodes are sending reconfiguration messages that it receives. In addition, if multiple nodes fail substantially simultaneously, each node which continues to operate does not receive a reconfiguration message from any of the failed nodes, and the failed nodes are omitted from the proposed new cluster. Thus, multiple simultaneous failures are processed in a single reconfiguration. Each of the member nodes of the proposed cluster determines the membership of the proposed cluster, broadcasts a reconfiguration message to all proposed member nodes, and collects similar messages. If all reconfiguration messages agree, the proposed cluster is accepted. In the case in which one or more nodes leave the cluster, quorum is established in the new cluster relative to the old cluster.
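A minimal sketch of the agreement step might look like the following. The function names are hypothetical, and the quorum rule (the new cluster must retain a strict majority of the old members) is an assumption; the abstract says only that quorum is established relative to the old cluster.

```python
def proposed_membership(node, heard_from):
    # A node proposes itself plus every node whose reconfiguration
    # message it actually received; silent (failed) nodes drop out.
    return frozenset(heard_from) | {node}


def accept(proposals):
    # proposals maps each sender to the membership it broadcast.
    # Accept only if all messages agree and every proposed member sent one.
    distinct = set(proposals.values())
    if len(distinct) != 1:
        return None
    membership = next(iter(distinct))
    return membership if membership == frozenset(proposals) else None


def has_quorum(new_cluster, old_cluster):
    # ASSUMED rule: a strict majority of the old members must survive.
    return 2 * len(new_cluster & old_cluster) > len(old_cluster)


# Hypothetical run: old cluster {A, B, C}; C fails while D and E join.
old = frozenset({"A", "B", "C"})
live = ["A", "B", "D", "E"]
proposals = {n: proposed_membership(n, [m for m in live if m != n]) for n in live}
new = accept(proposals)
```

Here one failure and two joins are handled in a single round, since each surviving node simply builds its proposal from the messages it received.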
Abstract:
Each node of a failing distributed computer system, e.g., as a result of a split-brain failure, races to achieve a quorum by successfully reserving two shared storage devices which are designated quorum controllers. During normal operation of the distributed computer system, each of the quorum controllers is associated with and reserved by a respective node. During the race for quorum in response to a detected failure of the distributed computer system, each node which has not failed forcibly reserves the quorum controller which is associated with the other node. If a node simultaneously holds reservations for both quorum controllers, that node has acquired a quorum. The forcible reservation of a shared storage device does not fail even if another node holds a valid reservation to the same storage device. Accordingly, a failed node which does not relinquish a reservation to the node's quorum controller cannot prevent another node from acquiring a quorum. Prior to forcibly reserving the quorum controller of another node, each node verifies that it continues to hold a reservation of the node's own associated quorum controller. If a node no longer holds a reservation of the node's own associated quorum controller, that node has lost the race for quorum since another node has already forcibly reserved the former node's associated quorum controller. Thus, quorum can be efficiently and effectively determined by independent nodes of a failing distributed computer system notwithstanding the failure of a failing node to relinquish shared storage device reservations held by the failing node.
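The race described above can be modeled in a few lines. This is a single-threaded toy model with hypothetical names (`QuorumController`, `force_reserve`, `race_for_quorum`); a real system would issue reservation commands to shared storage devices, and timing between the check and the forced reservation would matter.

```python
class QuorumController:
    """Shared storage device that tracks a single reservation holder."""

    def __init__(self, owner):
        self.holder = owner  # reserved by its associated node in normal operation

    def force_reserve(self, node):
        # A forcible reservation never fails, even when another node
        # currently holds a valid reservation on this device.
        self.holder = node


def race_for_quorum(node, own, other):
    # Step 1: verify the node still holds its own controller. If not,
    # another node has already taken it and this node has lost the race.
    if own.holder != node:
        return False
    # Step 2: forcibly reserve the other node's controller.
    other.force_reserve(node)
    # Step 3: quorum means holding both reservations at once.
    return own.holder == node and other.holder == node


qc_a, qc_b = QuorumController("A"), QuorumController("B")
a_wins = race_for_quorum("A", qc_a, qc_b)   # A gets there first
b_wins = race_for_quorum("B", qc_b, qc_a)   # B finds its own controller taken
```

Note that node B never has to release anything for A to win, which mirrors the point that a failed node cannot block quorum by holding on to its reservation.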
Abstract:
A computer system comprises a processor (2), memory (4) and a plurality of devices (6, 8, 12), the processor (2) and the memory (4) being operable to effect the operation of a fault response processor (AFR), and a device driver (GRAPHICS, NETWORK, H2IO, IO2L, SERIAL) for each of the devices. The fault response processor (AFR) is operable to generate a model which represents the processor (2), the memory (4) and the devices (6, 8, 12) of the computer system and the inter-connection of the processor (2), memory (4) and the devices (GRAPHICS, NETWORK, H2IO, IO2L, SERIAL). The device driver (GRAPHICS, NETWORK, H2IO, IO2L, SERIAL) for each of the devices (6, 8, 12) is arranged, consequent upon a change of operational status of the device, to generate fault report data indicating whether the change of status was caused internally within the device or externally by another connected device. The devices of the computer system may be formed as a plurality of Field Replaceable Units (FRU). The fault response processor (AFR) is operable, consequent upon receipt of the fault reports from the device drivers (GRAPHICS, NETWORK, H2IO, IO2L, SERIAL), to estimate the location of a FRU containing a faulty device by applying the fault indication to the model. In other embodiments the fault report data includes direction information indicating a connection between the device and the other connected device which caused the external fault. Once the faulty device has been identified, the FRU containing it may be replaced, thereby minimizing down time of the computer system.
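One way to picture applying fault reports to the model is sketched below. Everything about the topology is hypothetical: the abstract names the drivers but not how the devices are wired or grouped into FRUs, so the `fru_of` and `upstream` mappings, the FRU names, and the blame-following rule are assumptions for illustration only.

```python
def locate_faulty_frus(fru_of, upstream, reports):
    # fru_of:   device -> FRU that contains it (part of the model)
    # upstream: device -> connected device an external fault points toward
    # reports:  device -> "internal" or "external", as sent by its driver
    suspects = set()
    for device, cause in reports.items():
        if cause == "internal":
            suspects.add(fru_of[device])
        elif cause == "external":
            blamed = upstream.get(device)  # follow the direction information
            if blamed is not None:
                suspects.add(fru_of[blamed])
    return suspects


# HYPOTHETICAL topology: NETWORK and SERIAL both hang off the IO2L bridge.
fru_of = {"H2IO": "motherboard", "IO2L": "bridge-card",
          "NETWORK": "net-card", "SERIAL": "serial-card"}
upstream = {"IO2L": "H2IO", "NETWORK": "IO2L", "SERIAL": "IO2L"}
reports = {"NETWORK": "external", "SERIAL": "external", "IO2L": "internal"}
suspects = locate_faulty_frus(fru_of, upstream, reports)
```

In this made-up case two external reports and one internal report all converge on the same FRU, so only that unit would be flagged for replacement.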
Abstract:
A system and method for monitoring a distributed fault tolerant computer system. A hardware counter mechanism (e.g. a countdown counter) is reset repeatedly by a software reset mechanism during normal operation, thereby preventing the counter mechanism from reaching a count indicative of the existence of a fault. A unit provides a signal to a bus indicative of the status (ON or OFF) of the unit. A management subsystem defines a configuration for the distributed fault tolerant computer system. The management subsystem is responsive to status signals on the bus and selectively reconfigures a stored representation in response to changing status signals on the bus.
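The countdown-counter mechanism can be modeled in software as below. This is only a simulation with hypothetical names (`CountdownWatchdog`, `tick`, `reset`); in the system described, the counter is a hardware mechanism and only the reset comes from software.

```python
class CountdownWatchdog:
    """Software model of a hardware countdown counter (illustrative only)."""

    def __init__(self, start):
        self.start = start
        self.count = start

    def reset(self):
        # Called repeatedly by healthy software to keep the counter away from zero.
        self.count = self.start

    def tick(self):
        # One hardware clock step; returns True once the count indicating
        # a fault (zero, in this sketch) has been reached.
        if self.count > 0:
            self.count -= 1
        return self.count == 0


wd = CountdownWatchdog(start=3)
healthy = [wd.tick(), wd.tick()]      # software is still running...
wd.reset()                            # ...and resets the counter in time
stalled = [wd.tick(), wd.tick(), wd.tick()]  # software stops resetting
```

As long as resets keep arriving, the fault count is never reached; once they stop, the counter runs down and the fault is signaled.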
Abstract:
Data integrity and availability are assured by preventing a node of a distributed, clustered system from accessing shared data in the case of a failure of the node or of communication links with the node. The node is prevented from accessing the shared data in the presence of such a failure by ensuring that the failure is detected in less time than a secondary node would allow user I/O activities to commence after reconfiguration. The prompt detection of failure is assured by periodically determining which configuration of the current cluster each node believes itself to be a member of. Each node maintains a sequence number which identifies the current configuration of the cluster. Periodically, each node exchanges its sequence number with all other nodes of the cluster. If a particular node detects that it believes itself to be a member of a configuration preceding the one to which another node belongs, the node determines that the cluster has been reconfigured since the node last performed a reconfiguration. Therefore, the node must no longer be a member of the cluster. The node then refrains from accessing shared data. In addition, if a node suspects a failure in the cluster, the node broadcasts a reconfigure message to all other nodes of the cluster through a public network. Since the messages are sent through a public network, failure of the private communications links between the nodes does not prevent receipt of the reconfigure messages.
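The sequence-number check lends itself to a short sketch. The class and attribute names here (`ClusterNode`, `may_access_shared_data`, `exchange`) are hypothetical, and the periodic exchange is reduced to a single method call that receives the peers' sequence numbers.

```python
class ClusterNode:
    """Tracks which cluster configuration a node believes it belongs to."""

    def __init__(self, name, sequence_number):
        self.name = name
        self.sequence_number = sequence_number  # identifies the configuration
        self.may_access_shared_data = True

    def exchange(self, peer_sequence_numbers):
        # If any peer is in a later configuration, the cluster was reconfigured
        # without this node, so it must stop touching shared data.
        if any(seq > self.sequence_number for seq in peer_sequence_numbers):
            self.may_access_shared_data = False
        return self.may_access_shared_data


current = ClusterNode("A", sequence_number=7)
stale = ClusterNode("B", sequence_number=6)
current_ok = current.exchange([7, 6])   # nobody is ahead of A
stale_ok = stale.exchange([7])          # A is ahead, so B fences itself off
```

The stale node excludes itself from shared data as soon as it sees a newer configuration, rather than waiting to be told it has been removed.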