Abstract:
A method, computer program product and/or computer system assigns access to a quorum disk in a split-storage cluster environment when a communication link between storage systems fails. Access to the quorum disk is based on storage system I/O performance: priority is given to the storage system that had the higher performance before the link failure. When the communication link fails, both storage systems attempt to access the quorum disk. If the system that first attempts to access the quorum disk is the non-priority storage system, a timer is started. If the priority system attempts to access the quorum disk within a predetermined time interval, the priority system locks the quorum disk and forms the cluster. If the priority system does not attempt to access the quorum disk within the predetermined time interval, the non-priority system locks the quorum disk and forms the cluster.
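The arbitration rule described above can be sketched as a small state machine. This is only an illustrative model, not the claimed implementation: the class `QuorumArbiter`, its field names, and the use of explicit timestamps in place of a real timer are all assumptions made for the example, as is treating an attempt made exactly at the end of the interval as too late.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QuorumArbiter:
    """Hypothetical model of the quorum-disk arbitration rule."""
    priority_system: str                  # system with the higher I/O performance before the failure
    grace_interval: float                 # predetermined time interval (assumed unit: seconds)
    waiting_system: Optional[str] = None  # non-priority system that attempted first
    timer_started: Optional[float] = None
    owner: Optional[str] = None           # system that locked the quorum disk

    def tick(self, now: float) -> Optional[str]:
        # If the timer has run out, the waiting non-priority system takes the lock.
        if (self.owner is None and self.waiting_system is not None
                and now - self.timer_started >= self.grace_interval):
            self.owner = self.waiting_system
        return self.owner

    def attempt(self, system: str, now: float) -> Optional[str]:
        # Apply any timer expiry first, then handle this access attempt.
        self.tick(now)
        if self.owner is None:
            if system == self.priority_system:
                self.owner = system            # priority system locks immediately
            elif self.waiting_system is None:
                self.waiting_system = system   # non-priority system starts the timer
                self.timer_started = now
        return self.owner
```

For instance, with `QuorumArbiter("A", 5.0)`, an attempt by `"B"` at time 0 followed by an attempt by `"A"` at time 3 gives the lock to `"A"`; if `"A"` stays silent, `tick(5.0)` gives it to `"B"`.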
Abstract:
An apparatus 2 comprises at least three processing circuits 4 to perform redundant processing of a common thread of program instructions. Error detection circuitry 16 is provided comprising a number of comparators 22 for detecting a mismatch between signals on corresponding signal nodes 20 in the processing circuits 4. When a comparator 22 detects a mismatch, this triggers a recovery process. The error detection circuitry 16 generates an unresolvable error signal 36, indicating that a detected error is unresolvable by the recovery process, when a mismatch is detected during the recovery process by one of a proper subset 34 of the comparators 22. By considering fewer comparators 22 during the recovery process than during normal operation, the chances of unrecoverable errors being detected can be reduced, increasing system availability.
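The difference between normal operation and recovery can be illustrated in software. The sketch below is an assumption-laden model rather than the circuit itself: the function `check`, the node names, and the three result strings are hypothetical, and each "comparator" is modelled as an equality check on one named signal node across all processing circuits.

```python
def check(nodes_by_core, recovering, recovery_subset):
    """Compare corresponding signal nodes across redundant cores.

    nodes_by_core: list of dicts (one per core) mapping node name -> signal value.
    recovery_subset: names of the comparators still considered during recovery
                     (a proper subset of all node names).
    Returns "ok", "recover", or "unresolvable" (hypothetical result labels).
    """
    names = nodes_by_core[0].keys()
    # A comparator reports a mismatch when the cores disagree on its node.
    mismatched = {n for n in names if len({core[n] for core in nodes_by_core}) > 1}
    if not recovering:
        # Normal operation: any comparator mismatch triggers the recovery process.
        return "recover" if mismatched else "ok"
    # Recovery: only mismatches seen by the proper subset are unresolvable.
    return "unresolvable" if mismatched & set(recovery_subset) else "ok"
```

With a subset of `{"pc"}`, a disagreement on a node outside the subset triggers recovery in normal operation but is ignored during recovery, which is the mechanism the abstract credits for the higher availability.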
Abstract:
An approach is provided in which a system selects a first processor as a master Time of Day (TOD) processor in a first TOD topology. The system then assigns a second processor as an alternate master TOD processor to a second TOD topology based upon determining that the second processor is on a different node than the first processor. The system configures to the first TOD topology and, when the system detects a TOD failure requiring a topology switch, the system re-configures to the second TOD topology.
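A minimal sketch of the selection and failover logic follows. The names `Processor`, `pick_alternate_master`, and `TodConfig` are hypothetical, and the rule "take the first candidate on a different node" is an assumption; the abstract only requires that the alternate master be on a different node than the first processor.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Processor:
    name: str
    node: int  # identifier of the node that hosts this processor


def pick_alternate_master(primary: Processor,
                          candidates: Iterable[Processor]) -> Optional[Processor]:
    # Choose an alternate master that does not share a node with the primary.
    for p in candidates:
        if p != primary and p.node != primary.node:
            return p
    return None


class TodConfig:
    def __init__(self, primary: Processor, alternate: Optional[Processor]):
        self.topologies = {"first": primary, "second": alternate}
        self.active = "first"  # the system starts in the first TOD topology

    def master(self) -> Optional[Processor]:
        return self.topologies[self.active]

    def on_tod_failure(self, requires_switch: bool) -> Optional[Processor]:
        # Re-configure to the second topology only when the failure requires it.
        if requires_switch and self.topologies["second"] is not None:
            self.active = "second"
        return self.master()
```

Placing the two masters on different nodes means that a node-level fault cannot take out both topologies at once.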
Abstract:
In computing systems that provide multiple computing domains configured to operate according to an active-standby model, techniques are provided for intentionally biasing the race to gain mastership between competing computing domains, which determines which computing domain operates in the active mode, in favor of a particular computing domain. The race to gain mastership may be biased in favor of a computing domain operating in a particular mode prior to the occurrence of the event that triggered the race. For example, in certain embodiments, the race may be biased in favor of the computing domain that was operating in the active mode prior to the occurrence of that event.
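One way to picture such a bias is to handicap the domain that was not active before the event. The abstract does not say how the bias is applied, so the fixed delay `handicap`, the function name, and the tie-breaking rule below are all assumptions made for illustration.

```python
def race_for_mastership(ready_times, previously_active, handicap):
    """Return the domain that wins a biased race (hypothetical model).

    ready_times: dict mapping domain name -> time at which it is ready to claim.
    previously_active: domain that was active before the triggering event.
    handicap: extra delay imposed on every other domain (assumed mechanism).
    """
    claim_times = {
        domain: t + (0.0 if domain == previously_active else handicap)
        for domain, t in ready_times.items()
    }
    # Earliest claim wins; on a tie the previously active domain is preferred.
    return min(claim_times, key=lambda d: (claim_times[d], d != previously_active))
```

With a handicap of 0.5, a previously active domain that is ready at 1.0 still beats a standby that is ready at 0.8, but loses if it only becomes ready at 2.0, so the bias favors continuity without blocking failover.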
Abstract:
An information processing system includes a controller, a terminal, and a plurality of servers. The controller attaches successive identification numbers to requests received from the terminal, transmits the requests to the plurality of servers, and transmits one of the responses received from the servers to the terminal. When the identification number of the current request and that of the previous request are not successive, each server receives from another server the contents of the change(s) in resources associated with the missing identification number(s) and changes the resources accordingly. The server then performs predetermined processing in accordance with the received request, creates a response, changes the resources, transmits the response to the controller, and stores the identification number of the current request together with the contents of the change in the resources associated with it.
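The gap-filling behaviour of a server can be sketched as follows. This is an illustrative model only: the `Server` class, the `changes_for` helper, and the choice of a "set key to value" request as the predetermined processing are assumptions, and the peer is assumed to have already stored every missing change.

```python
class Server:
    """Hypothetical replica that applies numbered requests in order."""

    def __init__(self):
        self.resources = {}   # current resource state
        self.last_id = 0      # identification number of the previous request
        self.change_log = {}  # identification number -> resource change

    def changes_for(self, request_id):
        # Serve a stored change to a peer that missed this request.
        return self.change_log[request_id]

    def handle(self, request_id, key, value, peer):
        # Non-successive numbers: fetch and apply every missing change first.
        for missing in range(self.last_id + 1, request_id):
            change = peer.changes_for(missing)
            self.resources.update(change)
            self.change_log[missing] = change
        # Process the current request (assumed here to be a simple write).
        change = {key: value}
        self.resources.update(change)
        # Store the current number together with its resource change.
        self.change_log[request_id] = change
        self.last_id = request_id
        return ("ok", request_id)
```

A server that receives requests 1 and 3 but never sees request 2 ends up with the same resources as a peer that received all three, because it pulls the change for request 2 from that peer before handling request 3.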
Abstract:
The present invention achieves functional safety in a multiprocessor system without tightly coupling the processor elements. When a plurality of processor elements execute the same data processing to achieve functional safety, a bus interface unit is adopted that performs safety measure processing when a mismatch among the access requests issued by the processor elements has been established, and starts the access processing corresponding to the access request when the access requests coincide with one another.
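The check performed by the bus interface unit can be modelled in a few lines. This is a software sketch under stated assumptions, not the hardware design: `AccessRequest`, `bus_interface`, and the result labels are hypothetical, memory is a plain dictionary, and all requests are assumed to have been collected before the comparison.

```python
from typing import NamedTuple


class AccessRequest(NamedTuple):
    address: int
    write: bool
    data: int = 0


def bus_interface(requests, memory):
    """Start the access only if every processor element issued the same request."""
    first = requests[0]
    if any(r != first for r in requests[1:]):
        # Mismatch established: take the safety measure and leave memory untouched.
        return ("safety_measure", None)
    if first.write:
        memory[first.address] = first.data
        return ("done", None)
    return ("done", memory.get(first.address, 0))
```

Because the comparison happens at the bus boundary, the processor elements do not need to run in lockstep with one another; only their outgoing access requests have to agree.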
Abstract:
A high availability scheduler of tasks in a cluster of server devices is provided. A server device of the cluster of server devices enters a leader state based upon the results of a consensus election process in which the server device participates with others of the cluster of server devices. Upon entering the leader state, the server device schedules one or more tasks by assigning each of the one or more tasks to a server device in the cluster.
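The leader's scheduling step might look like the sketch below. The consensus election itself is out of scope here and is represented only by its result; the class `SchedulerNode`, the state names, and the round-robin assignment policy are assumptions, since the abstract does not specify how tasks are distributed.

```python
class SchedulerNode:
    """Hypothetical server device that schedules tasks once it becomes leader."""

    def __init__(self, node_id, cluster):
        self.node_id = node_id
        self.cluster = list(cluster)  # all server devices, including this one
        self.state = "follower"
        self.assignments = {}

    def on_election_result(self, winner_id, tasks):
        # Only the election winner enters the leader state and schedules tasks.
        if winner_id != self.node_id:
            self.state = "follower"
            return {}
        self.state = "leader"
        # Assumed policy: assign tasks to cluster members in round-robin order.
        self.assignments = {
            task: self.cluster[i % len(self.cluster)]
            for i, task in enumerate(tasks)
        }
        return self.assignments
```

If the leader fails, a new election produces a different winner, and that node takes over scheduling in the same way.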
Abstract:
A system and method for controlling processor instruction execution. In one example, a method for controlling a total number of instructions executed by a processor includes instructing the processor to iteratively execute instructions via multiple iterations until a predetermined time period has elapsed. A number of instructions executed in each iteration of the iterations is less than a number of instructions executed in a prior iteration of the iterations. The method also includes determining the total number of instructions executed during the predetermined time period.
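The iteration scheme can be illustrated with a simulated clock. Everything specific in this sketch is an assumption: the function name, halving as the way to shrink each iteration, the fixed cost per instruction, and the integer time units. Note also that once the batch size reaches 1 it can shrink no further.

```python
def run_until_deadline(period, first_batch, cost_per_instruction):
    """Execute shrinking batches of instructions until `period` has elapsed.

    Time is simulated: each instruction costs `cost_per_instruction` time units.
    Returns (total instructions executed, elapsed time).
    """
    elapsed = 0
    total = 0
    batch = first_batch
    while elapsed < period:
        total += batch                          # "execute" this iteration's instructions
        elapsed += batch * cost_per_instruction
        batch = max(1, batch // 2)              # next iteration runs fewer instructions
    return total, elapsed
```

With `period=100`, `first_batch=64`, and a cost of 1, the iterations execute 64, 32, and 16 instructions, for a total of 112. Smaller batches near the end mean the time check happens more often, which limits how far the run can overshoot the period.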
Abstract:
A system includes a primary functionality and a backup functionality for the primary functionality. A measurement circuit measures operational parameter values of the primary functionality. A fault detection circuit determines a level of equivalence between the operation of the primary functionality and a reference functionality based on a weighted comparison of the measured operational parameter values of the primary functionality to corresponding reference operational parameter values for the reference functionality. If the equivalence determination fails to find equivalence, the fault detection circuit signals a fault in the primary functionality and activates the backup functionality.
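A weighted comparison of this kind could be sketched as below. The scoring formula (one minus the weighted mean relative deviation), the threshold, and all names are assumptions chosen for illustration; the abstract does not define how the level of equivalence is computed. Reference values are assumed to be non-zero.

```python
def equivalence_score(measured, reference, weights):
    """Weighted similarity between measured and reference parameter values.

    Returns 1.0 for a perfect match; lower values mean larger weighted deviation.
    """
    total_weight = sum(weights.values())
    deviation = sum(
        w * abs(measured[k] - reference[k]) / abs(reference[k])
        for k, w in weights.items()
    )
    return 1.0 - deviation / total_weight


def monitor(measured, reference, weights, threshold):
    # Signal a fault and activate the backup when equivalence is not found.
    if equivalence_score(measured, reference, weights) < threshold:
        return {"fault": True, "active": "backup"}
    return {"fault": False, "active": "primary"}
```

Weighting lets a deviation in a critical parameter count for more than the same relative deviation in a less important one.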