Invention application
US20070180288A1 METHOD, SYSTEM AND PROGRAM FOR SECURING REDUNDANCY IN PARALLEL COMPUTING SYSTEM (status: lapsed)
Method, system and program for redundancy in parallel computing

Abstract:
In a parallel computing system having a plurality of computing node groups, including at least one spare computing node group, a plurality of managing nodes that allocate jobs to the computing node groups and an information management server holding status information for each computing node group are associated with the computing node groups. Each managing node updates the status information of the computing node group it is using by accessing the information management server. When a managing node detects a failure, the managing node that was using the computing node group disabled by the failure identifies a spare computing node group by accessing the computing node group status information in the information management server. That managing node then obtains the computing node group information of the identified spare computing node group. Using this information, it switches from the disabled computing node group to the identified spare computing node group as the computing node group to be used and continues processing, so that redundancy in the parallel computing system is secured.
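
The failover sequence described in the abstract can be illustrated with a minimal Python sketch. It is not the patent's implementation: the class names (`InfoServer`, `ManagingNode`, `GroupRecord`), the status strings, the `info` field (standing in for whatever "computing node group information" contains), and the choice to mark a spare as in use at the moment it is identified are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical status values; the abstract does not name them.
IN_USE, SPARE, FAILED = "in_use", "spare", "failed"


@dataclass
class GroupRecord:
    status: str
    info: str  # assumed stand-in for "computing node group information"


class InfoServer:
    """Holds the status of every computing node group."""

    def __init__(self, groups: Dict[str, GroupRecord]):
        self.groups = groups

    def update_status(self, group_id: str, status: str) -> None:
        self.groups[group_id].status = status

    def claim_spare(self) -> Optional[str]:
        # Assumption: identifying a spare also marks it in use,
        # so two managing nodes cannot pick the same group.
        for group_id, record in self.groups.items():
            if record.status == SPARE:
                record.status = IN_USE
                return group_id
        return None

    def group_info(self, group_id: str) -> str:
        return self.groups[group_id].info


class ManagingNode:
    """Allocates jobs to one computing node group at a time."""

    def __init__(self, name: str, server: InfoServer, group_id: str):
        self.name = name
        self.server = server
        self.group_id = group_id
        self.target = server.group_info(group_id)
        server.update_status(group_id, IN_USE)

    def on_failure(self) -> str:
        # 1. Record that the group in use has been disabled.
        self.server.update_status(self.group_id, FAILED)
        # 2. Identify a spare group via the information server.
        spare = self.server.claim_spare()
        if spare is None:
            raise RuntimeError("no spare computing node group available")
        # 3. Obtain the spare's group information and switch to it.
        self.target = self.server.group_info(spare)
        self.group_id = spare
        return spare
```

With groups `g0` and `g1` in use and `g2` registered as a spare, a call to `on_failure()` on the node using `g0` moves that node to `g2`; a later failure on another node raises an error because no spare remains.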