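Patent Application
- Patent title: METHOD, SYSTEM AND PROGRAM FOR SECURING REDUNDANCY IN PARALLEL COMPUTING SYSTEM
- Patent title (translated from Chinese): Method, system, and program for redundancy in parallel computing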
- Application number: US11608331
- Filing date: 2006-12-08
- Publication number: US20070180288A1
- Publication date: 2007-08-02
- Inventors: Masakuni Okada, Fumitomo Ohsawa, Yoshiko Ishii, Naoki Matsuo
- Applicants: Masakuni Okada, Fumitomo Ohsawa, Yoshiko Ishii, Naoki Matsuo
- Applicant address: Armonk, NY, US
- Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
- Current assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
- Current assignee address: Armonk, NY, US
- Priority: JP2005-369863, 2005-12-22
- Main classification: G06F11/00
- IPC classification: G06F11/00
Abstract:
In a parallel computing system having a plurality of computing node groups, including at least one spare computing node group, a plurality of managing nodes that allocate jobs to the computing node groups and an information management server that holds status information for each computing node group are associated with the computing node groups. Each managing node updates the status information of the computing node group it is using by accessing the information management server. When a managing node detects a failure, the managing node that was using the computing node group disabled by the failure identifies a spare computing node group by accessing the computing node group status information in the information management server, and then obtains the computing node group information of the identified spare computing node group. On the basis of that information, the managing node continues processing by switching from the disabled computing node group to the identified spare computing node group as the computing node group to be used, thereby securing redundancy in the parallel computing system.
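The failover flow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `Status`, `NodeGroupInfo`, `InfoServer`, and `ManagingNode` are hypothetical, the three-state status model is an assumption, and an in-memory dictionary stands in for the information management server.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional


class Status(Enum):
    # Assumed status values; the abstract only says status information exists.
    IN_USE = "in_use"
    SPARE = "spare"
    DISABLED = "disabled"


@dataclass
class NodeGroupInfo:
    # Hypothetical "computing node group information" record.
    group_id: str
    nodes: List[str]
    status: Status


class InfoServer:
    """Hypothetical stand-in for the information management server."""

    def __init__(self, groups: List[NodeGroupInfo]) -> None:
        self._groups: Dict[str, NodeGroupInfo] = {g.group_id: g for g in groups}

    def update_status(self, group_id: str, status: Status) -> None:
        self._groups[group_id].status = status

    def status_of(self, group_id: str) -> Status:
        return self._groups[group_id].status

    def find_spare(self) -> Optional[NodeGroupInfo]:
        # Return the first group still marked as spare, if any.
        for group in self._groups.values():
            if group.status is Status.SPARE:
                return group
        return None


class ManagingNode:
    """Hypothetical managing node that allocates jobs to one node group."""

    def __init__(self, name: str, server: InfoServer, group: NodeGroupInfo) -> None:
        self.name = name
        self.server = server
        self.group = group
        # Record on the server that this group is now in use.
        server.update_status(group.group_id, Status.IN_USE)

    def on_failure(self) -> NodeGroupInfo:
        # 1. Mark the group this node was using as disabled.
        self.server.update_status(self.group.group_id, Status.DISABLED)
        # 2. Identify a spare group from the server's status information.
        spare = self.server.find_spare()
        if spare is None:
            raise RuntimeError("no spare computing node group available")
        # 3. Switch to the spare group and record that it is now in use.
        self.server.update_status(spare.group_id, Status.IN_USE)
        self.group = spare
        return spare
```

With three groups of which two are claimed by managing nodes, a call to `on_failure()` on one managing node moves it to the remaining spare group; a second failure with no spare left raises `RuntimeError` in this sketch, a case the abstract does not address.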