Granted invention patent
- Patent title: Recoverable error detection for concurrent computing programs
- Patent title (Chinese): 并发计算程序的可恢复错误检测
- Application number: US11488432
- Filing date: 2006-07-17
- Publication number: US07925791B2
- Publication date: 2011-04-12
- Inventors: Edric Ellis, Jocelyn Luke Martin, Halldor Narfi Stefansson
- Applicants: Edric Ellis, Jocelyn Luke Martin, Halldor Narfi Stefansson
- Applicant address: US MA Natick
- Patentee: The Math Works, Inc.
- Current patentee: The Math Works, Inc.
- Current patentee address: US MA Natick
- Agency: Nelson Mullins Riley & Scarborough LLP
- Agent: Kevin J. Canning
- Main classification: G06F15/16
- IPC classification: G06F15/16
Abstract:
The present invention provides a system and method for detecting communication errors among multiple nodes in a concurrent computing environment. Barrier synchronization points or regions are used to check for communication mismatches. A barrier synchronization point can be placed anywhere in a concurrent computing program. If a communication error occurred before the barrier synchronization point, it would at least be detected when a node enters the barrier synchronization point. Once a node has reached the barrier synchronization point, it is not allowed to communicate with another node regarding data that is needed to execute the concurrent computing program, even if the other node has not reached the barrier synchronization point. Alternatively, regions can be used in place of barrier synchronization points to detect a communication mismatch. The concurrent program on each node is separated into one or more regions. Two nodes can communicate with each other only when their regions are compatible; if their regions are not compatible, there is a communication mismatch.
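The two checks described in the abstract can be illustrated with a minimal single-process sketch. It is not the patented implementation: the `Node` class, the `send` function, the `CommunicationMismatch` exception, and the rule that regions are compatible only when equal are all hypothetical choices made for illustration, and real nodes would run concurrently rather than as objects in one process.

```python
class CommunicationMismatch(Exception):
    """Raised when two simulated nodes try to exchange data in incompatible states."""


class Node:
    """A simulated node (hypothetical model, not taken from the patent text)."""

    def __init__(self, name):
        self.name = name
        self.region = 0          # identifier of the region the node is executing
        self.in_barrier = False  # True once the node has reached the barrier
        self.inbox = []          # payloads delivered to this node

    def enter_region(self, region):
        self.region = region

    def enter_barrier(self):
        self.in_barrier = True


def regions_compatible(region_a, region_b):
    # Assumption: two regions are compatible only when they are identical.
    return region_a == region_b


def send(sender, receiver, payload):
    """Deliver payload to receiver, or raise CommunicationMismatch."""
    # Barrier check: a node that has reached the barrier may not exchange
    # program data, even if the other node has not reached the barrier yet.
    if sender.in_barrier or receiver.in_barrier:
        raise CommunicationMismatch(
            f"{sender.name} -> {receiver.name}: a node is already in the barrier"
        )
    # Region check: communication is allowed only between compatible regions.
    if not regions_compatible(sender.region, receiver.region):
        raise CommunicationMismatch(
            f"{sender.name} (region {sender.region}) -> "
            f"{receiver.name} (region {receiver.region}): incompatible regions"
        )
    receiver.inbox.append(payload)
```

In this sketch, a `send` between two nodes in the same region succeeds, while a `send` raises `CommunicationMismatch` once either node has moved to a different region or called `enter_barrier()`. Raising an exception rather than blocking is one way to make the mismatch detectable and recoverable by the caller.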