Performing optimized collective operations in a irregular subcommunicator of compute nodes in a parallel computer
摘要:
In a parallel computer, performing optimized collective operations in an irregular subcommunicator of compute nodes may be carried out by: identifying, within the irregular subcommunicator, regular neighborhoods of compute nodes; selecting, for each neighborhood from the compute nodes of the neighborhood, a local root node; assigning each local root node to a node of a neighborhood-wide tree topology; mapping, for each neighborhood, the compute nodes of the neighborhood to a local tree topology having, at its root, the local root node of the neighborhood; and performing a one way, rooted collective operation within the subcommunicator including: performing, in one phase, the collective operation within each neighborhood; and performing, in another phase, the collective operation amongst the local root nodes.
信息查询
0/0