-
Publication No.: US08751595B2
Publication date: 2014-06-10
Application No.: US13680772
Filing date: 2012-11-19
IPC classification: G06F15/16
CPC classification: G06F15/17306 , G06F9/546
Abstract: Completion processing of data communications instructions in a distributed computing environment in which computers are coupled for data communications through communications adapters and an active messaging interface (‘AMI’), including: injecting a transfer descriptor for each data communications instruction into a slot in an injection FIFO buffer, at least some of the instructions specifying callback functions; injecting, for each instruction that specifies a callback function, a completion descriptor into an injection FIFO buffer slot that has a corresponding slot in a pending callback list; listing in the pending callback list the callback functions specified by the data communications instructions; processing each descriptor in the injection FIFO buffer, setting a bit in a completion bit mask corresponding to the slot in the FIFO where the completion descriptor was injected; and calling, by the AMI, any callback functions in the pending callback list indicated by set bits in the completion bit mask.
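The bit-mask scheme in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented implementation: the class `InjectionFifo`, its methods `send`, `process`, and `advance`, and the default of 8 slots are all hypothetical names and values chosen for illustration.

```python
class InjectionFifo:
    """Toy model: an injection FIFO with a slot-aligned pending callback list.

    All names here are hypothetical; the abstract does not specify an API.
    """

    def __init__(self, slots=8):
        self.slots = [None] * slots               # injected descriptors
        self.pending_callbacks = [None] * slots   # indexed like the FIFO slots
        self.completion_mask = 0                  # bit i set: completion descriptor in slot i was processed
        self.head = 0                             # next slot to process
        self.tail = 0                             # next free slot

    def _inject(self, descriptor):
        slot = self.tail
        if self.slots[slot] is not None:
            raise BufferError("injection FIFO is full")
        self.slots[slot] = descriptor
        self.tail = (slot + 1) % len(self.slots)
        return slot

    def send(self, payload, callback=None):
        """Inject a transfer descriptor, plus a completion descriptor if a callback is given."""
        self._inject(("transfer", payload))
        if callback is not None:
            slot = self._inject(("completion", None))
            self.pending_callbacks[slot] = callback  # list the callback in the matching slot

    def process(self):
        """Stand-in for the adapter: consume descriptors in FIFO order."""
        while self.slots[self.head] is not None:
            kind, _payload = self.slots[self.head]
            if kind == "completion":
                self.completion_mask |= 1 << self.head
            self.slots[self.head] = None
            self.head = (self.head + 1) % len(self.slots)

    def advance(self):
        """Stand-in for the AMI: call every callback whose mask bit is set."""
        called = 0
        for slot in range(len(self.slots)):       # scans in slot-index order
            if self.completion_mask & (1 << slot):
                callback = self.pending_callbacks[slot]
                self.pending_callbacks[slot] = None
                self.completion_mask &= ~(1 << slot)
                callback()
                called += 1
        return called
```

Because a completion descriptor sits behind its transfer descriptor in the FIFO, a set bit implies the preceding transfer has already been processed, so the AMI only needs to scan the mask rather than poll each transfer.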
-
Publication No.: US08732726B2
Publication date: 2014-05-20
Application No.: US13710066
Filing date: 2012-12-10
Inventors: Charles J. Archer , Michael A. Blocksome , Douglas R. Miller , Jeffrey J. Parker , Joseph D. Ratterman , Brian E. Smith
CPC classification: G06F15/167 , G06F9/544 , G06F15/17318
Abstract: A parallel computer includes nodes, each having main memory and a messaging unit (MU). Each MU includes computer memory, which in turn includes MU message buffers. Each MU message buffer is associated with an uninitialized process on the compute node. In the parallel computer, managing internode data communications for an uninitialized process includes: receiving, by an MU of a compute node, one or more data communications messages in an MU message buffer associated with an uninitialized process on the compute node; determining, by an application agent, that the MU message buffer associated with the uninitialized process is full prior to initialization of the uninitialized process; establishing, by the application agent, a temporary message buffer for the uninitialized process in main computer memory; and moving, by the application agent, data communications messages from the MU message buffer associated with the uninitialized process to the temporary message buffer in main computer memory.
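The overflow handling described in this abstract might be modeled roughly as follows. This is a hedged sketch: `MessagingUnit`, `ApplicationAgent`, `poll`, and the fixed per-buffer `capacity` are hypothetical names and parameters, and Python lists and deques stand in for MU memory and main memory.

```python
from collections import deque


class MessagingUnit:
    """Toy MU holding one bounded message buffer per uninitialized process."""

    def __init__(self, process_ids, capacity):
        self.capacity = capacity  # hypothetical fixed size of each MU buffer
        self.buffers = {pid: deque() for pid in process_ids}

    def is_full(self, pid):
        return len(self.buffers[pid]) >= self.capacity

    def receive(self, pid, message):
        """Store a message; return False if the MU buffer has no room."""
        if self.is_full(pid):
            return False
        self.buffers[pid].append(message)
        return True


class ApplicationAgent:
    """Toy agent that spills full MU buffers into main memory."""

    def __init__(self, mu):
        self.mu = mu
        self.temp_buffers = {}  # stands in for temporary buffers in main memory

    def poll(self, pid):
        """If the MU buffer for pid is full, move its messages to a temporary buffer."""
        if not self.mu.is_full(pid):
            return 0
        temp = self.temp_buffers.setdefault(pid, [])  # establish on first use
        mu_buffer = self.mu.buffers[pid]
        moved = 0
        while mu_buffer:
            temp.append(mu_buffer.popleft())  # preserve arrival order
            moved += 1
        return moved
```

After a spill the MU buffer is empty again, so the MU can keep accepting messages for a process that still has not been initialized.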
-
Publication No.: US08650581B2
Publication date: 2014-02-11
Application No.: US13711108
Filing date: 2012-12-11
Inventors: Charles J. Archer , Michael A. Blocksome , Douglas R. Miller , Jeffrey J. Parker , Joseph D. Ratterman , Brian E. Smith
CPC classification: G06F9/544
Abstract: Internode data communications in a parallel computer that includes compute nodes, each of which includes main memory and a messaging unit, the messaging unit including computer memory and coupling compute nodes for data communications, in which, for each compute node at compute node boot time, a messaging unit: allocates, in the messaging unit's computer memory, a predefined number of message buffers, each message buffer associated with a process to be initialized on the compute node; receives, prior to initialization of a particular process on the compute node, a data communications message intended for the particular process; and stores the data communications message in the message buffer associated with the particular process. Upon initialization of the particular process, the process establishes a message buffer in main memory of the compute node and copies the data communications message from the message buffer of the messaging unit into the message buffer of main memory.
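The boot-time buffering and later hand-off described in this abstract can be sketched as below. The names `BootTimeMessagingUnit`, `Process`, `deliver`, and `drain` are hypothetical and chosen only for illustration; plain Python lists stand in for messaging-unit memory and main memory.

```python
class BootTimeMessagingUnit:
    """Toy messaging unit that pre-allocates one buffer per expected process."""

    def __init__(self, expected_process_ids):
        # Boot time: allocate a predefined number of message buffers,
        # one for each process that will later be initialized.
        self.buffers = {pid: [] for pid in expected_process_ids}

    def deliver(self, pid, message):
        """Store a message that arrives before the target process exists."""
        self.buffers[pid].append(message)

    def drain(self, pid):
        """Hand over and clear the early messages buffered for pid."""
        messages, self.buffers[pid] = self.buffers[pid], []
        return messages


class Process:
    """Toy process that copies early messages into main memory on initialization."""

    def __init__(self, pid, messaging_unit):
        self.pid = pid
        self.main_memory_buffer = []  # message buffer established in main memory
        self.main_memory_buffer.extend(messaging_unit.drain(pid))


# Usage: messages sent before initialization are not lost.
mu = BootTimeMessagingUnit(["rank0", "rank1"])
mu.deliver("rank1", "hello")
mu.deliver("rank1", "world")
rank1 = Process("rank1", mu)
```

The point of the design is that senders do not need to wait for the receiving process to start: the messaging unit holds early traffic until the process claims it.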
-
Publication No.: US20160011996A1
Publication date: 2016-01-14
Application No.: US14701371
Filing date: 2015-04-30
Inventors: Sameh Asaad , Ralph E. Bellofatto , Michael A. Blocksome , Matthias A. Blumrich , Peter Boyle , Jose R. Brunheroto , Dong Chen , Chen-Yong Cher , George L. Chiu , Norman Christ , Paul W. Coteus , Kristan D. Davis , Gabor J. Dozsa , Alexandre E. Eichenberger , Noel A. Eisley , Matthew R. Ellavsky , Kahn C. Evans , Bruce M. Fleischer , Thomas W. Fox , Alan Gara , Mark E. Giampapa , Thomas M. Gooding , Michael K. Gschwind , John A. Gunnels , Shawn A. Hall , Rudolf A. Haring , Philip Heidelberger , Todd A. Inglett , Brant L. Knudson , Gerard V. Kopcsay , Sameer Kumar , Amith R. Mamidala , James A. Marcella , Mark G. Megerian , Douglas R. Miller , Samuel J. Miller , Adam J. Muff , Michael B. Mundy , John K. O'Brien , Kathryn M. O'Brien , Martin Ohmacht , Jeffrey J. Parker , Ruth J. Poole , Joseph D. Ratterman , Valentina Salapura , David L. Satterfield , Robert M. Senger , Burkhard Steinmacher-Burow , William M. Stockdell , Craig B. Stunkel , Krishnan Sugavanam , Yutaka Sugawara , Todd E. Takken , Barry M. Trager , James L. Van Oosten , Charles D. Wait , Robert E. Walkup , Alfred T. Watson , Robert W. Wisniewski , Peng Wu
CPC classification: G06F13/287 , G06F9/06 , G06F9/3004 , G06F9/30047 , G06F9/3885 , G06F12/0811 , G06F12/0831 , G06F12/0862 , G06F12/0864 , G06F12/1027 , G06F15/17381 , G06F15/17387 , G06F15/76 , G06F15/8069 , G06F2212/1016 , G06F2212/602 , G06F2212/6022 , G06F2212/6024 , G06F2212/6032 , Y02D10/13 , Y02D10/14
Abstract: A Multi-Petascale Highly Efficient Parallel Supercomputer of 100 petaflop-scale includes node architectures based upon System-On-a-Chip technology, where each processing node comprises a single Application Specific Integrated Circuit (ASIC). The ASIC nodes are interconnected by a five-dimensional torus network that maximizes the throughput of packet communications between nodes and minimizes latency. The network implements a collective network and a global asynchronous network that provides global barrier and notification functions. The node design integrates a list-based prefetcher. The memory system implements transactional memory, thread-level speculation, and a multiversioning cache that also improves the soft error rate, and supports DMA functionality allowing for parallel-processing message passing.
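To make the five-dimensional torus topology concrete, the sketch below computes a node's direct neighbors and the minimal hop count between two nodes, using wraparound on every axis. The function names and the example dimensions `(4, 4, 4, 4, 2)` are assumptions for illustration only; the abstract does not state the torus dimensions or any routing algorithm.

```python
def torus_neighbors(coord, dims):
    """Return the set of nodes one hop from coord in a torus of the given dims."""
    neighbors = set()
    for axis, size in enumerate(dims):
        for step in (-1, 1):
            node = list(coord)
            node[axis] = (node[axis] + step) % size  # wrap around the ring
            neighbors.add(tuple(node))
    neighbors.discard(tuple(coord))  # an axis of size 1 would map back to the node itself
    return neighbors


def torus_hops(a, b, dims):
    """Minimal hop count between a and b, taking the shorter way around each ring."""
    return sum(min((x - y) % size, (y - x) % size)
               for x, y, size in zip(a, b, dims))


# Hypothetical 5D torus; these dimensions are not taken from the source.
DIMS = (4, 4, 4, 4, 2)
origin = (0, 0, 0, 0, 0)
```

The wraparound links are what keep worst-case distances short: along a ring of size 4, no node is more than 2 hops away in that dimension.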
-
Publication No.: US08745123B2
Publication date: 2014-06-03
Application No.: US13690168
Filing date: 2012-11-30
IPC classification: G06F15/16
CPC classification: H04L29/08135 , G06F9/546 , H04L67/10
Abstract: Completion processing of data communications instructions in a distributed computing environment, including: receiving, in an active messaging interface (‘AMI’), data communications instructions, at least one instruction specifying a callback function; injecting an injection descriptor into an injection FIFO buffer of a data communication adapter, each slot in the injection FIFO buffer having a corresponding slot in a pending callback list; listing in the pending callback list any callback function specified by an instruction, incrementing a pending callback counter for each listed callback function; transferring payload data as per each injection descriptor, incrementing a transfer counter upon completion of each transfer; determining from counter values whether the pending callback list presently includes callback functions whose data transfers have been completed; and calling, by the AMI, any such callback functions from the pending callback list, decrementing the pending callback counter for each callback function called.
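The counter-based variant in this abstract can be sketched as follows. The abstract does not say exactly how the counter values are compared, so this model makes two explicit assumptions: transfers complete in injection order, and a callback is due once the transfer counter has reached the injection sequence number of its instruction. The class `CounterCompletion` and its method names are hypothetical.

```python
from collections import deque


class CounterCompletion:
    """Toy model of callback completion driven by two counters."""

    def __init__(self):
        self.pending = deque()              # (sequence number, callback) in injection order
        self.pending_callback_counter = 0   # callbacks listed but not yet called
        self.injected = 0                   # descriptors injected so far
        self.transfer_counter = 0           # transfers completed so far

    def send(self, payload, callback=None):
        """Inject one descriptor; list its callback, if any."""
        self.injected += 1
        if callback is not None:
            self.pending.append((self.injected, callback))
            self.pending_callback_counter += 1

    def complete_transfers(self, count=1):
        """Stand-in for the adapter finishing `count` transfers (assumed in order)."""
        self.transfer_counter = min(self.injected, self.transfer_counter + count)

    def advance(self):
        """Stand-in for the AMI: call callbacks whose transfers have completed."""
        called = 0
        while (self.pending_callback_counter > 0
               and self.pending[0][0] <= self.transfer_counter):
            _sequence, callback = self.pending.popleft()
            callback()
            self.pending_callback_counter -= 1
            called += 1
        return called
```

Checking `pending_callback_counter` first lets the AMI skip the list entirely when no callbacks are outstanding, which is the cheap common case this kind of counter is meant to serve.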