CONFIGURATION OF A CLUSTER SERVER USING CELLULAR AUTOMATA
    3.
    Invention application (pending, published)

    Publication No.: US20150333956A1

    Publication date: 2015-11-19

    Application No.: US14461614

    Filing date: 2014-08-18

    IPC classes: H04L12/24 H04L29/08

    Abstract: A cluster compute server is configured after a system reset or other configuration event. Each node of a fabric of the cluster compute server is employed, for purposes of configuration, as a cell in a cellular automaton, thereby obviating the need for a special configuration network to communicate configuration information from a central management unit. Instead, the nodes communicate configuration information using the same fabric interconnect that is used to communicate messages during normal execution of software services at the nodes.

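    The following is a minimal sketch of the idea described in this abstract, not the claimed implementation: each fabric node acts as a cell of a cellular automaton, repeatedly applying a local rule over its neighbor links until the whole fabric is configured, so no separate configuration network is needed. The Cell class, the ring fabric, and the "hops from root" rule are illustrative assumptions.

        # Minimal sketch (assumed names and rule, not the patented method): nodes of a
        # fabric act as cells of a cellular automaton after a reset.  Each cell looks
        # only at its neighbors over the ordinary fabric links and applies a local rule
        # until every node is configured.
        from dataclasses import dataclass, field

        @dataclass
        class Cell:
            node_id: int
            neighbors: list = field(default_factory=list)   # fabric links to other cells
            config: dict | None = None                       # None => not yet configured

        def step(cells):
            """One synchronous cellular-automaton step over the fabric."""
            updates = {}
            for c in cells:
                if c.config is None:
                    # Local rule: adopt configuration from any configured neighbor,
                    # recording this node's distance from the seed node.
                    for n in c.neighbors:
                        if n.config is not None:
                            updates[c.node_id] = {"hops_from_root": n.config["hops_from_root"] + 1}
                            break
            for c in cells:
                if c.node_id in updates:
                    c.config = updates[c.node_id]
            return bool(updates)

        # Build a small ring fabric of 8 nodes and seed node 0 after a "reset".
        cells = [Cell(i) for i in range(8)]
        for i, c in enumerate(cells):
            c.neighbors = [cells[(i - 1) % 8], cells[(i + 1) % 8]]
        cells[0].config = {"hops_from_root": 0}

        rounds = 0
        while step(cells):
            rounds += 1
        print(rounds, [c.config["hops_from_root"] for c in cells])   # 4 rounds to configure the ring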

DIAGONALLY ENHANCED CONCENTRATED HYPERCUBE TOPOLOGY
    4.
    Invention application (pending, published)

    Publication No.: US20120023260A1

    Publication date: 2012-01-26

    Application No.: US13186096

    Filing date: 2011-07-19

    Applicant: Cyriel Minkenberg

    Inventor: Cyriel Minkenberg

    IPC classes: G06F15/16

    CPC classes: G06F15/17387

    Abstract: The invention is directed to a system comprising routing nodes, computing nodes, and three kinds of communication links: first communication links, each connecting a pair of routing nodes, such that the routing nodes and the first communication links form a hypercube structure; second communication links, each connecting a routing node to a computing node; and third communication links, each connecting a further pair of routing nodes.

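    As a rough illustration of the terms in this abstract, the sketch below builds the three link sets for a small concentrated hypercube. The choice of "diagonal" third links (routing nodes with bitwise-complementary addresses) is an assumption made for the example, not necessarily the construction claimed in the application.

        # Illustrative sketch only: adjacency lists for a concentrated hypercube of
        # routing nodes with extra "diagonal" links.  The complement-address diagonal
        # is an assumed interpretation.
        def concentrated_hypercube(dim, concentration):
            n = 1 << dim                              # number of routing nodes
            first, second, third = [], [], []

            # First links: hypercube edges between routing nodes whose addresses
            # differ in exactly one bit.
            for r in range(n):
                for b in range(dim):
                    other = r ^ (1 << b)
                    if r < other:
                        first.append((r, other))

            # Second links: attach `concentration` computing nodes to each routing node.
            for r in range(n):
                for c in range(concentration):
                    second.append((r, f"compute-{r}-{c}"))

            # Third links: one "diagonal" per pair of routing nodes with
            # complementary addresses (assumed interpretation of "diagonal").
            for r in range(n):
                other = r ^ (n - 1)
                if r < other:
                    third.append((r, other))

            return first, second, third

        first, second, third = concentrated_hypercube(dim=3, concentration=2)
        print(len(first), len(second), len(third))    # 12 hypercube links, 16 compute links, 4 diagonals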

Methods and apparatus for signal flow graph pipelining that reduce storage of temporary variables
    5.
    Granted patent (in force)

    Publication No.: US09507603B2

    Publication date: 2016-11-29

    Application No.: US14450222

    Filing date: 2014-08-02

    IPC classes: G06F9/38 G06F15/173

    Abstract: A system for pipelining signal flow graphs by a plurality of shared-memory processors, organized in a 3D physical arrangement with the memory overlaid on the processor nodes, reduces storage of temporary variables. A group function is formed by two or more instructions that specify two or more parts of the group function. A first instruction specifies a first part and specifies control information for a second instruction adjacent to the first instruction or at a pre-specified location relative to the first instruction. The first instruction, when executed, transfers the control information to a pending register and produces a result which is transferred to an operand input associated with the second instruction. The second instruction specifies a second part of the group function and, when executed, transfers the control information from the pending register to a second execution unit to adjust the second execution unit's operation on the received operand.

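    To make the two-instruction mechanism easier to follow, here is a simplified software model of it; the opcode names, the control values, and the register layout are invented for illustration and are not taken from the patent.

        # Simplified model of the paired-instruction idea: the first instruction of a
        # group deposits control information for its partner in a pending register and
        # forwards its result directly as an operand of the second instruction, so the
        # intermediate value is never stored as a named temporary.
        class ExecutionUnits:
            def __init__(self):
                self.pending_control = None     # control info left by a first instruction
                self.forwarded_operand = None   # result forwarded to the second instruction

            def first_instruction(self, op, a, b, control_for_partner=None):
                """First part of a group function: compute and hand off to the partner."""
                result = {"mul": a * b, "add": a + b}[op]
                self.pending_control = control_for_partner
                self.forwarded_operand = result          # bypasses any temporary variable
                return result

            def second_instruction(self, op, c):
                """Second part: combine the forwarded operand with c, adjusting the
                operation according to the pending control information."""
                value = {"add": self.forwarded_operand + c,
                         "sub": self.forwarded_operand - c}[op]
                if self.pending_control == "negate":     # example control adjustment
                    value = -value
                self.pending_control = self.forwarded_operand = None
                return value

        # A fused multiply-add style group: (2 * 3) + 4 with no stored temporary for 2 * 3.
        eu = ExecutionUnits()
        eu.first_instruction("mul", 2, 3)
        print(eu.second_instruction("add", 4))           # 10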

METHODS AND APPARATUS FOR SIGNAL FLOW GRAPH PIPELINING THAT REDUCE STORAGE OF TEMPORARY VARIABLES
    6.
    Invention application (pending, published)

    Publication No.: US20150039855A1

    Publication date: 2015-02-05

    Application No.: US14450222

    Filing date: 2014-08-02

    IPC classes: G06F9/30

    Abstract: A system for pipelining signal flow graphs by a plurality of shared-memory processors, organized in a 3D physical arrangement with the memory overlaid on the processor nodes, reduces storage of temporary variables. A group function is formed by two or more instructions that specify two or more parts of the group function. A first instruction specifies a first part and specifies control information for a second instruction adjacent to the first instruction or at a pre-specified location relative to the first instruction. The first instruction, when executed, transfers the control information to a pending register and produces a result which is transferred to an operand input associated with the second instruction. The second instruction specifies a second part of the group function and, when executed, transfers the control information from the pending register to a second execution unit to adjust the second execution unit's operation on the received operand.


Multiprocessor communication networks
    7.
    Granted patent (in force)

    Publication No.: US08819272B2

    Publication date: 2014-08-26

    Application No.: US12703938

    Filing date: 2010-02-11

    Applicant: William S. Song

    Inventor: William S. Song

    IPC classes: G06F15/173

    Abstract: A parallel multiprocessor system includes a packet-switching communication network comprising a plurality of processor nodes operating concurrently in parallel. Each processor node generates messages to be sent simultaneously to a plurality of other processor nodes in the communication network. Each message is divided into a plurality of packets having a common destination processor node. Each processor node has an arbiter that determines an order in which to forward the packets onto the network toward their destination processor nodes, and a network interface that sends the packets onto the network in accordance with the determined order. The determined order operates to substantially avoid sending consecutive packets from a given source processor node to a given destination processor node and to randomize the destination processor nodes of those packets presently traversing the communication network.

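    The ordering property described in the last sentence of this abstract can be illustrated with a small scheduling sketch; the round-robin-over-shuffled-destinations policy below is an assumed example policy, not the claimed arbiter design.

        # Toy sketch of the send ordering: a source node holding packets for several
        # destinations interleaves them so that back-to-back packets to the same
        # destination are largely avoided and destinations are spread out randomly.
        import random
        from collections import deque

        def arbitrate(messages):
            """messages: dict mapping destination -> list of packets for that destination."""
            queues = {dst: deque(pkts) for dst, pkts in messages.items()}
            order = []
            while queues:
                dsts = list(queues)
                random.shuffle(dsts)                 # randomize which destinations go next
                for dst in dsts:                     # one packet per destination per pass
                    order.append((dst, queues[dst].popleft()))
                    if not queues[dst]:
                        del queues[dst]
            return order

        messages = {dst: [f"pkt{i}" for i in range(3)] for dst in ("A", "B", "C", "D")}
        print(arbitrate(messages))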

3-D STACKED MULTIPROCESSOR STRUCTURES AND METHODS FOR MULTIMODAL OPERATION OF SAME
    8.
    Invention application (in force)

    Publication No.: US20130283010A1

    Publication date: 2013-10-24

    Application No.: US13601289

    Filing date: 2012-08-31

    IPC classes: G06F15/76

    CPC classes: G06F15/17387 G06F9/3802

    Abstract: Three-dimensional (3-D) processor devices are provided, which are constructed by connecting processors in a stacked configuration. For instance, a processor system includes a first processor chip comprising a first processor and a second processor chip comprising a second processor. The first and second processor chips are connected in a stacked configuration, with the first and second processors connected through vertical connections between the first and second processor chips. The processor system further includes a mode control circuit to selectively operate the processor system in one of a plurality of operating modes. For example, in one mode of operation, the first and second processors are configured to implement a run-ahead function, wherein the first processor operates a primary thread of execution and the second processor operates a run-ahead thread of execution.

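    As a loose illustration of the multimodal idea (assumed mode names and a toy prefetch model, not the patented circuit), the sketch below pairs two stacked processors either independently or with the second running ahead of the first.

        # Loose illustration: a mode control selects whether the vertically connected
        # second processor runs independent work or a run-ahead copy of the primary
        # thread that warms shared state a few steps ahead of the primary.
        from enum import Enum

        class Mode(Enum):
            INDEPENDENT = "independent"
            RUN_AHEAD = "run_ahead"

        class StackedPair:
            def __init__(self):
                self.mode = Mode.INDEPENDENT
                self.warmed = set()                  # state touched by the run-ahead thread

            def set_mode(self, mode):                # stands in for the mode control circuit
                self.mode = mode
                self.warmed.clear()

            def run(self, primary_trace, ahead_distance=2):
                hits = 0
                for i, addr in enumerate(primary_trace):
                    if self.mode is Mode.RUN_AHEAD:
                        # Second processor executes the same trace a few steps ahead.
                        for future in primary_trace[i : i + ahead_distance]:
                            self.warmed.add(future)
                    if addr in self.warmed:
                        hits += 1                    # primary finds the data already warmed
                return hits

        pair = StackedPair()
        trace = [10, 20, 30, 40, 50]
        print(pair.run(trace))                       # 0 hits in independent mode
        pair.set_mode(Mode.RUN_AHEAD)
        print(pair.run(trace))                       # 5 hits with the run-ahead thread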

Performing A Local Reduction Operation On A Parallel Computer
    9.
    Invention application (expired)

    Publication No.: US20120317399A1

    Publication date: 2012-12-13

    Application No.: US13585993

    Filing date: 2012-08-15

    CPC classes: G06F15/17387 G06F15/17318

    Abstract: A parallel computer includes compute nodes, each including two reduction processing cores, a network write processing core, and a network read processing core, with each processing core assigned an input buffer. The local reduction operation includes: copying, in interleaved chunks by the reduction processing cores, the contents of the reduction processing cores' input buffers to an interleaved buffer in shared memory; copying, by one of the reduction processing cores, the contents of the network write processing core's input buffer to shared memory; copying, by another of the reduction processing cores, the contents of the network read processing core's input buffer to shared memory; and locally reducing in parallel, by the reduction processing cores: the contents of the reduction processing core's input buffer; every other interleaved chunk of the interleaved buffer; the copied contents of the network write processing core's input buffer; and the copied contents of the network read processing core's input buffer.

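    A simplified, single-process sketch of the copy-and-reduce pattern in this abstract follows; Python lists stand in for shared memory, addition stands in for the reduction operation, and the chunk size is arbitrary, whereas the described mechanism uses four cores per compute node working in parallel.

        # Simplified sketch: two "reduction cores" interleave their input buffers into
        # shared memory, each also copies one of the network cores' buffers, and each
        # then reduces every other interleaved chunk plus its copied network buffer.
        CHUNK = 2

        def interleave(buf_a, buf_b, chunk=CHUNK):
            """Copy two input buffers into shared memory in alternating chunks: a0 b0 a1 b1 ..."""
            shared = []
            for i in range(0, len(buf_a), chunk):
                shared.extend(buf_a[i : i + chunk])
                shared.extend(buf_b[i : i + chunk])
            return shared

        def local_reduce(core0_in, core1_in, net_write_in, net_read_in):
            shared_interleaved = interleave(core0_in, core1_in)
            shared_net_write = list(net_write_in)    # copied by one reduction core
            shared_net_read = list(net_read_in)      # copied by the other reduction core

            chunks = [shared_interleaved[i : i + CHUNK]
                      for i in range(0, len(shared_interleaved), CHUNK)]
            core0_part = sum(sum(c) for c in chunks[0::2]) + sum(shared_net_write)
            core1_part = sum(sum(c) for c in chunks[1::2]) + sum(shared_net_read)
            return core0_part + core1_part

        print(local_reduce([1, 2, 3, 4], [5, 6, 7, 8], [10, 10, 10, 10], [1, 1, 1, 1]))  # 80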

Performing A Local Reduction Operation On A Parallel Computer
    10.
    Invention application (expired)

    Publication No.: US20110258245A1

    Publication date: 2011-10-20

    Application No.: US12760020

    Filing date: 2010-04-14

    CPC classes: G06F15/17387 G06F15/17318

    Abstract: A parallel computer includes compute nodes, each including two reduction processing cores, a network write processing core, and a network read processing core, with each processing core assigned an input buffer. The local reduction operation includes: copying, in interleaved chunks by the reduction processing cores, the contents of the reduction processing cores' input buffers to an interleaved buffer in shared memory; copying, by one of the reduction processing cores, the contents of the network write processing core's input buffer to shared memory; copying, by another of the reduction processing cores, the contents of the network read processing core's input buffer to shared memory; and locally reducing in parallel, by the reduction processing cores: the contents of the reduction processing core's input buffer; every other interleaved chunk of the interleaved buffer; the copied contents of the network write processing core's input buffer; and the copied contents of the network read processing core's input buffer.
