Abstract:
A system for pipelining signal flow graphs by a plurality of shared memory processors organized in a 3D physical arrangement with the memory overlaid on the processor nodes, an arrangement that reduces storage of temporary variables. A group function is formed by two or more instructions that specify two or more parts of the group function. A first instruction specifies a first part and specifies control information for a second instruction adjacent to the first instruction or at a pre-specified location relative to the first instruction. The first instruction, when executed, transfers the control information to a pending register and produces a result which is transferred to an operand input associated with the second instruction. The second instruction specifies a second part of the group function and, when executed, transfers the control information from the pending register to a second execution unit to adjust the second execution unit's operation on the received operand.
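To make the mechanism concrete, the following is a minimal Python sketch of the two-instruction group function described above; the register, latch, and control-code names are hypothetical, not taken from the abstract.

class PendingRegister:
    def __init__(self):
        self.control = None

def first_instruction(a, b, control, pending, operand_latch):
    """First part of the group function: produce a result and stage control info."""
    pending.control = control          # control information intended for the second instruction
    operand_latch.append(a + b)        # result forwarded to the second instruction's operand input

def second_instruction(c, pending, operand_latch):
    """Second part: combine the forwarded operand with c, as directed by the control info."""
    operand = operand_latch.pop()
    if pending.control == "shift-left-1":
        operand <<= 1                  # execution unit adjusted by the pending control information
    return operand + c

pending = PendingRegister()
latch = []
first_instruction(3, 4, "shift-left-1", pending, latch)
print(second_instruction(10, pending, latch))   # ((3 + 4) << 1) + 10 = 24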
Abstract:
A Multi-Petascale Highly Efficient Parallel Supercomputer of 100 petaflop-scale includes node architectures based upon System-On-a-Chip technology, where each processing node comprises a single Application Specific Integrated Circuit (ASIC). The ASIC nodes are interconnected by a five-dimensional torus network that optimally maximizes the throughput of packet communications between nodes and minimizes latency. The network implements a collective network and a global asynchronous network that provides global barrier and notification functions. Integrated in the node design is a list-based prefetcher. The memory system implements transactional memory, thread-level speculation, and a multiversioning cache that also improves the soft error rate, and it supports DMA functionality allowing for parallel-processing message passing.
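As an illustration of five-dimensional torus adjacency, here is a short Python sketch enumerating a node's one-hop neighbors; the dimension extents are invented for the example, not taken from the abstract.

DIMS = (4, 4, 4, 4, 4)   # hypothetical extents of the five torus dimensions

def torus_neighbors(coord, dims=DIMS):
    """Return the coordinates reachable in one hop (+/-1 with wraparound per dimension)."""
    neighbors = []
    for d in range(len(dims)):
        for step in (-1, +1):
            nxt = list(coord)
            nxt[d] = (nxt[d] + step) % dims[d]   # wraparound gives the torus topology
            neighbors.append(tuple(nxt))
    return neighbors

print(torus_neighbors((0, 0, 0, 0, 0)))   # 10 neighbors: two per dimension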
Abstract:
A cluster compute server is configured after a system reset or other configuration event. Each node of a fabric of the cluster compute server is employed, for purposes of configuration, as a cell in a cellular automaton, thereby obviating the need for a special configuration network to communicate configuration information from a central management unit. Instead, the nodes communicate configuration information using the same fabric interconnect that is used to communicate messages during normal execution of software services at the nodes.
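A rough Python sketch of the cellular-automaton style of configuration follows; the propagation rule and the toy fabric topology are assumptions for illustration. A seed node holds the configuration, and on each step every unconfigured node copies it from an already-configured neighbor over the ordinary fabric links.

def configure_fabric(neighbors, seed, config):
    """neighbors: node -> list of fabric-connected nodes; returns node -> configuration."""
    state = {node: None for node in neighbors}
    state[seed] = config
    changed = True
    while changed:                       # one cellular-automaton "generation" per iteration
        changed = False
        for node, peers in neighbors.items():
            if state[node] is None:
                for p in peers:
                    if state[p] is not None:
                        state[node] = state[p]   # adopt a configured neighbor's data
                        changed = True
                        break
    return state

# 2x2 mesh of nodes as a toy fabric
fabric = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
print(configure_fabric(fabric, seed=0, config={"topology": "2x2"}))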
Abstract:
The invention is directed to a system comprising routing nodes, computing nodes, first communication links, second communication links, and third communication links. Each first communication link connects a pair of routing nodes, the routing nodes and the first communication links forming a hypercube structure. Each second communication link connects a routing node to a computing node. Each third communication link connects a further pair of routing nodes.
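Assuming the usual hypercube labeling, in which two routing nodes are linked when their binary labels differ in exactly one bit (an interpretation, since the abstract does not spell it out), a small Python sketch of the first and second communication links:

def hypercube_links(dim):
    """First communication links: pairs of routing nodes whose labels differ in one bit."""
    links = []
    for node in range(2 ** dim):
        for bit in range(dim):
            peer = node ^ (1 << bit)
            if node < peer:              # avoid listing each link twice
                links.append((node, peer))
    return links

def compute_attachments(dim):
    """Second communication links: each routing node paired with a computing node."""
    return [(("router", n), ("compute", n)) for n in range(2 ** dim)]

print(hypercube_links(3))        # the 12 links of a 3-dimensional hypercube
print(compute_attachments(3))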
Abstract:
A parallel multiprocessor system includes a packet-switching communication network comprising a plurality of processor nodes operating concurrently in parallel. Each processor node generates messages to be sent simultaneously to a plurality of other processor nodes in the communication network. Each message is divided into a plurality of packets having a common destination processor node. Each processor node has an arbiter that determines an order in which to forward the packets onto the network toward their destination processor nodes and a network interface that sends the packets onto the network in accordance with the determined order. The determined order operates to substantially avoid sending consecutive packets from a given source processor node to a given destination processor node and to randomize the destination processor nodes of those packets presently traversing the communication network.
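One plausible ordering policy consistent with this description, sketched in Python (the arbiter's actual algorithm is not given in the abstract): per-destination packet queues are drained round-robin, so consecutive packets from the source rarely share a destination and the destinations of in-flight packets are spread across the network.

from collections import deque

def interleave_by_destination(messages):
    """messages: dict destination -> list of packets; returns one interleaved send order."""
    queues = {dest: deque(pkts) for dest, pkts in messages.items() if pkts}
    order = []
    while queues:
        for dest in list(queues):        # visit each remaining destination in turn
            order.append((dest, queues[dest].popleft()))
            if not queues[dest]:
                del queues[dest]
    return order

msgs = {"node7": ["a0", "a1", "a2"], "node3": ["b0", "b1"], "node9": ["c0"]}
print(interleave_by_destination(msgs))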
Abstract:
Three-dimensional (3-D) processor devices are provided, which are constructed by connecting processors in a stacked configuration. For instance, a processor system includes a first processor chip comprising a first processor and a second processor chip comprising a second processor. The first and second processor chips are connected in a stacked configuration with the first and second processors connected through vertical connections between the first and second processor chips. The processor system further includes a mode control circuit to selectively operate the processor system in one of a plurality of operating modes. For example, in one mode of operation, the first and second processors are configured to implement a run-ahead function, wherein the first processor operates a primary thread of execution and the second processor operates a run-ahead thread of execution.
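A purely conceptual Python sketch of the run-ahead mode (the mode names, the shared cache, and the access-stream model are assumptions for illustration): in run-ahead mode the second processor walks the primary thread's access stream ahead of time, so the primary thread later hits in the warmed cache.

class StackedPair:
    def __init__(self, mode):
        self.mode = mode                 # e.g. "independent" or "run_ahead"
        self.cache = set()

    def run(self, addresses):
        if self.mode == "run_ahead":
            for addr in addresses:       # run-ahead thread: prefetch into the shared cache
                self.cache.add(addr)
        hits = 0
        for addr in addresses:           # primary thread of execution
            if addr in self.cache:
                hits += 1
            self.cache.add(addr)
        return hits

print(StackedPair("independent").run([1, 2, 3]))   # 0 hits, no prefetching
print(StackedPair("run_ahead").run([1, 2, 3]))     # 3 hits, run-ahead warmed the cache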
Abstract:
A parallel computer includes compute nodes, each having two reduction processing cores, a network write processing core, and a network read processing core, with each processing core assigned an input buffer. Local reduction proceeds by: copying, in interleaved chunks by the reduction processing cores, the contents of the reduction processing cores' input buffers to an interleaved buffer in shared memory; copying, by one of the reduction processing cores, the contents of the network write processing core's input buffer to shared memory; copying, by the other reduction processing core, the contents of the network read processing core's input buffer to shared memory; and locally reducing in parallel, by the reduction processing cores: the contents of each reduction processing core's own input buffer; every other interleaved chunk of the interleaved buffer; the copied contents of the network write processing core's input buffer; and the copied contents of the network read processing core's input buffer.
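A simplified, sequential Python stand-in for the scheme (the chunk size and the summation reduction are assumptions; the real cores run in parallel): the reduction cores' buffers are copied into a shared buffer in alternating chunks, the network cores' buffers are copied whole, and each reduction core reduces every other chunk plus one of the copied network buffers.

CHUNK = 4

def chunks(buf):
    return [buf[i:i + CHUNK] for i in range(0, len(buf), CHUNK)]

def local_reduce(red0, red1, net_write, net_read):
    # interleaved copy: chunk i of reduction core 0, then chunk i of reduction core 1, and so on
    interleaved = []
    for c0, c1 in zip(chunks(red0), chunks(red1)):
        interleaved.extend([c0, c1])
    copied_write = list(net_write)       # copied by one reduction core
    copied_read = list(net_read)         # copied by the other reduction core
    # each reduction core sums every other interleaved chunk plus one copied network buffer
    partial0 = sum(sum(c) for c in interleaved[0::2]) + sum(copied_write)
    partial1 = sum(sum(c) for c in interleaved[1::2]) + sum(copied_read)
    return partial0 + partial1

print(local_reduce([1] * 8, [2] * 8, [3] * 8, [4] * 8))   # 8 + 16 + 24 + 32 = 80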