Abstract:
A method for building a multi-processor system whose nodes have multiple cache coherency domains. In this system, a directory built into a node controller must include processor domain attribute information, which can be acquired by configuring the cache coherency domain attributes of the node controller ports connected to processors. In the disclosure herein, the node controller can support multiple physical cache coherency domains within a node.
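As a rough illustration of the directory idea above, the following Python sketch tags each sharer of a cache line with a coherency domain looked up from a per-port configuration table. The names `PORT_DOMAIN`, `DirectoryEntry`, and `add_sharer`, and the four-port layout, are hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass, field

# Assumed configuration: node-controller port number -> cache coherency domain id.
PORT_DOMAIN = {0: 0, 1: 0, 2: 1, 3: 1}


@dataclass
class DirectoryEntry:
    """One directory entry for a cache line, with sharers grouped by domain."""
    sharers: dict = field(default_factory=dict)  # domain id -> set of ports

    def add_sharer(self, port: int) -> int:
        """Record that the processor on `port` holds the line; return its domain."""
        domain = PORT_DOMAIN[port]  # domain attribute comes from the port configuration
        self.sharers.setdefault(domain, set()).add(port)
        return domain

    def domains(self) -> list:
        """Domains that currently hold a copy of the line."""
        return sorted(self.sharers)


entry = DirectoryEntry()
entry.add_sharer(0)
entry.add_sharer(3)
print(entry.domains())
```

Keeping the domain lookup in a configuration table, rather than in the entry itself, mirrors the abstract's point that the attribute is obtained by configuring the ports.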
Abstract:
A method and system for adapting a device are disclosed. The method and system comprise providing a data stream to the device so that the device can be changed based upon a parameter. In a second aspect, an adaptable device is disclosed. The adaptable device comprises an adaptable computerized environment (ACE) for receiving a data stream that allows the device to be changed based upon a parameter. The adaptable device includes a mechanism within the ACE for authorizing the data stream. A system and method in accordance with the present invention provide a hardware device that can be changed based upon a particular parameter such as time or location. In so doing, a provider of the hardware device can offer a more adaptable component, which provides more value to the provider. Indeed, it is possible to give away the hardware upfront, or even give an incentive to a receiver of the hardware, and thereby use the device in a variety of ways.
Abstract:
A processing unit having a dual channel bus architecture associated with a specific instruction set, configured to receive an input message and transmit an output message that is identical to or derived from the input message. A message consists of one opcode, with or without associated data, used to control each processing unit depending on logic conditions stored in dedicated registers in each unit. Processing units are serially connected but can work simultaneously for a fully pipelined operation. This dual architecture is organized around two channels labeled Channel 1 and Channel 2. Channel 1 mainly transmits an input message to all units, while Channel 2 mainly transmits the results after processing in a unit as an output message. Depending on the logic conditions, an input message not processed in a processing unit may be transmitted to the next one without any change.
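The forwarding rule in the last sentence can be sketched in Python as follows. This models only the "process or pass along unchanged" behavior of serially connected units; it does not model the two channels or simultaneous operation. The class `ProcessingUnit`, the opcode strings, and the use of a single accepted opcode in place of the condition registers are all assumptions made for illustration.

```python
from typing import Callable, List, Optional, Tuple

Message = Tuple[str, Optional[int]]  # (opcode, optional associated data)


class ProcessingUnit:
    def __init__(self, accepted_opcode: str, operation: Callable[[int], int]):
        # Stands in for the logic conditions held in the unit's dedicated registers.
        self.accepted_opcode = accepted_opcode
        self.operation = operation

    def step(self, message: Message) -> Message:
        opcode, data = message
        if opcode == self.accepted_opcode and data is not None:
            # Output message derived from the input message.
            return ("RESULT", self.operation(data))
        # Not processed here: forwarded to the next unit without any change.
        return message


def run_chain(units: List[ProcessingUnit], message: Message) -> Message:
    """Pass a message through the serially connected units in order."""
    for unit in units:
        message = unit.step(message)
    return message


units = [ProcessingUnit("DOUBLE", lambda x: 2 * x),
         ProcessingUnit("INC", lambda x: x + 1)]
print(run_chain(units, ("INC", 4)))
```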
Abstract:
An optimum communication schedule is generated for data transmitted or received between processors that constitute a parallel computer or a distributed multiprocessor system. Processors that each perform inter-processor communication are sorted into a plurality of groups. A communication graph is generated whose nodes correspond to the groups and whose edges correspond to the communications. Communication graphs are generated for distances between nodes from one through N-1. Each communication graph corresponds to a communication step of the inter-processor communication. The communication is captured as a whole by the communication graphs, and each edge of a communication graph represents the inter-processor communication performed in a certain communication step. In this way, the communication can be optimized.
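One way to picture the per-distance communication graphs is the Python sketch below. It assumes that N is the number of groups, that the groups are numbered 0 to N-1, and that "distance d" means group i sends to group (i + d) mod N; none of these details is stated in the abstract, and `build_schedule` is a hypothetical name.

```python
def build_schedule(n_groups: int) -> list:
    """Return one communication graph (as an edge list) per step, for distances 1..N-1."""
    schedule = []
    for distance in range(1, n_groups):
        # Each edge (src, dst) is one inter-group communication in this step.
        edges = [(src, (src + distance) % n_groups) for src in range(n_groups)]
        schedule.append(edges)
    return schedule


for step, edges in enumerate(build_schedule(4), start=1):
    print(step, edges)
```

Under these assumptions every group sends once and receives once in each step, and the N-1 steps together cover every ordered pair of distinct groups exactly once.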
Abstract:
A digital data processing system and method with shared, distributed memory transfers data between corresponding data sets within memory. The digital data processing system includes a plurality of processing cells interconnected by a hierarchical network, at least some of the processing cells including a processor and a memory. Each memory provides storage space which is arranged in sets, with each set being capable of holding a plurality of data pages. At least one of the processing cells, as a first processing cell, includes a page distributor for determining when at least a first set in the associated memory has reached a predetermined storage commitment condition (for example, a filled condition). Under such a condition, the page distributor invokes a page-transfer element that selects a candidate processing cell from among the other processing cells and transfers one or more pages from the first set to a corresponding set in the candidate processing cell.
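The page-distribution behavior described above can be sketched in Python as follows. The sketch treats "filled" as the storage commitment condition, picks as candidate the cell whose corresponding set holds the fewest pages, and moves the oldest page first; these policies, and the names `Cell` and `distribute_page`, are assumptions made for illustration only.

```python
class Cell:
    """A processing cell whose memory is arranged in sets of pages."""

    def __init__(self, name, n_sets, set_capacity):
        self.name = name
        self.set_capacity = set_capacity
        self.sets = [[] for _ in range(n_sets)]

    def is_full(self, set_index):
        return len(self.sets[set_index]) >= self.set_capacity


def distribute_page(first, others, set_index):
    """If the set is full in `first`, move one page to the corresponding set
    of a candidate cell. Returns the candidate cell, or None if nothing moved."""
    if not first.is_full(set_index):
        return None  # storage commitment condition not reached
    candidates = [c for c in others if not c.is_full(set_index)]
    if not candidates:
        return None  # no cell can accept the page
    # Assumed selection policy: the least-filled corresponding set wins.
    candidate = min(candidates, key=lambda c: len(c.sets[set_index]))
    page = first.sets[set_index].pop(0)  # assumed: oldest page moves first
    candidate.sets[set_index].append(page)
    return candidate
```

For example, with a capacity of two pages per set, a cell holding two pages in set 0 would hand one of them to whichever other cell has the emptiest set 0.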
Abstract:
Multiprocessor parallel computing systems and a byte serial SIMD processor parallel architecture are used for parallel array processing, with a simplified architecture adaptable to chip implementation in an air cooled environment. The array provided is an N dimensional array of byte wide processing units, each coupled with an adequate segment of byte wide memory and control logic. A partitionable section of the array containing several processing units is contained on a silicon chip arranged with "Pickets", each an element of the processing array preferably consisting of a combined processing element with a local memory for processing bit parallel bytes of information in a clock cycle. A Picket Processor system (or Subsystem) comprises an array of pickets, a communication network, an I/O system, and a SIMD controller consisting of a microprocessor, a canned routine processor, and a microcontroller that runs the array. The Picket Architecture for SIMD includes set associative processing and parallel numerically intensive processing, with physical array processing similar to image processing; a military picket line analogy fits quite well. Each picket has a bit parallel processing element with local memory coupled to it for the parallel processing of information in an associative way, and each picket is adapted to perform one element of the associative process. We have provided a way for horizontal association with each picket. The memory of the picket units is arranged in an array. The array of pickets thus arranged comprises a set associative memory. The set associative parallel processing system on a single chip permits a smaller set of "data" out of a larger set to be brought out of memory, where an associative operation can be performed on it. This associative operation, typically an exact compare, is performed on the whole set of data in parallel, utilizing the Picket's memory and execution unit.
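The exact-compare associative operation can be illustrated with a short Python sketch. The same compare is issued to every picket and each picket answers for its own local memory; the loop below stands in for what the hardware would do in parallel. The names `Picket` and `associative_search` and the list-based memory are hypothetical.

```python
class Picket:
    """One array element: a processing element with its own local memory."""

    def __init__(self, memory):
        self.memory = list(memory)  # local byte-wide memory, modeled as a list

    def compare(self, address, key):
        # Exact compare of the local byte at `address` against the broadcast key.
        return self.memory[address] == key


def associative_search(pickets, address, key):
    """Issue one compare to all pickets and collect a per-picket match mask."""
    return [picket.compare(address, key) for picket in pickets]


pickets = [Picket([7, 1]), Picket([3, 1]), Picket([7, 9])]
print(associative_search(pickets, 0, 7))
```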
Abstract:
A parallel processor system includes: a reception buffer pointer controller for generating an address of a reception buffer area in which a received packet is written and for checking whether free space remains in the reception buffer area; a discard command bit capable of being set and reset by an instruction processor; a received packet discard judging unit for judging, from the discard command bit and information supplied from the reception buffer pointer controller, whether the received packet is written, suspended, or discarded; and a reception controller for controlling the writing of the received packet in the reception buffer area in accordance with the judgment of the received packet discard judging unit. With this arrangement, even if there is no free space in the reception buffer area for storing a received packet, or even if the received packet cannot be received because of a failure in the reception processor unit, the received packet can be discarded at the reception processor unit.
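The three-way judgment can be sketched in Python as a small decision function. The abstract names the two inputs and the three outcomes but not the exact rule, so the mapping below (discard bit set means discard; otherwise a full buffer means suspend; otherwise write) is an assumption, as are the names `Action` and `judge_packet`.

```python
from enum import Enum


class Action(Enum):
    WRITE = "write"
    SUSPEND = "suspend"
    DISCARD = "discard"


def judge_packet(buffer_full: bool, discard_bit: bool) -> Action:
    """Decide what to do with a received packet (assumed rule, see lead-in)."""
    if discard_bit:
        return Action.DISCARD   # instruction processor asked for packets to be dropped
    if buffer_full:
        return Action.SUSPEND   # hold the packet until buffer space is freed
    return Action.WRITE         # normal case: store the packet in the buffer


print(judge_packet(buffer_full=True, discard_bit=True))
```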
Abstract:
A parallel processing computer system having an improved architecture for communication of information between nodes. The computer system of the present invention comprises at least three nodes, each of which processes information. Each of the nodes comprises a routing means for routing information between nodes. The routing means allows reservation of a route through the network of nodes. Messages may then be transmitted from an origin node to a destination node over the reserved route. Use of a route reservation system reduces requirements for buffering of information at intermediate nodes on a route, improves message passing latency, and increases node-to-node bandwidth. The present invention teaches communication of messages between nodes in a synchronous manner using a common strobe signal. The strobe signal is modified by regenerating alternate edges of the signal in order to eliminate pulse shrinkage of the strobe signal.
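A minimal Python sketch of route reservation follows. It assumes a route is a list of node names, that each hop is a directed link, and that a reservation succeeds only if every link on the route is free; the strobe-signal mechanism is not modeled. The class `Network` and its methods are hypothetical names.

```python
class Network:
    def __init__(self):
        self.reserved = set()  # directed links currently held by some route

    def reserve(self, path):
        """Reserve every link on the route, or none of them."""
        links = list(zip(path, path[1:]))
        if any(link in self.reserved for link in links):
            return False  # a link is busy, so nothing is reserved
        self.reserved.update(links)
        return True

    def release(self, path):
        """Free the links held by a previously reserved route."""
        self.reserved.difference_update(zip(path, path[1:]))

    def send(self, path, message):
        """Reserve the route, deliver to the destination, then release.
        Returns (destination, message), or None if the route was busy."""
        if not self.reserve(path):
            return None
        try:
            # With the whole route held, intermediate nodes need not buffer.
            return (path[-1], message)
        finally:
            self.release(path)


net = Network()
print(net.send(["A", "B", "C"], "hello"))
```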
Abstract:
MIMD and pipeline processing are executed by entering data and control signals into processing chips in the form of optical signals, and by entering multi-bit information (data and control signals) in parallel and at high speed on the basis of the non-coherence of light beams. The efficiency of the MIMD processing function is improved by expanding the data transfer buses between processors and the output buses in place of the data and control signal input buses, which have become unnecessary. A processing chip for receiving optical signals consists of a large number of cells dedicated to vector computations and/or cells dedicated to arithmetic and logical computations. A processing chip for wide applications, ranging from vector computations to logical computations, is provided by employing a construction combining both processors.
Abstract:
A SIMD parallel processor includes two types of circuitry interconnecting its processing units. One kind interconnects the processing units into an array so that each processing unit can transfer data to an adjacent processing unit in the array and can receive data from an adjacent processing unit; the processing units can, for example, be interconnected in a one-dimensional array. Another kind of interconnecting circuitry includes bus circuitry to permit greater freedom in transferring data to and from processing units. Connected to the bus is a register, so that data can be transferred between processing units by first transferring data from one processing unit to the register and then transferring data from the register to another processing unit. Alternatively, data stored in the register can be sent to a subset or to all of the processing units. Similarly, control circuitry can itself provide data on the bus for transfer to one, a subset, or all of the processing units. A bidirectional register can be connected between each processing unit and the bus, so that a processing unit can be selected to provide data to the bus by selecting its bidirectional register. Similarly, each processing unit can include a memory that can be selected with a write enable signal, so that a set of processing units can be selected to receive and store in memory data from the bus.
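The two transfer paths can be sketched in Python as follows: a neighbor shift along a one-dimensional array, and a two-step bus transfer through a shared register gated by per-unit write enables. The class `SimdArray`, its method names, the one-word memory per unit, and the fill value used at the array edge are assumptions made for illustration.

```python
class SimdArray:
    def __init__(self, values):
        self.memory = list(values)  # one memory word per processing unit
        self.bus_register = None    # register connected to the bus

    def shift_right(self, fill=0):
        """Array interconnect: every unit passes its word to the adjacent unit."""
        self.memory = [fill] + self.memory[:-1]

    def to_register(self, src):
        """Bus step 1: the selected unit drives its word into the bus register."""
        self.bus_register = self.memory[src]

    def from_register(self, write_enable):
        """Bus step 2: units whose write enable is set store the register value."""
        for index, enabled in enumerate(write_enable):
            if enabled:
                self.memory[index] = self.bus_register


array = SimdArray([10, 20, 30, 40])
array.to_register(1)
array.from_register([True, False, False, True])
print(array.memory)
```

Setting every write enable gives a broadcast to all units, while setting a single one gives a point-to-point transfer between two units that need not be adjacent.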