Abstract:
Embodiments of the present invention provide a solution for managing inter-domain resource allocation in a Peripheral Component Interconnect-Express (PCIe) network. One processor among a plurality of link processors is elected as a management processor. The management processor obtains information about available resources of PCIe network. When a resource request from a request processor is received, the management processor allocates a resource of the available resources to the requesting processor. The management processor instructs one or more link processors to program one or more inter-domain NTBs through which the traffic between the allocated resource and the requesting processor is going to flow according to the memory address information of the allocated resource, to allow cross-domain resource access between the requesting processor and the allocated resource.
Abstract:
A method of communicating data over a Peripheral Component Interconnect Express (PCIe) Non-Transparent Bridge (NTB) comprising transmitting a first posted write message to a remote processor via the NTB, wherein the first posted write message indicates an intent to transfer data to the remote processor, and receiving a second posted write message in response to the first posted write message, wherein the second posted write message indicates a destination address list for the data. Also disclosed is a method of communicating data over a PCIe NTB comprising transmitting a first posted write message to a remote processor via the NTB, wherein the first posted write message comprises a request to read data, and receiving a data transfer message comprising at least some of the data requested by the first posted write message.
Abstract:
In a high-dimensional PCI-Express (PCIe) network, implementation of alternative paths is accomplished to facilitate flexible topology implementation and network domain scaling while enabling improved communication latency. Different portions of the PCIe tree structure are connected to allow a shorter path for communications by utilizing a bridge circuit configured as an end-point with respect to two switches that are not directly connected in the PCIe tree topology. The bridge circuit performs address translations to allow communications from one switch to be passed via the bridge circuit to the other switch.
Abstract:
A peripheral component interconnect express PCI-e network system having a processor for (a) assigning addresses to the PCI-e topology tree, comprising: traversing, at a given level and in a breadth direction, down-link couplings to an interconnection; ascertaining, at the level, which of the down-link couplings are connected to nodes; assigning, at the level, addresses to nodes of ascertained down-link coupling having nodes; and (b) propagating, a level, comprising: traversing, at the level and in a depth direction, down-link couplings to the interconnection of the PCI-e network, ascertaining, at the level, which of the downlink couplings are coupled to other interconnections in the depth direction, consecutively proceeding in the depth direction, to a next level of the down-link coupling of a next interconnection; and alternatively repeating (a) and (b) until the nodes are assigned addresses within the PCI-e tree topology network.
Abstract:
An apparatus for initialization. The apparatus includes a management I/O device controller for managing initialization of a plurality of I/O devices coupled to a PCI-Express (PCIe) fabric. The management I/O device controller is configured for receiving a request to register a target interrupt register address of a first worker computing resource, wherein the target interrupt register address is associated with a first interrupt generated by a first I/O device coupled to the PCIe fabric. A mapping module of the management I/O device controller is configured for mapping the target interrupt register address to a mapped interrupt register address of a domain in which the first I/O device resides. A translating interrupt register table includes a plurality of mapped interrupt register addresses in the domain that is associated with a plurality of target interrupt register addresses of a plurality of worker computing resources.
Abstract:
Embodiments of the present invention pertain to a method and apparatus for a scalable sorting of a data set in a database on a computer system. A number of contiguous ranges spanning the data set are defined. Each individual data value of the data set is assigned to a range to which it falls into. The values in the ranges are then sorted. The sorting can be performed by different nodes in parallel. Once the sorting is completed, the results are stored in contiguous memory locations. This results the overall data set being sorted.
Abstract:
An apparatus for initialization. The apparatus includes a management I/O device controller for managing initialization of a plurality of I/O devices coupled to a PCI-Express (PCIe) fabric. The management I/O device controller is configured for receiving a request to register a target interrupt register address of a first worker computing resource, wherein the target interrupt register address is associated with a first interrupt generated by a first I/O device coupled to the PCIe fabric. A mapping module of the management I/O device controller is configured for mapping the target interrupt register address to a mapped interrupt register address of a domain in which the first I/O device resides. A translating interrupt register table includes a plurality of mapped interrupt register addresses in the domain that is associated with a plurality of target interrupt register addresses of a plurality of worker computing resources.
Abstract:
An apparatus for initialization. The apparatus includes a management I/O device controller for managing initialization of a plurality of I/O devices coupled to a PCI-Express (PCIe) fabric. The management I/O device controller is configured for receiving a request to register a target interrupt register address of a first worker computing resource, wherein the target interrupt register address is associated with a first interrupt generated by a first I/O device coupled to the PCIe fabric. A mapping module of the management I/O device controller is configured for mapping the target interrupt register address to a mapped interrupt register address of a domain in which the first I/O device resides. A translating interrupt register table includes a plurality of mapped interrupt register addresses in the domain that is associated with a plurality of target interrupt register addresses of a plurality of worker computing resources.
Abstract:
Systems and methods for offloading computations from a CPU directly to an accelerator engine are disclosed. One embodiment includes determining a function of an application to be offloaded from a CPU to an accelerator engine, locating data within a file necessary to perform the functions, programming a logic of the accelerator engine based on the function to be offloaded, programming a DMA engine to move a copy the data from a secondary storage device to the accelerator engine, and processing the data at the accelerator engine using the programmed logic.
Abstract:
Methods and apparatuses for insertion, searching, deletion, and load balancing using a hierarchical series of hash tables are described herein. The techniques disclosed provide nearly collision free or deterministic hash functions using a bitmap as a pre-filter. The hash functions have different priorities and one hashing result will be used to perform main memory access. For the hash functions, two hash bitmaps are used to store valid data and collision information. There is no collision allowed in the hash tables except for the hash table with the lowest priority. The hash tables and bitmaps may be stored in one or more caches in (e.g., a cache of a CPU, Block RAMs in FPGAs, etc.) which perform much faster than main memory.