Abstract:
Methods, apparatus and software for implementing enhanced data center congestion management for non-TCP traffic. Non-congested transmit latencies are determined for transmission of packets or Ethernet frames along paths between source and destination end-nodes when congestion along the paths is absent or minimal. Transmit latencies are similarly measured along the same source-destination paths during ongoing operations during which traffic congestion may vary. Based on whether a difference between the transmit latency for a packet or frame and the non-congested transmit latency for the path exceeds a threshold, the path is marked as congested or not congested. A rate at which the non-TCP packets are transmitted along the path is then managed as a function of a rate at which the path is marked as congested. In one implementation, non-TCP traffic is managed by mimicking a Data Center TCP (DCTCP) technique, under which the congestion marking status of the path is substituted as an input to a DCTCP algorithm in place of the normally-used ECN-Echo flag input. The congestion window output by the DCTCP algorithm is then used to manage the rate at which non-TCP packets to be forwarded via the path are transmitted from a source end-node.
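The marking-and-rate-control loop described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the class name `PathState`, its field names, the default gain of 1/16, the initial window of 10 packets, and the one-packet additive increase are all assumptions made for the example. Only the latency-threshold marking rule comes from the abstract, and the two update formulas are the standard published DCTCP estimator and window reduction.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class PathState:
    """Per-path congestion state (hypothetical names and defaults)."""
    baseline_latency_us: float   # non-congested latency measured for the path
    threshold_us: float          # excess latency that counts as congestion
    g: float = 1.0 / 16          # DCTCP estimation gain (assumed default)
    alpha: float = 0.0           # running estimate of the congested fraction
    cwnd: float = 10.0           # congestion window in packets (assumed start)

    def is_congested(self, latency_us: float) -> bool:
        """Mark congested when latency exceeds the baseline by more than the threshold."""
        return (latency_us - self.baseline_latency_us) > self.threshold_us

    def on_window(self, latencies_us: Sequence[float]) -> float:
        """Process one window of measured latencies and return the new cwnd."""
        if not latencies_us:
            return self.cwnd
        # The latency-derived mark stands in for the ECN-Echo flag.
        marked = sum(self.is_congested(lat) for lat in latencies_us)
        fraction = marked / len(latencies_us)
        # DCTCP estimator: alpha <- (1 - g) * alpha + g * F
        self.alpha = (1.0 - self.g) * self.alpha + self.g * fraction
        if marked:
            # DCTCP reduction: cwnd <- cwnd * (1 - alpha / 2), floored at one packet
            self.cwnd = max(1.0, self.cwnd * (1.0 - self.alpha / 2.0))
        else:
            # Simplified additive increase: one packet per uncongested window
            self.cwnd += 1.0
        return self.cwnd
```

A sender would call `on_window` once per window of latency samples for a path and limit the number of non-TCP packets in flight on that path to the returned window. Because the reduction scales with `alpha`, a path that is only occasionally marked congested is throttled gently, while a persistently congested path approaches a halving of the window.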
Abstract:
In an embodiment of the present invention, a method includes partitioning a plurality of remote direct memory access context objects among a plurality of virtual functions, establishing a remote direct memory access connection between a first of the plurality of virtual functions and a remote peer, and migrating the remote direct memory access connection from the first of the plurality of virtual functions to a second of the plurality of virtual functions without disconnecting from the remote peer.
Abstract:
Methods, apparatus and software for implementing enhanced data center congestion management for non-TCP traffic. Non-congested transit latencies are determined for transmission of packets or Ethernet frames along paths between source and destination end-nodes when congestion along the paths is absent or minimal. Transit latencies are similarly measured along the same source-destination paths during ongoing operations during which traffic congestion may vary. Based on whether a difference between the transit latency for a packet or frame and the non-congested transit latency for the path exceeds a threshold, the path is marked as congested or not congested. A rate at which the non-TCP packets are transmitted along the path is then managed as a function of a rate at which the path is marked as congested. In one implementation, non-TCP traffic is managed by mimicking a Data Center TCP (DCTCP) technique, under which the congestion marking status of the path is substituted as an input to a DCTCP algorithm in place of the normally-used ECN-Echo flag input. The congestion window output by the DCTCP algorithm is then used to manage the rate at which non-TCP packets to be forwarded via the path are transmitted from a source end-node.
Abstract:
Apparatus, method and system for supporting Remote Direct Memory Access (RDMA) Read V2 Request and Response messages using the Internet Wide Area RDMA Protocol (iWARP). iWARP logic in an RDMA Network Interface Controller (RNIC) is configured to generate a new RDMA Read V2 Request message and generate a new RDMA Read V2 Response message in response to a received RDMA Read V2 Request message, and send the messages to an RDMA remote peer using iWARP implemented over an Ethernet network. The iWARP logic is further configured to process RDMA Read V2 Response messages received from the RDMA remote peer, and to write data contained in the messages to appropriate locations using DMA transfers from buffers on the RNIC into system memory. In addition, the new semantics removes the need for extra operations to grant and revoke remote access rights.
Abstract:
Apparatus, methods and systems for supporting Send with Immediate Data messages using Remote Direct Memory Access (RDMA) and the Internet Wide Area RDMA Protocol (iWARP). iWARP logic in an RDMA Network Interface Controller (RNIC) is configured to generate different types of Send with Immediate Data messages, each including a header with a unique RDMA opcode identifying the type of Send with Immediate Data message, and send the message to an RDMA remote peer using iWARP implemented over an Ethernet network. The iWARP logic is further configured to process the Send with Immediate Data messages received from the RDMA remote peer. The Send with Immediate Data messages include a Send with Immediate Data message, a Send with Invalidate and Immediate Data message, a Send with Solicited Event (SE) and Immediate Data message, and a Send with Invalidate and SE and Immediate Data message.
Abstract:
Methods, apparatus and systems for reducing usage of Doorbell Rings in connection with RDMA operations. A portion of system memory is employed as a Memory-Mapped Input/Output (MMIO) address space configured to be accessed via a hardware networking device. A Send Queue (SQ) is stored in MMIO and is used to facilitate processing of Work Requests (WRs) that are written to SQ entries by software and read from the SQ via the hardware networking device. The software and logic in the hardware networking device employ pointers identifying locations in the SQ corresponding to a next write WR entry slot and a last read WR entry slot. These pointers enable WRs to be written to and read from the SQ during ongoing operations under which the SQ is not emptied, such that doorbell rings to notify the hardware networking device that new WRs have been written to the SQ are not required.
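The pointer scheme in this abstract can be illustrated with a small single-threaded model. This is a sketch under stated assumptions, not the claimed design: the class `SendQueue`, the `hw_active` flag, and the use of running write and read counts in place of slot pointers are hypothetical simplifications, and a real device would need memory ordering and race handling that this model omits.

```python
class SendQueue:
    """Circular SQ shared by a software producer and a modeled hardware consumer."""

    def __init__(self, size: int) -> None:
        self.size = size
        self.slots = [None] * size
        self.write_count = 0     # total WRs written; next write slot is write_count % size
        self.read_count = 0      # total WRs read; last read slot is (read_count - 1) % size
        self.hw_active = False   # True while hardware is still draining the SQ
        self.doorbells = 0       # number of doorbell rings issued by software

    def post(self, wr) -> bool:
        """Software writes a WR; returns False if the SQ is full."""
        if self.write_count - self.read_count == self.size:
            return False
        self.slots[self.write_count % self.size] = wr
        self.write_count += 1
        if not self.hw_active:
            # Hardware has gone idle, so it must be woken with a doorbell.
            self.doorbells += 1
            self.hw_active = True
        return True

    def hw_poll(self):
        """Hardware reads the next WR, or goes idle if it has caught up."""
        if self.read_count == self.write_count:
            self.hw_active = False
            return None
        wr = self.slots[self.read_count % self.size]
        self.read_count += 1
        return wr
```

In this model, software rings the doorbell only when it posts to a queue that hardware has already found empty. As long as new WRs arrive before hardware catches up with the write pointer, hardware discovers them by comparing the two pointers and no further doorbell is needed.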