Self-synchronizing remote memory operations in a multiprocessor system

    公开(公告)号:US12105960B2

    公开(公告)日:2024-10-01

    申请号:US17900808

    申请日:2022-08-31

    IPC分类号: G06F3/06

    摘要: Various embodiments include techniques for performing self-synchronizing remote memory operations in a multiprocessor computing system. During a remote memory operation in the multiprocessor computing system, a source processing unit transmits multiple segments of data to a destination processing. For each segment of data, the source processing unit transmits a remote memory operation to the destination processing unit that includes associated metadata that identifies the memory location of a corresponding synchronization object. The remote memory operation along with the metadata is transmitted as a single unit to the destination processing unit. The destination processing unit splits the operation into the remote memory operation and the memory synchronization operation. As a result, the source processing unit avoids the need to perform a separate memory synchronization operation, thereby reducing inter-processor communications and increasing performance of remote memory operations.

    NETWORK ROUTING USING AGGREGATED LINKS

    公开(公告)号:US20210014156A1

    公开(公告)日:2021-01-14

    申请号:US16700611

    申请日:2019-12-02

    摘要: Introduced herein is a routing technique that, for example, routes a transaction to a destination port over a network that supports link aggregation and multi-port connection. In one embodiment, two tables that can be searched based on the target and supplemental routing IDs of the transaction are utilized to route the transaction to the proper port of the destination endpoint. In an embodiment, the first table provides a list of available ports at each hop/route point that can route the transaction to the destination endpoint, and the second table provides a supplemental routing ID that can select a specific group of ports from the first table that can correctly route the transaction to the proper port.

    MULTICAST AND REFLECTIVE MEMORY BEHAVIOR FOR MEMORY MODEL CONSISTENCY

    公开(公告)号:US20230229599A1

    公开(公告)日:2023-07-20

    申请号:US17578266

    申请日:2022-01-18

    摘要: In various examples, a memory model may support multicasting where a single request for a memory access operation may be propagated to multiple physical addresses associated with multiple processing elements (e.g., corresponding to respective local memory). Thus, the request may cause data to be read from and/or written to memory for each of the processing elements. In some examples, a memory model exposes multicasting to processes. This may include providing for separate multicast and unicast instructions or shared instructions with one or more parameters (e.g., indicating a virtual address) being used to indicate multicasting or unicasting. Additionally or alternatively, whether a request(s) is processed using multicasting or unicasting may be opaque to a process and/or application or may otherwise be determined by the system. One or more constraints may be imposed on processing requests using multicasting to maintain a coherent memory interface.

    Techniques for efficiently synchronizing data transmissions on a network

    公开(公告)号:US10789194B2

    公开(公告)日:2020-09-29

    申请号:US16364565

    申请日:2019-03-26

    摘要: Systems and techniques for synchronizing transactions between processing devices on an interconnection network are provided. Upon receiving a stream of posted transactions followed by a flush transaction from a source processing device connected to the interconnection network, the flush transaction is trapped before it enters the interconnecting network. Subsequently, based on monitoring for responses received from a destination processing device for transactions corresponding to the posted transactions, a flush response is generated and returned to the source processing device. The described techniques enable efficient synchronizing posted writes, posted atomics and the like over complex interconnection fabrics such that a first GPU can write data to a second GPU so that a third GPU can safely consume the data written to the second GPU.