Patent search: ap:("NVIDIA Corporation") AND inv:"Nan Jiang" (page 1)

1.

Granted patent
Multicast-reduction assisted by network devices (Active)

Publication number: US11956306B1

Publication date: 2024-04-09

Application number: US17709111

Filing date: 2022-03-30

Applicant: NVIDIA Corporation

Inventors: Glenn Dearth, Mark Hummel, Nan Jiang, Gregory Thorson

IPC classification: H04L47/70, H04L47/80, H04L67/1008, H04L67/1014

CPC classification: H04L67/1008, H04L47/806, H04L47/827, H04L67/1014

Abstract: Systems and techniques for performing multicast-reduction operations. In at least one embodiment, a network device receives first network data associated with a multicast operation to be collectively performed by at least a plurality of endpoints. The network device reserves resources to process second network data to be received from the endpoints, and sends the first network data to a plurality of additional network devices. The network device receives the second network data, and processes the second network data using the reserved resources.

2.

Granted patent
Scalable in-network computation for massively-parallel shared-memory processors (Active)

Publication number: US11463272B2

Publication date: 2022-10-04

Application number: US17495547

Filing date: 2021-10-06

Applicant: NVIDIA Corporation

Inventors: Benjamin Klenk, Nan Jiang, Larry Robert Dennison, Gregory M. Thorson

IPC classification: H04L67/55, H04L12/18, G06F9/50, H04L47/10, H04L47/20, H04L47/80, H04L45/74, H04L67/568

Abstract: A network device configured to perform scalable, in-network computations is described. The network device is configured to process pull requests and/or push requests from a plurality of endpoints connected to the network. A collective communication primitive from a particular endpoint can be received at a network device. The collective communication primitive is associated with a multicast region of a shared global address space and is mapped to a plurality of participating endpoints. The network device is configured to perform an in-network computation based on information received from the participating endpoints before forwarding a response to the collective communication primitive back to one or more of the participating endpoints. The endpoints can inject pull requests (e.g., load commands) and/or push requests (e.g., store commands) into the network. A multicast capability enables tasks, such as a reduction operation, to be offloaded to hardware in the network device.
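
Illustrative note (not part of the patent record): the in-network reduction this abstract describes can be pictured with a minimal Python sketch. It assumes a sum reduction and exactly one push per participating endpoint; the class name `ReductionSwitch`, the method `push`, and the endpoint labels are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class ReductionSwitch:
    """Hypothetical model of a switch that sums one value per endpoint."""

    participants: frozenset
    pending: dict = field(default_factory=dict)

    def push(self, endpoint, value):
        """Record one contribution.

        Returns None while contributions are still missing, and the
        reduced (summed) value once every participant has pushed.
        """
        if endpoint not in self.participants:
            raise ValueError(f"{endpoint!r} is not a participating endpoint")
        if endpoint in self.pending:
            raise ValueError(f"{endpoint!r} has already contributed")
        self.pending[endpoint] = value
        if len(self.pending) < len(self.participants):
            return None  # still waiting for other endpoints
        result = sum(self.pending.values())
        self.pending.clear()  # ready for the next collective operation
        return result


switch = ReductionSwitch(frozenset({"gpu0", "gpu1", "gpu2"}))
first = switch.push("gpu0", 1.0)
second = switch.push("gpu1", 2.0)
total = switch.push("gpu2", 3.0)
```

In this sketch the endpoints never exchange values directly; the reduction happens where the pushes converge, which is the offloading idea the abstract refers to.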

3.

Published application
NETWORK MULTICASTING USING ALTERNATE SETS OF DIRECTIVES (Under examination, published)

Publication number: US20230224239A1

Publication date: 2023-07-13

Application number: US17575354

Filing date: 2022-01-13

Applicant: NVIDIA Corporation

Inventors: Glenn Dearth, Nan Jiang, Mark Hummel, Richard Reeves

IPC classification: H04L45/16, H04L12/18, H04L45/745

CPC classification: H04L45/16, H04L12/18, H04L45/745

Abstract: Apparatuses, systems, and techniques to multicast a transaction to a group of targets. In at least one embodiment, a set is selected from alternate sets of directives associated with the group of targets, and the transaction is transmitted to the group of targets in accordance with the selected set.

4.

Patent application
TECHNIQUES FOR REDUCING CONGESTION IN A COMPUTER NETWORK (Under examination, published)

Publication number: US20190297018A1

Publication date: 2019-09-26

Application number: US16277349

Filing date: 2019-02-15

Applicant: Nvidia Corporation

Inventors: Glenn Dearth, Nan Jiang, John Wortman, Alex Ishii, Mark Hummel, Rich Reeves

IPC classification: H04L12/801, H04L12/825, H04L12/26

Abstract: Multiple processors are often used in computing systems to solve very large, complex problems, such as those encountered in artificial intelligence. Such processors typically exchange data among each other via an interconnect fabric (such as, e.g., a group of network connections and switches) in solving such complex problems. The amount of data injected into the interconnect fabric by the processors can at times overwhelm the interconnect fabric preventing some of the processors from communicating with each other. To address this problem, techniques are disclosed to enable, for example, processors that are connected to an interconnect fabric to coordinate and control the amount of data injected so that the interconnect fabric does not get overwhelmed.

5.

Published application
CROSSBAR WITH AT-DISPATCH DYNAMIC DESTINATION SELECTION (Under examination, published)

Publication number: US20240356866A1

Publication date: 2024-10-24

Application number: US18137132

Filing date: 2023-04-20

Applicant: NVIDIA Corporation

Inventors: Karan Gupta, Dane Thomas Mrazek, Nan Jiang, Mukesh Chand Agarwal

IPC classification: H04L49/101, H04L49/00, H04L49/9047

CPC classification: H04L49/101, H04L49/3018, H04L49/3027, H04L49/9047

Abstract: A Dynamic Destination Selection (DDS) crossbar, system for routing a packet, and a switch are provided. An illustrative DDS crossbar includes one or more adaptive routing circuits to track destination credit and port availability at a time of dispatching a packet, group multiple destinations into super destination groups, perform dynamic destination routing within a super destination group, and use the destination credit and port availability for the super destination group at the time of receiving the packet to select an output destination for the packet.

6.

Patent application
SCALABLE IN-NETWORK COMPUTATION FOR MASSIVELY-PARALLEL SHARED-MEMORY PROCESSORS (Active)

Publication number: US20220029845A1

Publication date: 2022-01-27

Application number: US17495547

Filing date: 2021-10-06

Applicant: NVIDIA Corporation

Inventors: Benjamin Klenk, Nan Jiang, Larry Robert Dennison, Gregory M. Thorson

IPC classification: H04L12/18, G06F9/50, H04L12/801, H04L12/813, H04L12/927, H04L12/741, H04L29/08

Abstract: A network device configured to perform scalable, in-network computations is described. The network device is configured to process pull requests and/or push requests from a plurality of endpoints connected to the network. A collective communication primitive from a particular endpoint can be received at a network device. The collective communication primitive is associated with a multicast region of a shared global address space and is mapped to a plurality of participating endpoints. The network device is configured to perform an in-network computation based on information received from the participating endpoints before forwarding a response to the collective communication primitive back to one or more of the participating endpoints. The endpoints can inject pull requests (e.g., load commands) and/or push requests (e.g., store commands) into the network. A multicast capability enables tasks, such as a reduction operation, to be offloaded to hardware in the network device.

7.

Patent application
INJECTION LIMITING AND WAVE SYNCHRONIZATION FOR SCALABLE IN-NETWORK COMPUTATION (Active)

Publication number: US20210036881A1

Publication date: 2021-02-04

Application number: US16938044

Filing date: 2020-07-24

Applicant: NVIDIA Corporation

Inventors: Benjamin Klenk, Nan Jiang, Larry Robert Dennison

IPC classification: H04L12/18, G06F9/50, H04L12/927, H04L12/813, H04L12/801

Abstract: A network device configured to perform scalable, in-network computations is described. The network device is configured to process pull requests and/or push requests from a plurality of endpoints connected to the network. A collective communication primitive from a particular endpoint can be received at a network device. The collective communication primitive is associated with a multicast region of a shared global address space and is mapped to a plurality of participating endpoints. The network device is configured to perform an in-network computation based on information received from the participating endpoints before forwarding a response to the collective communication primitive back to one or more of the participating endpoints. An injection policy comprising the issuing of credits enables each endpoint to limit the amount of collective communication primitives injected into the network simultaneously to reduce network congestion caused by increased network traffic due to the multicast capability of the network devices.
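
Illustrative note (not part of the patent record): a credit-based injection policy of the kind this abstract mentions can be sketched in Python as follows. This is a simplified assumption of how such a policy might behave, with one credit consumed per injected primitive and one returned per response; the class `CreditLimiter` and its methods are hypothetical names, not taken from the patent.

```python
class CreditLimiter:
    """Hypothetical per-endpoint limiter: one credit per in-flight primitive."""

    def __init__(self, credits):
        if credits < 1:
            raise ValueError("an endpoint needs at least one credit")
        self.max_credits = credits
        self.credits = credits

    def try_inject(self):
        """Consume a credit if one is available; return whether injection may proceed."""
        if self.credits == 0:
            return False  # endpoint must wait for a response first
        self.credits -= 1
        return True

    def on_response(self):
        """Return a credit when a response to an earlier primitive arrives."""
        if self.credits >= self.max_credits:
            raise RuntimeError("response received with no primitive in flight")
        self.credits += 1


limiter = CreditLimiter(credits=2)
attempts = [limiter.try_inject() for _ in range(3)]  # third attempt is refused
limiter.on_response()                                # one primitive completed
retry = limiter.try_inject()                         # now allowed again
```

The number of credits caps how many collective primitives one endpoint can have in the network at once, which is the congestion-limiting effect the abstract attributes to the injection policy.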

8.

Published application
MULTICAST-REDUCTION ASSISTED BY NETWORK DEVICES (Under examination, published)

Publication number: US20240137410A1

Publication date: 2024-04-25

Application number: US18545339

Filing date: 2023-12-19

Applicant: NVIDIA Corporation

Inventors: Glenn Dearth, Mark Hummel, Nan Jiang, Gregory Thorson

IPC classification: H04L67/1008, H04L47/70, H04L47/80, H04L67/1014

CPC classification: H04L67/1008, H04L47/806, H04L47/827, H04L67/1014

Abstract: Systems and techniques for performing multicast-reduction operations. In at least one embodiment, a network device receives first network data associated with a multicast operation to be collectively performed by at least a plurality of endpoints. The network device reserves resources to process second network data to be received from the endpoints, and sends the first network data to a plurality of additional network devices. The network device receives the second network data, and processes the second network data using the reserved resources.

9.

Patent application
CROSSBAR MULTIPATHING FOR MULTICAST PERFORMANCE IN TILED SWITCHES (Active)

Publication number: US20220417176A1

Publication date: 2022-12-29

Application number: US17848088

Filing date: 2022-06-23

Applicant: NVIDIA Corporation

Inventors: Glenn Alan Dearth, Nan Jiang, Mark D. Hummel, Gregory Michael Thorson, Karan Gupta, Dane Thomas Mrazek, Eric Anderson, Larry Robert Dennison

IPC classification: H04L49/101, H04L49/201, H04L45/00, H04L45/24, H04L45/30, H04L45/80

Abstract: A method is provided for operating a network switch comprising a plurality of input ports and a plurality of output ports. The method comprises receiving a first data packet received via a first input port and a second data packet received via a second input port to be delivered to an egress endpoint connected to a first output port, configuring a plurality of crossbar switch units arranged in a tiled architecture to pass the first data packet to the first output port via a primary path and pass the second data packet to the first output port via a secondary path, and transmitting the first data packet and the second data packet to the egress endpoint. The first data packet and the second data packet pass through the plurality of crossbar switch units simultaneously.

10.

Patent application
SCALABLE IN-NETWORK COMPUTATION FOR MASSIVELY-PARALLEL SHARED-MEMORY PROCESSORS (Active)

Publication number: US20210037107A1

Publication date: 2021-02-04

Application number: US16938097

Filing date: 2020-07-24

Applicant: NVIDIA Corporation

Inventors: Benjamin Klenk, Nan Jiang, Larry Robert Dennison, Gregory M. Thorson

IPC classification: H04L29/08, H04L12/18, H04L12/741

Abstract: A network device configured to perform scalable, in-network computations is described. The network device is configured to process pull requests and/or push requests from a plurality of endpoints connected to the network. A collective communication primitive from a particular endpoint can be received at a network device. The collective communication primitive is associated with a multicast region of a shared global address space and is mapped to a plurality of participating endpoints. The network device is configured to perform an in-network computation based on information received from the participating endpoints before forwarding a response to the collective communication primitive back to one or more of the participating endpoints. The endpoints can inject pull requests (e.g., load commands) and/or push requests (e.g., store commands) into the network. A multicast capability enables tasks, such as a reduction operation, to be offloaded to hardware in the network device.
