-
Publication No.: US12236323B1
Publication Date: 2025-02-25
Application No.: US18217483
Filing Date: 2023-06-30
Applicant: Innovium, Inc.
Inventor: William Brad Matthews, Puneet Agarwal
Abstract: Distributed machine learning systems and other distributed computing systems are improved by embedding compute logic at the network switch level to perform collective actions, such as reduction operations, on gradients or other data processed by the nodes of the system. The switch is configured to recognize data units that carry data associated with a collective action that needs to be performed by the distributed system, referred to herein as “compute data,” and process that data using a compute subsystem within the switch. The compute subsystem includes a compute engine that is configured to perform various operations on the compute data, such as “reduction” operations, and forward the results back to the compute nodes. The reduction operations may include, for instance, summation, averaging, bitwise operations, and so forth. In this manner, the network switch may take over some or all of the processing of the distributed system during the collective phase.
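Illustration: the reduction operations named in the abstract (summation, averaging) can be sketched as an element-wise combine over per-node gradient vectors. This is only a minimal software sketch, not the patented compute engine; the function name `reduce_compute_data`, the `REDUCERS` table, and the list-of-floats representation are assumptions made for illustration.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical table of reduction operations; the names are illustrative only.
REDUCERS: Dict[str, Callable[[Sequence[float]], float]] = {
    "sum": lambda xs: sum(xs),
    "avg": lambda xs: sum(xs) / len(xs),
}


def reduce_compute_data(contributions: List[List[float]], op: str = "sum") -> List[float]:
    """Element-wise reduce equal-length vectors, one vector per compute node."""
    if not contributions:
        raise ValueError("no contributions to reduce")
    length = len(contributions[0])
    if any(len(c) != length for c in contributions):
        raise ValueError("all contributions must have the same length")
    reducer = REDUCERS[op]
    # zip(*...) groups the i-th element from every node into one column.
    return [reducer(column) for column in zip(*contributions)]


if __name__ == "__main__":
    node_gradients = [[1.0, 2.0], [3.0, 4.0]]
    print(reduce_compute_data(node_gradients, "sum"))
    print(reduce_compute_data(node_gradients, "avg"))
```

In a real deployment the reduced result would be forwarded back to the compute nodes; here it is simply returned to the caller.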
-
Publication No.: US12068972B1
Publication Date: 2024-08-20
Application No.: US18208648
Filing Date: 2023-06-12
Applicant: Innovium, Inc.
Inventor: William Brad Matthews, Puneet Agarwal, Bruce Hui Kwan
IPC: H04L47/625, H04L49/90, H04L49/901
CPC classification number: H04L47/6255, H04L49/901, H04L49/9084
Abstract: A traffic manager is shared amongst two or more egress blocks of a network device, thereby allowing traffic management resources to be shared between the egress blocks. Schedulers within a traffic manager may generate and queue read instructions for reading buffered portions of data units that are ready to be sent to the egress blocks. The traffic manager may be configured to select a read instruction for a given buffer bank from the read instruction queues based on a scoring mechanism or other selection logic. To avoid sending too much data to an egress block during a given time slot, once a data unit portion has been read from the buffer, it may be temporarily stored in a shallow read data cache. Alternatively, a single, non-bank specific controller may determine all of the read instructions and write operations that should be executed in a given time slot.
-
Publication No.: US11637786B1
Publication Date: 2023-04-25
Application No.: US17121404
Filing Date: 2020-12-14
Applicant: Innovium, Inc.
Inventor: William Brad Matthews, Puneet Agarwal, Bruce Hui Kwan, Ajit Kumar Jain
IPC: H04L12/26, H04L47/41, H04L47/22, H04L49/9047, H04L49/9015, H04L49/90, H04L47/6275, H04L45/16, H04L45/24, H04L47/30, H04L47/625, H04L47/32
Abstract: When a measure of buffer space queued for garbage collection in a network device grows beyond a certain threshold, one or more actions are taken to decrease an enqueue rate of certain classes of traffic, such as multicast traffic, whose reception may have caused and/or be likely to exacerbate garbage-collection-related performance issues. When the amount of buffer space queued for garbage collection shrinks to an acceptable level, these one or more actions may be reversed. In an embodiment, to more optimally handle multi-destination traffic, queue admission control logic for high-priority multi-destination data units, such as mirrored traffic, may be performed for each destination of the data units prior to linking the data units to a replication queue. If a high-priority multi-destination data unit is admitted to any queue, the high-priority multi-destination data unit can no longer be dropped, and is linked to a replication queue for replication.
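Illustration: the threshold behaviour described above (throttle when the garbage-collection backlog grows too large, reverse when it shrinks to an acceptable level) resembles a two-level hysteresis check. The sketch below is an assumption-laden simplification: the class name `GcBackpressure`, the use of two separate thresholds, and the boolean "throttled" state are all hypothetical and are not taken from the patent.

```python
class GcBackpressure:
    """Hypothetical two-threshold (hysteresis) throttle on garbage-collection backlog."""

    def __init__(self, high_threshold: int, low_threshold: int) -> None:
        if low_threshold > high_threshold:
            raise ValueError("low_threshold must not exceed high_threshold")
        self.high_threshold = high_threshold
        self.low_threshold = low_threshold
        self.throttled = False

    def update(self, gc_backlog: int) -> bool:
        """Record the current backlog and return whether throttling is active."""
        if not self.throttled and gc_backlog > self.high_threshold:
            # Backlog grew beyond the threshold: start reducing the enqueue rate.
            self.throttled = True
        elif self.throttled and gc_backlog <= self.low_threshold:
            # Backlog shrank to an acceptable level: reverse the action.
            self.throttled = False
        return self.throttled


if __name__ == "__main__":
    gate = GcBackpressure(high_threshold=100, low_threshold=50)
    for backlog in (120, 80, 50, 100):
        print(backlog, gate.update(backlog))
```

Using a lower release threshold than the trigger threshold avoids rapid toggling when the backlog hovers near a single limit; whether the patented design does this is not stated in the abstract.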
-
Publication No.: US11516149B1
Publication Date: 2022-11-29
Application No.: US17367331
Filing Date: 2021-07-03
Applicant: Innovium, Inc.
Inventor: William Brad Matthews, Puneet Agarwal
Abstract: Distributed machine learning systems and other distributed computing systems are improved by compute logic embedded in extension modules coupled directly to network switches. The compute logic performs collective actions, such as reduction operations, on gradients or other compute data processed by the nodes of the system. The reduction operations may include, for instance, summation, averaging, bitwise operations, and so forth. In this manner, the extension modules may take over some or all of the processing of the distributed system during the collective phase. An inline version of the module sits between a switch and the network. Data units carrying compute data are intercepted and processed using the compute logic, while other data units pass through the module transparently to or from the switch. Multiple modules may be connected to the switch, each coupled to a different group of nodes, and sharing intermediate results. A sidecar version is also described.
-
Publication No.: US11057318B1
Publication Date: 2021-07-06
Application No.: US16552938
Filing Date: 2019-08-27
Applicant: Innovium, Inc.
Inventor: William Brad Matthews, Puneet Agarwal
IPC: H04L12/935, H04L12/771, G06N20/00
Abstract: Distributed machine learning systems and other distributed computing systems are improved by compute logic embedded in extension modules coupled directly to network switches. The compute logic performs collective actions, such as reduction operations, on gradients or other compute data processed by the nodes of the system. The reduction operations may include, for instance, summation, averaging, bitwise operations, and so forth. In this manner, the extension modules may take over some or all of the processing of the distributed system during the collective phase. An inline version of the module sits between a switch and the network. Data units carrying compute data are intercepted and processed using the compute logic, while other data units pass through the module transparently to or from the switch. Multiple modules may be connected to the switch, each coupled to a different group of nodes, and sharing intermediate results. A sidecar version is also described.
-
Publication No.: US11044204B1
Publication Date: 2021-06-22
Application No.: US16527278
Filing Date: 2019-07-31
Applicant: Innovium, Inc.
Inventor: William Brad Matthews, Puneet Agarwal
IPC: H04L12/825, H04L12/26
Abstract: Nodes within a network are configured to adapt to changing path states, due to congestion, node failures, and/or other factors. A node may selectively convey path information and/or other state information to another node by annotating the information into packets it receives from the other node. A node may selectively reflect these annotated packets back to the other node, or other nodes that subsequently receive these annotated packets may reflect them. A weighted cost multipathing selection technique is improved by dynamically adjusting weights of paths in response to feedback indicating the current state of the network topology, such as collected through these reflected packets. In an embodiment, certain packets that would have been dropped may instead be transformed into “special visibility” packets that may be stored and/or sent for analysis. In an embodiment, insight into the performance of a network device is enhanced through the use of programmable visibility engines.
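Illustration: dynamically adjusting path weights in response to feedback can be sketched as a weighted selector whose weights are scaled down when congestion is reported. This is a simplified stand-in, not the patented technique: the class `WeightedPathSelector`, the `report_congestion` method, the multiplicative `factor`, and the use of random selection (rather than, say, flow hashing) are all assumptions made for illustration.

```python
import random
from typing import Dict, Optional


class WeightedPathSelector:
    """Hypothetical weighted path chooser with feedback-driven weights."""

    def __init__(self, weights: Dict[str, float], seed: Optional[int] = None) -> None:
        self.weights = dict(weights)
        self._rng = random.Random(seed)

    def report_congestion(self, path: str, factor: float = 0.5) -> None:
        """Scale down a path's weight after feedback indicates congestion."""
        self.weights[path] *= factor

    def select(self) -> str:
        """Pick a path with probability proportional to its current weight."""
        paths = list(self.weights)
        path_weights = [self.weights[p] for p in paths]
        return self._rng.choices(paths, weights=path_weights, k=1)[0]


if __name__ == "__main__":
    selector = WeightedPathSelector({"path_a": 1.0, "path_b": 1.0}, seed=0)
    selector.report_congestion("path_b")  # feedback, e.g. from a reflected packet
    print(selector.weights)
    print(selector.select())
```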
-
Publication No.: US10673770B1
Publication Date: 2020-06-02
Application No.: US16288165
Filing Date: 2019-02-28
Applicant: Innovium, Inc.
Inventor: William Brad Matthews, Puneet Agarwal, Ajit Kumar Jain
IPC: H04L12/863, H04L12/801, H04L12/26
Abstract: A network device organizes packets into various queues, in which the packets await processing. Queue management logic tracks how long certain packet(s), such as a designated marker packet, remain in a queue. Based thereon, the logic produces a measure of delay for the queue, referred to herein as the “queue delay.” Based on a comparison of the current queue delay to one or more thresholds, various associated delay-based actions may be performed, such as tagging and/or dropping packets departing from the queue, or preventing additional enqueues to the queue. In an embodiment, a queue may be expired based on the queue delay, and all packets dropped. In other embodiments, when a packet is dropped prior to enqueue into an assigned queue, copies of some or all of the packets already within the queue at the time the packet was dropped may be forwarded to a visibility component for analysis.
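Illustration: the "queue delay" idea can be sketched by timestamping packets on enqueue and measuring their residence time on dequeue, then comparing that delay to a threshold before admitting new packets. The sketch below simplifies the abstract by treating every packet as a marker packet; the class `DelayTrackedQueue`, the single `block_threshold`, and the rule that an empty queue resets the delay to zero are assumptions for illustration only.

```python
from collections import deque
from typing import Deque, Tuple


class DelayTrackedQueue:
    """Hypothetical queue that measures delay and blocks enqueues when it is too high."""

    def __init__(self, block_threshold: float) -> None:
        self.block_threshold = block_threshold
        self.queue_delay = 0.0
        self._items: Deque[Tuple[str, float]] = deque()  # (packet, enqueue time)

    def enqueue(self, packet: str, now: float) -> bool:
        """Admit the packet unless the current queue delay exceeds the threshold."""
        if not self._items:
            # Assumption: an empty queue imposes no delay on new arrivals.
            self.queue_delay = 0.0
        if self.queue_delay > self.block_threshold:
            return False  # delay-based action: prevent additional enqueues
        self._items.append((packet, now))
        return True

    def dequeue(self, now: float) -> str:
        """Remove the oldest packet and update the measured queue delay."""
        packet, enqueued_at = self._items.popleft()
        self.queue_delay = now - enqueued_at
        return packet


if __name__ == "__main__":
    q = DelayTrackedQueue(block_threshold=5.0)
    q.enqueue("p1", now=0.0)
    q.enqueue("p2", now=1.0)
    print(q.dequeue(now=10.0), q.queue_delay)
    print(q.enqueue("p3", now=10.0))
```

Other delay-based actions mentioned in the abstract, such as tagging departing packets or expiring the whole queue, would hang off the same `queue_delay` comparison.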
-
Publication No.: US10652154B1
Publication Date: 2020-05-12
Application No.: US16186369
Filing Date: 2018-11-09
Applicant: Innovium, Inc.
Inventor: William Brad Matthews, Bruce Hui Kwan
IPC: H04L12/803, H04L12/947, H04L12/24, G06N3/08, G06N5/02, G06N20/00, H04W52/02
Abstract: Approaches, techniques, and mechanisms facilitate actionable reporting of network state information and real-time, autonomous network engineering directly in-network at a switch or other network device. A data collector within the network device collects state information and/or data unit information from various device components, such as traffic managers and packet processors. The data collector, which may optionally generate additional state information by performing various calculations on the information it receives, is configured to then provide at least some of the state information to an analyzer device connected to an analyzer interface. The analyzer device, which may be a separate device, performs various analyses on the state information, depending on how it is configured. The analyzer device outputs reports that identify statuses, errors, misconfigurations, and/or suggested actions to take to improve operation of the network device. In an embodiment, some or all actions that may be suggested therein are executed automatically.
-
Publication No.: US10554572B1
Publication Date: 2020-02-04
Application No.: US15433825
Filing Date: 2017-02-15
Applicant: Innovium, Inc.
Inventor: William Brad Matthews, Paul Roy Kim, Puneet Agarwal
IPC: H04L12/865, H04L12/863, H04L29/06
Abstract: Approaches, techniques, and mechanisms are disclosed for improving the efficiency with which data units are handled within a device, such as a networking device. Received data units, or portions thereof, are temporarily stored within one or more memories of a merging component, while the merging component waits to receive control information for the data units. Once received, the merging component merges the control information with the associated data units. The merging component dispatches the merged data units, or portions thereof, to an interconnect component, which forwards the merged data units to destinations indicated by the control information. The device is configured to intelligently schedule the dispatching of merged data units to the interconnect component. To this end, the device includes a scheduler configured to select which merged data units to dispatch at which times based on a variety of factors described herein.
-
Publication No.: US20180081577A1
Publication Date: 2018-03-22
Application No.: US15815854
Filing Date: 2017-11-17
Applicant: Innovium, Inc.
Inventor: William Brad Matthews, Bruce H. Kwan, Mohammad K. Issa, Neil Barrett, Avinash Gyanendra Mani
Abstract: A memory system for a network device is described. The memory system includes a main memory configured to store one or more data elements. Further, the memory system includes a link memory that is configured to maintain one or more pointers to interconnect the one or more data elements stored in the main memory. The memory system also includes a free-entry manager that is configured to generate an available bank set including one or more locations in the link memory. In addition, the memory system includes a context manager that is configured to maintain metadata for a list of the one or more data elements.