-
Publication No.: US20240231903A1
Publication date: 2024-07-11
Application No.: US18614639
Filing date: 2024-03-23
Inventors: Qi ZHENG, Arnav GOEL, Conrad Alexander TURLIK, Guoyao FENG, Joshua Earle POLZIN, Fansheng CHENG, Ravinder KUMAR, Greg DYKEMA, Subhra MAZUMDAR, Milad SHARIF, Jiayu BAI, Neal SANGHVI, Arjun SABNIS, Letao CHEN
CPC classification: G06F9/4881, G06F9/3877
Abstract: In a computer-implemented method, a Dynamic Transfer Engine (DTE) included in a computing system receives a dynamic stimulus associated with transfer of stage data during execution of a dataflow application by the system. The DTE determines, based on source and destination devices of the transfer, a transfer method and a transfer channel to transfer the stage data between memories coupled to the source and destination devices. The DTE acquires hardware resources of the computing system to transfer the stage data using the channel and initiates the transfer. A computer program product can cause one or more processors to perform the method. A computing system can comprise source and destination processors and memories, hardware channels to transfer data between the memories, a resource manager, and a DTE configured to perform the method.
-
Publication No.: US20220198117A1
Publication date: 2022-06-23
Application No.: US17586571
Filing date: 2022-01-27
Inventors: Martin Russell RAUMANN, Qi ZHENG, Bandish B. SHAH, Ravinder KUMAR, Kin Hing LEUNG, Sumti JAIRATH, Gregory Frederick GROHOSKI
Abstract: A system for executing a graph partitioned across a plurality of reconfigurable computing units includes a processing node that has a first computing unit reconfigurable at a first level of configuration granularity and a second computing unit reconfigurable at a second, finer, level of configuration granularity. The first computing unit is configured by a host system to execute a first dataflow segment of the graph using one or more dataflow pipelines to generate a first intermediate result and to provide the first intermediate result to the second computing unit without passing through the host system. The second computing unit is configured by the host system to execute a second dataflow segment of the graph, dependent upon the first intermediate result, to generate a second intermediate result and to send the second intermediate result to a third computing unit, without passing through the host system, to continue execution of the graph.
-
Publication No.: US20220198114A1
Publication date: 2022-06-23
Application No.: US17379921
Filing date: 2021-07-19
Inventors: Martin Russell RAUMANN, Qi ZHENG, Bandish B. SHAH, Ravinder KUMAR, Kin Hing LEUNG, Sumti JAIRATH, Gregory Frederick GROHOSKI
Abstract: Roughly described, the invention involves a system including a plurality of functional units that execute different segments of a dataflow, and share intermediate results via a peer-to-peer messaging protocol. The functional units are reconfigurable, with different units being reconfigurable at different levels of granularity. The peer-to-peer messaging protocol includes control tokens or other mechanisms by which the consumer of the intermediate results learns that data has been transferred, and in response thereto triggers its next dataflow segment. A host or configuration controller configures the data units with their respective dataflow segments, but once execution of the configured dataflow begins, no host need be involved in orchestrating data synchronization, the transfer of intermediate results, or the triggering of processing after the data are received. Control overhead is therefore minimized.
-
Publication No.: US20230333879A1
Publication date: 2023-10-19
Application No.: US18133632
Filing date: 2023-04-12
Inventors: Arnav GOEL, Ravinder KUMAR, Qi ZHENG, Milad SHARIF, Jiayu BAI, Neal SANGHVI
CPC classification: G06F9/4843, G06F9/44505, G06F9/5016
Abstract: A data processing system is presented that is configured as a server in a client-server configuration for executing applications that a client in the client-server configuration can offload as execution tasks for execution on the server. The data processing system includes a reconfigurable processor, a storage device that stores configuration files for the applications, and a host processor that is coupled to the storage device and to the reconfigurable processor. The host processor is configured to receive an execution task of the execution tasks with an identifier of an application from the client, retrieve a configuration file that is associated with the application from the storage device using the identifier of the application, configure the reconfigurable processor with the configuration file, and start execution of the application on the reconfigurable processor, whereby the reconfigurable processor provides output data of the execution of the application to the client.
-
Publication No.: US20230205613A1
Publication date: 2023-06-29
Application No.: US18087104
Filing date: 2022-12-22
Inventors: Joshua POLZIN, Conrad Alexander TURLIK, Arnav GOEL, Qi ZHENG, Maran WILSON, Neal SANGHVI
CPC classification: G06F9/544, G06F9/5016
Abstract: A method of pipelining execution stages of a pipelined application can comprise a Buffer Pipeline Manager (BPM) of a Buffer Pipelined Application computing System (BPAS) allocating pipeline buffers, configuring access to the pipeline buffers by stage processors of the system, transferring buffers from one stage processor to a successor stage processor, and transferring data from a buffer in one memory to a buffer in an alternative memory. The BPM can allocate the buffers based on execution parameters associated with the pipelined application and/or stage processors. The BPM can transfer data to a buffer in an alternative memory based on performance, capacity, and/or topological attributes of the memories and/or processors utilizing the memories. The BPM can perform operations of the method responsive to interfaces of a Pipeline Programming Interface (PPI). A BPAS can comprise hardware processors, physical memories, stage processors, an application execution program, the PPI, and the BPM.
-
Publication No.: US20240338297A1
Publication date: 2024-10-10
Application No.: US18244677
Filing date: 2023-09-11
Inventors: Arnav GOEL, Qi ZHENG, Guoyao FENG, Chen YANG, Jianding LUO
CPC classification: G06F11/3644, G06F11/3636, G06F15/7871
Abstract: A data processing system includes an array of reconfigurable units and a compiler configured to generate one or more configuration files for an application for execution on one or more reconfigurable processors. The data processing system further includes an execution flow logic which is configured to cause execution of the configuration files on the reconfigurable processors to be dependent upon one or more breakpoint conditions. The data processing system further includes a runtime logic configured to execute the configuration files depending upon the breakpoint conditions. A corresponding method is also disclosed herein.
-
Publication No.: US20220269534A1
Publication date: 2022-08-25
Application No.: US17185264
Filing date: 2021-02-25
Inventors: Anand MISRA, Arnav GOEL, Qi ZHENG, Raghunath SHENBAGAM, Ravinder KUMAR
Abstract: A method for executing applications in a system comprising general hardware and reconfigurable hardware includes accessing a first execution file comprising metadata storing a first priority indicator associated with a first application, and a second execution file comprising metadata storing a second priority indicator associated with a second application. In an example, use of the reconfigurable hardware is interleaved between the first application and the second application, and the interleaving is scheduled to take into account (i) workload of the reconfigurable hardware and (ii) the first priority indicator and the second priority indicator associated with the first application and the second application, respectively. In an example, when the reconfigurable hardware is used by one of the first and second applications, the general hardware is used by another of the first and second applications.
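Illustrative note (not part of the patent record): the priority-aware interleaving described in this abstract can be pictured with a small sketch. It is not the claimed method. It assumes a smooth weighted round-robin policy, a hypothetical `App` record, and integer priorities where a larger value means more urgent; it models only the priority indicators and ignores the hardware workload term.

```python
from dataclasses import dataclass


@dataclass
class App:
    name: str
    priority: int          # hypothetical: larger value = more urgent
    remaining_slices: int  # hypothetical: reconfigurable-hardware slices still needed


def interleave(apps):
    """Return the order in which apps receive reconfigurable-hardware slices.

    Smooth weighted round-robin: every pending app earns credit equal to its
    priority each round, the app with the most credit runs, and its credit is
    reduced by the total priority of all pending apps.
    """
    schedule = []
    credits = {a.name: 0 for a in apps}
    pending = [a for a in apps if a.remaining_slices > 0]
    while pending:
        for a in pending:
            credits[a.name] += a.priority
        chosen = max(pending, key=lambda a: credits[a.name])
        credits[chosen.name] -= sum(a.priority for a in pending)
        chosen.remaining_slices -= 1
        schedule.append(chosen.name)
        pending = [a for a in pending if a.remaining_slices > 0]
    return schedule


if __name__ == "__main__":
    print(interleave([App("A", 2, 4), App("B", 1, 2)]))
```

In this sketch the higher-priority application receives roughly twice as many slices per cycle. Under the abstract's last example, an application not chosen for a slice could use the general hardware during that slice.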
-
Publication No.: US20220197714A1
Publication date: 2022-06-23
Application No.: US17582925
Filing date: 2022-01-24
Inventors: Martin Russell RAUMANN, Qi ZHENG, Bandish B. SHAH, Ravinder KUMAR, Kin Hing LEUNG, Sumti JAIRATH, Gregory Frederick GROHOSKI
Abstract: A system for training parameters of a neural network includes a processing node with a processor reconfigurable at a first level of configuration granularity and a controller reconfigurable at a finer level of configuration granularity. The processor is configured to execute a first dataflow segment of the neural network with training data to generate a predicted output value using a set of neural network parameters, calculate a first intermediate result for a parameter based on the predicted output value, a target output value, and a parameter gradient, and provide the first intermediate result to the controller. The controller is configured to receive a second intermediate result over a network, and execute a second dataflow segment, dependent upon the first intermediate result and the second intermediate result, to generate a third intermediate result indicative of an update of the parameter.
-
Publication No.: US20230409395A1
Publication date: 2023-12-21
Application No.: US18211962
Filing date: 2023-06-20
Inventors: Ravinder KUMAR, Conrad Alexander TURLIK, Arnav GOEL, Qi ZHENG, Raghunath SHENBAGAM, Anand MISRA, Ananda Reddy VAYYALA
CPC classification: G06F9/5011, G06F9/5016, G06F15/7871, G06F15/7867, G06F9/5077, G06F2209/501, G06F2209/5011
Abstract: A data processing system comprises a pool of reconfigurable data flow resources and a runtime processor. The pool of reconfigurable data flow resources includes arrays of physical configurable units and memory. The runtime processor includes logic to receive a plurality of configuration files for user applications. The configuration files include configurations of virtual data flow resources required to execute the user applications. The runtime processor also includes logic to allocate physical configurable units and memory in the pool of reconfigurable data flow resources to the virtual data flow resources and load the configuration files to the allocated physical configurable units. The runtime processor further includes logic to execute the user applications using the allocated physical configurable units and memory.
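Illustrative note (not part of the patent record): the allocation step in this abstract, mapping the virtual resources requested by a configuration file onto physical units and memory from a shared pool, might be sketched as below. The `Pool` class, its field names, and the lowest-numbered-unit-first policy are assumptions made for illustration and are not taken from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class Pool:
    free_units: set            # hypothetical IDs of idle physical configurable units
    free_memory_mb: int        # hypothetical unallocated memory in the pool
    allocations: dict = field(default_factory=dict)

    def allocate(self, app, units_needed, memory_mb):
        """Reserve units and memory for app; return unit IDs, or None if the pool is short."""
        if units_needed > len(self.free_units) or memory_mb > self.free_memory_mb:
            return None
        units = sorted(self.free_units)[:units_needed]  # lowest-numbered units first
        self.free_units -= set(units)
        self.free_memory_mb -= memory_mb
        self.allocations[app] = (units, memory_mb)
        return units

    def release(self, app):
        """Return an application's units and memory to the pool."""
        units, memory_mb = self.allocations.pop(app)
        self.free_units |= set(units)
        self.free_memory_mb += memory_mb


if __name__ == "__main__":
    pool = Pool(free_units={0, 1, 2, 3}, free_memory_mb=1024)
    print(pool.allocate("app1", 3, 512))
    print(pool.allocate("app2", 2, 256))
```

A request that does not fit leaves the pool unchanged, so a caller can retry it after another application releases its resources.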
-
Publication No.: US20230388373A1
Publication date: 2023-11-30
Application No.: US18200311
Filing date: 2023-05-22
Inventors: Milad SHARIF, Ravinder KUMAR, Qi ZHENG, Neal SANGHVI, Jiayu BAI, Arnav GOEL
IPC classification: H04L67/1014, H04L67/1097
CPC classification: H04L67/1014, H04L67/1097
Abstract: A data processing system is presented in a client-server configuration for executing first and second applications that a client in the client-server configuration can offload for execution onto the data processing system. The data processing system includes a server and a pool of reconfigurable data flow resources that is configured to execute the first application in a first runtime context and the second application in a second runtime context. The server is configured to establish a session with the client, receive first and second execution requests for executing the first application and the second application from the client, start respective first and second execution of the first and second applications in the respective first and second runtime contexts in response to receiving the first and second execution requests, and balance a first load from the first execution with a second load from the second execution.