Abstract:
An example method of managing a plurality of hardware accelerators in a computing system includes executing workload management software in the computing system configured to allocate a plurality of jobs in a job queue among a pool of resources in the computing system; monitoring the job queue to determine required hardware functionalities for the plurality of jobs; provisioning at least one hardware accelerator of the plurality of hardware accelerators to provide the required hardware functionalities; configuring a programmable device of each provisioned hardware accelerator to implement at least one of the required hardware functionalities; and notifying the workload management software that each provisioned hardware accelerator is an available resource in the pool of resources.
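The monitor, provision, configure, and notify steps described above can be sketched roughly as follows. This is a minimal illustration, not the claimed method: the `Accelerator` and `WorkloadManager` classes, the `needs` job field, and the one-function-per-accelerator policy are all assumptions made for the example, and "configuring" is reduced to setting an attribute.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Accelerator:
    """Hypothetical accelerator with one programmable device."""
    name: str
    function: Optional[str] = None  # functionality currently configured, if any


@dataclass
class WorkloadManager:
    """Hypothetical stand-in for the workload management software."""
    pool: list = field(default_factory=list)

    def notify_available(self, accel: Accelerator) -> None:
        # Record the accelerator as an available resource in the pool.
        self.pool.append(accel.name)


def required_functions(job_queue):
    """Monitor the queue: collect distinct required functionalities, in order."""
    seen = []
    for job in job_queue:
        if job["needs"] not in seen:
            seen.append(job["needs"])
    return seen


def provision(job_queue, accelerators, manager):
    """Provision unconfigured accelerators, configure them, notify the manager."""
    free = [a for a in accelerators if a.function is None]
    for fn, accel in zip(required_functions(job_queue), free):
        accel.function = fn  # assumption: stands in for loading a configuration
        manager.notify_available(accel)
    return manager.pool
```

In this sketch, functionalities that outnumber the free accelerators are simply left unserved; a real scheduler would need a policy for that case.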
Abstract:
An integrated circuit (IC) includes a first region being static and providing an interface between the IC and a host processor. The first region includes a first interconnect circuit block having a first master interface and a second interconnect circuit block having a first slave interface. The IC includes a second region coupled to the first region. The second region implements a kernel of a heterogeneous, multiprocessor design and includes a slave interface coupled to the first master interface of the first interconnect circuit block and configured to receive commands from the host processor. The second region also includes a master interface coupled to the first slave interface of the second interconnect circuit block, wherein the master interface of the second region is a master for a memory controller.
Abstract:
A software development-based compilation flow for circuit design may include executing, using a processor, a makefile including a plurality of rules for hardware implementation. Responsive to executing a first rule of the plurality of rules, a source file including a kernel specified in a high level programming language may be selected, and an intermediate file specifying a register transfer level implementation of the kernel may be generated using the processor. Responsive to executing a second rule of the plurality of rules, a configuration bitstream for a target integrated circuit may be generated from the intermediate file using the processor. The configuration bitstream includes a compute unit circuit implementation of the kernel.
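The two-rule flow above (source to register transfer level intermediate, then intermediate to bitstream) can be modeled as a tiny rule table with dependency resolution. This is a sketch under stated assumptions: the file names, the `hls` and `implement` labels, and the `build` helper are hypothetical, and each action only returns a description string rather than invoking any real tool.

```python
def build(target, rules, log):
    """Run the rule for `target` after first building any prerequisite with a rule."""
    deps, action = rules[target]
    for dep in deps:
        if dep in rules:  # prerequisites without a rule are treated as source files
            build(dep, rules, log)
    log.append(action(target, deps))
    return log


# Hypothetical rule table: target -> (prerequisites, action).
RULES = {
    # First rule: select the kernel source and emit an RTL intermediate file.
    "kernel.rtl": (["kernel.cl"], lambda t, d: f"hls {d[0]} -> {t}"),
    # Second rule: generate a configuration bitstream from the intermediate file.
    "kernel.bit": (["kernel.rtl"], lambda t, d: f"implement {d[0]} -> {t}"),
}
```

Calling `build("kernel.bit", RULES, [])` runs the first rule before the second, mirroring how a makefile orders rules by their prerequisites.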
Abstract:
OpenCL program compilation may include generating, using a processor, a register transfer level (RTL) description of a first kernel of a heterogeneous, multiprocessor design and integrating the RTL description of the first kernel with a base platform circuit design. The base platform circuit design provides a static interface within a programmable integrated circuit to a host of the heterogeneous, multiprocessor design. A first configuration bitstream may be generated from the RTL description of the first kernel using the processor. The first configuration bitstream specifies a hardware implementation of the first kernel and supporting data for the configuration bitstream. The first configuration bitstream and the supporting data may be included within a binary container.
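One way to picture a binary container holding a configuration bitstream together with its supporting data is a length-prefixed layout. The format below is purely an assumption for illustration (a two-field little-endian header, JSON-encoded supporting data, then the raw bitstream) and is not the container format the abstract refers to.

```python
import json
import struct

HEADER = "<II"  # assumed header: metadata length, bitstream length (little-endian)


def pack_container(bitstream: bytes, supporting: dict) -> bytes:
    """Bundle a bitstream and its supporting data into one byte string."""
    meta = json.dumps(supporting, sort_keys=True).encode("utf-8")
    return struct.pack(HEADER, len(meta), len(bitstream)) + meta + bitstream


def unpack_container(blob: bytes):
    """Recover (bitstream, supporting data) from a packed container."""
    meta_len, bit_len = struct.unpack_from(HEADER, blob, 0)
    offset = struct.calcsize(HEADER)
    meta = json.loads(blob[offset:offset + meta_len].decode("utf-8"))
    start = offset + meta_len
    return blob[start:start + bit_len], meta
```

Keeping the supporting data next to the bitstream lets a host runtime read both from a single file.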
Abstract:
Implementing hardware accelerators using programmable integrated circuits may include performing, using a processor, a design flow on a static circuit design. The static circuit design may specify a region reserved for a hardware accelerator and a static region comprising interface circuitry configured to couple the hardware accelerator with an external node. The design flow may generate an implemented static circuit design. Metadata describing the interface circuitry may be generated using a processor. A device support archive including the implemented static circuit design and the metadata may be written, using the processor, to a computer readable storage medium.
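A device support archive of the kind described above could be sketched as a zip file carrying the implemented static design alongside metadata about the interface circuitry. The member names `static_design.bin` and `metadata.json`, the metadata fields, and the choice of zip are assumptions for this example only.

```python
import io
import json
import zipfile


def write_device_support_archive(static_design: bytes, interface_metadata: dict) -> bytes:
    """Return archive bytes holding the implemented design and its metadata."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("static_design.bin", static_design)  # implemented static design
        zf.writestr("metadata.json", json.dumps(interface_metadata, indent=2))
    return buf.getvalue()


def read_metadata(archive: bytes) -> dict:
    """Load the interface metadata back out of an archive."""
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        return json.loads(zf.read("metadata.json"))
```

The returned bytes can be written to any storage medium with an ordinary binary file write.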