Abstract:
A unified container file can be selected using computer hardware. The unified container file can include a plurality of files embedded therein used to configure a programmable integrated circuit (IC). The plurality of files can include a first partial configuration bitstream and a second partial configuration bitstream. The unified container file also includes metadata specifying a defined relationship between the first partial configuration bitstream and the second partial configuration bitstream for programming the programmable IC. The defined relationship can be determined using computer hardware by reading the metadata from the unified container file. The programmable IC can be configured, using the computer hardware, based on the defined relationship specified by the metadata using the first partial configuration bitstream and the second partial configuration bitstream.
Abstract:
Remote kernel execution in a heterogeneous computing system can include executing, using a device processor of a device communicatively linked to a host processor, a device runtime and receiving from the host processor within a hardware submission queue of the device, a command. The command requests execution of a software kernel and specifies a descriptor stored in a region of a memory of the device shared with the host processor. In response to receiving the command, the device runtime, as executed by the device processor, invokes a runner program associated with the software kernel. The runner program can map a physical address of the descriptor to a virtual memory address corresponding to the descriptor that is usable by the software kernel. The runner program can execute the software kernel. The software kernel can access data specified by the descriptor using the virtual memory address as provided by the runner program.
Abstract:
Remote kernel execution in a heterogeneous computing system can include executing, using a device processor of a device communicatively linked to a host processor, a device runtime and receiving from the host processor within a hardware submission queue of the device, a command. The command requests execution of a software kernel and specifies a descriptor stored in a region of a memory of the device shared with the host processor. In response to receiving the command, the device runtime, as executed by the device processor, invokes a runner program associated with the software kernel. The runner program can map a physical address of the descriptor to a virtual memory address corresponding to the descriptor that is usable by the software kernel. The runner program can execute the software kernel. The software kernel can access data specified by the descriptor using the virtual memory address as provided by the runner program.
Abstract:
A software development-based compilation flow for circuit design may include executing, using a processor, a makefile including a plurality of rules for hardware implementation. Responsive to executing a first rule of the plurality of rules, a source file including a kernel specified in a high level programming language may be selected; and, an intermediate file specifying a register transfer level implementation of the kernel may be generated using the processor. Responsive to executing a second rule of the plurality of rules, a configuration bitstream for a target integrated circuit may be generated from the intermediate file using the processor. The configuration bitstream includes a compute unit circuit implementation of the kernel.
Abstract:
A method, non-transitory computer readable medium and apparatus for using an out-of-context sub-block in a hierarchical design flow for an integrated circuit are disclosed. For example, the method identifies one or more sub-blocks in the hierarchical design flow that are eligible for creating the out-of-context sub-block, receives a selection of one of the one or more sub-blocks that are eligible and creates the out-of-context sub-block for the one of the one or more sub-blocks that is selected.
Abstract:
A computer program product can include a non-transitory computer readable storage medium storing a unified container. The unified container can include a header structure, wherein the header structure has a fixed length and specifies a number of section headers included in the unified container. The unified container can include a plurality of section headers equivalent to the number of section headers specified in the header structure. The unified container can include a plurality of data sections corresponding to the plurality of section headers on a one-to-one basis. The plurality of data sections includes a first data section including a hardware binary and a second data section including a software binary. The hardware binary and the software binary are configured to program a programmable integrated circuit. Each section header specifies a type of data stored in the corresponding data section and specifies a mapping for the corresponding data section.
Abstract:
Modifying initialization data for a memory array of a circuit design can include providing, using a processor, portions of an incoming stream of data for initializing the memory array to emulation objects of a memory array emulator. The memory array emulator is configured to emulate an implementation of the memory array and the emulation objects represent block random access memories (block RAMs) of the memory array. Using the processor, the data can be formatted using the emulation objects to generate initialization data, wherein the data is formatted based upon configuration settings of the block RAMs emulated by the respective emulation objects. A configuration bitstream can be updated, using the processor, with the initialization data.
Abstract:
Using a flat shell for an accelerator card includes reading a flat shell from one or more computer readable storage media using computer hardware, wherein the flat shell is a synthesized, unplaced, and unrouted top-level circuit design specifying platform circuitry. A kernel specifying user circuitry is synthesized using the computer hardware. The kernel is obtained from the one or more computer readable storage media. The synthesized kernel is linked, using the computer hardware, to the flat shell forming a unified circuit design. The unified circuit design is placed and routed, using the computer hardware, to generate a placed and routed circuit design specifying the platform circuitry and the user circuitry for implementation in an integrated circuit.
Abstract:
An example computing system includes: a processing system, a hardware accelerator coupled to the processing system, and a software platform executing on the processing system. The hardware accelerator includes: a programmable integrated circuit (IC) configured with an acceleration circuit having a static region and a programmable region; a memory in the programmable IC configured to store metadata describing interface circuitry in at least one of the static region and the programmable region of the acceleration circuit. The software platform includes program code executable by the processing system to read the metadata from the memory of the hardware accelerator.
Abstract:
Constraint handling for a circuit design may include determining, using a processor, instances of parameterizable modules of a circuit design associated with constraints based upon a predefined hardware description language attribute within the instances, extracting, using the processor, parameter values from the instances of the parameterizable modules, and generating, using the processor, static constraint files for the instances of the parameterizable modules using the extracted parameter values.