Abstract:
A processor for batch thread processing includes a central register file and one or more function unit batches, each of which includes two or more function units and one or more ports for accessing the central register file. The function units of each function unit batch receive an instruction batch containing one or more instructions and execute those instructions sequentially.
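The sequential execution of an instruction batch against a shared register file can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the abstract: the `FunctionUnitBatch` class, the `(opcode, dst, src_a, src_b)` instruction tuple, the three opcodes, and the rule that a batch holds at most one instruction per function unit.

```python
import operator
from typing import List, Tuple

# Hypothetical instruction format: (opcode, dst_reg, src_reg_a, src_reg_b).
Instruction = Tuple[str, int, int, int]

# Hypothetical opcode table, chosen only for the example.
OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}


class FunctionUnitBatch:
    """Toy model of a function unit batch sharing a central register file."""

    def __init__(self, register_file: List[int], num_units: int = 2) -> None:
        if num_units < 2:
            raise ValueError("a function unit batch needs two or more units")
        self.register_file = register_file  # shared, central register file
        self.num_units = num_units

    def execute(self, batch: List[Instruction]) -> None:
        """Run the instructions of one batch strictly in order."""
        # Assumption: at most one instruction per function unit.
        if len(batch) > self.num_units:
            raise ValueError("instruction batch larger than the unit batch")
        for opcode, dst, src_a, src_b in batch:
            a = self.register_file[src_a]
            b = self.register_file[src_b]
            self.register_file[dst] = OPS[opcode](a, b)


regs = [0, 2, 3, 0, 0]
fub = FunctionUnitBatch(regs, num_units=2)
# The second instruction consumes the result of the first.
fub.execute([("add", 3, 1, 2), ("mul", 4, 3, 3)])
```

Because the instructions run in order, the second instruction sees the value written by the first; after the call, `regs[3]` is 5 and `regs[4]` is 25.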
Abstract:
An apparatus and method for supporting multiple execution modes are provided. The apparatus may include an instruction distributor configured to select, according to a current execution mode, at least one instruction from among a plurality of received instructions that each include an operand and an opcode, and to transfer the opcode of each selected instruction to a plurality of functional units; an operand switch controller configured to generate, based on the operand of each selected instruction, switch configuration information for the routing needed to execute that instruction; and an operand switch configured to route, based on the switch configuration information, a functional unit output or a register file output to either a functional unit input or a register file input.
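The routing performed by the operand switch can be pictured as a small crossbar driven by a configuration table. The sketch below is only an illustration under stated assumptions: the `route` function, the `("fu", index)` / `("rf", index)` port naming, and the dictionary form of the switch configuration are hypothetical and do not come from the abstract.

```python
from typing import Dict, List, Tuple

# Hypothetical port name: ("fu", i) is functional unit i, ("rf", i) is
# register file port i.
Port = Tuple[str, int]


def route(
    config: Dict[Port, Port],
    fu_outputs: List[int],
    rf_outputs: List[int],
) -> Tuple[Dict[int, int], Dict[int, int]]:
    """Deliver each configured source output to its destination input.

    `config` maps a destination port to the source port that feeds it.
    Returns (functional unit inputs, register file inputs), keyed by index.
    """
    fu_inputs: Dict[int, int] = {}
    rf_inputs: Dict[int, int] = {}
    for (dst_kind, dst_idx), (src_kind, src_idx) in config.items():
        if src_kind == "fu":
            value = fu_outputs[src_idx]
        elif src_kind == "rf":
            value = rf_outputs[src_idx]
        else:
            raise ValueError(f"unknown source kind: {src_kind}")
        if dst_kind == "fu":
            fu_inputs[dst_idx] = value
        elif dst_kind == "rf":
            rf_inputs[dst_idx] = value
        else:
            raise ValueError(f"unknown destination kind: {dst_kind}")
    return fu_inputs, rf_inputs


# Feed FU 0 from register file port 1, and write FU 1's result back to
# register file port 0.
switch_config = {("fu", 0): ("rf", 1), ("rf", 0): ("fu", 1)}
fu_in, rf_in = route(switch_config, fu_outputs=[7, 9], rf_outputs=[1, 2])
```

In this model, changing the execution mode would amount to generating a different `switch_config`; the switch itself stays the same.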
Abstract:
A swizzle pattern generator is provided to reduce the overhead of executing swizzle instructions in vector processing. The swizzle pattern generator is configured to provide swizzle patterns for the data sets of at least one vector register or vector processing unit. The swizzle pattern generator may be reconfigurable so that it can generate different swizzle patterns for different vector operations.
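A swizzle is a permutation or replication of the lanes of a vector, so a swizzle pattern can be modeled as a list of source lane indices. The following sketch shows that idea in software only; the function names and the three example patterns (reverse, broadcast, rotate) are assumptions chosen for illustration, not patterns named in the abstract.

```python
from typing import List, Sequence


def apply_swizzle(vector: Sequence[int], pattern: Sequence[int]) -> List[int]:
    """Return a vector whose lane i holds vector[pattern[i]]."""
    if any(not 0 <= src < len(vector) for src in pattern):
        raise ValueError("pattern refers to a lane outside the vector")
    return [vector[src] for src in pattern]


def reverse_pattern(width: int) -> List[int]:
    """Pattern that reverses lane order."""
    return list(range(width - 1, -1, -1))


def broadcast_pattern(width: int, lane: int) -> List[int]:
    """Pattern that copies one source lane into every lane."""
    return [lane] * width


def rotate_pattern(width: int, shift: int) -> List[int]:
    """Pattern that rotates lanes left by `shift` positions."""
    return [(i + shift) % width for i in range(width)]


v = [10, 20, 30, 40]
reversed_v = apply_swizzle(v, reverse_pattern(4))        # [40, 30, 20, 10]
broadcast_v = apply_swizzle(v, broadcast_pattern(4, 2))  # [30, 30, 30, 30]
rotated_v = apply_swizzle(v, rotate_pattern(4, 1))       # [20, 30, 40, 10]
```

Producing the pattern separately from the data movement mirrors the idea of a reconfigurable generator: a new vector operation needs only a new pattern, not a new swizzle instruction sequence.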
Abstract:
A method of compiling a program to be executed on a multicore processor is provided. The method may include generating an initial solution by mapping a task to a source processing element (PE) and a destination PE and selecting a communication scheme for transmitting the task from the source PE to the destination PE; approximately optimizing the mapping and the communication scheme in the initial solution; and scheduling the task. The communication scheme is designated during compilation.
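The first two steps, building an initial solution and then approximately optimizing it, can be sketched as a simple hill climb. This is a toy under loud assumptions: the two scheme names, the cost model in `comm_cost` and `total_cost`, the round-robin initial mapping, and the greedy search are all hypothetical stand-ins, since the abstract does not specify them. Scheduling is left out of the sketch.

```python
from itertools import permutations
from typing import Dict, Tuple

# Hypothetical communication schemes, named only for this example.
SCHEMES = ("shared_memory", "message_passing")

# One task's assignment: (source PE, destination PE, communication scheme).
Assignment = Tuple[int, int, str]


def comm_cost(size: int, scheme: str) -> int:
    """Made-up cost model: no setup but slow, or fixed setup but fast."""
    return 2 * size if scheme == "shared_memory" else 4 + size


def total_cost(sol: Dict[str, Assignment], sizes: Dict[str, int], num_pes: int) -> int:
    """Made-up objective: heaviest destination load plus all transfer costs."""
    load = [0] * num_pes
    comm = 0
    for task, (_src, dst, scheme) in sol.items():
        load[dst] += sizes[task]
        comm += comm_cost(sizes[task], scheme)
    return max(load) + comm


def initial_solution(sizes: Dict[str, int], num_pes: int) -> Dict[str, Assignment]:
    """Round-robin mapping with the first scheme everywhere."""
    return {
        task: (i % num_pes, (i + 1) % num_pes, SCHEMES[0])
        for i, task in enumerate(sizes)
    }


def optimize(sol: Dict[str, Assignment], sizes: Dict[str, int], num_pes: int) -> Dict[str, Assignment]:
    """Greedy local search: accept any single-task change that lowers cost."""
    best = dict(sol)
    best_cost = total_cost(best, sizes, num_pes)
    improved = True
    while improved:
        improved = False
        for task in sizes:
            for src, dst in permutations(range(num_pes), 2):
                for scheme in SCHEMES:
                    trial = dict(best)
                    trial[task] = (src, dst, scheme)
                    cost = total_cost(trial, sizes, num_pes)
                    if cost < best_cost:
                        best, best_cost, improved = trial, cost, True
    return best


task_sizes = {"t0": 1, "t1": 8, "t2": 3}
start = initial_solution(task_sizes, num_pes=2)
tuned = optimize(start, task_sizes, num_pes=2)
```

With these made-up costs, the search lowers the objective from 32 to 28 by switching the largest task, `t1`, to `message_passing`, while the small tasks stay on `shared_memory`. The point is only that the mapping and the communication scheme are chosen together, at compile time.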
Abstract:
A coarse-grained reconfigurable processor with an improved code compression rate, and a code decompression method for it, are provided to reduce the required capacity of the configuration memory and the power consumption of the processor chip. The processor includes a configuration memory that stores reconfiguration information comprising a header, which stores a compression mode indicator and a compressed code for each of a plurality of units, and a body, which stores at least one uncompressed code; a decompressor that, based on the compression mode indicator and the compressed code in the header, identifies which of the uncompressed codes in the body corresponds to each unit; and a reconfigurator that includes a plurality of processing elements (PEs) and reconfigures the data paths of the PEs based on the code corresponding to each unit.
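The header/body split can be illustrated with a minimal decompressor sketch. The encoding is an assumption made for illustration: the abstract does not define the compression modes, so the two modes below (`MODE_NOP`, `MODE_BODY`), the per-unit `(mode, compressed_code)` header entry, and the `NOP_CODE` value are hypothetical.

```python
from typing import List, Tuple

# Hypothetical compression modes, invented for this sketch.
MODE_NOP = 0   # unit is idle: no body entry is needed
MODE_BODY = 1  # compressed code is an index into the body

NOP_CODE = 0x0  # hypothetical "do nothing" configuration code

# One header entry per unit: (compression mode indicator, compressed code).
HeaderEntry = Tuple[int, int]


def decompress(header: List[HeaderEntry], body: List[int]) -> List[int]:
    """Return the full configuration code for every unit, in unit order."""
    codes: List[int] = []
    for mode, compressed in header:
        if mode == MODE_NOP:
            codes.append(NOP_CODE)
        elif mode == MODE_BODY:
            # The short compressed code selects one uncompressed body entry.
            codes.append(body[compressed])
        else:
            raise ValueError(f"unknown compression mode: {mode}")
    return codes


# Four units, but only two distinct uncompressed codes are stored.
header = [(MODE_BODY, 0), (MODE_NOP, 0), (MODE_BODY, 1), (MODE_BODY, 0)]
body = [0xA1, 0xB2]
unit_codes = decompress(header, body)  # [0xA1, 0x0, 0xB2, 0xA1]
```

The saving in this toy comes from two sources: idle units consume no body storage, and units that share a configuration point at the same body entry.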