Abstract:
A reconfigurable array is provided. The reconfigurable array includes a Very Long Instruction Word (VLIW) mode and a Coarse-Grained Array (CGA) mode. When the VLIW mode is converted to the CGA mode, instead of sharing a central register file between the VLIW mode and the CGA mode, live data to be used in the CGA mode is copied from the central register file to local register files.
Abstract:
A processor and a memory management method are provided. The processor includes a processor core, a cache which transceives data to/from the processor core via a single port, and stores the data accessed by the processor core, and a Scratch Pad Memory (SPM) which transceives the data to/from the processor core via at least one of a plurality of multi ports.
Abstract:
A cache memory system using temporal locality information and a data storage method are provided. The cache memory system including: a main cache which stores data accessed by a central processing unit; an extended cache which stores the data if the data is evicted from the main cache; and a separation cache which stores the data of the extended cache when the data of the extended cache is evicted from the extended cache and temporal locality information corresponding to the data of the extended cache satisfies a predetermined condition.
Abstract:
Described herein is a reconfigurable processor which uses a distributed configuration memory structure and an operation method thereof in which power consumption is reduced. A processing unit which configures the reconfigurable processor includes a functional unit, a distributed configuration memory, a no-operation (NOP) register, and a controller. The NOP register stores information which represents whether or not a NOP operation is performed at each clock cycle. The controller controls to deactivate the distributed configuration memory at a clock cycle at which a NOP operation is performed.
Abstract:
An apparatus and method for compressing trace data is provided. The apparatus includes a detection unit configured to detect trace data corresponding to one or more function units performing a substantially significant operation in a reconfigurable processor as valid trace data, and a compression unit configured to compress the valid trace data.
Abstract:
A method and apparatus for thread synchronization is provided. The apparatus for thread synchronization includes a reader configured to generate a data read request, a writer configured to generate a data write request, a register file configured to have a full status indicating that the register file stores data and an empty status indicating that the register file stores no data, and a controller configured to receive the data read request from the reader or the data write request from the writer, and to process the received data read request or the received data write request while stalling or releasing the reader or the writer according to whether the register file is in the full status or in the empty status and according to an operating status of the reader or the writer.
Abstract:
An apparatus and method for dynamically determining the execution mode of a reconfigurable array are provided. Performance information of a loop may be obtained before and/or during the execution of the loop. The performance information may be used to determine whether to operate the apparatus in a very long instruction word (VLIW) mode or in a coarse grained array (CGA) mode.
Abstract:
An apparatus and method that includes a single memory as a VLIW instruction cache and CGA configuration memory is provided. Data is provided from a storage unit to a processing core that is capable of processing data in a first mode and a second mode. If the processing core is processing in the first mode, first data is output. If the processing core is processing in the second mode, second data is output.
Abstract:
An instruction compressing apparatus and method for a parallel processing computer such as a very long instruction word (VLIW) computer, are provided. The instruction compressing apparatus includes a bundle code generating unit, an instruction compressing unit, and an instruction converting unit. The bundle code generating unit may generate a bundle code in response to an input of instructions to be compressed. The bundle code may indicate whether a current instruction group is terminated, and also whether an instruction group following the current instruction group is a no-operation (NOP) instruction group. The instruction compressing unit may remove a NOP instruction and/or a NOP instruction group from the input instructions according to the generated bundle code. The instruction converting unit may include the generated bundle code in the remaining instructions which have not been removed by the instruction compressing unit.
Abstract:
An apparatus and method for generating a very long instruction word (VLIW) command that supports predicated execution, and a VLIW processor and method for processing a VLIW are provided herein. The VLIW command includes an instruction bundle formed of a plurality of instructions to be executed in parallel and a single value indicating predicated execution, and is generated using the apparatus and method for generating a VLIW command. The VLIW processor decodes the instruction bundle and executes the instructions, which are included in the decoded instruction bundle, in parallel, according to the value indicating predicated execution.