Abstract:
A neural network accelerator in which processing elements are configured in a systolic array structure includes a memory to store a plurality of feature data including first and second feature data and a plurality of kernel data including first and second kernel data, a first processing element to perform an operation based on the first feature data and the first kernel data and output the first feature data, a selection circuit to select one of the first feature data and the second feature data, based on a control signal, and output the selected feature data, a second processing element to perform an operation based on the selected feature data and one of the first and the second kernel data, and a controller to generate the control signal, based on a neural network characteristic associated with the plurality of feature data and kernel data.
Abstract:
A method and an apparatus for generating an address of data for an artificial neural network through steps of: performing an N-dimensional loop operation for generating the address of the data based on predetermined parameters, and generating the address of the data in order according to a predetermined direction are provided.
Abstract:
An Ultra Low Voltage (ULV) digital circuit includes a logic circuit comprising a plurality of logic gates and a plurality of buffered interconnects for connecting between the plurality of logic gates, a temperature sensor configured to detect a temperature of the logic circuit, and a voltage controller configured to control a driving voltage provided to the logic circuit in order to reduce a power consumption of the logic circuit based on the detected temperature. Each of the plurality of logic gates and buffered interconnects reduces a signal delay as a temperature increases.
Abstract:
Disclosed herein is a method and apparatus for injecting a fault and analyzing fault tolerance. The fault tolerance analysis apparatus extracts design information from a design. The fault tolerance analysis apparatus may inject a fault into a simulation of the design based on the extracted design information and parameters, and analyzes an influence of the fault on the simulation. Accordingly, in accordance with the fault tolerance analysis apparatus, fault tolerance for the fault injected into the simulation is analyzed, and the effect of the fault tolerance mechanism provided in the design is analyzed.
Abstract:
Disclosed herein are a method for supporting cache coherency based on virtual addresses for an artificial intelligence processor having large on-chip memory and an apparatus for the same. The method for supporting cache coherency according to an embodiment of the present disclosure includes, by an artificial intelligence processor including multiple processor cores and multiple caches, setting external memory address areas which do not overlap each other for respective multiple caches; and providing virtual addresses with which the multiple processor cores access the multiple caches.
Abstract:
Disclosed herein are a duty cycle monitoring method and apparatus for a memory interface, including receiving a clock signal as input and generating a first delay time offset and a second delay time offset, receiving the clock signal and the first delay time offset and then outputting a first delayed signal, receiving the first delayed signal and the second delay time offset and then outputting a second delayed signal, receiving the clock signal and the second delayed signal and then outputting a delay value corresponding to a half-period of the clock signal, and monitoring, based on the first delayed signal, whether a duty cycle of the clock signal conforms to a duty cycle specification.
Abstract:
Disclosed herein is a method for outer-product-based matrix multiplication for a floating-point data type includes receiving first floating-point data and second floating-point data and performing matrix multiplication on the first floating-point data and the second floating-point data, and the result value of the matrix multiplication is calculated based on the suboperation result values of floating-point units.
Abstract:
Provided is a memory interface device. A memory interface device, comprising: a DQS input buffer configured to receive input data strobe signals and output a first intermediate data strobe signal, the DQS input buffer providing a static offset; an offset control circuit configured to receive the first intermediate data strobe signal and output a second intermediate data strobe signal; and a duty adjustment buffer configured to receive the second intermediate data strobe signal and output a clean data strobe signal, wherein the offset control circuit provides a dynamic offset using the clean data strobe signal.
Abstract:
Provided is an image recognition processor. The image recognition processor includes a plurality of nano cores arranged in rows and columns and configured to perform a pattern recognition operation on an input feature using a kernel coefficient in response to each instruction, an instruction memory configured to provide the instruction to each of the plurality of nano cores, a feature memory configured to provide the input feature to each of the plurality of nano cores, a kernel memory configured to provide the kernel coefficients to the plurality of nano cores, and a functional safety processor core configured to receive a result of a pattern recognition operation outputted from the plurality of nano cores to detect the presence of a recognition error, and perform a fault tolerance function on the detected recognition error.
Abstract:
Provided is an image sensor. The image sensor includes a pixel array including pixels arranged along a first direction and a second direction, and partitioned into blocks, a converter configured to convert image signals outputted from the pixels into digital signals based on an image, and an image signal processor configured to add amplitudes of the digital signals belonging to each of the blocks to determine edge blocks among the blocks, compare the amplitudes of the digital signals to determine directions in which direction lines of the edge blocks are directed, and connect the direction lines to extract an edge of the image.