-
Publication number: US20200050457A1
Publication date: 2020-02-13
Application number: US16505913
Application date: 2019-07-09
Inventor: Yong Wang , Jiaxin Shi , Rong Chen , Jinchen Han
Abstract: Embodiments of the present disclosure disclose a method and apparatus for executing an instruction for an artificial intelligence chip. A specific embodiment of the method comprises: receiving descriptive information, sent by a central processing unit, for describing a neural network model, the descriptive information including at least one operation instruction; analyzing the descriptive information to acquire the at least one operation instruction; determining, for an operation instruction of the at least one operation instruction, a special-purpose execution component to execute the operation instruction, and locking the determined special-purpose execution component; sending the operation instruction to the determined special-purpose execution component; and unlocking the determined special-purpose execution component in response to receiving a notification indicating that the operation instruction has been completely executed.
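The lock/dispatch/unlock cycle described above can be sketched in software. This is a minimal illustrative model, not the patented hardware: the component class, the tuple instruction format, and the use of a Python lock to stand in for the on-chip locking mechanism are all assumptions.

```python
import threading

class SpecialPurposeComponent:
    """Stands in for one special-purpose execution component on the chip."""
    def __init__(self, op_type):
        self.op_type = op_type
        self.lock = threading.Lock()   # models the component's lock/unlock state
        self.executed = []

    def run(self, instruction):
        # Executing the instruction; returning models the completion notification.
        self.executed.append(instruction)

def execute_description(operation_instructions, components):
    """Dispatch-loop sketch: for each operation instruction parsed from the
    model description, determine the matching component, lock it, send the
    instruction, and unlock once the completion notification arrives."""
    done = []
    for op_type, payload in operation_instructions:
        comp = components[op_type]           # determine the execution component
        with comp.lock:                      # lock before dispatch ...
            comp.run((op_type, payload))
            done.append((op_type, payload))  # ... unlock on completion (context exit)
    return done
```

Locking per component rather than globally lets independent components serve different operation types concurrently, which matches the per-component lock/unlock protocol the abstract describes.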
-
Publication number: US20180129933A1
Publication date: 2018-05-10
Application number: US15618415
Application date: 2017-06-09
Inventor: Yong Wang , Jian Ouyang , Wei Qi , Sizhong Li
CPC classification number: G06N3/0445 , G06N3/063
Abstract: The present application discloses a method and apparatus for processing a data sequence. A specific implementation of the method includes: receiving an inputted to-be-processed data sequence; copying a weight matrix in a recurrent neural network model to an embedded block random access memory (RAM) of a field-programmable gate array (FPGA); processing sequentially each piece of to-be-processed data in the to-be-processed data sequence by using an activation function in the recurrent neural network model and the weight matrix stored in the embedded block RAM; and outputting a processed data sequence corresponding to the to-be-processed data sequence. This implementation improves the data sequence processing efficiency of the recurrent neural network model.
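The copy-once-then-iterate flow above can be sketched as follows. This is an illustrative software analogue, not the FPGA design: a plain list stands in for the embedded block RAM, tanh stands in for the model's activation function, and the single-matrix recurrent update is a simplification of a full recurrent neural network cell.

```python
import math

def process_sequence(seq, weight_matrix):
    """Sketch of the flow: copy the recurrent weight matrix once into fast
    local storage, then apply the recurrent step to each piece of
    to-be-processed data in order, reusing the cached weights."""
    bram = [row[:] for row in weight_matrix]   # one-time copy to "block RAM"
    state = [0.0] * len(bram)
    outputs = []
    for x in seq:                              # process each piece sequentially
        state = [math.tanh(sum(w * s for w, s in zip(row, state)) + xi)
                 for row, xi in zip(bram, x)]
        outputs.append(state)
    return outputs
```

The efficiency gain claimed by the abstract comes from paying the weight-transfer cost once per sequence instead of once per element, since the weights are then read from fast on-chip RAM at every step.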
-
Publication number: US20180107630A1
Publication date: 2018-04-19
Application number: US15590798
Application date: 2017-05-09
Inventor: Ni Zhou , Wei Qi , Yong Wang , Jian Ouyang
CPC classification number: G06F17/16 , G06F9/3895 , G06N99/005
Abstract: A processor and a method for executing a matrix multiplication operation on a processor. A specific implementation of the processor includes a data bus and an array processor having k processing units. The data bus is configured to sequentially read n columns of row vectors from an M×N multiplicand matrix and input same to each processing unit in the array processor, read an n×k submatrix from an N×K multiplier matrix and input each column vector of the submatrix to a corresponding processing unit in the array processor, and output a result obtained by each processing unit after executing a multiplication operation. Each processing unit in the array processor is configured to execute in parallel a vector multiplication operation on the input row and column vectors. Each processing unit includes a Wallace tree multiplier having n multipliers and n-1 adders. This implementation improves the processing efficiency of a matrix multiplication operation.
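The broadcast/distribute scheme above can be sketched in software. This is a simplified serial model under stated assumptions: the k "processing units" run in a loop rather than in parallel, each dot product stands in for a Wallace-tree multiplier, and the tiling covers only the column dimension for brevity.

```python
def array_matmul(A, B, k):
    """Sketch of the scheme: the data bus feeds every row vector of the
    multiplicand A to all k processing units, while each unit receives
    its own column of a k-column submatrix of the multiplier B; each
    unit then computes its dot products independently."""
    M, N, K = len(A), len(A[0]), len(B[0])
    C = [[0.0] * K for _ in range(M)]
    for j0 in range(0, K, k):                # one submatrix of k columns at a time
        for u in range(min(k, K - j0)):      # processing unit u
            col = [B[i][j0 + u] for i in range(N)]
            for i in range(M):               # row vectors broadcast to every unit
                C[i][j0 + u] = sum(a * b for a, b in zip(A[i], col))
    return C
```

In hardware the k units work on their k columns simultaneously, so each pass over the row vectors produces k result columns at once; that is the source of the efficiency gain the abstract claims.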
-
Publication number: US11615296B2
Publication date: 2023-03-28
Application number: US16871473
Application date: 2020-05-11
Inventor: Yong Wang
Abstract: Embodiments of the present disclosure provide a method and an apparatus for testing a deep learning chip, an electronic device, and a computer-readable storage medium. The method includes: testing a plurality of logic units in the deep learning chip. The plurality of logic units are configured to perform at least one of an inference operation and a training operation for deep learning. The method further includes: obtaining one or more error units that do not pass the testing from the plurality of logic units. In addition, the method further includes: in response to a ratio of a number of the one or more error units to a total number of the plurality of logic units being lower than or equal to a predetermined ratio, determining the deep learning chip as a qualified chip.
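The qualification rule reduces to a simple ratio test, sketched below. The 5% default threshold and the `run_test` callback are illustrative assumptions; the patent leaves the predetermined ratio and the per-unit test unspecified.

```python
def qualify_chip(logic_units, run_test, max_error_ratio=0.05):
    """Sketch of the qualification rule: test every logic unit, collect
    the ones that fail, and qualify the chip when the failing fraction
    does not exceed the predetermined ratio."""
    error_units = [u for u in logic_units if not run_test(u)]
    ratio = len(error_units) / len(logic_units)
    return ratio <= max_error_ratio, error_units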
-
Publication number: US20200050924A1
Publication date: 2020-02-13
Application number: US16502687
Application date: 2019-07-03
Inventor: Jiaxin Shi , Huimin Li , Yong Wang
Abstract: Embodiments of the present disclosure relate to a data processing method and apparatus for a neural network. The neural network is provided with at least one activation function. A method may include: converting, in response to that an activation function acquiring current data is a target function, based on a conversion relationship between the target function and a preset function, the current data into input data of the preset function; finding out first output data of the preset function with the input data as an input in a lookup table corresponding to the preset function; obtaining second output data of the target function with the current data as an input by conversion based on the conversion relationship and the first output data; and outputting the second output data.
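The convert/look-up/convert-back pipeline can be illustrated with a concrete pair of functions. The choice of tanh as the target function, sigmoid as the preset function, and the table range and step size are all assumptions for illustration; the patent does not name specific functions.

```python
import math

# Lookup table for the preset function (sigmoid here), sampled at STEP
# spacing over [-8, 8] -- an illustrative stand-in for the on-chip table.
STEP = 0.001
TABLE = [1.0 / (1.0 + math.exp(-i * STEP)) for i in range(-8000, 8001)]

def tanh_via_table(x):
    """tanh plays the 'target function', sigmoid the 'preset function';
    the conversion relationship is tanh(x) = 2*sigmoid(2x) - 1."""
    u = 2.0 * x                      # convert current data into preset-function input
    i = round(u / STEP)              # nearest table entry
    i = max(-8000, min(8000, i))     # clamp to the table's range
    first = TABLE[i + 8000]          # first output data: table lookup
    return 2.0 * first - 1.0         # second output data: convert back
```

Storing one table and deriving related activation functions from it by cheap affine conversions saves on-chip memory compared with keeping a separate table per function.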
-
Publication number: US10454680B2
Publication date: 2019-10-22
Application number: US15619151
Application date: 2017-06-09
Abstract: The present application discloses an RSA decryption processor and a method for controlling an RSA decryption processor. A specific implementation of the processor includes a memory, a control component, and a parallel processor. The memory is configured to store decryption parameters comprising a private key. The control component is configured to receive a ciphertext set, and send a decryption signal comprising the ciphertext set to the parallel processor. The parallel processor is configured to: read a decryption parameter from the memory in response to receiving the decryption signal, and use at least one modular exponentiation circuit unit in the parallel processor to perform in parallel a modular exponentiation operation on ciphertexts in the ciphertext set by using the read decryption parameter, to obtain plaintexts corresponding to the ciphertexts. This implementation improves the efficiency of RSA decryption.
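The parallel structure above can be sketched with a worker pool standing in for the modular exponentiation circuit units. This is an illustrative software analogue: the thread pool, the default of four units, and the toy key in the usage note are assumptions, and real RSA implementations use larger keys and typically CRT-based exponentiation.

```python
from concurrent.futures import ThreadPoolExecutor

def rsa_decrypt_parallel(ciphertexts, d, n, units=4):
    """Sketch of the parallel processor: each worker models one modular
    exponentiation circuit unit computing c^d mod n, with the decryption
    parameters (private exponent d, modulus n) read from memory."""
    with ThreadPoolExecutor(max_workers=units) as pool:
        return list(pool.map(lambda c: pow(c, d, n), ciphertexts))
```

Since each ciphertext's modular exponentiation is independent, the operation parallelizes perfectly across units, which is the efficiency gain the abstract claims.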
-
Publication number: US10127040B2
Publication date: 2018-11-13
Application number: US15279217
Application date: 2016-09-28
Inventor: Wei Qi , Jian Ouyang , Yong Wang
IPC: G06F9/38 , G06F9/30 , G06F12/0875
Abstract: The present application discloses a processor and a method for executing an instruction on a processor. A specific implementation of the processor includes: a host interaction device, an instruction control device, an off-chip memory, an on-chip cache and an array processing device, wherein the host interaction device is configured to exchange data and instructions with a host connected with the processor, wherein the exchanged data has a granularity of a matrix; the off-chip memory is configured to store a matrix received from the host, on which a matrix operation is to be performed; and the instruction control device is configured to convert an external instruction received from the host to a series of memory access instructions and a series of computing instructions and execute the converted instructions. The implementation can improve the execution efficiency of a deep learning algorithm.
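The conversion performed by the instruction control device can be sketched as a simple expansion. The tuple instruction format and the load/compute/store sequence are illustrative assumptions; the patent does not specify an encoding.

```python
def convert_external_instruction(instr):
    """Sketch of the instruction control device: one coarse-grained matrix
    instruction received from the host expands into memory-access
    instructions (moving matrices between off-chip memory and on-chip
    cache) plus a computing instruction for the array processing device."""
    op, dst, src_a, src_b = instr
    return [
        ("load", src_a),                     # off-chip memory -> on-chip cache
        ("load", src_b),
        ("compute", op, src_a, src_b, dst),  # array processing device
        ("store", dst),                      # on-chip cache -> off-chip memory
    ]
```

Exchanging data at matrix granularity keeps host-processor traffic coarse, while the on-chip expansion into fine-grained memory-access and computing instructions lets the hardware overlap data movement with computation.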
-
Publication number: US11651198B2
Publication date: 2023-05-16
Application number: US16502687
Application date: 2019-07-03
Inventor: Jiaxin Shi , Huimin Li , Yong Wang
Abstract: Embodiments of the present disclosure relate to a data processing method and apparatus for a neural network. The neural network is provided with at least one activation function. A method may include: converting, in response to that an activation function acquiring current data is a target function, based on a conversion relationship between the target function and a preset function, the current data into input data of the preset function; finding out first output data of the preset function with the input data as an input in a lookup table corresponding to the preset function; obtaining second output data of the target function with the current data as an input by conversion based on the conversion relationship and the first output data; and outputting the second output data.
-
Publication number: US10607668B2
Publication date: 2020-03-31
Application number: US15281283
Application date: 2016-09-30
Inventor: Jian Ouyang , Wei Qi , Yong Wang
Abstract: The present application discloses a data processing method and apparatus. A specific embodiment of the method includes: preprocessing received to-be-processed input data; obtaining a storage address of configuration parameters of the to-be-processed input data based on a result of the preprocessing and a result obtained by linearly fitting an activation function, the configuration parameters being preset according to curve characteristics of the activation function; acquiring the configuration parameters of the to-be-processed input data according to the storage address; and processing the result of the preprocessing of the to-be-processed input data based on the configuration parameters of the to-be-processed input data and a preset circuit structure, to obtain a processing result. This implementation processes the input data by using the configuration parameters and the preset circuit structure, without the need for any special circuit implementing the activation function, thereby simplifying the circuit structure. In addition, it can support multiple types of activation functions, thereby improving the flexibility.
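The linear-fitting scheme can be sketched as a piecewise-linear approximation. This is an illustrative model under stated assumptions: the fitting range, the segment count, and the use of tanh in the usage note are not from the patent, and the hardware would store the parameters in on-chip memory rather than a Python list.

```python
def fit_segments(fn, lo, hi, n):
    """Linearly fit fn with n segments over [lo, hi]; each (slope,
    intercept) pair plays the role of the preset configuration
    parameters derived from the activation function's curve."""
    xs = [lo + (hi - lo) * i / n for i in range(n + 1)]
    segs = []
    for a, b in zip(xs, xs[1:]):
        k = (fn(b) - fn(a)) / (b - a)
        segs.append((k, fn(a) - k * a))
    return segs

def apply_activation(x, lo, hi, segments):
    """'Preprocessing' reduces to locating the segment index (the storage
    address of the configuration parameters); the shared multiply-add
    models the preset circuit structure that serves any fitted function."""
    x = max(lo, min(hi, x))
    i = min(int((x - lo) / (hi - lo) * len(segments)), len(segments) - 1)
    k, b = segments[i]
    return k * x + b
```

Because the same multiply-add datapath evaluates every segment, swapping activation functions only means loading a different parameter table, which is the flexibility the abstract claims.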
-
Publication number: US10140251B2
Publication date: 2018-11-27
Application number: US15590798
Application date: 2017-05-09
Inventor: Ni Zhou , Wei Qi , Yong Wang , Jian Ouyang
Abstract: A processor and a method for executing a matrix multiplication operation on a processor. A specific implementation of the processor includes a data bus and an array processor having k processing units. The data bus is configured to sequentially read n columns of row vectors from an M×N multiplicand matrix and input same to each processing unit in the array processor, read an n×k submatrix from an N×K multiplier matrix and input each column vector of the submatrix to a corresponding processing unit in the array processor, and output a result obtained by each processing unit after executing a multiplication operation. Each processing unit in the array processor is configured to execute in parallel a vector multiplication operation on the input row and column vectors. Each processing unit includes a Wallace tree multiplier having n multipliers and n−1 adders. This implementation improves the processing efficiency of a matrix multiplication operation.