-
Publication No.: US11210322B2
Publication Date: 2021-12-28
Application No.: US16814178
Filing Date: 2020-03-10
Inventor: Huimin Li , Jian Ouyang
IPC: G06F17/30 , G06F16/28 , G06F16/22 , G06F16/901
Abstract: Embodiments of the present disclosure relate to a method and apparatus for reducing storage space of a parameter table. The method may include: storing the parameter table in a lookup table system configured to compute an output value of a non-linear function according to an input value of the non-linear function, the parameter table including only an index value associated with an input value on one side of a median in a domain of the non-linear function and a parameter value corresponding to the index value; determining, by using a corresponding relationship between the index value associated with the input value on one side and the parameter value corresponding to the index value, a parameter value corresponding to an index value associated with an input value on the other side; and computing the output value by using the input value on the other side and the determined corresponding parameter value.
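The space-saving idea above can be illustrated with a function that is symmetric about the median of its domain. A minimal sketch, assuming a sigmoid lookup table (step size and table bounds are illustrative choices, not from the patent): only entries for inputs on one side of the median (x ≥ 0) are stored, and values on the other side are derived from the identity sigmoid(-x) = 1 - sigmoid(x).

```python
import math

# Half-table for sigmoid over x in [0, 8] with an assumed step of 0.125.
STEP = 0.125
TABLE = {i: 1.0 / (1.0 + math.exp(-i * STEP)) for i in range(0, 65)}

def sigmoid_lut(x: float) -> float:
    # Index using the magnitude only, so the table covers just one side.
    idx = min(round(abs(x) / STEP), 64)
    y = TABLE[idx]
    # Mirror the stored value for inputs on the other side of the median.
    return y if x >= 0 else 1.0 - y

print(round(sigmoid_lut(1.0), 3))   # ~0.731
print(round(sigmoid_lut(-1.0), 3))  # ~0.269
```

The table holds 65 entries instead of 129 for the same input range, which is the storage reduction the abstract describes.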
-
Publication No.: US11087203B2
Publication Date: 2021-08-10
Application No.: US15618415
Filing Date: 2017-06-09
Inventor: Yong Wang , Jian Ouyang , Wei Qi , Sizhong Li
Abstract: The present application discloses a method and apparatus for processing a data sequence. A specific implementation of the method includes: receiving an inputted to-be-processed data sequence; copying a weight matrix in a recurrent neural network model to an embedded block random access memory (RAM) of a field-programmable gate array (FPGA); processing sequentially each piece of to-be-processed data in the to-be-processed data sequence by using an activation function in the recurrent neural network model and the weight matrix stored in the embedded block RAM; and outputting a processed data sequence corresponding to the to-be-processed data sequence. This implementation improves the data sequence processing efficiency of the recurrent neural network model.
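A rough software model of the dataflow, with hypothetical names and dimensions: the weight matrix is copied once into fast local storage (standing in for the FPGA's embedded block RAM) and then reused across every element of the sequence, so the recurrence never re-fetches weights from slow memory.

```python
import math

def process_sequence(seq, W, h0):
    # One-time copy, analogous to loading weights into embedded block RAM.
    W_cached = [row[:] for row in W]
    h = h0[:]
    out = []
    for x in seq:                  # process each piece of data sequentially
        v = x + h                  # concatenate input and previous state
        # Activation function applied to the weighted sum, per hidden unit.
        h = [math.tanh(sum(w * a for w, a in zip(row, v))) for row in W_cached]
        out.append(h)
    return out

# 2 hidden units; each weight row covers 1 input value + 2 state values.
W = [[0.5, 0.0, 0.1], [0.0, 0.5, 0.1]]
res = process_sequence([[1.0], [0.0]], W, [0.0, 0.0])
print(len(res), len(res[0]))  # 2 2
```

On hardware the win is that `W_cached` lives on-chip, so the per-step latency is dominated by compute rather than by weight transfers.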
-
Publication No.: US11023801B2
Publication Date: 2021-06-01
Application No.: US15618817
Filing Date: 2017-06-09
Inventor: Jian Ouyang , Wei Qi , Yong Wang , Lin Liu
Abstract: The present application discloses a data processing method and apparatus. A specific implementation of the method includes: receiving floating point data sent from an electronic device; converting the received floating point data into fixed point data according to a data length and a value range of the received floating point data; performing calculation on the obtained fixed point data according to a preset algorithm to obtain result data in a fixed point form; and converting the obtained result data in the fixed point form into result data in a floating point form and sending the result data in the floating point form to the electronic device. This implementation improves the data processing efficiency.
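The float/fixed round trip can be modeled in a few lines; the scale factor below is an assumption (in the patent it would follow from the data length and value range of the received floating point data). Inputs are quantized to integers, the calculation runs entirely on integers, and the result is rescaled back to floating point.

```python
# Assumed Q-format: 8 fractional bits, i.e. values scaled by 256.
FRAC_BITS = 8
SCALE = 1 << FRAC_BITS

def to_fixed(x: float) -> int:
    return round(x * SCALE)

def to_float(q: int) -> float:
    return q / SCALE

def fixed_dot(xs, ys):
    # The whole computation stays in integer (fixed point) arithmetic.
    q = sum(to_fixed(a) * to_fixed(b) for a, b in zip(xs, ys))
    # A product of two scaled values carries SCALE twice; undo both.
    return q / (SCALE * SCALE)

print(fixed_dot([0.5, 0.25], [1.0, 2.0]))  # 1.0
```

The efficiency claim rests on integer multipliers being cheaper than floating point units; the quantization error is bounded by the chosen number of fractional bits.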
-
Publication No.: US20210049045A1
Publication Date: 2021-02-18
Application No.: US16809020
Filing Date: 2020-03-04
Inventor: Xianglun Leng , Zhibiao Zhao , Jinchen Han , Jian Ouyang , Wei Qi , Yong Wang
Abstract: Embodiments of the present disclosure relate to a method and apparatus for resource management, an electronic device, and a computer-readable storage medium. The method may include: determining a plurality of virtual functions to be supported, where each of the plurality of virtual functions corresponds to a virtual machine running on a computing device. The method may further include: dividing a physical resource set into a plurality of physical resource subsets according to a predetermined ratio, a number of the physical resource subsets being identical to a number of the virtual functions. The method may further include: allocating the plurality of physical resource subsets to the plurality of virtual functions respectively.
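A sketch of the ratio-based split (the function and its arguments are hypothetical): a pool of physical resource units is divided into one subset per virtual function, with subset sizes following a predetermined ratio, so the number of subsets equals the number of virtual functions.

```python
def partition(resources, ratio):
    """Split `resources` into len(ratio) subsets sized by `ratio`."""
    total = sum(ratio)
    assert len(resources) % total == 0, "pool must divide evenly by the ratio"
    unit = len(resources) // total
    subsets, start = [], 0
    for r in ratio:
        # Each virtual function receives a contiguous slice of the pool.
        subsets.append(resources[start:start + r * unit])
        start += r * unit
    return subsets

pool = list(range(12))                    # e.g. 12 physical resource units
vf_subsets = partition(pool, [1, 1, 2])   # three virtual functions, ratio 1:1:2
print([len(s) for s in vf_subsets])       # [3, 3, 6]
```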
-
Publication No.: US20180129933A1
Publication Date: 2018-05-10
Application No.: US15618415
Filing Date: 2017-06-09
Inventor: Yong Wang , Jian Ouyang , Wei Qi , Sizhong Li
CPC classification number: G06N3/0445 , G06N3/063
Abstract: The present application discloses a method and apparatus for processing a data sequence. A specific implementation of the method includes: receiving an inputted to-be-processed data sequence; copying a weight matrix in a recurrent neural network model to an embedded block random access memory (RAM) of a field-programmable gate array (FPGA); processing sequentially each piece of to-be-processed data in the to-be-processed data sequence by using an activation function in the recurrent neural network model and the weight matrix stored in the embedded block RAM; and outputting a processed data sequence corresponding to the to-be-processed data sequence. This implementation improves the data sequence processing efficiency of the recurrent neural network model.
-
Publication No.: US20180107630A1
Publication Date: 2018-04-19
Application No.: US15590798
Filing Date: 2017-05-09
Inventor: Ni Zhou , Wei Qi , Yong Wang , Jian Ouyang
CPC classification number: G06F17/16 , G06F9/3895 , G06N99/005
Abstract: A processor and a method for executing a matrix multiplication operation on a processor. A specific implementation of the processor includes a data bus and an array processor having k processing units. The data bus is configured to sequentially read n columns of row vectors from an M×N multiplicand matrix and input same to each processing unit in the array processor, read an n×k submatrix from an N×K multiplier matrix and input each column vector of the submatrix to a corresponding processing unit in the array processor, and output a result obtained by each processing unit after executing a multiplication operation. Each processing unit in the array processor is configured to execute in parallel a vector multiplication operation on the input row and column vectors. Each processing unit includes a Wallace tree multiplier having n multipliers and n-1 adders. This implementation improves the processing efficiency of a matrix multiplication operation.
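A toy software model of the array scheme (block sizes chosen for illustration): an n-element slice of one row of the multiplicand is broadcast to k processing units, each holding one column of an n×k block of the multiplier; each unit computes a length-n dot product (the role the Wallace tree multiplier plays in hardware), and accumulating over blocks produces the output row by row.

```python
def matmul_array(A, B, n, k):
    M, N, K = len(A), len(A[0]), len(B[0])
    C = [[0.0] * K for _ in range(M)]
    for i in range(M):
        for j0 in range(0, K, k):             # k units cover k output columns
            for p0 in range(0, N, n):         # stream n-wide blocks of the row
                row = A[i][p0:p0 + n]         # row slice broadcast to all units
                for j in range(j0, min(j0 + k, K)):
                    col = [B[p][j] for p in range(p0, min(p0 + n, N))]
                    # One processing unit: an n-element multiply-accumulate.
                    C[i][j] += sum(a * b for a, b in zip(row, col))
    return C

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(matmul_array(A, B, n=1, k=2))  # [[19.0, 22.0], [43.0, 50.0]]
```

The inner multiply-accumulate is what the hardware parallelizes: k dot products run simultaneously, one per processing unit, each fed the same broadcast row slice.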
-
Publication No.: US11221851B2
Publication Date: 2022-01-11
Application No.: US16936676
Filing Date: 2020-07-23
Inventor: Huimin Li , Peng Wu , Jian Ouyang
Abstract: Embodiments of the present disclosure provide a method, executed by a computing device, for configuring a vector operation, an apparatus, a device, and a storage medium. The method includes obtaining information indicating at least one configurable vector operation parameter. The information indicating the at least one configurable vector operation parameter indicates a type and a value of the configurable vector operation parameter. The method further includes: based on the type and the value of the configurable vector operation parameter, configuring multiple vector operation circuits to enable each of the vector operation circuits to execute a target vector operation including two or more basic vector operations and defined based on the type and value of the configurable vector operation parameter.
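One way to picture the parameter-driven configuration (all names here are hypothetical): a list of (type, value) parameter pairs selects basic vector operations and composes them into a single target operation, which each configured circuit then applies elementwise.

```python
# Assumed set of basic vector operations, keyed by parameter type.
BASIC_OPS = {
    "add": lambda x, c: x + c,
    "mul": lambda x, c: x * c,
    "relu": lambda x, c: max(x, 0.0),
}

def configure(params):
    """params: list of (op_type, op_value) pairs; returns a fused vector op."""
    def target_op(vec):
        out = list(vec)
        for op_type, value in params:   # compose basic operations in order
            out = [BASIC_OPS[op_type](x, value) for x in out]
        return out
    return target_op

# Target operation built from two or more basic operations, as in the abstract.
fused = configure([("mul", 2.0), ("add", -1.0), ("relu", None)])
print(fused([0.0, 1.0, 2.0]))  # [0.0, 1.0, 3.0]
```

In hardware, the same parameter pairs would be written to configuration registers of each vector operation circuit rather than interpreted per element.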
-
Publication No.: US20210271482A1
Publication Date: 2021-09-02
Application No.: US17210616
Filing Date: 2021-03-24
Inventor: Yingnan Xu , Jian Ouyang , Xueliang Du , Kang An
Abstract: Example embodiments of the present application provide an instruction executing method and apparatus, an electronic device, and a computer-readable storage medium that may be applied in the field of artificial intelligence. The instruction executing method may include: executing an instruction sequence that includes memory instructions and non-memory instructions, the instructions in the sequence starting to be executed in order; determining that execution of a first memory instruction needs to be completed before a second memory instruction starts to be executed, the second memory instruction being the next memory instruction following the first memory instruction in the instruction sequence; and executing the non-memory instructions between the first memory instruction and the second memory instruction, without executing the second memory instruction, during a cycle of executing the first memory instruction.
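A simplified timing model of that overlap (the latency figure is an assumption): while the first memory instruction is in flight, the non-memory instructions that follow it keep issuing one per cycle; only the next memory instruction has to wait for it to complete.

```python
MEM_LATENCY = 4  # assumed cycles until a memory instruction completes

def schedule(instrs):
    """instrs: list of 'mem'/'alu' tags; returns total cycles with overlap."""
    cycle, mem_done = 0, 0
    for op in instrs:
        if op == "mem":
            cycle = max(cycle, mem_done)  # only wait for the prior mem op
            cycle += 1                    # issue the memory instruction
            mem_done = cycle + MEM_LATENCY - 1
        else:
            cycle += 1                    # ALU ops issue in the mem shadow
    return max(cycle, mem_done)

print(schedule(["mem", "alu", "alu", "alu", "mem"]))  # 8
print(schedule(["mem", "mem"]))                       # also 8
```

Both traces finish in the same number of cycles: the three ALU instructions between the memory instructions cost nothing extra, because they execute during the first memory instruction's latency.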
-
Publication No.: US10127040B2
Publication Date: 2018-11-13
Application No.: US15279217
Filing Date: 2016-09-28
Inventor: Wei Qi , Jian Ouyang , Yong Wang
IPC: G06F9/38 , G06F9/30 , G06F12/0875
Abstract: The present application discloses a processor and a method for executing an instruction on a processor. A specific implementation of the processor includes: a host interaction device, an instruction control device, an off-chip memory, an on-chip cache and an array processing device, wherein the host interaction device is configured to exchange data and instructions with a host connected with the processor, wherein the exchanged data has a granularity of a matrix; the off-chip memory is configured to store a matrix received from the host, on which a matrix operation is to be performed; and the instruction control device is configured to convert an external instruction received from the host to a series of memory access instructions and a series of computing instructions and execute the converted instructions. The implementation can improve the execution efficiency of a deep learning algorithm.
-
Publication No.: US12141228B2
Publication Date: 2024-11-12
Application No.: US17017600
Filing Date: 2020-09-10
Inventor: Xiaozhang Gong , Jian Ouyang , Jing Wang , Wei Qi
Abstract: Embodiments of the present disclosure propose a deep learning processing apparatus and method, device and storage medium, relating to the field of artificial intelligence. A deep learning processing apparatus includes: at least one matrix multiply-add module, configured to perform a matrix multiply-add operation of a convolution kernel parameter value matrix of a convolutional layer in a convolutional neural network and a first error gradient value matrix to obtain a plurality of intermediate matrices; a storage apparatus, configured to store the plurality of intermediate matrices without reshaping elements in the plurality of intermediate matrices; and a plurality of matrix accumulation modules, configured to read the plurality of intermediate matrices from the storage apparatus and perform a matrix accumulation operation based on the plurality of intermediate matrices according to a convolution scheme of the convolutional layer in parallel, to obtain a second error gradient value matrix for the convolutional layer.
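An illustrative 1-D analogue of the two-stage pipeline (shapes and stride are simplifications, not the patent's layout): each kernel tap produces an intermediate vector (the multiply-add stage), the intermediates are stored as-is, and the accumulation stage scatters them into the input-gradient positions dictated by the convolution scheme, with no reshaping in between.

```python
def conv_input_grad(weights, d_out, in_len):
    # Multiply-add stage: one intermediate vector per kernel tap,
    # stored without any reshaping of its elements.
    intermediates = [[w * g for g in d_out] for w in weights]
    # Accumulation stage: combine intermediates per the convolution
    # scheme (here: valid convolution, stride 1) into the input gradient.
    d_in = [0.0] * in_len
    for k, inter in enumerate(intermediates):
        for i, v in enumerate(inter):
            d_in[i + k] += v
    return d_in

# 3-tap kernel, output-gradient length 3, input length 5.
print(conv_input_grad([1.0, 2.0, 3.0], [1.0, 1.0, 1.0], 5))
# [1.0, 3.0, 6.0, 5.0, 3.0]
```

In the apparatus, each inner accumulation would run on a separate matrix accumulation module in parallel; the sketch only shows why the intermediates can be consumed directly, without an intermediate reshape.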