-
Publication No.: US20180143903A1
Publication Date: 2018-05-24
Application No.: US15620794
Filing Date: 2017-06-12
Applicant: MediaTek Inc.
Inventor: Ming-Ju Wu , Chien-Hung Lin , Chia-Hao Hsu , Pi-Cheng Hsiao , Shao-Yu Wang
IPC: G06F12/0804 , G06F12/0815 , G06F12/121
CPC classification number: G06F12/0804 , G06F12/0811 , G06F12/0815 , G06F12/0833 , G06F12/12 , G06F12/121 , G06F2212/60 , G06F2212/621
Abstract: A multi-cluster, multi-processor computing system performs a cache flushing method. The method begins with a cache maintenance hardware engine receiving a request from a processor to flush cache contents to a memory. In response, the cache maintenance hardware engine generates commands to flush the cache contents, thereby offloading the work of generating the commands from the processors. The commands are issued to the clusters, with each command specifying a physical address that identifies a cache line to be flushed.
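The fan-out the abstract describes can be sketched in a few lines: one flush request over an address range expands into per-cache-line commands, each carrying a physical address. This is a minimal sketch, assuming 64-byte cache lines and illustrative names (`generate_flush_commands` is not from the patent).

```python
CACHE_LINE_SIZE = 64  # bytes; a common line size, assumed here

def generate_flush_commands(base_addr: int, size: int) -> list[int]:
    """Emulate the cache maintenance engine: expand one flush request
    covering [base_addr, base_addr + size) into per-line commands,
    each identified by the physical address of a cache line to flush."""
    first_line = base_addr - (base_addr % CACHE_LINE_SIZE)
    last_byte = base_addr + size - 1
    last_line = last_byte - (last_byte % CACHE_LINE_SIZE)
    return list(range(first_line, last_line + CACHE_LINE_SIZE, CACHE_LINE_SIZE))

# One processor request fans out into many line-granular commands,
# so the processor itself spends no cycles generating them.
cmds = generate_flush_commands(0x1000_0040, 200)  # 4 commands, 64 B apart
```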
-
Publication No.: US11436483B2
Publication Date: 2022-09-06
Application No.: US16246884
Filing Date: 2019-01-14
Applicant: MediaTek Inc.
Inventor: Yu-Ting Kuo , Chien-Hung Lin , Shao-Yu Wang , ShengJe Hung , Meng-Hsuan Cheng , Chi-Ta Wu , Henrry Andrian , Yi-Siou Chen , Tai-Lung Chen
IPC: G06N3/08 , G06N3/04 , G06N3/063 , G06F12/084 , G06F17/15
Abstract: An accelerator for neural network computing includes hardware engines and a buffer memory. The hardware engines include a convolution engine and at least a second engine. Each hardware engine includes circuitry to perform neural network operations. The buffer memory stores a first input tile and a second input tile of an input feature map. The second input tile overlaps with the first input tile in the buffer memory. The convolution engine is operative to retrieve the first input tile from the buffer memory, perform convolution operations on the first input tile to generate an intermediate tile of an intermediate feature map, and pass the intermediate tile to the second engine via the buffer memory.
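Why the second input tile overlaps the first in the buffer follows from the convolution window: adjacent output tiles need K-1 shared input elements. A minimal sketch of that arithmetic, assuming a 1-D convolution with stride 1 and illustrative names (`input_tile_range` is not from the patent):

```python
def input_tile_range(tile_idx: int, tile_out: int, K: int) -> tuple[int, int]:
    """Input span [start, end) needed to compute one tile of tile_out
    convolution outputs with kernel size K and stride 1. Consecutive
    tiles share K - 1 input elements — the overlap held in the buffer."""
    start = tile_idx * tile_out
    end = start + tile_out + K - 1
    return start, end

t0 = input_tile_range(0, tile_out=8, K=3)  # (0, 10)
t1 = input_tile_range(1, tile_out=8, K=3)  # (8, 18): overlaps t0 by K - 1 = 2
```

Keeping the overlap resident in the buffer memory means the convolution engine never re-fetches those shared elements from external memory between tiles.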
-
Publication No.: US20190303757A1
Publication Date: 2019-10-03
Application No.: US16221295
Filing Date: 2018-12-14
Applicant: MediaTek Inc.
Inventor: Wei-Ting Wang , Han-Lin Li , Chih Chung Cheng , Shao-Yu Wang
Abstract: A deep learning accelerator (DLA) includes processing elements (PEs) grouped into PE groups to perform convolutional neural network (CNN) computations by applying multi-dimensional weights to an input activation to produce an output activation. The DLA also includes a dispatcher, which dispatches input data from the input activation and non-zero weights from the multi-dimensional weights to the processing elements according to a control mask, and a buffer memory, which stores the control mask specifying the positions of zero weights in the multi-dimensional weights. The PE groups generate output data for respective output channels in the output activation and share a same control mask specifying the same positions of the zero weights.
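The dispatch scheme above can be sketched as follows: the control mask marks zero-weight positions, and the dispatcher forwards only the (input, non-zero weight) pairs to a PE, which then accumulates them. This is a minimal 1-D sketch with illustrative names (`dispatch_nonzero` and `pe_mac` are not from the patent).

```python
def dispatch_nonzero(inputs, weights, zero_mask):
    """Dispatcher: skip positions where the control mask marks a zero
    weight, sending only pairs that contribute to the output."""
    return [(x, w) for x, w, z in zip(inputs, weights, zero_mask) if not z]

def pe_mac(pairs):
    """Processing element: multiply-accumulate over the dispatched pairs."""
    return sum(x * w for x, w in pairs)

weights = [0, 3, 0, 5]
zero_mask = [w == 0 for w in weights]   # control mask: zero-weight positions
inputs = [1, 2, 3, 4]

pairs = dispatch_nonzero(inputs, weights, zero_mask)  # only 2 of 4 dispatched
out = pe_mac(pairs)                                   # 2*3 + 4*5 = 26
```

Because all PE groups share the same mask, the zero-skipping pattern is computed once and reused across every output channel.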
-
Publication No.: US20190220742A1
Publication Date: 2019-07-18
Application No.: US16246884
Filing Date: 2019-01-14
Applicant: MediaTek Inc.
Inventor: Yu-Ting Kuo , Chien-Hung Lin , Shao-Yu Wang , ShengJe Hung , Meng-Hsuan Cheng , Chi-Ta Wu , Henrry Andrian , Yi-Siou Chen , Tai-Lung Chen
IPC: G06N3/08
CPC classification number: G06N3/08
Abstract: An accelerator for neural network computing includes hardware engines and a buffer memory. The hardware engines include a convolution engine and at least a second engine. Each hardware engine includes circuitry to perform neural network operations. The buffer memory stores a first input tile and a second input tile of an input feature map. The second input tile overlaps with the first input tile in the buffer memory. The convolution engine is operative to retrieve the first input tile from the buffer memory, perform convolution operations on the first input tile to generate an intermediate tile of an intermediate feature map, and pass the intermediate tile to the second engine via the buffer memory.