Patent search ap:("MediaTek Inc.") AND inv:"Chien-Hung Lin" Page 1

1.

发明授权
Asymmetric quantization of multiple-and-accumulate operations in deep learning processing 有权

公开(公告)号：US10977001B2

公开(公告)日：2021-04-13

申请号：US16250874

申请日：2019-01-17

Applicant: MediaTek Inc.

Inventor： Chien-Hung Lin , Pei-Kuei Tsung , Chi-Ming Chen , Meng-Hsuan Cheng , ShengJe Hung

IPC: G06F7/544 , G06N3/063 , G06N3/04 , G06N3/08

Abstract: A processing unit performs multiply-and-accumulate (MAC) operations on asymmetrically quantized data. The processing unit includes a MAC hardware unit to perform the MAC operations on a first data sequence and a second data sequence to generate an asymmetric MAC output. Both the first data sequence and the second data sequence are asymmetrically quantized. The processing unit further includes an accumulator hardware unit to accumulate the first data sequence concurrently with the MAC operations to generate an accumulated output. The processing unit further includes a multiply-and-add (MAD) hardware unit to multiply the accumulated output with a second offset to generate a multiplication output, and to add the multiplication output, the asymmetric MAC output and a pre-computed value calculated before runtime to generate a final output. The second offset indicates an amount of asymmetry of the second data sequence with respect to zero.

2.

发明申请
MULTI-PROCESSOR SYSTEM WITH CACHE SHARING AND ASSOCIATED CACHE SHARING METHOD 审中-公开

公开(公告)号：US20170300427A1

公开(公告)日：2017-10-19

申请号：US15487402

申请日：2017-04-13

Applicant: MEDIATEK INC.

Inventor： Chien-Hung Lin , Ming-Ju Wu , Wei-Hao Chiao , Kun-Geng Lee , Shun-Chieh Chang , Ming-Ku Chang , Chia-Hao Hsu , Pi-Cheng Hsiao

IPC: G06F12/12 , G06F12/08

CPC classification number: G06F12/128 , G06F12/0804 , G06F12/0811 , G06F12/0831 , G06F12/084 , G06F12/0842 , G06F12/0862 , G06F2212/1016 , G06F2212/1044 , G06F2212/602 , G06F2212/621

Abstract: A multi-processor system with cache sharing has a plurality of processor sub-systems and a cache coherence interconnect circuit. The processor sub-systems have a first processor sub-system and a second processor sub-system. The first processor sub-system includes at least one first processor and a first cache coupled to the at least one first processor. The second processor sub-system includes at least one second processor and a second cache coupled to the at least one second processor. The cache coherence interconnect circuit is coupled to the processor sub-systems, and used to obtain a cache line data from an evicted cache line in the first cache, and transfer the obtained cache line data to the second cache for storage.

3.

发明授权
Hybrid non-uniform convolution transform engine for deep learning applications 有权

公开(公告)号：US10755169B2

公开(公告)日：2020-08-25

申请号：US15841733

申请日：2017-12-14

Applicant: MediaTek Inc.

Inventor： Pei-Kuei Tsung , Chien-Hung Lin , Yao-Sheng Wang , Po-Yu Chen

IPC: G06N3/063 , G06F17/15 , G06N3/04 , G06N3/08 , G06F7/544 , G06F7/48

Abstract: A system performs convolution operations based on an analysis of the input size. The input includes data elements and filter weights. The system includes multiple processing elements. Each processing element includes multipliers and adders, with more of the adders than the multipliers. According to at least the analysis result which indicates whether the input size matches a predetermined size, the system is operative to select a first mode or a second mode. In the first mode, a greater number of the adders than the multipliers are enabled for each processing element to multiply transformed input and to perform an inverse transformation. In the second mode, an equal number of the adders and the multipliers are enabled for each processing element to multiply-and-accumulate the input. One or more of the multipliers are shared by the first mode and the second mode.

4.

发明申请
HYBRID NON-UNIFORM CONVOLUTION TRANSFORM ENGINE FOR DEEP LEARNING APPLICATIONS 审中-公开

公开(公告)号：US20190114536A1

公开(公告)日：2019-04-18

申请号：US15841733

申请日：2017-12-14

Applicant: MediaTek Inc.

Inventor： Pei-Kuei Tsung , Chien-Hung Lin , Yao-Sheng Wang , Po-Yu Chen

IPC: G06N3/063 , G06N3/08 , G06N3/04

CPC classification number: G06N3/063 , G06F7/4806 , G06F7/5443 , G06F17/156 , G06N3/04 , G06N3/0454 , G06N3/08

Abstract: A system performs convolution operations based on an analysis of the input size. The input includes data elements and filter weights. The system includes multiple processing elements. Each processing element includes multipliers and adders, with more of the adders than the multipliers. According to at least the analysis result which indicates whether the input size matches a predetermined size, the system is operative to select a first mode or a second mode. In the first mode, a greater number of the adders than the multipliers are enabled for each processing element to multiply transformed input and to perform an inverse transformation. In the second mode, an equal number of the adders and the multipliers are enabled for each processing element to multiply-and-accumulate the input. One or more of the multipliers are shared by the first mode and the second mode.

5.

发明申请
HARDWARE ASSISTED CACHE FLUSHING MECHANISM 审中-公开

公开(公告)号：US20180143903A1

公开(公告)日：2018-05-24

申请号：US15620794

申请日：2017-06-12

Applicant: MediaTek Inc.

Inventor： Ming-Ju Wu , Chien-Hung Lin , Chia-Hao Hsu , Pi-Cheng Hsiao , Shao-Yu Wang

IPC: G06F12/0804 , G06F12/0815 , G06F12/121

CPC classification number: G06F12/0804 , G06F12/0811 , G06F12/0815 , G06F12/0833 , G06F12/12 , G06F12/121 , G06F2212/60 , G06F2212/621

Abstract: A multi-cluster, multi-processor computing system performs a cache flushing method. The method begins with a cache maintenance hardware engine receiving a request from a processor to flush cache contents to a memory. In response, the cache maintenance hardware engine generates commands to flush the cache contents to thereby remove workload of generating the commands from the processors. The commands are issued to the clusters, with each command specifying a physical address that identifies a cache line to be flushed.

6.

发明申请
SNOOP FILTER FOR MULTI-PROCESSOR SYSTEM AND RELATED SNOOP FILTERING METHOD 有权
Title translation: 用于多处理器系统的SNOOP过滤器和相关SNOOP过滤方法

公开(公告)号：US20160117249A1

公开(公告)日：2016-04-28

申请号：US14820571

申请日：2015-08-07

Applicant: MEDIATEK INC.

Inventor： Chien-Hung Lin , Wei-Hao Chiao

IPC: G06F12/08

CPC classification number: G06F12/0815 , G06F12/0824 , G06F12/0833 , G06F12/0842 , G06F2212/1021 , G06F2212/1044 , G06F2212/283 , G06F2212/608

Abstract: A snoop filter for a multi-processor system has a storage device and a control circuit. The control circuit manages at least a first-type entry and at least a second-type entry stored in the storage device. The first-type entry is configured to record information indicative of a first cache of the multi-processor system and first requested memory addresses that are associated with multiple first cache lines each being only available in the first cache. The second-type entry is configured to record information indicative of multiple second caches of the multi-processor system and at least a second requested memory address that is associated with a second cache line being available in each of the multiple second caches.

Abstract translation: 用于多处理器系统的窥探滤波器具有存储装置和控制电路。控制电路管理存储在存储装置中的至少第一类型条目和至少第二类型条目。第一类型条目被配置为记录指示多处理器系统的第一高速缓存的信息和与多个第一高速缓存行相关联的第一请求存储器地址，每个第一高速缓存行仅在第一高速缓存中可用。第二类型条目被配置为记录指示多处理器系统的多个第二高速缓存的信息，以及与第二高速缓存行相关联的至少第二请求存储器地址在多个第二高速缓存中的每一个中可用。

7.

发明申请
NEURAL NETWORK ENGINE WITH TILE-BASED EXECUTION 审中-公开

公开(公告)号：US20190220742A1

公开(公告)日：2019-07-18

申请号：US16246884

申请日：2019-01-14

Applicant: MediaTek Inc.

Inventor： Yu-Ting Kuo , Chien-Hung Lin , Shao-Yu Wang , ShengJe Hung , Meng-Hsuan Cheng , Chi-Ta Wu , Henrry Andrian , Yi-Siou Chen , Tai-Lung Chen

IPC: G06N3/08

CPC classification number: G06N3/08

Abstract: An accelerator for neural network computing includes hardware engines and a buffer memory. The hardware engines include a convolution engine and at least a second engine. Each hardware engine includes circuitry to perform neural network operations. The buffer memory stores a first input tile and a second input tile of an input feature map. The second input tile overlaps with the first input tile in the buffer memory. The convolution engine is operative to retrieve the first input tile from the buffer memory, perform convolution operations on the first input tile to generate an intermediate tile of an intermediate feature map, and pass the intermediate tile to the second engine via the buffer memory.

8.

发明授权
Neural network engine with tile-based execution 有权

公开(公告)号：US11436483B2

公开(公告)日：2022-09-06

申请号：US16246884

申请日：2019-01-14

Applicant: MediaTek Inc.

Inventor： Yu-Ting Kuo , Chien-Hung Lin , Shao-Yu Wang , ShengJe Hung , Meng-Hsuan Cheng , Chi-Ta Wu , Henrry Andrian , Yi-Siou Chen , Tai-Lung Chen

IPC: G06N3/08 , G06N3/04 , G06N3/063 , G06F12/084 , G06F17/15

Abstract: An accelerator for neural network computing includes hardware engines and a buffer memory. The hardware engines include a convolution engine and at least a second engine. Each hardware engine includes circuitry to perform neural network operations. The buffer memory stores a first input tile and a second input tile of an input feature map. The second input tile overlaps with the first input tile in the buffer memory. The convolution engine is operative to retrieve the first input tile from the buffer memory, perform convolution operations on the first input tile to generate an intermediate tile of an intermediate feature map, and pass the intermediate tile to the second engine via the buffer memory.

9.

发明申请
NEURAL NETWORK PROCESSING UNIT FOR HYBRID AND MIXED PRECISION COMPUTING 有权

公开(公告)号：US20220156567A1

公开(公告)日：2022-05-19

申请号：US17505422

申请日：2021-10-19

Applicant: MediaTek Inc.

Inventor： Chien-Hung Lin , Yi-Min Tsai , Chia-Lin Yu , Chi-Wei Yang

IPC: G06N3/063 , G06N3/04

Abstract: A neural network (NN) processing unit includes an operation circuit to perform tensor operations of a given layer of a neural network in one of a first number representation and a second number representation. The NN processing unit further includes a conversion circuit coupled to at least one of an input port and an output port of the operation circuit to convert between the first number representation and the second number representation. The first number representation is one of a fixed-point number representation and a floating-point number representation, and the second number representation is the other one of the fixed-point number representation and the floating-point number representation.

10.

发明申请
ASYMMETRIC QUANTIZATION OF MULTIPLE-AND-ACCUMULATE OPERATIONS IN DEEP LEARNING PROCESSING 审中-公开

公开(公告)号：US20190243610A1

公开(公告)日：2019-08-08

申请号：US16250874

申请日：2019-01-17

Applicant: MediaTek Inc.

Inventor： Chien-Hung Lin , Pei-Kuei Tsung , Chi-Ming Chen , Meng-Hsuan Cheng , ShengJe Hung

IPC: G06F7/544 , G06N3/063

CPC classification number: G06F7/5443 , G06N3/063

Abstract: A processing unit performs multiply-and-accumulate (MAC) operations on asymmetrically quantized data. The processing unit includes a MAC hardware unit to perform the MAC operations on a first data sequence and a second data sequence to generate an asymmetric MAC output. Both the first data sequence and the second data sequence are asymmetrically quantized. The processing unit further includes an accumulator hardware unit to accumulate the first data sequence concurrently with the MAC operations to generate an accumulated output. The processing unit further includes a multiply-and-add (MAD) hardware unit to multiply the accumulated output with a second offset to generate a multiplication output, and to add the multiplication output, the asymmetric MAC output and a pre-computed value calculated before runtime to generate a final output. The second offset indicates an amount of asymmetry of the second data sequence with respect to zero.

Search Results

Country/Region

Patent validity

Application date

Publication (announcement) day

applicant

The country/region where the applicant is located

Inventor

IPC

IPC Department

IPC class

IPC subclass

IPC group

IPC team

Appearance classification