Asymmetric quantization of multiple-and-accumulate operations in deep learning processing

    Publication number: US10977001B2

    Publication date: 2021-04-13

    Application number: US16250874

    Application date: 2019-01-17

    Applicant: MediaTek Inc.

    Abstract: A processing unit performs multiply-and-accumulate (MAC) operations on asymmetrically quantized data. The processing unit includes a MAC hardware unit to perform the MAC operations on a first data sequence and a second data sequence to generate an asymmetric MAC output. Both the first data sequence and the second data sequence are asymmetrically quantized. The processing unit further includes an accumulator hardware unit to accumulate the first data sequence concurrently with the MAC operations to generate an accumulated output. The processing unit further includes a multiply-and-add (MAD) hardware unit to multiply the accumulated output with a second offset to generate a multiplication output, and to add the multiplication output, the asymmetric MAC output and a pre-computed value calculated before runtime to generate a final output. The second offset indicates an amount of asymmetry of the second data sequence with respect to zero.
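The arithmetic behind the abstract can be sketched in a few lines. This is a minimal illustration, not the patented hardware: it assumes each quantized value is restored by adding its offset, and it assumes the pre-computed value is `offset_a * sum(b_q) + n * offset_a * offset_b`, which is computable before runtime only if the second sequence (for example, filter weights) is fixed in advance. The function names, parameter names, and that formula are illustrative assumptions, not taken from the patent text.

```python
def asymmetric_mac(a_q, b_q, offset_a, offset_b):
    """Sketch of sum((a + offset_a) * (b + offset_b)) split into the three
    parts named in the abstract (names and sign convention are assumptions)."""
    n = len(a_q)

    # Assumed pre-computed value: depends only on b_q and the two offsets,
    # so it can be evaluated before runtime when b_q is known in advance.
    precomputed = offset_a * sum(b_q) + n * offset_a * offset_b

    mac_out = 0  # role of the MAC unit: sum of a_i * b_i on raw quantized data
    acc_out = 0  # role of the accumulator: sum of a_i, gathered in the same pass
    for a, b in zip(a_q, b_q):
        mac_out += a * b
        acc_out += a

    # Role of the MAD unit: multiply the accumulated output by the second
    # offset, then add the MAC output and the pre-computed value.
    return acc_out * offset_b + mac_out + precomputed


def reference(a_q, b_q, offset_a, offset_b):
    """Direct evaluation, used only to check the decomposition."""
    return sum((a + offset_a) * (b + offset_b) for a, b in zip(a_q, b_q))


if __name__ == "__main__":
    args = ([1, 2, 3], [4, 5, 6], -2, 3)
    print(asymmetric_mac(*args), reference(*args))
```

The point of the split is that the per-element loop touches only the raw quantized values; the offset corrections reduce to one multiply and two adds at the end.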

    Graphics Pipeline That Supports Multiple Concurrent Processes

    Publication number: US20180033114A1

    Publication date: 2018-02-01

    Application number: US15219509

    Application date: 2016-07-26

    Applicant: MediaTek Inc.

    CPC classification number: G06T1/20 G06F9/544 G06T1/60 G06T15/005 G06T15/80

    Abstract: A Graphics Processing Unit (GPU) concurrently executes kernel codes programmed in more than one programming framework. The GPU includes a first command decoder that decodes a first set of commands issued by a first Application Programming Interface (API) for executing a first kernel code. The GPU also includes a second command decoder that decodes a second set of commands issued by a second API for executing a second kernel code. The GPU also includes a plurality of shader cores and a pipe manager. According to decoded commands, the pipe manager assigns a first set of shader cores and a second set of shader cores to concurrently execute the first kernel code and the second kernel code, respectively.
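The division of labor in this abstract (two per-API command decoders feeding one pipe manager that hands out disjoint sets of shader cores) can be modeled as a toy sketch. Everything here is hypothetical: `decode`, `assign_cores`, the command tuple layout, and the first-come core allocation are illustrative assumptions and say nothing about how the actual GPU schedules work.

```python
def decode(api_name, commands):
    """Stand-in for one per-API command decoder (hypothetical).

    Each command is assumed to be a (kernel_name, cores_requested) tuple.
    """
    return [
        {"api": api_name, "kernel": kernel, "cores": cores}
        for kernel, cores in commands
    ]


def assign_cores(num_cores, first_decoded, second_decoded):
    """Toy pipe manager: give every decoded kernel its own disjoint set of
    shader core indices, so kernels from both APIs can run side by side."""
    assignment = {}
    next_core = 0
    for request in first_decoded + second_decoded:
        end = next_core + request["cores"]
        if end > num_cores:
            raise ValueError("not enough shader cores")
        assignment[(request["api"], request["kernel"])] = list(range(next_core, end))
        next_core = end
    return assignment


if __name__ == "__main__":
    first = decode("api_a", [("kernel_1", 3)])
    second = decode("api_b", [("kernel_2", 5)])
    print(assign_cores(8, first, second))
```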

    ASYMMETRIC QUANTIZATION OF MULTIPLE-AND-ACCUMULATE OPERATIONS IN DEEP LEARNING PROCESSING

    Publication number: US20190243610A1

    Publication date: 2019-08-08

    Application number: US16250874

    Application date: 2019-01-17

    Applicant: MediaTek Inc.

    CPC classification number: G06F7/5443 G06N3/063

    Abstract: A processing unit performs multiply-and-accumulate (MAC) operations on asymmetrically quantized data. The processing unit includes a MAC hardware unit to perform the MAC operations on a first data sequence and a second data sequence to generate an asymmetric MAC output. Both the first data sequence and the second data sequence are asymmetrically quantized. The processing unit further includes an accumulator hardware unit to accumulate the first data sequence concurrently with the MAC operations to generate an accumulated output. The processing unit further includes a multiply-and-add (MAD) hardware unit to multiply the accumulated output with a second offset to generate a multiplication output, and to add the multiplication output, the asymmetric MAC output and a pre-computed value calculated before runtime to generate a final output. The second offset indicates an amount of asymmetry of the second data sequence with respect to zero.
