Architecture to support tanh and sigmoid operations for inference acceleration in machine learning
摘要:
A processing unit to support inference acceleration for machine learning (ML) comprises an inline post processing unit configured to accept and maintain one or more lookup tables for performing a tanh and/or sigmoid operation/function. The inline post processing unit is further configured to accept data from a set of registers configured to maintain output from a processing block instead of streaming the data from an on-chip memory (OCM), perform the tanh and/or sigmoid operation on each element of the data from the processing block on a per-element basis via the one or more lookup tables, and stream post processing result of the per-element tanh and/or sigmoid operation back to the OCM after the tanh and/or sigmoid operation is complete.
信息查询
0/0