Fabric vectors for deep learning acceleration
摘要:
Techniques in advanced deep learning provide improvements in one or more of accuracy, performance, and energy efficiency. An array of processing elements performs flow-based computations on wavelets of data. Each processing element has a respective compute element and a respective routing element. Instructions executed by the compute element include operand specifiers, some specifying a data structure register storing a data structure descriptor describing an operand as a fabric vector or a memory vector. The data structure descriptor further describes various attributes of the fabric vector: length, microthreading eligibility, number of data elements to receive, transmit, and/or process in parallel, virtual channel and task identification information, whether to terminate upon receiving a control wavelet, and whether to mark an outgoing wavelet a control wavelet.
公开/授权文献
信息查询
0/0