Granted Invention Patent
- Patent title: Optimized and scalable sparse triangular linear systems on networks of accelerators
- Application number: US16044145
- Filing date: 2018-07-24
- Publication number: US10936697B2
- Publication date: 2021-03-02
- Inventors: Khaled Hamidouche, Michael W. LeBeane, Nicholas P. Malaya, Joseph L. Greathouse
- Applicant: Advanced Micro Devices, Inc.
- Applicant address: US CA Santa Clara
- Patentee: Advanced Micro Devices, Inc.
- Current patentee: Advanced Micro Devices, Inc.
- Current patentee address: US CA Santa Clara
- Agent: Liang & Cheng, PC
- Main classification: G06F17/16
- IPC classification: G06F17/16; G06F9/38; G06F9/30; G06F17/12
Abstract:
A method includes storing a first portion of a sparse triangular matrix in a local memory and launching a kernel for executing a set of workgroups. The first portion includes a plurality of row blocks, and each workgroup in the set of workgroups is associated with one of the plurality of row blocks. The method also includes, for each workgroup in the set of workgroups, solving the row block. The row block is solved by, for each row segment of a first subset of row segments in the row block, calculating a partial sum for the row segment based on one or more matrix elements in the row segment, and writing the partial sum to a remote memory of a first remote processing unit prior to terminating the kernel.
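The data flow in the abstract can be illustrated with a small sequential simulation. This is only a sketch under stated assumptions, not the patented implementation: the abstract does not say how the matrix is partitioned, so the code assumes each processing unit owns one contiguous block of columns (and the matching unknowns) of a lower triangular matrix. The function name `solve_lower_distributed`, the `inbox` lists standing in for remote memory, and the plain Python loop standing in for one kernel launch per unit are all hypothetical.

```python
def solve_lower_distributed(rows, b, num_units):
    """Solve L x = b for a sparse lower triangular L.

    rows[i] is a list of (column, value) pairs for the nonzeros of row i,
    diagonal included. ASSUMPTION: unit p owns a contiguous block of columns;
    the abstract does not specify the partitioning.
    """
    n = len(b)
    block = -(-n // num_units)  # ceiling division: columns per unit
    # inbox[p][i] models unit p's memory, where other units accumulate
    # their partial sums for row i (hypothetical stand-in for remote writes).
    inbox = [[0.0] * n for _ in range(num_units)]
    x = [0.0] * n

    # Each iteration stands in for one kernel on one unit. Real accelerators
    # would run concurrently and synchronize; here units run in order.
    for unit in range(num_units):
        lo = unit * block
        hi = min(n, lo + block)

        # Rows whose diagonal is local: finish the solve for x[i].
        for i in range(lo, hi):
            local_sum = 0.0
            diag = None
            for j, v in rows[i]:
                if j == i:
                    diag = v
                elif lo <= j < i:
                    local_sum += v * x[j]
            if diag is None:
                raise ValueError(f"row {i} has no diagonal entry")
            x[i] = (b[i] - inbox[unit][i] - local_sum) / diag

        # Rows whose diagonal lives on a later unit: compute the partial sum
        # of this unit's row segment and write it to the owner's memory
        # before this "kernel" ends.
        for i in range(hi, n):
            segment = [(j, v) for j, v in rows[i] if lo <= j < hi]
            if segment:
                partial = sum(v * x[j] for j, v in segment)
                inbox[i // block][i] += partial
    return x


# Example: a 4x4 lower triangular system split across two simulated units.
rows = [
    [(0, 2.0)],
    [(0, 1.0), (1, 1.0)],
    [(1, 3.0), (2, 4.0)],
    [(0, 1.0), (2, 2.0), (3, 5.0)],
]
b = [2.0, 3.0, 18.0, 27.0]
x = solve_lower_distributed(rows, b, num_units=2)
```

In this sketch the first unit solves rows 0 and 1, then pushes its contributions to rows 2 and 3 into the second unit's `inbox`, so the second unit never has to read the first unit's part of the solution vector directly.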