Matrix transfer accelerator system and method

    Publication number: US12073105B2

    Publication date: 2024-08-27

    Application number: US17877518

    Filing date: 2022-07-29

    Abstract: A matrix transfer accelerator (MTA) system/method that coordinates data transfers between an external data memory (EDM) and a local data memory (LDM) using matrix tiling and/or grouping is disclosed. The system utilizes foreground/background buffering that overlaps compute and data transfer operations and permits EDM-to-LDM data transfers with or without zero pad peripheral matrix filling. The system may incorporate an automated zero-fill direct memory access (DMA) controller (ZDC) that transfers data from the EDM to the LDM based on a set of DMA controller registers including a data width register (DWR), a transfer count register (TCR), a fill count register (FCR), an EDM source address register (ESR), and an LDM target address register (LTR). The ZDC transfers matrix data from the EDM[ESR] to the LDM[LTR] such that EDM matrix data of DWR row data width is automatically zero-filled around a periphery of a matrix written to the LDM matrix based on the FCR value.
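The zero-fill transfer described in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware: the function name `zdc_transfer` is hypothetical, both memories are modeled as flat Python lists, TCR is assumed to count elements, and FCR is assumed to be the border width (in elements) added on every side.

```python
def zdc_transfer(edm, ldm, dwr, tcr, fcr, esr, ltr):
    """Software model of a zero-fill DMA transfer (hypothetical helper).

    Assumptions (not taken from the patent text):
      - edm and ldm are flat lists of elements
      - dwr is the source row width in elements
      - tcr is the total number of source elements to move
      - fcr is the zero border width added on all four sides
    Returns the row width of the padded matrix written to ldm.
    """
    if tcr % dwr != 0:
        raise ValueError("TCR must be a whole number of DWR-wide rows")
    rows = tcr // dwr
    out_width = dwr + 2 * fcr

    out = []
    out.extend([0] * (fcr * out_width))        # top border rows
    for r in range(rows):
        start = esr + r * dwr
        out.extend([0] * fcr)                  # left border
        out.extend(edm[start:start + dwr])     # one source row from EDM
        out.extend([0] * fcr)                  # right border
    out.extend([0] * (fcr * out_width))        # bottom border rows

    if ltr + len(out) > len(ldm):
        raise ValueError("padded matrix does not fit in LDM")
    ldm[ltr:ltr + len(out)] = out              # write starting at LDM[LTR]
    return out_width
```

With a 2x3 source matrix and `fcr=1`, the model writes a 4x5 matrix whose outer ring is zero; LDM cells outside the target range are left untouched.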

    Circuit, system, and method for matrix decimation

    Publication number: US20230297377A1

    Publication date: 2023-09-21

    Application number: US18164806

    Filing date: 2023-02-06

    IPC classification: G06F9/30 G06F9/38

    CPC classification: G06F9/30098 G06F9/3816

    Abstract: A method is described herein. The method generally includes fetching a set of data from a memory coupled to a memory controller. The method generally includes determining a first subset of data from the set of data. The method generally includes determining a second subset of data from the set of data. The method generally includes determining a first element from the set of data. The method generally includes providing a vector including the first subset, the first element, and the second subset, wherein each element of the first subset is disposed in one portion of the vector and each element of the second subset is disposed in another portion of the vector. The method generally includes storing the vector into a register of the memory controller.
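The abstract does not say how the two subsets and the single element are chosen, so the following is one possible reading rather than the claimed method. It assumes the subsets are selected by index, and the example call assumes a stride-2 split (even positions, then one element, then odd positions). The name `build_decimated_vector` and its parameters are hypothetical.

```python
def build_decimated_vector(data, first_idx, elem_idx, second_idx):
    """Arrange fetched data as [first subset | single element | second subset].

    Hypothetical helper: the index sets are supplied by the caller because
    the abstract does not specify how the subsets are determined.
    """
    first = [data[i] for i in first_idx]     # occupies one portion of the vector
    second = [data[i] for i in second_idx]   # occupies another portion
    return first + [data[elem_idx]] + second


# Illustrative choice only: even positions, the last element, odd positions.
fetched = [10, 11, 12, 13, 14, 15, 16]
vector = build_decimated_vector(fetched, range(0, 6, 2), 6, range(1, 6, 2))
```

In this example the returned list stands in for the register contents, with the two subsets kept in separate, contiguous portions.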

    Two-dimensional zero padding in a stream of matrix elements

    Publication number: US11249759B2

    Publication date: 2022-02-15

    Application number: US16420457

    Filing date: 2019-05-23

    Abstract: Software instructions are executed on a processor within a computer system to configure a streaming engine with stream parameters to define a multidimensional array. The stream parameters define a size for each dimension of the multidimensional array and a specified width for two selected dimensions of the array. Data is fetched from a memory coupled to the streaming engine responsive to the stream parameters. A stream of vectors is formed for the multidimensional array responsive to the stream parameters from the data fetched from memory. When either selected dimension in the stream of vectors exceeds a respective specified width, the streaming engine inserts null elements into each portion of a respective vector for the selected dimension that exceeds the specified width in the stream of vectors. Stream vectors that are completely null are formed by the streaming engine without accessing the system memory for respective data.
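A rough software model of this behavior is sketched below. It is not the streaming engine's actual programming interface: the function `stream_vectors` and all of its parameters are hypothetical, the array is assumed to be two-dimensional and row-major, null elements are modeled as zeros, and a read counter stands in for memory accesses to show that fully null vectors need none.

```python
def stream_vectors(memory, base, rows, cols, valid_rows, valid_cols,
                   vec_len, pitch):
    """Form a stream of fixed-length vectors with two-dimensional zero padding.

    Hypothetical model: (rows, cols) is the declared stream size, while
    (valid_rows, valid_cols) plays the role of the specified widths of the
    two selected dimensions. Returns (vectors, number_of_memory_reads).
    """
    vectors = []
    reads = 0
    for r in range(rows):
        for c0 in range(0, cols, vec_len):
            # Entirely outside the valid region: emit a null vector
            # without touching memory.
            if r >= valid_rows or c0 >= valid_cols:
                vectors.append([0] * vec_len)
                continue
            vec = []
            for c in range(c0, min(c0 + vec_len, cols)):
                if c < valid_cols:
                    vec.append(memory[base + r * pitch + c])
                    reads += 1
                else:
                    vec.append(0)                # null element past the width
            vec.extend([0] * (vec_len - len(vec)))  # fill a short final vector
            vectors.append(vec)
    return vectors, reads
```

For a 2x3 array streamed as a 3x8 region in vectors of four elements, only the six real elements are read; the remaining lanes and vectors are generated as zeros.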

    On-the-fly padding for CNN feature maps
    Document type: Published patent application

    Publication number: US20240354003A1

    Publication date: 2024-10-24

    Application number: US18305871

    Filing date: 2023-04-24

    IPC classification: G06F3/06 G06N3/0464

    Abstract: Disclosed herein are systems and methods for providing on-the-fly padding to feature maps of convolutional neural networks (CNNs). In an implementation, a processor first identifies a padding schema for a feature map based on a type of convolution to be performed on the feature map. Next, the processor identifies a feature vector from the feature map currently in an associated memory. Then, the processor determines a padding for the feature vector based on the padding schema. Finally, the processor applies the padding to the feature vector while the feature vector is transferred from the associated memory to registers of the suitable computer.
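The four steps in this abstract can be sketched in software as follows. This is a minimal illustration under assumptions, not the disclosed implementation: the convolution types "same" and "valid", the stride of 1, the one-dimensional feature vector, and the names `padding_schema` and `load_padded` are all introduced here for the example.

```python
def padding_schema(conv_type, kernel_size):
    """Pick (left, right) zero padding from the convolution type.

    Assumed conventions: stride 1; 'valid' uses no padding; 'same' pads
    kernel_size - 1 elements in total, with any extra element on the right.
    """
    if conv_type == "valid":
        return (0, 0)
    if conv_type == "same":
        total = kernel_size - 1
        left = total // 2
        return (left, total - left)
    raise ValueError(f"unknown convolution type: {conv_type}")


def load_padded(memory, offset, length, schema):
    """Model a memory-to-register load that applies padding during the copy.

    The feature vector stays unpadded in memory; zeros appear only in the
    returned list, which stands in for the register contents.
    """
    left, right = schema
    feature_vector = memory[offset:offset + length]
    return [0] * left + feature_vector + [0] * right


# Example: a 3-tap 'same' convolution over two stored elements.
registers = load_padded([5, 6, 7, 8], offset=1, length=2,
                        schema=padding_schema("same", 3))
```

Keeping the stored feature map unpadded and adding zeros only during the load is the point of the sketch: no padded copy of the feature map is ever written back to memory.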