METHOD AND APPARATUS FOR READING CACHE DATA, AND STORAGE MEDIUM

    Publication No.: US20240184704A1

    Publication date: 2024-06-06

    Application No.: US18524706

    Filing date: 2023-11-30

    Abstract: A method and an apparatus for reading cache data, and a storage medium, are provided. The method includes: receiving a read instruction; converting at least one type of address offset position corresponding to the read instruction into a coupled address offset position according to a preset rule; performing a matching operation in a data group according to the coupled address offset position corresponding to the at least one type of address offset position, and obtaining the corresponding cache data; and reading the cache data obtained through matching. The method can perform matching and reading operations for at least two combined address offset positions simultaneously, thereby greatly improving data reading efficiency and doubling cache data reading throughput without significantly increasing hardware logic. Furthermore, the configuration is applicable to read-only, read-write, and write-only caches, making it highly versatile.

    METHOD AND APPARATUS FOR LOADING TASK DATA, AND COMPUTER DEVICE

    Publication No.: US20240289048A1

    Publication date: 2024-08-29

    Application No.: US18226687

    Filing date: 2023-07-26

    IPC classification: G06F3/06

    Abstract: Disclosed are a method and an apparatus for loading task data, a computer device, a storage medium, and a computer program product. The method includes: analyzing the various types of buffers involved in a task after the task is initiated, and determining whether each of the buffers satisfies a preset read-only buffer condition; determining a buffer that satisfies the read-only buffer condition to be a read-only buffer; mapping the read-only buffer into a matched read-only storage space based on space information of the read-only storage space and size information of the read-only buffer, and obtaining corresponding read-only mapping information; and loading the task data in the read-only buffer into the matched read-only storage space based on the read-only mapping information. The method improves the loading efficiency of task data.
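
Illustrative sketch (not part of the patent text): the classify-then-map flow described in the abstract above might look roughly like the following. The `written_by_task` flag stands in for the unspecified "preset read-only buffer condition", the first-fit matching policy is an assumption, and every name here (`Buffer`, `ReadOnlySpace`, `map_read_only_buffers`) is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Buffer:
    name: str
    size: int
    written_by_task: bool  # hypothetical stand-in for the read-only condition


@dataclass
class ReadOnlySpace:
    name: str
    capacity: int
    used: int = 0

    @property
    def free(self) -> int:
        return self.capacity - self.used


def map_read_only_buffers(buffers, spaces):
    """Return {buffer name: (space name, offset)} for read-only buffers that fit."""
    mapping = {}
    for buf in buffers:
        if buf.written_by_task:
            continue  # fails the assumed read-only condition
        for space in spaces:
            # First-fit by remaining capacity; the abstract does not name a policy.
            if space.free >= buf.size:
                mapping[buf.name] = (space.name, space.used)
                space.used += buf.size
                break
    return mapping


buffers = [
    Buffer("weights", 4096, written_by_task=False),
    Buffer("output", 2048, written_by_task=True),
    Buffer("lookup_table", 1024, written_by_task=False),
]
spaces = [ReadOnlySpace("ro_space_0", 4096), ReadOnlySpace("ro_space_1", 2048)]
mapping = map_read_only_buffers(buffers, spaces)
print(mapping)
```

The returned dictionary plays the role of the "read-only mapping information": a loader would copy each buffer's task data to the recorded space and offset.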

    MULTIPLICATION-ACCUMULATION SYSTEM, MULTIPLICATION-ACCUMULATION METHOD, AND ELECTRONIC DEVICE

    Publication No.: US20240020094A1

    Publication date: 2024-01-18

    Application No.: US18222101

    Filing date: 2023-07-14

    IPC classification: G06F7/523

    CPC classification: G06F7/523

    Abstract: A multiplication-accumulation method and apparatus, a processor, and a computer program product are provided. The method includes: when a logical operation unit performs a single-precision floating-point multiplication-accumulation operation, combining the two half-precision multiplier-accumulators in each single-precision multiplication-accumulation unit to perform the multiplication-accumulation operation on to-be-processed single-precision floating-point numbers and obtain corresponding single-precision multiplication-accumulation results, a total of N multiplication-accumulation results being obtained; and when the logical operation unit performs a half-precision floating-point multiplication-accumulation operation, performing, by each half-precision multiplier-accumulator, the multiplication-accumulation operation on to-be-processed half-precision floating-point numbers to obtain corresponding half-precision multiplication-accumulation results, a total of 2N multiplication-accumulation results being obtained. Utilization of the multiplier-accumulators is thereby improved.
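
Illustrative sketch (not part of the patent text): the N versus 2N result counts in the abstract above can be modeled in software as follows. This only mirrors the lane accounting and the output precision; how the two half-precision multiplier-accumulators are actually combined in hardware is not described in the abstract and is not modeled. The function names and the `"fp32"`/`"fp16"` mode strings are hypothetical.

```python
import struct


def round_to(fmt, value):
    """Round a Python float to IEEE 754 binary16 ('e') or binary32 ('f')."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


def run_logical_unit(n_units, mode, operands):
    """Model one pass through N single-precision MAC units.

    Each unit is assumed to hold two half-precision multiplier-accumulators.
    operands is a list of (a, b, c) triples, one per result lane.
    """
    if mode == "fp32":
        lanes, fmt = n_units, "f"      # two half-precision MACs combined per unit
    elif mode == "fp16":
        lanes, fmt = 2 * n_units, "e"  # each half-precision MAC works alone
    else:
        raise ValueError(f"unknown mode: {mode}")
    if len(operands) != lanes:
        raise ValueError(f"{mode} mode expects {lanes} operand triples")
    # a * b + c is evaluated in double precision and rounded once; this is a
    # simplification, not a claim about the hardware datapath.
    return [round_to(fmt, a * b + c) for a, b, c in operands]


N = 2
fp32_results = run_logical_unit(N, "fp32", [(1.5, 2.0, 0.25)] * N)
fp16_results = run_logical_unit(N, "fp16", [(1.5, 2.0, 0.25)] * (2 * N))
print(len(fp32_results), len(fp16_results))
```

With the same N units, the half-precision mode yields twice as many results per pass, which is the utilization gain the abstract refers to.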

    DATA PROCESSING METHOD, COMPUTER DEVICE AND STORAGE MEDIUM

    Publication No.: US20240176585A1

    Publication date: 2024-05-30

    Application No.: US18225467

    Filing date: 2023-07-24

    IPC classification: G06F7/523 G06F7/36

    CPC classification: G06F7/523 G06F7/36

    Abstract: The present application relates to a data processing method, a computer device, and a storage medium. The method includes: acquiring data formats of two pieces of input data, the data formats of the two pieces of input data being the same; determining, from a plurality of preset data conversion algorithms, a target data conversion algorithm matching the data formats, and performing data format conversion on the two pieces of input data by using the target data conversion algorithm to obtain at least two pieces of target input data; processing the at least two pieces of target input data by using a multiplier to obtain a preliminary operation result; and determining truncation bit widths corresponding to the two pieces of input data, and processing the preliminary operation result according to the truncation bit widths to obtain a multiplication operation result corresponding to the two pieces of input data.

    THREAD CONSTRUCTION METHOD AND DEVICE
    Document type: Invention publication

    Publication No.: US20240004702A1

    Publication date: 2024-01-04

    Application No.: US17952730

    Filing date: 2022-09-26

    IPC classification: G06F9/48

    CPC classification: G06F9/4881

    Abstract: Disclosed are a thread construction method and device. The method includes: dividing a workload into a plurality of work groups; for any given work group, selecting a pattern type that matches the size of the work group, and determining a target thread construction pattern from a plurality of candidate thread construction patterns corresponding to the pattern type; and constructing a plurality of threads according to the target thread construction pattern, each thread being composed of a plurality of consecutive work items in the work group. The work item index corresponding to at least one key work item in the work item sequence of each thread is cached, and the work item index of each thread is obtained and used to schedule the work items corresponding to the thread to the processing unit.
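
Illustrative sketch (not part of the patent text): one way to read the steps in the abstract above is shown below. The size threshold, the candidate patterns (expressed here as "work items per thread"), the divisibility rule for picking the target pattern, and the choice of the first work item as the cached key work item are all assumptions made for illustration.

```python
# Hypothetical candidate patterns per pattern type: work items per thread.
CANDIDATE_PATTERNS = {"small": (2, 1), "large": (8, 4)}


def split_into_work_groups(total_items, group_size):
    """Divide a workload of total_items work items into consecutive work groups."""
    return [
        range(start, min(start + group_size, total_items))
        for start in range(0, total_items, group_size)
    ]


def choose_pattern(group_len):
    """Pick a pattern type from the group size, then a target pattern from its candidates."""
    pattern_type = "small" if group_len <= 16 else "large"  # assumed threshold
    for items_per_thread in CANDIDATE_PATTERNS[pattern_type]:
        if group_len % items_per_thread == 0:
            return items_per_thread
    return 1  # fallback: one work item per thread


def construct_threads(group):
    """Build threads of consecutive work items, caching only the key (first) index."""
    items_per_thread = choose_pattern(len(group))
    return [
        {"key_index": start, "count": items_per_thread}
        for start in range(group.start, group.stop, items_per_thread)
    ]


def work_items(thread):
    """Recover every work item index of a thread from its cached key index."""
    return list(range(thread["key_index"], thread["key_index"] + thread["count"]))


groups = split_into_work_groups(total_items=40, group_size=32)
threads_per_group = [construct_threads(g) for g in groups]
print([t["key_index"] for t in threads_per_group[0]])
print(work_items(threads_per_group[0][1]))
```

Because each thread covers consecutive work items, storing one key index per thread is enough to reconstruct the rest at scheduling time.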

    CONVOLUTION OPERATION METHOD AND APPARATUS, MATRIX DECOMPRESSION DEVICE, AND GRAPHICS PROCESSOR

    Publication No.: US20240004615A1

    Publication date: 2024-01-04

    Application No.: US18216809

    Filing date: 2023-06-30

    IPC classification: G06F7/78 G06F17/15

    CPC classification: G06F7/78 G06F17/15

    Abstract: A convolution operation method and apparatus, a matrix decompression device, and a graphics processor are provided. The method includes: for any sub-feature map in an original feature map, loading, from a preset memory layout, at least one target feature tile constituting that sub-feature map, the memory layout being obtained by writing at least one feature tile into memory according to a preset data arrangement, and the at least one feature tile being obtained by tiling the original feature map; decompressing a feature map composed of the at least one target feature tile according to a convolution parameter of a convolutional layer to obtain a destination decompressed matrix; and performing a matrix multiplication operation on the destination decompressed matrix and the decompressed matrix corresponding to a convolution kernel to obtain a convolution operation result of the original feature map. The present disclosure may improve convolution operation efficiency.
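
Illustrative sketch (not part of the patent text): the tile, load, decompress, and multiply sequence in the abstract above resembles the well-known im2col approach to convolution, so the sketch below uses im2col as a stand-in for the "decompression" step. That correspondence is an assumption, as are the single-channel input, the dictionary used as the memory layout, and all function names.

```python
def tile_feature_map(fmap, tile):
    """Split a 2-D feature map into tile x tile blocks keyed by (tile row, tile col)."""
    layout = {}
    for tr in range(0, len(fmap), tile):
        for tc in range(0, len(fmap[0]), tile):
            layout[(tr // tile, tc // tile)] = [
                row[tc:tc + tile] for row in fmap[tr:tr + tile]
            ]
    return layout


def load_sub_feature_map(layout, tile_rows, tile_cols):
    """Reassemble the sub-feature map covered by the given tile indices."""
    sub = []
    for tr in tile_rows:
        for r in range(len(layout[(tr, tile_cols[0])])):
            line = []
            for tc in tile_cols:
                line.extend(layout[(tr, tc)][r])
            sub.append(line)
    return sub


def im2col(fmap, k, stride=1):
    """Expand each k x k window into one matrix row (assumed 'decompression')."""
    rows, cols = len(fmap), len(fmap[0])
    return [
        [fmap[r + i][c + j] for i in range(k) for j in range(k)]
        for r in range(0, rows - k + 1, stride)
        for c in range(0, cols - k + 1, stride)
    ]


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


fmap = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
layout = tile_feature_map(fmap, tile=2)
kernel_matrix = [[1], [0], [0], [1]]  # 2x2 kernel [[1, 0], [0, 1]] flattened to a column

top_half = load_sub_feature_map(layout, tile_rows=[0], tile_cols=[0, 1])
top_result = matmul(im2col(top_half, k=2), kernel_matrix)

whole = load_sub_feature_map(layout, tile_rows=[0, 1], tile_cols=[0, 1])
full_result = matmul(im2col(whole, k=2), kernel_matrix)
print(top_result)
```

Expressing the convolution as a single matrix product lets it run on hardware optimized for matrix multiplication, at the cost of duplicating overlapping window elements in the expanded matrix.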