RECONFIGURABLE ARCHITECTURE FOR FUSED DEPTH-WISE SEPARABLE CONVOLUTION (DSC)

    Publication number: US20240070441A1

    Publication date: 2024-02-29

    Application number: US18451726

    Application date: 2023-08-17

    CPC classification number: G06N3/0464 G06N3/10

    Abstract: A method of operating a depth-wise separable convolutional (DSC) network on a DSC accelerator includes determining a difference between a first throughput associated with a depth-wise convolution (DWC) engine of the DSC accelerator and a second throughput associated with a point-wise convolution (PWC) engine of the DSC accelerator. The method also includes selectively activating, for each layer of the DSC network, each first processing element (PE) in one or more of a first set of columns of first PEs associated with the DWC engine and/or each second PE in one or more of a second set of columns of second PEs associated with the PWC engine based on the difference between the first throughput and the second throughput. The method further includes processing, for each layer of the DSC network, an input via the DSC accelerator based on selectively activating each first PE and/or each second PE.
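    The abstract describes per-layer activation of DWC and PWC PE columns based on a throughput difference. The following is a minimal Python sketch of that idea only; the names `Layer`, `dwc_throughput`, `pwc_throughput`, and `select_active_columns`, as well as the proportional column-scaling rule, are illustrative assumptions and are not taken from the patent.

    ```python
    # Hypothetical sketch: match DWC and PWC engine throughputs per layer by
    # selectively activating PE columns of the faster engine.
    from dataclasses import dataclass

    @dataclass
    class Layer:
        name: str
        dwc_throughput: float  # outputs/cycle the DWC engine can sustain for this layer
        pwc_throughput: float  # outputs/cycle the PWC engine can sustain for this layer

    def select_active_columns(layer: Layer, dwc_columns: int, pwc_columns: int):
        """Return (active DWC columns, active PWC columns) so that the two
        engines' effective throughputs are roughly balanced for this layer."""
        diff = layer.dwc_throughput - layer.pwc_throughput
        if diff > 0:
            # DWC engine is faster: keep all PWC columns on, scale DWC columns down.
            ratio = layer.pwc_throughput / layer.dwc_throughput
            return max(1, round(dwc_columns * ratio)), pwc_columns
        if diff < 0:
            # PWC engine is faster: keep all DWC columns on, scale PWC columns down.
            ratio = layer.dwc_throughput / layer.pwc_throughput
            return dwc_columns, max(1, round(pwc_columns * ratio))
        return dwc_columns, pwc_columns

    # Example: a layer where the DWC engine is twice as fast as the PWC engine.
    layer = Layer("block3_dsc", dwc_throughput=64.0, pwc_throughput=32.0)
    print(select_active_columns(layer, dwc_columns=8, pwc_columns=8))  # -> (4, 8)
    ```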

    FEDERATED LEARNING WITH TRAINING METADATA
    Invention publication

    Publication number: US20230316090A1

    Publication date: 2023-10-05

    Application number: US18153687

    Application date: 2023-01-12

    CPC classification number: G06N3/098 G06F9/54

    Abstract: Certain aspects of the present disclosure provide techniques and apparatus for performing federated learning. One example method generally includes sending model update data to a server, generating training metadata using a trained local machine learning model and local validation data, and sending the training metadata to the server. The trained local machine learning model generally incorporates the model update data and global model data defining a global machine learning model, and the training metadata generally includes data about the trained local machine learning model used to determine when to discontinue federated learning operations for training the global machine learning model. Another example method generally includes sending a global model to a federated learning client device and receiving training metadata from the federated learning client device.
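    The abstract outlines a client/server exchange in which clients report training metadata alongside model updates and the server uses that metadata to decide when to stop federated training. The Python sketch below illustrates that flow under assumptions of my own: `ClientReport`, `evaluate`, `client_round`, `server_should_stop`, the use of a validation-loss threshold as the stopping rule, and the placeholder local training step are all hypothetical and not taken from the patent.

    ```python
    # Hypothetical sketch: a federated-learning round where clients send a model
    # update plus training metadata, and the server stops training once the
    # aggregated metadata indicates convergence.
    from dataclasses import dataclass
    from statistics import mean
    from typing import Dict, List

    @dataclass
    class ClientReport:
        model_update: Dict[str, float]       # e.g., weight deltas keyed by parameter name
        training_metadata: Dict[str, float]  # e.g., local validation loss

    def evaluate(model: Dict[str, float], validation_data: List[float]) -> float:
        # Stand-in for a real validation pass: returns a dummy loss value.
        return sum(abs(x) for x in validation_data) / max(len(validation_data), 1)

    def client_round(global_model: Dict[str, float],
                     local_validation_data: List[float]) -> ClientReport:
        # Incorporate the global model and train locally (training itself omitted),
        # then derive training metadata from local validation data.
        local_model = dict(global_model)
        model_update = {name: 0.0 for name in local_model}  # placeholder weight deltas
        loss = evaluate(local_model, local_validation_data)
        return ClientReport(model_update, {"validation_loss": loss})

    def server_should_stop(reports: List[ClientReport], threshold: float = 0.05) -> bool:
        # Discontinue federated learning once the mean client-reported
        # validation loss falls below the target threshold.
        losses = [r.training_metadata["validation_loss"] for r in reports]
        return mean(losses) < threshold

    reports = [client_round({"w": 0.1}, [0.02, 0.03]) for _ in range(3)]
    print(server_should_stop(reports))  # True: mean validation loss 0.025 < 0.05
    ```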
