SYSTEMS AND METHODS FOR CONTRASTIVE ATTENTION-SUPERVISED TUNING

    Publication No.: US20220156527A1

    Publication date: 2022-05-19

    Application No.: US17209011

    Filing date: 2021-03-22

    Abstract: Embodiments described herein provide Contrastive Attention-Supervised Tuning (CAST), a training method that improves the visual grounding ability of contrastive self-supervised learning (SSL) methods through a data augmentation strategy based on unsupervised saliency maps. In addition to the contrastive loss, which encourages the model to pick the crop that comes from the corresponding image, CAST provides explicit grounding supervision through a Grad-CAM-based attention loss that pushes the model to look at the object of interest common across different crops when making this decision. A new geometric transform randomly crops different views from an input image subject to constraints derived from a saliency map.
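
The saliency-constrained cropping mentioned in the abstract can be illustrated with a small sketch. The abstract does not state the actual constraints, so the rule used here (a crop is accepted only if it covers at least a fraction `min_coverage` of the total saliency mass) is an assumption, and the function name `saliency_constrained_crop` and all of its parameters are hypothetical.

```python
import random


def saliency_constrained_crop(saliency, crop_h, crop_w,
                              min_coverage=0.2, max_tries=100, rng=None):
    """Sample a crop box (top, left, height, width) by rejection sampling.

    `saliency` is a 2D list of non-negative scores. A candidate crop is
    accepted when it contains at least `min_coverage` of the total
    saliency mass (an assumed constraint, not the patent's exact rule).
    Returns None if no candidate qualifies within `max_tries` attempts.
    """
    rng = rng or random.Random()
    height, width = len(saliency), len(saliency[0])
    if crop_h > height or crop_w > width:
        raise ValueError("crop is larger than the saliency map")
    total = sum(sum(row) for row in saliency)
    if total <= 0:
        raise ValueError("saliency map has no positive mass")

    for _ in range(max_tries):
        # randint is inclusive on both ends, so the crop always fits.
        top = rng.randint(0, height - crop_h)
        left = rng.randint(0, width - crop_w)
        inside = sum(sum(row[left:left + crop_w])
                     for row in saliency[top:top + crop_h])
        if inside / total >= min_coverage:
            return top, left, crop_h, crop_w
    return None
```

Calling the function twice on the same saliency map would give two views that both contain part of the salient object, which is the property the contrastive objective relies on. Rejection sampling is chosen only for brevity.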

    SYSTEMS AND METHODS FOR CONTRASTIVE ATTENTION-SUPERVISED TUNING

    Publication No.: US20220156592A1

    Publication date: 2022-05-19

    Application No.: US17209013

    Filing date: 2021-03-22

    Abstract: Embodiments described herein provide Contrastive Attention-Supervised Tuning (CAST), a training method that improves the visual grounding ability of contrastive self-supervised learning (SSL) methods through a data augmentation strategy based on unsupervised saliency maps. In addition to the contrastive loss, which encourages the model to pick the crop that comes from the corresponding image, CAST provides explicit grounding supervision through a Grad-CAM-based attention loss that pushes the model to look at the object of interest common across different crops when making this decision. A new geometric transform randomly crops different views from an input image subject to constraints derived from a saliency map.
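
The attention loss mentioned in the abstract compares where the model looks (a Grad-CAM heatmap) with where the object is (a saliency mask). The abstract does not give the form of this loss, so the sketch below assumes a cosine distance between the two flattened maps. Computing the Grad-CAM heatmap is not shown, and the name `attention_loss` and the `eps` parameter are hypothetical.

```python
import math


def attention_loss(gradcam, saliency_mask, eps=1e-8):
    """Return 1 - cosine similarity between two 2D maps of equal size.

    `gradcam` is assumed to be a precomputed Grad-CAM heatmap and
    `saliency_mask` the saliency map restricted to the same crop.
    The loss is near 0 when the maps align and 1 when they do not overlap.
    """
    g = [v for row in gradcam for v in row]
    m = [v for row in saliency_mask for v in row]
    if len(g) != len(m):
        raise ValueError("maps must have the same number of elements")

    dot = sum(a * b for a, b in zip(g, m))
    norm_g = math.sqrt(sum(a * a for a in g))
    norm_m = math.sqrt(sum(b * b for b in m))
    # eps guards against division by zero for an all-zero map.
    return 1.0 - dot / (norm_g * norm_m + eps)
```

In a training setup this term would be added to the contrastive loss with some weighting. The weighting is likewise not specified in the abstract.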
