-
Publication No.: US20240169692A1
Publication Date: 2024-05-23
Application No.: US17991410
Application Date: 2022-11-21
Inventor: Kanchana RANASINGHE , Muhammad Muzammal NASEER , Salman KHAN , Fahad KHAN
IPC: G06V10/74 , G06V20/59 , G06V40/20 , H04N19/132
CPC classification number: G06V10/761 , G06V20/597 , G06V40/23 , H04N19/132
Abstract: A system, computer-readable medium, and method train a video transformer, using a machine learning engine, for human action recognition in a video. The method includes sampling video clips at varying temporal resolutions for the global views and sampling the video clips from different spatiotemporal windows for the local views. The machine learning engine is configured to match the global and local views in a student-teacher network framework, learning cross-view correspondence between local and global views and motion correspondence across the varying temporal resolutions. The video transformer can then output video clips for display in a manner that emphasizes attention to the recognized human action.
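The sampling and matching pipeline can be pictured with a short, hypothetical PyTorch-style sketch. The frame counts, crop size, temperatures, and all function names below are illustrative assumptions, not the patented implementation: the teacher encodes global views sampled at two temporal resolutions, the student additionally encodes local spatiotemporal windows, and their outputs are matched with a distillation-style loss.

    # Hypothetical sketch of cross-view / motion correspondence training.
    # Shapes, frame counts, and temperatures are illustrative assumptions.
    import torch
    import torch.nn.functional as F

    def sample_global_view(video, num_frames):
        # Uniformly sample frames at a chosen temporal resolution, keeping the full spatial field.
        # video: (batch, time, channels, height, width)
        t = video.shape[1]
        idx = torch.linspace(0, t - 1, num_frames).long()
        return video[:, idx]

    def sample_local_view(video, num_frames, crop=96):
        # Sample a random spatiotemporal window: a short contiguous frame range plus a spatial crop.
        b, t, c, h, w = video.shape
        t0 = torch.randint(0, t - num_frames + 1, (1,)).item()
        y0 = torch.randint(0, h - crop + 1, (1,)).item()
        x0 = torch.randint(0, w - crop + 1, (1,)).item()
        return video[:, t0:t0 + num_frames, :, y0:y0 + crop, x0:x0 + crop]

    def match_loss(student_out, teacher_out, temp_s=0.1, temp_t=0.04):
        # Cross-entropy between the sharpened teacher distribution and the student distribution;
        # the teacher branch is not back-propagated.
        t = F.softmax(teacher_out / temp_t, dim=-1).detach()
        s = F.log_softmax(student_out / temp_s, dim=-1)
        return -(t * s).sum(dim=-1).mean()

    def training_step(student, teacher, video):
        # Teacher encodes global views at two temporal resolutions; the student encodes
        # every view, and each student view is matched against each teacher view.
        global_views = [sample_global_view(video, n) for n in (8, 16)]
        local_views = [sample_local_view(video, 4) for _ in range(4)]
        loss = 0.0
        for g in global_views:
            with torch.no_grad():
                t_out = teacher(g)
            for v in global_views + local_views:
                if v is g:
                    continue
                loss = loss + match_loss(student(v), t_out)
        return loss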
-
Publication No.: US20240203098A1
Publication Date: 2024-06-20
Application No.: US18084152
Application Date: 2022-12-19
Inventor: Maryam SULTANA , Muhammad Muzammal NASEER , Muhammad Haris KHAN , Salman KHAN , Fahad Shahbaz KHAN
IPC: G06V10/774 , G06V10/764 , G06V10/77 , G06V10/776 , G06V10/82
CPC classification number: G06V10/774 , G06V10/764 , G06V10/7715 , G06V10/776 , G06V10/82 , G06V2201/03
Abstract: An apparatus and method provide a machine learning engine for domain generalization that trains a vision transformer neural network, using a training dataset including at least two domains, for diagnosis of a medical condition. Image patches and class tokens are processed through a sequence of feature-extraction transformer blocks to obtain a predicted class token. In parallel, intermediate class tokens are extracted as the outputs of each feature-extraction transformer block, where each transformer block is a sub-model. One sub-model is randomly sampled from the sub-models to obtain a sampled intermediate class token, which is used to make a sub-model prediction. The vision transformer neural network is optimized based on a difference between the predicted class token and the sub-model prediction. Inferencing is then performed on a target medical image from a target domain that is different from the at least two training domains.
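A hypothetical sketch of the sub-model idea follows. The block structure, the shared classification head, and the exact consistency loss are assumptions made for illustration, not the claimed implementation: intermediate class tokens are collected after each transformer block, one is randomly sampled as a sub-model prediction, and the network is optimized on the gap between that prediction and the full-model prediction.

    # Hypothetical sketch of sub-model (intermediate class-token) regularization
    # for a vision transformer; the shared head and loss terms are assumptions.
    import random
    import torch.nn as nn
    import torch.nn.functional as F

    class ViTWithSubModels(nn.Module):
        def __init__(self, blocks, embed_dim, num_classes):
            super().__init__()
            self.blocks = nn.ModuleList(blocks)        # feature-extraction transformer blocks
            self.head = nn.Linear(embed_dim, num_classes)

        def forward(self, tokens):
            # tokens: (batch, 1 + num_patches, embed_dim); index 0 holds the class token.
            intermediate_cls = []
            for blk in self.blocks:
                tokens = blk(tokens)
                intermediate_cls.append(tokens[:, 0])  # class token after each block = one sub-model
            final_logits = self.head(tokens[:, 0])     # prediction from the full model
            # Randomly sample one earlier block as a sub-model (assumes more than one block)
            # and classify its class token with the shared head.
            sampled_cls = random.choice(intermediate_cls[:-1])
            sub_logits = self.head(sampled_cls)
            return final_logits, sub_logits

    def training_loss(final_logits, sub_logits, labels):
        # Supervised loss on the full model plus a term penalizing the difference
        # between the sampled sub-model prediction and the full-model prediction.
        ce = F.cross_entropy(final_logits, labels)
        kl = F.kl_div(F.log_softmax(sub_logits, dim=-1),
                      F.softmax(final_logits, dim=-1).detach(),
                      reduction="batchmean")
        return ce + kl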
-
Publication No.: US20240212330A1
Publication Date: 2024-06-27
Application No.: US18089107
Application Date: 2022-12-27
Inventor: Mohammad Hanan GANI , Muhammad Muzammal NASEER , Mohammad YAQUB
IPC: G06V10/774 , G06V10/94 , G06V20/69 , G06V20/70
CPC classification number: G06V10/7753 , G06V10/95 , G06V20/695 , G06V20/698 , G06V20/70 , G06V2201/03
Abstract: A deep learning training system and method include an imaging system for capturing medical images, a machine learning engine, and a display. The machine learning engine selects a small-scale set of images from a training dataset, generates global views by randomly selecting regions in an image, and generates local views by randomly selecting regions covering less than a majority of the image. It receives the generated global views as a first sequence of non-overlapping image patches, receives the generated global views and the generated local views as a second sequence of non-overlapping image patches, and trains parameters of a student-teacher network to predict a class of objects by self-supervised view prediction using the first sequence and the second sequence. The teacher network's parameters are updated as an exponential moving average of the student network's parameters. The teacher network's parameters are then transferred to a vision transformer, which is trained by supervised learning.
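A minimal, hypothetical sketch of the pretraining loop is given below. The optimizer, momentum value, temperatures, and data-loader interface are assumptions rather than the claimed method: the student is trained by self-supervised view prediction against a teacher whose parameters are an exponential moving average of the student's, and the resulting teacher weights are then handed off for supervised training of the vision transformer.

    # Hypothetical sketch of student-teacher pretraining with an EMA-updated teacher;
    # hyperparameters and the loader's (global_views, local_views) format are assumptions.
    import copy
    import torch
    import torch.nn.functional as F

    @torch.no_grad()
    def ema_update(teacher, student, momentum=0.996):
        # The teacher's parameters track an exponential moving average of the student's.
        for p_t, p_s in zip(teacher.parameters(), student.parameters()):
            p_t.mul_(momentum).add_(p_s, alpha=1.0 - momentum)

    def view_prediction_loss(student, teacher, global_views, local_views,
                             temp_s=0.1, temp_t=0.04):
        # The teacher encodes global views only; the student encodes global and local
        # views, and each student view is trained to predict each teacher distribution.
        loss, pairs = 0.0, 0
        for g in global_views:
            t_out = F.softmax(teacher(g) / temp_t, dim=-1).detach()
            for v in global_views + local_views:
                if v is g:
                    continue
                s_out = F.log_softmax(student(v) / temp_s, dim=-1)
                loss = loss - (t_out * s_out).sum(dim=-1).mean()
                pairs += 1
        return loss / pairs

    def pretrain(student, loader, steps, lr=1e-4):
        teacher = copy.deepcopy(student)
        for p in teacher.parameters():
            p.requires_grad_(False)
        optimizer = torch.optim.AdamW(student.parameters(), lr=lr)
        data_iter = iter(loader)
        for _ in range(steps):
            global_views, local_views = next(data_iter)  # view crops from the data pipeline
            loss = view_prediction_loss(student, teacher, global_views, local_views)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            ema_update(teacher, student)
        # Teacher weights are subsequently transferred to a vision transformer
        # for supervised fine-tuning.
        return teacher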
-