Patent search ap:("NVIDIA Corporation") AND inv:"Shikun Liu" Page 1

1.

Invention Publication
VISION-LANGUAGE MODEL WITH AN ENSEMBLE OF EXPERTS Under examination (published)

Publication (announcement) number: US20240265690A1

Publication (announcement) date: 2024-08-08

Application number: US18544840

Filing date: 2023-12-19

Applicant: NVIDIA Corporation

Inventor: Animashree Anandkumar, Linxi Fan, Zhiding Yu, Chaowei Xiao, Shikun Liu

IPC: G06V10/82, G06V10/80

CPC classification number: G06V10/82, G06V10/811

Abstract: A vision-language model learns skills and domain knowledge via distinct and separate task-specific neural networks, referred to as experts. Each expert is independently optimized for a specific task, which allows the use of domain-specific data and architectures that are not feasible with a single large neural network trained for multiple tasks. The vision-language model is implemented as an ensemble of pre-trained experts and is trained more efficiently than the single large neural network. During training, the vision-language model integrates specialized skills and domain knowledge rather than trying to learn multiple tasks simultaneously, resulting in effective multi-modal learning.
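
Illustrative sketch (not part of the patent record): the abstract describes an ensemble of independently pre-trained experts whose outputs are integrated by the vision-language model. The record does not say how the integration is done, so the Python below is only a minimal sketch under stated assumptions. The toy experts `brightness_expert` and `edge_expert`, the class `ExpertEnsemble`, and the per-expert scalar gates are all hypothetical names and mechanisms chosen for illustration. Real experts would be neural networks; plain functions stand in for them here.

```python
from typing import Callable, Dict, List, Sequence

Image = Sequence[float]                  # stand-in for pixel data
Expert = Callable[[Image], List[float]]  # a frozen, pre-trained expert


def brightness_expert(image: Image) -> List[float]:
    """Hypothetical expert: one feature, the mean intensity."""
    return [sum(image) / len(image)]


def edge_expert(image: Image) -> List[float]:
    """Hypothetical expert: absolute differences between neighbours."""
    return [abs(b - a) for a, b in zip(image, image[1:])]


class ExpertEnsemble:
    """Combines frozen experts; only the gates would be trained.

    Assumption: integration is a gated concatenation of expert
    features. The patent abstract does not specify this.
    """

    def __init__(self, experts: Dict[str, Expert]) -> None:
        self.experts = experts
        # One scalar gate per expert: the only trainable parameters here.
        self.gates: Dict[str, float] = {name: 1.0 for name in experts}

    def encode(self, image: Image) -> List[float]:
        features: List[float] = []
        for name, expert in self.experts.items():
            # Expert weights are never modified; only its output is scaled.
            features.extend(self.gates[name] * v for v in expert(image))
        return features


ensemble = ExpertEnsemble({"brightness": brightness_expert, "edge": edge_expert})
fused = ensemble.encode([0.0, 2.0, 4.0])
```

The design point the sketch tries to show is that the experts stay fixed while only a small integration component (here, the gates) would be adjusted during training, which is one way to read the abstract's efficiency claim.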
