Patent search ap:("NVIDIA Corporation") AND inv:"Linxi Fan" Page 1

1.

Published invention application
VISION-LANGUAGE MODEL WITH AN ENSEMBLE OF EXPERTS (Status: under examination, published)

Publication (announcement) number: US20240265690A1

Publication (announcement) date: 2024-08-08

Application number: US18544840

Application date: 2023-12-19

Applicant: NVIDIA Corporation

Inventor: Animashree Anandkumar, Linxi Fan, Zhiding Yu, Chaowei Xiao, Shikun Liu

IPC: G06V10/82, G06V10/80

CPC classification number: G06V10/82, G06V10/811

Abstract: A vision-language model learns skills and domain knowledge via distinct and separate task-specific neural networks, referred to as experts. Each expert is independently optimized for a specific task, facilitating the use of domain-specific data and architectures that are not feasible with a single large neural network trained for multiple tasks. The vision-language model is implemented as an ensemble of pre-trained experts and is more efficiently trained compared with the single large neural network. During training, the vision-language model integrates specialized skills and domain knowledge, rather than trying to simultaneously learn multiple tasks, resulting in effective multi-modal learning.

2.

Invention application
DATA GENERATION OF ROBOTIC DEVICES PERFORMING TASKS (Status: granted)

Publication (announcement) number: US20250073901A1

Publication (announcement) date: 2025-03-06

Application number: US18239601

Application date: 2023-08-29

Applicant: NVIDIA Corporation

Inventor: Ajay Uday Mandlekar, Soroush Nasiriany, Bowen Wen, Iretiayo Akinola, Yashraj Shyam Narang, Linxi Fan, Yuke Zhu, Dieter Fox

IPC: B25J9/16, B25J19/02

Abstract: Apparatuses, systems, and techniques to generate data to train a robotic device to perform tasks. In at least one embodiment, one or more first videos of a robotic device performing a task are used to generate one or more second videos of the robotic device performing the task differently than depicted in the one or more first videos.
