Patent search ap:("NVIDIA Corporation") AND inv:"Animashree Anandkumar" Page 2

11.

Invention publication
VISION-LANGUAGE MODEL WITH AN ENSEMBLE OF EXPERTS Pending - Published

Publication No.: US20240265690A1

Publication date: 2024-08-08

Application No.: US18544840

Application date: 2023-12-19

Applicant: NVIDIA Corporation

Inventor: Animashree Anandkumar, Linxi Fan, Zhiding Yu, Chaowei Xiao, Shikun Liu

IPC: G06V10/82, G06V10/80

CPC classification number: G06V10/82, G06V10/811

Abstract: A vision-language model learns skills and domain knowledge via distinct and separate task-specific neural networks, referred to as experts. Each expert is independently optimized for a specific task, facilitating the use of domain-specific data and architectures that are not feasible with a single large neural network trained for multiple tasks. The vision-language model is implemented as an ensemble of pre-trained experts and is more efficiently trained compared with the single large neural network. During training, the vision-language model integrates specialized skills and domain knowledge, rather than trying to simultaneously learn multiple tasks, resulting in effective multi-modal learning.
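
As a rough illustration of the idea in this abstract (not the patented implementation), the sketch below treats each expert as a frozen, pre-trained function and trains only a small set of combination weights over the experts' outputs. The expert functions, the names `EXPERTS`, `combine`, and `train_weights`, and the squared-error objective are all assumptions made for illustration.

```python
# Hypothetical sketch: frozen experts plus a small trainable combiner.
# Nothing here is taken from the patent beyond the general idea.

def caption_expert(x):
    """Stands in for a pre-trained, frozen expert."""
    return 2.0 * x

def detection_expert(x):
    """A second frozen expert with a different 'skill'."""
    return x + 1.0

EXPERTS = [caption_expert, detection_expert]

def combine(weights, x):
    """Weighted sum of the frozen experts' outputs."""
    return sum(w * expert(x) for w, expert in zip(weights, EXPERTS))

def train_weights(data, steps=5000, lr=0.01):
    """Fit only the combination weights by gradient descent on mean squared error."""
    weights = [0.0] * len(EXPERTS)
    for _ in range(steps):
        grads = [0.0] * len(EXPERTS)
        for x, y in data:
            err = combine(weights, x) - y
            for i, expert in enumerate(EXPERTS):
                grads[i] += 2.0 * err * expert(x) / len(data)
        weights = [w - lr * g for w, g in zip(weights, grads)]
    return weights

# Toy target that mixes both skills: y = 0.5 * (2x) + 1.0 * (x + 1) = 2x + 1
data = [(x, 2.0 * x + 1.0) for x in (0.0, 1.0, 2.0, 3.0)]
weights = train_weights(data)
```

The experts themselves are never updated here; only the two combination weights are learned, which is the sense in which training integrates existing skills rather than relearning each task.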

12.

Invention publication
SCENE RECONSTRUCTION FROM MONOCULAR VIDEO Pending - Published

Publication No.: US20240257443A1

Publication date: 2024-08-01

Application No.: US18524803

Application date: 2023-11-30

Applicant: NVIDIA Corporation

Inventor: Christopher B. Choy, Or Litany, Charles Loop, Yuke Zhu, Animashree Anandkumar, Wei Dong

IPC: G06T15/20, G06T1/20, G06T5/50, G06T5/70, G06T7/579, G06T7/90, G06T19/20

CPC classification number: G06T15/20, G06T1/20, G06T5/50, G06T5/70, G06T7/579, G06T7/90, G06T19/20, G06T2207/10028, G06T2207/20081, G06T2207/20084, G06T2210/04, G06T2210/21, G06T2219/2012

Abstract: A technique for reconstructing a three-dimensional scene from monocular video adaptively allocates an explicit sparse-dense voxel grid with dense voxel blocks around surfaces in the scene and sparse voxel blocks further from the surfaces. In contrast to conventional systems, the two-level voxel grid can be efficiently queried and sampled. In an embodiment, the scene surface geometry is represented as a signed distance field (SDF). Representation of the scene surface geometry can be extended to multi-modal data such as semantic labels and color. Because properties stored in the sparse-dense voxel grid structure are differentiable, the scene surface geometry can be optimized via differentiable volume rendering.
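
One way to picture the two-level grid described in this abstract is a hash map from coarse block coordinates to small dense arrays, with blocks allocated only where the surface passes nearby. The sketch below is a toy under stated assumptions: the sphere SDF, the block size, the truncation distance, and the names `build_grid` and `query` are illustrative and not drawn from the patent.

```python
import math

BLOCK = 4      # voxels per block edge (assumed)
TRUNC = 2.0    # allocate a block if any voxel is within this distance of the surface (assumed)

def sphere_sdf(x, y, z, radius=6.0):
    """Signed distance to a sphere centered at the origin."""
    return math.sqrt(x * x + y * y + z * z) - radius

def build_grid(extent=3):
    """Allocate dense blocks only where some voxel lies near the surface."""
    grid = {}  # (bx, by, bz) -> flat list of BLOCK**3 SDF samples
    coords = range(-extent, extent)
    for bx in coords:
        for by in coords:
            for bz in coords:
                values = [
                    sphere_sdf(bx * BLOCK + i, by * BLOCK + j, bz * BLOCK + k)
                    for i in range(BLOCK)
                    for j in range(BLOCK)
                    for k in range(BLOCK)
                ]
                if min(abs(v) for v in values) <= TRUNC:
                    grid[(bx, by, bz)] = values
    return grid

def query(grid, x, y, z):
    """Return the stored SDF at an integer voxel, or None if its block is unallocated."""
    block = grid.get((x // BLOCK, y // BLOCK, z // BLOCK))
    if block is None:
        return None
    i, j, k = x % BLOCK, y % BLOCK, z % BLOCK
    return block[(i * BLOCK + j) * BLOCK + k]

grid = build_grid()
on_surface = query(grid, 6, 0, 0)      # voxel lying on the sphere
far_away = query(grid, 11, 11, 11)     # voxel in a block that was never allocated
```

A query costs one dictionary lookup plus one array index, which hints at why a two-level layout can be sampled efficiently. The differentiable rendering step mentioned in the abstract is not shown.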

13.

Invention publication
FAIRNESS-BASED NEURAL NETWORK MODEL TRAINING USING REAL AND GENERATED DATA Pending - Published

Publication No.: US20240144000A1

Publication date: 2024-05-02

Application No.: US18307227

Application date: 2023-04-26

Applicant: NVIDIA Corporation

Inventor: Yuji Roh, Weili Nie, De-An Huang, Arash Vahdat, Animashree Anandkumar

IPC: G06N3/08

CPC classification number: G06N3/08

Abstract: A neural network model is trained for fairness and accuracy using both real and synthesized training data, such as images. During training a first sampling ratio between the real and synthesized training data is optimized. The first sampling ratio may comprise a value for each group (or attribute), where each value is optimized. A second sampling ratio defines relative amounts of training data that are used for each one of the groups. Furthermore, a neural network model accuracy and a fairness metric are both used for updating the first and second sampling ratios during training iterations. The neural network model may be trained using different classes of training data. The second sampling ratio may vary for each class.
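
To make the two ratios concrete, the sketch below shows how they could jointly determine the composition of one training batch: a per-group share of the batch (the second ratio) and, within each group, a real-versus-synthesized split (the first ratio). The function name `batch_composition`, the group names, and the rounding scheme are assumptions. The abstract does not say how the ratios are updated, so no update rule is shown.

```python
def batch_composition(batch_size, group_share, real_fraction):
    """Return {group: (n_real, n_synth)} for one batch.

    group_share   -- fraction of the batch drawn from each group (must sum to 1)
    real_fraction -- per-group fraction of that group's samples that are real
    """
    if abs(sum(group_share.values()) - 1.0) > 1e-9:
        raise ValueError("group shares must sum to 1")
    plan = {}
    for group, share in group_share.items():
        # Rounding is a simplification; totals can drift by a sample in general.
        n_group = round(batch_size * share)
        n_real = round(n_group * real_fraction[group])
        plan[group] = (n_real, n_group - n_real)
    return plan

plan = batch_composition(
    batch_size=100,
    group_share={"group_a": 0.6, "group_b": 0.4},
    real_fraction={"group_a": 0.9, "group_b": 0.5},
)
```

In a training loop, both dictionaries would be adjusted between iterations based on accuracy and a fairness metric; here they are fixed inputs.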

14.

Invention publication
DATA SET GENERATION AND AUGMENTATION FOR MACHINE LEARNING MODELS Pending - Published

Publication No.: US20230351807A1

Publication date: 2023-11-02

Application No.: US17661706

Application date: 2022-05-02

Applicant: NVIDIA Corporation

Inventor: Yuzhuo Ren, Weili Nie, Arash Vahdat, Animashree Anandkumar, Nishant Puri, Niranjan Avadhanam

IPC: G06V40/16, G06V10/82, G06V10/774, G06V10/62

CPC classification number: G06V40/176, G06V10/82, G06V10/774, G06V10/62, G06V40/164

Abstract: A machine learning model (MLM) may be trained and evaluated. Attribute-based performance metrics may be analyzed to identify attributes for which the MLM is performing below a threshold when each is present in a sample. A generative neural network (GNN) may be used to generate samples including compositions of the attributes, and the samples may be used to augment the data used to train the MLM. This may be repeated until one or more criteria are satisfied. In various examples, a temporal sequence of data items, such as frames of a video, may be generated which may form samples of the data set. Sets of attribute values may be determined based on one or more temporal scenarios to be represented in the data set, and one or more GNNs may be used to generate the sequence to depict information corresponding to the attribute values.
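
The evaluate, generate, and retrain loop in this abstract can be sketched as follows. The model and the generative network are replaced by toy stand-ins: `evaluate` reports per-attribute accuracy as a simple function of how many samples carry that attribute, and `generate` emits labeled placeholders. These stand-ins, the threshold, the attribute names, and the use of single attributes rather than compositions are all hypothetical simplifications.

```python
from collections import Counter

THRESHOLD = 0.8  # assumed minimum acceptable per-attribute accuracy

def evaluate(dataset):
    """Toy stand-in for training and evaluating the MLM: accuracy grows with sample count."""
    counts = Counter(attr for attr, _ in dataset)
    return {attr: min(1.0, n / 10) for attr, n in counts.items()}

def generate(attr, n):
    """Toy stand-in for a generative network producing n samples with the given attribute."""
    return [(attr, "synthetic")] * n

def augment_until_ok(dataset, max_rounds=10, per_round=2):
    """Repeat: evaluate, find weak attributes, add generated samples for them."""
    dataset = list(dataset)
    for _ in range(max_rounds):
        metrics = evaluate(dataset)
        weak = [attr for attr, acc in metrics.items() if acc < THRESHOLD]
        if not weak:
            break  # stopping criterion: every attribute meets the threshold
        for attr in weak:
            dataset.extend(generate(attr, per_round))
    return dataset

seed = [("glasses", "real")] * 3 + [("no_glasses", "real")] * 9
final = augment_until_ok(seed)
```

Only the under-performing attribute receives generated samples, so the data set grows where the model is weakest rather than uniformly.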
