Patent search ap:("Nvidia Corporation") AND inv:"Zhiding Yu" Page 1

1.

Granted invention patent
Learning dense correspondences for images (Active)

Publication number: US12169882B2

Publication date: 2024-12-17

Application number: US17929182

Application date: 2022-09-01

Applicant: NVIDIA Corporation

Inventor: Sifei Liu, Jiteng Mu, Shalini De Mello, Zhiding Yu, Jan Kautz

IPC: G06T11/00, G06T3/18

Abstract: Embodiments of the present disclosure relate to learning dense correspondences for images. Systems and methods are disclosed that disentangle structure and texture (or style) representations of GAN synthesized images by learning a dense pixel-level correspondence map for each image during image synthesis. A canonical coordinate frame is defined and a structure latent code for each generated image is warped to align with the canonical coordinate frame. In sum, the structure associated with the latent code is mapped into a shared coordinate space (canonical coordinate space), thereby establishing correspondences in the shared coordinate space. A correspondence generation system receives the warped coordinate correspondences as an encoded image structure. The encoded image structure and a texture latent code are used to synthesize an image. The shared coordinate space enables propagation of semantic labels from reference images to synthesized images.

2.

Granted invention patent
Learning contrastive representation for semantic correspondence (Active)

Publication number: US11960570B2

Publication date: 2024-04-16

Application number: US17412091

Application date: 2021-08-25

Applicant: NVIDIA Corporation

Inventor: Taihong Xiao, Sifei Liu, Shalini De Mello, Zhiding Yu, Jan Kautz

IPC: G06F18/00, G06F18/213, G06F18/214, G06N3/08, G06V10/22, G06V30/14

CPC classification number: G06F18/2155, G06F18/213, G06N3/08, G06V10/22, G06V30/1444

Abstract: A multi-level contrastive training strategy for training a neural network relies on image pairs (no other labels) to learn semantic correspondences at the image level and region or pixel level. The neural network is trained using contrasting image pairs including different objects and corresponding image pairs including different views of the same object. Conceptually, contrastive training pulls corresponding image pairs closer and pushes contrasting image pairs apart. An image-level contrastive loss is computed from the outputs (predictions) of the neural network and used to update parameters (weights) of the neural network via backpropagation. The neural network is also trained via pixel-level contrastive learning using only image pairs. Pixel-level contrastive learning receives an image pair, where each image includes an object in a particular category.
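The "pull corresponding pairs closer, push contrasting pairs apart" idea in the abstract above can be illustrated with a small sketch. The abstract does not state the form of the loss, so the InfoNCE-style formulation, the cosine similarity, the temperature value, and all function names below are assumptions chosen for illustration, not the patented method.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def image_level_contrastive_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss (an assumed form, not taken from the patent).

    anchor, positive: features of a corresponding pair (two views of one object).
    negatives: features of contrasting images (different objects).
    The loss is small when the anchor is much closer to the positive
    than to any negative.
    """
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


# Hypothetical 2-D features, purely for illustration.
anchor = [1.0, 0.0]
aligned = image_level_contrastive_loss(anchor, [0.9, 0.1], [[0.0, 1.0], [-1.0, 0.0]])
misaligned = image_level_contrastive_loss(anchor, [0.0, 1.0], [[0.9, 0.1], [-1.0, 0.0]])
print(aligned < misaligned)
```

In a training loop, a loss of this kind would be computed on network outputs and backpropagated to update the weights, as the abstract describes; the pixel-level variant would apply the same idea to per-pixel or per-region features.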

3.

Published invention application
NEURAL NETWORK PROMPT TUNING (Under examination, published)

Publication number: US20240095534A1

Publication date: 2024-03-21

Application number: US18243348

Application date: 2023-09-07

Applicant: NVIDIA Corporation

Inventor: Anima Anandkumar, Chaowei Xiao, Weili Nie, De-An Huang, Zhiding Yu, Manli Shu

IPC: G06N3/084, G06N3/045

CPC classification number: G06N3/084, G06N3/045

Abstract: Apparatuses, systems, and techniques to perform neural networks. In at least one embodiment, a most consistent output of one or more pre-trained neural networks is to be selected. In at least one embodiment, a most consistent output of one or more pre-trained neural networks is to be selected based, at least in part, on a plurality of variances of one or more inputs to the one or more neural networks.

4.

Invention application
AUTOMATIC LABELING AND SEGMENTATION USING MACHINE LEARNING MODELS (Active)

Publication number: US20220292306A1

Publication date: 2022-09-15

Application number: US17201816

Application date: 2021-03-15

Applicant: NVIDIA Corporation

Inventor: Subhashree Radhakrishnan, Partha Sriram, Farzin Aghdasi, Seunghwan Cha, Zhiding Yu

IPC: G06K9/62, G06K9/20, G06K9/32, G06T3/00, G06T7/12, G06K9/00

Abstract: In various examples, training methods are described to generate a trained neural network that is robust to various environmental features. In an embodiment, training includes modifying images of a dataset and generating boundary boxes and/or other segmentation information for the modified images, which is used to train a neural network.

5.

Published invention application
VISION-LANGUAGE MODEL WITH AN ENSEMBLE OF EXPERTS (Under examination, published)

Publication number: US20240265690A1

Publication date: 2024-08-08

Application number: US18544840

Application date: 2023-12-19

Applicant: NVIDIA Corporation

Inventor: Animashree Anandkumar, Linxi Fan, Zhiding Yu, Chaowei Xiao, Shikun Liu

IPC: G06V10/82, G06V10/80

CPC classification number: G06V10/82, G06V10/811

Abstract: A vision-language model learns skills and domain knowledge via distinct and separate task-specific neural networks, referred to as experts. Each expert is independently optimized for a specific task, facilitating the use of domain-specific data and architectures that are not feasible with a single large neural network trained for multiple tasks. The vision-language model is implemented as an ensemble of pre-trained experts and is more efficiently trained compared with the single large neural network. During training, the vision-language model integrates specialized skills and domain knowledge, rather than trying to simultaneously learn multiple tasks, resulting in effective multi-modal learning.

6.

Granted invention patent
Automatic labeling and segmentation using machine learning models (Active)

Publication number: US11899749B2

Publication date: 2024-02-13

Application number: US17201816

Application date: 2021-03-15

Applicant: NVIDIA Corporation

Inventor: Subhashree Radhakrishnan, Partha Sriram, Farzin Aghdasi, Seunghwan Cha, Zhiding Yu

IPC: G06F18/214, G06T7/12, G06T3/00, G06V10/22, G06V10/24, G06V20/40

CPC classification number: G06F18/214, G06T3/0006, G06T7/12, G06V10/22, G06V10/242, G06V20/40, G06T2207/20081, G06T2207/20084

Abstract: In various examples, training methods are described to generate a trained neural network that is robust to various environmental features. In an embodiment, training includes modifying images of a dataset and generating boundary boxes and/or other segmentation information for the modified images, which is used to train a neural network.

7.

Published invention application
ESTIMATING OPTIMAL TRAINING DATA SET SIZES FOR MACHINE LEARNING MODEL SYSTEMS AND APPLICATIONS (Under examination, published)

Publication number: US20230376849A1

Publication date: 2023-11-23

Application number: US18318212

Application date: 2023-05-16

Applicant: NVIDIA Corporation

Inventor: Rafid Reza Mahmood, Marc Law, James Robert Lucas, Zhiding Yu, Jose Manuel Alvarez Lopez, Sanja Fidler

IPC: G06N20/00

CPC classification number: G06N20/00

Abstract: In various examples, estimating optimal training data set sizes for machine learning model systems and applications. Systems and methods are disclosed that estimate an amount of data to include in a training data set, where the training data set is then used to train one or more machine learning models to reach a target validation performance. To estimate the amount of training data, subsets of an initial training data set may be used to train the machine learning model(s) in order to determine estimates for the minimum amount of training data needed to train the machine learning model(s) to reach the target validation performance. The estimates may then be used to generate one or more functions, such as a cumulative density function and/or a probability density function, wherein the function(s) is then used to estimate the amount of training data needed to train the machine learning model(s).

8.

Published invention application
LEARNING DENSE CORRESPONDENCES FOR IMAGES (Under examination, published)

Publication number: US20230252692A1

Publication date: 2023-08-10

Application number: US17929182

Application date: 2022-09-01

Applicant: NVIDIA Corporation

Inventor: Sifei Liu, Jiteng Mu, Shalini De Mello, Zhiding Yu, Jan Kautz

IPC: G06T11/00, G06T3/00

CPC classification number: G06T11/001, G06T3/0093

Abstract: Embodiments of the present disclosure relate to learning dense correspondences for images. Systems and methods are disclosed that disentangle structure and texture (or style) representations of GAN synthesized images by learning a dense pixel-level correspondence map for each image during image synthesis. A canonical coordinate frame is defined and a structure latent code for each generated image is warped to align with the canonical coordinate frame. In sum, the structure associated with the latent code is mapped into a shared coordinate space (canonical coordinate space), thereby establishing correspondences in the shared coordinate space. A correspondence generation system receives the warped coordinate correspondences as an encoded image structure. The encoded image structure and a texture latent code are used to synthesize an image. The shared coordinate space enables propagation of semantic labels from reference images to synthesized images.

9.

Invention application
LEARNING CONTRASTIVE REPRESENTATION FOR SEMANTIC CORRESPONDENCE (Active)

Publication number: US20230074706A1

Publication date: 2023-03-09

Application number: US17412091

Application date: 2021-08-25

Applicant: NVIDIA Corporation

Inventor: Taihong Xiao, Sifei Liu, Shalini De Mello, Zhiding Yu, Jan Kautz

IPC: G06K9/62, G06K9/20, G06N3/08

Abstract: A multi-level contrastive training strategy for training a neural network relies on image pairs (no other labels) to learn semantic correspondences at the image level and region or pixel level. The neural network is trained using contrasting image pairs including different objects and corresponding image pairs including different views of the same object. Conceptually, contrastive training pulls corresponding image pairs closer and pushes contrasting image pairs apart. An image-level contrastive loss is computed from the outputs (predictions) of the neural network and used to update parameters (weights) of the neural network via backpropagation. The neural network is also trained via pixel-level contrastive learning using only image pairs. Pixel-level contrastive learning receives an image pair, where each image includes an object in a particular category.

10.

Granted invention patent
Cross-domain image processing for object re-identification (Active)

Publication number: US11367268B2

Publication date: 2022-06-21

Application number: US16998890

Application date: 2020-08-20

Applicant: NVIDIA Corporation

Inventor: Xiaodong Yang, Yang Zou, Zhiding Yu, Jan Kautz

IPC: G06K9/44, G06V10/34, G06K9/62, G06N3/04, G06N3/08, G06V10/40, G06V20/40, G06V30/262

Abstract: Object re-identification refers to a process by which images that contain an object of interest are retrieved from a set of images captured using disparate cameras or in disparate environments. Object re-identification has many useful applications, particularly as it is applied to people (e.g. person tracking). Current re-identification processes rely on convolutional neural networks (CNNs) that learn re-identification for a particular object class from labeled training data specific to a certain domain (e.g. environment), but that do not apply well in other domains. The present disclosure provides cross-domain disentanglement of id-related and id-unrelated factors. In particular, the disentanglement is performed using a labeled image set and an unlabeled image set, respectively captured from different domains but for a same object class. The identification-related features may then be used to train a neural network to perform re-identification of objects in that object class from images captured from the second domain.