-
Publication No.: US20210081752A1
Publication date: 2021-03-18
Application No.: US16931211
Filing date: 2020-07-16
Applicant: NVIDIA Corporation
Inventor: Yu-Wei Chao , De-An Huang , Christopher Jason Paxton , Animesh Garg , Dieter Fox
Abstract: Apparatuses, systems, and techniques to identify a goal of a demonstration. In at least one embodiment, video data of a demonstration is analyzed to identify a goal. Object trajectories identified in the video data are analyzed with respect to a task predicate satisfied by a respective object trajectory, and with respect to a motion predicate. Analysis of the trajectory with respect to the motion predicate is used to assess intentionality of a trajectory with respect to the goal.
-
Publication No.: US20240144000A1
Publication date: 2024-05-02
Application No.: US18307227
Filing date: 2023-04-26
Applicant: NVIDIA Corporation
Inventor: Yuji Roh , Weili Nie , De-An Huang , Arash Vahdat , Animashree Anandkumar
IPC: G06N3/08
CPC classification number: G06N3/08
Abstract: A neural network model is trained for fairness and accuracy using both real and synthesized training data, such as images. During training a first sampling ratio between the real and synthesized training data is optimized. The first sampling ratio may comprise a value for each group (or attribute), where each value is optimized. A second sampling ratio defines relative amounts of training data that are used for each one of the groups. Furthermore, a neural network model accuracy and a fairness metric are both used for updating the first and second sampling ratios during training iterations. The neural network model may be trained using different classes of training data. The second sampling ratio may vary for each class.
-
Publication No.: US11893468B2
Publication date: 2024-02-06
Application No.: US16931211
Filing date: 2020-07-16
Applicant: NVIDIA Corporation
Inventor: Yu-Wei Chao , De-An Huang , Christopher Jason Paxton , Animesh Garg , Dieter Fox
Abstract: Apparatuses, systems, and techniques to identify a goal of a demonstration. In at least one embodiment, video data of a demonstration is analyzed to identify a goal. Object trajectories identified in the video data are analyzed with respect to a task predicate satisfied by a respective object trajectory, and with respect to a motion predicate. Analysis of the trajectory with respect to the motion predicate is used to assess intentionality of a trajectory with respect to the goal.
-
Publication No.: US20240273682A1
Publication date: 2024-08-15
Application No.: US18431527
Filing date: 2024-02-02
Applicant: NVIDIA Corporation
Inventor: Weili Nie , Guan-Horng Liu , Arash Vahdat , De-An Huang , Anima Anandkumar
Abstract: Image restoration generally involves recovering a target clean image from a given image having noise, blurring, or other degraded features. Current image restoration solutions typically include a diffusion model that is trained for image restoration by a forward process that progressively diffuses data to noise, and then by learning in a reverse process to generate the data from the noise. However, the forward process relies on Gaussian noise to diffuse the original data, and that noise carries little or no structural information about the original data, whereas the degraded image itself is much more structurally informative than random Gaussian noise. Similar problems also exist for other data-to-data translation tasks. The present disclosure trains a data translation conditional diffusion model from diffusion bridge(s) computed between a first version of the data and a second version of the data, which can yield a model that provides interpretable generation, sampling efficiency, and reduced processing time.
-
Publication No.: US20240037756A1
Publication date: 2024-02-01
Application No.: US18144071
Filing date: 2023-05-05
Applicant: NVIDIA Corporation
Inventor: De-An Huang , Zhiding Yu , Anima Anandkumar
CPC classification number: G06T7/20 , G06T5/20 , G06T7/70 , G06V10/761 , G06V10/82 , G06V2201/07 , G06T2207/20081
Abstract: Apparatuses, systems, and techniques to track one or more objects in one or more frames of a video. In at least one embodiment, one or more objects in one or more frames of a video are tracked based on, for example, one or more sets of embeddings.
-
Publication No.: US20250111552A1
Publication date: 2025-04-03
Application No.: US18819064
Filing date: 2024-08-29
Applicant: NVIDIA Corporation
Inventor: Sihyun Yu , Weili Nie , De-An Huang , Boyi Li , Animashree Anandkumar
IPC: G06T11/00 , G06N3/0455 , G06T9/00
Abstract: Systems and methods are disclosed that train a content frame-motion latent diffusion model (CMD) and use the CMD to generate requested videos. The CMD may be a two-stage framework that first compresses videos to a succinct latent space and then learns the video distribution in this latent space. For instance, the CMD may include an autoencoder and two diffusion models. In a first stage, using the autoencoder, a low-dimensional latent decomposition into a content frame and latent motion representation is learned. In the second stage, without adding any new parameters, the content frame distribution may be fine-tuned by using a pretrained image diffusion model, which allows the CMD to leverage the rich visual knowledge in pretrained image diffusion models. In addition, a new lightweight diffusion model may be used to generate motion latent representations that are conditioned on the given content frame.
-
Publication No.: US20240095534A1
Publication date: 2024-03-21
Application No.: US18243348
Filing date: 2023-09-07
Applicant: NVIDIA Corporation
Inventor: Anima Anandkumar , Chaowei Xiao , Weili Nie , De-An Huang , Zhiding Yu , Manli Shu
Abstract: Apparatuses, systems, and techniques to perform neural networks. In at least one embodiment, a most consistent output of one or more pre-trained neural networks is to be selected. In at least one embodiment, a most consistent output of one or more pre-trained neural networks is to be selected based, at least in part, on a plurality of variances of one or more inputs to the one or more neural networks.
-
Publication No.: US20250103968A1
Publication date: 2025-03-27
Application No.: US18821611
Filing date: 2024-08-30
Applicant: NVIDIA Corporation
Inventor: Zizheng Pan , De-An Huang , Weili Nie , Zhiding Yu , Chaowei Xiao , Anima Anandkumar
IPC: G06N20/20
Abstract: Diffusion models are machine learning algorithms that are uniquely trained to generate high-quality data from lower-quality input data. Diffusion probabilistic models use discrete-time random processes or continuous-time stochastic differential equations (SDEs) that learn to gradually remove the noise added to the data points. With diffusion probabilistic models, high-quality output currently requires sampling from a large diffusion probabilistic model, which comes at a high computational cost. The present disclosure stitches together the trajectory of two or more inferior diffusion probabilistic models during a denoising process, which can in turn accelerate the denoising process by avoiding use of only a single large diffusion probabilistic model.
-
Publication No.: US20240221166A1
Publication date: 2024-07-04
Application No.: US18395198
Filing date: 2023-12-22
Applicant: NVIDIA Corporation
Inventor: Zhiding Yu , Shuaiyi Huang , De-An Huang , Shiyi Lan , Subhashree Radhakrishnan , Jose M. Alvarez Lopez , Anima Anandkumar
IPC: G06T7/12 , G06V10/764 , G06V20/70
CPC classification number: G06T7/12 , G06V10/764 , G06V20/70 , G06T2207/20081
Abstract: Video instance segmentation is a computer vision task that aims to detect, segment, and track objects continuously in videos. It can be used in numerous real-world applications, such as video editing, three-dimensional (3D) reconstruction, 3D navigation (e.g. for autonomous driving and/or robotics), and view point estimation. However, current machine learning-based processes employed for video instance segmentation are lacking, particularly because the densely annotated videos needed for supervised training of high-quality models are not readily available and are not easily generated. To address the issues in the prior art, the present disclosure provides point-level supervision for video instance segmentation in a manner that allows the resulting machine learning model to handle any object category.