-
Publication number: US12141941B2
Publication date: 2024-11-12
Application number: US17562494
Filing date: 2021-12-27
Applicant: NVIDIA Corporation
Inventor: Tero Tapani Karras, Miika Samuli Aittala, Samuli Matias Laine, Erik Andreas Härkönen, Janne Johannes Hellsten, Jaakko T. Lehtinen, Timo Oskari Aila
IPC: G06T3/20, G06T3/4046, G06T3/4053, G06T3/60
Abstract: Systems and methods are disclosed that improve output quality of any neural network, particularly an image generative neural network. In the real world, details of different scales tend to transform hierarchically. For example, moving a person's head causes the nose to move, which in turn moves the skin pores on the nose. Conventional generative neural networks do not synthesize images in a natural hierarchical manner: the coarse features seem to mainly control the presence of finer features, but not the precise positions of the finer features. Instead, much of the fine detail appears to be fixed to pixel coordinates, which is a manifestation of aliasing. Aliasing breaks the illusion of a solid and coherent object moving in space. A generative neural network with reduced aliasing provides an architecture that exhibits a more natural transformation hierarchy, where the exact sub-pixel position of each feature is inherited from underlying coarse features.
-
Publication number: US20240135630A1
Publication date: 2024-04-25
Application number: US18485225
Filing date: 2023-10-11
Applicant: NVIDIA Corporation
Inventor: Koki Nagano, Eric Ryan Wong Chan, Tero Tapani Karras, Shalini De Mello, Miika Samuli Aittala, Matthew Aaron Wong Chan
IPC: G06T15/06, G06T5/00, G06T5/50, G06V10/44, G06V10/771
CPC classification number: G06T15/06, G06T5/002, G06T5/50, G06V10/44, G06V10/771, G06T2207/20084, G06T2207/20221
Abstract: A method and system for performing novel image synthesis using generative networks are provided. An encoder-based model is trained to infer a 3D representation of an input image. A feature image is then generated using volume rendering techniques in accordance with the 3D representation. The feature image is then concatenated with a noisy image and processed by a denoiser network to predict an output image from a novel viewpoint that is consistent with the input image. The denoiser network can be a modified Noise Conditional Score Network (NCSN). In some embodiments, multiple input images or keyframes can be provided as input, and a different 3D representation is generated for each input image. The feature image is then generated, during volume rendering, by sampling each of the 3D representations and applying a mean-pooling operation to generate an aggregate feature image.
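The mean-pooling step for multiple keyframes can be illustrated with a minimal sketch. It assumes that each 3D representation has already been sampled at the same points and that each sample yields a feature vector; the function name `mean_pool_samples` and the list-based data layout are hypothetical and are not taken from the patent.

```python
from typing import List

# Hypothetical layout: one feature vector per sample point.
Features = List[List[float]]


def mean_pool_samples(per_representation: List[Features]) -> Features:
    """Average feature vectors sampled at the same points from several
    3D representations (one representation per input keyframe)."""
    if not per_representation:
        raise ValueError("need at least one 3D representation")
    count = len(per_representation)
    num_points = len(per_representation[0])
    pooled: Features = []
    for p in range(num_points):
        # Gather the feature vector at point p from every representation.
        vectors = [rep[p] for rep in per_representation]
        # zip(*vectors) groups values channel by channel across representations.
        pooled.append([sum(channel) / count for channel in zip(*vectors)])
    return pooled


if __name__ == "__main__":
    rep_a = [[1.0, 2.0]]  # one sample point, two channels
    rep_b = [[3.0, 4.0]]
    print(mean_pool_samples([rep_a, rep_b]))
```

Because the mean is symmetric in its inputs, the aggregate in this sketch does not depend on the order in which keyframes are supplied.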
-
Publication number: US11790598B2
Publication date: 2023-10-17
Application number: US17365574
Filing date: 2021-07-01
Applicant: NVIDIA Corporation
Inventor: Onni August Kosomaa, Jaakko T. Lehtinen, Samuli Matias Laine, Tero Tapani Karras, Miika Samuli Aittala
CPC classification number: G06T15/08, G06N3/045, G06T11/006, G16H30/40, G06T2210/41
Abstract: A three-dimensional (3D) density volume of an object is constructed from tomography images (e.g., x-ray images) of the object. The tomography images are projection images that capture all structures of an object (e.g., human body) between a beam source and an imaging sensor. The beam effectively integrates along a path through the object, producing a tomography image at the imaging sensor, where each pixel represents attenuation. A 3D reconstruction pipeline includes a first neural network model, a fixed function backprojection unit, and a second neural network model. Given information for the capture environment, the tomography images are processed by the reconstruction pipeline to produce a reconstructed 3D density volume of the object. In contrast with a set of 2D slices, the entire 3D density volume is reconstructed, so two-dimensional (2D) density images may be produced by slicing through any portion of the 3D density volume at any angle.
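How a projection image relates to a density volume can be sketched in a few lines. The example below is an illustration only: it assumes a parallel beam travelling along the z axis, uses the standard Beer-Lambert model for attenuation, and takes only an axis-aligned slice. It does not reproduce the neural network models or the backprojection unit of the patent, and all function names are hypothetical.

```python
import math
from typing import List

# Hypothetical layout: volume[z][y][x] holds density (attenuation coefficient).
Volume = List[List[List[float]]]
Image = List[List[float]]


def project_along_z(volume: Volume, step: float = 1.0) -> Image:
    """Approximate the line integral of density along z for every (y, x) pixel."""
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    return [
        [sum(volume[z][y][x] for z in range(depth)) * step for x in range(width)]
        for y in range(height)
    ]


def transmitted_fraction(line_integral: float) -> float:
    """Beer-Lambert law: fraction of beam intensity that reaches the sensor."""
    return math.exp(-line_integral)


def axial_slice(volume: Volume, z: int) -> Image:
    """Return a copy of the 2D density image at depth index z."""
    return [row[:] for row in volume[z]]


if __name__ == "__main__":
    vol = [[[1.0, 2.0]], [[3.0, 4.0]]]  # depth 2, height 1, width 2
    print(project_along_z(vol))
    print(axial_slice(vol, 1))
```

Slicing at an arbitrary angle, as the abstract describes, would additionally require interpolating the volume along a rotated plane, which this sketch leaves out.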
-
Publication number: US11610122B2
Publication date: 2023-03-21
Application number: US17143608
Filing date: 2021-01-07
Applicant: NVIDIA Corporation
Inventor: Tero Tapani Karras, Samuli Matias Laine, David Patrick Luebke, Jaakko T. Lehtinen, Miika Samuli Aittala, Timo Oskari Aila, Ming-Yu Liu, Arun Mohanray Mallya, Ting-Chun Wang
Abstract: A latent code defined in an input space is processed by a mapping neural network to produce an intermediate latent code defined in an intermediate latent space. The intermediate latent code may be used as an appearance vector that is processed by a synthesis neural network to generate an image. The appearance vector is a compressed encoding of data, such as video frames including a person's face, audio, and other data. Captured images may be converted into appearance vectors at a local device and transmitted to a remote device using much less bandwidth compared with transmitting the captured images. A synthesis neural network at the remote device reconstructs the images for display.
-
Publication number: US11580395B2
Publication date: 2023-02-14
Application number: US17069449
Filing date: 2020-10-13
Applicant: NVIDIA Corporation
Inventor: Tero Tapani Karras, Samuli Matias Laine, David Patrick Luebke, Jaakko T. Lehtinen, Miika Samuli Aittala, Timo Oskari Aila, Ming-Yu Liu, Arun Mohanray Mallya, Ting-Chun Wang
Abstract: A latent code defined in an input space is processed by a mapping neural network to produce an intermediate latent code defined in an intermediate latent space. The intermediate latent code may be used as an appearance vector that is processed by a synthesis neural network to generate an image. The appearance vector is a compressed encoding of data, such as video frames including a person's face, audio, and other data. Captured images may be converted into appearance vectors at a local device and transmitted to a remote device using much less bandwidth compared with transmitting the captured images. A synthesis neural network at the remote device reconstructs the images for display.
-
Publication number: US20210329306A1
Publication date: 2021-10-21
Application number: US17069253
Filing date: 2020-10-13
Applicant: NVIDIA Corporation
Inventor: Ming-Yu Liu, Ting-Chun Wang, Arun Mohanray Mallya, Tero Tapani Karras, Samuli Matias Laine, David Patrick Luebke, Jaakko Lehtinen, Miika Samuli Aittala, Timo Oskari Aila
Abstract: Apparatuses, systems, and techniques to perform compression of video data using neural networks to facilitate video streaming, such as video conferencing. In at least one embodiment, a sender transmits to a receiver a key frame from video data and one or more keypoints identified by a neural network from said video data, and a receiver reconstructs video data using said key frame and one or more received keypoints.
-
Publication number: US20210049468A1
Publication date: 2021-02-18
Application number: US17069449
Filing date: 2020-10-13
Applicant: NVIDIA Corporation
Inventor: Tero Tapani Karras, Samuli Matias Laine, David Patrick Luebke, Jaakko T. Lehtinen, Miika Samuli Aittala, Timo Oskari Aila, Ming-Yu Liu, Arun Mohanray Mallya, Ting-Chun Wang
Abstract: A latent code defined in an input space is processed by a mapping neural network to produce an intermediate latent code defined in an intermediate latent space. The intermediate latent code may be used as an appearance vector that is processed by a synthesis neural network to generate an image. The appearance vector is a compressed encoding of data, such as video frames including a person's face, audio, and other data. Captured images may be converted into appearance vectors at a local device and transmitted to a remote device using much less bandwidth compared with transmitting the captured images. A synthesis neural network at the remote device reconstructs the images for display.
-
Publication number: US20250111245A1
Publication date: 2025-04-03
Application number: US18898040
Filing date: 2024-09-26
Applicant: NVIDIA Corporation
Inventor: Samuli Matias Laine, Miika Samuli Aittala, Janne Johannes Hellsten, Jaakko T. Lehtinen, Timo Oskari Aila, Tero Tapani Karras
IPC: G06N3/0985
Abstract: Apparatuses, systems, and techniques to compute neural network parameters and to use a neural network to perform inference. In at least one embodiment, neural network parameters are computed, after training, by determining a weighted average of snapshots of averaged parameters that form a basis set of averaged parameter snapshots, each respective snapshot of averaged parameters including a plurality of network parameters averaged by a respective combination of an averaging function and one or more averaging parameters.
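The idea of combining stored snapshots after training can be sketched as follows. This is an illustration under stated assumptions, not the claimed method: it assumes an exponential moving average as the averaging function, with the decay as its averaging parameter, and it assumes the combination weights are supplied by the caller and normalized to sum to one. The names `ema_update` and `combine_snapshots` are hypothetical.

```python
from typing import List, Sequence


def ema_update(average: Sequence[float], params: Sequence[float],
               decay: float) -> List[float]:
    """One exponential-moving-average step (assumed averaging function)."""
    return [decay * a + (1.0 - decay) * p for a, p in zip(average, params)]


def combine_snapshots(snapshots: Sequence[Sequence[float]],
                      weights: Sequence[float]) -> List[float]:
    """Weighted average of parameter snapshots, computed after training."""
    if len(snapshots) != len(weights) or not snapshots:
        raise ValueError("need one weight per snapshot")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    size = len(snapshots[0])
    # Normalizing by the total weight is an assumption of this sketch.
    return [
        sum(w * snap[i] for w, snap in zip(weights, snapshots)) / total
        for i in range(size)
    ]


if __name__ == "__main__":
    basis = [[1.0, 2.0], [3.0, 6.0]]  # two stored snapshots of averaged parameters
    print(combine_snapshots(basis, [1.0, 3.0]))
```

How the weights are chosen is outside this sketch; the abstract states only that a weighted average of the basis set is determined.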
-
Publication number: US20250111233A1
Publication date: 2025-04-03
Application number: US18897232
Filing date: 2024-09-26
Applicant: NVIDIA Corporation
Inventor: Tero Tapani Karras, Miika Samuli Aittala, Janne Johannes Hellsten, Jaakko T. Lehtinen, Timo Oskari Aila, Samuli Matias Laine
IPC: G06N3/084
Abstract: Apparatuses, systems, and techniques to train neural networks. In at least one embodiment, a first normalization of learned parameters of one or more learned layers is performed during a forward pass of a training iteration and a second normalization of the learned parameters is performed during a parameter update phase of the training iteration. In at least one embodiment, the first normalization is performed using first scaling factors and the second normalization is performed using second scaling factors.
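The two normalization points in a training iteration can be illustrated with a small sketch. It assumes L2 normalization of a single weight vector, a plain gradient-descent step, and arbitrary scaling factors chosen by the caller. None of these choices is stated in the abstract, and the function names are hypothetical.

```python
import math
from typing import List, Sequence


def normalize(weights: Sequence[float], scale: float,
              eps: float = 1e-8) -> List[float]:
    """Rescale a weight vector so that its L2 norm is approximately `scale`."""
    norm = math.sqrt(sum(w * w for w in weights))
    return [scale * w / (norm + eps) for w in weights]


def forward(weights: Sequence[float], inputs: Sequence[float],
            forward_scale: float = 1.0) -> float:
    """First normalization: applied on the fly during the forward pass."""
    w_hat = normalize(weights, forward_scale)
    return sum(w * x for w, x in zip(w_hat, inputs))


def update(weights: Sequence[float], grads: Sequence[float],
           lr: float, update_scale: float) -> List[float]:
    """Second normalization: applied to the stored weights after the step."""
    stepped = [w - lr * g for w, g in zip(weights, grads)]
    return normalize(stepped, update_scale)


if __name__ == "__main__":
    w = [3.0, 4.0]
    print(forward(w, [1.0, 1.0]))
    print(update(w, [0.5, -0.5], lr=0.1, update_scale=5.0))
```

In this sketch the forward pass never modifies the stored weights, while the update step writes the renormalized values back, so the two scaling factors can be set independently.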
-
Publication number: US11625613B2
Publication date: 2023-04-11
Application number: US17143516
Filing date: 2021-01-07
Applicant: NVIDIA Corporation
Inventor: Tero Tapani Karras, Samuli Matias Laine, David Patrick Luebke, Jaakko T. Lehtinen, Miika Samuli Aittala, Timo Oskari Aila, Ming-Yu Liu, Arun Mohanray Mallya, Ting-Chun Wang
Abstract: A latent code defined in an input space is processed by a mapping neural network to produce an intermediate latent code defined in an intermediate latent space. The intermediate latent code may be used as an appearance vector that is processed by a synthesis neural network to generate an image. The appearance vector is a compressed encoding of data, such as video frames including a person's face, audio, and other data. Captured images may be converted into appearance vectors at a local device and transmitted to a remote device using much less bandwidth compared with transmitting the captured images. A synthesis neural network at the remote device reconstructs the images for display.