-
Publication Number: US10922793B2
Publication Date: 2021-02-16
Application Number: US16353195
Application Date: 2019-03-14
Applicant: NVIDIA Corporation
Inventor: Seung-Hwan Baek , Kihwan Kim , Jinwei Gu , Orazio Gallo , Alejandro Jose Troccoli , Ming-Yu Liu , Jan Kautz
Abstract: Missing image content is generated using a neural network. In an embodiment, a high-resolution image and an associated high-resolution semantic label map are generated from a low-resolution image and an associated low-resolution semantic label map. The input image/map pair (the low-resolution image and its associated semantic label map) lacks detail and is therefore missing content. Rather than simply enhancing the input image/map pair, a neural network improvises, or hallucinates, the data missing from the pair, creating plausible content while maintaining spatio-temporal consistency. Missing content is hallucinated to generate a detailed zoomed-in portion of an image, or to generate different variations of an image, such as different seasons or weather conditions for a driving video.
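For intuition, here is a minimal PyTorch sketch of the idea of jointly upscaling an image and its semantic label map while synthesizing new detail. The 4x upscale factor, the layer widths, and the name HallucinationNet are illustrative assumptions; the patent does not disclose this architecture.

```python
# Minimal sketch: upsample a low-resolution image plus its one-hot semantic
# label map and emit a higher-resolution image and label map. All sizes are
# illustrative assumptions, not the patent's disclosed architecture.
import torch
import torch.nn as nn

class HallucinationNet(nn.Module):
    def __init__(self, num_classes: int = 19, width: int = 64):
        super().__init__()
        in_ch = 3 + num_classes  # image and label map concatenated channel-wise
        self.encode = nn.Sequential(
            nn.Conv2d(in_ch, width, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(width, width, 3, padding=1), nn.ReLU(inplace=True),
        )
        # Two PixelShuffle stages give a 4x spatial upscale.
        self.upsample = nn.Sequential(
            nn.Conv2d(width, width * 4, 3, padding=1), nn.PixelShuffle(2),
            nn.ReLU(inplace=True),
            nn.Conv2d(width, width * 4, 3, padding=1), nn.PixelShuffle(2),
            nn.ReLU(inplace=True),
        )
        # Separate heads emit the hallucinated image and the upscaled label map.
        self.to_image = nn.Conv2d(width, 3, 3, padding=1)
        self.to_labels = nn.Conv2d(width, num_classes, 3, padding=1)

    def forward(self, image, labels_onehot):
        x = torch.cat([image, labels_onehot], dim=1)
        x = self.upsample(self.encode(x))
        return torch.tanh(self.to_image(x)), self.to_labels(x)

lr_image = torch.randn(1, 3, 64, 64)      # low-resolution RGB input
lr_labels = torch.randn(1, 19, 64, 64)    # low-resolution semantic label map
hr_image, hr_labels = HallucinationNet()(lr_image, lr_labels)
print(hr_image.shape)                     # torch.Size([1, 3, 256, 256])
```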
-
Publication Number: US10762620B2
Publication Date: 2020-09-01
Application Number: US16200192
Application Date: 2018-11-26
Applicant: NVIDIA Corporation
Inventor: Orazio Gallo , Jinwei Gu , Jan Kautz , Patrick Wieschollek
Abstract: When a computer image is generated from a real-world scene having a semi-reflective surface (e.g., a window), the image will contain, at the semi-reflective surface from the viewpoint of the camera, both a reflection of the scene in front of the semi-reflective surface and a transmission of the scene located behind it. Just as for a person viewing the real-world scene from different locations and angles, the reflection and transmission may change, and move relative to each other, as the viewpoint of the camera changes. The dynamic nature of the reflection and transmission degrades the performance of many computer applications, and performance can generally be improved if the reflection and transmission are separated. The present disclosure uses deep learning to separate reflection and transmission at a semi-reflective surface of a computer image generated from a real-world scene.
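A minimal sketch of the two-layer decomposition idea follows: one network predicts both a transmission image and a reflection image whose composition should reproduce the input. The name SeparationNet, the plain convolutional stack, and the additive re-composition loss are simplifying assumptions; the patent's actual model (including any use of multiple viewpoints) is not reproduced here.

```python
# Minimal sketch: split an image seen through glass into a transmission
# layer and a reflection layer. Architecture is an assumed placeholder.
import torch
import torch.nn as nn

class SeparationNet(nn.Module):
    def __init__(self, width: int = 64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, width, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(width, width, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(width, 6, 3, padding=1),  # 3 channels per output layer
        )

    def forward(self, image):
        out = torch.sigmoid(self.body(image))  # keep both layers in [0, 1]
        transmission, reflection = out[:, :3], out[:, 3:]
        return transmission, reflection

net = SeparationNet()
mixed = torch.rand(1, 3, 128, 128)        # image captured through a window
transmission, reflection = net(mixed)
# One possible training signal: the two layers should re-compose the input.
recon_loss = torch.mean((transmission + reflection - mixed) ** 2)
```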
-
Publication Number: US20190213439A1
Publication Date: 2019-07-11
Application Number: US16353835
Application Date: 2019-03-14
Applicant: NVIDIA Corporation
Inventor: Sifei Liu , Shalini De Mello , Jinwei Gu , Varun Jampani , Jan Kautz
CPC classification number: G06K9/6215 , G06K9/00744 , G06K9/6256 , G06N3/04 , G06N3/08 , G06N3/084 , G06T5/009 , G06T5/50 , G06T7/10 , G06T7/90 , G06T2207/10016 , G06T2207/20081 , G06T2207/20084 , G06T2207/20208
Abstract: A temporal propagation network (TPN) system learns the affinity matrix for video image processing tasks. An affinity matrix is a generic matrix that defines the similarity of two points in space. The TPN system includes a guidance neural network model and a temporal propagation module, and is trained for a particular computer vision task to propagate visual properties from a key-frame represented by dense data (color) to another frame represented by coarse data (grey-scale). The guidance neural network model generates an affinity matrix, referred to as a global transformation matrix, from task-specific data for the key-frame and the other frame. The temporal propagation module applies the global transformation matrix to the key-frame property data to produce propagated property data (color) for the other frame. For example, the TPN system may be used to colorize several frames of grey-scale video using a single manually colorized key-frame.
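The sketch below illustrates the propagation step at its simplest: a guidance network predicts, from two grey-scale frames, a row-normalized transformation that maps key-frame colors onto a target frame. Working at a coarse grid of points and predicting a full dense matrix are simplifying assumptions made here for readability; the TPN learns a structured affinity rather than the dense matrix shown.

```python
# Minimal sketch of guidance-driven temporal propagation on a coarse grid.
import torch
import torch.nn as nn

class GuidanceNet(nn.Module):
    """Predicts an n x n transformation matrix from two grey-scale frames."""
    def __init__(self, n: int):
        super().__init__()
        self.n = n
        self.net = nn.Sequential(
            nn.Linear(2 * n, 256), nn.ReLU(inplace=True),
            nn.Linear(256, n * n),
        )

    def forward(self, key_gray, target_gray):
        g = self.net(torch.cat([key_gray, target_gray], dim=1))
        # Row-normalize so each target point is a convex mix of key points.
        return torch.softmax(g.view(-1, self.n, self.n), dim=-1)

n = 16 * 16                               # coarse 16x16 grid of points
guidance = GuidanceNet(n)
key_gray = torch.rand(1, n)               # grey-scale key-frame (flattened)
target_gray = torch.rand(1, n)            # grey-scale target frame
key_color = torch.rand(1, n, 3)           # colors known on the key-frame
G = guidance(key_gray, target_gray)       # (1, n, n) transformation matrix
target_color = torch.bmm(G, key_color)    # colors propagated to the target
```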
-
Publication Number: US20190180469A1
Publication Date: 2019-06-13
Application Number: US15836549
Application Date: 2017-12-08
Applicant: NVIDIA Corporation
Inventor: Jinwei Gu , Xiaodong Yang , Shalini De Mello , Jan Kautz
CPC classification number: G06T7/73 , G06N3/08 , G06T3/4046 , G06T13/40 , G06T2207/10016 , G06T2207/20081 , G06T2207/20084 , G06T2207/30201 , G06T2207/30204
Abstract: A method, computer readable medium, and system are disclosed for dynamic facial analysis. The method includes the steps of receiving video data representing a sequence of image frames including at least one head, and extracting, by a neural network, spatial features comprising pitch, yaw, and roll angles of the at least one head from the video data. The method also includes the step of processing, by a recurrent neural network, the spatial features for two or more image frames in the sequence to produce head pose estimates for the at least one head.
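A minimal sketch of the per-frame CNN plus recurrent aggregation described in this abstract: a small convolutional extractor runs on each frame, and a GRU fuses features across time to emit pitch/yaw/roll per frame. The layer sizes and the choice of a GRU are assumptions for illustration, not details from the patent.

```python
# Minimal sketch: per-frame spatial features + temporal RNN for head pose.
import torch
import torch.nn as nn

class HeadPoseRNN(nn.Module):
    def __init__(self, feat_dim: int = 128, hidden: int = 128):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim),
        )
        self.rnn = nn.GRU(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 3)      # pitch, yaw, roll

    def forward(self, frames):                # frames: (B, T, 3, H, W)
        b, t = frames.shape[:2]
        feats = self.cnn(frames.flatten(0, 1))  # per-frame spatial features
        feats = feats.view(b, t, -1)
        hidden, _ = self.rnn(feats)           # temporal fusion over frames
        return self.head(hidden)              # (B, T, 3) pose per frame

video = torch.rand(2, 8, 3, 64, 64)           # batch of two 8-frame clips
poses = HeadPoseRNN()(video)                  # torch.Size([2, 8, 3])
```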
-
Publication Number: US20190108651A1
Publication Date: 2019-04-11
Application Number: US16137064
Application Date: 2018-09-20
Applicant: NVIDIA Corporation
Inventor: Jinwei Gu , Samarth Manoj Brahmbhatt , Kihwan Kim , Jan Kautz
Abstract: A deep neural network (DNN) system learns a map representation for estimating a camera position and orientation (pose). The DNN is trained to learn a map representation corresponding to the environment, defining positions and attributes of structures, trees, walls, vehicles, etc. The DNN system learns a map representation that is versatile and performs well across many different environments (indoor, outdoor, natural, synthetic, etc.). The DNN system receives images of an environment captured by a camera (observations) and outputs an estimated camera pose within the environment. The estimated camera pose is used to perform camera localization, i.e., to recover the three-dimensional (3D) position and orientation of a moving camera, which is a fundamental task in computer vision with a wide variety of applications in robot navigation, car localization for autonomous driving, device localization for mobile navigation, and augmented/virtual reality.
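A minimal sketch of learned camera localization in this spirit: a CNN backbone regresses a 6-DoF pose (3D translation plus a unit quaternion for orientation) directly from an image. The ResNet-18 backbone and the quaternion parameterization are common choices assumed here for illustration, not details taken from the patent.

```python
# Minimal sketch: regress camera translation and rotation from an image.
import torch
import torch.nn as nn
from torchvision import models

class PoseRegressor(nn.Module):
    def __init__(self):
        super().__init__()
        backbone = models.resnet18(weights=None)
        backbone.fc = nn.Identity()           # keep the 512-D global feature
        self.backbone = backbone
        self.to_translation = nn.Linear(512, 3)
        self.to_rotation = nn.Linear(512, 4)

    def forward(self, image):
        feats = self.backbone(image)
        t = self.to_translation(feats)
        # Normalize so the rotation output is a valid unit quaternion.
        q = nn.functional.normalize(self.to_rotation(feats), dim=-1)
        return t, q

observation = torch.rand(1, 3, 224, 224)      # camera image of the scene
translation, quaternion = PoseRegressor()(observation)
```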