-
Publication No.: US20200311855A1
Publication date: 2020-10-01
Application No.: US16902097
Filing date: 2020-06-15
Applicant: NVIDIA Corporation
Inventor: Jonathan Tremblay, Stephen Walter Tyree, Stanley Thomas Birchfield
Abstract: Pose estimation generally refers to a computer vision technique that determines the pose of some object, usually with respect to a particular camera. Pose estimation has many applications, but is particularly useful in the context of robotic manipulation systems. To date, robotic manipulation systems have required a camera to be installed on the robot itself (i.e. a camera-in-hand) for capturing images of the object and/or a camera external to the robot for capturing images of the object. Unfortunately, the camera-in-hand has a limited field of view for capturing objects, whereas the external camera, which may have a greater field of view, requires costly calibration each time the camera is even slightly moved. Similar issues apply when estimating the pose of any object with respect to another object (i.e. which may be moving or not). The present disclosure avoids these issues and provides object-to-object pose estimation from a single image.
-
Publication No.: US11417063B2
Publication date: 2022-08-16
Application No.: US17181946
Filing date: 2021-02-22
Applicant: NVIDIA Corporation
Inventor: Yunzhi Lin, Jonathan Tremblay, Stephen Walter Tyree, Stanley Thomas Birchfield
Abstract: One or more images (e.g., images taken from one or more cameras) may be received, where each of the one or more images may depict a two-dimensional (2D) view of a three-dimensional (3D) scene. Additionally, the one or more images may be utilized to determine a three-dimensional (3D) representation of a scene. This representation may help an entity navigate an environment represented by the 3D scene.
-
Publication No.: US20220068024A1
Publication date: 2022-03-03
Application No.: US17181946
Filing date: 2021-02-22
Applicant: NVIDIA Corporation
Inventor: Yunzhi Lin, Jonathan Tremblay, Stephen Walter Tyree, Stanley Thomas Birchfield
Abstract: One or more images (e.g., images taken from one or more cameras) may be received, where each of the one or more images may depict a two-dimensional (2D) view of a three-dimensional (3D) scene. Additionally, the one or more images may be utilized to determine a three-dimensional (3D) representation of a scene. This representation may help an entity navigate an environment represented by the 3D scene.
-
Publication No.: US10783394B2
Publication date: 2020-09-22
Application No.: US16006728
Filing date: 2018-06-12
Applicant: NVIDIA Corporation
Inventor: Pavlo Molchanov, Stephen Walter Tyree, Jan Kautz, Sina Honari
Abstract: A method, computer readable medium, and system are disclosed to generate coordinates of landmarks within images. The landmark locations may be identified on an image of a human face and used for emotion recognition, face identity verification, eye gaze tracking, pose estimation, etc. A transform is applied to input image data to produce transformed input image data. The transform is also applied to predicted coordinates for landmarks of the input image data to produce transformed predicted coordinates. A neural network model processes the transformed input image data to generate additional landmarks of the transformed input image data and additional predicted coordinates for each one of the additional landmarks. Parameters of the neural network model are updated to reduce differences between the transformed predicted coordinates and the additional predicted coordinates.
-
Publication No.: US20180365532A1
Publication date: 2018-12-20
Application No.: US16006709
Filing date: 2018-06-12
Applicant: NVIDIA Corporation
Inventor: Pavlo Molchanov, Stephen Walter Tyree, Jan Kautz, Sina Honari
Abstract: A method, computer readable medium, and system are disclosed for sequential multi-tasking to generate coordinates of landmarks within images. The landmark locations may be identified on an image of a human face and used for emotion recognition, face identity verification, eye gaze tracking, pose estimation, etc. A neural network model processes input image data to generate pixel-level likelihood estimates for landmarks in the input image data and a soft-argmax function computes predicted coordinates of each landmark based on the pixel-level likelihood estimates.
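The soft-argmax step named in this abstract can be illustrated with a minimal sketch. It is not taken from the patent: the function name `soft_argmax_2d`, the temperature parameter `beta`, and the plain nested-list score map are all assumptions made for illustration. The idea is to turn per-pixel scores into softmax weights and return the weighted mean of the pixel coordinates, which gives a differentiable estimate of the landmark position.

```python
import math

def soft_argmax_2d(scores, beta=1.0):
    """Expected (x, y) coordinate under a softmax over a 2D score map.

    `scores` is a list of rows; `beta` (hypothetical parameter) sharpens
    the softmax so the result approaches the hard argmax as it grows.
    """
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(max(row) for row in scores)
    weights = [[math.exp(beta * (s - peak)) for s in row] for row in scores]
    total = sum(sum(row) for row in weights)
    # Weighted mean of column indices (x) and row indices (y).
    x = sum(w * col for row in weights for col, w in enumerate(row)) / total
    y = sum(w * r for r, row in enumerate(weights) for w in row) / total
    return x, y

# A score map peaked at the centre pixel of a 3x3 grid.
centre = soft_argmax_2d([[0, 0, 0], [0, 5, 0], [0, 0, 0]])
# A sharp softmax over a map peaked at column 2, row 0.
corner = soft_argmax_2d([[0, 0, 10], [0, 0, 0]], beta=50.0)
```

Because the output is a weighted average rather than an index lookup, gradients can flow from a coordinate loss back to the score map.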
-
Publication No.: US20180365512A1
Publication date: 2018-12-20
Application No.: US16006728
Filing date: 2018-06-12
Applicant: NVIDIA Corporation
Inventor: Pavlo Molchanov, Stephen Walter Tyree, Jan Kautz, Sina Honari
Abstract: A method, computer readable medium, and system are disclosed to generate coordinates of landmarks within images. The landmark locations may be identified on an image of a human face and used for emotion recognition, face identity verification, eye gaze tracking, pose estimation, etc. A transform is applied to input image data to produce transformed input image data. The transform is also applied to predicted coordinates for landmarks of the input image data to produce transformed predicted coordinates. A neural network model processes the transformed input image data to generate additional landmarks of the transformed input image data and additional predicted coordinates for each one of the additional landmarks. Parameters of the neural network model are updated to reduce differences between the transformed predicted coordinates and the additional predicted coordinates.
-
Publication No.: US12175703B2
Publication date: 2024-12-24
Application No.: US17470979
Filing date: 2021-09-09
Applicant: NVIDIA Corporation
Inventor: Stanley Thomas Birchfield, Jonathan Tremblay, Yunzhi Lin, Stephen Walter Tyree
Abstract: Apparatuses, systems, and techniques to determine a pose and relative dimensions of an object from an image. In at least one embodiment, a pose and relative dimensions of an object are determined from an image based at least in part on, for example, features of the image.
-
Publication No.: US20220277472A1
Publication date: 2022-09-01
Application No.: US17470979
Filing date: 2021-09-09
Applicant: NVIDIA Corporation
Inventor: Stanley Thomas Birchfield, Jonathan Tremblay, Yunzhi Lin, Stephen Walter Tyree
Abstract: Apparatuses, systems, and techniques to determine a pose and relative dimensions of an object from an image. In at least one embodiment, a pose and relative dimensions of an object are determined from an image based at least in part on, for example, features of the image.
-
Publication No.: US11315018B2
Publication date: 2022-04-26
Application No.: US15786406
Filing date: 2017-10-17
Applicant: NVIDIA Corporation
Inventor: Pavlo Molchanov, Stephen Walter Tyree, Tero Tapani Karras, Timo Oskari Aila, Jan Kautz
Abstract: A method, computer readable medium, and system are disclosed for neural network pruning. The method includes the steps of receiving first-order gradients of a cost function relative to layer parameters for a trained neural network and computing a pruning criterion for each layer parameter based on the first-order gradient corresponding to the layer parameter, where the pruning criterion indicates an importance of each neuron that is included in the trained neural network and is associated with the layer parameter. The method includes the additional steps of identifying at least one neuron having a lowest importance and removing the at least one neuron from the trained neural network to produce a pruned neural network.
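As a rough illustration of a gradient-based pruning criterion, the sketch below scores each neuron by the absolute value of its activation times the first-order gradient of the cost, averaged over samples, and then drops the lowest-scoring neuron. This particular formula, the function names `taylor_importance` and `prune_lowest`, and the list-of-lists data layout are assumptions for illustration; the abstract only states that the criterion is based on first-order gradients.

```python
def taylor_importance(activations, gradients):
    """Score each neuron by |mean over samples of activation * gradient|.

    Both arguments are lists of per-sample rows, one value per neuron.
    The scoring formula is an assumed first-order criterion.
    """
    n_samples = len(activations)
    n_neurons = len(activations[0])
    scores = []
    for j in range(n_neurons):
        mean = sum(activations[i][j] * gradients[i][j]
                   for i in range(n_samples)) / n_samples
        scores.append(abs(mean))
    return scores

def prune_lowest(scores, k=1):
    """Return indices of the neurons kept after dropping the k least important."""
    drop = set(sorted(range(len(scores)), key=lambda j: scores[j])[:k])
    return [j for j in range(len(scores)) if j not in drop]

# Two samples, three neurons (made-up values for the example).
acts = [[1.0, 2.0, 0.5], [1.0, -2.0, 0.5]]
grads = [[0.1, 0.3, -0.4], [0.1, 0.3, -0.4]]
scores = taylor_importance(acts, grads)
kept = prune_lowest(scores, k=1)
```

In this toy input the second neuron's contributions cancel across the two samples, so it receives the lowest score and is the one removed.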
-
Publication No.: US20170206405A1
Publication date: 2017-07-20
Application No.: US15402128
Filing date: 2017-01-09
Applicant: NVIDIA Corporation
Inventor: Pavlo Molchanov, Xiaodong Yang, Shalini De Mello, Kihwan Kim, Stephen Walter Tyree, Jan Kautz
CPC classification number: G06K9/00355, G06K9/00201, G06K9/00765, G06K9/4628, G06K9/4652, G06K9/6251, G06K9/6256, G06K9/627, G06K9/6277, G06N3/0445, G06N3/0454, G06N3/084, Y04S10/54
Abstract: A method, computer readable medium, and system are disclosed for detecting and classifying hand gestures. The method includes the steps of receiving an unsegmented stream of data associated with a hand gesture, extracting spatio-temporal features from the unsegmented stream by a three-dimensional convolutional neural network (3DCNN), and producing a class label for the hand gesture based on the spatio-temporal features.