-
Publication No.: US20200311855A1
Publication date: 2020-10-01
Application No.: US16902097
Filing date: 2020-06-15
Applicant: NVIDIA Corporation
Abstract: Pose estimation generally refers to a computer vision technique that determines the pose of some object, usually with respect to a particular camera. Pose estimation has many applications, but is particularly useful in the context of robotic manipulation systems. To date, robotic manipulation systems have required a camera to be installed on the robot itself (i.e., a camera-in-hand) for capturing images of the object and/or a camera external to the robot for capturing images of the object. Unfortunately, the camera-in-hand has a limited field of view for capturing objects, whereas the external camera, which may have a greater field of view, requires costly calibration each time the camera is even slightly moved. Similar issues apply when estimating the pose of any object with respect to another object (which may or may not be moving). The present disclosure avoids these issues and provides object-to-object pose estimation from a single image.
-
Publication No.: US11417063B2
Publication date: 2022-08-16
Application No.: US17181946
Filing date: 2021-02-22
Applicant: NVIDIA Corporation
Abstract: One or more images (e.g., images taken from one or more cameras) may be received, where each of the one or more images may depict a two-dimensional (2D) view of a three-dimensional (3D) scene. Additionally, the one or more images may be utilized to determine a three-dimensional (3D) representation of a scene. This representation may help an entity navigate an environment represented by the 3D scene.
-
Publication No.: US20220068024A1
Publication date: 2022-03-03
Application No.: US17181946
Filing date: 2021-02-22
Applicant: NVIDIA Corporation
Abstract: One or more images (e.g., images taken from one or more cameras) may be received, where each of the one or more images may depict a two-dimensional (2D) view of a three-dimensional (3D) scene. Additionally, the one or more images may be utilized to determine a three-dimensional (3D) representation of a scene. This representation may help an entity navigate an environment represented by the 3D scene.
-
Publication No.: US10783394B2
Publication date: 2020-09-22
Application No.: US16006728
Filing date: 2018-06-12
Applicant: NVIDIA Corporation
Inventors: Pavlo Molchanov, Stephen Walter Tyree, Jan Kautz, Sina Honari
Abstract: A method, computer readable medium, and system are disclosed to generate coordinates of landmarks within images. The landmark locations may be identified on an image of a human face and used for emotion recognition, face identity verification, eye gaze tracking, pose estimation, etc. A transform is applied to input image data to produce transformed input image data. The transform is also applied to predicted coordinates for landmarks of the input image data to produce transformed predicted coordinates. A neural network model processes the transformed input image data to generate additional landmarks of the transformed input image data and additional predicted coordinates for each one of the additional landmarks. Parameters of the neural network model are updated to reduce differences between the transformed predicted coordinates and the additional predicted coordinates.
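The consistency objective described in this abstract can be illustrated with a small sketch. It is not the patented implementation: the choice of a horizontal flip as the transform, the squared-distance loss, and the names `hflip_image`, `hflip_points`, `equivariance_loss`, `brightest_pixel`, and `biased` are all assumptions made for illustration, and toy predictors stand in for the neural network model.

```python
def hflip_image(image):
    """Horizontally flip an image given as a list of pixel rows."""
    return [list(reversed(row)) for row in image]


def hflip_points(points, width):
    """Apply the same horizontal flip to (x, y) landmark coordinates."""
    return [(width - 1 - x, y) for x, y in points]


def equivariance_loss(predict, image):
    """Mean squared distance between transformed predictions and
    predictions made on the transformed image (assumed loss form)."""
    width = len(image[0])
    transformed_pred = hflip_points(predict(image), width)
    pred_on_transformed = predict(hflip_image(image))
    squared = [
        (ax - bx) ** 2 + (ay - by) ** 2
        for (ax, ay), (bx, by) in zip(transformed_pred, pred_on_transformed)
    ]
    return sum(squared) / len(squared)


def brightest_pixel(image):
    """Toy landmark predictor: location of the single brightest pixel."""
    value, x, y = max(
        (v, x, y) for y, row in enumerate(image) for x, v in enumerate(row)
    )
    return [(x, y)]


def biased(image):
    """Toy predictor with a constant +1 error in x, so it is not flip-consistent."""
    return [(x + 1, y) for x, y in brightest_pixel(image)]


image = [
    [0, 0, 0, 0],
    [0, 5, 0, 0],
]
loss_consistent = equivariance_loss(brightest_pixel, image)
loss_biased = equivariance_loss(biased, image)
```

In a training loop, a loss of this kind would be the quantity reduced when the model parameters are updated; a predictor that is already consistent under the transform yields a loss of zero.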
-
Publication No.: US20180365532A1
Publication date: 2018-12-20
Application No.: US16006709
Filing date: 2018-06-12
Applicant: NVIDIA Corporation
Inventors: Pavlo Molchanov, Stephen Walter Tyree, Jan Kautz, Sina Honari
Abstract: A method, computer readable medium, and system are disclosed for sequential multi-tasking to generate coordinates of landmarks within images. The landmark locations may be identified on an image of a human face and used for emotion recognition, face identity verification, eye gaze tracking, pose estimation, etc. A neural network model processes input image data to generate pixel-level likelihood estimates for landmarks in the input image data and a soft-argmax function computes predicted coordinates of each landmark based on the pixel-level likelihood estimates.
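As a rough illustration of the soft-argmax step named in this abstract, the sketch below turns a per-pixel score map for one landmark into an expected (x, y) coordinate. It is a generic formulation, not the patent's implementation; the function name `soft_argmax_2d`, the `temperature` parameter, and the example `heatmap` are assumptions.

```python
import math


def soft_argmax_2d(scores, temperature=1.0):
    """Expected (x, y) coordinate under a softmax over a 2D score map."""
    flat = [s / temperature for row in scores for s in row]
    peak = max(flat)  # subtract the maximum for numerical stability
    weights = [math.exp(s - peak) for s in flat]
    total = sum(weights)
    width = len(scores[0])
    # Row-major layout: index i maps to column i % width and row i // width.
    x = sum(w * (i % width) for i, w in enumerate(weights)) / total
    y = sum(w * (i // width) for i, w in enumerate(weights)) / total
    return x, y


# Hypothetical score map with a strong peak at column 2, row 1.
heatmap = [
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 9.0],
    [0.0, 0.0, 0.0],
]
x, y = soft_argmax_2d(heatmap)
```

Unlike a hard argmax, this weighted average is differentiable with respect to the scores, which is why it can sit inside a network trained end to end. A lower `temperature` pulls the result closer to the hard argmax location.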
-
Publication No.: US20180365512A1
Publication date: 2018-12-20
Application No.: US16006728
Filing date: 2018-06-12
Applicant: NVIDIA Corporation
Inventors: Pavlo Molchanov, Stephen Walter Tyree, Jan Kautz, Sina Honari
Abstract: A method, computer readable medium, and system are disclosed to generate coordinates of landmarks within images. The landmark locations may be identified on an image of a human face and used for emotion recognition, face identity verification, eye gaze tracking, pose estimation, etc. A transform is applied to input image data to produce transformed input image data. The transform is also applied to predicted coordinates for landmarks of the input image data to produce transformed predicted coordinates. A neural network model processes the transformed input image data to generate additional landmarks of the transformed input image data and additional predicted coordinates for each one of the additional landmarks. Parameters of the neural network model are updated to reduce differences between the transformed predicted coordinates and the additional predicted coordinates.
-
Publication No.: US20220277472A1
Publication date: 2022-09-01
Application No.: US17470979
Filing date: 2021-09-09
Applicant: NVIDIA Corporation
Abstract: Apparatuses, systems, and techniques to determine a pose and relative dimensions of an object from an image. In at least one embodiment, a pose and relative dimensions of an object are determined from an image based at least in part on, for example, features of the image.
-
Publication No.: US11315018B2
Publication date: 2022-04-26
Application No.: US15786406
Filing date: 2017-10-17
Applicant: NVIDIA Corporation
Abstract: A method, computer readable medium, and system are disclosed for neural network pruning. The method includes the steps of receiving first-order gradients of a cost function relative to layer parameters for a trained neural network and computing a pruning criterion for each layer parameter based on the first-order gradient corresponding to the layer parameter, where the pruning criterion indicates an importance of each neuron that is included in the trained neural network and is associated with the layer parameter. The method includes the additional steps of identifying at least one neuron having a lowest importance and removing the at least one neuron from the trained neural network to produce a pruned neural network.
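The pruning steps listed in this abstract (score each neuron from a first-order gradient, find the least important, remove it) can be sketched as follows. The abstract does not give the exact criterion, so the sketch assumes a first-order Taylor-style score, the absolute value of parameter times gradient, with one scalar parameter per neuron. The names `prune_least_important`, `params`, and `grads` and all numeric values are hypothetical.

```python
def prune_least_important(params, grads, num_to_remove=1):
    """Remove the neurons with the lowest gradient-based importance.

    params and grads map a neuron name to its scalar parameter and to the
    first-order gradient of the cost with respect to that parameter.
    """
    # Assumed criterion: |parameter * gradient| approximates the change in
    # cost if the parameter were set to zero.
    importance = {name: abs(params[name] * grads[name]) for name in params}
    ranked = sorted(importance, key=importance.get)  # least important first
    removed = ranked[:num_to_remove]
    kept = {name: value for name, value in params.items() if name not in removed}
    return kept, removed


# Hypothetical per-neuron parameters and gradients.
params = {"n0": 0.5, "n1": -2.0, "n2": 1.0}
grads = {"n0": 0.1, "n1": 0.3, "n2": -0.01}
kept, removed = prune_least_important(params, grads)
```

Here `n2` has the smallest score (0.01) and is the neuron removed. In practice such a step is typically alternated with fine-tuning, though the abstract itself does not state that.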
-
Publication No.: US20170206405A1
Publication date: 2017-07-20
Application No.: US15402128
Filing date: 2017-01-09
Applicant: NVIDIA Corporation
Inventors: Pavlo Molchanov, Xiaodong Yang, Shalini De Mello, Kihwan Kim, Stephen Walter Tyree, Jan Kautz
CPC classification: G06K9/00355, G06K9/00201, G06K9/00765, G06K9/4628, G06K9/4652, G06K9/6251, G06K9/6256, G06K9/627, G06K9/6277, G06N3/0445, G06N3/0454, G06N3/084, Y04S10/54
Abstract: A method, computer readable medium, and system are disclosed for detecting and classifying hand gestures. The method includes the steps of receiving an unsegmented stream of data associated with a hand gesture, extracting spatio-temporal features from the unsegmented stream by a three-dimensional convolutional neural network (3DCNN), and producing a class label for the hand gesture based on the spatio-temporal features.
-
Publication No.: US20240005547A1
Publication date: 2024-01-04
Application No.: US17750785
Filing date: 2022-05-23
Applicant: NVIDIA Corporation
CPC classification: G06T7/70, G06T7/277, G06T2207/20081, G06T2207/10016, G06T2207/20084, G06T2207/10024
Abstract: Apparatuses, systems, and techniques to determine a pose of an object from a plurality of images. In at least one embodiment, the pose of an object is determined from at least two images of a video sequence using one or more neural networks, in which the neural network produces a distribution of pose information that is filtered to determine the current pose.