-
Publication No.: US11941899B2
Publication date: 2024-03-26
Application No.: US17331451
Filing date: 2021-05-26
Applicant: NVIDIA Corporation
Inventor: Jonathan Tremblay , Fabio Tozeto Ramos , Yuke Zhu , Anima Anandkumar , Guanya Shi
IPC: B25J9/16 , B25J13/08 , B25J19/02 , G05B13/02 , G06F18/214 , G06K9/00 , G06N3/04 , G06N3/045 , G06T7/73 , G06V10/75 , G06V20/64
CPC classification number: G06V20/653 , G06F18/2148 , G06N3/045 , G06V10/751
Abstract: Apparatuses, systems, and techniques generate poses of an object based on image data of the object obtained from a first viewpoint of the object and a second viewpoint of the object. The poses can be evaluated to determine a portion of the image data usable by an estimator to generate a pose of the object.
-
Publication No.: US20240062534A1
Publication date: 2024-02-22
Application No.: US17893038
Filing date: 2022-08-22
Applicant: NVIDIA Corporation
Inventor: Xiaojian Ma , Weili Nie , Zhiding Yu , Huaizu Jiang , Chaowei Xiao , Yuke Zhu , Anima Anandkumar
CPC classification number: G06V10/82 , G06V10/255 , G06V10/94
Abstract: A vision transformer (ViT) is a deep learning model that performs one or more vision processing tasks. ViTs may be modified to include a global task that clusters images with the same concept together to produce semantically consistent relational representations, as well as a local task that guides the ViT to discover object-centric semantic correspondence across images. A database of concepts and associated features may be created and used to train the global and local tasks, which may then enable the ViT to perform visual relational reasoning faster, without supervision, and outside of a synthetic domain.
-
Publication No.: US11931909B2
Publication date: 2024-03-19
Application No.: US17331466
Filing date: 2021-05-26
Applicant: NVIDIA Corporation
Inventor: Jonathan Tremblay , Fabio Tozeto Ramos , Yuke Zhu , Anima Anandkumar , Guanya Shi
IPC: B25J9/16 , B25J13/08 , B25J19/02 , G05B13/02 , G06F18/214 , G06K9/00 , G06N3/04 , G06N3/045 , G06T7/73 , G06V10/75 , G06V20/20 , G06V20/64
CPC classification number: B25J9/1697 , B25J9/161 , B25J9/1612 , B25J13/08 , B25J19/023 , G05B13/027 , G06F18/2148 , G06N3/045 , G06T7/73 , G06V10/751 , G06V20/20 , G06V20/653 , G06T2207/20081 , G06T2207/20084
Abstract: Apparatuses, systems, and techniques generate poses of an object based on data of the object observed from a first viewpoint and a second viewpoint. The poses can be evaluated to determine a portion of the data usable by an estimator to generate a pose of the object.
-
Publication No.: US20240078423A1
Publication date: 2024-03-07
Application No.: US17893026
Filing date: 2022-08-22
Applicant: NVIDIA Corporation
Inventor: Xiaojian Ma , Weili Nie , Zhiding Yu , Huaizu Jiang , Chaowei Xiao , Yuke Zhu , Anima Anandkumar
Abstract: A vision transformer (ViT) is a deep learning model that performs one or more vision processing tasks. ViTs may be modified to include a global task that clusters images with the same concept together to produce semantically consistent relational representations, as well as a local task that guides the ViT to discover object-centric semantic correspondence across images. A database of concepts and associated features may be created and used to train the global and local tasks, which may then enable the ViT to perform visual relational reasoning faster, without supervision, and outside of a synthetic domain.
-
Publication No.: US20250073901A1
Publication date: 2025-03-06
Application No.: US18239601
Filing date: 2023-08-29
Applicant: NVIDIA Corporation
Inventor: Ajay Uday Mandlekar , Soroush Nasiriany , Bowen Wen , Iretiayo Akinola , Yashraj Shyam Narang , Linxi Fan , Yuke Zhu , Dieter Fox
Abstract: Apparatuses, systems, and techniques to generate data to train a robotic device to perform tasks. In at least one embodiment, one or more first videos of a robotic device performing a task are used to generate one or more second videos of the robotic device performing the task differently than depicted in the one or more first videos.
-
Publication No.: US20230290057A1
Publication date: 2023-09-14
Application No.: US17691723
Filing date: 2022-03-10
Applicant: NVIDIA Corporation
Inventor: Yuke Zhu , Bokui Shen , Christopher Bongsoo Choy , Animashree Anandkumar
CPC classification number: G06T17/10 , G06N20/20 , G06T19/20 , G06T2219/2021
Abstract: One or more machine learning models (MLMs) may learn implicit 3D representations of geometry of an object and of dynamics of the object from performing an action on the object. Implicit neural representations may be used to reconstruct high-fidelity full geometry of the object and predict a flow-based dynamics field from one or more images, which may provide a partial view of the object. Correspondences between locations of an object may be learned based at least on distances between the locations on a surface corresponding to the object, such as geodesic distances. The distances may be incorporated into a contrastive learning loss function to train one or more MLMs to learn correspondences between locations of the object, such as a correspondence embedding field. The correspondences may be used to evaluate state changes when evaluating one or more actions that may be performed on the object.
-
Publication No.: US20230280726A1
Publication date: 2023-09-07
Application No.: US17684245
Filing date: 2022-03-01
Applicant: NVIDIA Corporation
Inventor: Yuke Zhu , Anima Anandkumar , Youngwoon Lee
IPC: G05B19/418
CPC classification number: G05B19/41865 , G05B19/41885 , G05B19/41895
Abstract: A manipulation task may include operations performed by one or more manipulation entities on one or more objects. This manipulation task may be broken down into a plurality of sequential sub-tasks (policies). These policies may be fine-tuned so that a terminal state distribution of a given policy matches an initial state distribution of another policy that immediately follows the given policy within the plurality of policies. The fine-tuned plurality of policies may then be chained together and implemented within a manipulation environment.
-
Publication No.: US20220379484A1
Publication date: 2022-12-01
Application No.: US17331466
Filing date: 2021-05-26
Applicant: NVIDIA Corporation
Inventor: Jonathan Tremblay , Fabio Tozeto Ramos , Yuke Zhu , Anima Anandkumar , Guanya Shi
Abstract: Apparatuses, systems, and techniques generate poses of an object based on data of the object observed from a first viewpoint and a second viewpoint. The poses can be evaluated to determine a portion of the data usable by an estimator to generate a pose of the object.
-
Publication No.: US12165258B2
Publication date: 2024-12-10
Application No.: US17691723
Filing date: 2022-03-10
Applicant: NVIDIA Corporation
Inventor: Yuke Zhu , Bokui Shen , Christopher Bongsoo Choy , Animashree Anandkumar
Abstract: One or more machine learning models (MLMs) may learn implicit 3D representations of geometry of an object and of dynamics of the object from performing an action on the object. Implicit neural representations may be used to reconstruct high-fidelity full geometry of the object and predict a flow-based dynamics field from one or more images, which may provide a partial view of the object. Correspondences between locations of an object may be learned based at least on distances between the locations on a surface corresponding to the object, such as geodesic distances. The distances may be incorporated into a contrastive learning loss function to train one or more MLMs to learn correspondences between locations of the object, such as a correspondence embedding field. The correspondences may be used to evaluate state changes when evaluating one or more actions that may be performed on the object.
-
Publication No.: US20220383019A1
Publication date: 2022-12-01
Application No.: US17331451
Filing date: 2021-05-26
Applicant: NVIDIA Corporation
Inventor: Jonathan Tremblay , Fabio Tozeto Ramos , Yuke Zhu , Anima Anandkumar , Guanya Shi
Abstract: Apparatuses, systems, and techniques generate poses of an object based on image data of the object obtained from a first viewpoint of the object and a second viewpoint of the object. The poses can be evaluated to determine a portion of the image data usable by an estimator to generate a pose of the object.