-
Publication number: US20230154139A1
Publication date: 2023-05-18
Application number: US17589709
Filing date: 2022-01-31
Applicant: salesforce.com, inc.
Inventor: Brian Chen, Ramprasaath Ramasamy Selvaraju, Juan Carlos Niebles Duque, Nikhil Naik
CPC classification number: G06V10/454, G06V10/462, G06V10/62
Abstract: Embodiments described herein provide an intelligent method to select instances, by utilizing unsupervised tracking for videos. Using this freely available form of supervision, a temporal constraint is adopted for selecting instances that ensures that different instances contain the same object while sampling the temporal augmentation from the video. In addition, using the information on the spatial extent of the tracked object, spatial constraints are applied to ensure that sampled instances overlap meaningfully with the tracked object. Taken together, these spatiotemporal constraints result in better supervisory signal for contrastive learning from videos.
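Illustrative note (not part of the patent record): the two constraints in the abstract above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the claimed method. The track format (a dict mapping frame index to a bounding box), the use of intersection-over-union as the measure of "meaningful" overlap, the 0.3 threshold, and all function names are hypothetical choices made for illustration.

```python
import random

def iou(a, b):
    """Intersection-over-union of two (x0, y0, x1, y1) boxes."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def sample_crop(box, frame_w, frame_h, crop_w, crop_h, min_iou, rng, max_tries=1000):
    """Spatial constraint (assumed form): rejection-sample a crop whose
    IoU with the tracked box is at least min_iou."""
    for _ in range(max_tries):
        x0 = rng.randint(0, frame_w - crop_w)
        y0 = rng.randint(0, frame_h - crop_h)
        crop = (x0, y0, x0 + crop_w, y0 + crop_h)
        if iou(crop, box) >= min_iou:
            return crop
    return None  # no acceptable crop found within max_tries

def sample_positive_pair(track, frame_w, frame_h, crop_w, crop_h, min_iou=0.3, seed=0):
    """Temporal constraint (assumed form): both frames are drawn from the
    same object track, so both views contain the same tracked object.

    track: hypothetical dict {frame_index: (x0, y0, x1, y1)} for one object.
    """
    rng = random.Random(seed)
    frames = sorted(track)
    if len(frames) < 2:
        return None
    t1, t2 = rng.sample(frames, 2)
    c1 = sample_crop(track[t1], frame_w, frame_h, crop_w, crop_h, min_iou, rng)
    c2 = sample_crop(track[t2], frame_w, frame_h, crop_w, crop_h, min_iou, rng)
    if c1 is None or c2 is None:
        return None
    return (t1, c1), (t2, c2)

# Usage with a made-up three-frame track in a 320x240 video.
track = {0: (40, 40, 140, 140), 5: (60, 50, 160, 150), 9: (80, 60, 180, 160)}
pair = sample_positive_pair(track, 320, 240, 112, 112)
```

The returned pair would serve as two positive views for a contrastive loss; how the tracks themselves are obtained without labels is outside this sketch.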
-
Publication number: US12106541B2
Publication date: 2024-10-01
Application number: US17589709
Filing date: 2022-01-31
Applicant: Salesforce.com, Inc.
Inventor: Brian Chen, Ramprasaath Ramasamy Selvaraju, Juan Carlos Niebles Duque, Nikhil Naik
CPC classification number: G06V10/454, G06V10/462, G06V10/62
Abstract: Embodiments described herein provide an intelligent method to select instances, by utilizing unsupervised tracking for videos. Using this freely available form of supervision, a temporal constraint is adopted for selecting instances that ensures that different instances contain the same object while sampling the temporal augmentation from the video. In addition, using the information on the spatial extent of the tracked object, spatial constraints are applied to ensure that sampled instances overlap meaningfully with the tracked object. Taken together, these spatiotemporal constraints result in better supervisory signal for contrastive learning from videos.
-
Publication number: US20240161464A1
Publication date: 2024-05-16
Application number: US18159189
Filing date: 2023-01-25
Applicant: Salesforce.com, Inc.
Inventor: Roberto Martin-Martin, Silvio Savarese, Honglu Zhou, Juan Carlos Niebles Duque
IPC: G06V10/774, G06F40/40, G06V10/776, G06V10/82, G06V20/40, G06V20/70
CPC classification number: G06V10/774, G06F40/40, G06V10/776, G06V10/82, G06V20/41, G06V20/70
Abstract: Embodiments described herein provide systems and methods for training video models to perform a task from an input instructional video. A procedure knowledge graph (PKG) may be generated with nodes representing procedure steps, and edges representing relationships between the steps. The PKG may be generated based on text and/or video training data which includes procedures (e.g., instructional videos). Using the PKG, a video model may be trained using the PKG to provide supervisory training signals for a number of tasks. Once the model is trained, it may be fine-tuned for a specific task which benefits from the model being trained in a way that makes the model embed procedural information when encoding videos.
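Illustrative note (not part of the patent record): the graph structure named in the abstract above, with nodes for procedure steps and edges for relationships between them, can be sketched roughly as follows. This is a minimal sketch under stated assumptions. It assumes each procedure is already available as an ordered list of step labels, treats "relationship" as an observed step-to-step transition, and derives a next-step distribution as one possible supervisory signal. The function names and the example recipes are hypothetical and do not come from the patent.

```python
from collections import defaultdict

def build_pkg(procedures):
    """Build a simple procedure knowledge graph.

    procedures: list of ordered step-label lists (assumed input format).
    Returns (nodes, edges), where edges maps (src, dst) to how often
    step dst directly followed step src across all procedures.
    """
    nodes = set()
    edges = defaultdict(int)
    for steps in procedures:
        nodes.update(steps)
        for src, dst in zip(steps, steps[1:]):
            edges[(src, dst)] += 1
    return nodes, dict(edges)

def next_step_targets(edges, step):
    """Normalized frequencies of the steps that follow `step`.

    One assumed form of supervisory signal: a target distribution a
    video model could be trained to predict for a clip showing `step`.
    """
    out = {dst: n for (src, dst), n in edges.items() if src == step}
    total = sum(out.values())
    return {dst: n / total for dst, n in out.items()} if total else {}

# Usage with two made-up procedures for the same task.
procedures = [
    ["boil water", "add pasta", "drain"],
    ["boil water", "add salt", "add pasta", "drain"],
]
nodes, edges = build_pkg(procedures)
targets = next_step_targets(edges, "boil water")
```

Merging steps from different videos into shared nodes is done here by exact label match; matching steps that are phrased differently would need a separate similarity step that this sketch does not attempt.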
-
-