-
Publication number: US20230325685A1
Publication date: 2023-10-12
Application number: US17718607
Application date: 2022-04-12
Applicant: Adobe Inc.
Inventor: Fabian David Caba Heilbron , Santiago Castro Serra
IPC: G06N5/02
CPC classification number: G06N5/022
Abstract: A model training system is described that obtains a training dataset including videos and text labels. The model training system generates a video-text classification model by causing a model having a dual image-text encoder architecture to predict which of the text labels describes each video in the training dataset. Predictions output by the model are compared to the training dataset to determine distillation and contrastive losses, which are used to adjust the internal weights of the model during training. The internal weights of the model are then combined with the internal weights of a trained image-text classification model to generate the video-text classification model. The video-text classification model is configured to generate a video or text output that classifies a video or text input.
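The combination of contrastive and distillation losses, followed by combining weights with a pretrained image-text model, can be made concrete with a short PyTorch sketch. This is an illustration only: the function names, the symmetric InfoNCE form of the contrastive loss, and the interpolation coefficient alpha are assumptions, not details taken from the patent.

```python
# Hypothetical sketch of the training losses and weight combination described
# in the abstract above. All names (student/teacher, alpha, etc.) are
# illustrative, not taken from the patent.
import torch
import torch.nn.functional as F

def contrastive_loss(video_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE loss over matched video/text embedding pairs."""
    video_emb = F.normalize(video_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = video_emb @ text_emb.t() / temperature            # (B, B) similarities
    targets = torch.arange(len(logits), device=logits.device)  # diagonal = positives
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.t(), targets)) / 2

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between softened student and teacher predictions."""
    return F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2

def combine_weights(video_model, image_model, alpha=0.5):
    """Interpolate the fine-tuned video model with the pretrained image-text
    model, parameter by parameter (one plausible reading of 'combined')."""
    merged = {
        name: alpha * p + (1 - alpha) * image_model.state_dict()[name]
        for name, p in video_model.state_dict().items()
    }
    video_model.load_state_dict(merged)
    return video_model
```

During training, the two losses would be summed (possibly with a weighting factor) and backpropagated; the weight interpolation happens once, after training.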
-
Publication number: US20220270370A1
Publication date: 2022-08-25
Application number: US17735156
Application date: 2022-05-03
Applicant: Adobe Inc.
Inventor: Federico Perazzi , Zhe Lin , Ping Hu , Oliver Wang , Fabian David Caba Heilbron
Abstract: A Video Semantic Segmentation System (VSSS) is disclosed that performs accurate and fast semantic segmentation of videos using a set of temporally distributed neural networks. The VSSS receives as input a video signal comprising a contiguous sequence of temporally related video frames. The VSSS extracts features from the video frames in the contiguous sequence and, based upon the extracted features, selects, from a set of labels, a label to be associated with each pixel of each video frame in the video signal. In certain embodiments, a set of multiple neural networks is used to extract the features for video segmentation, and the feature extraction is distributed among the multiple neural networks in the set. A strong feature representation representing the entirety of the features is produced for each video frame in the sequence by aggregating the output features extracted by the multiple neural networks.
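A minimal PyTorch sketch of the temporally distributed idea: each frame in a temporal window is processed by a different lightweight sub-network, and the per-frame outputs are aggregated (here by channel concatenation, one plausible form of aggregation) into the full feature representation used for per-pixel labeling. All module names and sizes are hypothetical.

```python
# Illustrative sketch, not the patented architecture: feature extraction is
# distributed over `num_subnets` shallow sub-networks, one per frame in a
# sliding window, and their outputs are fused for the current frame.
import torch
import torch.nn as nn

class TemporallyDistributedExtractor(nn.Module):
    def __init__(self, num_subnets=4, channels=64, num_classes=21):
        super().__init__()
        # One shallow sub-network per position in the temporal window; together
        # they cover the channels a single large backbone would compute.
        self.subnets = nn.ModuleList(
            nn.Conv2d(3, channels, kernel_size=3, padding=1)
            for _ in range(num_subnets)
        )
        self.classifier = nn.Conv2d(channels * num_subnets, num_classes, 1)

    def forward(self, frames):
        """frames: list of `num_subnets` tensors, each of shape (B, 3, H, W)."""
        # Each sub-network extracts a different slice of the features from a
        # different frame in the window.
        feats = [net(f) for net, f in zip(self.subnets, frames)]
        # Aggregate the sub-features into a strong representation, then predict
        # a label logit for every pixel of the current frame.
        fused = torch.cat(feats, dim=1)
        return self.classifier(fused)
```

The payoff of this design is amortized cost: only one lightweight sub-network runs per new frame, while cached outputs from previous frames complete the representation.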
-
Publication number: US20190325275A1
Publication date: 2019-10-24
Application number: US15957419
Application date: 2018-04-19
Applicant: Adobe Inc.
Inventor: Joon-Young Lee , Hailin Jin , Fabian David Caba Heilbron
Abstract: Various embodiments describe active learning methods for training temporal action localization models used to localize actions in untrimmed videos. A trainable active learning selection function is used to select the unlabeled samples that can improve the temporal action localization model the most. The selected unlabeled samples are then annotated and used to retrain the temporal action localization model. In some embodiments, the trainable active learning selection function includes a trainable performance prediction model that maps a video sample and a temporal action localization model to a predicted performance improvement for the temporal action localization model.
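One way to picture the selection function, sketched in PyTorch under stated assumptions: a performance-prediction model scores each unlabeled video by the improvement it is expected to yield, and the highest-scoring samples are sent for annotation. The names localizer.extract_features, predictor, and budget are hypothetical, not from the patent.

```python
# Hedged sketch of the active-learning selection step described above.
import torch

def select_samples_to_label(localizer, predictor, unlabeled_videos, budget=10):
    """Rank unlabeled videos by predicted performance improvement and return
    the top `budget` candidates to send to human annotators."""
    scores = []
    with torch.no_grad():
        for video in unlabeled_videos:
            feats = localizer.extract_features(video)  # video representation
            gain = predictor(feats)                    # predicted improvement
            scores.append(gain.item())
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [unlabeled_videos[i] for i in ranked[:budget]]
```

The outer loop would alternate this selection step with annotation and retraining of the localization model, which is what lets the predictor target the samples that help the current model most.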