Referring image segmentation
    Invention Grant

    Publication No.: US11657230B2

    Publication Date: 2023-05-23

    Application No.: US16899994

    Filing Date: 2020-06-12

    Applicant: ADOBE INC.

    Abstract: A method, apparatus, and non-transitory computer readable medium for referring image segmentation are described. Embodiments of the method, apparatus, and non-transitory computer readable medium may extract an image feature vector from an input image, extract a plurality of language feature vectors for a referral expression, wherein each of the plurality of language feature vectors comprises a different number of dimensions, combine each of the language feature vectors with the image feature vector using a fusion module to produce a plurality of self-attention vectors, combine the plurality of self-attention vectors to produce a multi-modal feature vector, and decode the multi-modal feature vector to produce an image mask indicating a portion of the input image corresponding to the referral expression.
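
    A minimal PyTorch sketch of the fusion idea this abstract describes, not the patented implementation: the module names, the per-scale language-vector widths, and the choice of multiplicative fusion followed by multi-head self-attention are all illustrative assumptions.

    import torch
    import torch.nn as nn

    class FusionModule(nn.Module):
        """Fuses one language vector with the image feature map, then applies
        self-attention over the fused map (one reading of the abstract)."""
        def __init__(self, lang_dim, img_dim):
            super().__init__()
            self.proj = nn.Linear(lang_dim, img_dim)
            self.attn = nn.MultiheadAttention(img_dim, num_heads=4, batch_first=True)

        def forward(self, lang_vec, img_feat):
            # img_feat: (B, H*W, C); lang_vec: (B, lang_dim)
            fused = img_feat * self.proj(lang_vec).unsqueeze(1)  # broadcast fusion
            out, _ = self.attn(fused, fused, fused)              # self-attention vector
            return out

    class ReferringSegmenter(nn.Module):
        def __init__(self, lang_dims=(256, 512, 1024), img_dim=512):
            super().__init__()
            # One fusion module per language feature vector; each vector has a
            # different number of dimensions, as in the abstract.
            self.fusions = nn.ModuleList(FusionModule(d, img_dim) for d in lang_dims)
            self.mask_head = nn.Conv2d(img_dim, 1, kernel_size=1)

        def forward(self, img_feat, lang_vecs, h, w):
            # Combine the per-scale self-attention outputs into one multi-modal
            # feature (summation is an assumption), then decode an image mask.
            multimodal = sum(f(v, img_feat) for f, v in zip(self.fusions, lang_vecs))
            b, hw, c = multimodal.shape
            grid = multimodal.transpose(1, 2).reshape(b, c, h, w)
            return torch.sigmoid(self.mask_head(grid))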

    Unified referring video object segmentation network

    Publication No.: US11526698B2

    Publication Date: 2022-12-13

    Application No.: US16893803

    Filing Date: 2020-06-05

    Applicant: ADOBE INC.

    Abstract: Systems and methods for video object segmentation are described. Embodiments of systems and methods may receive a referral expression and a video comprising a plurality of image frames, generate a first image mask based on the referral expression and a first image frame of the plurality of image frames, generate a second image mask based on the referral expression, the first image frame, the first image mask, and a second image frame of the plurality of image frames, and generate annotation information for the video including the first image mask overlaid on the first image frame and the second image mask overlaid on the second image frame.
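
    A rough Python sketch of the described inference loop, assuming a hypothetical model object with initial_mask and propagate methods (neither is named in the patent); note how every later mask is conditioned on the first frame and its mask.

    import numpy as np

    def overlay(frame, mask, color=(255, 0, 0), alpha=0.5):
        # Blend a binary mask onto an RGB frame to produce annotation output.
        out = frame.astype(np.float32)
        sel = mask > 0.5
        out[sel] = (1 - alpha) * out[sel] + alpha * np.array(color, dtype=np.float32)
        return out.astype(np.uint8)

    def annotate_video(model, expression, frames):
        # First mask: referral expression + first frame only.
        first_mask = model.initial_mask(expression, frames[0])
        masks = [first_mask]
        # Later masks: expression, first frame, first mask, and current frame.
        for frame in frames[1:]:
            masks.append(model.propagate(expression, frames[0], first_mask, frame))
        return [overlay(f, m) for f, m in zip(frames, masks)]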

    Segmenting Objects In Video Sequences
    Invention Application

    Publication No.: US20200143171A1

    Publication Date: 2020-05-07

    Application No.: US16183560

    Filing Date: 2018-11-07

    Applicant: Adobe Inc.

    Abstract: In implementations of segmenting objects in video sequences, user annotations designate an object in any image frame of a video sequence, without requiring user annotations for all image frames. An interaction network generates a mask for an object in an image frame annotated by a user, and is coupled both internally and externally to a propagation network that propagates the mask to other image frames of the video sequence. Feature maps are aggregated for each round of user annotations and couple the interaction network and the propagation network internally. The interaction network and the propagation network are trained jointly using synthetic annotations in a multi-round training scenario, in which weights of the interaction network and the propagation network are adjusted after multiple synthetic annotations are processed, resulting in a trained object segmentation system that can reliably generate realistic object masks.
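
    A sketch of the multi-round interaction/propagation loop this abstract describes; get_user_annotation, interaction_net, and propagation_net are hypothetical callables, and summing feature maps as the per-round aggregation is an assumption.

    def interactive_segmentation(interaction_net, propagation_net, frames,
                                 get_user_annotation, rounds=3):
        masks = [None] * len(frames)
        feature_memory = None
        for _ in range(rounds):
            # The user may annotate any single frame, not all of them.
            t, annotation = get_user_annotation()
            masks[t], feats = interaction_net(frames[t], annotation, masks[t])
            # Aggregate feature maps for each round of user annotations; this
            # is the internal coupling between the two networks.
            feature_memory = feats if feature_memory is None else feature_memory + feats
            # Propagate the new mask forward and backward through the sequence.
            for i in range(t + 1, len(frames)):
                masks[i] = propagation_net(frames[i], masks[i - 1], feature_memory)
            for i in range(t - 1, -1, -1):
                masks[i] = propagation_net(frames[i], masks[i + 1], feature_memory)
        return masks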

    Generating action tags for digital videos

    Publication No.: US11949964B2

    Publication Date: 2024-04-02

    Application No.: US17470441

    Filing Date: 2021-09-09

    Applicant: Adobe Inc.

    CPC Classification: H04N21/8133; G06N3/08; G06V20/46; H04N21/8456

    Abstract: Systems, methods, and non-transitory computer-readable media are disclosed for automatic tagging of videos. In particular, in one or more embodiments, the disclosed systems generate a set of tagged feature vectors (e.g., tagged feature vectors based on action-rich digital videos) to utilize to generate tags for an input digital video. For instance, the disclosed systems can extract a set of frames for the input digital video and generate feature vectors from the set of frames. In some embodiments, the disclosed systems generate aggregated feature vectors from the feature vectors. Furthermore, the disclosed systems can utilize the feature vectors (or aggregated feature vectors) to identify similar tagged feature vectors from the set of tagged feature vectors. Additionally, the disclosed systems can generate a set of tags for the input digital video by aggregating one or more tags corresponding to identified similar tagged feature vectors.
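
    A minimal numpy sketch of the retrieval step: mean pooling stands in for the unspecified aggregation, cosine similarity finds the similar tagged feature vectors, and tags are aggregated by similarity-weighted voting; all three choices are illustrative assumptions.

    import numpy as np

    def tag_video(frame_features, tagged_vectors, tagged_labels, k=5):
        # Aggregate per-frame feature vectors into one clip-level vector.
        query = frame_features.mean(axis=0)
        query = query / np.linalg.norm(query)
        db = tagged_vectors / np.linalg.norm(tagged_vectors, axis=1, keepdims=True)
        sims = db @ query                       # cosine similarity to each tagged vector
        top = np.argsort(sims)[::-1][:k]        # k most similar tagged feature vectors
        votes = {}
        for i in top:
            for tag in tagged_labels[i]:        # tags of the i-th tagged vector
                votes[tag] = votes.get(tag, 0.0) + sims[i]
        return sorted(votes, key=votes.get, reverse=True)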

    Space-time memory network for locating target object in video content

    Publication No.: US11200424B2

    Publication Date: 2021-12-14

    Application No.: US16293126

    Filing Date: 2019-03-05

    Applicant: Adobe Inc.

    Abstract: Certain aspects involve using a space-time memory network to locate one or more target objects in video content for segmentation or other object classification. In one example, a video editor generates a query key map and a query value map by applying a space-time memory network to features of a query frame from video content. The video editor retrieves a memory key map and a memory value map that are computed, with the space-time memory network, from a set of memory frames from the video content. The video editor computes memory weights by applying a similarity function to the memory key map and the query key map. The video editor classifies content in the query frame as depicting the target feature using a weighted summation that includes the memory weights applied to memory locations in the memory value map.
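
    The read operation described here fits in a few lines of PyTorch; the tensor shapes and the dot-product similarity below are assumptions consistent with the abstract, not the claimed implementation.

    import torch

    def space_time_memory_read(query_key, query_value, memory_key, memory_value):
        # query_key:    (B, Ck, N) with N = H*W query locations
        # memory_key:   (B, Ck, M) with M = T*H*W memory locations
        # memory_value: (B, Cv, M); query_value: (B, Cv, N)
        sim = torch.einsum('bcm,bcn->bmn', memory_key, query_key)   # similarity function
        weights = torch.softmax(sim, dim=1)         # memory weights per memory location
        read = torch.einsum('bcm,bmn->bcn', memory_value, weights)  # weighted summation
        return torch.cat([read, query_value], dim=1)  # fused feature for classification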

    Video object segmentation by reference-guided mask propagation

    Publication No.: US11176381B2

    Publication Date: 2021-11-16

    Application No.: US16856292

    Filing Date: 2020-04-23

    Applicant: Adobe Inc.

    Abstract: Various embodiments describe video object segmentation using a neural network and the training of the neural network. The neural network both detects a target object in the current frame based on a reference frame and a reference mask that define the target object and propagates the segmentation mask of the target object for a previous frame to the current frame to generate a segmentation mask for the current frame. In some embodiments, the neural network is pre-trained using synthetically generated static training images and is then fine-tuned using training videos.
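
    A toy two-stream PyTorch sketch of the detect-and-propagate design: one stream encodes the reference frame with its mask, the other the current frame with the previous frame's mask. The shared four-channel encoder and the layer sizes are illustrative, and the pre-training/fine-tuning schedule is omitted.

    import torch
    import torch.nn as nn

    class RefGuidedPropagation(nn.Module):
        def __init__(self, feat=64):
            super().__init__()
            # Shared (siamese) encoder over 4-channel input: RGB + mask.
            self.encoder = nn.Sequential(
                nn.Conv2d(4, feat, 3, padding=1), nn.ReLU(),
                nn.Conv2d(feat, feat, 3, padding=1), nn.ReLU())
            self.decoder = nn.Conv2d(2 * feat, 1, kernel_size=1)

        def forward(self, ref_frame, ref_mask, cur_frame, prev_mask):
            ref = self.encoder(torch.cat([ref_frame, ref_mask], dim=1))
            cur = self.encoder(torch.cat([cur_frame, prev_mask], dim=1))
            # Detection (reference stream) and propagation (previous-mask
            # stream) are combined to decode the current frame's mask.
            return torch.sigmoid(self.decoder(torch.cat([ref, cur], dim=1)))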

    VIDEO PANOPTIC SEGMENTATION
    Invention Application

    Publication No.: US20210326638A1

    Publication Date: 2021-10-21

    Application No.: US16852647

    Filing Date: 2020-04-20

    Applicant: ADOBE INC.

    Abstract: Systems and methods for panoptic video segmentation are described. A method may include identifying a target frame and a reference frame from a video, generating target features for the target frame and reference features for the reference frame, combining the target features and the reference features to produce fused features for the target frame, generating a feature matrix comprising a correspondence between objects from the reference features and objects from the fused features, and generating panoptic segmentation information for the target frame based on the feature matrix.
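
    A small PyTorch sketch of one plausible reading of the "feature matrix": a cosine-affinity matrix that matches per-object features from the reference frame against objects found on the fused target features, keeping instance IDs consistent; every name and shape here is an illustrative assumption.

    import torch
    import torch.nn.functional as F

    def match_objects(ref_obj_feats, fused_obj_feats):
        # ref_obj_feats:   (R, D) one feature per reference-frame object
        # fused_obj_feats: (T, D) one feature per object on the fused features
        ref = F.normalize(ref_obj_feats, dim=1)
        tgt = F.normalize(fused_obj_feats, dim=1)
        affinity = ref @ tgt.T              # (R, T) correspondence matrix
        return affinity.argmax(dim=0)       # reference ID for each target object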
