System for automatic video reframing

    Publication No.: US11184558B1

    Publication Date: 2021-11-23

    Application No.: US16900435

    Filing Date: 2020-06-12

    Applicant: ADOBE INC.

    Abstract: Systems and methods provide reframing operations in a smart editing system that may generate a focal point within a mask of an object for each frame of a video segment and perform editing effects on the frames of the video segment to quickly provide users with natural video editing effects. A reframing engine may process video clips using a segmentation and hotspot module to determine a salient region of an object, generate a mask of the object, and track the object's trajectory in the video clips. The reframing engine may then receive reframing parameters from a crop suggestion module and a user interface. Based on the determined trajectory of an object in a video clip and the reframing parameters, the reframing engine may use reframing logic to produce temporally consistent reframing effects relative to an object for the video clip.
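The core of temporally consistent reframing can be sketched as trajectory smoothing plus per-frame crop placement. The following is a minimal illustration, not the patented implementation: `smooth_trajectory` and `crop_boxes` are hypothetical helpers that assume object centers have already been extracted from per-frame masks.

```python
import numpy as np

def smooth_trajectory(centers, window=5):
    """Temporally smooth per-frame object centers with a moving average,
    so the crop window does not jitter frame to frame."""
    centers = np.asarray(centers, dtype=float)
    kernel = np.ones(window) / window
    pad = window // 2
    padded = np.pad(centers, ((pad, pad), (0, 0)), mode="edge")
    return np.stack(
        [np.convolve(padded[:, d], kernel, mode="valid") for d in range(2)],
        axis=1,
    )

def crop_boxes(centers, frame_w, frame_h, target_aspect):
    """Per-frame crop rectangles of the target aspect ratio, centered on
    the (smoothed) object trajectory and clamped to the frame bounds."""
    if target_aspect <= frame_w / frame_h:
        crop_h, crop_w = frame_h, int(round(frame_h * target_aspect))
    else:
        crop_w, crop_h = frame_w, int(round(frame_w / target_aspect))
    boxes = []
    for cx, cy in centers:
        x0 = int(np.clip(cx - crop_w / 2, 0, frame_w - crop_w))
        y0 = int(np.clip(cy - crop_h / 2, 0, frame_h - crop_h))
        boxes.append((x0, y0, crop_w, crop_h))
    return boxes
```

Smoothing before cropping is what makes the result temporally consistent: each crop follows the object's overall trajectory rather than its per-frame noise.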

    Enhanced semantic segmentation of images

    Publication No.: US11127139B2

    Publication Date: 2021-09-21

    Application No.: US16574513

    Filing Date: 2019-09-18

    Applicant: ADOBE INC.

    Abstract: Enhanced methods and systems for the semantic segmentation of images are described. A refined segmentation mask for a specified object visually depicted in a source image is generated based on a coarse and/or raw segmentation mask. The refined segmentation mask is generated via a refinement process applied to the coarse segmentation mask. The refinement process corrects at least a portion of both type I and type II errors associated with the coarse segmentation mask and refines the boundaries of the specified object. Thus, the refined segmentation mask provides a more accurate segmentation of the object than the coarse segmentation mask. A segmentation refinement model is employed to generate the refined segmentation mask based on the coarse segmentation mask. That is, the segmentation model is employed to refine the coarse segmentation mask to generate more accurate segmentations of the object. The refinement process is an iterative refinement process carried out via a trained neural network.
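The iterative structure of the refinement can be shown with a stub in place of the trained network. In this sketch `refine_step` is a hypothetical stand-in for the refinement model; only the loop (repeatedly feeding the current mask back through the model) reflects the process described above.

```python
import numpy as np

def refine_mask(image, coarse_mask, refine_step, iterations=3):
    """Iteratively refine a coarse segmentation mask.

    `refine_step` stands in for the trained refinement network: it maps
    (image, current mask) -> corrected mask, reducing false positives
    (type I) and false negatives (type II) near the object boundary.
    """
    mask = coarse_mask.astype(float)
    for _ in range(iterations):
        mask = np.clip(refine_step(image, mask), 0.0, 1.0)
    return mask
```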

    EDGE-GUIDED RANKING LOSS FOR MONOCULAR DEPTH PREDICTION

    Publication No.: US20210256717A1

    Publication Date: 2021-08-19

    Application No.: US16790056

    Filing Date: 2020-02-13

    Applicant: Adobe Inc.

    Abstract: In order to provide monocular depth prediction, a trained neural network may be used. To train the neural network, edge detection on a digital image may be performed to determine at least one edge of the digital image, and then a first point and a second point of the digital image may be sampled, based on the at least one edge. A relative depth between the first point and the second point may be predicted, and the neural network may be trained to perform monocular depth prediction using a loss function that compares the predicted relative depth with a ground truth relative depth between the first point and the second point.
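A loss of this shape is commonly written as a pairwise ranking term for ordered point pairs and a squared term for pairs of equal depth. The sketch below is a generic version of such a loss, not the patent's exact formulation; the edge-based pair sampling is assumed to have already produced the two points and their ground-truth ordinal label.

```python
import numpy as np

def ranking_loss(d1, d2, label):
    """Pairwise ranking loss on predicted depths at two sampled points.

    label = +1 if point 1 is farther than point 2 in the ground truth,
            -1 if it is closer, and 0 if the two points are (nearly)
            equal in depth. Pairs would be sampled around detected
    image edges, where depth ordering is most informative."""
    d1, d2, label = (np.asarray(a, dtype=float) for a in (d1, d2, label))
    diff = d1 - d2
    ordinal = np.log1p(np.exp(-label * diff))  # pairs with a depth ordering
    equal = diff ** 2                          # pairs labelled equal-depth
    return np.where(label == 0, equal, ordinal).mean()
```

The ordinal term penalizes predictions whose ordering disagrees with the ground truth, while the squared term pulls equal-depth pairs together.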

    LEARNING COPY SPACE USING REGRESSION AND SEGMENTATION NEURAL NETWORKS

    Publication No.: US20210216824A1

    Publication Date: 2021-07-15

    Application No.: US17215067

    Filing Date: 2021-03-29

    Applicant: Adobe Inc.

    Abstract: Techniques are disclosed for characterizing and defining the location of a copy space in an image. A methodology implementing the techniques according to an embodiment includes applying a regression convolutional neural network (CNN) to an image. The regression CNN is configured to predict properties of the copy space such as size and type (natural or manufactured). The prediction is conditioned on a determination of the presence of the copy space in the image. The method further includes applying a segmentation CNN to the image. The segmentation CNN is configured to generate one or more pixel-level masks to define the location of copy spaces in the image, whether natural or manufactured, or to define the location of a background region of the image. The segmentation CNN may include a first stage comprising convolutional layers and a second stage comprising pairs of boundary refinement layers and bilinear up-sampling layers.
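The conditioning logic of the two-branch pipeline can be sketched with stub models standing in for the CNNs. All function names here are hypothetical; only the control flow (regression output gated on a presence decision, segmentation always producing masks) mirrors the description.

```python
def characterize_copy_space(image, presence_model, regression_model,
                            segmentation_model, presence_threshold=0.5):
    """Two-branch copy-space analysis with stub models.

    The regression branch predicts copy-space properties (size, type)
    only when a copy space is judged present; the segmentation branch
    always returns pixel-level masks locating natural/manufactured
    copy space and background."""
    result = {"masks": segmentation_model(image)}
    p = presence_model(image)
    result["presence"] = p
    if p >= presence_threshold:
        size, space_type = regression_model(image)
        result["size"] = size
        result["type"] = space_type  # "natural" or "manufactured"
    return result
```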

    PRESERVING REGIONS OF INTEREST IN AUTOMATIC IMAGE CROPPING

    Publication No.: US20210110589A1

    Publication Date: 2021-04-15

    Application No.: US17083899

    Filing Date: 2020-10-29

    Applicant: Adobe Inc.

    Abstract: Embodiments of the present invention are directed to facilitating region of interest preservation. In accordance with some embodiments of the present invention, a region of interest preservation score using adaptive margins is determined. The region of interest preservation score indicates an extent to which at least one region of interest is preserved in a candidate image crop associated with an image. A region of interest positioning score is determined that indicates an extent to which a position of the at least one region of interest is preserved in the candidate image crop associated with the image. The region of interest preservation score and/or the positioning score are used to select a set of one or more candidate image crops as image crop suggestions.
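One plausible reading of the two scores can be sketched with axis-aligned boxes. This is an illustrative interpretation, not the patented scoring: the margin expansion, the area ratio, and the relative-position comparison are assumptions.

```python
def intersect_area(a, b):
    """Overlap area of two boxes given as (x0, y0, x1, y1)."""
    w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    return w * h

def preservation_score(roi, crop, margin=0.05):
    """Fraction of the margin-expanded ROI retained inside the crop.
    The margin adapts to the ROI's own size."""
    x0, y0, x1, y1 = roi
    mx, my = margin * (x1 - x0), margin * (y1 - y0)
    expanded = (x0 - mx, y0 - my, x1 + mx, y1 + my)
    area = (expanded[2] - expanded[0]) * (expanded[3] - expanded[1])
    return intersect_area(expanded, crop) / area

def positioning_score(roi, crop, image_w, image_h):
    """How closely the ROI centre's relative position inside the crop
    matches its relative position inside the full image."""
    cx, cy = (roi[0] + roi[2]) / 2, (roi[1] + roi[3]) / 2
    rel_img = (cx / image_w, cy / image_h)
    rel_crop = ((cx - crop[0]) / (crop[2] - crop[0]),
                (cy - crop[1]) / (crop[3] - crop[1]))
    dist = ((rel_img[0] - rel_crop[0]) ** 2
            + (rel_img[1] - rel_crop[1]) ** 2) ** 0.5
    return max(0.0, 1.0 - dist)
```

A crop that fully contains the expanded ROI and keeps it in the same relative position scores 1.0 on both measures.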

    UTILIZING A NEURAL NETWORK HAVING A TWO-STREAM ENCODER ARCHITECTURE TO GENERATE COMPOSITE DIGITAL IMAGES

    Publication No.: US20210027470A1

    Publication Date: 2021-01-28

    Application No.: US16523465

    Filing Date: 2019-07-26

    Applicant: Adobe Inc.

    Abstract: The present disclosure relates to utilizing a neural network having a two-stream encoder architecture to accurately generate composite digital images that realistically portray a foreground object from one digital image against a scene from another digital image. For example, the disclosed systems can utilize a foreground encoder of the neural network to identify features from a foreground image and further utilize a background encoder to identify features from a background image. The disclosed systems can then utilize a decoder to fuse the features together and generate a composite digital image. The disclosed systems can train the neural network utilizing an easy-to-hard data augmentation scheme implemented via self-teaching. The disclosed systems can further incorporate the neural network within an end-to-end framework for automation of the image composition process.
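The two-stream-plus-decoder dataflow can be shown with toy linear "encoders" in place of the real convolutional streams. Everything here is a simplification for shape-level illustration; only the wiring (separate foreground and background streams, feature fusion, shared decoder) follows the description.

```python
import numpy as np

rng = np.random.default_rng(0)

def encoder(x, weights):
    """Toy 'encoder': a linear map followed by ReLU."""
    return np.maximum(0.0, x @ weights)

def decoder(fused, weights):
    """Toy 'decoder': maps fused features back to image space."""
    return fused @ weights

def composite(foreground, background, w_fg, w_bg, w_dec):
    """Two-stream compositing sketch: encode each input with its own
    stream, fuse the features by concatenation, decode a composite."""
    f_fg = encoder(foreground, w_fg)
    f_bg = encoder(background, w_bg)
    fused = np.concatenate([f_fg, f_bg], axis=-1)
    return decoder(fused, w_dec)
```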

    Visually Guided Machine-learning Language Model

    Publication No.: US20200380403A1

    Publication Date: 2020-12-03

    Application No.: US16426298

    Filing Date: 2019-05-30

    Applicant: Adobe Inc.

    Abstract: Visually guided machine-learning language model and embedding techniques are described that overcome the challenges of conventional techniques in a variety of ways. In one example, a model is trained to support a visually guided machine-learning embedding space that supports visual intuition as to “what” is represented by text. The visually guided language embedding space supported by the model, once trained, may then be used to support visual intuition as part of a variety of functionality. In one such example, the visually guided language embedding space as implemented by the model may be leveraged as part of a multi-modal differential search to support search of digital images and other digital content with real-time focus adaptation which overcomes the challenges of conventional techniques.
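Differential search in a shared embedding space is often implemented as embedding arithmetic followed by cosine ranking. The sketch below assumes precomputed embeddings for the query text, a positive concept, a negative concept, and the candidate images; the `weight` parameter standing in for real-time focus adaptation is an assumption.

```python
import numpy as np

def normalize(v):
    """L2-normalize along the last axis."""
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

def differential_search(query_emb, positive_emb, negative_emb,
                        image_embs, weight=0.5):
    """Multi-modal differential search sketch: shift a text query
    embedding toward a positive concept and away from a negative one,
    then rank images by cosine similarity in the shared space."""
    q = normalize(query_emb + weight * (positive_emb - negative_emb))
    sims = normalize(image_embs) @ q
    return np.argsort(-sims)  # indices of images, best match first
```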

    HIERARCHICAL SCALE MATCHING AND PATCH ESTIMATION FOR IMAGE STYLE TRANSFER WITH ARBITRARY RESOLUTION

    Publication No.: US20200349688A1

    Publication Date: 2020-11-05

    Application No.: US16930736

    Filing Date: 2020-07-16

    Applicant: Adobe Inc.

    Abstract: A style of a digital image is transferred to another digital image of arbitrary resolution. A high-resolution (HR) content image is segmented into several low-resolution (LR) patches. The resolution of a style image is matched to have the same resolution as the LR content image patches. Style transfer is then performed on a patch-by-patch basis using, for example, a pair of feature transforms—whitening and coloring. The patch-by-patch style transfer process is then repeated at several increasing resolutions, or scale levels, of both the content and style images. The results of the style transfer at each scale level are incorporated into successive scale levels up to and including the original HR scale. As a result, style transfer can be performed with images having arbitrary resolutions to produce visually pleasing results with good spatial consistency.
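The whitening and coloring pair mentioned above is a standard feature transform: whitening removes the content features' covariance, coloring imposes the style features' covariance. A minimal numpy version on (channels, pixels) feature matrices, without the hierarchical patch machinery:

```python
import numpy as np

def whiten_color(content_feats, style_feats, eps=1e-5):
    """Whitening-coloring transform (WCT) on (channels, pixels) features:
    whiten the content features to identity covariance, then color them
    with the style features' covariance and mean."""
    def center(f):
        mean = f.mean(axis=1, keepdims=True)
        return f - mean, mean

    fc, _ = center(content_feats)
    fs, ms = center(style_feats)

    def cov_power(f, power):
        # Symmetric matrix power of the (regularized) covariance.
        c = f @ f.T / (f.shape[1] - 1) + eps * np.eye(f.shape[0])
        vals, vecs = np.linalg.eigh(c)
        return vecs @ np.diag(vals ** power) @ vecs.T

    whitened = cov_power(fc, -0.5) @ fc  # ~identity covariance
    colored = cov_power(fs, 0.5) @ whitened  # ~style covariance
    return colored + ms
```

After the transform, the output features' second-order statistics match those of the style features, which is what carries the style at each scale level.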
