CAPTIONING USING GENERATIVE ARTIFICIAL INTELLIGENCE

    Publication No.: US20250139161A1

    Publication Date: 2025-05-01

    Application No.: US18431134

    Filing Date: 2024-02-02

    Applicant: ADOBE INC.

    Abstract: Embodiments of the present invention provide systems, methods, and computer storage media for cutting down a user's larger input video into an edited video comprising the most important video segments and applying corresponding video effects. Some embodiments of the present invention are directed to adding captioning video effects to the trimmed video (e.g., applying face-aware and non-face-aware captioning to emphasize extracted video segment headings, important sentences, quotes, words of interest, extracted lists, etc.). For example, a prompt is provided to a generative language model to identify portions of a transcript (e.g., extracted scene summaries, important sentences, lists of items discussed in the video, etc.) to apply to corresponding video segments as captions depending on the type of caption (e.g., an extracted heading may be captioned at the start of a corresponding video segment, important sentences and/or extracted list items may be captioned when they are spoken).
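
The placement rule in the abstract above (headings shown at the start of their video segment, important sentences and list items shown when they are spoken) can be illustrated with a minimal sketch. The `Caption` class, its field names, and the `caption_time` function are hypothetical names chosen for illustration and are not taken from the patent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Caption:
    # All field names here are illustrative assumptions.
    kind: str                          # "heading", "sentence", or "list_item"
    text: str
    segment_start: float               # start time (s) of the owning video segment
    spoken_at: Optional[float] = None  # time (s) the text is spoken, if known


def caption_time(caption: Caption) -> float:
    """Return the time at which the caption should appear."""
    if caption.kind == "heading":
        # Headings are shown at the start of the corresponding segment.
        return caption.segment_start
    if caption.spoken_at is None:
        raise ValueError(f"{caption.kind!r} caption needs a spoken_at time")
    # Sentences and list items are shown when they are spoken.
    return caption.spoken_at


heading = Caption("heading", "Getting started", segment_start=10.0, spoken_at=12.5)
item = Caption("list_item", "Install the tools", segment_start=10.0, spoken_at=14.2)
print(caption_time(heading), caption_time(item))
```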

    ZOOM AND SCROLL BAR FOR A VIDEO TIMELINE

    Publication No.: US20230043769A1

    Publication Date: 2023-02-09

    Application No.: US17969536

    Filing Date: 2022-10-19

    Applicant: Adobe Inc.

    Abstract: Embodiments are directed to techniques for interacting with a hierarchical video segmentation using a video timeline. In some embodiments, the finest level of a hierarchical segmentation identifies the smallest interaction unit of a video—semantically defined video segments of unequal duration called clip atoms, and higher levels cluster the clip atoms into coarser sets of video segments. A presented video timeline is segmented based on one of the levels, and one or more segments are selected through interactions with the video timeline. For example, a click or tap on a video segment or a drag operation dragging along the timeline snaps selection boundaries to corresponding segment boundaries defined by the level. Navigating to a different level of the hierarchy transforms the selection into coarser or finer video segments defined by the level. Any operation can be performed on selected video segments, including playing back, trimming, or editing.
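
The snapping behavior described in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: each hierarchy level is represented as a sorted list of boundary times, and a selection is expanded outward to the enclosing boundaries. The function name `snap_selection` and the outward-expansion policy are assumptions, not details confirmed by the abstract.

```python
from bisect import bisect_left, bisect_right


def snap_selection(boundaries, start, end):
    """Snap a click or drag range to segment boundaries of one level.

    `boundaries` is a sorted list of at least two boundary times (s),
    including the video start and end. A click is a range with start == end.
    """
    lo, hi = min(start, end), max(start, end)
    last = len(boundaries) - 1
    # Index of the last boundary at or before lo, kept inside the video.
    i = min(max(bisect_right(boundaries, lo) - 1, 0), last - 1)
    # Index of the first boundary at or after hi.
    j = min(bisect_left(boundaries, hi), last)
    if j <= i:
        # Click landing exactly on a boundary: select the segment that follows.
        j = i + 1
    return boundaries[i], boundaries[j]


fine = [0.0, 1.5, 4.0, 6.5, 9.0, 12.0]   # finest level (clip atoms)
coarse = [0.0, 4.0, 9.0, 12.0]            # a coarser level

drag = snap_selection(fine, 2.0, 5.0)     # (1.5, 6.5)
click = snap_selection(fine, 5.0, 5.0)    # (4.0, 6.5)
# Moving to the coarser level re-snaps the existing selection.
coarser = snap_selection(coarse, *drag)   # (0.0, 9.0)
print(drag, click, coarser)
```

Re-running the same function with another level's boundary list is one simple way to model how a selection turns into coarser or finer segments when the user navigates the hierarchy.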

    ANNOTATED TRANSCRIPT TEXT AND TRANSCRIPT THUMBNAIL BARS FOR TEXT-BASED VIDEO EDITING

    Publication No.: US20240127858A1

    Publication Date: 2024-04-18

    Application No.: US17967608

    Filing Date: 2022-10-17

    Applicant: Adobe Inc.

    CPC classification number: G11B27/031 G10L15/24 G10L15/26

    Abstract: Embodiments of the present invention provide systems, methods, and computer storage media for annotating transcript text with video metadata, and including thumbnail bars in the transcript to help users select a desired portion of a video through transcript interactions. In an example embodiment, a video editing interface includes a transcript interface that presents a transcript with transcript text that is annotated to indicate corresponding portions of the video where various features were detected (e.g., annotating via text stylization of transcript text and/or labeling the transcript text with a textual representation of a corresponding detected feature class). In some embodiments, the transcript interface displays a visual representation of detected non-speech audio or pauses (e.g., a sound bar) and/or video thumbnails corresponding to each line of transcript text (e.g., a thumbnail bar). Transcript text, soundbars, and/or thumbnail bars are selectable to identify and perform video editing operations on a corresponding video segment.
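
One way to picture the labeling of transcript text with detected feature classes, as described in the abstract above, is a time-overlap check between each transcript line and each detected feature. The sketch below is hypothetical: the `annotate_line` function, the `(name, start, end)` feature tuples, and the bracketed label format are illustrative assumptions rather than the patented interface.

```python
def annotate_line(text, line_start, line_end, features):
    """Append labels of detected features that overlap a transcript line.

    `features` is a list of (name, start, end) tuples with times in seconds.
    """
    labels = [
        name
        for name, start, end in features
        # Two time ranges overlap when each starts before the other ends.
        if start < line_end and end > line_start
    ]
    if not labels:
        return text
    return f"{text} [{', '.join(labels)}]"


detected = [("music", 0.0, 3.0), ("laughter", 5.0, 6.0)]
print(annotate_line("Welcome back.", 2.0, 4.0, detected))
print(annotate_line("Let's begin.", 3.0, 5.0, detected))
```

Because each label is tied to a time range, a selection made on annotated transcript text can be mapped back to the corresponding video segment.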
