Speaker thumbnail selection and speaker visualization in diarized transcripts for text-based video

    公开(公告)号:US12300272B2

    公开(公告)日:2025-05-13

    申请号:US17967697

    申请日:2022-10-17

    Applicant: Adobe Inc.

    Abstract: Embodiments of the present invention provide systems, methods, and computer storage media for selection of the best image of a particular speaker's face in a video, and visualization in a diarized transcript. In an example embodiment, candidate images of a face of a detected speaker are extracted from frames of a video identified by a detected face track for the face, and a representative image of the detected speaker's face is selected from the candidate images based on image quality, facial emotion (e.g., using an emotion classifier that generates a happiness score), a size factor (e.g., favoring larger images), and/or penalizing images that appear towards the beginning or end of a face track. As such, each segment of the transcript is presented with the representative image of the speaker who spoke that segment and/or input is accepted changing the representative image associated with each speaker.

    Selecting and performing operations on hierarchical clusters of video segments

    公开(公告)号:US11631434B2

    公开(公告)日:2023-04-18

    申请号:US17017362

    申请日:2020-09-10

    Applicant: ADOBE INC.

    Abstract: Embodiments are directed to techniques for interacting with a hierarchical video segmentation. In some embodiments, the finest level of the hierarchical segmentation identifies the smallest interaction unit of a video—semantically defined video segments of unequal duration called clip atoms. Each level of the hierarchical segmentation clusters the clip atoms with a corresponding degree of granularity into a corresponding set of video segments. A presented video timeline is segmented based on one of the levels, and one or more segments are selected through interactions with the video timeline (e.g., clicks, drags), by performing a metadata search, or through selection of corresponding metadata segments from a metadata panel. Navigating to a different level of the hierarchy transforms the selection into corresponding coarser or finer video segments defined by the level. Any operation can be performed on selected video segments, including playing back, trimming, or editing.

    SELECTING AND PERFORMING OPERATIONS ON HIERARCHICAL CLUSTERS OF VIDEO SEGMENTS

    公开(公告)号:US20220076705A1

    公开(公告)日:2022-03-10

    申请号:US17017362

    申请日:2020-09-10

    Applicant: ADOBE INC.

    Abstract: Embodiments are directed to techniques for interacting with a hierarchical video segmentation. In some embodiments, the finest level of the hierarchical segmentation identifies the smallest interaction unit of a video—semantically defined video segments of unequal duration called clip atoms. Each level of the hierarchical segmentation clusters the clip atoms with a corresponding degree of granularity into a corresponding set of video segments. A presented video timeline is segmented based on one of the levels, and one or more segments are selected through interactions with the video timeline (e.g., clicks, drags), by performing a metadata search, or through selection of corresponding metadata segments from a metadata panel. Navigating to a different level of the hierarchy transforms the selection into corresponding coarser or finer video segments defined by the level. Any operation can be performed on selected video segments, including playing back, trimming, or editing.

Patent Agency Ranking