-
Publication No.: US20240244287A1
Publication Date: 2024-07-18
Application No.: US18154412
Filing Date: 2023-01-13
Applicant: Adobe Inc.
Inventor: Kim Pascal PIMMEL , Stephen Joseph DIVERDI , Jiaju MA , Rubaiat HABIB , Li-Yi WEI , Hijung SHIN , Deepali ANEJA , John G. NELSON , Wilmot LI , Dingzeyu LI , Lubomira Assenova DONTCHEVA , Joel Richard BRANDT
IPC: H04N21/431 , G06F3/04812 , G06F3/0482 , H04N21/4402
CPC classification number: H04N21/4312 , G06F3/04812 , G06F3/0482 , H04N21/440236
Abstract: Embodiments of the present disclosure provide a method, a system, and computer storage media that support adding and editing multimedia effects in text-based video editing tools. The method includes generating a user interface (UI) displaying a transcript of an audio track of a video and receiving, via the UI, input identifying selection of a text segment from the transcript. The method also includes receiving, via the UI, input identifying selection of a particular type of text stylization or layout for application to the text segment. In response, the method identifies a video effect corresponding to the particular type of text stylization or layout, applies the video effect to a video segment corresponding to the text segment, and applies the particular type of text stylization or layout to the text segment to visually represent the video effect in the transcript.
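The style-to-effect lookup described in this abstract might be sketched as follows. This is a minimal illustration, not the patented method: the `STYLE_TO_EFFECT` table, the `Word` type, and the `apply_style` helper are all hypothetical names, and the abstract does not say which styles map to which effects.

```python
from dataclasses import dataclass, field

# Hypothetical style-to-effect table; the abstract names no specific pairs.
STYLE_TO_EFFECT = {
    "bold": "zoom_in",
    "italic": "slow_motion",
    "highlight": "caption_overlay",
}


@dataclass
class Word:
    text: str
    start: float  # start time in seconds
    end: float    # end time in seconds
    styles: set = field(default_factory=set)


def apply_style(words, first, last, style):
    """Style words[first..last] and return (effect, start, end) for the video."""
    effect = STYLE_TO_EFFECT[style]
    for word in words[first:last + 1]:
        word.styles.add(style)  # the transcript now shows the effect visually
    return effect, words[first].start, words[last].end


words = [Word("hello", 0.0, 0.4), Word("big", 0.5, 0.8), Word("world", 0.9, 1.3)]
print(apply_style(words, 1, 2, "bold"))
```

The returned time range is the video segment that corresponds to the selected text, so a caller could hand it to whatever component renders the effect.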
-
Publication No.: US20240135973A1
Publication Date: 2024-04-25
Application No.: US17967364
Filing Date: 2022-10-17
Applicant: Adobe Inc.
Inventor: Xue BAI , Justin Jonathan SALAMON , Aseem Omprakash AGARWALA , Hijung SHIN , Haoran CAI , Joel Richard BRANDT , Lubomira Assenova DONTCHEVA , Cristin Ailidh FRASER
IPC: G11B27/036 , G06F40/166 , G10L15/26 , G10L25/57 , G11B27/34
CPC classification number: G11B27/036 , G06F40/166 , G10L15/26 , G10L25/57 , G11B27/34 , G06F3/0482
Abstract: Embodiments of the present invention provide systems, methods, and computer storage media for identifying candidate boundaries for video segments, video segment selection using those boundaries, and text-based video editing of video segments selected via transcript interactions. In an example implementation, boundaries of detected sentences and words are extracted from a transcript, the boundaries are retimed into an adjacent speech gap to a location where voice or audio activity is a minimum, and the resulting boundaries are stored as candidate boundaries for video segments. As such, a transcript interface presents the transcript, interprets input selecting transcript text as an instruction to select a video segment with corresponding boundaries selected from the candidate boundaries, and interprets commands that are traditionally thought of as text-based operations (e.g., cut, copy, paste) as an instruction to perform a corresponding video editing operation using the selected video segment.
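The retiming step could look roughly like the sketch below, assuming a per-frame audio activity score is already available. The function name, the frame-index interface, and the `activity` list are assumptions made for illustration; the abstract does not specify how activity is measured.

```python
def retime_boundary(gap_first, gap_last, activity):
    """Return the frame index in [gap_first, gap_last] with the lowest activity.

    activity[i] is a hypothetical audio activity score for frame i, and the
    gap is the run of frames between two adjacent spoken words or sentences.
    """
    if gap_first > gap_last:
        raise ValueError("empty speech gap")
    return min(range(gap_first, gap_last + 1), key=lambda i: activity[i])


# Frames 2..6 form the speech gap; frame 4 is the quietest one.
activity = [0.9, 0.8, 0.5, 0.2, 0.05, 0.3, 0.7, 0.9]
print(retime_boundary(2, 6, activity))
```

Storing the returned index (converted to a timestamp) as a candidate boundary means a later text selection can snap to cut points that do not clip speech.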
-
Publication No.: US20240134909A1
Publication Date: 2024-04-25
Application No.: US17967703
Filing Date: 2022-10-17
Applicant: Adobe Inc.
Inventor: Lubomira Assenova DONTCHEVA , Dingzeyu LI , Kim Pascal PIMMEL , Hijung SHIN , Hanieh DEILAMSALEHY , Aseem Omprakash AGARWALA , Joy Oakyung KIM , Joel Richard BRANDT , Cristin Ailidh FRASER
IPC: G06F16/732
CPC classification number: G06F16/732
Abstract: Embodiments of the present invention provide systems, methods, and computer storage media for a visual and text search interface used to navigate a video transcript. In an example embodiment, a freeform text query triggers a visual search for frames of a loaded video that match the freeform text query (e.g., frame embeddings that match a corresponding embedding of the freeform query), and triggers a text search for matching words from a corresponding transcript or from tags of detected features from the loaded video. Visual search results are displayed (e.g., in a row of tiles that can be scrolled to the left and right), and textual search results are displayed (e.g., in a row of tiles that can be scrolled up and down). Selecting (e.g., clicking or tapping on) a search result tile navigates a transcript interface to a corresponding portion of the transcript.
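A toy version of the two searches is sketched below. It assumes frame and query embeddings already exist as plain lists of floats and uses cosine similarity as the match score, which is a common choice but is not stated in the abstract. The names `visual_search` and `text_search` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def visual_search(query_emb, frame_embs, top_k=3):
    """Return indices of the top_k frames most similar to the query."""
    scored = sorted(
        ((cosine(query_emb, emb), i) for i, emb in enumerate(frame_embs)),
        reverse=True,
    )
    return [i for _, i in scored[:top_k]]


def text_search(query, words):
    """Return indices of transcript words (or tags) containing the query."""
    q = query.lower()
    return [i for i, word in enumerate(words) if q in word.lower()]


frames = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
print(visual_search([1.0, 0.0], frames, top_k=2))
print(text_search("dog", ["The", "Dog", "barked", "dogs"]))
```

Each returned index would back one search result tile, and selecting a tile would scroll the transcript to the matching frame time or word.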
-
Publication No.: US20240134597A1
Publication Date: 2024-04-25
Application No.: US17967714
Filing Date: 2022-10-17
Applicant: Adobe Inc.
Inventor: Lubomira Assenova DONTCHEVA , Anh Lan TRUONG , Hanieh DEILAMSALEHY , Kim Pascal PIMMEL , Aseem Omprakash AGARWALA , Dingzeyu LI , Joel Richard BRANDT , Joy Oakyung KIM
IPC: G06F3/16 , G06F3/0482 , G06F3/0484 , G06F16/735 , G06F16/738
CPC classification number: G06F3/167 , G06F3/0482 , G06F3/0484 , G06F16/735 , G06F16/738
Abstract: Embodiments of the present invention provide systems, methods, and computer storage media for a question search for meaningful questions that appear in a video. In an example embodiment, an audio track from a video is transcribed, and the transcript is parsed to identify sentences that end with a question mark. Depending on the embodiment, one or more types of questions are filtered out, such as short questions less than a designated length or duration, logistical questions, and/or rhetorical questions. As such, in response to a command to perform a question search, the questions are identified, and search result tiles representing video segments of the questions are presented. Selecting (e.g., clicking or tapping on) a search result tile navigates a transcript interface to a corresponding portion of the transcript.
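The question filter might be approximated as in the sketch below. The regex sentence splitter, the `min_words` threshold, and the `LOGISTICAL` phrase list are assumptions for illustration. Rhetorical-question filtering is left out because the abstract does not say how such questions are detected.

```python
import re

# Hypothetical phrases that mark logistical questions.
LOGISTICAL = ("can you hear me", "can you see my screen")


def find_questions(transcript, min_words=4):
    """Return sentences ending in '?' that pass the length and phrase filters."""
    sentences = re.split(r"(?<=[.?!])\s+", transcript.strip())
    questions = []
    for sentence in sentences:
        if not sentence.endswith("?"):
            continue  # not a question
        if len(sentence.split()) < min_words:
            continue  # too short to be meaningful
        if any(phrase in sentence.lower() for phrase in LOGISTICAL):
            continue  # logistical question
        questions.append(sentence)
    return questions


text = ("Welcome everyone. Can you hear me okay? Really? "
        "What made you choose this design for the bridge? Thanks.")
print(find_questions(text))
```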
-
Publication No.: US20250140292A1
Publication Date: 2025-05-01
Application No.: US18431103
Filing Date: 2024-02-02
Applicant: ADOBE INC.
Inventor: Anh Lan TRUONG , Deepali ANEJA , Hijung SHIN , Rubaiat HABIB , Jakub FISER , Kishore RADHAKRISHNA , Joel Richard BRANDT , Matthew David FISHER , Zeyu JIN , Kim Pascal PIMMEL , Wilmot LI , Lubomira Assenova DONTCHEVA
IPC: G11B27/036 , G06V20/40 , G06V40/16 , H04N5/262
Abstract: Embodiments of the present invention provide systems, methods, and computer storage media for cutting down a user's larger input video into an edited video comprising the most important video segments and applying corresponding video effects. Some embodiments of the present invention are directed to adding face-aware scale magnification to the trimmed video (e.g., applying scale magnification to simulate a camera zoom effect that hides shot cuts with respect to the subject's face). For example, as the trimmed video transitions from one video segment to the next video segment, a scale magnification may be applied that zooms in on a detected face at a boundary between the video segments to smooth the transition between video segments.
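One way to picture the face-aware scale magnification is as a crop rectangle centered on the detected face and clamped to the frame. The sketch below is only a geometric illustration under that assumption. `zoom_crop` and its parameters are hypothetical, and face detection itself is out of scope.

```python
def zoom_crop(frame_w, frame_h, face_cx, face_cy, scale):
    """Return (x, y, w, h) of the crop that simulates a zoom of `scale`.

    The crop is centered on the face center where possible and shifted
    inward so that it never extends past the frame edges.
    """
    if scale < 1:
        raise ValueError("scale must be at least 1")
    crop_w, crop_h = frame_w / scale, frame_h / scale
    x = min(max(face_cx - crop_w / 2, 0), frame_w - crop_w)
    y = min(max(face_cy - crop_h / 2, 0), frame_h - crop_h)
    return x, y, crop_w, crop_h


# A 2x zoom on a face at the center of a 1920x1080 frame.
print(zoom_crop(1920, 1080, 960, 540, 2.0))
```

Scaling the returned crop back up to the full frame size gives the zoomed-in view. Applying a different scale on either side of a segment boundary is what would make the cut read as a camera move.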
-
Publication No.: US20240127855A1
Publication Date: 2024-04-18
Application No.: US17967697
Filing Date: 2022-10-17
Applicant: Adobe Inc.
CPC classification number: G11B27/02 , G06V20/46 , G06V20/48 , G06V20/49 , G06V40/172 , G06V40/176
Abstract: Embodiments of the present invention provide systems, methods, and computer storage media for selection of the best image of a particular speaker's face in a video, and visualization in a diarized transcript. In an example embodiment, candidate images of a face of a detected speaker are extracted from frames of a video identified by a detected face track for the face, and a representative image of the detected speaker's face is selected from the candidate images based on image quality, facial emotion (e.g., using an emotion classifier that generates a happiness score), a size factor (e.g., favoring larger images), and/or penalizing images that appear towards the beginning or end of a face track. As such, each segment of the transcript is presented with the representative image of the speaker who spoke that segment and/or input is accepted changing the representative image associated with each speaker.
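The selection criteria listed in this abstract could be combined into a single score, as in the sketch below. The linear weighting, the field names, and the edge penalty formula are assumptions. The abstract lists the factors but not how they are combined.

```python
def pick_representative(candidates, track_len, weights=(1.0, 1.0, 1.0, 1.0)):
    """Return the candidate with the highest combined score.

    Each candidate is a dict with hypothetical keys: 'quality' and
    'happiness' in [0, 1], 'area' in pixels, and 'index', the position
    of the frame within the face track.
    """
    w_quality, w_happy, w_size, w_edge = weights
    max_area = max(c["area"] for c in candidates)

    def score(c):
        # Relative position in the track: 0.0 at the start, 1.0 at the end.
        pos = c["index"] / (track_len - 1) if track_len > 1 else 0.5
        edge_penalty = abs(2 * pos - 1)  # 0 in the middle, 1 at either end
        return (w_quality * c["quality"]
                + w_happy * c["happiness"]
                + w_size * c["area"] / max_area
                - w_edge * edge_penalty)

    return max(candidates, key=score)


candidates = [
    {"id": "a", "quality": 0.9, "happiness": 0.8, "area": 100, "index": 0},
    {"id": "b", "quality": 0.8, "happiness": 0.7, "area": 100, "index": 5},
    {"id": "c", "quality": 0.3, "happiness": 0.2, "area": 50, "index": 10},
]
print(pick_representative(candidates, track_len=11)["id"])
```

Candidate "a" has the best quality and happiness scores, but it sits on the first frame of the track, so the edge penalty lets the mid-track candidate win.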
-
Publication No.: US20250168442A1
Publication Date: 2025-05-22
Application No.: US19033062
Filing Date: 2025-01-21
Applicant: Adobe Inc.
Inventor: Kim Pascal PIMMEL , Stephen Joseph DIVERDI , Jiaju MA , Rubaiat HABIB , Li-Yi WEI , Hijung SHIN , Deepali ANEJA , John G. NELSON , Wilmot LI , Dingzeyu LI , Lubomira Assenova DONTCHEVA , Joel Richard BRANDT
IPC: H04N21/431 , G06F3/04812 , G06F3/0482 , H04N21/4402
Abstract: Embodiments of the present disclosure provide a method, a system, and computer storage media that support adding and editing multimedia effects in text-based video editing tools. The method includes generating a user interface (UI) displaying a transcript of an audio track of a video and receiving, via the UI, input identifying selection of a text segment from the transcript. The method also includes receiving, via the UI, input identifying selection of a particular type of text stylization or layout for application to the text segment. In response, the method identifies a video effect corresponding to the particular type of text stylization or layout, applies the video effect to a video segment corresponding to the text segment, and applies the particular type of text stylization or layout to the text segment to visually represent the video effect in the transcript.
-
Publication No.: US20240127858A1
Publication Date: 2024-04-18
Application No.: US17967608
Filing Date: 2022-10-17
Applicant: Adobe Inc.
Inventor: Lubomira Assenova DONTCHEVA , Hijung SHIN , Joel Richard BRANDT , Joy Oakyung KIM
IPC: G11B27/031 , G10L15/24 , G10L15/26
CPC classification number: G11B27/031 , G10L15/24 , G10L15/26
Abstract: Embodiments of the present invention provide systems, methods, and computer storage media for annotating transcript text with video metadata, and including thumbnail bars in the transcript to help users select a desired portion of a video through transcript interactions. In an example embodiment, a video editing interface includes a transcript interface that presents a transcript with transcript text that is annotated to indicate corresponding portions of the video where various features were detected (e.g., annotating via text stylization of transcript text and/or labeling the transcript text with a textual representation of a corresponding detected feature class). In some embodiments, the transcript interface displays a visual representation of detected non-speech audio or pauses (e.g., a sound bar) and/or video thumbnails corresponding to each line of transcript text (e.g., a thumbnail bar). Transcript text, soundbars, and/or thumbnail bars are selectable to identify and perform video editing operations on a corresponding video segment.
-
Publication No.: US20240126994A1
Publication Date: 2024-04-18
Application No.: US17967562
Filing Date: 2022-10-17
Applicant: Adobe Inc.
Inventor: Hanieh DEILAMSALEHY , Aseem Omprakash AGARWALA , Haoran CAI , Hijung SHIN , Joel Richard BRANDT , Lubomira Assenova DONTCHEVA
IPC: G06F40/30 , G06F40/205 , H04N5/93
CPC classification number: G06F40/30 , G06F40/205 , H04N5/9305
Abstract: Embodiments of the present invention provide systems, methods, and computer storage media for segmenting a transcript into paragraphs. In an example embodiment, a transcript is segmented to start a new paragraph whenever there is a change in speaker and/or a long pause in speech. If any remaining paragraphs are longer than a designated length or duration (e.g., 50 or 100 words), each of those paragraphs is segmented using dynamic programming to minimize a cost function that penalizes candidate paragraphs based on divergence from a target paragraph length and/or that rewards candidate paragraphs that group semantically similar sentences. As such, the transcript is visualized, segmented at the identified paragraphs.
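The dynamic programming step can be illustrated with a simplified cost that only penalizes squared deviation from a target paragraph length. This is a sketch under that assumption: the semantic-similarity reward mentioned in the abstract is omitted, and `segment_paragraphs` is a hypothetical name.

```python
def segment_paragraphs(sentence_lengths, target):
    """Split sentences into paragraphs that minimize the total squared
    deviation of paragraph length (in words) from `target`.

    Returns a list of (start, end) sentence index ranges, end exclusive.
    """
    n = len(sentence_lengths)
    prefix = [0]
    for length in sentence_lengths:
        prefix.append(prefix[-1] + length)

    best = [0.0] + [float("inf")] * n  # best[j]: min cost for sentences [0, j)
    back = [0] * (n + 1)               # back[j]: start of the last paragraph
    for j in range(1, n + 1):
        for i in range(j):
            cost = best[i] + (prefix[j] - prefix[i] - target) ** 2
            if cost < best[j]:
                best[j], back[j] = cost, i

    # Walk the back-pointers from the end to recover the paragraph ranges.
    ranges = []
    j = n
    while j > 0:
        ranges.append((back[j], j))
        j = back[j]
    return ranges[::-1]


# Four 10-word sentences with a 20-word target split into two paragraphs.
print(segment_paragraphs([10, 10, 10, 10], target=20))
```

In the flow the abstract describes, this step would run only on paragraphs that are still too long after splitting on speaker changes and long pauses.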