-
Publication No.: US20240404283A1
Publication Date: 2024-12-05
Application No.: US18328597
Filing Date: 2023-06-02
Applicant: Adobe Inc.
Inventor: Zhaowen WANG , Trung BUI , Bo HE
IPC: G06V20/40 , G06F40/166 , G06F40/40 , G06V10/774 , G06V10/776 , G06V10/80
Abstract: A method includes receiving a video input and a text transcription of the video input. The video input includes a plurality of frames and the text transcription includes a plurality of sentences. The method further includes determining, by a multimodal summarization model, a subset of key frames of the plurality of frames and a subset of key sentences of the plurality of sentences. The method further includes providing a summary of the video input and a summary of the text transcription based on the subset of key frames and the subset of key sentences.
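The abstract describes an extractive scheme: score every frame and every sentence jointly, then keep the top-scoring subsets. Below is a minimal PyTorch sketch of that idea; the cross-modal attention fusion, all dimensions, and the top-k selection are illustrative assumptions, since the record does not disclose the model internals.

```python
# Sketch of extractive multimodal summarization: score frames and sentences,
# keep the highest-scoring subsets. The attention-based fusion is an
# assumption for illustration, not the patented architecture.
import torch
import torch.nn as nn


class MultimodalSummarizer(nn.Module):
    def __init__(self, dim: int = 256):
        super().__init__()
        # Cross-modal attention: frames attend to sentences and vice versa
        # before scoring, so each score reflects both modalities.
        self.frame_to_text = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.text_to_frame = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.frame_scorer = nn.Linear(dim, 1)
        self.sentence_scorer = nn.Linear(dim, 1)

    def forward(self, frame_emb: torch.Tensor, sent_emb: torch.Tensor):
        # frame_emb: (batch, num_frames, dim); sent_emb: (batch, num_sentences, dim)
        fused_frames, _ = self.frame_to_text(frame_emb, sent_emb, sent_emb)
        fused_sents, _ = self.text_to_frame(sent_emb, frame_emb, frame_emb)
        frame_scores = self.frame_scorer(fused_frames).squeeze(-1)
        sent_scores = self.sentence_scorer(fused_sents).squeeze(-1)
        return frame_scores, sent_scores


# Usage: random embeddings stand in for real frame/sentence encoders.
model = MultimodalSummarizer()
frames = torch.randn(1, 120, 256)   # 120 video frames
sents = torch.randn(1, 30, 256)     # 30 transcript sentences
frame_scores, sent_scores = model(frames, sents)
key_frames = frame_scores.topk(k=5, dim=1).indices      # subset of key frames
key_sentences = sent_scores.topk(k=3, dim=1).indices    # subset of key sentences
```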
-
Publication No.: US20210264236A1
Publication Date: 2021-08-26
Application No.: US16802440
Filing Date: 2020-02-26
Applicant: Adobe Inc.
Inventor: Ning XU , Bayram Safa CICEK , Hailin JIN , Zhaowen WANG
Abstract: Embodiments of the present disclosure are directed towards improved models trained using unsupervised domain adaptation. In particular, a style-content adaptation system provides improved translation during unsupervised domain adaptation by controlling the alignment of conditional distributions of a model during training such that content (e.g., a class) from a target domain is correctly mapped to content (e.g., the same class) in a source domain. The style-content adaptation system improves unsupervised domain adaptation using independent control over content (e.g., related to a class) as well as style (e.g., related to a domain) to control alignment when translating between the source and target domain. This independent control over content and style also allows the style-content adaptation system to generate images that contain desired content and/or style.
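The core mechanism here is factoring an image into a class-related content code and a domain-related style code, so the two can be recombined independently. A minimal sketch of that split, assuming a convolutional encoder and a learned per-domain style embedding (neither is specified in the record):

```python
# Sketch of the content/style split: the encoder yields a content code
# (class-related); the decoder recombines it with a style code
# (domain-related). Swapping the style code translates between domains
# while the content code keeps the class aligned. Sizes are assumptions.
import torch
import torch.nn as nn


class StyleContentTranslator(nn.Module):
    def __init__(self, content_dim: int = 128, style_dim: int = 32, num_domains: int = 2):
        super().__init__()
        self.content_encoder = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, content_dim, 4, stride=2, padding=1),
        )
        # One learned style code per domain (source vs. target).
        self.style_codes = nn.Embedding(num_domains, style_dim)
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(content_dim + style_dim, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1), nn.Tanh(),
        )

    def forward(self, image: torch.Tensor, target_domain: torch.Tensor):
        content = self.content_encoder(image)    # class-related code
        style = self.style_codes(target_domain)  # domain-related code
        style = style[:, :, None, None].expand(-1, -1, *content.shape[2:])
        return self.decoder(torch.cat([content, style], dim=1))


# Usage: render target-domain images in the source domain's style.
model = StyleContentTranslator()
target_images = torch.randn(4, 3, 32, 32)
source_domain = torch.zeros(4, dtype=torch.long)   # domain id 0 = source
translated = model(target_images, source_domain)   # same content, source style
```

Keeping the style code a separate input is what gives the independent control the abstract emphasizes: the same content code can be decoded under any domain's style.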
-
Publication No.: US20230070666A1
Publication Date: 2023-03-09
Application No.: US17466711
Filing Date: 2021-09-03
Applicant: Adobe Inc. , Czech Technical University in Prague
Inventor: Michal LUKÁC , Daniel SÝKORA , David FUTSCHIK , Zhaowen WANG , Elya SHECHTMAN
Abstract: Embodiments are disclosed for translating an image from a source visual domain to a target visual domain. In particular, in one or more embodiments, the disclosed systems and methods comprise a training process that includes receiving a training input including a pair of keyframes and an unpaired image. The pair of keyframes represent a visual translation from a first version of an image in a source visual domain to a second version of the image in a target visual domain. The one or more embodiments further include sending the pair of keyframes and the unpaired image to an image translation network to generate a first training image and a second training image. The one or more embodiments further include training the image translation network to translate images from the source visual domain to the target visual domain based on a calculated loss using the first and second training images.
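The training recipe combines one supervised signal (the keyframe pair) with an unsupervised one (the unpaired image). The sketch below illustrates one plausible reading of that step; the tiny generator and the L1 and smoothness losses are placeholders, not the patent's actual loss terms.

```python
# Sketch of a training step using a paired keyframe (supervised) plus an
# unpaired image (regularized), per the abstract's "first and second
# training images". Network and losses are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

# A small fully convolutional translator; any image-to-image network fits here.
translator = nn.Sequential(
    nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 3, 3, padding=1),
)
optimizer = torch.optim.Adam(translator.parameters(), lr=2e-4)


def training_step(source_keyframe, target_keyframe, unpaired_image):
    # First training image: translation of the paired source keyframe.
    translated_keyframe = translator(source_keyframe)
    # Second training image: translation of the unpaired image.
    translated_unpaired = translator(unpaired_image)

    # Supervised term: match the known target keyframe.
    paired_loss = F.l1_loss(translated_keyframe, target_keyframe)
    # Unsupervised term (placeholder): keep the unpaired output smooth,
    # standing in for whatever consistency loss the patent actually uses.
    smooth_loss = (translated_unpaired[..., 1:] - translated_unpaired[..., :-1]).abs().mean()

    loss = paired_loss + 0.1 * smooth_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


# Usage with dummy tensors in place of real keyframes.
src = torch.randn(1, 3, 64, 64)
tgt = torch.randn(1, 3, 64, 64)
unp = torch.randn(1, 3, 64, 64)
print(training_step(src, tgt, unp))
```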
-