-
Publication No.: US20230102217A1
Publication Date: 2023-03-30
Application No.: US18049185
Filing Date: 2022-10-24
Applicant: Adobe Inc.
Inventor: Mahika Wason , Amol Jindal , Ajay Bedi
Abstract: The present disclosure describes systems, non-transitory computer-readable media, and methods that can generate contextual identifiers indicating context for frames of a video and utilize those contextual identifiers to generate translations of text corresponding to such video frames. By analyzing a digital video file, the disclosed systems can identify video frames corresponding to a scene and a term sequence corresponding to a subset of the video frames. Based on image features of the video frames corresponding to the scene, the disclosed systems can utilize a contextual neural network to generate a contextual identifier (e.g., a contextual tag) indicating context for the video frames. Based on the contextual identifier, the disclosed systems can subsequently apply a translation neural network to generate a translation of the term sequence from a source language to a target language. In some cases, the translation neural network also generates affinity scores for the translation.
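The pipeline in the abstract above can be illustrated with a minimal sketch: a stand-in for the contextual network maps scene features to a tag, and the tag conditions how an ambiguous term is translated, with an affinity score attached. All tags, terms, translations, and scores here are hypothetical examples, not the patented models.

```python
# Toy sketch: a scene-level "contextual identifier" disambiguates
# translation of a term sequence. All data below is illustrative.

def contextual_tag(frame_features):
    """Stand-in for the contextual neural network: scene features -> tag."""
    return "sports" if "stadium" in frame_features else "finance"

def translate(term_sequence, tag):
    """Stand-in for the translation network: the tag picks the sense.
    Returns (translation, affinity_score) pairs."""
    lexicon = {
        ("pitch", "sports"): ("terrain", 0.92),      # French rendering
        ("pitch", "finance"): ("argumentaire", 0.88),
    }
    return [lexicon.get((term, tag), (term, 0.5)) for term in term_sequence]

scene_features = ["stadium", "crowd"]
tag = contextual_tag(scene_features)
result = translate(["pitch"], tag)
```

The point of the sketch is only the conditioning structure: the same source term yields different target-language output depending on the scene context.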
-
Publication No.: US10685470B2
Publication Date: 2020-06-16
Application No.: US16128904
Filing Date: 2018-09-12
Applicant: Adobe Inc.
Inventor: Amol Jindal , Vivek Mishra , Neha Sharan , Anmol Dhawan
Abstract: The present disclosure relates to systems, methods, and non-transitory computer readable media for generating and providing composition effect tutorials for creating and editing digital content based on a metadata composite structure. For example, the disclosed systems can generate and/or access a metadata composite structure that includes nodes corresponding to composition effects applied to a digital content item, where a given node can include location information indicating where a composition effect is applied relative to a digital content item. The disclosed systems can further generate a tutorial to guide a user to implement a selected composition effect by identifying composition effects of nodes that correspond to a location selected within a composition interface and presenting instructions for a particular composition effect.
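The node-based lookup described above can be sketched as follows: each node records which effect was applied and where, and a click location retrieves tutorial steps for the effects at that spot. The node fields, hit-test, and tutorial store are illustrative assumptions, not the patent's actual structure.

```python
# Sketch of a metadata composite structure: nodes carry an effect name
# and the region it was applied to; a selected point is hit-tested
# against those regions to find which tutorials to surface.

def effects_at(nodes, point):
    """Return effects whose recorded region contains the clicked point."""
    x, y = point
    return [n["effect"] for n in nodes
            if n["x0"] <= x <= n["x1"] and n["y0"] <= y <= n["y1"]]

def tutorial_for(effect, steps_db):
    """Look up step-by-step instructions for a composition effect."""
    return steps_db.get(effect, ["No tutorial available"])

nodes = [
    {"effect": "vignette", "x0": 0, "y0": 0, "x1": 100, "y1": 100},
    {"effect": "blur", "x0": 50, "y0": 50, "x1": 80, "y1": 80},
]
steps_db = {"vignette": ["Open Effects panel", "Drag Vignette onto canvas"]}
hits = effects_at(nodes, (60, 60))
```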
-
Publication No.: US11875781B2
Publication Date: 2024-01-16
Application No.: US17008427
Filing Date: 2020-08-31
Applicant: Adobe Inc.
Inventor: Amol Jindal , Somya Jain , Ajay Bedi
IPC: G10L15/04 , G06F40/253 , G06F40/30
CPC classification number: G10L15/04 , G06F40/253 , G06F40/30
Abstract: A media edit point selection process can include a media editing software application programmatically converting speech to text and storing a timestamp-to-text map. The map correlates text corresponding to speech extracted from an audio track for the media clip to timestamps for the media clip. The timestamps correspond to words and some gaps in the speech from the audio track. The probability of identified gaps corresponding to a grammatical pause by the speaker is determined using the timestamp-to-text map and a semantic model. Potential edit points corresponding to grammatical pauses in the speech are stored for display or for additional use by the media editing software application. Text can optionally be displayed to a user during media editing.
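The timestamp-to-text idea above can be sketched with a toy heuristic: gaps between word timestamps are candidates, and a simple punctuation rule stands in for the semantic model that scores whether a gap is a grammatical pause. The gap threshold, the probability formula, and the sample data are assumptions for illustration only.

```python
# Sketch: find potential edit points from a timestamp-to-text map.
# word_map entries are (start_sec, end_sec, word) from speech-to-text.

def edit_points(word_map, min_gap=0.5):
    """Return midpoints of gaps likely to be grammatical pauses."""
    points = []
    for (s1, e1, w1), (s2, e2, w2) in zip(word_map, word_map[1:]):
        gap = s2 - e1
        # Toy "semantic model": sentence-final punctuation raises the
        # probability that the silence is a deliberate pause.
        prob = min(1.0, gap / min_gap) * (1.0 if w1.endswith((".", ",")) else 0.6)
        if gap >= min_gap and prob > 0.5:
            points.append(round(e1 + gap / 2, 2))
    return points

words = [(0.0, 0.4, "hello"), (0.5, 0.9, "world."), (2.0, 2.3, "next")]
cuts = edit_points(words)
```

The short inter-word gap is skipped; the 1.1-second silence after a sentence-final word yields one candidate edit point at its midpoint.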
-
Publication No.: US11573689B1
Publication Date: 2023-02-07
Application No.: US17507105
Filing Date: 2021-10-21
Applicant: Adobe Inc.
Inventor: Amol Jindal , Ajay Bedi
IPC: G06F3/048 , G06F3/04845 , G06F3/04842
Abstract: Techniques are described for a trace layer for replicating a source region of a digital image. In an implementation, a user leverages a content editing system to select a source region of a source image to be replicated and a target region of a target image to which portions of the source region are to be replicated. A trace layer is generated that is a visual representation of portions of the source region, and the trace layer is positioned on the target region of the target image. Further, the trace layer is generated based on a visibility factor such that the trace layer is at least partially transparent. User interaction with the trace layer selects portions of it, and the visibility of the selected portions is modified to replicate the corresponding portions of the source region to the target region.
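A minimal sketch of the visibility mechanic: each pixel of the trace layer carries a partial alpha so the target shows through, and user-selected positions become fully opaque, i.e. replicated. The 2-D lists, the 0-to-1 alpha scale, and the default visibility factor are assumptions for illustration.

```python
# Sketch of a trace layer: a semi-transparent copy of a source region
# whose selected pixels are made opaque to replicate them onto a target.

def make_trace_layer(source_region, visibility=0.4):
    """Pair each source pixel value with a partial alpha (visibility factor)."""
    return [[(px, visibility) for px in row] for row in source_region]

def reveal(trace, selected):
    """Set alpha to 1.0 for (row, col) positions the user painted over."""
    for r, c in selected:
        px, _ = trace[r][c]
        trace[r][c] = (px, 1.0)
    return trace

src = [[10, 20], [30, 40]]
trace = make_trace_layer(src)
trace = reveal(trace, [(0, 1)])
```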
-
Publication No.: US20220261579A1
Publication Date: 2022-08-18
Application No.: US17177761
Filing Date: 2021-02-17
Applicant: Adobe Inc.
Inventor: Amol Jindal , Ajay Bedi
IPC: G06K9/00 , G06K9/62 , G06K9/46 , G06F16/532
Abstract: Techniques are provided for identifying objects (such as products within a physical store) within a captured video scene and indicating which object in the captured scene matches a desired object requested by a user. The matching object is then displayed in an accentuated manner to the user in real time (via augmented reality). Object identification is carried out via a multimodal methodology. Objects within the captured video scene are identified using a neural network trained to identify different types of objects. The identified objects can then be compared against a database of pre-stored images of the desired product to determine if a close match is found. Additionally, text on the identified objects is analyzed and compared to the text of the desired object. Based on either or both identification methods, the desired object is indicated to the user on their display via an augmented reality graphic.
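The "either or both" multimodal decision can be sketched as follows: a detected object counts as the desired product if its image embedding is close to a stored reference or its OCR'd text overlaps the query text. The toy embeddings, the cosine threshold, and the word-overlap rule are illustrative assumptions, not the patented matcher.

```python
# Sketch of multimodal product matching: image similarity OR text
# overlap confirms a match. Embeddings here are tiny toy vectors.

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(y * y for y in b) ** 0.5
    return dot / (na * nb)

def is_match(detected, reference, threshold=0.9):
    image_ok = cosine(detected["embedding"], reference["embedding"]) >= threshold
    text_ok = bool(set(detected["text"].lower().split())
                   & set(reference["text"].lower().split()))
    return image_ok or text_ok  # either modality can confirm the match

ref = {"embedding": [1.0, 0.0], "text": "Acme Cola"}
obj = {"embedding": [0.99, 0.1], "text": "ACME cola zero"}
```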
-
Publication No.: US20230360271A1
Publication Date: 2023-11-09
Application No.: US17661881
Filing Date: 2022-05-03
Applicant: Adobe Inc.
Inventor: Amol Jindal , Ajay Bedi
CPC classification number: G06T7/97 , G06T7/73 , G06F16/54 , G06T2207/20101 , G06T2200/24 , G06T2207/20081
Abstract: The present disclosure relates to systems, non-transitory computer-readable media, and methods for detecting changes to a point of interest between a selected version and a previous version of a digital image and providing a summary of the changes to the point of interest. For example, the disclosed system provides for display a selected version of a digital image and detects a point of interest within the selected version of the digital image. The disclosed system determines image modifications to the point of interest (e.g., tracks changes to the point of interest) to generate a summary of the image modifications. Moreover, the summary can indicate further information concerning image modifications applied to the selected point of interest, such as timestamp, editor, or author information.
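One way to picture the summary step above: filter an edit history down to entries whose recorded region contains the selected point of interest, then format timestamp, operation, and editor. The edit-record fields and the rectangular hit-test are assumptions for illustration; the patent does not specify this representation.

```python
# Sketch: summarize modifications applied at a selected point of
# interest by filtering an edit history on region containment.

def summarize(history, point):
    """Return 'time: op by editor' lines for edits covering the point."""
    x, y = point
    hits = []
    for edit in history:
        x0, y0, x1, y1 = edit["region"]
        if x0 <= x <= x1 and y0 <= y <= y1:
            hits.append(f'{edit["time"]}: {edit["op"]} by {edit["editor"]}')
    return hits

history = [
    {"region": (0, 0, 50, 50), "op": "dodge", "editor": "ana", "time": "10:02"},
    {"region": (60, 60, 90, 90), "op": "crop", "editor": "bo", "time": "10:05"},
]
summary = summarize(history, (10, 10))
```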
-
Publication No.: US11481563B2
Publication Date: 2022-10-25
Application No.: US16678378
Filing Date: 2019-11-08
Applicant: Adobe Inc.
Inventor: Mahika Wason , Amol Jindal , Ajay Bedi
Abstract: The present disclosure describes systems, non-transitory computer-readable media, and methods that can generate contextual identifiers indicating context for frames of a video and utilize those contextual identifiers to generate translations of text corresponding to such video frames. By analyzing a digital video file, the disclosed systems can identify video frames corresponding to a scene and a term sequence corresponding to a subset of the video frames. Based on image features of the video frames corresponding to the scene, the disclosed systems can utilize a contextual neural network to generate a contextual identifier (e.g., a contextual tag) indicating context for the video frames. Based on the contextual identifier, the disclosed systems can subsequently apply a translation neural network to generate a translation of the term sequence from a source language to a target language. In some cases, the translation neural network also generates affinity scores for the translation.
-
Publication No.: US10701434B1
Publication Date: 2020-06-30
Application No.: US16253120
Filing Date: 2019-01-21
Applicant: Adobe Inc.
Inventor: Amol Jindal , Ajay Bedi
IPC: H04H60/32 , H04N21/431 , G06N3/08 , H04N21/433 , H04N21/44
Abstract: A seek content extraction system analyzes frames of video content and identifies locations in the frames where session information is displayed. Session information is information displayed as part of the video content that describes, for a particular location in the video content, what is currently happening at that location. This session information is extracted from each of multiple frames, and the extracted session information is associated with its frame. While the user seeks forward or backward through the video content, a thumbnail of the frame at a given location is displayed along with the extracted session information associated with that frame.
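The per-frame association can be sketched as a sorted index: each analyzed frame stores the text extracted from its on-screen overlay (for example, a game score), and a seek position returns the nearest earlier frame's session information alongside the thumbnail. The sample data and the nearest-earlier lookup rule are illustrative assumptions.

```python
# Sketch: map seek positions to extracted session information using
# a sorted timestamp index and binary search.
import bisect

def build_index(frames):
    """frames: list of (timestamp_seconds, session_text), sorted by time."""
    times = [t for t, _ in frames]
    return times, frames

def info_at(index, seek_time):
    """Return the session info of the nearest frame at or before seek_time."""
    times, frames = index
    i = bisect.bisect_right(times, seek_time) - 1
    return frames[i][1] if i >= 0 else None

idx = build_index([(0, "Q1 0-0"), (900, "Q2 14-7"), (1800, "Q3 21-10")])
```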
-
Publication No.: US10460196B2
Publication Date: 2019-10-29
Application No.: US15232533
Filing Date: 2016-08-09
Applicant: Adobe Inc.
Inventor: Anmol Dhawan , Varun Maini , Srinivasa Madhava Phaneen Angara , Amol Jindal
Abstract: Salient video frame establishment is described. In one or more example embodiments, salient frames of a video are established based on multiple photos. An image processing module is capable of analyzing both video frames and photos, both of which may include entities, such as faces or objects. Frames of a video are decoded and analyzed in terms of attributes of the video. Attributes include, for example, scene boundaries, facial expressions, brightness levels, and focus levels. From the video frames, the image processing module determines candidate frames based on the attributes. The image processing module analyzes multiple photos to ascertain multiple relevant entities based on the presence of entities in the multiple photos. Relevancy of an entity can depend, for instance, on a number of occurrences. The image processing module establishes multiple salient frames from the candidate frames based on the multiple relevant entities. Salient frames can be displayed.
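The two-stage selection above can be sketched simply: entities that recur across the user's photos are deemed relevant, and candidate frames are then ranked by how many relevant entities they contain. The entity labels, the occurrence cutoff, and the top-k ranking are illustrative assumptions standing in for the attribute-based analysis.

```python
# Sketch: establish salient frames from candidates using entity
# relevance derived from occurrence counts across multiple photos.
from collections import Counter

def relevant_entities(photo_entities, min_occurrences=2):
    """Entities appearing in at least min_occurrences photos are relevant."""
    counts = Counter(e for photo in photo_entities for e in photo)
    return {e for e, n in counts.items() if n >= min_occurrences}

def salient_frames(candidates, relevant, top_k=2):
    """candidates: list of (frame_id, entities_in_frame); rank by overlap."""
    scored = sorted(candidates,
                    key=lambda fe: len(set(fe[1]) & relevant), reverse=True)
    return [f for f, _ in scored[:top_k]]

photos = [["alice", "dog"], ["alice", "beach"], ["dog"]]
rel = relevant_entities(photos)
best = salient_frames([(1, ["alice"]), (2, ["tree"]), (3, ["alice", "dog"])], rel)
```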
-
Publication No.: US12299408B2
Publication Date: 2025-05-13
Application No.: US18049185
Filing Date: 2022-10-24
Applicant: Adobe Inc.
Inventor: Mahika Wason , Amol Jindal , Ajay Bedi
Abstract: The present disclosure describes systems, non-transitory computer-readable media, and methods that can generate contextual identifiers indicating context for frames of a video and utilize those contextual identifiers to generate translations of text corresponding to such video frames. By analyzing a digital video file, the disclosed systems can identify video frames corresponding to a scene and a term sequence corresponding to a subset of the video frames. Based on image features of the video frames corresponding to the scene, the disclosed systems can utilize a contextual neural network to generate a contextual identifier (e.g., a contextual tag) indicating context for the video frames. Based on the contextual identifier, the disclosed systems can subsequently apply a translation neural network to generate a translation of the term sequence from a source language to a target language. In some cases, the translation neural network also generates affinity scores for the translation.