PARSING AND REFLOWING INFOGRAPHICS USING STRUCTURED LISTS AND GROUPS

    Publication Number: US20220019735A1

    Publication Date: 2022-01-20

    Application Number: US16929903

    Filing Date: 2020-07-15

    Applicant: Adobe Inc.

    Abstract: This disclosure describes methods, systems, and non-transitory computer readable media for automatically parsing infographics into segments corresponding to structured groups or lists and displaying the identified segments or reflowing the segments into various computing tasks. For example, the disclosed systems may utilize a novel infographic grouping taxonomy and annotation system to group elements within infographics. The disclosed systems can train and apply a machine-learning-detection model to generate infographic segments according to the infographic grouping taxonomy. By generating infographic segments, the disclosed systems can facilitate computing tasks, such as converting infographics into digital presentation graphics (e.g., slide carousels), reflowing the infographic into query-and-response models, performing search functions, or performing other computational tasks.

    Parsing and reflowing infographics using structured lists and groups

    Publication Number: US11769006B2

    Publication Date: 2023-09-26

    Application Number: US16929903

    Filing Date: 2020-07-15

    Applicant: Adobe Inc.

    CPC classification number: G06F40/205 G06F16/90332 G06F16/9538 G06N20/00

    Abstract: This disclosure describes methods, systems, and non-transitory computer readable media for automatically parsing infographics into segments corresponding to structured groups or lists and displaying the identified segments or reflowing the segments into various computing tasks. For example, the disclosed systems may utilize a novel infographic grouping taxonomy and annotation system to group elements within infographics. The disclosed systems can train and apply a machine-learning-detection model to generate infographic segments according to the infographic grouping taxonomy. By generating infographic segments, the disclosed systems can facilitate computing tasks, such as converting infographics into digital presentation graphics (e.g., slide carousels), reflowing the infographic into query-and-response models, performing search functions, or performing other computational tasks.
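
    The abstract above describes a pipeline that first detects taxonomy-labeled segments in an infographic and then reflows those segments into downstream formats such as slide carousels. The Python sketch below is a loose illustration of that flow, not the patented system: the taxonomy labels, the InfographicSegment structure, and the detector interface are hypothetical stand-ins for the trained machine-learning-detection model.

# Illustrative sketch: segment an infographic and reflow the segments into
# slide-like units. All names and labels below are assumptions for illustration.
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical grouping taxonomy: each detected region carries one label.
TAXONOMY = ("title", "list_item", "group", "visual_element")

@dataclass
class InfographicSegment:
    label: str                       # one of TAXONOMY
    bbox: Tuple[int, int, int, int]  # (x, y, width, height) in pixels
    text: str                        # text recovered from the region (e.g., via OCR)

def parse_infographic(
    image_path: str,
    detector: Callable[[str], List[InfographicSegment]],
) -> List[InfographicSegment]:
    """Run a detection model over the infographic and keep segments whose
    labels belong to the grouping taxonomy, ordered top to bottom."""
    segments = [s for s in detector(image_path) if s.label in TAXONOMY]
    return sorted(segments, key=lambda s: s.bbox[1])  # reading order by y-coordinate

def reflow_to_slides(segments: List[InfographicSegment]) -> List[dict]:
    """Reflow parsed segments into a simple slide carousel: one slide per list
    item or group, carrying the most recent title forward as a heading."""
    slides, current_title = [], ""
    for seg in segments:
        if seg.label == "title":
            current_title = seg.text
        elif seg.label in ("list_item", "group"):
            slides.append({"heading": current_title, "body": seg.text})
    return slides

    The same segment list could instead feed the other tasks the abstract mentions, such as indexing segments for search or serving them to a query-and-response model.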

    Expressive text-to-speech utilizing contextual word-level style tokens

    Publication Number: US11322133B2

    Publication Date: 2022-05-03

    Application Number: US16934836

    Filing Date: 2020-07-21

    Applicant: Adobe Inc.

    Abstract: The present disclosure relates to systems, methods, and non-transitory computer-readable media that generate expressive audio for input texts based on a word-level analysis of the input text. For example, the disclosed systems can utilize a multi-channel neural network to generate a character-level feature vector and a word-level feature vector based on a plurality of characters of an input text and a plurality of words of the input text, respectively. In some embodiments, the disclosed systems utilize the neural network to generate the word-level feature vector based on contextual word-level style tokens that correspond to style features associated with the input text. Based on the character-level and word-level feature vectors, the disclosed systems can generate a context-based speech map. The disclosed systems can utilize the context-based speech map to generate expressive audio for the input text.

    EXPRESSIVE TEXT-TO-SPEECH UTILIZING CONTEXTUAL WORD-LEVEL STYLE TOKENS

    Publication Number: US20220028367A1

    Publication Date: 2022-01-27

    Application Number: US16934836

    Filing Date: 2020-07-21

    Applicant: Adobe Inc.

    Abstract: The present disclosure relates to systems, methods, and non-transitory computer-readable media that generate expressive audio for input texts based on a word-level analysis of the input text. For example, the disclosed systems can utilize a multi-channel neural network to generate a character-level feature vector and a word-level feature vector based on a plurality of characters of an input text and a plurality of words of the input text, respectively. In some embodiments, the disclosed systems utilize the neural network to generate the word-level feature vector based on contextual word-level style tokens that correspond to style features associated with the input text. Based on the character-level and word-level feature vectors, the disclosed systems can generate a context-based speech map. The disclosed systems can utilize the context-based speech map to generate expressive audio for the input text.
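
    As a rough illustration of the two-channel idea in the abstract, the sketch below encodes an input text at the character level and at the word level, lets the word-level channel attend over a small bank of learned style tokens, and concatenates the two streams into a per-character feature map. All module names, layer sizes, and the attention scheme are assumptions; this is not the patented architecture, and a complete system would pass the resulting map to a speech decoder and vocoder to produce audio.

# Illustrative two-channel text encoder sketch (PyTorch); shapes noted in comments.
import torch
import torch.nn as nn

class WordLevelStyleEncoder(nn.Module):
    def __init__(self, word_vocab: int, dim: int = 128, num_style_tokens: int = 10):
        super().__init__()
        self.embed = nn.Embedding(word_vocab, dim)
        self.rnn = nn.GRU(dim, dim, batch_first=True)
        # Bank of learned style tokens; each word's hidden state attends over it.
        self.style_tokens = nn.Parameter(torch.randn(num_style_tokens, dim))

    def forward(self, word_ids: torch.Tensor) -> torch.Tensor:
        hidden, _ = self.rnn(self.embed(word_ids))                   # (B, W, dim)
        attn = torch.softmax(hidden @ self.style_tokens.T, dim=-1)   # (B, W, tokens)
        style = attn @ self.style_tokens                             # (B, W, dim)
        return hidden + style                                        # word-level features

class TwoChannelEncoder(nn.Module):
    def __init__(self, char_vocab: int, word_vocab: int, dim: int = 128):
        super().__init__()
        self.char_embed = nn.Embedding(char_vocab, dim)
        self.char_rnn = nn.GRU(dim, dim, batch_first=True)
        self.word_encoder = WordLevelStyleEncoder(word_vocab, dim)

    def forward(self, char_ids, word_ids, char_to_word):
        char_feats, _ = self.char_rnn(self.char_embed(char_ids))     # (B, C, dim)
        word_feats = self.word_encoder(word_ids)                     # (B, W, dim)
        # Copy each character's parent-word features onto that character, then
        # concatenate the two channels into a per-character feature map.
        index = char_to_word.unsqueeze(-1).expand(-1, -1, word_feats.size(-1))
        gathered = torch.gather(word_feats, 1, index)                # (B, C, dim)
        return torch.cat([char_feats, gathered], dim=-1)             # (B, C, 2*dim)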

    Visualizing natural language through 3D scenes in augmented reality

    Publication Number: US10665030B1

    Publication Date: 2020-05-26

    Application Number: US16247235

    Filing Date: 2019-01-14

    Applicant: Adobe Inc.

    Abstract: A natural language scene description is converted into a scene that is rendered in three dimensions by an augmented reality (AR) display device. Text-to-AR scene conversion allows a user to create an AR scene visualization through natural language text inputs that are easily created and well-understood by the user. The user can, for instance, select a pre-defined natural language description of a scene or manually enter a custom natural language description. The user can also select a physical real-world surface on which the AR scene is to be rendered. The AR scene is then rendered by the augmented reality display device according to its natural language description, using 3D models of objects and humanoid characters with associated animations, along with information drawn from extensive language-to-visual datasets. Using the display device, the user can move around the real-world environment and experience the AR scene from different angles.
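
    A minimal sketch of the text-to-scene step described above: recognized nouns pull 3D models from a small catalog, recognized verbs attach animation clips to the most recently placed model, and everything is anchored to a user-selected surface. The catalog, the keyword matching, and the data structures are illustrative assumptions, not the parser or language-to-visual datasets the patent refers to.

# Illustrative keyword-based text-to-AR-scene sketch; all names are hypothetical.
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Hypothetical asset catalog distilled from a language-to-visual dataset.
OBJECT_MODELS = {"person": "humanoid_generic", "dog": "dog_model", "tree": "tree_model"}
ACTION_CLIPS = {"running": "run_cycle", "walking": "walk_cycle", "jumping": "jump"}

@dataclass
class ScenePlacement:
    model_id: str
    animation: Optional[str]
    position: Tuple[float, float]  # offset on the selected surface, in meters

@dataclass
class AnchoredScene:
    surface_id: str                # real-world plane chosen by the user
    placements: List[ScenePlacement] = field(default_factory=list)

def build_ar_scene(description: str, surface_id: str) -> AnchoredScene:
    """Rough text-to-scene conversion: place a model for each recognized noun
    and attach each recognized action to the most recently placed model."""
    scene = AnchoredScene(surface_id=surface_id)
    x_offset = 0.0
    for word in description.lower().split():
        if word in OBJECT_MODELS:
            scene.placements.append(
                ScenePlacement(OBJECT_MODELS[word], None, (x_offset, 0.0)))
            x_offset += 0.5
        elif word in ACTION_CLIPS and scene.placements:
            scene.placements[-1].animation = ACTION_CLIPS[word]
    return scene

# Example: render "a person running past a tree" on a detected tabletop plane.
scene = build_ar_scene("a person running past a tree", surface_id="plane_01")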

    Summarizing video content based on memorability of the video content

    Publication Number: US10311913B1

    Publication Date: 2019-06-04

    Application Number: US15902046

    Filing Date: 2018-02-22

    Applicant: Adobe Inc.

    Abstract: Certain embodiments involve generating summarized versions of video content based on memorability of the video content. For example, a video summarization system accesses segments of an input video. The video summarization system identifies memorability scores for the respective segments. The video summarization system selects a subset of the segments based on each segment in the subset having a memorability score that meets a threshold memorability score. The video summarization system generates visual summary content from the subset of the segments.
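
    The selection step in the abstract reduces to scoring each segment and keeping those whose scores meet a threshold. The short sketch below shows that logic with a pluggable scoring function standing in for a trained memorability model; the names and the default threshold value are assumptions.

# Illustrative memorability-threshold summarization sketch.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class VideoSegment:
    start_s: float  # segment start time, in seconds
    end_s: float    # segment end time, in seconds

def summarize_by_memorability(
    segments: List[VideoSegment],
    score_fn: Callable[[VideoSegment], float],  # stand-in for a memorability model
    threshold: float = 0.7,
) -> List[VideoSegment]:
    """Keep only segments whose memorability score meets the threshold,
    preserving temporal order for the visual summary."""
    kept = [seg for seg in segments if score_fn(seg) >= threshold]
    return sorted(kept, key=lambda seg: seg.start_s)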
