Invention Grant
- Patent Title: System and method for converting image data into a natural language description
- Application No.: US16206439
- Application Date: 2018-11-30
- Publication No.: US10726062B2
- Publication Date: 2020-07-28
- Inventor: Jian Zheng, Ruxin Chen
- Applicant: Sony Interactive Entertainment Inc.
- Applicant Address: JP Tokyo
- Assignee: Sony Interactive Entertainment Inc.
- Current Assignee: Sony Interactive Entertainment Inc.
- Current Assignee Address: JP Tokyo
- Agent: John L. Rogitz
- Main IPC: G06F16/383
- IPC: G06F16/383; G06F16/583; G06K9/32; G06N5/04

Abstract:
For image captioning such as for computer game images or other images, bottom-up attention is combined with top-down attention to provide a multi-level residual attention-based image captioning model. A residual attention mechanism is first applied in the Faster R-CNN network to learn better feature representations for each region by taking spatial information into consideration. In the image captioning network, taking the extracted regional features as input, a second residual attention network is implemented to fuse the regional features attentionally for subsequent caption generation.
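The fusion step in the captioning network can be illustrated with a minimal sketch. The abstract does not give formulas, so everything here is an assumption for illustration only: the additive (tanh) scoring function, the choice of the mean-pooled region feature as the residual term, and the names `residual_attention_fuse`, `w_r`, `w_q`, and `v` are all hypothetical and are not taken from the patent.

```python
import numpy as np


def softmax(x):
    # Subtract the max for numerical stability before exponentiating.
    x = x - np.max(x)
    e = np.exp(x)
    return e / e.sum()


def residual_attention_fuse(regions, query, w_r, w_q, v):
    """Fuse k regional feature vectors into one vector (illustrative only).

    regions: (k, d) features, e.g. one row per detected region
    query:   (h,)   decoder state that drives the top-down attention
    w_r:     (d, a) hypothetical projection of region features
    w_q:     (h, a) hypothetical projection of the query
    v:       (a,)   hypothetical scoring vector
    """
    # Additive attention score for each region (assumed form).
    scores = np.tanh(regions @ w_r + query @ w_q) @ v   # shape (k,)
    weights = softmax(scores)                           # shape (k,)
    attended = weights @ regions                        # shape (d,)
    # Residual connection (assumed form): add the mean-pooled region
    # feature back so unweighted information is still passed along.
    fused = attended + regions.mean(axis=0)
    return fused, weights


rng = np.random.default_rng(0)
regions = rng.normal(size=(5, 8))    # 5 regions, 8-dim features
query = rng.normal(size=(4,))
w_r = rng.normal(size=(8, 6))
w_q = rng.normal(size=(4, 6))
v = rng.normal(size=(6,))
fused, weights = residual_attention_fuse(regions, query, w_r, w_q, v)
```

The fused vector has the same dimensionality as a single region feature, so under these assumptions it could be fed to a caption decoder at each generation step.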
Public/Granted literature
- US20200175053A1 SYSTEM AND METHOD FOR CONVERTING IMAGE DATA INTO A NATURAL LANGUAGE DESCRIPTION, Public/Granted day: 2020-06-04