System and method for converting image data into a natural language description

Invention Grant

US10726062B2 System and method for converting image data into a natural language description (Active)

Patent Title: System and method for converting image data into a natural language description
Application No.: US16206439

Application Date: 2018-11-30
Publication No.: US10726062B2

Publication Date: 2020-07-28
Inventor: Jian Zheng, Ruxin Chen
Applicant: Sony Interactive Entertainment Inc.
Applicant Address: JP Tokyo
Assignee: Sony Interactive Entertainment Inc.
Current Assignee: Sony Interactive Entertainment Inc.
Current Assignee Address: JP Tokyo
Agent: John L. Rogitz
Main IPC: G06F16/383
IPC: G06F16/383; G06F16/583; G06K9/32; G06N5/04

Abstract:

For image captioning, such as for computer game images or other images, bottom-up attention is combined with top-down attention to provide a multi-level residual attention-based image captioning model. A first residual attention mechanism is applied in the Faster R-CNN network to learn better feature representations for each region by taking spatial information into account. In the image captioning network, a second residual attention network takes the extracted regional features as input and fuses them using attention weights for subsequent caption generation.
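The fusion step in the abstract can be illustrated with a minimal sketch. It is not the patented method: the abstract does not give the scoring function or the form of the residual path, so the scaled dot-product scores, the unweighted-mean skip connection, and the names `softmax` and `residual_attention_fuse` are all assumptions made for illustration. The region vectors stand in for features that a detector such as Faster R-CNN would extract, and the query stands in for the caption decoder's current state.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def residual_attention_fuse(regions, query):
    """Fuse k region vectors (each of length d) into one vector of length d.

    Assumption: scaled dot-product scoring and a mean-pooled residual path.
    Returns the fused vector and the attention weights.
    """
    d = len(query)
    # Relevance of each region to the query (hypothetical scoring choice).
    scores = [
        sum(r_j * q_j for r_j, q_j in zip(region, query)) / math.sqrt(d)
        for region in regions
    ]
    weights = softmax(scores)
    # Attention path: weighted sum of the region vectors.
    attended = [
        sum(w * region[j] for w, region in zip(weights, regions))
        for j in range(d)
    ]
    # Residual path: unweighted mean of the regions, so no region is
    # dropped entirely even when its attention weight is close to zero.
    mean = [sum(region[j] for region in regions) / len(regions) for j in range(d)]
    fused = [a + m for a, m in zip(attended, mean)]
    return fused, weights


if __name__ == "__main__":
    regions = [[1.0, 0.0], [0.0, 1.0]]  # two toy region features
    fused, weights = residual_attention_fuse(regions, query=[10.0, 0.0])
    print(fused, weights)
```

In this sketch the residual term keeps a baseline contribution from every region, while the attention term shifts the fused vector toward the regions most relevant to the query.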

Public/Granted literature

US20200175053A1 SYSTEM AND METHOD FOR CONVERTING IMAGE DATA INTO A NATURAL LANGUAGE DESCRIPTION Public/Granted day: 2020-06-04


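IPC Classification:

G	Physics
G06	Computing; calculating or counting
G06F	Electric digital data processing (computer systems based on specific computational models: see G06N)
G06F16/00	Information retrieval; database structures therefor; file system structures therefor
G06F16/30	. of unstructured textual data (document management systems: see G06F16/93)
G06F16/38	.. Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
G06F16/383	... using metadata automatically derived from the content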