METHOD AND SYSTEM FOR AUTO MULTIPLE IMAGE CAPTIONING

    Publication number: US20210117723A1

    Publication date: 2021-04-22

    Application number: US17016654

    Application date: 2020-09-10

    Abstract: A method and a system for automatically generating multiple captions of an image are provided. A method for training an auto image caption generation model according to an embodiment of the present disclosure includes: generating a caption attention map by using an image; converting the generated caption attention map into a latent variable by projecting the caption attention map onto a latent space; deriving a guide map by using the latent variable; and training to generate captions of an image by using the guide map and the image. Accordingly, a plurality of captions describing various characteristics of an image and including various expressions can be automatically generated.

    METHOD AND SYSTEM FOR AUTOMATIC IMAGE CAPTION GENERATION

    Publication number: US20190286931A1

    Publication date: 2019-09-19

    Application number: US16043338

    Application date: 2018-07-24

    Abstract: A method and a system for automatic image caption generation are provided. The automatic image caption generation method according to an embodiment of the present disclosure includes: extracting a distinctive attribute from example captions of a learning image; training a first neural network for predicting a distinctive attribute from an image, by using a pair of the extracted distinctive attribute and the learning image; inferring a distinctive attribute by inputting the learning image to the trained first neural network; and training a second neural network for generating a caption of an image by using a pair of the inferred distinctive attribute and the learning image. Accordingly, a caption that clearly indicates a feature of a given image is automatically generated, such that the image can be described more precisely and clearly distinguished from other images.

    METHOD FOR CREATING MULTIMODAL TRAINING DATASETS FOR PREDICTING USER CHARACTERISTICS USING PSEUDO-LABELING

    Publication number: US20240193969A1

    Publication date: 2024-06-13

    Application number: US18536856

    Application date: 2023-12-12

    CPC classification number: G06V20/70 G06V10/44 G06V10/761

    Abstract: There is provided a method for creating multimodal training datasets for predicting characteristics of a user by using pseudo-labeling. According to an embodiment, the method may acquire a labelled dataset in which an image of a user is labelled with personality information and may extract a multimodal feature vector from the image of the acquired labelled dataset, may acquire an un-labelled dataset in which an image of a user is not labelled with personality information and may extract a multimodal feature vector from the image of the acquired un-labelled dataset, may measure a similarity between the extracted multimodal feature vector of the labelled dataset and the multimodal feature vector of the un-labelled dataset, and may label the un-labelled dataset based on the measured similarity. Accordingly, by creating multimodal training datasets for predicting a user personality by using pseudo-labeling, training datasets may be obtained rapidly, economically and effectively.
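The similarity-based labelling step described in this abstract can be sketched as follows. This is a minimal illustration, not the patented method: the abstract does not name a similarity measure or a decision rule, so cosine similarity, nearest-neighbour label transfer, and the `threshold` parameter are all assumptions, and the feature vectors are taken as already extracted.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


def pseudo_label(labelled, unlabelled, threshold=0.9):
    """Assign each unlabelled vector the label of its most similar labelled vector.

    labelled:   list of (feature_vector, label) pairs
    unlabelled: list of feature vectors
    threshold:  hypothetical cut-off; below it the sample stays unlabelled (None)
    """
    results = []
    for vec in unlabelled:
        best_label, best_sim = None, -1.0
        for ref_vec, ref_label in labelled:
            sim = cosine_similarity(vec, ref_vec)
            if sim > best_sim:
                best_label, best_sim = ref_label, sim
        results.append((vec, best_label if best_sim >= threshold else None))
    return results
```

Leaving low-similarity samples unlabelled is one possible design choice; it trades dataset size for label quality.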

    METHOD FOR PREDICTING USER PERSONALITY USING PRE-OBTAINED PERSONALITY INDICATORS AND TIME-SERIES INFORMATION

    Publication number: US20240193436A1

    Publication date: 2024-06-13

    Application number: US18536589

    Application date: 2023-12-12

    CPC classification number: G06N5/02 G06V40/176 G06V40/20

    Abstract: There is provided a user personality prediction method using pre-obtained personality indicators and time-series information. According to an embodiment, a personality prediction method may acquire personality indicators representing personalities of a user, may acquire external features of the user as time-series data, may train a personality prediction model with correlations between the acquired external features and the personality indicators, and may predict personality indicators of the user from the external features of the user by using the trained personality prediction model. Accordingly, a personality of a user is predicted in real time based on external features extracted in real time, and hence, personality prediction may be performed flexibly in response to a subtle change in AU intensities acquired as time-series data.

    CUSTOMIZED PERSONALITY AGENT SYSTEM EVOLVING ACCORDING TO USER SATISFACTION

    Publication number: US20240193376A1

    Publication date: 2024-06-13

    Application number: US18536853

    Application date: 2023-12-12

    CPC classification number: G06F40/40 H04L67/1396

    Abstract: There is provided a customized personality agent system evolving according to a satisfaction of a user. An interactive service providing method according to an embodiment provides an interactive AI service to a user by using an agent that is selected from a plurality of agents based on a state of personality of the user, and evaluates a satisfaction of the user and trains the agent that provides the interactive service. Accordingly, by searching for an agent that has an optimal personality suited to a state of personality of a user and providing an interactive AI service, service quality may be enhanced. Also, by rewarding and training an agent that provides a service based on a satisfaction of a user who receives the service, the personality of the agent may evolve to be well suited to a personality of the user.
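The select-then-reward loop in this abstract might look roughly like the sketch below. Everything concrete here is an assumption made for illustration: personalities are represented as numeric vectors, the agent is chosen by smallest Euclidean distance to the user's state, and "training" is reduced to nudging the chosen agent's vector toward the user in proportion to a satisfaction score in [0, 1]. The abstract does not specify any of these details.

```python
import math


def select_agent(agents, user_state):
    """Return the name of the agent whose personality vector is closest to the user.

    agents:     dict mapping agent name -> personality vector (hypothetical encoding)
    user_state: the user's current personality-state vector
    """
    return min(agents, key=lambda name: math.dist(agents[name], user_state))


def reward_agent(agents, name, user_state, satisfaction,
                 baseline=0.5, learning_rate=0.5):
    """Move the chosen agent toward (or away from) the user based on satisfaction.

    A satisfaction above `baseline` pulls the agent's personality toward the
    user's state; a satisfaction below it pushes the agent away. Both
    `baseline` and `learning_rate` are illustrative parameters.
    """
    reward = satisfaction - baseline
    agents[name] = [
        p + learning_rate * reward * (u - p)
        for p, u in zip(agents[name], user_state)
    ]
    return agents[name]
```

A typical round would call `select_agent`, run the conversation, collect a satisfaction score, and then call `reward_agent` on the agent that served the user.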

    METHOD FOR AUDIO SYNTHESIS ADAPTED TO VIDEO CHARACTERISTICS

    Publication number: US20200043465A1

    Publication date: 2020-02-06

    Application number: US16256835

    Application date: 2019-01-24

    Abstract: An audio synthesis method adapted to video characteristics is provided. The audio synthesis method according to an embodiment includes: extracting characteristics x from a video in a time-series way; extracting characteristics p of phonemes from a text; and generating an audio spectrum characteristic S_t used to generate an audio to be synthesized with a video at a time t, based on correlations between an audio spectrum characteristic S_{t-1}, which is used to generate an audio to be synthesized with a video at a time t−1, and the characteristics x. Accordingly, an audio can be synthesized according to video characteristics, and speech according to a video can be easily added.
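The recurrence in this abstract, where each spectrum frame depends on the previous frame and the video characteristics, can be sketched as a simple loop. This is a toy illustration under loud assumptions: the abstract does not define how the correlation is computed or how the phoneme characteristics enter, so the dot-product gate in `toy_step` is invented for the example, and all vectors are assumed to share one dimension. A real system would use a trained network in place of `toy_step`.

```python
import math


def toy_step(prev_spectrum, video_feat, phoneme_feats):
    """Hypothetical update: gate between video and phoneme features.

    The gate is a sigmoid of the dot product (a crude 'correlation')
    between the previous spectrum frame and the current video feature.
    """
    corr = sum(s * x for s, x in zip(prev_spectrum, video_feat))
    gate = 1.0 / (1.0 + math.exp(-corr))
    # Element-wise mean of the phoneme feature vectors.
    count = len(phoneme_feats)
    phoneme_mean = [sum(col) / count for col in zip(*phoneme_feats)]
    return [gate * x + (1.0 - gate) * p
            for x, p in zip(video_feat, phoneme_mean)]


def synthesize_spectrum(video_feats, phoneme_feats, initial, step=toy_step):
    """Generate one spectrum frame per video time step.

    Each frame is computed from the previous frame, the video feature at
    that time step, and the phoneme features.
    """
    frames = []
    prev = initial
    for x_t in video_feats:
        prev = step(prev, x_t, phoneme_feats)
        frames.append(prev)
    return frames
```

Passing the update rule as the `step` argument keeps the recurrence itself separate from whatever model computes each frame.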
