-
Publication No.: US20210117723A1
Publication Date: 2021-04-22
Application No.: US17016654
Filing Date: 2020-09-10
Applicant: Korea Electronics Technology Institute
Inventor: Bo Eun KIM , Hye Dong JUNG
Abstract: A method and a system for automatically generating multiple captions of an image are provided. A method for training an automatic image caption generation model according to an embodiment of the present disclosure includes: generating a caption attention map by using an image; converting the generated caption attention map into a latent variable by projecting it onto a latent space; deriving a guide map by using the latent variable; and training the model to generate captions of the image by using the guide map and the image. Accordingly, a plurality of captions describing various characteristics of an image and including various expressions can be generated automatically.
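The pipeline the abstract describes (attention map → latent variable → guide map → caption) can be illustrated with a toy NumPy sketch. All shapes, weight matrices, and the decoding step here are hypothetical stand-ins, not the patented model:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Hypothetical sizes: feature dim, image regions, latent dim, vocabulary size.
D_FEAT, N_REG, D_LAT, V = 8, 4, 3, 10

W_attn = rng.normal(size=(D_FEAT, 1))      # scores each image region
W_proj = rng.normal(size=(N_REG, D_LAT))   # projects the attention map onto a latent space
W_guide = rng.normal(size=(D_LAT, N_REG))  # derives a guide map from the latent variable
W_out = rng.normal(size=(D_FEAT, V))       # word scores from the guided context

def generate_caption_ids(img_feats, n_words=3):
    """Toy pipeline: attention map -> latent variable -> guide map -> word ids."""
    attn = softmax(img_feats @ W_attn, axis=0).ravel()   # caption attention map
    z = attn @ W_proj                                    # latent variable
    guide = softmax(z @ W_guide)                         # guide map over regions
    ctx = guide @ img_feats                              # guided image context
    scores = ctx @ W_out
    return [int(i) for i in np.argsort(scores)[::-1][:n_words]]

img = rng.normal(size=(N_REG, D_FEAT))
ids = generate_caption_ids(img)
```

Sampling different latent variables z would yield different guide maps, which is the mechanism by which the disclosed method diversifies its captions.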
-
Publication No.: US20200043473A1
Publication Date: 2020-02-06
Application No.: US16256563
Filing Date: 2019-01-24
Applicant: Korea Electronics Technology Institute
Inventor: Young Han LEE , Jong Yeol YANG , Choong Sang CHO , Hye Dong JUNG
Abstract: An audio segmentation method based on an attention mechanism is provided. The audio segmentation method according to an embodiment obtains a mapping relationship between an “inputted text” and an “audio spectrum feature vector for generating an audio signal”, the audio spectrum feature vector being automatically synthesized from the inputted text, and segments an inputted audio signal by using the mapping relationship. Accordingly, high quality can be guaranteed, and effort, time, and cost can be noticeably reduced, through audio segmentation utilizing the attention mechanism.
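The core idea — reading segment boundaries off a text-to-spectrogram attention alignment — can be sketched minimally. The alignment matrix and frame duration below are invented for illustration; a real TTS attention module would supply them:

```python
import numpy as np

def segment_audio(attention, frame_sec=0.0125):
    """attention: (n_frames, n_tokens) alignment weights from a TTS attention module.
    Returns one (start_sec, end_sec, token_idx) per contiguous run of frames
    whose attention peaks on the same text token."""
    tokens = attention.argmax(axis=1)
    segments, start = [], 0
    for i in range(1, len(tokens) + 1):
        if i == len(tokens) or tokens[i] != tokens[start]:
            segments.append((start * frame_sec, i * frame_sec, int(tokens[start])))
            start = i
    return segments

# Toy alignment: 6 audio frames attending to 3 text tokens in order.
attn = np.array([
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.0],
    [0.1, 0.8, 0.1],
    [0.0, 0.7, 0.3],
    [0.0, 0.2, 0.8],
    [0.0, 0.1, 0.9],
])
segs = segment_audio(attn)
```

Because the boundaries come for free from attention weights already computed during synthesis, no separate manual segmentation pass is needed — the source of the claimed effort and cost savings.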
-
Publication No.: US20240193920A1
Publication Date: 2024-06-13
Application No.: US18536536
Filing Date: 2023-12-12
Applicant: Korea Electronics Technology Institute
Inventor: Jae Woong YOO , Mi Ra LEE , Hye Dong JUNG
CPC classification number: G06V10/7715 , G06V10/44 , G06V10/764 , G06V10/806 , G06V40/20 , G10L15/02 , G10L15/08
Abstract: There is provided a method for predicting a user personality by mapping multimodal information onto a personality expression space. A personality prediction method according to an embodiment extracts a multimodal feature from an input image in which a user appears, maps the extracted multimodal feature onto a personality expression space, and predicts a personality of the user based on a result of the mapping. Accordingly, a personality of a user may be predicted more accurately through establishment of a correlation between a user's various behavioral characteristics and personality traits.
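One simple reading of "mapping onto a personality expression space" is projecting a fused multimodal feature into a low-dimensional space with known trait anchors and picking the nearest one. The trait anchors, 2-D space, and projection matrix below are all hypothetical:

```python
import numpy as np

# Hypothetical anchors: each Big Five trait as a point in a 2-D expression space.
TRAITS = ["openness", "conscientiousness", "extraversion", "agreeableness", "neuroticism"]
ANCHORS = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0], [0.7, 0.7]])

def predict_personality(visual_feat, audio_feat, W_map):
    """Fuse modalities, map onto the expression space, return the nearest trait."""
    fused = np.concatenate([visual_feat, audio_feat])   # simple late fusion
    point = fused @ W_map                               # map onto the expression space
    dists = np.linalg.norm(ANCHORS - point, axis=1)
    return TRAITS[int(dists.argmin())]

visual = np.array([1.0, 0.0])
audio = np.array([0.0, 0.0])
W_map = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [0.0, 0.0]])  # toy learned projection
trait = predict_personality(visual, audio, W_map)
```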
-
Publication No.: US20210043110A1
Publication Date: 2021-02-11
Application No.: US16536151
Filing Date: 2019-08-08
Applicant: KOREA ELECTRONICS TECHNOLOGY INSTITUTE
Inventor: Hye Dong JUNG , Sang Ki KO , Han Mu PARK , Chang Jo KIM
Abstract: Disclosed is a method of providing a sign language video reflecting an appearance of a conversation partner. The method includes recognizing a speech language sentence from speech information, and recognizing an appearance image and a background image from video information. The method further includes acquiring multiple pieces of word-joint information corresponding to the speech language sentence from a joint-information database, sequentially inputting the word-joint information to a deep learning neural network to generate sentence-joint information, generating a motion model on the basis of the sentence-joint information, and generating a sign language video in which the background image and the appearance image are synthesized with the motion model. The method provides a natural communication environment between a sign language user and a speech language user.
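The step of turning per-word joint information into continuous sentence-joint information can be pictured as stitching pose sequences with smooth transitions. The cross-fade below is only an illustrative stand-in for the deep learning neural network the abstract describes:

```python
import numpy as np

def words_to_sentence_joints(word_joints, blend_frames=2):
    """word_joints: list of (frames, joints*2) pose arrays, one per sign word.
    Concatenate them, linearly cross-fading the seam between consecutive
    words so the resulting sentence motion is continuous."""
    out = [word_joints[0]]
    for nxt in word_joints[1:]:
        prev = out[-1]
        w = np.linspace(0.0, 1.0, blend_frames)[:, None]      # 0 -> 1 blend weights
        seam = (1 - w) * prev[-blend_frames:] + w * nxt[:blend_frames]
        out[-1] = prev[:-blend_frames]                        # drop the faded-out tail
        out += [seam, nxt[blend_frames:]]
    return np.vstack(out)

word_a = np.zeros((4, 6))   # toy pose sequence for sign word A
word_b = np.ones((4, 6))    # toy pose sequence for sign word B
sentence = words_to_sentence_joints([word_a, word_b])
```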
-
Publication No.: US20200005086A1
Publication Date: 2020-01-02
Application No.: US16147962
Filing Date: 2018-10-01
Applicant: Korea Electronics Technology Institute
Inventor: Sang Ki KO , Choong Sang CHO , Hye Dong JUNG , Young Han LEE
Abstract: A deep learning-based automatic gesture recognition method and system are provided. The training method according to an embodiment includes: extracting a plurality of contours from an input image; generating training data by normalizing the pieces of contour information forming each of the contours; and training an AI model for gesture recognition by using the generated training data. Accordingly, robust and high-performance automatic gesture recognition can be performed without being influenced by environment or conditions, even while using less training data.
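The contour-normalization step is the key to the claimed environment invariance: removing translation and scale leaves only shape. A minimal sketch, with the resampling scheme and point count chosen arbitrarily for illustration:

```python
import numpy as np

def normalize_contour(points, n_points=8):
    """Resample a contour to a fixed number of points and normalize for
    translation and scale, so the training data does not depend on where
    or how large the hand appears in the frame."""
    points = np.asarray(points, dtype=float)
    centered = points - points.mean(axis=0)          # translation invariance
    scale = np.abs(centered).max() or 1.0
    scaled = centered / scale                        # scale invariance
    idx = np.linspace(0, len(scaled) - 1, n_points).round().astype(int)
    return scaled[idx]                               # fixed-length training vector

# A 2x2 square contour anywhere in the image normalizes to the unit square.
square = [(0, 0), (2, 0), (2, 2), (0, 2)]
vec = normalize_contour(square)
```

Because every contour becomes a fixed-length, position- and scale-free vector, far fewer training examples are needed to cover the input variation.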
-
Publication No.: US20190286931A1
Publication Date: 2019-09-19
Application No.: US16043338
Filing Date: 2018-07-24
Applicant: Korea Electronics Technology Institute
Inventor: Bo Eun KIM , Choong Sang CHO , Hye Dong JUNG , Young Han LEE
Abstract: A method and a system for automatic image caption generation are provided. The automatic image caption generation method according to an embodiment of the present disclosure includes: extracting a distinctive attribute from example captions of a learning image; training a first neural network for predicting a distinctive attribute from an image, by using a pair of the extracted distinctive attribute and the learning image; inferring a distinctive attribute by inputting the learning image to the trained first neural network; and training a second neural network for generating a caption of an image by using a pair of the inferred distinctive attribute and the learning image. Accordingly, a caption that clearly indicates a feature of a given image is automatically generated, such that the image can be explained more exactly and clearly distinguished from other images.
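The first step — extracting a "distinctive attribute" from example captions — can be approximated by scoring words that are frequent for this image but rare in the corpus, a TF-IDF-style heuristic. This scoring rule is an assumption for illustration, not the patent's definition:

```python
from collections import Counter

def distinctive_attributes(captions_for_image, captions_corpus, top_k=2):
    """Score words frequent in this image's captions but rare corpus-wide,
    as a simple proxy for distinctive-attribute extraction."""
    local = Counter(w for c in captions_for_image for w in c.split())
    overall = Counter(w for c in captions_corpus for w in c.split())
    scores = {w: n / overall[w] for w, n in local.items()}
    return [w for w, _ in sorted(scores.items(), key=lambda kv: -kv[1])[:top_k]]

corpus = ["a red bus", "red bus on street", "a dog on street", "a cat on street"]
attrs = distinctive_attributes(["a red bus", "red bus on street"], corpus)
```

The extracted attributes would then serve as targets for the first neural network, and as conditioning input when training the second, caption-generating network.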
-
Publication No.: US20240193969A1
Publication Date: 2024-06-13
Application No.: US18536856
Filing Date: 2023-12-12
Applicant: Korea Electronics Technology Institute
Inventor: Jae Woong YOO , Mi Ra LEE , Hye Dong JUNG
CPC classification number: G06V20/70 , G06V10/44 , G06V10/761
Abstract: There is provided a method for creating multimodal training datasets for predicting characteristics of a user by using pseudo-labeling. According to an embodiment, the method may: acquire a labelled dataset in which an image of a user is labelled with personality information, and extract a multimodal feature vector from the image of the labelled dataset; acquire an un-labelled dataset in which an image of a user is not labelled with personality information, and extract a multimodal feature vector from the image of the un-labelled dataset; measure a similarity between the extracted multimodal feature vectors of the labelled and un-labelled datasets; and label the un-labelled dataset based on the measured similarity. Accordingly, by creating multimodal training datasets for predicting a user personality by using pseudo-labeling, training datasets may be obtained rapidly, economically, and effectively.
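Similarity-based pseudo-labeling as described can be sketched with cosine similarity and a confidence threshold; the threshold value and nearest-neighbor rule are illustrative assumptions:

```python
import numpy as np

def pseudo_label(labeled_feats, labels, unlabeled_feats, threshold=0.8):
    """Cosine similarity between each un-labelled feature and the labelled
    ones; copy the most similar label when the similarity clears the
    threshold, otherwise leave the sample un-labelled (None)."""
    def unit(x):
        return x / np.linalg.norm(x, axis=1, keepdims=True)
    sims = unit(unlabeled_feats) @ unit(labeled_feats).T
    out = []
    for row in sims:
        j = int(row.argmax())
        out.append(labels[j] if row[j] >= threshold else None)
    return out

labeled = np.array([[1.0, 0.0], [0.0, 1.0]])
labels = ["extravert", "introvert"]
unlabeled = np.array([[0.9, 0.1], [1.0, -1.0]])
pseudo = pseudo_label(labeled, labels, unlabeled)
```

Samples below the threshold stay un-labelled rather than receiving a noisy label, which is what keeps the cheaply expanded dataset usable for training.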
-
Publication No.: US20240193436A1
Publication Date: 2024-06-13
Application No.: US18536589
Filing Date: 2023-12-12
Applicant: Korea Electronics Technology Institute
Inventor: Jae Woong YOO , Hye Dong JUNG , Mi Ra LEE
CPC classification number: G06N5/02 , G06V40/176 , G06V40/20
Abstract: There is provided a user personality prediction method using pre-obtained personality indicators and time-series information. According to an embodiment, a personality prediction method may acquire personality indicators representing personalities of a user, may acquire external features of the user as time-series data, may train a personality prediction model with correlations between the acquired external features and the personality indicators, and may predict personality indicators of the user from the external features of the user by using the trained personality prediction model. Accordingly, a personality of a user is predicted in real time based on external features extracted in real time, and hence personality prediction may be performed flexibly in response to a subtle change in AU intensities acquired as time-series data.
-
Publication No.: US20240193376A1
Publication Date: 2024-06-13
Application No.: US18536853
Filing Date: 2023-12-12
Applicant: Korea Electronics Technology Institute
Inventor: Jae Woong YOO , Hye Dong JUNG , Mi Ra LEE
IPC: G06F40/40 , H04L67/1396
CPC classification number: G06F40/40 , H04L67/1396
Abstract: There is provided a customized personality agent system that evolves according to a satisfaction of a user. An interactive service providing method according to an embodiment provides an interactive AI service to a user by using an agent that is selected from a plurality of agents based on a state of personality of the user, and evaluates a satisfaction of the user and trains the agent that provides the interactive service. Accordingly, by searching for an agent that has an optimal personality suited to the state of personality of a user and providing an interactive AI service, service quality may be enhanced. Also, by rewarding and training an agent that provides a service based on a satisfaction of the user who receives the service, the personality of the agent may evolve to be well suited to the personality of the user.
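The select-then-adapt loop can be sketched as nearest-neighbor agent selection plus a satisfaction-weighted update. The vector representation of personality and the update rule are illustrative assumptions:

```python
import numpy as np

class PersonalityAgentPool:
    """Pick the agent whose personality vector best matches the user's,
    then nudge that agent toward the user in proportion to satisfaction."""

    def __init__(self, agent_personalities, lr=0.1):
        self.agents = np.array(agent_personalities, dtype=float)
        self.lr = lr

    def select(self, user_personality):
        """Return the index of the closest-personality agent."""
        d = np.linalg.norm(self.agents - user_personality, axis=1)
        return int(d.argmin())

    def feedback(self, idx, user_personality, satisfaction):
        """satisfaction in [0, 1] scales how far the agent evolves toward the user."""
        self.agents[idx] += self.lr * satisfaction * (user_personality - self.agents[idx])

pool = PersonalityAgentPool([[1.0, 0.0], [0.0, 1.0]])
user = np.array([0.9, 0.2])
chosen = pool.select(user)
pool.feedback(chosen, user, satisfaction=1.0)
```

Over repeated interactions the selected agent's vector drifts toward the user's, which is one way to read the claim that the agent's personality "evolves" with satisfaction.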
-
Publication No.: US20200043465A1
Publication Date: 2020-02-06
Application No.: US16256835
Filing Date: 2019-01-24
Applicant: Korea Electronics Technology Institute
Inventor: Jong Yeol YANG , Young Han LEE , Choong Sang CHO , Hye Dong JUNG
IPC: G10L13/10 , G06K9/00 , H04N21/233
Abstract: An audio synthesis method adapted to video characteristics is provided. The audio synthesis method according to an embodiment includes: extracting characteristics x from a video in a time-series manner; extracting characteristics p of phonemes from a text; and generating an audio spectrum characteristic St used to generate an audio to be synthesized with the video at a time t, based on correlations between an audio spectrum characteristic St-1, which is used to generate an audio to be synthesized with the video at a time t−1, and the characteristics x. Accordingly, an audio can be synthesized according to video characteristics, and speech matching a video can be easily added.
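The recurrence the abstract describes — forming St from correlations between St-1 and the video characteristics x — resembles one attention step. The dot-product scoring and the 50/50 mixing below are assumptions for illustration only:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def next_spectrum(s_prev, video_feats):
    """Attend from the previous audio-spectrum characteristic S_{t-1} over
    the time-series video characteristics x, and mix the attended context
    into S_t — a bare-bones reading of the correlation step."""
    scores = video_feats @ s_prev        # correlation of each frame with S_{t-1}
    attn = softmax(scores)               # attention weights over video frames
    context = attn @ video_feats         # video context for time t
    return 0.5 * s_prev + 0.5 * context  # toy update producing S_t

s_prev = np.array([1.0, 0.0])            # toy S_{t-1}
video = np.eye(2)                        # two toy video-frame feature vectors
s_t = next_spectrum(s_prev, video)
```

Iterating this step over t yields a spectrum sequence that tracks the video, from which the audio waveform would be synthesized.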