METHOD AND SYSTEM FOR DOMAIN ADAPTATION OF SOCIAL MEDIA TEXT USING LEXICAL DATA TRANSFORMATIONS
Abstract:
A method and a system for performing domain adaptations of social media text by using lexical data transformations are provided. The method includes: receiving a first data set that is usable for training a machine learning (ML) model that is designed to perform natural language processing tasks; training the ML model by using the first data set; receiving a second data set that relates to a social media platform; transforming a subset of the first data set into a third data set that is suitable for the social media platform; and retraining the ML model by using a combination of the first data set, the second data set, and the third data set. The transformations may include injecting emojis, emoticons, user mention indicators, hashtags, retransmission indicators, URLs, and/or inverse lexical normalizations that are often used in social media posts.
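The transformation step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the lookup tables, the function names (`denormalize`, `add_hashtag`, `to_social_media_style`, `build_third_data_set`), the placeholder handle and URL, and the choice of sentence-level `(text, label)` pairs are all hypothetical and do not come from the abstract.

```python
import random

# Hypothetical lookup tables for illustration; the abstract does not specify them.
INVERSE_NORMALIZATION = {"you": "u", "are": "r", "please": "plz", "great": "gr8"}
EMOJIS = ["🙂", "🔥", "😂"]
EMOTICONS = [":)", ":-(", ";)"]


def denormalize(tokens):
    """Inverse lexical normalization: swap standard words for informal variants."""
    return [INVERSE_NORMALIZATION.get(tok.lower(), tok) for tok in tokens]


def add_hashtag(tokens, rng):
    """Turn one randomly chosen alphanumeric token into a hashtag."""
    candidates = [i for i, tok in enumerate(tokens) if tok.isalnum()]
    if not candidates:
        return list(tokens)
    i = rng.choice(candidates)
    return tokens[:i] + ["#" + tokens[i]] + tokens[i + 1:]


def to_social_media_style(text, rng, handle="@example_user",
                          url="https://example.com/post"):
    """Apply the injections named in the abstract to one clean sentence."""
    tokens = add_hashtag(denormalize(text.split()), rng)
    prefix = ["RT", handle + ":"]  # retransmission indicator and user mention
    suffix = [rng.choice(EMOJIS), rng.choice(EMOTICONS), url]
    return " ".join(prefix + tokens + suffix)


def build_third_data_set(first_data_set, fraction, seed=0):
    """Transform a random subset of (text, label) pairs, keeping labels unchanged."""
    rng = random.Random(seed)
    k = int(len(first_data_set) * fraction)
    subset = rng.sample(first_data_set, k)
    return [(to_social_media_style(text, rng), label) for text, label in subset]
```

The resulting third data set would then be combined with the first and second data sets for retraining. For token-level tasks such as named entity recognition, the injected tokens would also need labels, which this sentence-level sketch does not handle.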