- Patent title: METHOD AND SYSTEM FOR DOMAIN ADAPTATION OF SOCIAL MEDIA TEXT USING LEXICAL DATA TRANSFORMATIONS
- Application number: US18103858
- Filing date: 2023-01-31
- Publication number: US20240256760A1
- Publication date: 2024-08-01
- Inventors: Akshat GUPTA, Xiaomo LIU, Sameena SHAH
- Applicant: JPMorgan Chase Bank, N.A.
- Applicant address: US NY New York
- Assignee: JPMorgan Chase Bank, N.A.
- Current assignee: JPMorgan Chase Bank, N.A.
- Current assignee address: US NY New York
- Main classification: G06F40/16
- IPC classification: G06F40/16; G06F40/117; G06F40/134; G06F40/169; G06F40/237; G06F40/253; G06N20/00
Abstract:
A method and a system for performing domain adaptations of social media text by using lexical data transformations are provided. The method includes: receiving a first data set that is usable for training a machine learning (ML) model that is designed to perform natural language processing tasks; training the ML model by using the first data set; receiving a second data set that relates to a social media platform; transforming a subset of the first data set into a third data set that is suitable for the social media platform; and retraining the ML model by using a combination of the first data set, the second data set, and the third data set. The transformations may include injecting emojis, emoticons, user mention indicators, hashtags, retransmission indicators, URLs, and/or inverse lexical normalizations that are often used in social media posts.
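The transformation step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the lookup tables, the `@user` placeholder, the `example.com` URL, and the function names `inverse_normalize`, `to_social_media_style`, and `build_third_dataset` are all assumptions made for illustration. The sketch samples a subset of a labeled first data set and rewrites each text with a retransmission indicator, a user mention, a hashtag, an emoji or emoticon, a URL, and inverse lexical normalizations, keeping the original labels.

```python
import random

# Hypothetical lookup tables; the abstract does not list concrete values.
EMOJIS = ["🙂", "🔥", "👍"]
EMOTICONS = [":)", ":-(", ";)"]
INVERSE_NORMALIZATION = {"you": "u", "are": "r", "great": "gr8", "please": "pls"}


def inverse_normalize(tokens):
    """Replace standard words with informal variants (inverse lexical normalization)."""
    return [INVERSE_NORMALIZATION.get(t.lower(), t) for t in tokens]


def to_social_media_style(text, rng):
    """Apply one pass of illustrative lexical transformations to a sentence."""
    tokens = inverse_normalize(text.split())
    # Turn one randomly chosen alphabetic token into a hashtag.
    candidates = [i for i, t in enumerate(tokens) if t.isalpha()]
    if candidates:
        i = rng.choice(candidates)
        tokens[i] = "#" + tokens[i]
    # Prefix a retransmission indicator and a placeholder user mention.
    tokens = ["RT", "@user:"] + tokens
    # Append an emoji or emoticon, then a placeholder URL.
    tokens.append(rng.choice(EMOJIS + EMOTICONS))
    tokens.append("https://example.com/abc")
    return " ".join(tokens)


def build_third_dataset(first_dataset, fraction, seed=0):
    """Transform a random subset of (text, label) pairs; labels are unchanged."""
    rng = random.Random(seed)
    k = max(1, int(len(first_dataset) * fraction))
    subset = rng.sample(first_dataset, k)
    return [(to_social_media_style(text, rng), label) for text, label in subset]
```

Under these assumptions, retraining would then use the concatenation of the first data set, the social media (second) data set, and the output of `build_third_dataset`. Applying every transformation to every sampled sentence is a simplification; the abstract only says the transformations "may include" these types.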