- Patent title: Categorical data transformation and clustering for machine learning using natural language processing
- Application number: US15824382
- Filing date: 2017-11-28
- Publication number: US11531927B2
- Publication date: 2022-12-20
- Inventors: Kourosh Modarresi, Abdurrahman Ibn Munir
- Applicant: Adobe Inc.
- Applicant address: US CA San Jose
- Patent owner: Adobe Inc.
- Current patent owner: Adobe Inc.
- Current patent owner address: US CA San Jose
- Agency: FIG. 1 Patents
- Main classification: G06F16/00
- IPC classifications: G06F16/00; G06N20/00; G06F16/242; G06F16/28; G06F16/35
Abstract:
Categorical data transformation and clustering techniques and systems are described for machine learning using natural language processing. These techniques and systems are configured to improve operation of a computing device to support efficient and accurate use of categorical data, which is not possible using conventional techniques. In an example, categorical data is received by a computing device that includes a categorical variable having a non-numerical data type for a number of classes. The categorical data is then converted into numerical data using natural language processing. Data is then generated by the computing device that includes a plurality of latent classes. This is performed by clustering the numerical data into a number of clusters that is smaller than the number of classes in the categorical data.
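The pipeline in the abstract (categorical classes, then numerical vectors, then a smaller set of latent classes) can be illustrated with a minimal sketch. The abstract does not say which natural language processing technique or clustering algorithm is used, so everything below is an assumption chosen for illustration: character trigram counts stand in for the NLP conversion step, a plain k-means with farthest-point initialization stands in for the clustering step, and the city labels and the function names `char_ngrams`, `vectorize`, and `kmeans` are hypothetical.

```python
from collections import Counter
import math


def char_ngrams(text, n=3):
    """Count character n-grams of a lowercased, space-padded string."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def vectorize(classes, n=3):
    """Map each class label to an L2-normalized n-gram count vector.

    Assumption: this stands in for the unspecified NLP conversion step.
    """
    grams = [char_ngrams(c, n) for c in classes]
    vocab = sorted(set().union(*grams))
    vectors = []
    for g in grams:
        v = [float(g[t]) for t in vocab]
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        vectors.append([x / norm for x in v])
    return vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iters=20):
    """Plain k-means with deterministic farthest-point initialization.

    Assumption: this stands in for the unspecified clustering step.
    """
    centers = [vectors[0]]
    while len(centers) < k:
        # Next center: the point farthest from its nearest existing center.
        centers.append(
            max(vectors, key=lambda v: min(sq_dist(v, c) for c in centers))
        )
    labels = []
    for _ in range(iters):
        labels = [
            min(range(k), key=lambda j: sq_dist(v, centers[j])) for v in vectors
        ]
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical categorical variable with six classes (non-numerical values).
classes = [
    "San Jose", "san jose, CA",
    "San Francisco", "san francisco CA",
    "Seattle", "seattle WA",
]
vectors = vectorize(classes)
labels = kmeans(vectors, k=3)  # three latent classes, fewer than six classes

for name, lab in zip(classes, labels):
    print(f"{name!r} -> latent class {lab}")
```

In this toy input, spelling variants of the same city share most of their trigrams, so they should land in the same latent class. A production system would more plausibly use learned word or sentence embeddings and a library clustering implementation; the trigram approach is used here only to keep the sketch dependency-free.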