METHOD AND APPARATUS FOR LABELING DATA

Patent Application

US20220277351A1 METHOD AND APPARATUS FOR LABELING DATA (in force)

Title: METHOD AND APPARATUS FOR LABELING DATA
Application No.: US17749347

Filing date: 2022-05-20
Publication No.: US20220277351A1

Publication date: 2022-09-01
Inventors: Sanjeev Misra, Appavu Siva Prakasam, Ann Eileen Skudlark, Siva Kolachina, Nisha Shahul Hameed, Prashanth Boddhireddy, Lien Tran, Jenq-Chyuan Wang
Applicant: AT&T Intellectual Property I, L.P.
Applicant address: US GA Atlanta
Patentee: AT&T Intellectual Property I, L.P.
Current patentee: AT&T Intellectual Property I, L.P.
Current patentee address: US GA Atlanta
Main classification: G06Q30/02
IPC classification: G06Q30/02; G06F16/35; G06F16/93; G06F40/295; G06N5/04; G06Q10/10; G06N20/00

Abstract:

Aspects of the subject disclosure may include, for example, determining classes from a corpus based on topic modeling, data clustering and unsupervised learning. Labels are determined for each of the classes and trained models are generated for each of the classes by assignment of a plurality of textual documents to labels based on a highest number of matches. A raw textual document can be tokenized and stop words removed. A corresponding one of the trained models can be selected according to a class that is applicable to subject matter of the raw textual document. The processed document can be assigned to a target label based on a highest number of matches of words. Other embodiments are disclosed.
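
The labeling step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the stop-word list, the `MODELS` dictionary (each "trained model" reduced to a set of keywords per label), and the function names `preprocess`, `assign_label`, and `label_document` are all hypothetical, and the document's class is passed in directly because the abstract does not say how the applicable class is determined. The unsupervised class discovery (topic modeling and clustering) is not shown.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the abstract does not specify one.
STOP_WORDS = {"a", "an", "the", "is", "my", "to", "and", "of", "on", "for", "i", "was"}

# Hypothetical stand-in for the per-class trained models:
# each class maps its labels to a set of keywords.
MODELS = {
    "billing": {
        "overcharge": {"charged", "twice", "overcharge", "refund"},
        "payment_failure": {"payment", "declined", "card", "failed"},
    },
    "network": {
        "outage": {"outage", "down", "signal"},
        "slow_speed": {"slow", "speed", "buffering"},
    },
}


def preprocess(raw_text):
    """Tokenize a raw document into lowercase words and drop stop words."""
    tokens = re.findall(r"[a-z0-9']+", raw_text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def assign_label(tokens, model):
    """Return the label whose keywords match the most tokens, plus all scores."""
    counts = Counter(tokens)
    scores = {
        label: sum(counts[word] for word in keywords)
        for label, keywords in model.items()
    }
    # On a tie, max() keeps the first label in the model's insertion order.
    best = max(scores, key=scores.get)
    return best, scores


def label_document(raw_text, doc_class):
    """Select the model for the given class and assign the target label."""
    tokens = preprocess(raw_text)
    model = MODELS[doc_class]
    label, _ = assign_label(tokens, model)
    return label


if __name__ == "__main__":
    print(label_document("I was charged twice and want a refund", "billing"))
```

Keeping one keyword model per class means a document is scored only against labels that are relevant to its subject matter, which is the role the abstract gives to selecting "a corresponding one of the trained models".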


IPC classification:

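G	Physics
G06	Computing; calculating or counting
G06Q	Data processing systems or methods specially adapted for administrative, commercial, financial, managerial, supervisory or forecasting purposes; systems or methods specially adapted for administrative, commercial, financial, managerial, supervisory or forecasting purposes, not otherwise provided for
G06Q30/00	Commerce, e.g. shopping or e-commerce
G06Q30/02	. Marketing, e.g. market research and analysis, surveying, promotions, advertising, buyer profiling, customer management or rewards; price estimation or determination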