-
Publication No.: US08826226B2
Publication date: 2014-09-02
Application No.: US13127417
Filing date: 2008-11-05
Applicant: Jun Wu, Henry Ou, Xiliu Tang, Yong-Gang Wang, Yongyan Liu
Inventor: Jun Wu, Henry Ou, Xiliu Tang, Yong-Gang Wang, Yongyan Liu
CPC classification: G06F17/2715
Abstract: Systems, methods, and apparatuses including computer program products for generating a custom language model. In one implementation, a method is provided. The method includes receiving a collection of documents; clustering the documents into one or more clusters; generating a cluster vector for each cluster of the one or more clusters; generating a target vector associated with a target profile; comparing the target vector with each of the cluster vectors; selecting one or more of the one or more clusters based on the comparison; and generating a language model using documents from the one or more selected clusters.
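The steps listed in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the abstract does not say how vectors are built, how documents are clustered, how vectors are compared, or what kind of language model is produced. The sketch assumes term-frequency vectors, a small k-means-style loop seeded with the first k documents, cosine similarity against a fixed threshold, and a unigram model. All function names and parameters are hypothetical.

```python
import math
from collections import Counter


def tf_vector(text):
    """Term-frequency vector (assumed representation; the abstract names none)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse vectors (assumed comparison)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(vectors):
    """Cluster vector: the mean of the member document vectors."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return Counter({t: c / len(vectors) for t, c in total.items()})


def cluster(docs, k, iterations=10):
    """k-means-style clustering; the first k documents seed the clusters."""
    vectors = [tf_vector(d) for d in docs]
    centroids = vectors[:k]
    groups = []
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for i, v in enumerate(vectors):
            best = max(range(k), key=lambda c: cosine(v, centroids[c]))
            groups[best].append(i)
        # Keep the previous centroid when a cluster ends up empty.
        centroids = [
            centroid([vectors[i] for i in g]) if g else centroids[c]
            for c, g in enumerate(groups)
        ]
    return groups, centroids


def select_clusters(target_vec, centroids, threshold):
    """Indices of clusters whose vector is close enough to the target vector."""
    return [c for c, vec in enumerate(centroids)
            if cosine(target_vec, vec) >= threshold]


def unigram_model(texts):
    """Maximum-likelihood unigram model (assumed form of language model)."""
    counts = Counter(w for t in texts for w in t.lower().split())
    total = sum(counts.values())
    return {w: n / total for w, n in counts.items()}


def custom_language_model(docs, target_text, k=2, threshold=0.2):
    """Cluster, compare with the target profile, select, then build the model."""
    groups, centroids = cluster(docs, k)
    chosen = select_clusters(tf_vector(target_text), centroids, threshold)
    selected = [docs[i] for c in chosen for i in groups[c]]
    return unigram_model(selected) if selected else {}
```

Calling `custom_language_model(docs, "basketball team")` on a mixed collection of sports and finance documents would, under these assumptions, build the model only from the cluster whose vector resembles the target profile. The threshold and the value of k are tuning choices that the abstract leaves open.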
-
Publication No.: US20110296374A1
Publication date: 2011-12-01
Application No.: US13127417
Filing date: 2008-11-05
Applicant: Jun Wu, Henry Ou, Xiliu Tang, Yong-Gang Wang, Yongyan Liu
Inventor: Jun Wu, Henry Ou, Xiliu Tang, Yong-Gang Wang, Yongyan Liu
IPC classification: G06F9/44
CPC classification: G06F17/2715
Abstract: Systems, methods, and apparatuses including computer program products for generating a custom language model. In one implementation, a method is provided. The method includes receiving a collection of documents; clustering the documents into one or more clusters; generating a cluster vector for each cluster of the one or more clusters; generating a target vector associated with a target profile; comparing the target vector with each of the cluster vectors; selecting one or more of the one or more clusters based on the comparison; and generating a language model using documents from the one or more selected clusters.
-