Custom language models
    1.
    Granted patent
    Custom language models (in force)

    Publication number: US08826226B2

    Publication date: 2014-09-02

    Application number: US13127417

    Filing date: 2008-11-05

    CPC classification: G06F17/2715

    Abstract: Systems, methods, and apparatuses including computer program products for generating a custom language model. In one implementation, a method is provided. The method includes receiving a collection of documents; clustering the documents into one or more clusters; generating a cluster vector for each cluster of the one or more clusters; generating a target vector associated with a target profile; comparing the target vector with each of the cluster vectors; selecting one or more of the one or more clusters based on the comparison; and generating a language model using documents from the one or more selected clusters.
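The steps listed in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the patented implementation: the abstract does not say how clustering, vector construction, or comparison are done, so the bag-of-words term-frequency vectors, the k-means-style clustering seeded with the first `k` documents, the cosine-similarity comparison, the `top_n` selection rule, and the unigram model are all assumptions, and every function name here is hypothetical.

```python
import math
from collections import Counter


def tf_vector(text):
    """Bag-of-words term-frequency vector (assumed representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse vectors (assumed comparison)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(vectors):
    """Mean of a non-empty list of vectors; serves as the cluster vector."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return Counter({t: c / len(vectors) for t, c in total.items()})


def cluster_documents(docs, k, iterations=10):
    """k-means-style clustering; the first k documents seed the centroids."""
    vectors = [tf_vector(d) for d in docs]
    centroids = vectors[:k]
    assignment = []
    for _ in range(iterations):
        # Assign each document to its most similar centroid.
        assignment = [
            max(range(len(centroids)), key=lambda i: cosine(v, centroids[i]))
            for v in vectors
        ]
        # Recompute each centroid; keep the old one if a cluster is empty.
        updated = []
        for i in range(len(centroids)):
            members = [v for v, a in zip(vectors, assignment) if a == i]
            updated.append(centroid(members) if members else centroids[i])
        centroids = updated
    clusters = [
        [d for d, a in zip(docs, assignment) if a == i]
        for i in range(len(centroids))
    ]
    return clusters, centroids


def custom_language_model(docs, target_profile, k=2, top_n=1):
    """Build a unigram model from the clusters closest to the target profile."""
    clusters, cluster_vectors = cluster_documents(docs, k)
    target = tf_vector(target_profile)  # target vector
    # Compare the target vector with each cluster vector and rank clusters.
    ranked = sorted(
        range(len(clusters)),
        key=lambda i: cosine(target, cluster_vectors[i]),
        reverse=True,
    )
    selected = [doc for i in ranked[:top_n] for doc in clusters[i]]
    counts = Counter(w for d in selected for w in d.lower().split())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}
```

For example, calling `custom_language_model(docs, "football goal", k=2, top_n=1)` on a small mixed collection of finance and sports documents would be expected to produce word probabilities drawn only from the sports cluster. Seeding with the first `k` documents keeps the sketch deterministic; a real system would likely use a more robust initialization and a higher-order n-gram model.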

    CUSTOM LANGUAGE MODELS
    2.
    Patent application
    CUSTOM LANGUAGE MODELS (in force)

    Publication number: US20110296374A1

    Publication date: 2011-12-01

    Application number: US13127417

    Filing date: 2008-11-05

    IPC classification: G06F9/44

    CPC classification: G06F17/2715

    Abstract: Systems, methods, and apparatuses including computer program products for generating a custom language model. In one implementation, a method is provided. The method includes receiving a collection of documents; clustering the documents into one or more clusters; generating a cluster vector for each cluster of the one or more clusters; generating a target vector associated with a target profile; comparing the target vector with each of the cluster vectors; selecting one or more of the one or more clusters based on the comparison; and generating a language model using documents from the one or more selected clusters.
