Invention Grant
US07877371B1 Selectively deleting clusters of conceptually related words from a generative model for text
Legal status: In force
- Patent Title: Selectively deleting clusters of conceptually related words from a generative model for text
- Application No.: US11703582
- Application Date: 2007-02-07
- Publication No.: US07877371B1
- Publication Date: 2011-01-25
- Inventor: Uri Lerner, Michael Jahr, Vishal Kasera
- Applicant: Uri Lerner, Michael Jahr, Vishal Kasera
- Applicant Address: US CA Mountain View
- Assignee: Google Inc.
- Current Assignee: Google Inc.
- Current Assignee Address: US CA Mountain View
- Agency: Park, Vaughan, Fleming & Dowler LLP
- Main IPC: G06F17/30
- IPC: G06F17/30

Abstract:
One embodiment of the present invention provides a system that selectively deletes clusters of conceptually related words from a probabilistic generative model for textual documents. During operation, the system receives a current model, which contains terminal nodes representing random variables for words and one or more cluster nodes representing clusters of conceptually related words. Nodes in the current model are coupled together by weighted links: an incoming link from a node that has fired causes a cluster node to fire with a probability proportionate to the weight of the incoming link, and an outgoing link from that cluster node to another node then causes the other node to fire with a probability proportionate to the weight of the outgoing link. Next, the system processes a given cluster node in the current model for possible deletion. This involves determining the number of outgoing links from the given cluster node to terminal nodes or cluster nodes in the current model. If the determined number of outgoing links is less than a minimum value, or if the frequency with which the given cluster node fires is less than a minimum frequency, the system deletes the given cluster node from the current model.
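
The deletion test described in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation: the `ClusterNode` class, the `prune_clusters` function, the threshold values, and the sample data are all hypothetical, and the cleanup of links that point at deleted clusters is an added assumption that the abstract does not describe.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple


@dataclass
class ClusterNode:
    """Hypothetical cluster node: weighted outgoing links and a firing frequency."""
    name: str
    out_links: Dict[str, float] = field(default_factory=dict)  # target name -> link weight
    firing_frequency: float = 0.0  # how often this cluster fires (assumed precomputed)


def prune_clusters(
    clusters: Dict[str, ClusterNode],
    min_out_links: int,
    min_frequency: float,
) -> Tuple[Dict[str, ClusterNode], Set[str]]:
    """Apply the abstract's two deletion criteria in a single pass.

    A cluster is deleted if it has fewer than `min_out_links` outgoing links
    OR fires less often than `min_frequency`.
    """
    deleted = {
        name
        for name, node in clusters.items()
        if len(node.out_links) < min_out_links
        or node.firing_frequency < min_frequency
    }
    kept = {}
    for name, node in clusters.items():
        if name in deleted:
            continue
        # Assumption (not in the abstract): drop links that point at deleted clusters.
        links = {t: w for t, w in node.out_links.items() if t not in deleted}
        kept[name] = ClusterNode(name, links, node.firing_frequency)
    return kept, deleted


# Illustrative data only; names, weights, and thresholds are made up.
clusters = {
    "pets": ClusterNode("pets", {"cat": 0.6, "dog": 0.7, "vet": 0.2}, 0.05),
    "rare": ClusterNode("rare", {"cat": 0.5, "dog": 0.4, "zyx": 0.1}, 0.0001),
    "tiny": ClusterNode("tiny", {"cat": 0.9}, 0.2),
    "animals": ClusterNode("animals", {"pets": 0.5, "tiny": 0.3, "zoo": 0.4}, 0.1),
}
kept, deleted = prune_clusters(clusters, min_out_links=2, min_frequency=0.001)
```

Here "tiny" fails the link-count test and "rare" fails the frequency test, so both are removed, while "pets" and "animals" survive. The sketch makes only one pass; whether a cluster that drops below the link minimum after cleanup should itself be deleted is left open, since the abstract does not say.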