Invention Grant
US06651058B1 System and method of automatic discovery of terms in a document that are relevant to a given target topic (In force)

  • Patent Title: System and method of automatic discovery of terms in a document that are relevant to a given target topic
  • Application No.: US09439758
    Application Date: 1999-11-15
  • Publication No.: US06651058B1
    Publication Date: 2003-11-18
  • Inventor: Neelakantan Sundaresan, Jeonghee Yi
  • Applicant: Neelakantan Sundaresan, Jeonghee Yi
  • Main IPC: G06F17/30
  • IPC: G06F17/30
Abstract:
A computer program product is provided as an automatic mining system to discover terms that are relevant to a given target topic from a large database of unstructured information such as the World Wide Web. The operation of the automatic mining system is performed in three stages: the first stage is carried out by a new terms discoverer for discovering the terms in a document, the second stage is carried out by a candidate terms discoverer for discovering potentially relevant terms, and the third stage is carried out by a relevant terms discoverer for refining or testing the discovered relevance to filter false relevance. The new terms discoverer includes a system for the automatic mining of patterns and relations, a system for the automatic mining of new relationships, and a system for selecting new terms from relations. In one embodiment, the system for the automatic mining of patterns and relations identifies a set of related terms on the WWW with a high degree of confidence, using a duality concept, and includes a terms database and two identifiers: a relation identifier and a pattern identifier. The system for the automatic mining of new relationships includes a database, a knowledge module, and a statistics module. The knowledge module includes a stemming unit, a synonym check unit, and a domain knowledge check unit. The candidate terms discoverer includes a metadata extractor, a document vector module, an association module, a filtering module, and a database. The relevant terms discoverer includes a stop word filter and a system for the automatic construction of a generalization-specialization hierarchy of terms, comprising a terms database, an augmentation module, a generalization detection module, and a hierarchy database.
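The three-stage flow described in the abstract can be illustrated with a deliberately simplified sketch. It is not the patented method: the abstract does not disclose the algorithms, so each stage below is an assumed stand-in (regex tokenization for the new terms discoverer, document co-occurrence counting for the candidate terms discoverer, and a tiny hand-written stop word list for the relevant terms discoverer). All function names, the `min_support` parameter, and the stop word list are hypothetical.

```python
import re
from collections import Counter

# Assumed, deliberately tiny stop word list (not taken from the patent).
STOP_WORDS = {"the", "a", "an", "of", "and", "is", "in", "to", "for"}


def discover_terms(document: str) -> list[str]:
    """Stage 1 stand-in: extract lowercase alphabetic tokens from a document."""
    return re.findall(r"[a-z]+", document.lower())


def discover_candidates(documents, topic, min_support=2):
    """Stage 2 stand-in: count, for each term, the number of documents in
    which it co-occurs with the target topic; keep terms seen in at least
    `min_support` such documents."""
    counts = Counter()
    for doc in documents:
        terms = set(discover_terms(doc))
        if topic in terms:
            counts.update(terms - {topic})
    return {term: n for term, n in counts.items() if n >= min_support}


def filter_relevant(candidates):
    """Stage 3 stand-in: remove stop words to filter false relevance."""
    return {term: n for term, n in candidates.items() if term not in STOP_WORDS}


def mine_relevant_terms(documents, topic, min_support=2):
    """Run the three stages in sequence for one target topic."""
    return filter_relevant(discover_candidates(documents, topic, min_support))
```

With the three short documents "XML is a markup language", "The XML parser reads the markup", and "the DTD and XML" and the topic "xml", stage 2 keeps "markup" and "the" (each co-occurs with the topic in two documents), and stage 3 drops "the" as a stop word. The stemming, synonym, domain knowledge, and hierarchy components named in the abstract are omitted from this sketch.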