CROSS-CONTEXT NATURAL LANGUAGE MODEL GENERATION
摘要:
Provided is a method including obtaining a corpus and an associated set of domain indicators. The method includes learning a set of vectors in an embedding space based on n-grams of the corpus. The method includes updating ontology graphs comprising a set of vertices and edges associating the set of vertices with each other. The method also includes determining a vector cluster using hierarchical clustering based on distances of the set of vectors with respect to each other in the embedding space and determining a hierarchy of the ontology graphs based on a set of domain indicators of a respective set of vertices corresponding to vectors of the vector cluster. The method also includes updating an index based on the ontology graphs.
公开/授权文献
信息查询
0/0