Abstract:
In one embodiment, a method includes obtaining a text representation and identifying a current first topic structure for the text representation. The current first topic structure is initially identified as an initial first topic structure. The method also includes identifying at least a first document that has a first document topic structure that is similar to the current first topic structure, refining the current first topic structure based on the first document topic structure, and introducing topic labels in the text representation based on the current first topic structure.
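The steps in this abstract can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not state: a topic structure is modeled as a flat set of topic terms, similarity is Jaccard overlap, refinement is a set union with the most similar document's topics, and refinement repeats until nothing changes. The names `topic_similarity`, `refine_topic_structure`, and `label_text` are hypothetical.

```python
import re

def topic_similarity(a, b):
    """Jaccard overlap between two topic sets (an assumed similarity measure)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def refine_topic_structure(initial, corpus, min_similarity=0.25, max_rounds=5):
    """Start from the initial topic structure and refine it using the most
    similar document topic structure, repeating until it stops changing."""
    current = set(initial)
    for _ in range(max_rounds):
        if not corpus:
            break
        # Identify the document whose topic structure is most similar.
        best = max(corpus, key=lambda doc: topic_similarity(current, doc))
        if topic_similarity(current, best) < min_similarity:
            break
        # Assumed refinement rule: absorb the similar document's topics.
        refined = current | set(best)
        if refined == current:
            break
        current = refined
    return current

def label_text(sentences, topics):
    """Attach to each sentence the topic labels whose terms it mentions."""
    labeled = []
    for sentence in sentences:
        tokens = set(re.findall(r"[a-z]+", sentence.lower()))
        labeled.append((sentence, sorted(set(topics) & tokens)))
    return labeled

if __name__ == "__main__":
    corpus = [{"battery", "charging", "electrode"}, {"football", "league"}]
    topics = refine_topic_structure({"battery", "charging"}, corpus)
    for sentence, labels in label_text(["The electrode degrades over time."], topics):
        print(labels, sentence)
```

A real topic structure would likely be hierarchical or weighted; the flat set is used here only to keep the control flow of identify, compare, refine, and label visible.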
Abstract:
In one embodiment, digital content labeling includes receiving digital media content. The content is broken into topically homogeneous segments, and these segments are clustered in accordance with segment similarities. A topic label is associated with a segment in a cluster, by user assignment or user confirmation, and this topic label is propagated to the other segments in the same cluster. A label rank may be associated with a label.
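The clustering and label propagation steps can be sketched as follows. This is an illustration under stated assumptions, not the claimed method: segments are assumed to be already extracted and represented as lists of terms, similarity is Jaccard overlap, and clustering is a greedy single pass against each cluster's first member. Segmentation and label ranking are omitted. The names `jaccard`, `cluster_segments`, and `propagate_labels` are hypothetical.

```python
def jaccard(a, b):
    """Jaccard overlap between two term lists (an assumed segment similarity)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster_segments(segments, threshold=0.3):
    """Greedy clustering: each segment joins the first cluster whose first
    member is similar enough, otherwise it starts a new cluster.
    Returns clusters as lists of segment indices."""
    clusters = []
    for i, segment in enumerate(segments):
        for cluster in clusters:
            if jaccard(segment, segments[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters

def propagate_labels(clusters, user_labels):
    """Copy a user-assigned (or user-confirmed) label to every other segment
    in the same cluster. Clusters with no user label stay unlabeled."""
    labels = {}
    for cluster in clusters:
        assigned = [user_labels[i] for i in cluster if i in user_labels]
        if not assigned:
            continue
        for i in cluster:
            # A segment's own user label wins over a propagated one.
            labels[i] = user_labels.get(i, assigned[0])
    return labels

if __name__ == "__main__":
    segments = [
        ["stock", "market", "shares"],
        ["rain", "storm", "forecast"],
        ["market", "shares", "trading"],
        ["storm", "forecast", "wind"],
    ]
    clusters = cluster_segments(segments)
    print(clusters)
    print(propagate_labels(clusters, {0: "finance"}))
```

In this sketch a single user label on segment 0 reaches segment 2 through their shared cluster, while the weather segments remain unlabeled until a user assigns or confirms a label for one of them.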