Textual analysis system for automatic content extraction
Abstract:
The present invention provides a method, and an associated apparatus configured to implement that method, for analysing mark-up language text content, such as might be found on a website or within online user-generated content. The method comprises a training phase, in which a plurality of schemas is automatically generated from a specified text and a final schema is compiled from them. The final schema can then be compared against other online text content, so that content which matches the final schema can be identified, for example for further analysis and comparison.
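
The two phases described above can be illustrated with a minimal sketch. The abstract does not say what a schema contains or how the final schema is compiled, so everything concrete here is an assumption made for illustration: a schema is taken to be the chain of tags (with `class` attributes) enclosing the specified text, and the final schema is taken to be the most frequent candidate across the training examples. All function and class names are hypothetical.

```python
from collections import Counter
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not be pushed on the stack.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class PathCollector(HTMLParser):
    """Record (tag path, text) for every non-empty text node."""

    def __init__(self):
        super().__init__()
        self.stack = []   # currently open elements, outermost first
        self.nodes = []   # collected (path, text) pairs

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        cls = dict(attrs).get("class")
        self.stack.append(f"{tag}.{cls}" if cls else tag)

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open tag; ignore stray closers.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i].split(".")[0] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.nodes.append((tuple(self.stack), text))


def text_nodes(markup):
    parser = PathCollector()
    parser.feed(markup)
    parser.close()
    return parser.nodes


def candidate_schemas(markup, specified_text):
    """Training step (assumed form): one schema per node containing the text."""
    return [path for path, text in text_nodes(markup) if specified_text in text]


def compile_final_schema(training_pairs):
    """Assumed compilation rule: keep the most frequent candidate schema."""
    counts = Counter()
    for markup, specified_text in training_pairs:
        counts.update(candidate_schemas(markup, specified_text))
    return counts.most_common(1)[0][0] if counts else None


def extract_matching(markup, schema):
    """Comparison step: return text whose enclosing tag path equals the schema."""
    return [text for path, text in text_nodes(markup) if path == schema]


training = [
    ('<div class="post"><h2>Title</h2>'
     '<p class="body">Great phone, battery lasts long</p></div>', "battery lasts"),
    ('<div class="post"><h2>Other</h2>'
     '<p class="body">Screen is too dim</p></div>', "too dim"),
]
final_schema = compile_final_schema(training)

unseen = ('<div class="post"><h2>Hi</h2><p class="body">Camera is sharp</p></div>'
          '<div class="ad"><p class="body">Buy now</p></div>')
print(extract_matching(unseen, final_schema))  # ['Camera is sharp']
```

In this sketch the advertisement paragraph is rejected because its enclosing path differs from the learned one. A practical system would likely need a looser comparison than exact path equality, which is outside what the abstract specifies.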