- Patent title: Textual analysis system for automatic content extraction
- Application number: US14009027
- Filing date: 2012-03-29
- Publication number: US10545928B2
- Publication date: 2020-01-28
- Inventors: Hamid Gharib, Simon Thompson, Duong Nguyen, Marcus Thint
- Applicants: Hamid Gharib, Simon Thompson, Duong Nguyen, Marcus Thint
- Applicant address: GB London
- Assignee: BRITISH TELECOMMUNICATIONS public limited company
- Current assignee: BRITISH TELECOMMUNICATIONS public limited company
- Current assignee address: GB London
- Agent: Nixon & Vanderhye P.C.
- Priority: EP11250404 20110330
- International application: PCT/GB2012/000296 WO 20120329
- International publication: WO2012/131310 WO 20121004
- Main classification: G06F16/21
- IPC classification: G06F16/21; G06F17/27; G06F17/22
Abstract:
The present invention provides a method, and an associated apparatus configured to implement such a method, for analysing mark-up language text content, such as might be found on a website or within online user-generated content. The method comprises a training phase, in which a plurality of schemas are automatically generated from a specified text and a final schema is compiled. This final schema can then be compared with other online text content, such that content which matches the final schema can be identified, for example for further analysis and comparison.
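The two phases described in the abstract can be illustrated with a minimal sketch. The abstract does not say how a schema is represented or how the final schema is compiled, so the following are assumptions made purely for illustration: a schema is taken to be the tag path (with class names) leading to the element that contains the specified text, and the final schema is the path seen in the most training pages. All class and function names here (`PathCollector`, `candidate_schemas`, `compile_final_schema`, `extract_matching`) are hypothetical and are not taken from the patent.

```python
from collections import Counter
from html.parser import HTMLParser

# Elements that never have a closing tag; they must not stay on the path stack.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class PathCollector(HTMLParser):
    """Records a (tag path, text) pair for every non-empty text node."""

    def __init__(self):
        super().__init__()
        self.stack = []   # list of (tag, label) for the currently open elements
        self.items = []   # list of (path string, text)

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        css_class = dict(attrs).get("class")
        label = f"{tag}.{css_class}" if css_class else tag
        self.stack.append((tag, label))

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open tag, tolerating unclosed tags.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        text = data.strip()
        if text:
            path = "/".join(label for _, label in self.stack)
            self.items.append((path, text))


def text_paths(html):
    """Return every (path, text) pair found in an HTML string."""
    collector = PathCollector()
    collector.feed(html)
    collector.close()
    return collector.items


def candidate_schemas(html, specified_text):
    """Training step (assumed): paths of elements containing the specified text."""
    return [path for path, text in text_paths(html) if specified_text in text]


def compile_final_schema(training_examples):
    """Assumed compilation rule: keep the path seen in the most training pages."""
    counts = Counter()
    for html, specified_text in training_examples:
        counts.update(set(candidate_schemas(html, specified_text)))
    if not counts:
        return None
    return counts.most_common(1)[0][0]


def extract_matching(html, schema):
    """Matching step: return text from elements whose path equals the schema."""
    return [text for path, text in text_paths(html) if path == schema]
```

In use, `compile_final_schema` would be given a list of (page, specified text) pairs, and the resulting path would then be passed to `extract_matching` together with an unseen page. Exact path equality is the simplest possible matching rule; a practical system would probably need something more tolerant of layout variation.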
Published/granted documents
- US20140025698A1 TEXTUAL ANALYSIS SYSTEM Publication/grant date: 2014-01-23