Automatic classification of segmented portions of web pages
    Granted invention patent
    Automatic classification of segmented portions of web pages (status: in force)

    Publication No.: US09514216B2

    Publication date: 2016-12-06

    Application No.: US14480528

    Filing date: 2014-09-08

    Applicant: Yahoo! Inc.

    Abstract: Exemplary methods and apparatuses are provided which may be used for classifying and indexing segmented portions of web pages and providing related information for use in information extraction and/or information retrieval systems. In an embodiment, an index of segmented portions may be used by a search engine to respond to a search query. In an embodiment, one or more machine learned models may be used to identify one or more feature properties of a plurality of segmented portions within one or more files, or otherwise inferable from the one or more files. In an embodiment, one or more machine learned models may be used to classify one or more of a plurality of segmented portions as being at least one of a plurality of segment types.
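The flow the abstract describes (derive feature properties for each segmented portion, classify it as a segment type with a learned model, then index the classified segments for query answering) can be sketched roughly as below. This is only an illustration under stated assumptions, not the patented method: the `Segment` fields, the two features (word count and link density), the segment type labels `navigation` and `main_content`, and the nearest-centroid classifier are all hypothetical stand-ins chosen to keep the example self-contained.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Segment:
    page_url: str
    text: str
    link_count: int  # number of hyperlinks inside the segment (assumed input)


def features(seg):
    """Hypothetical feature properties: word count and link density."""
    words = seg.text.split()
    return (float(len(words)), seg.link_count / max(len(words), 1))


def train_centroids(labeled):
    """Nearest-centroid model: a simple stand-in for a machine-learned model."""
    sums, counts = {}, defaultdict(int)
    for seg, label in labeled:
        f = features(seg)
        prev = sums.get(label, (0.0,) * len(f))
        sums[label] = tuple(a + b for a, b in zip(prev, f))
        counts[label] += 1
    return {lab: tuple(v / counts[lab] for v in s) for lab, s in sums.items()}


def classify(seg, centroids):
    """Return the segment type whose centroid is closest to the segment."""
    f = features(seg)
    return min(
        centroids,
        key=lambda lab: sum((a - b) ** 2 for a, b in zip(f, centroids[lab])),
    )


def build_index(segments, centroids):
    """Inverted index: term -> list of (segment type, segment)."""
    index = defaultdict(list)
    for seg in segments:
        seg_type = classify(seg, centroids)
        for term in set(seg.text.lower().split()):
            index[term].append((seg_type, seg))
    return index


def search(index, term, seg_type=None):
    """Look up a term, optionally restricted to one segment type."""
    hits = index.get(term.lower(), [])
    return [seg for t, seg in hits if seg_type is None or t == seg_type]


# Toy training data with hypothetical labels.
training = [
    (Segment("http://example.com/t1", "Home News Sports Mail", 4), "navigation"),
    (Segment("http://example.com/t2", "About Contact Help", 3), "navigation"),
    (Segment("http://example.com/t3",
             "Machine learned models classify segmented portions of web pages "
             "into segment types", 0), "main_content"),
    (Segment("http://example.com/t4",
             "A search engine may use an index of segmented portions to answer "
             "a query", 1), "main_content"),
]
centroids = train_centroids(training)

nav = Segment("http://example.com/a", "Home Search Mail", 3)
body = Segment("http://example.com/a",
               "The search index stores classified segments of each crawled "
               "page for retrieval", 0)
index = build_index([nav, body], centroids)
```

With this toy data, `search(index, "search")` matches both segments of the page, while `search(index, "search", "main_content")` keeps only the body segment, which is one way an index of classified segments could let a search engine ignore navigation text when answering a query.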

