Detecting content scraping
    Granted invention patent
    Detecting content scraping (in force)

    Publication number: US08909628B1

    Publication date: 2014-12-09

    Application number: US13668106

    Filing date: 2012-11-02

    Applicant: Google Inc.

    CPC classification number: G06F17/30864 G06Q30/0201

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for identifying a plurality of n-grams in a plurality of resources found in a particular site; determining, for each of the plurality of resources, a count of n-grams that originated in the resource; determining, based on counts of n-grams that originated in the resources, a first aggregate count of n-grams that originated in the particular site; determining a second aggregate count of the plurality of n-grams that were identified in the plurality of resources found in the particular site; and determining, based on the first and second aggregate counts, a site originality score for the particular site.
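
The counting steps in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how an n-gram's origin is decided or how the two aggregate counts are combined, so this sketch makes two labeled assumptions: an n-gram "originated" in the resource with the earliest first-seen timestamp, and the score is the ratio of the first aggregate count to the second. The `Resource` record, the function names, and the example domains are hypothetical.

```python
from collections import namedtuple

# Hypothetical record: URL, owning site, first-seen timestamp, and text.
Resource = namedtuple("Resource", ["url", "site", "first_seen", "text"])


def ngrams(text, n=3):
    """Return the set of distinct word n-grams in a text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def build_origin_index(resources, n=3):
    """Map each n-gram to the URL of the earliest resource containing it.

    Assumption: "originated in" is read as "first seen in".
    """
    origin = {}  # n-gram -> (first_seen, url)
    for res in resources:
        for gram in ngrams(res.text, n):
            current = origin.get(gram)
            if current is None or res.first_seen < current[0]:
                origin[gram] = (res.first_seen, res.url)
    return {gram: url for gram, (_, url) in origin.items()}


def site_originality_score(site, resources, n=3):
    """Ratio of n-grams originated in the site to n-grams identified in it.

    Assumption: the abstract only says the score is "based on" the two
    aggregate counts; a simple ratio is used here for illustration.
    """
    origin = build_origin_index(resources, n)
    originated = 0  # first aggregate count
    identified = 0  # second aggregate count
    for res in resources:
        if res.site != site:
            continue
        grams = ngrams(res.text, n)
        # Per-resource count of n-grams that originated in this resource.
        originated += sum(1 for gram in grams if origin[gram] == res.url)
        identified += len(grams)
    return originated / identified if identified else 0.0


corpus = [
    Resource("a.example/1", "a.example", 1,
             "the quick brown fox jumps over the lazy dog"),
    Resource("b.example/1", "b.example", 2,
             "the quick brown fox jumps over the lazy dog today"),
]
print(site_originality_score("a.example", corpus))  # 1.0
print(site_originality_score("b.example", corpus))  # 0.125
```

In this toy corpus the second site repeats all seven trigrams of the earlier page and contributes only one new trigram, so its score is 1/8, while the first site scores 1.0.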
