Publication number: US09665617B1
Publication date: 2017-05-30
Application number: US14254349
Filing date: 2014-04-16
Applicant: Google Inc.
Inventor: Thomas James Worthington Long, Pieter Senster
CPC classification number: G06F17/30424, G06F17/30864, G06F17/30867, G06F17/30882
Abstract: Systems and methods of generating a stable identifier for nodes likely to include primary content of an information resource are disclosed. A processor identifies, on an information resource, a plurality of content-related Document Object Model (DOM) nodes based on a primary content detection policy including one or more rules. The processor determines one or more container nodes containing one or more of the identified content-related DOM nodes. The processor generates, for each of the container nodes, one or more identifiers corresponding to the container node. The processor then determines, for each of the generated identifiers, one or more container nodes to which the identifier corresponds. The processor identifies, from the generated identifiers, a subset of the generated identifiers that correspond only to container nodes that contain the content-related DOM nodes and selects one of the identifiers of the subset as a stable identifier.
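The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Node` class, the `is_content` rule (a `p` element with at least 40 characters of text), the CSS-like identifier forms, and the `rank` preference used for the final selection are all hypothetical choices made for illustration, and containers are assumed to be the direct parents of the content nodes.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)  # compare nodes by identity, not by field values
class Node:
    """Hypothetical stand-in for a DOM node."""
    tag: str
    node_id: Optional[str] = None
    classes: tuple = ()
    text: str = ""
    children: list = field(default_factory=list, repr=False)
    parent: Optional["Node"] = field(default=None, repr=False)

    def add(self, child):
        child.parent = self
        self.children.append(child)
        return child


def walk(node):
    """Yield the node and all of its descendants in document order."""
    yield node
    for child in node.children:
        yield from walk(child)


def is_content(node, min_chars=40):
    # Assumed detection rule: a paragraph with a reasonable amount of text.
    return node.tag == "p" and len(node.text) >= min_chars


def identifiers_for(node):
    # Assumed identifier forms: tag, #id, .class, tag.class
    ids = [node.tag]
    if node.node_id:
        ids.append("#" + node.node_id)
    for cls in node.classes:
        ids.append("." + cls)
        ids.append(node.tag + "." + cls)
    return ids


def candidate_subset(root):
    """Identifiers that match only containers of content-related nodes."""
    all_nodes = list(walk(root))
    content = [n for n in all_nodes if is_content(n)]
    # Assumption: a container is the direct parent of a content node.
    containers = [n for n in all_nodes if any(c.parent is n for c in content)]
    candidates = []
    for container in containers:
        for ident in identifiers_for(container):
            if ident not in candidates:
                candidates.append(ident)
    subset = []
    for ident in candidates:
        matched = [n for n in all_nodes if ident in identifiers_for(n)]
        if all(any(m is c for c in containers) for m in matched):
            subset.append(ident)
    return subset


def rank(ident):
    # Assumed preference: id first, then class-based forms, then bare tag.
    if ident.startswith("#"):
        return 0
    return 1 if "." in ident else 2


def stable_identifier(root):
    subset = candidate_subset(root)
    return min(subset, key=rank) if subset else None


# Sample page: only the "article" div holds long paragraphs.
page = Node("body")
nav = page.add(Node("div", classes=("nav",)))
nav.add(Node("p", text="Home"))
article = page.add(Node("div", node_id="main", classes=("article",)))
article.add(Node("p", text="x" * 60))
article.add(Node("p", text="y" * 80))
footer = page.add(Node("div", classes=("footer",)))
footer.add(Node("p", text="Contact"))

print(stable_identifier(page))  # prints "#main"
```

In this sample the bare `div` identifier is rejected because it also matches the navigation and footer containers, which hold no content-related nodes; `#main`, `.article`, and `div.article` remain, and the assumed ranking picks `#main`.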