EXTRACTING DOCUMENT HIERARCHY USING A MULTIMODAL, LAYER-WISE LINK PREDICTION NEURAL NETWORK
Abstract:
The present disclosure relates to systems, methods, and non-transitory computer-readable media that generate, from visual elements of a digital document image, a digital document hierarchy comprising layers of parent-child element relationships. For example, for a given layer, the disclosed systems determine candidate parent visual elements and child visual elements from the visual elements. In addition, for that layer, the disclosed systems utilize a neural network to generate, from feature embeddings, element classifications for the candidate parent visual elements and parent-child element link probabilities between the candidate parent visual elements and the child visual elements. Moreover, for that layer, the disclosed systems select parent visual elements from the candidate parent visual elements based on the parent-child element link probabilities. Further, the disclosed systems utilize the digital document hierarchy to generate an interactive digital document from the digital document image.
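The layer-wise selection step described above can be illustrated with a minimal sketch. It is not the disclosed implementation: the neural network is replaced by hard-coded link probabilities, and the element names, the argmax rule, and the 0.5 threshold are all assumptions made for illustration. The abstract states only that parents are selected "based on" the link probabilities.

```python
from typing import Dict, List, Optional

# Maps each child element to {candidate parent: link probability}.
LinkProbs = Dict[str, Dict[str, float]]


def select_parents(link_probs: LinkProbs, threshold: float = 0.5) -> Dict[str, Optional[str]]:
    """Pick, for each child, the candidate parent with the highest link probability.

    A child whose best probability is below `threshold` is left without a
    parent (None). Both the argmax rule and the threshold are assumptions.
    """
    chosen: Dict[str, Optional[str]] = {}
    for child, probs in link_probs.items():
        if not probs:
            chosen[child] = None
            continue
        best = max(probs, key=probs.get)
        chosen[child] = best if probs[best] >= threshold else None
    return chosen


def build_hierarchy(layers: List[LinkProbs], threshold: float = 0.5) -> Dict[str, Optional[str]]:
    """Process layers bottom-up, accumulating child -> parent links."""
    hierarchy: Dict[str, Optional[str]] = {}
    for link_probs in layers:
        hierarchy.update(select_parents(link_probs, threshold))
    return hierarchy


# Hypothetical probabilities standing in for the neural network's output.
layers = [
    {   # layer 1: low-level elements linked to candidate fields
        "textrun_1": {"field_a": 0.9, "field_b": 0.2},
        "widget_1": {"field_a": 0.8, "field_b": 0.1},
        "textrun_2": {"field_a": 0.3, "field_b": 0.7},
    },
    {   # layer 2: fields selected in layer 1 linked to a candidate section
        "field_a": {"section_1": 0.95},
        "field_b": {"section_1": 0.4},
    },
]

hierarchy = build_hierarchy(layers)
```

In this sketch, the parents chosen in one layer become the children of the next, which is one plausible reading of "layers of parent-child element relationships". Here `field_b` ends up without a parent because its best link probability falls below the assumed threshold.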