-
Publication No.: US20210117666A1
Publication Date: 2021-04-22
Application No.: US16655363
Filing Date: 2019-10-17
Applicant: Adobe Inc.
Inventor: Verena Sabine Kaynig-Fittkau, Smitha Bangalore Naresh, Shawn Alan Gaither, Richard Cohn, Paul John Asente, Eylon Stroh, Emily Seminerio
Abstract: Techniques are provided for identifying structural elements of a document. One methodology includes generating a first channel of rasterized content by rasterizing a full page of the document, and generating one or more additional channels of rasterized content by rasterizing one or more corresponding content types from the same page. Each additional channel contains a specific type of content that differs from the content in every other additional channel. The methodology further includes inputting the first channel and the one or more additional channels of rasterized content into a machine learning (ML) model. The methodology continues with determining, using the ML model, a location and a classification for each of a plurality of structural elements on the page.
-
Publication No.: US11386685B2
Publication Date: 2022-07-12
Application No.: US16655363
Filing Date: 2019-10-17
Applicant: Adobe Inc.
Inventor: Verena Sabine Kaynig-Fittkau, Smitha Bangalore Naresh, Shawn Alan Gaither, Richard Cohn, Paul John Asente, Eylon Stroh, Emily Seminerio
IPC: G06V30/413, G06N20/00, G06V30/412, G06V30/414
Abstract: Techniques are provided for identifying structural elements of a document. One methodology includes generating a first channel of rasterized content by rasterizing a full page of the document, and generating one or more additional channels of rasterized content by rasterizing one or more corresponding content types from the same page. Each additional channel contains a specific type of content that differs from the content in every other additional channel. The methodology further includes inputting the first channel and the one or more additional channels of rasterized content into a machine learning (ML) model. The methodology continues with determining, using the ML model, a location and a classification for each of a plurality of structural elements on the page.
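Illustration (not part of the patent record): the channel construction described in the abstract can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the claimed implementation. The content types (`text`, `image`, `vector`), the `PageItem` type, and the helpers `rasterize` and `build_channels` are all hypothetical names introduced here; the abstract does not enumerate content types or describe the model. Rasterization is simplified to painting bounding boxes as binary masks, and the ML model itself is left out.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical content types; the abstract does not list them.
CONTENT_TYPES = ("text", "image", "vector")

Channel = List[List[int]]  # height x width grid of 0/1 values


@dataclass
class PageItem:
    kind: str                       # one of CONTENT_TYPES (assumed)
    box: Tuple[int, int, int, int]  # (x0, y0, x1, y1) in pixels, x1/y1 exclusive


def rasterize(items: List[PageItem], width: int, height: int) -> Channel:
    """Simplified rasterizer: paint each item's box as 1s on a blank grid."""
    grid = [[0] * width for _ in range(height)]
    for item in items:
        x0, y0, x1, y1 = item.box
        for y in range(max(0, y0), min(height, y1)):
            for x in range(max(0, x0), min(width, x1)):
                grid[y][x] = 1
    return grid


def build_channels(items: List[PageItem], width: int, height: int) -> List[Channel]:
    """First channel: the full page. Then one channel per content type."""
    channels = [rasterize(items, width, height)]
    for kind in CONTENT_TYPES:
        subset = [item for item in items if item.kind == kind]
        channels.append(rasterize(subset, width, height))
    return channels


# A toy 8x8 page with one item of each assumed content type.
page = [
    PageItem("text", (1, 1, 7, 3)),
    PageItem("image", (1, 4, 4, 7)),
    PageItem("vector", (5, 4, 7, 7)),
]
channels = build_channels(page, width=8, height=8)
# The stacked channels would then be the input to an ML model that
# predicts a location and a class for each structural element.
```

Keeping each content type in its own channel means the model receives both the overall page appearance and an explicit signal about which pixels belong to which kind of content.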
-