-
Publication No.: US20240028972A1
Publication date: 2024-01-25
Application No.: US17815448
Filing date: 2022-07-27
Applicant: Adobe Inc.
Inventor: Christopher Tensmeyer , Nikolaos Barmpalios , Sruthi Madapoosi Ravi , Ruchi Deshpande , Varun Manjunatha , Smitha Bangalore Naresh , Priyank Mathur , Oghenetegiri Sido
CPC classification number: G06N20/20 , G06K9/6262 , G06K9/6256
Abstract: Techniques for training for and determining a confidence of an output of a machine learning model are disclosed. Such techniques include, in some embodiments, receiving, from the machine learning model configured to receive information associated with a data object, information associated with a predicted structure for the data object; encoding, using a second machine learning model, the information associated with the predicted structure for the data object to produce encoded input channels; evaluating, using the second machine learning model, the information associated with the data object with the encoded input channels; and based on the evaluating, determining, using the second machine learning model, a probability of correctness of the predicted structure for the data object.
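The flow in this abstract (encode a predicted structure as input channels, evaluate them together with the data object, output a probability of correctness) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the box format, the per-class mask encoding, and the logistic scorer standing in for the second machine learning model are hypothetical and are not taken from the application.

```python
import math

def encode_structure(boxes, height, width, num_classes):
    """Encode predicted boxes as one binary mask channel per class.

    Each box is (x0, y0, x1, y1, class_index); this layout is hypothetical.
    """
    channels = [[[0.0] * width for _ in range(height)] for _ in range(num_classes)]
    for (x0, y0, x1, y1, cls) in boxes:
        for y in range(max(0, y0), min(height, y1)):
            for x in range(max(0, x0), min(width, x1)):
                channels[cls][y][x] = 1.0
    return channels

def correctness_probability(page, channels, weights, bias):
    """Toy stand-in for the second model (an assumption, not the patented model).

    Computes, per channel, the fraction of page cells where page ink and the
    encoded mask overlap, then applies a logistic function to a weighted sum.
    """
    h, w = len(page), len(page[0])
    features = []
    for ch in channels:
        overlap = sum(page[y][x] * ch[y][x] for y in range(h) for x in range(w))
        features.append(overlap / (h * w))
    z = bias + sum(wt * f for wt, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

# Example: a 4x4 page with ink in the top-left 2x2 block and one predicted box.
page = [[1.0, 1.0, 0.0, 0.0],
        [1.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 0.0]]
channels = encode_structure([(0, 0, 2, 2, 0)], height=4, width=4, num_classes=2)
prob = correctness_probability(page, channels, weights=[4.0, 4.0], bias=-0.5)
```

In a real system the scorer would presumably be a trained network; the hand-set weights above only show how the encoded channels and the original input are evaluated together.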
-
Publication No.: US20210117666A1
Publication date: 2021-04-22
Application No.: US16655363
Filing date: 2019-10-17
Applicant: Adobe Inc.
Inventor: Verena Sabine Kaynig-Fittkau , Smitha Bangalore Naresh , Shawn Alan Gaither , Richard Cohn , Paul John Asente , Eylon Stroh , Emily Seminerio
Abstract: Techniques are provided for identifying structural elements of a document. One methodology includes generating a first channel of rasterized content by rasterizing a full page of the document and generating one or more additional channels of rasterized content from the page of the document by rasterizing one or more corresponding content types from the page of the document. Each of the one or more additional channels includes a specific type of content that is different from each of the other one or more additional channels. The methodology further includes inputting the first channel of rasterized content and the one or more additional channels of rasterized content into a machine learning (ML) model. The methodology continues with determining location and classification for each of a plurality of structural elements on the page of the document using the ML model.
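The channel construction described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the element dictionaries, the `bbox` and `type` keys, and the binary-grid "rasterizer" are hypothetical simplifications, and the downstream ML model is left out.

```python
def rasterize(elements, height, width):
    """Draw each element's bounding box as filled cells on a binary grid."""
    grid = [[0] * width for _ in range(height)]
    for el in elements:
        x0, y0, x1, y1 = el["bbox"]
        for y in range(max(0, y0), min(height, y1)):
            for x in range(max(0, x0), min(width, x1)):
                grid[y][x] = 1
    return grid

def build_channels(elements, height, width, content_types):
    """Return [full-page channel, one channel per content type]."""
    channels = [rasterize(elements, height, width)]  # first channel: full page
    for ctype in content_types:
        subset = [el for el in elements if el["type"] == ctype]
        channels.append(rasterize(subset, height, width))  # one type only
    return channels

# Example page with one text element and one image element (hypothetical data).
elements = [
    {"type": "text", "bbox": (0, 0, 2, 1)},
    {"type": "image", "bbox": (2, 2, 4, 4)},
]
page_channels = build_channels(elements, 4, 4, ["text", "image"])
```

The resulting list of grids would then be stacked and passed to a model that predicts the location and class of each structural element.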
-
Publication No.: US20240232525A9
Publication date: 2024-07-11
Application No.: US18048900
Filing date: 2022-10-24
Applicant: ADOBE INC.
Inventor: Rajiv Bhawanji Jain , Michelle Yuan , Vlad Ion Morariu , Ani Nenkova Nenkova , Smitha Bangalore Naresh , Nikolaos Barmpalios , Ruchi Deshpande , Ruiyi Zhang , Jiuxiang Gu , Varun Manjunatha , Nedim Lipka , Andrew Marc Greene
IPC: G06F40/20 , G06F40/169 , G06N3/08
CPC classification number: G06F40/20 , G06F40/169 , G06N3/08
Abstract: Systems and methods for document classification are described. Embodiments of the present disclosure generate classification data for a plurality of samples using a neural network trained to identify a plurality of known classes; select a set of samples for annotation from the plurality of samples using an open-set metric based on the classification data, wherein the annotation includes an unknown class; and train the neural network to identify the unknown class based on the annotation of the set of samples.
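The selection step in this abstract can be sketched as follows. The abstract does not say which open-set metric is used, so this example assumes one common choice, one minus the maximum softmax probability, purely for illustration; the function names are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def select_for_annotation(logits_per_sample, k):
    """Pick the k samples the known-class classifier is least sure about.

    Assumed open-set score: 1 - max softmax probability. Higher means the
    sample is more likely to belong to an unknown class.
    """
    scores = [(1.0 - max(softmax(l)), i) for i, l in enumerate(logits_per_sample)]
    scores.sort(key=lambda t: (-t[0], t[1]))  # highest score first, ties by index
    return [i for _, i in scores[:k]]

# Example: three samples scored against three known classes (made-up logits).
logits = [
    [5.0, 0.0, 0.0],   # confidently a known class
    [0.1, 0.0, 0.1],   # nearly uniform, likely unknown
    [2.0, 1.9, 0.0],   # ambiguous between two known classes
]
to_annotate = select_for_annotation(logits, k=2)
```

The selected samples would be sent to an annotator, who may label them with a new class, and the network would then be retrained on those annotations.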
-
Publication No.: US20240135096A1
Publication date: 2024-04-25
Application No.: US18048900
Filing date: 2022-10-23
Applicant: ADOBE INC.
Inventor: Rajiv Bhawanji Jain , Michelle Yuan , Vlad Ion Morariu , Ani Nenkova Nenkova , Smitha Bangalore Naresh , Nikolaos Barmpalios , Ruchi Deshpande , Ruiyi Zhang , Jiuxiang Gu , Varun Manjunatha , Nedim Lipka , Andrew Marc Greene
IPC: G06F40/20 , G06F40/169 , G06N3/08
CPC classification number: G06F40/20 , G06F40/169 , G06N3/08
Abstract: Systems and methods for document classification are described. Embodiments of the present disclosure generate classification data for a plurality of samples using a neural network trained to identify a plurality of known classes; select a set of samples for annotation from the plurality of samples using an open-set metric based on the classification data, wherein the annotation includes an unknown class; and train the neural network to identify the unknown class based on the annotation of the set of samples.
-
Publication No.: US11386685B2
Publication date: 2022-07-12
Application No.: US16655363
Filing date: 2019-10-17
Applicant: Adobe Inc.
Inventor: Verena Sabine Kaynig-Fittkau , Smitha Bangalore Naresh , Shawn Alan Gaither , Richard Cohn , Paul John Asente , Eylon Stroh , Emily Seminerio
IPC: G06V30/413 , G06N20/00 , G06V30/412 , G06V30/414
Abstract: Techniques are provided for identifying structural elements of a document. One methodology includes generating a first channel of rasterized content by rasterizing a full page of the document and generating one or more additional channels of rasterized content from the page of the document by rasterizing one or more corresponding content types from the page of the document. Each of the one or more additional channels includes a specific type of content that is different from each of the other one or more additional channels. The methodology further includes inputting the first channel of rasterized content and the one or more additional channels of rasterized content into a machine learning (ML) model. The methodology continues with determining location and classification for each of a plurality of structural elements on the page of the document using the ML model.