-
Publication No.: US11989964B2
Publication date: 2024-05-21
Application No.: US17524157
Filing date: 2021-11-11
Inventors: Amit Agarwal, Kulbhushan Pachauri, Iman Zadeh, Jun Qian
CPC classification: G06V30/41, G06N20/00, G06V30/18181
Abstract: A computing device may receive a set of user documents. Data may be extracted from the documents to generate a first graph data structure with one or more initial graphs containing key-value pairs. A model may be trained on the first graph data structure to classify the pairs. Until a set of evaluation metrics for the model exceeds a set of deployment thresholds, the following may be repeated: a set of evaluation metrics may be generated for the model and compared to the set of deployment thresholds. In response to a determination that the set of evaluation metrics is below the set of deployment thresholds, one or more new graphs may be generated from the one or more initial graphs in the first graph data structure to produce a second graph data structure. The first and second graph data structures can be used to train the model.
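The loop described in this abstract (train, evaluate, compare against deployment thresholds, augment, retrain) can be sketched as follows. This is a minimal illustration, not the patented method: `augment_graph`, `train_until_deployable`, the `train` and `evaluate` callables, and the representation of a graph as a list of key-value tuples are all hypothetical names and assumptions introduced here.

```python
import random


def augment_graph(graph, rng):
    """Hypothetical augmentation: copy a graph and lightly perturb each value string."""
    return [(key, value + rng.choice(["", " ", "."])) for key, value in graph]


def train_until_deployable(initial_graphs, train, evaluate, thresholds,
                           max_rounds=10, seed=0):
    """Retrain on a growing set of graphs until every metric exceeds its threshold.

    `train(graphs)` returns a model and `evaluate(model)` returns a dict of
    metric name -> score; both are caller-supplied stand-ins (assumptions).
    """
    rng = random.Random(seed)
    graphs = list(initial_graphs)            # first graph data structure
    model = train(graphs)
    for _ in range(max_rounds):
        metrics = evaluate(model)
        if all(metrics[name] > limit for name, limit in thresholds.items()):
            return model, metrics, len(graphs)
        # Metrics are below the thresholds: derive new graphs from the initial ones
        # (second graph data structure) and train on both sets together.
        graphs = graphs + [augment_graph(g, rng) for g in initial_graphs]
        model = train(graphs)
    return model, evaluate(model), len(graphs)
```

The `max_rounds` cap is an addition of this sketch so that the loop terminates even if the thresholds are never reached.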
-
Publication No.: US11823478B2
Publication date: 2023-11-21
Application No.: US17714806
Filing date: 2022-04-06
Inventors: Amit Agarwal, Kulbhushan Pachauri
IPC classification: G06V30/414, G06V30/19
CPC classification: G06V30/414, G06V30/19147, G06V30/19173, G06V30/19187
Abstract: A computing device may access visually rich documents comprising an image and metadata. A graph, based on the image or metadata, can be generated for a visually rich document. The graph's nodes can correspond to words from the visually rich document. Features for nodes can be determined by the device. The device may generate model-labeled graphs by assigning a pseudo-label to nodes using a pretrained model. The device may generate a plurality of graph-labeled graphs by assigning a pseudo-label to nodes by matching a first node from a first graph to at least a second node from a second graph. The device may generate a plurality of updated graphs by cross-referencing labels from the model-labeled graphs and the graph-labeled graphs. Until a change in labels is below a threshold, a model can be trained to perform key-value extraction using the updated graphs.
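The iterative pseudo-labeling described in this abstract can be sketched roughly as below. Everything here is an assumption made for illustration: the agreement rule in `cross_reference` (keep a label only when both sources agree), the `label_change_rate` measure, and the `model_labeler`, `graph_labeler`, and `retrain` callables are hypothetical and are not taken from the patent.

```python
def cross_reference(model_labels, graph_labels):
    """Keep a pseudo-label only where both sources agree; otherwise leave it unset."""
    return {node: (label if graph_labels.get(node) == label else None)
            for node, label in model_labels.items()}


def label_change_rate(previous, current):
    """Fraction of nodes whose label differs from the previous round."""
    if not current:
        return 0.0
    changed = sum(1 for node in current if previous.get(node) != current[node])
    return changed / len(current)


def refine_labels(nodes, model_labeler, graph_labeler, retrain,
                  threshold=0.05, max_rounds=20):
    """Repeat labeling and retraining until the change in labels falls below `threshold`."""
    previous = {}
    for round_index in range(max_rounds):
        model_labels = {n: model_labeler(n) for n in nodes}   # model-labeled
        graph_labels = {n: graph_labeler(n) for n in nodes}   # graph-labeled
        updated = cross_reference(model_labels, graph_labels)
        rate = label_change_rate(previous, updated)
        if round_index > 0 and rate < threshold:
            return updated, round_index
        model_labeler = retrain(updated)   # train on the updated labels
        previous = updated
    return previous, max_rounds
```

Here nodes are plain strings for brevity; in practice each node would carry the per-word features mentioned in the abstract.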
-
Publication No.: US12106595B2
Publication date: 2024-10-01
Application No.: US18379091
Filing date: 2023-10-11
Inventors: Amit Agarwal, Kulbhushan Pachauri
IPC classification: G06V30/414, G06V30/19
CPC classification: G06V30/414, G06V30/19147, G06V30/19173, G06V30/19187
Abstract: A computing device may access visually rich documents comprising an image and metadata. A graph, based on the image or metadata, can be generated for a visually rich document. The graph's nodes can correspond to words from the visually rich document. Features for nodes can be determined by the device. The device may generate model-labeled graphs by assigning a pseudo-label to nodes using a pretrained model. The device may generate a plurality of graph-labeled graphs by assigning a pseudo-label to nodes by matching a first node from a first graph to at least a second node from a second graph. The device may generate a plurality of updated graphs by cross-referencing labels from the model-labeled graphs and the graph-labeled graphs. Until a change in labels is below a threshold, a model can be trained to perform key-value extraction using the updated graphs.
-
Publication No.: US20240289551A1
Publication date: 2024-08-29
Application No.: US18240480
Filing date: 2023-08-31
IPC classification: G06F40/284
CPC classification: G06F40/284
Abstract: In some implementations, techniques described herein may include identifying text in a visually rich document and determining a sequence for the identified text. The techniques may include selecting a language model based at least in part on the identified text and the determined sequence. Moreover, the techniques may include assigning each word of the identified text to a respective token to generate textual features corresponding to the identified text. The techniques may include extracting visual features corresponding to the identified text. The techniques may include determining positional features for each word of the identified text. The techniques may include generating a graph representing the visually rich document, each node in the graph representing each of the visual features, textual features, and positional features of a respective word of the identified text. The techniques may include training a classifier on the graph to classify each respective word of the identified text.
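The graph construction in this abstract (one node per word, carrying textual, visual, and positional features) might look like the following sketch. The `Word` type, the top-to-bottom reading order, the relative-height stand-in for a visual feature, and the k-nearest-neighbour edges are all assumptions of this example; the abstract does not specify them, and language-model selection and classifier training are omitted.

```python
import math
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    x: float   # left edge of the bounding box
    y: float   # top edge of the bounding box
    w: float
    h: float


def build_graph(words, page_w, page_h, k=2):
    """Return (nodes, edges) for one page; each node bundles three feature groups."""
    # Assumed reading order: top-to-bottom, then left-to-right.
    ordered = sorted(words, key=lambda wd: (wd.y, wd.x))
    vocab = {}
    nodes = []
    for wd in ordered:
        token_id = vocab.setdefault(wd.text.lower(), len(vocab))
        nodes.append({
            "text": wd.text,
            "textual": token_id,                          # token assigned to the word
            "visual": (wd.h / page_h,),                   # stand-in visual feature
            "positional": (wd.x / page_w, wd.y / page_h), # normalized position
        })
    # Assumed connectivity: link each node to its k nearest neighbours on the page.
    edges = set()
    for i, a in enumerate(nodes):
        nearest = sorted(
            (math.dist(a["positional"], b["positional"]), j)
            for j, b in enumerate(nodes) if j != i
        )
        for _, j in nearest[:k]:
            edges.add((min(i, j), max(i, j)))
    return nodes, sorted(edges)
```

A node classifier would then be trained on `nodes` and `edges`; a real system would likely replace the stand-in visual feature with features extracted from the word's image crop.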
-
Publication No.: US20240037973A1
Publication date: 2024-02-01
Application No.: US18379091
Filing date: 2023-10-11
Inventors: Amit Agarwal, Kulbhushan Pachauri
IPC classification: G06V30/414, G06V30/19
CPC classification: G06V30/414, G06V30/19187, G06V30/19173, G06V30/19147
Abstract: A computing device may access visually rich documents comprising an image and metadata. A graph, based on the image or metadata, can be generated for a visually rich document. The graph's nodes can correspond to words from the visually rich document. Features for nodes can be determined by the device. The device may generate model-labeled graphs by assigning a pseudo-label to nodes using a pretrained model. The device may generate a plurality of graph-labeled graphs by assigning a pseudo-label to nodes by matching a first node from a first graph to at least a second node from a second graph. The device may generate a plurality of updated graphs by cross-referencing labels from the model-labeled graphs and the graph-labeled graphs. Until a change in labels is below a threshold, a model can be trained to perform key-value extraction using the updated graphs.
-
Publication No.: US20230326224A1
Publication date: 2023-10-12
Application No.: US17714806
Filing date: 2022-04-06
Inventors: Amit Agarwal, Kulbhushan Pachauri
IPC classification: G06V30/414, G06V30/19
CPC classification: G06V30/414, G06V30/19147, G06V30/19173, G06V30/19187
Abstract: A computing device may access visually rich documents comprising an image and metadata. A graph, based on the image or metadata, can be generated for a visually rich document. The graph's nodes can correspond to words from the visually rich document. Features for nodes can be determined by the device. The device may generate model-labeled graphs by assigning a pseudo-label to nodes using a pretrained model. The device may generate a plurality of graph-labeled graphs by assigning a pseudo-label to nodes by matching a first node from a first graph to at least a second node from a second graph. The device may generate a plurality of updated graphs by cross-referencing labels from the model-labeled graphs and the graph-labeled graphs. Until a change in labels is below a threshold, a model can be trained to perform key-value extraction using the updated graphs.
-
Publication No.: US20230146501A1
Publication date: 2023-05-11
Application No.: US17524157
Filing date: 2021-11-11
Inventors: Amit Agarwal, Kulbhushan Pachauri, Iman Zadeh, Jun Qian
CPC classification: G06K9/00442, G06N20/00, G06K2209/01
Abstract: A computing device may receive a set of user documents. Data may be extracted from the documents to generate a first graph data structure with one or more initial graphs containing key-value pairs. A model may be trained on the first graph data structure to classify the pairs. Until a set of evaluation metrics for the model exceeds a set of deployment thresholds, the following may be repeated: a set of evaluation metrics may be generated for the model and compared to the set of deployment thresholds. In response to a determination that the set of evaluation metrics is below the set of deployment thresholds, one or more new graphs may be generated from the one or more initial graphs in the first graph data structure to produce a second graph data structure. The first and second graph data structures can be used to train the model.
-
Publication No.: US20240144081A1
Publication date: 2024-05-02
Application No.: US18051419
Filing date: 2022-10-31
Inventors: Sandeep Jana, Edwin Thomas, Kulbhushan Pachauri
IPC classification: G06N20/00, G06V10/774
CPC classification: G06N20/00, G06V10/774
Abstract: Continual learning techniques are described for extending the capabilities of a base model, which is trained to predict a set of existing or base classes, to generate a target model that is capable of making predictions both for the existing or base classes and for new or custom classes. The techniques described herein enable the target model to be trained such that the model can make predictions involving both base classes and custom classes with high levels of accuracy.
-
Publication No.: US20240005640A1
Publication date: 2024-01-04
Application No.: US17994712
Filing date: 2022-11-28
Inventors: Amit Agarwal, Srikant Panda, Kulbhushan Pachauri
IPC classification: G06V10/774, G06V30/414, G06V30/413
CPC classification: G06V10/774, G06V30/413, G06V30/414
Abstract: Embodiments described herein are directed towards a synthetic document generation pipeline for training artificial intelligence models. One embodiment includes a method in which a device receives an instruction to generate a document to be used as a training instance for a first machine learning model, the instruction including an element configuration, a document class configuration, a format configuration, an augmentation configuration, and data bias and fairness. The device can receive an element from an interface based at least in part on the element configuration; the element can simulate a real-world image, real-world text, or real-world machine-readable visual code. The device can generate metadata describing a layout for the element on the document based on the document class configuration. The device can generate the document by arranging the element on the document based on the metadata, wherein the document is generated in a format based on the format configuration.
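The two generation steps named in this abstract (layout metadata derived from a document class, then arranging elements in a chosen format) can be sketched as below. The `LAYOUTS` table, the page dimensions, the vertical jitter, and the JSON output format are hypothetical choices made for this example and are not taken from the patent.

```python
import json
import random

# Hypothetical document-class configuration: which element kinds appear, top to bottom.
LAYOUTS = {
    "invoice": ["logo", "title", "table", "barcode"],
    "receipt": ["title", "table"],
}


def generate_layout(document_class, page_w=612, page_h=792, margin=36, seed=0):
    """Produce layout metadata: one bounding box per element kind for the class."""
    rng = random.Random(seed)
    slots = LAYOUTS[document_class]
    slot_h = (page_h - 2 * margin) / len(slots)
    metadata = []
    for i, kind in enumerate(slots):
        jitter = rng.uniform(0, 0.1) * slot_h   # small random offset for variety
        metadata.append({
            "kind": kind,
            "x": margin,
            "y": margin + i * slot_h + jitter,
            "w": page_w - 2 * margin,
            "h": slot_h * 0.8,
        })
    return metadata


def render(metadata, elements, fmt="json"):
    """Arrange the supplied elements according to the metadata, in the requested format."""
    placed = [dict(box, content=elements[box["kind"]]) for box in metadata]
    if fmt == "json":
        return json.dumps({"elements": placed})
    raise ValueError(f"unsupported format: {fmt}")
```

Because the layout metadata is kept alongside the content, the same structure could double as ground-truth annotation for the generated training instance.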
-