-
1.
Publication No.: US20240013001A1
Publication Date: 2024-01-11
Application No.: US18349183
Filing Date: 2023-07-10
Inventors: Jia WANG, Jing-Cheng KE, Wen-Huang CHENG, Hong-Han SHUAI, Yung-Hui LI
IPC Classification: G06F40/295, G06V30/146, G06V30/19
CPC Classification: G06F40/295, G06V30/147, G06V30/19187
Abstract: A recognition method includes the following steps. A text is analyzed by a language recognition network to generate an entity feature, a relation feature and an overall feature. An input image is analyzed by an object detection network to generate candidate regions. Node features, aggregated edge features and compound features are generated by an enhanced cross-modal graph attention network according to the entity feature, the relation feature, the candidate regions and the overall feature. The entity feature and the relation feature are matched to the node features and the aggregated edge features to generate first scores. The overall feature is matched to the compound features to generate second scores. Final scores corresponding to the candidate regions are generated according to the first scores and the second scores.
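As a rough illustration of the last step in this abstract, the sketch below combines per-region first and second scores into final scores and picks the highest-scoring candidate region. The weighted sum, the `alpha` parameter, and the function names `final_scores` and `best_region` are assumptions made for illustration; the abstract does not state how the two score sets are combined.

```python
def final_scores(first_scores, second_scores, alpha=0.5):
    """Combine two per-region score lists with a weighted sum (an assumed rule)."""
    if len(first_scores) != len(second_scores):
        raise ValueError("score lists must cover the same candidate regions")
    return [alpha * f + (1.0 - alpha) * s
            for f, s in zip(first_scores, second_scores)]


def best_region(first_scores, second_scores, alpha=0.5):
    """Return the index of the candidate region with the highest final score."""
    scores = final_scores(first_scores, second_scores, alpha)
    return max(range(len(scores)), key=scores.__getitem__)
```

With three candidate regions, `best_region([0.2, 0.9, 0.4], [0.3, 0.8, 0.1])` selects the second region, since it leads on both score sets.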
-
2.
Publication No.: US11823478B2
Publication Date: 2023-11-21
Application No.: US17714806
Filing Date: 2022-04-06
Inventors: Amit Agarwal, Kulbhushan Pachauri
IPC Classification: G06V30/414, G06V30/19
CPC Classification: G06V30/414, G06V30/19147, G06V30/19173, G06V30/19187
Abstract: A computing device may access visually rich documents comprising an image and metadata. A graph, based on the image or metadata, can be generated for a visually rich document. The graph's nodes can correspond to words from the visually rich document. Features for nodes can be determined by the device. The device may generate model labeled graphs by assigning a pseudo-label to nodes using a pretrained model. The device may generate a plurality of graph labeled graphs by assigning a pseudo-label to nodes by matching a first node from a first graph to at least a second node from a second graph. The device may generate a plurality of updated graphs by cross referencing labels from the model labeled graphs and the graph labeled graphs. Until a change in labels is below a threshold, a model can be trained to perform key-value extraction using the updated graphs.
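The iterative loop described in this abstract can be sketched roughly as follows. The agreement rule in `cross_reference`, the change metric in `label_change`, and the three callables passed to `train_until_stable` are hypothetical stand-ins; the abstract only says that labels from the two sources are cross referenced and that training continues until the change in labels falls below a threshold.

```python
def cross_reference(model_labels, graph_labels):
    """Keep a pseudo-label only where both sources agree (an assumed rule)."""
    return [m if m == g else None for m, g in zip(model_labels, graph_labels)]


def label_change(old, new):
    """Fraction of nodes whose label differs between two rounds."""
    if not old:
        return 0.0
    return sum(o != n for o, n in zip(old, new)) / len(old)


def train_until_stable(label_with_model, label_with_graph, train,
                       threshold=0.01, max_rounds=20):
    """Alternate training and relabeling until labels change by less than `threshold`."""
    labels = cross_reference(label_with_model(), label_with_graph())
    for _ in range(max_rounds):
        train(labels)  # hypothetical training step on the current labels
        updated = cross_reference(label_with_model(), label_with_graph())
        if label_change(labels, updated) < threshold:
            return updated
        labels = updated
    return labels
```

The `max_rounds` cap is an added safeguard so the loop ends even if the labels never settle.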
-
3.
Publication No.: US20240265717A1
Publication Date: 2024-08-08
Application No.: US18105493
Filing Date: 2023-02-03
Inventors: Robert R. Price
IPC Classification: G06V30/18, G06F16/55, G06N7/01, G06V30/186, G06V30/19
CPC Classification: G06V30/18095, G06F16/55, G06N7/01, G06V30/186, G06V30/19187
Abstract: A system and method for robust estimation of state parameters from internal readings in a sequence of images are provided. Various techniques can be implemented to address observation noise and/or underlying process noise to stabilize the readings.
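The abstract does not name a specific technique. As one possible illustration of handling both observation noise and process noise, the sketch below applies a one-dimensional Kalman filter to a sequence of numeric readings. The choice of a Kalman filter, the function name `kalman_1d`, and the variance defaults are all assumptions, not details taken from the patent.

```python
def kalman_1d(readings, process_var=1e-3, obs_var=0.5, initial_var=1.0):
    """Smooth a sequence of scalar readings with a constant-state Kalman filter."""
    if not readings:
        return []
    estimate = readings[0]   # start from the first reading
    variance = initial_var   # uncertainty of the current estimate
    smoothed = [estimate]
    for z in readings[1:]:
        variance += process_var                  # predict: the state may drift
        gain = variance / (variance + obs_var)   # trust in the new reading
        estimate += gain * (z - estimate)        # update toward the reading
        variance *= (1.0 - gain)                 # uncertainty shrinks after update
        smoothed.append(estimate)
    return smoothed
```

A single outlier such as the 50.0 in `[10.0, 10.0, 50.0, 10.0]` is pulled toward the running estimate rather than passed through unchanged.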
-
4.
Publication No.: US11972625B2
Publication Date: 2024-04-30
Application No.: US17740695
Filing Date: 2022-05-10
Applicant: Dell Products L.P.
Inventors: Saurabh Jha, Atul Kumar
CPC Classification: G06V30/41, G06V30/19147, G06V30/19187, G06V30/19193
Abstract: Methods, apparatus, and processor-readable storage media for character-based representation learning for table data extraction using artificial intelligence techniques are provided herein. An example computer-implemented method includes identifying, from unstructured documents comprising tabular data, items of text and corresponding document position information using artificial intelligence-based text extraction techniques; generating an intermediate output by implementing character embedding with respect to the unstructured documents using an artificial intelligence-based encoder; determining structure-related information for the unstructured documents using one or more artificial intelligence-based graph-related techniques by inferring columns from the tabular data; generating a character-based representation of the unstructured documents using an artificial intelligence-based decoder by converting the inferred columns into one or more line items; classifying portions of the character-based representation using artificial intelligence-based statistical modeling techniques; and performing one or more automated actions based on the classifying.
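To make the column-inference step more concrete, here is a small sketch that infers columns from text positions with a plain graph technique: items whose horizontal extents overlap are linked, and each connected component is treated as one column. This overlap graph is an assumed stand-in for the "graph-related techniques" named in the abstract, and the `(text, x_left, x_right)` item layout and the name `infer_columns` are hypothetical.

```python
def infer_columns(items):
    """Group (text, x_left, x_right) items into columns.

    Columns are the connected components of a graph that links any two
    items whose horizontal extents overlap.
    """
    n = len(items)
    adjacency = {i: [] for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if items[i][1] <= items[j][2] and items[j][1] <= items[i][2]:
                adjacency[i].append(j)
                adjacency[j].append(i)

    seen, columns = set(), []
    # Visit items left to right so columns come out in reading order.
    for start in sorted(range(n), key=lambda i: items[i][1]):
        if start in seen:
            continue
        stack, component = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        # Keep the original item order inside each column.
        columns.append([items[i][0] for i in sorted(component)])
    return columns
```

Each returned column keeps its items in input order, so a caller that supplies items top to bottom could zip the columns back into line items.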
-
5.
Publication No.: US20230368556A1
Publication Date: 2023-11-16
Application No.: US17740695
Filing Date: 2022-05-10
Applicant: Dell Products L.P.
Inventors: Saurabh Jha, Atul Kumar
CPC Classification: G06V30/41, G06V30/19187, G06V30/19147, G06V30/19193
Abstract: Methods, apparatus, and processor-readable storage media for character-based representation learning for table data extraction using artificial intelligence techniques are provided herein. An example computer-implemented method includes identifying, from unstructured documents comprising tabular data, items of text and corresponding document position information using artificial intelligence-based text extraction techniques; generating an intermediate output by implementing character embedding with respect to the unstructured documents using an artificial intelligence-based encoder; determining structure-related information for the unstructured documents using one or more artificial intelligence-based graph-related techniques by inferring columns from the tabular data; generating a character-based representation of the unstructured documents using an artificial intelligence-based decoder by converting the inferred columns into one or more line items; classifying portions of the character-based representation using artificial intelligence-based statistical modeling techniques; and performing one or more automated actions based on the classifying.
-
6.
Publication No.: US12106595B2
Publication Date: 2024-10-01
Application No.: US18379091
Filing Date: 2023-10-11
Inventors: Amit Agarwal, Kulbhushan Pachauri
IPC Classification: G06V30/414, G06V30/19
CPC Classification: G06V30/414, G06V30/19147, G06V30/19173, G06V30/19187
Abstract: A computing device may access visually rich documents comprising an image and metadata. A graph, based on the image or metadata, can be generated for a visually rich document. The graph's nodes can correspond to words from the visually rich document. Features for nodes can be determined by the device. The device may generate model labeled graphs by assigning a pseudo-label to nodes using a pretrained model. The device may generate a plurality of graph labeled graphs by assigning a pseudo-label to nodes by matching a first node from a first graph to at least a second node from a second graph. The device may generate a plurality of updated graphs by cross referencing labels from the model labeled graphs and the graph labeled graphs. Until a change in labels is below a threshold, a model can be trained to perform key-value extraction using the updated graphs.
-
7.
Publication No.: US11956254B1
Publication Date: 2024-04-09
Application No.: US17342417
Filing Date: 2021-06-08
Applicant: Arceo Labs Inc.
Inventors: Ann Irvine, Robert Mealey, Russell Snyder
CPC Classification: H04L63/1416, G06V30/41, G06V30/19187
Abstract: Generating a cybersecurity risk model using sparse data is disclosed, including: obtaining signals associated with a cybersecurity risk, wherein the obtained signals include technographic signals and query derived signals obtained from queries; generating pseudo signals based at least in part on a priori factors relating to the cybersecurity risk; and combining the pseudo signals and the obtained signals into a Bayesian model indicating the cybersecurity risk.
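One simple way to picture combining pseudo signals with sparse observed signals in a Bayesian model is a Beta-Bernoulli update, where the pseudo signals act as prior pseudo-counts. This model choice, the binary encoding of observed signals, and the name `posterior_risk` are assumptions for illustration only; the abstract does not specify the form of the Bayesian model.

```python
def posterior_risk(prior_incidents, prior_non_incidents, observed):
    """Posterior mean risk under a Beta-Bernoulli model.

    prior_incidents / prior_non_incidents: pseudo-counts derived from
    a priori factors (the "pseudo signals" in this sketch).
    observed: iterable of 0/1 indicators from the obtained signals.
    """
    observed = list(observed)
    hits = sum(observed)
    alpha = prior_incidents + hits
    beta = prior_non_incidents + len(observed) - hits
    return alpha / (alpha + beta)
```

With no observations the estimate equals the prior mean, and each observed signal shifts it only slightly, which is the behavior one would want when data are sparse.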
-
8.
Publication No.: US20240013562A1
Publication Date: 2024-01-11
Application No.: US18148947
Filing Date: 2022-12-30
Applicant: Nielsen Consumer LLC
IPC Classification: G06V30/19, G06F16/901, G06V30/14, G06V30/148, G06V10/44, G06V10/82
CPC Classification: G06V30/19187, G06F16/9024, G06V30/1448, G06V30/19107, G06V30/153, G06V10/44, G06V10/82
Abstract: Methods, apparatus, systems, and articles of manufacture are disclosed that determine related content. An example apparatus includes processor circuitry to generate a segment-level graph by sampling segment-level edges among segment nodes representing text segments, the segment-level graph including segment node embeddings representing features of the segment nodes; cluster the text segments to form entities by applying a first GAN-based model to the segment-level graph to update the segment node embeddings; generate a multi-level graph by (a) generating an entity-level graph including hypernodes representing the entities and sampled entity edges connecting ones of the hypernodes, and (b) connecting the segment nodes to respective ones of the hypernodes using relation edges; generate hypernode embeddings by propagating the updated segment node embeddings using a relation graph; and cluster the entities by product by applying a second GAN-based model to the multi-level graph, the multi-level graph to generate updated hypernode embeddings.
-
9.
Publication No.: US20230282016A1
Publication Date: 2023-09-07
Application No.: US17898678
Filing Date: 2022-08-30
Inventors: Huihui HE, Jiayang WANG, Yubo XIANG
CPC Classification: G06V30/19187, G06N3/08, G06T9/002, G06V30/19147, G06V30/1916
Abstract: Provided are a method for recognizing a receipt, an electronic device, and a storage medium, which relate to the fields of deep learning and pattern recognition. The method may include: acquiring a target receipt to be recognized; encoding two-dimensional position information of each of multiple text blocks on the target receipt to obtain multiple encoding results; performing graph convolution on each of the encoding results to obtain multiple convolution results; and recognizing each of the convolution results based on a first conditional random field model to obtain a first prediction result at the text-block level of the target receipt, wherein the first conditional random field model and a second conditional random field model are co-trained so as to obtain a second prediction result at the token level of the target receipt.
-
10.
Publication No.: US20240331425A1
Publication Date: 2024-10-03
Application No.: US18460680
Filing Date: 2023-09-04
Applicant: ZHEJIANG LAB
Inventors: Yao QI, Hongyang CHEN, Jingsong LV, Wentao YANG
CPC Classification: G06V30/19187, G06N5/02, G06V30/19173
Abstract: A method, a device, computer equipment, and a storage medium for identifying an illegal commodity. The method comprises: constructing a multi-modal knowledge graph from a multi-modal knowledge graph data set, and extracting visual features of all visual-modality entities and text features of all text-modality entities in the knowledge graph; obtaining a commodity image and a commodity text from a database; generating a commodity visual feature from the commodity image; generating a commodity text feature from the commodity text; linking the commodity image and the commodity text to the knowledge graph by an entity linking method, according to the visual features and text features as well as the commodity visual feature and the commodity text feature; and obtaining the correlation between the commodity image and the commodity text from the linked knowledge graph to determine the illegality of the commodity.