-
Publication No.: US20230260302A1
Publication Date: 2023-08-17
Application No.: US18300131
Filing Date: 2023-04-13
Applicant: PAYPAL, INC.
Inventors: Xiaodong Yu, Hewen Wang
IPC Classification: G06V30/18, G06Q20/40, G06Q10/0833, G06F16/901, G06N3/04, G06Q10/083, G06V10/44, G06V10/46, G06F18/214, G06V30/14
CPC Classification: G06V30/18181, G06Q20/407, G06Q10/0833, G06F16/9024, G06N3/04, G06Q10/0838, G06V10/457, G06V10/464, G06F18/214, G06V30/1444
Abstract: Methods and systems are presented for extracting categorizable information from an image using a graph that models data within the image. Upon receiving an image, a data extraction system identifies characters in the image. The data extraction system then generates bounding boxes that enclose adjacent characters that are related to each other in the image. The data extraction system also creates connections between the bounding boxes based on locations of the bounding boxes. A graph is generated based on the bounding boxes and the connections such that the graph can accurately represent the data in the image. The graph is provided to a graph neural network that is configured to analyze the graph and produce an output. The data extraction system may categorize the data in the image based on the output.
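As a rough illustration of the graph-construction step described in this abstract, the following Python sketch links bounding boxes whose centers lie close together. The `Box` type, the `build_graph` helper, and the center-distance rule with its `max_dist` threshold are assumptions made for illustration; the abstract only says that connections are based on the locations of the boxes.

```python
from dataclasses import dataclass
from itertools import combinations
from math import hypot


@dataclass(frozen=True)
class Box:
    """Hypothetical bounding box: recognized text plus corner coordinates."""
    text: str
    x0: float
    y0: float
    x1: float
    y1: float

    @property
    def center(self):
        return ((self.x0 + self.x1) / 2, (self.y0 + self.y1) / 2)


def build_graph(boxes, max_dist):
    """Return an adjacency dict linking boxes whose centers are within max_dist.

    The distance rule is an assumed stand-in for the location-based
    connection criterion, which the abstract does not specify.
    """
    adjacency = {i: set() for i in range(len(boxes))}
    for i, j in combinations(range(len(boxes)), 2):
        (ax, ay), (bx, by) = boxes[i].center, boxes[j].center
        if hypot(ax - bx, ay - by) <= max_dist:
            adjacency[i].add(j)
            adjacency[j].add(i)
    return adjacency


boxes = [
    Box("Total", 10, 100, 60, 120),
    Box("$42.00", 70, 100, 130, 120),
    Box("Thank you", 10, 300, 100, 320),
]
graph = build_graph(boxes, max_dist=100)
```

In this toy layout the label and the amount on the same row become neighbors, while the distant footer text stays isolated. A graph neural network would then consume the adjacency structure together with per-box features.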
-
Publication No.: US20240153296A1
Publication Date: 2024-05-09
Application No.: US17983908
Filing Date: 2022-11-09
Applicant: Paypal, Inc.
Inventors: Yanfei Dong, Yuan Deng, Jiazheng Zhang, Francesco Gelli, Ting Lin, Yuzhen Zhuo, Hewen Wang, Soujanya Poria
IPC Classification: G06V30/19, G06V10/74, G06V30/18, G06V30/414
CPC Classification: G06V30/1916, G06V10/761, G06V30/18181, G06V30/414
Abstract: A method of categorizing text entries on a document can include determining, for each of a plurality of text bounding boxes in the document, respective text, respective coordinates, and respective input embeddings. The method may further include defining a graph of the plurality of bounding boxes, the graph comprising a plurality of connections among the plurality of bounding boxes, each connection comprising a first and second bounding box and zero or more respective intermediate bounding boxes. The method may further include determining a respective attention value for each connection according to a quantity of intermediate bounding boxes in the connection and, based on the respective attention values and a transformer-based machine learning model applied to the respective input embeddings and respective coordinates, determining output embeddings for each bounding box and, based on the respective output embeddings, generating a bounding box label for each bounding box.
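One way to picture an attention value that depends on the number of intermediate bounding boxes is sketched below in Python. The breadth-first search, the `intermediate_counts` and `attention_value` helpers, and the exponential `decay` factor are all assumptions chosen for illustration; the abstract does not state how the value is computed from the count.

```python
from collections import deque


def intermediate_counts(adjacency, source):
    """Count intermediate boxes on the shortest path from source to each box.

    Uses breadth-first search; a direct neighbor has zero intermediates.
    """
    hops = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adjacency[node]:
            if nxt not in hops:
                hops[nxt] = hops[node] + 1
                queue.append(nxt)
    return {n: h - 1 for n, h in hops.items() if n != source}


def attention_value(n_intermediate, decay=0.5):
    """Assumed rule: attention shrinks geometrically with each intermediate box."""
    return decay ** n_intermediate


# Four boxes in a chain: 0 - 1 - 2 - 3
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
counts = intermediate_counts(adjacency, 0)
weights = {n: attention_value(c) for n, c in counts.items()}
```

Under these assumptions, box 1 (adjacent to box 0) receives the full weight, and each additional intermediate box halves it. A transformer-based model could use such values to bias its attention scores between box embeddings.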
-
Publication No.: US11989962B2
Publication Date: 2024-05-21
Application No.: US17559643
Filing Date: 2021-12-22
Inventors: Chao Ma, Jingshuai Zhang, Qifan Huang, Kaichun Yao, Peng Wang, Hengshu Zhu
IPC Classification: G06V30/00, G06V30/148, G06V30/18, G06V30/262, G06V30/41
CPC Classification: G06V30/153, G06V30/18181, G06V30/274, G06V30/41
Abstract: A method, an apparatus, a device, a storage medium and a program product of performing a text matching are provided, which relate to a field of a computer technology, and in particular to natural language processing and deep learning technologies. The method includes: determining a word set and a plurality of semantic units from a text set, the word set is associated with a first predetermined attribute, and the text set contains a plurality of first texts indicating an object information and a plurality of second texts indicating an object demand information; generating a graph; and generating a final feature representation associated with the text set and the word set based on the graph and a graph convolution model, so as to perform the text matching.
-
Publication No.: US20240046677A1
Publication Date: 2024-02-08
Application No.: US17814856
Filing Date: 2022-07-26
Inventors: Ang Yi, Jing Zhang, Hai Cheng Wang, Jun Hong Zhao, Rajesh M. Desai, Yang Zhong Li, Xue Xu
IPC Classification: G06V30/148, G06V30/18
CPC Classification: G06V30/153, G06V30/18181
Abstract: A computer-implemented method for text block segmentation includes determining a first text block segmentation pattern utilized to generate a segmented text block based, at least in part, on a comparison of semantic information associated with the segmented text block and a plurality of predefined types of text block segmentation patterns indicated by a graph; calculating a first degree of confidence in a size of the segmented text block based, at least in part, on comparing semantic entities associated with the segmented text block with semantic entities indicated by leaf nodes stemming from a first non-leaf node included in the graph and representative of the first type of text block segmentation pattern; and determining that the size of the segmented text block is non-optimal based on the calculated degree of confidence in the size of the segmented text block being below a predetermined threshold.
-
Publication No.: US20230343096A1
Publication Date: 2023-10-26
Application No.: US17975897
Filing Date: 2022-10-28
Applicant: GMDSOFT Inc.
Inventors: Hyun Soo KIM, Kyung Su LEE, Chang Ha LEE, Jae Min JANG
CPC Classification: G06V20/41, G06V30/1448, G06V20/49, G06V10/62, G06V30/18181, G06V10/945, G06V30/166
Abstract: The present disclosure relates to technology for automatically searching and recovering the recovery area of frames corresponding to a desired time for large-capacity video evidence using a time map generated through an optical character recognition (OCR) function. A digital forensic apparatus for searching and recovering a recovery target area for large-capacity video evidence using a time map according to an embodiment of the present disclosure may include a division recovery device for collecting video evidence from a storage device, dividing the collected video evidence into a plurality of spaces in consideration of the physical space of the storage device, and recovering a representative frame in each of the divided spaces; a time information recognizer for recognizing time information from the recovered representative frame using an optical character recognition (OCR) function; a time map generator for generating a time map in which the divided spaces are arranged according to a time criterion based on the recognized time information; and a selective recovery device for searching a recovery target area by matching specific time information input by a user with the generated time map and recovering the searched recovery target area.
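The time-map lookup described in this abstract can be pictured with a short Python sketch. The `build_time_map` and `find_region` helpers, the region numbering, and the rule "pick the region whose representative frame has the latest time not after the requested time" are assumptions for illustration; the OCR step is replaced by hard-coded timestamps.

```python
from bisect import bisect_right
from datetime import datetime


def build_time_map(region_times):
    """Arrange divided storage regions by the time read from their representative frame."""
    return sorted((t, region) for region, t in region_times.items())


def find_region(time_map, target):
    """Return the region whose representative time is the latest one <= target.

    Returns None when the target precedes every recorded time.
    """
    times = [t for t, _ in time_map]
    idx = bisect_right(times, target) - 1
    return time_map[idx][1] if idx >= 0 else None


# Hypothetical OCR results: region index -> timestamp recognized in its frame
region_times = {
    0: datetime(2023, 1, 1, 12, 0),
    1: datetime(2023, 1, 1, 8, 0),
    2: datetime(2023, 1, 1, 10, 0),
}
time_map = build_time_map(region_times)
region = find_region(time_map, datetime(2023, 1, 1, 10, 30))
```

Sorting once and searching with `bisect` keeps each lookup logarithmic in the number of regions, which matters when only the matching region of a large storage device should be recovered.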
-
Publication No.: US20240290122A1
Publication Date: 2024-08-29
Application No.: US18175077
Filing Date: 2023-02-27
Applicant: Innoplexus AG
Inventors: Oliver Pfante, Akhil Nasser
IPC Classification: G06V30/412, G06V30/18, G06V30/414
CPC Classification: G06V30/412, G06V30/18181, G06V30/414
Abstract: A method for processing documents for enhanced search includes identifying a set of bounding boxes in the document. The method further includes defining one or more pairs of bounding boxes in the document. Each pair of bounding boxes is defined by a binary relation. The method further includes constructing a directed acyclic graph (DAG) from the one or more pairs of bounding boxes. The method further includes determining a topological sorting of each bounding box in the document based on the DAG. The topological sorting defines an adjacency relationship between the bounding boxes in the document. The method further includes extracting key-value pairs from the document based on the adjacency relationship between the bounding boxes in the document. The method further includes storing the key-value pairs in a key-value pair database.
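The DAG and topological-sort steps in this abstract can be sketched in Python with the standard-library `graphlib` module (Python 3.9+). The sample pairs, the "precedes" interpretation of the binary relation, and the trailing-colon heuristic for spotting keys are assumptions for illustration, not details taken from the abstract.

```python
from graphlib import TopologicalSorter

# Hypothetical ordered pairs (a, b): box a precedes box b in reading order.
pairs = [
    ("Name:", "Alice"),
    ("Alice", "Date:"),
    ("Date:", "2024-01-01"),
]

# Build the DAG; TopologicalSorter.add takes a node followed by its predecessors.
sorter = TopologicalSorter()
for before, after in pairs:
    sorter.add(after, before)

# The topological order defines which boxes are adjacent to each other.
order = list(sorter.static_order())

# Assumed heuristic: a box ending in ":" is a key and the next box is its value.
key_values = {
    prev.rstrip(":"): nxt
    for prev, nxt in zip(order, order[1:])
    if prev.endswith(":")
}
```

Because the example relation forms a single chain, the topological order is unique. With a branching DAG several valid orders exist, so a real system would need a tie-breaking rule, for example box coordinates.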
-
Publication No.: US12050522B2
Publication Date: 2024-07-30
Application No.: US17577711
Filing Date: 2022-01-18
CPC Classification: G06F11/1476, G06N3/04, G06V30/18181
Abstract: Herein is machine learning for anomalous graph detection based on graph embedding, shuffling, comparison, and unsupervised training techniques that can characterize an unfamiliar graph. In an embodiment, a computer obtains many known vectors that respectively represent known graphs. A new vector is generated that represents a new graph that contains multiple vertices. The new vector may contain an arithmetic aggregation of vertex vectors that respectively represent multiple vertices and/or a vector that represents a virtual vertex that is connected to the multiple vertices by respective virtual edges. In the many known vectors, some similar vectors that are similar to the new vector are identified. The new graph is automatically characterized based on a subset of the known graphs that the similar vectors represent.
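A minimal Python sketch of the "aggregate vertex vectors, then find similar known vectors" idea follows. The choice of the mean as the arithmetic aggregation, cosine similarity as the comparison, and the names `mean_vector`, `cosine`, and `most_similar` are assumptions for illustration; the abstract names neither a specific aggregation nor a similarity measure.

```python
from math import sqrt


def mean_vector(vertex_vectors):
    """Aggregate per-vertex vectors into one graph vector by element-wise mean."""
    n = len(vertex_vectors)
    return [sum(col) / n for col in zip(*vertex_vectors)]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))


def most_similar(new_vec, known, k):
    """Return the names of the k known graph vectors closest to new_vec."""
    ranked = sorted(known, key=lambda name: cosine(new_vec, known[name]), reverse=True)
    return ranked[:k]


# Hypothetical embeddings of already characterized graphs
known = {
    "normal_a": [1.0, 0.0],
    "normal_b": [0.9, 0.1],
    "anomalous": [0.0, 1.0],
}
new_vec = mean_vector([[1.0, 0.0], [0.8, 0.2]])
neighbors = most_similar(new_vec, known, k=2)
```

The new graph could then be characterized from the labels of its nearest neighbors, here both "normal" examples, which mirrors the final step of the abstract.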
-
Publication No.: US11775845B2
Publication Date: 2023-10-03
Application No.: US17209380
Filing Date: 2021-03-23
Inventors: Xiaoqiang Zhang, Chengquan Zhang, Shanshan Liu
IPC Classification: G06N5/00, G06N5/022, G06V30/148, G06V30/262, G06V30/18, G06V30/196, G06V10/764, G06V20/62, G06V30/10
CPC Classification: G06N5/022, G06V10/764, G06V20/62, G06V30/153, G06V30/18181, G06V30/1988, G06V30/274, G06V30/10
Abstract: A character recognition method, a character recognition apparatus, an electronic device and a computer readable storage medium are disclosed. The character recognition method includes: determining semantic information and first position information of each individual character recognized from an image; constructing a graph network according to the semantic information and the first position information of each individual character; and determining a character recognition result of the image according to a feature of each individual character calculated by the graph network.
-
Publication No.: US20230229570A1
Publication Date: 2023-07-20
Application No.: US17577711
Filing Date: 2022-01-18
CPC Classification: G06F11/1476, G06V30/18181, G06N3/04
Abstract: Herein is machine learning for anomalous graph detection based on graph embedding, shuffling, comparison, and unsupervised training techniques that can characterize an unfamiliar graph. In an embodiment, a computer obtains many known vectors that respectively represent known graphs. A new vector is generated that represents a new graph that contains multiple vertices. The new vector may contain an arithmetic aggregation of vertex vectors that respectively represent multiple vertices and/or a vector that represents a virtual vertex that is connected to the multiple vertices by respective virtual edges. In the many known vectors, some similar vectors that are similar to the new vector are identified. The new graph is automatically characterized based on a subset of the known graphs that the similar vectors represent.
-
Publication No.: US20240330130A1
Publication Date: 2024-10-03
Application No.: US18740689
Filing Date: 2024-06-12
CPC Classification: G06F11/1476, G06N3/04, G06V30/18181
Abstract: Herein is machine learning for anomalous graph detection based on graph embedding, shuffling, comparison, and unsupervised training techniques that can characterize an unfamiliar graph. In an embodiment, a computer obtains many known vectors that respectively represent known graphs. A new vector is generated that represents a new graph that contains multiple vertices. The new vector may contain an arithmetic aggregation of vertex vectors that respectively represent multiple vertices and/or a vector that represents a virtual vertex that is connected to the multiple vertices by respective virtual edges. In the many known vectors, some similar vectors that are similar to the new vector are identified. The new graph is automatically characterized based on a subset of the known graphs that the similar vectors represent.