-
11.
Publication number: US20240338958A1
Publication date: 2024-10-10
Application number: US18131744
Filing date: 2023-04-06
Applicant: Oracle International Corporation
Inventor: Liyu Gong, Yuying Wang, Mengqing Guo, Tao Sheng, Jun Qian
IPC: G06V30/19, G06F40/143, G06V10/70
CPC classification number: G06V30/19147, G06F40/143, G06V10/70
Abstract: Techniques are disclosed for optical character recognition of extensible markup language (XML) content. A method can include a system generating first training data comprising XML content, the first training data comprising a first plurality of training instances, each training instance including a respective image comprising XML content and annotation information for the respective image. The system can train a plurality of machine learning models using the first training data to perform image-based XML content extraction. The system can generate a plurality of trained machine learning models based at least in part on the training.
-
Publication number: US20240273789A1
Publication date: 2024-08-15
Application number: US18467291
Filing date: 2023-09-14
Applicant: Oracle International Corporation
Inventor: Mohammadhossein Chaghazardi, Wenjing Yang, Tao Sheng, Jun Qian
CPC classification number: G06T11/206, G06F40/109, G06T7/13, G06T7/90, G06V10/25, G06V20/70, G06T2207/10024
Abstract: Techniques are described for HTML-based image generation. An example method can include generating hypertext markup language (HTML) code for a table comprising a table structure of a set of rows and columns. The method can further include generating HTML code for a text to populate a cell of the table. The method can further include generating a rendered image of the table using the HTML code. The method can further include detecting a first pixel of the rendered image comprising a first color, and a second pixel of the rendered image comprising a second color. The method can further include detecting the text on the rendered image. The method can further include generating a bounding box surrounding the detected text. The method can further include generating an annotation comprising a bounding box parameter and a text parameter.
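The steps in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: the abstract does not say how the image is rendered or how the two colors relate to the text, so the sketch assumes the text is drawn in one known color on a background of another, and it replaces the rendering step with a small hand-written pixel grid. The names `build_table_html`, `bounding_box`, and `make_annotation`, and the annotation keys `bbox` and `text`, are hypothetical.

```python
from html import escape


def build_table_html(rows, cols, cell_text, text_color="#ff0000", bg_color="#ffffff"):
    """Return HTML for a rows x cols table with cell_text in the top-left cell."""
    body = []
    for r in range(rows):
        cells = []
        for c in range(cols):
            # Only the first cell is populated in this sketch.
            content = escape(cell_text) if (r, c) == (0, 0) else ""
            cells.append(
                f'<td style="color:{text_color};background:{bg_color}">{content}</td>'
            )
        body.append("<tr>" + "".join(cells) + "</tr>")
    return "<table>" + "".join(body) + "</table>"


def bounding_box(image, color):
    """Smallest (left, top, right, bottom) box enclosing all pixels equal to color."""
    hits = [
        (x, y)
        for y, row in enumerate(image)
        for x, pixel in enumerate(row)
        if pixel == color
    ]
    if not hits:
        return None
    xs = [x for x, _ in hits]
    ys = [y for _, y in hits]
    return (min(xs), min(ys), max(xs), max(ys))


def make_annotation(image, text, text_color):
    """Pair the detected box with the known text (assumed annotation layout)."""
    return {"bbox": bounding_box(image, text_color), "text": text}


# Stand-in for the rendered image (assumption: a real pipeline would rasterize
# the HTML in a browser engine). W is the background color, R the text color.
W, R = (255, 255, 255), (255, 0, 0)
image = [
    [W, W, W, W, W, W],
    [W, W, R, R, W, W],
    [W, R, W, R, R, W],
    [W, W, W, W, W, W],
]

html_code = build_table_html(2, 3, "A<1")
annotation = make_annotation(image, "A<1", R)
print(html_code)
print(annotation)
```

Because the text is known at generation time, the annotation needs no recognition step: only the box has to be recovered from the pixels, which is why color-based detection is enough in this sketch.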
-