-
Publication No.: US20250157237A1
Publication date: 2025-05-15
Application No.: US18730493
Filing date: 2023-01-04
Inventor: Xiaofei MAO, Can HUANG
IPC: G06V30/12, G06V20/62, G06V30/148
Abstract: The disclosure relates to a method, apparatus, readable storage medium and electronic device of image processing. The method includes: performing text recognition on a target image, to obtain a recognized text; performing segmentation processing on the recognized text; and obtaining, based on a text segment obtained from the segmentation processing, a target text by correcting the recognized text through a pre-trained language model.
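As a rough illustration of the recognize, segment, correct flow in this abstract, the following is a minimal sketch and not the patented implementation. It assumes the recognized text is already available as a string. The names `VOCABULARY`, `segment`, `correct_segment`, and `correct_recognized_text` are hypothetical, and a fuzzy lookup from Python's `difflib` stands in for the pre-trained language model, which the abstract does not specify.

```python
import difflib
import re

# Hypothetical vocabulary; a stand-in for the knowledge held by a
# pre-trained language model (assumption, not from the abstract).
VOCABULARY = ["invoice", "total", "amount", "due", "date"]


def segment(text):
    """Split recognized text into alphanumeric and separator segments."""
    return re.findall(r"[A-Za-z0-9]+|[^A-Za-z0-9]+", text)


def correct_segment(seg, vocabulary=VOCABULARY):
    """Stand-in corrector: replace a segment with its closest vocabulary word."""
    if not seg.isalnum():
        return seg  # keep whitespace and punctuation unchanged
    matches = difflib.get_close_matches(seg.lower(), vocabulary, n=1, cutoff=0.6)
    return matches[0] if matches else seg


def correct_recognized_text(recognized):
    """Segment the recognized text, correct each segment, and rejoin."""
    return "".join(correct_segment(s) for s in segment(recognized))


if __name__ == "__main__":
    print(correct_recognized_text("t0tal am0unt due"))
```

Correcting per segment keeps each lookup local, so one misread character does not disturb the rest of the line. A real system would replace `correct_segment` with a call to an actual language model.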
-
Publication No.: US20250131535A1
Publication date: 2025-04-24
Application No.: US18834506
Filing date: 2023-02-23
Inventor: Xiaofei MAO, Can HUANG
Abstract: The present disclosure provides an image restoration method and apparatus, a device, a medium and a product. The method includes: acquiring an image to be restored; and then, inputting the image to be restored into a structure restoration model, obtaining a first feature sequence and a second feature sequence by down-sampling the image to be restored based on a plurality of branches of the structure restoration model, converting the first feature sequence into a third feature sequence that has the same length as the second feature sequence, fusing the third feature sequence with the second feature sequence, and obtaining an image in which the structure of the image to be restored is restored by performing structure restoration on the image to be restored according to a fused feature sequence. In this way, a restored image with higher restoration precision and a better effect can be obtained.
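The length-matching and fusion step in this abstract can be sketched on toy data. This is an illustrative assumption, not the structure restoration model itself: two "branches" are modeled as average pooling with different strides over a 1-D signal, the length conversion as linear interpolation, and the fusion as element-wise addition. The abstract fixes none of these choices, and the function names `downsample`, `resize`, and `fuse` are hypothetical.

```python
def downsample(signal, stride):
    """Average-pool a 1-D signal with the given stride (one toy 'branch')."""
    chunks = [signal[i:i + stride] for i in range(0, len(signal), stride)]
    return [sum(c) / len(c) for c in chunks]


def resize(seq, target_len):
    """Linearly interpolate seq so that it has target_len samples."""
    if target_len == 1 or len(seq) == 1:
        return [seq[0]] * target_len
    scale = (len(seq) - 1) / (target_len - 1)
    out = []
    for j in range(target_len):
        pos = j * scale
        lo = int(pos)
        hi = min(lo + 1, len(seq) - 1)
        frac = pos - lo
        out.append(seq[lo] * (1 - frac) + seq[hi] * frac)
    return out


def fuse(first, second):
    """Convert `first` to the length of `second`, then add element-wise."""
    third = resize(first, len(second))
    return [a + b for a, b in zip(third, second)]


if __name__ == "__main__":
    signal = [0, 1, 2, 3, 4, 5, 6, 7]
    first = downsample(signal, 4)   # coarse branch, 2 samples
    second = downsample(signal, 2)  # finer branch, 4 samples
    print(fuse(first, second))
```

Bringing the coarse sequence to the length of the finer one before fusing lets every position of the fused sequence carry information from both branches.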
-
Publication No.: US20250095327A1
Publication date: 2025-03-20
Application No.: US18730536
Filing date: 2022-12-26
Inventor: Xiaofei MAO, Can HUANG
Abstract: The disclosure relates to a method, apparatus, readable storage medium, and electronic device for object attribute recognition. The method includes: acquiring a target image, the target image comprising a target object and object description information of the target object; extracting, from the target image, a sequence of key information features of the target object and a sequence of multimodal features corresponding to a target attribute of the target object, the sequence of multimodal features comprising a sequence of visual features and a sequence of semantic features of the target attribute; and determining a plurality of object attributes of the target object based on the sequence of key information features and the sequence of multimodal features.
-
Publication No.: US20250104453A1
Publication date: 2025-03-27
Application No.: US18832018
Filing date: 2023-02-27
Inventor: Xiaofei MAO, Can HUANG
Abstract: The present disclosure provides an image description generation method and apparatus, a device, a medium, and a product, and relates to the technical field of image processing. The method includes obtaining an image including a target object; respectively extracting a label feature of the target object, a position feature of the target object in the image, a text feature in the image, and a visual feature of the target object from the image; and generating a natural language description for the image according to the label feature, the position feature, the text feature, the visual feature, and a visual linguistic model. It is apparent that through the method, more effective information is extracted from the image, such that the model can better understand the image, thereby improving a matching degree between the obtained natural language description and the target object in the image.
-