-
1.
Publication No.: US11610082B2
Publication Date: 2023-03-21
Application No.: US17187473
Filing Date: 2021-02-26
Inventors: Haozhi Huang, Hao Wang, Wenhan Luo, Lin Ma, Peng Yang, Wenhao Jiang, Xiaolong Zhu, Wei Liu
IPC Classes: G06V10/00, G06K9/62, G06N3/04, G06N3/08, G06K9/00, G06V10/30, G06V10/44, G06V10/75, G06V10/98, G06V20/40, G06T5/00
Abstract: A method, apparatus, and storage medium for training a neural network model used for image processing are described. The method includes: obtaining a plurality of video frames; inputting the plurality of video frames into a neural network model so that the neural network model outputs intermediate images; obtaining optical flow information between an earlier video frame and a later video frame; modifying the intermediate image corresponding to the earlier video frame according to the optical flow information to obtain an expected intermediate image; determining a time loss between the intermediate image corresponding to the later video frame and the expected intermediate image; determining a feature loss between the intermediate images and a target feature image; and training the neural network model according to the time loss and the feature loss, returning to the step of obtaining a plurality of video frames to continue training until the neural network model satisfies a training finishing condition.
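A minimal sketch of the two-part training objective described in this abstract, assuming backward warping with a nearest-neighbour `warp_with_flow` helper and plain mean-squared-error losses; the helper name, the loss forms, and the 0.5 weighting are illustrative assumptions, not taken from the patent.

```python
import numpy as np

def warp_with_flow(image, flow):
    """Hypothetical backward-warping helper: shifts each pixel of `image`
    by the displacement in `flow` (H x W x 2, in pixels)."""
    h, w = image.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = np.clip(xs + flow[..., 0], 0, w - 1).round().astype(int)
    src_y = np.clip(ys + flow[..., 1], 0, h - 1).round().astype(int)
    return image[src_y, src_x]

def training_losses(intermediate_early, intermediate_late, flow, target_feature):
    # Expected intermediate image: the earlier output warped by the optical flow.
    expected = warp_with_flow(intermediate_early, flow)
    # Time loss: discrepancy between the later output and the expected image.
    time_loss = np.mean((intermediate_late - expected) ** 2)
    # Feature loss: discrepancy between an output and the target feature image.
    feature_loss = np.mean((intermediate_late - target_feature) ** 2)
    # Combined objective used to update the network (the 0.5 weight is assumed).
    return time_loss + 0.5 * feature_loss
```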
-
2.
Publication No.: US11276207B2
Publication Date: 2022-03-15
Application No.: US16880883
Filing Date: 2020-05-21
Inventors: Minjun Li, Haozhi Huang, Lin Ma, Wei Liu, Yugang Jiang
Abstract: An image processing method for a computer device. The method includes obtaining a to-be-processed image belonging to a first image category; inputting the to-be-processed image into a first-stage image conversion model to obtain a first intermediate image; and converting the first intermediate image into a second intermediate image through a second-stage image conversion model. The method also includes determining a first weight matrix corresponding to the first intermediate image; determining a second weight matrix corresponding to the second intermediate image; and fusing the first intermediate image and the second intermediate image according to the first and second weight matrices, to obtain a target image that corresponds to the to-be-processed image and belongs to a second image category. The sum of the first weight matrix and the second weight matrix is a preset matrix.
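A small sketch of the fusion step, assuming the preset matrix is an all-ones matrix and the fusion is element-wise; both assumptions are illustrative, since the abstract only states that the two weight matrices sum to a preset matrix.

```python
import numpy as np

def fuse_intermediates(first_intermediate, second_intermediate, first_weight):
    """Element-wise fusion of the two intermediate images.

    `first_weight` is the first weight matrix; the second weight matrix is
    derived so that the two sum to the preset matrix (assumed all ones)."""
    preset = np.ones_like(first_weight)
    second_weight = preset - first_weight
    return first_weight * first_intermediate + second_weight * second_intermediate
```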
-
3.
Publication No.: US11972778B2
Publication Date: 2024-04-30
Application No.: US17712060
Filing Date: 2022-04-01
Inventors: Yonggen Ling, Haozhi Huang, Li Shen
IPC Classes: G11B27/031, G10L13/02, G10L25/57, G11B27/34
CPC Classes: G11B27/031, G10L13/02, G10L25/57, G11B27/34
Abstract: A video sound-picture matching method includes: acquiring a voice sequence; acquiring a voice segment from the voice sequence; acquiring an initial position of a start-stop mark and a moving direction of the start-stop mark from an image sequence; determining an active segment according to the initial position of the start-stop mark, the moving direction of the start-stop mark, and the voice segment; and synthesizing the voice segment and the active segment to obtain a video segment. In the video synthesizing process, the present disclosure uses start-stop marks to locate the positions of active segments in an image sequence, so that active segments containing actions are matched with voice segments. The synthesized video segments therefore follow the natural behavior of a character while speaking and appear more authentic.
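A rough sketch of how an active segment could be located, assuming the start-stop mark gives a frame index, the moving direction is +1 or -1, and the voice segment length is already expressed in frames; all three are assumptions for illustration.

```python
def select_active_segment(image_sequence, start_index, direction, voice_len):
    """Pick an active image segment whose length matches the voice segment.

    `direction` is +1 (forward) or -1 (backward); indices are clipped to the
    bounds of the image sequence."""
    end_index = start_index + direction * (voice_len - 1)
    lo, hi = sorted((start_index, end_index))
    lo, hi = max(lo, 0), min(hi, len(image_sequence) - 1)
    frames = image_sequence[lo:hi + 1]
    # The synthesized video segment then pairs these frames with the voice segment.
    return frames if direction > 0 else frames[::-1]
```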
-
4.
Publication No.: US11776097B2
Publication Date: 2023-10-03
Application No.: US17336561
Filing Date: 2021-06-02
Inventors: Haozhi Huang, Senzhe Xu, Shimin Hu, Wei Liu
CPC Classes: G06T5/50, G06N3/04, G06N3/08, G06T2207/20081, G06T2207/20084, G06T2207/20221
Abstract: Methods, devices, and storage media for fusing at least one image are disclosed. The method includes obtaining a first to-be-fused image and a second to-be-fused image, the first to-be-fused image comprising first regions and the second to-be-fused image comprising second regions; obtaining a first feature set according to the first to-be-fused image and a second feature set according to the second to-be-fused image; performing first fusion processing on the first to-be-fused image and the second to-be-fused image by using a shape fusion network model to obtain a third to-be-fused image, the third to-be-fused image comprising at least one first encoding feature and at least one second encoding feature; and performing second fusion processing on the third to-be-fused image and the first to-be-fused image by using a condition fusion network model to obtain a target fused image. Model training methods, apparatus, and storage media are also disclosed.
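A structural sketch of the two-stage fusion pipeline, with stand-in convolutional layers in place of the shape fusion and condition fusion network models, whose actual architectures the abstract does not specify.

```python
import torch
import torch.nn as nn

class TwoStageFusion(nn.Module):
    """Illustrative two-stage pipeline: a shape-fusion stage followed by a
    condition-fusion stage. Both sub-networks here are placeholder
    convolutions, not the models claimed in the patent."""
    def __init__(self, channels=3):
        super().__init__()
        self.shape_fusion = nn.Conv2d(channels * 2, channels, kernel_size=3, padding=1)
        self.condition_fusion = nn.Conv2d(channels * 2, channels, kernel_size=3, padding=1)

    def forward(self, first_image, second_image):
        # Stage 1: fuse the two to-be-fused images into a third image.
        third_image = self.shape_fusion(torch.cat([first_image, second_image], dim=1))
        # Stage 2: fuse the third image back with the first image to get the target.
        return self.condition_fusion(torch.cat([third_image, first_image], dim=1))
```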
-
5.
Publication No.: US11501574B2
Publication Date: 2022-11-15
Application No.: US17073441
Filing Date: 2020-10-19
Inventors: Haozhi Huang, Xinyu Gong, Jingmin Luo, Xiaolong Zhu, Wei Liu
IPC Classes: G06V40/20, G06T7/73, H04L12/46, H04L61/103, H04L67/125, H04L69/325, H04N19/126, H04N19/543, H04N19/55, H04N19/59, H04N19/70, H04N19/87, G06V10/44, G06V10/80, G06V10/82
Abstract: In a multi-person pose recognition method, a to-be-recognized image is obtained and a circuitous pyramid network is constructed. The circuitous pyramid network includes parallel phases, and each phase includes downsampling network layers, upsampling network layers, and a first residual connection layer connecting the downsampling and upsampling network layers. The phases are interconnected by a second residual connection layer. The circuitous pyramid network is traversed by extracting a feature map for each phase, and the feature map of the last phase is taken as the feature map of the to-be-recognized image. Multi-person pose recognition is then performed on the to-be-recognized image according to the feature map to obtain a pose recognition result for the to-be-recognized image.
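An illustrative sketch of one way such a phase-based pyramid with intra-phase and inter-phase residual connections could be wired; the layer counts, channel width, nearest-neighbour upsampling, and sequential chaining of phases are all assumptions rather than the patented architecture.

```python
import torch
import torch.nn as nn

class Phase(nn.Module):
    """One phase: a downsampling layer, an upsampling layer, and a first
    residual connection joining the two resolutions."""
    def __init__(self, channels=64):
        super().__init__()
        self.down = nn.Conv2d(channels, channels, kernel_size=3, stride=2, padding=1)
        self.up = nn.Upsample(scale_factor=2, mode='nearest')
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, x):
        low = self.down(x)                   # downsampling network layer
        restored = self.conv(self.up(low))   # upsampling network layer
        return restored + x                  # first residual connection

class CircuitousPyramid(nn.Module):
    """Phases chained in order; each phase output is added back to its input
    (second residual connection) before feeding the next phase."""
    def __init__(self, num_phases=3, channels=64):
        super().__init__()
        self.phases = nn.ModuleList([Phase(channels) for _ in range(num_phases)])

    def forward(self, feature_map):
        for phase in self.phases:
            feature_map = feature_map + phase(feature_map)  # second residual connection
        return feature_map  # feature map extracted from the last phase
```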
-
6.
Publication No.: US11417095B2
Publication Date: 2022-08-16
Application No.: US16685526
Filing Date: 2019-11-15
Inventors: Xiaolong Zhu, Kaining Huang, Jingmin Luo, Lijian Mei, Shenghui Huang, Yongsen Zheng, Yitong Wang, Haozhi Huang
Abstract: An image recognition method is provided. The method includes obtaining predicted locations of joints of a target person in a to-be-recognized image based on a joint prediction model, where the joint prediction model is pre-constructed by: obtaining a plurality of sample images; inputting training features of the sample images and a body model feature into a neural network and obtaining predicted locations of joints in the sample images outputted by the neural network; updating a body extraction parameter and an alignment parameter; and inputting the training features of the sample images and the body model feature into the neural network to obtain the joint prediction model.
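A generic supervised training loop matching the outline above, assuming mean-squared-error supervision on joint locations and that the body extraction and alignment parameters are learnable parameters registered inside the model; `train_joint_predictor` and its arguments are hypothetical names.

```python
import torch
import torch.nn as nn

def train_joint_predictor(model, samples, body_model_feature, epochs=10, lr=1e-3):
    """`samples` yields (training_features, true_joint_locations) tensor pairs.
    The shared body-model feature is concatenated with each sample's features,
    joints are predicted, and the loss updates every learnable parameter of
    `model` (including any body-extraction and alignment parameters)."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.MSELoss()
    for _ in range(epochs):
        for features, true_joints in samples:
            inputs = torch.cat([features, body_model_feature], dim=-1)
            predicted_joints = model(inputs)
            loss = criterion(predicted_joints, true_joints)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model
```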
-
7.
Publication No.: US11200680B2
Publication Date: 2021-12-14
Application No.: US16671747
Filing Date: 2019-11-01
Inventors: Xiaolong Zhu, Kaining Huang, Jingmin Luo, Lijian Mei, Shenghui Huang, Yongsen Zheng, Yitong Wang, Haozhi Huang
Abstract: An image processing method and a related apparatus are provided. The method is applied to an image processing device and includes: obtaining an original image, the original image including a foreground object; extracting a foreground region from the original image through a deep neural network; identifying pixels of the foreground object within the foreground region; forming a mask according to the pixels of the foreground object, the mask including mask values corresponding to the pixels of the foreground object; and extracting the foreground object from the original image according to the mask.
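A minimal sketch of the final extraction step, assuming the mask holds binary values (1 for foreground-object pixels, 0 elsewhere) and that extraction is a simple element-wise multiplication; the abstract does not fix either choice.

```python
import numpy as np

def extract_foreground(original_image, foreground_pixel_mask):
    """Apply a binary mask (1 for foreground-object pixels, 0 elsewhere)
    to the original image; background pixels are zeroed out."""
    mask = foreground_pixel_mask.astype(original_image.dtype)
    if mask.ndim == 2 and original_image.ndim == 3:
        mask = mask[..., None]  # broadcast the mask over the colour channels
    return original_image * mask
```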
-
8.
Publication No.: US12067690B2
Publication Date: 2024-08-20
Application No.: US17497883
Filing Date: 2021-10-08
Inventors: Tianyu Sun, Haozhi Huang, Wei Liu
CPC Classes: G06T3/04, G06F18/214, G06N3/045, G06N3/088, G06T9/00, G06V10/95, G06V40/168, G06V40/174
Abstract: An image processing method is provided. The method includes: encoding an input image based on an attention mechanism to obtain an encoding tensor set and an attention map set of the input image; obtaining an encoding result of the input image according to the encoding tensor set and the attention map set, the encoding result of the input image recording an identity feature of a human face in the input image; encoding an expression image to obtain an encoding result of the expression image, the encoding result of the expression image recording an expression feature of a human face in the expression image; and generating an output image according to the encoding result of the input image and the encoding result of the expression image, the output image having the identity feature of the input image and the expression feature of the expression image.
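A heavily simplified sketch of the identity/expression split described above, collapsing the encoding tensor set and attention map set into single linear layers over flattened images; every module and layer name here is a placeholder rather than a network claimed in the patent.

```python
import torch
import torch.nn as nn

class IdentityExpressionGenerator(nn.Module):
    """Stand-in encoders and decoder: the input image yields an attention-
    weighted identity code, the expression image yields an expression code,
    and the decoder combines the two into the output image."""
    def __init__(self, image_dim, code_dim=128):
        super().__init__()
        self.identity_encoder = nn.Linear(image_dim, code_dim)
        self.attention = nn.Linear(image_dim, code_dim)
        self.expression_encoder = nn.Linear(image_dim, code_dim)
        self.decoder = nn.Linear(code_dim * 2, image_dim)

    def forward(self, input_image, expression_image):
        # Encoding result of the input image: attention-weighted identity code.
        identity_code = self.identity_encoder(input_image) * torch.sigmoid(self.attention(input_image))
        # Encoding result of the expression image: expression code.
        expression_code = self.expression_encoder(expression_image)
        # Output image keeps the input identity and the reference expression.
        return self.decoder(torch.cat([identity_code, expression_code], dim=-1))
```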
-
9.
Publication No.: US11356619B2
Publication Date: 2022-06-07
Application No.: US17239438
Filing Date: 2021-04-23
Inventors: Haozhi Huang, Kun Cheng, Chun Yuan, Wei Liu
Abstract: Embodiments of this application disclose methods, systems, and devices for video synthesis. In one aspect, a method comprises obtaining a plurality of frames corresponding to source image information of a first to-be-synthesized video, each frame of the source image information including a source image and a corresponding source motion key point. The method also comprises obtaining a plurality of frames corresponding to target image information of a second to-be-synthesized video. For each frame of the plurality of frames corresponding to the target image information of the second to-be-synthesized video, the method comprises fusing a respective source image from the first to-be-synthesized video, a corresponding source motion key point, and a respective target motion key point corresponding to the frame using a pre-trained video synthesis model, and generating a respective output image in accordance with the fusing. The method further comprises repeating the fusing and generating steps for the second to-be-synthesized video to produce a synthesized video.
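A sketch of the per-frame synthesis loop, assuming `synthesis_model` is a hypothetical pre-trained callable and that source frames are paired with target frames by index; both are assumptions made for illustration.

```python
def synthesize_video(source_frames, source_keypoints, target_keypoints, synthesis_model):
    """For each target frame, fuse a source image, its source motion key
    points, and the target motion key points with the pre-trained model,
    then collect the generated output images into the synthesized video."""
    output_frames = []
    for frame_index, target_kp in enumerate(target_keypoints):
        source_image = source_frames[frame_index % len(source_frames)]
        source_kp = source_keypoints[frame_index % len(source_keypoints)]
        output_frames.append(synthesis_model(source_image, source_kp, target_kp))
    return output_frames
```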
-
10.
Publication No.: US10970600B2
Publication Date: 2021-04-06
Application No.: US16373034
Filing Date: 2019-04-02
Inventors: Haozhi Huang, Hao Wang, Wenhan Luo, Lin Ma, Peng Yang, Wenhao Jiang, Xiaolong Zhu, Wei Liu
Abstract: A method, apparatus, and storage medium for training a neural network model used for image processing are described. The method includes: obtaining a plurality of video frames; inputting the plurality of video frames into a neural network model so that the neural network model outputs intermediate images; obtaining optical flow information between an earlier video frame and a later video frame; modifying the intermediate image corresponding to the earlier video frame according to the optical flow information to obtain an expected intermediate image; determining a time loss between the intermediate image corresponding to the later video frame and the expected intermediate image; determining a feature loss between the intermediate images and a target feature image; and training the neural network model according to the time loss and the feature loss, returning to the step of obtaining a plurality of video frames to continue training until the neural network model satisfies a training finishing condition.
-