-
Publication No.: US11983926B2
Publication date: 2024-05-14
Application No.: US17674688
Application date: 2022-02-17
Inventor: Yan Li, Bin Ji, Xintian Shi, Bin Kang
Abstract: A video content recognition method is performed by a computer device, the method including: obtaining an image feature corresponding to a video frame set extracted from a target video; dividing the image feature into a plurality of image sub-features according to a preset sequence, each image sub-feature having a corresponding channel; choosing, from the image sub-features based on the preset sequence, a current image sub-feature; fusing the current image sub-feature and a convolution processing result of a previous image sub-feature into a fused image sub-feature, and performing convolution processing on the fused image sub-feature, to obtain a convolved image sub-feature corresponding to the current image sub-feature; splicing a plurality of convolved image sub-features corresponding to the plurality of channels of the convolved image sub-feature, to obtain a spliced image feature; and determining video content corresponding to the target video based on the spliced image feature.
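The divide, fuse, convolve, and splice steps in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the use of element-wise addition as the fusion, the 1D zero-padded convolution, and the choice to convolve the first sub-feature without fusion are all assumptions made for illustration.

```python
from typing import List, Optional

Signal = List[float]  # one channel: values along a single axis (e.g. time)


def conv1d_same(x: Signal, kernel: Signal) -> Signal:
    """1D cross-correlation with zero padding; output length equals input length."""
    half = len(kernel) // 2
    out = []
    for i in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(x):
                acc += w * x[j]
        out.append(acc)
    return out


def split_channels(feature: List[Signal], groups: int) -> List[List[Signal]]:
    """Divide the channels into `groups` consecutive sub-features, keeping their order."""
    if len(feature) % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    size = len(feature) // groups
    return [feature[g * size:(g + 1) * size] for g in range(groups)]


def fuse_and_convolve(feature: List[Signal], groups: int, kernel: Signal) -> List[Signal]:
    """Process sub-features in order, feeding each convolved result into the next one."""
    previous: Optional[List[Signal]] = None
    convolved_groups = []
    for current in split_channels(feature, groups):
        if previous is None:
            # Assumption: the first sub-feature has no predecessor, so nothing is fused.
            fused = current
        else:
            # Assumption: fusion is element-wise addition with the previous result.
            fused = [[a + b for a, b in zip(c, p)] for c, p in zip(current, previous)]
        previous = [conv1d_same(channel, kernel) for channel in fused]
        convolved_groups.append(previous)
    # Splice: concatenate the convolved sub-features back along the channel axis.
    return [channel for group in convolved_groups for channel in group]
```

Because each sub-feature receives the already convolved output of its predecessor, later channels see a progressively larger receptive field without any single large convolution.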
-
Publication No.: US11967151B2
Publication date: 2024-04-23
Application No.: US17515164
Application date: 2021-10-29
Inventor: Yan Li, Xintian Shi, Bin Ji
IPC: G06K9/00, G06F18/214, G06F18/25, G06V20/40
CPC classification number: G06V20/41, G06F18/214, G06F18/253, G06V20/46, G06V20/49
Abstract: Embodiments of this application disclose a video classification method performed by a computer device and belong to the field of computer vision (CV) technologies. The method includes: obtaining a video; selecting n image frames from the video; extracting respective feature information of the n image frames according to a learned feature fusion policy by using a feature extraction network, the learned feature fusion policy being used for indicating proportions of the feature information of the other image frames that have been fused with feature information of a first image frame in the n image frames; and determining a classification result of the video according to the respective feature information of the n image frames. Replacing complex, repeated 3D convolution operations with simple feature information fusion between adjacent image frames reduces the time needed to obtain a classification result of the video, improving efficiency.
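The idea of fusing feature information between adjacent image frames according to a set of proportions can be sketched as follows. This is an illustrative Python sketch only: the function name `fuse_adjacent`, the three-frame neighborhood, the zero padding at the first and last frame, and the fixed proportion values standing in for learned ones are assumptions, not details taken from the patent.

```python
from typing import List, Tuple

Feature = List[float]  # feature vector of one image frame


def fuse_adjacent(frames: List[Feature],
                  proportions: Tuple[float, float, float]) -> List[Feature]:
    """Mix each frame's feature with its previous and next frame.

    `proportions` is (previous, self, next). In a trained network these
    weights would be learned; here they are plain constants (assumption).
    """
    w_prev, w_self, w_next = proportions
    n = len(frames)
    zero = [0.0] * len(frames[0])  # assumption: zero padding at the boundaries
    fused = []
    for t in range(n):
        prev = frames[t - 1] if t > 0 else zero
        nxt = frames[t + 1] if t < n - 1 else zero
        fused.append([w_prev * p + w_self * s + w_next * q
                      for p, s, q in zip(prev, frames[t], nxt)])
    return fused
```

Each output frame costs only a few multiply-adds per feature element, which is the kind of saving over a full 3D convolution that the abstract describes.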
-
Publication No.: US20220172477A1
Publication date: 2022-06-02
Application No.: US17674688
Application date: 2022-02-17
Inventor: Yan Li, Bin Ji, Xintian Shi, Bin Kang
Abstract: A video content recognition method is performed by a computer device, the method including: obtaining an image feature corresponding to a video frame set extracted from a target video; dividing the image feature into a plurality of image sub-features according to a preset sequence, each image sub-feature having a corresponding channel; choosing, from the image sub-features based on the preset sequence, a current image sub-feature; fusing the current image sub-feature and a convolution processing result of a previous image sub-feature into a fused image sub-feature, and performing convolution processing on the fused image sub-feature, to obtain a convolved image sub-feature corresponding to the current image sub-feature; splicing a plurality of convolved image sub-features corresponding to the plurality of channels of the convolved image sub-feature, to obtain a spliced image feature; and determining video content corresponding to the target video based on the spliced image feature.
-
Publication No.: US11314806B2
Publication date: 2022-04-26
Application No.: US17026477
Application date: 2020-09-21
Inventor: Yan Li, Hanjie Wang, Hao Ye, Bo Chen
IPC: G06F16/30, G06F16/635, G06F16/65, G06F16/2457, G06F16/68, G06F40/279, G06K9/62, G06N3/04, G06N3/08, G11B27/036, G06F16/783, G10H1/36, G10H1/00, G06V20/40
Abstract: This application discloses a method for making music recommendations. The method for making music recommendations is performed by a server device. The method includes obtaining a material for which background music is to be added; determining at least one visual semantic tag of the material, the at least one visual semantic tag describing at least one characteristic of the material; identifying a matched music matching the at least one visual semantic tag from a candidate music library; sorting the matched music according to user assessing information of a user corresponding to the material; screening the matched music based on a sorting result and according to a preset music screening condition; and recommending matched music obtained through the screening as candidate music of the material.
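The match, sort, and screen steps of this abstract can be sketched in Python as below. The sketch is hypothetical throughout: the `Track` type, the `recommend` function, matching by tag overlap, sorting by a per-track user score, and a top-k cutoff as the preset screening condition are all assumptions chosen for illustration; the visual semantic tags are taken as given rather than computed from the material.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Set


@dataclass(frozen=True)
class Track:
    title: str
    tags: FrozenSet[str]


def recommend(material_tags: Set[str],
              library: List[Track],
              user_scores: Dict[str, float],
              top_k: int) -> List[Track]:
    """Return candidate music for a material described by `material_tags`."""
    # Match: keep tracks sharing at least one tag with the material (assumption).
    matched = [t for t in library if t.tags & material_tags]
    # Sort: order by the user's assessment of each track; unknown tracks score 0.
    ranked = sorted(matched,
                    key=lambda t: user_scores.get(t.title, 0.0),
                    reverse=True)
    # Screen: the preset condition here is simply "keep the best top_k" (assumption).
    return ranked[:top_k]
```

A call such as `recommend({"beach"}, library, scores, top_k=2)` would return at most two tracks tagged `beach`, with the highest-scored one first.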