-
1.
公开(公告)号:US11615140B2
公开(公告)日:2023-03-28
申请号:US17144523
申请日:2021-01-08
Inventor: Xiang Long , Dongliang He , Fu Li , Xiang Zhao , Tianwei Lin , Hao Sun , Shilei Wen , Errui Ding
IPC: G06F16/738 , G06V20/40 , G06F18/214 , G06F18/25
Abstract: A method includes screening, by a video-clip screening module in a video description model, a plurality of video proposal clips acquired from a video to be analyzed, to acquire a plurality of video clips suitable for description. The plural video proposal clips acquired from the video to be analyzed may be screened by the video-clip screening module to acquire the plural video clips suitable for description; and then, each video clip is described by a video-clip describing module, thus avoiding description of all the video proposal clips, only describing the screened video clips which have strong correlation with the video and are suitable for description, removing the interference of the description of the video clips which are not suitable for description in the description of the video, guaranteeing the accuracy of the final descriptions of the video clips, and improving the quality of the descriptions of the video clips.
-
公开(公告)号:US20200320303A1
公开(公告)日:2020-10-08
申请号:US16905482
申请日:2020-06-18
Inventor: Dongliang He , Xiang Zhao , Jizhou Huang , Fu Li , Xiao Liu , Shilei Wen
IPC: G06K9/00 , G06F16/783 , G06N3/08
Abstract: A method and an apparatus for grounding a target video clip in a video are provided. The method includes: determining a current video clip in the video based on a current position; acquiring descriptive information indicative of a pre-generated target video clip descriptive feature, and executing a target video clip determining step which includes: determining current state information of the current video clip, wherein the current state information includes information indicative of a feature of the current video clip; generating a current action policy based on the descriptive information and the current state information, the current action policy being indicative of a position change of the current video clip in the video; the method further comprises: in response to reaching a preset condition, using a video clip resulting from executing the current action policy on the current video clip as the target video clip.
-
公开(公告)号:US11256920B2
公开(公告)日:2022-02-22
申请号:US16830895
申请日:2020-03-26
Inventor: Xiang Long , Dongliang He , Fu Li , Zhizhen Chi , Zhichao Zhou , Xiang Zhao , Ping Wang , Hao Sun , Shilei Wen , Errui Ding
Abstract: A method and an apparatus for classifying a video are provided. The method may include: acquiring a to-be-classified video; extracting a set of multimodal features of the to-be-classified video; inputting the set of multimodal features into a post-fusion model corresponding to each modal respectively, to obtain multimodal category information of the to-be-classified video; and fusing the multimodal category information of the to-be-classified video, to obtain category information of the to-be-classified video. This embodiment improves the accuracy of video classification.
-
公开(公告)号:US20200374526A1
公开(公告)日:2020-11-26
申请号:US16797911
申请日:2020-02-21
Inventor: Zhichao Zhou , Dongliang He , Fu Li , Xiang Zhao , Xin Li , Zhizhen Chi , Xiang Long , Hao Sun
IPC: H04N19/14 , H04N19/12 , H04N19/625 , G06N3/08
Abstract: A method, device, apparatus for predicting a video coding complexity and a computer-readable storage medium are provided. The method includes: acquiring an attribute feature of a target video; extracting a plurality of first target image frames from the target video; performing a frame difference calculation on the plurality of the first target image frames, to acquire a plurality of first frame difference images; determining a histogram feature for frame difference images of the target video according to a statistical histogram of each first frame difference image; and inputting a plurality of features of the target video into a coding complexity prediction model to acquire a coding complexity prediction value of the target video. Through the above method, the BPP prediction value can be acquired intelligently.
-
公开(公告)号:US11410422B2
公开(公告)日:2022-08-09
申请号:US16905482
申请日:2020-06-18
Inventor: Dongliang He , Xiang Zhao , Jizhou Huang , Fu Li , Xiao Liu , Shilei Wen
Abstract: A method and an apparatus for grounding a target video clip in a video are provided. The method includes: determining a current video clip in the video based on a current position; acquiring descriptive information indicative of a pre-generated target video clip descriptive feature, and executing a target video clip determining step which includes: determining current state information of the current video clip, wherein the current state information includes information indicative of a feature of the current video clip; generating a current action policy based on the descriptive information and the current state information, the current action policy being indicative of a position change of the current video clip in the video; the method further comprises: in response to reaching a preset condition, using a video clip resulting from executing the current action policy on the current video clip as the target video clip.
-
公开(公告)号:US11259029B2
公开(公告)日:2022-02-22
申请号:US16797911
申请日:2020-02-21
Inventor: Zhichao Zhou , Dongliang He , Fu Li , Xiang Zhao , Xin Li , Zhizhen Chi , Xiang Long , Hao Sun
IPC: H04N19/14 , H04N19/12 , H04N19/625 , G06N3/08
Abstract: A method, device, apparatus for predicting a video coding complexity and a computer-readable storage medium are provided. The method includes: acquiring an attribute feature of a target video; extracting a plurality of first target image frames from the target video; performing a frame difference calculation on the plurality of the first target image frames, to acquire a plurality of first frame difference images; determining a histogram feature for frame difference images of the target video according to a statistical histogram of each first frame difference image; and inputting a plurality of features of the target video into a coding complexity prediction model to acquire a coding complexity prediction value of the target video. Through the above method, the BPP prediction value can be acquired intelligently.
-
公开(公告)号:US20210019531A1
公开(公告)日:2021-01-21
申请号:US16830895
申请日:2020-03-26
Inventor: Xiang Long , Dongliang He , Fu Li , Zhizhen Chi , Zhichao Zhou , Xiang Zhao , Ping Wang , Hao Sun , Shilei Wen , Errui Ding
Abstract: a method and an apparatus for classifying a video are provided. The method may include: acquiring a to-be-classified video; extracting a set of multimodal features of the to-be-classified video; inputting the set of multimodal features into a post-fusion model corresponding to each modal respectively, to obtain multimodal category information of the to-be-classified video; and fusing the multimodal category information of the to-be-classified video, to obtain category information of the to-be-classified video. This embodiment improves the accuracy of video classification.
-
-
-
-
-
-