-
Publication number: US11748988B1
Publication date: 2023-09-05
Application number: US17236688
Application date: 2021-04-21
Applicant: Amazon Technologies, Inc.
Inventor: Shixing Chen, Xiaohan Nie, David Jiatian Fan, Dongqing Zhang, Vimal Bhat, Muhammad Raffay Hamid
IPC: G06V20/40, G06N20/00, G06N5/04, G06F16/73, G06F16/78, G11B27/34, H04N5/14, G11B27/036, G06V10/75, G06F18/22, G06F18/214
CPC classification number: G06V20/46, G06F16/73, G06F16/78, G06F18/214, G06F18/22, G06N5/04, G06N20/00, G06V10/751, G06V20/49, G11B27/036, G11B27/34, H04N5/147
Abstract: Techniques for automatic scene change detection in a video are described. As one example, a computer-implemented method includes extracting features of a query shot and its neighboring shots of a first set of shots without labels with a query model, determining a key shot of the neighboring shots which is most similar to the query shot based at least in part on the features of the query shot and its neighboring shots, extracting features of the key shot with a key model, training the query model into a trained query model based at least in part on a comparison of the features of the query shot and the features of the key shot, extracting features of a second set of shots with labels with the trained query model, and training a temporal model into a trained temporal model based at least in part on the features extracted from the second set of shots and the labels of the second set of shots.
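The key-shot selection step in this abstract can be illustrated with a minimal sketch. The abstract only says the key shot is the neighboring shot "most similar" to the query shot; cosine similarity, the plain-list feature vectors, and the function names `cosine_similarity` and `select_key_shot` are assumptions made here for illustration, not details from the patent.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def select_key_shot(query_features, neighbor_features):
    """Return the index of the neighboring shot most similar to the query shot."""
    if not neighbor_features:
        raise ValueError("at least one neighboring shot is required")
    similarities = [cosine_similarity(query_features, f) for f in neighbor_features]
    return max(range(len(similarities)), key=similarities.__getitem__)


# Hypothetical 2-D features for one query shot and three neighboring shots.
query = [1.0, 0.0]
neighbors = [[0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]
key_index = select_key_shot(query, neighbors)
```

In the described method the query features would come from the query model and the selected key shot would then be re-encoded by the key model; this sketch covers only the selection.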
-
Publication number: US11532111B1
Publication date: 2022-12-20
Application number: US17344690
Application date: 2021-06-10
Applicant: Amazon Technologies, Inc.
Inventor: Dongqing Zhang, Muhammad Raffay Hamid, Xiaohan Nie, Shixing Chen
IPC: G06F17/00, G06T11/60, G11B27/031, G10L15/26, G06F40/134, G06V20/40, G06V40/16
Abstract: Techniques for a comic book feature are described herein. A visual data stream of a video may be parsed into a plurality of frames. Scene boundaries may be determined to generate a scene using the plurality of frames where a scene includes a subset of frames. A key frame may be determined for the scene using the subset of frames. An audio portion of an audio data stream of the video may be identified that maps to the subset of frames based on time information. The key frame may be converted to a comic image based on an algorithm. First dimensions and placement for a data object may be determined for the comic image. The data object may include the audio portion for the comic image. A comic panel may be generated for the comic image that incorporates the data object using the determined first dimensions and the placement.
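The step that maps an audio portion to a scene's frames "based on time information" can be sketched as an interval-overlap test. The tuple layout `(start_seconds, end_seconds, text)` and the function name `audio_segments_for_scene` are hypothetical; the abstract does not specify how audio portions or timestamps are represented.

```python
def audio_segments_for_scene(scene_start, scene_end, segments):
    """Return the audio segments whose time span overlaps the scene's time span.

    Each segment is an assumed (start_seconds, end_seconds, text) tuple.
    """
    return [
        segment
        for segment in segments
        if segment[0] < scene_end and segment[1] > scene_start
    ]


# Hypothetical transcribed audio segments for a short video.
segments = [
    (0.0, 2.5, "Where are we going?"),
    (2.5, 6.0, "To the harbor."),
    (9.0, 11.0, "We made it."),
]
scene_dialogue = audio_segments_for_scene(2.0, 7.0, segments)
```

The selected text would then be placed in a data object (for example a speech bubble) sized and positioned on the comic image, which this sketch does not attempt.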
-
Publication number: US12067779B1
Publication date: 2024-08-20
Application number: US17668014
Application date: 2022-02-09
Applicant: Amazon Technologies, Inc.
Inventor: Shixing Chen, Xiang Hao, Xiaohan Nie, Muhammad Raffay Hamid
IPC: G06V20/40, G06V10/774
CPC classification number: G06V20/48, G06V10/774, G06V20/46
Abstract: A plurality of similar video pairs may be determined based on one or more similarity information types. Each video pair of the plurality of similar video pairs may include a first respective video and a second respective video. For each video pair, one or more similar scene pairs may be determined. Each of the one or more similar scene pairs may include a respective first scene from the first respective video and a second respective scene from the second respective video. An encoder may be trained using a contrastive learning model that contrasts a plurality of similar scene pairs with a plurality of random scenes. The plurality of similar scene pairs may include the one or more scene pairs for each video pair. One or more scene features of one or more other scenes of one or more other videos may be determined using the encoder.
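Contrasting a similar scene pair against random scenes can be illustrated with an InfoNCE-style loss. The abstract does not name a specific loss, so the InfoNCE form, the dot-product similarity on assumed unit-length features, the `temperature` parameter, and the function name `info_nce_loss` are all assumptions for this sketch.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def info_nce_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss for one similar scene pair against random scenes.

    anchor and positive are the features of a similar scene pair; negatives
    are features of randomly drawn scenes. Lower is better.
    """
    logits = [dot(anchor, positive) / temperature]
    logits += [dot(anchor, n) / temperature for n in negatives]
    # Numerically stable log-sum-exp over all candidates.
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return log_sum - logits[0]


# Hypothetical unit-length scene features.
aligned = info_nce_loss([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
misaligned = info_nce_loss([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
```

The loss is small when the anchor is closer to its paired scene than to the random scenes and large otherwise, which is the signal an encoder would be trained to minimize.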
-
Publication number: US11734930B1
Publication date: 2023-08-22
Application number: US17662608
Application date: 2022-05-09
Applicant: Amazon Technologies, Inc.
Inventor: Kewen Chen, Tu Anh Ho, Muhammad Raffay Hamid, Shixing Chen
CPC classification number: G06V20/46, G06V20/49, G06V40/165, G06V40/168
Abstract: Methods and apparatus are described for generating compelling preview clips of media presentations. Compelling clips are identified based on the extent to which human faces are shown and/or the loudness of the audio associated with the clips. One or more of these compelling clips are then provided to a client device for playback.
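One way to picture ranking clips by face presence and audio loudness is a weighted score. The abstract gives no formula, so the `Clip` fields, the 0-to-1 normalization, the equal default weights, and the function names `score_clip` and `top_clips` are hypothetical choices for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Clip:
    start: float          # clip start time in seconds
    end: float            # clip end time in seconds
    face_fraction: float  # assumed: share of frames showing a face, 0..1
    loudness: float       # assumed: audio loudness normalized to 0..1


def score_clip(clip, face_weight=0.5, loudness_weight=0.5):
    """Weighted sum of face presence and loudness (assumed scoring rule)."""
    return face_weight * clip.face_fraction + loudness_weight * clip.loudness


def top_clips(clips, k=1):
    """Return the k highest-scoring clips, best first."""
    return sorted(clips, key=score_clip, reverse=True)[:k]


# Hypothetical candidate clips from one media presentation.
candidates = [
    Clip(0.0, 10.0, 0.2, 0.3),
    Clip(10.0, 20.0, 0.9, 0.8),
    Clip(20.0, 30.0, 0.5, 0.5),
]
previews = top_clips(candidates, k=2)
```

Because the abstract says "and/or", either weight could be set to zero to rank on a single signal.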
-
Publication number: US11776273B1
Publication date: 2023-10-03
Application number: US17107514
Application date: 2020-11-30
Applicant: Amazon Technologies, Inc.
Inventor: Shixing Chen, Muhammad Raffay Hamid, Vimal Bhat, Shiva Krishnamurthy
IPC: G06V20/40, G06N5/04, G06N20/20, G10L25/78, G06F18/213
CPC classification number: G06V20/49, G06F18/213, G06N5/04, G06N20/20, G10L25/78
Abstract: Techniques for automatic scene change detection are described. As one example, a computer-implemented method includes receiving a request to train an ensemble of machine learning models on a training dataset of videos having labels that indicate scene changes to detect a scene change in a video, partitioning each video file of the training dataset of videos into a plurality of shots, training the ensemble of machine learning models into a trained ensemble of machine learning models based at least in part on the plurality of shots of the training dataset of videos and the labels that indicate scene changes, receiving an inference request for an input video, partitioning the input video into a plurality of shots, generating, by the trained ensemble of machine learning models, an inference of one or more scene changes in the input video based at least in part on the plurality of shots of the input video, and transmitting the inference to a client application or to a storage location.
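The ensemble inference step can be sketched as combining per-model scores for each shot boundary. The abstract does not say how the ensemble's outputs are combined; simple averaging, the `threshold` parameter, and the function name `ensemble_scene_changes` are assumptions for illustration.

```python
def ensemble_scene_changes(model_scores, threshold=0.5):
    """Average per-model scene-change probabilities and threshold them.

    model_scores holds one list per model; each inner list has one
    probability per shot boundary of the input video. Returns the indices
    of boundaries whose averaged probability reaches the threshold.
    """
    if not model_scores:
        raise ValueError("at least one model's scores are required")
    boundary_count = len(model_scores[0])
    if any(len(scores) != boundary_count for scores in model_scores):
        raise ValueError("all models must score the same shot boundaries")
    averaged = [sum(column) / len(model_scores) for column in zip(*model_scores)]
    return [i for i, p in enumerate(averaged) if p >= threshold]


# Hypothetical outputs of two models over three shot boundaries.
scores = [
    [0.9, 0.2, 0.6],
    [0.7, 0.1, 0.3],
]
scene_change_boundaries = ensemble_scene_changes(scores)
```

Lowering `threshold` trades precision for recall; the resulting boundary indices are what would be transmitted to a client application or storage location.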
-
Publication number: US11354905B1
Publication date: 2022-06-07
Application number: US17247324
Application date: 2020-12-07
Applicant: Amazon Technologies, Inc.
Inventor: Kewen Chen, Tu Anh Ho, Muhammad Raffay Hamid, Shixing Chen
Abstract: Methods and apparatus are described for generating compelling preview clips of media presentations. Compelling clips are identified based on the extent to which human faces are shown and/or the loudness of the audio associated with the clips. One or more of these compelling clips are then provided to a client device for playback.