-
Publication number: US11659217B1
Publication date: 2023-05-23
Application number: US17301212
Application date: 2021-03-29
Applicant: Amazon Technologies, Inc.
Inventor: Hooman Mahyar, Avijit Vajpayee, Abhinav Jain, Arjun Cholkar, Vimal Bhat
IPC: H04N21/242, H04N21/234, H04N21/233
CPC classification number: H04N21/242, H04N21/233, H04N21/234
Abstract: Techniques are described for detecting desynchronization between an audio component and a video component of a media presentation. Feature sets may be determined for portions of the audio component and portions of the video component, which may then be used to generate correlations between portions of the audio component and portions of the video component. Synchronization may then be assessed based on the correlations.
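Illustration: the abstract above does not say which features or which correlation measure are used. The sketch below is only one plausible reading, not the patented method. It assumes one scalar feature per time step for each component (for example audio energy and video motion, both hypothetical choices), computes a Pearson correlation at each candidate lag, and reports the best-scoring lag. The names `pearson`, `estimate_offset`, and `max_lag` are invented for this example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def estimate_offset(audio_feats, video_feats, max_lag):
    """Return (lag, corr) where audio_feats[i + lag] best matches video_feats[i].

    A positive lag means the audio trails the video by `lag` steps.
    """
    best_lag, best_corr = 0, -1.0
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a = audio_feats[lag:]
            v = video_feats[:len(video_feats) - lag]
        else:
            a = audio_feats[:lag]
            v = video_feats[-lag:]
        n = min(len(a), len(v))
        if n < 2:
            continue  # not enough overlap to correlate
        corr = pearson(a[:n], v[:n])
        if corr > best_corr:
            best_lag, best_corr = lag, corr
    return best_lag, best_corr


# Hypothetical per-step features: the audio copy is delayed by two steps.
video = [0, 1, 0, 0, 3, 0, 2, 0, 0, 1, 0, 0]
audio = [0, 0] + video[:-2]
lag, corr = estimate_offset(audio, video, max_lag=3)
```

A nonzero best lag with a high correlation would suggest the two components are offset. Any threshold for declaring desynchronization is a design choice that the abstract leaves open.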
-
Publication number: US11935170B1
Publication date: 2024-03-19
Application number: US17530070
Application date: 2021-11-18
Applicant: Amazon Technologies, Inc.
Inventor: Abhinav Jain, Avijit Vajpayee, Vimal Bhat, Arjun Cholkar, Louis Kirk Barker
IPC: G06T13/40, G06F40/20, G06N3/08, G06T17/00, G06V20/40, G06V40/16, G06V40/20, G09B21/00, G10L25/63, H04N5/272
CPC classification number: G06T13/40, G06F40/20, G06N3/08, G06T17/00, G06V20/46, G06V40/174, G06V40/28, G09B21/009, G10L25/63, H04N5/272
Abstract: Systems, methods, and computer-readable media are disclosed for systems and methods for automated generation and presentation of sign language avatars for video content. Example methods may include determining, by one or more computer processors coupled to memory, a first segment of video content, the first segment including a first set of frames, first audio content, and first subtitle data, where the first subtitle data comprises a first word and a second word. Methods may include determining, using a first machine learning model, a first sign gesture associated with the first word, determining first motion data associated with the first sign gesture, and determining first facial expression data. Methods may include generating an avatar configured to perform the first sign gesture using the first motion data, where a facial expression of the avatar while performing the first sign gesture is based on the first facial expression data.
-
Publication number: US11595614B1
Publication date: 2023-02-28
Application number: US17475180
Application date: 2021-09-14
Applicant: Amazon Technologies, Inc.
Inventor: Hooman Mahyar, Arjun Cholkar
Abstract: Intelligent reframing techniques are described in which content (e.g., a movie) can be generated in a different aspect ratio than previously provided. These techniques include obtaining various video frames having a first aspect ratio. Various objects can be identified within the frames. An object having the highest degree of importance in a frame can be selected and a focal point can be calculated based at least in part on that object. A modified version of the content can be generated in a second aspect ratio that is different from the first aspect ratio. The modified version can be generated using the focal point calculated based on the object having the greatest degree of importance. Using these techniques, the content can be provided in a different aspect ratio while ensuring that the most important features of the frame still appear in the new version of the content.
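Illustration: the abstract above does not specify how importance is scored or how the focal point is turned into a new frame. The sketch below is a minimal reading under stated assumptions, not the patented method. Each detected object is assumed to carry a precomputed `importance` score and a bounding box `(x, y, width, height)`, the focal point is assumed to be the box center, and the new frame is assumed to be the largest crop of the target aspect ratio that fits, centered on the focal point and clamped to the frame. The function names and the dictionary layout are hypothetical.

```python
def focal_point(objects):
    """Center of the bounding box of the object with the highest importance."""
    top = max(objects, key=lambda o: o["importance"])
    x, y, w, h = top["box"]
    return (x + w / 2, y + h / 2)


def crop_window(frame_w, frame_h, target_ratio, focal):
    """Return (left, top, width, height) of a crop with the target aspect ratio.

    The crop is the largest one that fits inside the frame, centered on the
    focal point as far as the frame edges allow.
    """
    if frame_w / frame_h > target_ratio:
        # Frame is wider than the target: keep full height, trim width.
        crop_h = frame_h
        crop_w = round(frame_h * target_ratio)
    else:
        # Frame is taller than (or equal to) the target: keep full width.
        crop_w = frame_w
        crop_h = round(frame_w / target_ratio)
    fx, fy = focal
    left = min(max(round(fx - crop_w / 2), 0), frame_w - crop_w)
    top = min(max(round(fy - crop_h / 2), 0), frame_h - crop_h)
    return (left, top, crop_w, crop_h)


# Hypothetical detections in a 1920x1080 frame, reframed to a square (1:1).
objects = [
    {"importance": 0.2, "box": (100, 100, 200, 200)},
    {"importance": 0.9, "box": (1500, 400, 200, 200)},
]
focal = focal_point(objects)
window = crop_window(1920, 1080, 1.0, focal)
```

Clamping keeps the crop inside the original frame, so an important object near an edge stays visible even though it cannot be exactly centered.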
-
Publication number: US20240242413A1
Publication date: 2024-07-18
Application number: US18432623
Application date: 2024-02-05
Applicant: Amazon Technologies, Inc.
Inventor: Avijit Vajpayee, Vimal Bhat, Arjun Cholkar, Louis Kirk Barker, Abhinav Jain
IPC: G06T13/40, G06F40/20, G06N3/08, G06T17/00, G06V20/40, G06V40/16, G06V40/20, G09B21/00, G10L25/63, H04N5/272
CPC classification number: G06T13/40, G06F40/20, G06N3/08, G06T17/00, G06V20/46, G06V40/174, G06V40/28, G09B21/009, G10L25/63, H04N5/272
Abstract: Systems, methods, and computer-readable media are disclosed for systems and methods for automated generation and presentation of sign language avatars for video content. Example methods may include determining, by one or more computer processors coupled to memory, a first segment of video content, the first segment including a first set of frames, first audio content, and first subtitle data, where the first subtitle data comprises a first word and a second word. Methods may include determining, using a first machine learning model, a first sign gesture associated with the first word, determining first motion data associated with the first sign gesture, and determining first facial expression data. Methods may include generating an avatar configured to perform the first sign gesture using the first motion data, where a facial expression of the avatar while performing the first sign gesture is based on the first facial expression data.
-
Publication number: US11582522B1
Publication date: 2023-02-14
Application number: US17332498
Application date: 2021-05-27
Applicant: Amazon Technologies, Inc.
Inventor: Hooman Mahyar, Shiva Krishnamurthy, Steven David Prinz, Craig Critchley, Arjun Cholkar, Andrew James McVeigh
IPC: H04N21/4722, H04N21/45, G06F16/78, H04N21/431, H04N21/478, H04N21/4402
Abstract: A system can be configured to receive entertainment content requested by a user and identify content segments and content features from the entertainment content. The content segments can be utilized to identify portions of the entertainment content for enrichment and/or enhancement by the system. The content features can be utilized to associate the entertainment content and the content segments with supplemental content that includes or is associated with the content features. The content features can indicate genres, scene classifications, significant figures credited with creating the entertainment content, and other points of interest for users interested in the entertainment content. The associations between the entertainment content and the supplemental content can enable the system to engage the users by presenting the supplemental content determined to match interests of the users.
-
Publication number: US10904476B1
Publication date: 2021-01-26
Application number: US16712294
Application date: 2019-12-12
Applicant: Amazon Technologies, Inc.
Inventor: Christian Garcia Siagian, Charles Effinger, David Niu, Yang Yu, Narayan Sundaram, Arjun Cholkar, Ramakanth Mudumba
Abstract: Techniques for automated up-sampling of media files are provided. In some examples, a title associated with a media file, a metadata file associated with the title, and the media file may be received. The media file may be partitioned into one or more scene files, each scene file including a plurality of frame images in a sequence. One or more up-sampled scene files may be generated, each corresponding to a scene file of the one or more scene files. An up-sampled media file may be generated by combining at least a subset of the one or more up-sampled scene files. Generating one or more up-sampled scene files may include identifying one or more characters in a frame image of the plurality of frame images, based at least in part on implementation of a facial recognition algorithm including deep learning features in a neural network.
-
Publication number: US12047650B1
Publication date: 2024-07-23
Application number: US17521668
Application date: 2021-11-08
Applicant: Amazon Technologies, Inc.
Inventor: Hooman Mahyar, Shivakumar Krishnamurthy, Arjun Cholkar, Rafael Soltanovich
IPC: H04N21/2187, H04N21/25, H04N21/472, H04N21/845
CPC classification number: H04N21/47217, H04N21/2187, H04N21/251, H04N21/8456
Abstract: Techniques for using a machine learning model to determine a proper subset of a multimedia file for a viewer based on their interest without the need to actively control a media player timeline are described. As one example, a computer-implemented method includes receiving a request at a content delivery service from a media player of a viewer to play a proper subset of a live multimedia file for a category of content of the live multimedia file without the viewer actively controlling a timeline of the media player of the viewer, determining an indication of a prior multimedia playing interaction of the viewer with the content delivery service, partitioning, by the content delivery service, the live multimedia file into a video portion, an audio portion, and a text portion, determining, by the content delivery service, one or more labels for the video portion, the audio portion, and the text portion, determining, by a machine learning model of the content delivery service, a proper subset of segments of the live multimedia file to send to the viewer based at least in part on the indication and the one or more labels, and live streaming the proper subset of segments of the live multimedia file to the media player of the viewer.
-
Publication number: US11321877B1
Publication date: 2022-05-03
Application number: US17000585
Application date: 2020-08-24
Applicant: Amazon Technologies, Inc.
Inventor: Hooman Mahyar, Arjun Cholkar, Harshal Dilip Wanjari
Abstract: Systems, methods, and computer-readable media are disclosed for systems and methods for automated selection of color palettes for video content. Example methods may include determining, by one or more computer processors coupled to memory, a first segment of video content, the first segment comprising a first set of frames, determining, using a first video processing algorithm, a first object that is present in the first set of frames, and determining, using a second video processing algorithm, a first semantic characteristic of the first segment. Some example methods may include generating a first vector representing the first object and the first semantic characteristic, and generating, using a first neural network and the first vector, a first color palette recommendation for the first segment. Selection of the first color palette recommendation may cause a color filter to be applied to the first set of frames.
-