-
Publication number: US10999566B1
Publication date: 2021-05-04
Application number: US16563485
Application date: 2019-09-06
Applicant: Amazon Technologies, Inc.
Inventor: Hooman Mahyar , Vimal Bhat , Jatin Jain , Udit Bhatia , Roya Hosseini
IPC: H04N9/87 , G06K9/00 , G06F16/738 , G06N3/04 , G06F16/78 , G06F40/166 , G10L13/00 , G10L17/00
Abstract: Systems, methods, and computer-readable media are disclosed for automated generation of textual descriptions of video content. Example methods may include determining, by one or more computer processors coupled to memory, a first segment of video content, the first segment including a first set of frames and first audio content, determining, using a first neural network, a first action that occurs in the first set of frames, and determining a first sound present in the first audio content. Some methods may include generating a vector representing the first action and the first sound, and generating, using a second neural network and the vector, a first textual description of the first segment, where the first textual description includes words that describe events of the first segment.
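The step of generating a vector that represents the detected action and sound can be illustrated with a minimal sketch. It assumes a simple one-hot encoding over small label sets; the label lists and function names below are hypothetical and do not come from the patent, which does not specify the encoding in this abstract.

```python
from typing import List

# Hypothetical label vocabularies; the abstract does not list actual classes.
ACTIONS = ["running", "talking", "driving"]
SOUNDS = ["speech", "engine", "music"]


def one_hot(label: str, vocabulary: List[str]) -> List[float]:
    """Return a one-hot list marking the position of label in vocabulary."""
    return [1.0 if label == item else 0.0 for item in vocabulary]


def build_segment_vector(action: str, sound: str) -> List[float]:
    """Concatenate the action and sound encodings into one segment vector."""
    return one_hot(action, ACTIONS) + one_hot(sound, SOUNDS)


# Example: a segment in which "driving" was detected alongside an "engine" sound.
vector = build_segment_vector("driving", "engine")
```

In a full system, a vector like this would be the input to the second neural network that produces the textual description; that network is outside the scope of this sketch.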
-
Publication number: US20200175303A1
Publication date: 2020-06-04
Application number: US16208074
Application date: 2018-12-03
Applicant: Amazon Technologies, Inc.
Inventor: Vimal Bhat , Shai Ben Nun , Hooman Mahyar , Harshal Wanjari
IPC: G06K9/32 , G06N20/00 , G06K9/00 , H04N21/8549 , H04L12/58
Abstract: A user may indicate an interest relating to events such as objects, persons, or activities, where the events are included in content depicted in a video. The user may also indicate a configurable action associated with the user interest, including receiving a notification via an electronic device. A video item, for example a live-streaming sporting event, may be broken into frames and analyzed frame-by-frame to determine a region of interest. The region of interest is then analyzed to identify objects, persons, or activities depicted in the frame. In particular, the region of interest is compared to stored images that are known to depict different objects, persons, or activities. When a region of interest is determined to be associated with the user interest, the configurable action is triggered.
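The compare-then-trigger flow described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the patented method: regions of interest and stored images are assumed to be already reduced to feature vectors, cosine similarity with a fixed threshold stands in for the comparison, and all names (match_region, process_frame, the labels) are hypothetical.

```python
import math
from typing import Callable, Dict, List, Optional


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_region(region: List[float], references: Dict[str, List[float]],
                 threshold: float = 0.9) -> Optional[str]:
    """Return the label of the best-matching stored reference, or None."""
    best_label, best_score = None, threshold
    for label, reference in references.items():
        score = cosine_similarity(region, reference)
        if score >= best_score:
            best_label, best_score = label, score
    return best_label


def process_frame(region: List[float], references: Dict[str, List[float]],
                  user_interest: str, action: Callable[[str], None]) -> bool:
    """Trigger the configurable action when the region matches the user interest."""
    label = match_region(region, references)
    if label is not None and label == user_interest:
        action(label)
        return True
    return False


# Hypothetical stored references and a notification list standing in for a device.
references = {"goal": [1.0, 0.0], "foul": [0.0, 1.0]}
notifications: List[str] = []
triggered = process_frame([0.95, 0.05], references, "goal", notifications.append)
```

Passing the action as a callable mirrors the "configurable action" in the abstract: the same matching code can send a notification, start a recording, or do anything else the user has configured.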
-
Publication number: US12211131B1
Publication date: 2025-01-28
Application number: US17935796
Application date: 2022-09-27
Applicant: Amazon Technologies, Inc.
Inventor: Najmeh Sadoughi Nourabadi , Rohith Mysore Vijaya Kumar , Vimal Bhat
IPC: G06T11/60 , G06T7/194 , G06T7/60 , G06T7/70 , G06V10/74 , G06V10/762 , G06V10/774
Abstract: Technologies are disclosed for managing composite storefront images. The composite storefront images can be generated utilizing portions of media content (e.g., visual content, such as videos, trailers, etc.), and templates generated based on sets of existing storefront images. Objects can be extracted from frames and composited together to generate coarse composite storefront images. The coarse composite storefront images can be utilized to map the object portions onto the media content artwork styles to generate refined composite storefront images based on the templates.
-
Publication number: US20240242413A1
Publication date: 2024-07-18
Application number: US18432623
Application date: 2024-02-05
Applicant: Amazon Technologies, Inc.
Inventor: Avijit Vajpayee , Vimal Bhat , Arjun Cholkar , Louis Kirk Barker , Abhinav Jain
IPC: G06T13/40 , G06F40/20 , G06N3/08 , G06T17/00 , G06V20/40 , G06V40/16 , G06V40/20 , G09B21/00 , G10L25/63 , H04N5/272
CPC classification number: G06T13/40 , G06F40/20 , G06N3/08 , G06T17/00 , G06V20/46 , G06V40/174 , G06V40/28 , G09B21/009 , G10L25/63 , H04N5/272
Abstract: Systems, methods, and computer-readable media are disclosed for automated generation and presentation of sign language avatars for video content. Example methods may include determining, by one or more computer processors coupled to memory, a first segment of video content, the first segment including a first set of frames, first audio content, and first subtitle data, where the first subtitle data comprises a first word and a second word. Methods may include determining, using a first machine learning model, a first sign gesture associated with the first word, determining first motion data associated with the first sign gesture, and determining first facial expression data. Methods may include generating an avatar configured to perform the first sign gesture using the first motion data, where a facial expression of the avatar while performing the first sign gesture is based on the first facial expression data.
-
Publication number: US11776273B1
Publication date: 2023-10-03
Application number: US17107514
Application date: 2020-11-30
Applicant: Amazon Technologies, Inc.
Inventor: Shixing Chen , Muhammad Raffay Hamid , Vimal Bhat , Shiva Krishnamurthy
IPC: G06V20/40 , G06N5/04 , G06N20/20 , G10L25/78 , G06F18/213
CPC classification number: G06V20/49 , G06F18/213 , G06N5/04 , G06N20/20 , G10L25/78
Abstract: Techniques for automatic scene change detection are described. As one example, a computer-implemented method includes receiving a request to train an ensemble of machine learning models on a training dataset of videos having labels that indicate scene changes to detect a scene change in a video, partitioning each video file of the training dataset of videos into a plurality of shots, training the ensemble of machine learning models into a trained ensemble of machine learning models based at least in part on the plurality of shots of the training dataset of videos and the labels that indicate scene changes, receiving an inference request for an input video, partitioning the input video into a plurality of shots, generating, by the trained ensemble of machine learning models, an inference of one or more scene changes in the input video based at least in part on the plurality of shots of the input video, and transmitting the inference to a client application or to a storage location.
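The inference half of this pipeline (partition the input video into shots, then let an ensemble decide where scenes change) can be sketched as below. The sketch makes several assumptions that are not in the abstract: each frame is reduced to a single scalar signature, shots are cut where consecutive signatures differ by more than a threshold, the "models" are plain callables rather than trained networks, and the ensemble combines them by majority vote. Training is omitted entirely.

```python
from typing import Callable, List, Sequence

Shot = List[int]  # a shot is a list of consecutive frame indices


def partition_into_shots(signatures: Sequence[float],
                         cut_threshold: float = 0.5) -> List[Shot]:
    """Group frame indices into shots, cutting where signatures jump sharply."""
    shots: List[Shot] = []
    for index, signature in enumerate(signatures):
        if shots and abs(signature - signatures[index - 1]) <= cut_threshold:
            shots[-1].append(index)   # same shot as the previous frame
        else:
            shots.append([index])     # first frame, or a hard cut: start a new shot
    return shots


def ensemble_scene_changes(shots: List[Shot],
                           models: Sequence[Callable[[Shot, Shot], bool]]) -> List[int]:
    """Return each i where most models flag a scene change between shots i and i+1."""
    changes: List[int] = []
    for i in range(len(shots) - 1):
        votes = sum(1 for model in models if model(shots[i], shots[i + 1]))
        if votes * 2 > len(models):   # strict majority
            changes.append(i)
    return changes


# Hypothetical per-frame signatures and three stand-in "models".
signatures = [0.1, 0.12, 0.9, 0.88, 0.2, 0.21]
shots = partition_into_shots(signatures)
models = [
    lambda a, b: b[0] >= 4,   # flags boundaries late in the video
    lambda a, b: True,        # always flags a change
    lambda a, b: False,       # never flags a change
]
changes = ensemble_scene_changes(shots, models)
```

The list returned by ensemble_scene_changes corresponds to the "inference" that the abstract says is transmitted to a client application or storage location.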