-
Publication No.: US11037304B1
Publication Date: 2021-06-15
Application No.: US16127052
Filing Date: 2018-09-10
Applicant: Amazon Technologies, Inc.
Inventor: Ryan Barlow Dall, Hooman Mahyar
IPC: G06F16/738, G06T7/246, H04N21/439, G06N3/04
Abstract: This disclosure is directed to systems and methods that automatically detect static content within a media item. While consuming a media item, such as a movie, a user might unexpectedly notice that a portion of the movie does not change, resulting in a poor user experience. By dividing the media item into portions and analyzing the portions, the systems and methods described can automatically detect the static content and, in some instances, correct the static content.
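Illustrative sketch (not taken from the patent): one simple way to flag static content is to compare consecutive frames and report long runs with almost no change. The frame representation (flattened grayscale pixel lists), the function names, and the `threshold` and `min_frames` parameters are all assumptions made for this example; the patent abstract does not specify how portions are analyzed.

```python
from typing import List, Sequence, Tuple

# Hypothetical frame representation: a flattened list of grayscale pixel values.
Frame = Sequence[int]


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Average absolute per-pixel difference between two equally sized frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def find_static_runs(
    frames: List[Frame],
    threshold: float = 1.0,   # assumed: max difference still treated as "no change"
    min_frames: int = 3,      # assumed: shortest run worth reporting
) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) frame indices of runs of near-identical frames."""
    runs: List[Tuple[int, int]] = []
    start = 0
    for i in range(1, len(frames) + 1):
        # A run ends at the end of the input or when the next frame differs enough.
        changed = i == len(frames) or mean_abs_diff(frames[i - 1], frames[i]) > threshold
        if changed:
            if i - start >= min_frames:
                runs.append((start, i - 1))
            start = i
    return runs


if __name__ == "__main__":
    demo = [[0, 0], [10, 10], [10, 10], [10, 10], [10, 10], [50, 50]]
    print(find_static_runs(demo))
```

In this toy input, frames 1 through 4 are identical, so they form the only reported run. A production system would likely operate on decoded video and audio rather than pixel lists.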
-
Publication No.: US10999566B1
Publication Date: 2021-05-04
Application No.: US16563485
Filing Date: 2019-09-06
Applicant: Amazon Technologies, Inc.
Inventor: Hooman Mahyar, Vimal Bhat, Jatin Jain, Udit Bhatia, Roya Hosseini
IPC: H04N9/87, G06K9/00, G06F16/738, G06N3/04, G06F16/78, G06F40/166, G10L13/00, G10L17/00
Abstract: Systems, methods, and computer-readable media are disclosed for automated generation of textual descriptions of video content. Example methods may include determining, by one or more computer processors coupled to memory, a first segment of video content, the first segment including a first set of frames and first audio content; determining, using a first neural network, a first action that occurs in the first set of frames; and determining a first sound present in the first audio content. Some methods may include generating a vector representing the first action and the first sound, and generating, using a second neural network and the vector, a first textual description of the first segment, where the first textual description includes words that describe events of the first segment.
-
Publication No.: US20200175303A1
Publication Date: 2020-06-04
Application No.: US16208074
Filing Date: 2018-12-03
Applicant: Amazon Technologies, Inc.
Inventor: Vimal Bhat, Shai Ben Nun, Hooman Mahyar, Harshal Wanjari
IPC: G06K9/32, G06N20/00, G06K9/00, H04N21/8549, H04L12/58
Abstract: A user may indicate an interest relating to events such as objects, persons, or activities, where the events are included in content depicted in a video. The user may also indicate a configurable action associated with the user interest, including receiving a notification via an electronic device. A video item, for example a live-streaming sporting event, may be broken into frames and analyzed frame-by-frame to determine a region of interest. The region of interest is then analyzed to identify objects, persons, or activities depicted in the frame. In particular, the region of interest is compared to stored images that are known to depict different objects, persons, or activities. When a region of interest is determined to be associated with the user interest, the configurable action is triggered.
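Illustrative sketch (not taken from the patent): the compare-and-trigger flow described above could look roughly like the following, assuming each region of interest and each stored reference image has already been reduced to a numeric feature vector. Cosine similarity, the `min_similarity` cutoff, and every name here are hypothetical choices for this example; the abstract does not say how the comparison is performed.

```python
import math
from typing import Callable, Dict, List, Optional, Sequence

Vector = Sequence[float]  # hypothetical feature vector for an image region


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify_region(
    region: Vector,
    references: Dict[str, Vector],
    min_similarity: float = 0.9,  # assumed cutoff for accepting a match
) -> Optional[str]:
    """Return the label of the most similar stored reference, or None."""
    best_label, best_score = None, min_similarity
    for label, ref in references.items():
        score = cosine(region, ref)
        if score >= best_score:
            best_label, best_score = label, score
    return best_label


def process_frames(
    region_vectors: List[Vector],
    references: Dict[str, Vector],
    interest: str,
    action: Callable[[int], None],
) -> int:
    """Call `action(frame_index)` for each frame whose region matches the user interest."""
    fired = 0
    for idx, region in enumerate(region_vectors):
        if classify_region(region, references) == interest:
            action(idx)  # the user-configured action, e.g. sending a notification
            fired += 1
    return fired


if __name__ == "__main__":
    refs = {"goal": [1.0, 0.0], "foul": [0.0, 1.0]}
    regions = [[0.9, 0.1], [0.1, 0.9], [1.0, 0.0]]
    hits: List[int] = []
    process_frames(regions, refs, "goal", hits.append)
    print(hits)
```

Here the configurable action is modeled as a plain callback, so a notification sender or any other handler can be substituted without changing the matching logic.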
-
Publication No.: US10423660B1
Publication Date: 2019-09-24
Application No.: US15835256
Filing Date: 2017-12-07
Applicant: Amazon Technologies, Inc.
Inventor: Donghyeok Heo, Hooman Mahyar
IPC: G06F17/30, G10L15/26, G06F16/68, H04N21/488, G06F16/30
Abstract: Techniques for identifying and correcting synchronization errors between audio and subtitles for media content are described herein. For example, a portion of a subtitle file associated with media content may be extracted based on subtitle cues included in the portion of the subtitle file. In embodiments, an audio-to-text file may be generated from the extracted portion using a speech-to-text algorithm. A detected subtitle text file may be generated using the subtitle file, the audio-to-text file, and an edit distance algorithm. In embodiments, one or more synchronization errors between the audio and subtitles for the media content may be identified based on time stamp information associated with the audio-to-text file and a subtitle cue for the extracted portion of the subtitle file.
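Illustrative sketch (not taken from the patent): the idea of pairing subtitle cues with speech-to-text output via edit distance, then comparing time stamps, might be approximated as below. The `Cue` type, the word-level Levenshtein distance, and the `max_distance` and `tolerance` parameters are assumptions for this example, and the transcript is assumed to be already available as timed text segments.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Cue:
    """Hypothetical timed text segment (subtitle cue or speech-to-text output)."""
    start: float  # start time in seconds
    text: str


def edit_distance(a: List[str], b: List[str]) -> int:
    """Word-level Levenshtein distance using a rolling row."""
    prev = list(range(len(b) + 1))
    for i, wa in enumerate(a, 1):
        cur = [i]
        for j, wb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (wa != wb)))   # substitution or match
        prev = cur
    return prev[-1]


def find_sync_errors(
    subtitles: List[Cue],
    transcript: List[Cue],
    max_distance: int = 1,    # assumed: max word edits for two texts to "match"
    tolerance: float = 0.5,   # assumed: allowed timing drift in seconds
) -> List[Tuple[str, float]]:
    """Return (subtitle text, offset in seconds) for cues that drift past `tolerance`."""
    errors: List[Tuple[str, float]] = []
    for sub in subtitles:
        words = sub.text.lower().split()
        scored = [(edit_distance(words, seg.text.lower().split()), seg) for seg in transcript]
        if not scored:
            continue
        dist, best = min(scored, key=lambda pair: pair[0])
        if dist > max_distance:
            continue  # no transcript segment is close enough to this cue
        offset = best.start - sub.start
        if abs(offset) > tolerance:
            errors.append((sub.text, offset))
    return errors


if __name__ == "__main__":
    subs = [Cue(1.0, "hello there"), Cue(5.0, "how are you")]
    heard = [Cue(1.2, "hello there"), Cue(7.0, "how are you")]
    print(find_sync_errors(subs, heard))
```

A positive offset means the audio occurs later than the subtitle cue, so the cue would need to be delayed by that amount to correct the error.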
-