Patent search ap:("Amazon Technologies, Inc.") AND inv:"Muhammad Raffay Hamid"

1.

Granted invention patent
Contrastive learning of scene representation guided by video similarities (valid)

Publication (announcement) number: US12067779B1

Publication (announcement) date: 2024-08-20

Application number: US17668014

Application date: 2022-02-09

Applicant: Amazon Technologies, Inc.

Inventor: Shixing Chen , Xiang Hao , Xiaohan Nie , Muhammad Raffay Hamid

IPC: G06V20/40 , G06V10/774

CPC classification number: G06V20/48 , G06V10/774 , G06V20/46

Abstract: A plurality of similar video pairs may be determined based on one or more similarity information types. Each video pair of the plurality of similar video pairs may include a first respective video and a second respective video. For each video pair, one or more similar scene pairs may be determined. Each of the one or more similar scene pairs may include a respective first scene from the first respective video and a second respective scene from the second respective video. An encoder may be trained using a contrastive learning model that contrasts a plurality of similar scene pairs with a plurality of random scenes. The plurality of similar scene pairs may include the one or more scene pairs for each video pair. One or more scene features of one or more other scenes of one or more other videos may be determined using the encoder.
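
The contrast between a similar scene pair and random scenes can be sketched as follows. The abstract does not name a loss function, so the InfoNCE form, the cosine similarity, the `temperature` value, and all function names here are assumptions chosen for illustration, not the patented method.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    # Scale a vector to unit length so dot products become cosine similarities.
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss for one scene embedding (assumed loss, hypothetical names).

    anchor and positive are embeddings of a similar scene pair;
    negatives are embeddings of random scenes.
    """
    a = normalize(anchor)
    logits = [dot(a, normalize(positive)) / temperature]
    logits += [dot(a, normalize(n)) / temperature for n in negatives]
    # Numerically stable log-sum-exp over the positive and all negatives.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    # Negative log-probability of picking the positive among all candidates.
    return log_sum - logits[0]
```

The loss is small when the anchor is closer to its paired scene than to the random scenes, and large when a random scene is closer.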

2.

Granted invention patent
Depth-guided structure-from-motion techniques (valid)

Publication (announcement) number: US12046002B1

Publication (announcement) date: 2024-07-23

Application number: US17684197

Application date: 2022-03-01

Applicant: Amazon Technologies, Inc.

Inventor: Xiaohan Nie , Michael Thomas Pecchia , Leo Chan , Ahmed Aly Saad Ahmed , Muhammad Raffay Hamid , Sheng Liu

IPC: G06T7/73 , G06T7/55

CPC classification number: G06T7/73 , G06T7/55

Abstract: Systems, devices, and methods are provided for depth-guided structure from motion. A system may obtain a plurality of image frames from a digital content item that corresponds to a scene and determine, based at least in part on a correspondence search, a set of 2-D keypoints for the plurality of image frames. A depth estimator may be used to determine a plurality of dense depth maps for the plurality of image frames. The set of 2-D keypoints and the plurality of dense depth maps may be used to determine a corresponding set of depth priors. Initialization and/or depth-regularized optimization may be performed using the keypoints and depth priors.
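
One plausible reading of how keypoints and a dense depth map yield depth priors is a per-keypoint lookup in the depth map. The sketch below assumes nearest-pixel sampling with clamping at the image border; the abstract does not specify the sampling scheme, and `depth_priors` is a hypothetical name.

```python
def depth_priors(keypoints, depth_map):
    """Read a depth prior for each 2-D keypoint from a dense depth map.

    keypoints: list of (x, y) pixel coordinates (x = column, y = row).
    depth_map: list of rows, each a list of depth values.
    Nearest-pixel sampling is an assumption made for this sketch.
    """
    height = len(depth_map)
    width = len(depth_map[0])
    priors = []
    for x, y in keypoints:
        # Round to the nearest pixel and clamp to the image bounds.
        col = min(max(int(round(x)), 0), width - 1)
        row = min(max(int(round(y)), 0), height - 1)
        priors.append(depth_map[row][col])
    return priors
```

A real pipeline would more likely interpolate between pixels and discard keypoints whose depth estimate is unreliable.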

3.

Granted invention patent
Language agnostic drift correction (valid)

Publication (announcement) number: US11625928B1

Publication (announcement) date: 2023-04-11

Application number: US17009311

Application date: 2020-09-01

Applicant: Amazon Technologies, Inc.

Inventor: Tamojit Chatterjee , Mayank Sharma , Muhammad Raffay Hamid , Sandeep Joshi

IPC: G06F17/00 , G06V20/62 , G11B27/10 , G06N7/00 , G06F40/169 , G06V20/40

Abstract: Systems, methods, and computer-readable media are disclosed for language-agnostic subtitle drift detection and correction. A method may include determining subtitles and/or captions from media content (e.g., videos), the subtitles and/or captions corresponding to dialog in the media content. The subtitles may be broken up into segments which may be analyzed to determine a likelihood of drift (e.g., a likelihood that the subtitles are out of synchronization with the dialog in the media content) for each segment. For segments with a high likelihood of drift, the subtitles may be incrementally adjusted to determine an adjustment that eliminates and/or reduces the amount of drift and the drift in the segment may be corrected based on the drift amount detected. A linear regression model and/or human blocks determined by human operators may be used to otherwise optimize drift correction.
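
The abstract mentions a linear regression model without saying what it is fitted to. As a rough sketch, assume per-segment drift measurements are regressed against segment timestamps and the fitted line is then subtracted from each subtitle time. The sign convention (drift = subtitle time minus dialog time) and the function names are assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def correct_time(t, slope, intercept):
    """Shift a subtitle timestamp by the drift predicted at time t.

    Assumes drift is measured as subtitle time minus dialog time,
    so a positive drift means the subtitle appears late.
    """
    return t - (slope * t + intercept)
```

For example, measurements of 0.5 s, 1.5 s, and 2.5 s of drift at 0 s, 100 s, and 200 s give a slope of 0.01 and an intercept of 0.5, so a subtitle at 100 s would move to 98.5 s.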

4.

Granted invention patent
Temporal localization of mature content in long-form videos using only video-level labels (valid)

Publication (announcement) number: US11829413B1

Publication (announcement) date: 2023-11-28

Application number: US17030103

Application date: 2020-09-23

Applicant: Amazon Technologies, Inc.

Inventor: Xiang Hao , Jingxiang Chen , Vernon Germano , Muhammad Raffay Hamid , Lakshay Sharma

IPC: G06F16/783 , G06N20/00 , G06F16/75

CPC classification number: G06F16/7847 , G06F16/75 , G06N20/00

Abstract: Techniques for temporal localization of mature content in long-form videos using only video-level labels are described. According to some embodiments, a computer-implemented method includes receiving a request to train a machine learning model on a training video file comprising at least one mature content label, training the machine learning model to generate a feature vector for each of a plurality of video frames of the training video file, generate a plurality of frame-level mature content classification scores of the training video file from the feature vectors of the training video file, and generate a video-level mature content classification score of the training video file from the plurality of frame-level mature content classification scores for the training video file based at least in part on the at least one mature content label of the training video file, receiving a request for an input video file, generating, by the machine learning model in response to the request, a feature vector for each of a plurality of video frames of the input video file, a plurality of frame-level mature content classification scores of the input video file from the feature vectors of the input video file, and a video-level mature content classification score of the input video file from the plurality of frame-level mature content classification scores for the input video file, and transmitting the plurality of frame-level mature content classification scores of the input video file or the video-level mature content classification score of the input video file to a client application or to a storage location.
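
The step from frame-level scores to a video-level score can be illustrated with a simple pooling rule. The abstract does not state how the scores are combined, so the top-k mean used here, the threshold, and both function names are assumptions for the sake of a concrete example.

```python
def video_level_score(frame_scores, k=3):
    """Pool frame-level scores into one video-level score.

    Uses the mean of the k highest frame scores (an assumed pooling rule).
    """
    if not frame_scores:
        raise ValueError("frame_scores must not be empty")
    top = sorted(frame_scores, reverse=True)[:k]
    return sum(top) / len(top)

def localize(frame_scores, threshold=0.5):
    """Return (start, end) frame ranges, end exclusive, scoring at or above threshold."""
    segments = []
    start = None
    for i, score in enumerate(frame_scores):
        if score >= threshold and start is None:
            start = i  # a flagged run begins
        elif score < threshold and start is not None:
            segments.append((start, i))  # the run ended at the previous frame
            start = None
    if start is not None:
        segments.append((start, len(frame_scores)))
    return segments
```

Because only the pooled score is compared with the video-level label during training, the frame-level scores can be learned without per-frame annotations and then thresholded to localize content in time.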

5.

Granted invention patent
Shot contrastive self-supervised learning of a plurality of machine learning models for video analysis applications (valid)

Publication (announcement) number: US11748988B1

Publication (announcement) date: 2023-09-05

Application number: US17236688

Application date: 2021-04-21

Applicant: Amazon Technologies, Inc.

Inventor: Shixing Chen , Xiaohan Nie , David Jiatian Fan , Dongqing Zhang , Vimal Bhat , Muhammad Raffay Hamid

IPC: G06V20/40 , G06N20/00 , G06N5/04 , G06F16/73 , G06F16/78 , G11B27/34 , H04N5/14 , G11B27/036 , G06V10/75 , G06F18/22 , G06F18/214

CPC classification number: G06V20/46 , G06F16/73 , G06F16/78 , G06F18/214 , G06F18/22 , G06N5/04 , G06N20/00 , G06V10/751 , G06V20/49 , G11B27/036 , G11B27/34 , H04N5/147

Abstract: Techniques for automatic scene change detection in a video are described. As one example, a computer-implemented method includes extracting features of a query shot and its neighboring shots of a first set of shots without labels with a query model, determining a key shot of the neighboring shots which is most similar to the query shot based at least in part on the features of the query shot and its neighboring shots, extracting features of the key shot with a key model, training the query model into a trained query model based at least in part on a comparison of the features of the query shot and the features of the key shot, extracting features of a second set of shots with labels with the trained query model, and training a temporal model into a trained temporal model based at least in part on the features extracted from the second set of shots and the labels of the second set of shots.
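
Selecting the key shot amounts to a nearest-neighbor search over the query shot's neighbors. The sketch below assumes cosine similarity as the similarity measure, which the abstract does not specify; `select_key_shot` is a hypothetical name.

```python
import math

def cosine(u, v):
    # Cosine similarity between two feature vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def select_key_shot(query_feat, neighbor_feats):
    """Return the index of the neighboring shot most similar to the query shot."""
    sims = [cosine(query_feat, f) for f in neighbor_feats]
    return max(range(len(sims)), key=sims.__getitem__)
```

The selected shot then serves as the positive example whose key-model features are compared with the query shot's features during training.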

6.

Granted invention patent
Systems and methods for video-based sports field registration (valid)

Publication (announcement) number: US11468578B2

Publication (announcement) date: 2022-10-11

Application number: US16948348

Application date: 2020-09-14

Applicant: Amazon Technologies, Inc.

Inventor: Xiaohan Nie , Muhammad Raffay Hamid

IPC: G06T7/33 , G06T7/73 , G06V20/40 , G06V10/75 , G06K9/62 , H04N5/272 , H04N21/2187 , H04N21/234

Abstract: Methods and systems are described for registering a sports field to a video. Video of a live event may feature participants at a venue. A template of the venue, including virtual markings that represent real markings on the venue, may be obtained. A homographic transformation between an image plane and a ground plane may be determined by matching virtual markings to corresponding real markings captured in at least one frame of the video. The determined homographic transformation may be used in the automated analysis of sports statistics and in improving inserted annotations and visualizations.
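
Once a homographic transformation is known, mapping a point between the ground plane and the image plane is a 3x3 matrix product followed by a perspective divide. The sketch below shows only that mapping step, not the estimation of the matrix from matched markings; the function name and the example matrix are illustrative.

```python
def apply_homography(H, point):
    """Map a 2-D point through a 3x3 homography H given as a list of rows."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    # Divide by the homogeneous coordinate to return to 2-D.
    return (u / w, v / w)

# Example: a template point mapped with a scale-and-translate homography.
H_example = [[2, 0, 10],
             [0, 2, 20],
             [0, 0, 1]]
image_point = apply_homography(H_example, (5, 5))
```

Applying the inverse matrix maps image pixels back to field coordinates, which is what makes the transformation useful for statistics and for placing annotations.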

7.

Granted invention patent
Systems and methods for content-based indexing of videos at web-scale (valid)

Publication (announcement) number: US11341185B1

Publication (announcement) date: 2022-05-24

Application number: US16386992

Application date: 2019-04-17

Applicant: Amazon Technologies, Inc.

Inventor: Muhammad Raffay Hamid

IPC: G06F16/71 , G06F16/783 , G10L25/57 , G06V20/40 , G06K9/62

Abstract: Techniques for content-based indexing of videos at web-scale are described. As one example, a computer-implemented method includes receiving a video file, splitting the video file into video frames and audio for the video frames, determining audial features for the audio, clustering each of a plurality of subsets of the audial features into a respective audio centroid for a shared set of bases, determining a first adjacency matrix of distances between the respective audio centroids, determining visual features for the video frames, clustering each of a plurality of subsets of the visual features into a respective video centroid, and determining a second adjacency matrix of distances between the respective video centroids.
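
The centroid and adjacency-matrix steps can be sketched in a few lines. This example simplifies the clustering to taking the mean of each feature subset and assumes Euclidean distance; neither choice is stated in the abstract, and the function names are hypothetical.

```python
import math

def centroid(vectors):
    """Mean of a subset of feature vectors (a stand-in for the clustering step)."""
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]

def adjacency_matrix(centroids):
    """Square matrix of pairwise Euclidean distances between centroids."""
    return [[math.dist(a, b) for b in centroids] for a in centroids]

# Example: two feature subsets, each reduced to one centroid.
subsets = [[[0.0, 0.0], [2.0, 0.0]], [[4.0, 4.0], [4.0, 4.0]]]
matrix = adjacency_matrix([centroid(s) for s in subsets])
```

The same two functions would be applied once to the audio features and once to the visual features to obtain the first and second adjacency matrices.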

8.

Invention application
SYSTEMS AND METHODS FOR VIDEO-BASED SPORTS FIELD REGISTRATION (valid)

Publication (announcement) number: US20220084222A1

Publication (announcement) date: 2022-03-17

Application number: US16948348

Application date: 2020-09-14

Applicant: Amazon Technologies, Inc.

Inventor: Xiaohan Nie , Muhammad Raffay Hamid

IPC: G06T7/33 , G06K9/00 , G06K9/62 , G06T7/73 , H04N5/272

Abstract: Methods and systems are described for registering a sports field to a video. Video of a live event may feature participants at a venue. A template of the venue, including virtual markings that represent real markings on the venue, may be obtained. A homographic transformation between an image plane and a ground plane may be determined by matching virtual markings to corresponding real markings captured in at least one frame of the video. The determined homographic transformation may be used in the automated analysis of sports statistics and in improving inserted annotations and visualizations.

9.

Granted invention patent
Language agnostic automated voice activity detection (valid)

Publication (announcement) number: US11205445B1

Publication (announcement) date: 2021-12-21

Application number: US16436351

Application date: 2019-06-10

Applicant: Amazon Technologies, Inc.

Inventor: Mayank Sharma , Sandeep Joshi , Muhammad Raffay Hamid

IPC: G10L25/84 , G10L15/22 , G10L25/18 , G10L15/06 , G10L15/16

Abstract: Systems, methods, and computer-readable media are disclosed for language agnostic automated voice activity detection. Example methods may include determining an audio file associated with video content, generating a plurality of audio segments using the audio file, the plurality of audio segments including a first segment and a second segment, where the first segment and the second segment are consecutive segments. Example methods may include determining, using a Gated Recurrent Unit neural network, that the first segment includes first voice activity, determining, using the Gated Recurrent Unit neural network, that the second segment includes second voice activity, and determining that voice activity is present between a first timestamp associated with the first segment and a second timestamp associated with the second segment.
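
The final step, turning per-segment decisions into a voiced time span, can be sketched as a merge over consecutive segments. The per-segment flags below stand in for the output of the Gated Recurrent Unit network, which is not implemented here; the function name and tuple layout are assumptions.

```python
def voiced_intervals(segments):
    """Merge consecutive voiced segments into (start, end) intervals.

    segments: time-ordered list of (start, end, is_voiced) tuples, where
    is_voiced is the classifier's decision for that segment.
    """
    intervals = []
    for start, end, is_voiced in segments:
        if not is_voiced:
            continue
        if intervals and intervals[-1][1] == start:
            # This segment directly follows the previous voiced one: extend it.
            intervals[-1] = (intervals[-1][0], end)
        else:
            intervals.append((start, end))
    return intervals
```

Two consecutive voiced segments covering 0 to 1 s and 1 to 2 s therefore produce a single interval from 0 to 2 s, matching the abstract's "between a first timestamp ... and a second timestamp".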

10.

Granted invention patent
Systems and methods for video-based sports field registration (valid)

Publication (announcement) number: US12211222B2

Publication (announcement) date: 2025-01-28

Application number: US18481179

Application date: 2023-10-04

Applicant: Amazon Technologies, Inc.

Inventor: Xiaohan Nie , Muhammad Raffay Hamid

IPC: G06T7/33 , G06F18/21 , G06F18/214 , G06F18/40 , G06T7/73 , G06V10/75 , G06V20/40 , H04N5/272 , H04N21/2187 , H04N21/234

Abstract: Methods and systems are described for registering a sports field to a video. Video of a live event may feature participants at a venue. A template of the venue, including virtual markings that represent real markings on the venue, may be obtained. A homographic transformation between an image plane and a ground plane may be determined by matching virtual markings to corresponding real markings captured in at least one frame of the video. The determined homographic transformation may be used in the automated analysis of sports statistics and in improving inserted annotations and visualizations.
