Patent search ap:("Amazon Technologies Inc.") AND inv:"Vimal Bhat"

11.

Granted invention patent
Automated generation and presentation of textual descriptions of video content [Status: In force]

Publication (announcement) number: US10999566B1

Publication (announcement) date: 2021-05-04

Application number: US16563485

Application date: 2019-09-06

Applicant: Amazon Technologies, Inc.

Inventor: Hooman Mahyar, Vimal Bhat, Jatin Jain, Udit Bhatia, Roya Hosseini

IPC: H04N9/87, G06K9/00, G06F16/738, G06N3/04, G06F16/78, G06F40/166, G10L13/00, G10L17/00

Abstract: Systems, methods, and computer-readable media are disclosed for systems and methods for automated generation of textual descriptions of video content. Example methods may include determining, by one or more computer processors coupled to memory, a first segment of video content, the first segment including a first set of frames and first audio content, determining, using a first neural network, a first action that occurs in the first set of frames, and determining a first sound present in the first audio content. Some methods may include generating a vector representing the first action and the first sound, and generating, using a second neural network and the vector, a first textual description of the first segment, where the first textual description includes words that describe events of the first segment.

12.

Invention application
CUSTOMIZED ACTION BASED ON VIDEO ITEM EVENTS [Status: Under examination, published]

Publication (announcement) number: US20200175303A1

Publication (announcement) date: 2020-06-04

Application number: US16208074

Application date: 2018-12-03

Applicant: Amazon Technologies, Inc.

Inventor: Vimal Bhat, Shai Ben Nun, Hooman Mahyar, Harshal Wanjari

IPC: G06K9/32, G06N20/00, G06K9/00, H04N21/8549, H04L12/58

Abstract: A user may indicate an interest relating to events such as objects, persons, or activities, where the events are included in content depicted in a video. The user may also indicate a configurable action associated with the user interest, including receiving a notification via an electronic device. A video item, for example a live-streaming sporting event, may be broken into frames and analyzed frame-by-frame to determine a region of interest. The region of interest is then analyzed to identify objects, persons, or activities depicted in the frame. In particular, the region of interest is compared to stored images that are known to depict different objects, persons, or activities. When a region of interest is determined to be associated with the user interest, the configurable action is triggered.
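
The matching step in this abstract (compare a region of interest against stored reference images, then trigger the user's configured action) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the patent: regions and stored images are represented as small feature vectors, cosine similarity with a fixed threshold stands in for the comparison, and the names `UserInterest`, `best_label`, and `process_frames` are hypothetical.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Callable, Dict, List, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


@dataclass
class UserInterest:
    """Hypothetical record: an event label plus the action to run when it is seen."""
    label: str
    action: Callable[[str, int], None]


def best_label(roi: Sequence[float],
               references: Dict[str, Sequence[float]],
               threshold: float) -> Optional[str]:
    """Return the stored label most similar to the region of interest, if any
    reaches the threshold (assumed similarity measure, not the patent's)."""
    best, best_score = None, threshold
    for label, ref in references.items():
        score = cosine_similarity(roi, ref)
        if score >= best_score:
            best, best_score = label, score
    return best


def process_frames(roi_features: Sequence[Sequence[float]],
                   references: Dict[str, Sequence[float]],
                   interests: Sequence[UserInterest],
                   threshold: float = 0.9) -> List[int]:
    """Check each frame's region of interest and trigger matching actions.
    Returns the indices of frames that triggered an action."""
    by_label = {interest.label: interest for interest in interests}
    triggered = []
    for index, roi in enumerate(roi_features):
        label = best_label(roi, references, threshold)
        if label in by_label:
            by_label[label].action(label, index)
            triggered.append(index)
    return triggered
```

In a usage scenario, the `action` callback might append to a notification queue; a real system would extract the region features from decoded video frames rather than receive them as ready-made vectors.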

13.

Granted invention patent
Composite storefront image management [Status: In force]

Publication (announcement) number: US12211131B1

Publication (announcement) date: 2025-01-28

Application number: US17935796

Application date: 2022-09-27

Applicant: Amazon Technologies, Inc.

Inventor: Najmeh Sadoughi Nourabadi, Rohith Mysore Vijaya Kumar, Vimal Bhat

IPC: G06T11/60, G06T7/194, G06T7/60, G06T7/70, G06V10/74, G06V10/762, G06V10/774

Abstract: Technologies are disclosed for managing composite storefront images. The composite storefront images can be generated utilizing portions of media content (e.g., visual content, such as videos, trailers, etc.), and templates generated based on sets of existing storefront images. Objects can be extracted from frames and composited together to generate coarse composite storefront images. The coarse composite storefront images can be utilized to map the object portions onto the media content artwork styles to generate refined composite storefront images based on the templates.

14.

Invention publication
Automated Generation and Presentation of Sign Language Avatars for Video Content [Status: Under examination, published]

Publication (announcement) number: US20240242413A1

Publication (announcement) date: 2024-07-18

Application number: US18432623

Application date: 2024-02-05

Applicant: Amazon Technologies, Inc.

Inventor: Avijit Vajpayee, Vimal Bhat, Arjun Cholkar, Louis Kirk Barker, Abhinav Jain

IPC: G06T13/40, G06F40/20, G06N3/08, G06T17/00, G06V20/40, G06V40/16, G06V40/20, G09B21/00, G10L25/63, H04N5/272

CPC classification number: G06T13/40, G06F40/20, G06N3/08, G06T17/00, G06V20/46, G06V40/174, G06V40/28, G09B21/009, G10L25/63, H04N5/272

Abstract: Systems, methods, and computer-readable media are disclosed for systems and methods for automated generation and presentation of sign language avatars for video content. Example methods may include determining, by one or more computer processors coupled to memory, a first segment of video content, the first segment including a first set of frames, first audio content, and first subtitle data, where the first subtitle data comprises a first word and a second word. Methods may include determining, using a first machine learning model, a first sign gesture associated with the first word, determining first motion data associated with the first sign gesture, and determining first facial expression data. Methods may include generating an avatar configured to perform the first sign gesture using the first motion data, where a facial expression of the avatar while performing the first sign gesture is based on the first facial expression data.

15.

Granted invention patent
Ensemble of machine learning models for automatic scene change detection [Status: In force]

Publication (announcement) number: US11776273B1

Publication (announcement) date: 2023-10-03

Application number: US17107514

Application date: 2020-11-30

Applicant: Amazon Technologies, Inc.

Inventor: Shixing Chen, Muhammad Raffay Hamid, Vimal Bhat, Shiva Krishnamurthy

IPC: G06V20/40, G06N5/04, G06N20/20, G10L25/78, G06F18/213

CPC classification number: G06V20/49, G06F18/213, G06N5/04, G06N20/20, G10L25/78

Abstract: Techniques for automatic scene change detection are described. As one example, a computer-implemented method includes receiving a request to train an ensemble of machine learning models on a training dataset of videos having labels that indicate scene changes to detect a scene change in a video, partitioning each video file of the training dataset of videos into a plurality of shots, training the ensemble of machine learning models into a trained ensemble of machine learning models based at least in part on the plurality of shots of the training dataset of videos and the labels that indicate scene changes, receiving an inference request for an input video, partitioning the input video into a plurality of shots, generating, by the trained ensemble of machine learning models, an inference of one or more scene changes in the input video based at least in part on the plurality of shots of the input video, and transmitting the inference to a client application or to a storage location.
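
The inference half of this abstract (partition the input video into shots, then let an ensemble decide which shot boundaries are scene changes) might look roughly like the following Python sketch. It rests on assumptions that do not come from the patent: each frame is reduced to a single number, shots are cut wherever consecutive frames differ by more than a threshold, each ensemble member is a plain callable returning a score in [0, 1], and the ensemble simply averages those scores. Training is omitted, and the names `partition_into_shots` and `detect_scene_changes` are hypothetical.

```python
from statistics import mean
from typing import Callable, List, Sequence

# Assumption: a frame is summarized by one float (e.g. mean brightness).
Shot = List[float]
# Assumption: a trained model scores how likely a scene change lies
# between two adjacent shots, returning a value in [0, 1].
BoundaryModel = Callable[[Shot, Shot], float]


def partition_into_shots(frames: Sequence[float], cut_threshold: float) -> List[Shot]:
    """Split frames into shots wherever consecutive frames differ sharply."""
    if not frames:
        return []
    shots: List[Shot] = [[frames[0]]]
    for previous, current in zip(frames, frames[1:]):
        if abs(current - previous) > cut_threshold:
            shots.append([current])      # start a new shot at the cut
        else:
            shots[-1].append(current)    # continue the current shot
    return shots


def detect_scene_changes(shots: Sequence[Shot],
                         models: Sequence[BoundaryModel],
                         vote_threshold: float = 0.5) -> List[int]:
    """Return indices i such that a scene change is inferred between
    shots[i - 1] and shots[i], using the average score of all models."""
    changes = []
    for i in range(1, len(shots)):
        scores = [model(shots[i - 1], shots[i]) for model in models]
        if mean(scores) >= vote_threshold:
            changes.append(i)
    return changes
```

Averaging is only one way to combine ensemble members; weighted voting or a learned combiner would fit the same interface. The returned indices could then be sent to a client application or written to storage, as the abstract describes.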
