-
Publication No.: US12062367B1
Publication date: 2024-08-13
Application No.: US17360903
Filing date: 2021-06-28
Applicant: Amazon Technologies, Inc.
Inventor: Steve Huynh, Erik Martin Zigman, Ronald Diaz
IPC: G10L15/18, G06F16/783, G06F16/901, G06N20/00, G10L15/06, G10L15/183, H04N21/439, H04N21/44
CPC classification number: G10L15/1815, G06F16/783, G06F16/9024, G06N20/00, G10L15/063, G10L15/1822, G10L15/183, H04N21/4394, H04N21/44008
Abstract: Systems, devices, and methods are provided for processing video streams. Metadata is extracted from an input video stream and processed using a video stream analyzer. The extracted metadata may be correlated along a time dimension. A metadata graph is generated based on relationships between various information present in the video stream as well as external fact sources. Machine learning models may be trained to receive an input phrase, determine a graph query from the input, and determine an output by traversing the metadata graph according to the graph query.
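Illustrative sketch (not from the patent): the abstract describes a metadata graph that is traversed according to a graph query. The Python below shows one minimal way such a traversal could look, assuming the query has already been derived from the input phrase. The edge tuples, relation names, and the functions `build_graph` and `run_query` are all hypothetical, and the machine learning step that maps a phrase to a query is left out.

```python
from collections import defaultdict

# Hypothetical time-stamped metadata: (subject, relation, object, second).
# A second of None marks a fact taken from an external source, not the stream.
EDGES = [
    ("scene_1", "features_actor", "Actor A", 12),
    ("scene_1", "plays_song", "Song X", 15),
    ("Song X", "performed_by", "Band Y", None),
    ("scene_2", "features_actor", "Actor B", 80),
]


def build_graph(edges):
    """Index edges by subject so each node lists its outgoing relations."""
    graph = defaultdict(list)
    for subject, relation, obj, second in edges:
        graph[subject].append((relation, obj, second))
    return graph


def run_query(graph, start, relations):
    """Follow each relation in order from start; return the nodes reached."""
    frontier = {start}
    for relation in relations:
        frontier = {
            obj
            for node in frontier
            for rel, obj, _ in graph.get(node, [])
            if rel == relation
        }
    return frontier


graph = build_graph(EDGES)
# Assumed query for a phrase like "who performs the song in scene 1?"
result = run_query(graph, "scene_1", ["plays_song", "performed_by"])
print(result)
```

A query here is just a start node plus an ordered list of relations; a real system would likely use a graph database and a richer query language.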
-
Publication No.: US11961168B1
Publication date: 2024-04-16
Application No.: US17332355
Filing date: 2021-05-27
Applicant: Amazon Technologies, Inc.
Inventor: Steve Huynh, Erik Martin Zigman, Ronald Diaz
IPC: G06V10/40, G06F18/2413, G06T3/40, G06T11/60
CPC classification number: G06T11/60, G06F18/24147, G06T3/40, G06V10/40, G06V2201/10
Abstract: Systems, devices, and methods are provided for processing images using machine learning. Features may be obtained from an image using a residual network, such as ResNet-101. Features may be analyzed using a classification model such as K-nearest neighbors (K-NN). Features and metadata extracted from images may be used to generate other images. Templates may be used to generate various types of images. For example, assets from two images may be combined to create a third image.
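Illustrative sketch (not from the patent): the abstract mentions classifying image features with K-nearest neighbors. The Python below shows plain K-NN majority voting over feature vectors. It assumes the features have already been extracted (for example by a residual network); the three-dimensional vectors, the labels "poster" and "logo", and the function `knn_classify` are hypothetical stand-ins, since real ResNet-101 features are much higher-dimensional.

```python
import math
from collections import Counter

# Hypothetical (feature_vector, label) pairs standing in for extracted features.
LABELED = [
    ((0.9, 0.1, 0.0), "poster"),
    ((0.8, 0.2, 0.1), "poster"),
    ((0.85, 0.15, 0.05), "poster"),
    ((0.1, 0.9, 0.2), "logo"),
    ((0.2, 0.8, 0.1), "logo"),
]


def knn_classify(query, labeled_features, k=3):
    """Return the majority label among the k vectors closest to query."""
    nearest = sorted(
        labeled_features,
        key=lambda item: math.dist(query, item[0]),  # Euclidean distance
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


print(knn_classify((0.7, 0.3, 0.1), LABELED))
print(knn_classify((0.15, 0.85, 0.1), LABELED))
```

Euclidean distance and k=3 are arbitrary choices for the sketch; the abstract does not state which distance metric or neighbor count is used.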
-
Publication No.: US10832692B1
Publication date: 2020-11-10
Application No.: US16049369
Filing date: 2018-07-30
Applicant: Amazon Technologies, Inc.
Inventor: Ronald Diaz, Srikanth Kotagiri
IPC: G06F17/00, G10L19/018, G10L15/26, G06N7/00, G06N20/00, G06F16/683
Abstract: Techniques are described for verifying that an audio file corresponds to an instance of media content. An audio file is divided into a plurality of audio segments, and a digital fingerprint is generated for each of the plurality of audio segments. A digital signature is generated for the audio file by aggregating the digital fingerprints. The generated digital signature and at least one other digital signature corresponding to an instance of media content are processed as inputs to a linear regression machine learning model, to determine a measure of similarity between the generated digital signature and the at least one other digital signature. The linear regression machine learning model can be trained using a supervised learning approach and a set of training data. Embodiments determine whether the audio file corresponds to the instance of media content, based on the measure of similarity.
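Illustrative sketch (not from the patent): the abstract outlines a pipeline of segmenting audio, fingerprinting each segment, aggregating the fingerprints into a signature, and scoring two signatures with a supervised linear regression model. The Python below is a toy version of that flow under loud assumptions: the hash-based fingerprint, the single "match fraction" feature, the training pairs, the 0.5 threshold, and every function name are hypothetical, and the samples are plain integers rather than real audio.

```python
import hashlib


def fingerprint(segment):
    """Toy fingerprint: hash of coarsely quantized samples (assumption)."""
    quantized = ",".join(str(s // 10) for s in segment)
    return hashlib.sha256(quantized.encode()).hexdigest()[:8]


def signature(samples, segment_len=4):
    """Split samples into segments and aggregate the per-segment fingerprints."""
    return [
        fingerprint(samples[i:i + segment_len])
        for i in range(0, len(samples), segment_len)
    ]


def match_fraction(sig_a, sig_b):
    """Share of aligned segments whose fingerprints agree."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / max(len(sig_a), len(sig_b), 1)


def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x


# Hypothetical labeled training pairs: match fraction -> 1.0 if same content.
SLOPE, INTERCEPT = fit_line([0.0, 0.25, 0.75, 1.0], [0.0, 0.0, 1.0, 1.0])


def corresponds(samples, reference_samples, threshold=0.5):
    """Decide whether two sample lists likely carry the same content."""
    feature = match_fraction(signature(samples), signature(reference_samples))
    return SLOPE * feature + INTERCEPT > threshold


reference = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120]
candidate = [11, 22, 33, 44, 51, 62, 73, 84, 95, 106, 117, 128]  # slight noise
unrelated = [500, 400, 300, 200, 900, 800, 700, 600, 10, 20, 30, 40]
print(corresponds(candidate, reference), corresponds(unrelated, reference))
```

A production fingerprint would be derived from spectral features and would tolerate time shifts; the quantize-and-hash step here only illustrates the segment, fingerprint, aggregate, and score structure.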
-