AUTOMATED WORKFLOWS FROM MEDIA ASSET DIFFERENTIALS

    Publication No.: US20220021911A1

    Publication Date: 2022-01-20

    Application No.: US17245252

    Filing Date: 2021-04-30

    Applicant: Netflix, Inc.

    Abstract: The disclosed computer-implemented method may include (1) accessing a first media data object and a different, second media data object that, when played back, each render temporally sequenced content, (2) comparing first temporally sequenced content represented by the first media data object with second temporally sequenced content represented by the second media data object to identify a set of common temporal subsequences between the first media data object and the second media data object, (3) identifying a set of edits relative to the set of common temporal subsequences that describe a difference between the temporally sequenced content of the first media data object and the temporally sequenced content of the second media data object, and (4) executing a workflow relating to the first media data object and/or the second media data object based on the set of edits. Various other methods, systems, and computer-readable media are also disclosed.
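Steps (2) through (4) of the abstract can be illustrated with a minimal sketch. It assumes each media data object has already been reduced to a sequence of per-frame fingerprints (plain strings here), and it uses Python's standard `difflib.SequenceMatcher` as a stand-in for whatever comparison the patent actually claims. The function names `diff_media` and `choose_workflow` and the workflow labels are hypothetical and exist only for illustration.

```python
from difflib import SequenceMatcher


def diff_media(first, second):
    """Split two fingerprint sequences into common runs and edits.

    Each entry is an index range: (i1, i2) into `first`, (j1, j2) into `second`.
    """
    # autojunk=False disables difflib's popularity heuristic, which could
    # otherwise treat frequently repeated fingerprints as junk on long inputs.
    matcher = SequenceMatcher(a=first, b=second, autojunk=False)
    common, edits = [], []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            common.append((i1, i2, j1, j2))
        else:
            # tag is one of "insert", "delete", or "replace"
            edits.append((tag, i1, i2, j1, j2))
    return common, edits


def choose_workflow(edits):
    """Hypothetical routing rule: map a set of edits to a workflow label."""
    if not edits:
        return "no_action"
    if all(tag == "insert" for tag, *_ in edits):
        return "review_inserted_segments"
    return "full_quality_check"


# Assumed toy input: the second cut inserts one new frame ("x").
first_cut = ["a", "b", "c", "d"]
second_cut = ["a", "b", "x", "c", "d"]
common_runs, edit_list = diff_media(first_cut, second_cut)
workflow = choose_workflow(edit_list)
```

Under these assumptions, only the inserted frame range would be routed for review rather than the whole asset, which is the kind of saving a differential workflow is meant to provide.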

METHODS AND SYSTEMS FOR LEARNING LANGUAGE-INVARIANT AUDIOVISUAL REPRESENTATIONS

    Publication No.: US20240161500A1

    Publication Date: 2024-05-16

    Application No.: US18505081

    Filing Date: 2023-11-08

    Applicant: Netflix, Inc.

    CPC classification number: G06V20/41 G06V10/82

    Abstract: The disclosed computer-implemented methods and systems include training a machine-learning model to accurately generate representations of similar scenes from long-form videos that have semantically different speech audio. For example, the methods and systems described herein generate machine-learning model training data including video clips and corresponding audio spectrograms. To augment this data, the methods and systems described herein further include dubbed audio spectrograms with the training data such that each video clip corresponds with a primary language audio spectrogram and a secondary language audio spectrogram. By applying a machine-learning model to this training data, the systems and methods described herein teach the machine-learning model to de-emphasize speech audio when generating audiovisual representations corresponding to scenes from long-form video. Various other methods, systems, and computer-readable media are also disclosed.
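The pairing described in this abstract, one clip with a primary-language spectrogram and a dubbed spectrogram, can be sketched as a small data structure. This is an illustration only: spectrograms are represented as nested lists of floats, the names `TrainingExample`, `build_examples`, and `sample_audio` are hypothetical, and choosing one of the two tracks at random per training step is an assumed strategy that the abstract does not specify.

```python
import random
from dataclasses import dataclass
from typing import Dict, List, Sequence

# Placeholder type: a spectrogram as rows of time steps by frequency bins.
Spectrogram = List[List[float]]


@dataclass
class TrainingExample:
    clip_id: str
    primary_audio: Spectrogram    # original-language audio
    secondary_audio: Spectrogram  # dubbed audio for the same clip


def build_examples(
    clip_ids: Sequence[str],
    primary: Dict[str, Spectrogram],
    secondary: Dict[str, Spectrogram],
) -> List[TrainingExample]:
    """Keep only clips that have both a primary and a dubbed spectrogram."""
    return [
        TrainingExample(cid, primary[cid], secondary[cid])
        for cid in clip_ids
        if cid in primary and cid in secondary
    ]


def sample_audio(example: TrainingExample, rng: random.Random) -> Spectrogram:
    """Assumed strategy: pick either language track with equal probability,
    so the visual content is seen with semantically different speech."""
    return rng.choice([example.primary_audio, example.secondary_audio])


# Assumed toy data: clip "s2" has no dub and is therefore dropped.
examples = build_examples(
    ["s1", "s2"],
    primary={"s1": [[0.1]], "s2": [[0.2]]},
    secondary={"s1": [[0.3]]},
)
picked = sample_audio(examples[0], random.Random(0))
```

Because the same clip can appear with either track, a model trained on such examples has less incentive to rely on the spoken words, which matches the stated goal of de-emphasizing speech audio.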
