USING AUDIO SEPARATION AND CLASSIFICATION TO ENHANCE AUDIO IN VIDEOS

    Publication No.: US20250117185A1

    Publication Date: 2025-04-10

    Application No.: US18904981

    Application Date: 2024-10-02

    Applicant: Google LLC

    Abstract: A media application obtains a video that includes an audio portion. The media application separates the audio portion into a plurality of channels, where each channel corresponds to a particular audio source. An on-screen classifier model obtains an indication of whether the particular audio source for each channel is depicted in the video. An audio-type classifier model determines an auditory object classification for each channel. The media application determines a respective gain for each channel based on the indication of whether the particular audio source for the channel is depicted in the video and the auditory object classification for the channel. The media application modifies each channel by applying the respective gain. The media application mixes the modified channels with the audio portion to generate a combined audio.
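
The gain and mixing steps described in the abstract can be illustrated with a minimal sketch. It is not the applicant's implementation: the separation and classifier models are not shown, and their outputs are simply passed in as data. The names `Channel`, `GAIN_TABLE`, `channel_gain`, `enhance`, and `original_weight`, the labels "speech", "music", and "noise", and every numeric gain value are assumptions chosen for illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Channel:
    """One separated audio source plus the two classifier outputs (all hypothetical)."""
    samples: List[float]  # separated audio for this source
    on_screen: bool       # on-screen classifier: is the source depicted in the video?
    audio_type: str       # audio-type classifier: auditory object classification


# Hypothetical policy mapping (audio_type, on_screen) to a linear gain.
# The abstract only says the gain depends on both signals; these values are made up.
GAIN_TABLE = {
    ("speech", True): 1.5,
    ("speech", False): 1.0,
    ("music", True): 1.0,
    ("music", False): 0.5,
    ("noise", True): 0.5,
    ("noise", False): 0.0,
}
DEFAULT_GAIN = 1.0  # assumed fallback for classifications not in the table


def channel_gain(channel: Channel) -> float:
    """Look up the gain for a channel from its two classifier outputs."""
    return GAIN_TABLE.get((channel.audio_type, channel.on_screen), DEFAULT_GAIN)


def enhance(original: List[float], channels: List[Channel],
            original_weight: float = 0.25) -> List[float]:
    """Apply each channel's gain, then mix the modified channels with the original audio.

    Assumes every channel has the same number of samples as `original`.
    """
    mixed = [original_weight * sample for sample in original]
    for channel in channels:
        gain = channel_gain(channel)
        for i, sample in enumerate(channel.samples):
            mixed[i] += gain * sample
    return mixed
```

As a usage example under the same assumptions, calling `enhance([1.0, 1.0], [Channel([0.5, 0.5], True, "speech"), Channel([0.5, 0.5], False, "noise")])` boosts the on-screen speech channel, silences the off-screen noise channel, and blends the result with a reduced-level copy of the original audio. A lookup table is used here only because it makes the dependence on both classifier outputs explicit; a learned or continuous gain function would fit the abstract equally well.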
