Publication No.: US20250117185A1
Publication Date: 2025-04-10
Application No.: US18904981
Filing Date: 2024-10-02
Applicant: Google LLC
Inventor: Moonseok Kim, Elliot PATROS, Sneh SINGARAJU, Michelle ANSAI, Efthymios TZINIS
Abstract: A media application obtains a video that includes an audio portion. The media application separates the audio portion into a plurality of channels, where each channel corresponds to a particular audio source. An on-screen classifier model obtains an indication of whether the particular audio source for each channel is depicted in the video. An audio-type classifier model determines an auditory object classification for each channel. The media application determines a respective gain for each channel based on the indication of whether the particular audio source for the channel is depicted in the video and the auditory object classification for the channel. The media application modifies each channel by applying the respective gain. The media application mixes the modified channels with the audio portion to generate a combined audio.
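
The gain and mixing steps described in the abstract can be illustrated with a minimal Python sketch. It assumes the source separation and both classifiers have already run, so it takes the separated channels, the on-screen indications, and the auditory object labels as plain inputs. The gain table, the label names ("speech", "music", "noise"), the fallback gain, the `mix_weight` blend, and the function names `choose_gain` and `remix` are all hypothetical and chosen only for illustration; the abstract does not specify how gains are derived or how the mix is weighted.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical gain table keyed by (is_on_screen, label).
# The values are illustrative assumptions, not taken from the patent.
GAIN_TABLE: Dict[Tuple[bool, str], float] = {
    (True, "speech"): 1.0,
    (False, "speech"): 0.5,
    (True, "music"): 0.8,
    (False, "music"): 0.4,
    (True, "noise"): 0.3,
    (False, "noise"): 0.0,
}
DEFAULT_GAIN = 0.5  # assumed fallback for labels missing from the table


def choose_gain(on_screen: bool, label: str) -> float:
    """Pick a gain from the on-screen indication and the classification."""
    return GAIN_TABLE.get((on_screen, label), DEFAULT_GAIN)


def remix(
    original: Sequence[float],
    channels: Sequence[Sequence[float]],
    on_screen: Sequence[bool],
    labels: Sequence[str],
    mix_weight: float = 0.5,
) -> List[float]:
    """Apply a per-channel gain, then blend the result with the original audio.

    mix_weight is an assumed blend factor: 0.0 keeps only the original
    audio, 1.0 keeps only the sum of the gain-adjusted channels.
    """
    if not (len(channels) == len(on_screen) == len(labels)):
        raise ValueError("channels, on_screen and labels must have equal length")
    n = len(original)
    if any(len(ch) != n for ch in channels):
        raise ValueError("every channel must match the original audio length")

    # Sum the gain-adjusted channels sample by sample.
    modified_sum = [0.0] * n
    for channel, flag, label in zip(channels, on_screen, labels):
        gain = choose_gain(flag, label)
        for i, sample in enumerate(channel):
            modified_sum[i] += gain * sample

    # Mix the modified channels with the original audio portion.
    return [
        (1.0 - mix_weight) * o + mix_weight * m
        for o, m in zip(original, modified_sum)
    ]
```

For example, calling `remix` with an on-screen "speech" channel and an off-screen "noise" channel keeps the speech at full level and removes the separated noise from the modified sum, while the original audio still contributes through the blend.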