SYSTEMS AND METHODS FOR LYRICS ALIGNMENT
    1.
    Invention publication

    Publication No.: US20240233776A9

    Publication date: 2024-07-11

    Application No.: US18341550

    Filing date: 2023-06-26

    Applicant: Spotify AB

    IPC classification: G11B27/10

    CPC classification: G11B27/10

    Abstract: A method includes obtaining lyrics text and audio for a media item and generating, using a first encoder, a first plurality of embeddings representing symbols that appear in the lyrics text for the media item. The method includes generating, using a second encoder, a second plurality of embeddings representing an acoustic representation of the audio for the media item. The method includes determining respective similarities between embeddings of the first plurality of embeddings and embeddings of the second plurality of embeddings and aligning the lyrics text and the audio for the media item based on the respective similarities. The method includes, while streaming the audio for the media item, providing, for display, the aligned lyrics text with the streamed audio.
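The alignment step described in the abstract above can be illustrated with a minimal sketch. It assumes cosine similarity as the similarity measure, one embedding per lyric symbol and one per audio frame, and a monotonic dynamic-programming path as the alignment rule. The abstract specifies none of these, and the names `cosine`, `similarity_matrix`, and `align`, as well as the toy vectors, are hypothetical.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def similarity_matrix(text_emb, audio_emb):
    """sim[i][j] = similarity of lyric symbol i to audio frame j."""
    return [[cosine(t, a) for a in audio_emb] for t in text_emb]


def align(sim):
    """Assign every frame to one symbol, in order, maximizing total similarity.

    Returns the index of the first frame assigned to each symbol.
    This monotonic path search is an illustrative assumption, not the
    alignment rule claimed in the publication.
    """
    n, t = len(sim), len(sim[0])
    if t < n:
        raise ValueError("need at least one audio frame per symbol")
    neg = float("-inf")
    score = [[neg] * t for _ in range(n)]
    advanced = [[False] * t for _ in range(n)]  # True: frame j starts symbol i
    score[0][0] = sim[0][0]
    for j in range(1, t):
        for i in range(min(j + 1, n)):
            stay = score[i][j - 1]
            move = score[i - 1][j - 1] if i > 0 else neg
            if move > stay:
                score[i][j] = move + sim[i][j]
                advanced[i][j] = True
            else:
                score[i][j] = stay + sim[i][j]
    # Backtrack from the last symbol at the last frame.
    starts = [0] * n
    i = n - 1
    for j in range(t - 1, 0, -1):
        if advanced[i][j]:
            starts[i] = j
            i -= 1
    return starts


if __name__ == "__main__":
    # Toy 2-D embeddings: three symbols, six audio frames.
    text = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    audio = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0],
             [0.1, 1.0], [1.0, 1.0], [1.0, 0.9]]
    print(align(similarity_matrix(text, audio)))
```

Multiplying each start index by the frame hop duration would give a display time for the corresponding symbol; the hop duration depends on the acoustic front end, which the abstract does not describe.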

    ENHANCED AUDIO FILE GENERATOR
    2.
    Invention publication

    Publication No.: US20240105203A1

    Publication date: 2024-03-28

    Application No.: US17934906

    Filing date: 2022-09-23

    Applicant: Spotify AB

    Abstract: This disclosure is directed to an enhanced audio file generator. One aspect is a method of enhancing input speech in an input audio file, the method comprising receiving the input audio file representing the input speech, wherein the input audio file is recorded at an audio recording device, and generating an enhanced audio file by applying an audio transformation model to the input audio file, wherein applying the audio transformation model to generate the enhanced audio file comprises extracting parameters defining audio features from the input audio file, the parameters including a noise parameter defining noise in the input audio file and one or more other preset parameters respectively defining other audio features, synthesizing clean speech based on the extracted parameters including the noise parameter, wherein synthesizing the clean speech comprises transforming the noise parameter to defined value(s); and generating the enhanced audio file with the synthesized clean speech.

    SYSTEMS AND METHODS FOR MUSICAL PERFORMANCE SCORING
    3.

    Publication No.: US20240153478A1

    Publication date: 2024-05-09

    Application No.: US18054119

    Filing date: 2022-11-09

    Applicant: Spotify AB

    IPC classification: G10H1/36

    CPC classification: G10H1/361; G10H2210/091

    Abstract: An electronic device pre-processes a target audio track, including determining, for each time interval of a plurality of time intervals of the target audio track, a multi-pitch salience. The electronic device presents the target audio track at a device associated with the user. While presenting the target audio track at the device associated with the user, the electronic device receives an audio data stream representative of a user's musical performance and scores the user's musical performance with respect to the target audio track by comparing, respectively, for each time interval of the plurality of time intervals of the target audio track, (i) a pitch of the user's musical performance represented by the audio data stream to (ii) the multi-pitch salience.
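The per-interval comparison described in the abstract above can be sketched as follows. The representation is an assumption made for illustration: each interval's multi-pitch salience is a dictionary mapping a MIDI note number to a salience value in [0, 1], the user's pitch is one MIDI note number per interval (or `None` when no pitch is detected), and the overall score is the mean salience found at the user's pitch. The abstract does not state how the comparison is computed, and `score_performance` and its `tolerance` parameter are hypothetical.

```python
def score_performance(user_pitches, salience, tolerance=0):
    """Average salience of the target track at the pitches the user produced.

    user_pitches: one MIDI note number (or None if unvoiced) per interval.
    salience:     one {midi_note: salience} dict per interval, precomputed
                  from the target track.
    tolerance:    semitones of slack allowed around the user's pitch.
    """
    total = 0.0
    counted = 0
    for pitch, interval_salience in zip(user_pitches, salience):
        if not interval_salience:
            continue  # target has no pitched content here; skip the interval
        counted += 1
        if pitch is None:
            continue  # user produced no pitch; interval contributes zero
        total += max(
            interval_salience.get(pitch + offset, 0.0)
            for offset in range(-tolerance, tolerance + 1)
        )
    return total / counted if counted else 0.0


if __name__ == "__main__":
    # Toy data: four intervals, the third has no pitched content in the target.
    target = [{60: 1.0, 64: 0.5}, {62: 0.8}, {}, {65: 1.0}]
    print(score_performance([60, 62, 70, None], target))
```

Because a multi-pitch salience can carry several notes per interval, a user singing a harmony line (note 64 in the first interval here) would still earn partial credit under this scheme.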

    SYSTEMS AND METHODS FOR LYRICS ALIGNMENT
    4.
    Invention publication

    Publication No.: US20240135974A1

    Publication date: 2024-04-25

    Application No.: US18341550

    Filing date: 2023-06-26

    Applicant: Spotify AB

    IPC classification: G11B27/10

    CPC classification: G11B27/10

    Abstract: A method includes obtaining lyrics text and audio for a media item and generating, using a first encoder, a first plurality of embeddings representing symbols that appear in the lyrics text for the media item. The method includes generating, using a second encoder, a second plurality of embeddings representing an acoustic representation of the audio for the media item. The method includes determining respective similarities between embeddings of the first plurality of embeddings and embeddings of the second plurality of embeddings and aligning the lyrics text and the audio for the media item based on the respective similarities. The method includes, while streaming the audio for the media item, providing, for display, the aligned lyrics text with the streamed audio.