AUDIO GENERATION METHOD, SERVER, AND STORAGE MEDIUM

    Publication No.: US20180107735A1

    Publication date: 2018-04-19

    Application No.: US15845906

    Filing date: 2017-12-18

    Inventor: Hongcheng FU

    Abstract: An audio generation method, server, and storage medium are provided. The method includes obtaining a comparison audio, and performing a theme extraction on the comparison audio to obtain a comparison note sequence, the comparison note sequence comprising comparison note positions, comparison note pitches, and a comparison note duration; obtaining an original audio matching the comparison audio via audio retrieval, and obtaining an original note sequence corresponding to the original audio by performing a theme extraction on the original audio, the original note sequence comprising original note positions, original note pitches, and an original note duration; calculating theme distances between fragments of the comparison audio and fragments of the original audio according to the comparison note sequence and the original note sequence; and generating an audio by capturing the fragment of the original audio that has the smallest theme distance.
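
The fragment-matching step in this abstract can be illustrated with a minimal Python sketch. The abstract does not define the theme distance, so the metric below (mean absolute difference of key-normalized pitch plus note duration) is an assumption, and the names `Note`, `theme_distance`, and `best_fragment` are hypothetical. Each note is taken to be a `(position, pitch, duration)` tuple.

```python
from typing import List, Tuple

# (position in seconds, pitch in semitones, duration in seconds) -- assumed layout
Note = Tuple[float, float, float]


def theme_distance(a: List[Note], b: List[Note]) -> float:
    """Assumed metric: mean absolute difference of mean-centered pitch plus duration."""
    if not a or len(a) != len(b):
        raise ValueError("fragments must be non-empty and of equal length")
    mean_a = sum(n[1] for n in a) / len(a)
    mean_b = sum(n[1] for n in b) / len(b)
    total = 0.0
    for (_, pitch_a, dur_a), (_, pitch_b, dur_b) in zip(a, b):
        # Centering the pitch makes the comparison insensitive to transposition.
        total += abs((pitch_a - mean_a) - (pitch_b - mean_b)) + abs(dur_a - dur_b)
    return total / len(a)


def best_fragment(comparison: List[Note], original: List[Note]) -> Tuple[float, float, float]:
    """Slide the comparison sequence over the original and return
    (start_time, end_time, distance) of the closest fragment."""
    n = len(comparison)
    if n == 0 or n > len(original):
        raise ValueError("comparison must be non-empty and no longer than original")
    best_start, best_dist = 0, float("inf")
    for start in range(len(original) - n + 1):
        dist = theme_distance(comparison, original[start:start + n])
        if dist < best_dist:
            best_start, best_dist = start, dist
    first, last = original[best_start], original[best_start + n - 1]
    return first[0], last[0] + last[2], best_dist


original = [(0.0, 60, 0.5), (0.5, 62, 0.5), (1.0, 64, 0.5), (1.5, 65, 0.5), (2.0, 67, 0.5)]
comparison = [(0.0, 66, 0.5), (0.5, 67, 0.5), (1.0, 69, 0.5)]  # sung in a different key
start_time, end_time, distance = best_fragment(comparison, original)
```

The returned start and end times would then bound the slice of the original audio to capture. A production system would likely also handle tempo differences and unequal note counts, which this sketch ignores.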

    8. Melody recognition systems
    Type: Granted patent
    Status: In force

    Publication No.: US09569532B1

    Publication date: 2017-02-14

    Application No.: US14300600

    Filing date: 2014-06-10

    Applicant: Google Inc.

    IPC classification: H04N5/92 G06F17/30

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting, from among a collection of videos, a set of candidate videos that (i) are identified as being associated with a particular song, and (ii) are classified as a cappella video recordings; extracting, from each of the candidate videos of the set, a monophonic melody line from an audio channel of the candidate video; selecting, from among the set of candidate videos, a subset of the candidate videos based on a similarity of the monophonic melody line of the candidate videos of the subset with each other; and providing, to a recognizer that recognizes songs from sounds produced by a human voice, (i) an identifier of the particular song, and (ii) one or more of the monophonic melody lines of the candidate videos of the subset.
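
The subset-selection step (keeping candidates whose melody lines agree with each other) might look like the following Python sketch. The abstract does not say how similarity is measured. Comparing pitch-interval sequences with `difflib.SequenceMatcher` and filtering on the median score are assumptions made here for illustration, and `intervals`, `similarity`, `select_consistent`, and the `0.6` threshold are hypothetical.

```python
from difflib import SequenceMatcher
from statistics import median
from typing import Dict, List


def intervals(pitches: List[int]) -> List[int]:
    """Successive pitch differences, so that transposed melodies compare equal."""
    return [b - a for a, b in zip(pitches, pitches[1:])]


def similarity(a: List[int], b: List[int]) -> float:
    """Similarity in [0, 1] between two melody lines (assumed measure)."""
    return SequenceMatcher(None, intervals(a), intervals(b)).ratio()


def select_consistent(melodies: Dict[str, List[int]], threshold: float = 0.6) -> List[str]:
    """Keep the candidates whose median similarity to the other candidates
    reaches the threshold; outliers are dropped."""
    keep = []
    for video_id, melody in melodies.items():
        scores = [similarity(melody, other)
                  for other_id, other in melodies.items() if other_id != video_id]
        if scores and median(scores) >= threshold:
            keep.append(video_id)
    return keep


melodies = {
    "a": [60, 62, 64, 65, 67, 65, 64, 62],
    "b": [62, 64, 66, 67, 69, 67, 66, 64],  # same tune, transposed up a tone
    "c": [60, 62, 64, 65, 67, 65, 64, 60],  # same tune, different last note
    "d": [60, 72, 55, 70, 50, 71, 49, 73],  # unrelated recording
}
subset = select_consistent(melodies)
```

Using the median rather than the mean keeps a single unrelated recording from dragging down the scores of the recordings that do agree.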

    9. Audio matching based on harmonogram
    Type: Granted patent
    Status: In force

    Publication No.: US09501568B2

    Publication date: 2016-11-22

    Application No.: US14980622

    Filing date: 2015-12-28

    Applicant: Gracenote, Inc.

    Inventor: Zafar Rafii

    Abstract: In an example context of identifying live audio, an audio processor machine accesses audio data that represents a query sound and creates a spectrogram from the audio data. Each segment of the spectrogram represents a different time slice in the query sound. For each time slice, the audio processor machine determines one or more dominant frequencies and an aggregate energy value that represents a combination of all the energy for that dominant frequency and its harmonics. The machine creates a harmonogram by representing these aggregate energy values at these dominant frequencies in each time slice. The harmonogram thus may represent the strongest harmonic components within the query sound. The machine can identify the query sound by comparing its harmonogram to other harmonograms of other sounds and may respond to a user's submission of the query sound by providing an identifier of the query sound to the user.
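
The harmonogram construction described in this abstract can be sketched in Python as follows. The abstract does not state how dominant frequencies are chosen or how many harmonics are combined. This sketch assumes the dominant bins are simply the highest-energy bins in each time slice and that harmonics sit at integer multiples of the bin index, and the names `harmonogram`, `top_n`, and `max_harmonics` are hypothetical. The input is a precomputed spectrogram given as a list of time slices, each a list of per-bin energies.

```python
from typing import List


def harmonogram(spectrogram: List[List[float]],
                top_n: int = 1,
                max_harmonics: int = 5) -> List[List[float]]:
    """For each time slice, place the aggregate energy of each dominant
    frequency and its harmonics at the dominant bin; all other bins are zero."""
    result = []
    for time_slice in spectrogram:
        n_bins = len(time_slice)
        row = [0.0] * n_bins
        # Rank bins by energy, skipping bin 0 (DC), which has no meaningful harmonics.
        ranked = sorted(range(1, n_bins), key=lambda k: time_slice[k], reverse=True)
        for k in ranked[:top_n]:
            # Assumption: harmonics fall on integer multiples of the dominant bin.
            harmonic_bins = [k * h for h in range(1, max_harmonics + 1) if k * h < n_bins]
            row[k] = sum(time_slice[b] for b in harmonic_bins)
        result.append(row)
    return result


spectrogram = [
    [0.0, 0.0, 5.0, 0.0, 3.0, 0.0, 1.0, 0.0],  # fundamental at bin 2, harmonics at 4 and 6
    [0.0, 0.0, 0.0, 4.0, 0.0, 0.0, 2.0, 0.0],  # fundamental at bin 3, harmonic at 6
]
h = harmonogram(spectrogram)
```

Two harmonograms built this way could then be compared slice by slice, for example with a correlation or distance measure, to identify a query sound. The abstract leaves that comparison unspecified.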

    10. AUDIO MATCHING BASED ON HARMONOGRAM
    Type: Patent application
    Status: In force

    Publication No.: US20160196343A1

    Publication date: 2016-07-07

    Application No.: US14980622

    Filing date: 2015-12-28

    Applicant: Gracenote, Inc.

    Inventor: Zafar Rafii

    Abstract: In an example context of identifying live audio, an audio processor machine accesses audio data that represents a query sound and creates a spectrogram from the audio data. Each segment of the spectrogram represents a different time slice in the query sound. For each time slice, the audio processor machine determines one or more dominant frequencies and an aggregate energy value that represents a combination of all the energy for that dominant frequency and its harmonics. The machine creates a harmonogram by representing these aggregate energy values at these dominant frequencies in each time slice. The harmonogram thus may represent the strongest harmonic components within the query sound. The machine can identify the query sound by comparing its harmonogram to other harmonograms of other sounds and may respond to a user's submission of the query sound by providing an identifier of the query sound to the user.
