-
Publication No.: US11740862B1
Publication date: 2023-08-29
Application No.: US17992468
Filing date: 2022-11-22
Applicant: ALGORIDDIM GMBH
IPC classification: G06F3/16 , G10H1/00 , G06F16/632
CPC classification: G06F3/165 , G06F16/632 , G10H1/0008 , G10H1/0041 , G10H2210/056 , G10H2210/125 , G10H2240/141 , G10H2250/235 , G10H2250/311
Abstract: A method for processing audio data, comprising providing song identification data identifying a particular song from among a plurality of songs or identifying a particular position within a particular song, and loading intermediate data associated with the song identification data from a storage medium or from a remote device. The method also comprises obtaining input audio data representing audio signals of the song as identified by the song identification data. The audio signals comprise a mixture of different musical timbres, including at least a first musical timbre and a second musical timbre different from said first musical timbre. The method comprises combining the input audio data and the intermediate data with one another to obtain output audio data. The output audio data represent audio signals of the first musical timbre separated from the second musical timbre.
-
Publication No.: US20230153351A1
Publication date: 2023-05-18
Application No.: US17681416
Filing date: 2022-02-25
Inventors: Jee Hyun PARK , Jung Hyun KIM , Hye Mi KIM , Yong Seok SEO , Dong Hyuck IM , Won Young YOO
IPC classification: G06F16/683 , G10H1/00 , G06F16/61
CPC classification: G06F16/683 , G10H1/0008 , G10H1/0041 , G06F16/61 , G10H2240/075 , G10H2210/031 , G10H2240/095 , G10H2240/141 , G10H2240/135 , G10H2250/311
Abstract: The present invention relates to an apparatus and method for identifying music in content. The present invention includes extracting and storing a fingerprint of an original audio in an audio fingerprint DB; extracting a first fingerprint of a first audio in the content; and searching for a fingerprint corresponding to the fingerprint of the first audio in the audio fingerprint DB, wherein the first audio is audio data in a music section detected from the content.
-
Publication No.: US20180349495A1
Publication date: 2018-12-06
Application No.: US16102485
Filing date: 2018-08-13
Inventor: Weifeng ZHAO
CPC classification: G06F16/686 , G06F16/685 , G10H1/0083 , G10H1/36 , G10H1/365 , G10H2210/061 , G10H2220/011 , G10H2230/015 , G10H2240/141 , G10H2240/325 , G10L15/04 , G10L15/24 , G10L15/26 , G10L25/90 , G10L2015/025
Abstract: The present disclosure discloses an audio data processing method performed by a computing device. The computing device obtains song information of a song, the song information comprising an accompaniment file, a lyric file, and a music score file that correspond to the song, and then determines a predefined portion of the song and music score information corresponding to the predefined portion according to the song information. After receiving audio data that is input by a user, the computing device determines time information of each word in the audio data and then processes the audio data according to the time information of each word in the audio data and the music score information of the predefined portion of the song. Finally, the computing device obtains mixed audio data by mixing the processed audio data and the accompaniment file.
-
Publication No.: US20180276297A1
Publication date: 2018-09-27
Application No.: US15990089
Filing date: 2018-05-25
Inventors: Jin Xing Ming , Yu Jia Jun , Li Ke , Wu Yong Jian , Huang Fei Yue
IPC classification: G06F17/30
CPC classification: G06F16/683 , G06F16/00 , G06F16/638 , G10H1/00 , G10H1/0008 , G10H2210/066 , G10H2220/096 , G10H2240/141
Abstract: An audio identification method and apparatus are disclosed in the technical field of audio processing. The audio identification solution includes obtaining an original pitch sequence of a to-be-identified audio, where the original pitch sequence is used to indicate a frequency of the to-be-identified audio at each time point. The audio identification solution further includes dividing the original pitch sequence into a plurality of pitch sub-sequences, respectively identifying the original pitch sequence and the plurality of pitch sub-sequences, and combining the identification results. In doing so, the audio identification solution obtains a final identification result by dividing a long pitch sequence into a plurality of short pitch sequences, respectively identifying the long pitch sequence and the plurality of short pitch sequences, and combining the identification results.
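Illustrative sketch (not taken from the patent): the divide, identify, and combine flow described in the abstract above could look roughly like the following. The function names, the fixed-size split, the caller-supplied `matcher`, and the majority vote used to combine results are all assumptions made for illustration; the abstract only says that the identification results are combined.

```python
from collections import Counter


def split_pitch_sequence(pitches, n_parts):
    """Split a pitch sequence into up to n_parts contiguous sub-sequences."""
    if not pitches:
        return []
    size = -(-len(pitches) // n_parts)  # ceiling division
    return [pitches[i:i + size] for i in range(0, len(pitches), size)]


def combine_results(results):
    """Assumed combination rule: majority vote over the individual results."""
    return Counter(results).most_common(1)[0][0]


def identify(pitches, n_parts, matcher):
    """Identify the full sequence and each sub-sequence, then combine.

    `matcher` is a hypothetical callable mapping a pitch sequence to a label.
    """
    candidates = [matcher(pitches)]
    candidates += [matcher(sub) for sub in split_pitch_sequence(pitches, n_parts)]
    return combine_results(candidates)


# Toy matcher for demonstration only: labels a sequence by its mean pitch.
def toy_matcher(seq):
    return "A" if sum(seq) / len(seq) < 6 else "B"


label = identify([1, 2, 3, 4, 5, 6, 7], 3, toy_matcher)
```

A real matcher would compare each sequence against a reference database; the toy matcher here exists only so the sketch runs end to end.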
-
Publication No.: US20180107735A1
Publication date: 2018-04-19
Application No.: US15845906
Filing date: 2017-12-18
Inventor: Hongcheng FU
CPC classification: G06F17/30743 , G10H1/0008 , G10H1/0033 , G10H2210/061 , G10H2210/066 , G10H2240/141 , G10L25/54 , G10L25/78 , G10L25/90 , G10L2025/906
Abstract: An audio generation method, server, and storage medium are provided. The method includes obtaining a comparison audio, and performing a theme extraction on the comparison audio to obtain a comparison note sequence, the comparison note sequence comprising comparison note positions, comparison note pitches, and a comparison note duration; obtaining an original audio matching the comparison audio via audio retrieval, and obtaining an original note sequence corresponding to the original audio by performing a theme extraction on the original audio, the original note sequence comprising original note positions, original note pitches, and an original note duration; calculating theme distances between fragments of the comparison audio and fragments of the original audio according to the comparison note sequence and the original note sequence; and generating an audio by capturing a fragment of the original audio that satisfies the smallest theme distance.
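Illustrative sketch (not taken from the patent): the step of choosing the original-audio fragment with the smallest theme distance can be pictured as a sliding-window search. The distance used here, a sum of absolute pitch differences over MIDI-style note numbers, is an assumption; the abstract does not define the metric, and this sketch ignores note positions and durations. The names `theme_distance` and `best_fragment` are hypothetical.

```python
def theme_distance(fragment_a, fragment_b):
    """Assumed metric: sum of absolute pitch differences, note by note."""
    return sum(abs(x - y) for x, y in zip(fragment_a, fragment_b))


def best_fragment(original_pitches, comparison_pitches):
    """Return (start_index, distance) of the closest window in the original."""
    n = len(comparison_pitches)
    if n == 0 or n > len(original_pitches):
        raise ValueError("comparison must be non-empty and no longer than original")
    starts = range(len(original_pitches) - n + 1)
    start = min(
        starts,
        key=lambda s: theme_distance(original_pitches[s:s + n], comparison_pitches),
    )
    return start, theme_distance(original_pitches[start:start + n], comparison_pitches)


original = [60, 62, 64, 65, 67, 69, 71, 72]   # note pitches of the original audio
comparison = [65, 66, 69]                     # note pitches of the comparison audio
start, distance = best_fragment(original, comparison)
```

With these toy inputs the closest window starts at index 3 (pitches 65, 67, 69) with a distance of 1.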
-
Publication No.: US09910919B2
Publication date: 2018-03-06
Application No.: US14924235
Filing date: 2015-10-27
Inventors: Jang-ho Jin , Young-jun Ryu , Myung-suk Song
IPC classification: G06F17/00 , G06F17/30 , H04N21/439 , H04N21/845
CPC classification: G06F17/30758 , G06F17/30743 , G10H2210/046 , G10H2210/061 , G10H2220/011 , G10H2240/141 , H04N21/4394 , H04N21/8456
Abstract: A content processing device is provided. The content processing device includes a receiver configured to receive content, an audio processor configured to extract an audio signal by decoding audio data included in the content, a processor configured to determine a characteristic section in the audio signal based on a ratio of music information of the audio signal and to detect a segment corresponding to the characteristic section in the audio signal, and a communicator configured to transmit the segment to a music recognition server. A size of the segment is determined variably within a threshold range.
-
Publication No.: US09666199B2
Publication date: 2017-05-30
Application No.: US13910949
Filing date: 2013-06-05
Applicant: Smule, Inc.
Inventors: Parag Chordia , Mark Godfrey , Alexander Rae , Prerna Gupta , Perry R. Cook
IPC classification: G10L21/007 , G10L21/04 , G10L19/02 , G10L19/00 , G10L21/055 , G10H1/36
CPC classification: G10L19/02 , G10H1/366 , G10H2210/051 , G10H2240/141 , G10H2250/235 , G10L19/00 , G10L21/055
Abstract: Captured vocals may be automatically transformed using advanced digital signal processing techniques that provide captivating applications, and even purpose-built devices, in which mere novice user-musicians may generate, audibly render and share musical performances. In some cases, the automated transformations allow spoken vocals to be segmented, arranged, temporally aligned with a target rhythm, meter or accompanying backing tracks and pitch corrected in accord with a score or note sequence. Speech-to-song music applications are one such example. In some cases, spoken vocals may be transformed in accord with musical genres such as rap using automated segmentation and temporal alignment techniques, often without pitch correction. Such applications, which may employ different signal processing and different automated transformations, may nonetheless be understood as speech-to-rap variations on the theme.
-
Publication No.: US09569532B1
Publication date: 2017-02-14
Application No.: US14300600
Filing date: 2014-06-10
Applicant: Google Inc.
CPC classification: G06F17/3082 , G06F17/30758 , G06F17/30787 , G10H1/0008 , G10H1/368 , G10H2240/075 , G10H2240/135 , G10H2240/141
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting, from among a collection of videos, a set of candidate videos that (i) are identified as being associated with a particular song, and (ii) are classified as a cappella video recordings; extracting, from each of the candidate videos of the set, a monophonic melody line from an audio channel of the candidate video; selecting, from among the set of candidate videos, a subset of the candidate videos based on a similarity of the monophonic melody line of the candidate videos of the subset with each other; and providing, to a recognizer that recognizes songs from sounds produced by a human voice, (i) an identifier of the particular song, and (ii) one or more of the monophonic melody lines of the candidate videos of the subset.
-
Publication No.: US09501568B2
Publication date: 2016-11-22
Application No.: US14980622
Filing date: 2015-12-28
Applicant: Gracenote, Inc.
Inventor: Zafar Rafii
CPC classification: G06F17/30755 , G06F17/3033 , G06F17/30743 , G06F17/30778 , G10H2210/066 , G10H2240/141 , G10H2250/031 , G10L25/18 , G10L25/21 , G10L25/45 , G10L25/54 , G10L25/72
Abstract: In an example context of identifying live audio, an audio processor machine accesses audio data that represents a query sound and creates a spectrogram from the audio data. Each segment of the spectrogram represents a different time slice in the query sound. For each time slice, the audio processor machine determines one or more dominant frequencies and an aggregate energy value that represents a combination of all the energy for that dominant frequency and its harmonics. The machine creates a harmonogram by representing these aggregate energy values at these dominant frequencies in each time slice. The harmonogram thus may represent the strongest harmonic components within the query sound. The machine can identify the query sound by comparing its harmonogram to other harmonograms of other sounds and may respond to a user's submission of the query sound by providing an identifier of the query sound to the user.
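Illustrative sketch (not taken from the patent): one way to picture the harmonogram construction described in the abstract above, starting from an already computed spectrogram given as a list of time slices with one energy value per frequency bin. Picking dominant frequencies by raw bin energy, treating integer multiples of a bin index as its harmonics, and the parameter names `n_dominant` and `n_harmonics` are all assumptions made for illustration.

```python
def harmonogram(spectrogram, n_dominant=1, n_harmonics=3):
    """Map each time slice to {dominant_bin: aggregate_energy}.

    The aggregate energy of a dominant bin k is the sum of the energies at
    bins k, 2k, ..., n_harmonics * k that fall inside the slice.
    """
    result = []
    for frame in spectrogram:
        # Rank bins by energy, skipping bin 0 (DC), which has no harmonics.
        ranked = sorted(range(1, len(frame)), key=lambda k: frame[k], reverse=True)
        row = {}
        for k in ranked[:n_dominant]:
            row[k] = sum(
                frame[h * k]
                for h in range(1, n_harmonics + 1)
                if h * k < len(frame)
            )
        result.append(row)
    return result


# Two toy time slices with eight frequency bins each.
spec = [
    [0, 1, 5, 0, 2, 0, 1, 0],
    [0, 4, 0, 1, 0, 0, 0, 0],
]
h = harmonogram(spec)
```

For the toy input, the first slice has its dominant bin at index 2 with aggregate energy 5 + 2 + 1 = 8, and the second at index 1 with aggregate energy 4 + 0 + 1 = 5.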
-
Publication No.: US20160196343A1
Publication date: 2016-07-07
Application No.: US14980622
Filing date: 2015-12-28
Applicant: Gracenote, Inc.
Inventor: Zafar Rafii
CPC classification: G06F17/30755 , G06F17/3033 , G06F17/30743 , G06F17/30778 , G10H2210/066 , G10H2240/141 , G10H2250/031 , G10L25/18 , G10L25/21 , G10L25/45 , G10L25/54 , G10L25/72
Abstract: In an example context of identifying live audio, an audio processor machine accesses audio data that represents a query sound and creates a spectrogram from the audio data. Each segment of the spectrogram represents a different time slice in the query sound. For each time slice, the audio processor machine determines one or more dominant frequencies and an aggregate energy value that represents a combination of all the energy for that dominant frequency and its harmonics. The machine creates a harmonogram by representing these aggregate energy values at these dominant frequencies in each time slice. The harmonogram thus may represent the strongest harmonic components within the query sound. The machine can identify the query sound by comparing its harmonogram to other harmonograms of other sounds and may respond to a user's submission of the query sound by providing an identifier of the query sound to the user.