-
Publication No.: US11756530B2
Publication date: 2023-09-12
Application No.: US17640579
Filing date: 2020-09-25
Applicant: GOOGLE LLC
Inventor: Marco Tagliasacchi , Mihajlo Velimirovic , Matthew Sharifi , Dominik Roblek , Christian Frank , Beat Gfeller
IPC: G10L15/06 , G10L21/013 , G10L25/30 , G10L25/90
CPC classification number: G10L15/063 , G10L21/013 , G10L25/30 , G10L25/90
Abstract: Example embodiments relate to techniques for training artificial neural networks or other machine-learning encoders to accurately predict the pitch of input audio samples in a semitone or otherwise logarithmically-scaled pitch space. An example method may include generating, from a sample of audio data, two training samples by applying two different pitch shifts to the sample of audio training data. This can be done by converting the sample of audio data into the frequency domain and then shifting the transformed data. These known shifts are then compared to the predicted pitches generated by applying the two training samples to the encoder. The encoder is then updated based on the comparison, such that the relative pitch output by the encoder is improved with respect to accuracy. One or more audio samples, labeled with absolute pitch values, can then be used to calibrate the relative pitch values generated by the trained encoder.
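The comparison step in this abstract can be illustrated with a minimal sketch. It is not the patented implementation. It assumes the audio frame is already in a log-frequency representation, so that a pitch shift corresponds to a translation along the bin axis, and it stands in a toy `centroid_encoder` for the trained network. All function names, the `scale` parameter, and the bin counts are hypothetical.

```python
import numpy as np

def shifted_crops(log_spectrum, k1, k2, width):
    """Cut two windows from one log-frequency frame, starting at bins k1 and k2.

    Assumption: on a log-frequency axis, moving the window start by k bins
    behaves like a pitch shift of k bins.
    """
    return log_spectrum[k1:k1 + width], log_spectrum[k2:k2 + width]

def relative_pitch_error(encoder, log_spectrum, k1, k2, width, scale=1.0):
    """Compare the predicted pitch difference with the known shift difference."""
    x1, x2 = shifted_crops(log_spectrum, k1, k2, width)
    predicted_difference = encoder(x1) - encoder(x2)
    # A peak at bin p lands at p - k1 in the first crop and p - k2 in the second,
    # so the expected difference is k2 - k1 bins.
    known_difference = scale * (k2 - k1)
    return abs(predicted_difference - known_difference)

def centroid_encoder(x):
    """Toy stand-in for a trained encoder: the spectral centroid in bins."""
    bins = np.arange(len(x))
    return float(np.sum(bins * x) / np.sum(x))

# Hypothetical frame with a single spectral peak at bin 40.
frame = np.zeros(96)
frame[40] = 1.0
error = relative_pitch_error(centroid_encoder, frame, k1=3, k2=10, width=64)
```

In a training loop this error would serve as the loss that drives the encoder update. The calibration against samples labeled with absolute pitch, which the abstract also mentions, is not shown here.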
-
2.
Publication No.: US10809968B2
Publication date: 2020-10-20
Application No.: US16148338
Filing date: 2018-10-01
Applicant: Google LLC
Inventor: Dominik Roblek , Blaise Hilary Aguera-Arcas , Thomas W. Hume , Marvin Karl Ritter , Brandon Charles Barbello , Kevin I. Kilgour , Mihajlo Velimirovic , Christopher Thornton , Gabriel Oak Taubman , James David Lyon , Jan Heinrich Althaus , Katsiaryna Naliuka , Julian James Odell , Matthew Sharifi , Beat Gfeller
IPC: G06F17/00 , G06F3/16 , G06F16/635 , G06F16/683 , G06N3/08 , G06N20/00
Abstract: In general, the subject matter described in this disclosure can be embodied in methods, systems, and program products. A computing device stores reference song characterization data and receives digital audio data. The computing device determines whether the digital audio data represents music and then performs a different process to recognize that the digital audio data represents a particular reference song. The computing device then outputs an indication of the particular reference song.
-
Publication No.: US20240428056A1
Publication date: 2024-12-26
Application No.: US18750973
Filing date: 2024-06-21
Applicant: Google LLC
Inventor: Paul Kishan Rubenstein , Matthew Sharifi , Alexandru Tudor , Chulayuth Asawaroengchai , Duc Dung Nguyen , Marco Tagliasacchi , Neil Zeghidour , Zalán Borsos , Christian Frank , Dalia Salem Hassan Fahmy Elbadawy , Hannah Raphaelle Muckenhirn , Dirk Ryan Padfield , Damien Vincent , Evgeny Kharitonov , Michelle Dana Tadmor , Mihajlo Velimirovic , Feifan Chen , Victoria Zayats
IPC: G06N3/0475 , G10L25/30
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing tasks. One of the methods includes obtaining a sequence of input tokens, where each token is selected from a vocabulary of tokens that includes text tokens and audio tokens, and wherein the sequence of input tokens includes tokens that describe a task to be performed and data for performing the task; generating a sequence of embeddings by embedding each token in the sequence of input tokens in an embedding space; and processing the sequence of embeddings using a language model neural network to generate a sequence of output tokens for the task, where each token is selected from the vocabulary.
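The shared text-and-audio vocabulary and the embedding step can be sketched as follows. This is an illustration under stated assumptions, not the method claimed in the application: the vocabulary sizes, the layout that places audio token IDs after text token IDs, the random embedding table, and the names `audio_token_id` and `embed` are all hypothetical. The language model that turns the embeddings into output tokens is omitted.

```python
import numpy as np

NUM_TEXT_TOKENS = 8    # hypothetical text vocabulary size
NUM_AUDIO_TOKENS = 4   # hypothetical audio vocabulary size
EMBED_DIM = 16         # hypothetical embedding width

def audio_token_id(audio_code):
    """Map an audio code to its ID in the shared vocabulary.

    Assumption: audio token IDs are placed directly after the text token IDs.
    """
    if not 0 <= audio_code < NUM_AUDIO_TOKENS:
        raise ValueError("audio code out of range")
    return NUM_TEXT_TOKENS + audio_code

# One embedding table covers both modalities; random values stand in for
# learned parameters.
rng = np.random.default_rng(0)
embedding_table = rng.normal(size=(NUM_TEXT_TOKENS + NUM_AUDIO_TOKENS, EMBED_DIM))

def embed(token_ids):
    """Look up one embedding row per token in the input sequence."""
    return embedding_table[np.asarray(token_ids, dtype=np.int64)]

# Two text tokens (for example, a task description) followed by two audio tokens.
input_tokens = [2, 5, audio_token_id(0), audio_token_id(3)]
embeddings = embed(input_tokens)
```

Because text and audio tokens share one ID space, a single model can consume and emit either modality in the same sequence.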
-
4.
Publication No.: US20190102458A1
Publication date: 2019-04-04
Application No.: US16148338
Filing date: 2018-10-01
Applicant: Google LLC
Inventor: Dominik Roblek , Blaise Aguera-Arcas , Tom Hume , Marvin Ritter , Brandon Barbello , Kevin Kilgour , Mihajlo Velimirovic , Christopher Walter George Thornton , Gabriel Taubman , James David Lyon , Jan Althaus , Katsiaryna Naliuka , Julian Odell , Matthew Sharifi , Beat Gfeller
Abstract: In general, the subject matter described in this disclosure can be embodied in methods, systems, and program products. A computing device stores reference song characterization data and receives digital audio data. The computing device determines whether the digital audio data represents music and then performs a different process to recognize that the digital audio data represents a particular reference song. The computing device then outputs an indication of the particular reference song.
-
Publication No.: US20190102144A1
Publication date: 2019-04-04
Application No.: US16148401
Filing date: 2018-10-01
Applicant: Google LLC
Inventor: Dominik Roblek , Blaise Aguera-Arcas , Tom Hume , Marvin Ritter , Brandon Barbello , Kevin Kilgour , Mihajlo Velimirovic , Christopher Walter George Thornton , Gabriel Taubman , James David Lyon , Jan Althaus , Katsiaryna Naliuka , Julian Odell , Matthew Sharifi , Beat Gfeller
Abstract: In general, the subject matter described in this disclosure can be embodied in methods, systems, and program products for indicating a reference song. A computing device stores reference song characterization data that identifies a plurality of audio characteristics for each reference song in a plurality of reference songs. The computing device receives digital audio data that represents audio recorded by a microphone, converts the digital audio data from time-domain format into frequency-domain format, and uses the digital audio data in the frequency-domain format in a music-characterization process. In response to determining that characterization values for the digital audio data are most relevant to characterization values for a particular reference song, the computing device outputs an indication of the particular reference song.
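The flow described in this abstract (time-domain audio, conversion to the frequency domain, then comparison against stored reference characterizations) can be sketched roughly as below. The sketch swaps in a normalized FFT magnitude spectrum and a dot-product similarity as stand-ins for the music-characterization process, which the abstract does not specify. The function names, sample rate, frame length, and the two synthetic "songs" are assumptions made for illustration.

```python
import numpy as np

def characterize(samples):
    """Convert time-domain samples to a unit-norm magnitude spectrum.

    Assumption: a plain FFT magnitude stands in for the characterization values.
    """
    spectrum = np.abs(np.fft.rfft(samples))
    norm = np.linalg.norm(spectrum)
    return spectrum / norm if norm > 0 else spectrum

def most_relevant_song(samples, references):
    """Return the reference whose characterization is most similar to the query."""
    query = characterize(samples)
    return max(references, key=lambda name: float(np.dot(query, references[name])))

def tone(freq_hz, rate=8000, n=1024):
    """Synthetic sine tone used here in place of recorded microphone audio."""
    t = np.arange(n) / rate
    return np.sin(2 * np.pi * freq_hz * t)

# Hypothetical reference database holding two "songs".
references = {
    "song_a": characterize(tone(440.0)),
    "song_b": characterize(tone(880.0)),
}
match = most_relevant_song(tone(440.0), references)
```

A real system would characterize many short frames per song and use a far more robust fingerprint; the nearest-match selection is the only part this sketch is meant to show.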