Abstract:
Systems and techniques for removing a sound recording from an audio recording (e.g., an audio recording embedded in a media file) are presented. The system can include an identification component, a first subtraction component and a second subtraction component. The identification component identifies a sound recording in a mixed audio recording. The first subtraction component determines a local linear transformation of the sound recording and subtracts the local linear transformation of the sound recording from the mixed audio recording to generate a new mixed audio recording. The second subtraction component compares one or more segments of the sound recording with one or more corresponding segments of the new mixed audio recording and reduces a power level of the new mixed audio recording based at least in part on correlation of the one or more corresponding segments with the one or more segments.
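The two-stage subtraction described above can be sketched as follows. This is a minimal numpy illustration, not the patented implementation: a per-segment scalar gain stands in for the "local linear transformation", and a normalized dot product stands in for the correlation measure; the function name and parameters (`seg_len`, `corr_threshold`) are hypothetical.

```python
import numpy as np

def remove_reference(mixed, ref, seg_len=1024, corr_threshold=0.5):
    """Two-stage removal of a known sound recording from a mixed audio
    recording.

    Stage 1 (first subtraction component): for each segment, fit a
    scalar gain g minimizing ||m - g*r||^2 -- a simple stand-in for a
    local linear transformation of the reference -- and subtract g*r.
    Stage 2 (second subtraction component): attenuate residual segments
    that still correlate with the reference."""
    out = mixed.astype(float).copy()
    ref = ref.astype(float)
    n = min(len(mixed), len(ref))
    for start in range(0, n - seg_len + 1, seg_len):
        m = out[start:start + seg_len]
        r = ref[start:start + seg_len]
        # Stage 1: least-squares gain and subtraction.
        denom = np.dot(r, r)
        g = np.dot(m, r) / denom if denom > 0 else 0.0
        m -= g * r
        # Stage 2: correlation-driven power reduction on the residual.
        norm = np.linalg.norm(m) * np.linalg.norm(r)
        corr = abs(np.dot(m, r)) / norm if norm > 0 else 0.0
        if corr > corr_threshold:
            m *= (1.0 - corr)
    return out
```

With a mix of a sine-wave "recording" and low-level noise standing in for the remaining content, the residual energy after removal is far below the energy of the original mix.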
Abstract:
Provided content is determined to contain an asset represented by reference content by comparing digital fingerprints of the provided content and the reference content. The fingerprints of the reference content and the provided content are generated using a convolutional neural network (CNN). The CNN is trained using a plurality of frame triplets including an anchor frame representing the reference content, a positive frame which is a transformation of the anchor frame, and a negative frame representing content that is not the reference content. The provided content is determined to contain the asset represented by the reference content based on a similarity measure between the generated fingerprints. If the provided content is determined to contain the asset represented by the reference content, a policy associated with the asset is enforced on the provided content.
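The triplet training objective and the similarity-based match decision can be sketched as below. This is a generic triplet-margin loss on precomputed embeddings, not the patented CNN; the function names, the margin value, and the cosine-similarity threshold are illustrative assumptions.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Triplet objective for fingerprint training: pull the transformed
    (positive) frame's embedding toward the anchor's, and push the
    negative frame's embedding away by at least `margin`."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, d_pos - d_neg + margin)

def contains_asset(query_fp, ref_fp, threshold=0.9):
    """Decide whether provided content contains the reference asset via
    cosine similarity between their fingerprints."""
    sim = np.dot(query_fp, ref_fp) / (
        np.linalg.norm(query_fp) * np.linalg.norm(ref_fp))
    return bool(sim >= threshold)
```

A well-separated triplet yields zero loss, while a triplet whose negative is closer than its positive yields a positive loss that drives the network update.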
Abstract:
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for using a machine learning model that has been trained through reinforcement learning to select a content item. One of the methods includes receiving first data characterizing a first context in which a first content item may be presented to a first user in a presentation environment; and providing the first data as input to a long-term engagement machine learning model, the model having been trained through reinforcement learning to: receive a plurality of inputs, and process each of the plurality of inputs to generate a respective engagement score for each input that represents a predicted, time-adjusted total number of selections by the respective user of future content items presented to the respective user in the presentation environment if the respective content item is presented in the respective context.
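The "predicted, time-adjusted total number of selections" is, in reinforcement-learning terms, a discounted return over future user selections. The sketch below computes that training target for an observed selection trajectory; the function name and discount factor are illustrative, and the model itself (which predicts this quantity from a context input) is not shown.

```python
def discounted_engagement(selections, gamma=0.99):
    """Time-adjusted total of future selections: the discounted return
    that the long-term engagement model is trained, via reinforcement
    learning, to predict for a candidate content item in a context.

    `selections` is a 0/1 sequence of whether the user selected the
    content item presented at each future step."""
    total = 0.0
    for t, clicked in enumerate(selections):
        total += (gamma ** t) * clicked
    return total
```

For example, with a discount of 0.5, selections at steps 0 and 2 contribute 1 and 0.25 respectively, for a return of 1.25.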
Abstract:
Systems and methods facilitating removal of content from audio files are described. A method includes identifying a sound recording in a first audio file, identifying a reference file having at least a defined level of similarity to the sound recording, and processing the first audio file to remove the sound recording and generate a second audio file. In some embodiments, winner-take-all coding and Hough transforms are employed for determining alignment and rate adjustment of the reference file in the first audio file. After alignment, the reference file is filtered in the frequency domain to increase similarity between the reference file and the sound recording. The frequency domain representation (FR) of the filtered version is subtracted from the FR of the first audio file, and the result is converted to a time representation of the second audio file. In some embodiments, spectral subtraction is also performed to generate a further improved second audio file.
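The final spectral-subtraction stage can be sketched as below. This is a minimal single-frame illustration, assuming the reference has already been aligned and rate-adjusted (the winner-take-all coding and Hough-transform steps are not shown); the function name and the spectral floor are assumptions.

```python
import numpy as np

def spectral_subtract(mixed, reference, floor=0.0):
    """Spectral subtraction: remove the aligned reference's magnitude
    spectrum from the mix's, keep the mix's phase, floor negative
    magnitudes, and convert back to a time-domain signal."""
    M = np.fft.rfft(mixed)
    R = np.fft.rfft(reference)
    mag = np.maximum(np.abs(M) - np.abs(R), floor)
    return np.fft.irfft(mag * np.exp(1j * np.angle(M)), n=len(mixed))
```

In practice this is applied frame-by-frame with windowing and overlap-add; the single-frame version above conveys the core magnitude-domain operation.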
Abstract:
Aspects relate to determining whether a probe media content matches one or more reference media content. The reference media content is classified into a content class. The probe media content could also be classified into a content class. Similarities between the probe media content and the reference media content are identified. A matching score given to the probe media content is weighted based on statistics regarding matches and false-positive rates for the content class of the reference media content. Further, classifiers can be trained on computed audio features and video features and/or video metadata and audio metadata of various media content.
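The class-statistics weighting can be sketched as below. This is one plausible reading of the abstract, not the patented formula: a raw match score is down-weighted in proportion to the historical false-positive rate of the reference content's class. The function name and the `class_stats` keys are hypothetical.

```python
def weighted_score(raw_score, class_stats):
    """Weight a raw probe-vs-reference match score by per-class
    reliability statistics: content classes with high historical
    false-positive rates contribute less to the final score."""
    fp_rate = class_stats["false_positive_rate"]
    return raw_score * (1.0 - fp_rate)
```

For example, a raw score of 0.8 against a reference class with a 25% historical false-positive rate is reduced to 0.6, so noisy classes require stronger raw evidence to match.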