Abstract:
In one embodiment, a method includes identifying a shared visual concept in visual-media items based on shared visual features in images of the visual-media items; extracting, for each of the visual-media items, n-grams from communications associated with the visual-media item; generating, in a d-dimensional space, an embedding for each of the visual-media items at a location based on the visual concepts included in the visual-media item; generating, in the d-dimensional space, an embedding for each of the extracted n-grams at a location based on a frequency of occurrence of the n-gram in the communications associated with the visual-media items; and associating, with the shared visual concept, the extracted n-grams that have embeddings within a threshold area of the embeddings for the identified visual-media items.
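To make the association step concrete, here is a minimal Python sketch, assuming the item and n-gram embeddings have already been computed as described above; the arrays, function name, threshold value, and choice of Euclidean distance are illustrative assumptions, not the claimed method:

```python
# Minimal sketch: associate extracted n-grams with a shared visual
# concept by proximity in the d-dimensional embedding space. The
# embeddings are assumed precomputed (items by visual concepts,
# n-grams by co-occurrence frequency); all names are illustrative.
import numpy as np

def associate_ngrams(item_embeddings, ngram_embeddings, threshold):
    """Return indices of n-grams whose embedding falls within the
    threshold distance of at least one visual-media item embedding."""
    # Pairwise Euclidean distances, shape (num_ngrams, num_items).
    dists = np.linalg.norm(
        ngram_embeddings[:, None, :] - item_embeddings[None, :, :], axis=2
    )
    # An n-gram qualifies if it is close to any identified item.
    return np.where(dists.min(axis=1) <= threshold)[0]

# Toy usage: 3 items and 4 n-grams embedded in a d=2 space.
items = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1]])
ngrams = np.array([[0.05, 0.05], [2.0, 2.0], [0.0, 0.2], [5.0, 5.0]])
print(associate_ngrams(items, ngrams, threshold=0.25))  # -> [0 2]
```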
Abstract:
In one embodiment, a method includes receiving a plurality of search queries comprising n-grams; identifying a subset of the plurality of search queries as being queries for visual-media items based on one or more n-grams of the search query being associated with visual-media content; calculating, for each of the n-grams of the search queries of the subset, a popularity-score based on a count of the search queries in the subset that include the n-gram; determining popular n-grams, wherein each of the popular n-grams is an n-gram of the search queries of the subset of search queries having a popularity-score greater than a threshold popularity-score; and selecting one or more of the popular n-grams for training a visual-concept recognition system, wherein each of the popular n-grams is selected based on whether it is associated with a visual concept.
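The popularity-score computation lends itself to a short sketch. The following Python code assumes the subset of visual-media queries has already been identified; the unigram-and-bigram tokenization and the strict threshold comparison are illustrative assumptions:

```python
# Minimal sketch: score each n-gram by the number of visual-media
# queries that contain it, then keep those above the threshold.
from collections import Counter

def popular_ngrams(visual_media_queries, threshold):
    counts = Counter()
    for query in visual_media_queries:
        tokens = query.lower().split()
        # Unigrams and bigrams as a simple stand-in for n-gram extraction.
        ngrams = set(tokens) | {" ".join(p) for p in zip(tokens, tokens[1:])}
        counts.update(ngrams)  # each query counted at most once per n-gram
    return {ng for ng, c in counts.items() if c > threshold}

queries = ["funny cat videos", "cat videos", "funny dog videos"]
print(popular_ngrams(queries, threshold=1))
# -> {'videos', 'cat', 'funny', 'cat videos'} (set order may vary)
```

The selected popular n-grams would then be screened for association with a visual concept before being used to train the visual-concept recognition system.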
Abstract:
In one embodiment, a system retrieves a first feature vector for an image. The image is input into a first deep-learning model, which is a first-version model, and the first feature vector may be output from a processing layer of the first deep-learning model for the image. The first feature vector may be converted using a feature-vector conversion model to obtain a second feature vector for the image. The feature-vector conversion model is trained to convert first-version feature vectors to second-version feature vectors. The second feature vector is associated with a second deep-learning model, which is a second-version model that is an updated version of the first-version model. A plurality of predictions for the image may be generated using the second feature vector and the second deep-learning model.
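One simple realization of the feature-vector conversion model is a linear map fit by least squares on paired first-version and second-version feature vectors for the same images. The sketch below uses synthetic data; the linear form, dimensions, and names are illustrative assumptions rather than the trained model described above:

```python
# Minimal sketch: fit a linear feature-vector conversion model on
# paired (first-version, second-version) vectors, then convert a
# stored first-version vector for use with the second-version model.
import numpy as np

rng = np.random.default_rng(0)
d1, d2, n = 128, 256, 1000

# Synthetic paired features: V1 from the first-version model's
# processing layer, V2 from the second-version model, same n images.
V1 = rng.normal(size=(n, d1))
W_true = rng.normal(size=(d1, d2))
V2 = V1 @ W_true + 0.01 * rng.normal(size=(n, d2))

# Fit the conversion by minimizing ||V1 @ W - V2||^2.
W, *_ = np.linalg.lstsq(V1, V2, rcond=None)

def convert(first_version_vector):
    """Map a first-version feature vector into the second-version
    feature space so the second model's head can generate predictions."""
    return first_version_vector @ W

v2_est = convert(rng.normal(size=d1))  # feed to the second-version head
```

A conversion model of this kind can avoid re-running every image through the updated model just to regenerate its features.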
Abstract:
In one embodiment, a method includes receiving a request for a protected resource; providing information to display a challenge-response test, where the challenge-response test includes an image and instructions to provide user input in relation to the image, the image comprises one or more masks, and each of the masks is defined by a perimeter; receiving user input in relation to the image; generating an assessment of the user input based on a correlation between the user input and the masks; determining, based on the assessment, whether the user input corresponds to human-generated input; and, if the user input is deemed responsive to the instructions, providing information to access the protected resource, else providing information indicating that the user input failed the challenge-response test. Each of the masks may include a classification, and the instructions may direct the user to provide input in relation to the classifications.
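The correlation between user input and masks can be illustrated with a point-in-perimeter test. In the sketch below the masks are simplified to labeled axis-aligned boxes, and the click points, target classification, and 0.8 pass fraction are illustrative assumptions:

```python
# Minimal sketch: assess clicks against classified image masks.
from dataclasses import dataclass

@dataclass
class Mask:
    classification: str  # e.g. "cat"
    x0: float
    y0: float
    x1: float
    y1: float  # perimeter simplified to an axis-aligned box

def contains(mask, point):
    x, y = point
    return mask.x0 <= x <= mask.x1 and mask.y0 <= y <= mask.y1

def assess(clicks, masks, target, min_fraction=0.8):
    """Deem the input responsive if enough clicks land inside masks
    bearing the requested classification."""
    targets = [m for m in masks if m.classification == target]
    hits = sum(any(contains(m, c) for m in targets) for c in clicks)
    return bool(clicks) and hits / len(clicks) >= min_fraction

masks = [Mask("cat", 0, 0, 50, 50), Mask("dog", 60, 0, 100, 40)]
print(assess([(10, 10), (40, 30)], masks, target="cat"))  # -> True
```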
Abstract:
In one embodiment, a method includes receiving a query of a first user; retrieving videos that match the query; determining a filtered set of videos, wherein the filtering includes removing duplicate videos based on the duplicate videos having a digital fingerprint that is within a threshold degree of similarity to that of a modal video; calculating, for each video, similarity-scores that correspond to a degree of similarity between the video and another video in the filtered set; grouping the videos into clusters that include videos with similarity-scores greater than a threshold similarity-score with respect to each other video in the cluster; and sending, to the first user, a search-results interface including search results that are organized within the interface based on the respective clusters of their corresponding videos.
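The filtering and clustering steps can be sketched with fingerprints as bit strings compared by a normalized Hamming similarity; the greedy grouping below admits a video into a cluster only if it exceeds the threshold against every member, matching the each-other-video condition. The data, thresholds, and similarity function are illustrative assumptions:

```python
# Minimal sketch: drop near-duplicate fingerprints, then group the
# survivors into clusters of mutually similar videos.

def hamming_sim(a, b):
    """Similarity in [0, 1] between equal-length fingerprint bit strings."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def dedup(fingerprints, same_threshold=0.95):
    kept = []
    for fp in fingerprints:
        if all(hamming_sim(fp, k) < same_threshold for k in kept):
            kept.append(fp)  # skip near-duplicates of a kept video
    return kept

def cluster(fingerprints, sim_threshold=0.7):
    clusters = []
    for fp in fingerprints:
        for c in clusters:
            # Join only if similar to *each other* video in the cluster.
            if all(hamming_sim(fp, m) > sim_threshold for m in c):
                c.append(fp)
                break
        else:
            clusters.append([fp])
    return clusters

fps = ["110010", "110010", "110000", "001111", "001101"]
unique = dedup(fps)    # the exact duplicate of "110010" is dropped
print(cluster(unique)) # -> [['110010', '110000'], ['001111', '001101']]
```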
Abstract:
The techniques introduced here include a system and method for transcoding multimedia content based on the results of content analysis. The specific transcoding parameters used for transcoding the multimedia content can be determined by utilizing the results of content analysis of the multimedia content. One result of the content analysis is the determination of the image type of any images included in the multimedia content. The content analysis uses one or more of several techniques, including analyzing content metadata, examining colors of contiguous pixels in the content, using histogram analysis, using compression-distortion analysis, analyzing image edges, or examining user-provided inputs. Transcoding the multimedia content can include adapting the content to the delivery, display, processing, and storage constraints of user computing devices.
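One content-analysis signal, the image type, can be sketched with a simple color-count heuristic and a corresponding choice of transcoding parameters. Real systems would combine several of the techniques listed above; the heuristic, the 64-color cutoff, and the parameter choices here are illustrative assumptions:

```python
# Minimal sketch: infer image type from raw RGB pixels, then pick
# transcoding parameters suited to that type and the device limits.

def classify_image_type(pixels):
    """Few distinct colors suggests synthetic graphics (logos, UI
    screenshots); many distinct colors suggests a photograph."""
    return "graphic" if len(set(pixels)) < 64 else "photo"

def transcoding_params(pixels, device_max_width=1280):
    if classify_image_type(pixels) == "graphic":
        # Graphics keep sharp edges better under lossless coding.
        return {"format": "png", "max_width": device_max_width}
    # Photographs tolerate lossy compression well.
    return {"format": "jpeg", "quality": 80, "max_width": device_max_width}

logo = [(255, 255, 255)] * 900 + [(200, 0, 0)] * 100        # flat colors
photo = [(r % 256, (r * 7) % 256, (r * 13) % 256) for r in range(1000)]
print(transcoding_params(logo))   # -> lossless path for graphics
print(transcoding_params(photo))  # -> lossy path for photographs
```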