Abstract:
An online system matches a user across multiple online systems based on image data for the user (e.g., profile photo) regardless of whether the image data is from the online system, a different but related online system, or a third party system. For example, to match the user across a social networking system and an INSTAGRAM™ system, the online system compares the similarity between images of the user from both systems in addition to the similarity of textual information in the user profiles on both systems. The similarity of image data and the similarity of textual information associated with the user are used by the online system as indicators of matched user accounts belonging to the same user across both systems. The online system applies models trained using deep learning techniques to match a user across multiple online systems based on the image data and textual information associated with the user.
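The combination of image similarity and textual similarity described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the method the abstract claims: the image embeddings are assumed to come from some trained deep learning model that is not shown, and the names `cosine_similarity`, `text_similarity`, `match_score`, and the `image_weight` parameter are hypothetical. A fixed weighted sum stands in for whatever learned combination the actual system uses.

```python
import math
from difflib import SequenceMatcher


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def text_similarity(a, b):
    """Case-insensitive string similarity in [0, 1] (stand-in for a text model)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_score(emb_a, emb_b, name_a, name_b, image_weight=0.6):
    """Weighted sum of image and text similarity (the weighting is an assumption)."""
    image_sim = cosine_similarity(emb_a, emb_b)
    text_sim = text_similarity(name_a, name_b)
    return image_weight * image_sim + (1.0 - image_weight) * text_sim
```

A higher `match_score` would then be treated as stronger evidence that two accounts belong to the same user; where to place the decision threshold is left open here.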
Abstract:
An online system predicts household features of a user, e.g., household size and demographic composition, based on image data of the user, e.g., profile photos, photos posted by the user and photos posted by other users socially connected with the user, and textual data in the user's profile that suggests relationships among individuals shown in the image data of the user. The online system applies one or more models trained using deep learning techniques to generate the predictions. For example, a trained image analysis model identifies each individual depicted in the photos of the user; a trained text analysis model derives household member relationship information from the user's profile data and tags associated with the photos. The online system uses the predictions to build more information about the user and his/her household in the online system, and to provide improved and targeted content delivery to the user and the user's household.
Abstract:
An audience analysis system determines and predicts reach and frequency information of online users. The system receives real-time ad impression data from ad publishers or other data providers as well as report requests from advertisers asking for the reach and frequency information. The reach and frequency information of online users describes characteristics of online users that are reached by the advertisers. Matched users and unmatched users are identified via online cookies. Atomic data units are generated to allow feature computation and reach prediction for online users in a more efficient way. Machine learning models are trained to help predict the reach and frequency of unmatched users and to generate reports. The audience analysis system provides the advertisers with the generated reports, responding to the report requests.
Abstract:
An online system maintains an identity graph having links between different types of user identifying information (e.g., email addresses, phone numbers, user identifiers) describing various users of the online system. Based on information received from various sources describing relationships between different types of user identifying information describing a user, the online system generates confidence values for each link between different types of user identifying information. In some embodiments, a confidence value accounts for an amount of time since information describing a relationship between different types of user identifying information was received from a source. If the confidence value of a link between different types of user identifying information equals or exceeds a threshold value, the online system determines the different types of user identifying information are correlated with each other, allowing the online system to correlate user identifying information without storing user identifying information received from sources.
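A time-aware confidence value of the kind described in this abstract can be sketched as below. The abstract does not say how the value is computed, so everything specific here is an assumption: the exponential half-life decay, the noisy-OR rule for combining sources, and the names `Observation`, `link_confidence`, `is_correlated`, `HALF_LIFE_DAYS`, and `THRESHOLD` are all hypothetical.

```python
from dataclasses import dataclass

HALF_LIFE_DAYS = 90.0  # assumed: a report loses half its weight every 90 days
THRESHOLD = 0.8        # assumed threshold for treating a link as correlated


@dataclass(frozen=True)
class Observation:
    """One source's report that two identifiers belong together (hypothetical)."""
    source_reliability: float  # in [0, 1]
    age_days: float            # time since the report was received


def link_confidence(observations):
    """Combine time-decayed observations with a noisy-OR rule (an assumed choice)."""
    prob_all_wrong = 1.0
    for obs in observations:
        decay = 0.5 ** (obs.age_days / HALF_LIFE_DAYS)
        prob_all_wrong *= 1.0 - obs.source_reliability * decay
    return 1.0 - prob_all_wrong


def is_correlated(observations, threshold=THRESHOLD):
    """A link counts as correlated when confidence equals or exceeds the threshold."""
    return link_confidence(observations) >= threshold
```

Under these assumptions, a single report with reliability 0.9 clears the threshold when fresh but drops to a confidence of 0.45 after one half-life, so an aging link stops counting as correlated unless newer reports arrive.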
Abstract:
An online system predicts gender, age, interests, or other demographic information of a user based on image data of the user, e.g., profile photos, photos the user posts of him/herself within an online system, and photos of the user posted by other users socially connected with the user, and textual data in the user's profile that suggests age or gender (e.g., likes or dislikes similar to a population of users of an online system). The online system similarly predicts a user's interests based on the photos of the user. The online system applies one or more models trained using deep learning techniques to generate the predictions. The online system uses the predictions to build more information about the user in the online system, and provide improved and targeted content delivery to the user that may have disparate information scattered throughout different online systems.
Abstract:
Online system users interact with one or more third party systems, with the online system maintaining an account for each of its users and each third party system maintaining a third party account for each of its users. The online system compares information in a user's account to accessible information in third party accounts and establishes connections between the user's account and third party accounts based on the comparisons, a connection including a confidence level indicating a likelihood of a third party account being associated with the user of the online system corresponding to the user's account. Similarly, the online system compares information in different third party accounts and establishes connections between different third party accounts based on the comparisons, each such connection including a confidence level indicating a likelihood of a third party account and an additional third party account being associated with the same user.
Abstract:
An online system obtains a set of resolved impressions based on historical data about multiple publishers. A set of features is then extracted, for each resolved impression, based on a comparison of historical data about a first publisher and a second publisher of the multiple publishers. The online system performs training of a machine-learned model based on the set of features. Data about a plurality of new impressions are input into the trained machine-learned model to obtain an output of the trained machine-learned model. A reach overlap metric and a unique reach metric can be computed based on the output of the trained machine-learned model.
Abstract:
Different online systems, such as an ad system or a social networking system, maintain different identifiers. An ad system identifies an association between an unsynced cookie maintained by the ad system and a user of an online system. The ad system identifies an overlap IP sequence including multiple occurrences of a user's user id and multiple occurrences of an unsynced cookie id in communications associated with an IP address over a given time period. The ad system determines an overlap score based on the identified overlap IP sequence. The overlap score indicates how closely the unsynced cookie is associated with the user of the online system. The ad system determines whether the unsynced cookie id and the user id are associated with one another based on the overlap score. The ad system stores an association between the unsynced cookie and the user of the online system, thereby generating a synced cookie.
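One way to picture an overlap score over an IP sequence is sketched below. The abstract does not define the scoring formula, so this is an assumed stand-in: "multiple occurrences" is read as at least two, and the score is the share of an IP address's traffic in the time window that belongs to the user id and cookie id pair. The names `overlap_score`, `sync_cookie`, `MIN_OCCURRENCES`, and the default `threshold` are hypothetical.

```python
from collections import Counter

MIN_OCCURRENCES = 2  # "multiple occurrences" read as at least two (assumption)


def overlap_score(events, user_id, cookie_id):
    """Score how closely cookie_id tracks user_id on one IP address.

    events: identifiers seen on the IP during the time window, in time order.
    Returns 0.0 when either identifier appears fewer than MIN_OCCURRENCES times;
    otherwise the share of the IP's events that belong to the pair.
    """
    counts = Counter(events)
    if counts[user_id] < MIN_OCCURRENCES or counts[cookie_id] < MIN_OCCURRENCES:
        return 0.0
    return (counts[user_id] + counts[cookie_id]) / len(events)


def sync_cookie(events, user_id, cookie_id, threshold=0.6):
    """Return the (cookie_id, user_id) association if the score meets the threshold."""
    if overlap_score(events, user_id, cookie_id) >= threshold:
        return (cookie_id, user_id)
    return None
```

With this scoring, an IP address shared by many other users dilutes the score, which makes a coincidental pairing on a busy network less likely to be stored as a synced cookie.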
Abstract:
A method provides content items to one or more client devices associated with at least one unresolved identifier. An unresolved identifier defines a context in which a client device accesses one or more online systems, the context not determined to be associated with a specific user. The method comprises identifying a set of unresolved identifiers, and identifying information describing one or more access events associated with each unresolved identifier. For each pair of unresolved identifiers, a similarity score for the pair is determined based on the identified information. Responsive to the similarity score exceeding a threshold similarity score, the pair of unresolved identifiers is clustered, the clustering indicating a prediction that the pair of unresolved identifiers are associated with a common user. Based on this clustering, a content item is displayed on one or more user devices associated with at least one unresolved identifier of the set of unresolved identifiers.
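The pairwise scoring and clustering steps in this abstract can be sketched as follows. The abstract does not specify the similarity measure or the clustering procedure, so Jaccard similarity over sets of access-event features and union-find merging are swapped in as assumptions; `jaccard`, `cluster_identifiers`, and the default `threshold` are hypothetical names and values.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_identifiers(access_events, threshold=0.5):
    """Group unresolved identifiers whose pairwise similarity exceeds threshold.

    access_events: dict mapping identifier -> set of hashable event features.
    Returns a list of clusters (sets of identifiers), merged with union-find.
    """
    parent = {ident: ident for ident in access_events}

    def find(x):
        # Walk to the root, shortening the path along the way (path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(access_events, 2):
        if jaccard(access_events[a], access_events[b]) > threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for ident in access_events:
        clusters.setdefault(find(ident), set()).add(ident)
    return list(clusters.values())
```

Note that union-find merges transitively: if A matches B and B matches C, all three land in one cluster even when A and C fall below the threshold. Whether that behavior is desirable is a design choice the abstract leaves open.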