Abstract:
Embodiments described herein enable data associated with a large plurality of users to be analyzed without compromising the privacy of the user data. In one embodiment, a user can opt in to allow analysis of clear text of the user's emails. An analysis process can then be performed in which an analysis service receives clear text of an email of a client device; processes the clear text of the email into one or more tokens having one or more tags; enriches one or more tokens in the processed email using data associated with a user of the client device and the one or more tags; and processes the clear text and one or more enriched tokens to generate a data set of one or more feature vectors.
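The analysis steps in this abstract (tokenize and tag, enrich with user data, build feature vectors) could be sketched roughly as follows. This is a minimal illustration, not the method of the embodiment: the tag names, the regular expressions, the use of a contact list as the "data associated with a user", and the tag-count feature vector are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical tag rules; the abstract does not say how tags are assigned.
TAG_PATTERNS = [
    ("TIME", re.compile(r"^\d{1,2}:\d{2}$")),
    ("NUMBER", re.compile(r"^\d+$")),
]

def tokenize_and_tag(clear_text):
    """Split clear text into (token, tag) pairs; untagged tokens become WORD."""
    tagged = []
    for tok in re.findall(r"[\w:]+", clear_text):
        tag = "WORD"
        for name, pattern in TAG_PATTERNS:
            if pattern.match(tok):
                tag = name
                break
        tagged.append((tok, tag))
    return tagged

def enrich(tagged, user_contacts):
    """Re-tag tokens that match the user's contact names (assumed user data)."""
    known = {c.lower() for c in user_contacts}
    return [(tok, "CONTACT" if tok.lower() in known else tag)
            for tok, tag in tagged]

def feature_vector(enriched, tag_order=("WORD", "NUMBER", "TIME", "CONTACT")):
    """Count tokens per tag, in a fixed tag order (an assumed feature layout)."""
    counts = Counter(tag for _, tag in enriched)
    return [counts.get(t, 0) for t in tag_order]

text = "Lunch with Alice at 12:30"
enriched = enrich(tokenize_and_tag(text), ["Alice"])
vector = feature_vector(enriched)
```

Here the token "Alice" is re-tagged using the contact list, so the enrichment step changes the resulting vector relative to tagging alone.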
Abstract:
One embodiment provides for a mobile electronic device comprising a non-transitory machine-readable medium to store instructions, the instructions to cause the mobile electronic device to receive a set of labeled data from a server; receive a unit of data from the server, the unit of data being of the same type as the set of labeled data; determine a proposed label for the unit of data via a machine learning model on the mobile electronic device, the machine learning model to determine the proposed label for the unit of data based on the set of labeled data from the server and a set of unlabeled data associated with the mobile electronic device; encode the proposed label via a privacy algorithm to generate a privatized encoding of the proposed label; and transmit the privatized encoding of the proposed label to the server.
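The abstract does not name the privacy algorithm used to encode the proposed label. As one possible illustration, the sketch below uses k-ary randomized response, a standard local differential privacy mechanism; the function name, the label set, and the epsilon value are assumptions for the example only.

```python
import math
import random

def privatize_label(label, labels, epsilon, rng=random):
    """k-ary randomized response (assumed mechanism, not from the abstract).

    Reports the true label with probability e^eps / (e^eps + k - 1),
    otherwise reports one of the other k - 1 labels uniformly at random.
    Requires at least two labels and that `label` is one of them.
    """
    k = len(labels)
    p_keep = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if rng.random() < p_keep:
        return label
    others = [candidate for candidate in labels if candidate != label]
    return rng.choice(others)

LABELS = ["cat", "dog", "bird"]
# The device would transmit this value instead of the raw proposed label.
report = privatize_label("cat", LABELS, epsilon=1.0, rng=random.Random(0))
```

A smaller epsilon makes the reported label less likely to be the true one, which strengthens privacy at the cost of requiring more reports for the server to recover accurate label frequencies.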
Abstract:
Systems and methods are disclosed for generating term frequencies of known terms based on crowdsourced differentially private sketches of the known terms. An asset catalog can be updated with new frequency counts for known terms based on the crowdsourced differentially private sketches. Known terms can have a classification. A client device can maintain a privacy budget for each classification of known terms. Classifications can include emojis, deep links, locations, finance terms, and health terms. A privacy budget ensures that a client does not transmit so much information to a term frequency server that the privacy of the client device is compromised.
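A per-classification privacy budget on the client could be modeled as a simple counter that gates each transmission. The sketch below is illustrative only: the class name, the unit cost per transmission, and the budget sizes are assumptions, and the differentially private sketch itself is not shown.

```python
class ClassificationBudget:
    """Tracks how many transmissions remain for each term classification."""

    def __init__(self, limits):
        # limits: classification name -> number of allowed transmissions (assumed unit)
        self._remaining = dict(limits)

    def try_spend(self, classification, cost=1):
        """Deduct `cost` and return True if budget allows; otherwise return False."""
        remaining = self._remaining.get(classification, 0)
        if remaining < cost:
            return False
        self._remaining[classification] = remaining - cost
        return True

    def remaining(self, classification):
        return self._remaining.get(classification, 0)

budget = ClassificationBudget({"emoji": 2, "health": 1})
# Only the first two emoji transmissions are permitted.
sent = [budget.try_spend("emoji") for _ in range(3)]
```

Keeping a separate counter per classification means that exhausting the budget for one classification does not block transmissions for another.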
Abstract:
The disclosed embodiments provide a system that manages access to a user account from an electronic device. The system includes an identity service that provides a device token for the electronic device and a set of handles associated with the user account to the electronic device. Next, the identity service receives, from the electronic device, a handle registration containing one or more selected handles from the set of handles. Finally, the identity service transmits an identity certificate comprising an association between the selected handles and the electronic device to the electronic device, wherein the identity certificate and the association are used to route data associated with the selected handles to and from the electronic device.
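The handle registration exchange might be sketched as below. The abstract does not describe the certificate format or how it is authenticated, so everything here is an assumption: the JSON payload, the HMAC signature (a stand-in for whatever signature scheme a real identity certificate would use), and all class and method names.

```python
import hashlib
import hmac
import json

class IdentityService:
    """Hypothetical identity service that issues handle-to-device certificates."""

    def __init__(self, signing_key: bytes):
        self._key = signing_key
        self._account_handles = {}  # account -> set of handles

    def add_account(self, account, handles):
        self._account_handles[account] = set(handles)

    def _sign(self, payload: str) -> str:
        return hmac.new(self._key, payload.encode(), hashlib.sha256).hexdigest()

    def register_handles(self, account, device_token, selected):
        """Return a certificate associating the selected handles with the device."""
        allowed = self._account_handles.get(account, set())
        if not set(selected) <= allowed:
            raise ValueError("selected handles must belong to the account")
        payload = json.dumps(
            {"device_token": device_token, "handles": sorted(selected)},
            sort_keys=True,
        )
        return {"payload": payload, "signature": self._sign(payload)}

    def verify(self, certificate) -> bool:
        expected = self._sign(certificate["payload"])
        return hmac.compare_digest(expected, certificate["signature"])

service = IdentityService(b"demo-key")
service.add_account("alice", ["alice@example.com", "+15550100"])
cert = service.register_handles("alice", "device-1", ["alice@example.com"])
```

A routing component could then check the certificate before delivering data addressed to one of the selected handles to that device.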
Abstract:
Systems and methods are disclosed for a server learning new words generated by user client devices in a crowdsourced manner while maintaining local differential privacy of client devices. A client device can determine that a word typed on the client device is a new word that is not contained in a dictionary or asset catalog on the client device. New words can be grouped in classifications such as entertainment, health, finance, etc. A differential privacy system on the client device can comprise a privacy budget for each classification of new words. If there is privacy budget available for the classification, then one or more new terms in a classification can be sent to a new term learning server, and the privacy budget for the classification is reduced. The privacy budget can be periodically replenished.
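The client-side flow (detect a new word, check the budget for its classification, send, and later replenish) could look roughly like this. The class and field names are hypothetical, the budget is counted in reports per period as an assumption, and the privatization of the word before transmission is deliberately omitted.

```python
from dataclasses import dataclass, field

@dataclass
class NewWordReporter:
    dictionary: set            # words already known on the device
    budget_per_period: dict    # classification -> reports allowed per period (assumed)
    remaining: dict = field(init=False)
    outbox: list = field(default_factory=list)  # stands in for the server upload

    def __post_init__(self):
        self.remaining = dict(self.budget_per_period)

    def replenish(self):
        """Reset every classification to its full budget (called periodically)."""
        self.remaining = dict(self.budget_per_period)

    def observe(self, word, classification):
        """Queue `word` for reporting if it is new and budget remains."""
        if word.lower() in self.dictionary:
            return False  # known word, nothing to learn
        if self.remaining.get(classification, 0) <= 0:
            return False  # budget exhausted for this classification
        self.remaining[classification] -= 1
        # A real client would privatize the word before sending; omitted here.
        self.outbox.append((classification, word))
        return True

reporter = NewWordReporter({"hello"}, {"health": 1})
```

Once the single "health" report is used, further new health terms are dropped until `replenish()` runs.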
Abstract:
Registering a client computing device for online communication sessions. A registration server receives a message from an SMS (Short Message Service) transit device. The message contains a push token that is unique to the client computing device and a phone number of the client computing device. The SMS transit device received an SMS message having the push token from the client computing device and determined the phone number of the client computing device from that SMS message. The registration server associates the push token with the phone number and stores the association in a registration data store, which is used for inviting users to online communication sessions.
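The registration flow can be illustrated with an in-memory stand-in for the registration data store. The names below, the dictionary-based store, and the shape of the SMS record (a "sender" field for the phone number determined from the SMS and a "body" field carrying the push token) are assumptions for this sketch.

```python
class RegistrationStore:
    """In-memory stand-in for the registration data store."""

    def __init__(self):
        self._token_by_phone = {}

    def register(self, push_token, phone_number):
        # Associate the device's push token with its phone number.
        self._token_by_phone[phone_number] = push_token

    def token_for(self, phone_number):
        """Look up the push token used to invite this phone number, if any."""
        return self._token_by_phone.get(phone_number)

def sms_transit_forward(sms, store):
    """Hypothetical transit step: pair the token in the SMS body with the sender."""
    store.register(push_token=sms["body"], phone_number=sms["sender"])

store = RegistrationStore()
sms_transit_forward({"sender": "+15550100", "body": "token-abc"}, store)
```

An inviting party that knows only the phone number can then resolve it to the push token of the device to be invited.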
Abstract:
Embodiments described herein provide a technique to crowdsource labeling of training data for a machine learning model while maintaining the privacy of the data provided by crowdsourcing participants. Client devices can be used to generate proposed labels for a unit of data to be used in a training dataset. One or more privacy mechanisms are used to protect user data when transmitting the data to a server. The server can aggregate the proposed labels and use the most frequently proposed label for an element as the label for that element when generating training data for the machine learning model. The machine learning model is then trained using the crowdsourced labels to improve the accuracy of the model.
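The server-side aggregation step amounts to a majority vote per element. A minimal sketch follows, assuming reports arrive as (element id, proposed label) pairs; it ignores any debiasing that a particular privacy mechanism might require before counting.

```python
from collections import Counter, defaultdict

def aggregate_labels(reports):
    """Pick the most frequently proposed label for each element.

    reports: iterable of (element_id, proposed_label) pairs (assumed format).
    """
    votes = defaultdict(Counter)
    for element_id, label in reports:
        votes[element_id][label] += 1
    return {eid: counts.most_common(1)[0][0] for eid, counts in votes.items()}

reports = [
    ("img-1", "cat"), ("img-1", "cat"), ("img-1", "dog"),
    ("img-2", "bird"), ("img-2", "bird"),
]
training_labels = aggregate_labels(reports)
```

Because each individual report may be randomized on the device, the vote is only meaningful when many devices contribute labels for the same element.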