-
Publication No.: US10853696B1
Publication Date: 2020-12-01
Application No.: US16382162
Application Date: 2019-04-11
Applicant: Facebook, Inc.
Inventor: Enming Luo , Emanuel Alexandre Strauss
Abstract: An online system uses a model to detect violations of policies enforced by the online system for content uploaded to the online system by users for viewing by other users. The online system trains the model in multiple stages. To train the model, the online system obtains a set of training content items, with each content item of the set labeled with both a policy violated by the content item and a source of the content item, which acts as a proxy for a sub-category identifying a way in which the content item violated the policy. In a first stage, the online system trains the model using the set of training content items. In a second stage, the model is trained to predict policy violations from content items that are not labeled with a source. For example, the second stage is performed by freezing earlier layers in the model.
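The layer freezing mentioned for the second stage can be illustrated with a minimal sketch. It is not the patented training procedure: the layer names, gradient values, and the `sgd_step` helper are all hypothetical, and only the freezing mechanism itself is shown.

```python
def sgd_step(params, grads, lr, frozen=frozenset()):
    """Apply one gradient step; layers named in `frozen` keep their weights."""
    updated = {}
    for name, weights in params.items():
        if name in frozen:
            updated[name] = list(weights)  # frozen layer: copy unchanged
        else:
            updated[name] = [w - lr * g for w, g in zip(weights, grads[name])]
    return updated

# Hypothetical toy parameters and gradients (assumptions for illustration).
params = {"layer1": [0.5, -0.2], "layer2": [0.1, 0.3], "policy_head": [0.0, 0.0]}
grads = {"layer1": [0.1, 0.1], "layer2": [0.2, -0.2], "policy_head": [1.0, -1.0]}

# Stage 1: every layer is trained (items labeled with policy and source).
stage1 = sgd_step(params, grads, lr=0.1)

# Stage 2: earlier layers are frozen; only the policy head keeps learning.
stage2 = sgd_step(stage1, grads, lr=0.1, frozen={"layer1", "layer2"})
```

In this sketch the second call leaves `layer1` and `layer2` exactly as stage 1 produced them, while `policy_head` continues to move.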
-
Publication No.: US20180349942A1
Publication Date: 2018-12-06
Application No.: US15608803
Application Date: 2017-05-30
Applicant: Facebook, Inc.
Inventor: Yang Mu , Emanuel Alexandre Strauss , Daniel Olmedilla de la Calle
Abstract: For various content campaigns (or content), an online system predicts a likelihood score of context violations (e.g., account term violations) of a content campaign. The online system derives a plurality of feature vectors of the content campaign. The online system predicts a likelihood score of context violation of the content campaign using a memorization model based on the plurality of feature vectors. The memorization model comprises a plurality of categories and a plurality of items of each category. Each of the plurality of categories has a category weight, and each of the plurality of items of each category has an item weight. The predicted likelihood score is based on a combination of a plurality of category weights and a plurality of item weights associated with the plurality of feature vectors. The online system performs an action affecting the content campaign based in part on the predicted likelihood score.
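One way to picture the combination of category weights and item weights is the sketch below. The abstract does not state how the weights are combined, so the product-sum passed through a sigmoid is an assumption, as are the function name and the example categories.

```python
import math

def violation_score(features, category_weights, item_weights):
    """Combine category and item weights into a likelihood in (0, 1).

    `features` maps a category name to the item observed for it.
    The product-sum plus sigmoid is an assumed combination rule.
    """
    total = 0.0
    for category, item in features.items():
        cw = category_weights.get(category, 0.0)
        iw = item_weights.get((category, item), 0.0)  # unseen items add nothing
        total += cw * iw
    return 1.0 / (1.0 + math.exp(-total))

# Hypothetical weights for illustration only.
category_weights = {"domain": 2.0, "account_age": 0.5}
item_weights = {("domain", "spam.example"): 1.5, ("account_age", "new"): 1.0}

score = violation_score(
    {"domain": "spam.example", "account_age": "new"},
    category_weights,
    item_weights,
)
```

A campaign whose features are all unseen items scores exactly 0.5 under this rule, which is one reason a real system would likely add a bias term.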
-
Publication No.: US20180253661A1
Publication Date: 2018-09-06
Application No.: US15449448
Application Date: 2017-03-03
Applicant: Facebook, Inc.
Inventor: Emanuel Alexandre Strauss
IPC: G06N99/00
CPC classification number: G06N20/00 , G06N7/005 , G06Q30/0269 , G06Q50/01
Abstract: An online system maintains machine learning models that determine risk scores for content items indicating likelihoods of content items violating content policies associated with the machine learning models. When the online system obtains an additional content policy, the online system applies a maintained machine learning model to a set including content items previously identified as violating or not violating the additional content policy. The online system maps the risk scores determined for content items of the set to likelihoods of violating the additional content policy based on the identifications of content items in the set violating or not violating the additional content policy. Subsequently, the online system applies the maintained machine learning model to content items and determines likelihoods of the content items violating the additional content policy based on the mapping of risk scores to likelihood of violating the additional content policy.
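The mapping from risk scores to likelihoods could, for example, be built by bucketing scores and measuring the violation rate in each bucket. The sketch below assumes scores in [0, 1] and equal-width buckets; the abstract does not specify the mapping method, and the function names are hypothetical.

```python
def fit_mapping(scores, violates, n_bins=5):
    """Return the empirical violation rate for each equal-width score bucket."""
    counts = [0] * n_bins
    hits = [0] * n_bins
    for score, violated in zip(scores, violates):
        b = min(int(score * n_bins), n_bins - 1)  # score 1.0 goes in last bucket
        counts[b] += 1
        hits[b] += int(violated)
    # None marks buckets with no labeled examples.
    return [hits[b] / counts[b] if counts[b] else None for b in range(n_bins)]

def likelihood(score, mapping):
    """Look up the likelihood of violating the new policy for a risk score."""
    b = min(int(score * len(mapping)), len(mapping) - 1)
    return mapping[b]

# Items already labeled against the additional policy (made-up values).
mapping = fit_mapping([0.1, 0.15, 0.9, 0.95], [False, False, True, False], n_bins=2)
estimate = likelihood(0.8, mapping)
```

The existing model is never retrained here; only the lookup table is fitted to the new policy's labels.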
-
Publication No.: US11195099B2
Publication Date: 2021-12-07
Application No.: US15694321
Application Date: 2017-09-01
Applicant: Facebook, Inc.
Inventor: Enming Luo , Yang Mu , Emanuel Alexandre Strauss , Taiyuan Zhang , Daniel Olmedilla de la Calle
Abstract: A content review system for an online system automatically determines if received content items to be displayed to users violate any policies of the online system. The content review system generates a semantic vector representing the semantic features of a content item, for example, using a neural network. By comparing the semantic vector for the content item with semantic vectors of content items previously determined to violate one or more policies, the content review system determines whether the content item also violates one or more policies. The content review system may also maintain templates corresponding to portions of semantic vectors shared by multiple content items. An analysis of historical content items that conform to the template is performed to determine a probability that received content items that conform to the template violate a policy.
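The comparison against previously flagged semantic vectors can be sketched as a nearest-match check. Cosine similarity and the 0.9 threshold are assumptions chosen for illustration; the abstract does not name a similarity measure, and the vectors here are made up rather than produced by a neural network.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def matches_known_violation(vector, known_violations, threshold=0.9):
    """True if `vector` is close to any vector already flagged as violating."""
    return any(cosine(vector, known) >= threshold for known in known_violations)

# Hypothetical stored vectors of content previously found to violate a policy.
known_violations = [[1.0, 0.0], [0.0, 1.0]]

near_duplicate = matches_known_violation([0.99, 0.05], known_violations)
unrelated = matches_known_violation([1.0, 1.0], known_violations)
```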
-
Publication No.: US10956522B1
Publication Date: 2021-03-23
Application No.: US16003770
Application Date: 2018-06-08
Applicant: Facebook, Inc.
Inventor: Abhay Kumar Jha , Emanuel Alexandre Strauss
Abstract: An online system enforces policies on content items that are distributed on its platform and blocks content items that violate one or more of those policies. To identify content items that are slightly varied from each other, the online system generates an embedding for each of the known content items that have already been determined to be noncompliant with one or more policies. The online system then groups the known noncompliant content items that are clustered together in the embedding space. The texts of the group of known noncompliant content items are converted to finite state automata and are merged to generate a common automaton. The common automaton is used to generate a common regular expression that is used to screen new content items. When a new content item matches the textual pattern defined by the common regular expression, the system may block the new content item.
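The idea of deriving one regular expression from near-duplicate texts can be sketched as follows. This is not the automaton merge described in the abstract: it substitutes a character alignment from Python's `difflib.SequenceMatcher`, keeps the shared spans literally, and replaces each differing span with a wildcard. The function name and sample texts are hypothetical.

```python
import re
from difflib import SequenceMatcher

def common_pattern(text_a, text_b):
    """Build a regex matching both texts: shared spans literal, the rest wildcard."""
    matcher = SequenceMatcher(None, text_a, text_b, autojunk=False)
    parts = []
    for tag, a_start, a_end, _b_start, _b_end in matcher.get_opcodes():
        if tag == "equal":
            parts.append(re.escape(text_a[a_start:a_end]))
        else:
            parts.append(".*?")  # any variation where the two texts differ
    return re.compile("".join(parts))

# Two known noncompliant texts from the same cluster (made-up examples).
pattern = common_pattern("Buy cheap pills now", "Buy cheap p1lls now")

# Screen a new content item against the common pattern.
blocked = pattern.fullmatch("Buy cheap p!lls now") is not None
allowed = pattern.fullmatch("Hello world") is None
```

A wildcard this loose would over-match in practice; merging automata, as the abstract describes, can restrict each varying position to the alternatives actually observed.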
-
Publication No.: US10599774B1
Publication Date: 2020-03-24
Application No.: US15905709
Application Date: 2018-02-26
Applicant: Facebook, Inc.
Inventor: Enming Luo , Emanuel Alexandre Strauss
Abstract: A content review system for an online system automatically determines if received content items to be displayed to users contain text that violates a policy of the online system. The content review system generates a semantic vector representing semantic features of text extracted from the content item, for example, using a neural network. By comparing the semantic vector for the extracted text with stored semantic vectors of extracted text previously determined to violate one or more policies, the content review system determines whether the content item contains text that also violates one or more policies. The content review system also reviews stored semantic vectors previously determined to be unsuitable, in order to remove false positives, as well as unsuitable semantic vectors that are sufficiently similar to known suitable semantic vectors and as such may cause content items having suitable text to be erroneously rejected.
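The clean-up of stored unsuitable vectors can be sketched as a filter that drops any unsuitable vector lying too close to a known suitable one. Cosine similarity, the 0.95 cutoff, and the function names are assumptions for illustration; the abstract says only that the vectors are "sufficiently similar".

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def prune_unsuitable(unsuitable, suitable, max_similarity=0.95):
    """Keep only unsuitable vectors that are not near any suitable vector."""
    return [
        u for u in unsuitable
        if all(cosine(u, s) < max_similarity for s in suitable)
    ]

# Hypothetical stored vectors: the first unsuitable one nearly equals a
# suitable vector, so it would cause acceptable text to be rejected.
unsuitable = [[1.0, 0.0], [0.0, 1.0]]
suitable = [[0.99, 0.01]]

kept = prune_unsuitable(unsuitable, suitable)
```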
-
Publication No.: US20190164196A1
Publication Date: 2019-05-30
Application No.: US15826392
Application Date: 2017-11-29
Applicant: Facebook, Inc.
Inventor: Sijian Tang , Shengbo Guo , Jiayi Wen , Gregory Matthew Marra , James Li , Seiji James Yamamoto , Grace Louise Jackson , Kristin S. Hendrix , Benxiong Wu , Jiun-Ren Lin , Sara Lee Su , Panagiotis Papadimitriou , Michael Charles Bailey , Cristian Orellana , Emanuel Alexandre Strauss
IPC: G06Q30/02 , G06F3/0482 , G06F17/30 , G06N5/02
Abstract: The disclosed computer-implemented method may include (1) sampling links from an online system, (2) receiving, from a human labeler for each of the links, a label indicating whether the human labeler considers a landing page of the link to be a low-quality webpage, (3) deriving features from a landing page of each of the links, (4) using the label and the features of each of the links to train a model configured to predict a likelihood that a link is to a low-quality webpage, (5) identifying content items that are candidates for a content feed of a user of the online system, (6) applying the model to a link of each of the content items to determine a ranking of the content items, and (7) displaying the content items in the content feed of the user based on the ranking. Various other methods, systems, and computer-readable media are also disclosed.
-
Publication No.: US09959412B2
Publication Date: 2018-05-01
Application No.: US15067498
Application Date: 2016-03-11
Applicant: Facebook, Inc.
CPC classification number: G06F21/577 , G06F17/3053 , G06F21/10 , G06F2221/0775 , G06N99/005 , G06Q30/0269 , G06Q30/0275
Abstract: An online system obtains risk scores determined by a machine learning model for a content item provided by a user of an online system for display to users of the online system, where the risk scores indicate the likelihood of content items violating a content policy. The online system uses the risk scores to determine sampling weights used to select content items for inclusion in a sampled subset of content items. The sampling weights are determined from risk score counts indicating the relative frequency of the obtained risk scores and impression counts indicating the number of times content items have been presented to the users of the online system. The online system presents the selected content items for evaluation by a human reviewer using a quality review interface. Using the results of the quality review, the online system determines quality performance metrics of the machine learning model.
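A sampling scheme of this kind might look like the sketch below. The abstract does not give the weight formula, so dividing each item's impression count by the number of items sharing its risk-score bucket is an assumption: it favors widely seen items and keeps rare score ranges represented. The field names and bucket count are hypothetical.

```python
import random
from collections import Counter

def bucket(risk, n_bins=10):
    """Map a risk score in [0, 1] to an equal-width bucket index."""
    return min(int(risk * n_bins), n_bins - 1)

def sampling_weights(items, n_bins=10):
    """Weight = impressions / number of items in the same risk-score bucket."""
    counts = Counter(bucket(item["risk"], n_bins) for item in items)
    return [
        item["impressions"] / counts[bucket(item["risk"], n_bins)]
        for item in items
    ]

# Made-up content items with model risk scores and impression counts.
items = [
    {"id": "a", "risk": 0.15, "impressions": 100},
    {"id": "b", "risk": 0.12, "impressions": 50},
    {"id": "c", "risk": 0.95, "impressions": 10},
]

weights = sampling_weights(items)

# Draw a weighted sample (with replacement) to send to human reviewers.
review_sample = random.Random(0).choices(items, weights=weights, k=2)
```

Reviewer verdicts on the sampled items can then be reweighted by the inverse of these weights when estimating the model's quality metrics.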
-
Publication No.: US20170262635A1
Publication Date: 2017-09-14
Application No.: US15067498
Application Date: 2016-03-11
Applicant: Facebook, Inc.
CPC classification number: G06F21/577 , G06F17/3053 , G06F21/10 , G06F2221/0775 , G06N99/005 , G06Q30/0269 , G06Q30/0275
Abstract: An online system obtains risk scores determined by a machine learning model for a content item provided by a user of an online system for display to users of the online system, where the risk scores indicate the likelihood of content items violating a content policy. The online system uses the risk scores to determine sampling weights used to select content items for inclusion in a sampled subset of content items. The sampling weights are determined from risk score counts indicating the relative frequency of the obtained risk scores and impression counts indicating the number of times content items have been presented to the users of the online system. The online system presents the selected content items for evaluation by a human reviewer using a quality review interface. Using the results of the quality review, the online system determines quality performance metrics of the machine learning model.