Abstract:
An online system identifies an additional feature to evaluate for inclusion in a machine learned model. The additional feature is based on characteristics of one or more dimensions of information maintained by the online system. To generate data for evaluating the additional feature, the online system generates various partitions of stored data, where each partition includes characteristics associated with one or more dimensions on which the additional feature is based. Using values of characteristics in a partition, the online system generates values for the additional feature and includes the values of the additional feature in the partition. Values for the additional feature are generated for various partitions based on the values of characteristics in each partition. The online system combines multiple partitions that include values for the additional feature to generate a training set for evaluating a machine learned model including the additional feature.
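The partition-and-combine flow described above can be sketched roughly as follows. This is a minimal illustration, not the method of the abstract itself: the record fields (`day`, `clicks`, `impressions`, `label`), the candidate feature `click_rate`, and all function names are hypothetical, chosen only to show partitions being built along one dimension, enriched with the new feature, and merged into a training set.

```python
from collections import defaultdict


def partition_by(records, dimension):
    """Group copies of the records into partitions keyed by one dimension's value."""
    partitions = defaultdict(list)
    for record in records:
        # Copy each row so the stored data is not modified.
        partitions[record[dimension]].append(dict(record))
    return dict(partitions)


def add_feature(partition, name, compute):
    """Compute the candidate feature for every row and store it in the partition."""
    for row in partition:
        row[name] = compute(row)
    return partition


def build_training_set(records, dimension, name, compute):
    """Generate feature values per partition, then combine the partitions."""
    partitions = partition_by(records, dimension)
    training_set = []
    for key in sorted(partitions):
        training_set.extend(add_feature(partitions[key], name, compute))
    return training_set


# Hypothetical stored data; the field names are assumptions for illustration.
records = [
    {"day": "2024-01-01", "clicks": 3, "impressions": 10, "label": 1},
    {"day": "2024-01-02", "clicks": 0, "impressions": 5, "label": 0},
    {"day": "2024-01-01", "clicks": 1, "impressions": 4, "label": 0},
]

training_set = build_training_set(
    records,
    dimension="day",
    name="click_rate",
    compute=lambda row: row["clicks"] / row["impressions"],
)
```

Because each partition is processed independently, the feature computation could in principle run per partition in parallel before the merge; the sketch keeps it sequential for clarity.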
Abstract:
In one embodiment, a method includes, on receiving a query with one or more query terms, generating a reconstructed embedding of the query based on one or more term embeddings associated with the one or more query terms, respectively; formulating an evaluation model based at least on the reconstructed embedding of the query, where the evaluation model calculates a relevance score for posts with respect to the query based at least on the classifier vectors of the posts; and calculating, for each of the retrieved posts, a relevance score for the post by applying the associated classifier vector to the formulated evaluation model.
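The scoring steps above might look like the following sketch. The abstract does not say how the embedding is reconstructed or what form the evaluation model takes, so two assumptions are made here: the reconstructed embedding is the element-wise mean of the term embeddings, and the evaluation model is a dot product between that embedding and a post's classifier vector. All names and the toy vectors are hypothetical.

```python
def reconstruct_query_embedding(query_terms, term_embeddings):
    """Assumption: reconstruct the query embedding as the mean of its term embeddings."""
    vectors = [term_embeddings[t] for t in query_terms if t in term_embeddings]
    if not vectors:
        raise ValueError("no known term embeddings for this query")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def formulate_evaluation_model(query_embedding):
    """Assumption: the evaluation model is a dot product with the query embedding."""
    def score(classifier_vector):
        return sum(q * c for q, c in zip(query_embedding, classifier_vector))
    return score


def rank_posts(query_terms, term_embeddings, post_vectors):
    """Score every retrieved post and return (post_id, score) pairs, best first."""
    query_embedding = reconstruct_query_embedding(query_terms, term_embeddings)
    model = formulate_evaluation_model(query_embedding)
    scores = {post_id: model(vec) for post_id, vec in post_vectors.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data, invented for illustration only.
term_embeddings = {"red": [1.0, 0.0], "car": [0.0, 1.0]}
post_vectors = {"a": [2.0, 2.0], "b": [1.0, 0.0]}

ranking = rank_posts(["red", "car"], term_embeddings, post_vectors)
```

Building the model once per query and then applying it to each post's classifier vector mirrors the order of steps in the abstract: the per-query work is done up front, and the per-post work is a single vector operation.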