Abstract:
Methods, systems, and computer programs are presented for providing a personalized news stream to a user. One method includes an operation for identifying user features associated with a user. The user features include personal features and social features. The personal features are based on activities of the user and the profile of the user. The social features are based on information about social connections of the user. The method further includes operations for extracting content features from a corpus of content items, for identifying intersections between user features and content features, and for assigning weights to the content features from the corpus based on the identified intersections. A score for each content item is determined based on the content features and the respective weights of the content items. The content items are then ranked based on the scores. One or more of the ranked content items are displayed.
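The scoring flow in this abstract can be illustrated with a minimal sketch. It is not the claimed method: the function name `rank_content`, the use of numeric feature strengths, the `social_discount` factor, and additive scoring are all assumptions made for illustration.

```python
def rank_content(personal, social, corpus, social_discount=0.5):
    """Rank content items for one user (hypothetical sketch).

    personal, social: dicts mapping a feature name to a strength (assumed).
    corpus: dict mapping an item id to the set of content features it has.
    social_discount: assumed factor that makes social features count less.
    """
    # Merge personal and social features into one user feature profile.
    user = dict(personal)
    for feature, strength in social.items():
        user[feature] = user.get(feature, 0.0) + social_discount * strength

    # Extract the content features present anywhere in the corpus.
    content_features = set().union(*corpus.values())

    # Weight only the features in the intersection of user and content features.
    weights = {f: user[f] for f in content_features if f in user}

    # Score each item by summing the weights of its content features.
    scores = {
        item: sum(weights.get(f, 0.0) for f in features)
        for item, features in corpus.items()
    }

    # Highest score first; ties are broken by item id for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
```

A caller would display the first few entries of the returned list. Any real system would likely learn the weights rather than add raw strengths.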
Abstract:
Software for an online content service obtains a plurality of events chronologically generated by a plurality of users of the online content service during a specified period of time. The software identifies any content items associated with each event and annotates each of the content items with (a) a plurality of metadata attributes associated with the content item and (b) a plurality of metadata attributes associated with the online content service. The software sorts the events by user and by content identifier and orders the sorted events by timestamp. Using the ordered events for a specific content item, a look-back time period, and a look-ahead time period, the software determines the events that make up a content session for that content item and a specific user. The software then generates an analytic based at least in part on the content session.
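One possible reading of the session step is sketched below. The event fields (`user`, `content_id`, `ts`), the anchor timestamp, and the choice of session duration as the analytic are assumptions; the abstract does not specify them.

```python
from operator import itemgetter


def content_session(events, user, content_id, anchor_ts, look_back, look_ahead):
    """Return the events forming one content session (hypothetical sketch).

    events: list of dicts with assumed keys 'user', 'content_id', 'ts'.
    anchor_ts: assumed reference time around which the window is placed.
    """
    # Sort by user and content identifier, then order by timestamp.
    ordered = sorted(events, key=itemgetter("user", "content_id", "ts"))

    # Keep events for this user and item inside the look-back/look-ahead window.
    start, end = anchor_ts - look_back, anchor_ts + look_ahead
    return [
        e for e in ordered
        if e["user"] == user
        and e["content_id"] == content_id
        and start <= e["ts"] <= end
    ]


def session_duration(session):
    """Example analytic (an assumption): time from first to last session event."""
    return session[-1]["ts"] - session[0]["ts"] if session else 0
```

Timestamps here are plain numbers in seconds. A production pipeline would more likely group events once and slide the window over each group.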
Abstract:
In one embodiment, in a hierarchy of nodes, a master node having two or more child nodes obtains from the two or more child nodes two or more sets of data samples or summaries associated therewith, the two or more sets of data samples being representative of traffic processed via two or more sets of servers corresponding to the two or more child nodes, wherein a size of each of the two or more sets of data samples is proportional to an allocation of traffic among the two or more sets of servers corresponding to the two or more child nodes. Each of the two or more sets of data samples is obtained from a different one of the two or more child nodes and represents traffic processed by a corresponding one of the two or more sets of servers. The master node combines the two or more sets of data samples or summaries associated therewith such that a combined set of data is generated. The master node ascertains a numerical value from the combined set of data.
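The aggregation described here can be sketched as follows. The function names, the fixed total sample budget, and the choice of the median as the numerical value are assumptions for illustration only.

```python
import random
import statistics


def draw_sample(values, traffic_share, total_sample_size, rng):
    """Child node: sample its traffic in proportion to its traffic share.

    traffic_share is the assumed fraction of all traffic this child's
    servers handle; total_sample_size is an assumed global sample budget.
    """
    k = min(len(values), round(traffic_share * total_sample_size))
    return rng.sample(values, k)


def master_estimate(child_samples):
    """Master node: combine the child samples and compute one number."""
    combined = [v for sample in child_samples for v in sample]
    # The median is one example of a numerical value; a mean or a
    # percentile could be computed from the combined set instead.
    return statistics.median(combined)
```

Because each sample size tracks the traffic allocation, the combined set approximates a sample of the overall traffic, so the master does not need to reweight the children.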
Abstract:
A content item categorizer system retrieves content items from Internet sources. If a retrieved content item includes sufficient information for traditional categorization methods, then the system assigns one or more categories to the content item using such traditional methods. The system creates a metadata model, based on information about traditionally-categorized content items, that maps at least hashtags from the content items to one or more content categories. When the system retrieves a sparse-info item that does not include sufficient information for traditional categorization, the system applies the metadata model to categorize the content item using at least hashtags in the sparse-info item. The metadata model may also include information indicating mappings between categories and coincidence of hashtags and additional content item attributes. Also, the metadata model may provide information for categorizing sparse-info items based on multiple hashtags in the sparse-info item metadata.
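A minimal sketch of such a metadata model follows, assuming a simple co-occurrence count between hashtags and categories. The function names and the vote-counting rule are hypothetical, and the sketch omits the additional content item attributes the abstract mentions.

```python
from collections import Counter, defaultdict


def build_model(categorized_items):
    """Build a hashtag -> category count model (hypothetical sketch).

    categorized_items: iterable of (hashtags, categories) pairs taken from
    items that were already categorized by traditional methods.
    """
    model = defaultdict(Counter)
    for hashtags, categories in categorized_items:
        for tag in hashtags:
            # Count how often each hashtag co-occurs with each category.
            model[tag.lower()].update(categories)
    return model


def categorize_sparse(model, hashtags):
    """Categorize a sparse-info item from its hashtags, or return None."""
    votes = Counter()
    for tag in hashtags:
        # Multiple hashtags pool their counts into one vote tally.
        votes.update(model.get(tag.lower(), {}))
    if not votes:
        return None
    # Most votes wins; ties are broken alphabetically for determinism.
    return min(votes.items(), key=lambda kv: (-kv[1], kv[0]))[0]
```

Returning `None` for unknown hashtags leaves the item uncategorized instead of guessing.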
Abstract:
Methods for categorizing news are presented. One method groups articles into clusters that share a common topic. A first category is identified for each article that indicates whether the article is news. Further, the method includes an operation for determining use data for each article, the use data having information about people who have accessed or referenced the article. Additionally, the method includes an operation for combining the use data and the first category for all the articles in each cluster to determine the geographical scope of interest for the cluster. The use data and the first category are also combined for all the articles in each cluster to determine a second category for each article that indicates whether the article is general news, topical news, or not news. The articles are presented to the user based on the geographical scope of interest, the second category, and the attributes of the user.
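The combination step might look roughly like the sketch below. Everything specific in it is an assumption: the field names, the reader-region representation of use data, both thresholds, and the rule that separates general from topical news are invented for illustration and are not taken from the abstract.

```python
from collections import Counter


def classify_cluster(articles, local_threshold=0.6, general_min_readers=100):
    """Return (geographical scope, second category) for one cluster.

    articles: non-empty list of dicts with assumed keys 'is_news' (the
    first category, a bool) and 'reader_regions' (use data: one region
    string per person who accessed or referenced the article).
    """
    if not articles:
        raise ValueError("cluster must contain at least one article")

    # Pool the use data of all articles in the cluster.
    regions = Counter(r for a in articles for r in a["reader_regions"])
    total = sum(regions.values())

    # Scope: a single region if it dominates readership, else global.
    if total == 0:
        scope = "unknown"
    else:
        top_region, top_count = regions.most_common(1)[0]
        scope = top_region if top_count / total >= local_threshold else "global"

    # Second category: combine the per-article news flags with readership.
    news_share = sum(a["is_news"] for a in articles) / len(articles)
    if news_share < 0.5:
        second = "not news"
    elif total >= general_min_readers:
        second = "general news"
    else:
        second = "topical news"
    return scope, second
```

In this sketch every article in a cluster receives the cluster-level second category. Presentation would then filter by scope, category, and user attributes.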