CONTENT RANKING BASED ON USER FEATURES IN CONTENT
    1.
    Patent application
    Status: In force

    Publication number: US20150193540A1

    Publication date: 2015-07-09

    Application number: US14147789

    Filing date: 2014-01-06

    Applicant: Yahoo! Inc.

    Inventor: Mike Wexler

    CPC classification number: G06F17/30867

    Abstract: Methods, systems, and computer programs are presented for providing a personalized news stream to a user. One method includes an operation for identifying user features associated with a user. The user features include personal features and social features. The personal features are based on activities of the user and the profile of the user. The social features are based on information about social connections of the user. The method further includes operations for extracting content features from a corpus of content items, for identifying intersections between user features and content features, and for assigning weights to the content features from the corpus based on the identified intersections. A score for each content item is determined based on the content features and the respective weights of the content items. The content items are then ranked based on the scores. One or more of the ranked content items are displayed.
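
The scoring flow in this abstract can be illustrated with a minimal sketch. It assumes features are plain strings and that a content feature's weight is a fixed bonus for each user feature set it intersects; `rank_items`, `personal_weight`, and `social_weight` are hypothetical names and values, not taken from the patent.

```python
from typing import Dict, List, Set, Tuple


def rank_items(
    personal: Set[str],
    social: Set[str],
    corpus: Dict[str, Set[str]],
    personal_weight: float = 2.0,  # assumed bonus for a personal-feature match
    social_weight: float = 1.0,    # assumed bonus for a social-feature match
) -> List[Tuple[str, float]]:
    """Rank content items by the weighted overlap with the user's features."""
    scored = []
    for item_id, content_features in corpus.items():
        score = 0.0
        for feature in content_features:
            # Weight each content feature by the user feature sets it intersects.
            if feature in personal:
                score += personal_weight
            if feature in social:
                score += social_weight
        scored.append((item_id, score))
    # Highest score first; ties broken by item id for a stable order.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored


personal = {"python", "cycling"}
social = {"cycling", "jazz"}
corpus = {
    "a": {"python", "news"},
    "b": {"cycling", "jazz"},
    "c": {"cooking"},
}
ranking = rank_items(personal, social, corpus)
```

Item "b" ranks first here because "cycling" matches both feature sets and "jazz" matches the social set. A real system would likely learn the weights rather than fix them.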

    Determining Content Sessions Using Content-Consumption Events
    2.
    Patent application
    Status: Under examination (published)

    Publication number: US20160292170A1

    Publication date: 2016-10-06

    Application number: US14673854

    Filing date: 2015-03-30

    Applicant: Yahoo! Inc.

    Abstract: Software for an online content service obtains a plurality of events chronologically generated by a plurality of users of an online content service during a specified period of time. The software identifies any content items associated with each event and annotates each of the content items with (a) a plurality of metadata attributes associated with the content item and (b) a plurality of metadata attributes associated with the online content service. The software sorts the events based on user and based on content identifier and orders the sorted events based on timestamp. The software determines the events that make up a content session for the specific content item and the specific user, using the ordered events for the specific content item and a look-back time period and a look-ahead time period. Then the software generates an analytic based at least in part on the content session.
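
One way to read the session step is sketched below. It assumes a session is the set of events for one user and one content item that fall inside a window around an anchor timestamp, extended backward by the look-back period and forward by the look-ahead period. The `Event` fields and the `anchor_ts` parameter are hypothetical; the abstract does not say how the window is anchored.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Event:
    user: str
    content_id: str
    timestamp: int  # seconds since some epoch (assumed unit)
    kind: str


def content_session(
    events: List[Event],
    user: str,
    content_id: str,
    anchor_ts: int,
    look_back: int,
    look_ahead: int,
) -> List[Event]:
    """Return the ordered events forming one session for (user, content_id)."""
    # Sort by user and content identifier, then order by timestamp.
    ordered = sorted(events, key=lambda e: (e.user, e.content_id, e.timestamp))
    start, end = anchor_ts - look_back, anchor_ts + look_ahead
    return [
        e for e in ordered
        if e.user == user and e.content_id == content_id
        and start <= e.timestamp <= end
    ]


events = [
    Event("u1", "c1", 100, "view"),
    Event("u1", "c1", 130, "scroll"),
    Event("u1", "c1", 900, "view"),
    Event("u2", "c1", 110, "view"),
    Event("u1", "c2", 120, "view"),
    Event("u1", "c1", 70, "impression"),
]
session = content_session(events, "u1", "c1", anchor_ts=100, look_back=60, look_ahead=60)
```

An analytic such as dwell time could then be derived from the first and last timestamps in `session`.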

    USING HIERARCHICAL RESERVOIR SAMPLING TO COMPUTE PERCENTILES AT SCALE
    3.
    Patent application
    Status: In force

    Publication number: US20160277490A1

    Publication date: 2016-09-22

    Application number: US14664043

    Filing date: 2015-03-20

    Applicant: Yahoo! Inc.

    CPC classification number: H04L67/1029 H04L41/044 H04L43/022 H04L43/0876

    Abstract: In one embodiment, in a hierarchy of nodes, a master node having two or more child nodes obtains from the two or more child nodes two or more sets of data samples or summaries associated therewith, the two or more sets of data samples being representative of traffic processed via two or more sets of servers corresponding to the two or more child nodes, wherein a size of each of the two or more sets of data samples is proportional to an allocation of traffic among the two or more sets of servers corresponding to the two or more child nodes. Each of the two or more sets of data samples is obtained from a different one of the two or more child nodes and represents traffic processed by a corresponding one of the two or more sets of servers. The master node combines the two or more sets of data samples or summaries associated therewith such that a combined set of data is generated. The master node ascertains a numerical value from the combined set of data.
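
The combine-and-ascertain step can be sketched as follows. The sketch assumes each child node keeps a classic reservoir sample (Algorithm R) whose size is proportional to its share of traffic, and that the master computes a nearest-rank percentile over the merged samples. All function names are hypothetical, and the abstract does not specify either the sampling algorithm or the percentile method.

```python
import math
import random
from typing import Iterable, List, Sequence


def reservoir_sample(stream: Iterable[float], k: int, rng: random.Random) -> List[float]:
    """Keep a uniform random sample of at most k items from a stream (Algorithm R)."""
    sample: List[float] = []
    for i, value in enumerate(stream):
        if i < k:
            sample.append(value)
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                sample[j] = value
    return sample


def allocate_sizes(traffic_counts: Sequence[int], total_samples: int) -> List[int]:
    """Give each child a sample size proportional to its traffic share."""
    total = sum(traffic_counts)
    return [max(1, round(total_samples * c / total)) for c in traffic_counts]


def percentile(values: Sequence[float], p: float) -> float:
    """Nearest-rank percentile of a non-empty sequence, with p in [0, 100]."""
    if not values:
        raise ValueError("no samples to summarize")
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def master_percentile(child_samples: Sequence[Sequence[float]], p: float) -> float:
    """Combine the children's samples and read one percentile from the result."""
    combined = [v for sample in child_samples for v in sample]
    return percentile(combined, p)


rng = random.Random(0)
sizes = allocate_sizes([3000, 1000], total_samples=100)
children = [
    reservoir_sample(range(3000), sizes[0], rng),
    reservoir_sample(range(1000), sizes[1], rng),
]
estimate = master_percentile(children, 95)
```

Because the sample sizes track traffic share, simple concatenation keeps each server group represented in proportion to the traffic it handled.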

    Categorizing hash tags
    4.
    Granted patent
    Status: In force

    Publication number: US09384259B2

    Publication date: 2016-07-05

    Application number: US14170952

    Filing date: 2014-02-03

    Applicant: Yahoo! Inc.

    Abstract: A content item categorizer system retrieves content items from Internet sources. If a retrieved content item includes sufficient information for traditional categorization methods, then the system assigns one or more categories to the content item using such traditional methods. The system creates a metadata model, based on information about traditionally-categorized content items, that maps at least hashtags from the content items to one or more content categories. When the system retrieves a sparse-info item that does not include sufficient information for traditional categorization, the system applies the metadata model to categorize the content item using at least hashtags in the sparse-info item. The metadata model may also include information indicating mappings between categories and coincidence of hashtags and additional content item attributes. Also, the metadata model may provide information for categorizing sparse-info items based on multiple hashtags in the sparse-info item metadata.
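
A minimal sketch of the metadata model follows. It assumes the model is a co-occurrence count from hashtag to category, built from items that were already categorized by traditional means, and that a sparse-info item is categorized by summing the votes of its hashtags. `build_model` and `categorize_sparse` are hypothetical names; the patent may weight or combine hashtags differently.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple


def build_model(
    categorized_items: Iterable[Tuple[List[str], List[str]]]
) -> Dict[str, Counter]:
    """Map each hashtag to counts of the categories it co-occurred with."""
    model: Dict[str, Counter] = defaultdict(Counter)
    for hashtags, categories in categorized_items:
        for tag in hashtags:
            for category in categories:
                model[tag.lower()][category] += 1
    return model


def categorize_sparse(
    hashtags: Iterable[str], model: Dict[str, Counter], top_n: int = 1
) -> List[str]:
    """Categorize a sparse-info item from its hashtags alone."""
    votes: Counter = Counter()
    for tag in hashtags:
        # Unknown hashtags contribute no votes.
        votes.update(model.get(tag.lower(), {}))
    ranked = sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))
    return [category for category, _ in ranked[:top_n]]


training = [
    (["#nba", "#playoffs"], ["sports"]),
    (["#nba"], ["sports"]),
    (["#oscars"], ["entertainment"]),
    (["#playoffs", "#oscars"], ["entertainment"]),
]
model = build_model(training)
predicted = categorize_sparse(["#NBA", "#playoffs"], model)
```

Using several hashtags together helps with ambiguous tags: "#playoffs" alone is split between two categories here, while "#nba" tips the vote.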

    Using hierarchical reservoir sampling to compute percentiles at scale
    5.
    Granted patent

    Publication number: US09756122B2

    Publication date: 2017-09-05

    Application number: US14664043

    Filing date: 2015-03-20

    Applicant: Yahoo! Inc.

    CPC classification number: H04L67/1029 H04L41/044 H04L43/022 H04L43/0876

    Abstract: In one embodiment, in a hierarchy of nodes, a master node having two or more child nodes obtains from the two or more child nodes two or more sets of data samples or summaries associated therewith, the two or more sets of data samples being representative of traffic processed via two or more sets of servers corresponding to the two or more child nodes, wherein a size of each of the two or more sets of data samples is proportional to an allocation of traffic among the two or more sets of servers corresponding to the two or more child nodes. Each of the two or more sets of data samples is obtained from a different one of the two or more child nodes and represents traffic processed by a corresponding one of the two or more sets of servers. The master node combines the two or more sets of data samples or summaries associated therewith such that a combined set of data is generated. The master node ascertains a numerical value from the combined set of data.

    Content ranking based on user features in content
    6.
    Granted patent

    Publication number: US09633119B2

    Publication date: 2017-04-25

    Application number: US14147789

    Filing date: 2014-01-06

    Applicant: Yahoo! Inc.

    Inventor: Mike Wexler

    CPC classification number: G06F17/30867

    Abstract: Methods, systems, and computer programs are presented for providing a personalized news stream to a user. One method includes an operation for identifying user features associated with a user. The user features include personal features and social features. The personal features are based on activities of the user and the profile of the user. The social features are based on information about social connections of the user. The method further includes operations for extracting content features from a corpus of content items, for identifying intersections between user features and content features, and for assigning weights to the content features from the corpus based on the identified intersections. A score for each content item is determined based on the content features and the respective weights of the content items. The content items are then ranked based on the scores. One or more of the ranked content items are displayed.

    CATEGORIZING HASH TAGS
    7.
    Patent application

    Publication number: US20160314189A1

    Publication date: 2016-10-27

    Application number: US15199420

    Filing date: 2016-06-30

    Applicant: Yahoo! Inc.

    Abstract: A content item categorizer system retrieves content items from Internet sources. If a retrieved content item includes sufficient information for traditional categorization methods, then the system assigns one or more categories to the content item using such traditional methods. The system creates a metadata model, based on information about traditionally-categorized content items, that maps at least hashtags from the content items to one or more content categories. When the system retrieves a sparse-info item that does not include sufficient information for traditional categorization, the system applies the metadata model to categorize the content item using at least hashtags in the sparse-info item. The metadata model may also include information indicating mappings between categories and coincidence of hashtags and additional content item attributes. Also, the metadata model may provide information for categorizing sparse-info items based on multiple hashtags in the sparse-info item metadata.

    CATEGORIZING HASH TAGS
    8.
    Patent application
    Status: In force

    Publication number: US20150220615A1

    Publication date: 2015-08-06

    Application number: US14170952

    Filing date: 2014-02-03

    Applicant: Yahoo! Inc.

    Abstract: A content item categorizer system retrieves content items from Internet sources. If a retrieved content item includes sufficient information for traditional categorization methods, then the system assigns one or more categories to the content item using such traditional methods. The system creates a metadata model, based on information about traditionally-categorized content items, that maps at least hashtags from the content items to one or more content categories. When the system retrieves a sparse-info item that does not include sufficient information for traditional categorization, the system applies the metadata model to categorize the content item using at least hashtags in the sparse-info item. The metadata model may also include information indicating mappings between categories and coincidence of hashtags and additional content item attributes. Also, the metadata model may provide information for categorizing sparse-info items based on multiple hashtags in the sparse-info item metadata.

    DETERMINATION OF GENERAL AND TOPICAL NEWS AND GEOGRAPHICAL SCOPE OF NEWS CONTENT
    9.
    Patent application
    Status: In force

    Publication number: US20150026255A1

    Publication date: 2015-01-22

    Application number: US13944409

    Filing date: 2013-07-17

    Applicant: YAHOO! INC.

    Inventor: Mike Wexler

    Abstract: Methods for categorizing news are presented. One method groups articles into clusters that share a common topic. A first category is identified for each article that indicates if the article is news or not. Further, the method includes an operation for determining use data for each article that has information about people that have accessed or referenced the article. Additionally, the method includes an operation for combining the use data and the first category for all the articles in each cluster to determine the geographical scope of interest for the cluster. The use data and the first category are combined for all the articles in each cluster to determine a second category for each article that indicates if the article is general news, topical news, or not news. The articles are presented to the user based on the geographical scope of interest, the second category, and the attributes of the user.
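
The abstract does not state how use data and the first category are combined, so the sketch below is only an illustration under invented rules: a cluster counts as news when at least half of its articles are flagged as news, its geographical scope is the dominant reader region when that region exceeds a share threshold, and it is "general news" when total readership passes a threshold. The thresholds, the `Article` fields, and `categorize_cluster` are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Article:
    article_id: str
    is_news: bool              # the "first category"
    reader_regions: List[str]  # use data: region of each reader who accessed it


def categorize_cluster(
    cluster: List[Article],
    general_min_readers: int = 100,  # assumed threshold
    local_share: float = 0.6,        # assumed threshold
) -> Tuple[str, str]:
    """Return (geographical scope, second category) for one topic cluster."""
    if not cluster:
        raise ValueError("cluster must contain at least one article")
    regions = Counter(r for a in cluster for r in a.reader_regions)
    total = sum(regions.values())
    if total:
        top_region, top_count = regions.most_common(1)[0]
        scope = top_region if top_count / total >= local_share else "global"
    else:
        scope = "unknown"
    news_share = sum(a.is_news for a in cluster) / len(cluster)
    if news_share < 0.5:
        second = "not news"
    elif total >= general_min_readers:
        second = "general news"
    else:
        second = "topical news"
    return scope, second


cluster = [
    Article("a1", True, ["CA"] * 8 + ["NY"] * 2),
    Article("a2", True, ["CA"] * 5),
]
result = categorize_cluster(cluster)
```

With 13 of 15 reads coming from "CA" and low total readership, this cluster is scoped to "CA" and labeled topical news.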
