Methods, systems, and articles of manufacture for soft hierarchical clustering of co-occurring objects
    21.
    Granted invention patent
    Status: in force

    Publication number: US07644102B2

    Publication date: 2010-01-05

    Application number: US09982236

    Filing date: 2001-10-19

    IPC classification: G06F17/30

    Abstract: Methods, systems, and articles of manufacture consistent with certain principles related to the present invention enable a computing system to perform hierarchical topical clustering of text data based on statistical modeling of co-occurrences of (document, word) pairs. The computing system may be configured to receive a collection of documents, each document including a plurality of words, and perform a modified deterministic annealing Expectation-Maximization (EM) process on the collection to produce a softly assigned hierarchy of nodes. The process may involve assigning documents and document fragments to multiple nodes in the hierarchy based on words included in the documents, such that a document may be assigned to any ancestor node included in the hierarchy, thus eliminating the hard assignment of documents in the hierarchy.
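
The soft assignment that a deterministic annealing EM step produces can be illustrated with a minimal sketch. This is not the patented procedure: it assumes a flat set of nodes, each with a known word distribution, and shows only a tempered E-step in which an inverse temperature `beta` controls how soft the assignment is. The function name, the example nodes, and all probabilities are hypothetical.

```python
import math

def tempered_posteriors(doc_counts, node_word_probs, node_priors, beta):
    """Soft assignment of one document to every node (tempered E-step).

    doc_counts: dict mapping word -> count in the document
    node_word_probs: list of dicts mapping word -> p(word | node)
    node_priors: list of p(node)
    beta: inverse temperature in (0, 1]; smaller values give softer assignments
    """
    log_scores = []
    for prior, word_probs in zip(node_priors, node_word_probs):
        log_lik = sum(n * math.log(word_probs[w]) for w, n in doc_counts.items())
        # Tempering: raise the joint score to the power beta (in log space).
        log_scores.append(beta * (math.log(prior) + log_lik))
    # Normalise with the log-sum-exp trick for numerical stability.
    top = max(log_scores)
    weights = [math.exp(s - top) for s in log_scores]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical two-node example.
doc = {"gene": 2, "court": 1}
nodes = [
    {"gene": 0.7, "court": 0.3},  # assumed "biology" node
    {"gene": 0.2, "court": 0.8},  # assumed "law" node
]
priors = [0.5, 0.5]

soft = tempered_posteriors(doc, nodes, priors, beta=0.2)   # high temperature
sharp = tempered_posteriors(doc, nodes, priors, beta=1.0)  # ordinary EM posterior
```

At low `beta` the document is spread almost evenly over the nodes; as `beta` rises toward 1 the posterior sharpens toward the best-matching node, which is the annealing behaviour the abstract relies on to avoid hard assignments.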

    SYSTEM AND METHOD FOR SUPPORTING DOCUMENT NAVIGATION ON MOBILE DEVICES USING SEGMENTATION AND KEYPHRASE SUMMARIZATION
    22.
    Invention patent application
    Status: in force

    Publication number: US20090193337A1

    Publication date: 2009-07-30

    Application number: US12242757

    Filing date: 2008-09-30

    IPC classification: G06F3/048 G06F17/00

    CPC classification: G06F3/0481

    Abstract: Described is a system that characterizes segments of a document with one or more keyphrases and then uses the keyphrases to help users find interesting parts of the document. The keyphrases are displayed with information about the location of each phrase in the document and are used as pointers to move quickly from an overview to a section of potential interest.

    User Profile Classification By Web Usage Analysis
    23.
    Invention patent application
    Status: in force

    Publication number: US20070073681A1

    Publication date: 2007-03-29

    Application number: US11559355

    Filing date: 2006-11-13

    IPC classification: G06F17/30

    Abstract: Demographic information of an Internet user is predicted based on an analysis of accessed web pages. Web pages accessed by the Internet user are detected and mapped to a user path vector, which is converted to a normalized weighted user path vector. A centroid vector identifies web page access patterns of users with a shared user profile attribute. The user profile attribute is assigned to the Internet user based on a comparison of the vectors. Bias values are also assigned to a set of web pages, and a user profile attribute can be predicted for an Internet user based on the bias values of web pages accessed by the user. User attributes can also be predicted based on the results of an expectation maximization process. Demographic information can be predicted based on the combined results of a vector comparison, bias determination, or expectation maximization process.
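
The vector comparison described in this abstract can be sketched as follows. The abstract does not say how pages are weighted or which comparison is used, so this sketch assumes per-page visit weights and cosine similarity; the page paths, attribute names, and function names are hypothetical.

```python
import math

def normalize(vec):
    """Scale a {page: weight} vector to unit length."""
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {k: v / norm for k, v in vec.items()} if norm else dict(vec)

def cosine(a, b):
    """Dot product of two unit-length vectors, i.e. their cosine similarity."""
    return sum(w * b.get(page, 0.0) for page, w in a.items())

def predict_attribute(user_path, centroids):
    """Return the attribute whose centroid is most similar to the user's path vector."""
    user = normalize(user_path)
    return max(centroids, key=lambda attr: cosine(user, normalize(centroids[attr])))

# Hypothetical centroids, one per shared user profile attribute.
centroids = {
    "sports_fan": {"/scores": 5.0, "/teams": 3.0},
    "investor": {"/markets": 4.0, "/quotes": 4.0},
}
user_path = {"/scores": 2.0, "/markets": 1.0, "/teams": 1.0}

label = predict_attribute(user_path, centroids)
```

Normalizing both vectors first means that a heavy browser and a light browser with the same page mix receive the same prediction.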

    Secondary market for keyword advertising
    26.
    Invention patent application
    Status: under examination (published)

    Publication number: US20050144068A1

    Publication date: 2005-06-30

    Application number: US10742667

    Filing date: 2003-12-19

    IPC classification: G06Q30/00 G06Q40/00 G06F17/60

    Abstract: Described is a method of trading a future right to a keyword advertisement placement associated with a search results list, wherein the search results list is generated in response to a search query. The method includes creating ownership of the future right to the keyword advertisement placement in an original keyword search engine. Next, the future right to the keyword advertisement placement originally owned by the original keyword search engine is made available for purchase in a keyword advertising market. Then, the future right to the keyword advertisement placement originally owned by the original keyword search engine is traded to another participant in the keyword advertising market.

    System and method for video summarization
    27.
    Granted invention patent
    Status: in force

    Publication number: US08200063B2

    Publication date: 2012-06-12

    Application number: US11860436

    Filing date: 2007-09-24

    IPC classification: H04N9/80

    Abstract: The subject invention relates to a system and method for video summarization, and more specifically to a system for segmenting and classifying data from a video in order to create a summary video that preserves and summarizes relevant content. In one embodiment, the system first extracts appearance, motion, and audio features from a video in order to create video segments corresponding to the extracted features. The video segments are then classified as dynamic or static depending on the appearance-based and motion-based features extracted from each video segment. The classified video segments are then grouped into clusters to eliminate redundant content. Video segments from each cluster are selected as summary segments, and the summary segments are compiled to form a summary video. The parameters for any of the steps in the summarization of the video can be altered so that a user can adapt the system to any type of video, although the system is designed to summarize unstructured videos where the content is unknown. In another aspect, audio features can also be used to further summarize video with certain audio properties.
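
The cluster-then-select step can be illustrated with a small sketch. The abstract does not name a clustering algorithm or a selection rule, so this example substitutes a simple greedy threshold clustering over per-segment feature vectors and keeps the segment nearest each cluster mean. The feature values, the threshold, and the function names are hypothetical.

```python
import math

def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cluster_segments(features, threshold):
    """Greedy clustering: a segment joins the first cluster whose seed
    (its first member) lies within `threshold`, else it starts a new cluster."""
    clusters = []  # each cluster is a list of segment indices
    for i, feat in enumerate(features):
        for members in clusters:
            if distance(feat, features[members[0]]) <= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters

def pick_summary(features, clusters):
    """Keep, for each cluster, the segment closest to the cluster mean."""
    chosen = []
    for members in clusters:
        dim = len(features[members[0]])
        mean = [sum(features[i][d] for i in members) / len(members) for d in range(dim)]
        chosen.append(min(members, key=lambda i: distance(features[i], mean)))
    return sorted(chosen)  # restore temporal order for the summary video

# Hypothetical 2-D features (for example, appearance and motion scores) per segment.
features = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0), (5.0, 5.0), (5.2, 5.0), (5.3, 5.0)]
clusters = cluster_segments(features, threshold=1.0)
summary = pick_summary(features, clusters)
```

Here six segments collapse into two clusters, and one representative from each is kept, which is how redundant content drops out of the summary.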

    SYSTEM AND METHOD FOR SUPPORTING DOCUMENT NAVIGATION ON MOBILE DEVICES USING SEGMENTATION AND KEYPHRASE SUMMARIZATION
    28.
    Invention patent application
    Status: in force

    Publication number: US20090193350A1

    Publication date: 2009-07-30

    Application number: US12268343

    Filing date: 2008-11-10

    IPC classification: G06F3/048

    Abstract: Described is a system that characterizes segments of a document with one or more keyphrases and then uses the keyphrases to help users find interesting parts of the document. The keyphrases are displayed with information about the location of each phrase in the document and are used as pointers to move quickly from an overview to a section of potential interest. In another implementation, when there are many documents in a collection, an inventive multi-document view can be used to reduce the number of documents presented, helping the user find documents of interest more efficiently. In this view, a user (possibly repeatedly) filters the documents displayed based on metadata values. In one implementation, icons corresponding to documents are displayed on a display device together with metadata corresponding to the documents. When a value of the metadata is selected by the user, the display state of the icons corresponding to the documents is varied based on the selected metadata value.

    User Profile Classification By Web Usage Analysis

    Publication number: US20070073682A1

    Publication date: 2007-03-29

    Application number: US11559357

    Filing date: 2006-11-13

    IPC classification: G06F17/30

    Abstract: Demographic information of an Internet user is predicted based on an analysis of accessed web pages. Web pages accessed by the Internet user are detected and mapped to a user path vector, which is converted to a normalized weighted user path vector. A centroid vector identifies web page access patterns of users with a shared user profile attribute. The user profile attribute is assigned to the Internet user based on a comparison of the vectors. Bias values are also assigned to a set of web pages, and a user profile attribute can be predicted for an Internet user based on the bias values of web pages accessed by the user. User attributes can also be predicted based on the results of an expectation maximization process. Demographic information can be predicted based on the combined results of a vector comparison, bias determination, or expectation maximization process.

    Systems and methods for linked event detection
    30.
    Invention patent application
    Status: in force

    Publication number: US20050021490A1

    Publication date: 2005-01-27

    Application number: US10626875

    Filing date: 2003-07-25

    IPC classification: G06F17/30 G06F7/00

    CPC classification: G06F17/3069

    Abstract: Techniques for training and using linked event detection systems and transforming source-identified stopwords are provided. A training corpus of source-identified stories and a reference language are determined. Optionally, stopwords for source-identified stories are transformed based on statistical analysis of parallel verified and unverified transformations. Reference language and non-reference language terms are selectively included in source-pair term frequency-inverse story frequency models. Optionally, incremental source-identified term frequency-inverse story frequency models are determined. Selected terms are weighted and similarity metrics are determined. Associated source-pair statistics, computed in part from a training corpus, are combined with the values of each similarity metric in the set of similarity metrics to form a similarity vector. Similarity vectors and verified link label information are used to determine a predictive model. Similarity vectors for story pairs are used with the predictive model to determine whether the story pairs are linked. Sources are arranged into a source hierarchy based on source inter-relationships. Progressively more refined source-pair similarity statistics are also provided. New sources and associated source-pair similarity statistics are added by substituting related source-pair similarity statistics based on the source hierarchy and source characteristics. The source-pair similarity statistics are used to optionally normalize the similarity metrics.
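
One of the similarity metrics mentioned above can be sketched with a plain term frequency-inverse story frequency weighting and cosine similarity. This is a simplified illustration only: it ignores sources, source-pair statistics, and the trained predictive model, and the stories, tokenization, and function names are hypothetical.

```python
import math
from collections import Counter

def tfisf_vectors(stories):
    """stories: list of token lists. Returns one {term: weight} dict per story,
    weighting each term by count * log(N / number of stories containing it)."""
    n = len(stories)
    story_freq = Counter(term for tokens in stories for term in set(tokens))
    vectors = []
    for tokens in stories:
        tf = Counter(tokens)
        vectors.append({t: c * math.log(n / story_freq[t]) for t, c in tf.items()})
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical three-story corpus.
stories = [
    "quake hits coastal city".split(),
    "coastal city rebuilds after quake".split(),
    "markets rally on earnings".split(),
]
vecs = tfisf_vectors(stories)
sim_linked = cosine(vecs[0], vecs[1])    # two stories about the same event
sim_unlinked = cosine(vecs[0], vecs[2])  # unrelated stories
```

In the approach the abstract describes, a value like `sim_linked` would be one component of a similarity vector that is combined with source-pair statistics and passed to a trained model, rather than compared against a fixed threshold.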
