SUMMARIZING ONLINE FORUMS INTO QUESTION-CONTEXT-ANSWER TRIPLES
    1.
    Patent application
    Status: under examination (published)

    Publication number: US20100076978A1

    Publication date: 2010-03-25

    Application number: US12207231

    Filing date: 2008-09-09

    CPC classification number: G06F17/279 G06F16/34

    Abstract: In this paper, we propose a new approach to extracting question-context-answer triples from online discussion forums. More specifically, we propose a general framework based on Conditional Random Fields (CRFs) for context and answer detection, and also extend the basic framework to utilize contexts for answer detection and to better accommodate the features of forums.

    DISCOVERING QUESTION AND ANSWER PAIRS
    2.
    Patent application
    Status: under examination (published)

    Publication number: US20100063797A1

    Publication date: 2010-03-11

    Application number: US12207199

    Filing date: 2008-09-09

    CPC classification number: G06F16/367

    Abstract: The present invention provides a new approach to extracting question-answer pairs from online forums. The system uses a classification-based technique to discover questions in forums, taking as features sequential patterns automatically extracted from both question and non-question sentences. Once the questions are discovered, the system discovers the answers. The invention includes a graph-based method that is complementary to supervised methods for knowledge extraction and to techniques for question answering.

    Determining utility of a question
    3.
    Granted patent
    Status: in force

    Publication number: US08112269B2

    Publication date: 2012-02-07

    Application number: US12197991

    Filing date: 2008-08-25

    CPC classification number: G06F17/277 G06F17/30654

    Abstract: A question search system provides a collection of questions having words for use in evaluating the utility of the questions based on a language model. The question search system calculates n-gram probabilities for words within the questions of the collection. The n-gram probability of a word for a sequence of n−1 words indicates the probability of that word being next after that sequence in the collection of questions. The n-gram probabilities for the words of the collection represent the language model of the collection. The question search system calculates a language model utility score for each question within a collection that indicates the likelihood that a question is repeatedly asked by users. The question search system derives the language model utility score for a question from the n-gram probabilities of the words within that question.
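
A minimal sketch of the scoring idea in this abstract, assuming a bigram model (n = 2), whitespace tokenization, add-one smoothing, and length normalization. None of these choices is stated in the abstract, and the function names `train_bigram_model` and `utility_score` are hypothetical.

```python
import math
from collections import Counter


def train_bigram_model(questions):
    """Collect bigram counts over a collection of questions.

    Assumption: lowercase whitespace tokens with a "<s>" start marker.
    """
    context_counts = Counter()
    bigram_counts = Counter()
    vocab = set()
    for question in questions:
        tokens = ["<s>"] + question.lower().split()
        vocab.update(tokens)
        for prev, word in zip(tokens, tokens[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, word)] += 1
    return context_counts, bigram_counts, len(vocab)


def utility_score(question, model):
    """Average log-probability per word under the bigram model.

    Add-one smoothing and the per-word average are illustrative
    assumptions, not details taken from the abstract.
    """
    context_counts, bigram_counts, vocab_size = model
    tokens = ["<s>"] + question.lower().split()
    log_prob = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        p = (bigram_counts[(prev, word)] + 1) / (context_counts[prev] + vocab_size)
        log_prob += math.log(p)
    return log_prob / max(len(tokens) - 1, 1)
```

Under these assumptions, a question whose word sequences recur across the collection receives a higher (less negative) score than one built from unseen sequences, which is the sense in which the score reflects how likely a question is to be asked repeatedly.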

    Question and Answer Forum Techniques
    5.
    Patent application
    Status: in force

    Publication number: US20130097178A1

    Publication date: 2013-04-18

    Application number: US13274796

    Filing date: 2011-10-17

    CPC classification number: G09B7/02 G06Q50/10 G06Q50/20

    Abstract: Techniques for unsupervised management of a question and answer (QA) forum include labeling of answers for quality purposes, and identification of experts. In a QA thread, a ranking of answers may include an initial labeling of the longest answer in each thread as the best answer. Such a labeling provides an initial point of reference. Then, in an iterative manner answerers are ranked using the labeling. The ranking of answerers allows selection of experts and poor or inexpert answerers. A label update is performed using the experts (and perhaps inexpert answerers) as input. The label update may be used to train a model, which may describe quality of answers in one or more QA threads and an indication of expert and inexpert answerers. The iterative process may be ended upon convergence or upon a maximum number of iterations.
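
The iterative loop described in this abstract can be sketched as follows. This is an illustration under assumptions: answerers are ranked by the fraction of their answers currently labeled best, the label update picks the answer from the highest-ranked answerer (ties broken by length), and the model-training step is left out. The function name `label_best_answers` and the data layout are hypothetical.

```python
from collections import defaultdict


def label_best_answers(threads, max_iterations=10):
    """threads: list of non-empty threads, each a list of (answerer, text).

    Returns (labels, scores): the index of the best answer in each
    thread and a score per answerer.
    """
    # Initial labeling: the longest answer in each thread is "best".
    labels = [max(range(len(t)), key=lambda i: len(t[i][1])) for t in threads]
    scores = {}
    for _ in range(max_iterations):
        # Rank answerers using the current labels (assumed heuristic:
        # share of an answerer's answers that are labeled best).
        best = defaultdict(int)
        total = defaultdict(int)
        for thread, label in zip(threads, labels):
            for i, (answerer, _) in enumerate(thread):
                total[answerer] += 1
                best[answerer] += int(i == label)
        scores = {a: best[a] / total[a] for a in total}
        # Label update: prefer the answer from the strongest answerer.
        new_labels = [
            max(range(len(t)), key=lambda i: (scores[t[i][0]], len(t[i][1])))
            for t in threads
        ]
        if new_labels == labels:  # converged
            break
        labels = new_labels
    return labels, scores
```

In this sketch, a long answer from a weak answerer can lose its initial "best" label once another answerer in the same thread proves stronger across other threads. The loop stops when the labels no longer change or after `max_iterations` passes.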

    Clustering question search results based on topic and focus
    6.
    Granted patent
    Status: in force

    Publication number: US08024332B2

    Publication date: 2011-09-20

    Application number: US12185702

    Filing date: 2008-08-04

    CPC classification number: G06F17/30696

    Abstract: A method and system for presenting questions that are relevant to a queried question based on clusters of topics and clusters of focuses of the questions is provided. A question search system provides a collection of questions. Each question of the collection has an associated topic and focus. Upon receiving a queried question, the question search system identifies questions of the collection that may be relevant to the queried question and generates a score or ranking indicating relevance of the identified questions. The question search system clusters the identified questions into topic clusters of questions with similar topics. The question search system may also cluster the questions within each topic cluster into focus clusters of questions with similar focuses.
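
A small sketch of the two-level grouping described in this abstract, assuming each retrieved question already carries a topic label, a focus label, and a relevance score. The abstract does not say how topics and focuses are identified or how similarity is measured, so exact label equality stands in for "similar" here, and the function name `cluster_results` is hypothetical.

```python
from collections import defaultdict


def cluster_results(results):
    """results: iterable of (question, topic, focus, score) tuples.

    Returns {topic: {focus: [questions sorted by descending score]}}.
    """
    clusters = defaultdict(lambda: defaultdict(list))
    for question, topic, focus, score in results:
        clusters[topic][focus].append((score, question))
    return {
        topic: {
            focus: [q for _, q in sorted(items, key=lambda item: -item[0])]
            for focus, items in focuses.items()
        }
        for topic, focuses in clusters.items()
    }
```

For example, questions about the same city would share a topic cluster, and within it questions about hotels and questions about restaurants would fall into separate focus clusters.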

    QUESTION AND ANSWER SEARCH
    7.
    Patent application
    Status: under examination (published)

    Publication number: US20100235311A1

    Publication date: 2010-09-16

    Application number: US12403560

    Filing date: 2009-03-13

    CPC classification number: G06F16/9535

    Abstract: Exemplary methods, computer-readable media, and systems are presented for leveraging question-answering knowledge from community sites by complementing product search services with a search of questions, answers, reviews and other Internet accessible content including user-generated content. Product or service information is obtained by crawling Internet-accessible Web sites including community sites. An integrated index of such information is generated. A user is able to browse questions by product or service feature, by topic, by identified comparative questions, and by question ranking (for example, interestingness or popularity).

    CLUSTERING QUESTION SEARCH RESULTS BASED ON TOPIC AND FOCUS
    8.
    Patent application
    Status: in force

    Publication number: US20100030769A1

    Publication date: 2010-02-04

    Application number: US12185702

    Filing date: 2008-08-04

    CPC classification number: G06F17/30696

    Abstract: A method and system for presenting questions that are relevant to a queried question based on clusters of topics and clusters of focuses of the questions is provided. A question search system provides a collection of questions. Each question of the collection has an associated topic and focus. Upon receiving a queried question, the question search system identifies questions of the collection that may be relevant to the queried question and generates a score or ranking indicating relevance of the identified questions. The question search system clusters the identified questions into topic clusters of questions with similar topics. The question search system may also cluster the questions within each topic cluster into focus clusters of questions with similar focuses.

    Domain constraint path based data record extraction
    10.
    Granted patent
    Status: in force

    Publication number: US09171080B2

    Publication date: 2015-10-27

    Application number: US13356241

    Filing date: 2012-01-23

    CPC classification number: G06F17/30864 G06F17/227 G06F17/30867

    Abstract: Described herein are techniques for extracting data records containing user-generated content from documents. The documents may be processed into document trees in which sub-trees represent the data records of the document. Domain constraints may be used to locate structured portions of the document tree. For example, anchor trees may be located as being sets of sibling sub-trees with similar tag paths that contain the domain constraints. The anchor trees may then be used to determine a record boundary (e.g., the start offset and length) of the data records. Finally, the data records may be extracted based on the anchor trees and the record boundaries.
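
A rough sketch of the anchor-tree step from this abstract, assuming a well-formed XHTML document, a post date in `YYYY-MM-DD` form as the domain constraint, and exact tag-path equality as the notion of "similar". These are all illustrative assumptions. Record-boundary detection is not covered, and the function names are hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical domain constraint: every record contains a post date.
DATE = re.compile(r"\d{4}-\d{2}-\d{2}")


def constraint_paths(node, prefix=""):
    """Tag paths, starting at node, to elements whose text matches DATE."""
    path = f"{prefix}/{node.tag}"
    paths = set()
    if node.text and DATE.search(node.text):
        paths.add(path)
    for child in node:
        paths |= constraint_paths(child, path)
    return paths


def find_anchor_trees(root):
    """Return (tag_path, siblings) groups: sibling sub-trees that reach
    the domain constraint through the same tag path."""
    groups = []
    for parent in root.iter():
        by_path = defaultdict(list)
        for child in parent:
            for path in constraint_paths(child):
                by_path[path].append(child)
        for path, siblings in by_path.items():
            if len(siblings) >= 2:  # a lone match is not a repeated record
                groups.append((path, siblings))
    return groups
```

Given a page in which several sibling `div` elements each contain a dated `span`, `find_anchor_trees(ET.fromstring(page))` returns those siblings as one group, while a sibling without a date (such as an advertisement block) is left out.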
