-
Publication No.: US20060212265A1
Publication date: 2006-09-21
Application No.: US11083204
Filing date: 2005-03-17
Applicant: Einat Amitay, Adam Darlow, Uri Weiss
Inventor: Einat Amitay, Adam Darlow, Uri Weiss
IPC classification: G21C17/00
CPC classification: G06F16/951
Abstract: A method and system for assessing the quality of one or more search engines are provided. The method and system monitor reformulation sessions by users (201) of a search engine (308, 402, 403) by retrieving data from a query log (307, 407, 408), wherein a reformulation session is a series of at least two queries to a search engine (308) issued by a user (201) to satisfy a single information need. The method and system then determine a reformulation session parameter for the search engine (308, 402, 403) and analyse the reformulation session parameter. The reformulation session parameter may be a rate of query reformulations in a reformulation session or a reformulation session duration. Analysing the reformulation session parameter for a single search engine may determine if the parameter changes with time or may determine the parameter with different settings in a single search engine. Analysing the reformulation session parameter for two or more search engines includes comparing the parameters of the two or more search engines to measure the search quality. The analysis can be used to control the operation of one or more search engines.
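The reformulation session parameters named in this abstract (number of reformulations per session, session duration) can be illustrated with a minimal sketch. The log layout, the engine names, and the 5-minute idle gap used to split sessions are all assumptions made for the example; the abstract does not say how a "single information need" is detected.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical query-log rows: (engine, user, timestamp in seconds, query).
LOG = [
    ("engine_a", "u1", 0, "jaguar"),
    ("engine_a", "u1", 20, "jaguar car"),
    ("engine_a", "u1", 45, "jaguar car price"),
    ("engine_a", "u2", 10, "python tutorial"),
    ("engine_b", "u3", 0, "jaguar"),
    ("engine_b", "u3", 30, "jaguar car"),
    ("engine_b", "u4", 5, "weather"),
]

MAX_GAP = 300  # assumed: queries more than 5 minutes apart start a new session


def sessions_by_engine(log, max_gap=MAX_GAP):
    """Group each user's queries into sessions split on long idle gaps."""
    per_user = defaultdict(list)
    for engine, user, ts, query in sorted(log, key=lambda r: (r[0], r[1], r[2])):
        per_user[(engine, user)].append((ts, query))
    result = defaultdict(list)
    for (engine, _user), events in per_user.items():
        current = [events[0]]
        for event in events[1:]:
            if event[0] - current[-1][0] > max_gap:
                result[engine].append(current)
                current = [event]
            else:
                current.append(event)
        result[engine].append(current)
    return result


def reformulation_parameters(sessions):
    """Return (mean reformulations, mean duration) over reformulation sessions."""
    reform = [s for s in sessions if len(s) >= 2]  # at least two queries
    if not reform:
        return 0.0, 0.0
    rates = [len(s) - 1 for s in reform]
    durations = [s[-1][0] - s[0][0] for s in reform]
    return mean(rates), mean(durations)


# Compare the two engines on both parameters.
params = {e: reformulation_parameters(s) for e, s in sessions_by_engine(LOG).items()}
```

In this toy log, `engine_a` needs more reformulations and longer sessions than `engine_b`, which under the abstract's reasoning would suggest lower search quality for `engine_a`.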
-
Publication No.: US08024324B2
Publication date: 2011-09-20
Application No.: US12164139
Filing date: 2008-06-30
Applicant: Einat Amitay, David Carmel, Nadav Golbandi, Nadav Y Har'el, Shila Ofek-Koifman, Sivan Yogev
Inventor: Einat Amitay, David Carmel, Nadav Golbandi, Nadav Y Har'el, Shila Ofek-Koifman, Sivan Yogev
IPC classification: G06F17/30
CPC classification: G06F17/30675
Abstract: A method for information retrieval with unified search between heterogeneous objects includes indexing a first object as a document in a search index; referencing a second object related to the first object in a facet of the document; and storing a relationship strength between the first and second objects in the facet of the document in the search index. Multiple heterogeneous objects can be related to the first object and referenced in multiple facets of the document, each with its relationship strength to the first object. Scoring an indirect object by indirect relation to a query object can be carried out by aggregating the relationship strengths between the indirect object and the retrieved objects multiplied by the retrieved objects' direct scores of relationship strength to the query object.
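The indirect scoring described in this abstract can be sketched as follows. The index layout, the object names, and the choice of a plain sum as the aggregation are assumptions for illustration; the abstract only says that strengths are aggregated after being multiplied by the retrieved objects' direct scores.

```python
from collections import defaultdict

# Hypothetical index: each document (a first object) lists related objects per
# facet, with a relationship strength stored alongside each reference.
INDEX = {
    "doc1": {"author": {"alice": 1.0}, "tag": {"search": 0.5}},
    "doc2": {"author": {"bob": 1.0}, "tag": {"search": 0.8, "index": 0.4}},
}


def score_indirect(direct_scores, index, facet):
    """Aggregate strength(object, doc) * direct_score(doc) over retrieved docs."""
    totals = defaultdict(float)
    for doc_id, direct in direct_scores.items():
        for obj, strength in index[doc_id].get(facet, {}).items():
            totals[obj] += strength * direct  # sum is an assumed aggregation
    return dict(totals)


# Assumed direct scores of the retrieved documents for some query object.
retrieved = {"doc1": 2.0, "doc2": 1.0}
tag_scores = score_indirect(retrieved, INDEX, "tag")
```

Here the tag `search` is scored as 0.5 × 2.0 + 0.8 × 1.0, because it is referenced by both retrieved documents, while `index` is scored only through `doc2`.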
-
Publication No.: US20060161537A1
Publication date: 2006-07-20
Application No.: US11038370
Filing date: 2005-01-19
Applicant: Einat Amitay, Nadav Har'el
Inventor: Einat Amitay, Nadav Har'el
IPC classification: G06F17/30
CPC classification: G06F17/277
Abstract: A method includes finding content-rich text in a document by identifying areas of narrative in the document. An apparatus includes a detector and a content-rich text indicator. The detector detects linguistic parameters which characterize narrative text in an input document and the content-rich text indicator provides the locations of narrative text in the input document.
-
Publication No.: US20060004752A1
Publication date: 2006-01-05
Application No.: US11165527
Filing date: 2005-06-23
Applicant: Nadav Harel, Einat Amitay, Ron Sivan
Inventor: Nadav Harel, Einat Amitay, Ron Sivan
IPC classification: G06F7/00
CPC classification: G06F16/9537, G06F16/313
Abstract: A method and system for determining the focus of a document are provided. Candidate topics in the form of topic nodes in a hierarchy of topics are input into a focus determining algorithm. For each candidate topic node, a score is allocated to the topic of each level of the hierarchy of the topic node, the scores for each topic are summed and one or more topics are determined to be the focus of the document based on the scores. The scores allocated to the topic of each parent level of the hierarchy of the topic node are progressively lower for the topic of each parent level of the hierarchy. The candidate topics may be provided by identifying occurrences of references to a topic in a document, providing a plurality of possible topics in the form of topic nodes in a hierarchy of topics, and, for each identified occurrence of a reference to a topic, determining the appropriate topic node and adding the topic node to the candidate topics.
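The focus-determining step in this abstract can be sketched roughly as below. The halving factor per parent level, the tuple representation of a topic node's path, and the geographic example data are assumptions; the abstract only requires that scores become progressively lower at each parent level and that scores per topic are summed.

```python
from collections import defaultdict

DECAY = 0.5  # assumed factor by which the score shrinks at each parent level


def determine_focus(candidates, decay=DECAY, top_n=1):
    """Score every level of each candidate's path and return the top topics.

    Each candidate is a path from the root of the hierarchy down to the
    topic node, e.g. ("Europe", "France", "Paris").
    """
    scores = defaultdict(float)
    for path in candidates:
        weight = 1.0
        # Walk from the node itself up to the root, lowering the score each step.
        for depth in range(len(path), 0, -1):
            scores[path[:depth]] += weight
            weight *= decay
    # Highest summed score first; ties broken by path for determinism.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


# Hypothetical topic references identified in one document.
candidates = [
    ("Europe", "France", "Paris"),
    ("Europe", "France", "Paris"),
    ("Europe", "France", "Lyon"),
    ("Europe", "Germany", "Berlin"),
]
focus = determine_focus(candidates, top_n=2)
```

With these inputs, Paris scores 2.0 from its two direct references and France scores 1.5 from three half-weight contributions, so a parent topic can outrank child topics that are each mentioned only once.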
-
Publication No.: US07752208B2
Publication date: 2010-07-06
Application No.: US11733808
Filing date: 2007-04-11
Applicant: Einat Amitay, Sivan Yogev, Elad Yom-Tov
Inventor: Einat Amitay, Sivan Yogev, Elad Yom-Tov
IPC classification: G06F7/00
CPC classification: G06F17/30705, G06F17/30675, Y10S707/99942
Abstract: A method and system are provided for detection of authors across different types of information sources such as across documents on the Web. The method includes obtaining a compression signature for a document, and determining the similarity between compression signatures of two or more documents. If the similarity is greater than a threshold measure, the two or more documents are considered to be by the same author. Scored pairs of documents are clustered to provide a group of documents by the same author. The group of documents by the same author can be used for user profiling, noise reduction, contribution sizing, detecting fraudulent contributions, obtaining other search results by the same author, or matching a document with undisclosed authorship to a document of known author.
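The abstract does not define the compression signature, so the sketch below substitutes a well-known stand-in, the normalized compression distance (NCD) computed with zlib, to show the general shape of the pipeline: pairwise similarity, a threshold, then clustering. The threshold of 0.5, the union-find clustering, and the sample documents are assumptions, not the patented method.

```python
import zlib
from itertools import combinations


def csize(data: bytes) -> int:
    """Length of the zlib-compressed data, used here as a crude signature."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: near 0 for similar, near 1 for unrelated."""
    cx, cy, cxy = csize(x), csize(y), csize(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def same_author_groups(docs, threshold=0.5):
    """Link documents whose similarity exceeds the threshold, then cluster."""
    parent = {name: name for name in docs}

    def find(n):
        # Union-find root lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in combinations(docs, 2):
        similarity = 1.0 - ncd(docs[a], docs[b])
        if similarity > threshold:  # assumed threshold measure
            parent[find(a)] = find(b)

    groups = {}
    for name in docs:
        groups.setdefault(find(name), set()).add(name)
    return list(groups.values())


# Hypothetical documents: "a" and "b" share text, "c" is unrelated byte data.
text = b"the quick brown fox jumps over the lazy dog. " * 20
docs = {"a": text, "b": text, "c": bytes(range(256)) * 4}
groups = same_author_groups(docs)
```

Real authorship detection would need far longer texts and a carefully tuned threshold; identical inputs are used here only so that the grouping is easy to follow.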
-
Publication No.: US20090327271A1
Publication date: 2009-12-31
Application No.: US12164139
Filing date: 2008-06-30
Applicant: Einat Amitay, David Carmel, Nadav Golbandi, Nadav Y. Har'el, Shila Ofek-Koifman, Sivan Yogev
Inventor: Einat Amitay, David Carmel, Nadav Golbandi, Nadav Y. Har'el, Shila Ofek-Koifman, Sivan Yogev
CPC classification: G06F17/30675
Abstract: Information retrieval with unified search between heterogeneous objects is described. The method includes: indexing a first object as a document in a search index; referencing a second object related to the first object in a facet of the document; and storing a relationship strength between the first and second objects in the facet of the document in the search index. Multiple heterogeneous objects can be related to the first object and referenced in multiple facets of the document, each with its relationship strength to the first object. Scoring an indirect object by indirect relation to a query object can be carried out by aggregating the relationship strengths between the indirect object and the retrieved objects multiplied by the retrieved objects' direct scores of relationship strength to the query object.
-
Publication No.: US20070265999A1
Publication date: 2007-11-15
Application No.: US11383265
Filing date: 2006-05-15
Applicant: Einat Amitay, Naama Kraus, Ronny Lempel, Yael Petruschka, Aya Soffer
Inventor: Einat Amitay, Naama Kraus, Ronny Lempel, Yael Petruschka, Aya Soffer
IPC classification: G06F17/30
CPC classification: G06F11/3495, G06F11/3438, G06F16/951, G06F16/958
Abstract: A system for monitoring search performance and user interaction is provided in the form of a utility (300) including a plurality of monitoring components (302), each for dynamic monitoring of an aspect of searching a collection of documents. An analyzer module (303) analyzes the dynamic monitoring and identifies problems or difficulties in the search performance or user interactions. An output (301), which may be in the form of a display interface, provides information regarding the search performance and user interaction including one or more of: reasoning, improvement suggestions, reports, and problem alerts. The analyzer module (303) compares the dynamic monitoring to benchmark search engine conduct and document collection state.
-
Publication No.: US07260571B2
Publication date: 2007-08-21
Application No.: US10440883
Filing date: 2003-05-19
Applicant: Einat Amitay, Rani Nelken, Wayne Niblack, David C Smith, Aya Soffer
Inventor: Einat Amitay, Rani Nelken, Wayne Niblack, David C Smith, Aya Soffer
IPC classification: G06F17/30
CPC classification: G06F17/30672, Y10S707/99933, Y10S707/99935, Y10S707/99945
Abstract: A method for extracting information from a corpus of data includes specifying a topic and a query term associated with the topic, and defining adjunct terms which may occur in the corpus in a context of the query term, the adjunct terms comprising one or more off-topic terms. Occurrences of the query term are found in the corpus, the occurrences including at least one occurrence of the query term together with at least one of the off-topic terms in the context of the query term. The at least one occurrence of the query term is classified as non-relevant to the topic responsively to the occurrence of the at least one of the off-topic terms in the context of the query term.
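The classification rule in this abstract can be sketched as a context-window check. The window of three tokens on each side, the simple regex tokenizer, the single-token query term, and the example topic are assumptions; the abstract does not specify how the "context of the query term" is delimited.

```python
import re

WINDOW = 3  # assumed context size: tokens inspected on each side of the match


def classify_occurrences(text, query_term, off_topic_terms, window=WINDOW):
    """Return (token position, is_relevant) for each query-term occurrence."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    off_topic = {t.lower() for t in off_topic_terms}
    results = []
    for i, token in enumerate(tokens):
        if token != query_term.lower():
            continue
        context = tokens[max(0, i - window): i] + tokens[i + 1: i + 1 + window]
        # Any off-topic term in the context marks the occurrence non-relevant.
        relevant = not any(t in off_topic for t in context)
        results.append((i, relevant))
    return results


# Hypothetical topic: Jaguar cars. The off-topic terms point to the animal.
text = "The new jaguar sedan is fast. A jaguar hunts prey in the jungle."
hits = classify_occurrences(text, "jaguar", ["prey", "jungle", "cat"])
```

The first occurrence has no off-topic neighbour and stays relevant; the second sits next to "prey" and is classified as non-relevant to the topic.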
-
Publication No.: US20050067492A1
Publication date: 2005-03-31
Application No.: US10675155
Filing date: 2003-09-30
Applicant: Einat Amitay, Aya Soffer
Inventor: Einat Amitay, Aya Soffer
CPC classification: G06Q30/02
Abstract: A dynamic index may list physical items in the changing vicinity of a user or a generator of the index. The vicinity may be within the same space as the user or the generator, such as a store, a library, a shelf, an aisle, within a given radius, a street, a city, a campus, a building, an area and a park. The index may store information about the physical items near the user or generator, such as content found on tags associated with the physical items. The content might be a description of the physical items and their locations. The present invention also includes a system and method to generate such a dynamic index.
-
Publication No.: US07792827B2
Publication date: 2010-09-07
Application No.: US10335357
Filing date: 2002-12-31
Applicant: Einat Amitay, David Carmel, Michael Herscovici, Ronny Lempel, Aya Soffer, Uri Weiss
Inventor: Einat Amitay, David Carmel, Michael Herscovici, Ronny Lempel, Aya Soffer, Uri Weiss
IPC classification: G06F17/30
CPC classification: G06F17/30893
Abstract: A method for gathering and recording temporal information for a linked entity, the method including identifying a link related activity within a linked source entity, and recording a time stamp in association with the link related activity.