-
Publication No.: US20130297634A1
Publication date: 2013-11-07
Application No.: US13465848
Filing date: 2012-05-07
Applicant: Mohammad Shami, David Herman, Sherif Botros
Inventor: Mohammad Shami, David Herman, Sherif Botros
IPC classification: G06F17/30
CPC classification: G06F17/273, G06F17/278
Abstract: Data is received that comprises an entity name. Thereafter, it is determined (i) whether there are any punctuation variations for the entity name, (ii) whether there is at least one character to drop from the entity name, and (iii) whether there are alternative equivalents of at least a portion of the entity name. After such determinations have been made, a plurality of variants for the entity name is generated based on a combination of each determined punctuation variation, each determined character to drop, and each determined alternative equivalent. Related apparatus, systems, techniques and articles are also described.
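The variant generation described in this abstract can be illustrated with a minimal sketch. The lookup tables `EQUIVALENTS` and `DROPPABLE`, the function names, and the choice to treat "characters to drop" as optional whole words are all assumptions made for illustration; the abstract does not specify the actual rules.

```python
from itertools import product

# Hypothetical rule tables; the abstract does not list specific rules.
EQUIVALENTS = {"corporation": ["corp"], "and": ["&"]}
DROPPABLE = {"the"}


def token_options(token):
    """Return the alternative forms for one token of an entity name."""
    options = {token}
    # (i) Punctuation variation: also offer the token without punctuation.
    stripped = "".join(ch for ch in token if ch.isalnum() or ch == "&")
    if stripped:
        options.add(stripped)
    # (iii) Alternative equivalents of this portion of the name.
    for alt in EQUIVALENTS.get(stripped.lower(), []):
        options.add(alt)
    # (ii) Characters to drop: here a droppable word becomes optional.
    if stripped.lower() in DROPPABLE:
        options.add("")
    return options


def generate_variants(name):
    """Combine every per-token option into a set of full-name variants."""
    per_token = [token_options(t) for t in name.split()]
    variants = set()
    for combo in product(*per_token):
        variants.add(" ".join(part for part in combo if part))
    variants.discard("")
    return variants


if __name__ == "__main__":
    for variant in sorted(generate_variants("The Acme Corporation")):
        print(variant)
```

Taking the Cartesian product of the per-token options is one simple way to realize "a combination of each determined" variation; a production system would likely need to cap the number of variants for long names.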
-
Publication No.: US09569413B2
Publication date: 2017-02-14
Application No.: US13465833
Filing date: 2012-05-07
Applicant: Mohammad Shami, David Herman, Sherif Botros
Inventor: Mohammad Shami, David Herman, Sherif Botros
CPC classification: G06F17/2241, G06F17/30719
Abstract: A document is received that has a plurality of lines with text. This document includes text associated with at least one topic of interest and text not associated with the at least one topic of interest. Thereafter, for each line in the document, a length of the line and a number of off-topic indicators are determined, with the off-topic indicators characterizing portions of the document as likely not being associated with the at least one topic of interest. Thereafter, a density for each line can be determined based on the determined line length and the determined number of off-topic indicators. The determined densities for each line are used to identify portions of the document likely associated with the at least one topic of interest so that data characterizing the identified portions of the document can be provided. Related apparatus, systems, techniques and articles are also described.
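The per-line density idea in this abstract can be sketched as follows. The abstract only says the density is based on line length and indicator count; the specific formula (indicator count divided by line length), the threshold value, the indicator list, and the function names are assumptions for illustration.

```python
def line_density(line, indicators):
    """Off-topic indicator count divided by line length (assumed formula)."""
    length = len(line)
    if length == 0:
        return 0.0
    lowered = line.lower()
    count = sum(lowered.count(indicator) for indicator in indicators)
    return count / length


def on_topic_lines(document, indicators, threshold=0.02):
    """Keep non-empty lines whose indicator density is at or below threshold."""
    return [
        line
        for line in document.splitlines()
        if line.strip() and line_density(line, indicators) <= threshold
    ]


if __name__ == "__main__":
    # Hypothetical indicators typical of navigation residue on a web page.
    indicators = ["|", "login", "share", "subscribe"]
    page = (
        "Acme Corp announced a merger with Beta Inc on Monday.\n"
        "Home | Login | Share | Subscribe\n"
    )
    print(on_topic_lines(page, indicators))
```

Normalizing by line length means a long article sentence that happens to contain one indicator is penalized far less than a short navigation bar made up mostly of indicators.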
-
Publication No.: US20130297999A1
Publication date: 2013-11-07
Application No.: US13465833
Filing date: 2012-05-07
Applicant: Mohammad Shami, David Herman, Sherif Botros
Inventor: Mohammad Shami, David Herman, Sherif Botros
IPC classification: G06F17/20
CPC classification: G06F17/2241, G06F17/30719
Abstract: A document is received that has a plurality of lines with text. This document includes text associated with at least one topic of interest and text not associated with the at least one topic of interest. Thereafter, for each line in the document, a length of the line and a number of off-topic indicators are determined, with the off-topic indicators characterizing portions of the document as likely not being associated with the at least one topic of interest. Thereafter, a density for each line can be determined based on the determined line length and the determined number of off-topic indicators. The determined densities for each line are used to identify portions of the document likely associated with the at least one topic of interest so that data characterizing the identified portions of the document can be provided. Related apparatus, systems, techniques and articles are also described.
-
Publication No.: US20130297361A1
Publication date: 2013-11-07
Application No.: US13465869
Filing date: 2012-05-07
Applicant: Mohammad Shami, Sherif Botros, David Herman
Inventor: Mohammad Shami, Sherif Botros, David Herman
IPC classification: G06Q10/06
CPC classification: G06Q10/0631
Abstract: A company is associated, in an enterprise resource planning system, with a plurality of business entities that each have at least one structured record used by the enterprise resource planning system to characterize the business entity. Thereafter, documents are obtained from a plurality of information sources that characterize events associated with each business entity. It is then determined, using pre-defined business rules, which of the events are pertinent to the company so that enhancement records can be generated for the events determined to be pertinent to the company. These enhancement records characterize the corresponding event and are linked to the structured record for the corresponding business entity. Related apparatus, systems, techniques and articles are also described.
-
Publication No.: US09171057B2
Publication date: 2015-10-27
Application No.: US13945720
Filing date: 2013-07-18
Applicant: Sherif Botros
Inventor: Sherif Botros
CPC classification: G06F17/30598, G06Q10/00
Abstract: Techniques for data classification include matching one or more attributes of a commodity with one or more terms of a plurality of terms in a word matrix; generating, based on the matching, a vector for the commodity; and identifying, based on the vector, one or more classification regions that each define a classification of the commodity.
-
Publication No.: US06934702B2
Publication date: 2005-08-23
Application No.: US10106604
Filing date: 2002-03-26
IPC classification: G06F17/30
CPC classification: G06F17/30867, Y10S707/959, Y10S707/99933
Abstract: A system and method for distributing search requests in a network. The system and method may also route search responses. Network nodes operating as consumer or requesting nodes generate the search requests. Nodes operating as hubs are configured to route the search requests in the network. Individual nodes operating as provider nodes receive the search request and in response may generate search results according to their own procedures and return them. Communication between nodes in the network may use a common query protocol. Hub nodes may resolve the search requests to a subset of the provider nodes in the network, for example by matching search requests with registration information from nodes. Search results may be customized at various stages in the network.
-
Publication No.: US20130185286A1
Publication date: 2013-07-18
Application No.: US13668847
Filing date: 2012-11-05
Applicant: Boris Galitsky, Sherif Botros
Inventor: Boris Galitsky, Sherif Botros
IPC classification: G06F17/30
CPC classification: G06F17/30424, G06F17/30637, G06F17/30666
Abstract: To retrieve a sequence of associated events in log data, a request expression is parsed to retrieve the types of dependencies between the events being searched and the constraints (e.g., keywords) that characterize each event. Based on the parsing results, query components can be formed, expressing the constraints for individual events and interrelations (e.g., time spans) between events. A resultant span query comprising the query components can then be run against an index of events, which encodes a mutual location of associated events in storage.
-
Publication No.: US07925678B2
Publication date: 2011-04-12
Application No.: US11623010
Filing date: 2007-01-12
Applicant: Sherif Botros, Jian L. Zhen, Minjun Liu, Boris Galitsky
Inventor: Sherif Botros, Jian L. Zhen, Minjun Liu, Boris Galitsky
IPC classification: G06F17/30
CPC classification: G06F17/30991, Y10S707/99943
Abstract: Event data (e.g., log messages) are represented as sets of attribute/value pairs. An index maps each attribute/value pair or attribute/value tuple to a pointer that points to event data which contains the attribute/value pair or attribute/value tuple. An attribute co-occurrence map or matrix can be generated that includes attribute names that co-occur. Queries and custom reports can be generated by projecting event data into one or more attributes or attribute/value pairs, and then determining statistics on other attributes using a combination of the inverted index, the attribute co-occurrence map or matrix, operations on sets and/or math and statistical functions.
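The index structure described in this abstract can be illustrated with a small in-memory sketch. Using list positions as the "pointers" to event data, Python sets for the posting lists, and the function names `build_index` and `count_by` are assumptions for illustration, not details taken from the patent.

```python
from collections import defaultdict
from itertools import combinations


def build_index(events):
    """Build an inverted index and an attribute co-occurrence map.

    events: list of dicts, each a set of attribute/value pairs.
    The list position of an event stands in for a pointer to it.
    """
    index = defaultdict(set)         # (attribute, value) -> event ids
    cooccurrence = defaultdict(set)  # attribute -> attributes seen with it
    for event_id, event in enumerate(events):
        for pair in event.items():
            index[pair].add(event_id)
        for a, b in combinations(sorted(event), 2):
            cooccurrence[a].add(b)
            cooccurrence[b].add(a)
    return index, cooccurrence


def count_by(events, index, where, attribute):
    """Project events onto one attribute/value pair, then count the
    values of another attribute among the matching events."""
    counts = defaultdict(int)
    for event_id in index.get(where, set()):
        value = events[event_id].get(attribute)
        if value is not None:
            counts[value] += 1
    return dict(counts)


if __name__ == "__main__":
    log = [
        {"user": "alice", "action": "login", "status": "ok"},
        {"user": "bob", "action": "login", "status": "fail"},
        {"user": "alice", "action": "logout"},
    ]
    idx, co = build_index(log)
    print(count_by(log, idx, ("action", "login"), "status"))
    print(sorted(co["status"]))
```

The co-occurrence map lets a report builder offer only those attributes that actually appear alongside the projected one, so it never computes statistics over attribute combinations that do not exist in the data.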
-
Publication No.: US20130198187A1
Publication date: 2013-08-01
Application No.: US13362598
Filing date: 2012-01-31
Applicant: Sherif Botros
Inventor: Sherif Botros
IPC classification: G06F17/30
CPC classification: G06F17/30598, G06Q10/00
Abstract: Techniques for data classification include receiving, at a local computing system, a query from a remote computing system, the query comprising data associated with a commodity, the data comprising one or more attributes of the commodity; matching the one or more attributes of the commodity with one or more terms of a plurality of terms in a word matrix that includes a plurality of nodes that each include a term of the plurality of terms and a plurality of links that each connect two or more nodes and define a similarity between the two or more nodes; generating, based on the matching, a numerical vector for the business enterprise commodity; identifying one or more classification regions that each define a classification of the commodity; and preparing the classifications for display at the remote computing system.
-
Publication No.: US08498986B1
Publication date: 2013-07-30
Application No.: US13362598
Filing date: 2012-01-31
Applicant: Sherif Botros
Inventor: Sherif Botros
IPC classification: G06F17/30
CPC classification: G06F17/30598, G06Q10/00
Abstract: Techniques for data classification include receiving, at a local computing system, a query from a remote computing system, the query comprising data associated with a commodity, the data comprising one or more attributes of the commodity; matching the one or more attributes of the commodity with one or more terms of a plurality of terms in a word matrix that includes a plurality of nodes that each include a term of the plurality of terms and a plurality of links that each connect two or more nodes and define a similarity between the two or more nodes; generating, based on the matching, a numerical vector for the business enterprise commodity; identifying one or more classification regions that each define a classification of the commodity; and preparing the classifications for display at the remote computing system.