Forecasting based on a collection of data
    91.
    Patent application
    Forecasting based on a collection of data (status: active)

    Publication number: US20090024444A1

    Publication date: 2009-01-22

    Application number: US11879924

    Filing date: 2007-07-19

    Applicant: Jerry Z. Shan

    Inventor: Jerry Z. Shan

    CPC classification number: G06F17/30548 G06F17/30536 G06Q30/0202

    Abstract: To forecast data, an initial collection of data having a first length is received. In response to determining that the first length of the initial collection of data is insufficient for performing forecasting using a forecasting algorithm, an order of the initial collection of data is reversed to provide a reversed collection of data. Forecasting is applied on the reversed collection of data to estimate additional data values to combine with the initial collection of data to provide a second collection of data having a second length greater than the first length. The forecasting algorithm is applied on the second collection of data.
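
The reverse-then-forecast idea in this abstract can be sketched as follows. This is a minimal illustration, not the patented method: the names `drift_forecast` and `forecast_with_backcast`, the `min_length` parameter, and the simple average-slope forecaster are all assumptions standing in for whatever forecasting algorithm is actually used.

```python
from typing import Callable, List, Sequence

Forecaster = Callable[[Sequence[float], int], List[float]]


def drift_forecast(series: Sequence[float], steps: int) -> List[float]:
    """Hypothetical stand-in forecaster: extend the series along its average slope."""
    if len(series) < 2:
        raise ValueError("need at least two points to estimate a slope")
    slope = (series[-1] - series[0]) / (len(series) - 1)
    return [series[-1] + slope * (i + 1) for i in range(steps)]


def forecast_with_backcast(
    data: Sequence[float],
    min_length: int,
    horizon: int,
    forecaster: Forecaster = drift_forecast,
) -> List[float]:
    """Forecast `horizon` values, padding the history first if it is too short."""
    history = list(data)
    missing = min_length - len(history)
    if missing > 0:
        # Reverse the initial collection so that "the future" of the
        # reversed series corresponds to the past of the original one.
        reversed_history = history[::-1]
        # Forecast on the reversed data; the results are ordered from the
        # most recent estimated past value to the oldest.
        backcast = forecaster(reversed_history, missing)
        # Restore chronological order and prepend to get the longer collection.
        history = backcast[::-1] + history
    # Apply the forecasting algorithm to the (possibly extended) collection.
    return forecaster(history, horizon)


print(forecast_with_backcast([10, 12, 14], min_length=5, horizon=2))
```

With the three-point input above, the reversed series is extended by two estimated earlier values (6 and 8), and the forecast is then made on the five-point series.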

    Nominal population metric: clustering of nominal application attributes
    92.
    Patent application
    Nominal population metric: clustering of nominal application attributes (status: active)

    Publication number: US20080183731A1

    Publication date: 2008-07-31

    Application number: US12011053

    Filing date: 2008-01-23

    Inventor: Juan E. Gilbert

    CPC classification number: G04F3/00 G06F17/30522 G06F17/30528 G06F17/30536

    Abstract: Clustering of nominal attributes using a nominal population metric enables comparisons of entities which are not easily comparable. In some embodiments, nominal population metrics are determined using a similarity matrix and a nominal population matrix using comparisons. In some embodiments, nominal population metrics are determined using a nominal population matrix using distributions. A computing device is able to determine the nominal population metrics with the appropriate hardware and applications configured for computing the nominal population metrics.

    Method and System for High Performance Integration, Processing and Searching of Structured and Unstructured Data Using Coprocessors
    93.
    Patent application
    Method and System for High Performance Integration, Processing and Searching of Structured and Unstructured Data Using Coprocessors (status: active)

    Publication number: US20080114724A1

    Publication date: 2008-05-15

    Application number: US11938709

    Filing date: 2007-11-12

    Abstract: Disclosed herein is a method and system for integrating an enterprise's structured and unstructured data to provide users and enterprise applications with efficient and intelligent access to that data. Queries can be directed toward both an enterprise's structured and unstructured data using standardized database query formats such as SQL commands. A coprocessor can be used to hardware-accelerate data processing tasks (such as full-text searching) on unstructured data as necessary to handle a query. Furthermore, traditional relational database techniques can be used to access structured data stored by a relational database to determine which portions of the enterprise's unstructured data should be delivered to the coprocessor for hardware-accelerated data processing.

    Method and apparatus for exploiting statistics on query expressions for optimization
    94.
    Granted patent
    Method and apparatus for exploiting statistics on query expressions for optimization (status: active)

    Publication number: US07363289B2

    Publication date: 2008-04-22

    Application number: US11177598

    Filing date: 2005-07-07

    Abstract: A method for evaluating a user query on a relational database having records stored therein, a workload made up of a set of queries that have been executed on the database, and a query optimizer that generates a query execution plan for the user query. Each query plan includes a plurality of intermediate query plan components that verify a subset of records from the database meeting query criteria. The method accesses the query plan and a set of stored intermediate statistics for records verified by query components, such as histograms that summarize the cardinality of the records that verify the query component. The method forms a transformed query plan based on the selected intermediate statistics (possibly by rewriting the query plan) and estimates the cardinality of the transformed query plan to arrive at a more accurate cardinality estimate for the query. If additional intermediate statistics are necessary, a pool of intermediate statistics may be generated based on the queries in the workload by evaluating the benefit of a given statistic over the workload and adding intermediate statistics to the pool that provide relatively great benefit.

    System and method for computing analytics on structured data
    95.
    Patent application
    System and method for computing analytics on structured data (status: lapsed)

    Publication number: US20080059115A1

    Publication date: 2008-03-06

    Application number: US11515333

    Filing date: 2006-09-01

    Inventor: Leland Wilkinson

    CPC classification number: G06F17/18 G06F17/30536 G06F2216/03

    Abstract: A system, method and computer storage medium is provided for computing analytics on structured data. The method for computing analytics on structured data comprises providing at least one data source, providing a statistics object for computing statistical estimates, providing software capable of performing any of the data processing methods selected from the pass, stream and merge methods and performing at least one statistical calculation on data from the data source using the statistics object to compute statistical estimates by at least one method selected from the provided data processing methods.
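
The abstract does not define the pass, stream and merge methods, so the sketch below shows only one plausible reading: a statistics object that can summarize a data source in a single pass, be updated one value at a time from a stream, and be merged with another partial summary. The class name `RunningStats` and the helper `one_pass` are assumptions; the update and merge formulas are the standard Welford and pairwise-combination formulas for mean and variance.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class RunningStats:
    """Hypothetical statistics object holding count, mean and sum of squared deviations."""
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0

    def update(self, x: float) -> None:
        """Stream: fold in one value at a time (Welford's update)."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def merge(self, other: "RunningStats") -> "RunningStats":
        """Merge: combine two partial summaries without revisiting the data."""
        if other.n == 0:
            return RunningStats(self.n, self.mean, self.m2)
        if self.n == 0:
            return RunningStats(other.n, other.mean, other.m2)
        n = self.n + other.n
        delta = other.mean - self.mean
        mean = self.mean + delta * other.n / n
        m2 = self.m2 + other.m2 + delta * delta * self.n * other.n / n
        return RunningStats(n, mean, m2)

    @property
    def variance(self) -> float:
        """Sample variance; defined as 0.0 for fewer than two values."""
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0


def one_pass(values: Iterable[float]) -> RunningStats:
    """Pass: summarize a whole data source in a single scan."""
    stats = RunningStats()
    for value in values:
        stats.update(value)
    return stats


whole = one_pass([1, 2, 3, 4])
merged = one_pass([1, 2]).merge(one_pass([3, 4]))
print(whole.mean, whole.variance, merged.mean, merged.variance)
```

Merging the summaries of two halves gives the same mean and variance as a single pass over all the data, which is what makes this kind of object usable across partitions.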

    Statistics collection using path-identifiers for relational databases
    96.
    Patent application
    Statistics collection using path-identifiers for relational databases (status: lapsed)

    Publication number: US20070271217A1

    Publication date: 2007-11-22

    Application number: US11435017

    Filing date: 2006-05-16

    Abstract: Disclosed are a system, method, and computer readable medium for collecting statistics associated with data in a database. The method comprises determining an amount of memory needed to collect statistics for data associated with a defined data type in a relational database. The defined data type is based upon a mark-up language using a tree structure with one or more root-to-node paths therein. The amount of memory as determined is allocated for collecting the statistics for the data of the defined data type. A statistics collection is performed for the data of the defined data type in a single pass through the database and within the amount of memory which has been allocated.

    Automatic database statistics creation
    97.
    Granted patent
    Automatic database statistics creation (status: active)

    Publication number: US07289999B2

    Publication date: 2007-10-30

    Application number: US10981799

    Filing date: 2004-11-05

    Abstract: A system for automatic statistics creation comprises a query optimizer which automatically generates statistics derived from data in a database and selects an executable procedure from a plurality of procedures that operate on data in a database using the automatically generated statistics. A counter is maintained of updates made to each statistic that has been automatically generated. If the counter breaches a threshold, the automatically generated statistic is removed from the database.
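
The create-on-demand and drop-on-threshold cycle described here can be illustrated with a toy catalog. Everything in this sketch is an assumption made for illustration: the names `AutoStatsCatalog` and `choose_plan`, the use of a distinct-value count as the statistic, the 0.1 selectivity cutoff, and the reading of "breaches" as "exceeds".

```python
from typing import Any, Dict, List


class AutoStatsCatalog:
    """Hypothetical catalog of automatically generated column statistics."""

    def __init__(self, update_threshold: int) -> None:
        self.update_threshold = update_threshold
        self._stats: Dict[str, Dict[str, int]] = {}

    def get_or_create(self, column: str, rows: List[Dict[str, Any]]) -> Dict[str, int]:
        """Return the statistic for a column, generating it from the data if absent."""
        if column not in self._stats:
            values = [row[column] for row in rows]
            self._stats[column] = {"distinct": len(set(values)), "updates": 0}
        return self._stats[column]

    def record_update(self, column: str) -> None:
        """Count an update against the statistic and drop it once the threshold is exceeded."""
        stat = self._stats.get(column)
        if stat is None:
            return
        stat["updates"] += 1
        if stat["updates"] > self.update_threshold:
            del self._stats[column]

    def has(self, column: str) -> bool:
        return column in self._stats


def choose_plan(catalog: AutoStatsCatalog, column: str, rows: List[Dict[str, Any]]) -> str:
    """Pick between two made-up procedures using the auto-generated statistic."""
    stat = catalog.get_or_create(column, rows)
    selectivity = 1 / stat["distinct"] if stat["distinct"] else 1.0
    return "index_seek" if selectivity < 0.1 else "table_scan"


rows = [{"id": i, "status": "open" if i % 2 else "closed"} for i in range(20)]
catalog = AutoStatsCatalog(update_threshold=2)
print(choose_plan(catalog, "id", rows), choose_plan(catalog, "status", rows))
```

A removed statistic is simply regenerated the next time `choose_plan` needs it, so stale statistics are refreshed lazily rather than on every update.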

    Method for distributed tracking of approximate join size and related summaries
    98.
    Patent application
    Method for distributed tracking of approximate join size and related summaries (status: active)

    Publication number: US20070240061A1

    Publication date: 2007-10-11

    Application number: US11392440

    Filing date: 2006-03-29

    CPC classification number: G06F17/30536 G06F17/30516 G06F17/30545

    Abstract: A method of distributed approximate query tracking relies on tracking general-purpose randomized sketch summaries of local streams at remote sites along with concise prediction models of local site behavior in order to produce highly communication-efficient and space/time-efficient solutions. A powerful approximate query tracking framework readily incorporates several complex analysis queries, including distributed join and multi-join aggregates and approximate wavelet representations, thus giving the first known low-overhead tracking solution for such queries in the distributed-streams model.
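
To make "randomized sketch summaries" concrete, here is a minimal sketch of a generic sign-based (AMS-style) join-size estimator. It covers only the sketching part, not the prediction models or the communication protocol the abstract mentions. The class name `JoinSketch`, the `width` parameter, and the use of SHA-256 to derive the random signs are assumptions; a hash-derived sign is a convenience here and does not carry the formal independence guarantees of the textbook construction.

```python
import hashlib
from typing import Hashable, List


class JoinSketch:
    """Hypothetical fixed-size summary of a stream of join-key values."""

    def __init__(self, width: int = 64) -> None:
        self.counters: List[int] = [0] * width

    @staticmethod
    def _sign(key: Hashable, index: int) -> int:
        """Pseudo-random +1/-1 for a (counter, key) pair, identical at every site."""
        digest = hashlib.sha256(f"{index}:{key!r}".encode()).digest()
        return 1 if digest[0] & 1 else -1

    def add(self, key: Hashable, count: int = 1) -> None:
        """Fold `count` occurrences of `key` into every counter."""
        for index in range(len(self.counters)):
            self.counters[index] += count * self._sign(key, index)

    def merge(self, other: "JoinSketch") -> "JoinSketch":
        """Sketches are linear, so remote-site sketches combine by addition."""
        if len(other.counters) != len(self.counters):
            raise ValueError("sketches must have the same width")
        merged = JoinSketch(len(self.counters))
        merged.counters = [a + b for a, b in zip(self.counters, other.counters)]
        return merged


def estimate_join_size(left: JoinSketch, right: JoinSketch) -> float:
    """Average the per-counter products to estimate the equi-join size."""
    products = [a * b for a, b in zip(left.counters, right.counters)]
    return sum(products) / len(products)


left, right = JoinSketch(), JoinSketch()
left.add("x", 3)
right.add("x", 5)
print(estimate_join_size(left, right))
```

Because each site only ships its counter array, the coordinator can merge sketches and estimate the join size without seeing the underlying streams; accuracy on real data depends on `width` and on the distribution of keys.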

    METHODS AND SYSTEMS FOR SELECTING AND PRESENTING CONTENT BASED ON CONTEXT SENSITIVE USER PREFERENCES
    99.
    Patent application
    METHODS AND SYSTEMS FOR SELECTING AND PRESENTING CONTENT BASED ON CONTEXT SENSITIVE USER PREFERENCES (status: active)

    Publication number: US20070219985A1

    Publication date: 2007-09-20

    Application number: US11682599

    Filing date: 2007-03-06

    Abstract: A method of selecting and presenting content based on context-sensitive learned user preferences is provided. The method includes providing a set of content items having descriptive terms. The method includes receiving user input for identifying items and, in response thereto, presenting a subset of items. The method includes receiving user selections of said items and analyzing the descriptive terms of those items to learn the user's content preferences. The method includes determining the context in which the user performed the selections and associating those contexts with the user content preferences learned from the corresponding user selections. The method includes, in response to subsequent user input, determining a context of said subsequent input and selecting and ordering a collection of items based on comparing those items' descriptive terms with the user's learned content preferences associated with the determined context in which the user entered the subsequent input.
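
The learn-per-context and rank-per-context loop in this abstract can be sketched in a few lines. This is an illustrative simplification, not the claimed method: the class name `ContextPreferences`, the use of plain term counts as preference weights, and the example contexts and items are all assumptions.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List


class ContextPreferences:
    """Hypothetical store of learned term weights, kept separately per context."""

    def __init__(self) -> None:
        self._weights: Dict[str, Counter] = defaultdict(Counter)

    def record_selection(self, context: str, terms: Iterable[str]) -> None:
        """Learn from a selection: credit the item's descriptive terms in this context."""
        self._weights[context].update(terms)

    def rank(self, context: str, items: Dict[str, List[str]]) -> List[str]:
        """Order item names by how well their terms match preferences for the context."""
        weights = self._weights.get(context, Counter())

        def score(name: str) -> int:
            return sum(weights[term] for term in items[name])

        # Highest score first; ties are broken alphabetically for determinism.
        return sorted(items, key=lambda name: (-score(name), name))


prefs = ContextPreferences()
prefs.record_selection("morning", ["news", "finance"])
prefs.record_selection("morning", ["news", "finance"])
prefs.record_selection("evening", ["comedy"])

catalog = {
    "Market Report": ["news", "finance"],
    "Sitcom": ["comedy"],
    "Cooking": ["food"],
}
print(prefs.rank("morning", catalog))
print(prefs.rank("evening", catalog))
```

The same catalog is ordered differently depending on the context of the later input, which is the behavior the abstract describes.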
