Storage configuration in data warehouses
    Type: granted invention patent
    Status: in force

    Publication number: US09563687B1

    Publication date: 2017-02-07

    Application number: US14540648

    Filing date: 2014-11-13

    CPC classification number: G06F17/30306 G06F17/30339

    Abstract: Techniques are described for employing a graph-based analysis to determine a configuration of datasets to be stored on data storage systems in a data warehouse environment. Associations between datasets may be determined based on the parsing of join statements or other types of statements in jobs that are executed on the data storage systems. A graph may be generated that describes the associations among datasets. A greedy breadth-first traversal of the graph may be performed to determine sets of associated datasets. A utilization metric describing a weight of storing the datasets may be determined and employed to identify a data storage system on which to store a set of associated datasets, given the storage and processing capacity of the data storage system.
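
    The pipeline summarized in the abstract can be illustrated with a minimal Python sketch. It is not the patented method; every specific in it is an assumption made for illustration: the simplified `FROM a JOIN b` pattern, the group size budget `max_group_size`, the per-dataset `sizes` and `loads` inputs, and the utilization metric (the larger of the storage and processing fractions a system would reach after accepting a group). All function names are hypothetical.

```python
import re
from collections import defaultdict, deque

# Assumption: jobs are simplified SQL strings containing "FROM a JOIN b".
JOIN_RE = re.compile(r"\bFROM\s+(\w+)\s+JOIN\s+(\w+)", re.IGNORECASE)


def build_graph(jobs):
    """Count, for each pair of datasets, how many join statements link them."""
    graph = defaultdict(lambda: defaultdict(int))
    for sql in jobs:
        for a, b in JOIN_RE.findall(sql):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph


def greedy_bfs_groups(graph, sizes, max_group_size):
    """Breadth-first traversal that enqueues the strongest associations first
    and skips datasets that would push a group past the size budget."""
    assigned, groups = set(), []
    # Seed from the most heavily joined datasets; ties broken by name.
    seeds = sorted(graph, key=lambda n: (-sum(graph[n].values()), n))
    for seed in seeds:
        if seed in assigned:
            continue
        group, total = [], 0
        queue, seen = deque([seed]), {seed}
        while queue:
            node = queue.popleft()
            if group and total + sizes[node] > max_group_size:
                continue  # does not fit; it may seed a later group
            group.append(node)
            assigned.add(node)
            total += sizes[node]
            for nbr in sorted(graph[node], key=lambda n: (-graph[node][n], n)):
                if nbr not in assigned and nbr not in seen:
                    seen.add(nbr)
                    queue.append(nbr)
        groups.append(group)
    return groups


def place_groups(groups, sizes, loads, systems):
    """Assign each group to the system with the lowest resulting utilization.

    systems maps a name to its total {"storage": ..., "cpu": ...} capacity.
    The metric here is an assumed stand-in for the patent's utilization metric.
    """
    used = {name: {"storage": 0.0, "cpu": 0.0} for name in systems}
    placement = {}
    # Place the largest groups first, while the most capacity is still free.
    for group in sorted(groups, key=lambda g: -sum(sizes[d] for d in g)):
        need = {
            "storage": sum(sizes[d] for d in group),
            "cpu": sum(loads[d] for d in group),
        }

        def utilization(name):
            return max((used[name][k] + need[k]) / systems[name][k] for k in need)

        candidates = [n for n in systems if utilization(n) <= 1.0]
        if not candidates:
            raise ValueError(f"no system can hold group {group}")
        best = min(candidates, key=lambda n: (utilization(n), n))
        for k in need:
            used[best][k] += need[k]
        placement[tuple(group)] = best
    return placement
```

    Calling `build_graph` on a list of job strings, then `greedy_bfs_groups`, then `place_groups` yields a mapping from each group of associated datasets to a storage system name. Keeping frequently joined datasets in one group is what lets a join run without moving data between systems; the budget and the capacity check keep any single system from being overloaded.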
