Method and Apparatus for Identifying the Optimal Schema to Store Graph Data in a Relational Store
    Type: Patent application
    Status: Under examination (published)

    Publication No.: US20160203236A1

    Publication date: 2016-07-14

    Application No.: US15078931

    Filing date: 2016-03-23

    IPC classification: G06F17/30

    CPC classification: G06F16/9024 G06F16/211

    Abstract: A system for identifying a schema for storing graph data includes a database containing a graph dataset of data and relationships between data pairs and a list of storage methods that each are a distinct structural arrangement of the data and relationships from the graph dataset. An analyzer module collects statistics for the graph dataset, and a data classification module uses the collected statistics to calculate metrics describing the data and relationships in the graph dataset, uses the calculated metrics to group the data and relationships into a plurality of graph dataset subsets, and associates each graph dataset subset with one of the plurality of storage methods. The resulting group of storage methods associated with the plurality of graph dataset subsets includes a unique storage method for each graph dataset subset. The data and relationships in each graph dataset subset are arranged in accordance with the associated storage methods.

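The pipeline this abstract describes (collect statistics, derive metrics, group the data, associate each group with a storage method) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the patent: the graph is modeled as (subject, predicate, object) triples, the two metrics are per-predicate coverage and fan-out, and the storage method names `property_table`, `multi_valued_table`, and `triple_table` are hypothetical.

```python
from collections import defaultdict

def collect_statistics(triples):
    """Gather per-predicate counts from (subject, predicate, object) triples."""
    subjects_by_pred = defaultdict(set)
    triple_count = defaultdict(int)
    all_subjects = set()
    for s, p, _o in triples:
        subjects_by_pred[p].add(s)
        triple_count[p] += 1
        all_subjects.add(s)
    return subjects_by_pred, triple_count, len(all_subjects)

def classify(triples, coverage_threshold=0.5):
    """Group predicates into subsets, one subset per (hypothetical) storage method."""
    subjects_by_pred, triple_count, n_subjects = collect_statistics(triples)
    subsets = defaultdict(list)
    for p in sorted(triple_count):
        # coverage: share of all subjects that carry this predicate
        coverage = len(subjects_by_pred[p]) / n_subjects
        # fan-out: average number of values per subject for this predicate
        fanout = triple_count[p] / len(subjects_by_pred[p])
        if fanout > 1.0:
            method = "multi_valued_table"   # assumed name
        elif coverage >= coverage_threshold:
            method = "property_table"       # assumed name
        else:
            method = "triple_table"         # assumed name
        subsets[method].append(p)
    return dict(subsets)

triples = [
    ("a", "name", "A"), ("b", "name", "B"), ("c", "name", "C"), ("d", "name", "D"),
    ("a", "knows", "b"), ("a", "knows", "c"), ("b", "knows", "c"),
    ("d", "nickname", "Dee"),
]
assignment = classify(triples)
```

Because predicates are grouped by the method they are assigned to, each resulting subset ends up with its own distinct storage method, which mirrors the "unique storage method for each graph dataset subset" wording only loosely.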

    Method and Apparatus for Storing Sparse Graph Data as Multi-Dimensional Cluster
    Type: Patent application
    Status: Granted

    Publication No.: US20150052134A1

    Publication date: 2015-02-19

    Application No.: US13967261

    Filing date: 2013-08-14

    IPC classification: G06F17/30

    Abstract: A system for storing graph data as a multi-dimensional cluster has a database with a graph dataset containing data and relationships between data pairs and a schema list of storage methods that use a table with columns and rows associated with data or relationships. An analyzer module collects statistics of a graph dataset, and a dimension identification module identifies a plurality of dimensions that each represent a column in the table. A schema creation and loading module creates a modified storage method having a plurality of distinct table blocks and a plurality of table block indexes, one index for each table block, and arranges the data and relationships in the given graph dataset in accordance with the modified storage method to create the multi-dimensional cluster.

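As a rough illustration of clustering rows into table blocks by dimension, the sketch below groups rows that share the same dimension values into one block and keeps a lookup from dimension values to blocks. It simplifies the abstract: the abstract speaks of one index per table block, whereas this sketch keeps one lookup per dimension, as conventional multi-dimensional clustering does. The row layout, the dimension names `type` and `city`, and both function names are hypothetical.

```python
from collections import defaultdict

def build_clustered_blocks(rows, dimensions):
    """Group rows into blocks keyed by their dimension values, and index the blocks."""
    blocks = defaultdict(list)
    for row in rows:
        key = tuple(row.get(d) for d in dimensions)
        blocks[key].append(row)
    # For each dimension: value -> keys of the blocks that hold that value.
    block_index = {d: defaultdict(list) for d in dimensions}
    for key in blocks:
        for d, value in zip(dimensions, key):
            block_index[d][value].append(key)
    return dict(blocks), block_index

def lookup(blocks, block_index, dimension, value):
    """Return all rows with the given dimension value, reading whole blocks only."""
    return [row
            for key in block_index[dimension].get(value, [])
            for row in blocks[key]]

rows = [
    {"subject": "s1", "type": "Person", "city": "Paris"},
    {"subject": "s2", "type": "Person", "city": "Paris"},
    {"subject": "s3", "type": "Person", "city": "Rome"},
    {"subject": "s4", "type": "Company", "city": "Rome"},
]
blocks, block_index = build_clustered_blocks(rows, ("type", "city"))
rome_rows = lookup(blocks, block_index, "city", "Rome")
```

A query on one dimension touches only the blocks listed under that value, so rows with other dimension values are never scanned.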

    Optimizing sparse schema-less data in data stores

    Publication No.: US09715560B2

    Publication date: 2017-07-25

    Application No.: US13929129

    Filing date: 2013-06-27

    IPC classification: G06F17/30

    CPC classification: G06F17/30958 G06F17/30292

    Abstract: Various embodiments of the invention relate to optimizing storage of schema-less data. At least one of a schema-less dataset including a plurality of resources and one or more query workloads associated with the plurality of resources is received. Each resource is associated with at least a plurality of properties. At least one set of co-occurring properties from the plurality of properties is identified. A graph including a plurality of nodes is generated. Each of the nodes represents a unique property in the set of co-occurring properties. The graph further includes an edge connecting each pair of nodes representing a pair of co-occurring properties. A schema is generated based on the graph that assigns a column identifier from a table to each unique property represented by one of the nodes in the graph.
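One plausible reading of the co-occurrence graph is sketched below: properties that appear together on the same resource are connected by an edge, and columns are then assigned so that connected properties never share a column. The abstract does not name the assignment technique; greedy graph coloring is used here as an assumption, and the function names and sample data are hypothetical.

```python
from itertools import combinations

def build_cooccurrence_graph(resources):
    """Nodes are properties; an edge joins two properties found on the same resource."""
    graph = {}
    for props in resources.values():
        for p in props:
            graph.setdefault(p, set())
        for a, b in combinations(sorted(set(props)), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph

def assign_columns(graph):
    """Greedy coloring: give each property the lowest column not used by a neighbor."""
    columns = {}
    # Visit high-degree properties first; break ties by name for determinism.
    for node in sorted(graph, key=lambda n: (-len(graph[n]), n)):
        used = {columns[nb] for nb in graph[node] if nb in columns}
        col = 0
        while col in used:
            col += 1
        columns[node] = col
    return columns

resources = {
    "person1": ["name", "age"],
    "person2": ["name", "email"],
    "book1": ["title", "isbn"],
}
graph = build_cooccurrence_graph(resources)
columns = assign_columns(graph)
```

In this sample five properties fit into two columns, because properties that never co-occur (for example `name` and `isbn`) can share a column without colliding in any row.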

    Method and Apparatus for Determining the Schema of a Graph Dataset
    Type: Patent application
    Status: Granted

    Publication No.: US20150193478A1

    Publication date: 2015-07-09

    Application No.: US14151768

    Filing date: 2014-01-09

    IPC classification: G06F17/30

    Abstract: A schema for a dataset is identified by identifying a dataset comprising data and relationships between data pairs. An original schema is identified for the dataset. This original schema comprises an organizational structure. An initial fit between the dataset and the original schema is determined. The initial fit quantifies a conformity of the data in the dataset to the organizational structure of the original schema. A plurality of additional schemas are identified. Each additional schema is a distinct organizational schema. The dataset is partitioned into a plurality of subsets. Each subset comprises a modified fit quantifying a modified conformity of subset data in each subset to one of the original schema and the additional schemas. The modified fit is greater than the original fit.

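The idea of a fit that improves after partitioning can be made concrete with a small sketch. The abstract does not define how fit is computed, so the metric here is an assumption: the fraction of filled cells when all entities are stored in one table whose columns are the union of their attributes. Partitioning by identical attribute sets is likewise a hypothetical strategy chosen for simplicity.

```python
from collections import defaultdict

def fit(entities):
    """Share of filled cells if all entities sit in one table (assumed metric)."""
    if not entities:
        return 1.0
    columns = set().union(*entities.values())
    if not columns:
        return 1.0
    filled = sum(len(attrs) for attrs in entities.values())
    return filled / (len(entities) * len(columns))

def partition_by_signature(entities):
    """Split entities into subsets that share exactly the same attribute set."""
    groups = defaultdict(dict)
    for name, attrs in entities.items():
        groups[frozenset(attrs)][name] = attrs
    return list(groups.values())

entities = {
    "p1": {"name", "age"},
    "p2": {"name", "age"},
    "b1": {"title", "isbn"},
}
initial_fit = fit(entities)
subsets = partition_by_signature(entities)
subset_fits = [fit(s) for s in subsets]
```

With this sample the single-table fit is 0.5 (6 filled cells out of 12), while each of the two subsets reaches a fit of 1.0, so every subset fits its own schema better than the dataset fit the original one.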