-
Publication No.: US20220318252A1
Publication date: 2022-10-06
Application No.: US17714028
Filing date: 2022-04-05
Applicant: DataStax, Inc.
Inventors: T Jake Luciani, Sergio Bossa
IPC classification: G06F16/2455, G06F16/23, G06F16/22
Abstract: A streaming operation is performed by nodes of a cluster that implement a database. A method includes a first node determining data segments from data in a first data file stored at the first node for transfer to a second node of the cluster. The first node generates segment offset data for each data segment, defining an offset position of the data segment relative to positions in the first data file. The first node transfers sets of segment data, each set including a data segment and the segment offset data for that data segment, to the second node. The second node writes the data segments to a second data file stored at the second node by mapping each data segment to a position in the second data file as defined by the offset position in the segment offset data for the data segment.
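The offset-based transfer described in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names `make_segments` and `write_segments`, the fixed segment size, and the use of in-memory byte buffers in place of data files are all assumptions made for illustration.

```python
from typing import Iterable, Iterator, Tuple

def make_segments(data: bytes, segment_size: int) -> Iterator[Tuple[int, bytes]]:
    """Sender side (hypothetical): yield (offset, segment) pairs covering `data`.

    The offset is the segment's position in the source file.
    """
    for offset in range(0, len(data), segment_size):
        yield offset, data[offset:offset + segment_size]

def write_segments(target: bytearray, segments: Iterable[Tuple[int, bytes]]) -> None:
    """Receiver side (hypothetical): place each segment at its recorded offset."""
    for offset, segment in segments:
        target[offset:offset + len(segment)] = segment

source = b"partition-a|partition-b|partition-c"
sets = list(make_segments(source, 8))

# Deliver the sets in reverse order to show that placement depends only on
# the offset carried with each segment, not on arrival order.
received = bytearray(len(source))
write_segments(received, reversed(sets))
```

Because each set carries its own offset, the receiver in this sketch does not need to reassemble segments sequentially.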
-
Publication No.: US20220255014A1
Publication date: 2022-08-11
Application No.: US17549570
Filing date: 2021-12-13
Applicant: DataStax, Inc.
IPC classification: H01L51/00, C07D209/86
Abstract: A database system uses byte ordering for keys and a trie index to reference stored data. The keys of a database are converted into byte-comparable sequences of byte values. The trie index is generated including nodes connected by edges defining paths from a root node to leaf nodes. Each edge is associated with at least one byte value such that each path from the root node to a leaf node through one or more edges defines a unique byte prefix for a byte-comparable sequence of byte values. The leaf node of each path is associated with a database location value. A record is accessed in the database using a database location value determined from referencing the trie index using a byte-comparable sequence of byte values of the record generated from a key of the record. A trie structure and byte-ordered keys may be used for partition or row indices.
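A small Python sketch may help make the idea of byte-comparable keys and a trie index concrete. Everything here is an assumption for illustration: keys are taken to be signed 64-bit integers, the encoding `byte_comparable` (flipping the sign bit of a big-endian integer) is one well-known way to make bytewise order match numeric order, and each edge carries exactly one byte, which is a simplification of the "at least one byte value" per edge described in the abstract.

```python
import struct

def byte_comparable(key: int) -> bytes:
    """Encode a signed 64-bit int so that bytewise order equals numeric order.

    Adding 2**63 flips the sign bit; big-endian packing does the rest.
    """
    return struct.pack(">Q", (key + (1 << 63)) & 0xFFFFFFFFFFFFFFFF)

class TrieNode:
    def __init__(self):
        self.children = {}    # byte value -> TrieNode (one byte per edge here)
        self.location = None  # database location value, set on leaf nodes

class TrieIndex:
    """Hypothetical trie index mapping byte-comparable keys to locations."""

    def __init__(self):
        self.root = TrieNode()

    def put(self, key: int, location: int) -> None:
        node = self.root
        for b in byte_comparable(key):
            node = node.children.setdefault(b, TrieNode())
        node.location = location

    def get(self, key: int):
        node = self.root
        for b in byte_comparable(key):
            node = node.children.get(b)
            if node is None:
                return None
        return node.location

    def locations_in_key_order(self):
        """Walk edges in ascending byte order, which yields key order."""
        def walk(node):
            if node.location is not None:
                yield node.location
            for b in sorted(node.children):
                yield from walk(node.children[b])
        yield from walk(self.root)

index = TrieIndex()
index.put(10, 300)
index.put(-5, 100)
index.put(2, 200)
```

In this sketch, `index.get(2)` follows the eight encoded bytes of the key down to a leaf and returns its location, and an ordered walk of the trie visits records in key order without comparing the original keys.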
-
Publication No.: US11082538B2
Publication date: 2021-08-03
Application No.: US16020939
Filing date: 2018-06-27
Applicant: DataStax, Inc.
Inventor: Matthew Earl Kennedy
Abstract: Embodiments relate to compacting datafiles generated by a database node using a compaction processing node with separate compute resources. The database node generates datafiles and stores the datafiles in a data store. To perform compacting of the datafiles, a snapshot of the data store is created and stored in a snapshot store separate from the data store. The compaction processing node is initiated and attached to the snapshot store. The compaction processing node generates a compacted datafile that is stored in the snapshot store. The database node replaces the data store with the snapshot store, and writes additional datafiles using the snapshot store as a new data store. The compaction processing node may be an instance of a cloud compute infrastructure that is initiated to perform the compaction to reduce compute resource usage by the database node.
-
Publication No.: US20200007662A1
Publication date: 2020-01-02
Application No.: US16020939
Filing date: 2018-06-27
Applicant: DataStax, Inc.
Inventor: Matthew Earl Kennedy
Abstract: Embodiments relate to compacting datafiles generated by a database node using a compaction processing node with separate compute resources. The database node generates datafiles and stores the datafiles in a data store. To perform compacting of the datafiles, a snapshot of the data store is created and stored in a snapshot store separate from the data store. The compaction processing node is initiated and attached to the snapshot store. The compaction processing node generates a compacted datafile that is stored in the snapshot store. The database node replaces the data store with the snapshot store, and writes additional datafiles using the snapshot store as a new data store. The compaction processing node may be an instance of a cloud compute infrastructure that is initiated to perform the compaction to reduce compute resource usage by the database node.
-
Publication No.: US11204905B2
Publication date: 2021-12-21
Application No.: US16020936
Filing date: 2018-06-27
Applicant: DataStax, Inc.
IPC classification: G06F16/22
Abstract: A database system uses byte ordering for keys and a trie index to reference stored data. The keys of a database are converted into byte-comparable sequences of byte values. The trie index is generated including nodes connected by edges defining paths from a root node to leaf nodes. Each edge is associated with at least one byte value such that each path from the root node to a leaf node through one or more edges defines a unique byte prefix for a byte-comparable sequence of byte values. The leaf node of each path is associated with a database location value. A record is accessed in the database using a database location value determined from referencing the trie index using a byte-comparable sequence of byte values of the record generated from a key of the record. A trie structure and byte-ordered keys may be used for partition or row indices.
-
Publication No.: US10666728B1
Publication date: 2020-05-26
Application No.: US16186895
Filing date: 2018-11-12
Applicant: DataStax
Abstract: Data consistency across replicas in a cluster of nodes is maintained by continuously validating local data ranges and repairing any inconsistencies found. Local data ranges are split into segments and prioritized. After a segment is selected for validation, a hash value of a portion of the segment is compared to a hash value from other nodes storing replicas of that data. If the hash values match, the data is consistent. If the hash values do not match, the data is not consistent, and whichever data is most current according to its timestamps is considered correct. If the local node data is correct, it is communicated to the replica nodes so they can be updated. If the local node data is not correct, then data from the replica nodes is correct and is used to update the data in the local node.
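The validate-then-repair flow in this abstract can be sketched in Python as follows. This is only an illustration under stated assumptions: a segment is modeled as a dict from key to `(value, timestamp)`, SHA-256 over a sorted serialization stands in for whatever hash the system uses, a single replica is compared rather than several, and the tie-breaking rule for equal timestamps (the replica wins) is an arbitrary choice. The names `segment_hash` and `repair_segment` are hypothetical.

```python
import hashlib

def segment_hash(rows: dict, keys) -> str:
    """Hash the rows of one segment in a deterministic key order."""
    h = hashlib.sha256()
    for k in sorted(keys):
        value, ts = rows.get(k, (None, None))
        h.update(repr((k, value, ts)).encode())
    return h.hexdigest()

def repair_segment(local: dict, replica: dict, keys) -> bool:
    """Compare hashes; on mismatch, copy the newest version of each row.

    Returns True if a repair was needed, False if the segment was consistent.
    """
    if segment_hash(local, keys) == segment_hash(replica, keys):
        return False  # hashes match: data is consistent
    for k in keys:
        l, r = local.get(k), replica.get(k)
        if l == r:
            continue
        # The row with the later timestamp is treated as correct.
        if r is None or (l is not None and l[1] > r[1]):
            replica[k] = l  # local is newer: update the replica
        else:
            local[k] = r    # replica is newer (or tie): update local
    return True

local = {"a": ("1", 10), "b": ("old", 5)}
replica = {"a": ("1", 10), "b": ("new", 9), "c": ("x", 1)}
needed_repair = repair_segment(local, replica, ["a", "b", "c"])
```

After the call, both copies hold the newer row for `"b"` and the row for `"c"` that the local node was missing, so a second validation of the same segment finds matching hashes.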
-
Publication No.: US20200153900A1
Publication date: 2020-05-14
Application No.: US16186895
Filing date: 2018-11-12
Applicant: DataStax
IPC classification: H04L29/08
Abstract: Data consistency across replicas in a cluster of nodes is maintained by continuously validating local data ranges and repairing any inconsistencies found. Local data ranges are split into segments and prioritized. After a segment is selected for validation, a hash value of a portion of the segment is compared to a hash value from other nodes storing replicas of that data. If the hash values match, the data is consistent. If the hash values do not match, the data is not consistent, and whichever data is most current according to its timestamps is considered correct. If the local node data is correct, it is communicated to the replica nodes so they can be updated. If the local node data is not correct, then data from the replica nodes is correct and is used to update the data in the local node.
-
Publication No.: US20200151145A1
Publication date: 2020-05-14
Application No.: US16580302
Filing date: 2019-09-24
Applicant: DataStax
IPC classification: G06F16/178, G06F16/182, G06F16/14
Abstract: Data consistency across replicas in a cluster of nodes is maintained by continuously validating local data ranges and repairing any inconsistencies found. Local data ranges are split into segments and prioritized. After a segment is selected for validation, a hash value of a portion of the segment is compared to a hash value from other nodes storing replicas of that data. If the hash values match, the data is consistent. If the hash values do not match, the data is not consistent, and whichever data is most current according to its timestamps is considered correct. If the local node data is correct, it is communicated to the replica nodes so they can be updated. If the local node data is not correct, then data from the replica nodes is correct and is used to update the data in the local node. An alternative, incremental validation approach improves efficiency.
-
Publication No.: US10148754B1
Publication date: 2018-12-04
Application No.: US15194446
Filing date: 2016-06-27
Applicant: DataStax, Inc.
IPC classification: G06F15/173, H04L29/08, H04L12/911, H04L29/06
Abstract: A distributed system that manages its resources without the need for complex time synchronization systems is described. The distributed system includes a resource manager that manages the resources of the distributed system. The resource manager assigns and renews leases on resources of the distributed system for clients in the distributed system. Each lease specifies the duration of time for which it is awarded to a client.
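One way to picture duration-based leases is the Python sketch below. It is a loose illustration, not the patented design: the class `ResourceManager`, its `acquire` and `renew` methods, and the choice to track expiry only against the manager's own monotonic clock are assumptions. The point it tries to show is that a lease expressed as a duration can be checked against a single local clock, so clients and manager need not agree on wall-clock time.

```python
import time

class ResourceManager:
    """Hypothetical lease manager that tracks expiry on its own clock."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._leases = {}  # resource -> (client, expires_at on manager clock)

    def acquire(self, resource, client, duration):
        """Grant a lease for `duration` seconds unless another client holds one."""
        now = self._clock()
        holder = self._leases.get(resource)
        if holder is not None and holder[1] > now and holder[0] != client:
            return False
        self._leases[resource] = (client, now + duration)
        return True

    def renew(self, resource, client, duration):
        """Extend a lease only if `client` still holds an unexpired lease."""
        now = self._clock()
        holder = self._leases.get(resource)
        if holder is None or holder[0] != client or holder[1] <= now:
            return False
        self._leases[resource] = (client, now + duration)
        return True

# A fake clock makes the behaviour easy to follow without sleeping.
fake_now = [0.0]
manager = ResourceManager(clock=lambda: fake_now[0])
granted_a = manager.acquire("table-1", "client-a", 10)
granted_b = manager.acquire("table-1", "client-b", 10)  # refused: lease is held
```

Injecting the clock as a parameter is a design choice of this sketch that keeps the example deterministic; a real deployment would also have to account for message delay between client and manager.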
-
Publication No.: US20180081937A1
Publication date: 2018-03-22
Application No.: US14933697
Filing date: 2015-11-05
Applicant: DataStax, Inc.
Inventor: Matthias Broecheler
IPC classification: G06F17/30
Abstract: At least a portion of a graph database having a plurality of vertex-centric indices is stored. A virtual edge to be generated is identified based on a plurality of edges of the graph database. The virtual edge, connecting at least a pair of vertices that were not previously directly connected, is generated. The plurality of vertex-centric indices is updated to include information about the virtual edge.
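As a rough illustration of virtual edges and vertex-centric indices, the Python sketch below derives a virtual edge from a two-hop path. The abstract does not say how virtual edges are identified, so the two-hop rule is an assumption, as are the `Graph` class, its method names, and the representation of a vertex-centric index as a per-vertex map from edge label to neighbours.

```python
from collections import defaultdict

class Graph:
    """Hypothetical graph with a vertex-centric index per vertex."""

    def __init__(self):
        # vertex -> edge label -> set of neighbouring vertices
        self.index = defaultdict(lambda: defaultdict(set))

    def add_edge(self, src, label, dst):
        self.index[src][label].add(dst)

    def add_virtual_edges(self, first, second, virtual_label):
        """For each path a -first-> b -second-> c, add a virtual edge a -> c
        if a and c are not already directly connected, and record it in a's
        vertex-centric index. Returns the list of (a, c) pairs created."""
        created = []
        for a in list(self.index):
            for b in list(self.index[a].get(first, ())):
                for c in list(self.index.get(b, {}).get(second, ())):
                    directly = any(c in nbrs for nbrs in self.index[a].values())
                    if c != a and not directly:
                        self.index[a][virtual_label].add(c)
                        created.append((a, c))
        return created

g = Graph()
g.add_edge("alice", "knows", "bob")
g.add_edge("bob", "knows", "carol")
created = g.add_virtual_edges("knows", "knows", "knows_2hop")
```

After the call, a lookup on `g.index["alice"]["knows_2hop"]` answers the two-hop question from alice's own index, without traversing through `"bob"`.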