Splitting partitions across clusters in a time-series database

    Publication No.: US11409771B1

    Publication date: 2022-08-09

    Application No.: US16831599

    Filing date: 2020-03-26

    Abstract: Methods, systems, and computer-readable media for splitting partitions across database clusters in a time-series database are disclosed. A time-series database determines that a heat metric for a first tile has exceeded a threshold. The first tile represents spatial boundaries and temporal boundaries of time-series data, and a lease for the first tile is assigned to a storage node. Based (at least in part) on the heat metric, a temporal split of the first tile is performed to generate an intermediate tile representing the spatial boundaries and a later portion of the temporal boundaries. A spatial split of the intermediate tile is performed to generate second and third tiles representing two portions of the spatial boundaries and the later portion of the temporal boundaries. The storage node stores elements of the time-series data within these new boundaries to the second and third tiles.
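
    Illustration (not from the patent): the temporal-then-spatial split described in this abstract could be sketched roughly as follows. The `Tile` class, the integer key range, the half-open bounds, the midpoint split key, and the decision to keep the earlier portion as its own tile are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tile:
    key_lo: int   # inclusive lower spatial bound (hypothetical integer key space)
    key_hi: int   # exclusive upper spatial bound
    t_start: int  # inclusive lower temporal bound
    t_end: int    # exclusive upper temporal bound


def temporal_split(tile, split_time):
    """Return (earlier, later) tiles that keep the same spatial bounds."""
    if not tile.t_start < split_time < tile.t_end:
        raise ValueError("split_time must fall strictly inside the tile")
    earlier = Tile(tile.key_lo, tile.key_hi, tile.t_start, split_time)
    later = Tile(tile.key_lo, tile.key_hi, split_time, tile.t_end)
    return earlier, later


def spatial_split(tile, split_key):
    """Return (lower, upper) tiles that keep the same temporal bounds."""
    if not tile.key_lo < split_key < tile.key_hi:
        raise ValueError("split_key must fall strictly inside the tile")
    lower = Tile(tile.key_lo, split_key, tile.t_start, tile.t_end)
    upper = Tile(split_key, tile.key_hi, tile.t_start, tile.t_end)
    return lower, upper


def split_hot_tile(tile, heat, threshold, now):
    """If the tile is hot, split it in time at `now`, then split the later part in space."""
    if heat <= threshold:
        return [tile]
    earlier, intermediate = temporal_split(tile, now)
    midpoint = (intermediate.key_lo + intermediate.key_hi) // 2  # assumed split key
    second, third = spatial_split(intermediate, midpoint)
    return [earlier, second, third]
```

    Splitting in time first means only data arriving from `now` onward is routed to the two new tiles; data already written under the original boundaries stays where it is.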

    Heat balancing in a distributed time-series database

    Publication No.: US11263270B1

    Publication date: 2022-03-01

    Application No.: US16831637

    Filing date: 2020-03-26

    Abstract: Methods, systems, and computer-readable media for heat balancing in a distributed time-series database are disclosed. A time-series database stores time-series data using database clusters. A plurality of leases for tiles representing spatial and temporal partitions of the time-series data are assigned to a first storage node. The time-series database determines that a heat metric for the first storage node has exceeded a threshold. The time-series database determines respective heat metrics for additional storage nodes including a second storage node. The time-series database selects the second storage node based (at least in part) on the respective heat metrics. The time-series database reassigns one or more of the leases from the first storage node to the second storage node. The second storage node stores elements of the time-series data into the plurality of database clusters in one or more tiles associated with the one or more reassigned leases.
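
    Illustration (not from the patent): one simple way to picture the lease reassignment in this abstract is a greedy loop that moves the hottest tile from an overheated node to the coolest other node. The function names, the additive per-tile heat model, and the "stop when a move no longer helps" guard are assumptions for the sketch.

```python
def node_heat(node, leases, tile_heat):
    """Total heat of all tiles whose leases are held by `node` (assumed additive)."""
    return sum(tile_heat[t] for t in leases[node])


def rebalance_node(hot_node, leases, tile_heat, threshold):
    """Move leases off `hot_node` until its heat is at or below `threshold`.

    `leases` maps node -> list of tile ids and is modified in place.
    Returns the list of (tile, destination_node) moves that were made.
    """
    moves = []
    while leases[hot_node] and node_heat(hot_node, leases, tile_heat) > threshold:
        others = [n for n in leases if n != hot_node]
        if not others:
            break
        # Select the destination by comparing the other nodes' heat metrics.
        target = min(others, key=lambda n: node_heat(n, leases, tile_heat))
        tile = max(leases[hot_node], key=lambda t: tile_heat[t])
        source_heat = node_heat(hot_node, leases, tile_heat)
        # Stop if the move would only shift the hot spot to the target.
        if node_heat(target, leases, tile_heat) + tile_heat[tile] >= source_heat:
            break
        leases[hot_node].remove(tile)
        leases[target].append(tile)
        moves.append((tile, target))
    return moves
```

    A node holding a single very hot tile is left alone by this sketch, since reassigning that lease would not lower the maximum heat; that case would call for a tile split instead.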

    MULTI-TENANT PARTITIONING IN A TIME-SERIES DATABASE

    Publication No.: US20220374407A1

    Publication date: 2022-11-24

    Application No.: US17817883

    Filing date: 2022-08-05

    Inventor: Dumanshu Goyal

    Abstract: Methods, systems, and computer-readable media for multi-tenant partitioning in a time-series database are disclosed. A partitioning scheme is determined that maps a plurality of data points to a plurality of partitions based at least in part on table identifiers associated with the data points. The partitions are stored using a plurality of storage resources. After the storage resources are provisioned, an additional table identifier is generated. Based at least in part on the partitioning scheme, one or more additional data points comprising the additional table identifier are mapped to a particular partition of the plurality of partitions. The one or more additional data points are stored in the particular partition using the storage resources.
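
    Illustration (not from the patent): a partitioning scheme of the kind described here can be pictured as a stable hash over the table identifier and a series key, taken modulo a fixed number of partitions. The `PartitionedStore` class, the use of SHA-256, and the in-memory lists standing in for storage resources are assumptions for the sketch.

```python
import hashlib


class PartitionedStore:
    """Fixed set of partitions shared by all tables (tenants)."""

    def __init__(self, num_partitions):
        # Stand-in for storage resources that are provisioned once, up front.
        self.partitions = [[] for _ in range(num_partitions)]

    def partition_for(self, table_id, series_key):
        """Map a data point to a partition using its table identifier and series key."""
        digest = hashlib.sha256(f"{table_id}/{series_key}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.partitions)

    def put(self, table_id, series_key, timestamp, value):
        """Store a data point in the partition chosen by the scheme; return its index."""
        index = self.partition_for(table_id, series_key)
        self.partitions[index].append((table_id, series_key, timestamp, value))
        return index
```

    Because the mapping depends only on the identifier and the existing partition count, a table identifier generated after provisioning lands in one of the partitions that already exist, with no new storage resources required.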

    Creating replicas from across storage groups of a time series database

    Publication No.: US11294931B1

    Publication date: 2022-04-05

    Application No.: US16577931

    Filing date: 2019-09-20

    Abstract: Creating replicas of a time series database from across storage groups may be implemented for a time series database. Updates to a time series database may be maintained in an update log. Updates may be obtained from the log and ingested at different groups of copies of the time series database used to perform queries. Updates may be ingested at different rates at the different groups. A new copy may be added to one of the groups by copying, to the new copy, a portion of the time series database determined to be present in another group of copies, together with an update from the log that is not found in the other group.
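
    Illustration (not from the patent): one reading of this abstract is that a new copy starts from data already held by a copy in another group and then replays the log entries that group has not yet applied. The list-based log, the integer offsets, and the dictionary standing in for a database copy are assumptions for the sketch.

```python
def build_new_copy(log, source_copy, source_offset, target_offset):
    """Build a copy for the target group from a copy in a different group.

    log           -- ordered list of (key, value) updates (hypothetical update log)
    source_copy   -- dict holding the data present in the other group's copy
    source_offset -- number of log entries the source group has applied
    target_offset -- number of log entries the target group has applied
    """
    new_copy = dict(source_copy)  # copy the portion present in the other group
    # Replay only the updates the other group has not ingested yet.
    for key, value in log[source_offset:target_offset]:
        new_copy[key] = value
    return new_copy
```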

    Continuous verified backups
    Document type: granted invention patent

    Publication No.: US10795777B1

    Publication date: 2020-10-06

    Application No.: US15889103

    Filing date: 2018-02-05

    Inventor: Dumanshu Goyal

    Abstract: A system and technique for creating, in a non-native format, verified snapshots and change log archives for data in a database (e.g., tables, partitions, etc.). To verify accuracy of a conversion of the data and corresponding change log data from a native format to a non-native format, both data from the database and the corresponding change logs are processed separately with a forward transformation process, and then a reverse transformation process. The results of the reverse transformations are then compared to the original data to catch data corruptions or errors when performing the format conversion and creating the snapshot or change log archive so that the corruption or error is not propagated to the snapshot/archive. Various forms of error detection (e.g., byte-level, raw data comparisons, checksums, etc.) and error handling are disclosed. The verified snapshots and change log archives may be used to restore the database, for example.
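
    Illustration (not from the patent): the forward-then-reverse verification described in this abstract can be pictured as a round-trip check. Here JSON lines stand in for the non-native format, Python dictionaries for the native rows, and a CRC-32 for the checksum; all three choices and the function names are assumptions for the sketch.

```python
import json
import zlib


def forward(rows):
    """Forward transformation: native rows (dicts) to a non-native byte format."""
    return "\n".join(json.dumps(r, sort_keys=True) for r in rows).encode("utf-8")


def reverse(blob):
    """Reverse transformation: non-native bytes back to native rows."""
    text = blob.decode("utf-8")
    return [json.loads(line) for line in text.splitlines()]


def verified_archive(rows):
    """Convert rows, verify the round trip, and return (blob, checksum).

    Raises ValueError instead of returning an archive if the reverse
    transformation does not reproduce the original rows.
    """
    blob = forward(rows)
    if reverse(blob) != rows:
        raise ValueError("round-trip mismatch; archive not written")
    return blob, zlib.crc32(blob)


# The same check would be applied separately to table data and to change-log records.
snapshot, snapshot_crc = verified_archive([{"id": 1, "value": 2.5}])
change_log, change_log_crc = verified_archive([{"op": "put", "id": 2, "value": 3.0}])
```

    In this sketch a row containing a tuple fails verification, because JSON turns it into a list on the way back; that is the kind of silent conversion error the comparison is meant to catch before it reaches a snapshot or archive.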

    Creating replicas using queries to a time series database

    Publication No.: US11853317B1

    Publication date: 2023-12-26

    Application No.: US16357224

    Filing date: 2019-03-18

    Inventor: Dumanshu Goyal

    CPC classification number: G06F16/27 G06F16/2477

    Abstract: Creating replicas using queries may be implemented for a time series database. A new host for a new copy of time series database data may be added and idempotent ingestion of additional data to be included in the new copy after a creation time for the new copy may be performed. Queries to other hosts that store the time series database data may be performed to obtain time series data prior to the creation time. Idempotent ingestion of the results of the queries may be performed at the new host after which performance of queries to the new copy of the time series database may be allowed at the new host.
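
    Illustration (not from the patent): idempotent ingestion can be pictured as an upsert keyed by series and timestamp, so that a point delivered both by the live stream and by a backfill query is stored once. The `Host` class, the tuple layout of a point, and the comparison of the creation time against data timestamps are assumptions for the sketch.

```python
class Host:
    """Toy host holding one copy of the time series data."""

    def __init__(self):
        self.points = {}        # (series, timestamp) -> value
        self.queryable = False  # queries are refused until backfill completes

    def ingest(self, series, timestamp, value):
        # Keyed upsert: ingesting the same point twice leaves the state unchanged.
        self.points[(series, timestamp)] = value

    def query_before(self, cutoff):
        """Return all points with a timestamp earlier than `cutoff`."""
        return [(s, t, v) for (s, t), v in sorted(self.points.items()) if t < cutoff]


def bootstrap_replica(new_host, existing_hosts, creation_time, live_points):
    """Fill a new copy from live ingestion plus queries to existing hosts."""
    # Data arriving after the copy was created is ingested directly.
    for series, timestamp, value in live_points:
        new_host.ingest(series, timestamp, value)
    # Data from before the creation time is obtained by querying other hosts.
    for host in existing_hosts:
        for series, timestamp, value in host.query_before(creation_time):
            new_host.ingest(series, timestamp, value)
    new_host.queryable = True  # only now may the new copy serve queries
```

    Because every write is an upsert, the order of the two phases and any overlap between query results from different hosts do not affect the final contents.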

    Multi-tenant partitioning in a time-series database

    Publication No.: US11409725B1

    Publication date: 2022-08-09

    Application No.: US16267330

    Filing date: 2019-02-04

    Inventor: Dumanshu Goyal

    Abstract: Methods, systems, and computer-readable media for multi-tenant partitioning in a time-series database are disclosed. A partitioning scheme is determined that maps a plurality of data points to a plurality of partitions based at least in part on table identifiers associated with the data points. The partitions are stored using a plurality of storage resources. After the storage resources are provisioned, an additional table identifier is generated. Based at least in part on the partitioning scheme, one or more additional data points comprising the additional table identifier are mapped to a particular partition of the plurality of partitions. The one or more additional data points are stored in the particular partition using the storage resources.

    Dynamic lease assignments in a time-series database

    Publication No.: US11366598B1

    Publication date: 2022-06-21

    Application No.: US16831608

    Filing date: 2020-03-26

    Abstract: Methods, systems, and computer-readable media for dynamic lease assignments in a time-series database are disclosed. A time-series database determines an assignment of a lease for a tile representing spatial and temporal boundaries of time-series data. The lease is assigned to a first storage node of a plurality of storage nodes. The time-series database routes the elements of the time-series data within the spatial and temporal boundaries to the first storage node based at least in part on the assignment of the lease. The first storage node stores the elements of the time-series data into the tile in a database cluster. Write requests by the first storage node to the tile are validated by the database cluster based at least in part on the assignment of the lease.
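
    Illustration (not from the patent): the write validation mentioned at the end of this abstract can be pictured as the cluster checking, on every write, that the requesting node currently holds the lease for the tile. The `Cluster` class, string identifiers, and the use of `PermissionError` are assumptions for the sketch.

```python
class Cluster:
    """Toy database cluster that validates writes against lease assignments."""

    def __init__(self):
        self.lease_holder = {}  # tile id -> node id currently holding the lease
        self.tiles = {}         # tile id -> list of stored points

    def assign_lease(self, tile_id, node_id):
        """Record (or change) which storage node holds the lease for a tile."""
        self.lease_holder[tile_id] = node_id

    def write(self, tile_id, node_id, point):
        """Accept the write only if `node_id` holds the lease for `tile_id`."""
        if self.lease_holder.get(tile_id) != node_id:
            raise PermissionError(f"{node_id} does not hold the lease for {tile_id}")
        self.tiles.setdefault(tile_id, []).append(point)
```

    Validating at the cluster means a node that still believes it owns a tile after a reassignment cannot write stale data into it.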

    INGESTION PARTITION AUTO-SCALING IN A TIME-SERIES DATABASE

    Publication No.: US20220171792A1

    Publication date: 2022-06-02

    Application No.: US17675567

    Filing date: 2022-02-18

    Abstract: Methods, systems, and computer-readable media for ingestion partition auto-scaling in a time-series database are disclosed. A first set of one or more hosts divides elements of time-series data into a plurality of partitions. A second set of one or more hosts stores the elements of time-series data from the plurality of partitions into one or more storage tiers of a time-series database. An analyzer receives first data indicative of the resource usage of the time-series data at the first set of one or more hosts. The analyzer receives second data indicative of the resource usage of the time-series data at the second set of one or more hosts. Based at least in part on analysis of the first data and the second data, the analyzer initiates a split of an individual one of the partitions into two or more partitions.
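
    Illustration (not from the patent): the analyzer's decision can be pictured as combining per-partition usage reported by the two sets of hosts and flagging any partition whose load exceeds a limit. The per-partition usage dictionaries, the use of the larger of the two readings, the single limit, and the midpoint split are assumptions for the sketch.

```python
def partitions_to_split(ingest_usage, storage_usage, limit):
    """Return the partitions whose usage at either set of hosts exceeds `limit`.

    ingest_usage  -- partition id -> usage reported by the partitioning hosts
    storage_usage -- partition id -> usage reported by the storage hosts
    """
    hot = []
    for partition in sorted(set(ingest_usage) | set(storage_usage)):
        load = max(ingest_usage.get(partition, 0.0), storage_usage.get(partition, 0.0))
        if load > limit:
            hot.append(partition)
    return hot


def split_partition(bounds):
    """Split a half-open integer key range (lo, hi) into two halves."""
    lo, hi = bounds
    mid = (lo + hi) // 2
    return (lo, mid), (mid, hi)
```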
