Storing data and metadata in respective virtual shards on sharded storage systems

    Publication number: US09811546B1

    Publication date: 2017-11-07

    Application number: US14319301

    Filing date: 2014-06-30

    CPC classification number: G06F17/30321 G06F17/30091 G06F17/30194

    Abstract: Techniques are provided for storing data and metadata on sharded storage arrays. In one embodiment, data is processed in a sharded distributed data storage system that stores data in a plurality of shards on one or more storage nodes by providing a plurality of addressable virtual shards within each of the shards, wherein at least a first one of the addressable virtual shards stores the data, and wherein at least a second one of the addressable virtual shards stores the metadata related to the data; obtaining the data from a compute node; and providing the data and the metadata related to the data stored to the sharded distributed data storage system for storage in the respective first and second addressable virtual shards. The metadata related to the data is stored together at a portion of a corresponding stripe for the data in the second one of the addressable virtual shards. A third one of the addressable virtual shards optionally stores a checksum value related to the data.
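The layout described in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Shard` class, the `DATA`/`METADATA`/`CHECKSUM` addresses, the `shard_for` placement function, and the use of CRC-32 are all assumptions introduced here, since the abstract does not name an API, a placement rule, or a checksum algorithm.

```python
import zlib
from dataclasses import dataclass, field

# Hypothetical addresses of the virtual shards inside every shard.
DATA, METADATA, CHECKSUM = 0, 1, 2


@dataclass
class Shard:
    # Each virtual shard maps a stripe index to the bytes stored for that stripe.
    virtual: dict = field(
        default_factory=lambda: {DATA: {}, METADATA: {}, CHECKSUM: {}}
    )

    def put(self, stripe, data, metadata, with_checksum=True):
        """Store data and its metadata at the same stripe of their virtual shards."""
        self.virtual[DATA][stripe] = data
        self.virtual[METADATA][stripe] = metadata
        if with_checksum:
            # The third virtual shard is optional (CRC-32 is an assumed choice).
            self.virtual[CHECKSUM][stripe] = zlib.crc32(data)


def shard_for(key, shards):
    """Pick a shard by hashing the key (assumed placement rule)."""
    return shards[zlib.crc32(key.encode()) % len(shards)]


shards = [Shard() for _ in range(4)]
target = shard_for("file-a", shards)
target.put(stripe=0, data=b"payload", metadata=b"owner=alice")
```

Keeping metadata at the same stripe index as its data means one stripe number locates both, which is the co-location the abstract describes.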

    Advanced metadata management
    Patent type: Invention grant

    Publication number: US11093468B1

    Publication date: 2021-08-17

    Application number: US14230829

    Filing date: 2014-03-31

    Abstract: A computer-executable method, system, and computer program product for managing metadata in a distributed data storage system, wherein the distributed data storage system includes a first burst buffer having a key-value store enabled to store metadata, the computer-executable method, system, and computer program product comprising receiving, from a compute node, metadata related to data stored within the distributed data storage system, indexing the metadata at the first burst buffer, and processing the metadata in the first burst buffer.
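The receive, index, and process steps in this abstract can be illustrated with a small in-memory sketch. The class `BurstBufferMetadataStore` and its methods are hypothetical names; the abstract only states that the burst buffer has a key-value store able to hold metadata, so the inverted index used here is an assumed design.

```python
from collections import defaultdict


class BurstBufferMetadataStore:
    """Toy stand-in for a burst buffer's key-value metadata store."""

    def __init__(self):
        self.kv = {}                    # object key -> metadata dict
        self.index = defaultdict(set)   # (attribute, value) -> object keys

    def receive(self, key, metadata):
        """Accept metadata from a compute node and index it."""
        # Drop index entries left over from an earlier version of this key.
        for item in self.kv.get(key, {}).items():
            self.index[item].discard(key)
        self.kv[key] = dict(metadata)
        for item in metadata.items():
            self.index[item].add(key)

    def query(self, attribute, value):
        """Process the indexed metadata: find objects with attribute == value."""
        return sorted(self.index.get((attribute, value), ()))


bb = BurstBufferMetadataStore()
bb.receive("obj1", {"owner": "alice", "step": 3})
bb.receive("obj2", {"owner": "alice", "step": 4})
matches = bb.query("owner", "alice")
```

Metadata values must be hashable in this sketch because each (attribute, value) pair is used as an index key.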

    End-to-end data integrity in parallel storage systems

    Publication number: US09767139B1

    Publication date: 2017-09-19

    Application number: US14319647

    Filing date: 2014-06-30

    CPC classification number: G06F11/14

    Abstract: End-to-end data integrity is provided in parallel computing systems, such as High Performance Computing (HPC) environments. An exemplary method is provided for processing data in a distributed data storage system by obtaining the data and one or more corresponding checksum values from a compute node; and providing the data and the one or more corresponding checksum values to the distributed data storage system for storage. One or more checksum values corresponding to the data can be generated if the one or more checksum values are not received from a compute node. Exemplary processes are provided for copy, slice, merge, and slice-and-merge functions. The distributed data storage system comprises, for example, one or more Parallel Log-Structured File System (PLFS) storage elements and/or key-value storage elements storing one or more key-value pairs.
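The checksum handling in this abstract can be sketched as below. The function names `ingest`, `slice_chunk`, and `merge_chunks` are hypothetical, and CRC-32 is an assumed algorithm; the abstract does not specify either. The sketch verifies a checksum when one arrives with the data, generates one when it does not, and re-verifies before deriving checksums for sliced or merged data.

```python
import zlib


def ingest(data, checksum=None):
    """Accept data from a compute node, generating a checksum if none was sent."""
    if checksum is None:
        checksum = zlib.crc32(data)
    elif zlib.crc32(data) != checksum:
        raise ValueError("checksum mismatch on ingest")
    return data, checksum


def slice_chunk(data, checksum, start, end):
    """Verify the source chunk, then return the slice with its own checksum."""
    ingest(data, checksum)
    part = data[start:end]
    return part, zlib.crc32(part)


def merge_chunks(chunks):
    """Verify every (data, checksum) pair, then return the merged pair."""
    for data, checksum in chunks:
        ingest(data, checksum)
    merged = b"".join(data for data, _ in chunks)
    return merged, zlib.crc32(merged)


data, checksum = ingest(b"hello world")
left = slice_chunk(data, checksum, 0, 5)
right = slice_chunk(data, checksum, 5, len(data))
merged = merge_chunks([left, right])
```

Verifying before each derivation keeps a corrupted chunk from silently receiving a fresh, valid-looking checksum.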
