Placing data in a data storage array based on detection of different data streams within an incoming flow of data

    Publication No.: US11340814B1

    Publication Date: 2022-05-24

    Application No.: US15498653

    Filing Date: 2017-04-27

    Abstract: A technique performs stream-based storage of data. The technique involves receiving, by processing circuitry of data storage equipment, an incoming flow of data. The technique further involves detecting, by the processing circuitry, different data streams within the incoming flow of data. The technique further involves performing, by the processing circuitry, data placement operations based on the different data streams detected within the incoming flow of data. The data placement operations are configured and operative to place data of each data stream of the different data streams in a different segment of storage provided by a data storage array of the data storage equipment. With data of each data stream being placed in a different segment, the resulting operation is more efficient, e.g., optimized sequential reads and writes, more effective data prefetching, more effective auto-tiering of data, and so on.
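
The abstract does not say how streams are detected, so the following is only a minimal sketch under one assumed heuristic: a write that begins at the address where an earlier write ended is treated as a continuation of that earlier stream. The class `StreamPlacer`, its fields, and the `(lba, length)` write format are all hypothetical names introduced for illustration.

```python
from collections import defaultdict


class StreamPlacer:
    """Toy model: detect sequential streams and keep each in its own segment."""

    def __init__(self):
        self._next_lba = {}                 # expected next address -> stream id
        self._next_stream_id = 0
        self.segments = defaultdict(list)   # stream id -> writes placed together

    def write(self, lba, length):
        # Assumption: a write starting where an earlier write ended continues
        # that stream; any other write opens a new stream.
        stream_id = self._next_lba.pop(lba, None)
        if stream_id is None:
            stream_id = self._next_stream_id
            self._next_stream_id += 1
        self._next_lba[lba + length] = stream_id
        self.segments[stream_id].append((lba, length))
        return stream_id


# Two interleaved sequential writers arriving as one incoming flow.
placer = StreamPlacer()
flow = [(0, 8), (1000, 8), (8, 8), (1008, 8), (16, 8)]
ids = [placer.write(lba, n) for lba, n in flow]
```

Here the interleaved flow is separated into two streams, and each stream's writes end up contiguous in its own segment, which is the property that makes later sequential reads and prefetching cheaper.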

    Log cleaning and tiering in a log-based data storage system

    Publication No.: US09959054B1

    Publication Date: 2018-05-01

    Application No.: US14984060

    Filing Date: 2015-12-30

    Abstract: A technique is directed to cleaning a log structure. The technique involves identifying extents (e.g., a contiguous segment of 8 MB) to reclaim from a first storage tier of a set of storage tiers containing the log structure. The technique further involves performing a tier selection operation to select a target storage tier from the set of storage tiers based on a utilization measure of the log structure. The technique further involves, after identifying the extents to reclaim and performing the tier selection operation, storing data from the identified extents into a new extent of the target storage tier and freeing the identified extents. Such a technique combines log cleaning and tiering into a single operation thus placing less stress on storage devices (e.g., less wear on flash memory, etc.), consuming fewer system resources, and providing better performance.
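
The three steps in this abstract (identify extents, select a target tier, then copy and free) can be sketched roughly as below. This is an illustrative model only: the tier names, the thresholds `sparse` and `busy`, the block-count extent size, and the rule "demote when the source tier is busy" are assumptions, not details taken from the patent.

```python
from dataclasses import dataclass, field

EXTENT_BLOCKS = 8  # blocks per extent; illustrative stand-in for the 8 MB example


@dataclass
class Extent:
    tier: str
    live: list  # blocks that are still valid


@dataclass
class Log:
    extents: list = field(default_factory=list)

    def utilization(self, tier):
        """Fraction of the tier's allocated blocks that hold live data."""
        mine = [e for e in self.extents if e.tier == tier]
        if not mine:
            return 0.0
        return sum(len(e.live) for e in mine) / (len(mine) * EXTENT_BLOCKS)


def clean_and_tier(log, source="fast", other="slow", sparse=0.5, busy=0.5):
    # Step 1: pick sparse source-tier extents whose live data fits one new extent.
    victims, moved = [], []
    for extent in log.extents:
        is_sparse = extent.tier == source and len(extent.live) <= sparse * EXTENT_BLOCKS
        if is_sparse and len(moved) + len(extent.live) <= EXTENT_BLOCKS:
            victims.append(extent)
            moved.extend(extent.live)
    if not victims:
        return None
    # Step 2: choose the target tier from a utilization measure (assumed policy).
    target = other if log.utilization(source) >= busy else source
    # Step 3: a single pass writes the survivors to the target and frees the victims.
    new_extent = Extent(tier=target, live=moved)
    victim_ids = {id(v) for v in victims}
    log.extents = [e for e in log.extents if id(e) not in victim_ids]
    log.extents.append(new_extent)
    return new_extent


log = Log([Extent("fast", [1, 2]),
           Extent("fast", [3, 4, 5, 6, 7, 8, 9, 10]),
           Extent("fast", [11, 12, 13])])
new_extent = clean_and_tier(log)
```

Because the surviving blocks are written once, directly to the chosen tier, the data is not rewritten a second time by a separate tiering pass.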

    Inline deduplication using log based storage

    Publication No.: US11144533B1

    Publication Date: 2021-10-12

    Application No.: US15283289

    Filing Date: 2016-09-30

    Abstract: A method is used in managing deduplication of data in storage systems. A candidate data object is identified for deduplicating a data object by evaluating digests stored in a current digest segment to determine whether another digest matching a digest associated with the data block is stored in the current digest segment. The current digest segment includes a set of digests associated with a set of data blocks previously received for deduplication. Based on the evaluation, a deduplicating technique is applied to the data object. The current digest segment is stored in an index table. A previous digest segment that is associated with a digest stored in the index table, where that digest matches the digest associated with the data block, is replaced by the current digest segment. A plurality of digest segments are organized into a segment group and a reference counter is associated with the segment group, wherein if the reference counter reaches zero, storage space consumed by the digest group is reclaimed.
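
A simplified sketch of the lookup order described here: check the current digest segment first, fall back to the index table, and let the index entry for a digest point at the current segment afterwards. The names `DigestSegment`, `Deduper`, and `segment_size` are hypothetical, SHA-256 is an assumed digest function, and the segment-group reference counting is deliberately left out.

```python
import hashlib


class DigestSegment:
    """Digests of recently received blocks, mapped to where each block is stored."""

    def __init__(self):
        self.digests = {}


class Deduper:
    def __init__(self, segment_size=4):
        self.segment_size = segment_size
        self.current = DigestSegment()
        self.index = {}   # digest -> segment that most recently recorded it
        self.store = []   # unique blocks actually written

    def write(self, block):
        digest = hashlib.sha256(block).digest()
        # First look in the current digest segment, then in the index table.
        location = self.current.digests.get(digest)
        if location is None and digest in self.index:
            location = self.index[digest].digests[digest]
        if location is None:
            self.store.append(block)          # no match: store a new block
            location = len(self.store) - 1
        self.current.digests[digest] = location
        # The current segment replaces any previous segment indexed for this digest.
        self.index[digest] = self.current
        if len(self.current.digests) >= self.segment_size:
            self.current = DigestSegment()    # start a fresh current segment
        return location


dedup = Deduper(segment_size=2)
locations = [dedup.write(b) for b in [b"a", b"b", b"a", b"c", b"b"]]
```

Five writes produce only three stored blocks, and the repeated blocks resolve to the locations of their first copies even after the current segment has rolled over.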

    User stream aware file systems with user stream detection

    Publication No.: US20210034289A1

    Publication Date: 2021-02-04

    Application No.: US16526391

    Filing Date: 2019-07-30

    Abstract: Techniques for handling multiple data streams in stream-aware data storage systems. The data storage systems can detect multiple sub-streams in an incoming stream of data, form a group of data blocks corresponding to each respective sub-stream, and associate, bind, and/or assign a stream ID to each data block in the respective sub-stream. The data storage systems can write each group of data blocks having the same stream ID to the same segment of a data log in one or more non-volatile storage devices, and manage and/or maintain, in persistent data storage, attribute information pertaining to the groups of data blocks in the respective sub-streams relative to time periods during which the respective groups of data blocks were written and/or received. The techniques can improve the detection of multiple sub-streams in an incoming stream of data, and improve the management of attribute information pertaining to data blocks in the respective sub-streams.

    Merging mapping metadata to promote reference counting efficiency

    Publication No.: US10146466B1

    Publication Date: 2018-12-04

    Application No.: US15499488

    Filing Date: 2017-04-27

    Abstract: A technique for managing metadata in a data storage system designates block pointers as either sources or copies, where sources contribute to reference counts of pointed-to structures but copies do not. The technique maintains parent-child relationships between parent BPSs (block pointer sets) and child BPSs, where each BPS includes an array of block pointers. Each child BPS is created as a copy of a parent BPS and has block pointers initially designated as copies. The technique performs a metadata-merge operation to merge the block pointers of the parent BPS into those of a child BPS by promoting attributes of block pointers in the child BPS from copy to source, avoiding any need to perform reference count updates on structures pointed to by promoted block pointers.
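
The source/copy distinction can be illustrated with a small model in which only source pointers count toward a structure's reference count. Everything named here (`refcounts`, `BlockPointer`, `new_parent`, `copy_child`, `overwrite`, `merge_parent_into_child`) is hypothetical, and the handling of child pointers that diverged before the merge is an assumption beyond what the abstract states.

```python
from collections import Counter
from dataclasses import dataclass

refcounts = Counter()  # pointed-to structure -> number of source pointers


@dataclass
class BlockPointer:
    target: str
    is_source: bool


def new_parent(targets):
    """Create a BPS whose pointers are sources, each counted once."""
    for target in targets:
        refcounts[target] += 1
    return [BlockPointer(t, True) for t in targets]


def copy_child(parent):
    """Create a child BPS; its pointers are copies, so no counts change."""
    return [BlockPointer(p.target, False) for p in parent]


def overwrite(bps, i, target):
    """Point slot i at a new structure; the new pointer is a source."""
    if bps[i].is_source:
        refcounts[bps[i].target] -= 1
    refcounts[target] += 1
    bps[i] = BlockPointer(target, True)


def merge_parent_into_child(parent, child):
    """Fold the parent into the child; return how many count updates were needed."""
    updates = 0
    for p, c in zip(parent, child):
        if not c.is_source:
            c.is_source = p.is_source   # promotion: the count is left untouched
        elif p.is_source:
            refcounts[p.target] -= 1    # assumed: a diverged slot drops the parent's reference
            updates += 1
    parent.clear()
    return updates


parent = new_parent(["A", "B", "C"])
child = copy_child(parent)
overwrite(child, 1, "B2")
updates = merge_parent_into_child(parent, child)
```

In this run only the one overwritten slot costs a reference count update; the two shared slots are merged by flipping an attribute on the child's pointers.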

    Systems and methods of amortizing deletion processing of a log structured storage based volume virtualization

    Publication No.: US11163446B1

    Publication Date: 2021-11-02

    Application No.: US15664852

    Filing Date: 2017-07-31

    Abstract: Techniques for amortizing metadata updates due to data delete operations in data storage systems that implement log structured storage of data from virtual volumes. The techniques employ a segment database (DB) and a deleted chunk DB. The segment DB is implemented as a key-value store. The deleted chunk DB is likewise implemented as a key-value store, but configured as a log structured merge (LSM) tree. By configuring the deleted chunk DB as an LSM-tree, more efficient use of memory and improved reduction of metadata updates can be achieved. Stored segments of log structured data can also be effectively “cleaned” in a background process that involves ordered traversals of the segment DB and the deleted chunk DB, allowing for more efficient recovery of storage space consumed by the deleted data chunks.
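
As a rough illustration of the two stores and the ordered cleaning pass, the sketch below models the deleted chunk DB as a very small LSM-style structure (an in-memory buffer plus sorted runs) and the segment DB as a plain dictionary. The key layout `(segment_id, chunk_id)`, the `memtable_limit` parameter, and the function names are assumptions made for the example, and no compaction of runs is modeled.

```python
import heapq


class DeletedChunkDB:
    """Toy LSM-style store: a small in-memory buffer plus immutable sorted runs."""

    def __init__(self, memtable_limit=2):
        self.memtable = set()
        self.runs = []
        self.memtable_limit = memtable_limit

    def delete(self, segment_id, chunk_id):
        # Deletes are only buffered here; segment metadata is not touched yet.
        self.memtable.add((segment_id, chunk_id))
        if len(self.memtable) >= self.memtable_limit:
            self.runs.append(sorted(self.memtable))   # flush as a sorted run
            self.memtable = set()

    def scan(self):
        """Iterate over all deleted (segment, chunk) keys in sorted order."""
        return heapq.merge(sorted(self.memtable), *self.runs)


def clean(segment_db, deleted_db):
    """Walk both stores in key order and keep only chunks that were not deleted."""
    deleted = deleted_db.scan()
    pending = next(deleted, None)
    cleaned = {}
    for seg_id in sorted(segment_db):
        keep = []
        for chunk_id in sorted(segment_db[seg_id]):
            key = (seg_id, chunk_id)
            while pending is not None and pending < key:
                pending = next(deleted, None)
            if pending != key:
                keep.append(chunk_id)
        if keep:
            cleaned[seg_id] = keep   # segments with no live chunks are reclaimed
    return cleaned


segment_db = {1: ["a", "b", "c"], 2: ["d", "e"]}
deleted_db = DeletedChunkDB()
for key in [(2, "d"), (1, "b"), (2, "e")]:
    deleted_db.delete(*key)
cleaned = clean(segment_db, deleted_db)
```

Deletes arrive in arbitrary order but are applied in one sorted sweep, so each segment is visited once during cleaning rather than once per delete.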

    Handling data that has become inactive within stream aware data storage equipment

    Publication No.: US10289566B1

    Publication Date: 2019-05-14

    Application No.: US15662669

    Filing Date: 2017-07-28

    Abstract: A technique involves, from an incoming flow of data that includes a first stream from a first source and another stream from another source, placing data of the first stream into first storage segments and data of the other stream into other storage segments that are different from the first storage segments. The technique further involves, while some of the data of the first stream becomes invalidated over time and while a garbage collection service consolidates remaining valid data of the first stream together within the first segments, tracking the number of times the remaining valid data of the first stream is consolidated together within the first segments by the garbage collection service. The technique further involves comingling (i) remaining valid data of the first stream which has been consolidated together a predefined number of times within the first segments with (ii) the data of the other stream.
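
The consolidation counter described above might be modeled as follows. This is a sketch with hypothetical names (`Block`, `gc_count`, `garbage_collect`, `limit`); segments are simplified to flat lists of blocks, and each call stands for one garbage collection pass over the first stream's segments.

```python
from dataclasses import dataclass


@dataclass
class Block:
    data: str
    valid: bool = True
    gc_count: int = 0   # times garbage collection has consolidated this block


def garbage_collect(first_segments, other_segments, limit):
    """Consolidate valid first-stream blocks; retire long-lived ones to the other stream."""
    survivors = []
    for block in first_segments:
        if not block.valid:
            continue                       # invalidated data is dropped
        block.gc_count += 1                # track each consolidation
        if block.gc_count >= limit:
            other_segments.append(block)   # mix with the other stream's data
        else:
            survivors.append(block)
    return survivors


first = [Block("a"), Block("b"), Block("c")]
other = [Block("x")]
first[1].valid = False                                # "b" is overwritten by the host
first = garbage_collect(first, other, limit=2)        # pass 1: "a" and "c" survive
after_first_pass = [b.data for b in first]
first = garbage_collect(first, other, limit=2)        # pass 2: both reach the limit
```

Data that keeps surviving collection is treated as inactive and stops occupying the first stream's segments, so later passes over those segments have less to copy.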

    Fast object snapshot via background processing

    Publication No.: US11194760B1

    Publication Date: 2021-12-07

    Application No.: US15662483

    Filing Date: 2017-07-28

    Abstract: Techniques for creating snapshots of data storage objects that can perform certain operations (e.g., flushing dirty data, setting up extent pointers, allocating block storage space, etc.) during background (or deferred) processing. The disclosed techniques employ one or more extent copy trackers that can be created during processing of a transaction, while I/O requests from host computers are suspended. The extent copy trackers are configured to perform some or all of the certain operations in the background, after the transaction has been committed and/or the processing of the transaction has been completed. By performing such operations during background processing, a processing time required to complete the snapshot transaction is reduced, thereby reducing latency in the resumption of the I/O requests from the host computers.

    Handling pattern identifiers in a data storage system

    Publication No.: US10983705B2

    Publication Date: 2021-04-20

    Application No.: US16397493

    Filing Date: 2019-04-29

    Abstract: Techniques for handling pattern identifiers in a data storage system. By replacing a block pointer with a pattern identifier, the techniques can identify a data block (or an indirect data block) as a bad block, without resorting to the use of a separate flag or bad block (BB) bit in per-block metadata (e.g., a mapping pointer) of the data block. The techniques can also avoid waste of valuable metadata space by using pattern identifiers at various levels of a mapping tree, leveraging pointer granularity at lower levels, mid-levels, and progressively higher levels of the mapping tree.
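
One way to picture a pattern identifier standing in for a block pointer is to reserve a few pointer values that never refer to real storage. The sketch below assumes exactly that; the reserved range, the constant names, the zero-filled pattern, and the dictionary used as a device are all hypothetical and are not taken from the patent. Pattern identifiers at higher levels of a mapping tree are not modeled.

```python
BLOCK_SIZE = 16
FIRST_DATA_ADDRESS = 16   # assumed: pointer values below this are pattern identifiers
PATTERN_ZERO_FILLED = 0   # assumed pattern: block reads back as all zeros
PATTERN_BAD_BLOCK = 1     # assumed pattern: block is known to be bad


class BadBlockError(Exception):
    pass


def read_through(pointer, device):
    """Resolve one mapping-pointer slot without consulting any per-block flag bit."""
    if pointer >= FIRST_DATA_ADDRESS:
        return device[pointer]           # ordinary block pointer
    if pointer == PATTERN_BAD_BLOCK:
        raise BadBlockError("mapping slot marks a bad block")
    if pointer == PATTERN_ZERO_FILLED:
        return bytes(BLOCK_SIZE)         # nothing is stored for this block
    raise ValueError(f"unknown pattern identifier {pointer}")


device = {16: b"x" * BLOCK_SIZE}
data = read_through(16, device)
zeros = read_through(PATTERN_ZERO_FILLED, device)
```

Since the bad-block state lives in the pointer slot itself, the per-block metadata needs no extra bit for it.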

    User stream aware file systems with user stream detection

    Publication No.: US10929066B1

    Publication Date: 2021-02-23

    Application No.: US16526391

    Filing Date: 2019-07-30

    Abstract: Techniques for handling multiple data streams in stream-aware data storage systems. The data storage systems can detect multiple sub-streams in an incoming stream of data, form a group of data blocks corresponding to each respective sub-stream, and associate, bind, and/or assign a stream ID to each data block in the respective sub-stream. The data storage systems can write each group of data blocks having the same stream ID to the same segment of a data log in one or more non-volatile storage devices, and manage and/or maintain, in persistent data storage, attribute information pertaining to the groups of data blocks in the respective sub-streams relative to time periods during which the respective groups of data blocks were written and/or received. The techniques can improve the detection of multiple sub-streams in an incoming stream of data, and improve the management of attribute information pertaining to data blocks in the respective sub-streams.
