Sequence invalidation consolidation in a storage system

    Publication No.: US11449485B1

    Publication date: 2022-09-20

    Application No.: US15641011

    Filing date: 2017-07-03

    IPC classification: G06F16/22 G06F16/23

    Abstract: A method for tracking valid and invalid sequence numbers in a storage system, performed by a processor, is provided. The method includes establishing a table as a key value store in memory in the storage system. The table has sequence numbers as keys and represents valid sequence numbers and invalidated sequence numbers of an open-ended sequence relating to storage of data or metadata in the storage system. The method includes adding to the table an entry that records a first plurality of consecutive sequence numbers, as a first range-valued key associated with a first value indicating the first plurality of consecutive sequence numbers is valid. The method includes adding to the table an entry that records a deletion of a second plurality of consecutive sequence numbers, as a second range-valued key associated with a second value indicating the second plurality of consecutive sequence numbers is invalid.
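
The range-valued keys described in this abstract can be pictured with a small sketch. It is not the patented implementation: the class name `SequenceTable`, its method names, and the rule that an invalid range overrides any valid range covering the same number are all assumptions made for illustration.

```python
class SequenceTable:
    """Toy key-value table whose keys are inclusive (first, last) ranges."""

    def __init__(self):
        # (first, last) -> True if the range is valid, False if invalidated
        self.entries = {}

    def add_valid(self, first, last):
        # Record a run of consecutive sequence numbers as valid.
        self.entries[(first, last)] = True

    def add_deletion(self, first, last):
        # Record a deletion of a run of consecutive sequence numbers.
        self.entries[(first, last)] = False

    def is_valid(self, seq):
        # Assumption: any covering invalid range wins over valid ranges.
        covered = False
        for (first, last), valid in self.entries.items():
            if first <= seq <= last:
                if not valid:
                    return False
                covered = True
        return covered


table = SequenceTable()
table.add_valid(1, 100)      # one entry stands for 100 sequence numbers
table.add_deletion(20, 30)   # one entry invalidates 11 of them
```

Two entries are enough here to describe the state of 100 sequence numbers, which is the point of using ranges as keys rather than one key per number.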

    Replication across partitioning schemes in a distributed storage system

    Publication No.: US11281394B2

    Publication date: 2022-03-22

    Application No.: US16450632

    Filing date: 2019-06-24

    IPC classification: G06F12/00 G06F3/06 G06F12/10

    Abstract: A method of replication in a distributed storage system, performed by the distributed storage system, is provided. The method includes managing a first index for data or metadata in a first storage system, the first storage system having a first partitioning scheme. The method includes managing a second index for data or metadata in a second storage system, the second storage system having a second partitioning scheme. The method includes replicating the data or metadata from the first storage system to the second storage system, translating an identifier of the data or metadata from the first storage system, and mapping the replicated data or metadata into the second partitioning scheme, via the translating of the identifier of the data or metadata from the first storage system.
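
As a rough illustration of translating identifiers between partitioning schemes, the sketch below assumes that each system partitions integer keys by modulo over its own partition count and that an identifier is a `(partition, key)` pair. The names `Store`, `partition_for`, `translate`, and `replicate` are hypothetical; the abstract does not specify any of these details.

```python
def partition_for(key, num_partitions):
    # Assumed partitioning scheme: simple modulo on an integer key.
    return key % num_partitions


class Store:
    """Toy storage system with its own partition count and index."""

    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        self.index = {}  # (partition, key) -> value

    def put(self, key, value):
        ident = (partition_for(key, self.num_partitions), key)
        self.index[ident] = value
        return ident


def translate(ident, target):
    # Re-derive the partition under the target's scheme; the key is kept.
    _, key = ident
    return (partition_for(key, target.num_partitions), key)


def replicate(source, target):
    # Copy every item, mapping it into the target's partitioning scheme.
    for ident, value in source.index.items():
        target.index[translate(ident, target)] = value


source = Store(num_partitions=4)
target = Store(num_partitions=3)
source.put(10, "block-a")   # stored under partition 10 % 4 == 2
replicate(source, target)   # lands under partition 10 % 3 == 1
```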

    Efficient coding in a storage system

    Publication No.: US10942869B2

    Publication date: 2021-03-09

    Application No.: US16725639

    Filing date: 2019-12-23

    Abstract: A method for efficient name coding in a storage system is provided. The method includes identifying common prefixes, common suffixes, and midsections of a plurality of strings in the storage system, and writing the common prefixes, midsections and common suffixes to a string table in the storage system. The method includes encoding each string of the plurality of strings as to position in the string table of prefix, midsection and suffix of the string, and writing the encoding of each string to memory in the storage system.
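
The string-table encoding can be sketched as follows. How the patent identifies common prefixes and suffixes is not stated in the abstract, so this example assumes file-path-like names and simply treats the directory part as the prefix and the extension as the suffix. The functions `split_name`, `encode`, and `decode` are hypothetical names.

```python
def split_name(name):
    # Assumed split: prefix ends at the last "/", suffix starts at the
    # last "." of the final path component, midsection is what remains.
    slash = name.rfind("/") + 1          # 0 when there is no "/"
    dot = name.rfind(".")
    if dot < slash:                      # no "." in the final component
        dot = len(name)
    return name[:slash], name[slash:dot], name[dot:]


def encode(names):
    table = []       # the string table: each distinct part stored once
    position = {}    # part -> index in the table

    def intern(part):
        if part not in position:
            position[part] = len(table)
            table.append(part)
        return position[part]

    # Each name becomes (prefix index, midsection index, suffix index).
    codes = [tuple(intern(p) for p in split_name(n)) for n in names]
    return table, codes


def decode(table, code):
    # Rebuild the original string from its three table positions.
    return "".join(table[i] for i in code)


names = ["/vol/a/file1.log", "/vol/a/file2.log", "/vol/b/file1.txt"]
table, codes = encode(names)
```

With these three names the table holds six distinct parts, and each name is stored as three small integers instead of the full string.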

    Search acceleration for artificial intelligence

    Publication No.: US10915813B2

    Publication date: 2021-02-09

    Application No.: US16449241

    Filing date: 2019-06-21

    IPC classification: G06N3/08 G06N5/04

    Abstract: An apparatus for artificial intelligence acceleration is provided. The apparatus includes a storage and compute system having a distributed, redundant key value store for metadata. The storage and compute system has distributed compute resources configurable to access, through a plurality of authorities, data in solid-state memory, run inference with a deep learning model, generate vectors for the data, and store the vectors in the key value store.

    Thinning databases for garbage collection

    Publication No.: US11262929B2

    Publication date: 2022-03-01

    Application No.: US16730066

    Filing date: 2019-12-30

    IPC classification: G06F12/00 G06F3/06 G06F12/02

    Abstract: An implementation of the disclosure provides a system comprising a storage array comprising a storage controller coupled to the storage array. The storage controller comprising a processing device to remap a plurality of deduplication references in a deduplication map to point to an earlier occurrence of duplicate data of a data block for the deduplication map. The processing device further to update an entry of the deduplication map associated with the plurality of deduplication references with a record indicating that the entry is no longer referenced and trim the entry from the deduplication map that is associated with the record.
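
The remap, mark, and trim sequence in this abstract can be sketched with plain dictionaries. The data layout is an assumption: entry ids that grow over time, a fingerprint field `fp` per entry, a `tombstone` flag standing in for the "no longer referenced" record, and a `refs` dictionary mapping reference names to entry ids. The function name `thin` is hypothetical.

```python
def thin(entries, refs):
    """Remap references to the earliest duplicate, then trim dead entries.

    entries: dict entry_id -> {"fp": fingerprint}; lower id means earlier.
    refs:    dict reference name -> entry_id.
    """
    # Find the earliest entry for each fingerprint.
    earliest = {}
    for eid in sorted(entries):
        earliest.setdefault(entries[eid]["fp"], eid)

    # Step 1: remap every reference to the earliest occurrence.
    for name, eid in refs.items():
        refs[name] = earliest[entries[eid]["fp"]]

    # Step 2: record that later duplicates are no longer referenced.
    for eid, entry in entries.items():
        entry["tombstone"] = eid != earliest[entry["fp"]]

    # Step 3: trim the entries that carry the record.
    for eid in [e for e, entry in entries.items() if entry["tombstone"]]:
        del entries[eid]


entries = {1: {"fp": "aa"}, 2: {"fp": "bb"}, 3: {"fp": "aa"}}
refs = {"x": 3, "y": 1, "z": 2}
thin(entries, refs)   # "x" now points at entry 1; entry 3 is trimmed
```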

    Tombstones for no longer relevant deduplication entries

    Publication No.: US10528280B1

    Publication date: 2020-01-07

    Application No.: US15420726

    Filing date: 2017-01-31

    IPC classification: G06F12/00 G06F3/06 G06F12/02

    Abstract: An implementation of the disclosure provides a system comprising a storage array comprising a plurality of data blocks and a storage controller coupled to the storage array. The storage controller comprising a processing device to identify a canonical instance of a data block in a vector associated with a deduplication map. The vector represents a plurality of updates to the deduplication map over a determined time period. A deduplication reference representing duplicate data of the data block in the storage array is selected from the deduplication map. The deduplication reference is remapped in the deduplication map to point to the canonical instance. Based on the remapping, an entry in the deduplication map for the deduplication reference is updated with a record. Responsive to detecting that the entry is in a location associated with an original entry of the data block in the deduplication map, delete the entry with the record.

    Memory use and eviction in a deduplication storage system

    Publication No.: US09940060B1

    Publication date: 2018-04-10

    Application No.: US15331181

    Filing date: 2016-10-21

    IPC classification: G06F12/00 G06F3/06

    Abstract: The method includes storing data including an index summary (IS) and a deduplication map (DDM) in volatile memory of a deduplication system. The method also includes detecting that the stored data exceeds a data allocation size limit for the volatile memory. The method includes evicting the data from the volatile memory using a memory eviction policy to meet the data allocation size limit. The method further includes performing a first eviction by evicting the DDM levels from an oldest DDM level to a newest DDM level until a first one of the data allocation size limit or a DDM threshold is met. The method also includes performing a second eviction by evicting the IS levels from an oldest IS level to a newest IS level until a first one of the data allocation size limit or IS threshold is met in response to the data allocation size limit not being met by the first eviction.
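
The two-stage eviction order can be sketched as below. The abstract does not define the thresholds, so this example assumes each threshold is a floor on the memory that the DDM or the IS may be reduced to, and that each level is represented only by its size. The function name `evict` and its parameters are hypothetical.

```python
def evict(ddm_levels, is_levels, limit, ddm_floor, is_floor):
    """Return the (DDM, IS) level sizes that remain in memory.

    Both lists are ordered oldest level first. Assumption: a threshold
    is "met" once the structure's total size is at or below its floor.
    """
    ddm = list(ddm_levels)
    isl = list(is_levels)

    def total():
        return sum(ddm) + sum(isl)

    # First eviction: DDM levels, oldest first, until the allocation
    # limit or the DDM threshold is met, whichever comes first.
    while ddm and total() > limit and sum(ddm) > ddm_floor:
        ddm.pop(0)

    # Second eviction: IS levels, oldest first, only if the first
    # eviction did not bring usage within the allocation limit.
    while isl and total() > limit and sum(isl) > is_floor:
        isl.pop(0)

    return ddm, isl


remaining = evict([40, 30, 20], [30, 20, 10],
                  limit=60, ddm_floor=25, is_floor=10)
```

In this call the DDM eviction stops at its floor with usage still above the limit, so the oldest IS level is evicted as well.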