MANAGING DATASETS OF A COGNITIVE STORAGE SYSTEM WITH A SPIKING NEURAL NETWORK

    Publication No.: US20190392303A1

    Publication date: 2019-12-26

    Application No.: US16015897

    Filing date: 2018-06-22

    IPC classification: G06N3/08 G06N3/04

    Abstract: A computer-implemented method for managing datasets of a storage system is provided, wherein the datasets have respective sets of metadata. The method includes: successively feeding first sets of metadata to a spiking neural network (SNN), the first sets of metadata corresponding to datasets of the storage system that are labeled with respect to the classes they belong to, so as to be associated with class labels, for the SNN to learn representations of said classes in terms of connection weights that weight the metadata fed; successively feeding second sets of metadata to the SNN, the second sets of metadata corresponding to unlabeled datasets of the storage system, for the SNN to infer class labels for the unlabeled datasets, based on the second sets of metadata fed and the representations learned; and managing datasets in the storage system based on the class labels of the datasets, including the inferred class labels.
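The learn-then-infer flow in this abstract can be illustrated with a heavily simplified toy, which is not the patented method. Everything below is an assumption made for illustration: the `encode` helper, the `ToySNN` class, the teacher-forced weight rule, the leaky integrate-and-fire readout, and the example metadata keys and policies.

```python
# Toy sketch only: one input neuron per (key, value) metadata pair,
# one leaky integrate-and-fire output neuron per class.

def encode(metadata, vocabulary):
    """Return 1.0 for each (key, value) pair present in the metadata, else 0.0."""
    return [1.0 if metadata.get(key) == value else 0.0 for key, value in vocabulary]


class ToySNN:
    def __init__(self, n_inputs, classes, lr=0.2, threshold=1.0, leak=0.9):
        self.classes = list(classes)
        self.w = [[0.0] * n_inputs for _ in self.classes]  # connection weights
        self.lr, self.threshold, self.leak = lr, threshold, leak

    def train(self, spikes, label):
        """Teacher-forced update (assumed rule): the labeled class neuron is
        treated as firing; weights from spiking inputs grow, others decay."""
        row = self.w[self.classes.index(label)]
        for i, s in enumerate(spikes):
            row[i] += self.lr * (s - row[i])

    def infer(self, spikes, steps=20):
        """Present the input at every time step; the neuron with the highest
        potential once any neuron reaches threshold gives the inferred class."""
        v = [0.0] * len(self.classes)
        for _ in range(steps):
            v = [self.leak * vk + sum(w * s for w, s in zip(row, spikes))
                 for vk, row in zip(v, self.w)]
            if max(v) >= self.threshold:
                return self.classes[v.index(max(v))]
        return None  # no neuron fired: leave the dataset unlabeled


vocabulary = [("owner", "alice"), ("owner", "bob"), ("type", "fits"),
              ("type", "log"), ("project", "x"), ("project", "y")]
labeled = [({"owner": "alice", "type": "fits", "project": "x"}, "science"),
           ({"owner": "bob", "type": "log", "project": "y"}, "ops")]

snn = ToySNN(len(vocabulary), ["science", "ops"])
for _ in range(10):                       # feed the labeled metadata repeatedly
    for metadata, label in labeled:
        snn.train(encode(metadata, vocabulary), label)

unlabeled = {"owner": "alice", "type": "log", "project": "x"}
inferred = snn.infer(encode(unlabeled, vocabulary))

# Hypothetical management policies keyed by class label.
policies = {"science": "keep on disk", "ops": "archive to tape"}
action = policies.get(inferred, "leave in place")
```

The last two lines stand in for the management step: once a label is available, whether given or inferred, a storage policy is looked up from it.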

    Subtier-Level Data Assignment in a Tiered Storage System

    Publication No.: US20180107425A1

    Publication date: 2018-04-19

    Application No.: US15297750

    Filing date: 2016-10-19

    IPC classification: G06F3/06

    Abstract: An embodiment is directed to a method for determining an assignment of data to be stored on at least one storage tier i of a plurality of storage tiers of a tiered storage system. The method includes, for the at least one storage tier i, the steps of: accessing storage device characteristics of the at least one storage tier i; based on the accessed storage device characteristics, splitting the at least one storage tier i into Ni storage subtiers, the Ni storage subtiers having respective storage device characteristics; and, based on characteristics of the data to be stored on the tiered storage system and the respective storage device characteristics of the Ni storage subtiers, determining an assignment of data to be stored on each of the Ni storage subtiers. Embodiments are directed to related methods, systems and computer program products.
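A minimal sketch of the split-then-assign idea follows. It assumes latency as the device characteristic, access frequency as the data characteristic, and a greedy placement rule; the abstract does not specify any of these, and the function and field names are hypothetical.

```python
def split_into_subtiers(devices, n_subtiers):
    """Sort a tier's devices by latency and cut them into at most n_subtiers
    contiguous groups, fastest group first (assumed splitting criterion)."""
    ordered = sorted(devices, key=lambda d: d["latency_ms"])
    size = -(-len(ordered) // n_subtiers)  # ceiling division
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]


def assign(data_items, subtiers):
    """Greedy assumption: place the most frequently accessed data first,
    on the fastest subtier that still has room. None means it did not fit."""
    free = [sum(d["capacity_gb"] for d in tier) for tier in subtiers]
    placement = {}
    for item in sorted(data_items, key=lambda x: x["accesses_per_day"], reverse=True):
        placement[item["name"]] = None
        for k, room in enumerate(free):
            if item["size_gb"] <= room:
                free[k] -= item["size_gb"]
                placement[item["name"]] = k
                break
    return placement


# One tier of four devices with differing latencies, split into two subtiers.
tier = [{"latency_ms": 12, "capacity_gb": 100}, {"latency_ms": 4, "capacity_gb": 100},
        {"latency_ms": 9, "capacity_gb": 100}, {"latency_ms": 5, "capacity_gb": 100}]
subtiers = split_into_subtiers(tier, 2)

data = [{"name": "hot", "size_gb": 150, "accesses_per_day": 900},
        {"name": "warm", "size_gb": 100, "accesses_per_day": 50},
        {"name": "cold", "size_gb": 80, "accesses_per_day": 1}]
placement = assign(data, subtiers)
```

Here subtier 0 holds the two fastest devices. The hot item lands there, and the warm and cold items fall through to subtier 1 because subtier 0 has only 50 GB left.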

    DATASET RELEVANCE ESTIMATION IN STORAGE SYSTEMS

    Publication No.: US20190243546A1

    Publication date: 2019-08-08

    Application No.: US16390214

    Filing date: 2019-04-22

    IPC classification: G06F3/06

    Abstract: The invention is notably directed to computer-implemented methods and systems for managing datasets in a storage system. In such systems, it is assumed that a (typically small) subset of the datasets is labeled with respect to relevance, so that each labeled dataset is associated with a relevance value. Essentially, the present methods determine, for each unlabeled dataset, a respective probability distribution over a set of relevance values. From this probability distribution, a corresponding relevance value can be obtained. The probability distribution is computed based on distances (or similarities), in terms of metadata values, between the unlabeled dataset and the labeled datasets. Based on their associated relevance values, datasets can then be efficiently managed in the storage system.
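One way to picture the distance-based distribution is a kernel-weighted vote over the labeled datasets. The sketch below assumes a mismatch-fraction distance and an exponential kernel with a `bandwidth` parameter; these choices, and all names, are illustrative assumptions and not taken from the patent.

```python
import math
from collections import defaultdict


def distance(a, b):
    """Fraction of metadata keys on which two datasets disagree (assumed metric)."""
    keys = set(a) | set(b)
    if not keys:
        return 0.0
    return sum(a.get(k) != b.get(k) for k in keys) / len(keys)


def relevance_distribution(unlabeled, labeled, bandwidth=0.5):
    """Probability of each relevance value; closer labeled datasets weigh more.
    Expects a non-empty list of (metadata, relevance) pairs."""
    weights = defaultdict(float)
    for metadata, relevance in labeled:
        weights[relevance] += math.exp(-distance(unlabeled, metadata) / bandwidth)
    total = sum(weights.values())
    return {r: w / total for r, w in weights.items()}


def expected_relevance(dist):
    """One possible way to reduce the distribution to a single relevance value."""
    return sum(r * p for r, p in dist.items())


labeled = [({"owner": "alice", "type": "fits"}, 3),
           ({"owner": "bob", "type": "log"}, 1)]
unlabeled = {"owner": "alice", "type": "fits", "project": "x"}

dist = relevance_distribution(unlabeled, labeled)
most_likely = max(dist, key=dist.get)
mean_value = expected_relevance(dist)
```

Taking the most probable value or the expected value are both plausible ways to obtain a single relevance value from the distribution; the abstract leaves that choice open.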

    METHOD FOR CONTROLLING A STORAGE SYSTEM

    Publication No.: US20190079689A1

    Publication date: 2019-03-14

    Application No.: US15702807

    Filing date: 2017-09-13

    IPC classification: G06F3/06

    Abstract: Predictively selecting a subset of disks of a storage system to be spun up, including: providing metadata of data entities stored on the disks of the storage system; estimating data entity access probabilities for a prediction time window based on said metadata, each data entity access probability being indicative of the probability of access to a certain data entity within said prediction time window; calculating disk access probabilities for the prediction time window based on the estimated data entity access probabilities, each disk access probability being indicative of the probability of access to a certain disk within said prediction time window; estimating the number of disks to be spun up in a certain prediction time window; dynamically adapting the data entity threshold value and/or the disk access threshold value; and selecting a subset of disks to be spun up in the following prediction time window.
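The step from per-entity to per-disk access probabilities can be sketched as follows, assuming accesses to different data entities are independent, so that a disk is accessed unless every entity on it is left untouched. The independence assumption, the hit-rate-driven threshold update, and all names are illustrative and not stated in the abstract.

```python
def disk_access_probabilities(entity_probs, entity_to_disk):
    """P(disk accessed) = 1 - product of (1 - p) over the entities on that disk,
    under the assumption that entity accesses are independent."""
    miss = {}
    for entity, p in entity_probs.items():
        disk = entity_to_disk[entity]
        miss[disk] = miss.get(disk, 1.0) * (1.0 - p)
    return {disk: 1.0 - m for disk, m in miss.items()}


def select_disks(disk_probs, threshold, max_disks):
    """Pick at most max_disks disks whose access probability meets the threshold,
    most likely first."""
    ranked = sorted(disk_probs, key=disk_probs.get, reverse=True)
    return [d for d in ranked if disk_probs[d] >= threshold][:max_disks]


def adapt_threshold(threshold, hit_rate, target=0.9, step=0.05):
    """Hypothetical adaptation rule: if too few accesses hit spun-up disks,
    lower the threshold so that more disks are spun up next window."""
    if hit_rate < target:
        return max(0.0, threshold - step)
    return min(1.0, threshold + step)


entity_probs = {"a": 0.5, "b": 0.5, "c": 0.1}       # estimated from metadata
entity_to_disk = {"a": "d1", "b": "d1", "c": "d2"}

disk_probs = disk_access_probabilities(entity_probs, entity_to_disk)
to_spin_up = select_disks(disk_probs, threshold=0.5, max_disks=4)
next_threshold = adapt_threshold(0.5, hit_rate=0.8)
```

With two entities at probability 0.5 on disk d1, the disk probability is 1 - 0.5 × 0.5 = 0.75, so d1 is selected and d2 is not.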