Consistent filtering of machine learning data

    Publication No.: US10540606B2

    Publication date: 2020-01-21

    Application No.: US14460314

    Application date: 2014-08-14

    Abstract: Consistency metadata, including a parameter for a pseudo-random number source, are determined for training-and-evaluation iterations of a machine learning model. Using the metadata, a first training set comprising records of at least a first chunk is identified from a plurality of chunks of a data set. The first training set is used to train a machine learning model during a first training-and-evaluation iteration. A first test set comprising records of at least a second chunk is identified using the metadata, and is used to evaluate the model during the first training-and-evaluation iteration.
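
The idea in this abstract, reusing one pseudo-random number parameter so that training and test sets are drawn consistently from the same chunks, can be illustrated with a minimal sketch. It is not the patented implementation: the function `split_chunks`, the `metadata` dictionary layout, and the 80/20 split are assumptions made only for illustration.

```python
import random

def split_chunks(num_chunks, seed, train_fraction=0.8):
    """Return (train_ids, test_ids) for one training-and-evaluation iteration.

    The seed stands in for the pseudo-random number source parameter that
    the abstract says is kept in the consistency metadata (an assumption).
    """
    rng = random.Random(seed)          # private generator, independent of global state
    order = list(range(num_chunks))    # chunk identifiers 0..num_chunks-1
    rng.shuffle(order)                 # same seed always gives the same order
    cut = int(num_chunks * train_fraction)
    return sorted(order[:cut]), sorted(order[cut:])

# Hypothetical consistency metadata shared by both selections.
metadata = {"seed": 12345, "num_chunks": 10}

train_a, test_a = split_chunks(metadata["num_chunks"], metadata["seed"])
train_b, test_b = split_chunks(metadata["num_chunks"], metadata["seed"])
```

Because both calls construct their generator from the same stored seed, the second call reproduces the first split exactly, and no chunk appears in both the training set and the test set.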

    CONSISTENT FILTERING OF MACHINE LEARNING DATA

    Publication No.: US20230126005A1

    Publication date: 2023-04-27

    Application No.: US18146075

    Application date: 2022-12-23

    Abstract: Consistency metadata, including a parameter for a pseudo-random number source, are determined for training-and-evaluation iterations of a machine learning model. Using the metadata, a first training set comprising records of at least a first chunk is identified from a plurality of chunks of a data set. The first training set is used to train a machine learning model during a first training-and-evaluation iteration. A first test set comprising records of at least a second chunk is identified using the metadata, and is used to evaluate the model during the first training-and-evaluation iteration.

    CONSISTENT FILTERING OF MACHINE LEARNING DATA

    Publication No.: US20200034742A1

    Publication date: 2020-01-30

    Application No.: US16591521

    Application date: 2019-10-02

    Abstract: Consistency metadata, including a parameter for a pseudo-random number source, are determined for training-and-evaluation iterations of a machine learning model. Using the metadata, a first training set comprising records of at least a first chunk is identified from a plurality of chunks of a data set. The first training set is used to train a machine learning model during a first training-and-evaluation iteration. A first test set comprising records of at least a second chunk is identified using the metadata, and is used to evaluate the model during the first training-and-evaluation iteration.

    Consistent filtering of machine learning data

    Publication No.: US11544623B2

    Publication date: 2023-01-03

    Application No.: US16591521

    Application date: 2019-10-02

    Abstract: Consistency metadata, including a parameter for a pseudo-random number source, are determined for training-and-evaluation iterations of a machine learning model. Using the metadata, a first training set comprising records of at least a first chunk is identified from a plurality of chunks of a data set. The first training set is used to train a machine learning model during a first training-and-evaluation iteration. A first test set comprising records of at least a second chunk is identified using the metadata, and is used to evaluate the model during the first training-and-evaluation iteration.

    Input processing for machine learning

    Publication No.: US11100420B2

    Publication date: 2021-08-24

    Application No.: US14460312

    Application date: 2014-08-14

    Abstract: A record extraction request for a data set is received at a machine learning service. A plan to perform one or more chunk-level operations (such as sampling, shuffling, splitting or partitioning for parallel computation) on chunks of the data set is generated. A set of data transfers that results in a particular chunk being stored in a particular server's memory is initiated to implement the first chunk-level operation of the sequence. A second operation such as another filtering operation or a feature processing operation is performed on a result set of the first chunk-level operation.
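
The two-stage pattern this abstract describes, a chunk-level operation followed by a second operation on its result set, might look roughly like the sketch below. It is an illustration under stated assumptions, not the service's actual plan format: `chunk_level_plan` and `record_level_filter` are hypothetical names, chunks are modeled as in-memory lists, and the data transfers to server memory mentioned in the abstract are not modeled.

```python
import random

def chunk_level_plan(chunks, seed, sample_fraction):
    """Chunk-level stage: shuffle whole chunks, then keep a sample of them."""
    rng = random.Random(seed)
    order = list(range(len(chunks)))
    rng.shuffle(order)                                # chunk-level shuffle
    keep = max(1, int(len(order) * sample_fraction))  # keep at least one chunk
    return [chunks[i] for i in order[:keep]]          # chunk-level sample

def record_level_filter(selected_chunks, predicate):
    """Second stage: filter individual records in the chunk-level result set."""
    return [r for chunk in selected_chunks for r in chunk if predicate(r)]

# Toy data set of four chunks with three records each.
chunks = [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10, 11]]

selected = chunk_level_plan(chunks, seed=7, sample_fraction=0.5)
evens = record_level_filter(selected, lambda r: r % 2 == 0)
```

Working on whole chunks first keeps the expensive stage coarse-grained, so the per-record filter only touches the chunks that survived the first stage.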
