Fingerprint-based data classification

    Publication No.: US11886468B2

    Publication date: 2024-01-30

    Application No.: US17541704

    Filing date: 2021-12-03

    CPC classification number: G06F16/285 G06F16/221 G06F16/2264 G06N20/00

    Abstract: Systems and methods are provided for automated classification of data using fingerprints. In embodiments, a method includes: generating, by a computing device based on predetermined rules, a fingerprint of a data column in a data set to be classified, the fingerprint comprising dimensions, wherein each of the dimensions is assigned an attribute representing a characteristic of data in the data column; determining, by the computing device, that the fingerprint matches one or more target fingerprints by comparing the fingerprint to the target fingerprints, wherein each target fingerprint is associated with a class and includes dimensions, and each dimension is assigned an attribute representing a characteristic of data in the class; and assigning, by the computing device, one or more classes to the data column based on the one or more target fingerprints, thereby generating classified data.
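
The matching step in this abstract can be illustrated with a minimal sketch. The dimension names (`char_type`, `length`, `uniqueness`), the rules that assign attributes, and the two target fingerprints are all hypothetical; the abstract does not specify them. A target is treated as matching when every dimension it defines carries the same attribute in the column's fingerprint, which is only one plausible reading of "matches".

```python
from typing import Dict, Iterable, List

Fingerprint = Dict[str, str]  # dimension name -> attribute


def make_fingerprint(column: Iterable[object]) -> Fingerprint:
    """Apply fixed (hypothetical) rules; each dimension gets one attribute."""
    values = [str(v) for v in column if v is not None]
    lengths = {len(v) for v in values}
    numeric = bool(values) and all(v.isdigit() for v in values)
    return {
        "char_type": "numeric" if numeric else "mixed",
        "length": "fixed" if len(lengths) == 1 else "variable",
        "uniqueness": "unique" if len(set(values)) == len(values) else "repeating",
    }


# Hypothetical target fingerprints, one per class.
TARGETS: Dict[str, Fingerprint] = {
    "zip_code": {"char_type": "numeric", "length": "fixed"},
    "free_text": {"char_type": "mixed", "length": "variable"},
}


def classify(column: Iterable[object]) -> List[str]:
    """Return every class whose target fingerprint agrees with the column's."""
    fp = make_fingerprint(column)
    return sorted(
        name
        for name, target in TARGETS.items()
        if all(fp.get(dim) == attr for dim, attr in target.items())
    )
```

Calling `classify(["10001", "94105", "60601"])` compares the column's fingerprint against both targets and keeps only the classes that agree on every target dimension.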

    New Data Class Generation Based on Static Reference Data

    Publication No.: US20240386032A1

    Publication date: 2024-11-21

    Application No.: US18317475

    Filing date: 2023-05-15

    Abstract: New data class generation is provided. A dimension score is generated for each respective dimension of a plurality of predefined dimensions as relating to column attributes of a data asset while performing a static reference data analysis of the data asset. The dimension score of each respective dimension is added together to obtain a total dimension score for the data asset. It is determined whether the total dimension score of the data asset is greater than a predefined minimum dimension score threshold level. The data asset is identified as new static reference data in response to determining that the total dimension score of the data asset is greater than the predefined minimum dimension score threshold level. A new data class is generated based on the new static reference data.
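
The scoring rule described here is a sum followed by a threshold test. A minimal sketch, assuming integer scores and invented dimension names (the abstract names neither the dimensions nor the scale):

```python
from typing import Dict


def total_dimension_score(dimension_scores: Dict[str, int]) -> int:
    """Add the per-dimension scores together."""
    return sum(dimension_scores.values())


def is_new_static_reference_data(
    dimension_scores: Dict[str, int], min_total_score: int
) -> bool:
    """True only when the total is strictly greater than the threshold."""
    return total_dimension_score(dimension_scores) > min_total_score


# Hypothetical dimensions and scores for one data asset.
scores = {"low_cardinality": 3, "short_values": 2, "rarely_updated": 4}
```

With these illustrative numbers the total is 9, so the asset qualifies against a threshold of 8 but not against a threshold of 9, because the comparison is strict.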

    Mutual Exclusion Data Class Analysis in Data Governance

    Publication No.: US20230297596A1

    Publication date: 2023-09-21

    Application No.: US17654858

    Filing date: 2022-03-15

    CPC classification number: G06F16/285 G06F16/221

    Abstract: Performing a mutual exclusion data class analysis is provided. A data class group of a plurality of data class groups that a matching data class is a member of is identified. The matching data class matches data in a plurality of rows of a column in a data asset. Data classes included in the data class group that the matching data class is a member of are identified. A mutual exclusion data class is filtered from the data class group to form a filtered data class group for the column. The filtered data class group is run against the column of the data asset, decreasing processing time and resource utilization of a computer.
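
The filtering step can be sketched as a lookup followed by a set difference. The group names, class names, and the exclusion table below are hypothetical; the abstract does not say how mutual exclusion is recorded.

```python
from typing import Dict, List, Set

# Hypothetical data class groups.
GROUPS: Dict[str, List[str]] = {
    "personal_identifiers": ["us_ssn", "passport_number", "phone_number"],
    "geography": ["country_code", "city"],
}

# Hypothetical table: classes that cannot co-occur with the key class.
MUTUALLY_EXCLUSIVE: Dict[str, Set[str]] = {
    "us_ssn": {"phone_number"},
}


def filtered_group(matching_class: str) -> List[str]:
    """Find the group of the matching class and drop its excluded classes."""
    for classes in GROUPS.values():
        if matching_class in classes:
            excluded = MUTUALLY_EXCLUSIVE.get(matching_class, set())
            return [c for c in classes if c not in excluded]
    return []  # the class belongs to no known group
```

Only the classes returned by `filtered_group` would then be evaluated against the column, which is where the claimed saving in processing time would come from.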

    Learning-based automation machine learning code annotation in computational notebooks

    Publication No.: US11360763B2

    Publication date: 2022-06-14

    Application No.: US17069402

    Filing date: 2020-10-13

    Abstract: One embodiment of the invention provides a method for automated code annotation in machine learning (ML) and data science. The method comprises receiving, as input, a section of executable code. The method further comprises classifying, via a ML model, the section of executable code with a stage classification label indicative of a stage within a workflow for automated ML that the executable code applies to. The method further comprises categorizing, based on the stage classification label, the section of executable code with a category of annotation that is most appropriate for the section of executable code. The method further comprises generating a suggested annotation for the section of executable code based on the category of annotation. The method further comprises providing, as output, the suggested annotation to a display of an electronic device for user review. The suggested annotation is user interactable via the electronic device.
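
The three stages of this method (classify the code, pick an annotation category, generate a suggestion) can be sketched as a small pipeline. The abstract calls for an ML model as the classifier; the keyword lookup below is only a stand-in so that the example runs, and the stage labels and annotation categories are invented for illustration.

```python
from typing import Dict, Tuple

# Stand-in for the ML stage classifier: hypothetical keyword rules.
STAGE_KEYWORDS: Dict[str, Tuple[str, ...]] = {
    "data_loading": ("read_csv", "load("),
    "preprocessing": ("fillna", "dropna", "StandardScaler"),
    "training": (".fit(",),
    "evaluation": ("accuracy_score", ".score("),
}

# Hypothetical mapping from stage label to annotation category.
ANNOTATION_CATEGORY: Dict[str, str] = {
    "data_loading": "describe the data source",
    "preprocessing": "explain the transformation",
    "training": "explain model and hyperparameters",
    "evaluation": "interpret the metric",
}


def classify_stage(code: str) -> str:
    """Return the first stage whose keywords appear in the code section."""
    for stage, keywords in STAGE_KEYWORDS.items():
        if any(keyword in code for keyword in keywords):
            return stage
    return "unknown"


def suggest_annotation(code: str) -> str:
    """Build a suggested annotation from the stage and its category."""
    stage = classify_stage(code)
    category = ANNOTATION_CATEGORY.get(stage, "general comment")
    return f"# {stage}: {category}"
```

In the described system the suggestion would be shown to the user for review rather than inserted automatically; this sketch only returns the string.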

    Multi-bit error correction method and apparatus based on a BCH code and memory system

    Publication No.: US10243589B2

    Publication date: 2019-03-26

    Application No.: US15240648

    Filing date: 2016-08-18

    Abstract: Exemplary embodiments for providing multi-bit error correction based on a BCH code are provided. In one such embodiment, the following operations are repeatedly performed, including shifting each bit of the BCH code rightward by 1 bit while filling the bit vacated due to the rightward shifting in the BCH code with 0, calculating syndrome values corresponding to the shifting of the BCH code, and determining a first error number in the BCH code under the shifting based on the syndrome values corresponding to the shifting of the BCH code. In the case where the first error number is not equal to 0, modified syndrome values are calculated corresponding to the shifting of the BCH code. The modified syndrome values are those corresponding to the case that the current rightmost bit of the BCH code under the shifting is changed to the inverse value. Additional operations are performed as described herein.
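
The core idea of inverting one bit, recomputing the syndromes, and checking whether the error number drops can be shown on a small code. The sketch below is not the patented procedure: it assumes a BCH(15,7) code that corrects two errors over GF(2^4) with primitive polynomial x^4 + x + 1, and it walks over bit positions by index instead of physically shifting the codeword. The abstract's "additional operations" are not reproduced.

```python
from typing import List, Tuple

# Powers of alpha in GF(2^4), built from x^4 + x + 1.
EXP = [1] * 15
for _i in range(1, 15):
    _v = EXP[_i - 1] << 1
    EXP[_i] = _v ^ 0b10011 if _v & 0b10000 else _v
LOG = {value: power for power, value in enumerate(EXP)}


def syndromes(bits: List[int]) -> Tuple[int, int]:
    """Return (S1, S3) = (r(alpha), r(alpha^3)); bits[i] multiplies x^i."""
    s1 = s3 = 0
    for i, bit in enumerate(bits):
        if bit:
            s1 ^= EXP[i % 15]
            s3 ^= EXP[(3 * i) % 15]
    return s1, s3


def error_number(s1: int, s3: int) -> int:
    """0 errors, 1 error (S3 equals S1 cubed), otherwise 2 or more."""
    if s1 == 0 and s3 == 0:
        return 0
    if s1 != 0 and s3 == EXP[(3 * LOG[s1]) % 15]:
        return 1
    return 2


def correct(bits: List[int]) -> List[int]:
    """Trial-invert each bit; keep the inversion if the error number drops."""
    bits = list(bits)
    s1, s3 = syndromes(bits)
    n_err = error_number(s1, s3)
    for i in range(15):
        if n_err == 0:
            break
        m1, m3 = s1 ^ EXP[i], s3 ^ EXP[(3 * i) % 15]  # modified syndromes
        if error_number(m1, m3) < n_err:
            bits[i] ^= 1
            s1, s3, n_err = m1, m3, error_number(m1, m3)
    return bits


# The generator polynomial x^8 + x^7 + x^6 + x^4 + 1 is itself a codeword.
CODEWORD = [1 if i in (0, 4, 6, 7, 8) else 0 for i in range(15)]
```

Because the code has minimum distance 5, inverting a correct bit can never make a pattern of one or two errors look like fewer errors, so only genuinely erroneous bits are accepted. The modified syndromes are obtained by XOR-ing one term into the current values, without re-evaluating the whole word.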

    Offering Personalized and Interactive Decision Support Based on Learned Model to Predict Preferences from Traits

    Publication No.: US20180101887A1

    Publication date: 2018-04-12

    Application No.: US15289420

    Filing date: 2016-10-10

    CPC classification number: G06Q30/0631 G06N5/022 G06N20/00

    Abstract: A mechanism is provided in a data processing system comprising at least one processor and at least one memory, the at least one memory comprising instructions executed by the at least one processor to cause the at least one processor to implement a personalized interactive decision support system. A personalized product recommendation module executing within the personalized interactive decision support system correlates at least one customer to a set of consumption preferences using a machine learning model based on a set of traits of the at least one customer to form at least one customer-to-preference correlation. The personalized product recommendation module maps a set of products to the set of consumption preferences using a consumption preferences-to-product attribute mapping data structure based on a set of attributes of the set of products to form a set of product-to-preference correlations. The personalized product recommendation module matches the at least one customer to at least one product within a set of products based on the at least one customer-to-preference correlation and the set of product-to-preference correlations to form at least one product recommendation. A visual and interactive decision support module executing within the personalized interactive decision support system presents the at least one product recommendation.
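
The matching logic in this abstract joins two correlations: customer to preferences, and product to preferences. A minimal sketch, in which a fixed lookup table stands in for the learned trait-to-preference model and every trait, preference, attribute, and product name is hypothetical:

```python
from typing import Dict, List, Set

# Stand-in for the learned model: hypothetical trait -> preferences table.
TRAIT_TO_PREFERENCES: Dict[str, Set[str]] = {
    "adventurous": {"novelty"},
    "frugal": {"low_price"},
    "cautious": {"durability"},
}

# Hypothetical consumption-preference -> product-attribute mapping.
PREFERENCE_TO_ATTRIBUTE: Dict[str, str] = {
    "novelty": "new_release",
    "low_price": "discounted",
    "durability": "long_warranty",
}


def predict_preferences(traits: Set[str]) -> Set[str]:
    """Customer-to-preference correlation (lookup instead of a model)."""
    prefs: Set[str] = set()
    for trait in traits:
        prefs |= TRAIT_TO_PREFERENCES.get(trait, set())
    return prefs


def product_preferences(attributes: Set[str]) -> Set[str]:
    """Product-to-preference correlation derived from product attributes."""
    return {p for p, attr in PREFERENCE_TO_ATTRIBUTE.items() if attr in attributes}


def recommend(traits: Set[str], products: Dict[str, Set[str]]) -> List[str]:
    """Rank products by how many predicted preferences they satisfy."""
    prefs = predict_preferences(traits)
    scored = [
        (len(prefs & product_preferences(attrs)), name)
        for name, attrs in products.items()
    ]
    ranked = sorted(scored, key=lambda item: (-item[0], item[1]))
    return [name for score, name in ranked if score > 0]
```

Ranking by overlap count is a design choice made for this sketch; the abstract only states that customers are matched to products based on the two sets of correlations.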

    LEARNING-BASED AUTOMATION MACHINE LEARNING CODE ANNOTATION IN COMPUTATIONAL NOTEBOOKS

    Publication No.: US20220113964A1

    Publication date: 2022-04-14

    Application No.: US17069402

    Filing date: 2020-10-13

    Abstract: One embodiment of the invention provides a method for automated code annotation in machine learning (ML) and data science. The method comprises receiving, as input, a section of executable code. The method further comprises classifying, via a ML model, the section of executable code with a stage classification label indicative of a stage within a workflow for automated ML that the executable code applies to. The method further comprises categorizing, based on the stage classification label, the section of executable code with a category of annotation that is most appropriate for the section of executable code. The method further comprises generating a suggested annotation for the section of executable code based on the category of annotation. The method further comprises providing, as output, the suggested annotation to a display of an electronic device for user review. The suggested annotation is user interactable via the electronic device.
