GENERATING AN ERROR POLICY FOR A MACHINE LEARNING ENGINE

    Publication No.: US20240202575A1

    Publication date: 2024-06-20

    Application No.: US18069150

    Application date: 2022-12-20

    IPC classification: G06N20/00

    CPC classification: G06N20/00

    Abstract: A computer hardware system includes a slice generator and a policy generator and performs the following. The slice generator slices a first dataset including true values and predicted values of a class variable into a plurality of slices each defining a plurality of observations within the first dataset. A first one and another one of the plurality of slices are selected, and a union of observations is generated by adding observations within the selected another one to observations within the selected first one of the plurality of slices. The selecting of another one of the plurality of slices and the generating of the union are repeated until a number of observations within the union reaches a predetermined value. Using the policy generator and after the number of observations within the union reaches the predetermined value, an error policy is generated. The predicted values were generated by a machine learning engine.
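
The slice-merging loop in the abstract above can be illustrated with a minimal Python sketch. The names `grow_union` and `summarize_errors` are hypothetical, slices are assumed to be sets of observation indices, and the "error policy" is stood in for by a simple error-rate summary, since the abstract does not say what the policy contains.

```python
from typing import Dict, List, Sequence, Set


def grow_union(slices: List[Set[int]], target_size: int) -> Set[int]:
    """Merge slices (sets of observation indices) until the union
    holds at least target_size observations or no slices remain."""
    if not slices:
        return set()
    union = set(slices[0])        # the selected first slice
    for other in slices[1:]:      # repeatedly select another slice
        if len(union) >= target_size:
            break
        union |= other            # add its observations to the union
    return union


def summarize_errors(union: Set[int], y_true: Sequence[int],
                     y_pred: Sequence[int]) -> Dict[str, float]:
    """Assumed stand-in for the error policy: error rate over the union."""
    n = len(union)
    errors = sum(1 for i in union if y_true[i] != y_pred[i])
    return {"observations": n, "error_rate": errors / n if n else 0.0}


# Toy data: true and predicted class values for eight observations.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1, 0, 1]
slices = [{0, 1}, {1, 2, 3}, {4, 5}, {6, 7}]

union = grow_union(slices, target_size=4)
policy = summarize_errors(union, y_true, y_pred)
```

With a target of four observations, the loop stops after merging the first two slices, because their union already holds four observations.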

    Utilizing semantic clusters to Predict Software defects

    Type: Invention application (status: under examination, published)

    Publication No.: US20160004627A1

    Publication date: 2016-01-07

    Application No.: US14324191

    Application date: 2014-07-06

    IPC classification: G06F11/36 G06F9/44

    Abstract: A method, apparatus and product for utilizing semantic clusters to predict software defects. The method comprising: obtaining a plurality of software elements that are associated with a version of a System Under Test (SUT), wherein the plurality of software elements comprise defective software elements which are associated with a defect in the version of the SUT; defining, by a processor, a plurality of clusters, wherein each cluster of the plurality of clusters comprises software elements having an attribute, wherein the attribute is associated with a functionality of the SUT; and determining a score of each cluster of the plurality of clusters, wherein the score of a cluster is based on a relation between a number of defect software elements in the cluster and a number of software elements in the cluster.
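
As a rough illustration of the cluster scoring described above, the following Python sketch groups software elements by an attribute and scores each cluster. The abstract only says the score is "based on a relation" between the two counts; taking it to be the plain ratio of defective elements to all elements is an assumption, and the file names and the `cluster_scores` helper are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


def cluster_scores(elements: Iterable[str], defective: Set[str],
                   attribute_of: Dict[str, str]) -> Dict[str, float]:
    """Group elements by attribute and score each cluster as
    (defective elements) / (all elements). The ratio is an assumption."""
    totals: Dict[str, int] = defaultdict(int)
    defects: Dict[str, int] = defaultdict(int)
    for element in elements:
        cluster = attribute_of[element]   # attribute tied to a functionality
        totals[cluster] += 1
        if element in defective:
            defects[cluster] += 1
    return {cluster: defects[cluster] / totals[cluster] for cluster in totals}


# Hypothetical software elements of one SUT version.
elements = ["auth.c", "session.c", "token.c", "invoice.c", "tax.c"]
attribute_of = {
    "auth.c": "login", "session.c": "login", "token.c": "login",
    "invoice.c": "billing", "tax.c": "billing",
}
defective = {"auth.c", "token.c", "tax.c"}

scores = cluster_scores(elements, defective, attribute_of)
```

Here the "login" cluster scores 2/3 and the "billing" cluster scores 1/2, so "login" would be ranked as the more defect-prone functionality under this assumed ratio.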

    Anomaly detection of entity behavior

    Publication No.: US11995068B1

    Publication date: 2024-05-28

    Application No.: US18142584

    Application date: 2023-05-03

    Abstract: A method including: receiving a set of data representing usage by entities of objects in a computing resource; extracting, from the initial set of data, one or more feature vectors representing the usage by one of the entities with respect to the objects; generating, from the feature vectors, a feature matrix; with respect to each entry in the feature matrix: (i) assigning a binary value to the entry, based on a predefined usage threshold, (ii) identifying, among the one or more entities, k nearest neighbor entities with respect to the one of the entities, based on a predefined distance threshold, and (iii) modifying the usage value of the entry, based on usage values associated with each of the k nearest neighbor entities with respect to the one of the objects; and updating the feature matrix with the modified usage values, to obtain a manipulated feature matrix.
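
The three per-entry steps in the abstract above (binarize, find neighbors, modify) can be sketched in plain Python. Several choices here are assumptions not stated in the abstract: Hamming distance between binarized rows, and averaging an entry with the neighbors' values for the same object. All function names are hypothetical.

```python
from typing import List


def binarize(usage: List[List[float]], threshold: float) -> List[List[int]]:
    """Step (i): 1 if usage meets the threshold, else 0."""
    return [[1 if v >= threshold else 0 for v in row] for row in usage]


def hamming(a: List[int], b: List[int]) -> int:
    """Number of positions where two binary rows differ (assumed metric)."""
    return sum(x != y for x, y in zip(a, b))


def nearest_neighbors(matrix: List[List[int]], i: int, k: int,
                      max_distance: int) -> List[int]:
    """Step (ii): up to k closest entities within max_distance of entity i."""
    candidates = [(hamming(matrix[i], row), j)
                  for j, row in enumerate(matrix) if j != i]
    within = [(d, j) for d, j in candidates if d <= max_distance]
    return [j for _, j in sorted(within)[:k]]


def smooth(matrix: List[List[int]], k: int,
           max_distance: int) -> List[List[float]]:
    """Step (iii): replace each entry by the mean of itself and the
    neighbors' values for the same object (assumed modification rule)."""
    result = []
    for i, row in enumerate(matrix):
        neighbors = nearest_neighbors(matrix, i, k, max_distance)
        new_row = []
        for col, value in enumerate(row):
            values = [value] + [matrix[j][col] for j in neighbors]
            new_row.append(sum(values) / len(values))
        result.append(new_row)
    return result


# Rows are entities, columns are objects, values are raw usage counts.
usage = [[5, 0, 2], [4, 0, 0], [0, 7, 0]]
binary = binarize(usage, threshold=1)
manipulated = smooth(binary, k=1, max_distance=1)
```

In this toy input the first two entities are neighbors of each other, so their differing third entry is pulled to 0.5, while the third entity has no neighbor within the distance threshold and keeps its values.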

    Path-coverage directed black box API testing

    Publication No.: US11768758B2

    Publication date: 2023-09-26

    Application No.: US17495200

    Application date: 2021-10-06

    IPC classification: G06F11/36 G06N20/00 G06N5/01

    Abstract: Methods, systems, and computer program products for path-coverage directed black box application programming interface (API) testing are provided herein. A computer-implemented method includes determining constraints based on inputs and corresponding outputs of an API in a production environment; generating initial test inputs based at least in part on the constraints; creating a program dependency graph based on trace sequences and request-response data obtained in response to providing the initial test inputs to an endpoint of the API; enhancing the program dependency graph by generating additional test inputs directed to one or more paths of the dependency graph; identifying, based on the enhanced program dependency graph, at least a portion of the API that is not covered by an existing test suite; and using the enhanced program dependency graph to generate new test cases for the test suite based on the identifying.
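
The coverage-gap step in the abstract above can be pictured with a small Python sketch. It assumes the dependency graph is an acyclic adjacency mapping and that the existing test suite is summarized as the set of paths it exercises. The graph contents and the helper names `all_paths` and `uncovered_paths` are hypothetical, and the constraint inference and test-input generation steps are not modeled.

```python
from typing import Dict, List, Sequence, Tuple

Path = Tuple[str, ...]


def all_paths(graph: Dict[str, List[str]], start: str) -> List[Path]:
    """Enumerate every path from start to a node without successors.
    Assumes the graph is acyclic."""
    stack = [[start]]
    paths: List[Path] = []
    while stack:
        path = stack.pop()
        successors = graph.get(path[-1], [])
        if not successors:
            paths.append(tuple(path))
        for nxt in successors:
            stack.append(path + [nxt])
    return paths


def uncovered_paths(graph: Dict[str, List[str]], start: str,
                    covered: Sequence[Path]) -> List[Path]:
    """Paths present in the graph but not exercised by the test suite."""
    return sorted(set(all_paths(graph, start)) - set(covered))


# Hypothetical dependency graph for one API endpoint.
graph = {
    "POST /orders": ["validate"],
    "validate": ["reject", "price"],
    "price": ["store"],
}
covered = [("POST /orders", "validate", "price", "store")]

gaps = uncovered_paths(graph, "POST /orders", covered)
```

Each path in `gaps` is a candidate target for a new test case; here only the rejection path is left unexercised.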

    METHODS AND SYSTEMS FOR AUTOMATICALLY IDENTIFY IN A DATASET INSUFFICIENT DATA FOR LEARNING, OR RECORDS WITH ANOMALOUS COMBINATIONS OF FEATURE VALUES

    Publication No.: US20230205847A1

    Publication date: 2023-06-29

    Application No.: US17561951

    Application date: 2021-12-26

    IPC classification: G06K9/62

    CPC classification: G06K9/6219 G06K9/6261

    Abstract: Systems and methods for automatically identifying in a dataset insufficient data for learning, or records with anomalous combinations of feature values, by partition of numeric and/or categorical data space into human-interpretable regions are disclosed. The method comprises: receiving a dataset of numeric and/or categorical features with a plurality of observations; calculating observation density for each observation according to a distance or anomaly based metric, and receiving a density measurement; partitioning the dataset along the numeric and/or categorical features according to the density measurement of each observation by a perpendicular cut along the feature spaces, receiving a map of a plurality of hyper-rectangular shapes representing various levels of density including empty spaces; and displaying the received map, being human-interpretable regions, on a graphical user interface (GUI), wherein the plurality of hyper-rectangular shapes are selectable and present information about the selected hyper-rectangular shape level of density when selected by a user.
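
To make the idea of hyper-rectangular regions with density levels concrete, here is a simplified Python sketch. It deliberately swaps the density-driven perpendicular cuts described above for a fixed equal-width grid, which is a much cruder technique. The function name `grid_density` and the assumption that every feature is numeric and lies in a shared range `[lo, hi]` are illustrative only.

```python
from collections import Counter
from itertools import product
from typing import Dict, Sequence, Tuple

Cell = Tuple[int, ...]


def grid_density(points: Sequence[Sequence[float]], bins: int,
                 lo: float, hi: float) -> Dict[Cell, int]:
    """Count observations in each axis-aligned cell of an equal-width grid.
    Assumes every feature value lies in [lo, hi]."""
    width = (hi - lo) / bins
    counts: Counter = Counter()
    for point in points:
        # Clamp so that a value equal to hi falls in the last cell.
        cell = tuple(min(int((x - lo) / width), bins - 1) for x in point)
        counts[cell] += 1
    dims = len(points[0])
    # Include empty cells explicitly, since empty regions are of interest.
    return {cell: counts.get(cell, 0)
            for cell in product(range(bins), repeat=dims)}


points = [(0.1, 0.2), (0.2, 0.1), (0.3, 0.4), (0.9, 0.8)]
density = grid_density(points, bins=2, lo=0.0, hi=1.0)
empty_cells = sorted(cell for cell, n in density.items() if n == 0)
```

Cells with a count of zero correspond to regions with no data for learning, and a lone observation in an otherwise sparse cell is a candidate anomalous combination of feature values.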

    AUTOMATIC DETECTION OF CHANGES IN DATA SET RELATIONS

    Publication No.: US20230102152A1

    Publication date: 2023-03-30

    Application No.: US17484104

    Application date: 2021-09-24

    Abstract: A system, program product, and method for automatic detection of data drift in a data set are presented. The method includes determining changes to relations in the data set through generating baseline and production data sets. The method further includes generating a production data set with some inserted data distortion, and defining, for a plurality of features in the baseline data set, potential relations for participant features. The method also includes determining a first likelihood and a second likelihood of each potential relation in the baseline and production data sets, respectively, for the participant features. The method further includes comparing each first likelihood with each second likelihood, generating a comparison value that is compared with a threshold value, and determining, subject to the comparison value exceeding the threshold value, the potential relation in the baseline data set does not describe a relation in the production data set.
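
The likelihood comparison in the abstract above can be sketched as follows. The sketch assumes that a "potential relation" is a boolean predicate over a record, that its likelihood is the fraction of records satisfying it, and that the comparison value is the absolute difference of the two likelihoods. None of these definitions are given in the abstract, and the feature names are hypothetical.

```python
from typing import Callable, Dict, List

Record = Dict[str, float]
Relation = Callable[[Record], bool]


def relation_likelihood(rows: List[Record], relation: Relation) -> float:
    """Fraction of records for which the relation holds (assumed definition)."""
    return sum(1 for row in rows if relation(row)) / len(rows)


def relation_drifted(baseline: List[Record], production: List[Record],
                     relation: Relation, threshold: float) -> bool:
    """True if the relation no longer describes the production data."""
    first = relation_likelihood(baseline, relation)
    second = relation_likelihood(production, relation)
    comparison = abs(first - second)   # assumed comparison value
    return comparison > threshold


def net_not_above_gross(row: Record) -> bool:
    """Hypothetical potential relation between two participant features."""
    return row["net"] <= row["gross"]


baseline = [{"net": 80, "gross": 100}, {"net": 45, "gross": 50},
            {"net": 10, "gross": 10}, {"net": 70, "gross": 90}]
production = [{"net": 80, "gross": 100}, {"net": 60, "gross": 50},
              {"net": 10, "gross": 10}, {"net": 95, "gross": 90}]

drifted = relation_drifted(baseline, production, net_not_above_gross,
                           threshold=0.2)
```

The relation holds for every baseline record but only half of the production records, so the difference of 0.5 exceeds the threshold and the relation is flagged as drifted.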

    ESTIMATING FEASIBILITY AND EFFORT FOR A MACHINE LEARNING SOLUTION

    Publication No.: US20210012221A1

    Publication date: 2021-01-14

    Application No.: US16508698

    Application date: 2019-07-11

    Abstract: A method, computer system, and a computer program product for assessing a likelihood of success associated with developing at least one machine learning (ML) solution is provided. The present invention may include generating a set of questions based on a set of raw training data. The present invention may also include computing a feasibility score based on an answer corresponding with each question from the generated set of questions. The present invention may then include, in response to determining that the computed feasibility score satisfies a threshold, computing a level of effort associated with developing the at least one ML solution to address a problem. The present invention may further include presenting, to a user, a plurality of results associated with assessing the likelihood of success of the at least one ML solution.

    Utilizing semantic clusters to Predict Software defects

    Publication No.: US20160292069A1

    Publication date: 2016-10-06

    Application No.: US15186560

    Application date: 2016-06-20

    IPC classification: G06F11/36 G06F9/44

    Abstract: A method, apparatus and product for utilizing semantic clusters to predict software defects. The method comprising: obtaining a plurality of software elements that are associated with a version of a System Under Test (SUT), wherein the plurality of software elements comprise defective software elements which are associated with a defect in the version of the SUT; defining, by a processor, a plurality of clusters, wherein each cluster of the plurality of clusters comprises software elements having an attribute, wherein the attribute is associated with a functionality of the SUT; and determining a score of each cluster of the plurality of clusters, wherein the score of a cluster is based on a relation between a number of defect software elements in the cluster and a number of software elements in the cluster.