DATA CLASSIFICATION METHOD FOR CLASSIFYING INLIER AND OUTLIER DATA

    Publication number: US20240160660A1

    Publication date: 2024-05-16

    Application number: US18503197

    Application date: 2023-11-07

    CPC classification number: G06F16/55 G06F18/241

    Abstract: A data classification method for classifying unlabeled images into an inlier data set or an outlier data set includes the following steps. The unlabeled images are obtained. An assigned inlier image is selected from among the unlabeled images. A similarity matrix is computed; the similarity matrix includes first similarity scores of the unlabeled images relative to the assigned inlier image. Each of the unlabeled images is classified into the inlier data set or the outlier data set according to the similarity matrix, so as to generate inlier-outlier predictions of the unlabeled images.
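The steps in this abstract can be illustrated with a minimal sketch. It assumes each image has already been reduced to a feature vector, uses cosine similarity as the similarity score, and applies a fixed threshold to the row of the matrix that belongs to the assigned inlier image. None of these choices are stated in the abstract; the function names, the toy vectors, and the threshold value are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_matrix(features):
    """Pairwise similarity scores between all unlabeled images."""
    return [[cosine_similarity(a, b) for b in features] for a in features]


def classify_by_assigned_inlier(features, inlier_index, threshold=0.5):
    """Label each image from its score relative to the assigned inlier image.

    The threshold rule is an assumption made for illustration only.
    """
    matrix = similarity_matrix(features)
    scores = matrix[inlier_index]  # scores relative to the assigned inlier
    return ["inlier" if s >= threshold else "outlier" for s in scores]


# Toy feature vectors standing in for image embeddings (hypothetical data).
features = [[1.0, 0.1], [0.9, 0.2], [0.0, 1.0], [1.0, 0.0]]
predictions = classify_by_assigned_inlier(features, inlier_index=0)
print(predictions)
```

In this toy run, image 0 is the assigned inlier, and only the third vector, which points in a nearly orthogonal direction, falls below the threshold.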

    DATA CLASSIFICATION METHOD FOR FILTERING OUTLIER TEXT DATA

    Publication number: US20250077552A1

    Publication date: 2025-03-06

    Application number: US18818623

    Application date: 2024-08-29

    Abstract: A data classification method includes the following steps. Text samples are obtained from a dataset. The text samples are converted into text embeddings in a semantic space. An outlier-inlier ranking of the text samples is generated by an outlier detection algorithm according to distances between the text embeddings in the semantic space. Partial samples are selected from the text samples according to the outlier-inlier ranking. A manual input command is received to assign manual-input labels to the partial samples. A prompt message is generated according to the partial samples with the manual-input labels and the unlabeled samples of the text samples. The prompt message is provided to a generative pre-trained transformer model for generating inlier-outlier prediction labels for the unlabeled samples.
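The pipeline in this abstract can be sketched as follows. The sketch assumes the text embeddings are already available as plain vectors, and it swaps in a simple stand-in for the unspecified outlier detection algorithm: the mean Euclidean distance to the k nearest neighbors. The selection rule (the most outlier-like and most inlier-like samples), the prompt wording, and all function names are hypothetical. The manual labels are hard-coded to simulate the manual input command, and the model call itself is omitted.

```python
import math


def knn_outlier_scores(embeddings, k=2):
    """Mean distance to the k nearest neighbors (stand-in outlier score)."""
    scores = []
    for i, a in enumerate(embeddings):
        dists = sorted(
            math.dist(a, b) for j, b in enumerate(embeddings) if j != i
        )
        nearest = dists[:k]
        scores.append(sum(nearest) / len(nearest))
    return scores


def rank_outlier_to_inlier(embeddings, k=2):
    """Sample indices ordered from most outlier-like to most inlier-like."""
    scores = knn_outlier_scores(embeddings, k)
    return sorted(range(len(embeddings)), key=lambda i: scores[i], reverse=True)


def select_partial_samples(ranking, n_each=1):
    """Pick samples from both ends of the ranking (assumed selection rule)."""
    return ranking[:n_each] + ranking[-n_each:]


def build_prompt(texts, manual_labels, unlabeled_indices):
    """Assemble a few-shot prompt from labeled and unlabeled samples."""
    lines = ["Label each text as 'inlier' or 'outlier'.", "", "Labeled examples:"]
    for i, label in manual_labels.items():
        lines.append(f"- {texts[i]!r}: {label}")
    lines += ["", "Unlabeled texts:"]
    for i in unlabeled_indices:
        lines.append(f"- {texts[i]!r}")
    return "\n".join(lines)


# Toy texts and embeddings (hypothetical data, not from the patent).
texts = ["refund my order", "where is my refund", "order refund status", "best pizza recipe"]
embeddings = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0]]

ranking = rank_outlier_to_inlier(embeddings)
partial = select_partial_samples(ranking, n_each=1)
manual_labels = {partial[0]: "outlier", partial[-1]: "inlier"}  # simulated manual input
unlabeled = [i for i in range(len(texts)) if i not in manual_labels]
prompt = build_prompt(texts, manual_labels, unlabeled)
print(prompt)
```

The resulting prompt string would then be sent to a generative pre-trained transformer model, whose reply supplies the inlier-outlier prediction labels for the unlabeled samples.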
