DE-DUPLICATING TRANSACTION RECORDS USING TARGETED FUZZY MATCHING

    Publication number: US20240264989A1

    Publication date: 2024-08-08

    Application number: US18427309

    Application date: 2024-01-30

    CPC classification number: G06F16/215 G06V30/19093 G06V30/412

    Abstract: A computer-implemented method is disclosed. The method includes obtaining, by a de-duplication server, a candidate pair of a plurality of digitally stored documents from a document database. Text elements are identified from each digitally stored document in the candidate pair in response, and the text elements are stored as document extraction attributes. The method then automatically computes and stores relative positional differences of the text elements between each digitally stored document of the candidate pair and a document similarity score based on the relative positional differences. The relative positional differences are compared with a similarity function to form a difference similarity vector for the candidate pair. The difference similarity vector comprises components corresponding to each relative positional difference. The components of the difference similarity vector are aggregated to determine a final score for the candidate pair. A document-level similarity metric is determined from the final score. The method includes determining whether the final score is above a cutoff value, and in response to determining that the final score for the candidate pair is above the cutoff value, comparing the document extraction attribute with the final score. The method also determines whether the document-level similarity metric is above a threshold value by the de-duplication server. The candidate pair is classified based on determining that the document-level similarity metric is above the threshold value to de-duplicate the plurality of digitally stored documents in the candidate pair. Based on the classifying, duplicate transaction documents are removed from the document database by any of deleting records, marking records, updating column attributes, or writing records to a different table.
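The first scoring stage described in this abstract (positional differences, a similarity function applied per difference, aggregation into a final score, and a cutoff test) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the abstract does not specify how positions are represented, which similarity function is used, how components are aggregated, or what the cutoff is. The sketch assumes normalized page coordinates, an exponential decay, an arithmetic mean, and a cutoff of 0.8. `TextElement`, `positional_differences`, `similarity`, `final_score`, and `passes_cutoff` are hypothetical names, and the later document-level similarity metric and threshold test are not modeled.

```python
from dataclasses import dataclass
from math import exp, hypot


@dataclass(frozen=True)
class TextElement:
    """A text element extracted from a document (hypothetical structure)."""
    label: str  # e.g. "total" or "date"
    x: float    # horizontal position, normalized to page width (0..1)
    y: float    # vertical position, normalized to page height (0..1)


def positional_differences(doc_a, doc_b):
    """Distance between same-labelled elements of the two documents."""
    b_by_label = {e.label: e for e in doc_b}
    return {
        e.label: hypot(e.x - b_by_label[e.label].x, e.y - b_by_label[e.label].y)
        for e in doc_a
        if e.label in b_by_label
    }


def similarity(diff, scale=0.05):
    """Assumed similarity function: 1.0 at zero difference, decaying with distance."""
    return exp(-diff / scale)


def final_score(doc_a, doc_b):
    """Build the difference similarity vector and aggregate it (assumed: mean)."""
    diffs = positional_differences(doc_a, doc_b)
    if not diffs:
        return 0.0  # no shared elements, so nothing to compare
    vector = [similarity(d) for d in diffs.values()]
    return sum(vector) / len(vector)


def passes_cutoff(doc_a, doc_b, cutoff=0.8):
    """True when the candidate pair's final score is above the assumed cutoff."""
    return final_score(doc_a, doc_b) > cutoff
```

In this sketch, a pair that passes the cutoff would move on to the document-level comparison the abstract describes; a pair that fails would be dropped as a non-duplicate.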

    Catalog enablement data for supplier systems based on community activities

    Publication number: US11188966B1

    Publication date: 2021-11-30

    Application number: US16563615

    Application date: 2019-09-06

    Abstract: A method and apparatus for generating recommendation data for cataloging items in an e-procurement system is provided. In various embodiments, a database of records is created and maintained corresponding to a plurality of transactions in an e-procurement system. In various embodiments, database records are weighted and sorted according to a transaction method associated with the records. In various embodiments, recommendation data is generated for items associated with the records to suggest more efficient methods for offering items for procurement in an e-marketplace based on the weights and sort order of the records.
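The weighting and sorting step in this abstract can be sketched roughly as follows. The abstract does not name the transaction methods or the weights, so the method names (`free_text`, `email_order`, `catalog`), the weight values, and the function `recommend_items` are all hypothetical. The sketch assumes that a higher weight marks a less efficient ordering method, so items that accumulate the most weight are the strongest candidates for cataloging.

```python
from collections import defaultdict

# Hypothetical weights: a higher value marks a method assumed to be less efficient.
METHOD_WEIGHTS = {"free_text": 3.0, "email_order": 2.0, "catalog": 0.0}


def recommend_items(records, top_n=5):
    """Rank items by the accumulated weight of their transaction records.

    `records` is an iterable of (item, transaction_method) pairs.
    Methods missing from METHOD_WEIGHTS get an assumed default weight of 1.0.
    """
    totals = defaultdict(float)
    for item, method in records:
        totals[item] += METHOD_WEIGHTS.get(method, 1.0)
    # Sort by descending weight, then by item name for a stable order.
    ranked = sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, weight in ranked[:top_n] if weight > 0]
```

With records such as two free-text orders for gloves, one emailed order for pens, and one catalog order for paper, the sketch would recommend cataloging gloves first and pens second, and would skip paper because it is already ordered through a catalog.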
