Augmenting match indices
    41.
    Granted invention patent

    Publication number: US10817549B2

    Publication date: 2020-10-27

    Application number: US15590371

    Application date: 2017-05-09

    Abstract: System creates three tries based on values stored in first three fields by records. System associates node in third trie with record, based on value stored in third field by record. System associates node with first dispersion measure, based on values stored in first field by records associated with node, and with second dispersion measure, based on values stored in second field by records associated with node. System identifies branch sequence in third trie as key for prospective record, based on value stored in third field by prospective record. System uses key to identify a subset of records that match prospective record. If a count of the subset exceeds threshold, the system identifies other branch sequence in first trie or second trie as other key for prospective record, based on first dispersion measure and second dispersion measure. System uses the key and the other key to identify at least one record that matches prospective record.

    MACHINE LEARNING FROM DATA STEWARD FEEDBACK FOR DATA MATCHING

    Publication number: US20200250687A1

    Publication date: 2020-08-06

    Application number: US16263313

    Application date: 2019-01-31

    Abstract: A system determines factored score by multiplying factor and match score for values of field in two records, offset score by adding offset to factored score, and weighted score by applying weight to offset score. The system determines status for two records based on combining weighted score with other weighted score corresponding to other field of two records. The system revises factor, offset, and weight based on feedback associated with two records. The system determines revised factored score by multiplying revised factor and match score for other values of field in two other records, revised offset score by adding revised offset to revised factored score, and revised weighted score by applying revised weight to revised offset score. The system determines learned status for two other records based on combining revised weighted score with additional weighted score corresponding to other field for two other records.
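
The scoring step in this abstract can be illustrated with a minimal Python sketch. It assumes that "applying weight" means multiplication and that per-field weighted scores are combined by summation; neither is stated in the abstract. The names `FieldParams`, `weighted_score`, `match_status`, and the `threshold` parameter are hypothetical, and the feedback-driven revision of factor, offset, and weight is not shown.

```python
from dataclasses import dataclass


@dataclass
class FieldParams:
    """Per-field parameters named in the abstract (hypothetical container)."""
    factor: float
    offset: float
    weight: float


def weighted_score(match_score: float, p: FieldParams) -> float:
    # factored score = factor * match score
    factored = p.factor * match_score
    # offset score = factored score + offset
    offset_score = factored + p.offset
    # weighted score: multiplication is an assumption for "applying weight"
    return p.weight * offset_score


def match_status(match_scores: dict, params: dict, threshold: float = 1.0) -> str:
    # Combining by summation and comparing against a threshold are assumptions.
    total = sum(weighted_score(match_scores[f], params[f]) for f in params)
    return "match" if total >= threshold else "no match"
```

With a learning step, revised `FieldParams` values would simply be passed to the same functions for the next pair of records.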

    Account routing to user account sets

    Publication number: US10715626B2

    Publication date: 2020-07-14

    Application number: US14751401

    Application date: 2015-06-26

    Abstract: New account routing to user account sets is described. A system creates multiple accounts profiles corresponding to multiple sets of accounts, based on multiple attributes associated with each account of the multiple sets of accounts. The system calculates multiple account scores for an account based on comparing multiple attributes associated with the account against the corresponding multiple accounts profiles, wherein the account is not in the multiple sets of accounts. The system identifies a highest account score of the multiple account scores. The system routes the account to a user associated with a set of accounts corresponding to the highest account score.
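
One way to picture the routing described here is the following Python sketch. The abstract does not say how a profile is built or how an account is scored against it, so the sketch assumes a profile is the relative frequency of each attribute value within an account set, and the score is the sum of those frequencies for the new account's values. `build_profile`, `score`, and `route` are hypothetical names, and each account set is assumed to be non-empty.

```python
from collections import Counter


def build_profile(accounts):
    """Relative frequency of each attribute value within one account set."""
    counts = {}
    for account in accounts:
        for attr, value in account.items():
            counts.setdefault(attr, Counter())[value] += 1
    n = len(accounts)
    return {attr: {v: c / n for v, c in ctr.items()} for attr, ctr in counts.items()}


def score(account, profile):
    """Sum, over the account's attributes, of how common each value is in the set."""
    return sum(profile.get(attr, {}).get(value, 0.0) for attr, value in account.items())


def route(account, sets_by_user):
    """Return the user whose account set yields the highest account score."""
    profiles = {user: build_profile(accts) for user, accts in sets_by_user.items()}
    return max(profiles, key=lambda user: score(account, profiles[user]))
```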

    Bulk deduplication detection
    44.
    Granted invention patent

    Publication number: US10152497B2

    Publication date: 2018-12-11

    Application number: US15052382

    Application date: 2016-02-24

    Abstract: Some embodiments of the present invention include a system and method for removing duplicate records from a group of records in a database system. The method includes generating a first cluster of records from the group of records, generating a second cluster of records from the group of records, identifying sets of duplicate records in the first cluster of records, and identifying sets of duplicate records in the second cluster of records. The method also includes merging at least two sets of duplicate records associated with both the first cluster and the second cluster of records to form a merged set of duplicate records. The merging is performed based on the at least two sets of duplicate records having a common record. Duplicate records in the group of records may then be removed by removing duplicate records from the merged set of duplicate records.
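
The merge step, where duplicate sets from different clusters are joined whenever they share a record, can be sketched in Python with a union-find structure. Union-find is a swapped-in technique chosen for illustration; the abstract does not say how the merging is carried out. `merge_duplicate_sets` is a hypothetical name, and records are assumed to be hashable identifiers.

```python
def merge_duplicate_sets(duplicate_sets):
    """Merge duplicate sets that have at least one record in common."""
    parent = {}

    def find(x):
        # Register unseen records as their own root, then walk to the root.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    for dup_set in duplicate_sets:
        records = list(dup_set)
        for record in records:
            find(record)  # make sure singletons are registered too
        for other in records[1:]:
            union(records[0], other)

    merged = {}
    for record in list(parent):
        merged.setdefault(find(record), set()).add(record)
    return list(merged.values())
```

For example, if the first cluster yields the duplicate sets {1, 2} and {5, 6} and the second yields {2, 3} and {7, 8}, the sets {1, 2} and {2, 3} share record 2 and are merged into {1, 2, 3}, while the other two sets stay separate.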

    Combined directed graphs
    45.
    Granted invention patent

    Publication number: US09977797B2

    Publication date: 2018-05-22

    Application number: US14867154

    Application date: 2015-09-28

    Abstract: A combined directed graph is created having a corresponding node for each node in a first directed graph lacking a corresponding node in a second directed graph, each node in the second graph lacking a corresponding node in the first graph, and each node in the first graph having a corresponding node in the second graph. A corresponding directed arc is created in the combined directed graph for each arc in the first graph lacking a corresponding arc in the second directed graph, each arc in the second graph lacking a corresponding arc in the first graph, and each arc in the first graph having a corresponding arc in the second graph. A recommendation is output for a user to interact with a recommended object based on an object interaction and a conditional probability, in the combined graph, which corresponds to the recommended object and the object interaction.

    USER SCORES BASED ON BULK RECORD UPDATES
    46.
    Invention application (under examination, published)

    Publication number: US20160125347A1

    Publication date: 2016-05-05

    Application number: US14532179

    Application date: 2014-11-04

    CPC classification number: G06Q10/06398 G06F16/248

    Abstract: User scores based on bulk record updates is described. A system receives record updates submitted by a user. The system subtracts a penalty debit from a user score, which corresponds to the user, for each record which corresponds to at least one of the record updates and which is removed from purchasing availability. The system adds a full credit to the user score for each record which corresponds to at least one of the record updates and which is purchased. The system adds a partial credit to the user score for each record which corresponds to at least one of the record updates and which is yet to be purchased and which is yet to be removed from purchasing availability, wherein the partial credit is a positive value that is less than the full credit. The system enables the user to access records, based on the user score.
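
The credit and debit rules in this abstract can be sketched in Python as follows. The numeric values of the penalty, full credit, and partial credit are placeholders, and `RecordState`, `update_user_score`, `can_access`, and `min_score` are hypothetical names; the only constraint taken from the abstract is that the partial credit is positive and less than the full credit.

```python
from enum import Enum


class RecordState(Enum):
    PURCHASED = "purchased"
    REMOVED = "removed"    # removed from purchasing availability
    PENDING = "pending"    # neither purchased nor removed yet


def update_user_score(score, record_states,
                      full_credit=1.0, partial_credit=0.25, penalty=1.0):
    """Apply one debit or credit per updated record (values are placeholders)."""
    if not 0 < partial_credit < full_credit:
        raise ValueError("partial credit must be positive and less than full credit")
    for state in record_states:
        if state is RecordState.REMOVED:
            score -= penalty
        elif state is RecordState.PURCHASED:
            score += full_credit
        else:
            score += partial_credit
    return score


def can_access(score, min_score=0.0):
    """Hypothetical access rule: allow access at or above a minimum score."""
    return score >= min_score
```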

    Machine learning from data steward feedback for merging records

    Publication number: US11755914B2

    Publication date: 2023-09-12

    Application number: US16361026

    Application date: 2019-03-21

    CPC classification number: G06N3/084 G06F16/9024 G06N7/01 G06N20/00

    Abstract: System determines first and second scores based on applying function to features of first and second values in fields in first and second records, respectively. System determines first priority based on first score and second priority based on second score for displaying first and second values in fields in first profile. System revises, based on feedback associated with first value and second value, parameter associated with function and determines third score based on applying function, associated with revised parameter, to feature of third value in field in third record. System determines fourth score based on applying function, associated with revised parameter, to feature of fourth value in field in fourth record and determines third priority, based on third score, for displaying third value in field in second profile and fourth priority, based on fourth score, for displaying fourth value in field in second profile.

    Generating adaptive match keys based on estimating counts

    Publication number: US11244004B2

    Publication date: 2022-02-08

    Application number: US16661715

    Application date: 2019-10-23

    Abstract: A system creates a graph of nodes connected by edges, the nodes including: i) a first node associated with a first value and a count of the first value, and ii) a second node associated with a second value and a count of the second value, the edges including an edge that connects the first and second nodes and is associated with a count of instances of the first value being stored with the second value. The system includes each node and each edge associated with clique count less than clique threshold in keys sets and deletes each node and each edge associated with clique count less than clique threshold. The system identifies triplet nodes connected by triplet edges. If estimated clique count for triplet values represented by triplet nodes is less than clique threshold, the system includes triplet values in keys set and identifies triplet of nodes as analyzed.

    Efficiently and accurately assessing the number of identifiable records for creating personal profiles

    Publication number: US11176156B2

    Publication date: 2021-11-16

    Application number: US16409559

    Application date: 2019-05-10

    Abstract: A system determines a name probability based on a first name dataset frequency of a first name value stored by a first name field in a personal record and a last name dataset frequency of a last name value stored by a last name field in a personal record. The system determines at least one other probability based on another dataset frequency of another value stored by another field in the personal record and an additional dataset frequency of an additional value stored by an additional field in the personal record. The system determines a combined probability based on the name probability and the at least one other probability. The system increments a count of identifiable personal records for each personal record that has a corresponding combined probability that satisfies an identifiability threshold. The system outputs a message based on the count of identifiable personal records.
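
A rough Python sketch of the counting step follows. The abstract does not specify how frequencies are turned into probabilities or what "satisfies" means for the threshold, so the sketch assumes relative frequencies that are multiplied together as if the fields were independent, and treats a record as identifiable when its combined probability is at or below the threshold. `pair_probability`, `count_identifiable`, and the `frequencies` layout (field name to value to relative frequency) are hypothetical.

```python
from math import prod


def pair_probability(freq_a, freq_b):
    # Assumption: the two fields are independent, so frequencies multiply.
    return freq_a * freq_b


def count_identifiable(records, frequencies, field_pairs, threshold):
    """Count records whose combined probability is at or below the threshold."""
    count = 0
    for record in records:
        combined = prod(
            pair_probability(frequencies[a][record[a]], frequencies[b][record[b]])
            for a, b in field_pairs
        )
        # Assumption: a rarer combination of values is more identifiable.
        if combined <= threshold:
            count += 1
    return count
```

Here `field_pairs` would contain the name pair, for example `("first_name", "last_name")`, plus at least one other pair of fields, mirroring the name probability and the "other probability" in the abstract.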

    REAL-TIME PREDICTIONS BASED ON MACHINE LEARNING MODELS

    Publication number: US20210241179A1

    Publication date: 2021-08-05

    Application number: US16777686

    Application date: 2020-01-30

    Abstract: An online system performs predictions for real-time tasks and near real-time tasks that need to be performed by a deadline. A client device receives a real-time machine learning based model associated with a measure of accuracy. If the client device determines that a task can be performed using predictions having less than the specified measure of accuracy, the client device uses the real-time machine learning based model. If the client device determines that a higher level of accuracy of results is required, the client device sends a request to an online system. The online system provides a prediction along with a string representing a rationale for the prediction.
