-
Publication number: US09165299B1
Publication date: 2015-10-20
Application number: US14139713
Filing date: 2013-12-23
Applicant: Palantir Technologies, Inc.
Inventor: Geoff Stowe, Harkirat Singh, Stefan Bach, Matthew Sprague, Michael Kross, Adam Borochoff, Parvathy Menon, Michael Harris
CPC classification number: G06F17/3053, G06F17/30345, G06F17/30412, G06F17/30539, G06F17/30572, G06F17/30598, G06F17/30601, G06F17/30604, G06F17/30699, G06F17/30705, G06F17/3071, G06F17/30867, G06Q10/10, G06Q20/4016, G06Q30/0185, G06Q40/00, G06Q40/02, G06Q40/025, G06Q40/10, G06Q40/123
Abstract: In various embodiments, systems, methods, and techniques are disclosed for generating a collection of clusters of related data from a seed. Seeds may be generated based on seed generation strategies or rules. Clusters may be generated by, for example, retrieving a seed, adding the seed to a first cluster, retrieving a clustering strategy or rules, and adding related data and/or data entities to the cluster based on the clustering strategy. Various cluster scores may be generated based on attributes of data in a given cluster. Further, cluster metascores may be generated based on various cluster scores associated with a cluster. Clusters may be ranked based on cluster metascores. Various embodiments may enable an analyst to discover various insights related to data clusters, and may be applicable to various tasks including, for example, tax fraud detection, beaconing malware detection, malware user-agent detection, and/or activity trend detection, among various others.
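The abstract describes retrieving a seed, adding it to a cluster, and then adding related data according to a clustering strategy. A minimal sketch of that flow follows. The abstract does not specify any particular strategy, so the breadth-first "follow links up to `max_hops`" rule, the `LINKS` table, and the names `generate_seeds` and `build_cluster` are all assumptions made for illustration, not the patented method.

```python
from collections import deque

# Hypothetical link table: each entity maps to the entities it is related to.
LINKS = {
    "acct-1": ["addr-9", "phone-3"],
    "addr-9": ["acct-2"],
    "acct-2": ["phone-7"],
    "acct-5": [],
}


def generate_seeds(entities, seed_rule):
    """Seed generation strategy (assumed): keep entities the rule selects."""
    return [entity for entity in entities if seed_rule(entity)]


def build_cluster(seed, links, max_hops=2):
    """Clustering strategy (assumed): breadth-first expansion from the seed,
    stopping once an entity is max_hops links away from it."""
    cluster = {seed}                      # the seed is added to the cluster first
    frontier = deque([(seed, 0)])
    while frontier:
        entity, hops = frontier.popleft()
        if hops == max_hops:
            continue                      # do not expand past the hop limit
        for neighbor in links.get(entity, ()):
            if neighbor not in cluster:
                cluster.add(neighbor)
                frontier.append((neighbor, hops + 1))
    return cluster


seeds = generate_seeds(LINKS, lambda entity: entity.startswith("acct-1"))
clusters = {seed: build_cluster(seed, LINKS, max_hops=2) for seed in seeds}
```

With this sample data the only seed is `acct-1`, and its two-hop cluster contains `acct-1`, `addr-9`, `phone-3`, and `acct-2`. A real strategy would likely filter by link type or data source rather than by hop count alone.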
-
Publication number: US09135658B2
Publication date: 2015-09-15
Application number: US14264445
Filing date: 2014-04-29
Applicant: Palantir Technologies, Inc.
Inventor: Matthew Sprague, Michael Kross, Adam Borochoff, Parvathy Menon, Michael Harris
CPC classification number: G06F17/3053, G06F17/30345, G06F17/30412, G06F17/30539, G06F17/30572, G06F17/30598, G06F17/30601, G06F17/30604, G06F17/30699, G06F17/30705, G06F17/3071, G06F17/30867, G06Q10/10, G06Q20/4016, G06Q30/0185, G06Q40/00, G06Q40/02, G06Q40/025, G06Q40/10, G06Q40/123
Abstract: Techniques are disclosed for prioritizing a plurality of clusters. Prioritizing clusters may generally include identifying a scoring strategy for prioritizing the plurality of clusters. Each cluster is generated from a seed and stores a collection of data retrieved using the seed. For each cluster, elements of the collection of data stored by the cluster are evaluated according to the scoring strategy and a score is assigned to the cluster based on the evaluation. The clusters may be ranked according to the respective scores assigned to the plurality of clusters. The collection of data stored by each cluster may include financial data evaluated by the scoring strategy for a risk of fraud. The score assigned to each cluster may correspond to an amount at risk.
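The prioritization described in this abstract (evaluate each cluster's data under a scoring strategy, assign a score, rank by score) can be sketched as below. The record layout, the `flagged` field, and the `amount_at_risk` strategy are hypothetical; the abstract only says that the score may correspond to an amount at risk, not how that amount is computed.

```python
# Hypothetical clusters: each was generated from a seed and stores
# the collection of records retrieved using that seed.
CLUSTERS = [
    {"seed": "acct-1", "records": [
        {"amount": 100, "flagged": True},
        {"amount": 50, "flagged": True},
        {"amount": 900, "flagged": False},
    ]},
    {"seed": "acct-2", "records": [
        {"amount": 700, "flagged": True},
    ]},
]


def amount_at_risk(cluster):
    """Scoring strategy (assumed): total amount across flagged records."""
    return sum(r["amount"] for r in cluster["records"] if r["flagged"])


def rank_clusters(clusters, scoring_strategy):
    """Score every cluster with the chosen strategy, highest score first."""
    scored = [(scoring_strategy(c), c["seed"]) for c in clusters]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


ranking = rank_clusters(CLUSTERS, amount_at_risk)
```

Passing the strategy as a function keeps the ranking step independent of any one scoring rule, which matches the abstract's framing of first identifying a scoring strategy and then applying it to each cluster.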
-
Publication number: US20140280034A1
Publication date: 2014-09-18
Application number: US13826228
Filing date: 2013-03-14
Applicant: Palantir Technologies, Inc.
Inventor: Michael Harris, John Carrino, Eric Wong
IPC: G06F17/30
CPC classification number: G06F17/30451, G06F17/30442, G06F17/30463, G06F17/30477, G06F17/30864
Abstract: A fair scheduling system with methodology for fairly scheduling queries for execution by a database management system is disclosed. The techniques involve obtaining computer-executable query jobs and cost estimates to execute the query jobs. For example, the cost estimate can be a number of results the query is expected to return. Based on the cost estimates, the fair scheduling system causes the database management system to execute the query jobs as separately executable sub-query tasks in a round-robin fashion which can decrease latency of low cost queries concurrently executing with high cost queries.
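To illustrate why round-robin execution of sub-query tasks can reduce latency for low-cost queries, here is a small sketch. It assumes the cost estimate is the expected number of results (as in the abstract's example) and that each job is split into tasks of at most `page_size` results; the splitting rule, `page_size`, and the function names are assumptions, since the abstract does not say how jobs are divided.

```python
from collections import deque
from math import ceil


def split_into_tasks(job_id, estimated_results, page_size):
    """Split one query job into sub-query tasks of at most page_size results
    (assumed rule; every job gets at least one task)."""
    task_count = max(1, ceil(estimated_results / page_size))
    return deque((job_id, index) for index in range(task_count))


def round_robin(jobs, page_size=100):
    """Return the execution order when jobs take turns, one task at a time."""
    queues = deque(split_into_tasks(job_id, cost, page_size)
                   for job_id, cost in jobs)
    order = []
    while queues:
        tasks = queues.popleft()
        order.append(tasks.popleft())     # run one task of this job
        if tasks:
            queues.append(tasks)          # unfinished jobs rejoin the rotation
    return order


# A high-cost job submitted ahead of a low-cost one.
order = round_robin([("big", 300), ("small", 50)], page_size=100)
```

Here `order` is `[("big", 0), ("small", 0), ("big", 1), ("big", 2)]`: the small query finishes in the second task slot instead of waiting for all three tasks of the large one, which is the latency benefit the abstract describes.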
-
Publication number: US08788407B1
Publication date: 2014-07-22
Application number: US14139603
Filing date: 2013-12-23
Applicant: Palantir Technologies, Inc.
Inventor: Harkirat Singh, Geoff Stowe, Brendan Weickert, Matthew Sprague, Michael Kross, Adam Borochoff, Parvathy Menon, Michael Harris
IPC: G06Q40/00
CPC classification number: G06F17/3053, G06F17/30345, G06F17/30412, G06F17/30539, G06F17/30572, G06F17/30598, G06F17/30601, G06F17/30604, G06F17/30699, G06F17/30705, G06F17/3071, G06F17/30867, G06Q10/10, G06Q20/4016, G06Q30/0185, G06Q40/00, G06Q40/02, G06Q40/025, G06Q40/10, G06Q40/123
Abstract: In various embodiments, systems, methods, and techniques are disclosed for generating a collection of clusters of related data from a seed. Seeds may be generated based on seed generation strategies or rules. Clusters may be generated by, for example, retrieving a seed, adding the seed to a first cluster, retrieving a clustering strategy or rules, and adding related data and/or data entities to the cluster based on the clustering strategy. Various cluster scores may be generated based on attributes of data in a given cluster. Further, cluster metascores may be generated based on various cluster scores associated with a cluster. Clusters may be ranked based on cluster metascores. Various embodiments may enable an analyst to discover various insights related to data clusters, and may be applicable to various tasks including, for example, tax fraud detection, beaconing malware detection, malware user-agent detection, and/or activity trend detection, among various others.
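The abstract mentions combining several cluster scores into a metascore and ranking clusters by it. One simple way to do that is a weighted sum, sketched below. The weighted-sum formula, the score names `size` and `risk`, and the weights are all assumptions; the abstract states only that metascores are based on the various cluster scores.

```python
# Hypothetical per-cluster scores, each derived from attributes of the cluster's data.
CLUSTER_SCORES = {
    "c1": {"size": 10, "risk": 1},
    "c2": {"size": 2, "risk": 5},
}

# Hypothetical weights expressing how much each score matters to the analyst.
WEIGHTS = {"size": 1, "risk": 4}


def metascore(scores, weights):
    """Combine one cluster's scores into a single number (assumed: weighted sum)."""
    return sum(weights[name] * value for name, value in scores.items())


def rank_by_metascore(cluster_scores, weights):
    """Return cluster ids ordered from highest metascore to lowest."""
    return sorted(cluster_scores,
                  key=lambda cid: metascore(cluster_scores[cid], weights),
                  reverse=True)


ranking = rank_by_metascore(CLUSTER_SCORES, WEIGHTS)
```

With these numbers `c2` scores 22 and `c1` scores 14, so `c2` ranks first even though `c1` is the larger cluster.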
-