-
Publication number: US20170293653A1
Publication date: 2017-10-12
Application number: US15634422
Application date: 2017-06-27
Applicant: Palantir Technologies, Inc.
Inventor: Michael Harris , John Carrino , Eric Wong
IPC: G06F17/30
CPC classification number: G06F16/24535 , G06F16/2453 , G06F16/24542 , G06F16/2455 , G06F16/951
Abstract: A fair scheduling system with methodology for scheduling queries for execution by a database management system in a fair manner. The system obtains query jobs for execution by the database management system and cost estimates to execute the query jobs. Based on the cost estimates, the system causes the database management system to execute the query jobs as separate sub-query tasks in a round-robin fashion. By doing so, the execution latency of low cost query jobs that return few results is reduced when the query jobs are concurrently executed with high cost query jobs that return many results.
-
Publication number: US09715526B2
Publication date: 2017-07-25
Application number: US14726211
Application date: 2015-05-29
Applicant: Palantir Technologies, Inc.
Inventor: Michael Harris , John Carrino , Eric Wong
IPC: G06F17/30
CPC classification number: G06F17/30451 , G06F17/30442 , G06F17/30463 , G06F17/30477 , G06F17/30864
Abstract: A fair scheduling system with methodology for fairly scheduling queries for execution by a database management system. The system obtains query jobs for execution by the database management system and cost estimates to execute the query jobs. The cost estimate can be a number of results the query is expected to return. Based on the cost estimates, the system causes the database management system to execute the query jobs as separate sub-query tasks in a round-robin fashion. By doing so, the execution latency of “low cost” query jobs that return few results is reduced when the query jobs are concurrently executed with “high cost” query jobs that return a large number of results.
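The round-robin idea in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `QueryJob` class, the `round_robin_schedule` function, and the fixed `page_size` are hypothetical names and simplifications chosen for illustration. The sketch assumes the cost estimate is the expected number of results and that each sub-query task fetches at most one page of results.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class QueryJob:
    name: str
    estimated_results: int  # cost estimate: number of results expected
    fetched: int = 0


def round_robin_schedule(jobs, page_size):
    """Run jobs as sub-query tasks, one task per job per turn.

    Returns the names of the jobs in the order they complete.
    """
    queue = deque(jobs)
    completed = []
    while queue:
        job = queue.popleft()
        # One sub-query task fetches at most `page_size` results.
        job.fetched += min(page_size, job.estimated_results - job.fetched)
        if job.fetched >= job.estimated_results:
            completed.append(job.name)
        else:
            queue.append(job)  # not finished: go to the back of the line
    return completed
```

With a page size of 100, a job expecting 10 results that is submitted after a job expecting 1,000 results finishes after the large job's first task rather than after all ten of them.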
-
Publication number: US20160344758A1
Publication date: 2016-11-24
Application number: US14473920
Application date: 2014-08-29
Applicant: Palantir Technologies Inc.
Inventor: David Cohen , Jason Ma , Bing Jie Fu , Ilya Nepomnyashchiy , Steven Berler , Alex Smaliy , Jack Grossman , James Thompson , Julia Boortz , Matthew Sprague , Parvathy Menon , Michael Kross , Michael Harris , Adam Borochoff
IPC: H04L29/06 , G08B21/18 , G06F3/0484
CPC classification number: G08B21/18 , G06F3/04842 , H04L63/0281 , H04L63/1433 , H04L63/145
Abstract: Embodiments of the present disclosure relate to a data analysis system that may automatically generate memory-efficient clustered data structures, automatically analyze those clustered data structures, and provide results of the automated analysis in an optimized way to an analyst. The automated analysis of the clustered data structures (also referred to herein as data clusters) may include an automated application of various criteria or rules so as to generate a compact, human-readable analysis of the data clusters. The human-readable analyses (also referred to herein as “summaries” or “conclusions”) of the data clusters may be organized into an interactive user interface so as to enable an analyst to quickly navigate among information associated with various data clusters and efficiently evaluate those data clusters in the context of, for example, a fraud investigation. Embodiments of the present disclosure also relate to automated scoring of the clustered data structures.
-
Publication number: US09501507B1
Publication date: 2016-11-22
Application number: US13728879
Application date: 2012-12-27
Applicant: Palantir Technologies, Inc.
Inventor: Michael Harris , Jeff Wang , Bobby Prochnow
IPC: G06F17/30
CPC classification number: G06F17/30321 , G06F17/30551
Abstract: A method and apparatus for a data analysis system for analyzing data object collections that include geo-temporal data is provided. One or more temporal granularities are specified for the purpose of generating a geo-temporal data index. The time granularities correspond to temporal ranges expected to correspond to temporal ranges specified in user queries against the data. One or more temporal index bucket groups are generated based on the specified time granularities. Geo-temporal input data is indexed based on the generated temporal index bucket groups. The system allows a data analyst to specify geo-temporal queries that include both a geospatial component and a temporal component. The system transforms geo-temporal queries into one or more second queries that retrieve data items based on the temporal index bucket groups.
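The temporal part of this indexing scheme can be sketched as follows. This is an illustrative assumption, not the patent's design: the `GRANULARITIES` table and the `bucket_key`, `build_index`, and `query` functions are hypothetical, the geospatial component is omitted, and timestamps are assumed to be timezone-aware.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Hypothetical granularities in seconds: one bucket group per entry.
GRANULARITIES = {"hour": 3600, "day": 86400}


def bucket_key(granularity, ts):
    """Return (granularity, bucket number) for a timezone-aware datetime."""
    return (granularity, int(ts.timestamp()) // GRANULARITIES[granularity])


def build_index(items):
    """Index (item_id, timestamp) pairs under every granularity."""
    index = defaultdict(set)
    for item_id, ts in items:
        for granularity in GRANULARITIES:
            index[bucket_key(granularity, ts)].add(item_id)
    return index


def query(index, granularity, start, end):
    """Rewrite a time-range query as lookups of the buckets covering [start, end]."""
    _, first = bucket_key(granularity, start)
    _, last = bucket_key(granularity, end)
    hits = set()
    for bucket in range(first, last + 1):
        hits |= index.get((granularity, bucket), set())
    return hits
```

Because whole buckets are returned, the result may include items just outside the requested range; a real system would presumably filter those in a later step.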
-
Publication number: US09171334B1
Publication date: 2015-10-27
Application number: US14139628
Application date: 2013-12-23
Applicant: Palantir Technologies, Inc.
Inventor: Alexander Visbal , Adam Borochoff , Jacob Albertson , Trevor Austin , Christopher Rogers , Daniel Campos , Matthew Sprague , Michael Kross , Parvathy Menon , Michael Harris
IPC: G06Q40/00
CPC classification number: G06F17/3053 , G06F17/30345 , G06F17/30412 , G06F17/30539 , G06F17/30572 , G06F17/30598 , G06F17/30601 , G06F17/30604 , G06F17/30699 , G06F17/30705 , G06F17/3071 , G06F17/30867 , G06Q10/10 , G06Q20/4016 , G06Q30/0185 , G06Q40/00 , G06Q40/02 , G06Q40/025 , G06Q40/10 , G06Q40/123
Abstract: In various embodiments, systems, methods, and techniques are disclosed for generating a collection of clusters of related data from a seed. Seeds may be generated based on seed generation strategies or rules. Clusters may be generated by, for example, retrieving a seed, adding the seed to a first cluster, retrieving a clustering strategy or rules, and adding related data and/or data entities to the cluster based on the clustering strategy. Various cluster scores may be generated based on attributes of data in a given cluster. Further, cluster metascores may be generated based on various cluster scores associated with a cluster. Clusters may be ranked based on cluster metascores. Various embodiments may enable an analyst to discover various insights related to data clusters, and may be applicable to various tasks including, for example, tax fraud detection, beaconing malware detection, malware user-agent detection, and/or activity trend detection, among various others.
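The seed, cluster, score, and metascore pipeline described in this abstract can be sketched in a few functions. Everything here is an assumption made for illustration: the clustering strategy (follow every link reachable from the seed), the two example scores, the weighted-sum metascore, and all function and parameter names are hypothetical and are not taken from the patent.

```python
from collections import deque


def grow_cluster(seed, links):
    """Assumed clustering strategy: add every entity reachable from the seed."""
    cluster, frontier = {seed}, deque([seed])
    while frontier:
        entity = frontier.popleft()
        for neighbor in links.get(entity, ()):
            if neighbor not in cluster:
                cluster.add(neighbor)
                frontier.append(neighbor)
    return cluster


def cluster_scores(cluster, amounts):
    """Two illustrative scores computed from attributes of the cluster's data."""
    return {
        "size": len(cluster),
        "total_amount": sum(amounts.get(e, 0) for e in cluster),
    }


def metascore(scores, weights):
    """Combine the individual scores into a single ranking value."""
    return sum(weights[name] * value for name, value in scores.items())


def rank_clusters(seeds, links, amounts, weights):
    """Return (seed, metascore) pairs, highest metascore first."""
    ranked = []
    for seed in seeds:
        cluster = grow_cluster(seed, links)
        ranked.append((seed, metascore(cluster_scores(cluster, amounts), weights)))
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)
```

Keeping the scores separate from the metascore lets the ranking weights change without recomputing the clusters themselves.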
-
Publication number: US09092482B2
Publication date: 2015-07-28
Application number: US13826228
Application date: 2013-03-14
Applicant: Palantir Technologies, Inc.
Inventor: Michael Harris , John Carrino , Eric Wong
IPC: G06F17/30
CPC classification number: G06F17/30451 , G06F17/30442 , G06F17/30463 , G06F17/30477 , G06F17/30864
Abstract: A fair scheduling system with methodology for fairly scheduling queries for execution by a database management system is disclosed. The techniques involve obtaining computer-executable query jobs and cost estimates to execute the query jobs. For example, the cost estimate can be a number of results the query is expected to return. Based on the cost estimates, the fair scheduling system causes the database management system to execute the query jobs as separately executable sub-query tasks in a round-robin fashion which can decrease latency of low cost queries concurrently executing with high cost queries.
-
Publication number: US11848760B2
Publication date: 2023-12-19
Application number: US17658893
Application date: 2022-04-12
Applicant: Palantir Technologies Inc.
Inventor: Harkirat Singh , Geoffrey Stowe , Brendan Weickert , Matthew Sprague , Michael Kross , Adam Borochoff , Parvathy Menon , Michael Harris
IPC: G06Q40/00 , H04L9/40 , G06F16/2457 , G06F16/23 , G06F16/242 , G06F16/28 , G06F16/9535 , G06Q10/10 , G06Q40/02 , G06Q40/10 , G06F16/335 , G06F16/35 , G06F16/26 , G06F16/2458 , G06Q40/03 , G06Q20/40 , G06Q30/018 , G06Q40/12 , G06Q20/38
CPC classification number: H04L63/145 , G06F16/23 , G06F16/244 , G06F16/2465 , G06F16/24578 , G06F16/26 , G06F16/283 , G06F16/285 , G06F16/287 , G06F16/288 , G06F16/335 , G06F16/35 , G06F16/355 , G06F16/9535 , G06Q10/10 , G06Q20/382 , G06Q20/4016 , G06Q30/0185 , G06Q40/00 , G06Q40/02 , G06Q40/03 , G06Q40/10 , G06Q40/123
Abstract: In various embodiments, systems, methods, and techniques are disclosed for generating a collection of clusters of related data from a seed. Seeds may be generated based on seed generation strategies or rules. Clusters may be generated by, for example, retrieving a seed, adding the seed to a first cluster, retrieving a clustering strategy or rules, and adding related data and/or data entities to the cluster based on the clustering strategy. Various cluster scores may be generated based on attributes of data in a given cluster. Further, cluster metascores may be generated based on various cluster scores associated with a cluster. Clusters may be ranked based on cluster metascores. Various embodiments may enable an analyst to discover various insights related to data clusters, and may be applicable to various tasks including, for example, tax fraud detection, beaconing malware detection, malware user-agent detection, and/or activity trend detection, among various others.
-
Publication number: US20230047056A1
Publication date: 2023-02-16
Application number: US17818272
Application date: 2022-08-08
Applicant: Palantir Technologies Inc.
Inventor: Allen Chang , Christopher Male , David Cohen , Dragos-Florian Ristache , Danielle Kramer , John Garrod , Michael Harris , Ryan Zheng , Stephen Freiberg
Abstract: Systems and methods including a framework for migration of live data. The method may comprise, by one or more hardware processors executing program instructions, receiving, at a migration proxy of the framework, code for reading data and writing data compatible with each of a plurality of states of a migration of data in a data store, wherein a service is at least intermittently reading data from and writing data to the data store; determining, by a migration runner of the framework, to perform the migration of the data; initiating, by the migration runner, the migration of the data, wherein the migration comprises a plurality of stages; storing, as the migration progresses through the plurality of stages, and at a migration data store of the framework, a current stage of the migration; and during the migration, using the migration proxy to read data from and write data to the data store.
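The roles of the migration proxy, migration runner, and migration data store can be illustrated with a small in-memory sketch. The three stages, the use of dictionaries as data stores, and the uppercase conversion standing in for a new data format are all assumptions made for illustration; the abstract does not specify them.

```python
from enum import Enum


class Stage(Enum):
    NOT_STARTED = 0   # read and write the old layout only
    DUAL_WRITE = 1    # write both layouts, still read the old one
    FINISHED = 2      # read and write the new layout only


class MigrationProxy:
    """Routes reads and writes according to the stored migration stage."""

    def __init__(self, old, new, stage_store):
        self.old, self.new, self.stage_store = old, new, stage_store

    def write(self, key, value):
        stage = self.stage_store["stage"]
        if stage in (Stage.NOT_STARTED, Stage.DUAL_WRITE):
            self.old[key] = value
        if stage in (Stage.DUAL_WRITE, Stage.FINISHED):
            self.new[key] = value.upper()  # stand-in for the new data format

    def read(self, key):
        if self.stage_store["stage"] is Stage.FINISHED:
            return self.new[key]
        return self.old[key].upper()  # present old data in the new format


def run_migration(old, new, stage_store):
    """Migration runner: advance through the stages, recording each one."""
    stage_store["stage"] = Stage.DUAL_WRITE
    for key, value in list(old.items()):  # backfill existing records
        new.setdefault(key, value.upper())
    stage_store["stage"] = Stage.FINISHED
```

Because the service only talks to the proxy, it sees the same values before, during, and after the migration, while the stored stage decides which layout is actually read or written.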
-
Publication number: US20200304522A1
Publication date: 2020-09-24
Application number: US16898850
Application date: 2020-06-11
Applicant: Palantir Technologies Inc.
Inventor: Harkirat Singh , Geoffrey Stowe , Brendan Weickert , Matthew Sprague , Michael Kross , Adam Borochoff , Parvathy Menon , Michael Harris
IPC: H04L29/06 , G06Q40/00 , G06F16/2457 , G06F16/23 , G06F16/242 , G06F16/28 , G06F16/9535 , G06Q10/10 , G06Q40/02 , G06F16/335 , G06F16/35 , G06F16/26 , G06F16/2458 , G06Q20/40 , G06Q30/00 , G06Q20/38
Abstract: In various embodiments, systems, methods, and techniques are disclosed for generating a collection of clusters of related data from a seed. Seeds may be generated based on seed generation strategies or rules. Clusters may be generated by, for example, retrieving a seed, adding the seed to a first cluster, retrieving a clustering strategy or rules, and adding related data and/or data entities to the cluster based on the clustering strategy. Various cluster scores may be generated based on attributes of data in a given cluster. Further, cluster metascores may be generated based on various cluster scores associated with a cluster. Clusters may be ranked based on cluster metascores. Various embodiments may enable an analyst to discover various insights related to data clusters, and may be applicable to various tasks including, for example, tax fraud detection, beaconing malware detection, malware user-agent detection, and/or activity trend detection, among various others.
-
Publication number: US10721268B2
Publication date: 2020-07-21
Application number: US16239081
Application date: 2019-01-03
Applicant: Palantir Technologies Inc.
Inventor: Harkirat Singh , Brendan Weickert , Matthew Sprague , Michael Kross , Adam Borochoff , Parvathy Menon , Michael Harris
IPC: G06Q40/00 , H04L29/06 , G06F16/2457 , G06F16/23 , G06F16/242 , G06F16/28 , G06F16/9535 , G06Q10/10 , G06Q40/02 , G06F16/335 , G06F16/35 , G06F16/26 , G06F16/2458 , G06Q20/40 , G06Q30/00 , G06Q20/38
Abstract: In various embodiments, systems, methods, and techniques are disclosed for generating a collection of clusters of related data from a seed. Seeds may be generated based on seed generation strategies or rules. Clusters may be generated by, for example, retrieving a seed, adding the seed to a first cluster, retrieving a clustering strategy or rules, and adding related data and/or data entities to the cluster based on the clustering strategy. Various cluster scores may be generated based on attributes of data in a given cluster. Further, cluster metascores may be generated based on various cluster scores associated with a cluster. Clusters may be ranked based on cluster metascores. Various embodiments may enable an analyst to discover various insights related to data clusters, and may be applicable to various tasks including, for example, tax fraud detection, beaconing malware detection, malware user-agent detection, and/or activity trend detection, among various others.