-
Publication No.: US10812551B1
Publication Date: 2020-10-20
Application No.: US15862422
Application Date: 2018-01-04
Applicant: Amazon Technologies, Inc.
Inventor: Santosh Kalki, Swaminathan Sivasubramanian, Srinivasan Sundar Raghavan, Timothy Andrew Rath, Amol Devgan, Mukul Vijay Karnik
Abstract: A hosted analytics system may be integrated with transactional data systems and additional data sources such as real-time systems and log files. A data processing pipeline may transform data on arrival for incorporation into an n-dimensional cube. Correlation between patterns of events in transactional data may be identified. Upon arrival, new data may be transformed and incorporated into the n-dimensional cube. Similarity between the new data and a previously identified correlation may be determined and flagged.
-
Publication No.: US09882949B1
Publication Date: 2018-01-30
Application No.: US14503077
Application Date: 2014-09-30
Applicant: Amazon Technologies, Inc.
Inventor: Santosh Kalki, Swaminathan Sivasubramanian, Srinivasan Sundar Raghavan, Timothy Andrew Rath, Amol Devgan, Mukul Vijay Karnik
Abstract: A hosted analytics system may be integrated with transactional data systems and additional data sources such as real-time systems and log files. A data processing pipeline may transform data on arrival for incorporation into an n-dimensional cube. Correlation between patterns of events in transactional data may be identified. Upon arrival, new data may be transformed and incorporated into the n-dimensional cube. Similarity between the new data and a previously identified correlation may be determined and flagged.
-
Publication No.: US10685042B2
Publication Date: 2020-06-16
Application No.: US14578786
Application Date: 2014-12-22
Applicant: Amazon Technologies, Inc.
IPC: G06F17/00, G06F16/28, G06F16/242, G06F16/2453, G06F16/27
Abstract: A corpus of information describing queries used to access a transactional data store may be used to identify analytical relationships that are not explicitly defined in a schema or supplied by a user. Join relationships may be identified based on field coincidence in elements of queries in the corpus. Join relationships may be indicative of dimensions and attributes of a dimension. Hierarchy levels for a dimension may be identified based on factors including data type, reference in an aggregating clause, and reference in a grouping clause.
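The idea of inferring join relationships from field coincidence in a query corpus can be illustrated with a minimal sketch. It is not the method claimed in the patent: the regular expression, the function name `candidate_joins`, and the sample table and column names are all hypothetical, and the sketch assumes queries reference columns as `table.column` rather than through aliases.

```python
import re
from collections import Counter

# Matches equality predicates between two qualified columns,
# e.g. "orders.cust_id = customers.id". A real system would use a SQL parser.
JOIN_PRED = re.compile(r"(\w+\.\w+)\s*=\s*(\w+\.\w+)")


def candidate_joins(queries):
    """Count how often each unordered pair of qualified columns is equated."""
    counts = Counter()
    for query in queries:
        for left, right in JOIN_PRED.findall(query.lower()):
            if left != right:
                # Sort so that (a, b) and (b, a) count as the same pair.
                counts[tuple(sorted((left, right)))] += 1
    return counts


# Hypothetical corpus of logged queries.
corpus = [
    "SELECT * FROM orders JOIN customers ON orders.cust_id = customers.id",
    "select sum(orders.total) from orders, customers "
    "where customers.id = orders.cust_id group by customers.region",
]
for pair, seen in candidate_joins(corpus).most_common():
    print(pair, seen)
```

Pairs that are equated in many queries are plausible join relationships, and so candidates for dimensions or dimension attributes. Columns that appear in grouping clauses could be tallied in the same way to suggest hierarchy levels.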
-
Publication No.: US10430438B2
Publication Date: 2019-10-01
Application No.: US14494506
Application Date: 2014-09-23
Applicant: Amazon Technologies, Inc.
Inventor: Santosh Kalki, Srinivasan Sundar Raghavan, Timothy Andrew Rath, Mukul Vijay Karnik, Amol Devgan, Swaminathan Sivasubramanian
IPC: G06F16/28, H04L29/06, G06F16/24, G06F16/26, G06F16/185, G06F16/27, G06F16/901, G06F21/62
Abstract: An online analytical processing system may comprise an n-dimensional cube structured using slice-based partitioning in which each slice comprises one or more hierarchies of data points. A region of a hierarchy may be classified according to computational demands associated with the region. A scaling or replication mechanism may be applied to the region based on the computational demands associated with that region.
-
Publication No.: US20190073398A1
Publication Date: 2019-03-07
Application No.: US16179802
Application Date: 2018-11-02
Applicant: Amazon Technologies, Inc.
IPC: G06F17/30
CPC classification number: G06F16/2456, G06F16/275
Abstract: A probabilistic counting structure such as a hyperloglog may be formed during a table scan for each of a selected set of columns. The columns may be selected based on an initial estimate of relatedness, which may be based on data types of the respective columns. An estimated cardinality of an intersection or union of columns may be formed based on an intersection of the probabilistic data structures. A join path may be determined based on the estimated cardinality of an intersection or union of the columns.
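As a rough illustration of the cardinality estimates mentioned in this abstract, the sketch below builds one HyperLogLog per column, merges two sketches by taking the register-wise maximum to estimate the union, and derives the intersection by inclusion-exclusion. This is a textbook-style HyperLogLog under stated assumptions, not the implementation described in the patent: the class, the SHA-1 based hash, the precision `p=12`, and the sample column values are all illustrative choices.

```python
import hashlib
import math


class HyperLogLog:
    """Minimal HyperLogLog with 2**p registers (illustrative only)."""

    def __init__(self, p=12):
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    def add(self, value):
        # Derive a 64-bit hash from SHA-1 (an arbitrary choice for this sketch).
        digest = hashlib.sha1(str(value).encode()).digest()
        h = int.from_bytes(digest[:8], "big")
        idx = h >> (64 - self.p)                      # first p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)         # remaining 64 - p bits
        rank = (64 - self.p) - rest.bit_length() + 1  # position of leftmost 1-bit
        self.registers[idx] = max(self.registers[idx], rank)

    def union(self, other):
        # The sketch of a union is the register-wise maximum of both sketches.
        merged = HyperLogLog(self.p)
        merged.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]
        return merged

    def count(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)  # bias constant, valid for m >= 128
        estimate = alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if estimate <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            estimate = self.m * math.log(self.m / zeros)
        return estimate


def intersection_estimate(a, b):
    """Inclusion-exclusion: |A and B| is roughly |A| + |B| - |A or B|."""
    return a.count() + b.count() - a.union(b).count()


# Two hypothetical columns sharing 10,000 of their 20,000 distinct values.
col_a, col_b = HyperLogLog(), HyperLogLog()
for v in range(0, 20000):
    col_a.add(v)
for v in range(10000, 30000):
    col_b.add(v)
print(round(col_a.union(col_b).count()), round(intersection_estimate(col_a, col_b)))
```

A join-path heuristic could then prefer column pairs whose estimated intersection is large relative to their individual cardinalities. Because the intersection is a difference of three noisy estimates, its relative error is larger than that of any single count.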
-
Publication No.: US10831759B2
Publication Date: 2020-11-10
Application No.: US16179802
Application Date: 2018-11-02
Applicant: Amazon Technologies, Inc.
IPC: G06F17/00, G06F16/2455, G06F16/27
Abstract: A probabilistic counting structure such as a hyperloglog may be formed during a table scan for each of a selected set of columns. The columns may be selected based on an initial estimate of relatedness, which may be based on data types of the respective columns. An estimated cardinality of an intersection or union of columns may be formed based on an intersection of the probabilistic data structures. A join path may be determined based on the estimated cardinality of an intersection or union of the columns.
-
Publication No.: US10120905B2
Publication Date: 2018-11-06
Application No.: US14578841
Application Date: 2014-12-22
Applicant: Amazon Technologies, Inc.
Abstract: A probabilistic counting structure such as a hyperloglog may be formed during a table scan for each of a selected set of columns. The columns may be selected based on an initial estimate of relatedness, which may be based on data types of the respective columns. An estimated cardinality of an intersection or union of columns may be formed based on an intersection of the probabilistic data structures. A join path may be determined based on the estimated cardinality of an intersection or union of the columns.
-
Publication No.: US09824133B1
Publication Date: 2017-11-21
Application No.: US14494473
Application Date: 2014-09-23
Applicant: Amazon Technologies, Inc.
Inventor: Santosh Kalki, Srinivasan Sundar Raghavan, Timothy Andrew Rath, Mukul Vijay Karnik, Amol Devgan, Swaminathan Sivasubramanian
CPC classification number: G06F17/30592
Abstract: A multi-tenant system for providing hosted analytic services may be dynamically configured in response to a request from a user. A request for analytic services may comprise an indication of at least one data source to be incorporated into an n-dimensional cube. A data source connector and transformation pipeline may transform data received from the data source to a format compatible with a dimension and hierarchy model of the n-dimensional cube.
-
Publication No.: US11386115B1
Publication Date: 2022-07-12
Application No.: US14485003
Application Date: 2014-09-12
Applicant: Amazon Technologies, Inc.
IPC: G06F16/00, G06F16/27, H04L67/1097
Abstract: A transactional data storage engine may implement selectable storage endpoints. A selection of storage endpoints may be received at a transactional data storage engine. The selected storage endpoints may identify storage locations maintaining replicas of data for the transactional data storage engine. A storage engine configuration for the transactional data storage engine may be updated to include the storage endpoints so that access requests for the data may be sent to storage endpoints identified according to the storage engine configuration. In some embodiments, storage endpoints may identify strongly consistent or eventually consistent storage locations for performing reads of the data maintained for the transactional data storage engine.
-
Publication No.: US10776397B2
Publication Date: 2020-09-15
Application No.: US14494524
Application Date: 2014-09-23
Applicant: Amazon Technologies, Inc.
Inventor: Santosh Kalki, Srinivasan Sundar Raghavan, Timothy Andrew Rath, Mukul Vijay Karnik, Amol Devgan, Swaminathan Sivasubramanian
IPC: G06F16/28, H04L29/06, G06F16/24, G06F16/26, G06F16/185, G06F16/27, G06F16/901, G06F21/62
Abstract: An online analytical processing system may comprise an n-dimensional cube partitioned into slices, in which each slice may represent data points at the intersections of fixed and variable dimensions. Computation of data points within a slice may be deferred. A dependency graph may be initially constructed, in which the dependency graph is utilized in a subsequent computation. Calculation of data points may be prioritized based on information indicative of a chance that the data points will be accessed.
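Deferred computation driven by a dependency graph, with priority given to data points that are likely to be accessed, might look roughly like the sketch below. It is an assumption-laden illustration rather than the patented mechanism: the function name `computation_order`, the data point names, and the access-chance figures are hypothetical, and the standard-library `graphlib.TopologicalSorter` (Python 3.9+) stands in for whatever graph structure a real system would use.

```python
from graphlib import TopologicalSorter


def computation_order(deps, access_chance):
    """Order deferred data points so dependencies come first and, among the
    points that are ready, the most likely to be accessed is computed first.

    deps maps each data point to the set of points it is computed from.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    order = []
    ready = list(sorter.get_ready())
    while ready:
        # Highest estimated access chance first; unknown points default to 0.
        ready.sort(key=lambda point: access_chance.get(point, 0.0), reverse=True)
        point = ready.pop(0)
        order.append(point)
        sorter.done(point)
        ready.extend(sorter.get_ready())  # points unblocked by this one
    return order


# Hypothetical slice: monthly leaves roll up into quarters and a year total.
deps = {"year_total": {"q1", "q2"}, "q1": {"jan", "feb"}, "q2": {"apr"}}
chance = {"q2": 0.9, "apr": 0.8, "year_total": 0.5, "q1": 0.2, "jan": 0.1, "feb": 0.1}
print(computation_order(deps, chance))
```

This sketch ranks each ready point by its own access chance only. A fuller scheme might also weight a point by the chance that its dependents are accessed, so that inputs to a frequently read aggregate are not starved.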