Abstract:
A database management system has a query interface for receiving a query and a query executor for executing the received query. The query executor dynamically generates tasks and executes a plurality of tasks in parallel. During execution of each task, each time data must be read from a database, the query executor generates a task for acquiring that data and, in executing the generated task, issues a data read request to the database, thereby shortening the time taken to execute each task.
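The task-per-read idea can be illustrated with a minimal sketch. It assumes a thread pool stands in for the query executor and an in-memory dictionary stands in for the database; `DATABASE`, `read_page`, and `run_query` are hypothetical names, not taken from the abstract.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "database": page id -> rows.
DATABASE = {
    "orders": [("o1", "c1"), ("o2", "c2")],
    "customers:c1": ["Alice"],
    "customers:c2": ["Bob"],
}

def read_page(page_id):
    """Stand-in for a data read request issued to the database."""
    return DATABASE[page_id]

def run_query(pool):
    # Reading the outer table is itself submitted as a task.
    orders = pool.submit(read_page, "orders").result()
    # Every further read gets its own task, so the reads can overlap
    # instead of being issued one after another.
    futures = [(oid, pool.submit(read_page, f"customers:{cid}"))
               for oid, cid in orders]
    return [(oid, fut.result()[0]) for oid, fut in futures]

with ThreadPoolExecutor(max_workers=4) as pool:
    result = run_query(pool)
```

With real storage the benefit would come from overlapping read latency; here the reads are instantaneous, so the sketch only shows the control flow.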
Abstract:
Performing data analytics processing in the context of a large scale distributed system that includes a massively parallel processing (MPP) database and a distributed storage layer is disclosed. In various embodiments, a data analytics request is received. A plan is created to generate a response to the request. A corresponding portion of the plan is assigned to each of a plurality of distributed processing segments, including by invoking, as indicated in the assignment, one or more data analytical functions embedded in the processing segment.
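As a rough illustration of assigning plan portions to segments, the sketch below assumes each segment holds a local slice of the data and a small registry of embedded functions. The `Segment` class, its `EMBEDDED` registry, and `run_request` are hypothetical and only model the flow described above.

```python
from concurrent.futures import ThreadPoolExecutor

class Segment:
    """Hypothetical processing segment with embedded analytic functions."""
    EMBEDDED = {"sum": sum, "count": len}

    def __init__(self, rows):
        self.rows = rows  # the slice of data local to this segment

    def execute(self, plan_part):
        # Invoke the embedded function named in the assigned plan portion.
        return self.EMBEDDED[plan_part["function"]](self.rows)

def run_request(segments, function, combine=sum):
    # One plan portion per segment, each naming the function to invoke.
    plan = [{"segment": i, "function": function}
            for i in range(len(segments))]
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(
            lambda part: segments[part["segment"]].execute(part), plan))
    return combine(partials)

segments = [Segment([1, 2, 3]), Segment([4, 5]), Segment([6])]
total = run_request(segments, "sum")
count = run_request(segments, "count")
mean = total / count
```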
Abstract:
Embodiments relate to re-writing database query plans and visualizing such re-written query plans. A query re-write framework includes a query normalization engine in communication with a rule catalog comprising query re-write rules in the form of rule classes. The framework receives as input a query plan graph to be re-written. Based upon the engine's application of re-write rules from the catalog, the framework produces a re-written query plan graph as output. An interface component of the framework may provide a visualization of the re-written query plan graph as part of a dashboard. A user may access the framework to enable or disable existing rules in the catalog, add new rules to the catalog, and/or control the sequence and precedence in which rules are applied to re-write the query plan. A user may interact with the visualization of the re-written query plan for purposes of debugging, re-write optimization, and/or query development.
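A rule catalog with enable/disable switches and precedence might look like the following minimal sketch. It assumes plans are nested tuples and that precedence is a numeric priority; `Rule`, `rewrite`, and both example rules are hypothetical and do not come from the abstract.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Rule:
    name: str
    priority: int                      # lower value is applied first
    apply: Callable[[tuple], tuple]    # returns the node unchanged if no match
    enabled: bool = True

def is_filter(node):
    return isinstance(node, tuple) and len(node) == 3 and node[0] == "filter"

def drop_true_filter(node):
    # filter(TRUE, child) -> child
    return node[2] if is_filter(node) and node[1] == "TRUE" else node

def merge_filters(node):
    # filter(p1, filter(p2, child)) -> filter(p1 AND p2, child)
    if is_filter(node) and is_filter(node[2]):
        return ("filter", ("and", node[1], node[2][1]), node[2][2])
    return node

def rewrite(plan, catalog):
    rules = sorted((r for r in catalog if r.enabled), key=lambda r: r.priority)

    def visit(node):
        if not isinstance(node, tuple):
            return node
        node = tuple(visit(child) for child in node)   # rewrite bottom-up
        for rule in rules:
            new = rule.apply(node)
            if new != node:
                return visit(new)                      # re-check after a change
        return node

    return visit(plan)

catalog = [Rule("drop_true_filter", 0, drop_true_filter),
           Rule("merge_filters", 1, merge_filters)]
plan = ("filter", "a > 1",
        ("filter", "TRUE", ("filter", "b < 2", ("scan", "t"))))
rewritten = rewrite(plan, catalog)
```

Setting `enabled = False` on a rule or changing its `priority` changes the output plan, which is the kind of control the abstract attributes to the user.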
Abstract:
Embodiments relate to an eigenvalue-based data query. An aspect includes receiving a query request that includes a query statement. Another aspect includes calculating eigenvalues of key component elements in the query statement. Another aspect includes matching eigenvalues of nodes in an execution plan of a historical query statement to the eigenvalues of the key component elements. Yet another aspect includes generating an execution plan of the query statement based on determining that the eigenvalues of the key component elements were successfully matched to the eigenvalues of the nodes in the execution plan of the historical query statement.
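The abstract does not say how the eigenvalues are calculated. The sketch below assumes, purely for illustration, that an eigenvalue is a short hash of a normalized query element and that a match means every element's value appears among the node values of a historical plan. `eigenvalue`, `history`, and `plan_for` are hypothetical names.

```python
import hashlib

def eigenvalue(element: str) -> str:
    # Assumption: a characteristic value is modeled as a truncated hash.
    return hashlib.sha256(element.encode()).hexdigest()[:12]

# Hypothetical history: each past plan keeps the values of its nodes.
history = [
    {
        "plan": "IndexScan(orders.id) -> Project",
        "node_values": {eigenvalue("table:orders"), eigenvalue("pred:id=?")},
    },
]

def plan_for(elements, history):
    wanted = {eigenvalue(e) for e in elements}
    for entry in history:
        if wanted <= entry["node_values"]:   # every element matched a node
            return entry["plan"]             # base the new plan on the old one
    return None                              # no match: plan from scratch

reused = plan_for(["table:orders", "pred:id=?"], history)
missing = plan_for(["table:customers"], history)
```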
Abstract:
In an example embodiment, event stream processing is performed by first parsing an input query into a directed acyclic graph (DAG) including a plurality of operator nodes. Then a grouping of one or more of the operator nodes is created. One or more partitions are created in the DAG, either by the user or automatically, by forming one or more duplicates of the grouping. A splitter node is created in the DAG; the splitter node splits data from one or more event streams and distributes it among the grouping and the duplicates of the grouping. Then, the input query is resolved by processing data from one or more event streams using the DAG.
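The splitter-plus-duplicates arrangement can be sketched as follows, assuming the grouping is a filter followed by a map and the splitter routes events by an integer key. `make_group`, `splitter`, and the event layout are hypothetical.

```python
def make_group():
    """One copy of the operator grouping: a filter node, then a map node."""
    def run(events):
        return [e["value"] * 2 for e in events if e["value"] > 0]
    return run

def splitter(events, n):
    # Distribute events among the grouping and its duplicates by key.
    buckets = [[] for _ in range(n)]
    for e in events:
        buckets[e["key"] % n].append(e)
    return buckets

events = [
    {"key": 0, "value": 1},
    {"key": 1, "value": -2},
    {"key": 2, "value": 3},
    {"key": 3, "value": 4},
]

partitions = [make_group(), make_group()]   # the grouping and one duplicate
buckets = splitter(events, len(partitions))
partitioned = [out for group, bucket in zip(partitions, buckets)
               for out in group(bucket)]
unpartitioned = make_group()(events)
```

For stateless operators like these, the partitioned output contains the same results as the unpartitioned run, possibly in a different order.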
Abstract:
A locally optimized plan for executing a command using a sequence of steps can be determined for a single computing node. However, the locally optimized sequence of steps may not be optimized for a combined system comprising multiple computing nodes, any one of which may be tasked with executing the command. A plan that is optimized for the combined system may be determined by comparing the predicted cost of locally optimized plans for computing nodes in the combined system.
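A minimal sketch of the cost comparison, assuming each node reports predicted costs for its candidate plans. The node names, plan names, and cost figures are made up for illustration.

```python
# Hypothetical predicted costs: node -> {candidate plan -> cost}.
candidates = {
    "node-a": {"index_scan": 12.0, "full_scan": 40.0},
    "node-b": {"index_scan": 9.5, "full_scan": 55.0},
    "node-c": {"full_scan": 20.0},
}

def locally_optimized(plans):
    """Cheapest plan for a single node, as (plan name, cost)."""
    return min(plans.items(), key=lambda kv: kv[1])

def globally_optimized(candidates):
    # Compare the locally optimized plans across all nodes.
    local = {node: locally_optimized(p) for node, p in candidates.items()}
    node = min(local, key=lambda n: local[n][1])
    plan, cost = local[node]
    return node, plan, cost

best = globally_optimized(candidates)
```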
Abstract:
A database system comprising nodes configured in a tree structure is disclosed. The system includes a shared metadata store on the root node. Child nodes may request metadata from their ancestors. Each parent forwards the request upward until the metadata is found or the root node is reached.
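The upward forwarding can be sketched as a walk along parent links. The sketch assumes every node may hold some metadata locally and checks its own store before asking its parent; the `Node` class and the metadata keys are hypothetical.

```python
class Node:
    def __init__(self, name, parent=None, metadata=None):
        self.name = name
        self.parent = parent
        self.metadata = metadata or {}

    def lookup(self, key):
        """Return (value, name of the node that answered), or None."""
        node = self
        while node is not None:
            if key in node.metadata:
                return node.metadata[key], node.name
            node = node.parent   # forward the request to the parent
        return None              # the root was reached without a hit

root = Node("root", metadata={"schema:orders": "v3"})  # shared store
mid = Node("mid", parent=root)
leaf = Node("leaf", parent=mid, metadata={"schema:tmp": "v1"})

from_root = leaf.lookup("schema:orders")
from_self = leaf.lookup("schema:tmp")
not_found = leaf.lookup("schema:missing")
```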
Abstract:
Data query in a shared-nothing database includes obtaining a query request and generating an optimized access plan with respect to the query request. The query request relates to external data stored in an external data source and contains a definition for expected distribution of the external data. The data query also includes obtaining data distribution information related to the expected distribution based on the optimized access plan, transmitting the data distribution information to the external data source so that the external data source splits and returns the external data in accordance with the data distribution information, and executing query-related processing of the split external data in accordance with the optimized access plan.
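One way to picture the exchange is sketched below, assuming the distribution information is just a column name and a node count and that the external source splits rows by that column modulo the node count. `external_source_split` and the row layout are hypothetical.

```python
def external_source_split(rows, distribution):
    """Hypothetical external source: split rows as the database requested."""
    parts = {n: [] for n in range(distribution["nodes"])}
    for row in rows:
        target = row[distribution["column"]] % distribution["nodes"]
        parts[target].append(row)
    return parts

rows = [{"cust_id": i, "amount": 10 * i} for i in range(1, 7)]

# Distribution information derived from the access plan (assumed shape).
distribution = {"column": "cust_id", "nodes": 3}
parts = external_source_split(rows, distribution)

# Each node processes only its own split; partial results are then combined.
local_totals = {n: sum(r["amount"] for r in p) for n, p in parts.items()}
grand_total = sum(local_totals.values())
```

Because the source splits the data before returning it, each node receives only the rows it is responsible for and no redistribution step is needed inside the database.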
Abstract:
An information storage system includes: a data storing unit that stores key-value data, in which the key is one of a plurality of elements composing record data and is associated with a value including one or more pieces of record data; and a data structure converting unit that converts the data structure of the stored key-value data into another data structure by changing the key. The data structure converting unit performs the conversion in accordance with a use condition of the key-value data.
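Changing the key amounts to regrouping the records under a different element. The sketch below assumes the use condition is a count of how often each element is queried; `rekey`, `choose_key`, and the sample records are hypothetical.

```python
from collections import defaultdict

def rekey(store, new_key):
    """Regroup all records so that `new_key` becomes the key."""
    out = defaultdict(list)
    for records in store.values():
        for rec in records:
            out[rec[new_key]].append(rec)
    return dict(out)

def choose_key(access_counts):
    # Assumed use condition: key the store by the most queried element.
    return max(access_counts, key=access_counts.get)

# Store currently keyed by "user".
store = {
    "alice": [{"user": "alice", "item": "book"},
              {"user": "alice", "item": "pen"}],
    "bob": [{"user": "bob", "item": "book"}],
}

access_counts = {"user": 3, "item": 10}
new_key = choose_key(access_counts)
converted = rekey(store, new_key)
```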