Abstract:
Disclosed herein are an apparatus and method for managing a data stream distributed parallel processing service. The apparatus includes a service management unit, a Quality of Service (QoS) monitoring unit, and a scheduling unit. The service management unit registers a plurality of tasks constituting the data stream distributed parallel processing service. The QoS monitoring unit gathers information about the load of the plurality of tasks and information about the load of a plurality of nodes constituting a cluster which provides the data stream distributed parallel processing service. The scheduling unit arranges the plurality of tasks by distributing the plurality of tasks among the plurality of nodes based on the information about the load of the plurality of tasks and the information about the load of the plurality of nodes.
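The abstract does not say how the scheduling unit distributes tasks, so the following is only a minimal sketch of one plausible policy: a greedy placement that sends the heaviest remaining task to the currently least-loaded node. The function name `schedule_tasks`, the dictionary inputs, and the load units are all hypothetical.

```python
import heapq


def schedule_tasks(task_loads, node_loads):
    """Greedy load-aware placement (illustrative assumption, not the patented method).

    task_loads: dict mapping task name -> estimated load, as a QoS monitor might report
    node_loads: dict mapping node name -> current load
    Returns a dict mapping task name -> node name.
    """
    # Min-heap of (current load, node name) so the least-loaded node pops first.
    heap = [(load, node) for node, load in node_loads.items()]
    heapq.heapify(heap)

    placement = {}
    # Place the heaviest tasks first so they spread across nodes.
    for task, load in sorted(task_loads.items(), key=lambda kv: -kv[1]):
        node_load, node = heapq.heappop(heap)
        placement[task] = node
        heapq.heappush(heap, (node_load + load, node))
    return placement


if __name__ == "__main__":
    tasks = {"parse": 5, "filter": 3, "join": 4}
    nodes = {"n1": 0, "n2": 2}
    print(schedule_tasks(tasks, nodes))
```

A real scheduler would presumably also re-run this placement as the monitored loads change; the sketch covers a single scheduling pass only.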
Abstract:
Provided are a cluster data management system and a method for data restoration using a shared redo log in the cluster data management system. The data restoration method includes collecting service information of a partition served by a failed partition server, dividing redo log files written by the partition server by columns of a table including the partition, restoring data of the partition on the basis of the collected service information and log records of the divided redo log files, and selecting a new partition server that will serve the data-restored partition, and allocating the partition to the selected partition server.
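The restoration steps above can be sketched roughly as follows. This is an illustration under stated assumptions: the record layout (`row`, `column`, `value`), the use of a half-open row-key range as the "service information", and both function names are hypothetical, and selecting the new partition server is left out.

```python
from collections import defaultdict


def split_log_by_column(log_records):
    """Group redo records by column, preserving log order within each group."""
    by_column = defaultdict(list)
    for rec in log_records:
        by_column[rec["column"]].append(rec)
    return dict(by_column)


def restore_partition(log_records, row_range):
    """Replay redo records whose row key falls inside the partition's range.

    row_range is a half-open interval (lo, hi) standing in for the
    service information collected about the failed server's partition.
    """
    lo, hi = row_range
    data = {}
    for column, records in split_log_by_column(log_records).items():
        for rec in records:
            if lo <= rec["row"] < hi:
                # Later log records overwrite earlier ones for the same cell.
                data[(rec["row"], column)] = rec["value"]
    return data


if __name__ == "__main__":
    shared_log = [
        {"row": "a", "column": "c1", "value": 1},
        {"row": "m", "column": "c1", "value": 2},  # belongs to another partition
        {"row": "a", "column": "c2", "value": 3},
        {"row": "a", "column": "c1", "value": 4},
    ]
    print(restore_partition(shared_log, ("a", "k")))
```

Because each per-column group is independent in this sketch, the groups could be replayed in parallel, which may be one motivation for dividing the log by column.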
Abstract:
A concurrency control method for searching a high-dimensional index tree of a database is disclosed. The concurrency control method includes: a) adding a root node to a queue and acquiring a shared lock for a reinsertion node; b) determining whether the queue is empty, and either fetching a node from the queue and assigning the fetched node as the current node if the queue is not empty, or releasing the shared lock and terminating the search process if the queue is empty; c) acquiring a shared latch on the current node, selecting the lower nodes that are within the query range, and adding the selected nodes to the queue if the current node is not a leaf, or to the result set if the current node is a leaf; and d) returning to step b).
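Steps a) through d) can be sketched as a queue-driven range search. This is a simplified illustration, not the patented implementation: it uses one-dimensional intervals in place of high-dimensional regions, and because the Python standard library has no reader-writer lock, plain `threading.Lock` objects stand in for the shared lock and shared latches. The `Node` class and `range_search` function are hypothetical names.

```python
import threading
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Node:
    low: float
    high: float
    children: list = field(default_factory=list)  # non-leaf: child nodes
    entries: list = field(default_factory=list)   # leaf: stored values
    latch: threading.Lock = field(default_factory=threading.Lock)


# Stand-in for the shared lock taken for reinsertion (assumption: a plain mutex).
reinsert_lock = threading.Lock()


def range_search(root, q_low, q_high):
    results = []
    queue = deque([root])                    # step a: enqueue the root
    with reinsert_lock:                      # step a: hold the lock for the whole search
        while queue:                         # step b: stop when the queue is empty
            current = queue.popleft()
            with current.latch:              # step c: latch the current node
                if current.children:
                    # Non-leaf: enqueue children overlapping the query range.
                    queue.extend(c for c in current.children
                                 if c.low <= q_high and q_low <= c.high)
                else:
                    # Leaf: collect entries inside the query range.
                    results.extend(e for e in current.entries
                                   if q_low <= e <= q_high)
            # step d: loop back to step b
    return results                           # the lock is released on leaving the with block


if __name__ == "__main__":
    leaf1 = Node(0, 5, entries=[1, 4])
    leaf2 = Node(6, 10, entries=[7, 9])
    root = Node(0, 10, children=[leaf1, leaf2])
    print(range_search(root, 3, 8))
```

Each latch is held only while one node is examined, so in a real system with shared latches, concurrent searches would not block one another on the same node.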
Abstract:
Disclosed are a column-based data managing method and apparatus, and a column-based data searching method. The column-based data managing method includes determining whether the size of a column-group data file exceeds a partitioning threshold, dividing the column-group data if the size exceeds the partitioning threshold, and generating divided column-group data files.
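The threshold check and division can be illustrated with a short sketch. The abstract does not specify how the data is divided, so the row-by-row chunking below is an assumption, as are the function name `partition_column_group`, the representation of rows as strings, and the use of UTF-8 byte length as the size measure.

```python
def partition_column_group(rows, threshold_bytes):
    """Split column-group rows into chunks no larger than the threshold.

    Returns a list of chunks; each chunk stands in for one divided
    column-group data file. A single row larger than the threshold
    still gets a chunk of its own.
    """
    def row_size(row):
        return len(row.encode("utf-8"))

    # Step 1: determine whether the file exceeds the partitioning threshold.
    if sum(row_size(r) for r in rows) <= threshold_bytes:
        return [list(rows)]

    # Step 2: divide the data, closing a chunk before it would overflow.
    chunks, current, current_size = [], [], 0
    for row in rows:
        size = row_size(row)
        if current and current_size + size > threshold_bytes:
            chunks.append(current)
            current, current_size = [], 0
        current.append(row)
        current_size += size
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    print(partition_column_group(["aaaa", "bbbb", "cccc"], 8))
```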
Abstract:
Provided are a system and a method for indexing high-dimensional data in parallel in a cluster environment. The system includes a Spill-tree creation means for creating a Spill-tree using a sampled N-dimensional feature vector, a feature vector division storage means for storing the N-dimensional feature vector in a distributed manner in a terminal node of the Spill-tree, and a local signature creation means for creating and managing a local signature for the N-dimensional feature vector distributed to each node of the Spill-tree.
Abstract:
Provided are a continuous query processing apparatus and method using operations sharable among multiple queries on an Extensible Markup Language (XML) data stream. The apparatus includes: a storing unit for storing a sharable operation result; a syntactic analyzation unit for performing a syntactic analysis on a registered continuous query; a semantic analyzation unit for analyzing the meaning of the query upon receiving a syntactic analysis result from the syntactic analyzation unit; a sharable operation extracting unit for extracting a sharable operation upon receiving a semantic analysis result from the semantic analyzation unit; and a query execution unit for storing the result of the extracted sharable operation in the storing unit and performing the continuous queries on an XML data stream based on the result of the semantic analysis and the result of the sharable operation stored in the storing unit.
Abstract:
Provided are a stream data processing system and method for avoiding duplicate data processing. The system includes: an evaluation result storing unit for updating and storing a query condition evaluation result; a window evaluating unit for performing window evaluation; a data separating unit for separating data into new data and duplicate input data; a reuse result extracting unit for receiving the duplicate input data from the data separating unit and extracting the corresponding query condition evaluation result; a query condition evaluating unit for receiving the new data from the data separating unit, performing query condition evaluation, and creating a query condition evaluation result; and a result organizing unit for receiving the query condition evaluation results, merging and outputting them, and transmitting them to the evaluation result storing unit.
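The division of labor among these units can be sketched for overlapping sliding windows. This is a minimal illustration under assumptions: tuples carry a unique id, the evaluation result store is a plain dictionary, and the name `process_window` and the returned count of fresh evaluations are hypothetical.

```python
def process_window(window, predicate, cache):
    """Evaluate a query condition over one window, reusing stored results.

    window: list of (tuple_id, value) pairs currently in the window
    predicate: the query condition, a function value -> bool
    cache: dict tuple_id -> bool, standing in for the evaluation result store
    Returns (matching values, number of fresh predicate evaluations).
    """
    results = {}
    fresh = 0
    for tuple_id, value in window:
        if tuple_id in cache:
            # Duplicate input data: reuse the stored evaluation result.
            results[tuple_id] = cache[tuple_id]
        else:
            # New data: evaluate the query condition.
            results[tuple_id] = predicate(value)
            fresh += 1

    # Merge and store the results so the store matches the current window.
    cache.clear()
    cache.update(results)

    matches = [value for tuple_id, value in window if results[tuple_id]]
    return matches, fresh


if __name__ == "__main__":
    store = {}
    over_ten = lambda v: v > 10
    print(process_window([(1, 5), (2, 12), (3, 20)], over_ten, store))
    # The second window overlaps the first on tuples 2 and 3.
    print(process_window([(2, 12), (3, 20), (4, 3)], over_ten, store))
```

In the second call only the tuple with id 4 is evaluated; the other two results come from the store, which is the saving the abstract describes.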
Abstract:
Disclosed is a cache consistency maintenance procedure that selects between a detection-based cache consistency maintenance procedure optimized for record-based locking and an avoidance-based cache consistency maintenance procedure optimized for table- and block-based locking. To support the DBMS characteristic that table locking and record locking may both be used to access the same table, the two kinds of consistency maintenance policies for the same block are processed by a single buffer load process and are kept consistent with each other, providing better configuration and performance.
Abstract:
A recovery method for a high-dimensional index structure is disclosed, in which a reinsert operation is employed based on ARIES (Algorithms for Recovery and Isolation Exploiting Semantics) with page-oriented redo and logical undo. Further, a computer-readable recording medium on which a program for carrying out the method is recorded is disclosed. The recovery method for a high-dimensional index structure employing a reinsert operation according to the present invention includes the following steps. In a first step, an entry is inserted into a node, a minimum bounding region is adjusted, an overflow is processed, and a log record is stored. In a second step, recovery is performed using the stored log record.
Abstract:
A bulk loading method for use in a high-dimensional index structure, which uses some of the dimensions based on an unbalanced binarization scheme, accelerates index construction and improves search performance. To this end, the bulk loading method calculates a topology of the index from information about the index to be constructed using a given data set; splits the given data set into subsets by repeatedly establishing a split strategy and performing binarization based on the calculated topology; if a leaf node is derived from the subsets divided through this top-down recursive split process, reflects a minimum bounding region of the leaf node on a higher node; and, if a non-leaf node is generated, repeatedly performs the above processes for another subset of data, thereby producing a final root node.