Abstract:
A method for optimizing the migration efficiency of a data file over a network is provided. Specifically, the total of the compression time, the network transfer time, and the decompression time of the data file is minimized by adaptively selecting a compression method for each data block of the data file. To select a compression method for a data block, the information entropy of the data block is analyzed and the actual status of computing and system resources is considered. Further, a trade-off among resource usage, compression speed, and compression ratio is made to calculate an optimized transmission solution over the network for each data block of the data file.
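As a rough illustration of entropy-guided selection, the sketch below computes the Shannon entropy of a block and picks a codec from it. The thresholds, the `cpu_idle_fraction` parameter, and the choice of `zlib` and `lzma` as candidate codecs are assumptions made for this example; the abstract does not name specific codecs or cut-off values.

```python
import lzma
import math
import zlib
from collections import Counter

def shannon_entropy(block: bytes) -> float:
    """Shannon entropy of a block in bits per byte (0.0 to 8.0)."""
    if not block:
        return 0.0
    n = len(block)
    return -sum(c / n * math.log2(c / n) for c in Counter(block).values())

# Hypothetical codec table: name -> compression function.
CODECS = {"none": lambda b: b, "zlib": zlib.compress, "lzma": lzma.compress}

def choose_method(block: bytes, cpu_idle_fraction: float,
                  high: float = 7.5, low: float = 5.0) -> str:
    """Pick a codec name from block entropy and spare CPU (assumed thresholds)."""
    h = shannon_entropy(block)
    if h >= high:
        # Near-random data: compression would cost time and save little.
        return "none"
    if h <= low and cpu_idle_fraction > 0.5:
        # Highly redundant data and spare CPU: spend time for a better ratio.
        return "lzma"
    # Otherwise fall back to a faster, moderate-ratio codec.
    return "zlib"

block = b"sensor=42;" * 100
method = choose_method(block, cpu_idle_fraction=0.8)
payload = CODECS[method](block)
```

A fuller implementation would also weigh measured network bandwidth against each codec's throughput, since the quantity being minimized is the sum of compression, transfer, and decompression time.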
Abstract:
A method for scheduling MapReduce tasks includes receiving a set of task statistics corresponding to task execution within a MapReduce job, estimating a completion time for a set of tasks to be executed to provide an estimated completion time, calculating a soft decision point based on a convergence of a workload distribution corresponding to a set of executed tasks, calculating a hard decision point based on the estimated completion time for the set of tasks to be executed, determining a selected decision point based on the soft decision point and the hard decision point, and scheduling upcoming tasks for execution based on the selected decision point. The method may also include estimating a map task completion time and estimating a shuffle operation completion time. A computer program product and computer system corresponding to the method are also disclosed.
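The abstract does not say how the two decision points are computed or combined, so the following is only one possible reading. It assumes the completion time is estimated from the mean observed task duration, the soft point is the first time the workload distribution stops changing by more than a tolerance, the hard point is a fixed fraction of the estimated completion time, and the selected point is the earlier of the two. All function names and parameters are hypothetical.

```python
from statistics import mean

def estimate_completion_time(durations, remaining_tasks, slots):
    """Mean observed task duration times the number of remaining task waves."""
    waves = -(-remaining_tasks // slots)  # ceiling division
    return mean(durations) * waves

def soft_decision_point(history, tol=0.01):
    """Time of the first snapshot whose workload distribution differs from
    the previous snapshot by less than tol in every component, else None.
    history is a list of (time, distribution) pairs."""
    for (_, prev), (t, cur) in zip(history, history[1:]):
        if max(abs(a - b) for a, b in zip(prev, cur)) < tol:
            return t
    return None

def hard_decision_point(estimated_completion, fraction=0.5):
    """Latest acceptable decision time, as an assumed fraction of the estimate."""
    return estimated_completion * fraction

def select_decision_point(soft, hard):
    """Use the soft point if it arrives before the hard deadline."""
    return hard if soft is None else min(soft, hard)

est = estimate_completion_time([10, 20], remaining_tasks=5, slots=2)
history = [(0, [0.5, 0.5]), (10, [0.7, 0.3]), (20, [0.705, 0.295])]
point = select_decision_point(soft_decision_point(history),
                              hard_decision_point(est))
```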
Abstract:
Automatically associating information technology resource patterns with specific information technology products by receiving a set of data about information technology assets, matching a subset of that data to a pattern in a set of patterns, determining that the subset of the data represents a product associated with that pattern, reporting this determination, receiving feedback on the accuracy of the determination, and updating pattern set information in response to that feedback.
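The match, report, and feedback loop can be sketched as follows. This assumes a pattern is a set of asset attributes that must all be present and that feedback adjusts a per-pattern confidence score; the abstract specifies neither, and the `Pattern` class, attribute strings, and step size are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Pattern:
    product: str
    required: frozenset       # asset attributes that must all be present
    confidence: float = 0.5   # assumed score, adjusted by feedback

def match(assets: set, patterns: list) -> list:
    """Return patterns whose required attributes all appear in the asset
    data, highest confidence first."""
    hits = [p for p in patterns if p.required <= assets]
    return sorted(hits, key=lambda p: p.confidence, reverse=True)

def apply_feedback(pattern: Pattern, correct: bool, step: float = 0.1) -> None:
    """Raise or lower a pattern's confidence, clamped to [0, 1]."""
    delta = step if correct else -step
    pattern.confidence = min(1.0, max(0.0, pattern.confidence + delta))

patterns = [
    Pattern("PostgreSQL", frozenset({"process:postgres", "port:5432"})),
    Pattern("MySQL", frozenset({"process:mysqld", "port:3306"})),
]
assets = {"os:linux", "process:postgres", "port:5432"}
found = match(assets, patterns)        # determination to report
apply_feedback(found[0], correct=True) # update pattern set from feedback
```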
Abstract:
The invention relates to compressed data transmission in wireless data communication. Disclosed are methods and apparatuses for transporting residue of vehicle position data via a wireless network. A disclosed method includes the steps of: receiving data for updating a residue encoding schema from a monitoring server; constructing a residue encoding schema based on the data, thereby producing a constructed residue encoding schema; and storing the constructed residue encoding schema such that it becomes the current residue encoding schema; where each residue of the constructed residue encoding schema corresponds to a code, and a residue having a relatively high probability of occurrence corresponds to a code of relatively short length.
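A schema in which more probable residues receive shorter codes can be built with Huffman coding. The abstract does not name the construction, so Huffman coding is used here as one standard technique with that property; the sample residues and function names are illustrative assumptions.

```python
import heapq
from collections import Counter

def build_schema(residues):
    """Map each residue to a prefix-free bit string via Huffman coding,
    so that frequent residues get shorter codes."""
    freq = Counter(residues)
    if not freq:
        return {}
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (count, tie-breaker, {residue: code so far}).
    heap = [(count, i, {sym: ""}) for i, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        c1, _, t1 = heapq.heappop(heap)
        c2, _, t2 = heapq.heappop(heap)
        # Merge the two rarest subtrees, prefixing their codes with 0 and 1.
        merged = {s: "0" + code for s, code in t1.items()}
        merged.update({s: "1" + code for s, code in t2.items()})
        heapq.heappush(heap, (c1 + c2, tie, merged))
        tie += 1
    return heap[0][2]

def encode(residues, schema):
    """Concatenate the code of each residue into one bit string."""
    return "".join(schema[r] for r in residues)

# Hypothetical observed residues: small deviations dominate.
sample = [0] * 8 + [1] * 4 + [-1] * 2 + [5]
schema = build_schema(sample)
bits = encode(sample, schema)
```

In this sample the most frequent residue, 0, receives a one-bit code, while the rare residue 5 receives a three-bit code.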
Abstract:
A method, an apparatus, and a system for locating sensor data are provided. The method includes the steps of: obtaining an index table; intercepting a query for sensor data at runtime; extracting a characteristic parameter from a query condition; locating block identifiers of matching sensor data storage blocks in the index table by using the characteristic parameter; and loading the storage blocks into a memory space of a working processor; where the index table contains mapping relationships between block identifiers of sensor data storage blocks and characteristic attributes of sensor data.
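The lookup steps can be sketched as below. The example assumes the characteristic attributes are a sensor identifier and a time range and that storage is a simple in-memory mapping; both are illustrative choices, and the `BlockAttrs` class and function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BlockAttrs:
    sensor_id: str
    t_start: int   # first timestamp stored in the block
    t_end: int     # last timestamp stored in the block (inclusive)

def locate_blocks(index_table, sensor_id, t_from, t_to):
    """Return ids of blocks for this sensor whose time range overlaps
    the queried interval [t_from, t_to]."""
    return [bid for bid, a in index_table.items()
            if a.sensor_id == sensor_id
            and a.t_start <= t_to and a.t_end >= t_from]

def load_blocks(block_ids, storage):
    """Load only the matching blocks into the worker's memory space."""
    return {bid: storage[bid] for bid in block_ids}

# Index table: block identifier -> characteristic attributes.
index_table = {
    "blk-1": BlockAttrs("temp-01", 0, 99),
    "blk-2": BlockAttrs("temp-01", 100, 199),
    "blk-3": BlockAttrs("hum-07", 0, 199),
}
storage = {"blk-1": b"...", "blk-2": b"...", "blk-3": b"..."}

# Characteristic parameters extracted from a query condition such as
# "sensor = temp-01 AND time BETWEEN 150 AND 180".
ids = locate_blocks(index_table, "temp-01", 150, 180)
loaded = load_blocks(ids, storage)
```

Only the blocks named by the index are read, so a query avoids scanning storage blocks that cannot contain matching data.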