Abstract:
A method, apparatus, and system for custom policy driven data placement and information lifecycle management in a database management system are provided. A user or database application can specify declarative custom policies that define the movement and transformation of stored database objects. A custom policy defines, for a database object, a custom function that is evaluated to determine whether an archiving action is triggered. Archiving actions may include compression, data movement, table clustering, and other actions to place the database object into an appropriate storage tier for a lifecycle phase of the database object. The custom function is defined by the database user and can flexibly include any customized business logic using data sources internal and external to the database, including database access statistics such as segment level or block level heatmaps.
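For illustration, the following is a minimal Python sketch of how such a custom-policy evaluation pass could work; the names (CustomPolicy, get_access_stats, stale_for_90_days) and the 90-day threshold are assumptions for the example, not details taken from the abstract.

```python
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class CustomPolicy:
    object_name: str
    custom_function: Callable[[Dict], bool]  # user-supplied predicate
    action: Callable[[str], None]            # e.g. compress or move the object

def evaluate_policies(policies, get_access_stats):
    # Periodic pass: run each policy's custom function on the object's
    # access statistics (e.g. a segment-level heatmap snapshot) and
    # trigger the archiving action when the function returns True.
    for policy in policies:
        stats = get_access_stats(policy.object_name)
        if policy.custom_function(stats):
            policy.action(policy.object_name)

# Example custom function embodying business logic: trigger once a
# segment has seen no writes for 90 days according to the heatmap.
def stale_for_90_days(stats):
    return stats.get("days_since_last_write", 0) >= 90
```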
Abstract:
A method, apparatus, and system for periodic performance optimization through heatmap based management of an in-memory area are provided. A heatmap is maintained to track database accesses, and a sliding most recent time window of the heatmap is externalized at a desired granularity level to provide access statistics for candidate elements that may be placed in the in-memory area. Initially and on a periodic basis, an appropriate knapsack algorithm is chosen based on an analysis of the computational costs versus the benefits of applying various knapsack algorithms to the candidate elements. Using the chosen algorithm in conjunction with a selected performance model, an optimized configuration of the in-memory area is determined. The optimized configuration indicates a set of elements chosen from the candidate elements, optionally with specified compression levels. A task scheduler then schedules the appropriate tasks, working in a coordinated fashion, to reconfigure the in-memory area according to the optimized configuration.
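As a rough illustration of the selection step, the greedy benefit-per-byte heuristic below (in Python) stands in for one possible knapsack algorithm; the Candidate fields and the heuristic itself are assumptions, and a dynamic-programming solver could be chosen instead when its computational cost is justified.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    size_bytes: int   # footprint at a chosen compression level
    benefit: float    # e.g. heat score from the recent heatmap window

def choose_in_memory_set(candidates, budget_bytes):
    # Greedy knapsack: take candidates in order of benefit per byte
    # until the in-memory area's budget is exhausted.
    chosen, used = [], 0
    for c in sorted(candidates, key=lambda c: c.benefit / c.size_bytes,
                    reverse=True):
        if used + c.size_bytes <= budget_bytes:
            chosen.append(c)
            used += c.size_bytes
    return chosen
```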
Abstract:
A method, apparatus, and system for policy driven data placement and information lifecycle management in a database management system are provided. A user or database application can specify declarative policies that define the movement and transformation of stored database objects. The policies are associated with a database object and may also be inherited. A policy defines, for a database object, an archiving action to be taken, a scope, and a condition to be met before the archiving action is triggered. Archiving actions may include compression, data movement, table clustering, and other actions to place the database object into an appropriate storage tier for a lifecycle phase of the database object. Conditions based on access statistics can be specified at the row level and may use segment or block level heatmaps. Policy evaluation occurs periodically in the background, with actions queued as tasks for a task scheduler.
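A minimal Python sketch of the background evaluation described above follows; the Policy fields, the days_since_access callback, and the action strings are illustrative assumptions only.

```python
from dataclasses import dataclass
from queue import Queue

@dataclass
class Policy:
    object_name: str
    scope: str            # e.g. "segment" or "row"
    condition_days: int   # days without access before the action triggers
    action: str           # e.g. "COMPRESS" or "MOVE_TO_TIER_2"

task_queue: Queue = Queue()

def evaluate(policies, days_since_access):
    # Periodic background pass: when an object's inactivity satisfies a
    # policy's condition, queue the archiving action as a task for the
    # task scheduler.
    for p in policies:
        if days_since_access(p.object_name, p.scope) >= p.condition_days:
            task_queue.put((p.action, p.object_name))
```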
Abstract:
Techniques are described herein for sharing a dictionary across multiple in-memory compression units (IMCUs). After a dictionary is used to encode a first column vector in a first IMCU, the same dictionary is used to encode a second column vector in a second IMCU. The entries in the dictionary are in sort order to facilitate binary searching when performing value-to-code look-ups. If, during the encoding of the second column vector, values are encountered for which the dictionary does not already have codes, then a “sort-order-boundary” is established after the last entry in the dictionary, and entries for the newly encountered values are added to the dictionary, after the sort-order-boundary. To facilitate value-to-code look-ups, the new entries are also sorted relative to each other, creating a second “sort order set”. A new version of the dictionary may be created when the number of sort order sets in the first version of the dictionary reaches a configurable threshold.
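The Python sketch below illustrates the sort-order-set idea under simplifying assumptions (the class name, in-memory list representation, and methods are invented for the example): codes are dictionary positions, look-ups binary-search each sorted set, and values first encountered while encoding a later column vector are appended after the sort-order-boundary as a new, internally sorted set.

```python
from bisect import bisect_left

class SharedDictionary:
    def __init__(self, initial_values):
        # First sort order set: sorted distinct values; a value's code
        # is its position in the dictionary.
        self.values = sorted(set(initial_values))
        self.boundaries = [len(self.values)]  # end index of each set

    def _lookup(self, value):
        # Binary search within each independently sorted set.
        start = 0
        for end in self.boundaries:
            i = bisect_left(self.values, value, start, end)
            if i < end and self.values[i] == value:
                return i
            start = end
        return None

    def encode(self, column_vector):
        # Values with no code yet go after the current sort-order-boundary
        # as a new, internally sorted set.
        missing = sorted({v for v in column_vector
                          if self._lookup(v) is None})
        if missing:
            self.values.extend(missing)
            self.boundaries.append(len(self.values))
        return [self._lookup(v) for v in column_vector]
```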
Abstract:
Techniques related to automatic cache management are disclosed. In some embodiments, one or more non-transitory storage media store instructions which, when executed by one or more computing devices, cause performance of an automatic cache management method when a determination is made to store a first set of data in a cache. The method involves determining whether an amount of available space in the cache is less than a predetermined threshold. When the amount of available space in the cache is less than the predetermined threshold, a determination is made as to whether a second set of data has a lower ranking than the first set of data by at least a predetermined amount. When the second set of data has a lower ranking than the first set of data by at least the predetermined amount, the second set of data is evicted. Thereafter, the first set of data is cached.
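A compact Python sketch of the described admission and eviction decision is given below; the dict-based cache, the rank callback, and the parameter names are assumptions for illustration, not the cache's actual interface.

```python
def maybe_cache(cache, capacity, key, data, rank, min_free, margin):
    """cache: dict of key -> data; rank: key -> numeric ranking."""
    if capacity - len(cache) >= min_free:
        cache[key] = data          # enough available space: just cache it
        return True
    # Available space is below the threshold: evict the lowest-ranked
    # resident entry only if the incoming data outranks it by at least
    # `margin`, then cache the new data.
    victim = min(cache, key=rank) if cache else None
    if victim is not None and rank(key) - rank(victim) >= margin:
        del cache[victim]
        cache[key] = data
        return True
    return False
```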
Abstract:
Techniques related to efficient data storage and retrieval using a heterogeneous main memory are disclosed. A database includes a set of persistent format (PF) data that is stored on persistent storage in a persistent format. The database is maintained on the persistent storage and is accessible to a database server. The database server converts the set of PF data to sets of mirror format (MF) data and stores the MF data in a hierarchy of random-access memories (RAMs). Each RAM in the hierarchy has an associated latency that is different from a latency associated with any other RAM in the hierarchy. Storing the sets of MF data in the hierarchy of RAMs includes (1) selecting, based on one or more criteria, a respective RAM in the hierarchy to store each set of MF data and (2) storing said each set of MF data in the respective RAM.
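One way the tier-selection criterion could look is sketched below in Python, assuming a simple heat-based rule (hotter mirror-format data goes to lower-latency RAM); the RamTier fields and the heat scores are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class RamTier:
    name: str
    latency_ns: int
    capacity_mb: int
    used_mb: int = 0

def place_mf_data(mf_chunks, tiers):
    """mf_chunks: list of (name, size_mb, heat) tuples."""
    placement = {}
    tiers = sorted(tiers, key=lambda t: t.latency_ns)
    # Hottest data first, each chunk into the fastest tier with room.
    for name, size_mb, heat in sorted(mf_chunks, key=lambda c: c[2],
                                      reverse=True):
        for tier in tiers:
            if tier.used_mb + size_mb <= tier.capacity_mb:
                tier.used_mb += size_mb
                placement[name] = tier.name
                break
    return placement
```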
Abstract:
Techniques are provided for determining costs for alternative execution plans for a query, where at least a portion of the data items required by the query are in in-memory compression units within volatile memory. The techniques involve maintaining in-memory statistics, such as statistics that indicate what fraction of a table is currently present in in-memory compression units, and the cost of decompressing in-memory compression units. Those statistics are used to determine, for example, the cost of a table scan that retrieves some or all of the necessary data items from the in-memory compression units.
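A back-of-the-envelope Python sketch of such a cost estimate follows; the cost constants and the linear formula are assumptions for illustration rather than the optimizer's actual model.

```python
def table_scan_cost(total_rows, in_memory_fraction,
                    decompress_cost_per_row, disk_cost_per_row):
    # Rows served from in-memory compression units pay a decompression
    # cost; the remaining rows pay the (typically larger) disk cost.
    in_memory_rows = total_rows * in_memory_fraction
    on_disk_rows = total_rows - in_memory_rows
    return (in_memory_rows * decompress_cost_per_row
            + on_disk_rows * disk_cost_per_row)

# Example: 60% of a 1,000,000-row table is currently in compression units.
cost = table_scan_cost(1_000_000, 0.6, 0.001, 0.05)
```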
Abstract:
A method, apparatus, and system for automated information lifecycle management using low access patterns in a database management system are provided. A user or the database can store policy data that defines an archiving action to be taken when an activity-level condition is met on one or more database objects. The archiving actions may include compression, data movement, and other actions to place the database object in an appropriate storage tier for a lifecycle phase of the database object. The activity-level condition may specify that the database object meets a low access pattern, optionally for a minimum time period. Various criteria, including access statistics for the database object and cost characteristics of current and target compression levels or storage tiers, may be considered to determine whether the activity-level condition is met. The policies may be evaluated on an adjustable periodic basis and may utilize a task scheduler for minimal performance impact.
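A hypothetical Python sketch of the activity-level check is shown below; the statistics keys, thresholds, and the cost comparison are assumptions made for the example.

```python
def meets_low_access_condition(stats, min_days, max_accesses_per_day,
                               current_cost, target_cost):
    # The low access pattern must have held for a minimum time period.
    if stats["days_observed"] < min_days:
        return False
    low_activity = stats["accesses_per_day"] <= max_accesses_per_day
    # Trigger only when the target compression level or storage tier is
    # actually cheaper than the object's current placement.
    return low_activity and target_cost < current_cost
```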