Abstract:
The present invention relates to autonomous tuning of a data grid of documents in a database. Herein are techniques for storage cells to autonomously maintain local indices and other optimization metadata and algorithms to accelerate selective access into a distributed collection of documents. In an embodiment, each storage cell persists a respective subset of documents. Each storage cell stores, in memory, respective index(s) that map each item to location(s), in one or more documents of the respective subset of documents, where the item occurs. One or more computers execute, based on at least a subset of the indices of the storage cells, a data access request from a database management system. In an embodiment, a cloud of JSON document services provides an easy-to-use, fully autonomous JSON document database that horizontally and elastically scales to deliver fast execution of document transactions and queries without needing tuning by a database administrator.
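The per-cell index described above can be illustrated with a minimal sketch. It assumes flat documents with hashable field values; the `StorageCell` class and `execute_request` function are hypothetical names for illustration, not part of the described system.

```python
from collections import defaultdict


class StorageCell:
    """Hypothetical storage cell: a subset of documents plus an in-memory index."""

    def __init__(self):
        self.documents = {}             # doc_id -> document (this cell's subset)
        self.index = defaultdict(list)  # item -> [(doc_id, field), ...]

    def add(self, doc_id, doc):
        # Store the document and record where each item (field value) occurs.
        self.documents[doc_id] = doc
        for field, value in doc.items():
            self.index[value].append((doc_id, field))

    def lookup(self, item):
        # Return the locations of the item within this cell only.
        return list(self.index.get(item, []))


def execute_request(cells, item):
    """Fan a lookup out to every cell and merge the locations that are found."""
    hits = []
    for cell_no, cell in enumerate(cells):
        for doc_id, field in cell.lookup(item):
            hits.append((cell_no, doc_id, field))
    return hits


cells = [StorageCell(), StorageCell()]
cells[0].add("d1", {"city": "Oslo", "status": "open"})
cells[1].add("d2", {"city": "Lima", "status": "open"})
open_hits = execute_request(cells, "open")
```

Each cell maintains its own index, so adding a cell does not require rebuilding the indices of the others.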
Abstract:
A query may be rewritten to leverage information stored in a structured XML index. An operator in the query may be analyzed to determine an input source database object for the operator by traversing an operator tree rooted at the operator. The path expressions associated with the operator tree may be fused together to form an effective path expression for the operator. If the effective path expression directly matches a path expression derived from the index, the query may be rewritten using references to the index. Operators in a query that have effective paths that refer to data in the same index table may be grouped together. A single subquery may be written for a group of operators. Also, a structured XML index may be used as an implied schema for indexed XML data. This implied schema may be used to optimize queries that refer to the indexed XML data.
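A rough sketch of path fusion and operator grouping follows. The index contents, table names, and operator names are invented for illustration, and path expressions are simplified to slash-separated steps.

```python
from collections import defaultdict

# Hypothetical mapping from indexed path expressions to (index table, column).
INDEX_PATHS = {
    "/order/item/price": ("ORDER_ITEM_IDX", "PRICE"),
    "/order/item/sku": ("ORDER_ITEM_IDX", "SKU"),
    "/order/customer/name": ("ORDER_CUST_IDX", "NAME"),
}


def fuse_paths(path_chain):
    """Fuse the path fragments collected along an operator tree into one path."""
    steps = []
    for path in path_chain:
        steps.extend(step for step in path.split("/") if step)
    return "/" + "/".join(steps)


def plan_rewrite(operators):
    """Group operators whose effective paths refer to the same index table."""
    groups = defaultdict(list)
    unmatched = []
    for name, path_chain in operators.items():
        target = INDEX_PATHS.get(fuse_paths(path_chain))
        if target is None:
            unmatched.append(name)  # no direct match: leave the operator as-is
        else:
            table, column = target
            groups[table].append((name, column))
    return dict(groups), unmatched


operators = {
    "op_price": ["/order", "item", "price"],
    "op_sku": ["/order/item", "sku"],
    "op_note": ["/order", "note"],
}
groups, unmatched = plan_rewrite(operators)
```

Here each key of `groups` stands for one index table, for which a single subquery could be written.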
Abstract:
Techniques are described for materializing pre-computed results of expressions. In an embodiment, a set of one or more column units are stored in volatile or non-volatile memory. Each column unit corresponds to a column that belongs to an on-disk table within a database managed by a database server instance and includes data items from the corresponding column. A set of one or more virtual column units, and data that associates the set of one or more column units with the set of one or more virtual column units, are also stored in memory. The set of one or more virtual column units includes a particular virtual column unit storing results that are derived by evaluating an expression on at least one column of the on-disk table.
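As a simplified illustration, the relationship between column units and a virtual column unit might look like the following. The column names, the expression, and the helper functions are assumptions made for this sketch.

```python
# In-memory column units, one list of data items per on-disk column (hypothetical data).
column_units = {
    "price": [10.0, 4.5, 20.0],
    "qty": [2, 10, 1],
}

virtual_column_units = {}  # expression name -> pre-computed results
associations = {}          # expression name -> source columns it was derived from


def materialize(name, expression, source_columns):
    """Evaluate the expression once per row and keep the results in memory."""
    rows = zip(*(column_units[c] for c in source_columns))
    virtual_column_units[name] = [expression(*row) for row in rows]
    associations[name] = tuple(source_columns)


def read(name):
    """Serve pre-computed results when available, else the plain column unit."""
    if name in virtual_column_units:
        return virtual_column_units[name]
    return column_units[name]


materialize("price*qty", lambda price, qty: price * qty, ["price", "qty"])
totals = read("price*qty")
```

A query that references `price*qty` can then read the stored results instead of re-evaluating the expression for every row.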
Abstract:
Hierarchical data objects are indexed using an index referred to herein as a hierarchy-value index. A hierarchy-value index has, as index keys, tokens (tag name, a word in node string value) that are extracted from hierarchical data objects. Each token is mapped to the locations that correspond to the data for the token in hierarchical data objects. A token can represent a non-leaf node, such as an XML element or a JSON field. A location can be a region covering and subsuming child nodes. For a token that represents a non-leaf node, a location to which the token is mapped contains the location of any token corresponding to a descendant node of the non-leaf node. Thus, token containment based on the locations of tokens within a hierarchical data object may be used to determine containment relationships between nodes in a hierarchical data object.
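The token containment idea can be sketched as follows, under simplifying assumptions: the hierarchical data object is a nested Python dict (arrays are not handled), and a location is an interval of positions assigned during traversal. The function names are hypothetical.

```python
from collections import defaultdict


def build_index(doc):
    """Map each token (field name or word) to (start, end) location intervals."""
    index = defaultdict(list)
    position = 0

    def visit(value):
        nonlocal position
        if isinstance(value, dict):
            for name, child in value.items():
                start = position
                position += 1
                visit(child)
                # The field's region covers every token of its descendants.
                index[name].append((start, position))
                position += 1
        else:
            for word in str(value).split():
                index[word.lower()].append((position, position))
                position += 1

    visit(doc)
    return index


def contains(outer, inner):
    """True when the outer interval subsumes the inner interval."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def field_contains(index, field, token):
    """Check whether any occurrence of token lies inside any region of field."""
    return any(
        contains(region, location)
        for region in index.get(field, [])
        for location in index.get(token, [])
    )


doc = {
    "person": {"name": "Ada Lovelace", "city": "London"},
    "note": "Ada wrote programs",
}
index = build_index(doc)
```

For example, `field_contains(index, "person", "ada")` is true because the interval of one occurrence of `ada` falls inside the region recorded for `person`, while `field_contains(index, "city", "ada")` is false.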
Abstract:
Data can be categorized into facts, information, hypotheses, and directives. Activities that generate certain categories of data based on other categories of data through the application of knowledge can be categorized into classifications, assessments, resolutions, and enactments. Activities can be driven by a Classification-Assessment-Resolution-Enactment (CARE) control engine. The CARE control engine and these categorizations can be used to enhance a multitude of systems, for example diagnostic systems, such as through historical record keeping, machine learning, and automation. Such a diagnostic system can include a system that forecasts computing system failures based on the application of knowledge to system vital signs such as thread or stack segment intensity and memory heap usage. These vital signs are facts that can be classified to produce information such as memory leaks, convoy effects, or other problems. Classification can involve the automatic generation of classes, states, observations, predictions, norms, objectives, and the processing of sample intervals having irregular durations.
Abstract:
JSON Duality Views are object views that return JDV objects. JDV objects are virtual because they are not stored in a database as JSON objects. Rather, JDV objects are stored in shredded form across tables and table attributes (e.g., columns) and returned by a DBMS in response to database commands that request a JDV object from a JSON Duality View. Through JSON Duality Views, changes to the state of a JDV object may be specified at the level of a JDV object. JDV objects are updated in a database using optimistic locking.
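A toy sketch of the idea: an object is assembled on demand from shredded rows, and an update is accepted only if the object has not changed since it was read. The tables, the `get_order` and `put_order` helpers, and the use of a content hash as the version check are all assumptions for illustration, not the mechanism of any particular DBMS.

```python
import hashlib
import json

# Hypothetical shredded storage: one parent table and one child table.
orders = {1: {"customer": "Ada"}}
order_items = [{"order_id": 1, "sku": "A7", "qty": 2}]


def etag(obj):
    """Fingerprint of the object's current state, used for the optimistic check."""
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode()).hexdigest()


def get_order(order_id):
    """Assemble the virtual object from the underlying rows."""
    obj = {
        "id": order_id,
        "customer": orders[order_id]["customer"],
        "items": [
            {"sku": row["sku"], "qty": row["qty"]}
            for row in order_items
            if row["order_id"] == order_id
        ],
    }
    return obj, etag(obj)


def put_order(new_obj, expected_etag):
    """Write the object back to the rows unless it changed since it was read."""
    _, current = get_order(new_obj["id"])
    if current != expected_etag:
        return False  # stale read: the caller should re-read and retry
    orders[new_obj["id"]]["customer"] = new_obj["customer"]
    order_items[:] = [r for r in order_items if r["order_id"] != new_obj["id"]]
    order_items.extend({"order_id": new_obj["id"], **item} for item in new_obj["items"])
    return True


obj, tag = get_order(1)
obj["items"][0]["qty"] = 3
first_write = put_order(obj, tag)   # accepted: state unchanged since the read
second_write = put_order(obj, tag)  # rejected: the tag is now stale
```

No lock is held between the read and the write; a conflict is detected only at write time.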
Abstract:
Techniques are provided for creating a “ubiquitous search index” which allows for full-text as well as value range-based search across all columns from multiple database tables, multiple user-defined unmaterialized views, and external sources. In one implementation, the data is indexed in a peculiarly constructed schema-based JSON format without duplicating data. The techniques maintain eventual consistency with the normalized source of truth database tables, and do not have a significant impact on the performance of transactional Data Manipulation Language (DML) operations.
Abstract:
A computer analyzes a relational schema of a database to generate a data entry schema and encodes the data entry schema as JSON. The data entry schema is sent to a database client so that the client can validate entered data before the entered data is sent for storage. Entered data received from the client conforms to the data entry schema because the client used the data entry schema to validate the entered data before sending it. The entered data, which conforms to the data entry schema, is then stored in the database. The data entry schema and the relational schema have corresponding constraints on a datum to be stored, such as a range limit for a database column or an express set of distinct valid values. A constraint may specify a format mask or regular expression that values in the column should conform to, or a correlation between values of multiple columns.
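The flow can be sketched as follows. The relational constraints, the JSON layout (loosely modeled on JSON Schema keywords), and the simplified client-side validator are all assumptions for illustration; in particular, the pattern here must match the whole value.

```python
import json
import re

# Hypothetical column constraints extracted from a relational schema.
relational_schema = {
    "age": {"type": "INTEGER", "min": 0, "max": 150},
    "status": {"type": "VARCHAR", "allowed": ["new", "active", "closed"]},
    "zip": {"type": "VARCHAR", "pattern": r"\d{5}"},
}


def to_entry_schema(rel):
    """Server side: translate column constraints into a JSON-encoded schema."""
    props = {}
    for column, c in rel.items():
        rule = {"type": "integer" if c["type"] == "INTEGER" else "string"}
        if "min" in c:
            rule["minimum"] = c["min"]
        if "max" in c:
            rule["maximum"] = c["max"]
        if "allowed" in c:
            rule["enum"] = c["allowed"]
        if "pattern" in c:
            rule["pattern"] = c["pattern"]
        props[column] = rule
    return json.dumps({"type": "object", "properties": props})


def validate(schema_json, record):
    """Client side: return the names of fields that violate the schema."""
    schema = json.loads(schema_json)
    bad = []
    for field, rule in schema["properties"].items():
        value = record.get(field)
        expected = int if rule["type"] == "integer" else str
        ok = isinstance(value, expected) and not isinstance(value, bool)
        if ok and "minimum" in rule:
            ok = value >= rule["minimum"]
        if ok and "maximum" in rule:
            ok = value <= rule["maximum"]
        if ok and "enum" in rule:
            ok = value in rule["enum"]
        if ok and "pattern" in rule:
            ok = re.fullmatch(rule["pattern"], value) is not None
        if not ok:
            bad.append(field)
    return bad


entry_schema = to_entry_schema(relational_schema)
good = validate(entry_schema, {"age": 36, "status": "active", "zip": "02139"})
bad = validate(entry_schema, {"age": 200, "status": "paused", "zip": "2139"})
```

Only records for which `validate` returns an empty list would be sent to the server for storage.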
Abstract:
The disclosed embodiments relate to a system that automatically adapts a prognostic-surveillance system to account for aging phenomena in a monitored system. During operation, the prognostic-surveillance system is operated in a surveillance mode, wherein a trained inferential model is used to analyze time-series signals from the monitored system to detect incipient anomalies. During the surveillance mode, the system periodically calculates a reward/cost metric associated with updating the trained inferential model. When the reward/cost metric exceeds a threshold, the system swaps the trained inferential model with an updated inferential model, which is trained to account for aging phenomena in the monitored system.
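The swap decision can be sketched as below. The abstract does not define the reward/cost metric, so the one used here (excess mean residual divided by a retraining cost) is purely an assumption, as are the stand-in `MeanModel` and all parameter names.

```python
from statistics import mean


class MeanModel:
    """Stand-in inferential model: predicts the mean of its training data."""

    def __init__(self, training):
        self.center = mean(training)

    def residual(self, x):
        return x - self.center


def reward_cost(model, window, baseline, retrain_cost):
    """Assumed metric: residual growth beyond the baseline, per unit of cost."""
    drift = mean(abs(model.residual(x)) for x in window) - baseline
    return max(drift, 0.0) / retrain_cost


def surveil(model, windows, baseline, retrain_cost, threshold):
    """Check the metric once per window and swap the model when it is exceeded."""
    swaps = 0
    for window in windows:
        if reward_cost(model, window, baseline, retrain_cost) > threshold:
            model = MeanModel(window)  # updated model trained on recent data
            swaps += 1
    return model, swaps


initial = MeanModel([10.0, 10.5, 9.5])
windows = [[10.0, 10.5, 9.5], [12.0, 12.5, 11.5]]
final, swaps = surveil(initial, windows, baseline=0.5, retrain_cost=1.0, threshold=1.0)
```

In this run the first window stays below the threshold, while the second, whose values have drifted upward, triggers a single swap.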