Abstract:
Systems, methods, and computer-readable media for determining column ordering of a data storage table for search optimization are described herein. In some examples, a computing system is configured to receive input containing statistics of a plurality of queries. The computing system can then determine a new column order (i.e., layout) based at least in part on the statistics. In some example techniques described herein, the computing system can determine the new column order based at least in part on the hardware components storing the data storage table, storage system parameters, and/or user preference information. Example techniques described herein can apply the new column order to data subsequently added to the data storage table. Example techniques described herein can apply the new column order to existing data in the data storage table.
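One way to picture the statistics-driven reordering is the minimal sketch below. It assumes the statistics are simply the list of columns each query filtered on, and that "better" means placing the most frequently filtered columns first; the function name and the input shape are hypothetical and not taken from the abstract.

```python
from collections import Counter

def determine_column_order(current_order, query_stats):
    """Return a new column order, most frequently filtered columns first.

    query_stats: iterable of lists, each naming the columns one query
    filtered on (an assumed, simplified form of "statistics").
    Ties keep their relative position from current_order, because
    sorted() is stable.
    """
    counts = Counter(col for cols in query_stats for col in cols)
    return sorted(current_order, key=lambda col: -counts[col])

new_order = determine_column_order(
    ["id", "name", "region", "ts"],
    [["region"], ["region", "ts"], ["ts"], ["region"]],
)
print(new_order)
```

Hardware, storage parameters, and user preferences, which the abstract also mentions, could enter as extra terms in the sort key; they are left out here.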
Abstract:
Embodiments of the present disclosure provide a data query method and apparatus that achieve the technical effect of querying data that meets a precision requirement, according to a user's requirement for data precision. The method includes: receiving a query instruction that includes a query condition and query precision; determining a data partition that meets the query condition; determining a data sub-partition corresponding to the query precision from the data partition; and querying data in the data sub-partition to obtain a query result.
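The three steps of the method can be sketched as follows. The layout is an assumption made for illustration: partitions are keyed by day, and each partition holds one sub-partition per precision label (for example "1s" for raw samples and "1m" for a downsampled copy). None of these names come from the abstract.

```python
def query_with_precision(partitions, condition, precision):
    """Return rows from the sub-partition matching `precision` in every
    partition whose key satisfies `condition`.

    partitions: dict mapping a partition key to a dict of
                {precision label: list of rows} (hypothetical layout).
    """
    results = []
    for key, sub_partitions in partitions.items():
        if not condition(key):                    # partition must meet the query condition
            continue
        rows = sub_partitions.get(precision, [])  # pick the sub-partition for this precision
        results.extend(rows)                      # query only that sub-partition
    return results

partitions = {
    "2024-01-01": {"1s": [10, 11, 12, 13], "1m": [11]},
    "2024-01-02": {"1s": [20, 21, 22, 23], "1m": [21]},
}
coarse = query_with_precision(partitions, lambda day: day >= "2024-01-02", "1m")
print(coarse)
```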
Abstract:
A system, method, and computer-readable medium for allocation of a Locality-sensitive Non-Unique Secondary Index are provided. The Locality-sensitive Non-Unique Secondary Index preserves the similarity of incorporated fields as well as improves the average secondary index sub-table look-up performance and is advantageously resilient to the type of predicates and workloads applied thereto. Rows of the secondary index having values of the columns that are hashed to determine a secondary index sub-table row location have a higher probability of being closely located within the secondary index than rows with more dissimilar column values that are hashed to determine the secondary index row location.
Abstract:
According to some embodiments of the invention, a method of data management is provided. The method includes generating a plurality of sub-tables in a table of a relational database. Each sub-table has a predicate that indicates at least a partial description of information to be stored in the sub-table. The method also includes storing in the plurality of sub-tables one or more records having data. Each record is stored in the sub-table having the predicate that matches at least a portion of the data of the record.
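A minimal in-memory sketch of predicate-routed sub-tables is shown below. The class and method names are hypothetical, and two choices are assumptions rather than anything the abstract states: predicates are plain Python callables, and a record goes to the first sub-table whose predicate matches.

```python
class PredicateTable:
    """A table split into sub-tables, each described by a predicate."""

    def __init__(self):
        self.sub_tables = []  # list of (name, predicate, rows)

    def add_sub_table(self, name, predicate):
        self.sub_tables.append((name, predicate, []))

    def insert(self, record):
        """Store the record in the first sub-table whose predicate matches."""
        for name, predicate, rows in self.sub_tables:
            if predicate(record):
                rows.append(record)
                return name
        raise ValueError("no sub-table predicate matches the record")

    def rows(self, name):
        for sub_name, _, rows in self.sub_tables:
            if sub_name == name:
                return rows
        raise KeyError(name)

table = PredicateTable()
table.add_sub_table("europe", lambda r: r["region"] == "EU")
table.add_sub_table("other", lambda r: True)
table.insert({"id": 1, "region": "EU"})
table.insert({"id": 2, "region": "US"})
```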
Abstract:
Current cell values are provided to a client using two passes. When a first request to provide values is received during a first pass, default values are provided to the client. Upon receiving each value request, the formula parameters that are associated with the cell are collected. The formula parameters are parsed to determine data that is to be retrieved from a database. Once the locations for all of the data to be retrieved have been determined, the data is retrieved from the database in as few hits as possible. After obtaining the current values from the database, the client is informed to request the values a second time. When the second request to provide values is received, the client is provided with the calculated values during the second pass.
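The two-pass flow can be sketched as follows, under simplifying assumptions: a dict stands in for the database, each cell's parsed formula parameters reduce to a single lookup key, and the class and method names are hypothetical.

```python
class TwoPassProvider:
    DEFAULT = 0  # placeholder handed out during the first pass

    def __init__(self, database):
        self.database = database  # dict standing in for a real database
        self.pending = set()      # keys collected during the first pass
        self.cache = {}           # values fetched between the passes
        self.fetch_count = 0      # number of database "hits"

    def get_value(self, key):
        """First pass: record the key and return a default.
        Second pass: return the fetched value."""
        if key in self.cache:
            return self.cache[key]
        self.pending.add(key)
        return self.DEFAULT

    def flush(self):
        """Fetch every pending key in one batch. Returns True when the
        client should request the values a second time."""
        if not self.pending:
            return False
        self.fetch_count += 1
        self.cache.update({k: self.database[k] for k in self.pending})
        self.pending.clear()
        return True

provider = TwoPassProvider({"A1": 5, "B1": 7})
first = [provider.get_value(k) for k in ("A1", "B1")]       # defaults
if provider.flush():                                        # one batched fetch
    second = [provider.get_value(k) for k in ("A1", "B1")]  # real values
```

Batching all pending keys into one `flush` is what keeps the number of database hits low in this sketch.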
Abstract:
Described herein are approaches for generating execution plans for database commands that include an in-list predicate. The approaches can be used to generate execution plans that exploit the power of in-list iterators in ways and under circumstances not previously supported by conventional DBMSs. An in-list iterator may be used with execution subplans for processing multi-column in-list queries. An in-list iterator is used with execution subplans that scan function-based indexes. The execution plans for a multi-column in-list query limit table scans to only table partitions that contain data that satisfy the query.
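As an illustration of the last point only, the sketch below drives a scan from a multi-column in-list and touches only the partitions that can hold matching rows. It is not how any particular DBMS implements in-list iterators. The function name is hypothetical, and so is the assumption that the partition of a row can be computed from the in-list values themselves through a caller-supplied `partition_of` function.

```python
def in_list_scan(partitions, partition_of, columns, in_list):
    """Evaluate `columns IN in_list` one value tuple at a time.

    partitions:   dict mapping partition id -> list of row dicts
    partition_of: maps an in-list value tuple to a partition id (assumed)
    Returns (matching rows, set of partition ids that were scanned).
    """
    matches, scanned = [], set()
    for values in dict.fromkeys(in_list):   # iterate the in-list, skipping duplicates
        pid = partition_of(values)
        scanned.add(pid)
        for row in partitions.get(pid, []):
            if tuple(row[c] for c in columns) == values:
                matches.append(row)
    return matches, scanned

partitions = {
    "eu": [{"region": "eu", "status": "open"}, {"region": "eu", "status": "closed"}],
    "us": [{"region": "us", "status": "closed"}],
    "apac": [{"region": "apac", "status": "open"}],
}
rows, scanned = in_list_scan(
    partitions,
    lambda values: values[0],               # partition key is the first column
    ("region", "status"),
    [("eu", "open"), ("us", "closed")],
)
```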
Abstract:
A computer-implemented method of storing RDF graph data in a graph database including a set of RDF tuples. The method includes obtaining one or more adjacency matrices, wherein each adjacency matrix represents a group of tuples of the graph database comprising a same predicate. The method further includes storing, for each of the one or more adjacency matrices, a data structure that includes an array. The array includes one or more indices each pointing to a sub-division of the adjacency matrix, and/or one or more elements each representing a group of tuples of the RDF graph database of a respective sub-division of the adjacency matrix.
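A rough sketch of the idea follows. It assumes sparse matrices stored as sets of (row, column) cells, a single level of subdivision into four quadrants, and a matrix size that is even and larger than every node id. The function names and the quadrant layout are illustrative choices, not details from the abstract.

```python
from collections import defaultdict

def build_adjacency(triples):
    """Group (subject, predicate, object) triples into one sparse
    adjacency matrix per predicate, stored as a set of (row, col) cells."""
    ids = {}

    def node_id(term):
        return ids.setdefault(term, len(ids))

    matrices = defaultdict(set)
    for s, p, o in triples:
        matrices[p].add((node_id(s), node_id(o)))
    return dict(matrices), ids

def subdivide(cells, size):
    """Split a size x size matrix into four quadrants. Returns a
    4-element array; element i holds the cells of quadrant i in
    coordinates local to that quadrant."""
    half = size // 2
    quadrants = [[], [], [], []]
    for row, col in sorted(cells):
        index = (row >= half) * 2 + (col >= half)
        quadrants[index].append((row % half, col % half))
    return quadrants

matrices, ids = build_adjacency([
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "likes", "carol"),
])
knows_array = subdivide(matrices["knows"], 4)
```

Applying `subdivide` recursively to non-empty quadrants would give the tree-shaped variant in which array entries point to further sub-divisions.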
Abstract:
A source table organized into a set of micro-partitions is accessed by a network-based data warehouse. A pruning index is generated based on the source table. The pruning index comprises a set of filters that indicate locations of distinct values in each column of the source table. A query directed at the source table is received at the network-based data warehouse. The query is processed using the pruning index. The processing of the query comprises pruning the set of micro-partitions of the source table to scan for data matching the query, the pruning of the set of micro-partitions comprising identifying, using the pruning index, a sub-set of micro-partitions to scan for the data matching the query.
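The pruning step can be sketched as below. The assumption, stated plainly, is that an exact Python `set` per column stands in for whatever filter structure a real system would use (typically a probabilistic one), and that queries are simple equality predicates; the function names are hypothetical.

```python
def build_pruning_index(micro_partitions):
    """For each micro-partition, record the distinct values seen in each column."""
    index = []
    for rows in micro_partitions:
        filters = {}
        for row in rows:
            for column, value in row.items():
                filters.setdefault(column, set()).add(value)
        index.append(filters)
    return index

def partitions_to_scan(index, column, value):
    """Return the positions of micro-partitions that may contain column == value."""
    return [i for i, filters in enumerate(index) if value in filters.get(column, ())]

micro_partitions = [
    [{"city": "Oslo", "qty": 1}, {"city": "Rome", "qty": 2}],
    [{"city": "Lima", "qty": 3}],
    [{"city": "Rome", "qty": 4}],
]
index = build_pruning_index(micro_partitions)
to_scan = partitions_to_scan(index, "city", "Rome")
```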
Abstract:
A source table organized into a set of batch units is accessed. A set of N-grams is generated for a data value in the source table. The set of N-grams includes a first N-gram of a first length and a second N-gram of a second length, where the first N-gram corresponds to a prefix of the second N-gram. A set of fingerprints is generated for the data value based on the set of N-grams. The set of fingerprints includes a first fingerprint generated based on the first N-gram and a second fingerprint generated based on the second N-gram and the first fingerprint. A pruning index that indexes distinct values in each column of the source table is generated based on the set of fingerprints and stored in a database with an association with the source table.
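The prefix relationship between fingerprints can be illustrated with a running CRC-32, which lets the fingerprint of a longer N-gram be computed from the fingerprint of its prefix plus only the extra bytes. CRC-32 is a stand-in chosen for this sketch, not the hash described in the abstract; the function name is hypothetical, and N-grams are taken over UTF-8 bytes rather than characters.

```python
import zlib

def chained_fingerprints(value, start, lengths):
    """Fingerprint the N-grams of `value` that begin at byte offset `start`.

    Lengths are processed shortest first. Each fingerprint is derived from
    the previous (shorter) one and the bytes that extend it.
    Returns {length: fingerprint}.
    """
    data = value.encode("utf-8")
    fingerprints = {}
    previous_len, running = 0, 0
    for length in sorted(lengths):
        if start + length > len(data):
            break                              # N-gram would run past the value
        extra = data[start + previous_len : start + length]
        running = zlib.crc32(extra, running)   # extend the prefix fingerprint
        fingerprints[length] = running
        previous_len = length
    return fingerprints

fps = chained_fingerprints("database", 0, [3, 4])
# fps[4] was computed from fps[3] and the single extra byte b"a"
```

Because `zlib.crc32` accepts a starting value, the chained result equals the CRC-32 of the full N-gram, so lookups can hash a query N-gram directly.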
Abstract:
Provided are a data search method and apparatus, an electronic device and a storage medium. The method includes acquiring search data and a search condition and determining a target data set corresponding to the search data; determining each data distance between the search data and a respective query datum included in the target data set; performing data filtering on each data distance based on the search condition and writing each filtered data distance as a target data distance into a memory; and reading the target data distance stored in the memory, using a query datum corresponding to the target data distance as a target response datum of the search data and displaying the target response datum.
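The distance-then-filter flow can be sketched as follows. Several things are assumed for illustration: data are numeric vectors, the distance is Euclidean, the search condition is a maximum distance, and a plain dict plays the role of the memory that holds the filtered distances. The function name is hypothetical.

```python
import math

def search(search_vector, target_set, max_distance):
    """Return the keys of query data within `max_distance`, nearest first.

    target_set: dict mapping a key to a vector of the same dimension
                as `search_vector`.
    """
    # Distance between the search data and every query datum.
    distances = {key: math.dist(search_vector, vec) for key, vec in target_set.items()}
    # Keep only distances that satisfy the search condition ("memory").
    kept = {key: d for key, d in distances.items() if d <= max_distance}
    # Read the kept distances back and return the corresponding data.
    return sorted(kept, key=kept.get)

target_set = {"a": (0.0, 0.0), "b": (3.0, 4.0), "c": (10.0, 10.0)}
hits = search((0.0, 0.0), target_set, 6.0)
```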