Technique for fast join processing of dictionary encoded key columns in relational database systems

    Publication No.: US11288275B2

    Publication date: 2022-03-29

    Application No.: US17015421

    Filing date: 2020-09-09

    Abstract: For join acceleration, a computer stores local encoding dictionaries (EDs), including a build ED that contains a plurality of distinct build dictionary codes (BDCs) and a probe ED that contains a plurality of distinct probe dictionary codes (DCs) that is not identical to the plurality of distinct BDCs. Build data rows (DRs) that contain a build key that contains BDCs from the plurality of distinct BDCs are stored. Probe DRs that contain a probe key that contains probe DCs from the plurality of distinct probe DCs are stored. A request for a relational join of the build DRs with the probe DRs is received. The BDCs from the build key and the probe DCs from the probe key are transcoded to global DCs (GDCs) of a global ED. Based on the GDCs for the build key, a build array whose offsets are respective GDCs of the global ED is populated. Based on the GDCs for the probe key, offsets of the build array are accessed. A response to the request for the relational join, based on accessing the offsets of the build array, is sent.
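The steps in this abstract (transcode local codes to global codes, populate a build array indexed by global code, then probe it by offset) can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes each local dictionary is a list whose index is the local code, that rows are `(local_code, payload)` pairs, and that the global dictionary is built on the fly. All function names are hypothetical.

```python
def build_global_dictionary(build_dict, probe_dict):
    """Assign a dense global code to every distinct key value (assumed scheme)."""
    global_codes = {}
    for value in list(build_dict) + list(probe_dict):
        if value not in global_codes:
            global_codes[value] = len(global_codes)
    return global_codes


def transcode_table(local_dict, global_codes):
    """Translation table: position = local code, entry = global code."""
    return [global_codes[value] for value in local_dict]


def dictionary_join(build_dict, build_rows, probe_dict, probe_rows):
    """Join (local_code, payload) rows from both sides on the decoded key."""
    global_codes = build_global_dictionary(build_dict, probe_dict)
    build_to_global = transcode_table(build_dict, global_codes)
    probe_to_global = transcode_table(probe_dict, global_codes)

    # Build phase: the array offset is the global code, so no hashing is needed.
    build_array = [[] for _ in range(len(global_codes))]
    for local_code, payload in build_rows:
        build_array[build_to_global[local_code]].append(payload)

    # Probe phase: one array access per probe row.
    results = []
    for local_code, payload in probe_rows:
        for build_payload in build_array[probe_to_global[local_code]]:
            results.append((build_payload, payload))
    return results


# The two sides encode the same values with different local codes.
build_dict = ["red", "green", "blue"]
probe_dict = ["blue", "yellow", "red"]
build_rows = [(0, "b0"), (2, "b1"), (0, "b2")]
probe_rows = [(0, "p0"), (1, "p1"), (2, "p2")]
joined = dictionary_join(build_dict, build_rows, probe_dict, probe_rows)
```

Because the build array is addressed directly by global code, the probe never compares key values. The trade-off in this sketch is that the array needs one slot per global code, even for codes that no build row uses.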

    Deploying a vector index on multiple nodes of a cluster

    Publication No.: US20250094400A1

    Publication date: 2025-03-20

    Application No.: US18885640

    Filing date: 2024-09-14

    Abstract: Techniques for deploying a vector index on multiple nodes of a cluster are provided. In one technique, an instruction is received to create a vector index on a set of vectors that is stored in a vector database that is connected to the multiple nodes. In response, an HNSW index is created based on the set of vectors and the HNSW index is stored on each node. In response to receiving a vector query, a node processes the vector query against its copy of the HNSW index. In another technique, each node retrieves, from a vector database, a respective subset of a set of vectors and generates, based on the respective subset, a respective HNSW index. A vector query is transmitted to each node, which traverses its HNSW index to generate results of the vector query. The results from each node are combined to generate final results.
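The second technique in this abstract (each node indexes a subset of the vectors, every node answers the query, and the partial results are combined) can be sketched as follows. This is an illustration under stated assumptions, not the patented method. An exact linear scan stands in for each node's HNSW index, vectors are assigned to nodes round-robin, and distance is Euclidean. The names `ShardIndex`, `partition`, and `distributed_search` are hypothetical.

```python
import heapq
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class ShardIndex:
    """Stand-in for one node's local index (exact scan here, not HNSW)."""

    def __init__(self, vectors):
        self.vectors = dict(vectors)  # vector id -> vector

    def search(self, query, k):
        """Return up to k (distance, id) pairs, nearest first."""
        scored = ((euclidean(query, vec), vid) for vid, vec in self.vectors.items())
        return heapq.nsmallest(k, scored)


def partition(vectors, num_nodes):
    """Assign vectors to nodes round-robin (assumed partitioning scheme)."""
    shards = [{} for _ in range(num_nodes)]
    for i, (vid, vec) in enumerate(sorted(vectors.items())):
        shards[i % num_nodes][vid] = vec
    return [ShardIndex(shard) for shard in shards]


def distributed_search(nodes, query, k):
    """Send the query to every node, then merge the partial top-k lists."""
    partials = [node.search(query, k) for node in nodes]
    merged = heapq.nsmallest(k, (hit for part in partials for hit in part))
    return [vid for _, vid in merged]


vectors = {"a": (0, 0), "b": (1, 0), "c": (5, 5), "d": (0, 2), "e": (9, 9)}
nodes = partition(vectors, num_nodes=2)
nearest = distributed_search(nodes, query=(0, 0), k=3)
```

Each node returns its own top k, so the merge step only has to rank at most `k * num_nodes` candidates. With an approximate index such as HNSW in place of the exact scan, the merged result would be approximate as well.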

    Selective data mirroring for in-memory databases

    Publication No.: US10331572B2

    Publication date: 2019-06-25

    Application No.: US15979130

    Filing date: 2018-05-14

    Abstract: Techniques are provided for maintaining data persistently in one format, but making that data available to a database server in more than one format. For example, one of the formats in which the data is made available for query processing is based on the on-disk format, while another of the formats in which the data is made available for query processing is independent of the on-disk format. Data that is in the format that is independent of the disk format may be maintained exclusively in volatile memory to reduce the overhead associated with keeping the data in sync with the on-disk format copies of the data. Selection of data to be maintained in the volatile memory may be based on various factors. Once selected, the data may also be compressed to save space in the volatile memory. The compression level may depend on one or more factors that are evaluated for the selected data. The factors for the selection and compression level of data may be periodically evaluated, and based on the evaluation, the selected data may be removed from the volatile memory or its compression level changed accordingly.
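The selection and compression loop described in this abstract can be sketched in Python. The abstract does not name the factors it evaluates, so this sketch assumes a single one, a per-column access count, and maps it to a `zlib` compression level. The class name, thresholds, and row layout are all hypothetical. The on-disk rows stay in row format while the in-memory copy is columnar.

```python
import json
import zlib


class InMemoryMirror:
    """Columnar, compressed, memory-only copy of selected columns (a sketch)."""

    def __init__(self, hot_threshold=10, cold_threshold=1):
        self.hot_threshold = hot_threshold    # assumed: at or above this, favor speed
        self.cold_threshold = cold_threshold  # assumed: below this, evict the column
        self.units = {}                       # column -> (zlib level, compressed bytes)

    def choose_level(self, access_count):
        """Map the assumed factor (access count) to a zlib level, or None to evict."""
        if access_count < self.cold_threshold:
            return None
        return 1 if access_count >= self.hot_threshold else 9

    def evaluate(self, on_disk_rows, access_counts):
        """Periodic pass: populate, recompress, or evict each column."""
        for column, count in access_counts.items():
            level = self.choose_level(count)
            if level is None:
                self.units.pop(column, None)
                continue
            # Pivot the row-format data into one column vector, then compress it.
            values = [row[column] for row in on_disk_rows]
            blob = zlib.compress(json.dumps(values).encode("utf-8"), level)
            self.units[column] = (level, blob)

    def read_column(self, column):
        """Decompress and return the in-memory copy of a column."""
        _, blob = self.units[column]
        return json.loads(zlib.decompress(blob).decode("utf-8"))


rows = [{"id": 1, "city": "Oslo"}, {"id": 2, "city": "Rome"}]
mirror = InMemoryMirror()
mirror.evaluate(rows, {"id": 20, "city": 3})   # "id" is hot, "city" is warm
first_levels = {name: unit[0] for name, unit in mirror.units.items()}
mirror.evaluate(rows, {"id": 0, "city": 50})   # "id" goes cold, "city" becomes hot
```

Frequently read columns get the cheapest level here so that decompression cost stays low, while rarely read columns trade CPU for space. A real system would weigh more factors than this single counter.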

    Selective data compression for in-memory databases

    Publication No.: US09990308B2

    Publication date: 2018-06-05

    Application No.: US14841561

    Filing date: 2015-08-31

    Abstract: Techniques are provided for maintaining data persistently in one format, but making that data available to a database server in more than one format. For example, one of the formats in which the data is made available for query processing is based on the on-disk format, while another of the formats in which the data is made available for query processing is independent of the on-disk format. Data that is in the format that is independent of the disk format may be maintained exclusively in volatile memory to reduce the overhead associated with keeping the data in sync with the on-disk format copies of the data. Selection of data to be maintained in the volatile memory may be based on various factors. Once selected, the data may also be compressed to save space in the volatile memory. The compression level may depend on one or more factors that are evaluated for the selected data. The factors for the selection and compression level of data may be periodically evaluated, and based on the evaluation, the selected data may be removed from the volatile memory or its compression level changed accordingly.
