Selectivity estimation for query execution planning in a database

    公开(公告)号:US10162860B2

    公开(公告)日:2018-12-25

    申请号:US14517964

    申请日:2014-10-20

    Abstract: A computer-implemented method of estimating selectivity of a query may include generating, for data stored in a database in a memory, a one-dimensional value distribution for each of a plurality of attributes of the data. A multidimensional histogram may be generated, wherein the multidimensional histogram includes the one-dimensional value distributions for the plurality of attributes of the data. The multidimensional histogram may be converted to a one-dimensional histogram by assigning each bucket of the multidimensional histogram to corresponding buckets of the one-dimensional histogram and ordering the corresponding buckets according to a space-filling curve. One or more bucket ranges of the one-dimensional histogram may be determined by mapping the query conditions on the one-dimensional histogram. The selectivity of the query may be estimated by estimating how many data values in the one or more bucket ranges will meet the query conditions.

    Approximate string matching optimization for a database

    公开(公告)号:US10095808B2

    公开(公告)日:2018-10-09

    申请号:US15494874

    申请日:2017-04-24

    Abstract: Software for processing a database query that includes: (i) receiving a query of a database including a search value; (ii) determining a distance between the search value and at least one reference value; (iii) determining a maximum distance from the search value to be used in searching a plurality of datasets of the database, wherein the maximum distance from the search value defines a search range and is based, at least in part, on the determined distance between the search value and the at least one reference value; (iv) determining a subset of datasets from the plurality of datasets that includes datasets for which a data range with respect to each reference value overlaps with the search range; and (v) performing approximate string matching for the search value on the subset of datasets.

    Method for processing a database query

    公开(公告)号:US10698912B2

    公开(公告)日:2020-06-30

    申请号:US15941377

    申请日:2018-03-30

    Abstract: The invention relates to a computer-implemented method for processing a query in a database, the query comprising a search value. The database comprises a plurality of datasets the datasets comprising entries, wherein distance statistics are assigned to the datasets. The distance statistics describe the minimum and maximum distance between the values of the entries of a dataset of the plurality of datasets and a reference value. The method comprises determining the distance between the search value and the reference value, said determination resulting in a search distance, determining a subset of datasets from the plurality of datasets for which the search distance is within the limits given by the minimum and maximum distances described by the respective distance statistics, and searching for the search value in the subset of datasets.

    METHOD FOR PROCESSING A DATABASE QUERY
    7.
    发明申请

    公开(公告)号:US20180225338A1

    公开(公告)日:2018-08-09

    申请号:US15941377

    申请日:2018-03-30

    CPC classification number: G06F16/2462 G06F16/245 G06F16/951

    Abstract: The invention relates to a computer-implemented method for processing a query in a database, the query comprising a search value. The database comprises a plurality of datasets the datasets comprising entries, wherein distance statistics are assigned to the datasets. The distance statistics describe the minimum and maximum distance between the values of the entries of a dataset of the plurality of datasets and a reference value. The method comprises determining the distance between the search value and the reference value, said determination resulting in a search distance, determining a subset of datasets from the plurality of datasets for which the search distance is within the limits given by the minimum and maximum distances described by the respective distance statistics, and searching for the search value in the subset of datasets.

Patent Agency Ranking