-
Publication number: US09870398B1
Publication date: 2018-01-16
Application number: US13936840
Filing date: 2013-07-08
Applicant: Teradata US, Inc.
Inventor: Sung Jin Kim, Rama Krishna Korlapati
IPC: G06F17/30
CPC classification number: G06F17/30469
Abstract: A database system may include a storage device configured to store a plurality of database tables. The database system may further include a processor in communication with the storage device. The processor may determine a first sampling percentage to be used on a column of a database table. The first sampling percentage may be based on a respective frequency of each column value in the column. The processor may determine a second sampling percentage to be used on the column in generating a plan to respond to a database query. The second sampling percentage may be based on the size of the database table. The processor may select the maximum of the first sampling percentage and the second sampling percentage. The selected sampling percentage may be used to collect statistics on the column. The collected statistics may be used to generate at least one database query response plan associated with the column. A method and computer-readable medium may also be implemented.
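Illustrative sketch: the selection step in this abstract (take the larger of a frequency-based and a size-based sampling percentage) might look like the Python below. Only the final `max` comes from the abstract. The abstract gives no formulas, so both helper functions, the `floor` and `target_rows` parameters, and the heuristics inside them are hypothetical placeholders, not the patented method.

```python
from collections import Counter


def frequency_based_percentage(column_values, floor=1.0):
    """Hypothetical heuristic: the more rows hold a value seen only once,
    the higher the percentage, clamped to [floor, 100]."""
    counts = Counter(column_values)          # frequency of each column value
    total = sum(counts.values())
    if total == 0:
        return 100.0
    singletons = sum(1 for c in counts.values() if c == 1)
    return min(100.0, max(floor, 100.0 * singletons / total))


def size_based_percentage(row_count, target_rows=1000):
    """Hypothetical heuristic: aim for roughly target_rows sampled rows,
    so larger tables get a smaller percentage."""
    if row_count <= 0:
        return 100.0
    return min(100.0, 100.0 * target_rows / row_count)


def choose_sampling_percentage(column_values, row_count):
    """Select the maximum of the two percentages, as the abstract describes."""
    first = frequency_based_percentage(column_values)
    second = size_based_percentage(row_count)
    return max(first, second)
```

Taking the maximum means the more conservative (larger) sample wins whenever the two criteria disagree.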
-
Publication number: US09753978B2
Publication date: 2017-09-05
Application number: US15366523
Filing date: 2016-12-01
Applicant: International Business Machines Corporation
Inventor: Shuo Li, Meng Wan, Xiaobo Wang, Xin Ying Yang
IPC: G06F17/30
CPC classification number: G06F17/30474, G06F17/30451, G06F17/30457, G06F17/3046, G06F17/30469, G06F17/30492
Abstract: A tool for combining common processes shared by at least two or more sub-queries within a query is provided. The tool determines the query with the at least two or more sub-queries. The tool determines whether one or more subset relationships are shared between the at least two or more sub-queries. Responsive to a determination that one or more subset relationships are shared between the at least two or more sub-queries, the tool determines an order class for the at least two or more sub-queries based on the one or more subset relationships. The tool determines an access path for the query. The tool executes the access path during run-time for data accessing.
-
Publication number: US09678999B1
Publication date: 2017-06-13
Application number: US14588109
Filing date: 2014-12-31
Applicant: Teradata US, Inc.
Inventor: Michael A. Gibas, Hien T. To
IPC: G06F17/30
CPC classification number: G06F17/30469
Abstract: Based on a request, a processor may identify a multi-dimensional dataset stored in at least one of a plurality of data tables and identify each dimension of the multi-dimensional dataset. For each respective identified dimension, the processor may sort the identified dataset on values of the respective identified dimension, partition the sorted dataset into a predetermined number of intervals associated with the respective identified dimension, determine a number of rows for each interval and select a lower boundary value and an upper boundary value for each interval. The upper boundary value may be the highest value in each interval. The lower boundary value for an interval having lowest sorted values may be the lowest value in the interval or the upper boundary value of an interval having immediately preceding partitioned values. The processor may further store the boundary values and rows for each interval of each identified dimension as a histogram. A method and computer-readable medium are also disclosed.
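Illustrative sketch: the per-dimension steps in this abstract (sort, partition into a fixed number of intervals, count rows, pick boundaries) could be written roughly as follows. Several points are assumptions rather than statements from the patent: rows are modeled as Python dicts, intervals are split to hold near-equal row counts, and the lower boundary rule is read as "lowest value for the first interval, previous interval's upper boundary otherwise". The function names are hypothetical.

```python
def build_histogram(rows, dimension, num_intervals):
    """Build one histogram for a single dimension of the dataset."""
    values = sorted(row[dimension] for row in rows)   # sort on this dimension
    n = len(values)
    histogram = []
    prev_upper = None
    for i in range(num_intervals):
        # Assumed partitioning rule: near-equal row counts per interval.
        chunk = values[i * n // num_intervals:(i + 1) * n // num_intervals]
        if not chunk:
            continue                                  # fewer rows than intervals
        upper = chunk[-1]                             # highest value in the interval
        lower = chunk[0] if prev_upper is None else prev_upper
        histogram.append({"lower": lower, "upper": upper, "rows": len(chunk)})
        prev_upper = upper
    return histogram


def build_histograms(rows, dimensions, num_intervals):
    """Repeat the procedure for each identified dimension."""
    return {d: build_histogram(rows, d, num_intervals) for d in dimensions}
```

For example, `build_histograms(rows, ["x", "y"], 4)` returns one list of interval records per dimension, each record holding the two boundary values and the row count.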
-
Publication number: US20170124134A1
Publication date: 2017-05-04
Application number: US14930736
Filing date: 2015-11-03
Applicant: International Business Machines Corporation
Inventor: Ke Wei Wei, Maryela E. Weihrauch, Hao Wu, Xin Ying Yang, Miao Zheng
IPC: G06F17/30
CPC classification number: G06F17/30345, G06F17/30153, G06F17/30469, G06F17/30699, G06F17/30914
Abstract: A computer maps a literal in a database query to a digital representation, wherein the database query comprises a predicate, the literal is a part of the predicate, and the digital representation is predetermined based at least in part on external statistical data. The computer estimates a filter factor for the predicate based at least in part on the digital representation and compressed statistical data, wherein the compressed statistical data are prepared at least in part from the external statistical data.
-
Publication number: US20170083548A1
Publication date: 2017-03-23
Application number: US14856719
Filing date: 2015-09-17
Applicant: International Business Machines Corporation
Inventor: Ting Xu Guan, Shuo Li, Ping Liang, Ke Wei Wei, Xin Ying Yang
IPC: G06F17/30
CPC classification number: G06F17/30368, G06F17/30377, G06F17/30463, G06F17/30469, G06F17/30474
Abstract: A computer-implemented method includes identifying one or more database modification statements and identifying one or more operational unit indicators. The one or more operational unit indicators are caused to be generated by the one or more database modification statements. An anticipated operational size is determined. The anticipated operational size is an estimated total number of the one or more operational unit indicators. An anticipated operational throughput rate is determined. The anticipated operational throughput rate is a rate at which the operational unit indicators are expected to be generated. An anticipated total execution time of the one or more database modification statements is determined based on the anticipated operational size and the anticipated operational throughput rate. A corresponding computer program product and computer system are also disclosed.
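Illustrative sketch: the abstract says only that the total execution time is determined "based on" the anticipated size and throughput rate. The simplest reading, assumed here and not confirmed by the abstract, is size divided by rate. The function name and the choice of seconds as the time unit are hypothetical.

```python
def anticipated_execution_time(operational_size, throughput_rate):
    """Estimate total execution time in seconds.

    operational_size: estimated total number of operational unit indicators
    throughput_rate:  indicators expected to be generated per second
    Assumption: time = size / rate; the abstract does not state the formula.
    """
    if throughput_rate <= 0:
        raise ValueError("throughput_rate must be positive")
    return operational_size / throughput_rate
```

Under this reading, 120,000 expected indicators at 4,000 indicators per second gives `anticipated_execution_time(120_000, 4_000)`, i.e. 30.0 seconds.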
-
Publication number: US20170060946A1
Publication date: 2017-03-02
Application number: US14837019
Filing date: 2015-08-27
Applicant: International Business Machines Corporation
Inventor: Shuo Li, Heng Liu, Ke Wei Wei, Xin Ying Yang
IPC: G06F17/30
CPC classification number: G06F17/30463, G06F17/30469, G06F17/30483, G06F17/30554
Abstract: A computer-implemented method includes identifying a query, including one or more predicates and one or more branches, wherein the one or more branches include one or more legs. The computer-implemented method further includes, for each branch, in parallel: determining a risk; determining a return row threshold; estimating a number of return rows; and terminating access if the return rows exceed the threshold. The computer-implemented method further includes, for each leg, in parallel: determining a leg return row threshold; accessing the leg; fetching one or more return rows into one or more leg return row pages; terminating access if the return rows exceed the threshold; intersecting one or more leg return row pages into one or more intersected leg return row pages; and applying the one or more predicates to the one or more intersected leg return row pages. The method may be embodied in a corresponding computer system or computer program product.
-
Publication number: US09569496B1
Publication date: 2017-02-14
Application number: US15067560
Filing date: 2016-03-11
Applicant: International Business Machines Corporation
Inventor: Shuo Li, Meng Wan, Xiaobo Wang, Xin Ying Yang
IPC: G06F17/30
CPC classification number: G06F17/30474, G06F17/30451, G06F17/30457, G06F17/3046, G06F17/30469, G06F17/30492
Abstract: A tool for combining common processes shared by at least two or more sub-queries within a query is provided. The tool determines the query with the at least two or more sub-queries. The tool determines whether one or more subset relationships are shared between the at least two or more sub-queries. Responsive to a determination that one or more subset relationships are shared between the at least two or more sub-queries, the tool determines an order class for the at least two or more sub-queries based on the one or more subset relationships. The tool determines an access path for the query. The tool executes the access path during run-time for data accessing.
-
Publication number: US20170017691A1
Publication date: 2017-01-19
Application number: US15210094
Filing date: 2016-07-14
Applicant: International Business Machines Corporation
Inventor: Hao Feng, Shuo Li, ShengYan Sun, Xin Ying Yang
IPC: G06F17/30
CPC classification number: G06F17/30469, G06F17/30448, G06F17/30474
Abstract: In an approach for calculating one or more access paths during bind time, a computer receives a query. The computer identifies one or more access paths for processing the received query, wherein the one or more access paths include steps associated with retrieving data from a database based on the received query. The computer calculates resource costs associated with processing the received query on the one or more identified access paths based on one or more of: resources utilized to perform steps associated with processing the received query, and system statistics associated with the one or more identified access paths.
-
Publication number: US20160371332A1
Publication date: 2016-12-22
Application number: US15046065
Filing date: 2016-02-17
Applicant: International Business Machines Corporation
Inventor: Shuo Li, Ping Liang, Ke Wei Wei, Xin Ying Yang
IPC: G06F17/30
CPC classification number: G06F17/30442, G06F17/30339, G06F17/30469, G06F17/30486, G06F17/30584
Abstract: In an approach to determining an access method for a partition in a partition table, a computer receives a query and determines if there is a partition table utilized by the query. When there is a partition table utilized by the query, then the computer determines that a partition in the partition table meets the plurality of conditions of the query. The computer collects a plurality of partition level statistics for the partition that meets the plurality of conditions of the query. Additionally, the computer determines, based, at least in part, on the plurality of partition level statistics, a cost for one or more access methods for the partition that meets the plurality of conditions of the query. Furthermore, the computer determines, based, at least in part, on the cost for each access method, an access method for the partition that meets the plurality of conditions of the query.
-
Publication number: US20160306850A1
Publication date: 2016-10-20
Application number: US15196237
Filing date: 2016-06-29
Applicant: International Business Machines Corporation
Inventor: Mihaela A. Bornea, Julian Dolby, Achille B. Fokoue-Nkoutche, Anastasios Kementsietsidis, Kavitha Srinivas
IPC: G06F17/30
CPC classification number: G06F17/30469, G06F17/30442, G06F17/30477, G06F17/3053, G06F17/30935, G06F17/30958
Abstract: Systems and methods for optimizing a query, and more particularly, systems and methods for finding optimal plans for graph queries by casting the task of finding the optimal plan as an integer linear programming (ILP) problem. A method for optimizing a query comprises building a data structure for a query, the data structure including a plurality of components, wherein each of the plurality of components corresponds to at least one graph pattern, determining a plurality of flows of query variables between the plurality of components, and determining a combination of the plurality of flows between the plurality of components that results in a minimum cost to execute the query.
-