Wednesday, November 29, 2017

Adaptive Query Processing and Automatic Tuning - Adaptive Joins - 2

Continuing from the introduction in the earlier post of this series, let’s look at one aspect of Adaptive Query Processing: Adaptive Joins.

Before we get to Adaptive Joins, recall that SQL Server's query optimizer chooses among 3 physical join types.

Nested Loop Join: Usually picked by the optimizer when the tables participating in the join are small, or when one of the tables is big and indexed on the joining column
Hash Join: One of the tables is big but doesn't have an index on the joining column
Merge Join: Both tables are big, and the joining column is indexed on both

So the size of the tables, or more precisely the number of rows participating in the join, plays a major role in which join type SQL Server's query optimizer picks. Suppose SQL Server estimates at compile time that only a few rows will participate in the join and picks a Nested Loop Join, but at run time the join processes a much larger number of rows. The chosen plan then turns out to be inefficient.

SQL Server 2017's adaptive join addresses this problem. A plan with an Adaptive Join carries an attribute called "Adaptive Row Threshold" (displayed in the plan properties as "Adaptive Threshold Rows"), which is a row count threshold computed by SQL Server. The "Adaptive Row Threshold" lets SQL Server pick the join algorithm dynamically at run time. In short, an adaptive join works in the following way:

  • If the number of rows participating in the join is greater than or equal to the "Adaptive Row Threshold", the "Hash Join" operator is used for the join operation
  • If the number of rows participating in the join is less than the "Adaptive Row Threshold", the "Nested Loop Join" operator is used for the join operation

Consider the following query:

SELECT  [fo].[Order Key], [si].[Lead Time Days], [fo].[Quantity]
FROM    [Fact].[Order] AS [fo]
INNER JOIN [Dimension].[Stock Item] AS [si]
       ON [fo].[Stock Item Key] = [si].[Stock Item Key]
WHERE   [fo].[Quantity] = 360;

Observe the picture below:

The query optimizer sets the adaptive join threshold at 68 rows.
The estimated number of rows for the outer table in the join was 213. Because 213 is greater than 68, the estimated join type is Hash Match. At run time, 206 rows from the outer table participated in the join, and as that is still greater than the adaptive join threshold of 68, the Hash Match join was used.
Here is another example, in which the Adaptive Join switches the join type at run time. Refer to the picture below:

In this case, the threshold was 46 rows and the estimate was 87 rows, so the estimated join type was "Hash Join". However, the run-time row count was just 36 rows, which is below the threshold, so the join type switched to Nested Loop Join.
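The two examples above can be replayed with a short sketch of the threshold comparison. This is only a simplified model of the decision, not SQL Server's actual implementation: the function name `choose_join` is hypothetical, and the sketch assumes that a row count exactly equal to the threshold stays with the hash join.

```python
def choose_join(row_count: int, threshold: int) -> str:
    """Return the join type a simplified adaptive join would pick.

    Assumption: rows at or above the threshold use the hash join,
    fewer rows use nested loops.
    """
    return "Hash Match" if row_count >= threshold else "Nested Loops"


# (threshold, estimated rows, actual rows) taken from the two examples above
examples = [
    (68, 213, 206),
    (46, 87, 36),
]

for threshold, estimated, actual in examples:
    estimated_join = choose_join(estimated, threshold)  # decided at compile time
    actual_join = choose_join(actual, threshold)        # decided at run time
    switched = estimated_join != actual_join
    print(
        f"threshold={threshold} estimated={estimated} actual={actual} "
        f"estimated_join={estimated_join} actual_join={actual_join} "
        f"switched={switched}"
    )
```

In the first case the estimated and actual join types agree, so nothing changes at run time. In the second case the actual row count falls below the threshold, so the sketch reports a switch from the estimated hash join to nested loops.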
The ability to change the join operator at run time lets SQL Server correct many join choices that were based on inaccurate row estimates.
