Sunday, November 12, 2017

Adaptive Query Processing and Automatic Tuning - Introduction - 1


SQL Server 2017 and Azure SQL Database are being touted as the self-learning, artificially intelligent database. One of the main reasons for this is the release of the following features:

 1) Adaptive Query Processing

2) Automatic Tuning
Adaptive Query Processing is SQL Server's first artificially intelligent attempt at fixing poorly performing queries caused by incorrect cardinality estimates. The query optimizer produces better plans using "Adaptive Query Processing", which is enabled by default when the database is on SQL Server 2017's compatibility level 140 (CE 140). Queries can benefit from Adaptive Query Processing in the following scenarios.

1) Adaptive Join -> The query optimizer makes an intelligent choice between Nested Loops and Hash Join at runtime

2) Memory Grant Feedback -> The query optimizer adjusts the memory granted to a query based on the previous run's execution results

3) Interleaved Execution -> A portion of the query is executed while generating the query plan, to come up with better estimates. Currently applies only to multi-statement table-valued functions

 The above scenarios and operators are likely to expand in upcoming releases or patches

The other aspect of auto-driven query tuning is the "Automatic Tuning" feature, which helps in the following scenarios:

1) Force Last Good Plan: Attempts to fix query regression by detecting a sudden change in query performance due to a plan change and forcing the last known good plan. One can turn it on using the command below
ALTER DATABASE DBName SET AUTOMATIC_TUNING ( FORCE_LAST_GOOD_PLAN = ON );
2) Automatic index creation and deletion: Available only in Azure SQL Database. By studying the workload, the database engine automatically creates and deletes indexes!
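As a sketch, the current automatic tuning state and any tuning recommendations the engine has detected can be inspected with the queries below (DMV names as in SQL Server 2017 / Azure SQL Database; run in the context of the database of interest):

```sql
-- Check which automatic tuning options are in effect for this database
SELECT name, desired_state_desc, actual_state_desc, reason_desc
FROM sys.database_automatic_tuning_options;

-- Review plan-regression findings detected by the engine
SELECT reason, score, details
FROM sys.dm_db_tuning_recommendations;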
This is just an introduction post. As usual, I will be writing a series of posts to cover the individual sections in detail.

Friday, October 27, 2017

SQL Azure Databases - Active Secondary - Failover Groups - Part 4

150th blog post! So, the post includes a video demo :)

Previous posts in this series were about using "Geo Replication" to configure an active secondary. However, it comes with a few limitations:

* No transparent failover capability like an Always On listener

* No availability groups or group failover possible

So, Azure Failover Groups provide "Read Only" / "Read Write" connection URLs which allow applications transparent failover. Read-only and read-write workloads get different connection strings and, upon failover, applications connect to the appropriate node automatically

Also, one can combine a group of databases as a unit of failover and make them fail over together, as in Always On availability groups.
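For example, an application can point its connection string at the failover group's URLs rather than at an individual server; the group name "myfog" below is a placeholder, and read-only workloads add ApplicationIntent=ReadOnly so they are routed to the secondary:

```
Read-write workloads (failover group name "myfog" is illustrative):
Server=tcp:myfog.database.windows.net,1433;Database=db1;

Read-only workloads, routed to the secondary:
Server=tcp:myfog.secondary.database.windows.net,1433;Database=db1;ApplicationIntent=ReadOnly;
```

Because these URLs always resolve to the current primary (or secondary), the application does not need to change its connection string after a failover.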

Step-by-step instructions on configuring failover groups are provided below

* Configure Primary Server, Secondary Server and Geo Replication as explained in http://strictlysql.blogspot.sg/2017/05/sql-azure-databases-active-secondary.html

* Click on the primary server ( not database ) and select failover groups


* Click on "Add group" and:

          * Provide the failover group a name
          * Pick the existing database "db1" to be added to the failover group
          * Pick the existing secondary server
          * Click on create


* Once done, the failover group is listed as shown below


* Click on the failover group; the Read Only and Read Write URLs can be seen as shown below


* To perform a failover, click on the failover icon. Post failover, the failover group looks like the picture below


To make it easier to understand, I have recorded a video of the demo. Please take a look. Happy learning!



Wednesday, September 6, 2017

Dynamic Threshold for Statistic Update - SQL Server 2016

By default, on a SQL Server database, statistics are updated automatically when the number of changes hits a threshold. With the Auto Update Statistics setting turned on (which it is by default), the threshold of changes that triggered an auto statistics update before SQL Server 2016 was

For tables with more than 500 rows - 20% of the table's row count + 500 changes
For tables with 500 rows or fewer - 500 changes

However, in modern times, the above formula may not be effective. Tables with 10 million rows are pretty much the norm and not typically considered huge. The threshold of changes for a 10-million-row table would be 2,000,500 changes, which is a huge number of changes to require before triggering an auto stats update. This effectively implies that on larger tables one would be running without updated statistics almost all the time, as the threshold grows too high as tables get bigger. To address this problem, SQL Server 2016 has a behavior called dynamic statistics threshold, which is enabled by default.


As the table gets bigger, the percentage threshold for a statistics update reduces. Dynamic statistics thresholds apply to tables with more than 25,000 rows. As an example, consider the following table with 1 million rows

Let’s update statistics before starting the task



Make note of the statistics update time (12:12 AM)
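The last statistics update time and the number of modifications since then can also be checked from T-SQL rather than the UI (the table name matches the demo table used in this post):

```sql
-- Last update time and modification counter for each statistic on the table
SELECT s.name AS stat_name,
       STATS_DATE(s.object_id, s.stats_id) AS last_updated,
       sp.rows,
       sp.modification_counter
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE s.object_id = OBJECT_ID('Production.product_1M');
```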

Now let’s run the following script to make 100,000 changes (10%)

/* SET ROWCOUNT so that we update just 100K rows */
SET ROWCOUNT 100000
GO

UPDATE Production.product_1M
SET StandardCost = StandardCost + 5
WHERE ProductID % 5 = 0
GO

Please note that on older versions of SQL Server, by default, 10% of changes wouldn't suffice to trigger an auto stats update.


Now let’s check if the statistics are updated. A SELECT query has to be fired on the table to trigger the auto statistics update.


Surprisingly, they are updated, as the statistics update time has changed to 12:15 AM.

Though there is no official exact formula for when the statistics update is triggered, the following picture shows how the threshold percentage reduces as the table's row count grows.


Picture referred from the following blog: https://blogs.msdn.microsoft.com/saponsqlserver/2011/09/07/changes-to-automatic-update-statistics-in-sql-server-traceflag-2371/
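Though Microsoft has not published an exact formula, the community-observed approximation (from the trace flag 2371 discussions) is that the dynamic threshold is roughly the smaller of the classic formula and the square root of 1000 × row count. A quick sketch of that approximation, assuming the sqrt rule (it is not official):

```python
import math

def auto_update_threshold(rows, dynamic=True):
    """Approximate number of changes needed to trigger an auto stats update."""
    if rows < 500:
        return 500
    classic = 500 + 0.20 * rows          # pre-2016 default formula
    if not dynamic:
        return classic
    # Community-observed approximation of the dynamic threshold
    return min(classic, math.sqrt(1000 * rows))

# For the 1,000,000-row demo table:
print(round(auto_update_threshold(1_000_000)))         # ~31623 changes (about 3.2% of rows)
print(round(auto_update_threshold(1_000_000, False)))  # 200500 changes (20% + 500)
```

This matches the demo above: 100,000 changes (10%) is well over the ~31,623-change dynamic threshold for a 1-million-row table, but well under the classic 200,500-change threshold.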

The dynamic statistics update threshold is not a new feature; it has been available since SQL Server 2008 via trace flag 2371. In other words, one can enable the dynamic threshold on SQL Server 2008 and later by turning trace flag 2371 on. On SQL Server 2016 and later, it is simply the default behavior.
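On those older versions, the trace flag can be turned on globally as sketched below (it can also be set as the -T2371 startup parameter so it survives restarts):

```sql
-- Enable dynamic statistics thresholds globally (SQL Server 2008 - 2014)
DBCC TRACEON (2371, -1);

-- Verify the trace flag status
DBCC TRACESTATUS (2371);
```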

Sunday, September 3, 2017

DBCC CloneDatabase for SQL Server 2012

Dear All,

In case you are wondering, the title of the post is not a typo :) DBCC CloneDatabase is going to be available for SQL Server 2012 :)

For folks who don't know, DBCC CloneDatabase is a feature introduced in SQL Server 2014 SP2 for creating a skeleton copy of a database. By skeleton copy, I mean a copy without the data but with the exact schema - tables, indexes, constraints, files and filegroups - and, best of all, table and index statistics. Using DBCC CloneDatabase, one can get an exact copy of the database (including actual statistics) but without the data. This is extremely useful in performance troubleshooting, as one can clone a production database using DBCC CloneDatabase, back up the cloned database and restore it to DEV / UAT to get estimated plans. From SQL Server 2016 onwards, Query Store data and reports are also included in the cloned database. One can read more about DBCC CloneDatabase over here
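A minimal sketch of the workflow (database names and backup path are illustrative):

```sql
-- Create a schema-and-statistics-only clone of the source database
DBCC CLONEDATABASE (AdventureWorks2012, AdventureWorks2012_Clone);

-- The clone is created read-only; back it up to move it to DEV / UAT
BACKUP DATABASE AdventureWorks2012_Clone
TO DISK = 'C:\Backups\AdventureWorks2012_Clone.bak';
```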

Coming back to the title of the post, the Microsoft product team will be extending DBCC CloneDatabase to SQL Server 2012 as part of SQL Server 2012 Service Pack 4. SQL Server 2012 Service Pack 4 (SP4) is due to be released in September 2017. An excellent move by the product team to extend DBCC CloneDatabase to SQL Server 2012. Check out the official announcement of SP4 over here. More importantly, observe the comments section :)

Monday, August 28, 2017

Data Channel Interview - 08 - Leila Etaati on Machine Learning using Microsoft BI

Dear All,

Very happy to share the 8th interview of #DataChannel. Dr. Leila Etaati, MVP and MCT from New Zealand, discusses Machine Learning using Microsoft Business Intelligence tools in this interview. Leila gives us an insight into the Machine Learning process and explains how Azure ML, Power BI, R and SQL Server can be used in it. Thanks, Leila, for the informative interview!




Friday, June 9, 2017

SQL Azure Databases - Active Secondary - Failover - Part 3

Continuing on the series on SQL Azure Databases - Active Secondary, this post will explore the task of failing over to secondary.

A fairly straightforward task; one just needs to take care of a few items.

Step 1: Log in to portal.azure.com and click on Databases. The primary and secondary databases are listed



Step 2: Click on the primary and then click on "Geo Replication"



Step 3: Scroll down and click on the secondary as shown below


Step 4: Click on Failover as shown below


Step 5: Click "Ok" to confirm Failover



Step 6: The dotted lines between primary and secondary, showing the failover progress, turn into solid lines upon completion of the failover


Step 7: After failover, the replication role changes confirming the successful failover as shown below



Step 8: The secondary server accepts write operations as shown below


A few important things to know

This failover is like a database mirroring failover. This means applications will need to change their connection strings to the secondary server upon failover. Unlike Always On Availability Groups, the Azure geo-replication explained above doesn't support transparent application failover.

However, Azure Failover Groups (currently in preview) allow transparent failover for applications. More about that in the next post :)

Saturday, May 27, 2017

SQL Azure Databases - Active Secondary - How to connect? - Part 2

As a continuation of the "SQL Azure Databases - Active Secondary" series started earlier, this post will cover how to connect to an Azure SQL Database primary and secondary after configuring "Geo Replication".

To some, it may sound like "what is the big deal in connecting, after configuring it?". If you are new to Azure, it can take a while to figure out how to connect, and at times it can turn out to be one of those annoyingly simple things that refuse to come easily :)

As described in the earlier post, we have configured the following

* A database called "rajprimary" on the "rajprimarysvr" server
* A geo-replicated copy of "rajprimary" on the "rajsecondary" server


To connect to the primary or the secondary server, follow the steps provided below

Configure the firewall:


* Login to your portal.azure.com

* Click on SQL Databases on the left 





* Click on the primary / secondary database (as shown under replication role), depending on the one you would like to connect to


* Click on "Set firewall" as shown below


* Firewall settings are turned on, and only the IP addresses listed are allowed access. Click on "Add client IP" to allow the machine you are using to connect to the Azure database.



* You may add other client IP addresses too, if they need to connect to the database. Click on Save after adding.
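As an alternative to the portal, server-level firewall rules can be managed from T-SQL while connected to the server's master database; the rule name and IP addresses below are placeholders:

```sql
-- Allow a single client IP (run in the master database of the Azure SQL server)
EXECUTE sp_set_firewall_rule
    @name = N'MyLaptop',
    @start_ip_address = '203.0.113.10',
    @end_ip_address   = '203.0.113.10';

-- List the current server-level firewall rules
SELECT name, start_ip_address, end_ip_address
FROM sys.firewall_rules;
```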



Connect to the database

* Start SQL Server Management Studio on the machine you are logged on to, and connect to "servername.database.windows.net". In this case it would be rajprimarysvr.database.windows.net

 If you are not sure of your database server name, you can look it up by clicking on "SQL databases" -> "databasename" -> "Properties"

 * Provide the user account and password used while setting up the server

Upon connecting, one will notice that only database-related components are visible. Server-level components like SQL Agent and server error logs are not seen, as we are using the "SQL Database" service on Azure.


The same steps can be repeated for the secondary server too. More on Azure geo-replication in upcoming posts.