The SQL Performance Needle in a Haystack

Technical communities are thriving and the knowledge they provide to professionals is immense.  In the SQL Server community especially, countless blog sites, forum posts, tweets, webinars and white papers exist that help DBAs solve tough problems every day and we are all fortunate to have such information available to us with a simple Web search. Sometimes, though, problems arise that are either too uncommon or generic enough to elude even the most tenacious researcher.  You may find a partial resolution or workaround but these bring with them their own caveats and open up more potential issues. Many times these problems surface at the worst times like right before a planned migration with a pre-determined deadline or during the middle of an end of quarter reporting cycle with the business standing by while the “All Hands On Deck” siren is sounded.

Cardinality Estimator can effect Performance     

In a recent situation, just like the one above, a customer was having an issue with the new Cardinality Estimator in SQL Server 2016 where some critical queries started to sporadically run slowly.  It actually took some time to even determine that the Cardinality Estimator was at the heart of the issue as they were sure it was a bottleneck with resources that was causing queries to run slower and slower.  They added more memory and CPU, checked IO latency, changed Max Degree of Parallelism and other configuration settings trying to resolve the problem, to no avail. The environment just seemed unstable from one day to the next. Once they had help determining the issue, finding a permanent fix was even more elusive.  The options on the table were to use the previous Cardinality Estimator for every piece of code was not acceptable because most of the code was running well.   Adding a query hint was an option but the hint that was well documented online by Microsoft came with another caveat and that was the developer had to have administrative level privileges to implement it.  That was not going to work.   Ultimately the solution was found online buried in an inconspicuous little KB article that did not include any significant keywords to bring this to the surface. That simple solution found at was to add the following code to individual queries.


Sometimes a simple workaround is available but finding it may take a whole team of very smart people to find it because it is not well documented or the root of the problem was not really understood to begin with.   This is ultimately the good and bad of working through such problems. On the one hand it was painful to go through, but having gone through it, you are more inclined to know where to, or maybe more importantly “how to” find solutions to tough problems in the future.