Right-Sizing for the Cloud

The Problem with Overprovisioning in the Cloud

When it is time to purchase new resources for one or more applications, the non-scientific approach of “guesstimating” the requirements can be expensive whether for on-premises capital expenditures or Cloud-based services.  For stakeholders deciding on how to provision the data services, is it better to start small and grow the resources as needed? Certainly the elasticity of modern Cloud services like Azure SQL Databases makes growing into the application an easy decision.  You can easily expand resources or move to higher tier levels with a single mouse click, or even automatically as the workload increases.  But for the SQL applications that are destined for residency on a Virtual Machine (VM), it is not straightforward.  Over-provisioning RAM and CPUs is costly, but it is often the first choice because it makes the business owners happy to believe that with more hardware their applications will be faster.  Unfortunately, this approach often leads to a lot of very big SQL Servers doing very little work.  In time when things start slowing down, which they will as data piles on, the first inclination is to add even more costly resources.

Reducing Your Footprint

The best approach to “right-sizing” SQL Server, especially as you are moving from an on-premises server to a VM in a Cloud service is not by guesstimating, but interrogating.  Knowing the workload on a SQL Server instance does not just mean knowing how much CPU or RAM it is using, or even how much physical space is required to house the databases.  It means knowing the queries that are running on your server, what they are waiting on, and understanding how to address any issues in the queries themselves.  In other words, can that same query run faster on a smaller server?  By focusing your efforts on poor performing code, fewer resources will be required. Generally, this is a shared responsibility of the DBA and developer.

Who has the time?

Operational or “administrative” DBAs do not always have the time to devote to the tasks they need to perform to reduce an application’s resource needs. This is partly because they are busy fighting fires, keeping other applications up and running, safely backed up and secure. That is their primary job. When it comes time to discuss migrating the application to a Cloud service or consolidating to save on licensing costs, the effort of determining what resources will be needed often becomes another ‘guesstimate” based on RAM and Memory consumption.  There is just not enough time to gather the numbers.

What information should be gathered and analyzed?

There are specific steps that should be taken when analyzing a workload with the goal of making the application perform better while at the same time reducing resource consumption.
  • Establish a baseline. There are several tools on the market that will make this easier, but the best and free way is to turn on performance monitor on the server and grab the relevant counters.
  • Gather metrics from the server itself. SQL Server has an incredibly vast amount of data that it stores about query execution. By using Dynamic Management Views (DMVs) you can quickly identify the worst performing queries. This information will include wait statistics.
  • Determine if there are missing, unused or duplicate indexes. This information, too, can easily be extracted from the server itself.
  • Monitor for blocking. Often application issues arise from inefficient use of locks and transactions.
  • Once you have identified the worst performing code that is taking the most resources, start looking at the code for tuning opportunities. This is the most time-consuming part of the analysis but the one that will yield the best results.
  • After the new code is applied, continue to review the baseline and query performance.
  • Repeat

Focus on Low Hanging Fruit First

It is not uncommon when the above tasks are performed that the worst performing queries will be the easiest to rewrite or address with missing indexes and in turn will have the most impact on performance. This is the low hanging fruit. You may find, for example, that a single CPU-intensive query is running 250 times a day and taking over a minute for each run. And it is a stored procedure for a report! Your goal is to ensure you are not paying for resources you do not need. Conversely, you may need more resources than you are currently paying for.  It is helpful to know where to look to get the right information when you are right-sizing.

SQLWatchmen Can Help

If you don’t have staff with the right skills to do this, or if they have constraints on their time, then contact us to see if we can help you in your situation. We have helped many people figure out what sized cloud servers are needed for their workload, or to tune the workload to reduce resources used so smaller, cheaper hardware can be used. This has resulted in our customers saving thousands of dollars each month.  We can help you save ongoing expenses as well.