Thursday, July 11, 2013

Best Practices for SQL Server

Introduction

At VMworld 2008 in Las Vegas several of us in our virtual performance  team met with a variety of customers to talk about Microsoft SQL Server.  We already had a large base of customers running very many SQL Server  DBs on our products and we wanted to collect information on the  challenges posed in the process of virtualizing this critical workload.  We were pleased to see that ESX Server handled SQL VMs with excellent  performance. But, for many customers, the first efforts at virtualizing  SQL didn't yield high-performing SQL VM.  After careful investigation  and many, many discussions we've started to put together the puzzle as  to where SQL Server performance problems come from.  This page will  document these common problems, borrowing slides from our presentations  on the subject.

Virtualizing SQL: The Checklist

We've talked with dozens of customers in the past months to document the  issues that resulted in poor SQL performance. Happily, none of the  issues were due to underlying technologies. Here is a list of issues and  an explanation of the impacts. These items are roughly listed in the  order of decreasing likelihood of occurrence.

Item 1: Configure Storage Correctly

Storage configuration problems are the number one cause of SQL  performance issues.  Usually these problems arise because the DBA  requests a virtual disk of the VI admin, the VI admin places the VMDK on  a LUN that may or may not meet the DBA's performance needs.  For  instance:
  1. VMs' VMDK files placed on VMFS volumes without enough spindles.
  2. Many VMDK files placed on a single VMFS volume which could use more spindles.
  3. Database and log files placed on the same LUN which, you guessed it, could use more spindles.

This may be obvious to some, but this problem occurs again and again.   The VI administrator should be aware of a few technical items that can  help understand and avoid this problem:
  1. Based on the IO demands of the DB files, a certain number of  spindles should be guaranteed to this file.  This means that its VMDK  must be placed on a VMFS volume to accout for the SQL Server's demands  and all of the other demands on that volume.
  2. Mixing sequential activity (such as log file update) and random  activity (such as database access) results in random behavior.  This  means that the LUN configuration in the pre-virtual physical environment  may not be sufficient for the consolidated environment.  This is  discussed some in Storage Performance: VMFS and Protocols.
  3. When storage isn't meeting the SQL Server's demands, the device  latency or kernel latency (queueing time) will increase.  Read up on  these counters in Storage Performance Analysis and Monitoring.

Item 2: Use Recent Hardware


Often companies that are dipping their metaphorical toes into
virtualization want to run proof-of-concept (POC) experiments to verify
that the virtual platform can meet their performance expectations. But
its surprising how many times these experiments are run on older,
poorly-performing hardware. Presumably the shiny, new systems were in
use for production applications so only the mothballed, cobweb-covered
servers from a previous generation were available for the POC. This
causes many problems.  Check out this slide from a talk on SQL Server at VMworld Europe 2009:
newer_hardware.png
The slide points out a couple of things. First, the larger caches and  shorter pipelines on newer Intel processors results in a considerable  drops in performance overheads.  Second, the latency of the VMEXIT  instruction, which determines the amount of time it takes to transition  from the VM to the VMkernel, has shrunk by a large amount with  subsequent generations of hardware.  And don't forget the other  additions from Intel and AMD such as hardware assisted memory management  and IO virtualization.

Item 3: Follow SQL Server Best Practices


Microsoft has kindly provided a web page of best practices for SQL Storage configuration. These be practices should still be followed when configuring your virtual SQL deployments!

Item 4: Configure VM Identically to Native and Run The Right Test

For many SQL Server POCs the goal is to measure the VM's ability to  perform, with respect to the virtual platform. If this comparison is to  be performed, its critical that the VM be configured identically to the  physical hardware. Obviously this means that the VM should be run on the  same hardware using identically configured LUNs. Its also important to  ensure that the VM has the same number of vCPUs and amount of memory as  the physical baseline. This means restricting the number of pCPUs and  amount of memory with NUMPROC and MAXMEM, respectively, in boot.ini.

It also means that the test being applied should be understood.  If a  benchmark is chosen that uses a very small database, the content will be  cached and the storage system won't be used.  This can skew the results  and produce recommendations not consistent with production deployments.   Here is another slide from the same VMworld Europe 2009 presentation  detailing some of what we know about the SQL Server benchmarking  alternatives:
sql_benchmarks.png
We at VMware prefer DVD Store.

Item 5: Use VMware's ESX Server


VMware's hosting products, VMware Server, VMware Workstation, and even  VMware Fusion, are all capable of running SQL Server. But if the  database is going to be run in production on enterprise-class hardware,  use VMware's enterprise-class hypervisor: ESX Server.  These products  are not often confused by the initiated but rogue members of large  companies often run off-the-books proof-of-concept experiments on  VMware's hosted products.  When they produce results they don't like,  the results get spread throughout the company which can slow the virtual  deployment.
Consider the following data, again from the VMworld Europe 2009 SQL Server presentation:
vmmark_esx_server.png
This information is getting a bit dated now, as it was performed years  ago on ESX Server 3.0.  But the point stands: before believing results  claiming that "VMware cannot run SQL Server" its worth investigating the  platform used to generate the results.

Item 6: Understand Memory Management and Configure Correctly

Database performance is heavily dependent on the amount of memory  available. Almost without exception, providing more memory to SQL Server  will improve performance. However, if that memory is coming from a host  that is already over-committed or is being provided through workarounds  to 32-bit limitations, performance may suffer. Here are a few keys for  SQL Server memory management:
  1. If more than 3 GB is desired, use 64-bit versions of the OS and application.
  2. If memory is over-committed on the box, set reservations for  performance-critical SQL Server VMs to guarantee that those VMs' memory  isn't ballooned or swapped out.
  3. If SQL Server's "lock pages in memory" parameter has been set,  provide set the VM's reservations to the amount of memory in the VM.  This setting can adversely interfere with ESX Server's balloon driver.  Setting reservations will stop the balloon driver from inflating into  the VM's memory space.

Item 7: Align Disk Partitions

This item is really a special but very important case of item two,  follow best practices. Partition alignment can impact storage  performance which can be critical to some SQL Server VMs' performance.  See VMware's paper on partition alignment for more information on this.

Indonesian Greeting on Voyager Golden Record

In September 5th 1977, NASA launched space probe named Voyager 1. The mission to observed our solar systems and beyond. It's still trave...