Performance Increase? Changing Software iSCSI Adapter Queue Depth

I had an interesting point raised on the blog today from Colin over at Solori.net. He suggested setting IOPS=QUEUE_DEPTH to see if I could decrease my storage latency.  I wasn't able to find any settings to alter the queue depth on the HP StoreVirtual, and I'm not fortunate enough to have a Fibre Channel SAN kicking around, so I don't have the ability to change an HBA setting.  However, this got the grey matter whirring: what about changing the Software iSCSI Queue Depth in ESXi 5.1?

Before we get into the testing, I think it's a good idea to go over how a block of data gets from a VM to the hard disks on your NAS/SAN.

  1. Application e.g. Word Document
  2. Guest VM SCSI queue
  3. VMKernel
  4. ESXi vSwitch
  5. Physical NIC
  6. Physical Network Cable
  7. iSCSI Switch Server Port
  8. iSCSI Switch Processor
  9. iSCSI Switch SAN Port
  10. Physical Network Cable
  11. iSCSI SAN Port
  12. iSCSI Controller

I actually feel sorry for the blocks of data, they must be knackered by the time they are committed to disk.

At the moment I'm sending 1 IOP down each iSCSI path to my HP StoreVirtual VSA.  The result of this was an increase in overall IOPS performance, but also an increase in latency; see the blog post Performance Increase? Changing Default IOP Limit.

The Software iSCSI Queue Depth can be verified by going into ESXTOP and pressing u (LUN).

SSDVOL02 is naa.6000eb38c25eb740000000000000006f which has a Disk Queue Depth of 128

Disk Queue Length
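If you prefer the command line to ESXTOP, the software iSCSI module parameter can also be checked over SSH. A quick sketch, assuming ESXi 5.1; if the value comes back empty, the parameter hasn't been changed and the default of 128 is in effect:

# Show the software iSCSI LUN queue depth parameter
esxcli system module parameters list -m iscsi_vmk | grep iscsivmk_LunQDepth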

IOMeter will tell us the overall latency, which is what the Guest OS sees.  That's great, but what we really care about is knowing where the latency is happening.  This could be in one of three places:

  1. Guest VM SCSI queue
  2. VMKernel
  3. Storage Device

I have spun up a Windows 7 test VM, which has 2 vCPU and 2GB RAM.  Again for consistency I’m going to use the parameters set out by http://vmktree.org/iometer/

The Windows 7 test VM is the only VM on a single RAID 0 SSD Datastore, and it is also the only VM on the ESXi Host, so we shouldn't expect any latency caused by constrained compute resources.

We are going to use ESXTOP to measure our statistics using d (disk adapter), u (LUN) and v (VM HDD), and collate these with the IOMeter results.

The focus is going to be on DAVG/cmd, KAVG/cmd and QAVG/cmd.  These break down as follows:

DAVG/cmd is Storage Device latency.

KAVG/cmd is VMKernel latency.

GAVG/cmd is the total of DAVG/cmd and KAVG/cmd.

QAVG/cmd is the average time a command spends queued in our iSCSI Software Adapter.

Storage DEPTH

Taken from Interpreting ESXTOP Statistics

‘DAVG is a good indicator of performance of the backend storage. If IO latencies are suspected to be causing performance problems, DAVG should be examined. Compare IO latencies with corresponding data from the storage array. If they are close, check the array for misconfiguration or faults. If not, compare DAVG with corresponding data from points in between the array and the ESX Server, e.g., FC switches. If this intermediate data also matches DAVG values, it is likely that the storage is under-configured for the application. Adding disk spindles or changing the RAID level may help in such cases.’

Our Software iSCSI Adapter is vmhba37

Note that for the ESXTOP statistics, I took these at 100 seconds into each IOMeter run.
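As an aside, rather than reading the live screen at the 100 second mark, esxtop can also be run in batch mode and the counters picked apart afterwards. A rough sketch, assuming a 2 second sample interval; the iteration count and filename are just examples:

# Capture esxtop counters every 2 seconds for ~2 minutes and save them as CSV
esxtop -b -d 2 -n 60 > /tmp/iometer-run.csv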

Default 128 Queue Depth

128 Queue Depth Results

Right then, let's make some changes, shall we?

I’m going to run the command esxcfg-module -s iscsivmk_LunQDepth=64 iscsi_vmk which will decrease our Disk Queue Depth to 64.
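As an aside, the same parameter can be set through esxcli if that syntax is more familiar; this is simply an equivalent form of the command above:

esxcli system module parameters set -m iscsi_vmk -p iscsivmk_LunQDepth=64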

This will require me to reboot ESXi03 so I will see you on the other side.

DiskQueue64

Let's verify that the Disk Queue Depth is 64 by running ESXTOP with the u command.

DiskQueueLength64

Altered 64 Queue Depth

64 Queue Depth Results

Let’s run the command esxcfg-module -s iscsivmk_LunQDepth=192 iscsi_vmk which will increase our Disk Queue Depth to 192.  Then reboot our ESXi Host.

Again, we need to verify that the Disk Queue Depth is 192 by running ESXTOP with the u command.

192 ESXTOP

Altered 192 Queue Depth

192 Queue Depth Results

So the results are in.  Let’s compare each test and see what the consensus is.

Comparison Results – IOMeter

The table below is colour coded to make it easier to read.

RED – Higher Latency or Lower IOPS

GREEN – Lower Latency or Higher IOPS

YELLOW – Same results

IOMeter Results

Altering the Software iSCSI Adapter Queue Depth to 64 decreases latency by an average of 3.51%.  IOPS increase on average by 2.12%.

Altering the Software iSCSI Adapter Queue Depth to 192 decreases latency by an average of 3.03%.  IOPS increase on average by 2.23%.

Comparison Results – ESXTOP

The table below is colour coded to make it easier to read.

RED – Higher Latency or Lower IOPS

GREEN – Lower Latency or Higher IOPS

YELLOW – Same results

ESXTOP Results

Altering the Software iSCSI Adapter Queue Depth to 64 decreases latency between Storage Device and Software iSCSI Initiator by an average of 0.06%.  VMKernel latency is increased by 501.42%.

Altering the Software iSCSI Adapter Queue Depth to 192 increases latency between Storage Device and Software iSCSI Initiator by an average of 6.02%.  VMKernel latency is decreased by 14.29%.

My Thoughts

The ESXTOP GAVG lines up with the latency experienced by IOMeter for 32KB Block 100% Sequential 100% Read and 32KB Block 100% Sequential 50% Read 50% Write.  I could put the remaining differences down to latency in the Guest VM SCSI queue.

However, for the 8KB Block 40% Sequential 60% Random 65% Read 35% Write and 8KB Block 0% Sequential 100% Random 70% Read 30% Write tests, the ESXTOP GAVG and IOMeter figures are vastly different.  If anyone has some thoughts on this, that would be appreciated.

Overall, altering the Software iSCSI Adapter Queue Depth to 64 gave a slight performance increase for IOPS and latency, however not enough for me to warrant changing this full time in the vmfocus.com lab.
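For reference, as I'm not keeping the change, putting things back is just a case of setting the parameter to its default of 128 and rebooting the host again:

esxcfg-module -s iscsivmk_LunQDepth=128 iscsi_vmk
reboot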

As a final note, you should always follow the advice of your storage vendor and listen to their recommendations when working with vSphere.

How To Change Default IOP Limit

After my last blog post, I realised I hadn't actually walked you through how to change the default IOP limit used by Round Robin.

To crack on and do this, we need an SSH client such as PuTTY.

Each change only has to be made per Datastore, which makes things a little easier.

SSH to your ESXi Host and enter your credentials.  We are going to run a command to give us the Network Address Authority names of our LUNs.

esxcli storage nmp device list | grep naa

NAA 1

A quick look in the vSphere Web Client shows us which Datastores the NAAs belong to.

NAA 2
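If you'd rather not flick between the SSH session and the Web Client, the same mapping can be pulled straight from the host; a quick sketch, assuming ESXi 5.x:

# List each VMFS volume alongside the naa device it sits on
esxcfg-scsidevs -m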

In my case, I want to change the settings for all of the Datastores.  So we will start by checking the current multipath policy, to ensure it's set to Round Robin, and the default IOP maximum limit.  Let's run the following command:

esxcli storage nmp psp roundrobin deviceconfig get -d naa.6000eb3b4bb5b2440000000000000021

A bit like 'Blue Peter', here is one I did earlier! Not very helpful.

NAA 3

Let’s run the same command again but for a different NAA.

NAA 4

Excellent.  To change the default maximum IOP limit to 1, enter this command:

esxcli storage nmp psp roundrobin deviceconfig set -d naa.6000eb39c167fb82000000000000000c --iops 1 --type iops

To check everything is 'tickety boo', enter:

esxcli storage nmp device list | grep policy

You should see that each Datastore's default maximum IOP limit is set to 1.

NAA 5
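Lastly, if you have a lot of volumes presented from the same array, a small loop saves some typing. This is just a sketch, assuming all of the array's LUNs share the naa.6000eb3 prefix seen above; adjust the prefix to match your own devices:

# Apply the 1 IOP Round Robin limit to every device matching the array's naa prefix
for dev in $(esxcfg-scsidevs -c | awk '{print $1}' | grep naa.6000eb3); do
  esxcli storage nmp psp roundrobin deviceconfig set -d $dev --iops 1 --type iops
done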

Performance Increase? Changing Default IOP Maximum

I was reading Larry Smith Jr's blog post on Nexentastor over at El Retardo Land, and I hadn't realised that you could change the default maximum amount of IOPS used by Round Robin.

By default vSphere allows 1000 IOPS down each path before switching over to the next path.

Now, I wanted to test the default against 1 IOP down each path, to see if I could eke some more performance out of the vmfocus.com lab.

So before we do this, what’s our lab hardware?

ESXi Hosts

2 x HP N40L Microserver with 16GB RAM, Dual Core 1.5GHz CPU, 4 NICs

SAN

1 x HP ML115 G5 with 8GB RAM, Quad Core 2.2GHz CPU, 5 NICs

1 x 120GB OCZ Technology Vertex Plus, 2.5″ SSD, SATA II – 3Gb/s, 250MB/s read, using the onboard SATA Controller

Switch

1 x HP 1910 24G

And for good measure the software?

ESXi Hosts

2 x ESXi 5.1.0 Build 799733 using 2 x pNIC on Software iSCSI Initiator with iSCSI MPIO

1 x Windows Server 2008 R2 with 2GB RAM, 1 vCPU, 1 vNIC

SAN

1 x HP StoreVirtual VSA running SANiQ 9.5 with 4GB RAM, 2 vCPU, 4 vNIC

Switch

1 x HP v1910 24G

Let's dive straight into the testing, shall we?

Test Setup

As I'm using an HP StoreVirtual VSA, we aren't able to perform any NIC bonding, which in turn means we cannot set up LACP on the HP v1910 24G switch.

So you may ask the question: why test this, as surely to use all the bandwidth you need the NICs to be in LACP mode?  Yep, I agree with you; however, I wanted to see if changing the IOP limit per path to 1 would actually make any difference in terms of performance.

I have created an SSD Volume on the HP StoreVirtual VSA which is ‘thin provisioned’.

Volume Details

From this I created a VMFS5 datastore in vSphere 5.1 called SSDVOL01.

Datastore

And set the MPIO policy to Round Robin.

MPIO

VMF-APP01 is acting as our test server and this has a 40GB ‘thinly provisioned’ HDD.

HDD

We are going to use IOMeter to test our performance using the parameters set out under vmktree.org/iometer/

Test 1

IOP Limit – 1000

SANiQ v9.5

Test 1

Test 2

IOP Limit – 1

SANiQ v9.5

Test 2

Test 1 v 2 Comparison

Test 1 Comparison

We can see that we get extra performance at the cost of higher latency.  Now let's upgrade to SANiQ v10.0, AKA LeftHand OS 10.0, perform the same tests again and see what results we get, as HP claim it to be more efficient.

Test 3

IOP Limit – 1000

LeftHand OS10.0 (SANiQ v10.0)

Test 3

Test 1 v 3 Comparison

Test 1v3  Comparison

HP really have made LeftHand OS 10.0 more efficient; some very impressive results!

Test 4

IOP Limit – 1

LeftHand OS10.0 (SANiQ v10.0)

Test 4

Test 2 v 4 Comparison

Test 2v4 Comparison

Overall, higher latency for slightly better performance.

Test 1 v 4 Comparison

Test 1v4 Comparison

Compared to our original configuration of a 1000 IOPS limit per path on SANiQ 9.5, it is clear that an upgrade to LeftHand OS 10.0 is a must!

Conclusion

I think the results speak for themselves.  I'm going to stick with the 1 IOP limit on LeftHand OS 10.0 as, even though the latency is higher, I'm getting a better return on my overall random IOPS.

Gotcha: vSphere Metro Storage Cluster (VMSC) & HP StoreVirtual

So you have put together an epic vSphere Metro Storage Cluster using your HP StoreVirtual SAN (formerly LeftHand), following these rules:

  • Creating volumes for each site to access its datastore locally rather than going across the inter-site link
  • Creating DRS ‘host should’ rules so that VMs run on the ESXi Hosts local to the volumes and datastores they are accessing.

The gotcha occurs when you either have a StoreVirtual Node failure or a StoreVirtual Node is rebooted for maintenance; let me explain why.

In this example we have a Management Group called SSDMG01 which contains:

  • SSDVSA01 which is in Site 1
  • SSDVSA02 which is in Site 2
  • SSDFOM which is in Site 3

We have a single volume called SSDVOL01, which is located at Site 1.

StoreVirtual uses a ‘Virtual IP’ address to ensure fault tolerance for iSCSI access; you can view this under your Cluster and then iSCSI within the Centralized Management Console.  In my case it's 10.37.10.2.

Even though iSCSI connections are made via the Virtual IP Address, each Volume goes via a ‘Gateway Connection’ which is essentially just one of the StoreVirtual Nodes.  To check which gateway your ESXi Hosts are using to access the volumes, select your volume and then choose iSCSI Sessions.

In my case the ESXi Hosts are using SSDVSA01 to access the volume SSDVOL01 which is correct as they are at Site 1.

Let's quickly introduce a second Volume called SSDVOL02, which we want to be in Site 1 as well.  Let's take a look at the iSCSI sessions for SSDVOL02.

Crap, they are going via SSDVSA02 which is at the other site, causing latency issues.  Can I do anything about this in the CMC? Not that I can find.

HP StoreVirtual is actually very clever; what it has done is load balance the iSCSI connections for the volumes across both nodes in case of a node failure, in this case SSDVOL01 via SSDVSA01 and SSDVOL02 via SSDVSA02.  If you have ever experienced a StoreVirtual node failure, you will know that it takes around 5 seconds for the iSCSI sessions to be remapped, leaving your VMs without access to their HDDs for this time.

What can you do about this? Well, when creating your volumes, make sure you create them in the order that gives site affinity to the ESXi Hosts, as we know that HP StoreVirtual just round robins the Gateway Connection.

That's all very well and good, but what happens when I have a site failure? Let's go over this now.  I'm going to pull the power from SSDVSA01, which is the Gateway Connection for SSDVOL01 and actually has a number of VMs running on it.

Man down! As you can see, we have a critical event against SSDVSA01 and the volume SSDVOL01 status is ‘data protection degraded’.

Let's take a quick look at the iSCSI sessions for SSDVOL01; they should be using the Gateway Connection SSDVSA02.

Yep, all good, it's what we expected.  Now let's power SSDVSA01 back up again and see what happens.  You will notice that the HP StoreVirtual re-syncs the volume between the Nodes and then it's shown as Status: Normal.

Here's the gotcha: the iSCSI sessions will continue to use SSDVSA02 in Site 2 even though SSDVSA01 is back online at Site 1.

After around five minutes StoreVirtual will automatically rebalance the iSCSI Gateway Connections.  Great, you say; ah, but we have a gotcha.  As SSDVOL02 has now been online the longest, StoreVirtual will use SSDVSA01 as its Gateway Connection, leaving SSDVOL01 on SSDVSA02 and sending its traffic across the inter-site link.  So to summarise our current situation:

  • SSDVOL01 using SSDVSA02 at Site 2 as its Gateway Connection
  • SSDVOL02 using SSDVSA01 at Site 1 as its Gateway Connection

Not really the position we want to be in!

Rebalance 2

Rebalance

We can get down and dirty with the CLIQ to manually rebalance SSDVOL01 onto SSDVSA01, perhaps? Let's give it a whirl, shall we?

Log in to your VIP address using SSH, but on port 16022, and enter your credentials.

Then we need to run the command ‘rebalanceVIP volumeName=SSDVOL01’
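For completeness, the whole exchange from a client machine looks something like this; the admin username is just a placeholder for whichever CMC credentials you use, and the VIP is the 10.37.10.2 address from earlier:

# SSH to the CLIQ on the Virtual IP, note the non-standard port
ssh -p 16022 admin@10.37.10.2
# then, at the CLIQ prompt
rebalanceVIP volumeName=SSDVOL01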

Rebalance 3

If you're quick and flick over to the CMC, you will see the Gateway Connection status as ‘failed’.  This is correct, so don't panic.

Rebalance 4

Do we have SSDVOL01 using SSDVSA01? Nah!

Rebalance 2

The only way to resolve this is to either Storage vMotion your VMs onto a volume with enough capacity at the correct site, or reboot your StoreVirtual Node in Site 2.

In summary, even though HP StoreVirtual uses a Virtual IP Address, this is tied to a Gateway Connection via a StoreVirtual Node, and you are unable to change the iSCSI connections manually without rebooting the StoreVirtual Nodes.

Hopefully HP will fix this with the release of LeftHand OS 10.1.

LeftHand OS 10.0 – Active Directory Integration

I upgraded the vmFocus lab last night to LeftHand OS 10.0 as with anything new and shiny, I feel an overwhelming urge to try it!

So what’s new? Well according to the HP Storage Blog the following:

  • Increased Windows integration – We now offer Active Directory integration which will allow administrators to manage user authentication to HP StoreVirtual Storage via the Windows AD framework. This simplifies management by bringing SAN management under the AD umbrella. With 10.0 we are also providing support for Windows Server 2012 OS.
  • Improved performance – The engineering team has been working hard with this release and one of the great benefits comes with the performance improvements. LeftHand OS version 10.0 has numerous code enhancements that will improve the performance of HP StoreVirtual systems in terms of application performance as well as storage related functions such as snapshots and replication. The two major areas of code improvements are in multi-threading capabilities and in internal data transmission algorithms.
  • Increased Remote Copy performance – You’ll now experience triple the performance through optimization of the Remote Copy feature that can reduce your backup times by up to 66%.
  • Dual CPU support for VSA – In this release, the VSA software will now ship with 2 vCPUs enabled. This capability, in addition to multi-threading advancements in 10.0, enhances performance up to 2x for some workloads. As a result of this enhancement, we will now also support running 2 vCPUs in older versions of VSA. So if you’ve been dying to try it, go ahead. Our lab tests with SAN/iQ 9.5 and 2 vCPUs showed an up to 50% increase in performance.
  • Other performance improvements – 10.0 has been re-engineered to take advantage of today’s more powerful platforms, specifically to take better advantage of multi-core processors, and also improves the performance of volume resynchronization and restriping and merging/deleting snapshot layers.

Active Directory Integration

The first thing I wanted to get up and running was Active Directory integration.  So I went ahead and created a Security Group called CMC_Access

CMC SG

Naturally, we need a user to be in a Security Group, so I created a service account called CMC and popped this into the CMC_Access Security Group

CMC User

Into the CMC we go, oops, I mean its new name, which is the HP LeftHand Centralized Management Console.  Expand your Management Group, right click Administration and select Configure External Authentication.

CMC External Authentication 1

Awesome, we now need to configure the details as follows:

  • Bind User Name: the format is username@domain, so in my case it’s cmc@vmfocus.local
  • Bind Password: your password, so in my case it’s ‘password’
  • Active Directory Server IP Address: 192.168.37.201 (which is VMF-DC01), and the port is 389
  • Base Distinguished Name: this is DC=vmfocus,DC=local

CMC External Authentication 2

Hit ‘Validate Active Directory’ and you should be golden.

CMC External Authentication 3
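If the validation fails, it can be quicker to test the bind details outside of the CMC first. This isn't part of the HP workflow, but from any Linux box with the OpenLDAP client tools installed, something along these lines will confirm the Bind User Name, password and Base Distinguished Name are sane; the values are simply lifted from the settings above:

# Simple bind against the domain controller using the CMC service account
ldapsearch -x -H ldap://192.168.37.201:389 \
  -D "cmc@vmfocus.local" -w 'password' \
  -b "DC=vmfocus,DC=local" "(sAMAccountName=cmc)" dn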

Hit Save, don’t worry it will take a while.

TOP TIP: If you're not sure what your Base Distinguished Name is, launch ADSI Edit and that will soon tell you.

Next we need to Right Click on Administration and choose New Group

CMC External Authentication 4

Give your Group a name and a Description.  I'm going to roll with cmc_access (I know, original) and they are going to have Full rights.  We then need to click on Find External Group.

CMC External Authentication 5

In the ‘Enter AD User Name’ box, enter the Bind User Name from the External Authentication, so in my case this is cmc@vmfocus.local, and hit OK.

CMC External Authentication 6

If all has gone to plan, you should see your Active Directory Group; select this and hit OK.

CMC External Authentication 7

It should appear in the Associate an External Group dialogue box, hit OK

CMC External Authentication 8

Then log out and log back in again as your Active Directory user, making sure that you use the format name@domain.

CMC External Authentication 9

One of the odd things that I have noticed is that it takes an absolute age to log in.  I'm not sure why this is, but I'm sure HP will fix it in an upcoming release!