Storage Spaces Direct Overview

Storage Spaces Direct is an area which I have been meaning to look into, but for one reason or another it has slipped through the cracks until now.

What Is Storage Spaces Direct?

Storage Spaces Direct is a shared-nothing, software-defined storage solution which is part of the Windows Server 2016 operating system.  It creates a pool of storage using the local drives from a collection of two or more individual servers.

The storage pool is used to create volumes which have built-in resilience, so if a server or hard drive fails, data remains online and accessible.
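To give a feel for how little is involved, once the servers are clustered the whole thing is enabled from PowerShell, which claims the eligible local drives on every node into a single pool (a minimal sketch, run from any cluster node):

Enable-ClusterStorageSpacesDirect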

What Is The Secret Sauce?

The secret sauce is within the ‘storage bus’, which is essentially the transport layer that provides the interaction between the physical disks across the network using SMB3. It allows each of the hosts to see all disks as if they were its own local disks, using the Cluster Port and Cluster Block Filter components.

In iSCSI terms, the Cluster Port acts like the initiator and the Cluster Block Filter acts like the target; this allows each disk to be presented to every host as if it were local.

Storage Bus v0.1

For a Microsoft-supported platform you will need a 10GbE network with RDMA-capable adapters, using either iWARP or RoCE, for the Storage Bus.
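If you want to sanity check that your adapters are RDMA capable before building the cluster, a quick way to do it from PowerShell on each node (a sketch; the second cmdlet shows the RDMA Capable column from the SMB client's point of view) is:

Get-NetAdapterRdma
Get-SmbClientNetworkInterface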

Disks

When it comes to Storage Spaces Direct, not all disks are equal and there are a number of disk configurations which can be used.  Drive choices are as follows:

  • All Flash NVMe
  • All Flash SSD
  • NVMe for Cache and SSD for Capacity (Writes are cached and Reads are not Cached)
  • NVMe for Cache and HDD for Capacity
  • SSD for Cache and HDD for Capacity (could look at using more expensive SSD for cache and cheaper SSD for capacity)
  • NVMe for Cache and SSD and HDD for Capacity

In an SSD and HDD configuration, the Storage Bus Layer Cache binds the SSDs to the HDDs to create a read/write cache.

Using NVMe-based drives will provide circa three times the performance at typically 50% lower CPU cycles versus SSD, but they come at a far greater cost point.

It should be noted that, as a minimum, 2 x SSD and 4 x HDD are needed for a supported Microsoft configuration.

Hardware

In relation to the hardware, it must be on the Windows Server Catalog and certified for Windows Server 2016.  Both the HPE DL380 Gen10 and Gen9 are supported, along with the HPE DL360 Gen10 and Gen9.  When deploying Storage Spaces Direct you need to ensure that the cluster creation passes all validation tests to be supported by Microsoft (see the example after the list below).

  • All servers need to be the same make and model
  • Minimum of an Intel Nehalem processor
  • 4GB of RAM per TB of cache drive capacity on each server to store metadata, e.g. with 2 x 1TB SSD cache drives per server, 8GB of RAM is dedicated to Storage Spaces Direct
  • 2 x NICs that are RDMA capable with either iWARP or RoCE, dedicated to the Storage Bus
  • All servers must have the same drive configuration (type, size and firmware)
  • SSDs must have power loss protection (enterprise grade)
  • Simple pass-through SAS HBA for SAS and SATA drives
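As a sketch of that validation step (the node names below are made up), the dedicated Storage Spaces Direct tests can be included alongside the standard cluster validation tests:

Test-Cluster -Node S2D-N01, S2D-N02, S2D-N03, S2D-N04 -Include "Storage Spaces Direct", "Inventory", "Network", "System Configuration"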

Things to Note

  • The cache layer is completely consumed by the Storage Bus Layer Cache and is not available to store data on
  • Microsoft’s recommendation is to make the number of capacity drives a multiple of the number of cache drives, e.g. with 2 x SSD per server use either 4 x HDD or 6 x HDD per server
  • Microsoft recommends a single storage pool per cluster, e.g. all the disks across a 4 x Hyper-V host cluster contribute to a single storage pool
  • For a 2 x server deployment the only resilience choice is a two-way mirror.  Essentially data is written to two different drives in two different servers, meaning your capacity layer is reduced by 50%.
  • For a 3+ server deployment Microsoft recommends a three-way mirror.  Essentially three copies of data are kept across 3 x drives on 3 x servers, reducing usable capacity to 33%.  You can use single parity (a la RAID 5) but Microsoft does not recommend this.
  • Typically a 10% cache-to-capacity ratio is recommended, e.g. with 4 x 4TB capacity drives (16TB) per server, 2 x 800GB cache drives should be used.
  • When the storage pool is configured, Microsoft recommends leaving 1 x HDD’s worth of capacity for immediate in-place rebuilds of failed drives.  So with 4 x 4TB you would leave 4TB unallocated in reserve.
  • The recommendation is to limit storage capacity per server to 100TB, to reduce resync times after downtime, reboots or updates
  • Microsoft recommends using ReFS for Storage Spaces Direct for its performance acceleration and built-in protection against data corruption, however it does not yet support de-duplication.  See more details here https://docs.microsoft.com/en-us/windows-server/storage/refs/refs-overview
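To put that ReFS recommendation into practice, volumes are typically carved out of the single storage pool with PowerShell along these lines (a minimal sketch; the friendly names and size are just examples, and on three or more servers the resiliency defaults to a three-way mirror):

New-Volume -StoragePoolFriendlyName "S2D*" -FriendlyName "CSV01" -FileSystem CSVFS_ReFS -Size 2TB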

HP 3PAR Streaming Remote Copy Replication

The replication in 3PAR arrays has always been mediocre.  In older versions of the 3PAR InForm OS, if you chose synchronous replication for a single remote copy group, you could not use asynchronous replication for any other remote copy groups.

This was addressed in a newer version of the InForm OS, however your lowest RPO using asynchronous (periodic) replication was bottlenecked at 15 minutes regardless of available bandwidth.

With the release of the HP 3PAR 20000 Series comes a new feature: streaming asynchronous replication.

What Is Streaming Async Replication?

Essentially, if you have the bandwidth and cache available, the source 3PAR will stream replication across to the target 3PAR, reducing your RPO to below 15 minutes.  I like to think of it as best endeavours.
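From a configuration point of view, streaming async is just another Remote Copy mode.  As a rough sketch of the CLI flow (the group, volume and target names below are made up, and the exact keywords may differ between 3PAR OS releases, so check the CLI reference for your array):

3PAR01 cli% creatercopygroup RCG-Prod01 3PAR-DR:async
3PAR01 cli% admitrcopyvv VV-Prod01 RCG-Prod01 3PAR-DR:VV-Prod01-DR
3PAR01 cli% startrcopygroup RCG-Prod01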

Replication Modes

When designing a replication infrastructure it’s important to know the transport method as well as the thresholds in terms of bandwidth and latency between the source and target arrays.  This ensures not only that you are within a supported SLA, but also that the write performance of the source array is not affected.

The table below shows supported thresholds.

Replication Modes

Architecture

The source array uses a local cache to maintain host write transactions in memory.  A concept known as ‘delta sets’ is used.

Source 3PAR Array

  • I/Os are transferred from the primary array to the secondary array as part of delta sets
  • I/Os on the primary array that belong to a particular remote copy group are grouped together into delta sets
    • A delta set is made up of sub-sets of I/Os, where each sub-set represents the I/Os owned by a remote copy group on a given node

Target 3PAR Array

  • A delta set is applied on the secondary RC volume group only after:
    • The entire delta set has been received in the secondary array cache
    • And the previous sets that this delta set depends upon have completed.
  • A secondary RC volume group is always in a crash consistent state, before or after the application of a delta set. It is not crash consistent during the application of a delta set.
    • If the delta set fails to apply on the secondary volume then the group stops and a fail back to the last coordinated snapshot is required

Remote Copy Architecture

What About Write Bursts?

A write burst is when the array receives a significant number of writes, which could last for a few minutes.  If the inter-site link between the source and target arrays is sufficient, this has no impact.

It is when the inter-site link cannot cope, or the write cache fills up, that the source 3PAR will choose a random remote copy group to stop, and a snapshot is taken.

Note: you have no control over which remote copy group is stopped.

Once stopped these groups will start again at the next sync period.

Final Thoughts

This is a great feature set being added to the 3PAR 20000 Series.  I’m sure that when the next point release arrives you will be able to select which remote copy groups you want to stop, either due to a write burst or cache overflow.

As with most 3PAR updates, I expect streaming async replication to find its way into the 7x00 series within a short period of time.

How To: Map HP StoreVirtual Volumes to Datastores

Problem Statement

You have created numerous volumes of the same size on your HP StoreVirtual and presented these to your ESXi hosts as datastores.  However, you have since forgotten how the datastores map back to the volumes.

When you check the Runtime Name of your devices (Storage > Devices) to find out the LUN number, you see that each LUN number is ‘0’, as per the screenshot below.

LUN 0

This can be confirmed in HP StoreVirtual Centralised Management Console under Servers > Select Server > Volumes & Snapshots

LUN 0 HP SV

Not very helpful at all!

Resolution

Each datastore has a unique iSCSI Target string which can be used to identify how it maps to a volume.

To find out what it is, select the Datastore > Properties > Manage Paths

Device Properties

At the bottom we can see the Target, which tells us the following details:

  • DC02-MG01
    • Denotes the Management Group the volume is in
  • 39 is the decimal equivalent of hexadecimal 27, which is what appears in the VMware NAA identifier (thanks to Jonathan Reid for this information)
    • Denotes the unique target identifier for the volume
  • DC01-DR01SRM
    • Denotes the volume name on the HP StoreVirtual

Target Name

So we now know this datastore corresponds to the volume called DC01-DR01SRM in Management Group DC02-MG01.
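If you prefer the command line, the same target strings can also be pulled from the ESXi host itself (a quick sketch, assuming the software iSCSI adapter is in use):

esxcli iscsi session list

The Target field in the output contains the same management group / identifier / volume name string, which you can match against the device backing each datastore.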

3PAR StoreServ Zoning Best Practice Guide

This is an excellent guide which has been written by Gareth Hogarth who has recently implemented a 3PAR StoreServ and was concerned about the lack of information from HP in relation to zoning.  Being a ‘stand up guy’ Gareth decided to perform a lot of research and has put together the ‘3PAR StoreServ Zoning Best Practice Guide’ below.

This article focuses on zoning best practices for the StoreServ 7400 (4 node array), but can also be applied to all StoreServ models including the StoreServ 10800 8-node monster.

3PAR StoreServ Zoning Best Practice Guide

Having worked on a few of these, I found that a single document on StoreServ zoning BP doesn’t really exist. There also appears to be conflicting arguments on whether to use Single Initiator – Multiple Target zoning or Single Initiator – Single Target zoning. The information herein can be used as a guideline for all 3PAR supported host presentation types (VMware, Windows, HPUX, Oracle Linux, Solaris etc…).

Disclaimer:  Please note that this is based on my investigation, engaging with HP Storage Architects and Implementation Engineers. Several support cases were opened in order to gain a better understanding of what is & isn’t supported. HP recommendations change all the time, therefore it’s always best to speak with HP or your fabric vendor to ensure you are following latest guidelines or if you need further clarification.

Right, let’s start off with Fabric Connectivity

In terms of host connectivity options the StoreServ 7000 (specifically the 7400) provides us with the following:

  • 4x built-in 8 Gb/s Fibre Channel ports per node pair.
  • Optional 8 Gb/s Quad Port Fibre Channel HBA (Host Bus Adapter) per node (we will be focusing on this configuration option).
  • Optional 10 Gb/s Dual Port FCoE (Fibre Channel over Ethernet) converged network adapter per node.

StoreServ target ports are identified in the following manner Node:Slot:Port.

StoreServ target ports located on the on-board HBAs will always assume the slot identity of 1, while target ports located on the optional expansion slot HBA will always assume the identity of slot 2.

StoreServ nodes are grouped in pairs; it’s important to pay particular attention to this when zoning host initiators (server HBA ports) to the StoreServ target ports.

StoreServ7000-HostPorts

Recommendations

  • Each HP 3PAR StoreServ node should be connected to two fabric switches.
  • Ports of the same pair of nodes with the same ID (value) should be connected to the same fabric.
  • General rule – odd ports should be connected to fabric 1 and even ports should be connected to fabric 2.

Figure 1a below identifies physical cabling techniques, mitigating against single points of failure using a minimum of two fabric switches, which are separated from each other.

The example below illustrates StoreServ nodes with supplementary quad port HBA’s:

figure 1a_StoreServ_nPcabling

Moving on to Port Persistence

As already covered by Craig in this blog post, a host port would be connected and zoned on the fabric switch via one initiator (host HBA port) to one HP 3PAR StoreServ target port (one-to-one zoning). The pre-designated HP 3PAR StoreServ backup port must be connected to the same fabric as its partner node port.

It is best practice that a given host port sees a single I/O path to HP 3PAR StoreServ. As an option, a backup port can be zoned to the same host port as the primary port, which would result in the host port seeing two I/O paths to the HP 3PAR StoreServ system. This would also result in the configuration where a HP 3PAR StoreServ port can serve as the primary port for a given host port(s) and backup port for host port(s) connected to its partner node port.

Persistent ports leverage SAN fabric NPIV functionality (N_Port ID Virtualization) for transparent migration of a host’s connection, to a predefined partner port on the HP 3PAR StoreServ array during software upgrades or node failure.

One of the ways this is accomplished is by having a predefined host-facing port on the 3PAR StoreServ array, so that in the event of an upgrade (node shutdown) or a node-down status, the partner port will assume the identity of the failed port. The whole process is transparent to the host. When the node returns to normal, I/O is failed back to the original target port.

Although unconfirmed, I have heard that in future releases of the InForm OS we will get this level of protection at the port level.

Essentially, for this to work Port Persistence requires that the corresponding ‘native’ and ‘guest’ StoreServ ports on a node pair be connected to the same Fibre Channel fabric.

Requirements for 3PAR Port Persistence:

  • The same host ports on the host-facing HBAs in the nodes of a node pair must be connected to the same fabric switch.
  • The host facing ports must be set to target mode.
  • The host facing ports must be configured for point-to-point connections.
  • The Fibre Channel fabric must support NPIV and have NPIV enabled on the switch ports.

Checking and enabling NPIV

Brocade Fabric OS (ensure you have the appropriate license which enables NPIV)

admin> portcfgshow ‘port#’

If the NPIV capability is enabled, the results of the portcfgshow command will identify this, i.e NPIV capability ON.

If the NPIV capability is not enabled, you can turn it on with the following command:

admin> portCfgNPIVPort ‘port#’ 1   (1 = on, 0 = off)

 Cisco MDS Series Switches

fabSwitch # conf t

fabSwitch(config) # feature npiv (Enables NPIV for all VSANs on the switch)

QLogic SANbox 3800, 5000 and 9000 Switches

These don’t require a license; NPIV is enabled by default (just ensure you are using firmware version 6.8.0.0.3 or above).

Now let’s cover Switch Zoning (Fibre Channel)

SAN zoning is used to logically group hosts and storage devices together in a physical SAN, so that authorised devices can only communicate with each other if they are in the same SAN zone.

The function of zoning is to:

  • Restrict access so that hosts can only see the data they are authorised to see.
  • Prevent RSCN (Registered State Change Notification) broadcasts.

What are ‘RSCNs’? RSCNs are a feature of fabric switches.  It’s a service of the fabric that notifies devices of changes in the state of other attached devices, for example if a device is reset, removed or otherwise undergoes a significant change in status.

These broadcasts are made to all members of the configured SAN zone. As hosts and storage targets can be grouped in a zone, it’s best practice to reduce the impact of these types of broadcasts (Note: an argument against RSCNs causing issues in zoning tables is that newer HBAs do a good job of limiting the impact of these types of broadcasts).  Nevertheless, I prefer limiting the number of initiators and targets in a fabric zone to a minimum.

Zoning Types

  • Domain, Port zoning uses switch domain IDs and port numbers to define zones.
  • Port World Wide Name or pWWN zoning uses port World Wide Names to define zones. Every port on an HBA has a unique pWWN. (A host HBA comprises an nWWN and a pWWN; the nWWN refers to the whole device whereas the pWWN refers to the individual port.)

The preferred zoning unit for the 3PAR StoreServ is pWWN. If you are currently using Domain, Port, migrating to pWWN is very easy. Simply create new zones based on the pWWN of the host and the pWWN of the storage target, add these new zones to your fabric switches, zoning out the references to Domain, Port for that respective HBA port. Some fabric vendors support mixing both Domain, Port and pWWN in the same zone. I prefer using one or the other explicitly.

The following command outputs the StoreServ ports and partner ports, which can be used to identify the node pWWNs for zoning.

3PAR01 cli% showport 

HP 3PAR StoreServ supports the following zoning configurations:

  • Single initiator – Single Target per zone (recommended)
  • Single initiator – Multiple Targets per zone

Use Single Initiator – Single Target per zone over Single Initiator – Multiple Targets per zone to reduce RSCNs, as previously discussed.

At the time of writing, HP 3PAR OS implementation documentation references Single Initiator Multiple Targets as the recommended zoning type. However, when I queried this I was directed to use Single Initiator – Single Target Zoning.  HP support pointed me in the direction of this document which identifies Single Initiator – Single Target zoning as best practice: http://www8.hp.com/h20195/v2/GetDocument.aspx?docname=4AA4-4545ENW

HP will support Single Initiator – Multiple Target, but you should not have a single host initiator attached to more than two StoreServ target ports!

Host port WWNs should be zoned in partner pairs. For example, if a host is zoned to node port 0:2:1, then it should also be zoned to node port 1:2:1 (I’m speculating here, but I guess this is because controller nodes mirror cache I/O, so that in the event of node failure write operations in cache are not lost – hence we zone in node pairs and not across nodes from different pairs).

After you have zoned the host pWWN to the StoreServ node pWWN, you can use the 3PAR CLI showhost command to ensure that each host initiator is zoned to the correct StoreServ target ports (ensuring initiators go to different targets over different fabrics).
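For reference, the detailed listing looks something like this (a sketch – the -d flag should show each initiator WWN together with the StoreServ node port it has logged in to):

3PAR01 cli% showhost -d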

Figure 1b represents a staggered approach where you would have odd numbered VMware hosts connecting to nodes 0 & 1, and even numbered hosts connecting to nodes 2 & 3 (Note: currently the StoreServ is designed to tolerate a single node failure only, this includes the 8-node StoreServ 10800 array).

The example depicts Single Initiator – Single Target zoning, so a host with two HBA ports connecting over two fabrics will have a total of four zones (two per fabric). In case you were wondering, the maximum allowed is eight (also known as the fan-in limitation, which is four per fabric).

figure 1b_host_zoning
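To make that concrete, here is a rough Brocade FOS sketch of one of those four zones, using made-up aliases and WWNs for the host HBA port and StoreServ port 0:2:1 (the remaining three zones follow the same pattern for the partner node port and for the second HBA port on the other fabric):

admin> alicreate "ESX01_HBA0", "10:00:00:00:c9:aa:bb:01"
admin> alicreate "3PAR_N0S2P1", "20:21:00:02:ac:00:12:34"
admin> zonecreate "ESX01_HBA0__3PAR_N0S2P1", "ESX01_HBA0; 3PAR_N0S2P1"
admin> cfgadd "FABRIC1_CFG", "ESX01_HBA0__3PAR_N0S2P1"
admin> cfgenable "FABRIC1_CFG"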

Here are some additional points to be aware of

 Fan-in/Fan-out ratios:

  • Fan-in refers to a host server port connected to several HP 3PAR storage ports via Fibre Channel switch.
  • Fan-out refers to the HP 3PAR StoreServ storage port that is connected to more than one host HBA port via Fibre Channel switch.

Note: Fan-in over-subscription represents the flow of data in terms of client initiator to StoreServ target ports. HP/3PAR documentation states that a maximum of four HP 3PAR storage system ports can fan-in to a single host server port. If you are thinking ‘great, I’ll connect my VMware host to eight ports (four per fabric)’, think again – with hundreds of hosts this approach can quickly reach the maximum StoreServ port connection limitation, which is 64, and it’s just not necessary.

StoreServ Target Port Maximums (As per 3PAR InForm OS 3.1.1 please observe the following):

  • Maximum of 16 hosts initiators per 2Gb HP 3PAR StoreServ Storage Port
  • Maximum of 32 hosts initiators per 4Gb HP 3PAR StoreServ Storage Port
  • Maximum of 32 hosts initiators per 8Gb HP 3PAR StoreServ Storage Port
  • Maximum total of 1,024 host initiators per HP 3PAR StoreServ Storage System

HP documentation states that these recommendations are guidelines, adding more than the recommended hosts should only be attempted, when the total expected workload is calculated and shown not to overrun either the queue depth or throughput of the StoreServ node port.

Note: StoreServ storage ports, irrespective of speed, will negotiate at the lowest speed of the supporting fabric switch (keep this in mind when calculating the number of host connections).

The following focuses on changing the target port queue depth on a VMware ESX environment.

The default setting for target port queue depth on the ESX host can be modified to ensure that the total workload of all servers will not overrun the total queue depth of the target HP StoreServ system port. The method endorsed by HP is to limit the queue depth on a per-target basis. This recommendation comes from limiting the number of outstanding commands on a target (HP 3PAR StoreServ system port), per ESX host.

The following values can be set on the HBA running VMware vSphere. These values limit the total number of outstanding commands the operating system routes to one target port:

  • For Emulex HBA target throttle = tgt_queue_depth
  • For Qlogic HBA target throttle = ql2xmaxqdepth
  • For Brocade HBA target throttle = bfa_lun_queue_depth

(Note: for instructions on how to change these values follow VMware KB1267; these values are also adjustable on Red Hat Linux and Solaris.)

The formula used to calculate these values is as follows (a worked example follows the table below):

(3PAR port queue depth [see below]) / (total number of ESX servers attached) = recommended value

The I/O queue depth for each HP 3PAR StoreServ storage system HBA mode is shown below:

Note: The I/O queues are shared among the connected host server HBA ports on a first come first serve basis.

HP 3PAR StoreServ Storage HBA I/O queue depth values:

  • QLogic 2Gb – 497
  • LSI 2Gb – 510
  • Emulex 4Gb – 959
  • HP 3PAR HBA 4Gb – 1638
  • HP 3PAR HBA 8Gb – 3276
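As a worked example (the host count is made up): twenty ESX servers sharing an HP 3PAR 8Gb HBA port with a queue depth of 3276 gives 3276 / 20 ≈ 163, so the per-target throttle on each host would be set to 163 or lower.  On an ESXi 5.x host with a QLogic HBA that might look something along the lines of the command below, but follow VMware KB1267 for the exact method for your HBA and ESXi version:

esxcli system module parameters set -m qla2xxx -p ql2xmaxqdepth=163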

Well, hopefully you found the above information useful. Here is a high level summary of what we have discussed:

  • Identify and enable NPIV on your fabric switches (Fibre Channel only feature – NPIV-Port Persistence is not present in iSCSI environments)
  • Use Single Initiator -> Single Target zoning (HP will support Single Initiator – Multiple Target, but you should not have a single host initiator attached to more than two StoreServ target ports).
  • A maximum of four HP 3PAR Storage System ports can fan-in to a single host server port.
  • Zoning should be done using pWWN. You should not use switch port/Domain ID or nWWN.
  • A host (non-hypervisors) should be zoned with a minimum of two ports from the two nodes of the same pair. In addition, the ports from a host zoning should be mirrored across nodes.
  • Hosts need to be zoned to node pairs. For example, zoned to nodes 0 and 1 or to nodes 2 and 3. Hosts should NOT be zoned to non-mirrored nodes such as 0 and 3.
  • When using hypervisors, avoid connecting more than 16 initiators per 4 Gb/s port or more than 32 initiators per 8 Gb/s port.
  • Each HP 3PAR StoreServ system has a maximum number of initiators supported, which depends on the model and configuration.
  • A single HBA zoned with two FC ports will be counted as two initiators. A host with two HBAs, each zoned with two ports, will count as four initiators.
  • In order to keep the number of initiators below the maximum supported value, use the following recommendations:
    • Hypervisors: four paths maximum.
    • Other hosts (non-hypervisors): two paths to two different nodes of the same port pairs.
  • Hypervisors can be zoned to four different nodes but the hypervisor HBAs must be zoned to the same Host Port on HBAs in the nodes for each Node Pair.

Reference Documents

HP SAN Design Reference Zoning Recommendations

HP 3PAR InForm® OS 3.1.1 Concepts Guide

The HP 3PAR Architecture

HP UX 3PAR Implementation Guide

HP 3PAR Red Hat Enterprise Linux and Oracle Linux Implementation Guide

HP 3PAR VMware ESX Implementation Guide

HP 3PAR StoreServ Storage and VMware vSphere 5 best practices

HP 3PAR Windows Server 2012, Server 2008 Implementation Guide

HP Brocade Secure Zoning Best Practises

HP 3PAR Peer Persistence Whitepaper

An introduction to HP 3PAR StoreServ for the EVA Administrator

Building SANs with Brocade Fabric Switches by Syngress

Part 3 – Automating HP StoreVirtual VSA Failover

In part two we installed and configured HP StoreVirtual VSA on vSphere 5.1; in this blog post we are going to look at automating failover.

I think a quick recap is in order.  If you remember, we received a warning when adding SATAVSA01 and SATAVSA02 to the Management Group SATAMG01, which was:

‘to continue without installing a FOM, select the checkbox below acknowledging that a FOM is required to provide the highest level of data availability for a 2 storage system management group configuration. Then click next’.

This error message is about quorum, a term that I’m sure a lot of you are familiar with from working with Windows clusters.  Each VSA runs what’s known as a ‘manager’, which is really a vote.  When we have two VSAs we have two votes, which is a tie.  Let’s say that one VSA has an issue and goes down, how does the remaining VSA know that? Well, it doesn’t.  It could be that both VSAs are up and they have just lost the network between them.  This then results in a split-brain scenario.

This is where the Failover Manager comes into play.  So what exactly is a Failover Manager? Well, it’s a specialized version of the SAN/iQ software which runs under ESXi, VMware Player or the elephant in the room (Hyper-V).  Its purpose in life is to be a ‘manager’ and maintain quorum by introducing a third vote, ensuring access to volumes in the event of a StoreVirtual VSA failure.  The Failover Manager is downloaded as an OVF and the good news is we already have a copy which we have extracted.

A few things to note about the Failover Manager.

  • Do not install the Failover Manager on a StoreVirtual VSA you want to protect, as if you have a failure the Failover Manager will lose connection.
  • Ideally it should be installed at a third physical site.
  • Bandwidth requirements to the Failover Manager should be 100 Mb/s
  • Round trip time to the Failover Manager should be no more than 50ms

In this environment we will be installing the Failover Manager on the local storage of ESXi02 and placing it into a third logical subnet.  I think a diagram and a reminder of the subnets are in order.

Right then, let’s crack on shall we.

Installing Failover Manager

We are going to deploy SATAFOM onto ESXi02 local hard drive which is called ESXi02HDD (I should get an award for my naming conventions).

The Failover Manager, or FOM from now on, is an OVF, so we need to deploy it from the vSphere Client.  To do this click File > Deploy OVF Template.

Browse to the location of your extracted HP StoreVirtual VSA files ending in FOM_OVF_9.5.00.1215FOM.ovf

Click Next on the OVF Template Details screen and Accept the EULA followed by Next.  Give the OVF a Name, in this case SATAFOM, and click Next.  When you get to the storage section you need to select the local storage on an ESXi Host which is NOT running your StoreVirtual VSA.  In this case it is ESXi02HDD.

Click Next, select your Network Mapping and click Finish.

TOP TIP, don’t worry if you cannot select the correct network mapping during deployment. Edit the VM settings and change it manually before powering it on.

If all is going well you should see a ‘Deploying SATAFOM’ pop-up box.
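As an aside, if you prefer the command line over the vSphere Client wizard, VMware’s ovftool can push the same OVF straight to the host.  A rough sketch (the credentials, datastore and network names are just the ones from this lab) would be:

ovftool --name=SATAFOM --datastore=ESXi02HDD --network="FOM" FOM_OVF_9.5.00.1215FOM.ovf vi://root@esxi02/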

Whilst the FOM is deploying let’s talk networking for a minute.

On ESXi02, I have a subnet called FOM which is on VLAN 40.  We are going to pop the vNICs of SATAFOM into this.  The HP v1910 24G is the layer three default gateway between all the subnets and is configured with VLAN Access Lists to allow the traffic to pass (I will do a VLAN Access List blog in the future!)

Awesome, let’s power the bad boy on.

We need to use the same procedure to set the IP addresses on the FOM as we did on the VSA.  Hopefully you should be cool with this, but if you need a helping hand refer back to How To Install & Configure HP StoreVirtual VSA On vSphere 5.1

The IP addresses I’m using are:

  • eth0 – 10.37.40.1
  • eth1 – 10.37.40.2

Failover Manager Configuration

Time to fire up the HP Centralized Management Console (CMC) and add the IP address into Find Systems.

Log in to view SATAFOM and it should appear as follows.

Let’s Right Click SATAFOM and ‘Add to an Existing Management Group’ SATAMG01

Crap, Craig that didn’t work, I got a popup about a Virtual Manager. What’s that all about?

Now’s as good a time as any to talk about two other ways to fail over the StoreVirtual VSA.

Virtual Manager – this is automatically added to a Management Group that contains an even number of StoreVirtual VSAs.  In the event you have a VSA failure, you can start the Virtual Manager manually on the VSA which is still working.  Does it work? Yes, like a treat, but you will have downtime until the Virtual Manager is started, and you need to stop it manually when the failed VSA is returned to action.  Would I use it? If you know your networking ‘onions’ you should be able to configure the FOM in a third logical site to avoid this scenario.

Primary Site – in a two-manager configuration you can designate one manager (StoreVirtual VSA) as the Primary Site, so if the secondary VSA goes offline you maintain quorum.  The question is, why would you do this? Honestly I don’t know, because unless you have some proper ninja skills, how do you know which VSA is going to fail? Also you need to manually recover quorum, which isn’t for the faint-hearted.  My recommendation: simples, avoid.

OK, back on topic.  We need to remove the Virtual Manager from SATAMG01, which is straightforward.  Right Click > Delete Virtual Manager.

Let’s try adding SATAFOM back into Management Group SATAMG01.  Voila, it works!  You might get a ‘registration is required’ notice; we can ignore that as I’m assuming you have licensed your StoreVirtual VSA.

(I know I have some emails, they are to do with feature registration and Email settings)

Let’s Try & Break It!

Throughout this configuration we have used the following logic:

  • SATAHDD01 runs SATAVSA01
  • SATAHDD02 runs SATAVSA02
  • SATAVSA01 and SATAVSA02 are in Management Group SATAMG01
  • SATAVSA01 and SATAVSA02 have volumes called SATAVOL01 and SATAVOL02 in Network RAID 10

In my lab I have a VM called VMF-DC01 which, you guessed it, is my Domain Controller; it resides on SATAVOL02.

Power Off SATAVSA01

We are going to power off SATAVSA01 which will mimic it completely failing, no shutdown guest for us!  Fingers crossed we should still maintain access to VMF-DC01.

Crap, we lost connection to VMF-DC01 for about 10 seconds and then it returned.  Why’s that, Craig, you ask?

Well, if you remember, all the connections go to a Virtual IP Address, in this case 10.37.10.1.  This is just a mask, as even though the connections hit the VIP they are directed to one of the StoreVirtual VSAs, in this case SATAVSA01.

So when we powered off SATAVSA01 all the iSCSI connections had to be dropped and then re-presented via the VIP to SATAVSA02.

Power Off SATAVSA02

To prove this, let’s power on SATAVSA01 and wait for quorum to be recovered.  OK let’s power off SATAVSA02 this time and see what happens.

I was browsing through folders and received a momentary pause of about one second which to be fair on a home lab environment is pretty fantastic.

So what have we learned? We can have Network RAID 10 with Hardware RAID 0 and make our infrastructure fully resilient.  To sum up, I refer back to my opening statement, which was that the HP StoreVirtual VSA is sheer awesomeness!