We are now ready for Recovery Plans! So the question is, what are they? Well, a Recovery Plan is what we would like to happen in the event of a DR situation. Let me explain what I mean.
Let’s imagine you have two Exchange 2010 servers, one providing the CAS/Hub Transport role and the other providing the Mailbox role. You would want these to come up in a specific order, the Mailbox server first, then the CAS/Hub server. That’s all great, but I can hear you saying, ‘What about IP addresses? That’s going to cause me some proper dramas. In fact, what about DNS? All of the records are going to be wrong!’
Well, the panic is over. With SRM we can address all of these issues! We can:
- Bring virtual machines up in a certain order.
- Change a virtual machine’s IP address
- Run a script or batch file
Pretty cool eh? Right let’s crack on with the configuration.
Let’s select Recovery Plans from the bottom left hand menu and then Create Recovery Plan from the top right Commands box
Select your Recovery Site, in my case DR and click Next
From a design perspective, I would always recommend that you have a Recovery Plan per Protection Group as this gives you a higher level of control to fail over only particular virtual machines. In this case we are going to select PG_SATA_TEST01 and click Next
The next screen is quite interesting. We can have a ‘test network’ in our DR site which is preconfigured, so that rather than SRM creating a network for us, the virtual machines come up in a predefined network when we ‘test DR’. Why would I want to do this? Well, it gives you access to the virtual machines in the DR location so you can test connectivity between them.
In this scenario we are going to leave the ‘test network’ setting to Auto and click Next
Next we need to give the Recovery Plan a name. I’m going to be imaginative and call mine RP_SATA_TEST01. In the description I always reference the Protection Group that we are going to perform the recovery on. Then click Next
We then get a summary screen, click Finish to complete.
Awesome we should now have a Recovery Plan we can test, I’m itching to give it a whirl!
Before we do this, let’s take a quick swing by our HP StoreVirtual VSAs to make sure everything is ‘tickety boo’
Let’s log in to the CMC, open both SATAMG01 and SSDMG01 and expand both clusters. Select PR_SATA_TEST01_RS and make sure the Status (on the right hand side) is ‘normal’
Awesome, let’s do a Test Recovery!
Select RP_SATA_TEST01 and then the Summary Tab and then click Test
We now get a pop up asking if we want to replicate recent changes or not for the test. If you select yes, SRM will use the SRA to send the commands to the HP StoreVirtual VSA to replicate the Volume PR_SATA_TEST01. I’m going to choose no, as I haven’t actually changed any data (we will do this later). Click Next
We now need to click Start and let the SRM magic happen.
At this point, we want to see what’s going on so let’s jump onto the Recovery Steps Tab and expand all of the stages.
So what’s going on here? Well, let’s go through this step by step
Step 1 SRM will replicate the storage if you have selected this option. We chose not to, hence the status is ‘not applicable’
Step 2 SRM will bring any hosts out of Standby if you are using Distributed Power Management at the DR site
Step 3 SRM will suspend non-critical VMs at the DR site so that the resources are available to be used by the virtual machines we are testing
Step 4 This is probably the most important step to understand. SRM doesn’t want to interfere with the replication process. If it did, it would have to make the replicated LUN, in this case PR_SATA_TEST01_RS_Rmt.16, Read/Write, and we don’t want to do that. So instead SRM uses the SRA to invoke a point-in-time snapshot of the read-only PR_SATA_TEST01_RS_Rmt.16, which it turns into a Read/Write copy so that the virtual machine can be accessed.
I want to show you this from the HP StoreVirtual VSA perspective. If you look below, our replicated volumes haven’t been touched, but we do have a Read/Write copy of PR_SATA_TEST01_RS_Rmt.16 (see, it’s dark blue)
Steps 5-9 SRM powers on the virtual servers in priority order.
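If Step 4 is hard to picture, here is a toy model of the idea in Python. It is only a sketch of the concept: the `Volume` class and its methods are made up for illustration and have nothing to do with the real SRA or StoreVirtual API. The point it shows is that the test writes land on a separate Read/Write copy, so the read-only replica is never touched.

```python
from dataclasses import dataclass, field


@dataclass
class Volume:
    """Toy stand-in for a storage volume (hypothetical, not a StoreVirtual object)."""
    name: str
    read_only: bool
    blocks: dict = field(default_factory=dict)

    def write(self, key, value):
        # A read-only replica refuses writes, just like the replicated LUN.
        if self.read_only:
            raise PermissionError(f"{self.name} is read-only")
        self.blocks[key] = value

    def writable_snapshot(self, suffix="_test"):
        # Point-in-time copy: later writes to the copy never reach the source.
        return Volume(self.name + suffix, read_only=False, blocks=dict(self.blocks))


replica = Volume("PR_SATA_TEST01_RS_Rmt.16", read_only=True, blocks={"vm": "VMF-TEST01"})
test_copy = replica.writable_snapshot()
test_copy.write("scratch", "data written during the test")

print(replica.blocks)
print(test_copy.blocks)
```

Throwing the copy away afterwards leaves the replica exactly as replication left it.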
Boom we have test complete!
Let’s nip over to VMF-ADMIN02 which is my DR vCenter and see what’s going down.
Cool, VMF-TEST02 is up and running, it’s got the same IP address, it’s been presented with the snapshot of the read-only DR volume PR_SATA_TEST01, and SRM has put VMF-TEST01 into an srm-recovery-portgroup
Good skills, let’s roll back the test. Back to VMF-ADMIN01, which is the Production vCenter, and click Cleanup
Essentially, SRM just reverses the process above. If all went well, you should see this
Let’s double check the CMC to make sure everything is back to the way it should be. Voilà, it is!
If, like me, you want to see what’s going on in more detail, run the Test again, but this time make sure you go over to VMF-ADMIN02 and select Tasks & Events at Root level. This will show you everything that SRM does to perform a test failover. Pretty impressive to say the least.
Change IP Address
We probably want to change the IP address details of VMF-TEST01 when it fails over so it’s on the right subnet, using the right default gateway and DNS server. To do this, select the Virtual Machines Tab and select Configure Recovery
Select IP Settings – NIC 1, place a tick in ‘Customize IP settings during recovery’, then click Configure Protection and enter your IP details. Rinse and repeat for Configure Recovery
For those of you in the UK, here’s one I made earlier
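If you want to sanity-check the values before they go into the dialog, a few lines of Python will do it. This is just a sketch using the standard `ipaddress` module. The function name and the addresses are made up for illustration, so swap in your own DR subnet details.

```python
import ipaddress


def check_ip_settings(address_with_prefix, gateway, dns_servers):
    """Return a list of problems found; an empty list means the values look sane."""
    problems = []
    iface = ipaddress.ip_interface(address_with_prefix)
    gw = ipaddress.ip_address(gateway)
    if gw not in iface.network:
        problems.append(f"gateway {gw} is outside {iface.network}")
    if gw == iface.ip:
        problems.append("gateway is the same as the VM address")
    for dns in dns_servers:
        ipaddress.ip_address(dns)  # raises ValueError on a malformed address
    return problems


# Hypothetical DR-site values, for illustration only.
print(check_ip_settings("10.20.0.50/24", "10.20.0.1", ["10.20.0.10"]))
print(check_ip_settings("10.20.0.50/24", "10.10.0.1", ["10.20.0.10"]))
```

The first call uses a gateway inside the VM’s subnet, the second one deliberately doesn’t, which is the classic copy-and-paste mistake when production and DR use different ranges.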
Hit OK, and perform another Test Recovery, fingers crossed we should see that the IP address changes at the DR site. Time for a quick brew whilst we run the test.
The results are in and we have success!
Let’s roll back and make some more config changes
My real world experience using SRM is that we need to do more than just change the IP address, it’s a good idea to update DNS as well. Now I’m not a ‘script guy’ so I use good old-fashioned batch files.
On VMF-TEST01 we are going to create the following batch file:
The batch file will be called ipconfigupdate.bat and saved on root of the C: Drive on VMF-TEST01
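For the ‘script guys’ out there, here is roughly the same job sketched in Python. Big assumption: that the batch file does nothing more than run `ipconfig /registerdns`, which is what the step name used later suggests. You would also need Python installed in the guest, which is a good reason to stick with the batch file. The function names are my own.

```python
import platform
import subprocess


def dns_refresh_commands():
    """Commands assumed to be in the batch file (assumption, see the text above)."""
    return [
        ["ipconfig", "/registerdns"],  # re-register this host's records in DNS
    ]


def run_dns_refresh():
    """Run each command in order, stopping on the first failure."""
    for cmd in dns_refresh_commands():
        subprocess.run(cmd, check=True)


# Only attempt the real thing on a Windows guest.
if __name__ == "__main__" and platform.system() == "Windows":
    run_dns_refresh()
```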
Cool, now let’s configure SRM to register the new DNS details.
Back to the Virtual Machines Tab and Configure Recovery for VMF-TEST01
We are going to select a ‘Post Power On Step’ and then Add
We are going to use ‘Command on Recovered VM’, give the Step the name ‘Ipconfig Register DNS’, and set the content to c:\windows\system32\cmd.exe /c c:\ipconfigupdate.bat with a Timeout value of 1 minute
The first part, c:\windows\system32\cmd.exe, tells SRM where to find the application you want to run, in this case the Windows Command Prompt. The second part, /c c:\ipconfigupdate.bat, tells SRM to run the batch file under the Windows Command Prompt.
OK, now we need to think about how we are going to test this, because if VMF-TEST01 fails over into the Auto Network Port Group it won’t be able to communicate with the Domain Controller in the DR site. So ladies and gentlemen, we are going to do what’s known in the IT world as a ‘frig’ to test this.
We are going to shut down VMF-TEST01 at the Production Site and then change the Auto Network to DRLAN, so that when VMF-TEST01 comes up at DR it can communicate with my DC.
If you remember we need to edit the Recovery Plan RP_SATA_TEST01 to change the test Port Group.
Right then, let’s run a Test recovery and see if my ‘frig’ works! It might be time for a brew, as when we customize the IP address, SRM will bring the guest VM online, change the IP addresses and then shut it down, wait for VMware Tools and then run our batch file.
Awesome, well the Test recovery was a success.
Let’s check VMF-TEST01. Well, it’s got the right IP address and the right Port Group. I’m going to attempt a ping, success! I feel like the A-Team when a plan comes together.
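A ping proves the VM is reachable, but it’s also worth checking that DNS now points at the DR address. Here’s a small sketch of that check in Python. The hostname and IP are invented, and the resolver is passed in so the example runs anywhere. Leave the `resolver` argument off to use a real lookup via `socket.gethostbyname`.

```python
import socket


def dns_points_at(hostname, expected_ip, resolver=socket.gethostbyname):
    """True if hostname resolves to expected_ip; False on mismatch or lookup failure."""
    try:
        return resolver(hostname) == expected_ip
    except OSError:  # socket.gaierror is a subclass of OSError
        return False


# Stand-in for a DNS server, so the example does not depend on a real network.
fake_dns = {"vmf-test01.example.local": "10.20.0.50"}
print(dns_points_at("vmf-test01.example.local", "10.20.0.50",
                    resolver=lambda name: fake_dns[name]))
```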
TOP TIP: Don’t forget to change your DNS back
Virtual Machine Priority Order
The last item I want to cover off is Virtual Machine Priority Order. We have a range of 1 to 5. Priority 1 VMs start first and priority 5 VMs start last. The cool thing about this is that SRM waits for VMware Tools to start before the next VM is powered on.
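To make the ordering concrete, here’s a little Python sketch that sorts VMs into start-up waves by priority, using the Exchange example from the top of the post. The VM names and the function are made up, and it only models the ordering, not the wait for VMware Tools.

```python
from itertools import groupby


def power_on_order(vm_priorities):
    """Group VM names into start-up waves, priority 1 first and 5 last."""
    for name, prio in vm_priorities.items():
        if prio not in range(1, 6):
            raise ValueError(f"{name}: priority must be 1-5, got {prio}")
    # sorted() is stable, so VMs sharing a priority keep their original order.
    ordered = sorted(vm_priorities.items(), key=lambda item: item[1])
    return [
        (prio, [name for name, _ in group])
        for prio, group in groupby(ordered, key=lambda item: item[1])
    ]


# Hypothetical VM names: the Mailbox server must be up before the CAS/Hub server.
for prio, names in power_on_order({"EXCH-CAS01": 2, "EXCH-MBX01": 1}):
    print(prio, names)
```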
To configure this we need to go back to the Virtual Machines Tab, right-click VMF-TEST01, select Priority and then the level you want.
Boom job done!
That’s it for this post, on the next blog entry we are going to failover, reprotect and failback.
15 thoughts on “Part 4 – Configuring Site Recovery Manager (SRM) With HP StoreVirtual VSA”
Instead of writing and pointing to a batch file, could you just use “ipconfig.exe /registerdns” within the “Context” box?
Unfortunately not. You can, however, hold your commands/scripts on the SRM Server instead, so they’re easier to manage.
What about the client machines? How does the client communicate with the new subnet if DHCP and DNS are down?
In your DR site you will already have a number of servers e.g. DC, vCenter, SQL etc. These should be on a separate subnet, with a further subnet for client machines in your DR site. So you should be able to go to the DR site, plug into your switch and get an IP address from DHCP. Therefore SRM should not be used to fail over your DCs. The only thing then to consider is seizing FSMO roles.
Thanks for that.
But what about the production site? If the server infrastructure had a meltdown, but the link is still up, how do the clients in production access the DR site?
This is a networking question and AD question. The KCC will take care of Active Directory authentication. Having a DHCP relay on your default gateway will send DHCP traffic across to the DR site & give out an IP address. SRM is a small part of any DR plan, loads of other considerations.
I’m trying to set the VM’s addressing to DHCP. I noticed it’s set to No. How do we change this to Yes so the failed-over VMs receive an IP from the DHCP scope?
It’s a bit of a tricky one to find, you need to go to your Recovery Plan > Select Virtual Machines Tab > Configure Recovery for VM > Select IP Settings.
Yes, I have looked into this one before, but when you edit the IP of the VM, DHCP is set to No. How do I change this to Yes so the VM can receive an IP from DHCP?
You simply need to assign either DHCP for both Production and DR or a static for Production and DHCP for DR.
Ok, I guess we need to use the IP customisation tool for IP change. If we use this tool do we need to register DNS? Or does it do it automatically?
You will need to register DNS. Looking back at your previous questions, I would recommend you purchase Mike Laverick’s SRM book and lab your environment to test it.
In case you’re unaware, Microsoft has a hotfix available for Server 2008, 2008 R2, Windows 7 and Vista systems that addresses an issue in the DNS client that can cause the system to essentially drop out of DNS when the DNS servers are changed and the server rebooted. It’s KB 2520155 (http://support.microsoft.com/kb/2520155). This hotfix fixed a long standing issue in our environment with SRM directly related to DNS registrations (I was about to implement a similar script as yours to fix this problem, then discovered the hotfix), and it might be the same as what you’re experiencing.
Worth a look and/or test.
Thanks for the heads up Ryan