How To See Local RAID in ESXi 5.1

One of my work colleagues, Mat Smith, pointed out that when you install the generic ESXi hypervisor from the VMware site you get basic HP or Dell hardware information, which is OK.  However, if you only have local storage, you don’t know what state the underlying RAID configuration is in unless you have access to iLO or DRAC.

ESXi01 Hardware

This can be easily rectified by downloading the latest HP Custom Image for ESXi 5.1.0 ISO.  At the time of writing this blog post, the latest update is VMware-ESXi-5.1.0-799733-HP-5.34.23.iso

Once you have downloaded the ISO, go into Update Manager > Admin View > ESXi Images and select Import ESXi Image

ESXi Images 1

Select the HP Custom Image for ESXi 5.1.0 ISO and click Next

ESXi Images 2

It should only take a minute or so and you will see the HP Custom Image for ESXi 5.1.0 ISO has been uploaded.  Once done, hit Next.

ESXi Images 3

Next we need to create a baseline image.  I’m going to roll with HP Custom Image ESXi 5.1.0, then click Finish

ESXi Images 4

Fingers crossed you should see the Imported Image

ESXi Images 5

Next we are going to go to the Hosts and Clusters View, select the Update Manager Tab and then select Attach

ESXi Images 6

Select your Upgrade Baseline image and click Attach

ESXi Images 7

Next select Scan & choose Upgrades and then select Scan again

ESXi Images 8

Surprisingly enough, after the scan completes you will notice that your ESXi Hosts are no longer compliant

ESXi Images 9

I tend to perform Baseline Upgrades on ESXi Hosts individually, rather than at Cluster level, just in case anything goes wrong.  With this in mind, go to your first ESXi Host and select Remediate

Remediate 01

Select Upgrade Baseline and choose HP Custom Image ESXi 5.1.0 and hit Next

Remediate 02

Accept the EULA and hit Next

Remediate 03

Woah, what’s this message? ‘Remove installed third party software that is incompatible with the upgrade, and continue with remediation?’  Word of warning: you might want to check with your IT team to make sure that you aren’t going to lose any functionality.

Remediate 04

Enter a Task Name & Description and hit Next

Remediate 05

On the Host Remediation Options, make sure you tick ‘Disable any removable media devices connected to the virtual machines on the host’ as we don’t want an attached ISO to be the cause of our failure.  When you are ready hit Next

Remediate 06

On the Cluster Remediation Options, I tend to make sure that DPM is disabled and also Admission Control so that the ESXi Host can actually be patched.  Then click Next

Remediate 07

Once you are happy with your Upgrade Baseline, click Finish.  Time to go and make a brew as this is going to take a long time!

Remediate 08

Awesome, now that’s completed we can see the Local Storage on the ESXi Host.

Storage View

Rinse and repeat for the rest of your ESXi Hosts.

London VMUG 25/04/2013 – Get Involved

Crikey, the second London VMUG is just over a week away, so if you haven’t registered for the event yet, I urge you to get involved.

If you are a techie, why would you attend one of these events? Well, apart from being free (which is awesome), it allows you to learn from your peers.  This may be about a project you are working on where you want to know some of the potential pitfalls, or perhaps you are interested in a new area such as software defined networking.

Great line up as always, I’m looking forward to hearing from Shmuel Kliger (VMTurbo), Hans de Leenheer (Veeam) and Dave Burgess (VMware).

VMUG Agenda

Registration begins at 08:30 and doors close at 17:15.

Take a moment and pop the address in your mobile phone, so you don’t get lost on your way there!

London Chamber of Commerce and Industry
33 Queen Street
London EC4R 1AP

3PAR StoreServ 7000 Software – Part 7

Remote Copy is the term 3PAR StoreServ uses for replicating Virtual Volumes either synchronously or asynchronously.  The last time I spoke to HP, they mentioned that the highest supported round trip time (RTT) for synchronous replication was <1.7ms.

I have been fortunate enough to have configured a number of 3PARs with VMware’s Site Recovery Manager, and setting up and configuring the Storage Replication Adapter (SRA) was a breeze.  The only downside was that when you performed a test failover it always failed until you changed the Advanced VMFS3 setting to:

VMFS3.HardwareAcceleratedLocking 0

One of the things I disliked about Remote Copy was the fact that you couldn’t have both ‘sync’ and ‘async’ Remote Copy Groups.  The great news is this has now been changed and with 3PAR OS 3.1.2 we can have both, hoorah!

However, something which I don’t really understand is that HP only support a two node system (which is a common deployment) using both Remote Copy Fibre Channel and Remote Copy IP for ‘sync’ and ‘async’ Remote Copy Groups.  Not sure how many people have both fibre and Ethernet presented from intersite links?

3PAR StoreServ 7000 now supports vSphere Metro Storage Cluster using Peer Persistence (covered later in this blog post).  Up to 5ms RTT is quoted as supported, however I’m pretty sure that the user experience would be somewhat dire to say the least.  Can you imagine waiting for the acknowledgement on the remote array?

vMSC

You can vMotion between sites, however a few things to consider when doing this:

  1. Think of the intersite link (ISL) usage: would enough bandwidth be available to continue sync replication?
  2. If a VM’s datastore is at the other end of the ISL then you are using very inefficient routing
  3. Should always be used with Enterprise Plus licenses so you can instigate Storage DRS rules to ensure that VMs always use the datastores in the same site as they are.

From a 3PAR StoreServ perspective the Virtual Volume is exported with the same WWN to both arrays in Read/Write mode, however only the Primary copy is marked as Active, the Secondary copy is marked as Passive.

At the time of writing this post, the failover is manual, as a quorum holder has not been created yet.  I’m sure it won’t be long and 3PAR will have something like the Failover Manager (FOM) that StoreVirtual uses.

A few other points to know about Remote Copy are:

  • Supports up to eight FC or IP links between 3PAR StoreServs
  • Supports replication from one StoreServ to two StoreServs for added redundancy

Sync Long Distance

My overall experience with Remote Copy in InForm OS 3.1.1 has been one of frustration.  A lot of the work has to be done via the CLI, as the GUI has a nasty habit of not sending the correct commands or, for some reason, Remote Copy Links don’t establish.  A few of the commands that I have used on a regular basis are:

showport -rcip
showport -state
showrcopy links
stoprcopy
startrcopy
dismissrcopylink <3PARName> 2:6:1:<targetIP> 3:6:1:<targetIP>
admitrcopylink <3PARName> 2:6:1:<targetIP> 3:6:1:<targetIP>
controlport rcip addr <targetIP> 255.255.255.0 2:6:1
controlport rcip addr <targetIP> 255.255.255.0 3:6:1
controlport rcip gw <gatewayIP> 2:6:1
controlport rcip gw <gatewayIP> 3:6:1
controlport rcip speed 100 full 2:6:1
controlport rcip speed 100 full 3:6:1
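
The controlport lines above repeat the same pattern for each RCIP port, so if you have a few arrays to do it can be less error prone to generate them.  Below is a minimal Python sketch of that idea.  The function name and the example addresses are made up for illustration, and it only builds the command strings shown above; it doesn’t connect to the array.

```python
def rcip_port_commands(port, ip, netmask, gateway, speed=100, duplex="full"):
    """Build the three controlport commands from the listing for one RCIP port.

    Only the command shapes shown in the post are used; every value passed
    in here is an illustrative assumption.
    """
    return [
        f"controlport rcip addr {ip} {netmask} {port}",
        f"controlport rcip gw {gateway} {port}",
        f"controlport rcip speed {speed} {duplex} {port}",
    ]


if __name__ == "__main__":
    # Hypothetical addressing for the two RCIP ports used in the post.
    ports = {"2:6:1": "192.0.2.10", "3:6:1": "192.0.2.11"}
    for port, ip in ports.items():
        for command in rcip_port_commands(port, ip, "255.255.255.0", "192.0.2.1"):
            print(command)
```

Paste the output into your SSH session once you have sanity checked it.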

One of the things I think is a great feature of Remote Copy on 3.1.2 is Remote Copy Data Verification, which allows you to compare your read/write (Primary) volume and your read (Secondary) volume.  To implement this you run the ‘checkrcopyvv’ command, which creates a snapshot of the read/write (Primary) volume and then compares it to the read (Secondary) volume.  If inconsistencies are found then only the required blocks are copied across.

Note that only one checkrcopyvv can be run at a time.
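
To picture what ‘only the required blocks are copied across’ means, here is a toy Python model.  It is not how checkrcopyvv works internally (I don’t know the internals); it simply assumes each volume is a list of equal-sized blocks and all the names are hypothetical.

```python
import hashlib


def blocks_to_resync(primary_snapshot, secondary):
    """Return the indexes of blocks that differ between the two copies."""
    if len(primary_snapshot) != len(secondary):
        raise ValueError("both volumes must have the same number of blocks")
    return [
        index
        for index, (p_block, s_block) in enumerate(zip(primary_snapshot, secondary))
        if hashlib.sha256(p_block).digest() != hashlib.sha256(s_block).digest()
    ]


def resync(primary_snapshot, secondary):
    """Copy only the differing blocks from the snapshot onto the secondary."""
    changed = blocks_to_resync(primary_snapshot, secondary)
    for index in changed:
        secondary[index] = primary_snapshot[index]
    return changed
```

The point is that the amount of data sent over the link depends on how many blocks differ, not on the size of the volume.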

With 3PAR OS 3.1.1 you have always been able to perform bi-directional remote copy, however now it is supported!

Remote Copy N+

I know everyone likes their configuration maximums, so just to let you know the limits are:
  1. Synchronous Remote Copy – 800 Volumes
  2. Asynchronous Remote Copy – 2400 Volumes

Peer Persistence

I mentioned above that Peer Persistence has been included to allow support for vSphere Metro Storage Cluster so how does it work?

  1. Asymmetric Logical Unit Access (ALUA) is used to define the target port groups for both primary and secondary 3PAR StoreServ.
  2. The Remote Copy volumes are created on both arrays and exported to the hosts at both sites using the same WWN’s in Read/Write mode, however only one site has active I/O, the other site is passive.
  3. When you switch over, the primary volumes are blocked, any ‘in flight’ I/O is drained and the group is stopped and failed over.
  4. Target port groups on the primary site become passive and those on the secondary site become active.
  5. The blocked I/O on the primary volumes becomes unblocked and a sense error is created indicating a change of target port group to the secondary volumes.
  6. The Remote Copy Group is updated and then restarted, replicating in the other direction.

To move across you would use the command setrcopygroup switchover <group> to change the passive to active without impacting any I/O.
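
If it helps to see the order of events, the switchover steps above can be modelled as a tiny state change in Python.  This is purely an illustration of the sequence described in this post; the class and site names are hypothetical and nothing here talks to a 3PAR.

```python
from dataclasses import dataclass


@dataclass
class RemoteCopyGroup:
    name: str
    active_site: str
    passive_site: str
    io_blocked: bool = False

    def switchover(self):
        """Walk through the switchover sequence and return a log of each step."""
        log = []
        # Block the primary volumes and let in-flight I/O drain.
        self.io_blocked = True
        log.append(f"block I/O on {self.active_site} and drain in-flight I/O")
        log.append(f"stop group {self.name} and fail over")
        # Swap the roles of the ALUA target port groups.
        self.active_site, self.passive_site = self.passive_site, self.active_site
        log.append(f"target port groups: {self.active_site} active, {self.passive_site} passive")
        # Unblock; hosts see a sense error and move to the new active paths.
        self.io_blocked = False
        log.append("unblock I/O and signal target port group change to hosts")
        log.append(f"restart replication {self.active_site} -> {self.passive_site}")
        return log


if __name__ == "__main__":
    group = RemoteCopyGroup("rcg_vmfs01", "SiteA", "SiteB")
    for step in group.switchover():
        print(step)
```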

Peer Persistence

There are a few risks with Peer Persistence.  Firstly, it shouldn’t be used with a large number of virtual volumes (no exact numbers from HP yet).  The reason for this is that the switchover could take more than 30 seconds, as a snapshot is taken at both the primary and secondary site just in case the operation fails, e.g. the ISL goes down.  Worst case scenario, you would need to promote a volume manually.

10 Things To Check (Quickly) in vCenter

As part of my day job, I review vSphere infrastructures, giving recommendations on areas which could be potential concerns.  Many of the businesses that I see engaged consultants to perform the initial installation and configuration and hand vCenter/vSphere back to the internal IT department.  Over time, changes are made and settings are updated without consideration to what they mean.

So with this in mind, I decided to put together this blog post ‘10 Things To Check Quickly in vCenter’.

1. Admission Control

The whole point of admission control is to ensure that you have the redundancy within your infrastructure to tolerate a failure of some description; more often than not this is N+1.  So check your admission control is first of all enabled and secondly that it is set correctly, e.g. 2 x ESXi Hosts should be 50% CPU and 50% Memory.

I have seen countless installations where this has been turned off to enable new VMs to be run, and the hosts were never upgraded to compensate for this increase in workload.
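
The arithmetic behind the 50% figure is simple enough to sketch in Python.  This assumes all hosts in the cluster are the same size and that you want to reserve whole hosts’ worth of capacity; the function name is my own.

```python
def reserved_percentage(hosts, host_failures=1):
    """Percentage of cluster CPU/memory to reserve to survive host failures.

    Assumes equally sized hosts; rounds up so the reservation is never short.
    """
    if hosts < 2:
        raise ValueError("need at least two hosts for any redundancy")
    if not 0 < host_failures < hosts:
        raise ValueError("host_failures must be between 1 and hosts - 1")
    # Integer ceiling division of (100 * host_failures) / hosts.
    return -(-100 * host_failures // hosts)
```

So reserved_percentage(2) gives 50 as in the example above, and a four host cluster at N+1 would be 25.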

Admission Control

2. DAS Isolation Address

The default setting is a single isolation address, which is your default gateway.  What happens if this goes down in a vSphere 4.1 environment? Well, man down is the reaction!  Ensure that you specify numerous IP addresses, I commonly go for:

1. Layer 2 switch IP address used for vMotion/FT

2. SAN management IP address

3. LAN/Management  default gateway IP Address
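
Additional isolation addresses are entered as HA advanced options named das.isolationaddress followed by a number.  The Python sketch below just builds that option list from your chosen addresses so you can key it into the cluster settings.  The example IPs are invented, and I’ve assumed numbering from 1 with a cap of nine entries to stay on the safe side, so check the numbering range for your vSphere version.

```python
import ipaddress


def isolation_address_options(addresses, first_index=1, max_entries=9):
    """Map isolation addresses to HA advanced option names.

    first_index and max_entries are conservative assumptions, not values
    taken from the post.
    """
    if not addresses:
        raise ValueError("specify at least one isolation address")
    if len(addresses) > max_entries:
        raise ValueError(f"no more than {max_entries} isolation addresses")
    options = {}
    for offset, address in enumerate(addresses):
        ipaddress.ip_address(address)  # raises ValueError on a typo
        options[f"das.isolationaddress{first_index + offset}"] = address
    return options


if __name__ == "__main__":
    # Hypothetical addresses: vMotion switch, SAN management, LAN gateway.
    for name, value in isolation_address_options(
        ["192.0.2.2", "192.0.2.50", "192.0.2.1"]
    ).items():
        print(name, "=", value)
```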

DAS Isolation

3. VM Monitoring

Turn this on.  I know the default is disabled, but that’s not an excuse.  Why wouldn’t you want vSphere to monitor your VMs and restart them if they have no network or datastore activity?

VM Monitoring

4. VM Restart Priority

Let’s start with the premise that not all virtual machines are equal.  If you have virtualised Domain Controllers you would want these to be high priority restarts, followed by SQL and then application servers that connect to SQL.  I wrote a blog post on this a while back.

Take a few minutes and check with your server team to ensure that if you do have a failure then you have done your best to bring applications up in the right order.

VM Restart Priority

4. DRS Rules

Spend some time working with your application team creating sensible DRS Anti Affinity and Affinity rules.  Some examples are:

  • Anti Affinity – Domain controllers (you don’t want these running on the same ESXi host)
  • Anti Affinity – SQL Cluster with RDM
  • Anti Affinity – XenApp/Terminal Server farm members
  • Affinity – BES and SQL
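
If you want a quick sanity check of where things are sitting today before you create the rules, something like the Python sketch below does the job.  It works on plain dictionaries (VM to host placement, and rule name to VM list) that you would have to export yourself; all the names are hypothetical and it doesn’t query vCenter.

```python
from collections import defaultdict


def anti_affinity_violations(placement, rules):
    """Find hosts that currently run more than one VM from the same rule.

    placement: dict of VM name -> ESXi host name
    rules:     dict of rule name -> list of VM names to keep apart
    Returns a dict of rule name -> {host: [VMs sharing that host]}.
    """
    violations = {}
    for rule, vms in rules.items():
        per_host = defaultdict(list)
        for vm in vms:
            host = placement.get(vm)
            if host is not None:  # skip VMs missing from the export
                per_host[host].append(vm)
        clashes = {h: sorted(v) for h, v in per_host.items() if len(v) > 1}
        if clashes:
            violations[rule] = clashes
    return violations


if __name__ == "__main__":
    placement = {"DC01": "esxi01", "DC02": "esxi01", "XA01": "esxi01", "XA02": "esxi02"}
    rules = {"Domain Controllers": ["DC01", "DC02"], "XenApp": ["XA01", "XA02"]}
    print(anti_affinity_violations(placement, rules))
```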

Anti Affinity

5. VMware Update Manager

I quite often see environments where VMware Update Manager hasn’t been installed and if it has you can almost guarantee that the ESXi Hosts/VM/vApp haven’t been patched.

Without being flippant, there is a reason why VMware release patches/updates, which is generally for bug fixes or security issues.

VUM

6. Alerting

Check to make sure that you have a valid SNMP/SMTP server setup, as after infrastructure migrations these settings can often be wrong.

Also take some time to configure alerting at ‘root’ level in vCenter to make sure it meets your business needs.  If you aren’t sure what to implement, I wrote a couple of blog posts on this subject to get you started:

Setting Up & Configuring Alarms in vCenter 5 Part 1

Setting Up & Configuring Alarms in vCenter 5 Part 2

Alerts

7.  Time Configuration

Virtual Machines take their initial time settings from the ESXi host.  We all know what dramas can happen if your virtual machines are more than 15 minutes out of sync with your domain controllers.  Use your internal domain controllers as your NTP Servers for your ESXi Hosts; it stops unnecessary NTP traffic traversing firewalls and ensures that you won’t be affected by time skew.

NTP Servers

8. Virtual Machines With ISO’s Attached

We all pop ISOs onto Local Storage on ESXi Hosts as it’s not taking up valuable space on our SAN.  The worst thing we can do is forget that we have them attached, as if HA needs to come into action these VMs are going to fail.

Either check your Local Datastores on a regular basis or if you have lots of ESXi Hosts, then use tools such as PowerGUI with the VMware Management pack installed to script it.
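
Whatever tool you use to pull the inventory, the check itself is straightforward.  Here is a Python sketch of the logic, working on an exported list of CD/DVD backing files in the usual ‘[datastore] path’ form.  The function name and sample data are hypothetical and it doesn’t connect to vCenter itself.

```python
import re

# vSphere shows file backings as "[datastore name] folder/file.iso"
_BACKING = re.compile(r"^\[(?P<datastore>[^\]]+)\]\s*(?P<path>.+)$")


def vms_with_local_isos(vm_cdroms, local_datastores):
    """Return (vm, backing) pairs where an ISO on a local datastore is attached.

    vm_cdroms:        dict of VM name -> list of CD/DVD backing strings
    local_datastores: set of datastore names that are local to a host
    """
    hits = []
    for vm, backings in vm_cdroms.items():
        for backing in backings:
            match = _BACKING.match(backing)
            if (
                match
                and match.group("datastore") in local_datastores
                and match.group("path").lower().endswith(".iso")
            ):
                hits.append((vm, backing))
    return sorted(hits)


if __name__ == "__main__":
    inventory = {
        "WEB01": ["[esxi01-local] iso/win2008r2.iso"],
        "SQL01": ["[SAN-DS01] iso/sql2012.iso"],
        "DC01": [],
    }
    for vm, backing in vms_with_local_isos(inventory, {"esxi01-local"}):
        print(vm, "has", backing, "attached")
```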

HA Failure

9. Hot Add Memory/CPU

Virtual Machine workloads change over time, why cause unnecessary downtime and potential evening or weekend work for yourself? Make sure that you enable Memory and CPU Hot Add on your templates.

Hot Add

10. Resource Pools

The golden rule is to know what you are doing with resource pools, because if you go into resource contention they are going to come into play.  I have seen resource pools used as containers/folders, and resource pools created at cluster level to protect ‘high importance’ VMs which resulted in these VMs having fewer resources to use!  A quick explanation of this can be found over at Eric Sloof’s site NTPRO.NL

Resource Pools

3PAR StoreServ 7000 Software – Part 6

So you have got an awesome new 3PAR StoreServ 7400 and it’s all hooked up.  How do you get the data from your old array onto the 3PAR StoreServ? Well, if you have vSphere, no problem, you could use Storage vMotion, or if you are performing a data migration good old robocopy would do the trick.

However, in some situations you don’t have the luxury of either of these, you just need to get the data from your old SAN to your new SAN.  This is where Peer Motion comes in strutting its stuff.

Peer Motion

Peer Motion allows non disruptive data migration from either 3PAR to 3PAR or selected EVA to 3PAR.  Essentially the destination SAN (3PAR StoreServ) connects to the source SAN as a peer and imports the data while the source SAN I/O continues.

The good news is that with each new 3PAR StoreServ you get a 180 day license for Peer Motion for free!

So how does it work?

Step 1 – 3PAR StoreServ is connected as a Peer to the Host via FC

Step 2 – 3PAR StoreServ is connected to the Host and the Virtual Volumes using admitvv

Step 3 – Old SAN is removed and the Virtual Volume is imported into the 3PAR StoreServ

Step 4 – Host links to the old SAN are removed

EVA Management & Configuration

I think all of us have known that the EVA has been slowly dying, so below is a quick overview of how the software maps across.

Array Management
HP P6000 Command View Software = HP 3PAR Management Console (MC)
HP Storage System Scripting Utility (SSSU) = HP 3PAR OS CLI

Performance Management
HP P6000 Performance Advisor Software = HP 3PAR MC (Real time)
HP P6000 Performance Advisor Software = HP 3PAR System Reporter (History)
HP Performance Data Collector (EVAPerf) = HP 3PAR System Reporter
HP EVAPerf = HP 3PAR OS CLI

Replication Management
HP Replication Solutions Manager (RSM) = 3PAR MC/CLI
HP RSM = Recovery Manager (SQL/Exchange/Oracle/vSphere)

Recovery Manager

To be honest I haven’t ever used HP Recovery Manager and I can’t foresee a time when I will.  However, for the purpose of the HP – ASE, I need to understand what it is and does.

Recovery Manager creates application consistent copies of Exchange and SQL using Microsoft VSS.  It also works with Oracle, VMware, Remote Copy, Data Protector and NetBackup.

Recovery Manager