ESXi 5 Host Isolation

What is ‘host isolation’?

It’s the term VMware uses to describe the state where an ESXi host is no longer able to communicate with specific IP addresses, and is therefore deemed to be isolated from the rest of the cluster.

By default the ESXi host’s default gateway (the VMkernel management network gateway) is used as the isolation address.  Depending on your infrastructure this is normally a Layer 3 switch, router or firewall.

What’s the problem with that, you ask? Well, what happens if you have an outage of that Layer 3 switch, firewall or router? The ESXi hosts will think they are isolated and, depending on your ‘host isolation response’, perform one of the following actions:

– Leave powered on
– Power off
– Shut down

The recommended action for vCenter 5 is ‘Leave powered on’.

We therefore need to provide additional external addresses for the hosts to check before they invoke a host isolation response. To do this we go into Cluster Settings > vSphere HA > Advanced Options.

We then add the additional IP addresses that we want the hosts to check, in the following format:


We then stop the default gateway being used as an isolation address by adding ‘das.usedefaultisolationaddress’ with a value of ‘false’.
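Pulling this together, the Advanced Options entries would look something like the following. The option names are the real vSphere HA advanced settings; the IP addresses are placeholders for your own infrastructure:

```
das.isolationaddress0           10.0.0.1    (placeholder – e.g. vMotion switch)
das.isolationaddress1           10.0.0.2    (placeholder – e.g. SAN controller management)
das.usedefaultisolationaddress  false
```

Each ‘das.isolationaddressX’ entry adds one more address for the HA agent to ping before declaring the host isolated.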

What IP addresses would I recommend you use in a production environment?

– vMotion/FT switches
– SAN Controller Management IP addresses
– Layer 2 Switch
– Layer 3 Switch
– Firewall

Virtual Machine Notes

It happens to the best of us: we go to a client site to check/review the current infrastructure and get greeted with zero documentation, apart from what’s in ‘Dave the IT guy’s’ head.

‘Dave the IT guy’ is then on annual leave or off ill, so we start a manual process of trying to discover an account with enough privileges to let us log in to servers, and eventually find out which server holds the vCenter role.

Great, we then get greeted with the old IT favourite: servers named after items from Star Wars!  The usual suspects are there: R2D2, C3PO, DarthVader, LukeSkywalker, Endor etc.

This leads me on to Virtual Machine notes, something which all of us have been guilty of overlooking.  It’s such a simple thing that makes every discovery process and day-to-day administration so much easier.

Spend a few minutes per server making a quick note of which roles it performs, and it makes everyone’s life easier.

We only need to click the Virtual Machine and then add a note.

Fabric Zoning Best Practices

After yesterday’s post on HBAs I was thinking about fibre channel, which leads in nicely to today’s post about fabric zoning best practices.

So, what is a ‘Single Initiator Zone’ and why do we implement them?

An initiator is a port on the HBA in your ESXi host.  HBAs typically have two ports, or perhaps four, depending on your requirements, and each port is known as an initiator.
Part of your VMware design would be to have at least two HBAs with two ports (initiators) each for redundancy. These then connect to the storage processors on your SAN (the targets), which have four ports in total, two on each disk controller.

We then have two fabric switches for redundancy, to ensure that our SAN continues to receive storage requests if a single fabric switch fails.

Following this through, our ESXi host has ports E1 & E2 on HBA1 and E3 & E4 on HBA2.  The SAN has S1 & S2 on disk controller 1 and S3 & S4 on disk controller 2.

From this we end up with eight zones, as each zone contains a single initiator and a single target.

E1 to S1 via Fabric Switch 1
E1 to S3 via Fabric Switch 2
E2 to S2 via Fabric Switch 1
E2 to S4 via Fabric Switch 2
E3 to S1 via Fabric Switch 1
E3 to S3 via Fabric Switch 2
E4 to S2 via Fabric Switch 1
E4 to S4 via Fabric Switch 2
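The pairing above can be sketched in a few lines of Python as a sanity check. The port names (E1–E4, S1–S4) and the fabric layout follow this example: fabric switch 1 carries targets S1 & S2, fabric switch 2 carries S3 & S4, and odd-numbered initiator ports pair with odd-numbered targets:

```python
# Single-initiator zoning sketch: one zone per (initiator, target) pair.
# Fabric 1 carries targets S1/S2, fabric 2 carries S3/S4; odd-numbered
# initiator ports pair with odd targets, even with even.

def single_initiator_zones():
    zones = []
    for e in range(1, 5):                        # initiators E1..E4
        for fabric, s_base in ((1, 0), (2, 2)):  # fabric 1 -> S1/S2, fabric 2 -> S3/S4
            s = s_base + (1 if e % 2 else 2)     # odd E -> odd S, even E -> even S
            zones.append((f"E{e}", f"S{s}", fabric))
    return zones

for initiator, target, fabric in single_initiator_zones():
    print(f"{initiator} to {target} via Fabric Switch {fabric}")
```

Running it prints exactly the eight zones listed above, which is a handy way to double-check you haven’t missed a path.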

If you’re like me, then looking at a picture makes a lot more sense.

Brocade produce a ‘Fabric Zoning Best Practices’ White Paper, which is the paper I tend to follow when implementing fabric zoning.
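On a Brocade switch, the first of the zones above might be created along these lines. The aliases, zone name, config name and WWPNs here are invented for illustration; check the Fabric OS command reference for your firmware before running anything:

```
alicreate "ESX1_E1", "10:00:00:00:c9:aa:bb:01"   (initiator WWPN – placeholder)
alicreate "SAN_S1",  "50:06:01:60:aa:bb:cc:01"   (target WWPN – placeholder)
zonecreate "Z_ESX1_E1_S1", "ESX1_E1; SAN_S1"     (single initiator, single target)
cfgcreate "FABRIC1_CFG", "Z_ESX1_E1_S1"
cfgenable "FABRIC1_CFG"
```

You would repeat the zonecreate step for each initiator/target pair on that fabric and add each zone to the config before enabling it.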

The white paper can be found here

Don’t forget that fabric zoning has nothing to do with LUN masking, which is used to control which servers are allowed to see which LUNs.  For example, in a vCenter environment you would normally want all of your hosts to be able to see all of the LUNs for vMotion to work.  The only exception to this would be if you had multiple clusters, where you would LUN mask each cluster’s hosts.