vSphere 4.x Planning For VM Tools & Hardware Upgrade

In most environments the upgrade of VMware Tools and virtual hardware causes issues, not in actually performing the upgrade, but in the time it takes to co-ordinate the activity.

In no particular order:

  • Change control for VM downtime
  • Ensuring you have a known good working backup (snapshots aren’t always accepted)
  • Aligning application teams to test services after the reboot
  • Knowing the VM’s dependencies and the effect its downtime will have on other VMs
  • Potential issues after the upgrade, for example E1000 vNICs and PSOD on vSphere 5.x (VMware KB 2059053)

We are currently in vSphere 4.x upgrade season, as it reaches end of support life on 21st May 2014, which leads me onto the purpose of this post.

Problem Statement

A vSphere 4.1 environment which had been upgraded from ESX 3.5, with VMware Tools still running at the ESX 3.5 level.

Customer needs to know how many reboots are required to bring them up to the highest vSphere 5.x version, to enable the internal co-ordination of application teams to test services after reboots.

Step 1

The oldest servers are HP BL480c G1 (yes, that’s right, G1). Nothing like sweating an asset!

VMware Compatibility Guide shows that ESXi 5.1 U2 is supported, however this didn’t ring true as I remembered a support statement from HP about DL380 G5.  A quick check on the HP VMware Support Matrix shows that 5.0 U3 is the highest supported.

Step 2

The VMware Tools and hardware upgrade was causing concern, as I could only find information on upgrading the VM hardware version from 4 or 7 to 8 (see VMware KB 1010675).

The interoperability  matrix shows that VMware Tools is not supported, but that didn’t answer my question.  Did I need to reboot once or twice to get VMware Tools to the newest version?

Interoperability

Solution

I created a VM using ‘custom’ so I could choose Hardware version 4 and installed Windows Server 2008 R2 from an ISO.

HW 4

VMware keeps all the old versions of VMware Tools at package.vmware.com/tools.  I manually downloaded and installed the ESX 3.5p27 package for Windows, which correlates to version 7304.

Note: The mapping of VMware Tools versions to ESXi versions can be found here

After the usual reboot, I can confirm the following:

  1. You can go straight from VMware Tools at version 3.5 to version 8 with a single reboot.
  2. An upgrade of hardware is possible straight from 4 to 8 (but we already knew that from the VMware KB).
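To turn this into something the application teams can plan around, the findings above can be sketched as a small script. This is a minimal sketch rather than a VMware tool: the `VM` class, `downtime_events` function and the sample VM names are all made up for illustration, and it assumes the hardware upgrade is done as a separate power cycle after the VMware Tools reboot.

```python
from dataclasses import dataclass
from typing import List

TARGET_HW = 8  # hardware version targeted in this post


@dataclass
class VM:
    name: str              # hypothetical VM name
    tools_current: bool    # True if VMware Tools is already at the target level
    hw_version: int        # current virtual hardware version


def downtime_events(vm: VM) -> List[str]:
    """List the interruptions a VM needs, based on the findings above."""
    events = []
    if not vm.tools_current:
        # Tools go from the 3.5 level to the target level in a single reboot.
        events.append("reboot after VMware Tools upgrade")
    if vm.hw_version < TARGET_HW:
        # Assumption: hardware 4 or 7 goes straight to 8 in one power cycle.
        events.append("power cycle for hardware upgrade")
    return events


if __name__ == "__main__":
    inventory = [VM("app01", False, 4), VM("app02", True, 7), VM("app03", True, 8)]
    for vm in inventory:
        print(vm.name, len(downtime_events(vm)), downtime_events(vm))
```

Feeding in an export of your own inventory gives a per-VM count of interruptions to hand to change control.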

The only thing left to do is inform the customer about the potential issue with E1000 vNICs and PSOD!

PHD Virtual – Special Olympics

Thought I would share this, as the folks over at PHD Virtual are making donations to the Special Olympics if you download a free trial before 31st March 2014.

To take part, simply download any PHD or Unitrends free trial and PHD Virtual will donate $1.00 towards the Special Olympics:

  • PHD Virtual Backup and Replication for VMware, Hyper-V and Citrix
  • Unitrends Enterprise Backup for physical and virtual environments
  • PHD Virtual ReliableDR for automated failover, failback and testing
  • PHD Virtual Recovery Management Suite

*Donations will be matched to the number of trial downloads up to $5,000 total

To get involved and help put some cash towards a great sporting event, follow this link

vSphere 5.1 – Unable To Deploy VM From Template

Problem Statement

After upgrading to vSphere 5.1 you experience the following error ‘the public key in specification does not match the vCenter public key.  You have to reenter the password in order to proceed’ when trying to deploy new virtual machines from templates.

Error

Issue

The administrator’s password stored in the Customization Specification was encrypted using a different certificate to the one now installed on vCenter.  This means that the password is no longer trusted.

Resolution

In vCenter go to Home > Customization Specification Manager

Customization Specification 01

Edit Specification > 5 Administrator Password > Re Enter Credentials > Save

Customization Specification 02

If your Customization Specification joins the VM to a domain, you will also need to enter your credentials again under 9 Workgroup or Domain

Customization Specification 03

You will now be able to deploy your VM from the template in question.

vCOPS for View Licensing

Having not deployed vCenter Operations Manager for View in the ‘wild’ I wasn’t sure of the licensing model.  After some research and help from the community I was able to answer my questions, so thought I would ‘pay it forward’ and put together a blog post.

vCOPS for View

Q. What monitoring does it include?

A.  The ability to monitor the number of Horizon View desktops that you have purchased.  Also included is monitoring of your Connection and Security Servers.

Q. Do I need to purchase vCOPS separately as a management portal?

A. No, this is included.

Note: The vCOPS portal is specific to View and does not give you the ability to monitor your vSphere environment, except for Connection and Security Servers.

Q. I want to use my existing vCOPS to monitor View, which edition do I need?

A. At least Advanced edition.

Note: I changed my Standard vCOPS license to Enterprise using a free trial key and then added the View adapter.  I then reverted to the Standard license to see what would happen.  Unfortunately, you receive the error message ‘this product is unlicensed or cannot connect to the vSphere Server.  Use a vSphere Client to connect to the vCenter Server and assign a license key’.

vCOPS Error

 Thanks to the following chaps from Twitter for their input:

  • Sunny Dua @Sunny_Dua
  • Michael Armstrong @m80arm
  • Hersey Cartwright @herseyc
  • Thomas Brown @thombrown

Load Balancing Horizon View – Failure Testing

In the last post, Load Balancing Horizon View – Design, we looked at the differences between DNS Round Robin, Windows Network Load Balancing and Load Balancers, and the design concepts for internal and external use.

In this post we will focus on testing failure scenarios to understand the impact of various components failing within a design.

Lab Setup

The Horizon View environment is configured as follows:

  • 2 x NetScaler VPX-Express in High Availability
  • 2 x Horizon View Security Servers
  • 2 x Horizon View Connection Servers

For the NetScaler configuration I followed the excellent Load Balancing VMware View with NetScaler guide by Dale Scriven, who runs the blog vhorizon.co.uk.  The only change I made was an additional TCP Service Group for 8443 (HTML5).

Service Groups

In the interests of sharing the configuration, below are extracts from each area.

Internal Logical Design

VMFocus View Internal Design HA v0.1

External Logical Design

VMFocus View Remote Access Design HA v0.1

vSphere Web Client

vSphere Client View

Horizon View Administrator

View Client View

NetScaler VPX-Express Admin

NetScalerClient View

Internal Connection Server Failure Scenario – Secure Gateway/Connection Unticked

Connection Server Unticked

I will have two connections to my Desktop Pool, both via View Client.

Table to Show Expected Results – Internal Connection Server Failure – Secure Gateway/Connection Unticked

Criteria | Expected Result | Recovery Time
Connection Server Power Off | Desktop remains connected | n/a
Connection Server Shut Down | Desktop remains connected | n/a
NetScaler VPX-Express Power Off | Desktop remains connected | n/a
NetScaler VPX-Express Shut Down | Desktop remains connected | n/a

Table to Show Actual Results – Internal Connection Server Failure – Secure Gateway/Connection Unticked

Criteria | Actual Result | Recovery Time
Connection Server Power Off | Desktop remains connected | n/a
Connection Server Shut Down | Desktop remains connected | n/a
NetScaler VPX-Express Power Off | Desktop remains connected | n/a
NetScaler VPX-Express Shut Down | Desktop remains connected | n/a

Not much to say really, everything performed as expected.

Internal Connection Server Failure Scenario – Secure Gateway/Connection Ticked

Connection Server Ticked

Again, I will have two connections to my Desktop Pool, both via View Client.

Table to Show Expected Results – Internal Connection Server Failure – Secure Gateway/Connection Ticked

Criteria | Expected Result | Recovery Time
Connection Server Power Off | Desktop session disconnect, then manual reconnect | 20 seconds
Connection Server Shut Down | Desktop session disconnect, then manual reconnect | 25 seconds
NetScaler VPX-Express Power Off | Desktop session disconnect, then manual reconnect | 20 seconds
NetScaler VPX-Express Shut Down | Desktop session disconnect, then manual reconnect | 25 seconds

Table to Show Actual Results – Internal Connection Server Failure – Secure Gateway/Connection Ticked

Criteria | Actual Result | Recovery Time
Connection Server Power Off | Desktop session disconnected after 2 seconds, manual reconnect | 28 seconds to be logged back into desktop
Connection Server Shut Down | Desktop session disconnected after 4 seconds, manual reconnect | 35 seconds to be logged back into desktop
NetScaler VPX-Express Power Off | Desktop session disconnected after 5 seconds, manual reconnect | 33 seconds to be logged back into desktop
NetScaler VPX-Express Shut Down | Desktop session disconnected after 9 seconds, manual reconnect | 41 seconds to be logged back into desktop

The Citrix NetScaler VPX appliances offer high availability by sharing their configuration and virtual IP address.  They do not preserve sessions when an appliance fails.
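To see how far the lab results drifted from my expectations, the recovery times from the two tables for the ‘ticked’ scenario can be compared with a few lines of Python.  This is only a sketch: the figures are copied from the tables above, while the `RESULTS` list and `overruns` function are names I have made up for the example.

```python
# (criteria, expected recovery in seconds, actual recovery in seconds)
RESULTS = [
    ("Connection Server Power Off", 20, 28),
    ("Connection Server Shut Down", 25, 35),
    ("NetScaler VPX-Express Power Off", 20, 33),
    ("NetScaler VPX-Express Shut Down", 25, 41),
]


def overruns(results):
    """Return how many seconds each test took beyond the expected recovery time."""
    return {name: actual - expected for name, expected, actual in results}


if __name__ == "__main__":
    for name, delta in overruns(RESULTS).items():
        print(f"{name}: {delta:+d} seconds against expectation")
```

Every test took longer than expected, so it is worth padding the figures you quote to the business.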

External Failure Scenario Expected Results

I will have three connections to my Desktop Pool: two via View Client and one via Blast (HTML5).  The Horizon View Administrator will be checked before each test to see which Security Server has the heaviest load and this one will form the test.

View Test

After each test Horizon View Administrator will be checked to find which Security Server has the heaviest load to perform the next test.

Criteria | Expected Result | Recovery Time
Security Server Power Off | Desktop session disconnect, then manual reconnect | 40 seconds
Security Server Shut Down | Desktop session disconnect, then manual reconnect | 40 seconds
Connection Server Power Off | Desktop session disconnect, then manual reconnect | 40 seconds
Connection Server Shut Down | Desktop session disconnect, then manual reconnect | 40 seconds
NetScaler VPX-Express Power Off | Desktop session disconnect, then manual reconnect | 60 seconds
NetScaler VPX-Express Shut Down | Desktop session disconnect, then manual reconnect | 60 seconds

External Failure Scenario Actual Results

Criteria | Actual Result | Recovery Time
Security Server Power Off | Desktop session disconnected after 14 seconds, manual reconnect | 52 seconds to be logged back into desktop
Security Server Shut Down | Desktop session disconnected after 12 seconds, manual reconnect | 55 seconds to be logged back into desktop
Connection Server Power Off | Desktop session disconnected after 19 seconds, manual reconnect | 109 seconds reconnected, black desktop background.  Timeout message 134 seconds.  Second manual reconnect, 252 seconds reconnected, black desktop background.  Timeout message 283 seconds.  Loop via View Client.  Can connect via Blast (HTML5) to desktop.
Connection Server Shut Down | Desktop session disconnected after 24 seconds, manual reconnect | 118 seconds reconnected, black desktop background.  Timeout message 141 seconds.  Second manual reconnect, 276 seconds reconnected, black desktop background.  Timeout message 301 seconds.  Loop via View Client.  Can connect via Blast (HTML5) to desktop.
NetScaler VPX-Express Power Off | Desktop session disconnected after 4 seconds, manual reconnect | 39 seconds to be logged back into desktop
NetScaler VPX-Express Shut Down | Desktop session disconnected after 19 seconds, manual reconnect | 57 seconds to be logged back into desktop

When a View Client connects externally, the NetScaler VPX passes traffic to the least loaded Security Server.  Remember a Security Server is bound to a single Connection Server and that ALL traffic is proxied via the Security Server.

When the first Security Server fails you are disconnected (as expected).  When the View Client is launched again, the NetScaler VPX routes traffic via the secondary Security Server and the secondary Connection Server.

  1. Everything OK: NetScaler > Security Server 01 > Connection Server 01 > Desktop
  2. Failed Security Server: NetScaler > Security Server 01 > No Access To Connection Server 01
  3. Reconnect: NetScaler > Security Server 02 > Connection Server 02 > Desktop
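The pairing in the list above can be modelled in a few lines of Python to reason about which paths to the desktop survive a given failure.  This is a toy model under my own assumptions, not how the NetScaler or Horizon View makes its decisions: the `PAIRING` table and `usable_paths` function are made up for the example, and it only encodes the rule that a Security Server is bound to a single Connection Server.

```python
# Each Security Server is bound to exactly one Connection Server.
PAIRING = {
    "Security Server 01": "Connection Server 01",
    "Security Server 02": "Connection Server 02",
}


def usable_paths(up):
    """Return the end-to-end paths whose Security and Connection Server are both up."""
    paths = []
    for security_server, connection_server in PAIRING.items():
        if security_server in up and connection_server in up:
            paths.append(f"NetScaler > {security_server} > {connection_server} > Desktop")
    return paths


if __name__ == "__main__":
    everything = set(PAIRING) | set(PAIRING.values())
    print(usable_paths(everything))                             # both paths
    print(usable_paths(everything - {"Security Server 01"}))    # path via 02 only
    print(usable_paths(everything - {"Connection Server 01"}))  # path via 02 only
```

Note the last case: Security Server 01 is still up, but it no longer leads anywhere, which is the scenario covered next.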

What I found most interesting was the Connection Server failures. In this scenario, the Security Servers are up and a Connection Server goes down.

Trying to reconnect via the View Client, you can authenticate successfully, but you then receive a ‘black desktop screen’ followed by a connection time out.

Looking at the connection status of the NetScaler VPX-Express services, only the HTTPS SSL Bridge to 443 on Security Server 01 is down and the rest of the services are up.

Failure Connection Server Power Off 01

The NetScaler VPX polls the Security Server on 443 HTTPS, 4172 TCP and 4172 UDP.  It sees that the PCoIP services on 4172 are up and tries to reconnect back to the original TCP session, because our Persistency Group is Source IP and we are connecting back over the same ports.
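If you want to check this per-port state yourself rather than rely on the NetScaler console, a simple TCP probe will show which ports on a Security Server are answering.  This is a rough sketch, assuming a plain TCP connect is a fair stand-in for the NetScaler monitors (it is not the same thing, and it does not cover the 4172 UDP check); the function names are my own.

```python
import socket


def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def service_status(host, ports=(443, 4172, 8443)):
    """Probe each TCP port independently and report which ones answer."""
    return {port: tcp_port_open(host, port) for port in ports}
```

Calling `service_status()` with the address of each Security Server during a failure test would show whether a port is answering even though the Connection Server behind it has gone.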

Connecting via Blast on HTTPS 8443 works.  I imagine this is because a new TCP connection is established to Security Server 02, which in turn connects via Connection Server 02, which is up.

Disconnecting from the Blast Desktop, I was able to reconnect to my desktop using View Client.

Final Word

Hopefully this post has gone some way to helping you understand the failure scenarios.  Knowing what to expect is key, as it allows you to set expectations with both the business and users.