System Logging Is Not Configured On Host ESXi5

System logging is not configured on host ESXi03, what does this mean?

Well, on my ESXi5 hosts I didn’t have any persistent storage as they were booting from USB, which means that when the host is rebooted all the log files disappear, as they are held in RAM. Probably not a good idea then.

So how do we get around this? A number of ways can be used, however, I prefer to keep things simple.  Connect to either vCenter or the ESXi host and navigate to the Configuration tab, then on to Advanced Settings.

Next, select Syslog from the left-hand menu and enter a value for Syslog.global.logDir in the following format:

[DatastoreName]/log

Top tip: the datastore name is case sensitive.

So if your ESXi5 host is connected to a datastore called VMAPP01 then the syntax would be [VMAPP01]/log
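If you want to generate the value for several hosts rather than typing it each time, the format can be sketched in a few lines of Python. This is only an illustration: `build_syslog_logdir` is a hypothetical helper of my own, not part of any VMware tool, and all it does is assemble the string you would enter in Syslog.global.logDir.

```python
def build_syslog_logdir(datastore_name: str, folder: str = "log") -> str:
    """Return a value in the form [DatastoreName]/folder.

    Hypothetical helper for illustration only. The datastore name is
    used exactly as given, because it is case sensitive.
    """
    name = datastore_name.strip()
    if not name:
        raise ValueError("datastore name must not be empty")
    if "[" in name or "]" in name:
        raise ValueError("pass the bare datastore name, without brackets")
    return f"[{name}]/{folder.strip('/')}"


print(build_syslog_logdir("VMAPP01"))  # [VMAPP01]/log
```

Note that the helper never changes the case of the name, so "vmapp01" and "VMAPP01" produce different values.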

Click OK to apply and let’s check the Summary Tab

Boom, the ‘system logging is not configured on host ESXi03’ has gone!

Setting Up & Configuring Alarms in vCenter 5 Part 2

In the previous post setting up and configuring alarms in vCenter 5 Part 1 we looked at the initial configuration.  We are now going to run through some of the default alarms, with some suggested thresholds.

Cannot Connect To Storage: why would we want to configure this? Well, essentially this is a per-host setting.  If the host loses connection to the storage then the VMs will be restarted using HA.  Big deal you say, I can see that in vCenter.  Well, it also manages ‘lost storage path redundancy’ and ‘degraded storage path redundancy’, so if your ESXi host has multiple connections to its storage, you will be notified if one of these is lost.

Datastore Usage On Disk: quite an important one.  From the presented LUN, how much space has been provisioned as a Datastore?  I recommend always asking for slightly more than you need, e.g. if you need 1TB for a Datastore, ask for an extra 25%.  Then when the Datastore is provisioned only use 1TB, so you have room for expansion quickly and easily if needed.  With this in mind, I set the Warning to 90% and Critical to 95% so I have some room to move VMs around, either by Storage vMotion or Cold Migration.
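To make the numbers concrete, here is a rough Python sketch of the 25% headroom sum and the 90%/95% thresholds. The function names are my own, and treating a reading that exactly reaches a threshold as a trigger is an assumption of the sketch, not something taken from vCenter.

```python
WARNING_PCT = 90.0   # warning threshold from the text
CRITICAL_PCT = 95.0  # critical threshold from the text


def lun_size_to_request(needed_tb: float, headroom: float = 0.25) -> float:
    """Size to ask the storage team for: what you need plus 25% headroom."""
    return needed_tb * (1 + headroom)


def datastore_alarm_state(used_gb: float, capacity_gb: float) -> str:
    """Classify datastore usage as 'normal', 'warning' or 'critical'."""
    if capacity_gb <= 0:
        raise ValueError("capacity must be positive")
    used_pct = used_gb / capacity_gb * 100
    if used_pct >= CRITICAL_PCT:
        return "critical"
    if used_pct >= WARNING_PCT:
        return "warning"
    return "normal"


print(lun_size_to_request(1.0))          # 1.25
print(datastore_alarm_state(920, 1000))  # warning
print(datastore_alarm_state(960, 1000))  # critical
```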

Host CPU Usage: with this alarm, I generally alert at Warning 75% for 15 mins and then Critical for 10 mins.  The rationale behind this is that I would want to investigate the VMs’ CPU utilisation to see if it is a one-off event causing the high usage or if we need to look at introducing more processing power into the cluster.
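The reason for the ‘for 15 mins’ part is to ignore short spikes. A minimal Python sketch of that idea follows; it is my own simplified model, not how vCenter evaluates alarms internally, and it assumes one usage sample per minute.

```python
def sustained_above(samples, threshold_pct, minutes_required, interval_minutes=1):
    """Return True if usage stayed above threshold_pct for at least
    minutes_required consecutive minutes.

    samples is a list of usage percentages taken at a fixed, assumed
    interval; each sample stands for interval_minutes of time.
    """
    run = 0  # length of the current run of samples above the threshold
    for pct in samples:
        run = run + 1 if pct > threshold_pct else 0
        if run * interval_minutes >= minutes_required:
            return True
    return False


spike = [50] * 10 + [95] * 5 + [50] * 10  # five-minute burst
busy = [50] * 5 + [80] * 15               # fifteen minutes above 75%

print(sustained_above(spike, 75, 15))  # False
print(sustained_above(busy, 75, 15))   # True
```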

Host Error: perhaps the most important one, this is what vCenter relies on to monitor host alarms!

Host Memory Usage: similar to CPU usage, I generally set Warning to 90% for 15 mins and Critical for 10 mins.  Again I would want to investigate the host memory usage to ensure that we have sufficient resources for a host failure.

Host Memory Status: not to be confused with ‘Host Memory Usage’, this monitors the physical DIMMs.

Host Processor Status: again, not to be confused with ‘Host CPU Usage’, this monitors the physical processor hardware.

License Capacity Monitor: I like this alarm, it’s great for items such as Site Recovery Manager or Operations Manager.  It lets you know if you are trying to protect or manage more VMs than you are licensed for.

Virtual Machine CPU Usage: I use the same alarm settings as for ‘Host CPU Usage’, so that if a VM is using more than 75% of its CPU capacity for over 15 minutes, I would want to identify if this is a one-off or if extra resources are required.

vSphere HA Failover In Progress: this falls into the nice-to-have category.  If for some reason none of your other alarms work then at least you know that a VM has been restarted by HA.

vSphere HA Virtual Machine Monitoring Error: this alarm works in conjunction with Virtual Machine Monitoring.  I tend to leave VM Monitoring set to VM Monitoring Only and Medium, and then change individual VMs’ monitoring to High if required.  If you have this set to High for all servers then it can cause alarms when backup software rolls back snapshots, depending on how big the VM is.

Hopefully the following alarms don’t need any explanation, as they should ALWAYS be enabled.

Host Battery Status
Host Connection And Power State
Host Connection Failure
Host Hardware Fan Status
Host Hardware Power Status
Host Hardware System Board Status
Host Hardware Temperature Status
Insufficient vSphere HA Failover Resources
Network Connectivity Lost
Network Uplink Redundancy Degraded
Network Uplink Redundancy Lost

Naturally, this isn’t a complete list of alarms; however, these are the default alarms that I would configure in most, if not all, environments.  Every environment is different and you may use more or fewer alarms than I have mentioned.

Don’t forget that depending on which vSphere licenses you have, you might see extra default alarms for items such as FT.  Also, when you install additional components, e.g. SRM, you will get even more alarms to have a play around with.

Setting Up & Configuring Alarms in vCenter 5 Part 1

vCenter has some great inbuilt alarms which can trigger alerts via email or SNMP to the IT administrator.  I have seen quite a few environments where alarms haven’t been configured!  The obvious question is, is this due to lack of knowledge or do the administrators really check every item manually within vSphere? My guess is the former.

With this in mind, I thought I would go over the basic settings and then also what alarms/alerts I generally put in place, along with some rationale behind the triggers.

The first thing we have to do is configure vCenter to send out email and SNMP alerts.  Go to Home > vCenter Server Settings or to Top Menu Bar > Administration > vCenter Server Settings

Select Mail from the left-hand side and enter your SMTP Server details.  Note that VMware does not support email authentication, so if you are using Exchange 2003/2007/2010 I recommend you create a new receive connector called ‘vmware’.

Select SNMP from the left-hand side and enter either the IP Address or DNS Name of your SNMP Server, along with the community string if it is anything other than ‘public’.

If you need the MIBs (Management Information Base files), these can be found at %ProgramFiles%\VMware\Infrastructure\VirtualCenter Server\MIBS if the default installation path has been used.

Alarms can be configured at a few different levels which are:

Root: these alarms will encompass Datacentre, Cluster, ESXi Hosts, Resource Pools and VMs

Datacentre: these alarms will encompass Cluster, ESXi Hosts, Resource Pools and VMs

Cluster: these alarms will encompass ESXi Hosts, Resource Pools and VMs

ESXi Hosts: these alarms will encompass Resource Pools and VMs

Resource Pools: these alarms will encompass the VMs that reside within them

VM: these alarms are specific to the virtual machine only

Generally speaking, nearly all the alarms which I create are done at the root level, which means that whatever actions are performed by the vCenter administrator, they should be covered.
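The way an alarm’s scope narrows as you define it lower down can be sketched in Python. This is just a model of the list above; the level names are taken from that list and `covered_levels` is a made-up function name.

```python
# Levels from widest to narrowest, as listed above.
LEVELS = ["Root", "Datacentre", "Cluster", "ESXi Hosts", "Resource Pools", "VM"]


def covered_levels(defined_at):
    """Return the levels beneath the one where the alarm is defined."""
    index = LEVELS.index(defined_at)  # raises ValueError for unknown levels
    return LEVELS[index + 1:]


print(covered_levels("Root"))
# ['Datacentre', 'Cluster', 'ESXi Hosts', 'Resource Pools', 'VM']
print(covered_levels("Cluster"))
# ['ESXi Hosts', 'Resource Pools', 'VM']
print(covered_levels("VM"))
# []
```

An alarm defined on a VM covers nothing beneath it, which is why defining alarms at the root gives the widest coverage.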

vCenter allows you to configure actions for alarms based around set criteria.  When the alarm is triggered it can be configured to alert once or repeat.

An alarm triggers when it enters a warning state, e.g. Datastore Disk Usage Is Above 90%, and then again when it hits a critical state, e.g. Datastore Disk Usage Is Above 95%.

So following this through, alarms can be triggered by the following events:

Normal Condition > Warning Condition
Warning Condition > Critical Condition
Critical Condition > Warning Condition
Warning Condition > Normal Condition
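To show how those four transitions play out over time, here is a small Python sketch. It is a simplified model of my own rather than anything vCenter exposes; the state names and the `alerts_for` function are assumptions made for the example.

```python
# The four transitions listed above that can fire an alarm action.
ALERT_TRANSITIONS = {
    ("normal", "warning"),
    ("warning", "critical"),
    ("critical", "warning"),
    ("warning", "normal"),
}


def alerts_for(states):
    """Return each (previous, current) state change that is in the list above."""
    changes = [(prev, cur) for prev, cur in zip(states, states[1:]) if prev != cur]
    return [change for change in changes if change in ALERT_TRANSITIONS]


history = ["normal", "normal", "warning", "critical", "warning", "normal"]
for prev, cur in alerts_for(history):
    print(f"{prev} -> {cur}")
```

Running it against the sample history prints all four transitions in order, one for each time the state changes.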

Alarms can be triggered if they meet ‘any’ of the conditions or ‘all’ the conditions you have set.

If you are a savvy VMware Administrator you may ask the storage team for a 2TB LUN when you only really need 1TB.  You provision a datastore at 50% capacity, so you want to create a warning alarm when it reaches 75% provisioned and then critical at 90% provisioned, so you know when to ask for some extra space from the storage team.

With this in mind, imagine you had a single alarm which covered both Datastore Disk Usage (%) and Datastore Disk Provisioned (%).  With ‘all’ selected, it would only fire once both conditions had been met.  I would always recommend using ‘trigger if any of the conditions are satisfied’ unless you have a compelling reason not to do so.

So now we have configured vCenter to be able to send alerts, we need to configure some for it to send!  Hold fire until Part 2.
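The difference between ‘any’ and ‘all’ is easiest to see with a quick Python sketch. The readings below are hypothetical, and `alarm_triggered` is my own illustrative function, not a vCenter API.

```python
def alarm_triggered(condition_results, mode="any"):
    """Combine individual trigger results the way the 'any'/'all' option does."""
    if mode == "any":
        return any(condition_results)
    if mode == "all":
        return all(condition_results)
    raise ValueError("mode must be 'any' or 'all'")


# Hypothetical readings for one datastore, checked against the 90% usage
# and 75% provisioned warning thresholds mentioned in the text.
usage_pct, provisioned_pct = 60.0, 80.0
results = [usage_pct > 90.0, provisioned_pct > 75.0]

print(alarm_triggered(results, "any"))  # True
print(alarm_triggered(results, "all"))  # False
```

With ‘all’, the over-provisioned datastore goes unnoticed until disk usage also crosses its threshold.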

Virtual Machine Restart Priority

We are all guilty of doing this: we design and install a beautifully crafted vSphere 5 environment following best practices for HA and host isolation responses, and we set up our admission control to meet the client’s requirements.  We then pass the VMware environment back to the client to manage and maintain themselves.

The client has a hardware failure and the VMs are restarted on an alternative host.  Excellent, we say.  However, the client is far from happy as we didn’t mention or configure ‘virtual machine restart priority’ and they encountered complications as the VMs came up in the wrong order.

In essence, virtual machine restart priority enables selected virtual machines to start before other virtual machines, overriding the cluster’s default settings.  To configure virtual machine restart priority:

– Right Click Cluster
– Edit Settings
– Virtual Machine Options
– Virtual Machine Settings > VM Restart Priority

Let’s look at the following scenarios.

Scenario A

Client has VMware Standard licensing, which means they don’t have DRS.  They have two Exchange 2010 email servers, one running the CAS/Hub role and the other running the Mailbox role.  They reside on the same host as someone thought this would be a ‘good idea’.

The physical host fails and it’s a free-for-all for the VMs to restart; as a result the CAS/Hub server comes up before the Mailbox server.  Outlook client connectivity, OWA and ActiveSync then take longer than anticipated to connect, resulting in extended downtime.

Scenario B

Same client has configured virtual machine restart priority with the following settings:

Mailbox server – High
CAS/Hub server – Medium

The VMs restart in the right order and the client has less downtime.
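As a rough illustration of Scenario B, the ordering can be modelled in Python as a simple sort on priority. This is a deliberately simplified sketch; it only shows the order in which restarts would be attempted, and `restart_order` is a hypothetical function, not how HA is implemented.

```python
# Lower number means the VM is restarted earlier.
PRIORITY_ORDER = {"High": 0, "Medium": 1, "Low": 2}


def restart_order(vm_priorities):
    """Return VM names sorted so higher restart priority comes first."""
    return sorted(vm_priorities, key=lambda name: PRIORITY_ORDER[vm_priorities[name]])


# Settings from Scenario B.
vms = {"CAS/Hub server": "Medium", "Mailbox server": "High"}
print(restart_order(vms))  # ['Mailbox server', 'CAS/Hub server']
```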

Best Practices

Naturally every environment is different, but as a general rule of thumb, I recommend using the following guidelines.

Exchange

– CAS/Hub – High Priority
– Mailbox – Medium Priority

Domain Controllers

– If FSMO role holder – High Priority
– If Global Catalogue – High Priority

SQL

– SQL Server – High Priority
– Applications relying on SQL e.g. BES – Medium Priority

Citrix

– Data Collector – High Priority
– Web Server – Medium Priority
– License Server – Medium Priority
– Farm Members – Low Priority (as you want everything else to be up and running before users log in).