3PAR StoreServ 7000 Software – Part 2

hp-3par-storeserv-7000-1

Online Firmware Upgrades (OLFU)

Online Firmware Upgrades means that the storage controllers can be updated with minimal disruption to data being transferred by using NPIV, as discussed in 3PAR StoreServ 7000 Software – Part 1.

Call me old school, but I’m still not 100% comfortable performing firmware upgrades live on SANs.  However, I do appreciate that some businesses, such as cloud providers, have no other alternative.

3PAR introduced a new way of upgrading from 3.1.1 to 3.1.2 which is as follows:

1. Firmware is loaded into the Service Processor

2. Service Processor copies the new code onto all nodes

3. Each node is updated one by one

4. Each node continues to run the old firmware until every node has been upgraded

5. A copy of the old firmware is kept in the ‘altroot’ directory

The upgrade process has a timeout value of 10 minutes.  A few commands to have in your upgrade toolbox are:

  • upgradesys -status
  • checkhealth
  • checkupgrade
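The staged process above can be sketched in a few lines of Python (a toy model for illustration only; the node names and version strings are invented):

```python
# Toy model of the staged online upgrade: the Service Processor stages the
# new code on every node first, then nodes are updated one by one, each
# keeping a copy of the old firmware in its 'altroot' directory.

def staged_upgrade(nodes, old="3.1.1", new="3.1.2"):
    state = {n: {"running": old, "staged": None, "altroot": None} for n in nodes}

    # Step 2: copy the new code onto all nodes before touching any of them
    for n in nodes:
        state[n]["staged"] = new

    # Steps 3-5: update each node in turn, preserving the old image
    for n in nodes:
        state[n]["altroot"] = state[n]["running"]
        state[n]["running"] = state[n]["staged"]

    return state

result = staged_upgrade(["node0", "node1"])
```

In real life the checkpoints are upgradesys -status, checkhealth and checkupgrade, with the 10-minute timeout looming over each step.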

Common reasons for a failed upgrade are:

  • Disks may have degraded or failed.
  • LUNs may not have their Native and Guest ports configured correctly and would therefore not be able to connect on the alternate node.

If everything has gone wrong you can revert the upgrade by issuing the command upgradesys -revertnode

Node Shutdown Active Tasks

3PAR OS 3.1.1 would not allow you to shut down a node with active tasks, which I think is a thing of beauty.  However, with 3PAR OS 3.1.2, when you issue the command

shutdownnode reboot 0

you will be prompted to confirm that you are really sure you want to do this. If you answer yes, then any active tasks are stopped and the node is rebooted.  I haven’t been able to test this yet; however, my understanding is that some tasks will automatically resume after the node has been rebooted.

Delayed Export After Reboot

This is actually pretty handy, especially if you need to make sure that certain LUNs are presented in a particular order after a power outage.

I haven’t been able to think of a particular application that would require this feature; nevertheless, it’s still handy to know it can be used.

Online Copy

Online Copy is 3PAR’s term for making a clone of a virtual volume.  It allows backup products to directly make snapshots of virtual volumes, reducing the impact of backups on VMs.  Perhaps more importantly, it allows entire VMs to be recovered more quickly, rather than relying on instant restore mechanisms which run the backup from the backup target and ultimately result in performance degradation.  Veeam have announced support for this integration for Q1 2013.

An enhancement with 3PAR OS 3.1.2 is that you no longer have to wait for the copy to complete before you can gain access to the copied volume.

So how does this work?

Step 1 – A read only and read write snapshot of the source volume is created.  A copy volume is then created, which could be on a lower tier of disks.

Online Copy 1

Step 2 – A logical disk volume (LDV) is created to map the copy volume to.  Next a thin provisioned virtual volume is created along with another LDV.

Online Copy 2

Step 3 – Region moves then take place from one LDV to another.

Online Copy 3

Step 4 – The copy volume can now be exported whilst the region moves continue in the background (that is pretty awesome).

Step 5 – Once the region moves complete, the read only and read write snapshots are removed, as well as the LDVs.  Then, last of all, the thin provisioned volumes are removed.

Online Copy 4
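The five steps above can be modelled roughly like this (a toy Python sketch with invented names, not 3PAR internals), with the key point being that the copy volume is usable before the region moves have finished:

```python
# Toy model of the online copy flow: RO/RW snapshots are taken, region
# moves copy the data across in the background, and the copy volume is
# exportable before those moves complete.

def online_copy(source_regions):
    ro_snapshot = list(source_regions)   # step 1: read-only snapshot
    rw_snapshot = list(source_regions)   # step 1: read-write snapshot
    copy_volume = []                     # step 2: copy volume mapped via an LDV
    exportable = True                    # step 4: export allowed immediately

    for region in ro_snapshot:           # step 3: region moves, one by one
        copy_volume.append(region)

    # Step 5: snapshots and intermediate LDVs are cleaned up on completion
    ro_snapshot.clear()
    rw_snapshot.clear()
    return copy_volume, exportable

copied, exportable = online_copy(["r0", "r1", "r2"])
```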

3PAR believe that online copies are faster as they use the ‘region mover’, however no performance figures have been released to substantiate this.  A few things to note:

  • Online copies need a start and end point and therefore require an interruption in I/O
  • Online copies cannot be paused, only cancelled
  • Online Copy does not support snapshot copying
  • Online Copy can only be performed via the CLI

3PAR StoreServ 7000 Software – Part 1

Upgrading To Inform OS 3.1.2

Before beginning an upgrade of the 3PAR Inform OS to version 3.1.2 it is recommended to use the following guides:

When performing the upgrade, 3PAR Support will either want to be onsite or on the phone remotely.  They will ask for the following details:

  • Host Platform e.g. StoreServ 7200
  • Architecture e.g. SPARC/x86
  • OS e.g. 3.1.1
  • DMP Software
  • HBA details
  • Switch details
  • 3PAR license details

A few useful commands here are:

  1. showsys
  2. showfirmwaredb
  3. showlicense
  4. shownode

A quick run-through of the items that are recommended to be checked.  The first item is your current Inform OS version, as this will determine how the upgrade has to be performed.

Log into your 3PAR via SSH and issue the command ‘showversion’.  This will give you your release version and the patches which have been applied.

OS Version

Here our 3PAR is on 3.1.1; however, it doesn’t specify whether we have a direct upgrade path to 3.1.2 or not, so see the table below.

Upgrade Path

If going from 3.1.1 to 3.1.2, then Remote Copy groups can be left replicating; if you are upgrading from 2.3.1, then the Remote Copy groups must be stopped.

Any scripts you may have running against the 3PAR should be stopped; the same goes for any environment changes (common sense really).

The 3PAR must be in a healthy state, each node should be multipathed, and a check should be undertaken to confirm that all paths have active I/O.

If the 3PAR is attached to a vSphere Cluster, then the path policy must be set to Round Robin.

Once you have verified these, you are good to go.

3PAR Virtual Ports – NPIV

NPIV allows an N_Port (in this case that of the 3PAR HBA) to assume the identity of another port without multipath dependency.  Why is this important?  Well, it means that if a storage controller is lost or rebooted, it is transparent to the host paths, meaning connectivity remains, albeit with fewer interconnects.

When learning any SAN, you need to get used to the naming conventions.  For 3PAR, they roll with Node:Slot:Port, or N:S:P for short.

Each host-facing port has a ‘Native’ identity (primary path) and a ‘Guest’ identity (backup path) on a different 3PAR node in case of node failure.
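Once you get used to it, the N:S:P convention is easy to handle programmatically too; here is a quick illustrative helper (my own, not part of any 3PAR toolkit):

```python
def parse_nsp(nsp):
    """Split a 3PAR port name like '0:1:2' or 'N0:S1:P2' into
    (node, slot, port) integers."""
    parts = nsp.split(":")
    if len(parts) != 3:
        raise ValueError(f"expected Node:Slot:Port, got {nsp!r}")
    # Tolerate the optional N/S/P letter prefixes on each component
    return tuple(int(p.lstrip("NSPnsp")) for p in parts)

print(parse_nsp("N0:S0:P1"))  # (0, 0, 1)
```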

Node Backup

It is recommended to use Single Initiator Zones when working with 3PAR and to connect the Native and Backup ports to the same switch.

S1 – N0:S0:P1

S2 – N:S0:P2

S3 – N0:S0:P1

S4 – N0:S0:P2

zoning

For NPIV to work, you need to make sure that the Fabric switches and the HBAs in the 3PAR support NPIV.  Note the HBAs in the Host do not need to support NPIV, as the change to the WWN will be transparent to the host-facing HBAs.

How does it actually work? Well if the Native port goes down, the Guest port takes over in two steps:

  1. Guest port logs into Fabric switch with the Guest identity
  2. Host path from Fabric switch to the 3PAR uses Guest path

NPIV Failure
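The two-step failover above can be modelled roughly like this (port names and the WWN are invented for illustration):

```python
# Toy model of NPIV failover: each host-facing WWN has a Native port and a
# Guest port on a different node. If the Native port's node goes down, the
# Guest port logs into the fabric with the same identity, so the host path
# survives the node outage.

def active_port(wwn, port_map, failed_nodes):
    native, guest = port_map[wwn]           # (native N:S:P, guest N:S:P)
    native_node = int(native.split(":")[0])
    if native_node in failed_nodes:
        return guest                        # steps 1 and 2: guest takes over
    return native

ports = {"50:01:43:80:xx": ("0:1:1", "1:1:1")}
print(active_port("50:01:43:80:xx", ports, failed_nodes={0}))  # 1:1:1
```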

The really cool thing is that, as part of the online upgrade, the software will:

  1. Validate Virtual Ports to ensure the same WWNs appear on the Native and Guest ports
  2. Validate that the Native and Guest ports are plugged into the same Fabric switch

Online Upgrade

If everything is ‘tickety boo’ then Node 0 will be shut down and transparently failed over to Node 1.  After a reboot, Node 0 will have the new 3PAR OS 3.1.2, and then Node 1 is failed over to Node 0’s Guest ports and Node 1 is upgraded.  This continues until all nodes are upgraded.

When performing an upgrade, it shouldn’t require any user interaction; however, as we all know, things can go wrong. A few useful commands to have in your toolbox are:

  • showport
  • showportdev
  • statport/histport
  • showvlun/statvlun

vExpert 2013 Applications Open

John Troyer, Social Media Evangelist for VMware (Twitter handle @jtroyer), has announced that the vExpert 2013 applications are open.

To become a vExpert you have three paths to choose from.

Evangelist Path

The Evangelist Path includes book authors, bloggers, tool builders, public speakers, VMTN contributors, and other IT professionals who share their knowledge and passion with others with the leverage of a personal public platform to reach many people. Employees of VMware can also apply via the Evangelist path. A VMware employee reference is recommended if your activities weren’t all in public or were in a language other than English.

Customer Path

The Customer Path is for leaders from VMware customer organizations. They have been internal champions in their organizations, or worked with VMware to build success stories, act as customer references, given public interviews, spoken at conferences, or were VMUG leaders. A VMware employee reference is recommended if your activities weren’t all in public.

VMware Partner Network Path

The VPN Path is for employees of our partner companies who lead with passion and by example, who are committed to continuous learning through accreditations and certifications and to making their technical knowledge and expertise available to many. This can take the shape of event participation, video, IP generation, as well as public speaking engagements. A VMware employee reference is required for VPN Path candidates.

To apply to become a vExpert, click me

3PAR StoreServ 7000 Hardware – Part 4

Let’s say that we have had our StoreServ in and running for a few months and everything has been ‘tickety boo’, until we have an error, or as I prefer to call it, a ‘man down’ scenario.

What are the issues we are going to encounter? Well these can be broken down into three areas.

1. Configuration Errors

Err, we, the awesome StoreServ administrators, have configured the 3PAR in an unsupported manner.

2. Component Failure

Not so bad, as it wasn’t caused by us! We have a component failure, e.g. a DIMM, a drive, etc.

3. Data Path

We have an interconnect failure, or perhaps even a faulty component, e.g. a SAS cable.

In the following section we are going to cover these in a little more detail.

Configuration Errors

These mostly come from incorrect cabling, adding more cages than is supported, and adding a cage to the wrong enclosure.  The good news is that configuration errors are detected by the StoreServ and you will receive an alert.

Let’s say that you have cabled incorrectly; most likely, if you lose a cage, then you will lose connectivity to all the other cages downstream.  The correct cabling diagram is shown below.

3PAR Disk Shelf Cabling

Fixing an issue where you have too many Disk Enclosures above the supported maximum (e.g. six enclosures on a two-node StoreServ 7200) is pretty simple: unplug it!

It’s pretty obvious really, but make sure that all your devices are supported; two which aren’t are:

  1. SAS-1
  2. SAS connected SATA drives

Component Failure

I think the first thing to remember is that connectivity issues can be caused by component failures.

Components can be broken down into two areas: Cage and Data Path.  The good news is that if everything is cabled correctly we have dual paths.  The only exception to this is the backplane.

Any failure of a Cage component, e.g. a Power Supply, Fan, Battery or Interface Card, will result in an alarm and an amber LED being displayed until the component can be replaced.

Right, so what happens then if we have a backplane failure? Well, if it’s the original StoreServ 7000 enclosure, you want to shut the system down and phone HP!

If you have a Disk Enclosure backplane failure, then your choices are as follows:

  1. If you have enough space on existing disks, then the disks can be vacated and the back plane replaced.
  2. If you don’t have enough space on existing disks but another Disk Enclosure can be added, then add another Disk Enclosure, vacate the disks, and then remove the failed Disk Enclosure.
  3. If you have no space and you cannot add another Disk Enclosure, then err work quickly!

Data Path Faults

The data path is essentially the SAS interconnects.  It is comprised of:

  • SAS Controller or HBA
  • SAS Port
  • SAS Expander (Drive Enclosures)
  • SAS Drives
  • SAS Cables

We have two types of ‘phy’ ports: narrow and wide.  A narrow port consists of a single physical interconnect and a wide port consists of multiple physical interconnects.  I prefer working in pictures as they make more sense to me.

Data Path

We can see the SAS Controller and Disk Enclosures are connected via 4 x wide physical ports (phys), whereas the individual disk drives are connected to the SAS Expander (Drive Enclosure) by a 1 x narrow physical port (phy).

In exactly the same way as we can have Ethernet alignment mismatches when negotiating (e.g. with 2 x 1 Gb links, one negotiates at 100 Mb half duplex), the same occurrence can happen with SAS, e.g. 4 x wide ports into 4 x wide ports and one port doesn’t negotiate correctly.

If you do receive a mismatch then this will result in poorer performance, CRC errors or device resets.

Perhaps one of the hardest issues to resolve are intermittent errors which only become apparent when the StoreServ is under load.  In the above scenario, where we have 4 x wide ports connected to another 4 x wide ports but one port hasn’t negotiated correctly, it won’t be until we need to utilize 75% or more of the link that we experience the problem.  The good news is that these issues can be detected in the ‘phy error log’.

To view the link connection speeds, issue the command showport -c.

Naturally the link speeds should represent your fabric interconnects.

showport
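Spotting the odd phy out can be done by eye, but as a sanity check the logic looks something like this (the speed list is invented for illustration; real showport -c output is formatted differently):

```python
def find_mismatched_phys(phy_speeds, expected="6Gbps"):
    """Return the indices of phys in a wide port that did not negotiate
    at the expected speed (hypothetical data, not real showport output)."""
    return [i for i, speed in enumerate(phy_speeds) if speed != expected]

# A 4 x wide port where one phy negotiated badly:
print(find_mismatched_phys(["6Gbps", "6Gbps", "1.5Gbps", "6Gbps"]))  # [2]
```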

3PAR StoreServ 7000 Hardware – Part 3

This is where things start to pick up a bit as we venture onto adding the StoreServ 7000 into the Virtual Service Processor.

Browse to your VSP using the IP address you configured in 3PAR StoreServ 7000 Hardware – Part 2 and log in with your credentials.  A quick side note: you may hear the term SPOCC bandied around quite a bit; it stands for ‘Service Processor Onsite Customer Care’.  Anyhow, click on SPMaint

SP 1

Select Inserv Configuration Management

SP 2

Guess what we need to Add A New InServ

SP 3

Enter the IP Address of your StoreServ 7000

SP 4

Verify the details and click ‘Add New InServ’

SP 5

Man Down – Replacing a Failed Hard Drive

A slightly over exaggerated title, but I’m sure it grabbed your attention.

The StoreServ has a feature called ‘Guided Maintenance’; this essentially shows you how to perform a number of tasks, e.g. replacing a DIMM or Fibre Channel adapter.  This can be found under Support > Guided Maintenance

SP 6

Perhaps the most common failure you will encounter is replacing a faulty disk.  This can be done via the CLI by SSHing onto your StoreServ, or via the VSP by going to SPMaint > Execute a CLI Command and entering ‘servicemag status’.

servicemag

As I don’t have a failed disk, it shows ‘No servicemag operations logged’.

If you did have a failed disk, you would be told which Cage and Magazine has a failure and that the Magazine has been taken offline to allow you to replace the faulty HDD.  Once you have replaced the disk, give it 15 minutes and re-issue the servicemag status command; when complete, you will see ‘No servicemag operations logged’.

You can also check via the GUI in the 3PAR Inform Management Console by going to System > Physical Disks and then looking down the cages.

Failed Disk

Double check that the HDD is Failed and that Free Capacity and Allocated Capacity are displayed as all zeroes.  If this is the case, then pop the bad boy out and pop a new one in.

Man Down – Servicing a Power & Cooling Module (PCM)

This is only available via SSH onto your StoreServ or via the VSP by going to SPMaint > Execute a CLI Command

To confirm if the PCM is down issue the command shownode -ps

As you can see, mine are OK; however, if you had a failure, then replace the PCM and run the command again until you see both PCMs are OK.  Note this can be done live without any downtime.

ShowNode PS

Man Down – Replacing a Power & Cooling Module (PCM) Battery

The Power and Cooling Module Battery is again only available via SSH onto your StoreServ or via the VSP by going to SPMaint > Execute a CLI Command.

The battery is located at the top of the PCM.

To verify your battery status issue the command showbattery

showbattery

Again, if it has failed, replace the part and re-issue the showbattery command to verify it’s healthy.

Drive Enclosure Expansion

The StoreServ 7200 is limited to five extra drive enclosures.  Two can be connected via DP1 and three can be connected via DP2.

The StoreServ 7400 with two nodes is limited to nine extra drive enclosures.  Four can be connected via DP1 and five can be connected via DP2.  Note these figures double for a four-node StoreServ 7400.

You might be thinking: why does DP2 have more connections? Well, the answer is that DP1 is also responsible for the internal connections, which evens things out.
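The enclosure limits above can be captured in a small lookup (figures taken straight from the text; the function is just my own illustrative sketch):

```python
# Extra drive enclosure limits per data path, as described above:
#   7200 (2 nodes): DP1 = 2, DP2 = 3 -> 5 total
#   7400 (2 nodes): DP1 = 4, DP2 = 5 -> 9 total; doubles for 4 nodes

def max_extra_enclosures(model, nodes=2):
    limits = {"7200": {"DP1": 2, "DP2": 3}, "7400": {"DP1": 4, "DP2": 5}}
    per_pair = limits[model]
    multiplier = nodes // 2 if model == "7400" else 1
    return {dp: n * multiplier for dp, n in per_pair.items()}

print(max_extra_enclosures("7400", nodes=4))  # {'DP1': 8, 'DP2': 10}
```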

The procedure to add an additional drive enclosure is:

  1. Rack the Drive Enclosure
  2. Install Power & Cooling Modules
  3. Power On
  4. Install Hard Drives
  5. Run the command ‘servicecage startfc’; this will move all I/O to Node 1 (remember Node 0 is the first node)
  6. Connect the SAS cable; the first connection should be out of IFC 0 and into IFC 0 on the new Drive Enclosure
  7. Run the command ‘servicecage endfc’ and this will restore I/O to Node 0.
  8. Repeat the same procedure for Node 1.
  9. Connect the Drive Enclosure to the Controller Nodes

One of the slightly tricky parts is the disk shelf cabling.  Some rules to follow:

  • Even Nodes go to Even Controllers
  • Odd Nodes go to Odd Controllers
  • Odd Nodes connect to the highest Disk Shelf first
  • Even Nodes connect to the lowest Disk Shelf first

3PAR Disk Shelf Cabling
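The even/odd shelf-ordering rules can be expressed as a tiny helper (illustrative only; the shelf numbers are invented):

```python
# Cabling order per the rules above: even nodes start from the lowest disk
# shelf, odd nodes start from the highest.

def shelf_order(node, shelves):
    """Return the order in which a node connects to the disk shelves."""
    shelves = sorted(shelves)
    return shelves if node % 2 == 0 else list(reversed(shelves))

print(shelf_order(0, [1, 2, 3]))  # [1, 2, 3]  (even node: lowest first)
print(shelf_order(1, [1, 2, 3]))  # [3, 2, 1]  (odd node: highest first)
```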

Run the showcage command to verify your new Disk Enclosure is recognised.

showcage

Disk Upgrade Rules

These are the golden rules which need to be followed.

  1. You need to add the same number of disk drives to the Drive Enclosure as are in the Node Enclosure, e.g. if you are using 24 disks in your Node Enclosure, you will need to add 24 disks to your Drive Enclosure.
  2. When adding disks to a StoreServ 7200 without a Disk Enclosure, they should be added in pairs and placed in the lowest slots.  On a 2.5″ Disk Enclosure this is left to right.  On a 3.5″ Disk Enclosure this is per column, left to right, and top to bottom within the column.
  3. For a four-node StoreServ 7400 without a Disk Enclosure, the same rules apply except you have to add four disks at a time.
  4. If you have a StoreServ 7200 with a Disk Enclosure, you would need to add a minimum of four disks: two to the Node Enclosure and two to the Drive Enclosure.
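As a final sanity check, the batch-size rules above can be sketched like so (a simplification of the text, not HP’s official sizing tooling):

```python
def min_disk_batch(model, nodes=2, has_disk_enclosure=False):
    """Minimum number of disks to add at a time, per the golden rules above
    (simplified: one pair per node pair, mirrored into the Drive Enclosure
    when one is present)."""
    node_pairs = nodes // 2
    per_enclosure = 2 * node_pairs        # pairs on 2 nodes, fours on 4 nodes
    enclosures = 2 if has_disk_enclosure else 1
    return per_enclosure * enclosures

print(min_disk_batch("7200"))                           # 2 (add in pairs)
print(min_disk_batch("7400", nodes=4))                  # 4 (four at a time)
print(min_disk_batch("7200", has_disk_enclosure=True))  # 4 (2 + 2)
```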