Application Gateway WAF, does it Load Balance?

I was recently working on a project in which we were using an Application Gateway with WAF to send traffic to certain destinations based on URL path.

During a conference call with the application developer and a Microsoft Cloud Solution Architect, I was asked: what are you going to use to load balance the backend pools?

I initially responded that the WAF would, as it polls the backend pool to determine which VMs to send traffic to, so logically it should include a load balancer.  But hold on a minute, I have never seen any settings for load balancing rules.  In comes that moment of doubt when someone from Microsoft questions you.

After trawling through the documentation, I was able to find references to load balancing in the main product overview, along with internal load balancer configuration, but what about external connections?

I was able to find this golden nugget of information, written by David Sanchez, entitled Azure Application Gateway uses the Load Balancer.  This confirms that load balancing is built in to the Application Gateway by default, using an algorithm to provide load balancing services.

So in short, yes, the Application Gateway WAF does include a load balancer; it is just built in and therefore shielded from configuration choices.

 

App Service Environment or Web App

I have been asked a couple of times: when should you consider using an App Service Environment over a standard App Service Web App?

App Service Environment

An App Service Environment (ASE) provides an isolated and dedicated container to run a number of services such as:

  • Web Apps
  • Mobile Apps
  • Functions

An ASE does not replace an App Service Web App; it just provides a secure space for it to run.

At a high level you should consider using an ASE, if you meet one of the following conditions:

  • Access to the management plane is only available within your VNET and not from the internet
  • The Web App cannot be internet facing and therefore should be behind a Web Application Firewall
  • Communication from the Web App to PaaS DB Service should be secured within your VNET
  • Communication from the Web App to VM should be secured within your VNET

This can be logically explained in the diagram below.

Azure ASE v0.1

App Service Web App

An App Service Web App is the PaaS service which, without the ASE, is accessible directly from the internet.

The instances you run sit on shared compute, which may or may not be on the same physical server or rack.

At a high level, an App Service Web App can be integrated into other Azure services such as:

Final Thought

The requirements of the application and the business will determine whether your App Service Web App should run on a standard PaaS tier or within an App Service Environment.

It should be noted that even though an App Service Web App running in an App Service Environment is considerably more expensive than a standard App Service Web App, you can run multiple App Services within the App Service Environment.

HP FlexFabric 10Gb 2-port 534FLB Adapter – Outstanding iSCSI I/O

Problem Statement

Use of HP FlexFabric 10Gb 2-port 534FLB Adapter in a vSphere design using dependent hardware iSCSI mode.

Need to find out the number of outstanding iSCSI I/Os that the adapter can handle, to ensure that number of concurrent SCSI commands has been taken into account.

Methodology

The HP FlexFabric 10Gb 2-port 534FLB is based on the Broadcom 57810S chipset, which uses the bnx2i driver/firmware (see March 2014 VMware FW and Software Recipe).

Broadcom’s bnx2i iSCSI driver is dependent hardware iSCSI (see Cormac Hogan’s excellent blog post on vSphere 5.1 Storage Enhancements – Part 5: Storage Protocols).

A dependent hardware iSCSI adapter is a third-party adapter that presents itself as a normal NIC but has an iSCSI offload engine.  It requires the use of a VMkernel interface, which is then tied to the vmhba (HBA).

Solution

The following applies to the Broadcom 57810S chipset:

  • Total outstanding iSCSI Tasks (I/O) per port = 4096 (4K)
  • Total iSCSI Sessions per port = 128 – 2048 depending on the Operating System (Host limited)

Each iSCSI Session facilitates communication with a different Target:

  •  Total of 512 outstanding iSCSI Tasks (I/Os) per Session

Therefore, using the HP FlexFabric 10Gb 2-port 534FLB Adapter, we can have 1,024 outstanding iSCSI Tasks across its two ports, at 512 each.
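
As a rough sketch, the arithmetic above can be written out in Python.  The limits are the ones quoted above; the function name and the assumption of one iSCSI session (one target) per port are mine, purely for illustration.

```python
# Figures quoted above for the Broadcom 57810S chipset.
TASKS_PER_PORT = 4096      # total outstanding iSCSI tasks per port
TASKS_PER_SESSION = 512    # outstanding iSCSI tasks per session


def outstanding_tasks(ports: int, sessions_per_port: int) -> int:
    """Outstanding iSCSI tasks across all ports, capped at the per-port limit."""
    per_port = min(sessions_per_port * TASKS_PER_SESSION, TASKS_PER_PORT)
    return ports * per_port


# Assumption: one session (one target) per port on the 2-port adapter.
print(outstanding_tasks(ports=2, sessions_per_port=1))  # 1024
```

With more targets per port, the per-port limit of 4,096 becomes the ceiling rather than the per-session limit.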

vSphere Sizing Formula – Storage & Datastores

This blog post continues on from my previous blog post on vSphere Sizing Formula – CPU & RAM.  This is again for me to share with you my methodology when it comes to sizing the requirements for a storage design.

I’m not going to go into specific storage protocols as this can be influenced by a multitude of design decisions.

Step 1 – Datastore Size

The answer to this isn’t a one size fits all, but let me give you my considerations.

Recovery Time Objective

Your Recovery Time Objective (RTO) should help define your maximum VM size.  Let me explain what I mean by walking through an example.

Your RTO is one hour.  You have VMs which are 2TB in size because they have multiple disk partitions in the guest operating system.  You currently back up using LTO-4 tapes, which produce a restore rate of 120 MB/s.

Restore Rate x 60 Seconds x 60 Minutes = Restore Amount

120 x 60 x 60 = 432,000 MB

Restore Amount >= VM Size? Yes/No

432,000 MB (422GB) < 2TB = No, the Recovery Time Objective would be violated

Knowing this, we would dictate that the VM be split into at least 5 VMs to meet the required RTO.

So this determines our maximum VM size to be 422GB to meet the RTO.
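
The restore calculation above could be sketched in Python as follows.  The function names are hypothetical, and the conversion uses 1,024 MB per GB, which is what gives the 422GB figure.

```python
import math


def restore_amount_mb(restore_rate_mb_s: float, rto_hours: float) -> float:
    """Amount of data (MB) that can be restored within the RTO window."""
    return restore_rate_mb_s * 60 * 60 * rto_hours


def vms_needed(vm_size_gb: float, max_vm_size_gb: float) -> int:
    """Number of smaller VMs needed so each one fits inside the RTO."""
    return math.ceil(vm_size_gb / max_vm_size_gb)


amount_mb = restore_amount_mb(restore_rate_mb_s=120, rto_hours=1)
max_vm_size_gb = amount_mb / 1024          # 1,024 MB per GB

print(amount_mb)                           # 432000
print(round(max_vm_size_gb))               # 422
print(vms_needed(2 * 1024, max_vm_size_gb))  # 5
```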

VM Size

Now we know our RTO and maximum VM size of 422GB, we then move onto the required space for our VM.

Maximum VM Size – Swap File = Actual Maximum VM Size

422GB – 8GB = 414GB

Buffer Space

Buffer space is the amount of space we need on the datastore for items such as:

  • Log Files
  • Snapshots
  • Storage vMotion (temporary move space)

Rule of thumb on this is 25%.

Queue Depth

Jason Boche (@jasonboche) wrote an excellent article on how this can affect performance, called VAAI and the Unlimited VM’s per Datastore Urban Myth.

Queue depth for HBAs defaults to 32; please check with your vendor whether this should be altered for a vSphere environment.

Queue depth for the software iSCSI adapter defaults to 128; again, check with your vendor whether this should be altered for a vSphere environment.

Active IO Per VM x Number Of VM’s = Overall Active IO

9 x 500 = 4500

Overall Active IO / Number Of Hosts = Average Active IO Per Host

4500 / 12 = 375

Average Active IO Per Host / Queue Depth – Growth (Spike) % =  VM’s Per Datastore

375 / 32 – 50% = 5.9 (round up) 6

Datastore Size

VM’s Per Datastore x Actual Maximum VM Size + Buffer Size = Datastore Size

6 x 414GB + 25% = 3TB
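
A minimal sketch of the queue depth and datastore size formulas, using the example figures above.  The function names are hypothetical, and rounding the VM count up to a whole number is my assumption about how the 6 is reached.

```python
import math


def vms_per_datastore(active_io_per_vm, vm_count, host_count,
                      queue_depth, growth_pct):
    """VMs per datastore from active IO, queue depth and a growth allowance."""
    overall_active_io = active_io_per_vm * vm_count        # 9 x 500 = 4500
    active_io_per_host = overall_active_io / host_count    # 4500 / 12 = 375
    raw = active_io_per_host / queue_depth                 # 375 / 32
    # Take off the growth (spike) percentage, then round up to a whole VM.
    return math.ceil(raw * (1 - growth_pct / 100))


def datastore_size_gb(vm_count, vm_size_gb, buffer_pct):
    """VMs x VM size, plus the buffer space percentage."""
    return vm_count * vm_size_gb * (1 + buffer_pct / 100)


vms = vms_per_datastore(9, 500, 12, queue_depth=32, growth_pct=50)
size_gb = datastore_size_gb(vms, vm_size_gb=414, buffer_pct=25)

print(vms)                        # 6
print(size_gb)                    # 3105.0
print(round(size_gb / 1024, 2))   # 3.03 (TB)
```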

Step 2 – Performance Data Collection

I tend to look at performance first and capacity second.  I have discounted capacity for this blog post, as it should be straightforward to work out based on the RAID type you have chosen to use.

For the basis of the data collection we are going to make the following assumptions.

Disk                      IOPS
7.2K SATA/NearLine SAS    75
10K SAS                   125
15K SAS                   150
SSD                       2,500

Table To Show RAID Penalty

RAID Type    Write Penalty
0            1
1            2
5            4
6            6
10           2

Read/Write Collection

This is obtaining the key metrics from your physical systems, application owners or current storage array network, or perhaps from perfmon or VMware Capacity Planner.

Read MB/Sec + Write MB/Sec = Overall MB/Sec

83 MB/Sec + 12 MB/Sec = 95 MB/sec

Read MB/Sec / Overall MB/Sec = Read Percentage

83 MB/Sec / 95 MB/Sec = 87%

Write MB/Sec / Overall MB/Sec = Write Percentage

12 MB/Sec / 95 MB/Sec = 13%

Front End IOPS Collection

Front End IOPS are the I/O transfers per second that your collection tool will see.  In this case we will use 1,231 IOPS.

Back End IOPS Collection

Back End IOPS is the performance that your SAN/NAS needs to accommodate taking into account RAID write penalty.

Read % + (RAID Penalty x Write %) = RAID Penalty Percentage

RAID 1: 87% + (2 x 13%) = 113%

RAID 5: 87% + (4 x 13%) = 139%

RAID 6: 87% + (6 x 13%) = 165%

RAID 10: 87% + (2 x 13%) = 113%

RAID Penalty Percentage x Front End IOPS = Back End IOPS

RAID 1: 113% x 1,231  = 1,391 IOPS

RAID 5: 139% x 1,231  = 1,711 IOPS

RAID 6: 165% x 1,231  = 2,031 IOPS

RAID 10: 113% x 1,231  = 1,391 IOPS
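
The read/write split and the back end IOPS figures above can be reproduced with a short Python sketch.  The function names are hypothetical; percentages are rounded to whole numbers first, as in the worked example, and the write penalties come from the RAID penalty table.

```python
# Write penalties from the RAID penalty table above.
RAID_WRITE_PENALTY = {"RAID 1": 2, "RAID 5": 4, "RAID 6": 6, "RAID 10": 2}


def read_write_percent(read_mb_s, write_mb_s):
    """Return (read %, write %) rounded to whole percentages."""
    overall = read_mb_s + write_mb_s
    return round(read_mb_s / overall * 100), round(write_mb_s / overall * 100)


def back_end_iops(front_end_iops, read_pct, write_pct, write_penalty):
    """Front end IOPS scaled by the RAID penalty percentage."""
    raid_penalty_pct = read_pct + write_penalty * write_pct
    return round(front_end_iops * raid_penalty_pct / 100)


read_pct, write_pct = read_write_percent(83, 12)
print(read_pct, write_pct)   # 87 13

for raid, penalty in RAID_WRITE_PENALTY.items():
    print(raid, back_end_iops(1231, read_pct, write_pct, penalty))
```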

Step 3 – Target Disk Requirements

This is the process of determining which disk types meet your performance requirements.

Back End IOPS / Disk IOPS = Number Required Hard Drives

Table To Show Hard Drives Per RAID Type


RAID Type  Back End IOPS  Disk IOPS             Number Required Hard Drives
10         1,391          75 IOPS (7.2K SATA)   19
10         1,391          125 IOPS (10K SAS)    12
10         1,391          150 IOPS (15K SAS)    10
1          1,391          2,500 IOPS (SSD)      2 (to meet RAID 1 disk requirements)
5          1,711          75 IOPS (7.2K SATA)   23
5          1,711          125 IOPS (10K SAS)    14
5          1,711          150 IOPS (15K SAS)    12
5          1,711          2,500 IOPS (SSD)      3 (to meet RAID 5 disk requirements)
6          2,031          75 IOPS (7.2K SATA)   28
6          2,031          125 IOPS (10K SAS)    17
6          2,031          150 IOPS (15K SAS)    14
6          2,031          2,500 IOPS (SSD)      4 (to meet RAID 6 disk requirements)

Note: RAID 1 has only been included for SSD, due to IOPS performance.
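
As an illustration, the drive counts in the table can be derived with a few lines of Python.  The function name is hypothetical, and the minimum drive counts per RAID type are the general minimums for each RAID level rather than anything specific to a vendor.

```python
import math

# General minimum number of drives for each RAID type.
RAID_MIN_DISKS = {"1": 2, "5": 3, "6": 4, "10": 4}


def drives_required(back_end_iops, disk_iops, raid_type):
    """Drives needed for the IOPS, never fewer than the RAID minimum."""
    by_performance = math.ceil(back_end_iops / disk_iops)
    return max(by_performance, RAID_MIN_DISKS[raid_type])


print(drives_required(1391, 75, "10"))    # 19
print(drives_required(1711, 125, "5"))    # 14
print(drives_required(2031, 2500, "6"))   # 4
```

For SSD, a single drive would meet the IOPS on its own, so the RAID minimum is what sets the count.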

Step 4 – Growth Requirements

This is the amount of performance or capacity increase that is required to meet business or application objectives; in this case we will use 50%.


RAID Type  Back End IOPS  Growth %  Required IOPS  Disk IOPS             Number Required Hard Drives
10         1,391          50%       2,087          75 IOPS (7.2K SATA)   28
10         1,391          50%       2,087          125 IOPS (10K SAS)    17
10         1,391          50%       2,087          150 IOPS (15K SAS)    14
1          1,391          50%       2,087          2,500 IOPS (SSD)      2 (to meet RAID 1 disk requirements)
5          1,711          50%       2,567          75 IOPS (7.2K SATA)   35
5          1,711          50%       2,567          125 IOPS (10K SAS)    21
5          1,711          50%       2,567          150 IOPS (15K SAS)    18
5          1,711          50%       2,567          2,500 IOPS (SSD)      3 (to meet RAID 5 disk requirements)
6          2,031          50%       3,047          75 IOPS (7.2K SATA)   41
6          2,031          50%       3,047          125 IOPS (10K SAS)    25
6          2,031          50%       3,047          150 IOPS (15K SAS)    21
6          2,031          50%       3,047          2,500 IOPS (SSD)      4 (to meet RAID 6 disk requirements)
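
The growth step can be sketched the same way.  The function names are hypothetical, and rounding the required IOPS up to a whole number is my assumption; it matches the figures in the table.

```python
import math


def required_iops(back_end_iops, growth_pct):
    """Back end IOPS plus the growth percentage, rounded up."""
    return math.ceil(back_end_iops * (100 + growth_pct) / 100)


def drives_for_growth(back_end_iops, growth_pct, disk_iops):
    """Drives needed to deliver the required IOPS after growth."""
    return math.ceil(required_iops(back_end_iops, growth_pct) / disk_iops)


print(required_iops(1391, 50))            # 2087
print(drives_for_growth(1391, 50, 75))    # 28
print(drives_for_growth(1711, 50, 150))   # 18
print(drives_for_growth(2031, 50, 125))   # 25
```

The SSD rows are still governed by the RAID minimum drive count, which this sketch leaves out.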

Considerations

Most storage vendors will have some kind of caching; you can use this to decrease the number of disks required, or you can use it as an added performance bonus.

Latency hasn’t been taken into account; the rule for this is that fewer hops and less distance travelled should equal less latency, e.g. DAS is faster than SAN.

vSphere Sizing Formula – CPU & RAM

This is a blog post I have been meaning to do for a while.  Essentially, the purpose behind it is for me to share the methodology I use for sizing the requirements for a physical ESXi host design.
I’m sure you know the information in this post isn’t exactly new; it is based on my own experiences and on reading a number of materials, which include:

VMware vSphere Design – Forbes Guthrie & Scott Lowe
Managing & Optimizing VMware vSphere Deployments – Sean Crookston & Harley Stagner
Designing VMware Infrastructure – Scott Lowe

Step 1 – Data Collection
This is obtaining the key metrics from your physical systems.  You might get this data manually using perfmon or using tools such as VMware Capacity Planner.
 
One thing to be wary of is that if you have a system at 100% utilization, you don’t always know what extra resources might be required.  For RAM it isn’t so bad, as you can take paging into account; however, with CPU you have to make a judgement call.
 
CPU Data Collection
 
Average CPU per physical (MHz) x Average CPU Count = Average CPU per physical system
 
2,000MHz x 4 = 8,000MHz
 
Average CPU per physical system x Average peak CPU utilization (percentage) = Average peak CPU utilization (MHz)
 
8,000MHz x 12% = 960MHz
 
Average peak CPU utilization (MHz) x Number of concurrent VM’s = Total peak CPU utilization (MHz)
 
960MHz x 50 = 48,000MHz
 
RAM Data Collection
 
Average RAM per physical (MB) x Average Peak RAM utilization (percentage) = Average peak RAM utilization (MB)
 
4,000MB x 52% = 2,080MB
 
Average peak RAM utilization (MB) x Number of concurrent VM’s = Total peak RAM utilization (MB)
 
2,080MB x 50 = 104,000MB
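
The CPU and RAM data collection formulas follow the same pattern, so a single hypothetical helper can cover both.  This is only a sketch using the example figures above.

```python
def total_peak_utilization(avg_capacity, peak_utilization_pct, vm_count):
    """Average capacity x peak utilization % x number of concurrent VMs."""
    avg_peak = avg_capacity * peak_utilization_pct / 100
    return avg_peak * vm_count


# CPU: 2,000MHz x 4 CPUs per physical system, 12% peak, 50 VMs.
cpu_mhz = total_peak_utilization(2000 * 4, 12, 50)
# RAM: 4,000MB per physical system, 52% peak, 50 VMs.
ram_mb = total_peak_utilization(4000, 52, 50)

print(cpu_mhz)   # 48000.0
print(ram_mb)    # 104000.0
```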
 
Step 2 – Target Host Specification
These are your target systems.  Remember to think about items such as server build limitations, e.g. 2 sockets with 6 cores on blades, DIMM slots, and other factors such as license requirements or the physical space which is available.
I also try and factor in maximum capacity at this point; some people like to call this head room or growth.
 
Host CPU Specification
 
CPU sockets per host x Cores per socket = Cores per host
 
2 x 6 = 12
 
Cores per host x MHz per core = MHz per host
 
12 x 2,000MHz = 24,000MHz
 
MHz per host x Maximum CPU utilization per host (percentage) = CPU available per host
 
24,000MHz x 80% = 19,200MHz
 
Host RAM Specification
 
RAM per host x Maximum RAM utilization per host (percentage) = RAM available per host
 
80,000MB x 70% = 56,000MB
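
As a minimal sketch, the host specification formulas could be written like this.  The function names are hypothetical; the values are the example figures above.

```python
def cpu_available_mhz(sockets, cores_per_socket, mhz_per_core, max_util_pct):
    """Usable CPU per host after applying the maximum utilization limit."""
    mhz_per_host = sockets * cores_per_socket * mhz_per_core
    return mhz_per_host * max_util_pct / 100


def ram_available_mb(ram_per_host_mb, max_util_pct):
    """Usable RAM per host after applying the maximum utilization limit."""
    return ram_per_host_mb * max_util_pct / 100


print(cpu_available_mhz(2, 6, 2000, 80))   # 19200.0
print(ram_available_mb(80000, 70))         # 56000.0
```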
 
Step 3 – Number of Hosts
Based around the above details, we can now work out the number of hosts required to meet our needs for CPU, RAM and redundancy.

Hosts Per CPU Specification

Total peak CPU utilization (MHz) / CPU (MHz) per host = Number of hosts required for CPU
 
48,000MHz / 19,200MHz = 2.5 (round up) 3 Hosts
 
Number of hosts required for CPU + redundancy = Number of hosts for N+?
 
3 + 1 = 4
 
Hosts Per RAM Specification
 
Total peak RAM utilization (MB) / RAM (MB) per host = Number of hosts required for RAM
 
104,000MB / 56,000MB = 1.9 (round up) 2 Hosts
 
Number of hosts required for RAM + redundancy = Number of hosts for N+?
 
2 + 1 = 3
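
Putting the last step into Python, as a sketch only.  The function name is hypothetical, and taking the larger of the CPU and RAM results as the final host count is my assumption about how the two numbers would be combined.

```python
import math


def hosts_required(total_peak, available_per_host, redundancy=1):
    """Hosts needed for the workload (rounded up), plus redundancy hosts."""
    return math.ceil(total_peak / available_per_host) + redundancy


cpu_hosts = hosts_required(48000, 19200)    # 2.5 rounds up to 3, plus 1
ram_hosts = hosts_required(104000, 56000)   # rounds up to 2, plus 1

print(cpu_hosts)                   # 4
print(ram_hosts)                   # 3
print(max(cpu_hosts, ram_hosts))   # 4
```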
 
Tables
Depending on how you prefer to present or calculate out your solutions, you might prefer to work with tables.
 
Table to Show CPU & RAM Requirements 

Performance Metric                                      Recorded Value
Average number of CPU per physical system               4
Average CPU MHz                                         2,000MHz
Average peak CPU utilization per physical system        12% (960MHz)
Number of concurrent virtual machines                   50
Total CPU resources for all virtual machines at peak    48,000MHz
Average amount of RAM per physical system               4,000MB
Average peak memory utilization per physical system     52% (2,080MB)
Number of concurrent virtual machines                   50
Total RAM resources for all virtual machines at peak    104,000MB
  
Table to Show Host CPU & RAM Specification 

Attribute                                    Specification
CPU sockets per host                         2
Cores per CPU                                6
MHz per CPU core                             2,000MHz
Total CPU MHz per host                       24,000MHz
Maximum CPU utilization per host (growth)    80%
CPU available per host                       19,200MHz
RAM per host                                 80,000MB
Maximum RAM utilization per host (growth)    70%
RAM available per host                       56,000MB
 
Table to Show Hosts Per CPU & RAM Specification 

Attribute                               Specification
Total peak CPU utilization              48,000MHz
CPU available per host                  19,200MHz
Hosts required for CPU (round up)       3
Redundancy                              1
Number of hosts for CPU & redundancy    4
Total peak RAM utilization              104,000MB
RAM available per host                  56,000MB
Hosts required for RAM (round up)       2
Redundancy                              1
Number of hosts for RAM & redundancy    3

Note: I haven’t included TPS savings as these will negate memory overhead for CPU.