Microsoft Azure Concepts – Networks

The purpose of this post is to explain the different networking options within Azure; it is meant to be an overview rather than a deep dive into each area.

Endpoints

Endpoints are the most basic configuration offering when it comes to Azure networking.  Each virtual machine is externally accessible over the internet using RDP and Remote PowerShell, with port forwarding used to reach the VM.  For example, azure.vmfocus.com resolves to the public IP 12.3.4.1, and public port 6510 is forwarded to the internal VM on 10.0.0.1:3389.

[Diagram: Azure Input Endpoints]

  • Public IP Address (VIP) is mapped to the Cloud Service Name e.g. azure.vmfocus.com
  • The port forwarding can be changed if required, additional services can be opened, or the default RDP and Remote PowerShell endpoints can be closed
  • It is important to note that the public IP is completely open and the only security offered is password authentication into the virtual machine
  • Each virtual machine has to have an exclusive public port mapped (see the diagram and PowerShell sketch below)

[Diagram: Azure Input Endpoints – Multiple VMs]
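
For illustration, here is a minimal sketch of adding an input endpoint with the classic (Service Management) Azure PowerShell module; the service, VM and port values are hypothetical.

  # Forward public port 8080 on the cloud service to port 80 inside the VM
  Get-AzureVM -ServiceName "azurevmfocus" -Name "Web01" |
      Add-AzureEndpoint -Name "Web" -Protocol tcp -PublicPort 8080 -LocalPort 80 |
      Update-AzureVM
  # Existing endpoints (including the RDP/PowerShell defaults) can be listed with
  # Get-AzureEndpoint and removed with Remove-AzureEndpoint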

Endpoint Access Control Lists

To provide some mitigation against having virtual machines completely exposed to the internet, you can define a basic access control list (ACL).  The ACL is based on the source public IP address, with a permit or deny action for a virtual machine endpoint; a sketch of configuring one with PowerShell follows the list below.

  • Maximum of 50 rules per virtual machine
  • Processing order is from top down
  • Inbound traffic only
  • Suggested configuration would be to whitelist the on-premises external public IP address
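
As a sketch only (classic Azure PowerShell module, hypothetical names and a documentation IP address), whitelisting an on-premises public IP on an endpoint looks roughly like this:

  # Build an ACL that only permits the on-premises public IP; everything else is denied
  $acl = New-AzureAclConfig
  Set-AzureAclConfig -AddRule -ACL $acl -Order 0 -Action Permit `
      -RemoteSubnet "203.0.113.10/32" -Description "On-premises office"

  # Re-apply the endpoint with the ACL attached
  Get-AzureVM -ServiceName "azurevmfocus" -Name "Web01" |
      Set-AzureEndpoint -Name "Web" -Protocol tcp -PublicPort 8080 -LocalPort 80 -ACL $acl |
      Update-AzureVM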

Network Security Groups

Network Security Groups (NSG) are essentially traffic filters.  They can be applied to the ingress path (before traffic enters a VM or subnet) or the egress path (as traffic leaves a VM or subnet).  A PowerShell sketch of creating an NSG follows the note below.

  • All traffic is denied by default
  • Source and destination port ranges
  • UDP or TCP protocol can be defined
  • Maximum of 1 NSG per VM or Subnet
  • Maximum of 100 NSGs per Azure Subscription
  • Maximum of 200 rules per NSG

Note: You can only apply either an ACL or an NSG to a VM, not both.
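
For reference, a minimal sketch using the classic Azure PowerShell module; the NSG name, location and addresses are hypothetical, and associating the NSG with a subnet or VM is a separate step.

  # Create the NSG, then add an inbound allow rule for HTTPS from the office IP
  New-AzureNetworkSecurityGroup -Name "NSG-Web" -Location "West Europe" -Label "Web tier"
  Get-AzureNetworkSecurityGroup -Name "NSG-Web" |
      Set-AzureNetworkSecurityRule -Name "Allow-HTTPS" -Type Inbound -Priority 100 -Action Allow `
          -SourceAddressPrefix "203.0.113.10/32" -SourcePortRange "*" `
          -DestinationAddressPrefix "*" -DestinationPortRange "443" -Protocol TCP
  # Everything not explicitly allowed remains denied by the default rules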

Load Balancing

Multiple virtual machines can be given the same public port, for example 80.  Azure load balancing then distributes incoming traffic between them using round robin; a sketch of creating a load-balanced endpoint with a health probe is shown after the diagram below.

  • Health probes can be used, checking a private internal port every 15 seconds to ensure the service is running
  • TCP probes expect a TCP ACK in response
  • HTTP probes expect an HTTP 200 response
  • If the probe fails twice, traffic to the virtual machine stops.  However, the probe continues to ‘beacon’ the virtual machine, and once a response is received the VM is re-entered into round-robin load balancing

[Diagram: Azure Load Balancing]
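
A sketch of a load-balanced set with a TCP health probe, again using the classic module with hypothetical names; the same endpoint definition is added to each VM in the set:

  # Repeat for each VM that should sit behind public port 80
  "Web01", "Web02" | ForEach-Object {
      Get-AzureVM -ServiceName "azurevmfocus" -Name $_ |
          Add-AzureEndpoint -Name "HTTP" -Protocol tcp -PublicPort 80 -LocalPort 80 `
              -LBSetName "WebLB" -ProbeProtocol tcp -ProbePort 80 |
          Update-AzureVM
  }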

Virtual Networks

Virtual networks (VNETs) enable you to create secure, isolated networks within Azure and to maintain persistent IP addresses.  They are used for virtual machines which require static IP addresses; a sketch of assigning one follows the list below.

  • Enable you to extend your trust boundary to federate services, whether this is Active Directory replication using AD Connect or hybrid cloud connections
  • Can perform internal load balancing within a virtual network, using the same principle as load-balanced endpoints
  • VLANs do not exist in Azure, only VNETs
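
As an illustration of keeping a persistent address, here is a minimal classic PowerShell sketch (hypothetical VNET, service and IP values) for reserving a static VNET IP on a VM:

  # Check the address is free in the virtual network, then reserve it for the VM
  Test-AzureStaticVNetIP -VNetName "VNET-Prod" -IPAddress "10.0.0.10"
  Get-AzureVM -ServiceName "azurevmfocus" -Name "DC01" |
      Set-AzureStaticVNetIP -IPAddress "10.0.0.10" |
      Update-AzureVM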

Hybrid Options

This is probably the most interesting part for me, as this provides the connectivity from your on-premises infrastructure to Azure.

Point to Site

Point to site uses certificate based authentication to create a VPN tunnel from a client machine to Azure.

  • Maximum of 128 client machines per Azure Gateway
  • Maximum bandwidth of 80 Mbps
  • Data is sent over an encrypted tunnel via certificate authentication on each individual client machine
  • No performance commitment from Microsoft (makes sense as they don’t control the internet)
  • Once created, certificates can be deployed to domain-joined client devices using Group Policy
  • Machine authentication not user authentication

[Diagram: Azure Point to Site]

Site to Site

Site to site sends data over an encrypted IPSec tunnel.

  • Requires a public IP address as the source tunnel endpoint and a physical or virtual device that supports IPSec with the following:
    • IKE v1/v2
    • AES 128/256
    • SHA1/SHA2
  • Microsoft keep a known compatible device list located here
  • Requires manual addition of new virtual networks and on-premises networks
  • Again no performance commitment from Microsoft
  • Maximum bandwidth of 80 Mbps
  • The gateway roles in Azure have two instances active/passive for redundancy and an SLA of 99.9%
  • Can use RRAS to create the IPSec tunnel if you feel so inclined
  • Certain devices have automatic configuration scripts generated in Azure

[Diagram: Azure Site to Site]

Express Route

A dedicated connection is created either via an exchange provider or a network service provider, using a private dedicated network rather than the public internet.

  • Bandwidth options range from 10 Mbps to 10 Gbps
  • Committed bandwidth and SLA of 99.99%
  • Predictable network performance
  • BGP is the routing protocol used with ‘private peering’
  • Not limited to VM traffic also Azure Public Services can be sent across Express Route
  • Exchange Providers
    • Provide datacenters in which they connect your rack to Azure
    • Provide unlimited inbound data transfer as part of the exchange provider package
    • Outbound data transfer is included in the monthly exchange provider package but will be limited
  • Network Service Provider
    • Customers who use MPLS providers such as BT & AT&T can add Azure as another ‘site’ on their MPLS circuit
    • Unlimited data transfer in and out of Azure

[Diagram: Azure Express Route]

Traffic Manager

Traffic Manager is a DNS-based load balancer that offers three load balancing algorithms:

  • Performance
    • Traffic Manager makes the decision on the best route for the client to the service it is trying to access based on hops and latency
  • Round Robin
    • Alternates between a number of different locations
  • Failover
    • Traffic always hits your chosen datacentre unless there is a failover scenario

Traffic Manager relies on mapping your DNS domain to x.trafficmanager.net with a CNAME, e.g. vmfocus.com to vmfocustm.trafficmanager.net.  Cloud Service URLs in global datacentres are then mapped to the Traffic Manager profile, e.g. east.vmfocus.com, west.vmfocus.com and north.vmfocus.com.

[Diagram: Azure Traffic Manager]
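
A rough sketch of building a Performance profile with the classic Azure PowerShell module; the profile name, domain and cloud service endpoints are hypothetical:

  # Create the profile, add two cloud service endpoints, then commit the changes
  $tm = New-AzureTrafficManagerProfile -Name "vmfocustm" -DomainName "vmfocustm.trafficmanager.net" `
      -LoadBalancingMethod "Performance" -Ttl 30 `
      -MonitorProtocol "Http" -MonitorPort 80 -MonitorRelativePath "/"
  $tm = Add-AzureTrafficManagerEndpoint -TrafficManagerProfile $tm `
      -DomainName "vmfocus-east.cloudapp.net" -Type "CloudService" -Status "Enabled"
  $tm = Add-AzureTrafficManagerEndpoint -TrafficManagerProfile $tm `
      -DomainName "vmfocus-west.cloudapp.net" -Type "CloudService" -Status "Enabled"
  Set-AzureTrafficManagerProfile -TrafficManagerProfile $tm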

Microsoft Azure Concepts – Failures

One of the key concern areas for clients who are considering migrating workloads into Microsoft Azure is failures.  Why would this be a concern; isn’t it Microsoft’s responsibility to ensure that they meet the 99.9% or greater SLA?  Well, the answer is no.

It is up to you to ensure that your applications are ‘cloud ready’ and can be split between fault and update domains to achieve the stated 99.95% SLA.

This means the onus is on you to ensure that your application is split across geographic locations with multiple instances, with global site load balancing in place along with data integrity and zero data loss if you lose an instance member.  Of course all of your on-premises applications have been designed to be cloud ready, erm yeah right!

So, knowing that most of our on-premises applications aren’t designed to be ‘cloud ready’, what is the impact and expected behaviour outside of Microsoft’s mandated SLA with availability sets?
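
Before looking at the Fabric Controller, here is a minimal classic PowerShell sketch of what ‘cloud ready’ placement looks like: two VMs in the same availability set so they are spread across fault and update domains.  The image filter, service and credential values are hypothetical, and the cloud service is assumed to already exist.

  # Pick a Windows Server 2012 R2 image from the gallery (hypothetical filter)
  $image = (Get-AzureVMImage | Where-Object { $_.Label -like "Windows Server 2012 R2*" } |
      Select-Object -First 1).ImageName

  # Both VMs join the "AS-Web" availability set
  "Web01", "Web02" | ForEach-Object {
      New-AzureVMConfig -Name $_ -InstanceSize "Small" -ImageName $image -AvailabilitySetName "AS-Web" |
          Add-AzureProvisioningConfig -Windows -AdminUsername "vmadmin" -Password "Ch4ngeM3!" |
          New-AzureVM -ServiceName "azurevmfocus"
  }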

Fabric Controller

This is where we need to introduce the Azure Fabric Controller.  Each Microsoft Azure datacentre is split into clusters which are a grouping of racks.  These provide compute and storage resources. Each cluster is managed by a Fabric Controller which is a distributed stateful application running across servers spread across racks.  The purpose of the Fabric Controller is to perform the following operations:

  • Co-ordinates infrastructure updates across update domains
  • Manages the health of the compute services
  • Maintains services availability by monitoring the software and hardware health
  • Co-ordinates placement of VMs in Availability Sets
  • Orchestrates deployment across nodes within a cluster

[Diagram: Fabric Controller]

The Fabric Controller receives heartbeats from the physical host and also the guest virtual machines running on the host.

[Diagram: Fabric Controller Agents]

Now that we understand the architecture, let’s cover a couple of failure scenarios.

Guest VM Unresponsive

If the Fabric Controller fails to receive a number of heartbeats from the guest VM, the VM is restarted on the same physical host.

Physical Host Failure

In the event of a physical host failure, the virtual machine is powered on on a different physical host.  To do this your virtual machine must be protected by Locally Redundant Storage (LRS maintains three synchronous copies of the data within the same datacentre).

The Fabric Controller determines which compute node has the same level of storage that your original VM was on, changes the read-only VHD to read/write and then powers the VM back on.

Final Thought

To achieve the 99.95% SLA you need applications which are ‘cloud ready’.  However, you are still protected against guest VM and physical host failures in much the same way as with on-premises vSphere HA or Hyper-V HA, although, as mentioned in this post, Microsoft does not provide an SLA against this.

Interestingly Microsoft does not provide an SLA against a datacentre failure.  It is only when Microsoft declares a datacentre lost that the geo-replicated copies of your storage become available.  Due to this it is important that you understand that you have zero control over the datacentre failover process.

Microsoft Azure Concepts – Backups

Backups are really important when it comes to returning service after an issue or for meeting compliance or regulatory requirements.  The inability to recover from loss of data can make a business bankrupt in a short space of time.

You might say that having your data in the cloud, which is highly available, makes backups someone else’s problem.  Well, that isn’t correct: what happens if you have data corruption, failure of a service or application after an update, or perhaps a virus?  Having two copies of the data just means you have two corrupted copies!

We need to be able to go back in time to recover from an unplanned event.  This is where Microsoft Azure Backup steps into the ring!

What Is Azure Backup?

In a nutshell, Azure Backup enables you to back up on-premises or Azure virtual machines using your Azure subscription.  This may sound a bit bizarre, but until recently there was no supported way to back up Azure virtual machines.

When you initially create your Azure Backup Vault you are able to decide if you want your backups to be locally or geo redundant.

[Diagram: Azure Backup]

Backing Up Azure Virtual Machines

Backing up Azure virtual machines is fairly straightforward; it’s a three-step process which is intuitive in the Azure Backup blade.

  1. Discover the virtual machines you want to back up
  2. Apply a backup policy to the virtual machines
  3. Back up the virtual machines

Each VM that you want to protect uses the Azure VM agent to co-ordinate backup tasks.  The best way to think of the Azure VM agent is as an extension running within the virtual machine.  When a backup is triggered, the Azure VM agent leverages VSS to take a point-in-time snapshot of the VM.  This data is then transferred to the Azure Backup Vault.

[Diagram: Azure VM Backup]

A few things I should call out as limitations of the current version of Azure Backup:

  • Backing up V2 Azure virtual machines (created by Resource Manager) is not supported
  • Unable to back up VMs with Premium storage
  • On restore you have to delete the original VM and then restore it
  • Restoring VMs with multiple NICs, or that perform Domain Controller roles, is only available via PowerShell

Backing Up On-Premises Virtual Machines

When Microsoft do things, they don’t like to mess about.  What do I mean? Well they are giving you an enterprise backup solution for the cost of an Azure Storage Account!

They don’t even stop at that: they enable you to back up your on-premises virtual machines to disk first and then, if you want, to your Azure Backup Vault afterwards.

Essentially, it’s Data Protection Manager, but with some of the functionality removed.  After creating an Azure Backup Vault you are entitled to download and install the Azure Backup Server software to an on-premises server.  To push out the DPM agent to virtual machines you need to enter an account that has local administrator rights over these VMs.

So what functionality is missing?

  • Azure Backup Server supports 50 on-premises virtual machines per server; beyond that a new Azure Backup Vault with new vault credentials is required, which in turn means another on-premises Azure Backup Server
  • Azure Backup Server does not support tape; instead it uses the Azure Backup Vault for archiving
  • Azure Backup Server does not allow you to manage multiple Azure Backup Servers from a single console (think of 200 VMs being backed up: you would have to log in to four consoles)

The supported operating systems are shown in the table below.  Note that client operating systems can only be backed up to disk, as these are unsupported in Azure.

Operating System                               | Platform | SKU
Windows 8 and latest SPs                       | 64-bit   | Enterprise, Pro
Windows 7 and latest SPs                       | 64-bit   | Ultimate, Enterprise, Professional, Home Premium, Home Basic, Starter
Windows 8.1 and latest SPs                     | 64-bit   | Enterprise, Pro
Windows 10                                     | 64-bit   | Enterprise, Pro, Home
Windows Server 2012 R2 and latest SPs          | 64-bit   | Standard, Datacenter, Foundation
Windows Server 2012 and latest SPs             | 64-bit   | Datacenter, Foundation, Standard
Windows Storage Server 2012 R2 and latest SPs  | 64-bit   | Standard, Workgroup
Windows Storage Server 2012 and latest SPs     | 64-bit   | Standard, Workgroup
Windows Server 2012 R2 and latest SPs          | 64-bit   | Essential
Windows Server 2008 R2 SP1                     | 64-bit   | Standard, Enterprise, Datacenter, Foundation
Windows Server 2008 SP2                        | 64-bit   | Standard, Enterprise, Datacenter, Foundation

Final Thoughts

So why are Microsoft doing this?  My thoughts are they want customers to start using Azure storage to replace on-premises tapes.  For those who are used to DPM, this could be a natural extension to your existing backup policy.

Microsoft Azure Concepts – Virtual Machines

Virtualisation has been around for a number of years, with companies such as VMware being formed in 1998.  We are used to the abstraction of compute resources from the underlying hardware to provide logical isolation of virtual machines, driving greater consolidation and providing a better return on investment.

Microsoft have taken the concept of a virtual machine and broken it down further to make the consumption of Azure resources easier depending on the purpose of the virtual machine.  To explain this further, let’s delve into the three types of Azure virtual machines.

App Service

The purpose of the App Service is to allow the developer to create and manage websites and applications without needing to worry about managing and maintaining the underlying operating system.

The App Service Plan you choose entitles you to a number of features for your website, which are:

  • Amount of disk space
  • SLA
  • Auto Scale
  • Geo Distributed Deployment
  • Custom Domain
  • Staging Environment

Most businesses would choose the Standard Tier as this provides an SLA of 99.95% along with Auto-Scale.  More information on the types of App Service plans can be found here.

Cloud Services

Azure Cloud Services are the middle ground between an App Service and a traditional virtual machine.

Cloud Service virtual machines provide two options which are:

  • Web Role, which runs Windows Server and IIS
  • Worker Role, which runs Windows Server without IIS

When you create a Cloud Service you specify how many of each type you require and the underlying virtual machine, configuration and patching is performed by Microsoft.  You do however get the ability to RDP onto the virtual machine and install services.

It is important to note that Cloud Service virtual machines do not provide persistent storage, so this needs to be handled in the application architecture.  They do, however, support load balancing and fault tolerance via built-in availability sets.

Cloud Services virtual machines use the same categories of pricing and specification as traditional virtual machines, which I will cover later in this blog post.

Virtual Machines

Azure Virtual Machines provide the most control.  You are responsible for deployment, patching, load balancing and availability of the operating system and in-guest application, pretty much the same as when you run virtual machines on-premises.

Each virtual machine comes with a persistent operating system disk and a non-persistent temporary drive which can be used to store transient data or perform storage-related tasks whilst the VM is powered on; a sketch of adding a persistent data disk is shown below.
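
A minimal classic PowerShell sketch of adding a persistent data disk (the service, VM and size values are hypothetical):

  # Attach a new, empty 100 GB persistent data disk; the temporary drive (typically D:)
  # should only be used for data you can afford to lose
  Get-AzureVM -ServiceName "azurevmfocus" -Name "App01" |
      Add-AzureDataDisk -CreateNew -DiskSizeInGB 100 -DiskLabel "Data01" -LUN 0 |
      Update-AzureVM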

The diagram below shows the complexity of management per Azure virtual machine type.

[Diagram: Azure VM Management]

Great stuff, now we understand the types of virtual machines, what about virtual machines sizes, performance and capacity?

Virtual Machine Sizes

When a number of virtual machines are accessing the same underlying physical hardware, there have to be winners and losers when resources become contended.  The type of virtual machine you purchase determines the performance characteristics available.

In Azure, Microsoft break down virtual machines into three categories which are A, D and G series.

A Series virtual machines are designed to run your standard everyday workloads.  They provide up to 500 IOPS per disk and can support up to the following specifications:

  • 56GB RAM
  • 8 CPU Cores
  • 4 NICs
  • 16 Data Disks at 500 IOPS each

Microsoft offer a subset of the A Series called Compute Intensive which provides up to the following specifications:

  • 112GB RAM
  • 16 CPU Cores
  • 4 NICs
  • 16 Data Disks at 500 IOPS each

D Series virtual machines are designed to run more intensive workloads, with the key difference being that the temporary disk is backed by SSD.  They provide up to 500 IOPS per disk and can support up to the following specifications:

  • 112GB RAM
  • 16 CPU Cores
  • 8 NICs
  • 32 Data Disks at 500 IOPS each
  • 800GB Temporary SSD

Microsoft offer the Dv2 Series virtual machine which provides a slightly better-specification CPU.  The more interesting subset of the D Series is the DS.  This category provides shared SSD for the operating system disk.  Specifications are up to the following:

  • 112GB RAM
  • 16 CPU Cores
  • 8 NICs
  • 32 Data Disks at 224GB and 50,000 IOPS each with 512MB bandwidth

G Series virtual machines are massive and provide a huge amount of compute power, with specifications as follows:

  • 448GB RAM
  • 32 CPU Cores
  • 8 NICs
  • 64 Data Disks at 500 IOPS each
  • 6,144GB Temporary SSD

Again we have a subset of the G Series, which is the GS Series.  This provides the same specification with shared SSD storage for the operating system disk.

The diagram below shows the different virtual machine sizes.

[Diagram: Azure Virtual Machine Sizes]

More information on Azure Virtual Machine sizes can be found here.
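
If you want to check what is available to your subscription from PowerShell, the classic module can list the sizes; a sketch follows (the property names assume the classic module's output):

  # List VM sizes with core count, memory and the number of data disks they support
  Get-AzureRoleSize |
      Where-Object { $_.SupportedByVirtualMachines } |
      Sort-Object Cores |
      Select-Object InstanceSize, Cores, MemoryInMb, MaxDataDiskCount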

 

Final Thoughts

It is clear that Microsoft have given a lot of thought to Azure Virtual Machines, with different categories catering for different levels of ownership and performance.  It is worthwhile reviewing the Azure Subscription Limits, Quotas and Constraints documentation to ensure that your virtual machines aren’t constrained by an unknown factor.

Microsoft Azure Concepts – Storage

When I think about storage, I normally visualize hard drives in client devices, some type of enterprise shared storage, or a hyper-converged appliance that supports a single person or business in meeting its requirements for storage capacity, performance, integrity and recoverability.

To talk about Azure Storage, I need to shift my perspective slightly.  Azure Storage is truly a web-scale storage solution, currently supporting over 50 trillion (50,000,000,000,000) objects, which is a lot!

To understand Azure Storage we need to understand how it all fits together.  When you have cloud multi-tenancy you need a way to ensure that only you have access to your own data (unless of course you choose to share it).  This is the fundamental underpinning of Azure Storage: your Azure Storage Account.

Azure Storage Account

An Azure Storage Account is the gateway to accessing storage in Azure.  When it is created you receive a unique namespace which is linked to the type of storage you are going to use, for example http://storagevmfocus.blob.core.windows.net (a sketch of creating an account with PowerShell follows the list below).  This in turn links to your storage billing, which is based around four factors:

  1. Storage usage
  2. Replication of data
  3. Read and write operations
  4. Data transferred out to other Azure regions
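
As a sketch with the classic Azure PowerShell module (the account name and location are hypothetical), creating an account and retrieving its keys looks roughly like this:

  # -Type controls redundancy: Standard_LRS, Standard_ZRS, Standard_GRS,
  # Standard_RAGRS or Premium_LRS
  New-AzureStorageAccount -StorageAccountName "storagevmfocus" -Location "West Europe" -Type "Standard_GRS"

  # The access keys used to authenticate against the account's unique namespace
  Get-AzureStorageKey -StorageAccountName "storagevmfocus"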

Azure Storage Types

Now that we know that the starting point is an Azure Storage Account, what types of storage can Azure offer?  These are broken down into four areas (I’m beginning to think Microsoft like the number four) which are Blob, Table, Queue and File.

[Diagram: Azure Storage Account]

Blob Storage

Blob is the name given to Microsoft’s cost-effective cloud storage.  It is used to store large amounts of unstructured data, for example:

  • Azure VM Hard Drives
  • Documents
  • Media Files
  • Backup Files

Blobs are further broken down into different categories; this is to ensure that the storage is optimized for its intended workloads.  A sketch of uploading a blob with PowerShell follows the diagram below.

[Diagram: Azure Blob Storage]
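
A minimal sketch of uploading a file to a blob container with the classic storage cmdlets; the account, container and file path are hypothetical:

  # Build a storage context from the account name and key, then upload a local file
  $key = (Get-AzureStorageKey -StorageAccountName "storagevmfocus").Primary
  $ctx = New-AzureStorageContext -StorageAccountName "storagevmfocus" -StorageAccountKey $key
  New-AzureStorageContainer -Name "backups" -Permission Off -Context $ctx
  Set-AzureStorageBlobContent -File "C:\Backups\db.bak" -Container "backups" -Blob "db.bak" -Context $ctx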

Table Storage

Table storage is provided by Microsoft’s NoSQL offering, which is a distributed scale-out store.  Essentially it’s a repository for metadata that is captured and then needs to be accessed quickly.  Example use cases for Table storage are shown in the diagram below.

[Diagram: Azure Table Storage]

Queue Storage

Queue storage is essentially a reliable messaging solution that passes information between different tiers of an application.

[Diagram: Azure Queue Storage]

When held in the ‘queue’, data is kept until it is passed asynchronously to the application.  Example use cases for Queue storage are:

  • Communication between Websites and Applications
  • Hybrid communications between on-premises and Azure applications

File Storage

File storage is your traditional SMB 2.1 or 3.0 file share that we are used to accessing on a daily basis, for example \\vmfocus\customer_files.  Access to the file share can be granted from on-premises using the net use command with the storage key, for example:

net use z: \\vmfocusprodstorage.file.core.windows.net\vmfocusfileshare /u:vmfocusprodstorage m1G1Xatnb9NgzEjCrx1gBtQ/xpyFR4N71i6imkt38VvKCWB2bK9X==

This can then be added to a group policy login script for users.
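
Before the share can be mapped it has to exist; a minimal sketch of creating it with the classic storage cmdlets (the account and share names follow the example above):

  # Create the SMB file share in the storage account
  $key = (Get-AzureStorageKey -StorageAccountName "vmfocusprodstorage").Primary
  $ctx = New-AzureStorageContext -StorageAccountName "vmfocusprodstorage" -StorageAccountKey $key
  New-AzureStorageShare -Name "vmfocusfileshare" -Context $ctx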

An application would access the file share from an on-premises location using the REST API.

My understanding is that while SMB 3.0 file shares provide encryption and persistency, they do not provide Role Based Access Control (RBAC) via Active Directory users and groups.

Storage Redundancy

Great, you say, I have moved some of my on-premises storage to Azure, but how do I make sure that data is available?  Well, Azure offers the concept of storage redundancy, which is broken down into, yes you guessed it, four areas:

  • Locally redundant storage (LRS)
  • Zone redundant storage (ZRS)
  • Geo redundant storage (GRS)
  • Read access geo redundant storage (RA_GRS)

Locally Redundant Storage (LRS)

Data is held within the same datacentre, however it is replicated three times.  Each replica sits inside separate fault and update domains.  This uses the same concept as Availability Sets, but for storage.

Use cases include:

  • Protection from hardware failures
  • Provide redundancy inside a local datacentre to meet compliance or regulatory requirements

Zone Redundant Storage (ZRS)

Data is held across two or three datacentres either in the same region or across regions.

  • Protection from hardware failures
  • Provides a higher level of fault tolerance above LRS

Geo Redundant Storage (GRS)

Data is replicated to a second region.  Data is replicated three times in the primary region and then replicated asynchronously to a secondary region.  The purpose behind this is to ensure continued storage performance: waiting for an acknowledgement from a replicated region would slow storage responses down.

[Diagram: Geo Replicated Storage]

Read Access Geo Redundant Storage (RA_GRS)

Works in the same way as GRS, however you have read access to the data at the secondary location.  This can be useful for data mining operations where you don’t want to run against the primary set of data.

Final Thoughts

Azure storage is huge and needs to be thought about very carefully to ensure that no single part of the chain becomes the bottleneck; for example, you might have sufficient disk I/O but find the limiting factor is the network throughput from source to target.

This leads us onto the next topic of discussion which is Virtual Machines.