Using Azure Data Factory to Copy Data Between Azure File Shares – Part 1

I was set an interesting challenge by a customer: to copy the data in their Production Subscription Azure File Shares into their Development Subscription Azure File Shares. The reason behind this was to ensure that any uploads to their Production environment are kept in line with the Development environment, enabling testing to be performed on ‘live’ data.

The customer wanted something that was easy to manage and provided visibility of data movement tasks within the Azure Portal, without needing to manage and maintain PowerShell scripts.

The answer to this was Azure Data Factory.

What Is Azure Data Factory?

Azure Data Factory is a managed data integration service that enables data-driven workflows, either between on-premises and public cloud environments or within public clouds.

Pipelines

A pipeline is a logical grouping of activities that together perform a task.  The activities within the pipeline define actions to perform on data. 

Data Factory supports three types of activities: data movement activities, data transformation activities and control activities. In this use case, data movement activities will be used to copy data from the source data store to the destination data sink.

Linked Service

Linked services are used to link data stores to the Azure Data Factory, with the dataset representing the structure of the data and the linked service defining the connection to the external data source.
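To make that split concrete, here is a minimal sketch using the azure-mgmt-datafactory Python SDK. The subscription, resource group, factory and object names are placeholders of my own, and the storage account and share values follow the naming used later in this post:

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    AzureFileStorageLinkedService,
    DatasetResource,
    FileShareDataset,
    LinkedServiceReference,
    LinkedServiceResource,
    SecureString,
)

adf = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

# Linked service: the connection to the external data store (an Azure file share).
linked_service = LinkedServiceResource(
    properties=AzureFileStorageLinkedService(
        host="\\\\vmfwepsts001.file.core.windows.net\\documents",
        user_id="AZURE\\vmfwepsts001",
        password=SecureString(value="<storage-account-key>"),
    )
)
adf.linked_services.create_or_update(
    "<resource-group>", "<factory-name>", "LS_Documents_Source", linked_service
)

# Dataset: the structure and location of the data within that store.
dataset = DatasetResource(
    properties=FileShareDataset(
        linked_service_name=LinkedServiceReference(reference_name="LS_Documents_Source"),
        folder_path="documents",
    )
)
adf.datasets.create_or_update(
    "<resource-group>", "<factory-name>", "DS_Documents_Source", dataset
)
```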

Integration Runtime

For copy activities, an integration runtime is required to connect the source and sink linked services and define the direction of data flow. To ensure data locality, a custom integration runtime located in West Europe will be used.

Source Datastore

Each file share within the vmfwepsts001 Storage Account is an individual linked service. Therefore, four source linked services will be defined: data, documents, images and videos.

Sink Datastore

Each destination file share within the vmfwedsts001 Storage Account is an individual linked service. Therefore, four sink linked services will be defined: data, documents, images and videos.
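Since the eight linked services differ only by storage account and share name, a short loop keeps them consistent. This is the same hedged SDK sketch as above, with placeholder keys and naming of my own:

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    AzureFileStorageLinkedService,
    LinkedServiceResource,
    SecureString,
)

adf = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

shares = ["data", "documents", "images", "videos"]
accounts = {
    "Source": ("vmfwepsts001", "<production-account-key>"),
    "Sink": ("vmfwedsts001", "<development-account-key>"),
}

# One linked service per share, per storage account: four sources, four sinks.
for role, (account, key) in accounts.items():
    for share in shares:
        linked_service = LinkedServiceResource(
            properties=AzureFileStorageLinkedService(
                host=f"\\\\{account}.file.core.windows.net\\{share}",
                user_id=f"AZURE\\{account}",
                password=SecureString(value=key),
            )
        )
        adf.linked_services.create_or_update(
            "<resource-group>",
            "<factory-name>",
            f"LS_{share.capitalize()}_{role}",
            linked_service,
        )
```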

Copy behaviour to the sink datastore can be undertaken using three methods:

  • Preserve Hierarchy – the relative path of the source file to the source folder is identical to the relative path of the target file to the target folder
  • Flatten Hierarchy – all files from the source folder are placed in the first level of the target folder, with auto-generated names
  • Merge Files – merges all files from the source folder into one file, using an auto-generated name

To maintain the file and folder structure, preserve hierarchy copy behaviour will be used.
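A hedged sketch of the resulting pipeline, with one copy activity per share and PreserveHierarchy set on the sink; the DS_* dataset names follow the convention from the earlier sketches and are assumptions:

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    CopyActivity,
    DatasetReference,
    FileSystemSink,
    FileSystemSource,
    PipelineResource,
)

adf = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

# One copy activity per file share; PreserveHierarchy keeps the relative
# file-to-folder paths identical between source and sink.
activities = [
    CopyActivity(
        name=f"Copy_{share}",
        inputs=[DatasetReference(reference_name=f"DS_{share.capitalize()}_Source")],
        outputs=[DatasetReference(reference_name=f"DS_{share.capitalize()}_Sink")],
        source=FileSystemSource(),
        sink=FileSystemSink(copy_behavior="PreserveHierarchy"),
    )
    for share in ["data", "documents", "images", "videos"]
]

adf.pipelines.create_or_update(
    "<resource-group>",
    "<factory-name>",
    "PL_CopyProdToDev",
    PipelineResource(activities=activities),
)
```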

Tune in for the next blog post, where we will cover the configuration settings.

Azure CDN: Custom Cache Rules

It was just over a couple of years ago when I wrote the Azure CDN Concept blog post.

I was recently asked by a customer to apply caching rules to only a specific set of file extensions using a custom domain name.  So with this in mind, I thought I would share the process with you.

Step 1 – Which CDN?

Microsoft Azure provides a number of CDN products, so we need to find the correct CDN to meet the requirements, which are custom caching rules and custom domain HTTPS.

Looking at the Compare Azure CDN Product Features page shows that only Standard Verizon and Premium Verizon will meet the requirements.

In this case I will start by using Standard Verizon; we can migrate to Premium Verizon if needed.

Step 2 – Caching Rules

Azure CDN uses the HTTP caching specification RFC 7234. It should be noted that not all resources can be cached; in particular, Standard Verizon only deals with:

  • HTTP status code: 200
  • HTTP method: GET
  • File size limit: 300 GB

By default, Standard Verizon caches any HTTP 200 response for 7 days. To override this, we need to configure Global Caching Rules, which affect the caching behaviour for all requests.

In this case we want to set the caching behaviour to ‘Bypass Cache’, meaning that no content will be cached.

Next, we set our specific Custom Caching Rules, which supersede the Global Caching Rules, matching on file extension types.

We are now utilising the Standard Verizon CDN to only cache jpg, jpeg, png and gif file extensions.
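To make the effective behaviour concrete, here is a conceptual sketch of the rule evaluation; it models the outcome, not Verizon’s implementation:

```python
# Conceptual model only: custom caching rules (matched on file extension)
# supersede the global rule, which supersedes the 7-day default.
CUSTOM_RULES = {"jpg": "Cache", "jpeg": "Cache", "png": "Cache", "gif": "Cache"}
GLOBAL_RULE = "BypassCache"

def caching_behaviour(path: str) -> str:
    """Return the effective caching behaviour for a requested path."""
    extension = path.rsplit(".", 1)[-1].lower()
    return CUSTOM_RULES.get(extension, GLOBAL_RULE)

assert caching_behaviour("/assets/logo.png") == "Cache"
assert caching_behaviour("/scripts/app.js") == "BypassCache"
```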

Final Thought

In a nutshell, Custom Caching Rules override Global Caching Rules, which override Default Caching Rules.

Think of it like a game of Top Trumps; for those of you who don’t know what this is, I would suggest adding a pack to your Christmas list!

Standard SSD: Azure Backup Failure

I have been undertaking a customer deployment and thought I would share this nugget of information, which may save you some time.

Standard SSD

Even though Standard SSDs are now GA, as per this article, we are unable to back up VMs with Standard SSDs, receiving two error messages in total.

The first error occurs when the initial job to configure the backup fails with the message ‘Deployment to resource group ‘name’ failed. Additional details from the underlying API that might be helpful: At least one resource deployment operation failed. Please list deployment operations for details. Please see https://aka.ms/arm-debug for usage details.’

Azure Backup 01

Digging a bit deeper, we receive the error code ‘UserErrorGuestAgentStatusUnavailble’ with a recommended action of ‘Ensure the VM has network connectivity and the VM agent is up to date and running. For more information, please refer to https://aka.ms/guestagent-status-unavailable’.
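If you would rather check the agent status programmatically than click through the portal, a minimal sketch with the azure-mgmt-compute Python SDK looks like this; the resource names are placeholders:

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.compute import ComputeManagementClient

compute = ComputeManagementClient(DefaultAzureCredential(), "<subscription-id>")

# The instance view surfaces the VM agent version and its status messages,
# which is what the backup job is complaining about.
view = compute.virtual_machines.instance_view("<resource-group>", "<vm-name>")
print("Agent version:", view.vm_agent.vm_agent_version)
for status in view.vm_agent.statuses:
    print(status.code, "-", status.display_status)
```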

A quick reboot of the VM resolves this initial ‘configure backup’ error with Standard SSDs.

We then go to protect the VM and undertake the initial backup, and this is where the problem occurs. After two-plus hours, you will receive an error notification which states ‘The storage type is not supported by Azure Backup’.

Azure Backup 02

This is a known issue and is documented in the ‘Prepare Resource Manager Deployed VMs‘ article under the section ‘Limitations when backing up and restoring a VM’.

So for now, you can deploy VMs with Standard SSD but you can’t back up the entire VM using Azure IaaS VM Backup!

Update

Azure Backup now supports Standard SSDs; see the blog post here.

 

Application Gateway WAF, does it Load Balance?


I was recently working on a project in which we were using an Application Gateway with WAF to send traffic to certain destinations based on URL path.

During a conference call with the application developer and a Microsoft Cloud Solution Architect, I was asked the question: what are you going to use to load balance the backend pools?

I initially responded ‘the WAF’, as it is probing the backend pool to determine which VMs to send traffic to, so it should logically include a load balancer. But hold on a minute: I have never seen any settings for load balancing rules. In comes that moment of doubt when someone from Microsoft questions you.

After trawling through the documentation, I was able to find references to load balancing in the main product overview, along with internal load balancer configuration, but what about external connections?

I was able to find this golden nugget of information, written by David Sanchez, entitled ‘Azure Application Gateway uses the Load Balancer’. This confirms that load balancing is built in to the Application Gateway by default, using an algorithm to provide load balancing services.
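Conceptually, the built-in behaviour amounts to rotating requests across the healthy members of a backend pool. A toy sketch of that idea, not the actual implementation:

```python
from itertools import cycle

# Toy model: requests rotate across the healthy members of a backend pool,
# with no load-balancing rules for the administrator to configure.
backend_pool = cycle(["vm-web-01", "vm-web-02", "vm-web-03"])

for request_id in range(5):
    print(f"request {request_id} -> {next(backend_pool)}")
```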

So, in short: yes, the Application Gateway WAF does include a load balancer; it is just inbuilt and therefore shielded from configuration choices.

 

Altaro: First Impressions

In March 2018 Altaro announced v7.6 of their backup product, so I thought it was time to give the product a whirl and provide feedback on my first impressions.

Lab

As those who follow my blog know, I switched to Server 2012 R2 running Hyper-V a while ago. In this configuration I have an HPE DL360 G6 with some local SATA storage as the backup target for my Hyper-V virtual machines.

Installation

Altaro make the claim (see here) that you can be up and running, backing up your first virtual machine, within 15 minutes.

Once I had completed the simple registration form to access the Unlimited Plus Edition for 30 Days, it was time to launch the installer.

The straightforward, intuitive installer completes within a couple of minutes, and we are ready to launch the management console.

Configuration

As soon as the management console is loaded, we just need to follow the 3 steps outlined below.

Altaro 01

Connecting to the Hyper-V hosts is a straightforward process, entering the IP address and credentials you would expect.

Altaro 02

After this I entered the backup location, selected the VM which required backing up and clicked Backup.

So far so good: Altaro have validated that backups can be started within 15 minutes of installing the software.

CDP Settings

As we know, all workloads are not equal, and more critical application services require a lower recovery point objective (RPO). With Altaro, I can set CDP settings as low as 5 minutes. The part which is quite impressive is that they warn you of the impact on the hypervisor of taking such frequent snapshots.

Altaro 03

Offsite Backup

Another feature I wanted to validate was integration with Azure Storage to undertake an ‘offsite backup copy’.

After entering the Connection String for one of my Azure Storage Accounts, it was simply a case of dragging and dropping the VM I wanted to protect into the ‘Offsite Location’ bucket and finally providing a Master Encryption Key.

I was interested to see the native format of the offsite backup, to see whether it could be used to migrate VMs to Azure. Using Microsoft Azure Storage Explorer, I browsed to the storage account and viewed the VM location.

The offsite backup VM isn’t easily identifiable; I’m assuming the VM name is encrypted using the Master Encryption Key and the backup files are held in Altaro’s own format.

Altaro 06
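If you prefer scripting to Storage Explorer, the same inspection can be done with the azure-storage-blob Python package. This sketch assumes Altaro writes its offsite copy into blob containers, and the connection string is the one entered in the Altaro console:

```python
from azure.storage.blob import BlobServiceClient

# Inspect what Altaro wrote to the storage account. Assumes the offsite
# copy lands in blob containers; use the connection string entered in the
# Altaro console.
service = BlobServiceClient.from_connection_string("<connection-string>")

for container in service.list_containers():
    print("container:", container.name)
    for blob in service.get_container_client(container.name).list_blobs():
        print("  ", blob.name, blob.size)
```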

Essentially, this means that if you had a DR event on-premises and needed to restore backups from offsite, you would first need to install and configure an Altaro backup server, which isn’t a big deal in itself but adds to the overall time needed to restore business operations.

Schedules

When administering backups, scheduling is an area on which time and effort is spent. I was pleased to see that the schedules are separate for CDP and regular one-off backups.

In this scenario, I wanted to perform an on-premises backup and then follow this up with an offsite copy. A couple of clicks and this was ready to go!

Altaro 05

A bit of feedback for Altaro: it would be good to be able to name your backup schedules, as I could see identifying the right schedule becoming cumbersome.

Advanced Settings

The advanced settings enable you to control features such as deduplication, encryption, excluding ISOs/drives, and Changed Block Tracking.

Linking back to the offsite backups, it would be great to have the option to back up to Azure as a native VHD (without deduplication), as you could then spin up your VMs in Azure and use this as a migration tool or for DR scenarios.

Restore

For me, on-premises restores are a given.  I’m more interested in restoring archive data from Azure (using Altaro Retention Policy to control this).

Selecting the restore icon, I can select an Azure Storage Account, again with a decent prompt which states you will be charged for egress data.

Altaro 07

It’s now a case of dragging the backup down your internet pipe to be rehydrated by Altaro VM Backup on your selected Hyper-V host.

One of the things I would like to see is a file-level restore from an Azure Storage Account, to avoid restoring an entire VM.

Final Thought

It’s clear that Altaro have invested heavily in a slick user experience, providing simplified backup operations with a clear, concise and intuitive dashboard.

I’m sure that we will see further enhancements especially around the integration with public cloud.

If you’d like to try the software for your Hyper-V and/or VMware environments, you can download Altaro VM Backup to back up unlimited VMs for 30 days, then enjoy forever free backup for 2 VMs. Download Altaro VM Backup for free here.