Azure Updates Q4 2019

It’s been a while since I wrote a blog post on the Azure updates that caught my attention in 2019. So strap yourself in, let’s see which ones make the list!

Azure Spot Virtual Machines

Need to run some VMs for a short period of time to undertake some testing or development work? Well, look no further than Azure Spot Virtual Machines, which Microsoft uses to sell unused capacity at a discount. The catch is that when Microsoft needs the capacity back, the VMs are shut down (evicted).
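
As a rough sketch of how this surfaces in an ARM template (the API version and values here are assumptions to check against the documentation), the Spot-specific settings sit in the virtual machine properties:

{
    "type": "Microsoft.Compute/virtualMachines",
    "apiVersion": "2019-07-01",
    "name": "spot-test-vm",
    "properties": {
        "priority": "Spot",
        "evictionPolicy": "Deallocate",
        "billingProfile": {
            "maxPrice": -1
        }
    }
}

A maxPrice of -1 caps the Spot price at the standard pay-as-you-go rate, so the VM is only evicted when Azure needs the capacity back rather than on price.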

More information can be found here.

Azure Migrate – Agentless Dependency Mapping

Working out a migration to Azure for IaaS can be a bit tricky when a customer doesn’t know what makes up an application service. Step in Azure Migrate – Agentless Dependency Mapping, which can discover interdependent systems that need to be migrated together.

Coupled with this, Azure Migrate can now perform application discovery using WMI and SSH calls to determine the apps, roles and features installed.

More information can be found here.

Azure Dedicated Hosts

It’s not often in this day and age that a dedicated host is required in the public cloud, however Microsoft now has the answer with the general availability of ‘Dedicated Hosts’, which provide a hardware-isolated environment.
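
For illustration only (the API version and SKU name below are assumptions), a dedicated host is deployed as a host group containing one or more hosts, along these lines:

{
    "type": "Microsoft.Compute/hostGroups",
    "apiVersion": "2019-07-01",
    "name": "prod-host-group",
    "location": "westeurope",
    "properties": {
        "platformFaultDomainCount": 1
    }
},
{
    "type": "Microsoft.Compute/hostGroups/hosts",
    "apiVersion": "2019-07-01",
    "name": "prod-host-group/host-01",
    "location": "westeurope",
    "sku": {
        "name": "DSv3-Type1"
    },
    "properties": {
        "platformFaultDomain": 0
    }
}

VMs are then pinned to the host by referencing it in their host property at creation time.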

More information can be found here.

Azure Cost Management – CSP

It was always a tough call for customers to choose between the commercial attractiveness of a CSP and the human-readable cost information provided under an EA.

On 1st November, Microsoft enabled Azure Cost Management for CSP, meaning customers now have end-to-end visibility of costs and budgets.

More information can be found here.

Azure Monitor

This particular service in Azure has had oodles of updates, which now enable it to provide a centralised repository for collecting, analysing and alerting. To be fair, I have used it extensively for troubleshooting issues, in particular when tuning Application Gateway with WAF functionality.

More information can be found here.

Azure Bastion

Providing secure, audited remote access to third parties or employees used to mean deploying Just in Time Access. However, this raised a few eyebrows as it was typically based on a source public IP address locked down to a specific TCP port.

Azure Bastion removes this risk by providing a PaaS service that lets you connect to RDP and SSH sessions over SSL.

More information can be found here.

New Microsoft Azure “MFA Free Option”

Worth noting: Microsoft has quietly introduced a “free version of MFA”, on the understanding that you agree to enable the new “Security Defaults” (below) and only use the MFA authenticator app.

It basically forces MFA for everyone and blocks legacy authentication too. There is no “trusted sites” option, and a load of other features are not available. See the URL here.

Everyone is also prompted for MFA and must register, regardless of being internal or external. That said, it may be useful for small or tactical deployments.

Credit to Mark Brumby for highlighting this.

AWS Concepts – Identity & Access Management

When we talk about Identity and Access Management (IAM), what do we really mean? For me, it boils down to who you are and what you are entitled to access.

Simple statement, but it does start to get rather complicated when you think about Identity Management.

If you think about your own organisation, what directory service does it use? Probably Active Directory Domain Services (AD DS). Think about how many years it has been fine-tuned with integrations to third-party solutions such as MFA, SSO and VPNs.

The list truly does go on and on. Most organisations will treat their on-premises Active Directory Domain Services (AD DS) as the single source of truth for users, groups, permissions and passwords.

So the question is how does AWS deal with IAM?

What Is AWS IAM?

It is AWS’s hyperscale web service that gives users and services shared access to your AWS account. It uses an eventually consistent model, which in a nutshell means that changes are not immediately visible everywhere.

Users are authenticated and then authorised to use AWS services. To ease the management of individual users, groups are used. Policies are applied to groups which then dictate what the user or service can do.

Policies are JSON documents used to define the actions, effect, resources and conditions on what can be invoked, for example:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:DeleteObject"
            ],
            "Resource": "*",
            "Condition": {
                "IpAddress": {
                    "aws:SourceIp": "192.168.1.0/24"
                }
            }
        }
    ]
}

When you create an IAM user, they can’t access anything until you give them permission.

It should be noted that actions or resources which are not explicitly allowed are denied by default.

We also have IAM Roles, which are similar to users: they are AWS identities with permissions (a JSON policy) that determine what the role can and can’t do. It is important to note that IAM Roles don’t have passwords or long-term access credentials. Access keys are created dynamically and provided on a temporary basis. Typically they are used to delegate access to applications or services.

IAM Roles can also be used to provide users with enhanced privileges on a temporary basis, for example when a user requires occasional admin access to an S3 bucket.
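
As a minimal sketch of that scenario (the account ID and user name below are made up for illustration), the role’s trust policy defines who may assume it, while a permissions policy like the one above defines what the role can do once assumed:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::123456789012:user/occasional-s3-admin"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}

The user then calls sts:AssumeRole against the role’s ARN and receives temporary credentials scoped to the role’s permissions.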

To enable policies to be tested before you apply them into production, AWS have a handy policy simulator which can be found here.

Identity Federation

AWS IAM supports identity federation for delegated access to either the AWS Management Console or APIs. Federated users are created within your corporate directory outside of the AWS account.

These can be web identity providers such as Amazon, Facebook and Google, or an OpenID Connect provider.

Within the enterprise world, we tend to see Active Directory Domain Services used in conjunction with Active Directory Federation Services. AWS provides integration using Security Assertion Markup Language 2.0 (SAML 2.0) via the STS AssumeRoleWithSAML call.

A high-level overview of this is shown in the diagram below.

  1. The user browses to the URL and is redirected to the AD FS sign-in page
  2. The user enters their Active Directory credentials
  3. The user is authenticated by Active Directory Domain Services
  4. The user’s browser receives a SAML 2.0 assertion
  5. The user’s browser posts the SAML 2.0 assertion to AWS STS
  6. AssumeRoleWithSAML requests temporary security credentials and constructs a sign-in URL for the AWS Management Console
  7. The user’s browser receives the sign-in URL and is redirected to the AWS Management Console
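
Behind step 6, the role being assumed trusts the SAML identity provider registered in IAM. A sketch of that trust policy is shown below (the account ID and provider name are placeholders):

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Federated": "arn:aws:iam::123456789012:saml-provider/ADFS"
            },
            "Action": "sts:AssumeRoleWithSAML",
            "Condition": {
                "StringEquals": {
                    "SAML:aud": "https://signin.aws.amazon.com/saml"
                }
            }
        }
    ]
}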

Single Sign On Cloud Applications

To provide easier integration with popular cloud applications such as Dropbox, Office 365 and Salesforce, AWS provides single sign-on (SSO) using SAML 2.0 via a configuration wizard.

Further information can be found here.

Multi-Factor Authentication

AWS MFA provides an extra layer of security to reduce the overall risk of compromised credentials, providing a secondary authentication step for Management Console and API users.

For the MFA device, you have a choice of three items:

  1. Virtual MFA Device
  2. Hardware Key
  3. Hardware Device

This link shows which form factors can be used across devices.
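
MFA can also be enforced through IAM policy. A common sketch (shown against all actions for brevity) denies requests where the caller has not authenticated with MFA:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyWithoutMFA",
            "Effect": "Deny",
            "Action": "*",
            "Resource": "*",
            "Condition": {
                "BoolIfExists": {
                    "aws:MultiFactorAuthPresent": "false"
                }
            }
        }
    ]
}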

Final Thoughts

AWS IAM is a web-scale directory that provides integration with on-premises directory services and cloud applications. Interestingly, it is an added-value service with no extra cost, which is a different approach from traditional licensing vendors.

Using Azure Data Factory to Copy Data Between Azure File Shares – Part 2

This blog post is a continuation of Part 1 Using Azure Data Factory to Copy Data Between Azure File Shares. So let’s get cracking with the storage account configuration.

Storage Account Configuration

Let’s start off with the basics; we have two storage accounts:

  • vmfwepsts001 which is the source datastore
  • vmfwedsts001 which is the sink datastore

Within each storage account we have three file shares:

  • documents
  • images
  • videos

When configured each storage account should look something like this.

Right, let’s move on to the Data Factory configuration.

Data Factory Configuration

I have created a V2 Data Factory called vmfwepdf001.  Next let’s click on Author & Monitor as shown below.

[Screenshot: data factory 02.PNG]

This will now redirect us to the Azure Data Factory landing page.  We need to select ‘Copy Data’.

[Screenshot: data factory 03.PNG]

We need to give the pipeline a name; in this instance, I have chosen Document Share Copy.  To keep the file shares in sync, we are going to use a trigger type of ‘Schedule’.

Depending on how often you want the pipeline to run, you can run the task every minute if required with no end date.  I have chosen a daily basis as shown in the screenshot below.

[Screenshot: data factory 04.PNG]
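
Behind the wizard, the trigger is saved as a JSON definition. A rough sketch of a daily schedule trigger is shown below (the names and start time are illustrative, and the exact schema should be checked against the Data Factory documentation):

{
    "name": "DailyTrigger",
    "properties": {
        "type": "ScheduleTrigger",
        "typeProperties": {
            "recurrence": {
                "frequency": "Day",
                "interval": 1,
                "startTime": "2019-12-01T00:00:00Z",
                "timeZone": "UTC"
            }
        },
        "pipelines": [
            {
                "pipelineReference": {
                    "referenceName": "DocumentShareCopy",
                    "type": "PipelineReference"
                }
            }
        ]
    }
}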

When you’re ready, click Next.  We are now ready to select our Source Data Store, which will be ‘Azure File Storage’.  To enable Azure Data Factory to access the Storage Account, we need to Create a New Connection.

[Screenshot: data factory 05.PNG]

A new Linked Service popup box will appear; ensure you select Azure File Storage.  Give the Linked Service a name, I have used ‘ProductionDocuments’. You can create a custom Integration Runtime to allow the data processing to occur in a specific Azure region if required.  In this instance, I’m going to leave it as ‘AutoResolveIntegrationRuntime’.

Azure Data Factory requires the Host to be in a specific format which is //storageaccountname.file.core.windows.net/filesharename

The user name is your storage account name and the password is your storage account access key.
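
For reference, the wizard generates a linked service definition along these lines. This is a sketch from memory rather than an exact schema, so treat the property names as assumptions; the host, user and key values are placeholders, and in practice the access key is better referenced from Key Vault:

{
    "name": "ProductionDocuments",
    "properties": {
        "type": "AzureFileStorage",
        "typeProperties": {
            "host": "//vmfwepsts001.file.core.windows.net/documents",
            "userId": "vmfwepsts001",
            "password": {
                "type": "SecureString",
                "value": "<storage account access key>"
            }
        }
    }
}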

The below screenshot provides the configuration.

[Screenshot: data factory 06]

If you have entered everything correctly, when you click on ‘Test Connection’ you should receive a green tick! Click Next and then Next again; it will test your connection again.

When you are greeted with the ‘input file or folder’ screen, we need to define a few pieces of information as follows:

  • File or Folder – leave this blank unless you want to focus on a specific file or sub-folder
  • File Loading Behaviour – this is really a design decision between loading all files and an incremental load based on LastModifiedDate
  • Copy File Recursively – Copy all files and subfolders, I would suggest selecting this
  • Compression Type – None

Once configured it should look something like this:

[Screenshot: data factory 08.PNG]

Follow the same process for the Destination Data Store. When you get to the ‘output file or folder’ screen, we need to define a few settings as follows:

  • File or Folder – leave this blank unless you want to focus on a specific file or sub-folder
  • Compression Type – None
  • Copy Behaviour – Preserve hierarchy which means we will preserve the folder structure

Once configured it should look like this:

[Screenshot: data factory 07.PNG]

Click Next, then Next again, and you will see a summary of your configuration.  Click Next and you should see your Data Factory completed.

[Screenshot: data factory 09.PNG]

Does It Work?

Let’s check that this works.  I have loaded a few files into my Production Storage Account under Documents.

[Screenshot: data factory 10.PNG]

On the Azure Data Factory Landing page, click the Pencil (top left) > Select Pipelines > Document Share Copy > Trigger > Trigger Now as per the screenshot below.

[Screenshot: data factory 11.PNG]

Checking my Development Storage Account, I now have the three files available, success!

[Screenshot: data factory 12.PNG]

I hope you found this post useful, tune in for some more in the near future.

Using Azure Data Factory to Copy Data Between Azure File Shares – Part 1

I was set an interesting challenge by a customer: to copy the data in their Production Subscription Azure File Shares into their Development Subscription Azure File Shares. The reason behind this was to keep the Development environment in line with any uploads to the Production environment, enabling testing to be performed on ‘live’ data.

The customer wanted something which was easy to manage, which provided visibility of data movement tasks within the Azure Portal without needing to manage and maintain PowerShell scripts.

The answer to this was Azure Data Factory.

What Is Azure Data Factory?

Azure Data Factory is a managed data integration service that enables data-driven workflows, either between on-premises and the public cloud or within public clouds.

Pipelines

A pipeline is a logical grouping of activities that together perform a task.  The activities within the pipeline define actions to perform on data. 

Data Factory supports three types of activities: data movement activities, data transformation activities and control activities. In this use case, data movement activities will be used to copy data from the source data store to the destination data sink.

Linked Service

Linked Services are used to link data stores to Azure Data Factory, with the dataset representing the structure of the data and the linked service defining the connection to the external data source. The diagram below provides a logical overview of this.

Integration Runtime

For copy activities, an integration runtime is required to connect the source and sink linked services and define the direction of data flow.  To ensure data locality, a custom integration runtime within West Europe will be used.

Source Datastore

Each file share within the vmfwepsts001 Storage Account is an individual linked service.  Therefore, four source linked services will be defined for data, documents, images and videos.

Sink Datastore

Each destination file share within the vmfwedsts001 Storage Account is an individual linked service.  Therefore, four sink linked services will be defined for data, documents, images and videos.

Copy behaviour to the sink datastore can be undertaken using three methods:

  • Preserve Hierarchy – the relative path of the source file to the source folder is identical to the relative path of the target file to the target folder
  • Flatten Hierarchy – all files from the source folder are placed in the first level of the target folder, with auto-generated names
  • Merge Files – merges all files from the source folder into one file, using an auto-generated name

To maintain the file and folder structure, the Preserve Hierarchy copy behaviour will be used, as sketched in the copy activity below.
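
This surfaces as the copyBehavior property on the copy activity’s sink. The sketch below is illustrative rather than exact: the dataset names are hypothetical and the source/sink type names vary by connector, so treat them as assumptions to validate against the documentation:

{
    "name": "DocumentShareCopy",
    "type": "Copy",
    "inputs": [
        { "referenceName": "ProductionDocuments", "type": "DatasetReference" }
    ],
    "outputs": [
        { "referenceName": "DevelopmentDocuments", "type": "DatasetReference" }
    ],
    "typeProperties": {
        "source": {
            "type": "FileSystemSource",
            "recursive": true
        },
        "sink": {
            "type": "FileSystemSink",
            "copyBehavior": "PreserveHierarchy"
        }
    }
}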

Tune in for the next blog post when we will cover the configuration settings.