AWS Concepts – Identity & Access Management

When we talk about Identity and Access Management (IAM), what do we really mean? For me, it boils down to who you are and what you are entitled to access.

Simple statement, but it does start to get rather complicated when you think about Identity Management.

If you think about your own organisation, what directory service does it use? Probably Active Directory Domain Services (AD DS). Think about how many years it has been fine-tuned, with integration with third-party solutions such as MFA, SSO and VPNs.

The list truly goes on and on. Most organisations will treat their on-premises AD DS as the single source of truth for users, groups, permissions and passwords.

So the question is how does AWS deal with IAM?

What Is AWS IAM?

It is a hyperscale AWS web service that provides users and services with shared, controlled access to your AWS account. It uses an eventually consistent model, which in a nutshell means that changes are not immediately visible everywhere.
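As a rough illustration of what eventual consistency means in practice (a generic sketch, not an AWS API call; `check` stands in for any read that should eventually succeed):

```python
import time

# IAM is eventually consistent: a change (e.g. a newly created user or
# policy) may not be visible immediately. A common pattern is to poll
# with exponential backoff. check() is a placeholder for any read call.
def wait_until_visible(check, attempts=5, delay=0.1):
    for attempt in range(attempts):
        if check():
            return True
        time.sleep(delay * (2 ** attempt))  # back off between retries
    return False

print(wait_until_visible(lambda: True))  # True
```

In real code the check would be an IAM read call; the AWS SDKs also retry many transient failures for you automatically.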

Users are authenticated and then authorised to use AWS services. To ease the management of individual users, groups are used. Policies are applied to groups, which then dictate what the user or service can do.

Policies are JSON documents used to define the effect, actions, resources and conditions for what can be invoked, for example:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:DeleteObject"
            ],
            "Resource": "*",
            "Condition": {
                "IpAddress": {
                    "aws:SourceIp": "192.168.1.0/24"
                }
            }
        }
    ]
}

When you create an IAM user, they can’t access anything until you give them permission.

It should be noted that any action or resource which is not explicitly allowed is denied by default.
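The implicit default deny can be sketched with a toy evaluator (a simplified model for illustration only; real IAM evaluation also handles wildcard actions, conditions and multiple policy types):

```python
# Simplified model of IAM policy evaluation: anything not explicitly
# allowed is denied by default, and an explicit Deny always wins.
# Real IAM also evaluates wildcards, conditions and other policy types.
def is_allowed(policy, action, resource):
    decision = "Deny"  # implicit default deny
    for statement in policy["Statement"]:
        if action in statement["Action"] and statement["Resource"] in ("*", resource):
            if statement["Effect"] == "Deny":
                return False  # explicit deny always wins
            decision = "Allow"
    return decision == "Allow"

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": ["s3:GetObject"], "Resource": "*"}
    ],
}

print(is_allowed(policy, "s3:GetObject", "arn:aws:s3:::my-bucket/file.txt"))     # True
print(is_allowed(policy, "s3:DeleteObject", "arn:aws:s3:::my-bucket/file.txt"))  # False - never allowed
```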

We also have IAM Roles, which are similar to users: they are AWS identities with permissions (a JSON policy) which determine what the role can or can't do. It is important to note that IAM Roles don't have any passwords or long-term access credentials. Access keys are created dynamically and provided on a temporary basis. Typically they are used to delegate access to applications or services.

IAM Roles can also be used to provide users with enhanced privileges on a temporary basis, for example when a user requires occasional admin access to an S3 bucket.
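Instead of a password, a role carries a trust policy defining who may assume it. As a sketch, here is how such a document might look when delegating access to EC2 instances (the EC2 service principal is standard; the rest is illustrative):

```python
import json

# Sketch of a role trust policy: the role has no password or long-term
# keys; this document defines who may assume it. Here EC2 instances are
# allowed to assume the role via STS, a common delegation pattern.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "ec2.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

print(json.dumps(trust_policy, indent=4))
```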

To enable policies to be tested before you apply them in production, AWS provide a handy tool, the IAM Policy Simulator.

Identity Federation

AWS IAM supports identity federation for delegated access to either the AWS Management Console or APIs. Federated users are created within your corporate directory outside of the AWS account.

These can be web identity providers such as Amazon, Facebook and Google, or an OpenID Connect provider.

Within the enterprise world, we tend to see Active Directory Domain Services used in conjunction with Active Directory Federation Services. AWS integrate with this using Security Assertion Markup Language 2.0 (SAML 2.0) via the STS AssumeRoleWithSAML API.

A high-level overview of this is shown in the diagram below.

  1. User browses to the URL and is redirected to the AD FS sign-in page
  2. User enters their Active Directory credentials
  3. User is authenticated by Active Directory Domain Services
  4. User's browser receives a SAML 2.0 assertion
  5. User's browser posts the SAML 2.0 assertion to AWS STS
  6. AssumeRoleWithSAML requests temporary security credentials and constructs a sign-in URL for the AWS Management Console
  7. User's browser receives the sign-in URL and is redirected to the AWS Management Console
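Step 6 can be sketched as follows. This assumes a federation broker that has already exchanged its temporary credentials for a sign-in token via the AWS federation endpoint (that exchange is not shown); the issuer URL and token value are placeholders:

```python
from urllib.parse import urlencode

# Sketch of constructing the console sign-in URL from a sign-in token.
# The token is normally obtained from the AWS federation endpoint using
# the temporary credentials returned by AssumeRoleWithSAML (not shown).
def build_console_signin_url(signin_token, issuer="https://sts.example.com"):
    params = {
        "Action": "login",
        "Issuer": issuer,  # placeholder IdP URL, shown to the user on sign-out
        "Destination": "https://console.aws.amazon.com/",
        "SigninToken": signin_token,
    }
    return "https://signin.aws.amazon.com/federation?" + urlencode(params)

print(build_console_signin_url("EXAMPLE-TOKEN"))
```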

Single Sign-On for Cloud Applications

To provide easier integration with popular cloud applications such as Dropbox, Office 365 and Salesforce, AWS provide single sign-on (SSO) using SAML 2.0 via a configuration wizard.


Multi-Factor Authentication

AWS MFA provides an extra layer of security to reduce the overall risk of compromised credentials, providing a secondary authentication step for Management Console and API users.

For the MFA device, you have a choice of three items:

  1. Virtual MFA Device
  2. Hardware Key
  3. Hardware Device

The AWS MFA documentation shows which form factors can be used across devices.
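A virtual MFA device is typically just a TOTP (RFC 6238) implementation: a shared secret plus the current time produces the six-digit code. A minimal stdlib sketch, for illustration only:

```python
import hashlib, hmac, struct

# A virtual MFA device is essentially a TOTP (RFC 6238) generator: an
# HMAC-SHA1 of the current 30-second time step, dynamically truncated
# to six digits. Minimal sketch for illustration only.
def totp(secret: bytes, timestamp: int, step: int = 30, digits: int = 6) -> str:
    counter = struct.pack(">Q", timestamp // step)  # time step counter
    digest = hmac.new(secret, counter, hashlib.sha1).digest()
    offset = digest[-1] & 0x0F                      # dynamic truncation
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

# RFC 6238 test secret at time 59 gives the documented code 287082
print(totp(b"12345678901234567890", 59))  # 287082
```

The printed value matches the RFC 6238 test vector for time 59 with the ASCII secret 12345678901234567890.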

Final Thoughts

AWS IAM is a web-scale directory that provides integration with on-premises directory services and cloud applications. Interestingly, this is an added-value service with no extra cost, which is a different approach from traditional licensing vendors.

Azure Heavy Hitter Updates

Keeping up with Azure can be a full-time task in itself with the plethora of updates. With this in mind, I thought I would share a couple of updates which, in my opinion, are heavy hitters.

Account Failover for Azure Storage

Many of us use GRS storage for an added safety net, to ensure that data is available in a secondary paired region if the primary region has an outage. The kicker has always been that no SLA exists for this; it's down to Microsoft to decide when they declare the primary region out and provide access to the replicated data.

Well, that is all about to change with the announcement of 'Account Failover for Azure Storage'. This means that you are now in control of failing data over to your secondary region.

A couple of points which are worth noting:

  1. Having data available is only a single layer; think about security, identity and access, networks, virtual machines, PaaS etc. in your secondary region
  2. Upon failover the storage account in the secondary region is LRS; you will need to manually change this to RA-GRS to replicate back to your original primary region

Adaptive Network Hardening in Azure Security Center

I really enjoy updating an Access Control List, said no one ever!

Defining Network Security Groups (NSGs) takes time and effort, with engagement across multiple stakeholders to determine traffic flows, or time spent buried deep inside Log Analytics.

Microsoft have announced the public preview of Adaptive Network Hardening in Azure Security Center, which learns traffic flows (using machine learning) and provides recommendations for internet-facing virtual machines.

A couple of points which are worth noting:

  1. This should be enabled when virtual machines are deployed, to reduce the risk of rogue traffic
  2. As it says on the tin, this is for internet-facing VMs only; however, I'm sure this may be updated in due course

Thanks for reading, tune in for the next post.

Using Azure Data Factory to Copy Data Between Azure File Shares – Part 3

This blog post is a continuation of Part 1 Using Azure Data Factory to Copy Data Between Azure File Shares and Part 2 Using Azure Data Factory to Copy Data Between Azure File Shares. In this final part, we are going to configure alerts to send an email on a failed pipeline run.

First of all, select your Data Factory and then select Alerts > New Alert Rule.

In the previous configuration, the Azure Data Factory is running once a day. With this in mind, we are going to select 'Add Condition' and then Failed Pipeline Runs.

Scroll down and select Alert Logic. Ensure the conditions are set to Greater Than, Total 1. This essentially defines that if an issue occurs, an action is performed.

Under 'Evaluation based on', select a period of 12 Hours and a frequency of Every Hour. This is how often the query is evaluated. It should look something like this:

Next we need to create an Action Group so when the above condition is met, an action is taken. I have called my Action Group VMF-WE-DFAG01, which stands for VMFocus, West Europe, DataFactory, ActionGroup 01.

For the short name, I have used Copy Failure; note this needs to be under 12 characters long.

Finally, I have chosen the 'Action Type' as Email/SMS/Push and entered the appropriate contact details. Once done, it should look something like this.

After a short while, you will receive an email from Microsoft Azure to confirm that you have been added to an Action Group.

Finally we want to give the Alert Rule a Name and a Description, such as the below.

That's it, your Azure Data Factory is all configured and ready for production use!

Using Azure Data Factory to Copy Data Between Azure File Shares – Part 2

This blog post is a continuation of Part 1 Using Azure Data Factory to Copy Data Between Azure File Shares. So let's get cracking with the storage account configuration.

Storage Account Configuration

Let's start off with the basics; we will have two storage accounts:

  • vmfwepsts001 which is the source datastore
  • vmfwedsts001 which is the sink datastore

Within each storage account we have three file shares:

  • documents
  • images
  • videos

When configured each storage account should look something like this.

Right, let's move on to the Data Factory configuration.

Data Factory Configuration

I have created a V2 Data Factory called vmfwepdf001.  Next let’s click on Author & Monitor as shown below.

data factory 02.PNG

This will now redirect us to the Azure Data Factory landing page.  We need to select ‘Copy Data’.

data factory 03.PNG

We need to give the pipeline a name; in this instance, I have chosen Document Share Copy.  To keep the file shares in 'sync', we are going to use a trigger type of 'schedule'.

Depending on how often you want the pipeline to run, you can run the task every minute if required with no end date.  I have chosen a daily basis as shown in the screenshot below.

data factory 04.PNG

When you're ready, click Next.  We are now ready to select our Source Data Store, which will be 'Azure File Storage'.  To enable Azure Data Factory to access the Storage Account, we need to create a New Connection.

data factory 05.PNG

A new Linked Service popup box will appear; ensure you select Azure File Storage.  Give the Linked Service a name; I have used 'ProductionDocuments'. You can create a custom Integration Runtime to allow the data processing to occur in a specific Azure Region if required.  In this instance, I'm going to leave it as 'AutoResolveIntegrationRuntime'.

Azure Data Factory requires the Host to be in a specific format which is //storageaccountname.file.core.windows.net/filesharename

The user name is your storage account name and the password is your storage account access key.
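In other words, the host can be assembled from the storage account name and file share name. A trivial helper, using the storage account names from this series:

```python
# Helper to assemble the Host string in the format Azure Data Factory
# expects for an Azure File Storage linked service:
# //storageaccountname.file.core.windows.net/filesharename
def build_host(storage_account: str, file_share: str) -> str:
    return f"//{storage_account}.file.core.windows.net/{file_share}"

# Source storage account and documents share from this series
print(build_host("vmfwepsts001", "documents"))
# //vmfwepsts001.file.core.windows.net/documents
```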

The below screenshot provides the configuration.

data factory 06

If you have entered everything correctly, when you click on 'Test Connection' you should receive a green tick! Click Next and then Next again; it will test your connection again.

When you are greeted with the 'input file or folder' screen, we need to define a few pieces of information as follows:

  • File or Folder – leave this blank unless you want to focus on a specific file or sub-folder
  • File Loading Behaviour – this is really a design decision between 'load all files' and 'incremental load: LastModifiedDate'
  • Copy File Recursively – Copy all files and subfolders, I would suggest selecting this
  • Compression Type – None
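Conceptually, the incremental option means only files modified since the last pipeline run are copied. A rough sketch of the idea (the concept only, not Data Factory's actual implementation; the file names are made up):

```python
from datetime import datetime

# Concept sketch of 'incremental load: LastModifiedDate': only files
# modified after the last pipeline run are selected for copying.
def incremental_load(files, last_run):
    """files is a list of (name, last_modified) tuples."""
    return [name for name, modified in files if modified > last_run]

files = [
    ("report.docx", datetime(2019, 3, 1)),  # already copied on a previous run
    ("holiday.jpg", datetime(2019, 3, 5)),  # new since the last run
]
print(incremental_load(files, datetime(2019, 3, 3)))  # ['holiday.jpg']
```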

Once configured it should look something like this:

data factory 08.PNG

Follow the same process for the Destination Data Store, when you get to the output file or folder screen, we need to define a few settings as follows:

  • File or Folder – leave this blank unless you want to focus on a specific file or sub-folder
  • Compression Type – None
  • Copy Behaviour – Preserve hierarchy which means we will preserve the folder structure

Once configured it should look like this:

data factory 07.PNG

Click Next, then Next again, and you will see a summary of your configuration.  Click Next and you should see your Data Factory completed.

data factory 09.PNG

Does It Work?

Let's check that this works.  I have loaded a few files into my Production Storage Account under Documents.

data factory 10.PNG

On the Azure Data Factory Landing page, click the Pencil (top left) > Select Pipelines > Document Share Copy > Trigger > Trigger Now as per the screenshot below.

data factory 11.PNG

Checking my Development Storage Account, I now have the three files available, success!

data factory 12.PNG

I hope you found this post useful, tune in for some more in the near future.


Using Azure Data Factory to Copy Data Between Azure File Shares – Part 1

I was set an interesting challenge by a customer: to copy the data in their Production Subscription Azure File Shares into their Development Subscription Azure File Shares. The reason behind this was to ensure that any uploads to their Production environment are kept in line with the Development environment, enabling testing to be performed on 'live' data.

The customer wanted something which was easy to manage, which provided visibility of data movement tasks within the Azure Portal without needing to manage and maintain PowerShell scripts.

The answer to this was Azure Data Factory.

What Is Azure Data Factory?

Azure Data Factory is a managed data integration service that enables data-driven workflows, either from on-premises to the public cloud or within public clouds.

Pipelines

A pipeline is a logical grouping of activities that together perform a task.  The activities within the pipeline define actions to perform on data. 

Data Factory supports three types of activities: data movement, data transformation and control activities. In this use case, data movement activities will be used to copy data from the source data store to the destination data sink.

Linked Service

Linked Services are used to link data stores to the Azure Data Factory, with the 'data set' representing the structure of the data and the linked service defining the connection to the external data source. The diagram below provides a logical overview of this.

Integration Runtime

For copy activities, an integration runtime is required to connect the source and sink linked services and determine the direction of data flow.  To ensure data locality, a custom integration runtime will be used within West Europe.

Source Datastore

Each file share within the vmfwepsts001 Storage Account is an individual linked service.  Therefore, four source linked services will be defined for data, documents, images and videos.

Sink Datastore

Each destination file share within the vmfwedsts001 Storage Account is an individual linked service.  Therefore, four sink linked services will be defined for data, documents, images and videos.

Copy behaviour to the sink datastore can be undertaken using three methods:

  • Preserve Hierarchy – the relative path of the source file to the source folder is identical to the relative path of the target file to the target folder
  • Flatten Hierarchy – all files from the source folder are placed in the first level of the target folder, with auto-generated names
  • Merge Files – merges all files from the source folder into one file, using an auto-generated name

To maintain the file and folder structure, preserve hierarchy copy behaviour will be used.
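The three behaviours above can be sketched as a mapping from a source file's relative path to its target path (illustrative only; 'file_0001' is a placeholder for the auto-generated name):

```python
# Sketch of the three copy behaviours: map a source file's relative path
# to its path in the sink. 'file_0001' stands in for the auto-generated
# name that Flatten Hierarchy and Merge Files produce.
def target_path(behaviour: str, relative_path: str, auto_name: str = "file_0001") -> str:
    if behaviour == "PreserveHierarchy":
        return relative_path  # same relative path under the target folder
    if behaviour == "FlattenHierarchy":
        return auto_name      # dropped into the first level, renamed
    if behaviour == "MergeFiles":
        return auto_name      # all source files merged into one file
    raise ValueError(f"unknown behaviour: {behaviour}")

print(target_path("PreserveHierarchy", "images/2019/photo.jpg"))
# images/2019/photo.jpg
```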

Tune in for the next blog post when we will cover the configuration settings.