Failover with Traffic Manager from Azure to AWS

This post follows the same approach as Failover with Route 53 from Azure to AWS, but this time the failover is handled by Azure Traffic Manager.

The post will also show how Netreo can monitor the web app hosting the website on Azure.

Netreo offers advanced monitoring and automation features for Microsoft Azure resources (at the time of writing this post). It uses metrics and alerts to monitor the status of the resources and to trigger notifications when custom thresholds are crossed. The automation features include auto-healing, auto-scaling and scheduled tasks. With auto-healing, Netreo can trigger the replacement of failed resources, while auto-scaling increases the number of resources (computing power) to accommodate usage spikes, expected or unexpected.

This diagram shows how Azure DNS routes traffic to Azure Traffic Manager, which in turn directs it to the Azure website or the AWS VM based on their availability:

Failover with Traffic Manager

So let’s see the resources used in both Azure and AWS clouds.

The EC2 VM in AWS has a public IP address assigned:

Failover with Traffic Manager  2

The EC2 instance has Apache installed, with the site content in the /var/www/html directory. When a browser accesses http:// or http://, this is the webpage returned:

Because both Azure and AWS should serve the same content, a user accessing awswork.com could not easily tell where the page is served from, hence the text on the page indicating its source.

Moving on to the Azure setup, these are the resources required to host the DNS zone and the website, and to make sure the failover is smooth:

Traffic Manager places a restriction on the App Service plan: the web app has to be deployed in a Standard App Service plan.

This is the web app:

The screenshot was taken after the DNS zone was set up on Azure. Normally, the URL of a web app follows the notation .azurewebsites.net. Above, the URL field points to www.awswork.com.

This is because a custom domain has been added to the web app pointing to the CNAME record from the DNS zone:

And this is the DNS zone. The www CNAME record points to the Traffic Manager DNS name (as seen later):

Failover with Traffic Manager - DNS Zone
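To make the name resolution path easier to follow, here is a minimal sketch of the CNAME chain in Python. It does no real DNS lookups; it only walks a small in-memory table. The Traffic Manager name `example-profile.trafficmanager.net` is a hypothetical placeholder, since the actual profile name is not given in this post, and the assumption that Traffic Manager answers with a CNAME to the web app applies to the case where the Azure endpoint is selected.

```python
# In-memory stand-in for the DNS records involved (no network access).
# Only www.awswork.com and awswork.azurewebsites.net come from the post;
# the Traffic Manager name is a hypothetical placeholder.
CNAME_RECORDS = {
    "www.awswork.com": "example-profile.trafficmanager.net",
    "example-profile.trafficmanager.net": "awswork.azurewebsites.net",
}


def resolve_chain(name, records, max_hops=8):
    """Follow CNAME records starting at `name` and return every name visited."""
    chain = [name]
    while chain[-1] in records:
        if len(chain) > max_hops:
            raise RuntimeError("CNAME chain too long (possible loop)")
        chain.append(records[chain[-1]])
    return chain


if __name__ == "__main__":
    print(" -> ".join(resolve_chain("www.awswork.com", CNAME_RECORDS)))
```

When Traffic Manager selects the AWS endpoint instead, the last hop of the chain would be the external endpoint rather than the web app.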

However, awswork.azurewebsites.net (awswork is the name chosen for the web app when it was created) still works.

Next comes the Traffic Manager profile configuration. The profile is reachable through the link shown in its DNS name field. There are two endpoints: the web app is an Azure endpoint and the AWS VM is an external endpoint. Based on their priority, Traffic Manager uses the web app as the source for the data requested by the client.

Because both endpoints are enabled and online, Traffic Manager directs traffic to the endpoint with the lowest priority value (that is, the highest priority), which is the Azure web app:

Consequently, accessing the custom DNS domain directs the user to the Azure web app:

Every endpoint in the Traffic Manager profile can be manually disabled, which means that Traffic Manager will no longer direct users to that endpoint. In this specific case, to confirm that failover works properly, the web app endpoint was manually disabled:
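The priority behaviour and the manual disable test can be modeled with a small Python sketch. This is not Traffic Manager's implementation, only an illustration of the selection rule described here. The `Endpoint` class, the endpoint names and the priority values 1 and 2 are assumptions made for the example, and returning `None` when nothing is available is a simplification.

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    name: str
    priority: int          # lower value means more preferred
    enabled: bool = True   # manually toggled in the profile
    healthy: bool = True   # "online" according to monitoring


def pick_endpoint(endpoints):
    """Return the enabled, healthy endpoint with the lowest priority value."""
    candidates = [e for e in endpoints if e.enabled and e.healthy]
    if not candidates:
        return None  # simplification: no usable endpoint
    return min(candidates, key=lambda e: e.priority)


if __name__ == "__main__":
    # Hypothetical names and priority values.
    web_app = Endpoint("azure-web-app", priority=1)
    aws_vm = Endpoint("aws-vm", priority=2)

    print(pick_endpoint([web_app, aws_vm]).name)  # both online

    web_app.enabled = False                       # manual disable
    print(pick_endpoint([web_app, aws_vm]).name)  # falls through to AWS
```

With both endpoints enabled the sketch selects the web app; once the web app is disabled, the selection falls through to the AWS VM, mirroring the test in this post.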

It might take some time until the user is directed to the AWS VM, because DNS responses are cached until their TTL expires, but eventually it happens:
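As a rough way to reason about that delay, the sketch below estimates an upper bound on failover time in Python. It rests on assumptions not stated in this post: that a manually disabled endpoint only needs cached DNS answers to expire, and that a real outage additionally needs a number of consecutive failed health probes before the endpoint is treated as down. The parameter names and the values used (60-second TTL, 30-second probe interval, 3 tolerated failures) are illustrative, and probe timeouts are ignored.

```python
def detection_seconds(probe_interval, tolerated_failures):
    """Time for enough consecutive failed probes to mark an endpoint down."""
    return probe_interval * (tolerated_failures + 1)


def failover_seconds(dns_ttl, probe_interval, tolerated_failures,
                     manually_disabled=False):
    """Rough upper bound on how long clients may keep reaching the old endpoint."""
    if manually_disabled:
        detection = 0  # no probing needed, the endpoint is already excluded
    else:
        detection = detection_seconds(probe_interval, tolerated_failures)
    return detection + dns_ttl  # cached answers live for up to one TTL


if __name__ == "__main__":
    # Illustrative values, not taken from the profile in this post.
    print(failover_seconds(60, 30, 3, manually_disabled=True))  # 60
    print(failover_seconds(60, 30, 3))                          # 180
```

Under these assumptions, lowering the DNS TTL shortens the failover window at the cost of more DNS queries.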

And that is pretty much all there is to failover between the Azure web app and the AWS VM.

Netreo can monitor the web app and trigger an alert when needed (an email is also sent).

Monitoring the web app allows the operator to take restoring actions while users continue to access the website with almost no service impact during failover.

This is the web app created as an endpoint for Traffic Manager:

Netreo has several default metrics, such as how much data was sent to the web app, the response time, the status of the web app and a few others:

Some of these metrics can be monitored, and when specific conditions are met (a threshold is crossed or the status changes) an alert is triggered to notify the operator. There are a few default alerts configured for a web app:

To see Netreo in action monitoring a metric and triggering an alert, the Status metric is used. This metric usually takes one of three values (Ready, Down and Unknown), and sometimes Stopped, as seen below:

The alert notifying the operator that something has gone wrong with the web app is triggered when the Status metric has the value Down:

Because there is no easy way to simulate a failure of a web app (short of deleting it completely from Azure), the alert was modified so that it is triggered when the status is Ready. This means the alert fires while the web app is up and running, but again, this is only to show how Netreo detects the status of a web app:

how Netreo detects the status of a web app
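The test described above can be pictured with a short Python sketch of a status-based alert rule. This is not Netreo's API or configuration format. The `StatusAlert` class and its fields are hypothetical, and only the status values Ready and Down come from this post.

```python
from dataclasses import dataclass


@dataclass
class StatusAlert:
    trigger_on: str       # status value that raises the alert
    active: bool = False

    def evaluate(self, status):
        """Update and return whether the alert is active for `status`."""
        self.active = (status == self.trigger_on)
        return self.active


if __name__ == "__main__":
    alert = StatusAlert(trigger_on="Down")  # normal configuration
    print(alert.evaluate("Ready"))          # web app is up

    alert.trigger_on = "Ready"              # temporary change for the test
    print(alert.evaluate("Ready"))

    alert.trigger_on = "Down"               # configuration changed back
    print(alert.evaluate("Ready"))
```

With the web app reporting Ready throughout, the alert is active only while the rule is temporarily set to trigger on Ready.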

Shortly after the change, the alert was fired off:

Alert fired Off

After the alert configuration was changed back, the alert cleared:

Azure Alert Configuration

And that is pretty much all about how one can use various Azure and AWS resources to achieve failover of an Azure web app.

Traffic Manager made it possible to direct the user to specific endpoints based on their priority, and Azure DNS allowed users to access a custom DNS domain instead of the Traffic Manager DNS name.
