Generic checks, such as pinging the website hosted by the Web App or monitoring its response times, are basic ways to confirm that Azure Web Apps are operating correctly.
Beyond verifying that the website is responsive, Netreo lets users detect outages and automatically restart crashed Web Apps.
In this article, we’ll show how to configure Netreo to automatically restart Web Apps in case of an outage:
1. Run Netreo Setup Wizard to connect to your Azure environment
If you aren’t using Netreo yet, request a demo and a sales representative will show you how an Azure subscription with Web Apps looks in Netreo. Learn more about the setup process here.
2. Track response codes
When configuring monitoring for a new Web App, you can use the default template that defines many useful metrics and alerts. You can later modify the template or create your own.
To ensure that the website is functioning correctly, we’ll define a new metric that tracks the response code returned for a given query:
- In the Web App configuration dialog, navigate to the “Metrics” tab.
- Define a new metric of type “AzureWebsiteResponseCode”.
- Pick a host from the drop-down. Typically, there are two or three addresses; make sure you’re using the production one.
- Optionally, you can provide a relative path and query string that will be used for pinging the website.
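To make the idea behind this metric concrete, here is a minimal Python sketch of a response-code probe. It is not Netreo's implementation: the function names, the example host, and the "Error" and "Unreachable" labels are assumptions made for illustration, and only the "OK" value comes from this article.

```python
import urllib.error
import urllib.request
from urllib.parse import urlunsplit


def build_probe_url(host, path="/", query="", scheme="https"):
    """Combine a host, a relative path, and a query string into one URL."""
    if not path.startswith("/"):
        path = "/" + path
    return urlunsplit((scheme, host, path, query.lstrip("?"), ""))


def classify_status(status_code):
    """Map an HTTP status code to a coarse health label (labels are hypothetical)."""
    return "OK" if 200 <= status_code < 400 else "Error"


def probe(url, timeout=10.0):
    """Request the URL and return a health label; network failures are not OK."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return classify_status(response.status)
    except urllib.error.HTTPError as err:
        # The server answered, but with a 4xx or 5xx status code.
        return classify_status(err.code)
    except OSError:
        # DNS failures, refused connections, and timeouts end up here.
        return "Unreachable"


if __name__ == "__main__":
    # Hypothetical production host and health-check path.
    url = build_probe_url("example.azurewebsites.net", "health", "probe=1")
    print(url, probe(url))
```

Probing a dedicated path rather than the home page is usually cheaper and lets the application report on its own dependencies.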
3. Restart a crashed Web App when errors are detected
In this example, we want to restart a Web App every time Netreo alerts that the website is not available.
To create an action that satisfies those requirements:
- In the “Actions” tab, define a new action.
- Trigger the action based on an Expression that checks whether the value of the previously defined metric is anything other than “OK”.
- Select “AzureWebsiteRestart” from the “Command” drop-down.
- Specify a meaningful “Sustained period” value, which will ensure the self-healing action is not triggered prematurely, e.g. 10 min.
- Specify a meaningful “Suspended period” value, which will ensure that the resource status stabilizes before another action is executed, e.g. a value of 60 min will ensure that the website is not restarted more than once an hour.
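The interplay between the two periods can be sketched in a few lines of Python. This illustrates the general debounce-and-cooldown pattern and is not Netreo's actual evaluation logic: the `RestartGate` class, its field names, and the choice to require a fresh sustained failure after each restart are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RestartGate:
    """Decides whether a restart should fire for a given status sample."""

    sustained_seconds: float = 10 * 60  # failure must persist this long
    suspended_seconds: float = 60 * 60  # minimum gap between two restarts
    failing_since: Optional[float] = None
    last_restart: Optional[float] = None

    def should_restart(self, status: str, now: float) -> bool:
        if status == "OK":
            # A healthy sample clears the failure streak.
            self.failing_since = None
            return False
        if self.failing_since is None:
            self.failing_since = now
        sustained = now - self.failing_since >= self.sustained_seconds
        suspended = (
            self.last_restart is not None
            and now - self.last_restart < self.suspended_seconds
        )
        if sustained and not suspended:
            self.last_restart = now
            self.failing_since = None  # require a new streak after restarting
            return True
        return False


if __name__ == "__main__":
    gate = RestartGate()
    # Timestamps are seconds since the first failing sample.
    for t in (0, 300, 600, 700, 1400, 4200):
        print(t, gate.should_restart("Error", t))
```

With the defaults above, a failure that begins at second 0 triggers a restart at second 600, and no further restart can fire until second 4200, even if the site keeps failing in between.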
Actions can also be used to proactively reboot Web Apps on a regular basis, which helps to address issues like memory leaks, disk fragmentation, poorly closed connections, and more. The same approach can be used to keep Cloud Services instances stable. Learn more here.