Tech Monger

Programming, Web Development and Computer Science.

Auto Restart Web Server After Crash

So you have a web application running in production, but it goes down unexpectedly and you get midnight calls to bring it back up. In this post we will build a simple script that recovers a web application after the server crashes due to resource over-utilisation.


Over Utilization of CPU or RAM

A web server can crash for a variety of reasons, but in this guide we deal with crashes caused by high CPU or memory consumption by the web server process. However, you can use the script below to recover a web server that crashes for any other reason as well.

If a request arrives while the web server is under heavy resource load, the server will not respond and the request will time out. It can also happen that the web server process is itself resource-hungry, and its over-utilisation of resources stops the web server completely.

This is a very common phenomenon when the web server is hosted on a very lightweight virtual machine with limited CPU and memory, such as Google Cloud's f1-micro instance. We encountered this issue while running a web application written in Flask and served by Gunicorn on Google Cloud.


Auto Restart Script

We will build an auto restart script with the following two components.

  1. Crash Detection
  2. Crash Recovery

Crash Detection

Whenever a web server handles an HTTP request successfully, it sends status code 200 in the status line of the HTTP response. If the server has crashed, it will never send status code 200. We will use this fact to detect a web server crash.

We will use cURL to send an HTTP request to a page on the web site that always responds with status code 200. If the status code is not 200 OK, we will trigger the recovery component.
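
As a rough sketch of the detection step, the check can be split into two small shell functions. The names `fetch_status` and `is_up`, the `URL` variable, and the 10-second limit are illustrative assumptions, not part of the original script. The `--max-time` option makes curl give up on a server that never answers, in which case `%{http_code}` is reported as 000.

```shell
#!/bin/sh
# Hypothetical health-check URL; replace with a page that always returns 200.
URL="${URL:-https://example.com}"

# Print the HTTP status code for the given URL.
# curl prints 000 when no HTTP response arrives (for example on a timeout).
fetch_status() {
    curl -s -o /dev/null --max-time 10 -w "%{http_code}" "$1"
}

# Succeed only when the status code is exactly 200.
is_up() {
    [ "$1" = "200" ]
}

# Example (commented out so that sourcing this file sends no request):
# if is_up "$(fetch_status "$URL")"; then echo "up"; else echo "down"; fi
```

Keeping the comparison in its own function makes it easy to try the logic with fixed status codes before pointing it at a live site.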

Crash Recovery

Once we have ascertained that the web server has crashed and is not responding, we can take recovery action. The right recovery depends on which action is most appropriate for bringing the web application back up.

Restarting the web server is an appropriate recovery in most cases. But it does not work when the web server process itself is responsible for the over-utilisation, because the web server does not respond to the restart or stop command. In that case we need to restart the complete VM to recover from the crash.
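
One possible way to combine both recovery options is to try a restart first and fall back to a reboot only if the restart fails or hangs. The sketch below is an assumption about how that could look, not the approach used in this post: `choose_recovery`, `RESTART_TIMEOUT`, and the `gunicorn` service name are hypothetical, and it relies on the `timeout` command from GNU coreutils.

```shell
#!/bin/sh
# Hypothetical limit (in seconds) for how long a restart may take.
RESTART_TIMEOUT="${RESTART_TIMEOUT:-30}"

# Run the restart command passed as arguments, giving up after
# RESTART_TIMEOUT seconds. Prints "restarted" if the command succeeded
# in time, otherwise "reboot" to signal that a full reboot is needed.
choose_recovery() {
    if timeout "$RESTART_TIMEOUT" "$@" >/dev/null 2>&1; then
        echo "restarted"
    else
        echo "reboot"
    fi
}

# Example wiring (commented out so that sourcing this file changes nothing);
# "gunicorn" is an assumed systemd service name:
# action=$(choose_recovery sudo systemctl restart gunicorn)
# [ "$action" = "reboot" ] && sudo reboot
```

The timeout matters here because a web server that is starving the machine may never answer the restart command at all.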

Building Crash Detection and Recovery Script

Below we make a web request with curl and check the status code of the response. If it is not 200, we take recovery action, i.e. reboot the VM. You can choose a recovery action according to your requirements. We will schedule this script on the same machine as the web server.

auto_recover.sh
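#!/bin/bash

# Fetch only the HTTP status code; give up after 10 seconds so a hung server cannot stall the check.
status=$(curl -s -o /dev/null --max-time 10 -w "%{http_code}" https://example.com)
echo "$(date) $status" >> crash_checks.log

# curl reports 000 when it gets no HTTP response at all.
if [ "$status" -ne 200 ]
then
    # Take any appropriate recovery action here.
    # When run from cron, sudo must not prompt for a password.
    echo "$(date) webserver seems down, initiating reboot." >> crash_checks.log
    sudo reboot
fi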
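
A single failed check does not always mean a crash, so one optional refinement is to act only after several consecutive failures. The following is a minimal sketch under that assumption. `record_check`, `STATE_FILE`, and `MAX_FAILURES` are hypothetical names and are not part of the script above.

```shell
#!/bin/sh
# Hypothetical state file and threshold; adjust to your setup.
STATE_FILE="${STATE_FILE:-/tmp/auto_recover.failures}"
MAX_FAILURES="${MAX_FAILURES:-3}"

# Record the result of one check, given its HTTP status code.
# Prints "ok" on success, "wait" while below the threshold, and
# "recover" once MAX_FAILURES consecutive checks have failed.
record_check() {
    if [ "$1" = "200" ]; then
        rm -f "$STATE_FILE"          # a success resets the counter
        echo "ok"
        return
    fi
    count=$(cat "$STATE_FILE" 2>/dev/null)
    count=$(( ${count:-0} + 1 ))
    if [ "$count" -ge "$MAX_FAILURES" ]; then
        rm -f "$STATE_FILE"          # start counting afresh after recovery
        echo "recover"
    else
        echo "$count" > "$STATE_FILE"
        echo "wait"
    fi
}

# Example (commented out): reboot only when the threshold is reached.
# [ "$(record_check "$status")" = "recover" ] && sudo reboot
```

With a check every 5 minutes and a threshold of 3, recovery would start roughly 15 minutes after the first failure, so the threshold is a trade-off between false alarms and downtime.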

Additional Settings

In the above example we reboot the complete VM, so we need to make sure that the web application starts automatically after the reboot. We will achieve this with a cron job. Cron will also let us schedule the above script to run at a specified interval.

Open the crontab in edit mode to create the cron jobs.

$ crontab -e

Schedule the web application's automatic start and the crash detection script.

# To start webapp automatically after reboot
@reboot sh /path/to/webapp/startup/script/start.sh

# To trigger crash detection script every 5 minutes
*/5 * * * * bash /path/to/script/auto_recover.sh


Conclusion

We learned how to write a custom script that detects when a web server goes down unexpectedly and recovers the web application. This technique helps us maintain availability despite web server crashes. Though this script is useful, we should always try to find the actual cause of the crash and fix it.
