
Challenge to solve
I am building a website for analysing basketball games, based on the play-by-play data that becomes publicly available after each game ends. My logic (parsing, fetching data from the internet, algorithms, etc.) is written in Python, and I wanted to keep using Python all the way, including for the front end. To do that, I chose Dash, which builds on top of Flask.
My plan was to publish the web-based analytics app, called Hubie, behind a domain on port 80 and run it on a Linux server in the cloud. Gunicorn is the WSGI server of choice for the web application, and Nginx is a web server which in this case serves as a reverse proxy in front of it.
The deployed web application uses a DNS name created in Azure.
A virtual environment is used to test the web application on port 8080 and to run the python3 and gunicorn commands under Python 3.
One of the challenges with CentOS 7 is that it still uses Python 2 as its default Python. Installing Python 3 and changing paths in /usr/bin might seem like a good solution, but it will come back to haunt you, because system tools such as yum depend on the default Python 2. That is why it is best to create a virtual environment with the desired Python version.
Environment
The server is hosted as a virtual machine on Azure. The Linux server uses a CentOS-based 7.7 image and the instance size is Standard D2s v3.
Install packages
Prepare the CentOS environment by installing the necessary packages:
sudo yum update -y
sudo yum install -y epel-release
sudo yum -y install python3-pip nginx git
sudo yum install --enablerepo="epel" ufw -y
sudo yum install -y policycoreutils-{python,devel}
sudo pip3 install virtualenv
Create www-data group
sudo groupadd www-data
sudo usermod -a -G www-data centos
Create virtual environment
The command below creates a new folder in the current directory, with the same name as the virtual environment. When run from the home directory /home/centos, the path to the virtual environment is /home/centos/hubievenv.
virtualenv hubievenv
Activating the virtual environment makes the shell use the Python interpreter and packages installed in the virtual environment.
source hubievenv/bin/activate
Executing the above command makes a change to the command line:
(hubievenv) [centos@hubie4 ~]$
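Besides the changed prompt, the interpreter itself can confirm that it runs from the virtual environment. The snippet below is a minimal sketch using only the standard library; the function name in_virtual_environment is my own and not part of any package.

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a virtual environment sys.prefix points at the environment, while
    # sys.base_prefix keeps pointing at the system installation.
    # Older virtualenv releases set sys.real_prefix instead.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or sys.prefix != base_prefix


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("inside virtual environment:", in_virtual_environment())
```

With the environment from this post activated, sys.executable should point into /home/centos/hubievenv/bin.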
The virtual environment can be exited by typing the deactivate command. Before doing that, the virtual environment needs to be prepared.
Install packages with pip in virtual environment
pip install gunicorn flask dash plotly pandas boto3
If you are using only Flask and not Dash, remove the dash package from the install list. If you are using only Dash, there is no need to list flask explicitly, because Dash depends on Flask and pip installs it automatically. The boto3 package is installed because my data source is AWS S3.
If you get the following error:
ERROR: botocore 1.13.33 has requirement python-dateutil<2.8.1,>=2.1; python_version >= "2.7", but you'll have python-dateutil 2.8.1 which is incompatible.
Downgrade python-dateutil:
pip install python-dateutil==2.8.0
Any other Python package needed should be installed in the virtual environment. Deactivate the virtual environment when done installing.
Create home directory for git repository
This step is not needed to make nginx and gunicorn work.
My source code for the web app is in a GitHub repository.
mkdir git
cd git
git clone https://github.com/markokole/hubie.git
After executing the above commands, the home of my web app project is /home/centos/git/hubie. This will come in handy later on.
Test web application
The virtual environment must be active for this test, so activate it again if you have already deactivated it.
Since the application I am using as an example connects to AWS S3, credentials are needed.
export AWS_ACCESS_KEY_ID=<AWS_ACCESS_KEY_ID>
export AWS_SECRET_ACCESS_KEY=<AWS_SECRET_ACCESS_KEY>
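A missing credential usually only shows up later as an error from boto3, so it can help to check the environment before starting the app. The following is a small sketch using only the standard library; the helper name missing_credentials is hypothetical and not part of Hubie or boto3.

```python
import os

# The two variables exported above.
REQUIRED_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_credentials(environ=os.environ):
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not environ.get(name)]


if __name__ == "__main__":
    missing = missing_credentials()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("AWS credentials are present in the environment.")
```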
Step into the git repository (/home/centos/git/hubie) and execute the following command:
gunicorn --chdir logic -b 0.0.0.0:8080 hubie:server
The website should now load in a browser when you enter IP_ADDRESS:8080 or DNS_NAME:8080.
Make sure port 8080 is open for inbound traffic!
Once the page loads successfully, the next goal is to serve it on port 80.
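The hubie:server argument in the gunicorn command follows the pattern MODULE:VARIABLE: Gunicorn imports the module hubie (found in the logic folder because of --chdir) and looks up a WSGI callable named server in it. In a Dash application this is typically the underlying Flask instance, exposed as app.server; whether Hubie does it exactly this way is my assumption. To illustrate the convention without any third-party packages, here is a stand-in module, not the real Hubie code:

```python
# hubie.py (illustrative stand-in, NOT the real Hubie module)


def server(environ, start_response):
    """Minimal WSGI callable; Gunicorn calls it once per incoming request."""
    body = b"Hello from a stand-in for the Hubie app\n"
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

Saved as logic/hubie.py, this stand-in would be served by the same gunicorn command as above.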
Exit virtual environment:
deactivate
Create gunicorn service
The Gunicorn service consists of two systemd units, one depending on the other: a socket unit and a service unit.
Create gunicorn.socket file
The first unit creates a socket file which listens for connections.
sudo vi /etc/systemd/system/gunicorn.socket
[Unit]
Description=gunicorn socket
[Socket]
ListenStream=/run/gunicorn.sock
[Install]
WantedBy=sockets.target
There is no need to start this unit separately, since it is a dependency of the service described below.
Create gunicorn.service file
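This file defines the Gunicorn service, which requires the socket unit mentioned above. Make sure both files share the same base name (gunicorn.socket and gunicorn.service), because systemd pairs a socket unit with the service unit of the same name.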
sudo vi /etc/systemd/system/gunicorn.service
[Unit]
Description=gunicorn daemon
Requires=gunicorn.socket
After=network.target
[Service]
User=centos
Group=www-data
WorkingDirectory=/home/centos/git/hubie
Environment="AWS_ACCESS_KEY_ID=<AWS_ACCESS_KEY_ID>"
Environment="AWS_SECRET_ACCESS_KEY=<AWS_SECRET_ACCESS_KEY>"
Environment="PATH=/home/centos/hubievenv/bin"
ExecStart=/home/centos/hubievenv/bin/gunicorn --workers 3 --chdir /home/centos/git/hubie/logic --bind unix:/run/gunicorn.sock hubie:server
[Install]
WantedBy=multi-user.target
The group www-data has to exist before this service is started. Adjust the user, group, paths and credentials to your own setup.
Start the gunicorn service
When the service file is created, start the service.
sudo systemctl start gunicorn
Enable the service so that it starts automatically after server restart.
sudo systemctl enable gunicorn
Check for status of the service with below command.
sudo systemctl status gunicorn
If the gunicorn service does not start, make the gunicorn.sock file readable, writable and executable for all users. Connecting to a Unix socket requires write permission on the socket file.
sudo chmod 667 /run/gunicorn.sock
Configure Nginx and Gunicorn
Gunicorn configuration file
First, create two folders in the nginx home (/etc/nginx). The folder sites-available will store the gunicorn configuration file, and sites-enabled will store a symbolic link to that file.
sudo mkdir /etc/nginx/{sites-available,sites-enabled}
Create the configuration file. Keep in mind that the file name has to end in .conf, because nginx.conf only includes /etc/nginx/sites-enabled/*.conf.
sudo vi /etc/nginx/sites-available/gunicorn.conf
server {
    listen 80;
    server_name mydomain.com www.mydomain.com;
    location = /favicon.ico { access_log off; log_not_found off; }
    location /hubie/ {
        root /home/centos/git;
    }
    location / {
        proxy_pass http://unix:/run/gunicorn.sock;
    }
}
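The server name is the DNS name or IP address of the server.
The first location block suppresses log entries for a missing favicon.ico file.
The second location block serves requests under /hubie/ directly from disk. With root set to /home/centos/git, they map to files in /home/centos/git/hubie/.
The third location block passes all other requests to Gunicorn through the socket file.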
Create symbolic link
Create a symbolic link in the sites-enabled folder that points to the file in the sites-available folder.
sudo ln -s /etc/nginx/sites-available/gunicorn.conf /etc/nginx/sites-enabled
Nginx configuration file
The nginx configuration file should be changed as well.
sudo mv /etc/nginx/nginx.conf /etc/nginx/nginx.conf.default
sudo vi /etc/nginx/nginx.conf
user nginx;
worker_processes auto;
error_log /var/log/nginx/error.log;
pid /run/nginx.pid;
include /usr/share/nginx/modules/*.conf;
events {
    worker_connections 1024;
}
http {
    log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                    '$status $body_bytes_sent "$http_referer" '
                    '"$http_user_agent" "$http_x_forwarded_for"';
    access_log /var/log/nginx/access.log main;
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 65;
    types_hash_max_size 2048;
    include /etc/nginx/mime.types;
    default_type application/octet-stream;
    include /etc/nginx/sites-enabled/*.conf;
    server_names_hash_bucket_size 64;
}
The file is nearly identical to the default file, except for the last two directives in the http block: the include of sites-enabled/*.conf and server_names_hash_bucket_size.
Check validity of nginx.conf
sudo nginx -t
Restart the nginx service.
sudo systemctl restart nginx
Create nginx.ini for ufw
The last file we create is an nginx.ini file, which defines application profiles for the firewall. The Linux package ufw is required for the job.
sudo vi /etc/ufw/applications.d/nginx.ini
[Nginx HTTP]
title=Web Server
description=Enable NGINX HTTP traffic
ports=80/tcp
[Nginx HTTPS]
title=Web Server (HTTPS)
description=Enable NGINX HTTPS traffic
ports=443/tcp
[Nginx Full]
title=Web Server (HTTP,HTTPS)
description=Enable NGINX HTTP and HTTPS traffic
ports=80,443/tcp
sudo ufw enable
Answer “y” to the confirmation prompt.
sudo ufw allow 'Nginx Full'
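The profile file uses INI-style syntax, so a typo in it is easy to miss. As a rough sanity check, Python's configparser can read the file and list the ports of each profile. This is only a sketch: the helper name profile_ports is made up, and ufw's own parser may be stricter or more lenient than configparser.

```python
import configparser


def profile_ports(text):
    """Map each application profile name to the value of its 'ports' key."""
    # interpolation=None keeps any '%' characters in descriptions literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    # A profile without a 'ports' key raises KeyError, which flags the mistake.
    return {name: parser[name]["ports"] for name in parser.sections()}


# Two of the profiles from nginx.ini, used here as sample input.
SAMPLE = """\
[Nginx HTTP]
title=Web Server
description=Enable NGINX HTTP traffic
ports=80/tcp

[Nginx Full]
title=Web Server (HTTP,HTTPS)
description=Enable NGINX HTTP and HTTPS traffic
ports=80,443/tcp
"""

if __name__ == "__main__":
    for name, ports in profile_ports(SAMPLE).items():
        print(name, "->", ports)
```

To check the real file, pass its contents instead of SAMPLE, for example profile_ports(open("/etc/ufw/applications.d/nginx.ini").read()).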
The following two commands build an SELinux policy module from the nginx denials recorded in the audit log and then install it. After that, the Dash web app is ready to use.
sudo grep nginx /var/log/audit/audit.log | audit2allow -M nginx
sudo semodule -i nginx.pp
If you now enter the server’s DNS name or IP address in the browser, the page loads on port 80.
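The same check can be done without a browser. Below is a minimal sketch with urllib from the standard library; fetch_status is a name I made up. It returns the HTTP status code, which also makes the errors from the next section (502, 504) visible.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=5):
    """Return the HTTP status code for url, or None if no HTTP response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # The server answered, but with an error status such as 502 or 504.
        return error.code
    except (urllib.error.URLError, OSError):
        # No HTTP response at all: DNS failure, refused connection or timeout.
        return None
```

For example, print(fetch_status("http://mydomain.com/")) should print 200 once everything is in place, where mydomain.com stands for the server name used in gunicorn.conf.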
Some error messages
502 bad gateway
connect() to unix:/run/gunicorn.sock failed (13: Permission denied) while connecting to upstream
When you run into this error, and believe me you will, make sure user nginx has write access to the *.sock file named in the error message. The nginx master process runs as root, but the worker processes run as user nginx (see the user directive in nginx.conf), and it is the workers that connect to the socket file.
With below command, it is possible to monitor the nginx error messages:
sudo tail -f /var/log/nginx/error.log
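To see at a glance who may use the socket file, its mode and owner can be printed from Python. This is a small sketch using the standard library; describe_socket is a hypothetical helper and the default path is the one used in this post.

```python
import os
import stat


def describe_socket(path="/run/gunicorn.sock"):
    """Summarise type, ownership and permissions of the file at path."""
    info = os.stat(path)
    return {
        "mode": stat.filemode(info.st_mode),        # e.g. 'srw-rw-rwx'
        "is_socket": stat.S_ISSOCK(info.st_mode),
        "uid": info.st_uid,
        "gid": info.st_gid,
        # Users outside the owning user and group need the write bit to connect.
        "others_can_write": bool(info.st_mode & stat.S_IWOTH),
    }
```

If others_can_write is False and user nginx is neither the owner nor a member of the owning group, the 13: Permission denied error above is the expected outcome. SELinux can still deny access even when the file mode allows it.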
504 Gateway Time-out
upstream timed out (110: Connection timed out) while reading response header from upstream
In the file that defines the service (gunicorn.service in this example), add the following option to the gunicorn command in ExecStart:
--timeout 120
Remember to run sudo systemctl daemon-reload and to restart the service afterwards. For more details regarding this solution, check out this Stack Overflow post.
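As an alternative to a growing list of command-line flags, Gunicorn can read its settings from a Python configuration file passed with -c. The sketch below mirrors the values used in this post; the file name and its location are my assumption, so adjust them to your setup.

```python
# gunicorn.conf.py (hypothetical file, e.g. /home/centos/git/hubie/gunicorn.conf.py)
# Gunicorn configuration files are plain Python; each variable is a Gunicorn setting.

# Listen on the same Unix socket that gunicorn.socket creates.
bind = "unix:/run/gunicorn.sock"

# Same worker count as in the ExecStart line of gunicorn.service.
workers = 3

# Seconds a worker may stay silent before it is killed and restarted
# (Gunicorn's default is 30).
timeout = 120

# Directory to switch to before loading the application module.
chdir = "/home/centos/git/hubie/logic"
```

With this file in place, the command in ExecStart could shrink to /home/centos/hubievenv/bin/gunicorn -c /home/centos/git/hubie/gunicorn.conf.py hubie:server.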
Links
Links used to put together a working example and this blog post:
- https://medium.com/faun/deploy-flask-app-with-nginx-using-gunicorn-7fda4f50066a (first one I used for my attempt)
- https://www.digitalocean.com/community/tutorials/how-to-set-up-nginx-server-blocks-on-centos-7
- https://www.digitalocean.com/community/tutorials/how-to-install-nginx-on-ubuntu-18-04
- https://www.digitalocean.com/community/tutorials/how-to-set-up-django-with-postgres-nginx-and-gunicorn-on-ubuntu-18-04 (maybe most detailed and helpful)
- https://axilleas.me/en/blog/2013/selinux-policy-for-nginx-and-gitlab-unix-socket-in-fedora-19/ (the one that got me over the finish line – last two commands in my blog post)
- https://www.digitalocean.com/community/questions/nginx-403-forbidden-2