Running Flask in production: a comprehensive guide

A practical guide to deploying Flask applications in production using Docker, with a focus on performance, reliability, and scalability.

Introduction

Flask's built-in development server is perfect for local testing but notably unsuitable for production environments. This development server—while convenient—lacks critical production features, resulting in performance bottlenecks, request timeouts, and security vulnerabilities when faced with real-world traffic. A proper production deployment requires a purposefully designed application server, a robust process manager, and often a front-facing web server to handle client connections efficiently.

This guide explores production-ready deployment options for Flask applications, with a primary focus on Docker-based solutions. As an advanced user, you'll learn how to configure, deploy, and maintain Flask applications that can reliably serve production traffic.

Flask production server fundamentals

Why the development server falls short

Flask's development server was designed to simplify the development process, but it has several limitations that make it unsuitable for production:

  • Single-threaded by default: Unable to handle concurrent requests efficiently
  • No built-in process management: No automatic restarts after crashes
  • Limited security features: Not hardened against various attack vectors
  • Poor performance under load: Significant degradation with increased traffic
  • No production-grade TLS support: Ad-hoc certificates at best; proper HTTPS requires additional infrastructure

Essential components of a production setup

A robust Flask production environment typically includes:

  1. WSGI server: Translates web requests to Python calls (uWSGI, Gunicorn)
  2. Web server: Handles client connections, static files, SSL termination (Nginx, Apache)
  3. Process manager: Ensures application availability and handles crashes
  4. Container or virtualisation: Isolates the application and its dependencies
  5. Monitoring and logging: Provides visibility into application behaviour

Docker-based Flask deployment

Docker has become the preferred method for deploying Flask applications due to its consistency, isolation, and orchestration capabilities. Let's explore the main approaches.

Option 1: Flask with Gunicorn in a single container

This approach is straightforward and works well for smaller applications:

# Dockerfile
FROM python:3.11-slim

WORKDIR /app

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

# Run gunicorn with 4 worker processes
CMD ["gunicorn", "--bind", "0.0.0.0:5000", "--workers", "4", "wsgi:app"]
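
This Dockerfile assumes a requirements.txt next to it; an illustrative one (the version pins are placeholders, pin to whatever you actually test against):

```text
# requirements.txt
flask==3.0.*
gunicorn==21.2.*
```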

The corresponding wsgi.py file:

from my_app import create_app

app = create_app()

if __name__ == "__main__":
    app.run()
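
The create_app factory imported above is not shown in this guide; a minimal sketch of what it might contain (the module layout and route are assumptions for illustration):

```python
# my_app/__init__.py — minimal application factory sketch
from flask import Flask


def create_app(config_object=None):
    """Build and configure the Flask app; pass a config object path if needed."""
    app = Flask(__name__)
    if config_object:
        app.config.from_object(config_object)

    @app.route("/")
    def index():
        return "Hello from Flask!"

    return app
```

The factory pattern keeps the module importable without side effects, which is exactly what Gunicorn's `wsgi:app` entry point needs.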

This setup works but lacks the performance benefits of a dedicated web server for static files and connection handling.

Option 2: Flask with Gunicorn and Nginx (multi-container)

This more robust approach uses Docker Compose to manage multiple containers:

# docker-compose.yml
version: '3.8'

services:
  flask:
    build: ./app
    container_name: flask_app
    restart: always
    environment:
      - FLASK_DEBUG=0  # FLASK_ENV is deprecated in Flask 2.3+
    networks:
      - app_network

  nginx:
    build: ./nginx
    container_name: nginx
    restart: always
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx/ssl:/etc/nginx/ssl
      - ./app/static:/static
    depends_on:
      - flask
    networks:
      - app_network

networks:
  app_network:

Flask application Dockerfile:

# app/Dockerfile
FROM python:3.11-slim

WORKDIR /app

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

CMD ["gunicorn", "--bind", "0.0.0.0:5000", "--workers", "4", "wsgi:app"]

Nginx Dockerfile:

# nginx/Dockerfile
FROM nginx:1.25-alpine

COPY nginx.conf /etc/nginx/conf.d/default.conf

Nginx configuration:

# nginx/nginx.conf
server {
    listen 80;
    server_name example.com;

    location / {
        proxy_pass http://flask:5000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }

    location /static {
        alias /static;
        expires 30d;
    }
}

Option 3: Flask with uWSGI and Nginx

While Gunicorn is popular for its simplicity, uWSGI offers more advanced features and potentially better performance:

# docker-compose.yml
version: '3.8'

services:
  app:
    build: ./app
    restart: always
    networks:
      - app_network
    volumes:
      - ./app:/app
      # share the directory holding the uWSGI Unix socket with nginx
      - uwsgi_socket:/tmp

  nginx:
    build: ./nginx
    ports:
      - "80:80"
    networks:
      - app_network
    volumes:
      - uwsgi_socket:/tmp
      - ./app/static:/app/static
    depends_on:
      - app

networks:
  app_network:

volumes:
  uwsgi_socket:

Flask with uWSGI Dockerfile:

# app/Dockerfile
FROM python:3.11-slim

WORKDIR /app

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt uwsgi

COPY . .

CMD ["uwsgi", "--ini", "uwsgi.ini"]

uWSGI configuration:

# app/uwsgi.ini
[uwsgi]
module = wsgi:app
uid = www-data
gid = www-data
master = true
processes = 5

socket = /tmp/uwsgi.sock
# nginx runs as a different user, so it needs write access to connect
chmod-socket = 666
vacuum = true

die-on-term = true

Nginx configuration for uWSGI:

# nginx/nginx.conf
server {
    listen 80;
    server_name example.com;

    location / {
        include uwsgi_params;
        uwsgi_pass unix:/tmp/uwsgi.sock;
    }

    location /static {
        alias /app/static;
    }
}

Performance optimisation

Worker configuration

The number of workers is a critical configuration parameter that affects performance. A common formula is:

workers = (2 × CPU cores) + 1

For Gunicorn, you can set this with:

gunicorn --workers=5 --threads=2 wsgi:app

For uWSGI:

[uwsgi]
processes = 5
threads = 2
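
Rather than hard-coding the count, the formula can be computed at startup. Gunicorn reads a gunicorn.conf.py if one is present, so a sketch along these lines works (the file name is Gunicorn's convention; the helper function is ours):

```python
# gunicorn.conf.py — derive the worker count from available CPUs
import multiprocessing


def recommended_workers(cores):
    """The common heuristic: (2 x cores) + 1."""
    return 2 * cores + 1


workers = recommended_workers(multiprocessing.cpu_count())
threads = 2
bind = "0.0.0.0:5000"
```

Treat the formula as a starting point: CPU-bound apps may want fewer workers, I/O-bound apps more threads.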

Connection handling

Optimise how your server handles connections:

# nginx/nginx.conf (additional optimisations)
http {
    # Connection timeouts
    keepalive_timeout 65;
    client_body_timeout 10;
    client_header_timeout 10;
    send_timeout 10;

    # Buffer sizes
    client_body_buffer_size 128k;
    client_header_buffer_size 1k;
    client_max_body_size 10m;
    large_client_header_buffers 4 4k;

    # Compression
    gzip on;
    gzip_min_length 1000;
    gzip_types text/plain text/css application/json application/javascript;
}

Scaling with Docker Swarm or Kubernetes

For larger applications, consider orchestration:

# docker-stack.yml for Docker Swarm
version: '3.8'

services:
  flask:
    image: yourusername/flask-app:latest
    deploy:
      replicas: 3
      update_config:
        parallelism: 1
        delay: 10s
      restart_policy:
        condition: on-failure
    networks:
      - app_network

  nginx:
    image: yourusername/nginx:latest
    ports:
      - "80:80"
    networks:
      - app_network
    deploy:
      replicas: 2

networks:
  app_network:
    driver: overlay

Monitoring and logging

Prometheus and Grafana setup

# Add to docker-compose.yml
services:
  # ... other services

  prometheus:
    image: prom/prometheus
    volumes:
      - ./prometheus:/etc/prometheus
    ports:
      - "9090:9090"
    networks:
      - app_network

  grafana:
    image: grafana/grafana
    ports:
      - "3000:3000"
    networks:
      - app_network
    depends_on:
      - prometheus

Flask application instrumentation:

# app.py
from flask import Flask
from prometheus_flask_exporter import PrometheusMetrics

app = Flask(__name__)
metrics = PrometheusMetrics(app)

# Static information as metric
metrics.info('app_info', 'Application info', version='1.0.3')

@app.route('/')
def main():
    return "Hello World!"

if __name__ == '__main__':
    app.run(host='0.0.0.0')

Centralised logging with ELK stack

# Add to docker-compose.yml
services:
  # ... other services

  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.16.2
    environment:
      - discovery.type=single-node
    volumes:
      - elasticsearch-data:/usr/share/elasticsearch/data
    networks:
      - app_network

  logstash:
    image: docker.elastic.co/logstash/logstash:7.16.2
    volumes:
      - ./logstash/pipeline:/usr/share/logstash/pipeline
    networks:
      - app_network
    depends_on:
      - elasticsearch

  kibana:
    image: docker.elastic.co/kibana/kibana:7.16.2
    ports:
      - "5601:5601"
    networks:
      - app_network
    depends_on:
      - elasticsearch

volumes:
  elasticsearch-data:

Configure Flask to use the ELK stack:

# logging_config.py
import logging
from logging.handlers import SocketHandler

def configure_logging(app):
    # SocketHandler sends pickled LogRecords, so the receiving Logstash
    # input must be set up to decode them (or switch to a JSON formatter)
    logstash_handler = SocketHandler('logstash', 5000)
    app.logger.addHandler(logstash_handler)
    app.logger.setLevel(logging.INFO)
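
A stock Logstash tcp input cannot decode the pickled records that SocketHandler emits, so a common alternative is to ship one JSON object per log line instead. A stdlib-only formatter sketch (the field names are our choice, not a standard):

```python
# json_logging.py — render each log record as a single JSON line
import json
import logging


class JsonLineFormatter(logging.Formatter):
    """One JSON object per record, readable by a Logstash tcp input with a json codec."""

    def format(self, record):
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        return json.dumps(payload)
```

Attach it to a StreamHandler or SocketHandler in place of the default formatter and point the Logstash pipeline's tcp input at the same port with `codec => json_lines`.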

Security considerations

Environment variables for sensitive data

# config.py
import os

class Config:
    # Fail fast: a missing SECRET_KEY should stop startup,
    # not silently fall back to a known default
    SECRET_KEY = os.environ['SECRET_KEY']
    SQLALCHEMY_DATABASE_URI = os.environ.get('DATABASE_URL')
    # other configuration variables

Pass the values through from the host environment in docker-compose.yml:

# docker-compose.yml (excerpt)
services:
  flask:
    # ... other configuration
    environment:
      - SECRET_KEY=${SECRET_KEY}
      - DATABASE_URL=${DATABASE_URL}

HTTPS configuration with Let's Encrypt

# nginx/nginx.conf
server {
    listen 80;
    server_name example.com;
    return 301 https://$host$request_uri;
}

server {
    listen 443 ssl;
    server_name example.com;

    ssl_certificate /etc/letsencrypt/live/example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/example.com/privkey.pem;
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_prefer_server_ciphers on;
    ssl_ciphers "EECDH+AESGCM:EDH+AESGCM:AES256+EECDH:AES256+EDH";
    ssl_session_cache shared:SSL:10m;

    location / {
        proxy_pass http://flask:5000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}

Best practices and common pitfalls

Best practices

  • Use application factories: Makes testing and configuration easier
  • Environment-specific configuration: Different settings for development, testing, and production
  • Health checks: Monitor application health and restart when necessary
  • Graceful shutdown: Handle termination signals properly
  • Connection pooling: Optimise database connections
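
The health-check bullet above is straightforward to wire up; a minimal endpoint that Docker's HEALTHCHECK instruction or an orchestrator can probe (the /healthz path is a convention, not a Flask requirement):

```python
# health.py — minimal liveness endpoint sketch
from flask import Flask, jsonify

app = Flask(__name__)


@app.route("/healthz")
def healthz():
    # Return 200 while the process can serve requests;
    # extend with database or cache checks for a readiness probe.
    return jsonify(status="ok"), 200
```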

Common pitfalls

  • Overloading worker processes: Leads to degraded performance
  • Memory leaks: Gradually consume all available memory
  • Inadequate error handling: Causes application crashes
  • Misconfigured proxy headers: Security vulnerabilities and broken functionality
  • Improper static file handling: Performance issues

Conclusion

Properly deploying Flask in a production environment requires careful consideration of various components. Docker provides an excellent platform for creating consistent, isolated, and scalable deployments. By following the best practices outlined in this guide, you can ensure your Flask application is robust, performant, and maintainable in production.

For your next steps, consider exploring more advanced topics such as blue-green deployments, A/B testing infrastructure, and automated scaling based on traffic patterns. These approaches can further enhance your Flask application's reliability and performance in production environments.