Introduction
The following quote from FastAPI creator Sebastián Ramírez challenges a common Docker practice. His warning against Alpine images for Python projects initially surprised me, so I tested his claims in production environments. Here's what I discovered about optimising Python container deployments.
The quote in full:
In short: You probably shouldn't use Alpine for Python projects, instead use the slim Docker image versions.
Do you want more details? Continue reading.
Alpine is more useful for other languages where you build a static binary in one Docker image stage (using multi-stage Docker building) and then copy it to a simple Alpine image, and then just execute that binary. For example, using Go.
But for Python, as Alpine doesn't use the standard tooling used for building Python extensions, when installing packages, in many cases Python (pip) won't find a precompiled installable package (a "wheel") for Alpine. And after debugging lots of strange errors you will realize that you have to install a lot of extra tooling and build a lot of dependencies just to use some of these common Python packages.
This means that, although the original Alpine image might have been small, you end up with an image with a size comparable to the size you would have gotten if you had just used a standard Python image (based on Debian), or in some cases even larger.
And in all those cases, it will take much longer to build, consuming much more resources, building dependencies for longer, and also increasing its carbon footprint, as you are using more CPU time and energy for each build.
If you want slim Python images, you should instead try and use the slim versions that are still based on Debian, but are smaller.
Selecting the optimal Python Docker image
When containerising Python applications, selecting the appropriate base image is a critical decision that affects build times, deployment efficiency, security posture, and runtime performance. While Alpine Linux has become popular for containerisation across many languages, its usage with Python deserves careful consideration.
In short: You probably shouldn't use Alpine for Python projects, instead use the slim Docker image versions.
This seemingly controversial statement from Sebastián Ramírez (creator of FastAPI) challenges conventional wisdom around Docker image optimisation. This article explores why this recommendation holds true for most Python applications and offers advanced guidance on selecting and configuring the optimal Python container environment.
The Alpine misconception
Alpine Linux's minimalist design and small footprint make it an attractive option for containerisation. Many developers instinctively reach for Alpine-based images assuming they'll achieve optimal efficiency. However, this approach often proves counterproductive for Python applications.
Alpine is more useful for other languages where you build a static binary in one Docker image stage (using multi-stage Docker building) and then copy it to a simple Alpine image, and then just execute that binary. For example, using Go.
The core issue stems from Alpine's use of musl libc instead of the more common glibc. This fundamental difference affects how Python packages with C extensions are compiled and installed.
Understanding the Python packaging ecosystem
To comprehend why Alpine presents challenges for Python applications, we must first understand how Python packages are distributed and installed.
The wheel mechanism
Python's packaging ecosystem relies heavily on wheels (.whl files) – pre-built binary distributions that allow for rapid installation without compilation. When a compatible wheel exists for your platform, pip can install it directly, avoiding the compilation process entirely.
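The compatibility check happens through tags encoded in the wheel's filename. As a minimal sketch (the numpy filename below is illustrative, not a pinned recommendation), the tags can be pulled apart like this:

```python
# A wheel filename encodes its compatibility tags:
#   {distribution}-{version}-{python tag}-{abi tag}-{platform tag}.whl
wheel = "numpy-1.26.4-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl"

# Strip the ".whl" suffix and split on "-" to recover the five fields.
dist, version, py_tag, abi_tag, platform_tag = wheel[:-len(".whl")].split("-")
print(py_tag)        # cp311 (CPython 3.11)
print(platform_tag)  # manylinux_2_17_x86_64.manylinux2014_x86_64
```

pip installs a wheel only when these tags match the running interpreter and platform; Alpine's musl-based platform corresponds to musllinux tags, which far fewer projects publish than manylinux.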
The Python Package Index (PyPI) hosts wheels for popular platforms, primarily:
- Windows (various versions)
- macOS (various versions)
- Linux using glibc (as used by Debian, Ubuntu, CentOS, etc.)
Notably absent from this list is Linux using musl libc (Alpine). When pip runs on Alpine, it frequently fails to find compatible wheels and must fall back to building from source.
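A quick way to see which C library your container's interpreter runs against, and hence which wheel family pip will look for, is the standard library's platform module. Note the asymmetry: platform.libc_ver() only detects glibc, so on musl systems it reports an empty string.

```python
import platform

# platform.libc_ver() inspects the running executable: it returns
# ("glibc", "<version>") on glibc systems, and ("", "") where glibc
# is not found (as on musl-based Alpine).
libc, version = platform.libc_ver()
if libc == "glibc":
    print(f"glibc {version}: manylinux wheels should install directly")
else:
    print("glibc not detected (likely musl): expect source builds on Alpine")
```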
But for Python, as Alpine doesn't use the standard tooling used for building Python extensions, when installing packages, in many cases Python (pip) won't find a precompiled installable package (a "wheel") for Alpine. And after debugging lots of strange errors you will realize that you have to install a lot of extra tooling and build a lot of dependencies just to use some of these common Python packages.
This compilation process presents several challenges:
- Dependency hell – Building packages from source requires development tools and libraries that aren't included in the base Alpine image
- Build failures – Packages may have assumptions about the build environment that don't hold true on Alpine
- Extended build times – Compilation significantly increases image build duration
- Larger final images – The tools required for compilation often remain in the final image unless carefully removed
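To make the problem concrete, here is a hedged sketch of what an Alpine Dockerfile often ends up looking like once a package such as psycopg2 has to compile from source; the exact apk package list varies with your dependencies:

```dockerfile
FROM python:3.11-alpine

# Compiler, musl headers, and PostgreSQL headers needed to build
# psycopg2 from source -- none of which a Debian-based image needs,
# because a compatible manylinux wheel exists there.
RUN apk add --no-cache gcc musl-dev postgresql-dev \
    && pip install --no-cache-dir psycopg2
```

Each such package adds tooling, build time, and image size, which is how the "small" Alpine base ends up producing a large final image.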
Comparative analysis of Python Docker images
Let's examine the primary options for Python Docker images:
1. Standard Python images (python:3.x)
The default Python images use Debian as their base. These images include:
- A complete Python installation
- Common development tools
- Libraries required for building extensions
- Size: ~900MB-1GB
- Build speed: Fast (most packages have compatible wheels)
- Compatibility: Excellent
- Security: Good, with regular updates
2. Slim variants (python:3.x-slim)
These images also use Debian but strip out documentation, localisations, and non-essential packages.
- Size: ~150-200MB
- Build speed: Generally fast (compatible wheels available)
- Compatibility: Excellent
- Security: Good, with regular updates
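One caveat worth flagging: slim strips some shared libraries that the full image carries, so the occasional runtime package has to be added back. A sketch, assuming an application that links against the system PostgreSQL client library (libpq5 on Debian); the exact package depends on your dependencies:

```dockerfile
FROM python:3.11-slim

# Add back a runtime shared library that slim omits
# but the full Debian-based image ships by default.
RUN apt-get update && apt-get install -y --no-install-recommends libpq5 \
    && rm -rf /var/lib/apt/lists/*
```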
3. Alpine variants (python:3.x-alpine)
Based on Alpine Linux with a minimal footprint.
- Size: ~45-60MB (base image)
- Build speed: Often slow (requires building from source)
- Compatibility: Problematic with many packages
- Security: Good, with regular updates
- Final size after dependencies: Often comparable to slim variants
This means that, although the original Alpine image might have been small, you end up with an image with a size comparable to the size you would have gotten if you had just used a standard Python image (based on Debian), or in some cases even larger.
Performance and environmental impact
The build performance differences between these images translate to practical implications beyond mere convenience:
And in all those cases, it will take much longer to build, consuming much more resources, building dependencies for longer, and also increasing its carbon footprint, as you are using more CPU time and energy for each build.
These considerations are especially relevant for:
- CI/CD pipelines where builds occur frequently
- Development environments with iterative container rebuilds
- Organisations with sustainability commitments
Optimising Python Docker images for production
Having established that slim variants typically offer the best balance for Python applications, let's explore advanced techniques for optimising these images for production use.
Multi-stage builds
Multi-stage builds allow you to use one image for building and another for running your application. This approach enables:
- Installing build-time dependencies only in the build stage
- Copying only the necessary files to the runtime image
- Reducing the attack surface of the final image
# Build stage
FROM python:3.11-slim AS builder

WORKDIR /app

RUN apt-get update && apt-get install -y --no-install-recommends \
    build-essential \
    && rm -rf /var/lib/apt/lists/*

COPY requirements.txt .
RUN pip wheel --no-cache-dir --no-deps --wheel-dir /app/wheels -r requirements.txt

# Runtime stage
FROM python:3.11-slim

WORKDIR /app

# Create a non-root user
RUN useradd -m appuser && \
    chown -R appuser:appuser /app

# Copy only the built wheels and install
COPY --from=builder /app/wheels /app/wheels
COPY --from=builder /app/requirements.txt .
RUN pip install --no-cache-dir /app/wheels/*

# Copy application code
COPY --chown=appuser:appuser . .

USER appuser
CMD ["python", "main.py"]
Managing package versions with pip-tools
The pip-tools package provides reliable dependency pinning and management. This approach ensures reproducible builds and prevents unexpected changes.
- Create a requirements.in file with your direct dependencies:
flask==3.0.1
sqlalchemy
psycopg2-binary
- Generate a fully pinned requirements.txt:
pip-compile requirements.in
- Use the pinned requirements in your Dockerfile:
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
Setting up useful plugins and extensions
Advanced Python Docker setups often benefit from additional tools that enhance debugging, monitoring, and performance. Here are some valuable additions for production-ready containers:
1. Configuring Python's GC monitoring
Python's garbage collection can be monitored and tuned using the gc module. Creating a simple plugin to expose GC statistics can provide valuable insights:
# gc_monitor.py
import gc
import json
import threading
import time
from pathlib import Path


class GCMonitor:
    def __init__(self, interval=60, output_dir="/app/metrics"):
        self.interval = interval
        self.output_dir = Path(output_dir)
        self.output_dir.mkdir(parents=True, exist_ok=True)
        self.running = False

    def start(self):
        self.running = True
        threading.Thread(target=self._monitor_loop, daemon=True).start()

    def _monitor_loop(self):
        while self.running:
            stats = {
                "counts": gc.get_count(),          # per-generation allocation counts
                "thresholds": gc.get_threshold(),  # collection thresholds
                "objects": len(gc.get_objects()),  # objects tracked by the GC
                "timestamp": time.time(),
            }
            with open(self.output_dir / "gc_stats.json", "w") as f:
                json.dump(stats, f)
            time.sleep(self.interval)
To use this monitor, add it to your application's startup:
from gc_monitor import GCMonitor
# Start GC monitoring
monitor = GCMonitor()
monitor.start()
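For verifying the snapshot format without starting the background thread, the same kind of snapshot can be captured once by hand, writing to a temporary directory rather than /app/metrics:

```python
import gc
import json
import tempfile
import time
from pathlib import Path

out_dir = Path(tempfile.mkdtemp())

# One-shot capture of the same kind of fields the monitor loop writes.
stats = {
    "counts": gc.get_count(),          # per-generation allocation counts
    "thresholds": gc.get_threshold(),  # collection thresholds
    "objects": len(gc.get_objects()),  # objects currently tracked by the GC
    "timestamp": time.time(),
}
(out_dir / "gc_stats.json").write_text(json.dumps(stats))

snapshot = json.loads((out_dir / "gc_stats.json").read_text())
print(snapshot["objects"] > 0)  # True
```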
2. Configuring APM with Python agent
For production monitoring, Application Performance Monitoring (APM) tools provide invaluable insights. The Elastic APM Python agent offers a lightweight solution:
# Add to your Dockerfile
RUN pip install "elastic-apm[flask]"
Then configure in your Flask application:
import os

from flask import Flask
from elasticapm.contrib.flask import ElasticAPM


def create_app():
    app = Flask(__name__)
    app.config['ELASTIC_APM'] = {
        'SERVICE_NAME': 'your-service-name',
        'SERVER_URL': os.environ.get('APM_SERVER_URL', 'http://apm-server:8200'),
        'ENVIRONMENT': os.environ.get('FLASK_ENV', 'production'),
    }
    apm = ElasticAPM(app)
    # Rest of your app configuration
    return app
3. Setting up Python profiling with py-spy
For on-demand profiling without modifying your application code, py-spy provides a powerful solution that can be included in your container:
# Install py-spy in your Dockerfile
RUN apt-get update && apt-get install -y --no-install-recommends \
    procps \
    && rm -rf /var/lib/apt/lists/* \
    && pip install py-spy
With this setup, you can run profiling commands when needed:
# From inside the container or via docker exec
py-spy record -o profile.svg --pid 1
Security considerations
When deploying Python containers to production, security must be a priority:
- Run as non-root user: Always configure your container to run as a non-privileged user
- Pin package versions: Use exact versions for all dependencies to prevent supply chain attacks
- Regular updates: Establish a process for updating base images and dependencies
- Image scanning: Implement automated vulnerability scanning in your CI/CD pipeline
- Minimal images: Include only what's necessary for your application to run
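Two of these points, version pinning and supply-chain protection, combine well with pip's hash-checking mode. A sketch, assuming requirements.txt was generated with pip-compile --generate-hashes:

```dockerfile
COPY requirements.txt .

# --require-hashes makes pip refuse any package whose downloaded archive
# does not match the hash recorded at compile time, blocking tampered uploads.
RUN pip install --require-hashes --no-cache-dir -r requirements.txt
```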
Conclusion
If you want slim Python images, you should instead try and use the slim versions that are still based on Debian, but are smaller.
Ramírez's recommendation holds up under scrutiny. While Alpine images appear attractive initially, the practical challenges they present for Python applications typically outweigh their benefits.
For production Python applications:
- Start with python:3.x-slim as your base image
- Use multi-stage builds to separate build and runtime concerns
- Implement proper dependency management with pip-tools or similar
- Configure monitoring and profiling tools appropriate for your environment
- Follow security best practices for container deployment
By following these guidelines, you'll achieve a balance of performance, security, and maintainability that serves your Python applications well in production environments.