You kick off a build and wait 3 minutes for it, then another 2 for the push. When the image finally reaches production, the container lands with Ubuntu, Python, pip cache, build tools, and every dev dependency you installed months ago. That image is eating disk space, slowing rollouts, and carrying libraries you do not even use, some with known CVEs.
I have worked with clusters where the average image was well over a gigabyte. That is not unusual. But it is also not necessary. With a handful of straightforward techniques, most application images land under 200MB. Sometimes under 50MB.
Prerequisites
- Docker Engine 24+ (check with docker version)
- A project to containerize
- Basic familiarity with Dockerfiles
Technique 1: Multi-Stage Builds
This is the biggest single win. Multi-stage builds let you use one Dockerfile with multiple FROM statements. The first stage has all the build tools. The final stage copies only what you need to run.
Go
# Stage 1: Build
FROM golang:1.23 AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -o server ./cmd/api
# Stage 2: Runtime
FROM gcr.io/distroless/static-debian12:nonroot
COPY --from=builder /app/server /server
ENTRYPOINT ["/server"]
golang:1.23 is about 800MB. Distroless is about 2MB. The final image lands around 15-20MB.
Node.js
# Stage 1: Dependencies only
FROM node:22-alpine AS deps
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci --omit=dev
# Stage 2: Runtime
FROM node:22-alpine AS runner
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY . .
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
USER appuser
ENV NODE_ENV=production
CMD ["node", "dist/index.js"]
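Dev dependencies - TypeScript, testing libraries, linters - never make it into production. The Alpine base is ~120MB. The final image stays close to that because you are only adding production dependencies and your application code on top. Note that this Dockerfile expects dist/ to be built before docker build runs.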
Python
# Stage 1: Install dependencies into a virtual environment
FROM python:3.12-slim AS builder
WORKDIR /app
RUN python -m venv /opt/venv
ENV PATH=/opt/venv/bin:$PATH
COPY requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt
# Stage 2: Runtime
FROM python:3.12-slim
WORKDIR /app
# /opt/venv is readable by the non-root user; /root/.local would not be
COPY --from=builder /opt/venv /opt/venv
COPY . .
ENV PATH=/opt/venv/bin:$PATH
ENV PYTHONUNBUFFERED=1
RUN addgroup --system app && adduser --system --ingroup app app
USER app
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
Watch out for this mistake: running pip install directly in the runtime stage, or copying the entire builder stage into it. Copy only the directory that holds the installed packages.
Why Multi-Stage Works
Build tools are heavy. The Go compiler, the C toolchain, npm with its dependency resolution - none of these are needed at runtime. Separating build from runtime means you leave most of the weight behind: in the Go example above, roughly 800MB of toolchain shrinks to a 15-20MB image.
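To put a number on that, here is a small sketch in POSIX shell. The helper name percent_saved is made up for this example, and the inputs are the approximate sizes quoted above, not measurements.

```shell
# percent_saved BEFORE_MB AFTER_MB
# Prints the whole-number percentage of size removed.
# Shell arithmetic is integer-only, so the result rounds down.
percent_saved() {
  echo $(( ($1 - $2) * 100 / $1 ))
}

# Approximate sizes from the Go example: ~800MB builder, ~15MB final image.
percent_saved 800 15   # prints 98
```

For your own images, docker image inspect --format '{{.Size}}' IMAGE prints the size in bytes, which you can convert and feed into the same helper.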
Technique 2: Pick the Right Base Image
Not all images are the same.
| Image | Size | Good For |
|---|---|---|
| debian:bookworm-slim | ~80MB | General purpose, needs apt packages |
| alpine:3.20 | ~7MB | Minimal, musl libc (test first) |
| gcr.io/distroless/base | ~10MB | Binaries that need glibc, no shell |
| gcr.io/distroless/static | ~2MB | Go, Rust, or any static binary |
| scratch | 0 bytes | Fully static binary, no OS layer |
A simple rule: start with the -slim or -alpine variants. Reach for the full default image (for example python:3.12 with no suffix) only when you actually need build tools at runtime, such as compiling native extensions on the fly.
A warning about Alpine. Some Python packages (psycopg2, numpy, pandas) compile C extensions. Alpine uses musl libc instead of glibc, and that difference can cause subtle linking failures. Python apps often do better with the -slim Debian variants.
Technique 3: .dockerignore
Docker sends your entire build context to the daemon before executing the Dockerfile. A repo with node_modules, Python virtual environments, or a .git directory can be hundreds of megabytes.
node_modules/
.git/
*.md
.gitignore
.env
.env.*
# Keep dist/ ignored only if the image builds it; drop this line if you COPY a prebuilt dist/
dist/
.next/
coverage/
__pycache__/
*.pyc
.vscode/
.idea/
Without this file, Docker does not know what to exclude. It sends everything.
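A rough way to see how much an ignore file could save is to count files with and without the usual heavy directories. This is only a sketch: the function names are made up for this example, and find's pruning is a stand-in for .dockerignore matching, which has its own pattern rules.

```shell
# count_all_files DIR
# Counts every regular file under DIR, roughly what gets sent with no .dockerignore.
count_all_files() {
  find "$1" -type f | wc -l | tr -d ' '
}

# count_kept_files DIR
# Same count, but skips anything under .git or node_modules directories.
count_kept_files() {
  find "$1" \( -name .git -o -name node_modules \) -prune -o -type f -print \
    | wc -l | tr -d ' '
}

# Example usage from the project root:
#   count_all_files .
#   count_kept_files .
```

A large gap between the two numbers is a good sign that the ignore file is worth the two minutes it takes to write.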
Technique 4: Order Layers for Caching
Docker caches each build layer. When a layer changes, all subsequent layers are rebuilt. The trick is putting things that change frequently at the bottom.
# Bad - cache invalidates on every source change
COPY . .
RUN pip install -r requirements.txt
# Good - pip only reruns when dependencies change
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
This matters most in CI. If your pipeline rebuilds on every commit, a well-ordered Dockerfile saves 30-60 seconds per build. In a monorepo with dozens of services, that compounds fast.
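The reason the good ordering works can be modeled in a few lines of shell. This is a simplified model, not Docker's actual cache logic: it treats the cache key of a COPY step as a checksum of the copied file's contents, and layer_key is a name invented for this sketch.

```shell
# layer_key FILE
# Prints a CRC checksum of FILE's contents, standing in for the cache key
# of a "COPY FILE ." step. Docker's real key covers more than this.
layer_key() {
  cksum < "$1" | cut -d ' ' -f 1
}

# Example usage:
#   layer_key requirements.txt   # unchanged while only source files change
#   layer_key main.py            # changes on every source edit
```

As long as the key for requirements.txt stays the same, the pip install layer that follows it can be reused, no matter how often the source files change.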
Technique 5: Run as Non-Root
Containers run as root by default. If someone exploits your application, they have root access inside the container. Fixing this takes two lines:
RUN addgroup --system app && adduser --system --ingroup app app
USER app
Distroless images from Google ship with a nonroot variant. Use those.
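To check what an existing image actually runs as, you can read the configured user with docker image inspect. The helper below is a sketch with a made-up name, runs_as_root, and it only interprets the string Docker reports. An empty value means no USER was set, which means root.

```shell
# runs_as_root USER_FIELD
# Succeeds (exit 0) when the image's configured user is root.
# USER_FIELD is the output of:
#   docker image inspect --format '{{.Config.User}}' IMAGE
runs_as_root() {
  case "$1" in
    ""|root|0|root:*|0:*) return 0 ;;   # unset, named root, or UID 0
    *) return 1 ;;
  esac
}

# Example usage:
#   user=$(docker image inspect --format '{{.Config.User}}' myapp:latest)
#   if runs_as_root "$user"; then echo "myapp:latest still runs as root"; fi
```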
Technique 6: Scan Before You Ship
Scan images for known vulnerabilities before pushing to production.
Docker Scout (bundled with Docker Desktop):
docker scout quickview myapp:latest
Trivy (open source, CI-friendly):
trivy image --severity HIGH,CRITICAL myapp:latest
I run Trivy in every CI pipeline. If HIGH or CRITICAL CVEs are found, the pipeline stops. This catches problems like an outdated OpenSSL in the base image before the image reaches production.
# GitHub Actions
- name: Scan image
  uses: aquasecurity/trivy-action@master
  with:
    image-ref: 'myapp:${{ github.sha }}'
    severity: 'CRITICAL,HIGH'
    exit-code: '1'
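Outside GitHub Actions, the same gate can be sketched as a plain shell function for any CI runner. The function name scan_or_fail is made up for this example. It assumes the trivy binary is on PATH and relies on Trivy's --exit-code flag to return non-zero when matching findings exist.

```shell
# scan_or_fail IMAGE
# Runs Trivy against IMAGE and fails when HIGH or CRITICAL findings exist.
scan_or_fail() {
  image=$1
  if trivy image --exit-code 1 --severity HIGH,CRITICAL "$image"; then
    echo "scan passed: $image"
  else
    echo "scan failed: $image" >&2
    return 1
  fi
}

# Example usage in a pipeline step:
#   scan_or_fail myapp:latest && docker push myapp:latest
```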
Putting It Together
A complete Dockerfile for a Go service:
FROM golang:1.23 AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -o server .
FROM gcr.io/distroless/static-debian12:nonroot
COPY --from=builder /app/server /server
ENTRYPOINT ["/server"]
Build, scan, and push:
docker build -t myapp:latest .
docker scout quickview myapp:latest
docker push myapp:latest
Final size for a Go binary? About 15MB. Compare with the 800MB golang image you would have shipped otherwise.
Common Pitfalls
Build cache in the final image. apt-get install leaves .deb files in /var/cache/apt/archives. Clean up in the same RUN layer: apt-get clean && rm -rf /var/lib/apt/lists/*.
COPY . . too early in the file. This busts the cache on every source change. Move stable COPY commands (package.json, requirements.txt) to the top.
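Using floating tags for base images. A tag like node:22-alpine can point to different images a week apart, and latest is worse. Pin a digest or at least a minor version, for example node:22.13-alpine@sha256:abc123 (abc123 stands in for the real 64-character digest).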
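The pinning pitfall is easy to check for mechanically. Below is a minimal sketch with a made-up function name, unpinned_bases, that lists FROM lines without a digest. It is deliberately naive: it will also flag scratch and references to earlier build stages, so treat the output as a prompt to look, not a verdict.

```shell
# unpinned_bases DOCKERFILE
# Prints every FROM line that does not pin an @sha256: digest.
unpinned_bases() {
  grep -i '^[[:space:]]*FROM[[:space:]]' "$1" | grep -v '@sha256:'
}

# Example usage:
#   unpinned_bases Dockerfile
```

To find the digest worth pinning, docker image inspect --format '{{index .RepoDigests 0}}' node:22-alpine prints the repository digest of the image you have pulled locally.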
Where to Go Next
Once your images are lean, these topics become useful:
- Cosign for signing images and verifying integrity in production
- Kaniko or BuildKit for building images inside Kubernetes without mounting the Docker socket
- SlimToolkit (formerly DockerSlim) as an automated optimizer for images you cannot refactor right now