Docker is a platform for developing, shipping, and running applications in lightweight, portable containers. It solves the "it works on my machine" problem by packaging an application with all its dependencies—code, runtime, system tools, libraries, and settings—into a standardized unit. Teams pick Docker over traditional virtualization because containers share the host OS kernel, making them faster to start, more resource-efficient, and smaller in size than virtual machines. For a Senior Full Stack AI Engineer, it's the foundational tool for ensuring consistent environments from a local LlamaIndex prototype to a scalable microservice deployment on AWS, Azure, or GCP, and it's the primary building block for Kubernetes-based orchestration.
The core mental model is the separation of concerns between the Docker Engine (the runtime), Images (the immutable blueprint), and Containers (the running instance).
Key Internals & Architecture:
The docker CLI talks to the Docker Daemon (dockerd), a long-running background service that manages images, containers, networks, and volumes. They communicate via a REST API, usually over a Unix socket (/var/run/docker.sock) or a TCP port.
Each instruction in a Dockerfile (e.g., FROM, RUN, COPY) creates a new layer. Layers are cached and shared between images, making builds efficient. The final image is tagged (e.g., myapp:v1.0) and stored in a Registry (Docker Hub, Amazon ECR, Azure Container Registry).
On docker run, the daemon adds a thin, writable "container layer" on top of the image's read-only layers using a Union File System (like overlay2). This is where any file changes during runtime live. The container gets its own isolated process space, network stack, and mount namespace via Linux namespaces, while cgroups enforce resource limits such as --memory=2g.
The end-to-end flow starts with a Dockerfile. The Docker CLI sends the build context to the daemon, which executes the instructions layer-by-layer to create an image. The image is pushed to a registry. In production, the daemon pulls the image and creates a container from it, applying runtime constraints (resource limits, network config, volume mounts).
[Developer Machine]
|
| (docker build/push)
v
[Docker Daemon] <-----> [Container Registry (ECR/ACR/GCR)]
| (Manages)
|
[Container] (Namespace/cgroup isolation)
| |
| +---> Writable Layer
| +---> Image Layers (ubuntu, python, app code...)
|
[Host OS Kernel (Linux/Windows)]
|
[Infrastructure (AWS EC2, Azure VM, etc.)]
Multi-stage builds are critical for production, especially with compiled languages or Node.js applications with heavy build dependencies: they keep the final image small and secure by discarding build tools.
# Stage 1: The Builder
FROM node:18-alpine AS builder
WORKDIR /app
COPY package*.json ./
# npm 8+ deprecates --only=production in favor of --omit=dev
RUN npm ci --omit=dev
# Copy source and build if needed (e.g., TypeScript)
COPY src ./src
# RUN npm run build # Uncomment for TS/React etc.
# Stage 2: The Runner
FROM node:18-alpine
WORKDIR /app
# Copy only the necessary artifacts from the builder stage
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/package.json ./
# Use /app/dist instead of /app/src if there is a build step
# (Dockerfile comments must start at the beginning of a line)
COPY --from=builder /app/src ./src
USER node
EXPOSE 3000
CMD ["node", "src/server.js"]
Docker Compose orchestrates multi-container applications (app + database + cache + AI model server). It's essential for replicating a microservice or full-stack AI environment locally.
# docker-compose.yml
version: '3.8'
services:
ai-backend:
build: ./backend
ports:
- "8000:8000"
environment:
- REDIS_URL=redis://redis:6379
- MODEL_PATH=/models/llama2
volumes:
- ./backend/src:/app/src # Live code reload
- ./models:/models # Mount pre-downloaded AI models
depends_on:
- redis
- model-server
model-server:
image: ghcr.io/llama-index/llama-index-server:latest
ports:
- "8080:8080"
volumes:
- ./models:/models
redis:
image: redis:7-alpine
ports:
- "6379:6379"
command: redis-server --appendonly yes
volumes:
- redis_data:/data
volumes:
redis_data:
Production containers must be robust. A healthcheck lets the orchestrator (Kubernetes, ECS) know if the app is alive. Handling SIGTERM ensures graceful shutdown.
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
# Define a healthcheck (critical for K8s liveness/readiness probes)
# python:3.11-slim does not ship curl, so probe with the stdlib instead
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
    CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:5000/health')"
# Use a wrapper script for graceful shutdown
COPY entrypoint.sh .
RUN chmod +x entrypoint.sh
ENTRYPOINT ["./entrypoint.sh"]
#!/bin/bash
# entrypoint.sh
# Trap SIGTERM and gracefully stop the app
trap 'echo "SIGTERM received, shutting down..."; kill -TERM $PID; wait $PID' TERM
python app.py &
PID=$!
wait $PID
Running as root inside a container is a security risk. Always switch to a non-root user.
FROM ubuntu:22.04
RUN apt-get update && apt-get install -y python3 python3-pip \
&& rm -rf /var/lib/apt/lists/* \
&& groupadd -r appuser && useradd -r -g appuser appuser
WORKDIR /home/appuser
COPY --chown=appuser:appuser . .
RUN pip3 install --no-cache-dir -r requirements.txt
# Switch to non-root user for execution
USER appuser
EXPOSE 8000
CMD ["python3", "app.py"]
Optimizing build speed and image size is a hallmark of seniority. Order Dockerfile instructions from least to most frequently changing and use a .dockerignore file.
# .dockerignore
node_modules
npm-debug.log
.git
.gitignore
.env
*.pyc
__pycache__
.DS_Store
README.md
tests/
# Ignore large model files unless specifically copied
models/*.bin
# Dockerfile - Optimized for cache
# 1. Install OS dependencies (changes infrequently)
FROM python:3.11-slim
RUN apt-get update && apt-get install -y --no-install-recommends \
gcc libpq-dev && rm -rf /var/lib/apt/lists/*
# 2. Install Python dependencies (cache breaks when requirements.txt changes)
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# 3. Copy application code (changes most frequently)
COPY . .
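With BuildKit (the default builder in recent Docker releases), a cache mount can additionally preserve pip's download cache across builds even when requirements.txt changes. A sketch, assuming BuildKit is enabled:

```dockerfile
# syntax=docker/dockerfile:1
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
# BuildKit cache mount: pip's cache persists between builds,
# so only changed packages are re-downloaded (note: no --no-cache-dir here,
# since the cache is mounted, not baked into the layer)
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt
COPY . .
```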
Q: Explain what happens when you run docker run -it ubuntu bash.
A: The Docker daemon checks locally for the ubuntu:latest image. If not found, it pulls it from Docker Hub. It creates a new container with a writable layer atop the image layers, allocates a pseudo-TTY (-t), keeps STDIN open (-i), and sets the command to bash. It uses namespaces to isolate the container's process tree, network, and filesystem; cgroups are available to constrain resources, but no limits are applied unless you request them (e.g., --memory). The container runs until bash exits.
Q: What's the difference between COPY and ADD in a Dockerfile? When would you use each?
A: COPY is straightforward: it copies local files/directories from the build context into the image. ADD has extra features: it can copy from remote URLs and automatically extract local tar archives. The best practice is to always use COPY unless you specifically need the tar extraction or remote URL functionality of ADD, as its behavior is less predictable and transparent.
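The distinction can be shown in a short Dockerfile fragment (file names are illustrative):

```dockerfile
# COPY: plain copy from the build context into the image (preferred)
COPY requirements.txt /app/

# ADD: also auto-extracts a local tar archive into the destination
ADD vendor-libs.tar.gz /opt/vendor/

# ADD can fetch remote URLs, but the download busts the build cache poorly
# and the fetched file lands with 600 permissions; a RUN with curl/wget
# gives more control:
# RUN curl -fsSL https://example.com/tool.tar.gz | tar -xz -C /opt
```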
Q: How do Docker containers communicate with each other on the same host?
A: By default, Docker creates a bridge network (docker0). Each container gets a virtual Ethernet interface attached to this bridge and an IP address. Containers can communicate using these internal IPs. For better service discovery, you should create a user-defined bridge network (docker network create mynet). Containers attached to the same user-defined network can communicate using their container names as hostnames, and Docker provides an embedded DNS server to resolve these names.
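In Compose, this DNS-based discovery comes essentially for free: services attached to the same network resolve each other by service name. A minimal sketch (service and network names are hypothetical):

```yaml
services:
  web:
    image: nginx:alpine
    networks: [appnet]
  api:
    image: myorg/api:1.0   # hypothetical image
    networks: [appnet]
    # inside this container, the hostname "web" resolves to the nginx service
networks:
  appnet:
    driver: bridge
```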
Q: When would you choose Docker over a full VM, and when might a VM be better?
A: Choose Docker for application portability, fast startup, high density (running many instances on one host), and microservice architectures. It's ideal for stateless services, CI/CD pipelines, and packaging dependencies. Choose a VM when you need to run a full OS with a different kernel (e.g., Windows on Linux), for strict security isolation (though containers can be secure, VMs offer a stronger boundary), or for legacy applications that require specific kernel modules or modifications.
Q: You have a containerized Node.js application that's crashing intermittently in production. What's your debugging strategy using Docker?
A: First, I'd check the container logs: docker logs <container_id> --tail 50 -f. Then, I'd inspect the container's resource usage at the time of crash using docker stats history or the host's monitoring. If logs aren't enough, I'd examine the container's metadata and state with docker inspect <container_id>. For a deeper dive, I might commit the state of a failing container to a new image (docker commit) for offline analysis, or if possible, exec into a running container (docker exec -it <container_id> sh) to check process lists, memory, and internal logs. The goal is to correlate the crash with resource limits (OOM Killer), application errors, or external dependencies.
Q: How do you manage persistent data (like a database) with Docker containers?
A: Containers are ephemeral. For persistent data, you must use Docker Volumes or bind mounts. Volumes are managed by Docker and stored in a host directory (/var/lib/docker/volumes/). They are the preferred mechanism because they can be named, easily backed up, and managed by Docker CLI. Bind mounts mount a specific host path into the container. They offer higher performance but tie the container to a specific host's filesystem layout. For a database, you'd define a volume in your docker run command or Compose file: -v db_data:/var/lib/postgresql/data. Never store data in the container's writable layer.
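The database example looks like this in a Compose file (the service name and password are placeholders):

```yaml
services:
  db:
    image: postgres:16-alpine
    environment:
      POSTGRES_PASSWORD: example   # placeholder; use secrets in real deployments
    volumes:
      # Named volume: survives docker compose down and container recreation
      - db_data:/var/lib/postgresql/data
volumes:
  db_data:
```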
Q: How does Docker fit into a CI/CD pipeline for a microservice?
A: Docker is the consistent packaging format throughout the pipeline. In CI, the pipeline builds a Docker image from the application code, tags it with the commit SHA (e.g., myapp:git-abc123), and runs unit/integration tests inside a container. If tests pass, it might scan the image for vulnerabilities. In CD, the pipeline pushes the validated image to a production registry (ECR). The deployment system (Kubernetes, ECS) is then instructed to update the deployment to use the new image tag. This ensures the artifact tested is identical to the one deployed.
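A sketch of that pipeline as a GitHub Actions job (the workflow shape and the $ECR_REPO variable are assumptions, not from the original):

```yaml
name: build-and-push
on: [push]
jobs:
  docker:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build image tagged with the commit SHA
        run: docker build -t myapp:git-${GITHUB_SHA::7} .
      - name: Run tests inside the freshly built image
        run: docker run --rm myapp:git-${GITHUB_SHA::7} npm test
      - name: Push the validated image (registry login assumed configured)
        run: |
          docker tag myapp:git-${GITHUB_SHA::7} $ECR_REPO:git-${GITHUB_SHA::7}
          docker push $ECR_REPO:git-${GITHUB_SHA::7}
```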
Q: Explain the relationship between Docker and Kubernetes.
A: Docker is a containerization platform used to build, package, and run individual containers. Kubernetes is a container orchestration platform. It uses a container runtime such as containerd or CRI-O (direct Docker Engine support via dockershim was removed in Kubernetes 1.24, though Docker-built images run unchanged) to pull images and run containers. Kubernetes then manages the lifecycle of these containers at scale: scheduling them across a cluster of machines, ensuring desired replica counts, handling load balancing, service discovery, and rolling updates. Think of Docker as the "what" (the application package) and Kubernetes as the "where and how many" (the cluster scheduler and manager).
As a Senior Full Stack AI Engineer at WWT, Docker is the linchpin that connects all the pieces of your stack into a reproducible, deployable unit.
Common anti-patterns to call out:
- Running as root by forgetting the USER directive.
- Using the latest tag in production. This is a recipe for instability; you should advocate for immutable, version-specific tags (e.g., myapp:1.2.3-gitabc123).
- Skipping .dockerignore. This leads to bloated build contexts and images, and potentially leaking secrets.
- Using docker commit as part of a build process. This creates opaque, unreproducible images. The correct answer is always to define everything in a Dockerfile.
Mentioning finer points (e.g., separating dev vs prod dependencies) shows depth.