Containers

Docker

A practical introduction to how Docker packages software, runs containers, and connects the parts of an application.

Docker packages an application and its runtime dependencies into a portable, isolated unit called a container. Unlike virtual machines, containers share the host operating system kernel but are isolated using Linux namespaces and control groups, making them lightweight, fast to start, and highly portable. The result is a delivery model where the same image is built once, tested consistently, and promoted across environments without configuration drift or dependency mismatch.

Learning objectives

What you should be able to do after reading.
  • Explain the difference between an image, a container, and the host they run on.
  • Describe the main files and commands used to build and run a Docker-based application.
  • Recognize the typical lifecycle from build to run to update.

At a glance

Fast mental model before you dive in.
Building blocks
  • Images
  • Containers
  • Dockerfile
Runtime view
  • Processes
  • Ports
  • Volumes
Operational habits
  • Repeatable builds
  • Clear tagging
  • Simple service wiring

Core idea

At the operating system level, a container is a process bounded by two Linux primitives: namespaces and cgroups. Namespaces isolate what the process can see: its filesystem view, network interfaces, hostname, and process tree. Control groups (cgroups) limit how much CPU, memory, and I/O the process can consume. Docker bundles these kernel features behind a developer-friendly toolchain, but it is these primitives that actually create the isolation.
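
A quick way to see cgroups at work is to set resource limits when starting a container. The flags below are standard docker run options; alpine is used purely as a small example image.

    # Constrain the container's cgroup: half a CPU core, 256 MB of memory.
    docker run --rm --cpus="0.5" --memory="256m" alpine echo "constrained hello"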

That distinction matters when reasoning about security. A container is not a virtual machine; it shares the host kernel. If the kernel has a vulnerability a container process can reach, isolation can be broken. This is why kernel version, container runtime configuration, and workload privilege all affect the security posture of a containerized system.

Reproducibility is the other central benefit. An image captures not just application code but the exact versions of system libraries, configuration, and tooling that the code depends on. Moving that image from a developer's laptop through staging to production changes only the runtime context, not the software. This makes environments easier to reason about and problems easier to reproduce.

The Docker architecture separates the image (a read-only, layered blueprint) from the container (a running instance with a writable top layer). Multiple containers can run from the same image simultaneously without interfering with each other's filesystem state, and stopping a container leaves the image intact for the next run.

Working model

  • Write a Dockerfile that describes exactly how the image is built, including base image, dependencies, and configuration.
  • Build the image to produce a versioned, tagged artifact that can be pushed to a registry and pulled in any environment.
  • Run the image as one or more containers, exposing only the ports and volumes the workload actually needs.
  • Promote the same image artifact through environments rather than rebuilding at each stage.
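
A minimal sketch of this loop in shell form, with a hypothetical registry host and version tag:

    # Build once, tag explicitly, and push the artifact to a registry.
    docker build -t registry.example.com/myapp:1.4.2 .
    docker push registry.example.com/myapp:1.4.2

    # Every environment then pulls and runs the identical artifact.
    docker run -d -p 8080:8080 registry.example.com/myapp:1.4.2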

Operational baseline

  • Tag images with a specific version or commit reference rather than relying on floating tags like 'latest' for automated deployments.
  • Keep each container to a single responsibility so its Dockerfile, resource needs, and failure mode are easy to understand.
  • Treat the Dockerfile and Compose file as first-class application code. Version, review, and test them with the same discipline as the application itself.
  • Never bake environment-specific configuration or secrets into an image; inject them at runtime through environment variables or a secrets manager.
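
A short sketch of the tagging and injection habits above (the image name and env file are placeholders):

    # Tag with the commit SHA for traceability instead of a floating tag.
    docker build -t myapp:$(git rev-parse --short HEAD) .

    # Inject environment-specific configuration at run time, not build time.
    docker run -d --env-file ./prod.env myapp:$(git rev-parse --short HEAD)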

Signals to watch for

Patterns worth investigating further.
  • A container image that changes without a matching code change.
  • Multiple services sharing a single container for convenience.
  • Ports exposed without a clear purpose or owner.

Deep dive

Container basics

A container is not a package; it is a running process with a bounded view of the operating system. When Docker starts a container, the Linux kernel creates a new set of namespaces for it: a private filesystem view (mount namespace), a private network stack (network namespace), a private hostname (UTS namespace), and an isolated process tree (PID namespace). The container process believes it is the only workload on the machine, even though it shares the kernel with many others.
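
The PID namespace is easy to observe directly. Running ps inside a fresh container shows only that container's own process tree (alpine used as a convenient small image):

    # The only visible process is ps itself, running as PID 1.
    docker run --rm alpine ps

    # The UTS namespace gives the container its own hostname as well.
    docker run --rm alpine hostname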

Resource isolation is handled by control groups (cgroups), which set hard limits on CPU time, memory usage, and disk I/O. Without cgroups, a single misbehaving container could starve all others of resources. Together, namespaces and cgroups define the two fundamental axes of container isolation: what a container can see and how much it can consume.

Container images use a union filesystem built from layers. Each instruction in a Dockerfile that modifies the filesystem adds a new read-only layer. These layers are content-addressed and shared across images with a common history, so storage is efficient. When a container starts, Docker adds a thin writable layer on top; writes go there and are discarded when the container is removed, which is why persistent data needs to live in volumes.
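
docker diff makes the writable layer visible: it lists exactly what a container has changed relative to its image (the container name below is arbitrary):

    docker run -d --name scratch alpine sleep 300
    docker exec scratch touch /note.txt
    docker diff scratch      # shows the additions in the writable layer
    docker rm -f scratch     # removing the container discards those changes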

A common misunderstanding is treating containers as lightweight virtual machines. They are not: VMs have their own kernel, while containers share the host's. Running as root inside a container means the process has root-level capability within its namespace; if namespace isolation is breached, that becomes root on the host. This is why avoiding root in containers and using minimal capabilities matters for production workloads.
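
A minimal sketch of the non-root habit, assuming an Alpine-based image and a hypothetical application binary named myapp:

    FROM alpine:3.19
    COPY myapp /usr/local/bin/myapp
    # Create and switch to an unprivileged user; the process no longer runs as root.
    RUN adduser -D appuser
    USER appuser
    CMD ["myapp"]

At run time, dropping unneeded kernel capabilities (for example with --cap-drop=ALL) narrows the blast radius further.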

Images vs containers

An image is a read-only, content-addressed artifact. It contains a filesystem snapshot built from Dockerfile instructions, plus metadata such as the default entry point, exposed ports, and environment variables. Images are composable: each Dockerfile instruction creates a new filesystem layer, and those layers are reused across images that share a common base, saving storage and transfer time.

A container is what happens when an image is started. Docker adds a thin writable layer on top of the image's read-only layers. All file changes in the container go into this writable layer. When the container is removed, the writable layer is discarded. The original image layers are untouched and available for the next container instance from the same image.

This copy-on-write model has practical implications. Reading files is fast because data is served directly from the shared immutable layers. Writing or modifying files triggers a copy to the writable layer, which keeps different container instances from interfering with each other's filesystem state. This is why volumes exist: to persist data that needs to outlive the container's writable layer.
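
The isolation is easy to demonstrate: two containers from the same image never see each other's writes (container names arbitrary):

    docker run -d --name first alpine sleep 300
    docker run -d --name second alpine sleep 300
    docker exec first sh -c 'echo hello > /data.txt'
    docker exec second cat /data.txt   # fails: the file exists only in first's layer
    docker rm -f first second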

Image size affects startup time, storage, pull latency, and attack surface. Every layer in an image is present in every container started from it. Files added in an early layer cannot be removed by a later RUN instruction; they are still present in the image, just shadowed. This is the key motivation for multi-stage builds: ensuring the final image contains only what the application needs to run.
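
A sketch of a multi-stage build, using Go purely to illustrate the pattern (any compiled toolchain follows the same shape):

    # Stage 1: full toolchain, used only to compile.
    FROM golang:1.22 AS build
    WORKDIR /src
    COPY . .
    RUN CGO_ENABLED=0 go build -o /server .

    # Stage 2: ship only the binary; build tools never reach production.
    FROM gcr.io/distroless/static
    COPY --from=build /server /server
    ENTRYPOINT ["/server"]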

Dockerfile

A Dockerfile is a sequence of instructions executed by the Docker build engine. Each instruction that modifies the filesystem produces a new immutable layer. The order of instructions matters both for correctness and for build performance, because Docker caches layers: if an instruction and its inputs have not changed since the last build, Docker reuses the cached result rather than re-executing it.

Cache-efficient Dockerfiles place the least-frequently-changing instructions first. Copying and installing dependencies (a COPY of package.json followed by RUN npm install) before copying source code means the dependency layer is rebuilt only when the package file changes, not on every source edit. In a busy development workflow this can reduce build times from several minutes to a few seconds.
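
A cache-friendly ordering for a Node.js service might look like this (npm ci is used here because it installs strictly from the lockfile):

    FROM node:20-slim
    WORKDIR /app
    # Dependency layer: rebuilt only when package files change.
    COPY package*.json ./
    RUN npm ci
    # Source layer: rebuilt on every code edit, but dependencies stay cached.
    COPY . .
    CMD ["node", "server.js"]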

The base image choice in the FROM instruction is one of the most consequential decisions in a Dockerfile. Official slim or alpine variants contain fewer packages than full-featured images, which reduces both image size and the number of packages that might contain known vulnerabilities. Distroless or scratch-based images go further, containing only the runtime and the application binary and leaving an attacker little else to work with.

Common mistakes include running apt-get without --no-install-recommends (bringing in unnecessary packages), COPY-ing the entire repository into the image, leaving build tools in a runtime image, and storing secrets in ARG or ENV instructions (which appear in the image layer history and in docker inspect output). Each of these either enlarges the image, slows the build, or leaves sensitive data accessible to anyone who can pull the image.
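
A sketch of tidier apt-get hygiene on a Debian-based image, avoiding the recommended-package bloat and the cached package lists:

    FROM debian:bookworm-slim
    RUN apt-get update \
     && apt-get install -y --no-install-recommends curl ca-certificates \
     && rm -rf /var/lib/apt/lists/*

A .dockerignore file serves the same goal on the COPY side, keeping the build context and the image free of files the application does not need.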

Compose

Docker Compose defines a multi-container application as a single, declarative YAML file. A typical web application might need an app service, a database, and a cache. Without Compose, starting that stack requires three separate docker run commands with carefully coordinated port mappings and environment variables. Compose reduces this to a single docker compose up: all services start in dependency order with a consistent configuration.

Compose provides automatic service discovery by name. Services on the same Compose network reach each other using the service name as a hostname. The app container connects to 'db' without knowing its IP address, and that name resolves to whatever container is currently running the database service. This makes configuration portable and avoids hard-coded addresses in environment variables.
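
A minimal compose.yaml illustrating both points (service names, ports, and credentials are placeholders; real secrets should come from a secrets manager):

    services:
      app:
        build: .
        ports:
          - "8080:8080"
        environment:
          DB_HOST: db            # resolved by Compose's built-in DNS
        depends_on:
          - db
      db:
        image: postgres:16
        environment:
          POSTGRES_PASSWORD: example   # illustrative only
        volumes:
          - dbdata:/var/lib/postgresql/data
    volumes:
      dbdata: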

Compose is excellent for local development and integration testing, but it is not a production orchestration tool. It runs on a single Docker host, does not handle host failure, and cannot scale services across machines. Teams that run Compose in production for simple workloads should do so with those limits clearly in mind; Kubernetes and similar orchestrators exist precisely to handle what Compose does not.

A common mistake is treating docker-compose.yml as a one-off convenience rather than a versioned part of the application. When Compose files drift from the application's real runtime requirements, they stop being a reliable way to reproduce the environment. The file should be in source control, reviewed like any other configuration, and designed to work without manual steps after checkout.

Volumes

Docker provides three mount types for container data. Named volumes are managed by Docker: it picks the location on the host, creates the volume automatically, and tracks its lifecycle. Bind mounts map a specific host directory or file directly into the container. Tmpfs mounts store data in host memory and disappear when the container stops. Each type has a different operational trade-off between portability, performance, and durability.
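
All three can be expressed with the --mount flag; the image name and paths below are placeholders:

    # Named volume: Docker manages the host-side location.
    docker run -d --mount type=volume,src=appdata,dst=/var/lib/app myapp

    # Bind mount: a specific host path appears inside the container.
    docker run -d --mount type=bind,src="$PWD"/config,dst=/etc/app myapp

    # tmpfs: kept in host memory, gone when the container stops.
    docker run -d --mount type=tmpfs,dst=/scratch myapp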

Named volumes are the recommended choice for application data that must survive container restarts and updates. They are not tied to a specific host path, are easier to back up using Docker tooling, and avoid the host filesystem permission mismatches that often trip up bind mounts. When a container is replaced with a new version, the named volume persists and the new container picks up the existing data.

Bind mounts are common in development because they allow the host's source code directory to be reflected live inside the container. Edit a file on the host and the running container sees the change immediately, no rebuild required. But bind mounts expose more of the host filesystem to the container and tightly couple the container to the host's directory structure, which makes them less suitable for production.

The design principle underlying volumes is that containers should be stateless. A container that writes important data to its own filesystem creates a hidden dependency between the container's lifecycle and the data's survival. Keeping the container filesystem ephemeral and routing durable state to volumes or external storage makes containers easier to replace, scale, and recover, which is the whole point of container orchestration.

Networking

Docker creates virtual networks and connects containers to them rather than directly to the host's physical interfaces. The default bridge network allows containers to communicate by IP address but not by name. User-defined bridge networks, created explicitly with docker network create, add automatic DNS resolution by container name, which makes service discovery in multi-container environments much simpler and more reliable.

Port publishing maps a container port to a host port, making the service reachable from outside the container. The syntax -p 8080:80 means requests on host port 8080 are forwarded to port 80 inside the container. Without explicit port publishing, a container's listening ports exist only within its network namespace and are unreachable from outside. This default-closed behavior is a useful security property.
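
Both ideas combine naturally: put services on a user-defined network and publish only the port that must be reachable from outside (names, images, and credentials illustrative):

    docker network create appnet
    # The database joins the network but publishes nothing to the host.
    docker run -d --name db --network appnet -e POSTGRES_PASSWORD=example postgres:16
    # Only the web frontend is reachable from outside, on host port 8080.
    docker run -d --name web --network appnet -p 8080:80 nginx
    # Inside appnet, 'web' reaches 'db' by name via the network's DNS.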

For multi-host deployments, Docker supports overlay networks and external CNI plugins that enable containers on different machines to communicate as if on the same network segment. This is the foundation of Swarm and the conceptual basis for how Kubernetes networking works, though Kubernetes implements it through its own pluggable CNI layer with different tools (Calico, Cilium, Flannel) for different environments.

The security habit to build early is to never expose ports by default. Internal services (databases, caches, internal APIs) should be accessible only on internal Compose or Docker networks, not published to the host. Treating network exposure as a deliberate design decision rather than an afterthought reduces the attack surface of each service and prevents accidentally reachable ports from becoming an entry point.

Registries

A registry stores and serves Docker images. Docker Hub is the default public registry; teams pull official base images from it and can push public images of their own. Private registries (AWS ECR, Google Artifact Registry, GitHub Container Registry, or a self-hosted Harbor) require authentication and are used to store proprietary images that should not be publicly accessible. Most production workflows use a private registry for everything that goes to production.

Image tags are mutable by default. A tag like myapp:latest or nginx:1.25 can be updated to point to a different image without any warning to consumers, so pulling by tag alone guarantees neither reproducibility nor integrity. For deployment automation and security audits, the more reliable approach is to pin by digest: a content-addressed SHA256 hash that uniquely identifies a specific image and cannot be changed after the fact.
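
One way to find and use a digest, assuming the image was pulled from a registry (the digest value is a placeholder to substitute):

    # Print the content-addressed digest for a local tag.
    docker inspect --format '{{index .RepoDigests 0}}' nginx:1.25
    # => nginx@sha256:<64-hex-digest>

    # Pull or reference by digest; this can never silently change.
    docker pull nginx@sha256:<digest>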

Good registry hygiene includes keeping base images up to date (old images accumulate unpatched CVEs), removing images that are no longer deployed (to reduce the footprint for scanning and audit), and defining a clear promotion path where images move from build to staging to production registries as they pass validation gates. Mixing unverified build images and production-ready images in the same namespace reduces visibility.

Registry access control matters because anyone who can push to a registry can substitute images that automated systems will pull and run. Pipeline credentials that allow pushing should be scoped to specific repositories and used only in dedicated publish steps, not in every job that touches the repository. Combining registry access control with image signing gives deployment systems a way to verify that what they are about to run was actually built by the trusted pipeline.

Typical workflow

The standard Docker workflow is: write or update the Dockerfile, build the image with a specific tag, test the running container locally, push the image to a registry, and run that same image in the next environment. This one-way promotion path is the foundation of 'build once, deploy everywhere': the same artifact that was tested is the one that goes to production.

A well-functioning workflow produces deterministic outputs. The same Dockerfile and the same input files should build to the same image on every run. In practice, reproducibility is threatened by floating tags in FROM instructions (the base image silently changes), unpinned package installations (apt-get or npm install without lockfiles fetches whatever versions are latest at build time), and build arguments that inject environment-specific values into layers. Each of these can cause two builds from the same commit to produce different images.
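
A Dockerfile sketch that closes the first two gaps, with the base-image digest left as a placeholder to fill in:

    # Pin the base image by digest so it cannot drift between builds.
    FROM node:20-slim@sha256:<digest>
    WORKDIR /app
    COPY package.json package-lock.json ./
    RUN npm ci        # lockfile-exact installs, no floating versions
    COPY . .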

Tagging strategy signals what an image is and where it belongs in the release process. Using the commit SHA (myapp:a3f2c9b) gives traceability. You can always find the source code that produced a running container. Semantic versions (myapp:2.4.1) communicate stability and intent for public or partner-facing releases. The 'latest' tag is convenient for local testing but dangerous in automated pipelines because it changes silently and cannot be relied on for reproducibility.

Deploying updated containers in a real environment is not just a Docker operation. It depends on the orchestrator's rolling update logic, health checks, and readiness signals. A container that starts but fails its health check should not receive traffic. Building health checks into the Dockerfile and into the deployment configuration from the beginning, rather than adding them after problems appear, is what makes the delivery workflow reliable under real conditions like slow startup, transient failures, and misconfigured dependencies.
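
A health check can live in the Dockerfile itself; this sketch assumes the image contains curl and the application serves a /healthz endpoint on port 8080:

    HEALTHCHECK --interval=30s --timeout=3s --retries=3 \
      CMD curl -f http://localhost:8080/healthz || exit 1

Docker reports the resulting status in docker ps, and deployment tooling can gate traffic or rollout steps on it.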