CI/CD

CI/CD Foundations

A practical introduction to how delivery pipelines move code from commit to release in a controlled, repeatable way.

CI/CD (continuous integration and continuous delivery) is the practice of automating the path from a developer's code change to a running deployment. Continuous integration means every change is built, tested, and integrated automatically; continuous delivery extends that to make each passing change releasable on demand; continuous deployment goes further and releases automatically without a manual gate. For DevSecOps, the pipeline is not just a delivery mechanism: it is also a control system where security checks, approval gates, and provenance tracking are built into every release.

Learning objectives

What you should be able to do after reading.
  • Explain the main stages in a delivery pipeline and how they fit together.
  • Recognize how environments, approvals, and rollback shape safe releases.
  • Describe why reproducibility and clear artifact handling matter.

At a glance

Fast mental model before you dive in.
Pipeline flow
  • Build
  • Test
  • Artifact
Release control
  • Deploy
  • Environments
  • Approvals
Operating habits
  • Repeatable inputs
  • Visible outputs
  • Clear ownership

Core idea

The shift-left principle holds that problems found earlier in the development cycle are cheaper and faster to fix. A bug caught by a unit test before commit is faster to fix than the same bug found in a staging environment, which is faster than finding it in production. CI/CD pipelines operationalize shift-left by running tests, security scans, and policy checks automatically on every change, giving developers immediate feedback at the moment a problem is introduced.

Reproducibility is the foundational quality of a trustworthy pipeline. A reproducible build produces the same output given the same inputs: the same source code, the same dependency versions, the same build tools. When a build is not reproducible, it becomes difficult to reason about why one run passed and another failed, and it undermines the guarantee that the artifact deployed to production is the one that was tested. Pinned dependencies, lockfiles, and hermetic build environments are the technical means to achieve this.

CI and CD are often confused. CI (continuous integration) focuses on the integration step: merging changes frequently so that the combined codebase is always in a known, tested state. CD (continuous delivery) focuses on the deployment side: ensuring that every passing build produces an artifact that can be deployed to any environment, on demand, without additional manual preparation. Continuous deployment is a specific CD implementation in which deployment to production is fully automated: every passing build goes to production without a human approval gate.

The strongest pipelines treat the pipeline definition as application code. The YAML files, scripts, and configuration that define how software is built and deployed should live in version control, be reviewed before they change, and be tested in non-production environments. Pipelines that are modified ad hoc, that contain undocumented manual steps, or that behave differently on different machines are fundamentally less reliable and less secure.
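As a concrete illustration, here is a minimal sketch of a pipeline definition in GitHub Actions syntax, stored in the same repository as the code it builds. The workflow name, branch, toolchain, and commands are illustrative assumptions, not a prescribed setup.

    # .github/workflows/ci.yml -- lives in version control alongside the application code
    name: ci
    on:
      pull_request:
      push:
        branches: [main]
    jobs:
      build-and-test:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4
          - uses: actions/setup-node@v4
            with:
              node-version: '20'   # pinned toolchain version
          - run: npm ci            # install exactly what the lockfile specifies
          - run: npm test          # automated tests gate every change

Because the file is reviewed and versioned like any other code, changes to how software is built and deployed get the same scrutiny as changes to the software itself.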

Release path

  • Each pipeline stage has a clear purpose, defined inputs, and an explicit success condition: not just 'it ran without errors' but 'it validated what it was supposed to validate.'
  • Promote the same build artifact through environments rather than rebuilding at each stage; rebuilding introduces the risk that the tested artifact differs from the deployed one (see the sketch after this list).
  • Release decisions are based on documented criteria (test results, scan findings, approval records), not on subjective confidence.
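A minimal sketch of build-once-promote-many in GitHub Actions syntax: one job builds and pushes an image tagged with the commit SHA, and later jobs deploy that same tag to each environment. The registry name and the deploy.sh script are hypothetical placeholders.

    jobs:
      build:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4
          - run: |
              docker build -t registry.example.com/myapp:${{ github.sha }} .
              docker push registry.example.com/myapp:${{ github.sha }}
      deploy-staging:
        needs: build
        runs-on: ubuntu-latest
        steps:
          - run: ./deploy.sh staging ${{ github.sha }}      # deploy the image built above
      deploy-production:
        needs: deploy-staging
        runs-on: ubuntu-latest
        steps:
          - run: ./deploy.sh production ${{ github.sha }}   # same tag -- no rebuild between stages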

Baseline

  • Pin build tool versions, dependency versions, and base images so that the same commit produces the same artifact a year from now.
  • Treat environments as distinct risk levels: development is where experimentation happens, staging validates production-likeness, production receives only what has passed all gates.
  • Define rollback before you need it: who authorizes it, what it means (redeploy the previous artifact? revert configuration?), and how long it is expected to take.

Signals to watch for

Patterns worth investigating further.
  • The same commit produces different outputs on different runs.
  • Deployments depend on manual steps that are not documented.
  • Rollback requires guesswork because release state is unclear.

DEEP DIVE

Build

A build is the process of transforming source code and its dependencies into a runnable or distributable artifact: a compiled binary, a container image, a JAR file, an npm package. A reliable build should be deterministic: the same inputs always produce the same output. Determinism requires controlling every input: source code (pinned to a commit), dependencies (locked to specific versions), build tools (versioned), and the build environment itself (containerized build environments or hermetic build systems like Bazel achieve this).

Non-determinism in builds is a common and underappreciated problem. Sources include fetching the 'latest' version of a dependency (the version resolved today may not be the version resolved tomorrow), timestamps embedded in artifacts (the file changes even if the source did not), parallel execution order (some builds produce different output depending on which steps finish first), and implicit environment variables leaking into the build. Each of these creates a gap between what was tested and what might be built on a future run.

Build caching is a performance optimization that stores the outputs of expensive steps and reuses them when inputs have not changed. Caching requires careful design, however: a cache keyed on the wrong inputs can serve a stale or incorrect result. For example, a cache that ignores dependency version changes will serve outdated compiled libraries. The rule is that cache keys must capture everything that can change the output of the cached step.
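For instance, in GitHub Actions syntax, a dependency cache keyed on the lockfile hash is invalidated exactly when the dependencies it caches can change. The cached path and package manager are illustrative assumptions.

    # step excerpt from a build job
    - uses: actions/cache@v4
      with:
        path: ~/.npm                                                       # what gets cached
        key: npm-${{ runner.os }}-${{ hashFiles('package-lock.json') }}   # changes whenever the lockfile changes
        restore-keys: npm-${{ runner.os }}-                                # fall back to a partial match if no exact hit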

The distinction between debug and release builds matters for security. Debug builds often contain additional logging, assertions, debug symbols, and sometimes hardcoded test credentials or feature flags. A production deployment should always come from a release build, and the pipeline should enforce this. Organizations that accidentally deploy debug builds to production are exposing internal information and potentially disabling security controls.

Test

The test pyramid describes the recommended distribution of test types in a delivery pipeline. At the base are fast unit tests that validate individual functions and classes in isolation; these run in milliseconds, provide precise feedback, and should be the most numerous. Integration tests in the middle validate that components work together correctly; they are slower but catch problems that unit tests miss. At the top are end-to-end or acceptance tests that validate the full system from the user's perspective. These are the slowest, most fragile, and most expensive to maintain.

Test speed matters in CI/CD because slow tests delay feedback. A pipeline that takes 45 minutes to run discourages developers from committing frequently and reduces the value of continuous integration. The goal is to give developers meaningful feedback within a time window that does not interrupt their flow. Parallelization, running independent test suites simultaneously, is one of the most effective ways to keep pipelines fast as test suites grow.
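One common way to parallelize is to shard the suite across a job matrix. Here is a sketch in GitHub Actions syntax using Jest's shard option; the shard count, toolchain, and commands are illustrative assumptions.

    jobs:
      test:
        runs-on: ubuntu-latest
        strategy:
          matrix:
            shard: [1, 2, 3, 4]   # four jobs run concurrently
        steps:
          - uses: actions/checkout@v4
          - uses: actions/setup-node@v4
            with:
              node-version: '20'
          - run: npm ci
          - run: npx jest --shard=${{ matrix.shard }}/4   # each job runs a quarter of the suite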

Flaky tests are tests that pass sometimes and fail other times without any change to the code being tested. They are a major reliability problem in pipelines. They erode trust in the test suite (developers learn to retry failures rather than investigate them), they slow down pipelines (failed flaky tests require reruns), and they occasionally mask real failures (a real failure looks like just another flake). Tracking flakiness metrics and dedicating effort to eliminating flaky tests is necessary maintenance for a healthy pipeline.

Smoke tests in production are a specific test type: a minimal set of automated checks run immediately after deployment to confirm that the core application functions are working. If a deployment breaks something fundamental (the service doesn't start, the login page returns 500, the API can't reach the database), smoke tests detect it within seconds. This is the fast path to triggering an automated rollback or alerting the on-call engineer before users are widely affected.
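A post-deployment smoke test can be as simple as a pipeline step that fails if core endpoints stop responding. The hostname and endpoint paths below are assumptions for illustration.

    # step excerpt from a deploy job, run immediately after the new version is live
    - name: Smoke test
      run: |
        # --fail makes curl exit non-zero on HTTP errors, failing the step and
        # giving the pipeline a signal to roll back or page the on-call engineer
        curl --fail --silent --max-time 10 https://app.example.com/healthz
        curl --fail --silent --max-time 10 https://app.example.com/login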

Artifact

An artifact is the versioned, immutable output of a build. It might be a container image, a compiled binary, a Java JAR, a Helm chart, or a zip file of static assets. The key properties of a good artifact are that it is immutable (once built, it never changes), versioned (every version is distinct and identifiable), and traceable (you can determine from the artifact which commit produced it, which pipeline built it, and which tests it passed).
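Traceability can be recorded at build time, for example by stamping standard OCI labels onto a container image from a GitHub Actions step. The registry name is a placeholder.

    # step excerpt: the built image carries its own provenance metadata
    - name: Build image with traceability labels
      run: |
        docker build \
          --label org.opencontainers.image.revision=${{ github.sha }} \
          --label org.opencontainers.image.source=${{ github.server_url }}/${{ github.repository }} \
          -t registry.example.com/myapp:${{ github.sha }} .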

Versioning strategy communicates intent. Semantic versioning (MAJOR.MINOR.PATCH) is appropriate for libraries and APIs where consumers need to understand the impact of upgrading. Git commit SHAs are good for internal service deployments where traceability matters more than human-readable versions. Build numbers (sequential integers) are simple but lose the connection to source code without additional metadata. The wrong choice for the context leads to either confusion about what changed or loss of traceability.

Artifact immutability is a strong guarantee that prevents a class of deployment errors. If an artifact is rebuilt for production using slightly different inputs (a newer base image was pulled, a dependency version resolved differently), the deployed artifact is not the same as the tested one. This is the 'works on my machine' problem at the deployment layer. Using content-addressable storage (Docker image digests, artifact checksums) to reference artifacts guarantees that the exact bytes that were tested are the bytes that get deployed.
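In a Kubernetes manifest, referencing the image by digest rather than by a mutable tag guarantees that the deployed bytes are exactly the tested bytes. The names and the digest value are placeholders.

    # Deployment excerpt: the digest identifies the exact image content, not a tag that can be repointed
    spec:
      containers:
        - name: myapp
          image: registry.example.com/myapp@sha256:0f2e3a9c6b1d4e8f7a5c3b2d1e0f9a8b7c6d5e4f3a2b1c0d9e8f7a6b5c4d3e2f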

Artifact provenance, the documented record of how an artifact was produced, is increasingly important for supply chain security. SLSA (Supply-chain Levels for Software Artifacts) is a framework that defines levels of provenance guarantees, from basic source code review up to a hermetic, reproducible build with signed provenance attestations. Attestations are machine-verifiable claims attached to the artifact that record the builder, the build inputs, and the build environment.

Deploy

Deployment is the act of moving a release artifact into a target environment and making it serve traffic. The most important distinction in deployment is between 'deploy' (the technical act of putting the new version on servers) and 'release' (the business act of routing traffic to the new version). Separating deploy from release enables techniques like dark launches, where new code is deployed but not yet serving real traffic, allowing pre-production validation in the real environment.

Deployment strategies define how the transition from old to new is managed. Rolling updates replace instances incrementally, maintaining availability throughout. Blue/green deployments maintain two identical environments and switch traffic from one to the other atomically, allowing instant rollback by reversing the traffic switch. Canary releases route a small percentage of traffic to the new version first, gradually increasing as confidence grows. This limits the blast radius of a bad release to a small fraction of users.
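As one example of how a strategy is expressed in configuration, a Kubernetes Deployment's rolling update can be tuned so that capacity never drops during the transition. The replica count and limits are illustrative.

    # Deployment excerpt: replace Pods incrementally without reducing serving capacity
    spec:
      replicas: 4
      strategy:
        type: RollingUpdate
        rollingUpdate:
          maxUnavailable: 0   # keep all existing Pods until replacements are ready
          maxSurge: 1         # add at most one extra Pod during the rollout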

Feature flags are an alternative to risky deployments. Instead of deploying a big feature all at once, the code is deployed with the feature behind a flag that is initially off. The flag can be enabled incrementally, for internal users first, then a percentage of production users, then everyone. This decouples the deployment event from the feature release event, making each individually lower-risk. Feature flags require their own lifecycle management to avoid accumulating flags that are never cleaned up.
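A minimal sketch of the idea: flags stored as configuration that the application reads at runtime, shown here as a Kubernetes ConfigMap. Dedicated flag services add targeting and percentage rollouts; the flag names here are illustrative assumptions.

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: feature-flags
    data:
      new-checkout-flow: "false"          # code is deployed, feature stays dark
      new-checkout-rollout-percent: "0"   # raise gradually once the flag is enabled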

Health checks and readiness signals are the feedback loop that makes deployments safe. Before any orchestrator routes traffic to a new instance, that instance must pass its readiness check. Before declaring a deployment successful, the pipeline waits for a defined period during which error rates, latency, and health check results confirm the new version is behaving correctly. A deployment that reaches production without this feedback loop is flying blind.
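In Kubernetes, the readiness check is declared on the container, and traffic is routed to a Pod only while the probe succeeds. The path, port, and timing values are assumptions.

    # container excerpt: the orchestrator routes traffic only to Pods whose probe passes
    readinessProbe:
      httpGet:
        path: /healthz
        port: 8080
      initialDelaySeconds: 5   # give the process time to start
      periodSeconds: 10        # re-check every 10 seconds
      failureThreshold: 3      # mark not-ready after three consecutive failures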

Environments

Environments represent increasing levels of exposure and confidence in the delivery pipeline. Development is where individual changes are tested in isolation, often with mocked dependencies, local databases, and loose permissions. Staging is as close to production as the team can make it: same infrastructure patterns, same configuration management approach, same data scale (or realistic synthetic data), same network topology. Production is the real thing.

The phrase 'staging is production-like' hides a lot of difficulty in practice. Staging environments often have different data (scrubbed or synthetic rather than real), different traffic volume, different third-party integrations (test accounts rather than production APIs), and different access controls. Each difference is a gap where a problem can hide in staging and only appear in production. Teams that treat staging as a genuine risk reduction tool invest in closing these gaps.

Ephemeral environments, temporary environments spun up for a specific pull request and torn down when it merges, are an increasingly common pattern. They allow integration testing against real infrastructure (a real database, real dependencies) for every PR, rather than only in a shared staging environment that may be congested or contain state from other teams' work. Tools like ArgoCD, Flux, and Terraform Cloud make ephemeral environment management practical.

Environment promotion should be a deliberate step, not an automatic cascade. Moving from development to staging might be automatic when all tests pass; moving from staging to production should require explicit approval or, at minimum, a specific human action. The promotion step is also the right place to enforce consistency: the exact same artifact that ran in staging is what goes to production, with only the environment-specific configuration changing.

Approvals

Approval gates add a human decision point to the pipeline at stages where the risk or uncertainty warrants it. For most changes in a mature pipeline, automated gates (test results, security scan thresholds, performance benchmarks) provide sufficient confidence without requiring human review. Approvals are most valuable where automated checks have limits: novel changes, high-impact infrastructure modifications, releases that coincide with business-critical events.

A good approval gate specifies clearly what the approver is expected to verify. 'Please approve this deployment' is not useful if the approver has no information to base the decision on. A useful approval includes what is being deployed, what has changed since the last deployment, what tests passed, what the rollback plan is, and who is on call if something goes wrong. The approver should be signing off on something specific, not rubber-stamping a pipeline stage.

Compliance-driven approvals (required for SOC 2, PCI DSS, and similar frameworks) often require that the person approving a change is not the same person who made it (separation of duties) and that the approval is recorded with a timestamp and the approver's identity. Pipeline systems that support this (GitHub's required reviewers, Jira change tickets linked to deployments, ServiceNow integrations) turn human approvals into auditable records.
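In GitHub Actions, for example, a job that targets a protected environment pauses until a configured reviewer approves, and the approval is recorded with the reviewer's identity and a timestamp. The job names and deploy script are illustrative placeholders.

    # job excerpt: required reviewers are configured on the 'production' environment in repository settings
    deploy-production:
      needs: deploy-staging
      runs-on: ubuntu-latest
      environment: production            # the job waits here for an approval
      steps:
        - run: ./deploy.sh production    # runs only after approval is granted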

The failure mode of approval gates is rubber-stamping. Approvals happen quickly and without real review because the approver trusts the automated checks, doesn't have enough context to evaluate the change, or is under time pressure to release. Rubber-stamp approvals provide false assurance. If approvals are not adding value, the fix is better information for the approver or better automated checks, not removing the gate without addressing why it failed.

Rollback

Rollback is the ability to return to a known-good state after a failed deployment. It is a core release engineering capability, not an emergency improvisation. The two main strategies are rollback (returning to the previous version) and roll-forward (fixing the problem by deploying a new version as quickly as possible). Most teams should have both options available, with rollback as the default for problems discovered immediately after deployment and roll-forward for problems that require a code fix.

Kubernetes Deployments support rollback natively. The command 'kubectl rollout undo deployment/myapp' replaces the current Pod template with the previous one, performing a rolling update in reverse. This is fast and reliable for stateless applications. For applications managed with Helm or ArgoCD, the equivalent operations are 'helm rollback' and reverting a Git commit in the GitOps repository, which triggers ArgoCD to reapply the previous state.

Database schema migrations are the hardest part of rollback. If a deployment includes both a code change and a database migration (adding a column, renaming a table), rolling back the code may not be enough if the database is now in a state that the old code does not understand. The best-practice pattern is to make migrations backwards-compatible: add new columns before deploying code that uses them, deprecate columns before dropping them, and use expand-contract migrations instead of destructive changes. This allows the code and the schema to change independently and makes rollback feasible.

Blue/green deployments make rollback trivially fast: if the new (green) environment has a problem, switching traffic back to the old (blue) environment is a single routing change. This works because the old environment is still running and has its own database connection. For stateful applications where both environments share a database, blue/green requires careful coordination to avoid schema compatibility issues between the two application versions.

Release flow

GitOps is a release flow pattern where the desired state of the deployment is stored in a Git repository, and an automated system continuously reconciles the actual deployment state to match it. Infrastructure and application configuration are declared in Git; a tool like ArgoCD or Flux watches the repository and applies any changes to the cluster. This model makes every change to production visible as a Git commit, complete with author, timestamp, and diff, subject to the same review and audit trail used for application code.
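A sketch of the reconciliation hookup in ArgoCD: an Application resource points at the Git repository holding the desired state, and the controller keeps the cluster in sync with it. The repository URL, paths, and namespaces are placeholders.

    apiVersion: argoproj.io/v1alpha1
    kind: Application
    metadata:
      name: myapp-production
      namespace: argocd
    spec:
      project: default
      source:
        repoURL: https://github.com/example/deploy-config   # desired state lives here
        targetRevision: main
        path: apps/myapp/production
      destination:
        server: https://kubernetes.default.svc
        namespace: myapp
      syncPolicy:
        automated:
          prune: true      # remove resources deleted from Git
          selfHeal: true   # revert drift introduced outside Git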

Trunk-based development and feature branching represent different CI/CD philosophies. In trunk-based development, all developers commit directly to the main branch (or use very short-lived branches that merge within a day), and the main branch is always deployable. This maximizes continuous integration but requires comprehensive automated testing and feature flags for unfinished work. Feature branching delays integration until a feature is 'ready,' which can lead to large merges, integration pain, and slow feedback loops.

Release trains are a pattern where deployments happen on a fixed schedule (weekly, bi-weekly) regardless of which features are ready. Features that miss the train wait for the next one. This pattern reduces coordination overhead in large organizations and makes deployment timing predictable for stakeholders. Its drawback is that a critical fix must wait for the next train unless there is a hotfix process, and teams may rush to include unfinished work to make the train.

The most important property of a healthy release flow is that the main branch is always in a deployable state. Broken main branches slow everyone down, erode confidence in the pipeline, and make it harder to release fixes quickly when needed. Achieving this requires passing tests before merge (enforced by branch protection), short-lived feature branches, and a culture where fixing a broken build is the highest priority task.