Application & Cloud Security

Cloud Security

A practical guide to securing cloud environments by controlling identity, exposure, configuration, and visibility.

Cloud security addresses the unique challenges of protecting workloads, data, and identities in public cloud environments. Because cloud infrastructure is programmable and ephemeral, security controls must be codified, automated, and continuously verified.

Learning objectives

What you should be able to do after reading.
  • Explain the shared responsibility model and why cloud controls are split across provider and customer.
  • Recognize how identity, networking, storage, and logging shape cloud risk.
  • Describe the configuration habits that reduce public exposure and prevent insecure defaults.

At a glance

Fast mental model before you dive in.
Control areas
  • IAM
  • Networking
  • Storage
Visibility
  • Logging
  • Monitoring
  • Review
Safety posture
  • Private by default
  • Least privilege
  • Explicit exposure

Shared Responsibility Model

Cloud providers and customers share responsibility for security, but the division depends on the service model. For IaaS the provider secures the physical hardware, data centers, and hypervisor, while the customer is responsible for the operating system, runtime, middleware, data, and applications. For PaaS the provider also manages the OS and runtime, narrowing the customer's scope. For SaaS the provider manages almost everything and the customer is responsible primarily for identity, access configuration, and data governance.

The practical implication is that no cloud deployment is automatically secure just because it runs in a major cloud. The customer must actively configure, monitor, and maintain their portion of the security model. The most common breach scenarios in cloud environments involve customer-side failures: misconfigured storage buckets, overly permissive IAM roles, and unencrypted data.

A common misconception is that compliance in one cloud service automatically extends to all services built on top of it. A provider may hold SOC 2 or PCI DSS certification at the infrastructure layer, but if the customer's application code or configuration violates those standards, the customer's workload is not compliant. Customers must independently verify their configuration against applicable compliance requirements.

Identity and Access Management (IAM)

Cloud IAM governs who or what can perform actions on which resources. Every principal, whether a human operator, a CI/CD pipeline, or a running workload, should have only the permissions it needs for its specific function. The principle of least privilege is the design goal. Start with no permissions and grant only what is explicitly required.

IAM policies are evaluated for every API call to the cloud control plane. The cloud provider checks whether any policy attached to the requesting identity (or inherited through groups, roles, or organizational hierarchy) grants the requested action on the requested resource. Understanding policy evaluation order matters for debugging permission issues and for identifying privilege escalation paths.
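As a rough illustration of that evaluation order, the sketch below models a simplified decision procedure: an explicit Deny wins over any Allow, and the default outcome is an implicit deny. It deliberately ignores wildcards, conditions, resource policies, and SCPs, and the policy shapes are simplified.

```python
# Simplified model of IAM policy evaluation: an explicit Deny in any
# applicable statement wins, any matching Allow grants access otherwise,
# and the default outcome is an implicit deny. Matching is exact-string
# only; real evaluation also handles wildcards and conditions.
def is_allowed(policies, action, resource):
    allowed = False
    for policy in policies:
        for stmt in policy["Statement"]:
            if action in stmt["Action"] and resource in stmt["Resource"]:
                if stmt["Effect"] == "Deny":
                    return False          # explicit deny always wins
                allowed = True            # remember the allow, keep scanning
    return allowed

policies = [
    {"Statement": [{"Effect": "Allow",
                    "Action": ["s3:GetObject"],
                    "Resource": ["arn:aws:s3:::app-logs/*"]}]},
    {"Statement": [{"Effect": "Deny",
                    "Action": ["s3:GetObject"],
                    "Resource": ["arn:aws:s3:::app-logs/secret"]}]},
]

print(is_allowed(policies, "s3:GetObject", "arn:aws:s3:::app-logs/*"))      # True
print(is_allowed(policies, "s3:GetObject", "arn:aws:s3:::app-logs/secret")) # False
print(is_allowed(policies, "s3:PutObject", "arn:aws:s3:::app-logs/*"))      # False
```

The last call shows the default-deny behavior: no statement matches, so the request is implicitly denied.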

Regular permission audits are essential because IAM configurations drift over time. Developers grant broad permissions to unblock themselves during an incident and never remove them. Users change roles but retain permissions from previous assignments. Tools like AWS IAM Access Analyzer and GCP Policy Analyzer identify unused permissions and suggest minimum viable policies based on actual usage patterns from audit logs.
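At its core, the audit reduces to comparing granted actions against usage observed in audit logs. A minimal sketch (permission names are illustrative):

```python
# Hypothetical permission audit: compare the actions a role is granted
# against the actions observed in audit logs, and flag the unused surplus
# as candidates for removal.
def unused_permissions(granted, observed):
    return sorted(set(granted) - set(observed))

granted  = ["s3:GetObject", "s3:PutObject", "s3:DeleteObject", "iam:PassRole"]
observed = ["s3:GetObject", "s3:PutObject"]   # e.g. from 90 days of logs

print(unused_permissions(granted, observed))
# → ['iam:PassRole', 's3:DeleteObject']
```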

Cloud Security Posture Management (CSPM)

CSPM tools continuously scan cloud configurations for deviations from security best practices and compliance frameworks such as CIS Cloud Foundations Benchmarks, NIST, and PCI DSS. They detect issues such as publicly accessible storage buckets, overly permissive security groups, unencrypted databases, and disabled logging. Popular CSPM platforms include Prisma Cloud, Wiz, Orca, and cloud-native options like AWS Security Hub.

CSPM provides the visibility needed to detect misconfigurations before they are exploited. Without continuous scanning, a single misconfiguration introduced by one developer can persist for months, exposed to the internet the entire time. Automated detection reduces mean time to detection (MTTD) from weeks to minutes.

Infrastructure as Code shifts CSPM earlier in the development lifecycle. Security linting tools (checkov, tfsec, tflint) analyze Terraform, CloudFormation, and similar templates before deployment, catching misconfigurations at the code review stage. When misconfigurations are caught before deployment, they are far cheaper to fix and never create actual exposure.
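A minimal lint rule in that spirit might look like the following, run against a parsed template before deployment; the resource shapes are invented for illustration and are not a real Terraform schema.

```python
# Sketch of an IaC lint pass (in the spirit of checkov/tfsec): scan
# parsed resource configurations and report rule violations before
# anything is deployed. Resource fields are illustrative.
def lint(resources):
    findings = []
    for name, cfg in resources.items():
        if cfg.get("acl") == "public-read":
            findings.append((name, "bucket ACL allows public read"))
        if not cfg.get("encrypted", False):
            findings.append((name, "encryption at rest not enabled"))
    return findings

resources = {
    "logs_bucket":   {"acl": "private",     "encrypted": True},
    "assets_bucket": {"acl": "public-read", "encrypted": False},
}

for name, msg in lint(resources):
    print(f"{name}: {msg}")
```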

Network Security in the Cloud

Cloud network security relies on Virtual Private Clouds (VPCs), subnets, security groups, and network ACLs to control traffic flow. A well-segmented architecture places public-facing resources (load balancers, API gateways) in public subnets, application servers in private subnets, and databases in isolated private subnets with no direct internet access. Security groups act as stateful per-resource firewalls that allow only specifically required ports and protocols.

Zero trust networking extends this further by removing implicit trust from any network location, including internal subnets. Every connection is authenticated and authorized regardless of where it originates. Services communicate through explicit authenticated channels rather than relying on network position as a security boundary. This means a compromised workload in one subnet cannot freely access services in another.

A common misconfiguration is setting security groups to allow inbound traffic from 0.0.0.0/0 (the entire internet) on ports that should only be accessible internally. This exposes databases, cache servers, and internal APIs directly to the internet. Automation tools like AWS Config rules, Azure Policy, and Cloud Custodian detect and alert on these configurations, and can optionally remediate them automatically.
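The check these tools perform can be sketched as follows; field names are illustrative, not a provider API:

```python
# Flag any security group rule that opens a non-web port to the entire
# internet (0.0.0.0/0). The group/rule shapes here are simplified.
WEB_PORTS = {80, 443}

def world_open_findings(security_groups):
    findings = []
    for sg in security_groups:
        for rule in sg["ingress"]:
            if rule["cidr"] == "0.0.0.0/0" and rule["port"] not in WEB_PORTS:
                findings.append((sg["name"], rule["port"]))
    return findings

groups = [
    {"name": "web",   "ingress": [{"cidr": "0.0.0.0/0",   "port": 443}]},
    {"name": "db",    "ingress": [{"cidr": "0.0.0.0/0",   "port": 5432}]},
    {"name": "cache", "ingress": [{"cidr": "10.0.0.0/16", "port": 6379}]},
]

print(world_open_findings(groups))   # → [('db', 5432)]
```

The cache group is fine because its rule is scoped to an internal CIDR; the database group is the exposure worth alerting on.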

Data Protection

Cloud data protection encompasses encryption at rest and in transit, key management, data classification, and access control for stored data. All major cloud providers encrypt data at rest by default using provider-managed keys. Customer-managed keys (CMK via KMS) give the customer control over key lifecycle and auditing, which is often required for regulated data.

Object storage permissions are a frequent source of data breaches. Public buckets, buckets with overly permissive access policies, and buckets without access logging have each contributed to high-profile incidents. The defense is layered: enable account-level public access blocks, include bucket policy validation in IaC reviews, and run CSPM tools that alert on any public storage resource.

Data classification helps prioritize protection efforts and apply controls proportional to sensitivity. Not all cloud data needs the same level of protection. Production database backups containing customer PII require stronger controls than build artifacts or public documentation. Classification policies define which storage, access, and encryption requirements apply to each category.
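One way to express such a policy is a simple lookup from classification tier to required controls; the tiers and control names here are assumptions for illustration, not a standard:

```python
# Illustrative classification policy: each tier maps to the minimum
# storage controls it requires.
CLASSIFICATION_CONTROLS = {
    "public":     {"encryption": "provider-managed", "access_logging": False},
    "internal":   {"encryption": "provider-managed", "access_logging": True},
    "restricted": {"encryption": "customer-managed", "access_logging": True},
}

def required_controls(classification):
    # Raises KeyError for unknown tiers, forcing data to be classified
    # before storage controls can be chosen.
    return CLASSIFICATION_CONTROLS[classification]

print(required_controls("restricted"))
# → {'encryption': 'customer-managed', 'access_logging': True}
```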

Signals to watch for

Patterns worth investigating further.
  • Cloud resources are created with broad permissions or public access by default.
  • No clear owner exists for identity, network, or storage configuration.
  • Logs are incomplete, hard to query, or reviewed only after an incident.

DEEP DIVE

Shared Responsibility

The shared responsibility model defines how security obligations are divided between a cloud provider and its customers. The provider is responsible for the security of the cloud: physical hardware, data centers, global network infrastructure, and the managed services it operates. The customer is responsible for security in the cloud: everything they configure, deploy, and build on top of that infrastructure.

The model shifts based on service type. In IaaS (Infrastructure as a Service), the customer manages the operating system, runtime, middleware, data, and applications. The provider manages hardware and virtualization. In PaaS (Platform as a Service), the provider also manages the runtime and OS, while the customer is responsible for application code and data. In SaaS, the provider manages nearly everything and the customer is responsible primarily for identity, access configuration, and data governance.

Understanding where the responsibility boundary falls for each service used is essential for building a complete security programme. Many cloud environments use multiple service types simultaneously: EC2 instances (IaaS), RDS databases (PaaS), and SaaS collaboration tools, all in the same account. Each has different customer responsibilities that must be explicitly addressed.

The most dangerous misconception is assuming that a provider's compliance certification covers the customer's workload. A provider certified under PCI DSS is responsible for its portion; the customer must independently meet PCI DSS requirements for their application code, data handling, and configuration choices. Regulators have made clear that outsourcing infrastructure does not outsource compliance.

Cloud IAM

Cloud Identity and Access Management (IAM) is the set of policies, roles, and mechanisms that control who or what can perform actions on cloud resources. Every major provider (AWS, Azure, GCP) has an IAM system at its core. Mastering IAM is fundamental to cloud security because almost every significant cloud breach involves an IAM misconfiguration or compromise at some point.

The principle of least privilege is the design goal: every identity should have only the permissions needed to perform its specific function, nothing more. In practice this means using roles rather than users with direct permissions, scoping policies to specific resources rather than entire accounts, and regularly auditing and removing unused permissions. Wildcard actions ('Action: *') and wildcard resources ('Resource: *') in policies are red flags that indicate excessive privilege.

Privilege escalation in cloud environments frequently happens through IAM misconfigurations rather than by exploiting application vulnerabilities. An attacker who can modify their own IAM policies, create new users, or assume other roles can escalate from minimal access to full administrative control. Paths to privilege escalation through IAM are non-obvious and numerous; automated tools like Cloudsplaining and PMapper enumerate these paths and help prioritize remediation.
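The enumeration these tools perform is essentially a graph search. A toy sketch, treating "who can assume whom" as graph edges (all principal and role names are invented):

```python
# Toy escalation-path discovery: breadth-first search over an
# assume-role graph, looking for a path from a low-privilege principal
# to an administrative role.
from collections import deque

def escalation_path(can_assume, start, target):
    queue, seen = deque([[start]]), {start}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in can_assume.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None   # no escalation path found

can_assume = {
    "dev-user":    ["ci-role"],
    "ci-role":     ["deploy-role"],
    "deploy-role": ["admin-role"],
}

print(escalation_path(can_assume, "dev-user", "admin-role"))
# → ['dev-user', 'ci-role', 'deploy-role', 'admin-role']
```

Real tools build the edges from many permission types (policy modification, user creation, PassRole, and so on), but the path-finding idea is the same.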

IAM Access Analyzer and equivalent tools in Azure and GCP analyze actual API call history to identify which permissions a role uses in practice and suggest a minimum viable policy. Starting with a broad policy and tightening it based on observed usage is more practical than trying to define minimum permissions upfront. The goal is to eliminate the gap between permissions granted and permissions actually needed.

Network Segmentation

Network segmentation in the cloud means dividing infrastructure into isolated network segments (VPCs, subnets, security groups) so that a compromise of one segment cannot directly reach another. The architecture goal is to ensure that workloads can communicate only with the services they legitimately need. A compromised web server should not be able to reach the database server directly; it should only be able to reach the application tier.

A well-segmented cloud architecture places public-facing resources (load balancers, CDNs, API gateways) in public subnets, application servers in private subnets with no public IP addresses, and databases in a separate private subnet accessible only from the application tier. Each layer's security group allows only the specific ports and source IP ranges required for its function.

Zero Trust networking extends segmentation by removing implicit trust from network position. Even within a private subnet, services must authenticate to each other rather than relying on the assumption that anything inside the VPC is safe. This is implemented through service mesh mutual TLS, API gateway authentication, and strict security group rules that allow traffic only from specific source security groups rather than IP ranges.

A common misconfiguration is security groups that allow inbound traffic from 0.0.0.0/0 on ports that should only be internal. This directly exposes databases, cache clusters, and internal management interfaces to the public internet. Internet-wide scanners (Shodan, Censys) index these exposures within minutes of their creation. Automated policy enforcement through cloud provider guardrails (AWS Config, Azure Policy, GCP Organization Policy) detects and optionally remediates these misconfigurations.

Storage Security

Cloud storage services (AWS S3, Azure Blob Storage, GCP Cloud Storage) are among the most frequently misconfigured and breached cloud resources. They store large volumes of sensitive data and are accessible from anywhere on the internet by default unless explicitly restricted. Securing cloud storage requires layered controls: access policies, encryption, versioning, and access logging working together.

Access control for cloud storage works through IAM policies (who can access the service) and bucket or container policies (what operations are allowed on the specific storage resource). The most important control is blocking public access at the account level, which overrides any object-level or bucket-level configuration that might inadvertently make data public. This single setting prevents the most common category of storage-related data breach.
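The override behavior can be sketched as a simple rule: a bucket is effectively public only when it is marked public and the account-level block is off (field names here are illustrative, not a real storage API):

```python
# Sketch of how an account-level public access block overrides
# bucket-level settings: the block wins regardless of how the
# individual bucket is configured.
def effectively_public(bucket, account_blocks_public_access):
    if account_blocks_public_access:
        return False               # account-level block always wins
    return bucket.get("public", False)

bucket = {"name": "reports", "public": True}   # misconfigured bucket

print(effectively_public(bucket, account_blocks_public_access=True))   # False
print(effectively_public(bucket, account_blocks_public_access=False))  # True
```

This is why the account-level setting is the single most valuable control: it neutralizes bucket-level mistakes made later.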

Encryption at rest is provided by default in all major cloud storage services using provider-managed keys. Customer-managed keys (CMK) give the customer control over the key lifecycle: who can use the key, when it can be rotated, and full audit logs of every key usage. For data subject to regulatory requirements, customer-managed keys are often mandated because they allow the organization to revoke access to data by revoking the key.

Publicly accessible storage buckets have caused numerous high-profile data breaches. The fix is multi-layered: use account-level public access blocks, run CSPM tools that alert on any public storage resource, validate storage permissions in IaC reviews before deployment, and enable access logging so that any access to sensitive data is auditable. Never rely on obscurity (an unlisted or hard-to-guess bucket name) as a substitute for proper access control.

Logging and Monitoring

Effective cloud security requires comprehensive logging of API calls, resource changes, authentication events, and network traffic. Without logs, there is no way to detect an ongoing attack, investigate a breach, or demonstrate compliance. In cloud environments this means enabling services like AWS CloudTrail, Azure Monitor Activity Logs, and GCP Cloud Audit Logs across every account and region. These logs must be enabled from day one; retroactively enabling them after a breach provides no historical data.

Log coverage should span both the control plane (API calls that change configuration) and the data plane (access to actual data). CloudTrail captures control-plane activity. VPC flow logs capture network-level traffic. S3 access logs capture reads and writes to object storage, and database audit logs capture SQL queries against sensitive tables. Centralizing all these log sources in a SIEM or log aggregation platform enables correlation and alerting across the full attack chain.

Alerting on security-relevant events is what transforms logging from a compliance checkbox into an active defense. Key events to alert on include root account usage, IAM policy changes, security group modifications, unusual geographic access patterns, failed authentication attempts above a threshold, public bucket creation, and disabling of logging or security services. Cloud-native services like AWS Security Hub, Azure Defender, and GCP Security Command Center aggregate these signals and provide pre-built detection rules.
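A minimal version of such alerting is a matcher over audit events; the event shapes and rule names below are simplified assumptions, not a real CloudTrail schema:

```python
# Match audit events against a set of security-relevant event names and
# return the alerts that should fire. Root console logins get their own
# rule key so ordinary logins do not alert.
ALERT_EVENTS = {
    "ConsoleLogin:root",             # root account usage
    "PutBucketAcl",                  # bucket ACL change
    "StopLogging",                   # audit logging disabled
    "AuthorizeSecurityGroupIngress", # security group opened
}

def alerts(events):
    fired = []
    for e in events:
        key = e["name"]
        if key == "ConsoleLogin" and e.get("user") == "root":
            key = "ConsoleLogin:root"
        if key in ALERT_EVENTS:
            fired.append((e["time"], key))
    return fired

events = [
    {"time": "12:00", "name": "ConsoleLogin", "user": "root"},
    {"time": "12:05", "name": "GetObject",    "user": "app"},
    {"time": "12:07", "name": "StopLogging",  "user": "dev"},
]

print(alerts(events))
# → [('12:00', 'ConsoleLogin:root'), ('12:07', 'StopLogging')]
```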

A critical gap in many organizations is logging without monitoring. CloudTrail is enabled but no alerts are configured. VPC flow logs are collected but no one reviews them. Log data is only useful when someone or some automated system actively watches it and responds to anomalies. Establishing a mean time to detect (MTTD) target and measuring against it reveals whether the monitoring programme is effective.
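Measuring MTTD reduces to averaging the gap between when each issue was introduced and when it was detected; a sketch with illustrative timestamps:

```python
# Compute mean time to detect (MTTD) in minutes from a list of
# (introduced, detected) ISO-8601 timestamp pairs.
from datetime import datetime

def mttd_minutes(incidents):
    gaps = [
        (datetime.fromisoformat(detected)
         - datetime.fromisoformat(introduced)).total_seconds() / 60
        for introduced, detected in incidents
    ]
    return sum(gaps) / len(gaps)

incidents = [
    ("2024-01-01T10:00", "2024-01-01T10:30"),   # detected in 30 minutes
    ("2024-01-02T09:00", "2024-01-02T10:30"),   # detected in 90 minutes
]

print(mttd_minutes(incidents))   # → 60.0
```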

Misconfigurations

Cloud misconfigurations are the leading cause of cloud security incidents. Unlike attacks that require sophisticated exploitation, misconfigurations are simple oversights that expose resources directly. Common examples include publicly accessible storage buckets, overly permissive IAM roles, unencrypted databases, disabled MFA on privileged accounts, and security groups that allow all inbound traffic. Each is easy to prevent but easy to introduce accidentally.

CSPM tools continuously scan cloud accounts against security benchmarks such as the CIS Cloud Foundations Benchmark and the provider's own security best practices. Tools like Prisma Cloud, Wiz, Orca, and Lacework evaluate thousands of configuration checks across IAM, networking, storage, logging, and compute. Findings are prioritized by risk severity, helping security teams focus remediation effort on the most dangerous issues.

Infrastructure as Code (IaC) makes misconfiguration detection earlier and more systematic. Security linting tools (checkov, tfsec, tflint) analyze Terraform, CloudFormation, and similar templates before deployment and catch misconfigurations at the code review stage. When a security rule is violated, the developer sees the finding in their PR review, not after the resource is deployed and exposed.

The root cause of most misconfigurations is not malice but convenience. A developer opens a port 'just to test' and never closes it. A bucket is made public to share a file and never locked down again. Shared administrative credentials are never rotated. The systemic defense is to automate detection and enforce secure defaults through policy rather than relying on human discipline. When the default configuration is secure and deviations require explicit justification, misconfigurations become exceptions rather than the norm.

Public Exposure Risk

Public exposure risk is the attack surface created when cloud resources that should be private are reachable from the public internet. Every publicly accessible endpoint is a potential entry point for attackers. In cloud environments, resources can accidentally become public through a single misconfigured security group rule, a storage bucket ACL change, or an accidentally assigned public IP address.

Attack surface management tools continuously discover and catalog all internet-facing resources in a cloud environment. This includes load balancers, API gateways, EC2 instances with public IPs, exposed databases, and public storage buckets. Many organizations are surprised to discover resources they did not know were public, often because they were created by developers who did not fully understand the networking model.

Reducing public exposure requires a combination of controls. Use private subnets for internal services with no direct internet access, place all public traffic through a load balancer or API gateway that provides TLS termination and WAF protection, block direct instance-level public IP assignment for application servers, and enforce public access blocks on storage services at the account level. Network diagrams and automated asset inventory help maintain visibility as the environment grows.

Internet-wide scanners like Shodan and Censys index all publicly accessible services continuously. A new public endpoint can be discovered and probed within minutes of being created. The window between accidental exposure and attacker discovery is very short. Automated detection and remediation (or alerting) for public exposure is essential because the human review cycle is too slow to catch exposures before they are found.

Secure Defaults

Secure defaults means configuring cloud accounts, services, and resources to be secure from the moment they are created, requiring explicit opt-in to less secure configurations rather than the reverse. Many cloud services have historically been permissive by default, placing the burden on the customer to harden them. Modern cloud security practice inverts this: start locked down and open only what is explicitly needed.

Implementing secure defaults requires an account-level security baseline applied to every new account at creation time. A typical baseline includes mandatory MFA for all IAM users, CloudTrail enabled in all regions and all accounts, account-level S3 public access block enabled, default encryption for EBS volumes and RDS instances, and Service Control Policies that prevent disabling of security tools or creating unencrypted resources.
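A baseline like this can be validated mechanically at account creation time; the setting names below are illustrative stand-ins, not real provider flags:

```python
# Check a new account's settings against a required security baseline
# and return the list of settings that are missing or disabled.
BASELINE = {
    "mfa_required": True,
    "cloudtrail_all_regions": True,
    "s3_public_access_block": True,
    "default_ebs_encryption": True,
}

def baseline_gaps(account_settings):
    return [setting for setting, required in BASELINE.items()
            if required and not account_settings.get(setting, False)]

new_account = {
    "mfa_required": True,
    "cloudtrail_all_regions": True,
    "s3_public_access_block": False,   # drifted setting
    # default_ebs_encryption never configured
}

print(baseline_gaps(new_account))
# → ['s3_public_access_block', 'default_ebs_encryption']
```

In practice a check like this would run on account creation and on a schedule, so both the initial state and later drift are caught.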

Organizational policy enforcement through tools like AWS Organizations Service Control Policies (SCPs), Azure Policy, and GCP Organization Policy allows security teams to set guardrails that cannot be overridden by individual account administrators. This ensures that even if a developer misconfigures their account, organization-level controls prevent the most dangerous outcomes. SCPs are particularly powerful because they apply to all accounts in the organization, including new accounts created in the future.

Secure defaults do not replace active security testing but they dramatically reduce the blast radius of mistakes. When the default posture is secure and the default answer is 'no, you must explicitly allow that,' misconfigurations that create exposure become exceptions requiring explanation rather than the norm. Combining secure defaults with automated posture monitoring creates a defense-in-depth approach where both the initial state and drift from that state are controlled.