Cloud Incident Response

Paul Ashwood - June 20, 2023

What is cloud incident response?

Cloud Incident Response (Cloud IR) is the process you follow when a cybersecurity incident occurs in your cloud environment. The cloud aspects of IR essentially follow the typical IR phases (Preparation; Detection and Analysis; Containment, Eradication, and Recovery; and Post-Incident / Post-Mortem), but there are critical differences between cloud platforms (AWS, Azure, GCP, Oracle Cloud, etc.). A team of specialist responders with the right tools can make the difference in getting clear, definitive answers and the decision support you need to recover.

How is cloud IR different from traditional endpoint IR?

1. No physical access to servers

In exchange for having the cloud service provider manage data center and network operations and patching (for PaaS), as well as write and maintain the software (for SaaS), you give up things incident responders often want: physical access to servers to connect to a console, the ability to capture forensic images of hard drives, and the ability to install special diagnostic utilities.

In most cases with cloud environments, you can’t take out-of-band memory dumps, log onto the physical console, or receive actual hard drives from a cloud provider. And that’s OK. You have to work within the access parameters of the cloud provider — such as taking logical storage snapshots, collecting and analyzing the logs the cloud provider makes available, and so on.

2. Accelerated cloud life cycles

The dynamic combination of technologies in modern cloud estates (a mix of containers, functions-as-a-service, hypervisors, and virtual machines running microservices, with workloads spinning up and down to meet the changing demands of an organization) makes cloud IR a very different environment from the classic capture-an-image, collect-the-logs incident on a typical endpoint.

Without adequate planning and internal communication, forensic artifacts are likely to be lost, because workloads found to be “defective” are typically discarded to save costs. That decision may come from automated processes meant to keep cloud workloads healthy, or from well-meaning operations teams who rapidly release new versions of a platform to address a vulnerability or exploit. If you don’t know where to look, and you haven’t planned to collect logs and retain relevant snapshots, you may find that you don’t have enough information to answer the questions your organization has to ask to contain and recover from a potential breach.

3. “Rogue” operations are almost the norm

New cloud environments can be established by any business unit with access to a credit card, and are often unknown to the central information security organization. The security organization usually knows about some core critical operations in the cloud, but adversaries very commonly land their attacks on components of your cloud estate that are peripheral to your awareness: subsidiaries acquired recently or long ago, systems and applications slated for retirement, or systems that were supposed to receive security control upgrades in a later phase of a security plan that didn’t come to fruition in time. These peripheral operations may not be considered core to the organization’s mission, but they can lead to breaches whose cost greatly exceeds the time and budget that was allocated to them before the breach.

4. Identity is the new perimeter

Unlike a traditional data center where the perimeter is defined by the network using firewalls, identity has become the true perimeter in a cloud environment. This drastically changes our containment and remediation strategies when facing a cloud data breach. It’s no longer practical to block bad IP addresses, just as it’s no longer possible to assume all authorized users are coming from a particular network.

Adversaries who discover user accounts that lack MFA requirements, or API keys (commonly stolen from workstations, source control systems, and CI/CD hosts, among others), can cause damage from anywhere on the internet. Adversaries also target accounts synchronized between Active Directory (AD) and Azure Active Directory, meaning an on-prem compromise can easily become a cloud breach as well.
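As a rough illustration of treating identity as the perimeter, the sketch below flags identities that allow interactive login without MFA or that hold long-lived API keys. The `Principal` record, its field names, and the 90-day key age threshold are all assumptions made for this example; real identity inventories differ by provider.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass
class Principal:
    """Hypothetical inventory record; real field names vary by provider."""
    name: str
    has_console_password: bool
    mfa_enabled: bool
    api_key_created: list = field(default_factory=list)  # creation times of active keys


def risky_principals(principals, now, max_key_age_days=90):
    """Return (name, reason) pairs for identities worth reviewing first."""
    findings = []
    max_age = timedelta(days=max_key_age_days)  # assumed threshold, tune to your policy
    for p in principals:
        if p.has_console_password and not p.mfa_enabled:
            findings.append((p.name, "interactive login without MFA"))
        # Report each principal at most once for old keys.
        if any(now - created > max_age for created in p.api_key_created):
            findings.append((p.name, "long-lived API key"))
    return findings


now = datetime(2023, 6, 20, tzinfo=timezone.utc)
inventory = [
    Principal("alice", True, True),
    Principal("bob", True, False),
    Principal("ci-deployer", False, False, [datetime(2022, 1, 1, tzinfo=timezone.utc)]),
]
findings = risky_principals(inventory, now)
for name, reason in findings:
    print(f"{name}: {reason}")
```

A list like this only prioritizes review; an account that passes both checks can still be compromised.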

5. Skilled staff are in even shorter supply

Just as technical staff with senior experience in the cloud are in short supply, those with cloud security engineering experience are even harder to find and retain, and those with cloud security incident response experience even more so. For organizations with a presence in more than one cloud platform, this challenge is multiplied.

Understanding cloud log sources

One of the biggest differences between traditional IR on an endpoint and cloud IR is the log sources. Logs are always important in a breach investigation, but the “ground truth” of what is normal and authorized in a cloud environment is harder to construct without logs.

Event logs enable us to piece together the malicious actions executed by a threat actor during a forensic investigation and help us contain and eject that threat actor. But cloud log sources do not align with endpoint log sources, and they require different collection tools and techniques. In many cases, cloud logs are not enabled by default or have been purposely disabled, and they can be complex to retrieve without the right tools.

IR firms without dedicated cloud IR specialists, who understand these cloud services and the tools built on them, will struggle to collect the data needed for an efficient and effective investigation.

In SaaS cases, we always collect authentication logs (sign-in logs) and a targeted set of operations for all users, as well as another set of operations for all users of interest. In infrastructure (PaaS) cases, we collect cloud service plane logs as well as any relevant data plane logs, VPC/VNet flow logs, and any relevant specialty logs (function-as-a-service invocation logs, load balancer or CDN logs, etc.).

In both types of investigations, we will collect performance metric data if it’s relevant (especially when more direct evidence sources are lacking) and cloud threat detection alerts for valuable context. Configuration snapshots are not log sources per se but can assist with interpreting the events recorded in the logs.
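The collection scope described above can be sketched as a simple checklist that compares what an investigation needs with what the environment actually has available. The short identifiers below are hypothetical labels chosen for this example, not provider API names, and the split into "required" and "supporting" sources is a simplification of the text.

```python
# Log sources per case type, paraphrased from the text above.
# All identifiers are hypothetical labels, not real provider log names.
REQUIRED_SOURCES = {
    "saas": {"sign_in_logs", "user_operation_logs"},
    "infrastructure": {"service_plane_logs", "data_plane_logs", "flow_logs"},
}

# Collected in both case types when relevant and available.
SUPPORTING_SOURCES = {
    "specialty_logs",
    "performance_metrics",
    "threat_detection_alerts",
    "config_snapshots",
}


def collection_plan(case_type, available):
    """Split the wanted sources into what can be collected and what is missing."""
    available = set(available)
    required = REQUIRED_SOURCES[case_type]
    return {
        "collect": sorted(available & (required | SUPPORTING_SOURCES)),
        "missing": sorted(required - available),
    }


plan = collection_plan(
    "infrastructure",
    available=["service_plane_logs", "threat_detection_alerts"],
)
print(plan)
```

Anything listed under `missing` is a source that was never enabled or has been disabled, which is worth knowing before the investigation depends on it.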

Furthermore, once you have those logs, interpretation of log entries (or their absence) is not always straightforward. Seasoned cloud investigators know what log trails exist in the different cloud environments (AWS, Azure, GCP) and how to quickly and efficiently collect and interpret the data from them. Log analysis is a key part of cloud compromise assessment.
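Because the absence of log entries can matter as much as their presence, one simple check is to look for unusually long silences in an event stream. The following is a minimal sketch, assuming the event timestamps have already been parsed from whatever log format is in use; the one-hour threshold is an arbitrary choice for the example.

```python
from datetime import datetime, timedelta


def find_log_gaps(timestamps, max_silence):
    """Return (start, end) pairs where consecutive events are further apart than max_silence."""
    ordered = sorted(timestamps)
    return [
        (earlier, later)
        for earlier, later in zip(ordered, ordered[1:])
        if later - earlier > max_silence
    ]


events = [
    datetime(2023, 6, 20, 9, 0),
    datetime(2023, 6, 20, 9, 5),
    datetime(2023, 6, 20, 13, 0),
    datetime(2023, 6, 20, 13, 2),
]
gaps = find_log_gaps(events, max_silence=timedelta(hours=1))
for start, end in gaps:
    print(f"no events between {start} and {end}")
```

A gap is only a lead: it may reflect disabled logging, a delivery delay, or simply a quiet period, so it still needs interpretation by someone who knows the log source.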

Organizations try to balance the speed of delivering new digital user experiences against the security needs and costs of event logging in their cloud platform, and this creates a temptation to minimize or completely disable logging in the cloud environment. That becomes problematic in the face of a cloud data breach.

Containing an active cloud threat

In a cloud data breach, unlike a breach on an endpoint device, you cannot simply unplug the compromised machine and disconnect it from the network. Because cloud data breaches can also move quickly across your cloud environment, containment is considerably more difficult in the cloud than it is with an endpoint breach.

Containing an incident in cloud infrastructure includes identifying all security principals compromised and/or added by the adversary, including users, compromised roles (such as via federated sessions or compromised identity stores), and service accounts. In many cases, the cloud provider supports more than one credential source for a security principal, allowing an adversary to impersonate a user or service account without interfering with the original, authorized purpose for that account — thereby hindering detection.

Some examples of this include multiple API keys in addition to a password for a user, multiple credential sources for a service account, and multiple MFA devices for a single user. All these must be carefully tracked and eliminated while ensuring adequate monitoring to detect any attempts by the adversary to reestablish persistence.
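One way to start tracking these credentials is to list every credential attached to each principal and single out those created after the earliest suspected compromise time. The sketch below assumes a flat export with one row per credential; the row format, names, and dates are invented for illustration.

```python
from datetime import datetime

# Hypothetical export: one row per credential attached to a principal.
credentials = [
    {"principal": "svc-backup", "type": "api_key", "id": "key-1", "created": datetime(2022, 3, 1)},
    {"principal": "svc-backup", "type": "api_key", "id": "key-2", "created": datetime(2023, 6, 18)},
    {"principal": "carol", "type": "password", "id": "pw", "created": datetime(2021, 9, 9)},
    {"principal": "carol", "type": "mfa_device", "id": "mfa-1", "created": datetime(2021, 9, 9)},
    {"principal": "carol", "type": "mfa_device", "id": "mfa-2", "created": datetime(2023, 6, 19)},
]


def credentials_added_since(rows, earliest_compromise):
    """Group credentials created on or after the suspected compromise time by principal."""
    suspects = {}
    for row in rows:
        if row["created"] >= earliest_compromise:
            suspects.setdefault(row["principal"], []).append((row["type"], row["id"]))
    return suspects


suspects = credentials_added_since(credentials, datetime(2023, 6, 15))
for principal, creds in suspects.items():
    print(principal, creds)
```

This only surfaces credentials the adversary may have added. Older credentials on the same principals may have been stolen rather than created, so they still need to be reviewed and rotated.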

Containment scope doesn’t end there. Automation and deployment frameworks in the cloud platforms, and any user-defined code which can act as a security principal, including functions-as-a-service, containers, and hosts, can be used by the adversary to re-establish persistence in the environment.

Network backdoors, including rogue VPC peering points and simple modifications to network security group rules, can expose resources to continued adversary control. Cloud-based infrastructure with excessive outbound permissions, such as policies allowing access to all IP ranges belonging to a trusted cloud provider, can also be abused by adversaries to blend in with authorized traffic to that cloud provider. A comprehensive review of all possible pivot points, persistence mechanisms, and outliers in configured access is required to ensure thorough containment and subsequent ejection.
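As part of such a review, overly broad network rules can be surfaced with a simple size check on each rule's address range. This is a minimal sketch using Python's standard `ipaddress` module; the rule dictionaries, the group names, and the 65,536-address threshold are assumptions for the example, and the `52.0.0.0/8` range merely stands in for "all ranges of a trusted cloud provider".

```python
import ipaddress


def overly_broad_rules(rules, max_addresses=65536):
    """Return rules whose CIDR range covers more addresses than max_addresses."""
    findings = []
    for rule in rules:
        network = ipaddress.ip_network(rule["cidr"], strict=False)
        if network.num_addresses > max_addresses:
            findings.append((rule["group"], rule["direction"], rule["cidr"]))
    return findings


# Hypothetical rule export; real security group formats differ by provider.
rules = [
    {"group": "web", "direction": "inbound", "cidr": "0.0.0.0/0"},
    {"group": "db", "direction": "inbound", "cidr": "10.0.1.0/24"},
    {"group": "worker", "direction": "outbound", "cidr": "52.0.0.0/8"},
]
broad = overly_broad_rules(rules)
for group, direction, cidr in broad:
    print(f"{group}: {direction} rule allows {cidr}")
```

Comparing the output against a known-good baseline helps separate rules the adversary changed from long-standing misconfigurations, though both deserve attention during containment.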

You probably know that a Cloud Security Posture Management (CSPM) solution is an important defense prior to the occurrence of a breach, because a CSPM helps identify Indicators of Misconfiguration (IOMs) that contribute to the risk of a breach, and of course remediating those IOMs can actually prevent a breach. But very often, a CSPM is an important part of breach containment as well, both because it can help identify (in combination with endpoint security solutions) an active breach at the interface between endpoint and cloud, and because adversaries can continue to attempt to breach the cloud environment using these misconfigurations.

In other words, while the visibility and containment efforts are ongoing, your organization can leverage the urgency of the breach to remediate existing misconfigurations and any new ones that were introduced by the adversary. Sharing access to the CSPM console with your peers in cloud platform operations and empowering them to see the progress they make can considerably shorten the path to securing the cloud environment.

CrowdStrike’s Incident Response for Cloud service

Incident response for a cloud data breach requires a different set of skills, experience, and tools than traditional IR for an on-premises attack. Without dedicated cloud IR specialists with the right tooling for the different cloud service provider platforms (AWS, Azure/O365, GCP), you will experience long delays in getting critical answers, which will slow down your response and recovery. Organizations without the necessary skills to respond effectively have trouble containing active threats and ejecting them from the network. They often lose critical evidence and are unable to determine exactly what happened during the attack and what data may have been compromised.

CrowdStrike’s dedicated team of cloud IR specialists is called in to perform investigations on cloud data breaches, many of which are the result of misconfigured cloud security settings. We recommend that organizations fortify their cloud security posture before a breach occurs rather than deal with the aftermath of a cloud data breach.

Expert Tip

CrowdStrike Services offers a Cloud Security Assessment that will highlight areas of misconfiguration and ineffective cloud security settings to prevent cloud data breaches before they occur.

In the event that you are experiencing a cloud data breach, you need to engage an IR firm that has a specialized team of cloud incident responders and investigators. Firms that deliver traditional IR services on an endpoint breach will not be effective when it comes to responding to a cloud breach.

CrowdStrike has a dedicated team of cloud IR specialists with knowledge of the AWS, Azure / O365, and GCP cloud environments. We have built our own suite of Cloud Data Collector tools, which we use to accelerate the forensic investigation and gain visibility into the malicious actions that have been executed by a threat actor.

CrowdStrike prides itself on being a leader in cloud incident response and brings control, stability, and organization to what can become a chaotic event. Learn how CrowdStrike can help you respond to a cloud data breach faster and more effectively:

Incident Response for Cloud

The CrowdStrike Services Cloud Incident Response team is composed of specialists drawn from industry, government, and the military who are dedicated to providing expert cloud incident investigation, containment, and remediation services. These experts are capable of supporting comprehensive multi-cloud and on-premises investigations.

GET TO KNOW THE AUTHOR

Paul Ashwood is a Senior Product Marketing Manager for CrowdStrike Services with a focus on Incident Response and Advisory Services. He has over 37 years of experience in applications development, testing, mobile apps, digital transformation consulting, and cybersecurity. Paul started his career as a developer before moving into sales and business development, and now marketing, with organizations like EDS, HP, DXC Technology, and CrowdStrike. He holds a Bachelor’s in Applied Science (Computing) from his native country of Australia and currently resides in the Atlanta, GA area.