Even with extremely mature preventive and detective controls, a process still needs to be put in place to respond to and mitigate the potential impact of security incidents. The architecture of workloads strongly affects the ability of teams to operate effectively during an incident, to isolate or contain systems, and to restore operations to a known good state. Putting in place the tools and access ahead of a security incident, then routinely practicing incident response through game days, will help ensure that the architecture accommodates timely investigation and recovery.
Maintaining situational awareness is one of the most important principles in any incident. When cloud resources are properly described with tags, incident responders can quickly determine the potential impact of an incident. For example, tagging instances and other resources with an owner or a work queue in a ticketing system allows the team to engage the right people more quickly. Tagging systems with a data classification or a criticality attribute allows the impact of an incident to be estimated more accurately.
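As a rough illustration of how tags can feed an impact estimate, the sketch below groups affected resources by an owner tag and reports the highest criticality found. The tag names (`owner`, `criticality`), the three-level scale, and the resource dictionaries are assumptions made for this example, not part of any particular provider's API.

```python
from collections import defaultdict

# Hypothetical criticality scale for this example; a higher rank is more severe.
CRITICALITY_RANK = {"low": 1, "medium": 2, "high": 3}


def summarize_impact(resources):
    """Group affected resources by owner tag and find the highest criticality.

    Each resource is a dict with an "id" and an optional "tags" dict.
    Resources without an owner tag are collected under "untagged".
    """
    by_owner = defaultdict(list)
    highest = None
    for res in resources:
        tags = res.get("tags", {})
        by_owner[tags.get("owner", "untagged")].append(res["id"])
        crit = tags.get("criticality")
        # Ignore missing or unrecognized criticality values.
        if crit in CRITICALITY_RANK and (
            highest is None or CRITICALITY_RANK[crit] > CRITICALITY_RANK[highest]
        ):
            highest = crit
    return {"by_owner": dict(by_owner), "highest_criticality": highest}


# Example input: two tagged instances and one with no tags at all.
affected = [
    {"id": "i-001", "tags": {"owner": "payments-team", "criticality": "high"}},
    {"id": "i-002", "tags": {"owner": "payments-team", "criticality": "low"}},
    {"id": "i-003", "tags": {}},
]
summary = summarize_impact(affected)
```

Collecting untagged resources in their own bucket makes gaps in tagging visible during an incident, which is usually when they matter most.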
During an incident, the right people need access to isolate and contain the incident, and then to perform a forensic investigation that identifies the root cause quickly. In some cases, the incident response team is actively involved in remediation and recovery as well. Working out how to grant access to the right people in the middle of an incident delays the response, and can introduce other security weaknesses if access is shared or improperly provisioned under pressure. Determine the access each team member needs ahead of time, and then regularly verify that the access works, or can be easily activated, when needed.
Use the cloud provider's APIs to automate many of the routine tasks that need to be performed during an incident and the subsequent investigation. For example, isolate an instance by changing the security groups associated with it or by removing it from a load balancer. Architecting a workload with Auto Scaling can allow the instance under investigation to be removed from production without affecting the availability of other applications.
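The isolation step described above can be sketched as a small function. The `FakeComputeApi` class is a hypothetical in-memory stand-in so that the example runs without cloud credentials; its method names are assumptions and would need to be mapped to the real provider SDK calls.

```python
from dataclasses import dataclass, field


@dataclass
class FakeComputeApi:
    """In-memory stand-in for a provider API (hypothetical, for illustration)."""
    security_groups: dict = field(default_factory=dict)  # instance id -> group ids
    lb_targets: set = field(default_factory=set)         # instance ids behind the LB

    def get_security_groups(self, instance_id):
        return list(self.security_groups[instance_id])

    def set_security_groups(self, instance_id, groups):
        self.security_groups[instance_id] = list(groups)

    def deregister_target(self, instance_id):
        self.lb_targets.discard(instance_id)


def isolate_instance(api, instance_id, quarantine_group):
    """Take the instance out of the load balancer, then restrict its network access.

    Returns the original security groups so they can be recorded with the incident.
    """
    original = api.get_security_groups(instance_id)
    api.deregister_target(instance_id)
    api.set_security_groups(instance_id, [quarantine_group])
    return original


# Example: isolate i-001 while i-002 keeps serving traffic.
api = FakeComputeApi(
    security_groups={"i-001": ["sg-web", "sg-ssh"]},
    lb_targets={"i-001", "i-002"},
)
previous_groups = isolate_instance(api, "i-001", "sg-quarantine")
```

Returning the original groups is a design choice for this sketch: it keeps a record of the pre-incident configuration without the function having to know where incident notes are stored.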
Forensics often requires capturing the disk image or "as-is" configuration of an operating system. Use block storage snapshots and instance APIs to capture the data and state of systems under investigation. Storing snapshots and related incident artifacts in an object store helps ensure that the data remains available and is retained appropriately.
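A minimal sketch of the capture step follows, assuming a provider API that can snapshot a volume and describe an instance. The `FakeComputeApi` class, the snapshot ID format, and the `incidents/<incident>/<instance>.json` key layout are all hypothetical; a plain dictionary stands in for the object store.

```python
import json


class FakeComputeApi:
    """In-memory stand-in for snapshot and instance APIs (hypothetical)."""

    def __init__(self, volumes, configs):
        self.volumes = volumes    # instance id -> list of volume ids
        self.configs = configs    # instance id -> configuration dict
        self.snapshots = []       # snapshot ids created so far

    def create_snapshot(self, volume_id):
        snapshot_id = f"snap-{len(self.snapshots) + 1}-{volume_id}"
        self.snapshots.append(snapshot_id)
        return snapshot_id

    def describe_instance(self, instance_id):
        return dict(self.configs[instance_id])


def capture_evidence(api, object_store, incident_id, instance_id):
    """Snapshot every volume, record the configuration, and store a manifest.

    Returns the object store key under which the manifest was written.
    """
    snapshot_ids = [api.create_snapshot(v) for v in api.volumes[instance_id]]
    record = {
        "incident": incident_id,
        "instance": instance_id,
        "snapshots": snapshot_ids,
        "configuration": api.describe_instance(instance_id),
    }
    key = f"incidents/{incident_id}/{instance_id}.json"
    object_store[key] = json.dumps(record, sort_keys=True)
    return key


# Example: capture one instance with two attached volumes.
api = FakeComputeApi({"i-001": ["vol-a", "vol-b"]}, {"i-001": {"type": "small"}})
store = {}
manifest_key = capture_evidence(api, store, "INC-42", "i-001")
```

Keeping one manifest per instance under a per-incident prefix makes it straightforward to apply a single retention policy to everything collected for that incident.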
During an incident, before the root cause has been identified and the incident has been contained, it is difficult to conduct investigations in an untrusted environment. Security practitioners use Devek to quickly create a new, trusted environment in which to conduct a deeper investigation. The Devek design preconfigures instances in an isolated environment with all the tools that forensic teams need to determine the cause of the incident. This cuts down on the time it takes to gather the necessary tools and isolate the systems under examination, and it ensures that the team is operating in a clean room.