SOC Analyst

What is a SOC?

A Security Operations Center (SOC) is a centralized unit that continuously monitors and defends an organization’s information systems. Its main goal is to detect, investigate, and respond to cybersecurity threats using tools like SIEMs, IDS/IPS, and EDR solutions.

How Does a SOC Work?

A SOC focuses on the operational aspects of cybersecurity—monitoring, alert triage, incident investigation, and response—rather than designing strategy or implementing security infrastructure. Most teams follow a tiered analyst structure:

Tier 1 Analysts monitor and triage alerts.
Tier 2 Analysts investigate escalated events and improve detections.
Tier 3 Analysts handle advanced threats and conduct proactive threat hunting.

Additional roles include detection engineers, incident responders, threat intel analysts, and security engineers. Some SOCs also conduct forensics and malware analysis to determine root causes and enhance long-term defense.

Cyber Kill Chain Lifecycle

Recon (Reconnaissance)

This is the initial stage of an attack. During reconnaissance, the attacker selects a target and gathers as much useful information as possible. This can include technical details, public data, and other intelligence that may aid in later stages of the attack.

Weaponization

In this phase, the attacker uses the information collected during reconnaissance to develop an exploit or payload. The goal is to craft a method of gaining access to the target system while avoiding detection by the organization's security solutions.

Delivery

At this stage, the attacker delivers the payload to the victim. Common delivery methods include:

Phishing emails with malicious attachments or links
Compromised websites hosting the payload
Physical methods (e.g., infected USB drives)

Exploitation

This is the moment the payload is triggered, exploiting a vulnerability to gain access to the system.

Installation

Once the system is exploited, the attacker installs malware to maintain access. This may involve:

Deploying backdoors, droppers, or rootkits
Implementing persistence mechanisms for continued control

Command and Control (C2)

The attacker establishes remote communication with the compromised system, allowing for:

Continuous control
Data exfiltration

Actions on Objectives

In this final stage, the attacker performs their intended actions, which may include:

Installing ransomware
Exfiltrating sensitive data
Moving laterally within the network
Compromising the domain
Taking full control of the infrastructure
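
The ordering of the stages can be useful during an investigation: the furthest stage with evidence gives a rough idea of how far an intrusion has progressed. Below is a minimal sketch of that idea. The detection names in `DETECTION_STAGE` are hypothetical placeholders, not rule names from any particular SIEM.

```python
from enum import IntEnum


class KillChainStage(IntEnum):
    """The seven stages described above, numbered in attack order."""
    RECONNAISSANCE = 1
    WEAPONIZATION = 2
    DELIVERY = 3
    EXPLOITATION = 4
    INSTALLATION = 5
    COMMAND_AND_CONTROL = 6
    ACTIONS_ON_OBJECTIVES = 7


# Hypothetical mapping from detection names to stages. A real SOC would
# maintain its own mapping for the rules deployed in its tooling.
DETECTION_STAGE = {
    "port_scan": KillChainStage.RECONNAISSANCE,
    "phishing_attachment": KillChainStage.DELIVERY,
    "exploit_attempt": KillChainStage.EXPLOITATION,
    "new_autorun_entry": KillChainStage.INSTALLATION,
    "beaconing": KillChainStage.COMMAND_AND_CONTROL,
    "mass_file_encryption": KillChainStage.ACTIONS_ON_OBJECTIVES,
}


def furthest_stage(detections):
    """Return the latest kill chain stage seen, or None if nothing maps."""
    stages = [DETECTION_STAGE[d] for d in detections if d in DETECTION_STAGE]
    return max(stages) if stages else None
```

For example, `furthest_stage(["port_scan", "beaconing"])` returns `KillChainStage.COMMAND_AND_CONTROL`, which suggests the host should be treated as compromised, not merely probed.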

Incident Handling Process

Preparation Stage

The Preparation Stage focuses on building an effective incident response capability and implementing strong preventive measures. This includes assembling a trained team, having clear documentation, and ensuring access to essential tools like forensic equipment and secure communication systems. Protection efforts—though not directly handled by the incident response team—are crucial and must be well understood by them.

Key measures include:

DMARC for email security
Endpoint hardening
Network segmentation
Multi-factor authentication (MFA)
Privilege management
Continuous vulnerability scanning
User awareness training
Active Directory assessments
Purple team exercises

All of these efforts aim to reduce the attack surface and enhance readiness for detecting and responding to incidents effectively.
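
Taking DMARC as one example from the list: a domain publishes its DMARC policy as a DNS TXT record at `_dmarc.<domain>`, made of `tag=value` pairs separated by semicolons. The sketch below parses such a record and checks whether the policy asks receivers to quarantine or reject failing mail. The record and the `example.com` address are illustrative. The check is a simplification that ignores tags such as `pct`.

```python
def parse_dmarc(record):
    """Split a DMARC TXT record into a dict of tag -> value."""
    tags = {}
    for part in record.split(";"):
        part = part.strip()
        if not part:
            continue
        key, _, value = part.partition("=")
        tags[key.strip().lower()] = value.strip()
    return tags


def is_enforcing(record):
    """True when the record is DMARC and its policy is quarantine or reject."""
    tags = parse_dmarc(record)
    policy = tags.get("p", "").lower()
    return tags.get("v") == "DMARC1" and policy in {"quarantine", "reject"}


# Illustrative record for a placeholder domain.
EXAMPLE = "v=DMARC1; p=reject; rua=mailto:dmarc-reports@example.com"
```

A record with `p=none` only requests reports and does not stop spoofed mail, so `is_enforcing` treats it as not enforcing.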

Detection & Analysis Stage

When a potential security incident is detected, conduct an initial investigation and establish context before assembling the team and declaring an organization-wide incident response.

The Detection and Analysis phase is a critical part of incident response because it lets teams identify, investigate, and understand security incidents before they escalate. Detections come from multiple sources, such as security tools, user reports, threat hunting, and third-party notifications. The phase requires strong technical expertise, deep visibility into the network, and collaboration between teams. The initial investigation should gather context, build an accurate incident timeline, and assess the severity and scope of the event.

Key questions to evaluate the severity and scope of a security incident:

What is the impact of the exploitation?
What are the requirements for the attacker to exploit it?
Can any business-critical systems be affected?
Are there any suggested remediation steps?
How many systems have been impacted?
Is the exploit currently being used in the wild?
Does the exploit have worm-like propagation capabilities?
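
One way to make the answers to these questions comparable across incidents is to turn them into a rough score. The sketch below is only an illustration. The field names, weights, and thresholds are assumptions and would need tuning to an organization's own severity matrix.

```python
from dataclasses import dataclass


@dataclass
class IncidentAssessment:
    """Answers to the scoping questions; all field names are illustrative."""
    impact: int                       # 1 (minor) to 3 (severe), analyst judgment
    easy_to_exploit: bool             # few prerequisites for the attacker
    business_critical_affected: bool
    remediation_available: bool
    systems_impacted: int
    exploited_in_the_wild: bool
    wormable: bool


def severity_score(a):
    """Add up illustrative weights; a higher score means more urgent."""
    score = a.impact * 2
    score += 2 if a.easy_to_exploit else 0
    score += 3 if a.business_critical_affected else 0
    score += 0 if a.remediation_available else 1
    score += 2 if a.systems_impacted > 10 else (1 if a.systems_impacted > 1 else 0)
    score += 2 if a.exploited_in_the_wild else 0
    score += 3 if a.wormable else 0
    return score


def severity_label(score):
    """Map the score onto coarse labels; the thresholds are assumptions."""
    if score >= 12:
        return "critical"
    if score >= 8:
        return "high"
    if score >= 4:
        return "medium"
    return "low"
```

A score like this supports prioritization but does not replace analyst judgment. It is mainly a way to record why one incident was handled before another.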

How is data collected?

There are two main approaches:

Live Response: This is performed while the system is still powered on. It involves collecting volatile data such as running processes, active network connections, recent file activity, etc. This method is common because many critical traces (artifacts) only exist in RAM, and turning off the system would erase them.
Post-Mortem Analysis (Powered Off): Sometimes the system is shut down to prevent the attacker from interfering. However, this means losing RAM contents, which can include key evidence.

Legal Aspect: Chain of Custody

Throughout the data collection process, it’s crucial to document who accessed what data, when, and how, in order to maintain the chain of custody. This ensures that if legal action is required, the digital evidence will be admissible in court.
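
A common way to support the chain of custody is to hash each piece of evidence when it is acquired and to log every handling step. The sketch below shows one possible shape for such a log, using only the Python standard library. The entry fields and the JSON-lines log format are assumptions, not a legal standard.

```python
import getpass
import hashlib
import json
from datetime import datetime, timezone


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large disk or memory images fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def custody_entry(path, action, handler=None):
    """Build one log entry: who did what to which evidence, and when."""
    return {
        "evidence": str(path),
        "sha256": sha256_of(path),
        "action": action,
        # Falls back to the current OS user when no handler is given.
        "handler": handler or getpass.getuser(),
        "timestamp_utc": datetime.now(timezone.utc).isoformat(),
    }


def append_entry(log_path, entry):
    """Append the entry as one JSON line; earlier entries are not rewritten."""
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(entry) + "\n")
```

Recomputing the hash at each later step and comparing it with the first entry shows whether the evidence has changed since acquisition.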

Containment, Eradication, & Recovery Stage

Containment

This phase aims to stop the spread of the incident and limit the damage. It is divided into:

Short-term containment: Quick actions that isolate the threat without significantly altering the affected systems.
Long-term containment: More permanent controls that improve security and limit future risks.

Containment actions should be coordinated and carried out at the same time across all affected systems. Containing only some of them can alert the attacker and give them a chance to adapt or hide.

Eradication

Once containment is in place, the next step is to completely remove the threat from the environment. This typically includes:

Deleting malware, backdoors, and any malicious files
Rebuilding or restoring affected systems from clean backups
Applying additional patches and hardening measures

The goal is to eliminate both the symptoms and the root cause of the attack.

Recovery

The organization begins restoring normal operations. This involves:

Validating that restored systems are functioning properly and data is intact
Reintroducing systems into production
Monitoring them closely for unusual activity, since recently compromised systems are prime targets for reinfection

Even after systems are back online, vigilance is essential to ensure the attacker does not return.

Post-Incident Activity Stage

This final stage focuses on learning and improving from the incident. Once the systems are restored and the threat is removed, it's time to analyze what happened, how it was handled, and how to prevent similar events in the future.

The incident report is a vital deliverable and should answer questions like:

What happened and when?
How well did the team handling the incident perform with regard to plans, playbooks, policies, and procedures?
Did the business provide the necessary information and respond promptly to aid in handling the incident in an efficient manner? What can be improved?
What actions have been implemented to contain and eradicate the incident?
What preventive measures should be put in place to prevent similar incidents in the future?
What tools and resources are needed to detect and analyze similar incidents in the future?

Triaging Process Overview

Alert triaging is the process SOC analysts use to review, prioritize, and assess security alerts to determine their threat level and impact. The goal is to identify whether an alert is a real threat, a false positive, or requires further investigation.

Initial Alert Review
- Review alert metadata (IP addresses, time, systems involved, etc.).
- Check logs (network, system, application) to understand context.
Alert Classification
- Categorize alert severity and urgency using internal standards.
Alert Correlation
- Cross-check with other alerts/events for patterns or related threats.
- Use SIEM data and threat intel for validation.
Data Enrichment
- Add context using packet captures, malware samples, memory dumps, etc.
- Use sandboxes or open-source tools to analyze suspicious artifacts.
Risk Assessment
- Evaluate threat impact on critical assets and likelihood of spread.
- Consider compliance or regulatory risks.
Contextual Analysis
- Analyze affected asset value, existing security controls, and relevance to regulations (e.g., GDPR, HIPAA).
Incident Response Planning
- If the alert is serious, start documenting and assigning response roles.
- Coordinate with relevant internal teams.
Consultation with IT
- Check for maintenance, misconfigurations, or known changes that could explain the alert.
- Use the answers to rule out false positives.
Response Execution
- Act on alerts based on risk. Resolve, dismiss, or escalate as needed.
Escalation
- Escalate alerts involving critical systems, advanced threats, or legal implications.
- Share detailed findings and risks with higher-level teams or external bodies if needed.
Continuous Monitoring
- Keep tracking the situation and share updates with stakeholders.
De-escalation
- Once the incident is contained and resolved, inform relevant teams and document lessons learned.
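
The correlation and classification steps above can be sketched in a few lines: group alerts that share a source IP and arrive close together, then review the groups with the highest severity first. The alert fields (`src_ip`, `time`, `severity`), the 15-minute window, and the severity ranks are assumptions for illustration. Real alert schemas depend on the SIEM in use.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Illustrative ranking; organizations define their own severity scale.
SEVERITY_RANK = {"low": 1, "medium": 2, "high": 3, "critical": 4}


def correlate_by_source(alerts, window=timedelta(minutes=15)):
    """Group alerts from the same source IP that occur within `window`
    of the previous alert in the same group."""
    by_ip = defaultdict(list)
    for alert in sorted(alerts, key=lambda a: a["time"]):
        by_ip[alert["src_ip"]].append(alert)

    groups = []
    for ip_alerts in by_ip.values():
        current = [ip_alerts[0]]
        for alert in ip_alerts[1:]:
            if alert["time"] - current[-1]["time"] <= window:
                current.append(alert)
            else:
                groups.append(current)
                current = [alert]
        groups.append(current)
    return groups


def prioritize(groups):
    """Order groups by their highest severity, then by number of alerts."""
    def key(group):
        top = max(SEVERITY_RANK[a["severity"]] for a in group)
        return (top, len(group))
    return sorted(groups, key=key, reverse=True)


# Example input with made-up alerts.
alerts = [
    {"src_ip": "10.0.0.5", "time": datetime(2024, 1, 1, 10, 0), "severity": "low"},
    {"src_ip": "10.0.0.5", "time": datetime(2024, 1, 1, 10, 5), "severity": "high"},
    {"src_ip": "10.0.0.5", "time": datetime(2024, 1, 1, 12, 0), "severity": "low"},
    {"src_ip": "10.0.0.9", "time": datetime(2024, 1, 1, 10, 1), "severity": "medium"},
]
ranked = prioritize(correlate_by_source(alerts))
```

Here the two alerts from 10.0.0.5 that arrive five minutes apart form one group and are ranked first, because one of them is high severity. The later alert from the same address falls outside the window and is reviewed separately.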

