SIEM functionality / wazuh overview

Now that your SIEM is installed and even has some agents running that feed log data into it - we can start to dive into the different functionalities of the software.

The SIEM we use throughout the course is called wazuh, and it consists of separate services in two areas - the individual client side (the agents) and the centralized server side.

The agents are available for many operating systems (Windows, macOS, Linux/Unix), and it is often even possible to run agents in container environments (e.g. docker / kubernetes).

Ok cool, but what does an agent do actually?

wazuh agents

Contrary to its name, the agent is not a hidden spy. Agents are small programs that are installed on clients (a client is a computer/service/container), and each one is responsible for collecting all the required system information as logs and forwarding those to the server-side components.

Great, but umm - Can you explain that again?

Sure!

In a network, many different computers exchange information, and while some are databases, others are domain controllers or even your work laptop.

Each system most likely has different software installed, and that software generates information (logs) about its usage and for debugging. This data is sent to the server.

Back in the last module we installed wazuh agents for Windows and Linux clients - this software is responsible for forwarding the data to the SIEM.

Now the remaining question that might nag you in the back of your head is - How does the agent know which logs to send to the server?

Short answer - it does not 😅 - longer answer, the sources need to be configured by the blue team/IT team

→ YOU! 🤘

Great, but umm how?

When the agent is installed you can add sources to the agent so that it knows where to get the log data from and in which format.

The agent has a log collection component - you will see the name ossec a lot because wazuh started out as a fork of OSSEC. This component collects logs from various sources and feeds them into the wazuh server. The agent has a configuration file that is located here: /var/ossec/etc/ossec.conf for Linux and C:\Program Files (x86)\ossec-agent\ossec.conf for Windows agents.

Inside the ossec.conf you will find hierarchical tags (<somethingies>) like the one below:
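The original listing is not reproduced here, so the following is a reconstruction based on the description further down in this lesson (a df -P command source and a snort.alert.fast file source). The XML part is the string at the top; it is wrapped in a few lines of Python so you can also see how a program reads it. The folder of the snort file and the helper name list_sources are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Reconstructed example, NOT the original listing from the lesson.
# Assumption: the snort file lives under /var/log/snort/.
OSSEC_CONF = """
<ossec_config>
  <localfile>
    <log_format>command</log_format>
    <command>df -P</command>
    <frequency>360</frequency>
  </localfile>

  <localfile>
    <log_format>snort-fast</log_format>
    <location>/var/log/snort/snort.alert.fast</location>
  </localfile>
</ossec_config>
"""


def list_sources(conf_xml: str) -> list[dict[str, str]]:
    """Return one dict per <localfile> block, mapping child tag -> text."""
    root = ET.fromstring(conf_xml)
    return [
        {child.tag: (child.text or "").strip() for child in block}
        for block in root.iter("localfile")
    ]


if __name__ == "__main__":
    for source in list_sources(OSSEC_CONF):
        print(source)
```

Each printed dictionary corresponds to one source the agent would read from.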

What are we looking at?

This is an XML file structure.

A What now?

XML - Extensible Markup Language - might look familiar if you have seen .xml files before - e.g. Microsoft Excel uses xml files - an .xlsx file is essentially a zip archive containing many xml files/folders.

Don’t believe me? Let’s check it out.

Create a new Excel file, add some data to it, and save it

add/change the file extension to .zip → e.g. if your file was called Book1.xlsx add .zip to the end → Book1.xlsx.zip

open the file with your favorite unzip tool or the terminal and check out the content 😲

inside the extracted files is a sheet1.xml (xl/worksheets/sheet1.xml) open that one and see if you can find the data you entered into the excel sheet.

sheet1.xml:
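The original listing is not reproduced here. If you prefer the terminal over an unzip tool, the steps above can be sketched in Python as well. This is a minimal sketch with a heavily trimmed stand-in for sheet1.xml (the cell values are made up) packed into an in-memory zip, so it runs without a real Excel file. The function names are hypothetical. For a real file you could pass open("Book1.xlsx", "rb").read() to read_cells.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from datetime import date, timedelta

# XML namespace used by worksheet files inside an .xlsx archive.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}

# Stand-in for xl/worksheets/sheet1.xml (hypothetical data, heavily trimmed).
SHEET_XML = """<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
  <sheetData>
    <row r="1">
      <c r="A1"><v>42</v></c>
      <c r="B1"><v>45292</v></c>
    </row>
  </sheetData>
</worksheet>"""


def build_fake_xlsx() -> bytes:
    """Pack the sheet into an in-memory zip that mimics the .xlsx layout."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("xl/worksheets/sheet1.xml", SHEET_XML)
    return buffer.getvalue()


def read_cells(xlsx_bytes: bytes) -> dict[str, str]:
    """Open the archive and return {cell reference: raw stored value}."""
    with zipfile.ZipFile(io.BytesIO(xlsx_bytes)) as archive:
        root = ET.fromstring(archive.read("xl/worksheets/sheet1.xml"))
    cells = {}
    for cell in root.iterfind(".//m:c", NS):
        cells[cell.get("r")] = cell.findtext("m:v", namespaces=NS)
    return cells


def excel_serial_to_date(serial: int) -> date:
    """Convert an Excel serial day (1900 date system, after Feb 1900) to a date."""
    return date(1899, 12, 30) + timedelta(days=serial)


if __name__ == "__main__":
    cells = read_cells(build_fake_xlsx())
    print(cells)
    print(excel_serial_to_date(int(cells["B1"])))
```

Two things to keep in mind when you try this on a real workbook: dates are stored as serial day numbers (which is why they look different in the XML), and text cells usually hold an index into xl/sharedStrings.xml instead of the text itself.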

WAOW - all the data in the sheet is here - albeit it might look a little different, e.g. my date was converted to a different format but it is all there 😲

What you can see here is the hierarchical order of the different tags: a worksheet holds sheet data, and the sheet data consists of rows. A row has cells (c), and each cell has a value (v).

Each tag is opened and closed (by a /), very similar to HTML (websites). The spaces/indentation are not mandatory.

But they make the code much easier to read because each level has a clear set of spaces in front of it and your eyes can easily identify which level data belongs to.

What usually happens to XML files is that a program reads them and converts the content into commands/information it can work with internally.

As an example, Microsoft Excel loads the XML files and then shows you the data + it even stores which cell you were currently looking at when you saved the file!

Pretty cool, huh?

💡 This process is called deserialization (you take data from a file and plug it into a program) and the opposite is called serialization (you take data from the program and write it into a file).
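To make both directions tangible, here is a minimal sketch in Python. The LogSource class and the two function names are made up for this example; the point is only that an object inside the program becomes text (serialization) and the text becomes an object again (deserialization).

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class LogSource:
    """Hypothetical in-program representation of one log source."""
    log_format: str
    location: str


def serialize(source: LogSource) -> str:
    """Program data -> XML text."""
    block = ET.Element("localfile")
    ET.SubElement(block, "log_format").text = source.log_format
    ET.SubElement(block, "location").text = source.location
    return ET.tostring(block, encoding="unicode")


def deserialize(xml_text: str) -> LogSource:
    """XML text -> program data."""
    block = ET.fromstring(xml_text)
    return LogSource(
        log_format=block.findtext("log_format"),
        location=block.findtext("location"),
    )


if __name__ == "__main__":
    original = LogSource(log_format="syslog", location="/var/log/auth.log")
    as_text = serialize(original)
    print(as_text)
    print(deserialize(as_text) == original)
```

Note that the serialized text comes out on a single line without any indentation, and the round trip still works.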

Back to the ossec.conf file - we can also see a hierarchy, though not as deep as in the excel sheet. Each localfile has at least two corresponding parameters - log_format and either location or command + frequency. These are indented to make it easier on your eyes, but XML does not require any indentation.

The only important part is that each opened tag, e.g. <log_format>, has a corresponding closing tag </log_format>, which is marked by a leading slash / in front of the tag name.

Ok cool but can you use any word and just make up your own tags? <maikroservice> for example?

Very good question!

Technically yes, as long as you have a decoder/parser that takes the data and turns them into valid things for your code/program you can do it!
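You can try this yourself with a generic XML parser. The sketch below uses the parser from Python's standard library; is_well_formed is a helper name made up for this example. It only checks that every opened tag is closed properly, it does not know anything about wazuh.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if a generic XML parser accepts the text."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    # A made-up tag is fine as long as it is opened and closed correctly.
    print(is_well_formed("<maikroservice><likes>XML</likes></maikroservice>"))  # True
    # Closing tag does not match the opening tag.
    print(is_well_formed("<log_format>syslog</location>"))  # False
    # Closing tag is missing entirely.
    print(is_well_formed("<log_format>syslog"))  # False
```

Keep in mind that a parser accepting the text does not mean the program understands the tag. It is safest to assume that wazuh only acts on the options documented for ossec.conf.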

Each one of the <localfile>...</localfile> blocks describes one source that your agent should forward to the server.

In this case, we have 2 different sources - the first one reads the output of the df -P command every 360 seconds and the second one forwards the content of the snort.alert.fast file.

💡 For more information on what's possible with ossec.conf: https://documentation.wazuh.com/current/user-manual/reference/ossec-conf/index.html

server components

The wazuh server-side components consist of three entities - the Server, Dashboard and Indexer.

The Server

The server is a pretty mighty component of the wazuh ecosystem. It handles all the interactions with agents:

enrollment
log ingestion
system inventory
compliance checks
file hashes / IP checks / MITRE ATT&CK
vulnerability scans

If you imagine a spy movie you can think of the clients as agents and the server component as their handler.

Spy movie?!?

The server is who the agents connect and send logs to, and who they ping to say they are still alive (heartbeat).

It also collects and sorts the logs, checks if any of the entries in the logs match a rule, handles the classification of events into MITRE ATT&CK matrix categories so that:

threat intelligence (looking for threats in the internet/darknet)
can be combined with threat hunting (looking for indicators of compromise in your environments/infrastructure)
and the possible corresponding threat actors
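To get a feeling for what "matching a rule" and "classifying into MITRE ATT&CK" means, here is a deliberately simplified sketch. The two rules, their levels and the Rule class are made up for illustration and are not taken from the wazuh ruleset; the real rule engine is covered in the next module. The technique IDs T1110 (Brute Force) and T1078 (Valid Accounts) are real ATT&CK identifiers.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical, heavily simplified detection rule."""
    rule_id: int
    level: int
    pattern: re.Pattern
    description: str
    mitre_id: str


# Made-up rules for illustration only.
RULES = [
    Rule(1, 5, re.compile(r"Failed password for (?:invalid user )?\S+"),
         "SSH login failed", "T1110"),
    Rule(2, 3, re.compile(r"Accepted password for \S+"),
         "SSH login succeeded", "T1078"),
]


def match_rules(log_line: str) -> list[Rule]:
    """Return every rule whose pattern is found in the log line."""
    return [rule for rule in RULES if rule.pattern.search(log_line)]


if __name__ == "__main__":
    line = "sshd[1234]: Failed password for root from 203.0.113.7 port 22 ssh2"
    for rule in match_rules(line):
        print(rule.rule_id, rule.description, rule.mitre_id)
```

A log line that matches no rule simply produces no alert, which is the case for the vast majority of events.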

Additionally, the server is responsible for authentication management - that means the server checks if users are who they say they are (authentication) - kinda like a bouncer in front of a club. This is handled via the API - essentially, the dashboard asks the server if the credentials the user provided are correct and then lets them in based on their roles and rights.
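If you want to see what such a credential check can look like on the wire, here is a rough sketch using only Python's standard library. It assumes the wazuh server API listens on port 55000 and hands out a token at /security/user/authenticate, which is how I remember the wazuh API documentation - please verify this against the version you are running. The host name, user and function names are placeholders.

```python
import base64
import json
import ssl
import urllib.request
from typing import Optional


def build_auth_request(host: str, user: str, password: str,
                       port: int = 55000) -> urllib.request.Request:
    """Build (but do not send) a token request with HTTP basic auth.

    Assumption: default API port 55000 and the authenticate endpoint.
    """
    credentials = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url=f"https://{host}:{port}/security/user/authenticate",
        headers={"Authorization": f"Basic {credentials}"},
        method="POST",
    )


def fetch_token(request: urllib.request.Request,
                context: Optional[ssl.SSLContext] = None) -> str:
    """Send the request and return the token from the JSON response.

    Assumption: the response looks like {"data": {"token": "..."}}.
    """
    with urllib.request.urlopen(request, context=context) as response:
        payload = json.load(response)
    return payload["data"]["token"]


if __name__ == "__main__":
    # Only builds the request; nothing is sent over the network here.
    req = build_auth_request("wazuh.example", "analyst", "secret")
    print(req.get_method(), req.full_url)
```

The returned token is then sent along with later API calls, so the server can decide what this user is allowed to do based on their roles.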

The indexer

The indexer gives us almost instantaneous searches through the data, which is essential when you analyze endpoint log data for threats after an alert has been triggered.

It receives the alerts and events from the wazuh server (the server ships them over with filebeat) and then organizes them neatly in indices. Each index is essentially a JSON document database (noSQL) similar to firebase/mongodb.

There are 4 different indices that wazuh stores data in:

wazuh-alerts → all alerts generated by the wazuh server
wazuh-archives → all collected/generated events
wazuh-monitoring → agent status (used for the overview in the dashboard)
wazuh-statistics → wazuh server performance metrics
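Since every index holds JSON documents, a search is also just a JSON document. The sketch below builds such a search body for the alerts index in Python. It assumes the OpenSearch-style query syntax and the alert field names agent.name, rule.level and timestamp; the function name and the example agent are made up. In a typical setup you would send the result as a POST to the _search endpoint of the wazuh-alerts-* indices on your indexer, but that part is left out so the example runs without a lab.

```python
import json


def build_alert_query(agent_name: str, min_level: int, size: int = 10) -> dict:
    """Build a search body: newest alerts of one agent above a rule level.

    Assumption: alerts carry the fields agent.name, rule.level and timestamp.
    """
    return {
        "size": size,
        "sort": [{"timestamp": {"order": "desc"}}],
        "query": {
            "bool": {
                "filter": [
                    {"match": {"agent.name": agent_name}},
                    {"range": {"rule.level": {"gte": min_level}}},
                ]
            }
        },
    }


if __name__ == "__main__":
    body = build_alert_query("web-01", min_level=10)
    print(json.dumps(body, indent=2))
```

The dashboard does something very similar for you whenever you type a query into the search bar.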

Additionally, the indexer handles long-term storage of the data - this enables you to have backups of all data collected in a decentralized manner (important when something breaks).

What you could also do is connect wazuh to other systems later on - e.g. you want to use wazuh in a subsection of your network/clients and forward the data to splunk and into a data lake (unstructured data storage location) for backup/later analysis with machine learning models.

Under the hood, this is the wazuh indexer forwarding data just like it receives them from the wazuh server.

However, you can use any data forwarding service you prefer - e.g. it is totally cool to use splunk universal forwarder, filebeat, logstash or even fluentd. wazuh is agnostic when it comes to data ingestors.

And what's even cooler is that you can integrate not only the wazuh indexer but also the wazuh server directly, which would forward all the data from the agents straight into a software of your choice.

The Dashboard

Last but definitely not least - the dashboard is the main tool you will use in your job as a SOC/Security analyst.

All your agents will be shown in the dashboard and if there is a connection issue with one of them you will see it in the overview immediately - like in the image below.

But the dashboard has WAY more capabilities, you can

analyze data via queries
build dashboards/reports - the wazuh dashboard is built on OpenSearch Dashboards (an open-source project started by AWS)
check on rules
start threat hunting
test interactions with the API
watch the agents
- and you are also able to adapt the server configuration / custom rules and decoders right from the web-gui (graphical user interface)

rules and decoders? What's that?

Glad you asked - that's part of the next module.

