National Cyber Warfare Foundation (NCWF)

In this article, we will explore building an autonomous AI agent that uses a reasoning loop to automate OSquery threat hunting, turning raw system data into forensic conclusions at machine speed

Welcome back, aspiring cyberwarriors!

Most security tools give you data. What they rarely give you is a coherent answer. You run a query, get back a hundred rows, paste the output into an AI chat window, read the summary, and think of a follow-up question. Twenty minutes later, you have finally assembled something resembling a conclusion. The tool did the querying. The AI did some interpreting. But the intelligence loop connecting those two things was entirely you, manually carrying data from one place to another. Today, we are going to eliminate that manual loop in an example with system monitoring with osquery. Let’s get rolling!

The Problem with Manual Threat Hunting

In the heat of an incident, alert fatigue is your greatest enemy. Even with a powerful tool like OSquery, which allows you to query your entire operating system like a SQL database, the bottleneck remains the human investigator. Knowing which table to pivot to, which process ID to correlate with a network socket, and which file hash to check is a skill that takes years of “boots-on-the-ground” experience to master. While you are manually crafting the perfect SQL JOIN to see if a process has a valid binary on disk, the adversary is already moving laterally or exfiltrating data. We need a way to close the gap between detection and understanding by automating the tactical reasoning of the hunt.

The “Agentic” Forensics Approach

Instead of a human typing SQL, we are building an autonomous agent using LLM as the central nervous system. This isn’t just a script that runs a list of commands; it’s a reasoning engine. It analyzes high-level goals, such as identifying unauthorized persistence, and then decides the best path forward based on real-time data. If the AI detects a suspicious process, it doesn’t wait for you, but autonomously decides to query the file table for that process’s path. If it sees an error in its own SQL syntax, like forgetting that the file table requires a specific path in the WHERE clause, it reads the error, corrects its logic, and tries again. This mimics the OODA loop (Observe, Orient, Decide, Act) at machine speed. This mimics the OODA loop (Observe, Orient, Decide, Act) at machine speed. Sounds great, doesn’t it? I certainly think so, especially with control of a human.

Technical Walkthrough

The construction of this agent requires a Python environment and the OSqueryi binary installed and accessible in your system path.

Check your OSquery version with the following command:

kali> osqueryi –version

You start by exporting your OpenRouter API key as an environment variable to allow the script to communicate with the LLM models. The script manages the state of the investigation by passing the history of every query and result back to the model during each iteration.

kali> export OPENROUTER_API_KEY=”your-key-here”

Step 1: The OSquery Executor

The first step is creating a bridge between Python and the local system. We use a function that wraps the osqueryi command. This is important because it forces the output into JSON format, which allows the AI to “read” the system data as structured objects rather than a wall of text.

Step 2: The Decision Engine (The Brain)

Next, we define the decide_next_query function. This is where the “Agentic” logic happens. Instead of a standard chatbot, we use a system prompt that defines strict forensic constraints. We include specific rules for tables like file and systemd_units to prevent the AI from making common SQL syntax errors.

Step 3: Managing the State

The script must remember every step it has taken. We do this by appending every query and its result to a history list. Each time we call the LLM, we pass this entire history back to the model. This allows the AI to say, “In Step 1, I found a suspicious IP; therefore, in Step 2, I will check the process associated with that connection.”

Step 4: The Iteration Loop

Finally, we wrap everything in a loop that runs until the agent reaches a conclusion or hits the MAX_ITERATIONS limit. This loop is the “Intelligence Loop” we discussed, it continuously cycles between reasoning, executing, and observing until the threat is unmasked.

Testing

Let’s conduct a persistence mechanism audit:

kali> python3 agent.py “Find all persistence mechanisms on this system including systemd units, cron jobs, and startup items that were not installed by the standard package manager”

Agent checks a few tables: systemd_units, crontab, startup_items, deb_packages, and cross-reference findings to distinguish legitimate services from anomalies.

Let’s test fileless process detection:

kali> python3 agent.py “Identify any processes currently running in memory that do not have a corresponding binary on disk and check if they have active network connections”

The three-table correlation didn’t find any malware.

Summary

This tool is just an example of what you can achieve with a bit of knowledge and desire. It generally follows an “agentic” forensics approach. Hopefully you will find it useful, but it is intended as a force multiplier rather than a replacement for professional knowledge. This technology highlights a foundational gap: the AI is only as effective as the investigator guiding it. To understand why the AI selects specific tables or when it is heading down a rabbit hole, you need the deep system knowledge covered in our Digital Forensics or SOC Analyst training. Automated tools can find data, but only the human expert can provide the context that turns data into evidence.

Source: HackersArise
Source Link: https://hackers-arise.com/artificial-intelligence-in-cybersecurity-part-14-turning-osquery-into-an-ai-powered-forensics-engine/

Artificial Intelligence in Cybersecurity, Part 14: Turning OSquery into an AI-Powered Forensics Engine