Updated: Jul 21, 2020
Seemingly one of the most overlooked security vulnerabilities in the web applications that we test is the deserialization of untrusted data. I say overlooked because awareness of this issue seems to be comparatively low among web developers. Contrast that with the severity of this class of vulnerability (an attacker running arbitrary code on your web server), and the fact that it is present in the more common modern web application frameworks (.NET and Java), and you have a dangerous situation. OWASP recently recognised this, moving it into their Top 10 Most Critical Web Application Security Risks.
If you are deserialising any data; whether that be through JSON, XML, or Binary deserialisation, there is a chance that your code is vulnerable to this type of attack.
The dangers of deserialisation in the .NET framework in particular have only recently been demonstrated, with practical attacks having been publicised just last year. Examples in this article will largely refer to .NET code, but the principles apply to many languages; especially Java. This is not intended to be a comprehensive review of all possible attacks, as the subject is quite deep; but it will help to catch the vast majority of the issues we see.
So what is this vulnerability?
Deserialisation is the process of turning a stream of bytes into a fully-formed object in a computer’s memory. It makes the job of a programmer much simpler: less effort spent parsing complicated data structures: just let the framework handle it for us!
One aspect of deserialisation that is important to understand; both from the point of view of functionality, as well as security; is that simply filling in an object’s internal properties may not be sufficient to reconstruct it. Some code may have to run to complete the process: think of hooking up internal references to sub-objects, or global objects, or ensuring that data is normalised, etc. In .NET, for example, this is done using the OnDeserialized and OnDeserializing attributes, as well as OnDeserialization callbacks.
The big problem with deserialisation arises when an attacker can coerce another system to deserialise an object of an unexpected type. If the code doing the deserialising trusts whatever type it receives, that’s where things start to go south. Take the following code for example:
Vulnerable deserialisation code:
byte[] value = `user-controlled data`; MemoryStream memoryStream = new MemoryStream(value); var formatter = new BinaryFormatter(); MyType deserialized = (MyType)formatter.Deserialize(memoryStream); // Cast the object to the expected type deserialized.SomeMethod(); …
At first glance, it appears as though we’re checking that the serialised object that we receive is of the correct type. We certainly verify it before using the object.
From an attacker’s point of view, however, the code does the following:
-
Receive data from an attacker
-
Determine the type of object that the attacker wants to use
-
Run all the deserialisation callbacks for that type of object, using the data sent by the attacker as parameters
-
Then, a type check is performed, which will fail… but the damage has already been done: we have run the OnDeserialized method of an unexpected type!
-
Then, eventually, run the object’s finaliser (upon being cleaned up by the garbage collector)
But what’s the big problem with that? An attacker can’t just create their own type, and send it over the wire to be deserialized. However, what if there were a built-in type which could be leveraged to perform some malicious action?
Well, security researchers have done just that: various types, which are built-in to their respective frameworks (.NET, Java, etc.) have been discovered, which, upon being deserialised, can be leveraged to perform malicious actions, including executing arbitrary code on the server. Even if the code immediately checks “Was this the expected type?” (as in the above example), the damage has already been done by the very act of deserialising.
Take, for example, the .NET type SortedSet.
Upon deserialisation, a SortedSet needs to make sure that it is indeed sorted; otherwise its internal state would essentially be corrupted. So, in its deserialisation callback, the SortedSet class calls its sorting function to make sure everything’s as it should be.
To allow programmers the flexibility to sort items according to their own business rules, the SortedSet class allows the programmer to set an alternative sorting function, as long as it is a method that receives two parameters of the expected type. So an attacker can send a malicious payload that does the following:
-
Create a SortedSet<string> object that contains two values, say for example “cmd” and “/c ping 127.0.0.1”
-
Configure the SortedSet to use the method Process.Start(string process, string arguments) to “sort” its entries. This method, while not being used for its intended purpose, technically meets the criteria above: it’s a method that takes two strings as parameters.
-
Serialise the SortedSet, and send it to the vulnerable system.
Upon being deserialised, in its callback, the vulnerable system will attempt to “sort” the set by calling Process.Start(“cmd”,”/c ping 127.0.0.1″) which will in turn run an arbitrary command (in this case, a ping command). The thread will immediately throw an exception because Process.Start returns an unexpected type… but again, the damage has already been done, as the command is already running in a separate process.
This clever hack was discovered by James Forshaw of Google Project Zero. Check out his original blog post for a more technical explanation.
The GitHub project ysoserial.net contains a list of other “gadgets” that can be used for code execution in .NET, and will create payloads to exploit vulnerable code. Some of these gadgets have been patched by Microsoft, but the majority are difficult to fix, as they would require breaking changes to the Framework. They are thus still currently available to attackers, despite having been public for over a year.
The Jackson deserialisation library for Java has taken a more aggressive approach in accepting breaking changes; in that, as researchers have found gadgets that can be used for code execution, they are added to a “blacklist” of types which will be prevented from being deserialised. This is somewhat of a “whack-a-mole” approach: it will complicate exploitation for code that is already vulnerable; but should by no means be relied upon to prevent newly-discovered attacks. Known Java deserialisation attacks can be found in the ysoserial GitHub project.
It’s alright, I’m encrypting the serialised data
Cryptography is hard. Even modern ciphers such as AES can be used in an insecure way. We often find incorrectly-implemented cryptography in the websites we test, allowing us to read, and sometimes inject our own data. Or perhaps the encryption key is sitting in a file that an attacker might be able to read through a file disclosure attack, or a temporary server misconfiguration. If you are relying on cryptography as your only protection mechanism against this attack, you’re living dangerously close to the edge.
What can I do about it?
Quite often, deserialisation is just the wrong design pattern, especially for web applications. This is especially true when serialised objects are passed between the client and server to maintain state. Removing the attack surface completely by never accepting serialised objects from the user is often the safest and most correct solution.
If you insist upon using deserialisation, though, how can you avoid this vulnerability? The core problem for deserialisation attacks is that, by controlling the type of an object, an attacker is able to run code that was not intended to be run as part of the vulnerable program. As a result, a developer must configure the deserialiser to check the type of the serialised object before deserialising.
Different deserialisers behave differently in this respect, so it bears going through the main ones that are used:
.NET
A review of .NET deserialization mechanisms was performed by Alvaro Muñoz and Olexsandr Mirosh for the 2017 Black Hat conference, and their whitepaper contains detailed information about all major deserialisation libraries in .NET. By way of summary:
If you are using the BinaryFormatter type to deserialise, you are almost certainly vulnerable to this attack, as your code will by default just deserialise whatever type an attacker wants it to. Our advice is to never use this class with untrusted data at all; however you can configure BinaryFormatter to be more secure by restricting the valid deserialised types using the SerializationBinder class.
If you are using Newtonsoft’s Json.Net deserialiser, this is secure-by-default; but it can be accidentally configured to be vulnerable. Specifically, before deserialising an object, the serialised type will be checked to ensure it is the same as the expected type, thus removing an attacker’s ability to control the type of an object. However, a developer may wish to allow subclasses of an expected type, and may configure deserialisation to be polymorphism-friendly by setting the TypeNameHandling property to have a value other than None. As soon as this is done, a door is opened for an attacker: if any deserialised field, either in the parent object, or in some sub-object, is of type System.Object, an attacker will now be allowed to place whatever type they wish into that field, including one of the unsafe code-execution “gadgets”.
XmlSerializer is even harder to make vulnerable, however it has been known to happen. By using .NET generics with a static type, XmlSerializer will prevent arbitrary types from being used, at the expense of some flexibility for the developer. However, if a program’s code sets the expected deserialisation type dynamically (say for example by inspecting the Xml and setting the expected type using reflection), an attacker once again has control over the type that will be deserialised. Using XmlSerializer with a static expected type is secure.
Most other built-in .NET deserialisation libraries use BinaryFormatter or XmlSerializer internally, and would thus inherit their security properties. Other 3rd-party deserialisation libraries also exist, and Muñoz and Mirosh’s paper examines some of them.
Java
The ObjectInputStream.readObject method will deserialise whatever serialised Java object it is given, making it comparable to .NET’s BinaryFormatter class in both its flexibility and its attack surface.
The Jackson JSON deserializer has a similar attack surface to Json.Net, in that it is secure-by-default; but by enabling polymorphic behaviour through the ObjectMapper.enableDefaultTyping() method, arbitrary subclasses may be created.
The bottom line
Deserialisation is a highly flexible and convenient tool for developers… which unfortunately means it’s also highly flexible and convenient for hackers. If you’re deserialising data anywhere in your code, make sure you consider the security implications… or better yet, don’t use deserialisation at all, if you can avoid it. And if you’re a developer, there’s nothing quite like performing this attack on your own code, to give yourself a better understanding of the risks.