If you are a security professional, you likely already know about the SolarWinds hacking case. If you aren’t, you can check it here. In brief, their software was backdoored and distributed to many customers. Here are our thoughts about the case and the associated payloads.
An important aspect of this attack is that the distributed payloads were signed by the attackers on behalf of SolarWinds (see figure below). Most security solutions skip many checks for signed binaries, which might have helped the payloads remain undetected for longer.
After the breach was identified, detection rules were released to help identify the payloads related to this attack (as shown in the figure below). You can get the rules from here.
Despite these rules, identifying all samples might still require extra effort. We identified 5 samples related to SolarWinds (they are available in our collection, check here), but not all of them were detected by the YARA rules, despite being similar. In other cases, they were not only detected by the YARA rules but also reported as similar by similarity hashing functions, as shown in the figure below.
A nice thing about having such detection rules available is that many AVs integrated them into their engines to detect these payloads, as shown below for some of the AVs on VirusTotal.
The bad thing about AVs is that they have a significant response time. We talked about that before in a paper (check it here). As I write this, there are companies still deploying their defenses (example here). The impact of time on AVs can be clearly seen in this case by using our analysis platform. As it saves the VirusTotal report from submission time, we can compare the initial detection rate (shown below) with the current one (shown above).
Talking about detection, this incident also supports our claim for more personalized threat detection, specifically regarding Machine Learning systems. We have been demonstrating that distinct models have distinct biases, and this can be clearly seen in this case. In our analysis platform, we have classifiers trained with distinct biases, mainly due to the nature of the malware samples in the training set: Brazilian and World samples. Each one of them is more suitable for a distinct task. We saw this in a past analysis of a ransomware sample (check it here).
But now, for the SolarWinds samples, the situation is the opposite. Only those same classifiers would be able to detect them, mainly because the characteristics of these samples match the ones those classifiers are used to seeing in the BR setting. So, please, more diversity in the models!
At this point, I was expecting to talk about the samples’ execution, but it turns out that they did not produce anything meaningful in our sandbox. When this happens, I often check other solutions to confirm that it was not a sandbox bug. Checking Joebox (report here), we notice that the “Program does not show much activity”. Maybe due to stalling code? That seems to be the case, according to this. At SECRET, we LOVE automated dynamic analysis, but this case shows that it is not perfect (even though our research goal is to keep making it better).
Well, so let’s focus on static analysis instead. All samples are DLLs compiled from .NET. This might be bad or good. The bad side: if you open one in IDA without any extension, you can’t get much information, since the DLL entry point only jumps to the external runtime handler.
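As a side note, you can confirm that a sample is a .NET assembly before opening it in any tool by checking whether the PE header has a CLR runtime header entry (data directory index 14). The sketch below is only an illustration using the Python standard library. The function name `is_dotnet_pe` is made up for this post, and it only inspects the headers, so it says nothing about whether the rest of the file is well-formed.

```python
import struct

# Index of the CLR runtime header (COM descriptor) in the PE data directories.
CLR_DIRECTORY_INDEX = 14


def is_dotnet_pe(data: bytes) -> bool:
    """Return True if the PE image declares a CLR runtime header (hypothetical helper)."""
    # DOS header: "MZ" magic, with the PE header offset stored at 0x3C.
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        return False

    # The optional header follows the 4-byte signature and the 20-byte COFF header.
    opt = pe_offset + 24
    if len(data) < opt + 2:
        return False
    (magic,) = struct.unpack_from("<H", data, opt)
    if magic == 0x10B:      # PE32
        dirs = opt + 96
    elif magic == 0x20B:    # PE32+
        dirs = opt + 112
    else:
        return False

    # NumberOfRvaAndSizes sits right before the data directory array.
    if len(data) < dirs:
        return False
    (count,) = struct.unpack_from("<I", data, dirs - 4)
    if count <= CLR_DIRECTORY_INDEX:
        return False

    # Each directory entry is an (RVA, size) pair of 32-bit integers.
    entry = dirs + CLR_DIRECTORY_INDEX * 8
    if len(data) < entry + 8:
        return False
    rva, size = struct.unpack_from("<II", data, entry)
    return rva != 0 and size != 0
```

To use it, read the sample as bytes (for example, `open(path, "rb").read()`) and pass the result to `is_dotnet_pe`.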
The good side is that .NET binaries can be easily decompiled. So let’s dive into it! Since repeating tasks is boring, I randomly chose one of the payloads to take a closer look at. As the figure below shows, reading the code helps a lot: we can identify interesting functions by their names. But it’s not that easy: as you can notice, many strings are obfuscated.
If we dive a little bit more into the obfuscation routine, we discover that, in the end, strings are compressed (deflated) and then encoded as base64. To decode them, the application decodes the base64 string and inflates the resulting data. The decoder is present in the application.
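The decoding routine described above can be sketched in a few lines of Python. This is not the exact decoder used for this post, just a minimal illustration under two assumptions: the compressed data is a raw DEFLATE stream (no zlib or gzip header, which is what .NET’s DeflateStream produces), and the decoded bytes are UTF-8 text. The names `decode_string` and `encode_string` are made up for this example.

```python
import base64
import zlib


def decode_string(encoded: str) -> str:
    """Base64-decode and then inflate an obfuscated string (hypothetical helper)."""
    compressed = base64.b64decode(encoded)
    # Negative wbits selects a raw DEFLATE stream without zlib/gzip headers (assumption).
    inflated = zlib.decompress(compressed, -zlib.MAX_WBITS)
    # Assumption: the plaintext is UTF-8.
    return inflated.decode("utf-8")


def encode_string(plain: str) -> str:
    """Inverse operation, handy for checking the decoder with a round trip."""
    compressor = zlib.compressobj(9, zlib.DEFLATED, -zlib.MAX_WBITS)
    compressed = compressor.compress(plain.encode("utf-8")) + compressor.flush()
    return base64.b64encode(compressed).decode("ascii")
```

A quick sanity check is a round trip: `decode_string(encode_string(s))` should return `s` for any string. After that, each obfuscated literal copied from the decompiled code can be passed to `decode_string`.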
I’d like to have a plugin/extension to automatically decode this for me. I really don’t know if one is available for the tool that I used (dnSpy). Anyway, I wrote my own decoder. In the above case, the parameters actually retrieved were the following:
Now, we can just repeat this procedure for all other encoded strings and start building the malware puzzle. From here on, it is hard to say anything new about these samples, since they were already extensively analyzed in other blogs/sites/reports, so I will stop here. If you are interested in something else, just let us know!