Analysis of a Caddy Wiper Sample Targeting Ukraine
CaddyWiper was first reported by ESET as follows:
Dubbed CaddyWiper by ESET analysts, the malware was first detected at 11.38 a.m. local time (9.38 a.m. UTC) on Monday. The wiper, which destroys user data and partition information from attached drives, was spotted on several dozen systems in a limited number of organizations. It is detected by ESET products as Win32/KillDisk.NCX.
One of my friends pinged me a few days later with a link to a CaddyWiper sample. Since this sample was a particularly small one, I decided to write a blog post going through each function from scratch and introducing the tools I used to make my life easier. Hopefully, this can serve as a reference for junior malware analysts who want to get started with this craft.
First off, I’m a Linux user myself and I mainly use Linux tools to analyse malware.
pev is a set of command-line utilities providing a high-level analysis of a PE binary. It consists of the following tools:
Running pehash on the sample offers the following:
If you’re new to analyzing a PE, I highly recommend looking at the official Microsoft documentation for the PE format. Some notes from the link:
At location 0x3c, the stub has the file offset to the PE signature. This information enables Windows to properly execute the image file, even though it has an MS-DOS stub. This file offset is placed at location 0x3c during linking. After the MS-DOS stub, at the file offset specified at offset 0x3c, is a 4-byte signature that identifies the file as a PE format image file. This signature is “PE\0\0” (the letters “P” and “E” followed by two null bytes).
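To make the quoted rule concrete, here is a minimal Python sketch that follows the pointer at 0x3c and checks for the signature. The function name find_pe_signature and the synthetic header are my own illustrations, not output from pev or bytes from the CaddyWiper sample.

```python
import struct


def find_pe_signature(data: bytes) -> int:
    """Return the file offset of the PE signature, or raise ValueError."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not an MS-DOS/PE image (missing MZ magic)")
    # The 4-byte little-endian value at 0x3c is the offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("no PE signature at the offset stored at 0x3c")
    return pe_offset


# Synthetic header built only for this example (hypothetical layout).
fake = bytearray(0x90)
fake[0:2] = b"MZ"
struct.pack_into("<I", fake, 0x3C, 0x80)
fake[0x80:0x84] = b"PE\0\0"
print(hex(find_pe_signature(bytes(fake))))
```

On a real file you would pass the bytes read from disk instead of the synthetic buffer.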
Main function Analysis
The main function starts at 00401000, and it looks like it doesn’t return a status code. In C terms, it means the main function is written like so:
In the main function, we can see a call to the external function DsRoleGetPrimaryDomainInformation. According to Microsoft documentation, the DsRoleGetPrimaryDomainInformation function retrieves state data for the computer. This data includes the state of the directory service installation and domain data.
If we take a closer look at the function call, we can see that the function has been called with 3 parameters: DsRoleGetPrimaryDomainInformation(0,1,&empty_int_pointer);. The 0 refers to the lpServer parameter, meaning the function will be called on the local computer (refer to the link above for more info on that). The 1 is the InfoLevel parameter, which specifies the level of output needed, as well as the type of output written to our empty_int_pointer. Referring to the Microsoft documentation, we can see that 1 refers to the first item in the C++ enum, which is DsRolePrimaryDomainInfoBasic.
If we follow the docs, they mention our output type as DSROLE_PRIMARY_DOMAIN_INFO_BASIC and refer to this page. It looks like our return value will be in this struct:
Clearly the attacker is interested in MachineRole, and compares it with the value 5. Let’s dig deeper to see what 5 means. If we go to this doc, we’ll see the following:
5 is the primary domain controller. Looking at the code, you can see the attacker does not intend to attack the primary DC, and will skip it.
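As a rough sketch of that check, the values of the DSROLE_MACHINE_ROLE enum can be written out in Python. The class name, member spellings and the helper sample_would_proceed are my own; only the numeric values come from the Microsoft documentation, and the comparison against 5 is what the decompiled code shows.

```python
from enum import IntEnum


class DsRoleMachineRole(IntEnum):
    """Numeric values of DSROLE_MACHINE_ROLE (Python spellings are mine)."""
    STANDALONE_WORKSTATION = 0
    MEMBER_WORKSTATION = 1
    STANDALONE_SERVER = 2
    MEMBER_SERVER = 3
    BACKUP_DOMAIN_CONTROLLER = 4
    PRIMARY_DOMAIN_CONTROLLER = 5


def sample_would_proceed(machine_role: int) -> bool:
    """Hypothetical helper: True unless the role is the primary DC (value 5)."""
    role = DsRoleMachineRole(machine_role)
    return role != DsRoleMachineRole.PRIMARY_DOMAIN_CONTROLLER


for role in DsRoleMachineRole:
    print(role.value, role.name, sample_would_proceed(role))
```

Every role except the primary domain controller falls through to the wiping logic in this reading.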
After getting all the info, I started to rename the functions and add a few comments, as well as converting types in Ghidra to make sure it’s readable:
Now we can see there’s a wiper function, which runs on C:\\Users as well as D:\\ for 24 chars (E:\\, F:\\, ...), which basically means all drive letters.
Let’s go take a look at the wiper function. That’s where the attacker’s malicious code is located.
The wiper function
The function itself is a void one, meaning the attacker didn’t really care whether the wiping succeeded or not. Reading a bit of the function itself, the first bit of interesting information is seen at line ~180. There seems to be another function that gets called with both
After digging around the wipe function, you can see kernel32.dll as a stack string with these functions being called from it (in order):
All of the above functions are thoroughly documented in Microsoft’s official Win32 API docs.
Essentially, the wiper looks for all the files under Z:, enumerates the first file within those directories (with FindFirstFileA), then iterates through the remaining entries with FindNextFileA, opens each file, scrambles its header, and repeats this across all folders recursively. Here’s the main wiper function with function names and syscalls somewhat renamed to a more readable format:
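The Ghidra listing aside, the traversal order described above is easy to picture with a read-only Python sketch. It only lists files and never opens or modifies them. The name enumerate_targets is my own, and os.scandir stands in for the FindFirstFileA / FindNextFileA pair, so treat it as an illustration of the recursion rather than a translation of the sample.

```python
import os
import tempfile


def enumerate_targets(root):
    """Read-only walk: list every regular file under root, depth first."""
    found = []
    try:
        entries = sorted(os.scandir(root), key=lambda e: e.name)
    except OSError:
        return found  # unreadable directory: skip it and carry on
    for entry in entries:
        if entry.is_dir(follow_symlinks=False):
            # Recurse into the sub-folder before moving to the next entry.
            found.extend(enumerate_targets(entry.path))
        elif entry.is_file(follow_symlinks=False):
            found.append(entry.path)
    return found


# Throwaway directory tree built only for this demo.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "a", "b"))
    for rel in ("top.txt", os.path.join("a", "mid.txt"),
                os.path.join("a", "b", "deep.txt")):
        with open(os.path.join(root, rel), "w") as fh:
            fh.write("x")
    demo = [os.path.relpath(p, root) for p in enumerate_targets(root)]
    print(demo)
```

Errors are swallowed on purpose here, which mirrors the observation that the wiper function is void and does not report failures.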
Before we rename this function to something human-readable, we should know what it does. Here’s the pseudo-code of the function itself:
The function appears to concatenate two strings with a couple of while loops and put the result in the first parameter’s pointer. In Python terms, it basically means param_1 = param_2 + param_3. From now on, I’ll refer to
After concatenating the paths with
FUN_00401530 gets called with two parameters:
kernel32.dll, as specified in the lines directly after the calls to the two concat functions (lines 190 to 200 inside the wipe function in Ghidra).
Even though the logic of the function seems complicated, from what it takes as input and produces as output, it’s safe to assume the function is a Win32 API client. The DLL filename and the name of the specific function are passed to it, and the result is an integer that corresponds to the API response code. From now on, I’ll refer to FUN_00401530 as syscall_wrapper.
Other Interesting Functions
Using the same trick we did before, it’s easy to see this function using the same
syscall_wrapper to invoke multiple functions from
This function appears to look into each file’s ownership and tries to get around the ACLs and “access denied” errors that it comes across. I would describe it as a basic way of making a file writable enough to destroy it, although I didn’t read each individual syscall to back that claim.
FUN_00401750 is the main carrier of this operation. In
FUN_00401750, we can see the following functions:
FUN_00401750 simply tries to see if the malware has enough permission to change permissions on a file. I’ll rename it to
As a result, based on my guess,
FUN_00401a10 is renamed to
Putting it all together
This is a small malware sample, and it’s effective and fast. In a nutshell, this is what the attacker had in mind:
- Checks whether the computer is a primary domain controller. If it is, the malware leaves it alone and doesn’t wipe it.
- It identifies C:\Users and D: through Z: as primary attack targets
- Finds the first file in the folder
- Tries to see the permission it has to write to the file
- Tries elevating privileges to gain permission to write to the file
- Opens the file in write mode
- Rewrites the file header with gibberish
- Closes the file
Interestingly, if you run the binary through something like the strings command, you’ll only see a few strings, like so:
This is because the attacker is making use of stack strings. This link has a good explanation of what stack strings are and how they are used to avoid detection.
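To see why strings comes up nearly empty, here is a small Python sketch that builds a stack string the way a compiler might emit it and then recovers it. I'm assuming the common x86 encoding C6 45 <disp8> <imm8> (mov byte ptr [ebp+disp8], imm8) and that the instructions appear in character order. Both are assumptions for illustration and not verified against every string in this sample, and the helper names are mine.

```python
import re

# Runs of at least four `C6 45 <disp8> <imm8>` instructions
# whose immediate byte is printable ASCII.
MOV_RUN = re.compile(rb"(?:\xC6\x45.[\x20-\x7E]){4,}", re.DOTALL)


def as_stack_string(text, base=0x80):
    """Encode text as one byte-sized MOV per character (hypothetical layout)."""
    out = bytearray()
    for i, ch in enumerate(text.encode("ascii")):
        out += bytes([0xC6, 0x45, (base + i) & 0xFF, ch])
    return bytes(out)


def recover_stack_strings(code):
    """Return the strings spelled out by consecutive byte-sized MOVs."""
    found = []
    for run in MOV_RUN.finditer(code):
        # Each instruction is 4 bytes long; the immediate is the last one.
        found.append(run.group()[3::4].decode("ascii"))
    return found


# push ebp; mov ebp, esp; <stack string>; ret
blob = b"\x55\x8b\xec" + as_stack_string("FindFirstFileA") + b"\xc3"
print(b"FindFirstFileA" in blob)
print(recover_stack_strings(blob))
```

The characters are never contiguous in the file, so a plain substring search misses them while the instruction-aware pass finds them.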
The easiest detection for this particular sample could be a hash value. But since this malware is small, hashes, even ssdeep, are not a very good idea. Let’s try to build a YARA rule that captures what we learned from the malware.
As we saw, since the attacker was clever enough to use stack strings, our YARA rule is going to be slow and regex-y, but it still works. Interestingly, for FindClose I had to adjust my regex to account for the slightly shorter MOV instruction encoding. I’ve also put a file size cap on the rule to ignore potentially different variants of this malware.
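The same idea can be prototyped in Python before committing it to YARA. This sketch is not the rule itself. It assumes the two MOV encodings are the disp8 form (C6 45 ..) and the longer disp32 form (C6 85 .. .. .. ..), which is my guess at what "slightly shorter" refers to. The 50 KiB cap and all function names are hypothetical as well.

```python
import re

MAX_SIZE = 50 * 1024  # hypothetical size cap, not taken from the original rule


def stack_string_regex(name):
    """Regex for `name` written one byte at a time by MOV instructions."""
    parts = []
    for ch in name.encode("ascii"):
        imm = re.escape(bytes([ch]))
        # Accept either C6 45 <disp8> <imm8> or C6 85 <disp32> <imm8>.
        parts.append(rb"\xC6(?:\x45.|\x85.{4})" + imm)
    return re.compile(b"".join(parts), re.DOTALL)


def looks_like_sample(data, names=("FindFirstFileA", "FindNextFileA", "FindClose")):
    """True if data is small enough and contains every name as a stack string."""
    if len(data) > MAX_SIZE:
        return False
    return all(stack_string_regex(n).search(data) for n in names)


def encode(name, wide_disp):
    """Build synthetic test bytes in one of the two assumed encodings."""
    out = bytearray()
    for i, ch in enumerate(name.encode("ascii")):
        if wide_disp:
            out += b"\xC6\x85" + (0xFFFFFF00 + i).to_bytes(4, "little") + bytes([ch])
        else:
            out += bytes([0xC6, 0x45, 0xE0 + i, ch])
    return bytes(out)


synthetic = (encode("FindFirstFileA", True) + encode("FindNextFileA", True)
             + encode("FindClose", False))
print(looks_like_sample(synthetic))
print(looks_like_sample(b"FindFirstFileA FindNextFileA FindClose"))
```

Allowing both encodings per character keeps one pattern per API name, at the cost of a slower match, which is the same trade-off the regex-heavy YARA rule makes.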
As an exercise, you can create a similar detection for the dll filenames, which are a bit trickier considering they’re both wide strings and stack strings.
Hope you enjoyed this brief analysis. I’ll put the zipped Ghidra project alongside the scripts, comments, etc. in a GitHub repo if anyone is interested. Let me know which malware I should dissect next :)