CruLoader Analysis
For the Zero2Auto course, @0verflow and @VKIntel developed a sample to test our skills. This write-up will be my analysis of this brand new sample !
Now let’s set the context :
Hi there,
During an ongoing investigation, one of our IR team members managed to locate an unknown sample on an infected machine belonging to one of our clients. We cannot pass that sample onto you currently as we are still analyzing it to determine what data was exfiltrated. However, one of our backend analysts developed a YARA rule based on the malware packer, and we were able to locate a similar binary that seemed to be an earlier version of the sample we’re dealing with. Would you be able to take a look at it? We’re all hands on deck here, dealing with this situation, and so we are unable to take a look at it ourselves.
We’re not too sure how much the binary has changed, though developing some automation tools might be a good idea, in case the threat actors behind it start utilizing something like Cutwail to push their samples.
I have uploaded the sample alongside this email.
Thanks, and Good Luck!
1st stage
OK so first we got a zip, containing a PE File. Let’s do some statically analysis to see what we are dealing with :
From what I can see, this is a 32bits PE File, containing a unknown resource in RCDATA.
Let’s load IDA to see what’s going on :
Don’t want the malware analyst to see what library you use ? Introducing : String Obfuscation. Luckily for us, the routine is fairly basic. It’s a ROT13 algorithm with a custom alphabet :
Doing the same in python in order to have the good names :
import string
dict = string.ascii_letters + '01234567890./='
l_encr = [".5ea5/QPY4//", "pe51g5Ceb35ffn", "I9egh1/n//b3", "t5gG8e514pbag5kg", "E514Ceb35ffz5=bel", "Je9g5Ceb35ffz5=bel", "I9egh1/n//b3rk", "F5gG8e514pbag5kg", "E5fh=5G8e514", "s9a4E5fbhe35n", "yb14E5fbhe35", "F9m5b6E5fbhe35", "yb3.E5fbhe35"]
for encr in l_encr:
decr = ""
for char in encr:
pos = dict.find(char)
decr += dict[(pos+13)%len(dict)]
print(f"Encr : {encr} --> {decr}")
Remember the unknown resource in RCDATA we talk earlier ? It’s time for it to rise and shine. Once the resource is loaded can you see what’s waiting for us next ? I let you 1min :
You got it right, it’s RC4 ! It’s pretty easy to spot with the The key begins at the 12th bytes of the data and is 16bytes long. Once the resource is decrypted, a new process of itself is created in a suspended state :
The decrypted executable is written to memory and execution of the process created is resume :
In case you didn’t spotted it, it’s a classical case of Process Hollowing
There is now a brand new executable to analyze !
2nd Stage
This part is a little more complicated then the one before. It’s relying heavily on CRC32 hashing for all sort of things like :
- Check if it’s running in svchost :
- Check any blacklisted processes
- Looping through all running processes, hashing their names and comparing it to a harcoded array. Blacklisted processes are : “wireshark.exe”, “x32dbg.exe”, “x64dbg.exe” and “ProcessHacker.exe”
- Load API calls
This one is a little bit more tricky. There is a function that take a CRC32 hash as a parameter. The hash is matching the wanted API call. 0x8436F795 is corresponding to IsDebuggerPresent()
for example.
But there is a lot of call to this functions… And a lot of APIs in kernel32.dll, ntdll.dll and wininet.dll… So if it’s not fun to do, let’s have a script doing it for us ! I made a IDA script (available here) that resolve all API calls, the job is way easier now !
Important strings are encrypted with rol 4 + a 1byte XOR Key. The following CyberChief recipe can be used to decrypt them
With all theses API Calls, our beloved sample will now create a new svchost process :
And a new thread inside of it :
The trouble with execution passed with CreateRemoteThread
is that the thread doesn’t exist yet, and you won’t be fast enough to intercept it. My tip is to set a breakpoint on the entrypoint of the thread (the ebx
value here). When the thread run, the debugger will stop exactly here.
There is now a brand new executable to analyze ! (I’m lying, it’s the 2nd stage but with another entrypoint)
3rd Stage
This stage is all about the internet. It decrypt the config URL (more on that latter on), fetch it (it contains another URL), fetch the second URL but this one is a .jpg
so it saves it under C:\Users\USER\AppData\Local\Temp\cruloader\output.jpg
.
The custom UserAgent ‘cruloader’ could be used for detection
When everything is done, a new svchost process is created (yes, again) the output.jpg
is decoded and written to the new process memory. Injection is done with ResumeThread
4th stage
Here we are. I promess this is the final stage. The final function is the hardest :
I made a flowchart of everything we saw. I feel like it helps to understand what is going on :
I tried to keep it simple
And that’s it ! Oh wait… The IR guy wanted some kind of automation isn’t it ? Let’s give him what he wants !
Let’s extract that config
Can all of this hardwork be automated and take like 3secondes ? Sadly for me… It can, so I did it. First the objective : recover the first URL. Not the 2nd because you should not reach out to unknown server without proper protection (TOR, VPN, proxy, public WIFI… WHATEVER). Even if this is 100% safe (a reddit URL), I prefer to always keep this routine. A couple of problems :
- The 2nd stage is RC4 encrypted but we know the location and where the key is.
- There is no way (to my understanding) to predict the offset of the data we want
- Every string is encrypted with a different XOR key (but is always 1byte)
- Rotate Left is always 4, but can be 2 or 5 in another sample
Sooooooo how I did it ?
Even if this is just fiction, I wanted to have something that would work for any similar sample, so the bruteforce is kinda big.
First the RC4 key and data is recovered from the 1st stage :
pe = pefile.PE(file)
for entry in pe.DIRECTORY_ENTRY_RESOURCE.entries:
if str(entry.name) == "RC_DATA" or "RCData":
new_dirs = entry.directory
for res in newdirs.entries:
data_rva = res.directory.entries[0].data.struct.OffsetToData
size = res.directory.entries[0].data.struct.Size
data = pe.get_memory_mapped_image()[data_rva:data_rva+size]
key = data[12:27]
return rc4_decrypt(key, data[28:])
And I dumped of ALL of the .rdata
section of the 2nd stage and bruteforced it with RotateLeft and XOR key until I find an URL.
for rotAmount in range(1,10): #Bruteforce the ROT amount
rotated = rot(data, rotAmount)
for xorKey in range(300): # Bruteforce the XOR key
result = ""
for b in rotated:
result += chr(b ^ xorKey)
if "http" in result:
pattern = "https?://(www.)?[-a-zA-Z0-9@:%._+~#=]{1,256}.[a-zA-Z0-9()]{1,6}b([-a-zA-Z0-9()@:%_+.~#?&//=]*)?" #hope you like my tiny regex
config = re.search(pattern, result)
That’s might not be the most efficient way to do it, but still faster than opening IDA/x64dbg to check for the correct offset. The full code is available here
Now the IR guy got everything he wanted !
Case solved
And that’s it, we solved all of the mysteries behind CruLoader. I hope you liked this post and had fun reading it. I tried not to put too many screenshots as otherwise the post would look like a gallery and I don’t think this is enjoyable. Also most of the time I put IDA pseudocode because they are smaller than the graph view in Assembly but I prefer working with assembly (yeah I’m doing this just for you).
Let me know if you find that something can be enhanced (I’m sure it can).
Thanks again @0verflow and @VKIntel for this cool sample
See you soon for another case !