Reviewing Qakbot loader sequences: Part 2
In this second installment of our Qakbot series, we’ll be looking at log-based indicators around phishing payloads that are useful for detection, threat hunting, and incident investigation.
Detecting phishing attempts as early as possible is critical, but at this stage of the attack chain, attackers often get the benefit of blending in with normal activity. Security controls have been developed around each step of the email process to defend against phishing, yet it remains one of the main pillars of initial access.
Writing detections for activities around user execution can be challenging because it relies on one of the most unpredictable facets of a system: the user. Attackers and defenders both rely on systems providing reproducible outcomes to achieve their goals. Attackers construct chains of execution that depend on specific system binaries, API calls, and resources being available and running in particular ways. They must ensure their attacks land and execute properly on a wide variety of OS versions in an environment that is actively trying to keep them from succeeding.
Defenders depend on that same predictability to detect attackers at the points where it is hardest for them to change their process. Users, by contrast, have their own unique behaviors, preferred apps, and unpredictable timeframes. And in our case, when attempting to detect malicious behaviors across thousands of partners of varying sizes, industries, and locations, the complexity of possible user interactions with phishing emails increases considerably.
Throughout this article, we will take these factors into consideration while exploring detection opportunities or hunting and investigation aids. We’ll also create a better understanding around the limitations in coverage depending on user behavior and the visibility granted by particular applications. Due to our wide range of partners, we take an approach where we survey for the breadth of possibilities based on user activities reflected in logs, and make decisions based on historical user trends.
From email to first stage download
Before diving into the paths followed after downloading a malicious payload from email, we must understand the range of places that file may originate from in our records. To that end, we’ll use data about trends in user interactions with email as the starting point to consider our visibility when it comes to downloading attachments or following links.
All user interactions with their email will come from one source: an email client. That statement simplifies less than it might seem. Many standalone email clients exist for users to choose from, but the vast majority, for one reason or another, use Outlook. This is followed by a considerably smaller slice using Thunderbird, and only a handful of users running more obscure email clients such as eM Client, Mailbird, or IceWarp. Unfortunately, Outlook's dominance does not make things simpler, because we also must consider that many users use web-based email clients. So, not only do we need to look at events surrounding email clients, but we also must classify browser-based events as potentially relating to email.
Figure 1: Number of machines observed creating processes for specific browsers.
Commonly cited browser market share counts show that Chrome is the most popular browser, followed far behind by Safari, then considerably further behind by Edge, Firefox, and Opera. We also did our own count and found user adoption in our partner base to be slightly different. Safari and other mobile browsers are noticeably absent, as we’re only looking at Windows endpoints in this analysis. Firefox has a considerably larger share than is typically depicted. Most surprising, though, was that Edge had a larger share than expected. Since this rudimentary data was gathered from process creation events rather than frequency of usage, we chalked this up to situations such as Microsoft forcing links opened in Outlook to first be opened in Edge, a consideration we will have to return to later.
The next step in the phishing chain depends on the user opening the payload, which can be reflected in the logs in various ways. We've gone through the processes that could be used while creating these files, so now we turn to the actions of the user.
A file downloaded from an email client will end up in a directory—either a default directory or something different. For detection purposes, a default directory is preferable, as it makes things more predictable, and we risk losing visibility when things become less predictable.
Thankfully, our file creation records from Outlook alone show that only about 1% of files downloaded over a 7-day period were saved to non-default directories, but this still adds up to a considerable number. With this in mind, Figure 2 below shows the default download paths for the relevant email client applications.
Figure 2: Table of default paths email clients will download files to.
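To make the default versus non-default split concrete, here is a minimal sketch of how such a share could be measured from file creation records. The two path patterns are assumptions standing in for the Figure 2 table (Outlook's Content.Outlook attachment cache and the user's Downloads folder), and the function names are hypothetical.

```python
import re

# Assumed stand-ins for the Figure 2 table; replace with the default paths
# of the email clients and browsers present in your environment.
DEFAULT_PATH_PATTERNS = [
    # Outlook's attachment cache directory
    re.compile(r"\\AppData\\Local\\Microsoft\\Windows\\INetCache\\Content\.Outlook\\", re.I),
    # The user's Downloads folder
    re.compile(r"^[A-Z]:\\Users\\[^\\]+\\Downloads\\", re.I),
]


def is_default_path(file_path: str) -> bool:
    """Return True if the file was written under a known default directory."""
    return any(pattern.search(file_path) for pattern in DEFAULT_PATH_PATTERNS)


def non_default_share(file_paths) -> float:
    """Fraction of created files that landed outside the default directories."""
    paths = list(file_paths)
    if not paths:
        return 0.0
    non_default = sum(1 for path in paths if not is_default_path(path))
    return non_default / len(paths)
```

Running non_default_share over a week of email client file creation paths gives a rough measure of how much activity a detection keyed only to default paths would miss.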
Before diving into events relating directly to specific payload activity, we’re first going to look at how often each of the different Qakbot phishing payload types is typically downloaded via email, as much as we can observe this through Outlook downloads. When collecting this data, detections were not the expected outcome. The intended goal was a better understanding of the context of typical user actions, which may end up guiding detections further down the line.
PDFs are overwhelmingly the most common file type downloaded via email. This makes sense, with Qakbot's move toward predominantly using PDF files. ZIP and HTML files are less frequently delivered, but most notable is the rarity of OneNote documents. Sending OneNote documents over email as attachments may be rare enough to create detections around their creation from an Outlook process, but it’s important to also consider this in the context of how OneNote maldocs were used.
They were initially used by multiple threat groups, and the particular attack vector they allowed was swiftly remediated by Microsoft, so they’re no longer heavily used. Until more opportunities are uncovered to use OneNote for maldocs, that sort of detection may only uncover the occasional false positive. However, it may benefit some organizations' threat models to detect such activity.
Figure 3: Counts of file extensions observed being created by email clients over a 7-day period.
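A tally like the one in Figure 3 can be approximated with a few lines over file creation records. This is only a sketch: the event field names (process_name, file_path) and the list of email client executables are assumptions, not the schema of any particular logging product.

```python
from collections import Counter
from pathlib import PureWindowsPath

# Assumed executable names for the standalone email clients of interest.
EMAIL_CLIENTS = {"outlook.exe", "thunderbird.exe"}


def extension_counts(file_events):
    """Count file extensions created by email client processes."""
    counts = Counter()
    for event in file_events:
        if event["process_name"].lower() not in EMAIL_CLIENTS:
            continue
        # PureWindowsPath parses Windows paths regardless of the analysis host OS.
        extension = PureWindowsPath(event["file_path"]).suffix.lower()
        counts[extension or "(none)"] += 1
    return counts
```

Calling extension_counts(events).most_common() then lists the extensions from most to least frequent, which makes rare attachment types easy to spot.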
URL and ZIP payloads
Figure 4: Execution paths from URL phishing payloads
Figure 5: Execution paths from ZIP phishing payloads
The most frequent phishing payload leading to Qakbot was a URL. The least frequent were ZIP files, but most URL payloads led to a ZIP file download, so we’re considering them together. When it comes to the analysis of links users open from emails, we run into a few issues. On one hand, clicking on a link in an email will expose the URL in the browser child process's command line.
But as we mentioned earlier, Microsoft has started to force links clicked in Outlook to be opened in Edge. This may push a lot of users looking to avoid Edge to copy and paste URLs into their preferred browser, which blinds us to the activity. Similarly, users who use web-based email clients will not generate any endpoint logs indicating that they opened a link from an email.
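For the case that is visible, a link clicked in a standalone client, a minimal sketch of pulling the URL out of a process creation event might look like the following. The event fields (parent_name, process_name, command_line) and the browser list are assumptions for illustration, and the check only covers Outlook as the parent.

```python
import re

# Assumed browser executable names; extend as needed.
BROWSERS = {"chrome.exe", "msedge.exe", "firefox.exe", "opera.exe"}
URL_RE = re.compile(r"https?://[^\s\"']+", re.I)


def url_opened_from_outlook(event):
    """Return the URL on a browser command line spawned by Outlook, else None."""
    if event["parent_name"].lower() != "outlook.exe":
        return None
    if event["process_name"].lower() not in BROWSERS:
        return None
    match = URL_RE.search(event["command_line"])
    return match.group(0) if match else None
```

The extracted URL can then be enriched or compared against threat intelligence, keeping in mind that copied and pasted links and web-based clients never produce this event.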
With ZIP files, we get a useful log artifact, though it once again depends on how the user interacts with the archive. All software that decompresses ZIP files has its own temporary directory pattern where decompressed files will be stored if they’re opened from within the software itself, rather than being extracted to a user-specified directory.
Below is a table displaying the temporary directory patterns of common software used to open ZIP files. If users do decide to go ahead and extract files, then once again, we lose valuable information that can be used for context-specific detection of executions of those files. However, you will still likely see the file creation records for these temporary files, unless the user goes straight to extracting the archive without previewing its contents in any software. Even then, you’ll still see file creation records for the extracted files made by the decompression software. The exception is Explorer: if the user extracts the ZIP with it, there likely won't be any direct indicator that the file came from an archive, but this is a comparatively rare use case.
Figure 6: Default temporary file paths for files opened from archives directly from archive software.
This information gives us enough to build effective detections for several of the follow-on activities from a ZIP file. Most of the Qakbot delivery activity relating to ZIP files involved disk image files and Office maldocs, but many also led directly to scripts, HTA files, MSI installers, and HTML files. Most of these can contribute to direct detections for processes running files from temporary archive directories, such as WScript, MShta, Msiexec, and web browsers, by looking for creation records of those processes with the possible temporary paths included in the command line.
It is also worth noting that while Qakbot heavily used password-protected ZIP files to evade simple detections, there are unfortunately no log-based indicators that a ZIP file was decompressed using a password, regardless of the software used to extract it.
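As a rough sketch of that kind of detection, the check below flags watched processes whose command line references an archive temporary directory. The three path patterns are commonly reported conventions for 7-Zip, WinRAR, and Explorer and are assumptions here, not a copy of Figure 6; the event fields and function name are likewise hypothetical.

```python
import re

# Assumed temporary directory patterns; verify against the software
# versions deployed in your environment before relying on them.
ARCHIVE_TEMP_PATTERNS = {
    "7-Zip": re.compile(r"\\AppData\\Local\\Temp\\7zO[^\\]+\\", re.I),
    "WinRAR": re.compile(r"\\AppData\\Local\\Temp\\Rar\$[^\\]+\\", re.I),
    "Explorer": re.compile(r"\\AppData\\Local\\Temp\\Temp\d+_[^\\]+\.zip\\", re.I),
}

# Processes named in the text, plus a few browsers.
WATCHED_PROCESSES = {
    "wscript.exe", "mshta.exe", "msiexec.exe",
    "chrome.exe", "msedge.exe", "firefox.exe",
}


def archive_source(event):
    """Return the archive tool whose temp path appears on the command line, else None."""
    if event["process_name"].lower() not in WATCHED_PROCESSES:
        return None
    for tool, pattern in ARCHIVE_TEMP_PATTERNS.items():
        if pattern.search(event["command_line"]):
            return tool
    return None
```

Returning the tool name rather than a boolean keeps the context available to the analyst, since it indicates how the user opened the archive.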
Figure 7: Execution paths from PDF phishing payloads.
PDFs are the most frequently downloaded email attachment type and were the phishing payload of choice for Qakbot delivery while the group was active in 2023. When considering potential detections around malicious PDFs, it’s again important to consider how users may approach opening PDFs. Several standalone PDF readers exist, such as Adobe Acrobat and Foxit, but many users now rely on PDF readers built into their browsers when opening them. Some web-based email clients like Gmail even have their own PDF reader integrations built in. This inhibits visibility in many of the situations where we might want to know that a PDF attachment was opened.
Data gathered through @pr0xylife and @Cryptolaemus1 posts shows that PDFs led to either a URL or a ZIP file. Further review showed that the posters likely just changed how they depicted the chain over time, since all the PDF samples they listed as leading to ZIP files included a URL to download the ZIP rather than having it embedded.
Considering this, we mainly look at possible user interactions with PDFs that lead to them clicking a link. With the previously stated limitations in mind, we can detect potential PDF attachments opened with standalone readers that lead to opening a URL. In event logs, most PDF readers will expose the path for the file being opened. We can tune this down to process creation events where the parent process is a standalone PDF reader with one of the paths from Figure 2 in its command line and a browser child process. Further, we can look for a sequence where the browser's process identifier (PID) creates files shortly after the initial event.
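The sequence described above can be sketched as a simple correlation between process creation and file creation events. Everything about the event schema here (host, pid, time, parent_name, parent_command_line, process_name) is an assumption for illustration, as are the reader and browser executable names, the path markers standing in for Figure 2, and the two-minute window.

```python
from datetime import datetime, timedelta

# Assumed executable names; extend for the readers and browsers you see.
PDF_READERS = {"acrobat.exe", "acrord32.exe", "foxitpdfreader.exe"}
BROWSERS = {"chrome.exe", "msedge.exe", "firefox.exe", "opera.exe"}
# Lowercased fragments of email download paths (stand-ins for Figure 2).
EMAIL_PATH_MARKERS = ("\\content.outlook\\", "\\downloads\\")


def pdf_link_sequences(process_events, file_events, window=timedelta(minutes=2)):
    """Pair reader-to-browser launches with files the browser PID creates soon after."""
    hits = []
    for proc in process_events:
        if proc["parent_name"].lower() not in PDF_READERS:
            continue
        if proc["process_name"].lower() not in BROWSERS:
            continue
        parent_cmd = proc["parent_command_line"].lower()
        if not any(marker in parent_cmd for marker in EMAIL_PATH_MARKERS):
            continue
        for created in file_events:
            delay = created["time"] - proc["time"]
            if (created["host"] == proc["host"]
                    and created["pid"] == proc["pid"]
                    and timedelta(0) <= delay <= window):
                hits.append((proc, created))
    return hits


# Example with made-up events.
launch = {
    "host": "ws-01", "pid": 4242, "time": datetime(2023, 5, 1, 9, 0, 0),
    "parent_name": "Acrobat.exe",
    "parent_command_line": r'"C:\Program Files\Adobe\Acrobat DC\Acrobat\Acrobat.exe" '
                           r'"C:\Users\alice\Downloads\invoice.pdf"',
    "process_name": "msedge.exe",
}
drop = {"host": "ws-01", "pid": 4242, "time": datetime(2023, 5, 1, 9, 0, 40),
        "file_path": r"C:\Users\alice\Downloads\invoice.zip"}
print(len(pdf_link_sequences([launch], [drop])))
```

The nested loop is fine for a sketch; at production volumes you would index file events by host and PID first, and a real pipeline would also need to account for PID reuse.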
Figure 8: Execution paths from OneNote phishing payloads.
We don't explore OneNote payloads in as much depth as the other phishing payloads, since OneNote was more of a flash-in-the-pan payload at the time. Absent any newly emerging techniques that revitalize OneNote usage as a maldoc, it suffices for most situations to maintain a general suspicious OneNote child process detection. This could be developed from Figure 8, where we can see the types of files delivered and executed after opening a OneNote file. This gives us both suspicious child processes and file creations to consider when developing detections.
Figure 9: Execution paths from HTML phishing payloads.
To develop detections for HTML smuggling, we had to investigate what could be observed in logs from each browser opening an HTML attachment from either a standalone or web-based email client. What we found was that when directly opened from a standalone client, all browsers would expose the path of a local HTML file in their process creation command line records. The situation is different with web-based clients—Firefox will show another Firefox child process with the local path on the command line, but Chrome and Edge do not give any indication that a local file was opened outside of the file creation event.
Knowing this, we can make limited sequence-based detections looking for different browsers with process creation events indicating an HTML file opened from a Content.Outlook or downloads directory, followed by file creation records soon afterward with that PID. Depending on your environment, you may want to tune this for the creation of certain types of files or the exclusion of prominent secure message attachments that frequently come in HTML attachments.
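A minimal sketch of such a sequence-based check follows, including the two tuning knobs mentioned above. The event schema, the set of dropped file extensions, the 60-second window, and the excluded_substrings parameter (for known secure message attachment names) are all assumptions chosen for illustration.

```python
import re
from datetime import datetime, timedelta
from pathlib import PureWindowsPath

BROWSERS = {"chrome.exe", "msedge.exe", "firefox.exe"}
# Lowercased path fragments for the Content.Outlook and downloads directories.
EMAIL_PATH_MARKERS = ("\\content.outlook\\", "\\downloads\\")
HTML_SUFFIX_RE = re.compile(r"\.html?(?=[\"'\s]|$)")
# Illustrative set of dropped file types to alert on; tune per environment.
DROP_EXTENSIONS = {".zip", ".iso", ".img", ".js", ".hta", ".msi"}


def opens_email_html(proc, excluded_substrings=()):
    """True if a browser command line references an HTML file in an email path."""
    if proc["process_name"].lower() not in BROWSERS:
        return False
    cmd = proc["command_line"].lower()
    if any(text.lower() in cmd for text in excluded_substrings):
        return False
    has_marker = any(marker in cmd for marker in EMAIL_PATH_MARKERS)
    return has_marker and bool(HTML_SUFFIX_RE.search(cmd))


def smuggling_candidates(process_events, file_events,
                         window=timedelta(seconds=60), excluded_substrings=()):
    """Pair HTML opens with watched file types created by the same PID soon after."""
    hits = []
    for proc in process_events:
        if not opens_email_html(proc, excluded_substrings):
            continue
        for created in file_events:
            delay = created["time"] - proc["time"]
            same_process = (created["host"] == proc["host"]
                            and created["pid"] == proc["pid"])
            extension = PureWindowsPath(created["file_path"]).suffix.lower()
            if (same_process and timedelta(0) <= delay <= window
                    and extension in DROP_EXTENSIONS):
                hits.append((proc["command_line"], created["file_path"]))
    return hits


# Example with made-up events.
opened = {
    "host": "ws-01", "pid": 5150, "time": datetime(2023, 5, 1, 9, 0, 0),
    "process_name": "chrome.exe",
    "command_line": r'"C:\Program Files\Google\Chrome\Application\chrome.exe" '
                    r"C:\Users\alice\AppData\Local\Microsoft\Windows\INetCache"
                    r"\Content.Outlook\AB12CD34\invoice.html",
}
dropped = {"host": "ws-01", "pid": 5150, "time": datetime(2023, 5, 1, 9, 0, 5),
           "file_path": r"C:\Users\alice\Downloads\invoice.zip"}
print(smuggling_candidates([opened], [dropped]))
```

As the text notes, this only covers cases where the browser exposes the local path, so HTML attachments opened from web-based clients in Chrome or Edge would not match the first condition.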
We face many difficulties and limitations in detecting activity around phishing payloads: unpredictable user behavior, software preferences, high false positive rates, and limited log visibility in certain circumstances. Even so, by looking further into user stories around email usage and digging deeper into the log behaviors of different applications, we can find several detections to help alert on suspicious phishing payload activity earlier in the chain.
In the next installment of our Qakbot loader sequence series, we will dive into the DLL execution methods that were used to execute Qakbot and see if we can generalize their activities into weak signals that capture a wider array of suspicious DLL execution techniques. We’ll also look at alternative methods for launching DLLs that could potentially be adopted by this or other groups in the future.