APT41 TOUGHPROGRESS Malware Analysis

I arrive at the office, make a cup of coffee and sit down to browse the latest cybersecurity news. I prepare a daily brief for the company covering the most important, alarming or interesting news of the day. While browsing I see APT41 mentioned once again, one of the groups I like to keep an eye on for their innovation. Opening the Google Threat Intel page, I choke a little on my coffee as I scroll to the final data exfiltration part of the write-up. This isn’t the first time APT41 has taken the world by surprise by using an existing cloud service for C2 communication, but to use a Google service and get detected by Google, again?

If I had a nickel for every time APT41 used a Google service for C2 communication, I’d have two nickels, which is a lot in the cybersecurity world.

In 2021 a new tool named GC2-Sheet was posted to GitHub. It is a command-and-control (C2) framework that hides its traffic inside valid Google API calls to a Google Sheet, using the sheet to issue instructions, push malware and exfiltrate data. This tool marked the midpoint between the technique being discovered and becoming widely used, as it made the method easy for many different attacker groups to adopt. One of the groups to pick up the tool was APT41, also tracked as HOODOO, in 2022 and in 2023 while targeting job search websites. This indicated a two-fold shift in the group’s approach at the time: (1) using open source tools instead of novel malware, and (2) using cloud services as a valid C2 component. While the group would later roll back the former change, returning to novel malware and techniques, it would venture further into abusing Google services for C2 operations and data exfiltration.

Thus TOUGHPROGRESS was discovered. First identified in attacks against government entities in October 2024, it was covered in a Mandiant (now Google Threat Intel) post in late May 2025. Part of a three-stage attack whose initial vector is a phishing email in Mandarin, the malware is the late-stage payload: it performs custom actions on the affected Windows machine and exfiltrates data, all while using Google Calendar to communicate with its C2 server. This intrigued me, not only because this is a group I like to follow, but also because the technique is novel; it was something we hadn’t seen before, and I could tell exactly why we would see more of it in the future.

Initial Analysis

Thus began my journey into decoding, reverse engineering, analysing, detonating and detecting this malware. I wanted to find out everything there is to find about it, keeping the process fun while making the output useful enough to share with the whole world. First I had to grab the malware from a reliable source. Since it had been public for six months already, the surface internet should suffice for my hunt, so I went to MalwareBazaar and grabbed the first sample I found under the tag TOUGHPROGRESS. I hardly made any progress with it, however, finding out two days later that it was the wrong sample: it bore the marks of the earlier payload but none of the parts that made it interesting to me. After going back and hopping around the bazaar, I eventually found the correct copy, and so the process of setting up my lab and analysing this malware began.

Static analysis of the malware is the first step. To begin, we install a fresh instance of REMnux on a virtual machine, as it comes bundled with the tools needed for both static and dynamic analysis of EXE files.

First we use Manalyze, a static analyzer for PE files that can be combined with YARA rules to check for different flags.

Summary of malware static analysis showing architecture as IMAGE_FILE_MACHINE_AMD64 and compilation date as September 23, 2024.

First we obtain the compilation date of the malware, 23rd September 2024, which is consistent with the first detection in October 2024. This suggests the sample is from the right time frame, but we mustn’t stop here in verifying it.
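
For readers who want to see where that date comes from, here is a minimal Python sketch of reading the machine type and compile timestamp straight from the COFF header. The function name `read_coff_summary` is my own and this is not how Manalyze is implemented; it only illustrates the fields involved. Keep in mind that the timestamp is just a header field and can be forged, so it is supporting evidence rather than proof.

```python
import struct
from datetime import datetime, timezone

# Machine values as defined in the PE/COFF specification.
MACHINE_NAMES = {
    0x014C: "IMAGE_FILE_MACHINE_I386",
    0x8664: "IMAGE_FILE_MACHINE_AMD64",
}


def read_coff_summary(data: bytes) -> dict:
    """Return the machine type and compile timestamp from raw PE bytes."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew at offset 0x3C holds the file offset of the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("PE signature not found")
    # COFF header starts right after the signature:
    # Machine (u16), NumberOfSections (u16), TimeDateStamp (u32).
    machine, _num_sections, timestamp = struct.unpack_from(
        "<HHI", data, pe_offset + 4
    )
    return {
        "machine": MACHINE_NAMES.get(machine, hex(machine)),
        "compiled": datetime.fromtimestamp(timestamp, tz=timezone.utc),
    }
```

Calling `read_coff_summary(open(path, "rb").read())` on a sample (inside the analysis VM) should give the same two values the summary screenshot shows.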

Next we analyze the Image Headers:

Image showing the optional header details of a PE32+ executable file, including properties like linker version, size of code, entry point address, and operating system version.

Here we can confirm that the executable is a PE32+ file, i.e. it is built for x64 Windows systems. While not immediately informative, this helps us set up the right lab later when we detonate the malware. We also find that the entry point address falls inside the “.text” section, so we can begin our code analysis from that address. Both are important things to note.
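
The two facts above (PE32+ and an entry point inside “.text”) can be checked by hand with a few header reads. The following is a rough sketch under the standard PE layout; the function name `entry_point_section` is hypothetical and the code does no validation beyond the PE signature.

```python
import struct


def entry_point_section(data: bytes) -> tuple[str, int, str]:
    """Return (format, entry point RVA, name of the section containing it)."""
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("PE signature not found")
    coff = pe_offset + 4
    (num_sections,) = struct.unpack_from("<H", data, coff + 2)
    (opt_size,) = struct.unpack_from("<H", data, coff + 16)

    # The optional header follows the 20-byte COFF header.
    opt = coff + 20
    (magic,) = struct.unpack_from("<H", data, opt)
    fmt = {0x10B: "PE32", 0x20B: "PE32+"}.get(magic, hex(magic))
    # AddressOfEntryPoint sits at offset 16 in both PE32 and PE32+.
    (entry_rva,) = struct.unpack_from("<I", data, opt + 16)

    # The section table starts right after the optional header; 40 bytes each.
    table = opt + opt_size
    for i in range(num_sections):
        off = table + 40 * i
        name = data[off:off + 8].rstrip(b"\x00").decode("ascii", "replace")
        virtual_size, virtual_addr = struct.unpack_from("<II", data, off + 8)
        if virtual_addr <= entry_rva < virtual_addr + virtual_size:
            return fmt, entry_rva, name
    return fmt, entry_rva, ""
```

An empty section name in the result means the entry point falls outside every section, which would itself be worth a closer look.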

Running another analysis tool we obtain the following list:

List of loaded DLLs for malware analysis showing KERNEL32.dll, WINHTTP.dll, and IPHLPAPI.dll among others.

The imported DLLs already tell a story. KERNEL32.dll is a favourite among attackers for the variety of API calls it exposes. WinHTTP.dll is common in legitimate software, but it also lets the malware communicate over HTTPS, hiding its requests inside encrypted traffic. IPHLPAPI.dll is a more unusual find that requires further investigation.

We use another tool to grab strings from the malware and find out exactly why it is there: IPHLPAPI.dll is imported for the GetAdaptersInfo function. This can be used for reconnaissance to learn the network configuration, host identifiers such as adapter MAC addresses, and so on.

Analysis of IPHLPAPI.dll showing the GetAdaptersInfo function, used for network reconnaissance.
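
If no strings tool is at hand, the same idea fits in a few lines of Python. This is a simplified sketch, not a replacement for a real tool: the function name `extract_strings` and the minimum length of 6 are my own choices, and it only looks for printable ASCII and its UTF-16LE ("wide") form, which is how Windows binaries commonly store text.

```python
import re


def extract_strings(data: bytes, min_len: int = 6) -> list[str]:
    """Return printable ASCII and UTF-16LE strings of at least min_len chars."""
    # Runs of printable ASCII bytes.
    ascii_re = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    # Runs of printable ASCII characters each followed by a NUL byte (UTF-16LE).
    wide_re = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)

    found = [m.group().decode("ascii") for m in ascii_re.finditer(data)]
    found += [m.group().decode("utf-16-le") for m in wide_re.finditer(data)]
    return found
```

Filtering the result for names like `GetAdaptersInfo` is then an ordinary list comprehension.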

Next we use a tool called “PE Tree”, which displays the structure of the executable file. While it does not give us much new information, it does let us see the functions imported from each DLL.

A screenshot of a PE Tree tool displaying the structure of a Portable Executable (PE) file, including headers and imported Dynamic-Link Libraries (DLLs) such as KERNEL32.dll, WinHTTP.dll, and IPHLPAPI.dll.

What we have learnt from our analysis so far is that the EXE file we are looking at imports several DLLs that expose wide-ranging and sensitive API calls, which malware executables commonly rely on for their capabilities.

We pause static analysis here. While there is more to be learnt by digging deeper along this path, there is a time/effort payoff that needs to be balanced. So we switch approaches: the next step is disassembly with Ghidra, an open source reverse engineering tool developed by the NSA that is widely used in malware analysis.

Next Steps – Disassembly and Reverse Engineering

I download and run Ghidra, import the executable and run the analysis, and voila: I have a mix of addresses, assembly code and a small window that attempts to decompile the assembly into C code, along with other smaller windows and tools that help guide someone through unpacking and understanding an executable.

I am immediately greeted with the disassembled code as well as all the details mentioned above. First I look for the entry point within the code to see if I can understand where execution begins. I find it using the “Symbol Tree” window, which leads me to the function where the entry is tagged. I quickly realise this will make more sense once I understand where things are coming from and where they go, so I launch the Function Graph on this function.

Ghidra disassembly view showing the entry point and function graph for analyzing a malware executable.

This portion takes me back to my college assembly classes and to playing around with x64dbg in my free time. I begin looking at the code and trying to understand what is happening, but the executable is long and vast; maybe I need a little more direction.

The next thing I remember is that such code often contains Unicode strings in plain English that we can immediately understand. I scroll back and forth a bit and am able to confirm this.

A screenshot showing a hexadecimal view of exe file strings, displaying Unicode values with recognizable text related to a Mozilla user agent.

To make this easier, I use Ghidra’s “Defined Strings” functionality and am greeted with a new window, which is the gold mine. The following screenshots highlight it:

A table displaying string values and representations from a malware analysis tool, highlighting various Unicode strings found in the executable.

A screenshot showing strings extracted from malware code, detailing various API calls and configuration strings.

A table showing string values and representations from malware analysis, including directory status messages and date formats.

Gotcha! We see it now: a mix of calls to both the Google auth and Google Calendar APIs, a website called “https://api.ipify.org” (which I later found simply returns the public IP of the victim machine), and details being pulled about the machine’s local date and time. This confirms that the malware is a calendar-aware application that integrates with the Google Calendar API. The strings reveal support for locale-sensitive date and time formatting through Windows APIs such as “GetDateFormatEx” and “GetLocaleInfoEx”, along with standard error messages for filesystem operations such as missing directories, busy files and memory issues. They also include typical calendar elements, such as days of the week and months, and timestamp formats, suggesting event logging or scheduling functionality.

Further, the presence of URLs like googleapis.com and ipify.org, and of headers such as “Authorization: Bearer” and “Content-Type: application/json”, indicates that the malware performs authenticated network communication, likely to sync calendar events or retrieve data.

Eventually, playing around, we even find the calendar date of the event that the malware is trying to reach: 2023-07-30, i.e. 30th July 2023.

Screenshot displaying defined strings from a malware analysis tool, highlighting a Google Calendar API event string used for malware communication.

This date is important because further analysis shows that the malware tries to receive commands or pull information from a Google Calendar event placed on this date. I was eventually able to find the date used for data exfiltration as well: 30th May 2023 (2023-05-30), the date of the event to which the malware sends encrypted information. This makes the calendar the perfect channel for the malware, letting it talk to the C2 server while hiding its communication behind legitimate-looking HTTPS calls.

I tried to study the disassembled code in more depth using these strings as my guide, but I kept running into issues understanding the control flow; the function calls alone weren’t making sense. That’s when I decided to revisit the Mandiant (now Google) post on APT41 and realized where I had missed the mark: the malware uses multiple obfuscation techniques to hide its control flow. In Mandiant’s words, “Adding the two values together overflows the 64-bit address space and the result is the address of the function to be called.” “Clever”, I thought; a small lesson for me not to overlook overflows at any level while debugging, even in the modern age. At this point I read the rest of the Mandiant article, which describes in detail the encryption technique used by the malware. Since I found this a little challenging, I decided to stop my analysis for the time being and focus on the next steps. I could have tried to dynamically emulate the malware in REMnux, but decided against it, as I would need the remaining files as well, which I did not have available.
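
To convince myself of how the overflow trick works, a tiny Python model helps. Python integers do not wrap, so the sketch masks the sum to 64 bits to mimic a CPU register. The function name and both operand values are hypothetical and chosen purely for illustration; neither is taken from the sample.

```python
MASK64 = (1 << 64) - 1


def resolve_call_target(encoded: int, key: int) -> int:
    """Add two 64-bit values and wrap around, as a 64-bit register would."""
    return (encoded + key) & MASK64


# Hypothetical operands: neither looks like a valid code address on its own.
encoded = 0xFFFFFFFFFFFFF000
key = 0x0000000140002234

# The carry out of bit 63 is discarded, leaving a plausible function address.
print(hex(resolve_call_target(encoded, key)))  # 0x140001234
```

A disassembler that reads either operand in isolation sees no usable address, which is why following the calls statically did not get me far.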

Takeaways

After spending considerable time dissecting TOUGHPROGRESS, a few things were confirmed for me. This wasn’t some script kiddie’s tool; this was APT41’s display of capability. Opting out of off-the-shelf public tooling and creating an additional vector by using cloud services as C2 channels is a threat yet to be fully explored by both researchers and malicious actors. The malware itself is very stealthy and keeps its level of sophistication even in evasion. Lastly, control flow obfuscation is not going anywhere soon; it is just one of the many techniques attackers use to make malware harder to analyse with readily available tools alone.

Functioning of the Malware

TOUGHPROGRESS operates as a late-stage payload in a multi-phase attack, beginning with a phishing email in Mandarin. Once the malware lands on a machine, it starts by profiling the environment; pulling adapter information, getting locale data, and generally checking where it has landed. It does this using standard Windows DLLs like IPHLPAPI.dll and KERNEL32.dll.

It then initiates communication with its C2 by connecting to a predefined Google Calendar event. The malware parses event data from a specific date (in this case, July 30, 2023) and extracts commands embedded within the event’s metadata. It also exfiltrates stolen data to a different event scheduled for May 30, 2023. This use of Calendar events for bidirectional communication makes network traffic look harmless and bypasses most conventional security tools.

Detections

From a detection standpoint, TOUGHPROGRESS reinforces the importance of behavioral analytics over signature-based detection. The malware uses nothing particularly groundbreaking in terms of file structure or obvious malicious signatures. Static scans won’t find much unless they are tuned for very specific API usage patterns or backed by YARA rules that detect uncommon string sequences.
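
As an illustration of what such a string-sequence check could look like, here is a Python stand-in for the logic a YARA rule would express. It is a sketch only: the marker list is built from the strings observed earlier in this analysis, the function names are my own, and a legitimate calendar client could match the same markers, so a hit is a reason to look closer, not a verdict.

```python
# Strings seen in the sample during the Ghidra "Defined Strings" pass.
MARKERS = [
    "googleapis.com/calendar/v3/calendars/",
    "api.ipify.org",
    "Authorization: Bearer",
]


def contains_marker(data: bytes, marker: str) -> bool:
    """Check for the marker in both ASCII and UTF-16LE (wide) encodings."""
    return marker.encode("ascii") in data or marker.encode("utf-16-le") in data


def looks_like_calendar_c2(data: bytes, required: int = 3) -> bool:
    """Flag a file only when enough of the markers appear together."""
    hits = sum(contains_marker(data, marker) for marker in MARKERS)
    return hits >= required
```

Requiring the markers to appear together is the point: each string alone is harmless, while the combination of a Calendar API path, an IP lookup service and a bearer token header inside one unsigned executable is far less common.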

Security teams must begin treating anomalous usage of legitimate cloud services with more skepticism. For example, how often is a corporate endpoint making authenticated calls to Google Calendar APIs using bearer tokens? Organizations will need to start monitoring cloud API usage in the same way they track endpoint behavior. This means setting up alerting for spikes or deviations in API usage, even if the services being contacted are not inherently malicious.
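
A first version of that monitoring can be very simple. The sketch below assumes telemetry that ties each outbound request to a process name, for example EDR network events or proxy logs enriched with endpoint data. The record fields (`endpoint`, `process`, `host`, `path`) and the allowlist are hypothetical and would need to be adapted to whatever your tooling actually emits.

```python
from collections import Counter

# Hypothetical allowlist of processes expected to call the Calendar API.
ALLOWED_PROCESSES = {"chrome.exe", "msedge.exe", "firefox.exe", "outlook.exe"}


def suspicious_calendar_callers(records):
    """Count Calendar API requests per (endpoint, process) outside the allowlist."""
    counts = Counter()
    for rec in records:
        if rec["host"].lower() != "www.googleapis.com":
            continue
        if not rec["path"].startswith("/calendar/v3/"):
            continue
        process = rec["process"].lower()
        if process not in ALLOWED_PROCESSES:
            counts[(rec["endpoint"], process)] += 1
    return counts
```

The counts give a baseline to alert on: a process that has never touched the Calendar API before and suddenly makes repeated authenticated calls is exactly the deviation worth investigating.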

Conclusion

Looking ahead, we’re likely to see more attackers mimic this approach. The use of legitimate cloud services as a layer of C2 infrastructure is just too effective to ignore. It offers global reach, encryption by default, high availability, and a built-in layer of trust. What APT41 has done with TOUGHPROGRESS will likely serve as a blueprint for future malware.

We may also see a rise in malware designed with specific cloud provider APIs in mind: not just Google but Microsoft, AWS, even smaller SaaS providers. As organizations increasingly migrate to cloud-native workflows, attackers are moving right alongside them.

In conclusion, TOUGHPROGRESS is not just a piece of malware; it’s a case study in what the next generation of threat actors will look like. It’s agile, aware of the cloud, modular in design, and built to blend in. Analyzing it was not just a technical exercise, but a peek into what cybersecurity defense needs to adapt to: threat actors who know our tooling, our architecture, and maybe even our coffee schedules.

Indicators of Compromise:

Calendar API client ID: 104075625139-l53k83pb6jbbc2qbreo4i5a0vepen41j.apps.googleusercontent.com
Calendar URL: https[:]//www[.]googleapis[.]com/calendar/v3/calendars/ff57964096cadc1a8733cf566b41c9528c89d30edec86326c723932c1e79ebf0@group.calendar.google.com/events
IP Fetch URL: https[:]//api[.]ipify[.]org
