In this article we look at how Call Stacks can be used to detect malware at a deeper level and to further our understanding of malware behaviour. The aim is to go beyond detecting behaviour from which programs are launched or which actions are performed, and to also determine which functions are involved in carrying out the malicious activity. This helps us increase coverage, reduce false positives and raise confidence in our alerts.
What is a Call Stack?
Wikipedia defines a Call Stack as “a stack data structure which stores information about the active subroutines and inline blocks of a computer program.” The definition is simple enough, but it does not cover the purpose of the structure. A Call Stack exists mainly to keep track of the address to which each active subroutine must return control once it finishes execution. This covers nested calls as well: with each call away from the current subroutine, a return address is pushed on top of the Call Stack. When a subroutine finishes and control flow returns, the most recent return address is popped from the stack, which both keeps the flow of execution consistent and stops the stack from growing beyond its allocated space.
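The push and pop behaviour described above can be illustrated with a toy model. This is only a sketch: the class name `CallStack`, the `max_depth` limit and the addresses are made up for illustration, and a real stack also holds arguments, locals and saved registers.

```python
class CallStack:
    """Toy model of a call stack that stores only return addresses."""

    def __init__(self, max_depth=1024):
        self._frames = []
        self._max_depth = max_depth  # stands in for the stack's allocated space

    def call(self, return_address):
        """Push the address at which the caller resumes after the call."""
        if len(self._frames) >= self._max_depth:
            raise OverflowError("stack overflow")
        self._frames.append(return_address)

    def ret(self):
        """Pop the most recent return address and hand control back to it."""
        return self._frames.pop()

    def snapshot(self):
        """Return the addresses with the most recent frame first."""
        return list(reversed(self._frames))


stack = CallStack()
stack.call(0x1000)  # main calls f; main resumes at 0x1000 when f returns
stack.call(0x2000)  # f calls g; f resumes at 0x2000 when g returns
print([hex(a) for a in stack.snapshot()])  # ['0x2000', '0x1000']
print(hex(stack.ret()))  # 0x2000 (g returns into f)
print(hex(stack.ret()))  # 0x1000 (f returns into main)
```

A snapshot taken while `g` is running lists the nested calls innermost first, which is the same shape a logged call trace has.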
Each Call Stack is associated with a running program (each task or thread of a process). While the Call Stack is usually hidden from the developer, Sysmon records a snapshot of it in its logs under Event ID 10 (ProcessAccess). This means that the order of execution, which otherwise exists only in memory, can be read from the logs and parsed later.
We can now use Call Stacks to understand the usual behaviour of a program, including the DLLs and functions it calls, and map suspicious deviations or known malicious behaviour against that baseline.
Understanding Sysmon Event ID 10 logs
Let’s take a look at a sample Sysmon Event ID 10 log. A sanitized example is shown below:
{
"EventTime": "1970-01-01 00:00:00",
"Hostname": "DESKTOP-TESTING",
"Keywords": -9223372036854776000,
"EventType": "INFO",
"SeverityValue": 2,
"Severity": "INFO",
"EventID": 10,
"SourceName": "Microsoft-Windows-Sysmon",
"ProviderGuid": "{1234567A-A12B-12A3-AB1C-12A3456CDEF7}",
"Version": 3,
"Task": 10,
"OpcodeValue": 0,
"RecordNumber": 165040,
"ProcessID": 1111,
"ThreadID": 2222,
"Channel": "Microsoft-Windows-Sysmon/Operational",
"Domain": "NT AUTHORITY",
"AccountName": "SYSTEM",
"UserID": "S-1-1-1",
"AccountType": "User",
"Message": "Process accessed:\r\nRuleName: technique_id=T1003,technique_name=Credential Dumping\r\nUtcTime: 1970-01-01 00:00:00.000\r\nSourceProcessGUID: {98a7bcd7-987a-987b-9a87-987654321098}\r\nSourceProcessId: 1111\r\nSourceThreadId: 2222\r\nSourceImage: C:\\Windows\\Sysmon64.exe\r\nTargetProcessGUID: {98a7bcd7-987a-987b-9a87-987654321098}\r\nTargetProcessId: 333\r\nTargetImage: C:\\Windows\\system32\\lsass.exe\r\nGrantedAccess: 0x1FFFFF\r\nCallTrace: C:\\Windows\\SYSTEM32\\ntdll.dll+9d9b4|C:\\Windows\\SYSTEM32\\ntdll.dll+d787a|C:\\Windows\\SYSTEM32\\KERNEL32.DLL+1e20c|C:\\Windows\\SYSTEM32\\KERNEL32.DLL+2921e|C:\\Windows\\Sysmon64.exe+ecb25|C:\\Windows\\Sysmon64.exe+ee340|C:\\Windows\\Sysmon64.exe+12ee13|C:\\Windows\\Sysmon64.exe+ef2d1|C:\\Windows\\Sysmon64.exe+f2c0d|C:\\Windows\\Sysmon64.exe+108e5a|C:\\Windows\\Sysmon64.exe+1adb62|C:\\Windows\\SYSTEM32\\KERNEL32.DLL+17374|C:\\Windows\\SYSTEM32\\ntdll.dll+4cc91\r\nSourceUser: NT AUTHORITY\\SYSTEM\r\nTargetUser: NT AUTHORITY\\SYSTEM",
"Category": "Process accessed (rule: ProcessAccess)",
"Opcode": "Info",
"RuleName": "technique_id=T1003,technique_name=Credential Dumping",
"UtcTime": "1970-01-01 00:00:00.000",
"SourceProcessGUID": "{98a7bcd7-987a-987b-9a87-987654321098}",
"SourceProcessId": "1111",
"SourceThreadId": "2222",
"SourceImage": "C:\\Windows\\Sysmon64.exe",
"TargetProcessGUID": "{98a7bcd7-987a-987b-9a87-987654321098}",
"TargetProcessId": "333",
"TargetImage": "C:\\Windows\\system32\\lsass.exe",
"GrantedAccess": "0x1fffff",
"CallTrace": "C:\\Windows\\SYSTEM32\\ntdll.dll+9d9b4|C:\\Windows\\SYSTEM32\\ntdll.dll+d787a|C:\\Windows\\SYSTEM32\\KERNEL32.DLL+1e20c|C:\\Windows\\SYSTEM32\\KERNEL32.DLL+2921e|C:\\Windows\\Sysmon64.exe+ecb25|C:\\Windows\\Sysmon64.exe+ee340|C:\\Windows\\Sysmon64.exe+12ee13|C:\\Windows\\Sysmon64.exe+ef2d1|C:\\Windows\\Sysmon64.exe+f2c0d|C:\\Windows\\Sysmon64.exe+108e5a|C:\\Windows\\Sysmon64.exe+1adb62|C:\\Windows\\SYSTEM32\\KERNEL32.DLL+17374|C:\\Windows\\SYSTEM32\\ntdll.dll+4cc91",
"SourceUser": "NT AUTHORITY\\SYSTEM",
"TargetUser": "NT AUTHORITY\\SYSTEM",
"EventReceivedTime": "1970-01-01 00:00:00",
"SourceModuleName": "in",
"SourceModuleType": "im_msvistalog"
}
Here, the “CallTrace” field acts as a snapshot of the Call Stack of the reporting (source) process at the time of the event. If we focus on CallTrace alone, as is our intention, and put each entry on its own line for readability, we get the following output:
C:\\Windows\\SYSTEM32\\ntdll.dll+9d9b4
C:\\Windows\\SYSTEM32\\ntdll.dll+d787a
C:\\Windows\\SYSTEM32\\KERNEL32.DLL+1e20c
C:\\Windows\\SYSTEM32\\KERNEL32.DLL+2921e
C:\\Windows\\Sysmon64.exe+ecb25
C:\\Windows\\Sysmon64.exe+ee340
C:\\Windows\\Sysmon64.exe+12ee13
C:\\Windows\\Sysmon64.exe+ef2d1
C:\\Windows\\Sysmon64.exe+f2c0d
C:\\Windows\\Sysmon64.exe+108e5a
C:\\Windows\\Sysmon64.exe+1adb62
C:\\Windows\\SYSTEM32\\KERNEL32.DLL+17374
C:\\Windows\\SYSTEM32\\ntdll.dll+4cc91
Here we can see the semblance of a stack, which shows how control flows through this process and how the addresses are stored. Each entry is the path of an “.exe” or “.dll” module followed by an offset into that module, and together they give the address to which control flow is supposed to return once the called function completes.
The trace is listed from the most recent frame at the top to the oldest at the bottom. It begins and ends in ntdll.dll and KERNEL32.DLL, core Windows libraries that sit between user-mode applications and kernel-mode operations. The bottom two entries (KERNEL32.DLL+17374, ntdll.dll+4cc91) are consistent with standard thread start-up, and the top entries (e.g., ntdll.dll+9d9b4, KERNEL32.DLL+1e20c) with the Windows API call that accessed the target process.
The middle of the stack is dominated by Sysmon64.exe offsets (e.g., +ecb25, +f2c0d, +1adb62), which represent Sysmon’s internal logic for monitoring and logging system events.
This is a benign and expected stack for normal Sysmon operation: a Sysmon thread starts (via ntdll and KERNEL32), runs its internal event processing, and calls into the Windows API to access lsass.exe.
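A trace like the one above can be split into frames programmatically. The sketch below parses the CallTrace string from the sample log (after JSON decoding, so with single backslashes) and summarises it as runs of consecutive frames per module. The names `Frame`, `parse_call_trace` and `module_runs` are hypothetical helpers, not part of Sysmon or any library.

```python
import ntpath
from itertools import groupby
from typing import NamedTuple, Optional


class Frame(NamedTuple):
    module: str            # module path exactly as logged
    offset: Optional[int]  # offset into the module, None if the entry has none


def parse_call_trace(call_trace: str) -> list[Frame]:
    """Split a pipe-separated CallTrace into (module, offset) frames."""
    frames = []
    for entry in call_trace.split("|"):
        module, sep, offset = entry.rpartition("+")
        if sep:
            try:
                frames.append(Frame(module, int(offset, 16)))
                continue
            except ValueError:
                pass  # text after '+' was not hex; keep the raw entry below
        frames.append(Frame(entry, None))
    return frames


def module_runs(frames: list[Frame]) -> list[tuple[str, int]]:
    """Collapse consecutive frames from the same module into (name, count)."""
    names = (ntpath.basename(f.module).lower() for f in frames)
    return [(name, len(list(group))) for name, group in groupby(names)]


CALL_TRACE = "|".join([
    r"C:\Windows\SYSTEM32\ntdll.dll+9d9b4",
    r"C:\Windows\SYSTEM32\ntdll.dll+d787a",
    r"C:\Windows\SYSTEM32\KERNEL32.DLL+1e20c",
    r"C:\Windows\SYSTEM32\KERNEL32.DLL+2921e",
    r"C:\Windows\Sysmon64.exe+ecb25",
    r"C:\Windows\Sysmon64.exe+ee340",
    r"C:\Windows\Sysmon64.exe+12ee13",
    r"C:\Windows\Sysmon64.exe+ef2d1",
    r"C:\Windows\Sysmon64.exe+f2c0d",
    r"C:\Windows\Sysmon64.exe+108e5a",
    r"C:\Windows\Sysmon64.exe+1adb62",
    r"C:\Windows\SYSTEM32\KERNEL32.DLL+17374",
    r"C:\Windows\SYSTEM32\ntdll.dll+4cc91",
])

print(module_runs(parse_call_trace(CALL_TRACE)))
# [('ntdll.dll', 2), ('kernel32.dll', 2), ('sysmon64.exe', 7),
#  ('kernel32.dll', 1), ('ntdll.dll', 1)]
```

The summary makes the shape of the stack easy to compare across events: system libraries at both ends and a single application module in the middle.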
Examples of Scenarios to use Call Stacks
In the past we have talked about how Call Stacks can be used to reduce false positives. To understand this, we take an example scenario and explore the improvement offered by the use of Call Stacks.
Here we take the example of “Microsoft Office spawning child processes”. On its own this detection is crucial and aims to catch common attack techniques; however, legitimate processes such as ‘splwow64.exe’, which is used for printing, also get caught in its net. We cannot simply exclude ‘splwow64.exe’ by path, as this would quickly make it a target for an attacker to hijack using Process Hollowing and create a blind spot in our detection.
Instead, we check whether the call that starts ‘splwow64.exe’ originates from ‘winspool.drv’, which marks it as a legitimate request spawned by printing. This way we exclude splwow64.exe only in the exact legitimate scenario, reducing false positives without opening up evasion gaps.
An example Sigma rule for the same is shown below:
detection:
  selection:
    ParentImage|endswith:
      - '\WINWORD.EXE'
      - '\EXCEL.EXE'
      - '\POWERPNT.EXE'
      - '\MSACCESS.EXE'
      - '\mspub.exe'
      - '\fltldr.exe'
      - '\visio.exe'
    EventAction: 'start'
  exclusion:
    Image|endswith: '\splwow64.exe'
    StackTrace:
      - '*\System32\winspool.drv*'
      - '*\SysWOW64\winspool.drv*'
  condition: selection and not exclusion
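To make the rule's logic explicit, here is a minimal Python sketch of the same condition evaluated against a single event. It assumes the event is a plain dict whose keys match the rule's field names (ParentImage, Image, StackTrace, EventAction) and that matching is case-insensitive; the function name and the sample events are hypothetical.

```python
OFFICE_PARENTS = (
    "\\winword.exe", "\\excel.exe", "\\powerpnt.exe", "\\msaccess.exe",
    "\\mspub.exe", "\\fltldr.exe", "\\visio.exe",
)
PRINT_SPOOLER_MODULES = (
    "\\system32\\winspool.drv",
    "\\syswow64\\winspool.drv",
)


def is_suspicious_office_child(event: dict) -> bool:
    """Evaluate 'selection and not exclusion' for one process-start event."""
    parent = event.get("ParentImage", "").lower()
    image = event.get("Image", "").lower()
    stack = event.get("StackTrace", "").lower()

    # selection: an Office application started a child process
    selection = (
        event.get("EventAction") == "start"
        and parent.endswith(OFFICE_PARENTS)
    )
    # exclusion: splwow64.exe, but only when the spooler driver is in the stack
    exclusion = (
        image.endswith("\\splwow64.exe")
        and any(module in stack for module in PRINT_SPOOLER_MODULES)
    )
    return selection and not exclusion


printing = {
    "EventAction": "start",
    "ParentImage": r"C:\Program Files\Microsoft Office\WINWORD.EXE",
    "Image": r"C:\Windows\splwow64.exe",
    "StackTrace": r"C:\Windows\System32\winspool.drv+1a2b|C:\Windows\System32\ntdll.dll+3c4d",
}
hollowed = dict(printing, StackTrace=r"C:\Windows\System32\ntdll.dll+3c4d")

print(is_suspicious_office_child(printing))  # False
print(is_suspicious_office_child(hollowed))  # True
```

The second event has the same image path as the first but no spooler driver in its stack, so a path-only exclusion would have missed it while this condition still fires.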
A similar approach applies when an Office process spawns a script or shell child process such as cmd.exe or powershell.exe: if the Call Stack includes ‘VBE7.DLL’, it is a strong indicator of macro execution, a common attack vector. Hence, instead of flagging every Office child process, we look at those which contain “VBE7.DLL” in the StackTrace to make our detections sharper.
Let’s take another example where we may need to reduce false positives and enhance our detection: “Execution from ZIP File via Explorer”
Here the issue is that a script interpreter (wscript.exe / cmd.exe / powershell.exe / cscript.exe etc.) runs as a child of the ‘explorer.exe’ process. If we create a detection solely for this, we will gather many events where the user simply starts a legitimate script, and these will be flagged as suspicious by our rule. To refine the detection, we can look at the Call Stack of the process to single out one particularly suspicious scenario:
If we find ‘zipfldr.dll’ (the Zipped Folders Shell Extension) in the Call Stack, we can say with confidence that a script is being run from inside a ZIP archive that was opened using Explorer. Threat actors use this vector to evade “Mark of the Web” (MotW), so it is important to highlight such cases. Using the Call Stack thus lets us make the detection more accurate and drive false positives close to zero.
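Module-based indicators such as VBE7.DLL and the Zipped Folders shell extension (file name zipfldr.dll on Windows) can be collected in a small lookup table. The sketch below assumes the stack is a pipe-separated list of `path+offset` entries, as in Sysmon's CallTrace, and matches on the exact module file name rather than a substring. The table, the function names and the sample stack (including its offsets) are invented for illustration.

```python
import ntpath

# Module file name (lowercase) -> what its presence in the stack suggests
STACK_INDICATORS = {
    "vbe7.dll": "Office macro (VBA) execution",
    "zipfldr.dll": "execution from a ZIP archive opened in Explorer",
}


def modules_in_stack(stack_trace: str) -> set[str]:
    """Return the lowercase file names of all modules in the stack."""
    modules = set()
    for entry in stack_trace.split("|"):
        # Drop the '+offset' suffix if present, otherwise keep the entry as is
        path = entry.rpartition("+")[0] or entry
        modules.add(ntpath.basename(path).lower())
    return modules


def stack_indicators(stack_trace: str) -> list[str]:
    """List the indicator labels whose module appears in the stack."""
    found = modules_in_stack(stack_trace)
    return [label for module, label in STACK_INDICATORS.items() if module in found]


sample = "|".join([
    r"C:\Windows\System32\ntdll.dll+9d9b4",
    r"C:\Windows\System32\zipfldr.dll+1a2b3",
    r"C:\Windows\explorer.exe+4c5d6",
])
print(stack_indicators(sample))
# ['execution from a ZIP archive opened in Explorer']
```

Matching on the file name avoids accidental hits on paths that merely contain the indicator as part of a longer name.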
Future Possibility
There is a limitation in our current approach with Call Stacks: compilers optimize tail calls into jumps, so no return address is pushed for them and the calling function is missing from the stack, hiding crucial tracing information from us. Tail calls are used both by legitimate software for faster processing and by malicious software for evasion.
To work around this problem, the first step is to enrich each module entry using public symbol tables, so that the function is included as well, in the following format:
Module!function+offset
E.g.: ntdll.dll!NtProtectVirtualMemory+0x14
This does not restore frames that were never pushed, but it names the function that each return address falls in, which makes the gaps left by tail calls easier to infer and sets up the next steps required for detection. Next, we must also isolate the “final user module” of each stack for further investigation.
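The enrichment step can be sketched as a nearest-symbol lookup: given a module's symbols sorted by address, find the last symbol that starts at or before the frame's offset. The symbol table below is invented for illustration (real addresses differ between Windows builds), and `symbolize` is a hypothetical helper rather than an existing API.

```python
from bisect import bisect_right


def symbolize(module: str, offset: int, symbols: list[tuple[int, str]]) -> str:
    """Format a frame as Module!function+offset.

    `symbols` is a list of (start_address, name) pairs sorted by address,
    with addresses relative to the module base.
    """
    starts = [start for start, _ in symbols]
    index = bisect_right(starts, offset) - 1
    if index < 0:
        # Offset lies before the first known symbol; leave the frame unresolved
        return f"{module}+{offset:#x}"
    start, name = symbols[index]
    return f"{module}!{name}+{offset - start:#x}"


# Invented addresses, for illustration only
NTDLL_SYMBOLS = [
    (0x1000, "NtOpenProcess"),
    (0x1020, "NtProtectVirtualMemory"),
]

print(symbolize("ntdll.dll", 0x1034, NTDLL_SYMBOLS))
# ntdll.dll!NtProtectVirtualMemory+0x14
print(symbolize("ntdll.dll", 0x0500, NTDLL_SYMBOLS))
# ntdll.dll+0x500
```

Public symbols only cover exported or published functions, so an offset deep inside an unnamed function would be attributed to the preceding named one; large function-relative offsets are a hint that this has happened.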
Now we can use the following filtered list of NTDLL kernel-mode to user-mode callbacks to detect possibly malicious behaviour in the future:
- KiUserExceptionDispatcher
- KiUserApcDispatcher
- KiRaiseUserExceptionDispatcher
- KiUserCallbackDispatcher
- LdrInitializeThunk
- RtlUserThreadStart
- EtwpNotificationThread
- LdrHotPatchRoutine
An example of the above is detecting pre-entrypoint execution attacks. If a Call Stack includes ‘LdrInitializeThunk’, we are at the very beginning of a thread’s execution. This is where the application compatibility Shim Engine operates, where hook-based security products prefer to install themselves, and where malware tries to gain execution before those security products.
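Once frames carry function names, checking a stack against the callback list is a simple lookup. The sketch below assumes frames are strings in the Module!function+offset format described earlier; the regular expression, the function name and the sample stack (including `payload.dll` and all offsets) are made up for illustration.

```python
import re

NTDLL_CALLBACKS = frozenset({
    "KiUserExceptionDispatcher",
    "KiUserApcDispatcher",
    "KiRaiseUserExceptionDispatcher",
    "KiUserCallbackDispatcher",
    "LdrInitializeThunk",
    "RtlUserThreadStart",
    "EtwpNotificationThread",
    "LdrHotPatchRoutine",
})

# Module!function with an optional +0x... offset
FRAME_RE = re.compile(
    r"^(?P<module>[^!]+)!(?P<function>[^+]+)(?:\+(?P<offset>0x[0-9a-fA-F]+))?$"
)


def callbacks_in_stack(frames: list[str]) -> list[str]:
    """Return the ntdll callback functions found in a symbolized stack."""
    hits = []
    for frame in frames:
        match = FRAME_RE.match(frame)
        if (
            match
            and match["module"].lower() == "ntdll.dll"
            and match["function"] in NTDLL_CALLBACKS
        ):
            hits.append(match["function"])
    return hits


stack = [
    "ntdll.dll!NtProtectVirtualMemory+0x14",
    "payload.dll!Run+0x40",
    "ntdll.dll!LdrInitializeThunk+0xe",
]
print(callbacks_in_stack(stack))  # ['LdrInitializeThunk']
```

A hit on its own is not malicious, since every thread passes through some of these functions; the signal comes from combining it with what the stack was doing at the time, such as changing memory protection before the entry point has run.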
Thus, we have covered the basics of Call Stacks, their relevance to detection engineering, and the future possibilities that would help us expand our detection methodologies further.