This article discusses a compiler optimization that causes parameter values read from the stack to differ from the values that were actually passed. Since it is common practice, during debugging, to read parameter values directly from X86 call stacks, this behavior can be misleading. Although the assembly language code snippets used in this article are from Windows XP running on X86, the content discussed here applies to all versions of Windows.
On the X86 CPU, barring a few exceptions like functions using the fastcall calling convention, parameter values read from the call stack are assumed to be accurate. However, the example shown below seems to imply otherwise. The following kernel mode stack depicts a waiting thread on Windows XP:
ChildEBP RetAddr Args to Child
fc0b6c08 804dc6a6 ff8e3e18 ff8e3da8 804dc6f2 nt!KiSwapContext+0x2e
fc0b6c14 804dc6f2 00000103 ff95c16c 00000000 nt!KiSwapThread+0x46
fc0b6c3c 80616c2b 00000000 00000000 00000000 nt!KeWaitForSingleObject+0x1c2
fc0b6c68 805e6f85 0095c16c ff8e39a8 ff8e3a3c nt!IopCancelAlertedRequest+0x68
fc0b6c84 8057a510 ff953680 00000103 ff95c110 nt!IopSynchronousServiceTail+0xe1
fc0b6d38 804df06b 000007b0 00000770 00000000 nt!NtWriteFile+0x602
fc0b6d38 7c90eb94 000007b0 00000770 00000000 nt!KiFastCallEntry+0xf8
007cff30 00000000 00000000 00000000 00000000 ntdll!KiFastSystemCallRet
In the above stack, the first parameter to KeWaitForSingleObject(), i.e. the first value in the "Args to Child" column of its frame, is NULL (00000000). KeWaitForSingleObject() is a kernel function documented on MSDN, and its prototype is as follows:
NTSTATUS KeWaitForSingleObject (
    __in PVOID Object,
    __in KWAIT_REASON WaitReason,
    __in KPROCESSOR_MODE WaitMode,
    __in BOOLEAN Alertable,
    __in_opt PLARGE_INTEGER Timeout );
As seen from the prototype, the first parameter i.e. the pointer to the object to wait on, is probably the most important parameter to KeWaitForSingleObject(). And yet, that parameter being 0x00000000 does not cause this function to crash. The rest of the article investigates why this mandatory parameter to KeWaitForSingleObject() appears as NULL on the call stack.
The call stack shown above indicates that the caller to KeWaitForSingleObject() is the function IopCancelAlertedRequest().
Examining the assembler code of IopCancelAlertedRequest() shows that the value of the first parameter being passed in to KeWaitForSingleObject() is read from the ESI register. Because arguments are pushed right to left, the last push before the call (push esi) carries the first parameter.
80616c21 53 push ebx
80616c22 53 push ebx
80616c23 53 push ebx
80616c24 53 push ebx
80616c25 56 push esi
80616c26 e8255becff call nt!KeWaitForSingleObject (804dc750)
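To see why the final push supplies the first parameter, the push sequence above can be modeled with a small array that grows downward like the X86 stack. This is only an illustrative sketch: ToyStack, toy_push, toy_param and first_param_after_pushes are hypothetical names, not part of any Windows API.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of an x86 stack: 4-byte slots, growing toward lower indices. */
typedef struct {
    uint32_t slots[16];
    int top;                 /* index of the most recently pushed slot */
} ToyStack;

static void toy_init(ToyStack *s) { s->top = 16; }

static void toy_push(ToyStack *s, uint32_t value) {
    assert(s->top > 0);      /* guard against overflowing the toy stack */
    s->slots[--s->top] = value;
}

/* Parameter n (1-based) as laid out just before the call instruction. */
static uint32_t toy_param(const ToyStack *s, int n) {
    assert(n >= 1 && s->top + n - 1 < 16);
    return s->slots[s->top + n - 1];
}

/* Replays the five pushes made by IopCancelAlertedRequest(). */
static uint32_t first_param_after_pushes(uint32_t ebx, uint32_t esi) {
    ToyStack s;
    toy_init(&s);
    toy_push(&s, ebx);       /* Timeout    (parameter #5) */
    toy_push(&s, ebx);       /* Alertable  (parameter #4) */
    toy_push(&s, ebx);       /* WaitMode   (parameter #3) */
    toy_push(&s, ebx);       /* WaitReason (parameter #2) */
    toy_push(&s, esi);       /* Object     (parameter #1) */
    return toy_param(&s, 1);
}
```

With EBX holding zero and ESI holding the object address, the slot that the callee sees as parameter #1 contains the ESI value.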
The assembler code for the initial part of the function KeWaitForSingleObject(), i.e. the function prolog, shows the non-volatile registers (ebp, esi, edi, ebx) being saved on the stack. The compiler saves and later restores the values of these non-volatile registers since their values must be preserved across function calls. Out of these four non-volatile registers, the one of interest is the ESI register, which is saved by the push esi instruction at 804dc759:
804dc750 8bff mov edi,edi
804dc752 55 push ebp
804dc753 8bec mov ebp,esp
804dc755 83ec14 sub esp,0x14
804dc758 53 push ebx
804dc759 56 push esi
804dc75a 57 push edi
804dc75b 64a124010000 mov eax,fs:
804dc761 8b5518 mov edx,[ebp+0x18]
804dc764 8b5d08 mov ebx,[ebp+0x8]
By examining the contents of the stack frame for the function KeWaitForSingleObject(), the values of the parameters, return address, frame pointer, local variables and saved non-volatile registers can be deduced.
fc0b6c1c 00000103 ; Saved EDI
fc0b6c20 ff95c16c ; Saved ESI
fc0b6c24 00000000 ; Saved EBX
fc0b6c28 00000000 ; Start of Local Variable Area
fc0b6c38 00000000 ; End of Local Variable Area
fc0b6c3c fc0b6c68 ; Saved EBP (frame pointer)
fc0b6c40 80616c2b nt!IopCancelAlertedRequest+0x68 ; return address of the caller to KeWaitForSingleObject()
fc0b6c44 00000000 ; Parameter #1 = Object
fc0b6c48 00000000 ; Parameter #2 = WaitReason
fc0b6c4c 00000000 ; Parameter #3 = WaitMode
fc0b6c50 00000000 ; Parameter #4 = Alertable
fc0b6c54 00000000 ; Parameter #5 = Timeout
The stack location containing the saved value of the ESI register is at address 0xfc0b6c20, and the contents of that location are 0xff95c16c. This is the same value that the caller, IopCancelAlertedRequest(), would have passed in as the first parameter to KeWaitForSingleObject(), as explained earlier.
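The addresses in the frame layout above follow from the prolog: parameters sit above the saved EBP, while the local variable area (sub esp,0x14) and the three pushed registers sit below it. A minimal sketch of that arithmetic, assuming a standard EBP-based frame with 4-byte slots; the helper names are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Address of the slot holding the return address: [ebp+4]. */
static uint32_t return_address_slot(uint32_t ebp) {
    return ebp + 4;
}

/* Address of the n-th parameter (1-based): [ebp+8], [ebp+0xc], ... */
static uint32_t param_address(uint32_t ebp, unsigned n) {
    assert(n >= 1);
    return ebp + 8 + 4 * (n - 1);
}

/* Address of the k-th register pushed after "sub esp,local_bytes".
   In the prolog shown earlier: k=1 is EBX, k=2 is ESI, k=3 is EDI. */
static uint32_t saved_reg_address(uint32_t ebp, uint32_t local_bytes, unsigned k) {
    assert(k >= 1);
    return ebp - local_bytes - 4 * k;
}
```

For the frame in this article, EBP is 0xfc0b6c3c and the local area is 0x14 bytes, so param_address(0xfc0b6c3c, 1) gives 0xfc0b6c44 and saved_reg_address(0xfc0b6c3c, 0x14, 2) gives 0xfc0b6c20, matching the Object and saved ESI slots in the dump.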
Furthermore, details of the waiting thread reveal that it is indeed waiting on a valid notification event whose address is the same as the saved ESI value observed above. This indicates that when KeWaitForSingleObject() was called, the value of the first parameter was a valid pointer, i.e. 0xff95c16c.
THREAD ff8e3da8 Cid 0514.06cc Teb: 7ffde000 Win32Thread: 00000000 WAIT: (Executive) KernelMode Non-Alertable
Examining the assembler code of the function body for KeWaitForSingleObject() shows an instruction at address 0x804fbe3c (shown below) that writes the value 0x0 to the stack location at ebp+0x08. Based on the layout of the X86 call stack, this is also the location containing the first parameter passed to KeWaitForSingleObject().
804fbe3c 83650800 and dword ptr [ebp+0x8],0x0
804fbe40 e9b609feff jmp nt!KeWaitForSingleObject+0x9b (804dc7fb)
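In C terms, the and instruction above is a bitwise AND of the slot at [ebp+0x8] with zero, which always leaves zero in that slot. The sketch below models the three dwords starting at EBP as an array; the function names are hypothetical, and the frame contents are taken from the stack dump in this article.

```c
#include <assert.h>
#include <stdint.h>

/* C equivalent of "and dword ptr [ebp+0x8],0x0".
   frame_at_ebp[0] is [ebp], [1] is [ebp+4], [2] is [ebp+8]. */
static void and_zero_param_slot(uint32_t *frame_at_ebp) {
    assert(frame_at_ebp != 0);
    frame_at_ebp[2] &= 0x0u;
}

/* Builds a three-slot frame, applies the instruction, and returns
   what a debugger would then read as parameter #1. */
static uint32_t param1_after_overwrite(uint32_t original_param1) {
    uint32_t frame[3];
    frame[0] = 0xfc0b6c68u;      /* saved EBP */
    frame[1] = 0x80616c2bu;      /* return address */
    frame[2] = original_param1;  /* parameter #1 pushed by the caller */
    and_zero_param_slot(frame);
    return frame[2];
}
```

Whatever pointer the caller pushed, the slot reads back as zero once this instruction has executed.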
The above assembly sequence is generated by the compiler to perform a particular type of optimization which reduces the amount of stack space used by a function. When the compiler notices a code pattern, similar to the one shown in the following 'C' code, it concludes that the usage of a particular parameter and local variable are mutually exclusive within the function.
Function ( PVOID Parameter1 )
{
    if ( . . . )
    {
        . . .
        // Access Parameter1;
        // No access to Local1;
        . . .
    }
    else
    {
        . . .
        // Access to Local1;
        // No access to Parameter1;
        . . .
    }
}
Since the compiler observes that Parameter1 and Local1 are not used at the same time, it repurposes the memory locations containing the parameter "Parameter1" to store local variable "Local1".
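The effect of this slot sharing can be modeled by hand with a union, where the parameter and the local variable are two views of the same four bytes. This is a sketch of the idea only, not what the compiler literally emits: ParamSlot and function_model are hypothetical names, and a 32-bit integer stands in for the pointer.

```c
#include <assert.h>
#include <stdint.h>

/* One 4-byte stack slot, viewed either as the incoming parameter or as
   a local variable whose lifetime does not overlap with it. */
typedef union {
    uint32_t Parameter1;   /* value pushed by the caller */
    uint32_t Local1;       /* local that is live only when Parameter1 is dead */
} ParamSlot;

/* Returns what a debugger would read from the parameter slot afterwards. */
static uint32_t function_model(uint32_t object_address, int needs_parameter) {
    ParamSlot slot;
    slot.Parameter1 = object_address;   /* models the caller's push */

    if (needs_parameter) {
        /* Access Parameter1; no access to Local1. */
        assert(slot.Parameter1 == object_address);
    } else {
        /* Access Local1; no access to Parameter1. */
        slot.Local1 = 0;
    }
    return slot.Parameter1;             /* same bytes, whichever view wrote last */
}
```

On the path that uses Local1, reading the slot back as the parameter yields the local's value rather than the address the caller supplied.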
Using parameter memory on the stack to store local variables is common practice on the X86 CPU, since there are not many registers that the compiler can use during program execution to hold temporary/local variables.
On CPU architectures like X64, where there are a lot more registers available (i.e. R8 through R15), the need for doing such optimization diminishes as these extra registers can be used to cache temporary/local variables, which is much more efficient than storing them on the stack.
To sum up the observations above, at the time of the call to KeWaitForSingleObject(), the first parameter was passed in correctly but subsequently KeWaitForSingleObject() overwrote the value by zeroing out the contents of the location where the parameter was stored on the stack. So, once the function body had finished using the parameter value and did not need it anymore, it reused the stack space occupied by the parameter to store a local variable, leveraging the fact that according to the "C" calling convention the caller to KeWaitForSingleObject() would not depend on the parameter value upon return.
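The whole sequence can be replayed in a few lines: the caller pushes ESI as the first parameter, the callee's prolog saves ESI, and the callee's body later zeroes the parameter slot. The sketch below is a simplified model under the assumption that ESI still holds the object address when the prolog runs, as it did in this example; FrameView and simulate_wait_call are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* The two stack slots a debugger can inspect after the fact. */
typedef struct {
    uint32_t saved_esi;   /* written by the callee's "push esi" in the prolog */
    uint32_t param1;      /* written by the caller's "push esi", at [ebp+8] */
} FrameView;

static FrameView simulate_wait_call(uint32_t caller_esi) {
    FrameView f;
    f.param1 = caller_esi;      /* caller: push esi (parameter #1) */
    f.saved_esi = caller_esi;   /* callee prolog: push esi */
    f.param1 &= 0x0u;           /* callee body: and dword ptr [ebp+0x8],0x0 */
    return f;
}
```

After the call has progressed past the overwrite, the parameter slot reads as zero while the saved ESI slot still holds the original pointer. This is why, in this particular case, the saved register is a better source for the object address than the parameter column of the stack trace.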