Part 3 – User Mode Communication
We’ve covered how to grab a partial context for a kernel thread and how to construct a driver. Now it’s time to finally witness the fruits of our labour.
Once the driver has been installed[1], the first thing we need to do is make sure it is loaded and running. In the eyes of the system, drivers are nothing more than kernel services, so the normal StartService call will launch the driver if it isn’t already running.
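    // the access right is spelled SC_MANAGER_CREATE_SERVICE in the SDK headers
    SC_HANDLE manager = OpenSCManager(NULL, NULL, SC_MANAGER_CREATE_SERVICE);
    SC_HANDLE service = OpenService(manager, L"KStack", SERVICE_START);
    // the second parameter is an argument count, so pass 0 rather than NULL
    StartService(service, 0, NULL);
    CloseServiceHandle(service);
    CloseServiceHandle(manager);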
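StartService reports a failure when the driver is already running, with GetLastError returning ERROR_SERVICE_ALREADY_RUNNING, and for our purposes that case is just as good as a fresh start. Below is a minimal sketch of that decision as a pure function. The helper name DriverIsRunning is hypothetical and not part of the sample source, and the error constant is repeated locally (with its winerror.h value) only so the snippet builds without the Windows headers.

```cpp
#include <cstdint>

// Value of ERROR_SERVICE_ALREADY_RUNNING from winerror.h, repeated here so the
// sketch compiles without <windows.h>. Real code should use the SDK constant.
constexpr std::uint32_t kErrorServiceAlreadyRunning = 1056;

// Hypothetical helper (not in the sample source): decides whether the driver
// can be treated as running after a StartService call.
//   startServiceResult - the BOOL returned by StartService
//   lastError          - GetLastError() captured immediately afterwards
inline bool DriverIsRunning(int startServiceResult, std::uint32_t lastError)
{
    // A non-zero result means the start request succeeded.
    if (startServiceResult != 0)
        return true;

    // A failed call is still fine if the service was already running.
    return lastError == kErrorServiceAlreadyRunning;
}
```

When wiring this up, store the StartService result first and call GetLastError on the following line. Passing both calls directly as arguments leaves their evaluation order up to the compiler.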
After the driver has been started successfully, CreateFile can be used, as discussed in the previous article, to retrieve a handle to the device created in DriverEntry. The path to pass to CreateFile is that of the user-visible symlink we created. However, instead of using “\\DosDevices\\” to reference the namespace as in the driver, we use the familiar “\\.\” notation (written “\\\\.\\” in a C string literal, where each backslash has to be escaped). If we required it, the handle could also be opened for overlapped operations, but in our sample we don’t need this flexibility.
With a handle in our possession, we can use DeviceIoControl to make indirect calls to the driver routines we defined. We pass in the IOCTL IDs we defined and the in and out buffers as required. This makes our code that retrieves the partial context look like this:
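The two spellings of the same symlink are easy to mix up once string escaping gets involved, so here is a small sketch that builds both from a bare link name. The helper names are made up for illustration and do not appear in the sample source.

```cpp
#include <string>

// Hypothetical helper: builds the path a user-mode caller hands to CreateFile.
// Every backslash is doubled by C++ string escaping, so the literal below is
// the four characters \\.\ at run time.
inline std::wstring UserModeDevicePath(const std::wstring& linkName)
{
    return L"\\\\.\\" + linkName;
}

// Hypothetical helper: the kernel-side spelling of the same link, as a driver
// would write it when creating the symlink. Shown for comparison only.
inline std::wstring KernelModeLinkPath(const std::wstring& linkName)
{
    return L"\\DosDevices\\" + linkName;
}
```

Calling UserModeDevicePath(L"Global\\KStack") produces the same string as the L"\\\\.\\Global\\KStack" literal used in the CreateFile call in this article.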
    // The thread has to be suspended before trying to read its stack
    // or get its context
    SuspendThread(hThread);
    HANDLE hDriver = CreateFile(L"\\\\.\\Global\\KStack", GENERIC_READ, FILE_SHARE_READ,
                                NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    if(hDriver != INVALID_HANDLE_VALUE)
    {
        // the output buffer
        ThreadCtx thCtx;
        DWORD bytesCopied = 0;
        // call the driver in a synchronous manner,
        // passing in a pointer to the thread id we're interested in
        // and the thread context struct
        if(DeviceIoControl(hDriver, IOCTL_THREAD_CONTEXT, &threadId, sizeof(threadId),
                           &thCtx, sizeof(thCtx), &bytesCopied, NULL))
        {
            // convert the ThreadCtx to a normal CONTEXT struct
            ThreadCtxToCONTEXT(thCtx, context);
        }
    }
To let StackWalk64 read memory from the thread stack, we need to provide it with a custom routine that calls into our driver for kernel addresses. When we’re walking a normal user mode stack we pass NULL for this routine so StackWalk64 can use its own internal function (most likely ReadProcessMemory) to read the memory. As we did in the previous articles, before writing our own let’s see what Process Explorer’s custom routine looks like.
    ;
    ; procexp.exe - 0x00427AB0
    ; entire read memory function for kernel addresses
    ; C Signature is
    ; BOOL CALLBACK SymbolHelper::ReadMemory(HANDLE hProcess, DWORD64 lpBaseAddress, PVOID lpBuffer,
    ;                                        DWORD nSize, LPDWORD lpNumberOfBytesRead)
    ;
    mov ecx,dword ptr [esp+8]       ; lpBaseAddress (low dword) to ecx
    ; compare it with the maximum app address (contained in a global)
    cmp ecx,dword ptr ds:[49FAE0h]
    mov eax,1                       ; return true for user addresses
    jae 00427ACE                    ; if it is a kernel address jump to kernel section, otherwise continue
    mov edx,dword ptr [esp+18h]     ; lpNumberOfBytesRead to edx
    mov dword ptr [edx],0           ; dereference and set to 0
    ret 18h                         ; return true
    ;
    ; kernel section
    ; offsets to the function arguments change as the pushes take place
    ; as each one decreases the value of esp by 4
    ;
    mov eax,dword ptr [esp+10h]     ; move lpBuffer to eax
    push esi                        ; save esi
    mov esi,dword ptr [esp+18h]     ; mov nSize to esi
    push esi                        ; arg to function
    push eax                        ; arg to function
    push 8                          ; misc arg to function
    lea ecx,[esp+18h]               ; move address of lpBaseAddress to ecx
    push ecx                        ; arg to function
    push 83350030h                  ; deviceiocontrol code
    call 004221D0                   ; call deviceiocontrol wrapper
    mov ecx,dword ptr [esp+30h]     ; mov lpNumberOfBytesRead into ecx
    add esp,14h                     ; re-adjust stack pointer to pop arguments pushed
    mov edx,eax                     ; mov return val to edx
    neg edx                         ; negate, sets the carry flag if the return val was non-zero
    sbb edx,edx                     ; edx = 0xFFFFFFFF if carry is set, otherwise 0
    and edx,esi                     ; and the mask with nSize to get number of bytes read (all or none)
    mov dword ptr [ecx],edx         ; save in dereferenced lpNumberOfBytesRead
    pop esi                         ; restore esi from before function call
    ret 18h                         ; exit function
Because we can’t be sure what is on the thread’s stack, a check is made to see if the address to read from is a user mode one. If it is, the function exits signalling that 0 bytes have been read. If it isn’t, the driver is passed the address and size, and the return value of the function call is used to set how many bytes were read. All or none are the only outcomes because, as we saw last time, the Process Explorer driver validates the range before copying anything. Our version is more or less a straight asm to C conversion, but obtaining the driver handle and the DeviceIoControl call are within the same function.
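The neg / sbb / and sequence at the end of the disassembly is a common compiler idiom for "all or nothing" without a branch. As a rough illustration, here it is written out in C++ next to the readable equivalent. Both function names are invented for this sketch.

```cpp
#include <cstdint>

// Branch-free form, mirroring the three instructions:
//   neg edx      sets the carry flag when edx was non-zero
//   sbb edx,edx  leaves 0 - carry in edx, i.e. 0xFFFFFFFF or 0
//   and edx,esi  keeps nSize or clears it
inline std::uint32_t BytesReadBranchFree(std::uint32_t ioctlResult, std::uint32_t nSize)
{
    // Unsigned subtraction wraps, so 0 - 1 gives a mask with every bit set.
    const std::uint32_t mask = 0u - static_cast<std::uint32_t>(ioctlResult != 0);
    return mask & nSize;
}

// The same thing as most of us would write it by hand.
inline std::uint32_t BytesReadReadable(std::uint32_t ioctlResult, std::uint32_t nSize)
{
    return (ioctlResult != 0) ? nSize : 0u;
}
```

Either form reports the full requested size on success and zero on failure. There is no partial read.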
    BOOL CALLBACK SymbolHelper::ReadMemory(HANDLE hProcess, DWORD64 lpBaseAddress, PVOID lpBuffer,
                                           DWORD nSize, LPDWORD lpNumberOfBytesRead)
    {
        PVOID maxUserAddress = GetMaximumAppAddress();
        PVOID base = reinterpret_cast<PVOID>(lpBaseAddress);
        const bool isUserAddress = (base <= maxUserAddress);
        BOOL bSuccess = TRUE;
        // if the address is a user one, set bytesRead to 0 and return true
        if(isUserAddress)
        {
            *lpNumberOfBytesRead = 0;
        }
        // otherwise use the driver to read the kernel memory
        else
        {
            HANDLE hDriver = GetDriverHandle();
            if(hDriver != INVALID_HANDLE_VALUE)
            {
                ReadRequest req = {base, nSize};
                bSuccess = DeviceIoControl(hDriver, IOCTL_READ_MEMORY, &req, sizeof(req),
                                           lpBuffer, nSize, lpNumberOfBytesRead, NULL);
            }
            else
            {
                DisplayError(L"Couldn't open a handle to the driver", GetLastError());
                bSuccess = FALSE;
            }
        }
        return bSuccess;
    }
And as far as communicating with the driver goes, that’s it. Turning this into a stack trace is a simple matter of calling StackWalk64 in a loop, passing the returned addresses to SymFromAddr and printing its output [2]. The rest of the code in the sample application deals with support for the symbol handler as well as various wrappers and utility functions dealing with the driver service.
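To see why the walker needs a read routine at all, it can help to look at a toy model. The sketch below follows a classic x86 frame pointer chain, where [ebp] holds the caller's ebp and [ebp+4] holds the return address, and every memory access goes through a callback, just as StackWalk64 calls our ReadMemory. This is a simplification under stated assumptions: the real StackWalk64 also consults symbol and FPO data, and WalkFrames, ReadU32 and the fake stack contents are all invented for the illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <vector>

// Reads one 32-bit value at the given address and returns false if the
// address is unreadable. Stands in for the ReadMemory callback.
using ReadU32 = std::function<bool(std::uint32_t address, std::uint32_t& value)>;

// Toy walker: collects return addresses by following the saved ebp chain.
// Stops on a failed read, a null frame pointer, a chain that does not move
// towards higher addresses, or after maxFrames entries.
inline std::vector<std::uint32_t> WalkFrames(std::uint32_t ebp, const ReadU32& read,
                                             std::size_t maxFrames = 64)
{
    std::vector<std::uint32_t> returnAddresses;
    while (ebp != 0 && returnAddresses.size() < maxFrames)
    {
        std::uint32_t nextEbp = 0;
        std::uint32_t returnAddress = 0;
        if (!read(ebp, nextEbp) || !read(ebp + 4, returnAddress))
            break;
        returnAddresses.push_back(returnAddress);
        // The stack grows downwards, so a caller's frame sits at a higher
        // address. Anything else is the end of the chain or a corrupt stack.
        if (nextEbp <= ebp)
            break;
        ebp = nextEbp;
    }
    return returnAddresses;
}

// Made-up memory contents: three frames on a pretend kernel stack.
inline ReadU32 MakeFakeKernelStack()
{
    const std::map<std::uint32_t, std::uint32_t> memory = {
        {0xF0001000u, 0xF0001020u}, {0xF0001004u, 0x80501111u},
        {0xF0001020u, 0xF0001040u}, {0xF0001024u, 0x80502222u},
        {0xF0001040u, 0x00000000u}, {0xF0001044u, 0x80503333u},
    };
    return [memory](std::uint32_t address, std::uint32_t& value) -> bool {
        const auto it = memory.find(address);
        if (it == memory.end())
            return false;
        value = it->second;
        return true;
    };
}
```

Walking from 0xF0001000 with the fake reader yields the three return addresses in order, and each of them is what would then be handed to SymFromAddr. Swap the fake reader for one that calls the driver and the same loop reads a real kernel stack.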
That concludes the series on getting a kernel mode stack trace with helpful hints from Process Explorer. Here’s a zip full of source code for the driver and sample application, tested and built in Visual Studio 2008.
[1] Installation is achieved by copying the driver binary to the system32\drivers directory and registering it with the service control manager via CreateService (see the InstallDriver function in Utility.cpp in the source package). StartService will then start the driver.
[2] Sample Output:
Windows 2000 SP4: