Just Let It Flow

February 22, 2009

Grabbing Kernel Thread Call Stacks the Process Explorer Way – Part 3

Filed under: Code,Windows — adeyblue @ 5:27 am

Part 3 – User Mode Communication

We’ve covered how to grab a partial context for a kernel thread and how to construct a driver. Now it’s time to finally witness the fruits of our labour.

Once the driver has been installed[1], the first thing we need to do is make sure it is loaded and running. In the eyes of the system, drivers are nothing more than kernel services, so the normal StartService function will launch the driver if it isn’t already running.

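SC_HANDLE manager = OpenSCManager(NULL, NULL, SC_MANAGER_CREATE_SERVICE);
SC_HANDLE service = OpenService(manager, L"KStack", SERVICE_START);
// the driver takes no start arguments, so pass a count of 0 and no vector
StartService(service, 0, NULL);
CloseServiceHandle(service);
CloseServiceHandle(manager);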

After the driver has been started successfully, CreateFile can be used, as discussed in the previous article, to retrieve a handle to the device created in DriverEntry. The path to specify to CreateFile is the path to the user-visible symlink we created. However, instead of using “\\DosDevices\\” to reference the namespace like in the driver, we use the familiar “\\.\” notation. If we required it, the handle could also be opened for overlapped operations, but in our sample we don’t need this flexibility.
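The relationship between the two prefixes is easy to get wrong once C string escaping is involved, so here is a minimal sketch of the mapping. ToWin32DevicePath is a hypothetical helper, not part of the sample, and the kernel-side link name used below is an assumption based on the path passed to CreateFile.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: converts a kernel-side symbolic link name such as
// \DosDevices\Global\KStack into the \\.\Global\KStack form that CreateFile
// expects. Returns an empty string if the name lacks the \DosDevices\ prefix.
std::wstring ToWin32DevicePath(const std::wstring& kernelLinkName)
{
    const std::wstring kernelPrefix = L"\\DosDevices\\"; // \DosDevices\ once unescaped
    const std::wstring win32Prefix = L"\\\\.\\";          // \\.\ once unescaped
    if (kernelLinkName.compare(0, kernelPrefix.size(), kernelPrefix) != 0)
    {
        return std::wstring();
    }
    // keep everything after the prefix and put the Win32 prefix in front
    return win32Prefix + kernelLinkName.substr(kernelPrefix.size());
}
```

Feeding it L"\\DosDevices\\Global\\KStack" gives back the same string literal that is handed to CreateFile in the next listing.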

With a handle in our possession, we can use DeviceIoControl to make indirect calls to the driver routines we defined. We pass in the IOCTL ids we defined and the in and out buffers as required. This makes our code that retrieves the partial context look like this:

// Thread has to be suspended before trying to read its stack
// or get its context
SuspendThread(hThread);
HANDLE hDriver = CreateFile(L"\\\\.\\Global\\KStack", GENERIC_READ, FILE_SHARE_READ, 
                                      NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if(hDriver != INVALID_HANDLE_VALUE)
{
    // the output buffer
    ThreadCtx thCtx;
    DWORD bytesCopied = 0;
    // call the driver in a synchronous manner
    // passing in a pointer to the thread id we're interested in
    // and the thread context struct
    if(DeviceIoControl(hDriver, IOCTL_THREAD_CONTEXT, &threadId,
                             sizeof(threadId), &thCtx, sizeof(thCtx),
                             &bytesCopied, NULL))
    {
        // convert the threadctx to a normal context struct
        ThreadCtxToCONTEXT(thCtx, context);
    }
    // release the device handle once the request has completed
    CloseHandle(hDriver);
}
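The IOCTL ids handed to DeviceIoControl are ordinary 32-bit values whose fields are packed by the CTL_CODE macro. As a rough illustration, the sketch below rebuilds that packing in portable C++ and decodes 0x83350030, the control code Process Explorer pushes in the disassembly later in this article. MakeIoctl, DecodeIoctl and IoctlFields are names made up for this example. The values of our own IOCTLs come from the driver header and aren't repeated here.

```cpp
#include <cstdint>

// Fields of a Windows I/O control code, as packed by the CTL_CODE macro:
// (DeviceType << 16) | (Access << 14) | (Function << 2) | Method
struct IoctlFields
{
    std::uint32_t deviceType; // bits 31..16
    std::uint32_t access;     // bits 15..14 (0 = FILE_ANY_ACCESS)
    std::uint32_t function;   // bits 13..2
    std::uint32_t method;     // bits 1..0 (0 = METHOD_BUFFERED)
};

// Illustrative equivalent of CTL_CODE(deviceType, function, method, access)
constexpr std::uint32_t MakeIoctl(std::uint32_t deviceType, std::uint32_t function,
                                  std::uint32_t method, std::uint32_t access)
{
    return (deviceType << 16) | (access << 14) | (function << 2) | method;
}

// Splits a control code back into its four fields
IoctlFields DecodeIoctl(std::uint32_t code)
{
    IoctlFields fields;
    fields.deviceType = code >> 16;
    fields.access = (code >> 14) & 0x3;
    fields.function = (code >> 2) & 0xFFF;
    fields.method = code & 0x3;
    return fields;
}
```

Decoding 0x83350030 this way gives device type 0x8335, function 0xC, METHOD_BUFFERED and FILE_ANY_ACCESS.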

To let StackWalk64 read memory from the thread stack we need to provide it with a custom routine that calls into our driver for kernel addresses. When we’re dealing with normal user addresses we pass NULL so it can use its own internal function (most likely ReadProcessMemory) to read it. As we did in the previous articles, before writing our own let’s see what Process Explorer’s custom routine looks like.

;
; procexp.exe - 0x00427AB0
; entire read memory function for kernel addresses
; C Signature is
; BOOL CALLBACK SymbolHelper::ReadMemory(HANDLE hProcess, DWORD64 lpBaseAddress, PVOID lpBuffer, 
;DWORD nSize, LPDWORD lpNumberOfBytesRead)
;
mov         ecx,dword ptr [esp+8] ; lpBaseAddress to read to ecx
; compare it with the maximum app address (contained in a global)
cmp         ecx,dword ptr ds:[49FAE0h]
mov         eax,1 ; return true for user addresses
jae         00427ACE ; if it is a kernel address jump to kernel section, otherwise continue
mov         edx,dword ptr [esp+18h] ; lpNumberOfBytesRead to edx
mov         dword ptr [edx],0 ; dereference and set to 0
ret         18h  ; return true
;
; kernel section
; offsets to the function arguments change as the pushes take place
; as each one decreases the value of esp by 4
;
mov         eax,dword ptr [esp+10h] ; move lpBuffer to eax
push        esi  ; save esi
mov         esi,dword ptr [esp+18h] ; mov nSize to esi
push        esi  ; arg to function
push        eax  ; arg to function
push        8 ; arg to function, most likely the input size (the 8 byte address)
lea         ecx,[esp+18h] ; move address of lpBaseAddress to ecx
push        ecx  ; arg to function
push        83350030h ; deviceiocontrol code
call        004221D0 ; call deviceiocontrol wrapper
mov         ecx,dword ptr [esp+30h] ; mov lpNumberOfBytesRead into ecx
add         esp,14h ; re-adjust stack pointer to pop arguments pushed
mov         edx,eax ; copy return val to edx
neg         edx  ; negate, this sets the carry flag if the return val is non-zero
sbb         edx,edx ; edx = 0xFFFFFFFF if the carry flag is set, 0 otherwise
and         edx,esi ; and that mask with nSize to get number of bytes read (all or none)
mov         dword ptr [ecx],edx ; save in dereferenced lpNumberOfBytesRead
pop         esi  ; restore esi from before function call
ret         18h  ; exit function
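The neg / sbb / and sequence near the end of the listing is a branchless way of turning a zero or non-zero return value into a byte count. A minimal C++ sketch of the same arithmetic follows. BytesReadFromResult is a name invented for this example and does not appear in Process Explorer.

```cpp
#include <cstdint>

// Mirrors "neg edx / sbb edx,edx / and edx,esi".
// neg sets the carry flag when its operand is non-zero, and sbb edx,edx
// then computes edx - edx - carry, giving 0xFFFFFFFF or 0.
std::uint32_t BytesReadFromResult(std::uint32_t driverCallResult, std::uint32_t nSize)
{
    const std::uint32_t carry = (driverCallResult != 0) ? 1u : 0u;
    const std::uint32_t mask = 0u - carry; // all ones on success, zero on failure
    return mask & nSize;                   // nSize on success, 0 on failure
}
```

Any non-zero return value reports the full nSize, and a zero return value reports no bytes at all.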

Because we can’t be sure what is on the thread’s stack, a check is made to see if the address to read from is a user mode one. If it is, the function exits signalling that 0 bytes have been read. If it isn’t, the driver is passed the address and size and the return value of the function call is used to set how many bytes were read. All or none are the only outcomes because, as we saw last time, the Process Explorer driver validates the range before copying anything. Our version is more or less a straight asm to C conversion, but obtaining the driver handle and the DeviceIoControl call are within the same function.

BOOL CALLBACK SymbolHelper::ReadMemory(HANDLE hProcess, DWORD64 lpBaseAddress, 
PVOID lpBuffer, DWORD nSize, LPDWORD lpNumberOfBytesRead)
{
    PVOID maxUserAddress = GetMaximumAppAddress();
    PVOID base = reinterpret_cast<PVOID>(lpBaseAddress);
    const bool isUserAddress = (base <= maxUserAddress);
    BOOL bSuccess = TRUE;
    // if the address is a user one, set bytesRead to 0 and return true
    if(isUserAddress)
    {
        *lpNumberOfBytesRead = 0;
    }
    // otherwise use the driver to read the kernel memory
    else
    {
        HANDLE hDriver = GetDriverHandle();
        if(hDriver != INVALID_HANDLE_VALUE)
        {
            ReadRequest req = {base, nSize};
            bSuccess = DeviceIoControl(hDriver, IOCTL_READ_MEMORY, &req, sizeof(req),
            lpBuffer, nSize, lpNumberOfBytesRead, NULL);
        }
        else
        {
            DisplayError(L"Couldn't open a handle to the driver", GetLastError());
            bSuccess = FALSE;
        }
    }
    return bSuccess;
}
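If you want to exercise that decision logic without a driver installed, one option is to pass the address limit and the driver call in as parameters. The sketch below is only an illustration of that idea. ReadMemoryLogic, KernelReader and FakeKernelReader are hypothetical names that are not in the sample code, and the fake reader merely fills the buffer with a marker byte.

```cpp
#include <cstdint>
#include <cstring>
#include <functional>

// Hypothetical stand-in for the DeviceIoControl round trip. Returns true when
// the whole range was copied into the buffer.
using KernelReader = std::function<bool(std::uint64_t address, void* buffer, std::uint32_t size)>;

// Same decision logic as ReadMemory, with the address limit and the driver
// call injected so it can run on any machine.
bool ReadMemoryLogic(std::uint64_t address, void* buffer, std::uint32_t size,
                     std::uint32_t* bytesRead, std::uint64_t maxUserAddress,
                     const KernelReader& readKernel)
{
    if (address <= maxUserAddress)
    {
        *bytesRead = 0; // user address: report success but no data
        return true;
    }
    const bool ok = readKernel(address, buffer, size);
    *bytesRead = ok ? size : 0; // all or nothing, like the real driver call
    return ok;
}

// Fake reader for testing: pretends every kernel read succeeds and fills
// the buffer with 0xAB.
bool FakeKernelReader(std::uint64_t /*address*/, void* buffer, std::uint32_t size)
{
    std::memset(buffer, 0xAB, size);
    return true;
}
```

With a limit of 0x7FFEFFFF, an address such as 0x1000 leaves the buffer untouched and reports 0 bytes, while 0x80001000 goes through the injected reader.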

And as far as communicating with the driver goes, that’s it. Turning this into a stack trace is a simple matter of calling StackWalk64 in a loop, passing the returned addresses to SymFromAddr and printing its output [2]. The rest of the code in the sample application deals with support for the symbol handler as well as various wrappers and utility functions dealing with the driver service.

That concludes the series on getting a kernel mode stack trace with helpful hints from Process Explorer. Here’s a zip full of source code for the driver and sample application, tested and built in Visual Studio 2008.

KStack Driver and Sample

[1] Installation is achieved by copying the driver binary to the system32\drivers directory and registering it with the service control manager via CreateService (See the InstallDriver function in Utility.cpp in the source package). StartService will then start the driver.

[2] Sample Output:
Windows 2000 SP4: [screenshot]
Windows XP SP3: [screenshot]
Windows XP SP3: [screenshot]
