5 days lecture + hands-on lab
Driver Developers, Support Engineers and Software QA Engineers
This course teaches the architecture and internals of the Windows operating system, with emphasis on production debugging of kernel mode drivers. It helps attendees understand the behind-the-scenes workings of the Windows operating system and debug common crashes and hangs that occur during kernel mode code execution.
The hands-on lab familiarizes attendees with the debugging and instrumentation tools, the relevant debugger extension commands, and the interpretation of command output to investigate the state of device drivers and the system. It also covers debugging techniques to isolate faulting modules and root cause crashes and hangs caused by drivers.
Attendees must be familiar with the basic usage of the debugger (Debugging Tools for Windows). Basic usage includes use of the symbol server and the debugger commands for displaying call stacks, data structures, memory contents and system information. To get the most value from the course, attendees must be familiar with the Windows kernel API and the C programming language.
Upon completion of this course attendees will be able to:
Configure the host and target systems for live kernel debugging. Apply live debugging techniques to debug kernel mode issues.
Understand the effect that enabling Driver Verifier has on debugging and learn techniques to debug the resulting failures. Identify the Driver Verifier settings required to debug different categories of problems.
Understand the memory dump generation mechanism and configure the system to generate memory dumps for hangs and crashes.
Interpret the information displayed by the debugger's automated analyzer and identify subsequent analysis steps on memory dumps.
Understand how hardware failures cause system bug-checks and debug WHEA_UNCORRECTABLE_ERROR bug-checks.
Understand IRQLs and the restrictions they impose, and debug IRQL-related problems like IRQL_NOT_LESS_OR_EQUAL bug-checks.
Understand the different execution contexts and conditions under which kernel mode drivers execute. Deploy tools and apply techniques for debugging performance issues, intermittent CPU spikes and consistent high CPU usage in the system.
Understand APCs, critical and guarded regions, and process attachment, and use this knowledge to debug bug-checks such as APC_INDEX_MISMATCH and KERNEL_APC_PENDING_DURING_EXIT.
Understand system worker threads, work items and identify worker thread depletion issues causing system hangs.
Understand the principles of synchronization in kernel mode code and the different synchronization options that are provided by the kernel. Identify normal vs. stuck threads, modules and resources involved in kernel mode deadlocks and root cause the deadlocks.
Understand the distribution and utilization of system virtual address space. Identify causes of system PTE depletion and debug NO_MORE_SYSTEM_PTES bug-checks. Debug invalid memory accesses leading to bug-checks like KMODE_EXCEPTION_NOT_HANDLED and PAGE_FAULT_IN_NONPAGED_AREA.
Understand the stack usage in the kernel, stack jumping, causes of stack overflows and debug issues like double faults.
Understand the layout, types and utilization of pool memory. Debug pool depletion indicated by Event IDs 2010 and 2020. Detect corrupted data structures, identify the scope of corruption, isolate modules that are responsible for the corruption and debug BAD_POOL_CALLER bug-checks.
Understand how virtual memory is mapped to physical memory on x86 and x64 systems, along with memory locking, mapping and memory descriptor lists, and debug issues related to DMA.
Understand key I/O manager data structures and navigate between them. Understand the interactions between device drivers and the I/O manager. Find and identify drivers blocking I/O requests leading to system hangs. Debug bug-checks such as PROCESS_HAS_LOCKED_PAGES, NO_MORE_IRP_STACK_LOCATIONS and MULTIPLE_IRP_COMPLETE_REQUESTS, and identify the drivers that cause them.
Understand the behind-the-scenes workings of PnP and Power transitions and how device drivers respond to PnP and Power state changes. Identify drivers and device stacks responsible for blocking power IRPs resulting in DRIVER_POWER_STATE_FAILURE bug-checks.
Kernel Mode Debugging Tools: Debugging Tools for Windows; Collecting System Information; Gflags for Kernel Debugging; Performance Analysis Tools; Driver Verifier; Driver Verifier Logging
Kernel Architecture: Kernel Mode Components; System Service Dispatching; Process Context; Thread Context; Exception Handling; Trap Frames; Task State Segment (TSS); Context Structures; System Bug-Checks
Dump Generation: Live Debugging; Memory Dump Generation; Memory Dump Types & Contents; Hang vs. Crash Dumps; Types of Crashes; Memory Dump Navigation
Dump Analysis: Common Analysis Steps; Register Contexts; Analyzing System State; Identifying Faulting Modules; Hardware Failures; Access Violations; Assembly Language; Call Stacks
Kernel Mechanisms: Processor Control Region (PCR); IRQLs; Interrupt Service Routines (ISRs); Deferred Procedure Calls (DPCs); Asynchronous Procedure Calls (APCs); Intermittent CPU Spikes; High CPU Usage; System Threads; Work Items
Memory Manager: Kernel Virtual Address Space; Dynamic Kernel Space Management; SysPTE Depletion; Kernel Stacks; Stack Overflows and Double Faults; Page Table Entries (PTEs); Page Frame Number (PFN) Database; Memory Descriptor Lists (MDLs); Direct Memory Access (DMA) Issues; Pools and Look-aside Lists; Pool Corruption; Pool Depletion
Kernel Synchronization: Dispatcher Objects; Fast Mutexes and Guarded Mutexes; ERESOURCEs; Deadlocks; Spin Locks; Queued Spin Locks; Livelocks
I/O Manager: Driver Architecture & Entry Points; I/O Manager Data Structures; I/O Request Packet (IRP) Flow; IRP Processing; Synchronous and Asynchronous I/O Processing; Completion Routines; Stuck I/O Requests; Cancel Routines; I/O Cancelation Hangs
PnP & Power: Driver, Device Types and Device Nodes; Device Object Layering; PnP IRPs and PnP State Transitions; Device Enumeration; Device Startup Failures; Device Manager Error Codes; System, Device and CPU Power States; Power IRPs and Power State Transitions; Idle Power Management; Remote Wakeup; Power Watchdog Timeouts