This Channel 9 “going deep” interview with Mark Russinovich dives very deep into kernel changes in Windows 7 and Windows Server 2008 R2. I’ll do my best to summarise. It’s excellent stuff, illuminating:
- MinWin
- Much of Server Core’s first incarnation was there only to satisfy DLL dependencies, even though many of those DLLs were never used
- To get to MinWin they had to:
- Monitor DLL dependencies and reverse-engineer the dependency map
- Disentangle all of the unnecessary dependencies
- What was left was the smallest Windows instance that could be booted. It would not be functional as an operating system for building products on top of it, but the core functionality of the system could be booted and tested
- The refactoring of DLLs to disentangle unnecessary dependencies has yielded performance benefits
- Additionally, boot is parallelised more in Windows 7, also improving boot time
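The dependency untangling described in the MinWin notes can be pictured as a graph problem: map which modules pull in which, drop the edges that are never exercised, and see what remains reachable from the core. The sketch below is only an illustration of that idea, not how the Windows team actually did it. Every module name and edge in it is invented.

```python
from collections import deque

def closure(graph, roots):
    """Return every module reachable from roots by following dependency edges."""
    seen = set(roots)
    queue = deque(roots)
    while queue:
        module = queue.popleft()
        for dep in graph.get(module, ()):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen

def prune(graph, unused_edges):
    """Return a copy of graph without the edges listed in unused_edges."""
    return {
        module: [dep for dep in deps if (module, dep) not in unused_edges]
        for module, deps in graph.items()
    }

# Hypothetical modules and declared dependencies, invented for illustration.
declared = {
    "kernel_base": ["runtime", "shell_ui"],
    "runtime": ["crypto"],
    "shell_ui": ["graphics", "media"],
    "graphics": [],
    "media": ["graphics"],
    "crypto": [],
}

# Hypothetical result of monitoring: this edge is declared but never used.
unused = {("kernel_base", "shell_ui")}

before = closure(declared, ["kernel_base"])
after = closure(prune(declared, unused), ["kernel_base"])

print("needed before untangling:", sorted(before))
print("needed after untangling: ", sorted(after))
```

Removing a single unnecessary edge here takes three modules out of the bootable set, which is the kind of shrinkage the refactoring was after.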
- NUMA node changes and Multiple Processor Group mappings
- He demonstrates on a Windows server with 256 logical cores and 1 terabyte of memory
- The new grouping functionality is what allows Windows to scale beyond 64 CPUs, across four processor groupings of up to 64 processors each
- Because this system has 256 cores it needs to use all four groups
- Windows typically uses as few groups as possible
- On this system there are 16 NUMA nodes and there are 16 CPUs per NUMA node
- The nodes are spread across the four groups
- Each node of 16 CPUs will have “local” memory of 64GB
- These CPUs can access memory from other nodes but there is an associated performance hit
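The numbers for the demo machine hang together, and a few lines of arithmetic show how. This is a rough sketch that assumes CPUs and memory are divided evenly between nodes and that whole nodes are packed into groups in order. The function and its field names are made up for illustration and are not a Windows API.

```python
MAX_CPUS_PER_GROUP = 64  # per-group limit quoted in the notes above

def layout(logical_cpus, numa_nodes, total_memory_gb):
    """Work out a simple node-to-group layout, assuming an even split."""
    cpus_per_node = logical_cpus // numa_nodes
    # Assumption: a node is never split, so a group holds only whole nodes.
    nodes_per_group = MAX_CPUS_PER_GROUP // cpus_per_node
    groups = -(-numa_nodes // nodes_per_group)  # ceiling division
    return {
        "cpus_per_node": cpus_per_node,
        "nodes_per_group": nodes_per_group,
        "groups": groups,
        "local_memory_gb": total_memory_gb // numa_nodes,
        # node index -> group index, filling each group in order
        "node_to_group": {n: n // nodes_per_group for n in range(numa_nodes)},
    }

demo = layout(logical_cpus=256, numa_nodes=16, total_memory_gb=1024)
print(demo["groups"], "groups,",
      demo["nodes_per_group"], "nodes per group,",
      demo["local_memory_gb"], "GB local memory per node")
```

With 256 logical CPUs, 16 nodes and 1 TB of memory, this gives four groups of four nodes each and 64 GB of local memory per node, matching the figures from the demo.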
- Process reflection
- Allows capture of a process’s memory dump without having to shut it down, stall the process or write memory to disk
- Reflection temporarily freezes a process and captures a mini-dump
- Creates a cloned child process that gets captured
- The child process is not executable
- It’s possible to capture a dump in response to system events like CPU spikes, if this is specified in advance with procdump /r (a process-reflected dump)
- Dynamic Fair Share Scheduling (DFSS)
- Allows throttling of users in Remote Desktop Services
- “Gives users a budget”
He also covers some of the potential things to come in “Windows 8” such as MiniCore (GPU-as-CPU) support.
The Channel 9 interviewer really wants to get past the standard Microsoft line and there’s very little that’s not on the table for discussion. Similarities to Unix are discussed. Russinovich finishes by talking about his first experiences with Apple II ROMs. One for OS geeks.