VO Betriebssysteme Windows
Operating Systems Lecture (VO Betriebssysteme): Windows - A Case Study
operating system: software that controls the operation of a computer and directs the processing of programs. (Merriam-Webster OnLine)
Andreas Schabus, Academic Relations, Microsoft Österreich GmbH
[email protected]
http://blogs.msdn.com/msdnat

Lecture Introduction: Operating System Structure
An operating system often consists of the following components:
- Process management
- Main memory management
- Secondary storage management
- Network management
- Protection mechanism
- Command interpreter system
Computer scientists should be able to map those concepts to real systems.
Different implementations lead to different behavior.
Understanding the implications prevents problems.

Agenda: Why Engage in Operating Systems; Windows Evolution; Windows Architecture; Memory Management Fundamentals; Virtual Address Translation

Why Engage in OS?
How to design an OS?
1. Make hardware assumptions
2. Identify problems to be solved
3. Determine the architecture
   - Abstractions and layers
The higher an abstraction sits in the software stack, the more it is circumvented, ignored, or upcalled.
Diagram: a stack of abstractions, lowest at the bottom and highest at the top.

OS - Traditional Picture
Diagram: USER, Apps, OS, Hardware.
- Hardware/OS is the platform
- Goal is to sell (proprietary) hardware
- Purpose is to run specific 'business' applications
- OS and other system software is incidental/subsidiary
- Economics allows apps to target particular HW/OS
- HW/OS often single-vendor (IBM mainframes)
  - Model also used by Burroughs, Digital, SUN, HP, Apple, ...
- Revenues sweetened through service & education

OS - Windows-like Ecosystem
Diagram: USER, OS, Apps, PC Hardware.
- OS is the platform, hardware is commoditized
- Goal is to grow the ecosystem
  - MS sells OS & Apps, OEMs/IHVs sell HW, ISVs sell Apps
- Purpose is consumer experience, running business apps
- OS is core
- Apps target the OS
- OEMs are the principal integrators
- Service & education are just more elements in the ecosystem

Windows and Linux Evolution
Windows and Linux kernels are based on foundations
developed in the mid-1970s
Timeline: 1970, 1980, 1990, 2000 (see http://www.levenez.com for diagrams showing history of Windows & Unix)

OS Design Environments
                   | UNIX (1970s)      | NT (1980/1990s)      | ?? (2000/2010s)
Address space      | 16b, swapping     | 32-bit, linear VM    | 64-bit, ??
CPU perf           | KIPS              | MIPS                 | GIPS
Synchr             | IRQL              | Test&Set, Comp&Swap  | Transactional memory?
Memory size        | KBytes            | MBytes               | GBytes
Hard concurrency   | none              | SMP                  | High-Multicore
Mass storage       | KBytes, slow seek | MBytes, slow seeks   | GBytes, no seeks; TBytes
Distrib. computing | Tape              | Client/server        | Peer-to-peer
Old OS designs can (of course) be ported, but:
- How would you design an OS on blank paper?
- How should the CPU system architecture evolve?

What problems do we need to solve in future operating systems?
Well-understood problems (sort of):
- Processes
- Threads
- Virtual memory
- Input/output
- Local file systems
- Client/server computing
- Virtual machines
- Network protocols (quasi-secure)
- Security technologies (ACLs, authentication)

What problems do we need to solve in future operating systems?
Difficult problems:
- Security models
- Application model
- System extensibility
- Configuration/state management
- Compatibility/fragility, versioning
- Data management
- Federated computing
- Industrial design
- Ecosystem

Agenda: Why Engage in Operating Systems; Windows Evolution; Windows Architecture; Memory Management Fundamentals; Virtual Address Translation

Requirements and Design Goals
- Provide a true 32-bit, preemptive, reentrant, virtual memory operating system
- Run and scale well on symmetric multiprocessing systems
- Be a great distributed computing platform (client & server)
- Run most existing 16-bit MS-DOS and Microsoft Windows 3.1 applications
- Meet government requirements for POSIX 1003.1 compliance
- Meet government and industry requirements for operating system security
- Be easily adaptable to the global market by supporting Unicode

Requirements and Design Goals
- Extensibility
  - Code must be able to grow and change as market requirements change.
- Portability
  - The system must be able to run on multiple hardware architectures and must be able to move with relative ease to new ones as market demands dictate.
- Reliability and robustness
  - Protection against internal malfunction and external tampering.
  - Applications should not be able to harm the OS or other running applications.
- Compatibility
  - User interface and APIs should be compatible with older versions of Windows as well as older operating systems such as MS-DOS.
  - It should also interoperate well with UNIX, OS/2, and NetWare.
- Performance
  - Within the constraints of the other design goals, the system should be as fast and responsive as possible on each hardware platform.
History of NT
- Team forms November 1988
- Developers from DEC and Microsoft
- Build from the ground up
  - Advanced commercial operating system
  - Designed for desktops and servers
  - Secure, scalable SMP design
  - All new code
- Initial effort targeted the Intel i860, code-named N10; hence the name NT, which doubled as "N-Ten" and "New Technology"

Overview of Windows Architecture
- Heritage is RSX-11 and VMS, not UNIX
- Kernel-based, microkernel-like
  - OS personalities in subsystems (i.e. for POSIX, OS/2, Win32)
  - Kernel focused on memory, processes, threads, IPC, I/O
  - Kernel implementation organized around the object manager
  - Win32 and other subsystems built on native NT APIs
  - System functionality heavily based on client/server computing
- Primary supported programming interfaces: Win32 (and .NET)
- NT APIs
  - Generally not documented (except for the DDK)
  - NT APIs are rich (many parameters)
- NTOS (kernel)
  - Implements the NT API
  - Drivers, file systems, protocol stacks are not in NTOS
  - Dynamic loading of drivers (.sys DLLs) is the extension model

Windows Kernel Evolution
- Basic kernel architecture has remained stable while the system has evolved
  - Windows 2000: major changes in the I/O subsystem (plug & play, power management, WDM), but the rest is similar to NT4
  - Windows XP & Server 2003: modest upgrades compared to the changes from NT4 to Windows 2000
- Internal version numbers confirm this:
  - Windows 2000 was 5.0
  - Windows XP 32-bit and IA64 editions are 5.1 (so is XP Embedded)
  - Windows Server 2003 is 5.2
  - Windows XP 64-bit Edition for x64 is also 5.2 (based on the Windows Server 2003 SP1 kernel)
  - Windows Vista is 6.0

NT Timeline: First 17 Years
- 2/1989: Coding begins
- 7/1993: NT 3.1
- 9/1994: NT 3.5
- 5/1995: NT 3.51
- 7/1996: NT 4.0
- 12/1999: NT 5.0, Windows 2000
- 8/2001: NT 5.1, Windows XP
- 3/2003: NT 5.2, Server 2003
- 8/2004: NT 5.1, Windows XP SP2
- 4/2005: NT 5.2, Windows XP 64 Bit Edition (& WS03SP1)
- 2006: NT 6.0, Windows Vista (client)

Agenda: Why Engage in Operating Systems; Windows Evolution; Windows Architecture; Memory Management
Fundamentals; Virtual Address Translation
Slides are based on materials of the Windows Operating System Internals Curriculum Development Kit, developed by David A. Solomon and Mark E. Russinovich with Andreas Polze.
http://www.microsoft.com/resources/sharedsource/Licensing/WindowsAcademic.mspx

Simplified OS Architecture
Diagram: User mode: system support processes, service processes, user applications, environment subsystems, subsystem DLLs. Kernel mode: Executive, Kernel, device drivers, windowing and graphics, Hardware Abstraction Layer (HAL).

Windows Kernel Attributes
- Reentrant
  - Kernel functions can be invoked by multiple threads simultaneously
  - No serialization of user threads when performing system calls
- Asynchronous
  - I/O system works fully asynchronously
  - Asynchronous I/O improves an application's throughput
  - Synchronous wrapper functions provide ease of programming
- Multiprocessor support

Microkernel OS?
- Is Windows a microkernel-based OS?
  - No, not using the academic definition (OS components and drivers run in their own private address spaces, layered on a primitive microkernel)
  - All kernel components live in a common shared address space
- Why not a pure microkernel?
  - Performance: separate address spaces would mean context switching to call basic OS services
  - Most other commercial OSs (Unix, VMS etc.) have the same design

Microkernel OS?
- But it does have some attributes of a microkernel OS
  - OS personalities running in user space as separate processes
  - Kernel-mode components don't reach into one another's data structures; they use formal interfaces to pass parameters and access and/or modify data structures
  - Therefore the term "modified microkernel"

Demo: User vs.
Kernel Mode

Windows Architecture
Diagram: User mode: applications, subsystem servers, DLLs, system services, Kernel32, critical services, Login/GINA, User32 / GDI, ntdll / run-time library. Trap interface / LPC separates user mode from kernel mode. Kernel mode: security refmon; I/O manager (net devices, net protocols, net interfaces, file filters, file systems, volume managers, device stacks); virtual memory; procs & threads; Win32 GUI; filesys run-time; scheduler; cache manager; synchronization; object manager / configuration management (registry); kernel run-time / Hardware Adaptation Layer.

HAL - Hardware Abstraction Layer
- Responsible for a small part of "hardware abstraction"
  - Components on the motherboard not handled by drivers: system timers, cache coherency and flushing, SMP support, hardware interrupt priorities
- Subroutine library for the kernel & device drivers
  - Isolates Kernel and Executive from platform-specific details
  - Presents a uniform model of the I/O hardware interface to drivers
- Reduced role as of Windows 2000
  - Bus support moved to bus drivers
  - Majority of HALs are vendor-independent
- HAL also implements some functions that appear to be in the Executive and Kernel
- Selected at installation time
  - See \windows\repair\setup.log to find out which one
  - Can select manually at boot time with /HAL= in boot.ini
- HAL kit
  - Special kit only for vendors that must write custom HALs (requires approval from Microsoft)

Kernel
- Lower layers of the operating system
  - Implements processor-dependent functions (x86 vs. Itanium etc.)
  - Also implements many processor-independent functions that are closely associated with processor-dependent functions
- Main services
  - Thread waiting, scheduling & context switching
  - Exception and interrupt dispatching
  - Operating system synchronization primitives (different for MP vs.
UP)
  - A few of these are exposed to user mode
- Not a classic "microkernel": shares address space with the rest of the kernel-mode components

Executive
- Upper layer of the operating system
- Provides "generic operating system" functions ("services")
  - Process Manager
  - Object Manager
  - Cache Manager
  - LPC (local procedure call) Facility
  - Configuration Manager
  - Memory Manager
  - Security Reference Monitor
  - I/O Manager
  - Power Manager
  - Plug-and-Play Manager
- Almost completely portable C code
- Runs in kernel ("privileged", ring 0) mode
- Most interfaces to executive services are not documented

NTDLL.DLL
Support library for use by subsystem DLLs:
- System service dispatch stubs to NT executive system services
  - NtCreateFile, NtSetEvent
  - More than 200
  - Most of them are accessible through Win32
- Stubs call the service dispatcher / kernel-mode service in NTOSKRNL.EXE
- Support functions used by subsystems
  - Image loader (Ldr...)
  - Heap manager
  - Win32 subsystem communication functions (Csr...)
  - Runtime library functions (Rtl...)
  - User-mode asynchronous procedure call (APC) dispatcher, exception dispatcher
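The stub-to-dispatcher relationship described for NTDLL.DLL can be pictured with a small toy model. This is only a sketch under loud assumptions: every name, number, and signature below (toy_dispatch, ToySetEvent, the two-entry table) is hypothetical, and nothing here reproduces the real NT service table, the trap mechanism, or the actual ntdll stubs. It shows only the idea that each stub carries a service number and the dispatcher uses that number to index a table of kernel-mode services.

```c
#include <stddef.h>

/* Toy model only: all names, numbers and signatures are hypothetical. */
typedef long toy_status;
#define TOY_SUCCESS          0L
#define TOY_INVALID_SERVICE  (-1L)

/* Each pretend kernel-mode service takes one opaque argument. */
typedef toy_status (*toy_service)(void *arg);

static toy_status toy_svc_set_event(void *arg) {
    *(int *)arg = 1;               /* mark the toy event as signaled */
    return TOY_SUCCESS;
}

static toy_status toy_svc_reset_event(void *arg) {
    *(int *)arg = 0;               /* mark the toy event as not signaled */
    return TOY_SUCCESS;
}

/* Service numbers double as indexes into the dispatch table. */
enum { TOY_SVC_SET_EVENT = 0, TOY_SVC_RESET_EVENT = 1, TOY_SVC_COUNT = 2 };

static const toy_service toy_service_table[TOY_SVC_COUNT] = {
    toy_svc_set_event,
    toy_svc_reset_event
};

/* Stand-in for the kernel-mode service dispatcher: validate the
   service number, then call through the table. */
toy_status toy_dispatch(unsigned number, void *arg) {
    if (number >= (unsigned)TOY_SVC_COUNT) {
        return TOY_INVALID_SERVICE;
    }
    return toy_service_table[number](arg);
}

/* Stand-ins for user-mode stubs: each one only supplies its service
   number and forwards its argument. In a real system this step would
   be a trap into kernel mode rather than a plain function call. */
toy_status ToySetEvent(int *event)   { return toy_dispatch(TOY_SVC_SET_EVENT, event); }
toy_status ToyResetEvent(int *event) { return toy_dispatch(TOY_SVC_RESET_EVENT, event); }
```

A caller would declare an int event = 0, call ToySetEvent(&event), and observe the flag set to 1; an out-of-range service number passed to toy_dispatch is rejected with TOY_INVALID_SERVICE instead of indexing past the table.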
Device Drivers
- Loadable kernel modules
- Don't manipulate hardware directly, but call parts of the HAL
  - Typically written in C/C++
  - Source code portable across CPU architectures
- Types:
  - Hardware device drivers: implement device/network I/O
  - File system drivers: file I/O <-> device I/O
  - Filter drivers: disk mirroring, encryption
  - Network redirectors and servers: send/receive remote I/O requests

I/O Objects
- Driver Object: represents a loaded driver
  - Creates device objects for the devices it manages
- Device Object: represents an instance of a device
  - Can have names for direct access from applications and other drivers
- File Object: represents an open instance of a device
  - Created by the I/O Manager
  - Process handle table entries for open files/devices point at file objects

I/O Request Flow
Diagram: a user-mode process calls DeviceIoControl; in kernel mode, NtDeviceIoControlFile goes from the process handle table to the file object, device object, and driver object; an IRP is passed through the driver's dispatch table to the driver code as DispatchDeviceControl(DeviceObject, Irp).

Driver Layering and Filtering
To divide functionality across drivers, provide added value, etc.
- Only the lowest layer talks to the I/O hardware
- "Filter drivers" attach their devices to other devices
  - They see all requests first and can manipulate them
  - Example filter drivers: file system filter driver, bus filter driver
Diagram: Process (user mode); kernel mode: System Services, I/O Manager, File System Driver, Volume Manager Driver, Disk Driver, with an IRP passed down the stack.

Vista I/O Enhancements
- I/O priorities: device drivers that use the I/O Manager for device queues will prioritize IRPs
  - Based on the priority of the issuing thread or the explicitly set I/O priority
  - Stored in IRP flags
  - 6 priority levels (0-5)
- Cancellable synchronous I/O: synchronous I/O operations, including "open", can be cancelled
  - Explorer hangs on network resources can be aborted
- I/O completion no longer requires Asynchronous Procedure Calls
  - Significant performance improvement on > 4-way systems

Security Reference Monitor
- Implements the common object access model shared by all kernel subsystems
  - Exposes the model for use by applications
- Performs object access checks (authorization), manipulates privileges, and generates audit messages
  - Core function: SeAccessCheck (user-mode version: AccessCheck)

Agenda: Why Engage in Operating Systems; Windows Evolution; Windows Architecture; Memory Management Fundamentals; Virtual Address Translation
Slides are based on materials of the Windows Operating System Internals Curriculum Development Kit, developed by David A. Solomon and Mark E. Russinovich with Andreas Polze.
http://www.microsoft.com/resources/sharedsource/Licensing/WindowsAcademic.mspx

Windows API Memory Management Architecture
Diagram, top to bottom:
- Windows program
- C library: malloc, free
- Heap API: HeapCreate, HeapDestroy, HeapAlloc, HeapFree
- Memory-Mapped Files API: CreateFileMapping, MapViewOfFile
- Virtual Memory API
- Windows kernel with Virtual Memory Manager
- Physical memory; disk & file system

Windows Memory Management Fundamentals
- Classical virtual memory management
  - Flat virtual address space per process
  - Private process address space
  - Global system address space
  - Per-session address space
- Object based
  - Section object and object-based security (ACLs...)
- Demand-paged virtual memory
  - Pages are read in on demand & written out when necessary (to make room for other memory needs)
- Provides flat virtual address space
  - 32-bit: 4 GB, 64-bit: 16 exabytes (theoretical)

Windows Memory Management Fundamentals
- Lazy evaluation
  - Sharing: usage of prototype PTEs (page table entries)
  - Extensive usage of copy-on-write
  - ...whenever possible
- Shared memory with copy-on-write
- Mapped files (fundamental primitive)
  - Provides basic support for the file system cache manager

Memory Manager Components
- System services for allocating, deallocating, and managing virtual memory
- An access-fault trap handler for resolving hardware-detected memory management exceptions and making virtual pages resident on behalf of a process
- Six system threads
  - Working set manager (priority 16): drives overall memory management policies, such as working set trimming, aging, and modified page writing
  - Process/stack swapper (priority 23): performs both process and kernel thread stack inswapping and outswapping
  - Modified page writer (priority 17): writes dirty pages on the modified list back to the appropriate paging files
  - Mapped page writer (priority 17): writes dirty pages from mapped files to disk
  - Dereference segment thread (priority 18): is responsible for cache and page file growth and shrinkage
  - Zero page thread
(priority 0): zeros out pages on the free list

Protecting Memory
Attribute              | Description
PAGE_NOACCESS          | Read/write/execute causes access violation
PAGE_READONLY          | Write/execute causes access violation; read permitted
PAGE_READWRITE         | Read/write accesses permitted
PAGE_EXECUTE           | Any read/write causes access violation; execution of code is permitted (relies on special processor support)
PAGE_EXECUTE_READ      | Read/execute access permitted (relies on special processor support)
PAGE_EXECUTE_READWRITE | All accesses permitted (relies on special processor support)
PAGE_WRITECOPY         | Write access causes the system to give the process a private copy of this page; attempts to execute code cause access violation
PAGE_EXECUTE_WRITECOPY | Write access causes creation of a private copy of the page
PAGE_GUARD             | Any read/write attempt raises EXCEPTION_GUARD_PAGE and turns off guard page status

Physical Memory Limits (in GB)
                         | x86 | x64 32-bit | x64 64-bit | I64 64-bit
XP Home                  | 4   | 4          | n/a        | n/a
XP Professional          | 4   | 4          | 16         | n/a
Server 2003 Web Edition  | 2   | 2          | n/a        | n/a
Server 2003 Standard     | 4   | 4          | 16         | n/a
Server 2003 Enterprise   | 32  | 32         | 64         | 64
Server 2003 Datacenter   | 64  | 128        | 1024       | 1024

Agenda: Why Engage in Operating Systems; Windows Evolution; Windows Architecture; Memory Management Fundamentals; Virtual Address Translation

Virtual Memory Concepts
- An application always references "virtual addresses"
- Hardware and software translate, or map, virtual addresses to physical addresses
- Not all of an application's virtual address space is in physical memory at one time...
  - ...but hardware and software fool the application into thinking that it is
  - The rest is kept on disk, and is brought into physical memory automatically as needed

Virtual Address Descriptors (VADs)
- Memory manager uses a demand paging algorithm
- Lazy evaluation is also used to construct page tables
  - Reserved vs. committed memory
  - Even for committed memory, page tables are constructed on demand
- Memory manager maintains VAD structures to keep track of reserved virtual addresses
  - Self-balancing binary tree
- A VAD stores:
  - The range of addresses being reserved
  - Whether the range will be shared or private
  - Whether a child process can inherit the contents of the range
  - The page protection applied to pages within the address range

Mapping Virtual to Physical Pages
Diagram: virtual pages in the address space (00000000, 7FFFFFFF, 80000000, C0000000, C1000000, FFFFFFFF) are mapped through page table entries to physical memory. Successive page table entries describe successive virtual pages, pointing to "scattered" (i.e. not physically contiguous) physical pages.

Sample - PDE Definition
    typedef struct _HARDWARE_PTE_X86 {
        ULONG Valid : 1;
        ULONG Write : 1;
        ULONG Owner : 1;
        ULONG WriteThrough : 1;
        ULONG CacheDisable : 1;
        ULONG Accessed : 1;
        ULONG Dirty : 1;
        ULONG LargePage : 1;
        ULONG Global : 1;
        ULONG CopyOnWrite : 1;      // software field
        ULONG Prototype : 1;        // software field
        ULONG reserved : 1;         // software field
        ULONG PageFrameNumber : 20;
    } HARDWARE_PTE_X86, *PHARDWARE_PTE_X86;

Address Translation
- Mapping virtual addresses to physical memory
- Mapping via page table entries
- Indirect relationship between virtual pages and physical memory
Diagram: user and system virtual pages are mapped through page table entries to physical memory. x86 virtual address layout: bits 31-22 page directory index (10 bits), bits 21-12 page table index (10 bits), bits 11-0 byte index (12 bits).

Shared and Private Pages
Diagram: the address spaces of Process A and Process B (00000000, 7FFFFFFF, 80000000, C0000000, C1000000, FFFFFFFF) both map into physical memory. For shared pages, multiple processes' PTEs point to the same physical pages.

32-bit x86 Address Space
Default: 2 GB user process space, 2 GB system space. 3 GB
user space: 3 GB user process space, 1 GB system space.

Increased Limits in 64-bit Windows
                    | Itanium | x64     | x86
User address space  | 7152 GB | 8192 GB | 2-3 GB
Page file limit     | 16 TB   | 16 TB   | 4095 MB (PAE: 16 TB)
Max page file space | 256 TB  | 256 TB  | ~64 GB
System PTE space    | 128 GB  | 128 GB  | 1.2 GB
System cache        | 1 TB    | 1 TB    | 960 MB
Paged pool          | 128 GB  | 128 GB  | 470-650 MB
Non-paged pool      | 128 GB  | 128 GB  | 256 MB

32-bit x86 Virtual Address Space
Diagram: address boundaries 00000000, 7FFFFFFF, 80000000, C0000000, FFFFFFFF, with three kinds of region:
- Unique per process, accessible in user or kernel mode. Code: EXE/DLLs. Data: EXE/DLL static storage, per-thread user-mode stacks, process heaps, etc.
- Per process, accessible only in kernel mode: process page tables, hyperspace
- System wide, accessible only in kernel mode. Code: NTOSKRNL, HAL, drivers. Data: kernel stacks, file system cache, non-paged pool, paged pool
2 GB per-process
- Address space of one process is not directly reachable from other processes
2 GB system-wide
- The operating system is loaded here, and appears in every process's address space
- The operating system is not a process (though there are processes that do things for the OS, more or less in the "background")
3 GB user space option & Address Windowing Extensions (AWE) described later

Address Translation: 32-bit Windows Hardware Support (Intel x86)
- Intel x86 provides two levels of address translation
  - Segmentation (mandatory, since 8086)
  - Paging (optional, since 80386)
- Segmentation: first level of address translation
  - Intel: logical address (selector:offset) to linear address (32 bits)
  - Windows virtual address is the Intel linear address (32 bits)
- Paging: second level of address translation
  - Intel: linear address (32 bits) to physical address
  - Windows: virtual address (32 bits) to physical address
  - Physical address: 32 bits (4 GB) in all Windows versions, 36 bits (64 GB) with PAE
  - Page size: 4 KB since 80386 (all Windows versions); 4 MB since Pentium Pro (supported in NT 4, Windows 2000/XP/2003)

Intel x86 Segmentation
Diagram: an Intel logical address is a segment selector plus an offset. Selector fields: bits 15-3 Index,
bit 2 TI = 0, bits 1-0 RPL = 0; the offset occupies bits 31-0. The selector picks a descriptor in the Global Descriptor Table (GDT) with access rights, limit = 0xfffff, and base address = 0. Base address plus offset gives the Intel linear address, which is the Windows virtual address (0 to 0xffffffff).

Intel x86 Paging Address Translation
Diagram: the Intel linear address (the Windows virtual address) is split into bits 31-22, bits 21-12, and bits 11-0. CR3 holds the physical address of the page directory (1024 x 4-byte entries, one per process). A 4 KB PDE points to a page table (1024 entries), whose PTE selects a 4 KB page frame; a 4 MB PDE maps a 4 MB page frame directly, with a 22-bit offset. Physical page frames are tracked in the Windows PFN (Page Frame Number) database.
- Page tables are created on demand

Interpreting a Virtual Address
x86 32-bit:
- Bits 31-22: page table selector (10 bits)
- Bits 21-12: page table entry selector (10 bits)
- Bits 11-0: byte within page (12 bits)
x64 64-bit (48-bit in today's processors):
- Bits 47-39: page map level 4 selector (9 bits)
- Bits 38-30: page directory pointer selector (9 bits)
- Bits 29-21: page table selector (9 bits)
- Bits 20-12: page table entry selector (9 bits)
- Bits 11-0: byte within page (12 bits)

Windows Virtual Memory Use: Performance Counters
Performance counter               | System variable                              | Description
Memory: Committed Bytes           | MmTotalCommittedPages                        | Amount of committed private address space that has a backing store
Memory: Commit Limit              | MmTotalCommitLimit                           | Amount of memory (in bytes) that can be committed without increasing the size of the paging file
Memory: % Committed Bytes in Use  | MmTotalCommittedPages / MmTotalCommitLimit   | Ratio of committed bytes to commit limit

x86 Virtual Address Translation
Diagram: CR3 holds the physical address of the page directory. The page table selector indexes the page directory, the page table entry selector indexes the selected page table, and the resulting physical page number ("page frame number" or "PFN") is combined with the byte within page.
- Page directory: one per process, 1024 entries
- Page tables: up to 512 per process, plus up to 512 system-wide
- Physical pages: up to 2^20

Virtual Address Translation
The hardware converts each valid virtual address to a physical address.
Diagram: the virtual address consists of a virtual page number and a byte within page; address translation (hardware) uses the page directory,
page tables, and the translation lookaside buffer (a cache of recently used page table entries) to obtain the physical page number, which is combined with the byte within page to form the physical address. If the page is not valid, a page fault is raised (an exception, handled by software).

System and Process-Private Page Tables
Diagram: in each process's page directory, the private entries PDE 0 to PDE 511 point to that process's own page tables, while PDE 512 to PDE n point to the shared system page tables (Sys PTE 0 to Sys PTE n).
- On process creation, system space page directory entries point to existing system page tables
- Not all processes have the same view of system space (after allocation of new page tables)

Page Table Entries
- Page tables are arrays of Page Table Entries (PTEs)
- Valid PTEs have two fields:
  - Page Frame Number (PFN)
  - Flags describing state and protection of the page
- Reserved bits are used only when the PTE is not valid
Diagram: bits 31-12 hold the page frame number. The low 12 bits, from bit 11 down to bit 0, are: reserved (write bit on MP systems), reserved, reserved, global, reserved (large page if PDE), dirty, accessed, cache disabled, write through, owner, write, valid.

PTE Status and Protection Bits (Intel x86 only)
Name of bit    | Meaning on x86
Accessed       | Page has been read
Cache disabled | Disables caching for that page
Dirty          | Page has been written to
Global         | Translation applies to all processes (a translation buffer flush won't affect this PTE)
Large page     | Indicates that the PDE maps a 4 MB page (used to map the kernel)
Owner          | Indicates whether user-mode code can access the page or whether the page is limited to kernel-mode access
Valid          | Indicates whether the translation maps to a page in physical memory
Write through  | Disables caching of writes; immediate flush to disk
Write          | Uniprocessor: indicates whether the page is read/write or read-only. Multiprocessor: indicates whether the page is writable; the write bit is kept in a reserved
bit.

Translation Look-Aside Buffer (TLB)
- Address translation requires two lookups:
  - Find the right table in the page directory
  - Find the right entry in the page table
- Most CPUs cache address translations
  - Array of associative memory: translation look-aside buffer (TLB)
  - TLB: virtual-to-physical page mappings of the most recently used pages
Diagram: a lookup for virtual page 17 is compared against all TLB entries simultaneously (simultaneous read and compare). Entries: virtual page 5 -> page frame 290; virtual page 64 -> invalid; virtual page 17 -> page frame 1004; virtual page 7 -> invalid; virtual page 65 -> page frame 801.

Page Fault Handling
- A reference to an invalid page is called a page fault
- Kernel trap handler dispatches:
  - Memory manager fault handler (MmAccessFault) is called
  - Runs in the context of the thread that incurred the fault
  - Attempts to resolve the fault or raises an exception
- Page faults can be caused by a variety of conditions
- Four basic kinds of invalid Page Table Entries (PTEs)

In-Paging I/O due to Access Faults
- Accessing a page that is not resident in memory but is on disk in a page file or mapped file
  - Allocate memory and read the page from disk into the working set
- Occurs when a read operation must be issued to a file to satisfy a page fault
  - Page tables are pageable -> additional page faults possible
- In-page I/O is synchronous
  - Thread waits until the I/O completes
  - Not interruptible by asynchronous procedure calls

In-Paging I/O due to Access Faults
- During in-page I/O, the faulting thread does not own critical memory management synchronization objects
- Other threads in the process may issue VM functions, but:
  - Another thread could have faulted the same page: collided page fault
  - The page could have been deleted (remapped) from the virtual address space
  - Protection on the page may have changed
  - The fault could have been for a prototype PTE, and the page that maps the prototype PTE could have been out of the working set

Other Reasons for Access Faults
- Accessing a page that is on the standby or modified list: transition the page to the process or system working set
- Accessing a page that has no committed storage: access violation
- Accessing a kernel page from user mode: access violation
- Writing to a read-only page: access violation

Reasons for Access Faults (contd.)
- Writing to a guard page: guard page violation (if it is a reference to a user-mode stack, perform automatic stack expansion)
- Writing to a copy-on-write page: make a process-private copy of the page and replace the original in the process or system working set
- Referencing a page in system space that is valid but not in the process page directory (if paged pool expanded after the process directory was created): copy the page directory entry from the master system page directory structure and dismiss the exception
- On a multiprocessor system, writing to a valid page that has not yet been written to: set the dirty bit in the PTE

Invalid PTEs and Their Structure
- Page file: the desired page resides in a paging file; an in-page operation is initiated
  PTE layout: bits 31-12 page file offset; bit 11 transition; bit 10 prototype; bits 9-5 protection; bits 4-1 page file number; bit 0 valid (0)
- Demand zero: the pager looks at the zero page list; if the list is empty, the pager takes a page from the standby list and zeros it. The PTE format is as shown above, but the page file number and offset are zeros

Invalid PTEs and Their Structure (contd.)
- Transition: the desired page is in memory on either the standby, modified, or modified-no-write list
  - The page is removed from the list and added to the working set
  PTE layout: bits 31-12 page frame number; bit 11 transition (1); bit 10 prototype; bits 9-5 protection; bit 4 cache disable; bit 3 write through; bit 2 owner; bit 1 write; bit 0 valid (0)
- Unknown: the PTE is zero, or the page table does not yet exist
  - Examine the virtual address descriptors (VADs) to see whether this virtual address has been reserved
  - Build page tables to represent newly committed space

Understanding the implications prevents problems.

Windows - A Case Study
Agenda: Why Engage in Operating Systems; Windows Evolution; Windows Architecture; Memory Management Fundamentals; Virtual Address Translation

"Modern" Operating Systems
Timeline diagram (1960 to 1990 and beyond): Multics, Unix, VMS, Windows (NT), Linux.
- Design parameters then: scarce resources, benign environment, knowledgeable and trained users
- Design parameters now? Malicious environment, untrained users, safe micro-kernel (e.g. Singularity), virtualization, convergence of DB & OS

Works Cited
(iStockphoto) http://www.istockphoto.com
(Merriam-Webster) "Merriam-Webster OnLine Search". http://www.merriam-webster.com/dictionary/operating%20system
(Probe) Probe, Dave. "Microsoft Academic Days Toronto".
(Solomon) Solomon, David A., Russinovich, Mark E., Polze, Andreas. "Windows Operating System Internals Curriculum". http://www.microsoft.com/resources/sharedsource/Licensing/WindowsAcademic.mspx

Additional Readings
(Russinovich 2005) Russinovich, Mark E., Solomon, David A. Microsoft Windows Internals. Redmond, WA: Microsoft Press, 2005.
(Zachary) Zachary, Pascal G. Show Stopper!: The Breakneck Race to Create Windows NT.

© 2007 Microsoft Corporation. All rights reserved. This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.
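
As a supplement to the x86 slides on virtual address translation, the 10/10/12-bit split and the two-level lookup can be sketched in portable C. This is a simplified model under stated assumptions, not the Windows memory manager or the processor's actual walk: the toy_ types and functions are hypothetical, the page directory holds C pointers instead of physical frame numbers, only the valid bit is modeled, and large pages, PAE, and the TLB are ignored.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_ENTRIES    1024u   /* entries per page directory / page table */
#define TOY_PTE_VALID  0x1u    /* bit 0: valid */
#define TOY_PAGE_SHIFT 12u     /* 4 KB pages */

/* Field extraction for a non-PAE x86 virtual address: 10 + 10 + 12 bits. */
static uint32_t toy_pde_index(uint32_t va)   { return va >> 22; }
static uint32_t toy_pte_index(uint32_t va)   { return (va >> 12) & 0x3FFu; }
static uint32_t toy_byte_offset(uint32_t va) { return va & 0xFFFu; }

/* Toy page table: each entry keeps the PFN in bits 31-12 and the
   valid flag in bit 0; all other flag bits are left out. */
typedef struct {
    uint32_t entry[TOY_ENTRIES];
} toy_page_table;

/* Toy page directory: a NULL slot models a page table that has not
   been created yet (page tables are created on demand). */
typedef struct {
    toy_page_table *table[TOY_ENTRIES];
} toy_page_directory;

/* Record that the page containing va lives in physical frame pfn. */
void toy_map(toy_page_table *pt, uint32_t va, uint32_t pfn) {
    pt->entry[toy_pte_index(va)] = (pfn << TOY_PAGE_SHIFT) | TOY_PTE_VALID;
}

/* Two lookups: page directory first, then page table. Returns 1 and
   stores the physical address in *pa on success; returns 0 where real
   hardware would raise a page fault for software to handle. */
int toy_translate(const toy_page_directory *pd, uint32_t va, uint32_t *pa) {
    const toy_page_table *pt = pd->table[toy_pde_index(va)];
    if (pt == NULL) {
        return 0;                        /* no page table for this range */
    }
    uint32_t pte = pt->entry[toy_pte_index(va)];
    if ((pte & TOY_PTE_VALID) == 0u) {
        return 0;                        /* invalid PTE */
    }
    *pa = (pte & 0xFFFFF000u) | toy_byte_offset(va);
    return 1;
}
```

For example, virtual address 0x00403ABC splits into page directory index 1, page table index 3, and byte offset 0xABC; if that page is mapped to frame 0x12345, the sketch yields physical address 0x12345ABC. An address whose directory slot is empty or whose PTE is not valid returns 0, which is where the fault-handling cases listed in the slides would take over.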