A Real-Time Execution Performance Agent Interface for Confidence-Based Scheduling

by Sam Siewert
B.S., University of Notre Dame, 1989
M.S., University of Colorado, 1993

A thesis submitted to the Faculty of the Graduate School of the University of Colorado in partial fulfillment of the requirements for the degree of Doctor of Philosophy, Department of Computer Science, 2000

Copyright 2000 Sam Siewert, All Rights Reserved

This thesis entitled: A Real-Time Execution Performance Agent Interface for Confidence-Based Scheduling, written by Sam Siewert, has been approved for the Department of Computer Science

____________________________
Professor Gary J. Nutt, Advisor

____________________________
Professor Ren Su

Date__________________

The final copy of this thesis has been examined by the signatories, and we find that both the content and the form meet acceptable presentation standards of scholarly work in the above-mentioned discipline.

Siewert, Sam (Ph.D., Computer Science)
A Real-Time Execution Performance Agent Interface for Confidence-Based Scheduling
Thesis directed by Professor Gary Nutt

Abstract

The use of microprocessors and software to build real-time applications is expanding from traditional domains such as digital control, data acquisition, robotics, and digital switching to emerging domains such as multimedia, virtual reality, optical navigation, and audio processing. These emerging real-time application domains require far more bandwidth and processing capability than traditional real-time applications. At the same time, the potential performance and complexity of microprocessor and I/O architectures are rapidly evolving to meet these new application demands (e.g., super-scalar, pipelined architectures with multi-level caches and burst-transmission I/O buses).
Finally, the complexity of typical real-time system algorithms is increasing to include functions such as image processing, rule-based fault protection, and intelligent sensor processing.

The foundation of real-time systems theory is the recognition that bandwidth and processing resources will always be constrained: a more demanding application always exists that can make use of increased resources as they become available. Given this reality, the question is how an engineer can formally ensure, under resource constraints, that the system will not only function correctly but also meet its timing deadlines. Since the introduction of Liu and Layland's rate-monotonic analysis and the development of the formal theory of hard real-time systems, significant progress has been made on extending this theory and developing an engineering process for it. The problem is that current hard real-time theory and practice assume full reliability and constrain systems more than necessary by requiring either deterministic use of resources or worst-case models of such usage.

Real-time systems engineering requires translating requirements into a system that meets cost, performance, and reliability objectives. If deadline performance were the only consideration in the engineering process, and there were no cost or reliability requirements, then current hard real-time theory would generally be sufficient. In reality, cost and reliability must be considered, especially since emerging application domains may be more sensitive to cost and reliability than traditional hard real-time domains. Typically, a direct trade can be made between cost and reliability for a given performance level. Three main problems arise in applying current hard real-time theory to systems requiring a balance of cost, reliability, and performance. First, there is no formal approach for designing systems for less than full reliability.
Second, the assumptions and constraints of hard real-time theory severely limit performance. Finally, safe mixing of hard and soft real-time execution is not supported. Without a better framework for implementing mixed hard and soft real-time requirements, the engineer must either adapt hard real-time theory on a case-by-case basis or risk implementing a best-effort system that provides no formal assurance of performance. Soft real-time quality-of-service frameworks are also an option; however, not only are these approaches not fully mature, but more fundamentally they do not address mixed hard and soft real-time processing, nor is it clear that any of them provide concretely measurable reliability.

In this thesis we present an alternative framework for implementing real-time systems that accommodates mixed hard and soft real-time processing with measurable reliability by providing a confidence-based scheduling and execution fault-handling framework. This framework, called the RT EPA (real-time execution performance agent), provides a more natural and less constraining approach to translating both timing and functional requirements into a working system. The RT EPA framework is based on an extension to deadline-monotonic theory. The RT EPA has been evaluated with simulated loading and an optical navigation test-bed, and the RT EPA monitoring module will be flown on an upcoming NASA space telescope in late 2001. The significance of this work is that it directly addresses the shortcomings in the current process for handling reliability and provides measurable reliability and performance feedback during the implementation, systems integration, and maintenance phases of the real-time systems engineering process.
Acknowledgements

I would especially like to thank the following people, who believed in the merits of this research, helped remove roadblocks, provided expert insight, and gave me moral support and encouragement along the way.

Prof. Gary Nutt, Computer Science Department, Dissertation Advisor – Prof. Nutt was the perfect advisor: he let me drive the research while also challenging me to get to its heart, gently reminding me to keep on track and helping me get past roadblocks preventing progress. Much of the theory in this dissertation was derived almost three years before the proof-of-concept experiments were completed, and Prof. Nutt was extremely patient and supportive while I struggled with technological aspects of the experiments, such as high-bandwidth transfer for high frame rates. I think the complexity and realistic nature of experiments like RACE has made this a better thesis, and I appreciate the support and patience Prof. Nutt provided.

Dr. George Rieke, University of Arizona Steward Observatory, MIPS Principal Investigator on the Space-based Infrared Telescope Facility (SIRTF) – Dr. Rieke was very trusting and supportive of my efforts to use methods from this research to improve data-processing performance on the Multi-band Infrared Photometer for SIRTF (MIPS) instrument. At the point in the project when there was SIRTF program concern that the MIPS instrument software would not work due to apparently random timeouts in exposure processing, Dr. Rieke supported my efforts completely, without unnecessary questioning. I believe this is due to his insight and uncommon ability to appreciate technology in many different disciplines.
Ball Aerospace – Ball supported my completion of this research by providing me time off to finish writing and by allowing me to use information in this thesis from the SIRTF/MIPS real-time scheduling work completed under contract 960785 to the NASA Jet Propulsion Laboratory.

Elaine Hansen, Director of the Colorado Space Grant Consortium – Elaine Hansen provided me with research opportunities on NASA Jet Propulsion Laboratory (JPL) projects, funding for basic research and for conference presentations on real-time automation, and excellent review of and insight into many of the early ideas which led to the completion of this research. Along the way I also learned the practical aspects of engineering real-time systems by working for Director Hansen to build ground and flight software for a Shuttle payload operations system which flew on STS-85 in 1997.

Prof. Ren Su, Chairman of the Electrical Engineering Department – Prof. Su provided me the opportunity to teach fundamentals of real-time systems while completing this research, which was motivational and helped me focus on the current state of practice in real-time embedded systems as well as on directions for my research and the infrastructure to complete experiments.

Dr. Richard Doyle, Manager of Autonomy Technology Programs and Information and Computing Technologies Research at the NASA Jet Propulsion Laboratory – Dr. Doyle provided support for basic research on real-time automation through the Space Grant College. Working with members of his group at JPL, including Dr. Steve Chien and Dr. Dennis Decoste, I was able to formulate some of the basic concepts which inspired the RT EPA framework for real-time data processing and control. Dr.
Doyle also encouraged me to present early research work at the International Space Artificial Intelligence and Robotics Applications Symposium, where I was able to share concepts with many researchers from NASA and the robotics industry, which helped me to understand these application domains much better.

CONTENTS

ABSTRACT
ACKNOWLEDGEMENTS
1 INTRODUCTION
  1.1 RESEARCH SCOPE
  1.2 COMPARISON TO EXISTING APPROACHES
  1.3 PROBLEM STATEMENT
  1.4 PROPOSED SOLUTION
  1.5 EVALUATION
  1.6 SUMMARY OF RESEARCH RESULTS
    1.6.1 Theoretical Results
    1.6.2 Framework Prototype Implementation
    1.6.3 Proof-of-Concept Test Results
  1.7 SIGNIFICANCE
2 PROBLEM STATEMENT
  2.1 SYSTEM TIMING ISSUES
    2.1.1 Release Variance Due to Contention and Interference
    2.1.2 Dispatch and Preemption Variance Due to System Overhead
    2.1.3 Algorithm Execution Variance Due to Non-uniform Loading in a Single Release
    2.1.4 Architectural Execution Variance Due to Micro-parallelism and Memory Hierarchy
    2.1.5 Input/Output Variance Due to Shared Resource Contention and Transfer Modes
    2.1.6 System End-to-End Latency and Jitter
  2.2 ENVIRONMENTAL EVENT RATE VARIANCE DUE TO NATURE OF EVENTS AND MODELING DIFFICULTY
  2.3 CHARACTERISTICS OF EMERGING REAL-TIME APPLICATIONS
    2.3.1 Loading characteristics of purely continuous media
    2.3.2 Loading characteristics of purely event-driven processing
    2.3.3 Loading characteristics of purely digital control applications
    2.3.4 Loading characteristics of mixed processing
    2.3.5 Loading characteristics of mixed event-driven and digital control
    2.3.6 Loading characteristics of mixed real-time applications in general
3 RELATED RESEARCH
  3.1 HARD REAL-TIME RESEARCH RELATED TO RT EPA
  3.2 SOFT REAL-TIME RESEARCH RELATED TO RT EPA AND CONFIDENCE-BASED SCHEDULING
  3.3 EXECUTION FRAMEWORKS SIMILAR TO RT EPA
4 SCHEDULING EPOCHS
  4.1 MULTIPLE ON-LINE SCHEDULING EPOCH CONCEPT DEFINITION
    4.1.1 Admission and Scheduling Within an Epoch
    4.1.2 Active epoch policy
  4.2 EQUIVALENCE OF EDF AND MULTI-EPOCH SCHEDULING IN THE LIMIT
  4.3 APPLICATION OF MULTI-EPOCH SCHEDULING
    4.3.1 SIRTF/MIPS Multi-Epoch Scheduling Example
    4.3.2 Multi-epoch Scheduling Compared to Multi-level Scheduling
5 REAL-TIME EXECUTION PERFORMANCE AGENT FRAMEWORK
  5.1 DESIGN OVERVIEW
    5.1.1 Pipeline Time Consistency and Data Consistency
    5.1.2 Pipeline Admission and Control
  5.2 RT EPA TRADITIONAL HARD REAL-TIME FEATURES
  5.3 RT EPA SOFT REAL-TIME FEATURES
  5.4 RT EPA BEST EFFORT FEATURES
  5.5 RT EPA DATA PROCESSING PIPELINE FEATURES
  5.6 RT EPA IMPLEMENTATION
    5.6.1 RT EPA Service and Configuration API
      5.6.1.1 RT EPA System Initialization and Shutdown
      5.6.1.2 RT EPA Service (Thread) Admission and Dismissal
      5.6.1.3 RT EPA Task Control
      5.6.1.4 RT EPA Release and Pipeline Control
      5.6.1.5 RT EPA Performance Monitoring
      5.6.1.6 RT EPA Execution Model Utilities
      5.6.1.7 RT EPA Information Utilities
      5.6.1.8 RT EPA Control Block
        5.6.1.8.1 RT EPA CB Negotiated Service
        5.6.1.8.2 RT EPA CB Release and Deadline Specification
        5.6.1.8.3 RT EPA CB On-Line Statistics and Event Tags
        5.6.1.8.4 RT EPA CB On Demand or Periodic Server Computed Performance Statistics
      5.6.1.9 RT EPA Service Negotiation and Configuration Example
      5.6.1.10 RT EPA Admission Request and Service Specification
        5.6.1.10.1 Service Type
        5.6.1.10.2 Interference Assumption
        5.6.1.10.3 Execution Model
        5.6.1.10.4 Termination Deadline Miss Policy
        5.6.1.10.5 Release Period and Deadline Specification
      5.6.1.11 Expected Performance Feedback
        5.6.1.11.1 Global Performance Parameters Update API Functions
        5.6.1.11.2 Deadline Performance API Functions
        5.6.1.11.3 Execution Performance API Functions
        5.6.1.11.4 Release Performance API Functions
      5.6.1.12 RT EPA Task Activation and Execution Specification
        5.6.1.12.1 Service Execution Entry Point and Soft Deadline Miss Callback
        5.6.1.12.2 Service Release Complete Isochronal Callback
        5.6.1.12.3 Release Type and Event Specification
        5.6.1.12.4 Service On-Line Model Size
      5.6.1.13 Service Performance Monitoring Specification
    5.6.2 RT EPA Kernel-Level Monitoring and Control
      5.6.2.1 Event Release Wrapper Code
        5.6.2.1.1 ISR Release Wrapper Code
        5.6.2.1.2 RT EPA Event Release Wrapper Code
      5.6.2.2 Dispatch and Preempt Event Code
      5.6.2.3 Release Frequency
      5.6.2.4 Execution Time
      5.6.2.5 Response Time
      5.6.2.6 Deadline Miss Management
        5.6.2.6.1 Terminate Execution that would Exceed Hard Deadline
        5.6.2.6.2 Hard Deadline Miss Restart Policy
        5.6.2.6.3 Termination Deadline Miss Dismissal Policy
    5.6.3 Performance Monitoring and Re-negotiation
      5.6.3.1 Soft Deadline Confidence
      5.6.3.2 Hard Deadline Confidence
6 THE CONFIDENCE-BASED SCHEDULING FORMULATION
  6.1 RT EPA CBDM CONCEPT
  6.2 CBDM DEADLINE CONFIDENCE FROM EXECUTION CONFIDENCE INTERVALS
  6.3 CBDM ADMISSION TEST EXAMPLE
7 EVALUATION METHOD
  7.1 RT EPA PSEUDO LOADING EVALUATION
  7.2 SIRTF/MIPS VIDEO PROCESSING RT EPA MONITORING EVALUATION
  7.3 SIRTF/MIPS VIDEO PROCESSING RT EPA EPOCH EVALUATION
  7.4 DIGITAL VIDEO PIPELINE TEST-BED
    7.4.1 NTSC Digital Video Decoder DMA Micro-coding
    7.4.2 RT EPA Digital Video Processing Pipeline
  7.5 RACE OPTICAL NAVIGATION AND CONTROL EXPERIMENT
    7.5.1 RACE Mechanical System Overview
    7.5.2 RACE Electronics System Description
    7.5.3 RACE RT EPA Command, Control, and Telemetry Services
      7.5.3.1 Frame-based Processing and Control Sequencing
      7.5.3.2 Frame Display Compression/Formatting Algorithm
      7.5.3.3 Optical Navigation Algorithm
      7.5.3.4 RACE Control Algorithm
      7.5.3.5 State Telemetry Link Algorithm
      7.5.3.6 Grayscale Frame Link Algorithm
      7.5.3.7 NTSC Camera Tilt and Pan Control Algorithm
    7.5.4 RACE RT EPA Software System
  7.6 ROBOTIC TEST-BED
  7.7 ROBOTICS TEST-BED INCONCLUSIVE RESULTS
8 EXPERIMENTAL RESULTS
  8.1 RT EPA EXPERIMENTATION GOALS
  8.2 RT EPA PSEUDO LOADING TESTS
    8.2.1 Pseudo Load Marginal Task Set Negotiation and Re-negotiation Testing (Goal 2)
  8.3 SIRTF/MIPS VIDEO PROCESSING RT EPA MONITORING EVALUATION
    8.3.1 SIRTF/MIPS RT EPA DM Priority Assignment
    8.3.2 MIPS Exposure-Start Reference Timing Model
    8.3.3 SIRTF/MIPS Exposure Steady-State Reference Timing Model
    8.3.4 SIRTF/MIPS SUR Mode Steady-State ME Results
    8.3.5 SIRTF/MIPS Raw Mode Steady-State Results
    8.3.6 SIRTF/MIPS Video Processing RT EPA Epoch Evaluation
  8.4 DIGITAL VIDEO PIPELINE TEST-BED RESULTS
  8.5 RACE RESULTS
    8.5.1 RACE Marginal Task Set Experiment (Goal 1)
    8.5.2 RACE Nominal Configuration Results
      8.5.2.1 Bt878 Video Frame Buffer Service
      8.5.2.2 Frame Display Formatting and Compression Service
      8.5.2.3 Optical Navigation Ranging and Centroid Location Service
      8.5.2.4 RACE Vehicle Ramp Distance Control
      8.5.2.5 RACE Vehicle Telemetry Processing
      8.5.2.6 RACE Video Frame Link Processing
      8.5.2.7 RACE Camera Control
    8.5.3 RACE RT EPA Initial Service Negotiation and Re-negotiation (Goal 2)
    8.5.4 RACE Release Phasing Control Demonstration (Goal 3a and 3b)
    8.5.5 RACE Protection of System from Unbounded Overruns (Goal 5)
      8.5.5.1 Example of Unanticipated Contention for I/O and CPU Resources
      8.5.5.2 RT EPA Protection from Period/Execution Jitter Due to Misconfiguration (Goal 5)
9 SIGNIFICANCE
10 PLANS FOR FUTURE RESEARCH
11 CONCLUSION
REFERENCES
APPENDIX A: RT EPA SOURCE CODE API SPECIFICATION
APPENDIX B: LOADING ANALYSIS FOR IMAGE CENTROID CALCULATION WITH VARIANCE DUE TO CACHE MISSES
  11.1 ARCHITECTURE PERFORMANCE ASSUMPTIONS
  11.2 33 MHZ RAD6000 ANALYSIS
  11.3 CENTROID COMPUTATION TIME MODEL
    11.3.1 Algorithm Description
    11.3.2 Load-Store RISC Pseudo-code Instructions to Implement for X-bar and Y-bar
  11.4 OVERALL EXPECTED CACHE HIT RATE
  11.5 CENTROID CPI ESTIMATIONS
  11.6 ALGORITHM COMPLEXITY
  11.7 TIME TO COMPUTE ARRAY CENTROID
  11.8 EXAMPLE FOR 1024X1024 ARRAY
  11.9 GENERAL RESULT
APPENDIX C: UNMODELED INTERFERENCE CAUSES SEVERAL TERMINATION DEADLINE MISSES
APPENDIX D: RACE INITIAL SCHEDULING AND CONFIGURATION ADMISSION RESULTS
APPENDIX E: VIDEO PIPELINE TEST RESULTS (WITHOUT ISOCHRONOUS OUTPUT)
APPENDIX F: VIDEO PIPELINE TEST RESULTS (WITH ISOCHRONOUS OUTPUT)

TABLES

TABLE 1: MIXED BEST EFFORT, SOFT, AND HARD REAL-TIME APPLICATION EXAMPLE
TABLE 2: ENVIRONMENTAL EVENT-RATE TYPES WITH APPLICATION EXAMPLES OF EACH
TABLE 3: SINGLE EPOCH DESIGN OF THE SIRTF/MIPS VIDEO PROCESSING WITH 16 KWORD FIFOS
TABLE 4: SINGLE EPOCH DESIGN OF THE SIRTF/MIPS VIDEO PROCESSING WITH 4 KWORD FIFOS
TABLE 5: MULTIPLE EPOCH DESIGN OF THE SIRTF/MIPS STEADY-STATE VIDEO PROCESSING
TABLE 6: RT EPA DEADLINE MANAGEMENT SUMMARY
TABLE 7: PSEUDO SOURCE/SINK EXPERIMENT TASK SET DESCRIPTION
TABLE 8A: EPOCH 1 OF THE SIRTF/MIPS VIDEO PROCESSING
TABLE 8B: EPOCH 2 OF THE SIRTF/MIPS VIDEO PROCESSING
TABLE 8C: EPOCH 3 OF THE SIRTF/MIPS VIDEO PROCESSING
TABLE 9: DIGITAL VIDEO PIPELINE SERVICES
TABLE 10: RACE TASK SET DESCRIPTION
TABLE 11: 5 DOF ROBOTIC EXPERIMENT TASK SET DESCRIPTION
TABLE 12: PSEUDO LOADING MARGINAL TASK SET DESCRIPTION (TIMER RELEASED)
TABLE 13: PSEUDO LOADING ACTUAL MARGINAL TASK SET PERFORMANCE (TIMER RELEASED)
TABLE 14: PSEUDO LOADING MARGINAL TASK SET DESCRIPTION (TIMER RELEASED)
TABLE 15: PSEUDO LOADING ACTUAL MARGINAL TASK SET PERFORMANCE (TIMER RELEASED)
TABLE 16: RT EPA EXECUTION JITTER IN SIRTF/MIPS SI FRAME PROCESSING RELEASES
TABLE 17: RT EPA EXECUTION JITTER IN SIRTF/MIPS SI OPTIMIZED FRAME PROCESSING RELEASES
TABLE 18: SIRTF/MIPS DM PRIORITY ASSIGNMENTS
TABLE 19: MIPS SUR C0F2N2 EXPOSURE START VMETRO TIME TAGS
TABLE 20: MIPS RAW C0F1N2 EXPOSURE START VMETRO TIME TAGS
TABLE 21: SUR C0F2N2 STEADY-STATE EXPOSURE TIME TAGS
TABLE 22: RAW C0F1N2 STEADY-STATE EXPOSURE TIME TAGS
TABLE 23: DIGITAL VIDEO PIPELINE MARGINAL TASK SET DESCRIPTION
TABLE 24: ACTUAL DIGITAL VIDEO TASK SET PERFORMANCE
TABLE 25: RACE SOURCE/SINK PIPELINE TASK SET DESCRIPTION
TABLE 26: RACE STANDARD PIPELINE PHASING AND RELEASE FREQUENCIES
TABLE 27: RACE SOFT AND TERMINATION DEADLINE ASSIGNMENT
TABLE 28: INITIAL RACE SOURCE/SINK PIPELINE TASK SERVICE DESCRIPTION
TABLE 29: RACE SOURCE/SINK ACTUAL PERFORMANCE

FIGURES

FIGURE 1: RT EPA CONFIDENCE-BASED SCHEDULING UPPER-BOUND
FIGURE 2: THE RT EPA UTILITY ASSUMPTION
FIGURE 3: END-TO-END JITTER FROM EVENT RELEASE TO RESPONSE
FIGURE 4: CONTINUOUS MEDIA DIGITAL VIDEO PIPELINE
FIGURE 5: PURELY EVENT-DRIVEN REAL-TIME PROCESSING
FIGURE 6: REAL-TIME DIGITAL CONTROL PROCESSING
FIGURE 7: MIXED CONTINUOUS MEDIA AND EVENT-DRIVEN REAL-TIME PROCESSING
FIGURE 8: MIXED DIGITAL CONTROL AND EVENT-DRIVEN REAL-TIME PROCESSING
FIGURE 9: FEATURE SPACE OF SYSTEM I/O AND CPU REQUIREMENTS BY APPLICATION TYPE
FIGURE 10: MULTIPLE EPOCHS OF SCHEDULING ACTIVE SIMULTANEOUSLY
FIGURE 11: IN-KERNEL PIPE WITH FILTER STAGE AND DEVICE INTERFACE MODULES
FIGURE 12: EXECUTION EVENTS AND DESIRED RESPONSE SHOWING UTILITY
FIGURE 13: EPA PSEUDO LOADING PIPELINE
FIGURE 14: SIRTF/MIPS DUAL STREAM PIPELINE
FIGURE 15: BASIC DIGITAL VIDEO RT EPA PIPELINE
FIGURE 16 A AND B: RACE SYSTEM SIDE-VIEW (A) AND FRONTAL-VIEW (B)
FIGURE 17: RACE VEHICLE AND GROUND CONTROL SYSTEM ELECTRONICS
FIGURE 18: RACE ELECTRONICS
FIGURE 19 A AND B: TARGET WIDTH DISTRIBUTION FOR ALL SCAN-LINES – CLOSE (A) AND FAR (B)
FIGURE 20: RACE EPA PIPELINE
FIGURE 21 A AND B: 5 DOF DEAD-RECKONING ROBOT (A, LEFT), POSITION FEEDBACK ROBOT (B, RIGHT)
FIGURE 22: MIPS MODE HTG READY EXPOSURE-START HW/SW SYNCHRONIZATION WINDOW
FIGURE 23: MIPS EXPOSURE START WORST CASE DELAY (CASE A)
92 FIGURE 24: MIPS EXPOSURE START BEST CASE DELAY (CASE B) .............................................. 93 FIGURE 25: SUR C0F2NN FIRST DCE DATA COLLECTION AND PRODUCTION EVENT TIMING MODEL .............................................................................................................................. 96 FIGURE 26: SUR C0F2NN DCE 2 TO N DATA COLLECTION AND PRODUCTION EVENT TIMING MODEL .............................................................................................................................. 96 FIGURE 27: FIRST DCE RAW C0F1NN DATA COLLECTION AND PRODUCTION EVENT TIMING MODEL .............................................................................................................................. 97 FIGURE 28: DCE 2 TO N RAW C0F1NN DATA COLLECTION AND PRODUCTION EVENT TIMING MODEL .............................................................................................................................. 97 FIGURE 29 A AND B: RACE FRAME COMPRESSION (A) AND RESPONSE JITTER (B) ....................100 FIGURE 30 A AND B: RACE FRAME LINK EXECUTION (A) AND RESPONSE JITTER (B)................101 FIGURE 31 A AND B: RACE FRAME LINK EXECUTION (A) AND RESPONSE JITTER (B) WITH ISOCHRONAL OUTPUT CONTROL........................................................................................102 FIGURE 32: BT878 VIDEO RELEASE JITTER ...............................................................................104 FIGURE 33 A AND B: BT878 VIDEO EXECUTION (A) AND RESPONSE JITTER (B) .........................105 FIGURE 34: RACE FRAME DISPLAY SERVICE RELEASE PERIOD JITTER ......................................105 FIGURE 35 A AND B: FRAME DISPLAY SERVICE EXECUTION (A) AND RESPONSE (B) LATENCY AND JITTER ..............................................................................................................................106 FIGURE 36: OPTICAL NAVIGATION E VENT RELEASE PERIOD JITTER ...........................................106 FIGURE 37 
A AND B: OPTICAL NAVIGATION EXECUTION (A) AND RESPONSE JITTER (B) ..... 107
FIGURE 38: RAMP CONTROL RELEASE PERIOD JITTER ..... 107
FIGURE 39 A AND B: RAMP CONTROL EXECUTION (A) AND RESPONSE (B) JITTER ..... 108
FIGURE 40: RACE TELEMETRY RELEASE PERIOD JITTER ..... 108
FIGURE 41 A AND B: RACE TELEMETRY EXECUTION (A) AND RESPONSE (B) JITTER ..... 109
FIGURE 42: RACE FRAME LINK RELEASE PERIOD JITTER ..... 109
FIGURE 43 A AND B: RACE FRAME LINK EXECUTION (A) AND RESPONSE (B) JITTER ..... 109
FIGURE 44: RACE CAMERA CONTROL PERIOD RELEASE JITTER ..... 110
FIGURE 45 A AND B: RACE CAMERA CONTROL EXECUTION (A) AND RESPONSE (B) JITTER ..... 110
FIGURE 46 A AND B: BEFORE AND AFTER PHASING CONTROL ..... 112
FIGURE 47: FRAME LINK TERMINATION DEADLINE MISS CONTROL ..... 113
FIGURE 48: MISCONFIGURATION EXECUTION VARIANCE EXAMPLE ..... 114

1 Introduction

The range of real-time applications is expanding from traditional domains such as digital control, data acquisition, and digital switching, to include emerging domains like multimedia, virtual reality, optical navigation, and speech recognition. These emerging domains greatly expand the input/output range and frequency of such systems and therefore impose higher bandwidth and processing requirements. At the same time, the complexity of microprocessors, typical algorithms, and input/output architectures is also increasing in an attempt to provide performance that can handle these bandwidth and processing demands.
The foundation of real-time systems theory is the recognition that bandwidth and processing resources will always be constrained (a more demanding application always exists that can make use of increased resources). Given this reality, the question is how an engineer can formally ensure that, within the resource constraints, the system will not only function correctly, but will also meet its timing deadlines. Since the introduction of the formal theory of real-time systems, perhaps best marked by the work of Liu and Layland on rate-monotonic analysis [LiuLay73], significant progress has been made in the engineering process for hard real-time systems using rate-monotonic theory. Improvements to the basic theory have been made, but perhaps more importantly, so have improvements to the real-time systems engineering process.

To engineer a real-time microprocessor-based system, the engineer must ultimately be able to translate requirements into a system that meets cost, performance, and reliability objectives. For a system with timing requirements, where deadlines must be met (the definition of a real-time system), this means that in addition to correct function, the system must provide that function by a deadline, with required reliability, and within cost constraints. Engineering a system requires a process and methods to measure the quality and success of each step in the process. Traditional steps in the engineering process include analysis, design, implementation, unit testing, integration, systems testing, and maintenance. Depending upon the application, the materials, performance requirements, cost constraints, and required reliability, this process must have figures of merit that provide meaningful feedback to the process.
The current state of practice in real-time systems is to perform rate-monotonic analysis (RMA) using estimated worst-case execution times and release periods, design the functional code, and then implement it in a priority-preemptive multitasking operating system or interrupt-driven executive environment [Bu91]. The problem is that RMA assumes 100% reliability is required, and it therefore demands significant resource margin (approximately 20-30% of the central processing unit resource). Finally, it is not clear how such a system will react in an overload scenario. The research presented in this thesis intends to provide a way of implementing not only timing performance requirements, but also cost and reliability requirements, by providing an implementation framework for mixed hard and soft real-time services.

In real-time systems most services are provided by event-released tasks, that is, tasks providing a service dispatched for execution from an interrupt. Safely scheduling event-released tasks to complete execution by relative real-time deadlines is most challenging when processor loading approaches full utility (i.e. 100% central processing unit utilization). Liu and Layland established interesting bounds on this problem with the RMA least upper bound (provable safe utility < 0.7) and the ideal theoretical upper bound of full utility provided by the Earliest Deadline First (EDF) dynamic priority algorithm [LiuLay73]. The derivation of the RMA least upper bound treats all tasks in the set as having equally hard deadlines and is pessimistic since it assumes worst-case releases and worst-case execution times. Task sets with processor utility loads below the RMA least upper bound can easily be scheduled, and, as Liu and Layland point out, the EDF ideal is not achievable given the impracticality of the dynamic priority assignment required.
What is perhaps most interesting about Liu and Layland’s work is that it provides the bounds for a marginally safe region of real-time scheduling using the fixed-priority preemptive scheduling method – i.e. those task sets with loading between the RMA least upper bound and full utility (provable safe utility < 0.7 < marginally safe utility < full utility) – a marginal task set. Due to pessimistic execution time and period assumptions, tasks can often be scheduled successfully within this region of marginally safe utility. The problem is that the safety of meeting deadlines for such a system cannot be mathematically guaranteed with RMA, nor can it even be well estimated from a reliability perspective. Ideally, from an engineering viewpoint, it would be beneficial to be able to schedule a marginal task set such that the reliability in meeting deadlines could be specified for each task, with a range of reliability that includes guaranteed service, soft real-time service (where the number of missed deadlines over a period is predictable), and best-effort tasks.

A formulation and implementation of such a programming framework, called the RT EPA (Real-Time Execution Performance Agent), is introduced in this thesis. The RT EPA can be applied to typical real-time applications such as continuous media, digital control, and event processing. A confidence-interval mathematical formulation for task admission is used to provide a range of scheduling reliability within the RT EPA framework, which enforces policy and provides an intuitive way to build processing pipelines and to schedule execution of task sets based on desired deadline reliability performance. An evaluation of the RT EPA framework is presented based on simulated applications, an optical navigation test-bed called RACE (Railguided, Air-powered Controls Experiment), and a video processing system for the SIRTF/MIPS (Space-based Infrared Telescope Facility / Multi-band Imaging Photometer for SIRTF) instrument.
The results show that, given minimal programmer specification, the RT EPA applications performed according to expectation for guaranteed performance requirements, within specified tolerances for soft real-time requirements, and also supported best-effort performance for a marginal task set. The simulated applications and the SIRTF/MIPS instrument application prove the viability of the theory, the implementation concept, and the ability to solve difficult scheduling problems that include a mix of hard, soft, and best-effort tasks in a single real-time application. This thesis introduces a novel real-time scheduling method and real-time programming framework along with examples of how it can be used. The examples provide evidence that this approach is a valuable new way to implement real-time applications which include high loading and mixed hard and soft real-time tasks. The thesis defines the scope of real-time execution addressed; defines the problems with existing hard and soft real-time frameworks; provides a proposed solution to those problems; presents results of testing an implementation of the proposed solution; and finally describes the significance of the results.

1.1 Research Scope

The purpose of the RT EPA research is to validate extensions to Deadline Monotonic (DM) theory for mixed hard and soft services in multiphase in-kernel pipelines. The threads of execution in the RT EPA are trusted modules that are loaded into kernel space and dynamically bound to kernel symbols. This is based upon the concept that such modules can be tested for correctness in protected memory spaces and, once thoroughly tested, loaded into kernel space for better performance (elimination of system call traps) and for better real-time scheduling control (kernel-level threads).
The research therefore includes building an RT EPA framework according to the design presented in Section 5 and evaluating this framework with synthetic loads, with an optical navigation test-bed, and on an actual NASA space-based telescope. Evaluating the RT EPA framework with applications that include marginal services (thread sets) is of fundamental importance. Furthermore, the applications must include both event-driven thread releases and time-based releases. The ideal application test of the RT EPA should include mixed hard and soft real-time services released both by external events and by an internal clock to provide a significant validation of the framework.

Scheduling event-released tasks (tasks released by interrupts generated from external events) to complete execution by relative real-time deadlines is not difficult when the processor on which these tasks are executed is under-loaded. An under-loaded system is considered to have processor loading less than the RMA least upper bound (approximately 70%). Constrained processor resources make the goal of safely scheduling tasks for real-time execution much more difficult as loading demands approach full utility. As previously mentioned, Liu and Layland set out a basic theory for safe and practical hard real-time processor scheduling utility when they formulated RMA. It is important to note that they also compared this RMA least upper bound to the ideal theoretical full-utility upper bound provided by the Earliest Deadline First (EDF) dynamic priority algorithm (the algorithm most typically used in real-time pipelines – see Section 1.2). They derived the RMA least upper bound, which asymptotes to approximately 70 percent utility (ln 2) with increasing numbers of tasks in a set. Furthermore, they established that it was not safe to assume real-time deadlines could be met with the EDF theoretical upper bound due to the impracticality of dynamic priority assignment (i.e.
other real-time pipeline implementations have questionable real-time safety – discussed in Section 1.2). The derivation of the RMA least upper bound treats all tasks in the set as having equally hard deadlines. The RMA least upper bound is pessimistic for the following reasons: 1) overestimation of interference (necessary and sufficient tests derived since are fully accurate in terms of interference), 2) a worst-case release assumption (the highest release frequency is assumed for quasi-periodic releases), and 3) a worst-case execution assumption (the longest possible execution time is assumed) [LiuLay73]. Their RMA least upper bound is perhaps slightly optimistic since it does not include the overhead of context switching, but this is typically insignificant (e.g. 100 microseconds against release execution times in the millisecond range, so less than 10 percent, and it can be included in the release execution time). Task sets with processor utility loads below the RMA least upper bound can easily be scheduled by a priority-preemptive system with the highest priority assigned to tasks with the shortest release period.

Perhaps what is most interesting about Liu and Layland’s work is that it provides bounds for an interesting region of real-time scheduling – those task sets with loading between the RMA least upper bound and full utility. Tasks can be scheduled successfully to some level of utility between the RMA least upper bound and ideal EDF full utility given the margin afforded by the RMA pessimistic assumptions – the problem is that the safety of meeting deadlines for such a system cannot be mathematically guaranteed, nor can it even be well estimated from a reliability perspective. Task sets which fall between the RMA least upper bound and full utilization are referred to as marginal task sets in this thesis and are the motivation for the research completed.
Figure 1 graphically shows the marginal thread scheduling region that is the primary region of theoretical interest and therefore also the goal for experimental validation. Validation of the RT EPA in the under-loaded region shows that the framework is functionally correct, but does not test the features for providing an execution environment for mixed reliable hard and soft real-time services. Two major new theories for real-time scheduling of the central processing unit (CPU) are presented here and validated in an RT EPA implementation: 1) Confidence-Based Deadline Monotonic (CBDM) scheduling, and 2) Multi-Epoch (ME) scheduling theory. These two new theories are implemented in the RT EPA framework to provide the service admission test and scheduling policy, and overall to provide the capability to reliably schedule mixed hard and soft services in marginally loaded systems. System safety in the RT EPA is provided by on-line monitoring of releases and their deadlines such that release overruns are controlled and the system may associate an action with such an overrun to safe the system (i.e. disconnect actuators or other devices which may be damaged by further execution and/or enter a fail-safe mode such as switching automatically to a back-up protection control system).

Figure 1: RT EPA Confidence-Based Scheduling Upper-Bound [figure: a utilization axis from 0.0 (no loading) to 1.0 (the EDF theoretical bound), with under-loaded thread sets below the RMA least upper bound at 0.7, marginal thread sets between 0.7 and 1.0, and the RT EPA CBDM upper bound lying within the marginal region]

The RT EPA is a framework for implementing mixed hard and soft real-time services (each implemented with a thread of computation) with on-line monitoring, control, and admission with quantifiable performance. The scope of the RT EPA research is summarized as follows:

1) Establishing an admission test and sufficient upper bound for thread sets with specified deadline confidence requirements.
This upper bound is thread-set specific and based on required confidences, but will always lie between the RMA sufficient least upper bound and the theoretical EDF bound of full utility.
2) The termination deadline must be less than the thread release period so that the CBDM admission test can bound interference (i.e. no overruns beyond the release period are allowed). In other words, multiple active releases of the same service are not allowed.
3) Thread priorities are not specified to the RT EPA, but the RT EPA CBDM policy requires fixed-priority preemptive scheduling within a single scheduling epoch.
4) The RT EPA assumes that response utility is nonnegative between release and the termination deadline and has a maximum value on that interval. This utility assumption inherently includes both hard and soft real-time services, which are best described by the utility provided by such a service completing before a deadline relative to release.
5) Multiple scheduling epochs are investigated and shown to be viable and beneficial. An on-line multi-epoch RT EPA capability provides limited-scope dynamic prioritization, i.e. priorities are fixed within a single epoch, but change between epochs. Thread sets are then admitted to one of n epochs rather than to a system (i.e. a single epoch). The research presented shows how multi-epoch priorities and admission tests remove pessimism inherent in the RMA critical instant assumption; it also shows that with an increasing number of epochs, the EDF theoretical ideal can be approached, but never reached, due to overhead associated with on-line multi-epoch management.
6) Deadline miss and overrun policy includes traditional hard real-time full system safing (e.g.
switching to backup control and/or disabling actuators) as well as soft real-time policies to: 1) terminate the overrunning release and restart it for the next release (allow a dropout), 2) allow an overrun of a soft deadline to proceed, noted and with an application callback (soft overrun), or 3) dismiss a thread from the on-line set when it overruns and perform an application reconfiguration callback.

Why should the RT EPA support both hard and soft services? This goal was derived from the fact that many systems include the concept of both hard and soft services with respect to completing a service release by a deadline and the relative utility in doing so. The RT EPA utility assumption is based on the goal for the RT EPA to support a mixed set of hard and soft real-time services together, as shown in Figure 2.

Figure 2: The RT EPA Utility Assumption [figure: utility versus time from release; a hard real-time bounding curve at utility 1.0, a soft real-time isochronous utility curve, and a soft real-time diminishing utility curve, with the release, the soft deadline (overrun allowed), and the (hard or futility) deadline marked on the time axis]

Stated simply, the hard real-time utility curve is assumed to produce full system utility for a response any time after release but before the deadline; at the deadline, hard real-time utility goes negative, meaning that continued execution causes "negative utility", or harm. RT EPA soft real-time utility is considered to be any piecewise continuous function inside this hard real-time utility bound. All deadlines are considered relative to the time the event is released. The RT EPA supports two deadline concepts: 1) a soft deadline, which indicates an application notion of a diminishing utility boundary, and 2) a termination deadline. For hard real-time services the termination deadline has the traditional definition of full system failure, and for soft real-time services it simply marks the point at which continued execution will no longer lead to any system or service utility.
The RT EPA does not consider a service whose utility curve never diminishes to zero to be a real-time service, but it does accommodate such a service by providing a best-effort scheduling class. Note that at the point where utility becomes zero it is futile to continue in the case of a soft service, and it may actually be even more harmful to continue beyond this point for a hard service; either way, continuing beyond the zero-utility point means that the system is being loaded with work that has no value.

1.2 Comparison to Existing Approaches

Traditionally, if an application requires service time assurances, there are three approaches: best-effort systems, hard real-time systems, and QoS (Quality of Service) soft real-time systems. Best-effort systems rely on adequate resources always being available for a task set, and can make no guarantees when resources are even temporarily overloaded. Hard real-time systems require that the application provide resource bounds to ensure margin (e.g. release worst-case execution time and period) so that scheduling feasibility can be mathematically assured for the worst case; task sets are only admitted when completion by hard deadlines can be guaranteed for all tasks in all circumstances. Typically, QoS systems provide resource reservation or specification of levels of service such that each task within the system set will have a performance bound; however, it is not clear how an abstract level or reservation translates into deadline reliability. Work has been completed to provide improved translations between service levels and reliability [BraNu98], but the RT EPA is the only framework that provides a direct probability-theoretic relationship between execution model confidence and deadline reliability. The goal of directly associating model confidence and deadline reliability follows a traditional engineering approach.
For example, a solid mechanics engineer is able to test material used as members in a system and obtain a confidence in stress/strain performance, which is then ultimately translated into overall structural system reliability. The mapping between QoS levels and actual performance is weak since it is not linked in any way to actual variances in the system, but rather requires the service negotiator to estimate needs up front; this is much like estimating the capital needed to complete a construction project. In such QoS systems resources are reserved and protected from misuse [MerSav94], but the allocation of resources requires high predictability in service demands. Reserving more resources is always beneficial, but clearly not always possible, so what is not clear about QoS is how to translate real-time system performance requirements into resource requests. Most QoS systems have addressed this problem by providing for on-line re-negotiation of service levels [BraNu98], [JonRos97], [NuBra99]. Iterative request methods are a possibility, and have been investigated, but a good up-front estimate based on a mathematical model of loading, including execution time variance and release period variance, would give a better idea of worst-case and average-case resource needs and provide good bounds for such negotiation. Unless the underlying QoS levels intuitively map mathematical execution models into levels of service, this negotiation will not be straightforward, which is the case for existing QoS methods. The RT EPA confidence-based models, provided initially and continually refined on-line, offer a simple method for refining resource allocations and renegotiating service levels that is concrete by comparison (grounded in probability theory).

The RT EPA sets up processing pipelines using trusted modules. This is based upon the current best practice available to balance safety with the need for efficiency and resource control.
A number of efforts have been made in previous research to construct real-time in-kernel pipeline processing frameworks [Gov91], [Co94], [Fal94], [MosPet96]. In all cases these frameworks either employed an EDF service admission policy or, in the case of Fall’s work, used best-effort scheduling. EDF is a fully dynamic priority assignment policy, which makes it difficult to prove in advance that it will be able to meet desired deadlines for all services. By comparison, the RT EPA employs an extension to Deadline Monotonic (DM) scheduling, which has been proven to be a safe and optimal policy for hard real-time systems [Au93].

Govindan’s in-kernel pipe work was the earliest and functioned by providing a memory mapping of devices directly to the application level, essentially defeating memory protection completely for pipelined applications in this environment. Since Govindan’s work, the concept of trusted kernel modules has become the preferred way to preserve memory protection for applications while providing greater efficiency and control of the resources needed by particular threads/services in a real-time system. A trusted module is dynamically linked into the kernel address space with a privileged command to load the object code. This concept of loadable modules is currently supported in Wind River Systems VxWorks 6.x, Sun Microsystems Solaris 2.x, and the Linux 2.4.x operating system. All three of these popular operating systems also include memory protection domains for applications in addition to the trusted modules, such that the RT EPA can provide a system call interface for non-real-time control of services, additional best-effort applications, and off-line initialization. Work subsequent to Govindan has made use of trusted kernel modules rather than opening up a hole into the kernel address space for all applications.
The efficacy of this approach has been investigated in detail in the SPIN operating system [Be95], and it is widely accepted as a safe compromise for services that need improved efficiency and resource control while maintaining overall system safety for applications. Based on this related research history on in-kernel pipeline frameworks, the RT EPA employs the trusted module mechanism and policy.

1.3 Problem Statement

Ideally, a programmer should be able to take a set of real-time performance requirements and map them into service requests directly. For example, consider the characteristics and real-time requirements for command and control of a remote digital video camera as summarized in Table 1. This example is a typical real-time digital system and includes three important real-time application types: 1) continuous media (e.g. video or audio), 2) digital control, and 3) event-driven automation (e.g. fault protection) [Si96]. Furthermore, this example application includes all three classes of real-time processing in a mixed-service application (hard real-time, soft real-time, and best-effort), which is also typical of many emerging real-time applications. A survey of real-time application types is provided in Section 2. It should be noted that variances in execution and release frequency are also typical in this example. For example, execution variance can stem from complex algorithms (e.g. data-driven compression algorithms) and complex yet overall very efficient microprocessor architectures (e.g. pipelined super-scalar with L1/L2 cache). The use of real-time systems for applications with complex algorithms and microprocessors still requires traditional hard real-time control functions in addition to softer real-time functions when the application provides a mixture of services (as is true in this example).
The largest execution variance in the Table 1 example comes from video acquisition and compression, which is typical of memory reference streams that require a cache larger than the L1/L2 cache. This scenario is typical of video compression processing, since data-driven compression algorithms and buffer copies normally produce high cache miss rates and therefore pipeline stalls (this is discussed in more detail in Section 2). What is most notable is that less than guaranteed reliability is also acceptable on two of the largest loads. This is because, from a user viewpoint, it is often acceptable to drop a video frame occasionally at the 20 frames/sec rate and still have acceptable quality. The deadline reliability for frame processing in this example is that, on average, there should be less than one frame dropout every 5 seconds, and the probability of two frames dropping out in succession should be less than 1/10000. While the frame processing has the most significant execution variance, the service interval on fault safing varies the most in this example. Let us assume that the limits violation monitor in this example contains a consecutive out-of-limits count that must be exceeded before a safing request is made (this may vary from 2 to 20 in this example); only one safing request is made for multiple violations. Furthermore, the reliability on telemetry is such that there will be no more than one telemetry dropout every 5 seconds on average, and the probability of two dropouts in a row is less than 1/400. These are typical engineering specifications for any type of system, but this is typically not the view taken in hard real-time scheduling, nor in QoS scheduling, neither of which allows reliability in meeting deadlines to be specified as a probability.
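The frame reliability specification above can be translated into a per-release deadline-miss probability. The following sketch checks that the two parts of the frame specification are mutually consistent, under the simplifying assumption (not made by the thesis) that deadline misses are independent across releases:

```python
# Translate "less than one frame dropout every 5 seconds at 20 frames/sec"
# into a per-release miss probability, assuming independent misses.
frame_rate_hz = 20
window_s = 5
releases_per_window = frame_rate_hz * window_s     # 100 frame releases per window

p_miss = 1.0 / releases_per_window                 # at most 0.01 per release

# Under independence, two consecutive dropouts occur with probability p^2,
# which should satisfy the stated 1/10000 bound.
p_two_in_a_row = p_miss ** 2
```

Note that 0.01 squared is exactly 1/10000, so the succession bound follows from the average-rate bound only when misses are independent; bursty misses could violate it, which is why both requirements are stated separately.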
This example is intended to illustrate the problems with translating real-time system requirements into traditional hard real-time system implementations or into QoS service level specifications. The example is not complete, but it highlights the difficulty with which system requirements must be realized in terms of RMA or QoS frameworks. The difficulty of realizing systems like this example using traditional RMA is well recognized by publications which provide guides to applying real-time theory [BriRoy99], [Laplante93]. Despite excellent references for making the mapping between requirements and implementation, the task is arduous and soft requirements are still not addressed. The QoS frameworks have not evolved enough to provide engineers' guides like those for RMA and really only provide examples of how a particular QoS framework was used to implement an example application [BraNu98], [MerSav94], [NiLam96]. Loïc Briand [BriRoy99] states, "This book is the result of years of accumulated frustration" in reference to his book written with Daniel Roy to provide a prescriptive guide on how to apply RMA to design a system to meet real-time requirements. The intent of the RT EPA framework is to provide a translation from real-time system requirements to code which is as direct as possible. Specification of service interval and deadline are required for QoS, RMA, and the RT EPA, but the RT EPA is the only framework which provides for direct specification of service execution time confidence and desired reliability in each service meeting its deadline. The goal of the RT EPA is therefore to bypass prescriptions for mapping requirements to theory and then to implementation by providing a direct mapping between requirements and implementation, with the theory encapsulated in the framework rather than in the application.
Table 1: Mixed Best Effort, Soft, and Hard Real-Time Application Example

Service | Service Interval (msec) | Execution Variance (msec) | Deadline (msec) | Worst-case Utility | Best-case Utility | Deadline Reliability
Camera platform stability and position control | 50 | 5 +/- 1 | 50 | 0.12 | 0.08 | 100% (hard)
Video source acquisition and frame compression | 50 | 20 +/- 10 | 50 | 0.6 | 0.2 | 99% (soft)
Camera fault detection | 250 | 12.5 +/- 2.5 | 100 | 0.06 | 0.04 | 100% (hard)
Camera state telemetry acquisition and transport | 250 | 20 +/- 5 | 250 | 0.10 | 0.06 | 95% (soft)
Camera fault safing | 500 +/- 4500 | 10 +/- 5 | 1000 | 0.03 | 0.001 | 100% (hard)
Camera command processing | 500 | 10 +/- 2 | 500 | 0.024 | 0.016 | 100% (hard)
Memory scrubbing | 12000 | 500 +/- 100 | 12000 | 0.05 | 0.033 | best-effort
TOTAL | | | | 0.984 | 0.43 |

The example in Table 1 cannot be scheduled according to standard RMA and DM admission tests, and is an example of a marginal task set. So, the implementation choices with current real-time systems technology are to schedule this system according to RMA priority policy despite not clearly meeting admission criteria, to schedule the system using best effort, or to schedule the system using QoS levels based on estimates of how much resource should be reserved for each service. None of these approaches allows an engineer to have confidence that the system can meet the required deadlines with the required reliabilities. Examples of requirements such as these and marginal task sets like this are the motivation behind the RT EPA.

1.4 Proposed Solution

This thesis describes an interface and scheduler that provide an on-line intelligent service negotiation and execution control mechanism called the Real-Time Execution Performance Agent (RT EPA). The RT EPA ensures execution of algorithms based on required reliability and confidence in meeting deadlines rather than on priorities or an abstract derivative thereof (such as QoS levels).
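To see why the Table 1 task set is marginal under the standard tests, compare its worst-case utilization against the Liu and Layland least upper bound for rate-monotonic scheduling. The sketch below uses the worst-case execution values implied by Table 1 (variance midpoint plus variance) and the 500 msec base interval for fault safing; it illustrates the admission arithmetic, not the RT EPA admission test itself:

```python
# Worst-case utilization of the Table 1 task set vs. the RMA least upper
# bound n*(2^(1/n) - 1). Execution times are worst-case values
# (midpoint + variance) implied by Table 1, in msec.
tasks = [
    (50, 6),        # platform control:    5 +/- 1
    (50, 30),       # frame compression:  20 +/- 10
    (250, 15),      # fault detection:    12.5 +/- 2.5
    (250, 25),      # telemetry:          20 +/- 5
    (500, 15),      # fault safing:       10 +/- 5
    (500, 12),      # command processing: 10 +/- 2
    (12000, 600),   # memory scrubbing:  500 +/- 100
]
u_worst = sum(c / t for t, c in tasks)    # 0.984, matching the Table 1 total
n = len(tasks)
ll_bound = n * (2 ** (1 / n) - 1)         # about 0.729 for n = 7

marginal = u_worst > ll_bound             # fails the simple utilization test
```

Exceeding the Liu and Layland bound does not prove infeasibility (the bound is sufficient, not necessary), which is why the set is marginal rather than clearly unschedulable; exact response-time analysis is still needed, and the soft deadlines are not expressible at all in this test.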
The basis for the RT EPA is confidence-based scheduling, an extension to DM scheduling employing stochastic execution models. Furthermore, the EPA provides predictable and safe execution of hard real-time safety-critical tasks in addition to predictable execution of soft real-time tasks. Hard real-time tasks are still scheduled safely since all tasks are protected from each other in terms of interference, and each task has quantifiable assurances of processor resource availability; that is, it is possible to predict the order in which tasks will fail in an overload situation. Hard real-time tasks are protected from soft real-time tasks which may occasionally overrun deadlines through strict enforcement of termination deadlines for all tasks (this strictly limits maximum interference). By analogy, the EPA provides a balancing capability much like the everyday ability people have to walk without tripping (hard real-time) while chewing gum (soft real-time) and contemplating how to build a better career (best effort). Perhaps a less safe example is the emerging habit of talking on a cell phone while driving; the critical task of driving must be fire-walled from interfering service demands of the phone for safe use (e.g. no dialing except at stoplights and use of a hands-free microphone and speaker). The RT EPA performs this fire-walling by executing tasks in specific execution reliability and confidence spaces and by monitoring actual execution times to determine when resources must be adjusted because execution variances exceed the originally negotiated requests for service.
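As an illustration of the underlying idea (not the thesis' exact formulation), a confidence-based execution budget can be derived from measured execution times instead of the absolute worst case. The sketch below assumes a normal model with a one-sided 99% quantile; the sample values are hypothetical:

```python
import statistics

# Hypothetical measured execution times (msec) for one service release;
# the worst observation is a rare outlier (e.g. a burst of cache misses).
samples_ms = [20, 21, 20, 22, 21, 20, 21, 22, 20, 45]

mean = statistics.mean(samples_ms)
sigma = statistics.pstdev(samples_ms)

z_99 = 2.326                      # one-sided 99% normal quantile (assumed model)
c_conf = mean + z_99 * sigma      # execution budget at roughly 99% confidence
c_worst = max(samples_ms)         # traditional worst-case observed budget

# Admitting at c_conf instead of c_worst reclaims the pessimism margin,
# with occasional overruns contained by enforced termination deadlines.
```

The design point is that the confidence budget sits between the mean and the observed worst case, so soft services give back utilization that worst-case admission would waste.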
Since continuous media and digital control applications (two of the three main application types) both include pipeline processing of inputs to outputs (a pipeline is a logical stream with input, filtering stages, and output), the RT EPA has also been designed so that such real-time pipelines, along with more event-driven task releases, can be constructed with timing and data flow specified together. The specification of real-time data-flow processing has been the subject of research on in-kernel pipelines, which have also been implemented in conjunction with QoS scheduling [Gov91], [Co94], [Fal94], [MosPet96]. A more recent example of continuing work on in-kernel pipelines is Microsoft DirectX, which, along with the rich history of pipeline research, clearly establishes the importance of pipelines to efficient and reliable real-time streaming applications with device sources and sinks [McCart00]. While pipelined real-time services are typical of continuous media and digital control (two of the most prevalent real-time applications; see Section 2.5), the RT EPA mechanism is intended to provide time-critical pipelined and non-pipelined applications with quantifiable assurance of system response using a simple extension to the DM scheduling algorithm [Au93]. In addition, the RT EPA provides an admission interface and execution control which allow applications to monitor and control real-time performance of tasks and processing pipelines on-line. So, the RT EPA significantly extends existing work on in-kernel pipelines, as well as QoS soft real-time scheduling in general, through its confidence-based scheduling admission test and its on-line monitoring and control. Most importantly, given the RT EPA interface, a developer who needs to manage a mixture of hard and soft real-time services now has a framework for developing applications with quantifiable reliability, execution-model failure fire-walls, negotiable service, and on-line monitoring.
This insight into execution performance, together with intuitive quantifiable service negotiation in terms of the probability of meeting or missing deadlines, not only makes the job of implementing such systems easier, but also lets a programmer implement reliable applications that no other existing method supports.

1.5 Evaluation

The RT EPA mechanism has been implemented as an extension to the Wind River Systems VxWorks micro-kernel [WRS97]. It has been tested with applications including: 1) pseudo task loads, 2) monitoring capabilities on the SIRTF/MIPS instrument incorporating continuous media and digital control pipelines, 3) a 5 degree-of-freedom (DOF) robotic arm, 4) a video acquisition and compression pipeline, and 5) an optical navigation test-bed incorporating real-time video processing and digital control. The RT EPA has demonstrated the ability to increase the reliability and predictability of systems that require both hard real-time execution control and flexible soft real-time processing. The experiments demonstrate the viability of the confidence-based scheduling formulation and the RT EPA implementation for negotiated, monitored, on-line quantifiable service assurance. Three of the test-beds (pseudo loading, 5 DOF robot, and video compression pipeline) are applications designed to evaluate the RT EPA and serve no other purpose. Since it could be argued that such applications are "toy examples" and are not indicative of realistic applications, portions of the RT EPA were also evaluated with the SIRTF/MIPS (Space-based Infrared Telescope Facility / Multi-band Imaging Photometer for SIRTF) instrument. The code for this application includes over 45 thousand lines of C code in addition to the VxWorks Real-Time Operating System (RTOS).
The SIRTF/MIPS instrument has three real-time digital detectors with hard real-time continuous media processing pipeline deadlines, a complicated video compression method, closed-loop digital thermal control, a PowerPC pipelined microprocessor with L1 cache, many soft real-time requirements such as command handling and telemetry processing, as well as best-effort tasks such as memory scrubbing. Furthermore, SIRTF/MIPS is an example of a mixed application with continuous media (three concurrent video sources), digital control (thermal control), event-driven quasi-periodic tasks (command and telemetry by request), and totally non-real-time maintenance processing (memory scrubbing). While the MIPS instrument is a unique application, many similar examples exist in more commercial domains: virtual reality, multimedia, and in-situ interactive real-time applications such as vehicle navigation systems, digital flight management systems for aircraft, and satellite-based internet and digital media services. These types of applications, like SIRTF/MIPS, all include complex data- and event-driven algorithms, mixed service requirements, and complex high-performance microprocessor architectures. Finally, during development, the SIRTF/MIPS software was timing out on missed deadlines with RMA priorities; the RT EPA was incorporated into the system, enabling it to meet all deadlines within existing specifications. Without the RT EPA monitoring capabilities, the SIRTF/MIPS instrument was not able to operate correctly.

1.6 Summary of Research Results

The thesis presents three basic types of major results: 1) theoretical formulation, 2) prototype software, and 3) software/hardware test-beds to demonstrate goals for the framework and validate the theory.
1.6.1 Theoretical Results

The theoretical results include three major new real-time theories:
1) An engineering view of real-time scheduling that inherently includes mixed hard and soft services with quantifiable reliability and confidence in the system, and specification of desired service performance (compared to abstract levels of service).
2) Confidence-based thread admission and monitoring, reducing RMA pessimism in execution time bounds and in how they translate into the ability to respond by deadlines.
3) Evidence that thread sets can be admitted to multiple on-line epochs with priority changes between epochs, but fixed priorities within an epoch.
The confidence-based scheduling admission test is derived by modifying the DM admission test to include execution confidence intervals and deadline confidence. After multi-epoch theory is presented by example, the multi-epoch theory is shown to approach the EDF ideal upper bound of full utility.

1.6.2 Framework Prototype Implementation

The RT EPA prototype framework provides:
1) On-line confidence-based deadline monotonic (CBDM) admission of threads.
2) On-line monitoring of thread/service performance in terms of missed deadlines, reliability, and a confidence model.
3) A periodic server for performance monitoring and service re-negotiation.
4) Pipelining of data processing between source and sink interfaces to enable distribution of processing load over time and high-level sequencing and control of processing stages.

1.6.3 Proof-of-Concept Test Results

Three applications were tested with the RT EPA prototype:
1) Pseudo software loads (demonstrating basic features and characteristics of CBDM and negotiation/re-negotiation).
2) Use of the on-line kernel monitoring in the RT EPA to identify service epochs and application of this theory to solve a real-world scheduling problem on a space-based telescope.
3) Use of the full RT EPA to demonstrate all research goals with an optical navigation test-bed including multiple RT EPA pipelines for digital video processing and control of an air-powered vehicle.

1.7 Significance

The significance of confidence-based scheduling and the RT EPA is that this approach provides a reliable and quantifiable performance framework for mixed hard and soft real-time applications. This thesis provides a detailed explanation of the RT EPA and confidence-based scheduling concepts, a comparison to other soft real-time scheduling methods, the theoretical background of confidence-based scheduling, the mathematical formulation for confidence-based scheduling, an example RT EPA implementation, and results which validate the usefulness of the framework. The set of applications requiring this type of performance negotiation support from an operating system is increasing with the emergence of virtual reality environments [Nu95], continuous media [Co94], multimedia [Ste95], digital control, and "shared-control" automation [Bru93], [SiNu96]. Furthermore, in addition to providing scheduling for mixed hard and soft services, the RT EPA facility allows an application developer to construct a set of real-time kernel modules that manage an input (source) device; apply sequential processing to the input stream (pipeline stages); control individual processing stage behavior through parameters obtained from a user-space application; provide performance feedback to the controlling application; and manage the output (sink) device. This type of pipeline construction, in combination with the RT EPA on-line admission testing based on requested deadline reliability and on-line performance monitoring, makes implementation of typical continuous media, digital control, and event-driven real-time systems much simpler than with hard real-time, QoS, or best-effort systems.
The RT EPA framework with the CBDM and multi-epoch (ME) formulations for scheduling provides a powerful framework for a broad set of applications requiring timely services with measurable quality in the ability to meet deadlines.

2 Problem Statement

Real-time software systems must be functionally correct and must produce correct results by deadlines relative to the occurrence of a stream of events (in the context of this thesis, the occurrence of events is referred to as "the event-based release of a thread of execution"). Put more idiomatically, if real-time software produces a functionally/mathematically correct result that is inconsistent with the event occurrences, then it is incorrect; conversely, if such a system produces a functionally/mathematically incorrect result on time, then it is also wrong. To be correct, the application must produce a correct answer at the right time. Usually producing the correct answer before a deadline is sufficient, but in the case of isochronal applications the result must be produced at only one particular time; that is, the result must not be produced too early or too late, but on time. Unfortunately, many complex real-world systems, such as the SIRTF/MIPS video processing application and the RACE application studied in this thesis, experience very occasional jitter in response such that the end result is a timing fault which is exhibited somewhat rarely, but frequently enough to be a problem. The response jitter ultimately can be shown to stem from execution jitter, input/output jitter, or both. In the case of the RACE results presented in Section 8.4, execution jitter was seen for the frame display task that was as high as 7.7% of the average execution time on a Pentium processor with L1/L2 cache (Figure 35).
Worse yet, the SIRTF/MIPS video processing application experienced execution jitter in video frame slice compression validation processing of 55% as measured by the SIRTF/MIPS RT EPA monitor (Table 16). This 55% jitter rarely happened, but regularly caused timeouts when it did; the result is apparently random timeout faults. Full results for both applications are presented in Section 8, but the point is that the variance can cause very occasional timing faults and requires either very pessimistic assumptions of resource demands (high margin) or on-line monitoring and control for occasional glitches, as is provided by the RT EPA. The problem of processing events in real time can be categorized according to the system boundary (i.e. elements that are controlled parts of the system under construction rather than the environment in which the system must operate). By this definition we have only two domains to consider: the system and the environment. Furthermore, we will be applying an admission test which will ensure that there is sufficient CPU resource to provide the throughput and turnaround required for the system, the most basic timing issues. So, what is of most significant concern is how variances in the environment and in the system itself affect performance. Ideally, the environment would be modeled as an event source which initiates service releases in a purely periodic manner with no jitter or latency in this source. Likewise, the system would ideally be modeled as responding to the periodic demands by releasing services (threads of execution) to handle each event and produce outputs with no jitter in the response. In the case of the environment, the latency would ideally be a constant determined by the physics of the detector (e.g. speed of light, sound, electrical transmission and sensing of physical phenomena).
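Execution jitter figures like the 7.7% and 55% values quoted above are expressed relative to average execution time. A minimal sketch of that calculation, using one common definition (peak-to-peak spread over the mean) and hypothetical sample values, since the thesis does not spell out the formula at this point:

```python
# Execution jitter as a percentage of average execution time.
# Sample times (msec) are hypothetical; one release hit an occasional glitch.
samples_ms = [10.0, 10.2, 9.9, 10.1, 15.5, 10.0]

avg = sum(samples_ms) / len(samples_ms)
jitter_pct = (max(samples_ms) - min(samples_ms)) / avg * 100.0

# A single rare outlier dominates the jitter even though the average looks
# benign, which is exactly the "apparently random timeout fault" pattern.
```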
In the case of the system, the latency of response would be purely determined by a deterministic worst-case dispatch and execution time on the CPU given all demands upon this shared resource. For numerous reasons to be discussed in this section, both the environment and the system are not ideal. Therefore, given the system/environment boundary model proposed here, the two categories of timing issues that must be considered are: 1) environmental latency and variance, and 2) system latency and variance. The first category is addressed to some extent in Section 4 of this thesis, but a comprehensive treatment is not provided, because modeling of the environment is beyond the scope of this thesis. Some research into variance in event rates due to sensors and detection was completed [SiNu96]. System timing latency and variance is the category of principal interest in this thesis because it is a more tractable problem and because most systems are deployed in controlled environments which require a model specific to that environment. The problem of deploying systems in uncontrolled environments is ultimately not tractable (it requires an accurate world model), and many excellent sources for modeling environmental event rates for more well-defined and somewhat controlled environments already exist [Tin93], [Tin94], [Tom87], [Sprunt88], [Sprunt89], [Kl94]. However, it should be noted that the operational environment for the system must be reasonably well modeled in order to make good use of the RT EPA framework. The RT EPA does address the issue of environmental variance in a limited scope through the service epochs presented in Section 4; however, this assumes that a good model of environmental event rates and modes exists. As already noted, deriving an event rate and mode model goes beyond the scope of this investigation and the capabilities of the RT EPA.
Finally, a number of real-time system application domains are examined with respect to the impact of both the environmental and system variance characteristic of each domain.

2.1 System Timing Issues

Ultimately, system latency can be shown to be the sum of the latencies for the following: 1) release, 2) dispatch/preempt, 3) input, 4) execution, and 5) output, as is evident in Figure 3 and well noted [BriRoy99]. Likewise, there are four types of system variance (or jitter) that are of interest with respect to the ability to meet deadlines relative to events: 1) release jitter, 2) dispatch/preempt jitter, 3) execution jitter, and 4) I/O jitter. These variances have been noted in related research as well as in the work presented here [Ste95], [Fle95], [Bru93], [Tör95]. As already noted in the problem statement boundary definition, given the system scope of this investigation, the real-world events themselves are not considered to have significant latency or jitter from the system view alone. This is despite the fact that real-world events may be aperiodic (e.g. events generated based upon the whim of an operator, or environmental events that are hard to predict). Likewise, latency due to the speed of sound and light could actually be considered, but this goes beyond the boundaries of the system as defined here. Typically an environmental model can be formed so that real-world events are assumed to have a minimum inter-arrival time for a particular mode of service, leading to the concept of service epochs discussed in Section 4. In this thesis we consider all deadlines to be relative to the occurrence of events in the real world once they have already been detected by the system (e.g. a sensor/transducer has transformed a physical event into an electrical assertion interrupting the CPU).
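The decomposition above can be made concrete: end-to-end response latency is the sum of the five phase latencies, and a pessimistic bound on end-to-end jitter is the sum of the per-phase jitters. The values below are illustrative, not measurements from the thesis:

```python
# End-to-end latency and a pessimistic jitter bound from per-phase figures.
# Each phase maps to (latency_us, jitter_us); values are illustrative only.
phases = {
    "release":          (50, 20),
    "dispatch/preempt": (20, 10),
    "input":            (100, 30),
    "execution":        (800, 150),
    "output":           (100, 25),
}

latency_us = sum(lat for lat, _ in phases.values())   # nominal end-to-end latency
jitter_us = sum(jit for _, jit in phases.values())    # worst-case added spread

# The response meets a relative deadline D only if latency + jitter <= D.
deadline_us = 2000
meets_deadline = latency_us + jitter_us <= deadline_us
```

Summing jitters is conservative (phase variations rarely all peak together), which mirrors the pessimism argument the thesis makes against pure worst-case budgeting.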
Ultimately, all that really matters is the response latency and jitter which result end-to-end from the time of the event-related interrupt assertion to the time of the system response output assertion. Simply stated, each event release must have a response before a deadline relative to that event release, and the ability to meet the relative deadline will be determined by the sum of the latencies. Likewise, the ability to reliably meet the deadline on a periodic basis will be determined by the end-to-end jitter. Typically, latency is a matter of resource capabilities such as bus bandwidth, CPU speed, CPI (clocks per instruction), and network bandwidth. In addition, availability of the resource is also an issue and contributes to latency and jitter due to interference and the overhead time to preempt lower-priority users of the resource. Jitter, however, is much more complicated to assess since it is due to several variable factors including contention for resources (interference), system hazards (e.g. cache misses and pipeline stalls), and system inaccuracy (e.g. clock jitter and device interrupt assertion jitter). By far the most significant jitter is due to competition for resources, which is the fundamental consideration in hard real-time scheduling [LiuLay73] (the two best examples are network/bus contention and CPU interference). The secondary source of jitter is typically due to system hazards, and in general the contribution due to system inaccuracies should be minor unless the system has hardware deficiencies for real-time application. Examining each source of system timing variance in detail, there are multiple sources of latency and jitter in each of the four phases identified here.
1) Release variance: Due to I/O contention (bus grant) and CPU interference (higher priority threads), the time from when an external event occurs to when the servicing thread becomes ready to run is variable; this is the time between the external event and when the initial release is made (i.e. the service is granted the CPU resource for the first time). Furthermore, over the time from initial release to response completion there may be interference and contention that will require preemption and re-dispatch. For example, a higher priority thread may very likely become ready to run during a release that is in progress, or, in the case of I/O contention, a bus grant may be revoked due to the bus grant arbitration scheme.
2) Dispatch/preempt variance: The context switch time depends upon save/restore requirements and potentially upon kernel overhead for interrupt servicing, checking kernel events, and maintaining services, depending upon the event release interrupt priority relative to other interrupting sources (e.g. checking semaphore take lists, updating virtual clocks, and dealing with floating-point tasks).
3) Execution variance: Execution variance has two major sub-components, architectural and algorithmic sources of variance [BriRoy99].
a) Algorithm execution variance: This variance is due to non-uniform computational loading over time resulting from data-driven applications. For example, content-based digital video compression like change-only digital pixel stream compression has execution time and I/O bandwidth needs that are proportional to the scene change rate. If more pixels are changing due to motion in the scene, then the compression algorithm will have to produce more change packets per frame; in the best case there is no scene change and nothing needs to be generated.
b) Architectural execution variance: This variance is due to the complexity of microprocessor architecture features such as multi-level caches which greatly increase average throughput, but make individual release execution time difficult to predict. Examples of these features include pipelining, super-scalar ALUs, the L1/L2 cache memory hierarchy, and branch prediction [HePa90]. Hennessy and Patterson describe these features and the modeling of execution times in great detail.
4) Input/Output variance: This is variance in I/O due to the complexity of peripheral bus architectures and multiplexed access to these I/O interfaces by multiple threads of execution. Multiplexed access to buses requires mutual exclusion, which can cause priority inversion, and bus grant arbitration may also result in response output jitter due to buffering. These variances are distinguished from CPU architecture variances since they are purely I/O related. For example, a PCI bus controller may be programmed to establish the bus grant latency for each device through systems software. Scheduling the bus is beyond the scope of this thesis, and therefore all research was completed on a system with a significantly underutilized bus (maximum of 20% of available bandwidth). The RT EPA handles environmental timing variance through the use of multi-epoch (ME) scheduling, addressed in Section 4.

2.1.1 Release Variance Due to Contention and Interference

Release jitter is due to the possibility of variable phasing between interfering threads and the currently executing thread already released; between releases, a given service may encounter different levels of interference by higher priority threads prior to release. In some cases it is possible to reduce interference by synchronizing the services. Liu and Layland addressed this issue with the critical instant assumption, which means that they made no assumption about the phasing of threads.
This is the safest assumption, but it can be overly pessimistic. The synchronizing features of the RT EPA for pipelining can greatly reduce interference due to bad phasing.

2.1.2 Dispatch and Preemption Variance Due to System Overhead

A thread will potentially need to be involved in one or more context switches, depending on interference that might have added overhead to the response time. Since this overhead is a function of interference, it will experience jitter if the interference has jitter, in addition to any variances in the context switch time itself. The isochronal feature of the RT EPA can be used to control end-to-end jitter in pipelines by providing buffer holds between stages or at the end of the pipeline.

2.1.3 Algorithm Execution Variance Due to Non-uniform Loading in a Single Release

With regard to release execution time, this thesis addresses the challenge of dealing with systems that have execution jitter, and therefore impose non-uniform loading over time, by introducing the concept of multi-epoch scheduling. The SIRTF/MIPS application provides an example of how real-time scheduling problems can be solved by analyzing the scheduling feasibility of task releases in two or more separate scheduling windows, or epochs, rather than assessing utilization over the longest single inter-release period as does RMA [LiuLay73]. Likewise, DM assesses scheduling feasibility relative to the largest deadline (i.e. iterative calculation of the interference by all threads of higher priority than the thread under test). The problem of taking an existing single-threaded application and redistributing it into multiple threads is not addressed in this thesis because that is a source code optimization problem; however, the research completed here shows a distinct advantage to decomposing real-time systems into larger numbers of threads which each have smaller release execution times in order to enable multi-epoch solutions.
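The iterative interference calculation mentioned above is the standard response-time fixed point: a thread's response time is its own execution time plus the interference from every higher-priority thread, iterated until convergence [Au93]. A minimal sketch with illustrative task parameters:

```python
from math import ceil

def response_time(c, higher):
    """Fixed-point response time for a thread with execution time c,
    given higher-priority threads as (C_j, T_j) pairs."""
    r = c
    while True:
        # Interference: each higher-priority thread j preempts
        # ceil(r / T_j) times within the response window.
        r_next = c + sum(ceil(r / t_j) * c_j for c_j, t_j in higher)
        if r_next == r:
            return r
        r = r_next

# Thread under test: C = 20, against higher-priority threads
# (C = 5, T = 50) and (C = 10, T = 100). Values are illustrative.
r = response_time(20, [(5, 50), (10, 100)])
feasible = r <= 120   # DM-admissible if response time is within the deadline
```

Confidence-based admission replaces the worst-case C values in this iteration with high-confidence execution bounds, which is how CBDM builds on the same fixed-point test.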
The concept of analyzing loading over multiple co-existent scheduling epochs is clearly demonstrated by example and made possible the successful real-time scheduling of video processing on the SIRTF/MIPS instrument. The results of the multi-epoch solution are described in Section 8.3.

2.1.4 Architectural Execution Variance Due to Micro-parallelism and Memory Hierarchy

Modern microprocessor architecture features which make real-time execution difficult to predict include:
1) Pipelining, branch prediction, and super-scalar instruction execution hazards can cause CPI variances. For a super-scalar pipelined system the ideal rate of execution will be less than 1 clock per instruction. When the pipeline stalls due to a hazard, the CPI may fall to a worst-case value where no micro-parallelism is employed and one clock is required for each stage of execution (fetch, decode, execute, write-back). So, due to hazards, the CPI at any time may easily vary between 0.5 and 4 on modern microprocessor architectures such as the PowerPC and Pentium processors. An average CPI may be assumed of course, but the prediction of hazards is extremely difficult except in an average sense for a specific body of code [HePa90].
2) Memory hierarchy features contribute to execution variance. At the highest level, the L1/L2 set-associative write-back and write-through caches will experience misses based upon the specific application memory reference stream. At the next level in the hierarchy, use of virtual memory will result in occasional page faults and significant delays during page swapping. It is not typical for embedded systems to use virtual memory for this reason, but the goal of the RT EPA is to handle these types of occasional variances in timing. On Unix systems, pages can be hardwired into memory for a real-time application.
3) I/O optimizations can lead to variances in timing.
Direct memory access (DMA) transfers save CPU cycles, but the transfer and interrupt rate are driven by the bus-mastering I/O device rather than the application on the CPU. The burst transfers typical of DMA across buses also impose long periods of bus utilization which may hold off single word requests for many bus cycles.

Pipelining a CPU with micro-parallelism essentially reduces the CPI to 1.0 (increasing efficiency) as long as the pipeline is not stalled by a hazard preventing the CPU control from safely continuing parallel instruction fetch, decode, execution, and register write-back on each clock cycle [HePa90]. Furthermore, super-scalar CPUs use micro-parallelism to provide CPIs of 0.5 or less by providing not only parallel execution of CPU cycle phases, but parallel execution within each phase -- i.e. multi-instruction fetch, multi-decode, multi-path execution, and multi-path write-back [HePa90]. There is no standard form of micro-parallelism, so the efficiencies gained by a particular pipelined super-scalar architecture depend upon the exact nature of the micro-parallelism and the hazards associated with execution of arbitrary instruction sequences. From a real-time execution perspective, the efficiencies gained must in essence be ignored, since a given instruction sequence could result in a high frequency of pipeline hazards and, in the worst case, a continuously stalled pipeline, increasing the CPI from 1.0 or less to a maximum of 4.0 or more depending upon the number of CPU cycle phases.

Complex memory hierarchies including L1/L2 caches speed up processing overall (and enable pipelined/super-scalar execution rates by providing cached memory references in a single clock), but at the cost of predictability of execution time for a given instruction sequence [HePa90]. The execution variance introduced by caches is due to the complexity of predicting cache hit/miss rates for a given thread release (i.e.
it is difficult to predict a memory reference trace in advance for a reasonably complex algorithm) [HePa90]. Cache hit/miss rates can be predicted satisfactorily for trivial algorithms (such as copying a block of bytes from one location in memory to another), but many algorithms are more complex, and there are also context switch interactions that are hard to predict. Data compression, image processing, searching, sorting, root-solving, variable-step-size integration, and matrix inversion are all examples of much more rigorous data-driven computations in terms of variability in complexity and memory reference streams. Cache misses require main memory fetches that are much slower than cached hits. Furthermore, a cache miss stalls the processor pipeline until the required data or instruction is fetched. Likewise, if the memory hierarchy includes virtual memory to provide working space greater than main memory, then a page fault will greatly increase execution time while pages are swapped from disk to main memory. (Usually virtual memory is considered unacceptable for real-time applications given the huge difference in access time -- potentially from nanoseconds to milliseconds.) From the traditional RM/DM real-time perspective, the efficiencies gained by caching must be ignored, and instead it is assumed that all execution times will be worst case. So, every memory access must be considered a cache miss requiring main memory access time and associated processor pipeline stalling -- or, a super-scalar architecture with a CPI of 0.5 must be de-rated to a CPI of 4 or more. Likewise, the flexibility afforded by virtual memory is normally disabled in real-time systems, either by omission, as is the case with VxWorks, or by wiring pages used by real-time threads of execution so that they can never be swapped out to disk (an option on Unix systems such as Linux).
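The cost of this de-rating is easy to quantify. A sketch with illustrative numbers (the 500 MHz clock and instruction count are assumptions for the example, not figures from the thesis):

```c
#include <assert.h>

/* Execution time of an instruction sequence for a given average CPI.
 * De-rating a super-scalar CPI of 0.5 to a worst-case stalled CPI of 4
 * inflates the time budget for the same sequence by a factor of 8. */
static double exec_time_sec(double instructions, double cpi, double clock_hz)
{
    return instructions * cpi / clock_hz;
}
```

A 100-million-instruction release on a hypothetical 500 MHz processor takes 0.1 s at CPI 0.5, but must be budgeted at 0.8 s under the worst-case stall assumption.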
Pipeline hazard reduction decreases overall CPU pipeline stall frequency and thereby increases overall efficiency. However, once again, these features may not be relied upon in a real-time system, since methods such as branch prediction are based purely on probability and therefore cannot provide guaranteed pipeline stall control -- i.e. branch prediction improves overall efficiency, but does not improve execution time predictability.

2.1.5 Input/Output Variance Due to Shared Resource Contention and Transfer Modes

Optimizations for efficient microprocessor I/O, such as DMA transfers, increase overall I/O bandwidth and reduce CPU cycle-stealing, but the increases in bandwidth and the decreases in cycle-stealing are hard to guarantee. Likewise, buses are a shared resource, and shared bus access introduces unpredictable I/O resource contention between multiple threads of execution. If the bus is underutilized, then the I/O effects may be negligible; but a multi-threaded application which requires multi-threaded bus access needs to access the shared resource with mutual exclusion. The priority-inversion problems are well known, and while there has been progress in minimizing the possibility and duration of inversions, there is no way to avoid the problem completely [ShaRaj90]. The priority inheritance protocol prevents unbounded inversions, but can lead to chaining of temporary priority amplification, which is a problem for hard and soft real-time systems. The potential for chaining can be limited by setting a priority amplification ceiling according to either the priority ceiling protocol [ShaRaj90] or the highest locker protocol [Klein93]. In either case this still is not a complete solution, since a complex system may not be analyzable to guarantee that the ceiling is sufficient. The full implications of I/O resource contention are beyond the scope of this thesis since the focus here is CPU utilization and contention.
2.1.6 System End-to-End Latency and Jitter

Ultimately each contribution to latency and jitter leads to an overall system latency and jitter. Namely, the latency from event release to output response is the response time, which must be less than the relative deadline. Furthermore, the jitter in each phase of the response leads to overall response jitter. Figure 3 depicts response latency both with the minimal sum of the various jitter components and with maximal summed jitter. Any given response will have an overall latency within this bound.

[Figure 3: End-to-end Jitter from Event Release to Response -- timelines from a real-world event through event latency, dispatch and release interference, preemption, execution interference, execution time, and output latency to the real-world actuation, for the minimal- and maximal-jitter cases.]

2.2 Environmental Event Rate Variance Due to Nature of Events and Modeling Difficulty

Event sources can be classified as follows:

1) Aperiodic -- Events having no predictable release characteristics (e.g. system faults).

2) Quasi-periodic or bursty -- When a bursty source is active, it tends to be periodic, but when it will become active is difficult to predict (e.g. user interaction with a virtual world object through a data glove).

3) Periodic -- Events which are completely predictable in terms of inter-release frequency (e.g. video frame digitization rate through a multiplexed A/D converter).

Traditional hard real-time scheduling policies, including rate monotonic (RM) and deadline monotonic (DM), require periodicity. Therefore, aperiodic sources are modeled as periodic by determining a maximum worst-case release frequency for them and assuming this frequency to determine their period for admission tests.
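The pessimism of this modeling is direct to compute (a sketch; the numbers in the usage note are illustrative assumptions):

```c
#include <assert.h>

/* Modeling an aperiodic source as periodic forces the admission test to
 * reserve utilization at the maximum worst-case release rate, even when
 * the long-run average rate is far lower.  C is worst-case execution
 * time per release in seconds; rates are releases per second. */
static double reserved_util(double C, double max_rate) { return C * max_rate; }
static double average_util(double C, double avg_rate) { return C * avg_rate; }
```

For example, a fault handler costing 5 ms per release, modeled at a worst-case burst rate of 100 faults per second, reserves 50% of the CPU even if faults average only 2 per second (1% actual load).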
For a rare event source that is high frequency when it does emerge, significant CPU capacity will be wasted reserving resources that are normally not needed. The RT EPA addresses this problem by providing an interface for on-line admission and reconfiguration to support occasional modes which can be negotiated on-line by the controlling application -- for example, a burst of faults can trigger a real-time safing mode. It is quite possible to leave RT EPA tasks memory resident (i.e. fully allocated, but inactive) so that the transitions between such modes simply require executing activation and deactivation sequences. In the SIRTF/MIPS application this concept of multiple execution modes was extended such that admission and execution control is provided in two distinct modes on-line, with dynamic priority assignment between executions in a given mode (an epoch).

2.3 Characteristics of emerging real-time applications

In "Dynamically Negotiated Resource Management for Data Intensive Application Suites" [Nu97], it is noted that emergent real-time applications require a range of execution performance from hard real-time to soft real-time, with the degenerate case being best effort. Ideally, a programming framework for CPU scheduling should provide a simple interface for specifying the full range of desired execution performance in a real-time system. Currently, there are three types of CPU scheduling frameworks: 1) hard real-time priority preemptive, 2) soft real-time quality of service, and 3) non-real-time. A better framework for a range of required real-time execution performance would be beneficial because, in addition to traditional hard real-time application domains such as digital control and continuous media (e.g. digital video and audio), there are many emerging systems which have soft real-time requirements (e.g. multi-media entertainment systems) and, even more interesting, mixed domains (e.g.
virtual reality). These emerging application domains have created a need to fill the gap between best effort scheduling and RMA or DM hard real-time priority preemptive scheduling with an approach that is more reliability oriented than QoS methods. Furthermore, due to more user interaction in mixed applications such as virtual reality, there is a need for more flexible and more dynamic mixed hard/soft real-time systems which provide an interface for reconfiguration, negotiation for service, and on-line monitoring. In the following sections, the traditional domains are reviewed with respect to their real-time characteristics, and the case for more on-line configuration control is made by analyzing the characteristics of emerging domains.

2.3.1 Loading characteristics of purely continuous media

Continuous media applications must be isochronal end-to-end so that the output data is neither too early nor too late -- either case causes output jitter and will result in poor end-user media quality. Continuous media such as digital video and audio are not a new application domain, but the popularity and importance of these applications has increased due to multi-media applications and the proliferation of digital internetworking. The loading characteristics of these systems are periodic since they are driven by frame rates determined by human-computer interaction principles (e.g. motion picture frame rates). What is interesting about continuous media applications is that, due to the typically high bandwidth requirements of networked applications, most video systems include compression pipelines. So, with video pipelines, there can be a large amount of execution jitter stemming from data-driven compression for network transport and tradeoffs to maximize performance so that the overall application is neither CPU nor I/O bound (as depicted in Figure 4).
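One concrete source of such data-dependent load is whole-frame manipulation. A minimal sketch (not the detailed Appendix B analysis) of a pixel-brightness centroid, an algorithm typical of optical navigation, over an 8-bit frame:

```c
#include <assert.h>
#include <stddef.h>

/* Intensity-weighted mean pixel coordinate (brightness centroid) over a
 * w x h 8-bit frame.  Every pixel of the frame (~1 MB at 1024x1024) is
 * touched once, so cache miss behavior dominates execution-time
 * variance.  Returns -1 for an all-dark frame (centroid undefined). */
static int brightness_centroid(const unsigned char *img, int w, int h,
                               double *cx, double *cy)
{
    double sum = 0.0, sx = 0.0, sy = 0.0;
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++) {
            double v = img[(size_t)y * w + x];
            sum += v;
            sx += v * x;
            sy += v * y;
        }
    if (sum == 0.0)
        return -1;
    *cx = sx / sum;
    *cy = sy / sum;
    return 0;
}
```

The arithmetic per pixel is constant; the execution variance comes almost entirely from the memory reference stream over the large frame buffer.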
The execution jitter due to data-driven algorithms for compression is also exacerbated by jitter due to high cache miss rates associated with large frame buffer manipulations. A worst case for frame processing and compression algorithms can be extremely pessimistic. For example, take a simple algorithm which determines the pixel brightness centroid in a 1024x1024 image (a typical algorithm used in optical navigation). A detailed analysis of such a scenario is provided in Appendix B, but in short, the execution time can vary from 70 to 129.5 milliseconds per frame given a PowerPC 750 microprocessor and its architectural characteristics [HePa90]. Furthermore, if there are multiple pipelines in a multi-media application, I/O resource contention may also cause additional output jitter due to demands made on the bus by devices such as memory-mapped frame-grabbers and soundcards. To summarize the characteristics of continuous media, the period jitter is low and the execution jitter may be high due to compression and manipulation of large frame arrays.

[Figure 4: Continuous Media Digital Video Pipeline -- source and sink flow control applications with local pipeline agents (+/- milliseconds) connecting a HW/SW frame grabber, frame compression, network devices, frame decompression, and a video adapter.]

2.3.2 Loading characteristics of purely event-driven processing

Fault handling is one of the best examples of purely event-driven processing. Normally, once a fault is detected, the earlier a response is generated, the better. It is very difficult to predict fault rates in advance, and there will be aperiodic faults as well as the possibility of bursty faults depending upon the nature of the fault detection and sensor interface processing software and hardware.
Typically, hard real-time fault protection systems are designed to handle a particular maximum fault rate and often will safe the system completely if the rate becomes higher than the design maximum (i.e. if there is a fault in the fault protection system itself, such as a fault queue overflow). The RT EPA has been designed to provide an interface with on-line admission and reconfiguration to support occasional task load sets which can be negotiated for admission in advance and then brought on-line by the controlling application -- for example, a burst of faults can trigger a real-time safing mode. It is quite possible to leave an already admitted RT EPA task set memory resident (i.e. fully allocated, admitted as a separate set from the currently active set of tasks, but inactive) so that the transitions between such modes simply require executing activation and deactivation sequences for the set (activation and deactivation of services are part of the RT EPA application programmer's interface). Figure 5 depicts such a scenario. After such a scenario it would be possible, assuming a system recovery was possible, to reactivate the nominal task set.

[Figure 5: Purely Event-Driven Real-Time Processing -- sensors and sensor electronics feed sensor processing and fault identification, which drives fault handling for a faulty effector in the environment.]

2.3.3 Loading characteristics of purely digital control applications

Digital control applications must be isochronal end-to-end so that the output data is neither too early nor too late, since either case causes output jitter and will result in decreased stability of the control loop. Digital control applications are periodic, sensitive to jitter and/or dropouts, and typically have low execution variance for simple digital control applications like thermal control or other types of large-time-constant, single-variable control problems (depicted in Figure 6).
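A common way to bound such output jitter (a generic sketch, not the RT EPA's internal mechanism) is to compute each output instant absolutely from the start of the control epoch rather than relative to the previous release, so that per-release jitter never accumulates into drift:

```c
#include <assert.h>
#include <stdint.h>

/* Drift-free isochronal output instants: the k-th release time is
 * epoch_ns + k * period_ns, computed from the fixed epoch start rather
 * than "previous release + period", so per-release jitter cannot
 * accumulate into drift of the actuator output rate. */
static int64_t release_instant_ns(int64_t epoch_ns, int64_t period_ns,
                                  int64_t k)
{
    return epoch_ns + k * period_ns;
}
```

On a POSIX system such an instant would typically be passed to clock_nanosleep() with TIMER_ABSTIME to hold the output buffer until the scheduled release.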
However, applications such as spacecraft attitude determination and control have many more variances, but still require the same sort of end-to-end stability in the input/output rates. Attitude determination and control may require significant sensor filtering (e.g. an extended Kalman filter) on the sensor interface and may require high-load computations such as orbit determination and momentum management to produce a single actuator output. To summarize the characteristics of digital control, the execution jitter tends to be low, but output jitter must be as minimal as possible to prevent timing instability. Regularity of output can be much more important than consistency of sample inputs -- i.e. many digital control applications are robust to occasional stale sample input, but are often sensitive to output jitter which may cause actuators to adjust control points non-uniformly over time and lead to instability.

[Figure 6: Real-Time Digital Control Processing -- sensors and sensor electronics feed sensor processing and a digital control law, which drives actuator processing, actuator electronics, and effectors acting on the environment.]

2.3.4 Loading characteristics of mixed processing

Mixed applications which include some combination of continuous media and event processing are becoming more prevalent with more sophisticated user interaction interfaces and applications. For example, virtual reality (VR) systems include frame-based rendering of scenes as well as event-driven user input (depicted in Figure 7). In general, the rendered scene only needs to be updated in the video buffer when there is a change in observer perspective (e.g. the observer changes viewpoint or moves in the VR world). So, rendering is, on the one hand, a frame-based service, but on the other hand more event-driven than video processing, since it is inherently a change-only service (video may be designed to be change-only, but is not so by nature as VR rendering is).
Another characteristic of rendering is that scene complexity drives the execution time dramatically. A VR rendering algorithm can therefore control execution variance by rendering all scenes, no matter how complex the world model, with a fixed number of polygons; but this may not be desirable, so there may be execution variances in frame rendering times due to variations in the number of polygons required. Furthermore, it is possible to design the rendering to trade off frame rate against the number of polygons rendered to balance update rates and scene quality -- for example, if one moves quickly through a VR world, then the execution variance can be controlled by reducing the quality of the frame rendering in favor of a higher update rate (half the number of polygons and twice the frame rate). Making these types of trades is not the subject of this research, but providing a framework to reliably schedule releases of VR tasks and control execution is an issue no matter how these trades are made.

[Figure 7: Mixed Continuous Media and Event-Driven Real-Time Processing -- VR control applications and VR world rendering/models (+/- milliseconds) connected through network devices, with data gloves and graphics accelerators on each side.]

2.3.5 Loading characteristics of mixed event-driven and digital control

Mixed applications which include some combination of digital control and event processing are becoming more prevalent with more sophisticated semi-autonomous robotic systems that include shared control of a device by combining user interaction with autonomous agents [Bru93]. For example, a semi-autonomous robotic system is typically commanded at a very high level, such as providing a navigational way-point and then allowing the system to autonomously achieve the goal, with digital control to handle environmental perturbations and agent software to handle faults.
A perfect example is a robotic rover which must deal with obstacles, control speed on varying surfaces and inclines, and navigate to a user-defined way-point. All modern satellite systems fit this application class as well, since they include hard real-time digital control for attitude determination and control, but also event-driven automation for fault detection and safing. Semi-autonomous systems are becoming more and more prevalent in space systems and robotics due to the high cost of tele-operation and the infeasibility of tele-operation in some circumstances (e.g. large latency in communications with planetary rovers, underwater vehicles, and robotic manipulators) [Bru93] [Fle95]. On-line re-planning is often performed for rovers, whereby a rover may encounter an unmapped obstacle and require an update to its obstacle model and on-line re-planning of its route. So, it would be typical for such an application to include a more deliberative planning agent function and a more reactive agent function which provides immediate avoidance of an obstacle (depicted in Figure 8).

[Figure 8: Mixed Digital Control and Event-Driven Real-Time Processing -- a deliberative agent (soft real-time planning and management, +/- minutes to sub-seconds) and an interactive agent over a reactive agent performing hard real-time software digital control (+/- milliseconds), backed by a HW/SW digital controller (+/- microseconds) acting on the environment.]

2.3.6 Loading characteristics of mixed real-time applications in general

In summary, mixed applications provide services with different requirements on deadline reliability for various environmental event releases, with a range of system execution demands and variances in release frequency. These environmental event rates and system loading demands are summarized in Table 2 (required reliability would be a third dimension -- not shown).
Within the system space alone, it is also clear that there is a range of application demands upon the combination of I/O and CPU resources. This feature space is characterized by Figure 9.

Table 2: Environmental Event-Rate Types with Application Examples of Each

                 High Execution Variance   Medium Execution Variance   Low Execution Variance
Periodic         Digital video             Digital audio               Digital control and packet telemetry
Quasi-Periodic   Obstacle avoidance        VR scene rendering          VR user input (e.g. data glove)
Aperiodic        On-line replanning        Fault handling              Command processing

It is not hard to imagine applications which include combinations of these types of environmental and system characteristics -- e.g. a remotely piloted vehicle with video and audio streaming to a VR interface for semi-autonomous operator shared control of the vehicle. This application would have digital video, digital audio, digital control, telemetry, obstacle avoidance, on-line re-planning, fault handling, and command processing on the vehicle processor, and likewise digital video, digital audio, scene rendering, high-bandwidth user input, telemetry, and command processing within the VR control environment.

[Figure 9: Feature Space of System I/O and CPU Requirements By Application Type -- applications plotted by I/O bandwidth versus CPU loading: data acquisition (DMA words/sec), graphics display (frame X x Y, rate), continuous media processing (buffer, rate), rendering (N polygons, rate), digital control (N sensors/actuators, rate), and numerical simulation/modeling (N elements, rate).]

As already noted, digital video, audio, and control are periodic, and the range of execution variances stems from algorithm complexity and data manipulation size (larger data units are more likely to experience cache misses and pipeline stalls).
For quasi-periodic applications the environment drives the release rate, and typically this is bursty, depending upon, for example, motion in the environment -- if a rover moves through an environment quickly, then obstacle avoidance runs at a higher rate, and depending upon the complexity of the environment and the algorithm (local or more global), the execution variance can be high. Likewise, if an avatar moves through a VR world quickly, then the frame update rate and scene change rate increase. Aperiodic event sources are exceptions to normal processing -- faults definitely fit this definition, as does the associated fault handling, which may also require on-line re-planning for highly autonomous systems such as a planetary rover. Command processing is completely driven by a mission and/or user input, which is clearly aperiodic in nature, although a maximum command rate may be imposed at a periodic rate. More examples could be filled in, but Table 2 is sufficient to establish that applications do exist which span this range of release and execution demands upon systems.

3 Related Research

Most of the related research focuses on either admission tests and scheduling policy or on processing frameworks given an admission test and policy. The most extensive frameworks comparable to the RT EPA include the Chorus micro-kernel QoS work based on DM scheduling by Coulson et al. [Co94], the RT-Mach QoS resource reservation framework research at Carnegie Mellon University [MerSav94], and work by Jeffay et al. at the University of North Carolina to build frameworks that deal with release period and loading variance [JefGod99]. In all cases, the application domain focus for these frameworks is continuous media rather than mixed applications.
Soft real-time frameworks relate to the RT EPA in that they also accommodate execution and release period variances as well as marginal task sets, but these approaches ultimately require mapping tasks onto abstract levels of service [BraNu98]. It is not clear how soft real-time QoS methods can support mixed hard and soft real-time applications, since they inherently optimize and/or control loading to maximize service quality rather than controlling deadline reliability as the RT EPA does. Finally, the Pfair method provides an approach to scheduling releases in multiple windows much like the multi-epoch scheduling investigated with the RT EPA [Baruah97]; however, Pfair reduces these windows to very small slices within a single release period, rather than the coarser-grained windows investigated with the RT EPA. A clear distinction of the RT EPA research is that related research focuses on a particular application type, such as continuous media or digital control, in isolation -- not on mixed-service applications with mixed deadline reliability requirements as the RT EPA research does. Furthermore, it is not clear at all that any of the related research attempts to take the deadline reliability view of scheduling a mixed set of services as the RT EPA research presented here does.

3.1 Hard Real-Time Research Related to RT EPA

As noted in the introduction, hard real-time theory relates to the RT EPA since it defines the least upper bound for marginal task sets that the RT EPA has been designed to handle in terms of deadline reliability and execution variance control. The work of Liu and Layland clearly defines the least upper bound for fixed priority scheduling and dynamic priority scheduling and therefore defines limits on the potential capabilities of the RT EPA. Furthermore, the RT EPA directly extends the DM equations developed by Audsley and Burns [Au93] to incorporate execution variance and reliability directly into the DM admission test.
The goal of the RT EPA and CBDM is to minimize the pessimism in two classic hard real-time assumptions:

1) Bounds on execution for admission testing must be worst-case.

2) No assumptions regarding the phasing of service releases may be made.

Furthermore, the RT EPA provides the same admission test as DM for guaranteed service requests, but allows such hard real-time services to coexist with soft real-time services by firewalling guaranteed services from the effects of soft overruns and dropouts.

3.2 Soft Real-Time Research Related to RT EPA and Confidence-based Scheduling

The RT EPA has been designed to perform on-line admission testing, monitoring, interference fire-walling, and on-line negotiation for soft real-time applications, and therefore research in soft real-time QoS is related. Potentially related QoS frameworks include: 1) RT-Mach processor capacity reserves [MerSav94], 2) Rialto [JonRos97], 3) the DQM middle-ware [BraNu98], 4) SMART [NiLam96], and 5) MMOSS [Fan95]. What is inherently different is that while all of these soft real-time execution frameworks provide methods to schedule marginal task sets at the cost of missing occasional deadlines, they do so without a direct specification of the reliability desired for each task. Furthermore, it is not clear how an application would negotiate for hard real-time guarantees mixed with soft real-time deadline reliability as is possible with the RT EPA. Like QoS methods, the RT EPA also provides an interface for negotiation of a service level -- for the comparable QoS systems these service levels are abstractly quantified in terms of benefit and resource requirements, but for the RT EPA service levels are clearly quantified in terms of reliability [NuBra99].
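One generic way to quantify such a reliability level (a sketch of the idea, not necessarily the thesis's CBDM formulation) is to budget execution at a chosen quantile of observed release execution times:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Pick an execution budget as the nearest-rank q-quantile (0 < q <= 1)
 * of n observed release execution times, so that roughly a fraction q
 * of releases complete within the budget.  A higher requested
 * reliability yields a larger budget to admit against. */
static int cmp_dbl(const void *a, const void *b)
{
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

static double exec_budget(const double *samples, int n, double q)
{
    double *s = malloc((size_t)n * sizeof *s);
    double budget;
    if (s == NULL)
        return -1.0;
    memcpy(s, samples, (size_t)n * sizeof *s);
    qsort(s, (size_t)n, sizeof *s, cmp_dbl);
    int idx = (int)(q * n);               /* nearest-rank index */
    if (idx >= n)
        idx = n - 1;
    budget = s[idx];
    free(s);
    return budget;
}
```

A task negotiating 90% deadline reliability would then be admitted against a smaller budget than one requesting the observed worst case, which is exactly the trade a reliability-quantified service level makes explicit.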
It is assumed that the application designer will make decisions as to the benefit of negotiating for a particular reliability with the RT EPA; however, not only does the RT EPA ensure that it can meet the requested reliability before admitting a task, it also returns the reliability level possible whether or not a task is successfully admitted, so that higher or lower reliabilities can be negotiated. Finally, QoS methods do protect one service level from another, similarly to the termination deadline fire-walling provided by the RT EPA.

3.3 Execution Frameworks Similar to RT EPA

A number of pipeline mechanisms for continuous media have been developed [Gov91], [Co94], [Fal94]. However, most common implementations include application-level processing with device buffers mapped from kernel space into user space, rather than an in-kernel mechanism for executing trusted modules loaded into kernel space. Likewise, these memory-mapped implementations also employ user-level threads with split-level scheduling or bindings of user threads onto kernel threads. The splice mechanism is most relevant since it operates in kernel space using loadable modules or simple streaming as the RT EPA does, and was shown to have up to a 55% performance improvement [Fal94]. However, splice does not provide a configuration and on-line control interface for scheduling like the RT EPA does. Many examples of periodic hard real-time digital control streams exist [Kl94], but no general mechanism for reliable real-time control of pipelines is known to exist. Research on process control requirements for digital control indicates that parametric control of a number of processing pipelines within a general operating system environment would be useful for sophisticated industrial applications. Finally, many real-time semi-autonomous and "shared control" projects are in progress [Bru93] [Fle95], including applications where occasional missed deadlines would not be catastrophic [Pa96].
4 Scheduling Epochs

On-line monitoring of execution not only provides feedback on deadline performance, but gives insight into the load distribution on the system. The traditional hard real-time approach to scheduling is to analyze loading over the longest period of all tasks in the system, since the deadline guarantees are based on estimating worst-case interference along with actual resource demands by any given task in the system. The DM approach somewhat improves upon the RMA least-upper-bound computation since its iterative interference formulation lends itself better to on-line admission than a simple bound does. With DM, an admission test must use iteration to compute utility and interference for each period (or deadline) in the system -- at the cost of algorithmic complexity in the admission test compared to the quickly computed RMA least upper bound. With either approach, the basic assumption is that there is one basic steady-state release model for the system, and if the system has different modes of execution (i.e. vastly different task sets and releases), then each mode will be separately analyzed for scheduling feasibility. A very good example of this is the Space Shuttle PASS (Primary Avionics SubSystem), which has a high-, medium-, and low-rate executive, and each of these threads of execution runs different software based on a major and minor flight software mode for the Shuttle phase of flight (e.g. re-entry and ascent are totally different modes) [Carlow84].

4.1 Multiple On-line Scheduling Epoch Concept Definition

In contrast to the traditional RMA and DM hard real-time view of releases is the extreme case of the Pfair algorithm, which reduces the window of scheduling feasibility not just to a window smaller than the longest overall period, but actually to a window shorter than even the shortest release period.
The goal of Pfair is to ensure that all tasks make progress at a steady rate proportional to the weight (utility) of the task [Baruah97]. To do this, the Pfair scheduler must slice a release up into many smaller releases with intermediate pseudo-deadlines. Multi-epoch scheduling is at a much higher level of granularity, but lies between the Pfair extreme and the RMA on-line mode extreme by considering the possibility that multiple modes, or scheduling epochs, can be active at the same time, although only one epoch can be released at any given point in time.

Figure 10: Multiple Epochs of Scheduling Active Simultaneously (services s1, s2, s3 over releases e1-r1, e1-r2, e1-r3 and e2-r1, e2-r2, e2-r3; s2 & s3 are admitted to all e1 epoch releases with unique negotiation, and s1 & s3 are admitted to all e2 epoch releases with unique negotiation)

While the multi-epoch capability has not yet been implemented in the RT EPA, the framework could be extended to provide specification of epochs and admission of threads to each epoch, with virtual management of the resources available to each epoch. As already noted, and as is apparent in Figure 10, the major restriction on the proposed RT EPA support for multi-epoch scheduling is that the epochs do not overlap, i.e. they are mutually exclusive in time. Much like Pfair, a release which originally executes over a period spanning two or more epochs can be decomposed into separate releases in each of the epochs. This capability is supported by the RT EPA's on-line monitoring, which makes clear which sub-periods of the longest overall release period are highly loaded and which are relatively lightly loaded. Furthermore, the RT EPA pipeline framework for decomposing single releases into multiple synchronized releases would require that pipelines run completely within a single epoch; pipelining between epochs would be indirectly possible through careful configuration by the application.
4.1.1 Admission and Scheduling Within an Epoch

The RT EPA concept for epochs assumes that normally multiple services would be released in each epoch, but that epochs themselves would be released with known phasing such that there is no need to consider the possibility of two epochs being released at a critical instant; epoch phasing must be fully specified to the RT EPA, and the RT EPA admits services to an epoch just as it does to a system. The concept of multiple epochs is most useful when there are clear event release epochs, which in turn imply unique service epochs for them. It is possible to divide service releases into logical epochs even when event releases do not naturally fall into epochs, but this simply introduces more RT EPA management overhead. Admission to an epoch, just like admission to a system, requires:

1) Service code to run
2) Release frequency within the epoch, resource needs (Cexp), and deadline requirements
3) Desired service quality
4) Event or time release source of the epoch

For each service admitted into an epoch, the RT EPA must assign resource usage priority and monitor actual usage and control overruns just as it does for a single system, so multiple epochs do greatly increase management overhead. If a service is admitted to more than one epoch, then it will have multiple release and execution contexts; viewed from the system level (over all active epochs) it therefore has dynamic priority, but fixed priority within a single epoch. Furthermore, the simplest and perhaps most useful application of epochs is to subdivide the scheduling problem and allocate independent execution sequences to individual epochs, with epoch release requiring no dispatch or preemption of another epoch because all epochs are released by other epochs.

4.1.2 Active Epoch Policy

Just like a service in a system, an epoch must have a policy for becoming the active epoch (the equivalent of a service being dispatched).
Since epoch overhead is significant, and since it is envisioned that most applications will have limited (although significant) need for multiple epochs, the proposed RT EPA active epoch policy is simple:

1) The epoch must be based upon an event that is part of the normal system event release stream.
2) The epoch is active to a deadline equal to the longest period of any service admitted to the epoch.
3) Epochs are dispatched and preempted by priority assigned according to the epoch release period.

Due to the overhead of multiple levels of dispatch and preemption and the overhead of adjusting service priorities for the currently active epoch, it is envisioned that most systems would make use of only a few epochs at most. Furthermore, in most cases epochs will never need to preempt each other if the simplest phasing rule is applied, namely that completion of epoch e1 releases epoch e2.

4.2 Equivalence of EDF and Multi-Epoch Scheduling in the Limit

Within each epoch, every service has a unique priority based upon its deadline within that particular epoch. The service with the shortest deadline has the highest priority within the epoch by the DM and CBDM policies as applied to the epoch concept. In the limit, a system could be decomposed into as many epochs as there are event releases, so that only one service is released in each epoch. The priority assigned to a service would be adjusted according to the active epoch and the active service within that epoch, even though there is only one service. Epochs pending at the time a new epoch is realized would require demotion of their services so that the current epoch can be handled.
If the active epoch policy is to make the shortest epoch active, then by definition its single service has the earliest (in fact, only) deadline in that epoch and is assigned the highest priority within the epoch; thus the earliest-deadline service always has the highest priority in the system, and since the active epoch is always the shortest one, this is the definition of EDF. Of course, this multi-epoch limit suffers the same problem as EDF: the on-line identification of epochs would impose so much overhead at this fine granularity that it is infeasible. Given the multi-epoch framework and policy, if all epochs are decomposed down to a single service, then in essence this specifies that all service priorities be adjusted with each service release and that the epoch/service with the shortest deadline be dispatched, so on every service release the shortest-deadline service is effectively dispatched.

4.3 Application of Multi-Epoch Scheduling

During the RT EPA research, the question arose as to whether it would be possible to take this concept of scheduling feasibility analysis for modes and use the RT EPA monitoring capability to identify multiple modes and redistribute releases. For example, a video processing application could be viewed as having a data acquisition mode and a data processing mode; generally the time to switch from one mode to the other would be very short. Once modes are identified and characterized, the RT EPA can help an application developer redistribute releases in order to take a single mode which cannot be scheduled reliably and turn it into two modes, both on-line at the same time, that can be scheduled reliably. In essence, we have on-line switching between modes at a high frequency (compared with, e.g., Shuttle software, which switches modes only a few times during a mission), and we use the RT EPA monitoring to identify the modes and reorder releases.
Ultimately it is envisioned that, at the granularity of threads, the RT EPA can facilitate on-line redistribution of releases to improve scheduling.

4.3.1 SIRTF/MIPS Multi-Epoch Scheduling Example

The importance of scheduling epochs is clearly demonstrated by the RT EPA monitoring experiments with the SIRTF/MIPS software. The SIRTF/MIPS multi-epoch decomposition was needed to solve a loading and deadline reliability problem, but it is important to note that the epochs applied to SIRTF/MIPS were only released by other epochs (there was no concept of epoch preemption and dispatch); all scheduling epochs in SIRTF/MIPS were released by the completion of another epoch. The MIPS epochs included:

e0: The instrument ready state, in which telemetry is gathered and processed, the clocking hardware is ready to start clocking out exposure data, and the ADC channels are ready to digitize the data.
e1: An exposure start event ends e0 (ready) and starts e1 (exposure start sequencing and clocking synchronization).
e2: An exposure collection cycle start event ends e1 and starts e2 (data collection and processing for the steady state prior to detector electronics reset to avoid saturation).
e3: A programmed detector reset event ends e2 and starts e3 (data processing completion; no data ready). The e3 epoch is ended by one of two events: either return to the ready state (e0) or return to data collection (e2).

The segmentation of the SIRTF/MIPS scheduling into these four service epochs made it possible to analyze the scheduling feasibility of each independently (e2 and e3 timing are summarized in Table 5). Ultimately only one priority was adjusted between epochs and only one allocation change was made between epochs to provide system-level scheduling feasibility, but without this dynamic priority adjustment and this reallocation of loading, the SIRTF/MIPS instrument software would never have been able to meet deadlines for the most CPU-intensive exposures.
The RT EPA kernel-level monitoring capabilities were used to identify the loading in each epoch and to determine priorities in each epoch. Initially, exposures could not be scheduled reliably because execution variances were causing processing to miss deadlines and time out. The results of this experiment are provided in Section 8. Each epoch identified for SIRTF/MIPS was clearly related to a real-world event release epoch. One interesting fact is that the hardware characteristics were designed to provide better granularity in processing events. Initially, a 16 Kword FIFO was proposed for the detector data digitization sources, which led to a highest-frequency data-ready event period of 262 milliseconds, as shown in Table 3.

Table 3: Single Epoch Design of the SIRTF/MIPS Video Processing With 16 Kword FIFOs

task ID   Description                       Release Period (msecs)
          Epoch: Compressed Frame Set       2621.44
1         Si FIFO driver (SOURCE)           262.144
2         Ge FIFO driver (SOURCE)           917
3         Science FIFO driver (SINK)        131.072
4         Ge Processing and Compression     917
5         Si Processing and Compression     262.144
6         Science Grouping                  1048.576

Note that the Science FIFO service has a period of 131 milliseconds, but that is a sink event rate rather than a data-ready source event rate. Processing can only start when data is ready, so the source event rates are what drive the overall pipeline, and the data-ready event period would have been 262 msec for the 16 Kword FIFO. In order to better distribute loading and releases over time, it was decided that a 4 Kword FIFO be used instead of the 16 Kword FIFO. This shortened the source event period to 65 milliseconds, allowing design of a pipeline with a fundamental 65 millisecond release period driven by the source. Shorter releases provided for better distribution of interface servicing in the system as a whole (summarized in Table 4).
Table 4: Single Epoch Design of the SIRTF/MIPS Video Processing With 4 Kword FIFOs

task ID   Description                       Release Period (msecs)
          Epoch: Compressed Frame Set       2621.44
1         Si FIFO driver                    65.536
2         Ge FIFO driver                    233.02
3         Science Link FIFO driver          32.768
4         Ge Processing and Compression     131.072
5         Si Processing and Compression     65.536
6         Science Grouping                  1048.576

Epochs e2 and e3 coexist during steady-state processing of exposure data to compress it for downlink to the ground. The e0 and e1 epochs are only active during the initialization of an exposure, and the system always returns to e0 once an exposure is finished. So, the instrument receives an exposure command, transitions from e0 to e1 and then to e2, and alternates between e2 and e3 until the exposure finishes in e2 and finally returns to e0. In the case of SIRTF/MIPS, the scheduling epochs are coexistent but mutually exclusive, and the release phasing is well known. This is the ideal situation for multi-epoch scheduling.

Table 5: Multiple Epoch Design of the SIRTF/MIPS Steady-State Video Processing

Epoch 2: Reset and Initial Sample, 1/8th Frame Period – No data available (589.824 msecs)

task ID   Description                       Release Period (msecs)
1         Si FIFO driver                    inactive
2         Ge FIFO driver                    233.02
3         Science Link FIFO driver          32.768
4         Ge Processing and Compression     131.072
5         Si Compression                    589.824
6         Science Grouping                  589.824

Epoch 3: Sample Frame Period – Data available (2031.616 msecs)

task ID   Description                       Release Period (msecs)
1         Si FIFO driver                    65.536
2         Ge FIFO driver                    233.02
3         Science Link FIFO driver          inactive
4         Ge Processing and Compression     131.072
5         Si Compression                    589.824
6         Science Grouping                  1048.576

4.3.2 Multi-Epoch Scheduling Compared to Multi-Level Scheduling

Multi-epoch scheduling differs from multi-level scheduling, in which a multi-threaded task is scheduled and in turn schedules its own thread set during its release, since multi-epoch scheduling requires that the software have only one scheduling policy and mechanism overall.
The multiple epochs are simply sub-periods of a larger period related to releases, and the leverage of the multi-epoch view of the longest period in the system is that it provides a method for analyzing how to adjust releases and their relative phasing when the application does in fact have such control. The RMA critical instant assumption pessimistically assumes that all releases might be simultaneous and will have no predictable relative phasing, when in reality multiple hardware interfaces can and likely will be synchronized, and software events may be directly correlated with hardware events (e.g. processing applied every tenth frame). The RT EPA facilitates multi-epoch scheduling by providing on-line monitoring and task set activation and deactivation, but it is envisioned that the RT EPA could be further extended to actually admit task sets to one or more epochs, after which the epochs themselves would be admitted. It is interesting to note that this concept is further facilitated by many short-execution task releases rather than a small number of long-execution releases; a larger number of smaller releases can be interleaved and redistributed more easily. Overall, the idea of multi-epoch scheduling is to provide a method to control loading distribution to prevent missed deadlines due to transient overloads arising from execution variance.

5 Real-Time Execution Performance Agent Framework

The RT EPA provides a framework for both hard and soft real-time threads. The definition of hard real-time is well understood and universally accepted, but the definition of soft real-time is not yet universally understood and accepted. The RT EPA implements the traditional hard real-time admission policy and underlying priority-preemptive priority assignment policy, and will safe the entire system when a hard real-time termination deadline is missed. For soft real-time, the RT EPA allows for bounded overruns and for service dropouts.
Since most real-time applications are driven by events and data/resource availability, the RT EPA provides methods for setting up phased data processing pipelines between a source and a sink. Finally, the RT EPA has been implemented as VxWorks kernel extensions and application code, but it can be ported to most modern operating systems and microprocessors that provide priority-preemptive scheduling of multiple threads, kernel-loadable modules with access to kernel task state and context switching, access to a real-time clock with microsecond or better accuracy, asynchronous real-time signaling (e.g. POSIX 1003.1b compliant) [POSIX93], priority inheritance and priority ceiling protocol mutual exclusion semaphores, and binary signaling semaphores.

5.1 Design Overview

This basic in-kernel pipeline design is similar to the splice mechanism [Fal94], but the RT EPA API, performance monitoring, and execution control are much different. Each RT EPA module, shown in Figure 11, is implemented as a kernel thread configured and controlled through the EPA and scheduled by the CBDM (Confidence-Based Deadline Monotonic) algorithm.

Figure 11: In-Kernel Pipe with Filter Stage and Device Interface Modules (the application makes system calls to the kernel API of the Execution Performance Agent, which controls a pipeline of device interface and pipe-stage filter modules between a source device and a sink device across the HW/SW interface)

The controlling application executes as a normal user thread. The RT EPA mechanism is efficient due to the removal of overhead associated with protection domain crossings between device and processing buffers, and reliable due to kernel thread scheduling (compared with split-level scheduling of user threads). For an RT EPA implementation in a single-user, single-protection-mode operating system like VxWorks, the significance of in-kernel execution is inconsequential, but this does not affect the basic design.
The RT EPA API provides configuration and execution flexibility on-line, with performance-oriented “reliable” execution (in terms of the expected number of missed soft deadlines and missed termination deadlines).

5.1.1 Pipeline Time Consistency and Data Consistency

The RT EPA may be used to admit and schedule any set of asynchronous services, but pipelines are typical for continuous media and digital control real-time applications. The advantages of designing a real-time data processing application as a pipeline include:

1) testability of individual stages,
2) on-line configuration control of the pipeline processing stages,
3) stage super-frequencies and sub-frequencies,
4) buffer holds for isochronal output, and
5) stage load distribution and the potential for parallel processing.

These are all advantages of the pipeline approach compared with, for example, a single interrupt-driven executive that provides equivalent processing in a single thread. However, the most fundamental advantage of pipelines is that they provide flexibility in terms of time and data consistency. Requirements for time and data consistency may vary, and the RT EPA provides a simple way of configuring a pipeline to meet varying requirements. For example, a digital control application may have strict time consistency requirements but fairly relaxed data consistency requirements. For digital control it is often more important that outputs to system actuators be made on a very regular time interval; since sensors are usually over-sampled compared to outputs, the consistency between sample data and the outputs produced is less important. If an output stage sometimes uses the three most recent samples and at other times uses the two most recent plus an older sample, just missing the third, that is not particularly a problem (i.e. it does not affect stability or responsiveness), but if there is output jitter, this may compromise stability [Tin94].
In other cases, it may be more important to have a fully synchronized pipeline such that each stage is released upon data availability from the preceding stage. The RT EPA supports both data and time consistency requirements through the pipeline configuration API.

5.1.2 Pipeline Admission and Control

The RT EPA API is intended to allow an application to specify desired service and adjust performance for both periodic isochronal pipelines and for non-isochronal application execution as well. As demonstrated in the RACE and SIRTF/MIPS examples, and as explored theoretically, many scenarios exist for on-line RT EPA service re-negotiation for continuous media, digital control, etc. [Si96]. For example, a continuous media application might initially negotiate reliable service for a video pipeline with a frame rate of 30 fps, and later renegotiate on-line for 15 fps so that an audio pipeline may also be executed. An application loading pipeline stages or other real-time service threads must at least specify the following parameters for a service epoch:

1) Service type for the particular service or pipeline: <guaranteed, reliable, or best-effort>
   i) Execution model: <Cworst-case for guaranteed, Cexpected for reliable, or none for best-effort>
   ii) Off-line execution samples for Cexpected: <{Sample-array}, [distribution-free or (normal, σ, Cexpected)]>
2) Input event source (the source must exist as a stage or device interface): <release_model>

The application must also provide, and can control on-line during a service epoch, these additional parameters:

5) Desired termination and soft deadlines with confidence for reliable service: <Dterm, Dsoft, term-conf, soft-conf>
6) Delay time for output response (earlier responses are held by the EPA): <Tout>
7) Release period (expected minimum inter-arrival time for aperiodics): <T>

The exact details of the parameter types and the API are discussed completely in the following sections and can also be
found in the RT EPA application code specification in Appendix A. The approach for scheduling RT EPA thread execution is based on the EPA interface to the fixed-priority DM scheduling policy and admission test, called the EPA-DM approach here. The EPA-DM approach supports reliable soft deadlines given pipeline stage execution times expressed as an execution-time confidence interval instead of a deterministic worst-case execution time (WCET). Also noteworthy, the RT EPA facility uses two protection domains: one for user code and one for operating system code. However, the RT EPA facility allows trusted module code to be executed in the kernel protection domain. We have focused on the functionality of the architecture, relying on the existence of other technology, such as that used in the SPIN operating system [Be95], to provide compile-time safety checking. The negotiation control provided by the RT EPA is envisioned to support isochronal event-driven applications which can employ and control these pipelines for guaranteed or reliable execution performance.

5.2 RT EPA Traditional Hard Real-Time Features

The simplest traditional definition of a hard real-time system is that the utility of continuing execution beyond a hard real-time deadline is not only nil, but may even be negative, i.e. continuing may damage the system even more than simply safing it. As such, if an RT EPA thread is designated to have a guaranteed deadline instead of one of the other two options (reliable or best-effort), then when such a deadline is missed, the RT EPA first calls an application-specific system safing callback provided at initialization time and then deactivates all threads under RT EPA control, fully halting the system in the safed state. So, the Dterm for a hard real-time thread in an RT EPA controlled system is a traditional hard deadline that not only has service implications, but actually results in termination of all services and safing of the whole system.
5.3 RT EPA Soft Real-Time Features

The RT EPA defines soft real-time in three ways: 1) tasks may have bounded overruns on a given release, 2) tasks may have occasional failures to complete a release, and 3) the release period may jitter. This is based upon the definition adopted for the RT EPA, which is that a soft real-time task has a utility function [Au93] such that it is worthwhile to continue services and keep the system running despite deadline overruns, misses, and release jitter.

5.4 RT EPA Best Effort Features

The RT EPA provides best-effort scheduling for services such as EDAC memory scrubbers, slack-stealing diagnostics, and any number of other services that have no processing deadlines at all. These types of threads/services must still register with the RT EPA and be admitted and activated, so that it is ensured that they do not interfere with the guaranteed and reliable services. In fact, any execution outside of the RT EPA admitted thread set, other than system overhead (context switching and interrupt service routines), will completely defeat the RT EPA. This is normally prevented by the RT EPA option to demote all other tasks at initialization. The best-effort tasks may be assigned an importance level by the order in which they are admitted (i.e. a best-effort task admitted earlier than another best-effort task will preempt it when both happen to be ready to run).

5.5 RT EPA Data Processing Pipeline Features

In addition to hard and soft real-time features, the RT EPA provides pipelining and isochronal output features to simplify digital control and continuous media application implementations. For pipelining, when services are activated, they may provide a release-complete callback (or simply provide NULL if they do not want this callback).
Furthermore, each service may be event-released by a binary semaphore, and therefore it is simple to provide a pipeline of services with phased execution by performing semaphore gives in the release-completion callback. If the pipeline requires that a stage execute every N cycles instead of at the fundamental pipeline frequency, then the application simply provides a callback which gives the semaphore every N releases rather than every release. Note that this is a convenient re-negotiation feature. For example, an application might process video frame data for both local and remote display; the local display capability might be 30 fps, whereas due to the limitations of the communication medium with the remote host, the remote capability might have to be downgraded to 10 fps compressed. Finally, when a release-completion callback is provided, it is also necessary to specify whether that callback should be made isochronal. An isochronal callback requires that the RT EPA wait until the specified time Tout after release before making the callback. The value Tout is a period of time after release: if Tout is less than the current response time, the serviceReleaseCompleteCallback is called immediately; otherwise, a timer is set and the callback is made after the specified time. This can be particularly useful for digital control processing pipelines which are sensitive to output jitter (i.e. stability may be affected by jitter in the actuation of control devices), as well as for continuous media processing pipelines where early frame presentation can lead to overall poor quality of service.
5.6 RT EPA Implementation

The RT EPA provides a service negotiation and control interface, deadline management in terms of execution-time model confidence, and specification of release events for pipelining inputs to outputs, and it returns admission results which are not simply accepted or rejected but, if accepted, indicate what reliability can be provided given the current thread set. Once the RT EPA brings a thread set on line, it provides monitoring of actual execution performance and fire-walls threads from occasional overruns by soft real-time threads. These RT EPA capabilities are provided by three basic components:

1) the service and configuration API,
2) the kernel-level monitoring and control, and
3) the RT EPA server.

5.6.1 RT EPA Service and Configuration API

The RT EPA service negotiation and configuration API is the primary interface that the real-time application uses to admit threads, bring the system on line with a specific pipeline configuration, and then obtain performance information once the system is on line for the purpose of re-negotiation. This section reviews the major API functions and explains their use and required arguments.

5.6.1.1 RT EPA System Initialization and Shutdown

These two API functions are used to start up and shut down the entire RT EPA system. Initialization requires the application to provide a callback function to safe the entire system in case a guaranteed service termination deadline is missed. The initialization mask specifies basic features, including whether a performance monitoring server (itself an RT EPA service) should be started, whether standard VxWorks system tasks should be demoted, and finally whether an active idle task should be admitted best-effort. The idle task will cause preemption when there are no processor demands; otherwise, task completions that occur when there are no other demands will not be visible.
If the mask includes a performance monitoring server, then a monitoring period must also be specified. The shutdown function includes a mask specifying whether VxWorks system task priorities should be restored; otherwise, it simply deactivates all active RT EPA tasks and completely releases all resources under RT EPA control. The initialization function includes specification of a system safing callback, a mask specifying the type of on-line monitoring desired, and a monitoring period:

int rtepaInitialize(FUNCPTR safing_callback, int init_mask, r_time monitor_period);

The shutdown function simply specifies, with a shutdown mask, whether the system should be fully restored to the boot-up system task configuration:

int rtepaShutdown(int shutdown_mask);

5.6.1.2 RT EPA Service (Thread) Admission and Dismissal

RT EPA service/thread admission is discussed in detail in Section 6.3. Admission does not activate the service, but makes activation possible, assuming the service can be admitted. Dismissal of a task causes deactivation and deletion of the service as well as release of all resources associated with it; the service must be re-admitted if it is ever to be activated again. The service admission and dismissal functions are:

int rtepaTaskAdmit(
    int *rtid,
    enum task_control tc_type,
    enum interference_assumption interference,
    enum exec_model exec_model,
    union model_type *modelPtr,
    enum hard_miss_policy miss_control,
    r_time Dsoft,
    r_time Dterm,
    r_time Texp,
    double *SoftConf,
    double *TermConf,
    char *name);

int rtepaTaskDismiss(int rtid);

5.6.1.3 RT EPA Task Control

The RT EPA includes task control functions which are only valid for previously admitted services.
These functions include:

int rtepaTaskActivate(
    int rtid,
    FUNCPTR entryPt,
    FUNCPTR serviceDsoftMissCallback,
    FUNCPTR serviceReleaseCompleteCallback,
    enum release_complete complete_control,
    int stackBytes,
    enum release_type release_type,
    union release_method release_method,
    uint Nonline);

int rtepaTaskSuspend(int rtid);
int rtepaTaskResume(int rtid);
int rtepaTaskDelete(int rtid);
int rtepaIDFromTaskID(WIND_TCB *tcbptr);
int rtepaInTaskSet(int tid);

5.6.1.4 RT EPA Release and Pipeline Control

Three major functions are provided by the RT EPA API for pipeline control and configuration. First, a function to specify a source interrupt release:

int rtepaPCIx86IRQReleaseEventInitialize(
    int rtid,
    SEM_ID event_semaphore,
    unsigned char x86irq,
    FUNCPTR isr_entry_pt);

Second, a function for specifying release of processing stages between the source and sink interfaces:

void rtepaPipelineSeq(
    int src_rtid,
    int sink_rtid,
    int sink_release_freq,
    int sink_release_offset,
    SEM_ID sink_release_sem);

Third and finally, a function for specifying whether a service should provide isochronal output:

void rtepaSetIsochronousOutput(int rtid, r_time Tout);

5.6.1.5 RT EPA Performance Monitoring

The RT EPA provides a callback specification for renegotiation so that an application can handle performance failures:

int rtepaRegisterPerfMon(int rtid, FUNCPTR renegotiation_callback, int monitor_mask);

Furthermore, the RT EPA can be configured to update all performance data for a service automatically using rtepaPerfMonUpdateService, or alternatively can update performance data on demand through the rtepaPerfMonUpdateAll function.
    int rtepaPerfMonUpdateAll(void);
    int rtepaPerfMonUpdateService(int rtid);

Finally, the following specific performance criteria can be computed on demand for a given service:

    r_time rtepaPerfMonDtermFromNegotiatedConf(int rtid);
    r_time rtepaPerfMonDsoftFromNegotiatedConf(int rtid);
    double rtepaPerfMonConfInDterm(int rtid);
    double rtepaPerfMonConfInDsoft(int rtid);
    double rtepaPerfMonDtermReliability(int rtid);
    double rtepaPerfMonDsoftReliability(int rtid);
    r_time rtepaPerfMonCexp(int rtid);
    r_time rtepaPerfMonChigh(int rtid);
    r_time rtepaPerfMonClow(int rtid);
    r_time rtepaPerfMonRTexp(int rtid);
    r_time rtepaPerfMonRhigh(int rtid);
    r_time rtepaPerfMonRlow(int rtid);

5.6.1.6 RT EPA Execution Model Utilities

The RT EPA provides utility functions to save and load execution models from actual on-line service, including:

    int rtepaLoadModelFromArray(r_time *sample_array, r_time *sample_src, int n);
    int rtepaTaskSaveCactexec(int rtid, char *name);
    int rtepaTaskLoadCactexec(r_time *model_array, char *name);

5.6.1.7 RT EPA Information Utilities

In order to facilitate performance data collection and analysis, the RT EPA provides print functions which summarize on-line performance, including:

    int rtepaTaskPrintPerformance(int rtid);
    int rtepaTaskPrintActuals(int rtid);
    int rtepaTaskPrintCompare(int rtid);

5.6.1.8 RT EPA Control Block

The RT EPA maintains a data structure associated with each service/thread that has been admitted and is therefore monitored and controlled by the RT EPA. This data structure is indexed by the rtid, which is a handle for the service/thread much like a VxWorks taskid or Unix PID.
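To make the handle concept concrete, the following is a minimal, self-contained sketch of rtid-indexed service bookkeeping in plain C. It is illustrative only: the structure, sizes, and helper names (sketch_admit, sketch_rtid_from_task_id) are hypothetical and are not the actual RT EPA internals.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, simplified sketch of rtid-indexed service bookkeeping. */
#define MAX_SERVICES 16
#define RTEPA_ERROR  (-1)

struct service_entry {
    int  in_use;          /* slot occupied by an admitted service */
    int  os_task_id;      /* underlying OS task id (VxWorks tid / PID) */
    char name[32];
};

static struct service_entry table[MAX_SERVICES];

/* Admit: find a free slot and hand back its index as the rtid handle. */
int sketch_admit(int os_task_id, const char *name)
{
    for (int rtid = 0; rtid < MAX_SERVICES; rtid++) {
        if (!table[rtid].in_use) {
            table[rtid].in_use = 1;
            table[rtid].os_task_id = os_task_id;
            strncpy(table[rtid].name, name, sizeof table[rtid].name - 1);
            return rtid;
        }
    }
    return RTEPA_ERROR;   /* no remaining capacity */
}

/* Reverse lookup, analogous in spirit to rtepaIDFromTaskID(). */
int sketch_rtid_from_task_id(int os_task_id)
{
    for (int rtid = 0; rtid < MAX_SERVICES; rtid++)
        if (table[rtid].in_use && table[rtid].os_task_id == os_task_id)
            return rtid;
    return RTEPA_ERROR;
}
```

The reverse lookup mirrors the role of rtepaIDFromTaskID, which maps an OS-level task back to its RT EPA handle.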
The RT EPA CB (Control Block) is defined as follows (note that not all fields are shown here, just those of interest, since some fields are used only internally by the RT EPA):

    struct rtepa_control_block
    {
        /* Service type */
        enum task_control tc_type;
        enum interference_assumption interference_type;
        enum exec_model exec_model;
        union model_type model;

        /* Release and deadline specification */
        enum release_type release_type;
        union release_method release_method;
        r_time Dsoft;
        r_time Dterm;
        r_time Texp;
        r_time Tout;
        enum hard_miss_policy HardMissAction;
        FUNCPTR serviceDsoftMissCallback;
        FUNCPTR serviceReleaseCompleteCallback;
        FUNCPTR entryPt;
        char name[MAX_NAME];
        int stackBytes;
        char Stack[MAX_STACK+1];
        int RTEPA_id;
        int sched_tid;
        WIND_TCB sched_tcb;
        WIND_TCB *sched_tcbptr;
        int assigned_prio;

        /* On-line state and performance updated on every release
           and/or dispatch/preemption in kernel */
        int RTState;
        int ReleaseState;
        int ExecState;
        r_time Cexp;
        r_time Clow;
        r_time Chigh;
        ULONG prev_release_ticks;
        UINT32 prev_release_jiffies;
        ULONG last_release_ticks[MAX_MODEL];
        UINT32 last_release_jiffies[MAX_MODEL];
        ULONG last_complete_ticks[MAX_MODEL];
        UINT32 last_complete_jiffies[MAX_MODEL];
        ULONG last_dispatch_ticks;
        UINT32 last_dispatch_jiffies;
        ULONG last_preempt_ticks;
        UINT32 last_preempt_jiffies;
        ULONG app_release_ticks[MAX_MODEL];
        UINT32 app_release_jiffies[MAX_MODEL];
        ULONG app_complete_ticks[MAX_MODEL];
        UINT32 app_complete_jiffies[MAX_MODEL];
        uint Nstart;
        uint Nact;
        uint N;
        uint Nonline;
        r_time Cactcomp[MAX_MODEL];
        r_time Cactexec[MAX_MODEL];
        r_time Tact[MAX_MODEL];
        uint Npreempts;
        uint Ndispatches;
        uint SoftMissCnt;
        uint HardMissCnt;
        uint HardMissReleasesTerminatedCnt;
        r_time HardMissCactcomp[MAX_MODEL];
        r_time SoftMissCactcomp[MAX_MODEL];
        uint ReleaseCnt;
        uint CompleteCnt;
        uint ReleaseError;
        uint CompleteError;
        uint ExecError;

        /* On demand or periodic server state and performance */
        r_time Cexpactcomp;
        r_time Clowactcomp;
        r_time Chighactcomp;
        r_time Cexpactexec;
        r_time Clowactexec;
        r_time Chighactexec;
        r_time Texpact;
        double HardReliability;
        double SoftReliability;
        r_time ActConfDsoft;
        r_time ActConfDhard;
    };

5.6.1.8.1 RT EPA CB Negotiated Service

The task control is defined by enum task_control, which may be {guaranteed, reliable, besteffort}. The interference assumption used in admission is enum interference_assumption and may be {worstcase, highconf, lowconf, expected}. The execution model may be either {normal, distfree}. Finally, the union model_type is defined as:

    union model_type
    {
        struct normal_model normal_model;
        struct distfree_model distfree_model;
        struct worst_case_model worst_case_model;
    };

The worst-case model is simply the worst-case execution time, r_time Cwc; alternatively, a normal model or distribution-free model is defined by the structures:

    struct normal_model
    {
        /* for normal distribution supplied model */
        r_time Cmu;
        r_time Csigma;
        double HighConf;
        double LowConf;
        double Zphigh;
        double Zplow;
        r_time Ntrials;
    };

    struct distfree_model
    {
        r_time Csample[MAX_MODEL];
        double HighConf;
        double LowConf;
        r_time Ntrials;
    };

5.6.1.8.2 RT EPA CB Release and Deadline Specification

All services must specify how they will be released with enum release_type release_type, which may be {external_event, single, internal_timer}. Depending upon the release type specified, the union release_method release_method must be populated with either a VxWorks semaphore or timer identifier based on this union, defined as:

    union release_method
    {
        SEM_ID release_sem;
        timer_t release_itimer;
    };

The semaphore should be a standard VxWorks binary semaphore created with semBCreate, and the timer a standard POSIX 1003.1b compliant timer [POSIX93] created with timer_create.
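The distribution-free model above derives an execution-time bound by sorting its trials and reading off an order statistic. The following plain C sketch shows the idea; the helper name distfree_bound and the rounding convention are illustrative assumptions, not the RT EPA implementation.

```c
#include <assert.h>
#include <stdlib.h>

typedef unsigned long r_time;   /* relative time in microseconds */

/* qsort comparator for r_time samples */
static int cmp_r_time(const void *a, const void *b)
{
    r_time x = *(const r_time *)a, y = *(const r_time *)b;
    return (x > y) - (x < y);
}

/* Return the execution time not exceeded with probability `conf`
   (e.g. 0.99), given n observed trials: sort the samples and take
   the order statistic at the rounded index.  Accuracy requires
   roughly n >= 1/(1 - conf) trials, as described in the text. */
r_time distfree_bound(r_time *trials, int n, double conf)
{
    qsort(trials, n, sizeof *trials, cmp_r_time);
    int k = (int)(conf * n + 0.5);   /* rounded order-statistic index */
    if (k >= n) k = n - 1;
    return trials[k];
}
```

For a 99% bound over 100 trials this selects (approximately) the largest sample, which is why at least 100 trials are needed for 99% confidence and 1000 for 99.9%.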
The soft deadline, r_time Dsoft, specifies the allowable overrun deadline; if it is overrun, the RT EPA will call the FUNCPTR serviceDsoftMissCallback if it is not NULL. The r_time Texp specifies the service release period, and if FUNCPTR serviceReleaseCompleteCallback is not NULL, it will be called at the end of every release. The parameter r_time Dterm specifies the termination deadline; if this is exceeded, the RT EPA will take action according to the enum hard_miss_policy HardMissAction, which may be either {restart, dismissal}. The release code entry point, name, stack size, and stack memory must be specified by FUNCPTR entryPt, name, stackBytes, and the pointer Stack. Finally, the RT EPA creates the unique RTEPA_id handle for every service and maintains an association with the VxWorks sched_tid, WIND_TCB sched_tcb, WIND_TCB *sched_tcbptr, and int assigned_prio.

5.6.1.8.3 RT EPA CB On-Line Statistics and Event Tags

The RT EPA kernel-level monitor tracks the service state three ways:

1) RTState, which is {0=RT_STATE_NONE, 1=RT_STATE_ADMITTED, 2=RT_ACTIVATED, 3=RT_RESTARTED, 4=RT_DISMISSED, 5=RT_SUSPENDED}.

2) ReleaseState, which is {0=RELEASE_NONE, 1=PEND_RELEASE, 2=RELEASED, 3=RELEASE_COMPLETED}.

3) ExecState, which is {0=EXEC_STATE_NONE, 1=EXEC_STATE_DISPATCHED, 2=EXEC_STATE_PREEMPTED}.

The remaining on-line kernel-updated parameters are self-explanatory; however, it should be noted that the time stamps are all obtained by reading the real-time clock. Typically, the real-time clock is a count-up register state machine clocked by a real-time oscillator which generates an interrupt and resets itself when the register value is equal to the period count. As such, the real-time clock has a frequency (the oscillator frequency) and a period (the number of oscillations before the register resets). This is why the time stamps are composed of ticks and jiffies.
The ticks are the interrupt count since the operating system was booted, and the jiffies are the value of the count-up register. All other RT EPA times are relative times as defined by the type r_time, which in the VxWorks implementation is an unsigned long integer number of microseconds. So, relative times such as Cexp are accurate to a microsecond and have a maximum value of 4294 seconds. It is not anticipated that relative times will exceed 4294 seconds, but absolute times may. As a default, the real-time clock period is set such that the tick period is one millisecond, and the jiffies must provide microsecond or better resolution. So, the time stamps have a maximum value of 1193 hours. This again seems reasonable, since most systems will require going off line more frequently than every 49 days. At the very least, these ranges are more than sufficient for the research completed here.

5.6.1.8.4 RT EPA CB On Demand or Periodic Server Computed Performance Statistics

The parameters Cexpactexec, Clowactexec, and Chighactexec are respectively the average execution time computed from all on-line samples, the minimum execution time from all on-line samples, and the maximum execution time from all on-line samples. Execution time is for a single release and only includes the sum of all times between the thread dispatches and preemptions over each release. The parameters Cexpactcomp, Clowactcomp, and Chighactcomp are respectively the actual average, minimum, and maximum response times (time from interrupt-driven release until final processor yield by the thread). The Texpact, Tlowact, and Thighact are respectively the average, minimum, and maximum release periods. The period is the time from event release to event release. The parameters HardReliability and SoftReliability are computed as one minus the HardMissCnt or SoftMissCnt value divided by the total number of service completions.
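The reliability computation just described is simple enough to sketch directly. The following fragment is illustrative only (the helper name is hypothetical); it assumes a miss counter and a completion counter, as in the control block fields above:

```c
#include <assert.h>

/* Sketch of the reliability computation described above:
   reliability = 1 - (miss count / total completions).
   The counter names mirror the control block; the helper itself
   is hypothetical, not part of the RT EPA API. */
double sketch_reliability(unsigned miss_cnt, unsigned complete_cnt)
{
    if (complete_cnt == 0)
        return 1.0;   /* no completions yet: no evidence of misses */
    return 1.0 - ((double)miss_cnt / (double)complete_cnt);
}
```

Applied to HardMissCnt this yields HardReliability, and to SoftMissCnt, SoftReliability.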
Finally, the parameters ActConfDsoft and ActConfDhard are computed from on-line response times by finding the confidence interval which contains the desired Dsoft and Dterm. The most significant parameters for re-negotiation are HardReliability and SoftReliability and the ActConfDsoft and ActConfDhard parameters, since a service that is missing deadlines can compare actual confidence in the desired deadline with that requested and then either accept a lower confidence or reconfigure to reduce resource demands. The reliability is measured from all samples, not just those buffered in the on-line model. The difference in sample counts can be huge, since the MAX_MODEL parameter is 1000 by default. So, the reliability provides a statistically much more significant indication of the ability to meet deadlines given the current service configuration; however, it does not provide an interval, just confidence in one particular deadline.

5.6.1.9 RT EPA Service Negotiation and Configuration Example

The RT EPA service negotiation and configuration API is the primary interface that the real-time system application uses in order to admit threads and bring the system on line. This API is predominantly used by the application start-up code, which itself is not considered to have any real-time requirements since the system is by definition not yet on line. Typically this is acceptable, since all threads may be configured before enabling hardware interfaces and associated interrupts, which will quickly transition the system from off-line to on-line. So, as an example, a series of event-released data-pipelined threads could be admitted, and the final action of the start-up code would be to enable the source hardware and associated interrupts and then exit, putting the system on-line and leaving the application under the monitoring and control of the RT EPA from that point on.
The following C code segment is an example for a single service system:

    rtepaInitialize(
        (FUNCPTR) service_hard_realtime_safing_callback,
        (PERFORMANCE_MON | DEMOTE_OTHER_TASKS | CREATE_IDLE_TASK),
        active_monitoring_period);

    data_ready_event = semBCreate(SEM_Q_FIFO, SEM_EMPTY);

    service_execution_model_initialization();

    if( (test=rtepaTaskAdmit(
            &rtid[0],
            service_type_is_reliable,
            interference_assumption_is_worstcase,
            execution_model_is_normal,
            &service_execution_model[0],
            hard_deadline_miss_policy_is_restart,
            Dsoft[0], Dterm[0], T[0],
            &SoftConf, &TermConf,
            "tService1")) != ERROR )
    {
        printf("Service1 task %d can be scheduled\n", rtid[0]);
    }
    else
    {
        printf("Service1 task admission error\n");
        return ERROR;
    }

    rtepaPCIx86IRQReleaseEventInitialize(rtid[0], data_ready_event, irq,
        (FUNCPTR) service_isr);

    event_release_type_info.release_sem = data_ready_event;

    rtepaTaskActivate(
        rtid[0],
        (FUNCPTR) service_entry_point,
        (FUNCPTR) service_soft_deadline_miss_callback,
        (FUNCPTR) service_release_complete_callback,
        service_complete_type_is_not_isochronous,
        service_stack_size,
        release_type_is_external_event,
        event_release_type_info,
        online_model_size);

    if( rtepaRegisterPerfMon(
            rtid[0],
            (FUNCPTR) service_renegotiation_callback,
            (ACT_EXEC | ACT_RESP | ACT_FREQ | ACT_HRD_CONF | ACT_SFT_CONF)) == ERROR )
    {
        printf("Service1 performance monitoring error\n");
    }

    service_source_activate();

The key to the RT EPA initial service negotiation is that it is all off-line, since there may be significant processing required to initially perform admission tests on a large number of threads and to initialize the RT EPA itself. The processing required for re-negotiation will be much less significant and, given a well-designed system, infrequent.

5.6.1.10 RT EPA Admission Request and Service Specification

The RT EPA admission request and service specification are made with the single API function rtepaTaskAdmit.
The admission request and service specification must be made together, since admission is contingent upon the type of service requested. This is the standard interface through which all services are established before a system is taken on-line. An application can also renegotiate admission through this interface during run time in one of three ways: 1) best effort, 2) with a dedicated re-negotiation service the application must establish in advance, or 3) during previously negotiated time as a soft deadline fault handling procedure. Typically, service re-negotiation while the system is on line will be a combination of fault handling and either best-effort or dedicated-service re-negotiation. Once a service has been successfully negotiated, faults in actual service due to poor up-front execution modeling, a poor event-rate model, or programming error will be handled in real time by the RT EPA insofar as other services are protected from overrun interference, but it is up to the service fault handling and the application to handle re-negotiation. The RT EPA on-line model can provide significant help to the application with re-negotiation in terms of missed deadline frequencies, high and low execution times, expected execution time, and response time. For example, a service might be I/O bound, in which case the deadline miss frequency may be high but the execution times are as expected. This means that the processing resources were sufficient and the execution model was good, but the I/O bandwidth was insufficient to support the frequency and input/output block size of the service.

5.6.1.10.1 Service Type

The service type argument provided to rtepaTaskAdmit must be either guaranteed, reliable, or besteffort.
If the service type negotiated is guaranteed, this has a system implication: missing a guaranteed termination deadline means that the RT EPA will call the system safing callback and then terminate all services and itself. As discussed previously, this implements the traditional notion of a hard real-time service. This service type should be negotiated only in circumstances where missing the service deadline will truly result in damage (negative utility) to the system as a whole, rather than just to that particular service. Most soft services can occasionally be missed without catastrophic system consequence and should negotiate for reliable service. Reliable means that the RT EPA will allow a soft deadline overrun up to the specified termination deadline, which in reality is a system firewall against uncontrolled overrun interference. When the thread/service overruns its soft deadline, the RT EPA will execute a soft overrun callback in that release context (these callbacks should have much shorter expected execution times than the difference between the soft and termination deadlines). A good example of a missed soft deadline callback action is to renegotiate the service or release frequency, or to reconfigure the data processing, if possible, to reduce loading.

5.6.1.10.2 Interference Assumption

The interference assumption is fundamental to the overall performance of the RT EPA controlled system. The options for this are worstcase, highconf, lowconf, or expected. For worstcase, the scheduling admission test assumes that all interfering threads may execute up to their termination deadlines, and therefore maximum potential interference is assumed for that service, but not for all services/threads, just those that execute to the deadline.
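The interference term implied by each assumption can be illustrated with a simplified plain C sketch. This is not the RT EPA admission test itself: the structure and helper names are hypothetical, and the worst-case bound is represented here by a single Cwc field per higher-priority thread.

```c
#include <assert.h>

typedef unsigned long r_time;   /* microseconds, as in the RT EPA */

/* Simplified per-thread execution model (illustrative names). */
struct exec_model_sketch {
    r_time Cwc;    /* worst case: budget up to the termination deadline */
    r_time Chigh;  /* high-confidence interval bound */
    r_time Cexp;   /* expected execution time */
    r_time Clow;   /* low-confidence interval bound */
};

enum interference_assumption { worstcase, highconf, expected, lowconf };

/* Sum the interference charged to one service from `n` higher-priority
   threads under the chosen assumption -- a sketch of the idea in the
   text, not the actual RT EPA admission test. */
r_time interference_sum(const struct exec_model_sketch *hp, int n,
                        enum interference_assumption ia)
{
    r_time sum = 0;
    for (int i = 0; i < n; i++) {
        switch (ia) {
        case worstcase: sum += hp[i].Cwc;   break;
        case highconf:  sum += hp[i].Chigh; break;
        case expected:  sum += hp[i].Cexp;  break;
        case lowconf:   sum += hp[i].Clow;  break;
        }
    }
    return sum;
}
```

The monotone ordering Cwc >= Chigh >= Cexp >= Clow is what makes worstcase the most pessimistic assumption and lowconf the most optimistic.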
It is important to note that the RT EPA allows the engineer to configure the admission test with a specific interference assumption for each thread; so, if a service is hard real-time, then it is recommended that worst-case interference also be assumed for that particular thread. For a soft real-time thread it is possible to assume worst-case interference, which means that any release jitter, execution jitter, or completion jitter is wholly due to that thread's own characteristics (i.e. this might be a beneficial way to localize timing variances). However, since the thread is itself soft real-time, it is pessimistic to assume worst-case interference; therefore three other soft options are provided: highconf, lowconf, and expected. The expected assumption is typically advised for soft real-time threads, since the interference time for all higher-priority threads is taken to be their expected execution times, the most likely scenario. This will lead to some release jitter and potential soft overruns and even termination deadline misses, but typically will meet the thread/service requirements. Some services may not require hard real-time performance, but may require very high confidence in meeting deadlines. For this situation, an intermediate option, high confidence, a level between worst-case and expected, can be specified; this assumes that all interfering threads will execute up to their high-confidence execution time interval. The final option of low confidence does not seem very useful other than for completeness and for providing an optimistic interference assumption.

5.6.1.10.3 Execution Model

An execution model must be provided to the RT EPA for each service admitted. Two types of models are supported: a normal distribution model and a distribution-free model, which is simply a set of trials.
Providing the model may seem burdensome, but it is necessary in order to accurately test scheduling feasibility. The RT EPA can build this model off-line so that it can be provided as an input for subsequent on-line execution; however, this is only true for distribution-free modeling. A distribution-free model simply takes a set of trials, which are actual execution times, and sorts them in order to compute a confidence interval. One drawback of the distribution-free model is that its accuracy is directly proportional to the size of the model. So, for example, if you want to know the execution time to a 99% confidence interval, you must provide at least 100 trials, and for 99.9% you must provide 1000 trials. The RT EPA currently has an on-line maximum model size to support 99.9% confidence (i.e. 1000 trials per service).

5.6.1.10.4 Termination Deadline Miss Policy

The hard deadline miss policy may be either restart or dismissal. In the case of restart, the RT EPA terminates the current release of a service when it attempts to execute beyond the termination deadline, but all future releases are unaffected. Either way, a thread is never allowed to overrun its termination deadline, and therefore overruns and interference are ultimately fully bounded. If the policy selected is dismissal, then the application will have to completely renegotiate admission.

5.6.1.10.5 Release Period and Deadline Specification

The soft and termination deadlines as well as the expected or desired release period must be specified for admission. The deadlines should be based upon system requirements, and the release period definition depends upon the release type, which is actually specified with the task activation interface.
The specification of the soft and termination deadlines allows the RT EPA to compute deadline confidences based upon the execution model; these may be retrieved from the admission test by providing a double-precision storage location for each, or ignored by providing a null pointer. The deadline confidences may also be retrieved at any time with the rtepaPerfReport API function call. The confidence is always based on the current set of threads admitted; so, additional admissions will not invalidate the original negotiation, but will erode any margin. Since threads may be released either by events, by time, or by single request, the period has a meaning specific to each circumstance, but in all cases admission is based on the same period. In the case of an event-released thread, the period is the expected release period due to the external event (e.g. a data-ready ISR gives a semaphore). If the actual event rate is higher, then the RT EPA will not release the thread until the specified period is met or exceeded, and period jitter and potential service dropouts will result.

5.6.1.11 Expected Performance Feedback

It is possible to obtain performance information at any time, but the actual computation of performance is done during the caller's negotiated service time, optionally by a performance service specified at RT EPA initialization, or on a best-effort basis. If the RT EPA is initialized to perform periodic performance monitoring, then the performance API function calls will simply return the last globally computed value for the parameter; otherwise, the function will compute and then return the value.
The following C code API function calls may be made to obtain the latest performance information from the RT EPA based on on-line monitoring.

5.6.1.11.1 Global Performance Parameters Update API Functions

The global performance update functions update performance parameters on demand, either for all services:

    int rtepaPerfMonUpdateAll(void);

or for a single service:

    int rtepaPerfMonUpdateService(int rtid);

5.6.1.11.2 Deadline Performance API Functions

The deadline performance functions each return a particular performance value on demand related to service deadlines. These functions include:

    r_time rtepaPerfMonDtermFromNegotiatedConf(int rtid);
    r_time rtepaPerfMonDsoftFromNegotiatedConf(int rtid);
    double rtepaPerfMonConfInDterm(int rtid);
    double rtepaPerfMonConfInDsoft(int rtid);
    double rtepaPerfMonDtermReliability(int rtid);
    double rtepaPerfMonDsoftReliability(int rtid);

5.6.1.11.3 Execution Performance API Functions

The execution performance functions return estimates of execution times from actual monitoring on demand and include:

    r_time rtepaPerfMonCexp(int rtid);
    r_time rtepaPerfMonChigh(int rtid);
    r_time rtepaPerfMonClow(int rtid);

5.6.1.11.4 Release Performance API Functions

The release performance functions return service release performance on demand and include:

    r_time rtepaPerfMonRTexp(int rtid);
    r_time rtepaPerfMonRThigh(int rtid);
    r_time rtepaPerfMonRTlow(int rtid);
    r_time rtepaPerfMonTexp(int rtid);

5.6.1.12 RT EPA Task Activation and Execution Specification

The RT EPA provides an API for task activation and specification of callbacks for soft deadline misses and for normal completion.

5.6.1.12.1 Service Execution Entry Point and Soft Deadline Miss Callback

The soft deadline miss callback affords the service the opportunity to reconfigure (e.g. frequency or algorithm complexity) before its termination deadline.
Typically, the time remaining before the termination deadline after a soft deadline miss will be short, but most likely sufficient for a service to at the very least change frequency and/or handle related faults.

5.6.1.12.2 Service Release Complete Isochronal Callback

In order to simplify implementation of isochronal pipelines, which require regular output rates for applications like digital control (stability is affected by actuator output regularity) and continuous media, the RT EPA API provides a release completion callback. This callback is made by the RT EPA according to a specified delay, of up to one period, immediately preceding the release period end. This allows for de-coupling of execution completion and response output so that isochronal outputs can be guaranteed despite jitter in phase between release and completion output.

5.6.1.12.3 Release Type and Event Specification

The RT EPA provides for three types of thread releases: 1) event released, 2) time released, and 3) single release. In reality, all threads could be considered event released, since a time-released thread is really released by a clock event and a single release is released by a request event, but these types are useful from a practical standpoint depending upon the application's needs for sequencing execution. For example, a data source in the system might need to be polled, in which case a time-released thread can be set up on this interface so it can periodically check interface status and service it as needed. Alternatively, a data source may provide interrupts when data is ready, in which case it is easiest to associate a semaphore with the interface ISR so that the RT EPA can release the interface servicing thread based on interrupts. In this case the semaphore, externally created, must be provided to both the RT EPA and the ISR. Finally, the single-release thread provides a good option for exceptional events such as an out-of-band user request (e.g. dumping diagnostic information while the system is on-line) or for non-critical fault handling. In this case, the application will admit the thread for the single release, and the RT EPA will automatically dismiss it upon release completion. Critical fault handling typically should be handled by a real-time periodic monitor at a reserved service level.

5.6.1.12.4 Service On-Line Model Size

The on-line model size directly drives the ability to estimate confidence intervals. Distribution-free confidence interval accuracy is derived from the number of samples, and the confidence possible is 1.0 - (1/N), so that, for example, given an on-line model size of 100, confidence interval accuracy is limited to 99%. The current maximum on-line model size is 1000, providing accuracy to 99.9%.

5.6.1.13 Service Performance Monitoring Specification

The RT EPA has a negotiation monitoring capability that may be scheduled just like any other RT EPA task (so it actually is self-monitoring). This task periodically checks the on-line model of execution time, response time, thread release frequencies, soft deadline confidence, and hard deadline confidence. The frequency with which the RT EPA monitor runs is of course dependent upon system requirements and available resources, but typically is a much lower frequency than execution control, which runs at the same frequency as the aggregate frequency of all RT EPA threads. This periodic monitoring capability simply reduces data collected on a context-switch basis and compares on-line performance to the negotiated service and provided execution model. When there are discrepancies, the RT EPA monitoring task executes a callback function on behalf of the application thread during its service time. Typically this can be a soft real-time thread itself; however, the callbacks should be short and typically involve system reconfiguration.
Examples include changing the release frequency, reconfiguring the algorithm for release, or eliminating the thread from the active set. The performance monitoring provides updates to the expected C and T parameters as well as re-estimating confidence in Dterm and Dsoft. The API simply provides specification of the monitoring rate and interface.

5.6.2 RT EPA Kernel-Level Monitoring and Control

The kernel-level monitoring and control provided by the RT EPA is fundamental to tracking event performance and to controlling releases. The kernel monitor captures dispatch and preemption times as well as completion time, and detects missed soft deadlines and hard deadlines. From this basic information capture, the RT EPA performance monitor and the service release wrapper code can detect release faults and prevent hard deadline overruns. The wrapper code around releases is minimal, but required, since the wrapper actually provides the missed hard deadline termination; this wrapper code executes in the same context as the service. All RT EPA guaranteed and reliable services are intended to execute in kernel space.

5.6.2.1 Event Release Wrapper Code

The RT EPA supports event release of pipeline service threads from interrupts or by internal events such as completion of processing by another pipeline service. The association of the release to a service is made using rtepaPCIx86IRQReleaseEventInitialize for a hardware interrupt release or using rtepaPipelineSeq for a software pipeline event release. The RT EPA takes the service code and wraps it with a function that provides the generic event release capability, over-run control, stage sequencing, and maintenance of the service release status.
The best way to describe the RT EPA event release wrapper is to examine each part in detail, starting with the release of the source device interface service code, typically released by a hardware interrupt, and ending with the sink device interface code, which must meet overall pipeline deadline and isochronal output requirements. The source device interface service code must be associated with the source interrupt as a first step in specifying an in-kernel pipeline.

5.6.2.1.1 ISR Release Wrapper Code

A pseudo-code specification of the interrupt service routine release wrapper code is given here (please refer to Appendix A for the full source code API specification):

    void rtepaInterruptEventReleaseHandler(int rtid)
    {
        time_stamp_isr_release(rtid);
        (*rtepa_int_event_release_table[rtid].app_isr)();
        update_release_timing_model(rtid);
        RTEPA_CB[rtid].ReleaseState = RELEASED;
        RTEPA_CB[rtid].ReleaseCnt++;
        semGive(rtepa_int_event_release_table[rtid].event_semaphore);
    }

5.6.2.1.2 RT EPA Event Release Wrapper Code

A pseudo-code specification of the event release wrapper code is given here (please refer to Appendix A for the full source code API specification):
    void event_released_rtepa_task(int rtid)
    {
        if((never_released(rtid)) && (no_hard_misses(rtid)))
            event_released_rtepa_task_init(rtid);

        RTEPA_CB[rtid].ReleaseState = PEND_RELEASE;

        while(1)
        {
            semTake(RTEPA_CB[rtid].release_method.release_sem, WAIT_FOREVER);
            release_watchdog_timer_settime(rtid);

            (*RTEPA_CB[rtid].entryPt)();

            release_watchdog_timer_cancel(rtid);

            if(RTEPA_CB[rtid].complete_type == isochronous)
            {
                delay_as_needed(rtid);
            }

            for(i=0; i < RTEPA_CB[rtid].NStages; i++)
            {
                handle_next_stage_release(rtid, i);
            }

            if(RTEPA_CB[rtid].serviceReleaseCompleteCallback != NULL)
                (*RTEPA_CB[rtid].serviceReleaseCompleteCallback)();

            RTEPA_CB[rtid].ReleaseState = RELEASE_COMPLETED;
        }
    }

5.6.2.2 Dispatch and Preempt Event Code

A pseudo-code specification of the dispatch and preempt kernel event code is given here:

    void RTEPA_KernelMonitor(WIND_TCB *preempted_TCB, WIND_TCB *dispatched_TCB)
    {
        /**** RTEPA TASK DISPATCH ****/
        if((rtid = rtepaInTaskSet(dispatched_TCB)) != ERROR)
        {
            RTEPA_CB[rtid].ExecState = EXEC_STATE_DISPATCHED;
            RTEPA_CB[rtid].Ndispatches++;
            update_dispatch_statistics(rtid);
        }

        /**** RTEPA TASK PREEMPTION ****/
        if((rtid = rtepaInTaskSet(preempted_TCB)) != ERROR)
        {
            RTEPA_CB[rtid].ExecState = EXEC_STATE_PREEMPTED;
            RTEPA_CB[rtid].Npreempts++;
            update_preempt_statistics(rtid);

            /******** CASE 1: Release Completed ********/
            if(RTEPA_CB[rtid].ReleaseState == RELEASE_COMPLETED)
            {
                record_completion_time(rtid);
                compute_cpu_time_for_release(rtid);
                compute_time_from_release_to_complete(rtid);
                update_deadline_performance_model(rtid);
                RTEPA_CB[rtid].CompleteCnt++;
                RTEPA_CB[rtid].ReleaseState = PEND_RELEASE;
            }
            /******** CASE 2: Release In-Progress Preempted ********/
            else
            {
                RTEPA_CB[rtid].Ninterferences++;
                update_cpu_time_and_response_time_models(rtid);
            }
        }
    }

5.6.2.3 Release Frequency

The period between releases is tracked by the RT EPA so that release jitter can be detected.
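Under stated assumptions, the period tracking can be sketched in plain C as follows; the structure and helper names are hypothetical simplifications of the on-line period model (Tact, Texpact) maintained in the control block:

```c
#include <assert.h>

typedef unsigned long r_time;   /* relative time in microseconds */

/* Track min/max/sum of release periods from successive release
   time stamps -- an illustrative sketch, not the RT EPA internals. */
struct period_stats {
    r_time Tlow, Thigh, Tsum;
    unsigned n;        /* number of periods observed */
    r_time prev;       /* previous release time stamp */
    int have_prev;
};

void period_record(struct period_stats *s, r_time release_time)
{
    if (s->have_prev) {
        r_time T = release_time - s->prev;   /* period = release to release */
        if (s->n == 0 || T < s->Tlow)  s->Tlow  = T;
        if (s->n == 0 || T > s->Thigh) s->Thigh = T;
        s->Tsum += T;
        s->n++;
    }
    s->prev = release_time;
    s->have_prev = 1;
}

/* Jitter = spread between the longest and shortest observed period. */
r_time period_jitter(const struct period_stats *s)
{
    return (s->n > 0) ? (s->Thigh - s->Tlow) : 0;
}
```

A nominally periodic source that is in fact irregular shows up immediately as a large spread between Tlow and Thigh.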
The release period is taken from a release event, which must be specified in terms of either a timer or a semaphore associated with an external event (e.g. an interrupt). The jitter between releases is due to clock jitter or interrupt source jitter and is expected to be low; however, if an event rate is assumed to be periodic and in fact is not, this is easily detected through the period jitter. Significant release jitter most likely indicates that the environmental model for the event rate is faulty. The effect is not necessarily localized to the current thread's deadline performance: the thread causes more or less interference than expected, potentially causing other services to miss their deadlines. Tracking the release period therefore allows the application to monitor event rates and handle system-level faults.

5.6.2.4 Execution Time

The kernel monitor computes execution time by detecting operating system scheduler dispatches and preemptions. The execution time does not include interference time when a release is preempted and requires re-release. The execution time will always be less than the response time due to release latency and the potential for release execution interference.

5.6.2.5 Response Time

The response time is the best figure of merit for real-time system performance since the response time must be less than the relative deadline and is an aggregate measure of end-to-end latency and jitter. The RT EPA determines response time through the use of an event descriptor table and kernel-level completion detection. The event descriptor table is time stamped when the release timer expires or when the event release interrupt is asserted. Ultimately all event releases must be tied to a hardware interrupt, either directly or indirectly.
Therefore, the RT EPA provides an ISR registration API function which wraps a traditional ISR entry point with the required RT EPA event descriptor table updates. For timer releases, the normal RT EPA wrapper function provides the event descriptor table update. When a service actually completes, the response time is computed by subtracting the event descriptor table release time (the time the real-world event was first detected by the system) from the kernel-level release completion time.

5.6.2.6 Deadline Miss Management

Management of missed deadlines is provided by the RT EPA for soft and termination deadlines as well as traditional hard deadlines. Table 6 summarizes the RT EPA action in each of the three cases.

Table 6: RT EPA Deadline Management Summary

Soft deadline miss:
a) The soft deadline miss is noted in the kernel-level monitoring for that service.
b) The registered callback for a soft deadline miss is called at the next event release.

Termination deadline miss:
a) The termination deadline miss is noted in the kernel-level monitoring for that service.
b) The termination timer asynchronously terminates the current release if the dismissal policy is restart; otherwise the thread is deactivated and dismissed from the current set of on-line threads.

Hard deadline miss (same as termination, but for guaranteed service levels rather than reliable or best-effort):
a) The system safing callback for a guaranteed service deadline miss is called.
b) The RT EPA is completely shut down.

5.6.2.6.1 Terminate Execution that would Exceed Hard Deadline

A fundamental feature of the RT EPA design, and a requirement in order to preserve the integrity of confidence-based scheduling, is that every thread release must have a termination deadline.
The RT EPA provides support for both soft and hard real-time thread releases, so if a soft real-time thread were, for example, allowed to overrun indefinitely, it would introduce unbounded interference into the system. Several policies for thread deadline overrun control were considered. First, the RT EPA provides for specification of a soft deadline which a thread release is allowed to overrun; the RT EPA simply provides an application-specific callback mechanism so that the application control may decide how to handle this execution fault. For example, if a thread is missing soft deadlines, the application may want to renegotiate the thread release frequency for a degraded mode. Second, the RT EPA provides for specification of a harder termination deadline: if a thread attempts to overrun it, the current release of that thread is terminated, but future releases will still be made. As an option, the RT EPA allows the policy on hard deadline misses to be either restart or expulsion. The default is restart, in which case future releases of the thread continue to be made; if the policy selected is expulsion, the thread is completely removed from the currently admitted set of on-line threads, and any future execution of the thread requires re-negotiation by the application to readmit it.

5.6.2.6.2 Hard Deadline Miss Restart Policy

The RT EPA hard deadline miss restart policy works by setting up a timing watchdog on the current release of each thread. If the thread does not complete its release and yield the processor prior to expiration of this watchdog timer, the RT EPA will asynchronously terminate the release, but will enable future releases of the same thread from the thread's normal entry point.
This is done in the RT EPA by wrapping all thread release entry points with the following C code:

while(1)
{
    if(RTEPA_CB[rtid].FirstRelease)
        event_released_rtepa_task_init(rtid);

    RTEPA_CB[rtid].ReleaseState = PEND_RELEASE;

    /* Wait for event release */
    semTake(RTEPA_CB[rtid].release_method.release_sem, WAIT_FOREVER);

    /* Handle previous release soft deadline miss */
    if( (RTEPA_CB[rtid].ReleaseOutcome == SOFT_MISS) &&
        (RTEPA_CB[rtid].SoftMissCallback != NULL))
        (*RTEPA_CB[rtid].SoftMissCallback)();

    /* Set termination watchdog */
    timer_settime( RTEPA_CB[rtid].Dterm_itimer,
                   RTEPA_CB[rtid].flags,
                   &(RTEPA_CB[rtid].dterm_itime),
                   &(RTEPA_CB[rtid].last_dterm_itime) );

    /******** Release execution ********/
    (*RTEPA_CB[rtid].entryPt)();

    /* Cancel termination watchdog */
    timer_cancel(RTEPA_CB[rtid].Dterm_itimer);

    if(RTEPA_CB[rtid].ReleaseCompleteCallback != NULL)
        (*RTEPA_CB[rtid].ReleaseCompleteCallback)();

    RTEPA_CB[rtid].ReleaseState = RELEASE_COMPLETED;
}

The watchdog timer will be canceled prior to the termination deadline in the typical case.

5.6.2.6.3 Termination Deadline Miss Dismissal Policy

The RT EPA provides two policies for missed termination deadlines: 1) the current release is terminated, but all future releases are unaffected, or 2) the service is deactivated and the thread is dismissed from the on-line thread set.

5.6.3 Performance Monitoring and Re-negotiation

RT EPA performance monitoring can be accomplished in two ways: 1) active monitoring as a service, or 2) passive monitoring during execution time of any particular service. In the case of active monitoring, the RT EPA is initialized with the performance monitoring option enabled and a monitoring frequency specified. The RT EPA has an internal execution model for this specialized service and admits it just like any other service.
In this case, performance of all threads in the system is computed periodically, and the last computed performance is available through a low-cost referencing function call to any requester. Furthermore, the performance monitor will make callbacks, at the end of each service release, to any service which registered with it during system initialization, to indicate performance that is below the negotiated level. This callback has an event mask for each performance parameter: ACT_EXEC = expected actual execution time, ACT_RESP = expected actual response time, ACT_FREQ = expected actual release period, ACT_HRD_CONF = expected termination deadline confidence, and ACT_SFT_CONF = expected actual soft confidence. In the passive mode it is up to the services to poll the performance through the rtepaCurrentPerf API function call, which computes performance parameters given the current on-line model and returns a mask indicating which parameters are not at or better than negotiated levels.

5.6.3.1 Soft Deadline Confidence

The soft deadline confidence is computed by the performance monitor as the confidence that response time is less than the soft deadline; this can be an expensive computation for the on-line distribution-free model. As an alternative, it is possible to request the soft deadline reliability, which is simply a computation based upon the number of missed deadlines out of the total number of completions since the RT EPA went on-line.

5.6.3.2 Hard Deadline Confidence

The hard deadline confidence is computed by the performance monitor as the confidence that response time is less than the termination deadline; this can be an expensive computation for the on-line distribution-free model.
As an alternative, it is possible to request the termination deadline reliability, which is simply a computation based upon the number of missed deadlines out of the total number of completions since the RT EPA went on-line.

6 The Confidence-Based Scheduling Formulation

The RT EPA on-line admission test is provided by a mathematical formulation which is an extension of the DM (Deadline Monotonic) equations developed by Audsley and Burns at the University of York. The DM equations have been extended to handle expected execution time and the RT EPA Dterm upper bound on overruns. The expected execution time is calculated from an execution model which provides a confidence interval for execution that allows for derivation of the deadline confidence for the service.

6.1 RT EPA CBDM Concept

The concept of RT EPA CBDM thread scheduling for services and pipeline stages is based upon a definition of soft and termination deadlines in terms of utility and potential damage to the system controlled by the application [Bu91]. The concept is best understood by examining Figure 12, which shows response time utility and damage in relation to soft and termination deadlines as well as early responses.

Figure 12: Execution Events and Desired Response Showing Utility
[Figure: utility curve over time from release, showing the earliest possible and earliest desired responses, the desired response interval between Rmin (early responses held/buffered) and Ropt, the soft deadline Dsoft (Clow signal) and termination deadline Dterm (Chigh signal and abort), response damage as degradation and then dropout for late responses, plotted against the computation time distribution around Cexpected with best-case execution and WCET marked.]

The RT EPA design provides callback registration for the controlling application, which the RT EPA will execute when either the soft or termination deadline is missed; specifically, the RT EPA will abort any thread not completed by its termination deadline.
The RT EPA allows execution beyond the soft deadline as a bounded overrun. Signaled controlling applications can handle deadline misses according to specific performance goals, using the RT EPA interface for renegotiation of service. For applications where missed termination deadline damage is catastrophic (i.e. the termination deadline is a "hard deadline"), the service and/or pipeline must be configured for guaranteed service rather than reliable service; in this case, the entire system will be safed. The key to the extended DM formulation for CBDM (Confidence-Based Deadline Monotonic scheduling) is that overruns are always bounded, and therefore interference in the system also ultimately has a mathematical upper bound. An extension of the well-established DM scheduling policy and scheduling feasibility test is used in the RT EPA because DM handles execution where deadline does not equal period, and because the iterative nature of the admission algorithm identifies which threads can and cannot be scheduled, rather than treating the thread set monolithically [Au93]. The RMA least upper bound simply provides scheduling feasibility for the entire set and gives no indication of what subsets may be schedulable. Given the DM basis of CBDM, it is possible to renegotiate based on specific thread admission failures, a capability often needed by the applications to be supported. One major drawback of the traditional DM scheduling policy is that to provide a guarantee, the WCET of each pipeline stage or service thread must be known along with the release period. The CBDM extension, on the other hand, provides an option for reliability-oriented applications, where occasional soft and termination deadline failures are not catastrophic but simply result in degraded performance: the reliable service option provides quantifiable deadline assurance given expected execution times.
Despite the ability to opt out of a guarantee, this reliable test and execution control does not just provide best-effort execution. Instead, a compromise is provided based on the concept of execution time confidence intervals and the RT EPA execution control, combined with the CBDM admission test based on expected execution times and bounded overruns. An example of the CBDM admission test is given in Section 6.3 with a simple two-thread scenario. The CBDM scheduling feasibility test eases the DM admission requirements to allow threads to be admitted with only expected execution times (in terms of an execution confidence interval), rather than requiring deterministic WCET. The expected time is based on off-line determination of the execution time confidence interval. Knowledge of expected time can be refined on-line by the RT EPA kernel-level monitoring features each time a thread is run. By easing the WCET admission requirement, more complex processing can be incorporated, and pessimistic WCET with conservative assumptions (e.g. cache misses and pipeline stalls) need not reduce the utility of performance-oriented pipelines which can tolerate occasional missed deadlines (especially given a known probability of misses).

6.2 CBDM Deadline Confidence from Execution Confidence Intervals

CBDM provides an extended version of the DM scheduling feasibility tests, which consider computation time and interference for a thread set, and therefore the viability of response times less than deadlines. Fundamental to the CBDM extension is that release latency is due predominantly to interference by higher priority threads, while execution latency is due to both interference and execution jitter. The DM equations account for interference, but do not account for execution jitter; rather, the DM equations assume worst-case execution time only.
Basic DM scheduling formulas are extended by CBDM to return the expected number of missed soft and termination deadlines to the controlling application, given the provided execution model which quantitatively incorporates execution jitter. Latency and jitter in dispatch/preemption are not considered here. For this capability, when a module is loaded, the computation time must be provided either with a sample set sufficient for distribution-free confidence estimates, or with an assumed distribution and a smaller sample set of execution times measured off-line. From this, the computation time used in the scheduling feasibility tests is computed based upon the desired confidence in meeting soft and termination deadlines. All interfering threads are pessimistically assumed to run to their termination deadline, where they either will have completed or are aborted. For example, for thread i, let C(i) = expected execution time, Dsoft(i) = soft deadline, Dterm(i) = termination deadline, and T(i) = period, with the DM condition that C(i) <= Dsoft(i) <= Dterm(i) <= T(i). The worst-case confidence interval execution times C(i)low and C(i)high used in the extended DM scheduling feasibility tests below are based on the desired confidence in execution time and probability of late response. In cases where the actual execution time is greater than the worst-case confidence interval execution time, deadlines will be missed. The expected number of missed deadlines is bounded by the expected number of execution times falling outside the confidence interval, which result in responses beyond a given deadline. So, if a thread has an execution time confidence of 0.999 and passes the admission test, then it is expected to miss its associated deadline 0.1% of the time or less.
6.3 CBDM Admission Test Example

For example, consider two threads that have a normal distribution of execution times (the normal distribution assumption is not required, but greatly reduces the number of off-line samples needed compared to assuming no distribution), so that unit normal distribution quantiles Zplow and Zphigh can be used, and assume that WCET(i) is known for comparison, so that we have:

thread i=1: Cexpected(1)=40, σ(1)=15, Ntrials(1)=32, Zplow(1)=3.29 for soft-conf=99.9%, Zphigh(1)=3.72 for term-conf=99.98%, WCET(1)=58, Dsoft(1)=50, Dterm(1)=60, and T(1)=250

thread i=2: Cexpected(2)=230, σ(2)=50, Ntrials(2)=32, Zplow(2)=1.96 for soft-conf=95%, Zphigh(2)=3.72 for term-conf=99.98%, WCET(2)=310, Dsoft(2)=400, Dterm(2)=420, and T(2)=500

If these threads can be scheduled based on the RT EPA inputs to the admission test, then thread one has a probability of at least 99.9% of completing execution before Dsoft, expressed P(Clow < Dsoft) ≥ 0.999. Similarly, P(Chigh < Dterm) ≥ 0.9998. Likewise thread two has respective deadline confidences P(Clow < Dsoft) ≥ 0.95 and P(Chigh < Dterm) ≥ 0.9998. Based on the sufficient, but not necessary, scheduling feasibility tests for DM [Au93], with EPA execution time confidence interval inputs rather than just worst-case execution time, the scheduling feasibility with desired confidence in deadlines can be derived from the execution time confidence intervals, as shown below.

From execution time confidence intervals and the sufficient (but not necessary) DM scheduling feasibility test:

eq 1: From probability theory for a normal distribution,

    C{low|high}(i) = Cexpected(i) + Zp{low|high}(i) * σ(i) / sqrt(Ntrials(i))

eq 2: EPA-DM admission test:

    ∀i: 1 ≤ i ≤ n:  C{low|high}(i) / D{soft|term}(i) + Imax(i) / D{soft|term}(i) ≤ 1.0 ?

eq 3:

    Imax(i) = Σ (j=1 to i-1) [ ceiling( Dterm(i) / T(j) ) * Dterm(j) ]

where Imax(i) is the interference time by higher priority threads j=1 to i-1, which preempt and run up to the "ceiling term" number of times during the period in which thread i runs.

Can thread i=1 be scheduled given execution time confidence and desired Dsoft and Dterm confidence? Yes.

using eq 1: Chigh(1) = 40 + Zphigh(1) * 15 / sqrt(32) = 49.86; and likewise Clow(1) = 48.72

using eq 2&3: 48.72/50 ≤ 1.0 and 49.86/60 ≤ 1.0 for Clow(1) and Chigh(1); likewise 58/60 ≤ 1.0 for WCET. Clow, Chigh can be scheduled. (Note: the highest priority thread has no interference, so Imax(1) = 0.)

Can thread i=2 be scheduled given execution time confidence and desired Dsoft and Dterm confidence? Yes.

using eq 1: Chigh(2) = 230 + 3.72 * 50 / sqrt(32) = 262.88; and likewise Clow(2) = 247.32

using eq 2&3: C{low|high}(2) / D{soft|term}(2) + Imax(2) / D{soft|term}(2) ≤ 1.0 ?, where Imax(2) = ceiling( Dterm(2) / T(1) ) * Dterm(1)

In the worst case, given the abort policy for incomplete threads reaching their termination deadline, maximum interference occurs when all higher priority threads execute until they are aborted by the EPA.

simplifying eq 2&3: 247.32/400 + (2)(60)/400 ≤ 1.0 and 262.88/420 + (2)(60)/420 ≤ 1.0; Clow, Chigh can be scheduled.

simplifying eq 2&3 for WCET: 310/420 + (2)(60)/420 ≤ 1.0 ? is FALSE; WCET cannot be scheduled. (Note: thread 1 interferes up to its termination deadline twice in this example.)

These formulas show that the two threads can be scheduled using non-WCET execution times such that desired performance is achieved. Note that the basic DM formulas show that the thread set is not considered schedulable if only WCET is considered. In this case WCET, which is a statistical extreme, led to rejection of a thread set which can be scheduled with ≥ 99.98% probability of successfully meeting termination deadlines.
7 Evaluation Method

Four evaluation experiments were designed for the EPA: 1) a pseudo load source and sink pipeline, 2) use of the RT EPA on-line monitoring with the MIPS instrument, 3) use of the RT EPA to schedule a video acquisition and compression pipeline, and finally, 4) use of the RT EPA to command and control an optically navigated air-powered control systems test-bed (RACE, the Rail-Guided, Air-Powered Controls Experiment). A 5-DOF robotic systems test-bed was also constructed, and the RT EPA was successfully used to control both a dead-reckoning arm and a full-feedback arm; however, the results were not interesting from the perspective of the RT EPA's ability to handle marginal task sets, since the loading imposed on the Pentium microprocessor even by very high rate monitoring and control was very low. Future work on the 5-DOF robotic system may incorporate more demanding control algorithms given better sensors and actuators on a higher-fidelity arm, but the results are only of minor interest here: another example of a system controlled by the RT EPA, but not a stressful one.

7.1 RT EPA Pseudo Loading Evaluation

The initial pseudo loading evaluation was of a pipeline with six total tasks, each executing a loop with a pseudo-random number of iterations with a normal distribution. The goal of this experiment was to establish that the RT EPA could meet requested deadline reliabilities given execution confidence interval models, and that the service could be refined iteratively using RT EPA actual performance feedback. The pipeline itself is a typical continuous media pipeline; although no actual data was run through the system, the RT EPA was used to synchronize the pipeline stage releases. This is depicted in Figure 13.
Figure 13: EPA Pseudo Loading Pipeline
[Figure: a source device and a sink device connected through device interfaces and pipe-stages 1…4, with the RT EPA spanning the application system call, kernel API, and HW/SW interface layers.]

Table 7 summarizes the pseudo source/sink experiment processing tasks evaluated. The data in this experiment was not from an actual source, but rather randomly generated, and the tasks simulate data processing stages with a synthetic loading function provided by the uniform_var_waste_task function. This type of pipeline would be typical of an application which takes a number of source input samples and produces output at a lower rate, but with intermediate processing. The pseudo test was devised to simply test the RT EPA concept and viability. Tests with real sources and sinks demonstrate the application of the RT EPA much better, but the pseudo service pipeline was a valuable initial step in validation.

Table 7: Pseudo Source/Sink Experiment Task Set Description

Task ID  Description                         Period    Worst-case execution time  Expected execution time
1        Pseudo source input task            0.15 sec  0.10 sec                   0.035 sec
2        Intermediate processing stage task  0.20 sec  0.10 sec                   0.035 sec
3        Intermediate processing stage task  1 sec     0.10 sec                   0.035 sec
4        Intermediate processing stage task  1 sec     0.10 sec                   0.035 sec
5        Intermediate processing stage task  1 sec     0.10 sec                   0.035 sec
6        Pseudo sink output task             1 sec     0.10 sec                   0.035 sec

7.2 SIRTF/MIPS Video Processing RT EPA Monitoring Evaluation

The importance of scheduling epochs was discovered during RT EPA monitoring experiments with the SIRTF/MIPS software. The SIRTF/MIPS software processes two video streams and forms combined compressed packets for down-link, as shown in Figure 14. Initially, exposures could not be scheduled reliably with the software due to execution variances that were causing processing to miss deadlines and ultimately time out.
Figure 14: SIRTF/MIPS Dual Stream Pipeline
[Figure: two video ADC FIFO streams from the video electronics pass through frame buffers and data compression stages (synchronized to within +/- msecs), then packetization to the crosslink FIFO, under application API source flow control.]

7.3 SIRTF/MIPS Video Processing RT EPA Epoch Evaluation

The SIRTF/MIPS system could not be scheduled until it was decomposed into four epochs of service. The first epoch, called e0, is a ready state for the instrument. The addition of epoch e1 was required to meet a tight deadline on synchronizing the exposure processing services tGeDp and tSiDp with the instrumentation hardware state machines. This epoch is really a startup case and is not applicable once steady-state processing has been started. However, without a much higher priority given to the command processing task in e1, the system would occasionally fail to synchronously start the hardware and software processing. Examining e1, as summarized in Table 8a, it is evident that the overall CPU loading is minimal; the problem was that the need to synchronize the Ge Processing and Compression thread with the HTG (Hardware Timing Generator) state machine meant that the state of the HTG, as observed by tGeFIFODrv and tGeDP, had to be monitored by tDetCmd, with synchronizing commands then issued to the hardware within 5 milliseconds of the observation (i.e. a very short deadline despite a long period). As such, an interference from, for example, tGeDP to tDetCmd after the observation would cause a synchronization error. What was needed was to adjust the priority of tDetCmd once the synchronization had been observed, so that it would not be preempted while issuing the synchronizing commands (a new epoch where the priority of tDetCmd is temporarily boosted). This was required since the 5 millisecond deadline for tDetCmd would not have been reasonable for other situations such as steady-state exposure processing.
In a case like this, tDetCmd would not be admitted by DM, but would be admitted by RM, since RM does not consider cases where the deadline is less than the period. If utility is computed for tDetCmd during startup, where the deadline is 5 milliseconds, then it imposes a 94.4% utility over its deadline for this brief period of time, but in general the system is very under-loaded.

Table 8a: Epoch 1 of the SIRTF/MIPS Video Processing
Epoch 1: Exposure Start Synchronization with Hardware State Machine (no data available); epoch period 2097 msecs

task ID  Description                             Release Period (msecs)  Expected Execution (msecs)  Worst-case Execution (msecs)  Expected Utility (%)
1        Si FIFO driver (tSiFIFODrv)             inactive                N/A                         N/A                           00.00
2        Ge FIFO driver (tGeFIFODrv)             233.02                  0.5                         1.0                           00.22
3        Science Link FIFO driver (tSDMFIFODrv)  inactive                N/A                         N/A                           00.00
4        Ge Processing and Compression (tGeDP)   131.072                 1.0                         4.0                           00.76
5        Si Processing and Compression (tSiDP)   inactive                N/A                         N/A                           00.00
6        Science Grouping (tMIPSExpMgr)          inactive                N/A                         N/A                           00.00
7        Command Processing (tDetCmd)            2097                    4.72                        4.85                          00.23
Total                                                                                                                             01.21

The SIRTF/MIPS e1 shows how the multi-epoch composition can be very useful for atypical execution requirements during a specific system transition (from a ready state to an exposure processing steady-state). Once the system is running, it does on-the-fly data compression from three video sources and has two clear epochs: 1) compressed image computation and cross-link (e2) and 2) image-ramping data collection (e3). If the two steady-state epochs, e2 and e3, are viewed as a single epoch, then the system cannot be scheduled by RMA or DM, and furthermore a single-epoch view causes execution failure. The full implementation of the RT EPA software with the confidence-based scheduling admission test, pipelining features, and deadline over-run control was not used, but the RT EPA monitor was used extensively.
Based on the RT EPA monitoring, an off-line analysis led to discovery of the epoch decomposition, admission of the threads to each epoch based upon confidences and measured execution times, and a solution to the timing problems.

Table 8b: Epoch 2 of the SIRTF/MIPS Video Processing
Epoch 2: Reset and Initial Sample 1/8th Frame Period (no data available); epoch period 589.824 msecs

task ID  Description                             Release Period (msecs)  Expected Execution (msecs)  Worst-case Execution (msecs)  Expected Utility (%)
1        Si FIFO driver (tSiFIFODrv)             inactive                N/A                         N/A                           00.00
2        Ge FIFO driver (tGeFIFODrv)             233.02                  0.5                         1.0                           00.22
3        Science Link FIFO driver (tSDMFIFODrv)  32.768                  2.0                         3.0                           06.10
4        Ge Processing and Compression (tGeDP)   131.072                 1.0                         4.0                           00.76
5        Si Final Compression (tSiDP)            589.824                 521.9                       527                           88.48
6        Science Grouping (tMIPSExpMgr)          589.824                 12                          13                            02.03
7        Command Processing (tDetCmd)            inactive                N/A                         N/A                           00.00
Total                                                                                                                             97.59

Table 8b shows the CPU utility during e2 steady-state exposure processing when, however, no data is being collected due to a required reset of the instrumentation detector hardware. For the SIRTF/MIPS instrument, the most critical processing deadlines are during e3 rather than e2; in fact, the deadline for completing processing was set at 1048 milliseconds, based on a requirement to fall behind by no more than one full Si frame (524 msecs). A worst-case over-run in e1 is not even close to this deadline, but it is clear that the CPU is highly loaded during this time of data collection quiescence. Table 8c summarizes e3; during e3 the CPU loading is much less, but the deadline on completing Si frame slice processing by tSiDP is 131 milliseconds. Given the worst-case processing time for an Si slice and the short deadline for completion, the division of processing into e2 and e3 decomposed the steady-state scheduling problem into two epochs where the Si data processing has two vastly different deadline requirements.
Prior to this decomposition and a redistribution of computation described in Section 8.2.2, the system suffered occasionally from timeouts during active data collection.

Table 8c: Epoch 3 of the SIRTF/MIPS Video Processing
Epoch 3: Sample Frame Period (data available); epoch period 2031.616 msecs

task ID  Description                             Release Period (msecs)  Expected Execution (msecs)  Worst-case Execution (msecs)  Expected Utility (%)
1        Si FIFO driver                          65.536                  0.5                         1.0                           00.76
2        Ge FIFO driver                          233.02                  0.5                         1.0                           00.22
3        Science Link FIFO driver                inactive                N/A                         N/A                           00.00
4        Ge Processing and Compression           131.072                 1.0                         4.0                           00.76
5        Si Slice Compression                    65.536                  39.8                        75                            60.73
6        Science Grouping                        1048.576                12                          13                            01.14
7        Command Processing (tDetCmd)            inactive                N/A                         N/A                           00.00
Total                                                                                                                             63.61

7.4 Digital Video Pipeline Test-bed

The digital video pipeline experiment provides a classic continuous media test environment for the RT EPA. This test-bed is based on the Connexant Bt878 chipset [Connex98], which provides National Television Standards Committee (NTSC) decoding [Whi99] and a peripheral component interconnect (PCI) bus direct memory access (DMA) capability from the Bt878 pixel first-in-first-out (FIFO) buffer through the PCI North Bridge (NB) to microprocessor primary memory.

7.4.1 NTSC Digital Video Decoder DMA Micro-coding

The Connexant Bt878 chip includes a DMA state-machine which fetches DMA micro-code from the PCI NB accessible microprocessor main memory. It is possible to generate a PCI interrupt as a side effect of any of the basic DMA micro-code instructions. The micro-code instructions provide basic sequencing of the Bt878 ADC channels and pixel FIFO. All of the Bt878 configuration, control, and status registers are PCI memory-mapped 32-bit registers. To start the DMA transfer the Bt878 must be fully configured, the DMA micro-code starting address loaded, and then the DMA state-machine activated.
The following is the basic DMA micro-code generation function for 320x240 NTSC frames used in the Bt878 VxWorks driver code for this experiment and for the RACE experiment described in Section 7.5:

    for (i = 0; i < NUMFRAMES; i++)
    {
        mcidx = MCSIZE * i;
        j = 0;

        /* NTSC ODD/EVEN FIELD SYNC */
        dma_microcode[mcidx+j] = DMA_MC_SYNC_FM1_WORD_0; j++;
        dma_microcode[mcidx+j] = DMA_MC_SYNC_WORD_1; j++;

        /* Initialize 120 lines of ODD microcode */
        for (j = 2, k = 0; j < 242; j += 2, k++)
        {
            dma_microcode[mcidx+j]   = DMA_MC_WRITE_1280_LINE;
            dma_microcode[mcidx+j+1] =
                (unsigned int)&(rgb32_ntsc_2_frame_buffer[i][(320+(k*640))]);
        }

        j = 242;

        /* NTSC VRE FIELD SYNC */
        dma_microcode[mcidx+j] = DMA_MC_SYNC_VRE_WORD_0; j++;
        dma_microcode[mcidx+j] = DMA_MC_SYNC_WORD_1; j++;
        dma_microcode[mcidx+j] = DMA_MC_SYNC_FM1_WORD_0; j++;
        dma_microcode[mcidx+j] = DMA_MC_SYNC_WORD_1; j++;

        /* Initialize 120 lines of EVEN microcode */
        for (j = 246, k = 0; j < 486; j += 2, k++)
        {
            dma_microcode[mcidx+j]   = DMA_MC_WRITE_1280_LINE;
            dma_microcode[mcidx+j+1] =
                (unsigned int)&(rgb32_ntsc_2_frame_buffer[i][0+(k*640)]);
        }

        j = 486;
        dma_microcode[mcidx+j] = DMA_MC_SYNC_VRO_WORD_0_IRQ; j++;
        dma_microcode[mcidx+j] = DMA_MC_SYNC_WORD_1; j++;
    } /* end for all frames */

    dma_microcode[(NUMFRAMES*MCSIZE)]   = DMA_MC_JUMP_TO_BEG;
    dma_microcode[(NUMFRAMES*MCSIZE)+1] = (unsigned int)&(dma_microcode[0]);

Once the Bt878 micro-code is activated, the DMA transfer of RGB32 format 320x240 frames starts without imposing any load on the Pentium microprocessor. However, the Bt878 master PCI bus transfers at 30 fps do impose a 7% load on the PCI bus, which is capable of a maximum burst rate of 3.3e+7 32-bit words per second. The PCI bus can easily handle this data rate, and on this test-bed there is essentially no contention for the bus, since the only other activity is DMA micro-code fetches (0.044% loading) and a minor amount of resource usage by the PCI video accelerator card.
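The PCI loading figures above can be checked with a short calculation. A 320x240 RGB32 frame is 76,800 32-bit words, moved 30 times per second; the micro-code program is roughly 488 words per frame (a figure derived here from the generation loop above, not stated in the text). A hypothetical helper:

```c
#include <assert.h>

/* Hypothetical check of the PCI loading figures: percent of a PCI bus
 * with a 3.3e7 words/sec burst rate consumed by a periodic transfer. */
static double pci_loading_percent(int words_per_transfer, double transfers_per_sec)
{
    const double pci_words_per_sec = 3.3e7;
    return 100.0 * ((double)words_per_transfer * transfers_per_sec) / pci_words_per_sec;
}
```

Pixel transfers come to about 6.98%, matching the 7% quoted, and micro-code fetches to about 0.044%.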
The ethernet transfer [Ether96] is accomplished through the ISA bus 10 Mb/sec ethernet interface. The ethernet cannot keep up with the PCI bus data rate even at the 10 Mb/sec raw rate, especially given TCP/IP overhead [WriSte94]. At 320x240, even 4 fps imposes 98.3% loading on the ethernet, so pragmatically only 3 fps is possible across the ethernet. To reduce ethernet bandwidth, the experiment converts the RGB32 format to an 8-bit grayscale image so that the ethernet is not so overloaded. This theoretically raises the upper-bound frame rate to 16 fps, but in practice only 5 fps was achievable, given the high overhead of TCP/IP and the fact that temporally accurate frame transfer requires TCP packet coalescing to be disabled (otherwise partial frames are delivered late). So, all TCP writes are immediate and the overhead is very high (fixed-size/cell ATM packets would be much more efficient).

7.4.2 RT EPA Digital Video Processing Pipeline
The RT EPA digital video processing pipeline uses the Bt878 NTSC decoder to generate frame source interrupts at 30 Hz, releasing a frame buffer management service which in turn releases a frame compression service, which then releases a network transport service for transmission of compressed frames over ethernet with the TCP/IP protocol. This pipeline is shown in Figure 15 and the services in the pipeline are summarized in Table 9. Table 9 also summarizes the expected CPU utility of each service release and the reliability desired for each in this evaluation. This experiment clearly includes a task set which loads the CPU above the RM least upper bound, whether worst-case or best-case utility is considered.
Table 9: Digital Video Pipeline Services

Service                                  Interval  Exec Time    Deadline  Worst-case  Best-case  Deadline
                                         (msecs)   (msecs)      (msecs)   Utility     Utility    Reliability
Video 320x240 source acquisition         33.33     1.0 +/- 0.5  30        04.50       01.50      1.0
  with DMA transfer
Video frame compression                  33.33     19 +/- 1     33        60.00       54.01      1.0
Video frame update packet transmission   200       55 +/- 5     200       30.00       25.00      0.9
TOTAL                                                                     94.50       80.51

7.5 RACE Optical Navigation and Control Experiment
The RACE digital control and continuous media experiment, depicted in Figure 15, uses video processing to determine range with +/- 5 cm knowledge from a visual target on its rail-guided vehicle ramp, and real-time motor control to propel it to a ramp target with +/- 10 cm deadbands. Active camera control maintains the target in the center of the camera field of view.

Figure 15: Basic Digital Video RT EPA Pipeline – NTSC video electronics and frame ADC (HW) feeding frame buffer, data compression, and packetization stages (SW) through to ethernet, with flow control through the application API.

The experiment provides a mixed hard and soft real-time application with challenging resource management requirements in terms of CPU utility, bus I/O bandwidth, and ethernet network bandwidth. The RACE operator is able to get a vehicle camera view of the optical target, in grayscale at a 5 Hz frame rate, on the host workstation used to download VxWorks application code. Furthermore, the operator can display the vehicle state as determined by the optical navigation on the host workstation at a 10 Hz rate.

7.5.1 RACE Mechanical System Overview
The RACE system is a hanging rail-mounted vehicle as shown in Figure 16 A and B.
The vehicle has only one degree of translational freedom, up and down the ramp (planned future work includes yaw control with a tail fan), and the main task for the system is to translate to a distance from the target and hold position while keeping the target centered in the camera field of view.

Figure 16 A and B: RACE System Side-View (A) and Frontal-View (B)

The target is a rectangle with red, green, and red stripes which the optical navigation uses to determine the distance of the RACE vehicle from the target. The vehicle is powered by two 8.4 volt motors which together can draw a peak current of 30 amps to move RACE quickly up the rail-guided ramp toward the target. The camera system includes hobby servo tilt/pan control so that the camera control system can keep the target in the center of the camera field of view. The RT EPA based software control system allows RACE to be targeted to a ramp position between 170 and 20 cm from the target. The RT EPA and all RACE control algorithms run on a 200 MHz Pentium processor running VxWorks 5.3.1. Interfaces between RACE and the Pentium microprocessor include 2 asynchronous RS-232 control channels and an NTSC analog video channel. Power and communication are provided through the electronics tether shown in Figure 16A. The vehicle ramp control is provided by a traditional proportional, derivative, integral controller which attempts to hit the desired control target with near-zero velocity to minimize overshoot. The centroid and distance to the target are computed at 10 Hz by the optical navigation algorithm. RACE is operated by one simple command to go to a target on the ramp between the bounds of 170 cm and 20 cm from the wall target. At all times the camera is actively controlled, as are both the left and right motors through electronic speed controllers. The vehicle is constructed of high strength-to-weight ratio basswood and aluminum.
The control of the vehicle is non-trivial given the stick-slip frictional characteristics of the ramp, and requires a 10 Hz PID controller to achieve +/- 10 cm target accuracy given +/- 5 cm positional knowledge.

7.5.2 RACE Electronics System Description
The RACE on-board electronics, shown in Figure 17, includes two main control boards: 1) the motor servo-actuated speed control board, and 2) the camera servo-actuated control board. The camera is interfaced directly to the ground control computer PCI bus Bt878 NTSC video decoder. Both control boards are interfaced to the ground computer through 8N1 protocol RS-232 asynchronous serial command channels. The motors are provided power by a ground-based 0-12 volt, 0-30 amp power supply. The ground-based computer runs all of the RT EPA services to command and control RACE and uses the Bt878 video decoder's DMA PCI master capability to digitize NTSC video from the camera and transfer 30 frames per second directly to the Pentium main memory for processing.

Figure 17: RACE Vehicle and Ground Control System Electronics – the Pentium/VxWorks ground computer (PCI Bt878 NTSC decoder, PCI NB, DRAM) connects over 9600 bd RS-232 channels to the OOPIC camera control board (tilt/pan servo commands) and to the NCD ASIC ramp control board (left/right motor speed control DACs), with the NTSC camera supplying 30 fps analog video and each motor fed at +8.4 V (0-15 A).

The PCI Bt878 NTSC decoder digitizes the analog NTSC signal and interfaces to main memory through the PCI north bridge. The Bt878 DMA micro-code is fetched from the Pentium main memory by the Bt878 DMA state machine. Actuation of the hobby servos is accomplished by a transistor-transistor logic (TTL) pulse-width modulated signal for both motor control and camera servo control.
The motor control board includes a National Control Devices (NCD) application-specific integrated circuit (ASIC) for TTL programmable output on 2 channels, which may be commanded through a multi-drop RS-232 interface. The camera control board is an object-oriented programmable integrated circuit (OOPIC), which is also capable of providing a programmable TTL output based on serial commands. The OOPIC also provides an analog-to-digital conversion (ADC) channel and additional digital control capability for possible future expansion of capabilities. Likewise, the NCD ASIC board can be expanded to include additional serial (RS-232) addressable servo control channels with an additional NCD ASIC. The RACE electronics are shown in Figure 18. The OOPIC is the circuit board in the lower left corner, the NTSC camera is mounted in front (lower right corner), and the NCD ASIC board is mounted in the back, with an interface to the left/right speed controllers visible in the upper center/left portion of the picture.

Figure 18: RACE Electronics

7.5.3 RACE RT EPA Command, Control, and Telemetry Services
All of the RACE automatic control, commanding, and telemetry is handled by 7 RT EPA services. The 7 services can also be spawned as traditional VxWorks tasks for performance comparison and to demonstrate the features of RT EPA execution compared to simple VxWorks application execution. The 7 main tasks are described in Table 10.
Table 10: RACE Task Set Description

Ti  Freq   Task      Description
1   30 Hz  tBtvid    NTSC frame-grabber event processing task which sequences all other RACE data processing and control
2   30 Hz  tFrmDisp  Frame display processing task; can produce XGA format frames from RGB32 as well as 8-bit grayscale single-color frames (R, G, or B only)
3   10 Hz  tOpNav    RACE optical target ranging and centroid computation task
4   10 Hz  tRACECtl  RACE ramp control task (computes left/right thrust command in order to achieve or hold a ramp position)
5   5 Hz   tTlmLnk   RACE state telemetry link through ethernet
6   5 Hz   tFrmLnk   RACE grayscale frame link through ethernet
7   3 Hz   tCamCtl   RACE tilt/pan camera control for target peak-up

7.5.3.1 Frame-based Processing and Control Sequencing
The basic rate for the RACE application (i.e. all services run at this rate or some sub-rate of it) is 30 Hz. The 30 Hz event rate is driven by the NTSC video frame acquisition complete event, which is asynchronously relayed to the microprocessor through the Bt878 DMA micro-code PCI interrupt vector assertion instruction. The PCI interrupt is routed to a Pentium interrupt request (IRQ), and a VxWorks interrupt service routine (ISR) is registered at this vector address in order to signal the frameRdy event. The frameRdy event is indicated to the tBtvid task through a binary semaphore. tBtvid in turn sets the current frame pointer to the completed frame address and gives semaphores to all other tasks running at sub-rates of this basic rate.

7.5.3.2 Frame Display Compression/Formatting Algorithm
The frames acquired and transferred to the Pentium main memory via the Bt878 DMA are in an Alpha, R, G, B 32-bit format. The DMA directly transfers the frames into a ring buffer, and the frameRdy ISR tracks the address of the last frame completed.
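The 30 Hz base-rate sequencing described in Section 7.5.3.1, where tBtvid releases the sub-rate tasks on every Nth frameRdy event, can be sketched in plain C (hypothetical helper names; the actual system uses VxWorks binary semaphores given by tBtvid):

```c
#include <assert.h>

#define BASE_RATE_HZ 30

/* Hypothetical sub-rate release test: a task running at sub_rate_hz
 * (an integer divisor of the 30 Hz base rate) is released on every
 * (BASE_RATE_HZ / sub_rate_hz)-th frameRdy event. */
static int should_release(unsigned int frame_count, int sub_rate_hz)
{
    int divisor = BASE_RATE_HZ / sub_rate_hz;  /* e.g. 10 Hz -> every 3rd frame */
    return (frame_count % divisor) == 0;
}
```

For example, a 10 Hz task such as tOpNav would be released on frames 0, 3, 6, ..., while a 5 Hz task such as tTlmLnk would be released on every 6th frame.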
The tFrmDisp task can produce 4 types of compressed frames (relative to the RGB32 frame): 1) XGA 5:6:5 format, 2) Red-only grayscale, 3) Green-only grayscale, and 4) Blue-only grayscale. By default it produces only a Red-only grayscale compressed copy of each RGB32 frame. For grayscale, the service simply takes one color channel from the RGB32 current frame buffer and extracts an 8-bit-pixel monochrome frame from the RGB32 frame.

7.5.3.3 Optical Navigation Algorithm
The basic video processing algorithm for determining range to the ramp target is based upon the total pixel width of the target, determined by a simple pixel-intensity-change edge detection. The pixel width is determined by counting pixels on each scan-line which exceed an RGB vector threshold; each count is then used to update a frequency distribution, so that the maximum likelihood of the true target pixel width is determined from the highest frequency count over all scan-lines. Given the RACE 320x240 RGB format, the frequency vector has 320 elements, and given a well-tuned threshold for a given target, the distribution method can determine the target size for a given distance with at most +/- 1 pixel of uncertainty in the ranging. At maximum distance from the target on the RACE ramp this translates into +/- 5 cm ramp position uncertainty. Figure 19 A and B shows a typical RACE optical navigation target width frequency distribution for the furthest and closest distances from the optical target.
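The maximum-likelihood width computation just described can be sketched as follows (a simplified illustration, not the flight code; the per-scan-line pixel thresholding is assumed to have already produced a width count for each scan-line):

```c
#include <assert.h>

#define FRAME_WIDTH 320

/* Given the per-scan-line counts of pixels exceeding the RGB threshold,
 * build a frequency distribution over possible target widths and return
 * the most frequent (maximum-likelihood) width. */
static int ml_target_width(const int *scanline_widths, int num_scanlines)
{
    int histogram[FRAME_WIDTH + 1] = {0};
    int i, w, best_width = 1;

    for (i = 0; i < num_scanlines; i++)
        if (scanline_widths[i] > 0 && scanline_widths[i] <= FRAME_WIDTH)
            histogram[scanline_widths[i]]++;

    /* the mode of the distribution is the maximum-likelihood target width */
    for (w = 2; w <= FRAME_WIDTH; w++)
        if (histogram[w] > histogram[best_width])
            best_width = w;

    return best_width;
}
```

Scan-lines that cross no target pixels (width 0) are ignored, so noise above and below the target does not bias the estimate.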
Figure 19 A and B: Target Width Distribution for All Scan-lines – Close (A) and Far (B). Each panel plots the number of scan-lines observed at each target pixel width.

The centroid calculation is made as a side effect of the maximum likelihood target width computation. The start scan-line for the first maximum likelihood width and the end scan-line determine the vertical location of the centroid in the field of view, and the start and end scan-line pixel addresses for the target are used to determine the horizontal location of the centroid.

7.5.3.4 RACE Control Algorithm
The ramp hold control law is based on maintaining the distance to the target within a deadband range in which the total motor thrust is simply required to counteract the force down the plane due to gravity minus static wheel friction. When the vehicle is outside the ramp position deadbands, the total thrust is set to an amount proportional to the distance out of range, such that the vehicle will re-enter the deadband range. The controller motor ramp-up is proportional to the distance from the target and uses velocity and acceleration derivative terms to control translation to the target in order to hit the target with zero velocity. The control deadbands of +/- 10 cm (4 inches) keep RACE from over-controlling, although optical instabilities can sometimes lead to motor oscillations.

7.5.3.5 State Telemetry Link Algorithm
The state of the RACE vehicle, as computed by the tOpnav optical navigation task, is downlinked at 10 Hz over ethernet to a workstation display server. The TCP/IP protocol is used with the no-delay option so that the telemetry represents the real-time state of RACE accurately to the user.
State information includes:

    struct race_state
    {
        /* Translational */
        int pos, vel, accel;

        /* Rotational */
        int yaw, yaw_rate, yaw_accel;

        /* Commands */
        int direction;
        unsigned short right_servo_cmd, left_servo_cmd,
                       tilt_servo_cmd, pan_servo_cmd;

        /* Target Dimensions */
        int target;
        unsigned short target_size, target_x, target_y;
    };

7.5.3.6 Grayscale Frame Link Algorithm
The grayscale frame link algorithm simply writes out a frame to a VxWorks TCP/IP client socket so that it may be received and displayed by a host workstation X window system display application.

7.5.3.7 NTSC Camera Tilt and Pan Control Algorithm
The RACE camera system includes hobby servo tilt/pan control so that the camera control algorithm can keep the target centroid in the center of the field of view.

7.5.4 RACE RT EPA Software System
The RACE RT EPA pipeline is the most complicated RT EPA application described in this thesis. It implements 4 separate concurrently executing data processing pipelines. The pipelines and stages are depicted in Figure 20, and pipeline stage and end-to-end frequencies can be modified on-line using the RT EPA pipeline control API. For the RACE experiments, the pipelines are typically set to 2 Hz output for P0, 5 Hz output for P1, 15 Hz output for P2, and 3 Hz output for P3; however, all four pipelines share the tBtvid pipeline source, which executes at 30 Hz, so any of the pipelines can be configured to that maximum rate or any sub-rate of 30 Hz. The sub-frequencies typical for each stage are also annotated on Figure 20. A full specification of the RACE services and results is given in Section 8.4.
Figure 20: RACE EPA Pipeline – the frameRdy interrupt from the Bt878 PCI NTSC decoder (30 Hz, HW) releases tBtvid (30 Hz, SW), which sources the four pipelines P0-P3: tFrmDisp (10 Hz), tFrmLnk (2 Hz), and tNet (2 Hz, TCP/IP packets over ISA ethernet); tTlmLnk (5 Hz); tOpnav (15 Hz) and tRACECtl (15 Hz, serial A/B); and tCamCtl (3 Hz, serial A/B).

7.6 Robotic Test-bed
The robotic test-bed uses the RT EPA to control a 5 DOF robot to perform pick-and-place. Unfortunately, while the robotic test-bed was shown to work with the RT EPA, the results did not demonstrate the capabilities of the RT EPA well, given the low fidelity of the system (all motor joint control is simple electromechanical relay on/off control rather than DAC motor control). However, for completeness, the basic results are presented here since this is another usage example of the RT EPA. Two robotic arms were built and tested (pictured in Figure 21 A and B). The first (left) was controlled with dead reckoning of the joints based on rotation rates and time of rotation to translate the gripper to a target. There was no problem performing a simple pick-and-place with the 5 RT EPA tasks enumerated in Table 11 (one per joint) and the arm pictured in Figure 21 A. The second arm included position sensors for joint rotation feedback and limits monitoring. This system included one targeting and arm control service, a limits monitoring service, and a telemetry service. Again, basic pick-and-place could be performed without difficulty using the arm shown in Figure 21 B. For the position feedback arm, the configuration space for the robot arm is based upon a set of total joint rotations, since each joint has only one degree of rotational freedom. There is no requirement on the trajectory the gripper follows, so a greedy algorithm is used whereby the joint controllers all try to complete rotation as soon as possible.
Each joint controller task uses a joint kinematics model to compute the gripper location relative to the arm coordinate system and uses a simple joint controller algorithm for the motor relay control. The basic joint controller algorithm is as follows:

    if (within_range(current_position[joint], target_position[joint]))
    {
        terminate_motion_for_sequence(joint);
    }
    else if (current_position[joint] < target_position[joint])
    {
        rotate_positive_direction(joint);
    }
    else if (current_position[joint] > target_position[joint])
    {
        rotate_negative_direction(joint);
    }

The predicate within_range is a function of the joint monitoring frequency and the motor average rotational rate, chosen to prevent limit cycles; the accuracy of the arm is therefore directly proportional to the monitoring rate. All targets for pick and place locations are specified in terms of configuration-space joint angles relative to the arm start position.

Table 11: 5 DOF Robotic Experiment Task Set Description

Task ID  Task Description
1        Base rotation control
2        Shoulder pitch control
3        Elbow pitch control
4        Wrist rotation control
5        Gripper control

Figure 21 A and B: 5 DOF Dead-Reckoning Robot (A, left), Position Feedback Robot (B, right)

7.7 Robotics Test-bed Inconclusive Results
The loading on the robotic test-bed, even at high release frequency, was lower than the RMA least upper bound, and therefore fundamentally failed to meet the experimental goals for the RT EPA; the option of providing additional synthetic loading was considered, but this seemed no more valuable than the pseudo loading experiment. Furthermore, the systems-level performance metric planned for the 5 DOF robotic test-bed is the target pick-and-place error. The pick target and place target errors were planned to be measured in degrees for each joint from the desired target, which translates into a distance error between the gripper target and the pick/place target center.
Unfortunately, these system errors were not a direct result of the execution and release variances, but rather of much more significant mechanical motor joint variances. The joint control algorithm was tested for a range of individual rotations on this under-loaded system, and it was found that joint controller errors overshadowed any error that might have resulted from task release and execution variances. It was therefore decided that continued experimentation with this low-fidelity arm was not worthwhile for demonstrating the RT EPA capabilities or meeting the RT EPA experimentation goals.

8 Experimental Results
This section presents the results obtained from all four RT EPA experiments: 1) pseudo loading, 2) the SIRTF/MIPS instrumentation, 3) the basic video processing pipeline, and 4) the RACE optical navigation test-bed. In all of the experiments performed, the thread sets were marginal (i.e., loading exceeded the RMA least upper bound). For such marginal task sets, an RMA admission test will fail if WCETs are provided. If expected execution times are provided instead, the thread set may be admitted, but there is no indication of the expected deadline hit/miss rate from RMA. This is the fundamental problem with traditional RMA and DM admission tests compared to the RT EPA CBDM admission test: both RMA and DM admission tests can either guarantee or deny service, but otherwise can make no estimate of the quality of service for task sets that are marginal, as shown here.
8.1 RT EPA Experimentation Goals
In each experiment it is shown that the task set is marginal and that the RT EPA is not only able to schedule the task set successfully, but that the expected performance is either met or exceeded, or the RT EPA derives a refined model on-line; the thread set is protected from overruns, occasional missed deadlines are handled, and system latency and jitter are correctly computed. The goals for the RT EPA experimentation include:
1) Demonstration of the admission and execution control of marginal thread sets with deadline reliability specification.
2) Demonstration of on-line kernel monitoring and use of on-line models as a basis to renegotiate for a new service level based on actual performance.
3) Demonstration of RT EPA pipeline phasing control (goal 3a) and control to meet isochronal output requirements (goal 3b).
4) Establish the viability of ME theory by showing scheduling feasibility for a real-world system that otherwise cannot be scheduled.
5) Demonstrate protection of services from deadline over-runs (i.e. fire-walling of services from each other).
The pseudo loading experiment met 3 of the 5 goals, the SIRTF/MIPS usage of the RT EPA demonstrated 2 of 5, and the video processing and RACE experiments each demonstrated all 5. So, between the four RT EPA experiments, all experimental goals for the RT EPA were met, and the RACE experiment demonstrated all of the RT EPA goals on a single system.

8.2 RT EPA Pseudo Loading Tests
The pseudo loading RT EPA results clearly show that a task set with worst-case execution times that cannot be scheduled according to hard RMA or DM theory may be able to be scheduled by the RT EPA confidence-based scheduler.
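The confidence-based deadline feedback used throughout this chapter rests on distribution-free estimates from a history of measured response times; a minimal sketch of the idea, assuming a simple sorted empirical-quantile estimator rather than the actual RT EPA on-line model:

```c
#include <assert.h>
#include <stdlib.h>

static int cmp_uint(const void *a, const void *b)
{
    unsigned int x = *(const unsigned int *)a;
    unsigned int y = *(const unsigned int *)b;
    return (x > y) - (x < y);
}

/* Distribution-free deadline estimate: the smallest deadline D such that
 * at least ceil(confidence * n) of the n observed response times in the
 * history meet D.  No distributional assumptions are made. */
static unsigned int deadline_for_confidence(const unsigned int *history,
                                            int n, double confidence)
{
    unsigned int *sorted = (unsigned int *)malloc(n * sizeof(unsigned int));
    unsigned int d;
    int i, idx;

    for (i = 0; i < n; i++) sorted[i] = history[i];
    qsort(sorted, n, sizeof(unsigned int), cmp_uint);

    idx = (int)(confidence * n + 0.999999);  /* ceil(confidence * n) */
    if (idx > n) idx = n;
    d = sorted[idx - 1];
    free(sorted);
    return d;
}
```

For example, given ten response-time samples and a requested confidence of 0.8, the estimate is the 8th smallest sample; a confidence of 1.0 yields the largest observed response time.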
The results show that not only can the RT EPA schedule such marginal task sets, but that it can do so at the requested confidence levels and can provide feedback from the on-line monitoring, including a distribution-free estimate of the minimum hard and soft deadlines for the desired confidences. Furthermore, this feedback was used to tune the execution models and the service level requests (off-line re-negotiation for service). This experiment therefore met goals 1, 2, and 3.

8.2.1 Pseudo Load Marginal Task Set Negotiation and Re-negotiation Testing (Goal 2)
Table 12 shows a task set admitted to the RT EPA with expected execution times and WCETs based on off-line execution models, and with desired confidences in meeting specified deadlines. As long as the initial off-line model is reasonable and the requests for confidence are within the DM bounds for utility and interference, the RT EPA will admit the task set based on this initial negotiation. The RT EPA also provides on-line monitoring and fire-walling to protect all other system tasks from bad releases, by terminating any release that would run past its termination deadline. A major influence on the VxWorks release and execution jitter performance is the tExcTask, which handles maintenance of interval timers along with exception handling, i.e. all basic RTOS interrupts and associated processing. In these runs, which were 6 seconds long, it was released 940 times in the first run and 880 times in the second, out of 11750 total dispatches (1205 RT EPA dispatches) in the first and 11787 total dispatches (1146 RT EPA dispatches) in the second.
Table 12: Pseudo Loading Marginal Task Set Description (Timer Released)

task  Soft Conf  Hard Conf  T     Dsoft  Dterm  Cexp  CPU Exp Util  WCET  CPU WCET Util
1     0.80       1.00       150   100    150    35    0.233         70    0.466
2     0.75       0.95       200   150    200    35    0.175         70    0.350
3     0.70       0.90       1000  200    250    35    0.035         70    0.070
4     0.65       0.85       1000  250    500    35    0.035         70    0.070
5     0.60       0.80       1000  250    500    35    0.035         70    0.070
6     0.55       0.75       1000  250    500    35    0.035         70    0.070
Total                                           0.548               1.096

After running with this admitted task set, the RT EPA provides an on-line model continuously refined based on the performance of on-line releases, up to the fidelity requested. Perhaps most important, the RT EPA provides feedback on the confidence in the supplied soft and hard deadlines, as well as the current model soft and hard deadlines for the requested confidences. So, given on-line model information, which is derived from a distribution-free history, an application can refine its level of service based on actual performance with the task set. One typical reason that on-line performance varies from initial off-line estimates is interaction with the other tasks in the set: for example, memory reference traces that cause more cache misses than in isolated testing. Table 13 summarizes the actual performance of the task set from Table 12 after being admitted, activated, and run to collect data. Table 13 clearly shows that the actual performance was better than expected based on the initial execution models provided, so the Dsoft and Dterm confidences were met and exceeded in all cases. Given the better than expected performance, it was possible to re-negotiate for the tighter Dsoft and Dterm values summarized in Table 14.
Table 13: Pseudo Loading Actual Marginal Task Set Performance (Timer Released)

t  Online      Tact    Cact-low, Cact-high,  N        Dsoft Conf  ∆  Dterm Conf  ∆  Dsoft Online  Dterm Online
   Model Size  (msec)  Cact-exp              preempt  actual         actual         Model         Model
1  400         150     1, 63, 27             35       1.0         +  1.0         0  53            63
2  301         250     0, 41, 110            93       1.0         +  1.0         +  66            109
3  60          1000    0, 2.5, 4             71       1.0         +  1.0         +  3             4
4  60          1000    0, 2.4, 5             56       1.0         +  1.0         +  3             4
5  60          1000    0, 2.5, 5             9        1.0         +  1.0         +  3             4
6  60          1000    5, 62, 103            1        1.0         +  1.0         +  83            90

The results for the refined model, based on on-line execution with deadlines adjusted to be closer to those possible for the requested confidences (Table 14), are summarized in Table 15.

Table 14: Pseudo Loading Marginal Task Set Description (Timer Released)

t  Soft Conf  Hard Conf  T     Dsoft  Dterm  Cexp  CPU Exp Util  WCET  CPU WCET Util
1  0.80       1.00       150   60     80     35    0.233         70    0.466
2  0.75       0.95       200   80     120    35    0.175         70    0.350
3  0.70       0.90       1000  10     10     35    0.035         70    0.070
4  0.65       0.85       1000  10     10     35    0.035         70    0.070
5  0.60       0.80       1000  10     10     35    0.035         70    0.070
6  0.55       0.75       1000  100    100    35    0.035         70    0.070
Total                                        0.548               1.096

From the results in Table 15 it is evident that a few of the renegotiated deadlines could not be met. The RT EPA performance monitoring therefore allows an application programmer to iteratively refine models with on-line testing until maximum safe utility can be derived from the system.
Table 15: Pseudo Loading Actual Marginal Task Set Performance (Timer Released)

t  Online      Tact    Cact-low, Cact-high,  N        Dsoft Conf  ∆  Dterm Conf  ∆  Dsoft Online  Dterm Online
   Model Size  (msec)  Cact-exp              preempt  actual         actual         Model         Model
1  400         150     1, 27, 103            32       0.93           0.93           53            103
2  301         200     0, 26, 51             63       1.0         +  1.0         +  51            51
3  60          1000    0, 2.4, 5             70       1.0         +  1.0         +  3             4
4  60          1000    0, 2.4, 5             38       1.0         +  1.0         +  3             4
5  60          1000    0, 2.5, 5             2        1.0         +  1.0         +  3             4
6  60          1000    1, 29.5, 46           0        1.0         +  1.0         +  38            40

8.3 SIRTF/MIPS Video Processing RT EPA Monitoring Evaluation
The SIRTF/MIPS instrument video processing software, as originally designed, suffered from three common real-time scheduling problems: execution variance, poor release phasing, and response jitter. These problems were easily solved using the monitoring capabilities of the RT EPA. It was not necessary to use the full RT EPA implementation, but rather just the kernel-level monitoring capabilities; even so, this clearly demonstrates the utility of the RT EPA kernel monitoring. Furthermore, the RT EPA kernel monitoring facilities incorporated into the SIRTF/MIPS instrument software can be enabled during space telescope mission operations and data returned through telemetry. The SIRTF/MIPS thread set was marginal, and prior to the incorporation of the RT EPA monitoring, scheduling problems were preventing successful processing of several instrument exposure types. These problems were:
1) Execution time variance – execution time for releases of the data processing tasks had high variance due to the POWER architecture and the nature of frame processing algorithms – solved with code optimizations to minimize buffer copies.
2) Poor phasing of releases, such that loading was not well distributed, increasing interference – solved by ME.
3) Lack of visibility into the thread release and completion times and associated jitter needed to optimize the loading and phasing of task releases – solved by use of RT EPA monitoring.
The key to solving the SIRTF/MIPS problem was to use the monitoring facilities of the RT EPA to identify release phasing problems that were leading to high interference, and to recognize that the MIPS data processing algorithm and event rates lead to two distinct processing epochs during an exposure. Furthermore, the SIRTF/MIPS software was also shown to have two stressful loading epochs, one related to exposure-start events and one related to steady-state exposure processing. The identification of the MIPS steady-state exposure processing epochs using the RT EPA kernel monitor is presented along with bus analyzer timing analysis for comparison. Table 16 summarizes measured execution times, providing low, average, and high execution times. The variance in execution time clearly demonstrates the jitter problem. Furthermore, the sum of the slope computation time and the difference image computation time clearly exceeds the 65 msec slice period, and this badly phased loading, along with the jitter, was responsible for missing intermediate slice processing deadlines. Without the monitoring capabilities of the RT EPA, it would have been difficult to diagnose the problem.

Table 16: RT EPA Execution Jitter in SIRTF/MIPS Si Frame Processing Releases

        Ave (msec)  High (msec)  Low (msec)
Slice   39.6        87           19
Slope   509         522          497
Diff    216.5       322          202

Given the problems with the initial SIRTF/MIPS software, the difference image computation was moved to the end of the data collection, so that both the difference image and the slope image were computed during the reset frame. This optimization, summarized in Table 17, led to a much better loading.
Table 17: RT EPA Execution Jitter in SIRTF/MIPS Si Optimized Frame Processing Releases

Slice Ave. | Slice High | Slice Low | Diff/Slope Ave. | Diff/Slope High | Diff/Slope Low   (all msec)
39.8       | 75         | 17        | 521.9           | 527             | 514

So, this actual use of the RT EPA monitoring capabilities met experimentation goals 2 and 4.

8.3.1 SIRTF/MIPS RT EPA DM Priority Assignment

The priorities for the SIRTF/MIPS flight software were assigned according to the DM policy, which is the same as the CBDM policy of "highest priority given to the shortest-deadline task". These priority assignments are enumerated in Table 18. The use of the RT EPA and the DM policy on SIRTF/MIPS, which has 26 services, clearly demonstrates the applicability of the RT EPA to an actual system.

Table 18: SIRTF/MIPS DM Priority Assignments

Task name   | f (Hz)    | Ti (msec) | Deadline (msec) | Priority | Deadline Rationale
tDDC        | 100       | 10        | 10              | 3        | Maximum collection rate required
tERR        | 30.51757  | 32.768    | 30              | 6        | Worst-case error generation rate from FIFO
tSDMFIFODr  | 30.51757  | 32.768    | 32              | 9        | SDM FIFO servicing rate
tCmdHandler | 3.003003  | 333       | 35              | 12       | Total time for command/response must be < 200 msec, so each phase FIFO -> Handler -> Rsp must sum to less than 200 msec including overhead = 35+36+37 = 108
tCmdFIFODr  | 3.003003  | 333       | 36              | 15       | (same as above)
tRspFIFODr  | 3.003003  | 333       | 37              | 18       | (same as above)
tGeFIFODrv  | 7.629394  | 131.072   | 60              | 21       | Maintain ½ frame margin in processing
tGeDP       | 7.629394  | 131.072   | 61              | 24       | Maintain ½ frame margin in processing
tDetCmd     | 5         | 200       | 62              | 27       | Total time for command/response must be < 200 msec, so each phase FIFO -> CmdNormal -> DetCmd -> Rsp must sum to less than 200 msec including overhead = 36+63+62+37 = 198
tCmdNormal  | 5         | 200       | 63              | 30       | (same as above)
tSiFIFODrv  | 15.25878  | 65.536    | 65              | 33       | Si half-full FIFO rate for 524.288 msec frame time
tSiDp       | 15.25878  | 65.536    | 66              | 36       | Si slice rate for 524.288 msec frame time
tMIPSExpMgr | 1.430615  | 699       | 700             | 39       | Ge Only subgroup production rate + Si raw subgroup production rate ((2096)/3)
tSiHtrMon   | 1         | 1000      | 701             | 42       | Si heater control must run every second and is released by ADC semaphore
tADC        | 1         | 1000      | 702             | 45       | Analog data collection must run every second and should complete BEFORE limits or telemetry so that data is not stale
tLIM        | 1         | 1000      | 900             | 48       | Checked every second after ADC
tDIAG       | 1         | 1000      | 906             | 51       | Handles commands (max rate once a second)
tIM         | 1         | 1000      | 907             | 54       | Handles commands (max rate once a second)
tCKSUM      | 1         | 1000      | 909             | 57       | Performs checksum (max rate once a second)
tSdmSend    | 1         | 1000      | 910             | 60       | Handles SDM send requests (max rate is Ge Only every MIPS second)
tTLM        | 0.25      | 4000      | 2000            | 63       | Telemetry collection (rate is every 4 seconds, but deadline is half to prevent stale data)
tPKT        | 0.25      | 4000      | 2001            | 66       | Packet builder (rate is every 4 seconds, but deadline is half to prevent stale data)
tIRSExpMgr  | 0.5       | 2000      | 2002            | 69       | Maximum group production rate for IRS is every 2 seconds (inactive in MIPS modes)
tShell      | 0.1       | 10000     | 10000           | 72       | The shell has no real hard deadline
tLogTask    | 0.1       | 10000     | 10000           | 75       | The LogTask has no real hard deadline
tSCRUB      | 0.01      | 100000    | 100000          | 78       | The memory scrubber is best effort

8.3.2 MIPS Exposure-Start Reference Timing Model

The reference exposure-start timing model shows the relative order of expected events and times, as well as actuals, on the Rad6k microprocessor. In all of the MIPS FSW observing modes (photometry, sky-scan, super-resolution, total power, and spectral-energy-distribution), the MIPS FSW, HTG, and TPG state machines are in a ready state. The ready state is defined by the following CE hardware and software conditions: 1) The HTG state machine is executing the ready image cycle (2 MIPS seconds), is asserting the TPG reset address lines, and is producing HTG frames with valid engineering analog and digital status data, but invalid Ge detector science.
2) The TPG is slaved to the HTG and under its control through the address line interface between the HTG and TPG, and is executing the Si detector reset timing pattern. 3) The Ge DP task (tGeDP) and the Ge FIFO driver task (tGeFIFODrv) are acquiring frames from the IOB Ge FIFO and extracting the analog and digital status engineering data from them.

The ready state of the CE hardware and software provides for Ge detector telemetry collection by the HTG between exposures; an exposure is therefore started by synchronously reconfiguring the HTG for the exposure while it is already running in the ready state -- the exposure configuration and data acquisition will start on the next HTG IC. The synchronization is achieved through a combination of hardware support (HTG synchronous latching of double-buffered registers into the state machine prior to each IC start) and software support, whereby the MIPS Detector Command task (tDetCmd) writes out the HTG register commands for the exposure within a timing window that is not too close to an HTG IC boundary. If the MIPS FSW encounters the HTG ready IC in the "Unsafe Hold-Off Region", then it waits 4 frames plus a patchable additional delay in order to synchronize with the HTG on the next ready IC. This synchronous exposure command window is depicted in Figure 22.

Figure 22: MIPS Mode HTG Ready Exposure-Start HW/SW Synchronization Window
[Figure: timeline of the HTG ready and exposure image cycles over the Si boost/reset/sample and Ge frame sequence, showing the Safe Synchronous Exposure-Start Window and the Unsafe Hold-Off Region with markers at 1573, 2097, and 2621 msecs.]

The FSW scheme for synchronizing with the HTG results in two timing models for both SUR and Raw MIPS exposure starts. These two cases are referred to as Start Case A and Start Case B.
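The hold-off decision described above can be sketched as follows. This is an illustrative Python sketch of the logic, not the flight code (which is C on VxWorks); the function name is hypothetical, while the 131.072 msec frame, the 4-frame patchable count, and the resulting 1573 msec window boundary are taken from the text and Figure 22.

```python
FRAME_MS = 131.072            # one HTG frame
IC_MS = 16 * FRAME_MS         # one image cycle, ~2097 msecs
HOLDOFF_FRAMES = 4            # pc_ge_dp_WriteSafeICFrameCnt default
UNSAFE_START_MS = IC_MS - HOLDOFF_FRAMES * FRAME_MS   # ~1573 msecs

def exposure_start_action(ms_into_ready_ic):
    """Return 'latch' if the HTG exposure registers may be written now
    (Start Case B), or 'defer' to the next ready IC (Start Case A)."""
    if ms_into_ready_ic < UNSAFE_START_MS:
        return "latch"   # safe synchronous exposure-start window
    return "defer"       # unsafe hold-off region: wait 4 frames + delay

# Commands early in the ready IC latch; commands in the last 4 frames defer.
assert exposure_start_action(100.0) == "latch"
assert exposure_start_action(1600.0) == "defer"
assert round(UNSAFE_START_MS) == 1573
```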
Start Case A, the worst-case exposure delay, is the result of an asynchronous command arriving inside the Unsafe Hold-Off Region. In this case, the MIPS FSW must delay the exposure start because the HTG image cycle is too near the ready image cycle end for the FSW to safely transition the state machine to exposure settings (i.e., race conditions between the state machine and the data processing software would otherwise cause data processing and configuration errors). So, in Start Case A, the software iteratively delays until the unsafe region has passed, but it does complete the science generation response and Si DP initialization in the meantime, and finally synchronizes the exposure start near the beginning of the next image cycle. Therefore, in Start Case A, after command receipt the exposure image cycle will not start for at least one full image cycle plus some portion of the hold-off time (between 2097 and 2855 msecs, assuming that pc_ge_dp_WriteSafeICFrameCnt is maintained at the current setting of 4 frames). For Start Case A, the resynchronization will occur between 756 and 1282 msecs after the exposure command is received (756 being the hold-off time plus the write-safe wait time; the worst case of 1282 includes 2 frames of error in this iterative synchronization approach). An example of Start Case A is shown in Figure 23.

Figure 23: MIPS Exposure Start Worst Case Delay (Case A)
[Figure: timeline of the worst-case exposure start delay of 2621 msecs -- the exposure command is received inside the 524 msec Unsafe Hold-Off Region, followed by a 234 msec write-safe wait and HTG resynchronization on the next 2097 msec ready IC, before the HTG exposure IC begins.]

Start Case B, the best-case exposure start from the standpoint of minimum delay to exposure start, occurs when the exposure command is received just before the Unsafe Hold-Off Region. In this case, it is safe to reconfigure the HTG state machine since sufficient time exists prior to the end of the current image cycle.
If the command is processed before entry into the unsafe region, then the exposure image cycle will occur somewhere between 524 msecs and 2097 msecs after exposure command receipt. Furthermore, the resynchronization will occur as quickly as 20 msecs from command receipt, but definitely before 524 msecs. This exposure start scenario is depicted in Figure 24.

The MIPS FSW scheme for synchronizing software configuration, control, and data processing with the state machine is based on several key system design facts: 1) The HTG state machine is already running an image cycle prior to commanded exposure starts, asynchronously with respect to command processing (this enables telemetry acquisition when not taking exposure data). 2) The HTG state machine does provide double buffering of key control and configuration registers, but the synchronous latching of these registers into the state machine must still NOT be commanded by the MIPS FSW too close to an image cycle boundary (i.e., no closer than a partial frame from the boundary, or less than 131 msecs). 3) The MIPS FSW has knowledge of where the HTG state machine is in the image cycle to +/- 2 HTG frames (264 msecs), based upon the fact that the HTG FIFO driver is always holding a partial frame (due to the modulus between the size of frames, 1152 words, and the size of a half-full FIFO, 2048 words), and upon the frequency of the GeDP task releases and processing time. 4) The DetCmd task, as noted in Table 18, has a deadline of 62 msecs, and therefore the time from release between checking the HTG state in GeDP and actually completing the software-hardware synchronization can be a good portion of this time. Given facts 1 to 4 above, it was determined that a hold-off region of 4 HTG frames was sufficient for the worst case.
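The modulus relationship in fact 3 can be checked numerically. The sketch below (illustrative Python; the flight FIFO driver is C on VxWorks) services a half-full FIFO threshold of 2048 words against 1152-word frames and shows that the driver carries a partial-frame residual between services, with the residual cycle closing after 9 services (9 x 2048 = 16 x 1152 words).

```python
FRAME_WORDS = 1152       # one HTG frame
HALF_FIFO_WORDS = 2048   # half-full FIFO service threshold

def fifo_service(residual_words):
    """One half-full FIFO service: return (whole frames completed,
    leftover partial-frame words still held by the driver)."""
    words = residual_words + HALF_FIFO_WORDS
    return words // FRAME_WORDS, words % FRAME_WORDS

# Walk one full residual cycle: the residual is nonzero between most
# services, which is why frame boundaries are only known to +/- 2 frames.
residual, frames, residuals = 0, 0, []
for _ in range(9):
    n, residual = fifo_service(residual)
    frames += n
    residuals.append(residual)

assert frames == 16 and residual == 0     # cycle closes exactly
assert residuals[:4] == [896, 640, 384, 128]
```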
While the MIPS FSW HTG synchronization does cause exposure start jitter of up to 2 MIPS seconds, it guarantees that the software data processing, the HTG state machine, and software control of the state machine are fully synchronized no matter when an exposure start command is received and processed.

Figure 24: MIPS Exposure Start Best Case Delay (Case B)
[Figure: timeline of the best-case exposure start -- the SUR/Raw exposure command is received just before the 524 msec Unsafe Hold-Off Region of the 2097 msec HTG ready IC, HTG resynchronization follows quickly, and the HTG exposure IC starts on the next cycle boundary.]

The MIPS exposure start sequence and timing are very similar for both Raw and SUR (Sample Up-the-Ramp) exposures; however, they do differ slightly, so Table 19 enumerates the order of expected events for an SUR exposure start, and Table 20 enumerates the expected events for a Raw exposure start. Both tables include actual measurements made on the flight microprocessor system, along with tolerances derived from this model validation. While the deadlines are generous and therefore allow significant jitter in events, the ordering should not change, except as noted for Start Cases A and B, due to the variations possible in software synchronization with the detector timing generator hardware. Furthermore, note that the model includes validation of both Start Case A and Start Case B. It is important that the regression tester be aware of the two start cases and account for the possibility by verifying both cases or at least one of the possible cases.
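A regression check of this kind can be sketched as follows (illustrative Python; the marker names and the sample Case B values are taken from Table 19, but the harness itself is hypothetical -- the actual checks run against Vmetro bus analyzer time tags).

```python
def check_exposure_start(events, deadlines):
    """events: list of (marker, msec since exposure command);
    deadlines: marker -> deadline in msec.
    Return a list of ordering or deadline violations (empty = pass)."""
    problems = []
    last_t = -1.0
    for marker, t in events:
        if t < last_t:
            problems.append(f"{marker}: out of order at {t} msec")
        if t > deadlines[marker]:
            problems.append(f"{marker}: {t} msec misses {deadlines[marker]} msec")
        last_t = t
    return problems

# Case B measurements for the first few markers (values from Table 19).
events = [("EXP_PARSED_AND_INIT", 2.247),
          ("EXP_SI_SUBGRP_MQ_CHECKED", 2.359),
          ("EXP_GE_SUBGRP_MQ_CHECKED", 2.385)]
deadlines = {"EXP_PARSED_AND_INIT": 3.0,
             "EXP_SI_SUBGRP_MQ_CHECKED": 3.4,
             "EXP_GE_SUBGRP_MQ_CHECKED": 3.8}
assert check_exposure_start(events, deadlines) == []
```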
Table 19: MIPS SUR C0F2N2 Exposure Start Vmetro Time Tags

C Code Marker / I/O Board Event | Bus Tag {Port, Data} | Case A (msecs) | Case B (msecs) | Deadline A or B (msecs)
SUR Exposure Command Received (collection trigger) | 0xC434, 0x5523 | 0.0 | 0.0 | N/A
EXP_PARSED_AND_INIT | 0x904, 0x9801 | 2.335 | 2.247 | 3.0
EXP_SI_SUBGRP_MQ_CHECKED | 0x904, 0x9802 | 2.552 | 2.359 | 3.4
EXP_GE_SUBGRP_MQ_CHECKED | 0x904, 0x9803 | 2.585 | 2.385 | 3.8
EXP_SI_FIFO_RESET_DONE | 0x904, 0x9804 | 3.661 | 2.473 | 4.2
EXP_INTS_CONNECTED | 0x904, 0x9805 | 3.705 | 3.517 | 4.6
EXP_SUR_DP_SETUP_DONE | 0x904, 0x9806 | 3.766 | 3.580 | 5.0
EXP_FIRST_SDM_ALLOC_DONE | 0x904, 0x9808 | 3.949 | 3.772 | 5.4
EXP_START_RECORDED | 0x904, 0x9809 | 4.106 | 4.042 | 5.8
EXP_CMD_RSP_SENT | 0x904, 0x980a | 4.417 | 4.348 | 6.2
EXP_FIRST_GRP_TIMEOUT_SET | 0x904, 0x980c | 4.449 | 4.379 | 6.6
EXP_SEMFLUSH_DONE | 0x904, 0x980b | 4.472 | 4.405 | 7.0
EXP_HTG_START_CALLED | 0x904, 0x980d | 4.847 | 4.686 | 7.4
EXP_HTG_START_COMPLETED (4 frame delay if cmd received in unsafe zone) | 0x904, 0x980e | 555.56 | 27.291 | 1282 or 80.0 (2 orders)
IOB Event: Sci Data Generation Command Response (earlier response if exposure start delay required) | 0xC330, 0xAAF3 | 5.891 | 28.41 | 8.0 or 100.0 (2 orders)
SIDP_EXP_INIT_DONE | 0x904, 0x0001 | 23.417 | 45.772 | 25.0 or 60.0 (2 orders)
GEDP_EXP_IC_STARTED | 0x904, 0x0802 | 2378 | 1099 | 2855 or 2097 (2 orders)

Table 20: MIPS Raw C0F1N2 Exposure Start Vmetro Time Tags

C Code Marker / I/O Board Event | Bus Tag {Port, Data} | Case A (msecs) | Case B (msecs) | Deadline A or B (msecs)
SUR Exposure Command Received (collection trigger) | 0xC434, 0x5524 | 0.0 | 0.0 | N/A
EXP_PARSED_AND_INIT | 0x904, 0x9801 | 2.330 | 2.273 | 3.0
EXP_SI_SUBGRP_MQ_CHECKED | 0x904, 0x9802 | 2.437 | 2.381 | 3.4
EXP_GE_SUBGRP_MQ_CHECKED | 0x904, 0x9803 | 2.573 | 2.409 | 3.8
EXP_SI_FIFO_RESET_DONE | 0x904, 0x9804 | 3.684 | 3.584 | 4.2
EXP_INTS_CONNECTED | 0x904, 0x9805 | 3.735 | 3.637 | 4.6
EXP_SUR_DP_SETUP_DONE | 0x904, 0x9807 | 3.792 | 3.694 | 5.0
EXP_FIRST_SDM_ALLOC_DONE | 0x904, 0x9808 | 3.983 | 3.886 | 5.4
EXP_START_RECORDED | 0x904, 0x9809 | 4.146 | 4.043 | 5.8
EXP_CMD_RSP_SENT | 0x904, 0x980a | 4.459 | 4.357 | 6.2
EXP_FIRST_GRP_TIMEOUT_SET | 0x904, 0x980c | 4.490 | 4.389 | 6.6
EXP_SEMFLUSH_DONE | 0x904, 0x980b | 4.513 | 4.415 | 7.0
EXP_HTG_START_CALLED | 0x904, 0x980d | 4.896 | 4.825 | 7.4
EXP_HTG_START_COMPLETED (4 frame delay if cmd received in unsafe zone) | 0x904, 0x980e | 1217 | 27.184 | 1282 or 80.0 (2 orders)
IOB Event: Sci Data Generation Command Response (earlier response if exposure start delay required) | 0xC330, 0xAAF3 | 5.922 | 28.303 | 8.0 or 100.0 (2 orders)
SIDP_EXP_INIT_DONE | 0x904, 0x0001 | 27.872 | 45.747 | 25.0 or 60.0 (2 orders)
GEDP_EXP_IC_STARTED | 0x904, 0x0802 | 2494 | 1414 | 2855 or 2097 (2 orders)

8.3.3 SIRTF/MIPS Exposure Steady-State Reference Timing Model

The SIRTF/MIPS exposure steady-state compression mode processing, known as SUR, is based upon the instrument data collection and detector timing control hardware. Figure 25 shows how frames are collected from both the Ge and Si detectors and how this relates to software data production events for compressed Ge down-link data (Ge Only Groups) and compressed Si data (SUR Groups). The first two Si frames of an exposure are detector voltage boost frames (Bst) and do not produce data. Furthermore, the third frame of an exposure is a reset frame (Rst) and also produces no data. All subsequent frames produce data until the detector saturates, at which time a DCE (Data Collection Event) is complete, and the hardware and software continue in the steady state as shown in Figure 26. From this point on, the DCE shown in Figure 26 repeats until the commanded number of iterations is achieved; the exposure is then terminated and the instrument returns to its ready state. The most significant feature of these timing requirements is that during steady-state exposure processing, a brief period of no data collection (534 milliseconds) exists on a periodic basis.
As we will see in Section 8.3.4, this is key to the success of the ME decomposition used in the SIRTF/MIPS scheduling and load distribution.

Figure 25: SUR C0F2Nn First DCE Data Collection and Production Event Timing Model
[Figure: 3669 msec timeline of the first DCE showing Si boost, reset, and sample frames with the parallel Ge frame sequence, Ge Only subgroup production, and the resulting Ge Only Group, SUR Group, and Slope subgroup events.]

Figure 26: SUR C0F2Nn DCE 2 to n Data Collection and Production Event Timing Model
[Figure: 3669 msec timeline of DCEs 2 to n, beginning with an Si reset frame followed by sample frames and Ge Only subgroups, producing the SUR Combined Group, Ge Only Group, and Slope/Diff subgroup events.]

The SIRTF/MIPS Raw exposure steady-state processing is based upon the instrument data collection and detector timing control hardware, just as the compressed SUR processing mode is. Figure 27 shows the first DCE with boost and reset frames, and Figure 28 shows all subsequent DCEs with initial reset frames.

Figure 27: First DCE Raw C0F1Nn Data Collection and Production Event Timing Model
[Figure: 2621 msec timeline of the first Raw DCE showing Si boost and reset frames, sample frames, Ge Only subgroups, and Raw Group / Si Raw subgroup production events.]

Figure 28: DCE 2 to n Raw C0F1Nn Data Collection and Production Event Timing Model
[Figure: 2621 msec timeline of Raw DCEs 2 to n, beginning with an Si reset frame followed by sample frames, Ge Only subgroups, and Raw Group / Si Raw subgroup production events.]

8.3.4 SIRTF/MIPS SUR Mode Steady-State ME Results

Table 21 shows how the ME-decomposed SUR processing with epochs e1 and e2, as described in Section 4.3, meets all deadlines. The deadlines in Table 21 are based upon the data collection for an exposure, such that the processing will not fall behind collection.
If processing were allowed to fall behind the collection rate, buffer overflows in the pipeline would eventually result and corrupt the science data.

Table 21: SUR C0F2N2 Steady-state Exposure Time Tags

C Code Marker / I/O Board Event | Bus Tag {Port, Data} | Test (msecs) | Deadline (msecs)
SUR Exposure HTG Resynch Commanded | 0xC610,0xA580 | 0.0 | N/A
GEDP_EXP_IC_STARTED | 0x904, 0x0802 | 815.03 | 2097
GEDP_EXP_SUBGROUP_SENT (1/1) | 0x904, 0x0803 | 2251 | 4717
MEMDP_EXP_GEONLY_GRP_SENT (1/2) | 0x904,0x1004 | 2264 | 5766
GEDP_EXP_SUBGROUP_SENT (1/2) | 0x904, 0x0803 | 3293 | 5766
SIDP_EXP_SUR_FRM_PROCESSED (1/2) | 0x904, 0x0004 | 3730 | 5766
SIDP_EXP_SUR_FRM_PROCESSED (2/2) | 0x904, 0x0004 | 4241 | 6290
GEDP_EXP_SUBGROUP_SENT (2/2) | 0x904, 0x0803 | 4333 | 7338
GEDP_IC_STARTED | 0x904, 0x080F | 4461 | 6290
GEDP_EXP_IC_STARTED | 0x904, 0x0802 | 4461 | 6290
SIDP_EXP_SI_DIFF_COMPUTED | 0x904, 0x0007 | 4516 | 7338
SIDP_EXP_SI_SLOPE_COMPUTED | 0x904, 0x0006 | 4694 | 7338
SIDP_EXP_SUR_SUBGRP_SENT (1/1) | 0x904, 0x0008 | 4694 | 7338
MEMDP_EXP_COMBINED_GRP_SENT (1/2) | 0x904,0x1006 | 4740 | 7338
GEDP_EXP_AUTOREADY_SENT | 0x904, 0x0804 | 4990 | 9435
SIDP_EXP_SUR_FRM_PROCESSED (1/5) | 0x904, 0x0004 | 5810 | 9435
GEDP_EXP_SUBGROUP_SENT (1/1) | 0x904, 0x0803 | 5902 | 9435
MEMDP_EXP_GEONLY_GRP_SENT (2/2) | 0x904,0x1004 | 5915 | 9435
SIDP_EXP_SUR_FRM_PROCESSED (2/5) | 0x904, 0x0004 | 6339 | 9435
SIDP_EXP_SUR_FRM_PROCESSED (3/5) | 0x904, 0x0004 | 6856 | 9435
GEDP_EXP_SUBGROUP_SENT (2/3) | 0x904, 0x0803 | 6948 | 9435
SIDP_EXP_SUR_FRM_PROCESSED (4/5) | 0x904, 0x0004 | 7380 | 9435
SIDP_EXP_SUR_FRM_PROCESSED (5/5) | 0x904, 0x0004 | 7904 | 9959
GEDP_EXP_SUBGROUP_SENT (3/3) | 0x904, 0x0803 | 7997 | 11007
GEDP_EXP_END | 0x904, 0x0806 | 7997 | 9435
GEDP_EXP_FULLREADY_SENT | 0x904, 0x0805 | 8005 | 9435
GEDP_IC_STARTED | 0x904, 0x080F | 8124 | 9435
SIDP_EXP_SI_DIFF_COMPUTED | 0x904, 0x0007 | 8198 | 11007
SIDP_EXP_SI_SLOPE_COMPUTED | 0x904, 0x0006 | 8379 | 11007
MEMDP_EXP_COMBINED_GRP_SENT (2/2) | 0x904, 0x1006 | 8396 | 11007

Table 22: Raw C0F1N2 Steady-state Exposure Time Tags

C Code Marker / I/O Board Event | Bus Tag {Port, Data} | Test (msecs) | Deadline (msecs)
SUR Exposure HTG Resynch Commanded (collection trigger) | 0xC610,0xA580 | 0.0 | N/A
GEDP_EXP_IC_STARTED | 0x904, 0x0802 | 1550 | 2097
GEDP_EXP_SUBGROUP_SENT (1/2) | 0x904, 0x0803 | 2988 | 4718
SIDP_EXP_RAW_FRM_PROCESSED (1/2) | 0x904, 0x0002 | 3406 | 4718
SIDP_EXP_RAW_FRM_PROCESSED (2/2) | 0x904, 0x0002 | 3933 | 5242
SIDP_EXP_RAW_SUBGRP_SENT | 0x904, 0x0005 | 3933 | 6290
GEDP_EXP_SUBGROUP_SENT (2/2) | 0x904, 0x0803 | 4035 | 6290
MEMDP_EXP_RAW_GRP_SENT (1/2) | 0x904,0x1005 | 4048 | 6290
GEDP_IC_STARTED | 0x904, 0x080F | 4163 | 4718
GEDP_EXP_IC_STARTED | 0x904, 0x0802 | 4163 | 4718
GEDP_EXP_AUTOREADY_SENT | 0x904, 0x0804 | 4690 | 7339
SIDP_EXP_RAW_FRM_PROCESSED (1/4) | 0x904, 0x0002 | 4973 | 7339
SIDP_EXP_RAW_FRM_PROCESSED (2/4) | 0x904, 0x0002 | 5495 | 7339
GEDP_EXP_SUBGROUP_SENT (1/2) | 0x904, 0x0803 | 5601 | 7339
SIDP_EXP_RAW_FRM_PROCESSED (3/4) | 0x904, 0x0002 | 6018 | 7339
SIDP_EXP_RAW_SUBGRP_SENT | 0x904, 0x0005 | 6539 | 8911
SIDP_EXP_RAW_FRM_PROCESSED (4/4) | 0x904, 0x0002 | 6539 | 7863
GEDP_EXP_SUBGROUP_SENT (2/2) | 0x904, 0x0803 | 6644 | 8911
GEDP_EXP_END | 0x904, 0x0806 | 6645 | 7339
GEDP_EXP_FULLREADY_SENT | 0x904, 0x0805 | 6652 | 7339
MEMDP_EXP_RAW_GRP_SENT (2/2) | 0x904, 0x1005 | 6665 | 8911

8.3.5 SIRTF/MIPS Raw Mode Steady-State Results

Table 22 shows how the ME-decomposed Raw processing with epochs e1 and e2, as described in Section 4.3, meets all deadlines. The deadlines in Table 22 are based upon the data collection for an exposure, such that the processing will not fall behind collection. If processing were allowed to fall behind the collection rate, buffer overflows in the pipeline would eventually result and corrupt the science data.

8.3.6 SIRTF/MIPS Video Processing RT EPA Epoch Evaluation

The importance of scheduling epochs is clearly demonstrated by the RT EPA monitoring experiments with the SIRTF/MIPS instrument video processing software. Without the epoch analysis and redistribution of releases, this instrument would not have been able to meet its requirements for real-time processing at all.
Furthermore, without dynamic adjustment of priority to enable full synchronization of hardware and software during the exposure-start epoch of the MIPS software, the system never would have succeeded in synchronizing data processing with data production by the hardware. The MIPS software, as noted in Section 8.2, is an excellent example of a marginal task set that by RMA should not be schedulable safely; yet by reorganizing the software into multiple scheduling epochs and by exploiting the fact that it is highly improbable that worst-case executions and releases will cause the maximum interference case (i.e., a timeout is still possible, but highly unlikely), the system has been operating for many months without a single missed deadline.

8.4 Digital Video Pipeline Test-bed Results

The digital video pipeline experiment can be characterized as shown in Table 23. This experiment was a preliminary test of the video processing capabilities ultimately used in the RACE test-bed. It was done to provide a simple example in this thesis and to test basic capabilities of the RT EPA.

Table 23: Digital Video Pipeline Marginal Task Set Description

task     | Soft Conf | Hard Conf | T (µsec) | Dsoft (µsec) | Dterm (µsec) | Cexp (µsec) | Util  | WCET (µsec) | WC Util
tBtvid   | 1.0       | 1.0       | 33333    | 20000        | 33333        | 100         | 0.003 | 1200        | 0.036
tFrmDisp | 0.5       | 0.9       | 200000   | 100000       | 150000       | 58000       | 0.290 | 60000       | 0.300
tFrmLnk  | 0.5       | 0.8       | 333333   | 300000       | 333333       | 50000       | 0.150 | 56000       | 0.168
Total    |           |           |          |              |              |             | 0.443 |             | 0.504

The processor is under-loaded, so the results of this test simply show that the RT EPA can schedule a non-stressful thread set just as well as an RMA priority-preemptive policy. The pipeline includes successive releases: from the source interrupt to tBtvid, from tBtvid completion to tFrmDisp release, and finally from tFrmDisp completion to tFrmLnk. Given pipeline sequencing like this, the next stage is fully synchronized with the previous one, and therefore the data is fully consistent through the pipeline.
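The completion-driven release chain can be sketched with semaphores, the same pattern binary semaphores provide on VxWorks (an illustrative Python sketch; the task names are from Table 23, but the threading details are hypothetical, not the RT EPA implementation).

```python
import threading

# Each downstream stage blocks on a semaphore that its predecessor
# gives on completion: tBtvid -> tFrmDisp -> tFrmLnk.
stage_sems = {"tFrmDisp": threading.Semaphore(0),
              "tFrmLnk": threading.Semaphore(0)}
trace = []

def stage(name, successor=None):
    def run():
        if name in stage_sems:
            stage_sems[name].acquire()        # wait for predecessor completion
        trace.append(name)                    # process one frame (elided)
        if successor:
            stage_sems[successor].release()   # release the next stage
    return threading.Thread(target=run)

threads = [stage("tFrmLnk"),
           stage("tFrmDisp", successor="tFrmLnk"),
           stage("tBtvid", successor="tFrmDisp")]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Semaphore chaining forces pipeline order regardless of start order.
assert trace == ["tBtvid", "tFrmDisp", "tFrmLnk"]
```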
However, the response jitter from the previous stage directly drives the release jitter in the current stage. From the point of release, the only additional jitter is then due to response jitter in that stage, but the overall pipeline sink output jitter is the summation of latency and jitter through all stages.

Figure 29 A and B: RACE Frame Compression (A) and Frame Link (B) Response Jitter
[Figure: plots of Cactcomp (microsec) versus time showing frame compression response jitter of roughly 56500 to 60000 µsec (A) and frame link response jitter of up to 150000 µsec (B).]

After actually running, the RT EPA on-line monitoring results, summarized in Table 24, showed that the application easily meets all negotiated service levels, which given the under-loading is not surprising. The results do show, however, that the RT EPA correctly computes service level capability that exceeds what was requested.

Table 24: Actual Digital Video Task Set Performance

t                                      | 1            | 2                   | 3
Online Model Size                      | 1000         | 1000                | 1000
Tact (sec)                             | 0.033        | 0.331               | 0.331
Cact-low, Cact-high, Cact-exp (µsec)   | 30, 1073, 60 | 56966, 59620, 58400 | 40911, 55790, 48600
N preempt                              | 14           | 76                  | 2809
Dsoft Conf actual                      | 1.0          | 1.0                 | 1.0
∆                                      | 0            | +                   | +
Dterm Conf actual                      | 1.0          | 1.0                 | 1.0
∆                                      | +            | +                   | +
Dsoft Online Model                     | 0            | 60000               | 101000
Dterm Online Model                     | 260          | 60000               | 120000

More interesting than the trivial service levels presented here (much more interesting marginal results are presented with the RACE test-bed) is how the RT EPA isochronal output feature can be used to eliminate the tFrmLnk jitter in the pipeline (Figures 30 A and B show the response without control). Being able to remove jitter in the last stage of the pipeline, prior to output to a sink device, is a key feature of the RT EPA pipelining capabilities.
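The isochronal output idea reduces to holding a completed result until a fixed release point. The sketch below is a minimal illustration, not the RT EPA mechanism itself; the 120 msec soft deadline used as the release point and the sample completion times are hypothetical, and all times are in microseconds.

```python
DSOFT_USEC = 120000  # hypothetical soft deadline used as the release point

def isochronal_release(release_time, completion_time):
    """Buffer an early completion and pass it downstream at the deadline;
    a late completion is passed on as soon as it finishes."""
    return max(completion_time, release_time + DSOFT_USEC)

# Jittery completions (90-101 msec after release) all emerge exactly at
# release + Dsoft, so the output spacing equals the release period.
releases = [0, 500000, 1000000]
completions = [90000, 612000, 1101000]
outputs = [isochronal_release(r, c) for r, c in zip(releases, completions)]

assert outputs == [120000, 620000, 1120000]
assert {b - a for a, b in zip(outputs, outputs[1:])} == {500000}
```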
Figure 30 A and B: RACE Frame Link Execution (A) and Response Jitter (B)
[Figure: plots of Cactexec and Cactcomp (microsec) versus time showing video link execution jitter between about 50000 and 60000 µsec (A) and, with no isochrony control, response jitter between about 90000 and 130000 µsec (B).]

Figures 31 A and B show the response jitter filtering effect of the RT EPA isochronal output control feature. In Figure 31a it can be seen that there is up to 6 milliseconds of video processing execution jitter, which without control directly leads to similar response jitter in Figures 30 A and B. The RT EPA mechanism can produce isochronal output, as long as the given stage is meeting or exceeding specified deadlines, by holding (buffering) results before they are passed on to either the next stage or the output device. The mechanism is described in detail in Section 5.6.1.

Figure 31 A and B: RACE Frame Link Execution (A) and Response Jitter (B) With Isochronal Output Control
[Figure: plots of Cactexec and Cactcomp (microsec) versus time showing video link execution jitter of 50000 to 60000 µsec (A) and, with isochrony control, response jitter held nearly constant (B).]

8.5 RACE Results

The RACE results satisfy all experimentation goals for the RT EPA. The following is a summary of results exhibiting the five goals for the RT EPA.

8.5.1 RACE Marginal Task Set Experiment (Goal 1)

The first experimentation goal, to implement a marginal task set, was met by the RACE experiment. The RACE thread set was rejected, based upon the execution model, by both the RMA least upper bound test and the DM admission test, but admitted by the RT EPA CBDM test. The thread set and expected CPU loading are summarized in Table 25.
Note that the expected execution times in RACE lead to a loading of approximately 95%; the low confidence required on many of the threads is the reason the thread set can be admitted by CBDM despite the high average loading. No matter how one interprets execution time, in all cases the loading is above the RMA least upper bound of 72.05% for 9 threads. The results of the CBDM admission test may be found in Appendix D. Furthermore, the RT EPA overrun control also makes this otherwise marginal thread set feasible, since overruns will be terminated and interference therefore controlled. The CBDM admission test computes the utility each thread imposes over its termination deadline period and the interference expected over that period from other threads. The reason that CBDM works well is the relative independence of execution jitter: an overrun for thread A may have 0.1% probability, and likewise 0.1% for thread B, making the probability that A will interfere up to its worst-case time while B executes for its worst-case time less than one chance in a million. The S array taken from the RT EPA on-line admission test, which accounts for utility and interference over each thread deadline, is provided here. Intuitively, the threads with the largest S values have the highest probability of missing a deadline – a high-S-value thread with high required confidence is the most likely point of failure in maintaining negotiated service. The tasks tNet and tExc are VxWorks tasks: tNet actually handles TCP/IP packet transmission, and tExc handles VxWorks scheduling and operating system resource management. The tExc task is high frequency, but very low loading. So tExc cannot be demoted below highest priority, but, much like interrupt servicing, it provides a fairly constant level of background overhead of approximately 5% on the RACE RT EPA test-bed.
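The 72.05% figure quoted above follows from the RMA least upper bound, U(n) = n(2^(1/n) - 1), and can be checked against the three Table 25 loading totals (a quick numerical check in Python; the loading values are the Clow, Cexp, and Cwc utilization sums from Table 25).

```python
def rma_lub(n):
    """RMA least upper bound on utilization for n periodic tasks."""
    return n * (2 ** (1.0 / n) - 1)

# For the 9 RACE threads the bound is ~0.7205 (72.05%).
assert abs(rma_lub(9) - 0.7205) < 0.0005

# All three loading interpretations exceed the bound, so RMA rejects
# the thread set no matter which execution-time estimate is used.
for loading in (0.845, 0.953, 1.073):   # Clow, Cexp, Cwc utilizations
    assert loading > rma_lub(9)
```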
The RT EPA kernel monitoring takes place during tExc time, and the RT EPA release control takes place during the actual execution time of each release, so the RT EPA overhead is accounted for here as well.

Table 25: RACE Source/Sink Pipeline Task Set Description

id | Name     | Low Conf | High Conf | T (msec) | Cexp (µsec) | Exp Util | Clow (µsec) | Low Util | Cwc (µsec) | WC Util | S
0  | tBtvid   | 1.0      | 1.0       | 33.333   | 64          | 0.002    | 0           | 0        | 1200       | 0.036   | 0.02
1  | tFrmDisp | 0.5      | 0.9       | 100.00   | 38772       | 0.388    | 36126       | 0.361    | 40075      | 0.400   | 0.82
2  | tOpnav   | 0.9      | 0.99      | 66.67    | 20906       | 0.314    | 19545       | 0.293    | 23072      | 0.346   | 0.96
3  | tRACECtl | 1.0      | 1.0       | 66.67    | 190         | 0.003    | 0           | 0        | 1272       | 0.020   | 0.97
4  | tTlmLnk  | 0.2      | 0.5       | 200.00   | 384         | 0.002    | 0           | 0        | 1392       | 0.007   | 0.77
5  | tFrmLnk  | 0.5      | 0.8       | 500.00   | 55362       | 0.111    | 50083       | 0.100    | 58045      | 0.116   | 0.91
6  | tCamCtl  | 1.0      | 1.0       | 200.0    | 610         | 0.003    | 317         | 0.001    | 1530       | 0.008   | 0.89
7  | tNet     | 0.5      | 0.25      | 100.00   | 8000        | 0.080    | 4000        | 0.040    | 10000      | 0.100   | N/A
8  | tExc     | 1.0      | 1.0       | 1.0      | 50          | 0.050    | 25          | 0.050    | 50         | 0.050   | N/A
Total |       |          |           |          |             | 0.953    |             | 0.845    |            | 1.073   |

If we now look closely at plots of the RACE RT EPA kernel monitoring results, we see that tOpnav and tRACECtl, which have the highest S values, do in fact exhibit the highest jitter in response. This is because these two threads have not only high utility but also high interference from tBtvid and tFrmDisp. What is interesting to note at this point is that there is no assumption about phasing of the loads at all. So, even with a loading somewhere between 0.814 and 0.951, there is still pessimism in the critical instant assumption.

The RT EPA pipelining control for this thread set is summarized in Table 26. The pipeline control configuration is important because it can control jitter and can reduce interference effects through phasing. To see this, we look at plots of thread release period, execution, and response jitter.

Table 26: RACE Standard Pipeline Phasing and Release Frequencies

rtid | Task     | Released by | Frequency | Offset
0    | tBtvid   | interrupt   | 30 Hz     | 0
1    | tFrmDisp | tBtvid      | 10 Hz     | 0
2    | tOpnav   | tBtvid      | 15 Hz     | 1
3    | tRACECtl | tOpnav      | 15 Hz     | 0
4    | tTlmLnk  | tOpnav      | 5 Hz      | 0
5    | tFrmLnk  | tFrmDisp    | 2 Hz      | 0
6    | tCamCtl  | tOpnav      | 5 Hz      | 0

8.5.2 RACE Nominal Configuration Results

The following Sections 8.5.2.1 to 8.5.2.7 summarize the jitter in each RACE thread release. Each release is made either by completion of another RACE thread or by a source interrupt, as summarized in Table 26. Another important characteristic of CBDM, inherited from DM, is that the underlying priorities are assigned such that the highest priority is given to the shortest-deadline thread. This is summarized for RACE in Table 27.

Table 27: RACE Soft and Termination Deadline Assignment

rtid | Task     | Dsoft (µsec) | Dterm (µsec) | RT EPA priority
0    | tBtvid   | 33333        | 33333        | 0
1    | tFrmDisp | 42000        | 50000        | 1
2    | tOpnav   | 64000        | 66000        | 2
3    | tRACECtl | 64600        | 66600        | 3
4    | tTlmLnk  | 100000       | 200000       | 4
5    | tFrmLnk  | 300000       | 500000       | 5
6    | tCamCtl  | 360150       | 1000000      | 6

8.5.2.1 Bt878 Video Frame Buffer Service

The Bt878 Video task simply processes a DMA interrupt event, sets the frame buffer pointer to the current frame, and sequences tasks according to the RT EPA pipelining specification. Since the task is released by the Bt878 hardware interrupt, the period jitter is extremely low, except in rare cases where the processing is coincident with tExc kernel resource management (e.g., the 1 msec virtual clock). The effects of a collision with tExc are also evident as clock-read dropouts in the RT EPA kernel monitoring (an unfortunate side effect which, however, is only a problem for dispatch times less than 100 microseconds, and rare).
The very occasional tExc interference with tasks whose execution times are on the order of the context switch time is ignored in this thesis since, as is evident in the Bt878 Video results, it happens much less than 1% of the time. This means that 99% of the time the time-stamping accuracy is in fact good to a millisecond, but occasionally it is only good to +/- 100 microseconds, so releases like this one will experience on-line monitoring clock read jitter and system interference. This interference can be accounted for by admitting tExc as an RT EPA task, but never activating it since it is activated by the operating system. It was found to be insignificant here either way.

Figure 32: Bt878 Video Release Jitter (period T in microseconds versus time)

The execution and response jitter in the Bt878 Video thread is minimal, but is evident in Figures 33 A and B. The occasional tExc interferences are again evident as execution and response time dropouts.

Figure 33 A and B: Bt878 Video Execution (A) and Response (B) Jitter (execution and response times in microseconds versus time)

8.5.2.2 Frame Display Formatting and Compression Service

Since the Frame Display and Compression service for RACE is released by completion of the Bt878 Video task, which has response jitter of approximately +/- 200 microseconds and very little interference from tExc or tBtvid, the release period jitter is low – once again approximately +/- 200 microseconds.
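The period jitter plotted throughout these subsections is the deviation of each observed release interval from the nominal period, taken from the RT EPA time-stamped monitoring. A minimal sketch of that computation (the function name and data layout are illustrative, assuming microsecond timestamps):

```c
#include <assert.h>

/* Sketch: worst-case period jitter from a series of release time stamps,
 * the quantity plotted in the period-jitter figures. Jitter here is the
 * maximum deviation of an observed release interval from nominal T. */
long period_jitter_usec(const long *release_ts, int n, long T_usec)
{
    long worst = 0;
    for (int i = 1; i < n; i++) {
        long dev = (release_ts[i] - release_ts[i - 1]) - T_usec;
        if (dev < 0) dev = -dev;       /* absolute deviation */
        if (dev > worst) worst = dev;  /* track the worst case */
    }
    return worst;
}
```

For example, a 33.333 msec nominal period with one release arriving 100 microseconds late yields a worst-case period jitter of 100 microseconds.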
However, being a pipeline release, if one compares the release jitter in the Bt878 Video and Frame Display services, it is clear that the variance in release is higher for this completion-dependent release than for a pure hardware event release.

Figure 34: RACE Frame Display Service Release Period Jitter (period T in microseconds versus time)

Given the much more significant processing in the Frame Display service, the execution jitter is much more significant – approximately +/- 2 milliseconds. Since the Frame Display algorithms are determinate – totally driven by the number of pixels rather than data driven with variance in algorithm complexity – the execution jitter can only be explained by architectural variance. This is a logical deduction, especially when the nature of pixel-by-pixel processing is considered with respect to the L1/L2 cache and the probability of misses and therefore pipeline stalls. Large memory traverses increase the probability of such variance, so not only does this release impose much more loading, it also brings out architectural variance. It is apparent from Figures 35 A and B that the execution jitter directly results in equivalent response jitter and that in general the response has a small amount of additional latency with similar jitter.
Figure 35 A and B: Frame Display Service Execution (A) and Response (B) Latency and Jitter (execution and response times in microseconds versus time)

Again, in a few instances the recorded execution time exceeds the response time, which again is due to tExc clock interference, since of course response time must always be greater than execution time (for the majority of samples the data is as expected). Execution times are in general slightly less than 39 msecs and response times are in general at or slightly above 39 msecs.

8.5.2.3 Optical Navigation Ranging and Centroid Location Service

Since the RACE Optical Navigation service is in a separate, asynchronously executing pipeline from Frame Display and Compression, the RACE pipelining configuration was set up to create phasing and event release that minimize the jitter and interference to Optical Navigation from Frame Display. No data consistency is required between the frames that are displayed for the operator and the frames that are used for navigation, since the display frames really just give the operator a vague idea of the RACE positioning and are ultimately displayed at a much lower frequency than 10 Hz due to network bandwidth limitations. So, tOpnav is released directly by tBtvid at half the rate since it is in a different pipeline. The period jitter is minimal due to the low jitter release source. However, the response jitter is clearly bimodal due to interference from the Frame Display service (Figures 37 A and B).
Figure 36: Optical Navigation Event Release Period Jitter (period T in microseconds versus time)

Figures 37 A and B clearly show that while the Opnav execution jitter is low, the response jitter is high due to interference from the frame display processing, and is clearly bimodal since latency is added when interference exists and otherwise response is more immediate.

Figure 37 A and B: Optical Navigation Execution (A) and Response (B) Jitter (execution and response times in microseconds versus time)

8.5.2.4 RACE Vehicle Ramp Distance Control

The RACE Ramp Control service has significant interference from both the frame display and Opnav services. In addition, it is released without isochronal control by Opnav, and the overall effect of interference and the previous stage jitter leads to a tri-modal release jitter as seen in Figure 38. This release jitter could be significantly filtered by specifying Opnav to produce isochronal output, but typically this is not needed until a stage actually produces sink device output, since processing is not typically sensitive to jitter, but digital control devices typically are. Either way, the RT EPA can control jitter that is due to staging, but it cannot control jitter due to interference.

Figure 38: Ramp Control Release Period Jitter (period T in microseconds versus time)

Figures 39 A and B show that the Ramp Control service itself does not have significant or frequent execution jitter and likewise does not have much response jitter relative to release.
Figure 39 A and B: Ramp Control Execution (A) and Response (B) Jitter (execution and response times in microseconds versus time)

8.5.2.5 RACE Vehicle Telemetry Processing

The lowest priority services in the system suffer the most release jitter due to interference. In Figure 40 we see approximately 2 msecs of jitter around the 200 msec period worst-case. This is still fairly minimal jitter despite heavy interference. To understand release jitter, the pipeline configuration must be considered carefully. Looking at Figure 20, we see that the telemetry service is released by a global mechanism based upon frame events. So, the jitter we see here is completely the result of interference rather than of previous stage jitter.

Figure 40: RACE Telemetry Release Period Jitter (period T in microseconds versus time)

Figures 41 A and B show minimal execution jitter, but more significant response jitter. Since the telemetry service execution time is sub-millisecond and many of the RACE ISR times are in the hundreds of microseconds, the response jitter is most likely due to ISR interference rather than task interference. The RT EPA currently considers ISR time to be insignificant, but it does have an effect, and perhaps future work should address ISR time as well as task execution time.
Figure 41 A and B: RACE Telemetry Execution (A) and Response (B) Jitter (execution and response times in microseconds versus time)

8.5.2.6 RACE Video Frame Link Processing

The RACE video link service release jitter is again bimodal, although not more than a millisecond. This service is globally released by the base source interrupt event, so the jitter is purely a result of interference, most likely at interrupt level.

Figure 42: RACE Frame Link Release Period Jitter (period T in microseconds versus time)

Figures 43 A and B show that the execution and response jitter are minimal for this service.

Figure 43 A and B: RACE Frame Link Execution (A) and Response (B) Jitter (execution and response times in microseconds versus time)

8.5.2.7 RACE Camera Control

Figure 44 shows that the camera control service suffers from significant release jitter due to interference by other services despite being globally released.

Figure 44: RACE Camera Control Release Period Jitter (release period T in microseconds versus time)

Perhaps more interesting than the high release jitter for camera control is the very high response jitter (Figure 45 B) despite relatively low execution jitter (Figure 45 A).
This demonstrates that the camera control service is being interfered with, since execution times are 30 to 120 microseconds, yet response times vary between 200 and 10000 microseconds – an order of magnitude greater dispersion in response compared to execution.

Figure 45 A and B: RACE Camera Control Execution (A) and Response (B) Jitter (execution and response times in microseconds versus time)

8.5.3 RACE RT EPA Initial Service Negotiation and Re-negotiation (Goal 2)

Based upon the initial RACE RT EPA marginal task set configuration tested in Section 8.4.2 (Table 28), the service negotiation is now refined using the on-line models derived from initial execution time estimates based on worst-case observations. The tNet and tExc VxWorks system tasks are ignored here since during RACE testing it was found that tExc imposes insignificant loading on the CPU (less than 1% worst-case) and since tNet was treated as a best-effort task.

Table 28: Initial RACE Source/Sink Pipeline Task Service Description

rtid  Name      Soft   Hard   Dsoft    Dterm    T        Cwc     DM     RM
                Conf.  Conf.  (msecs)  (msecs)  (msecs)  (µsec)  Util   Util
0     tBtvid    1.0    1.0    40       50       33.333   1000    02.00  03.00
1     tFrmDisp  0.5    0.9    50       50       100.000  10000   20.00  10.00
2     tOpnav    0.9    0.99   66       66       66.67    28000   42.42  42.42
3     tRACECtl  1.0    1.0    66       66       66.67    1200    01.80  01.80
4     tTlmLnk   0.2    0.5    150      200      200.00   2000    01.00  01.00
5     tFrmLnk   0.5    0.8    400      500      500.00   60000   30.00  30.00
6     tCamCtl   1.0    1.0    1000     1000     1000.00  2500    00.75  00.75
                                                Total            87.97  88.55

Re-negotiation for tighter deadlines at the same confidence level is one possible service re-negotiation, but another is to keep the desired deadlines and accept higher than desired confidence.
That was the approach taken with the RACE experiment, and the results are summarized in Table 29. In contrast, in the pseudo loading experiment the deadlines were iteratively shortened until the desired confidence and the actual reliability converged (Section 8.2.1). The raw data on which these results are based was collected by the RT EPA monitor and is included in Appendix D. This example shows very well how observed worst-case execution times are pessimistic and how actual reliability can be observed over large sample sizes (2300 33.33 msec periods for this data).

Table 29: RACE Source/Sink Actual Performance

rtid  Name      Soft  Hard  Dsoft    Dterm    T        Cexp    DM       RM
                Rel.  Rel.  (msecs)  (msecs)  (msecs)  (µsec)  Utility  Utility
0     tBtvid    1.0   1.0   40       50       33.333   779     01.56    02.34
1     tFrmDisp  1.0   1.0   50       50       100.000  9840    19.68    09.84
2     tOpnav    1.0   1.0   66       66       66.67    21743   32.94    32.94
3     tRACECtl  1.0   1.0   66       66       66.67    1200    01.80    01.80
4     tTlmLnk   1.0   1.0   150      200      200.00   2000    01.00    01.00
5     tFrmLnk   1.0   1.0   400      500      500.00   55028   11.01    11.01
6     tCamCtl   1.0   1.0   1000     1000     1000.00  583     00.06    00.06
                                              Total            68.05    58.99

8.5.4 RACE Release Phasing Control Demonstration (Goals 3a and 3b)

Experimental goal 3a, demonstration of stage-to-stage release phasing control, is demonstrated by Figures 38 and 39 B from Section 8.5.2.4. In this experiment, stages were set up to release each other, and the effect of the previous stage's jitter causing release period jitter in the next is apparent in Figure 38. Figure 39 B shows that despite period jitter, a given stage may still have low response jitter, since response times are always taken relative to the release, and the only contribution to response jitter for a stage is therefore execution jitter. Experimental goal 3b, demonstration of stage-to-stage isochronal release phasing control, is demonstrated by Figures 46 A and B.
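The isochronal hold output control demonstrated by Figures 46 A and B amounts to holding a completed release's output until the next period boundary, so that the effective response time is constant regardless of execution jitter. A minimal sketch under that assumption (the function name is illustrative, not the RT EPA implementation):

```c
#include <assert.h>

/* Sketch of isochronal hold-to-boundary output control: a completed
 * release's output is held until the next period boundary after
 * completion, collapsing response jitter to the clock resolution.
 * Returns the time at which output is actually emitted. */
long isochronal_output_usec(long period_start, long completion, long T_usec)
{
    long elapsed = completion - period_start;
    long boundaries = elapsed / T_usec + 1;   /* next boundary after completion */
    return period_start + boundaries * T_usec;
}
```

For example, Frame Display completions anywhere in its 100 msec period (e.g. at 38.8 or 40.1 msec) would all be emitted at exactly the 100 msec boundary.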
In Figure 46 A, the video link response jitter is uncontrolled, such that the jitter from this stage will result in sink output or next-stage release period jitter. However, in Figure 46 B, the isochronal hold output control feature of the RT EPA was enabled, and the result is that response jitter is greatly minimized.

Figure 46 A and B: Video Link Response Jitter Before (No Isochrony Control) and After (With Isochrony Control) Phasing Control (response times in microseconds versus time)

8.5.5 RACE Protection of System from Unbounded Overruns (Goal 5)

The RT EPA bounds the interference due to an overrun to the specified termination deadline. Over that period it is possible to assume that the release will attempt to use either the full resources of the period (worst-case assumption) or its typical resource demands, but either way it has not completed by the termination deadline due to one of the following conditions:

1) lack of I/O resources
2) lack of CPU resources
3) lack of both I/O and CPU resources
4) atypical execution jitter due to algorithmic or architectural variance

8.5.5.1 Example of Unanticipated Contention for I/O and CPU Resources

In order to demonstrate the RT EPA's ability to protect the system from occasional or malfunctioning service termination deadline overruns, the RACE test-bed was run and an artificial interference was introduced by requesting a color frame dump without going through task admission. This unaccounted-for interference with the frame link and camera control tasks resulted in overruns for both of those services. The output log for that case is contained in Appendix C.
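The termination-deadline overrun bound described above can be sketched as a per-release check: a release that has not completed by Dterm is terminated and restarted at its next release, so its interference with other tasks is bounded. The struct and function names below are hypothetical, not the RT EPA API.

```c
#include <assert.h>

/* Sketch of termination-deadline overrun control: if a release has not
 * completed by Dterm after its release time, it is terminated, the miss
 * is recorded, and the task restarts at its next release, bounding the
 * interference it can impose on other admitted tasks. */
typedef struct {
    long dterm_usec;   /* termination deadline */
    int  misses;       /* deadline misses observed */
    int  releases;     /* total releases observed */
} epa_monitor_t;

/* Returns 1 if the release overran Dterm and must be terminated. */
int epa_check_release(epa_monitor_t *m, long release_t, long now)
{
    m->releases++;
    if (now - release_t > m->dterm_usec) {
        m->misses++;
        return 1;   /* terminate; restart on next release */
    }
    return 0;       /* completed within the termination deadline */
}
```

With the frame link Dterm of 500 msec, an artificially interfered release completing at 600 msec would be flagged and terminated while subsequent normal releases pass, matching the isolated-miss behavior in Figure 47.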
Figure 47 shows that two missed deadlines occurred due to the unaccounted-for interference, but that the system continued to function after those isolated misses. The interference assumed by the RT EPA over the miss was configured as the expected execution time for the thread. Looking carefully at the releases around the miss, we see that while the execution time of the miss was much higher than normal, due to the nature of TCP/IP packetization and interference by the dump to the same channel, the dropout due to restarting caused the overall interference to average out close to the expected execution time. This test was more complicated than simple CPU interference, since the frame link thread and the interfering frame dump request were competing not only for the CPU, but also for the ethernet interface. Despite this complication, the RT EPA was able to control the overrun.

Figure 47: Frame Link Termination Deadline Miss Control (completion times in microseconds versus time)

8.5.5.2 RT EPA Protection from Period/Execution Jitter Due to Misconfiguration (Goal 5)

Similarly, if a particular task were to malfunction or be misconfigured, the RT EPA protects other services from the misconfigured task, which rather than occasionally missing deadlines may continually miss deadlines while misconfigured. Such a case is contained in Appendix D. It should be noted that the deadline confidence for the termination deadline dropped below the requested 0.9 to 0.85 due to the period of misconfiguration. Furthermore, the actual reliability in the deadline was 0.71. If the misconfiguration had been allowed to continue, eventually the confidence would have dropped to zero if all actual execution times exceeded the desired deadline.
The reliability is based on the number of missed deadlines over all samples taken, and the confidence is based on the number of samples out of all on-line model samples that are within the deadline. Since the misconfiguration was allowed to persist for approximately 150 releases with an on-line model size of 100, the computation of the confidence is straightforward. Furthermore, since the number of samples was less than the on-line model size (523 samples) and the initial model was a normal model instead of distribution-free, this explains why the reliability was lower than the confidence (all initial values in the model are set to zero unless a distribution-free model is loaded). What is also very interesting is that after the misconfiguration, there is an execution and response time hysteresis. This is most likely due to a newly evolved L1/L2 cache reference stream and/or dispatch context after the period of higher loading, since the hysteresis exists in both the execution time and the response time. This particular task has almost no interference since it is one of the shortest deadline and therefore highest priority tasks in the system.

Figure 48: Misconfiguration Execution Variance Example (Frame Display completion and execution times in microseconds versus time, showing execution hysteresis)

In this case, a useful extension to the RT EPA would be to provide a restart policy for occasional misses with a secondary dismissal policy for miss trends. This is not currently a feature of the RT EPA, but would be a simple extension.
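The distinction drawn above between actual reliability (made deadlines over all releases) and confidence (made deadlines over only the most recent on-line model window, e.g. 100 samples) can be sketched with a single helper. This is an illustrative sketch of the definitions, not the RT EPA implementation; the function name and parameters are assumptions.

```c
#include <assert.h>

/* Sketch of the two measures defined in the text: with window == 0 the
 * fraction is computed over all samples (actual reliability); with a
 * positive window it is computed over only the most recent `window`
 * samples (on-line model confidence, e.g. model size 100). */
double deadline_fraction(const long *resp_usec, int n, long deadline_usec,
                         int window)
{
    int start = (window > 0 && window < n) ? n - window : 0;
    int made = 0;
    for (int i = start; i < n; i++)
        if (resp_usec[i] <= deadline_usec)
            made++;                      /* response met the deadline */
    return (double)made / (double)(n - start);
}
```

A burst of misses that has scrolled out of the window therefore no longer lowers the confidence, while it permanently lowers the actual reliability, which is why the two measures diverged (0.85 versus 0.71) in the misconfiguration case.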
9 Significance

The significance of confidence-based scheduling and the RT EPA is that this approach provides a reliable and quantifiable performance framework for mixed hard and soft real-time applications. The examples presented show the ability to specify desired reliability, the RT EPA's capability to monitor performance on-line and protect tasks from other poorly modeled tasks, and the ability to renegotiate reliability with the RT EPA through iterative refinement of requests based on actual execution performance. Furthermore, the thesis reports on future work planned to extend and broaden the examples to which the RT EPA can be applied. The set of applications requiring this type of performance negotiation support from an operating system is increasing with the emergence of virtual reality environments, continuous media, multimedia, digital control, and shared-control automation [Bru93][SiNu96]. The RT EPA real-time scheduling framework supports a broad spectrum of contemporary applications ranging from virtual environments to semi-autonomous digital control systems because it supports reliability and on-line monitoring and control of execution [Si96]. Furthermore, in addition to confidence-based scheduling, the RT EPA facility allows an application developer to construct a set of real-time kernel modules that manage an input (source) device; apply sequential processing to the input stream (pipeline stages); control individual processing stage behavior through parameters obtained from a user-space application; provide performance feedback to the controlling application; and manage the output (sink) device.
This type of pipeline construction, in combination with RT EPA on-line admission testing based on requested deadline reliability and on-line performance monitoring, makes implementation of typical continuous media, digital control, and event-driven real-time systems much simpler than with hard real-time, QoS, or best-effort systems. In general, the RT EPA provides a real-time scheduling framework which can handle multiservice applications including continuous media processing, digital control, and event-oriented processing. Without such an interface to a reliable scheduler for mixed services with quantifiable performance, such applications must be built either using hard real-time methods such as RMA, which waste resources to provide guaranteed service, or using soft real-time approaches which provide abstract levels of service but no quantifiable reliability assurances, no way of fire-walling services from execution/release variances, and no on-line monitoring methods that provide insight into actual performance. The RT EPA provides deadline reliabilities given execution models and on-line refinement, providing a real-time reliability framework for the first time.

10 Plans for Future Research

Future research for the RT EPA includes extension of the API and formulation to include resources in addition to the CPU (e.g. I/O), more direct support for resource usage epochs, and further demonstration of the RT EPA's capabilities with additional test-beds exhibiting latency and jitter characteristics not demonstrated here already (e.g. high algorithmic execution jitter). Specifically, goals for future RT EPA research include:

1. Admission test modification to reduce the pessimism of the critical instant assumption for pipelines which specify synchronized release of stages and phasing of those releases.
This can greatly reduce the interference in such pipelines and therefore lead to scheduling feasibility that is ultimately much higher than not only the RMA least upper bound, but also the current CBDM admission bound.

2. The admission test algorithm used is an extension of the DM sufficient test and is O(n²) for n services. Since the CBDM test is only sufficient, it is pessimistic in terms of accounting for partial interference. Several other admission tests which have greater complexity, but are in fact necessary and sufficient, could be considered for confidence-based extension, including the scheduling point test [Lehoc87] and completion tests [Jos86]. The CBDM test formulated here was selected for simplicity despite not being necessary and sufficient. More evaluation of the possibility of extending a necessary and sufficient test using expected and reliable execution estimates could lead to a better (less pessimistic) on-line admission test.

3. Extend the RT EPA to formally model the demands for I/O resources and the scheduling of these resources to meet data transport deadlines. In the current implementation, an I/O bound RT EPA service will have longer than anticipated response times based on interference and execution jitter alone, which can be accounted for in the deadline confidence negotiation, but this is not directly formalized by the RT EPA.

4. Admission of services to service epochs such that scheduling feasibility is checked in two or more minor periods over one system major period, with control by the RT EPA such that one epoch is protected from the other and independent on-line models can be derived for each epoch.

5. In this research the critical instant assumption and the WCET were both shown to be overly pessimistic, but the critical instant assumption appears more significant than WCET pessimism due to jitter.
A test-bed which has high algorithmic/architectural execution jitter and less significant phasing impact would demonstrate the jitter control more dramatically.

Beyond these specific goals to establish and validate the resource management and negotiation concepts introduced by the research presented here, porting the RT EPA to additional kernels would establish the viability of viewing the RT EPA as kernel-ware which can support mixed hard/soft real-time applications on a variety of platforms. This will require providing some system-specific API functions (e.g. for associating interrupts with event releases) and ensuring that basic capabilities are portable (e.g. event time-stamping to microsecond accuracy).

11 Conclusion

Experiments were implemented using both the RT EPA and user-level applications to compare performance. The RT EPA not only improved throughput compared to a hard real-time scheduling admission and prioritization policy, it also provided reliable configuration, monitoring, and control through the confidence-based scheduling policy. The fundamental aspect of RT EPA performance control is the CBDM approach for admitting threads for reliable execution. Thus, the RT EPA was evaluated in terms of how well the three example pipelines were able to meet expected and desired performance in terms of missed deadlines. These experiments were also evaluated in terms of real-time parameters such as exposure timeouts, video stream dropouts, latency, and control system overshoot in order to evaluate the reliability afforded by the RT EPA to applications. These experiments were run individually and simultaneously to evaluate use of the RT EPA mechanism for complex real-time applications involving multimedia and interaction, with multiple hard and soft real-time requirements.
The RT EPA provides a framework for on-line service admission, monitoring, and control as demonstrated here, and was used to establish the theory of CBDM, multiple scheduling epochs, and negotiation for reliable service in terms of the expected number of missed/made deadlines. Overall, the RT EPA theory, prototype framework, and validating experiments introduce an engineering-oriented process for implementing timing reliability requirements in real-time systems.

References

[Au93] Audsley, N., Burns, A., and Wellings, A., "Deadline Monotonic Scheduling Theory and Application", Control Engineering Practice, Vol. 1, pp. 71-78, 1993.

[Baruah97] Baruah, S., Gehrke, J., Plaxton, C., Stoica, I., Abdel-Wahab, H., Jeffay, K., "Fair on-line scheduling of a dynamic set of tasks on a single resource", Information Processing Letters, 64(1), pp. 43-51, October 1997.

[Be95] Bershad, B., Fiuczynski, M., Savage, S., Becker, D., et al., "Extensibility, Safety and Performance in the SPIN Operating System", Association for Computing Machinery, SIGOPS '95, Colorado, December 1995.

[Bra99] Brandt, Scott A., "Soft Real-Time Processing with Dynamic QoS Level Resource Management", PhD dissertation, Department of Computer Science, University of Colorado, 1999.

[BraNu98] Brandt, S., Nutt, G., Berk, T., and Mankovich, J., "A Dynamic Quality of Service Middleware Agent for Mediating Application Resource Usage", Proceedings of the 19th IEEE Real-Time Systems Symposium, pp. 307-317, December 1998.

[BriRoy99] Briand, Loïc and Roy, Daniel, Meeting Deadlines in Hard Real-Time Systems – The Rate Monotonic Approach, IEEE Computer Society Press, 1999.

[Bru93] Brunner, B., Hirzinger, G., Landzettel, K., and Heindl, J., "Multisensory shared autonomy and tele-sensor-programming - key issues in the space robot technology experiment ROTEX", IROS '93 International Conference on Intelligent Robots and Systems, Yokohama, Japan, July 1993.
[Bu91] Burns, A., "Scheduling Hard Real-Time Systems: A Review", Software Engineering Journal, May 1991.

[Carlow84] Carlow, Gene, "Architecture of the Space Shuttle Primary Avionics Software System", Communications of the Association for Computing Machinery, Vol. 27, No. 9, September 1984.

[Connex98] Conexant Corp., "Bt878/879 Single-Chip Video and Broadcast Audio Capture for the PCI Bus", manual printed originally by Rockwell Semiconductor Systems, March 1998 (available from www.connexant.com).

[Co94] Coulson, G., Blair, G., and Robin, P., "Micro-kernel Support for Continuous Media in Distributed Systems", Computer Networks and ISDN Systems, pp. 1323-1341, Number 26, 1994.

[Ether96] ISO/IEC Standard 8802/3, "Information Technology – Local and Metropolitan Area Networks – Carrier Sense Multiple Access with Collision Detection (CSMA/CD) Access Method and Physical Layer Specifications", 1996 (supersedes IEEE 802.3), IEEE, New York, NY, 1996.

[Fal94] Fall, K., and Pasquale, J., "Improving Continuous-Media Playback Performance With In-Kernel Data Paths", Proceedings of the IEEE International Conference on Multimedia Computing and Systems (ICMCS), pp. 100-109, Boston, MA, June 1994.

[Fan95] Fan, C., "Realizing a Soft Real-Time Framework for Supporting Distributed Multimedia Applications", Proceedings of the 5th IEEE Workshop on the Future Trends of Distributed Computing Systems, pp. 128-134, August 1995.

[Fle95] Fleischer, S., Rock, S., Lee, M., "Underwater Vehicle Control from a Virtual Environment Interface", Association for Computing Machinery, 1995 Symposium on Interactive 3D Graphics, Monterey, CA, 1995.

[Gov91] Govindan, R., and Anderson, D., "Scheduling and IPC Mechanisms for Continuous Media", 13th ACM Symposium on Operating Systems Principles, 1991.

[HePa90] Hennessy, J., Patterson, D., Computer Architecture – A Quantitative Approach, Morgan Kaufmann, 1990.
[JefGod99] Jeffay, K., and Goddard, S., "A Theory of Rate-Based Execution", Proceedings of the 20th IEEE Real-Time Systems Symposium, pp. 304-314, Phoenix, AZ, December 1999.

[JefSton91] Jeffay, K., Stone, D., Poirier, D., "YARTOS – Kernel Support for Efficient, Predictable, Real-Time Systems", Proceedings of the Joint IEEE Workshop on Real-Time Operating Systems and Software, Atlanta, Georgia, May 1991.

[JonRos97] Jones, M., Rosu, D., Rosu, M., "CPU Reservations and Time Constraints: Efficient Predictable Scheduling of Independent Activities", Proceedings of the 16th ACM Symposium on Operating Systems Principles, October 1997.

[Jos86] Joseph, M., Pandya, P., "Finding Response Times in a Real-Time System", The Computer Journal, British Computing Society, Vol. 29, No. 5, October 1986, pp. 390-395.

[Kl93] Klein, M., Ralya, T., Pollak, B., et al., A Practitioner's Handbook for Real-Time Analysis: Guide to Rate Monotonic Analysis for Real-Time Systems, Kluwer Academic Publishers, Boston, 1993.

[Kl94] Klein, M., Lehoczky, J., and Rajkumar, R., "Rate-Monotonic Analysis for Real-Time Industrial Computing", IEEE Computer, January 1994.

[Lehoc87] Lehoczky, J., Sha, L., Ding, Y., "The Rate Monotonic Scheduling Algorithm: Exact Characterization and Average Case Behavior", Tech. Report, Department of Statistics, Carnegie-Mellon University, Pittsburgh, PA, 1987.

[LiuLay73] Liu, C., and Layland, J., "Scheduling Algorithms for Multiprogramming in a Hard-Real-Time Environment", Journal of the Association for Computing Machinery, pp. 46-61, Vol. 20, No. 1, January 1973.

[Laplante93] Laplante, P.A., Real-Time Systems Design and Analysis – An Engineer's Handbook, IEEE Press, New York, 1993.

[McCart00] McCartney, C., "DirectX Display Drivers – DirectX 7.0 and Beyond", Windows Hardware Engineering Conference, New Orleans, April 25, 2000.
[MerSav94] Mercer, C., Savage, S., Tokuda, H., “Processor Capacity Reserves: Operating System Support for Multimedia Applications”, IEEE International Conference on Multimedia Computing Systems, Boston, MA, May 1994. [MosPet96] Mosberger, D. and Peterson, L., “Making Paths Explicit in the Scout Operating System”, Second Symposium on Operating Systems Design and Implementation, 1996. [NiLam96] Nieh, J., and Lam, M., “The design, implementation and evaluation of SMART: A Scheduler for Multimedia Applications”, Proceedings of the Sixteenth ACM Symposium on Operating Systems Principles, October, 1997. [Nu95] Nutt, G., Antell, J., Brandt, S., Gantz, C., Griff, A., Mankovich, J., “Software Support for a Virtual Planning Room”, Technical Report CU-CS-800-95, Dept. of Computer Science, University of Colorado, Boulder, December 1995. [NuBra99] Nutt, G., Brandt, S., Griff, A., Siewert, S., Berk, T., Humphrey, M., "Dynamically Negotiated Resource Management for Data Intensive Application Suites", IEEE Transactions on Knowledge and Data Engineering, Vol. 12, No. 1 (January/February 2000), pp. 78-95. [Pa96] Paulos, E., and Canny, J., “Delivering Real Reality to the World Wide Web via Telerobotics”, IEEE International Conference on Robotics and Automation. [POSIX93] ISO Standard 1003.1b, “Standard for Information Technology Portable Operating System Interface (POSIX), Part 1: System Application Program Interface (API), Realtime Extension [C Language], IEEE, New York, NY, 1993. Copyright 2000 Sam Siewert, All Rights Reserved 119 [Red98] Redell, Ola, “Global Scheduling in Distributed Real-Time Computer Systems – An Automatic Control Perspective”, Technical Report, Department of Machine Design, Royal Institute of Technology, Stockholm, Sweden, March 1998. [ShaRaj90] Sha, L., Rajkumar, R., and Lehoczky, J., “Priority Inheritence Protocols: An Approach to Real-Time Synchronization”, IEEE Transactions on Computers, 39(9), pp. 1175-1185, September 1990. 
[Si96] Siewert, S., “Operating System Support for Parametric Control of Isochronous and Sporadic Execution in Multiple Time Frames”, Ph.D. dissertation proposal, Univ. of Colorado Boulder, 1996. [SiNu96] Siewert, S., and Nutt, G., “A Space Systems Testbed for Situated Agent Observability and Interaction”, In the Second ASCE Specialty Conf. on Robotics for Challenging Environments, Albuquerque, New Mexico, June 1996. [Sprunt89] Sprunt, B., Lehoczky, J., Sha L., “Aperiodic Task Scheduling for Hard Real-Time Systems”, Journal of Real-Time Systems, Vol. 1, 1989, pp. 27-60. [Sprunt88] Sprunt, B., Sha L., Lehoczky, J., “Exploiting Unused Periodic Time for Aperiodic Service Using the Extended Priority Exchange Algorithm”, Proceedings of the 9th Real-Time Systems Symposium, IEEE, Huntsville, Alabama, December 1988, pp. 251-258. [Stank88] Stankovic, J., Ramamritham, K., “Hard Real-Time Systems, Tutorial”, IEE Computer Society Press, Washington D.C., 1988. [Ste95] Steinmetz, R., and Wolf, L., "Evaluation of a CPU Scheduling Mechanism for Synchronized Multimedia Streams", in Quantitative Evaluation of Computing and Communication Systems, Beilner, H. and Bause, F. eds., Lecture Notes in Computer Science, No. 977, Springer-Verlag, Berlin, 1995. [TatTez94] Tatsuo Nakajima and Hiroshi Tezuka, “A Continuous Media Application Supporting Dynamic QoS Control on Real-Time Mach”, Association for Computing Machinery, Multimedia 94, San Francisco, California, 1994. [Tin93] Tindell, K., Burns, A., Davis, R., “Scheduling Hard Real-Time Multi-Media Disk Traffic”,University of York, UK, Computer Science Report, YCS, 204, 1993. [Tin94] Tindell, K., Clark, J., “Holistic Schedulability Analysis for Distributed Hard RealTime Systems”, Microprocessors and Microprogramming, Vol. 40, No. 2-3, April 1994, pp. 117-134. [TinBur92] Tindell, K., Burns, A., and Wellings, A., “Allocating Hard Real-Time Tasks: An NPhard Problem Made Easy”, Journal of Real-Time Systems, Vol. 4, pp. 
145-165, Kluwer Academic Publishers, 1992. [To90] Tokuda, H., Nakajima, T., and Rao P., "Real-Time Mach: Towards a Predictable Real-Time System", Proceedings of USENIX Mach Workshop, October 1990. [Tom87] Tomayko, J., “Computers in Spaceflight: The NASA Experience”, Encyclopedia of Computer Science and Technology, Vol. 18, Suppl. 3, Marcel Dekker, New York, 1987, pp. 44-47. [Tör95] Törngren, M., “On the Modelling of Distributed Real-Time Control Systems”, Proceedings of 13th IFAC Workshop on Distributed Computer Control Systems, Toulouse-Blagnac, France, Sept. 1995. [Whi99] Whitaker, J., “Video and Television Engineering”, 3rd Ed., McGraw Hill Companies Inc., New York, NY, April 1999. [WRS97] “VxWorks Reference Manual 5.3.1”, Ed. 1, February 21, 1997, part # DOC-12068ZD-00. [WriSte94] Wright, G., Stevens, R., “TCP/IP Illustrated, Volume 2 – The Implementation”, Addison Wesley Publishing Company, October 1994. Copyright 2000 Sam Siewert, All Rights Reserved 120 Appendix A RT EPA Source Code API Specification rtepa.h #ifndef _d_rtepa_h_ #define _d_rtepa_h_ #include #include #include #include #include <semLib.h> <timers.h> <taskLib.h> <signal.h> <types/vxTypesBase.h> /* RTEPA internal time used for relative times including C,D, and T has a maximum value of 4 billion microsecs, 4000 seconds, or about 1 hour. This seems quite reasonable for release computation time C, deadline relative to release D, and release period T. 
*/ #define r_time unsigned long int #define MAXRTIME _ARCH_UINT_MAX #define uint unsigned long int #define MAXUINT _ARCH_UINT_MAX #define MAXINT _ARCH_INT_MAX #define MAX_DISPATCH_HISTORY 1000 #define RTEPA_HIGHEST_PRIO 1 #define #define #define #define TASK_ADMISSION_REQUEST 0 TASK_REMOVE_REQUEST 1 TASK_UPDATE_REQUEST 2 TASK_PERFORMANCE_REQUEST 3 #define #define #define #define MAX_MODEL 1000 MAX_TASKS 10 UNUSED_ID -1 MAX_NAME 128 #define QTABLE_SIZE 5000 #define MICROSECS_PER_SEC 1000000 #define NANOSECS_PER_MICROSEC 1000 #define MAX_STACK 32768 #define NO_STAGE_RELEASE -1 Copyright 2000 Sam Siewert, All Rights Reserved 121 /* RT State: 0=RT_STATE_NONE, 1=RT_STATE_ADMITTED, 2=RT_ACTIVATED, 4=RT_SUSPENDED */ #define RT_STATE_NONE 0 #define RT_STATE_ADMITTED 1 #define RT_ACTIVATED 2 #define RT_SUSPENDED 3 #define RT_TERMINATED 3 /* Execution State: 0=EXEC_STATE_NONE, 1=EXEC_STATE_PEND_RELEASE, 2=EXEC_STATE_DISPATCHED, 3=EXEC_STATE_PREEMPTED, 4=EXEC_STATE_COMPLETED */ #define EXEC_STATE_NONE 0 #define EXEC_STATE_PEND_RELEASE 1 #define EXEC_STATE_DISPATCHED 2 #define EXEC_STATE_PREEMPTED 3 #define EXEC_STATE_COMPLETED 4 /* Release State: 0=RELEASE_NONE, 1=PEND_RELEASE, 2=RELEASED, 3=RELEASE_COMPLETED */ #define RELEASE_NONE 0 #define PEND_RELEASE 1 #define RELEASED 2 #define RELEASE_COMPLETED 3 #define DEMOTE_OTHER_TASKS 1 enum exec_model {normal, distfree}; enum task_control {guaranteed, reliable, besteffort}; enum interference_assumption {worstcase, highconf, lowconf, expected}; enum release_type {external_event, single, internal_timer}; enum hard_miss_policy {restart, dismissal}; enum release_complete {isochronous, anytime}; union release_method { /* release specification parameters */ SEM_ID release_sem; /* external source */ /* OR */ timer_t release_itimer; /* internal source */ }; struct normal_model { /* for normal distribution supplied model */ r_time Cmu; r_time Csigma; double HighConf; /* determines Zphigh unit normal dist quantile */ double LowConf; /* 
determines Zplow unit normal dist quantile */ double Zphigh; /* computed */ Copyright 2000 Sam Siewert, All Rights Reserved 122 double Zplow; /* computed */ r_time Ntrials; }; struct distfree_model { /* for distribution free supplied model */ r_time Csample[MAX_MODEL]; double HighConf; double LowConf; r_time Ntrials; }; struct worst_case_model { /* for worst-case supplied model */ r_time Cwc; }; union model_type { struct normal_model normal_model; /* OR */ struct distfree_model distfree_model; /* OR */ struct worst_case_model worst_case_model; }; struct rtepa_interupt_event_release { FUNCPTR app_isr; SEM_ID event_semaphore; int rtid; }; /* All times are considered microseconds with limit of 4K seconds */ struct rtepa_control_block { /* The main entry point is wrapped by the RTEPA with either a signal handler or a semTake loop such that entry point is called on specified release event. VxWorks task body is kept resident until RT EPA task is removed. Copyright 2000 Sam Siewert, All Rights Reserved 123 RT EPA on-line stats are computed during kernel context switches. Performance parameters are computed on demand and kept in the CB. 
*/ /***************************** supplied ***************************/ /***************************** required for admission */ /**** Service type */ enum task_control tc_type; enum interference_assumption interference_type; enum exec_model exec_model; union model_type model; /**** Release and deadline specification */ enum release_type release_type; union release_method release_method; r_time Dsoft; /* release relative early soft deadline in microsecs */ r_time Dterm; /* release relative hard deadline where execution is terminated by rtepa in microsecs */ r_time Texp; /* period for release expected in microsecs */ enum hard_miss_policy HardMissActon; FUNCPTR serviceDsoftMissCallback; FUNCPTR serviceReleaseCompleteCallback; /***************************** required for activation */ FUNCPTR entryPt; char name[MAX_NAME]; int stackBytes; char Stack[MAX_STACK+1]; /************************** maintained by rtepa ********************/ int RTEPA_id; int sched_tid; WIND_TCB sched_tcb; WIND_TCB *sched_tcbptr; int assigned_prio; /***************************** optional pipeline I/O ctl source -- int --> stage_0 -- semGive(next) --> stage_1 ... 
--> sink */ /* Output control */ enum release_complete complete_type; r_time Tout; ULONG Tout_ticks; ULONG Tout_jiffies; Copyright 2000 Sam Siewert, All Rights Reserved 124 /* Number of stages being sequenced */ int NStages; /* Pipe stage sequencing of next thread */ int next_stage_rtid[MAX_TASKS]; int next_stage_event_releases[MAX_TASKS]; int next_stage_activation[MAX_TASKS]; /* Pipe stage sequencing next stage sub-frequency */ uint next_stage_cycle_freq[MAX_TASKS]; /* Pipe stage sequencing next stage phasing offset */ uint next_stage_cycle_offset[MAX_TASKS]; /***************************** optional pipeline I/O ctl */ /**** INTERNAL USE ONLY */ timer_t Dterm_itimer; /* watchdog timer for termination deadline */ struct itimerspec dterm_itime; struct itimerspec last_dterm_itime; struct sigevent TermEvent; struct sigaction TermAction; int flags; /************************** On-line Model */ int ExecState; int RTState; int ReleaseState; /* computed from supplied model */ r_time Cexp; r_time Clow; r_time Chigh; /* Based on the RT clock frequency and interrupt period the exact time can be derived to the accuracy of the oscillator using the number of interrupt ticks and portion of a tick (jiffies). 
*/ /* Event release record */ ULONG prev_release_ticks; UINT32 prev_release_jiffies; ULONG last_release_ticks[MAX_MODEL]; UINT32 last_release_jiffies[MAX_MODEL]; ULONG last_complete_ticks[MAX_MODEL]; UINT32 last_complete_jiffies[MAX_MODEL]; /* Dispatch and preempt time records for current release */ Copyright 2000 Sam Siewert, All Rights Reserved 125 ULONG last_dispatch_ticks; UINT32 last_dispatch_jiffies; ULONG last_preempt_ticks; UINT32 last_preempt_jiffies; /* App times */ ULONG app_release_ticks[MAX_MODEL]; UINT32 app_release_jiffies[MAX_MODEL]; ULONG app_complete_ticks[MAX_MODEL]; UINT32 app_complete_jiffies[MAX_MODEL]; uint uint uint uint Nstart; /* current on-line model starting index */ Nact; /* current on-line model complete index */ N; /* total completions sampled */ Nonline; /* desired on-line model size */ r_time Cactcomp[MAX_MODEL]; /* history of actual completion times */ r_time Cactexec[MAX_MODEL]; /* history of actual execution times */ r_time Tact[MAX_MODEL]; /* history of actual release periods */ /* Statistics */ uint Npreempts; uint Ninterferences; uint Ndispatches; uint SoftMissCnt; uint HardMissCnt; uint HardMissTerm; r_time HardMissCactcomp[MAX_MODEL]; /* history of actual completion times */ r_time SoftMissCactcomp[MAX_MODEL]; /* history of actual completion times */ uint ReleaseCnt; uint CompleteCnt; uint ReleaseError; uint CompleteError; uint ExecError; /* On demand performance model */ /* Model expectation */ r_time Cexpactcomp; r_time Clowactcomp; r_time Chighactcomp; r_time Cexpactexec; r_time Clowactexec; r_time Chighactexec; r_time Texpact; double HardReliability; double SoftReliability; r_time ActConfDsoft; r_time ActConfDhard; Copyright 2000 Sam Siewert, All Rights Reserved 126 }; int rtepaInitialize(FUNCPTR safing_callback, int init_mask, r_time monitor_period); int rtepaShutdown(int shutdown_mask); int rtepaTaskAdmit(int *rtid, enum task_control tc_type, enum interference_assumption interference, enum exec_model exec_model, union 
model_type *modelPtr, enum hard_miss_policy miss_control, r_time Dsoft, r_time Dterm, r_time Texp, double *SoftConf, double *TermConf, char *name); int rtepaTaskDismiss(int rtid); int rtepaTaskActivate(int rtid, FUNCPTR entryPt, FUNCPTR serviceDsoftMissCallback, FUNCPTR serviceReleaseCompleteCallback, enum release_complete complete_control, int stackBytes, enum release_type release_type, union release_method release_method, uint Nonline); int rtepaTaskSuspend(int rtid); int rtepaTaskResume(int rtid); int rtepaTaskDelete(int rtid); int rtepaTaskPrintPerformance(int rtid); int rtepaIDFromTaskID(WIND_TCB *tcbptr); int rtepaInTaskSet(int tid); int rtepaTaskPrintActuals(int rtid); int rtepaTaskPrintCompare(int rtid); int rtepaPCIx86IRQReleaseEventInitialize(int rtid, SEM_ID event_semaphore, unsigned char x86irq, FUNCPTR isr_entry_pt); void rtepaPipelineSeq(int src_rtid, int sink_rtid, int sink_release_freq, int sink_release_offset, SEM_ID sink_release_sem); Copyright 2000 Sam Siewert, All Rights Reserved 127 int rtepaRegisterPerfMon(int rtid, FUNCPTR renegotiation_callback, int monitor_mask); int rtepaPerfMonUpdateAll(void); int rtepaPerfMonUpdateService(int rtid); r_time r_time double double double double rtepaPerfMonDtermFromNegotiatedConf(int rtid); rtepaPerfMonDsoftFromNegotiatedConf(int rtid); rtepaPerfMonConfInDterm(int rtid); rtepaPerfMonConfInDsoft(int rtid); rtepaPerfMonDtermReliability(int rtid); rtepaPerfMonDsoftReliability(int rtid); r_time rtepaPerfMonCexp(int rtid); r_time rtepaPerfMonChigh(int rtid); r_time rtepaPerfMonClow(int rtid); r_time rtepaPerfMonRTexp(int rtid); r_time rtepaPerfMonRhigh(int rtid); r_time rtepaPerfMonRlow(int rtid); int rtepaLoadModelFromArray(r_time *sample_array, r_time *sample_src, int n); int rtepaTaskSaveCactexec(int rtid, char *name); int rtepaTaskLoadCactexec(r_time *model_array, char *name); void rtepaSetIsochronousOutput(int rtid, r_time Tout); #endif Copyright 2000 Sam Siewert, All Rights Reserved 128 Appendix B Loading 
Analysis for Image Centroid Calculation with Variance Due to Cache Misses

Centroid Calculation Performance Comparison (X2000 132 MHz PowerPC 750 and 33 MHz RAD6000)

11.1 Architecture Performance Assumptions

Pipeline (4 Cycles): INSTRUCTION FETCH, INSTRUCTION DECODE, EXECUTION, WRITE-BACK

1) Instructions are cached and fetched in ONE clock by each execution unit
2) Instructions are decoded in ONE clock by each execution unit
3) Execution in ONE clock (note that registers are loaded by instruction)
4) Write-back to cache in ONE clock

Any cache miss is assumed to stall the pipeline. For superscalar architectures, adjust for the number of execution units.

X2000 132 MHz PowerPC 750

Clock:          132 MHz, 7.58 nanoseconds
L1 Cache:       32 Kbyte 8-way set associative data and instruction
L2 Cache:       NONE
ALU:            pipelined, superscalar; CPI=0.33 best case mixed FP and integer,
                CPI=0.5 integer only, CPI=2 worst case if both integer pipelines are stalled
Data local-bus: 64 bits
I/O bus:        32 bits

11.2 33 MHz RAD6000 Analysis

Clock:          33 MHz, 30.3 nanoseconds
L1 Cache:       8 Kbyte 8-way set associative data and instruction
L2 Cache:       NONE
ALU:            pipelined; CPI=1.0 best case, CPI=4.0 if pipeline is stalled
Data local-bus: 32 bits
I/O bus:        32 bits

Note: Neither cache is large enough to hold a DMA-transferred frame, so references to memory containing the frame buffer will always cause a cache miss as the frame is traversed. That is, there is no good way to keep the frame values cached; only the temp variables used in the calculations can be cached, in both cases.

11.3 Centroid Computation Time Model

Supplying initial time models to the RT EPA can be done by analyzing code or by off-line experimental runs. If the code is not yet implemented, and several algorithms are under consideration, the method outlined here to approximate computation time may be useful.
The key is to determine the complexity of the algorithm, the native architecture instructions required for the algorithm, and the effect of the code on the native architecture pipeline. The best execution model will always be one based upon actual execution, but a method of approximation is useful during application design.

11.3.1 Algorithm Description

x-bar = sum(x * m) / M, where x-bar is the weighted-mean coordinate, m is the incremental mass (or brightness), and M is the total image mass.

To brute-force process each frame of image data takes at least:

1 million multiplies (for each of x and y)
1 million adds (for total brightness)
1 divide (for each of x and y)

11.3.2 Load-Store RISC Pseudo-code Instructions to Implement X-bar and Y-bar

TB_LOOP:
    load r1, M           -- cache hit on total M
    load r2, m[r0]       -- cache miss to load pixel brightness
    iadd r1,r2,r3
    store r3,M           -- cache hit
    incr r0
    jne r0, TB_LOOP

    zero r0
    zero r31
Y_LOOP:
X_LOOP:                  -- DX = 1 AND DY = 1
    imul r0,r31,r30      -- doubly dimensioned array index calculation
    load r1,m[r30]       -- cache miss to load pixel brightness
    imul r0,r1,r29       -- (x*m)
    load r28,Xsum        -- cache hit
    iadd r28,r29,r27     -- sum(x*m)
    store r27,Xsum       -- cache hit
    imul r31,r1,r29      -- (y*m)
    load r28,Ysum        -- cache hit
    iadd r28,r29,r27     -- sum(y*m)
    store r27,Ysum       -- cache hit
    incr r0
    jne r0, X_LOOP
    zero r0
    incr r31
    jne r31, Y_LOOP

    load r0,M
    load r1,Xsum
    load r2,Ysum
    idiv r1,r0,r3
    idiv r2,r0,r4
    store r3,Xbar
    store r4,Ybar

11.4 Overall Expected Cache Hit Rate

Variables assumed cached include: X-bar, Y-bar, x, y, and M. Given large array sizes that cannot be cached after DMA transfer, it is assumed that pixel sample m references cause a cache miss every time.
Therefore, analyzing the load-store reduced-instruction-set (RISC) pseudo-code above, one gets the following cache hit rates for each significant section of code:

TB_LOOP – 0.66 hit rate
Y_LOOP, X_LOOP – 0.8 hit rate

Both loops have n iterations, where n is the number of pixels, so the overall expected cache hit rate is:

(0.66 + 0.8)/2 = 0.73 centroid computation hit rate

11.5 Centroid CPI Estimations

CPIr6k = (0.73 * 1.0) + (0.27 * 4.0) = 0.73 + 1.08 = 1.81
CPIppc = (0.73 * 0.5) + (0.27 * 2.0) = 0.365 + 0.54 = 0.905

11.6 Algorithm Complexity

M – 6n
X,Y – 12n
Nf = 18n, where n is the number of pixels and Nf is the number of instructions per frame

The final calculation and outer-loop calculations are not significant.

11.7 Time to Compute Array Centroid

Tf = (Nf * CPI) * Tclk
Tf-r6k = (18n * 1.81) * 30.3 nanoseconds * (1 sec / 1e+9 nsecs)
Tf-ppc = (18n * 0.905) * 7.58 nanoseconds * (1 sec / 1e+9 nsecs)

11.8 Example for 1024x1024 Array

Tf-r6k = (18 * 1024 * 1024 * 1.81) * 30.3 nanoseconds * (1 sec / 1e+9 nsecs) = 1.035 seconds
Tf-ppc = (18 * 1024 * 1024 * 0.905) * 7.58 nanoseconds * (1 sec / 1e+9 nsecs) = 0.1295 seconds

11.9 General Result

Tf-r6k / Tf-ppc = (1.81 * 30.3) / (0.905 * 7.58)
Tf-r6k = 8 * Tf-ppc

The time to compute the centroid on the RAD6000 is 8 times as long as on the PowerPC 750.

Appendix C

Unmodeled Interference Causes Several Termination Deadline Misses

Script started on Tue Jun 27 09:01:15 2000

-> ld < rtepaLib.o
Undefined symbols:
_rtepaTaskDismiss
Warning: object module may not be usable because of undefined symbols.
value = 655440 = 0xa0050
-> setout
Original setup: sin=3, sout=3, serr=3
All being remapped to your virtual terminal...
You should see this message now!!!
value = 35 = 0x23 = '#' = precis + 0x3
-> start_race
microseconds_per_tick = 9.998491e+02, microseconds_per_jiffy = 4.190483e-01
Intel NB controller PCI concurrency enable = 0x8
Modified Intel NB controller PCI concurrency enable = 0x8
Intel NB controller PCI latency timer = 0x40
Modified Intel NB controller PCI latency timer = 0x40
Intel NB controller PCI Cmd Reg = 0x6
Modified Intel NB controller PCI Cmd Reg = 0x6
Intel NB controller PCI ARB CTL = 0x80
PCI 2.1 Compliant Intel NB controller PCI ARB CTL = 0x80
Intel SB controller latency control = 0x3
PCI 2.1 Compliant Intel SB controller latency control = 0x3
Intel SB controller IRQ Routing Reg = 0xb808080
Modified Intel SB controller IRQ Routing Reg = 0x6808080
Intel SB controller APIC Addr Reg = 0x0
BAR 0 testval=0xe2001008 before any write
BAR 0 MMIO testval=0xfffff008
BAR 1 testval=0x0 before any write
BAR 1 not implemented
BAR 2 testval=0x0 before any write
BAR 2 not implemented
BAR 3 testval=0x0 before any write
BAR 3 not implemented
BAR 4 testval=0x0 before any write
BAR 4 not implemented
BAR 5 testval=0x0 before any write
BAR 5 not implemented
Found Bt878 configured for IRQ 11
Bt878 Allowable PCI bus latency = 0x40
Bt878 PCI bus min grant = 0x10
Bt878 PCI bus max latency = 0x28
Modified Bt878 Allowable PCI bus latency = 0xff
mmio DSTATUS testval = 0xa6
**** VIDEO PRESENT
**** DECODING EVEN FIELD
**** PLL OUT OF LOCK
**** LUMA ADC OVERFLOW
mmio INTSTATUS testval = 0xe300022e
I2C RACK
DMA_MC_SKIP
DMA_MC_JUMP
DMA_MC_SYNC
DMA DISABLED
EVEN FIELD
VIDEO PRESENT CHANGE DETECTED
LUMA/CHROMA OVERFLOW DETECTED
mmio CAPTURE_CNT = 0x0
mmio DMA PC = 0x447fedc
mmio INTSTATUS testval = 0xe300022e
I2C RACK
DMA_MC_SKIP
DMA_MC_JUMP
DMA_MC_SYNC
DMA DISABLED
EVEN FIELD
VIDEO PRESENT CHANGE DETECTED
LUMA/CHROMA OVERFLOW DETECTED
mmio CAPTURE_CNT = 0x0
mmio DMA PC = 0x447fedc
Timing Gen Ctl Reg = 0x0
Configured NTSC
Setting INPUT_REG = 0x79
Set mux
Loaded MC
mmio INTSTATUS testval = 0xeb000204
I2C RACK
DMA_MC_SKIP
DMA_MC_JUMP
DMA_MC_SYNC
DMA ENABLED
EVEN FIELD
mmio CAPTURE_CNT = 0x0
mmio DMA PC = 0x37d610
Brightness was 128
Setting INPUT_REG = 0x19
Starting video
Video started
Scam Servo Driver Serial Interface Driver /tyCo/0 intialized and opened with status=0
OOPIC Servo Driver Serial Interface OOPIC driver /tyCo/1 intialized and opened with status=0
Entry pointer passed in = 0x383d3c, and assigned = 0x383d3c
RTEPA stack base = 0x1c0f14c
RTEPA_CB[1].dterm_itime.it_value.tv_sec = 0
RTEPA_CB[1].dterm_itime.it_value.tv_nsec = 50001000
Created RTEPA task 1 with tcbptr=0x1c172e8
Entry pointer passed in = 0x38444c, and assigned = 0x38444c
RTEPA stack base = 0x1c25028
RTEPA_CB[2].dterm_itime.it_value.tv_sec = 0
RTEPA_CB[2].dterm_itime.it_value.tv_nsec = 66000000
Created RTEPA task 2 with tcbptr=0x1c2d1c4
Entry pointer passed in = 0x383ae4, and assigned = 0x383ae4
RTEPA stack base = 0x1c3af04
Created RTEPA task 3 with tcbptr=0x1c430a0
RTEPA_CB[3].dterm_itime.it_value.tv_sec = 0
RTEPA_CB[3].dterm_itime.it_value.tv_nsec = 66600000
Entry pointer passed in = 0x38390c, and assigned = 0x38390c
RTEPA stack base = 0x1c50de0
Created RTEPA task 4 with tcbptr=0x1c58f7c
RTEPA_CB[4].dterm_itime.it_value.tv_sec = 0
RTEPA_CB[4].dterm_itime.it_value.tv_nsec = 200000000
Entry pointer passed in = 0x3839fc, and assigned = 0x3839fc
RTEPA stack base = 0x1c66cbc
Created RTEPA task 5 with tcbptr=0x1c6ee58
RTEPA_CB[5].dterm_itime.it_value.tv_sec = 0
RTEPA_CB[5].dterm_itime.it_value.tv_nsec = 500000000
Entry pointer passed in = 0x383b68, and assigned = 0x383b68
RTEPA stack base = 0x1c7cb98
Created RTEPA task 6 with tcbptr=0x1c84d34
******** RACE system fully activated ********
RTEPA_CB[6].dterm_itime.it_value.tv_sec = 1
RTEPA_CB[6].dterm_itime.it_value.tv_nsec = 0
value = 46 = 0x2e = '.' = s_B + 0x6
-> prio_dump_frame(5)
value = 32989940 = 0x1f762f4
-> ****************** MISSED HARD DEADLINE [rtid=5, release=47]
****************** RESTARTING
****************** MISSED HARD DEADLINE [rtid=6, release=148]
****************** RESTARTING
-> stop_race
Actual pipeline sequencing
rtid=0 completed 2294 times, activated next stage @ 192 => next_stage_rtid=1 released 701 times
  [specified freq = 3, offset = 0, expected releases = 700]
rtid=0 completed 2294 times, activated next stage @ 223 => next_stage_rtid=2 released 1036 times
  [specified freq = 2, offset = 5, expected releases = 1035]
rtid=0 completed 2294 times, activated next stage @ 252 => next_stage_rtid=3 released 1022 times
  [specified freq = 2, offset = 0, expected releases = 1021]
rtid=0 completed 2294 times, activated next stage @ 282 => next_stage_rtid=4 released 336 times
  [specified freq = 6, offset = 0, expected releases = 335]
rtid=0 completed 2294 times, activated next stage @ 330 => next_stage_rtid=5 released 66 times
  [specified freq = 30, offset = 0, expected releases = 65]
rtid=0 completed 2294 times, activated next stage @ 350 => next_stage_rtid=6 released 195 times
  [specified freq = 10, offset = 0, expected releases = 194]
******** Performance Summary for rtid=0, prio=1, tcbptr=0x1c0127c ********
Dispatch parameters Dsoft=40000, Dterm=50000, Texp=33333, Cexp=100
******** Initial model ********
High Conf = 1.000000
Low Conf = 1.000000
Cexp = 100
Chigh = 200
Clow = 200
******** On-line model ********
Dhard from actual dist free confidence interval =479
Dsoft from actual dist free confidence interval =479
Confidence in supplied Dhard based on exec time=0.999000
Confidence in supplied Dsoft based on exec time=0.999000
Confidence in supplied Dhard based on complete time=0.999000
Confidence in supplied Dsoft based on complete time=0.999000
N samples =2294
Start sample index =294
Last sample index =294
ReleaseCnt=2294
CompleteCnt=2294
Npreempts=2307
Ninterferences=13
Ndispatches=2307
Texpact=33334
Cexpactexec=135
Clowactexec=0
Chighactexec=1178
Cexpactcomp=135
Clowactcomp=0
Chighactcomp=1178
******** Deadline performance ********
SoftMissCnt=0
HardMissCnt=0
HardMissTerm=0
******** Execution performance ********
SoftReliability=1.000000
HardReliability=1.000000
******** Execution errors ********
ReleaseError=0
CompleteError=0
ExecError=0
********General info ********
gv_last_preempted_tid=0x1f7e7cc
gv_last_dispatched_tid=0x1f683d4
Total dispatches/preemptions=36576
gv_rtepa_dispatch_cnt=9823
gv_rtepa_preempt_cnt=9823
******** Performance Summary for rtid=1, prio=2, tcbptr=0x1c17158 ********
Dispatch parameters Dsoft=50000, Dterm=50001, Texp=100000, Cexp=10000
******** Initial model ********
High Conf = 0.900000
Low Conf = 0.500000
Cexp = 10000
Chigh = 10164
Clow = 10067
******** On-line model ********
Dhard from actual dist free confidence interval =39471
Dsoft from actual dist free confidence interval =38865
Confidence in supplied Dhard based on exec time=0.999000
Confidence in supplied Dsoft based on exec time=0.999000
Confidence in supplied Dhard based on complete time=0.999000
Confidence in supplied Dsoft based on complete time=0.999000
N samples =701
Start sample index =0
Last sample index =701
ReleaseCnt=701
CompleteCnt=701
Npreempts=1403
Ninterferences=702
Ndispatches=1403
Texpact=100104
Cexpactexec=299846
Clowactexec=37558
Chighactexec=182941525
Cexpactcomp=299846
Clowactcomp=37558
Chighactcomp=182941525
******** Deadline performance ********
SoftMissCnt=0
HardMissCnt=0
HardMissTerm=0
******** Execution performance ********
SoftReliability=1.000000
HardReliability=1.000000
******** Execution errors ********
ReleaseError=0
CompleteError=0
ExecError=0
********General info ********
gv_last_preempted_tid=0x1f7e7cc
gv_last_dispatched_tid=0x1f683d4
Total dispatches/preemptions=36756
gv_rtepa_dispatch_cnt=9823
gv_rtepa_preempt_cnt=9823
******** Performance Summary for rtid=2, prio=3, tcbptr=0x1c2d034 ********
Dispatch parameters Dsoft=66000, Dterm=66000, Texp=66666, Cexp=10000
******** Initial model ********
High Conf = 0.990000
Low Conf = 0.900000
Cexp = 10000
Chigh = 10257
Clow = 10164
******** On-line model ********
Dhard from actual dist free confidence interval =61693
Dsoft from actual dist free confidence interval =60990
Confidence in supplied Dhard based on exec time=0.999000
Confidence in supplied Dsoft based on exec time=0.999000
Confidence in supplied Dhard based on complete time=0.998000
Confidence in supplied Dsoft based on complete time=0.998000
N samples =1036
Start sample index =36
Last sample index =36
ReleaseCnt=1036
CompleteCnt=1036
Npreempts=1038
Ninterferences=2
Ndispatches=1038
Texpact=66669
Cexpactexec=21407
Clowactexec=0
Chighactexec=26974
Cexpactcomp=21407
Clowactcomp=0
Chighactcomp=26974
******** Deadline performance ********
SoftMissCnt=1
SoftMiss C[0] = 0
HardMissCnt=1
HardMissTerm=0
HardMiss C[0] = 0
******** Execution performance ********
SoftReliability=0.999035
HardReliability=0.999035
******** Execution errors ********
ReleaseError=0
CompleteError=0
ExecError=0
********General info ********
gv_last_preempted_tid=0x1f7e7cc
gv_last_dispatched_tid=0x1f683d4
Total dispatches/preemptions=36920
gv_rtepa_dispatch_cnt=9823
gv_rtepa_preempt_cnt=9823
******** Performance Summary for rtid=3, prio=4, tcbptr=0x1c42f10 ********
Dispatch parameters Dsoft=66600, Dterm=66600, Texp=66667, Cexp=500
******** Initial model ********
High Conf = 1.000000
Low Conf = 1.000000
Cexp = 500
Chigh = 1500
Clow = 1500
******** On-line model ********
Dhard from actual dist free confidence interval =64748
Dsoft from actual dist free confidence interval =64748
Confidence in supplied Dhard based on exec time=0.999000
Confidence in supplied Dsoft based on exec time=0.999000
Confidence in supplied Dhard based on complete time=0.999000
Confidence in supplied Dsoft based on complete time=0.999000
N samples =1022
Start sample index =22
Last sample index =22
ReleaseCnt=1022
CompleteCnt=1022
Npreempts=1025
Ninterferences=3
Ndispatches=1025
Texpact=66669
Cexpactexec=184
Clowactexec=0
Chighactexec=1214
Cexpactcomp=184
Clowactcomp=0
Chighactcomp=1214
******** Deadline performance ********
SoftMissCnt=0
HardMissCnt=0
HardMissTerm=0
******** Execution performance ********
SoftReliability=1.000000
HardReliability=1.000000
******** Execution errors ********
ReleaseError=0
CompleteError=0
ExecError=0
********General info ********
gv_last_preempted_tid=0x1f7e7cc
gv_last_dispatched_tid=0x1f683d4
Total dispatches/preemptions=37096
gv_rtepa_dispatch_cnt=9823
gv_rtepa_preempt_cnt=9823
******** Performance Summary for rtid=4, prio=5, tcbptr=0x1c58dec ********
Dispatch parameters Dsoft=150000, Dterm=200000, Texp=200000, Cexp=500
******** Initial model ********
High Conf = 0.500000
Low Conf = 0.200000
Cexp = 500
Chigh = 567
Clow = 525
******** On-line model ********
Dhard from actual dist free confidence interval =0
Dsoft from actual dist free confidence interval =0
Confidence in supplied Dhard based on exec time=0.999000
Confidence in supplied Dsoft based on exec time=0.999000
Confidence in supplied Dhard based on complete time=0.999000
Confidence in supplied Dsoft based on complete time=0.999000
N samples =322
Start sample index =0
Last sample index =322
ReleaseCnt=336
CompleteCnt=322
Npreempts=376
Ninterferences=54
Ndispatches=376
Texpact=200212
Cexpactexec=577772
Clowactexec=0
Chighactexec=185913477
Cexpactcomp=577772
Clowactcomp=0
Chighactcomp=185913477
******** Deadline performance ********
SoftMissCnt=0
HardMissCnt=0
HardMissTerm=0
******** Execution performance ********
SoftReliability=1.000000
HardReliability=1.000000
******** Execution errors ********
ReleaseError=0
CompleteError=80
ExecError=0
********General info ********
gv_last_preempted_tid=0x1f7e7cc  gv_last_dispatched_tid=0x1f683d4
Total dispatches/preemptions=37256
gv_rtepa_dispatch_cnt=9823  gv_rtepa_preempt_cnt=9823
******** Performance Summary for rtid=5, prio=6, tcbptr=0x1c6ecc8 ********
Dispatch parameters
Dsoft=400000, Dterm=500000, Texp=500000, Cexp=60000
******** Initial model ********
High Conf = 0.800000  Low Conf = 0.500000
Cexp = 60000  Chigh = 60128  Clow = 60067
******** On-line model
Dhard from actual dist
Dsoft from actual dist
Confidence in supplied
Confidence in supplied
Confidence in supplied
Confidence in supplied
N samples =60  Start sample index =0  Last sample index =60
ReleaseCnt=66  CompleteCnt=60  Npreempts=2908  Ninterferences=2848  Ndispatches=2908
Texpact=992740
Cexpactexec=3171435  Clowactexec=36983  Chighactexec=186960233
Cexpactcomp=3171435  Clowactcomp=36983  Chighactcomp=186960233
********
free confidence interval =0
free confidence interval =0
Dhard based on exec time=0.999000  Dsoft based on exec time=0.999000
Dhard based on complete time=0.998000  Dsoft based on complete time=0.994000
******** Deadline performance ********
SoftMissCnt=6
SoftMiss C[0] = 0
SoftMiss C[1] = 473495
SoftMiss C[2] = 498358
SoftMiss C[3] = 991332
SoftMiss C[4] = 490276
SoftMiss C[5] = 566849
HardMissCnt=2  HardMissTerm=1
HardMiss C[0] = 0
HardMiss C[1] = 991332
******** Execution performance ********
SoftReliability=0.900000  HardReliability=0.966667
******** Execution errors ********
ReleaseError=0  CompleteError=31  ExecError=0
******** General info ********
gv_last_preempted_tid=0x1f7e7cc  gv_last_dispatched_tid=0x1f683d4
Total dispatches/preemptions=37450
gv_rtepa_dispatch_cnt=9823  gv_rtepa_preempt_cnt=9823
******** Performance Summary for rtid=6, prio=7, tcbptr=0x1c84ba4 ********
Dispatch parameters
Dsoft=1000000, Dterm=1000000, Texp=1000000, Cexp=500
******** Initial model ********
High Conf = 1.000000  Low Conf = 1.000000
Cexp = 500  Chigh = 1500  Clow = 1500
******** On-line model
Dhard from actual dist
Dsoft from actual dist
Confidence in supplied
Confidence in supplied
Confidence in supplied
Confidence in supplied
N samples =126  Start sample index =0  Last sample index =126
ReleaseCnt=195  CompleteCnt=126  Npreempts=766  Ninterferences=640  Ndispatches=766
Texpact=338948
Cexpactexec=1491956  Clowactexec=352  Chighactexec=187915109
Cexpactcomp=1491956  Clowactcomp=352  Chighactcomp=187915109
********
free confidence interval =299940
free confidence interval =299940
Dhard based on exec time=0.999000  Dsoft based on exec time=0.999000
Dhard based on complete time=0.999000  Dsoft based on complete time=0.999000
******** Deadline performance ********
SoftMissCnt=0  HardMissCnt=0  HardMissTerm=1
******** Execution performance ********
SoftReliability=1.000000  HardReliability=1.000000
******** Execution errors ********
ReleaseError=0  CompleteError=44  ExecError=0
******** General info ********
gv_last_preempted_tid=0x1f7e7cc  gv_last_dispatched_tid=0x1f683d4
Total dispatches/preemptions=37628
gv_rtepa_dispatch_cnt=9823  gv_rtepa_preempt_cnt=9823
Canceling timer for task 0
Suspended task 0
Canceling timer for task 1
Suspended task 1
Canceling timer for task 2
Suspended task 2
Canceling timer for task 3
Suspended task 3
Canceling timer for task 4
Suspended task 4
Canceling timer for task 5
Suspended task 5
Canceling timer for task 6
Suspended task 6
Deleted task 0
Deleted task 1
Deleted task 2
Deleted task 3
Deleted task 4
Deleted task 5
Deleted task 6
value = 0 = 0x0
-> exit
thinker <22> exit
thinker <23> exit
script done on Tue Jun 27 09:02:56 2000

Appendix D

RACE Initial Scheduling and Configuration Admission Results

******************** Admit test [Ntasks = 7]
**** Thread 0 => D[0]=50000
Util=0.021160, Intf=0.000000 for thread 0
S[0]=0.021160
**** Thread 0 can be scheduled safely
**** Thread 1 => D[1]=50001
i=1, j=0, D[1]=50001, D[0]=50000, T[0]=33333, C[0]=1058
Nfull=1 for 1 from 0  Cfull=1058 for 1 from 0  Ifull=1058 for 1 from 0
Npart=1 for 1 from 0  Cpart=1058 for 1 from 0  Ipart=1058 for 1 from 0
Int=2116 for 1 from 0
Util=0.783504, Intf=0.042319 for thread 1
S[1]=0.825823
**** Thread 1 can be scheduled safely
**** Thread 2 => D[2]=66000
i=2, j=0, D[2]=66000, D[0]=50000, T[0]=33333, C[0]=1058
Nfull=1 for 2 from 0  Cfull=1058 for 2 from 0  Ifull=1058 for 2 from 0
Npart=1 for 2 from 0  Cpart=1058 for 2 from 0  Ipart=1058 for 2 from 0
Int=2116 for 2 from 0
i=2, j=1, D[2]=66000, D[1]=50001, T[1]=100000, C[1]=39176
Nfull=1 for 2 from 1  Cfull=39176 for 2 from 1  Ifull=39176 for 2 from 1
Npart=0 for 2 from 1  Cpart=39176 for 2 from 1  Ipart=0 for 2 from 1
Int=41292 for 2 from 1
Util=0.334348, Intf=0.625636 for thread 2
S[2]=0.959985
**** Thread 2 can be scheduled safely
**** Thread 3 => D[3]=66600
i=3, j=0, D[3]=66600, D[0]=50000, T[0]=33333, C[0]=1058
Nfull=1 for 3 from 0  Cfull=1058 for 3 from 0  Ifull=1058 for 3 from 0
Npart=1 for 3 from 0  Cpart=1058 for 3 from 0  Ipart=1058 for 3 from 0
Int=2116 for 3 from 0
i=3, j=1, D[3]=66600, D[1]=50001, T[1]=100000, C[1]=39176
Nfull=1 for 3 from 1  Cfull=39176 for 3 from 1  Ifull=39176 for 3 from 1
Npart=0 for 3 from 1  Cpart=39176 for 3 from 1  Ipart=0 for 3 from 1
Int=41292 for 3 from 1
i=3, j=2, D[3]=66600, D[2]=66000, T[2]=66666, C[2]=22067
Nfull=1 for 3 from 2  Cfull=22067 for 3 from 2  Ifull=22067 for 3 from 2
Npart=0 for 3 from 2  Cpart=22067 for 3 from 2  Ipart=0 for 3 from 2
Int=63359 for 3 from 2
Util=0.017973, Intf=0.951336 for thread 3
S[3]=0.969309
**** Thread 3 can be scheduled safely
**** Thread 4 => D[4]=200000
i=4, j=0, D[4]=200000, D[0]=50000, T[0]=33333, C[0]=1058
Nfull=5 for 4 from 0  Cfull=1058 for 4 from 0  Ifull=5290 for 4 from 0
Npart=2 for 4 from 0  Cpart=2 for 4 from 0  Ipart=4 for 4 from 0
Int=5294 for 4 from 0
i=4, j=1, D[4]=200000, D[1]=50001, T[1]=100000, C[1]=39176
Nfull=2 for 4 from 1  Cfull=39176 for 4 from 1  Ifull=78352 for 4 from 1
Npart=0 for 4 from 1  Cpart=0 for 4 from 1  Ipart=0 for 4 from 1
Int=83646 for 4 from 1
i=4, j=2, D[4]=200000, D[2]=66000, T[2]=66666, C[2]=22067
Nfull=3 for 4 from 2  Cfull=22067 for 4 from 2  Ifull=66201 for 4 from 2
Npart=1 for 4 from 2  Cpart=2 for 4 from 2  Ipart=2 for 4 from 2
Int=149849 for 4 from 2
i=4, j=3, D[4]=200000, D[3]=66600, T[3]=66667, C[3]=1197
Nfull=3 for 4 from 3  Cfull=1197 for 4 from 3  Ifull=3591 for 4 from 3
Npart=0 for 4 from 3  Cpart=1197 for 4 from 3  Ipart=0 for 4 from 3
Int=153440 for 4 from 3
Util=0.001865, Intf=0.767200 for thread 4
S[4]=0.769065
**** Thread 4 can be scheduled safely
**** Thread 5 => D[5]=500000
i=5, j=0, D[5]=500000, D[0]=50000, T[0]=33333, C[0]=1058
Nfull=14 for 5 from 0  Cfull=1058 for 5 from 0  Ifull=14812 for 5 from 0
Npart=2 for 5 from 0  Cpart=5 for 5 from 0  Ipart=10 for 5 from 0
Int=14822 for 5 from 0
i=5, j=1, D[5]=500000, D[1]=50001, T[1]=100000, C[1]=39176
Nfull=5 for 5 from 1  Cfull=39176 for 5 from 1  Ifull=195880 for 5 from 1
Npart=0 for 5 from 1  Cpart=0 for 5 from 1  Ipart=0 for 5 from 1
Int=210702 for 5 from 1
i=5, j=2, D[5]=500000, D[2]=66000, T[2]=66666, C[2]=22067
Nfull=7 for 5 from 2  Cfull=22067 for 5 from 2  Ifull=154469 for 5 from 2
Npart=1 for 5 from 2  Cpart=22067 for 5 from 2  Ipart=22067 for 5 from 2
Int=387238 for 5 from 2
i=5, j=3, D[5]=500000, D[3]=66600, T[3]=66667, C[3]=1197
Nfull=7 for 5 from 3  Cfull=1197 for 5 from 3  Ifull=8379 for 5 from 3
Npart=1 for 5 from 3  Cpart=1197 for 5 from 3  Ipart=1197 for 5 from 3
Int=396814 for 5 from 3
i=5, j=4, D[5]=500000, D[4]=200000, T[4]=200000, C[4]=373
Nfull=2 for 5 from 4  Cfull=373 for 5 from 4  Ifull=746 for 5 from 4
Npart=1 for 5 from 4  Cpart=373 for 5 from 4  Ipart=373 for 5 from 4
Int=397933 for 5 from 4
Util=0.114170, Intf=0.795866 for thread 5
S[5]=0.910036
**** Thread 5 can be scheduled safely
**** Thread 6 => D[6]=1000000
i=6, j=0, D[6]=1000000, D[0]=50000, T[0]=33333, C[0]=1058
Nfull=29 for 6 from 0  Cfull=1058 for 6 from 0  Ifull=30682 for 6 from 0
Npart=2 for 6 from 0  Cpart=10 for 6 from 0  Ipart=20 for 6 from 0
Int=30702 for 6 from 0
i=6, j=1, D[6]=1000000, D[1]=50001, T[1]=100000, C[1]=39176
Nfull=10 for 6 from 1  Cfull=39176 for 6 from 1  Ifull=391760 for 6 from 1
Npart=0 for 6 from 1  Cpart=0 for 6 from 1  Ipart=0 for 6 from 1
Int=422462 for 6 from 1
i=6, j=2, D[6]=1000000, D[2]=66000, T[2]=66666, C[2]=22067
Nfull=15 for 6 from 2  Cfull=22067 for 6 from 2  Ifull=331005 for 6 from 2
Npart=1 for 6 from 2  Cpart=10 for 6 from 2  Ipart=10 for 6 from 2
Int=753477 for 6 from 2
i=6, j=3, D[6]=1000000, D[3]=66600, T[3]=66667, C[3]=1197
Nfull=15 for 6 from 3  Cfull=1197 for 6 from 3  Ifull=17955 for 6 from 3
Npart=0 for 6 from 3  Cpart=1197 for 6 from 3  Ipart=0 for 6 from 3
Int=771432 for 6 from 3
i=6, j=4, D[6]=1000000, D[4]=200000, T[4]=200000, C[4]=373
Nfull=5 for 6 from 4  Cfull=373 for 6 from 4  Ifull=1865 for 6 from 4
Npart=0 for 6 from 4  Cpart=0 for 6 from 4  Ipart=0 for 6 from 4
Int=773297 for 6 from 4
i=6, j=5, D[6]=1000000, D[5]=500000, T[5]=500000, C[5]=57085
Nfull=2 for 6 from 5  Cfull=57085 for 6 from 5  Ifull=114170 for 6 from 5
Npart=0 for 6 from 5  Cpart=0 for 6 from 5  Ipart=0 for 6 from 5
Int=887467 for 6 from 5
Util=0.001698, Intf=0.887467 for thread 6
S[6]=0.889165
**** Thread 6 can be scheduled safely

Appendix E

Video Pipeline Test Results (Without Isochronous Output)

-> ld < rtepaLib.o
value = 808160 = 0xc54e0
-> setout
Original setup: sin=3, sout=3, serr=3
All being remapped to your virtual terminal...
You should see this message now!!!
value = 35 = 0x23 = '#' = precis + 0x3
-> start_vpipe(-0_ __ _0)
microseconds_per_tick = 9.998491e+02, microseconds_per_jiffy = 4.190483e-01
Warning: failure to demote system task
Intel NB controller PCI concurrency enable = 0x8
Modified Intel NB controller PCI concurrency enable = 0x8
Intel NB controller PCI latency timer = 0x40
Modified Intel NB controller PCI latency timer = 0x40
Intel NB controller PCI Cmd Reg = 0x6
Modified Intel NB controller PCI Cmd Reg = 0x6
Intel NB controller PCI ARB CTL = 0x80
PCI 2.1 Compliant Intel NB controller PCI ARB CTL = 0x80
Intel SB controller latency control = 0x3
PCI 2.1 Compliant Intel SB controller latency control = 0x3
Intel SB controller IRQ Routing Reg = 0xb808080
Modified Intel SB controller IRQ Routing Reg = 0x6808080
Intel SB controller APIC Addr Reg = 0x0
BAR 0 testval=0xe2001008 before any write
BAR 0 MMIO testval=0xfffff008
BAR 1 testval=0x0 before any write
BAR 1 not implemented
BAR 2 testval=0x0 before any write
BAR 2 not implemented
BAR 3 testval=0x0 before any write
BAR 3 not implemented
BAR 4 testval=0x0 before any write
BAR 4 not implemented
BAR 5 testval=0x0 before any write
BAR 5 not implemented
Found Bt878 configured for IRQ 11
Bt878 Allowable PCI bus latency = 0x40
Bt878 PCI bus min grant = 0x10
Bt878 PCI bus max latency = 0x28
Modified Bt878 Allowable PCI bus latency = 0xff
mmio DSTATUS testval = 0x86
**** VIDEO PRESENT
**** DECODING ODD FIELD
**** PLL OUT OF LOCK
**** LUMA ADC OVERFLOW
mmio INTSTATUS testval = 0x200022e
I2C RACK
DMA DISABLED
ODD FIELD
VIDEO PRESENT CHANGE DETECTED
LUMA/CHROMA OVERFLOW DETECTED
mmio CAPTURE_CNT = 0x0
mmio DMA PC = 0xfffffffc
******** RTEPA tid = 0
Number of RTEPA tasks = 1
LowConf = 1.000000, Zplow = 10.000000, HighConf = 1.000000, Zphigh = 10.000000
interference=2, Cmu = 100, Csigma = 100, Clow = 200, Chigh = 200, Dsoft = 20000, Dterm = 33333
RTEPA_Cterm[0]=200
RTEPA_Dterm[0]=33333
******************** Admit test [Ntasks = 1]
**** Thread 0 => D[0]=20000
Util=0.010000, Intf=0.000000 for thread 0
S[0]=0.010000
**** Thread 0 can be scheduled safely
******************** Admit test [Ntasks = 1]
**** Thread 0 => D[0]=33333
Util=0.006000, Intf=0.000000 for thread 0
S[0]=0.006000
**** Thread 0 can be scheduled safely
Btvid task 0 can be scheduled by more sufficient
******** RTEPA tid = 1
Number of RTEPA tasks = 2
LowConf = 0.500000, Zplow = 0.674490, HighConf = 0.900000, Zphigh = 1.644855
interference=2, Cmu = 10000, Csigma = 1000, Clow = 10067, Chigh = 10164, Dsoft = 100000, Dterm = 150000
RTEPA_Cterm[1]=10067
RTEPA_Dterm[1]=150000
******************** Admit test [Ntasks = 2]
**** Thread 0 => D[0]=20000
Util=0.010000, Intf=0.000000 for thread 0
S[0]=0.010000
**** Thread 0 can be scheduled safely
**** Thread 1 => D[1]=100000
i=1, j=0, D[1]=100000, D[0]=20000, T[0]=33333, C[0]=200
Nfull=3 for 1 from 0  Cfull=200 for 1 from 0  Ifull=600 for 1 from 0
Npart=1 for 1 from 0  Cpart=1 for 1 from 0  Ipart=1 for 1 from 0
Int=601 for 1 from 0
Util=0.100670, Intf=0.006010 for thread 1
S[1]=0.106680
**** Thread 1 can be scheduled safely
******************** Admit test [Ntasks = 2]
**** Thread 0 => D[0]=33333
Util=0.006000, Intf=0.000000 for thread 0
S[0]=0.006000
**** Thread 0 can be scheduled safely
**** Thread 1 => D[1]=150000
i=1, j=0, D[1]=150000, D[0]=33333, T[0]=33333, C[0]=200
Nfull=4 for 1 from 0  Cfull=200 for 1 from 0  Ifull=800 for 1 from 0
Npart=1 for 1 from 0  Cpart=200 for 1 from 0  Ipart=200 for 1 from 0
Int=1000 for 1 from 0
Util=0.067760, Intf=0.006667 for thread 1
S[1]=0.074427
**** Thread 1 can be scheduled safely
Frame compress task 1 can be scheduled by more sufficient
******** RTEPA tid = 2
Number of RTEPA tasks = 3
LowConf = 0.500000, Zplow = 0.674490, HighConf = 0.800000, Zphigh = 1.281552
interference=2, Cmu = 10000, Csigma = 1000, Clow = 10067, Chigh = 10128, Dsoft = 150000, Dterm = 180000
RTEPA_Cterm[2]=10067
RTEPA_Dterm[2]=180000
******************** Admit test [Ntasks = 3]
**** Thread 0 => D[0]=20000
Util=0.010000, Intf=0.000000 for thread 0
S[0]=0.010000
**** Thread 0 can be scheduled safely
**** Thread 1 => D[1]=100000
i=1, j=0, D[1]=100000, D[0]=20000, T[0]=33333, C[0]=200
Nfull=3 for 1 from 0  Cfull=200 for 1 from 0  Ifull=600 for 1 from 0
Npart=1 for 1 from 0  Cpart=1 for 1 from 0  Ipart=1 for 1 from 0
Int=601 for 1 from 0
Util=0.100670, Intf=0.006010 for thread 1
S[1]=0.106680
**** Thread 1 can be scheduled safely
**** Thread 2 => D[2]=150000
i=2, j=0, D[2]=150000, D[0]=20000, T[0]=33333, C[0]=200
Nfull=4 for 2 from 0  Cfull=200 for 2 from 0  Ifull=800 for 2 from 0
Npart=1 for 2 from 0  Cpart=200 for 2 from 0  Ipart=200 for 2 from 0
Int=1000 for 2 from 0
i=2, j=1, D[2]=150000, D[1]=100000, T[1]=200000, C[1]=10067
Nfull=1 for 2 from 1  Cfull=10067 for 2 from 1  Ifull=10067 for 2 from 1
Npart=0 for 2 from 1  Cpart=10067 for 2 from 1  Ipart=0 for 2 from 1
Int=11067 for 2 from 1
Util=0.067113, Intf=0.073780 for thread 2
S[2]=0.140893
**** Thread 2 can be scheduled safely
******************** Admit test [Ntasks = 3]
**** Thread 0 => D[0]=33333
Util=0.006000, Intf=0.000000 for thread 0
S[0]=0.006000
**** Thread 0 can be scheduled safely
**** Thread 1 => D[1]=150000
i=1, j=0, D[1]=150000, D[0]=33333, T[0]=33333, C[0]=200
Nfull=4 for 1 from 0  Cfull=200 for 1 from 0  Ifull=800 for 1 from 0
Npart=1 for 1 from 0  Cpart=200 for 1 from 0  Ipart=200 for 1 from 0
Int=1000 for 1 from 0
Util=0.067760, Intf=0.006667 for thread 1
S[1]=0.074427
**** Thread 1 can be scheduled safely
**** Thread 2 => D[2]=180000
i=2, j=0, D[2]=180000, D[0]=33333, T[0]=33333, C[0]=200
Nfull=5 for 2 from 0  Cfull=200 for 2 from 0  Ifull=1000 for 2 from 0
Npart=1 for 2 from 0  Cpart=200 for 2 from 0  Ipart=200 for 2 from 0
Int=1200 for 2 from 0
i=2, j=1, D[2]=180000, D[1]=150000, T[1]=200000, C[1]=10164
Nfull=1 for 2 from 1  Cfull=10164 for 2 from 1  Ifull=10164 for 2 from 1
Npart=0 for 2 from 1  Cpart=10164 for 2 from 1  Ipart=0 for 2 from 1
Int=11364 for 2 from 1
Util=0.056267, Intf=0.063133 for thread 2
S[2]=0.119400
**** Thread 2 can be scheduled safely
Frame TLM task 2 can be scheduled by more sufficient
Entry pointer passed in = 0x3827ac, and assigned = 0x3827ac
RTEPA stack base = 0x1bf91f8
RTEPA_CB[0].dterm_itime.it_value.tv_sec = 0
RTEPA_CB[0].dterm_itime.it_value.tv_nsec = 33333000
Created RTEPA task 0 with tcbptr=0x1c01394
mmio INTSTATUS testval = 0x300022e
I2C RACK
DMA DISABLED
EVEN FIELD
VIDEO PRESENT CHANGE DETECTED
LUMA/CHROMA OVERFLOW DETECTED
mmio CAPTURE_CNT = 0x0
mmio DMA PC = 0xfffffffc
Timing Gen Ctl Reg = 0x0
Configured NTSC
Setting INPUT_REG = 0x79
Set mux
Loaded MC
mmio INTSTATUS testval = 0x8b000204
I2C RACK
DMA_MC_SYNC
DMA ENABLED
EVEN FIELD
mmio CAPTURE_CNT = 0x0
mmio DMA PC = 0x37b830
Brightness was 128
Setting INPUT_REG = 0x19
Starting video
Video started
OOPIC Servo Driver Serial Interface
OOPIC driver /tyCo/1 intialized and opened with status=0
Entry pointer passed in = 0x381da4, and assigned = 0x381da4
RTEPA stack base = 0x1c0f0e0
RTEPA_CB[1].dterm_itime.it_value.tv_sec = 0
RTEPA_CB[1].dterm_itime.it_value.tv_nsec = 150000000
Created RTEPA task 1 with tcbptr=0x1c1727c
Entry pointer passed in = 0x381c1c, and assigned = 0x381c1c
RTEPA stack base = 0x1c24fc8
RTEPA_CB[2].dterm_itime.it_value.tv_sec = 0
RTEPA_CB[2].dterm_itime.it_value.tv_nsec = 180000000
Created RTEPA task 2 with tcbptr=0x1c2d164
******** VIDEO system fully activated ********
value = 47 = 0x2f = '/' = s_B + 0x7
-> stop_vpipe
Actual pipeline sequencing
rtid=0 completed 851 times, activated next stage @ 190 => next_stage_rtid=1 released 67 times
[specified freq = 10, offset = 0, expected releases = 66]
Actual pipeline sequencing
rtid=1 completed 67 times, activated next stage @ 4 => next_stage_rtid=2 released 64 times
[specified freq = 1, offset = 0, expected releases = 63]
******** Performance Summary for rtid=0, prio=1, tcbptr=0x1c01204 ********
Dispatch parameters
Dsoft=20000, Dterm=33333, Texp=33333, Cexp=100
******** Initial model ********
High Conf = 1.000000  Low Conf = 1.000000
Cexp = 100  Chigh = 200  Clow = 200
******** On-line model
Dhard from actual dist
Dsoft from actual dist
Confidence in supplied
Confidence in supplied
Confidence in supplied
Confidence in supplied
N samples =851  Start sample index =1  Last sample index =851
ReleaseCnt=851  CompleteCnt=851  Npreempts=861  Ninterferences=10  Ndispatches=861
Texpact=33197
Cexpactexec=57  Clowactexec=0  Chighactexec=1081
Cexpactcomp=57  Clowactcomp=0  Chighactcomp=1081
********
free confidence interval =1232
free confidence interval =1232
Dhard based on exec time=0.999000  Dsoft based on exec time=0.999000
Dhard based on complete time=0.999000  Dsoft based on complete time=0.999000
******** Deadline performance ********
SoftMissCnt=0  HardMissCnt=0  HardMissTerm=0
******** Execution performance ********
SoftReliability=1.000000  HardReliability=1.000000
******** Execution errors ********
ReleaseError=0  CompleteError=0  ExecError=0
******** General info ********
gv_last_preempted_tid=0x1f7e7cc  gv_last_dispatched_tid=0x1f76384
Total dispatches/preemptions=16418
gv_rtepa_dispatch_cnt=3550  gv_rtepa_preempt_cnt=3550
******** Performance Summary for rtid=1, prio=2, tcbptr=0x1c170ec ********
Dispatch parameters
Dsoft=100000, Dterm=150000, Texp=200000, Cexp=10000
******** Initial model ********
High Conf = 0.900000  Low Conf = 0.500000
Cexp = 10000  Chigh = 10164  Clow = 10067
******** On-line model
Dhard from actual dist
Dsoft from actual dist
Confidence in supplied
Confidence in supplied
Confidence in supplied
Confidence in supplied
N samples =67  Start sample index =1  Last sample index =67
ReleaseCnt=67  CompleteCnt=67  Npreempts=135  Ninterferences=68  Ndispatches=135
Texpact=328703
Cexpactexec=57948  Clowactexec=0  Chighactexec=59877
Cexpactcomp=57948  Clowactcomp=0  Chighactcomp=59877
********
free confidence interval =0
free confidence interval =0
Dhard based on exec time=0.999000  Dsoft based on exec time=0.999000
Dhard based on complete time=0.999000  Dsoft based on complete time=0.999000
******** Deadline performance ********
SoftMissCnt=0  HardMissCnt=0  HardMissTerm=0
******** Execution performance ********
SoftReliability=1.000000  HardReliability=1.000000
******** Execution errors ********
ReleaseError=0  CompleteError=0  ExecError=0
******** General info ********
gv_last_preempted_tid=0x1f7e7cc  gv_last_dispatched_tid=0x1f76384
Total dispatches/preemptions=16592
gv_rtepa_dispatch_cnt=3550  gv_rtepa_preempt_cnt=3550
******** Performance Summary for rtid=2, prio=3, tcbptr=0x1c2cfd4 ********
Dispatch parameters
Dsoft=150000, Dterm=180000, Texp=200001, Cexp=10000
******** Initial model ********
High Conf = 0.800000  Low Conf = 0.500000
Cexp = 10000  Chigh = 10128  Clow = 10067
******** On-line model
Dhard from actual dist
Dsoft from actual dist
Confidence in supplied
Confidence in supplied
Confidence in supplied
Confidence in supplied
N samples =64  Start sample index =1  Last sample index =64
ReleaseCnt=64  CompleteCnt=64  Npreempts=2554  Ninterferences=2490  Ndispatches=2554
Texpact=328468
Cexpactexec=51795  Clowactexec=0  Chighactexec=57625
Cexpactcomp=51795  Clowactcomp=0  Chighactcomp=57625
********
free confidence interval =0
free confidence interval =0
Dhard based on exec time=0.999000  Dsoft based on exec time=0.999000
Dhard based on complete time=0.999000  Dsoft based on complete time=0.999000
******** Deadline performance ********
SoftMissCnt=0  HardMissCnt=0  HardMissTerm=0
******** Execution performance ********
SoftReliability=1.000000  HardReliability=1.000000
******** Execution errors ********
ReleaseError=0  CompleteError=0  ExecError=0
******** General info ********
gv_last_preempted_tid=0x1f7e7cc
gv_last_dispatched_tid=0x1f76384
Total dispatches/preemptions=16766
gv_rtepa_dispatch_cnt=3550  gv_rtepa_preempt_cnt=3550
Canceling timer for task 0
Suspended task 0
Canceling timer for task 1
Suspended task 1
Canceling timer for task 2
Suspended task 2
Deleted task 0
Deleted task 1
Deleted task 2
value = -771751424 = 0xd2000200
-> exit
thinker <22> exit
thinker <23> exit
script done on Wed Jun 28 21:55:25 2000

Appendix F

Video Pipeline Test Results (With Isochronous Output)

-> ld < rtepaLib.o
value = 807560 = 0xc5288
-> setout
Original setup: sin=3, sout=3, serr=3
All being remapped to your virtual terminal...
You should see this message now!!!
value = 35 = 0x23 = '#' = precis + 0x3
-> start_vpipe(10_ _)
microseconds_per_tick = 9.998491e+02, microseconds_per_jiffy = 4.190483e-01
Warning: failure to demote system task
Intel NB controller PCI concurrency enable = 0x8
Modified Intel NB controller PCI concurrency enable = 0x8
Intel NB controller PCI latency timer = 0x40
Modified Intel NB controller PCI latency timer = 0x40
Intel NB controller PCI Cmd Reg = 0x6
Modified Intel NB controller PCI Cmd Reg = 0x6
Intel NB controller PCI ARB CTL = 0x80
PCI 2.1 Compliant Intel NB controller PCI ARB CTL = 0x80
Intel SB controller latency control = 0x3
PCI 2.1 Compliant Intel SB controller latency control = 0x3
Intel SB controller IRQ Routing Reg = 0xb808080
Modified Intel SB controller IRQ Routing Reg = 0x6808080
Intel SB controller APIC Addr Reg = 0x0
BAR 0 testval=0xe2001008 before any write
BAR 0 MMIO testval=0xfffff008
BAR 1 testval=0x0 before any write
BAR 1 not implemented
BAR 2 testval=0x0 before any write
BAR 2 not implemented
BAR 3 testval=0x0 before any write
BAR 3 not implemented
BAR 4 testval=0x0 before any write
BAR 4 not implemented
BAR 5 testval=0x0 before any write
BAR 5 not implemented
Found Bt878 configured for IRQ 11
Bt878 Allowable PCI bus latency = 0x40
Bt878 PCI bus min grant = 0x10
Bt878 PCI bus max latency = 0x28
Modified Bt878 Allowable PCI bus latency = 0xff
mmio DSTATUS testval = 0xa6
**** VIDEO PRESENT
**** DECODING EVEN FIELD
**** PLL OUT OF LOCK
**** LUMA ADC OVERFLOW
mmio INTSTATUS testval = 0x300022e
I2C RACK
DMA DISABLED
EVEN FIELD
VIDEO PRESENT CHANGE DETECTED
LUMA/CHROMA OVERFLOW DETECTED
mmio CAPTURE_CNT = 0x0
mmio DMA PC = 0xfffffffc
******** RTEPA tid = 0
Number of RTEPA tasks = 1
LowConf = 1.000000, Zplow = 10.000000, HighConf = 1.000000, Zphigh = 10.000000
interference=2, Cmu = 100, Csigma = 100, Clow = 200, Chigh = 200, Dsoft = 20000, Dterm = 33333
RTEPA_Cterm[0]=200
RTEPA_Dterm[0]=33333
******************** Admit test [Ntasks = 1]
**** Thread 0 => D[0]=20000
Util=0.010000, Intf=0.000000 for thread 0
S[0]=0.010000
**** Thread 0 can be scheduled safely
******************** Admit test [Ntasks = 1]
**** Thread 0 => D[0]=33333
Util=0.006000, Intf=0.000000 for thread 0
S[0]=0.006000
**** Thread 0 can be scheduled safely
Btvid task 0 can be scheduled by more sufficient
******** RTEPA tid = 1
Number of RTEPA tasks = 2
LowConf = 0.500000, Zplow = 0.674490, HighConf = 0.900000, Zphigh = 1.644855
interference=2, Cmu = 10000, Csigma = 1000, Clow = 10067, Chigh = 10164, Dsoft = 100000, Dterm = 150000
RTEPA_Cterm[1]=10067
RTEPA_Dterm[1]=150000
******************** Admit test [Ntasks = 2]
**** Thread 0 => D[0]=20000
Util=0.010000, Intf=0.000000 for thread 0
S[0]=0.010000
**** Thread 0 can be scheduled safely
**** Thread 1 => D[1]=100000
i=1, j=0, D[1]=100000, D[0]=20000, T[0]=33333, C[0]=200
Nfull=3 for 1 from 0  Cfull=200 for 1 from 0  Ifull=600 for 1 from 0
Npart=1 for 1 from 0  Cpart=1 for 1 from 0  Ipart=1 for 1 from 0
Int=601 for 1 from 0
Util=0.100670, Intf=0.006010 for thread 1
S[1]=0.106680
**** Thread 1 can be scheduled safely
******************** Admit test [Ntasks = 2]
**** Thread 0 => D[0]=33333
Util=0.006000, Intf=0.000000 for thread 0
S[0]=0.006000
**** Thread 0 can be scheduled safely
**** Thread 1 => D[1]=150000
i=1, j=0, D[1]=150000, D[0]=33333, T[0]=33333, C[0]=200
Nfull=4 for 1 from 0  Cfull=200 for 1 from 0  Ifull=800 for 1 from 0
Npart=1 for 1 from 0  Cpart=200 for 1 from 0  Ipart=200 for 1 from 0
Int=1000 for 1 from 0
Util=0.067760, Intf=0.006667 for thread 1
S[1]=0.074427
**** Thread 1 can be scheduled safely
Frame compress task 1 can be scheduled by more sufficient
******** RTEPA tid = 2
Number of RTEPA tasks = 3
LowConf = 0.500000, Zplow = 0.674490, HighConf = 0.800000, Zphigh = 1.281552
interference=2, Cmu = 10000, Csigma = 1000, Clow = 10067, Chigh = 10128, Dsoft = 150000, Dterm = 180000
RTEPA_Cterm[2]=10067
RTEPA_Dterm[2]=180000
******************** Admit test [Ntasks = 3]
**** Thread 0 => D[0]=20000
Util=0.010000, Intf=0.000000 for thread 0
S[0]=0.010000
**** Thread 0 can be scheduled safely
**** Thread 1 => D[1]=100000
i=1, j=0, D[1]=100000, D[0]=20000, T[0]=33333, C[0]=200
Nfull=3 for 1 from 0  Cfull=200 for 1 from 0  Ifull=600 for 1 from 0
Npart=1 for 1 from 0  Cpart=1 for 1 from 0  Ipart=1 for 1 from 0
Int=601 for 1 from 0
Util=0.100670, Intf=0.006010 for thread 1
S[1]=0.106680
**** Thread 1 can be scheduled safely
**** Thread 2 => D[2]=150000
i=2, j=0, D[2]=150000, D[0]=20000, T[0]=33333, C[0]=200
Nfull=4 for 2 from 0  Cfull=200 for 2 from 0  Ifull=800 for 2 from 0
Npart=1 for 2 from 0  Cpart=200 for 2 from 0  Ipart=200 for 2 from 0
Int=1000 for 2 from 0
i=2, j=1, D[2]=150000, D[1]=100000, T[1]=200000, C[1]=10067
Nfull=1 for 2 from 1  Cfull=10067 for 2 from 1  Ifull=10067 for 2 from 1
Npart=0 for 2 from 1  Cpart=10067 for 2 from 1  Ipart=0 for 2 from 1
Int=11067 for 2 from 1
Util=0.067113, Intf=0.073780 for thread 2
S[2]=0.140893
**** Thread 2 can be scheduled safely
******************** Admit test [Ntasks = 3]
**** Thread 0 => D[0]=33333
Util=0.006000, Intf=0.000000 for thread 0
S[0]=0.006000
**** Thread 0 can be scheduled safely
**** Thread 1 => D[1]=150000
i=1, j=0, D[1]=150000, D[0]=33333, T[0]=33333, C[0]=200
Nfull=4 for 1 from 0  Cfull=200 for 1 from 0  Ifull=800 for 1 from 0
Npart=1 for 1 from 0  Cpart=200 for 1 from 0  Ipart=200 for 1 from 0
Int=1000 for 1 from 0
Util=0.067760, Intf=0.006667 for thread 1
S[1]=0.074427
**** Thread 1 can be scheduled safely
**** Thread 2 => D[2]=180000
i=2, j=0, D[2]=180000, D[0]=33333, T[0]=33333, C[0]=200
Nfull=5 for 2 from 0  Cfull=200 for 2 from 0  Ifull=1000 for 2 from 0
Npart=1 for 2 from 0  Cpart=200 for 2 from 0  Ipart=200 for 2 from 0
Int=1200 for 2 from 0
i=2, j=1, D[2]=180000, D[1]=150000, T[1]=200000, C[1]=10164
Nfull=1 for 2 from 1  Cfull=10164 for 2 from 1  Ifull=10164 for 2 from 1
Npart=0 for 2 from 1  Cpart=10164 for 2 from 1  Ipart=0 for 2 from 1
Int=11364 for 2 from 1
Util=0.056267, Intf=0.063133 for thread 2
S[2]=0.119400
**** Thread 2 can be scheduled safely
Frame TLM task 2 can be scheduled by more sufficient
Entry pointer passed in = 0x3827ac, and assigned = 0x3827ac
RTEPA stack base = 0x1bf91f8
RTEPA_CB[0].dterm_itime.it_value.tv_sec = 0
RTEPA_CB[0].dterm_itime.it_value.tv_nsec = 33333000
Created RTEPA task 0 with tcbptr=0x1c01394
mmio INTSTATUS testval = 0x200022e
I2C RACK
DMA DISABLED
ODD FIELD
VIDEO PRESENT CHANGE DETECTED
LUMA/CHROMA OVERFLOW DETECTED
mmio CAPTURE_CNT = 0x0
mmio DMA PC = 0xfffffffc
Timing Gen Ctl Reg = 0x0
Configured NTSC
Setting INPUT_REG = 0x79
Set mux
Loaded MC
mmio INTSTATUS testval = 0x8a000204
I2C RACK
DMA_MC_SYNC
DMA ENABLED
ODD FIELD
mmio CAPTURE_CNT = 0x0
mmio DMA PC = 0x37b830
Brightness was 128
Setting INPUT_REG = 0x19
Starting video
Video started
OOPIC Servo Driver Serial Interface
OOPIC driver /tyCo/1 intialized and opened with status=0
Entry pointer passed in = 0x381da4, and assigned = 0x381da4
RTEPA stack base = 0x1c0f0e0
RTEPA_CB[1].dterm_itime.it_value.tv_sec = 0
RTEPA_CB[1].dterm_itime.it_value.tv_nsec = 150000000
Created RTEPA task 1 with tcbptr=0x1c1727c
Entry pointer passed in = 0x381c1c, and assigned = 0x381c1c
RTEPA stack base = 0x1c24fc8
RTEPA_CB[2].dterm_itime.it_value.tv_sec = 0
RTEPA_CB[2].dterm_itime.it_value.tv_nsec = 180000000
Created RTEPA task 2 with tcbptr=0x1c2d164
******** VIDEO system fully activated ********
value = 47 = 0x2f = '/' = s_B + 0x7
-> stop_vpipe
Actual pipeline sequencing
rtid=0 completed 766 times, activated next stage @ 190 => next_stage_rtid=1 released 58 times
[specified freq = 10, offset = 0, expected releases = 57]
Actual pipeline sequencing
rtid=1 completed 58 times, activated next stage @ 4 => next_stage_rtid=2 released 55 times
[specified freq = 1, offset = 0, expected releases = 54]
******** Performance Summary for rtid=0, prio=1, tcbptr=0x1c01204 ********
Dispatch parameters
Dsoft=20000, Dterm=33333, Texp=33333, Cexp=100
******** Initial model ********
High Conf = 1.000000  Low Conf = 1.000000
Cexp = 100  Chigh = 200  Clow = 200
******** On-line model
Dhard from actual dist
Dsoft from actual dist
Confidence in supplied
Confidence in supplied
Confidence in supplied
Confidence in supplied
N samples =766  Start sample index =1  Last sample index =766
ReleaseCnt=766  CompleteCnt=766  Npreempts=779  Ninterferences=13  Ndispatches=779
Texpact=33308
Cexpactexec=53  Clowactexec=0  Chighactexec=1065
Cexpactcomp=53  Clowactcomp=0  Chighactcomp=1065
********
free confidence interval =222
free confidence interval =222
Dhard based on exec time=0.999000  Dsoft based on exec time=0.999000
Dhard based on complete time=0.999000  Dsoft based on complete time=0.999000
******** Deadline performance ********
SoftMissCnt=0  HardMissCnt=0  HardMissTerm=0
******** Execution performance ********
SoftReliability=1.000000  HardReliability=1.000000
******** Execution errors ********
ReleaseError=0  CompleteError=0  ExecError=0
******** General info ********
gv_last_preempted_tid=0x1f7e7cc  gv_last_dispatched_tid=0x1f76384
Total dispatches/preemptions=14764
gv_rtepa_dispatch_cnt=3192  gv_rtepa_preempt_cnt=3192
******** Performance Summary for rtid=1, prio=2, tcbptr=0x1c170ec ********
Dispatch parameters
Dsoft=100000, Dterm=150000, Texp=200000, Cexp=10000
******** Initial model ********
High Conf = 0.900000  Low Conf = 0.500000
Cexp = 10000  Chigh = 10164  Clow = 10067
******** On-line model
Dhard from actual dist
Dsoft from actual dist
Confidence in supplied
Confidence in supplied
Confidence in supplied
Confidence in supplied
N samples =58  Start sample index =1  Last sample index =58
ReleaseCnt=58  CompleteCnt=58  Npreempts=117  Ninterferences=59  Ndispatches=117
Texpact=327931
Cexpactexec=57782  Clowactexec=0  Chighactexec=59763
Cexpactcomp=57782  Clowactcomp=0  Chighactcomp=59763
********
free confidence interval =0
free confidence interval =0
Dhard based on exec time=0.999000  Dsoft based on exec time=0.999000
Dhard based on complete time=0.999000  Dsoft based on complete time=0.999000
******** Deadline performance ********
SoftMissCnt=0  HardMissCnt=0  HardMissTerm=0
******** Execution performance ********
SoftReliability=1.000000  HardReliability=1.000000
******** Execution errors ********
ReleaseError=0  CompleteError=0  ExecError=0
******** General info ********
gv_last_preempted_tid=0x1f7e7cc  gv_last_dispatched_tid=0x1f76384
Total dispatches/preemptions=14924
gv_rtepa_dispatch_cnt=3192  gv_rtepa_preempt_cnt=3192
******** Performance Summary for rtid=2, prio=3, tcbptr=0x1c2cfd4 ********
Dispatch parameters
Dsoft=150000, Dterm=180000, Texp=200001, Cexp=10000
******** Initial model ********
High Conf = 0.800000  Low Conf = 0.500000
Cexp = 10000  Chigh = 10128  Clow = 10067
******** On-line model
Dhard from actual dist
Dsoft from actual dist
Confidence in supplied
Confidence in supplied
Confidence in supplied
Confidence in supplied
N samples =55  Start sample index =1  Last sample index =55
ReleaseCnt=55  CompleteCnt=55  Npreempts=2296  Ninterferences=2241
Ndispatches=2296
Texpact=327613
Cexpactexec=52025  Clowactexec=0  Chighactexec=56895
Cexpactcomp=52025  Clowactcomp=0  Chighactcomp=56895
********
free confidence interval =0
free confidence interval =0
Dhard based on exec time=0.999000  Dsoft based on exec time=0.999000
Dhard based on complete time=0.999000  Dsoft based on complete time=0.999000
******** Deadline performance ********
SoftMissCnt=0  HardMissCnt=0  HardMissTerm=0
******** Execution performance ********
SoftReliability=1.000000  HardReliability=1.000000
******** Execution errors ********
ReleaseError=0  CompleteError=0  ExecError=0
******** General info ********
gv_last_preempted_tid=0x1f7e7cc  gv_last_dispatched_tid=0x1f76384
Total dispatches/preemptions=15100
gv_rtepa_dispatch_cnt=3192  gv_rtepa_preempt_cnt=3192
Canceling timer for task 0
Suspended task 0
Canceling timer for task 1
Suspended task 1
Canceling timer for task 2
Suspended task 2
Deleted task 0
Deleted task 1
Deleted task 2
value = -771751420 = 0xd2000204
-> exit
thinker <22> exit
thinker <23> exit
script done on Thu Jun 29 00:02:22 2000
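Throughout these appendices the admit test logs the same arithmetic for each thread: Util = C[i]/D[i], Intf = Int/D[i] (where Int is the accumulated interference from all higher-priority threads), and the admitted value S[i] = Util + Intf. As an illustrative cross-check (not part of the original test output; the (C, D, Int) tuples below are copied from the Ntasks = 3 admit test logged above), a few lines of Python reproduce the logged S values:

```python
# Hypothetical cross-check of the admit-test arithmetic, using values
# taken verbatim from the Ntasks = 3 admit test in the video pipeline run.
threads = [
    # (C_i, D_i, accumulated interference Int_i from the log)
    (200,   20000,     0),   # thread 0: btvid capture
    (10067, 100000,   601),  # thread 1: frame compress
    (10067, 150000, 11067),  # thread 2: frame TLM
]
for i, (C, D, Int) in enumerate(threads):
    util = C / D          # Util = C[i]/D[i]
    intf = Int / D        # Intf = Int/D[i]
    S = util + intf       # admitted value S[i]
    print(f"S[{i}]={S:.6f}")
# prints S[0]=0.010000, S[1]=0.106680, S[2]=0.140893, matching the log
```

The admission criterion then amounts to checking each S[i] against the feasibility bound of the RACE sufficient test; all three values here are well under 1.0, consistent with the "can be scheduled safely" messages in the log.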