Embedded Java
Hsiao-Lung Chan
Dept. of Electrical Engineering, Chang Gung University, Taiwan

Evolution of programming languages

Host and target system diagram
• Host [Development System]: Application Layer (Preprocessor, Compiler, Linker, …), System Software Layer, Hardware Layer
• Target [Embedded System]: Application Layer, System Software Layer, Hardware Layer

C example compilation/linking steps
• C source file(s) and C header file(s) -> C compiler (preprocessing, compiling) -> C object file(s)
• C object file(s) and C system libraries -> linker -> C executable file
• The executable is built on the host computer and then moved to the embedded system

Embedded Java compilation and linking diagram
• Java was developed by Sun Microsystems
• Java bytecode (compiled Java binary code) is platform independent

Java bytecode
Java code:
    outer:
    for (int i = 2; i < 1000; i++) {
        for (int j = 2; j < i; j++) {
            if (i % j == 0)
                continue outer;
        }
        System.out.println(i);
    }
Java bytecode:
     0: iconst_2
     1: istore_1
     2: iload_1
     3: sipush 1000
     6: if_icmpge 44
     9: iconst_2
    10: istore_2
    11: iload_2
    12: iload_1
    13: if_icmpge 31
    16: iload_1
    17: iload_2
    18: irem
    19: ifne 25
    22: goto 38
    25: iinc 2, 1
    28: goto 11
    31: getstatic #84;
    34: iload_1
    35: invokevirtual #85;
    38: iinc 1, 1
    41: goto 2
    44: return

Java virtual machine (JVM)
• Java bytecode is verified, then interpreted or JIT-compiled for the target architecture
• The Java APIs and the JVM together make up the Java Runtime Environment (JRE)

Translation of Java bytecode on target
[Figure: source file(s) are compiled by the host compiler. With interpretation, the target parses and then interprets each bytecode in turn (Byte Code 1, Byte Code 2, Byte Code 3, …).]
[Figure: the bytecode file(s) can also be translated in two other ways. Just-in-time (JIT): on the first pass each bytecode is parsed, interpreted, and JIT-compiled; on the second and additional passes the compiled bytecode is run directly. Way-ahead-of-time/ahead-of-time (WAT/AOT): .class file -> JVM WAT compiler -> object file -> JVM linker (with runtime libraries) -> executables.]

JVM interpreter
• Every bytecode instruction is parsed and converted to native code, one bytecode at a time
• Redundant portions of code are reinterpreted every time they are run
• Lowest performance of the three approaches

JIT (just-in-time) compiler
• Interprets the bytecode once, then compiles it and stores the native form of the bytecode at runtime
• Adds runtime overhead for the compilation step
• Needs additional memory to hold the compiled code

Way-ahead-of-time/ahead-of-time compiling (WAT/AOT)
• All Java bytecode is compiled into native code at compile time
• Better performance than JIT for non-redundant code
• Additional Java classes that are dynamically downloaded at runtime are not part of the precompiled executable
[Figure: .class file -> JVM WAT compiler -> object file -> JVM linker (with runtime libraries) -> executables]
Bytecode verifier
• No user program can crash or interfere with the host machine
• Guards against programmer errors: data corruption or unpredictable behavior, such as accessing off the end of an array or using an uninitialized pointer
• The JVM verifies all bytecode before it is executed:
  • Branches are always to valid locations
  • Data is always initialized and references are always type-safe
  • Access to private data and methods is controlled

JVM components
• JVM classes: compiled libraries of Java bytecode, the Java APIs (application program interfaces)
• Execution engine: contains the components needed to process Java code
• Garbage collection: deallocates any memory no longer in use by Java applications

Copying garbage collection algorithm
• Copies referenced objects to a different part of memory, then frees the original memory space
• Uses a larger memory area and cannot be interrupted during the copy

Mark and sweep garbage collection algorithm
• Marks all objects that are in use, then sweeps (deallocates) the unused objects
• The system can interrupt this GC and execute other functions
• Leads to memory fragmentation; an additional memory compacting algorithm can be implemented

Generational garbage collection algorithm
• Youngest generation: copying GC
• Older generation: mark (sweep) and compact GC

Generational garbage collection (cont.)
• Separates objects into generations
• Assumes that most objects allocated by a Java program are short-lived
• Objects in the younger generation group are cleaned up more frequently than objects in older generation groups
• Different generational GCs may employ different algorithms

How can Java add to an embedded system's architecture?
[Figure: three variants of the Application Layer / System Software Layer / Hardware Layer stack, with the JVM in the application layer, the JVM in the system software layer (alongside Java device drivers), or a Java processor in the hardware layer]
• JVM compiled in application: Esmertec's Jbed, Kava's KavaVM, IBM's J9, …
• JVM part of system layer: Skelmir's Cee-J, Esmertec/Insignia's Jeode and Jbed, Tao's Intent, Kava's KavaVM, …
• JVM in hardware: ARM's Jazelle, aJile's aJ-100, …

Java platform

Java chip
• picoJava: executes Java bytecode in hardware
• ARM Jazelle architecture

Cortex A8
• Design For Test interface for manufacturing testing of the core
• Pipeline
• Main interface to the system bus
• Embedded Trace Macrocell for non-intrusive debug; outputs trace information for debugging
• NEON coprocessor implements a 10-stage pipeline that decodes and executes the instructions of the Advanced SIMD (single-instruction, multiple-data) media processing architecture

Program status register (PSR)

Big-endian and little-endian formats

Registers in Cortex A8

Jazelle mode in ARM
• Jazelle mode is entered via the BXJ instruction
• In Jazelle state the T bit is 0 and the J bit is 1
• Following entry into Jazelle state, bytecodes can be processed in one of three ways:
  • Decoded and executed in hardware (Coprocessor 14)
  • Handled in software (ARM/ThumbEE JVM code)
  • Treated as an invalid/illegal opcode, which causes a branch to an ARM exception mode

Processor operating states
• ARM state: 32-bit, word-aligned ARM instructions are executed in this state
• Thumb state: 16-bit and 32-bit, halfword-aligned Thumb-2 instructions
• Status bits: ARM state has T bit 0 and J bit 0; Thumb state has T bit 1 and J bit 0
• ThumbEE state: 16-bit and 32-bit, halfword-aligned variant of the Thumb-2 instruction set; T bit is 1 and J bit is 1
  • Designed as a target for dynamically generated code
  • This is code compiled on the device, either shortly before or during execution, from a portable bytecode

Thumb instruction format
Example: ADD Rd, Rd, #Constant
• Thumb code (16 bits, bit 15 down to bit 0): 001 (major op-code) | 10 (minor op-code) | Rd (3 bits, destination and source register) | 8-bit immediate
• ARM code (32 bits, bit 31 down to bit 0): 1110 (always condition code) | 00 | 1 (immediate value) | 0100 | 1 (S) | 0 Rd | 0 Rd | 0000 | 8-bit immediate

Thumb Execution Environment (ThumbEE) instruction set
• In ThumbEE state, the processor executes almost the same instruction set as in Thumb state
• Some ThumbEE instructions are added:
  • New ThumbEE instructions to branch to handlers
  • Null pointer checking on load/store instructions
  • An additional instruction to check array bounds
  • Some other modifications to load, store, and control flow instructions

Co-processor in ARM
• A co-processor is an added function unit that is called by an instruction
• Floating-point units are often structured as co-processors
• ARM allows up to 16 designer-selected co-processors

Co-processor in ARM (cont.)
[Figure: memory system, ARM core, and co-processor (CPDRIVE); the instruction stream comes from memory]
• MRC: move to an ARM register from the coprocessor
• MCR: move to the coprocessor from an ARM register
• LDC: load memory data to the coprocessor
• STC: store data from the coprocessor to memory

Co-processor in ARM (cont.)
A typical coprocessor contains:
• An instruction pipeline
• Instruction decoding logic
• Handshake logic
• A register bank
• Special processing logic

Co-processor in ARM (cont.)
[Figure: the ARM core drives nCPI (NOT coprocessor instruction) to Coprocessor 1, Coprocessor 2, … Coprocessor n; the coprocessors answer on CPA (coprocessor absent) and CPB (coprocessor busy)]

Co-processor in ARM (cont.)
ARM processor:
• Evaluates the instruction to determine whether it is to be executed by a coprocessor.
• Signals the coprocessors using nCPI.
• Generates any address required by the instruction.
• Takes the undefined instruction trap if no coprocessor accepts the instruction.
Coprocessor:
• Decodes the instruction to determine whether to accept it.
• Indicates the response status on CPA and CPB.
• Fetches the required values from its own register bank.
• Performs the operation required by the instruction.

Co-processor in ARM (cont.)
CPA  CPB  Response             Remarks
0    0    Coprocessor present  (1) The coprocessor (CP) can accept and execute the instruction immediately. (2) ARM ignores this instruction and executes the next instruction.
0    1    Coprocessor busy     (1) The CP can accept the instruction but is currently unable to process it; it can stall ARM by asserting busy-wait. (2) When the CP is ready to start executing, it drives CPB LOW.
1    0    Invalid response
1    1    Coprocessor absent   (1) The CP cannot accept the instruction. (2) ARM takes the undefined instruction trap.
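
The CPA/CPB response table from the coprocessor slides can be restated as a small Java sketch. This is only an illustration of the decoding rule: the class, enum, and method names (CoprocessorHandshake, Response, decode, armAction) are hypothetical and not taken from ARM documentation, and real hardware resolves the handshake in logic rather than software.

```java
// Hypothetical sketch: all names here are illustrative assumptions,
// not part of any ARM tool or library.
class CoprocessorHandshake {

    /** The four responses listed in the CPA/CPB table. */
    enum Response { PRESENT, BUSY, INVALID, ABSENT }

    /** Maps the two handshake signal levels (true = 1/HIGH) to a response. */
    static Response decode(boolean cpa, boolean cpb) {
        if (!cpa && !cpb) return Response.PRESENT; // CPA=0, CPB=0
        if (!cpa)         return Response.BUSY;    // CPA=0, CPB=1
        if (!cpb)         return Response.INVALID; // CPA=1, CPB=0
        return Response.ABSENT;                    // CPA=1, CPB=1
    }

    /** What the ARM core does next, following the table's remarks column. */
    static String armAction(Response r) {
        switch (r) {
            case PRESENT: return "execute next instruction";
            case BUSY:    return "busy-wait until CPB goes LOW";
            case ABSENT:  return "take undefined instruction trap";
            default:      return "invalid handshake";
        }
    }

    public static void main(String[] args) {
        boolean[] levels = { false, true };
        // Walk all four signal combinations in table order.
        for (boolean cpa : levels) {
            for (boolean cpb : levels) {
                Response r = decode(cpa, cpb);
                System.out.println("CPA=" + (cpa ? 1 : 0) + " CPB=" + (cpb ? 1 : 0)
                        + " -> " + r + ": " + armAction(r));
            }
        }
    }
}
```

Keeping decode separate from armAction mirrors the table's split between the Response and Remarks columns, so each can be checked against its own column.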