Embedded Java

Hsiao-Lung Chan
Dept. of Electrical Engineering
Chang Gung University
Taiwan

Evolution of programming languages

Host and target system diagram
[Figure: Host [Development System] and Target [Embedded System], each drawn as a stack of Application Layer, System Software Layer, and Hardware Layer. The host's Application Layer holds the Preprocessor, Compiler, Linker …]

C example compilation/linking steps
[Figure: on the Host Computer, C Source File(s) and C Header File(s) go through the C Compiler (Preprocessing, Compiling) to give C Object File(s); the Linker combines these with the C System Libraries into a C Executable File for the Embedded System.]

Embedded Java compilation and linking diagram
• Java was developed by Sun Microsystems
• Java bytecode (compiled Java binary code): platform independent

Java bytecode

Java code:

    outer:
    for (int i = 2; i < 1000; i++) {
        for (int j = 2; j < i; j++) {
            if (i % j == 0)
                continue outer;
        }
        System.out.println(i);
    }

Java bytecode:

    0: iconst_2
    1: istore_1
    2: iload_1
    3: sipush 1000
    6: if_icmpge 44
    9: iconst_2
    10: istore_2
    11: iload_2
    12: iload_1
    13: if_icmpge 31
    16: iload_1
    17: iload_2
    18: irem
    19: ifne 25
    22: goto 38
    25: iinc 2, 1
    28: goto 11
    31: getstatic #84;
    34: iload_1
    35: invokevirtual #85;
    38: iinc 1, 1
    41: goto 2
    44: return
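
To reproduce a listing like the one on this slide, the loop can be placed in a class, compiled, and disassembled. The sketch below is one way to do that. The class name `PrimeLoop` and the method `primesBelow` are hypothetical, and the method collects the values instead of printing them directly, so the offsets and constant-pool indices will not match the slide exactly.

```java
import java.util.ArrayList;
import java.util.List;

class PrimeLoop {
    // Same labeled-continue loop as on the slide, but collecting the values
    // in a list so the result can be inspected.
    static List<Integer> primesBelow(int limit) {
        List<Integer> primes = new ArrayList<>();
        outer:
        for (int i = 2; i < limit; i++) {
            for (int j = 2; j < i; j++) {
                if (i % j == 0)
                    continue outer; // i has a divisor: move on to the next i
            }
            primes.add(i);
        }
        return primes;
    }

    public static void main(String[] args) {
        for (int p : primesBelow(1000)) {
            System.out.println(p);
        }
    }
}
```

After `javac PrimeLoop.java`, running `javap -c PrimeLoop` prints the bytecode of each method in the same offset-and-mnemonic form as the slide.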

Java virtual machine (JVM)
• Java bytecode is verified, interpreted or JIT-compiled for the architecture
• The Java APIs and JVM together make up the Java Runtime Environment (JRE)

Translation of Java bytecode on target
[Figure: on the host, a compiler turns Source File(s) into Byte Code File(s). On the target the bytecode is translated in one of three ways.
Interpretation: Byte Code 1, Byte Code 2, Byte Code 3, … are each parsed (vtab) and then interpreted, one after another.
Just-In-Time [JIT]: in the first pass of processing, each bytecode is parsed and interpreted (vtab) and then JIT-compiled; the 2nd and additional passes use the stored Compiled Byte Code 1, 2, 3, ….
Way-Ahead-of-Time/Ahead-of-Time [WAT/AOT]: .class File -> JVM WAT Compiler -> object File -> JVM Linker (with Runtime Libraries) -> executables.]

JVM interpreter
• Every bytecode instruction is parsed and converted to native code, one bytecode at a time
• Redundant portions of code are reinterpreted every time they are run
• Lowest performance
[Figure: Byte Code 1, Byte Code 2, Byte Code 3, … are each parsed (vtab) and interpreted in turn.]
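
The one-bytecode-at-a-time behavior can be illustrated with a toy stack machine. This is only a sketch of a fetch-decode-execute loop: the opcodes `PUSH`, `ADD`, `MUL`, and `HALT` and the class `TinyInterpreter` are made up for the example and are not real JVM opcodes or APIs.

```java
class TinyInterpreter {
    // Hypothetical opcodes for this sketch (not real JVM opcodes).
    static final int PUSH = 0, ADD = 1, MUL = 2, HALT = 3;

    // Fetch, decode, and execute one instruction at a time.
    static int run(int[] code) {
        int[] stack = new int[16];
        int sp = 0; // next free stack slot
        int pc = 0; // index of the next instruction
        while (true) {
            int op = code[pc++];
            switch (op) {
                case PUSH:
                    stack[sp++] = code[pc++]; // operand follows the opcode
                    break;
                case ADD: {
                    int b = stack[--sp], a = stack[--sp];
                    stack[sp++] = a + b;
                    break;
                }
                case MUL: {
                    int b = stack[--sp], a = stack[--sp];
                    stack[sp++] = a * b;
                    break;
                }
                case HALT:
                    return stack[sp - 1]; // result is on top of the stack
                default:
                    throw new IllegalStateException("bad opcode " + op);
            }
        }
    }

    public static void main(String[] args) {
        // (2 + 3) * 4
        int[] program = {PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT};
        System.out.println(run(program)); // prints 20
    }
}
```

If the same program runs inside a loop, every instruction is decoded again on every iteration, which is the redundancy the slide points to.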

JIT (just-in-time) compiler
• Interprets bytecode once, then compiles it and stores the native form of the bytecode at runtime
• Additional runtime overhead
• Additional memory
[Figure: first pass of processing bytecode: Byte Code 1, Byte Code 2, … are each parsed and interpreted (vtab), then JIT-compiled. 2nd and additional passes of processing bytecode: the stored Compiled Byte Code 1, 2, 3, … is used.]
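
The core idea, translate on first use and then reuse the stored form, can be sketched with a cache. This is a simplification under stated assumptions: the class `TinyJit`, its two-word instruction format, and the use of Java lambdas as a stand-in for emitted native code are all hypothetical, and the sketch skips the initial interpretation step that a real JIT performs.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntUnaryOperator;

class TinyJit {
    // Hypothetical two-word instructions: {opcode, constant}.
    static final int ADD = 0, MUL = 1;

    private final Map<String, IntUnaryOperator> compiled = new HashMap<>();
    int compileCount = 0; // how many times bytecode was actually translated

    // Translate the whole bytecode array once into a chain of lambdas,
    // standing in for the native code a real JIT would emit.
    private IntUnaryOperator compile(int[] code) {
        compileCount++;
        IntUnaryOperator f = IntUnaryOperator.identity();
        for (int pc = 0; pc < code.length; pc += 2) {
            final int k = code[pc + 1];
            IntUnaryOperator step;
            switch (code[pc]) {
                case ADD: step = x -> x + k; break;
                case MUL: step = x -> x * k; break;
                default: throw new IllegalStateException("bad opcode " + code[pc]);
            }
            f = f.andThen(step);
        }
        return f;
    }

    // The first call for a name pays the translation cost; later calls reuse
    // the stored form, at the price of the memory held by the cache.
    int call(String name, int[] code, int arg) {
        IntUnaryOperator f = compiled.get(name);
        if (f == null) {
            f = compile(code);
            compiled.put(name, f);
        }
        return f.applyAsInt(arg);
    }
}
```

The `compiled` map is the "additional memory" from the slide, and the work done in `compile` is the "additional runtime overhead" paid on the first pass.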

Way-ahead-of-time/ahead-of-time compiling (WAT/AOT)
• All Java bytecode is compiled into native code at compile time
• Better performance than JIT for non-redundant code
• Additional Java classes dynamically downloaded at runtime
[Figure: .class File (Byte Code 1, Byte Code 2, Byte Code 3, …) -> JVM WAT Compiler -> object File -> JVM Linker (with Runtime Libraries) -> executables.]

Bytecode verifier
• No user program can crash or interfere with the host machine
• Avoid programmer errors
  • Data corruption or unpredictable behavior such as accessing off the end of an array or using an uninitialized pointer
• The JVM verifies all bytecode before it is executed
  • Branches are always to valid locations
  • Data is always initialized and references are always type-safe
  • Access to private data and methods is controlled
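
From the programmer's side, the effect of these guarantees can be seen in a small experiment. Note the assumption in this sketch: the two checks shown (array bounds and null references) are performed by the JVM at run time and complement the load-time bytecode verification listed above; they are not the verifier itself. The class `SafetyDemo` and its method names are hypothetical.

```java
class SafetyDemo {
    // Reads one element past the end of the array and reports what happened.
    static String readPastEnd(int[] data) {
        try {
            int value = data[data.length]; // out-of-range index
            return "read " + value;
        } catch (ArrayIndexOutOfBoundsException e) {
            return "rejected"; // the JVM refuses the access instead of reading stray memory
        }
    }

    // Dereferences a null reference and reports what happened.
    static String useNull(String s) {
        try {
            return "length " + s.length();
        } catch (NullPointerException e) {
            return "rejected"; // a defined exception, not unpredictable behavior
        }
    }
}
```

In both cases the program receives a well-defined exception and the host machine is unaffected.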

JVM components
• JVM classes, compiled libraries of Java bytecode, Java APIs (application program interfaces)
• Execution engine contains components needed to process Java code
• Garbage collection deallocates any memory no longer in use by Java applications

Copying garbage collection algorithm
• Copy referenced objects to a different part of memory, and then free up the original memory space
• Uses a larger memory area and cannot be interrupted during the copy
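
A minimal sketch of the copying idea follows, using ordinary Java objects to model a heap. Everything here is an assumption made for illustration: the class `CopyingGcSketch`, the `Obj` record, and the use of a list as the "to-space". A real collector works on raw memory, not on Java collections.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

class CopyingGcSketch {
    // Hypothetical heap object: a name plus outgoing references.
    static class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        Obj(String name) { this.name = name; }
    }

    // Copies everything reachable from the roots into a fresh "to-space"
    // list and returns it; the old space can then be dropped as a whole.
    static List<Obj> collect(List<Obj> roots) {
        Map<Obj, Obj> forwarded = new IdentityHashMap<>(); // old object -> its copy
        List<Obj> toSpace = new ArrayList<>();
        for (int i = 0; i < roots.size(); i++) {
            roots.set(i, copy(roots.get(i), forwarded, toSpace));
        }
        // Scan the copies in order and redirect their references to copies
        // as well; toSpace grows while it is being scanned.
        for (int scan = 0; scan < toSpace.size(); scan++) {
            Obj o = toSpace.get(scan);
            for (int i = 0; i < o.refs.size(); i++) {
                o.refs.set(i, copy(o.refs.get(i), forwarded, toSpace));
            }
        }
        return toSpace;
    }

    private static Obj copy(Obj old, Map<Obj, Obj> forwarded, List<Obj> toSpace) {
        Obj existing = forwarded.get(old);
        if (existing != null) return existing; // already moved
        Obj fresh = new Obj(old.name);
        fresh.refs.addAll(old.refs); // still old references; fixed during the scan
        forwarded.put(old, fresh);
        toSpace.add(fresh);
        return fresh;
    }
}
```

Both spaces exist at the same time during `collect`, which is the larger memory area mentioned above, and the references are only consistent once the scan has finished, which is why the copy is not interrupted.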

Mark and sweep garbage collection algorithm
• Mark all objects that are in use, and then sweep (deallocate) unused objects
• The system can interrupt this GC and execute other functions
• Leads to memory fragmentation
• An additional memory compacting algorithm can be implemented
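
The two phases can be sketched on a toy heap modeled as a list of cells. The class `MarkSweepSketch`, the `Cell` type, and the `marked` flag are hypothetical names for this illustration; a real collector keeps mark bits in object headers or side tables and frees raw memory.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class MarkSweepSketch {
    // Hypothetical heap cell with a mark bit and outgoing references.
    static class Cell {
        final String name;
        final List<Cell> refs = new ArrayList<>();
        boolean marked;
        Cell(String name) { this.name = name; }
    }

    // Mark phase: flag every cell reachable from the roots.
    static void mark(List<Cell> roots) {
        Deque<Cell> work = new ArrayDeque<>(roots);
        while (!work.isEmpty()) {
            Cell c = work.pop();
            if (c.marked) continue; // already visited
            c.marked = true;
            work.addAll(c.refs);
        }
    }

    // Sweep phase: drop unmarked cells and clear the marks of the
    // survivors for the next cycle. Returns the number of cells freed.
    static int sweep(List<Cell> heap) {
        int before = heap.size();
        heap.removeIf(c -> !c.marked);
        for (Cell c : heap) c.marked = false;
        return before - heap.size();
    }
}
```

Survivors are never moved in this scheme. On a real heap the freed cells therefore leave holes between live objects, which is the fragmentation that a separate compacting step addresses.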

Generational garbage collection algorithm
[Figure: Youngest Generation handled by Copying GC; Older Generation handled by Mark (Sweep) & Compact GC.]

Generational garbage collection (cont.)
• Separate objects into generations
• Assume that most objects that are allocated by a Java program are short-lived
• Objects in the younger generation group are cleaned up more frequently than objects in older generation groups
• Different generational GCs may employ different algorithms
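
The split into generations can be sketched as two lists with a promotion rule. This is an illustration under assumptions, not a real collector: the class `GenerationalSketch`, the `live` flag set by the caller, and the threshold `PROMOTE_AFTER` are all hypothetical, and a real GC would determine liveness by tracing from the roots.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class GenerationalSketch {
    // Hypothetical object record: liveness is supplied by the caller here.
    static class Obj {
        boolean live = true;
        int survived = 0; // minor collections survived so far
    }

    static final int PROMOTE_AFTER = 2; // assumed promotion threshold

    final List<Obj> young = new ArrayList<>();
    final List<Obj> old = new ArrayList<>();

    Obj allocate() {
        Obj o = new Obj();
        young.add(o); // new objects always start in the young generation
        return o;
    }

    // Frequent, cheap collection: only the young generation is examined.
    void minorCollect() {
        Iterator<Obj> it = young.iterator();
        while (it.hasNext()) {
            Obj o = it.next();
            if (!o.live) {
                it.remove(); // short-lived object dies young
            } else if (++o.survived >= PROMOTE_AFTER) {
                it.remove();
                old.add(o); // long-lived object is promoted
            }
        }
    }

    // Rarer, more expensive collection: the old generation is examined too.
    void majorCollect() {
        minorCollect();
        old.removeIf(o -> !o.live);
    }
}
```

Because most objects are assumed to die young, `minorCollect` touches only a small list most of the time, while `majorCollect` runs less often.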

How can Java be added to an embedded system's architecture?
[Figure: three stacks of Application Layer, System Software Layer, and Hardware Layer, each placing Java support in a different layer; labels include JVM, Java Device Drivers, and Java Processor.]
• JVM compiled in application: Esmertec's Jbed, Kava's KavaVM, IBM's J9 …
• JVM part of System layer: Skelmir's Cee-J, Esmertec/Insignia's Jeode and Jbed, Tao's Intent, Kava's KavaVM …
• JVM in Hardware: ARM's Jazelle, aJile's aj100, …

Java platform

Java chip
• picoJava
• Execute Java bytecode by hardware
• ARM Jazelle architecture

Cortex A8
[Figure: block diagram of the Cortex A8 with the following annotations.]
• Design For Test interface for manufacturing testing of the core
• Pipeline
• Main interface to the system bus
• Output trace information for debugging
• Embedded Trace Macrocell for non-intrusive debug
• NEON coprocessor implements a 10-stage pipeline that decodes and executes the Advanced SIMD (single-instruction multiple data) media processing architecture

Program status register (PSR)

Big-endian and little-endian formats
[Figure: byte layout of a word in big-endian and in little-endian format.]
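
Since the figure for this slide is not available in the transcript, the difference between the two formats can be shown with a short experiment using `java.nio.ByteBuffer`. The class `EndianDemo` and the sample value `0x12345678` are choices made for this sketch.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class EndianDemo {
    // Returns the four bytes of value in the order they would sit in memory,
    // lowest address first, for the given byte order.
    static byte[] layout(int value, ByteOrder order) {
        return ByteBuffer.allocate(4).order(order).putInt(value).array();
    }

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) sb.append(String.format("%02x ", b));
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(hex(layout(0x12345678, ByteOrder.BIG_ENDIAN)));    // 12 34 56 78
        System.out.println(hex(layout(0x12345678, ByteOrder.LITTLE_ENDIAN))); // 78 56 34 12
    }
}
```

Big-endian stores the most significant byte at the lowest address; little-endian stores the least significant byte there.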

Registers in Cortex A8

Jazelle mode in ARM
• Jazelle mode is entered via the BXJ instruction
  • T bit is 0 and J bit is 1
• Following an entry into the Jazelle state, bytecodes can be processed in one of three ways
  • Decoded and executed in hardware (Coprocessor 14)
  • Handled in software (ARM/ThumbEE JVM code)
  • Treated as an invalid/illegal opcode. This causes a branch to an ARM exception mode

Processor operating states
• ARM state: 32-bit, word-aligned ARM instructions are executed in this state
  • T bit is 0 and J bit is 0
• Thumb state: 16-bit and 32-bit, halfword-aligned Thumb-2 instructions
  • T bit is 1 and J bit is 0
• ThumbEE state: 16-bit and 32-bit, halfword-aligned variant of the Thumb-2 instruction set
  • T bit is 1 and J bit is 1
  • Designed as a target for dynamically generated code
  • This is code compiled on the device either shortly before or during execution from a portable bytecode
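
The T and J bit combinations listed for the operating states, together with the Jazelle combination (T bit 0, J bit 1), form a two-bit code that can be decoded as sketched below. The class `ArmState` and the method `decode` are hypothetical. The bit positions used, bit 5 for T and bit 24 for J in the program status register, follow the ARM architecture and are not stated on the slides.

```java
class ArmState {
    static final int T_BIT = 1 << 5;  // Thumb bit of the status register
    static final int J_BIT = 1 << 24; // Jazelle bit of the status register

    // Maps the J and T bits of a status register value to the state name.
    static String decode(int psr) {
        boolean t = (psr & T_BIT) != 0;
        boolean j = (psr & J_BIT) != 0;
        if (!j && !t) return "ARM";
        if (!j) return "Thumb";    // J = 0, T = 1
        if (!t) return "Jazelle";  // J = 1, T = 0
        return "ThumbEE";          // J = 1, T = 1
    }
}
```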

Thumb instruction format
Example: ADD Rd, #Constant
Thumb code (bit 15 to bit 0): 001 (major op-code) | 10 (minor op-code) | Rd (3 bits, destination & source register) | 8-bit immediate
ARM code (bit 31 to bit 0): 1110 (always condition code) | 00 | 1 (immediate value) | 0100 | 1 (S) | 0 Rd | 0 Rd | 0000 | 8-bit immediate
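
The two field layouts can be checked by assembling the bits by hand. The sketch below follows the field order given for the example instruction; the class `ThumbEncoding` and its method names are hypothetical, and the register is limited to 0..7 because the Thumb form has only a 3-bit Rd field.

```java
class ThumbEncoding {
    // Thumb: 001 | 10 | Rd (3 bits) | imm8
    static int thumbAddImm(int rd, int imm8) {
        check(rd, imm8);
        return (0b001 << 13) | (0b10 << 11) | (rd << 8) | imm8;
    }

    // ARM: 1110 | 00 | 1 | 0100 | 1 | 0Rd | 0Rd | 0000 | imm8
    static int armAddImm(int rd, int imm8) {
        check(rd, imm8);
        return (0b1110 << 28) | (1 << 25) | (0b0100 << 21) | (1 << 20)
                | (rd << 16) | (rd << 12) | imm8;
    }

    private static void check(int rd, int imm8) {
        if (rd < 0 || rd > 7 || imm8 < 0 || imm8 > 255)
            throw new IllegalArgumentException("rd must be 0..7 and imm8 0..255");
    }
}
```

For Rd = 0 and a constant of 1, `thumbAddImm` yields 0x3001 and `armAddImm` yields 0xE2900001: the same operation in 16 bits instead of 32.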

Thumb Execution Environment (Thumb-EE) instruction set
• In ThumbEE state, the processor executes almost the same instruction set as in Thumb state.
• Some ThumbEE instructions are added
  • New ThumbEE instructions to branch to handlers
  • Null pointer checking on load/store instructions
  • Additional instruction to check array bounds
  • Some other modifications to load, store, and control flow instructions

Co-processor in ARM
• Co-processor
  • Added function unit that is called by instruction
  • Floating-point units are often structured as co-processors
• ARM allows up to 16 designer-selected co-processors

Co-processor in ARM (cont.)
[Figure: Memory system, ARM core, and Co-processor, with the labels CPDRIVE and "Instruction stream from memory".]
• MRC: Move to ARM register from the coprocessor
• MCR: Move to the coprocessor from ARM register
• LDC: Load memory data to the coprocessor
• STC: Store data from coprocessor to memory

Co-processor in ARM (cont.)
A typical coprocessor contains:
• An instruction pipeline
• Instruction decoding logic
• Handshake logic
• A Register bank
• Special processing logic

Co-processor in ARM (cont.)
[Figure: the ARM core connected to Coprocessor 1, Coprocessor 2, … Coprocessor n through the handshake signals nCPI (NOT coprocessor instruction), CPA (Coprocessor absent), and CPB (Coprocessor busy).]

Co-processor in ARM (cont.)
ARM processor:
• Evaluates the instruction to determine whether it is to be executed by a coprocessor.
• Communicates with the coprocessor using nCPI.
• Generates the address required by the instruction.
• Takes the undefined instruction trap if no coprocessor accepts the instruction.
Coprocessor:
• Decodes the instruction to determine whether to accept it.
• Indicates the response status by CPA and CPB.
• Fetches values required from its own register bank.
• Performs the operation required by the instruction.

Co-processor in ARM (cont.)
CPA  CPB  Response             Remarks
0    0    Coprocessor present  (1) The coprocessor (CP) can accept and execute an instruction immediately. (2) ARM ignores this instruction and executes the next instruction.
0    1    Coprocessor busy     (1) The CP can accept an instruction but is currently unable to process it; it can stall ARM by asserting busy-wait. (2) When the CP is ready to start executing, it drives CPB LOW.
1    0    Invalid response
1    1    Coprocessor absent   (1) The CP cannot accept an instruction. (2) ARM takes the undefined instruction trap.
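
The handshake table can be written down as a small decoding function. This is only a restatement of the table in code; the class `CoprocessorHandshake`, the enum `Response`, and the use of `true` for a signal value of 1 are choices made for this sketch.

```java
class CoprocessorHandshake {
    enum Response { PRESENT, BUSY, INVALID, ABSENT }

    // Maps the two handshake lines (true = 1) to the response in the table.
    static Response decode(boolean cpa, boolean cpb) {
        if (!cpa) return cpb ? Response.BUSY : Response.PRESENT;
        return cpb ? Response.ABSENT : Response.INVALID;
    }
}
```

With this mapping, `decode(true, true)` corresponds to the case in which ARM takes the undefined instruction trap.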

References
• T. Noergaard, Embedded Systems Architecture, Elsevier, 2005.
• Wikipedia, the free encyclopedia.