A SIMULATION MODEL FOR STREAMING APPLICATIONS OVER A
POWER-MANAGEABLE WIRELESS LINK
Andrea Acquaviva
Emanuele Lattanzi
Alessandro Bogliolo
ISTI - University of Urbino
Piazza della Repubblica 13
61029 Urbino, Italy
E-mail: {acquaviva,lattanzi,bogliolo}@sti.uniurb.it
Luca Benini
DEIS - University of Bologna
Viale Risorgimento 2
Bologna, 40136, Italy
E-mail: lbenini@deis.unibo.it
KEYWORDS
Multimedia, Event-Oriented, Resource Management, Real-Time, Simulators.
ABSTRACT
In this work we introduce a hardware-validated simulation model for the exploration of real-time multimedia systems, where system components are modeled as interacting generalized semi-Markov processes (GSMPs). We apply the simulation model to explore the design space of a mobile client accessing streaming data through a wireless network. The model has been characterized and validated against power and performance measurements performed on an instrumented HP iPAQ with wireless LAN running an MPEG4 video application. We analyze the impact of the tuning parameters of the real-time multimedia system (buffer sizes, channel bandwidth, power management policy) on the trade-off between power consumption and QoS.
INTRODUCTION
One of the most critical challenges in designing wireless multimedia systems is to provide adequate quality of service (QoS) with optimal energy efficiency. In many cases, there are obvious trade-offs between quality of service (e.g., bandwidth, latency) and power consumption. To avoid poorly controlled QoS degradation, multimedia systems are often designed and managed in a conservative fashion, with little consideration for energy efficiency. QoS modeling of wireless networked systems (Cali et al., 1998; Eshghi & Ekhakeem, 1998) is a mature and highly active discipline (see (MSWIM, 2002) for an up-to-date overview of the topic), and various approaches have been explored to enhance QoS-oriented models with power consumption models (Raghunathan et al., 2002; Sinha & Chandrakasan, 2001; Marculescu et al., 2001; Krashinski & Balakrishnan, 2002; Zorzi & Chockalingam, 1998). Most of the approaches explored in the past adopted a stochastic discrete event model (Lee & Sangiovanni-Vincentelli, 1998), where the system evolves in an enumerable set of time instants and transitions are randomized.
Our work starts from this widely adopted modeling framework and pushes it one step closer to practice. We present two original contributions. First, consistently with previous work, we developed a detailed stochastic discrete-event model of a complex multimedia system. Our model includes all hardware and software components involved in the communication of multimedia data over a wireless channel: application software, transport and network stack, operating system drivers, power management software, network interface card hardware, wireless channel, and base station hardware and software. Many of the model components are parameterized, in an effort to represent a large design space of different hardware and software configurations. Second, we fully characterized the power and performance metrics in the model with experimental measurements on fully operational real-life hardware and software. As a result, our model is detailed enough to represent a number of complex effects that impact power and quality of service in real-life multimedia systems, while our characterization flow is precise enough to obtain power and performance estimates that closely match the measured data. Our results demonstrate that the performance and energy consumption of a multimedia system are deeply influenced by a number of interacting hardware and software components, and that focusing on only one part of the system (e.g., the wireless channel or the operating system) could lead to serious design pitfalls.
MODELING POWER-MANAGEABLE REAL-TIME MULTIMEDIA SYSTEMS
At a high level of abstraction, electronic systems exhibit different operating modes and make transitions among them at discrete points in time. Hence, they can be suitably represented as discrete event systems (DES) (Cassandras, 1993). We model real-time power-manageable systems as DES composed of interacting state machines. Each component has a state structure that represents its operating modes and the transitions among them. State transitions are triggered by events that can be either generated within the component according to some distribution (internal events) or received as input (external events). The state space of each component may be finite, discrete, or continuous; infinite state spaces are modeled by means of a finite number of parameterized states. The evolution of the system is described by specifying next-event and next-state functions for each component. The next-event function determines the next triggering event generated by the component, while the next-state function determines the destination state of the next transition. We model each component as a generalized semi-Markov process (GSMP) with non-deterministic next-event and next-state functions based on conditional residual-time distributions and on conditional next-state probabilities (Glynn, 1989). Each GSMP component is composed of a state structure and a clock structure. The state structure is a Stateflow model that takes as input both internal events (generated by the local clock structure) and external events (coming from an input port) and generates a timeout value corresponding to the residual time of the next triggering event. The clock structure takes as input a timeout and a reset signal and generates an event when the timeout has elapsed. The interface of the component is specified by means of input/output ports, used to exchange both events and parameter values. The interaction among multiple GSMP components is obtained simply by connecting their input/output ports, as shown in Figure 1. Additional output signals can be used to observe the system behavior, to evaluate cost/performance metrics, and to exchange data among modules.
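The split between state structure and clock structure can be illustrated with a small Python sketch. This is not the Simulink/Stateflow implementation used in the paper: the class name `GsmpComponent`, the function `simulate`, the two-state example and its timing values (in ms) are all assumptions introduced here for illustration. The example uses deterministic residual times so that the trace is reproducible, although the callbacks receive a random generator and could sample from any distribution.

```python
import random

class GsmpComponent:
    """Toy GSMP component: a state structure plus a clock structure (hypothetical names)."""

    def __init__(self, initial_state, residual_time, next_state, seed=0):
        self.state = initial_state
        self.residual_time = residual_time  # (state, rng) -> delay of next internal event, or None
        self.next_state = next_state        # (state, event, rng) -> destination state
        self.rng = random.Random(seed)
        self.timeout = None                 # absolute time of the next internal event
        self.arm_clock(0.0)

    def arm_clock(self, now):
        # Clock structure: reset the timeout according to the current state.
        delay = self.residual_time(self.state, self.rng)
        self.timeout = None if delay is None else now + delay

    def fire(self, now, event):
        # State structure: apply the next-state function, then re-arm the clock.
        self.state = self.next_state(self.state, event, self.rng)
        self.arm_clock(now)

def simulate(component, external_events, horizon):
    """Run one component; external_events is a time-sorted list of (time, name)."""
    trace = [(0.0, component.state)]
    pending = list(external_events)
    while True:
        t_int = component.timeout if component.timeout is not None else float("inf")
        t_ext = pending[0][0] if pending else float("inf")
        now = min(t_int, t_ext)
        if now > horizon:
            return trace
        event = pending.pop(0)[1] if t_ext <= t_int else "timeout"
        component.fire(now, event)
        trace.append((now, component.state))

# Assumed example: a receiver that waits for packets and takes 2 ms to receive one.
def residual_time(state, rng):
    return 2.0 if state == "receiving" else None  # no internal event while waiting

def next_state(state, event, rng):
    if state == "waiting" and event == "packet":
        return "receiving"
    if state == "receiving" and event == "timeout":
        return "waiting"
    return state  # any other event is ignored

receiver = GsmpComponent("waiting", residual_time, next_state)
trace = simulate(receiver, [(10.0, "packet"), (20.0, "packet")], horizon=30.0)
print(trace)
```

Connecting several such components would amount to feeding the events emitted by one component into the external event list of another, which is the role played by the input/output ports in the block diagram.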
Figure 2. State diagram of the power-manageable NIC.
STREAMING VIDEO OVER AN IEEE 802.11b LINK
The video server runs on a notebook computer directly connected to an access point (AP) through a dedicated wired link. The AP acts as a bridge to the wireless network interface card (NIC) installed on an HP iPAQ palmtop. The palmtop runs the client MPEG4 decoder, which reads, buffers, processes and plays back at a constant rate (consumer rate) the video information read from the network. To meet real-time video constraints, the client application buffer should never be empty when the decoder looks for a frame to decode; if it is, a deadline miss occurs. Without any power control, the shape of the traffic on the wireless channel reflects the server transmission rate. If the 802.11b MAC-level power management protocol (LAN/MAN Standards Committee of the IEEE Computer Society, 1999) is enabled, the AP performs traffic reshaping by buffering incoming packets to allow the NIC to sleep for a given period of time, during which the wireless interface saves energy. When the sleeping period expires, the card wakes up and asks the AP whether there are accumulated packets to be received. If this is the case, the AP sends the backlog to the card. The traffic reshaping performed by the AP imposes buffering at both the producer and the consumer side.
Model
The block diagram of the Simulink model of the system is
shown in Figure 1.
Producer. The producer generates output events (representing network packets) according either to a given distribution of inter-arrival times, or to a trace of time-stamped packet
information. For each packet, three properties are either randomly generated or read from the trace: the size (in bytes),
the frame it belongs to, and the total number of packets representing the same frame. Packets belonging to the same frame
are generated as a burst, while packets belonging to different
frames are generated at a rate depending on the application.
Packet information is made available at the output ports, together with the event that represents the generation of a new
packet.
Base station buffer. Packet events generated by the producer become input events for the buffer of the base station,
which is explicitly represented in the model as a limited FIFO
queue with a customizable size. The content of the queue is
saved in memory as an array of packets, each with the corresponding information (size, frame number, packets per frame).
If the producer tries to send a packet and the queue is full,
a lost event is generated and the corresponding packet is discarded.
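The behavior of the base station buffer can be sketched as a bounded FIFO queue. The following Python fragment is only an illustration under assumed names (`Packet`, `BaseStationBuffer`, `push`, `pop`); it is not the Simulink block, and the capacity of 2 packets is chosen only to make the overflow visible.

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class Packet:
    size: int              # packet size in bytes
    frame: int             # frame the packet belongs to
    packets_in_frame: int  # total number of packets of that frame

class BaseStationBuffer:
    """Limited FIFO queue with a customizable size, counted in packets."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.queue = deque()
        self.lost = 0  # number of lost events generated so far

    def push(self, packet):
        """Queue a packet; return False (lost event) if the queue is full."""
        if len(self.queue) >= self.capacity:
            self.lost += 1
            return False
        self.queue.append(packet)
        return True

    def pop(self):
        """Return the oldest packet, or None if the queue is empty."""
        return self.queue.popleft() if self.queue else None

# A burst of 3 packets of the same frame hits a 2-packet buffer: the third is discarded.
buf = BaseStationBuffer(capacity=2)
results = [buf.push(Packet(size=1000, frame=0, packets_in_frame=3)) for _ in range(3)]
print(results, buf.lost)
```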
Base station. The base station (AP) gets the packets from the queue and sends them to the wireless channel. If the input buffer is empty or the receiver is not ready, the AP goes to a waiting state. If the receiver is sleeping because of dynamic power management (DPM), the AP goes to an idle state, causing incoming packets to accumulate in the input buffer.
Wireless Channel. The wireless channel is represented by
a block that receives input events representing incoming packets and generates, with a given latency, output events representing packet delivery. The wireless channel is bidirectional
and it has a user-defined packet-loss probability. Lost packets are not delivered to the receiver. Notice that we use a
simple channel model since we are interested in modeling the
entire wireless system, rather than the wireless channel by itself. Channel latency, loss probability and bandwidth (implicitly modeled by the receiver) are sufficient to perform realistic system-level simulations. Nevertheless, any channel model
(Chong, 2003) can be easily embedded in this block to take
into account complex error statistics.
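A channel block of this kind reduces to a delay plus a random drop. The sketch below is a minimal Python illustration, assuming a hypothetical function name `channel`, independent losses per packet, and time values chosen arbitrarily; any richer error model could replace the single `rng.random()` test.

```python
import random

def channel(events, latency, loss_probability, rng):
    """events: list of (send_time, packet). Returns the delivered (time, packet) pairs."""
    delivered = []
    for send_time, packet in events:
        if rng.random() < loss_probability:
            continue  # lost packets are never delivered to the receiver
        delivered.append((send_time + latency, packet))
    return delivered

rng = random.Random(1)
sent = [(0.0, "a"), (2.0, "b")]
ideal = channel(sent, latency=1.5, loss_probability=0.0, rng=rng)
blocked = channel(sent, latency=1.5, loss_probability=1.0, rng=rng)
print(ideal, blocked)
```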
Wireless Network Card. This is the most important block of the system, since its power consumption is critical for the battery lifetime of the palmtop. The state diagram is reported in Figure 2. It consists of 5 states: idle, waking up, waiting, receiving, and acknowledge. The card is normally waiting. When an incoming packet is detected, the card goes into the receiving state and generates a busy output event to indicate that it cannot receive more packets until the current one has been processed. The card remains in the receiving state for a time interval that is dynamically computed as the ratio between the size of the packet (including the payload and the protocol overhead) and the bandwidth of the channel. After each packet has been completely received, the NIC takes some time to process the packet at the MAC level and then sends an acknowledge back to the AP through the wireless channel. After the acknowledge packet has been sent, the NIC goes back to the waiting state until the next packet arrives. The timing diagram of the wireless channel is shown in Figure 4. When power management is enabled, the card can go from the waiting state to a low-power idle state. In our model, this transition is triggered by an input event generated by an external power manager, described later in this section. Wake-up from idle is triggered by the power manager when a given sleep time has elapsed. Wake-up transitions may take a non-negligible amount of time and power, and are modeled by a separate state.
Power Manager. The power manager does not represent a hardware unit, but rather models the actual implementation of the power management protocol of the 802.11b standard (LAN/MAN Standards Committee of the IEEE Computer Society, 1999). Based on external settings (that can be decided by the user), the power manager generates output events (ShutDown and WakeUp) to notify the beginning and the end of sleeping periods. After the card has been woken up, it starts to receive the packets accumulated by the AP. After each received packet, the power manager resets a timeout counter. If the timeout expires before the reception of a new packet, the card is put back into the idle state by means of a ShutDown event.
Output Buffer. This block represents the buffering performed by the consumer. It can be used to model either the protocol stack buffer or the application buffer (if present). More levels of buffering can be added. In our case we decided to represent only the UDP protocol buffer, since the application buffer is usually larger and less critical.
Consumer. The consumer simulates a streaming application that reads packets from the output buffer at a given rate. Since real-time constraints impose a constant frame rate, but each frame may be encoded using a different number of packets, packet requests from the consumer do not arrive at a constant rate. Rather, the consumer decides how many packets to read within a frame period based on the information associated with the incoming packets. Frames that either arrive late with respect to the deadline or are incomplete because of a packet loss are discarded by the consumer.
Figure 1. Simulink model of the streaming application over a wireless channel.
EXPERIMENTAL RESULTS
To perform our characterization experiments, we used an HP iPAQ palmtop computer with two different wireless network interface cards, CISCO Aironet 350 Series (Cisco System, Cisco Aironet 350 Series Wireless LAN Adapters, 2003) and COMPAQ WL110 (HP WL110, 2003), hereafter denoted by CISCO and COMPAQ, respectively. The power consumption of the NICs was measured using a Sycard Card Extender that allowed us to monitor the time behavior of the supply current drawn by the cards.
Figure 3. Current profiles obtained under different workload and DPM conditions. [Panels for CISCO and COMPAQ: Idle - PM ON, Receiving - PM ON, Receiving - PM OFF; horizontal axis: time (s).]
Characterization
We performed three sets of experiments to characterize: i)
the power states of each NIC, ii) the effective bandwidth of the
wireless channel and iii) the buffer size of the base station.
Power States. The power states of the NIC were characterized in terms of power consumption and transition time by looking at the power profiles under different workload conditions and DPM settings. Typical results are shown in Figure 3. The power consumption of CISCO is higher than that of COMPAQ in all states (sleep, idle, receiving, transmitting) except wake-up. Moreover, unlike COMPAQ, CISCO has a non-negligible wake-up time. As for performance, COMPAQ reacts more slowly to incoming packets, as shown in the right-most graphs of Figure 3, obtained while receiving the same burst of packets. Both cards stay idle for a very short time between packets; reception takes most of the time, while the higher peaks correspond to the acknowledge transmitted upon reception and processing of each packet. The acknowledge peaks produced by COMPAQ look delayed and wider than those produced by CISCO, which reduces the number of packets received per unit of time (e.g., in the 20 ms time window shown in Figure 3, COMPAQ receives 14 packets while CISCO receives 15). The measured average power consumption is reported in Table 1 for all power states of the two NICs. The wake-up time of CISCO is 12 ms, while that of COMPAQ is neglected.
State     CISCO   COMPAQ
Wait      495     375
Rx        650     450
Tx        870     750
Sleep     100     35
WakeUp    325     -
Table 1. Power consumption of the NICs in mW
Figure 4. Timing diagram of the wireless channel. [Overall packet time along the time axis t, with segments labeled overhead, payload, latency, AckTime and APtime.]
Figure 5. Transmission time as a function of the frame size. [Horizontal axis: frame size (bytes), 0 to 8000; vertical axis: transmission time (ms), 0 to 10; series: CISCO (measured), COMPAQ (measured), fitting model; the intercept is marked as packet overhead.]
Channel bandwidth. The effective bandwidth provided by the wireless link depends on how traffic is fragmented into packets. In fact, each packet has a two-fold overhead: the headers and tails introduced by the protocol stack, and the acknowledge time. To characterize the effective bandwidth taking packet overheads into account, we measured the total time required to send 50 UDP frames of a given size from the laptop to the palmtop PC. The number of frames transferred was chosen small enough to avoid saturating the AP buffer. Figure 5 shows the transmission time per frame as a function of the frame size. Steps occurring every 1500 bytes are due to fragmentation. In fact, 1500 bytes is the maximum payload of the Ethernet packets transferred from the laptop to the base station. The actual transmission time can be expressed as:
$$T = \frac{\mathit{payload} + 100}{\mathit{Bw}} + L + \mathit{AckTime} \qquad (1)$$
where 100 is the number of protocol bytes per packet, L is the channel latency, and AckTime is the overall reaction time of the NIC. Notice that both AckTime and the channel latency are due to the acknowledge needed after each packet, as shown in Figure 4. Using Equation 1 as a fitting model for the experimental results, we can obtain indirect measures of the fitting parameters Bw (the channel bandwidth) and L. Fitting curves are plotted in Figure 5. Finally, the channel latency depends on the distance between the base station and the NIC. For our experiments we used a fixed distance of 50 cm, providing an ideal channel quality, in order to evaluate the frame loss due to DPM only.
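The stepwise shape of the transmission time can be reproduced with a few lines of Python. This is a sketch under explicit assumptions: the per-packet time is taken to be (payload + 100)/Bw + L + AckTime, a frame larger than 1500 bytes is assumed to pay this overhead once per fragment, and the numeric values of `bw`, `latency` and `ack_time` below are hypothetical, not the fitted values of the paper. The function names are likewise introduced here.

```python
import math

MAX_PAYLOAD = 1500    # bytes, maximum Ethernet payload (from the text)
PROTOCOL_BYTES = 100  # protocol bytes per packet (from the text)

def frame_time(frame_size, bw, latency, ack_time):
    """Transmission time (ms) of one frame; bw in bytes/ms, times in ms."""
    n_packets = math.ceil(frame_size / MAX_PAYLOAD)  # fragmentation
    payload_time = (frame_size + PROTOCOL_BYTES * n_packets) / bw
    return payload_time + n_packets * (latency + ack_time)

def effective_bandwidth(frame_size, bw, latency, ack_time):
    """Useful bytes delivered per ms once packet overheads are accounted for."""
    return frame_size / frame_time(frame_size, bw, latency, ack_time)

# Hypothetical parameters, chosen only to show the trend.
params = dict(bw=500.0, latency=0.1, ack_time=0.4)
for size in (100, 1500, 1501, 3000):
    print(size, round(frame_time(size, **params), 3),
          round(effective_bandwidth(size, **params), 1))
```

With these assumed parameters, crossing the 1500-byte boundary adds a full extra per-packet overhead, and small frames see a much lower effective bandwidth than frames that fill a packet, which matches the qualitative behavior described in the text.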
Buffer size. While the size of the UDP and application buffers on the palmtop can be set by the user, the size of the buffer used by the base station is constant. We characterized the behavior of the internal buffer by sending bursts of packets across the wireless channel and measuring the packet loss as a function of the packet size and number. For a given packet size, we call limiting burst size the maximum number of packets in a burst delivered without any loss. Interestingly, the limiting burst size grows with the packet size: a few packets are lost if more than 100 packets of size 10 bytes are sent across the wireless channel, while up to 500 packets of size 1000 bytes can be sent without packet loss. The reason for this counter-intuitive behavior is two-fold. First, the internal buffer of the base station is organized in packets, so that it can contain a given number of packets regardless of their size. Second, according to the results outlined in the previous section, the larger the packets, the larger the effective bandwidth provided by the wireless channel. Since the internal buffer of the base station saturates because input data (provided by the laptop) arrive faster than output data (sent across the wireless link) leave, the higher the output bit rate, the longer the time required to saturate the buffer. Experimental results indicate that the base station can buffer up to 100 packets.
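The relation between limiting burst size and output rate can be made concrete with a simple fluid approximation. This is an assumption introduced here, not the authors' analysis: packets are taken to arrive at a constant rate `rate_in` and leave at a constant rate `rate_out` (both in packets per unit time), so a burst of N packets reaches a peak occupancy of N(1 - rate_out/rate_in). The function name and the rates used below are hypothetical.

```python
import math

def limiting_burst(capacity, rate_in, rate_out):
    """Largest burst (in packets) that fits a buffer of `capacity` packets.

    Fluid approximation: the burst overflows when N * (1 - rate_out / rate_in)
    exceeds the capacity.
    """
    if rate_out >= rate_in:
        return math.inf  # the buffer never fills up
    return capacity / (1.0 - rate_out / rate_in)

# With the 100-packet capacity reported in the text, an output rate that is
# negligible with respect to the input rate gives a limit close to 100 packets,
# while an output/input ratio of 0.8 (hypothetical) gives a limit close to 500.
print(limiting_burst(100, rate_in=1000.0, rate_out=1.0))
print(limiting_burst(100, rate_in=1000.0, rate_out=800.0))
```

Under this approximation, the two measured limits (about 100 packets for 10-byte packets, about 500 for 1000-byte packets) would correspond to very different output-to-input rate ratios, which is consistent with the explanation that larger packets enjoy a larger effective bandwidth.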
Model validation
We validated our model by comparing simulation results with measurements obtained by running two MPEG4 benchmarks: conference and fireworks. The first one has a frame rate of 15 frames/sec and is composed of 899 frames. The second has a frame rate of 30 frames/sec and is composed of 522 frames. Each benchmark was run and simulated using both NICs, with all available DPM configurations, namely, COMPAQ PM OFF, COMPAQ PM ON 200ms, COMPAQ PM ON 100ms, CISCO PM OFF, CISCO PM ON 200ms. In order to make simulation results directly comparable with measurements, we implemented a trace-based producer generating packets according to a time-stamped trace collected while running the streaming application on the laptop. Experimental results reported in Table 2 show that the average power consumption provided by the simulation model was always within 4% of the measurements.
                         CONFERENCE          FIREWORKS
NIC      DPM setting     meas      sim       meas      sim
COMPAQ   PM OFF          380.31    377.1     383.74    382.7
COMPAQ   PM ON 200ms     56.48     55.5      143.21    139.5
COMPAQ   PM ON 100ms     61.66     63.2      146.77    141.9
CISCO    PM OFF          497.11    501       502.45    503.6
CISCO    PM ON 200ms     129.85    133.5     197.3     201.5
Table 2. Measured (meas) and simulated (sim) average power consumption of the NICs in mW
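The 4% bound can be checked directly from the values of Table 2. The short Python fragment below is only a worked verification; the dictionary layout and variable names are introduced here, and the entry printed as "377,1" in the source is read as 377.1.

```python
# (NIC, DPM setting, benchmark) -> (measured mW, simulated mW), from Table 2
TABLE2 = {
    ("COMPAQ", "PM OFF", "conference"): (380.31, 377.1),
    ("COMPAQ", "PM ON 200ms", "conference"): (56.48, 55.5),
    ("COMPAQ", "PM ON 100ms", "conference"): (61.66, 63.2),
    ("CISCO", "PM OFF", "conference"): (497.11, 501.0),
    ("CISCO", "PM ON 200ms", "conference"): (129.85, 133.5),
    ("COMPAQ", "PM OFF", "fireworks"): (383.74, 382.7),
    ("COMPAQ", "PM ON 200ms", "fireworks"): (143.21, 139.5),
    ("COMPAQ", "PM ON 100ms", "fireworks"): (146.77, 141.9),
    ("CISCO", "PM OFF", "fireworks"): (502.45, 503.6),
    ("CISCO", "PM ON 200ms", "fireworks"): (197.3, 201.5),
}

# Relative error of the simulation with respect to the measurement, in percent.
errors = {key: abs(sim - meas) / meas * 100.0 for key, (meas, sim) in TABLE2.items()}
worst_case = max(errors, key=errors.get)
print(worst_case, round(errors[worst_case], 1))
```

The largest deviation is found for COMPAQ with PM ON 100ms on the fireworks benchmark, and it stays below the 4% bound stated in the text.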
PM       period (ms)   latency (ms)   avg pw (mW)   AP buf (pac)   fr. lost (perc.)
PM OFF   -             100            377.1         1              0%
PM ON    200           100            55.5          4              18%
PM ON    200           200            55.5          4              0%
PM ON    300           100            52.17         5              47%
PM ON    300           200            52.17         5              11%
PM ON    300           300            52.17         7              0%
PM ON    400           200            52.17         7              52%
PM ON    400           300            48.51         7              8%
PM ON    400           400            48.51         7              0%
PM ON    500           200            47.28         9              61%
PM ON    500           300            47.28         9              43%
PM ON    500           400            47.28         9              7%
PM ON    500           500            47.28         9              0%
PM ON    1000          800            42.83         17             29%
PM ON    1000          1000           42.83         17             0%
Table 3. Exploration for different PM configuration and latency values
Design space exploration
The simulation model was used to evaluate the impact of DPM settings not yet supported by the real NICs. In particular, we performed experiments on the conference benchmark for different durations of the sleeping period and for different values of the consumer latency. Results are reported in Table 3. For a given sleeping period, we repeated the experiments by changing the initial latency of the consumer. Increasing the sleeping period decreases the average power consumption at the cost of a higher frame loss. However, the frame loss can be reduced by increasing the initial latency, allowing the consumer to buffer more packets before starting playback. Since the frame consumption rate is constant, the initial buffering allows the consumer to compensate for the long inter-arrival times imposed by the power management policy. In fact, as shown in the table, for a given sleep duration the frame loss decreases as the latency increases. In particular, the frame loss is zero whenever the initial latency is greater than or equal to the sleep period, since in this case the application buffer is never fully depleted because of DPM. In practice, client-side buffering provides the opportunity to save power without impairing quality of service, at the cost of an initial latency. The limiting sleep time (and latency) depends on the size of the application buffer. For our case study, power consumption could be reduced to 42.83 mW without violating real-time constraints, with a sleep period and a latency of 1 s and a maximum occupancy of the application buffer of 17 packets. This is about 22% less than the 55.5 mW obtained with the longest sleep period supported by the real NIC (200 ms).
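The selection described above can be replayed on the PM ON rows of Table 3. The Python fragment below is a worked check, not part of the original tool flow; the tuple layout and variable names are introduced here.

```python
# (sleep period ms, latency ms, avg power mW, AP buffer packets, frames lost %)
TABLE3_PM_ON = [
    (200, 100, 55.5, 4, 18), (200, 200, 55.5, 4, 0),
    (300, 100, 52.17, 5, 47), (300, 200, 52.17, 5, 11), (300, 300, 52.17, 7, 0),
    (400, 200, 52.17, 7, 52), (400, 300, 48.51, 7, 8), (400, 400, 48.51, 7, 0),
    (500, 200, 47.28, 9, 61), (500, 300, 47.28, 9, 43),
    (500, 400, 47.28, 9, 7), (500, 500, 47.28, 9, 0),
    (1000, 800, 42.83, 17, 29), (1000, 1000, 42.83, 17, 0),
]

# Configurations that meet the real-time constraints (no frame lost).
lossless = [row for row in TABLE3_PM_ON if row[4] == 0]

# In this table, frame loss is zero exactly when latency >= sleep period.
rule_holds = all((row[1] >= row[0]) == (row[4] == 0) for row in TABLE3_PM_ON)

best = min(lossless, key=lambda row: row[2])
baseline = next(row for row in lossless if row[0] == 200)  # longest period of the real NIC
reduction = (baseline[2] - best[2]) / baseline[2] * 100.0
print(best, rule_holds, round(reduction, 1))
```

The lowest-power lossless configuration is the one with a 1000 ms sleep period and a 1000 ms latency, and its power is about 22.8% lower than the 55.5 mW of the 200 ms configuration, which the text reports as 22%.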
References
Cali, F., Conti, M. & Gregori, E. (1998) IEEE 802.11 wireless LAN: capacity analysis and protocol enhancement. INFOCOM, 142–149.
Cassandras, C. G. (1993) Discrete event systems: modeling and performance analysis. Aksen.
Chong, C. C. (2003) A new statistical wideband spatiotemporal channel model for 5-GHz band WLAN systems. IEEE Transactions on Selected Areas in Communications, 21 (2), 139–150.
Cisco System, Cisco Aironet 350 Series Wireless LAN Adapters (2003). http://www.cisco.com/univercd/cc/td/doc/product/wireless/airo 350/350cards/index.htm.
Eshghi, F. & Ekhakeem, A. (1998) Performance analysis of ad hoc wireless LANs for real-time traffic. IEEE Journal on Selected Areas in Communications, 21 (2).
Glynn, P. W. (1989) A GSMP formalism for discrete event systems. IEEE Proceedings, 77 (1), 14–23.
HP WL110 (2003). http://h18004.www1.hp.com/products/wireless/wlan/wl110.html.
Krashinski, R. & Balakrishnan, H. (2002) Minimizing energy for wireless web access with bounded slowdown. Proceedings of MOBICOM.
LAN/MAN Standards Committee of the IEEE Computer Society (1999). Part 11: Wireless LAN MAC and PHY Specifications: Higher-Speed Physical Layer Extension in the 2.4 GHz Band.
Lee, E. A. & Sangiovanni-Vincentelli, A. (1998) A framework for comparing models of computation. IEEE Transactions on CAD ICAS, 17 (12), 1217–1229.
Marculescu, R., Nandi, A., Lavagno, L. & Sangiovanni-Vincentelli, A. (2001) System-level power/performance analysis of portable multimedia systems communicating over wireless channel. Proceedings of ICCAD, 207–214.
MSWIM (2002) ACM International Workshop on Modeling Analysis and Simulation of Wireless and Mobile Systems.
Raghunathan, V., Schurgers, C., Park, S. & Srivastava, M. (2002) Energy-aware wireless microsensor networks. IEEE Signal Processing Magazine, 19 (2), 40–50.
Sinha, A. & Chandrakasan, A. (2001) Dynamic power management in wireless sensor networks. IEEE Design and Test of Computers, 18 (2), 62–74.
Zorzi, M. & Chockalingam, A. (1998) Energy efficiency of media access protocols for mobile data networks. IEEE Transactions on Communications, 46 (11), 1418–1421.