CCC Publications
INTERNATIONAL JOURNAL
of
COMPUTERS, COMMUNICATIONS & CONTROL
ISSN 1841-9836
ISSN-L 1841-9836
A Bimonthly Journal
With Emphasis on the Integration of Three Technologies
Year: 2013
Volume: 8
Issue: 1 (February)
This journal is a member of, and subscribes to the principles of,
the Committee on Publication Ethics (COPE).
Agora University Editing House
CCC Publications
http://univagora.ro/jour/index.php/ijccc/
International Journal of Computers, Communications & Control
EDITOR IN CHIEF:
Florin-Gheorghe Filip
Member of the Romanian Academy
Romanian Academy, 125, Calea Victoriei
010071 Bucharest-1, Romania, [email protected]
ASSOCIATE EDITOR IN CHIEF:
Ioan Dzitac
Aurel Vlaicu University of Arad, Romania
St. Elena Dragoi, 2, 310330 Arad, Romania
[email protected]
&
Agora University of Oradea, Romania
Piata Tineretului, 8, 410526 Oradea, Romania
[email protected]
EXECUTIVE EDITOR:
Răzvan Andonie
Central Washington University, USA
400 East University Way, Ellensburg, WA 98926, USA
[email protected]
MANAGING EDITOR:
Mişu-Jan Manolescu
Agora University of Oradea, Romania
Piata Tineretului, 8, 410526 Oradea
[email protected]
DEPUTY MANAGING EDITOR:
Horea Oros
University of Oradea, Romania
St. Universitatii 1, 410087, Oradea
[email protected]
TECHNICAL SECRETARY:
Cristian Dziţac
R & D Agora, Romania
[email protected]
Emma Valeanu
R & D Agora, Romania
[email protected]
EDITORIAL ADDRESS:
R&D Agora Ltd. / S.C. Cercetare Dezvoltare Agora S.R.L.
Piaţa Tineretului 8, Oradea, jud. Bihor, Romania, Zip Code 410526
Tel./ Fax: +40 359101032
E-mail: [email protected], [email protected], [email protected]
Journal website: http://univagora.ro/jour/index.php/ijccc/
International Journal of Computers, Communications & Control
EDITORIAL BOARD
Boldur E. Bărbat
Lucian Blaga University of Sibiu
Faculty of Engineering, Department of Research
5-7 Ion Raţiu St., 550012, Sibiu, Romania
[email protected]
Xiao-Shan Gao
Academy of Mathematics and System Sciences
Academia Sinica
Beijing 100080, China
[email protected]
Pierre Borne
Ecole Centrale de Lille
Cité Scientifique-BP 48
Villeneuve d’Ascq Cedex, F 59651, France
[email protected]
Kaoru Hirota
Hirota Lab. Dept. C.I. & S.S.
Tokyo Institute of Technology
G3-49,4259 Nagatsuta,Midori-ku,226-8502,Japan
[email protected]
Ioan Buciu
University of Oradea
Universitatii, 1, Oradea, Romania
[email protected]
George Metakides
University of Patras
University Campus
Patras 26 504, Greece
[email protected]
Hariton-Nicolae Costin
Faculty of Medical Bioengineering
Univ. of Medicine and Pharmacy, Iaşi
St. Universitatii No.16, 6600 Iaşi, Romania
[email protected]
Petre Dini
Cisco
170 West Tasman Drive
San Jose, CA 95134, USA
[email protected]
Ştefan I. Nitchi
Department of Economic Informatics
Babes Bolyai University, Cluj-Napoca, Romania
St. T. Mihali, Nr. 58-60, 400591, Cluj-Napoca
[email protected]
Shimon Y. Nof
School of Industrial Engineering
Purdue University
Grissom Hall, West Lafayette, IN 47907, U.S.A.
[email protected]
Antonio Di Nola
Dept. of Mathematics and Information Sciences
Università degli Studi di Salerno
Salerno, Via Ponte Don Melillo 84084 Fisciano,
Italy
[email protected]
Stephan Olariu
Department of Computer Science
Old Dominion University
Norfolk, VA 23529-0162, U.S.A.
[email protected]
Ömer Egecioglu
Department of Computer Science
University of California
Santa Barbara, CA 93106-5110, U.S.A
[email protected]
Horea Oros
Dept. of Mathematics and Computer Science
University of Oradea, Romania
St. Universitatii 1, 410087, Oradea, Romania
[email protected]
Constantin Gaindric
Institute of Mathematics of
Moldavian Academy of Sciences
Kishinev, 277028, Academiei 5, Moldova
[email protected]
Gheorghe Păun
Institute of Mathematics
of the Romanian Academy
Bucharest, PO Box 1-764, 70700, Romania
[email protected]
Mario de J. Pérez Jiménez
Dept. of CS and Artificial Intelligence
University of Seville, Sevilla,
Avda. Reina Mercedes s/n, 41012, Spain
[email protected]
Athanasios D. Styliadis
Alexander Institute of Technology
Agiou Panteleimona 24, 551 33
Thessaloniki, Greece
[email protected]
Dana Petcu
Computer Science Department
Western University of Timisoara
V.Parvan 4, 300223 Timisoara, Romania
[email protected]
Gheorghe Tecuci
Learning Agents Center
George Mason University, USA
University Drive 4440, Fairfax VA 22030-4444
[email protected]
Radu Popescu-Zeletin
Fraunhofer Institute for Open
Communication Systems
Technical University Berlin, Germany
[email protected]
Horia-Nicolai Teodorescu
Faculty of Electronics and Telecommunications
Technical University “Gh. Asachi” Iasi
Iasi, Bd. Carol I 11, 700506, Romania
[email protected]
Imre J. Rudas
Institute of Intelligent Engineering Systems
Budapest Tech
Budapest, Bécsi út 96/B, H-1034, Hungary
[email protected]
Dan Tufiş
Research Institute for Artificial Intelligence
of the Romanian Academy
Bucharest, “13 Septembrie” 13, 050711, Romania
[email protected]
Yong Shi
Research Center on Fictitious Economy
& Data Science
Chinese Academy of Sciences
Beijing 100190, China
[email protected]
and
College of Information Science & Technology
University of Nebraska at Omaha
Omaha, NE 68182, USA
[email protected]
Lotfi A. Zadeh
Professor, Graduate School
Director, Berkeley Initiative in Soft Computing (BISC)
Computer Science Division
Department of Electrical Engineering & Computer Sciences
University of California Berkeley
Berkeley, CA 94720-1776, USA
[email protected]
DATA FOR SUBSCRIBERS
Supplier: Cercetare Dezvoltare Agora Srl (Research & Development Agora Ltd.)
Fiscal code: 24747462
Headquarter: Oradea, Piata Tineretului Nr.8, Bihor, Romania, Zip code 410526
Bank: MILLENNIUM BANK, Bank address: Piata Unirii, str. Primariei, 2, Oradea, Romania
IBAN Account for EURO: RO73MILB0000000000932235
SWIFT CODE (eq.BIC): MILBROBU
International Journal of Computers, Communications & Control
Short Description of IJCCC
Title of journal: International Journal of Computers, Communications & Control
Acronym: IJCCC
Abbreviated Journal Title: INT J COMPUT COMMUN
International Standard Serial Number: ISSN 1841-9836, ISSN-L 1841-9836
Publisher: CCC Publications - Agora University
Starting year of IJCCC: 2006
Founders of IJCCC: Ioan Dzitac, Florin Gheorghe Filip and Mişu-Jan Manolescu
Logo:
Publication frequency: Bimonthly: Issue 1 (February); Issue 2 (April); Issue 3 (June); Issue 4
(August); Issue 5 (October); Issue 6 (December).
Coverage:
• Beginning with Vol. 1 (2006), Supplementary issue: S, IJCCC is covered by Thomson Reuters SCI Expanded and is indexed in ISI Web of Science.
• Journal Citation Reports (JCR)/Science Edition:
– Impact factor (IF): JCR2009, IF = 0.373; JCR2010, IF = 0.650; JCR2011, IF = 0.438.
• Beginning with Vol. 2 (2007), No.1, IJCCC is covered in EBSCO.
• Beginning with Vol. 3 (2008), No.1, IJCCC, is covered in Scopus.
Scope: International Journal of Computers Communications & Control is directed to the international
communities of scientific researchers in computers, communications and control, from universities,
research units and industry.
To differentiate itself from other similar journals, the editorial policy of IJCCC encourages the
submission of scientific papers that focus on the integration of the three "C"s (Computing, Communication, Control).
In particular the following topics are expected to be addressed by authors:
• Integrated solutions in computer-based control and communications;
• Computational intelligence methods (with particular emphasis on fuzzy logic-based methods, ANN,
evolutionary computing, collective/swarm intelligence);
• Advanced decision support systems (with particular emphasis on the usage of combined solvers
and/or web technologies).
Copyright © 2006-2013 by CCC Publications
International Journal of Computers, Communications & Control
ISSN 1841-9836, ISSN-L 1841-9836, Volume 8, Issue 1, February, 2013.
Contents

Forecasting Chaotic Series in Manufacturing Systems by Vector Support Machine Regression and Neural Networks
M.D. Alfaro, J.M. Sepúlveda, J.A. Ulloa .......... 8

Broadcast Scheduling Problem in TDMA Ad Hoc Networks using Immune Genetic Algorithm
D. Arivudainambi, D. Rekha .......... 18

Outlier Detection with Nonlinear Projection Pursuit
M. Breaban, H. Luchian .......... 30

Bio-inspired Sensory Systems in Automata for Hazardous Environments
L. Canete .......... 37

Datastores in Cloud Governance
A. Copie, T.-F. Fortiş, V.I. Munteanu .......... 42

A Fuzzy Control Heuristic Applied to Non Linear Dynamic System using a Fuzzy Knowledge Representation
F.M. Cordova, G. Leyton .......... 50

CRCWSN: Presenting a Routing Algorithm by using Re-clustering to Reduce Energy Consumption in WSN
A.G. Delavar, A.A. Baradaran .......... 61

Detecting DDoS Attacks in Cloud Computing Environment
A.M. Lonea, D.E. Popescu, H. Tianfield .......... 70

Reliable Critical Infrastructure: Multiple Failures for Multicast using Multi-Objective Approach
F.A. Maldonado-Lopez, Y. Donoso .......... 79

Simulating the Need of Working Capital for Decision Making in Investments
M. Nagy, V. Burca, C. Butaci, G. Bologa .......... 87

Managing Information Technology Security in the Context of Cyber Crime Trends
D. Neghina, E. Scarlat .......... 97

Flexible GPS/GPRS based System for Parameters Monitoring in the District Heating System
A. Peulic, S. Dragicevic, Z. Jovanovic, R. Krneta .......... 105

Radio Resource Adaptive Adjustment in Future Wireless Systems Based on Application Demands
E. Puschita, T. Palade, R. Colda, I. Vermesan, A. Moldovan .......... 111

A Novel Method for Service Differentiation in IEEE 802.15.4: Priority Jamming
S.Y. Shin .......... 127

Efficiency Consideration for Data Packets Encryption within Wireless VPN Tunneling for Video Streaming
D. Simion, M.F. Ursuleanu, A. Graur, A.D. Potorac, A. Lavric .......... 136

Solving Method for Linear Fractional Optimization Problem with Fuzzy Coefficients in the Objective Function
B. Stanojević, M. Stanojević .......... 146

PSO for Graph-Based Segmentation of Wrist Bones in Bone Age Assessment
P. Thangam, K. Thanushkodi, T.V. Mahendiran .......... 153

Alternative Wireless Network Technology Implementation for Rural Zones
F.J. Watkins, R.A. Hinojosa, A.M. Oddershede .......... 161

Issues on Applying Knowledge-Based Techniques in Real-Time Control Systems
D. Zmaranda, H. Silaghi, G. Gabor, C. Vancea .......... 166

Author index .......... 176
INT J COMPUT COMMUN, ISSN 1841-9836
8(1):8-17, February, 2013.
Forecasting Chaotic Series in Manufacturing Systems by Vector
Support Machine Regression and Neural Networks
M.D. Alfaro, J.M. Sepúlveda, J.A. Ulloa
Miguel D. Alfaro, Juan M. Sepúlveda, Jasmín A. Ulloa
Department of Industrial Engineering, University of Santiago of Chile
3769 Ecuador Ave., Santiago, Chile
E-mail: [email protected], [email protected], [email protected]
Abstract:
Currently, it is recognized that manufacturing systems are complex in their structure
and dynamics. Management, control and forecasting of such systems are very difficult
tasks due to complexity. Numerous variables and signals vary in time with different
patterns so that decision makers must be able to predict the behavior of the system.
This is a necessary capability in order to keep the system under a safe operation.
This also helps to prevent emergencies and the occurrence of critical events that may
put in danger human beings and capital resources, such as expensive equipment and
valuable production. When dealing with chaotic systems, management, control,
and forecasting are very difficult tasks. In this article, an application of neural networks
and vector support machines to the forecasting of the time-varying average number
of parts in the waiting line of a manufacturing system having chaotic behavior is
presented. The best results were obtained with least squares support vector machines;
in the neural networks case, the best forecasts are those of the models employing
the invariants characterizing the system's dynamics.
Keywords: chaos; forecast; neural networks; vector support machines; manufacturing systems
1 Introduction
Manufacturing systems are conceived as complex ones; although the term complexity does not
have a unique definition [1], it is possible to distinguish two kinds of complexity in production
systems: a) structural or static complexity, dealing with the number of the system's
components and their relationships, and b) dynamic complexity, dealing with the uncertainty
in the system's behavior [2]. It may seem paradoxical that an artificial system engineered for
performing a set of given tasks has its own laws, as if it were a natural system. This is due to the
fact that production systems are becoming ever more complex through technological progress
and the transformation of the supply chain. Flexible manufacturing machinery, global markets,
and supply network relationships are typical examples of such changes. Managing these systems
to bring them under control is today a difficult task. The dynamic complexity of production
systems has been demonstrated by the kinds of behavior that they can exhibit, among them
chaotic behavior [3], [4], [5]. Several metrics have been proposed for measuring the complexity of
manufacturing systems [2], [6], [7], [8]. These studies relate metrics of performance with metrics
of complexity. In this article, a way of control by forecasting the system’s behavior is shown.
For this purpose, a time series of the average number of parts in the waiting line of a chaotic
manufacturing system is utilized [3]. As forecasting methods, Support Vector Machines (SVMs)
and Artificial Neural Networks (ANNs) have been selected because these methods can distinguish
chaotic patterns and therefore they can predict the evolution of an observed control variable.
In [9] SVMs have been used for support vector regression (SVR) analysis of several exchange
rates with respect to the US dollar. In the present work, a least square support vector regression
(LS-SVR) with less computational effort than the one reported in [10] is proposed. In [11] an
ANN model for forecasting the observed error of monitoring units of the sea level in Singapore is
presented. In [12] an ANN is constructed for forecasting the behavior of a diode having a chaotic
pattern; in this case the local dimension and the time delay are proposed for determining the
network architecture.
The originality of this work lies in the study of the performance of two forecasting
techniques, LS-SVM and ANN, as applied to chaotic series. Similar studies have been made, but
for series that are not chaotic [13], [14]. It is also a novel application in the manufacturing area.
The paper is organized as follows: section 2 shows in a summarized way the manufacturing
system from which the time series was obtained; section 3 presents the methods utilized for
the forecasting; section 4 shows the results of the analysis of the time series by using nonlinear dynamic systems (NLDS) theory. Forecasting results are detailed in Section 5. Finally,
conclusions and research directions are given.
2 System under study
2.1 Variable to be analyzed
The system under study is described in [3]. The variable to be analyzed is the average in time
of the number of parts in the waiting line of a flexible manufacturing system. By utilizing a time
series of this variable the system’s dynamics is characterized by means of the theory of non-linear
dynamic systems. The machining shop is formed by three different machines producing three
types of parts. Each part has a set of operations which can be executed in different machines
according to the operations sequence. Figure 1 shows the layout of the system under study.
Figure 1: Machining shop layout.
2.2 Operation of the system
• The arrival rate and service rate are such that the system is in equilibrium; that is,
the number of parts tends neither to zero nor to infinity.
• Upon arrival of a part, a function f assigns the part to a machine able to perform the
operation and that has the least number of parts in queue.
• The priority of the queue of each part type at any machine is first-in first-out (FIFO).
• Each machine works cyclically on one kind of part, during a time interval equivalent to the
time needed to complete the stock of that kind of part at the k-th machine. The function g_k
manages the execution cycle according to the part type. The values of the plant parameters are
shown in Table 1, where β_j is the arrival rate of the j-th part type (number of parts per time
unit) and O_ij is the i-th operation of the j-th part. The values in the table represent the
operation time at each machine. A minimal sketch of the dispatch rule f is given after this list.
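
To make the dispatch rule concrete, the following minimal Python sketch implements the assignment function f: an arriving part is routed to the eligible machine with the fewest parts in queue. The eligibility map and the arrival stream are hypothetical placeholders, since Table 1 and the topology of Figure 1 are not reproduced in this transcription.

import random
from collections import deque

# Hypothetical eligibility map: part type -> machines able to perform the
# next operation (the real mapping follows the operation sequences O_ij).
ELIGIBLE = {1: [0, 1], 2: [1, 2], 3: [0, 2]}
queues = [deque(), deque(), deque()]  # one FIFO queue per machine

def assign(part_type):
    """Dispatch rule f: route the arriving part to the eligible machine
    with the least number of parts in queue."""
    machine = min(ELIGIBLE[part_type], key=lambda m: len(queues[m]))
    queues[machine].append(part_type)
    return machine

random.seed(1)
for _ in range(12):  # toy arrival stream; the rates beta_j are omitted
    assign(random.choice([1, 2, 3]))
print([len(q) for q in queues])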
3 Forecasting Methods Utilized
Research on forecasting models has received considerable attention over the last 50 years.
Currently, there exist numerous forecasting methods [15]. For the case of chaotic systems, the
very theory of NLDS provides forecasting methods [16]. In this article, NLDS theory is used
as a basis for constructing the ANN and SVM models. These techniques are proposed due to their
capacity for recognizing chaotic patterns.
3.1 Artificial Neural Networks
An artificial neural network (ANN) is a computational model of the brain. It consists of a
limited number of connected elements (neurons) distributed in an input layer, one or
more hidden layers, and an output layer. An ANN is a mathematical structure that allows pattern
recognition; in this work we use a back-propagation type of network, as shown in figure 2.
As the system is known to be chaotic, the dimension of the phase space is proposed as the
number of neurons in the input layer. This value is obtained by the method of false
neighbors. The temporal distance between the input variables corresponds to the time delay
in the construction of the system's attractor; this value is obtained from the average of the
mutual information [16]. The transfer function is of the sigmoidal type and the network is defined
by equation (1):
$$x_t = \beta_0 + \sum_{i=1}^{n} \beta_i \, f\!\left(s\,\omega_{i0} + \sum_{j=1}^{d} \omega_{ij}\, x_{t-j}\right) \qquad (1)$$
where n is the number of neurons in the hidden layer, d is the number of neurons in the
input layer, s is the standard deviation of the weights matrix, and β_i, ω_ij are the weights.

Figure 2: Neural Network for Forecasting

The number of neurons in the hidden layer is given by the following empirical relationship (2):

$$\frac{N_{obs}}{10} \ge (N_e + 1)N_c + (N_c + 1)N_s \qquad (2)$$

where N_obs is the number of observations and N_e, N_c and N_s are the numbers of neurons in the
input layer, hidden layer, and output layer, respectively.
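
As a consistency check on rule (2), the hidden-layer sizes reported in section 5.1 (279, 167, 119, 93 and 69 neurons for 1, 3, 5, 7 and 10 input neurons) can be reproduced with a few lines of Python. The sketch assumes N_obs = 8400, i.e. the 70% training share of the 12001-point series, and a single output neuron:

def max_hidden_neurons(n_obs, n_e, n_s=1):
    """Largest hidden-layer size N_c allowed by the empirical rule (2):
    N_obs / 10 >= (N_e + 1) * N_c + (N_c + 1) * N_s,
    i.e. N_c <= (N_obs / 10 - N_s) / (N_e + 1 + N_s)."""
    return int((n_obs / 10 - n_s) // (n_e + 1 + n_s))

for d in (1, 3, 5, 7, 10):
    print(d, max_hidden_neurons(n_obs=8400, n_e=d))  # 279, 167, 119, 93, 69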
3.2 Support vector machines for least squares regression
Support vector machine (SVM) algorithms emerged from the artificial intelligence field and
have been successfully used in a variety of classification and regression problems. The least
squares support vector machine is a modified version of the standard SVM for regression; the
model is trained by solving a linear system instead of a quadratic programming optimization
model [10]. LS-SVMs are closely related to regularization networks and Gaussian processes, but
they emphasize and exploit their interpretation from the viewpoint of optimization theory. The
general formulation of an LS-SVR is shown in (3):

$$y = w^T f(x) + b \qquad (3)$$

where x is the input vector of the data series and y is the output. The parameters w and b
are obtained from the optimization problem given by equations (4) and (5).
$$\min_{w,b,e}\ \tau(w, b, e) = \frac{1}{2}\|w\|^2 + \gamma\,\frac{1}{2}\sum_{i=1}^{N} e_i^2 \qquad (4)$$

$$\text{s.t.}\quad y_i - \left(\langle w, x_i \rangle + b\right) = e_i, \quad i = 1, \ldots, N \qquad (5)$$
In (4), γ is an arbitrarily chosen regularization parameter. For learning, the kernel utilized is the radial
basis function (RBF) (6):

$$k(x_i, x_j) = \exp\!\left(-\frac{\|x_i - x_j\|^2}{\sigma^2}\right) \qquad (6)$$
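
The training by a linear system instead of a QP can be illustrated compactly. The following Python sketch solves the standard LS-SVM dual system with the RBF kernel (6); it is not the LSSVMlab implementation used by the authors, and the toy series merely echoes the parameters of section 5.2 (γ = 256, σ = 8, d = 5, τ = 4):

import numpy as np

def rbf(X1, X2, sigma):
    """RBF kernel matrix, k(x_i, x_j) = exp(-||x_i - x_j||^2 / sigma^2)."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / sigma ** 2)

def lssvr_fit(X, y, gamma=256.0, sigma=8.0):
    """Train an LS-SVR by solving the (N+1)x(N+1) linear system of its
    KKT conditions, instead of a quadratic program."""
    n = len(y)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = A[1:, 0] = 1.0
    A[1:, 1:] = rbf(X, X, sigma) + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    b, alpha = sol[0], sol[1:]
    return lambda Xq: rbf(Xq, X, sigma) @ alpha + b

# One-step-ahead regression on d = 5 lagged inputs with delay tau = 4.
rng = np.random.default_rng(0)
series = np.sin(0.9 * np.arange(400)) + 0.1 * rng.standard_normal(400)
d, tau = 5, 4
idx = np.arange(d * tau, len(series))
X = np.stack([series[idx - k * tau] for k in range(1, d + 1)], axis=1)
predict = lssvr_fit(X, series[idx])
print(predict(X[:3]), series[idx][:3])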
The architecture of the LS-SVR [17] is shown in figure 3.
Figure 3: Architecture generated by regression SVM
4 Characterization of the system’s dynamics

It has been stated that the system has a chaotic behavior; figures 4 to 9 show the analysis
confirming this assumption. Figure 4 shows the time series plot; figure 5 shows the
Fourier spectrum, where the erratic nature of the series is verified; figure 6 presents the time
delay, which corresponds to the first minimum of the average mutual information. Figure 7 shows the
phase space dimension d = 5, which has been obtained by the percentage of false neighbors [16].
However, figure 8 shows that the local dimension is 3 (correlation dimension 2.784). Finally, in
figure 9 it is observed that the largest Lyapunov exponent is 0.41, which corroborates the
signal's chaotic character.
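
The phase-space reconstruction behind these figures is a plain time-delay embedding; a minimal sketch with the values found in this section (d = 5, τ = 4) on a placeholder series follows. Estimating τ by mutual information and d by false nearest neighbors is left to standard NLDS tools:

import numpy as np

def delay_embed(x, d, tau):
    """Time-delay embedding: row t is (x_t, x_{t-tau}, ..., x_{t-(d-1)tau})."""
    n = len(x) - (d - 1) * tau
    return np.stack([x[(d - 1 - k) * tau:(d - 1 - k) * tau + n]
                     for k in range(d)], axis=1)

x = np.random.default_rng(0).standard_normal(1000)  # placeholder series
print(delay_embed(x, d=5, tau=4).shape)  # (984, 5)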
Figure 4: Average of the number of parts in time
Figure 5: Fourier power spectrum
Figure 6: Mutual information average
Figure 7: Dimension of phase space
Figure 8: Correlation dimension
Figure 9: Lyapunov exponents

5 Forecasting models results
In order to measure the performance of the models, the Appropriability Index (IA) and the
normalized root of the quadratic error (RMS) are used, as defined by (7) and (8) respectively:

$$IA = 1 - \frac{\sum_{i=1}^{n} (y_i - y_i')^2}{\sum_{i=1}^{n} \left(|y_i| + |y_i'|\right)^2} \qquad (7)$$

$$RMS = \sqrt{\frac{\sum_{i=1}^{n} (y_i - y_i')^2}{\sum_{i=1}^{n} y_i^2}} \qquad (8)$$

where y_i is the value of the average number of parts at period i, y_i' is the value predicted
by the model, and n is the number of forecasted periods. The IA indicates the proportion of
the variance that is explained by the model; values greater than 0.9 are expected. The
RMS compares the error between the desired output and the one generated by the model; values
close to or below 0.1 are expected.
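
Both indicators are straightforward to compute; a minimal Python sketch of (7) and (8), with made-up values for illustration:

import numpy as np

def ia(y, y_pred):
    """Appropriability Index, eq. (7); values above 0.9 are expected."""
    return 1.0 - np.sum((y - y_pred) ** 2) / np.sum((np.abs(y) + np.abs(y_pred)) ** 2)

def rms(y, y_pred):
    """Normalized RMS error, eq. (8); values near or below 0.1 are expected."""
    return np.sqrt(np.sum((y - y_pred) ** 2) / np.sum(y ** 2))

y = np.array([2.0, 3.0, 2.5, 4.0])
y_hat = np.array([2.1, 2.8, 2.6, 3.9])
print(ia(y, y_hat), rms(y, y_hat))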
The series has 12001 data points (periods), which correspond to the time-persistent average number
of parts over time intervals of 0.15 time units; 70% is used for training and the remaining 30%
for validation.
5.1 Experimental Results for ANN
For the forecasting model, a neural network is designed based on the time delay τ and the
dimension d which were obtained from the chaotic system’s characterization. Five forecasting
models are constructed with 1, 3, 5, 7 and 10 neurons in the input layer, with a time delay τ = 4
in each model. The number of neurons in the hidden layer is obtained by relationship (2) for
each model; thus 279, 167, 119, 93 and 69 neurons are obtained, respectively. All of the models
were implemented using the MATLAB™ ANN toolbox. Figures 10, 11, and 12 show the
results for the cases of 3, 5 and 10 neurons in the input layer.
Figure 10: Forecast with three neurons in the input layer
Figure 11: Forecast with five neurons in the input layer
Figure 12: Forecast with ten neurons in the input layer
Table 2 shows the values of the IA and RMS indicators for the different models.
It is observed that the IA of the 10-neuron model is higher by 0.03% than that of the model with
five neurons in the input layer. However, if the RMS of the same models is compared, the model
with 10 neurons is greater by 3.65%. In a similar analysis between the models with three and ten
neurons, the difference of the IAs is 0.05% and of the RMSs 0.67%. It is thus possible
to state that the best models are found between the ones with three and five neurons in the
input layer. These values are exactly those corresponding to the local dimension and the global
dimension of the phase space.
5.2 Experimental Results for LS-SVR
The values of the RBF kernel parameters associated with the optimization problem described
by equations (4), (5) and (6) are γ = 256 and σ = 8. A time delay τ = 4 has been used in the
input variables. As in the ANN case, five models have been constructed, with 1, 3, 5, 7 and
10 input vectors. The models were implemented using MATLAB™ LSSVMlab 1.7. Figures
13, 14 and 15 show the results for 3, 5 and 10 input vectors.
Figure 13: Forecast with three input vectors
Figure 14: Forecast with five input vectors
Figure 15: Forecast with ten input vectors
Table 3 shows the values of the indicators IA and RMS for the different models.
The results in Table 3 are not the expected ones; the best results correspond to seven and ten
input vectors, not to dimensions three and five as with the ANNs. The same experiment
was executed without time delay, that is, τ = 1 was assumed. The results are shown in Table 4.
Again, it is observed that the best results are for seven and ten input vectors. Thus, for this
specific case, it is verified that no behavior pattern based on the NLDS characterization exists.
5.3 Comparison between experimental results for ANN and LS-SVR
Even though the difference between the IAs of the best ANN and LS-SVR models (five neurons
for the ANN and seven input vectors for the LS-SVR) is only 0.11%, the LS-SVR's RMS
value is practically half the ANN's. Hence, it is possible to conclude that the best model is the
one constructed with LS-SVR. Nevertheless, it is observed that in general terms both approaches
perform adequately.
6 Conclusions
As seen, the best results were obtained with least square support vector machines. Similar
results have been reported for SVR and ANN for non-chaotic series [13], [14]. Notwithstanding,
it can be stated that both models are efficient for the forecasting of a chaotic series obtained from
a flexible manufacturing system. According to the results, it can be concluded that the system's
behavior can be predicted one time step ahead, that is, 0.15 time units. As a research direction, it
is suggested to develop models able to predict over a longer time interval. For this purpose, ongoing
work by the authors is addressing the use of the inverse of the Lyapunov exponent to
determine the number of neurons in the output layer in the ANN case, and the number of
output vectors for the LS-SVR.
Acknowledgments
This research has been supported by DICYT (Scientific and Technological Research Bureau)
of the University of Santiago of Chile (USACH) and its Department of Industrial Engineering.
Bibliography
[1] ElMaraghy H., Kuzgunkaya O., Urbanic R., Manufacturing Systems Configuration Complexity, CIRP ANNALS-MANUFACTURING TECHNOLOGY, ISSN 0007-8506, 54(1): 445-450,
2005.
[2] Papakostas N., Efthymiou K., Mourtzis D., Chryssolouris G., Modelling the complexity of manufacturing systems using nonlinear dynamics approaches, CIRP ANNALS-MANUFACTURING TECHNOLOGY, ISSN 0007-8506, 58: 437-440, 2009.
[3] Alfaro M., Sepúlveda J., Chaotic behavior in manufacturing systems, INT. J. PRODUCTION
ECONOMICS, ISSN 0925-5273, 101: 150-158, 2006.
[4] Papakostas N., Mourtzis D., An Approach for Adaptability Modeling in Manufacturing - Analysis Using Chaotic Dynamics, CIRP ANNALS-MANUFACTURING TECHNOLOGY, ISSN
0007-8506, 56(1): 491-494, 2007.
[5] Donner R., Scholz-Reiter B., Hinrichs U., Nonlinear characterization of the performance
of production and logistics networks, JOURNAL OF MANUFACTURING SYSTEMS, ISSN
0278-6125, 27(2): 84-99, 2008.
[6] Wu Y., Frizelle G., Efstathiou J., A study on the cost of operational complexity in customer-supplier systems, INT. J. PRODUCTION ECONOMICS, ISSN 0925-5273, 106(1): 217-229,
2007.
[7] Phukan A., Kalava M., Prabhu V., Complexity metrics for manufacturing control architectures
based on software and information flow, COMPUTERS AND INDUSTRIAL ENGINEERING, ISSN 0360-8352, 49(1): 1-20, 2005.
[8] Wang H., Hu S., Manufacturing complexity in assembly systems with hybrid configurations
and its impact on throughput, CIRP ANNALS-MANUFACTURING TECHNOLOGY, ISSN
0007-8506, 59: 53-56, 2010.
[9] Huang S., Chuang P., Wu C., Lai H., Chaos-based support vector regressions for exchange
rate forecasting, EXPERT SYSTEMS WITH APPLICATIONS, ISSN 0957-4174, 37: 8590-8598, 2010.
[10] He K., Lai K., Yen J., A hybrid slantlet denoising least squares support vector regression
model for exchange rate prediction, PROCEDIA COMPUTER SCIENCE ISSN 1877-0509,
1: 2397-2405, 2010.
[11] Sun Y., Babovic V., Chan E., Multi-step-ahead model error prediction using time-delay
neural networks combined with chaos theory, JOURNAL OF HYDROLOGY, ISSN 0022-1694, 395(1): 109-116, 2010.
[12] Hanias M., Karras D., On efficient multistep non-linear time series prediction in chaotic diode
resonator circuits by optimizing the combination of non-linear time series analysis and neural networks, ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, ISSN
0952-1976, 22(1): 32-39, 2009.
[13] Vanajakshi L., Rilett L., A Comparison of the Performance of Artificial Neural Networks
and Support Vector Machines for the Prediction of Traffic Speed, IEEE INTELLIGENT VEHICLES SYMPOSIUM, ISBN 0-7803-8310-9, 194-199, Parma, Italy, June 2004.
[14] Yoon H., Jun S., Hyun Y., Bae G., Lee K., A comparative study of artificial neural networks
and support vector machines for predicting groundwater levels in a coastal aquifer, JOURNAL
OF HYDROLOGY, ISSN 0022-1694, 396(1): 128-138, 2011.
[15] Scott J., Principles of Forecasting, University of Pennsylvania, Kluwer Academic Publishers, 2001.
[16] Abarbanel H., Analysis of observed chaotic data. New York, Springer-Verlag, 1996.
[17] Vapnik V., Statistical Learning Theory, Wiley, New York, USA, 1998.
INT J COMPUT COMMUN, ISSN 1841-9836
8(1):18-29, February, 2013.
Broadcast Scheduling Problem in TDMA Ad Hoc Networks
using Immune Genetic Algorithm
D. Arivudainambi, D. Rekha
D. Arivudainambi, D. Rekha
Department of Mathematics
Anna University, Chennai, India
E-mail: [email protected], [email protected]
Abstract:
In this paper, a new efficient immune genetic algorithm (IGA) is proposed for the broadcast scheduling problem in TDMA ad hoc networks. Broadcast scheduling is a primary
issue in wireless ad hoc networks. The objective of a broadcast schedule is to deliver a
message from a given source to all other nodes in a minimum amount of time. Broadcast scheduling avoids packet collisions by allowing only those node transmissions that
do not interfere in a time division multiple access (TDMA) ad hoc network. It
also improves the transmission utilization by assigning one transmission time slot to
one or more non-conflicting nodes, in such a way that every node transmits at least once
in each TDMA frame. An optimum transmission schedule minimizes the length
of a TDMA frame while maximizing the total number of transmissions. The aim of
this paper is to increase the number of transmissions in a fixed ad hoc network with
time division multiple access (TDMA), within a reduced number of time slots. The results of the IGA are compared to recently reported algorithms. The simulation results
indicate that the IGA performs better even for larger networks.
Keywords: Ad hoc networks, Broadcast Scheduling, Genetic algorithm, Immune
genetic algorithm.
1 Introduction
A wireless ad hoc network is a collection of nodes, which communicate with each other using
radio transmissions. In wireless networks, nodes can communicate directly if they are within
the transmission range of each other. When a source node is out of the transmission range of
destination node then it uses intermediate nodes for routing their message. In ad hoc networks,
there are no base stations to act as routers and the nodes themselves perform routing. Therefore,
data should be delivered from source to destination through multiple hops. Ad hoc networks
rely on multihop transmission among the nodes on the same channel. We have proposed how
efficiently broadcasting can be done in ad hoc network.
In this paper, we consider a fixed ad hoc network with TDMA, whose topology can be
represented by a graph. TDMA is divided into frames, where each frame is further divided into
time slots that can be assigned to different nodes. In a single frame, all nodes must be allowed
to transmit packets at least once. TDMA scheduling algorithms can be categorized into link
scheduling algorithms and broadcast/node scheduling algorithms [10]. In an ad hoc network, a link
can be represented as (t, r), where t is a transmitter and r is a receiver. In link scheduling,
the transmission in every slot is assigned to certain links, whereas in broadcast scheduling the
transmission in every slot is assigned to certain nodes. The aim of the algorithm is to activate all
the nodes at least once and to improve the utilization factor of the network channels, i.e., to increase
the number of transmissions within a minimum number of time slots.
Ahmad et al. [1] proposed an algorithm based on finite state machine synthesis that determines the minimum frame length with the maximum slot utilization. A maximal compatible set
of nodes is produced, and these are chosen such that the nodes in the set do not have conflicts
with one another. A tight lower bound derived from the set of maximal incompatibles forms the
basis for deriving the minimum frame length. The algorithm applies a set of rules on the maximal
compatibles in order to maximize the utilization of slots.
Genetic algorithm (GA) is a population-based stochastic optimization method with an iterative
process of generation-and-test. It has been recognized that GA is a promising approach for NP-hard or NP-complete problems, and GA solves many search and optimization problems effectively.
A standard genetic algorithm approach was given by Chakraborty in 1998 for the scheduling problem
in packet radio networks; although the algorithm is able to solve small problems, it performs poorly
for large networks. This is because classical crossover and mutation operations create invalid
population members that survive through several generations and delay the progress of the search for valid
solutions. Special crossover and mutation operations for the elite population method were defined by
Chakraborty [3], such that members of the population always remain valid solutions to the
problem. Even though this produces the optimal solution in fewer generations and reduces
the number of invalid solutions, the computation time is not reduced.
Gunasekaran et al. [4] proposed two different algorithms for spatial reuse in WiMAX networks. First, a dynamic programming (DP) method is adopted to produce a maximal collision-free
set of nodes, but it suffers from high memory requirements. The second is a genetic algorithm
approach; it is more scalable than the DP approach but does not guarantee optimality. Two kinds
of populations are used in the co-evolutionary genetic algorithm, acting as competitors for each
other and hence increasing the evolution rate. This approach seems to have better performance
than the classical genetic algorithm approach, which did not produce good results for large networks.
The main drawback of the co-evolutionary genetic algorithm is that every member of the test population has
to be compared with every member of the solution population. This requires many comparisons
and calculations, and hence may slow down the procedure when population sizes are vast.
A novel hysteretic noisy chaotic neural network (HNCNN), which controls the noise of the equivalent model, is proposed for the broadcast scheduling problem in packet radio networks by Ming Sun
et al. [7]. They combine the HNCNN with a gradual expansion scheme to find the minimal
frame length in the first phase, and to maximize the conflict-free transmissions in the second
phase.
Ngo and Li [8] proposed an approach based on a modified GA called genetic-fix. They formulate the problem based on a within-two-hop connectivity matrix and propose a centralized
scheduling algorithm using the modified genetic-fix algorithm. A traditional GA generates subsets
of all possible sizes, whereas the genetic-fix algorithm generates fixed-size subsets, i.e., in the binary representation the number of ones is fixed.
A mixed tabu-greedy algorithm for solving the broadcast scheduling
problem in packet radio networks is given by Peng et al. [9]. Improvements are achieved in
terms of both channel utilization and packet delay by using a two-step algorithm.
The problem of determining an optimal minimum length TDMA schedule for a general multihop radio network is NP-complete for both link and broadcast scheduling [10]. Both link
scheduling and broadcast scheduling are considered in their approach.
Salcedo et al. [11] propose a procedure that combines a Hopfield neural network for constraint
satisfaction with a genetic algorithm for achieving maximal throughput. Their approach
solves the broadcast scheduling problem by dividing it into two subproblems. The first is
to find the minimum frame length without interference using a discrete Hopfield neural network. The
second, increasing the throughput for the given frame length, is done by combining the Hopfield neural
network with a genetic algorithm.
A linear integer programming formulation for the composite problem of maximizing channel
utilization while minimizing the length of the frame is given by Syam Menon [12]; it performs
in reduced computation time, but the maximum number of stations taken in their approach is 50.
In [13], Wang and Ansari propose a broadcast scheduling algorithm based on mean field
annealing (MFA) neural networks. They propose a three-step algorithm: the first step reduces the solution
space by presetting some neurons according to the topology of the scheduled network; the second
step executes the MFA procedure to maximize channel utilization; a heuristic approach is the final
step, arranging the transmissions of unassigned stations.
A competent permutation-encoded genetic algorithm for solving the optimum time
division multiple access broadcast scheduling problem for mobile ad hoc networks is proposed
by Wu et al. [14]. The problem search space is greatly reduced and the genetic algorithm becomes
more capable of searching for the optimum solutions.
In a genetic algorithm, mutation creates new genes for the population and the crossover operator orients the search for the best solution among the genes in the population. However, these operators may drop
into locally optimal solutions, or they may find the optimal solution only with low convergence speed while the GA
blindly wanders over the search space. In a GA, a normal mutation operator takes the chance of changing a
best solution obtained from a previous operation. To overcome these problems, we use the immune
concept to enhance the GA. The immune genetic algorithm (IGA) does not perform mutation in the
normal way; the mutation operator is carried out in two steps: 1) immune selection and 2) vaccination. Immune selection
performs the reduction of time slots, whereas vaccination uses the knowledge from the hop matrix to mutate a bit.
The IGA increases the number of transmissions within a reduced number of time slots. The aim of the proposed
immune genetic algorithm is to activate all nodes at least once while minimizing the number of time slots.
Another goal is to increase the utilization factor of the network channels, i.e., the usage of the available channels
should be maximized by increasing the number of transmissions. A method based on an immune genetic algorithm
for the broadcast scheduling problem in ad hoc networks has not been studied so far. A comparison with existing
methods for the test instances reported in the literature shows that our algorithm identifies the optimal solution
in fewer generations, and the average time delay is reduced even for large networks.
This paper is organized as follows: Section 2 gives a formal definition of the broadcast scheduling problem, along with the constraints to be satisfied. The immune genetic algorithm approach is
provided in Section 3. Simulation results and analysis of the proposed algorithm are provided
in Section 4. Section 5 concludes the paper.
2 Formulation of Broadcast Scheduling Problem
The ad hoc topology can be represented as an undirected graph G = (N, E), where N is the set
of nodes or mobile stations and E is the set of links or transmissions, assumed bidirectional. A link
(i, j) exists between nodes i and j if they are within transmission range of each other. When
node i transmits data, other nodes within the transmission range of i will receive it. If both
i and j transmit packets in the same time slot, this leads to a primary conflict. When a node
receives two or more packets from different directly connected nodes in a single time slot, this
causes a secondary conflict. For a collision-free transmission, primary and secondary
conflicts must not occur. Two mobile stations can transmit in the same time slot without
mutual interference if they are located more than two hops apart. Three important matrices
are used in this study: the connectivity matrix [CM], which represents direct links between nodes;
the hop information matrix [HM], which represents the one-hop and two-hop connectivity information of
each node; and the TDMA frame matrix [TM], which represents the allotted time slots of the given network
without any interference. A minimal sketch for constructing the first two matrices follows.
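
A minimal Python sketch of how [CM] and [HM] can be built from an edge list; the placeholder edges do not reproduce the network of Fig. 1, which is not included in this transcription. Two-hop reachability is read off from CM·CM:

import numpy as np

def build_matrices(n, edges):
    """Connectivity matrix CM and hop information matrix HM (1 = the two
    nodes are within one or two hops, i.e. they would conflict)."""
    cm = np.zeros((n, n), dtype=int)
    for i, j in edges:
        cm[i, j] = cm[j, i] = 1
    hm = ((cm + cm @ cm) > 0).astype(int)  # one hop or shared neighbor
    np.fill_diagonal(hm, 0)
    return cm, hm

cm, hm = build_matrices(6, [(0, 1), (0, 2), (0, 3), (1, 4), (2, 5), (3, 4)])
print(hm)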
Figure 1: A sample 6-node network
Figure 2: (a) Connectivity matrix and (b) Hop information matrix for a 6-node network
Figure 3: (a), (b), (c) and (d) Sample solution TDMA frames created for a 6-node network

The connectivity matrix in Fig. 2(a) represents the direct links between the nodes of the network
in Fig. 1. Each column represents a node of the network and each row represents link existence between
nodes. In Fig. 2(a), the first row gives the connectivity information of node 1, and likewise for the remaining
nodes. The matrix has the value 0 or 1, where 1 represents the existence of a link. The hop information
matrix for the given 6-node network is shown in Fig. 2(b), where the row values represent the one-hop
and two-hop information between nodes. This matrix also takes the value 0 or 1; from its first row, it is
identified that nodes 2 and 3 are either one or two hops away from node 1.
The TDMA frame matrix is an |M| × |N| matrix, where |M| is the number of time slots and
N = {n_1, n_2, ..., n_x} is the set of nodes of the network. For the 6-node network, possible
TDMA frame matrices are shown in Fig. 3. The first [TM] is a trivial solution in which there is no
chance of conflict, since each node is assigned its own time slot. Fig. 3(b) and (c) represent
solution frames with a reduction in time slots, but they are not optimal, whereas the solution
frame in Fig. 3(d), generated with 4 time slots, is the optimal solution for the given 6-node
network. The optimal solution is determined based on two fitness criteria: the tight lower bound and
the channel utilization variable.

The tight lower bound is based on the maximum degree MD of the network,

$$MD = \max_{n \in N} |DS(n)| \qquad (1)$$

where DS(n) is the degree set of node n. Based on this value, the tight lower bound is generated as

$$\Delta = |M| - MD \ge 1 \qquad (2)$$

If Δ = 1 then the solution is optimal.
The channel utilization variable for the entire network is

$$\rho = \frac{1}{|M| \cdot |N|} \left[ \sum_{i=1}^{|M|} \sum_{j=1}^{|N|} [TM_{ij}] \right] \qquad (3)$$

The total number of transmissions made by each node x is calculated using

$$\rho_x = \sum_{i=1}^{|M|} [TM_{ix}] \qquad (4)$$
Applying these criteria to the sample TDMA solutions shown in Fig. 3, it is identified that the solution
in Fig. 3(d) satisfies the tight lower bound with a high channel utilization value compared to the
solutions in Fig. 3(a), (b) and (c). The MD value for the given 6-node network is 3, so Δ = 1 for the
solution in (d) and the channel utilization of the entire network is 0.33, whereas in solution (a) the channel
utilization ρ is 0.17, and for (b) and (c) ρ is 0.2. A sketch of these fitness computations follows.
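
A minimal Python sketch of the fitness computations; the placeholder topology below is chosen so that the maximum degree is 3, as stated above for the network of Fig. 1, and the trivial one-node-per-slot frame of Fig. 3(a) indeed yields ρ ≈ 0.17:

import numpy as np

def fitness(tm, cm):
    """Tight lower bound Delta, eq. (2), and channel utilization rho,
    eq. (3), of a TDMA frame TM with |M| slot rows and |N| node columns."""
    m, n = tm.shape
    md = int(cm.sum(axis=0).max())  # maximum degree MD, eq. (1)
    delta = m - md                  # the solution is optimal when delta == 1
    rho = tm.sum() / (m * n)        # eq. (3)
    rho_x = tm.sum(axis=0)          # per-node transmissions, eq. (4)
    return delta, rho, rho_x

cm = np.zeros((6, 6), dtype=int)
for i, j in [(0, 1), (0, 2), (0, 3), (1, 4), (2, 5), (3, 4)]:  # placeholder
    cm[i, j] = cm[j, i] = 1
delta, rho, _ = fitness(np.eye(6, dtype=int), cm)
print(delta, round(rho, 2))  # 3, 0.17: the trivial frame is not optimal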
3 Proposed IGA
3.1. Analysis of Genetic Algorithm
The genetic algorithm is a heuristic search technique that simulates the processes of natural selection and evolution. Genetic algorithms are effective, robust search procedures for NP-complete
problems. GA is a nontraditional search and optimization method. It works for problems
that have a large number of solutions, of which some are feasible and some are infeasible; the
task is to get the best solution out of the feasible ones. A standard GA starts with a set of solutions
called a population. A population is a collection of chromosomes. Solutions from one population
are taken and used to form a new population, being selected according to their fitness. This
is repeated until some condition for the improvement of the best solution is satisfied. Three main
operators are used to create a new population: 1) selection, 2) crossover, and 3) mutation. The
new population is further evaluated and tested for termination. If the termination criteria are not
satisfied, the population is iteratively operated on by the three operators and evaluated. One sequence
of these operations and the subsequent evaluation procedure is a generation in GA.
3.1.1. Initial Population
In the broadcast scheduling problem, the TDMA scheduler matrix is represented as a bit-string chromosome
containing 0's and 1's. Each row of the scheduler matrix corresponds to a time slot and each column to a
node's transmission. The value 1 in position (i, j) of the scheduler matrix indicates that the j-th node
is allowed to transmit in the i-th time slot. In the classical GA approach, the chromosome contains a
random string of 0's and 1's, which does not perform well when the number of nodes exceeds 40 [3].
The reason is that, as the number of generations increases, the number of invalid individuals in the population
keeps increasing, since the validity of every individual is not ensured by the classical GA
crossover and mutation operations. Finally, invalid members dominate the total population and thus the
optimal solution is not found. This was overcome by Chakraborty [3] by changing the initial population,
crossover and mutation methods to keep the members of the population valid. In our algorithm,
the initial TDMA frames are constructed using the elite population method of Chakraborty [3].
3.1.2. Selection
The operators for parent selection and survivor selection follow the Darwinian principle
of survival of the fittest. For parent selection, two chromosomes are selected randomly from the
population to serve as parents for the reproduction process. Second, survivor selection applies the
principle of survival of the fittest: only the fittest individuals are selected as parents for the next generation;
to achieve this, k-tournament selection is done. Survivor selection is also called elitism, which
retains some of the best individuals in each generation. In this study, a small percentage of the
best-fitness individuals is retained to the next generation. This increases the performance of the algorithm
by preventing the loss of the best found solution. From each generation, the best 10% of solutions are retained
to the next iteration.
3.1.3. Crossover and Mutation
The chromosomes selected for reproduction are gathered in a mating pool. The single-point
crossover operator acts on rows of the population. Once a crossover point is identified,
a random row from the first parent PR1 is crossed over with a random row from the second parent
PR2. The resultant chromosome CH1 replaces PR1 and CH2 replaces PR2.
After replacement, if a solution violates the constraints it is removed from the population. The
mutation operator behaves in a different manner depending on the fitness of the selected gene.
The mutation operator changes one bit in the selected chromosome depending on the individual fitness.
Simple mutation is done by flipping a bit: at every bit of all members of the population, a random
number between 0 and 1 is generated; if it is less than or equal to the mutation probability, the bit is
flipped from 0 to 1 and vice versa.
3.1.4. Fitness function
The fitness function evaluates the quality (fitness) of candidate solutions. The fitness function
for the scheduling problem is based on the tight lower bound and channel utilization variables.
The termination test determines whether the optimal solution has been found in the current generation.
The optimal solution is the one that satisfies both criteria. When the evolution reaches this
termination point, the algorithm stops and outputs the optimal solution for the
given network; otherwise, elitism is applied to the population and the algorithm proceeds to the next generation. At
the end of each iteration, before evaluating the fitness function, the populations produced in the generation
are taken for duplicate row elimination, i.e., a time slot that is repeated is removed from the
population in order to produce an optimized TDMA frame.
3.2. Optimization properties of Immune Genetic Algorithm compared to GA
In GA, the two main genetic operators, crossover and mutation, not only give each individual
the evolutionary chance to reach the global optimum but also cause degeneracy to some extent,
because of the random and unsupervised searching during the entire process. On the other hand,
GA lacks the capability of making use of basic and obvious characteristics or knowledge of the
problem at hand. Based on the considerations above, the immune genetic algorithm is proposed.
Algorithm 1 shows the structure of the immune genetic algorithm. The solution after crossover
is taken for the immune operations. IGA is an intelligent optimization algorithm which mainly
constructs an immune operator accomplished in two steps: immune selection and vaccination.
The initial populations are created using the elite population method of Chakraborty [3]. Random
selection is used for parent selection, and 10% of the best solutions are carried to the next generation. The
knowledge-added IGA algorithm proceeds in the following way.
Algorithm 1. Immune Genetic Algorithm

ImmuneGeneticAlgorithm(G = (V, E), Psel, Pcr, Psize, maxgen)
{
    Generate the compatibility and hop information matrices;
    Find the degree of the network;
    for loop = 1 : Psize
        population = generate initial population using the elite population method;
    end for
    while (maxgen not reached and optimal solution not identified)
        // two parents are selected based on the selection probability
        Spop = selection(Psel, population);
        // single-point crossover is done on the selected parents
        Cpop = crossover(Pcr, Spop);
        // the chromosomes after reproduction are taken for immunization
        Immunization(Cpop)
        {
            // solutions that satisfy the primary and secondary constraints
            // are taken by immune selection
            ISel = ImmuneSelection(Cpop);
            // the resulting solutions are arranged according to the channel
            // utilization variable and vaccination is performed based on
            // the hop information matrix
            Vpop = Vaccination(ISel);
        }
        Evaluate the fitness of each Vpop;
        if optimal solution is identified then
            break; // come out of the while loop and print the solution
        else
            // replace the best few offspring into the initial population
            // and continue the loop
            Survival(initial population, Vpop);
        end if
    end while
}
3.2.1. Crossover
The modified crossover operator given by Chakraborty [3] is performed in the IGA. Crossover is
done on rows of the TDMA cycle. Rows from members of the population, i.e., different TDMA
frames, are selected using a predetermined crossover probability and are marked as members of
the mating pool. A pair of rows PR1 and PR2 is selected randomly from the mating pool for crossover.
The objective of crossover is to create an offspring with a better schedule for a time slot, with
more transmissions, by combining the parent schedules. The crossover operation creates only one child,
which may or may not replace a parent, depending on how good it is. First, PR1 is
combined with PR2 by a logical AND; then the logical exclusive OR of PR1 and PR2 is computed.
Both resulting rows are scanned from left to right: the first 1 encountered is copied to the same
position of the new row; each next 1 is checked against the corresponding row of the hop information matrix
and, if it does not create a conflict, it is copied to the new row, otherwise it is rejected. This
continues until the end of both rows, and a new row is created from PR1 and PR2. The
offspring may replace PR1 or PR2 based on the conditions given by Chakraborty [3], as sketched below.
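
A Python sketch of one plausible reading of this AND/XOR row merge; the exact scan order and the parent-replacement conditions of Chakraborty [3] are not fully specified here, and hm is a hop information matrix built as in the earlier sketch:

import numpy as np

def merge_rows(pr1, pr2, hm):
    """Row-crossover sketch: keep the transmissions common to both parents
    (AND), then scan the differing ones (XOR) left to right and accept a
    node only if HM shows no conflict with nodes already in the new row."""
    child = pr1 & pr2
    for node in np.flatnonzero(pr1 ^ pr2):
        if not hm[node, np.flatnonzero(child)].any():
            child[node] = 1
    return child

cm = np.zeros((6, 6), dtype=int)
for i, j in [(0, 1), (0, 2), (0, 3), (1, 4), (2, 5), (3, 4)]:  # placeholder
    cm[i, j] = cm[j, i] = 1
hm = ((cm + cm @ cm) > 0).astype(int)
np.fill_diagonal(hm, 0)
print(merge_rows(np.array([0, 1, 0, 0, 0, 1]), np.array([0, 0, 0, 0, 0, 1]), hm))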
3.2.2. Mutation
Random mutations alter a certain percentage of bits in the list of chromosomes. A single-point
mutation changes 1 to 0, and vice versa. Increasing the number of mutations increases the algorithm's
freedom to search outside the current region of the variable space. It also tends to distract the
algorithm from converging on a correct solution, so mutations are not allowed for the best solutions;
they are designated as elite solutions destined to propagate unchanged. Such elitism is very
common in GAs: why throw away a perfectly good solution? However, in previous algorithms the
solutions generated after crossover are taken for the mutation operation and random mutation is
done on the populations, which may change the best population. To avoid this, and to make the algorithm
knowledgeable, the immune concept is added to the genetic algorithm. A normal mutation operation is
not done in the IGA; instead, two knowledge-added steps are performed in place of mutation:
immune selection and vaccination.
A. Immune selection
The population newly created after crossover that satisfies the primary and secondary
constraints is selected for the reduction of time slots. The selected populations are stored in
a vaccine pool. From the vaccine pool, one population member is taken and the algorithm looks for a repeated
time slot in it; if one is found, the repeated time slot is deleted from the member.
This step is carried out for the remaining members in the vaccine pool. The resulting
slot-reduced populations are arranged according to the channel utilization variable and
stored in the vaccine pool, replacing the old populations.
B. Vaccination
Vaccination is used to improve fitness by modifying the genes of an individual with
prior knowledge, so as to gain higher fitness with greater probability. A chromosome from the vaccine pool
is taken for vaccination. The IGA identifies a node that transmits first in a time slot of the chromosome.
During the same time slot, some other node that does not create interference with the transmitting
node can be allowed to transmit. To perform this, a node is selected
randomly and checked against the hop information matrix for interference with the
currently transmitting node; if there is none, the node's value is mutated to one, allowing the selected node
to transmit in the same time slot. The genes of the selected chromosome are thus modified based on the
knowledge obtained from the hop information matrix of the given network. Hence, the vaccination
process increases the number of transmissions, as sketched below.
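
A minimal Python sketch of the two immune steps, with the same placeholder topology as before (the real operator also orders the vaccine pool by channel utilization):

import numpy as np

def immune_select(tm):
    """Immune selection sketch: delete repeated time slots (rows),
    shortening the TDMA frame."""
    _, keep = np.unique(tm, axis=0, return_index=True)
    return tm[np.sort(keep)]

def vaccinate(tm, hm, rng):
    """Vaccination sketch: in each slot, pick one silent node at random and
    switch it on if HM reports no conflict with the nodes already
    transmitting in that slot."""
    tm = tm.copy()
    for slot in tm:
        silent = np.flatnonzero(slot == 0)
        if silent.size == 0:
            continue
        node = rng.choice(silent)
        if not hm[node, np.flatnonzero(slot)].any():
            slot[node] = 1
    return tm

cm = np.zeros((6, 6), dtype=int)
for i, j in [(0, 1), (0, 2), (0, 3), (1, 4), (2, 5), (3, 4)]:  # placeholder
    cm[i, j] = cm[j, i] = 1
hm = ((cm + cm @ cm) > 0).astype(int)
np.fill_diagonal(hm, 0)
frame = np.vstack([np.eye(6, dtype=int)] * 2)  # every slot duplicated once
frame = vaccinate(immune_select(frame), hm, np.random.default_rng(1))
print(frame.shape[0], "slots,", int(frame.sum()), "transmissions")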
4 Solving BSP Using IGA: Simulation and Results
A series of simulations was carried out to evaluate the performance of the IGA in solving the broadcast
scheduling problem, in comparison with the finite state approach [1], GA [3], GA with collision-free
sets (GACFS) [4], mean field annealing [13] and the competent permutation genetic algorithm [14].
The performance of the GA and the IGA was tested a large number of times, using ad hoc
networks of different sizes. In the following sections, we discuss the simulation results with regard to the
number of nodes |N|, the number of time slots |M| and the computation time. The IGA was
tested on 100 randomly generated graphs, each representing an ad hoc network topology.
The simulation results are based on a population size of 50, a maximum of 500 generations,
a crossover rate of 0.30, a mutation probability of 0.001, and on two measures:
1. the tight lower bound, whose value Δ should be one;
2. the channel utilization variable, to find the improvement in the number of transmissions.
4.1 Results obtained from GA
The purpose of this simulation was to investigate the performance of the genetic algorithm for different
networks. The number of nodes taken for the simulation ranges from five to one hundred. Smaller
networks were solved with a higher number of transmissions within an acceptable number of generations. However,
for a 100-node network with 200 edges the optimal TDMA frame is identified only after 489
generations. The average number of generations for the 100-node network is 410. This has to be
decreased in order to reduce the execution time.
4.2 Results obtained from IGA
The simulation results based on the IGA are given in Table 1. Compared to the genetic
algorithm, the knowledge-added IGA improves the searching ability and adaptability and greatly
increases the convergence speed. During the vaccination process, the selected antigen is improved with
more transmissions, so the channel utilization is increased. Comparing the simulation results
of the IGA with the GA, the number of generations is reduced and the utilization of each network is improved.
Even for large networks, the solution is identified within an acceptable number of generations.
The values of Table 2 clearly imply that the IGA performs better than the standard GA. One main aim
is to reduce the number of time slots, so there is a steady decrease in the utilization index as the number
of nodes increases.
4.3 Optimum schedule performance with other methods
The channel utilization of IGA is high compared to that of the elite population genetic
algorithm given by Chakraborty [3], as shown in Fig. 4. Similarly, for networks with 14, 16
and 40 nodes the number of transmissions generated by IGA is close to that of GA [3], whereas
for large networks IGA improves on it by 20 to 30 transmissions on average, as also shown in
Fig. 4. Table 3 gives a detailed comparison between IGA and existing algorithms. It reports
the maximum number of transmissions produced by some recent algorithms for the given
numbers of nodes and time slots. The results indicate that for smaller networks the numbers
of transmissions differ slightly, whereas for a 100-node network the number of transmissions
generated by IGA shows an improvement of 10 to 20 transmissions.
Two benchmark problems discussed in [13] are solved using IGA and the results are compared
with other algorithms, namely the finite state machine based algorithm FSMA [1], the co-evolutionary
genetic algorithm for collision-free set GACFS [4], the gradual hysteretic noisy chaotic neural network
G-HNCNN [7], the mean field annealing algorithm MFA [13] and the competent permutation genetic
algorithm CPGA [14], as shown in Table 4. Problem instance #1 analyzes 30 nodes with 70 edges;
here the channel utilization is largely improved compared to the other algorithms, and the time
delay is likewise reduced considerably by IGA. Problem instance #2 analyzes 40 nodes with 66
edges and a maximum degree of 7; here the channel utilization is increased moderately and the
time delay is reduced by IGA. The results of the other algorithms G-HNCNN, GACFS, FSMA,
CPGA and MFA are taken from [7], [4], [1], [14] and [13]. GACFS and CPGA do not report the
average time delay, which is represented by a hyphen in Table 4.
The average time delay is calculated by

$$\eta = \frac{|M|}{|N|} \sum_{i=1}^{|N|} \left( \frac{1}{\sum_{j=1}^{|M|} TM_{ij}} \right) \qquad (5)$$
The average time delay η for each node represents the average availability of the network,
and a minimal η is very important for optimal Ad hoc network design; its evaluation is
described by Ming Sun et al. [7].
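Assuming TM is the |N| × |M| TDMA frame matrix with TM_ij = 1 when node i transmits in slot j (and that every node transmits at least once, so no row sums to zero), Eq. (5) can be evaluated as in this sketch:

```python
def average_time_delay(frame):
    """Eq. (5) sketch: frame[i][j] == 1 iff node i transmits in slot j.

    Assumes every node transmits at least once, so no row sums to zero.
    """
    n = len(frame)      # |N|, number of nodes
    m = len(frame[0])   # |M|, number of time slots
    return (m / n) * sum(1.0 / sum(row) for row in frame)
```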
Fig. 5 (a) shows the input graph topology for a 100-node network with 200 edges and an average
degree of 4, and Fig. 5 (b) shows the optimum solution produced by IGA. The solution is
identified with 9 time slots and 152 transmissions; compared to recent algorithms, the number
of transmissions produced by IGA is largely improved, which attains the aim of this paper.
Problem   Nodes   Avg. degree   Time slots |M|   Total transmissions   Channel utilization ρ   Computation time
1         14      3             6                24                    0.285                   0.5 sec.
2         16      3.6           5                23                    0.287                   0.5 sec.
3         40      4             8                67                    0.209                   2.1 sec.
4         100     4             10               152                   0.152                   5.2 min.
5         200     4             9                282                   0.157                   15.7 min.
6         300     3             10               489                   0.163                   36.1 min.
7         400     4             10               623                   0.156                   69 min.

Table 1: Simulation results of IGA with the total number of transmissions, channel utilization ρ,
and the computation time.
Problem   Nodes   Avg. degree   Time slots |M|   Channel utilization ρ
1         10      3             4                0.3
2         20      3.6           7                0.286
3         30      4             8                0.208
4         50      4             9                0.2
5         100     4             10               0.152
6         120     4             9                0.149

Table 2: Results obtained by IGA for varying numbers of nodes, numbers of time slots and
channel utilization ρ.
Nodes   Time slots |M|   IGA   GACFS   GA    Finite state approach   Competent permutation GA
15      8                20    18      17    20                      20
30      9                37    28      33    35                      37
40      8                69    65      65    64                      64
100     9                152   139     133   134                     136

Table 3: Comparison of time slots |M| and total number of transmissions using IGA with other
competitive algorithms.
Instance   Parameter         IGA     G-HNCNN   GACFS   FSMA     CPGA     MFA
#1         |M|               10      10        10      10       10       12
#1         ρ                 0.19    0.1233    0.093   0.1167   0.1233   0.1056
#1         Avg. time delay   7.54    8.83      —       9.2      —        10.5
#2         |M|               8       8         8       8        8        9
#2         ρ                 0.237   0.2125    0.203   0.200    0.200    0.197
#2         Avg. time delay   5.212   5.7056    —       6        —        6.9

Table 4: Comparison of time slots |M|, channel utilization ρ and average time delay of each
node by IGA with other competitive algorithms.
Figure 4: (a) Channel utilization vs. number of nodes, (b) Number of transmissions vs. number
of nodes for GA [3] and IGA
5 Conclusion
The basic genetic algorithm and the knowledge-added immune genetic algorithm were discussed
for improving the broadcast scheduling problem in Ad hoc networks. To our knowledge, this paper
is the first to apply an immune genetic algorithm to the broadcast scheduling problem. Compared
to GA, IGA actively aims at improving solutions, while GA blindly wanders over the search
space. The immune genetic algorithm obtains knowledge from the hop matrix during the
vaccination process and increases the number of transmissions within a reduced number of time
slots, in an acceptable computation time compared to recently proposed algorithms. The
simulation results confirm the advantages of IGA in terms of channel utilization, number of
generations and running time, and validate the effectiveness and efficiency of IGA for the
broadcast scheduling problem. Further research may be performed to improve the performance
of IGA by exploring variations of crossover and by altering the immune operator with refined
knowledge added to vaccination.
Figure 5: (a) Graph topology with 100 nodes, 200 edges and average link degree of 4; (b) Solution
found by IGA for the same network
Acknowledgement
We gratefully acknowledge the Department of Science and Technology, INDIA, for providing financial support to carry out this research work under the PURSE scheme.
Bibliography
[1] I. Ahmad, B. Al-Kazemi, and A.S. Das. (2008); An efficient algorithm to find broadcast
schedule in ad hoc TDMA networks, Journal of Computer Systems, Networks, and Communications, 12 : 1-10.
[2] Dingwei Wang, Richard Y.K. Fung, and W.H. Ip. (2009); An immune-genetic algorithm for
introduction planning of new products, Computers and Industrial Engineering, 56 : 902-917.
[3] Goutam Chakraborty. (2004); Genetic algorithm to solve optimum TDMA transmission
schedule in broadcast packet radio networks, IEEE Transactions on Communications, 52
(5) : 765-777.
[4] R. Gunasekaran, S. Siddharth, P. Krishnaraj, M. Kalaiarasan, and V. Rhymend Uthariaraj.(2010); Efficient algorithms to solve broadcast scheduling problem in WiMAX mesh
networks, Computer Communications, 33 : 1325-1333.
[5] Licheng Jiao and Lei Wang.(2000); A novel genetic algorithm based on immunity, IEEE
Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans, 30(5): 552-561.
[6] M. Liu, W. Pang, K.P. Wang, Y.Z. Song, and C.G. Zhou.(2006); Improved immune genetic
algorithm for solving flow shop scheduling problem. Computational Methods, 1057-1062.
[7] Ming Sun, Lin Zhao, Wei Cao, Yaoqun Xu, Xuefeng Dai, and Xiaoxu Wang.(2010); Novel
hysteretic noisy chaotic neural network for broadcast scheduling problems in packet radio
networks. IEEE Transactions on Neural Networks, 21(9).
[8] C. Y. Ngo and V. O. K. Li.(2003); Centralized broadcast scheduling in packet radio networks
via genetic-fix algorithms, IEEE Transactions on Communications, 51(9) : 1439-1441.
[9] Y. Peng, B.H. Soong, and L. Wang.(2004); Broadcast scheduling in packet radio networks
using mixed tabu-greedy algorithm, Electronics Letters, 40 (6) : 375-376.
[10] S. Ramanathan and E. L. Lloyd.(1993); Scheduling algorithms for multihop radio networks,
IEEE/ACM Transactions on Networking, 1(2) : 166-177.
[11] S. Salcedo-Sanz, C. Bousono-Calzon, and A.R. Figueiras-Vidal.(2003); A mixed neural-genetic algorithm for the broadcast scheduling problem, IEEE Transactions on Wireless
Communications, 2 : 277-283.
[12] Syam Menon.(2009); A sequential approach for optimal broadcast scheduling in packet radio
networks, IEEE Transactions on Communications, 57(3) : 764-770.
[13] G. Wang and N. Ansari.(1997); Optimal broadcast scheduling in packet radio networks using
mean field annealing, IEEE Journal on selected areas in Communications, 15 : 250-260.
[14] X. Wu, B.S. Sharif, O.R. Hinton, and C.C. Tsimenidis.(2005); Solving optimum TDMA
broadcast scheduling in mobile ad hoc networks: a competent permutation genetic algorithm
approach, IEE Proceedings: Communications, 152(6) : 780-788.
INT J COMPUT COMMUN, ISSN 1841-9836
8(1):30-36, February, 2013.
Outlier Detection with Nonlinear Projection Pursuit
M. Breaban, H. Luchian
Mihaela Breaban, Henri Luchian
"Alexandru Ioan Cuza" University of Iasi, Romania
E-mail: {pmihaela, hluchian}@infoiasi.ro
Abstract:
The current work proposes and investigates a new method to identify outliers in
multivariate numerical data, having its roots in projection pursuit. Projection pursuit
is basically a method to deliver meaningful linear combinations of attributes. The
novelty of our approach resides in introducing nonlinear combinations, able to model
more complex interactions among attributes. The exponential increase of the search
space with the increase of the polynomial degree is tackled with a genetic algorithm
that performs monomial selection. Synthetic test cases highlight the benefits of the
new approach over classical linear projection pursuit.
Keywords: outlier detection, nonlinear projections, genetic algorithms
1 Introduction
Mining for outliers is of great importance in many domains: fraud detection, disease identification, intrusion detection, fault diagnosis and so on.
Outliers or anomalies represent rare observations/events that deviate from the majority of
data either in magnitude or with respect to an overall pattern. Whatever the data, a statistical
model can be attached to it and consequently the data can be considered to be generated by a
statistical process. In this view, outliers correspond to very low probabilities under the underlying
distribution.
Outliers may sometimes simply occur due to erroneous recording of data and not due to
meaningful anomalous data which would correspond in the theoretical model to changes in the
generative process. Identifying them must be one of the first steps in data analysis as their
presence misleads many algorithms from the machine learning area solving tasks like clustering,
classification or regression.
This paper proposes and investigates a new framework for outlier detection derived from
the classical projection pursuit methodology, aiming at alleviating some of the drawbacks of the
standard approach. An analysis of the popular approaches for outlier detection highlights the
benefits of the new approach.
The material is structured as follows. Section 2 surveys existing computational methods
for outlier detection. Section 3 succinctly describes the framework of projection pursuit. The
method we propose is presented in section 4 and is empirically investigated in section 5, while
section 6 draws the conclusions.
2 Computational methods for outlier detection
We consider the unsupervised framework of outlier detection: there is no pre-specified generative model for the data and no labels are available to provide examples from which an algorithm
could learn.
Several surveys exist in literature that provide a state-of-the-art for outlier detection in this
framework [1–4]. Generally, the literature distinguishes among several classes of methods.
Most statistical methods for outlier detection are parametric: given a certain statistical
distribution, outliers are detected as those points with a low probability of being generated. In
the univariate case the Normal distribution is usually used, and outliers are considered to be
those observations that lie at a distance larger than k standard deviations from the mean. For
the multi-variate case the Mahalanobis distance is used to compute the distance of the
observations from the mean. The main drawback of this approach is that the parameters of the
distribution (mean and standard deviations) are computed from all observations, including
the possible outliers, and therefore may be highly biased. Non-parametric approaches based
on standard deviation identify the subset of observations whose exclusion determines the
highest decrease in variance; using any form of the Minkowski metric, the method can be
generalized to the multi-variate case. The drawback of this method is that the size of the search
space is exponential w.r.t. the number of observations. The first and the third quartiles are also
used to identify outliers in the univariate case.
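As a brief illustration of the parametric multi-variate rule just described, the following sketch flags observations whose Mahalanobis distance from the sample mean exceeds a threshold k; note that, exactly as the text warns, the mean and covariance are estimated from all observations, outliers included:

```python
import numpy as np

def mahalanobis_outliers(X: np.ndarray, k: float = 3.0) -> np.ndarray:
    """Flag rows of X whose Mahalanobis distance from the mean exceeds k.

    Illustrative sketch: parameters are estimated from all points,
    including possible outliers, which is the bias discussed above.
    """
    mu = X.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(X, rowvar=False))
    diff = X - mu
    d = np.sqrt(np.einsum('ij,jk,ik->i', diff, cov_inv, diff))
    return d > k
```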
Distance-based approaches [10–13] for outlier detection base their decisions on computing
the distances between each data point and its neighbors. These are multi-variate approaches.
The main advantage of these methods over statistical approaches is that no hypothesis on the
type of distribution is made, hence such methods are applicable without distribution-dependent
restrictions. These methods are computationally expensive due to the calculations of distances
between data points; therefore, various ways of scaling them up for large databases have been
proposed. Still, one important drawback is present: even if they do not make any assumptions
on the type of distribution they are not parametric-free methods. The result is very sensitive
to some user-tuned parameters like the radius of the neighborhood, the number of neighbors to
be used, or the threshold indicating the average distance to the neighbors above which a data
point is considered an outlier.
Clustering-based approaches employ an unsupervised clustering algorithm to identify groups
in data. In this context outliers are identified as unusually small groups in data. Popular
algorithms used in this context are hierarchical clustering methods (mostly the single-link variant)
and density based clustering methods like DBSCAN. The methods have the advantage of being
generally applicable to any distributions in data. Their drawbacks reside in high computational
costs requiring distance computations between data items and sensitivity to parameters - in case
of density-based methods.
All the above-mentioned multi-variate techniques take into account the entire attribute space.
Being based on distance computations, a drawback is inherent: in high-dimensional spaces the
ratio between the distances of the nearest and farthest neighbors to a given target is almost one,
making outlier detection an impossible task.
Projection pursuit is hardly mentioned in existing surveys on outlier detection. When it is,
attention is usually given to a particular exponent of this class of methods, Principal Component
Analysis (PCA), which is in fact a dimensionality reduction method that aims at preserving as
much as possible the variance in data, and not a method dedicated to outlier detection. Projection
pursuit can be used to identify subspaces of the original attribute space where outliers are present,
alleviating the mentioned drawback resulting from high dimensionality. This paper highlights
the important role projection pursuit can play for outlier identification and enhances the classical
methodology by introducing nonlinear projections.
3 Projection pursuit
A k-dimensional projection of a data set X ∈ R^{n×d}, consisting of n items described by d
numerical attributes, is a linear transformation involving k orthogonal vectors in the d-dimensional
space. These vectors form an orthogonal basis A ∈ R^{k×d}. The projection of X onto A is
the product Z = X · A^T, resulting in a new representation for each of the n data items in a
k-dimensional space.
Projection Pursuit (PP) [5] is a technique aiming at identifying low-dimensional projections
of data that reveal interesting structures. The framework of PP is formulated as an optimization
problem with the goal of finding projection axes that minimize/maximize a measure of interest
called projection index. The projection index can be formulated to identify subspaces where clusters are visible, linear combinations that discriminate between given classes or low-dimensional
views of data that reveal the presence of outliers. Depending on the formulation of the index
under maximization/minimization, analytical methods exist (the case of PCA), gradient-based
methods may be used (if the index is continuously differentiable), or probabilistic heuristics like
Hill Climbing or Simulated Annealing are employed.
The current work is conducted towards identifying single-dimensional views (one-dimensional
projections) of data that present outliers. These can be manually inspected, or simple statistical
rules (univariate analysis) can be applied to identify and exclude the outliers.
A popular index used to derive projections with high chances of containing outliers is the kurtosis,
defined as the fourth moment around the mean divided by the square of the variance:

$$\mathrm{kurt} = \frac{\sum_{l=1}^{n} \left( y^{(l)} - \mu \right)^4}{(n-1)\,\sigma^4} \qquad (1)$$
where µ is the mean and σ is the standard deviation of the single-dimensional projection. A
value close to 3 indicates a normal distribution. Higher values indicate the presence of extreme
deviations while lower values indicate bimodal distributions. In consequence, this index should
be maximized to derive projections containing outliers.
Other indices also exist but are more expensive computationally [6, 7].
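For concreteness, Eq. (1) can be computed for a one-dimensional projection as in the sketch below; for approximately normal data it evaluates to roughly 3, while projections containing outliers score higher:

```python
import numpy as np

def kurtosis_index(y: np.ndarray) -> float:
    """Projection index of Eq. (1) for a one-dimensional projection y."""
    mu = y.mean()
    sigma = y.std(ddof=1)  # sample standard deviation, matching the (n - 1) factor
    return float(((y - mu) ** 4).sum() / ((len(y) - 1) * sigma ** 4))
```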
One drawback of the classical projection pursuit approach is that linear projections, corresponding
to linear combinations of the original attributes, are not able to model complex generative
processes and consequently cannot detect the outliers in all cases. Such a case is illustrated
in Figure 1. To alleviate this drawback we extend the original framework to allow the derivation
of nonlinear projections. To this aim, we extend the data set by introducing new features built
as products of the original ones. Projection pursuit can then be conducted with the classical
methods proposed in the literature. Because the extension we propose increases the search space
considerably, we also design and investigate an optimization framework based on multi-modal
genetic algorithms, which allows searching simultaneously for the relevant attribute combinations
and the coordinates of the projection axis.
4 An algorithm for detecting outliers with nonlinear projections
Nonlinear projections can be performed after introducing into the analysis new features generated
as products of the original features. This is a standard approach in other data mining tasks
(i.e. regression and classification) for extending standard methods that derive linear models, but
it is hardly used in the field of projection pursuit in general and outlier detection in particular. One
drawback is inherent to this approach: the exponential increase of the number of new attributes
introduced into the analysis with the degree of the polynomial model.
If the number of original attributes in a data set is m, the number of monomials of degree 2
that can be introduced in the analysis is $m(m+1)/2$, while the number of monomials of degree 3
is $m(m+1)(m+2)/6$. In general, the number of monomials of degree i that can be formed on m
variables is $\binom{m+i-1}{i}$, and the number of attributes that can be introduced to derive
polynomials up to a certain degree k is

$$N_k = \sum_{i=1}^{k} \binom{m+i-1}{i} = \frac{(m+k)!}{m!\,k!} - 1.$$
To deal with the large number of features and speed up the projection pursuit algorithm
we incorporate a feature/monomial selection mechanism. To this aim a genetic algorithm is
designed that simultaneously selects good monomials and searches for good projection axes
within the selected monomial subspace. A multi-modal genetic algorithm is used in order to
allow for simultaneous exploitation of several good monomial subspaces. A candidate solution,
corresponding to a chromosome in the population, consists of two parts:
• a boolean string of length equal to the number of monomials up to a given degree k: value 1
marks the monomials to be included in the search for projections;
• a vector in the Euclidean space playing the role of the projection axis, corresponding in
fact to the numerical weights in the resulting polynomial transformation.
Such mixed representations are common in feature selection tasks solved with genetic algorithms
in the context of clustering [15, 16] and classification [14].
The projection $y^{(l)} \in \mathbb{R}$ of an item $x^{(l)} \in \mathbb{R}^m$ in the subspace encoded
by a chromosome is computed as follows:

$$y^{(l)} = \sum_{i=1}^{N_k} b_i \cdot w_i \cdot m_i^{(l)}$$

where $m_i^{(l)}$ is the i-th monomial, computed as a product over elements of $x^{(l)}$. Using
only a subset of the monomials to further conduct the search for axes is equivalent to assigning
weight 0 to the rest.
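A minimal sketch of this projection, reusing the index-tuple representation of monomials from the previous snippet; the names b (selection mask) and w (axis weights) correspond to the two chromosome segments and are our notation:

```python
import numpy as np

def project(x, monomials, b, w) -> float:
    """Sketch of y = sum_i b_i * w_i * m_i(x) for one observation x.

    x: 1-D array of length m; monomials: index tuples as produced by
    monomials_up_to above; b: 0/1 selection mask; w: projection-axis weights.
    """
    feats = np.array([np.prod(x[list(mono)]) for mono in monomials])
    return float(np.dot(np.asarray(b) * np.asarray(w), feats))
```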
For multi-modal search we use the Multi Niche Crowding GA (MNC) [8], an algorithm able to
maintain stable subpopulations within different niches, to preserve diversity throughout the search
and to converge to multiple local optima. MNC is a steady-state algorithm that implements
replacement based on pairwise comparisons.
Both the selection and replacement operators implement a crowding mechanism. Mating and
replacement within members of the same niche are encouraged while allowing at the same time
some competition for the population slots among the niches.
Selection for recombination takes place in two steps: one individual is selected randomly from
the population; its mate is the most similar individual from a group of size s which consists of
randomly chosen individuals from the population. The two chosen individuals are subject to
recombination operators and one offspring is created.
The individual to be replaced by the offspring is chosen according to a replacement policy
called worst among most similar : f groups are created by randomly picking g (crowding group
size) individuals per group from the population and one individual from each group that is most
similar to the offspring is identified; then, the one with the lowest fitness value among these
is replaced. In the original MNC algorithm the replacement is always performed, even if the
fitness of the offspring is lower than the fitness of the individual chosen to be replaced. In our
implementation we adopt a Simulated Annealing strategy: lower fitness survival is accepted with
a probability that decreases during the run of the algorithm.
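A sketch of the worst-among-most-similar replacement policy under these definitions; the function and parameter names are ours, and the similarity and fitness measures are passed in as callables:

```python
import random

def worst_among_most_similar(pop, offspring, f, g, hamming, fitness):
    """Choose the individual the offspring replaces (illustrative sketch).

    Builds f random groups of g individuals each, takes from every group
    the member most similar to the offspring (Hamming distance on the
    binary segment), then returns the least fit of those representatives.
    """
    representatives = []
    for _ in range(f):
        group = random.sample(pop, g)
        representatives.append(min(group, key=lambda ind: hamming(ind, offspring)))
    return min(representatives, key=fitness)
```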
The similarity between two individuals is computed based on the boolean part of the chromosome that encodes the monomial subspace; the Hamming distance is used.
Recombination between two chosen individuals consists of crossover that generates one offspring which is subsequently mutated.
Dedicated crossover and mutation operators are designed and applied to each of the two
segments of a chromosome.
The crossover operator applied to two chromosomes consists in fact of two operations performed independently on the two parts of the chromosomes. Uniform crossover is used on the
binary segment encoding the monomial subspace. On the numerical segment, crossover generates each gene of the offspring as a convex combination between the corresponding genes of the
parents.
Mutation is also applied in two distinct phases. Each gene in the binary segment is flipped
with a given probability which we call the binary mutation rate. To each weight in the numerical
segment corresponding to a selected monomial, a random value in the interval (-0.25, 0.25) is
added with a probability called the weights' mutation rate. The binary mutation rate is lower than
the weights' mutation rate in order to encourage better exploitation of a given subspace for
optimal projection axes.
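The two recombination phases just described might look as follows; the chromosome layout (a dict with binary mask 'b' and weight vector 'w') and the numeric rates are illustrative assumptions, not the authors' exact settings:

```python
import random

def crossover(p1, p2):
    """Uniform crossover on the binary part, convex combination on weights."""
    b = [random.choice(pair) for pair in zip(p1['b'], p2['b'])]
    lam = random.random()
    w = [lam * w1 + (1 - lam) * w2 for w1, w2 in zip(p1['w'], p2['w'])]
    return {'b': b, 'w': w}

def mutate(ind, bin_rate=0.01, weight_rate=0.5):
    """Flip binary genes at bin_rate; perturb weights of selected monomials
    by a random value in (-0.25, 0.25) at weight_rate (rates are illustrative)."""
    ind['b'] = [1 - g if random.random() < bin_rate else g for g in ind['b']]
    ind['w'] = [w + random.uniform(-0.25, 0.25)
                if (sel and random.random() < weight_rate) else w
                for w, sel in zip(ind['w'], ind['b'])]
    return ind
```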
Before evaluation, the weight vector in the selected subspace of the offspring is normalized
to unit length. The evaluation consists in computing the projection of all data items on the axis
given by the weight segment of the chromosome in the encoded monomial subspace, followed
by the computation of a projection pursuit index dedicated to detecting clusters in data. The
index we use is the kurtosis. We choose to maximize this index mainly because of its reduced
computational complexity compared to other proposed indices for cluster detection. When the
projection is normalized to mean 0 and variance 1 the index consists in summing up the values
raised to the fourth power, divided by the total number of items. Usually, projection pursuit is
preceded by a linear transformation on the data called sphering that guarantees that every linear
projection is distributed with mean 0 and standard deviation 1, eliminating the need for further
normalization. A sphering procedure is also applicable in our case after all monomials up to degree
k are added to the original data. However, in our experiments we normalize each projection prior
to computing the kurtosis.
Without a multi-modal search scheme, outliers would have to be identified and eliminated
incrementally: based on linear combinations at first, then monomials of degree 2, 3 and of higher
order can be introduced iteratively. The multi-modal algorithm allows several single-dimensional
projections maximizing the index to be returned in one run. This brings some advantages: in
a standard heuristic only one solution is returned, so identifying all outliers requires iteratively
excluding the identified outlier and executing the algorithm anew. The benefits of multi-modal
search were recently highlighted in the context of linear PP [9].
5 Experiments
Figure 1 represents a data set containing 100 observations in a two-dimensional space. It is
illustrative for the drawback of classical PP: linear projections of data on one axis are not able
to identify the interior outliers.
The parameters of the new method are set as follows: s = 0.15 · pop size, g = 0.10 · pop size,
f = 0.15 · pop size. The mutation operator is applied at different rates on the two segments of a
chromosome: approximately one mutation per 10 iterations is applied on the binary segment,
while 1 mutation per iteration is applied on the numerical segment; using a steady-state scheme,
only one offspring is generated and evaluated at each iteration. The population consists of 50
individuals, randomly initialized: on the binary segment approximately 4 monomials are selected
(set to 1), while on the numerical segment the values are generated in the interval [-1, 1].
The algorithm was executed at first with monomials of degree one, to simulate one run of
a classical PP algorithm: the search is conducted in the original feature space. The black line
in Figure 1 a) represents the projection axis generating a linear combination of attributes of
maximum kurtosis, returned by our method in this step. Figure 1 b) represents the histogram
of the data under this linear combination of maximum kurtosis: the exterior outlier appears at
the left while the interior outliers get mixed with the rest of the data.
Without excluding the identified outlier the algorithm was executed again including in the
search space all monomials of degree 2. Figure 1 c) represents the distribution of the nonlinear
combination (of maximum kurtosis) of the same data derived with our method: the three interior
outliers can be identified at the right, outlier "1" being at the extreme, followed at the left by
outliers "2" and "3". As a second test case we are interested in the ability of our method to detect
the outliers when noise attributes are introduced. To this aim, 3 uniformly-distributed attributes
Outlier Detection with Nonlinear Projection Pursuit
35
are added to the data set in Figure 1, the result consisting in 100 observation in a 5-dimensional
space. Figure 2 presents the results of two runs of our algorithm: the linear projection (b) is
similarly oriented in the original space, with the outlier at the left and the nonlinear projection
(b) identified the three interior outliers at the right.
Figure 1: A synthetic data set containing 4 outliers a) The projection axis detected with classical
PP is drawn in black; b) The histogram of the linear projection detected with classical PP: only
the exterior outlier can be identified; c) The histogram of the nonlinear projection returned with
new method: the three interior outliers appear at the right
Figure 2: A synthetic data set containing 4 outliers in a 2-dimensional space and 3 more
uniformly-distributed attributes a) The projection axis detected with classical PP is drawn in
black; b) The histogram of the linear projection detected with classical PP c) The histogram of
the nonlinear projection returned by the new method
6 Conclusions
The paper proposes an extension of the classical linear projection pursuit framework in the
context of outlier detection. By creating nonlinear projections of data, the new method is capable
of modeling diverse generative processes and of identifying outliers which linear projection pursuit
does not detect. The proposed method compares positively to distance-based and density-based
approaches: it provides results which are not altered by the presence of many uniform/gaussian
attributes, as happens with the other approaches, where distance computations over the entire
space of attributes are performed. This is because PP intrinsically performs subspace/attribute
selection and, moreover, our method deals with monomial selection explicitly. At the same time,
the (non)linear combinations of attributes identified to contain outliers can provide useful
explanatory information to the user on the nature/source of the outliers. Excepting the univariate
analysis performed in the last step for outlier exclusion, our method is parameter-free. Scaling up
to very large databases is favored by the fact that the proposed algorithm can be easily
parallelized: as the most demanding step is projecting the entire data set on a given axis, this
operation can be executed in parallel for distinct observations.
Acknowledgement
This work was supported by POSDRU/89/1.5/S/63663 CommScie grant.
Bibliography
[1] H.-P. Kriegel, P. Kröger, A. Zimek, Outlier Detection Techniques, Tutorial at 16th ACM
SIGKDD Conference on Knowledge Discovery and Data Mining, Washington DC, 2010.
[2] V. Hodge, J. Austin, A Survey of Outlier Detection Methodologies, Artif. Intell. Rev.,
22(2):85-126, 2004.
[3] Irad Ben-Gal, Outlier detection, In: Maimon O. and Rockach L. (Eds.), Data Mining and
Knowledge Discovery Handbook: A Complete Guide for Practitioners and Researchers,
Kluwer Academic Publishers, 2005.
[4] V. Chandola, A. Banerjee, V. Kumar, Anomaly detection: A survey, ACM Comput. Surv.,
41(3), Art. 15, 2009.
[5] J. H. Friedman and J. W. Tukey, A projection pursuit algorithm for exploratory data analysis,
IEEE Trans. Comput., C23(9):881-890, 1974.
[6] Stahel, W. A., Breakdown of covariance estimators, Research report 31, Fachgruppe für
Statistik, ETH Zürich, 1981.
[7] J.H. Friedman, Exploratory Projection Pursuit, J AM STAT ASSOC, 82(1):249-266, 1987.
[8] V. Vemuri and W. Cedeńo, Multi-Niche Crowding for Multimodal Search. Practical Handbook
of Genetic Algorithms: New Frontiers, Ed. Lance Chambers, vol.2, 1995.
[9] A. Ruiz-Gazen, S. L. Marie-Sainte, and A. Berro, Detecting multivariate outliers using projection pursuit with particle swarm optimization, Proc. of COMPSTAT2010, 89-98, 2010.
[10] Knorr, E.M. and Ng, R.T., A unified approach for mining outliers, Proc. Conf. of the Centre
for Advanced Studies on Collaborative Research (CASCON), Toronto, Canada, 1997.
[11] Knorr, E.M. and Ng, R.T., Finding intensional knowledge of distance-based outliers, Proc.
Int. Conf. on Very Large Data Bases (VLDB), Edinburgh, Scotland, 1999.
[12] Angiulli, F. and Pizzuti, C., Fast outlier detection in high dimensional spaces, Proc. European Conf. on Principles of Knowledge Discovery and Data Mining, Helsinki, Finland, 2002.
[13] Hautamaki, V., Karkkainen, I., and Franti, P.. Outlier detection using k-nearest neighbour
graph, Proc. IEEE Int. Conf. on Pattern Recognition (ICPR), Cambridge, UK, 2004.
[14] A. Sierra, High-order Fisher’s discriminant analysis, Pattern Recognition, 35(6):1291-1302,
2002.
[15] J. Handl, J. Knowles, Feature subset selection in unsupervised learning via multiobjective
optimization, Int. J. of Computational Intelligence Research, 3:217-238, 2006.
[16] M. Breaban, H. Luchian, A unifying criterion for unsupervised clustering and feature selection, Pattern Recognition, 44(4):854-865, 2011.
INT J COMPUT COMMUN, ISSN 1841-9836
8(1):37-41, February, 2013.
Bio-inspired Sensory Systems in Automata for Hazardous
Environments
L. Canete
Lucio Canete
Universidad de Santiago de Chile
Av. Ecuador 3769, Estación Central,
Santiago de Chile, Chile.
E-mail: [email protected]
Abstract: Every automaton in dynamic and complex environments requires sensory
systems with an appropriate level of attention on the hazardous environment. This
property in any efficient automaton is analogous to that observed in animal sensory
systems. In this context, it is noted that to ensure its viability, the sensory systems of
animals must maintain a continuous state of alertness or attention to the environment.
However, the state consumes energy so it is impossible to keep a constant level over
time. In this regard, biologists have designed models for explaining the variation in
the level of surveillance in two vital activities of animals: Work and Rest.
In an alternating pattern between Work and Rest, the Attention Level V(t) declines
while the animal works and increases while it rests. For each of the two states there is one
relation: dV/dt = −α · V while working and dV/dt = β · (1 − V) while resting. In this model
α is the loss rate of surveillance, which depends on the difficulty of the work, and β is the
recovery rate, which depends on the quality of rest. In the case of automata, this phenomenon
is analogous to that observed in the Animal Kingdom. Even if automatic machines have
relief structures to monitor their environments, they always require that their sensory systems
recover their alertness after being hit by the inexorable entropy. If the task is hard (α is
large), the Attention Level decreases rapidly. Once the level has dropped below a
threshold of tolerance, it must be recovered. If rest is poor, the automaton will take
a lot of time to achieve the desired level. Obviously, machines do not rest, but in
analogous terms, this phenomenon is emulated by means of maintenance activities.
Parameter β represents the quality of these maintenance activities. This model has been tested
with computer simulations to study the performance of automatic machines in hostile
environments.
After the tests, it was possible to quantify α and β for each kind of task-environment and
each kind of maintenance. The bio-inspired model proved to have explanatory and
predictive applications for the conquest of hostile scenarios by means of automata.
Indeed, it is an interesting conceptual tool for increasing the performance of machines.
Keywords: attention level, model, performance, emulation, automata.
1 Introduction
In the continuing quest to provide automata with new qualities that improve productivity in crisis
scenarios, this work uses the Market of Ideas, in the sense that each field of knowledge is not
self-sufficient and therefore must rely on others to import ideas [1].
Biology is in this paper the source of ideas. This science shows that over the 3.8 billion years
since life is estimated to have begun to appear on Earth, evolution has resolved many of nature’s
challenges leading to lasting solutions with maximal performance using minimal resources [2].
Can Biology export ideas toward Automation? Of course; there are many examples, but the
most representative is the robot. This device is a non-living thing with some qualities emulated from
humans, manufactured to replace them [3]. After all, Biology and Automation have something
in common: the study of entities capable of operating on their own.
In the same way, both automata and living beings have in common a vital component for
interacting with environments: a sensory system [4]. In this context, biological studies about
the behavior of this system in living things, particularly animals, can be useful for improving
the behavior of automata.
Regarding these assumptions, the present work examines how to model in an automaton the
level of attention on the hazardous environment. The desired model can contribute to better
management of robots and other automata in crisis scenarios, especially where Automation
has no satisfactory solutions [5]. To reach this goal, this paper identifies a requirement from
Automation, then searches for the right supply in Biology and finally applies the idea in a real
situation.
2 A requirement from Automation
To face complex and dynamic environments, any human organization must maintain a
high level of attention to the outside world to ensure its viability [6]. This requirement is stronger
when the environment is adverse because of either natural or cultural variables.
When performing a continuous task, this level decays because the central nervous system of a
human being cannot sustain a high quality of information processing for a long time [7].
Technology then allows human beings to be replaced by automata for watching hazardous
environments.
Nevertheless, when automata perform tasks, they reduce their level of attention too.
Indeed, automata are made of structures which obey the law of entropy [8]. So, in a metaphorical
sense, automata get tired. This tiredness emerges when their sensory systems show a gradual
reduction in performance, or vigilance decrement. In fact, the optical and mechanical pieces of the
sensory system are exposed to environmental aggression such as weathering or shocks.
Is it possible to know in real time how tired the automaton is? What can a manager do
knowing this level? This kind of question can be answered by Biology.
3 A supply of Biology
Any decay of this level may involve a loss of information and thus jeopardize the viability of
the automaton.
Biologists note that subjects that perform a continuous and difficult task (such as working in
crisis scenarios) "show a gradual reduction in performance or vigilant decrement" [9]. Vigilance
implies a general state of alertness that results in enhanced processing of information by the brain.
Performing a difficult task reduces the quality of information processing, and this reduction
results in decreased performance over time [10]. After performing a task, animals rest and recover
their vigilance. This recovery depends on the quality of rest.
Regarding these phenomena, ecologists assumed that a forager under risk of predation and at
the same time looking for food alternates between two short periods of activity and rest [11]. This
model explains the variation of the Attention Level V(t):
$$\frac{dV}{dt} = \begin{cases} -\alpha\,V & \text{while foraging} \\ \beta\,(1-V) & \text{while resting} \end{cases} \qquad (1)$$

α: rate of vigilance decrement, positive and associated with task difficulty;
β: rate of vigilance recovery, positive and associated with quality of rest.
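A simple Euler-integration sketch of Eq. (1), alternating Work and Rest phases; the step size, cycle structure and all names are our choices for illustration:

```python
def simulate_attention(alpha, beta, work_time, rest_time, cycles,
                       v0=1.0, dt=0.01):
    """Euler integration of Eq. (1), alternating Work and Rest phases."""
    v, trace = v0, []
    for _ in range(cycles):
        for _ in range(int(work_time / dt)):   # working: dV/dt = -alpha * V
            v += -alpha * v * dt
            trace.append(v)
        for _ in range(int(rest_time / dt)):   # resting: dV/dt = beta * (1 - V)
            v += beta * (1 - v) * dt
            trace.append(v)
    return trace

# e.g. a hard task with poor maintenance: attention saw-tooths downward
# trace = simulate_attention(alpha=0.8, beta=0.3, work_time=2, rest_time=1, cycles=5)
```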
Figure 1: Graphic representation of the biological model
4 The imported model
Does an automaton get tired? In a metaphorical or analogous sense: yes [12]. After any task,
the automaton will be deteriorated, and the more deteriorated it is, the lower its Attention
Level will be. In management, it is important to know this level because, as with persons, it
makes it possible to allocate the workload. If the automaton is very tired, its sensory system
is probably not in an optimal state and the quality of its survey will be unsatisfactory, so it is
not advisable to deploy it.
How does the automaton recover a satisfactory Attention Level? By stopping the task
and starting maintenance.
How fast does the Attention Level decrease and recover? These speeds are determined by α and
β respectively.
How can managers determine these parameters? Each task and each maintenance has its
own α and β respectively, both measurable.
To obtain numerical indicators, the phenomenon was studied in an autonomous vehicle used
to survey (collect field data on) environmental variables in Patagonia. The rover in Figure 2
was programmed to cycle through different sections of equal length, in the same direction, at
the same speed and with all other conditions ceteris paribus, the sections differing only in
terrain roughness, measured as the vertical dispersion of the surface around an imaginary
horizontal line, sampled longitudinally every 0.1 m. While the vehicle made these tours, it was
assumed to be exposed to a rigorous entropy field affecting its mechanical and optical components.
When the automata faced irregular geomorphology and bad weather, the Attention Level
declined rapidly because of the environmental aggression on mechanical and optical devices, so
their performance was unsatisfactory. In friendly environments the Attention Level declined
slowly, requiring only light maintenance to regain the desired level. This behavior is shown in
Figure 3.
Figure 2 shows how the percentage of achievements (identification of cryptic elements)
decreased because of the reduced Attention Level.
After many tests, it was possible to quantify α and β for each kind of task-environment and
each kind of maintenance. With this knowledge, managers took new and better decisions, for
example setting a minimum tolerance Vo for the Attention Level: whenever the level fell below
it, they rejected the work of the robots.
The bio-inspired model proved to have explanatory and predictive applications for the conquest
of hostile scenarios by means of automata. Indeed, it is an interesting conceptual tool for
increasing the performance of machines.
Figure 2: The rover which "worked" in Patagonia
Figure 3: Variation observed of Attention Level in two kinds of environments in Patagonia
5 Conclusions
Given the syntax, semantics and praxis of the model, it was possible to test it with computer
simulations to study the performance of automatic machines in hostile environments. Besides,
there were tests in different terrains, most of them in extreme zones of Chile, to study the
quality of the information gathered by surveyor robots. When the automata faced irregular
geomorphology and bad weather, the Attention Level declined rapidly because of the environmental
aggression on mechanical and optical devices, so their performance was unsatisfactory. In friendly
environments the Attention Level declined slowly, requiring only light maintenance to regain the
desired level. The behavior observed confirmed the hypothesis of this work.
This model inspired by biological phenomena has an explanatory utility and a predictive use
because it can measure the level of difficulty of tasks and the recovery rate (α and β respectively)
to forecast the decay of the productive factor of interest. Given known values of α and β and several
observations of performance, a statistical procedure can be applied to discover the function
V = f(α, β), a mission that the author of this trial has begun to develop.
The inspiration drawn from living phenomena may have further analogues in inert bodies
and therefore opens an interesting line of research that contributes to better management of
productive factors.
Bibliography
[1] Morin E., Introduction à la pensée complexe, Editions du Seuil, 2005.
[2] Ovchinnikov Y., Basic Tendencies in Physico-Chemical Biology, Mir Publisher, 1987.
[3] Siciliano B., Sciavicco L., Villani L., Oriolo G., Robotics: modelling, planing and control,
Springer, 2010.
[4] Maturana H., Varela F., De máquinas y seres vivos, Editorial Universitaria, 1994.
[5] Siegwart R., Nourbakhsh I., Scaramuzza D., Autonomous mobile robots, The MIT Press,
2011.
[6] Pérez J., Design and diagnosis for sustainable organizations: the viable system method,
Springer, 2012.
[7] Mason P., Medical neurobiology, Oxford University Press, 2011.
[8] Atkins P., Four laws that drive the universe, Oxford University Press, 2007.
[9] Dukas, R., Constraints on information processing and their effects on behavior, The University
of Chicago Press,1998.
[10] Gendron R., Staddon J., Searching for cryptic prey: the effects of search rate, American
Naturalist, ISSN 00030147, 121: 172-186, 1983.
[11] Parasuraman R., Mouloua M., Interaction of signal discriminability and task type in vigilance
decrement, Perception, ISSN 0301-0066, 41: 17-22.
[12] Gleich P., Pade C., Petschow C., Pissarskoy E., Potentials and Trends in Biomimetics,
Springer, 2009.
INT J COMPUT COMMUN, ISSN 1841-9836
8(1):42-49, February, 2013.
Datastores in Cloud Governance
A. Copie, T.-F. Fortiş, V.I. Munteanu
Adrian Copie, Teodor-Florin Fortiş,
Victor Ion Munteanu
1. West University of Timişoara
Romania, Timişoara, bvd. V.Pârvan 4, and
2. Institute e-Austria, Timişoara
Romania, Timişoara, bvd. V.Pârvan 4
[email protected],[email protected]
[email protected]
Abstract:
The Small and Medium Enterprises benefit now, due to the large-scale adoption of Cloud
Computing, from an emerging market where they can associate and collaborate to
form virtual enterprises or virtual clusters, aiming to compete with the large enterprises
and provide tailored IT solutions for their customers. However, the lack of standardization
for cloud services and technologies leads to a myriad of different components that cannot
easily be set to work together in the absence of a real cloud governance solution. Cloud
governance acts like a catalyst allowing Small and Medium Enterprises to easily manage
and optimize their services infrastructure, and facilitates collaboration in a clustered or
virtual-enterprise environment.
We have proposed a Cloud Governance architecture based on mOSAIC's multi-agent
Cloud Management solution. The Cloud Governance solution relies on various datastores
that are responsible for maintaining and managing a set of crucial data used during the
cloud governance process. Our paper focuses on analyzing and emphasizing the requirements
that must be fulfilled by different database systems in order to have a reliable storage
system, and also suggests a concrete solution.
Keywords: Cloud Computing, Cloud Governance, Datastores, Databases
1 Introduction
In recent years, Cloud Computing has become a common paradigm, due to its straightforward
resource provisioning, dynamic scalability, service orientation and simple pay-as-you-go financial
model. Along with these characteristics, [1] identifies others, like strong fault tolerance, loose
coupling, virtualization, ease of use and the link with the business model, which lead to the
development of new business models [2], [3].
The selected architectural model allows SMEs to collaborate and associate in virtual enterprises
or virtual clusters and to expose services in direct competition with the large enterprises.
However, there are still issues to overcome in order to benefit from this kind of collaboration,
due to the diversity of services offered by the cloud providers and the lack of standardization.
Efforts have been made in this direction, and different solutions that aim to abstract the
characteristics of some existing cloud providers have been released: mOSAIC
(http://www.mosaic-cloud.eu), CloudFoundry (http://cloudfoundry.org), Morfeo 4CaaSt
(http://4caast.morfeo-project.org), ActiveState Stackato (http://www.activestate.com/stackato),
OpenShift (http://openshift.redhat.com/app/), Reservoir (http://www.reservoir-fp7.eu/),
SLA@SOI (http://sla-at-soi.eu/) and many others. Even if they are a step forward in the
process of cloud services standardization, [4], [5] and [6] suggest
complementary services related to Cloud Management, which ensure that the resources in
the cloud are used optimally and properly interact with the users and other services. Taking
into account the growth rate of cloud services, [7], [8], [9] reveal the need for better integration
and the demand for a mechanism complementary to Cloud Management: Cloud Governance.
This is an evolutionary step in the Service Oriented Architecture (SOA) which makes potential
consumers aware of the existence of the cloud services. Due to these similarities, the awareness
mechanism relies on different specialized datastores which hold critical information and which
have various requirements in order to function correctly and offer reliable data. However,
according to [11], there must be a clear separation between the management and governance
processes, because they describe different activities, have distinct goals and involve different
organizational structures.
2 The mOSAIC Project
'Open source API and Platform for multiple Clouds' (mOSAIC) is an FP7-ICT project that
aims to provide an open-source platform and an API which abstract the particularities of various
cloud providers and encourage application development based on the cloud-programming
paradigm. Its main components are the Cloud Agency, an embedded, agent-based Cloud
Management solution designed to negotiate cloud resources and provide them to the second
component, the mOSAIC platform, which consumes them by offering a cloud-oriented
application development framework.
3 Cloud Governance
Our proposed Cloud Governance architecture is based on various proposals, including Distributed
Management Task Force (DMTF) white papers [6], [10], and is built around mOSAIC's Cloud
Agency. The Cloud Agency is a core Cloud Management component of the mOSAIC platform
that exposes its functionality as a service consumed inside our Cloud Governance component
(Figure 1).
Figure 1: Cloud Governance
The architecture reveals several distinct components: the Cloud Management solution based
on mOSAIC's Cloud Agency, the Cloud Governance Bus, the Cloud Governance functional
modules, and the datastores used to persist various data during the governance process. The
Cloud Agency's main role is to assure the management of the resources, but also to perform
various SLA monitoring activities. The messages and data exchanged between the cloud services
pass through the Cloud Governance Bus. The governance process itself is realized by four
specialized agencies, namely Service Management, Security Management, Audit Management
and Governance Management. Service Management takes care of the lifecycle of the services
registered in the cloud governance environment. Security Management handles the identities
of the service consumers and provides the security tokens used to access the requested services.
The Audit Management agency processes the audit information obtained from the cloud
management component and at the same time allows access to a wide range of logs generated
by the interacting components. Finally, Governance Management coordinates the entire
governance activity based on a set of rules, policies and settings.

Table 1: General requirements for the datastores in Cloud Governance

Crt. No.   Requirement
S1         Cost optimisation, by trying to find solutions that minimize the storage and virtual machine costs in the cloud through intense multiplexing
S2         High performance in terms of throughput, small latency, and scalability independent of the data size and the dynamics of the workload
S3         Security (confidentiality, integrity, privacy)
S4         High availability
D1         Simple internal API, exposing a relatively small number of methods to be used by the governance agents
4 Datastores in Cloud Governance
Every cloud governance agency handles specific data, being in direct relation with a dedicated
datastore, namely the Service Datastore, Security Datastore, Audit Datastore and Governance
Datastore.
Every datastore is in charge of keeping information crucial to the functionality of the entire
system, which is also very sensitive from the privacy and confidentiality point of view:
credentials, contracts, partners, policies, etc. At the same time, the data storage system must
be very responsive with respect to the time taken to perform various operations over the data
set and must also offer good performance in terms of bandwidth, to accommodate the processed
data flow. This is why the data storage systems must fulfil some general requirements, as shown
in Table 1.
The general requirements for the data storage systems have been divided into two categories:
requirements related directly to the intrinsic characteristics of the storage, like cost optimization,
high performance, high availability and high security, and requirements related to the development
phase, like a very simple and intuitive API. These requirements apply to all the datastores
inside the governance environment; from case to case, there are other specific requirements
that will be exposed in the appropriate section.
4.1 Governance Datastore
Cloud governance implies the use of various policies related to the way in which the cloud
services interact with the consumers, together with different types of constraints that limit the
access to the underlying resources or even the functionality of the building blocks of the
governance process. Several types of policies are administered by the Cloud Governance
component, like Service Level Agreements (SLAs) and Service Level Objectives (SLOs), which
represent the cornerstone of the cloud governance process [4], [6]. The relevant constraints
involved in the decision-taking process are deployment constraints, data residency constraints,
auditability constraints and security constraints. After two partners, namely the provider and
the consumer, agree on a set of constraints, these constraints will further govern the interaction
between the entities involved in the contract.
The Governance Management Agency coordinates the entire activity performed by the Cloud
Governance component based on the different policies, constraints and system settings stored
in the Governance Datastore, like access policies, virtual machine settings, security credentials,
interaction policies, etc.
Every cloud service provider has its own policies and constraints that will be published and
used during the service lifecycle. Some of the policies and constraints can be edited by the
service consumers in order to customize them, like access control lists; others cannot, like
Quality of Service; all of them are part of the request and offer operations. The information
represented by the policies and constraints is usually structured, being related to the guaranteed
functional parameters of the services, security policies, configuration parameters of the virtual
machines on which the agents are executed and many more.
The data persisted in the Governance Datastore is critical to the whole governance process;
therefore the requirements on the storage system are also in terms of high availability and
reliability. Some of the data is very sensitive, so a level of encryption must be added. At the
same time, this datastore must maintain relations with other storage systems, which recommends
a flexible graph database for holding the information related to governance. This choice assures
high performance in terms of record processing, allowing the database to be installed either in
the private cloud on commodity servers or in the public cloud through virtual machines.
4.2 Services Datastore
This is the most complex datastore taking part in the cloud governance process. It can be seen
as a service in itself, offering essential interfaces for the cloud services that want to register with
the Cloud Governance component and then be discovered and used, together with additional
features related to offers, contracts, audits and billing. Figure 2 depicts the schematic of the
Service Datastores.
Figure 2: Services Datastore and their internal relationships
The Service Datastores, through their specific mechanisms and the information that compounds
them, facilitate the service publishing and discovery processes. As in the case of SOA Governance,
Cloud Governance relies on a Service Repository model in which the services that want to offer
their functionality must register and must offer methods to be discovered by the consumers.
This datastore is also tied to global catalogues that maintain information about the system's
customers and partners. The character of the information contained in the Service Datastore
is extremely heterogeneous, holding many kinds of specific information, grouped in several
sub-components (a minimal sketch of the corresponding record shapes follows the list of
database choices below):
• Service Descriptors. The services' functional and non-functional requirements are described
through service descriptors, persisted inside the Descriptors component. This is usually
structured information; the service descriptors basically contain information of the same
type and must be highly query-able in order to offer flexible information about the contained
services.
• Semantic Definitions. Besides the syntactic description stored inside the Descriptors component, the cloud services are described and defined from a semantic point of view through semantic descriptors stored in the Semantic Definitions sub-component.
• Offers. The relation between provider and consumer usually benefits from a resource model that describes what the service can offer in terms of functionality. Service templates describe in a generic form what a provider can offer; when the template contains information about a specific provider, it becomes an offer and is persisted for future use in the Offers sub-component. Because the information is based on generic templates, it can be structured, so an appropriate storage system can be used.
• Contracts. When a consumer looks for a specific service and discovers it in the Services Datastore, it consults its offer and agrees with the SLAs; if it decides to use that service, the offer becomes a contract, which is stored inside the Contracts sub-component. For a specific service, more than one contract may exist.
• Instance Sets. After a cloud service is contracted, it is instantiated in order to be effectively used, and additional information called a Service Instance is generated. This information is an aggregate of trading, billing, and deployment information and is stored in a separate sub-component of the Services Datastore called Instance Sets.
• Running Instances. The information about the running instances of the cloud services is very much the same as that contained in the Service Instances and is persisted in the Running Instances sub-component.
• Billing. For every contract, billing information is generated in accordance with the parameters agreed during the contracting phase; this information is located in the Billing sub-component.
Taking into account the format of the data persisted in the individual database components of the Services Datastore, together with the general requirements already exposed, the choices in terms of database types are:
• a NewSQL database for the Service Descriptors database;
• an RDF datastore for the Semantic Definitions database, in order to benefit from the specialized SPARQL query language;
• a graph database for the Offers, Contracts, Instance Sets, Running Instances, and Billing databases.
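As an illustration of the RDF choice for the Semantic Definitions database (see the second item above), the following sketch stores one semantic service description and retrieves it with SPARQL using the rdflib library; the ex:providesCapability vocabulary and service identifiers are made-up examples, not terms from the paper.

# A minimal sketch with an in-memory RDF graph.
from rdflib import Graph, Literal, Namespace, URIRef

EX = Namespace("http://example.org/services#")
g = Graph()
g.add((URIRef(EX["storage-svc"]), EX.providesCapability, Literal("object-storage")))

# Discover every service that declares a given capability.
results = g.query(
    """
    PREFIX ex: <http://example.org/services#>
    SELECT ?svc WHERE { ?svc ex:providesCapability "object-storage" . }
    """
)
for row in results:
    print(row.svc)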
4.3 Security Datastore
Cloud Governance involves distinct entities, components, and agents running on behalf of various parties; therefore security is one of the most important concerns in assuring a safe and trustworthy environment. Inside the Cloud Governance system we distinguish two security planes: the security of the cloud services, usually assured by the service providers, and the intrinsic security of the Cloud Governance component.
Access to the cloud services must be done in a secure way, with only authorized consumers having the rights to use the provided functionalities, sometimes in a granular way through various access rights and security policies. The Security Management Agent is responsible for credentials management during the authentication and authorization process, together with providing the security tokens used to access different cloud resources after successful authentication. All the credentials and security policies governing the cloud services in the system are stored in the Security Datastore.
In most cases, the services are offered on a pay-as-you-go scheme and are accessed by providing credentials in the form of a user name and password, or security keys. This special kind of data is kept inside the Security Datastore. To add an extra security layer over the persisted data, the information contained in the Identities sub-component must be encrypted and the passwords hashed. The security policies used to consume the cloud services are related to the consumer identities and establish what kind of actions an authenticated consumer is allowed to perform over a given resource.
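A minimal sketch of the two protections named above, hashing stored passwords with a salted key-derivation function and encrypting the remaining identity attributes; the key handling, parameters, and sample values are illustrative assumptions, not the authors' implementation.

# Hedged example: PBKDF2 for passwords, Fernet for identity attributes.
import hashlib, os
from cryptography.fernet import Fernet

def hash_password(password: str):
    # Per-user random salt; only (salt, digest) is persisted, never the password.
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 600_000)
    return salt, digest

key = Fernet.generate_key()   # in practice kept outside the datastore (key management)
cipher = Fernet(key)
encrypted_api_key = cipher.encrypt(b"consumer-api-key-material")
salt, pw_hash = hash_password("s3cret")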
The appropriate database type for holding the security information is NewSQL, which allows implementation either in the private or the public cloud.
4.4 Audit Datastore
The services involved in a complex cloud system that requires governance are provided on a paid scheme and must respect an SLA previously negotiated and agreed between the cloud service provider and the consumer. Usually the service is monitored only on the provider side, so the client is not aware when the cloud service does not respect the parameters stipulated in the contract.
Through the accepted SLA, different metrics are established in order to facilitate the service control and monitoring mechanisms. mOSAIC's Cloud Agency component provides the functionality that allows monitoring the service on the consumer side and permanently comparing the measured parameters with the contractual ones; this is leveraged by the Audit Management Agent, which obtains all the necessary information about a specific service. Along with the performance measurements, billing information is generated based on the time a cloud service spends performing specific tasks, together with audit information triggered when the cloud service is accessed by various entities. The audit mechanism works based on various policies established at the service level. All these data, together with the audit policies, are stored in the Audit Datastore component.
Usually the data contained in the Audit Datastore is not structured, but it has a large volume depending on the number of performance parameters monitored and on the sampling frequency. Because many governance decisions are taken based on analysis of these data, the storage system hosting them should offer high availability, high writability, and scalability, which is best achieved by a key-value datastore.
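As an illustration of the key-value choice, the sketch below appends monitoring samples under time-stamped keys using Redis; the key layout, metric names, and deployment are assumptions made for the example, not part of the paper.

# Hedged example: append-only audit samples in a key-value store.
import json, time
import redis

r = redis.Redis(host="localhost", port=6379)

def record_sample(service_id: str, metric: str, value: float) -> None:
    # One key per service/metric/timestamp keeps writes cheap and append-only,
    # matching the high-writability requirement described above.
    key = f"audit:{service_id}:{metric}:{time.time_ns()}"
    r.set(key, json.dumps({"value": value}))

record_sample("svc-storage", "response_ms", 42.7)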
5 Conclusions and Future Works
Cloud computing adoption by small and medium enterprises opens access to new markets where they can associate in virtual clusters or virtual enterprises in order to compete with large enterprises by offering complex solutions tailored to customers' specific needs. We have proposed a Cloud Governance architecture based on mOSAIC's Cloud Management components that aims to manage and govern the services infrastructure. This governance solution relies on a set of datastores that maintain and manage the various data produced and consumed during the services' lifecycle.
Our paper focused on detailing the information contained in the most important datastores in Cloud Governance, determining the requirements for every database that forms the datastores, and suggesting the appropriate database types. This paper is a basis for future work consisting of the concrete implementation of a Cloud Governance component.
Acknowledgments
This work was partially supported by the grant of the European Commission FP7-ICT-2009-5-256910 (mOSAIC). The views expressed in this paper do not necessarily reflect those of the corresponding projects' consortium members.
Bibliography
[1] C. Gong, J. Liu, Q. Zhang, H. Chen, and Z. Gong, The characteristics of cloud computing,
in W.-C. Lee and X. Yuan (Eds.), ICPP Workshops, IEEE Computer Society, pp. 275-279,
2010.
[2] C. Weinhardt, A. Anandasivam, B. Blau, N. Borissov, T. Meinl, W. Michalk, J. Stoer, Cloud Computing – A Classification, Business Models, and Research Directions, Business and Information Systems Engineering, ISSN 1867-0202, 1(5):391-399, 2009.
[3] C. Weinhardt, A. Anandasivam, B. Blau, and J. Stoer, Business Models in the Service World,
IEEE IT Professional, Special Issue on Cloud Computing, ISSN: 1520-9202, 11(2):28-33,
2009. [Online].
Available: http://dx.doi.org/10.1109/MITP.2009.21
[4] Cloud Computing Use Cases Group. (2010, July) Cloud computing use cases white paper. [Online]. Available: http://opencloudmanifesto.org/Cloud_Computing_Use_Cases_Whitepaper-4_0.pdf
[5] Use Cases and Interactions for Managing Clouds. Distributed Management Task Force, (2010, June) [Online]. Available: http://www.dmtf.org/sites/default/files/standards/documents/DSP-IS0103_1.0.0.pdf
[6] DMTF. (2010, June) Architecture for Managing Clouds. Distributed Management Task Force. [Online]. Available: http://dmtf.org/sites/default/files/standards/documents/DSP-IS0102_1.0.0.pdf
[7] P. Wainewright. (2011, August) Time to think about cloud governance. [Online]. Available:
http://www.zdnet.com/blog/saas/time-to-think-about-cloud-governance/1376
[8] S. Bennett, T. Erl, C. Gee, R. Laird, A. T. Manes, R. Schneider, L. Shuster, A. Tost, and C.
Venable, SOA Governance: Governing Shared Services On-Premise in the Cloud. Prentice
Hall/PearsonPTR, 2011. [Online]. Available: http://www.soabooks.com/governance/
[9] T. Cecere. (2011, November) Five steps to creating a governance framework for cloud security. Cloud Computing Journal. [Online]. Available: http://cloudcomputing.sys-con.com/node/2073041
[10] DMTF. (2010, June) Use Cases and Interactions for Managing Clouds. Distributed Management Task Force. [Online]. Available: http://www.dmtf.org/sites/default/files/standards/documents/DSP-IS0103_1.0.0.pdf
[11] ISACA. (2012, March) COBIT 5 Introduction. [Online]. Available: http://www.isaca.org/COBIT/Documents/COBIT5-Introduction.ppt
INT J COMPUT COMMUN, ISSN 1841-9836, 8(1):50-60, February, 2013.
A Fuzzy Control Heuristic Applied to Non Linear Dynamic
System using a Fuzzy Knowledge Representation
F.M. Cordova, G. Leyton
Felisa M. Cordova
University of Santiago of Chile
Ecuador 3769, Estacion Central, Santiago, Chile
E-mail: [email protected]
Guillermo Leyton
University of La Serena
Benavente 980
E-mail: [email protected]
Abstract:
This paper presents the design of a fuzzy control heuristic that can be applied for modeling nonlinear dynamic systems using a fuzzy knowledge representation. Nonlinear dynamic systems have traditionally been modeled on the basis of the connections between the subsystems that compose them. Nevertheless, this model design does not consider some of the following problems: the dynamics existing between the subsystems; the order and priority of the connections between subsystems; the degrees of influence or causality between subsystems; the particular state of each subsystem and the state of the system resulting from the combination of the diverse states of the subsystems; positive or negative influences between subsystems. In this context, the main objective of this proposal is to manage the whole system state by managing the combination of states of the subsystems involved. In the proposed design the diverse states of subsystems at different levels are represented by a knowledge base matrix of fuzzy intervals (KBMFI). This type of structure is a fuzzy hypercube that provides facilities for operations like insert, delete, and switching; it also allows Boolean operations between different KBMFIs, as well as inferences. Each subsystem at a specific level and its connectors are characterized by factors with fuzzy attributes represented by membership functions. Measures of the degree of influence among the different levels are obtained (negative, positive). In addition, the system state is determined based on the combination of the states of the subsystems (stable, oscillatory, attractor, chaos), which allows introducing the dynamic effects in the calculation of each output level. The control and search of knowledge patterns are made by means of a fuzzy control heuristic. Finally, an application to the coordination of the activities among the different levels of operation of an underground mine is developed and discussed.
Keywords: Fuzzy Systems, Knowledge Representation, Heuristics, Nonlinear Dynamic Systems.
1 Introduction
Organizations can be visualized as complex systems composed of various subsystems that respond to different problems and have their own dynamics. This is in turn recursive, so each subsystem has its particular dynamics. Such is the case of managements, business areas, departments, primary and support activities of the value chain, activity plans, and many other systems and subsystems existing in the organization. Each subsystem is characterized by its variables, by inputs that can alter its performance, and by its outputs, which are the inputs of other subsystems and whose dependent effects are known only approximately. This constitutes a set of highly dynamic subsystems with clearly nonlinear characteristics. Usually, these factors are not considered in decision-making processes.
Copyright © 2006-2013 by CCC Publications
It is clear that a universe of this kind is quite heterogeneous, dynamic, and growing. Also, because of the nature of the stated problem, it must be considered that these subsystems serve as inputs to one another, giving the problem a high dose of parallelism. Insofar as these subsystems serve as inputs to one another, feedback takes place continuously, making the system's dynamics difficult to control, predict, manage, and administer [1]. It is also necessary to take into account the increasing amount of data, information, and knowledge that current systems must administer, and in particular its adequate representation [9]. If we consider that knowledge-based management and decision making must be carried out in organizations having these characteristics, then it is ever more important to have conceptual models and tools adequate for the planning, management, and control of these dynamics.
On the other hand, knowledge representation is a fundamental component of any intelligent system, allowing the coding of knowledge, objects, objectives, actions, and processes. The chosen knowledge representation scheme determines the reasoning process and its efficiency. Numerous studies on knowledge representation show that one representation can be more adequate than another for a particular case, or can be capable of covering a greater number of cases [8]. The more traditional methods used are semantic networks, frames, production rules, trees, and bit matrices. Cazorla et al. [3] suggest that knowledge can be classified according to the specific application in which it is used: procedural, declarative, meta-knowledge, heuristic, or structural. However, the theory of fuzzy sets proposed by Zadeh [12], [13] allows the generation of knowledge representations that are closer to the very nature of what is to be represented.
The conceptual models of systems, their knowledge-based representation, and the tools for supporting management and decision making must therefore consider in their design factors such as high dynamism, parallelism, feedback, incompleteness, handling of uncertainty, nonlinearity, vagueness, qualitative definitions and behaviors, personal opinions, etc. Along this line, some authors [1], [16], [15] develop in depth various concepts such as fuzzy function approximation, chaos and fuzzy control, and fuzzy signal processing; the greatest contribution among them refers to the calculation and representation of knowledge by means of fuzzy cubes and fuzzy cognitive maps. McNeill [6] also works with fuzzy theory as a means of representing environments with uncertainty, usually characterized by nonlinearity. Welstead, on the other hand, supported by one of Kosko's results [11], suggests that fuzzy rules can be represented by one or more fuzzy associative memory (FAM) matrices; combining the above with genetic algorithms, he proposes a model to approach prediction problems. Fuzzy representations are also used centered mainly on the interaction of fuzzy theory, neural networks, and genetic algorithms, supporting a line of work known as Computational Intelligence. Tsoukalas [10] is more centered on the interaction and creation of fuzzy theory and neural network hybrids. To approach these kinds of problems, models are designed making use mainly of causal diagrams or knowledge maps, with a series of nodes representing the concepts relevant to the system and links between them showing the causal relation (influence) between concepts. In this context, the objective of this paper is to carry out a study and analysis that allows modeling some types of dynamic systems, representing knowledge by means of a knowledge base matrix of fuzzy intervals and fuzzy cognitive maps [4], [14], [15], with the purpose of achieving their categorization and fuzzy weighting, as well as the levels of incidence on other subsystems, in this way characterizing the complete system with its levels of fuzzy incidence [5], [10].
2 Modeling of the Diffuse Knowledge Base Matrix
In this proposal each of the map’s concepts corresponds to a fuzzy set, and it is specifically
a particular Knowledge Base Matrix of Fuzzy Intervals (KBMFI). The connections between
concepts have an associated value in the [-1,1] range that represents the degree of influence of one (KBMFI) node on another. If the value is positive, it indicates that an increase in the evidence of the origin concept increases the meaning, the evidence, or the truth value of the destination concept. If it is negative, an increase in the evidence of the source causes a decrease in that of the destination. If the value is 0, there is no connection and no causal relation.
In this way it is possible to obtain fuzzy cognitive maps from the opinion of one or more experts on the relations between some aspects of the evaluation process of a hypothetical case. Also, the clear recursiveness involved in these types of systems is considered, and a vision of granularity is proposed that allows overcoming the various levels of abstraction underlying the dissimilar subsystems. On the other hand, internally each subsystem can be represented by KBMFIs, allowing its incidence weight with respect to other subsystems to be obtained while at the same time representing its particular behavior.
Definition 1. Let X be a classical set of objects, called the universe. Belonging to a subset A
of X can be defined in terms of the characteristic function:
\[
\mu_A : X \longrightarrow [0,1], \quad x \longmapsto \mu_A(x) \qquad (1)
\]
where:
\[
\mu_A(x) = \begin{cases} 1 & x \in A \\ 0 & x \notin A \end{cases}
\]
If the evaluation set {0, 1} is extended to the real interval [0,1], then it is possible to talk about partial belonging to A, where $\mu_A(x)$ is the degree of belonging of x to A, and the values 0 and 1 are interpreted as "non-belonging" and "total belonging", respectively.
Clearly, A is then a subset of X which has no defined boundaries. This leads to the following definition.
Definition 2. Let X be an object’s space. A fuzzy set A of X is characterized by the set of pairs:
\[
A = \{(x, \mu_A(x)) \mid x \in X\}, \quad \text{where } \mu_A : X \longrightarrow [0,1] \qquad (2)
\]
The fuzzy concept proposed by Zadeh [11] is based on the fact of allowing the partial belonging
in a set for certain elements of a given universe.
Definition 3. A fuzzy hypercube can be considered as a unit hypercube, i.e., the hypercube $I^n = [0,1]^n$. The fuzzy $n$-cube has $2^n$ vertices, which are the binary subsets.
A fuzzy cube contains all the fuzzy sets of a set X of n objects. The non-fuzzy sets are found at the vertices of the cube; the continuum of fuzzy sets lies inside the cube.
Definition 4. Knowledge Base Matrix of Fuzzy Intervals (KBMFI) means the hypercube constituted by the various knowledge items $E_1, E_2, E_3, \dots, E_n$ relative to a knowledge domain, considering also the different weight or importance that each of them has in the particular domain.
The KBMFI is a fuzzy hypercube where $E_1, E_2, \dots, E_n$ represent the various contingencies or characteristics of the area under discussion, according to the opinion of the experts. The $E_j$, with $j = 1, 2, \dots, n$, do not necessarily have the same relevance or weight; they can in particular be fuzzy frames consisting of $S_1, \dots, S_m$, where $S_1, S_2, \dots, S_m$ are the possible factors, not necessarily disjoint, such that each characteristic $E_i$ can be expressed by means of some particular union of the $S_1, S_2, \dots, S_m$ factors. Now the particular determination of each $E_i$ through its particular factors
$S_1, S_2, \dots, S_m$ models systems composed of a range of nodes $N_1, N_2, \dots, N_n$ continually influencing each other, where the incidence of one node with respect to the others is completely dynamic. In particular, this outlines a vision of dynamic nonlinear systems which, in similar but not equal versions, are seen as causality maps.
If the map is adjusted to the opinions of several experts, one would have to obtain the assessments of all of them and thus establish the definitive values associated with the causality relations. It must be noted that, in general, the causalities mentioned by the experts with respect to the various influences exerted by the nodes of the maps are attributable to qualitative rather than quantitative concepts.
As already stated, nonlinear dynamic systems involve nonlinear and feedback behaviors. In
these systems the output of a process or node is used as input for the following node or iteration,
and the output of this can again be the input of the same previous node, i.e., self-recurrent
behaviors. This behavior corresponds to the iteration
\[
x_{n+1} = f(x_n),
\]
which, when modeling the system from a starting point $x_0$, produces the sequence $x_1, x_2, x_3, \dots, x_n$.
Definition 5. Let x0 be an arbitrary starting node, then the above sequence is called the
Trajectory.
Considering these definitions, several behaviors can occur, such as, for example: fixed points;
periodic trajectories; behaviors given by attractor nodes, and chaos.
3 Case Study
The case study corresponds to the situation of an underground mine which has three levels: Production Level, Reduction Level, and Transport Level. The problem consists in "providing support to activity scheduling management" [2]. The total system shown in Figure 1 consists of these three subsystems and the dynamics that exists between them. This situation is denoted as Level 1.
Figure 1: Production, Reduction and Transport Levels.
$N_1$: the Production Level considers $N_{11}$, $N_{12}$, $N_{13}$ as subsystems; $N_2$: the Reduction Level considers $N_{21}$, $N_{22}$ as subsystems; $N_3$: the Transport Level considers no subsystems.
Looking at a more particular abstraction level, Level 2 appears, as shown in Figure 2. From the particular situation shown in Figure 1 it is seen that: $N_1$ influences $N_2$ negatively and $N_3$ positively, $N_2$ influences $N_1$ positively and $N_3$ negatively, and $N_3$ influences $N_2$ negatively and $N_1$ positively.
54
F.M. Cordova, G. Leyton
However, Figure 2 shows that the information obtained at abstraction Level 1 of the system does not have the sensitivity or reliability of that obtained at abstraction Level 2, whose granularity or disaggregation is slightly higher.
Figure 2: Diagram of influence at the different levels.
If both levels are confronted, it may be incorrectly deduced that apparently contradictory information is obtained. For example, if we look at Level 1 and Level 2 for the case of $N_3$ with $N_2$: at Level 1 it was stated that $N_3$ influences $N_2$ negatively, but at Level 2 it could be concluded that they have opposite influences, $N_{31}$ influencing $N_{22}$ negatively and $N_{22}$ influencing $N_{31}$ positively. This apparent contradiction can be explained, for example, by saying that when production at the Reduction Level decreases, there is less pressure on the demand for trains or cars; on the other hand, if there is not sufficient transport from $N_{31}$, there is an impact due to accumulation of material at the Reduction Level, which is considered a negative influence. Then the question is: which of the two situations has the greater incidence weight? According to Figure 3, and only as an example, it can be stated that the negative impact of $N_{31}$ on $N_{22}$ is greater than the influence of $N_{22}$ on $N_{31}$.
The main observations about the system are: it is clear that it is a dynamic fuzzy system; in turn, every $N_i$ is a dynamic fuzzy subsystem; the connections between the various $N_i$ are fuzzy; and these connections can be positive or negative. If positive, $N_i$ influences $N_j$ positively; if negative, $N_i$ influences $N_j$ negatively.
4 Design and Implementation of the KBMFI Matrix
Going more deeply into Table 1, the experts draw these KBMFIs as causal tables. They do not state equations, but make links between subsystems. The KBMFI systems convert each pictograph into a fuzzy rules weight matrix. The nodes of the KBMFI can model the complex nonlinearities between the input and output nodes, and the KBMFI can model the dynamics occurring in the multiple iterations that take place in these dynamic systems.
A KBMFI with $n$ nodes has $n^2$ arcs. Since the nodes $N_i(t)$ are fuzzy concepts, their values lie in $[0,1]$; a state of a KBMFI is the vector $N(t) = (N_1(t), N_2(t), \dots, N_n(t))$, so it is a point of the hypercube $I^n = [0,1]^n$.
An inference in a KBMFI is a path or sequence of points in $I^n$, i.e., it is a fuzzy process, an indexed family of fuzzy sets $N(t)$. It is clearly seen that KBMFIs can perform "forward chaining"; whether they can perform "backward chaining" (inverse nonlinear causality) is an open question. KBMFIs, as nonlinear dynamic systems, form semantic fuzzy networks and act as neural networks. A KBMFI can converge to a fixed point or to a limit cycle, which
can be a stable or oscillating state, or to a chaotic attractor in the fuzzy cube $I^n$. In this context, one of the basic questions to be answered is: what happens if the input to the KBMFI system is known? In this sense, each KBMFI stores a set of global rules of the form:
\[
\text{IF } N(0) \text{ THEN attractor } A \qquad (3)
\]
A KBMFI with a single global fixed point has only one global rule. The size of the attractor regions in the fuzzy cube governs the number of these global rules or hidden patterns. KBMFIs can have large and small attractor regions in $I^n$, each of them with a different degree of complexity. Therefore an input state can lead to chaos, while a relatively close input state can end up in a fixed point, a limit cycle, or a stable state. Since KBMFIs correspond to a semantic fuzzy network structure, it is possible to associate a matrix M with them; this matrix lists the causal links between the $N_i$ nodes. As an example, considering again the case described by Figure 2, the corresponding KBMFI is presented below, where the entry in row $N_i$ and column $N_j$ represents the incidence of node $N_j$ on $N_i$; $\alpha, \beta, \gamma, \delta, \eta, \tau$ are values of the fuzzy function $\mu$ = [little, more or less, much, etc.].
        N11    N12    N13    N21    N22    N31
N11      0      0     +γµ     0     +δµ    +τµ
N12     −αµ     0      0     −ηµ     0      0
N13     +αµ     0      0     −ηµ     0      0
N21      0     +βµ     0      0      0      0
N22      0      0     −γµ    −ηµ     0     −τµ
N31     +αµ     0      0      0     +δµ     0
The proposed model is decomposed into diverse abstraction levels, and each level is represented by a corresponding KBMFI. Initially, observing Figure 2, Abstraction Level 0 appears, where only the shapes of the influences are observed: a node $N_i$ can influence a node $N_j$ positively or negatively.
        N1   N2   N3
N1       0    +    +
N2       −    0    −
N3       +    −    0
Experts are asked to qualify the degree of influence using $\mu$ = [nothing, irrelevant, few, influence, regular, alter, a lot, very much, so much], as shown in the Incidence Graphic of Figure 3.
Applying the incidence graphic, a second abstraction level, 01, is obtained:
        N1    N2    N3
N1       0    +µ    +µ
N2      −µ     0    −µ
N3      +µ    −µ     0
Observing the degree of incidence between a node $N_i$ and a node $N_j$ reveals a higher degree of specificity (granulation) between them. This enhanced specificity becomes explicit at the following level, where a "slot" exists between $N_i$ and $N_j$. In this case the different situations are: $N_1$ influences $N_2$ negatively; $N_1$ influences $N_3$ positively; $N_2$ influences $N_1$ positively; $N_2$ influences $N_3$ negatively; $N_3$ influences $N_1$ positively; $N_3$ influences $N_2$ negatively.
If it is considered that a node $N_i$ can be decomposed into $N_{i1}, N_{i2}, \dots, N_{ik}$, where those $N_{im}$, $m = 1, 2, \dots, k$, each with its particular dynamics, make up $N_i$, the situation in the analyzed case is as follows:

Figure 3: Incidence Graphic.
$N_1 = (N_{11}, N_{12}, N_{13})$, at Level 0: node or subsystem $N_1$.
$N_1$ at Level 01, node or subsystem $N_{1i}$, is defined by:
        N11   N12   N13
N11      0     0     +
N12      −     0     0
N13      +     0     0
N1 at Level 011 is defined by:
        N11    N12    N13
N11      0      0     +αµ
N12     −αµ     0      0
N13     +αµ     0      0
$N_2 = (N_{21}, N_{22})$, at Level 0: node or subsystem $N_2$.
$N_2$ at Level 01, node or subsystem $N_{2i}$, is defined by:
        N21   N22
N21      0     0
N22      −     0
N2 at Level 011 is defined by:
        N21    N22
N21      0      0
N22     −ηµ     0
Applying the same procedure to node $N_3$, which is characterized only by $N_{31}$, $N_3$ at Level 011 is defined by:
CHARACTERISTICS OF LEVEL 1 (PRODUCTION)

FACTORS 1:                                ATTRIBUTES (metrics or fuzzy functions, Table 1.1):
1. Number of workmen present              Fuzzy functions
2. Drilling, agents, and resources        Rel. Card. (CRC) 1 2 3 4 5 6 7 8 9
3. Blasting, agents and resources         Relative cardinality of Level 1 (CRN1)
4. Technologies involved
5. Number of equipments
6. Lectures

Table 1: Relevant characteristics of Level 1 at Production Level.
        N31
N31      0
At this point only the fuzzy cohesion of each subsystem has been developed, so it is necessary to visualize what happens with the external dynamics between subsystems in order to obtain the fuzzy matching between systems. Continuing with the fuzzy cohesion procedure, the links between nodes $N_1$, $N_2$, and $N_3$ are obtained, at Level 0 by node $N_1$:
        N1   N2   N3
N1       0    −    +
At Level 01 by Node N1 :
        N1    N2    N3
N1       0    −αµ   +αµ
At Level 011 by Node N1 :
        N11   N21   N22   N31
N1       0     0     0    +αµ
At Level 02 by Node N1 :
        N11   N21   N22   N31
N1       0    +βµ    0     0
In this way, influences are obtained allowing the fuzzy matching.
5 Heuristic Control for the KBMFI
Each level $N_i$ has factors $F_{ij}$ that determine it, with $i = 1, 2, 3$; $j = 1, 2, \dots, m$. Table 1 shows the relevant characteristics, factors, attributes, and fuzzy functions at the Production Level. Table 2 shows the relevant factors, attributes, and fuzzy functions at the Production Level. Each factor $F_{ij}$ has attributes $A_{ijs}$ that determine it, where $i = 1, 2, 3$; $j = 1, 2, \dots, m$; $s = 1, 2, \dots, k$ (see Table 1).
FACTORS AND ATTRIBUTES (PROJECT SYSTEM)      FUZZY FUNCTIONS

1. Number of workmen present                 $\mu^1_1(x) = 1 - \frac{25-x}{25}$,  $10 \le x \le 25$
   (decision-making complexity):             $\mu^1_2(x) = 1 - \left(\frac{75-x}{75}\right)^2$,  $30 \le x \le 75$
   1.1 Number of engineers                   $\mu^1_3(x) = 1 - \left(\frac{150-x}{150}\right)^2$,  $60 \le x \le 150$
   1.2 Number of technicians                 $\mu^1_4(x) = 1 - \left(\frac{30-x}{30}\right)^2$,  $12 \le x \le 30$
   1.3 Number of miners                      where x is the number of engineers, technicians,
   1.4 Number of equipments                  miners, or equipments, respectively.

   For evaluating this characteristic, first the predominant factor must be identified and
   then the calculation can be made. For example, for x = 25, 30, 90, and 21 respectively:
   $\mu^1_1(25) = 1$; $\mu^1_2(30) = 0.64$; $\mu^1_3(90) = 0.84$; $\mu^1_4(21) = 0.91$.

2. Drilling, agents and resources
   2.1 Planned drillings
   2.2 Direct agents involved                $\mu^2_1(x) = 1 - \frac{50-x}{50}$,  $20 \le x \le 50$
   2.3 Indirect agents involved              $\mu^2_{2,3}(x) = 1 - \left(\frac{15-x}{30}\right)^3$,  $6 \le x \le 15$

Table 2: Factors, attributes and fuzzy functions at Production Level.
Each $A_{ijs}$ has attribute metrics associated with its nature. These metrics are fuzzy membership functions (see Table 2).
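The reconstructed membership functions of Table 2 can be checked directly; the small script below reproduces the paper's worked example values (1, 0.64, 0.84, 0.91).

# Membership functions of Table 2, as reconstructed above.
def mu_1_1(x): return 1 - (25 - x) / 25            # engineers, 10 <= x <= 25
def mu_1_2(x): return 1 - ((75 - x) / 75) ** 2     # technicians, 30 <= x <= 75
def mu_1_3(x): return 1 - ((150 - x) / 150) ** 2   # miners, 60 <= x <= 150
def mu_1_4(x): return 1 - ((30 - x) / 30) ** 2     # equipments, 12 <= x <= 30

print(mu_1_1(25), mu_1_2(30), mu_1_3(90), mu_1_4(21))  # -> 1.0 0.64 0.84 0.91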
From the above, it is possible to state that the degrees of influence (negative or positive) existing between the various levels can be measured, allowing the calculation of the existing dynamics of the system in order to achieve an intelligent fuzzy control whose purpose is keeping the system in a desirable (stable) state.
6 Heuristic
The proposed heuristic consists of the following stages:
Stage 1: Obtaining the factors $F_{ij}$ of each level $N_i$.
Stage 2: Obtaining the attributes $A_{ijs}$ of each factor $F_{ij}$.
Stage 3: Determining the metrics associated with each attribute $A_{ijs}$, i.e., determining the fuzzy membership functions for each $A_{ijs}$.
Stage 4: Determining the "formula" that corresponds to each $F_{ij}$ from the $A_{ijs}$, for example:
\[
F_{ij} = \lambda_1 A_{ij1} \oplus \lambda_2 A_{ij2} \oplus \dots \oplus \lambda_k A_{ijk} \qquad (4)
\]
where $\oplus$ is the operator to be determined ($\Rightarrow$, $\vee$, $\cup$, etc.) and $\sum \lambda_k = 1$.
Stage 5: Determining $N_i$ from the $F_{ij}$, for example:
\[
N_i = \lambda_1 F_{i1} \oplus \lambda_2 F_{i2} \oplus \dots \oplus \lambda_m F_{im} \qquad (5)
\]
where $\oplus$ is the operator to be determined ($\Rightarrow$, $\vee$, $\cup$, etc.) and $\sum \lambda_m = 1$.
Note that the output of every $N_i$ must be between 0 and 1.
A Fuzzy Control Heuristic Applied to Non Linear Dynamic System using a Fuzzy Knowledge
Representation
59
Stage 6: Determining whether the "influence" of the output of $N_i$ on other levels is negative or positive.
Stage 7: Recalculating the $N_t$ output, with its internal values, considering the influence exerted on it by the recursive dynamics of the nodes $N_i$ at Stages 1, 2, ..., 5.
Stage 8: Determining the output of $N_t$, input of $N_l$, and determining whether we feed $N_l$ or $N_i$, and specifying the times. Note that in this step we distinguish between what influences what, whether we make a push, a pull, or both at the same time, with a delay of one with respect to the other, etc.
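Since the operator $\oplus$ is deliberately left open in Stages 4 and 5, the sketch below fixes it, for illustration only, to a convex weighted sum with the lambdas summing to 1; the factor values and weights are hypothetical membership degrees, not data from the case study.

# Hedged sketch of Stages 4-5 with a weighted-sum choice of the open operator.
def aggregate(values, weights):
    assert abs(sum(weights) - 1.0) < 1e-9, "the lambdas must sum to 1"
    return sum(w * v for w, v in zip(weights, values))

# Stage 4: one factor F_ij from its attributes A_ijs (values from the Table 2 example).
F_11 = aggregate([1.0, 0.64, 0.84, 0.91], [0.4, 0.2, 0.2, 0.2])
# Stage 5: the level N_i from its factors F_ij (a second, hypothetical factor shown).
N_1 = aggregate([F_11, 0.7], [0.6, 0.4])
print(N_1)  # stays in [0, 1], as required by Stage 5

A convex combination keeps every output in [0, 1] automatically, which is one reason it is a natural candidate for $\oplus$; other choices named in the paper ($\vee$, $\cup$) would correspond to max-style fuzzy unions.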
7 Conclusions and Future Works
The work done in this paper allows the characterization of a complex system through subsystems, considering the dynamics and the incidence of each subsystem on the others. From the description of the complexity of the system and its subsystems, the KBMFI is constructed, which allows an adequate representation of fuzzy knowledge and of the dynamics associated with the system. A fuzzy control heuristic is also designed that allows managing the KBMFI.
In the case of the planning of mining operations, the KBMFI and the associated heuristic allow the evaluation of the impact of the incidence of various factors, such as a reduction of the number of planned workers in a shift, or faults in Load Haul and Dump (LHD) equipment, rock breakers, shafts, and trains, among others.
If software is to be developed from this proposal, it should be kept in mind that the tool should include an agent module that is informed (alerted) of the acceptable critical values for each node, so that the node does not alter the expert-defined acceptable states of the nodes with which it interacts. In that case the agent must learn the acceptable critical values, and must know and learn preventive measures, mitigation measures, and corrective measures.
Bibliography
[1] D. Alahakoon, S. K. Halgamuge, and B. Srinivasan, Dynamic self organizing maps with
controlled growth for knowledge discovery, IEEE Trans. Neural Networks, 11:601-614, 2000.
[2] F. Cordova, L. Canete, L. Quezada, F. Yanine, An Intelligent Supervising System for the
Operation of an Undergound Mine, INT J COMPUT COMMUN, ISSN 1841-9836, 3(S):
259-269, 2008.
[3] M. Gupta, R.K. Ragade, R.R. Yager (Eds.), Advances in Fuzzy Set Theory and Applications, North-Holland, Amsterdam, 1979.
[4] B. Kosko, Fuzzy Engineering, Prentice Hall, 1997.
[5] G. Martinez, Servente and Pasquini, Sistemas Inteligentes, NL Nueva Libreria, Argentina,
2003.
[6] T. McNeill, Fuzzy Logic a Practical Approach, Academic Press, 1997.
[7] H. Roman, Sobre Entropias Fuzzy, Tesis de doctorado, Universidad de Campinas, Brasil,
1989.
[8] E. Schnaider, A. Kandel, Applications of the Negation Operator in Fuzzy Production Rules, Fuzzy Sets and Systems, 34:293-299, North-Holland, 1990.
[9] W. Siler, J. Buckley, Fuzzy Expert Systems and Fuzzy Reasoning, John Wiley and Sons Inc., New Jersey, 416 pp., 2005.
[10] U. Tsoukalas, Fuzzy and Neural Approaches in Engineering, Wiley Interscience, 1997.
[11] S. Welstead, Neural Network and Fuzzy Logic Applications in C++, Wiley Interscience,
1994.
[12] L.A. Zadeh, The role of fuzzy logic in the management of uncertainty in Expert Systems, Approximate Reasoning in Expert Systems, Elsevier Science Pub., North-Holland, 3-31, 1985.
[13] L.A. Zadeh et al (eds.), From Natural Language to Soft Computing: New Paradigms in
Artificial Intelligence, Editing House of Romanian Academy, 2008.
[14] L.A. Zadeh, Outline of a new approach to the analysis of complex systems and decision processes, IEEE Trans. Syst. Man Cybern., 3:28-44, 1973.
[15] L.A. Zadeh, Fuzzy sets and fuzzy information: granulation theory, Beijing Normal University
Press, Beijing, 1997.
[16] L. Zhong, W.A. Halang, G. Chen, Integration of Fuzzy Logic and Chaos Theory, Springer-Verlag, Berlin Heidelberg, 2006.
INT J COMPUT COMMUN, ISSN 1841-9836, 8(1):61-69, February, 2013.
CRCWSN: Presenting a Routing Algorithm by using
Re-clustering to Reduce Energy Consumption in WSN
A.G. Delavar, A.A. Baradaran
Arash Ghorbannia Delavar, Amir Abbas Baradaran
Department of Computer Engineering and Information Technology,
Payam Noor University, PO BOX 19395-3697, Tehran, IRAN
[email protected], [email protected]
Abstract:
In this paper, we present an algorithm based on genetics and re-clustering to reduce energy consumption in wireless sensor networks. The CRCWSN algorithm makes best use of selected chromosomes in different states. In this algorithm, a new technique of selecting cluster heads (CHs) is initially applied through a genetic algorithm; these CHs are used individually in each round to transmit data. In this research, considering distance and energy parameters, we have created a target function with better optimization properties than previous techniques. The created target function has been evaluated on the input chromosomes, and the combination of chromosomes has been done with a new technique that is more efficient than previous similar techniques. Consequently, the timing of generation repetition is based on the local distribution in chromosomes and on their use in sending data from source to destination, which decreases the number of generation repetitions compared to previous methods. Simulation results show that, at the end of each round, the number of alive nodes in the suggested algorithm increases compared to previous methods, which increases the network's lifetime.
Keywords: Genetic algorithm, wireless sensor network (WSN), routing, reduce
energy consumption, re-clustering.
1 Introduction
Recently, wireless sensor networks have been widely considered [1]. These networks include nodes that sense environment data and then send them to a Base Station (BS) [2,3]. Generally, in these kinds of networks, nodes have batteries of limited energy, which determines the network's lifetime [1,2,3,4]. The most important problem in wireless sensor networks is to create effective routing protocols in order to decrease energy consumption and increase the network's lifetime [2,4,5]. Many methods, each having advantages and limitations, have been presented to find the best route and transmit data to the BS. One of the most effective ways of finding the optimum route and transmitting data from common nodes to the BS is using a Genetic Algorithm (GA) [2,6]. An important advantage of routing by GA is a multi-purpose search rather than a point-to-point search, which leads to using all points in all processes of running the GA; this allows non-optimum points from previous stages to be considered in later stages [2]. To solve such problems, GA uses a group of chromosomes belonging to one population. In each run of the GA, the current chromosomes undergo genetic operations for the next generation to appear; these operations include selection, crossover, and mutation [2,3,5]. The optimum route is created by the GA after running several generations. In this research, we present a genetic- and re-clustering-based routing algorithm to reduce the energy consumption of the sensor network, and we compare it with previous methods.
2 Related Works
Copyright © 2006-2013 by CCC Publications

One of the most important re-clustering algorithms is LEACH, which is based on rounds [2,3,5]. Each round includes two phases: setup and steady. In the setup phase, clusters are formed and common nodes and CHs are determined; in the steady phase, data are transmitted from common nodes to the CH and from the CH to the BS. In LEACH, in each round, each node can be a CH or a common node. Whether a node becomes a CH depends on the following threshold [2,7,8,9]:
\[
T(n) = \begin{cases} \dfrac{p}{1 - p \cdot \left(r \bmod \frac{1}{p}\right)} & \text{if } n \in G, \\ 0 & \text{otherwise} \end{cases} \qquad (1)
\]
in which:
p: the CH decision percentage (the desired percentage of CHs, e.g., p = 0.05);
r: the current round;
G: the set of nodes that have not been CHs in the last 1/p rounds.
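For concreteness, Eq. (1) transcribes directly into code; the comparison of T(n) with a uniform random draw is the usual LEACH cluster-head election step, shown here as a sketch with hypothetical parameter values.

# Hedged sketch of the Eq. (1) threshold and the CH election it drives.
import random

def leach_threshold(p: float, r: int, in_G: bool) -> float:
    if not in_G:
        return 0.0
    return p / (1 - p * (r % round(1 / p)))

p, r = 0.05, 7
is_cluster_head = random.random() < leach_threshold(p, r, in_G=True)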
Another way of routing by GA is the algorithm presented by Annie S. Wu, Ming Zhou, and Shiyuan Jin [10], in which selecting and clustering nodes is done by GA. Among other GA-based routing methods, we can refer to the RGWSN algorithm [2], in which clustering and the selection of optimum CHs are done by GA. Another genetic-based method, presented by Jianming Zhang, Yaping Lin, Cuihong Zhou, and Jingcheng Ouyang, reduces the energy consumption of sensor networks by considering distance and energy parameters [11].
2.1 Genetic Algorithm
GA is a nature-inspired method that performs a direct, randomized numerical search. Finding an optimum solution is based on repetition, and GA differs from other search methods in applying natural selection [2]. GA operates on bit strings named chromosomes, each bit of which is called a gene. Chromosomes represent the entire set of variables, and GA, in each repetition, reuses non-optimum points of the previous repetition [2,12]. One important advantage of GA over other search methods is multi-point rather than one-point selection in the search space, so GA, in searching for an optimum solution, is less likely to converge to a local maximum. GA includes some stages: each stage is called a generation, and a series of solutions is called a population [2,5,9]. GA starts with an initial population and, after applying the genetic operators of selection, crossover, and mutation, a new population is created. The creation of the new population is guided by a fitness function, i.e., after applying the fitness function to the created population at the end of each generation, a new population is created. These processes run through different generations to find the optimum solution. Generally, GA includes the following stages [2,6,12]:
1. Selection: two chromosomes having higher fitness are selected as parents.
2. Crossover: the two parents selected in the previous stage are combined and new children are created.
3. Mutation: a child meeting the mutation conditions mutates. After this stage, children are decoded and evaluated against the fitness function. If, according to the fitness function, the conditions are not optimum, the new children are used in the initial population and the algorithm proceeds. In this stage, generated chromosomes are treated as the initial population, solutions of low fitness are discarded, and the algorithm proceeds with n chromosomes.
GA can terminate under the following conditions [2,6]:
1. The best degree of fitness for children is achieved.
2. No improvement is achieved by running the algorithm for several generations.
3. The mean value of the fitness function reaches a fixed level over several repetitions (e.g., during 50 generations).
4. The number of generations reaches a fixed value.
5. A combination of the above-mentioned items occurs.
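A compact sketch of the generational loop just described (selection, crossover, mutation, termination after a fixed number of generations); the bit-string encoding, population sizes, and toy fitness function are placeholders, not the paper's target function.

# Hedged sketch of a minimal GA; lower fitness is treated as better.
import random

def evolve(fitness, n_bits=40, pop_size=50, generations=50, p_mut=0.01):
    pop = [[random.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)                     # selection: fitter half are parents
        parents = pop[: pop_size // 2]
        children = []
        while len(children) < pop_size - len(parents):
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, n_bits)     # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ (random.random() < p_mut) for bit in child]  # mutation
            children.append(child)
        pop = parents + children
    return min(pop, key=fitness)

best = evolve(fitness=lambda chrom: -sum(chrom))  # toy objective: maximize the 1 bits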
2.2 Network Model
All sensor nodes and the BS are motionless and cannot be added or removed after deployment. The initial energy of the nodes differs, and the sensor nodes are aware of their position, i.e., they need hardware such as GPS for this.
2.3 Radio Model
Sensors consume energy when receiving or transmitting data [7,13]. The standard radio model used in WSNs uses free-space and multipath fading models depending on the distance between sender and receiver; the threshold is the crossover distance $d_{crossover}$ [7,9]. The received power equals [14]:
\[
p_r(d) = \frac{p_t G_t G_r \lambda^2}{(4\pi d)^2} \qquad (2)
\]
in which:
$p_t$: transmit power;
$G_t$: gain of the transmitting antenna;
$G_r$: gain of the receiving antenna;
$\lambda$: wavelength of the carrier signal (in meters).
When the sender-receiver distance is longer than $d_{crossover}$, the received power equals [7,9]:
\[
p_r(d) = \frac{p_t G_t G_r h_t^2 h_r^2}{d^4} \qquad (3)
\]
$h_t$: height of the sender antenna (in meters);
$h_r$: height of the receiver antenna (in meters).
To transmit an n-bit message over a distance of d meters, the radio energy consumption equals [7,9]:
\[
E_{TX}(n,d) = \begin{cases} n(E_{elect} + \epsilon_{fs} d^2) & d < d_{crossover} \\ n(E_{elect} + \epsilon_{mp} d^4) & d \geq d_{crossover} \end{cases} \qquad (4)
\]
To receive an n-bit message, the radio energy equals [9]:
\[
E_{RX}(n) = n E_{elect} \qquad (5)
\]
$\epsilon_{fs}$ and $\epsilon_{mp}$ are parameters depending on the receiver's sensitivity and the noise figure, and $E_{elect}$ is the electronic energy, which depends on factors such as digital coding, modulation, and filtering [9].
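Eqs. (4)-(5) translate directly into code; the constants used below are the simulation values listed later in Section 4 (Table I), and the derived crossover distance comes out close to the 87 m reported there.

# Hedged transcription of the radio energy model, Eqs. (4)-(5).
E_ELECT = 50e-9       # J/bit
EPS_FS = 10e-12       # J/bit/m^2
EPS_MP = 0.0013e-12   # J/bit/m^4
D0 = (EPS_FS / EPS_MP) ** 0.5   # crossover distance, about 87.7 m

def e_transmit(n_bits: int, d: float) -> float:
    # Free-space loss below D0, multipath loss above it.
    if d < D0:
        return n_bits * (E_ELECT + EPS_FS * d ** 2)
    return n_bits * (E_ELECT + EPS_MP * d ** 4)

def e_receive(n_bits: int) -> float:
    return n_bits * E_ELECT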
3 The New Presented Algorithm CRCWSN
The new proposed algorithm includes rounds, each of which has two phases: setup and steady. Clusters are formed in the setup phase, and data transmission takes place in the steady phase. The setup phase starts with an initial message from the BS, containing the nodes' positions and initial energies. After running the GA, the generations having lower fitness than the others are selected (the optimum generations), and common nodes and CHs are selected from the nodes in the selected generations. (For example, if the GA stops after running 50 generations, the 5 generations with the lowest fitness are selected.) In the proposed method, a binary coding system is used, with 0 bits representing common nodes and 1 bits representing CHs. After the selection of common nodes and CHs, data are transmitted in the steady phase. The routine in the setup phase is as follows: the environment over which the nodes are distributed is divided into separate areas called grids; then, from each grid, some nodes likely to be optimum CHs are selected. The selection of nodes in each grid is based on the distance from the gravity center of the nodes of that
grid and on their initial energy, i.e., nodes having more initial energy and a short distance to the gravity center are preferred.
The selected nodes then enter the GA as chromosome bits. For example, if the environment is divided into 10 grids and 4 nodes are selected from each grid, the chromosomes entering the GA will have 40 bits (0,1). The routine in the GA is as follows. First, from the initial chromosome, which is the initial population, some random populations are created; then, by using the genetic operators of selection, crossover, and mutation, new children are created binarily from the created populations. For example, if we create 50 new populations from the initial populations, we will have 100 new children after applying the genetic operators. After creating the children, we apply the fitness function to the new population to select some populations (for example, 5 populations) having lower fitness as the optimum generations. This process proceeds for several generations until the GA stops (when it reaches a fixed number of generations). For the proposed method, the fitness function equals the mean energy consumed by the entire network for each population. The fitness function is calculated according to the Heinzelman model, which states that each node, to transmit L bits of data over a distance d, consumes energy $E_t$:
\[
E_t = \begin{cases} L E_{elect} + L \epsilon_{fs} d^2 & d < d_0 \\ L E_{elect} + L \epsilon_{mp} d^4 & d \geq d_0 \end{cases} \qquad (6)
\]
in which:
$d_0$: the crossover distance;
$E_{elect}$: the energy required to activate the electronic circuits;
$\epsilon_{mp}$, $\epsilon_{fs}$: parameters related to the receiver's sensitivity and the noise figure.
Also, the energy to receive L bits of data equals:
\[
E_r = L E_{elect} \qquad (7)
\]
In the proposed algorithm, in the setup phase, we compute the fitness value for the bits in the final selection, where 0 bits represent common nodes and 1 bits represent CHs. The total energy consumed by the network equals:
\[
E = E_1 + E_2 + E_3 + E_4 \qquad (8)
\]
in which:
$E_1$: energy needed to send from a common node to the CH;
$E_2$: energy needed by the CH to receive data from the common nodes;
$E_3$: aggregation energy in the CH;
$E_4$: energy needed to send from the CH to the BS.
We have:
\[
E_1 = \begin{cases} L E_{elect} + L \epsilon_{fs} d_{distoch}^2 & d_{distoch} < d_0 \\ L E_{elect} + L \epsilon_{mp} d_{distoch}^4 & d_{distoch} \geq d_0 \end{cases} \qquad (9)
\]
in which:
$d_{distoch}$: distance from the common node to the CH;
L: number of bits.
\[
E_2 = L E_{elect} \times N\_Common \qquad (10)
\]
$N\_Common$: number of common nodes.
\[
E_3 = L E_{ag} \times N\_ch \qquad (11)
\]
$E_{ag}$: aggregation energy in the CH;
$N\_ch$: number of CHs (1 bits).
\[
E_4 = \begin{cases} L E_{elect} + L \epsilon_{fs} d_{distoBS}^2 & d_{distoBS} < d_0 \\ L E_{elect} + L \epsilon_{mp} d_{distoBS}^4 & d_{distoBS} \geq d_0 \end{cases} \qquad (12)
\]
$d_{distoBS}$: distance from the CH to the BS.
In the setup phase, after applying the fitness function to the final populations (the populations of the selected optimum generations) and specifying common nodes and CHs, $E_1$ is calculated for the common nodes (0 bits) and $E_2$, $E_3$, $E_4$ for the 1 bits. Finally, in the steady phase, the nodes' energy is reduced according to the transmissions. For example, if we want to run the algorithm for 1000 rounds and 5 optimum populations are selected in the setup phase, the first round is run with the first population, the second round with the second population, the third round with the third population, the fourth round with the fourth population, and the fifth round with the fifth population; similarly, the sixth round is run with the first population and the seventh round with the second population. Code I shows the proposed method, and Chart 1 shows the flowchart of the proposed algorithm.
Code I. Proposed method:
1. state = normal
2. grid the field
3. select the candidate nodes of each grid
4. create the initial population
5. create a random population from the initial population
6. run the GA
7. select the 5 optimum generations
8. run the steady phase
9. if round = 0 go to 10, else go to 6
10. END.
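As a sketch only, the per-round energy of Eqs. (8)-(12) can be evaluated for one chromosome as below; the constants repeat the Table I simulation values, and the node-to-CH and CH-to-BS distances are hypothetical inputs, since the real ones depend on the node placement.

# Hedged sketch of the fitness evaluation, Eqs. (8)-(12).
E_ELECT, EPS_FS, EPS_MP, E_AG = 50e-9, 10e-12, 0.0013e-12, 5e-9
D0 = (EPS_FS / EPS_MP) ** 0.5

def e_tx(L, d):
    # Eq. (6): free-space term below d0, multipath term above it.
    return L * (E_ELECT + (EPS_FS * d**2 if d < D0 else EPS_MP * d**4))

def network_energy(L, dist_to_ch, dist_to_bs):
    e1 = sum(e_tx(L, d) for d in dist_to_ch)   # Eq. (9): common node -> CH
    e2 = L * E_ELECT * len(dist_to_ch)         # Eq. (10): CHs receiving
    e3 = L * E_AG * len(dist_to_bs)            # Eq. (11): aggregation in CHs
    e4 = sum(e_tx(L, d) for d in dist_to_bs)   # Eq. (12): CH -> BS
    return e1 + e2 + e3 + e4                   # Eq. (8)

print(network_energy(L=4000, dist_to_ch=[12.0, 30.5], dist_to_bs=[95.0]))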
4 Simulation
The proposed algorithm has been analyzed with the MATLAB software. In this analysis, indices such as the number of alive nodes at the end of each round, the number of grids, and the number of nodes selected in each grid are considered when composing the initial population. The initial energies of the nodes are random values between 0.3 and 0.5 J.
Other parameters used in the simulation are as follows:
1. nodes are randomly placed in a square-shaped environment;
2. the BS position is variable;
3. $E_{elect}$: 50 nJ/bit;
4. $\epsilon_{fs}$: 10 pJ/bit/m²;
5. $\epsilon_{mp}$: 0.0013 pJ/bit/m⁴;
6. $E_{ag}$: 5 nJ/bit/signal;
7. $d_0 = \sqrt{\epsilon_{fs}/\epsilon_{mp}} = \sqrt{10/0.0013} \approx 87.7$ m.
Figure 1: Flowchart of the proposed algorithm
Figure 2: Total number of alive nodes in CRCWSN, LEACH, GSAGA, RCSDN, and RGWSN
The proposed method has been compared with the LEACH, RCSDN, GSAGA, and RGWSN methods. Figure 2 shows the number of alive nodes at the end of 1400 rounds, and Figure 3 shows the fitness function for the populations of the 5 optimum selected generations. Table 1 shows the simulation parameters.
Figure 3: The energy consumed by entire network for 5 optimum selected generation
Figure 4: Total number of cluster in different round in LEACH
Figure 5: Total number of cluster in different round in CRCWSN
TABLE I. Simulation Parameters

Parameter                     Value
Network size                  100*100 m
Base station location         50,50 m
Initial energy for node       rand [0.3,0.5] J
Eelec                         50 nJ/bit
εfs                           10 pJ/bit/m²
εmp                           0.0013 pJ/bit/m⁴
Data aggregation energy       5 nJ/bit/signal
Nodes number                  100
Grids number                  6
Nodes number of each grid     5
d0                            87 m
As shown by the diagrams, after running 1400 rounds, the number of alive nodes in the proposed algorithm is larger than in the LEACH, RCSDN, GSAGA, and RGWSN methods, so the network's lifetime increases. We also compare the number of clusters in various rounds with the LEACH method: Figures 4 and 5 show the number of clusters in the LEACH and proposed methods. As shown by the diagrams, in the proposed method the formation of clusters across the different rounds is more balanced than in the LEACH method.
5 Conclusion
In this paper, we presented a new clustering method for transmitting data from common nodes to CHs and from CHs to the BS in sensor networks. The selection of optimum clusters plays an effective role in increasing the sensor network's lifetime. We show, through multiple simulations, that the proposed algorithm differs from other proposed algorithms in reducing energy consumption and can significantly increase the network's lifetime compared to similar previous methods.
Bibliography
[1] GAO De-yun, ZHANG Lin-juan, WANG Hwang-cheng, Energy saving with node sleep and
power control mechanisms for wireless sensor networks,in: National Engineering Laboratory
for Next Generation Internet Interconnection Devices, School of Electronics and Information
Engineering, Beijing Jiaotong University, China, 18(1):49-59, 2011.
[2] A. G. Delavar, A. Abbas Baradaran, J. Artin, RGWSN: Presenting a genetic-based routing
algorithm to reduce energy consumption in wireless sensor network,International Journal of
Computer Science Issues, Vol. 8, Issue 5, No 1, 54-59, September 2011.
[3] Y. Zhu, W. Wu, J. Pan, Y. Tang, An energy-efficient data gathering algorithm to prolong lifetime of wireless sensor networks, Comput. Commun., 33:639-647, 2010.
[4] CHENG Hong-bing, YANG Geng, NHRPA: a novel hierarchical routing protocol algorithm
for wireless sensor networks, Journal of China Universities of Posts and Telecommunications,
15(3): 75-81, 2008.
[5] A.H. Mohajerzadeh, M.H.Yaghmaee, H.S.Yazdi,A.A.Rezaee, A Fair Protocol Using Generic
Utility Based Approach in Wireless Sensor Networks, Ultra Modern Telecommunications &
Workshops, 2009. ICUMT ’09. International Conference on, pp. 1-4, 2009.
[6] S.Yussof, R.Z. Razali, O.H.See, A Parallel Genetic Algorithm for Shortest Path Routing
Problem, 2009 International Conference on Future Computer and Communication, DOI
10.1109/ICFCC.2009.36, 2009.
[7] A.G. Delavar, J. Artin, M.M. Tajari, RCSDN: a Distributed Balanced Routing Algorithm with Optimized Cluster Distribution, ICSAP 2011, 3rd International Conference on Signal Acquisition and Processing, 26-28 February 2011, Singapore.
[8] A.G. Delavar, J.Artin, M.M.Tajari, PRWSN: A Hybrid Routing Algorithm with Special
Parameters in Wireless Sensor Network, in: A. Özcan, J. Zizka, and D. Nagamalai (Eds.):
WiMo/CoNeCo 2011, CCIS 162, pp. 145–158, 2011.
[9] Heinzelman, W.R., Chandrakasan, A., Balakrishnan, H., Energy efficient communication
protocol for wireless sensor networks, Proc. of the 33rd Hawaii International Conference on
System Science, vol. 2, DOI: 10.1109/HICSS.2000.926982, 2000.
[10] Shiyuan Jin, Ming Zhou, Annie S. Wu, Sensor Network Optimization Using a Genetic
Algorithm, School of EECS University of Central Florida Orlando, FL 32816
[11] Jianming Zhang, Yaping Lin, Cuihong Zhou, Jingcheng Ouyang, Optimal Model for Energy-Efficient Clustering in Wireless Sensor Networks Using Global Simulated Annealing Genetic Algorithm, DOI 10.1109/IITA.Workshops.2008.40.
[12] V. Purishotham Reddy, G. Michael, M. Umamaheshwari, Coarse-Grained Parallel Genetic Algorithm to solve the Shortest Path Routing problem using Genetic operators, Indian Journal of Computer Science and Engineering, ISSN 0976-5166, 2(1):39-42, 2011.
[13] Wang, Q., Yang, W. Energy consumption model for power management in wireless sensor
networks, In 4th Annual IEEE communications society conference on sensor, mesh and ad
hoc, communications and network, DOI:10.1109/SAHCN.2007.4292826, 2007.
[14] T. Rappaport, Wireless Communications: Principles & Practice, NJ, Prentice Hall, 1996.
INT J COMPUT COMMUN, ISSN 1841-9836, 8(1):70-78, February, 2013.
Detecting DDoS Attacks in Cloud Computing Environment
A.M. Lonea, D.E. Popescu, H. Tianfield
Alina Madalina Lonea
"Politehnica" University of Timisoara,
Faculty of Automation and Computers
B-dul Vasile Parvan, nr. 2, 300223, Timisoara, Romania
E-mail: [email protected]
Daniela Elena Popescu
University of Oradea, Faculty of Electrical Eng. and Information Tech.
Universitatii street, nr. 1, 410087, Oradea, Romania
E-mail: [email protected]
Huaglory Tianfield
School of Engineering and Built Environment,
Glasgow Caledonian University
Cowcaddens Road, Glasgow G4 0BA, United Kingdom
E-mail: h.tianfi[email protected]
Abstract:
This paper is focused on detecting and analyzing Distributed Denial of Service (DDoS) attacks in cloud computing environments. This type of attack is often the source of cloud service disruptions. Our solution is to combine the evidence obtained from Intrusion Detection Systems (IDSs) deployed in the virtual machines (VMs) of the cloud systems with a data fusion methodology in the front-end. Specifically, when the attacks appear, the VM-based IDSs will yield alerts, which will be stored into the MySQL database placed within the Cloud Fusion Unit (CFU) of the front-end server. We propose a quantitative solution for analyzing the alerts generated by the IDSs, using the Dempster-Shafer theory (DST) operations in 3-valued logic and the fault-tree analysis (FTA) for the mentioned flooding attacks. At the last step, our solution uses Dempster's combination rule to fuse evidence from multiple independent sources.
Keywords: cloud computing, cloud security, Distributed Denial of Service (DDoS)
attacks, Intrusion Detection Systems, data fusion, Dempster-Shafer theory.
1 Introduction
Cloud computing technology is in continuous development and faces numerous security challenges. In this context, one of the main concerns for cloud computing is the trustworthiness of cloud services. This problem requires prompt resolution, because otherwise organizations adopting cloud services would be exposed to increased expenditures while facing greater risks. A survey conducted by International Data Corporation (IDC) in August 2008 confirms that security is the major barrier for cloud users.
There are two things that cloud service providers should guarantee at all times: connectivity and availability; if these are not met, the organizations will suffer high costs [1].
This paper is focused on detecting and analyzing Distributed Denial of Service (DDoS) attacks in cloud computing environments. This type of attack is often the source of cloud service disruptions. One of the efficient methods for detecting DDoS is to use Intrusion Detection Systems (IDSs), in order to assure usable cloud computing services [2]. However, IDS sensors have the limitation that they yield massive amounts of alerts and produce high false positive and false negative rates [3].
With regard to these IDS issues, our proposed solution aims to detect and analyze Distributed Denial of Service (DDoS) attacks in cloud computing environments, using Dempster-Shafer Theory (DST) operations in 3-valued logic and Fault-Tree Analysis (FTA) for each VM-based Intrusion Detection System (IDS). The basic idea is to obtain information from multiple sensors, which are deployed and configured in each virtual machine (VM). The obtained information is integrated in a data fusion unit, which takes the alerts from multiple heterogeneous sources and combines them using Dempster's combination rule. Our approach quantitatively represents the imprecision and efficiently utilizes it in IDS to reduce the false alarm rates.
Specifically, our solution combines the evidence obtained from Intrusion Detection Systems (IDSs) deployed in the virtual machines (VMs) of the cloud system with a data fusion methodology within the front-end.
Our proposed solution can also solve the problem of analysing the logs generated by the sensors, which seems to be a big issue [4].
The remainder of this paper is organized as follows: Section 2 introduces Dempster-Shafer Theory. Section 3 presents the related work on IDS in cloud computing and on IDS using data fusion. Section 4 introduces the proposed solution for detecting DDoS attacks in cloud computing. Finally, Section 5 presents the concluding remarks.
2 Dempster-Shafer Theory (DST)
Dempster-Shafer Theory was established by two persons: Arthur Dempster, who introduced it in the 1960s, and Glenn Shafer, who developed it in the 1970s [5].
As an extension of Bayesian inference, the Dempster-Shafer Theory (DST) of Evidence is a powerful method in statistical inference, diagnostics, risk analysis and decision analysis. While in the Bayesian method probabilities are assigned only to single elements of the state space (Ω), in DST probabilities are assigned to mutually exclusive elements of the power set of possible states [6], [7].
According to the DST method, for a given state space (Ω) the probability (called mass) is allocated over the set of all possible subsets of Ω, namely 2^Ω elements.
Consequently, the state space (Ω) is also called the frame of discernment, whereas the procedure for assigning probabilities is called basic probability assignment (bpa) [6], [7], [8].
We will apply a particular case of DST, i.e., the DST operations in 3-valued logic using fault-tree analysis (FTA), adopted by Guth (1991) and also used in Popescu et al. (2010).
Thus, if a standard state space Ω is (True, False), then 2^Ω has 4 elements: { ϕ, True, False, (True, False) }. The (True, False) element describes the imprecision component introduced by DST, which refers to a state that is either true or false, but not both. DST is a useful method for fault-tree analysts in quantitatively representing the imprecision [8]. Another advantage of DST is that it can be efficiently utilized in IDS to reduce the false alarm rates by the representation of ignorance [6], [7], [10].
Because in DST the sum of all masses equals 1 and m(ϕ) = 0, we have the following relation:

m(True) + m(False) + m(True, False) = 1    (1)
In order to analyze the results of each sensor we use fault tree analysis, which can be realized by a Boolean OR gate. Table 1 describes the Boolean truth table for the OR gate. From Table 1 we have:

m(A) = (a1, a2, a3) = {m(T), m(F), m(T,F)}    (2)
Table 1: Boolean truth table for the OR gate

    ∨          | b1 = T | b2 = F | b3 = (T,F)
    a1 = T     |   T    |   T    |   T
    a2 = F     |   T    |   F    |   (T,F)
    a3 = (T,F) |   T    |  (T,F) |   (T,F)
m(B) = (b1, b2, b3) = {m(T), m(F), m(T,F)}    (3)

⇒ m(A ∨ B) = (a1b1 + a1b2 + a1b3 + a2b1 + a3b1; a2b2; a2b3 + a3b2 + a3b3)    (4)

m(A ∨ B) = (a1 + a2b1 + a3b1; a2b2; a2b3 + a3b2 + a3b3)    (5)
At the last step, our solution applies Dempster's combination rule, which allows fusing evidence from multiple independent sources using a conjunctive operation (AND) between two bpas m1 and m2, called the joint m12 [11]:

m12(A) = ( Σ_{B∩C=A} m1(B)·m2(C) ) / (1 − K),  when A ≠ ϕ    (6)

m12(ϕ) = 0,  and  K = Σ_{B∩C=ϕ} m1(B)·m2(C)

The factor 1−K, called the normalization factor, serves to entirely discard the conflicting evidence.
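Equation (6) specializes neatly to the 3-valued case, since the only conflicting intersections are {T}∩{F} and {F}∩{T}. A hedged Python sketch of the rule (again our own illustration, with the same triple ordering as above):

def dempster_combine(m1, m2):
    # Combine two bpa triples (m(T), m(F), m(T,F)) with Dempster's rule (6).
    t1, f1, u1 = m1
    t2, f2, u2 = m2
    k = t1 * f2 + f1 * t2                            # conflicting mass K
    true = (t1 * t2 + t1 * u2 + u1 * t2) / (1 - k)   # intersections equal to {T}
    false = (f1 * f2 + f1 * u2 + u1 * f2) / (1 - k)  # intersections equal to {F}
    both = (u1 * u2) / (1 - k)                       # only (T,F) with (T,F) stays imprecise
    return (true, false, both)

print(dempster_combine((0.6, 0.3, 0.1), (0.7, 0.2, 0.1)))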
Data fusion is also applied in real-world examples: robotics, manufacturing, remote sensing and medical diagnosis, as well as in military threat assessment and weather forecasting systems [12].
Sentz and Ferson (2002) demonstrated in their study that Dempster's combination rule is suitable for cases where the sources of evidence are reliable and only minimal or irrelevant conflict is generated.
3 Related Work

3.1 Intrusion Detection Systems (IDS) in Cloud Computing
One of the IDS strategies proved reliable in cloud computing environments is deploying an IDS in each virtual machine. This is the method we choose for our proposed solution. Mazzariello et al. (2010) presented and evaluated this method in comparison with another IDS deployment strategy, which uses a single IDS near the cluster controller. An IDS applied to each virtual machine in a cloud computing platform eliminates the overloading problem, because the network traffic is in effect split among all the IDSs. Thus, applying an IDS to each virtual machine avoids the drawback of the IDS strategy near the cluster controller, which tends to be overloaded because it has to monitor all the traffic of the cloud computing infrastructure. Another advantage of this strategy, as described by Roschke et al. (2009), is the benefit of reducing the impact of possible attacks by the IDS Sensor VMs.
However, the limitation of the IDS strategy applied to each virtual machine is the missing correlation phase, which is suggested as future work by Mazzariello et al. (2010).
The correlation phase will be included in our proposed solution: besides the IDS for each virtual machine, our IDS cloud topology will include a Cloud Fusion Unit (CFU) on the front-end, with the purpose of obtaining and controlling the alerts received from the IDS sensor VMs, as presented by Roschke et al. (2009) in their theoretical IDS architecture for the cloud, which utilizes an IDS Management Unit.
Compared to Roschke et al. (2009), who suggested the utilization of the IDMEF (Intrusion Detection Message Exchange Format) standard, a useful component for the storage and exchange of the alerts from the management unit, the alerts in our proposed solution will be stored into the MySQL database of the Cloud Fusion Unit. The Cloud Fusion Unit adds the capacity to analyze the results using the Dempster-Shafer theory (DST) of evidence in 3-valued logic and the Fault-Tree Analysis for the IDS of each virtual machine; at the end, the results of the sensors are fused using Dempster's combination rule.
A similar method of using an IDS Management Unit is proposed by Dhage et al. (2011), who presented a theoretical model of IDS in cloud computing using a single IDS controller, which creates a single mini IDS instance for each user. This IDS instance can be used in multiple node controllers, and a node controller can contain IDS instances of multiple users. The analysis phase of the mini IDS instance for each user takes place in the IDS controller. Compared with Roschke et al. (2009), where the emphasis is on how to realize the synchronization and integration of the IDS Sensor VMs, in Dhage et al. (2011) the focus is to provide a clear understanding of the cardinality used in the basic architecture of IDS in a cloud infrastructure.
Applying an IDS to each virtual machine is an idea also suggested by Lee et al. (2011), who increase the effectiveness of IDS by assigning a multi-level intrusion detection system and log management analysis in cloud computing. In this approach the users receive an appropriate level of security, determined by the level of the IDS applied to their virtual machine as well as by the prioritization stage of the log analysis. This multi-level security model solves the issue of using resources effectively.
Lo et al. (2010) proposed a cooperative IDS system for detecting DoS attacks in cloud computing networks, which has the advantage of protecting the system from a single-point-of-failure attack, even if it is a slower IDS solution than a pure Snort-based IDS. The framework proposed by Lo et al. (2010) is a distributed IDS system, where each IDS is extended with three additional modules: block, communication and cooperation, which are added into the Snort IDS system.
3.2 IDS using Dempster-Shafer theory
Dempster-Shafer Theory (DST) is an effective solution for assessing the likelihood of DDoS attacks, as demonstrated by several research papers in the context of network intrusion detection systems. Dissanayake (2008) presented a survey on intrusion detection using DST.
Our study aims to detect DDoS attacks in cloud computing environments. Dempster-Shafer Theory (DST) is used to analyze the results received from each sensor (i.e., VM-based IDS).
The data used in DST experiments vary: Yu and Frincke (2005) used the DARPA DDoS intrusion detection evaluation datasets, Chou et al. (2008) used the DARPA KDD99 intrusion detection evaluation dataset, Chen and Aickelin (2006) used the Wisconsin breast cancer dataset and the IRIS plant data, while other scientists generated their own data [7]. The data used in our proposed solution will be generated by ourselves, by performing DDoS attacks with specific tools against the VM-based IDSs.
Siaterlis et al. (2003) and Siaterlis and Maglaris (2005) performed a similar study of detecting DDoS using data fusion; their field was an operational university campus network, while in our solution the DDoS attacks are to be detected and analyzed in our private cloud
computing environment.
Additionally, we consider analyzing the attacks generated with TCP, UDP and ICMP packets, like Siaterlis et al. (2003) and Siaterlis and Maglaris (2005). However, instead of applying DST on the state space Ω = {Normal, UDP-flood, SYN-flood, ICMP-flood}, our study uses DST operations in 3-valued logic, as suggested by Guth (1991), for the same flooding attacks: TCP-flood, UDP-flood, ICMP-flood, for each VM-based IDS. Like Siaterlis and Maglaris (2005), Chatzigiannakis et al. (2007) chose the same frame of discernment, while Hu et al. (2006) used the state space {Normal, TCP, UDP, ICMP}.
Furthermore, compared with the study performed by Siaterlis et al. (2003) and Siaterlis and Maglaris (2005), who use a minimal neural network at the sensor level, our proposed solution will assign the probabilities using DST in 3-valued logic, the pseudocode and the fault tree analysis. Whilst the computational complexity of DST increases exponentially with the number of elements in the frame of discernment [12], the DST 3-valued logic proposed in our research does not encounter this issue, and thus meets the efficiency requirements in terms of both detection rate and computation time [15].
Finally, the data fusion of the evidence obtained from sensors, studied by Siaterlis and Maglaris (2005), will be used in our study. The data fusion will be realized using the Dempster-Shafer combination rule, whose advantages were demonstrated in Siaterlis and Maglaris (2005), i.e., maximization of the DDoS true positive rate and minimization of the false positive alarm rate, by combining the evidence received from sensors. Therefore, the work of cloud administrators will be alleviated, as the number of alerts will decrease.
4 Proposed Solution
In order to detect and analyze Distributed Denial of Service (DDoS) attacks in cloud computing environments we propose the solution presented in Figure 1. For illustration purposes, a private cloud with a front-end and three nodes is set up. Whilst the detection stage is executed within the nodes, more precisely inside the virtual machines (VMs) where the Intrusion Detection Systems (IDSs) are installed and configured, the attack assessment phase is handled inside the front-end server, in the Cloud Fusion Unit (CFU).
The first step in our solution is the deployment of a private cloud using the Eucalyptus open-source version 2.0.3. The topology of the implemented private cloud is: a front-end (with Cloud Controller, Walrus, Cluster Controller, Storage Controller) and a back-end (i.e., three nodes). The Managed networking mode is chosen because of the advanced features it provides, and the Xen hypervisor is used for virtualization.
Then, the VM-based IDSs are created by installing and configuring Snort in each VM. The reason for choosing this IDS location is that the overloading problems can be avoided and the impact of possible attacks can be reduced [2], [13].
These IDSs will yield alerts, which will be stored into the MySQL database placed within the Cloud Fusion Unit (CFU) of the front-end server. A single database is suggested in order to reduce the risk of losing data, to maximize the resource usage inside the VMs and to simplify the work of the cloud administrator, who will have all the alerts in the same place. A similar idea of obtaining and controlling the alerts received from the IDS Sensor VMs using an IDS Management Unit was presented by Roschke et al. (2009) as a theoretical IDS architecture for the cloud. A similar method of using an IDS Management Unit is proposed in Dhage et al. (2011). However, our solution adds the capacity to analyse the results using the Dempster-Shafer theory of evidence in 3-valued logic.
As shown in Figure 1, the Cloud Fusion Unit (CFU) comprises three components: the MySQL database, the bpa calculation and the attack assessment.
Figure 1: IDS Cloud Topology
I. MySQL database
The MySQL database is introduced with the purpose of storing the alerts received from the VM-based IDSs. Furthermore, these alerts will be converted into Basic Probability Assignments (bpas), which will be calculated using the pseudocode below.
II. Basic probability assignment (bpa) calculation
For calculating the basic probability assignments, we first decide on the state space Ω. In this paper we use DST operations in 3-valued logic {True, False, (True, False)}, following Guth (1991), for the following flooding attacks: TCP-flood, UDP-flood, ICMP-flood, for each VM-based IDS. Thus, the analyzed packets will be: TCP, UDP and ICMP. Further, a pseudocode for converting the alerts received from the VM-based IDS into bpas is provided. The purpose of this pseudocode is to obtain the following probabilities of the alerts received from each VM-based IDS:

(m_UDP(T), m_UDP(F), m_UDP(T,F))
(m_TCP(T), m_TCP(F), m_TCP(T,F))
(m_ICMP(T), m_ICMP(F), m_ICMP(T,F))
Figure 2: BPA’s calculation
Pseudocode for converting the alerts into bpas:

For each node
Begin
For each X ∈ {UDP, TCP, ICMP}:
Begin
1: Query the alerts from the database when an X attack occurs for the specified hostname
2: Query the total number of possible X alerts for each hostname
3: Query the alerts from the database when the X attack is unknown
4: Calculate Belief (True) for X, by dividing the result obtained at step 1 by the result obtained at step 2
5: Calculate Belief (True, False) for X, by dividing the result obtained at step 3 by the result obtained at step 2
6: Calculate Belief (False) for X: 1 − Belief (True) − Belief (True, False)
End
End
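A runnable Python counterpart of the pseudocode, under the assumption that the three alert counts have already been queried from the MySQL database (the function name and the example counts are ours; the actual queries depend on the Snort alert schema, which the paper does not specify):

def alerts_to_bpa(count_attack, count_total, count_unknown):
    # Steps 4-6 of the pseudocode: convert the three query results for one
    # packet type X on one node into the bpa triple (m(T), m(F), m(T,F)).
    true = count_attack / count_total      # Belief (True)
    both = count_unknown / count_total     # Belief (True, False)
    false = 1.0 - true - both              # Belief (False)
    return (true, false, both)

# Hypothetical counts for UDP alerts on one node:
m_udp = alerts_to_bpa(count_attack=40, count_total=100, count_unknown=15)
print(m_udp)   # -> (0.4, 0.45, 0.15)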
Furthermore, after obtaining the probabilities for each attack packet type (i.e., UDP, TCP, ICMP) for each VM-based IDS, the probabilities for each VM-based IDS are calculated following the fault tree, as shown in Figure 2. Figure 2 shows only the calculation of the probabilities (i.e., m_S1(T), m_S1(F), m_S1(T,F)) for the first VM-based IDS.
Thus, using DST with fault-tree analysis we can calculate the belief (Bel) and plausibility (Pl) values for each VM-based IDS:

Bel(S1) = m_S1(T)    (7)

Pl(S1) = m_S1(T) + m_S1(T,F)    (8)
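Continuing the sketch, the fault-tree OR over the three packet types yields the sensor-level bpa, from which Bel and Pl follow directly (or_fuse is the OR-gate sketch from Section 2; the input triples are invented):

m_udp = (0.4, 0.45, 0.15)
m_tcp = (0.1, 0.8, 0.1)
m_icmp = (0.05, 0.9, 0.05)
m_s1 = or_fuse(or_fuse(m_udp, m_tcp), m_icmp)   # fault-tree OR over UDP, TCP, ICMP
bel_s1 = m_s1[0]                                # Bel(S1) = m_S1(T), equation (7)
pl_s1 = m_s1[0] + m_s1[2]                       # Pl(S1) = m_S1(T) + m_S1(T,F), equation (8)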
III. Attack assessment
The attack assessment consists of the data fusion of the evidence obtained from the sensors using Dempster's combination rule, with the purpose of maximizing the DDoS true positive rate and minimizing the false positive alarm rate. m_S1,S2(T) can be calculated using Table 2 and equation (6).
Table 2: Boolean truth table for the OR gate

               | m_S1(T)           | m_S1(F)           | m_S1(T,F)
    m_S2(T)    | m_S1(T)·m_S2(T)   | m_S1(F)·m_S2(T)   | m_S1(T,F)·m_S2(T)
    m_S2(F)    | m_S1(T)·m_S2(F)   | m_S1(F)·m_S2(F)   | m_S1(T,F)·m_S2(F)
    m_S2(T,F)  | m_S1(T)·m_S2(T,F) | m_S1(F)·m_S2(T,F) | m_S1(T,F)·m_S2(T,F)
5 Conclusions
To detect and analyze Distributed Denial of Service (DDoS) attacks in cloud computing environments we have proposed a solution using Dempster-Shafer Theory (DST) operations in 3-valued logic and Fault-Tree Analysis (FTA) for each VM-based Intrusion Detection System (IDS). Our solution quantitatively represents the imprecision and efficiently utilizes it in IDS to reduce the false alarm rates by the representation of ignorance.
Whilst the computational complexity of DST increases exponentially with the number of elements in the frame of discernment [12], the DST 3-valued logic in our solution does not have this issue, and thus meets the efficiency requirements in terms of both detection rate and computation time. At the same time, the usability requirement is fulfilled: the work of cloud administrators will be alleviated by using the Dempster rule of evidence combination, as the number of alerts will decrease and the conflict generated by the combination of information provided by multiple sensors is entirely eliminated.
To sum up, by using DST our proposed solution has the following advantages: it accommodates the uncertain state, reduces the false negative rates, increases the detection rate, resolves the conflicts generated by the combination of information provided by multiple sensors and alleviates the work of cloud administrators.
Acknowledgment
This work was partially supported by the strategic grant POSDRU/88/1.5/S/50783, Project
ID50783 (2009), co-financed by the European Social Fund - Investing in People, within the
Sectoral Operational Programme Human Resources Development 2007-2013.
Bibliography
[1] Perry, G., Minimizing public cloud disruptions, TechTarget, [online]. Available at:
http://searchdatacenter.techtarget.com/tip/Minimizing-public-cloud-disruptions, 2011.
[2] Roschke, S., Cheng, F. and Meinel, C., Intrusion Detection in the Cloud. In Eighth IEEE International Conference on Dependable, Autonomic and Secure Computing, pp. 729-734, 2009.
[3] Yu, D. and Frincke, D., A Novel Framework for Alert Correlation and Understanding. International Conference on Applied Cryptography and Network Security (ACNS) 2004, Springer LNCS, 3089, pp. 452-466, 2004.
[4] Lee, J-H., Park, M-W., Eom, J-H. and Chung, T-M., Multi-level Intrusion Detection System and Log Management in Cloud Computing. In 13th International Conference on Advanced Communication Technology (ICACT 2011), Seoul, 13-16 February, pp. 552-555, 2011.
[5] Chen, Q. and Aickelin, U., Dempster-Shafer for Anomaly Detection. In Proceedings of the
International Conference on Data Mining (DMIN 2006), Las Vegas, USA, pp. 232-238, 2006.
[6] Siaterlis, C., Maglaris, B. and Roris, P., A novel approach for a Distributed Denial of Service
Detection Engine. National Technical University of Athens. Athens, Greece, 2003.
[7] Siaterlis, C. and Maglaris, B., One step ahead to Multisensor Data Fusion for DDoS Detection. Journal of Computer Security, 13(5):779-806, 2005.
[8] Guth, M.A.S., A Probabilistic Foundation for Vagueness & Imprecision in Fault-Tree Analysis. IEEE Transactions on Reliability, 40(5), pp.563-569, 1991.
[9] Popescu, D.E., Lonea, A.M., Zmaranda, D., Vancea, C. and Tiurbe, C., Some Aspects about Vagueness & Imprecision in Computer Network Fault-Tree Analysis. INT J COMPUT COMMUN, ISSN 1841-9836, 5(4):558-566, 2010.
[10] Esmaili, M., Dempster-Shafer Theory and Network Intrusion Detection Systems. Scientia
Iranica, Vol. 3, No. 4, Sharif University of Technology, 1997.
[11] Sentz, K. and Ferson, S., Combination of Evidence in Dempster-Shafer Theory. Sandia
National Laboratories, Sandia Report, 2002.
[12] Dissanayake, A., Intrusion Detection Using the Dempster-Shafer Theory. 60-510 Literature
Review and Survey, School of Computer Science, University of Windsor, 2008.
[13] Mazzariello, C., Bifulco, R. and Canonico, R., Integrating a Network IDS into an Open
Source Cloud Computing Environment. In Sixth International Conference on Information
Assurance and Security, pp. 265-270, 2010.
[14] Dhage, S. N., et al., Intrusion Detection System in Cloud Computing Environment. In International Conference and Workshop on Emerging Trends in Technology (ICWET 2011), TCET, Mumbai, India, pp. 235-239, 2011.
[15] Lo, C-C., Huang, C-C. and Ku, J., A Cooperative Intrusion Detection System Framework for Cloud Computing Networks. In 39th International Conference on Parallel Processing Workshops, pp. 280-284, 2010.
[16] Yu, D. and Frincke, D., Alert Confidence Fusion in Intrusion Detection Systems with Extended Dempster-Shafer Theory. ACM-SE 43: Proceedings of the 43rd ACM Southeast Conference, pp. 142-147, 2005.
[17] Chou, T., Yen, K.K., Luo, J., Network intrusion detection design using feature selection of soft computing paradigms. International Journal of Computational Intelligence, 4(3):102-105, 2008.
[18] Chatzigiannakis, V., et al., Data fusion algorithms for network anomaly detection: classification and evaluation. Proceedings of the Third International Conference on Networking
and Services (ICNS’07), 2007.
[19] Hu, W., Li, J. and Gao, Q., Intrusion Detection Engine Based on Dempster-Shafer’s Theory of Evidence. Communications, Circuits and Systems Proceedings, 2006 International
Conference, 3:1627-1631, 2006.
INT J COMPUT COMMUN, ISSN 1841-9836
8(1):79-86, February, 2013.
Reliable Critical Infrastructure: Multiple Failures for Multicast
using Multi-Objective Approach
F.A. Maldonado-Lopez, Y. Donoso
Ferney A. Maldonado-Lopez, Yezid Donoso
Universidad de los Andes, Bogotá, Colombia
E-mail: [email protected]
[email protected]
Abstract:
Multicast is the keystone of the multimedia Internet, and one of the newest and most used services in telecommunication networks. However, these networks face big challenges from failures with diverse causes, including natural disasters and bad configurations. Network operators need to establish mechanisms to keep multicast services available, and to plan actions to handle incidents. We study and implement an elitist evolutionary algorithm based on the Strength Pareto Evolutionary Algorithm (SPEA). Our implementation recalculates network routes, even when there are multiple failures. The results indicate that our tool finds lower-cost, higher-availability multicast trees to protect multicast services.
Keywords: resilience, protection, network survivability, multi-objective evolutionary algorithm.
1 Introduction
Multicast is one of the newest and most used services in telecommunication networks. Applications on the Internet use multicast services as a key factor in multimedia, video conferencing, distributed games, Internet television, and telepresence. A multicast service simultaneously sends messages from a single source to a group of destinations, creating a multicast tree over the network. Networks are exposed to several sources of failure, including misconfiguration or operational errors, natural disasters, attacks, and environmental challenges. Additionally, to maintain the services, they must handle unusual but legitimate traffic [1], such as traffic peaks at specific hours or dates. Because multicast applications demand significant amounts of traffic, they need a mechanism to protect them from network failures. The Internet, for example, has been designed to survive random failures. If a node is down, a reliable protocol is able to reroute the traffic around it and use alternative connections.
In networking, resilience and survivability refer to the abilities of the network to overcome failures and keep its services working. The study of resilience and survivability has become an important aspect of managing infrastructure in multicast networking. In this paper, we formulate a mechanism to protect a multicast service when the network experiences multiple failures. Failures in multicast have been widely studied [2], [3]. A common strategy to face this problem is finding a redundant multicast tree (RMT). Some authors consider that an RMT could be completely link-disjoint from the original, and propose to find an RMT with minimal cost. Determining the optimal RMT leads to the NP-hard Steiner tree problem [4]. Several algorithms have been proposed to deal with this issue, such as topological methods [5] or the Nearest Participant First (NPF) [6]. Furthermore, there are network protocols which calculate a multicast tree with minimal cost, such as MOSPF, PIM-DM, PIM-SM, and CBT.
However, there are some limitations with existing approaches. First, the algorithms that calculate RMTs were designed to optimize only one decision variable; thus, they use only distance to find the minimum-cost tree. Furthermore, the algorithms and protocols previously mentioned were tested with single failures, that is, when only one link disruption occurs. Nonetheless,
telecommunication operators need to find not only routes with minimal cost, but also routes with high availability. Moreover, operators require techniques to plan and reconfigure the network when a set of links is damaged or eliminated.
We propose a mechanism to protect multicast services with maximum availability and minimum cost, considering multiple failures and diverse decision variables. Our approach applies a heuristic multi-objective evolutionary algorithm (MOEA), a stochastic search method, to estimate the optimal RMT. We implement the Strength Pareto Evolutionary Algorithm (SPEA), an elitist evolutionary algorithm that finds Pareto-optimal solutions [9]. The method used has polynomial computational complexity O(mn^2), where m is the number of arcs and n the number of nodes [10]. The results exhibit possible configurations after multiple failures for this problem, which is considered NP-hard.
2 Multicast Protection Problem
The Multicast Protection Problem (MPP) is analysed from the communication survivability and resilience viewpoint. Multicast services, methods for provisioning, and failures are given a formal description. A network is depicted as a weighted, directed, connected graph G = (N, A), with a set N of n nodes and a set A of arcs. Nodes are elements labelled 1 . . . n and an arc is an ordered pair of nodes; an arc between nodes i and j is denoted by (i, j), with i, j ∈ N. A network is represented by an n × n adjacency matrix A, whose ij-th element is 1 if (i, j) ∈ A and 0 otherwise. An arc symbolizes a communication link in the network, so it is assigned two link attributes: cost and availability. Cost indicates the length of the link, and availability is the probability that the link works during a period of time. These attributes are represented by the n × n matrices W = {wij} for distance and V = {vij} for availability. Table 1 contains the notation used to model the problem.
N  Set of nodes labelled {1, 2, . . . , n}
A  Set of arcs {(i, j) | i, j ∈ N}
A  Adjacency matrix, n × n
W  Matrix of cost, W(Aij) = w(i,j)
V  Matrix of availability, V(Aij) = v(i,j)
s  Multicast source node, s ∈ N
D  Destination set {d1, d2, · · · , dl : di ∈ N}
T  Multicast session T = (s, D)
T  Multicast session subgraph T = (N, A′) | A′ ⊆ A
F  Set of failures {f1, f2, · · · , fk | fi ∈ A}

Table 1: Graph notation
Definition 1. A multicast session is the delivery of data from a source node s to l destination nodes that belong to the set D.
According to Definition 1, we model a multicast session as a graph T: a directed tree with root s and terminal vertices di ∈ D. T is a set of paths from s to the di ∈ D. A path P is a walk without repeated nodes, represented as a node sequence pj = {s, i1, i2, · · · , dj} : s, ik, dj ∈ N.
A failure is a link disruption; therefore, multiple failures are described as a set of damaged or eliminated links. This set, called the vector of failures, is labelled F, where k is the number of edges that have failed. The failures are uncorrelated, and they occur following a uniform random distribution.
Definition 2. A failure fi is a pair (i, j) ∈ A, where i, j ∈ N. After a failure the graph G changes to G′ = (N′, A′), N′ ⊆ N and A′ ⊆ A.
Then, the multicast protection problem against multiple failures is described as follows.
Definition 3. Given a weighted graph G = (N, A), a multicast demand T, and a set of links that have failed F, the Multicast Protection Problem (MPP) consists of providing an RMT G′ with maximum availability V(G′) and minimum cost W(G′).
MPP is similar to finding the optimal tree, a classic graph theory problem known as the Steiner tree problem, which has been proven NP-hard. Consequently, MPP is also an NP-hard problem.
3 Optimization Model
In order to make the computation feasible, the network is represented by an adjacency matrix A. We assume that all links in the network are directional. The adjacency matrix has zero entries on the main diagonal because there are no loops at the vertices. The attribute values of cost and availability are represented in the matrices W and V.
W(G) is the cost function of graph G, which is the sum of the costs of all arcs in G (1). Similarly, the availability function V(G) is the probability that all arcs in G are available (2).

W : G → ℜ,  W(G) = Σ_(i,j) w(i,j),  ∀ (i,j) ∈ G    (1)

V : G → [0, 1],  V(G) = Π_(i,j) v(i,j),  ∀ (i,j) ∈ G    (2)
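As a minimal sketch of (1) and (2), a tree can be held as a list of arcs, with dictionaries for the w(i,j) and v(i,j) entries (the names and figures below are ours):

import math

def tree_cost(arcs, w):
    # W(G): sum of the costs of all arcs in the tree, equation (1)
    return sum(w[a] for a in arcs)

def tree_availability(arcs, v):
    # V(G): product of the availabilities of all arcs, equation (2)
    return math.prod(v[a] for a in arcs)

w = {(0, 1): 2.0, (1, 4): 3.0, (1, 5): 1.5}      # cost entries w(i,j)
v = {(0, 1): 0.99, (1, 4): 0.95, (1, 5): 0.98}   # availability entries v(i,j)
arcs = [(0, 1), (1, 4), (1, 5)]
print(tree_cost(arcs, w), tree_availability(arcs, v))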
Figure 1 shows three trees able to carry data to multiple destinations. Figure 1a is a graph representing the complete network; Figure 1b is a subgraph able to reach both destinations; Figure 1c is another subgraph reaching the same destinations. Note that both subgraphs are multicast trees and are disjoint; that is, one is a multicast protection tree of the other.
Figure 1: Network and Multicast Protection Tree for a demand from node 0 to nodes 4 and 5.
3.1 Variable definition
We define x(i,j) in (3) as a binary decision variable that indicates whether an edge is used by the session T. x(i,j) corresponds to the entry (i, j) of the adjacency matrix A.

x(i,j) = 1 when the link from node i to node j is used, and 0 otherwise.    (3)

3.2 Objective functions and constraints
The expressions (4) and (5) are the objective functions. Recall that the goal is to find a set of optimal trees that minimize the total cost and maximize the total availability.

min W(G) = min Σ_(i,j) x(i,j) · w(i,j),  ∀ (i,j) ∈ A    (4)

max V(G) = max Π_(i,j) x(i,j) · v(i,j),  ∀ (i,j) ∈ A    (5)
In networking, a multicast session is a particular case where only one packet is sent from s and multiple copies are delivered to the di. A multicast session is a variation of the transport problem. In the transport problem, an intermediate node ik receives and delivers the same amount of goods. In multicast, an intermediate node can replicate packets and send them over several output arcs.
Then, the multi-objective optimization problem is subject to:

Σ_{j∈A} x(s,j) = 1    (6)

Σ_{j∈A} x(j,di) = −1    (7)

x(i,j) ≤ Σ_k x(j,k) | x(i,j) = 1, k ∈ A,  j ≠ s, j ≠ di    (8)

x(i,j) ≥ 0,  i, j ∈ A    (9)

Constraints (6) and (7), called the supply and demand constraints, are associated with the root and the destination nodes. Constraint (8) guarantees packet replication, and the last constraint imposes positive flow.
4 Case Study
In this paper we study multiple failures in telecommunication networks and reliability for multicast services, and we propose a different mechanism to reconfigure the network. As an example, we propose a case study: we generate multiple failures and run our implementation to avoid the failed links and generate a new RMT. This work is divided into three stages. First, we implement a mechanism based on SPEA. Secondly, the implementation is tested on the COST239 PAN-EUROPEAN network. Finally, the simulation is executed, and data is obtained for analysis.
4.1 Creating an evolutionary algorithm
A genetic algorithm (GA) is a stochastic search method based on chromosomes. Each chromosome represents a valid solution to the problem. The solution space Ω contains all feasible solutions that satisfy the problem constraints. Thus, a particular solution ωi ∈ Ω is a tree represented by a chromosome. The GA initiates a population P, a set of randomly generated chromosomes. Afterwards, each chromosome is rated by the objective function, or fitness function. Then, pairs of solutions, or individuals, are selected and produce offspring by mixing their genetic information. Successor solutions are generated by combining two parent states or modifying a single one, analogous to natural selection. There are two kinds of evolutionary algorithms, non-elitist and elitist ones. Non-elitist algorithms use the whole latter population for the next iteration; this procedure allows exploiting the non-dominated solutions, that is, the Pareto-optimal solutions found. Elitist algorithms, in turn, preserve the best solutions, or elite solutions P, directly into the next generation. This kind of problem representation is used to solve complex optimization problems [11]. We use a GA optimization in this paper to solve the RMT problem.
Strength Pareto Evolutionary Algorithm - SPEA
Zitzler and Thiele proposed an evolutionary algorithm called SPEA [9]. This algorithm maintains elitism through an external population P, a collection of non-dominated solutions ω. We say that ω′ dominates ω, ω′ ≽ ω, if W(ω′) ≤ W(ω) and V(ω′) ≥ V(ω). The algorithm finds non-dominated solutions and compares them with the previous external population until a new external population is segregated. This algorithm preserves the elitist population P. SPEA has been widely used to solve network optimization problems [11]. This particular problem was approached in two phases. First, we used a high-level modelling system for mathematical optimization which allows solving linear, nonlinear, and mixed-integer optimization problems. Second, a SPEA genetic algorithm was implemented. This method demonstrated polynomial computational complexity O(mn^2), where m is the number of arcs and n the number of nodes [10], solving a problem considered NP-hard [12].
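The dominance relation that drives SPEA's external population can be sketched directly from this definition (a minimal Python illustration of our own; the function names are ours):

def dominates(sol1, sol2):
    # sol = (cost W, availability V); sol1 dominates sol2 if it is no worse
    # in both objectives and strictly better in at least one.
    w1, v1 = sol1
    w2, v2 = sol2
    return (w1 <= w2 and v1 >= v2) and (w1 < w2 or v1 > v2)

def non_dominated(solutions):
    # The external population keeps only solutions no other solution dominates.
    return [s for s in solutions
            if not any(dominates(o, s) for o in solutions if o is not s)]

print(non_dominated([(10, 0.90), (12, 0.95), (11, 0.85)]))   # (11, 0.85) is dominated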
Chromosome
The chromosome is a tree that can be represented by an adjacency matrix, a subset of A. However, a matrix representation has difficulties when it is necessary to reconstruct a unique route. Consequently, we change the chromosome representation to a list of paths following the original model. Figure 2 depicts the structure of a chromosome.
A chromosome is stored as the list of paths p0 = (s, . . . , d0), p1 = (s, . . . , d1), . . . , pk = (s, . . . , dk).
Figure 2: Chromosome as a list of paths.
4.2 SPEA implementation
We design and implement a specific tool to find optimal RMTs. The initial population P0 is generated by a random tree generator, Algorithm 1. The random tree generator creates random walks from node s to each destination node di. We also implement matrix operations and processes for creating multicast demands T, and operators for chromosomes, including crossover and mutation. Algorithm 2 is a generalized description that we follow to generate a set of solutions; a sketch of the random tree generator is given below.
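Algorithm 1 itself is not reproduced in the transcription; the following Python sketch is our reading of it: random walks without repeated nodes from s to each destination, collected into a chromosome as in Figure 2 (it assumes every destination is reachable):

import random

def random_path(adj, s, d, rng):
    # Random walk from s to d without repeated nodes; restarts on dead ends.
    while True:
        path, node = [s], s
        while node != d:
            choices = [j for j in adj[node] if j not in path]
            if not choices:
                break                     # dead end: restart the walk
            node = rng.choice(choices)
            path.append(node)
        if node == d:
            return path

def random_chromosome(adj, s, dests, rng):
    # A chromosome is the list of paths from s to each destination (Figure 2).
    return [random_path(adj, s, d, rng) for d in dests]

adj = {0: [1, 2], 1: [3, 4], 2: [4, 5], 3: [], 4: [5], 5: []}
print(random_chromosome(adj, 0, [4, 5], random.Random(1)))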
84
F.A. Maldonado-Lopez, Y. Donoso
Simulation scenario
The implementation is tested on the COST239 PAN-EUROPEAN network. The network, with 28 nodes, represents the Internet backbone and connects the main cities in Europe. After setting the topologies, we set the parameters and generate random link failures in the tree of the multicast service. Table 2 summarizes the simulation parameters. Then, our implementation computes RMTs, which are stored for analysis.
Parameter                       Value
Number of generations           15
Initial population size         10
Max population size             30
Max non-dominant population     20
Crossover probability           0.2
Mutation probability            0.2

Table 2: Simulation Parameters
5 Analysis and Conclusions
After running our implementation, we notice that SPEA is a powerful tool for reliability in MPP and network design. Figure 3 is an example using the COST239 PAN-EUROPEAN network as the test topology. First, the implementation finds a tree, which in our hypothesis is the original multicast tree (3a). After that, the multiple-failures module chooses arcs and deletes them, simulating link failures against the service. Then, our implementation of SPEA for reliable networks identifies those paths that become unreachable and creates new protection routes. Figures 3b and 3c show two examples of RMTs calculated to make the network infrastructure reliable.
However, COST239 and NSFNet are small network topologies for testing the real performance of the implementation. For that reason, we also test it with a generated topology.
Figure 3: Multicast network trees
We create a virtual overlay network including thirty nodes, with twenty-nine arcs per node. The results are condensed in Figure 4. We can observe that the network is capable, with high availability and low cost. When it experiences failures, our algorithm recalculates the multicast tree and re-configures the paths. In some cases, the services can be fulfilled without problems; however, services can be degraded or, in the worst case, cancelled. As a result, we can establish a correlation between the topologies, the mechanisms to find RMTs, and the complexity needed to compute alternative trees. Here we can notice that our implementation finds new low-cost, high-availability trees, even as the number of failures increases.
We present and investigate a different approach to the protection of survivable multicast sessions in networks. The results allow optimizing designs in both variables: total cost and availability. We also apply a SPEA heuristic algorithm to MPP. We use SPEA due to its reduced computational time and complexity O(mn^2). After using our approach on the PAN-EUROPEAN and the NSFNET networks, we find that the performance of the multicast protection schemes is better when the size of the network is bigger and the network experiences uncorrelated multiple failures. Moreover, the results show that our implementation is an important tool to support the decisions of a network operator under conditions similar to this scenario. Also, the results display the trade-off between cost and availability in a network and show how multicast sessions can be restored or re-configured using this tool.
Figure 4: Cost and availability for solutions RMTs
86
F.A. Maldonado-Lopez, Y. Donoso
Bibliography
[1] James P.G. Sterbenz, David Hutchison, Egemen K. Çetinkaya, Abdul Jabbar, Justin P. Rohrer, Marcus Schöller, Paul Smith, Resilience and survivability in communication networks: Strategies, principles, and survey of disciplines, Computer Networks, 54(8):1245-1265, 2010.
[2] M. Zotkiewicz, W. Ben-Ameur, and M. Pióro, Finding Failure-Disjoint Paths for Path Diversity Protection in Communication Networks, IEEE Communications Letters, 14:776-778, 2010.
[3] Jia Weijia, Cao Jiannong, Jia Xiaohua, H. Lee Chan, Design and analysis of an efficient and reliable atomic multicast protocol, Computer Communications, 21:37-53, 1998.
[4] Medard, M., Finn, S.G., Barry, R.A. and Gallager, R.G., Redundant trees for preplanned recovery in arbitrary vertex-redundant or edge-redundant graphs, IEEE/ACM Transactions on Networking, 7(5):641-652, 1999.
[5] A. V. Panyukov, The Steiner Problem in Graphs: Topological Methods of Solution, Automation and Remote Control, 65:439-448, 2004.
[6] H. Takahashi, and A. Matsuyama, An approximate solution for the Steiner problem in graphs,
Math Japonica, 24:573-577, 1980.
[7] K. Singhal Narendra, Ou Canhui, and Mukherjee Biswanath, Cross-sharing vs. self-sharing trees for protecting multicast sessions in mesh networks, Computer Networks, 50:200-206, 2006.
[8] Yang Chyi-Bao, and Wen Ue-Pyng, Applying tabu search to backup path planning for multicast networks, Computers & Operations Research, 32:2875-2889, 2005.
[9] E. Zitzler and L. Thiele, Multiobjective evolutionary algorithms: a comparative case study and the strength Pareto approach, IEEE Transactions on Evolutionary Computation, 3:257-271, 1999.
[10] Deb Kalyanmoy, Multi-Objective Optimization Using Evolutionary Algorithms, Wiley, Chichester, New York, 2001.
[11] Yezid Donoso and Ramon Fabregat, Multi-Objective Optimization in Computer Networks
Using Metaheuristics, Auerbach Publications, 2007.
[12] Chern Maw-Sheng, On the computational complexity of reliability redundancy allocation in a series system, Operations Research Letters, 11:309-315, 1992.
INT J COMPUT COMMUN, ISSN 1841-9836
8(1):87-96, February, 2013.
Simulating the Need of Working Capital for Decision Making in
Investments
M. Nagy, V. Burca, C. Butaci, G. Bologa
Mariana Nagy
Aurel Vlaicu University of Arad
Romania, 310130, Bd. Revolutiei no.77, Arad
E-mail: [email protected]
Valentin Burca
SC Leoni Wiring Systems SRL Arad
E-mail: [email protected]
Casian Butaci, Gabriela Bologa
Agora University, Oradea, Romania
Piata Tineretului nr.8, 410526
E-mail: [email protected],
[email protected]
Abstract:
Simulation is one of the main instruments among the financial techniques for modeling decisions in conditions of risk. The paper compares several simulation methods for Sales and their impact on the need of short-term financing. For simulating the need of working capital, the original software implementation is based on the data analysis and statistical facilities of a common spreadsheet program. The case study aims at proving the utility of the software in furnishing results with three of the main known simulation methods and in helping the decisional process.
Keywords: investment cycle, working capital, stochastic models, computer simulation, case study.
1 The premises of the operative financing in conditions of risk
In the contemporary society, which deals with a host of unexpected events, knowledge-based management has to accept an uncontrollable component of the economic reality that needs corrective and, moreover, preventive actions. The managerial decisions taken in conditions of risk have to limit their effects to values complying with a tolerance set up in advance.
Managers have to be innovative and to find solutions that prevent the negative effects of unexpected events. In the context of relaunching the economy, in the new basic economic cycle, the main parameters have to be controlled in order to correctly assure the financial resources. As the usual forecasting methods are based on historical data, the decision maker takes into consideration the financial and time resources, the construction and validation of models for the behavior of the company in crisis conditions, and the particular type of activities within the company. Forecasting the financial resources, that is, a correct dimensioning of the working capital, is a pre-condition for fulfilling the company's short- or medium-term strategies [5].
Simulation of the company's behavior and of the needed working capital integrates the inputs and outputs of the company's budget. One of the main components of the budget is the Sales. Evaluating the historical data for the Sales, the time series can be divided into data belonging to the preceding cycle, up to the economic recession, and data registered during the crisis. From a statistical point of view, in the post-crisis economic cycle it is recommended to consider only the data preceding the cycle. In reality, such a simulation incurs errors due to neglecting the anti-crisis strategies and the already implemented corrective actions. A better approach is based on
the whole set of available historical data, which also includes the present phase of relatively weak economic growth.
2 Theoretical aspects and proper simulation instruments

2.1 Financial calculation flow
Simulating the short-term financing uses a set of relatively uncomplicated arithmetical calculations. These are based on the formula for determining the cash conversion cycle, obtained by deducting the average time for paying current debts from the sum of the average times for transforming the stocks and receivables into liquidity.
The need of operating working capital is calculated as the cash conversion cycle, measured in days, multiplied by the Sales, as resulting from the different simulation methods.
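As a tiny worked illustration of this flow (all figures invented, and using the common convention of multiplying the cycle by average daily sales):

# Cash conversion cycle (days) = stock days + receivable days - payable days
ccc_days = 45 + 30 - 25                # = 50 days
daily_sales = 4200.0 / 30              # hypothetical simulated monthly sales / 30
nowc = ccc_days * daily_sales          # need of operating working capital
print(round(nowc, 2))                  # -> 7000.0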
2.2 Simulation methods and instruments
For a good preview of the complex economic reality, the scenarios are built up on repeated simulations that reflect possible values for the monthly sales (x). As the literature presents many simulation methods, the decision maker has to choose the method for simulating the monthly sales that best fits his company [3].
The present paper deals with three modeling methods, along with a user-friendly computer implementation using built-in and user-defined spreadsheet functions:
• simulation by using the Random Number Generation tool;
• simulation by using the inverse of the normal cumulative distribution for a specified mean and standard deviation;
• simulation by the Monte Carlo method [6].
For the decision maker, the software implementation is almost fully automated, using in the background the advanced tools included in the Analysis ToolPak, a powerful add-in of MS Excel.
The technique of Random Number Generation
Random Number Generation is the most primitive simulation model. It consists of generating a set of random numbers based on the normal probability distribution of the simulated variable. The distribution function can be continuous or discrete, depending on the type of the available historical data.
The Random Number Generation tool in MS Excel is a complex tool that allows the user to generate a set of values according to a normal probability distribution, a user-defined histogram or a patterned distribution [2].
The technique of using the inverse of the normal distribution
The inverse transform technique considers for the simulated variable a probability function f(x) and a continuous distribution function F(x). A random number r ∈ [0, 1] is generated; the simulated variable takes the value that satisfies

F(x) = r,    (1)

that is, x = F^−1(r),    (2)

where F^−1(r) is the inverse of the distribution function F(x) of the considered variable.
The RAND() spreadsheet function is used for generating uniformly distributed positive numbers below unity. These values are then turned into a set of simulated values by using the NORMINV(probability, average, standard deviation) function, where probability is the randomly generated r, average refers to the average of the historical data and standard deviation is the measure of its variation.
The Monte Carlo technique
The Monte Carlo method is similar to statistical experiments, as the characteristics of the probability distribution are calculated on the basis of multiple random experiments. The method is different in that it is limited to a discrete probability distribution for the simulated variable

A_{m,n} = ( x1 · · · xi · · · xn ; p1 · · · pi · · · pn )    (3)

which depends on a continuous probability function f(x) and a continuous distribution function F(x). However, it also relies on a distribution of positive below-unit random values for the simulated variable x.
The simulation method consists of the following sequential steps:
• building up a histogram that reflects the probability distribution of the variable, based on the historical data;
• simulating, as many times as needed, the probability of occurrence of each value of the variable according to the histogram;
• identifying the value of the variable according to the simulated cumulative distribution [6].
The simulated probability values r are then transformed into values of the variable x satisfying x = F^−1(r), where F^−1(r) is the inverse of the distribution function F(x) of the considered variable.
When applying the method to the monthly sales, based on the simulated probability, a nested decision function is used:

IF(r < p1; x1; IF(r < p2; x2; . . . ; IF(r < pn−1; xn−1; xn) . . . ))    (4)
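The nested IF of (4) amounts to sampling from the cumulative histogram; a Python sketch with an invented histogram of monthly sales (here the pi are read as cumulative probabilities):

import random
from itertools import accumulate

def monte_carlo_sales(values, probs, n, seed=0):
    # Equation (4): pick the first value whose cumulative probability
    # exceeds the uniform random draw r.
    rng = random.Random(seed)
    cum = list(accumulate(probs))        # p1, p1+p2, ..., 1.0
    sample = []
    for _ in range(n):
        r = rng.random()
        sample.append(next(x for x, c in zip(values, cum) if r < c))
    return sample

print(monte_carlo_sales([3000, 4000, 5000, 6000], [0.2, 0.4, 0.3, 0.1], 5))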
The minimum number of iterations needed for obtaining relevant results with the Monte Carlo method is given by:

n > z²_{1−α/2} · σ²_Sales / d²    (5)

where σ_Sales is the standard deviation, z_{1−α/2} is the theoretical value for the α confidence level and d is the maximum admitted error for a chosen accuracy [3].
2.3 Comparative sensitivity analysis
The comparative analysis underlines some aspects of the utility of simulation procedures in the decisional process and the sensitivity of the results, depending on the method chosen for building up the sample of simulated values.
In the context of simulating the monthly sales, on the one hand it is important to calculate some basic indicators used in the decisional process - the forecasted need of operating working capital, the coefficient of variation and the confidence level of the forecast - and, on the other hand, statistical tests are needed for comparing the results obtained with the different techniques [1].
The need of working capital (NOWC) is the central indicator used for any analysis in conditions of risk, being calculated by weighting the Sales by the probability of occurrence of each value:

NOWC = Σ_{i=1}^{n} p_i · Sales_i    (6)

For measuring the homogeneity of the simulated time series, the coefficient of variation is calculated:

% = Δ / NOWC · 100,    (7)

where Δ = sqrt( Σ_{i=1}^{n} (Sales_i − NOWC)² · p_i )    (8)

is the standard deviation of the need of working capital, calculated on the basis of the simulated probability distribution.
The values of the forecasted operating working capital cover a confidence interval

NOWC − z_{1−α/2} · σ_NOWC < NOWC_forecast < NOWC + z_{1−α/2} · σ_NOWC    (9)

where z_{1−α/2} are the theoretical values of the Gauss-Laplace distribution [4].
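Equations (6)-(9) fit in a few lines of Python; the sketch below uses invented sales values and probabilities, with z = 1.96 for α = 5%:

import math

def nowc_stats(sales, probs, z=1.96):
    # NOWC (6), coefficient of variation (7)-(8), confidence interval (9)
    nowc = sum(p * s for p, s in zip(probs, sales))
    delta = math.sqrt(sum((s - nowc) ** 2 * p for p, s in zip(probs, sales)))
    cv = delta / nowc * 100
    return nowc, cv, (nowc - z * delta, nowc + z * delta)

print(nowc_stats([3000, 4000, 5000, 6000], [0.2, 0.4, 0.3, 0.1]))
# -> (4300.0, 20.93..., (2536.0, 6064.0))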
From a practical point of view, the last two methods are the important ones. The statistical tests aim at checking for significant differences between the values obtained by the inverse transform technique and by the Monte Carlo simulation. A z-test and a t-test are used.
For analyzing the impact of the simulation method, the z-test is applied to the two sets of results for the forecasted working capital. As the means of the samples are positive, the one-tailed test is performed, with α confidence level and the following null hypothesis:

H0: NOWC_inverse_transform − NOWC_Monte_Carlo = 0    (10)

The z statistic uses the normal value z_theoretic = z_{1−α}    (11)

and the alternative hypothesis is:

Ha: NOWC_inverse_transform > NOWC_Monte_Carlo    (12)

For a normal distribution of the sample, the statistic of the test is:

z_calculated = ( (NOWC_inverse_transform − NOWC_Monte_Carlo) − 0 ) / sqrt( (σ²_inverse_transform + σ²_Monte_Carlo) / n )    (13)

having the mean NOWC_inverse_transform − NOWC_Monte_Carlo    (14)

and the spread of the data about the mean given by σ²_inverse_transform + σ²_Monte_Carlo [4].
The software instrument used for describing the probability distribution of the simulated values is FREQUENCY(data_array; bin_array), where data_array is the array of previously simulated monthly sales and bin_array refers to the intervals considered for counting the occurrences of each value of the simulated time series.
The z-test is applied using the appropriate statistical instrument included in the Analysis ToolPak.
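Outside the spreadsheet, the z statistic (13) is a one-liner; the sketch below (our own, with the variances taken as known) compares two simulated samples of equal size:

import math

def z_two_sample(sample1, sample2, var1, var2):
    # z statistic (13) for comparing two sample means with known variances
    n = len(sample1)                      # both samples have n values
    mean1 = sum(sample1) / n
    mean2 = sum(sample2) / len(sample2)
    return (mean1 - mean2) / math.sqrt((var1 + var2) / n)

# Invented samples; reject H0 at alpha = 5% when z > 1.645 (one-tailed test)
print(z_two_sample([4200.0, 4300.0], [4100.0, 4150.0], 1.0e6, 1.0e6))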
3 Case study
Let us consider "Crisis Ltd", a company that makes available data from its balance sheet for the last financial year and financial documents for the years 2002-2011.
3.1 The financial diagnosis
The items in Figure 1 are given by the balance sheet of the company for 2011. The monthly
Sales of "Crisis Ltd" are presented in Figure 2 as data entry for the application. The chart in
Figure 3 represents the Sales, along with a linear and a polynomial approximation.
Figure 1: Financial data from the balance sheet for 2011
Figure 2: Sales for 2002 -2011
The descriptive statistics [4] in Figure 4 show that the company has been in recession since 2008; significant changes take place in the resources involved in the production and trade flows and in the need of operating working capital.
Considering the impact of uncertain elements on the Sales' evolution, the model explains the main tendency (82.20%), while the random factors are responsible for 16.43% of the Sales time series. The seasonality represents only 1.37% of the sales evolution. Applying an F-test / ANOVA on the monthly means (Figure 5), with a null hypothesis of equal means, leads to F_calculated = 0.749 < F_{0.95;11} = 1.887. The hypothesis of equal means is accepted and confirms the weak influence of the seasonality.
The modeling of the sales will be based either on 360 values representing monthly average sales, or on scenarios built on the probability distribution of sales.
The size of the sample is justified by the minimum number of iterations needed for obtaining relevant results with the Monte Carlo method. According to (5), for σ_Sales = 1036.72, α = 5%, z_{97.5%} = 1.96 and a 2.6% tolerance, d = 0.026 · 4234.71 = 110.10, so the relevant sample has n > 341 values.
Figure 3: Monthly Sales (chart)
Figure 4: Descriptive statistics for Sales (historic data)
Figure 5: ANOVA on monthly Sales
3.2 Results of the simulation with different techniques
The simulation spreadsheet is built on the presented theoretical basis, applying the probability distribution specific to each method. In order to emphasize the automatic calculation of the distributions and of the simulated values for the need of working capital, the same spreadsheet is presented in three different views.
Figure 6: The distribution based on the Random number generation
Figure 7: The distribution based on the inverse of the normal cumulative distribution
An intermediary step for implementing the Monte Carlo distribution is based on the probability distribution of the historical data, presented in the upper-right corner of Figure 6.
3.3 Comparative analysis of the results
A first visual comparison of the three distributions is presented in the histogram in Figure 9 and the chart in Figure 10.
Figure 8: The distribution based on Monte Carlo technique
The coefficient of variation for the three methods is rather similar and high, proving a low
homogeneity of the time series simulated by each method. The explanation is given by including
in the simulation models the period of crisis, when the sales decreased. The confidence level for
the three considered models is also similar.
Figure 9: Comparative histogram for the three considered distributions
As the results obtained by using the Random number generation and the inverse of the normal cumulative distribution are almost identical, further comparison will take into consideration only the last two methods: the inverse of the normal cumulative distribution and the Monte Carlo distribution.
As the maximum values for the need of working capital differ according to the model, it is obvious that the probability of fulfilling an optimistic scenario differs too.
The historic data is characterized by the means presented in Figure 11.
Applying z-Test: Two Sample for Means to the sample consisting of 360 values for each
method gives the results presented in Figure 12.
The variance shows the spread of the statistical data about the mean, being calculated from the historical data. Obviously, it presents the same value for both samples: σ = 10880.40 u.m.
Figure 10: Comparative chart for the NOWC calculated with the different methods
Figure 11: Average of historic data
Figure 12: z-Test for comparing the two methods
The z-test reveals a significant difference between the two samples, as |z_calculated| = 2186.038 > z_theoretic = 1.645. This can be explained by the fact that the results based on the inverse of the normal cumulative distribution are closer to the historical data than the results based on the Monte Carlo simulation. Moreover, it confirms the central limit theorem and recommends the use of continuous repartition functions rather than discrete ones.
4 Conclusions and further work
The three considered simulation methods generate well-balanced results for the need of working capital, with similar coefficients of variation. Regarding the means, a significant difference is registered between the mean of the values simulated with the inverse of the cumulative normal distribution and the Monte Carlo method.
In the simulation based on the inverse of the cumulative normal distribution, a discrete function, such as the Poisson repartition function, is recommended for converting the randomly generated numbers [7].
The spreadsheet can be further developed by fully automating the z-test in order to avoid
any intervention of the decision maker in the calculating process. However, the present software
implementation proves the utility of spreadsheet programs in decision making and offers a relevant
set of data for the need of working capital that can improve management in investments.
Bibliography
[1] Anghelache C., Vintila G., Dumbrava M., Aspects of statistics inference, Theoretical and Applied Economy (in Romanian), 505(10):41-44, 2006.
[2] Chiou J.R., Cheng L. and Wu H.W., The Determinants of Working Capital Management,
The Journal of American Academy of Business, 10(1):149-155, 2006.
[3] Hayajneh O.S., Ait Yassine F.L., The Impact of Working Capital Efficiency on Profitability
- an Empirical Analysis on Jordanian Manufacturing Firms, International Research Journal
of Finance and Economics, 66: 67-76, 2011.
[4] Hibiki N., Multi-period Stochastic Optimization Models for Dynamic Asset Allocation, Journal of Banking and Finance, 30(2):365-390, 2006.
[5] Lupse V., Dzitac I., Dzitac S., Manolescu A., Manolescu M.J., CRM Kernel-based Integrated
Information System for a SME: An Object-oriented Design, INT J COMPUT COMMUN,
ISSN 1841-9836, 3(S):375-380, 2008.
[6] Muntean C., The Monte Carlo Simulation Technique Applied in the Financial Market, Economy Informatics, 1-4:113-115, 2004.
[7] Shaskia G. Soekhoe, The Effects of Working Capital Management on the Profitability of Dutch
Listed Firms, January, 2012, http://essay.utwente.nl/61448/1/MSc_S_Soekhoe.pdf
INT J COMPUT COMMUN, ISSN 1841-9836
8(1):97-104, February, 2013.
Managing Information Technology Security in the Context of
Cyber Crime Trends
D. Neghina, E. Scarlat
Diana-Elena Neghina
Institute of Doctoral Studies - ASE
11, Tache Ionescu Street, Bucharest, Romania
E-mail: [email protected]
Emil Scarlat
Academy of Economic Studies
Department of Economic Cybernetics
6, Romana Square, Bucharest, Romania
E-mail: [email protected]
Abstract:
Cyber-attacks can significantly hurt an organization's IT environment, leading to serious operational disruptions, from simply damaging the first layers of IT security up to identity theft, data leakage and the breaking down of networks. Moreover, the dangers through which current cybercrime practices affect organizations tend to develop more rapidly than decision makers can assess them and find countermeasures. Because cyber threats are somewhat new and thus a critical source of risks,
within the context of the constantly changing IT environments (e.g. cloud services
integration) organizations may not effectively implement and manage cyber threat
risk assessment processes. This paper highlights the importance of designing effective security strategies and proactively addressing cybercrime issues as key elements
within the organizational risk management approaches.
Malware rises constantly in impact and complexity and has surpassed the traditional security model. One of the main ideas of the study is to present the main areas of cyber security risk to which an organization is subject and to provide the baseline of an analysis model that would adequately evaluate input data, rank priorities and represent the results and solutions to decrease these risks. The importance of this study lies in increasing awareness efforts and in highlighting the critical importance of using the full extent of the resources provided. Each member of an organization has a significant role in decreasing the exposure to the vulnerabilities created by cyber-attacks.
Keywords: Cybercrime, IT security, risk assessment, vulnerability management.
1 Introduction
The dangers through which current cybercrime practices affect organizations tend to develop more rapidly than decision makers can assess them and find countermeasures, exposing entities to significant risks. Vulnerability events are created constantly, through the exposure over shared environments (such as cloud solutions), permanent transfers of critical information between an organization's branches, social platforms, electronic banking solutions and the display of intellectual property, all enabling favorable circumstances for the appropriation, disturbance and misuse of critical information and data.
The purpose of this paper is to make entities aware of the changes of the last decades in cyber threats within network environments and of how to effectively approach these elements, finding in time the solutions and actions to be taken. One of the main ideas of the study is to present the principal areas of cyber security risk to which an organization is subject in the current IT environments and to determine a specific analysis model that would adequately lead
to input data evaluations, rank priorities and represent the results and solutions to mitigate the
identified risks.
On the current market, the tendency of decision makers is still to take unintended or unexpected risks by following classical patterns of behavior and standard models of security, applied
additionally in new environments that are continuously changing, often with the consequence of
significantly affecting organizational value.
Our opinion is that the increasing number of attacks and the constantly developing threats are leading to significant gains that further support the constant adaptation of cyber criminals to the classical implementation and management of security tools. We consider that our paper raises an alarm signal on the lack of perception of the fact that traditional models are starting to be considered outdated; decision makers must evaluate the need of combining them with additional elements from the operational areas to obtain a better understanding. The real issue is the fact that professional cybercrime tools and methods are so advanced that they are continuously aggravating the security problem.
Management's ability to forecast the security breaches considered critical for the compliance of a company with its strategies is essential in using corporate information as an asset, as well as a significant competitive advantage. This further sustains the capability of a close collaboration between IT resources and operational ones. The technical understanding of events and processes must be combined with the business perception of processes and activities for entities to adjust more quickly to challenges and take immediate action on unexpected events.
Even though IT governance and compliance methods, identity management, application security and network tools are becoming more and more complex and are processing information from different areas of an organizational environment, cyber criminals are also aware of the changes and are adjusting their techniques, becoming better at hiding their trace and gaining undetected access for extended durations of time; studies present the fact that they are characterized by stealthy and passive methods of attacking a system (Schudel, Wood) [1].
Entities must better determine the key elements that make them a target for cyber criminals, as well as the way they are perceived by external parties regarding critical information of interest, and act on these vulnerabilities first, these being considered the initial layers of attack.
All security analysis and prevention methods should begin from the idea of unauthorized
access being gained and core business data being misused, thus leading to appropriate security
measures of classification of critical information (high to low risk assessments). As a general
overview of IT reviews, there are not many entities that are implementing levels of security and
classification of data based on values or risk considerations.
As a summary conclusion, the actual warning is the tendency of cybercrime attacks to continuously become more dangerous and complex, evolving into schemes difficult to anticipate. The attacks are determined to be more destructive, more advanced and more studied (significant resources are invested in these activities, and new capabilities are researched based on existing security tools) and have a serious impact on economic and even national security elements.
2 Cyber-Crime Trends Analysis
The current approaches of cyber criminals may be characterized as proactive. Further, the attackers' actions are dynamic through their frequency and, most aggravating, collaborative, in the sense that services emerge over the Internet with the scope of committing fraud, theft and the exploitation of system vulnerabilities.
2.1 Brief development of cyber crime
As a summary of cyber-attack tendencies, during the first periods of the modern Internet and IT environments cyber-attacks were generally performed by employees, within organizational networks, generally due to different dissatisfaction reasons. This type of threat was favored until the 1980s, when attackers took advantage of granted access privileges to IT resources; they altered information mainly for financial advantages or simply sabotaged data in revenge at employers.
Studies revealed that programmers developed their abilities of writing malicious software,
including self-replicating programs, to interfere with personal computers. During the 1990s,
financial crime over the Internet completed through penetration and subversion of computer
systems increased significantly. By the late 1990s and in the years following 2000, fraud attacks
and identity theft developed. Moreover, organized attacks started to be performed more often
including groups of cyber criminals and increasing the time of acting out without any detection
(Kabay) [3].
Throughout the 2000s, cyber attackers started to externalize services and offer their unauthorized activities for more complex crimes. Many denial-of-service attacks against well-known websites were identified, and malware was designed to record keystroke logs and then send this information through secure Internet communication channels to cyber criminals.
2.2 Trends and security challenges
Thus, business data and organizational assets are increasingly threatened, and traditional IT security approaches offer only the basis of solutions. As a general characteristic, the present IT environment can be considered reactive, without any time allocated for risk assessments; cyber criminals are aware of all these exposures and take advantage of them through the use of end users (social engineering, theft of credentials), phishing attacks, and all sorts of original deceptions, penetration and encryption techniques that make their trace inaccessible.
As mentioned previously, security events are continuously increasing, in frequency as well as in impact. However, at the same time, quantitative information related to these events is both difficult to obtain and hard to place into a meaningful framework (Shimeall, Williams) [2].
On a general level, organizations should transition from a mainly security-based approach to a more risk-assessment-oriented approach, thus addressing vulnerabilities within the risk management planning and methods. Entities should continuously improve their security awareness procedures, apply active monitoring procedures and complete periodic trainings for all operational and technical personnel to achieve an effective cyber security stance.
Operational and information technology solutions are shifting towards collaboration environments. IT environments sustain shared resources and services, including core business applications and centralized architecture and infrastructure elements, for which entities must put in place controls to identify and effectively counteract unauthorized actions. Thus, techniques must be put in place that set the baseline for risk control and business efficiency, outlining the key issues and introducing risk assessments of specific elements regarding enterprise information protection and personal privacy.
However, the vast majority of organizations have restricted abilities to identify and take effective actions against security breaches. Research shows that preventive actions are extremely hard to implement. Moreover, most of the current analysis tools are based on prior behavior and activities and may not produce significant models to appropriately identify other attacks or to be used for efficient preventive actions (Jajodia et al.) [4]. Vulnerabilities are thus analyzed constantly based on prior security events, and not on the emerging cyber threats that would add value to the security outline of the organization.
However, in practice, the defined security assessments performed by organizations are no longer capable of coping with the continuously evolving threats. They are not focused on collecting cyber threat data from a variety of business and operational areas and different sources, due to the increased time allocation and the costs of the implied resources. The applied security methods are based mainly on high-level information, and the implemented tools and technologies are generally configured based on existing standards, not customized to the features of the environments, which would give them the ability to promptly identify, enclose and restrict, evaluate and restore compromised characteristics.
Given the difficulty of obtaining and updating cyber intelligence information, risk management methodologies should be implemented. These must be appropriately designed and used to effectively challenge or even block access attempts, security breaches or fraudulent transactions. In the following section we present an approach distinguished through a set of specific tools and techniques that can be easily customized and applied to specific organization profiles and networks, and that can significantly improve the security controls already implemented by organizations.
3 Cyber Threats Risk Management Capability
The risks characteristic of an organization and its related industry shape the operational environment, its readiness and its effective response to different interactions with internal and external environments. The general characteristic of the current organization is its increasing reliance on technology, information sharing and connectivity elements. This dependence leads to risks at all levels.
Due to the fact that cyber threats are a relatively new, constant and continuously changing source of risk, entities are not as capable of managing cyber threat risk as they are of managing any other operational risk related to business activities.
3.1 The fundamentals of a cyber-threat risk assessment process
Unless an organization is considerably developed in its cyber threat risk management practices, it cannot have the risk assessment infrastructure and governance elements designed to sustain an adequate security environment. This is the case, for example, if basic elements are not defined, such as specific risk definitions and business impact analyses, risk acceptance limits or specific key performance indicators.
If an enterprise cannot sustain at least the above-mentioned elements, it is advisable, as a starting point, to evaluate the following set of information security practices that are significant for an appropriate cyber risk assessment process:
1. Existing security controls, implemented by the entity to identify and record known types of cyber-attacks that are characterized by stealth breaches. Included here are also the security tools and techniques used to identify and contain compromised IT resources in time;
2. Available methods of recording security breach information from multiple sources (internal as well as external). Added to this category are the abilities of the entity to implement cyber-crime risk models in order to collect relevant cyber intelligence information and generate valuable and actionable data for decision-making purposes;
3. The exposure of employees to complex social engineering attacks that allow malware to be integrated into administrative consoles or workstations. Here are included the procedures for detecting advanced, persistent threats within the entity's own business environments in the case of identity theft or unauthorized use of authentication elements.
These are not topics for an elaborate analysis process, but they do represent the basic elements of an effective defense mechanism against current cyber-attacks. By applying a more elaborate cyber threat risk assessment framework (presented in Figure 1 and further detailed in the following section), an organization can better protect its operational environments as well as gain valuable insights into its vulnerabilities and the improvement actions to be taken.
Figure 1: Model score after boosting
3.2 Core cyber threats risk management capabilities
Below we present a new approach to the cyber threat risks that, if unaddressed, may lead to the security vulnerabilities exploited first by attackers.
1. Monitoring and key performance indicators:
Starting from the basic incident-driven communication with management, periodical and formalized communication with decision makers should be implemented. Evidence should be kept of an ongoing dialogue, and critical metrics should be reviewed, analyzed and used to improve the already existing security controls.
Key performance indicators should be designed based on cyber intelligence information that must be logged for predefined periods of time. These must be standardized across the organizational units and have a clear linkage to business value (generally they should be defined in quantitative terms).
2. Organizational personnel:
Employees must be aware of cyber threat risk, recognize attacks as a potential risk area and
have basic knowledge of designed security policies, processes, and implemented tools; roles and
responsibilities for risk management should be established that would ensure the integration of
specific risk assessments into larger security decisions and controls affecting an organization.
IT personnel must have specialized knowledge about cyber threat risk; moreover, as a required competency, security feeds from operational units must be centrally coordinated to manage and keep cyber threat risks within defined acceptance levels.
Written and approved IT security policies, training missions and communications must be effectively distributed, and compliance must be periodically monitored and reviewed for appropriate enforcement; most employees should have clearly defined responsibilities for cyber risk management, appropriate to their roles.
3. Operational processes:
Organizational business units are characterized by fragmented processes, which do not always communicate with one another and involve a certain amount of manual input and execution of operational activities. The defined processes must align with the enterprise-wide risk management framework mentioned above and must be monitored by IT personnel and executive management at different levels; processes must reach the level of being consistently integrated, automated and clearly documented with respect to cyber threat risk assessments.
Organizations must formally measure and monitor process effectiveness; automation must be viewed as an objective; cyber threat risk management may be organized at a higher level as a self-standing unit through which all processes are addressed by continuous improvement efforts; the overall objective is to accomplish structured cyber threat risk management programs that are integrated with the already existing IT risk management and enterprise risk management agendas.
4. Security tools and techniques:
The technology already installed for security reasons must be enabled to log security events, to centralize them and, on this basis, to send alerts in case of recorded incidents or exceptions; signature-based controls such as anti-virus and intrusion-detection software must be implemented. For an evolved organizational environment, forensic tools may be used for reacting
to exceptional security events.
Commercially available threat monitoring feeds may be integrated with centralized logging solutions and monitoring software to generate automated alerts. For more complex risk management, existing security tools may be automatically enabled to perform advanced correlations of threat information and to convert the obtained data into actionable alerts. These methods may be used to automate not just threat monitoring and alerting, but also the handling of further identified security events, such as malware, complete forensic analysis and more complex threat assessments.
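As a rough illustration of such a correlation step (a sketch only, not tied to any specific commercial feed or product), centralized log events can be matched against a hypothetical set of indicators and the matches converted into alerts:

```python
from dataclasses import dataclass

@dataclass
class LogEvent:
    source: str    # host that produced the log entry
    dest_ip: str   # destination contacted by that host
    user: str      # account involved in the event

# Hypothetical indicators of compromise taken from a threat monitoring feed
# (documentation-range IP addresses, used here purely as placeholders).
THREAT_FEED = {"203.0.113.7", "198.51.100.23"}

def correlate(events):
    """Yield an alert for every event whose destination appears in the feed."""
    for ev in events:
        if ev.dest_ip in THREAT_FEED:
            yield f"ALERT: {ev.user}@{ev.source} contacted known bad host {ev.dest_ip}"

events = [LogEvent("workstation-12", "203.0.113.7", "jdoe"),
          LogEvent("mail-gw", "192.0.2.10", "system")]
for alert in correlate(events):
    print(alert)
```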
The presented framework of a continuously evolving cyber threat risk management capability model may sustain an organization in the process of implementing better security techniques against the cyber-attacks of criminals that make it past the already existing access control mechanisms.
For example, a well-developed cyber threat risk management model will include safety elements against unauthorized information distribution, as well as protection against unauthorized information access. The effective completion of these features leads to the use of technologies and processes that monitor outbound information traffic for both content and, in particular, destination. This can be configured as an alert point if data is being transferred to a location outside the normal operational environment, where the organization has not been present before. An accomplished capability will also be able to contain the transfer of information in time, to isolate the elements involved and to assess the suspicious communication networks until their authentication credentials are cleared; in case of unauthorized actions, procedures are in place for forensic analysis.
Because cyber-attackers are continuously improving their identity theft techniques, an organization shouldn't assume that each user who authenticates at the network level and within the IT application systems, performing the required activities with legitimate credentials, is in reality a legitimate user, an employee or an agreed service provider of the entity.
A complex cyber threat risk management process will be designed to use at least two verification methods, depending on the classification of the information being protected. These activities include the verification of a person's physical identity through techniques including biometrics such as laptop fingerprint readers, PIN code token devices that must be carried by the legitimate users at all times, and finally behavioral programs that track post-login activity against the historical patterns defined for a specific user in order to determine the likelihood that the person authenticating is legitimate.
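A toy sketch of such a behavioral program is given below; the tracked features (login hour, outbound traffic) and the threshold are illustrative assumptions, not a description of any particular product:

```python
# Historical pattern stored per user; the attributes are assumed for the sketch.
HISTORY = {"jdoe": {"login_hour": 9, "avg_bytes_out": 2_000_000}}

def anomaly_score(user, login_hour, bytes_out):
    """Return a rough deviation score; higher means less like the user's history."""
    h = HISTORY.get(user)
    if h is None:
        return float("inf")  # no history at all: treat as maximally suspicious
    hour_dev = abs(login_hour - h["login_hour"]) / 12
    traffic_dev = abs(bytes_out - h["avg_bytes_out"]) / max(h["avg_bytes_out"], 1)
    return hour_dev + traffic_dev

score = anomaly_score("jdoe", login_hour=3, bytes_out=40_000_000)
if score > 1.0:  # threshold chosen arbitrarily for the sketch
    print("post-login activity deviates from history: require a second factor")
```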
4 Discussions
Any cyber-attack can hurt an organization in any number of ways, ranging from minor damage to an informative website page to shutting down core networks, committing fraud and stealing intellectual property. Thus, organizations should implement actionable, risk-based intelligence processes in order to identify unauthorized cyber activity in time. Entities should maximize the use of existing security solutions and logged information and, most importantly, complete cyber threat awareness programs among employees. As a summary process, an organization should effectively identify the cyber-criminal risk, assess and evaluate this risk, integrate it, respond to it and take isolation actions, design, implement and test appropriate security controls, monitor further events, and escalate future cyber-attacks.
As mentioned in prior studies, cyber-crime events will become more precise and more specialized, and for this reason organizations must integrate dedicated cyber threat analysis tools and IT experts. As a general idea, an organization's security resources will need to focus more on analyzing the available internal and external data sources and customizing controls, and less on managing and maintaining standard security controls.
5 Conclusions
Entities are responsible for implementing and maintaining an integrated approach between their employees, operational processes and the implemented technology resources in order to complete effective risk management procedures. Resources must be allocated to gather and process cyber threat analysis information, notifying the results and defining alerts for better security controls and measures to be taken by the operational units.
Complex cyber risk management processes are repeatable, clearly defined, well-documented and aligned with an organization's larger IT risk management. Future work will focus upon cyber intelligence collection methods and processing algorithms and upon the behavioral trends of cyber attackers, which could accommodate customized improvements to the risk management activity of an organization. Research into cyber security capabilities that transform raw data into actionable intelligence will provide valuable cyber threat insight. Hence, such an analysis would support improvements in multiple key threat indicators and metrics related to IT security analysis.
Bibliography
[1] Gregg Schudel, Bradley Wood, Modeling Behavior of the Cyber-Terrorist, in http://www.dli.gov.in/data/HACKING_INFORMATION/PRINTED%20PAPERS/Modeling%20Behavior%20of%20cyber%20terrorist.pdf.
[2] Tim Shimeall, Phil Williams, Models of Information Security Trend Analysis, in http://www.dli.gov.in/data/HACKING_INFORMATION/PRINTED%20PAPERS/models%20for%20inf%20security%20TREND%20ANALYSIS.pdf.
[3] M. E. Kabay, A Brief History of Computer Crime, in http://www.mekabay.com/overviews/history.pdf.
[4] Sushil Jajodia, Peng Liu, Vipin Swarup, Cliff Wang, Editors, Cyber situational awareness:
Issues and Research, in Springer International Series on ADVANCES IN INFORMATION
SECURITY.
[5] Sumit Ghosh, Elliot Turrini, Editors, Cybercrimes: A Multidisciplinary Analysis, Springer-Verlag Berlin Heidelberg, 2010.
A Multidisciplinary Analysis, in
[6] Martin C. Libicki, Cyberdeterrence and Cyberwar, Rand Corporation, 2009.
[7] Jean-Marc Seigneur, Adam Slagell, Collaborative Computer Security and Trust Management,
in Information Science Reference (an imprint of IGI Global), 2010.
INT J COMPUT COMMUN, ISSN 1841-9836
8(1):105-110, February, 2013.
Flexible GPS/GPRS based System for Parameters Monitoring in
the District Heating System
A. Peulic, S. Dragicevic, Z. Jovanovic, R. Krneta
Aleksandar Peulic, Snezana Dragicevic,
Zeljko Jovanovic, Radojka Krneta
University of Kragujevac, Technical faculty
Serbia, 32000 Cacak,
E-mail: apeulic, snezad, zeljko.jovanovic,
[email protected]
Abstract:
Energy consumption for heating purposes accounts for a significant part of the budgets of individual and collective users. This increases the importance of issues related
to the monitoring of heating energy flows, analysis of flow parameters, verification
of fees and, in the first place, minimization of energy consumption. The goal of this
paper is to develop, by employing Global Positioning System receivers, measurement
techniques that are suited to the continuous monitoring of the heating substation parameters. This paper presents the design and implementation of GPS/GPRS (Global
Positioning System/General Packet Radio Service) system for low power data acquisition using MSP430 Texas Instruments microcontroller for monitoring of the heating
substation parameters. The system is implemented in heating stations for temperature and pressure monitoring. It contains a GPS/GPRS gateway and 8 analog sensor inputs. The acquisition module and the server base station are suitable for industrial applications, home applications and other appliances. The proposed measurement
procedures, which are different from commercially available measurement units, are
based on general-purpose acquisition hardware and processing software, thus guaranteeing the possibility of being easily reconfigured and reprogrammed according to
the specific requirements of different possible fields of application and to their future
developments.
Keywords: Distributed measurement systems, GPS/GPRS, computer data acquisition, low power microcontroller.
1 Introduction
An adequate control of all relevant parameters of the heating process is one of the most
significant means of power consumption optimization. District heating substations, as a link
between the hot-water network and the internal heating installations in buildings, are used to adapt high-pressure hot water to the temperature and pressure conditions required by the space heating systems of buildings, as well as by the systems for the preparation of hot service water in buildings.
To control the energy transfer in the district heating substation, some kind of control system is
needed. The overall efficiency of district heating could clearly be improved by using new strategies
for measurement and control. To maximize energy efficiency in the district heating network it is essential to have a large temperature drop across the substation between the supply and return pipes in the distribution network. A larger temperature drop will allow more customers to be connected to the available district heating networks without increasing the production power. An efficient system will reduce the amount of wasted energy while maintaining comfort, and indirectly reduce the CO2 emissions caused by heating, which account for 30% of the world's current CO2 emissions. Very rough estimates show savings of more than 1 million per year when increasing the temperature drop across the substation by 5 °C in a 760 GWh district heating system. Today, most substation control systems focus on indoor comfort and do not generally consider the temperature
drop across the substation, since it is not measured by the control system. The rapid progress in
microprocessor and communication technologies over the last ten years or so has provided great
potential for innovative applications in the field of protection and substation control [1]. There
are a number of applications of different strategies of monitoring and control of district heating
system components: a new control and communication architecture based on WSN and SOA for
district heating substations is developed in [2], [3]. Reference [4] addresses the issue of integrating intelligent electronic device (IED) data recorded by different IED types and focuses on how to facilitate the use of the integrated data. The water temperature control of a district heating
substation using soft computing methods, based on fuzzy logic, is presented in [5]. Fuzzy logic
control is implemented and the good performance of the fuzzy control proves that this can be an
alternative to the classic control. The control and monitoring system for the heat distribution
network, with a multi-layer structure that integrates several state-of-the-art technologies and standards applied in modern industrial automatics, is presented in [6]. The applied control system and supervisory control algorithms have resulted in power savings. Reference [7] proposes
an alternative approach to the problem of district heating monitoring parameters selection. The
wireless technology, compared with non-wireless technology, has some important benefits: for example, it reduces the system cost and eases installation and maintenance. Some of the most popular low-power wireless sensor networks, such as ZigBee and Bluetooth, in which the distance between the sensors and the base station is limited to about 1500 m, are presented in [8], [9]. This paper presents a GPS/GPRS based wireless acquisition system. The Global Positioning System (GPS), which is a
satellite based system, is the main synchronizing source that is used to provide a time reference
on the communication networks, and its widespread availability makes it possible to obtain,
at each point of the tested system, a clock signal that is synchronized with the one generated
in other remote places. Currently, GPS is the only satellite system with sufficient availability
and accuracy for most distributed monitoring and control applications in distribution systems.
Alternatives will eventually become available, with GALILEO being the most promising at this
time [10].
2 The remote acquisition system
The hardware basis of the GPRS-based system for data acquisition (GPRSuC) from remote locations consists of a low-power MSP430F147 microcontroller and a Telit GM-862 GPRS/GSM/GPS module.
They communicate with each other in the process of collecting and sending data to the remote server. The main objective of the microcontroller is to sample data from the eight multiplexed analog inputs and to form data blocks with a time stamp from the GPS sentences. The Telit module is used to send those data blocks to the server base over the GPRS system. Besides the standard functions of the devices used in M2M (Machine to Machine) communication, this module has a GPS receiver, as well as a dedicated GPS port on which the data obtained from the GPS are presented in NMEA (National Marine Electronics Association) format. The system's operating range is from -10 °C to +55 °C, which can be a potential problem if the GPRSuC is used in environments with very low temperatures. The GSM modem is made in such a way that the RF transmission is not continuous; instead, it is packed into bursts at a base frequency of about 216 Hz. The firmware is written in C, and its structure is shown in Fig. 1. The Telit module acts as a slave, carrying out the commands sent by the microcontroller.
Figure 1: Flow chart of the firmware

Since the platform is designed to be suitable for monitoring a mobile location, GPS accuracy is very important. After the initialization, the moving-system variable needs to be set; the complete algorithm depends on the value of this variable. After checking the network status, the system tries to catch the GPS signal (see Fig. 1). When there is no GPS signal, the Telit module returns NULL in the latitude and longitude fields. This routine repeats until it is established that the GPS signal is captured. The last good GPS position is saved and used in case the GPS signal is lost due to the impact of various barriers, which is especially important when the system is moving. If the system is not moving, then the saved GPS position is used. The time is extracted from the GPS sentence in a short routine and is used even when the system is not moving.
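For readability, the sketch below reproduces this logic in Python (the actual firmware is written in C); the $GPRMC field layout is standard NMEA, while the example sentence itself is illustrative, not captured from the device:

```python
last_good_position = None

def parse_gprmc(sentence):
    """Extract (time, latitude, longitude) from a $GPRMC NMEA sentence,
    or return None when there is no valid fix (status field 'V')."""
    f = sentence.split(",")
    if f[0] != "$GPRMC" or f[2] != "A":    # 'A' = valid fix, 'V' = void
        return None
    return f[1], f[3] + f[4], f[5] + f[6]  # hhmmss, lat + N/S, lon + E/W

def current_position(sentence):
    """Keep the last good position and fall back to it when the fix is lost."""
    global last_good_position
    fix = parse_gprmc(sentence)
    if fix is not None:
        last_good_position = fix[1:]
    return last_good_position

print(current_position("$GPRMC,123519,A,4415.03,N,02001.85,E,0.0,0.0,180212,,"))
```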
3 Communication resources used for remote acquisition
The General Packet Radio Service standard allows data transfer in a completely different way from the Circuit Switched Data (CSD) type of transmission. In CSD, data is transmitted by establishing a direct connection with another, remote modem, so all devices in between are used to simulate a physical connection between the endpoints (point-to-point connection). Besides the obvious disadvantage in terms of low utilization of network resources, there are also problems of long delays in establishing a connection and high fees for using network resources, based on the time period of the established link rather than on the amount of data, as in GPRS [11]. One message from the remote acquisition system consists of eight measured parameters, a time stamp and an identification field, and is nearly 110 bytes long, so it is appropriate to use the GPRS system. Mobile operators provide a fixed IP address service, and it is possible to achieve communication in both directions with changes in the overall software of the system. In the realization of the GPRSuC prototype this service is not used, in order to make the remote acquisition platform as cheap as possible.
As far as the practical part of this project is concerned, its main goal is to present the collected data in the way explained in the previous section. The entire software solution has been realized using the open-source J2EE technology. One of the reasons why J2EE was selected is the possibility of extending the system to mobile phones through J2ME applications. The completed software solution can be roughly divided into two parts, which exchange data with the hardware and present results to the client. The part of the software used to exchange data with the hardware is the interface with the hardware support. This part of the software has no visual interpretation and is executed only when an HTTP request is passed to the Servlet by a device that forwards data. It has been developed as a Java Servlet, which the hardware calls using an HTTP GET request. The Servlet processes the received data and saves it in the MySQL database. The server confirms successful reception of the data and is able to send back parameters that correct the way the device works.
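A client-side sketch of this exchange is given below; the Servlet URL and the parameter names are hypothetical placeholders (the paper does not list them), and the real device issues the equivalent GET request from its C firmware:

```python
from urllib.parse import urlencode
from urllib.request import urlopen

SERVLET_URL = "http://example.org/gprsuc/upload"   # placeholder address

def send_measurement(station_id, timestamp, values):
    """Send eight measured parameters, a time stamp and an id (about 110 bytes)."""
    assert len(values) == 8
    params = {"id": station_id, "ts": timestamp}
    params.update({f"p{i}": v for i, v in enumerate(values)})
    # The Servlet saves the parameters to MySQL and confirms the reception.
    with urlopen(SERVLET_URL + "?" + urlencode(params)) as resp:
        return resp.status == 200

# Example call (commented out, as it requires a reachable server):
# send_measurement("station-01", "123519", [21.4, 21.9, 3.1, 3.0, 60.2, 58.7, 0.0, 1.0])
```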
Since the Servlet can be accessed through HTTP requests from anywhere in the world, this way of communication gives the project a global reach: it can be used in different cities, countries and even continents. Besides its great advantages, this choice of communication also has a drawback: it opens the project to attacks by parties who might try to emulate the values of the passed parameters. Apache Tomcat version 6.0.16 is used as the container. The project is implemented as a Web application located on the server in the laboratory. It provides current monitoring of several parameters from one station and displays their values in real time. Since this system has a monitoring role, for easier viewing of the parameter values the marker may be green, yellow or red, depending on whether the values are within or outside the critical range. In addition to transferring data, the hardware also processes the data and sends its status as a parameter. Processing is not done on the server for the simple reason that, for n stations monitoring m parameters, this would represent an (m x n) processing load every few seconds.
It is very important that every measured value is stored in a MySQL database so that it can be consulted whenever needed; this also makes further analysis of the data very easy. Since the system sends GPS coordinates, GIS support is used for better data presentation. This is especially important for moving systems; a GIS presentation is also interesting when there is more than one tracked system, as their positions are easy to monitor. The Google Maps API is used as the geographical support in the project. It provides great opportunities thanks to its coverage of the entire globe with high-resolution satellite and aerial photography. The principle of working with the Google Maps API is that the complete GIS system resides on Google's server: the user passes the coordinates and display parameters to the corresponding server, which replies by sending the required graphic content. Google Maps usage is free of charge; only a Google key is needed. Google supplies the key, and a Gmail account is required; in order to get the key, the URL of the Web server on which Google Maps will be used must be entered. A mouse click on the marker surface displays the current parameter values for the selected system (see Fig. 2).
Figure 2: Multiple systems monitoring with wanted system data presentation
If live monitoring of one system is needed, the web application offers this functionality (see Fig. 3); no user activity is required, since AJAX refreshes the web page after any data value change. Alarm values are shown with a red background color. There is also the possibility of historical observation of the measured parameter values, with a table or chart view.
Figure 3: Live temperature and pressure monitoring in laboratory area
4 Conclusions and Future Works
In this paper, a flexible measurement system for parameter monitoring in a district heating system is proposed. This system is able to react rapidly to any incidental parameter changes and to alert the responsible server in the base station. The GSM network is widespread, reliable and cheap. The GPRSuC system represents an attempt to integrate a low-power microcontroller, a Telit module and server applications into a distributed system for data acquisition and the monitoring of remote measurement sites. The flexibility of the system arises from the use of general-purpose acquisition hardware, which allows the system to be easily upgraded and/or reconfigured according to the specific measurement needs existing and evolving in modern district heating systems. The communication of the system is completely wireless, easily operable and low power. To maximize the energy efficiency in the district heating network, it is essential to have a large temperature drop across the substation between the supply and return pipes in the distribution network, and the proposed system also has economic reasons for implementation. The proposed measurement system could be further improved simply by using more sophisticated acquisition hardware.
Bibliography
[1] P. Bornard, Power system protection and substation control: trends, opportunities and problems,International Journal of Electrical Power and Energy Systems, 10(2):101-109, 1998.
[2] J. Gustafsson, Distributed Wireless Control Strategies for District Heating Substations, Licentiate thesis, Dept. of Comp. Sci. and Elect. Eng., Lulea University of Technology, Lulea,
Sweden, 2009.
[3] J. V. Deventer, J. Gustafsson, J. Delsing, J. Eliasson, Wireless Infrastructure in a District
Heating Substation,IEEE Int. Sys. Conf., pp. 139-143, Vancouver, Canada, 2009.
[4] M. Kezunovic, T. Popovic, Substation Data Integration for Automated Data Analysis Systems,IEEE PES General Meeting, pp. 1-6, Tampa, Florida, 2007.
[5] L. Mastacan, I. Olah, C. C. Dosoftei, District Heating Substations Water Temperature Control Based on Soft Computing Technology,6th Int. Conf. on Electromechanical and Power
Systems, pp. 172-175, Rep.Moldova, 2007.
[6] W. Grega, K. Kolek, Monitoring and Control of Heat Distribution,Int. Carpathian Control
Conf. ICCC, pp.439-444, Malenovice, Czech Republic, 2002.
[7] P. Malinowski, P. Ziembicki, Analysis of District Heating Network Monitoring by Neural
Networks Classification,Journal of Civil Engineering and Management, 12(1):21-28, 2006.
[8] J. A. Gutierrez, M. Naeve, E. Callaway, M. Bourgeois, V. Mitter, B.Heile, IEEE 802.15.4: a
developing standard for low-power low-cost wireless personal area networks,IEEE Network,
15(5):12-19, 2002.
[9] A. Z. Alkar, An internet based wireless home automation system for multifunctional devices,
IEEE Trans. Consumer Electronics, 51(4):1169-1174, 2005.
[10] IEEE Standard for Synchrophasors for Power Systems, IEEE Std. C37.118-2005 (Revision
of IEEE Std. 1344-1995), 2006.
[11] A. Alheraish, Design and implementation of home automation system, IEEE Trans. Consumer Electronics, 50(4):1087- 1092, 2004.
INT J COMPUT COMMUN, ISSN 1841-9836
8(1):111-126, February, 2013.
Radio Resource Adaptive Adjustment in Future Wireless
Systems Based on Application Demands
E. Puschita, T. Palade, R. Colda, I. Vermesan, A. Moldovan
Emanuel Puschita
Tudor Palade, Rebeca Colda
Irina Vermesan, Ancuta Moldovan
Dept. of Communications, Technical University of Cluj-Napoca,
28 Memorandumului Str., Cluj-Napoca, Romania
[email protected]
[email protected], [email protected]
[email protected], [email protected]
Abstract:
In wireless communication systems the resource management needs to integrate techniques adaptive to the varying network conditions, due to the eventual dramatic changes that may occur in the link quality. Therefore, it may be desirable to support adaptable resource management techniques that are able to base their decisions on the network configuration information or on the source application description. Accordingly, the paper identifies, explores and proposes adaptive techniques for resource
management so as to enhance the transmission quality on wireless systems either
through a feedback channel or by making use of the network virtualization concept.
Setting up dependencies between the application requests and the radio channel conditions, a feedback loop adaptively configures modulation and coding schemes, calibrates multi-antenna system, controls power per beam allocation or invokes a linear
precoding. Finally, when the application requests exceed the network capacity, by the
network virtualization process the adaptive potential of the application parameters
can be employed, either through source fragmentation or source code adaptation.
Keywords: Resource management, adaptive techniques, feedback loop, network virtualization.
1 Introduction
Since the radio resources are limited and the demand for more and more complex wireless
services is increasing, adaptive techniques are considered as powerful means for improving the link
performance of future wireless networks [1]. The proper delivery of a certain application is usually
conditioned by a set of source application’s requested parameters, for which some minimum
requirements are imposed. By providing an adaptive controlling of the system resources, both
from the network’s and application’s perspective, a better radio resource management can be
achieved, as well as an improved transmission quality adapted to the varying channel conditions.
The goal of the paper is to analyze and highlight the adaptive potential of different techniques:
either at the physical layer (PHY), based on a feedback loop, or at the application layer (APP),
based on network virtualization. A similar approach can also be found in [2], where the effect
of some parameters such as: number of transmitting nodes, packet length, modulation scheme
and mobile nodes speed was investigated. While in [2] the improvements obtained by applying each of these adaptive techniques were viewed separately, only at a block level, the current paper goes beyond and targets a more unified approach by analyzing the impact of applying these techniques at a system level. Also, this idea of adaptability, started in [2], is continued with the most recent standards and concepts in use: IEEE 802.16e Wireless Metropolitan Area Networks
(WMAN), IEEE 802.11n Wireless Local Area Networks (WLAN) and the network virtualization
process. Other adaptive techniques approaches, at block level, were developed in [3] where it
is shown that, in order to meet the performance requirements, the employed modulation and
coding schemes vary with the channel conditions. Also, an antenna management algorithm is
presented in [4]; it can adaptively disable some of the employed antennas of the system so as to meet the capacity and the energy-per-bit constraints. In [5], spatial diversity is considered as a means to
enhance the throughput of the WLAN physical layer, under different channel conditions, without
the transmitter being aware of the channel variation. Thus, by an adaptive controlling of the
network configuration parameters or by an adaptive adjusting of the application transmission
parameters, a more efficient use of the available resources can be obtained in order to better
satisfy the application requested parameters.
The conducted analyses, developed around this concept of dynamic network awareness and
dynamic control of the available radio resources, based on the received feedback information, will
be extremely important for the development of future wireless systems characterized by high data
rate services having strict quality of service (QoS) demands [6]. In order to demonstrate that,
the paper is organized in seven sections, as follows: after a short introduction presented in Section
1, Section 2 focuses on the identification and grouping of these adaptive techniques into two main
categories, indicating also their adaptive potential. By making use of an adaptive technique at
the network configuration level, the transmission performances can be greatly increased. Section
3 aims to evaluate the network’s adaptive potential, at the transmitter side, mainly with respect
to the adaptive modulation and coding (AMC) schemes that can be employed. By implementing
such a mechanism, significant improvements of the system’s performance can be obtained. In
Section 4, the air interface reconfiguration is proposed, due to the channel capacity dependency
on the number of active antennas as well as on the propagation channel state. In Section 5,
we evaluate the adaptive potential of spatial transmit processing techniques, when a precoder is
used to optimize pertinent criteria. When, due to operational costs or technical constraints, the adaptability level with respect to the network configuration parameters is limited, the application's transmission parameters must be adjusted. Section 6 illustrates the adaptive potential at
the application level, by making use of network virtualization. Finally, the conclusions are drawn
in Section 7.
2 Adaptive Techniques in a Wireless System: Identification and Potential
The paper identifies and further evaluates the adaptive potential of some techniques that can
be used for resource management in wireless systems. The paper groups these adaptive techniques in two major categories, as follows: (1) PHY adaptive techniques, based on a feedback
loop: modulation and coding schemes, multi-antennas system configuration, optimal power allocation and linear precoding, and (2) APP adaptive techniques, based on network virtualization:
application layer source packet size or number of transmitted packets.
The adaptive solutions synthesized in Figure 1a illustrate the adaptive potential of different
transmitter (Tx) / receiver (Rx) blocks. These techniques need or include a feedback reaction
from the physical network level. By using different configuration techniques on the Tx / Rx
blocks, the available network resources can be efficiently adapted to the imposed source application requested parameters.
Some critical conditions or cost constraints could limit the network's resource optimization level. These situations impose the adaptation of the source application parameters to the available network resources. In order to indicate the adaptive potential, but in a strict parametric control way, at the application level, the paper introduces the concept of network virtualization. Network virtualization is a technique that consists in clustering logical resources into virtual networks
from the physical network infrastructure. The objective of this virtualization process is to make
each virtual network appear to the user as a dedicated network infrastructure, with dedicated
resources and services available for application requests. Therefore, the adaptive potential of the application parameters through network virtualization, presented in Figure 1b, invokes a model for selecting the virtual network that best integrates and satisfies the application requests at the physical network level.
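A toy sketch of such a selection model follows; the attribute names and the scoring rule are illustrative assumptions, not part of the virtualization framework itself:

```python
# Hypothetical virtual networks carved out of the physical infrastructure.
VIRTUAL_NETWORKS = [
    {"name": "vnet-A", "bandwidth_mbps": 2.0, "delay_ms": 40},
    {"name": "vnet-B", "bandwidth_mbps": 8.0, "delay_ms": 120},
]

def select_vnet(requested_bw_mbps, max_delay_ms):
    """Return the virtual network meeting both application requests with the
    most spare bandwidth, or None when the requests exceed every network."""
    feasible = [v for v in VIRTUAL_NETWORKS
                if v["bandwidth_mbps"] >= requested_bw_mbps
                and v["delay_ms"] <= max_delay_ms]
    return max(feasible, key=lambda v: v["bandwidth_mbps"], default=None)

print(select_vnet(requested_bw_mbps=1.5, max_delay_ms=50))   # -> vnet-A
```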
Figure 1: (a) Adaptive resource management techniques through a feedback loop (left), and (b)
Adaptive resource management techniques based on the network virtualization process (right).
Thus, in the next Sections of this paper, different resource management techniques are evaluated both at the network and application level, by highlighting their adaptive potential.
3 Adaptive Potential of Transmitter Modulation and Coding Schemes
Link adaptation techniques rely on the dynamic configuration of certain transmission parameters, such as the modulation and coding scheme, according to the variable channel conditions, given some constraints imposed by the communication system. These constraints can be expressed in terms of a target bit error rate (BER) and throughput. Several mechanisms can be employed to maximize the throughput in a time-varying channel, but all of them involve the presence of feedback between the transmitter and the receiver. Feedback is critical especially for adaptive modulation and coding (AMC): for an efficient management of the available resources, the transmitter needs to be able to anticipate the channel’s variations and adapt to them accordingly. Figure 2 illustrates this link adaptation mechanism, as well as the parameters taken into account.
The benefits that result from applying a link adaptation technique, in this case AMC, will be illustrated for a wireless metropolitan area system (IEEE 802.16e). The WirelessMAN-OFDMA (Orthogonal Frequency Division Multiple Access) basic radio interface, intended for both portable and mobile applications, has been selected for the performed simulations [7].
System Parameters: The main values of the system parameters are presented in Table 1; they are set in accordance with the recommendations from [7] and [8], so as to build a realistic system. A downlink transmission was considered, from the BS (Base Station) transmitter to the MS (Mobile Station) receiver.
The radio channel plays a key role in the link adaptation process, with respect to the evaluation of the transmitter’s parameters. One of the most widely used sets of channel models for simulating different types of environments affected by frequency-selective fading is the ITU-R set of channel models [8]. The parameters characterizing the Ped.B channel model, which is used for modeling indoor-to-outdoor pedestrian single input single output (SISO) environments, can be found in [8].
Figure 2: AMC parameters and feedback information.
Parameter                           Value
Carrier frequency (GHz)             2.3
Channel bandwidth (MHz)             10
Transmitter power (dBm)             20
BS antenna gain (dBi)               15
MS antenna gain (dBi)               0
FFT size                            1024
Subcarrier allocation mode          DL PUSC
Duplexing scheme                    TDD
Velocity of the MS (km/h)           3
Distance between BS and MS (km)     0.1

Table 1: System parameters.
In this case the channel profile is determined by the number of multipath taps that reach the receiver, and by the relative delay and average power associated with each individual multipath component. Such a channel can be considered a good representative of an urban macro-cellular environment.
System Requirements: As expressed in [9], for most applications (especially multimedia) a BER lower than 10^-6 is required. Also, in order to satisfy even the most bandwidth-consuming applications, we impose a minimum link throughput of 1 Mbps, even under a worst-case scenario.
System Analysis: The BER performance of various modulation schemes as a function of the received SNR (Signal-to-Noise Ratio) is presented in Figure 3, for the case of a convolutional turbo coding (CTC) scheme. Such a coding scheme is especially recommended for wireless applications in anticipated non-line-of-sight environments. In this way it is possible to counteract the unwanted effects that might appear due to the different delays of the multipath components, which may lead to inter-symbol interference (ISI) [10].
From a PHY perspective, a certain imposed QoS level in 802.16 systems can be assured by making use of adaptive modulation and variable FEC (Forward Error Correction) coding. Such a technique, known as adaptive modulation and coding (AMC), can be applied on a sub-carrier basis, according to the random fluctuations of the radio channel [3], [11]. In this case, different spectrally efficient modes (where a mode is defined as a combination of modulation
and FEC coding rate) can be alternated to increase the throughput, assuming that the signal-to-noise ratio (SNR) thresholds required for passing from one mode to another are available at the transmitter. Applying AMC requires feedback information about the current state of the radio channel. The MS can feed back channel state information (CSI), which the BS scheduler can use to assign the modulation and coding scheme that maximizes the throughput for the available SNR, using the Channel Quality Indicator CHannel (CQICH) included in the TDD uplink subframe.

Figure 3: BER versus SNR system performance.
In Figure 4a, the link throughput versus SNR envelope is presented, taking into account the application error requirements imposed above (BER < 10^-6). For high SNR values, the highest order throughput scheme (64QAM 5/6) will be selected in order to use the channel’s capacity efficiently. During deep fades, when the quality of the radio channel degrades rapidly in time, a lower order throughput scheme (QPSK 1/2) will be employed in order to avoid an excessive number of dropped packets and to preserve connection quality and link stability.
A similar envelope generated using AMC to increase or decrease the link speed depending on the received SNR, but this time with respect to the maximum operating distance, is presented in Figure 4b. The maximum operating distance that can be reached using a certain link speed was derived from the path loss equation for indoor-to-outdoor pedestrian environments expressed in [8] and from the link budget equation expressed in [3].
Figure 4: Ped.B channel model: (a) Total achievable link throughput (left). (b) Maximum
operating range (right).
Based on the application constraints and on the fact that each mode needs a certain robustness
level in order to be activated (a minimum SNR value), we conclude that each mode is optimal in a different channel quality region. An example of a lookup table at the transmitter, giving both the SNR domain and the maximum operating range for each of the possible modes, is presented in Table 2.
Source application requested parameters: BER < 10^-6, Data Rate > 1 Mbps

SNR (dB)       Operating range (m)   Modulation and coding scheme (MCS)
12.1 - 17      178 - 127             QPSK 1/2, QPSK 3/4
17 - 24        127 - 95              16QAM 1/2, 16QAM 3/4
24 - 29.1      95 - 68               64QAM 1/2
29.1 - 30.1    68 - 62               64QAM 2/3
30.1 - 32.9    62 - 55               64QAM 3/4
> 32.9         < 55                  64QAM 5/6

Table 2: Transmitter lookup table for the case of the Ped.B channel model.
If the channel’s variations are sufficiently slow and the channel quality information (the received SNR) can be fed back to the transmitter, then the AMC technique yields an optimum use of the available radio resources. Still, applying such a technique requires a cross-layer interaction between the PHY and APP layers, as the system needs to know both the application’s QoS requirements and the received SNR in order to determine the appropriate switching point and link speed for the current flow [6].
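As a concrete illustration of this switching rule, the short sketch below (written by us in Python with illustrative names, not code from the simulated system) picks the most spectrally efficient mode from the Table 2 thresholds for a given fed-back SNR value:

MCS_TABLE = [  # (mode, minimum SNR in dB to activate it), from Table 2
    ("QPSK 1/2", 12.1), ("QPSK 3/4", 12.1),
    ("16QAM 1/2", 17.0), ("16QAM 3/4", 17.0),
    ("64QAM 1/2", 24.0), ("64QAM 2/3", 29.1),
    ("64QAM 3/4", 30.1), ("64QAM 5/6", 32.9),
]

def select_mcs(snr_db):
    """Return the most spectrally efficient mode whose threshold is met, or None."""
    feasible = [mode for mode, threshold in MCS_TABLE if snr_db >= threshold]
    return feasible[-1] if feasible else None  # list is ordered by spectral efficiency

for snr in (10, 15, 25, 35):
    print(snr, "dB ->", select_mcs(snr))

In the actual 802.16e system the same decision would be driven by the CQICH feedback described above, rather than by a local function call.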
4 Adaptive Potential of Multi-Antennas System Configuration
The main idea of this section is to explain the need for an adaptive configuration of a Uniform Linear Array - Multiple Input Multiple Output (ULA-MIMO) system with respect to the number of active antenna elements used at the transmitter and receiver, when the inter-element distance is constant. The basic criteria that dictate the system reconfiguration are the channel status and the changeability of the propagation environment, together with the constraints related to the channel capacity (in bps/Hz) needed to support the transmission of the required amount of information. Figure 5 depicts the channel and air interface parameters taken into account when adapting the multi-antenna system configuration.
System Parameters: The MIMO air interface is characterized by the number of antenna elements used at both the transmitter and the receiver side and by the inter-element distance, given in wavelengths, λ. We consider a 4×4 uniform linear transceiver with an inter-element distance of 0.5λ. At this value, the correlation degree at both ends of the communication link provides the low diversity order that characterizes real propagation environments [12]. In this section we use a uniform power allocation scheme and a variable number of simultaneously active antennas.
The channel matrix that describes the behaviour of the propagation environment was modeled for a frequency-flat channel affected by Rayleigh fading. The matrix was derived by integrating the influence of the spatial correlation between the transmitted and received signals. The correlation matrices depend on the PDP (Power Delay Profile) and on the physical ray propagation parameters described in [13], such as the AOA/AOD (Angle of Arrival/Angle of Departure), the number of clusters and the PAS (Power Azimuth Spectrum), but more strongly on the AS (Angle Spread) and on the inter-element distance [14], [15].
In this paper, two channel matrices are built in order to model environments like residential/small offices (channel C) and large indoor spaces (channel F).
Figure 5: Air interface parameters and feedback information in a multi-antennas system.
The former is characterized by an RMS delay spread of 30 ns and 2 clusters, while the latter has a larger RMS delay spread of 150 ns and 6 clusters.
System Requirements: We begin by showing the importance of an adaptive multi-antenna system, considering that the required channel capacity is 10 bps/Hz. The evaluation is performed for a system subject to radio channel variations within the same propagation environment, but also to changes between different types of propagation environments.
4.1 Antennas-System Adaptive Potential under Channel Variation
Because of the variability of the wireless channel, the system configuration that needs to be
employed in order to meet the channel capacity constraint (C = 10 bps/Hz) can vary over the
SNR domain. The channel capacity variation with SNR, obtained after modeling the channel
type C for all antenna combinations, is depicted in Figure 6.
Figure 6: Channel C capacity (bps/Hz).
Table 3 gives the optimal configuration when the system has only CSIR (Channel State Information at the Receiver), considering that the transmitter has a fixed number of antenna elements, from 1 to 4. When the transmitter has only one antenna, the capacity constraint can be fulfilled only if the signal-to-noise ratio is high (in the range of 20 to 25 dB) and the number of receive antennas is 4. In this situation the capacity constraint cannot be fulfilled when the channel state degrades, so a higher number of transmit elements would be needed (e.g. 2, or in some cases even 3 or 4). For lower SNR domains, a higher order configuration needs to be employed than for situations with a higher signal-to-noise ratio.
SNR (dB)   Tx=1   Tx=2   Tx=3   Tx=4
5-9        -      -      Rx=4   Rx=4
10-14      -      Rx=4   Rx=3   Rx=3
15-19      -      Rx=3   Rx=3   Rx=3
20-25      Rx=4   Rx=3   Rx=2   Rx=2

Table 3: CSIR MIMO configurations for C = 10 bps/Hz (a dash marks cases in which the capacity constraint cannot be met).
In order to make sure that the required channel capacity is satisfied even under bad channel conditions, the system needs to be reconfigurable. To that effect, a feedback link to the transmitter must exist, in order to properly decide which configuration is most suitable for assuring the required channel capacity. On the other hand, if the system had CSIT, the possible configurations would be those proposed in Table 4; the configurations are chosen so that the receiver has the minimum number of active antennas.
SNR (dB)   Tx x Rx
5-9        3x4
10-14      3x3
15-19      2x3
20-25      4x2

Table 4: CSIT MIMO configuration for C = 10 bps/Hz, model C.
Based on the assumption in [16], the costs of 2×2 and 4×4 reconfigurable MIMO systems are 1.5 and 2.5 times the cost of a SISO (Single Input Single Output) system, respectively. For this reason, the feedback-driven configurations shown in Table 4 are chosen not only to guarantee the minimum required channel capacity, but also to meet cost constraints.
4.2 Antennas-System Adaptive Potential Based on Channel Characteristics
A MIMO system reconfiguration can also be required when the propagation environment changes (e.g. from a small office to a large space), under the same channel capacity constraints. In the F channel type, due to its characteristic propagation parameters, the spatial correlation degrees at the transmitter and receiver differ from those in the C channel type [17]. As a consequence, the MIMO system configurations that guarantee C = 10 bps/Hz over the different SNR domains differ from those used in the first propagation environment. Assuming that the feedback link exists in the MIMO system, the configurations employed in model F are specified analogously in Table 5.
SNR (dB)   Tx x Rx
5-9        3x4
10-14      4x3
15-19      3x3
20-25      3x2

Table 5: CSIT MIMO configuration for C = 10 bps/Hz, model F.
As a consequence of the variability of the wireless communication channel and of the mobility of the end users, both within the same channel environment and across different propagation surroundings, a reconfigurable MIMO system must exist at the transmitter and receiver in order to meet the application constraints.
This section has shown the importance of MIMO air interface reconfigurability at both the transmitter and the receiver, such that the imposed channel capacity (which in turn is related to the maximum amount of information that can be reliably transmitted over the bandwidth of the communication channel) can be satisfied regardless of the SNR values and the propagation
environment. An important condition enabling reconfigurability at the transmitter is the presence of a feedback channel that carries the required channel state information (e.g. channel matrix, SNR) according to which the reconfiguration is performed.
The advantages of a feedback channel are emphasized in the following, considering a MIMO system operating in a type C channel. For this analysis, we start from the same system requirements, which impose a minimum channel capacity of 10 bps/Hz. In addition, we consider an SNR level of 20 dB. From Table 4, the antenna configuration that satisfies these two restrictions is 4×2.
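For intuition, the configuration search itself can be sketched as below. This is a hedged illustration under a simplifying assumption: the channel is drawn i.i.d. Rayleigh instead of using the correlated C/F matrices of the paper, so the selected configurations will generally be smaller than those in Tables 3-5.

import numpy as np

def ergodic_capacity(nt, nr, snr_db, trials=2000, rng=np.random.default_rng(0)):
    """Monte-Carlo estimate of E[sum log2(1 + (SNR/nt) * eigenmode gains)]."""
    snr = 10 ** (snr_db / 10)
    caps = []
    for _ in range(trials):
        h = (rng.standard_normal((nr, nt)) + 1j * rng.standard_normal((nr, nt))) / np.sqrt(2)
        gains = np.linalg.eigvalsh(h @ h.conj().T)          # eigenvalues of H H^H
        caps.append(np.sum(np.log2(1 + (snr / nt) * gains)))
    return float(np.mean(caps))

def smallest_config(snr_db, target=10.0):
    """First Tx x Rx pair meeting the capacity target, preferring fewer Rx antennas."""
    for nr in range(1, 5):
        for nt in range(1, 5):
            if ergodic_capacity(nt, nr, snr_db) >= target:
                return nt, nr
    return None

print(smallest_config(20))   # a feasible configuration for C >= 10 bps/Hz at 20 dB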
5 Adaptive Potential of Spatial Transmit Processing Techniques
The achievable performance enhancements due to the use of spatial processing techniques depend on the nature of the channel state information (CSI) available at the transmitter and receiver [18]. We assume perfect CSI at the receiver and evaluate the adaptive potential of the system considering that either a channel quality indicator (CQI) (the received SNR) or statistical channel information (the channel mean) can be exploited at the transmitter. The idea is to adapt the application requests according to the capabilities of the network at the physical layer.
Figure 7: Adaptive potential of spatial transmit processing techniques.
The amount of channel knowledge at the transmitter has a direct influence on the adaptive adjustment capability of the system: it establishes the design parameters that can be adapted to different channel conditions. As mentioned in Figure 7, in the data processing blocks these parameters are: the MIMO transmission technique, the codeword matrix, the modulation scheme and constellation size, the beam directions, and the power allocated to each beam, subject to a total transmit power constraint.
5.1 Multi-Antennas System Adaptive Potential through MIMO Encoding Scheme
The approach is based on switching between different transmission algorithms, depending on the variable channel conditions, in order to provide the application with the requested QoS profile.
System parameters: We consider an uncorrelated 2×2 MIMO system, 4-PSK modulation and a quasi-static flat fading channel model. Two open-loop encoding techniques are used at the transmitter: STC (Space-Time Codes) [19], [20], which improve link reliability thanks to transmit diversity, and SM (Spatial Multiplexing) [21], used to increase the peak error-free data rate by transmitting separate data streams from each antenna. At the receiver, optimal ML (Maximum Likelihood) detection is performed in both cases. The packet length considered for the throughput computation is 200 bytes.
System requirements: The application requires a minimum throughput value of 1 Mbps.
Figure 8: Capacity gain in multi-antennas systems with perfect CSIT.
Based on the CQI, the transmitter can select the MIMO encoding scheme that maximizes the link throughput. As depicted in Figure 8, the throughput switching point for a 2×2 MIMO system is around an SNR of 14 dB. At low to medium SNR values, STCs provide the highest throughput due to their robustness against poor channel conditions. At high SNR values, SM is the best choice, as it provides a high error-free data rate.
5.2 Multi-Antennas System Adaptive Potential by Optimal Power Allocation
System parameters: In Section 4 it was shown that, if the MIMO channel follows the IEEE 802.11n model C design, a possible transmit/receive antenna configuration ensuring a minimum capacity of 10 bps/Hz at an SNR of 20 dB is 4×2. In this section, based on the same antenna configuration, we evaluate the capacity gain obtained when the transmitter has perfect CSIT [22].
System requirements: Channel capacity maximization with an efficient use of the radio
resources.
The transmitter relies on perfect CSIT to distribute the available transmit power across the channel modes. Through waterfilling (WF), more power is allocated to the strongest eigenmodes, as depicted in Figure 9, whereas without CSIT the power is equally divided (EP, Equal Power) between the transmit antennas. The capacity gain due to CSIT is significant, reaching 2.5 bps/Hz at an SNR of 20 dB for a spatially correlated 4×2 MIMO system. Due to the efficient use of the transmit power through optimal power allocation, a minimum capacity of 10 bps/Hz can be ensured with the same number of transmit and receive antennas (4×2) even if the SNR decreases to 12 dB.
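A minimal sketch of the waterfilling allocation over the channel eigenmodes is given below, assuming unit noise power and a randomly drawn, uncorrelated 4×2 channel; the spatially correlated channel behind Figure 9 would only change how H is generated.

import numpy as np

def waterfill(gains, total_power):
    """Waterfilling: p_i = max(mu - 1/g_i, 0) with sum(p_i) = total_power."""
    order = np.argsort(gains)[::-1]                   # strongest eigenmodes first
    g = gains[order]
    p = np.zeros_like(g)
    for k in range(len(g), 0, -1):
        mu = (total_power + np.sum(1.0 / g[:k])) / k  # water level for k active modes
        if mu - 1.0 / g[k - 1] >= 0:                  # weakest of the k modes still active
            p[:k] = mu - 1.0 / g[:k]
            break
    out = np.zeros_like(p)
    out[order] = p
    return out

rng = np.random.default_rng(1)
H = (rng.standard_normal((2, 4)) + 1j * rng.standard_normal((2, 4))) / np.sqrt(2)  # 4x2 link
gains = np.linalg.eigvalsh(H @ H.conj().T)   # eigenmode power gains, unit noise
snr = 10 ** (20 / 10)                        # total transmit power for SNR = 20 dB
p_wf = waterfill(gains, snr)
c_wf = np.sum(np.log2(1 + p_wf * gains))     # capacity with CSIT (WF)
c_ep = np.sum(np.log2(1 + (snr / 4) * gains))  # capacity without CSIT (EP over 4 Tx antennas)
print(f"WF: {c_wf:.2f} bps/Hz   EP: {c_ep:.2f} bps/Hz")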
Figure 9: Capacity gain and power allocation in a correlated 4×2 system with perfect CSIT.

5.3 Multi-Antennas System Adaptive Potential through Linear Precoding
System parameters: We consider MIMO systems having the same diversity order, an
uncorrelated flat fading channel model and ML detection. The transmitter uses STBC (Space-Time Block Codes), a special case of space-time codes. Combinations of STBC codeword matrices, characterized by their coding rate, and modulation schemes are selected in order to ensure a spectral efficiency of 2 bits/s/Hz for each MIMO transmission. We also consider that, through a feedback channel, an estimated channel matrix is available at the transmitter [23].
System requirements: The imposed application requests a minimum throughput of 5 Mbps, which has to be satisfied also for low SNR values. In order to fully exploit the presence of multiple antennas, linear precoding aims to adapt the transmitted signal to the channel state by adjusting the beam directions and the power allocated to each beam. In [23] it was shown that the data preprocessing is based on the noise variance, the eigenvalues of the estimated channel matrix and the eigenvalues of the codeword error matrix.
The simulation results are depicted in Figure 10. Both CSIT and no-CSIT transmissions were considered; for the precoder design, a small CSIT error is assumed. Without adaptation, for low SNR values the application requirements are satisfied by the system using 2 transmit antennas and 4-PSK modulation, as it is more robust against noise. If the channel state is available at the transmitter, the transmitted data can be adapted to the channel state by means of precoding. The highest throughput improvement is obtained for the systems with a higher number of transmit antennas. A link throughput of 5 Mbps can be provided by the 4×4 MIMO precoded system if the channel quality indicator is above -5 dB, while without adaptation a minimum of 1 dB is needed to fulfil the same requirements. In the case of a 4×2 system, the SNR gain for the same throughput is about 3 dB, while for the 2×4 system the SNR gain is only 1 dB.
Regarding the adaptive potential, channel knowledge at the transmitter is essential in the
low SNR region and can be efficiently exploited in high transmit diversity systems.
Figure 10: Throughput enhancement in precoded multi-antennas systems.

6 Adaptive Potential of the Application Parameters
The model used to demonstrate the adaptive potential of the application parameters through network virtualization is the I-NAME (In-Network Autonomic Management Environment) QoS model, a model that calibrates the source application requests to the available network resources [24]. In the I-NAME model, the network virtualization process is assisted by QoS profiles, a set of messages that includes a QoS parametric description of the resources.
The network virtualization process assumes a similar parameterization of resources in terms of delay, throughput and jitter, both for application and network elements, as presented in Figure 11. The application QoS profile encapsulates the parameters initially requested by the source
application, further identified in the network and finally agreed for a virtual network established between the source and the destination node. Assuming that multiple virtual networks may share the same underlying physical network infrastructure [25], the network QoS profile involves the identification and selection of the network elements that integrate the application’s specific requests for resources. Based on the QoS profile message exchange, the I-NAME model indicates the best path to the destination in terms of the selected virtual network elements and of the most appropriate source application configuration settings, obtained through adaptation.
Figure 11: Adaptive potential of the application parameters.
System Parameters: To demonstrate the adaptive potential of the application parameters, we test the behaviour of time-critical applications under specific network conditions: with the I-NAME QoS support model for adaptation, over a BE (Best-Effort) network support, and based on QoS network layer classification (IP Precedence 3 and 6). Different application behaviours were modeled by varying the packet size of the source application between 200 and 1600 bytes. Using QualNet Developer 4.5, a network scenario was built which models two radio access segments (IEEE 802.11 and 802.16) connected through a core network segment composed of multiple wired or wireless connections. The source application was located in the IEEE 802.11 network segment, while the destination node was located in the IEEE 802.16 access segment.
System Requirements: The requirement for resources is included in the application QoS
profile in terms of a maximum accepted delay of 0.01 sec and a maximum accepted jitter of 0.1 sec.
As we have already mentioned, the conjunction between the source application requests and the network context is achieved through the QoS profile message exchange, and it represents the process of network virtualization. Even though the improvements added by the I-NAME QoS model through network virtualization are significant compared to the support offered by the other models, the results presented in Figure 12a indicate that the maximum accepted average end-to-end transmission delay of 0.01 sec is exceeded in 40% of the cases, for variations of the number of packets transmitted per second.
When the application requirements exceed the network capacity, the I-NAME model proposes an adaptation of the application parameters, either through source fragmentation or through source code adaptation. The coding adaptation procedure of the source application implies increasing the interval between the packets sent by the source application to the value <interval*2>, while maintaining the transmitted packet size, in all cases in which the maximum accepted average delay value is exceeded. After the coding adaptation process is applied, the application’s average end-to-end transmission delay is kept within the limits imposed by the QoS profile, as illustrated in Figure 12b (a sketch of this rule follows the figure).
Figure 12: (a) The effect of I-NAME QoS support upon the average end-to-end delay variation
(left), and (b) The effect of I-NAME adaptation process through source code adaptation (right).
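The coding adaptation rule described above reduces to a simple control loop. The sketch below is a hypothetical Python illustration (the identifiers are ours; in I-NAME the actual decision is carried by the QoS profile message exchange):

MAX_AVG_DELAY = 0.01   # sec, the limit carried in the application QoS profile

def adapt_interval(interval, measured_avg_delay):
    """Double the inter-packet interval (<interval*2>) when the delay limit is
    exceeded, keeping the transmitted packet size unchanged."""
    if measured_avg_delay > MAX_AVG_DELAY:
        return interval * 2
    return interval

interval = 0.005   # sec, hypothetical initial packet interval
for delay in (0.008, 0.013, 0.009):
    interval = adapt_interval(interval, delay)
    print(f"measured delay {delay:.3f} s -> packet interval {interval:.3f} s")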
Therefore, the adaptive potential of the application parameters involves network virtualization with respect to the imposed application QoS profile and gradual adaptation of the source
application by means of source fragmentation or source code adaptation.
7 Conclusions
The scope of this paper was to identify, explore and propose adaptive techniques that can be used to achieve efficient resource management, in an attempt to enhance the transmission quality of wireless systems. The presented adaptive solutions have illustrated the capability to use the available system resources efficiently, by applying different techniques based on a feedback loop or on network virtualization. For systems using a feedback loop, the adaptive potential for resource management was explored through the calibration of modulation and coding schemes, multi-antenna system configuration, optimal power allocation strategies or linear precoding. Through the network virtualization process, the adaptive potential at the application level was emphasized, by applying source fragmentation or source code adaptation. Thus, by realizing adaptation at either the network level or the application level, all these techniques guarantee an efficient resource management in wireless systems.
Acknowledgements
This paper was supported by the project "Development and support of multidisciplinary
postdoctoral programmes in major technical areas of national strategy of Research - Development
- Innovation" 4D-POSTDOC, contract no. POSDRU/89/1.5/S/52603, project co-funded by the
European Social Fund through Sectoral Operational Programme Human Resources Development
2007-2013. The logistics costs of the work (research infrastructure) were supported by the PN II-RU research grant PD184/2010, funded by CNCSIS-UEFISCSU.
Bibliography
[1] M. Mueck, A. Piipponen, G. Dimitrakopoulos, K. Tsagkaris, et al., ETSI reconfigurable
radio systems: status and future directions on software defined radio and cognitive radio
standards, IEEE Communications Magazine, ISSN 0163-6804, 48(9):78-86, 2010.
[2] N. Crisan, C. L. Cremene, E. Puschita, T. Palade, Evaluation of Adaptive Radio Techniques for the under-11GHz Broadband Wireless Access, INT J COMPUT COMMUN, ISSN 1841-9836, 3(S):232-237, 2008.
[3] M. Tran, G. Zaggoulos, A. Nix, A. Doufexi, Mobile WiMAX: Performance Analysis
and Comparison with Experimental Results, IEEE 68th Vehicular Technology Conference
(VTC), 1-5, 2008.
[4] Y. Hang, Z. Lin, S. Ashutosh, Power Management of MIMO Network Interfaces on Mobile
Systems, IEEE Transactions on Very Large Scale Integration Systems, 20(7):1175-1186,
2012.
[5] A. Doufexi, E. Tameh, A. Nix, A. Pal, M. Beach, C. Williams, Throughput and Coverage
of WLANs Employing STBC under Different Channel Conditions, 1st International Symposium on Wireless Communication Systems, Mauritius, 20-22 September 2004, 367-372,
2004.
[6] J. Salazar, I. Gomez, A. Gelonch, Adaptive Resource Management and Flexible Radios for WiMAX, Journal of Telecommunications and Information Technology, 4:101-107, 2009.
[7] WiMAX Forum, Forum Mobile System Profile Specification: Release 1.5 Common Part,
Revision 0.2.1: 2009-02-02.
[8] Recommendation ITU-R M.1225, Guidelines for evaluation of radio transmission technologies for IMT-2000, 1997.
[9] Project BROADWAN - Broadband services for everyone over fixed wireless access networks,
Deliverable 21. Planning guidelines for broadband access networks with case studies, May
2006.
[10] B. Baumgartner, M. Reinhardt, G. Righter, M. Bossert, Performance of Forward Error
Correction for IEEE 802.16e, Proc. of the 10th International OFDM Workshop (InOWo),
Hamburg, Germany, 2005.
[11] M. Tran, A. Nix, A. Doufexi, Mobile WiMAX MIMO Performance Analysis: Downlink
and Uplink, IEEE 19th International Symposium on Personal, Indoor and Mobile Radio
Communications (PIMRC), Cannes, 15-18 September 2008, 1-5, 2008.
[12] L. Zhinin, Z. Wenjun, X. Youyun, An Improved Simulator for the Correlation-based
MIMO Channel in Multiple Scattering Environments, Wireless Personal Communications,
52(4):777-788, 2009.
[13] V. Erceg, et al., IEEE P802.11 Wireless LANs TGn channel models, doc: IEEE 802.11-03/940r4.
[14] L. Schumacher, K.I. Pedersen, P.E. Mogensen, From antenna spacing to theoretical capacities: guidelines for simulating MIMO systems, Proc. PIMRC Conf., 2:587-592, 2002.
[15] S. Durrani, M.E. Bailkowski, Effect of angular energy distribution of an incident signal on
the spatial fading correlation of a uniform linear array, Microwaves, Radar and Wireless
Communications MIKON, 2:493-496, 2004.
[16] D. Lasevla, I.Z. Kocacs, L.M. Del Apio, E. Torrecilla, Feasibility and deployment analysis of the SURFACE concept, deliverable 1.1, 2009, available at: http://www.istsurface.org/deliverables/SURFACE-D8.320v1.1.pdf.
[17] I. Vermesan, A. Moldovan, T. Palade, R. Colda, Multi Antenna STBC Transmission Technique Evaluation under IEEE 802.11n Conditions, 15th IEEE International Conference on
Microwave Techniques, COMITE 2010, 51-54, 2010.
[18] M. Vu, A. Paulraj, MIMO Wireless Linear Precoding, IEEE Signal Processing Magazine,
24(5):86-105, 2007.
[19] V. Tarokh, H. Jafarkhani, R. Calderbank, Space-Time Block Codes from Orthogonal Designs, IEEE Transactions on Information Theory, 45(5):1456-1467, 1999.
[20] V. Tarokh, N. Seshadri, R. Calderbank, Space-Time Codes for High Data Rate Wireless Communication: Performance Criterion and Code Construction, IEEE Transactions on Information Theory, 44:744-765, 1998.
[21] G.J. Foschini, Layered Space-Time Architecture for Wireless Communication in a Fading Environment When Using Multiple Antennas, Bell Laboratories Technical Journal, 1(2):41-59, 1996.
[22] I.E. Telatar, Capacity of multi-antenna Gaussian channels, European Transactions on
Telecommunications, 10:585-595, 1999.
[23] W. Huang, E.K. Au, V.K.N. Lau, Linear Precoding for Space-Time Coded MIMO Systems
using Partial Channel State Information, IEEE International Symposium on Information
Theory, 391-395, 2006.
[24] E. Puschita, T. Palade, A. Moldovan, R. Colda, I. Vermesan, An Innovative QoS Paradigm based on Cognitive In-Network Management of Resources for a Future Unified Network Architecture: I-NAME QoS Model, The Second International Conference on Advances in Future Internet AFIN 2010, ISBN 978-0-7695-4091-7, 37-43, 2010.
[25] A.G. Prieto, D. Dudkowski, C. Meirosu, C. Mingardi, G. Nunzi, M. Brunner, R. Stadler, Decentralized In-Network Management for the Future Internet, IEEE International Conference on Communications, ICC 2009, 1-5, 2009.
INT J COMPUT COMMUN, ISSN 1841-9836
8(1):127-135, February, 2013.
A Novel Method for Service Differentiation in IEEE 802.15.4 :
Priority Jamming
S.Y. Shin
Soo Young Shin
School of Electronic Engineering,
Kumoh National Institute of Technology,
Room 112, Digital Building, Yanghodong, Gumi, Gyeongbuk, Korea
E-mail: [email protected]
Abstract: IEEE 802.15.4 employs carrier sense multiple access with collision avoidance (CSMA/CA), which is known to have difficulty supporting quality of service (QoS). In this paper, a new priority scheme called priority jamming (PJ) is proposed for service differentiation in IEEE 802.15.4. The main idea of the proposed scheme is to defer low priority packet transmissions in favour of high priority packets. The clear channel assessment of IEEE 802.15.4 is modified to support the proposed PJ. The efficiency of the proposed scheme is validated by comparing the delay, throughput and energy-per-bit with those of standard CSMA/CA. Simulation results show that PJ improves the delay and throughput simultaneously while maintaining only a marginal difference in energy efficiency.
1 Introduction
IEEE 802.15.4 was originally designed for low duty cycle and low rate applications such as wireless personal area networks (PANs) [1]. However, it can be adopted for some delay-sensitive applications such as emergency detection, intruder alarming, health care, and so on. For supporting these applications, QoS (quality of service) requirements need to be met.

IEEE 802.15.4 adopts the carrier-sense multiple access with collision avoidance (CSMA/CA) mechanism, so every node has a statistically equal chance to use the wireless medium and to transmit packets. Because of the unpredictable and nondeterministic nature of CSMA/CA, caused by collisions and the random backoff algorithm, providing QoS in CSMA/CA is generally known to be difficult.
Some works, such as [2], [3], [4], have studied QoS support in IEEE 802.15.4. In [2], the guaranteed time slot (GTS) mechanism is exploited for real-time service when beacon mode is enabled. The GTS uses the contention-free period (CFP), an optional feature of the IEEE 802.15.4 standard. In addition, if the available resources are not sufficient, i.e., the number of nodes requiring real-time service is large, the GTS may not suffice because of the limited number of GTS slots. In [3], different backoff exponent (BE) and contention window (CW) values are used to provide service differentiation in IEEE 802.15.4. However, nodes with different priorities can have the same value of BE and/or CW under the slotted carrier-sense multiple-access with collision avoidance (CSMA/CA) algorithm of IEEE 802.15.4, which leads to collisions among nodes with different priorities and increased delay for successful packet transmission. In [4], frame tailoring (FRT) and priority toning (PRT) are proposed to support QoS. Using FRT, i.e. padding zeroes, IEEE 802.15.4 nodes with high priority perform only one clear channel assessment (CCA). However, FRT causes packet overhead, and the acknowledgement packet of a normal priority node can collide with the data packet of a high priority node. To prevent collisions between high priority and normal priority packets, priority toning (PRT) is used to defer the normal priority packets by allocating some portion of the active period for the
Copyright © 2006-2013 by CCC Publications
Table 1: Parameters of IEEE 802.15.4

Parameters                                Values
aMinBE                                    3 (default)
aMaxBE                                    5 (default)
CW                                        2 (default)
macMaxCSMABackoffs                        5 (default)
Slot duration (UnitBackoffPeriod)         20 symbols
CCA duration of low priority packet       20 symbols
CCA duration of high priority packet      8 symbols
Jamming signal duration                   8 symbols
high priority packet transmissions. However, if the number of high priority packets is large and the network load is high, collisions may frequently occur among the high priority packets in the portion allocated by the PRT. Moreover, these methods are designed for the beacon-enabled IEEE 802.15.4 network only.
To support service differentiation, a novel method for supporting priority, called priority jamming (PJ), is proposed in this paper. The core idea of PJ is to defer normal priority packet transmissions using a jamming signal transmitted by high priority nodes that have packets ready to transmit. Because the proposed method exploits the channel sensing part of IEEE 802.15.4, i.e., clear channel assessment (CCA), it can be applied to both beacon-enabled and non-beacon-enabled networks. Although this paper considers slotted CSMA/CA, the method can be used with both the slotted and unslotted versions of IEEE 802.15.4 CSMA/CA. The paper is organized as follows. The suggested PJ is described in the next section. Section 3 evaluates the performance of PJ and compares it with standard CSMA/CA. Finally, we draw our conclusions in Section 4.
2 Priority Jamming
In this paper, a non-beacon-enabled network with slotted CSMA/CA is assumed. Let us first examine the operation of the standard CSMA/CA of IEEE 802.15.4, which works as follows. Three variables are maintained at each device for channel access: NB, CW and BE. NB is the number of times the CSMA/CA algorithm has backed off while attempting the current transmission; it is reset to 0 for each new data transmission. CW is the contention window length, which is reset to 2 either for a new data transmission or when the channel is found to be busy. BE is the backoff exponent, related to the number of backoff periods a device should wait before attempting carrier sensing. When a device needs to transmit, it delays for a random number of backoff periods (up to 2^BE - 1 periods) and then determines whether the channel is clear. If the channel is busy, the MAC increases both NB and BE by one and resets CW to 2. If NB is less than or equal to macMaxCSMABackoffs, the CSMA/CA algorithm delays for a random time again; otherwise it terminates with a failure. If the channel is assessed to be idle, the device must ensure that the contention window has expired before starting the transmission. For this, the MAC sublayer first decrements CW by one. If CW is not equal to 0, it must go to another channel sensing step; otherwise, it starts the transmission on the boundary of the next slot period [1].
In an IEEE 802.15.4 network, it is assumed that there are two data categories, namely, high
and normal priority packets. The main idea of PJ is to provide high priority packets with greater
possibility to access the channel compared to normal priority packets. Figure 1 describes the operation of the proposed scheme; the parameters related to both PJ and standard MAC operation are summarized in Table 1.
Figure 1: Operation of Priority Jamming in IEEE 802.15.4.
To transmit either a high or normal priority packet, nodes contend for channel access using the CSMA/CA of IEEE 802.15.4 as described earlier. To introduce PJ into IEEE 802.15.4, the grey-colored portion shown in Figure 1 is changed or modified compared to standard CSMA/CA. When a node is ready to transmit a high priority packet, it not only performs CCA to see whether the channel is busy or idle, but also sets aside some slot time to send a jamming signal, notifying the other nodes that a high priority packet is ready to be transmitted. In other words, a node with high priority packets performs CCA during 8 symbol times, as defined in the IEEE 802.15.4 standard, and then transmits a jamming signal. Any signal whose duration is less than (slot duration - 8) symbols can be used as the jamming signal; in this paper, we used the preamble of an IEEE 802.15.4 packet. With PJ, a node with normal priority packets performs CCA for the entire slot time (20 symbol times), not just 8 symbol times. Because such a node listens for the entire slot, the channel will be assessed as busy due to the jamming signal sent by any node with a high priority packet ready to be transmitted. The node with normal priority packets will then defer its transmission and perform another random backoff procedure. By deferring the transmissions of normal priority packets, the collision probability between nodes with high and normal priority packets is reduced compared to standard CSMA/CA. In other words, the effective number of nodes contending for the channel is reduced by adopting PJ.
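The PJ-modified channel access of Figure 1 can be summarized by the following sketch, an illustrative Python rendering using the Table 1 defaults; slot timing and the radio itself are abstracted behind a channel_busy callable, so this is a reading aid rather than a faithful simulator.

import random

A_MIN_BE, A_MAX_BE, MAX_BACKOFFS = 3, 5, 5   # defaults from Table 1

def pj_csma_ca(high_priority, channel_busy, send_jamming=lambda: None):
    """One channel-access attempt; True on success, False on access failure."""
    nb, be = 0, A_MIN_BE
    while nb <= MAX_BACKOFFS:
        random.randint(0, 2 ** be - 1)       # backoff periods to wait (timing not modeled)
        cw = 2
        while cw > 0:
            if high_priority:
                send_jamming()               # 8-symbol CCA plus jamming in the same slot
            # normal priority nodes sense the full 20-symbol slot, so they also
            # hear any jamming signal and assess the channel as busy
            if channel_busy(full_slot=not high_priority):
                break                        # busy: back off again with NB+1, BE+1
            cw -= 1
        if cw == 0:
            return True                      # CW expired: transmit on next slot boundary
        nb, be = nb + 1, min(be + 1, A_MAX_BE)
    return False                             # exceeded macMaxCSMABackoffs

# toy channel that is busy 20% of the time, regardless of CCA length
print(pj_csma_ca(True, lambda full_slot: random.random() < 0.2))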
Note that if all the packets have the same priority level, i.e., under either 100% high priority or 100% normal priority traffic, the behavior of PJ nodes will be identical to that of standard CSMA/CA. Also, when nodes with standard CSMA/CA coexist with nodes using PJ, they are treated as nodes with high priority packets. This means that PJ guarantees backward compatibility with the legacy IEEE 802.15.4 standard.
Figure 2: An Example of Priority Jamming in slotted CSMA/CA of IEEE 802.15.4.
Figure 2 shows an example of PJ operation with one node holding a high priority packet and one node holding a normal priority packet. As illustrated, the node with the normal priority packet finishes its backoff at the (i-1)-th slot and performs the first CCA, CCA1. At this time, the channel is assessed to be idle and CW = CW - 1. At the i-th slot, the node with the high priority packet finishes its backoff and performs CCA1 with PJ. At the same time, the node with normal priority starts the second CCA, CCA0. Now the channel is assessed to be busy because of the jamming signal transmitted
by the node with the high priority packet. Then, at the (i+1)-th slot, another random backoff and CCA0 are performed by the node with normal priority and the node with high priority, respectively. Finally, the node with high priority gets the chance to transmit its packet at the (i+2)-th slot.
3 Performance Evaluation
To evaluate the performance, we developed an OPNET simulation model of the slotted CSMA/CA and priority jamming of IEEE 802.15.4. A star topology network with one coordinator and 20 end devices is used for the simulations, as illustrated in Figure 3.
Figure 3: Simulation scenario of IEEE 802.15.4 CSMA/CA with Priority Jamming
All end devices send data packets to the coordinator, which responds with the corresponding ACK packets. All end devices generate 102-byte packets, with exponentially distributed inter-arrival times of mean λ. The two simulation parameters are the inter-arrival time λ and the ratio of high priority packets in the total traffic. λ varies from 0.15 to 0.4, which means that the traffic introduced into the network varies from 108.8 to 40.8 kbps. The ratio of high priority traffic is one of 10, 30 and 50%. As performance measures, delay, throughput and energy efficiency are used in this paper. The performance of PJ is then compared to that of standard slotted CSMA/CA.
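As a quick arithmetic check of the offered load figures, taking λ as the mean packet inter-arrival time:

N_DEVICES, PACKET_BYTES = 20, 102

def offered_load_kbps(mean_interarrival_s):
    """Aggregate traffic generated by all end devices, in kbps."""
    return N_DEVICES * PACKET_BYTES * 8 / mean_interarrival_s / 1e3

print(f"{offered_load_kbps(0.15):.1f} kbps")   # 108.8 kbps
print(f"{offered_load_kbps(0.40):.1f} kbps")   # 40.8 kbps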
Figure 4: Delay vs. inter-arrival time (delay in seconds versus packet inter-arrival time; curves: std, and PJ high/normal priority at 10/30/50%).
Figure 4 shows the packet delays as the inter-arrival time increases. Here, (x%) in the legend of Figure 4 means that x percent of the total traffic introduced into the IEEE 802.15.4 network consists of high priority packets. Lines parallel to the y-axis mean that the delay does not converge; in other words, there are so many collisions that the packet delay keeps increasing. For example, for λ < 0.2, the delay of standard CSMA/CA does not converge.
However, when PJ is applied, the delays of high priority packets are bounded even for λ = 0.15. This is because the collision probability of high priority packets decreases due to the service differentiation provided by PJ. For example, at λ = 0.2, the collision probabilities of the high and normal priority packets are 0.007, 0.026, 0.049 and 0.111, 0.089, 0.060 for the 10, 30, 50% cases, while that of the standard is 0.129. As the ratio of high priority packets increases, the collision probabilities of high priority packets increase and those of normal priority packets decrease, because the numbers of high priority and normal priority nodes increase and decrease, respectively. Note that the delays of normal priority packets with 10% high priority packets at
λ = 0.18 are about 5 seconds.
For λ ≥ 0.18, the delay of the high priority packets increases as the ratio of high priority packets increases. As λ decreases, the channel becomes more crowded, even for the CCA of the high priority nodes. The channel can then be assessed as "busy" and may cause many random back-offs (note that in standard CSMA/CA the channel is already saturated at λ = 0.2).
As the packet inter-arrival time decreases, the channel occupation by packet transmissions becomes more dominant than collisions. For larger λ, the channel is not frequently occupied and the delay depends more on the collision probability. Hence, the delay of high priority packets decreases as the ratio of high priority traffic decreases.
Note that the delay experienced by both high and normal priority packets in PJ is less than
that of standard slotted CSMA/CA (std).
Figure 5: Throughput vs. inter-arrival time (throughput in kbps versus packet inter-arrival time; curves: std, and PJ total/high/normal at 10/30/50%).
Fig. 5 illustrates the throughput of PJ for high priority, normal priority, and total traffic (the sum of the high and normal priority throughputs), and compares them with the throughput of standard IEEE 802.15.4. The throughput for high priority packets increases monotonically as the inter-arrival time decreases, because the high priority packets genuinely have priority access to the channel through priority jamming. The total throughput with PJ outperforms that of standard IEEE 802.15.4 due to the decreased collision probability mentioned earlier. Why does the delay with standard CSMA/CA diverge when λ = 0.18? The reason is that the throughput, 81.2 kbps, is smaller than the generated traffic, 90.7 kbps. In contrast, the throughputs with PJ are 91.8 kbps (50% high priority), 92.6 kbps (30% high priority) and 92.9 kbps (10% high priority), all higher than the generated traffic, so the delays with PJ remain bounded.
Figure 6: Energy-per-bit vs. inter-arrival time (energy per bit in µJ/bit versus packet inter-arrival time; curves: std, and PJ high/normal at 10/30/50%).
Fig. 6 compares the energy efficiency of PJ and standard CSMA/CA using the energy consumed for transmitting one bit. The power consumptions of the idle, transmit and receive states are set to P_idle = 712 µW, P_tx = 31.32 mW and P_rx = 35.28 mW, respectively [5]. Because the PJ scheme has energy consuming elements such as the jamming signal (for nodes with high priority packets) and the longer CCA time (for nodes with normal priority packets), the energy consumption of PJ is higher than that of standard CSMA/CA. However, the µJ/bit differences are relatively small, because the throughput with PJ is increased and the energy-per-bit values are of the same order of magnitude.
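The energy-per-bit metric of Figure 6 amounts to the bookkeeping sketched below; only the state powers are taken from [5], while the state durations in the example are illustrative placeholders rather than simulation output.

P_IDLE, P_TX, P_RX = 712e-6, 31.32e-3, 35.28e-3   # state powers in watts [5]

def energy_per_bit(t_idle, t_tx, t_rx, bits):
    """Energy spent across the radio states divided by the delivered bits (J/bit)."""
    return (P_IDLE * t_idle + P_TX * t_tx + P_RX * t_rx) / bits

# illustrative state times for delivering one 102-byte packet (placeholders)
print(energy_per_bit(t_idle=0.010, t_tx=0.004, t_rx=0.001, bits=102 * 8))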
4 Conclusion
In this paper, a new service differentiation scheme called priority jamming (PJ) is proposed for IEEE 802.15.4. The main idea of the proposed priority jamming is to defer the transmissions of normal priority packets using a jamming signal transmitted by high priority nodes that have packets ready to transmit. By using the proposed scheme, the average delay of high priority packets is reduced. In addition, the net throughput of IEEE 802.15.4 is increased and the delay of normal priority packets is decreased, because the collision probabilities among IEEE 802.15.4 nodes are lowered.
Although PJ consumes more energy per packet transmission, because of energy consuming factors such as the jamming signal transmission and the longer CCA time, PJ shows better performance in both delay and throughput. Therefore, the energy-per-bit differences between PJ and standard IEEE 802.15.4 are relatively small. The proposed algorithm can be used in both the beacon-enabled and non-beacon-enabled modes of IEEE 802.15.4 and guarantees backward compatibility with the legacy IEEE 802.15.4 standard. By providing service differentiation using PJ, this paper may contribute to enlarging the delay-sensitive application area of IEEE 802.15.4, such as emergency alarms and intruder detection.
Acknowledgments
This paper was supported by Research Fund, Kumoh National Institute of Technology.
Bibliography
[1] IEEE 802.15 TG, IEEE Std.802.15.4: IEEE Standard for Wireless Medium Access Control
(MAC) and Physical Layer (PHY) Specifications for Low-Rate Wireless Personal Area Networks (LR-WPANs), IEEE Standard, 2006
[2] A. Koubaa, M. Alves, E. Tovar, Guaranteeing Real-Time Services for Industrial Wireless Sensor Networks With IEEE 802.15.4, IEEE Transactions on Industrial Electronics, 57(11):3868-3876, 2010.
[3] M. J. Kim, C. H., Priority-Based Service-Differentiation Scheme for IEEE 802.15.4 Sensor Networks in Nonsaturation Environments, IEEE Transactions on Vehicular Technology,
59(7):3524-3535, 2010.
[4] T. H. Kim, S. Choi, Priority-Based Delay Mitigation for Event Monitoring IEEE 802.15.4
LR-WPANs, IEEE Communications Letters, 7(3):213-215, 2006.
[5] B. Bougard, F. Catthoor, D. Daly, A. Chandrakasan, W. Dehaene, Energy efficiency of the
IEEE 802.15.4 standard in dense wireless microsensor networks: Modeling and improvement
perspectives, IEEE Proceedings of DATE, 196-201, 2005
INT J COMPUT COMMUN, ISSN 1841-9836
8(1):136-145, February, 2013.
Efficiency Consideration for Data Packets Encryption within
Wireless VPN Tunneling for Video Streaming
D. Simion, M.F. Ursuleanu, A. Graur, A.D. Potorac, A. Lavric
Daniel Simion, Mihai Florentin Ursuleanu
Adrian Graur, Alin Dan Potorac, Alexandru Lavric
"Stefan cel Mare" University of Suceava
Universitatii Street, No.13, RO-720229, Suceava, Romania
E-mail: [email protected], [email protected]
[email protected], [email protected], [email protected]
Abstract:
With the help of the Internet, today we can communicate with anyone, from any place, and access all types of data with a high level of QoS. This mobility is available to legitimate users as well as to illegitimate ones; for this reason we need extra data security. A solution for QoS and data confidentiality is the Virtual Private Network (VPN), a way to reduce operational costs, increase productivity, simplify the network topology and extend the area of connectivity. Video data packets must arrive with a constant and low delay, at the same rate, in order to obtain a real-time transmission. This paper presents an analysis of the different protocols used and of the way video data packets are encapsulated and encrypted for a high level of QoS in a VPN connection.
Keywords: encryption, IPSec, L2TP, VPN, videostreaming, wireless tunneling.
1 Introduction
Sending video streams over IP networks is not a trivial problem, even more so over a wireless network. In the 802.11 wireless standard, timing intervals and additional overheads are mandatory for each carried data packet. Attacks often compromise the network availability of the application, and data integrity and confidentiality can be compromised by unauthorized external access, for example by hackers who can modify data content and databases.
The following research is part of a larger project that aims at optimizing data communication. The research focuses mainly on technologies like video streaming, VoD, VoIP and IPTV.
A Virtual Private Network (VPN) is used to avoid DoS (Denial of Service) attacks, eavesdropping, masquerading and traffic analysis; to reduce operational costs and to increase productivity; and to simplify the network topology and extend the geographical connectivity area without adding other costs.
A VPN security solution offers two major advantages: network scalability and a low implementation cost. The client does not have to rent other networks to cover all of the company’s locations; he can connect them through a local connection to any licensed ISP (Internet Service Provider), at the rate required by that provider.
Also, the client does not need remote access servers. In order to connect two locations, a company will use a single dedicated line (see Figure 1), but as the number of work points multiplies, so do the connection costs (a full mesh of n locations requires n(n-1)/2 lines). For example, a company with 4 work points needs 6 dedicated lines to interconnect them, while for 6 work points it would use 15 dedicated lines, a fact that negatively influences the QoS of the video stream transmission.
Copyright © 2006-2013 by CCC Publications
2 Virtual Private Network Tunneling
A VPN is a private and secure connection [1] between two or more networks or computers that share protected data, using a single secure channel between the endpoints, over a public data network (for example a WAN) or through the Internet. Tunneling represents the ability to make circuit-oriented connections over packet-oriented WAN topologies; this process is the main technical concept of VPN.
Figure 1: Virtual Private Network Tunneling concept
Unlike packet-oriented protocols such as IP, which can send data packets on different routes to a common destination, a tunnel represents a dedicated virtual circuit between two endpoints of a communication network. Since this process takes place over a shared network and tunneling can be implemented at a moderate technological level, VPN is economically efficient, with implementation costs between those of packet-based communications and leased-line communications.
VPN was created not to replace the other security mechanisms of the IEEE 802.11 standard, but to complement them.
The main modes of use supported by VPN are:
• LAN-to-LAN internetworking;
• Controlled access within an intranet;
• Internet remote access client connections.
Protocols based on the OSI model (at the data link layer and network layer) have been implemented in VPN tunneling: at layer 2, VPN sends data in frames, while at layer 3 it sends data in packets.
Figure 2 shows a representation of the VPN protocols on the OSI model.
Point-to-Point Tunneling Protocol (PPTP) encapsulates PPP frames in IP datagrams for transmission over IP internetworks. For tunnel maintenance, PPTP uses a TCP connection; to encapsulate PPP frames as tunneled data, a modified version of GRE (Generic Routing Encapsulation) is used. The content encapsulated in the PPP frames can be compressed or encrypted (see Figure 3).
H = H_GRE + H_PPP + H_IP    (1)
Layer Two Tunneling Protocol (L2TP) encapsulates PPP frames, encrypted and/or compressed, which can be sent over X.25, ATM, Frame Relay or IP networks [2]. For a security-enabled tunnel, the L2TP protocol can be combined with IPSec. L2TP tunneled data uses UDP to send the L2TP-encapsulated PPP frames (see Figure 4).
Figure 2: VPN protocols on OSI Model
Figure 3: PPTP Tunnel Data Frame Format
Figure 4: L2TP Tunnel Data Frame Format
H = H_IP + H_UDP + H_L2TP + H_PPP    (2)
Internet Protocol Security (IPSec) is a collection of multiple related protocols. It can be used as a complete VPN protocol solution or simply as the encryption scheme within L2TP or PPTP. IPSec ESP (Encapsulating Security Payload) encrypts the L2TP packet. Known as the strongest authentication and encryption method, IPSec works at layer 3 of the OSI network model.
An alternative to IPSec is the SSL VPN, which operates at a higher layer than IPSec and offers network administrators greater control over access to different network resources. SSL (Secure Socket Layer) enables secure data transactions and relies on several security measures, such as private or public keys and digital certificates [3]. Using SSL security encryption in a WLAN environment forces a mobile wireless device to authenticate itself before any data transactions.
Figure 5: L2TP/IPSec Tunnel Data Frame Format
H = H_IP + H_ESP + H_UDP + H_L2TP + H_PPP    (3)
Due to the standardization of tunneling protocols, they become vulnerable to firewall stopping and blocking at any level. VPN uses encrypting routers to prevent unauthorized access to the data being sent in a communication transmission, also limiting third-party access to the network connection.
There are many types of encryption algorithms. In most algorithms, the original data is
encrypted using a certain encryption key, and only the receiving computer or the recipient user can
decrypt the message using a specific decryption key. SSL, DES and PGP are some encryption
algorithms that create or change these keys. The authentication and encryption used in a VPN depend
on the implementation. Implementations like PPTP use the RC4 algorithm with 40/56/128-bit keys, while
L2TP and IPSec can use a wide range of encryption algorithms, such as AES with 128/192/256-bit keys,
DES with 56-bit keys and 3DES with 168-bit keys.
In VPN, PPTP encryption is weak, sending distributed passwords in the clear. Unlike PPTP,
L2TP utilizes server-client digital certificates based on PKI (Public Key Infrastructure). Some
IPSec solutions offer the option of using pre-shared keys or PKI digital certificates [4].
When considering a secure wireless VPN connection:

H_{SEC} = H + H_{SOH}    (4)

where H_{SOH} is the security overhead; we have to take into account supplementary overheads
of 20 bytes for WPA TKIP, 8 bytes for WEP and 16 bytes for WPA CCMP [5, 6].
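To illustrate how these overheads combine, the following sketch (our illustration; the byte values are the ones quoted in this paper, and the table and function names are ours) computes H_SEC for a given tunneling protocol and wireless security scheme:

```python
# Sketch of equation (4): H_SEC = H + H_SOH. The overhead values (in bytes)
# are the maxima quoted in this paper; names are ours, not from the paper.

TUNNEL_OVERHEAD = {
    "PPTP": 32,    # eq. (1): H_GRE + H_PPP + H_IP, maximum reported value
    "L2TP": 60,    # eq. (2): H_IP + H_UDP + H_L2TP + H_PPP
    "IPSec": 76,   # eq. (3): H_IP + H_ESP + H_UDP + H_L2TP + H_PPP
}

SECURITY_OVERHEAD = {"WEP": 8, "WPA-TKIP": 20, "WPA-CCMP": 16}  # H_SOH

def h_sec(tunnel: str, security: str) -> int:
    """Total per-packet overhead of a secure wireless VPN connection."""
    return TUNNEL_OVERHEAD[tunnel] + SECURITY_OVERHEAD[security]

print(h_sec("PPTP", "WPA-CCMP"))  # -> 48 bytes of combined overhead
```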
3 Practical Approach
This practical approach evaluates the theoretical approach presented above, in a
stable environment (without electromagnetic pollution). Measurements were made using
different communication scenarios.
In the first scenario, we sent a video stream of 1.164 GB of data through our
private VPN, which uses the PPTP protocol, between two clients (n = 1, where n is the
number of wireless clients); one has an Ethernet connection to the VPN server and the other has a
wireless connection.
Figure 6: LAN-WLAN VPN Connections
For the first scenario, the measured throughput ranged from 2.008 Mbps to 10.036 Mbps
(see Figure 7).
Figure 7: Minimum and Maximum values for LAN-WLAN VPN connection
Figure 8 shows a network report for the first scenario, the LAN-WLAN VPN connection.
Excluding IP overheads, for PPTP we measured an overhead length between 26 and 32
bytes [7].
For an average upload speed of 5.398 Mbps we have 674.75 KB/s (Up_s = 674.75 KB/s).
We have an upload rate (Up_r) of:
Up_r = \frac{Up_s}{MTU}    (5)
where MTU is the Maximum Transmission Unit, i.e., the maximum size of the IP packet.
Normally, without the PPTP VPN tunneling protocol, according to equation (5), we have an
upload rate of 449.83 packets/s:
Figure 8: Network report for LAN-WLAN VPN connection
Up_r = \frac{Up_s}{MTU + PPTP_{overheads}}    (6)
Using equation (6), for the maximum PPTP overhead length (32 bytes) used for VPN tunneling,
the packet rate decreases to 440.43 packets/s.
At the maximum IP packet size (1.5 KB), the loss of 32 bytes equals 2.09% of the bandwidth.
For encapsulating a 1.5 KB IP packet into L2TP, the packet becomes 1.54 KB (1.5 KB + 0.04 KB
of UDP, IP and L2TP headers). Sending data packets over Ethernet requires fragmentation of
the initial data into at most 1.5 KB per frame, so the packet will be fragmented.
The first fragment has 1.5 KB of data (1.46 KB from the original IP packet and 0.04 KB from
the L2TP encapsulation).
The second has 0.06 KB (0.02 KB of IP overhead and the last 0.04 KB of the original
IP packet). Of the whole packet, only the first fragment contains the L2TP header; the second
fragment has only an IP header. Regardless of the L2TP client type (LNS or LAC), the peer
reassembles the two fragments back into the original 1.54 KB packet.
Up_r = \frac{Up_s}{MTU + L2TP_{overheads}}    (7)
When we used the L2TP VPN protocol, the packet rate decreased to 438.15 packets/s,
compared with the PPTP case, as concluded from (7).
At the maximum IP packet size (1.5 KB) we have a loss of 60 bytes, equalling 3.89% of the
bandwidth. When we used IPSec VPN encapsulation, the packet rate decreased to 433.64
packets/s.
Up_r = \frac{Up_s}{MTU + IPSec_{overheads}}    (8)
In this case we have a 76-byte loss (about 4.88% of the bandwidth) at the maximum IP packet
size (1.5 KB) (see Figure 9).
Figure 9: Bandwidth loss when using VPN Tunneling Protocols
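To make the arithmetic of equations (5)-(8) easy to reproduce, here is a small sketch (ours; Up_s, the MTU and the overhead sizes are the values used in this section, while the helper names are ours):

```python
# Sketch of equations (5)-(8): packet rate and relative bandwidth loss.
# Up_s = 674.75 KB/s and MTU = 1.5 KB are the values used in this section.

UP_S = 674.75   # average upload speed, KB/s (5.398 Mbps)
MTU = 1.5       # maximum IP packet size, KB

def packet_rate(overhead_kb: float = 0.0) -> float:
    """Up_r = Up_s / (MTU + overheads), in packets per second."""
    return UP_S / (MTU + overhead_kb)

def bandwidth_loss(overhead_kb: float) -> float:
    """Fraction of the bandwidth consumed by the tunneling overhead."""
    return overhead_kb / (MTU + overhead_kb)

print(packet_rate())                    # no tunneling: ~449.83 packets/s
print(packet_rate(0.032))               # PPTP, 32 bytes: ~440.4 packets/s
print(f"{bandwidth_loss(0.032):.2%}")   # ~2.09% of the bandwidth lost
```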
In infrastructure wireless LANs with one access point (AP), the data frames do not travel
directly among clients. Wireless clients send the data frame to the AP, and the AP then resends
the payload of the original data frame, packed in a new data frame, to the receiving
client. The AP bandwidth, and the radio space, is shared among the AP's radio clients, and the
available user bandwidth is thus split among those clients [8].
In the second scenario, we sent a video stream through our private VPN, which uses
the PPTP protocol, between two wireless clients (n = 2, where n is the number of wireless
clients) (see Figure 10).
The link utilization factor is reflected in the efficiency of the communication channel. This
can be viewed as the ratio between the total time the channel is busy and the time spent
sending the data payload. The channel efficiency is the ratio between the payload (the useful
bits of information) and all the bits sent. For the ideal channel, the efficiency is:
Ef = \frac{L}{L + H}    (9)
where Ef is the channel efficiency, L is the number of useful data bits and H is the number of overhead bits.
Figure 10: WLAN-WLAN VPN Connections
For the second scenario, the measured throughput ranged from 1.857 Mbps to 14.326 Mbps
(see Figure 11).
Each packet has 8L useful bits. The total number of successfully sent payload bits is given by
S * N * 8L, where S denotes the fraction of packets successfully received, and the total number
of transmitted bits can be calculated using the relation N * 8 * (L + H).
Figure 11: Min and Max values for WLAN-WLAN VPN connection
Supposing that in a unit of time N data packets pass through the VPN channel, and that a
part of them are successfully received by the other VPN client, the channel efficiency (Ef'),
is (10):

Ef' = \frac{N \cdot 8L \cdot (1-p)^{8(L+H)}}{N \cdot 8 \cdot (L+H)} = \frac{L}{L+H} \cdot (1-p)^{8(L+H)}    (10)

where p is the bit error probability.
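A numerical sketch of equations (9) and (10) (ours; the L, H and p values below are illustrative, not measurements from this paper):

```python
# Sketch of equations (9)-(10): channel efficiency without and with bit errors.
# L and H are in bytes (hence the factor 8), p is the bit error probability.

def efficiency_ideal(L: int, H: int) -> float:
    """Eq. (9): Ef = L / (L + H)."""
    return L / (L + H)

def efficiency_with_errors(L: int, H: int, p: float) -> float:
    """Eq. (10): Ef' = L / (L + H) * (1 - p)^(8 (L + H))."""
    return efficiency_ideal(L, H) * (1.0 - p) ** (8 * (L + H))

print(efficiency_ideal(1460, 40))              # ~0.973 on an ideal channel
print(efficiency_with_errors(1460, 40, 1e-6))  # bit errors reduce Ef further
```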
The maximum size of a data packet sent in a wireless environment is over 50% larger than
the maximum packet size sent on Ethernet networks. The maximum size of a data packet sent
unencrypted in a wireless environment, under ideal conditions, is 2.304 KB.
For the ideal VPN wireless channel, with one wireless VPN client, the efficiency is (11):
Ef = \frac{N_w}{L_w + H_{VPN}}    (11)
where N_w is the number of frames in a unit of time, L_w is the number of useful data bits in
the wireless medium and H_{VPN} is the VPN overhead.
In simple wireless video streaming, if no error occurs, the efficiency is (12):

Ef = \frac{N_w}{L_w}    (12)
From our scenarios we found that 7.54% of the maximum data packet unit is consumed by
wireless VPN data packaging and 4.64% of the maximum data packet unit is consumed by plain
wireless data packaging (see Figure 12).
In VPN the bandwidth reservation can be a challenge because of the unknown load distribution in a point-to-point connection [9].
As argued in [10], the performance of IPSec security encapsulation in VPN can be characterized
in terms of the average added overhead.
Figure 12: Overheads contribution in Wireless and Wireless VPN environment
4 Conclusions and Future Works
This paper evaluates the data communication efficiency for continuous data streaming under
different scenarios in a wireless environment using a VPN solution. The results of the research
can be considered as a basis for the implementation of new solutions in the field of data
streaming using heterogeneous communication media and technologies.
Using a wireless environment instead of an Ethernet solution for sending video streaming data
packets, we lose approximately 34.89% of the whole packet sent. When we used VPN for video
streaming, we lost a further 2.9% of the packet sent. The biggest WLANs have about 100 nodes;
one way to extend them is by using VPN tunneling.
With WiMAX and LTE technologies, VPN video data transmission speeds will increase,
including both "best-effort" and priority-based scalable QoS solutions. A loss of only 2.09%
of the packet size through VPN encapsulation is a price worth paying for a secure
connection between two work points.
We conclude that, under the given conditions, we achieved better speeds in the WLAN-WLAN
video streaming scenario when we used the PPTP tunneling protocol than with the L2TP and
IPSec VPN tunneling protocols.
In our next papers we will present a study of the WiMAX and LTE video packet frame
structure for downlink and uplink, using specific simulation software. We will simulate a wireless
VPN communication network with the two technologies, WiMAX and LTE, on high-performance
hardware to obtain maximum data transfer speeds. Future papers will analyze the effects of
WiMAX and/or LTE on live video streams, IPTV streams and multimedia.
Acknowledgements
This paper was supported by the project "Knowledge provocation and development through
doctoral research PRO-DOCT - Contract no. POSDRU/88/1.5/S/52946", project co-funded
from the European Social Fund through the Sectoral Operational Program Human Resources
2007-2013, and by the project "Improvement of the doctoral studies quality in engineering science
for development of the knowledge based society - QDOC", contract no. POSDRU/107/1.5/S/78534,
project co-funded by the European Social Fund through the Sectoral Operational Program
Human Resources 2007-2013.
Bibliography
[1] Shihyon P., Bradley M., D'Amours D., William J. McIver Jr., Characterizing the Impacts of
VPN Security Models on Streaming Video, Communication Networks and Services Research
Conference (CNSR), Montreal, QC, Canada, ISBN: 978-1-4244-6248-3, pp. 152-159, 2010.
[2] Townsley W., Valencia A., Rubens A., Pall G., Zorn G., Palter B., Layer two tunneling
protocol (L2TP), RFC 2661, 1999.
[3] Hamzeh K., Pall G., Verthein W., Taarud J., Little W., Zorn G., Point-to-point tunneling
protocol (PPTP), RFC 2637, 1999.
[4] Prasad A.R., Prasad N.R., 802.11 WLANs and IP networking: Security, QoS and mobility,
Boston: Artech House, ISBN 1-58053-789-8, 2005.
[5] Potorac A.D., Considerations on VoIP Throughput in 802.11 Networks, Advances in Electrical
and Computer Engineering - AECE, ISSN: 1582-7445 e-ISSN: 1844-7600, 9(3):45-50, 2009.
[6] Khan M.A.U., Khan T.M., Khan R.B., Kiyani A., Khan M.A. Noise Characterization in
Web Cameras using Independent Component Analysis, INT J COMPUT COMMUN, ISSN
1841-9836, 7(2):302-311, 2012.
[7] Hossein B., The Internet encyclopedia, ISBN 0-417-22201-1, vol.3:425-428, 2004.
[8] Potorac A.D., Coca E. QoS Consideration for 802.11 Networks, European Conference on
the Use of Modern Information and Communication Technologies - ECUMICT 2006, Ghent,
Belgium, 30-31 March 2006, ISBN 9-08082-552-2, pp. 45-50, 2006.
[9] Volner R., Smrz V., Virtual Private Networks-Based Home System, Electronics and Electrical
Engineering - Kaunas: Technologija, ISSN 1392-1215, 8(96):62-64, 2009.
[10] Berioli M., Trtta F., IP mobility support for IPsec-based virtual private networks: An
architectural solution, 3rd IEEE Global Telecommunications Conference - GLOBECOM '03,
Conference Publications, ISBN: 0-7803-7974-8, 3:1532-1536, 2003.
INT J COMPUT COMMUN, ISSN 1841-9836
8(1):146-152, February, 2013.
Solving Method for Linear Fractional Optimization Problem with
Fuzzy Coefficients in the Objective Function
B. Stanojević, M. Stanojević
Bogdana Stanojević
Mathematical Institute of the Serbian Academy of Sciences and Arts
36 Kneza Mihaila, 11001 Belgrade, Serbia
E-mail: [email protected]
Milan Stanojević
University of Belgrade,
Faculty of Organizational Science
154 Jove Ilića, 11000 Belgrade, Serbia
E-mail: [email protected]
Abstract:
The importance of linear fractional programming comes from the fact that many real
life problems are based on the ratio of physical or economic values (for example cost/time, cost/volume, profit/cost or any other quantities that measure the efficiency of a
system) expressed by linear functions. Usually, the coefficients used in mathematical
models are subject to errors of measurement or vary with market conditions. Dealing
with inaccuracy or uncertainty of the input data is made possible by means of the
fuzzy set theory.
Our purpose is to introduce a method of solving a linear fractional programming
problem with uncertain coefficients in the objective function. We have applied recent
concepts of fuzzy solution based on α-cuts and Pareto optimal solutions of a bi-objective optimization problem.
As far as solving methods are concerned, the linear fractional programming, as an
extension of linear programming, is easy enough to be handled by means of linear
programming but complicated enough to elude a simple analogy. We follow the construction of the fuzzy solution for the linear case introduced by Dempe and Ruziyeva
(2012), avoid the inconvenience of the classic weighted sum method for determining Pareto optimal solutions and generate the set of solutions for a linear fractional
program with fuzzy coefficients in the objective function.
Keywords: fuzzy programming, fractional programming, multi-objective programming.
1 Introduction
In the present paper the fuzzy linear fractional optimization problem (FOLFP), with fuzzy
coefficients in the objective function, is considered. The importance of linear fractional programming comes from the fact that many real
economic values (for example cost/time, cost/volume, profit/cost or any other quantities that
measure the efficiency of a system) expressed by linear functions. Usually, the coefficients used
in mathematical models are subject to errors of measurement or vary with market conditions.
Dealing with inaccuracy or uncertainty of the input data is made possible by means of the fuzzy
set theory.
In [6] a basic introduction to the main models and methods in fuzzy linear programming is
presented and, as a whole, linear programming problems with fuzzy costs, fuzzy constraints and
fuzzy coefficients in the constraint matrix are analyzed.
[9] presents a brief survey of the existing works on comparing and ranking any two interval
numbers on the real line and then, on the basis of this, gives two approaches to compare any two
interval numbers. [1] generalizes concepts of the solution of the linear programming problem with
interval coefficients in the objective function based on preference relations between intervals and
it unifies them into one general framework together with their corresponding solution methods.
Dempe and Ruziyeva [2] solved the linear programming problem with triangular fuzzy coefficients in the objective function. They derived explicit formulas for the bounds of the intervals
used for defining the membership function of the fuzzy solution of linear optimization, and determined all efficient points of a bi-objective linear programming problem by means of weighted sum
optimization. Their approach builds on Chanas and Kuchta's method [1], which calculates
the sum of lengths of certain intervals.
The state of the art in the theory, methods and applications of fractional programming is
presented in Stancu-Minasian's book [8]. Multicriteria optimization problems are widely discussed in
[4]. Many mathematical models consider multiple criteria, and a large variety of solution methods
are introduced in recent literature (see for instance [3, 5]). [7] introduces a linear programming
approach to test efficiency in multi-objective linear fractional programming problems.
Interpreting uncertain coefficients in FOLFP as fuzzy numbers, we derive formulas for the
α-cut intervals that describe the fuzzy fractional objective function. Operating with intervals,
we construct a bi-objective parametric linear fractional programming problem (BOLFP). We
use the procedure introduced by Lotfi, Noora et al. to generate all efficient points of (BOLFP).
The membership value of a feasible solution is calculated as the cardinality of the set of parameters
for which the feasible solution is efficient in (BOLFP). In this way, the fuzzy optimal solution
is obtained as a fuzzy subset of the feasible set of (FOLFP) and the decision-maker will have
the opportunity to choose the most convenient crisp solution among those with the highest
membership value.
In Section 2 we formulate the fuzzy optimization problem FOLFP, set up notation and
terminology, construct equivalent bi-objective linear fractional programming problem BOLFP,
and describe a new way to generate efficient points for BOLFP. A procedure to compute the
membership function of each feasible solution is given in Section 3. We give an example of FOLFP
and its fuzzy solution obtained by applying the new solving method in Section 4. Conclusions
and future works are presented in Section 5.
2 Fuzzy linear fractional optimization problem
The linear fractional programming problem with fuzzy coefficients in the objective function
is

\max_{x \in X} \frac{\tilde{c}^T x + \tilde{c}_0}{\tilde{d}^T x + \tilde{d}_0}    (1)

where X = \{x \in R^n \mid Ax \le b, x \ge 0\}, A is the m \times n matrix of the linear constraints, b \in R^m is
the vector representing the right-hand side of the constraints, x is the n-dimensional vector of
decision variables, and \tilde{c}, \tilde{d} \in R^n, \tilde{c}_0, \tilde{d}_0 \in R represent the fuzzy coefficients of the objective function.
2.1 Equivalent problems
We replace Problem (1) by Problem (2) that describes the maximization of the α-cut intervals
of the objective function over the initial feasible set.
\max_{x \in X} \left[ \frac{c_L(\alpha)^T x + c_L^0(\alpha)}{d_R(\alpha)^T x + d_R^0(\alpha)}, \; \frac{c_R(\alpha)^T x + c_R^0(\alpha)}{d_L(\alpha)^T x + d_L^0(\alpha)} \right]    (2)
where c_L(α) (c_R(α)), d_L(α) (d_R(α)), c_L^0(α) (c_R^0(α)), d_L^0(α) (d_R^0(α)) represent the vectors of the
left endpoints (respectively right endpoints) of the α-cut intervals of the fuzzy coefficients of the objective
function. For a deeper discussion of α-cut intervals we refer the reader to [10] and [11].
According to Chanas and Kuchta [1], an interval [a, b] is smaller than an interval [c, d] if and
only if a ≤ c and b ≤ d, with at least one strict inequality. In this way, for each fixed α-cut,
Problem (2) is equivalent to the bi-objective linear fractional programming problem (3)
\max \frac{c_L(\alpha)^T x + c_L^0(\alpha)}{d_R(\alpha)^T x + d_R^0(\alpha)}, \quad \max \frac{c_R(\alpha)^T x + c_R^0(\alpha)}{d_L(\alpha)^T x + d_L^0(\alpha)}    (3)

subject to x \in X.
2.2 The special case of triangular fuzzy numbers
Let us now consider a subclass of fuzzy numbers: the continuous triangular fuzzy numbers,
represented by a triple (c_L, c_T, c_R) [2]. In this case c_L(α) = α c_T + (1 − α) c_L and c_R(α) =
α c_T + (1 − α) c_R. By simple analogy, similar formulas are derived for d_L(α), d_R(α), c_L^0(α),
c_R^0(α), d_L^0(α) and d_R^0(α). Using these formulas, Problem (4) is constructed in order to be solved
instead of Problem (3).
\max \frac{(\alpha c_T + (1-\alpha) c_L)^T x + \alpha c_T^0 + (1-\alpha) c_L^0}{(\alpha d_T + (1-\alpha) d_R)^T x + \alpha d_T^0 + (1-\alpha) d_R^0}, \quad \max \frac{(\alpha c_T + (1-\alpha) c_R)^T x + \alpha c_T^0 + (1-\alpha) c_R^0}{(\alpha d_T + (1-\alpha) d_L)^T x + \alpha d_T^0 + (1-\alpha) d_L^0}    (4)

subject to x \in X.
So far, the construction is similar to the construction given in [2] for the linear case. The
analogy cannot continue, due to the fact that a weighted sum of linear fractional objectives is
no longer a linear fractional objective. That is why we formulate a new method for generating
efficient points for Problem (4) in the next section.
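As a small numerical illustration of the α-cut formulas above (a sketch of ours; the triple below instantiates the pattern (c − 1, c, c + 5) from Section 4 with a hypothetical c = 10):

```python
# Sketch: alpha-cut endpoints of a triangular fuzzy number (c_L, c_T, c_R),
# following the formulas of Section 2.2.

def alpha_cut(cL: float, cT: float, cR: float, alpha: float) -> tuple:
    """Return (c_L(alpha), c_R(alpha)) for a triangular fuzzy number."""
    left = alpha * cT + (1.0 - alpha) * cL
    right = alpha * cT + (1.0 - alpha) * cR
    return left, right

# Hypothetical coefficient (c - 1, c, c + 5) with c = 10, at alpha = 0.2:
print(alpha_cut(9.0, 10.0, 15.0, 0.2))  # -> (9.2, 14.0)
```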
2.3 The generation of efficient points
Solving a multiple-objective linear fractional problem by optimizing the weighted sum of
the objective functions is not impossible but it is quite complicated. Instead of that, we will
construct the convex combination of the marginal solutions of Problem (4), and use each point
in the combination to generate an efficient point. By our method all points on the segment that
connects the two marginal solutions are mapped to the set of all Pareto optimal solutions of
Problem (4). The generation method is based on a linear procedure (proposed by Lotfi, Noora
et al. [7]) that determine whether a feasible point is efficient for a linear fractional programming
problem or not. Theorem 1 reformulates the results introduced in [7] by applying them to the
bi-objective linear fractional programming problem (3).
Theorem 1. (adapted from [7]) For arbitrarily fixed α ∈ [0, 1], x∗ ∈ X is a weakly efficient
solution in (3) if and only if the optimal value of problem (5) below is zero.
\max t

s.t.

0 \le t \le d_1^- + d_1^+,
0 \le t \le d_2^- + d_2^+,
c_L(\alpha)^T x + c_L^0(\alpha) - d_1^+ = \left( c_L(\alpha)^T x^* + c_L^0(\alpha) \right) \theta_1,
d_R(\alpha)^T x + d_R^0(\alpha) + d_1^- = \left( d_R(\alpha)^T x^* + d_R^0(\alpha) \right) \theta_1,
c_R(\alpha)^T x + c_R^0(\alpha) - d_2^+ = \left( c_R(\alpha)^T x^* + c_R^0(\alpha) \right) \theta_2,
d_L(\alpha)^T x + d_L^0(\alpha) + d_2^- = \left( d_L(\alpha)^T x^* + d_L^0(\alpha) \right) \theta_2,    (5)
x \in X,
d_1^+, d_1^-, \theta_1, d_2^+, d_2^-, \theta_2 \ge 0.
We have obtained the following theoretical result that will be needed in Section 3.
Proposition 2. For an arbitrary fixed α ∈ [0, 1] and for any convex combination x^* ∈ X of the
marginal solutions of (3), the components of x in the optimal solution of (5) represent a weakly
efficient point of (3).
We give only the main ideas of the proof. We have to construct the cone that contains all
points that dominate the given point on the segment that connects the two marginal solutions.
One particular case is presented in Figure 1 but the basic idea of the proof can be drawn from it.
Let us consider that the feasible set of Problem (3) is the quadrilateral ABCD and for a given
value of α, rotational points of the two objective functions are E and F respectively. Marginal
solutions are B for the first objective and D for the second one. G is an arbitrary point on the
segment BD. The value of the first objective function is constant on the line EG and can be
improved by rotating EG anticlockwise toward EB. On EB the maximal value is reached. The
value of the second objective function is constant on F G and can be improved by rotating F G
clockwise toward F D. On F D the maximal value is reached. Hence, all points contained in the
cone EGF dominate point G. By solving Problem (5) with starting point G, the feasible points
from the cone EGF are analyzed and an efficient point from the cone is selected.
A more complete theory may be obtained by deriving conditions under which Theorem 1 can
be used in generating all efficient points for a multiple-objective linear fractional programming
problem.
3 Solving method
In this section we propose a procedure to compute the membership function of each feasible
solution of Problem (1). The procedure is essentially based on the new method of generation of
efficient points for the bi-objective linear fractional programming problem (3).
• For each α ∈ [0, 1]:
– Initialize ψ(α) = ∅.
– Find the marginal solutions x_α^1 and x_α^2 of (3) and construct their convex combination
x_α(λ) = λ x_α^1 + (1 − λ) x_α^2, λ ∈ [0, 1].
– For each λ ∈ [0, 1], solve Problem (5) with x^* = x_α(λ). Due to Theorem 1, an efficient
point x_{αλ}^{eff} for Problem (3) is obtained. Set ψ(α) = ψ(α) ∪ {x_{αλ}^{eff}}.
• For each x ∈ X calculate its membership value μ(x) = card{α | x ∈ ψ(α)}.
Practically, ψ(α) represents the set of Pareto optimal solutions of (3) for the corresponding parameter
α. Each feasible solution of (1) for which at least one α ∈ [0, 1] exists such that it is efficient in
(3) has a non-zero membership value in the fuzzy solution of (1).
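A high-level sketch of this procedure with discretized α and λ (ours; solve_marginal and solve_problem_5 are hypothetical stubs standing in for the fractional-programming and linear-programming solvers, which the paper does not specify as code):

```python
# Sketch of the solving method of Section 3. solve_marginal(alpha, i) should
# maximize the i-th ratio of (3) over X; solve_problem_5(alpha, x_star)
# should solve the LP test (5) and return an efficient point. Both are
# hypothetical placeholders.
import numpy as np

def fuzzy_solution(solve_marginal, solve_problem_5, steps=101):
    alphas = np.linspace(0.0, 1.0, steps)
    lambdas = np.linspace(0.0, 1.0, steps)
    psi = {}                                   # alpha -> set of efficient points
    for alpha in alphas:
        psi[alpha] = set()
        x1 = solve_marginal(alpha, 1)          # marginal solution, first ratio
        x2 = solve_marginal(alpha, 2)          # marginal solution, second ratio
        for lam in lambdas:
            x_star = lam * x1 + (1.0 - lam) * x2    # convex combination
            x_eff = solve_problem_5(alpha, x_star)  # efficient point (Theorem 1)
            psi[alpha].add(tuple(np.round(x_eff, 6)))
    def mu(x):
        """Membership: measure of the alphas for which x lies in psi(alpha)."""
        key = tuple(np.round(x, 6))
        return sum(key in psi[a] for a in alphas) / len(alphas)
    return mu
```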
4 Example
To complete the discussion, we describe the procedure for finding the fuzzy solution of one simple
example.
\max \frac{-\tilde{1} x_1 + \tilde{10} x_2 + \tilde{4}}{\tilde{2} x_1 + \tilde{5} x_2 + \tilde{1}}    (6)
subject to
x1 ≥ 1,
−x1 + 2x2 ≤ 1,
2x1 + x2 ≤ 8,
2x2 ≥ 1,
x1 , x2 ≥ 0.
Figure 1: Feasible set of Problem (6) and efficient points for Problem (4) for α = 0.2
Figure 1 describes the feasible set of the problem as the quadrilateral ABCD.
First, the fuzzy coefficients were treated as triangular fuzzy numbers with parameters
(c − 1, c, c + 5) and (c^0 − 1, c^0, c^0 + 1) for the numerator, and (d − 1, d, d + 2) and (d^0 − 1, d^0, d^0 + 10)
for the denominator. Two pairs of marginal solutions were found: x_α^1 = (1, 1), x_α^2 = (1, 0.5) for
α ∈ [0, 0.55], and x_α^1 = x_α^2 = (1, 1) for α ∈ [0.56, 1]. Hence, μ((1, 1)) = 1 and μ(x) = 0.55 for
each x on the segment [(1, 1), (1, 0.5)]. All other feasible solutions have membership value
equal to 0 in the fuzzy optimal solution. See Table 1.
α          | x_α^1  | x_α^2    | ψ(α) | x         | μ(x)
[0, 0.55]  | (1, 1) | (1, 0.5) | AB   | x ∈ [AB)  | 0.61
[0.56, 1]  | (1, 1) | (1, 1)   | B    | x = B     | 1

Table 1: Computational results for the first set of fuzzy numbers
Second, the fuzzy coefficients were treated as triangular fuzzy numbers with parameters
(c − 1, c, c + 5) and (c^0 − 1, c^0, c^0 + 1) for the numerator, and (d − 2, d, d + 10) and (d^0 − 1, d^0, d^0 + 10)
for the denominator. Three pairs of marginal solutions were found: x_α^1 = (1, 1), x_α^2 = (3.75, 0.5)
for α ∈ [0, 0.23]; x_α^1 = (1, 1), x_α^2 = (1, 0.5) for α ∈ [0.24, 0.62]; and x_α^1 = x_α^2 = (1, 1) for
α ∈ [0.63, 1]. Hence, μ((1, 1)) = 1, μ((1, 0.5)) = 0.62, μ(x) = 0.23 for each x on the segment
[(3.75, 0.5), (1, 0.5)], and μ(x) = 0.62 for each x on the segment [(1, 1), (1, 0.5)]. All other
feasible solutions have membership value equal to 0 in the fuzzy optimal solution. For α = 0.2,
the marginal solutions and the rotational points of the two objectives are presented in Figure 1
by B and D, and E and F, respectively. The efficient set is, in this case, AB ∪ AD. See Table 2.
α            | x_α^1  | x_α^2       | ψ(α)     | x         | μ(x)
[0, 0.23]    | (1, 1) | (3.75, 0.5) | AB ∪ AD  | x ∈ (AD]  | 0.23
[0.24, 0.62] | (1, 1) | (1, 0.5)    | AB       | x ∈ [AB)  | 0.61
[0.63, 1]    | (1, 1) | (1, 1)      | B        | x = B     | 1

Table 2: Computational results for the second set of fuzzy numbers
5 Conclusions and future works
In the present paper the fuzzy linear fractional programming problem with fuzzy coefficients
in the objective functions is considered. The calculation of the membership function of the fuzzy
solution is described. The initial problem is first transformed into an equivalent α-cut interval
problem. Further, the problem is transformed into a bi-objective parametric linear fractional
programming problem. For each value of the parameter, the bi-objective problem is solved by
generating its efficient points from any convex combination of its marginal solutions. The solving
method works for any kind of fuzzy numbers but particular formulas were derived for the special
case of triangular fuzzy numbers.
The decision making process under uncertainty is widely studied nowadays. In our future
works we will focus on identifying how the procedure of generating efficient points for bi-objective
fractional programming problems can be applied in a more general case. Also, we will
study other kinds of fuzzy optimization problems with fractional objectives.
Acknowledgements
This research was partially supported by the Ministry of Education and Science, Republic of
Serbia, Project numbers TR36006 and TR32013.
Bibliography
[1] Chanas S., Kuchta D., Linear programming problem with fuzzy coefficients in the objective
function, in: Delgado M., Kacprzyk J., Verdegay J.L., Vila M.A. (Eds.), Fuzzy Optimization,
Physica-Verlag, Heidelberg, 148-157, 1994.
[2] Dempe S., Ruziyeva A., On the calculation of a membership function for the solution of a
fuzzy linear optimization problem, FUZZY SETS AND SYSTEMS, ISSN 0165-0114, 188(1):
58-67, 2012.
[3] Duta L., Filip F.G., Henrioud J.-M., Popescu C., Disassembly line scheduling with genetic
algorithms, INT J COMPUT COMMUN, ISSN 1841-9836, 3(3): 270-280, 2008.
[4] Ehrgott M., Multicriteria Optimization, Springer Verlag, Berlin, 2005.
[5] Harbaoui D.I., Kammarti R., Ksouri M., Multi-Objective Optimization for the m-PDPTW:
Aggregation Method With Use of Genetic Algorithm and Lower Bounds, INT J COMPUT
COMMUN, ISSN 1841-9836, 6(2): 246-257, 2011.
[6] Cadenas J., Verdegay J., Towards a new strategy for solving fuzzy optimization problems,
FUZZY OPTIMIZATION AND DECISION MAKING, ISSN 1568-4539, 8: 231-244, 2009.
[7] Lotfi, F.H., Noora, A.A., Jahanshahloo, G.R., Khodabakhshi, M., Payan, A., A linear
programming approach to test efficiency in multi-objective linear fractional programming
problems, APPLIED MATHEMATICAL MODELLING, ISSN 0307-904X, 34: 4179-4183,
2010.
[8] Stancu-Minasian I.M., Fractional Programming: Theory, Methods and Applications, Kluwer
Academic Publishers, Dordrecht/Boston/London, 1997.
[9] A. Sengupta, T. Pal, On comparing interval numbers, EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, ISSN 0377-2217, 127: 28-43, 2000.
[10] Uhrig R.E., Tsoukalas L.H., Fuzzy and Neural Approaches in Engineering, John Wiley and
Sons Inc., New York, 1997.
[11] Zimmermann H.-J., Fuzzy Set Theory and its Applications, Kluwer Academic Publishers,
1996.
INT J COMPUT COMMUN, ISSN 1841-9836
8(1):153-160, February, 2013.
PSO for Graph-Based Segmentation of Wrist Bones
in Bone Age Assessment
P. Thangam, K. Thanushkodi, T.V. Mahendiran
P. Thangam
Assistant Professor, CSE Department,
Coimbatore Institute of Engineering and Technology,
Coimbatore - 641109, Tamilnadu, India.
E-mail: [email protected]
K. Thanushkodi
Director, Akshaya College of Engineering and Technology,
Coimbatore -642109, Tamilnadu, India.
Email: [email protected]
T.V. Mahendiran
Assistant Professor, EEE Department,
Coimbatore Institute of Engineering and Technology,
Coimbatore - 641109, Tamilnadu, India.
E-mail: [email protected]
Abstract:
Skeletal maturity is a reliable indicator of growth and skeletal bone age assessment
(BAA) is used in the management and diagnosis of endocrine disorders. Bone age can
be estimated from the left-hand wrist radiograph of the subject. The work presented
in this paper proposes the development of an efficient technique for segmentation of
hand-wrist radiographs and identifying the bones specially used as Regions of Interest
(ROIs) for the bone age estimation process. The segmentation method is based on
the concept of Particle Swarm Optimization (PSO) and it consists of graph-based
segmentation procedure. The system provides an option of either segmenting all the
bones totally or segmenting only the specific ROIs under consideration. The system
is validated with a data set of 100 images with 50 radiographs of female subjects and
50 of male subjects. The time taken for segmenting each bone is calculated and the
results are discussed.
Keywords: skeletal maturity, bone age assessment (BAA), particle swarm optimization (PSO), graph-based segmentation, left-hand wrist radiograph.
1 Introduction
The chronological status of humans is described by certain indices such as height, dental
age, and bone maturity. Of these, bone age measurement plays a significant role because of its
reliability and practicability in diagnosing hereditary diseases and growth disorders. Bone age
assessment using a hand radiograph is an important clinical tool in the area of pediatrics, especially in relation to endocrinological problems and growth disorders. A single reading of skeletal
age informs the clinician of the relative maturity of a patient at a particular time in his or her life
and, integrated with other clinical findings, separates the normal from the relatively advanced or
retarded [1]. The bone age of children is apparently influenced by gender, race, nutrition status,
living environments and social resources, etc. Based on a radiological examination of skeletal
development of the left-hand wrist, bone age is assessed and compared with the chronological
age. A discrepancy between these two values indicates abnormalities in skeletal development.
This is applied in the management and diagnosis of endocrine disorders.
2 Background of BAA
The main clinical methods for skeletal bone age estimation are the Greulich & Pyle (GP)
method and the Tanner & Whitehouse (TW) method. GP is an atlas-matching method, while
TW is a score-assigning method [2]. The GP method is faster and easier to use than the TW method.
Bull et al. performed a large-scale comparison of the GP and TW methods and concluded that
the TW method is the more reproducible of the two and potentially more accurate [3]. In the GP
method, a left-hand wrist radiograph is compared with a series of radiographs grouped in the
atlas according to age and sex. The atlas pattern which superficially appears to resemble the
clinical image is selected. The TW method uses a detailed analysis of each individual bone, assigning
it to one of eight classes reflecting its developmental stage (in terms of scores). The sum of all
scores assesses the bone age. This method yields the most reliable results. In detail, in the TW
method twenty regions of interest (ROIs) located in the main bones are considered for the bone
age evaluation. Each ROI is divided into three parts: Epiphysis, Metaphysis and Diaphysis; it is
possible to identify these different ossification centers in the phalanx proximity. The development
of each ROI is divided into discrete stages, as shown in Figure 1, and each stage is given a letter
(A,B,C,D, . . . I), reflecting the development stage as:
• Stage A – absent
• Stage B – single deposit of calcium
• Stage C – center is distinct in appearance
• Stage D – maximum diameter is half or more the width of metaphysis
• Stage E – border of the epiphysis is concave
• Stage F – epiphysis is as wide as metaphysis
• Stage G – epiphysis caps the metaphysis
• Stage H – fusion of epiphysis and metaphysis has begun
• Stage I – epiphyseal fusion completed.
Figure 1: TW stages for phalanx bone.
By adding the scores of all ROIs, an overall maturity score is obtained. This score is correlated
with the bone age differently for males and females. [4] Hence for accurate estimation of bone
age, the ROIs are to be properly extracted and analyzed. So BAA requires efficient segmentation
schemes for further processing. This paper proposes an efficient segmentation technique using
graphs for segmenting the bones in the radiograph. We have done a thorough survey of the literature
on BAA methods in our previous work [5], explaining in detail the various works done in BAA and
providing directions for future research. Our previous work [6] describes a computerized BAA
method for carpal bones, extracting features from the convex hull of each carpal bone, named
the convex hull approach. We have also proposed an automated BAA method to estimate
bone age from the feature ratios extracted from carpal and radius bones, named the feature
ratio approach [7]. Our decision tree approach utilizes features from the radius and ulna bones
and their epiphyses for BAA [8]. We have also exploited the epiphysis/metaphysis region of
interest (EMROI) in BAA using our Hausdorff distance approach [9]. A comparative study of
the above four BAA approaches has been conducted using a partitioning technique [10]. We have
also proposed an efficient method for feature analysis of radiographs using Principal Component
Analysis (PCA) based on PSO [11]. The work presented in this paper is a novel method for
segmenting the wrist bones from the input radiographs using PSO, applying graph-based
segmentation.
3 System Design
The system consists of three modules, namely: Image Preprocessing, Edge Detection and
Graph-based segmentation.
3.1 Image Preprocessing
Image preprocessing is performed in two steps: image smoothing and grayscale conversion.
Image smoothing is done to reduce the noise within the image or to produce a less pixelated
image. Most smoothing methods are based on low-pass filters. In our system, we have done
smoothing to reduce noise by using a Gaussian filter. The Gaussian filter attenuates
higher frequencies more than lower frequencies, but at the cost of more computation
time. Smoothing can be sped up by splitting the 2D Gaussian G(x, y) into two
1D Gaussians G(x)G(y) and carrying out the filtering in 1D, first row by row and then column by
column. Grayscale conversion is done as follows: colors in an image are converted to a shade of
gray by calculating the effective brightness or luminance of the color and using this to create a
shade of gray that matches the desired brightness.
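A minimal sketch of this separable smoothing step (our illustration using NumPy; the paper does not publish its implementation, and the kernel radius is an assumption):

```python
# Sketch: separable Gaussian smoothing. The 2D blur is applied as two 1D
# convolutions (first row by row, then column by column), as described above.
import numpy as np

def gaussian_kernel_1d(sigma: float, radius: int) -> np.ndarray:
    x = np.arange(-radius, radius + 1)
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()                      # normalize to unit sum

def smooth(image: np.ndarray, sigma: float = 1.0) -> np.ndarray:
    k = gaussian_kernel_1d(sigma, radius=int(3 * sigma))
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, image)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, rows)
```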
3.2 Edge Detection
An edge occurs where there is a discontinuity in the intensity function or a very steep intensity
gradient in the image. Using this assumption, if one takes the derivative of the intensity values
across the image and finds the points where the derivative is maximum, then the edges can be located
[12]. We have made use of the Sobel edge detector to detect the edges. The Sobel operator performs
a 2-D spatial gradient measurement on an image. Typically it is used to find the approximate
absolute gradient magnitude at each point in an input grayscale image. The Sobel edge detector
uses a pair of 3 × 3 convolution masks, one estimating the gradient in the x-direction (columns)
and the other estimating the gradient in the y-direction (rows). A convolution mask is usually
much smaller than the actual image; as a result, the mask is slid over the image, manipulating
a square of pixels at a time. The actual Sobel masks [13] are given below:

G_x =
[-1  0  +1]
[-2  0  +2]
[-1  0  +1]

G_y =
[+1  +2  +1]
[ 0   0   0]
[-1  -2  -1]
The magnitude of the gradient is then calculated using the formula:

|G| = \sqrt{G_x^2 + G_y^2}    (1)

An approximate magnitude can be calculated using:

|G| = |G_x| + |G_y|    (2)
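A minimal sketch of this edge-detection step (ours; it applies the standard Sobel masks to a grayscale image array and evaluates equation (1)):

```python
# Sketch: Sobel gradient magnitude, eqs. (1)-(2), using the standard 3x3 masks.
import numpy as np

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = np.array([[1, 2, 1], [0, 0, 0], [-1, -2, -1]], dtype=float)

def convolve3x3(image: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Slide the 3x3 mask over the image (interior pixels only)."""
    h, w = image.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            out[i, j] = np.sum(image[i:i + 3, j:j + 3] * mask)
    return out

def sobel_magnitude(image: np.ndarray) -> np.ndarray:
    gx = convolve3x3(image, SOBEL_X)
    gy = convolve3x3(image, SOBEL_Y)
    return np.sqrt(gx ** 2 + gy ** 2)  # eq. (1); np.abs(gx)+np.abs(gy) is eq. (2)
```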
3.3 Graph-based segmentation
The graph-based segmentation method [14] measures the evidence for a boundary between two
regions by comparing two quantities:
• Intensity differences across the boundary and
• Intensity differences between neighboring pixels within each region.
Intuitively, the intensity differences across the boundary of two regions are perceptually
important if they are large relative to the intensity differences inside at least one of the regions.
Graph-based image segmentation techniques generally represent the problem in terms of a graph
G = (V, E), where each node v_i ∈ V corresponds to a pixel in the image, and the edges in E
connect certain pairs of neighboring pixels. A weight is associated with each edge based on
some property of the pixels that it connects, such as their image intensities. Depending on the
method, there may or may not be an edge connecting each pair of vertices. Let G = (V, E) be an
undirected graph with vertices v_i ∈ V, the set of elements to be segmented, and edges (v_i, v_j) ∈ E
corresponding to pairs of neighboring vertices. Each edge (v_i, v_j) ∈ E has a corresponding weight
w((v_i, v_j)), which is a non-negative measure of the dissimilarity between neighboring elements v_i
and v_j.
The segmentation algorithm defines the boundaries between regions by comparing two quantities. The internal difference of a component C in an image is given by:

Int(C) = \max_{e \in MST(C, E)} w(e)    (3)

i.e., the largest edge weight in the Minimum Spanning Tree (MST) of the component. The
difference between two components C_1, C_2, which are sets of vertices of the graph, is defined to be the
minimum weight edge connecting the two components, given by:

Dif(C_1, C_2) = \min_{v_i \in C_1, v_j \in C_2, (v_i, v_j) \in E} w((v_i, v_j))    (4)
If there is no edge connecting C_1 and C_2, we let Dif(C_1, C_2) = ∞. The region comparison
predicate evaluates whether there is evidence for a boundary between a pair of components by checking
if the difference between the components, Dif(C_1, C_2), is large relative to the internal difference
within at least one of the components, Int(C_1) and Int(C_2).
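A compact sketch of this region-comparison predicate (ours; the MST construction and the union-find bookkeeping of [14] are assumed to be available and are not shown):

```python
# Sketch of the boundary predicate of Section 3.3: a boundary between two
# components exists when Dif(C1, C2) exceeds the internal difference of both.
# Edges are (u, v, weight) triples; mst_c1/mst_c2 hold the MST edges of each
# component and edges_between the edges connecting the two components.

def internal_difference(mst_edges):
    """Int(C), eq. (3): largest edge weight in the component's MST."""
    return max((w for _, _, w in mst_edges), default=0.0)

def difference(edges_between):
    """Dif(C1, C2), eq. (4): minimum weight edge joining the components."""
    return min((w for _, _, w in edges_between), default=float("inf"))

def boundary_exists(mst_c1, mst_c2, edges_between) -> bool:
    return difference(edges_between) > min(internal_difference(mst_c1),
                                           internal_difference(mst_c2))
```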
3.4 Overview of PSO
Particle Swarm Optimization (PSO) is an algorithm for finding optimal regions of complex
search spaces through the interaction of individuals in a population of particles. The PSO algorithm,
originally introduced in terms of social and cognitive behavior by Eberhart and Kennedy in
1995 [15], has proven to be a powerful competitor to other evolutionary algorithms such
as genetic algorithms. The PSO algorithm simulates social behavior among individuals (particles)
flying through a multidimensional search space, each particle representing a single intersection of
all search dimensions. [16]–[19] The particles evaluate their positions relative to a global fitness
at every iteration, and companion particles share memories of their best positions, and then use
those memories to adjust their own velocities and positions. At each generation, the velocity of
each particle is updated, being pulled in the direction of its own previous best solution (local)
and the best of all positions (global). The computation of the optimal threshold is handled here with
Particle Swarm Optimization (PSO). The PSO implementation is analyzed here to find
the optimal threshold for segmentation. The population size refers to the number
of particles in the iterative process, here corresponding to components in the image. A population of
particles is initialized with random positions and velocities in d-dimensional space. A fitness
function f is evaluated using each particle's positional coordinates as input values. Positions and
velocities are adjusted, and the function is evaluated with the new coordinates at each time step.
3.5 Implementation of PSO for graph-based segmentation
The implementation of the segmentation algorithm consists of the following steps.
Step 1: Swarm Formation: For a population size p, the particles are randomly generated between the minimum and the maximum limits of the threshold values.
Step 2: Objective Function evaluation: The objective functions of the particles are evaluated.
Step 3: ‘pbest’ and ‘gbest’ initialization: The objective values obtained above for the initial
particles of the swarm are set as the initial pbest values of the particles. The best value
among all the pbest values is identified as gbest.
Step 4: Velocity computation: The new velocity for each particle is computed using equation (5).
v[i] = v[i] + c1 * rand(i) * (pbest[i] − present[i]) + c2 * rand(i) * (gbest[i] − present[i])    (5)
Step 5: Position computation: The new position for each particle is computed using equation (6).

present[i] = present[i] + v[i]    (6)

where v[i] is the particle velocity, present[i] is the current particle (solution), pbest[i]
and gbest[i] are defined as stated before, rand(i) is a random number in (0, 1), and c1,
c2 are learning factors; usually c1 = c2 = 2.
Step 6: Swarm Updation: The values of the objective function are calculated for the updated
positions of the particles. If the new value is better than the previous pbest, the new
value is set as pbest. Similarly, the gbest value is updated as the best pbest.
Step 7: Termination: If the stopping criteria are met, the positions of the particles represented
by gbest are the optimal threshold values. Otherwise, the procedure is repeated from
Step 4.
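A self-contained sketch of Steps 1-7 (ours; the quadratic objective below is only a stand-in for the segmentation-threshold fitness function, which the paper does not give explicitly, and minimization is assumed):

```python
# Sketch of the PSO loop of Section 3.5, using eqs. (5)-(6).
import numpy as np

def pso(objective, dim=1, pop=50, iters=50, lo=0.0, hi=255.0, c1=2.0, c2=2.0):
    rng = np.random.default_rng(0)
    x = rng.uniform(lo, hi, (pop, dim))       # Step 1: random swarm
    v = np.zeros((pop, dim))
    pbest = x.copy()                          # Steps 2-3: pbest / gbest init
    pbest_val = np.array([objective(p) for p in x])
    gbest = pbest[pbest_val.argmin()].copy()
    for _ in range(iters):
        r1, r2 = rng.random((pop, dim)), rng.random((pop, dim))
        v = v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)   # eq. (5)
        x = np.clip(x + v, lo, hi)                              # eq. (6)
        vals = np.array([objective(p) for p in x])              # Step 6
        better = vals < pbest_val
        pbest[better], pbest_val[better] = x[better], vals[better]
        gbest = pbest[pbest_val.argmin()].copy()                # Step 7 check
    return gbest

# Stand-in objective: the optimum threshold is taken to be 128 here.
print(pso(lambda p: (p[0] - 128.0) ** 2))
```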
4 Results and Discussion
The use of image preprocessing techniques such as image smoothing and grayscale conversion
improves the quality of the digitized radiograph. The noise caused by radiation and other
external factors is eliminated. Application of the Sobel edge detector identifies the boundaries of
the bones or the regions of interest, which facilitates better segmentation. Finally, the PSO
algorithm for graph-based image segmentation is used to individually segment each bone in the
left-hand wrist radiograph. The algorithm provides an option of whether to segment the entire
radiograph (to identify all the bones in the radiograph) or to segment selected ROIs alone (some
individual bone only). Figure 2 (a) depicts the input radiograph image, Figure 2 (b) shows the
image after smoothing, Figure 2 (c) provides a snapshot of edge detection using the Sobel operator,
Figure 2 (d) provides a snapshot of segmenting all the wrist bones in the left-hand radiograph
image, while Figure 2 (e) shows segmentation of selected ROIs (in this case the proximal, middle,
and distal phalangeal bones). Figure 3 shows the graph depicting the performance assessment of
the segmentation.
Figure 2: (a) Input radiograph image, (b) Smoothened image, (c) Edge detected image, (d)
Segmenting all the bones in the image, (e) Segmenting selective (phalangeal) ROIs
Figure 3: Performance Assessment of Segmentation (segmentation time, in seconds, per ROI bone, comparing the plain graph-based method with the PSO-graph method)
The time taken to segment each bone using the PSO technique is calculated and compared
with the plain graph-based technique; the results are tabulated in Table 1.
The segmentation was regarded as accurate if the sum of over-selected and under-selected pixels
was less than 25. The segmentation process had an accuracy of 0.94 for males and 0.96 for females,
as tabulated in Table 2. The PSO algorithm was implemented with the following parameters:
population size 50, w_max = 0.6, w_min = 0.1, c1 = c2 = 1.5, 50 iterations.
5 Conclusion
The PSO algorithm was used for graph-based segmentation of left-hand wrist radiograph images,
which can be further used for skeletal bone age assessment. The input image was first preprocessed to remove noise and converted to grayscale to improve image quality. The Sobel edge
detector was used for edge detection, and then PSO-combined graph-based segmentation was
performed. The segmentation procedure provided two options: segmenting all the wrist
bones as a whole or segmenting selected ROI bones. The time taken to segment each bone was
calculated and the results were tabulated. The system was tested with 100 left-hand wrist images
(50 males and 50 females). The quality of the segmentation was influenced by the image quality.
For radiographs overexposed to radiation, further preprocessing was required to achieve good
results.
Bibliography
[1] Gilsanz, V.; Ratib, O. (2005); Hand Bone Age – A Digital Atlas of Skeletal Maturity,
Springer-Verlag.
[2] Spampinato, C. (1995); Skeletal Bone Age Assessment, University of Catania, Viale Andrea
Doria, 6 95125.
[3] Bull, R.K.; Edwards, P.D.; Kemp, P.M.; Fry, S; Hughes, I.A. (1999); Bone Age Assessment:
a large scale comparison of the Greulich and Pyle, and Tanner and Whitehouse (TW2)
methods, Arch. Dis. Child, ISSN 1468-2044, 81: 172-173.
[4] Tanner, J.M.; Whitehouse, R.H. (1975); Assessment of Skeletal Maturity and Prediction of
Adult Height (TW2 method), Academic Press.
[5] Thangam, P.; Thanushkodi, K.; Mahendiran, T.V. (2011); Skeletal Bone Age Assessment
– Research Directions, International Journal of Advanced Research in Computer Science,
ISSN 0976-5697, 2(5): 415-423.
[6] Thangam, P.; Thanushkodi, K.; Mahendiran, T.V. (2012); Computerized Convex Hull
Method of Skeletal Bone Age Assessment from Carpal Bones, European Journal of Scientific Research, ISSN 1450-216X/1450-202X, 70(3): 334-344.
[7] Thangam, P.; Thanushkodi, K.; Mahendiran, T.V. (2012); Efficient Skeletal Bone Age Estimation Method using Carpal and Radius Bone features, Journal of Scientific and Industrial
Research, ISSN 0975-1084, 71(7): 474-479.
[8] Thangam, P.; Thanushkodi, K. (2012); Computerized Skeletal Bone Age Assessment from
Radius and Ulna bones, International Journal of Systems, Applications and Algorithms,
ISSN 2277-2677, 2(5): 60-66.
[9] Thangam, P.; Thanushkodi, K. (2012); Skeletal Bone Age Assessment from Epiphysis/Metaphysis of phalanges using Hausdorff distance, Scientific Research and Essays, ISSN 1992-2248, 7(28): 2495-2503.
[10] Thangam, P.; Thanushkodi, K.; Mahendiran, T.V. (2012); Comparative Study of Skeletal
Bone Age Assessment Approaches using Partitioning Technique, International Journal of
Computer Applications, ISSN 0975 – 8887, 45(18): 15-20.
[11] Thangam, P.; Thanushkodi, K.; Mahendiran, T.V. (2011); Efficient Feature Analysis of
Radiographs in Bone Age Assessment, International Journal of Computer Applications in
Engineering Sciences, ISSN 2231-4946, 1(2): 601-606.
[12] Cooper, M.C. (1998); The tractability of segmentation and scene analysis, International
Journal of Computer Vision, ISSN 0920-5691, 30(1): 27-42.
[13] Gonzalez, R.C.; Woods, R.E. (2009); Digital Image Processing, Third Edition, Pearson.
[14] Felzenszwalb, P.F.; Huttenlocher, D.P. (2004); Efficient Graph-Based Image Segmentation,
International Journal of Computer Vision, ISSN 0920-5691, 59(2): 167-181.
[15] Kennedy, J.; Eberhart, R.C. (1995); Particle Swarm Optimization, Proc of the IEEE International Conference on Neural Networks, Australia, pp. 1942–1948.
[16] http://www.swarmintelligence.org/
[17] http://www.particleswarm.info/
[18] Nedjah, N.; Mourelle, L.M. (2006); Swarm Intelligent Systems, Springer.
[19] Kennedy, J.; Eberhart, R.C.; Shi, Y. (2001); Swarm Intelligence, Morgan Kaufmann Publishers.
INT J COMPUT COMMUN, ISSN 1841-9836
8(1):161-165, February, 2013.
Alternative Wireless Network Technology Implementation for
Rural Zones
F.J. Watkins, R.A. Hinojosa, A.M. Oddershede
Francisco J. Watkins,
Ricardo A. Hinojosa,
Astrid M. Oddershede
Universidad de Santiago
Departamento de Ingenieria Electrica
Santiago, Chile
E-mail: [email protected],
[email protected],
[email protected]
Abstract:
This paper describes a methodology for wireless networking that allows interconnection to the Internet through a gateway, in order to interact with it and obtain products
and services. These Wireless Mesh Networks (WMN) are based on routers that
are programmed to work as nodes of a network; certain routers allow their firmware to
be reprogrammed to form network nodes. Communication is transmitted between the nodes
of the network, and it is possible to cover long distances: the signal
of a distant node hops from node to node until it reaches the gateway. This generates delays
and congestion in the network. As a solution, a path can be designed containing nodes that
provide a faster connection to the gateway. This path, called a backbone,
uses a different channel frequency from the common nodes. The main characteristics of these
networks are fast implementation and low cost, which make them useful for rural
areas, developing countries and remote regions.
Keywords: wireless mesh network.
1 Introduction
The methodology can be structured to generate a low-cost broadband hybrid wireless mesh
transmission infrastructure that can be extended to a great extent with nodes placed point
to point, in order to access the products and services the Internet delivers. In modern times the
Internet has become a powerful network of communication and exchange of information, with
strong support from communication technology. It can also be seen as a
powerful instrument of social development that provides information, communication between
people and a way of obtaining knowledge.
The Internet is a valuable tool for business, industry, trade, education, and the social development
of communities. A computer is physically connected to the Internet via a modem or a
NIC (Network Interface Card). The logical connection applies standards called protocols. A
protocol is a formal description of a set of rules and conventions that govern how devices on the
network communicate. The network connection can use multiple protocols; this set is
TCP/IP (Transmission Control Protocol/Internet Protocol), used to receive or transmit information.
The most important thing is to have the connectivity needed to obtain the Internet's benefits.
Based on modern technology support, a communication network can be implemented by
hacking the characteristics of its software and adding some passive devices that modify
the hardware characteristics.
The network design for a wireless mesh network will depend on the geographic landscape and
the distances between the points to be connected. A combination of point-to-point long-distance links
(using directional antennas) and local point-to-multipoint links (using omni-directional antennas)
between mesh nodes can create a reliable mesh network.
This type of wireless communication network has been denoted a Wireless Mesh Network [1].
There are several pilot projects under development in developing countries, mainly in rural
areas.
2 The wireless mesh network (WMN)
It is a network structured from several nodes that form the backbone of the network. Thanks
to their software, the nodes can be configured automatically and reconfigured to maintain
network connectivity.
Figure 1: Wireless mesh network
In Figure 1, the wireless nodes are interconnected; a wireless node is a router with its
antenna, which can be omni-directional or directional. A mesh node communicates only
with other mesh nodes. A wireless access point is a point that allows any Wi-Fi device to
interact with the wireless mesh network; it consists of a wireless router and an antenna, in
this case an omni-directional one.
In a wireless mesh network, nodes can be connected in an unstructured way.
A simple wireless mesh network can consist of two wireless routers and their antennas.
3 Characteristics and Advantages of a Wireless Mesh Network
The links between the router nodes can be configured in different ways, generating links that
cover large distances or give service to several users in a small area. The connection can be
wired or wireless.
In a wireless mesh network, unidirectional antennas are used, and in some places wired
connections can be used. Large distances are covered by using static wireless nodes with
unidirectional antennas.
Wireless mesh networks are robust and simple to configure, because their software
determines the path of the data in real time. The backbone of the network depends on the site
topography.
Communication between all mesh nodes is based on Wi-Fi. All the nodes of the wireless mesh
network operate on the same channel frequency. In a WMN, each node must communicate with
at least two other nodes in order to maintain robust mesh connectivity, which is the main
feature of a WMN.
In a WMN, each node has a name and a number. The IP address should be unique, to
allow connection to any computer in the network. A computer can connect to the mesh via a LAN
cable connected to a mesh node, or via a wireless connection to a separate access point connected
to a LAN or to a mesh node.
A network device is connected physically to the network through a modem or a NIC (Network
Interface Card), and the logical connection is made through protocols. A protocol is a formal
description of a set of rules and agreements defining the way the different devices of the
network communicate; examples are TCP/IP (Transmission Control Protocol/Internet Protocol)
and OSI (Open Systems Interconnection). Users manage the mesh
through browsers. A browser initiates (starts up) the connection to a server and can request
or receive information. This software interprets the Hypertext Markup Language (HTML), which is one
of the languages used to code Web page content. In a WMN, a routing protocol routes the IP
traffic between the wireless interfaces of the mesh nodes, manages the routing information and
maintains routing tables dynamically. This provides an alternative route when a node fails.
The advantages of a WMN are the following. Self-forming: the wireless mesh forms its
structure automatically once its nodes have been configured and activated. Robustness:
fault tolerance, because redundant routes exist in the network, so information flow is not interrupted
in the rest of the network when a node fails. Low cost of the nodes of the WMN, which
allows extra nodes to be admitted to grow the network at a low incremental cost. Easy
deployment of the network: new members of the community can build their own nodes with
little training.
4 The Wireless Mesh Network Design
Wireless mesh networks are not problematic to build when you have only a few nodes. In general,
you must follow the following stages:
• Map of the network;
• Place of each node;
• Network topology and channel Allocation;
• Channel Allocation for site users;
• Plan IP address allocation.
4.1
Map of the network
The map of the mesh network starts with the identification of the sites that will receive a
mesh node; the coordinates can be obtained with a GPS and the node distribution then plotted
on paper. The site nodes can be linked together using the map. Each link is a straight line
between two nodes, and its length represents the distance between the sites.
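For instance, the link lengths can be computed directly from the GPS coordinates. The following
Python sketch uses the haversine great-circle formula; the site names and coordinates are made
up for illustration.

# A small sketch of the mapping stage: given GPS coordinates taken at each
# site, compute the straight-line length of every planned link.
from math import radians, sin, cos, asin, sqrt

def distance_km(p, q):
    """Great-circle (haversine) distance between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(radians, (*p, *q))
    a = sin((lat2 - lat1) / 2) ** 2 \
        + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))  # Earth radius approx. 6371 km

sites = {"school": (-33.45, -70.66), "clinic": (-33.47, -70.63)}
links = [("school", "clinic")]
for a, b in links:
    print(f"{a} - {b}: {distance_km(sites[a], sites[b]):.2f} km")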
4.2
Place of each node
A solution for placing the nodes is to build a backbone reaching the gateway. In a complex
mesh, several uniformly distributed backbones are needed for the nodes to reach the gateway.
A backbone is a path connecting certain strategic nodes in such a way that, for any other node,
the connection with the gateway is expedited (i.e., the connection with the gateway passes
through only a few nodes).
4.3
Network Topology and Channel Allocation
Nodes in the mesh can communicate with each other if they use the same frequency channel.
When a backbone is incorporated, another channel is needed, so that the backbone works as an
independent network. In this case the two networks do not have interference problems.
4.4
Channel allocation of the backbone
If we have a mesh network with a backbone, we need two IP ranges. A third range is needed if
we add an access point [2].
4.5
IP Address Allocation
The IP allocation should assure a unique address for each PC and each node, according to the
RFC 1918 subnet scheme [3], [4].
For the assignment of addresses to the different elements of the network, we have the following:
Backbone node: wireless interface 10.0.1.x/24, where 1 ≤ x < 255; Ethernet interface
10.3.x.y/24, where 1 ≤ x < 255 and 1 ≤ y < 255;
Normal mesh node: wireless interface 10.1.1.a/24, where 1 ≤ a < 255; Ethernet interface
10.2.a.b/24, where 1 ≤ a < 255 and 1 ≤ b < 255; PCs and laptops connected to a node will be
numbered from 100 upward, according to the setting;
Access point: the subnet assigned to a LAN or hotspot will be the same as that of the Ethernet
LAN connected to the mesh node. A small script applying this scheme is sketched below.
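The following Python sketch, for illustration only, applies the addressing scheme above
mechanically; the choice of host .1 for each node's own Ethernet address and the sample node
list are assumptions, not part of the original plan.

# A sketch that mechanically applies the addressing scheme, so every
# node gets consistent wireless and Ethernet subnets.
def node_addresses(index, backbone=False):
    """Return (wireless, ethernet) addresses for node number 'index' (1-254)."""
    assert 1 <= index < 255
    if backbone:
        return f"10.0.1.{index}/24", f"10.3.{index}.1/24"
    return f"10.1.1.{index}/24", f"10.2.{index}.1/24"

for i, is_bb in [(1, True), (2, False), (3, False)]:
    wifi, eth = node_addresses(i, backbone=is_bb)
    print(f"node {i}: wireless {wifi}, ethernet {eth}")
# PCs and laptops behind node i would then be 10.2.i.100, 10.2.i.101, ...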
5
The Wireless Mesh Network
5.1
First Steps
To build the mesh network, the following stages should be covered:
• Configure all the mesh nodes and wireless access points according to the network document;
• Attach a paper label with the configuration details to each mesh node or wireless access
point;
• Test the equipment to make sure everything is working properly;
• Connect a PC to a mesh node with a LAN cable. The PC will require an IP address. Ping another
mesh node; if the ping is successful, the mesh node of the PC and the other mesh node are
working, otherwise check the configuration;
• The gateway is the point where the mesh network will be connected to the Internet;
• Install the mesh nodes starting from the gateway, so that you can confirm that the network
is still working each time a new mesh node is installed;
• Connect a PC to a mesh node with a LAN cable, ping the gateway first and, if that is
successful, access several sites on the Internet (different web pages) in order to ensure the
PC can connect to the Internet. A scripted version of this check is sketched after the list.
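The ping tests above can be scripted. A minimal Python sketch, assuming a Linux ping command
("-c 1" sends a single echo request) and the illustrative addresses of the plan in Section 4.5:

import subprocess

def reachable(ip, timeout_s=2):
    """Return True if a single ICMP echo to 'ip' succeeds."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), ip],
        stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0

gateway = "10.0.1.1"
nodes = ["10.1.1.2", "10.1.1.3"]
for ip in [gateway, *nodes]:
    print(ip, "OK" if reachable(ip) else "unreachable - check configuration")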
5.2
The Mesh node
In order to start with the mesh node, you must have the router, the LAN cable and the power
supply. To configure the mesh node, the following stages must be covered:
• Upgrade the firmware for the backbone and normal mesh nodes [5];
• Configuration of system settings;
• Configuration of wireless settings;
• Configuration of LAN settings;
• OLSR settings [6] (the Optimized Link State Routing protocol (OLSR) was developed for mobile
ad hoc networks; it operates as a table-driven, proactive protocol and thus regularly exchanges
topology information with the other nodes of the network).
6
Conclusions
A methodology has been given for planning and designing a WMN, which can supply communication
services in a way similar to the Internet, but at a lower cost. This can be planned especially
for rural areas. Many router models are available in different countries, but only a few can
be adapted to this technology. On the other hand, there is free open-source software that
allows this implementation.
This methodology makes it possible to cover a niche of potential users, especially in rural
areas and in developing countries, where the low population density makes them unattractive
to commercial ISPs (Internet Service Providers).
Bibliography
[1] Akyildiz I., Wang X., Wang W., Wireless Mesh Networks: A Survey, Computer Networks, 47,
445-487, 2005.
[2] Ding Y., Xiao L., Channel allocation in multi-channel wireless mesh networks
[3] Bernardos C., Calderon M., Soto I., Solana A., Weniger K., Building an IP-based community
wireless mesh network: Assessment of PACMAN as an IP address autoconfiguration protocol,
Computer Networks, 54, 291-303, 2010.
[4] Avallone S., Akyildiz I., A channel assignment algorithm for multi-radio Wireless Mesh
Networks, Computer Communications, 31, 1343-1353, 2008.
[5] Hsu Ch., Wu J.-L., Wang S.-T., Hong Ch.-Y., Survivable and delay-guaranteed backbone
wireless mesh network design, Journal of Parallel and Distributed Computing, 68, 306-320, 2008.
[6] http://hipercom.inria.fr/olsr/
INT J COMPUT COMMUN, ISSN 1841-9836
8(1):166-175, February, 2013.
Issues on Applying Knowledge-Based Techniques in Real-Time
Control Systems
D. Zmaranda, H. Silaghi, G. Gabor, C. Vancea
Doina Zmaranda, Helga Silaghi
Gianina Gabor, Codruta Vancea
University of Oradea
Romania, 410087 Oradea, 1 Universitatii St.
E-mail: [email protected], [email protected]
[email protected], [email protected]
Abstract:
Nowadays, knowledge-based systems are used in almost all aspects of life. The main
reason for trying to use knowledge-based systems in real-time control is to reduce the
cognitive load on users (overload), and their application has proved important
where conventional techniques have failed or are not sufficiently effective [1]. The
development of automated diagnosis techniques and systems can also help to minimize
downtime and maintain efficient output. This paper presents some issues of applying
knowledge-based systems to real-time control systems. It describes and analyzes the
main issues concerning the real-time domain and provides possible solutions, such
as a set of requirements that a real-time knowledge-based system must satisfy. The
paper proposes a possible architecture for applying knowledge-based techniques in
real-time control systems. Finally, a way of employing knowledge-based techniques
for extending the existing automatic control and monitoring system for the geothermal
plant from the University of Oradea is presented.
Keywords: Real-time control systems, knowledge-based systems, Programmable
Logic Controller
1
Introduction
Real-time systems generally consist of a series of complex, heterogeneous and critical processes. They are also closely coupled systems, consisting of a physical system part and a control
computer system part. While the physical part reacts to the control signals from the computer,
the control computer part and its software must interact with the dynamic properties of the
physical part. Real-time systems are also, by definition, reactive systems. To increase their
efficiency, different monitoring programs, tools, algorithms, and rules could be utilized. Generally, these programs are used for detecting abnormal behaviors, tracing workflow progress, and
generating alerts and reports during the different phases of the system.
However, using knowledge-based techniques in real-time applications represents a major
challenge for several reasons [10]: problems related to time representation and reasoning
about time; problems related to deadlines, because a knowledge-based system should provide
the best solution within a given deadline; problems related to asynchronous event handling,
which could lead to interruption of the inference process; and problems related to the
integration of conventional real-time programming with knowledge-based programming.
Moreover, the specific nature of real-time systems, which implies interaction with an external
physical system, also imposes specific features when knowledge-based components are included
[7]: the knowledge-based decision-making system is tightly coupled to the external system;
the decision-making system should itself be a real-time system, in order to ensure that
decisions are made before their deadlines; knowledge of both the control strategy and the
external physical system is required, each having a specific form; and, when dealing with
complex real-time systems consisting of several sub-components, decision making needs to be
based on distributed knowledge.
Knowledge-based and real-time control technologies are complementary rather than competing
technologies. Control technologies are generally oriented toward quantitative processing,
while knowledge-based technologies integrate both qualitative and quantitative processing [6].
Separating the description (the knowledge) of a process from the control algorithm allows the
knowledge to be more explicit, visible and analyzable, instead of being hidden inside the
procedural programming code.
Knowledge-based systems can be used for different purposes in real-time process control;
the main domains of applicability include [5] [9]: fault diagnosis, which implies detection,
cause analysis and repetitive-problem recognition; complex control schemes; process and control
performance monitoring and statistical process control; real-time Quality Management (QM); and
control system validation. For real-time control systems, an important issue is their
capability for fault detection and diagnosis, because availability and productivity can be
significantly improved by shortening downtime. Moreover, because personnel's observations can
be incomplete or wrong, leading to incorrect diagnoses, intelligent system approaches need to
be investigated and applied.
2
Knowledge-based techniques in real-time applications
Real-time applications can have different structures and, consequently, different approaches
for using knowledge-based techniques can be employed. One typical structure of real-time
control systems comes from the need for automation and flexibility in complex manufacturing
systems and is based on PLC (Programmable Logic Controller) usage. Such real-time control
system architectures are widespread because PLCs offer an adaptable and modular solution to
the control problem. However, there are some shortcomings of this approach, generated by the
PLCs' inflexible programming systems, which do not support automatic analysis of the logic
circuits in order to locate a fault. Even if some diagnosis functions are available in today's
modern PLCs, their usage is limited and needs to be extended. Consequently, developing a
knowledge-based system for diagnosis purposes could represent a solution for implementing
automated diagnosis techniques in complex manufacturing real-time systems. In order to create
efficient solutions, several specific issues should be considered: knowledge representation
and acquisition, real-time reasoning, knowledge validation, and integration with real-time
software [11] [12].
2.1
Knowledge representation and acquisition
The main reason for using real-time knowledge-based systems is to reduce the cognitive load
on users. Therefore, such systems require a knowledge representation that integrates several
kinds of knowledge taken from several sources: analytical models developed using differential
equations, material or energy balances, or overall process behavior kinetics. Generally, each
object can have a behavior that is represented by a combination of analytic (model-based)
and heuristic (rule-based) statements. In PLC-based systems, the automatic monitoring elements
available for faults are represented by discrete state signals in the PLC memory. These signals
indicate different operating states of the controlled plant and, based on their values, further
diagnosis can be carried out. The values can be obtained by accessing the PLC memory via a
link from the computer on which the diagnosis system is implemented. The diagnosis system
must then use specific reasoning algorithms to search for all possible fault causes with the
help of relevant knowledge and real-time data. Therefore, the knowledge acquisition task is
very important, and can be carried out in two ways: artificial knowledge acquisition and
model-based knowledge acquisition [3].
Artificial knowledge acquisition is obtained by knowing specific issues about the controlled
plant. For example, in each plant there are several alarms whose purpose is to protect the
plant's equipment or to prevent the plant from working in error conditions. These alarms are
normally indicated by one PLC signal or a combination of PLC signals, for example: temperature
too high, pressure too low, or a combination of these. Model-based knowledge acquisition is
based on knowledge achieved while modeling the system behavior and constructing the PLC
program, resulting in improved knowledge acquisition and diagnosis efficiency. Let S be the
space that represents the fundamental set of all possible configurations of the control system
variables. A specific configuration set s is given as:
s = {s1, s2, ..., sn} ∈ S    (1)
Let B be the behavior space that represents the fundamental set of all determinable behavioral
attributes:
b = {b1, b2, ..., bm} ∈ B    (2)
Every subset F of B that comprises a specific required profile for faulty behavior can be
expressed by a combination of PLC signals associated with the corresponding control system
variables and their specific states. From a formal point of view, this can be represented in
the following way [2]:
fault → (PLCsignal1, state), (PLCsignal2, state), ..., (PLCsignaln, state)    (3)
For every fault defined in the faulty-behavior profile, all possible sources generating the
fault are considered and specified. For a specific fault occurrence from the fault space F,
a device mapping that infers the functional relationship is considered:
F : D → Crelevant    (4)
where D represents the device space and Crelevant represents the relevant cause space.
The relationship between the behavior space B, the configuration space S and the required
profile for faulty behavior F is presented in Figure 1. Based on the evaluation mapping between
a specific behavior and the faulty-behavior profile, if a match is found, the corresponding
fault can be identified together with all its possible causes.
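To make the matching concrete, the following Python sketch represents each fault profile as a
set of (PLC signal, state) pairs, as in (3), and maps a matched fault to its relevant causes,
as in (4). The signal names follow examples used later in this paper; the cause lists are
purely illustrative.

# A minimal sketch of the evaluation mapping between PLC state and
# fault profiles.
FAULT_PROFILES = {
    "pump_overload": {("P3_OL", 1)},
    "overheat": {("TS2", 1), ("P3_TD", 1)},
}
FAULT_CAUSES = {
    "pump_overload": ["blocked impeller", "bearing wear"],
    "overheat": ["cooling water missing", "thermal switch tripped"],
}

def diagnose(plc_state):
    """Return (fault, causes) pairs whose whole profile matches the PLC state."""
    matches = []
    for fault, profile in FAULT_PROFILES.items():
        if all(plc_state.get(signal) == state for signal, state in profile):
            matches.append((fault, FAULT_CAUSES[fault]))
    return matches

print(diagnose({"P3_OL": 1, "TS2": 0, "P3_TD": 0}))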
2.2
Knowledge validation
In traditional real-time control systems, the control problem and its implementation through
control algorithms are based on exact knowledge of the controlled plant, usually determined
from the plant's mathematical model. Real-time knowledge-based control systems combine the
analytical process model with conventional process control while reasoning about the current,
past and future situation in order to assess on-going developments and plan appropriate
actions. Such systems allow the application to be structured into a model that is capable of
behaving and reasoning when taking a decision, as human specialists do. Generally, a full
control strategy requires not only variable (parameter) identification, state estimation and
control, but also checking the validity of the data and process models before they are used
in estimations. However, there is a relatively high degree of uncertainty concerning the
plant, starting from the mathematical model itself: there is no a priori knowledge of some
parameters (for example, parameters for achieving the stability conditions of a feedback
control), or the plant behavior may not be deterministic [4].
Figure 1: The relationship between behavior space B, configuration space S and the required
profile for faulty behavior F
There is an important concern when using knowledge-based techniques for real-time control
systems: the need to validate the system's knowledge, determining whether it accurately
represents an expert's knowledge in the particular domain. In this respect, simulation, when
available, can be a very useful tool that provides a general overview of the system dynamics.
2.3
Integrating knowledge-based software with real-time software
The real-time control software and the rule-based software differ in their underlying execution
models. Procedural software generally uses an imperative model in which the software engineer
determines the sequence of actions, while rule-based systems follow a general control scheme
of matching, selecting and executing rules. Consequently, a knowledge-based real-time diagnosis
system should be thought of as an extension of the existing control software, interacting both
with the knowledge database constructed using artificial and model-based acquisition and with
the real-time data acquired from the PLC source code execution. A possible structure is
presented in Figure 2.
Diagnostic reasoning is based on the knowledge base as well as on real-time data from the
real-time database. The reasoning mechanism is based on the logic control of faults. Thus, it
uses the logical expression for faults presented in (3), where each term represents a possible
cause of a fault indicated by a specific PLC signal. By comparing the fault state with the
current state of the signals from the PLC real-time database, the occurrence of a fault state
can be identified. Furthermore, by using the system's profile mapping to devices and associated
causes, as presented in (4), the associated devices and the causes of the fault can be shown.
2.4
Real-time reasoning
The first attempts at using knowledge-based systems for real-time process control involved
static expert systems that take a snapshot of plant data. Static expert systems use pattern
matching over a set of facts and rules. Without time constraints this approach proves
practical, but when time constraints come into the picture it may no longer be adequate.
Figure 2: Structure of a diagnostic system for a real-time control system
The elements that should be considered in this situation are: temporal reasoning and responding
within a given response time.
In the controlled system, variables such as temperatures and pressures may vary in time;
therefore, it is essential to include a time representation together with the possibility of
reasoning about time. Thus, additional time information should be attached to the elementary
entities of the knowledge base. Also, the rules defined in the expert system may be extended
with specific temporal constructs, such as: the operator always, to formulate the premise of
a rule; qualitative statements that refer to the relation of time points or time intervals
without exact time specification (earlier, after and similar); and quantitative statements
that allow expressing conditions at an exact point in time (for example, at 12:00:00 p.m.).
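As a sketch of how such temporal constructs can be evaluated over the real-time database, the
following Python fragment checks an "always" premise over a time window and a condition at an
exact time point; the sampled history is hypothetical.

from datetime import datetime, timedelta

history = [  # (timestamp, temperature) samples from the real-time database
    (datetime(2012, 1, 1, 11, 58), 81.0),
    (datetime(2012, 1, 1, 11, 59), 83.5),
    (datetime(2012, 1, 1, 12, 0), 85.2),
]

def always(samples, predicate, since, now):
    """True if 'predicate' held for every sample in [since, now]."""
    window = [v for t, v in samples if since <= t <= now]
    return bool(window) and all(predicate(v) for v in window)

now = datetime(2012, 1, 1, 12, 0)
# premise of a rule: temperature always above 80 degC in the last 2 minutes
print(always(history, lambda v: v > 80.0, now - timedelta(minutes=2), now))
# quantitative statement: value at exactly 12:00:00
print(dict(history)[datetime(2012, 1, 1, 12, 0)])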
Generally, the basic characteristic of a system that guarantees a certain response time is its
determinism. Knowledge-based systems are by their nature non-deterministic, because the
inference time depends on the given situation. If the imposed real-time deadlines are shorter
than the maximum searching time needed for a certain inference, the response-time requirements
cannot be met.
In order to meet those requirements, three strategies could be applied: implementing algorithms
to quantitatively estimate the maximum searching time; reducing the inference searching time;
or defining an embedded diagnosis approach that integrates the diagnosis models into the PLC
control program, so that faults can be diagnosed in real time.
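As an illustration of the second strategy, the following Python sketch bounds the inference
search with a deadline and returns the (possibly partial) set of matches found so far; the
candidate faults and the match predicate are placeholders, not the paper's actual rule set.

import time

def bounded_inference(candidates, matches, deadline_s):
    """Evaluate fault candidates until the deadline expires."""
    start = time.monotonic()
    found = []
    for fault in candidates:
        if time.monotonic() - start > deadline_s:
            break  # deadline reached: report what we have so far
        if matches(fault):
            found.append(fault)
    return found

faults = ["overheat", "vibration", "lack_of_voltage"]
print(bounded_inference(faults, lambda f: f == "vibration", deadline_s=0.01))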
Including all the diagnosis in the PLC also has disadvantages, since it creates a much more
complex control program. Integrating inference rules into the control program itself
complicates the rules and makes the introduction of new rules more difficult. In addition,
integrating the different kinds of knowledge and information of these systems is a tedious
process. A mixed approach could be a better solution: only for critical faults should the
diagnosis part, with corrective actions, be included in the PLC, while the rest remain in the
charge of the knowledge-based system. In the next section, a way of employing knowledge-based
techniques for extending the existing automatic control and monitoring system for the
geothermal plant from the University of Oradea is presented.
3
The automatic control and monitoring system structure
The automatic control and monitoring system of the geothermal plant from the University of
Oradea is an example of a real-time control system that has been developed using a combination
of a PLC (Programmable Logic Controller) and a PC (for the user interface and supervisory
control). From the structural point of view, the controlled plant is composed of three parts:
the well station, the pump station and the heat station.
The system functions in the following way: first, the geothermal water is extracted from the
well station using a deep-well pump if the necessary flow rate is greater than the artesian
one; the water is then stored in a reservoir tank, which acts as an accumulator and also
separates the production network from the distribution network. From the reservoir tank the
water is pumped, through the pump station, to the heat station, where the water is not used
directly but through four heat exchangers; the water that comes out of these heat exchangers
flows into the distribution network and heats the university campus buildings.
The structure of the control system consists of a PLC that controls the geothermal heating
system based on a control program embedded in the controller, connected to a PC that hosts the
user interface for the operator, implemented using Wonderware InTouch software [15]. The
InTouch display management subsystem handles display call-up, real-time display update, data
entry and process schematics. It also maintains the PLC real-time and historical database,
which can be used to follow the time evolution of certain parameters or for statistical
calculations. The real-time database provided includes maintenance of historical data in
addition to the current values of the process variables. In the current implementation, the
decision process for finding possible solutions to several faults, such as sensor and other
equipment faults, implies switching the plant operation into a safety mode and employing the
operator to trace the fault, its effects and related causes. Also, the PLC control program
deals with the diagnosis of only a few critical faults. Based on the current situation of the
geothermal plant and the current research in the domain, a proposal for extending the existing
control system with a knowledge-based system is developed and described further in this paper.
3.1
Analysis phase. Knowledge acquisition and validation
The main issue when constructing a knowledge-based system is the way in which the description
(knowledge) is built up in accordance with the plant behavior and structure. Consequently, the
development of a knowledge-based system for the existing plant and control system implies
selecting the most important characteristics of the system, which will be used to construct
the knowledge database. Moreover, the system's knowledge needs to be validated, that is, it
must be determined whether it accurately represents an expert's knowledge in the particular
domain. In this respect, the simulator developed for the geothermal plant of the University of
Oradea proves to be of great help. The simulator, which was previously developed, provides a
simplified physical model of the plant dynamics together with the PLC control, formulated as
an easier-to-operate computer simulation: the ACSL solution to the model equations. Key
elements of the improved ease of operation are the use of the general-purpose simulation
language ACSL (Advanced Continuous Simulation Language) and pre-programmed modules of all
important plant components, including control elements [16]. In the development phase of the
control system, the simulator provided a useful tool for testing the system specifications,
including the adopted control strategy; it can also be employed in the process of step-by-step
knowledge acquisition and for further validation and updating. Based on the information
gathered from simulation and from the control program development, knowledge acquisition can
be achieved. For example, if we refer to possible faults, a structured notation could be used
when doing knowledge acquisition,
based on tables, as shown in Table 1.
Table 1: Faults and associated switches in the PLC (partial)
Digital input                                            Fault notation
Overload protection switch, pumps P3/P4                  P3_OL, P4_OL
Thermal switch, pumps P3/P4                              P3_TD, P4_TD
Overheat in the heat station                             TS2
Local emergency stop switch in the pump station panel    ES1
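Such a table translates directly into machine-readable data that the knowledge-acquisition
step could consume. A minimal Python sketch of Table 1 (the mapping itself, not the paper's
actual implementation):

FAULT_TABLE = {
    "overload protection switch, pumps P3/P4": ["P3_OL", "P4_OL"],
    "thermal switch, pumps P3/P4": ["P3_TD", "P4_TD"],
    "overheat in the heat station": ["TS2"],
    "local emergency stop switch in the pump station panel": ["ES1"],
}
for description, signals in FAULT_TABLE.items():
    print(", ".join(signals), "-", description)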
3.2
Design phase. Knowledge representation
Knowledge acquisition is the most important part of developing a knowledge-based system, but
the way in which knowledge representation is done is also an important point. Representation
is tied to the production rules, which should be expressed according to the knowledge system
that is used. Current knowledge-based industrial systems are generally built within shells,
which package a combination of tools. Different shells may include different features useful
for real-time control applications, such as: hierarchies of objects; associative knowledge,
relating objects in the form of connections and relations; rules and an associated inference
engine; analytic knowledge, such as functions, formulas and differential-equation simulation;
real-time features such as time stamping and validity intervals for variables; history keeping;
and a run-time environment. In this respect, the current literature presents several expert
systems that can be used for real-time process control, the best known being G2 and JESS.
The G2 real-time expert system [13] allows the integration of models and rules, combining
model-based and artificial knowledge representation, and is based on an inference engine that
can use generic forms of knowledge, interpreted for specific instances in the domain. It is
specifically designed for process control and related applications and allows the process
engineer to implement and manage the expert system. However, even if G2 is claimed to be
real-time, there is no mention of verifiability from the temporal point of view.
The Java Expert System Shell, or JESS [14], is inspired by the artificial-intelligence
production rule language CLIPS and is a fully developed Java API for creating rule-based
expert systems. Even if it is architecturally inspired by CLIPS, it exhibits a LISP-like
syntax. It consists of three components: the rules (knowledge base), the working memory (fact
base, corresponding to the real-time base) and an inference engine (rule engine). JESS uses
the Rete ("ree-tee") algorithm to match patterns. The Rete algorithm is not time-predictable,
but two newer algorithms, Treat and Leaps, have introduced some optimizations from this point
of view; they are likewise not time-predictable but, with some restrictions, could be made
time-predictable.
Consequently, we consider developing a JESS-based solution for our system. Thus, a knowledge
model can be constructed by observing the main problems associated with each specific item
(for example, pumps P3/P4 from the pump station) in the operational system and creating a
model (Table 2) that presents the correlation between problems and items that can be solved
by the expert system.
Table 2: Correlation model
SLOT                          ITEM      PROBLEMS
Situation (for pumps P3/P4)   Engine    Overheating, Vibration, Lack of voltage
Afterwards, for each identified problem, the possible causes are identified and associated,
generating a knowledge model (Table 3).
Table 3: Knowledge model
SLOT: Cause
INFERENCE LEVEL     TASK LEVEL
Overheating         Cooler water missing, Temperature too high in the pump station
Vibration           Bent axle, Motor base snapped
Lack of voltage     Disruption of wires, General pump station switch off
After establishing the knowledge model, the design phase is dedicated to structuring the rules
according to the requirements of the JESS inference engine. JESS uses the notion of frames,
which are hierarchical representations that include several components (slot, facet, datum
and how). A frame definition is made by using deftemplate, which has the following generic
form [2]:
(deftemplate <deftemplate-name> [extends <classname>]
  [<doc-comment>]
  [(slot <slot-name>
    [(default | default-dynamic <value>)]
    [(type <typespec>)])]*)
For example, the template for pumps P3/P4 will look the following way:
(deftemplate P3/P4
  (slot situation (default NF))
  (slot cause (default NF)))
where NF stands for Not Found. The JESS knowledge base, composed of rules, can then be built
on top of the previous frame definitions. The slots in the JESS rule structures are unified by
pattern matching, performed by the inference engine through the Rete algorithm [8].
3.3
Overall proposed architecture
The proposed modifications of the existing architecture are illustrated in Figure 3. As an
additional element, it employs a separate computer on which the JESS expert system runs,
connected to the PC on which the user interface was developed and on which the PLC historical
and real-time databases reside. The historical and real-time databases are, in our case,
created and updated by InTouch in real time, based on the associated PLC inputs.
Figure 3: The Knowledge System architecture
In the proposed architecture, the main issue is the way in which the collaboration of the
several components is achieved, because the PLC real-time database has a specific storage
format, defined by InTouch. A conversion from this format is needed in order to interact with
the knowledge base that resides on the expert system's PC. One possibility is to use an XML
file that stores all the tagnames of the faults, generated from the PLC database. The idea is
to use the XML file to feed the knowledge base and to act as an intermediate level between the
expert system and the PLC real-time database. This file can then be used as input by a JESS
function in order to check its values; afterwards, for all tagnames that are asserted
(indicating a fault), the rule patterns should be evaluated. Consequently, the collaboration
of the architecture's components can be achieved by integrating the JESS engine with the
real-time database through the XML file.
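A minimal Python sketch of this XML intermediate level, assuming illustrative tagnames and
file name (the real generation step from the InTouch database is, as noted below, still open):

import xml.etree.ElementTree as ET

def export_faults(plc_state, path):
    """Write every asserted fault tagname (value == 1) to an XML file."""
    root = ET.Element("faults")
    for tagname, value in plc_state.items():
        if value == 1:
            ET.SubElement(root, "fault", tagname=tagname)
    ET.ElementTree(root).write(path)

def load_faults(path):
    """Read the tagnames back; each would drive a JESS fact assertion."""
    return [e.get("tagname") for e in ET.parse(path).getroot().findall("fault")]

export_faults({"P3_OL": 1, "TS2": 0, "ES1": 1}, "faults.xml")
print(load_faults("faults.xml"))  # ['P3_OL', 'ES1']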
There are several issues that are not completely defined in our architecture. For example, the
way in which the XML file is generated from the PLC real-time database is still an open
question. Also, after completing the implementation, the system performance should be evaluated
in order to prove that the solution performs satisfactorily.
4
Conclusions and Future Work
Knowledge-based systems are making significant contributions to real-time process control
applications. Their applications are often in areas which complement traditional process control technology, like, for instance, diagnosis and handling abnormal situations. They integrate
knowledge-based techniques with conventional control, having significant benefits in overall quality management. But, a knowledge-based system operating in a real-time situation will typically
need to respond to a changing environment involving asynchronous flow of events and dynamically changing requirements with limitations on time, hardware, and other resources. Determining how fast this system can respond under all possible situations is a difficult problem that
requires using flexible software architecture in order to provide the necessary reasoning on rapidly
changing data. In this paper, various issues on applying knowledge-based techniques in real-time
control systems have been presented; starting from this foundation, an implementation structure of a knowledge-based system for the existing automatic control and monitoring system for
the geothermal plant from the University of Oradea was analyzed and an overall architecture is
proposed for further implementation. The proposed knowledge-based approach focuses mainly on
the general architecture and on component collaboration. The architecture does not include
time-constraint validation; it relies only on the performance of the JESS engine. A further
solution, creating a specific expert system that includes temporal reasoning, will also be
investigated in the future.
Bibliography
[1] Singh A., Verma M., Real Time Expert System - Its Applications, IJCST Vol. 1, Issue no. 2,
ISSN : 2229-4333 (Print) | ISSN : 0976-8491 (Online), 2010.
[2] Farias O., Labidi S., Neto J. F., Albuquerque S., A Real Time Expert System For Decision
Making in Rotary Railcar Dumpers, Automation Control - Theory and Practice, A. D. Rodic (Ed.),
ISBN: 978-953-307-039-1, InTech, available from:
http://www.intechopen.com/books/automation-control-theory-and-practice/a-realtime-expert-system-for-decision-making-in-rotary-railcar-dumpers, 2009.
[3] El-Desouky A. I., Arafat H. A., Laban S., Implementing Knowledge-Based Systems in Monitoring Real-Time Systems, Geophysical Research Abstracts, Vol. 10, EGU2008-A-12044,
SRef-ID: 1607-7962/gra/EGU2008-A-12044, EGU General Assembly, 2008.
[4] Eremeev A., Varshavskiy P., Case-Based Reasoning Method for Real-Time Expert Diagnostics
Systems, International Journal "Information Theories and Applications" Vol.15, 2008.
[5] Wai K. S., Latif Abd., Rahman B. A., Zaiyadi M. F., Aziz A. Abd., Expert System in Real
World Applications,
[6] Bubnicki Z., Modern Control Theory, ISBN-10: 3-540-23951-0, ISBN-13: 978-3-540-23951-2,
Springer-Verlag Berlin Heidelberg, 2005.
[7] Bubnicki Z., Knowledge-based and Learning Control Systems, Control Systems, Robotics and
Automation, Encyclopedia of Life Support Systems, vol. XVII, 2005.
[8] Grissa-Touzi A., Ounally H., Boulila A., VISUAL JESS: AN Expandable Visual Generator
of Oriented Object Expert systems, World Academy of Science, Engineering and Technology,
2005.
[9] Alexander A. J., How to Use Expert Systems in Real World Applications,
http://conferences.embarcadero.com/article/32093, 2004.
[10] Vagin V.N., Eremeev A.P., Certain Basic Principles of Designing Real-Time Intelligent
Decision Systems, Journal of Computer and Systems Sciences International, v. 40(6), pp.
953-961, 2001.
[11] Laffey T. J., Cox P. A., Schmidt J. L., Kao S. M., Read J. Y., Real-Time Knowledge Based
Systems, AI Magazine Volume 9 Number 1, 1988.
[12] Charpillet F., Look on the Issue of Building Real-Time Knowledge Based Systems - Research
Summary, Association for the Advancement of Artificial Intelligence Technical Report WS97-06, 1997.
[13] http://www.gensym.com/
[14] http://herzberg.ca.sandia.gov/jess/
[15] http://global.wonderware.com/EN/Pages/WonderwareInTouchHMI.aspx
[16] MGA software, Advanced continuous simulation language (ACSL) - reference manual,1995
Author index
Alfaro M.D., 8
Arivudainambi D., 18
Baradaran A.A., 61
Bologa G., 87
Breaban M., 30
Burca V., 87
Butaci C., 87
Canete L., 37
Colda R., 111
Copie A., 42
Cordova F.M., 50
Delavar A.G., 61
Donoso Y., 79
Dragicevic S., 105
Fortiş T.-F., 42
Gabor G., 166
Graur A., 136
Hinojosa R.A., 161
Jovanovic Z., 105
Krneta R., 105
Lavric A., 136
Leyton G., 50
Lonea A.M., 70
Luchian H., 30
Mahendiran T.V., 153
Maldonado-Lopez F.A., 79
Moldovan A., 111
Munteanu V.I., 42
Nagy M., 87
Neghina D., 97
Oddershede A.M., 161
Palade T., 111
Peulic A., 105
Popescu D.E., 70
Potorac A.D., 136
Puschita E., 111
Rekha D., 18
Scarlat E., 97
Sepúlveda J.M., 8
Shin S.Y., 127
Silaghi H., 166
Simion D., 136
Stanojević B., 146
Stanojević M., 146
Thangam P., 153
Thanushkodi K., 153
Tianfield H., 70
Ulloa J.A., 8
Ursuleanu M.F., 136
Vancea C., 166
Vermesan I., 111
Watkins F.J., 161
Zmaranda D., 166