Genetic Learning of the Knowledge Base of a Fuzzy System by Using the Linguistic 2-Tuples Representation

Rafael Alcalá, Jesús Alcalá-Fdez, and Francisco Herrera
Department of Computer Science and Artificial Intelligence
University of Granada
18071 Granada, Spain
E-mail: {alcala, jalcala, herrera}@decsai.ugr.es

José Otero
Department of Computer Science
University of Oviedo
Campus de Viesques, 33203 Gijón, Spain
E-mail: [email protected]
Abstract— One of the problems associated with linguistic Fuzzy Modeling is its lack of accuracy when modeling some complex systems. To overcome this problem, many different possibilities for improving the accuracy of linguistic fuzzy modeling have been considered in the specialized literature, while maintaining the desired trade-off between accuracy and interpretability.
Recently, a new linguistic rule representation model was presented to perform a genetic lateral tuning of membership functions. It is based on the linguistic 2-tuples representation model, which allows the lateral displacement of a label considering a unique parameter. This involves a reduction of the search space that eases the derivation of optimal models.
Based on the linguistic 2-tuples representation model, in this work we present a new method to obtain linguistic fuzzy systems by means of an evolutionary learning of the data base a priori (granularity and lateral displacements of the membership functions) and the use of a basic rule generation method to obtain the whole knowledge base. In this way, the search space reduction provided by the linguistic 2-tuples representation helps the evolutionary search technique to obtain more precise and compact knowledge bases. Moreover, we analyze this approach considering a real-world problem.
I. INTRODUCTION
One of the problems associated with linguistic Fuzzy Modeling (FM), i.e., the modeling of systems by building a linguistic model clearly interpretable by human beings, is its lack of accuracy when modeling some complex systems. This is due to the inflexibility of the concept of linguistic variable, which imposes hard restrictions on the fuzzy rule structure [1]. This drawback sometimes leads linguistic FM to move away from the desired trade-off between interpretability and accuracy, thus losing the usefulness of the model finally obtained.
To overcome this problem, many different possibilities to improve the accuracy of linguistic FM while preserving its intrinsic interpretability have been considered in the specialized literature [2]. A great number of these approaches share the common idea of improving the way in which the linguistic fuzzy model performs the interpolative reasoning by inducing a better cooperation among the rules composing it. This rule cooperation may be encouraged by acting on three different model components: the data base (DB), containing the parameters of the linguistic partitions; the rule base (RB), containing the set of rules; and the whole knowledge base (KB), containing both the RB and the DB.
This cooperation improvement involves an extension of the linguistic FM and can be achieved either by means of methods considering the learning of the RB and the DB, or by using post-processing mechanisms that are applied to improve the system behavior once the RB and the DB have been obtained. A particular case of post-processing was proposed in [3], where a new linguistic rule representation model was presented for the genetic tuning of the DB. This approach is based on the linguistic 2-tuples representation [4], which allows the lateral displacement of the labels considering a unique parameter per label. In this way, two main objectives were achieved:
• to obtain linguistic labels containing a set of samples with a better covering degree (accuracy improvements) while maintaining their original shapes, and
• to reduce the search space with respect to the classical tuning in order to obtain optimal models more easily.
Such a learning scheme starts from an initial DB and an initial RB that remain fixed during the whole tuning process. For this reason, the learning process will be biased by the initial DB and RB, leaving fixed important aspects such as the granularity of the fuzzy partitions or the number and structure of the obtained rules. Therefore, a greater degree of cooperation between these two tasks (RB and DB learning) would be desirable in order to obtain models with a good accuracy-interpretability trade-off.
With this aim, we propose an evolutionary method to obtain whole KBs based on the learning of the granularity and the linguistic 2-tuples rule representation model, which at the same time learns the optimal number of labels per variable and the lateral displacement of such labels and, from this, obtains the corresponding RB by means of a simple rule generation method. As an example, we will analyze this technique by solving a real-world problem from both the accuracy and the interpretability points of view.
This contribution is arranged as follows. The next section
describes the linguistic rule representation model based on
the linguistic 2-tuples. The learning scheme considered for
the learning of KBs is introduced in Section III. Section IV
proposes the evolutionary learning method considered in this
work. Section V shows an experimental study of the method's behavior applied to a real-world estimation problem. Finally,
Section VI points out some concluding remarks.
II. THE LINGUISTIC 2-TUPLES REPRESENTATION
In [3], a new model of tuning of Fuzzy Rule-Based Systems (FRBSs) was proposed considering the linguistic 2-tuples representation scheme introduced in [4], which allows the lateral displacement of the support of a label while maintaining the interpretability associated with the obtained linguistic FRBSs. This proposal also introduces a new model for rule representation based on the symbolic translation concept.
[Fig. 1. Lateral Displacement of the Linguistic Label M. The figure shows a fuzzy partition with labels N, MB, B, M, A, MA and P, in which the label M is laterally displaced to a new position, y2.]

Figure 1 shows the lateral displacement of the label M. The new label "y2" is located between B and M, being smaller than M but closer to it.

The symbolic translation of a linguistic term is a number within the interval [-0.5, 0.5) that expresses the domain of a label when it is moving between its two lateral labels (interval [-1, 1]). Formally, we have the couple

(si, αi), si ∈ S, αi ∈ [-0.5, 0.5).
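To make the lateral displacement concrete, here is a minimal Python sketch (our own illustration, not code from the paper): it shifts a triangular label by α, scaled by the distance between the vertexes of adjacent labels in a uniform partition, so that α = -0.5 moves the label halfway towards its left neighbor.

```python
def triangular(x, a, b, c):
    """Membership degree of x in the triangular fuzzy set (a, b, c)."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def displace(label, alpha, step):
    """Apply the 2-tuple (label, alpha): shift the whole triangle laterally.

    'step' is the distance between the vertexes of two adjacent labels,
    and alpha lies in [-0.5, 0.5).
    """
    a, b, c = label
    d = alpha * step
    return (a + d, b + d, c + d)

# Uniform partition of [0, 1] with vertexes every 0.25; M is the middle label.
M = (0.25, 0.50, 0.75)
y2 = displace(M, -0.3, step=0.25)   # the 2-tuple (M, -0.3), between B and M
print(y2, triangular(0.45, *y2))
```

Note that the shape of the label is preserved; only its position changes, which is what keeps the tuned partitions interpretable.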
In [4], both the linguistic 2-tuples representation model and the elements needed for linguistic information comparison and aggregation were presented and applied to the Decision Making framework. In the context of FRBSs, we are going to see its use in the linguistic rule representation. Next, we present this approach considering a simple control problem.

Let us consider a control problem with two input variables, one output variable and a DB defined by experts determining the membership functions for the following labels:

Error, Error Variation → {N, Z, P}, Power → {L, M, H}.

[Fig. 2. Classical Rule and Rule with 2-Tuple Representation.
Classical Rule:
R1: If the Error is Zero and the Error Variation is Positive then the Power is High.
Rule with 2-Tuples Representation:
R1: If the Error is (Zero, 0.3) and the Error Variation is (Positive, -0.2) then the Power is (High, -0.1).]

Figure 2 shows the concept of a classical rule and of a linguistic 2-tuples represented rule. Analyzed from the rule interpretability point of view, we could interpret the obtained rule as:

If the Error is "higher than Zero" and the Error Variation is "a little smaller than Positive" then the Power is "a bit smaller than High".

This proposal decreases the learning problem complexity, since the 3 parameters considered per label are reduced to only 1 symbolic translation parameter.

In [3], two different rule representation approaches were proposed: a global approach and a local approach. In our particular case, the learning is applied at the level of the linguistic partition (considering the global approach). In this way, the couple (Xi, label) takes the same tuning value in all the rules where it is considered. For example, Xi is (High, 0.3) will present the same value in all those rules in which the couple "Xi is High" is initially considered.

III. LEARNING OF THE KNOWLEDGE BASE

Classically, the learning of the KB of an FRBS is based on the existence of a previous definition of the DB [5], [6]. Generally, defining this DB involves choosing a number of linguistic terms for each linguistic variable and setting the values of the system parameters by a uniform distribution of the linguistic terms. However, this operation mode causes the DB to have a significant influence on the performance of the FRBS finally obtained, since the obtained RB depends on the goodness of the DB.

In other works, the DB and the RB are obtained simultaneously [7]. In this way, they have the possibility of generating better definitions of the KB, but they deal with a larger search space that makes it very difficult to obtain an optimal solution.

[Fig. 3. Learning scheme of the KB: a GA-based DB learning process wraps an ad-hoc rule learning process, with an evaluation module assessing the whole KB (DB + RB).]

Another way to generate the whole KB consists of obtaining the DB and the RB separately, based on a DB learning a priori [8]–[14] (see Figure 3). This way of working allows us to learn the most adequate context for each fuzzy partition, which is necessary in different application contexts and for different fuzzy rule extraction models.

The learning scheme considered in this work belongs to this last group and is comprised of two main components:
• A process to learn the DB, which allows us to define:
– The number of labels for each linguistic variable.
– The lateral displacements of such labels.
We consider simple triangular membership functions.
• A quick ad-hoc data-driven method to derive the RB [6] considering the DB previously obtained (a minimal sketch of such a method follows this list). This method is run for each DB definition generated by a Genetic Algorithm (GA), thus allowing the proposed hybrid learning process to finally obtain the whole definition of the KB (DB and RB) by means of the cooperative action of both methods.
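Since the ad-hoc method referenced here is that of Wang and Mendel [6], the following sketch shows our simplified reading of it (reusing `triangular` from the earlier sketch; names are ours): each example generates the rule formed by the best-covering label of every variable, and conflicts between rules sharing an antecedent are resolved by keeping the one with the highest importance degree.

```python
def best_label(value, partition):
    """Index and membership degree of the label that best covers 'value'."""
    degrees = [triangular(value, *label) for label in partition]
    i = max(range(len(degrees)), key=degrees.__getitem__)
    return i, degrees[i]

def wang_mendel(data, input_parts, output_part):
    """Derive an RB as a map: antecedent label indices -> consequent index."""
    candidates = {}   # antecedent -> (importance degree, consequent)
    for *xs, y in data:
        antecedent, degree = [], 1.0
        for value, part in zip(xs, input_parts):
            i, mu = best_label(value, part)
            antecedent.append(i)
            degree *= mu
        c, mu = best_label(y, output_part)
        degree *= mu
        antecedent = tuple(antecedent)
        # keep only the highest-degree rule per antecedent
        if degree > candidates.get(antecedent, (0.0, None))[0]:
            candidates[antecedent] = (degree, c)
    return {ant: cons for ant, (_, cons) in candidates.items()}
```

Each DB proposed by the GA simply changes `input_parts` and `output_part`, so the RB can be regenerated cheaply for every chromosome.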
IV. EVOLUTIONARY ALGORITHM FOR LEARNING OF THE KNOWLEDGE BASE
The automatic definition of fuzzy systems can be considered as an optimization or search process, and nowadays Evolutionary Algorithms, particularly GAs, are among the best known and most widely used global search techniques. Moreover, the genetic coding that they use allows them to include prior knowledge and to use it to guide the search. For this reason, Evolutionary Algorithms have been successfully applied to learn fuzzy systems in recent years, giving way to the appearance of the so-called Genetic Fuzzy Systems [15], [16].
Evolutionary Algorithms in general, and GAs in particular, have been widely used in the tuning of FRBSs. In this work, we will consider the use of GAs to design the proposed learning algorithm. CHC [17] is a GA that presents a good trade-off between exploration and exploitation, making it a good choice for problems with complex search spaces. This genetic model makes use of a "Population-based Selection" approach: M parents and their corresponding offspring are combined, and the best M individuals are selected to take part in the next population.
Considering this approach, the learning process of the DB has to define both the granularity of the linguistic partitions and the lateral displacements of the involved labels. For this reason, a double coding scheme is considered (granularity + displacements).
In the following, the components needed to design the
evolutionary learning process are explained. They are: DB codification, chromosome evaluation, initial gene pool, crossover
operator and restarting approach. A scheme of the algorithm is
shown in Figure 4 considering the Wang and Mendel Method.
[Fig. 4. Scheme of the algorithm: initialize the population and the threshold; crossover of M parents; evaluation of the new individuals (WM derives the RB); selection of the best M individuals; if THRESHOLD < 0.0, restart the population.]
A. DB Codification
A double coding scheme (C = C1 + C2) to represent both parts, granularity and translation parameters, is considered:
• Number of labels (C1): This part is a vector of integer numbers of size N (N being the number of system variables). The possible numbers of labels considered are in the set {3, ..., 9}:

C1 = (L1, ..., LN).
• Lateral displacements (C2): This part is a vector of real numbers of size N × 9 (N variables with a maximum of 9 linguistic labels per variable) in which the displacements of the different labels are coded for each variable. Of course, if a chromosome does not have the maximum number of labels in one of its variables, the space reserved for the values of those labels is ignored in the evaluation process (see the coding sketch after this list). In this way, the C2 part has the following structure (where each gene is associated with the tuning value of the corresponding label):

C2 = (α^1_1, ..., α^1_L1, ..., α^N_1, ..., α^N_LN).
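A direct transcription of this double coding scheme could be the following sketch (the function names are ours):

```python
import random

MAX_LABELS = 9   # C2 reserves 9 displacement genes per variable

def random_chromosome(n_vars):
    """Double coding scheme C = C1 + C2."""
    c1 = [random.randint(3, 9) for _ in range(n_vars)]               # granularities
    c2 = [random.uniform(-0.5, 0.5) for _ in range(n_vars * MAX_LABELS)]
    return c1, c2

def used_displacements(c1, c2, var):
    """Genes of C2 actually used for variable 'var'; the rest are ignored."""
    start = var * MAX_LABELS
    return c2[start:start + c1[var]]
```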
B. Chromosome Evaluation
To evaluate a given chromosome we apply the well-known rule generation method of Wang and Mendel [6] to the DB coded by that chromosome. To do so, each membership function is first displaced to its new position in the interval defined between the vertexes of its two lateral labels. Once the whole KB is obtained, the Mean Square Error (MSE) is computed and the following function is minimized:

FC = w1 · MSE + w2 · NR,

where NR is the number of rules of the obtained KB (to penalize a large number of rules), w1 = 1 and w2 is computed from the MSE and the number of rules of the KB generated from a DB considering the maximum number of labels (9 labels) and without considering the displacement parameters:

w2 = α · MSE_max-lab / NR_max-lab,

with α being a weighting percentage given by the system expert that determines the trade-off between accuracy and complexity. Values higher than 1.0 search for linguistic models with few rules, and values lower than 1.0 search for linguistic models with high accuracy. A good neutral choice is, for example, 1.0 (good accuracy and not too many rules).
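Under the definitions above, the objective function reduces to a few lines (a sketch; MSE_max-lab and NR_max-lab are assumed to come from evaluating the baseline KB with 9 labels per variable and zero displacements):

```python
def fitness(mse, n_rules, mse_max_lab, nr_max_lab, alpha=1.0):
    """F_C = w1 * MSE + w2 * NR, with w1 = 1."""
    w2 = alpha * mse_max_lab / nr_max_lab
    return mse + w2 * n_rules
```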
For the fuzzy inference, we have selected the minimum t-norm to play the role of the implication and conjunctive operators, and the center of gravity weighted by the matching degree as the defuzzification strategy.
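For symmetric triangular consequents, the centroid of a min-clipped set stays at the apex, so this inference scheme admits a compact sketch (our illustration under that assumption; `rules` is the output of the earlier Wang-Mendel sketch):

```python
def infer(rules, input_parts, output_part, xs):
    """Minimum t-norm + center of gravity weighted by the matching degree."""
    num = den = 0.0
    for antecedent, consequent in rules.items():
        # matching degree of the rule: minimum over the antecedent memberships
        h = min(triangular(x, *input_parts[v][i])
                for v, (x, i) in enumerate(zip(xs, antecedent)))
        if h > 0.0:
            center = output_part[consequent][1]   # apex of the consequent triangle
            num += h * center
            den += h
    return num / den if den > 0.0 else None
```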
C. Initial Gene Pool
The initial population will be comprised of two different
parts (with the same number of chromosomes):
• In the first part, each chromosome has the same number
of labels for all the problem variables and considers
strong fuzzy partitions with translation parameters initialized to zero.
• In the second part, the only change is that each variable
could have a different number of labels.
Since CHC has no mutation operator, the translation parameters remain unchanged at this stage (crossing two chromosomes whose displacements are all zero yields zero displacements) and the most promising number of labels is obtained for each linguistic variable. The algorithm operates in this way until the first restart is reached.
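Such an initialization could be sketched as follows (reusing `random` and MAX_LABELS from the coding sketch above):

```python
def initial_population(pop_size, n_vars):
    """First half: uniform granularity per chromosome; second half: mixed."""
    population = []
    for k in range(pop_size):
        if k < pop_size // 2:
            g = random.randint(3, 9)
            c1 = [g] * n_vars                    # same granularity for all variables
        else:
            c1 = [random.randint(3, 9) for _ in range(n_vars)]
        c2 = [0.0] * (n_vars * MAX_LABELS)       # strong partitions, no displacement
        population.append((c1, c2))
    return population
```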
D. Crossover Operator
Two different crossover operators are considered, depending on the scope of the two parents, to obtain two offspring:
• When the parents encode different granularity levels in any variable, a crossover point is randomly generated in C1 and the classical crossover operator is applied at this point in both parts, C1 and C2 (exploration).
• When both parents have the same granularity level per variable, an operator based on the concept of environments (the offspring are generated around one parent) is applied only to the C2 part (exploitation). These kinds of operators show good cooperation when they are introduced within evolutionary models that force convergence by pressure on the offspring (as is the case of CHC). In particular, we consider the Parent Centric BLX (PCBLX) operator [18], which is based on BLX-α. Figure 5 depicts the behavior of these kinds of operators.
The PCBLX is described as follows (see the sketch after this list). Let us assume that X = (x1 · · · xn) and Y = (y1 · · · yn), with xi, yi ∈ [ai, bi] ⊂ R, i = 1 · · · n, are two real-coded chromosomes that are going to be crossed. PCBLX generates the offspring Z = (z1 · · · zn), where zi is a randomly (uniformly) chosen number from the interval [li, ui], with li = max{ai, xi − I}, ui = min{bi, xi + I}, and I = |xi − yi|. The parents X and Y are named differently: X is called the female parent and Y the male parent. In this way, by taking X as female parent (Y as male), and then by taking Y as female parent (X as male), our algorithm generates two offspring.
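The PCBLX description above translates directly into code (our sketch, reusing the `random` import from the coding sketch; per-gene bounds are [-0.5, 0.5) for the C2 part):

```python
def pcblx(x, y, bounds):
    """One offspring centered on the female parent x.

    z_i is uniform in [l_i, u_i], where l_i = max(a_i, x_i - I),
    u_i = min(b_i, x_i + I) and I = |x_i - y_i|.
    """
    z = []
    for xi, yi, (ai, bi) in zip(x, y, bounds):
        i_width = abs(xi - yi)
        li, ui = max(ai, xi - i_width), min(bi, xi + i_width)
        z.append(random.uniform(li, ui))
    return z

def pcblx_pair(x, y, bounds):
    """Two offspring: X as female (Y male), then Y as female (X male)."""
    return pcblx(x, y, bounds), pcblx(y, x, bounds)
```

Note that when both parents carry identical genes (I = 0), the offspring replicate them, which is why the all-zero displacements of the initial pool persist until the first restart.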
[Fig. 5. Scheme of the behavior of the BLX and PCBLX operators.]

On the other hand, CHC makes use of an incest prevention mechanism, i.e., two parents are crossed only if their Hamming distance divided by 2 is over a predetermined threshold, L. This mechanism is only considered in order to apply the PCBLX operator. Since we consider a real coding scheme (the C2 part is going to be crossed), we transform each gene considering a Gray code with a fixed number of bits per gene (BITSGENE) determined by the system expert. In this way, the threshold value is initialized as:

L = (#Genes_C2 · BITSGENE) / 4.0.

Following the original CHC scheme, L is decremented by one when no cross is performed in one generation. In order to avoid very slow convergence, in our case L is also decremented by one when no improvement is achieved with respect to the best chromosome of the previous generation.

E. Restarting Approach

To get away from local optima, a restarting mechanism is considered [17] when the threshold value L is lower than zero. In this case, all the chromosomes set their C1 parts to that of the best global solution, and the parameters of their C2 parts are generated at random within the interval [-0.5, 0.5). Moreover, if the best global solution has changed since the last restarting point, it is included in the population (the exploitation continues while there is convergence). This operation mode was initially proposed by the CHC authors as a possibility to improve the algorithm performance when it is applied to solve some kinds of problems [17].
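A restart step consistent with this description might look as follows (a sketch; `best` is the best global solution found so far):

```python
def restart(population, best):
    """Reset every C1 to the best solution's granularities; randomize C2."""
    best_c1, best_c2 = best
    new_pop = [(list(best_c1),
                [random.uniform(-0.5, 0.5) for _ in best_c2])
               for _ in population]
    new_pop[0] = (list(best_c1), list(best_c2))   # reinsert the best solution
    return new_pop
```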
V. EXPERIMENTS AND ANALYSIS OF RESULTS

To analyze the behavior of the proposed method (learning of the granularity together with the global lateral displacements), several experiments have been carried out considering a real-world problem: the estimation of the maintenance costs of the medium voltage electrical network in a town [19]. This problem handles four input variables and therefore involves a large search space. A short description of this problem can be found in the following subsection.
TABLE I
METHODS CONSIDERED FOR THE EXPERIMENTAL STUDY

Ref.   Method    Type of Learning
[6]    WM        Ad-hoc data-driven method
[5]    COR       Cooperative rules by using the best-worst ant system
[10]   GA+WM     Granularity + scaling factors + contexts + RB, by DB learning a priori and using WM
[8]    GA+COR    Granularity + scaling factors + contexts + RB, by DB learning a priori and using COR
—      GLD+WM    Granularity + global lateral displacements + RB, by DB learning a priori and using WM
Table I presents a brief description of the studied methods. The WM and COR algorithms are considered as simple rule generation methods to obtain RBs from a predefined DB. Two methods to obtain complete KBs are considered for comparison. They are based on a DB learning a priori, obtaining the granularity, scaling factors and contexts. The proposed method (GLD+WM) and the GA+WM method integrate the WM algorithm within their own DB learning process as the mechanism to obtain the RB. GA+COR integrates the best-worst ant system-based COR algorithm to perform this task.
In the case of WM and COR, the initial linguistic partitions are comprised of five linguistic terms with triangular-shaped fuzzy sets giving meaning to them (the number of labels with which they presented the best behavior). Finally, the following values
have been considered for the parameters of each method¹: 50 individuals and 50,000 evaluations; 0.6 and 0.2 as crossover and mutation probabilities in the case of COR, GA+COR and GA+WM. The α factor for the fitness function of GLD+WM was set to 1, 3 and 5, in order to obtain models with different levels of accuracy and simplicity. The number of bits for the Gray codification is 30 bits per gene.

¹ With these values we have tried to ease the comparisons, selecting standard common parameters that work well in most cases instead of searching for very specific values for each method. Moreover, we have set a large number of evaluations in order to allow the compared algorithms to achieve an appropriate convergence. No significant changes were achieved by increasing that number of evaluations.
A. Problem Description: Estimating the Maintenance Costs of Medium Voltage Lines

Estimating the maintenance costs of the medium voltage electrical network in a town [19] is a complex but interesting problem. Since a direct measure is very difficult to obtain, the consideration of models becomes useful. These estimations allow electrical companies to justify their expenses. Moreover, the model must be able to explain how a specific value is computed for a certain town. Our objective will be to relate the maintenance costs of the medium voltage line with the following four variables: sum of the lengths of all streets in the town, total area of the town, area occupied by buildings, and energy supply to the town. We will deal with estimations of minimum maintenance costs based on a model of the optimal electrical network for a town, in a sample of 1,059 towns.
To develop the different experiments in this contribution, we consider a 5-fold cross-validation model, i.e., 5 random partitions of the data at 20% each, with the combination of 4 of them (80%) used for training and the remaining one for testing. In this way, 5 partitions with 80% (847 towns) for training and 20% (212 towns) for testing are considered in the experiments.
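This partitioning scheme can be reproduced generically as follows (a sketch, not the authors' exact splits; `random` is imported in the coding sketch above):

```python
def five_fold_partitions(data, seed=0):
    """Yield 5 random 80%/20% train-test partitions of the data."""
    rng = random.Random(seed)
    indices = list(range(len(data)))
    rng.shuffle(indices)
    folds = [indices[k::5] for k in range(5)]     # 5 disjoint ~20% folds
    for k in range(5):
        test_idx = set(folds[k])
        train = [data[i] for i in indices if i not in test_idx]
        test = [data[i] for i in folds[k]]
        yield train, test
```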
TABLE II
RESULTS OBTAINED BY THE STUDIED METHODS

Method              #R     MSEtra   σtra   t-test   MSEtst   σtst   t-test
WM                  65     56136    1498     –      56359    4686     –
COR                 41     39640     566     –      41683    1599     –
GA+WM               51.1   23014    2143     –      24090    3667     –
GA+COR              17.8   20360    1561     –      22830    3259     –
GLD+WM (α = 1)      57.5   10218    1044     ★      12088    1972     ★
GLD+WM (α = 3)      41.2   13074    2040     –      15196    2757     –
GLD+WM (α = 5)      30.7   16884    2822     –      18943    3649     –
B. Results and Analysis

For each of the 5 data partitions, the tuning methods have been run 6 times, showing for each problem the averaged results over a total of 30 runs. Moreover, a t-test (with 95 percent confidence) was applied, comparing the best averaged result in training or test one by one with the averaged results of the remaining methods.

The results obtained by the analyzed methods are shown in Table II, where #R stands for the number of rules, MSEtra and MSEtst respectively stand for the averaged error obtained over the training and test data, σ for the standard deviation, and the t-test columns represent the following information:

★ denotes the best averaged result;
– denotes significantly worse behavior than the best.
Analyzing the results presented in Table II we can point out
the following conclusions:
• Although the proposed method (learning of the RB, granularity and lateral displacements) can learn partitions presenting a high granularity, the RB is obtained by means of a generation method that learns few rules (57.5, 41.2 or 30.7 out of the 6,561 rules possible when the four input partitions consider nine labels each, 9 × 9 × 9 × 9), which, together with the objective of minimizing the number of rules, allows us to obtain accurate but compact models that are, therefore, more interpretable. Notice that the GA+COR method obtains the linguistic models with the smallest number of rules. This is due to the rule simplification performed by COR during the RB learning, which results in linguistic models with too few rules.
• The method proposed in this work shows an important reduction of the mean squared error over the training and test data in a problem with a large search space, being robust to random factors. We can see that the proposed algorithm does not present significant deviations in the results, its standard deviation being one of the lowest.
• The consideration of a unique parameter per membership function reduces the search space with respect to the classical learning of membership functions, which usually considers 3 or 4 parameters (in the case of triangular or trapezoidal membership functions). Therefore, this learning approach represents an ideal framework to be combined with other learning schemes, since such combinations greatly increase the problem search space. This is the case of the combination with the learning of the RB and the granularity of the linguistic partitions, the GLD+WM method. Furthermore, the linguistic models so obtained remain highly interpretable, since the original shapes of the initial membership functions are maintained.
Figures 6 and 7 respectively depict the evolved fuzzy linguistic partitions and the RB obtained by the GLD+WM method in one of the 30 runs performed with α = 5. To ease the graphical representation, in these figures the labels are named from 'l1' to 'lLi'. Nevertheless, such labels have an associated linguistic meaning determined by an expert. In this way, if the 'l1' label of the 'X1' variable represents 'LOW', 'l1+0.11' could be interpreted as 'a little smaller than LOW' (based on the expert opinion) or, as in the case of some classical learning approaches, as maintaining the original meaning of such a label. The latter is the case of Figure 6, where the new labels could maintain their initial meanings.
[Fig. 6. DB without lateral displacements (in gray) and DB with lateral displacements (in black) of a model obtained by GLD+WM with α = 5.]

[Fig. 7. RB and displacements of a model obtained by GLD+WM with α = 5. #R: 32, MSE-tra: 13680.317, MSE-tst: 14857.941.]
VI. CONCLUSIONS

This work presents a new method for learning KBs by means of an a priori evolutionary learning of the DB (granularity and translation parameters). It makes use of the linguistic 2-tuples rule representation model presented in [3]. In the following, we present our conclusions and further works:
• The learning scheme used and the 2-tuples rule representation model allow an important reduction of the search space, obtaining more precise and compact models.
• Since a global approach is considered and the shapes of the initial linguistic partitions are preserved, the interpretability of the obtained models is maintained at a high level with respect to the classical learning of fuzzy systems.

An interesting further question is the accomplishment of a wider analysis of the existing relation between problem complexity and the different methods presented in this work. At this moment, we are working on this issue to present a complete study of these kinds of techniques.

The use of different FRBS learning schemes considering the 2-tuples rule representation model seems to be a good approach to obtain more compact and precise models. As further work, we propose other smart combinations, e.g., applying more powerful rule generation approaches such as iterative rule learning, cooperative rule learning, etc.

ACKNOWLEDGMENT

Supported by the CICYT Project TIC-2002-04036-C05-01.

REFERENCES
[1] A. Bastian, "How to handle the flexibility of linguistic variables with applications," International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, vol. 3, no. 4, pp. 463–484, 1994.
[2] J. Casillas, O. Cordón, F. Herrera, and L. Magdalena, Eds., Accuracy Improvements in Linguistic Fuzzy Modeling. Springer-Verlag, 2003.
[3] R. Alcalá and F. Herrera, "Genetic tuning on fuzzy systems based on the linguistic 2-tuples representation," in Proc. of the 2004 IEEE International Conference on Fuzzy Systems, vol. 1, Budapest (Hungary), 2004, pp. 233–238.
[4] F. Herrera and L. Martínez, "A 2-tuple fuzzy linguistic representation model for computing with words," IEEE Trans. Fuzzy Syst., vol. 8, no. 6, pp. 746–752, 2000.
[5] J. Casillas, O. Cordón, I. F. de Viana, and F. Herrera, "Learning cooperative linguistic fuzzy rules using the best-worst ant system algorithm," International Journal of Intelligent Systems, 2005, in press.
[6] L. Wang and J. Mendel, "Generating fuzzy rules by learning from examples," IEEE Trans. Syst., Man, Cybern., vol. 22, no. 6, pp. 1414–1427, 1992.
[7] L. Magdalena and F. Monasterio-Huelin, "A fuzzy logic controller with learning through the evolution of its knowledge base," International Journal of Approximate Reasoning, vol. 16, no. 3-4, pp. 335–358, 1997.
[8] J. Casillas, O. Cordón, F. Herrera, and P. Villar, "A hybrid learning process for the knowledge base of a fuzzy rule-based system," in Proc. of the 2004 International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems, vol. 3, Perugia (Italy), 2004, pp. 2189–2196.
[9] O. Cordón, F. Herrera, and P. Villar, "Generating the knowledge base of a fuzzy rule-based system by the genetic learning of the data base," IEEE Trans. Fuzzy Syst., vol. 9, no. 4, pp. 667–674, 2001.
[10] O. Cordón, F. Herrera, L. Magdalena, and P. Villar, "A genetic learning process for the scaling factors, granularity and contexts of the fuzzy rule-based system data base," Information Sciences, vol. 136, pp. 85–107, 2001.
[11] B. Filipic and D. Juricic, "A genetic algorithm to support learning fuzzy control rules from examples," in Genetic Algorithms and Soft Computing, F. Herrera and J. L. Verdegay, Eds. Physica-Verlag, 1996, pp. 403–418.
[12] W. Pedrycz, "Associations and rules in data mining: A link analysis," Int. Journal of Intelligent Systems, vol. 19, no. 7, pp. 653–670, 2004.
[13] D. Simon, "Sum normal optimization of fuzzy membership functions," International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, vol. 10, no. 4, pp. 363–384, 2002.
[14] Y. Teng and W. Wang, "Constructing a user-friendly GA-based fuzzy system directly from numerical data," IEEE Trans. Syst., Man, Cybern. B, vol. 34, no. 5, pp. 2060–2070, 2004.
[15] O. Cordón, F. Herrera, F. Hoffmann, and L. Magdalena, Genetic Fuzzy Systems: Evolutionary Tuning and Learning of Fuzzy Knowledge Bases, ser. Advances in Fuzzy Systems - Applications and Theory. World Scientific, 2001, vol. 19.
[16] O. Cordón, F. Gomide, F. Herrera, F. Hoffmann, and L. Magdalena, "Ten years of genetic fuzzy systems: Current framework and new trends," Fuzzy Sets and Systems, vol. 141, no. 1, pp. 5–31, 2004.
[17] L. Eshelman, "The CHC adaptive search algorithm: How to have safe search when engaging in nontraditional genetic recombination," in Foundations of Genetic Algorithms, G. Rawlins, Ed. Morgan Kaufmann, 1991, vol. 1, pp. 265–283.
[18] F. Herrera, M. Lozano, and A. Sánchez, "A taxonomy for the crossover operator for real-coded genetic algorithms: An experimental study," International Journal of Intelligent Systems, vol. 18, pp. 309–338, 2003.
[19] O. Cordón, F. Herrera, and L. Sánchez, "Solving electrical distribution problems using hybrid evolutionary data analysis techniques," Applied Intelligence, vol. 10, pp. 5–24, 1999.