Empirical Investigation of the Benefits of Partial Lamarckianism

Christopher R. Houck
Box 7906
Department of Industrial Engineering
North Carolina State University
Raleigh, NC 27695-7906
[email protected]

Jeffery A. Joines
Box 7906
Department of Industrial Engineering
North Carolina State University
Raleigh, NC 27695-7906
[email protected]

Michael G. Kay
Box 7906
Department of Industrial Engineering
North Carolina State University
Raleigh, NC 27695-7906
[email protected]

James R. Wilson
Box 7906
Department of Industrial Engineering
North Carolina State University
Raleigh, NC 27695-7906
[email protected]
Abstract
Genetic algorithms (GAs) are very efficient at exploring the entire search space; however, they are relatively poor at finding the precise local optimal solution in the region in which the algorithm converges. Hybrid GAs are the combination of improvement procedures, which are good at finding local optima, and GAs. There are two basic strategies for using hybrid GAs. In the first, Lamarckian learning, the genetic representation is updated to match the solution found by the improvement procedure. In the second, Baldwinian learning, improvement procedures are used to change the fitness landscape, but the solution that is found is not encoded back into the genetic string. This paper examines the issue of using partial Lamarckianism (i.e., the updating of the genetic representation for only a percentage of the individuals), as compared to pure Lamarckian and pure Baldwinian learning in hybrid GAs. Multiple instances of five bounded nonlinear problems, the location-allocation problem, and the cell formation problem were used as test problems in an empirical investigation. Neither a pure Lamarckian nor a pure Baldwinian search strategy was found to consistently lead to quicker convergence of the GA to the best known solution for the series of test problems. Based on a minimax criterion (i.e., minimizing the worst-case performance across all test problem instances), the 20% and 40% partial Lamarckianism search strategies yielded the best mixture of solution quality and computational efficiency.
Keywords
Lamarckian evolution, Baldwin effect, local improvement procedures, hybrid genetic algorithms.
1. Introduction
Genetic algorithms (GAs) are a powerful set of global search techniques that have been shown to produce very good results for a wide class of problems. Genetic algorithms search the solution space of a function through the use of simulated "Darwinian" evolution. In general, the fittest individuals of any population tend to reproduce and survive to the next
© 1997 by the Massachusetts Institute of Technology
Evolutionary Computation 5(1): 31-60
generation, thus improving successive generations. However, inferior individuals can, probabilistically, survive and also reproduce. Genetic algorithms have been shown to solve linear and nonlinear problems by exploring all regions of the state space and exponentially exploiting promising areas through mutation, crossover, and selection operations applied to individuals in the population. Whereas traditional search techniques use characteristics of the problem (objective function) to determine the next sampling point (e.g., gradients, Hessians, linearity, and continuity), GAs make no such assumptions. Instead, the next sampled points are determined based on stochastic sampling/decision rules, rather than a set of deterministic decision rules. Therefore, evaluation functions of many forms can be used, subject to the minimal requirement that the function can map the population into a totally ordered set. A more complete discussion of GAs, including extensions and related topics, can be found in the books by Davis (1991), Goldberg (1989), Mitchell (1996), and Michalewicz (1996).
Hybrid GAs use local improvement procedures as part of the evaluation of individuals. For many problems there exists a well-developed, efficient search strategy for local improvement (e.g., hill-climbers for nonlinear optimization). These local search strategies complement the global search strategy of the GA, yielding a more efficient overall search strategy. The issue of whether or not to recode the genetic string to reflect the result of the local improvement has given rise to research into Lamarckian search strategies (i.e., recoding the string to reflect the improvement) and Baldwinian search strategies (i.e., leaving the string unaltered). In one investigation, Whitley, Gordon, and Mathias (1994) found that, for three test problems, utilizing a Baldwinian search strategy was superior in terms of finding the optimal solution for the Schwefel function, whereas Lamarckian search performed better on the Griewank function, and both strategies worked equally well on the Rastrigin function. Although most approaches to hybrid GAs use either a pure Lamarckian or a pure Baldwinian search strategy, neither is optimal for all problems. Therefore, this work concentrates on a partial Lamarckian strategy, where neither a pure Lamarckian nor pure Baldwinian search strategy is used; rather, a mix of Lamarckian and Baldwinian search strategies is used.
This paper presents an empirical analysis of the use of GAs utilizing local improvement procedures to solve a variety of test problems. Section 2 describes how GAs can be hybridized by incorporating local improvement procedures into the algorithm. Incorporating local improvement procedures gives rise to the concepts of Lamarckian evolution, the Baldwin effect, and partial Lamarckianism, all of which are explained in Section 2. Also, the concept of a one-to-one genotype to phenotype mapping is reviewed, where genotype refers to the space that the GA searches, whereas the phenotype refers to the space of the actual problem. The concepts introduced in Section 2 are applied to three different classes of problems described in Section 3: five different multimodal nonlinear function optimization problems, the location-allocation problem, and the manufacturing cell formation problem. Section 4 presents the results of a series of empirical tests of the various search strategies described in Section 2.
2. Hybridizing GAs with Local Improvement Procedures
Many researchers (Chu & Beasley, 1995; Davis, 1991; Joines, King, & Culbreth, 1995; Michalewicz, 1996; Renders & Flasse, 1996) have shown that GAs perform well for global searching because they are capable of quickly finding and exploiting promising regions of the search space, but they take a relatively long time to converge to a local optimum. Local improvement procedures (e.g., pairwise switching for permutation problems and gradient
descent for unconstrained nonlinear problems) quickly find the local optimum of a small region of the search space but are typically poor global searchers. They have been incorporated into GAs in order to improve the algorithm's performance through individual "learning." Such hybrid GAs have been used successfully to solve a wide variety of problems (Davis, 1991; Joines et al., 1995; Renders & Flasse, 1996).
There are two basic models of evolution that can be used to incorporate learning into a GA: the Baldwin effect and Lamarckian evolution. The Baldwin effect allows an individual's fitness (phenotype) to be determined based on learning (i.e., the application of local improvement). Like natural evolution, the result of the improvement does not change the genetic structure (genotype) of the individual. Lamarckian evolution, in addition to using learning to determine an individual's fitness, changes the genetic structure of the individual to reflect the result of the learning. Both "Baldwinian learning" and "Lamarckian learning" have been investigated in conjunction with hybrid GAs.
2.1 Baldwin Effect
The Baldwin effect, as utilized in GAs, was first investigated by Hinton and Nowlan (1987) using a flat landscape with a single well representing the optimal solution. Individuals were allowed to improve by random search, which in effect transformed the landscape to include a funnel around the well. Hinton and Nowlan (1987) showed that, without learning, the GA fails; however, when hybridized with the random search, the GA is capable of finding the optimum.
Whitley et al. (1994) demonstrate that “exploiting the Baldwin effect need not require
a needle in a haystack and improvements need not be probabilistic.” They show how using
a local improvement procedure (a bitwise steepest ascent algorithm performed on a binary
representation) can in effect change the landscape of the fitness function into flat landscapes
around the basins of attraction. This transformation increases the likelihood of allocating
more individuals to certain attraction basins containing local optima. In a comparison of
Baldwinian and Lamarckian learning, Whitley et al. (1994) show that utilizing either form
of learning is more effective than the standard GA approach without the local improvement
procedure. They also argue that although Lamarckian learning is faster, it may be susceptible
to premature convergence to a local optimum compared with Baldwinian learning. Three
numerical optimization problems were used to test this conjecture; however, the results were
inconclusive. They believe that these test functions were too easy and that harder problems
may in fact show that a Baldwinian search strategy is preferred.
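The kind of bitwise steepest-ascent procedure Whitley et al. describe can be sketched as follows. This is a minimal illustration, not their implementation; the one-max fitness used in the example is purely illustrative, not one of their test problems.

```python
def bitwise_steepest_ascent(bits, fitness, max_passes=100):
    """Repeatedly flip the single bit that most improves fitness.

    Stops when no one-bit flip improves the current string, i.e., at a
    local optimum under the one-bit-flip neighborhood.
    """
    current = list(bits)
    best_f = fitness(current)
    for _ in range(max_passes):
        best_flip, best_flip_f = None, best_f
        for i in range(len(current)):
            current[i] ^= 1            # tentatively flip bit i
            f = fitness(current)
            current[i] ^= 1            # undo the flip
            if f > best_flip_f:
                best_flip, best_flip_f = i, f
        if best_flip is None:          # no improving flip: local optimum
            break
        current[best_flip] ^= 1
        best_f = best_flip_f
    return current, best_f

# Illustrative fitness: count of ones (one-max), maximized at all-ones.
improved, value = bitwise_steepest_ascent([0, 1, 0, 0], sum)
```

Applied to every individual before fitness assignment, such a procedure flattens each basin of attraction to the fitness of its local optimum, which is the landscape transformation described above.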
2.2 Lamarckian Evolution
Lamarckian evolution forces the genotype to reflect the result of the local search. This
results in the inheritance of acquired or learned characteristics that are well adapted to the
environment. The improved individual is placed back into the population and allowed to compete for reproductive opportunities. However, Lamarckian learning inhibits the schema processing capabilities of GAs (Gruau & Whitley, 1993; Whitley et al., 1994). Changing the
genetic information in the chromosomes results in a loss of inherited schema, altering the
statistical information about hyperplane partitions implicitly contained in the population.
Although schema processing may not be an issue given the higher order cardinality representations for the test problems used in this paper, the problem of premature convergence
remains the same.
Baldwinian learning, on the other hand, certainly aggravates the problem of multiple
genotype to phenotype mappings. A genotype refers to the composition of the values in the
chromosome or individual in the population, whereas a phenotype refers to the solution that is constructed from a chromosome. In a direct mapping, there is no distinction between genotypes and phenotypes. For example, to optimize the function 5 cos(x_1) - sin(2 x_2), a typical representation for the chromosome would be a vector of real numbers (x_1, x_2), which provides a direct mapping to the phenotype. However, for some problems, a direct mapping is neither possible nor desired (Nakano, 1991). The most common example of this is the TSP problem with an ordinal representation. Here, the genotype is represented by an ordered list of n cities to visit, for example, (c_(1), c_(2), ..., c_(n)). However, the phenotype of the TSP is a tour, and any rotation of the chromosome yields the same tour; thus, any rotation of a genotype results in the same phenotype. For example, the two tours (1, 2, 3, 4) and (3, 4, 1, 2) have different genotypes because their gene strings are different; but both strings represent the same tour and thus have the same phenotype.
It has been noted that having multiple genotypes map to the same phenotype may confound the GA (Homaifar, Guan, & Liepins, 1993; Nakano, 1991). This problem also occurs when a local improvement procedure (LIP) is used in conjunction with a GA. Consider the example of optimizing sin(x). Suppose a simple gradient-based LIP is used to determine the fitness of a chromosome. Then, any genotype in the range [-π/2, 3π/2] will have the same phenotype value of 1 at π/2.
2.3 Partial Lamarckian
Hybrid GAs need not be restricted to operating in either a pure Baldwinian or pure Lamarckian manner. Instead, a mix of both strategies, or what is termed partial Lamarckianism, could be employed. For example, a possible strategy is to update the genotype to reflect the resulting phenotype in 50% of the individuals. Clearly, a 50% partial Lamarckian strategy has no justification in natural evolution, but for simulated evolution it provides an additional search strategy.
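The partial Lamarckian evaluation loop can be sketched as follows. This is a minimal illustration of the idea, not the authors' implementation; the toy rounding improver in the usage example is an assumption for demonstration only.

```python
import random

def evaluate(population, fitness, improve, p_lamarck=0.2, rng=random):
    """Hybrid-GA evaluation with partial Lamarckianism.

    Every individual is locally improved and its fitness taken from the
    improved solution (the Baldwinian change to the fitness landscape);
    with probability p_lamarck the improved solution is also written
    back into the genotype (the Lamarckian update).
    """
    fitnesses = []
    for i, genotype in enumerate(population):
        phenotype = improve(genotype)          # local improvement procedure
        fitnesses.append(fitness(phenotype))
        if rng.random() < p_lamarck:           # partial Lamarckian write-back
            population[i] = phenotype
    return fitnesses

# Toy usage: the "improver" rounds a real gene to the nearest integer.
pop = [2.4]
fits = evaluate(pop, lambda x: -abs(x), lambda g: round(g), p_lamarck=1.0)
```

Setting p_lamarck to 0.0 gives pure Baldwinian search, 1.0 gives pure Lamarckian search, and intermediate values give the partial Lamarckian strategies studied in this paper.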
Orvosh and Davis (1994) advocated the use of a 5% rule for updating individuals when employing repair functions in GAs to solve constrained optimization problems. All infeasible solutions generated by the GA are repaired to the feasible domain to determine their fitness. The 5% rule dictates that 5% of the infeasible individuals have their genetic representation updated to reflect the repaired feasible solution. This partial updating was shown on a set of combinatorial problems to be better than either no updating or always updating. However, Michalewicz and Nazhiyath (1995) determined that a higher percentage of update, 20% to 25%, did better when using repair functions to solve continuous constrained nonlinear programming problems.
Previous research has concentrated either on the comparison of pure Lamarckian and pure Baldwinian search or on the effectiveness of partial repair for constrained optimization. This work examines the use of partial Lamarckianism with regard to the use of local search on a set of bounded nonlinear and combinatorial optimization test problems.
3. Experimentation
To investigate the trade-off between disrupted schema processing (or premature convergence to a local optimum) in Lamarckian learning and multiple genotypes mapping to the same phenotype in Baldwinian learning, a series of experiments using local improvement procedures as evaluation functions were run on several different test problems. For each test problem, the GA was run with varying levels of Lamarckianism (i.e., updating the genotype to reflect the phenotype), as well as a pure genetic approach (i.e., no local improvement). The GA was
run with no Lamarckianism (i.e., a pure Baldwinian search strategy), denoted as 0% update,
as well as 100% Lamarckianism (i.e., a pure Lamarckian search strategy). Combinations of
both strategies (i.e., partial Lamarckian search strategies) were used to determine whether
there was a trend between the two extremes and whether combinations of Baldwinian and
Lamarckian learning were beneficial. In these experiments, individuals were updated to
match the resulting phenotype with a probability of 5%, 10%, 20%, 40%, 50%, 60%, 80%, 90%, and 95%. These values were chosen to examine the effects at a broad variety of levels, but focusing attention at each extreme. For each problem instance, the GA was replicated 30 times, with common random seeds across the different search strategies. The GA and parameters used in these experiments are described in Appendix A. The GA is of a form advocated by Michalewicz (1996) and uses a floating-point representation with three real-valued crossovers and five real-valued mutations. However, for the cell formation problems,
the GA is modified to work with integer variables. The GA was terminated when either the
optimal solution was found or after 1 million function evaluations were performed. Using
function evaluations takes into account the cost of using a LIP (i.e., the number of function
evaluations generated by the LIP), thus allowing a direct comparison between the hybrid
GA methods and the pure GA (i.e., no local improvement).
Multiple instances of seven different test problems were used in this investigation to
ensure that any effect observed (1) held across different classes of problems as well as different sized instances of the problem, and (2) was not due to some unknown structure of the specific class of problem selected, specific instance chosen, or the particular local improvement procedure used. The first five test problems are bounded nonlinear optimization test
problems taken from the literature, whereas the last two problems are the location-allocation
and manufacturing cell formation problems. For each of the nonlinear optimization problems, three different sized instances (n = 2,10,20) were used in the study. For both the
location-allocation and cell formation problems, two different sized instances were used.
3.1 Brown’s Almost Linear Function
The first test problem comes from a standard test set of nonlinear optimization problems
(Moré, Garbow, & Hillstrom, 1981) and is as follows:

f(x) = \sum_{i=1}^{n} f_i(x)^2

where

f_i(x) = x_i + \sum_{j=1}^{n} x_j - (n + 1),  i = 1, ..., n - 1,
f_n(x) = \prod_{j=1}^{n} x_j - 1.
The function is not linearly separable and has the basic form of a nonlinear least squares problem. Three different instances of this problem were used for this investigation: a 2-dimensional instance (Brown-2), a 10-dimensional instance (Brown-10), and a 20-dimensional instance (Brown-20).
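As a sketch, the function can be implemented directly; this assumes the standard Moré-Garbow-Hillstrom sum-of-squares form of Brown's almost linear function, under which the all-ones vector is a global minimizer with objective value zero.

```python
import math

def brown_almost_linear(x):
    """Brown's almost linear function (sum-of-squares form).

    Residuals: f_i = x_i + sum(x) - (n + 1) for i < n,
               f_n = prod(x) - 1.
    """
    n = len(x)
    s = sum(x)
    residuals = [x_i + s - (n + 1) for x_i in x[:-1]]
    residuals.append(math.prod(x) - 1.0)
    return sum(r * r for r in residuals)
```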
3.2 Corana Function
The next test problem consists of three instances of the modified Corana problem: a 2-, 10-, and 20-dimensional instance (Corana-2, Corana-10, and Corana-20, respectively). This
problem comes from a family of continuous nonlinear multimodal functions developed by Corana, Marchesi, Martini, and Ridella (1987) to test the efficiency and effectiveness of simulated annealing compared to that of stochastic hill-climbing and Nelder-Mead optimization methods. This parameterized family of test functions contains a large number of local minima. The function can be described as an n-dimensional parabola with rectangular pockets removed and with the global optimum always occurring at the origin. These functions are parameterized with the number of dimensions (n), the number of local optima, and the width and height of the rectangular pockets. As formulated in Corana, Marchesi, Martini, and Ridella (1987), the function proved to be too easy: All the learning methods found the optimal solution every time with the same amount of computational effort. Therefore, the first condition in the definition of g(x_i) below was modified so that rectangular troughs, instead of pockets, are removed from the function. This was accomplished by eliminating the restriction that all, instead of any, x_i satisfy k_i s_i - t_i <= x_i <= k_i s_i + t_i, with not all k_i equal to zero.
The modified Corana function is as follows:

f(x) = \sum_{i=1}^{n} c_i g(x_i),   c_i \in \bar{c},   x_i \in [-10000, 10000]   (2)

where

g(x_i) = 0.1 z_i^2,  if k_i s_i - t_i \le x_i \le k_i s_i + t_i for some integer k_i;
g(x_i) = x_i^2,  otherwise;

z_i = k_i s_i + t_i,  if k_i < 0;   z_i = 0,  if k_i = 0;   z_i = k_i s_i - t_i,  if k_i > 0;

\bar{c} = (1, 1000, 10, 100, 1, 10, 100, 1000, 1, 10, 100, 1000, 1, 10, 100, 1000, 1, 10, 100, 1000),
s_i = 0.2,   t_i = 0.05.
3.3 Griewank Function
The Griewank function is the third test problem, again with three instances of 2, 10, and 20 dimensions, respectively (Griewank-2, Griewank-10, and Griewank-20). The Griewank function is a bounded optimization problem used by Whitley et al. (1994) to compare the effectiveness of Lamarckian and Baldwinian search strategies. The Griewank function, like Brown's almost linear function, is not linearly separable and is as follows:

f(x) = 1 + \sum_{i=1}^{n} \frac{x_i^2}{4000} - \prod_{i=1}^{n} \cos\left(\frac{x_i}{\sqrt{i}}\right)
3.4 Rastrigin Function
‘l‘hc next test problem, with three instances of 2, 10, and 20 dimensions (Rastrigin-2,
Rastrigin- 10, and Rastrigin-20), is also a bounded minimization problem taken from Whitley
et al. (1094). T h e functional form is as follows:
3.5 Schwefel Function
The last nonlinear test problem, with three instances of 2, 10, and 20 dimensions (Schwefel-2, Schwefel-10, and Schwefel-20), is also a bounded minimization problem taken from Whitley et al. (1994). The functional form is as follows:

f(x) = nV + \sum_{i=1}^{n} -x_i \sin\left(\sqrt{|x_i|}\right),   x_i \in [-512, 511]

where

V = -\min\left\{ -x \sin\left(\sqrt{|x|}\right) : x \in [-512, 511] \right\}

and V depends on system precision; for our experiments, V = 418.9828872721625.
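For reference, the three functions can be implemented as follows. This sketch assumes the standard forms from the literature (Whitley et al.'s variants may differ in domain details) and takes the Schwefel constant V from the text.

```python
import math

V = 418.9828872721625  # Schwefel offset so the global minimum is near zero

def griewank(x):
    s = sum(xi * xi for xi in x) / 4000.0
    p = math.prod(math.cos(xi / math.sqrt(i + 1)) for i, xi in enumerate(x))
    return 1.0 + s - p

def rastrigin(x):
    return 10.0 * len(x) + sum(xi * xi - 10.0 * math.cos(2.0 * math.pi * xi)
                               for xi in x)

def schwefel(x):
    return V * len(x) + sum(-xi * math.sin(math.sqrt(abs(xi))) for xi in x)
```

All three are minimized at or near well-known points: the origin for Griewank and Rastrigin (value 0), and roughly x_i = 420.97 per coordinate for Schwefel.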
3.6 Sequential Quadratic Programming
For these five nonlinear optimization test problems, a sequential quadratic programming
(SQP)¹ optimization method was used as the LIP. SQP is a technique used to solve nonlinear
programming problems of the following form:

Minimize f(x),   s.t. x \in X

where, for n_i inequality and n_e equality constraints, X is the set of points x \in R^n satisfying

bl \le x \le bu,   g_j(x) \le 0,  j = 1, ..., n_i,   h_j(x) = 0,  j = 1, ..., n_e

with bl \in R^n; bu \in R^n; f : R^n \to R smooth; g_j : R^n \to R, j = 1, ..., n_i, nonlinear and smooth; and h_j : R^n \to R, j = 1, ..., n_e, nonlinear and smooth.
SQP works by first computing the descent direction by solving a standard quadratic
program using a positive definite estimate of the Hessian of the Lagrangian. A line search
is then performed to find the feasible minimum of the objective along the search direction.
This process is repeated until the algorithm converges to a local minimum or exceeds a
certain number of iterations. For the purposes of this study, SQP was stopped after 25
iterations if it had not found a local optimum. A full description of the SQP used in this
study can be found in Lawrence, Zhou, and Tits (1994).
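The role the LIP plays in the hybrid GA can be illustrated with a simple stand-in local improver. This is not the Lawrence-Zhou-Tits SQP code; it is a derivative-free sketch showing the interface the experiments rely on: a capped iteration count and an explicit tally of the function evaluations charged to the LIP.

```python
def local_improve(f, x, max_iters=25, step=0.5, tol=1e-8):
    """Stand-in LIP: coordinate descent with step halving.

    Returns the improved point, its objective value, and the number of
    function evaluations consumed, so a hybrid GA can charge the LIP's
    cost against its overall evaluation budget.
    """
    evals = 0
    best = f(x)
    evals += 1
    x = list(x)
    for _ in range(max_iters):
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                trial = list(x)
                trial[i] += delta
                v = f(trial)
                evals += 1
                if v < best:
                    x, best, improved = trial, v, True
        if not improved:
            step /= 2.0                # refine the step once stuck
            if step < tol:
                break
    return x, best, evals

x_opt, v_opt, n_evals = local_improve(lambda p: sum(t * t for t in p),
                                      [3.0, -2.0])
```

Counting evaluations this way is what allows the direct comparison, described below, between hybrid GA variants and the pure GA under a common budget of 1 million function evaluations.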
3.7 Location-Allocation Problem
The continuous location-allocation (LA) problem, as described in Houck, Joines, and Kay
(1996), is used as the next test problem. The LA problem, a type of nonlinear integer program, is a multifacility location problem in which both the location of n new facilities (NFs) and the allocation of the flow requirements of m existing facilities (EFs) to the new facilities are determined so that the total transportation costs are minimized. The LA problem is a difficult optimization problem because its objective function is neither convex nor concave, resulting in multiple local minima (Cooper, 1972). Optimal solution techniques are limited to small problem instances (less than 25 EFs for general l_p distances [Love & Juel, 1982] and up to 35 EFs for rectilinear distances [Love & Morris, 1975]). Two rectilinear
distance instances of this problem were used in the investigation: a 200-EF by a 20-NF instance (LA-200) and a 250-EF by a 25-NF instance (LA-250), both taken from Houck et al.
(1996).
¹ SQP is available in the Matlab environment (Grace, 1992) and as Fortran and C code (Lawrence et al., 1994).
Given m EFs located at known points a_j, j = 1, ..., m, with associated flow requirements w_j, j = 1, ..., m, and the number of NFs n, from which the flow requirements of the EFs are satisfied, the LA problem can be formulated as the following nonlinear program:

Minimize \sum_{i=1}^{n} \sum_{j=1}^{m} w_{ij} d(X_i, a_j)

subject to

\sum_{i=1}^{n} w_{ij} = w_j,  j = 1, ..., m,   w_{ij} \ge 0,  i = 1, ..., n,  j = 1, ..., m,

where the unknown variable locations of the NFs, X_i, i = 1, ..., n, and the flows from each NF i to each EF j, w_{ij}, i = 1, ..., n, j = 1, ..., m, are the decision variables to be determined, and d(X_i, a_j) is the distance between NF i and EF j. An infinite number of locations are possible for each NF i since each X_i is a point in a continuous, typically two-dimensional, space.
The GA approach to solve this problem, described in detail in Houck et al. (1996), is used in these experiments. The individuals in the GA are a vector of real values representing the starting locations for the NFs. Based on these locations for the NFs, the optimal allocations are found by allocating each EF to the nearest NF. The total cost of the resulting network is then the weighted sum of the distances from each EF to its allocated NF.
The alternative LA method developed by Cooper (1972) is used as the LIP for this problem. This method quickly finds a local minimum solution given a set of starting locations for the NFs. The procedure works by starting with a set of NF locations; it then determines an optimal set of allocations based on those locations, which for this uncapacitated version of the problem reduces to finding the closest NF to each EF. The optimal NF locations for these new allocations are then determined by solving n single-facility location problems, each of which is in general a nonlinear convex optimization problem. This method continues until no further allocation changes can be made.
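Cooper's alternating heuristic can be sketched as follows for the rectilinear-distance case, where each single-facility subproblem separates by coordinate and is solved by a flow-weighted median. The helper names are illustrative, not from the paper.

```python
def weighted_median(values, weights):
    """Return a point minimizing sum_j w_j * |v - values_j|."""
    pairs = sorted(zip(values, weights))
    half, acc = sum(weights) / 2.0, 0.0
    for v, w in pairs:
        acc += w
        if acc >= half:
            return v

def cooper(nf_locations, ef_locations, flows, max_iters=100):
    """Cooper's alternate location-allocation heuristic (rectilinear)."""
    nfs = [tuple(p) for p in nf_locations]
    assign = None
    for _ in range(max_iters):
        # Allocation step: each EF sends all of its flow to the nearest NF.
        new_assign = []
        for ex, ey in ef_locations:
            dists = [abs(x - ex) + abs(y - ey) for x, y in nfs]
            new_assign.append(dists.index(min(dists)))
        if new_assign == assign:       # no allocation changed: local optimum
            break
        assign = new_assign
        # Location step: re-solve each single-facility problem by
        # taking the flow-weighted median in each coordinate.
        for i in range(len(nfs)):
            served = [j for j, a in enumerate(assign) if a == i]
            if served:
                nfs[i] = (weighted_median([ef_locations[j][0] for j in served],
                                          [flows[j] for j in served]),
                          weighted_median([ef_locations[j][1] for j in served],
                                          [flows[j] for j in served]))
    cost = sum(flows[j] * (abs(nfs[assign[j]][0] - ef_locations[j][0]) +
                           abs(nfs[assign[j]][1] - ef_locations[j][1]))
               for j in range(len(ef_locations)))
    return nfs, assign, cost
```

Because each step can only lower the total cost, the alternation terminates at a locally optimal pairing of locations and allocations, which is exactly what the hybrid GA uses as its phenotype.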
3.8 Cell Formation Problem
The manufacturing cell formation problem, as described in Joines, Culbreth, and King (1996), is the final test problem. The cell formation problem, a type of nonlinear integer programming problem, is concerned with assigning a group of machines to a set of cells and assigning a series of parts needing processing by these machines into families. The goal is to create a set of autonomous manufacturing units that eliminate intercell movements (i.e., a part needing processing by a machine outside its assigned cell). Two instances of this problem were used in the investigation: a 22-machine by 148-part instance (Cell-1) and a 40-machine by 100-part instance (Cell-2).
Consider an m-machine by n-part cell formation problem instance with a maximum of k cells. The problem is to assign each machine and each part to one and only one cell and family, respectively. Joines et al. (1996) developed an integer programming model with the following set variable declarations that eliminate the classical binary assignment variables and constraints:

x_i = l,  if machine i is assigned to cell l
y_j = l,  if part j is assigned to family l

The model was solved using a GA. In the GA, the individuals are a vector of integer variables representing the machine and part assignments (x_1, x_2, ..., x_m, y_1, y_2, ..., y_n). This representation has been shown to perform better than other cell formation methods utilizing the
same objective (Joines et al., 1996). The following "grouping efficacy" measure (Γ) is used as the evaluation function:

Maximize Γ = \frac{e - e_o}{e + e_v}   (7)

where e is the total number of part operations, e_o is the number of exceptional elements, and

e_v = \sum_{i=1}^{k} (D_i - d_i).
Grouping efficacy tries to minimize a combination of both the number of exceptional elements e_o (i.e., the number of part operations performed outside the cell) and the process variation e_v (i.e., the discrepancy between the parts' processing requirements and the processing capabilities of the cell). Process variation for each cell i is calculated by subtracting from the total number of operations a cell can perform, D_i, the number of part operations that are performed in the cell, d_i. The total process variation is the sum of the process variation for each cell. Exceptional elements decrease Γ, since the material handling cost for transportation within a cell is usually less than the cost of transportation between cells. Process variation degrades the performance of the cell due to increased scheduling, fixturing, tooling, and material handling costs.
A greedy one-opt switching method, as described in Joines et al. (1995), was used as the LIP. Given an initial assignment of machines and parts to cells and families, respectively, the LIP uses a set of improvement rules developed by Ng (1993) to determine the improvement in the Γ objective function if machine i is switched from its current cell to any of the remaining k - 1 cells. The procedure repeats this step for each of the m machines as well as each of the n parts, where each part is switched to k - 1 different families.
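A minimal sketch of grouping efficacy and the greedy one-opt switch follows. It recomputes Γ from scratch rather than using Ng's incremental improvement rules, and it reads the process variation e_v as the void count of the diagonal blocks, an assumption consistent with the D_i - d_i description in the text.

```python
def grouping_efficacy(a, machine_cell, part_family):
    """Gamma = (e - e_o) / (e + e_v) for 0/1 incidence matrix a.

    e: total operations; e_o: operations outside the diagonal blocks;
    e_v: voids (zeros inside the diagonal blocks), read here as the
    per-cell shortfall D_i - d_i summed over cells.
    """
    e = e_o = e_v = 0
    for i, row in enumerate(a):
        for j, op in enumerate(row):
            inside = machine_cell[i] == part_family[j]
            e += op
            if op and not inside:
                e_o += 1
            if not op and inside:
                e_v += 1
    return (e - e_o) / (e + e_v)

def one_opt(a, machine_cell, part_family, k):
    """Greedy one-opt: move each machine, then each part, to its best cell."""
    improved = True
    while improved:
        improved = False
        for assign, size in ((machine_cell, len(a)), (part_family, len(a[0]))):
            for idx in range(size):
                best = assign[idx]
                best_gamma = grouping_efficacy(a, machine_cell, part_family)
                for cell in range(k):
                    assign[idx] = cell   # tentatively reassign and re-score
                    g = grouping_efficacy(a, machine_cell, part_family)
                    if g > best_gamma:
                        best, best_gamma = cell, g
                        improved = True
                assign[idx] = best
    return machine_cell, part_family
```

Since every accepted move strictly increases Γ and the number of assignments is finite, the loop terminates at a one-opt local optimum, which the hybrid GA then uses as the individual's phenotype.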
4. Results
For the 30 replications of each problem instance, both the mean number of function evaluations and the mean final function values were determined. To ensure the validity of the
subsequent analysis of variance (ANOVA) and multiple means tests, each data set was tested
first for normality and second for homogeneity of the variances among the different search
strategies. The Shapiro-Wilk test for normality was performed on the results at α = 0.05.
Since there is no established test for variance homogeneity (SAS Institute Inc., 1990), visual inspection was used to determine whether there were significant variance differences.
If there were significant differences and there existed a strong linear relation between the
means and standard deviations (i.e., a linear regression correlation greater than 0.8), a logarithmic transformation was used (SAS Institute Inc., 1990). For the number of functional
evaluations, standard deviations equal to zero (which correspond to the search strategy terminating after 1 million functional evaluations) were removed from the linear regression,
since these values were not due to randomness.
For those results passing the Shapiro-Wilk test for normality, an ANOVA was conducted for both the number of function evaluations and the final functional value. If the ANOVA showed a significant effect (α = 0.01), the means of each of the 12 different search strategies considered, namely, no local improvement (referred to as -1), pure Baldwinian (referred to as 0), pure Lamarckian (100), and nine levels of partial Lamarckian (referred to as 5, 10, 20, 40, 50, 60, 80, 90, and 95, respectively), were compared using three different statistical methods. The three methods, each performing a multiple means comparison, were the Duncan
Table 1: Multiple means comparison for Brown's almost linear function. [The group assignments in the table body are not legible in this transcription.]
approach to minimize the Bayesian loss function (Duncan, 1975); Student-Newman-Keuls (SNK), a multiple-range test using a multiple-stage approach; and an approach developed by Ryan (1960), Einot and Gabriel (1975), and Welsch (1977) (REGWQ), which is also a multiple-stage approach that controls the maximum experimentwise error rate under any complete or partial hypothesis. Each of the tests was conducted using the general linear models routine in SAS v6.09 (SAS Institute Inc., 1990). All three of these tests were used because there is no consensus test for the comparison of multiple means.
The ranking of both the number of function evaluations to reach the optimal solution and the ranking of the final fitness are given in Tables 1 through 7 for each of the 19 test problem instances considered in this paper: the five nonlinear test problems (Brown, Corana, Griewank, Rastrigin, and Schwefel), each at 2, 10, and 20 dimensions; the two LA problems (LA-200 and LA-250); and the two cell formation problems (Cell-1 and Cell-2). Each of the three multiple means comparisons, Duncan's, SNK, and REGWQ, was performed on each problem instance. In the tables, if all three of the multiple means comparison methods do not agree, the results of each are presented; otherwise, the combined results for all the methods agreeing are presented. The means and standard deviations of the final fitness value and the number of functional evaluations for each search strategy are shown in Tables B1 to B7 of Appendix B for all test problem instances.
The multiple means comparison methods are statistical tests that provide information
on sets of means whose differences are statistically significant. For example, consider the
results in Table 1 for the Brown-10 instance with respect to final fitness. All three multiple
means tests agree that the -1 (no local improvement) search strategy yields significantly
worse results than any of the other strategies. Similarly, the Baldwinian strategy (0%
Lamarckian) is significantly better than the -1 strategy but significantly worse than any of the
other strategies. Finally, the other strategies (5% to 100% Lamarckian) yield significantly the
same final fitness value. In some instances, the tests are unable to place a level into one group, and
a range of groups is presented. For example, the 20% partial Lamarckian strategy for the
Brown-20 instance (Table 1) belongs to both groups C and D (i.e., C-D). The strategies
assigned to the best group are shaded. The rank indicates the rank of the various means,
and moving from left to right, the means get better. Each of the test problems is discussed
in detail in the following sections.

Table 2. Multiple means comparison for the Corana function.
4.1 Brown's Almost Linear Function
This problem benefits greatly from the use of SQP. As shown in Table 1, the optimal solution
was readily found by all partial Lamarckian search strategies employing SQP as their local
improvement procedure. The GA employing no local improvement took considerably longer
to find the optimal solution for the two-dimensional instance and failed to find the optimal for
the other two instances. The pure Baldwinian search also terminated after 1 million function
evaluations for the larger instances. For the two-dimensional instance, there is no statistically
significant difference in the number of function evaluations required to obtain the optimal
solution for all strategies employing SQP. However, the 10- and 20-dimensional instances
show that although the strategies employing a lower percentage of partial Lamarckianism
find the optimal, they require a significantly greater number of function evaluations.
4.2 Corana Function
As shown in Table 2, the two-dimensional Corana function instance is easily solved by
all variants of the GA. The 10-dimensional instance shows an interesting result. Only
the higher levels of partial Lamarckianism were able to find the optimal solution, whereas
Evolutionary Computation Volume 5, Number 1
Table 3. Multiple means comparison for the Griewank function.
pure Baldwinian search found the worst solutions. For the 20-dimensional instance, no
strategy was able to locate the optimal value every time; however, the final function value
shows more statistically significant variation (i.e., the lower the percentage of Lamarckian
learning, the worse the final functional value until 90% is reached, after which all strategies
yield similar results). The 90%, 95%, and 100% Lamarckian strategies found the optimal
approximately half the time. It is interesting to note that the GA not employing local
search (-1) outperforms the lower Lamarckian levels, including pure Baldwinian learning.
A possible explanation for this is that the Corana function exhibits numerous local minima
and extreme discontinuity wherever a rectangular local minimum has been placed in the
hyperdimensional parabola, making this problem troublesome for SQP. Therefore, the other
strategies waste valuable computational time mapping the same basin of attraction.
4.3 Griewank Function
This problem benefits greatly from the use of SQP. As shown in Table 3, the optimal solution
was readily found by all variants of the GA. Although the quality of the solutions was similar,
only Duncan's method found any difference for the two-dimensional instance, with the GA
employing no local improvement and several higher levels of Lamarckianism being more
efficient. For the 10- and 20-dimensional instances, the optimal solution was readily found
by all search strategies employing SQP as their local improvement procedure, whereas the
no local improvement GA terminated after 1 million function evaluations. However, there
is a statistically significant difference in the number of function evaluations required. Unlike
all other problems, the greater the amount of Lamarckianism, the greater the number of
function evaluations required to locate the optimal solution.

Table 4. Multiple means comparison for the Rastrigin function.
4.4 Rastrigin Function
As shown in Table 4, all strategies found the optimal solution for the two-dimensional
instance; however, no local improvement took significantly longer. Similarly, all strategies,
except for pure Baldwinian search, found the optimal solution for the 10-dimensional instance.
There is statistically significant evidence that moderate and high levels of Lamarckianism
(greater than 50%) yield quicker convergence. For the 20-dimensional instance, a similar
trend is present. All strategies except pure Baldwinian search yielded optimal results; for
this size instance, a Lamarckian level of greater than 40% yields the quickest convergence.
It should be noted that no local improvement outperformed several of the lower levels,
including pure Baldwinian search. As the problem size increases, more computational time is
used in the initialization of the population, not allowing Baldwinian learning to overcome
the problem of not having a one-to-one genotype-to-phenotype mapping before it terminates.
4.5 Schwefel Function
As shown in Table 5, every strategy found the optimal solution for the two-dimensional
instance, with no local improvement taking significantly longer. The 10- and 20-dimensional
instances show faster convergence for greater degrees of Lamarckianism. Low levels of
Lamarckianism do not consistently yield optimal solutions, whereas with 20% or greater
Lamarckianism, the optimal solution is consistently located.
Table 5. Multiple means comparison for the Schwefel function.

4.6 LA Problem
As shown in Table 6, both LA problem instances showed the same trend seen in the second
cell formation problem instance (Cell-2 in Table 7). With any use of local improvement, the
GA found the optimal solution before the 1 million function evaluation ceiling was reached.
However, the Baldwinian strategy (0%) required a significantly greater number of function
evaluations than any strategy employing any form of Lamarckianism.

Table 6. Multiple means comparison for the LA problem.
Table 7. Multiple means comparison for the cell formation problem.
4.7 Cell Formation Problem
As shown in Table 7, the first cell formation problem instance, Cell-1, is a case where using
the local improvement routine with any amount of Lamarckian learning resulted in finding
the optimal solution, whereas, when using no local improvement or a strictly Baldwinian
search strategy, the algorithm did not find the optimal solution and terminated after 1 million
function evaluations, with the Baldwinian strategy finding better solutions than the no local
improvement strategy. The second cell formation instance, Cell-2, is more interesting.
In terms of fitness, any strategy employing a moderate amount of Lamarckianism (>20%)
found the optimal solution, whereas the strategies employing either a Baldwinian strategy
or using no local improvement did not consistently find the optimal solution. This problem
instance also shows the familiar pattern of increasing levels of Lamarckian learning resulting
in fewer function evaluations to reach the optimal solution. A possible explanation for the
poor performance of a Baldwinian search strategy is that most of the 1 million function
evaluations are used in the initialization of the population. Therefore, a strategy employing
pure Baldwinian learning does not have enough functional evaluations left to overcome the
multiple genotype-to-phenotype mapping problem.
4.8 Summary of Results
To concisely present the main results of these statistical tests shown in detail in Tables 1 to
7, Tables 8 and 9 show rankings of each search strategy with respect to the fitness of the
final result and the number of function evaluations required, respectively. For each strategy,
the rankings were determined by finding the group that yielded the best results for each test
problem instance as determined by the SNK means analysis. If the strategy was placed into
a single group, that group's rank was used; however, if the method was placed into several
overlapping groups, the method was placed into the group with the best rank. Therefore,
these rankings represent how many groups of means are significantly better for each test
problem instance. The SNK multiple means test was arbitrarily chosen for these rankings.

Table 8. Rank of solutions (SNK). Dagger indicates failed Shapiro-Wilk test, or ANOVA did not
show significant effect at α = 0.01.

Table 8 shows the rankings for the final fitness of the solutions returned by each of
the search strategies, where 1 represents the best rank and is shaded, 2 represents the next
best rank, etc. All the strategies employing at least 20% Lamarckian learning consistently
found the optimal solution. However, the pure Baldwinian (0%) and the 5% and 10%
partial Lamarckian strategies did find significantly worse solutions for several test problem
instances. The GA employing no LIP (-1) is included to provide a comparison with
how efficiently the local search procedure utilizes function evaluations compared with a pure
genetic sampling approach. For most of these test problem instances, the use of the LIP
significantly increases the efficiency of the GA.
Table 9 shows the rankings for the number of function evaluations required to locate the
final functional value returned by each of the search strategies. The table shows an interesting
trend: Neither the pure Lamarckian (100%) nor pure Baldwinian (0%) strategies consistently
yielded the best performance; this was also observed by Whitley et al. (1994). Using a
binary representation and a bitwise steepest ascent LIP, Whitley et al. (1994) demonstrated
that a Baldwinian search strategy was preferred for the 20-dimensional Schwefel function,
whereas the results in Table 9 show a preference for a Lamarckian search strategy when
using a floating point representation and SQP. Also, a Lamarckian strategy was preferred
in Whitley et al. (1994) for the 20-dimensional Griewank instance, whereas in this study a
Baldwinian search strategy is preferred.

The results of this study and those of Whitley et al. (1994), together, show that a
combination of the problem, the representation, and the LIP all can influence the effectiveness of
either a pure Lamarckian or pure Baldwinian search strategy. However, with respect to the
representations and LIPs used in this study, a partial Lamarckian strategy tends to perform
well across all the problems considered. Taking a minimax approach (i.e., minimizing the
worst possible outcome), the 20% and 40% partial Lamarckian search strategies provide the
best results. The worst ranking of each search strategy, in terms of convergence speed, is
shown as the last row of Table 9. Both the 20% and 40% strategies yield, at worst, the third
best convergence to the optimal solution. The other search strategies, 0% and 5% partial
Lamarckian, did not consistently find the optimal. Higher amounts of Lamarckianism, 50%
to 100%, converge more slowly to the optimal solution. Since no single search strategy is
dominant in terms of solution quality and computational efficiency, the worst case performance
of the search strategy can be minimized across the entire set of test problems examined by
employing a 20% to 40% partial Lamarckian strategy. The no local improvement search
strategy (-1) is shown for comparison to demonstrate the effectiveness of hybrid GAs.
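The minimax criterion described above reduces to a small computation over a table of ranks. In the sketch below the rank values are hypothetical placeholders, not the entries of Table 9; only the selection logic is the point:

```python
# Minimax choice of partial Lamarckian level: pick the strategy whose
# worst rank over all test problem instances is smallest.
# NOTE: the ranks below are hypothetical placeholders for illustration,
# not the actual values from Table 9.
ranks = {
    "0%":   [1, 4, 5, 3],   # rank of this strategy on each problem instance
    "20%":  [2, 3, 1, 2],
    "40%":  [3, 2, 2, 1],
    "100%": [1, 1, 4, 4],
}
worst_rank = {s: max(r) for s, r in ranks.items()}   # worst case per strategy
best = min(worst_rank, key=worst_rank.get)           # minimize the worst case
```

With these illustrative numbers, the 20% strategy wins because its worst rank (3) is no worse than any other strategy's worst rank.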
5. Conclusions
LIPs were shown to enhance the performance of a GA across a wide range of problems.
Even though Lamarckian learning disrupts the schema processing of the GA and may lead
to premature convergence, it reduces the problem of not having a one-to-one
genotype-to-phenotype mapping. In contrast, although Baldwinian learning does not affect the
schema-processing capabilities of the GA, it results in a large number of genotypes mapping to the
same phenotype. Except for the Griewank-10 and Griewank-20 problem instances, this may
explain why Baldwinian search was the worst strategy for several of the test problems. In
the empirical investigation conducted, neither pure Baldwinian nor pure Lamarckian search
strategies were consistently effective. For some problems, it appears that by forcing the
genotype to reflect the phenotype, the GA converges more quickly to better solutions, as
opposed to leaving the chromosome unchanged after it is evaluated. However, for Griewank-10
and Griewank-20, Lamarckian strategies require more computational time to obtain the
optimal. Therefore, a partial Lamarckian strategy is suggested.
The results of this paper have opened up several promising areas for future research.
This research has shown that local search improves the GAs' ability to find good solutions.

Table 9. Rank of convergence speed (SNK). Dagger indicates failed Shapiro-Wilk test, or ANOVA
did not show significant effect at α = 0.01.
If nothing is known about the particular problem under consideration, a minimax approach
is suggested to determine the amount of partial Lamarckianism. Future research could
investigate the use of an adaptive approach to determine the amount of partial Lamarckianism
to use. Also, it is possible that, instead of a single percentage of Lamarckianism applied during
the entire run, different amounts of Lamarckianism over the course of the run would be more
beneficial. Even though Lamarckian learning helps to alleviate the problem of not having a
one-to-one genotype-to-phenotype mapping, it can still apply local searches to many points
in the same basin of attraction. Another area of future research would involve determining
whether a local search is even necessary for a new individual (i.e., whether the current basin
has already been mapped); otherwise, computational time that could be used to explore new
regions is wasted mapping out the same basin of attraction. We are currently investigating
combining several global optimization techniques with GAs for the purpose of introducing
memory into the search strategy.

1. Supply a population P_0 of N individuals and respective function values
2. i <- 1
3. P'_i <- selection_function(P_{i-1})
4. P_i <- reproduction_function(P'_i)
5. Evaluate(P_i)
6. i <- i + 1
7. Repeat steps 3-6 until termination
8. Print out best solution found

Figure A1. The GA.
Acknowledgments
The authors would like to thank the editor and the referees for their insightful comments
and suggestions, which resulted in a much improved paper. This research was supported, in
part, by the National Science Foundation under Grant DMI-9322834.
Appendix A. Description of Genetic Algorithm
The genetic algorithm (GA) used in the experiments in Section 3 is summarized in Figure A1.
Each of its major components is discussed in detail below. The use of a GA requires the
determination of six fundamental issues: chromosome representation, selection function, the
genetic operators making up the reproduction function, the creation of the initial population,
termination criteria, and the evaluation function. The rest of this appendix briefly describes
each of these issues relating to the experiments in Section 3.
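The loop of Figure A1 can be sketched as follows. This is a minimal skeleton, not the paper's implementation: the toy selection and reproduction functions stand in for the operators described later in this appendix, and no elitism is modeled.

```python
import random

def genetic_algorithm(init_pop, evaluate, select, reproduce, max_evals):
    """Skeleton of the GA in Figure A1 (steps 1-8), for minimization."""
    population = [(x, evaluate(x)) for x in init_pop]        # step 1
    evals = len(population)
    while evals < max_evals:                                 # steps 3-7
        parents = select(population)                         # step 3
        offspring = reproduce(parents)                       # step 4
        population = [(x, evaluate(x)) for x in offspring]   # step 5
        evals += len(population)
    return min(population, key=lambda p: p[1])               # step 8

# Toy usage: minimize f(x) = x^2; selection keeps the best half, and
# "reproduction" copies each parent twice with a small perturbation.
f = lambda x: x * x
select = lambda pop: [x for x, _ in sorted(pop, key=lambda p: p[1])[:len(pop) // 2]]
reproduce = lambda ps: [p + random.uniform(-0.1, 0.1) for p in ps for _ in (0, 1)]
best, fit = genetic_algorithm([random.uniform(-5, 5) for _ in range(20)],
                              f, select, reproduce, max_evals=2000)
```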
A.1 Solution Representation
For any GA, a chromosome representation is needed to describe each individual in the
population of interest. The representation scheme determines how the problem is structured
in the GA and also determines the genetic operators that are used. Each individual or
chromosome is made up of a sequence of genes from a certain alphabet. An alphabet could
consist of binary digits (0 and 1), floating point numbers, integers, symbols (i.e., A, B, C, D),
matrices, etc. In the original design of Holland (1975), the alphabet was limited to binary
digits. Since then, problem representation has been the subject of much investigation. It
has been shown that more natural representations are more efficient and produce better
solutions (Michalewicz, 1996). In these experiments, a real-valued representation was used
for the five nonlinear problems and the location-allocation problem, whereas an integer
representation was used for the cell formation problem.
A.2 Selection Function
The selection of individuals to produce successive generations plays an extremely important
role in a GA. There are several schemes for the selection process: roulette wheel selection
and its extensions, scaling techniques, tournament, elitist models, and ranking methods
(Goldberg, 1989; Michalewicz, 1996). Ranking methods only require the evaluation function
to map the solutions to a totally ordered set, thus allowing for minimization and negativity.
Ranking methods assign P_i based on the rank of solution i after all solutions are sorted. A
normalized geometric ranking selection scheme (Joines & Houck, 1994) employing an elitist
model was used in these experiments. Normalized geometric ranking defines P_i for each
individual by

    P[Individual i is chosen] = q'(1 - q)^(r-1),                    (A1)

where

    q  = probability of selecting the best individual;
    r  = rank of the individual, where 1 is the best;
    N  = population size; and
    q' = q / (1 - (1 - q)^N).
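Equation A1 is easy to check numerically; the sketch below computes the per-rank selection probabilities and verifies that they form a valid distribution (the function name is ours, not the paper's):

```python
def normalized_geometric_probs(n, q):
    """Per-rank selection probabilities of Equation A1, rank r = 1..n (1 = best):
    P[r] = q' (1 - q)**(r - 1), with q' = q / (1 - (1 - q)**n)."""
    q_prime = q / (1 - (1 - q) ** n)
    return [q_prime * (1 - q) ** (r - 1) for r in range(1, n + 1)]

# With the paper's settings (population size N = 80, q = 0.08), the
# probabilities sum to one and the best-ranked individual is favored.
probs = normalized_geometric_probs(80, 0.08)
assert abs(sum(probs) - 1.0) < 1e-12
assert probs[0] == max(probs)
```

The normalization constant q' is exactly what makes the truncated geometric series sum to one over a finite population.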
A.3 Genetic Operators
Genetic operators provide the basic search mechanism of the GA. The application of the
two basic types of operators (i.e., mutation and crossover and their derivatives) depends on
the chromosome representation used. Operators for real-valued representations (i.e., an
alphabet of floats) are described in Michalewicz (1994). For real parents X = (x_1, x_2, ..., x_n)
and Y = (y_1, y_2, ..., y_n), each of the operators used in the experiments in this paper is
described briefly below. Note, for the cell formation problem, the operators were modified to
produce integer results (Joines et al., 1996). Let a_i and b_i be the lower and upper bound,
respectively, for each variable x_i (or y_i), and let X' (or x'_i) represent the changed parent
or offspring produced by the genetic operator.

Uniform Mutation  Randomly select one variable j and set it equal to a random number
drawn from a uniform distribution U(a_j, b_j) ranging from a_j to b_j:

    x'_i = U(a_j, b_j),  if i = j,
           x_i,          otherwise                                  (A2)

Multiuniform Mutation  Apply Equation A2 to all the variables in the parent X.

Boundary Mutation  Randomly select one variable j and set it equal to either its lower or
upper bound, where r = U(0, 1):

    x'_i = a_j,  if i = j, r < 0.5
           b_j,  if i = j, r >= 0.5
           x_i,  otherwise                                          (A3)

Nonuniform Mutation  Randomly select one variable j and set it equal to a nonuniform
random number based on the following equation:

    x'_i = x_i + (b_i - x_i) f(G),  if i = j, r_1 < 0.5
           x_i - (x_i - a_i) f(G),  if i = j, r_1 >= 0.5
           x_i,                     otherwise                       (A4)

where

    f(G)     = (r_2 (1 - G / G_max))^b;
    r_1, r_2 = uniform random numbers between (0, 1);
    G        = current generation;
    G_max    = maximum number of generations; and
    b        = a shape parameter.

Multi-nonuniform Mutation  Apply Equation A4 to all the variables in the parent X.

Simple Crossover  Generate a random number r from a discrete uniform distribution
from 2 to (n - 1), where n is the number of parameters, and create two new individuals
(X' and Y') using Equations A5 and A6:

    x'_i = x_i,  if i < r
           y_i,  otherwise                                          (A5)

    y'_i = y_i,  if i < r
           x_i,  otherwise                                          (A6)

Arithmetic Crossover  Produce two complementary linear combinations of the parents,
where r = U(0, 1):

    X' = rX + (1 - r)Y,    Y' = (1 - r)X + rY                       (A7)

Heuristic Crossover  Produce a linear extrapolation of the two individuals. This is the only
operator that utilizes fitness information. A new individual, X', is created using Equation A8,
where r = U(0, 1), and X is better than Y in terms of fitness. If X' is infeasible (i.e., feasibility
equals 0, as given by Equation A9), then generate a new random number r and create a
new solution using Equation A8; otherwise, stop. To ensure halting, after t failures, let the
children equal the parents and stop:

    X' = X + r(X - Y),    Y' = X                                    (A8)

    feasibility = 1,  if a_i <= x'_i <= b_i for all i = 1, ..., n
                  0,  otherwise                                     (A9)
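A few of these operators are straightforward to express in code. The sketch below implements boundary mutation (Equation A3), nonuniform mutation (Equation A4), and arithmetic crossover (Equation A7); the function names and the `shape` keyword are ours (the paper calls the shape parameter b):

```python
import random

def boundary_mutation(x, a, b):
    """Equation A3: set one randomly chosen gene to its lower or upper bound."""
    x = list(x)
    j = random.randrange(len(x))
    x[j] = a[j] if random.random() < 0.5 else b[j]
    return x

def nonuniform_mutation(x, a, b, G, G_max, shape=3.0):
    """Equation A4: perturb one gene by a nonuniform amount f(G) that
    shrinks as the current generation G approaches G_max."""
    x = list(x)
    j = random.randrange(len(x))
    f = (random.random() * (1 - G / G_max)) ** shape
    if random.random() < 0.5:
        x[j] += (b[j] - x[j]) * f    # move toward the upper bound
    else:
        x[j] -= (x[j] - a[j]) * f    # move toward the lower bound
    return x

def arithmetic_crossover(x, y):
    """Equation A7: two complementary linear combinations of the parents."""
    r = random.random()
    c1 = [r * xi + (1 - r) * yi for xi, yi in zip(x, y)]
    c2 = [(1 - r) * xi + r * yi for xi, yi in zip(x, y)]
    return c1, c2

# Arithmetic crossover keeps each child's genes between the parents' genes,
# so feasible parents always produce feasible children.
x, y = [1.0, 5.0, 9.0], [2.0, 2.0, 2.0]
c1, c2 = arithmetic_crossover(x, y)
assert all(min(xi, yi) <= ci <= max(xi, yi) for xi, yi, ci in zip(x, y, c1))
```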
A.4 Initialization of Population
As indicated in step 1 of Figure A1, the GA must be provided with an initial population. The
most common methods are to (1) randomly generate solutions for the entire population, or
(2) seed some individuals in the population with "good" starting solutions. The experiments
in Section 3 used a random initial population with common random numbers across all
search strategies for each replication.
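Common random numbers can be obtained by seeding the generator identically for every search strategy within a replication. A minimal sketch (the seeding scheme is our assumption; the population size follows Table A1):

```python
import random

def initial_population(n, lower, upper, seed):
    """Randomly generate n individuals, each variable drawn uniformly
    within its bounds. Reusing the same seed for every search strategy
    yields common random numbers for a given replication."""
    rng = random.Random(seed)
    return [[rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
            for _ in range(n)]

# Two strategies run on replication 7 start from identical populations.
pop_a = initial_population(80, [-5.0] * 10, [5.0] * 10, seed=7)
pop_b = initial_population(80, [-5.0] * 10, [5.0] * 10, seed=7)
assert pop_a == pop_b
```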
Table A1. Parameters of the GA.

    Parameter                                   Value
    Uniform mutation                            4
    Nonuniform mutation                         4
    Multi-nonuniform mutation                   6
    Boundary mutation                           4
    Simple crossover                            2
    Arithmetic crossover                        2
    Heuristic crossover                         2
    Probability of selecting the best, q        0.08
    Population size, N                          80
    Maximum number of function evaluations      1,000,000
    Maximum number of iterations for SQP        25
A.5 Termination Criteria
The GA moves from generation to generation selecting and reproducing parents until a
termination criterion is met. Several stopping criteria currently exist and can be used in
conjunction with one another, for example, a specified maximum number of generations,
convergence of the population to virtually a single solution, and lack of improvement in the
best solution over a specified number of generations. The stopping criterion used in the
experiments in Section 3 was to terminate when either the known optimal solution value was
found or the specified maximum number of function evaluations was exceeded.
A.6 Evaluation Functions
As stated earlier, evaluation functions of many forms can be used in a GA, subject to the
minimal requirement that the function can map the population into a totally ordered set.
The GA employing no local improvement used the evaluation functions of Section 3 directly.
The hybrid GA used in the pure Baldwinian, pure Lamarckian, and partial Lamarckian
search strategies used the local improvement procedure (LIP) for each particular problem
as its evaluation function. Each of the LIPs then used the evaluation functions of Section 3.
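The hybrid evaluation can be sketched as follows: the LIP always supplies the fitness (the Baldwinian effect on the landscape), and with probability p the improved solution is also written back into the chromosome (the Lamarckian update). The toy `lip` below stands in for SQP or a problem-specific LIP; all names here are ours:

```python
import random

def hybrid_evaluate(chromosome, local_improve, p_lamarckian):
    """Evaluate one individual in a partial Lamarckian hybrid GA.

    local_improve maps a solution to (improved_solution, improved_fitness).
    The improved fitness is always used (Baldwinian learning); with
    probability p_lamarckian the improved solution is also written back
    into the genotype (Lamarckian learning).
    """
    improved, fitness = local_improve(chromosome)
    if random.random() < p_lamarckian:
        chromosome = improved        # force the genotype to match the phenotype
    return chromosome, fitness

# Toy stand-in for the LIP: one damped step toward the minimum of sum(x_i^2).
lip = lambda x: ([0.5 * xi for xi in x], sum((0.5 * xi) ** 2 for xi in x))
genotype, fit = hybrid_evaluate([2.0, -2.0], lip, p_lamarckian=0.2)
```

Setting p_lamarckian to 0, 1, or 0.2 gives pure Baldwinian, pure Lamarckian, or 20% partial Lamarckian search, respectively.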
A.7 GA Parameters
Table A1 lists the values of the parameters of the GAs used in the experiments in
Section 3. The parameter values associated with the mutation operators represent the number
of individuals that will undergo that particular mutation, whereas the values associated with
crossover represent the pairs of individuals.
Appendix B. Means and Standard Deviations for Each Search Strategy
The means and standard deviations of the final fitness value and the number of functional
evaluations for each search strategy are shown in Tables B1 to B7 for all test problem instances.
The number of times (maximum 30) that the search strategy found the optimal solution is
also given in the tables.
Table B1. Means and standard deviations for the Brown problem.

    Problem    Search    Final Fitness Value          No. of Times  No. of Functional Evaluations
    Instance   Strategy  Mean          Std. Dev.      Optimal       Mean          Std. Dev.
    Brown-2    -1        9.33333e-07   2.5371e-07     30            9.5529e+04    6.7213e+04
               0         0.00000       0.0000         30            3.1070e+03    1.5220e+02
               5         0.00000       0.0000         30            3.1026e+03    1.6197e+02
               10        0.00000       0.0000         30            3.0946e+03    1.6261e+02
               20        0.00000       0.0000         30            3.0856e+03    1.7367e+02
               40        0.00000       0.0000         30            3.0389e+03    1.6929e+02
               50        0.00000       0.0000         30            3.0193e+03    1.8582e+02
               60        0.00000       0.0000         30            2.9950e+03    1.8076e+02
               80        0.00000       0.0000         30            2.9282e+03    1.6989e+02
               90        0.00000       0.0000         30            2.9092e+03    1.6929e+02
               95        0.00000       0.0000         30            2.8798e+03    1.6445e+02
               100       0.00000       0.0000         30            2.8681e+03    1.5229e+02
    Brown-10   -1        1.26602       1.9554         0             1.0000e+06    0.0000
               0         5.03208e-01   4.7555e-01     7             8.5243e+05    3.0670e+05
               5         1.00000e-07   3.0513e-07     30            2.8470e+04    1.5704e+04
               10        3.33333e-08   1.8257e-07     30            1.8565e+04    1.0994e+04
               20        0.00000       0.0000         30            1.3169e+04    4.2345e+03
               40        0.00000       0.0000         30            9.2201e+03    1.7223e+03
               50        0.00000       0.0000         30            8.9320e+03    1.5899e+03
               60        0.00000       0.0000         30            8.6745e+03    1.6312e+03
               80        0.00000       0.0000         30            8.1481e+03    9.7597e+02
               90        0.00000       0.0000         30            7.8687e+03    8.2605e+02
               95        0.00000       0.0000         30            7.7787e+03    7.7405e+02
               100       0.00000       0.0000         30            7.6381e+03    6.6600e+02
    Brown-20   -1        1.65499e+01   4.3312         0             1.0000e+06    0.0000
               0         1.10978e+01   2.5711e+01     0             1.0013e+06    8.6148e+02
               5         1.00000e-07   3.0513e-07     30            2.7049e+04    2.4537e+04
               10        3.33333e-08   1.8257e-07     30            1.4136e+04    7.3162e+03
               20        3.33333e-08   1.8257e-07     30            1.0893e+04    5.1028e+03
               40        3.33333e-08   1.8257e-07     30            6.7153e+03    2.7935e+03
               50        6.66667e-08   2.5371e-07     30            5.7384e+03    1.3383e+03
               60        3.33333e-08   1.8257e-07     30            6.1421e+03    1.4538e+03
               80        0.00000       0.0000         30            5.7521e+03    1.3410e+03
               90        3.33333e-08   1.8257e-07     30            5.3642e+03    1.1904e+03
               95        6.66667e-08   2.5371e-07     30            5.4270e+03    1.1955e+03
               100       6.66667e-08   2.5371e-07     30            5.5075e+03    1.3284e+03
Table B2. Means and standard deviations for the Corana problem.
Table B3. Means and standard deviations for the Griewank problem.
Table B4. Means and standard deviations for the Rastrigin problem.
Table B5. Means and standard deviations for the Schwefel problem.
C. K. IIouck, J. A. Joines, At. G. Kay, and J. R. Wilson
Table B6. Means and standard deviations for the Location-Allocation (LA) problem.

Table B7. Means and standard deviations for the cell formation problem.
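Throughout Tables B5-B7, the search-strategy value gives the percentage of locally improved individuals whose genetic representation is updated with the improved solution: 0 corresponds to pure Baldwinian learning, 100 to pure Lamarckian learning, and intermediate values to partial Lamarckianism. A minimal sketch of one such partial Lamarckian evaluation step follows; the function and parameter names (`local_search`, `p_lamarck`) are illustrative, not the authors' implementation, and the per-individual coin flip is one simple way to realize "a percentage of the individuals."

```python
import random

def partial_lamarckian_step(population, local_search, p_lamarck):
    """Evaluate a population under partial Lamarckian learning.

    `local_search` is assumed to improve a solution and return a pair
    (improved_solution, improved_fitness); `p_lamarck` is the fraction
    (e.g., 0.20 for the 20% strategy) of individuals whose genotype is
    overwritten with the locally improved solution.
    """
    evaluated = []
    for individual in population:
        improved, improved_fitness = local_search(individual)
        if random.random() < p_lamarck:
            # Lamarckian update: write the improved solution back
            # into the genetic representation.
            evaluated.append((improved, improved_fitness))
        else:
            # Baldwinian update: keep the original genotype but
            # assign it the fitness found by the local search.
            evaluated.append((individual, improved_fitness))
    return evaluated
```

With `p_lamarck = 0.0` every individual keeps its genotype (pure Baldwinian), and with `p_lamarck = 1.0` every improved solution is encoded back (pure Lamarckian), matching the endpoints of the search-strategy column.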
References
Chu, P., & Beasley, J. (1995). A genetic algorithm for the generalized assignment problem (Tech. Rep.). London: The Management School, Imperial College.
Cooper, L. (1972). The transportation-location problem. Operations Research, 20, 94-108.
Corana, A., Marchesi, M., Martini, C., & Ridella, S. (1987). Minimizing multimodal functions of continuous variables with the "simulated annealing" algorithm. ACM Transactions on Mathematical Software, 13(3), 262-280.
Davis, L. (1991). Handbook of genetic algorithms. New York: Van Nostrand Reinhold.
Duncan, D. (1975). t-tests and intervals for comparisons suggested by the data. Biometrics, 31, 339-359.
Einot, I., & Gabriel, K. (1975). A study of the powers of several methods of multiple comparisons. Journal of the American Statistical Association, 70(351), 574-583.
Goldberg, D. (1989). Genetic algorithms in search, optimization and machine learning. Reading, MA: Addison-Wesley.
Grace, A. (1992). Optimization toolbox user's guide. Natick, MA: MathWorks.
Gruau, F., & Whitley, D. (1993). Adding learning to the cellular development of neural networks. Evolutionary Computation, 1, 213-233.
Hinton, G., & Nowlan, S. (1987). How learning can guide evolution. Complex Systems, 1, 495-502.
Holland, J. (1975). Adaptation in natural and artificial systems. Ann Arbor, MI: University of Michigan Press.
Homaifar, A., Guan, S., & Liepins, G. E. (1993). A new approach on the traveling salesman problem by genetic algorithms. In S. Forrest (Ed.), Proceedings of the 5th International Conference on Genetic Algorithms (pp. 460-466). San Mateo, CA: Morgan Kaufmann.
Houck, C. R., Joines, J. A., & Kay, M. G. (1996). Comparison of genetic algorithms, random restart, and two-opt switching for solving large location-allocation problems. Computers and Operations Research, 23(6), 587-596.
Joines, J., Culbreth, C., & King, R. (1996). Manufacturing cell design: An integer programming model employing genetic algorithms. IIE Transactions, 28(1), 69-85.
Joines, J., & Houck, C. (1994). On the use of non-stationary penalty functions to solve constrained optimization problems with genetic algorithms. In Z. Michalewicz, J. D. Schaffer, H.-P. Schwefel, D. G. Fogel, & H. Kitano (Eds.), Proceedings of the First IEEE Conference on Evolutionary Computation (Vol. 2, pp. 579-584). Piscataway, NJ: IEEE Press.
Joines, J., King, R., & Culbreth, C. (1995). The use of a local improvement heuristic with a genetic algorithm for manufacturing cell design (Tech. Rep. NCSU-IE 95-06). Raleigh, NC: North Carolina State University, Department of Industrial Engineering.
Lawrence, C., Zhou, J. L., & Tits, A. L. (1994). User's guide for CFSQP version 2.4: A C code for solving (large scale) constrained nonlinear (minimax) optimization problems, generating iterates satisfying all inequality constraints (Tech. Rep. TR-94-16r1). College Park, MD: University of Maryland, Electrical Engineering Department and Institute for Systems Research.
Love, R., & Juel, H. (1982). Properties and solution methods for large location-allocation problems with rectangular distances. Journal of the Operational Research Society, 33, 443-452.
Love, R., & Morris, J. (1975). A computational procedure for the exact solution of location-allocation problems with rectangular distances. Naval Research Logistics Quarterly, 22, 441-453.
Michalewicz, Z. (1996). Genetic algorithms + data structures = evolution programs (3rd ed.). New York: Springer-Verlag.
Michalewicz, Z., & Nazhiyath, G. (1995). Genocop III: A co-evolutionary algorithm for numerical optimization problems with nonlinear constraints. In D. B. Fogel (Ed.), Proceedings of the Second IEEE International Conference on Evolutionary Computation (Vol. 1, pp. 647-651). Piscataway, NJ: IEEE Press.
Mitchell, M. (1996). An introduction to genetic algorithms. Cambridge, MA: MIT Press.
Moré, J. J., Garbow, B., & Hillstrom, K. (1981). Testing unconstrained optimization software. ACM Transactions on Mathematical Software, 7(1), 17-41.
Nakano, R. (1991). Conventional genetic algorithm for job shop problems. In Proceedings of the Fourth International Conference on Genetic Algorithms (pp. 474-479). San Mateo, CA: Morgan Kaufmann.
Ng, S. (1993). Worst-case analysis of an algorithm for cellular manufacturing. European Journal of Operational Research, 69(3), 386-398.
Orvosh, D., & Davis, L. (1994). Using a genetic algorithm to optimize problems with feasibility constraints. In Z. Michalewicz, J. D. Schaffer, H.-P. Schwefel, D. G. Fogel, & H. Kitano (Eds.), Proceedings of the First IEEE Conference on Evolutionary Computation. Piscataway, NJ: IEEE Press.
Renders, J.-M., & Flasse, S. (1996). Hybrid methods using genetic algorithms. IEEE Transactions on Systems, Man, and Cybernetics, 26(2), 243-258.
Ryan, T. (1960). Significance tests for multiple comparison of proportions, variances, and other statistics. Psychological Bulletin, 57, 318-328.
SAS Institute Inc. (1990). SAS/STAT user's guide (4th ed.). Cary, NC: SAS Institute Inc.
Welsch, R. (1977). Stepwise multiple comparison procedures. Journal of the American Statistical Association, 72(359), 566-575.
Whitley, D., Gordon, S., & Mathias, K. (1994). Lamarckian evolution, the Baldwin effect and function optimization. In Y. Davidor, H.-P. Schwefel, & R. Männer (Eds.), Parallel Problem Solving from Nature, PPSN III (pp. 6-15). Berlin: Springer-Verlag.