Wikibook (pages 1-223)
Contents

Articles
Evolutionary computation
Evolutionary algorithm
Mathematical optimization
Nonlinear programming
Combinatorial optimization
Travelling salesman problem
Constraint (mathematics)
Constraint satisfaction problem
Constraint satisfaction
Heuristic (computer science)
Multi-objective optimization
Pareto efficiency
Stochastic programming
Parallel metaheuristic
There ain't no such thing as a free lunch
Fitness landscape
Genetic algorithm
Toy block
Chromosome (genetic algorithm)
Genetic operator
Crossover (genetic algorithm)
Mutation (genetic algorithm)
Inheritance (genetic algorithm)
Selection (genetic algorithm)
Tournament selection
Truncation selection
Fitness proportionate selection
Reward-based selection
Edge recombination operator
Population-based incremental learning
Defining length
Holland's schema theorem
Genetic memory (computer science)
Premature convergence
Schema (genetic algorithms)
Fitness function
Black box
Black box theory
Fitness approximation
Effective fitness
Speciation (genetic algorithm)
Genetic representation
Stochastic universal sampling
Quality control and genetic algorithms
Human-based genetic algorithm
Interactive evolutionary computation
Genetic programming
Gene expression programming
Grammatical evolution
Grammar induction
Java Grammatical Evolution
Linear genetic programming
Evolutionary programming
Gaussian adaptation
Differential evolution
Particle swarm optimization
Ant colony optimization algorithms
Artificial bee colony algorithm
Evolution strategy
Evolution window
CMA-ES
Cultural algorithm
Learning classifier system
Memetic algorithm
Meta-optimization
Cellular evolutionary algorithm
Cellular automaton
Artificial immune system
Evolutionary multi-modal optimization
Evolutionary music
Coevolution
Evolutionary art
Artificial life
Machine learning
Evolvable hardware
NEAT Particles

References
Article Sources and Contributors
Image Sources, Licenses and Contributors

Article Licenses
License

Evolutionary computation

In computer science, evolutionary computation is a subfield of artificial intelligence (more particularly computational intelligence) that involves combinatorial optimization problems. Evolutionary computation uses iterative progress, such as growth or development in a population. This population is then selected in a guided random search using parallel processing to achieve the desired end. Such processes are often inspired by biological mechanisms of evolution. As evolution can produce highly optimised processes and networks, it has many applications in computer science.

History
The use of Darwinian principles for automated problem solving originated in the 1950s. It was not until the 1960s that three distinct interpretations of this idea started to be developed in three different places. Evolutionary programming was introduced by Lawrence J. Fogel in the US, while John Henry Holland called his method a genetic algorithm. In Germany, Ingo Rechenberg and Hans-Paul Schwefel introduced evolution strategies. These areas developed separately for about 15 years. Since the early 1990s they have been unified as different representatives ("dialects") of one technology, called evolutionary computing.
Also in the early 1990s a fourth stream following the general ideas had emerged: genetic programming. Since the 1990s, evolutionary computation has largely become swarm-based computation, and nature-inspired algorithms are becoming an increasingly significant part of the field. These terminologies denote the field of evolutionary computing and consider evolutionary programming, evolution strategies, genetic algorithms, and genetic programming as sub-areas.

Simulations of evolution using evolutionary algorithms and artificial life started with the work of Nils Aall Barricelli in the 1960s and were extended by Alex Fraser, who published a series of papers on simulation of artificial selection.[1] Artificial evolution became a widely recognised optimisation method as a result of the work of Ingo Rechenberg in the 1960s and early 1970s, who used evolution strategies to solve complex engineering problems.[2] Genetic algorithms in particular became popular through the writing of John Holland.[3] As academic interest grew, dramatic increases in the power of computers allowed practical applications, including the automatic evolution of computer programs.[4] Evolutionary algorithms are now used to solve multi-dimensional problems more efficiently than software produced by human designers, and also to optimise the design of systems.[5]

Techniques
Evolutionary computing techniques mostly involve metaheuristic optimization algorithms. Broadly speaking, the field includes:

Evolutionary algorithms
• Genetic algorithm
• Genetic programming
• Evolutionary programming
• Evolution strategy
• Differential evolution
• Eagle strategy

Swarm intelligence
• Ant colony optimization
• Particle swarm optimization
• Bees algorithm
• Cuckoo search

and to a lesser extent also:
• Artificial life (also see digital organism)
• Artificial immune systems
• Cultural algorithms
• Firefly algorithm
• Harmony search
• Learning classifier systems
• Learnable Evolution Model
• Parallel simulated annealing
• Self-organization such as self-organizing maps, competitive learning
• Self-Organizing Migrating Genetic Algorithm
• Swarm-based computing

Evolutionary algorithms
Evolutionary algorithms form a subset of evolutionary computation in that they generally only involve techniques implementing mechanisms inspired by biological evolution such as reproduction, mutation, recombination, natural selection and survival of the fittest. Candidate solutions to the optimization problem play the role of individuals in a population, and the cost function determines the environment within which the solutions "live" (see also fitness function). Evolution of the population then takes place after the repeated application of the above operators.

In this process, there are two main forces that form the basis of evolutionary systems: recombination and mutation create the necessary diversity and thereby facilitate novelty, while selection acts as a force increasing quality. Many aspects of such an evolutionary process are stochastic. Changed pieces of information due to recombination and mutation are randomly chosen. On the other hand, selection operators can be either deterministic or stochastic. In the latter case, individuals with a higher fitness have a higher chance to be selected than individuals with a lower fitness, but typically even the weak individuals have a chance to become a parent or to survive.

Evolutionary computation practitioners
Incomplete list:
• Kalyanmoy Deb
• David E. Goldberg
• John Henry Holland
• John Koza
• Peter Nordin
• Ingo Rechenberg
• Hans-Paul Schwefel
• Peter J. Fleming
• Carlos M. Fonseca[6]
• Lee Graham

Major conferences and workshops
• IEEE Congress on Evolutionary Computation (CEC)
• Genetic and Evolutionary Computation Conference (GECCO)[7]
• International Conference on Parallel Problem Solving From Nature (PPSN)[8]

Bibliography
• K. A. De Jong, Evolutionary Computation: A Unified Approach. MIT Press, Cambridge, MA, 2006.
• A. E. Eiben and J. E. Smith, Introduction to Evolutionary Computing, Springer, 2003, ISBN 3-540-40184-9.
• A. E. Eiben and M. Schoenauer, Evolutionary computing, Information Processing Letters, 82(1): 1–6, 2002.
• S. Cagnoni et al., Real-World Applications of Evolutionary Computing [9], Springer-Verlag Lecture Notes in Computer Science, Berlin, 2000.
• W. Banzhaf, P. Nordin, R. E. Keller, and F. D. Francone. Genetic Programming – An Introduction. Morgan Kaufmann, 1998.
• D. B. Fogel. Evolutionary Computation: Toward a New Philosophy of Machine Intelligence. IEEE Press, Piscataway, NJ, 1995.
• H.-P. Schwefel. Numerical Optimization of Computer Models. John Wiley & Sons, New York, 1981; 2nd edition 1995.
• Th. Bäck and H.-P. Schwefel. An overview of evolutionary algorithms for parameter optimization. Evolutionary Computation, 1(1):1–23, 1993.
• J. R. Koza. Genetic Programming: On the Programming of Computers by Means of Natural Evolution. MIT Press, Massachusetts, 1992.
• D. E. Goldberg. Genetic Algorithms in Search, Optimization and Machine Learning. Addison-Wesley, 1989.
• J. H. Holland. Adaptation in Natural and Artificial Systems. University of Michigan Press, Ann Arbor, 1975.
• I. Rechenberg. Evolutionsstrategie: Optimierung technischer Systeme nach Prinzipien der biologischen Evolution. Frommann-Holzboog Verlag, Stuttgart, 1973. (German)
• L. J. Fogel, A. J. Owens, and M. J. Walsh. Artificial Intelligence through Simulated Evolution. New York: John Wiley, 1966.

References
[1] Fraser AS (1958). "Monte Carlo analyses of genetic models". Nature 181 (4603): 208–9. doi:10.1038/181208a0. PMID 13504138.
[2] Rechenberg, Ingo (1973) (in German). Evolutionsstrategie – Optimierung technischer Systeme nach Prinzipien der biologischen Evolution (PhD thesis). Fromman-Holzboog.
[3] Holland, John H. (1975). Adaptation in Natural and Artificial Systems. University of Michigan Press. ISBN 0-262-58111-6.
[4] Koza, John R. (1992). Genetic Programming. MIT Press. ISBN 0-262-11170-5.
[5] Jamshidi M (2003). "Tools for intelligent control: fuzzy controllers, neural networks and genetic algorithms". Philosophical Transactions. Series A, Mathematical, Physical, and Engineering Sciences 361 (1809): 1781–808. doi:10.1098/rsta.2003.1225. PMID 12952685.
[6] http://eden.dei.uc.pt/~cmfonsec/
[7] "Special Interest Group on Genetic and Evolutionary Computation" (http://www.sigevo.org/). SIGEVO.
[8] "Parallel Problem Solving from Nature" (http://ls11-www.cs.uni-dortmund.de/rudolph/ppsn). Retrieved 2012-03-06.
[9] http://www.springer.com/computer+science/theoretical+computer+science/foundations+of+computations/book/978-3-540-67353-8

External links
• Evolutionary Computing Research Community Europe (http://www.evolutionary-computing.eu)
• Evolutionary Computation Repository (http://www.fmi.uni-stuttgart.de/fk/evolalg/)
• Hitch-Hiker's Guide to Evolutionary Computation (FAQ for comp.ai.genetic) (http://www.cse.dmu.ac.uk/~rij/gafaq/top.htm)
• Interactive illustration of Evolutionary Computation (http://userweb.eng.gla.ac.uk/yun.li/ga_demo/)
• VitaSCIENCES (http://www.vita-sciences.org/)

Evolutionary algorithm

In artificial intelligence, an evolutionary algorithm (EA) is a subset of evolutionary computation, a generic population-based metaheuristic optimization algorithm. An EA uses some mechanisms inspired by biological evolution: reproduction, mutation, recombination, and selection. Candidate solutions to the optimization problem play the role of individuals in a population, and the fitness function determines the environment within which the solutions "live" (see also cost function). Evolution of the population then takes place after the repeated application of the above operators. Artificial evolution (AE) describes a process involving individual evolutionary algorithms; EAs are individual components that participate in an AE.

Evolutionary algorithms often perform well approximating solutions to all types of problems because they ideally do not make any assumption about the underlying fitness landscape; this generality is shown by successes in fields as diverse as engineering, art, biology, economics, marketing, genetics, operations research, robotics, social sciences, physics, politics and chemistry. Techniques from evolutionary algorithms applied to the modeling of biological evolution are generally limited to explorations of microevolutionary processes; however, some computer simulations, such as Tierra and Avida, attempt to model macroevolutionary dynamics.

In most real applications of EAs, computational complexity is a prohibiting factor. In fact, this computational complexity is due to fitness function evaluation. Fitness approximation is one of the solutions to overcome this difficulty. However, a seemingly simple EA can often solve complex problems; therefore, there may be no direct link between algorithm complexity and problem complexity.

Another possible limitation of many evolutionary algorithms is their lack of a clear genotype-phenotype distinction. In nature, the fertilized egg cell undergoes a complex process known as embryogenesis to become a mature phenotype. This indirect encoding is believed to make the genetic search more robust (i.e. reduce the probability of fatal mutations), and also may improve the evolvability of the organism.[1][2] Such indirect (also known as generative or developmental) encodings also enable evolution to exploit the regularity in the environment.[3] Recent work in the field of artificial embryogeny, or artificial developmental systems, seeks to address these concerns. Gene expression programming successfully explores a genotype-phenotype system, where the genotype consists of linear multigenic chromosomes of fixed length and the phenotype consists of multiple expression trees or computer programs of different sizes and shapes.[4]

Implementation of biological processes
Usually, an initial population of randomly generated candidate solutions comprises the first generation. The fitness function is applied to the candidate solutions and any subsequent offspring.
In selection, parents for the next generation are chosen with a bias towards higher fitness. The parents produce one or two offspring (new candidates) by copying their genes, with two possible changes: crossover recombines the parental genes, and mutation alters the genotype of an individual in a random way. These new candidates compete with old candidates for their place in the next generation (survival of the fittest). This process can be repeated until a candidate with sufficient quality (a solution) is found or a previously defined computational limit is reached (a minimal sketch of this generational loop is given below, after the lists of techniques).

Evolutionary algorithm techniques
Similar techniques differ in the implementation details and the nature of the particular applied problem.
• Genetic algorithm - This is the most popular type of EA. One seeks the solution of a problem in the form of strings of numbers (traditionally binary, although the best representations are usually those that reflect something about the problem being solved), by applying operators such as recombination and mutation (sometimes one, sometimes both). This type of EA is often used in optimization problems.
• Genetic programming - Here the solutions are in the form of computer programs, and their fitness is determined by their ability to solve a computational problem.
• Evolutionary programming - Similar to genetic programming, but the structure of the program is fixed and its numerical parameters are allowed to evolve.
• Gene expression programming - Like genetic programming, GEP also evolves computer programs, but it explores a genotype-phenotype system, where computer programs of different sizes are encoded in linear chromosomes of fixed length.
• Evolution strategy - Works with vectors of real numbers as representations of solutions, and typically uses self-adaptive mutation rates.
• Differential evolution - Based on vector differences and therefore primarily suited for numerical optimization problems.
• Neuroevolution - Similar to genetic programming, but the genomes represent artificial neural networks by describing structure and connection weights. The genome encoding can be direct or indirect.
• Learning classifier system

Related techniques
Swarm algorithms, including:
• Ant colony optimization - Based on the ideas of ant foraging by pheromone communication to form paths. Primarily suited for combinatorial optimization and graph problems.
• Bees algorithm - Based on the foraging behaviour of honey bees. It has been applied in many applications such as routing and scheduling.
• Cuckoo search - Inspired by the brood parasitism of some cuckoo species. It also uses Lévy flights, and thus suits global optimization problems.
• Particle swarm optimization - Based on the ideas of animal flocking behaviour. Also primarily suited for numerical optimization problems.

Other population-based metaheuristic methods:
• Firefly algorithm - Inspired by the behavior of fireflies, attracting each other by flashing light. This is especially useful for multimodal optimization.
• Invasive weed optimization algorithm - Based on the ideas of weed colony behavior in searching and finding a suitable place for growth and reproduction.
• Harmony search - Based on the ideas of musicians' behavior in searching for better harmonies. This algorithm is suitable for combinatorial optimization as well as parameter optimization.
• Gaussian adaptation - Based on information theory. Used for maximization of manufacturing yield, mean fitness or average information. See for instance Entropy in thermodynamics and information theory.
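As an illustration (not part of the original article), the generational loop described above can be sketched in a few lines of Python. The real-valued encoding, tournament selection, uniform crossover, Gaussian mutation and the toy objective function are all arbitrary choices made for brevity; real EAs vary widely in these details.

```python
import random

def fitness(x):
    # Toy "sphere" objective: higher is better here, so negate the squared norm.
    return -sum(v * v for v in x)

def tournament(pop, k=3):
    # Pick the best of k randomly chosen individuals (selection with a bias towards fitness).
    return max(random.sample(pop, k), key=fitness)

def crossover(a, b):
    # Uniform crossover: each gene is copied from one parent at random.
    return [random.choice(pair) for pair in zip(a, b)]

def mutate(x, rate=0.1, sigma=0.3):
    # Gaussian mutation applied gene-wise with a small probability.
    return [v + random.gauss(0.0, sigma) if random.random() < rate else v for v in x]

def evolve(dim=5, pop_size=50, generations=100):
    pop = [[random.uniform(-5, 5) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(generations):
        offspring = [mutate(crossover(tournament(pop), tournament(pop)))
                     for _ in range(pop_size)]
        # Survival of the fittest: parents and offspring compete for the next generation.
        pop = sorted(pop + offspring, key=fitness, reverse=True)[:pop_size]
    return max(pop, key=fitness)

if __name__ == "__main__":
    best = evolve()
    print("best individual:", [round(v, 3) for v in best])
    print("best fitness:", round(fitness(best), 6))
```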
References
[1] G. S. Hornby and J. B. Pollack. Creating high-level components with a generative representation for body-brain evolution. Artificial Life, 8(3):223–246, 2002.
[2] Jeff Clune, Benjamin Beckmann, Charles Ofria, and Robert Pennock. "Evolving Coordinated Quadruped Gaits with the HyperNEAT Generative Encoding" (https://www.msu.edu/~jclune/webfiles/Evolving-Quadruped-Gaits-With-HyperNEAT.html). Proceedings of the IEEE Congress on Evolutionary Computing, Special Section on Evolutionary Robotics, 2009. Trondheim, Norway.
[3] J. Clune, C. Ofria, and R. T. Pennock, "How a generative encoding fares as problem-regularity decreases," in PPSN (G. Rudolph, T. Jansen, S. M. Lucas, C. Poloni, and N. Beume, eds.), vol. 5199 of Lecture Notes in Computer Science, pp. 358–367, Springer, 2008.
[4] Ferreira, C., 2001. Gene Expression Programming: A New Adaptive Algorithm for Solving Problems. Complex Systems, Vol. 13, issue 2: 87–129. (http://www.gene-expression-programming.com/webpapers/GEP.pdf)

Bibliography
• Ashlock, D. (2006), Evolutionary Computation for Modeling and Optimization, Springer, ISBN 0-387-22196-4.
• Bäck, T. (1996), Evolutionary Algorithms in Theory and Practice: Evolution Strategies, Evolutionary Programming, Genetic Algorithms, Oxford Univ. Press.
• Bäck, T., Fogel, D., Michalewicz, Z. (1997), Handbook of Evolutionary Computation, Oxford Univ. Press.
• Eiben, A. E., Smith, J. E. (2003), Introduction to Evolutionary Computing, Springer.
• Holland, J. H. (1975), Adaptation in Natural and Artificial Systems, The University of Michigan Press, Ann Arbor.
• Poli, R., Langdon, W. B., McPhee, N. F. (2008). A Field Guide to Genetic Programming (http://cswww.essex.ac.uk/staff/rpoli/gp-field-guide/). Lulu.com, freely available from the internet. ISBN 978-1-4092-0073-4.
• Ingo Rechenberg (1971): Evolutionsstrategie – Optimierung technischer Systeme nach Prinzipien der biologischen Evolution (PhD thesis). Reprinted by Fromman-Holzboog (1973).
• Hans-Paul Schwefel (1974): Numerische Optimierung von Computer-Modellen (PhD thesis). Reprinted by Birkhäuser (1977).
• Michalewicz, Z., Fogel, D. B. (2004). How To Solve It: Modern Heuristics, Springer.
• Price, K., Storn, R. M., Lampinen, J. A. (2005). Differential Evolution: A Practical Approach to Global Optimization, Springer.
• Yang, X.-S. (2010). Nature-Inspired Metaheuristic Algorithms, 2nd Edition, Luniver Press.

External links
• Evolutionary Computation Repository (http://www.fmi.uni-stuttgart.de/fk/evolalg/)
• Genetic Algorithms and Evolutionary Computation (http://www.talkorigins.org/faqs/genalg/genalg.html)
• An online interactive Evolutionary Algorithm demonstrator to practise or learn how exactly an EA works (http://userweb.elec.gla.ac.uk/y/yunli/ga_demo/). Learn step by step or watch global convergence in batch; change population size, crossover rate, mutation rate and selection mechanism; and add constraints.

Mathematical optimization

In mathematics, computational science, or management science, mathematical optimization (alternatively, optimization or mathematical programming) refers to the selection of a best element from some set of available alternatives.[1] In the simplest case, an optimization problem consists of maximizing or minimizing a real function by systematically choosing input values from within an allowed set and computing the value of the function.
The generalization of optimization theory and techniques to other formulations comprises a large area of applied mathematics. More generally, optimization includes finding "best available" values of some objective function given a defined domain, including a variety of different types of objective functions and different types of domains.

[Figure: graph of a paraboloid given by f(x, y) = −(x² + y²) + 4; the global maximum at (0, 0, 4) is indicated by a red dot.]

Optimization problems
An optimization problem can be represented in the following way:
Given: a function f : A → R from some set A to the real numbers.
Sought: an element x0 in A such that f(x0) ≤ f(x) for all x in A ("minimization") or such that f(x0) ≥ f(x) for all x in A ("maximization").
Such a formulation is called an optimization problem or a mathematical programming problem (a term not directly related to computer programming, but still in use for example in linear programming - see History below). Many real-world and theoretical problems may be modeled in this general framework. Problems formulated using this technique in the fields of physics and computer vision may refer to the technique as energy minimization, speaking of the value of the function f as representing the energy of the system being modeled.

Typically, A is some subset of the Euclidean space Rⁿ, often specified by a set of constraints, equalities or inequalities that the members of A have to satisfy. The domain A of f is called the search space or the choice set, while the elements of A are called candidate solutions or feasible solutions. The function f is called, variously, an objective function, cost function (minimization), utility function (maximization), or, in certain fields, an energy function or energy functional. A feasible solution that minimizes (or maximizes, if that is the goal) the objective function is called an optimal solution. By convention, the standard form of an optimization problem is stated in terms of minimization.

Generally, unless both the objective function and the feasible region are convex in a minimization problem, there may be several local minima, where a local minimum x* is defined as a point for which there exists some δ > 0 so that for all x with ‖x − x*‖ ≤ δ the inequality f(x*) ≤ f(x) holds; that is to say, on some region around x* all of the function values are greater than or equal to the value at that point. Local maxima are defined similarly.

A large number of algorithms proposed for solving non-convex problems – including the majority of commercially available solvers – are not capable of making a distinction between local optimal solutions and rigorous optimal solutions, and will treat the former as actual solutions to the original problem. The branch of applied mathematics and numerical analysis that is concerned with the development of deterministic algorithms that are capable of guaranteeing convergence in finite time to the actual optimal solution of a non-convex problem is called global optimization.

Notation
Optimization problems are often expressed with special notation. Here are some examples.

Minimum and maximum value of a function
Consider the following notation:
    min_{x ∈ R} x²
This denotes the minimum value of the objective function x², when choosing x from the set of real numbers R. The minimum value in this case is 0, occurring at x = 0.
Similarly, the notation
    max_{x ∈ R} 2x
asks for the maximum value of the objective function 2x, where x may be any real number. In this case, there is no such maximum as the objective function is unbounded, so the answer is "infinity" or "undefined".

Optimal input arguments
Consider the following notation:
    arg min_{x ∈ (−∞, −1]} (x² + 1),
or equivalently
    arg min_{x} (x² + 1), subject to x ∈ (−∞, −1].
This represents the value (or values) of the argument x in the interval (−∞, −1] that minimizes (or minimize) the objective function x² + 1 (the actual minimum value of that function is not what the problem asks for). In this case, the answer is x = −1, since x = 0 is infeasible, i.e. does not belong to the feasible set.
Similarly,
    arg max_{x ∈ [−5, 5], y ∈ R} x·cos(y),
or equivalently
    arg max_{x, y} x·cos(y), subject to x ∈ [−5, 5],
represents the pair (or pairs) (x, y) that maximizes (or maximize) the value of the objective function x·cos(y), with the added constraint that x lie in the interval [−5, 5] (again, the actual maximum value of the expression does not matter). In this case, the solutions are the pairs of the form (5, 2kπ) and (−5, (2k+1)π), where k ranges over all integers.
Arg min and arg max are sometimes also written argmin and argmax, and stand for argument of the minimum and argument of the maximum.

History
Fermat and Lagrange found calculus-based formulas for identifying optima, while Newton and Gauss proposed iterative methods for moving towards an optimum. Historically, the first term for optimization was "linear programming", which was due to George B. Dantzig, although much of the theory had been introduced by Leonid Kantorovich in 1939. Dantzig published the Simplex algorithm in 1947, and John von Neumann developed the theory of duality in the same year. The term programming in this context does not refer to computer programming. Rather, the term comes from the use of program by the United States military to refer to proposed training and logistics schedules, which were the problems Dantzig studied at that time.

Later important researchers in mathematical optimization include the following:
• Richard Bellman
• Ronald A. Howard
• Narendra Karmarkar
• William Karush
• Leonid Khachiyan
• Bernard Koopman
• Harold Kuhn
• Joseph Louis Lagrange
• László Lovász
• Arkadii Nemirovskii
• Yurii Nesterov
• Boris Polyak
• Lev Pontryagin
• James Renegar
• R. Tyrrell Rockafellar
• Cornelis Roos
• Naum Z. Shor
• Michael J. Todd
• Albert Tucker

Major subfields
• Convex programming studies the case when the objective function is convex (minimization) or concave (maximization) and the constraint set is convex. This can be viewed as a particular case of nonlinear programming or as a generalization of linear or convex quadratic programming.
  • Linear programming (LP), a type of convex programming, studies the case in which the objective function f is linear and the set of constraints is specified using only linear equalities and inequalities. Such a set is called a polyhedron or a polytope if it is bounded.
  • Second order cone programming (SOCP) is a convex program, and includes certain types of quadratic programs.
  • Semidefinite programming (SDP) is a subfield of convex optimization where the underlying variables are semidefinite matrices. It is a generalization of linear and convex quadratic programming.
  • Conic programming is a general form of convex programming. LP, SOCP and SDP can all be viewed as conic programs with the appropriate type of cone.
  • Geometric programming is a technique whereby objective and inequality constraints expressed as posynomials and equality constraints as monomials can be transformed into a convex program.
• Integer programming studies linear programs in which some or all variables are constrained to take on integer values.
This is not convex, and in general much more difficult than regular linear programming. Quadratic programming allows the objective function to have quadratic terms, while the feasible set must be specified with linear equalities and inequalities. For specific forms of the quadratic term, this is a type of convex programming. Fractional programming studies optimization of ratios of two nonlinear functions. The special class of concave fractional programs can be transformed to a convex optimization problem. Nonlinear programming studies the general case in which the objective function or the constraints or both contain nonlinear parts. This may or may not be a convex program. In general, whether the program is convex affects the Mathematical optimization • • • • • • difficulty of solving it. Stochastic programming studies the case in which some of the constraints or parameters depend on random variables. Robust programming is, like stochastic programming, an attempt to capture uncertainty in the data underlying the optimization problem. This is not done through the use of random variables, but instead, the problem is solved taking into account inaccuracies in the input data. Combinatorial optimization is concerned with problems where the set of feasible solutions is discrete or can be reduced to a discrete one. Infinite-dimensional optimization studies the case when the set of feasible solutions is a subset of an infinite-dimensional space, such as a space of functions. Heuristics and metaheuristics make few or no assumptions about the problem being optimized. Usually, heuristics do not guarantee that any optimal solution need be found. On the other hand, heuristics are used to find approximate solutions for many complicated optimization problems. Constraint satisfaction studies the case in which the objective function f is constant (this is used in artificial intelligence, particularly in automated reasoning). • Constraint programming. • Disjunctive programming is used where at least one constraint must be satisfied but not all. It is of particular use in scheduling. In a number of subfields, the techniques are designed primarily for optimization in dynamic contexts (that is, decision making over time): • Calculus of variations seeks to optimize an objective defined over many points in time, by considering how the objective function changes if there is a small change in the choice path. • Optimal control theory is a generalization of the calculus of variations. • Dynamic programming studies the case in which the optimization strategy is based on splitting the problem into smaller subproblems. The equation that describes the relationship between these subproblems is called the Bellman equation. • Mathematical programming with equilibrium constraints is where the constraints include variational inequalities or complementarities. Multi-objective optimization Adding more than one objective to an optimization problem adds complexity. For example, to optimize a structural design, one would want a design that is both light and rigid. Because these two objectives conflict, a trade-off exists. There will be one lightest design, one stiffest design, and an infinite number of designs that are some compromise of weight and stiffness. The set of trade-off designs that cannot be improved upon according to one criterion without hurting another criterion is known as the Pareto set. The curve created plotting weight against stiffness of the best designs is known as the Pareto frontier. 
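To make the notion of a Pareto set concrete, here is a small added sketch (not part of the article). The (weight, stiffness) pairs are hypothetical; weight is to be minimized and stiffness maximized, and the Pareto set keeps exactly the designs that no other design improves upon in both respects.

```python
def dominates(a, b):
    """True if design a dominates design b: no worse in both objectives
    (lower or equal weight, higher or equal stiffness) and strictly better in one."""
    weight_a, stiff_a = a
    weight_b, stiff_b = b
    no_worse = weight_a <= weight_b and stiff_a >= stiff_b
    strictly_better = weight_a < weight_b or stiff_a > stiff_b
    return no_worse and strictly_better

def pareto_front(designs):
    """Return the designs not dominated by any other design."""
    return [d for d in designs
            if not any(dominates(other, d) for other in designs if other is not d)]

# Hypothetical (weight, stiffness) pairs for candidate structural designs.
designs = [(1.0, 3.0), (1.5, 5.0), (2.0, 5.0), (2.5, 8.0), (3.0, 7.5)]
print(pareto_front(designs))   # (2.0, 5.0) and (3.0, 7.5) are dominated and drop out
```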
A design is judged to be "Pareto optimal" (equivalently, "Pareto efficient" or in the Pareto set) if it is not dominated by any other design: If it is worse than another design in some respects and no better in any respect, then it is dominated and is not Pareto optimal. The choice among "Pareto optimal" solutions to determine the "favorite solution" is delegated to the decision maker. In other words, defining the problem as multiobjective optimization signals that some information is missing: desirable objectives are given but not their detailed combination. In some cases, the missing information can be derived by interactive sessions with the decision maker. 10 Mathematical optimization Multi-modal optimization Optimization problems are often multi-modal; that is they possess multiple good solutions. They could all be globally good (same cost function value) or there could be a mix of globally good and locally good solutions. Obtaining all (or at least some of) the multiple solutions is the goal of a multi-modal optimizer. Classical optimization techniques due to their iterative approach do not perform satisfactorily when they are used to obtain multiple solutions, since it is not guaranteed that different solutions will be obtained even with different starting points in multiple runs of the algorithm. Evolutionary Algorithms are however a very popular approach to obtain multiple solutions in a multi-modal optimization task. See Evolutionary multi-modal optimization. Classification of critical points and extrema Feasibility problem The satisfiability problem, also called the feasibility problem, is just the problem of finding any feasible solution at all without regard to objective value. This can be regarded as the special case of mathematical optimization where the objective value is the same for every solution, and thus any solution is optimal. Many optimization algorithms need to start from a feasible point. One way to obtain such a point is to relax the feasibility conditions using a slack variable; with enough slack, any starting point is feasible. Then, minimize that slack variable until slack is null or negative. Existence The extreme value theorem of Karl Weierstrass states that a continuous real-valued function on a compact set attains its maximum and minimum value. More generally, a lower semi-continuous function on a compact set attains its minimum; an upper semi-continuous function on a compact set attains its maximum. Necessary conditions for optimality One of Fermat's theorems states that optima of unconstrained problems are found at stationary points, where the first derivative or the gradient of the objective function is zero (see first derivative test). More generally, they may be found at critical points, where the first derivative or gradient of the objective function is zero or is undefined, or on the boundary of the choice set. An equation (or set of equations) stating that the first derivative(s) equal(s) zero at an interior optimum is called a 'first-order condition' or a set of first-order conditions. Optima of inequality-constrained problems are instead found by the Lagrange multiplier method. This method calculates a system of inequalities called the 'Karush–Kuhn–Tucker conditions' or 'complementary slackness conditions', which may then be used to calculate the optimum. 
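One way to see these optimality conditions used in practice is to hand a small constrained problem to a numerical solver. The sketch below is an added illustration, not from the article; it assumes NumPy and SciPy are available, and the objective and constraint are toy choices. SciPy's SLSQP method is a sequential quadratic programming solver, and at its reported solution the Karush–Kuhn–Tucker conditions discussed above hold to within the solver's tolerance.

```python
import numpy as np
from scipy.optimize import minimize

# Objective: f(x, y) = (x - 1)^2 + (y - 2)^2, a toy problem chosen for illustration.
def f(v):
    x, y = v
    return (x - 1.0) ** 2 + (y - 2.0) ** 2

# Inequality constraint in SciPy's convention g(v) >= 0, here encoding x + y <= 2.
constraints = [{"type": "ineq", "fun": lambda v: 2.0 - v[0] - v[1]}]

result = minimize(f, x0=np.array([0.0, 0.0]), method="SLSQP", constraints=constraints)
print(result.x)   # close to [0.5, 1.5]; the constraint x + y <= 2 is active there
```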
Sufficient conditions for optimality While the first derivative test identifies points that might be extrema, this test does not distinguish a point that is a minimum from one that is a maximum or one that is neither. When the objective function is twice differentiable, these cases can be distinguished by checking the second derivative or the matrix of second derivatives (called the Hessian matrix) in unconstrained problems, or the matrix of second derivatives of the objective function and the constraints called the bordered Hessian in constrained problems. The conditions that distinguish maxima, or minima, from other stationary points are called 'second-order conditions' (see 'Second derivative test'). If a candidate solution satisfies the first-order conditions, then satisfaction of the second-order conditions as well is sufficient to establish at least local optimality. 11 Mathematical optimization Sensitivity and continuity of optima The envelope theorem describes how the value of an optimal solution changes when an underlying parameter changes. The process of computing this change is called comparative statics. The maximum theorem of Claude Berge (1963) describes the continuity of an optimal solution as a function of underlying parameters. Calculus of optimization For unconstrained problems with twice-differentiable functions, some critical points can be found by finding the points where the gradient of the objective function is zero (that is, the stationary points). More generally, a zero subgradient certifies that a local minimum has been found for minimization problems with convex functions and other locally Lipschitz functions. Further, critical points can be classified using the definiteness of the Hessian matrix: If the Hessian is positive definite at a critical point, then the point is a local minimum; if the Hessian matrix is negative definite, then the point is a local maximum; finally, if indefinite, then the point is some kind of saddle point. Constrained problems can often be transformed into unconstrained problems with the help of Lagrange multipliers. Lagrangian relaxation can also provide approximate solutions to difficult constrained problems. When the objective function is convex, then any local minimum will also be a global minimum. There exist efficient numerical techniques for minimizing convex functions, such as interior-point methods. Computational optimization techniques To solve problems, researchers may use algorithms that terminate in a finite number of steps, or iterative methods that converge to a solution (on some specified class of problems), or heuristics that may provide approximate solutions to some problems (although their iterates need not converge). Optimization algorithms • • • • Simplex algorithm of George Dantzig, designed for linear programming. Extensions of the simplex algorithm, designed for quadratic programming and for linear-fractional programming. Variants of the simplex algorithm that are especially suited for network optimization. Combinatorial algorithms Iterative methods The iterative methods used to solve problems of nonlinear programming differ according to whether they evaluate Hessians, gradients, or only function values. While evaluating Hessians (H) and gradients (G) improves the rate of convergence, such evaluations increase the computational complexity (or computational cost) of each iteration. In some cases, the computational complexity may be excessively high. 
One major criterion for optimizers is just the number of required function evaluations as this often is already a large computational effort, usually much more effort than within the optimizer itself, which mainly has to operate over the N variables. The derivatives provide detailed information for such optimizers, but are even harder to calculate, e.g. approximating the gradient takes at least N+1 function evaluations. For approximations of the 2nd derivatives (collected in the Hessian matrix) the number of function evaluations is in the order of N². Newton's method requires the 2nd order derivates, so for each iteration the number of function calls is in the order of N², but for a simpler pure gradient optimizer it is only N. However, gradient optimizers need usually more iterations than Newton's algorithm. Which one is best wrt. number of function calls depends on the problem itself. • Methods that evaluate Hessians (or approximate Hessians, using finite differences): 12 Mathematical optimization • Newton's method • Sequential quadratic programming: A Newton-based method for small-medium scale constrained problems. Some versions can handle large-dimensional problems. • Methods that evaluate gradients or approximate gradients using finite differences (or even subgradients): • Quasi-Newton methods: Iterative methods for medium-large problems (e.g. N<1000). • Conjugate gradient methods: Iterative methods for large problems. (In theory, these methods terminate in a finite number of steps with quadratic objective functions, but this finite termination is not observed in practice on finite–precision computers.) • Interior point methods: This is a large class of methods for constrained optimization. Some interior-point methods use only (sub)gradient information, and others of which require the evaluation of Hessians. • Gradient descent (alternatively, "steepest descent" or "steepest ascent"): A (slow) method of historical and theoretical interest, which has had renewed interest for finding approximate solutions of enormous problems. • Subgradient methods - An iterative method for large locally Lipschitz functions using generalized gradients. Following Boris T. Polyak, subgradient–projection methods are similar to conjugate–gradient methods. • Bundle method of descent: An iterative method for small–medium sized problems with locally Lipschitz functions, particularly for convex minimization problems. (Similar to conjugate gradient methods) • Ellipsoid method: An iterative method for small problems with quasiconvex objective functions and of great theoretical interest, particularly in establishing the polynomial time complexity of some combinatorial optimization problems. It has similarities with Quasi-Newton methods. • Reduced gradient method (Frank–Wolfe) for approximate minimization of specially structured problems with linear constraints, especially with traffic networks. For general unconstrained problems, this method reduces to the gradient method, which is regarded as obsolete (for almost all problems). • Methods that evaluate only function values: If a problem is continuously differentiable, then gradients can be approximated using finite differences, in which case a gradient-based method can be used. • Interpolation methods • Pattern search methods, which have better convergence properties than the Nelder–Mead heuristic (with simplices), which is listed below. 
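The remark above that approximating a gradient costs at least N + 1 function evaluations can be illustrated with a short added sketch (assuming NumPy is available; the test function and step size are arbitrary choices): a forward-difference approximation spends one evaluation at the base point and one per coordinate.

```python
import numpy as np

def fd_gradient(f, x, h=1e-6):
    """Forward-difference gradient estimate using exactly N + 1 evaluations of f."""
    x = np.asarray(x, dtype=float)
    fx = f(x)                       # 1 evaluation at the base point
    grad = np.empty_like(x)
    for i in range(x.size):         # N more evaluations, one per coordinate
        step = np.zeros_like(x)
        step[i] = h
        grad[i] = (f(x + step) - fx) / h
    return grad

# Example on a smooth test function; the exact gradient of sum(x**2) is 2*x.
print(fd_gradient(lambda x: np.sum(x ** 2), [1.0, -2.0, 0.5]))
```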
Global convergence More generally, if the objective function is not a quadratic function, then many optimization methods use other methods to ensure that some subsequence of iterations converges to an optimal solution. The first and still popular method for ensuring convergence relies on line searches, which optimize a function along one dimension. A second and increasingly popular method for ensuring convergence uses trust regions. Both line searches and trust regions are used in modern methods of non-differentiable optimization. Usually a global optimizer is much slower than advanced local optimizers (such as BFGS), so often an efficient global optimizer can be constructed by starting the local optimizer from different starting points. 13 Mathematical optimization Heuristics Besides (finitely terminating) algorithms and (convergent) iterative methods, there are heuristics that can provide approximate solutions to some optimization problems: • • • • • • • • • • Memetic algorithm Differential evolution Dynamic relaxation Genetic algorithms Hill climbing Nelder-Mead simplicial heuristic: A popular heuristic for approximate minimization (without calling gradients) Particle swarm optimization Simulated annealing Tabu search Reactive Search Optimization (RSO)[2] implemented in LIONsolver Applications Mechanics and engineering Problems in rigid body dynamics (in particular articulated rigid body dynamics) often require mathematical programming techniques, since you can view rigid body dynamics as attempting to solve an ordinary differential equation on a constraint manifold; the constraints are various nonlinear geometric constraints such as "these two points must always coincide", "this surface must not penetrate any other", or "this point must always lie somewhere on this curve". Also, the problem of computing contact forces can be done by solving a linear complementarity problem, which can also be viewed as a QP (quadratic programming) problem. Many design problems can also be expressed as optimization programs. This application is called design optimization. One subset is the engineering optimization, and another recent and growing subset of this field is multidisciplinary design optimization, which, while useful in many problems, has in particular been applied to aerospace engineering problems. Economics Economics is closely enough linked to optimization of agents that an influential definition relatedly describes economics qua science as the "study of human behavior as a relationship between ends and scarce means" with alternative uses.[3] Modern optimization theory includes traditional optimization theory but also overlaps with game theory and the study of economic equilibria. The Journal of Economic Literature codes classify mathematical programming, optimization techniques, and related topics under JEL:C61-C63. In microeconomics, the utility maximization problem and its dual problem, the expenditure minimization problem, are economic optimization problems. Insofar as they behave consistently, consumers are assumed to maximize their utility, while firms are usually assumed to maximize their profit. Also, agents are often modeled as being risk-averse, thereby preferring to avoid risk. Asset prices are also modeled using optimization theory, though the underlying mathematics relies on optimizing stochastic processes rather than on static optimization. Trade theory also uses optimization to explain trade patterns between nations. 
The optimization of market portfolios is an example of multi-objective optimization in economics. Since the 1970s, economists have modeled dynamic decisions over time using control theory. For example, microeconomists use dynamic search models to study labor-market behavior.[4] A crucial distinction is between deterministic and stochastic models.[5] Macroeconomists build dynamic stochastic general equilibrium (DSGE) models that describe the dynamics of the whole economy as the result of the interdependent optimizing decisions of 14 Mathematical optimization workers, consumers, investors, and governments.[6][7] Operations research Another field that uses optimization techniques extensively is operations research. Operations research also uses stochastic modeling and simulation to support improved decision-making. Increasingly, operations research uses stochastic programming to model dynamic decisions that adapt to events; such problems can be solved with large-scale optimization and stochastic optimization methods. Control engineering Mathematical optimization is used in much modern controller design. High-level controllers such as Model predictive control (MPC) or Real-Time Optimization (RTO) employ mathematical optimization. These algorithms run online and repeatedly determine values for decision variables, such as choke openings in a process plant, by iteratively solving a mathematical optimization problem including constraints and a model of the system to be controlled. Notes [1] " The Nature of Mathematical Programming (http:/ / glossary. computing. society. informs. org/ index. php?page=nature. html)," Mathematical Programming Glossary, INFORMS Computing Society. [2] Battiti, Roberto; Mauro Brunato; Franco Mascia (2008). Reactive Search and Intelligent Optimization (http:/ / reactive-search. org/ thebook). Springer Verlag. ISBN 978-0-387-09623-0. . [3] Lionel Robbins (1935, 2nd ed.) An Essay on the Nature and Significance of Economic Science, Macmillan, p. 16. [4] A. K. Dixit ([1976] 1990). Optimization in Economic Theory, 2nd ed., Oxford. Description (http:/ / books. google. com/ books?id=dHrsHz0VocUC& pg=find& pg=PA194=false#v=onepage& q& f=false) and contents preview (http:/ / books. google. com/ books?id=dHrsHz0VocUC& pg=PR7& lpg=PR6& dq=false& lr=#v=onepage& q=false& f=false). [5] A.G. Malliaris (2008). "stochastic optimal control," The New Palgrave Dictionary of Economics, 2nd Edition. Abstract (http:/ / www. dictionaryofeconomics. com/ article?id=pde2008_S000269& edition=& field=keyword& q=Taylor's th& topicid=& result_number=1). [6] Julio Rotemberg and Michael Woodford (1997), "An Optimization-based Econometric Framework for the Evaluation of Monetary Policy.NBER Macroeconomics Annual, 12, pp. 297-346. (http:/ / people. hbs. edu/ jrotemberg/ PublishedArticles/ OptimizBasedEconometric_97. pdf) [7] From The New Palgrave Dictionary of Economics (2008), 2nd Edition with Abstract links: • " numerical optimization methods in economics (http:/ / www. dictionaryofeconomics. com/ article?id=pde2008_N000148& edition=current& q=optimization& topicid=& result_number=1)" by Karl Schmedders • " convex programming (http:/ / www. dictionaryofeconomics. com/ article?id=pde2008_C000348& edition=current& q=optimization& topicid=& result_number=4)" by Lawrence E. Blume • " Arrow–Debreu model of general equilibrium (http:/ / www. dictionaryofeconomics. com/ article?id=pde2008_A000133& edition=current& q=optimization& topicid=& result_number=20)" by John Geanakoplos. 
Further reading Comprehensive Undergraduate level • Bradley, S.; Hax, A.; Magnanti, T. (1977). Applied mathematical programming. Addison Wesley. • Rardin, Ronald L. (1997). Optimization in operations research. Prentice Hall. pp. 919. ISBN 0-02-398415-5. • Strang, Gilbert (1986). Introduction to applied mathematics (http://www.wellesleycambridge.com/tocs/ toc-appl). Wellesley, MA: Wellesley-Cambridge Press (Strang's publishing company). pp. xii+758. ISBN 0-9614088-0-4. MR870634. 15 Mathematical optimization Graduate level • Magnanti, Thomas L. (1989). "Twenty years of mathematical programming". In Cornet, Bernard; Tulkens, Henry. Contributions to Operations Research and Economics: The twentieth anniversary of CORE (Papers from the symposium held in Louvain-la-Neuve, January 1987). Cambridge, MA: MIT Press. pp. 163–227. ISBN 0-262-03149-3. MR1104662. • Minoux, M. (1986). Mathematical programming: Theory and algorithms (Translated by Steven Vajda from the (1983 Paris: Dunod) French ed.). Chichester: A Wiley-Interscience Publication. John Wiley & Sons, Ltd.. pp. xxviii+489. ISBN 0-471-90170-9. MR2571910. (2008 Second ed., in French: Programmation mathématique: Théorie et algorithmes. Editions Tec & Doc, Paris, 2008. xxx+711 pp. ISBN 978-2-7430-1000-3.. • Nemhauser, G. L.; Rinnooy Kan, A. H. G.; Todd, M. J., eds. (1989). Optimization. Handbooks in Operations Research and Management Science. 1. Amsterdam: North-Holland Publishing Co.. pp. xiv+709. ISBN 0-444-87284-1. MR1105099. • J. E. Dennis, Jr. and Robert B. Schnabel, A view of unconstrained optimization (pp. 1–72); • Donald Goldfarb and Michael J. Todd, Linear programming (pp. 73–170); • Philip E. Gill, Walter Murray, Michael A. Saunders, and Margaret H. Wright, Constrained nonlinear programming (pp. 171–210); • Ravindra K. Ahuja, Thomas L. Magnanti, and James B. Orlin, Network flows (pp. 211–369); • • • • • • W. R. Pulleyblank, Polyhedral combinatorics (pp. 371–446); George L. Nemhauser and Laurence A. Wolsey, Integer programming (pp. 447–527); Claude Lemaréchal, Nondifferentiable optimization (pp. 529–572); Roger J-B Wets, Stochastic programming (pp. 573–629); A. H. G. Rinnooy Kan and G. T. Timmer, Global optimization (pp. 631–662); P. L. Yu, Multiple criteria decision making: five basic concepts (pp. 663–699). • Shapiro, Jeremy F. (1979). Mathematical programming: Structures and algorithms. New York: Wiley-Interscience [John Wiley & Sons]. pp. xvi+388. ISBN 0-471-77886-9. MR544669. Continuous optimization • Mordecai Avriel (2003). Nonlinear Programming: Analysis and Methods. Dover Publishing. ISBN 0-486-43227-0. • Bonnans, J. Frédéric; Gilbert, J. Charles; Lemaréchal, Claude; Sagastizábal, Claudia A. (2006). Numerical optimization: Theoretical and practical aspects (http://www.springer.com/mathematics/applications/book/ 978-3-540-35445-1). Universitext (Second revised ed. of translation of 1997 French ed.). Berlin: Springer-Verlag. pp. xiv+490. doi:10.1007/978-3-540-35447-5. ISBN 3-540-35445-X. MR2265882. • Bonnans, J. Frédéric; Shapiro, Alexander (2000). Perturbation analysis of optimization problems. Springer Series in Operations Research. New York: Springer-Verlag. pp. xviii+601. ISBN 0-387-98705-3. MR1756264. • Boyd, Stephen P.; Vandenberghe, Lieven (2004) (pdf). Convex Optimization (http://www.stanford.edu/~boyd/ cvxbook/bv_cvxbook.pdf). Cambridge University Press. ISBN 978-0-521-83378-3. Retrieved October 15, 2011. • Jorge Nocedal and Stephen J. Wright (2006). 
Numerical Optimization (http://www.ece.northwestern.edu/ ~nocedal/book/num-opt.html). Springer. ISBN 0-387-30303-0. 16 Mathematical optimization Combinatorial optimization • R. K. Ahuja, Thomas L. Magnanti, and James B. Orlin (1993). Network Flows: Theory, Algorithms, and Applications. Prentice-Hall, Inc. ISBN 0-13-617549-X. • William J. Cook, William H. Cunningham, William R. Pulleyblank, Alexander Schrijver; Combinatorial Optimization; John Wiley & Sons; 1 edition (November 12, 1997); ISBN 0-471-55894-X. • Gondran, Michel; Minoux, Michel (1984). Graphs and algorithms. Wiley-Interscience Series in Discrete Mathematics (Translated by Steven Vajda from the second (Collection de la Direction des Études et Recherches d'Électricité de France [Collection of the Department of Studies and Research of Électricité de France], v. 37. Paris: Éditions Eyrolles 1985. xxviii+545 pp. MR868083) French ed.). Chichester: John Wiley & Sons, Ltd.. pp. xix+650. ISBN 0-471-10374-8. MR2552933. (Fourth ed. Collection EDF R&D. Paris: Editions Tec & Doc 2009. xxxii+784 pp.. • Eugene Lawler (2001). Combinatorial Optimization: Networks and Matroids. Dover. ISBN 0-486-41453-1. • Lawler, E. L.; Lenstra, J. K.; Rinnooy Kan, A. H. G.; Shmoys, D. B. (1985), The traveling salesman problem: A guided tour of combinatorial optimization, John Wiley & Sons, ISBN 0-471-90413-9. • Jon Lee; A First Course in Combinatorial Optimization (http://books.google.com/ books?id=3pL1B7WVYnAC&printsec=frontcover&source=gbs_ge_summary_r&cad=0#v=onepage&q& f=false); Cambridge University Press; 2004; ISBN 0-521-01012-8. • Christos H. Papadimitriou and Kenneth Steiglitz Combinatorial Optimization : Algorithms and Complexity; Dover Pubns; (paperback, Unabridged edition, July 1998) ISBN 0-486-40258-4. Journals • Computational Optimization and Applications (http://www.springer.com/mathematics/journal/10589) • Journal of Computational Optimization in Economics and Finance (https://www.novapublishers.com/catalog/ product_info.php?products_id=6353) • Journal of Economic Dynamics and Control (http://www.journals.elsevier.com/ journal-of-economic-dynamics-and-control/) • SIAM Journal on Optimization (SIOPT) (http://www.siam.org/journals/siopt.php) and Editorial Policy (http:/ /www.siam.org/journals/siopt/policy.php) • SIAM Journal on Control and Optimization (SICON) (http://www.siam.org/journals/sicon.php) and Editorial Policy (http://www.siam.org/journals/sicon/policy.php) External links • • • • • • COIN-OR (http://www.coin-or.org/)—Computational Infrastructure for Operations Research Decision Tree for Optimization Software (http://plato.asu.edu/guide.html) Links to optimization source codes Global optimization (http://www.mat.univie.ac.at/~neum/glopt.html) Mathematical Programming Glossary (http://glossary.computing.society.informs.org/) Mathematical Programming Society (http://www.mathprog.org/) NEOS Guide (http://www-fp.mcs.anl.gov/otc/Guide/index.html) currently being replaced by the NEOS Wiki (http://wiki.mcs.anl.gov/neos) • Optimization Online (http://www.optimization-online.org) A repository for optimization e-prints • Optimization Related Links (http://www2.arnes.si/~ljc3m2/igor/links.html) • Convex Optimization I (http://see.stanford.edu/see/courseinfo. 
aspx?coll=2db7ced4-39d1-4fdb-90e8-364129597c87) EE364a: Course from Stanford University • Convex Optimization – Boyd and Vandenberghe (http://www.stanford.edu/~boyd/cvxbook) Book on Convex Optimization 17 Mathematical optimization • Simplemax Online Optimization Services (http://simplemax.net) Web applications to access nonlinear optimization services Solvers: • APOPT (http://wiki.mcs.anl.gov/NEOS/index.php/APOPT) - large-scale nonlinear programming • Free Optimization Software by Systems Optimization Laboratory, Stanford University (http://www.stanford. edu/group/SOL/software.html) • MIDACO-Solver (http://www.midaco-solver.com/) General purpose (MINLP) optimization software based on Ant colony optimization algorithms (Matlab, Excel, C/C++, Fortran) • Moocho (http://trilinos.sandia.gov/packages/moocho/) - a very flexible open-source NLP solver • TANGO Project (http://www.ime.usp.br/~egbirgin/tango/) - Trustable Algorithms for Nonlinear General Optimization - Fortran Libraries: • The NAG Library (http://www.nag.co.uk/numeric/numerical_libraries.asp) is a collection of numerical routines developed by the Numerical Algorithms Group for multiple programming languages (including C, C++, Fortran, Visual Basic, Java and C#) and packages (for example, MATLAB, Excel, R, and LabVIEW) which contains several routines for both local and global optimization. • ALGLIB (http://www.alglib.net/optimization/) Open-source optimization routines (unconstrained and bound-constrained optimization). C++, C#, Delphi, Visual Basic. • IOptLib (Investigative Optimization Library) (http://www2.arnes.si/~ljc3m2/igor/ioptlib/) - a free, open-source library for optimization algorithms (ANSI C). • OAT (Optimization Algorithm Toolkit) (http://optalgtoolkit.sourceforge.net/) - a set of standard optimization algorithms and problems in Java. • Java Parallel Optimization Package (JPOP) (http://www5.informatik.uni-erlangen.de/research/software/ java-parallel-optimization-package/) An open-source java package which allows the parallel evaluation of functions, gradients, and hessians. • OOL (Open Optimization library) (http://ool.sourceforge.net/)-optimization routines in C. • FuncLib (http://funclib.codeplex.com/) Open source non-linear optimization library in C# with support for non-linear constraints and automatic differentiation. • JOptimizer (http://www.joptimizer.com/) Open source Java library for convex optimization. 18 Nonlinear programming Nonlinear programming In mathematics, nonlinear programming (NLP) is the process of solving a system of equalities and inequalities, collectively termed constraints, over a set of unknown real variables, along with an objective function to be maximized or minimized, where some of the constraints or the objective function are nonlinear.[1] Applicability A typical nonconvex problem is that of optimising transportation costs by selection from a set of transportion methods, one or more of which exhibit economies of scale, with various connectivities and capacity constraints. An example would be petroleum product transport given a selection or combination of pipeline, rail tanker, road tanker, river barge, or coastal tankship. Owing to economic batch size the cost functions may have discontinuities in addition to smooth changes. 
Mathematical formulation of the problem The problem can be stated simply as: maximize f(x) subject to x ∈ X (for instance, to maximize some variable such as product throughput), or minimize f(x) subject to x ∈ X (to minimize a cost function), where f : Rⁿ → R is the objective function and X ⊆ Rⁿ is the feasible set defined by the constraints. Methods for solving the problem If the objective function f is linear and the constrained space is a polytope, the problem is a linear programming problem, which may be solved using well-known linear programming methods. If the objective function is concave (maximization problem) or convex (minimization problem) and the constraint set is convex, then the program is called convex and general methods from convex optimization can be used in most cases. If the objective function is a ratio of a concave and a convex function (in the maximization case) and the constraints are convex, then the problem can be transformed to a convex optimization problem using fractional programming techniques. Several methods are available for solving nonconvex problems. One approach is to use special formulations of linear programming problems. Another method involves the use of branch and bound techniques, where the program is divided into subclasses to be solved with convex (minimization problem) or linear approximations that form a lower bound on the overall cost within the subdivision. With subsequent divisions, at some point an actual solution will be obtained whose cost is equal to the best lower bound obtained for any of the approximate solutions. This solution is optimal, although possibly not unique. The algorithm may also be stopped early, with the assurance that the best possible solution is within a tolerance from the best point found; such points are called ε-optimal. Terminating to ε-optimal points is typically necessary to ensure finite termination. This is especially useful for large, difficult problems and problems with uncertain costs or values where the uncertainty can be estimated with an appropriate reliability estimation. Under differentiability and constraint qualifications, the Karush–Kuhn–Tucker (KKT) conditions provide necessary conditions for a solution to be optimal. Under convexity, these conditions are also sufficient. Examples 2-dimensional example A simple problem can be defined by the constraints x1 ≥ 0, x2 ≥ 0, x1² + x2² ≥ 1, x1² + x2² ≤ 2, with an objective function to be maximized f(x) = x1 + x2, where x = (x1, x2). Solve 2-D Problem [2]. The intersection of the line with the constrained space represents the solution. 3-dimensional example Another simple problem can be defined by the constraints x1² − x2² + x3² ≤ 2, x1² + x2² + x3² ≤ 10, with an objective function to be maximized f(x) = x1x2 + x2x3, where x = (x1, x2, x3). Solve 3-D Problem [3]. The intersection of the top surface with the constrained space in the center represents the solution. References [1] Bertsekas, Dimitri P. (1999). Nonlinear Programming (Second ed.). Cambridge, MA.: Athena Scientific. ISBN 1-886529-00-0. [2] http://apmonitor.com/online/view_pass.php?f=2d.apm [3] http://apmonitor.com/online/view_pass.php?f=3d.apm Further reading • Avriel, Mordecai (2003). Nonlinear Programming: Analysis and Methods. Dover Publishing. ISBN 0-486-43227-0. • Bazaraa, Mokhtar S. and Shetty, C. M. (1979). Nonlinear programming. Theory and algorithms. John Wiley & Sons. ISBN 0-471-78610-1. • Bertsekas, Dimitri P. (1999). Nonlinear Programming: 2nd Edition. Athena Scientific. ISBN 1-886529-00-0. • Bonnans, J. Frédéric; Gilbert, J. Charles; Lemaréchal, Claude; Sagastizábal, Claudia A. (2006).
Numerical optimization: Theoretical and practical aspects (http://www.springer.com/mathematics/applications/book/ 978-3-540-35445-1). Universitext (Second revised ed. of translation of 1997 French ed.). Berlin: Springer-Verlag. pp. xiv+490. doi:10.1007/978-3-540-35447-5. ISBN 3-540-35445-X. MR2265882. • Luenberger, David G.; Ye, Yinyu (2008). Linear and nonlinear programming. International Series in Operations Research & Management Science. 116 (Third ed.). New York: Springer. pp. xiv+546. ISBN 978-0-387-74502-2. Nonlinear programming MR2423726. • Nocedal, Jorge and Wright, Stephen J. (1999). Numerical Optimization. Springer. ISBN 0-387-98793-2. • Jan Brinkhuis and Vladimir Tikhomirov, 'Optimization: Insights and Applications', 2005, Princeton University Press External links • • • • Nonlinear programming FAQ (http://www.neos-guide.org/NEOS/index.php/Nonlinear_Programming_FAQ) Mathematical Programming Glossary (http://glossary.computing.society.informs.org/) Nonlinear Programming Survey OR/MS Today (http://www.lionhrtpub.com/orms/surveys/nlp/nlp.html) Overview of Optimization in Industry (http://apmonitor.com/wiki/index.php/Main/Background) Combinatorial optimization In applied mathematics and theoretical computer science, combinatorial optimization is a topic that consists of finding an optimal object from a finite set of objects.[1] In many such problems, exhaustive search is not feasible. It operates on the domain of those optimization problems, in which the set of feasible solutions is discrete or can be reduced to discrete, and in which the goal is to find the best solution. Some common problems involving combinatorial optimization are the traveling salesman problem ("TSP") and the minimum spanning tree problem. Combinatorial optimization is a subset of mathematical optimization that is related to operations research, algorithm theory, and computational complexity theory. It has important applications in several fields, including artificial intelligence, machine learning, mathematics, auction theory, and software engineering. Some research literature[2] considers discrete optimization to consist of integer programming together with combinatorial optimization (which in turn is composed of optimization problems dealing with graphs, matroids, and related structures) although all of these topics have closely intertwined research literature. It often involves determining the way to efficiently allocate resources used to find solutions to mathematical problems. Methods There is a large amount of literature on polynomial-time algorithms for certain special classes of discrete optimization, a considerable amount of it unified by the theory of linear programming. Some examples of combinatorial optimization problems that fall into this framework are shortest paths and shortest path trees, flows and circulations, spanning trees, matching, and matroid problems. For NP-complete discrete optimization problems, current research literature includes the following topics: • • • • polynomial-time exactly-solvable special cases of the problem at hand (e.g. see fixed-parameter tractable) algorithms that perform well on "random" instances (e.g. for TSP) approximation algorithms that run in polynomial time and find a solution that is "close" to optimal solving real-world instances that arise in practice and do not necessarily exhibit the worst-case behavior inherent in NP-complete problems (e.g. TSP instances with tens of thousands of nodes[3]). 
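As a small, concrete instance of choosing an optimal object from a finite set, the sketch below enumerates every subset of items in a toy 0/1 knapsack problem; the item weights, values, and capacity are made up purely for illustration. With 2ⁿ candidate subsets to examine, this exhaustive approach quickly becomes infeasible as n grows, which is why the polynomial-time special cases and approximation methods mentioned above matter.

```python
from itertools import combinations

# Brute-force 0/1 knapsack: choose the best feasible subset of items.
# Illustrative data only; any weights, values, and capacity could be used.
weights = [12, 7, 11, 8, 9]
values  = [24, 13, 23, 15, 16]
capacity = 26

best_value, best_subset = 0, ()
for r in range(len(weights) + 1):
    for subset in combinations(range(len(weights)), r):
        w = sum(weights[i] for i in subset)
        if w <= capacity:                       # discard infeasible subsets
            v = sum(values[i] for i in subset)
            if v > best_value:
                best_value, best_subset = v, subset

print(best_subset, best_value)  # best subset (1, 2, 3) with value 51; 2^n subsets examined
```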
Combinatorial optimization problems can be viewed as searching for the best element of some set of discrete items, therefore, in principle, any sort of search algorithm or metaheuristic can be used to solve them. However, generic search algorithms are not guaranteed to find an optimal solution, nor are they guaranteed to run quickly (in polynomial time). Since some discrete optimization problems are NP-complete, such as the traveling salesman problem, this is expected unless P=NP. 21 Combinatorial optimization 22 Specific problems • Vehicle routing problem • Traveling salesman problem • Minimum spanning tree problem • Linear programming (if the solution space is the choice of which variables to make basic) • Integer programming • Eight queens puzzle - A constraint satisfaction problem. When applying standard combinatorial optimization algorithms to this problem, one would usually treat the goal function as the number of unsatisfied constraints (e.g. number of attacks) rather than whether the whole problem is satisfied or not. • • • • Knapsack problem Cutting stock problem Assignment problem Weapon target assignment problem Lookahead An optimal traveling salesperson tour through Germany’s 15 largest cities. It is the shortest among [4] 43,589,145,600 possible tours visiting each city exactly once. In artificial intelligence, lookahead is an important component of combinatorial search which specifies, roughly, how deeply the graph representing the problem is explored. The need for a specific limit on lookahead comes from the large problem graphs in many applications, such as computer chess and computer Go. A naive breadth-first search of these graphs would quickly consume all the memory of any modern computer. By setting a specific lookahead limit, the algorithm's time can be carefully controlled; its time increases exponentially as the lookahead limit increases. More sophisticated search techniques such as alpha-beta pruning are able to eliminate entire subtrees of the search tree from consideration. When these techniques are used, lookahead is not a precisely defined quantity, but instead either the maximum depth searched or some type of average. Further reading • Schrijver, Alexander. Combinatorial Optimization: Polyhedra and Efficiency. Algorithms and Combinatorics. 24. Springer. References [1] [2] [3] [4] Schrijver, p. 1 "Discrete Optimization" (http:/ / www. elsevier. com/ locate/ disopt). Elsevier. . Retrieved 2009-06-08. Bill Cook. "Optimal TSP Tours" (http:/ / www. tsp. gatech. edu/ optimal/ index. html). . Retrieved 2009-06-08. Take one city, and take all possible orders of the other 14 cities. Then divide by two because it does not matter in which direction in time they come after each other: 14!/2 = 43,589,145,600. Combinatorial optimization External links • Alexander Schrijver. On the history of combinatorial optimization (till 1960) (http://homepages.cwi.nl/~lex/ files/histco.pdf). Lecture notes • Integer programming (http://people.brunel.ac.uk/~mastjjb/jeb/or/ip.html) notes, J E Beasley. Source code • Java Combinatorial Optimization Platform (http://sourceforge.net/projects/jcop/) open source project. Others • Alexander Schrijver; A Course in Combinatorial Optimization (http://homepages.cwi.nl/~lex/files/dict.pdf) February 1, 2006 (© A. Schrijver) • William J. Cook, William H. Cunningham, William R. Pulleyblank, Alexander Schrijver; Combinatorial Optimization; John Wiley & Sons; 1 edition (November 12, 1997); ISBN 0-471-55894-X. • Eugene Lawler (2001). 
Combinatorial Optimization: Networks and Matroids. Dover. ISBN 0486414531. • Jon Lee; A First Course in Combinatorial Optimization (http://books.google.com/ books?id=3pL1B7WVYnAC&printsec=frontcover&source=gbs_ge_summary_r&cad=0#v=onepage&q& f=false); Cambridge University Press; 2004; ISBN 0-521-01012-8. • Pierluigi Crescenzi, Viggo Kann, Magnús Halldórsson, Marek Karpinski, Gerhard Woeginger, A Compendium of NP Optimization Problems (http://www.nada.kth.se/~viggo/wwwcompendium/). • Christos H. Papadimitriou and Kenneth Steiglitz Combinatorial Optimization : Algorithms and Complexity; Dover Pubns; (paperback, Unabridged edition, July 1998) ISBN 0-486-40258-4. • Arnab Das and Bikas K Chakrabarti (Eds.) Quantum Annealing and Related Optimization Methods, Lecture Note in Physics, Vol. 679, Springer, Heidelberg (2005) • Journal of Combinatorial Optimization (http://www.kluweronline.com/issn/1382-6905) • Arnab Das and Bikas K Chakrabarti, Rev. Mod. Phys. 80 1061 (2008) 23 Travelling salesman problem Travelling salesman problem The travelling salesman problem (TSP) is an NP-hard problem in combinatorial optimization studied in operations research and theoretical computer science. Given a list of cities and their pairwise distances, the task is to find the shortest possible route that visits each city exactly once and returns to the origin city. It is a special case of the travelling purchaser problem. The problem was first formulated as a mathematical problem in 1930 and is one of the most intensively studied problems in optimization. It is used as a benchmark for many optimization methods. Even though the problem is computationally difficult, a large number of heuristics and exact methods are known, so that some instances with tens of thousands of cities can be solved. The TSP has several applications even in its purest formulation, such as planning, logistics, and the manufacture of microchips. Slightly modified, it appears as a sub-problem in many areas, such as DNA sequencing. In these applications, the concept city represents, for example, customers, soldering points, or DNA fragments, and the concept distance represents travelling times or cost, or a similarity measure between DNA fragments. In many applications, additional constraints such as limited resources or time windows make the problem considerably harder. In the theory of computational complexity, the decision version of the TSP (where, given a length L, the task is to decide whether any tour is shorter than L) belongs to the class of NP-complete problems. Thus, it is likely that the worst-case running time for any algorithm for the TSP increases exponentially with the number of cities. History The origins of the travelling salesman problem are unclear. A handbook for travelling salesmen from 1832 mentions the problem and includes example tours through Germany and Switzerland, but contains no mathematical treatment.[1] The travelling salesman problem was defined in the 1800s by the Irish mathematician W. R. Hamilton and by the British mathematician Thomas Kirkman. 
Hamilton’s Icosian Game was a recreational puzzle based on finding a Hamiltonian cycle.[2] The general form of the TSP appears to have been first studied by mathematicians during the 1930s in Vienna and at Harvard, notably by Karl Menger, who defines the problem, considers the obvious brute-force algorithm, and observes the non-optimality of the nearest neighbour heuristic: We denote by messenger problem (since in practice this question should be solved by each postman, anyway also by many travelers) the task to find, for finitely many points whose pairwise distances are known, the shortest route connecting the points. Of course, this problem is solvable by finitely many trials. Rules which would push the number of trials below the William Rowan Hamilton number of permutations of the given points, are not known. The rule that one first should go from the starting point to the closest point, then to the point closest to this, etc., in general does not yield the shortest route.[3] Hassler Whitney at Princeton University introduced the name travelling salesman problem soon after.[4] In the 1950s and 1960s, the problem became increasingly popular in scientific circles in Europe and the USA. Notable contributions were made by George Dantzig, Delbert Ray Fulkerson and Selmer M. Johnson at the RAND 24 Travelling salesman problem 25 Corporation in Santa Monica, who expressed the problem as an integer linear program and developed the cutting plane method for its solution. With these new methods they solved an instance with 49 cities to optimality by constructing a tour and proving that no other tour could be shorter. In the following decades, the problem was studied by many researchers from mathematics, computer science, chemistry, physics, and other sciences. Richard M. Karp showed in 1972 that the Hamiltonian cycle problem was NP-complete, which implies the NP-hardness of TSP. This supplied a mathematical explanation for the apparent computational difficulty of finding optimal tours. Great progress was made in the late 1970s and 1980, when Grötschel, Padberg, Rinaldi and others managed to exactly solve instances with up to 2392 cities, using cutting planes and branch-and-bound. In the 1990s, Applegate, Bixby, Chvátal, and Cook developed the program Concorde that has been used in many recent record solutions. Gerhard Reinelt published the TSPLIB in 1991, a collection of benchmark instances of varying difficulty, which has been used by many research groups for comparing results. In 2005, Cook and others computed an optimal tour through a 33,810-city instance given by a microchip layout problem, currently the largest solved TSPLIB instance. For many other instances with millions of cities, solutions can be found that are guaranteed to be within 1% of an optimal tour. Description As a graph problem TSP can be modelled as an undirected weighted graph, such that cities are the graph's vertices, paths are the graph's edges, and a path's distance is the edge's length. It is a minimization problem starting and finishing at a specified vertex after having visited each other vertex exactly once. Often, the model is a complete graph (i.e. each pair of vertices is connected by an edge). If no path exists between two cities, adding an arbitrarily long edge will complete the graph without affecting the optimal tour. Asymmetric and symmetric Symmetric TSP with four cities In the symmetric TSP, the distance between two cities is the same in each opposite direction, forming an undirected graph. 
This symmetry halves the number of possible solutions. In the asymmetric TSP, paths may not exist in both directions or the distances might be different, forming a directed graph. Traffic collisions, one-way streets, and airfares for cities with different departure and arrival fees are examples of how this symmetry could break down. Related problems • An equivalent formulation in terms of graph theory is: Given a complete weighted graph (where the vertices would represent the cities, the edges would represent the roads, and the weights would be the cost or distance of that road), find a Hamiltonian cycle with the least weight. • The requirement of returning to the starting city does not change the computational complexity of the problem, see Hamiltonian path problem. • Another related problem is the bottleneck travelling salesman problem (bottleneck TSP): Find a Hamiltonian cycle in a weighted graph with the minimal weight of the weightiest edge. The problem is of considerable practical importance, apart from evident transportation and logistics areas. A classic example is in printed circuit manufacturing: scheduling of a route of the drill machine to drill holes in a PCB. In robotic machining or drilling Travelling salesman problem applications, the "cities" are parts to machine or holes (of different sizes) to drill, and the "cost of travel" includes time for retooling the robot (single machine job sequencing problem). • The generalized travelling salesman problem deals with "states" that have (one or more) "cities" and the salesman has to visit exactly one "city" from each "state". Also known as the "travelling politician problem". One application is encountered in ordering a solution to the cutting stock problem in order to minimise knife changes. Another is concerned with drilling in semiconductor manufacturing, see e.g. U.S. Patent 7054798 [5]. Surprisingly, Behzad and Modarres[6] demonstrated that the generalised travelling salesman problem can be transformed into a standard travelling salesman problem with the same number of cities, but a modified distance matrix. • The sequential ordering problem deals with the problem of visiting a set of cities where precedence relations between the cities exist. • The travelling purchaser problem deals with a purchaser who is charged with purchasing a set of products. He can purchase these products in several cities, but at different prices and not all cities offer the same products. The objective is to find a route between a subset of the cities, which minimizes total cost (travel cost + purchasing cost) and which enables the purchase of all required products. Computing a solution The traditional lines of attack for the NP-hard problems are the following: • Devising algorithms for finding exact solutions (they will work reasonably fast only for small problem sizes). • Devising "suboptimal" or heuristic algorithms, i.e., algorithms that deliver either seemingly or probably good solutions, but which could not be proved to be optimal. • Finding special cases for the problem ("subproblems") for which either better or exact heuristics are possible. Computational complexity The problem has been shown to be NP-hard (more precisely, it is complete for the complexity class FPNP; see function problem), and the decision problem version ("given the costs and a number x, decide whether there is a round-trip route cheaper than x") is NP-complete. The bottleneck travelling salesman problem is also NP-hard. 
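The first of the lines of attack listed above, devising algorithms for finding exact solutions, can be illustrated with a brute-force search over all tours. The sketch below uses a made-up four-city distance matrix, fixes the starting city, and tries every permutation of the remaining cities; the (n − 1)! running time is precisely why such methods only work for small instances.

```python
from itertools import permutations

def exact_tsp(dist):
    """Brute-force exact TSP: try every tour that starts and ends at city 0.

    dist is a full distance matrix; the number of tours grows as (n - 1)!,
    so this is only usable for very small instances, as noted above.
    """
    n = len(dist)
    best_len, best_tour = float("inf"), None
    for perm in permutations(range(1, n)):
        tour = (0,) + perm + (0,)
        length = sum(dist[tour[i]][tour[i + 1]] for i in range(n))
        if length < best_len:
            best_len, best_tour = length, tour
    return best_tour, best_len

# Illustrative symmetric 4-city instance.
d = [[0, 2, 9, 10],
     [2, 0, 6, 4],
     [9, 6, 0, 3],
     [10, 4, 3, 0]]
print(exact_tsp(d))  # ((0, 1, 3, 2, 0), 18)
```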
The problem remains NP-hard even for the case when the cities are in the plane with Euclidean distances, as well as in a number of other restrictive cases. Removing the condition of visiting each city "only once" does not remove the NP-hardness, since it is easily seen that in the planar case there is an optimal tour that visits each city only once (otherwise, by the triangle inequality, a shortcut that skips a repeated visit would not increase the tour length). Complexity of approximation In the general case, finding a shortest travelling salesman tour is NPO-complete.[7] If the distance measure is a metric and symmetric, the problem becomes APX-complete[8] and Christofides’s algorithm approximates it within 1.5.[9] If the distances are restricted to 1 and 2 (but still are a metric) the approximation ratio becomes 7/6. In the asymmetric, metric case, only logarithmic performance guarantees are known, the best current algorithm achieves performance ratio 0.814 log n;[10] it is an open question if a constant factor approximation exists. The corresponding maximization problem of finding the longest travelling salesman tour is approximable within 63/38.[11] If the distance function is symmetric, the longest tour can be approximated within 4/3 by a deterministic algorithm[12] and within by a randomised algorithm.[13] 26 Travelling salesman problem Exact algorithms The most direct solution would be to try all permutations (ordered combinations) and see which one is cheapest (using brute force search). The running time for this approach lies within a polynomial factor of O(n!), the factorial of the number of cities, so this solution becomes impractical even for only 20 cities. One of the earliest applications of dynamic programming is the Held–Karp algorithm that solves the problem in time O(n22n).[14] The dynamic programming solution requires exponential space. Using inclusion–exclusion, the problem can be solved in time within a polynomial factor of and polynomial space.[15] Improving these time bounds seems to be difficult. For example, it has not been determined whether an exact algorithm for TSP that runs in time exists.[16] Other approaches include: • Various branch-and-bound algorithms, which can be used to process TSPs containing 40–60 cities. • Progressive improvement algorithms which use techniques reminiscent of linear programming. Works well for up to 200 cities. • Implementations of branch-and-bound and problem-specific cut generation; this is the method of choice for solving large instances. This approach holds the current record, solving an instance with 85,900 cities, see Applegate et al. (2006). An exact solution for 15,112 German towns from TSPLIB was found in 2001 using the cutting-plane method proposed by George Dantzig, Ray Fulkerson, and Selmer M. Johnson in 1954, based on linear programming. The computations were performed on a network of 110 processors located at Rice University and Princeton University (see the Princeton external link). The total computation time was equivalent to 22.6 years on a single 500 MHz Alpha processor. In May 2004, the travelling salesman problem of visiting all 24,978 towns in Sweden was solved: a tour of length approximately 72,500 kilometers was found and it was proven that no shorter tour exists.[17] In March 2005, the travelling salesman problem of visiting all 33,810 points in a circuit board was solved using Concorde TSP Solver: a tour of length 66,048,945 units was found and it was proven that no shorter tour exists. 
The computation took approximately 15.7 CPU-years (Cook et al. 2006). In April 2006 an instance with 85,900 points was solved using Concorde TSP Solver, taking over 136 CPU-years, see Applegate et al. (2006). Heuristic and approximation algorithms Various heuristics and approximation algorithms, which quickly yield good solutions have been devised. Modern methods can find solutions for extremely large problems (millions of cities) within a reasonable time which are with a high probability just 2–3% away from the optimal solution. Several categories of heuristics are recognized. Constructive heuristics The nearest neighbour (NN) algorithm (or so-called greedy algorithm) lets the salesman choose the nearest unvisited city as his next move. This algorithm quickly yields an effectively short route. For N cities randomly distributed on a plane, the algorithm on average yields a path 25% longer than the shortest possible path.[18] However, there exist many specially arranged city distributions which make the NN algorithm give the worst route (Gutin, Yeo, and Zverovich, 2002). This is true for both asymmetric and symmetric TSPs (Gutin and Yeo, 2007). Rosenkrantz et al. [1977] showed that the NN algorithm has the approximation factor for instances satisfying the triangle inequality. Constructions based on a minimum spanning tree have an approximation ratio of 2. The Christofides algorithm achieves a ratio of 1.5. The bitonic tour of a set of points is the minimum-perimeter monotone polygon that has the points as its vertices; it can be computed efficiently by dynamic programming. 27 Travelling salesman problem Another constructive heuristic, Match Twice and Stitch (MTS) (Kahng, Reda 2004 [19]), performs two sequential matchings, where the second matching is executed after deleting all the edges of the first matching, to yield a set of cycles. The cycles are then stitched to produce the final tour. Iterative improvement Pairwise exchange, or Lin–Kernighan heuristics The pairwise exchange or 2-opt technique involves iteratively removing two edges and replacing these with two different edges that reconnect the fragments created by edge removal into a new and shorter tour. This is a special case of the k-opt method. Note that the label Lin–Kernighan is an often heard misnomer for 2-opt. Lin–Kernighan is actually a more general method. k-opt heuristic Take a given tour and delete k mutually disjoint edges. Reassemble the remaining fragments into a tour, leaving no disjoint subtours (that is, don't connect a fragment's endpoints together). This in effect simplifies the TSP under consideration into a much simpler problem. Each fragment endpoint can be connected to 2k − 2 other possibilities: of 2k total fragment endpoints available, the two endpoints of the fragment under consideration are disallowed. Such a constrained 2k-city TSP can then be solved with brute force methods to find the least-cost recombination of the original fragments. The k-opt technique is a special case of the V-opt or variable-opt technique. The most popular of the k-opt methods are 3-opt, and these were introduced by Shen Lin of Bell Labs in 1965. There is a special case of 3-opt where the edges are not disjoint (two of the edges are adjacent to one another). In practice, it is often possible to achieve substantial improvement over 2-opt without the combinatorial cost of the general 3-opt by restricting the 3-changes to this special subset where two of the removed edges are adjacent. 
This so-called two-and-a-half-opt typically falls roughly midway between 2-opt and 3-opt, both in terms of the quality of tours achieved and the time required to achieve those tours. V-opt heuristic The variable-opt method is related to, and a generalization of the k-opt method. Whereas the k-opt methods remove a fixed number (k) of edges from the original tour, the variable-opt methods do not fix the size of the edge set to remove. Instead they grow the set as the search process continues. The best known method in this family is the Lin–Kernighan method (mentioned above as a misnomer for 2-opt). Shen Lin and Brian Kernighan first published their method in 1972, and it was the most reliable heuristic for solving travelling salesman problems for nearly two decades. More advanced variable-opt methods were developed at Bell Labs in the late 1980s by David Johnson and his research team. These methods (sometimes called Lin–Kernighan–Johnson) build on the Lin–Kernighan method, adding ideas from tabu search and evolutionary computing. The basic Lin–Kernighan technique gives results that are guaranteed to be at least 3-opt. The Lin–Kernighan–Johnson methods compute a Lin–Kernighan tour, and then perturb the tour by what has been described as a mutation that removes at least four edges and reconnecting the tour in a different way, then v-opting the new tour. The mutation is often enough to move the tour from the local minimum identified by Lin–Kernighan. V-opt methods are widely considered the most powerful heuristics for the problem, and are able to address special cases, such as the Hamilton Cycle Problem and other non-metric TSPs that other heuristics fail on. For many years Lin–Kernighan–Johnson had identified optimal solutions for all TSPs where an optimal solution was known and had identified the best known solutions for all other TSPs on which the method had been tried. 28 Travelling salesman problem Randomised improvement Optimized Markov chain algorithms which use local searching heuristic sub-algorithms can find a route extremely close to the optimal route for 700 to 800 cities. TSP is a touchstone for many general heuristics devised for combinatorial optimization such as genetic algorithms, simulated annealing, Tabu search, ant colony optimization, river formation dynamics (see swarm intelligence) and the cross entropy method. Ant colony optimization Artificial intelligence researcher Marco Dorigo described in 1997 a method of heuristically generating "good solutions" to the TSP using a simulation of an ant colony called ACS (Ant Colony System).[20] It models behavior observed in real ants to find short paths between food sources and their nest, an emergent behaviour resulting from each ant's preference to follow trail pheromones deposited by other ants. ACS sends out a large number of virtual ant agents to explore many possible routes on the map. Each ant probabilistically chooses the next city to visit based on a heuristic combining the distance to the city and the amount of virtual pheromone deposited on the edge to the city. The ants explore, depositing pheromone on each edge that they cross, until they have all completed a tour. At this point the ant which completed the shortest tour deposits virtual pheromone along its complete tour route (global trail updating). The amount of pheromone deposited is inversely proportional to the tour length: the shorter the tour, the more it deposits. 
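A compact sketch of this ant-colony scheme is given below. It is a simplified Ant System-style illustration rather than Dorigo's full ACS: the parameter names (alpha, beta, rho, q) and the update rule are generic choices. Ants build tours probabilistically from pheromone levels and inverse distance, and after each iteration the best ant of that iteration reinforces its tour in inverse proportion to its length, as described above.

```python
import random

def aco_tsp(dist, n_ants=20, n_iters=200, alpha=1.0, beta=3.0, rho=0.1, q=1.0):
    """Simplified ant-colony sketch for the TSP; dist is a symmetric distance matrix."""
    n = len(dist)
    pher = [[1.0] * n for _ in range(n)]              # initial pheromone on every edge
    best_tour, best_len = None, float("inf")

    def tour_length(t):
        return sum(dist[t[i]][t[(i + 1) % n]] for i in range(n))

    for _ in range(n_iters):
        tours = []
        for _ant in range(n_ants):
            start = random.randrange(n)
            tour, unvisited = [start], set(range(n)) - {start}
            while unvisited:
                i = tour[-1]
                # Choose the next city with probability proportional to
                # pheromone^alpha * (1/distance)^beta.
                weights = [(j, (pher[i][j] ** alpha) * ((1.0 / dist[i][j]) ** beta))
                           for j in unvisited]
                total = sum(w for _, w in weights)
                r, acc = random.uniform(0, total), 0.0
                for j, w in weights:
                    acc += w
                    if acc >= r:
                        tour.append(j)
                        unvisited.remove(j)
                        break
            tours.append((tour_length(tour), tour))
        # Evaporation, then the iteration-best ant deposits pheromone on its tour,
        # in inverse proportion to the tour length (shorter tour, more pheromone).
        length, tour = min(tours)
        for i in range(n):
            for j in range(n):
                pher[i][j] *= (1.0 - rho)
        for k in range(n):
            a, b = tour[k], tour[(k + 1) % n]
            pher[a][b] += q / length
            pher[b][a] += q / length
        if length < best_len:
            best_len, best_tour = length, tour
    return best_tour, best_len
```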
Special cases Metric TSP In the metric TSP, also known as delta-TSP or Δ-TSP, the intercity distances satisfy the triangle inequality. A very natural restriction of the TSP is to require that the distances between cities form a metric, i.e., they satisfy the triangle inequality. This can be understood as the absence of "shortcuts", in the sense that the direct connection from A to B is never longer than the route via intermediate C: The edge lengths then form a metric on the set of vertices. When the cities are viewed as points in the plane, many natural distance functions are metrics, and so many natural instances of TSP satisfy this constraint. The following are some examples of metric TSPs for various metrics. • In the Euclidean TSP (see below) the distance between two cities is the Euclidean distance between the corresponding points. 29 Travelling salesman problem • In the rectilinear TSP the distance between two cities is the sum of the differences of their x- and y-coordinates. This metric is often called the Manhattan distance or city-block metric. • In the maximum metric, the distance between two points is the maximum of the absolute values of differences of their x- and y-coordinates. The last two metrics appear for example in routing a machine that drills a given set of holes in a printed circuit board. The Manhattan metric corresponds to a machine that adjusts first one co-ordinate, and then the other, so the time to move to a new point is the sum of both movements. The maximum metric corresponds to a machine that adjusts both co-ordinates simultaneously, so the time to move to a new point is the slower of the two movements. In its definition, the TSP does not allow cities to be visited twice, but many applications do not need this constraint. In such cases, a symmetric, non-metric instance can be reduced to a metric one. This replaces the original graph with a complete graph in which the inter-city distance is replaced by the shortest path between and in the original graph. There is a constant-factor approximation algorithm for the metric TSP due to Christofides[21] that always finds a tour of length at most 1.5 times the shortest tour. In the next paragraphs, we explain a weaker (but simpler) algorithm which finds a tour of length at most twice the shortest tour. The length of the minimum spanning tree of the network is a natural lower bound for the length of the optimal route. In the TSP with triangle inequality case it is possible to prove upper bounds in terms of the minimum spanning tree and design an algorithm that has a provable upper bound on the length of the route. The first published (and the simplest) example follows: 1. Construct the minimum spanning tree. 2. Duplicate all its edges. That is, wherever there is an edge from u to v, add a second edge from u to v. This gives us an Eulerian graph. 3. Find a Eulerian cycle in it. Clearly, its length is twice the length of the tree. 4. Convert the Eulerian cycle into the Hamiltonian one in the following way: walk along the Eulerian cycle, and each time you are about to come into an already visited vertex, skip it and try to go to the next one (along the Eulerian cycle). It is easy to prove that the last step works. Moreover, thanks to the triangle inequality, each skipping at Step 4 is in fact a shortcut; i.e., the length of the cycle does not increase. Hence it gives us a TSP tour no more than twice as long as the optimal one. 
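A minimal sketch of this tree-doubling 2-approximation follows. It assumes a full distance matrix satisfying the triangle inequality, builds the minimum spanning tree with Prim's algorithm, and then walks the tree in preorder; the preorder walk produces the same shortcut tour as Steps 2 to 4 above (doubling the edges, taking an Euler cycle, and skipping already visited vertices).

```python
def double_tree_tsp(dist):
    """2-approximation for metric TSP, following the outline above."""
    n = len(dist)
    # --- Prim's algorithm for the minimum spanning tree rooted at city 0 ---
    in_tree = [False] * n
    parent = [None] * n
    cost = [float("inf")] * n
    cost[0] = 0.0
    children = [[] for _ in range(n)]
    for _ in range(n):
        u = min((v for v in range(n) if not in_tree[v]), key=lambda v: cost[v])
        in_tree[u] = True
        if parent[u] is not None:
            children[parent[u]].append(u)
        for v in range(n):
            if not in_tree[v] and dist[u][v] < cost[v]:
                cost[v], parent[v] = dist[u][v], u
    # --- Preorder walk of the tree (the "shortcut" Euler tour), then close the cycle ---
    tour, stack = [], [0]
    while stack:
        u = stack.pop()
        tour.append(u)
        stack.extend(reversed(children[u]))
    tour.append(0)
    length = sum(dist[tour[i]][tour[i + 1]] for i in range(len(tour) - 1))
    return tour, length
```

By the triangle inequality the returned tour is at most twice the length of the minimum spanning tree, and hence at most twice the optimal tour, exactly as argued above.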
The Christofides algorithm follows a similar outline but combines the minimum spanning tree with a solution of another problem, minimum-weight perfect matching. This gives a TSP tour which is at most 1.5 times the optimal. The Christofides algorithm was one of the first approximation algorithms, and was in part responsible for drawing attention to approximation algorithms as a practical approach to intractable problems. As a matter of fact, the term "algorithm" was not commonly extended to approximation algorithms until later; the Christofides algorithm was initially referred to as the Christofides heuristic. In the special case that distances between cities are all either one or two (and thus the triangle inequality is necessarily satisfied), there is a polynomial-time approximation algorithm that finds a tour of length at most 8/7 times the optimal tour length.[22] However, it is a long-standing (since 1975) open problem to improve the Christofides approximation factor of 1.5 for general metric TSP to a smaller constant. It is known that, unless P = NP, there is no polynomial-time algorithm that finds a tour of length at most 220/219=1.00456… times the optimal tour's length.[23] In the case of bounded metrics it is known that there is no polynomial time algorithm that constructs a tour of length at most 321/320 times the optimal tour's length, unless P = NP.[24] 30 Travelling salesman problem 31 Euclidean TSP The Euclidean TSP, or planar TSP, is the TSP with the distance being the ordinary Euclidean distance. The Euclidean TSP is a particular case of the metric TSP, since distances in a plane obey the triangle inequality. Like the general TSP, the Euclidean TSP (and therefore the general metric TSP) is NP-complete.[25] However, in some respects it seems to be easier than the general metric TSP. For example, the minimum spanning tree of the graph associated with an instance of the Euclidean TSP is a Euclidean minimum spanning tree, and so can be computed in expected O(n log n) time for n points (considerably less than the number of edges). This enables the simple 2-approximation algorithm for TSP with triangle inequality above to operate more quickly. In general, for any c > 0, where d is the number of dimensions in the Euclidean space, there is a polynomial-time algorithm that finds a tour of length at most (1 + 1/c) times the optimal for geometric instances of TSP in time; this is called a polynomial-time approximation scheme (PTAS).[26] Sanjeev Arora and Joseph S. B. Mitchell were awarded the Gödel Prize in 2010 for their concurrent discovery of a PTAS for the Euclidean TSP. In practice, heuristics with weaker guarantees continue to be used. Asymmetric TSP In most cases, the distance between two nodes in the TSP network is the same in both directions. The case where the distance from A to B is not equal to the distance from B to A is called asymmetric TSP. A practical application of an asymmetric TSP is route optimisation using street-level routing (which is made asymmetric by one-way streets, slip-roads, motorways, etc.). Solving by conversion to symmetric TSP Solving an asymmetric TSP graph can be somewhat complex. The following is a 3×3 matrix containing all possible path weights between the nodes A, B and C. One option is to turn an asymmetric matrix of size N into a symmetric matrix of size 2N.[27] A B C A 1 2 B 6 3 C 5 4 |+ Asymmetric path weights To double the size, each of the nodes in the graph is duplicated, creating a second ghost node. 
Using duplicate points with very low weights, such as −∞, provides a cheap route "linking" back to the real node and allowing symmetric evaluation to continue. The original 3×3 matrix shown above is visible in the bottom left and the inverse of the original in the top-right. Both copies of the matrix have had their diagonals replaced by the low-cost hop paths, represented by −∞. Travelling salesman problem 32 A A′ B′ C′ A −∞ 6 5 B 1 −∞ 4 C 2 3 −∞ A′ −∞ B C 1 2 B′ 6 −∞ 3 C′ 5 4 −∞ |+ Symmetric path weights The original 3×3 matrix would produce two Hamiltonian cycles (a path that visits every node once), namely A-B-C-A [score 9] and A-C-B-A [score 12]. Evaluating the 6×6 symmetric version of the same problem now produces many paths, including A-A′-B-B′-C-C′-A, A-B′-C-A′-A, A-A′-B-C′-A [all score 9 – ∞]. The important thing about each new sequence is that there will be an alternation between dashed (A′,B′,C′) and un-dashed nodes (A, B, C) and that the link to "jump" between any related pair (A-A′) is effectively free. A version of the algorithm could use any weight for the A-A′ path, as long as that weight is lower than all other path weights present in the graph. As the path weight to "jump" must effectively be "free", the value zero (0) could be used to represent this cost—if zero is not being used for another purpose already (such as designating invalid paths). In the two examples above, non-existent paths between nodes are shown as a blank square. Benchmarks For benchmarking of TSP algorithms, TSPLIB [28] is a library of sample instances of the TSP and related problems is maintained, see the TSPLIB external reference. Many of them are lists of actual cities and layouts of actual printed circuits. Human performance on TSP The TSP, in particular the Euclidean variant of the problem, has attracted the attention of researchers in cognitive psychology. It is observed that humans are able to produce good quality solutions quickly. The first issue of the Journal of Problem Solving [29] is devoted to the topic of human performance on TSP. TSP path length for random pointset in a square Suppose N points are randomly distributed in a 1 x 1 square with N>>1. Consider many such squares. Suppose we want to know the average of the shortest path length (i.e. TSP solution) of each square. Lower bound is a lower bound obtained by assuming i be a point in the tour sequence and i has its nearest neighbor as its next in the path. is a better lower bound obtained by assuming is next is is nearest, and is previous is is second nearest. is an even better lower bound obtained by dividing the path sequence into two parts as before_i and after_i with each part containing N/2 points, and then deleting the before_i part to form a diluted pointset (see discussion). Travelling salesman problem 33 • David S. Johnson[30] obtained a lower bound by computer experiment: , where 0.522 comes from the points near square boundary which have fewer neighbors. • Christine L. Valenzuela and Antonia J. Jones [31] obtained another lower bound by computer experiment: Upper bound By applying Simulated Annealing method on samples of N=40000, computer analysis shows an upper bound of , where 0.72 comes from the boundary effect. Because the actual solution is only the shortest path, for the purposes of programmatic search another upper bound is the length of any previously discovered approximation. 
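A short sketch of the ghost-node transformation from the asymmetric-TSP section above is given below, using the three-node example weights (A→B = 1, A→C = 2, B→A = 6, B→C = 3, C→A = 5, C→B = 4). As noted there, the "jump" weight between a node and its ghost only needs to be lower than every other weight in the graph; the sketch uses zero, and infinity stands in for the forbidden real-to-real and ghost-to-ghost connections.

```python
import math

def asymmetric_to_symmetric(d, link_weight=0.0):
    """Ghost-node transformation sketched above.

    d is an n-by-n asymmetric weight matrix (d[i][j] = cost of going from node i
    to node j).  Returns a 2n-by-2n symmetric matrix in which node i is paired
    with a ghost node i + n; the pair is linked by link_weight (zero here), and
    all other real-real / ghost-ghost connections are forbidden (infinite).
    """
    n = len(d)
    s = [[math.inf] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        s[i][i + n] = s[i + n][i] = link_weight      # cheap "jump" between i and its ghost
        for j in range(n):
            if i != j:
                s[i][j + n] = s[j + n][i] = d[i][j]  # ghost of j carries the cost i -> j
    return s

# The 3-node example from the text: d[A][B] = 1, d[A][C] = 2, d[B][A] = 6, ...
d = [[0, 1, 2],
     [6, 0, 3],
     [5, 4, 0]]
for row in asymmetric_to_symmetric(d):
    print(row)
```

A symmetric tour on the 2N-node matrix necessarily alternates between real and ghost nodes, and reading off only the real nodes recovers a directed tour of the original asymmetric instance.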
Analyst's travelling salesman problem There is an analogous problem in geometric measure theory which asks the following: under what conditions may a subset E of Euclidean space be contained in a rectifiable curve (that is, when is there a continuous curve that visits every point in E)? This problem is known as the analyst's travelling salesman problem or the geometric travelling salesman problem. Notes [1] "Der Handlungsreisende – wie er sein soll und was er zu thun [sic] hat, um Aufträge zu erhalten und eines glücklichen Erfolgs in seinen Geschäften gewiß zu sein – von einem alten Commis-Voyageur" (The traveling salesman — how he must be and what he should do in order to be sure to perform his tasks and have success in his business — by a high commis-voyageur) [2] A discussion of the early work of Hamilton and Kirkman can be found in Graph Theory 1736–1936 [3] Cited and English translation in Schrijver (2005). Original German: "Wir bezeichnen als Botenproblem (weil diese Frage in der Praxis von jedem Postboten, übrigens auch von vielen Reisenden zu lösen ist) die Aufgabe, für endlich viele Punkte, deren paarweise Abstände bekannt sind, den kürzesten die Punkte verbindenden Weg zu finden. Dieses Problem ist natürlich stets durch endlich viele Versuche lösbar. Regeln, welche die Anzahl der Versuche unter die Anzahl der Permutationen der gegebenen Punkte herunterdrücken würden, sind nicht bekannt. Die Regel, man solle vom Ausgangspunkt erst zum nächstgelegenen Punkt, dann zu dem diesem nächstgelegenen Punkt gehen usw., liefert im allgemeinen nicht den kürzesten Weg." [4] A detailed treatment of the connection between Menger and Whitney as well as the growth in the study of TSP can be found in Alexander Schrijver's 2005 paper "On the history of combinatorial optimization (till 1960). Handbook of Discrete Optimization (K. Aardal, G.L. Nemhauser, R. Weismantel, eds.), Elsevier, Amsterdam, 2005, pp. 1–68. PS (http:/ / homepages. cwi. nl/ ~lex/ files/ histco. ps), PDF (http:/ / homepages. cwi. nl/ ~lex/ files/ histco. pdf) [5] http:/ / www. google. com/ patents?vid=7054798 [6] Behzad, Arash; Modarres, Mohammad (2002), "New Efficient Transformation of the Generalized Traveling Salesman Problem into Traveling Salesman Problem", Proceedings of the 15th International Conference of Systems Engineering (Las Vegas) [7] Orponen (1987) [8] Papadimitriou (1983) [9] Christofides (1976) [10] Kaplan (2004) [11] Kosaraju (1994) [12] Serdyukov (1984) [13] Hassin (2000) [14] Bellman (1960), Bellman (1962), Held & Karp (1962) [15] Kohn (1977) Karp (1982) [16] Woeginger (2003) [17] Work by David Applegate, AT&T Labs – Research, Robert Bixby, ILOG and Rice University, Vašek Chvátal, Concordia University, William Cook, Georgia Tech, and Keld Helsgaun, Roskilde University is discussed on their project web page hosted by Georgia Tech and last updated in June 2004, here (http:/ / www. tsp. gatech. edu/ sweden/ ) [18] Johnson, D.S. and McGeoch, L.A.. "The traveling salesman problem: A case study in local optimization", Local search in combinatorial optimization, 1997, 215-310 Travelling salesman problem [19] A. B. Kahng and S. Reda, "Match Twice and Stitch: A New TSP Tour Construction Heuristic," Operations Research Letters, 2004, 32(6). pp. 499–509. http:/ / dx. doi. org/ 10. 1016/ j. orl. 2004. 04. 001 [20] Marco Dorigo. Ant Colonies for the Traveling Salesman Problem. IRIDIA, Université Libre de Bruxelles. IEEE Transactions on Evolutionary Computation, 1(1):53–66. 1997. http:/ / citeseer. ist. psu. edu/ 86357. 
html [21] N. Christofides, Worst-case analysis of a new heuristic for the traveling salesman problem, Report 388, Graduate School of Industrial Administration, Carnegie Mellon University, 1976. [22] P. Berman (2006). M. Karpinski, "8/7-Approximation Algorithm for (1,2)-TSP", Proc. 17th ACM-SIAM SODA (2006), pp. 641–648, ECCC TR05-069. [23] C.H. Papadimitriou and Santosh Vempala. On the approximability of the traveling salesman problem (http:/ / dx. doi. org/ 10. 1007/ s00493-006-0008-z), Combinatorica 26(1):101–120, 2006. [24] L. Engebretsen, M. Karpinski, TSP with bounded metrics (http:/ / dx. doi. org/ 10. 1016/ j. jcss. 2005. 12. 001). Journal of Computer and System Sciences, 72(4):509‒546, 2006. [25] Christos H. Papadimitriou. "The Euclidean travelling salesman problem is NP-complete". Theoretical Computer Science 4:237–244, 1977. doi:10.1016/0304-3975(77)90012-3 [26] Sanjeev Arora. Polynomial Time Approximation Schemes for Euclidean Traveling Salesman and other Geometric Problems. Journal of the ACM, Vol.45, Issue 5, pp.753–782. ISSN:0004-5411. September 1998. http:/ / citeseer. ist. psu. edu/ arora96polynomial. html. [27] Roy Jonker, Ton Volgenant, Transforming asymmetric into symmetric traveling salesman problems (http:/ / www. sciencedirect. com/ science/ article/ pii/ 0167637783900482), Operations Research Letters, Volume 2, Issue 4, November 1983, Pages 161-163, ISSN 0167-6377, doi:10.1016/0167-6377(83)90048-2. [28] http:/ / comopt. ifi. uni-heidelberg. de/ software/ TSPLIB95/ [29] http:/ / docs. lib. purdue. edu/ jps/ [30] David S. Johnson (http:/ / www. research. att. com/ ~dsj/ papers/ HKsoda. pdf) [31] Christine L. Valenzuela and Antonia J. Jones (http:/ / users. cs. cf. ac. uk/ Antonia. J. Jones/ Papers/ EJORHeldKarp/ HeldKarp. pdf) References • Applegate, D. L.; Bixby, R. M.; Chvátal, V.; Cook, W. J. (2006), The Traveling Salesman Problem, ISBN 0691129932. • Bellman, R. (1960), "Combinatorial Processes and Dynamic Programming", in Bellman, R., Hall, M., Jr. (eds.), Combinatorial Analysis, Proceedings of Symposia in Applied Mathematics 10,, American Mathematical Society, pp. 217–249. • Bellman, R. (1962), "Dynamic Programming Treatment of the Travelling Salesman Problem", J. Assoc. Comput. Mach. 9: 61–63, doi:10.1145/321105.321111. • Christofides, N. (1976), Worst-case analysis of a new heuristic for the travelling salesman problem, Technical Report 388, Graduate School of Industrial Administration, Carnegie-Mellon University, Pittsburgh. • Hassin, R.; Rubinstein, S. (2000), "Better approximations for max TSP", Information Processing Letters 75 (4): 181–186, doi:10.1016/S0020-0190(00)00097-1. • Held, M.; Karp, R. M. (1962), "A Dynamic Programming Approach to Sequencing Problems", Journal of the Society for Industrial and Applied Mathematics 10 (1): 196–210, doi:10.1137/0110015. • Kaplan, H.; Lewenstein, L.; Shafrir, N.; Sviridenko, M. (2004), "Approximation Algorithms for Asymmetric TSP by Decomposing Directed Regular Multigraphs", In Proc. 44th IEEE Symp. on Foundations of Comput. Sci, pp. 56–65. • Karp, R.M. (1982), "Dynamic programming meets the principle of inclusion and exclusion", Oper. Res. Lett. 1 (2): 49–51, doi:10.1016/0167-6377(82)90044-X. • Kohn, S.; Gottlieb, A.; Kohn, M. (1977), "A Generating Function Approach to the Traveling Salesman Problem", ACM Annual Conference, ACM Press, pp. 294–300. • Kosaraju, S. R.; Park, J. K.; Stein, C. (1994), "Long tours and short superstrings'", Proc. 35th Ann. IEEE Symp. on Foundations of Comput. Sci, IEEE Computer Society, pp. 
166–177. • Orponen, P.; Mannila, H. (1987), "On approximation preserving reductions: Complete problems and robust measures'", Technical Report C-1987–28, Department of Computer Science, University of Helsinki. • Papadimitriou, C. H.; Yannakakis, M. (1993), "The traveling salesman problem with distances one and two", Math. Oper. Res. 18: 1–11, doi:10.1287/moor.18.1.1. 34 Travelling salesman problem • Serdyukov, A. I. (1984), "An algorithm with an estimate for the traveling salesman problem of the maximum'", Upravlyaemye Sistemy 25: 80–86. • Woeginger, G.J. (2003), "Exact Algorithms for NP-Hard Problems: A Survey", Combinatorial Optimization – Eureka, You Shrink! Lecture notes in computer science, vol. 2570, Springer, pp. 185–207. Further reading • Adleman, Leonard (1994), Molecular Computation of Solutions To Combinatorial Problems (http://www.usc. edu/dept/molecular-science/papers/fp-sci94.pdf) • Applegate, D. L.; Bixby, R. E.; Chvátal, V.; Cook, W. J. (2006), The Traveling Salesman Problem: A Computational Study, Princeton University Press, ISBN 978-0-691-12993-8. • Arora, S. (1998), "Polynomial time approximation schemes for Euclidean traveling salesman and other geometric problems" (http://graphics.stanford.edu/courses/cs468-06-winter/Papers/arora-tsp.pdf), Journal of the ACM 45 (5): 753–782, doi:10.1145/290179.290180. • Babin, Gilbert; Deneault, Stéphanie; Laportey, Gilbert (2005), Improvements to the Or-opt Heuristic for the Symmetric Traveling Salesman Problem (http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.89. 9953), Cahiers du GERAD, G-2005-02, Montreal: Group for Research in Decision Analysis. • Cook, William (2011), In Pursuit of the Travelling Salesman: Mathematics at the Limits of Computation, Princeton University Press, ISBN 978-0-691-15270-7. • Cook, William; Espinoza, Daniel; Goycoolea, Marcos (2007), "Computing with domino-parity inequalities for the TSP", INFORMS Journal on Computing 19 (3): 356–365, doi:10.1287/ijoc.1060.0204. • Cormen, T. H.; Leiserson, C. E.; Rivest, R. L.; Stein, C. (2001), "35.2: The traveling-salesman problem", Introduction to Algorithms (2nd ed.), MIT Press and McGraw-Hill, pp. 1027–1033, ISBN 0-262-03293-7. • Dantzig, G. B.; Fulkerson, R.; Johnson, S. M. (1954), "Solution of a large-scale traveling salesman problem", Operations Research 2 (4): 393–410, doi:10.1287/opre.2.4.393, JSTOR 166695. • Garey, M. R.; Johnson, D. S. (1979), "A2.3: ND22–24", Computers and Intractability: A Guide to the Theory of NP-Completeness, W.H. Freeman, pp. 211–212, ISBN 0-7167-1045-5. • Goldberg, D. E. (1989), Genetic Algorithms in Search, Optimization & Machine Learning, New York: Addison-Wesley, ISBN 0201157675. • Gutin, G.; Yeo, A.; Zverovich, A. (2002), "Traveling salesman should not be greedy: domination analysis of greedy-type heuristics for the TSP", Discrete Applied Mathematics 117 (1–3): 81–86, doi:10.1016/S0166-218X(01)00195-0. • Gutin, G.; Punnen, A. P. (2006), The Traveling Salesman Problem and Its Variations, Springer, ISBN 0-387-44459-9. • Johnson, D. S.; McGeoch, L. A. (1997), "The Traveling Salesman Problem: A Case Study in Local Optimization", in Aarts, E. H. L.; Lenstra, J. K., Local Search in Combinatorial Optimisation, John Wiley and Sons Ltd, pp. 215–310. • Lawler, E. L.; Lenstra, J. K.; Rinnooy Kan, A. H. G.; Shmoys, D. B. (1985), The Traveling Salesman Problem: A Guided Tour of Combinatorial Optimization, John Wiley & Sons, ISBN 0-471-90413-9. • MacGregor, J. N.; Ormerod, T. 
(1996), "Human performance on the traveling salesman problem" (http://www. psych.lancs.ac.uk/people/uploads/TomOrmerod20030716T112601.pdf), Perception & Psychophysics 58 (4): 527–539, doi:10.3758/BF03213088. • Mitchell, J. S. B. (1999), "Guillotine subdivisions approximate polygonal subdivisions: A simple polynomial-time approximation scheme for geometric TSP, k-MST, and related problems" (http://citeseer.ist.psu.edu/622594. html), SIAM Journal on Computing 28 (4): 1298–1309, doi:10.1137/S0097539796309764. • Rao, S.; Smith, W. (1998), "Approximating geometrical graphs via 'spanners' and 'banyans'", Proc. 30th Annual ACM Symposium on Theory of Computing, pp. 540–550. 35 Travelling salesman problem • Rosenkrantz, Daniel J.; Stearns, Richard E.; Lewis, Philip M., II (1977), "An Analysis of Several Heuristics for the Traveling Salesman Problem", SIAM Journal on Computing 6 (5): 563–581, doi:10.1137/0206041. • Vickers, D.; Butavicius, M.; Lee, M.; Medvedev, A. (2001), "Human performance on visually presented traveling salesman problems", Psychological Research 65 (1): 34–45, doi:10.1007/s004260000031, PMID 11505612. • Walshaw, Chris (2000), A Multilevel Approach to the Travelling Salesman Problem, CMS Press. • Walshaw, Chris (2001), A Multilevel Lin-Kernighan-Helsgaun Algorithm for the Travelling Salesman Problem, CMS Press. External links • Traveling Salesman Problem (http://www.tsp.gatech.edu/index.html) at Georgia Tech • TSPLIB (http://www.iwr.uni-heidelberg.de/groups/comopt/software/TSPLIB95/) at the University of Heidelberg • Traveling Salesman Problem (http://demonstrations.wolfram.com/TravelingSalesmanProblem/) by Jon McLoone based on a program by Stephen Wolfram, after work by Stan Wagon, Wolfram Demonstrations Project. • optimap (http://www.gebweb.net/optimap/) an approximation using ACO on GoogleMaps with JavaScript • tsp (http://travellingsalesmanproblem.appspot.com/) an exact solver using Constraint Programming on GoogleMaps • Demo applet of a genetic algorithm solving TSPs and VRPTW problems (http://www.dna-evolutions.com/ dnaappletsample.html) • Source code library for the travelling salesman problem (http://www.adaptivebox.net/CILib/code/ tspcodes_link.html) • TSP solvers in R (http://tsp.r-forge.r-project.org/) for symmetric and asymmetric TSPs. Implements various insertion, nearest neighbor and 2-opt heuristics and an interface to Georgia Tech's Concorde and Chained Lin-Kernighan heuristics. 36 Constraint (mathematics) 37 Constraint (mathematics) In mathematics, a constraint is a condition that a solution to an optimization problem must satisfy. There are two types of constraints: equality constraints and inequality constraints. The set of solutions that satisfy all constraints is called the feasible set. Example The following is a simple optimization problem: subject to and where denotes the vector (x1, x2). In this example, the first line defines the function to be minimized (called the objective or cost function). The second and third lines define two constraints, the first of which is an inequality constraint and the second of which is an equality constraint. These two constraints define the feasible set of candidate solutions. Without the constraints, the solution would be where has the lowest value. But this solution does not satisfy the constraints. The solution of the constrained optimization problem stated above but is the point with the smallest value of , which that satisfies the two constraints. 
Terminology • If an inequality constraint holds as an equality at a given point, the constraint is said to be binding, as the point cannot be varied in the direction of the constraint. • If an inequality constraint holds as a strict inequality at a given point, the constraint is said to be non-binding, as the point can be varied in the direction of the constraint. • If a constraint is not satisfied, the point is said to be infeasible. External links • Nonlinear programming FAQ [1] • Mathematical Programming Glossary [2] References [1] http://www-unix.mcs.anl.gov/otc/Guide/faq/nonlinear-programming-faq.html [2] http://glossary.computing.society.informs.org/ Constraint satisfaction problem Constraint satisfaction problems (CSPs) are mathematical problems defined as a set of objects whose state must satisfy a number of constraints or limitations. CSPs represent the entities in a problem as a homogeneous collection of finite constraints over variables, which is solved by constraint satisfaction methods. CSPs are the subject of intense research in both artificial intelligence and operations research, since the regularity in their formulation provides a common basis to analyze and solve problems of many unrelated families. CSPs often exhibit high complexity, requiring a combination of heuristics and combinatorial search methods to be solved in a reasonable time. The boolean satisfiability problem (SAT), Satisfiability Modulo Theories (SMT) and answer set programming (ASP) can be roughly thought of as certain forms of the constraint satisfaction problem. Examples of simple problems that can be modeled as a constraint satisfaction problem: • Eight queens puzzle • Map coloring problem • Sudoku Examples demonstrating the above are often provided with tutorials of ASP, boolean SAT and SMT solvers. In the general case, constraint problems can be much harder, and may not be expressible in some of these simpler systems. Formal definition Formally, a constraint satisfaction problem is defined as a triple ⟨X, D, C⟩, where X = {x1, …, xn} is a set of variables, D is a domain of values, and C is a set of constraints. Every constraint is in turn a pair ⟨t, R⟩ (usually represented as a matrix), where t is an n-tuple of variables and R is an n-ary relation on D. An evaluation of the variables is a function v from the set of variables to the domain of values, v : X → D. An evaluation v satisfies a constraint ⟨(x1, …, xn), R⟩ if (v(x1), …, v(xn)) ∈ R. A solution is an evaluation that satisfies all constraints. Resolution of CSPs Constraint satisfaction problems on finite domains are typically solved using a form of search. The most used techniques are variants of backtracking, constraint propagation, and local search. Backtracking is a recursive algorithm. It maintains a partial assignment of the variables. Initially, all variables are unassigned. At each step, a variable is chosen, and all possible values are assigned to it in turn. For each value, the consistency of the partial assignment with the constraints is checked; in case of consistency, a recursive call is performed. When all values have been tried, the algorithm backtracks. In this basic backtracking algorithm, consistency is defined as the satisfaction of all constraints whose variables are all assigned. Several variants of backtracking exist. Backmarking improves the efficiency of checking consistency. Backjumping allows saving part of the search by backtracking "more than one variable" in some cases. Constraint learning infers and saves new constraints that can be later used to avoid part of the search.
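A minimal sketch of this basic backtracking algorithm is shown below, applied to a three-region map colouring instance (one of the example CSPs listed earlier). The representation of constraints as (scope, predicate) pairs and the variable names are illustrative choices, not a fixed convention; note that, as in the basic algorithm, a constraint is only checked once all variables in its scope are assigned.

```python
def backtracking_search(variables, domains, constraints, assignment=None):
    """Basic backtracking for a CSP, as described above.

    variables:   list of variable names
    domains:     dict mapping each variable to a list of candidate values
    constraints: list of (scope, predicate) pairs; scope is a tuple of variables
                 and predicate takes their values in that order.
    """
    if assignment is None:
        assignment = {}
    if len(assignment) == len(variables):
        return assignment                        # every variable assigned: solution found
    var = next(v for v in variables if v not in assignment)
    for value in domains[var]:
        assignment[var] = value
        consistent = all(
            pred(*(assignment[v] for v in scope))
            for scope, pred in constraints
            if all(v in assignment for v in scope)   # only fully assigned scopes are checked
        )
        if consistent:
            result = backtracking_search(variables, domains, constraints, assignment)
            if result is not None:
                return result
        del assignment[var]                      # undo and try the next value (backtrack)
    return None

# Map colouring of three mutually adjacent regions with three colours.
variables = ["WA", "NT", "SA"]
domains = {v: ["red", "green", "blue"] for v in variables}
different = lambda a, b: a != b
constraints = [(("WA", "NT"), different), (("WA", "SA"), different), (("NT", "SA"), different)]
print(backtracking_search(variables, domains, constraints))
```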
Look-ahead is also often used in backtracking to attempt to foresee the effects of choosing a variable or a value, thus sometimes determining in advance when a subproblem is satisfiable or unsatisfiable. Constraint propagation techniques are methods used to modify a constraint satisfaction problem. More precisely, they are methods that enforce a form of local consistency, a condition related to the consistency of a group of variables and/or constraints. Constraint propagation has various uses. First, it turns a problem into one that is equivalent but is usually simpler to solve. Second, it may prove satisfiability or unsatisfiability of problems. This is not guaranteed to happen in general; however, it always happens for some forms of constraint propagation and/or for certain kinds of problems. The best known and most used forms of local consistency are arc consistency, hyper-arc consistency, and path consistency. The most popular constraint propagation method is the AC-3 algorithm, which enforces arc consistency. Local search methods are incomplete satisfiability algorithms. They may find a solution of a problem, but they may fail even if the problem is satisfiable. They work by iteratively improving a complete assignment over the variables. At each step, a small number of variables change value, with the overall aim of increasing the number of constraints satisfied by this assignment. The min-conflicts algorithm is a local search algorithm specific to CSPs and based on that principle. In practice, local search appears to work well when these changes are also affected by random choices. Integrations of systematic search with local search have been developed, leading to hybrid algorithms. Theoretical aspects of CSPs Decision problems CSPs are also studied in computational complexity theory and finite model theory. An important question is whether, for each set of relations, the set of all CSPs that can be represented using only relations chosen from that set is either in P or NP-complete. If such a dichotomy theorem is true, then CSPs provide one of the largest known subsets of NP which avoids NP-intermediate problems, whose existence was demonstrated by Ladner's theorem under the assumption that P ≠ NP. Schaefer's dichotomy theorem handles the case when all the available relations are boolean operators, that is, for domain size 2. Schaefer's dichotomy theorem was recently generalized to a larger class of relations.[1] Most classes of CSPs that are known to be tractable are those where the hypergraph of constraints has bounded treewidth (and there are no restrictions on the set of constraint relations), or where the constraints have arbitrary form but there exist essentially non-unary polymorphisms of the set of constraint relations. Every CSP can also be considered as a conjunctive query containment problem.[2] Function problems A similar situation exists between the functional classes FP and #P. By a generalization of Ladner's theorem, there are also problems that are neither in FP nor #P-complete as long as FP ≠ #P. As in the decision case, a problem in #CSP is defined by a set of relations. Each problem takes a Boolean formula as input, and the task is to compute the number of satisfying assignments. This can be further generalized by using larger domain sizes, attaching a weight to each satisfying assignment, and computing the sum of these weights.
It is known that any complex weighted #CSP problem is either in FP or #P-hard.[3] Variants of CSPs The classic model of the constraint satisfaction problem defines a model of static, inflexible constraints. This rigid model is a shortcoming that makes it difficult to represent problems easily.[4] Several modifications of the basic CSP definition have been proposed to adapt the model to a wide variety of problems. Dynamic CSPs Dynamic CSPs[5] (DCSPs) are useful when the original formulation of a problem is altered in some way, typically because the set of constraints to consider evolves because of the environment.[6] DCSPs are viewed as a sequence of static CSPs, each one a transformation of the previous one in which variables and constraints can be added (restriction) or removed (relaxation). Information found in the initial formulations of the problem can be used to refine the next ones. The solving methods can be classified according to the way in which information is transferred: • Oracles: the solutions found to previous CSPs in the sequence are used as heuristics to guide the resolution of the current CSP from scratch. • Local repair: each CSP is calculated starting from the partial solution of the previous one and repairing the inconsistent constraints with local search. • Constraint recording: new constraints are defined in each stage of the search to represent the learning of inconsistent groups of decisions. Those constraints are carried over to the new CSP problems. Flexible CSPs Classic CSPs treat constraints as hard, meaning that they are imperative (each solution must satisfy all of them) and inflexible (in the sense that they must be completely satisfied or else they are completely violated). Flexible CSPs relax those assumptions, partially relaxing the constraints and allowing the solution to not comply with all of them. This is similar to preferences in preference-based planning. Some types of flexible CSPs include: • MAX-CSP, where a number of constraints are allowed to be violated, and the quality of a solution is measured by the number of satisfied constraints. • Weighted CSP, a MAX-CSP in which each violation of a constraint is weighted according to a predefined preference. Thus, satisfying constraints with larger weights is preferred. • Fuzzy CSPs, which model constraints as fuzzy relations in which the satisfaction of a constraint is a continuous function of its variables' values, going from fully satisfied to fully violated. References [1] Bodirsky, Manuel; Pinsker, Michael (2010). "Schaefer's theorem for graphs". CoRR abs/1011.2894: 2894. arXiv:1011.2894. Bibcode 2010arXiv1011.2894B. [2] Kolaitis, Phokion G.; Vardi, Moshe Y. (2000). "Conjunctive-Query Containment and Constraint Satisfaction". Journal of Computer and System Sciences 61 (2): 302–332. doi:10.1006/jcss.2000.1713. [3] Cai, Jin-Yi; Chen, Xi (2011). "Complexity of Counting CSP with Complex Weights" (http://arxiv.org/abs/1111.2384). CoRR abs/1111.2384. [4] . doi:10.1.1.9.6733. [5] Dechter, R. and Dechter, A., Belief Maintenance in Dynamic Constraint Networks. In Proc. of AAAI-88, 37–42. (http://www.ics.uci.edu/~csp/r5.pdf) [6] Solution reuse in dynamic constraint satisfaction problems (http://www.aaai.org/Papers/AAAI/1994/AAAI94-302.pdf), Thomas Schiex Further reading • Steven Minton, Andy Philips, Mark D. Johnston, Philip Laird (1993).
"Minimizing Conflicts: A Heuristic Repair Method for Constraint-Satisfaction and Scheduling Problems" (https://eprints.kfupm.edu.sa/50799/1/50799. pdf) (PDF). Journal of Artificial Intelligence Research 58: 161–205. External links • CSP Tutorial (http://4c.ucc.ie/web/outreach/tutorial.html) • Tsang, Edward (1993). Foundations of Constraint Satisfaction (http://www.bracil.net/edward/FCS.html). Academic Press. ISBN 0-12-701610-4 • Chen, Hubie (December 2009). "A Rendezvous of Logic, Complexity, and Algebra". ACM Computing Surveys (ACM) 42 (1): 1–32. doi:10.1145/1592451.1592453. • Dechter, Rina (2003). Constraint processing (http://www.ics.uci.edu/~dechter/books/index.html). Morgan Kaufmann. ISBN 1-55860-890-7 • Apt, Krzysztof (2003). Principles of constraint programming. Cambridge University Press. ISBN 0-521-82583-0 • Lecoutre, Christophe (2009). Constraint Networks: Techniques and Algorithms (http://www.iste.co.uk/index. php?f=a&ACTION=View&id=250). ISTE/Wiley. ISBN 978-1-84821-106-3 • Tomás Feder, Constraint satisfaction: a personal perspective (http://theory.stanford.edu/~tomas/consmod. pdf), manuscript. • Constraints archive (http://4c.ucc.ie/web/archive/index.jsp) 40 Constraint satisfaction problem • Forced Satisfiable CSP Benchmarks of Model RB (http://www.nlsde.buaa.edu.cn/~kexu/benchmarks/ benchmarks.htm) • Benchmarks -- XML representation of CSP instances (http://www.cril.univ-artois.fr/~lecoutre/research/ benchmarks/benchmarks.html) • Dynamic Flexible Constraint Satisfaction and Its Application to AI Planning (http://www.cs.st-andrews.ac.uk/ ~ianm/docs/Thesis.ppt), Ian Miguel - slides. • Constraint Propagation (http://www.ps.uni-sb.de/Papers/abstracts/tackDiss.html) - Dissertation by Guido Tack giving a good survey of theory and implementation issues Constraint satisfaction In artificial intelligence and operations research, constraint satisfaction is the process of finding a solution to a set of constraints that impose conditions that the variables must satisfy. A solution is therefore a vector of variables that satisfies all constraints. The techniques used in constraint satisfaction depend on the kind of constraints being considered. Often used are constraints on a finite domain, to the point that constraint satisfaction problems are typically identified with problems based on constraints on a finite domain. Such problems are usually solved via search, in particular a form of backtracking or local search. Constraint propagation are other methods used on such problems; most of them are incomplete in general, that is, they may solve the problem or prove it unsatisfiable, but not always. Constraint propagation methods are also used in conjunction with search to make a given problem simpler to solve. Other considered kinds of constraints are on real or rational numbers; solving problems on these constraints is done via variable elimination or the simplex algorithm. Constraint satisfaction originated in the field of artificial intelligence in the 1970s (see for example (Laurière 1978)). During the 1980s and 1990s, embedding of constraints into a programming language were developed. Languages often used for constraint programming are Prolog and C++. Constraint satisfaction problem As originally defined in artificial intelligence, constraints enumerate the possible values a set of variables may take. Informally, a finite domain is a finite set of arbitrary elements. 
A constraint satisfaction problem on such a domain contains a set of variables whose values can only be taken from the domain, and a set of constraints, each constraint specifying the allowed values for a group of variables. A solution to this problem is an evaluation of the variables that satisfies all constraints. In other words, a solution is a way of assigning a value to each variable in such a way that all constraints are satisfied by these values. In some circumstances, there may exist additional requirements: one may be interested not only in the solution (and in the fastest or most computationally efficient way to reach it) but in how it was reached; e.g. one may want the "simplest" solution ("simplest" in a logical, non-computational sense that has to be precisely defined). This is often the case in logic games such as Sudoku. In practice, constraints are often expressed in compact form, rather than enumerating all values of the variables that would satisfy the constraint. One of the most used constraints is the one establishing that the values of the affected variables must all be different. Problems that can be expressed as constraint satisfaction problems are the Eight queens puzzle, the Sudoku solving problem, the Boolean satisfiability problem, scheduling problems and various problems on graphs such as the graph coloring problem. While usually not included in the above definition of a constraint satisfaction problem, arithmetic equations and inequalities bound the values of the variables they contain and can therefore be considered a form of constraints. Their domain is the set of numbers (either integer, rational, or real), which is infinite; therefore, the relations of these constraints may be infinite as well; for example, a constraint such as x < y has an infinite number of pairs of satisfying values. Arithmetic equations and inequalities are often not considered within the definition of a "constraint satisfaction problem", which is limited to finite domains. They are, however, often used in constraint programming. Solving Constraint satisfaction problems on finite domains are typically solved using a form of search. The most used techniques are variants of backtracking, constraint propagation, and local search. These techniques are also used on problems with nonlinear constraints. In case there is a requirement on "simplicity", a pure logic, pattern-based approach was first introduced for the Sudoku CSP in the book The Hidden Logic of Sudoku[1]. It has recently been generalized to any finite CSP in another book by the same author: Constraint Resolution Theories[2]. Variable elimination and the simplex algorithm are used for solving linear and polynomial equations and inequalities, and problems containing variables with infinite domain. These are typically solved as optimization problems in which the optimized function is the number of violated constraints. Complexity Solving a constraint satisfaction problem on a finite domain is an NP-complete problem with respect to the domain size. Research has shown a number of tractable subcases, some limiting the allowed constraint relations, some requiring the scopes of constraints to form a tree, possibly in a reformulated version of the problem. Research has also established relationships of the constraint satisfaction problem with problems in other areas such as finite model theory. A very different aspect of complexity appears when one fixes the size of the domain.
It concerns the complexity distribution of minimal instances of a CSP of fixed size (e.g. Sudoku (9x9)). Here, complexity is measured according to the above-mentioned "simplicity" requirement (see Unbiased Statistics of a CSP - A Controlled-Bias Generator[3] or Constraint Resolution Theories[2]). In this context, a minimal instance is an instance with a unique solution such that, if any given (or clue) is deleted from it, the resulting instance has several solutions (statistics can only be meaningful on the set of minimal instances). Constraint programming Constraint programming is the use of constraints as a programming language to encode and solve problems. This is often done by embedding constraints into a programming language, which is called the host language. Constraint programming originated from a formalization of equalities of terms in Prolog II, leading to a general framework for embedding constraints into a logic programming language. The most common host languages are Prolog, C++, and Java, but other languages have been used as well. Constraint logic programming A constraint logic program is a logic program that contains constraints in the bodies of clauses. As an example, the clause A(X):-X>0,B(X) is a clause containing the constraint X>0 in the body. Constraints can also be present in the goal. The constraints in the goal and in the clauses used to prove the goal are accumulated into a set called the constraint store. This set contains the constraints the interpreter has assumed satisfiable in order to proceed in the evaluation. As a result, if this set is detected to be unsatisfiable, the interpreter backtracks. Equations of terms, as used in logic programming, are considered a particular form of constraints which can be simplified using unification. As a result, the constraint store can be considered an extension of the concept of substitution that is used in regular logic programming. The most common kinds of constraints used in constraint logic programming are constraints over integers/rational/real numbers and constraints over finite domains. Concurrent constraint logic programming languages have also been developed. They significantly differ from non-concurrent constraint logic programming in that they are aimed at programming concurrent processes that may not terminate. Constraint handling rules can be seen as a form of concurrent constraint logic programming, but are also sometimes used within a non-concurrent constraint logic programming language. They allow constraints to be rewritten, or new ones to be inferred, based on the truth of conditions. Constraint satisfaction toolkits Constraint satisfaction toolkits are software libraries for imperative programming languages that are used to encode and solve a constraint satisfaction problem. • Cassowary constraint solver, an open source project for constraint satisfaction (accessible from C, Java, Python and other languages). • Comet, a commercial programming language and toolkit • Gecode, an open source portable toolkit written in C++, developed as a production-quality and highly efficient implementation of a complete theoretical background. • JaCoP (solver), an open source Java constraint solver [4] • Koalog [5], a commercial Java-based constraint solver. • logilab-constraint [6], an open source constraint solver written in pure Python with constraint propagation algorithms. • MINION [7], an open-source constraint solver written in C++, with a small language for the purpose of specifying models/problems.
• ZDC [8], an open source program developed in the Computer-Aided Constraint Satisfaction Project [9] for modelling and solving constraint satisfaction problems. Other constraint programming languages Constraint toolkits are a way of embedding constraints into an imperative programming language. However, they are only used as external libraries for encoding and solving problems. An approach in which constraints are integrated into an imperative programming language is taken in the Kaleidoscope programming language. Constraints have also been embedded into functional programming languages. References [1] Berthier, Denis (16 May 2007). The Hidden Logic of Sudoku. Lulu Publishers. ISBN 978-1-84753-472-9. http://www.carva.org/denis.berthier/HLS. Retrieved 16 May 2007. [2] Berthier, Denis (5 October 2011). Constraint Resolution Theories. Lulu Publishers. ISBN 978-1-4478-6888-0. http://www.carva.org/denis.berthier/CRT. Retrieved 5 October 2011. [3] Denis Berthier, Unbiased Statistics of a CSP - A Controlled-Bias Generator, International Joint Conferences on Computer, Information, Systems Sciences and Engineering (CISSE 09), December 4-12, 2009 [4] http://jacop.osolpro.com/ [5] http://www.koalog.com/ [6] http://www.logilab.org/projects/constraint [7] http://minion.sourceforge.net/ [8] http://www.bracil.net/CSP/cacp/cacpdemo.html [9] http://www.bracil.net/CSP/cacp/ • Apt, Krzysztof (2003). Principles of constraint programming. Cambridge University Press. ISBN 0-521-82583-0. • Berthier, Denis (2011). Constraint Resolution Theories (http://www.carva.org/denis.berthier/CRT). Lulu. ISBN 978-1-4478-6888-0. • Dechter, Rina (2003). Constraint processing (http://www.ics.uci.edu/~dechter/books/index.html). Morgan Kaufmann. ISBN 1-55860-890-7. • Dincbas, M.; Simonis, H.; Van Hentenryck, P. (1990). "Solving Large Combinatorial Problems in Logic Programming". Journal of logic programming 8 (1–2): 75–93. doi:10.1016/0743-1066(90)90052-7. • Freuder, Eugene; Alan Mackworth (ed.) (1994). Constraint-based reasoning. MIT Press. • Frühwirth, Thom; Slim Abdennadher (2003). Essentials of constraint programming. Springer. ISBN 3-540-67623-6. • Guesguen, Hans; Hertzberg, Joachim (1992). A Perspective of Constraint Based Reasoning. Springer. ISBN 978-3540555100. • Jaffar, Joxan; Michael J. Maher (1994). "Constraint logic programming: a survey". Journal of logic programming 19/20: 503–581. doi:10.1016/0743-1066(94)90033-7. • Laurière, Jean-Louis (1978). "A Language and a Program for Stating and Solving Combinatorial Problems". Artificial intelligence 10 (1): 29–127. doi:10.1016/0004-3702(78)90029-2. • Lecoutre, Christophe (2009). Constraint Networks: Techniques and Algorithms (http://www.iste.co.uk/index.php?f=a&ACTION=View&id=250). ISTE/Wiley. ISBN 978-1-84821-106-3. • Marriot, Kim; Peter J. Stuckey (1998). Programming with constraints: An introduction. MIT Press. ISBN 0-262-13341-5. • Rossi, Francesca; Peter van Beek, Toby Walsh (ed.) (2006). Handbook of Constraint Programming (http://www.elsevier.com/wps/find/bookdescription.cws_home/708863/description#description). Elsevier. ISBN 978-0-444-52726-4. • Tsang, Edward (1993). Foundations of Constraint Satisfaction (http://www.bracil.net/edward/FCS.html). Academic Press. ISBN 0-12-701610-4. • Van Hentenryck, Pascal (1989). Constraint Satisfaction in Logic Programming. MIT Press. ISBN 0-262-08181-4.
External links • CSP Tutorial (http://4c.ucc.ie/web/outreach/tutorial.html) Heuristic (computer science) In computer science and optimization, a heuristic is a rule of thumb learned from experience but not always justified by an underlying theory. Heuristics are often used to improve the efficiency or effectiveness of optimization algorithms, either by finding an approximate answer when the optimal answer would be prohibitively difficult to compute, or by making an algorithm faster. Usually, heuristics do not guarantee that an optimal solution is ever found. On the other hand, results about NP-hardness in theoretical computer science make heuristics the only viable alternative for many complex optimization problems which are significant in the real world. An example of an approximation is one Jon Bentley described for solving the travelling salesman problem (TSP), where the task was to select the order in which to draw lines with a pen plotter. TSP is known to be NP-hard, so finding an optimal solution for even a moderately sized problem is intractable. Instead, the greedy algorithm can be used to give a good but not optimal solution (an approximation to the optimal answer) in a short amount of time. The greedy algorithm heuristic says to pick whatever is currently the best next step, regardless of whether that precludes good steps later. It is a heuristic in that practice says it is a good enough solution, while theory says there are better solutions (and can even tell how much better in some cases).[1] An example of making an algorithm faster occurs in certain search methods that try every possibility at each step but can stop the search if the current possibility is already worse than the best solution already found; in this sort of algorithm a heuristic can be used to try good choices first so that bad paths can be eliminated early (see alpha-beta pruning). References [1] Writing Efficient Programs, Jon Louis Bentley, Prentice-Hall Software Series, 1982, Page 11. Multi-objective optimization Multi-objective optimization (or multi-objective programming),[1][2] also known as multi-criteria or multi-attribute optimization, is the process of simultaneously optimizing two or more conflicting objectives subject to certain constraints. Multiobjective optimization problems can be found in various fields: product and process design, finance, aircraft design, the oil and gas industry, automobile design, or wherever optimal decisions need to be taken in the presence of trade-offs between two or more conflicting objectives. Maximizing profit and minimizing the cost of a product; maximizing performance and minimizing fuel consumption of a vehicle; and minimizing weight while maximizing the strength of a particular component are examples of multi-objective optimization problems. (Figure: Plot of objectives when maximizing return and minimizing risk in financial portfolios; Pareto-optimal points in red.) For nontrivial multiobjective problems, one cannot identify a single solution that simultaneously optimizes each objective. While searching for solutions, one reaches points such that, when attempting to improve an objective further, other objectives suffer as a result. A tentative solution is called non-dominated, Pareto optimal, or Pareto efficient if it cannot be eliminated from consideration by replacing it with another solution which improves an objective without worsening another one; a minimal sketch of this dominance test is given below.
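The following Python sketch illustrates the dominance test just described and filters a set of candidates down to its non-dominated members; the tuple representation of objective vectors and the assumption that every objective is to be minimized are choices made here for illustration only.

    def dominates(a, b):
        # True if objective vector a dominates b (minimization): a is no worse
        # than b in every objective and strictly better in at least one.
        return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

    def non_dominated(points):
        # Return the Pareto-efficient subset of a list of objective vectors.
        return [p for p in points
                if not any(dominates(q, p) for q in points if q != p)]

    # Example: three candidates evaluated on two objectives (cost, weight).
    candidates = [(1.0, 5.0), (2.0, 3.0), (2.5, 4.0)]
    print(non_dominated(candidates))   # (2.5, 4.0) is dominated by (2.0, 3.0)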
Finding such non-dominated solutions, and quantifying the trade-offs in satisfying the different objectives, is the goal when setting up and solving a multiobjective optimization problem. When the role of the decision maker (DM) is considered, one distinguishes between: a priori approaches that require all knowledge about the relative importance of the objectives before starting the solution process, a posteriori approaches that deliver a large representative set of Pareto-optimal solutions among which the DM chooses the preferred one, and interactive approaches which alternate the production of some Pareto-optimal solutions with feedback from the DM, so that a better tuning of the preferred combination of objectives can be learned.[3] Introduction In mathematical terms, the multiobjective problem can be written as: minimize, over the decision vector x, the vector of objectives [F_1(x), F_2(x), ..., F_k(x)], subject to g_j(x) ≤ 0 for j = 1, ..., m and h_l(x) = 0 for l = 1, ..., e, where F_i is the i-th objective function, the g_j and h_l are the inequality and equality constraints, respectively, and x is the vector of optimization or decision variables. The solution to the above problem is a set of Pareto points. Thus, instead of being a unique solution to the problem, the solution to a multiobjective problem is a possibly infinite set of Pareto points. A point F* in objective space is termed Pareto optimal if there does not exist another feasible design with objective vector F such that F_i ≤ F_i* for all i = 1, ..., k, and F_j < F_j* for at least one index j. Solution methods Some methods for finding a solution to a multiobjective optimization problem are summarized below. Constructing a single aggregate objective function (AOF) This is an intuitive approach to solving the multi-objective problem. The basic idea is to combine all of the objectives into a single objective function, called the AOF, such as the well-known weighted linear sum of the objectives. This objective function is optimized subject to technological constraints specifying how much of one objective must be sacrificed, from any given starting point, in order to gain a certain amount regarding the other objective. These technological constraints frequently come in the form f(y1, y2) = 0 for some function f, where y1 and y2 are the objectives (e.g., strength and lightness of a product). Often the aggregate objective function is not linear in the objectives, but rather is non-linear, expressing increasing marginal dissatisfaction with greater incremental sacrifices in the value of either objective. Furthermore, sometimes the aggregate objective function is additively separable, so that it is expressed as a weighted average of a non-linear function of one objective and a non-linear function of another objective. Then the optimal solution obtained will depend on the relative values of the weights specified. For example, if one is trying to maximize the strength of a machine component and minimize the production cost, and if a higher weight is specified for the cost objective compared to the strength, the solution will be one that favors lower cost over higher strength. The weighted sum method, like any method of selecting a single solution as preferable to all others, is essentially subjective, in that a decision maker needs to supply the weights. Moreover, this approach may prove difficult to implement if the Pareto frontier is not globally convex and/or the objective function to be minimized is not globally concave.
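A minimal Python sketch of the weighted-sum aggregate objective function described above follows; the two toy objectives, the weights, and the crude grid search used as the single-objective optimizer are assumptions made here for illustration, not a prescription from the text.

    def make_aof(objectives, weights):
        # Combine several objective functions (all minimized) into a single
        # weighted-sum aggregate objective function (AOF).
        def aof(x):
            return sum(w * f(x) for f, w in zip(objectives, weights))
        return aof

    # Two toy objectives in conflict: cost grows with x, weakness shrinks with x.
    cost     = lambda x: x ** 2            # minimize cost
    weakness = lambda x: (x - 3.0) ** 2    # minimize weakness (i.e., maximize strength)

    aof = make_aof([cost, weakness], weights=[0.7, 0.3])

    # Crude single-objective minimization of the AOF over a grid of candidates;
    # the trade-off selected depends entirely on the weights supplied by the
    # decision maker.
    candidates = [i / 100.0 for i in range(0, 301)]
    best = min(candidates, key=aof)
    print(best, aof(best))

Changing the weights (for example, putting more weight on weakness) moves the selected point along the trade-off curve, which is exactly why the weighted-sum method is described as subjective.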
The objective way of characterizing multi-objective problems, by identifying multiple Pareto optimal candidate solutions, requires a Pareto-compliant ranking method, favoring non-dominated solutions, as seen in current multi-objective evolutionary approaches such as NSGA-II [4] and SPEA2. Here, no weight is required and thus no a priori information on the decision-maker's preferences is needed.[5] However, to decide upon one of the Pareto-efficient options as the one to adopt requires information about the decision-maker's preferences. Thus the objective characterization of the problem is simply the first stage in a two-stage analysis, consisting of (1) identifying the non-dominated possibilities, and (2) choosing among them. The NBI, NC, SPO and DSD methods The Normal Boundary Intersection (NBI)[6][7], Normal Constraint (NC)[8][9], Successive Pareto Optimization (SPO)[10], and Directed Search Domain (DSD)[11] methods solve the multi-objective optimization problem by constructing several AOFs. The solution of each AOF yields a Pareto point, whether locally or globally optimal. The NC and DSD methods suggest two different filtering procedures to remove locally Pareto optimal points. The AOFs are constructed with the target of obtaining evenly distributed Pareto points that give a good impression (approximation) of the real set of Pareto points. The DSD, NC and SPO methods generate solutions that represent some peripheral regions of the set of Pareto points for more than two objectives that are known not to be represented by the solutions generated with the NBI method. According to Erfani and Utyuzhnikov, the DSD method works reasonably more efficiently than its NC and NBI counterparts on some difficult test cases in the literature.[11] Evolutionary algorithms Evolutionary algorithms are popular approaches to solving multiobjective optimization problems. Currently most evolutionary optimizers apply Pareto-based ranking schemes. Genetic algorithms such as the Non-dominated Sorting Genetic Algorithm-II (NSGA-II) and Strength Pareto Evolutionary Algorithm 2 (SPEA-2) have become standard approaches, although some schemes based on particle swarm optimization and simulated annealing[12] are significant. The main advantage of evolutionary algorithms, when applied to solve multi-objective optimization problems, is the fact that they typically optimize sets of solutions, allowing computation of an approximation of the entire Pareto front in a single algorithm run. The main disadvantage of evolutionary algorithms is their much lower speed. Other methods • Multiobjective Optimization using Evolutionary Algorithms (MOEA)[5][13][14] • PGEN (Pareto surface generation for convex multiobjective instances)[15] • IOSO (Indirect Optimization on the basis of Self-Organization) • SMS-EMOA (S-metric selection evolutionary multiobjective algorithm)[16] • Reactive Search Optimization (using machine learning for adapting strategies and objectives)[17][18], implemented in LIONsolver • Benson's algorithm for linear vector optimization problems Applications Economics In economics, the study of resource allocation under scarcity, many problems involve multiple objectives along with constraints on what combinations of those objectives are attainable. For example, a consumer's demands for various goods are determined by the process of maximization of the utility derived from those goods, subject to a constraint based on how much income is available to spend on those goods and on the prices of those goods.
This constraint allows more of one good to be purchased only at the sacrifice of consuming less of another good; therefore, the various objectives (more consumption of each good is preferred) are 47 Multi-objective optimization in conflict with each other according to this constraint. A common method for analyzing such a problem is to use a graph of indifference curves, representing preferences, and a budget constraint, representing the trade-offs that the consumer is faced with. Another example involves the production possibilities frontier, which specifies what combinations of various types of goods can be produced by a society with certain amounts of various resources. The frontier specifies the trade-offs that the society is faced with — if the society is fully utilizing its resources, more of one good can be produced only at the expense of producing less of another good. A society must then use some process to choose among the possibilities on the frontier. Macroeconomic policy-making is a context requiring multi-objective optimization. Typically a central bank must choose a stance for monetary policy that balances competing objectives — low inflation, low unemployment, low balance of trade deficit, etc. To do this, the central bank uses a model of the economy that quantitatively describes the various causal linkages in the economy; it simulates the model repeatedly under various possible stances of monetary policy, in order to obtain a menu of possible predicted outcomes for the various variables of interest. Then in principle it can use an aggregate objective function to rate the alternative sets of predicted outcomes, although in practice central banks use a non-quantitative, judgement-based, process for ranking the alternatives and making the policy choice. Finance In finance, a common problem is to choose a portfolio when there are two conflicting objectives — the desire to have the expected value of portfolio returns be as high as possible, and the desire to have risk, measured by the standard deviation of portfolio returns, be as low as possible. This problem is often represented by a graph in which the efficient frontier shows the best combinations of risk and expected return that are available, and in which indifference curves show the investor's preferences for various risk-expected return combinations. The problem of optimizing a function of the expected value (first moment) and the standard deviation (square root of the second moment) of portfolio return is called a two-moment decision model. Linear programming applications In linear programming problems, a linear objective function is optimized subject to linear constraints. Typically multiple variables of concern appear in the objective function. A vast body of research has been devoted to methods of solving these problems. Because the efficient set, the set of combinations of values of the various variables of interest having the feature that none of the variables can be given a better value without hurting the value of another variable, is piecewise linear and not continuously differentiable, the problem is not dealt with by first specifying all the points on the Pareto-efficient set; instead, solution procedures utilize the aggregate objective function right from the start. Many practical problems in operations research can be expressed as linear programming problems. 
Certain special cases of linear programming, such as network flow problems and multi-commodity flow problems are considered important enough to have generated much research on specialized algorithms for their solution. Linear programming is heavily used in microeconomics and company management, for dealing with such issues as planning, production, transportation, technology, and so forth. Optimal control applications In engineering and economics, many problems involve multiple objectives which are not describable as the-more-the-better or the-less-the-better; instead, there is an ideal target value for each objective, and the desire is to get as close as possible to the desired value of each objective. For example, one might want to adjust a rocket's fuel usage and orientation so that it arrives both at a specified place and at a specified time; or one might want to conduct open market operations so that both the inflation rate and the unemployment rate are as close as possible to their 48 Multi-objective optimization desired values. Often such problems are subject to linear equality constraints that prevent all objectives from being simultaneously perfectly met, especially when the number of controllable variables is less than the number of objectives and when the presence of random shocks generates uncertainty. Commonly a multi-objective quadratic objective function is used, with the cost associated with an objective rising quadratically with the distance of the objective from its ideal value. Since these problems typically involve adjusting the controlled variables at various points in time and/or evaluating the objectives at various points in time, intertemporal optimization techniques are employed. References [1] Steuer, R.E. (1986). Multiple Criteria Optimization: Theory, Computations, and Application. New York: John Wiley & Sons, Inc. ISBN 047188846X. [2] Sawaragi, Y.; Nakayama, H. and Tanino, T. (1985). Theory of Multiobjective Optimization (vol. 176 of Mathematics in Science and Engineering). Orlando, FL: Academic Press Inc. ISBN 0126203709. [3] A. M. Geoffrion; J. S. Dyer; A. Feinberg (December 1972). "An Interactive Approach for Multi-Criterion Optimization, with an Application to the Operation of an Academic Department". Management Science. Application Series (INFORMS) 19 (4 Part 1): 357–368. [4] Deb, K.; Pratap, A.; Agarwal, S.; Meyarivan, T. (2002). "A fast and elitist multi-objective genetic algorithm: NSGA-II". IEEE Transactions on Evolutionary Computation 6 (2): 182–197. doi:10.1109/4235.996017. [5] Deb, K. (2001). Multi-Objective Optimization using Evolutionary Algorithms. John Wiley & Sons. ISBN 978-0471873396. [6] Das, I.; Dennis, J. E. (1998). "Normal-Boundary Intersection: A New Method for Generating the Pareto Surface in Nonlinear Multicriteria Optimization Problems". SIAM Journal on Optimization 8: 631–657. [7] "Normal-Boundary Intersection: An Alternate Method For Generating Pareto Optimal Points In Multicriteria Optimization Problems" (http:/ / ntrs. nasa. gov/ archive/ nasa/ casi. ntrs. nasa. gov/ 19970005647_1997005080. pdf) (pdf). . [8] Messac, A.; Ismail-Yahaya, A.; Mattson, C.A. (2003). "The normalized normal constraint method for generating the Pareto frontier". Structural and multidisciplinary optimization 25 (2): 86–98. [9] Messac, A.; Mattson, C. A. (2004). "Normal constraint method with guarantee of even representation of complete Pareto frontier". AIAA journal 42 (10): 2101–2111. 
[10] Mueller-Gritschneder, Daniel; Graeb, Helmut; Schlichtmann, Ulf (2009). "A Successive Approach to Compute the Bounded Pareto Front of Practical Multiobjective Optimization Problems". SIAM Journal on Optimization 20 (2): 915–934. [11] Erfani, Tohid; Utyuzhnikov, Sergei V. (2011). "Directed Search Domain: A Method for Even Generation of Pareto Frontier in Multiobjective Optimization" (http:/ / personalpages. manchester. ac. uk/ postgrad/ tohid. erfani/ TohidErfaniSUtyuzhnikov. pdf) (pdf). Journal of Engineering Optimization 43 (5): 1–18. . Retrieved October 17, 2011. [12] Suman, B.; Kumar, P. (2006). "A survey of simulated annealing as a tool for single and multiobjective optimization". Journal of the Operational Research Society 57 (10): 1143–1160. doi:10.1057/palgrave.jors.2602068. [13] Coello Coello, C. A.; Lamont, G. B.; Van Veldhuizen, D. A. (2007). Evolutionary Algorithms for Solving Multi-Objective Problems (2 ed.). Springer. ISBN 978-0-387-33254-3. [14] Das, S.; Panigrahi, B. K. (2008). Rabuñal, J. R.; Dorado, J.; Pazos, A.. eds. Multi-objective Evolutionary Algorithms, Encyclopedia of Artificial Intelligence. 3. Idea Group Publishing. pp. 1145–1151. [15] Craft, D.; Halabi, T.; Shih, H.; Bortfeld, T. (2006). "Approximating convex Pareto surfaces in multiobjective radiotherapy planning". Medical Physics 33 (9): 3399–3407. [16] http:/ / ls11-www. cs. uni-dortmund. de/ people/ beume/ publications/ BNR08_at. pdf [17] Battiti, Roberto; Mauro Brunato; Franco Mascia (2008). Reactive Search and Intelligent Optimization. Springer Verlag. ISBN 978-0-387-09623-0. [18] Battiti, Roberto; Mauro Brunato (2011). Reactive Business Intelligence. From Data to Models to Insight. (http:/ / www. reactivebusinessintelligence. com/ ). Trento, Italy: Reactive Search Srl. ISBN 978-88-905795-0-9. . External links • A tutorial on multiobjective optimization (http://www.calresco.org/lucas/pmo.htm) • Evolutionary Multiobjective Optimization (http://demonstrations.wolfram.com/ EvolutionaryMultiobjectiveOptimization/), The Wolfram Demonstrations Project 49 Pareto efficiency 50 Pareto efficiency Pareto efficiency, or Pareto optimality, is a concept in economics with applications in engineering and social sciences. The term is named after Vilfredo Pareto (1848–1923), an Italian economist who used the concept in his studies of economic efficiency and income distribution. In a Pareto efficient economic system no allocation of given goods can be made without making at least one individual worse off. Given an initial allocation of goods among a set of individuals, a change to a different allocation that makes at least one individual better off without making any other individual worse off is called a Pareto improvement. An allocation is defined as "Pareto efficient" or "Pareto optimal" when no further Pareto improvements can be made. Pareto efficiency is a minimal notion of efficiency and does not necessarily result in a socially desirable distribution of resources: it makes no statement about equality, or the overall well-being of a society.[1][2] Pareto efficiency in short An economic system that is not Pareto efficient implies that a certain change in allocation of goods (for example) may result in some individuals being made "better off" with no individual being made worse off, and therefore can be made more Pareto efficient through a Pareto improvement. Here 'better off' is often interpreted as "put in a preferred position." 
It is commonly accepted that outcomes that are not Pareto efficient are to be avoided, and therefore Pareto efficiency is an important criterion for evaluating economic systems and public policies. If economic allocation in any system is not Pareto efficient, there is potential for a Pareto improvement, an increase in Pareto efficiency: through reallocation, at least one participant's well-being can be improved without reducing any other participant's well-being. Looking at the production-possibility frontier shows how productive efficiency is a precondition for Pareto efficiency. Point A is not efficient in production because you can produce more of either one or both goods (Butter and Guns) without producing less of the other. Thus, moving from A to D enables you to make one person better off without making anyone else worse off (a Pareto improvement). Moving to point B from point A, however, is not Pareto efficient, as less butter is produced. Likewise, moving to point C from point A is not Pareto efficient, as fewer guns are produced. A point on the frontier curve with the same x or y coordinate will be Pareto efficient. In the real world, ensuring that nobody is disadvantaged by a change aimed at improving economic efficiency may require compensation of one or more parties. For instance, if a change in economic policy dictates that a legally protected monopoly ceases to exist and that market subsequently becomes competitive and more efficient, the monopolist will be made worse off. However, the loss to the monopolist will be more than offset by the gain in efficiency. This means the monopolist can be compensated for its loss while still leaving an efficiency
It was first demonstrated mathematically by economists Kenneth Arrow and Gérard Debreu. However, the result does not rigorously establish welfare results for real economies because of the restrictive assumptions necessary for the proof (markets exist for all possible goods, all markets are in full equilibrium, markets are perfectly competitive, transaction costs are negligible, there must be no externalities, and market participants must have perfect information). Moreover, it has since been demonstrated mathematically that, in the absence of perfect information or complete markets, outcomes will generically be Pareto inefficient (the Greenwald–Stiglitz theorem).[5] A competitive equilibrium may not be Pareto Optimal because of externalities, tax distortion, or use of monopoly power. A negative externality causes the firm to overproduce relative to Pareto efficiency, while a positive externality causes the firm to underproduce. Tax distortions cause a wedge between the marginal rate of substitution and marginal product of labour. Monopoly power occurs when firms may not be price-takers. If the firm is large relative to market size, it can use its monopoly power to restrict output, raise prices, and increase profits.[6] Pareto improvements and microeconomic theory Note that microeconomic analysis does not assume additive utility nor does it assume any interpersonal utility tradeoffs. To engage in interpersonal utility tradeoffs leads to greater good problems faced by earlier utilitarians. It also creates a question as to how weights are assigned and who assigns them, as well as questions regarding how to compare pleasure or pain across individuals. Efficiency – in all of standard microeconomics – therefore refers to the absence of possible Pareto improvements. It does not in any way opine on the fairness of the allocation (in the sense of distributive justice or equity). An 'efficient' equilibrium could be one where one player has all the goods and other players have none (in an extreme example). 51 Pareto efficiency Weak and strong Pareto optimum A "weak Pareto optimum" (WPO) nominally satisfies the same standard of not being Pareto-inferior to any other allocation, but for the purposes of weak Pareto optimization, an alternative allocation is considered to be a Pareto improvement only if the alternative allocation is strictly preferred by all individuals. In other words, when an allocation is WPO there are no possible alternative allocations whose realization would cause every individual to gain. Weak Pareto-optimality is "weaker" than strong Pareto-optimality in the sense that the conditions for WPO status are "weaker" than those for SPO status: any allocation that can be considered an SPO will also qualify as a WPO, but a WPO allocation won't necessarily qualify as an SPO. Under any form of Pareto-optimality, for an alternative allocation to be Pareto-superior to an allocation being tested—and, therefore, for the feasibility of an alternative allocation to serve as proof that the tested allocation is not an optimal one—the feasibility of the alternative allocation must show that the tested allocation fails to satisfy at least one of the requirements for SPO status. One may apply the same metaphor to describe the set of requirements for WPO status as being "weaker" than the set of requirements for SPO status. 
(Indeed, because the SPO set entirely encompasses the WPO set, with respect to any property the requirements for SPO status are of strength equal to or greater than the strength of the requirements for WPO status. Therefore, the requirements for WPO status are not merely weaker on balance or weaker according to the odds; rather, one may describe them more specifically and quite fittingly as "Pareto-weaker.") • Note that when one considers the requirements for an alternative allocation's superiority according to one definition against the requirements for its superiority according to the other, the comparison between the requirements of the respective definitions is the opposite of the comparison between the requirements for optimality: To demonstrate the WPO-inferiority of an allocation being tested, an alternative allocation must falsify at least one of the particular conditions in the WPO subset, rather than merely falsify at least one of either these conditions or the other SPO conditions. Therefore, the requirements for weak Pareto-superiority of an alternative allocation are harder to satisfy (in other words, "stronger") than are the requirements for strong Pareto-superiority of an alternative allocation. • It further follows that every SPO is a WPO (but not every WPO is an SPO): Whereas the WPO description applies to any allocation from which every feasible departure results in the non-improvement of at least one individual, the SPO description applies to only those allocations that meet both the WPO requirement and the more specific ("stronger") requirement that at least one non-improving individual exhibit a specific type of non-improvement, namely doing worse. • The "strong" and "weak" descriptions of optimality continue to hold true when one construes the terms in the context set by the field of semantics: If one describes an allocation as being a WPO, one makes a "weaker" statement than one would make by describing it as an SPO: If the statements "Allocation X is a WPO" and "Allocation X is an SPO" are both true, then the former statement is less controversial than the latter in that to defend the latter, one must prove everything to defend the former "and then some." By the same token, however, the former statement is less informative or contentful in that it "says less" about the allocation; that is, the former statement contains, implies, and (when stated) asserts fewer constituent propositions about the allocation. Formal representation Formally, a (strong/weak) Pareto optimum is a maximal element for the partial order relation of Pareto improvement/strict Pareto improvement: it is an allocation such that no other allocation is "better" in the sense of the order relation. Pareto frontier Given a set of choices and a way of valuing them, the Pareto frontier or Pareto set or Pareto front is the set of choices that are Pareto efficient. The Pareto frontier is particularly useful in engineering: by restricting attention to the set of choices that are Pareto-efficient, a designer can make tradeoffs within this set, rather than considering the full range of every parameter. The Pareto frontier is defined formally as follows. Consider a design space with n real parameters, and for each design space point there are m different criteria by which to judge that point. (Figure: Example of a Pareto frontier. The boxed points represent feasible choices, and smaller values are preferred to larger ones. Point C is not on the Pareto frontier because it is dominated by both point A and point B. Points A and B are not strictly dominated by any other point, and hence lie on the frontier.)
Let f : R^n → R^m be the function which assigns, to each design space point x, a criteria space point f(x). This represents the way of valuing the designs. Now, it may be that some designs are infeasible; so let X be a set of feasible designs in R^n, which must be a compact set. Then the set which represents the feasible criterion points is f(X), the image of the set X under the action of f. Call this image Y. Now construct the Pareto frontier as a subset of Y, the feasible criterion points. It can be assumed that the preferable values of each criterion parameter are the lesser ones, thus minimizing each dimension of the criterion vector. Then compare criterion vectors as follows: One criterion vector y strictly dominates (or "is preferred to") a vector y* if each parameter of y is not strictly greater than the corresponding parameter of y* and at least one parameter is strictly less: that is, y_i ≤ y*_i for each i and y_i < y*_i for some i; this is what it means for y to strictly dominate y*. Then the Pareto frontier is the set of points from Y that are not strictly dominated by another point in Y. Formally, this defines a partial order on Y, namely the product order on R^m restricted to Y (more precisely, the induced order on Y as a subset of R^m), and the Pareto frontier is the set of maximal elements with respect to this order. Algorithms for computing the Pareto frontier of a finite set of alternatives have been studied in computer science, being sometimes referred to as the maximum vector problem or the skyline query.[7][8] Relationship to marginal rate of substitution At a Pareto efficient allocation (on the Pareto frontier), the marginal rate of substitution is the same for all consumers. A formal statement can be derived by considering a system with m consumers and n goods, and a utility function of each consumer i written as z_i = f_i(x_i), where x_i = (x_i1, x_i2, ..., x_in) is the vector of goods consumed by consumer i, for all i = 1, ..., m. The supply constraint is written Σ_i x_ij = b_j for j = 1, ..., n. To optimize this problem, the Lagrangian is used (maximizing consumer 1's utility while holding the other consumers' utilities fixed at the levels z_i and respecting the supply constraints): L = f_1(x_1) + Σ_{i=2,...,m} λ_i (f_i(x_i) - z_i) + Σ_{j=1,...,n} μ_j (b_j - Σ_i x_ij), where the λ_i and μ_j are Lagrange multipliers. By taking the partial derivative of the Lagrangian with respect to consumer 1's consumption of good j, and then taking the partial derivative of the Lagrangian with respect to consumer i's consumption of good j, we have the following system of equations: f_1j = μ_j and λ_i f_ij = μ_j for i = 2, ..., m, where f_ij denotes consumer i's marginal utility of consuming good j (the partial derivative of f_i with respect to x_ij). Dividing the equation for good j by the corresponding equation for good k gives f_ij / f_ik = μ_j / μ_k for every consumer i; these equations combine to yield precisely the condition that requires that the marginal rate of substitution between each ordered pair of goods be equal across all consumers. Notes [1] Barr, N. (2004). Economics of the welfare state. New York, Oxford University Press (USA). [2] Sen, A. (1993). Markets and freedom: Achievements and limitations of the market mechanism in promoting individual freedoms. Oxford Economic Papers, 45(4), 519–541. [3] Ng, 1983. [4] Palda, 2011. [5] Greenwald, Bruce; Stiglitz, Joseph E. (1986). "Externalities in economies with imperfect information and incomplete markets". Quarterly Journal of Economics 101 (2): 229–264. doi:10.2307/1891114. JSTOR 1891114 [6] Stephen D. Williamson (2010). "Sources of Social Inefficiencies", Macroeconomics 3rd edition. [7] Kung, H.T.; Luccio, F.; Preparata, F.P. (1975). "On finding the maxima of a set of vectors". Journal of the ACM 22 (4): 469–476. doi:10.1145/321906.321910 [8] Godfrey, Parke; Shipley, Ryan; Gryz, Jarek (2006).
"Algorithms and Analyses for Maximal Vector Computation". VLDB Journal 16: 5–28. doi:10.1007/s00778-006-0029-7 Pareto efficiency References • • • • • • • • Fudenberg, D. and Tirole, J. (1983). Game Theory. MIT Press. Chapter 1, Section 2.4. ISBN 0-262-06141-4. Ng, Yew-Kwang (1983). Welfare Economics. Macmillan. ISBN 0-333-97121-3. Osborne, M. J. and Rubenstein, A. (1994). A Course in Game Theory. MIT Press. pp. 7. ISBN 0-262-65040-1. Dalimov R.T. Modelling International Economic Integration: an Oscillation Theory Approach. Victoria, Trafford, 2008, 234 pp. Dalimov R.T. "The heat equation and the dynamics of labor and capital migration prior and after economic integration. African Journal of Marketing Management, vol. 1 (1), pp. 023–031, April 2009. Jovanovich, M. The Economics Of European Integration: Limits And Prospects. Edward Elgar, 2005, 918 p. Mathur, Vijay K. "How Well Do We Know Pareto Optimality?" "How Well Do We Know Pareto Optimality?" Journal of Economic Education 22#2 (1991) pp 172–178 online edition (http://www.questia.com/read/ 95848335) Palda, Filip Pareto's Republic and the New Science of Peace. Cooper-Wolfling, 2011. Stochastic programming Stochastic programming is a framework for modeling optimization problems that involve uncertainty. Whereas deterministic optimization problems are formulated with known parameters, real world problems almost invariably include some unknown parameters. When the parameters are known only within certain bounds, one approach to tackling such problems is called robust optimization. Here the goal is to find a solution which is feasible for all such data and optimal in some sense. Stochastic programming models are similar in style but take advantage of the fact that probability distributions governing the data are known or can be estimated. The goal here is to find some policy that is feasible for all (or almost all) the possible data instances and maximizes the expectation of some function of the decisions and the random variables. More generally, such models are formulated, solved analytically or numerically, and analyzed in order to provide useful information to a decision-maker.[1] As an example, consider two-stage linear programs. Here the decision maker takes some action in the first stage, after which a random event occurs affecting the outcome of the first-stage decision. A recourse decision can then be made in the second stage that compensates for any bad effects that might have been experienced as a result of the first-stage decision. The optimal policy from such a model is a single first-stage policy and a collection of recourse decisions (a decision rule) defining which second-stage action should be taken in response to each random outcome. Stochastic programming has applications in a broad range of areas ranging from finance to transportation to energy optimization.[2][3] Biological Applications Stochastic dynamic programming is frequently used to model animal behaviour in such fields as behavioural ecology.[4][5] Empirical tests of models of optimal foraging, life-history transitions such as fledging in birds and egg laying in parasitoid wasps have shown the value of this modelling technique in explaining the evolution of behavioural decision making. These models are typically many staged, rather than two-staged. Economic Applications Stochastic dynamic programming is a useful tool in understanding decision making under uncertainty. 
The accumulation of capital stock under uncertainty is one example; it is often used by resource economists to analyze bioeconomic problems[6] in which the uncertainty enters through factors such as weather.
Solvers
• FortSP - solver for stochastic programming problems
References
[1] Shapiro, Alexander; Dentcheva, Darinka; Ruszczyński, Andrzej (2009). Lectures on Stochastic Programming: Modeling and Theory (http://www2.isye.gatech.edu/people/faculty/Alex_Shapiro/SPbook.pdf). MPS/SIAM Series on Optimization 9. Philadelphia, PA: Society for Industrial and Applied Mathematics (SIAM). pp. xvi+436. ISBN 978-0-898716-87-0. MR2562798.
[2] Stein W. Wallace and William T. Ziemba (eds.). Applications of Stochastic Programming. MPS-SIAM Book Series on Optimization 5, 2005.
[3] Applications of stochastic programming are described at the website of the Stochastic Programming Community (http://stoprog.org).
[4] Mangel, M. & Clark, C. W. 1988. Dynamic Modeling in Behavioral Ecology. Princeton University Press. ISBN 0-691-08506-4.
[5] Houston, A. I. & McNamara, J. M. 1999. Models of Adaptive Behaviour: An Approach Based on State. Cambridge University Press. ISBN 0-521-65539-0.
[6] Howitt, R., Msangi, S., Reynaud, A. and K. Knapp. 2002. "Using Polynomial Approximations to Solve Stochastic Dynamic Programming Problems: or A "Betty Crocker" Approach to SDP." University of California, Davis, Department of Agricultural and Resource Economics Working Paper. http://www.agecon.ucdavis.edu/aredepart/facultydocs/Howitt/Polyapprox3a.pdf
Further reading
• John R. Birge and François V. Louveaux. Introduction to Stochastic Programming. Springer Verlag, New York, 1997.
• Kall, Peter; Wallace, Stein W. (1994). Stochastic Programming (http://stoprog.org/index.html?introductions.html). Wiley-Interscience Series in Systems and Optimization. Chichester: John Wiley & Sons, Ltd. pp. xii+307. ISBN 0-471-95158-7. MR1315300.
• G. Ch. Pflug: Optimization of Stochastic Models. The Interface between Simulation and Optimization. Kluwer, Dordrecht, 1996.
• Andras Prekopa. Stochastic Programming. Kluwer Academic Publishers, Dordrecht, 1995.
• Andrzej Ruszczynski and Alexander Shapiro (eds.). Stochastic Programming. Handbooks in Operations Research and Management Science, Vol. 10, Elsevier, 2003.
• Shapiro, Alexander; Dentcheva, Darinka; Ruszczyński, Andrzej (2009). Lectures on Stochastic Programming: Modeling and Theory (http://www2.isye.gatech.edu/people/faculty/Alex_Shapiro/SPbook.pdf). MPS/SIAM Series on Optimization 9. Philadelphia, PA: Society for Industrial and Applied Mathematics (SIAM). pp. xvi+436. ISBN 978-0-898716-87-0. MR2562798.
• Stein W. Wallace and William T. Ziemba (eds.). Applications of Stochastic Programming. MPS-SIAM Book Series on Optimization 5, 2005.
External links
• Stochastic Programming Community Home Page (http://stoprog.org)
Parallel metaheuristic
Parallel metaheuristics are a class of advanced techniques capable of reducing both the numerical effort and the run time of a metaheuristic. To this end, concepts and technologies from the field of parallelism in computer science are used to enhance and even completely modify the behavior of existing metaheuristics. Just as there exists a long list of metaheuristics, such as evolutionary algorithms, particle swarm optimization, ant colony optimization, and simulated annealing,
so too there exists a large set of different techniques strongly or loosely based on them, whose behavior encompasses the parallel execution of multiple algorithm components that cooperate in some way to solve a problem on a given parallel hardware platform.
Background
In practice, optimization (and searching, and learning) problems are often NP-hard, complex, and time-consuming. Two major approaches are traditionally used to tackle these problems: exact methods and metaheuristics. Exact methods find provably optimal solutions but are often impractical, as they are extremely time-consuming for real-world problems (large-dimension, heavily constrained, multimodal, time-varying, epistatic problems). Conversely, metaheuristics provide sub-optimal (sometimes optimal) solutions in a reasonable time. Thus, metaheuristics usually make it possible to meet the resolution deadlines imposed in industrial settings, and they allow the study of general problem classes rather than particular problem instances.
Figure: An example of different implementations of the same PSO metaheuristic model.
In general, many of the techniques that perform best, in terms of precision and effort, on complex and real-world problems are metaheuristics. Their fields of application range from combinatorial optimization, bioinformatics, and telecommunications to economics, software engineering, etc. These fields are full of tasks needing fast solutions of high quality. See [1] for more details on complex applications.
Metaheuristics fall into two categories: trajectory-based metaheuristics and population-based metaheuristics. The main difference between these two kinds of methods lies in the number of tentative solutions used in each step of the (iterative) algorithm. A trajectory-based technique starts with a single initial solution and, at each step of the search, the current solution is replaced by another (often the best) solution found in its neighborhood. Trajectory-based metaheuristics usually find a locally optimal solution quickly, and so they are called exploitation-oriented methods, promoting intensification in the search space. On the other hand, population-based algorithms make use of a population of solutions. The initial population is in this case randomly generated (or created with a greedy algorithm), and then enhanced through an iterative process. At each generation of the process, the whole population (or a part of it) is replaced by newly generated individuals (often the best ones). These techniques are called exploration-oriented methods, since their main ability resides in diversification of the search space.
Most basic metaheuristics are sequential. Although their use significantly reduces the temporal complexity of the search process, that complexity remains high for real-world problems arising in both academic and industrial domains. Therefore, parallelism is a natural way not only to reduce the search time, but also to improve the quality of the provided solutions. For a comprehensive discussion on how parallelism can be combined with metaheuristics, see [2].
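Before turning to the specific parallel models, the basic idea can be illustrated with a minimal Python sketch: several independent trajectory-based searches (here, simple hill climbers) run in separate processes, and the best result found is kept. The objective function and all names and parameters below are illustrative assumptions chosen for the example, not part of any system described in this article.

import random
from multiprocessing import Pool

def objective(x):
    # Hypothetical objective to minimize (sphere function).
    return sum(v * v for v in x)

def hill_climb(seed, dim=10, steps=2000, step_size=0.1):
    # One independent trajectory-based search: accept only improving moves.
    rng = random.Random(seed)
    current = [rng.uniform(-5, 5) for _ in range(dim)]
    best = objective(current)
    for _ in range(steps):
        candidate = [v + rng.gauss(0, step_size) for v in current]
        cost = objective(candidate)
        if cost < best:
            current, best = candidate, cost
    return best, current

if __name__ == "__main__":
    # Parallel multi-start: launch several searches at once, keep the best.
    with Pool(processes=4) as pool:
        results = pool.map(hill_climb, range(8))
    best_cost, best_solution = min(results, key=lambda r: r[0])
    print("best cost found:", best_cost)

Running the searches in separate processes already reduces wall-clock time; the models described next go further by letting the searches cooperate.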
Parallel trajectory-based metaheuristics
Metaheuristics for solving optimization problems can be viewed as walks through neighborhoods, tracing search trajectories through the solution domain of the problem at hand:
Algorithm: Sequential trajectory-based general pseudo-code
  Generate(s(0));                        // Initial solution
  t := 0;                                // Numerical step
  while not Termination Criterion(s(t)) do
      s'(t) := SelectMove(s(t));         // Exploration of the neighborhood
      if AcceptMove(s'(t)) then
          s(t) := ApplyMove(s'(t));
      t := t + 1;
  endwhile
Walks are performed by iterative procedures that allow moving from one solution to another in the solution space (see the above algorithm). These metaheuristics perform moves in the neighborhood of the current solution, i.e., they have a perturbative nature. The walks start from a solution that is randomly generated or obtained from another optimization algorithm. At each iteration, the current solution is replaced by another one selected from the set of its neighboring candidates. The search process is stopped when a given condition is satisfied (a maximum number of iterations is reached, a solution with a target quality is found, the search has stagnated for a given time, and so on).
A powerful way to achieve high computational efficiency with trajectory-based methods is the use of parallelism. Different parallel models have been proposed for trajectory-based metaheuristics, and three of them are commonly used in the literature: the parallel multi-start model, the parallel exploration and evaluation of the neighborhood (or parallel moves model), and the parallel evaluation of a single solution (or move acceleration model):
• Parallel multi-start model: It consists of simultaneously launching several trajectory-based methods to compute better and more robust solutions. They may be heterogeneous or homogeneous, independent or cooperative, start from the same or different solution(s), and be configured with the same or different parameters.
• Parallel moves model: It is a low-level master-slave model that does not alter the behavior of the heuristic. A sequential search would compute the same result, only more slowly. At the beginning of each iteration, the master duplicates the current solution among distributed nodes. Each node separately manages its candidate solution, and the results are returned to the master.
• Move acceleration model: The quality of each move is evaluated in a parallel, centralized way. This model is particularly interesting when the evaluation function can itself be parallelized, as when it is CPU time-consuming and/or I/O intensive. In that case, the function can be viewed as an aggregation of a certain number of partial functions that can be run in parallel.
Parallel population-based metaheuristics
Population-based metaheuristics are stochastic search techniques that have been successfully applied in many real and complex applications (epistatic, multimodal, multi-objective, and highly constrained problems). A population-based algorithm is an iterative technique that applies stochastic operators to a pool of individuals: the population (see the algorithm below). Every individual in the population is the encoded version of a tentative solution. An evaluation function associates a fitness value with every individual, indicating its suitability to the problem. Iteratively, the probabilistic application of variation operators on selected individuals guides the population toward tentative solutions of higher quality.
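Because evaluating the fitness of every individual is typically the most expensive part of such population-based methods, the evaluation step is a natural first target for parallelization, in the spirit of the master-slave scheme discussed further below. The following Python fragment is only an illustrative sketch; the toy fitness function and all names are assumptions made for the example.

from concurrent.futures import ProcessPoolExecutor

def fitness(individual):
    # Hypothetical, possibly expensive evaluation of one encoded solution.
    return sum(bit for bit in individual)

def evaluate_population(population, workers=4):
    # Individuals are farmed out to worker processes and the fitness
    # values are collected back in the original order.
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fitness, population))

if __name__ == "__main__":
    population = [[1, 0, 1, 1], [0, 0, 1, 0], [1, 1, 1, 1]]
    print(evaluate_population(population))  # [3, 1, 4]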
The most well-known metaheuristic families based on the manipulation of a population of solutions are evolutionary algorithms (EAs), ant colony optimization (ACO), particle swarm optimization (PSO), scatter search (SS), differential evolution (DE), and estimation of distribution algorithms (EDA).
Algorithm: Sequential population-based metaheuristic pseudo-code
  Generate(P(0));                               // Initial population
  t := 0;                                       // Numerical step
  while not Termination Criterion(P(t)) do
      Evaluate(P(t));                           // Evaluation of the population
      P'(t) := Selection(P(t));                 // Selection of parents
      P''(t) := Apply Variation Operators(P'(t)); // Generation of new solutions
      P(t + 1) := Replace(P(t), P''(t));        // Building the next population
      t := t + 1;
  endwhile
For non-trivial problems, executing the reproductive cycle of a simple population-based method on long individuals and/or large populations usually requires high computational resources. In general, evaluating a fitness function for every individual is frequently the most costly operation of this algorithm. Consequently, a variety of algorithmic issues are being studied to design efficient techniques. These issues usually consist of defining new operators, hybrid algorithms, parallel models, and so on.
Parallelism arises naturally when dealing with populations, since each of the individuals belonging to a population is an independent unit (at least under the Pittsburgh approach, although there are other approaches, like the Michigan one, which do not consider individuals as independent units). Indeed, the performance of population-based algorithms is often improved when they run in parallel. Two parallelizing strategies are especially relevant for population-based algorithms: (1) parallelization of computations, in which the operations commonly applied to each of the individuals are performed in parallel, and (2) parallelization of the population, in which the population is split into different parts that can be simply exchanged or evolved separately, and then joined later.
At the beginning of the parallelization history of these algorithms, the well-known master-slave (also known as global parallelization or farming) method was used. In this approach, a central processor performs the selection operations while the associated slave processors (workers) run the variation operators and the evaluation of the fitness function. This algorithm has the same behavior as the sequential one, although its computational efficiency is improved, especially for time-consuming objective functions. On the other hand, many researchers use a pool of processors to speed up the execution of a sequential algorithm, simply because independent runs can be made more rapidly by using several processors than by using a single one. In this case, no interaction at all exists between the independent runs. However, most parallel population-based techniques found in the literature utilize some kind of spatial disposition for the individuals, and then parallelize the resulting chunks on a pool of processors. Among the most widely known types of structured metaheuristics, the distributed (or coarse-grain) and cellular (or fine-grain) algorithms are very popular optimization procedures.
In the case of distributed algorithms, the population is partitioned into a set of subpopulations (islands) in which isolated serial algorithms are executed. Sparse exchanges of individuals are performed among these islands with the goal of introducing some diversity into the subpopulations, thus preventing the search from getting stuck in local optima.
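A minimal island-model sketch in Python may help make the distributed scheme concrete. The OneMax fitness, the ring migration topology, and every name and parameter below are assumptions chosen for the example rather than a reference implementation: each island evolves its own subpopulation in isolation and periodically sends its best individual to the next island.

import random

def fitness(ind):
    # Hypothetical fitness: number of ones in a bit string (OneMax).
    return sum(ind)

def evolve_island(pop, rng, generations):
    # Very small serial GA: binary tournament selection plus bit-flip mutation.
    for _ in range(generations):
        new_pop = []
        for _ in range(len(pop)):
            a, b = rng.sample(pop, 2)
            parent = a if fitness(a) >= fitness(b) else b
            child = [bit ^ (rng.random() < 0.05) for bit in parent]
            new_pop.append(child)
        pop = new_pop
    return pop

def island_model(n_islands=4, pop_size=20, length=30, epochs=10,
                 gens_per_epoch=5, seed=0):
    rng = random.Random(seed)
    islands = [[[rng.randint(0, 1) for _ in range(length)]
                for _ in range(pop_size)] for _ in range(n_islands)]
    for _ in range(epochs):
        # Isolated evolution on every island.
        islands = [evolve_island(pop, rng, gens_per_epoch) for pop in islands]
        # Sparse migration: best individual moves to the next island (ring topology).
        migrants = [max(pop, key=fitness) for pop in islands]
        for i, pop in enumerate(islands):
            worst = min(range(len(pop)), key=lambda k: fitness(pop[k]))
            pop[worst] = migrants[(i - 1) % n_islands]
    return max((ind for pop in islands for ind in pop), key=fitness)

best = island_model()
print("best fitness:", fitness(best))

In a real distributed metaheuristic the islands would run in separate processes or on separate machines; they are evolved sequentially here only to keep the sketch short.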
In order to design a distributed metaheuristic, we must make several design decisions. Among them, a chief decision is to determine the migration policy: the topology (logical links between the islands), the migration rate (number of individuals that undergo migration in every exchange), the migration frequency (number of steps in every subpopulation between two successive exchanges), and the selection/replacement of the migrants.
In the case of a cellular method, the concept of neighborhood is introduced, so that an individual may only interact with its nearby neighbors in the breeding loop. The small overlapped neighborhoods in the algorithm help in exploring the search space, because a slow diffusion of solutions through the population provides a kind of exploration, while exploitation takes place inside each neighborhood. See [3] for more information on cellular genetic algorithms and related models.
Also, hybrid models have been proposed in which a two-level approach to parallelization is undertaken. In general, the higher level of parallelization is a coarse-grained implementation, and each basic island runs a cellular method, a master-slave method, or even another distributed method.
See Also
• Cellular Evolutionary Algorithms
• Enrique Alba
References
[1] http://eu.wiley.com/WileyCDA/WileyTitle/productCd-0470293322.html
[2] http://eu.wiley.com/WileyCDA/WileyTitle/productCd-0471678066.html
[3] http://www.springer.com/business/operations+research/book/978-0-387-77609-5
• G. Luque, E. Alba, Parallel Genetic Algorithms. Theory and Real World Applications, Springer-Verlag, ISBN 978-3-642-22083-8, July 2011 (http://www.amazon.com/Parallel-Genetic-Algorithms-Applications-Computational/dp/3642220835)
• Alba E., Blum C., Isasi P., León C., Gómez J.A. (eds.), Optimization Techniques for Solving Complex Problems, Wiley, ISBN 978-0-470-29332-4, 2009 (http://eu.wiley.com/WileyCDA/WileyTitle/productCd-0470293322.html)
• E. Alba, B. Dorronsoro, Cellular Genetic Algorithms, Springer-Verlag, ISBN 978-0-387-77609-5, 2008 (http://www.springer.com/business/operations+research/book/978-0-387-77609-5)
• N. Nedjah, E. Alba, L. de Macedo Mourelle, Parallel Evolutionary Computations, Springer-Verlag, ISBN 3-540-32837-8, 2006 (http://www.springer.com/east/home?SGWID=5-102-22-138979270-0)
• E. Alba, Parallel Metaheuristics: A New Class of Algorithms, Wiley, ISBN 0-471-67806-6, July 2005 (http://eu.wiley.com/WileyCDA/WileyTitle/productCd-0471678066.html)
• MALLBA (http://neo.lcc.uma.es/software/mallba/index.php)
• JGDS (http://neo.lcc.uma.es/software/jgds/index.php)
• DEME (http://neo.lcc.uma.es/software/deme/index.php)
• xxGA (http://neo.lcc.uma.es/software/xxga/index.php)
• Paradiseo
External links
• THE Page on Parallel Metaheuristics (http://mallba10.lcc.uma.es/PM/index.php/Parallel_Metaheuristics)
• The NEO group at the University of Málaga, Spain (http://neo.lcc.uma.es)
There ain't no such thing as a free lunch
"There ain't no such thing as a free lunch" (alternatively, "There's no such thing as a free lunch" or other variants) is a popular adage communicating the idea that it is impossible to get something for nothing. The acronyms TANSTAAFL and TINSTAAFL are also used. Uses of the phrase dating back to the 1930s and 1940s have been found, but the phrase's first appearance is unknown.[1] The "free lunch" in the saying refers to the nineteenth-century practice in American bars of offering a "free lunch" as a way to entice drinking customers.
The phrase and the acronym are central to Robert Heinlein's 1966 libertarian science fiction novel The Moon is a Harsh Mistress, which popularized it.[2][3] The free-market economist Milton Friedman also popularized the phrase[1] by using it as the title of a 1975 book, and it often appears in economics textbooks;[4] Campbell McConnell writes that the idea is "at the core of economics".[5] History and usage “Free lunch” The “free lunch” referred to in the acronym relates back to the once-common tradition of saloons in the United States providing a "free" lunch to patrons who had purchased at least one drink. All the foods on offer were high in salt (e.g. ham, cheese and salted crackers) so those who ate them ended up buying a lot of beer. Rudyard Kipling, writing in 1891, noted how he came upon a bar room full of bad Salon pictures, in which men with hats on the backs of their heads were wolfing food from a counter. “It was the institution of the 'free lunch' I had struck. You paid for a drink and got as much as you wanted to eat. For something less than a rupee a day a man can feed himself sumptuously in San Francisco, even though he be a bankrupt. Remember this if ever you are stranded in these parts.”[6] TANSTAAFL, on the other hand, indicates an acknowledgment that in reality a person or a society cannot get "something for nothing". Even if something appears to be free, there is always a cost to the person or to society as a whole even though that cost may be hidden or distributed. For example, as Heinlein has one of his characters point out, a bar offering a free lunch will likely charge more for its drinks.[7] Early uses According to Robert Caro, Fiorello La Guardia, on becoming mayor of New York in 1934, said "È finita la cuccagna!", meaning "No more free lunch"; in this context "free lunch" refers to graft and corruption.[1] The earliest known occurrence of the full phrase, in the form "There ain’t no such thing as free lunch", appears as the punchline of a joke related in an article in the El Paso Herald-Post of June 27, 1938, entitled "Economics in Eight Words".[8] In 1945 "There ain't no such thing as a free lunch" appeared in the Columbia Law Review, and "there is no free lunch" appeared in a 1942 article in the Oelwein Daily Register (in a quote attributed to economist Harley L. Lutz) and in a 1947 column by economist Merryle S. Rukeyser.[2][9] In 1949 the phrase appeared in an article by Walter Morrow in the San Francisco News (published on 1 June) and in Pierre Dos Utt's monograph, "TANSTAAFL: a plan for a new economic world order",[10] which describes an oligarchic political system based on his conclusions from "no free lunch" principles. The 1938 and 1949 sources use the phrase in relating a fable about a king (Nebuchadrezzar in Dos Utt's retelling) seeking advice from his economic advisors. Morrow's retelling, which claims to derive from an earlier editorial 61 There ain't no such thing as a free lunch reported to be non-existent,[11] but closely follows the story as related in the earlier article in the El Paso Herald-Post, differs from Dos Utt's in that the ruler asks for ever-simplified advice following their original "eighty-seven volumes of six hundred pages" as opposed to a simple failure to agree on "any major remedy". The last surviving economist advises that "There ain't no such thing as a free lunch". In 1950, a New York Times columnist ascribed the phrase to economist (and Army General) Leonard P. Ayres of the Cleveland Trust Company. 
"It seems that shortly before the General's death [in 1946]... a group of reporters approached the general with the request that perhaps he might give them one of several immutable economic truisms that he gathered from long years of economic study... 'It is an immutable economic fact,' said the general, 'that there is no such thing as a free lunch.'"[12] Meanings TANSTAAFL demonstrates opportunity cost. Greg Mankiw described the concept as: "To get one thing that we like, we usually have to give up another thing that we like. Making decisions requires trading off one goal against another."[13] The idea that there is no free lunch at the societal level applies only when all resources are being used completely and appropriately, i.e., when economic efficiency prevails. If not, a 'free lunch' can be had through a more efficient utilisation of resources. If one individual or group gets something at no cost, somebody else ends up paying for it. If there appears to be no direct cost to any single individual, there is a social cost. Similarly, someone can benefit for "free" from an externality or from a public good, but someone has to pay the cost of producing these benefits. In the sciences, TANSTAAFL means that the universe as a whole is ultimately a closed system—there is no magic source of matter, energy, light, or indeed lunch, that does not draw resources from something else, and will not eventually be exhausted. Therefore the TANSTAAFL argument may also be applied to natural physical processes in a closed system (either the universe as a whole, or any system that does not receive energy or matter from outside). (See Second law of thermodynamics.) The bio-ecologist Barry Commoner used this concept as the last of his famous "Four Laws of Ecology". In mathematical finance, the term is also used as an informal synonym for the principle of no-arbitrage. This principle states that a combination of securities that has the same cash flows as another security must have the same net price in equilibrium. TANSTAAFL is sometimes used as a response to claims of the virtues of free software. Supporters of free software often counter that the use of the term "free" in this context is primarily a reference to a lack of constraint ("libre") rather than a lack of cost ("gratis"). Richard Stallman has described it as "free as in speech not as in beer". The prefix "TANSTAA-" is used in numerous other contexts as well to denote some immutable property of the system being discussed. For example, "TANSTAANFS" is used by Electrical Engineering professors to stand for "There Ain't No Such Thing As A Noise Free System". 62 There ain't no such thing as a free lunch References [1] Safire, William On Language; Words Left Out in the Cold" New York Times, 2-14-1993 (http:/ / query. nytimes. com/ gst/ fullpage. html?res=9F0CE7DF1138F937A25751C0A965958260) [2] Keyes, Ralph (2006). The Quote Verifier. New York: St. Martin's Press. p. 70. ISBN 978-0-312-34004-9. [3] Smith, Chrysti M. (2006). Verbivore's Feast: Second Course. Helena, MT: Farcountry Press. p. 131. ISBN 978-1-56037-404-6. [4] Gwartney, James D.; Richard Stroup, Dwight R. Lee (2005). Common Sense Economics. New York: St. Martin's Press. pp. 8–9. ISBN 0-312-33818-X. [5] McConnell, Campbell R.; Stanley L. Brue (2005). Economics: principles, problems, and policies (http:/ / books. google. com/ books?id=XzCE3CjiANwC& lpg=PA3& dq="free lunch" economics& pg=PA3#v=onepage& q="free lunch" economics& f=false). Boston: McGraw-Hill Irwin. p. 3. ISBN 978-0-07-281935-9. 
OCLC 314959936. . Retrieved 2009-12-10. [6] Kipling, Rudyard (1930). American Notes. Standard Book Company. (published in book form in 1930, based on essays that appeared in periodicals in 1891) • American Notes by Rudyard Kipling (http:/ / www. gutenberg. org/ etext/ 977) at Project Gutenberg [7] Heinlein, Robert A. (1997). The Moon Is a Harsh Mistress. New York: Tom Doherty Assocs.. pp. 8–9. ISBN 0-312-86355-1. [8] Shapiro, Fred (16 July 2009). "Quotes Uncovered: The Punchline, Please" (http:/ / freakonomics. blogs. nytimes. com/ 2009/ 07/ 16/ quotes-uncovered-the-punchline-please/ ). The New York Times – Freakonomics blog. . Retrieved 16 July 2009. [9] Fred R. Shapiro, ed. (2006). The Yale Book of Quotations. New Haven, CT: Yale Univ. Press. p. 478. ISBN 978-0-300-10798-2. [10] Dos Utt, Pierre (1949). TANSTAAFL: a plan for a new economic world order. Cairo Publications, Canton, OH. [11] http:/ / www. barrypopik. com/ index. php/ new_york_city/ entry/ no_more_free_lunch_fiorello_la_guardia/ [12] Fetridge, Robert H, "Along the Highways and Byways of Finance," The New York Times, Nov 12, 1950, p. 135 [13] Principles of Economics (4th edition), p. 4. • Tucker, Bob, (Wilson Tucker) The Neo-Fan's Guide to Science Fiction Fandom (3rd–8th Editions), 8th edition: 1996, Kansas City Science Fiction & Fantasy Society, KaCSFFS Press, No ISSN or ISBN listed. Fitness landscape In evolutionary biology, fitness landscapes or adaptive landscapes are used to visualize the relationship between genotypes (or phenotypes) and reproductive success. It is assumed that every genotype has a well-defined replication rate (often referred to as fitness). This fitness is the "height" of the landscape. Genotypes which are very similar are said to be "close" to each other, while those that are very different are "far" from each other. The two concepts of height and distance are sufficient to form the concept of a "landscape". The set of all possible genotypes, their degree of similarity, and their related fitness values is then called a fitness landscape. The idea of a fitness landscape helps explain flawed forms in evolution, including exploits and glitches in animals like their reactions to supernormal stimuli. In evolutionary optimization problems, fitness landscapes are evaluations of a fitness function for all candidate solutions (see below). The idea of studying evolution by visualizing the distribution of fitness values as a kind of landscape was first introduced by Sewall Wright in 1932.[1] Fitness landscapes in biology Fitness landscapes are often conceived of as ranges of mountains. There exist local peaks (points from which all paths are downhill, i.e. to lower fitness) and valleys (regions from which most paths lead uphill). A fitness landscape with many local peaks surrounded by deep valleys is called rugged. If all genotypes have the same replication rate, on the other hand, a fitness landscape is said to be flat. The shapes of fitness landscapes are also closely related to epistasis, as demonstrated by Stuart Kauffman's NK-Landscape model. An evolving population typically climbs uphill in the fitness landscape, by a series of small genetic changes, until a local optimum is reached (Fig. 1). There it remains, unless a rare mutation opens a path to a new, higher fitness peak. Note, however, that at high mutation rates this picture is somewhat simplistic. 
A population may not be able to climb a very sharp peak if the mutation rate is too high, or it may drift away from a peak it had already found; 63 Fitness landscape consequently, reducing the fitness of the system. The process of drifting away from a peak is often referred to as Muller's ratchet. The apparent lack of wheeled animals is an example of a fitness peak which is presently inaccessible due to a surrounding valley. In general, the higher the connectivity the more rugged the system becomes. Thus, a simply connected system only has one peak and if part of the system is changed then there will be little, if any, effect on any other part of the system. A high connectivity implies that the variables or sub-systems interact far more and the system may have to settle for a level of ‘fitness’ lower than it might be able to attain. The system would then have to change its approach to overcoming whatever problems that confront it, thus, changing the ‘terrain’ and enabling it to continue. Fitness landscapes in evolutionary optimization Apart from the field of evolutionary biology, the concept of a fitness landscape has also gained importance in evolutionary optimization methods such as genetic algorithms or evolutionary strategies. In evolutionary optimization, one tries to solve real-world problems (e.g., engineering or logistics problems) by imitating the dynamics of biological evolution. For example, a delivery truck with a number of destination addresses can take a large variety of different routes, but only very few will result in a short driving time. In order to use evolutionary optimization, one has to define for every possible solution s to the problem of interest (i.e., every possible route in the case of the delivery truck) how 'good' it is. This is done by introducing a scalar-valued function f(s) (scalar valued means that f(s) is a simple number, such as 0.3, while s can be a more complicated object, for example a list of destination addresses in the case of the delivery truck), which is called the fitness function or fitness landscape. A high f(s) implies that s is a good solution. In the case of the delivery truck, f(s) could be the number of deliveries per hour on route s. The best, or at least a very good, solution is then found in the following way: initially, a population of random solutions is created. Then, the solutions are mutated and selected for those with higher fitness, until a satisfying solution has been found. Evolutionary optimization techniques are particularly useful in situations in which it is easy to determine the quality of a single solution, but hard to go through all possible solutions one by one (it is easy to determine the driving time for a particular route of the delivery truck, but it is almost impossible to check all possible routes once the number of destinations grows to more than a handful). The concept of a scalar valued fitness function f(s) also corresponds to the concept of a potential or energy function in physics. The two concepts only differ in that physicists traditionally think in terms of minimizing the potential function, while biologists prefer the notion that fitness is being maximized. Therefore, taking the inverse of a potential function turns it into a fitness function, and vice versa. Figure 1: Sketch of a fitness landscape. The arrows indicate the preferred flow of a population on the landscape, and the points A and C are local optima. 
The red ball indicates a population that moves from a very low fitness value to the top of a peak. 64 Fitness landscape References [1] Wright, S. (1932). "The roles of mutation, inbreeding, crossbreeding, and selection in evolution" (http:/ / www. blackwellpublishing. com/ ridley/ classictexts/ wright. pdf). Proceedings of the Sixth International Congress on Genetics. pp. 355–366. . Further reading • Niko Beerenwinkel; Lior Pachter; Bernd Sturmfels (2007). "Epistasis and Shapes of Fitness Landscapes". Statistica Sinica 17 (4): 1317–1342. arXiv:q-bio.PE/0603034. MR2398598. • Richard Dawkins (1996). Climbing Mount Improbable. ISBN 0-393-03930-7. • Sergey Gavrilets (2004). Fitness landscapes and the origin of species (http://press.princeton.edu/titles/7799. html). ISBN 978-0-691-11983-0. • Stuart Kauffman (1995). At Home in the Universe: The Search for Laws of Self-Organization and Complexity. ISBN 978-0-19-511130-9. • Melanie Mitchell (1996). An Introduction to Genetic Algorithms. ISBN 978-0-262-63185-3. • W. B. Langdon and R. Poli (2002). "Chapter 2 Fitness Landscapes" (http://www.cs.ucl.ac.uk/staff/W. Langdon/FOGP/intro_pic/landscape.html). ISBN 3-540-42451-2. • Stuart Kauffman (1993). The Origins of Order. ISBN 978-0-19-507951-7. Genetic algorithm In the computer science field of artificial intelligence, a genetic algorithm (GA) is a search heuristic that mimics the process of natural evolution. This heuristic is routinely used to generate useful solutions to optimization and search problems. Genetic algorithms belong to the larger class of evolutionary algorithms (EA), which generate solutions to optimization problems using techniques inspired by natural evolution, such as inheritance, mutation, selection, and crossover. Methodology In a genetic algorithm, a population of strings (called chromosomes or the genotype of the genome), which encode candidate solutions (called individuals, creatures, or phenotypes) to an optimization problem, evolves toward better solutions. Traditionally, solutions are represented in binary as strings of 0s and 1s, but other encodings are also possible. The evolution usually starts from a population of randomly generated individuals and happens in generations. In each generation, the fitness of every individual in the population is evaluated, multiple individuals are stochastically selected from the current population (based on their fitness), and modified (recombined and possibly randomly mutated) to form a new population. The new population is then used in the next iteration of the algorithm. Commonly, the algorithm terminates when either a maximum number of generations has been produced, or a satisfactory fitness level has been reached for the population. If the algorithm has terminated due to a maximum number of generations, a satisfactory solution may or may not have been reached. Genetic algorithms find application in bioinformatics, phylogenetics, computational science, engineering, economics, chemistry, manufacturing, mathematics, physics and other fields. A typical genetic algorithm requires: 1. a genetic representation of the solution domain, 2. a fitness function to evaluate the solution domain. A standard representation of the solution is as an array of bits. Arrays of other types and structures can be used in essentially the same way. The main property that makes these genetic representations convenient is that their parts are easily aligned due to their fixed size, which facilitates simple crossover operations. 
Variable-length representations may also be used, but crossover implementation is more complex in this case. Tree-like representations are explored in genetic programming and graph-form representations are explored in evolutionary programming.
The fitness function is defined over the genetic representation and measures the quality of the represented solution. The fitness function is always problem dependent. For instance, in the knapsack problem one wants to maximize the total value of objects that can be put in a knapsack of some fixed capacity. A representation of a solution might be an array of bits, where each bit represents a different object, and the value of the bit (0 or 1) represents whether or not the object is in the knapsack. Not every such representation is valid, as the size of the objects may exceed the capacity of the knapsack. The fitness of the solution is the sum of the values of all objects in the knapsack if the representation is valid, or 0 otherwise. In some problems, it is hard or even impossible to define the fitness expression; in these cases, interactive genetic algorithms are used.
Once the genetic representation and the fitness function are defined, a GA proceeds to initialize a population of solutions (usually randomly) and then to improve it through repetitive application of the mutation, crossover, inversion and selection operators.
Initialization
Initially many individual solutions are (usually) randomly generated to form an initial population. The population size depends on the nature of the problem, but typically contains several hundreds or thousands of possible solutions. Traditionally, the population is generated randomly, allowing the entire range of possible solutions (the search space). Occasionally, the solutions may be "seeded" in areas where optimal solutions are likely to be found.
Selection
During each successive generation, a proportion of the existing population is selected to breed a new generation. Individual solutions are selected through a fitness-based process, where fitter solutions (as measured by a fitness function) are typically more likely to be selected. Certain selection methods rate the fitness of each solution and preferentially select the best solutions. Other methods rate only a random sample of the population, as the former process may be very time-consuming.
Reproduction
The next step is to generate a second-generation population of solutions from those selected, through genetic operators: crossover (also called recombination) and/or mutation. For each new solution to be produced, a pair of "parent" solutions is selected for breeding from the pool selected previously. By producing a "child" solution using the above methods of crossover and mutation, a new solution is created which typically shares many of the characteristics of its "parents". New parents are selected for each new child, and the process continues until a new population of solutions of appropriate size is generated. Although reproduction methods that are based on the use of two parents are more "biology inspired", some research[1][2] suggests that more than two "parents" generate higher quality chromosomes. These processes ultimately result in the next generation population of chromosomes that is different from the initial generation.
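The knapsack encoding and the operators described above can be made concrete with a short, self-contained Python sketch. The item values and weights, the capacity, and all function names below are illustrative assumptions, not part of any particular GA library.

import random

# Hypothetical problem data: (value, weight) per item and a knapsack capacity.
ITEMS = [(60, 10), (100, 20), (120, 30), (40, 15), (70, 25)]
CAPACITY = 60

def fitness(bits):
    # Sum of values if the selected items fit in the knapsack, 0 otherwise.
    value = sum(v for bit, (v, w) in zip(bits, ITEMS) if bit)
    weight = sum(w for bit, (v, w) in zip(bits, ITEMS) if bit)
    return value if weight <= CAPACITY else 0

def crossover(parent_a, parent_b, rng):
    # One-point crossover on the bit strings.
    point = rng.randrange(1, len(parent_a))
    return parent_a[:point] + parent_b[point:]

def mutate(bits, rng, rate=0.1):
    # Independent bit-flip mutation.
    return [1 - b if rng.random() < rate else b for b in bits]

rng = random.Random(42)
population = [[rng.randint(0, 1) for _ in ITEMS] for _ in range(20)]
for _ in range(50):  # generations
    # Fitness-based selection of parents (here: keep the better half).
    population.sort(key=fitness, reverse=True)
    parents = population[:10]
    children = [mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
                for _ in range(10)]
    population = parents + children
best = max(population, key=fitness)
print("best solution:", best, "fitness:", fitness(best))

Truncation selection is used here only for brevity; any of the selection schemes discussed elsewhere in this book (tournament, fitness-proportionate, and so on) could be substituted at the same point.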
Generally the average fitness will have increased by this procedure for the population, since only the best organisms from the first generation are selected for breeding, along with a small proportion of less fit solutions, for reasons already mentioned above. Although Crossover and Mutation are known as the main genetic operators, it is possible to use other operators such as regrouping, colonization-extinction, or migration in genetic algorithms.[3]
Termination
This generational process is repeated until a termination condition has been reached. Common terminating conditions are:
• A solution is found that satisfies minimum criteria
• Fixed number of generations reached
• Allocated budget (computation time/money) reached
• The highest ranking solution's fitness is reaching or has reached a plateau such that successive iterations no longer produce better results
• Manual inspection
• Combinations of the above
Simple generational genetic algorithm procedure:
1. Choose the initial population of individuals
2. Evaluate the fitness of each individual in that population
3. Repeat on this generation until termination (time limit, sufficient fitness achieved, etc.):
   1. Select the best-fit individuals for reproduction
   2. Breed new individuals through crossover and mutation operations to give birth to offspring
   3. Evaluate the individual fitness of new individuals
   4. Replace least-fit population with new individuals
The building block hypothesis
Genetic algorithms are simple to implement, but their behavior is difficult to understand. In particular it is difficult to understand why these algorithms frequently succeed at generating solutions of high fitness when applied to practical problems. The building block hypothesis (BBH) consists of:
1. A description of a heuristic that performs adaptation by identifying and recombining "building blocks", i.e. low order, low defining-length schemata with above average fitness.
2. A hypothesis that a genetic algorithm performs adaptation by implicitly and efficiently implementing this heuristic.
Goldberg describes the heuristic as follows: "Short, low order, and highly fit schemata are sampled, recombined [crossed over], and resampled to form strings of potentially higher fitness. In a way, by working with these particular schemata [the building blocks], we have reduced the complexity of our problem; instead of building high-performance strings by trying every conceivable combination, we construct better and better strings from the best partial solutions of past samplings.
"Because highly fit schemata of low defining length and low order play such an important role in the action of genetic algorithms, we have already given them a special name: building blocks. Just as a child creates magnificent fortresses through the arrangement of simple blocks of wood, so does a genetic algorithm seek near optimal performance through the juxtaposition of short, low-order, high-performance schemata, or building blocks."[4]
Observations
There are several general observations about the generation of solutions specifically via a genetic algorithm:
• Selection is clearly an important genetic operator, but opinion is divided over the importance of crossover versus mutation. Some argue that crossover is the most important, while mutation is only necessary to ensure that potential solutions are not lost.
Others argue that crossover in a largely uniform population only serves to propagate innovations originally found by mutation, and in a non-uniform population crossover is nearly always equivalent to a very large mutation (which is likely to be catastrophic). There are many references in Fogel (2006) that support the importance of mutation-based search. • As with all current machine learning problems it is worth tuning the parameters such as mutation probability, crossover probability and population size to find reasonable settings for the problem class being worked on. A very small mutation rate may lead to genetic drift (which is non-ergodic in nature). A recombination rate that is too high may lead to premature convergence of the genetic algorithm. A mutation rate that is too high may lead to loss of good solutions unless there is elitist selection. There are theoretical but not yet practical upper and lower bounds for these parameters that can help guide selection. • Often, GAs can rapidly locate good solutions, even for large search spaces. The same is of course also true for evolution strategies and evolutionary programming. Criticisms There are several criticisms of the use of a genetic algorithm compared to alternative optimization algorithms: • Repeated fitness function evaluation for complex problems is often the most prohibitive and limiting segment of artificial evolutionary algorithms. Finding the optimal solution to complex high dimensional, multimodal problems often requires very expensive fitness function evaluations. In real world problems such as structural optimization problems, one single function evaluation may require several hours to several days of complete simulation. Typical optimization methods can not deal with such types of problem. In this case, it may be necessary to forgo an exact evaluation and use an approximated fitness that is computationally efficient. It is apparent that amalgamation of approximate models may be one of the most promising approaches to convincingly use GA to solve complex real life problems. • Genetic algorithms do not scale well with complexity. That is, where the number of elements which are exposed to mutation is large there is often an exponential increase in search space size. This makes it extremely difficult to use the technique on problems such as designing an engine, a house or plane. In order to make such problems tractable to evolutionary search, they must be broken down into the simplest representation possible. Hence we typically see evolutionary algorithms encoding designs for fan blades instead of engines, building shapes instead of detailed construction plans, aerofoils instead of whole aircraft designs. The second problem of complexity is the issue of how to protect parts that have evolved to represent good solutions from further destructive mutation, particularly when their fitness assessment requires them to combine well with other parts. It has been suggested by some in the community that a developmental approach to evolved solutions could overcome some of the issues of protection, but this remains an open research question. • The "better" solution is only in comparison to other solutions. As a result, the stop criterion is not clear in every problem. • In many problems, GAs may have a tendency to converge towards local optima or even arbitrary points rather than the global optimum of the problem. This means that it does not "know how" to sacrifice short-term fitness to gain longer-term fitness. 
The likelihood of this occurring depends on the shape of the fitness landscape: certain problems may provide an easy ascent towards a global optimum, others may make it easier for the function to find the local optima. This problem may be alleviated by using a different fitness function, increasing the rate of mutation, or by using selection techniques that maintain a diverse population of solutions, although the No Free 68 Genetic algorithm Lunch theorem[5] proves that there is no general solution to this problem. A common technique to maintain diversity is to impose a "niche penalty", wherein, any group of individuals of sufficient similarity (niche radius) have a penalty added, which will reduce the representation of that group in subsequent generations, permitting other (less similar) individuals to be maintained in the population. This trick, however, may not be effective, depending on the landscape of the problem. Another possible technique would be to simply replace part of the population with randomly generated individuals, when most of the population is too similar to each other. Diversity is important in genetic algorithms (and genetic programming) because crossing over a homogeneous population does not yield new solutions. In evolution strategies and evolutionary programming, diversity is not essential because of a greater reliance on mutation. • Operating on dynamic data sets is difficult, as genomes begin to converge early on towards solutions which may no longer be valid for later data. Several methods have been proposed to remedy this by increasing genetic diversity somehow and preventing early convergence, either by increasing the probability of mutation when the solution quality drops (called triggered hypermutation), or by occasionally introducing entirely new, randomly generated elements into the gene pool (called random immigrants). Again, evolution strategies and evolutionary programming can be implemented with a so-called "comma strategy" in which parents are not maintained and new parents are selected only from offspring. This can be more effective on dynamic problems. • GAs cannot effectively solve problems in which the only fitness measure is a single right/wrong measure (like decision problems), as there is no way to converge on the solution (no hill to climb). In these cases, a random search may find a solution as quickly as a GA. However, if the situation allows the success/failure trial to be repeated giving (possibly) different results, then the ratio of successes to failures provides a suitable fitness measure. • For specific optimization problems and problem instances, other optimization algorithms may find better solutions than genetic algorithms (given the same amount of computation time). Alternative and complementary algorithms include evolution strategies, evolutionary programming, simulated annealing, Gaussian adaptation, hill climbing, and swarm intelligence (e.g.: ant colony optimization, particle swarm optimization) and methods based on integer linear programming. The question of which, if any, problems are suited to genetic algorithms (in the sense that such algorithms are better than others) is open and controversial. Variants The simplest algorithm represents each chromosome as a bit string. Typically, numeric parameters can be represented by integers, though it is possible to use floating point representations. The floating point representation is natural to evolution strategies and evolutionary programming. 
The notion of real-valued genetic algorithms has been offered but is really a misnomer because it does not really represent the building block theory that was proposed by John Henry Holland in the 1970s. This theory is not without support though, based on theoretical and experimental results (see below). The basic algorithm performs crossover and mutation at the bit level. Other variants treat the chromosome as a list of numbers which are indexes into an instruction table, nodes in a linked list, hashes, objects, or any other imaginable data structure. Crossover and mutation are performed so as to respect data element boundaries. For most data types, specific variation operators can be designed. Different chromosomal data types seem to work better or worse for different specific problem domains. When bit-string representations of integers are used, Gray coding is often employed. In this way, small changes in the integer can be readily effected through mutations or crossovers. This has been found to help prevent premature convergence at so called Hamming walls, in which too many simultaneous mutations (or crossover events) must occur in order to change the chromosome to a better solution. Other approaches involve using arrays of real-valued numbers instead of bit strings to represent chromosomes. Theoretically, the smaller the alphabet, the better the performance, but paradoxically, good results have been obtained from using real-valued chromosomes. 69 Genetic algorithm A very successful (slight) variant of the general process of constructing a new population is to allow some of the better organisms from the current generation to carry over to the next, unaltered. This strategy is known as elitist selection. Parallel implementations of genetic algorithms come in two flavours. Coarse-grained parallel genetic algorithms assume a population on each of the computer nodes and migration of individuals among the nodes. Fine-grained parallel genetic algorithms assume an individual on each processor node which acts with neighboring individuals for selection and reproduction. Other variants, like genetic algorithms for online optimization problems, introduce time-dependence or noise in the fitness function. Genetic algorithms with adaptive parameters (adaptive genetic algorithms, AGAs) is another significant and promising variant of genetic algorithms. The probabilities of crossover (pc) and mutation (pm) greatly determine the degree of solution accuracy and the convergence speed that genetic algorithms can obtain. Instead of using fixed values of pc and pm, AGAs utilize the population information in each generation and adaptively adjust the pc and pm in order to maintain the population diversity as well as to sustain the convergence capacity. In AGA (adaptive genetic algorithm),[6] the adjustment of pc and pm depends on the fitness values of the solutions. In CAGA (clustering-based adaptive genetic algorithm),[7] through the use of clustering analysis to judge the optimization states of the population, the adjustment of pc and pm depends on these optimization states. It can be quite effective to combine GA with other optimization methods. GA tends to be quite good at finding generally good global solutions, but quite inefficient at finding the last few mutations to find the absolute optimum. Other techniques (such as simple hill climbing) are quite efficient at finding absolute optimum in a limited region. 
Alternating GA and hill climbing can improve the efficiency of GA while overcoming the lack of robustness of hill climbing. This means that the rules of genetic variation may have a different meaning in the natural case. For instance – provided that steps are stored in consecutive order – crossing over may sum a number of steps from maternal DNA adding a number of steps from paternal DNA and so on. This is like adding vectors that more probably may follow a ridge in the phenotypic landscape. Thus, the efficiency of the process may be increased by many orders of magnitude. Moreover, the inversion operator has the opportunity to place steps in consecutive order or any other suitable order in favour of survival or efficiency. (See for instance [8] or example in travelling salesman problem, in particular the use of an edge recombination operator.) A variation, where the population as a whole is evolved rather than its individual members, is known as gene pool recombination. Linkage-learning A number of variations have been developed to attempt to improve performance of GAs on problems with a high degree of fitness epistasis, i.e. where the fitness of a solution consists of interacting subsets of its variables. Such algorithms aim to learn (before exploiting) these beneficial phenotypic interactions. As such, they are aligned with the Building Block Hypothesis in adaptively reducing disruptive recombination. Prominent examples of this approach include the mGA,[9] GEMGA[10] and LLGA.[11] 70 Genetic algorithm Problem domains Problems which appear to be particularly appropriate for solution by genetic algorithms include timetabling and scheduling problems, and many scheduling software packages are based on GAs. GAs have also been applied to engineering. Genetic algorithms are often applied as an approach to solve global optimization problems. As a general rule of thumb genetic algorithms might be useful in problem domains that have a complex fitness landscape as mixing, i.e., mutation in combination with crossover, is designed to move the population away from local optima that a traditional hill climbing algorithm might get stuck in. Observe that commonly used crossover operators cannot change any uniform population. Mutation alone can provide ergodicity of the overall genetic algorithm process (seen as a Markov chain). Examples of problems solved by genetic algorithms include: mirrors designed to funnel sunlight to a solar collector, antennae designed to pick up radio signals in space, and walking methods for computer figures. Many of their solutions have been highly effective, unlike anything a human engineer would have produced, and inscrutable as to how they arrived at that solution. History Computer simulations of evolution started as early as in 1954 with the work of Nils Aall Barricelli, who was using the computer at the Institute for Advanced Study in Princeton, New Jersey.[12][13] His 1954 publication was not widely noticed. Starting in 1957,[14] the Australian quantitative geneticist Alex Fraser published a series of papers on simulation of artificial selection of organisms with multiple loci controlling a measurable trait. From these beginnings, computer simulation of evolution by biologists became more common in the early 1960s, and the methods were described in books by Fraser and Burnell (1970)[15] and Crosby (1973).[16] Fraser's simulations included all of the essential elements of modern genetic algorithms. 
In addition, Hans-Joachim Bremermann published a series of papers in the 1960s that also adopted a population of solution to optimization problems, undergoing recombination, mutation, and selection. Bremermann's research also included the elements of modern genetic algorithms.[17] Other noteworthy early pioneers include Richard Friedberg, George Friedman, and Michael Conrad. Many early papers are reprinted by Fogel (1998).[18] Although Barricelli, in work he reported in 1963, had simulated the evolution of ability to play a simple game,[19] artificial evolution became a widely recognized optimization method as a result of the work of Ingo Rechenberg and Hans-Paul Schwefel in the 1960s and early 1970s – Rechenberg's group was able to solve complex engineering problems through evolution strategies.[20][21][22][23] Another approach was the evolutionary programming technique of Lawrence J. Fogel, which was proposed for generating artificial intelligence. Evolutionary programming originally used finite state machines for predicting environments, and used variation and selection to optimize the predictive logics. Genetic algorithms in particular became popular through the work of John Holland in the early 1970s, and particularly his book Adaptation in Natural and Artificial Systems (1975). His work originated with studies of cellular automata, conducted by Holland and his students at the University of Michigan. Holland introduced a formalized framework for predicting the quality of the next generation, known as Holland's Schema Theorem. Research in GAs remained largely theoretical until the mid-1980s, when The First International Conference on Genetic Algorithms was held in Pittsburgh, Pennsylvania. As academic interest grew, the dramatic increase in desktop computational power allowed for practical application of the new technique. In the late 1980s, General Electric started selling the world's first genetic algorithm product, a mainframe-based toolkit designed for industrial processes. In 1989, Axcelis, Inc. released Evolver, the world's first commercial GA product for desktop computers. The New York Times technology writer John Markoff wrote[24] about Evolver in 1990. 71 Genetic algorithm Related techniques Parent fields Genetic algorithms are a sub-field of: • Evolutionary algorithms • Evolutionary computing • Metaheuristics • Stochastic optimization • Optimization Related fields Evolutionary algorithms Evolutionary algorithms is a sub-field of evolutionary computing. • Evolution strategies (ES, see Rechenberg, 1994) evolve individuals by means of mutation and intermediate or discrete recombination. ES algorithms are designed particularly to solve problems in the real-value domain. They use self-adaptation to adjust control parameters of the search. De-randomization of self-adaptation has led to the contemporary Covariance Matrix Adaptation Evolution Strategy (CMA-ES). • Evolutionary programming (EP) involves populations of solutions with primarily mutation and selection and arbitrary representations. They use self-adaptation to adjust parameters, and can include other variation operations such as combining information from multiple parents. • Genetic programming (GP) is a related technique popularized by John Koza in which computer programs, rather than function parameters, are optimized. Genetic programming often uses tree-based internal data structures to represent the computer programs for adaptation instead of the list structures typical of genetic algorithms. 
• Grouping genetic algorithm (GGA) is an evolution of the GA where the focus is shifted from individual items, like in classical GAs, to groups or subset of items.[25] The idea behind this GA evolution proposed by Emanuel Falkenauer is that solving some complex problems, a.k.a. clustering or partitioning problems where a set of items must be split into disjoint group of items in an optimal way, would better be achieved by making characteristics of the groups of items equivalent to genes. These kind of problems include bin packing, line balancing, clustering with respect to a distance measure, equal piles, etc., on which classic GAs proved to perform poorly. Making genes equivalent to groups implies chromosomes that are in general of variable length, and special genetic operators that manipulate whole groups of items. For bin packing in particular, a GGA hybridized with the Dominance Criterion of Martello and Toth, is arguably the best technique to date. • Interactive evolutionary algorithms are evolutionary algorithms that use human evaluation. They are usually applied to domains where it is hard to design a computational fitness function, for example, evolving images, music, artistic designs and forms to fit users' aesthetic preference. 72 Genetic algorithm Swarm intelligence Swarm intelligence is a sub-field of evolutionary computing. • Ant colony optimization (ACO) uses many ants (or agents) to traverse the solution space and find locally productive areas. While usually inferior to genetic algorithms and other forms of local search, it is able to produce results in problems where no global or up-to-date perspective can be obtained, and thus the other methods cannot be applied. • Particle swarm optimization (PSO) is a computational method for multi-parameter optimization which also uses population-based approach. A population (swarm) of candidate solutions (particles) moves in the search space, and the movement of the particles is influenced both by their own best known position and swarm's global best known position. Like genetic algorithms, the PSO method depends on information sharing among population members. In some problems the PSO is often more computationally efficient than the GAs, especially in unconstrained problems with continuous variables.[26] • Intelligent Water Drops or the IWD algorithm [27] is a nature-inspired optimization algorithm inspired from natural water drops which change their environment to find the near optimal or optimal path to their destination. The memory is the river's bed and what is modified by the water drops is the amount of soil on the river's bed. Other evolutionary computing algorithms Evolutionary computation is a sub-field of the metaheuristic methods. • Harmony search (HS) is an algorithm mimicking musicians' behaviours in the process of improvisation. • Memetic algorithm (MA), also called hybrid genetic algorithm among others, is a relatively new evolutionary method where local search is applied during the evolutionary cycle. The idea of memetic algorithms comes from memes, which unlike genes, can adapt themselves. In some problem areas they are shown to be more efficient than traditional evolutionary algorithms. • Bacteriologic algorithms (BA) inspired by evolutionary ecology and, more particularly, bacteriologic adaptation. Evolutionary ecology is the study of living organisms in the context of their environment, with the aim of discovering how they adapt. 
Its basic concept is that in a heterogeneous environment, you can't find one individual that fits the whole environment. So, you need to reason at the population level. It is also believed BAs could be successfully applied to complex positioning problems (antennas for cell phones, urban planning, and so on) or data mining.[28] • Cultural algorithm (CA) consists of the population component almost identical to that of the genetic algorithm and, in addition, a knowledge component called the belief space. • Gaussian adaptation (normal or natural adaptation, abbreviated NA to avoid confusion with GA) is intended for the maximisation of manufacturing yield of signal processing systems. It may also be used for ordinary parametric optimisation. It relies on a certain theorem valid for all regions of acceptability and all Gaussian distributions. The efficiency of NA relies on information theory and a certain theorem of efficiency. Its efficiency is defined as information divided by the work needed to get the information.[29] Because NA maximises mean fitness rather than the fitness of the individual, the landscape is smoothed such that valleys between peaks may disappear. Therefore it has a certain “ambition” to avoid local peaks in the fitness landscape. NA is also good at climbing sharp crests by adaptation of the moment matrix, because NA may maximise the disorder (average information) of the Gaussian simultaneously keeping the mean fitness constant. 73 Genetic algorithm Other metaheuristic methods Metaheuristic methods broadly fall within stochastic optimisation methods. • Simulated annealing (SA) is a related global optimization technique that traverses the search space by testing random mutations on an individual solution. A mutation that increases fitness is always accepted. A mutation that lowers fitness is accepted probabilistically based on the difference in fitness and a decreasing temperature parameter. In SA parlance, one speaks of seeking the lowest energy instead of the maximum fitness. SA can also be used within a standard GA algorithm by starting with a relatively high rate of mutation and decreasing it over time along a given schedule. • Tabu search (TS) is similar to simulated annealing in that both traverse the solution space by testing mutations of an individual solution. While simulated annealing generates only one mutated solution, tabu search generates many mutated solutions and moves to the solution with the lowest energy of those generated. In order to prevent cycling and encourage greater movement through the solution space, a tabu list is maintained of partial or complete solutions. It is forbidden to move to a solution that contains elements of the tabu list, which is updated as the solution traverses the solution space. • Extremal optimization (EO) Unlike GAs, which work with a population of candidate solutions, EO evolves a single solution and makes local modifications to the worst components. This requires that a suitable representation be selected which permits individual solution components to be assigned a quality measure ("fitness"). The governing principle behind this algorithm is that of emergent improvement through selectively removing low-quality components and replacing them with a randomly selected component. This is decidedly at odds with a GA that selects good solutions in an attempt to make better solutions. Other stochastic optimisation methods • The cross-entropy (CE) method generates candidates solutions via a parameterized probability distribution. 
The parameters are updated via cross-entropy minimization, so as to generate better samples in the next iteration. • Reactive search optimization (RSO) advocates the integration of sub-symbolic machine learning techniques into search heuristics for solving complex optimization problems. The word reactive hints at a ready response to events during the search through an internal online feedback loop for the self-tuning of critical parameters. Methodologies of interest for Reactive Search include machine learning and statistics, in particular reinforcement learning, active or query learning, neural networks, and meta-heuristics. References [1] Eiben, A. E. et al (1994). "Genetic algorithms with multi-parent recombination". PPSN III: Proceedings of the International Conference on Evolutionary Computation. The Third Conference on Parallel Problem Solving from Nature: 78–87. ISBN 3-540-58484-6. [2] Ting, Chuan-Kang (2005). "On the Mean Convergence Time of Multi-parent Genetic Algorithms Without Selection". Advances in Artificial Life: 403–412. ISBN 978-3-540-28848-0. [3] Akbari, Ziarati (2010). "A multilevel evolutionary algorithm for optimizing numerical functions" IJIEC 2 (2011): 419–430 (http:/ / growingscience. com/ ijiec/ Vol2/ IJIEC_2010_11. pdf) [4] Goldberg, David E. (1989). Genetic Algorithms in Search Optimization and Machine Learning. Addison Wesley. p. 41. ISBN 0-201-15767-5. [5] Wolpert, D.H., Macready, W.G., 1995. No Free Lunch Theorems for Optimisation. Santa Fe Institute, SFI-TR-05-010, Santa Fe. [6] Srinivas. M and Patnaik. L, "Adaptive probabilities of crossover and mutation in genetic algorithms," IEEE Transactions on System, Man and Cybernetics, vol.24, no.4, pp.656–667, 1994. (http:/ / ieeexplore. ieee. org/ xpls/ abs_all. jsp?arnumber=286385) [7] ZHANG. J, Chung. H and Lo. W. L, “Clustering-Based Adaptive Crossover and Mutation Probabilities for Genetic Algorithms”, IEEE Transactions on Evolutionary Computation vol.11, no.3, pp. 326–335, 2007. (http:/ / ieeexplore. ieee. org/ xpls/ abs_all. jsp?arnumber=4220690) [8] Evolution-in-a-nutshell (http:/ / web. telia. com/ ~u91131915/ traveller. htm) [9] D.E. Goldberg, B. Korb, and K. Deb. "Messy genetic algorithms: Motivation, analysis, and first results". Complex Systems, 5(3):493–530, October 1989. (http:/ / www. complex-systems. com/ issues/ 03-5. html) [10] Gene expression: The missing link in evolutionary computation [11] G. Harik. Learning linkage to efficiently solve problems of bounded difficulty using genetic algorithms. PhD thesis, Dept. Computer Science, University of Michigan, Ann Arbour, 1997 (http:/ / portal. acm. org/ citation. cfm?id=269517) 74 Genetic algorithm [12] Barricelli, Nils Aall (1954). "Esempi numerici di processi di evoluzione". Methodos: 45–68. [13] Barricelli, Nils Aall (1957). "Symbiogenetic evolution processes realized by artificial methods". Methodos: 143–182. [14] Fraser, Alex (1957). "Simulation of genetic systems by automatic digital computers. I. Introduction". Aust. J. Biol. Sci. 10: 484–491. [15] Fraser, Alex; Donald Burnell (1970). Computer Models in Genetics. New York: McGraw-Hill. ISBN 0-07-021904-4. [16] Crosby, Jack L. (1973). Computer Simulation in Genetics. London: John Wiley & Sons. ISBN 0-471-18880-8. [17] 02.27.96 - UC Berkeley's Hans Bremermann, professor emeritus and pioneer in mathematical biology, has died at 69 (http:/ / berkeley. edu/ news/ media/ releases/ 96legacy/ releases. 96/ 14319. html) [18] Fogel, David B. (editor) (1998). Evolutionary Computation: The Fossil Record. 
New York: IEEE Press. ISBN 0-7803-3481-7. [19] Barricelli, Nils Aall (1963). "Numerical testing of evolution theories. Part II. Preliminary tests of performance, symbiogenesis and terrestrial life". Acta Biotheoretica (16): 99–126. [20] Rechenberg, Ingo (1973). Evolutionsstrategie. Stuttgart: Holzmann-Froboog. ISBN 3-7728-0373-3. [21] Schwefel, Hans-Paul (1974). Numerische Optimierung von Computer-Modellen (PhD thesis). [22] Schwefel, Hans-Paul (1977). Numerische Optimierung von Computor-Modellen mittels der Evolutionsstrategie : mit einer vergleichenden Einführung in die Hill-Climbing- und Zufallsstrategie. Basel; Stuttgart: Birkhäuser. ISBN 3-7643-0876-1. [23] Schwefel, Hans-Paul (1981). Numerical optimization of computer models (Translation of 1977 Numerische Optimierung von Computor-Modellen mittels der Evolutionsstrategie. Chichester ; New York: Wiley. ISBN 0-471-09988-0. [24] Markoff, John (1990-08-29). "What's the Best Answer? It's Survival of the Fittest" (http:/ / www. nytimes. com/ 1990/ 08/ 29/ business/ business-technology-what-s-the-best-answer-it-s-survival-of-the-fittest. html). New York Times. . Retrieved 2009-08-09. [25] Falkenauer, Emanuel (1997). Genetic Algorithms and Grouping Problems. Chichester, England: John Wiley & Sons Ltd. ISBN 978-0-471-97150-4. [26] Rania Hassan, Babak Cohanim, Olivier de Weck, Gerhard Vente r (2005) A comparison of particle swarm optimization and the genetic algorithm (http:/ / www. mit. edu/ ~deweck/ PDF_archive/ 3 Refereed Conference/ 3_50_AIAA-2005-1897. pdf) [27] Hamed Shah-Hosseini, The intelligent water drops algorithm: a nature-inspired swarm-based optimization algorithm, International Journal of Bio-Inspired Computation (IJBIC), vol. 1, no. ½, 2009, (http:/ / inderscience. metapress. com/ media/ g3t6qnluqp0uc9j3kg0v/ contributions/ a/ 4/ 0/ 6/ a4065612210t6130. pdf) [28] Baudry, Benoit; Franck Fleurey, Jean-Marc Jézéquel, and Yves Le Traon (March/April 2005). "Automatic Test Case Optimization: A Bacteriologic Algorithm" (http:/ / www. irisa. fr/ triskell/ publis/ 2005/ Baudry05d. pdf) (PDF). IEEE Software (IEEE Computer Society) 22 (2): 76–82. doi:10.1109/MS.2005.30. . Retrieved 2009-08-09. [29] Kjellström, G. (December 1991). "On the Efficiency of Gaussian Adaptation". Journal of Optimization Theory and Applications 71 (3): 589–597. doi:10.1007/BF00941405. Bibliography • Banzhaf, Wolfgang; Nordin, Peter; Keller, Robert; Francone, Frank (1998) Genetic Programming – An Introduction, Morgan Kaufmann, San Francisco, CA. • Bies, Robert R; Muldoon, Matthew F; Pollock, Bruce G; Manuck, Steven; Smith, Gwenn and Sale, Mark E (2006). "A Genetic Algorithm-Based, Hybrid Machine Learning Approach to Model Selection". Journal of Pharmacokinetics and Pharmacodynamics (Netherlands: Springer): 196–221. • Cha, Sung-Hyuk; Tappert, Charles C (2009). "A Genetic Algorithm for Constructing Compact Binary Decision Trees" (http://www.jprr.org/index.php/jprr/article/view/44/25). Journal of Pattern Recognition Research (http://www.jprr.org/index.php/jprr) 4 (1): 1–13. • Fraser, Alex S. (1957). "Simulation of Genetic Systems by Automatic Digital Computers. I. Introduction". Australian Journal of Biological Sciences 10: 484–491. • Goldberg, David E (1989), Genetic Algorithms in Search, Optimization and Machine Learning, Kluwer Academic Publishers, Boston, MA. • Goldberg, David E (2002), The Design of Innovation: Lessons from and for Competent Genetic Algorithms, Addison-Wesley, Reading, MA. 
• Fogel, David B (2006), Evolutionary Computation: Toward a New Philosophy of Machine Intelligence, IEEE Press, Piscataway, NJ. Third Edition • Holland, John H (1975), Adaptation in Natural and Artificial Systems, University of Michigan Press, Ann Arbor • Koza, John (1992), Genetic Programming: On the Programming of Computers by Means of Natural Selection, MIT Press. ISBN 0-262-11170-5 • Michalewicz, Zbigniew (1999), Genetic Algorithms + Data Structures = Evolution Programs, Springer-Verlag. • Mitchell, Melanie, (1996), An Introduction to Genetic Algorithms, MIT Press, Cambridge, MA. 75 Genetic algorithm • Poli, R., Langdon, W. B., McPhee, N. F. (2008). A Field Guide to Genetic Programming. Lulu.com, freely available from the internet. ISBN 978-1-4092-0073-4. • Rechenberg, Ingo (1994): Evolutionsstrategie '94, Stuttgart: Fromman-Holzboog. • Schmitt, Lothar M; Nehaniv, Chrystopher L; Fujii, Robert H (1998), Linear analysis of genetic algorithms, Theoretical Computer Science 208: 111–148 • Schmitt, Lothar M (2001), Theory of Genetic Algorithms, Theoretical Computer Science 259: 1–61 • Schmitt, Lothar M (2004), Theory of Genetic Algorithms II: models for genetic operators over the string-tensor representation of populations and convergence to global optima for arbitrary fitness function under scaling, Theoretical Computer Science 310: 181–231 • Schwefel, Hans-Paul (1974): Numerische Optimierung von Computer-Modellen (PhD thesis). Reprinted by Birkhäuser (1977). • Vose, Michael D (1999), The Simple Genetic Algorithm: Foundations and Theory, MIT Press, Cambridge, MA. • Whitley, D. (1994). A genetic algorithm tutorial. Statistics and Computing 4, 65–85. • Hingston,Philip F.; Barone, Luigi C.; Michalewicz, Zbigniew (2008) Design by Evolution: Advances in Evolutionary Design:297 • Eiben,Agoston E.; Smith, James E. (2003) Introduction to Evolutionary Computing External links Resources • DigitalBiology.NET (http://www.digitalbiology.net/) Vertical search engine for GA/GP resources • Genetic Algorithms Index (http://www.geneticprogramming.com/ga/index.htm) The site Genetic Programming Notebook provides a structured resource pointer to web pages in genetic algorithms field Tutorials • Genetic Algorithms Computer programs that "evolve" in ways that resemble natural selection can solve complex problems even their creators do not fully understand (http://www2.econ.iastate.edu/tesfatsi/holland.gaintro. htm) An excellent introduction to GA by John Holland and with an application to the Prisoner's Dilemma • An online interactive GA demonstrator to practise or learn how a GA works. (http://userweb.elec.gla.ac.uk/y/ yunli/ga_demo/) Learn step by step or watch global convergence in batch, change population size, crossover rate, mutation rate and selection mechanism, and add constraints. • A Genetic Algorithm Tutorial by Darrell Whitley Computer Science Department Colorado State University (http:/ /samizdat.mines.edu/ga_tutorial/ga_tutorial.ps) An excellent tutorial with lots of theory • "Essentials of Metaheuristics" (http://cs.gmu.edu/~sean/book/metaheuristics/), 2009 (225 p). Free open text by Sean Luke. • Global Optimization Algorithms – Theory and Application (http://www.it-weise.de/projects/book.pdf) • "Demystifying Genetic Algorithms" (http://www.leolol.com/drupal/tutorials/theory/ genetic-algorithms-tutorial-part-1-computer-theory) Tutorial on how Genetic Algorithms work, with examples. 76 Genetic algorithm 77 Examples • Introduction to Genetic Algorithms with interactive Java applets. 
(http://www.obitko.com/tutorials/ genetic-algorithms/) For experimenting with GAs online. • Cross discipline example applications for GAs with references. (http://www.talkorigins.org/faqs/genalg/ genalg.html) • An interactive applet featuring evolving vehicles. (http://boxcar2d.com/) Toy block Toy blocks (also building bricks, building blocks, or simply blocks), are wooden, plastic or foam pieces of various shapes (square, cylinder, arch, triangle, etc.) and colors that are used as building toys. Sometimes toy blocks depict letters of the alphabet. A set of blocks History 1693: One of the first references to Alphabet Nursery Blocks was made by English philosopher John Locke, in 1693, made the statement that "dice and playthings, with letters on them to teach children the alphabet by playing" would make learning to read a more enjoyable experience.[1] Baby at Play, by Thomas Eakins, 1876. 1798: Witold Rybczynski has found that the earliest mention of building bricks for children appears in Maria and R.L. Edgeworth's Practical Education (1798). Called "rational toys," blocks were intended to teach children about gravity and physics, as well as spatial relationships that allow them to see how many different parts become a whole.[2] 1820: The first large-scale production of blocks was in the Williamsburg area of Brooklyn by S. L. Hill, who patented "ornamenting wood" a patent related to painting or coloring a block surface prior to the embossing process and then adding another color after the embossing to have multi-colored blocks.[3] 1850: During the mid-nineteenth century, Henry Cole (under the pseudonym of Felix Summerly) wrote a series of children’s books. Cole's A book of stories from The Home Treasury included a box of terracotta toy blocks and, in the accompanying pamphlet "Architectural Pastime.", actual blueprints. Toy block 2003: National Toy Hall of Fame at the Strong Museum, inducted ABC blocks into their collection, granting it the title of one of America's toys of national significance.[4] Educational benefits • Physical benefits: toy blocks build strength in a child’s fingers and hands, and improve eye-hand coordination. They also help educate children in different shapes. • Social benefits: block play encourages children to make friends and cooperate, and is often one of the first experiences a child has playing with others. Blocks are a benefit for the children because they encourage interaction and imagination. Creativity can be a combined action that is important for social play. • Intellectual benefits: children can potentially develop their vocabularies as they learn to describe sizes, shapes, and positions. Math skills are developed through the process of grouping, adding, and subtracting, particularly with standardized blocks, such as unit blocks. Experiences with gravity, balance, and geometry learned from toy blocks also provide intellectual stimulation. • Creative benefits: children receive creative stimulation by making their own designs with blocks. In popular culture Art Clokey, the creator of Gumby, has stated that Gumby's nemeses, the Block-heads, evolved from the blocks that appeared in the toy store that originally provided the setting for the stop-motion series.[5] References [1] "The History of Alphabet Blocks" (http:/ / www. nuttybug. com/ index. asp?PageAction=VIEWPROD& ProdID=988). Nuttybug. . Retrieved 2008-02-14. [2] Witold Rybczynski, Looking Around: A Journey Through Architecture, 2006 [3] "The History of Alphabet Blocks" (http:/ / www. nuttybug. 
com/ index. asp?PageAction=VIEWPROD& ProdID=988). Nuttybug. Retrieved 2008-02-14. [4] "The History of Alphabet Blocks" (http:/ / www. nuttybug. com/ index. asp?PageAction=VIEWPROD& ProdID=988). Nuttybug. Retrieved 2008-02-14. [5] gumbyworld.com (http:/ / www. gumbyworld. com/ memorylane/ histblkhd. htm)
• Block play: Building a child's mind (http://www.woodentoy.com/html/BlocksGoodToy.html), the National Association for the Education of Young Children

Chromosome (genetic algorithm)
In genetic algorithms, a chromosome (also sometimes called a genome) is a set of parameters which define a proposed solution to the problem that the genetic algorithm is trying to solve. The chromosome is often represented as a simple string, although a wide variety of other data structures are also used.
Chromosome design
The design of the chromosome and its parameters is by necessity specific to the problem to be solved. To give a trivial example, suppose the problem is to find the integer value of x between 0 and 255 that provides the maximal result for a given function f(x). (This isn't the type of problem that is normally solved by a genetic algorithm, since it can be trivially solved using numeric methods. It is only used to serve as a simple example.) Our possible solutions are the integers from 0 to 255, which can all be represented as 8-digit binary strings. Thus, we might use an 8-digit binary string as our chromosome. If a given solution in the population represents the value 155, its chromosome would be 10011011. A more realistic problem we might wish to solve is the travelling salesman problem. In this problem, we seek an ordered list of cities that results in the shortest trip for the salesman to travel. Suppose there are six cities, which we'll call A, B, C, D, E, and F. A good design for our chromosome might be the ordered list of cities we want to try. An example chromosome we might encounter in the population might be DFABEC. The mutation operator and crossover operator employed by the genetic algorithm must take the chromosome's design into account.

Genetic operator
A genetic operator is an operator used in genetic algorithms to maintain genetic diversity (mutation) and to combine existing solutions into others (crossover). The main difference between them is that the mutation operators operate on one chromosome, that is, they are unary, while the crossover operators are binary operators. Genetic variation is a necessity for the process of evolution. Genetic operators used in genetic algorithms are analogous to those in the natural world: survival of the fittest, or selection; reproduction (crossover, also called recombination); and mutation.
Types of Operators
1. Mutation (genetic algorithm)
2. Crossover (genetic algorithm)

Crossover (genetic algorithm)
In genetic algorithms, crossover is a genetic operator used to vary the programming of a chromosome or chromosomes from one generation to the next. It is analogous to reproduction and biological crossover, upon which genetic algorithms are based. Crossover is the process of taking more than one parent solution and producing a child solution from them. There are methods for selection of the chromosomes; these are also given below.
Methods of selection of chromosomes for crossover
• Roulette wheel selection (SCX) [1] It is also known as fitness proportionate selection.
The individual is selected on the basis of fitness. The probability of an individual to be selected increases with the fitness of the individual greater or less than its competitor's fitness. • Boltzmann selection • Tournament selection • Rank selection • Steady state selection Crossover techniques Many crossover techniques exist for organisms which use different data structures to store themselves. One-point crossover A single crossover point on both parents' organism strings is selected. All data beyond that point in either organism string is swapped between the two parent organisms. The resulting organisms are the children: Two-point crossover Two-point crossover calls for two points to be selected on the parent organism strings. Everything between the two points is swapped between the parent organisms, rendering two child organisms: "Cut and splice" Another crossover variant, the "cut and splice" approach, results in a change in length of the children strings. The reason for this difference is that each parent string has a separate choice of crossover point. 80 Crossover (genetic algorithm) Uniform Crossover and Half Uniform Crossover The Uniform Crossover uses a fixed mixing ratio between two parents. Unlike one- and two-point crossover, the Uniform Crossover enables the parent chromosomes to contribute the gene level rather than the segment level. If the mixing ratio is 0.5, the offspring has approximately half of the genes from first parent and the other half from second parent, although cross over points can be randomly chosen as seen below The Uniform Crossover evaluates each bit in the parent strings for exchange with a probability of 0.5. Even though the uniform crossover is a poor method, empirical evidence suggest that it is a more exploratory approach to crossover than the traditional exploitative approach that maintains longer schemata. This results in a more complete search of the design space with maintaining the exchange of good information. Unfortunately, no satisfactory theory exists to explain the discrepancies between the Uniform Crossover and the traditional approaches. [2] In the uniform crossover scheme (UX) individual bits in the string are compared between two parents. The bits are swapped with a fixed probability, typically 0.5. In the half uniform crossover scheme (HUX), exactly half of the nonmatching bits are swapped. Thus first the Hamming distance (the number of differing bits) is calculated. This number is divided by two. The resulting number is how many of the bits that do not match between the two parents will be swapped. Three parent crossover In this technique, the child is derived from three parents. They are randomly chosen. Each bit of first parent is checked with bit of second parent whether they are same. If same then the bit is taken for the offspring otherwise the bit from the third parent is taken for the offspring. parent1 1 1 0 1 0 0 0 1 0 parent2 0 1 1 0 0 1 0 0 1 parent3 1 1 0 1 1 0 1 0 1 offspring 1 1 0 1 0 0 0 0 1[3] Crossover for Ordered Chromosomes Depending on how the chromosome represents the solution, a direct swap may not be possible. One such case is when the chromosome is an ordered list, such as an ordered list of the cities to be travelled for the traveling salesman problem. There are many crossover methods for ordered chromosomes. 
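Before turning to those ordered-chromosome operators, the bit-string techniques described above can be made concrete with a short sketch. The following is a minimal, illustrative example (the class and method names are assumptions, not taken from any particular library) of one-point and uniform crossover on boolean chromosomes; applying either operator directly to an ordered chromosome such as a city list would in general duplicate or drop cities, which is exactly why the repair-based and order-preserving methods discussed next are needed.

import java.util.Random;

// Illustrative sketch only: one-point and uniform crossover on bit-string chromosomes.
public class CrossoverSketch {
    private static final Random rand = new Random();

    // One-point crossover: pick a cut point and swap everything after it.
    static boolean[][] onePoint(boolean[] p1, boolean[] p2) {
        int cut = 1 + rand.nextInt(p1.length - 1); // cut strictly inside the string
        boolean[] c1 = p1.clone(), c2 = p2.clone();
        for (int i = cut; i < p1.length; i++) { c1[i] = p2[i]; c2[i] = p1[i]; }
        return new boolean[][] { c1, c2 };
    }

    // Uniform crossover: each bit is exchanged independently with probability 0.5.
    static boolean[][] uniform(boolean[] p1, boolean[] p2) {
        boolean[] c1 = p1.clone(), c2 = p2.clone();
        for (int i = 0; i < p1.length; i++) {
            if (rand.nextBoolean()) { c1[i] = p2[i]; c2[i] = p1[i]; }
        }
        return new boolean[][] { c1, c2 };
    }
}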
The already mentioned N-point crossover can be applied for ordered chromosomes also, but this always need a corresponding repair process, actually, some ordered crossover methods are derived from the idea. However, sometimes a crossover of chromosomes produces recombinations which violate the constraint of ordering and thus need to be repaired. Several examples for crossover operators (also mutation operator) preserving a given order are given in [4]: 1. partially matched crossover (PMX): In this method, two crossover points are selected at random and PMX proceeds by position wise exchanges. The two crossover points give matching selection. It affects cross by position-by-position exchange operations. In this method parents are mapped to each other, hence we can also call it partially mapped crossover.[5] 2. cycle crossover (CX): Beginning at any gene in parent 1, the -th gene in parent 2 becomes replaced by it. The same is repeated for the displaced gene until the gene which is equal to the first inserted gene becomes replaced (cycle). 3. order crossover operator (OX1): A portion of one parent is mapped to a portion of the other parent. From the replaced portion on, the rest is filled up by the remaining genes, where already present genes are omitted and the order is preserved. 81 Crossover (genetic algorithm) 4. 5. 6. 7. 8. order-based crossover operator (OX2) position-based crossover operator (POS) voting recombination crossover operator (VR) alternating-position crossover operator (AP) sequential constrictive crossover operator (SCX) [6] Other possible methods include the edge recombination operator and partially mapped crossover. Crossover biases For crossover operators which exchange contiguous sections of the chromosomes (e.g. k-point) the ordering of the variables may become important. This is particularly true when good solutions contain building blocks which might be disrupted by a non-respectful crossover operator. References • John Holland, Adaptation in Natural and Artificial Systems, University of Michigan Press, Ann Arbor, Michigan. 1975. ISBN 0-262-58111-6. • Larry J. Eshelman, The CHC Adaptive Search Algorithm: How to Have Safe Search When Engaging in Nontraditional Genetic Recombination, in Gregory J. E. Rawlins editor, Proceedings of the First Workshop on Foundations of Genetic Algorithms. pages 265-283. Morgan Kaufmann, 1991. ISBN 1-55860-170-8. • Tomasz D. Gwiazda, Genetic Algorithms Reference Vol.1 Crossover for single-objective numerical optimization problems, Tomasz Gwiazda, Lomianki, 2006. ISBN 83-923958-3-2. [1] <http://en.wikipedia.org/wiki/Fitness_proportionate_selection> [2] (eds.), P.K. Chawdhry ... (1998). Soft computing in engineering design and manufacturing (http:/ / books. google. com/ books?id=mxcP1mSjOlsC). London: Springer. pp. 164. ISBN 3540762140. . [3] Introduction to genetic algorithms By S. N. Sivanandam, S. N. Deepa [4] Pedro Larrañaga et al., "Learning Bayesian Network Structures by searching for the best ordering with genetic algorithms", IEEE Transactions on systems, man and cybernetics, Vol 26, No. 4, 1996 [5] Introduction to genetic algorithms By S. N. Sivanandam, S. N. Deepa [6] Ahmed, Zakir H. "Genetic Algorithm for the Traveling Salesman Problem Using Sequential Constructive Crossover Operator." International Journal of Biometric and Bioinformatics 3.6 (2010). Computer Science Journals. Web. <http://www.cscjournals.org/csc/manuscript/Journals/IJBB/volume3/Issue6/IJBB-41.pdf>. 
External links
• Newsgroup: comp.ai.genetic FAQ (http://www.faqs.org/faqs/ai-faq/genetic/part2/) - see section on crossover (also known as recombination).

Mutation (genetic algorithm)
In genetic algorithms of computing, mutation is a genetic operator used to maintain genetic diversity from one generation of a population of algorithm chromosomes to the next. It is analogous to biological mutation. Mutation alters one or more gene values in a chromosome from its initial state. In mutation, the solution may change entirely from the previous solution. Hence GA can come to a better solution by using mutation. Mutation occurs during evolution according to a user-definable mutation probability. This probability should be set low. If it is set too high, the search will turn into a primitive random search.
The classic example of a mutation operator involves a probability that an arbitrary bit in a genetic sequence will be changed from its original state. A common method of implementing the mutation operator involves generating a random variable for each bit in a sequence. This random variable tells whether or not a particular bit will be modified. This mutation procedure, based on the biological point mutation, is called single point mutation. Other types are inversion and floating point mutation. When the gene encoding is restrictive, as in permutation problems, mutations are swaps, inversions and scrambles.
The purpose of mutation in GAs is preserving and introducing diversity. Mutation should allow the algorithm to avoid local minima by preventing the population of chromosomes from becoming too similar to each other, thus slowing or even stopping evolution. This reasoning also explains the fact that most GA systems avoid only taking the fittest of the population in generating the next generation, but rather use a random (or semi-random) selection with a weighting toward those that are fitter.[1]
For different genome types, different mutation types are suitable:
• Bit string mutation The mutation of bit strings ensues through bit flips at random positions. Example:
1 0 1 0 0 1 0
↓
1 0 1 0 1 1 0
The probability of a mutation of a bit is 1/l, where l is the length of the binary vector. Thus, a mutation rate of 1 per mutation and individual selected for mutation is reached.
• Flip Bit This mutation operator takes the chosen genome and inverts the bits (i.e. if the genome bit is 1, it is changed to 0 and vice versa).
• Boundary This mutation operator replaces the genome with either the lower or upper bound randomly. This can be used for integer and float genes.
• Non-Uniform The probability that the amount of mutation will go to 0 with the next generation is increased by using a non-uniform mutation operator. It keeps the population from stagnating in the early stages of the evolution. It tunes the solution in later stages of evolution. This mutation operator can only be used for integer and float genes.
• Uniform This operator replaces the value of the chosen gene with a uniform random value selected between the user-specified upper and lower bounds for that gene. This mutation operator can only be used for integer and float genes.
• Gaussian This operator adds a unit Gaussian distributed random value to the chosen gene. If it falls outside of the user-specified lower or upper bounds for that gene, the new gene value is clipped. This mutation operator can only be used for integer and float genes.
References
[1] "XI. Crossover and Mutation" (http:/ / www. obitko.
com/ tutorials/ genetic-algorithms/ crossover-mutation. php). http:/ / www. obitko. com/ : Marek Obitko. . Retrieved 2011-04-07. Bibliography • John Holland, Adaptation in Natural and Artificial Systems, University of Michigan Press, Ann Arbor, Michigan. 1975. ISBN 0-262-58111-6. Inheritance (genetic algorithm) In genetic algorithms, inheritance is the ability of modelled objects to mate, mutate and propagate their problem solving genes to the next generation, in order to produce an evolved solution to a particular problem. Selection (genetic algorithm) Selection is the stage of a genetic algorithm in which individual genomes are chosen from a population for later breeding (recombination or crossover). A generic selection procedure may be implemented as follows: 1. The fitness function is evaluated for each individual, providing fitness values, which are then normalized. Normalization means dividing the fitness value of each individual by the sum of all fitness values, so that the sum of all resulting fitness values equals 1. 2. The population is sorted by descending fitness values. 3. Accumulated normalized fitness values are computed (the accumulated fitness value of an individual is the sum of its own fitness value plus the fitness values of all the previous individuals). The accumulated fitness of the last individual should be 1 (otherwise something went wrong in the normalization step). 4. A random number R between 0 and 1 is chosen. 5. The selected individual is the first one whose accumulated normalized value is greater than R. If this procedure is repeated until there are enough selected individuals, this selection method is called fitness proportionate selection or roulette-wheel selection. If instead of a single pointer spun multiple times, there are multiple, equally spaced pointers on a wheel that is spun once, it is called stochastic universal sampling. Repeatedly selecting the best individual of a randomly chosen subset is tournament selection. Taking the best half, third or another proportion of the individuals is truncation selection. There are other selection algorithms that do not consider all individuals for selection, but only those with a fitness value that is higher than a given (arbitrary) constant. Other algorithms select from a restricted pool where only a certain percentage of the individuals are allowed, based on fitness value. Retaining the best individuals in a generation unchanged in the next generation, is called elitism or elitist selection. It is a successful (slight) variant of the general process of constructing a new population. See the main article on genetic algorithms for the context in which selection is used. 84 Selection (genetic algorithm) See Also • • • • Fitness proportionate selection Stochastic universal sampling Tournament selection Reward-based selection External links • Introduction to Genetic Algorithms [1] References [1] http:/ / www. rennard. org/ alife/ english/ gavintrgb. html Tournament selection Tournament selection is a method of selecting an individual from a population of individuals in a genetic algorithm. Tournament selection involves running several "tournaments" among a few individuals chosen at random from the population. The winner of each tournament (the one with the best fitness) is selected for crossover. Selection pressure is easily adjusted by changing the tournament size. If the tournament size is larger, weak individuals have a smaller chance to be selected. 
Tournament selection pseudo code:
choose k (the tournament size) individuals from the population at random
choose the best individual from the pool/tournament with probability p
choose the second best individual with probability p*(1-p)
choose the third best individual with probability p*((1-p)^2)
and so on...
Deterministic tournament selection selects the best individual (when p=1) in any tournament. A 1-way tournament (k=1) selection is equivalent to random selection. The chosen individual can be removed from the population that the selection is made from if desired, otherwise individuals can be selected more than once for the next generation. Tournament selection has several benefits: it is efficient to code, works on parallel architectures and allows the selection pressure to be easily adjusted.
See Also
• Fitness proportionate selection
• Reward-based selection
External links
• "Genetic Algorithms, Tournament Selection, and the Effects of Noise" [1] by Brad L. Miller and David E. Goldberg (PDF link).
• "Tournament Selection in XCS" [2] by Martin V. Butz, Kumara Sastry and David E. Goldberg (PDF link).
References
[1] http:/ / citeseerx. ist. psu. edu/ viewdoc/ download;jsessionid=621DB995CF9017353A57518149E3CAA4?doi=10. 1. 1. 30. 6625& rep=rep1& type=pdf
[2] http:/ / citeseerx. ist. psu. edu/ viewdoc/ download?doi=10. 1. 1. 19. 1850& rep=rep1& type=pdf

Truncation selection
Truncation selection is a selection method used in genetic algorithms to select potential candidate solutions for recombination. In truncation selection the candidate solutions are ordered by fitness, and some proportion, p, (e.g. p=1/2, 1/3, etc.), of the fittest individuals are selected and reproduced 1/p times. Truncation selection is less sophisticated than many other selection methods, and is not often used in practice. It is used in Muhlenbein's Breeder Genetic Algorithm.[1]
References
[1] H Muhlenbein, D Schlierkamp-Voosen (1993). "Predictive Models for the Breeder Genetic Algorithm" (http:/ / citeseer. comp. nus. edu. sg/ rd/ 0,730860,1,0. 25,Download/ http:qSqqSqwww. ais. fraunhofer. deqSq%7EmuehlenqSqpublicationsqSqgmd_as_ga-93_01. ps). Evolutionary Computation.

Fitness proportionate selection
Fitness proportionate selection, also known as roulette-wheel selection, is a genetic operator used in genetic algorithms for selecting potentially useful solutions for recombination. In fitness proportionate selection, as in all selection methods, the fitness function assigns a fitness to possible solutions or chromosomes. (Figure: example of the selection of a single individual.) This fitness level is used to associate a probability of selection with each individual chromosome. If f_i is the fitness of individual i in the population, its probability of being selected is p_i = f_i / (f_1 + f_2 + ... + f_N), where N is the number of individuals in the population. This could be imagined similar to a roulette wheel in a casino. Usually a proportion of the wheel is assigned to each of the possible selections based on their fitness value. This could be achieved by dividing the fitness of a selection by the total fitness of all the selections, thereby normalizing them to 1. Then a random selection is made similar to how the roulette wheel is rotated. While candidate solutions with a higher fitness will be less likely to be eliminated, there is still a chance that they may be. Contrast this with a less sophisticated selection algorithm, such as truncation selection, which will eliminate a fixed percentage of the weakest candidates.
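A minimal sketch of the roulette-wheel draw just described, assuming non-negative fitness values (the class and method names are illustrative, not from any particular library):

import java.util.Random;

// Illustrative sketch only: fitness proportionate (roulette-wheel) selection.
public class RouletteWheelSketch {
    private static final Random rand = new Random();

    // Returns the index of the selected individual; fitness values must be non-negative.
    static int select(double[] fitness) {
        double total = 0.0;
        for (double f : fitness) total += f;      // size of the whole wheel
        double spin = rand.nextDouble() * total;  // where the ball lands
        double accumulated = 0.0;
        for (int i = 0; i < fitness.length; i++) {
            accumulated += fitness[i];            // pocket i ends at this accumulated value
            if (spin <= accumulated) return i;
        }
        return fitness.length - 1;                // guard against floating-point rounding
    }
}

Selecting N parents corresponds to N independent spins of this wheel; as noted below, the linear scan over the accumulated fitness can be replaced by a binary search over precomputed cumulative sums.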
With fitness proportionate selection there is a chance some weaker solutions may survive the selection process; this is an advantage, as though a solution may be weak, it may include some component which could prove useful following the recombination process. The analogy to a roulette wheel can be envisaged by imagining a roulette wheel in which each candidate solution represents a pocket on the wheel; the size of the pockets are proportionate to the probability of selection of the solution. Selecting N chromosomes from the population is equivalent to playing N games on the roulette wheel, as each candidate is drawn independently. Fitness proportionate selection 87 Other selection techniques, such as stochastic universal sampling[1] or tournament selection, are often used in practice. This is because they have less stochastic noise, or are fast, easy to implement and have a constant selection pressure [Blickle, 1996]. Note performance gains can be achieved by using a binary search rather than a linear search to find the right pocket. See Also • Stochastic universal sampling • Tournament selection • Reward-based selection External links • C implementation [2] (.tar.gz; see selector.cxx) WBL • Example on Roulette wheel selection [3] References [1] Bäck, Thomas, Evolutionary Algorithms in Theory and Practice (1996), p. 120, Oxford Univ. Press [2] http:/ / www. cs. ucl. ac. uk/ staff/ W. Langdon/ ftp/ gp-code/ GProc-1. 8b. tar. gz [3] http:/ / www. edc. ncl. ac. uk/ highlight/ rhjanuary2007g02. php/ Reward-based selection Reward-based selection is a technique used in evolutionary algorithms for selecting potentially useful solutions for recombination. The probability of being selected for an individual is proportional to the cumulative reward, obtained by the individual. The cumulative reward can be computed as a sum of the individual reward and the reward, inherited from parents. Description Reward-based selection can be used within Multi-armed bandit framework for Multi-objective optimization to obtain a better approximation of the Pareto front. [1] The newborn and its parents receive a reward , if was selected for new population , otherwise the reward is zero. Several reward definitions are possible: • 1. , if the newborn individual • 2. was selected for new population , where individual in the population of . is the rank of newly inserted individuals. Rank can be computed using a well-known non-dominated sorting [2] procedure. • 3. indicator contribution of the individual , where to the population . The reward is the hypervolume if the newly inserted individual improves the quality of the population, which is measured as its hypervolume contribution in the objective space. • 4. A relaxation of the above reward, involving a rank-based penalization for points for front: -th dominated Pareto Reward-based selection 88 Reward-based selection can quickly identify the most fruitful directions of search by maximizing the cumulative reward of individuals. References [1] Loshchilov, I.; M. Schoenauer and M. Sebag (2011). "Not all parents are equal for MO-CMA-ES" (http:/ / www. lri. fr/ ~ilya/ publications/ EMO2011_MOCMAselection. pdf). Evolutionary Multi-Criterion Optimization 2011 (EMO 2011). Springer Verlag, LNCS 6576. pp. 31-45. . [2] Deb, K.; Pratap, A.; Agarwal, S.; Meyarivan, T. (2002). "A fast and elitist multi-objective genetic algorithm: NSGA-II". IEEE Transactions on Evolutionary Computation 6 (2): 182–197. doi:10.1109/4235.996017. 
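To make the bookkeeping behind reward-based selection concrete, here is a minimal, hedged sketch: an individual's chance of being selected is proportional to its cumulative reward (its own reward plus the reward inherited from its parents), and a newborn together with its parents is credited when the newborn makes it into the new population. The names and the averaging rule for inherited reward are assumptions made for illustration; they are not taken from the cited work.

// Illustrative sketch only: cumulative-reward bookkeeping for reward-based selection.
public class RewardBookkeeping {
    static class Individual {
        double ownReward;        // reward earned directly by this individual
        double inheritedReward;  // reward passed down from its parents

        double cumulativeReward() { return ownReward + inheritedReward; }
    }

    // A newborn starts with no reward of its own; here it inherits the average of its
    // parents' cumulative rewards (one possible choice, assumed for this sketch).
    static Individual newborn(Individual mother, Individual father) {
        Individual child = new Individual();
        child.inheritedReward = 0.5 * (mother.cumulativeReward() + father.cumulativeReward());
        return child;
    }

    // If the newborn is selected into the new population, it and its parents receive the reward.
    static void credit(Individual child, Individual mother, Individual father, double reward) {
        child.ownReward += reward;
        mother.ownReward += reward;
        father.ownReward += reward;
    }
}

Selection itself is then a proportional draw over cumulativeReward(), in the same way as the roulette-wheel sketch shown earlier draws over fitness.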
Edge recombination operator The edge recombination operator (ERO) is an operator that creates a path that is similar to a set of existing paths (parents) by looking at the edges rather than the vertices. The main application of this is for crossover in genetic algorithms when a genotype with non-repeating gene sequences is needed such as for the travelling salesman problem. Algorithm ERO is based on an adjacency matrix, which lists the neighbors of each node in any parent. For example, in a travelling salesman problem such as the one depicted, the node map for the parents CABDEF and ABCEFD (see illustration) is generated by taking the first parent, say, 'ABCEFD' and recording its immediate neighbors, including those that roll around the end of the string. Therefore; ERO crossover ... -> [A] <-> [B] <-> [C] <-> [E] <-> [F] <-> [D] <- ... ...is converted into the following adjacency matrix by taking each node in turn, and listing its connected neighbors; A: B: C: D: E: F: B A B F C E D C E A F D With the same operation performed on the second parent (CABDEF), the following is produced: A: B: C: D: E: F: C A F B D E B D A E F C Edge recombination operator 89 Followed by making a union of these two lists, and ignoring any duplicates. This is as simple as taking the elements of each list and appending them to generate a list of unique link end points. In our example, generating this; A: B: C: D: E: F: B A A A C C C C B B D D D D E F E F F E = = = = = = {B,D} {A,C} {B,E} {F,A} {C,F} {E,D} ∪ {C,B} ∪ {A,D} ∪ {F,A} ∪ {B,E} ∪ {D,F} ∪ {E,C} The result is another adjacency matrix, which stores the links for a network described by all the links in the parents. Note that more than two parents can be employed here to give more diverse links. However, this approach may result in sub-optimal paths. Then, to create a path K, the following algorithm is employed: Let K be the empty list Let N be the first node of a random parent. While Length(K) < Length(Parent): K := K, N (append N to K) Remove N from all neighbor lists If N's neighbor list is non-empty then let N* be the neighbor of N with the fewest neighbors in its list (or a random one, should there be multiple) else let N* be a randomly chosen node that is not in K N := N* To step through the example, we randomly select a node from the parent starting points, {A, C}. • • • • • • () -> A. We remove A from all the neighbor sets, and find that the smallest of B, C and D is B={C,D}. AB. The smallest sets of C and D are C={E,F} and D={E,F}. We randomly select D. ABD. Smallest are E={C,F}, F={C,E}. We pick F. ABDF. C={E}, E={C}. We pick C. ABDFC. The smallest set is E={}. ABDFCE. The length of the child is now the same as the parent, so we are done. Note that the only edge introduced in ABDFCE is AE. Comparison with other operators If one were to use an indirect representation for these parents (where each number in turn indexes and removes an element from an initially sorted set of nodes) and cross them with simple one-point crossover, one would get the following: Indirect one-point crossover Edge recombination operator 90 The parents: 31|1111 (CABDEF) 11|1211 (ABCEFD) The children: 11|1111 (ABCDEF) 31|1211 (ABEDFC) Both children introduce the edges CD and FA. The reason why frequent edge introduction is a bad thing in these kinds of problem is that very few of the edges tend to be usable and many of them severely inhibit an otherwise good solution. 
The optimal route in the examples is ABDFEC, but swapping A for F turns it from optimal to far below an average random guess. The difference between ERO and the indirect one-point crossover can be seen in the diagram. It takes ERO 25 generations of 500 individuals to reach 80% of the optimal path in a 29 point data set, something the indirect representation spends 150 generations on. Partially mapped crossover (PMX) ranks between ERO and indirect one-point crossover, with 80 generations for this particular target.[1] References [1] The traveling salesman and sequence scheduling: quality solutions using genetic edge recombination ERO vs PMX vs Indirect one-point crossover Whitley, Darrell; Timothy Starkweather, D'Ann Fuquay (1989). "Scheduling problems and traveling salesman: The genetic edge recombination operator". International Conference on Genetic Algorithms. pp. 133–140. ISBN 1-55860-066-3. Implementations • "Edge Recombination Operator" (http://github.com/raunak/Travelling-Salesman-Problem/blob/master/ edge_recombination.py) (Python) Population-based incremental learning Population-based incremental learning In computer science and machine learning, population-based incremental learning (PBIL) is an optimization algorithm, and an estimation of distribution algorithm. This is a type of genetic algorithm where the genotype of an entire population (probability vector) is evolved rather than individual members[1]. The algorithm is proposed by Shumeet Baluja in 1994. The algorithm is simpler than a standard genetic algorithm, and in many cases leads to better results than a standard genetic algorithm[2][3][4]. Algorithm In PBIL, genes are represented as real values in the range [0,1], indicating the probability that any particular allele appears in that gene. The PBIL algorithm is as follows: 1. 2. 3. 4. A population is generated from the probability vector. The fitness of each member is evaluated and ranked. Update population genotype (probability vector) based on fittest individual. Mutate. 5. Repeat steps 1-4 Source code This is a part of source code implemented in Java. In the paper, learnRate = 0.1, negLearnRate = 0.075, mutProb = 0.02, and mutShift = 0.05 is used. N = 100 and ITER_COUNT = 1000 is enough for a small problem. public void optimize() { final int totalBits = getTotalBits(domains); final double[] probVec = new double[totalBits]; Arrays.fill(probVec, 0.5); bestCost = POSITIVE_INFINITY; for (int i = 0; i < ITER_COUNT; i++) { // Creates N genes final boolean[][] genes = new boolean[N][totalBits]; for (boolean[] gene : genes) { for (int k = 0; k < gene.length; k++) { if (rand.nextDouble() < probVec[k]) gene[k] = true; } } // Calculate costs final double[] costs = new double[N]; for (int j = 0; j < N; j++) { costs[j] = costFunc.cost(toRealVec(genes[j], domains)); } // Find min and max cost genes boolean[] minGene = null, maxGene = null; 91 Population-based incremental learning double minCost = POSITIVE_INFINITY, maxCost = NEGATIVE_INFINITY; for (int j = 0; j < N; j++) { double cost = costs[j]; if (minCost > cost) { minCost = cost; minGene = genes[j]; } if (maxCost < cost) { maxCost = cost; maxGene = genes[j]; } } // Compare with the best cost gene if (bestCost > minCost) { bestCost = minCost; bestGene = minGene; } // Update the probability vector with max and min cost genes for (int j = 0; j < totalBits; j++) { if (minGene[j] == maxGene[j]) { probVec[j] = probVec[j] * (1d - learnRate) + (minGene[j] ? 
1d : 0d) * learnRate; } else { final double learnRate2 = learnRate + negLearnRate; probVec[j] = probVec[j] * (1d - learnRate2) + (minGene[j] ? 1d : 0d) * learnRate2; } } // Mutation for (int j = 0; j < totalBits; j++) { if (rand.nextDouble() < mutProb) { probVec[j] = probVec[j] * (1d - mutShift) + (rand.nextBoolean() ? 1d : 0d) * mutShift; } } } } 92 Population-based incremental learning References [1] Karray, Fakhreddine O.; de Silva, Clarence (2004), Soft computing and intelligent systems design, Addison Wesley, ISBN 0-321-11617-8 [2] Baluja, Shumeet (1994), "Population-Based Incremental Learning: A Method for Integrating Genetic Search Based Function Optimization and Competitive Learning" (http:/ / citeseerx. ist. psu. edu/ viewdoc/ summary?doi=10. 1. 1. 61. 8554), Technical Report (Pittsburgh, PA: Carnegie Mellon University) (CMU–CS–94–163), [3] Baluja, Shumeet; Caruana, Rich (1995), Removing the Genetics from the Standard Genetic Algorithm (http:/ / citeseerx. ist. psu. edu/ viewdoc/ summary?doi=10. 1. 1. 44. 5424), Morgan Kaufmann Publishers, pp. 38–46, [4] Baluja, Shumeet (1995), An Empirical Comparison of Seven Iterative and Evolutionary Function Optimization Heuristics (http:/ / citeseerx. ist. psu. edu/ viewdoc/ summary?doi=10. 1. 1. 43. 1108), Defining length In genetic algorithms and genetic programming defining length L(H) is the maximum distance between two defining symbols (that is symbols that have a fixed value as opposed to symbols that can take any value, commonly denoted as # or *) in schema H. In tree GP schemata, L(H) is the number of links in the minimum tree fragment including all the non-= symbols within a schema H.[1] Example Schemata "00##0", "1###1", "01###", and "##0##" have defining lengths of 4, 4, 1, and 0, respectively. Lengths are computed by determining the last fixed position and subtracting from it the first fixed position. In genetic algorithms as the defining length of a solution increases so does the susceptibility of the solution to disruption due to mutation or cross-over. References [1] "Foundations of Genetic Programming" (http:/ / www. cs. ucl. ac. uk/ staff/ W. Langdon/ FOGP/ ). UCL UK. . Retrieved 13 July 2010. 93 Holland's schema theorem 94 Holland's schema theorem Holland's schema theorem is widely taken to be the foundation for explanations of the power of genetic algorithms. It was proposed by John Holland in the 1970s. A schema is a template that identifies a subset of strings with similarities at certain string positions. Schemata are a special case of cylinder sets; and so form a topological space. Description For example, consider binary strings of length 6. The schema 1*10*1 describes the set of all strings of length 6 with 1's at positions 1, 3 and 6 and a 0 at position 4. The * is a wildcard symbol, which means that positions 2 and 5 can have a value of either 1 or 0. The order of a schema is defined as the number of fixed positions in the template, while the defining length is the distance between the first and last specific positions. The order of 1*10*1 is 4 and its defining length is 5. The fitness of a schema is the average fitness of all strings matching the schema. The fitness of a string is a measure of the value of the encoded problem solution, as computed by a problem-specific evaluation function. Using the established methods and genetic operators of genetic algorithms, the schema theorem states that short, low-order schemata with above-average fitness increase exponentially in successive generations. 
Expressed as an equation:
E[m(H, t+1)] ≥ (m(H, t) · f(H) / a_t) · [1 − p]
Here m(H, t) is the number of strings belonging to schema H at generation t, f(H) is the observed fitness of H and a_t is the observed average fitness at generation t. The probability of disruption p is the probability that crossover or mutation will destroy the schema H. It can be expressed as:
p = (δ(H) / (l − 1)) · p_c + o(H) · p_m
where o(H) is the number of fixed positions, l is the length of the code, p_m is the probability of mutation and p_c is the probability of crossover. So a schema with a shorter defining length δ(H) is less likely to be disrupted.
An often misunderstood point is why the Schema Theorem is an inequality rather than an equality. The answer is in fact simple: the Theorem neglects the small, yet non-zero, probability that a string belonging to the schema H will be created "from scratch" by mutation of a single string (or recombination of two strings) that did not belong to H in the previous generation.
References
• J. Holland, Adaptation in Natural and Artificial Systems, The MIT Press; Reprint edition 1992 (originally published in 1975).
• J. Holland, Hidden Order: How Adaptation Builds Complexity, Helix Books; 1996.

Genetic memory (computer science)
In computer science, genetic memory refers to an artificial neural network combination of genetic algorithm and the mathematical model of sparse distributed memory. It can be used to predict weather patterns.[1] Genetic memory and genetic algorithms have also gained an interest in the creation of artificial life.[2]
References
[1] Rogers, David (ed. Touretzky, David S.) (1989). Advances in neural information processing systems: Weather prediction using a genetic memory. Los Altos, Calif: M. Kaufmann Publishers. pp. 455–464. ISBN 1-55860-100-7.
[2] Rocha LM, Hordijk W (2005). "Material representations: From the genetic code to the evolution of cellular automata". Artificial Life 11 (1-2): 189–214. doi:10.1162/1064546053278964. PMID 15811227.

Premature convergence
In genetic algorithms, the term premature convergence means that a population for an optimization problem converged too early, resulting in a suboptimal solution. In this context, the parental solutions, through the aid of genetic operators, are not able to generate offspring that are superior to their parents. Premature convergence can happen in case of loss of genetic variation (every individual in the population is identical, see convergence).
Strategies for preventing premature convergence
Strategies to regain genetic variation can be:
• a mating strategy called incest prevention,[1]
• uniform crossover,
• favored replacement of similar individuals (preselection or crowding),
• segmentation of individuals of similar fitness (fitness sharing),
• increasing population size.
The genetic variation can also be regained by mutation, though this process is highly random.
References
[1] Michalewicz, Zbigniew (1996). Genetic Algorithms + Data Structures = Evolution Programs, 3rd Edition. Springer-Verlag. p. 58. ISBN 3-540-60676-9.

Schema (genetic algorithms)
A schema is a template in computer science used in the field of genetic algorithms that identifies a subset of strings with similarities at certain string positions. Schemata are a special case of cylinder sets; and so form a topological space.[1]
Description
For example, consider binary strings of length 6. The schema 1**0*1 describes the set of all words of length 6 with 1's at positions 1 and 6 and a 0 at position 4.
The * is a wildcard symbol, which means that positions 2, 3 and 5 can have a value of either 1 or 0. The order of a schema is defined as the number of fixed positions in the template, while the defining length is the distance between the first and last specific positions. The order of 1**0*1 is 3 and its defining length is 5. The fitness of a schema is the average fitness of all strings matching the schema. The fitness of a string is a measure of the value of the encoded problem solution, as computed by a problem-specific evaluation function.

Length
The length of a schema H, called N(H), is defined as the total number of nodes in the schema. N(H) is also equal to the number of nodes in the programs matching H.[2]

Disruption
If the child of an individual that matches schema H does not itself match H, the schema is said to have been disrupted.[2]

References
[1] Holland (1992 reprint). Adaptation in Natural and Artificial Systems. The MIT Press.
[2] "Foundations of Genetic Programming" (http://www.cs.ucl.ac.uk/staff/W.Langdon/FOGP/). UCL UK. Retrieved 13 July 2010.

Fitness function
A fitness function is a particular type of objective function that is used to summarise, as a single figure of merit, how close a given design solution is to achieving the set aims. In particular, in the fields of genetic programming and genetic algorithms, each design solution is represented as a string of numbers (referred to as a chromosome). After each round of testing, or simulation, the idea is to delete the 'n' worst design solutions, and to breed 'n' new ones from the best design solutions. Each design solution, therefore, needs to be awarded a figure of merit, to indicate how close it came to meeting the overall specification, and this is generated by applying the fitness function to the test, or simulation, results obtained from that solution.
The reason that genetic algorithms are not a lazy way of performing design work is precisely because of the effort involved in designing a workable fitness function. Even though it is no longer the human designer, but the computer, that comes up with the final design, it is the human designer who has to design the fitness function. If this is designed wrongly, the algorithm will either converge on an inappropriate solution, or will have difficulty converging at all.
Moreover, the fitness function must not only correlate closely with the designer's goal, it must also be computed quickly. Speed of execution is very important, as a typical genetic algorithm must be iterated many times in order to produce a usable result for a non-trivial problem.
Fitness approximation may be appropriate, especially in the following cases:
• fitness computation time of a single solution is extremely high,
• a precise model for fitness computation is missing,
• the fitness function is uncertain or noisy.
Two main classes of fitness functions exist: one where the fitness function does not change, as in optimizing a fixed function or testing with a fixed set of test cases; and one where the fitness function is mutable, as in niche differentiation or co-evolving the set of test cases. Another way of looking at fitness functions is in terms of a fitness landscape, which shows the fitness for each possible chromosome. Definition of the fitness function is not straightforward in many cases, and it is often performed iteratively if the fittest solutions produced by the GA are not what is desired.
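For problems where the specification is explicit, however, the fitness function can be very simple. As a concrete illustration (a minimal sketch, not taken from the text; the OneMax-style objective and the class and method names are assumptions), fitness can be the count of chromosome positions that satisfy the specification, and culling the 'n' worst solutions is then a matter of sorting by that figure of merit:

import java.util.Comparator;
import java.util.List;

/** Minimal sketch of a fitness function for bit-string chromosomes
 *  (assumed OneMax-style objective: maximise the number of 1-bits). */
public final class OneMaxFitness {

    /** Figure of merit: the number of 1-bits in the chromosome (higher is better). */
    public static int fitness(boolean[] chromosome) {
        int score = 0;
        for (boolean bit : chromosome) {
            if (bit) {
                score++;
            }
        }
        return score;
    }

    /** Delete the 'n' worst design solutions, keeping the best, as described above.
     *  The population must be a mutable list. */
    public static List<boolean[]> deleteWorst(List<boolean[]> population, int n) {
        population.sort(Comparator.comparingInt((boolean[] c) -> fitness(c)).reversed());
        return population.subList(0, Math.max(0, population.size() - n));
    }
}

In a realistic design problem the fitness call would of course run a test or simulation rather than count bits, which is exactly where the speed concerns described above arise.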
In some cases, it is very hard or impossible to come up even with a guess of what the fitness function definition might be. Interactive genetic algorithms address this difficulty by outsourcing evaluation to external agents (normally humans).

References
• A Nice Introduction to Adaptive Fuzzy Fitness Granulation (AFFG) (http://profsite.um.ac.ir/~davarynej/Resources/CEC'07-Draft.pdf) (PDF), a promising approach to accelerate the convergence rate of EAs. Available as a free PDF.
• The cyber shack of Adaptive Fuzzy Fitness Granulation (AFFG) (http://www.davarynejad.com/Mohsen/index.php?n=Main.AFFG), designed to accelerate the convergence rate of EAs.
• Fitness functions in evolutionary robotics: A survey and analysis (http://www.nelsonrobotics.org/paper_archive_nelson/nelson-jras-2009.pdf) (PDF), a review of fitness functions used in evolutionary robotics.

Black box
In science and engineering, a black box is a device, system or object which can be viewed solely in terms of its input, output and transfer characteristics, without any knowledge of its internal workings; that is, its implementation is "opaque" (black). (Figure: scheme of a black box.) Almost anything might be referred to as a black box: a transistor, an algorithm, or the human mind. The opposite of a black box is a system where the inner components or logic are available for inspection, which is sometimes known as a white box, a glass box, or a clear box.

History
The modern term "black box" seems to have entered the English language around 1945. The process of network synthesis from the transfer functions of black boxes can be traced to Wilhelm Cauer, who published his ideas in their most developed form in 1941.[1] Although Cauer did not himself use the term, others who followed him certainly did describe the method as black-box analysis.[2] Vitold Belevitch[3] puts the concept of black boxes even earlier, attributing the explicit use of two-port networks as black boxes to Franz Breisig in 1921, and argues that 2-terminal components were implicitly treated as black boxes before that.

Examples
• In electronics, a sealed piece of replaceable equipment; see line-replaceable unit (LRU).
• In computer programming and software engineering, black box testing is used to check that the output of a program is as expected, given certain inputs.[4] The term "black box" is used because the actual program being executed is not examined.
• In computing in general, a black box program is one where the user cannot see its inner workings (perhaps because it is a closed source program), or one which has no side effects and the function of which need not be examined: a routine suitable for re-use.
• Also in computing, a black box refers to a piece of equipment provided by a vendor for the purpose of using that vendor's product. It is often the case that the vendor maintains and supports this equipment, and the company receiving the black box is typically hands-off.
• In cybernetics, a black box was described by Norbert Wiener as an unknown system that was to be identified using the techniques of system identification.[5] He saw the first step in self-organization as being able to copy the output behaviour of a black box.
• In neural networking or heuristic algorithms (computer terms generally used to describe 'learning' computers or 'AI simulations'), a black box is used to describe the constantly changing section of the program environment which cannot easily be tested by the programmers.
This is also called a white box in software engineering, in the context that the program code can be seen, but the code is so complex that it might as well be a black box.
• In finance, many people trade with "black box" programs and algorithms designed by programmers.[6] These programs automatically trade users' accounts when certain technical market conditions suddenly exist (such as an SMA crossover).
• In physics, a black box is a system whose internal structure is unknown, or need not be considered for a particular purpose.
• In mathematical modelling, a limiting case.
• In philosophy and psychology, the school of behaviorism sees the human mind as a black box; see black box theory.[7]
• In neorealist international relations theory, the sovereign state is generally considered a black box: states are assumed to be unitary, rational, self-interested actors, and the actual decision-making processes of the state are disregarded as being largely irrelevant. Liberal and constructivist theorists often criticize neorealism for the "black box" model, and refer to much of their work on how states arrive at decisions as "breaking open the black box".
• In cryptography, the term is used to capture the notion of knowledge obtained by an algorithm through the execution of a cryptographic protocol such as a zero-knowledge proof protocol. If the output of the algorithm when interacting with the protocol can be simulated by a simulator that interacts only with the algorithm, this means that the algorithm 'cannot know' anything more than the input of the simulator. If the simulator can only interact with the algorithm in a black box way, we speak of a black box simulator.
• In aviation, a "black box" (they are actually bright orange, to facilitate their being found after a crash) is an audio or data recording device in an airplane or helicopter. The cockpit voice recorder records the conversation of the pilots and the flight data recorder logs information about controls and sensors, so that in the event of an accident investigators can use the recordings to assist in the investigation. Although these devices were originally called black boxes for a different reason, they are also an example of a black box according to the meaning above, in that it is of no concern how the recording is actually made.
• In amateur radio, the term "black box operator" is a disparaging or self-deprecating description of someone who operates factory-made radios without having a good understanding of how they work. Such operators don't build their own equipment (an activity called "homebrewing") or even repair their own "black boxes".[8]

References
[1] W. Cauer. Theorie der linearen Wechselstromschaltungen, Vol. I. Akad. Verlags-Gesellschaft Becker und Erler, Leipzig, 1941.
[2] E. Cauer, W. Mathis, and R. Pauli, "Life and Work of Wilhelm Cauer (1900 – 1945)", Proceedings of the Fourteenth International Symposium of Mathematical Theory of Networks and Systems (MTNS2000), p. 4, Perpignan, June 2000. Retrieved online (http://www.cs.princeton.edu/courses/archive/fall03/cs323/links/cauer.pdf) 19th September 2008.
[3] Belevitch, V, "Summary of the history of circuit theory", Proceedings of the IRE, vol. 50, iss. 5, pp. 848–855, May 1962.
[4] Black-Box Testing: Techniques for Functional Testing of Software and Systems, by Boris Beizer, 1995,
ISBN 0471120944.
[5] Cybernetics: Or the Control and Communication in the Animal and the Machine, by Norbert Wiener, page xi, MIT Press, 1961, ISBN 026273009X.
[6] Breaking the Black Box, by Martin J. Pring, McGraw-Hill, 2002, ISBN 0071384057.
[7] "Mind as a Black Box: The Behaviorist Approach", pp. 85–88, in Cognitive Science: An Introduction to the Study of Mind, by Jay Friedenberg, Gordon Silverman, Sage Publications, 2006.
[8] http://www.g3ngd.talktalk.net/1950.html

Black box theory
Black box theories are things defined only in terms of their function.[1][2] The term black box theory is applied to any field, in philosophy and science or otherwise, where some inquiry or definition is made into the relations between the outward appearance of something (its exterior, here specifically the black box state) and its characteristics and behaviour within (its interior).[3][4] Specifically, the inquiry is focused upon a thing that has no immediately apparent characteristics and therefore has only factors for consideration held within itself, hidden from immediate observation. The observer is assumed ignorant in the first instance, as the majority of the available data is held in an inner situation away from easy investigation. The black box element of the definition is characterised by a system in which observable inputs enter a perhaps imaginary box and a set of different outputs emerge, which are also observable.[5]

Origin of term
The term black box was first recorded as used by the RAF in approximately 1947 to describe the sealed containment used for navigation apparatus, this usage becoming more widely applied after 1964.[6] The identifier is therefore applied to objects known as the flight data recorder (FDR) and cockpit voice recorder (CVR). These function to record the radio transmissions occurring within an airplane, and are particularly important to those inquiring into the cause of a plane crash. These boxes are in fact coloured orange in order that they be more easily located.[7][8]

Examples
Consider a black box that could not be opened to "look inside" and see how it worked. All that would be possible would be to guess how it worked, based on what happened when something was done to it (input), and what occurred as a result of that (output). (Figure: scheme of a black box.) If, after putting an orange in on one side, an orange fell out the other, it would be possible to make educated guesses or hypotheses about what was happening inside the black box. It could be filled with oranges; it could have a conveyor belt to move the orange from one side to the other; it could even go through an alternate universe. Without being able to investigate the workings of the box, ultimately all we can do is guess. However, occasionally strange occurrences will take place that change our understanding of the black box. Consider putting an orange in and having a guava pop out. Now our "filled with oranges" and "conveyor belt" theories no longer work, and we may have to change our educated guess as to how the black box works. The black box theory of consciousness states that the mind is fully understood once the inputs and outputs are well defined,[9] and generally couples this with a radical skepticism regarding the possibility of ever successfully describing the underlying structure, mechanism, and dynamics of the mind.
Uses
One of the uses of black box theory is as a method to describe and understand psychological factors in fields such as marketing, when applied to an analysis of consumer behaviour.[10][11][12]

References
[1] Definition from Answers.com (http://www.answers.com/topic/black-box-theory)
[2] Definition from HighBeam (http://www.highbeam.com/doc/1O98-blackboxtheory.html)
[3] Black box theory applied briefly to Isaac Newton (http://www.new-science-theory.com/isaac-newton.html)
[4] Usage of term (http://www.ncbi.nlm.nih.gov/pubmed/374288)
[5] Physics dept, Temple University, Philadelphia (http://www.jstor.org/pss/186066)
[6] Online etymology dictionary (http://www.etymonline.com/index.php?search=black+box)
[7] howstuffworks (http://science.howstuffworks.com/transport/flight/modern/black-box.htm)
[8] cpaglobal (http://www.cpaglobal.com/newlegalreview/widgets/notes_quotes/more/1259/who_invented_the_black_box_for_use_in_airplanes)
[9] The Professor network (http://www.politicsprofessor.com/politicaltheories/black-box-model.php)
[10] Institute for Working Futures (http://www.marcbowles.com/courses/adv_dip/module12/chapter4/amc12_ch4_two.htm), part of Advanced Diploma in Logistics and Management. Retrieved 11/09/2011
[11] Black-box theory used to understand consumer behaviour (http://books.google.com/books?id=8qlKaIq0AccC&printsec=frontcover#v=onepage&q&f=false), Marketing by Richard L. Sandhusen. Retrieved 11/09/2011
[12] Designing of websites (http://designshack.co.uk/articles/business-articles/using-the-black-box-model-to-design-better-websites/). Retrieved 11/09/2011

Fitness approximation
In function optimization, fitness approximation is a method for decreasing the number of fitness function evaluations needed to reach a target solution. It belongs to the general class of evolutionary computation or artificial evolution methodologies.

Approximate models in function optimisation

Motivation
In many real-world optimization problems, including engineering problems, the number of fitness function evaluations needed to obtain a good solution dominates the optimization cost. In order to obtain efficient optimization algorithms, it is crucial to use prior information gained during the optimization process. Conceptually, a natural approach to utilizing this prior information is to build a model of the fitness function to assist in the selection of candidate solutions for evaluation. A variety of techniques for constructing such a model (often also referred to as surrogates, metamodels or approximation models) for computationally expensive optimization problems have been considered.

Approaches
Common approaches to constructing approximate models, based on learning and interpolation from known fitness values of a small population, include:
• low-degree polynomials and regression models
• artificial neural networks, including
  • multilayer perceptrons
  • radial basis function networks
• support vector machines
Due to the limited number of training samples and the high dimensionality encountered in engineering design optimization, constructing a globally valid approximate model remains difficult. As a result, evolutionary algorithms using such approximate fitness functions may converge to local optima. Therefore, it can be beneficial to selectively use the original fitness function together with the approximate model.
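To make this idea concrete, the sketch below (an illustration, not code from the article; the class, field and parameter names are assumptions) caches exactly evaluated solutions and answers most queries from a nearest-neighbour surrogate, falling back to the expensive original fitness function only when no cached solution is similar enough:

import java.util.ArrayList;
import java.util.List;
import java.util.function.ToDoubleFunction;

/** Sketch of surrogate-assisted fitness evaluation: call the expensive fitness
 *  function only when no sufficiently similar, already-evaluated solution exists. */
public final class SurrogateFitness {
    private final ToDoubleFunction<double[]> exactFitness; // the expensive original function
    private final double similarityRadius;                 // how close counts as "similar"
    private final List<double[]> archive = new ArrayList<>();
    private final List<Double> archivedFitness = new ArrayList<>();

    public SurrogateFitness(ToDoubleFunction<double[]> exactFitness, double similarityRadius) {
        this.exactFitness = exactFitness;
        this.similarityRadius = similarityRadius;
    }

    public double evaluate(double[] candidate) {
        // Nearest-neighbour surrogate: reuse the fitness of the closest archived solution
        int best = -1;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int i = 0; i < archive.size(); i++) {
            double d = distance(candidate, archive.get(i));
            if (d < bestDist) { bestDist = d; best = i; }
        }
        if (best >= 0 && bestDist <= similarityRadius) {
            return archivedFitness.get(best);              // cheap, approximate answer
        }
        double f = exactFitness.applyAsDouble(candidate);  // expensive exact evaluation
        archive.add(candidate.clone());
        archivedFitness.add(f);
        return f;
    }

    private static double distance(double[] a, double[] b) {
        double sum = 0d;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }
}

A real surrogate would typically be a regression model or neural network fitted to the archive and re-fitted as new exact evaluations arrive; the granulation scheme described next adapts the radius of influence of each stored solution instead of keeping it fixed.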
Adaptive fuzzy fitness granulation Adaptive fuzzy fitness granulation (AFFG) is a proposed solution to constructing an approximate model of the fitness function in place of traditional computationally expensive large-scale problem analysis like (L-SPA) in the Finite element method or iterative fitting of a Bayesian network structure. In adaptive fuzzy fitness granulation, an adaptive pool of solutions, represented by fuzzy granules, with an exactly computed fitness function result is maintained. If a new individual is sufficiently similar to an existing known fuzzy granule, then that granule’s fitness is used instead as an estimate. Otherwise, that individual is added to the pool as a new fuzzy granule. The pool size as well as each granule’s radius of influence is adaptive and will grow/shrink depending on the utility of each granule and the overall population fitness. To encourage fewer function evaluations, each granule’s radius of influence is initially large and is gradually shrunk in latter stages of evolution. This encourages more exact fitness evaluations when competition is fierce among more similar and converging solutions. Furthermore, to prevent the pool from growing too large, granules that are not used are gradually eliminated. Actually AFFG mirrors two features of human cognition: (a) granularity (b) similarity analysis. This granulation-based fitness approximation scheme is applied to solve various engineering optimization problems including detecting hidden information from a watermarked signal in addition to several structural optimization problems. References • The cyber shack of Adaptive Fuzzy Fitness Granulation (AFFG) (http://www.davarynejad.com/Mohsen/ index.php?n=Main.AFFG) That is designed to accelerate the convergence rate of EAs. • A complete list of references on Fitness Approximation in Evolutionary Computation (http://www. soft-computing.de/amec_n.html), by Yaochu Jin (http://www.soft-computing.de/jin.html). • M. Davarynejad, "Fuzzy Fitness Granulation in Evolutionary Algorithms for complex optimization" (http:// www.davarynejad.com/Resources1/MSc-Thesis-Abs.pdf), (PDF) M.Sc. Thesis. Ferdowsi University of Mashhad, Department of Electrical Engineering, 2007. 102 Effective fitness Effective fitness In natural evolution and artificial evolution (e.g. artificial life and evolutionary computation) the fitness (or performance or objective measure) of a schema is rescaled to give its effective fitness which takes into account crossover and mutation. That is effective fitness can be thought of as the fitness that the schema would need to have in order to increase or decrease as a fraction of the population as it actually does with crossover and mutation present but as if they were not. References • Foundations of Genetic Programming [1] References [1] http:/ / www. cs. ucl. ac. uk/ staff/ W. Langdon/ FOGP/ Speciation (genetic algorithm) Speciation is a process that occurs naturally in evolution and is modeled explicitly in some genetic algorithms. Speciation in nature occurs when two similar reproducing beings evolve to become too dissimilar to share genetic information effectively or correctly. In the case of living organisms, they are incapable of mating to produce offspring. Interesting special cases of different species being able to breed exist, such as a horse and a donkey mating to produce a mule. However in this case the Mule is usually infertile, and so the genetic isolation of the two parent species is maintained. 
In implementations of genetic search algorithms, the event of speciation is defined by some mathematical function that describes the similarity between two candidate solutions (usually described as individuals) in the population. If the resulting similarity is too low, the crossover operator is disallowed between those individuals.

Genetic representation
Genetic representation is a way of representing solutions/individuals in evolutionary computation methods. Genetic representation can encode appearance, behavior, and physical qualities of individuals. Designing a good genetic representation that is expressive and evolvable is a hard problem in evolutionary computation. Differences in genetic representation are one of the major criteria drawing a line between known classes of evolutionary computation.
Genetic algorithms use linear binary representations. The most standard one is an array of bits. Arrays of other types and structures can be used in essentially the same way. The main property that makes these genetic representations convenient is that their parts are easily aligned due to their fixed size. This facilitates simple crossover operations. Variable-length representations have also been explored in genetic algorithms, but crossover implementation is more complex in this case.
Evolution strategy uses linear real-valued representations, e.g. an array of real values. It uses mostly Gaussian mutation and blending/averaging crossover.
Genetic programming (GP) pioneered tree-like representations and developed genetic operators suitable for such representations. Tree-like representations are used in GP to represent and evolve functional programs with desired properties.[1]
Human-based genetic algorithm (HBGA) offers a way to avoid solving hard representation problems by outsourcing all genetic operators to outside agents, in this case, humans. The algorithm has no need for knowledge of a particular fixed genetic representation as long as there are enough external agents capable of handling those representations, allowing for free-form and evolving genetic representations.

Common genetic representations
• binary array
• binary tree
• genetic tree
• natural language
• parse tree

References and notes
[1] Cramer, 1985 (http://www.sover.net/~nichael/nlc-publications/icga85/index.html)

Stochastic universal sampling
Stochastic universal sampling (SUS) is a technique used in genetic algorithms for selecting potentially useful solutions for recombination. It was introduced by James Baker.[1] SUS is a development of fitness proportionate selection which exhibits no bias and minimal spread. Where fitness proportionate selection chooses several solutions from the population by repeated random sampling, SUS uses a single random value to sample all of the solutions by choosing them at evenly spaced intervals. (Figure: SUS example.) Described as an algorithm, pseudocode for SUS looks like:

RWS(population, f)
    Ptr := 0
    for p in population
        if Ptr < f and Ptr + fitness of p > f
            return p
        Ptr := Ptr + fitness of p

SUS(population, N)
    F := total fitness of population
    Start := random number between 0 and F/N
    Ptrs := [Start + i*F/N | i in [0..N-1]]
    return [RWS(population, p) | p in Ptrs]

Here "RWS" describes the bulk of fitness proportionate selection (also known as "roulette wheel selection"); in true fitness proportionate selection the parameter f is always a random number from 0 to F.
The algorithm above is very inefficient both for fitness proportionate and stochastic universal sampling, and is intended to be illustrative rather than canonical.

References
[1] Baker, James E. (1987). "Reducing Bias and Inefficiency in the Selection Algorithm". Proceedings of the Second International Conference on Genetic Algorithms and their Application (Hillsdale, New Jersey: L. Erlbaum Associates): 14–21.

Quality control and genetic algorithms
The combination of quality control and genetic algorithms has led to novel solutions of complex quality control design and optimization problems. Quality control is a process by which entities review the quality of all factors involved in production. Quality is the degree to which a set of inherent characteristics fulfils a need or expectation that is stated, generally implied or obligatory.[1] Genetic algorithms are search algorithms based on the mechanics of natural selection and natural genetics.[2]

Quality control
Alternative quality control[3] (QC) procedures can be applied to a process to test statistically the null hypothesis, that the process conforms to the quality requirements and therefore that the process is in control, against the alternative, that the process is out of control. When a true null hypothesis is rejected, a statistical type I error is committed; we then have a false rejection of a run of the process. The probability of a type I error is called the probability of false rejection. When a false null hypothesis is accepted, a statistical type II error is committed; we then fail to detect a significant change in the process. The probability of rejection of a false null hypothesis equals the probability of detection of the nonconformity of the process to the quality requirements.
The QC procedure to be designed or optimized can be formulated as:

Q1(n1, X1) # Q2(n2, X2) # ... # Qq(nq, Xq)    (1)

where Qi(ni, Xi) denotes a statistical decision rule, ni denotes the size of the sample Si, that is, the number of samples the rule is applied upon, and Xi denotes the vector of the rule-specific parameters, including the decision limits. Each symbol # denotes either the Boolean operator AND or the operator OR. Obviously, for # denoting AND, and for n1 < n2 < ... < nq, that is for S1 ⊂ S2 ⊂ ... ⊂ Sq, (1) denotes a q-sampling QC procedure.
Each statistical decision rule is evaluated by calculating the respective statistic of a monitored variable of samples taken from the process. Then, if the statistic is outside the interval between the decision limits, the decision rule is considered to be true. Many statistics can be used, including the following: a single value of the variable of a sample, the range, the mean, and the standard deviation of the values of the variable of the samples, the cumulative sum, the smoothed mean, and the smoothed standard deviation. Finally, the QC procedure is evaluated as a Boolean proposition. If it is true, then the null hypothesis is considered to be false, the process is considered to be out of control, and the run is rejected.
A quality control procedure is considered to be optimum when it minimizes (or maximizes) a context-specific objective function. The objective function depends on the probabilities of detection of the nonconformity of the process and of false rejection. These probabilities depend on the parameters of the quality control procedure (1) and on the probability density functions (see probability density function) of the monitored variables of the process.
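As an illustration of how such a procedure is evaluated as a Boolean proposition (a sketch under assumed names and limits, not code from the article), the fragment below combines two hypothetical single-statistic decision rules, one on the sample mean and one on the sample range, with the # operator instantiated as OR:

/** Sketch: evaluating a two-rule QC procedure Q1(n1,X1) OR Q2(n2,X2).
 *  Rule 1 is true when the sample mean falls outside [meanLo, meanHi];
 *  rule 2 is true when the sample range exceeds maxRange.
 *  All names and decision limits are illustrative assumptions. */
public final class QcProcedure {

    public static boolean outOfControl(double[] sample,
                                       double meanLo, double meanHi,
                                       double maxRange) {
        return meanRuleTrue(sample, meanLo, meanHi) || rangeRuleTrue(sample, maxRange);
    }

    private static boolean meanRuleTrue(double[] sample, double lo, double hi) {
        double sum = 0d;
        for (double x : sample) {
            sum += x;
        }
        double mean = sum / sample.length;
        return mean < lo || mean > hi;        // statistic outside the decision limits
    }

    private static boolean rangeRuleTrue(double[] sample, double maxRange) {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double x : sample) {
            min = Math.min(min, x);
            max = Math.max(max, x);
        }
        return (max - min) > maxRange;        // statistic outside the decision limit
    }
}

A genetic algorithm can then search over the decision limits (and, for multi-rule procedures, over the choice of rules and operators) so as to optimize the context-specific objective function built from the probabilities of false rejection and of detecting nonconformity.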
Genetic algorithms Genetic algorithms[4][5][6] are robust search algorithms, that do not require knowledge of the objective function to be optimized and search through large spaces quickly. Genetic algorithms have been derived from the processes of the molecular biology of the gene and the evolution of life. Their operators, cross-over, mutation, and reproduction, are isomorphic with the synonymous biological processes. Genetic algorithms have been used to solve a variety of complex optimization problems. Additionally the classifier systems and the genetic programming paradigm have shown us that genetic algorithms can be used for tasks as complex as the program induction. 106 Quality control and genetic algorithms Quality control and genetic algorithms In general, we can not use algebraic methods to optimize the quality control procedures. Usage of enumerative methods would be very tedious, especially with multi-rule procedures, as the number of the points of the parameter space to be searched grows exponentially with the number of the parameters to be optimized. Optimization methods based on the genetic algorithms offer an appealing alternative. Furthermore, the complexity of the design process of novel quality control procedures is obviously greater than the complexity of the optimization of predefined ones. In fact, since 1993, genetic algorithms have been used successfully to optimize and to design novel quality control procedures.[7][8][9] References [1] [2] [3] [4] [5] [6] Hoyle D. ISO 9000 quality systems handbook. Butterworth-Heineman 2001;p.654 Goldberg DE. Genetic algorithms in search, optimization and machine learning. Addison-Wesley 1989; p.1. Duncan AJ. Quality control and industrial statistics. Irwin 1986;pp.1-1123. Holland, JH. Adaptation in natural and artificial systems. The University of Michigan Press 1975;pp.1-228. Goldberg DE. Genetic algorithms in search, optimization and machine learning. Addison-Wesley 1989; pp.1-412. Mitchell M. An Introduction to genetic algorithms. The MIT Press 1998;pp.1-221. [7] Hatjimihail AT. Genetic algorithms based design and optimization of statistical quality control procedures. Clin Chem 1993;39:1972-8. (http:/ / www. clinchem. org/ cgi/ reprint/ 39/ 9/ 1972) [8] Hatjimihail AT, Hatjimihail TT. Design of statistical quality control procedures using genetic algorithms. In LJ Eshelman (ed): Proceedings of the Sixth International Conference on Genetic Algorithms. San Francisco: Morgan Kauffman 1995;551-7. [9] He D, Grigoryan A. Joint statistical design of double sampling x and s charts. European Journal of Operational Research 2006;168:122-142. External links • American Society for Quality (ASQ) (http://www.asq.org/index.html) • Illinois Genetic Algorithms Laboratory (IlliGAL) (http://www.illigal.uiuc.edu/web/) • Hellenic Complex Systems Laboratory (HCSL) (http://www.hcsl.com) 107 Human-based genetic algorithm 108 Human-based genetic algorithm In evolutionary computation, a human-based genetic algorithm (HBGA) is a genetic algorithm that allows humans to contribute solution suggestions to the evolutionary process. For this purpose, a HBGA has human interfaces for initialization, mutation, and recombinant crossover. As well, it may have interfaces for selective evaluation. In short, a HBGA outsources the operations of a typical genetic algorithm to humans. Evolutionary genetic systems and human agency Among evolutionary genetic systems, HBGA is the computer-based analogue of genetic engineering (Allan, 2005). 
This table compares systems on lines of human agency:

system                          sequences    innovator    selector
natural selection               nucleotide   nature       nature
artificial selection            nucleotide   nature       human
genetic engineering             nucleotide   human        human
human-based genetic algorithm   data         human        human
interactive genetic algorithm   data         computer     human
genetic algorithm               data         computer     computer

One obvious pattern in the table is the division between organic (top) and computer systems (bottom). Another is the vertical symmetry between autonomous systems (top and bottom) and human-interactive systems (middle).
Looking to the right, the selector is the agent that decides fitness in the system. It determines which variations will reproduce and contribute to the next generation. In natural populations, and in genetic algorithms, these decisions are automatic; whereas in typical HBGA systems, they are made by people.
The innovator is the agent of genetic change. The innovator mutates and recombines the genetic material, to produce the variations on which the selector operates. In most organic and computer-based systems (top and bottom), innovation is automatic, operating without human intervention. In HBGA, the innovators are people.
HBGA is roughly similar to genetic engineering. In both systems, the innovators and selectors are people. The main difference lies in the genetic material they work with: electronic data vs. polynucleotide sequences.

Differences from a plain genetic algorithm
• All four genetic operators (initialization, mutation, crossover, and selection) can be delegated to humans using appropriate interfaces (Kosorukoff, 2001).
• Initialization is treated as an operator, rather than a phase of the algorithm. This allows an HBGA to start with an empty population. Initialization, mutation, and crossover operators form the group of innovation operators.
• Choice of genetic operator may be delegated to humans as well, so they are not forced to perform a particular operation at any given moment.

Functional features
• HBGA is a method of collaboration and knowledge exchange. It merges the competence of its human users, creating a kind of symbiotic human-machine intelligence (see also distributed artificial intelligence).
• Human innovation is facilitated by sampling solutions from the population, associating and presenting them in different combinations to a user (see creativity techniques).
• HBGA facilitates consensus and decision making by integrating individual preferences of its users.
• HBGA makes use of a cumulative learning idea while solving a set of problems concurrently. This makes it possible to achieve synergy because solutions can be generalized and reused among several problems. It also facilitates identification of new problems of interest and fair-share resource allocation among problems of different importance.
• The choice of genetic representation, a common problem of genetic algorithms, is greatly simplified in HBGA, since the algorithm need not be aware of the structure of each solution. In particular, HBGA allows natural language to be a valid representation.
• Storing and sampling the population usually remain algorithmic functions.
• An HBGA is usually a multi-agent system, delegating genetic operations to multiple agents (humans).

Applications
• Evolutionary knowledge management, integration of knowledge from different sources.
• Social organization, collective decision-making, and e-governance.
• Traditional areas of application of interactive genetic algorithms: computer art, user-centered design, etc. • Collaborative problem solving using natural language as a representation. The HBGA methodology was derived in 1999-2000 from analysis of the Free Knowledge Exchange project that was launched in the summer of 1998, in Russia (Kosorukoff, 1999). Human innovation and evaluation were used in support of collaborative problem solving. Users were also free to choose the next genetic operation to perform. Currently, several other projects implement the same model, the most popular being Yahoo! Answers, launched in December 2005. Recent research suggests that human-based innovation operators are advantageous not only where it is hard to design an efficient computational mutation and/or crossover (e.g. when evolving solutions in natural language), but also in the case where good computational innovation operators are readily available, e.g. when evolving an abstract picture or colors (Cheng and Kosorukoff, 2004). In the latter case, human and computational innovation can complement each other, producing cooperative results and improving general user experience by ensuring that spontaneous creativity of users will not be lost. References • Kosorukoff, Alex (1999). Free knowledge exchange. internet archive [1] • Kosorukoff, Alex (2000). Human-based genetic algorithm. online [2] • Kosorukoff, Alex (2001). Human-based genetic algorithm. In IEEE Transactions on Systems, Man, and Cybernetics, SMC-2001, 3464-3469. full text [3] • Cheng, Chihyung Derrick and Alex Kosorukoff (2004). Interactive one-max problem allows to compare the performance of interactive and human-based genetic algorithms. In Genetic and Evolutionary Computational Conference, GECCO-2004. full text [4] • Allan, Michael (2005). Simple recombinant design. SourceForge.net, project textbender, release 2005.0, file _/description.html. release archives [5], later version online [6] 109 Human-based genetic algorithm External links • Free Knowledge Exchange [7], a project using HBGA for collaborative solving of problems expressed in natural language. References [1] [2] [3] [4] [5] [6] [7] http:/ / web. archive. org/ web/ 19990824183328/ www. 3form. com/ formula/ whatis. htm http:/ / web. archive. org/ web/ 20091027041228/ http:/ / geocities. com/ alex+ kosorukoff/ hbga/ hbga. html http:/ / intl. ieeexplore. ieee. org/ xpl/ abs_free. jsp?arNumber=972056 http:/ / www. derrickcheng. com/ Project/ HBGA http:/ / sourceforge. net/ project/ showfiles. php?group_id=134813& amp;package_id=148018 http:/ / zelea. com/ project/ textbender/ d/ approach-simplex-wide. xht http:/ / www. 3form. com Interactive evolutionary computation Interactive evolutionary computation (IEC) or aesthetic selection is a general term for methods of evolutionary computation that use human evaluation. Usually human evaluation is necessary when the form of fitness function is not known (for example, visual appeal or attractiveness; as in Dawkins, 1986) or the result of optimization should fit a particular user preference (for example, taste of coffee or color set of the user interface). IEC design issues The number of evaluations that IEC can receive from one human user is limited by user fatigue which was reported by many researchers as a major problem. In addition, human evaluations are slow and expensive as compared to fitness function computation. 
Hence, one-user IEC methods should be designed to converge using a small number of evaluations, which necessarily implies very small populations. Several methods were proposed by researchers to speed up convergence, like interactive constrain evolutionary search (user intervention) or fitting user preferences using a convex function (Takagi, 2001). IEC human-computer interfaces should be carefully designed in order to reduce user fatigue. However IEC implementations that can concurrently accept evaluations from many users overcome the limitations described above. An example of this approach is an interactive media installation by Karl Sims that allows to accept preference from many visitors by using floor sensors to evolve attractive 3D animated forms. Some of these multi-user IEC implementations serve as collaboration tools, for example HBGA. IEC types IEC methods include interactive evolution strategy (Herdy, 1997), interactive genetic algorithm (Caldwell, 1991), interactive genetic programming (Sims, 1991; Unemi, 2000), and human-based genetic algorithm (Kosorukoff, 2001). IGA An interactive genetic algorithm (IGA) is defined as a genetic algorithm that uses human evaluation. These algorithms belong to a more general category of Interactive evolutionary computation. The main application of these techniques include domains where it is hard or impossible to design a computational fitness function, for example, evolving images, music, various artistic designs and forms to fit a user's aesthetic preferences. Interactive computation methods can use different representations, both linear (as in traditional genetic algorithms) and tree-like ones (as in genetic programming). 110 Interactive evolutionary computation References • Dawkins, R. (1986), The Blind Watchmaker, Longman, 1986; Penguin Books 1988. • Caldwell, Craig and Victor S. Johnston (1991), Tracking a Criminal Suspect through "Face-Space" with a Genetic Algorithm, in Proceedings of the Fourth International Conference on Genetic Algorithm, Morgan Kaufmann Publisher, pp.416-421, July 1991. • J. Clune and H. Lipson (2011). Evolving three-dimensional objects with a generative encoding inspired by developmental biology [1]. Proceedings of the European Conference on Artificial Life. 2011 • Sims, K. (1991), Artificial Evolution for Computer Graphics. Computer Graphics 25(4), Siggraph '91 Proceedings, July 1991, pp.319-328. • Sims, K. (1991), Interactive Evolution of Dynamical Systems. First European Conference on Artificial Life, MIT Press • Herdy, M. (1997), Evolutionary Optimisation based on Subjective Selection – evolving blends of coffee. Proceedings 5th European Congress on Intelligent Techniques and Soft Computing (EUFIT’97); pp 2010-644. • Unemi, T. (2000). SBART 2.4: an IEC tool for creating 2D images, Movies and Collage, Proceedings of 2000 Genetic and Evolutionary Computational Conference workshop program, Las Vegas, Nevada, July 8, 2000, p.153 • Kosorukoff, A. (2001), Human-based Genetic Algorithm. IEEE Transactions on Systems, Man, and Cybernetics, SMC-2001, 3464-3469. • Takagi, H. (2001). Interactive Evolutionary Computation: Fusion of the Capacities of EC Optimization and Human Evaluation. Proceedings of the IEEE 89, 9, pp. 1275-1296 [2] External links • EndlessForms.com [3], Collaborative interactive evolution allowing you to evolve 3D objects and have them 3D printed. • Art by Evolution on the Web [4] Interactive Art Generator. • An online interactive demonstrator to do Evolutionary Computation step by step. 
[5] • EFit-V [6] Facial composite system using interactive genetic algorithms. • Galapagos by Karl Sims [7] • E-volver [8] • SBART, a program to evolve 2D images [9] • GenJam (Genetic Jammer) [10] • Evolutionary music [11] • Darwin poetry [12] • Takagi Lab at Kyushu University [13] • [4] - Interactive one-max problem allows to compare the performance of interactive and human-based genetic algorithms. • idiofact.de [14], Webpage that uses interactive evolutionary computation with a generative design algorithm to generate 2d images. • Picbreeder service [15], Collaborative interactive evolution allowing branching from other users' creations that produces pictures like faces and spaceships. • Peer to Peer IGA [16] Using collaborative IGA sessions for floorplanning and document design. 111 Interactive evolutionary computation References [1] https:/ / www. msu. edu/ ~jclune/ webfiles/ publications/ 2011-CluneLipson-Evolving3DObjectsWithCPPNs-ECAL. pdf [2] http:/ / www. design. kyushu-u. ac. jp/ ~takagi/ TAKAGI/ IECpaper/ ProcIEEE_3. pdf [3] http:/ / EndlessForms. com [4] http:/ / eartweb. vanhemert. co. uk/ [5] http:/ / www. elec. gla. ac. uk/ ~yunli/ ga_demo/ [6] http:/ / www. visionmetric. com [7] http:/ / www. genarts. com/ galapagos/ index. html [8] http:/ / www. xs4all. nl/ ~notnot/ E-volverLUMC/ E-volverLUMC. html [9] http:/ / www. intlab. soka. ac. jp/ ~unemi/ sbart [10] http:/ / www. it. rit. edu/ ~jab/ GenJam. html [11] http:/ / www. timblackwell. com/ [12] http:/ / www. codeasart. com/ poetry/ darwin. html [13] http:/ / www. design. kyushu-u. ac. jp/ ~takagi/ TAKAGI/ takagiLab. html [14] http:/ / idiofact. de [15] http:/ / picbreeder. org [16] http:/ / www. cse. unr. edu/ ~quiroz/ Genetic programming In artificial intelligence, genetic programming (GP) is an evolutionary algorithm-based methodology inspired by biological evolution to find computer programs that perform a user-defined task. It is a specialization of genetic algorithms (GA) where each individual is a computer program. It is a machine learning technique used to optimize a population of computer programs according to a fitness landscape determined by a program's ability to perform a given computational task. History In 1954, GP began with the evolutionary algorithms first used by Nils Aall Barricelli applied to evolutionary simulations. In the 1960s and early 1970s, evolutionary algorithms became widely recognized as optimization methods. Ingo Rechenberg and his group were able to solve complex engineering problems through evolution strategies as documented in his 1971 PhD thesis and the resulting 1973 book. John Holland was highly influential during the 1970s. In 1964, Lawrence J. Fogel, one of the earliest practitioners of the GP methodology, applied evolutionary algorithms to the problem of discovering finite-state automata. Later GP-related work grew out of the learning classifier system community, which developed sets of sparse rules describing optimal policies for Markov decision processes. The first statement of modern "tree-based" Genetic Programming (that is, procedural languages organized in tree-based structures and operated on by suitably defined GA-operators) was given by Nichael L. Cramer (1985).[1] This work was later greatly expanded by John R. Koza, a main proponent of GP who has pioneered the application of genetic programming in various complex optimization and search problems.[2] In the 1990s, GP was mainly used to solve relatively simple problems because it is very computationally intensive. 
Recently GP has produced many novel and outstanding results in areas such as quantum computing, electronic design, game playing, sorting, and searching, due to improvements in GP technology and the exponential growth in CPU power.[3] These results include the replication or development of several post-year-2000 inventions. GP has also been applied to evolvable hardware as well as computer programs. Developing a theory for GP has been very difficult and so in the 1990s GP was considered a sort of outcast among search techniques. But after a series of breakthroughs in the early 2000s, the theory of GP has had a formidable and rapid development. So much so that it has been possible to build exact probabilistic models of GP (schema theories, Markov chain models and meta-optimization algorithms). 112 Genetic programming 113 Chromosome representation GP evolves computer programs, traditionally represented in memory as tree structures.[4] Trees can be easily evaluated in a recursive manner. Every tree node has an operator function and every terminal node has an operand, making mathematical expressions easy to evolve and evaluate. Thus traditionally GP favors the use of programming languages that naturally embody tree structures (for example, Lisp; other functional programming languages are also suitable). Non-tree representations have been suggested and successfully implemented, such as linear genetic programming which suits the more traditional imperative languages [see, for example, Banzhaf et al. (1998)]. The commercial GP software Discipulus, uses AIM, automatic induction of binary machine code[5] to achieve better performance. µGP[6] uses directed multigraphs to generate programs that fully exploit the syntax of a given assembly language. A function represented as a tree structure. Genetic operators The main operators used in evolutionary algorithms such as GP are crossover and mutation. Crossover Crossover is applied on an individual by simply switching one of its nodes with another node from another individual in the population. With a tree-based representation, replacing a node means replacing the whole branch. This adds greater effectiveness to the crossover operator. The expressions resulting from crossover are very much different from their initial parents. Mutation Mutation affects an individual in the population. It can replace a whole node in the selected individual, or it can replace just the node's information. To maintain integrity, operations must be fail-safe or the type of information the node holds must be taken into account. For example, mutation must be aware of binary operation nodes, or the operator must be able to handle missing values. 
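As a small illustration of the tree representation and these operators (a sketch, not the article's code; the Node class, the operator set, and the helper names are assumptions), an arithmetic expression such as (x + 2) * x can be held as nested nodes and evaluated recursively, and subtree crossover then swaps a randomly chosen branch of one parent for a randomly chosen branch of the other:

import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Sketch of a GP expression tree: internal nodes are operators, leaves are
 *  terminals (the variable x or a numeric constant). */
class Node {
    final String op;                 // "+", "-", "*" for internal nodes; "x" or a number for leaves
    final Node left, right;          // both null for leaves

    Node(String op, Node left, Node right) {
        this.op = op;
        this.left = left;
        this.right = right;
    }

    static Node leaf(String symbol) {
        return new Node(symbol, null, null);
    }

    /** Recursive evaluation, as described for tree-shaped chromosomes above. */
    double eval(double x) {
        switch (op) {
            case "+": return left.eval(x) + right.eval(x);
            case "-": return left.eval(x) - right.eval(x);
            case "*": return left.eval(x) * right.eval(x);
            case "x": return x;
            default:  return Double.parseDouble(op);     // numeric constant
        }
    }
}

/** Sketch of subtree crossover and mutation on such trees. */
final class TreeOperators {
    private static final Random RAND = new Random();

    /** Crossover: copy parent1 with one random subtree replaced by a random
     *  subtree of parent2 (replacing a node replaces the whole branch).
     *  For brevity the donor subtree is shared rather than deep-copied. */
    static Node crossover(Node parent1, Node parent2) {
        return replaceRandomSubtree(parent1, randomSubtree(parent2));
    }

    /** Mutation: replace one random subtree with a freshly generated random tree. */
    static Node mutate(Node individual, int maxDepth) {
        return replaceRandomSubtree(individual, randomTree(maxDepth));
    }

    private static Node randomSubtree(Node root) {
        List<Node> nodes = new ArrayList<>();
        collect(root, nodes);
        return nodes.get(RAND.nextInt(nodes.size()));
    }

    private static void collect(Node n, List<Node> out) {
        if (n == null) return;
        out.add(n);
        collect(n.left, out);
        collect(n.right, out);
    }

    private static Node replaceRandomSubtree(Node root, Node donor) {
        List<Node> nodes = new ArrayList<>();
        collect(root, nodes);
        Node target = nodes.get(RAND.nextInt(nodes.size()));
        return copyWithReplacement(root, target, donor);
    }

    /** Rebuild the tree, substituting the donor at the chosen position. */
    private static Node copyWithReplacement(Node n, Node target, Node donor) {
        if (n == target) return donor;
        if (n.left == null) return Node.leaf(n.op);       // leaf: copy as-is
        return new Node(n.op,
                copyWithReplacement(n.left, target, donor),
                copyWithReplacement(n.right, target, donor));
    }

    private static Node randomTree(int depth) {
        if (depth == 0 || RAND.nextBoolean()) {
            return RAND.nextBoolean() ? Node.leaf("x")
                                      : Node.leaf(Integer.toString(RAND.nextInt(10)));
        }
        String[] ops = {"+", "-", "*"};
        return new Node(ops[RAND.nextInt(ops.length)],
                randomTree(depth - 1), randomTree(depth - 1));
    }
}

For example, new Node("*", new Node("+", Node.leaf("x"), Node.leaf("2")), Node.leaf("x")) represents (x + 2) * x; because every node is either a valid operator or a valid terminal, the integrity concern mentioned above is handled by construction in this tiny operator set.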
Other approaches The basic ideas of genetic programming have been modified and extended in a variety of ways: • Extended Compact Genetic Programming (ECGP) • Embedded Cartesian Genetic Programming (ECGP) • Probabilistic Incremental Program Evolution (PIPE) MOSES Meta-Optimizing Semantic Evolutionary Search (MOSES) is a meta-programming technique for evolving programs by iteratively optimizing genetic populations.[7] It has been shown to strongly outperform genetic and evolutionary program learning systems, and has been successfully applied to many real-world problems, including computational biology, sentiment evaluation, and agent control.[8] When applied to supervised classification problems, MOSES performs as well as, or better than support vector machines (SVM), while offering more insight into the structure of the data, as the resulting program demonstrates dependencies and is understandable in a way that a large vector of Genetic programming 114 numbers is not.[8] MOSES is able to out-perform standard GP systems for two important reasons. One is that it uses estimation of distribution algorithms (EDA) to determine the Markov blanket (that is, the dependencies in a Bayesian network) between different parts of a program. This quickly rules out pointless mutations that change one part of a program without making corresponding changes in other, related parts of the program. The other is that it performs reduction to reduce programs to normal form at each iteration stage, thus making programs smaller, more compact, faster to execute, and more human readable. Besides avoiding spaghetti code, normalization removes redundancies in programs, thus allowing smaller populations of less complex programs, speeding convergence. Meta-Genetic Programming Meta-Genetic Programming is the proposed meta learning technique of evolving a genetic programming system using genetic programming itself. It suggests that chromosomes, crossover, and mutation were themselves evolved, therefore like their real life counterparts should be allowed to change on their own rather than being determined by a human programmer. Meta-GP was formally proposed by Jürgen Schmidhuber in 1987,[9] but some earlier efforts may be considered instances of the same technique, including Doug Lenat's Eurisko. It is a recursive but terminating algorithm, allowing it to avoid infinite recursion. Critics of this idea often say this approach is overly broad in scope. However, it might be possible to constrain the fitness criterion onto a general class of results, and so obtain an evolved GP that would more efficiently produce results for sub-classes. This might take the form of a Meta evolved GP for producing human walking algorithms which is then used to evolve human running, jumping, etc. The fitness criterion applied to the Meta GP would simply be one of efficiency. For general problem classes there may be no way to show that Meta GP will reliably produce results more efficiently than a created algorithm other than exhaustion. The same holds for standard GP and other search algorithms. Implementations Possibly most used: • • • • ECJ - Evolutionary Computation/Genetic Programming research system [10] (Java) Lil-Gp [11] Genetic Programming System (C). 
Beagle - Open BEAGLE, a versatile EC framework [12] (C++ with STL) EO Evolutionary Computation Framework [13] (C++ with static polymorphism) Other: Implementation EvoJ JEF Robust Genetic Programming System GNU GPL Evolutionary computations framework Creative Commons Attribution-NonCommercial-ShareAlike 3.0 [19] License JAVA Evolution Framework GNU Lesser GPL [22] Genetic Programming C++ Class Library GNU GPL [23] A tiny genetic programming system. [16] [20] TinyGP [15] Apache License [18] GPC++ License Fork of ECJ for .NET 4.0 BraneCloud [14] Evolution RobGP Description [17] [17] Language C# C++ [21] Java Java C++ C and Java Genetic programming [24] GenPro deap [26] [27] pySTEP JAGA [31] JGAP Python Strongly Typed gEnetic Programming MIT License [21] [28] Modified PSF [30] Python Python Python C++ Framework for conducting experiments in Genetic Programming .NET [34] simple Genetic Programming research system Java [35] Java Genetic Algorithms and Genetic Programming, an open-source framework Java Java Genetic Algorithms and Genetic Programming (stack oriented) framework Java object oriented framework for solving genetic programming problems C++ Directed Ruby Programming, Genetic Programming & Grammatical Evolution Library Ruby [39] A Genetic Programming Toolbox for MATLAB MATLAB [40] Genetic Programming Tool for MATLAB. aimed at performing multigene symbolic regression MATLAB [41] Evolutionary Algorithms (GA + GP) Modules, Open Source Python [32] [36] [37] PMDGP [38] GPLAB GPTIPS PyRobot PerlGP [42] Discipulus GAlib GNU Lesser GPL Java A Genetic Programming Package with support for Automatically Defined Functions n-genes DRP Distributed Evolutionary Algorithms in Python [25] Java [33] DGPF Apache License 2.0 Extensible and pluggable open source API for implementing genetic algorithms and genetic programming applications RMIT GP GPE Reflective Object Oriented Genetic Programming. Open Source Framework. Extend with POJO's, generates plain Java code. [29] Pyevolve 115 Grammar-based genetic programming in Perl [43] [44] Java GALib LAGEP PushGP Groovy [46] GNU GPL [17] Perl Commercial Genetic Programming Software from RML Technologies, Inc Generates code in most high level languages [45] Object oriented framework with 4 different GA GAlib License implementations and 4 representation types (arbitrary derivations possible) C++ Source Forge open source Java genetic algorithm library, complete with Javadocs and examples (see bottom of page) Java [47] Supporting single/multiple population genetic programming to generate mathematical functions. Open Source, OpenMP used. [48] a strongly typed, stack-based genetic programming system that allows GP to manipulate its own code (auto-constructive evolution) [49] Groovy Java Genetic Programming GNU GPL [17] C/C++ Java / C++ / Javascript / Scheme / Clojure / Lisp GNU GPL [17] Java Genetic programming GEVA jGE [50] Grammatical Evolution in Java [51] Java Grammatical Evolution [52] Evolutionary Computation Framework. 
different genotypes, parallel algorithms, tutorial ECF JCLEC [53] Evolutionary Computation Library in Java, expression tree encoding, syntax tree encoding [54] Java GNU GPL v3 [17] Java C++ GNU GPL [17] Java A Paradigm-Independent and Extensible Environment for Heuristic Optimization, rich graphical user interface, open source, plugin-based architecture C# [55] Strong typing and lambda abstractions Haskell [56] a small, one source file implementation of GE, with an interactive graphics demo application GNU GPL v3 General purpose tool, mostly exploited for assembly language generation GNU GPL HeuristicLab PolyGP 116 PonyGE MicroGP (uGP) [57] [17] [17] Python C++ NB. You should check the license and copyright terms on the program/library website before use. References and notes [1] [2] [3] [4] [5] [6] [7] [8] [9] Nichael Cramer's HomePage (http:/ / www. sover. net/ ~nichael/ nlc-publications/ icga85/ index. html) genetic-programming.com-Home-Page (http:/ / www. genetic-programming. com/ ) humancompetitive (http:/ / www. genetic-programming. com/ humancompetitive. html) Cramer, 1985 (http:/ / www. sover. net/ ~nichael/ nlc-publications/ icga85/ index. html) (Peter Nordin, 1997, Banzhaf et al., 1998, Section 11.6.2-11.6.3) MicroGP page on SourceForge, complete with tutorials and wiki (http:/ / ugp3. sourceforge. net) OpenCog MOSES (http:/ / wiki. opencog. org/ w/ Meta-Optimizing_Semantic_Evolutionary_Search) Moshe Looks (2006), Competent Program Learning (http:/ / metacog. org/ doc. html), PhD Thesis, 1987 THESIS ON LEARNING HOW TO LEARN, METALEARNING, META GENETIC PROGRAMMING, CREDIT-CONSERVING MACHINE LEARNING ECONOMY (http:/ / www. idsia. ch/ ~juergen/ diploma. html) [10] http:/ / cs. gmu. edu/ ~eclab/ projects/ ecj/ [11] [12] [13] [14] [15] [16] [17] [18] [19] [20] [21] [22] [23] [24] [25] [26] [27] [28] [29] [30] [31] [32] [33] http:/ / garage. cse. msu. edu/ software/ lil-gp/ http:/ / beagle. sf. net/ http:/ / eodev. sourceforge. net/ http:/ / branecloud. codeplex. com http:/ / branecloud. codeplex. com/ license http:/ / robgp. sourceforge. net/ about. php http:/ / www. gnu. org/ licenses/ gpl. html http:/ / evoj-frmw. appspot. com/ http:/ / creativecommons. org/ licenses/ by-nc-sa/ 3. 0/ legalcode http:/ / spl. utko. feec. vutbr. cz/ component/ content/ article/ 258-jef-java-evolution-framework?lang=en http:/ / www. gnu. org/ licenses/ lgpl. html http:/ / www. cs. ucl. ac. uk/ staff/ W. Langdon/ ftp/ weinbenner/ gp. html http:/ / cswww. essex. ac. uk/ staff/ sml/ gecco/ TinyGP. html http:/ / code. google. com/ p/ genpro/ http:/ / www. apache. org/ licenses/ LICENSE-2. 0 http:/ / code. google. com/ p/ deap/ http:/ / pystep. sourceforge. net/ http:/ / www. opensource. org/ licenses/ MIT http:/ / pyevolve. sourceforge. net/ http:/ / pyevolve. sourceforge. net/ license. html http:/ / www. jaga. org http:/ / goanna. cs. rmit. edu. au/ ~vc/ rmitgp/ http:/ / gpe. sourceforge. net/ Genetic programming [34] [35] [36] [37] [38] [39] [40] [41] [42] [43] [44] [45] [46] [47] [48] [49] [50] [51] [52] [53] [54] [55] http:/ / dgpf. sourceforge. net/ http:/ / jgap. sourceforge. net http:/ / cui. unige. ch/ spc/ tools/ n-genes/ http:/ / pmdgp. sourceforge. net/ http:/ / drp. rubyforge. org http:/ / gplab. sourceforge. net http:/ / sites. google. com/ site/ gptips4matlab/ http:/ / emergent. brynmawr. edu/ pyro/ ?page=PyroModuleEvolutionaryAlgorithms http:/ / perlgp. org http:/ / www. rmltech. com http:/ / lancet. mit. edu/ ga/ http:/ / lancet. mit. edu/ ga/ Copyright. html http:/ / www. softtechdesign. 
com/ GA/ EvolvingABetterSolution-GA. html http:/ / www. cis. nctu. edu. tw/ ~gis91815/ lagep/ lagep. html http:/ / hampshire. edu/ lspector/ push. html http:/ / jgprog. sourceforge. net/ http:/ / ncra. ucd. ie/ geva/ http:/ / www. bangor. ac. uk/ ~eep201/ jge/ http:/ / gp. zemris. fer. hr/ ecf/ http:/ / jclec. sourceforge. net/ http:/ / dev. heuristiclab. com/ http:/ / darcs. haskell. org/ nofib/ real/ PolyGP/ [56] http:/ / code. google. com/ p/ ponyge/ [57] http:/ / ugp3. sourceforge. net/ Bibliography • Banzhaf, W., Nordin, P., Keller, R.E., and Francone, F.D. (1998), Genetic Programming: An Introduction: On the Automatic Evolution of Computer Programs and Its Applications, Morgan Kaufmann • Barricelli, Nils Aall (1954), Esempi numerici di processi di evoluzione, Methodos, pp. 45–68. • Brameier, M. and Banzhaf, W. (2007), Linear Genetic Programming, Springer, New York • Crosby, Jack L. (1973), Computer Simulation in Genetics, John Wiley & Sons, London. • Cramer, Nichael Lynn (1985), " A representation for the Adaptive Generation of Simple Sequential Programs (http://www.sover.net/~nichael/nlc-publications/icga85/index.html)" in Proceedings of an International Conference on Genetic Algorithms and the Applications, Grefenstette, John J. (ed.), Carnegie Mellon University • Fogel, David B. (2000) Evolutionary Computation: Towards a New Philosophy of Machine Intelligence IEEE Press, New York. • Fogel, David B. (editor) (1998) Evolutionary Computation: The Fossil Record, IEEE Press, New York. • Forsyth, Richard (1981), BEAGLE A Darwinian Approach to Pattern Recognition (http://www.cs.bham.ac. uk/~wbl/biblio/gp-html/kybernetes_forsyth.html) Kybernetes, Vol. 10, pp. 159–166. • Fraser, Alex S. (1957), Simulation of Genetic Systems by Automatic Digital Computers. I. Introduction. Australian Journal of Biological Sciences vol. 10 484-491. • Fraser, Alex and Donald Burnell (1970), Computer Models in Genetics, McGraw-Hill, New York. • Holland, John H (1975), Adaptation in Natural and Artificial Systems, University of Michigan Press, Ann Arbor • Korns, Michael (2007), Large-Scale, Time-Constrained, Symbolic Regression-Classification, in Genetic Programming Theory and Practice V. Springer, New York. • Korns, Michael (2009), Symbolic Regression of Conditional Target Expressions, in Genetic Programming Theory and Practice VII. Springer, New York. • Korns, Michael (2010), Abstract Expression Grammar Symbolic Regression, in Genetic Programming Theory and Practice VIII. Springer, New York. • Koza, J.R. (1990), Genetic Programming: A Paradigm for Genetically Breeding Populations of Computer Programs to Solve Problems, Stanford University Computer Science Department technical report STAN-CS-90-1314 (http://www.genetic-programming.com/jkpdf/tr1314.pdf). A thorough report, possibly 117 Genetic programming • • • • • • • • used as a draft to his 1992 book. Koza, J.R. (1992), Genetic Programming: On the Programming of Computers by Means of Natural Selection, MIT Press Koza, J.R. (1994), Genetic Programming II: Automatic Discovery of Reusable Programs, MIT Press Koza, J.R., Bennett, F.H., Andre, D., and Keane, M.A. (1999), Genetic Programming III: Darwinian Invention and Problem Solving, Morgan Kaufmann Koza, J.R., Keane, M.A., Streeter, M.J., Mydlowec, W., Yu, J., Lanza, G. (2003), Genetic Programming IV: Routine Human-Competitive Machine Intelligence, Kluwer Academic Publishers Langdon, W. B., Genetic Programming and Data Structures, Springer ISBN 0-7923-8135-1 (http://www. amazon.com/dp/0792381351/) Langdon, W. 
B., Poli, R. (2002), Foundations of Genetic Programming, Springer-Verlag ISBN 3-540-42451-2 (http://www.amazon.com/dp/3540424512/) Nordin, J.P., (1997) Evolutionary Program Induction of Binary Machine Code and its Application. Krehl Verlag, Muenster, Germany. Poli, R., Langdon, W. B., McPhee, N. F. (2008). A Field Guide to Genetic Programming. Lulu.com, freely available from the internet (http://www.gp-field-guide.org.uk/). ISBN 978-1-4092-0073-4. • Rechenberg, I. (1971): Evolutionsstrategie - Optimierung technischer Systeme nach Prinzipien der biologischen Evolution (PhD thesis). Reprinted by Fromman-Holzboog (1973). • Schmidhuber, J. (1987). Evolutionary principles in self-referential learning. (On learning how to learn: The meta-meta-... hook.) Diploma thesis, Institut f. Informatik, Tech. Univ. Munich. • Smith, S.F. (1980), A Learning System Based on Genetic Adaptive Algorithms, PhD dissertation (University of Pittsburgh) • Smith, Jeff S. (2002), Evolving a Better Solution (http://www.softtechdesign.com/GA/ EvolvingABetterSolution-GA.html), Developers Network Journal, March 2002 issue • Shu-Heng Chen et al. (2008), Genetic Programming: An Emerging Engineering Tool,International Journal of Knowledge-based Intelligent Engineering System, 12(1): 1-2, 2008. • Weise, T, Global Optimization Algorithms: Theory and Application (http://www.it-weise.de/projects/book. pdf), 2008 External links • Riccardo Poli, William B. Langdon,Nicholas F. McPhee, John R. Koza, " A Field Guide to Genetic Programming (http://cswww.essex.ac.uk/staff/poli/gp-field-guide/index.html)" (2008) • DigitalBiology.NET (http://www.digitalbiology.net/) Vertical search engine for GA/GP resources • Aymen S Saket & Mark C Sinclair (http://web.archive.org/web/20070813222058/http://uk.geocities.com/ markcsinclair/abstracts.html#pro00a/) • The Hitch-Hiker's Guide to Evolutionary Computation (http://www.etsimo.uniovi.es/ftp/pub/EC/FAQ/ www/) • GP bibliography (http://www.cs.bham.ac.uk/~wbl/biblio/README.html) • People who work on GP (http://www.cs.ucl.ac.uk/staff/W.Langdon/homepages.html) 118 Gene expression programming Gene expression programming Gene Expression Programming (GEP) is an evolutionary algorithm that evolves populations of computer programs in order to solve a user-defined problem. GEP has similarities, but is distinct from, the evolutionary computational method of genetic programming. In genetic programming the individuals comprising a population are typically symbolic expression trees; however, the individuals comprising a population of GEP are encoded as linear chromosomes, which are then translated into expression trees. The important difference is that the recombination operators of genetic programming operate directly on the tree structure (e.g. swapping sub-trees), whereas the recombination operators of gene expression programming operate directly on the linear encoding (i.e. before it is translated into a tree). As such, after recombination, the modified portions of the resulting expression trees often bear little semblance to their direct ancestors. The expression trees are themselves computer programs evolved to solve a particular problem and are selected according to their performance/fitness in solving the problem at hand. After repeated iteration, populations of such computer programs will ideally discover new traits and become better adapted to a particular selection environment. The desired endpoint of the algorithm is that a good solution has been evolved by the evolutionary process. 
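The translation from a linear chromosome to an expression tree described above can be illustrated with a small, non-authoritative sketch. The gene string, function set and input values below are made up for the example and are not taken from Ferreira's implementation; the point is only to show the breadth-first (Karva-style) reading of a linear genotype into a tree phenotype.

import operator

# Toy GEP-style translation: read a linear gene breadth-first into an
# expression tree, then evaluate the tree.
OPS = {'+': operator.add, '-': operator.sub, '*': operator.mul, '/': operator.truediv}
ARITY = {s: 2 for s in OPS}                 # terminals (a, b, c, ...) have arity 0

def translate(gene):
    """Build (symbol, children) nodes level by level from the linear gene."""
    nodes = [(sym, []) for sym in gene]
    index = 1                               # next unused symbol in the gene
    for sym, children in nodes:
        for _ in range(ARITY.get(sym, 0)):  # terminals take no arguments
            children.append(nodes[index])
            index += 1
        if index >= len(nodes):
            break
    return nodes[0]                         # root of the expression tree

def evaluate(node, env):
    sym, children = node
    if sym in OPS:
        left, right = (evaluate(c, env) for c in children)
        return OPS[sym](left, right)
    return env[sym]                         # terminal: look up its value

tree = translate("+a*bc")                   # encodes a + (b * c)
print(evaluate(tree, {'a': 1.0, 'b': 2.0, 'c': 3.0}))   # -> 7.0

Because genetic operators act on the flat string ("+a*bc") rather than on the tree, any recombination of two such strings still translates into a valid tree, which is the property the article attributes to the linear encoding.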
Cândida Ferreira, the inventor of the technique, claims that GEP significantly surpasses the traditional genetic programming approach for a number of benchmark problems. She attributes the alleged speed increase to the separate genotype/phenotype representation and the inherently multigenic organization of GEP chromosomes. For further details of GEP see the GEP paper [1] published in Complex Systems, where the algorithm is described and applied to a set of problems including symbolic regression, Boolean concept learning, and cellular automata. Further reading • Ferreira, Cândida (2006). Gene Expression programming: mathematical modeling by an artificial intelligence. Springer-Verlag. ISBN 3-540-32796-7. "Online Edition ISBN 978-3-540-32849-0" • Ferreira, C. (2002). Gene Expression Programming: Mathematical Modeling by an Artificial Intelligence [2]. Portugal: Angra do Heroismo. ISBN 9729589054. References • GEP home page [3] References [1] http:/ / www. gene-expression-programming. com/ webpapers/ gep. pdf [2] http:/ / www. gene-expression-programming. com/ GepBook/ Introduction. htm [3] http:/ / www. gene-expression-programming. com/ 119 Grammatical evolution Grammatical evolution Grammatical evolution is a relatively new evolutionary computation technique pioneered by Conor Ryan, JJ Collins and Michael O'Neill in 1998[1] at the BDS Group [2] in the University of Limerick. It is related to the idea of genetic programming in that the objective is to find an executable program or program fragment, that will achieve a good fitness value for the given objective function. In most published work on Genetic Programming, a LISP-style tree-structured expression is directly manipulated, whereas Grammatical Evolution applies genetic operators to an integer string, subsequently mapped to a program (or similar) through the use of a grammar. One of the benefits of GE is that this mapping simplifies the application of search to different programming languages and other structures. Problem addressed In type-free conventional, Koza/Cramer-style GP, the function set must meet the requirement of closure: all functions must be capable of accepting as their arguments the output of all other functions in the function set. Usually, this is implemented by dealing with a single data-type such as double-precision floating point. Whilst modern Genetic Programming frameworks supporting typing, such type-systems have limitations that Grammatical Evolution does not suffer from. GE's solution GE offers a solution to this issue by evolving solutions according to a user-specified grammar (usually a grammar in Backus-Naur form). Therefore the search space can be restricted, and domain knowledge of the problem can be incorporated. The inspiration for this approach comes from a desire to separate the "genotype" from the "phenotype": in GP, the objects the search algorithm operates on and what the fitness evaluation function interprets are one and the same. In contrast, GE's "genotypes" are ordered lists of integers which code for selecting rules from the provided context-free grammar. The phenotype, however, is the same as in Koza/Cramer-style GP: a tree-like structure that is evaluated recursively. This is more in line with how genetics work in nature, where there is a separation between an organism's genotype and that expression in proteins and the like. GE has a modular approach to it. In particular, the search portion of the GE paradigm needn't be carried out by any one particular algorithm or method. 
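The genotype-to-phenotype mapping just described can be made concrete with a short illustrative sketch. The grammar, the codon values and the wrapping limit below are toy assumptions, not part of any reference implementation; the essential idea shown is that each integer codon selects a production for the leftmost non-terminal via the modulo rule.

# Illustrative GE-style mapping: integer codons choose productions from a
# BNF-like grammar; the leftmost non-terminal is always expanded first.
GRAMMAR = {
    "<expr>": [["<expr>", "<op>", "<expr>"], ["<var>"]],
    "<op>":   [["+"], ["-"], ["*"]],
    "<var>":  [["x"], ["y"]],
}

def ge_map(genome, start="<expr>", max_wraps=2):
    symbols = [start]
    i, used = 0, 0
    while any(s in GRAMMAR for s in symbols):
        if used >= len(genome) * (max_wraps + 1):
            return None                       # mapping failed: ran out of codons
        j = next(k for k, s in enumerate(symbols) if s in GRAMMAR)
        options = GRAMMAR[symbols[j]]
        choice = options[genome[i % len(genome)] % len(options)]
        symbols[j:j + 1] = choice             # rewrite the non-terminal in place
        i, used = i + 1, used + 1
    return "".join(symbols)

print(ge_map([6, 3, 12, 8, 5, 2]))            # -> "x*x" for this toy genome

Any search method that produces lists of integers (a genetic algorithm, particle swarm optimization, and so on) can drive such a mapping, which is the modularity the text refers to.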
Observe that the objects GE performs search on are the same as that used in genetic algorithms. This means, in principle, that any existing genetic algorithm package, such as the popular GAlib [44] , can be used to carry out the search, and a developer implementing a GE system need only worry about carrying out the mapping from list of integers to program tree. It is also in principle possible to perform the search using some other method, such as particle swarm optimization (see the remark below); the modular nature of GE creates many opportunities for hybrids as the problem of interest to be solved dictates. Brabazon and O'Neill have successfully applied GE to predicting corporate bankruptcy, forecasting stock indices, bond credit ratings, and other financial applications. It is possible to structure a GE grammar that for a given function/terminal set is equivalent to genetic programming. 120 Grammatical evolution Criticism Despite its successes, GE has been the subject of some criticism. One issue is that as a result of its mapping operation, GE's genetic operators do not achieve high locality[3][4] which is a highly regarded property of genetic operators in evolutionary algorithms.[3] Variants Although GE is fairly new, there are already enhanced versions and variants that have been worked out. GE researchers have experimented with using particle swarm optimization to carry out the searching instead of genetic algorithms with results comparable to that of normal GE; this is referred to as a "grammatical swarm"; using only the basic PSO model it has been found that PSO is probably equally capable of carrying out the search process in GE as simple genetic algorithms are. (Although PSO is normally a floating-point search paradigm, it can be discretized, e.g., by simply rounding each vector to the nearest integer, for use with GE.) Yet another possible variation that has been experimented with in the literature is attempting to encode semantic information in the grammar in order to further bias the search process. Notes [1] [2] [3] [4] http:/ / www. grammaticalevolution. org/ eurogp98. ps http:/ / bds. ul. ie http:/ / www. springerlink. com/ content/ 0125627h52766534/ http:/ / www. cs. kent. ac. uk/ pubs/ 2010/ 3004/ index. html Resources • An Open Source C++ implementation (http://www.grammaticalevolution.org/libGE) of GE was funded by the Science Foundation of Ireland (http://www.sfi.ie). • Grammatical Evolution Tutorial (http://www.grammaticalevolution.org/tutorial.pdf). • Grammatical Evolution in Java (http://ncra.ucd.ie/geva). • jGE - Java Grammatical Evolution (http://www.bangor.ac.uk/~eep201/jge). • The Biocomputing and Developmental Systems (BDS) Group (http://bds.ul.ie) at the University of Limerick (http://www.ul.ie). • Michael O'Neill's Grammatical Evolution Page (http://www.grammatical-evolution.org), including a bibliography. • DRP (http://drp.rubyforge.org/), Directed Ruby Programming, is an experimental system designed to let users create hybrid GE/GP systems. It is implemented in pure Ruby. • GERET (http://geret.org/), Grammatical Evolution Ruby Exploratory Toolkit. 121 Grammar induction Grammar induction Grammatical induction, also known as grammatical inference or syntactic pattern recognition, refers to the process in machine learning of learning a formal grammar (usually in the form of re-write rules or productions) from a set of observations, thus constructing a model which accounts for the characteristics of the observed objects. 
Grammatical inference is distinguished from traditional decision rules and other such methods principally by the nature of the resulting model, which in the case of grammatical inference relies heavily on hierarchical substitutions. Whereas a traditional decision rule set is geared toward assessing object classification, a grammatical rule set is geared toward the generation of examples. In this sense, the grammatical induction problem can be said to seek a generative model, while the decision rule problem seeks a descriptive model. Methodologies There are a wide variety of methods for grammatical inference. Two of the classic sources are Fu (1977) and Fu (1982). Duda, Hart & Stork (2001) also devote a brief section to the problem, and cite a number of references. The basic trial-and-error method they present is discussed below. Grammatical inference by trial-and-error The method proposed in Section 8.7 of Duda, Hart & Stork (2001) suggests successively guessing grammar rules (productions) and testing them against positive and negative observations. The rule set is expanded so as to be able to generate each positive example, but if a given rule set also generates a negative example, it must be discarded. This particular approach can be characterized as "hypothesis testing" and bears some similarity to Mitchel's version space algorithm. The Duda, Hart & Stork (2001) text provide a simple example which nicely illustrates the process, but the feasibility of such an unguided trial-and-error approach for more substantial problems is dubious. Grammatical inference by genetic algorithms Grammatical Induction using evolutionary algorithms is the process of evolving a representation of the grammar of a target language through some evolutionary process. Formal grammars can easily be represented as a tree structure of production rules that can be subjected to evolutionary operators. Algorithms of this sort stem from the genetic programming paradigm pioneered by John Koza. Other early work on simple formal languages used the binary string representation of genetic algorithms, but the inherently hierarchical structure of grammars couched in the EBNF language made trees a more flexible approach. Koza represented Lisp programs as trees. He was able to find analogues to the genetic operators within the standard set of tree operators. For example, swapping sub-trees is equivalent to the corresponding process of genetic crossover, where sub-strings of a genetic code are transplanted into an individual of the next generation. Fitness is measured by scoring the output from the functions of the lisp code. Similar analogues between the tree structured lisp representation and the representation of grammars as trees, made the application of genetic programming techniques possible for grammar induction. In the case of Grammar Induction, the transplantation of sub-trees corresponds to the swapping of production rules that enable the parsing of phrases from some language. The fitness operator for the grammar is based upon some measure of how well it performed in parsing some group of sentences from the target language. In a tree representation of a grammar, a terminal symbol of a production rule corresponds to a leaf node of the tree. Its parent nodes corresponds to a non-terminal symbol (e.g. a noun phrase or a verb phrase) in the rule set. Ultimately, the root node might correspond to a sentence non-terminal. 
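How a candidate grammar can be scored against positive and negative observations, as in the hypothesis-testing and evolutionary approaches above, can be sketched roughly as follows. The grammar encoding, the depth/length bounds and the scoring rule are simplifications chosen for illustration; they are not taken from the cited texts.

# Toy scoring of a candidate grammar against positive/negative example strings.
# A grammar is a dict: non-terminal -> list of productions (tuples of symbols).
def generates(grammar, target, start="S", depth=8):
    """True if the grammar can derive `target` within a small derivation bound."""
    def expand(form, d):
        if d > depth or len(form) > len(target) + 2:
            return False
        nts = [i for i, s in enumerate(form) if s in grammar]
        if not nts:
            return "".join(form) == target
        i = nts[0]
        return any(expand(form[:i] + list(p) + form[i + 1:], d + 1)
                   for p in grammar[form[i]])
    return expand([start], 0)

def fitness(grammar, positives, negatives):
    """Reward covered positives, punish covered negatives (hypothesis testing)."""
    return (sum(generates(grammar, s) for s in positives)
            - sum(generates(grammar, s) for s in negatives))

# Candidate hypothesis for the language { a^n b^n } :  S -> aSb | ab
candidate = {"S": [("a", "S", "b"), ("a", "b")]}
print(fitness(candidate, positives=["ab", "aabb", "aaabbb"], negatives=["abb", "ba"]))  # -> 3

In an evolutionary setting such a score would serve as the fitness of an individual grammar, while crossover and mutation act on its tree of production rules.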
Grammatical inference by greedy algorithms
Like all greedy algorithms, greedy grammar inference algorithms make, in an iterative manner, the decision that seems best at that stage. These decisions usually concern matters such as creating a new rule, removing an existing rule, choosing which rule to apply, or merging existing rules. Because there are several ways to define 'the stage' and 'the best', there are also several greedy grammar inference algorithms.
These context-free grammar generating algorithms make the decision after every read symbol:
• The Lempel-Ziv-Welch algorithm creates a context-free grammar in a deterministic way such that only the start rule of the generated grammar needs to be stored.
• Sequitur and its modifications.
These context-free grammar generating algorithms first read the whole given symbol sequence and then start to make decisions:
• Byte pair encoding and its optimizations.
Applications
The principle of grammar induction has been applied to other aspects of natural language processing, and has been applied (among many other problems) to morpheme analysis and even place-name derivations. Grammar induction has also been used for lossless data compression and statistical inference via MML and MDL principles.
References
• Duda, Richard O.; Hart, Peter E.; Stork, David G. (2001), Pattern Classification [1], New York: John Wiley & Sons
• Fu, King Sun (1982), Syntactic Pattern Recognition and Applications, Englewood Cliffs, NJ: Prentice-Hall
• Fu, King Sun (1977), Syntactic Pattern Recognition, Applications, Berlin: Springer-Verlag
• Horning, James Jay (1969), A Study of Grammatical Inference [2] (Ph.D. Thesis ed.), Stanford: Stanford University Computer Science Department
• Gold, E Mark (1967), Language Identification in the Limit, Information and Control
References
[1] http://www.wiley.com/WileyCDA/WileyTitle/productCd-0471056693.html
[2] http://proquest.umi.com/pqdlink?Ver=1&Exp=05-16-2013&FMT=7&DID=757518381&RQT=309&attempt=1&cfc=1
Java Grammatical Evolution
Java Grammatical Evolution
jGE Library
The jGE Library is an implementation of Grammatical Evolution in the Java programming language. It was the first published implementation of Grammatical Evolution in this language [1]. Today another well-known published Java implementation exists, named GEVA [2]. GEVA was developed at UCD's Natural Computing Research & Applications group under the guidance of one of the inventors of Grammatical Evolution, Dr. Michael O'Neill. [3]
The jGE Library aims to provide not only an implementation of Grammatical Evolution, but also a free, open-source, and extendable framework for experimentation in the area of evolutionary computation. Namely, it supports the implementation (through additions and extensions) of any evolutionary algorithm[4]. Furthermore, its extendable architecture and design facilitate the implementation and incorporation of new experimental implementations inspired by natural evolution and biology [5]. The jGE Library binary file, the source code, the documentation, and an extension for the NetLogo modeling environment [6], named the jGE NetLogo extension, can be downloaded from the jGE Official Web Site [7].
Web Site
jGE Official Web Site [7]
License
The jGE Library is free software released under the GNU General Public License v3 [8].
jGE Publications
• Georgiou, L. and Teahan, W. J. (2006a) "jGE - A Java implementation of Grammatical Evolution".
10th WSEAS International Conference on Systems, Athens, Greece, July 10–15, 2006.
• Georgiou, L. and Teahan, W. J. (2006b) "Implication of Prior Knowledge and Population Thinking in Grammatical Evolution: Toward a Knowledge Sharing Architecture". WSEAS Transactions on Systems 5 (10), 2338-2345.
• Georgiou, L. and Teahan, W. J. (2008) "Experiments with Grammatical Evolution in Java". Knowledge-Driven Computing: Knowledge Engineering and Intelligent Computations, Studies in Computational Intelligence (vol. 102), 45-62. Berlin, Germany: Springer Berlin / Heidelberg.
References
[1] Georgiou, L. and Teahan, W. J. (2006a) "jGE - A Java implementation of Grammatical Evolution". 10th WSEAS International Conference on Systems, Athens, Greece, July 10–15, 2006.
[2] http://ncra.ucd.ie/Site/GEVA.html
[3] http://www.csi.ucd.ie/users/michael-oneill
[4] Georgiou, L. and Teahan, W. J. (2008) "Experiments with Grammatical Evolution in Java". Knowledge-Driven Computing: Knowledge Engineering and Intelligent Computations, Studies in Computational Intelligence (vol. 102), 45-62. Berlin, Germany: Springer Berlin / Heidelberg.
[5] Georgiou, L. and Teahan, W. J. (2006b) "Implication of Prior Knowledge and Population Thinking in Grammatical Evolution: Toward a Knowledge Sharing Architecture". WSEAS Transactions on Systems 5 (10), 2338-2345.
[6] http://ccl.northwestern.edu/netlogo
[7] http://www.bangor.ac.uk/~eep201/jge
[8] http://www.gnu.org/licenses
Linear genetic programming
Linear genetic programming
"Linear genetic programming" is unrelated to "linear programming".
Linear Genetic Programming (LGP) is a particular subset of genetic programming wherein the computer programs in a population are represented as a sequence of instructions from an imperative programming language or machine language. The graph-based data flow that results from the multiple usage of register contents, and the existence of structurally noneffective code (introns), are two main differences from the more common tree-based genetic programming (TGP) variant.[1][2][3]
Examples of LGP programs
Because LGP programs are basically represented by a linear sequence of instructions, they are simpler to read and to operate on than their tree-based counterparts. For example, a simple program written in the LGP language Slash/A [4] looks like a series of instructions separated by a slash:
input/   # gets an input from user and saves it to register F
0/       # sets register I = 0
save/    # saves content of F into data vector D[I] (i.e. D[0] := F)
input/   # gets another input, saves to F
add/     # adds to F the current data pointed to by I (i.e. F := F + D[0])
output/. # outputs result from F
By representing such code in bytecode format, i.e. as an array of bytes each representing a different instruction, one can make mutation operations simply by changing an element of such an array.
See also Cartesian genetic programming
Notes
[1] Brameier, M.: "On linear genetic programming (https://eldorado.uni-dortmund.de/handle/2003/20098)", Dortmund, 2003
[2] W. Banzhaf, P. Nordin, R. Keller, F. Francone, "Genetic Programming – An Introduction. On the Automatic Evolution of Computer Programs and its Application", Morgan Kaufmann, Heidelberg/San Francisco, 1998
[3] Poli, R., Langdon, W. B., McPhee, N. F. (2008). A Field Guide to Genetic Programming. Lulu.com, freely available from the internet. ISBN 978-1-4092-0073-4.
[4] http://github.
com/ arturadib/ slash-a External links • Slash/A (http://github.com/arturadib/slash-a) A programming language and C++ library specifically designed for linear GP • DigitalBiology.NET (http://www.digitalbiology.net/) Vertical search engine for GA/GP resources • Discipulus (http://www.aimlearning.com/) Genetic-Programming Software • (http://www.genetic-programming.org) 125 Evolutionary programming Evolutionary programming Evolutionary programming is one of the four major evolutionary algorithm paradigms. It is similar to genetic programming, but the structure of the program to be optimized is fixed, while its numerical parameters are allowed to evolve. It was first used by Lawrence J. Fogel in the US in 1960 in order to use simulated evolution as a learning process aiming to generate artificial intelligence. Fogel used finite state machines as predictors and evolved them. Currently evolutionary programming is a wide evolutionary computing dialect with no fixed structure or (representation), in contrast with some of the other dialects. It is becoming harder to distinguish from evolutionary strategies. Its main variation operator is mutation; members of the population are viewed as part of a specific species rather than members of the same species therefore each parent generates an offspring, using a (μ + μ) survivor selection. References • Fogel, L.J., Owens, A.J., Walsh, M.J. (1966), Artificial Intelligence through Simulated Evolution, John Wiley. • Fogel, L.J. (1999), Intelligence through Simulated Evolution : Forty Years of Evolutionary Programming, John Wiley. • Eiben, A.E., Smith, J.E. (2003), Introduction to Evolutionary Computing [1], Springer [2]. ISBN 3-540-40184-9 External links • The Hitch-Hiker's Guide to Evolutionary Computation: What's Evolutionary Programming (EP)? [3] • Evolutionary Programming by Jason Brownlee (PhD) [4] References [1] [2] [3] [4] http:/ / www. cs. vu. nl/ ~gusz/ ecbook/ ecbook. html http:/ / www. springer. de http:/ / www. aip. de/ ~ast/ EvolCompFAQ/ Q1_2. htm http:/ / www. cleveralgorithms. com/ nature-inspired/ evolution/ evolutionary_programming. html 126 Gaussian adaptation Gaussian adaptation Gaussian adaptation (GA) is an evolutionary algorithm designed for the maximization of manufacturing yield due to statistical deviation of component values of signal processing systems. In short, GA is a stochastic adaptive process where a number of samples of an n-dimensional vector x[xT = (x1, x2, ..., xn)] are taken from a multivariate Gaussian distribution, N(m, M), having mean m and moment matrix M. The samples are tested for fail or pass. The first- and second-order moments of the Gaussian restricted to the pass samples are m* and M*. The outcome of x as a pass sample is determined by a function s(x), 0 < s(x) < q ≤ 1, such that s(x) is the probability that x will be selected as a pass sample. The average probability of finding pass samples (yield) is Then the theorem of GA states: For any s(x) and for any value of P < q, there always exist a Gaussian p. d. f. that is adapted for maximum dispersion. The necessary conditions for a local optimum are m = m* and M proportional to M*. The dual problem is also solved: P is maximized while keeping the dispersion constant (Kjellström, 1991). Proofs of the theorem may be found in the papers by Kjellström, 1970, and Kjellström & Taxén, 1981. Since dispersion is defined as the exponential of entropy/disorder/average information it immediately follows that the theorem is valid also for those concepts. 
Altogether, this means that Gaussian adaptation may carry out a simultateous maximisation of yield and average information (without any need for the yield or the average information to be defined as criterion functions). The theorem is valid for all regions of acceptability and all Gaussian distributions. It may be used by cyclic repetition of random variation and selection (like the natural evolution). In every cycle a sufficiently large number of Gaussian distributed points are sampled and tested for membership in the region of acceptability. The centre of gravity of the Gaussian, m, is then moved to the centre of gravity of the approved (selected) points, m*. Thus, the process converges to a state of equilibrium fulfilling the theorem. A solution is always approximate because the centre of gravity is always determined for a limited number of points. It was used for the first time in 1969 as a pure optimization algorithm making the regions of acceptability smaller and smaller (in analogy to simulated annealing, Kirkpatrick 1983). Since 1970 it has been used for both ordinary optimization and yield maximization. Natural evolution and Gaussian adaptation It has also been compared to the natural evolution of populations of living organisms. In this case s(x) is the probability that the individual having an array x of phenotypes will survive by giving offspring to the next generation; a definition of individual fitness given by Hartl 1981. The yield, P, is replaced by the mean fitness determined as a mean over the set of individuals in a large population. Phenotypes are often Gaussian distributed in a large population and a necessary condition for the natural evolution to be able to fulfill the theorem of Gaussian adaptation, with respect to all Gaussian quantitative characters, is that it may push the centre of gravity of the Gaussian to the centre of gravity of the selected individuals. This may be accomplished by the Hardy–Weinberg law. This is possible because the theorem of Gaussian adaptation is valid for any region of acceptability independent of the structure (Kjellström, 1996). In this case the rules of genetic variation such as crossover, inversion, transposition etcetera may be seen as random number generators for the phenotypes. So, in this sense Gaussian adaptation may be seen as a genetic algorithm. 127 Gaussian adaptation How to climb a mountain Mean fitness may be calculated provided that the distribution of parameters and the structure of the landscape is known. The real landscape is not known, but figure below shows a fictitious profile (blue) of a landscape along a line (x) in a room spanned by such parameters. The red curve is the mean based on the red bell curve at the bottom of figure. It is obtained by letting the bell curve slide along the x-axis, calculating the mean at every location. As can be seen, small peaks and pits are smoothed out. Thus, if evolution is started at A with a relatively small variance (the red bell curve), then climbing will take place on the red curve. The process may get stuck for millions of years at B or C, as long as the hollows to the right of these points remain, and the mutation rate is too small. If the mutation rate is sufficiently high, the disorder or variance may increase and the parameter(s) may become distributed like the green bell curve. Then the climbing will take place on the green curve, which is even more smoothed out. 
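Before turning to the brain model in detail, the adaptation rules from the "Computer simulation of Gaussian adaptation" section above, m(i + 1) = (1 – a) m(i) + ax and W(i + 1) = (1 – b) W(i) + bygT, can be sketched numerically. The region of acceptability, the constants a and b, and the dispersion factor μ below are arbitrary illustrative choices, not values recommended by Kjellström.

import numpy as np

rng = np.random.default_rng(1)

def acceptable(x):
    # Arbitrary toy region of acceptability: a unit disc centred at (2, 1).
    return np.sum((x - np.array([2.0, 1.0])) ** 2) < 1.0

n = 2
m = np.zeros(n)                  # centre of gravity of the Gaussian
W = np.eye(n)                    # "square root" of the moment matrix, M = W W^T
a, b, mu = 0.1, 0.01, 1.05       # adaptation constants and dispersion increase factor

for _ in range(20000):
    g = rng.standard_normal(n) * mu        # Gaussian step in "white" coordinates
    y = W @ g                              # step shaped by the current W
    x = m + y
    if acceptable(x):                      # pass sample: adapt m and W
        m = (1 - a) * m + a * x            # m(i+1) = (1 - a) m(i) + a x
        W = (1 - b) * W + b * np.outer(y, g)   # W(i+1) = (1 - b) W(i) + b y g^T
print(m)   # m drifts towards the centre of gravity of the pass samples

The same pair of updates, written out component by component, is what appears in the 2-dimensional brain simulation below.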
Because the hollows to the right of B and C have now disappeared, the process may continue up to the peaks at D. But of course the landscape puts a limit on the disorder or variability. Besides — dependent on the landscape — the process may become very jerky, and if the ratio between the time spent by the process at a local peak and the time of transition to the next peak is very high, it may as well look like a punctuated equilibrium as suggested by Gould (see Ridley). Computer simulation of Gaussian adaptation Thus far the theory only considers mean values of continuous distributions corresponding to an infinite number of individuals. In reality however, the number of individuals is always limited, which gives rise to an uncertainty in the estimation of m and M (the moment matrix of the Gaussian). And this may also affect the efficiency of the process. Unfortunately very little is known about this, at least theoretically. The implementation of normal adaptation on a computer is a fairly simple task. The adaptation of m may be done by one sample (individual) at a time, for example m(i + 1) = (1 – a) m(i) + ax where x is a pass sample, and a < 1 a suitable constant so that the inverse of a represents the number of individuals in the population. M may in principle be updated after every step y leading to a feasible point x = m + y according to: M(i + 1) = (1 – 2b) M(i) + 2byyT, where yT is the transpose of y and b << 1 is another suitable constant. In order to guarantee a suitable increase of average information, y should be normally distributed with moment matrix μ2M, where the scalar μ > 1 is used to increase average information (information entropy, disorder, diversity) at a suitable rate. But M will never be used in the calculations. Instead we use the matrix W defined by WWT = M. 128 Gaussian adaptation Thus, we have y = Wg, where g is normally distributed with the moment matrix μU, and U is the unit matrix. W and WT may be updated by the formulas W = (1 – b)W + bygT and WT = (1 – b)WT + bgyT because multiplication gives M = (1 – 2b)M + 2byyT, where terms including b2 have been neglected. Thus, M will be indirectly adapted with good approximation. In practice it will suffice to update W only W(i + 1) = (1 – b)W(i) + bygT. This is the formula used in a simple 2-dimensional model of a brain satisfying the Hebbian rule of associative learning; see the next section (Kjellström, 1996 and 1999). The figure below illustrates the effect of increased average information in a Gaussian p.d.f. used to climb a mountain Crest (the two lines represent the contour line). Both the red and green cluster have equal mean fitness, about 65%, but the green cluster has a much higher average information making the green process much more efficient. The effect of this adaptation is not very salient in a 2-dimensional case, but in a high-dimensional case, the efficiency of the search process may be increased by many orders of magnitude. The evolution in the brain In the brain the evolution of DNA-messages is supposed to be replaced by an evolution of signal patterns and the phenotypic landscape is replaced by a mental landscape, the complexity of which will hardly be second to the former. The metaphor with the mental landscape is based on the assumption that certain signal patterns give rise to a better well-being or performance. For instance, the control of a group of muscles leads to a better pronunciation of a word or performance of a piece of music. 
In this simple model it is assumed that the brain consists of interconnected components that may add, multiply and delay signal values. • A nerve cell kernel may add signal values, • a synapse may multiply with a constant and • An axon may delay values. This is a basis of the theory of digital filters and neural networks consisting of components that may add, multiply and delay signalvalues and also of many brain models, Levine 1991. In the figure below the brain stem is supposed to deliver Gaussian distributed signal patterns. This may be possible since certain neurons fire at random (Kandel et al.). The stem also constitutes a disordered structure surrounded by more ordered shells (Bergström, 1969), and according to the central limit theorem the sum of signals from many neurons may be Gaussian distributed. The triangular boxes represent synapses and the boxes with the + sign are cell 129 Gaussian adaptation kernels. In the cortex signals are supposed to be tested for feasibility. When a signal is accepted the contact areas in the synapses are updated according to the formulas below in agreement with the Hebbian theory. The figure shows a 2-dimensional computer simulation of Gaussian adaptation according to the last formula in the preceding section. m and W are updated according to: m1 = 0.9 m1 + 0.1 x1; m2 = 0.9 m2 + 0.1 x2; w11 = 0.9 w11 + 0.1 y1g1; w12 = 0.9 w12 + 0.1 y1g2; w21 = 0.9 w21 + 0.1 y2g1; w22 = 0.9 w22 + 0.1 y2g2; As can be seen this is very much like a small brain ruled by the theory of Hebbian learning (Kjellström, 1996, 1999 and 2002). Gaussian adaptation and free will Gaussian adaptation as an evolutionary model of the brain obeying the Hebbian theory of associative learning offers an alternative view of free will due to the ability of the process to maximize the mean fitness of signal patterns in the brain by climbing a mental landscape in analogy with phenotypic evolution. Such a random process gives us lots of freedom of choice, but hardly any will. An illusion of will may, however, emanate from the ability of the process to maximize mean fitness, making the process goal seeking. I. e., it prefers higher peaks in the landscape prior to lower, or better alternatives prior to worse. In this way an illusive will may appear. A similar view has been given by Zohar 1990. See also Kjellström 1999. A theorem of efficiency for random search The efficiency of Gaussian adaptation relies on the theory of information due to Claude E. Shannon (see information content). When an event occurs with probability P, then the information −log(P) may be achieved. For instance, if the mean fitness is P, the information gained for each individual selected for survival will be −log(P) – on the average - and the work/time needed to get the information is proportional to 1/P. Thus, if efficiency, E, is defined as information divided by the work/time needed to get it we have: E = −P log(P). This function attains its maximum when P = 1/e = 0.37. The same result has been obtained by Gaines with a different method. E = 0 if P = 0, for a process with infinite mutation rate, and if P = 1, for a process with mutation rate = 0 (provided that the process is alive). This measure of efficiency is valid for a large class of random search processes provided that certain conditions are at hand. 1 The search should be statistically independent and equally efficient in different parameter directions. 
This condition may be approximately fulfilled when the moment matrix of the Gaussian has been adapted for maximum average information to some region of acceptability, because linear transformations of the whole process do not have 130 Gaussian adaptation an impact on efficiency. 2 All individuals have equal cost and the derivative at P = 1 is < 0. Then, the following theorem may be proved: All measures of efficiency, that satisfy the conditions above, are asymptotically proportional to –P log(P/q) when the number of dimensions increases, and are maximized by P = q exp(-1) (Kjellström, 1996 and 1999). The figure above shows a possible efficiency function for a random search process such as Gaussian adaptation. To the left the process is most chaotic when P = 0, while there is perfect order to the right where P = 1. In an example by Rechenberg, 1971, 1973, a random walk is pushed thru a corridor maximizing the parameter x1. In this case the region of acceptability is defined as a (n − 1)-dimensional interval in the parameters x2, x3, ..., xn, but a x1-value below the last accepted will never be accepted. Since P can never exceed 0.5 in this case, the maximum speed towards higher x1-values is reached for P = 0.5/e = 0.18, in agreement with the findings of Rechenberg. A point of view that also may be of interest in this context is that no definition of information (other than that sampled points inside some region of acceptability gives information about the extension of the region) is needed for the proof of the theorem. Then, because, the formula may be interpreted as information divided by the work needed to get the information, this is also an indication that −log(P) is a good candidate for being a measure of information. The Stauffer and Grimson algorithm Gaussian adaptation has also been used for other purposes as for instance shadow removal by "The Stauffer-Grimson algorithm" which is equivalent to Gaussian adaptation as used in the section "Computer simulation of Gaussian adaptation" above. In both cases the maximum likelihood method is used for estimation of mean values by adaptation at one sample at a time. But there are differences. In the Stauffer-Grimson case the information is not used for the control of a random number generator for centering, maximization of mean fitness, average information or manufacturing yield. The adaptation of the moment matrix also differs very much as compared to "the evolution in the brain" above. 131 Gaussian adaptation References • Bergström, R. M. An Entropy Model of the Developing Brain. Developmental Psychobiology, 2(3): 139–152, 1969. • Brooks, D. R. & Wiley, E. O. Evolution as Entropy, Towards a unified theory of Biology. The University of Chicago Press, 1986. • Brooks, D. R. Evolution in the Information Age: Rediscovering the Nature of the Organism. Semiosis, Evolution, Energy, Development, Volume 1, Number 1, March 2001 • Gaines, Brian R. Knowledge Management in Societies of Intelligent Adaptive Agents. Journal of intelligent Information systems 9, 277–298 (1997). • Hartl, D. L. A Primer of Population Genetics. Sinauer, Sunderland, Massachusetts, 1981. • Hamilton, WD. 1963. The evolution of altruistic behavior. American Naturalist 97:354–356 • Kandel, E. R., Schwartz, J. H., Jessel, T. M. Essentials of Neural Science and Behavior. Prentice Hall International, London, 1995. • S. Kirkpatrick and C. D. Gelatt and M. P. Vecchi, Optimization by Simulated Annealing, Science, Vol 220, Number 4598, pages 671–680, 1983. • Kjellström, G. 
Network Optimization by Random Variation of component values. Ericsson Technics, vol. 25, no. 3, pp. 133–151, 1969. • Kjellström, G. Optimization of electrical Networks with respect to Tolerance Costs. Ericsson Technics, no. 3, pp. 157–175, 1970. • Kjellström, G. & Taxén, L. Stochastic Optimization in System Design. IEEE Trans. on Circ. and Syst., vol. CAS-28, no. 7, July 1981. • Kjellström, G., Taxén, L. and Lindberg, P. O. Discrete Optimization of Digital Filters Using Gaussian Adaptation and Quadratic Function Minimization. IEEE Trans. on Circ. and Syst., vol. CAS-34, no 10, October 1987. • Kjellström, G. On the Efficiency of Gaussian Adaptation. Journal of Optimization Theory and Applications, vol. 71, no. 3, December 1991. • Kjellström, G. & Taxén, L. Gaussian Adaptation, an evolution-based efficient global optimizer; Computational and Applied Mathematics, In, C. Brezinski & U. Kulish (Editors), Elsevier Science Publishers B. V., pp 267–276, 1992. • Kjellström, G. Evolution as a statistical optimization algorithm. Evolutionary Theory 11:105–117 (January, 1996). • Kjellström, G. The evolution in the brain. Applied Mathematics and Computation, 98(2–3):293–300, February, 1999. • Kjellström, G. Evolution in a nutshell and some consequences concerning valuations. EVOLVE, ISBN 91-972936-1-X, Stockholm, 2002. • Levine, D. S. Introduction to Neural & Cognitive Modeling. Laurence Erlbaum Associates, Inc., Publishers, 1991. • MacLean, P. D. A Triune Concept of the Brain and Behavior. Toronto, Univ. Toronto Press, 1973. • Maynard Smith, J. 1964. Group Selection and Kin Selection, Nature 201:1145–1147. • Maynard Smith, J. Evolutionary Genetics. Oxford University Press, 1998. • Mayr, E. What Evolution is. Basic Books, New York, 2001. • Müller, Christian L. and Sbalzarini Ivo F. Gaussian Adaptation revisited - an entropic view on Covariance Matrix Adaptation. Institute of Theoretical Computer Science and Swiss Institute of Bioinformatics, ETH Zurich, CH-8092 Zurich, Switzerland. • Pinel, J. F. and Singhal, K. Statistical Design Centering and Tolerancing Using Parametric Sampling. IEEE Transactions on Circuits and Systems, Vol. Das-28, No. 7, July 1981. • Rechenberg, I. (1971): Evolutionsstrategie — Optimierung technischer Systeme nach Prinzipien der biologischen Evolution (PhD thesis). Reprinted by Fromman-Holzboog (1973). • Ridley, M. Evolution. Blackwell Science, 1996. 132 Gaussian adaptation 133 • Stauffer, C. & Grimson, W.E.L. Learning Patterns of Activity Using Real-Time Tracking, IEEE Trans. on PAMI, 22(8), 2000. • Stehr, G. On the Performance Space Exploration of Analog Integrated Circuits. Technischen Universität Munchen, Dissertation 2005. • Taxén, L. A Framework for the Coordination of Complex Systems’ Development. Institute of Technology, Linköping University, Dissertation, 2003. • Zohar, D. The quantum self : a revolutionary view of human nature and consciousness rooted in the new physics. London, Bloomsbury, 1990. Differential evolution In computer science, differential evolution (DE) is a method that optimizes a problem by iteratively trying to improve a candidate solution with regard to a given measure of quality. Such methods are commonly known as metaheuristics as they make few or no assumptions about the problem being optimized and can search very large spaces of candidate solutions. However, metaheuristics such as DE do not guarantee an optimal solution is ever found. 
DE is used for multidimensional real-valued functions but does not use the gradient of the problem being optimized, which means DE does not require the optimization problem to be differentiable, as is required by classic optimization methods such as gradient descent and quasi-newton methods. DE can therefore also be used on optimization problems that are not even continuous, are noisy, change over time, etc.[1]
DE optimizes a problem by maintaining a population of candidate solutions and creating new candidate solutions by combining existing ones according to its simple formulae, and then keeping whichever candidate solution has the best score or fitness on the optimization problem at hand. In this way the optimization problem is treated as a black box that merely provides a measure of quality given a candidate solution, and the gradient is therefore not needed. DE is originally due to Storn and Price[2][3]. Books have been published on theoretical and practical aspects of using DE in parallel computing, multiobjective optimization, and constrained optimization, and the books also contain surveys of application areas [4][5][6].
Algorithm
A basic variant of the DE algorithm works by having a population of candidate solutions (called agents). These agents are moved around in the search-space by using simple mathematical formulae to combine the positions of existing agents from the population. If the new position of an agent is an improvement it is accepted and forms part of the population, otherwise the new position is simply discarded. The process is repeated and by doing so it is hoped, but not guaranteed, that a satisfactory solution will eventually be discovered.
Formally, let f: ℝn → ℝ be the cost function which must be minimized or the fitness function which must be maximized. The function takes a candidate solution as argument in the form of a vector of real numbers and produces a real number as output which indicates the fitness of the given candidate solution. The gradient of f is not known. The goal is to find a solution a for which f(a) ≤ f(b) for all b in the search-space, which would mean a is the global minimum. Maximization can be performed by considering the function h = -f instead.
Let x ∈ ℝn designate a candidate solution (agent) in the population. The basic DE algorithm can then be described as follows:
• Initialize all agents x with random positions in the search-space.
• Until a termination criterion is met (e.g. number of iterations performed, or adequate fitness reached), repeat the following:
• For each agent x in the population do:
• Pick three agents a, b and c from the population at random; they must be distinct from each other as well as from agent x.
• Pick a random index R ∈ {1, ..., n} (n being the dimensionality of the problem to be optimized).
• Compute the agent's potentially new position y = (y1, ..., yn) as follows:
• For each i ∈ {1, ..., n}, pick a uniformly distributed number ri ~ U(0,1).
• If ri < CR or i = R then set yi = ai + F × (bi − ci), otherwise set yi = xi.
• If f(y) < f(x) then replace the agent in the population with the improved candidate solution, that is, replace x with y in the population.
• Pick the agent from the population that has the highest fitness or lowest cost and return it as the best found candidate solution.
Note that F is called the differential weight and CR is called the crossover probability; both these parameters are selectable by the practitioner along with the population size, see below.
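A minimal sketch of the loop described above follows. It is illustrative only, not the authors' reference code; the benchmark function, the parameter values and the helper name de are assumptions made for the example.

import numpy as np

def de(f, bounds, NP=20, F=0.8, CR=0.9, iters=1000, seed=0):
    """Basic DE/rand/1/bin minimising f over the box `bounds` (list of (lo, hi) pairs)."""
    rng = np.random.default_rng(seed)
    n = len(bounds)
    lo, hi = np.array(bounds, dtype=float).T
    pop = rng.uniform(lo, hi, size=(NP, n))            # initialise agents at random positions
    cost = np.array([f(x) for x in pop])
    for _ in range(iters):
        for i in range(NP):
            # pick three agents a, b, c distinct from each other and from agent i
            a, b, c = pop[rng.choice([k for k in range(NP) if k != i], size=3, replace=False)]
            R = rng.integers(n)                         # index that is always taken from the mutant
            cross = (rng.random(n) < CR) | (np.arange(n) == R)
            y = np.where(cross, a + F * (b - c), pop[i])
            fy = f(y)
            if fy < cost[i]:                            # greedy selection
                pop[i], cost[i] = y, fy
    best = int(np.argmin(cost))
    return pop[best], cost[best]

sphere = lambda x: float(np.sum(x ** 2))
print(de(sphere, bounds=[(-5.0, 5.0)] * 3))             # should approach the origin

With the population size NP, the differential weight F and the crossover probability CR exposed as arguments, the sketch also shows exactly where the parameter-selection question discussed next enters.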
Parameter selection
The choice of the DE parameters F and CR can have a large impact on optimization performance. Selecting the DE parameters that yield good performance has therefore been the subject of much research. Rules of thumb for parameter selection were devised by Storn et al.[3][4] and Liu and Lampinen [7]. Mathematical convergence analysis regarding parameter selection was done by Zaharie [8]. Meta-optimization of the DE parameters was done by Pedersen [9][10] and Zhang et al.[11].
[Figure: performance landscape showing how the basic DE performs in aggregate on the Sphere and Rosenbrock benchmark problems when varying two of the DE parameters and keeping the third fixed at 0.9.]
Variants
Variants of the DE algorithm are continually being developed in an effort to improve optimization performance. Many different schemes for performing crossover and mutation of agents are possible in the basic algorithm given above, see e.g.[3]. More advanced DE variants are also being developed, with a popular research trend being to perturb or adapt the DE parameters during optimization, see e.g. Price et al.[4], Liu and Lampinen [12], Qin and Suganthan [13], and Brest et al.[14].
References
[1] Rocca, P.; Oliveri, G.; Massa, A. (2011). "Differential Evolution as Applied to Electromagnetics". IEEE Antennas and Propagation Magazine 53 (1): 38–49. doi:10.1109/MAP.2011.5773566.
[2] Storn, R.; Price, K. (1997). "Differential evolution - a simple and efficient heuristic for global optimization over continuous spaces". Journal of Global Optimization 11: 341–359. doi:10.1023/A:1008202821328.
[3] Storn, R. (1996). "On the usage of differential evolution for function optimization". Biennial Conference of the North American Fuzzy Information Processing Society (NAFIPS). pp. 519–523.
[4] Price, K.; Storn, R.M.; Lampinen, J.A. (2005). Differential Evolution: A Practical Approach to Global Optimization (http://www.springer.com/computer/theoretical+computer+science/foundations+of+computations/book/978-3-540-20950-8). Springer. ISBN 978-3-540-20950-8.
[5] Feoktistov, V. (2006). Differential Evolution: In Search of Solutions (http://www.springer.com/mathematics/book/978-0-387-36895-5). Springer. ISBN 978-0-387-36895-5.
[6] Chakraborty, U.K., ed. (2008), Advances in Differential Evolution (http://www.springer.com/engineering/book/978-3-540-68827-3), Springer, ISBN 978-3-540-68827-3.
[7] Liu, J.; Lampinen, J. (2002). "On setting the control parameter of the differential evolution method". Proceedings of the 8th International Conference on Soft Computing (MENDEL). Brno, Czech Republic. pp. 11–18.
[8] Zaharie, D. (2002). "Critical values for the control parameters of differential evolution algorithms". Proceedings of the 8th International Conference on Soft Computing (MENDEL). Brno, Czech Republic. pp. 62–67.
[9] Pedersen, M.E.H. (2010). Tuning & Simplifying Heuristical Optimization (http://www.hvass-labs.org/people/magnus/thesis/pedersen08thesis.pdf) (PhD thesis). University of Southampton, School of Engineering Sciences, Computational Engineering and Design Group.
[10] Pedersen, M.E.H. (2010). "Good parameters for differential evolution" (http://www.hvass-labs.org/people/magnus/publications/pedersen10good-de.pdf). Technical Report HL1002 (Hvass Laboratories).
[11] Zhang, X.; Jiang, X.; Scott, P.J. (2011). "A Minimax Fitting Algorithm for Ultra-Precision Aspheric Surfaces". The 13th International Conference on Metrology and Properties of Engineering Surfaces.
[12] Liu, J.; Lampinen, J. (2005). "A fuzzy adaptive differential evolution algorithm".
Soft Computing 9 (6): 448–462. [13] Qin, A.K.; Suganthan, P.N. (2005). "Self-adaptive differential evolution algorithm for numerical optimization". Proceedings of the IEEE congress on evolutionary computation (CEC). pp. 1785–1791. [14] Brest, J.; Greiner, S.; Boskovic, B.; Mernik, M.; Zumer, V. (2006). "Self-adapting control parameters in differential evolution: a comparative study on numerical benchmark functions". IEEE Transactions on Evolutionary Computation 10 (6): 646–657. External links • Storn's Homepage on DE (http://www.icsi.berkeley.edu/~storn/code.html) featuring source-code for several programming languages. Particle swarm optimization In computer science, particle swarm optimization (PSO) is a computational method that optimizes a problem by iteratively trying to improve a candidate solution with regard to a given measure of quality. PSO optimizes a problem by having a population of candidate solutions, here dubbed particles, and moving these particles around in the search-space according to simple mathematical formulae over the particle's position and velocity. Each particle's movement is influenced by its local best known position and is also guided toward the best known positions in the search-space, which are updated as better positions are found by other particles. This is expected to move the swarm toward the best solutions. PSO is originally attributed to Kennedy, Eberhart and Shi[1][2] and was first intended for simulating social behaviour,[3] as a stylized representation of the movement of organisms in a bird flock or fish school. The algorithm was simplified and it was observed to be performing optimization. The book by Kennedy and Eberhart[4] describes many philosophical aspects of PSO and swarm intelligence. An extensive survey of PSO applications is made by Poli.[5][6] PSO is a metaheuristic as it makes few or no assumptions about the problem being optimized and can search very large spaces of candidate solutions. However, metaheuristics such as PSO do not guarantee an optimal solution is ever found. More specifically, PSO does not use the gradient of the problem being optimized, which means PSO does not require that the optimization problem be differentiable as is required by classic optimization methods such as gradient descent and quasi-newton methods. PSO can therefore also be used on optimization problems that are partially irregular, noisy, change over time, etc. Algorithm A basic variant of the PSO algorithm works by having a population (called a swarm) of candidate solutions (called particles). These particles are moved around in the search-space according to a few simple formulae. The movements of the particles are guided by their own best known position in the search-space as well as the entire swarm's best known position. When improved positions are being discovered these will then come to guide the movements of the swarm. The process is repeated and by doing so it is hoped, but not guaranteed, that a satisfactory solution will 135 Particle swarm optimization 136 eventually be discovered. Formally, let f: ℝn → ℝ be the fitness or cost function which must be minimized. The function takes a candidate solution as argument in the form of a vector of real numbers and produces a real number as output which indicates the fitness of the given candidate solution. The gradient of f is not known. The goal is to find a solution a for which f(a) ≤ f(b) for all b in the search-space, which would mean a is the global minimum. 
Maximization can be performed by considering the function h = -f instead. Let S be the number of particles in the swarm, each having a position xi ∈ ℝn in the search-space and a velocity vi ∈ ℝn. Let pi be the best known position of particle i and let g be the best known position of the entire swarm. A basic PSO algorithm is then: • For each particle i = 1, ..., S do: • Initialize the particle's position with a uniformly distributed random vector: xi ~ U(blo, bup), where blo and bup are the lower and upper boundaries of the search-space. • Initialize the particle's best known position to its initial position: pi ← xi • If (f(pi) < f(g)) update the swarm's best known position: g ← pi • Initialize the particle's velocity: vi ~ U(-|bup-blo|, |bup-blo|) • Until a termination criterion is met (e.g. number of iterations performed, or adequate fitness reached), repeat: • For each particle i = 1, ..., S do: • For each dimension d = 1, ..., n do: • Pick random numbers: rp, rg ~ U(0,1) • Update the particle's velocity: vi,d ← ω vi,d + φp rp (pi,d-xi,d) + φg rg (gd-xi,d) • Update the particle's position: xi ← xi + vi • If (f(xi) < f(pi)) do: • Update the particle's best known position: pi ← xi • If (f(pi) < f(g)) update the swarm's best known position: g ← pi • Now g holds the best found solution. The parameters ω, φp, and φg are selected by the practitioner and control the behaviour and efficacy of the PSO method, see below. Parameter selection The choice of PSO parameters can have a large impact on optimization performance. Selecting PSO parameters that yield good performance has therefore been the subject of much [7][8][9][10][11][12][13][14] research. The PSO parameters can also be tuned by using another overlaying optimizer, a concept known as meta-optimization.[15][16][17] Parameters have also been tuned for various optimization scenarios.[18] Neighbourhoods and Topologies Performance landscape showing how a simple PSO variant performs in aggregate on several benchmark problems when varying two PSO parameters. The basic PSO is easily trapped into a local minimum. This premature convergence can be avoided by not using any more the entire swarm's best known position g but just the best known position l of a sub-swarm "around" the particle that is moved. Such a sub-swarm can be a geometrical one - for example "the m nearest particles" - or, more often, a social one, i.e. a set of Particle swarm optimization particles that is not depending on any distance. In such a case, the PSO variant is said to be local best (vs global best for the basic PSO). If we suppose there is an information link between each particle and its neighbours, the set of these links builds a graph, a communication network, that is called the topology of the PSO variant. A commonly used social topology is the ring, in which each particle has just two neighbours, but there are far more.[19] The topology is not necessarily fixed, and can be adaptive (SPSO,[20] stochastic star,[21] TRIBES,[22] Cyber Swarm [23]). Inner workings There are several schools of thought as to why and how the PSO algorithm can perform optimization. A common belief amongst researchers is that the swarm behaviour varies between exploratory behaviour, that is, searching a broader region of the search-space, and exploitative behaviour, that is, a locally oriented search so as to get closer to a (possibly local) optimum. 
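The per-particle update given in the Algorithm section above translates almost line for line into code. The following sketch is illustrative rather than definitive: the parameter values ω, φp and φg, the test function and the choice of updating the best known positions once per sweep (the synchronous variant mentioned earlier) are assumptions made for the example.

import numpy as np

def pso(f, blo, bup, S=30, omega=0.7, phi_p=1.5, phi_g=1.5, iters=500, seed=0):
    """Basic global-best PSO minimising f over the box [blo, bup]."""
    rng = np.random.default_rng(seed)
    blo, bup = np.asarray(blo, float), np.asarray(bup, float)
    n = len(blo)
    x = rng.uniform(blo, bup, (S, n))                     # positions
    v = rng.uniform(-(bup - blo), bup - blo, (S, n))      # velocities
    p = x.copy()                                          # each particle's best known position
    p_cost = np.array([f(xi) for xi in x])
    g = p[np.argmin(p_cost)].copy()                       # swarm's best known position
    for _ in range(iters):
        rp, rg = rng.random((S, n)), rng.random((S, n))
        v = omega * v + phi_p * rp * (p - x) + phi_g * rg * (g - x)
        x = x + v
        cost = np.array([f(xi) for xi in x])
        improved = cost < p_cost                          # update personal bests
        p[improved], p_cost[improved] = x[improved], cost[improved]
        g = p[np.argmin(p_cost)].copy()                   # update the swarm's best
    return g, f(g)

rosenbrock = lambda z: float((1 - z[0]) ** 2 + 100 * (z[1] - z[0] ** 2) ** 2)
print(pso(rosenbrock, blo=[-2, -2], bup=[2, 2]))          # approaches the optimum near (1, 1)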
The first school of thought, which interprets PSO in terms of exploration and exploitation, has been prevalent since the inception of PSO.[2][3][7][11] It contends that the PSO algorithm and its parameters must be chosen so as to properly balance exploration and exploitation, avoiding premature convergence to a local optimum while still ensuring a good rate of convergence to the optimum. This belief is the precursor of many PSO variants, see below. Another school of thought is that the behaviour of a PSO swarm is not well understood in terms of how it affects actual optimization performance, especially for higher-dimensional search-spaces and optimization problems that may be discontinuous, noisy, and time-varying. This school of thought merely tries to find PSO algorithms and parameters that cause good performance regardless of how the swarm behaviour can be interpreted in relation to, e.g., exploration and exploitation. Such studies have led to the simplification of the PSO algorithm, see below. Convergence In relation to PSO the word convergence typically means one of two things, although it is often not clarified which definition is meant and sometimes they are mistakenly thought to be identical.
• Convergence may refer to the swarm's best known position g approaching (converging to) the optimum of the problem, regardless of how the swarm behaves.
• Convergence may refer to a swarm collapse in which all particles have converged to a point in the search-space, which may or may not be the optimum.
Several attempts at mathematically analyzing PSO convergence exist in the literature.[10][11][12] These analyses have resulted in guidelines for selecting PSO parameters that are believed to cause convergence, divergence or oscillation of the swarm's particles, and the analyses have also given rise to several PSO variants. However, the analyses were criticized by Pedersen[17] for being oversimplified, as they assume the swarm has only one particle, that it does not use stochastic variables, and that the points of attraction, that is, the particle's best known position p and the swarm's best known position g, remain constant throughout the optimization process. Furthermore, some analyses allow for an infinite number of optimization iterations, which is not possible in reality. Determining the convergence capabilities of different PSO algorithms and parameters therefore still depends on empirical results. Biases As the basic PSO works dimension by dimension, the solution point is more easily found when it lies on an axis of the search space, on a diagonal, and even more easily if it is right on the centre.[24][25] A first approach to avoid this bias, and to allow fair comparisons, is to use non-biased benchmark problems that are shifted or rotated.[26] Another approach is to modify the algorithm itself so that it is no longer sensitive to the coordinate system.[27][28] Variants Numerous variants of even a basic PSO algorithm are possible. For example, there are different ways to initialize the particles and velocities (e.g. starting with zero velocities instead), to dampen the velocity, or to update pi and g only after the entire swarm has been updated. Some of these choices and their possible performance impact have been discussed in the literature.[9] New and more sophisticated PSO variants are also continually being introduced in an attempt to improve optimization performance.
There are certain trends in that research: one is to make a hybrid optimization method using PSO combined with other optimizers;[29][30] another is to try to alleviate premature convergence (that is, optimization stagnation), e.g. by reversing or perturbing the movement of the PSO particles;[14][31][32] another approach to deal with premature convergence is the use of multiple swarms (multi-swarm optimization); and there are also attempts at adapting the behavioural parameters of PSO during optimization.[33] Simplifications Another school of thought is that PSO should be simplified as much as possible without impairing its performance; a general concept often referred to as Occam's razor. Simplifying PSO was originally suggested by Kennedy[3] and has been studied more extensively,[13][16][17][34] where it appeared that optimization performance was improved, the parameters were easier to tune, and they performed more consistently across different optimization problems. Another argument in favour of simplifying PSO is that metaheuristics can only have their efficacy demonstrated empirically, by doing computational experiments on a finite number of optimization problems. This means a metaheuristic such as PSO cannot be proven correct, and this increases the risk of making errors in its description and implementation. A good example of this[35] presented a promising variant of a genetic algorithm (another popular metaheuristic), but it was later found to be defective, as it was strongly biased in its optimization search towards similar values for different dimensions in the search space, which happened to be the optimum of the benchmark problems considered. This bias was due to a programming error and has now been fixed.[36] Initialization of velocities may require extra inputs. A simpler variant is the accelerated particle swarm optimization (APSO),[37] which does not need to use velocity at all and can speed up the convergence in many applications. A simple demo code of APSO is available.[38] Multi-objective optimization PSO has also been applied to multi-objective problems,[39][40] in which the fitness comparison takes Pareto dominance into account when moving the PSO particles, and non-dominated solutions are stored so as to approximate the Pareto front. Binary, Discrete, and Combinatorial PSO As the PSO equations given above work on real numbers, a commonly used method to solve discrete problems is to map the discrete search space to a continuous domain, to apply a classical PSO, and then to demap the result. Such a mapping can be very simple (for example by just using rounded values) or more sophisticated.[41] However, it can be noted that the equations of movement make use of operators that perform four actions:
• computing the difference of two positions; the result is a velocity (more precisely a displacement)
• multiplying a velocity by a numerical coefficient
• adding two velocities
• applying a velocity to a position
Usually a position and a velocity are represented by n real numbers, and these operators are simply -, *, +, and again +. But all these mathematical objects can be defined in a completely different way, in order to cope with binary problems (or more generally discrete ones), or even combinatorial ones.[42][43][44][45] One approach is to redefine the operators based on sets.[46] References [1] Kennedy, J.; Eberhart, R. (1995). "Particle Swarm Optimization" (http:/ / www. engr. iupui.
edu/ ~shi/ Coference/ psopap4. html). Proceedings of IEEE International Conference on Neural Networks. IV. pp. 1942–1948. doi:10.1109/ICNN.1995.488968. . [2] Shi, Y.; Eberhart, R.C. (1998). "A modified particle swarm optimizer". Proceedings of IEEE International Conference on Evolutionary Computation. pp. 69–73. [3] Kennedy, J. (1997). "The particle swarm: social adaptation of knowledge". Proceedings of IEEE International Conference on Evolutionary Computation. pp. 303–308. [4] Kennedy, J.; Eberhart, R.C. (2001). Swarm Intelligence. Morgan Kaufmann. ISBN 1-55860-595-9. [5] Poli, R. (2007). "An analysis of publications on particle swarm optimisation applications" (http:/ / cswww. essex. ac. uk/ technical-reports/ 2007/ tr-csm469. pdf). Technical Report CSM-469 (Department of Computer Science, University of Essex, UK). . [6] Poli, R. (2008). "Analysis of the publications on the applications of particle swarm optimisation" (http:/ / downloads. hindawi. com/ archive/ 2008/ 685175. pdf). Journal of Artificial Evolution and Applications 2008: 1–10. doi:10.1155/2008/685175. . [7] Shi, Y.; Eberhart, R.C. (1998). "Parameter selection in particle swarm optimization". Proceedings of Evolutionary Programming VII (EP98). pp. 591–600. [8] Eberhart, R.C.; Shi, Y. (2000). "Comparing inertia weights and constriction factors in particle swarm optimization". Proceedings of the Congress on Evolutionary Computation. 1. pp. 84–88. [9] Carlisle, A.; Dozier, G. (2001). "An Off-The-Shelf PSO". Proceedings of the Particle Swarm Optimization Workshop. pp. 1–6. [10] van den Bergh, F. (2001) (PhD thesis). An Analysis of Particle Swarm Optimizers. University of Pretoria, Faculty of Natural and Agricultural Science. [11] Clerc, M.; Kennedy, J. (2002). "The particle swarm - explosion, stability, and convergence in a multidimensional complex space". IEEE Transactions on Evolutionary Computation 6 (1): 58–73. doi:10.1109/4235.985692. [12] Trelea, I.C. (2003). "The Particle Swarm Optimization Algorithm: convergence analysis and parameter selection". Information Processing Letters 85 (6): 317–325. doi:10.1016/S0020-0190(02)00447-7. [13] Bratton, D.; Blackwell, T. (2008). "A Simplified Recombinant PSO". Journal of Artificial Evolution and Applications. [14] Evers, G. (2009) (Master's thesis). An Automatic Regrouping Mechanism to Deal with Stagnation in Particle Swarm Optimization (http:/ / www. georgeevers. org/ publications. htm). The University of Texas - Pan American, Department of Electrical Engineering. . [15] Meissner, M.; Schmuker, M.; Schneider, G. (2006). "Optimized Particle Swarm Optimization (OPSO) and its application to artificial neural network training". BMC Bioinformatics 7: 125. doi:10.1186/1471-2105-7-125. PMC 1464136. PMID 16529661. [16] Pedersen, M.E.H. (2010) (PhD thesis). Tuning & Simplifying Heuristical Optimization (http:/ / www. hvass-labs. org/ people/ magnus/ thesis/ pedersen08thesis. pdf). University of Southampton, School of Engineering Sciences, Computational Engineering and Design Group. . [17] Pedersen, M.E.H.; Chipperfield, A.J. (2010). "Simplifying particle swarm optimization" (http:/ / www. hvass-labs. org/ people/ magnus/ publications/ pedersen08simplifying. pdf). Applied Soft Computing 10 (2): 618–628. doi:10.1016/j.asoc.2009.08.029. . 139 Particle swarm optimization [18] Pedersen, M.E.H. (2010). "Good parameters for particle swarm optimization" (http:/ / www. hvass-labs. org/ people/ magnus/ publications/ pedersen10good-pso. pdf). Technical Report HL1001 (Hvass Laboratories). . 
[19] Mendes, R. (2004). Population Topologies and Their Influence in Particle Swarm Performance (PhD thesis). Universidade do Minho. [20] SPSO, Particle Swarm Central (http:/ / www. particleswarm. info) [21] Miranda, V., Keko, H. and Duque, Á. J. (2008). Stochastic Star Communication Topology in Evolutionary Particle Swarms (EPSO). International Journal of Computational Intelligence Research (IJCIR), Volume 4, Number 2, pp. 105-116 [22] Clerc, M. (2006). Particle Swarm Optimization. ISTE (International Scientific and Technical Encyclopedia), 2006 [23] Yin, P., Glover, F., Laguna, M., & Zhu, J. (2011). A Complementary Cyber Swarm Algorithm. International Journal of Swarm Intelligence Research (IJSIR), 2(2), 22-41 [24] Monson, C. K. & Seppi, K. D. (2005). Exposing Origin-Seeking Bias in PSO GECCO'05, pp. 241-248 [25] Spears, W. M., Green, D. T. & Spears, D. F. (2010). Biases in Particle Swarm Optimization. International Journal of Swarm Intelligence Research, Vol. 1(2), pp. 34-57 [26] Suganthan, P. N., Hansen, N., Liang, J. J., Deb, K.; Chen, Y. P., Auger, A. & Tiwari, S. (2005). Problem definitions and evaluation criteria for the CEC 2005 Special Session on Real Parameter Optimization. Nanyang Technological University [27] Wilke, D. N., Kok, S. & Groenwold, A. A. (2007). Comparison of linear and classical velocity update rules in particle swarm optimization: notes on scale and frame invariance. International Journal for Numerical Methods in Engineering, John Wiley & Sons, Ltd., 70, pp. 985-1008 [28] SPSO 2011, Particle Swarm Central (http:/ / www. particleswarm. info) [29] Lovbjerg, M.; Krink, T. (2002). "The LifeCycle Model: combining particle swarm optimisation, genetic algorithms and hillclimbers". Proceedings of Parallel Problem Solving from Nature VII (PPSN). pp. 621–630. [30] Niknam, T.; Amiri, B. (2010). "An efficient hybrid approach based on PSO, ACO and k-means for cluster analysis". Applied Soft Computing 10 (1): 183–197. doi:10.1016/j.asoc.2009.07.001. [31] Lovbjerg, M.; Krink, T. (2002). "Extending Particle Swarm Optimisers with Self-Organized Criticality". Proceedings of the Fourth Congress on Evolutionary Computation (CEC). 2. pp. 1588–1593. [32] Xinchao, Z. (2010). "A perturbed particle swarm algorithm for numerical optimization". Applied Soft Computing 10 (1): 119–124. doi:10.1016/j.asoc.2009.06.010. [33] Zhan, Z-H.; Zhang, J.; Li, Y; Chung, H.S-H. (2009). "Adaptive Particle Swarm Optimization". IEEE Transactions on Systems, Man, and Cybernetics 39 (6): 1362–1381. doi:10.1109/TSMCB.2009.2015956. [34] Yang, X.S. (2008). Nature-Inspired Metaheuristic Algorithms. Luniver Press. ISBN 978-1905986101. [35] Tu, Z.; Lu, Y. (2004). "A robust stochastic genetic algorithm (StGA) for global numerical optimization". IEEE Transactions on Evolutionary Computation 8 (5): 456–470. doi:10.1109/TEVC.2004.831258. [36] Tu, Z.; Lu, Y. (2008). "Corrections to "A Robust Stochastic Genetic Algorithm (StGA) for Global Numerical Optimization". IEEE Transactions on Evolutionary Computation 12 (6): 781–781. doi:10.1109/TEVC.2008.926734. [37] X. S. Yang, S. Deb and S. Fong, Accelerated particle swarm optimization and support vector machine for business optimization and applications, NDT 2011, Springer CCIS 136, pp. 53-66 (2011). [38] http:/ / www. mathworks. com/ matlabcentral/ fileexchange/ ?term=APSO [39] Parsopoulos, K.; Vrahatis, M. (2002). "Particle swarm optimization method in multiobjective problems" (http:/ / doi. acm. org/ 10. 1145/ 508791. 508907). 
Proceedings of the ACM Symposium on Applied Computing (SAC). pp. 603–607. [40] Coello Coello, C.; Salazar Lechuga, M. (2002). "MOPSO: A Proposal for Multiple Objective Particle Swarm Optimization" (http://portal.acm.org/citation.cfm?id=1252327). Congress on Evolutionary Computation (CEC'2002). pp. 1051–1056. [41] Roy, R., Dehuri, S., & Cho, S. B. (2012). A Novel Particle Swarm Optimization Algorithm for Multi-Objective Combinatorial Optimization Problem. International Journal of Applied Metaheuristic Computing (IJAMC), 2(4), 41-57. [42] Kennedy, J. & Eberhart, R. C. (1997). A discrete binary version of the particle swarm algorithm, Conference on Systems, Man, and Cybernetics, Piscataway, NJ: IEEE Service Center, pp. 4104-4109. [43] Clerc, M. (2004). Discrete Particle Swarm Optimization, illustrated by the Traveling Salesman Problem, New Optimization Techniques in Engineering, Springer, pp. 219-239. [44] Clerc, M. (2005). Binary Particle Swarm Optimisers: toolbox, derivations, and mathematical insights, Open Archive HAL (http://hal.archives-ouvertes.fr/hal-00122809/en/). [45] Jarboui, B., Damak, N., Siarry, P., and Rebai, A.R. (2008). A combinatorial particle swarm optimization for solving multi-mode resource-constrained project scheduling problems. In Proceedings of Applied Mathematics and Computation, pp. 299-308. [46] Chen, Wei-neng; Zhang, Jun (2010). "A novel set-based particle swarm optimization method for discrete optimization problem". IEEE Transactions on Evolutionary Computation 14 (2): 278–300. External links • Particle Swarm Central (http://www.particleswarm.info) is a repository for information on PSO. Several source codes are freely available. • A brief video (http://vimeo.com/17407010) of particle swarms optimizing three benchmark functions. Ant colony optimization algorithms In computer science and operations research, the ant colony optimization algorithm (ACO) is a probabilistic technique for solving computational problems which can be reduced to finding good paths through graphs. This algorithm is a member of the ant colony algorithms family, within swarm intelligence methods, and it constitutes a class of metaheuristic optimizations. Initially proposed by Marco Dorigo in 1992 in his PhD thesis,[1][2] the first algorithm aimed to search for an optimal path in a graph, based on the behavior of ants seeking a path between their colony and a source of food. [Figure: Ant behavior was the inspiration for the metaheuristic optimization technique.] The original idea has since diversified to solve a wider class of numerical problems, and as a result, several problems have emerged, drawing on various aspects of the behavior of ants. Overview Summary In the natural world, ants (initially) wander randomly, and upon finding food return to their colony while laying down pheromone trails. If other ants find such a path, they are likely not to keep travelling at random, but instead to follow the trail, returning and reinforcing it if they eventually find food (see Ant communication). Over time, however, the pheromone trail starts to evaporate, thus reducing its attractive strength. The more time it takes for an ant to travel down the path and back again, the more time the pheromones have to evaporate. A short path, by comparison, gets marched over more frequently, and thus the pheromone density becomes higher on shorter paths than longer ones. Pheromone evaporation also has the advantage of avoiding convergence to a locally optimal solution.
If there were no evaporation at all, the paths chosen by the first ants would tend to be excessively attractive to the following ones. In that case, the exploration of the solution space would be constrained. Thus, when one ant finds a good (i.e., short) path from the colony to a food source, other ants are more likely to follow that path, and positive feedback eventually leads all the ants to follow a single path. The idea of the ant colony algorithm is to mimic this behavior with "simulated ants" walking around the graph representing the problem to solve. Detailed The original idea comes from observing the exploitation of food resources among ants, in which ants' individually limited cognitive abilities have collectively been able to find the shortest path between a food source and the nest. [Figure: 1. The first ant finds the food source (F) via any path (a), then returns to the nest (N), leaving behind a pheromone trail (b). 2. Ants indiscriminately follow four possible paths, but the strengthening of the trail makes the shortest route more attractive. 3. Ants take the shortest route; long portions of the other paths lose their pheromone trails.] In a series of experiments on a colony of ants with a choice between two paths of unequal length leading to a source of food, biologists have observed that ants tended to use the shortest route.[3][4] A model explaining this behaviour is as follows:
1. An ant (called "blitz") runs more or less at random around the colony;
2. If it discovers a food source, it returns more or less directly to the nest, leaving in its path a trail of pheromone;
3. These pheromones are attractive, so nearby ants will be inclined to follow, more or less directly, the track;
4. Returning to the colony, these ants will strengthen the route;
5. If there are two routes to reach the same food source then, in a given amount of time, the shorter one will be traveled by more ants than the long route;
6. The short route will be increasingly enhanced, and therefore become more attractive;
7. The long route will eventually disappear because pheromones are volatile;
8. Eventually, all the ants have determined and therefore "chosen" the shortest route.
Ants use the environment as a medium of communication. They exchange information indirectly by depositing pheromones, all detailing the status of their "work". The information exchanged has a local scope: only an ant located where the pheromones were left has a notion of them. This system is called "Stigmergy" and occurs in many social animal societies (it has been studied in the case of the construction of pillars in the nests of termites). The mechanism of solving a problem too complex to be addressed by single ants is a good example of a self-organized system. This system is based on positive feedback (the deposit of pheromone attracts other ants that will strengthen it themselves) and negative feedback (dissipation of the route by evaporation prevents the system from thrashing). Theoretically, if the quantity of pheromone remained the same over time on all edges, no route would be chosen. However, because of feedback, a slight variation on an edge will be amplified and thus allow the choice of one edge. The algorithm will move from an unstable state in which no edge is stronger than another, to a stable state where the route is composed of the strongest edges.
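The feedback mechanism described by this model can be illustrated with a tiny simulation of the two-path experiment: ants repeatedly choose between a short and a long route in proportion to the pheromone on each, pheromone is deposited in inverse proportion to route length (so the short route is reinforced more strongly), and all trails evaporate at every step. The following Python sketch is only an abstraction of that mechanism; the constants and function name are illustrative assumptions.

import random

def double_bridge(short_len=1.0, long_len=2.0, ants=100, steps=200, rho=0.1, q=1.0):
    # Pheromone on the two alternative routes between nest and food source.
    tau = {"short": 1.0, "long": 1.0}
    for _ in range(steps):
        deposits = {"short": 0.0, "long": 0.0}
        for _ in range(ants):
            total = tau["short"] + tau["long"]
            route = "short" if random.random() < tau["short"] / total else "long"
            length = short_len if route == "short" else long_len
            deposits[route] += q / length            # shorter routes receive stronger reinforcement
        for route in tau:
            tau[route] = (1.0 - rho) * tau[route] + deposits[route]   # evaporation plus deposit
    return tau

# After a few hundred steps nearly all of the pheromone sits on the short route.
pheromone = double_bridge()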
The basic philosophy of the algorithm involves the movement of a colony of ants through the different states of the problem, influenced by two local decision policies, viz., trails and attractiveness. Thereby, each such ant incrementally constructs a solution to the problem. When an ant completes a solution, or during the construction phase, the ant evaluates the solution and modifies the trail value on the components used in its solution. This pheromone information will direct the search of future ants. Furthermore, the algorithm also includes two more mechanisms, viz., trail evaporation and daemon actions. Trail evaporation reduces all trail values over time, thereby avoiding any possibility of getting stuck in local optima. The daemon actions are used to bias the search process from a non-local perspective. Common extensions Here are some of the most popular variations of ACO algorithms. Elitist ant system The global best solution deposits pheromone on every iteration along with all the other ants. Max-Min ant system (MMAS) Maximum and minimum pheromone amounts [τmax, τmin] are added. Only the global best or the iteration best tour deposits pheromone. All edges are initialized to τmax and reinitialized to τmax when nearing stagnation.[5] Ant Colony System It has been presented above.[6] Rank-based ant system (ASrank) All solutions are ranked according to their length. The amount of pheromone deposited is then weighted for each solution, such that solutions with shorter paths deposit more pheromone than the solutions with longer paths. Continuous orthogonal ant colony (COAC) The pheromone deposit mechanism of COAC is to enable ants to search for solutions collaboratively and effectively. By using an orthogonal design method, ants in the feasible domain can explore their chosen regions rapidly and efficiently, with enhanced global search capability and accuracy. The orthogonal design method and the adaptive radius adjustment method can also be extended to other optimization algorithms for delivering wider advantages in solving practical problems.[7] Convergence For some versions of the algorithm, it is possible to prove that it is convergent (i.e. it is able to find the global optimum in finite time). The first convergence proof for an ant colony algorithm was given in 2000, for the graph-based ant system algorithm, and later for the ACS and MMAS algorithms. As with most metaheuristics, it is very difficult to estimate the theoretical speed of convergence. In 2004, Zlochin and his colleagues[8] showed that COA-type algorithms could be assimilated to methods of stochastic gradient descent, the cross-entropy method and estimation of distribution algorithms. They proposed to group these metaheuristics under the umbrella of "model-based search". Example pseudo-code and formulae
procedure ACO_MetaHeuristic
  while (not_termination)
    generateSolutions()
    daemonActions()
    pheromoneUpdate()
  end while
end procedure
Edge selection An ant is a simple computational agent in the ant colony optimization algorithm. It iteratively constructs a solution for the problem at hand. The intermediate solutions are referred to as solution states. At each iteration of the algorithm, each ant moves from a state x to a state y, corresponding to a more complete intermediate solution. Thus, each ant k computes a set Ak(x) of feasible expansions to its current state in each iteration, and moves to one of these in probability. For ant k, the probability pxyk of moving from state x to state y depends on the combination of two values, viz., the attractiveness ηxy of the move, as computed by some heuristic indicating the a priori desirability of that move, and the trail level τxy of the move, indicating how proficient it has been in the past to make that particular move. The trail level represents an a posteriori indication of the desirability of that move. Trails are usually updated when all ants have completed their solution, increasing or decreasing the level of trails corresponding to moves that were part of "good" or "bad" solutions, respectively. In general, the kth ant moves from state x to state y with probability
pxyk = (τxy^α)(ηxy^β) / Σz∈Ak(x) (τxz^α)(ηxz^β)
where τxy is the amount of pheromone deposited for the transition from state x to y, 0 ≤ α is a parameter to control the influence of τxy, ηxy is the desirability of state transition xy (a priori knowledge, typically 1/dxy, where d is the distance) and β ≥ 1 is a parameter to control the influence of ηxy. Pheromone update When all the ants have completed a solution, the trails are updated by
τxy ← (1 − ρ) τxy + Σk Δτxyk
where τxy is the amount of pheromone deposited for a state transition xy, ρ is the pheromone evaporation coefficient and Δτxyk is the amount of pheromone deposited by the kth ant, typically given for a TSP problem (with moves corresponding to arcs of the graph) by
Δτxyk = Q/Lk if ant k uses edge xy in its tour, and 0 otherwise
where Lk is the cost of the kth ant's tour (typically length) and Q is a constant. Applications Ant colony optimization algorithms have been applied to many combinatorial optimization problems, ranging from quadratic assignment to protein folding or routing vehicles, and many derived methods have been adapted to dynamic problems in real variables, stochastic problems, multi-targets and parallel implementations. It has also been used to produce near-optimal solutions to the travelling salesman problem. They have an advantage over simulated annealing and genetic algorithm approaches to similar problems when the graph may change dynamically; the ant colony algorithm can be run continuously and adapt to changes in real time. This is of interest in network routing and urban transportation systems. [Figure: Knapsack problem. The ants prefer the smaller drop of honey over the more abundant, but less nutritious, sugar.] The first ACO algorithm was called the Ant system[9] and it was aimed at solving the travelling salesman problem, in which the goal is to find the shortest round-trip to link a series of cities. The general algorithm is relatively simple and based on a set of ants, each making one of the possible round-trips along the cities. At each stage, the ant chooses to move from one city to another according to some rules:
1. It must visit each city exactly once;
2. A distant city has less chance of being chosen (the visibility);
3. The more intense the pheromone trail laid out on an edge between two cities, the greater the probability that that edge will be chosen;
4. Having completed its journey, the ant deposits more pheromones on all edges it traversed, if the journey is short;
5. After each iteration, trails of pheromones evaporate.
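To make the edge selection and pheromone update rules above concrete, here is a minimal Python sketch of an Ant System style solver for a small symmetric travelling salesman problem. The function name, the parameter values (α = 1, β = 2, ρ = 0.5, Q = 1) and the random 10-city instance are illustrative assumptions rather than settings prescribed by the text.

import random

def ant_system_tsp(dist, n_ants=20, iterations=100, alpha=1.0, beta=2.0, rho=0.5, Q=1.0):
    n = len(dist)
    tau = [[1.0] * n for _ in range(n)]                                  # pheromone on each edge
    eta = [[0.0 if i == j else 1.0 / dist[i][j] for j in range(n)] for i in range(n)]  # visibility 1/d
    best_tour, best_len = None, float("inf")
    for _ in range(iterations):
        tours = []
        for _ in range(n_ants):
            start = random.randrange(n)
            tour, visited = [start], {start}
            while len(tour) < n:
                x = tour[-1]
                choices = [y for y in range(n) if y not in visited]
                # Edge selection probability proportional to (tau^alpha)(eta^beta).
                weights = [(tau[x][y] ** alpha) * (eta[x][y] ** beta) for y in choices]
                y = random.choices(choices, weights=weights)[0]
                tour.append(y)
                visited.add(y)
            length = sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))
            tours.append((tour, length))
            if length < best_len:
                best_tour, best_len = tour, length
        # Pheromone update: evaporation, then deposits of Q/L on every edge of each tour.
        for i in range(n):
            for j in range(n):
                tau[i][j] *= (1.0 - rho)
        for tour, length in tours:
            for i in range(n):
                x, y = tour[i], tour[(i + 1) % n]
                tau[x][y] += Q / length
                tau[y][x] += Q / length            # symmetric problem
    return best_tour, best_len

# Example usage on a random 10-city Euclidean instance (illustrative only).
random.seed(0)
cities = [(random.random(), random.random()) for _ in range(10)]
dist = [[((ax - bx) ** 2 + (ay - by) ** 2) ** 0.5 for (bx, by) in cities] for (ax, ay) in cities]
tour, length = ant_system_tsp(dist)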
Scheduling problem
• Job-shop scheduling problem (JSP)[10]
• Open-shop scheduling problem (OSP)[11][12]
• Permutation flow shop problem (PFSP)[13]
• Single machine total tardiness problem (SMTTP)[14]
• Single machine total weighted tardiness problem (SMTWTP)[15][16][17]
• Resource-constrained project scheduling problem (RCPSP)[18]
• Group-shop scheduling problem (GSP)[19]
• Single-machine total tardiness problem with sequence dependent setup times (SMTTPDST)[20]
• Multistage Flowshop Scheduling Problem (MFSP) with sequence dependent setup/changeover times[21]
Vehicle routing problem
• Capacitated vehicle routing problem (CVRP)[22][23][24]
• Multi-depot vehicle routing problem (MDVRP)[25]
• Period vehicle routing problem (PVRP)[26]
• Split delivery vehicle routing problem (SDVRP)[27]
• Stochastic vehicle routing problem (SVRP)[28]
• Vehicle routing problem with pick-up and delivery (VRPPD)[29][30]
• Vehicle routing problem with time windows (VRPTW)[31][32][33]
• Time Dependent Vehicle Routing Problem with Time Windows (TDVRPTW)[34]
• Vehicle Routing Problem with Time Windows and Multiple Service Workers (VRPTWMS)
Assignment problem
• Quadratic assignment problem (QAP)[35]
• Generalized assignment problem (GAP)[36][37]
• Frequency assignment problem (FAP)[38]
• Redundancy allocation problem (RAP)[39]
Set problem
• Set covering problem (SCP)[40][41]
• Set partition problem (SPP)[42]
• Weight constrained graph tree partition problem (WCGTPP)[43]
• Arc-weighted l-cardinality tree problem (AWlCTP)[44]
• Multiple knapsack problem (MKP)[45]
• Maximum independent set problem (MIS)[46]
Others
• Classification[47]
• Connection-oriented network routing[48]
• Connectionless network routing[49][50]
• Data mining[47][51][52][53]
• Discounted cash flows in project scheduling[54]
• Distributed Information Retrieval[55][56]
• Grid Workflow Scheduling Problem[57]
• Image processing[58][59]
• Intelligent testing system[60]
• System identification[61][62]
• Protein Folding[63][64]
• Power Electronic Circuit Design[65]
Definition difficulty With an ACO algorithm, the shortest path in a graph, between two points A and B, is built from a combination of several paths. It is not easy to give a precise definition of which algorithms are or are not ant colonies, because the definition may vary according to the authors and uses. Broadly speaking, ant colony algorithms are regarded as populated metaheuristics with each solution represented by an ant moving in the search space. Ants mark the best solutions and take account of previous markings to optimize their search. They can be seen as probabilistic multi-agent algorithms using a probability distribution to make the transition between each iteration. In their versions for combinatorial problems, they use an iterative construction of solutions. According to some authors, the thing which distinguishes ACO algorithms from other relatives (such as algorithms to estimate the distribution or particle swarm optimization) is precisely their constructive aspect. In combinatorial problems, it is possible that the best solution is eventually found, even though no individual ant would prove effective. Thus, in the example of the travelling salesman problem, it is not necessary that an ant actually travels the shortest route: the shortest route can be built from the strongest segments of the best solutions.
However, this definition can be problematic in the case of problems in real variables, where no structure of 'neighbours' exists. The collective behaviour of social insects remains a source of inspiration for researchers. The wide variety of algorithms (for optimization or not) seeking self-organization in biological systems has led to the concept of "swarm intelligence", which is a very general framework in which ant colony algorithms fit. Stigmergy algorithms There is in practice a large number of algorithms claiming to be "ant colonies", without always sharing the general framework of optimization by canonical ant colonies (COA). In practice, the use of an exchange of information between ants via the environment (a principle called "Stigmergy") is deemed enough for an algorithm to belong to the class of ant colony algorithms. This principle has led some authors to create the term "value" to organize methods and behavior based on search of food, sorting larvae, division of labour and cooperative transportation.[66] Related methods • Genetic algorithms (GA) maintain a pool of solutions rather than just one. The process of finding superior solutions mimics that of evolution, with solutions being combined or mutated to alter the pool of solutions, with solutions of inferior quality being discarded. • Simulated annealing (SA) is a related global optimization technique which traverses the search space by generating neighboring solutions of the current solution. A superior neighbor is always accepted. An inferior neighbor is accepted probabilistically based on the difference in quality and a temperature parameter. The temperature parameter is modified as the algorithm progresses to alter the nature of the search. • Reactive search optimization focuses on combining machine learning with optimization, by adding an internal feedback loop to self-tune the free parameters of an algorithm to the characteristics of the problem, of the 147 Ant colony optimization algorithms • • • • • • • instance, and of the local situation around the current solution. Tabu search (TS) is similar to simulated annealing in that both traverse the solution space by testing mutations of an individual solution. While simulated annealing generates only one mutated solution, tabu search generates many mutated solutions and moves to the solution with the lowest fitness of those generated. To prevent cycling and encourage greater movement through the solution space, a tabu list is maintained of partial or complete solutions. It is forbidden to move to a solution that contains elements of the tabu list, which is updated as the solution traverses the solution space. Artificial immune system (AIS) algorithms are modeled on vertebrate immune systems. Particle swarm optimization (PSO), a Swarm intelligence method Intelligent Water Drops (IWD), a swarm-based optimization algorithm based on natural water drops flowing in rivers Gravitational Search Algorithm (GSA), a Swarm intelligence method Ant colony clustering method (ACCM), a method that make use of clustering approach,extending the ACO. Stochastic diffusion search (SDS), an agent-based probabilistic global search and optimization technique best suited to problems where the objective function can be decomposed into multiple independent partial-functions History Chronology of COA algorithms Chronology of Ant colony optimization algorithms. 
1959, Pierre-Paul Grassé invented the theory of Stigmergy to explain the behavior of nest building in termites;[67] 1983, Deneubourg and his colleagues studied the collective behavior of ants;[68] 1988, and Moyson Manderick have an article on self-organization among ants;[69] 1989, the work of Goss, Aron, Deneubourg and Pasteels on the collective behavior of Argentine ants, which will give the idea of Ant colony optimization algorithms;[3] • 1989, implementation of a model of behavior for food by Ebling and his colleagues;[70] • 1991, M. Dorigo proposed the Ant System in his doctoral thesis (which was published in 1992[2]). A technical report extracted from the thesis and co-authored by V. Maniezzo and A. Colorni [71] was published five years later;[9] • 1996, publication of the article on Ant System;[9] • • • • 148 Ant colony optimization algorithms 1996, Hoos and Stützle invent the MAX-MIN Ant System;[5] 1997, Dorigo and Gambardella publish the Ant Colony System;[6] 1997, Schoonderwoerd and his colleagues developed the first application to telecommunication networks;[72] 1998, Dorigo launches first conference dedicated to the ACO algorithms;[73] 1998, Stützle proposes initial parallel implementations;[74] 1999, Bonabeau, Dorigo and Theraulaz publish a book dealing mainly with artificial ants [75] 2000, special issue of the Future Generation Computer Systems journal on ant algorithms[76] 2000, first applications to the scheduling, scheduling sequence and the satisfaction of constraints; 2000, Gutjahr provides the first evidence of convergence for an algorithm of ant colonies[77] 2001, the first use of COA Algorithms by companies (Eurobios [78] and AntOptima [79]); 2001, IREDA and his colleagues published the first multi-objective algorithm [80] 2002, first applications in the design of schedule, Bayesian networks; 2002, Bianchi and her colleagues suggested the first algorithm for stochastic problem;[81] 2004, Zlochin and Dorigo show that some algorithms are equivalent to the stochastic gradient descent, the cross-entropy and algorithms to estimate distribution [8] • 2005, first applications to protein folding problems. • • • • • • • • • • • • • • References [1] A. Colorni, M. Dorigo et V. Maniezzo, Distributed Optimization by Ant Colonies, actes de la première conférence européenne sur la vie artificielle, Paris, France, Elsevier Publishing, 134-142, 1991. [2] M. Dorigo, Optimization, Learning and Natural Algorithms, PhD thesis, Politecnico di Milano, Italie, 1992. [3] S. Goss, S. Aron, J.-L. Deneubourg et J.-M. Pasteels, Self-organized shortcuts in the Argentine ant, Naturwissenschaften, volume 76, pages 579-581, 1989 [4] J.-L. Deneubourg, S. Aron, S. Goss et J.-M. Pasteels, The self-organizing exploratory pattern of the Argentine ant, Journal of Insect Behavior, volume 3, page 159, 1990 [5] T. Stützle et H.H. Hoos, MAX MIN Ant System, Future Generation Computer Systems, volume 16, pages 889-914, 2000 [6] M. Dorigo et L.M. Gambardella, Ant Colony System : A Cooperative Learning Approach to the Traveling Salesman Problem, IEEE Transactions on Evolutionary Computation, volume 1, numéro 1, pages 53-66, 1997. [7] X Hu, J Zhang, and Y Li (2008). Orthogonal methods based ant colony search for solving continuous optimization problems. Journal of Computer Science and Technology, 23(1), pp.2-18. (http:/ / eprints. gla. ac. uk/ 3894/ ) [8] M. Zlochin, M. Birattari, N. Meuleau, et M. Dorigo, Model-based search for combinatorial optimization: A critical survey, Annals of Operations Research, vol. 
131, pp. 373-395, 2004. [9] M. Dorigo, V. Maniezzo, et A. Colorni, Ant system: optimization by a colony of cooperating agents, IEEE Transactions on Systems, Man, and Cybernetics--Part B , volume 26, numéro 1, pages 29-41, 1996. [10] D. Martens, M. De Backer, R. Haesen, J. Vanthienen, M. Snoeck, B. Baesens, Classification with Ant Colony Optimization, IEEE Transactions on Evolutionary Computation, volume 11, number 5, pages 651—665, 2007. [11] B. Pfahring, "Multi-agent search for open scheduling: adapting the Ant-Q formalism," Technical report TR-96-09, 1996. [12] C. Blem, "Beam-ACO, Hybridizing ant colony optimization with beam search. An application to open shop scheduling," Technical report TR/IRIDIA/2003-17, 2003. [13] T. Stützle, "An ant approach to the flow shop problem," Technical report AIDA-97-07, 1997. [14] A. Baucer, B. Bullnheimer, R. F. Hartl and C. Strauss, "Minimizing total tardiness on a single machine using ant colony optimization," Central European Journal for Operations Research and Economics, vol.8, no.2, pp.125-141, 2000. [15] M. den Besten, "Ants for the single machine total weighted tardiness problem," Master’s thesis, University of Amsterdam, 2000. [16] M, den Bseten, T. Stützle and M. Dorigo, "Ant colony optimization for the total weighted tardiness problem," Proceedings of PPSN-VI, Sixth International Conference on Parallel Problem Solving from Nature, vol. 1917 of Lecture Notes in Computer Science, pp.611-620, 2000. [17] D. Merkle and M. Middendorf, "An ant algorithm with a new pheromone evaluation rule for total tardiness problems," Real World Applications of Evolutionary Computing, vol. 1803 of Lecture Notes in Computer Science, pp.287-296, 2000. [18] D. Merkle, M. Middendorf and H. Schmeck, "Ant colony optimization for resource-constrained project scheduling," Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2000), pp.893-900, 2000. [19] C. Blum, "ACO applied to group shop scheduling: a case study on intensification and diversification," Proceedings of ANTS 2002, vol. 2463 of Lecture Notes in Computer Science, pp.14-27, 2002. [20] C. Gagné, W. L. Price and M. Gravel, "Comparing an ACO algorithm with other heuristics for the single machine scheduling problem with sequence-dependent setup times," Journal of the Operational Research Society, vol.53, pp.895-906, 2002. 149 Ant colony optimization algorithms [21] A. V. Donati, V. Darley, B. Ramachandran, "An Ant-Bidding Algorithm for Multistage Flowshop Scheduling Problem: Optimization and Phase Transitions", book chapter in Advances in Metaheuristics for Hard Optimization, Springer, ISBN 978-3-540-72959-4, pp.111-138, 2008. [22] P. Toth, D. Vigo, "Models, relaxations and exact approaches for the capacitated vehicle routing problem," Discrete Applied Mathematics, vol.123, pp.487-512, 2002. [23] J. M. Belenguer, and E. Benavent, "A cutting plane algorithm for capacitated arc routing problem," Computers & Operations Research, vol.30, no.5, pp.705-728, 2003. [24] T. K. Ralphs, "Parallel branch and cut for capacitated vehicle routing," Parallel Computing, vol.29, pp.607-629, 2003. [25] S. Salhi and M. Sari, "A multi-level composite heuristic for the multi-depot vehicle fleet mix problem," European Journal for Operations Research, vol.103, no.1, pp.95-112, 1997. [26] E. Angelelli and M. G. Speranza, "The periodic vehicle routing problem with intermediate facilities," European Journal for Operations Research, vol.137, no.2, pp.233-247, 2002. [27] S. C. Ho and D. 
Haugland, "A tabu search heuristic for the vehicle routing problem with time windows and split deliveries," Computers & Operations Research, vol.31, no.12, pp.1947-1964, 2004. [28] N. Secomandi, "Comparing neuro-dynamic programming algorithms for the vehicle routing problem with stochastic demands," Computers & Operations Research, vol.27, no.11, pp.1201-1225, 2000. [29] W. P. Nanry and J. W. Barnes, "Solving the pickup and delivery problem with time windows using reactive tabu search," Transportation Research Part B, vol.34, no. 2, pp.107-121, 2000. [30] R. Bent and P.V. Hentenryck, "A two-stage hybrid algorithm for pickup and delivery vehicle routing problems with time windows," Computers & Operations Research, vol.33, no.4, pp.875-893, 2003. [31] A. Bachem, W. Hochstattler and M. Malich, "The simulated trading heuristic for solving vehicle routing problems," Discrete Applied Mathematics, vol. 65, pp.47-72, 1996.. [32] [57] S. C. Hong and Y. B. Park, "A heuristic for bi-objective vehicle routing with time window constraints," International Journal of Production Economics, vol.62, no.3, pp.249-258, 1999. [33] R. A. Rusell and W. C. Chiang, "Scatter search for the vehicle routing problem with time windows," European Journal for Operations Research, vol.169, no.2, pp.606-622, 2006. [34] A. V. Donati, R. Montemanni, N. Casagrande, A. E. Rizzoli, L. M. Gambardella, "Time Dependent Vehicle Routing Problem with a Multi Ant Colony System", European Journal of Operational Research, vol.185, no.3, pp.1174–1191, 2008. [35] T. Stützle, "MAX-MIN Ant System for the quadratic assignment problem," Technical Report AIDA-97-4, FB Informatik, TU Darmstadt, Germany, 1997. [36] R. Lourenço and D. Serra "Adaptive search heuristics for the generalized assignment problem," Mathware & soft computing, vol.9, no.2-3, 2002. [37] M. Yagiura, T. Ibaraki and F. Glover, "An ejection chain approach for the generalized assignment problem," INFORMS Journal on Computing, vol. 16, no. 2, pp. 133–151, 2004. [38] K. I. Aardal, S. P. M.van Hoesel, A. M. C. A. Koster, C. Mannino and Antonio. Sassano, "Models and solution techniques for the frequency assignment problem," A Quarterly Journal of Operations Research, vol.1, no.4, pp.261-317, 2001. [39] Y. C. Liang and A. E. Smith, "An ant colony optimization algorithm for the redundancy allocation problem (RAP)," IEEE Transactions on Reliability, vol.53, no.3, pp.417-423, 2004. [40] G. Leguizamon and Z. Michalewicz, "A new version of ant system for subset problems," Proceedings of the 1999 Congress on Evolutionary Computation(CEC 99), vol.2, pp.1458-1464, 1999. [41] R. Hadji, M. Rahoual, E. Talbi and V. Bachelet "Ant colonies for the set covering problem," Abstract proceedings of ANTS2000, pp.63-66, 2000. [42] V Maniezzo and M Milandri, "An ant-based framework for very strongly constrained problems," Proceedings of ANTS2000, pp.222-227, 2002. [43] R. Cordone and F. Maffioli,"Colored Ant System and local search to design local telecommunication networks," Applications of Evolutionary Computing: Proceedings of Evo Workshops, vol.2037, pp.60-69, 2001. [44] C. Blum and M.J. Blesa, "Metaheuristics for the edge-weighted k-cardinality tree problem," Technical Report TR/IRIDIA/2003-02, IRIDIA, 2003. [45] S. Fidanova, "ACO algorithm for MKP using various heuristic information" (http:/ / parallel. bas. bg/ ~stefka/ heuristic. ps), Numerical Methods and Applications, vol.2542, pp.438-444, 2003. [46] G. Leguizamon, Z. 
Michalewicz and Martin Schutz, "An ant system for the maximum independent set problem," Proceedings of the 2001 Argentinian Congress on Computer Science, vol.2, pp.1027-1040, 2001. [47] D. Martens, M. De Backer, R. Haesen, J. Vanthienen, M. Snoeck, B. Baesens, "Classification with Ant Colony Optimization", IEEE Transactions on Evolutionary Computation, volume 11, number 5, pages 651—665, 2007. [48] G. D. Caro and M. Dorigo, "Extending AntNet for best-effort quality-of-service routing," Proceedings of the First Internation Workshop on Ant Colony Optimization (ANTS’98), 1998. [49] G.D. Caro and M. Dorigo "AntNet: a mobile agents approach to adaptive routing," Proceedings of the Thirty-First Hawaii International Conference on System Science, vol.7, pp.74-83, 1998. 150 Ant colony optimization algorithms [50] G. D. Caro and M. Dorigo, "Two ant colony algorithms for best-effort routing in datagram networks," Proceedings of the Tenth IASTED International Conference on Parallel and Distributed Computing and Systems (PDCS’98), pp.541-546, 1998. [51] D. Martens, B. Baesens, T. Fawcett "Editorial Survey: Swarm Intelligence for Data Mining," Machine Learning, volume 82, number 1, pp. 1-42, 2011 [52] R. S. Parpinelli, H. S. Lopes and A. A Freitas, "An ant colony algorithm for classification rule discovery," Data Mining: A heuristic Approach, pp.191-209, 2002. [53] R. S. Parpinelli, H. S. Lopes and A. A Freitas, "Data mining with an ant colony optimization algorithm," IEEE Transaction on Evolutionary Computation, vol.6, no.4, pp.321-332, 2002. [54] W. N. Chen, J. ZHANG and H. Chung, "Optimizing Discounted Cash Flows in Project Scheduling--An Ant Colony Optimization Approach", IEEE Transactions on Systems, Man, and Cybernetics--Part C: Applications and Reviews Vol.40 No.5 pp.64-77, Jan. 2010. [55] D. Picard, A. Revel, M. Cord, "An Application of Swarm Intelligence to Distributed Image Retrieval", Information Sciences, 2010 [56] D. Picard, M. Cord, A. Revel, "Image Retrieval over Networks : Active Learning using Ant Algorithm", IEEE Transactions on Multimedia, vol. 10, no. 7, pp. 1356--1365 - nov 2008 [57] W. N. Chen and J. ZHANG "Ant Colony Optimization Approach to Grid Workflow Scheduling Problem with Various QoS Requirements", IEEE Transactions on Systems, Man, and Cybernetics--Part C: Applications and Reviews, Vol. 31, No. 1,pp.29-43,Jan 2009. [58] S. Meshoul and M Batouche, "Ant colony system with extremal dynamics for point matching and pose estimation," Proceeding of the 16th International Conference on Pattern Recognition, vol.3, pp.823-826, 2002. [59] H. Nezamabadi-pour, S. Saryazdi, and E. Rashedi, " Edge detection using ant algorithms", Soft Computing, vol. 10, no.7, pp. 623-628, 2006. [60] Xiao. M.Hu, J. ZHANG, and H. Chung, "An Intelligent Testing System Embedded with an Ant Colony Optimization Based Test Composition Method", IEEE Transactions on Systems, Man, and Cybernetics--Part C: Applications and Reviews, Vol. 39, No. 6, pp. 659-669, Dec 2009. [61] L. Wang and Q. D. Wu, "Linear system parameters identification based on ant system algorithm," Proceedings of the IEEE Conference on Control Applications, pp.401-406, 2001. [62] K. C. Abbaspour, R. Schulin, M. T. Van Genuchten, "Estimating unsaturated soil hydraulic parameters using ant colony optimization," Advances In Water Resources, vol.24, no.8, pp.827-841, 2001. [63] X. M. Hu, J. ZHANG,J. Xiao and Y. 
Li, "Protein Folding in Hydrophobic-Polar Lattice Model: A Flexible Ant- Colony Optimization Approach ", Protein and Peptide Letters, Volume 15, Number 5, 2008, Pp. 469-477. [64] A. Shmygelska, R. A. Hernández and H. H. Hoos, "An ant colony algorithm for the 2D HP protein folding problem," Proceedings of the 3rd International Workshop on Ant Algorithms/ANTS 2002, Lecture Notes in Computer Science, vol.2463, pp.40-52, 2002. [65] J. ZHANG, H. Chung, W. L. Lo, and T. Huang, "Extended Ant Colony Optimization Algorithm for Power Electronic Circuit Design", IEEE Transactions on Power Electronic. Vol.24,No.1, pp.147-162, Jan 2009. [66] A. Ajith; G. Crina; R. Vitorino (éditeurs), Stigmergic Optimization, Studies in Computational Intelligence , volume 31, 299 pages, 2006. ISBN 978-3-540-34689-0 [67] P.-P. Grassé, La reconstruction du nid et les coordinations inter-individuelles chez Belicositermes natalensis et Cubitermes sp. La théorie de la Stigmergie : Essai d’interprétation du comportement des termites constructeurs, Insectes Sociaux, numéro 6, p. 41-80, 1959. [68] J.L. Denebourg, J.M. Pasteels et J.C. Verhaeghe, Probabilistic Behaviour in Ants : a Strategy of Errors?, Journal of Theoretical Biology, numéro 105, 1983. [69] F. Moyson, B. Manderick, The collective behaviour of Ants : an Example of Self-Organization in Massive Parallelism, Actes de AAAI Spring Symposium on Parallel Models of Intelligence, Stanford, Californie, 1988. [70] M. Ebling, M. Di Loreto, M. Presley, F. Wieland, et D. Jefferson,An Ant Foraging Model Implemented on the Time Warp Operating System, Proceedings of the SCS Multiconference on Distributed Simulation, 1989 [71] Dorigo M., V. Maniezzo et A. Colorni, Positive feedback as a search strategy, rapport technique numéro 91-016, Dip. Elettronica, Politecnico di Milano, Italy, 1991 [72] R. Schoonderwoerd, O. Holland, J. Bruten et L. Rothkrantz, Ant-based load balancing in telecommunication networks, Adaptive Behaviour, volume 5, numéro 2, pages 169-207, 1997 [73] M. Dorigo, ANTS’ 98, From Ant Colonies to Artificial Ants : First International Workshop on Ant Colony Optimization, ANTS 98, Bruxelles, Belgique, octobre 1998. [74] T. Stützle, Parallelization Strategies for Ant Colony Optimization, Proceedings of PPSN-V, Fifth International Conference on Parallel Problem Solving from Nature, Springer-Verlag, volume 1498, pages 722-731, 1998. [75] É. Bonabeau, M. Dorigo et G. Theraulaz, Swarm intelligence, Oxford University Press, 1999. [76] M. Dorigo , G. Di Caro et T. Stützle, Special issue on "Ant Algorithms", Future Generation Computer Systems, volume 16, numéro 8, 2000 [77] W.J. Gutjahr, A graph-based Ant System and its convergence, Future Generation Computer Systems, volume 16, pages 873-888, 2000. [78] http:/ / www. eurobios. com/ [79] http:/ / www. antoptima. com/ [80] S. Iredi, D. Merkle et M. Middendorf, Bi-Criterion Optimization with Multi Colony Ant Algorithms, Evolutionary Multi-Criterion Optimization, First International Conference (EMO’01), Zurich, Springer Verlag, pages 359-372, 2001. [81] L. Bianchi, L.M. Gambardella et M.Dorigo, An ant colony optimization approach to the probabilistic traveling salesman problem, PPSN-VII, Seventh International Conference on Parallel Problem Solving from Nature, Lecture Notes in Computer Science, Springer Verlag, Berlin, Allemagne, 2002. 151 Ant colony optimization algorithms Publications (selected) • M. Dorigo, 1992. Optimization, Learning and Natural Algorithms, PhD thesis, Politecnico di Milano, Italy. • M. Dorigo, V. 
Maniezzo & A. Colorni, 1996. "Ant System: Optimization by a Colony of Cooperating Agents", IEEE Transactions on Systems, Man, and Cybernetics–Part B, 26 (1): 29–41. • M. Dorigo & L. M. Gambardella, 1997. "Ant Colony System: A Cooperative Learning Approach to the Traveling Salesman Problem". IEEE Transactions on Evolutionary Computation, 1 (1): 53–66. • M. Dorigo, G. Di Caro & L. M. Gambardella, 1999. "Ant Algorithms for Discrete Optimization". Artificial Life, 5 (2): 137–172. • E. Bonabeau, M. Dorigo et G. Theraulaz, 1999. Swarm Intelligence: From Natural to Artificial Systems, Oxford University Press. ISBN 0-19-513159-2 • M. Dorigo & T. Stützle, 2004. Ant Colony Optimization, MIT Press. ISBN 0-262-04219-3 • M. Dorigo, 2007. "Ant Colony Optimization" (http://www.scholarpedia.org/article/ Ant_Colony_Optimization). Scholarpedia. • C. Blum, 2005 "Ant colony optimization: Introduction and recent trends". Physics of Life Reviews, 2: 353-373 • M. Dorigo, M. Birattari & T. Stützle, 2006 Ant Colony Optimization: Artificial Ants as a Computational Intelligence Technique (http://iridia.ulb.ac.be/IridiaTrSeries/IridiaTr2006-023r001.pdf). TR/IRIDIA/2006-023 • Mohd Murtadha Mohamad,"Articulated Robots Motion Planning Using Foraging Ant Strategy",Journal of Information Technology - Special Issues in Artificial Intelligence, Vol.20, No. 4 pp. 163–181, December 2008, ISSN0128-3790. • N. Monmarché, F. Guinand & P. Siarry (eds), "Artificial Ants", August 2010 Hardback 576 pp. ISBN 9781848211940. External links • Ant Colony Optimization Home Page (http://www.aco-metaheuristic.org/) • AntSim - Simulation of Ant Colony Algorithms (http://www.nightlab.ch/antsim) • MIDACO-Solver (http://www.midaco-solver.com/) General purpose optimization software based on Ant Colony Optimization (Matlab, Excel, C/C++, Fortran, Python) • University of Kaiserslautern, Germany, AG Wehn: Ant Colony Optimization Applet (http://ems.eit.uni-kl.de/ index.php?id=156) Visualization of Traveling Salesman solved by Ant System with numerous options and parameters (Java Applet) • Ant Farm Simulator (http://webspace.webring.com/people/br/raguirre/hormigas/antfarm/) • Ant algorithm simulation (Java Applet) (http://www.djoh.net/inde/ANTColony/applet.html) 152 Artificial bee colony algorithm Artificial bee colony algorithm In computer science and operations research, the artificial bee colony algorithm (ABC) is an optimization algorithm based on the intelligent foraging behaviour of honey bee swarm, proposed by Karaboga in 2005.[1] Algorithm In the ABC model, the colony consists of three groups of bees: employed bees, onlookers and scouts. It is assumed that there is only one artificial employed bee for each food source. In other words, the number of employed bees in the colony is equal to the number of food sources around the hive. Employed bees go to their food source and come back to hive and dance on this area. The employed bee whose food source has been abandoned becomes a scout and starts to search for finding a new food source. Onlookers watch the dances of employed bees and choose food sources depending on dances. The main steps of the algorithm are given below: • Initial food sources are produced for all employed bees • REPEAT • Each employed bee goes to a food source in her memory and determines a neighbour source, then evaluates its nectar amount and dances in the hive • Each onlooker watches the dance of employed bees and chooses one of their sources depending on the dances, and then goes to that source. 
After choosing a neighbour around that, she evaluates its nectar amount. • Abandoned food sources are determined and are replaced with the new food sources discovered by scouts. • The best food source found so far is registered. • UNTIL (requirements are met) In ABC, a population based algorithm, the position of a food source represents a possible solution to the optimization problem and the nectar amount of a food source corresponds to the quality (fitness) of the associated solution. The number of the employed bees is equal to the number of solutions in the population. At the first step, a randomly distributed initial population (food source positions) is generated. After initialization, the population is subjected to repeat the cycles of the search processes of the employed, onlooker, and scout bees, respectively. An employed bee produces a modification on the source position in her memory and discovers a new food source position. Provided that the nectar amount of the new one is higher than that of the previous source, the bee memorizes the new source position and forgets the old one. Otherwise she keeps the position of the one in her memory. After all employed bees complete the search process, they share the position information of the sources with the onlookers on the dance area. Each onlooker evaluates the nectar information taken from all employed bees and then chooses a food source depending on the nectar amounts of sources. As in the case of the employed bee, she produces a modification on the source position in her memory and checks its nectar amount. Providing that its nectar is higher than that of the previous one, the bee memorizes the new position and forgets the old one. The sources abandoned are determined and new sources are randomly produced to be replaced with the abandoned ones by artificial scouts. Application to real-world problems Since 2005, D. Karaboga and his research group [2] have been studying the ABC algorithm and its applications to real world problems. Karaboga and Basturk have investigated the performance of the ABC algorithm on unconstrained numerical optimization problems [3][4][5] and its extended version for the constrained optimization problems [6] and Karaboga et al. applied ABC algorithm to neural network training.[7][8] In 2010, Hadidi et al. employed an Artificial Bee Colony (ABC) Algorithm based approach for structural optimization.[9] In 2011, Zhang et al. employed the ABC for optimal multi-level thresholding,[10] MR brain image classification,[11] cluster analysis,[12] face pose estimation,[13] and 2D protein folding.[14] 153 Artificial bee colony algorithm References [1] D. Karaboga, An Idea Based On Honey Bee Swarm for Numerical Optimization, Technical Report-TR06,Erciyes University, Engineering Faculty, Computer Engineering Department 2005. [2] "Artificial bee colony (ABC) algorithm homepage" (http:/ / mf. erciyes. edu. tr/ abc). Mf.erciyes.edu.tr. . Retrieved 2012-02-19. [3] B.Basturk, Dervis Karaboga, An Artificial Bee Colony (ABC) Algorithm for Numeric function Optimization, IEEE Swarm Intelligence Symposium 2006, May 12–14, 2006, Indianapolis, Indiana, USA. [4] D. Karaboga, B. Basturk, A Powerful And Efficient Algorithm For Numerical Function Optimization: Artificial Bee Colony (ABC) Algorithm, Journal of Global Optimization, Volume:39 , Issue:3 ,pp: 459–471, Springer Netherlands, 2007. doi: 10.1007/s10898-007-9149-x [5] D. Karaboga, B. 
Basturk, On The Performance Of Artificial Bee Colony (ABC) Algorithm, Applied Soft Computing,Volume 8, Issue 1, January 2008, Pages 687–697. doi:10.1016/j.asoc.2007.05.007 [6] D. Karaboga, B. Basturk, Artificial Bee Colony (ABC) Optimization Algorithm for Solving Constrained Optimization Problems, LNCS: Advances in Soft Computing: Foundations of Fuzzy Logic and Soft Computing, Vol: 4529/2007, pp: 789–798, Springer- Verlag, 2007, IFSA 2007. doi: 10.1007/978-3-540-72950-1_77 [7] D. Karaboga, B. Basturk Akay, Artificial Bee Colony Algorithm on Training Artificial Neural Networks, Signal Processing and Communications Applications, 2007. SIU 2007, IEEE 15th. 11–13 June 2007, Page(s):1 – 4, doi: 10.1109/SIU.2007.4298679 [8] D. Karaboga, B. Basturk Akay, C. Ozturk, Artificial Bee Colony (ABC) Optimization Algorithm for Training Feed-Forward Neural Networks, LNCS: Modeling Decisions for Artificial Intelligence, Vol: 4617/2007, pp:318–319, Springer-Verlag, 2007, MDAI 2007. doi: 10.1007/978-3-540-73729-2_30 [9] Ali Hadidi, Sina Kazemzadeh Azad, Saeid Kazemzadeh Azad, Structural optimization using artificial bee colony algorithm, 2nd International Conference on Engineering Optimization, 2010, September 6 – 9, Lisbon, Portugal. [10] Y. Zhang and L. Wu, Optimal multi-level Thresholding based on Maximum Tsallis Entropy via an Artificial Bee Colony Approach, Entropy, vol. 13, no. 4, (2011), pp. 841-859 [11] Y. Zhang, L. Wu, and S. Wang, Magnetic Resonance Brain Image Classification by an Improved Artificial Bee Colony Algorithm, Progress in Electromagnetics Research, vol. 116, (2011), pp. 65-79 [12] Y. Zhang, L. Wu, S. Wang, Y. Huo, Chaotic Artificial Bee Colony used for Cluster Analysis, Communications in Computer and Information Science, vol. 134, no. 1, (2011), pp. 205-211 [13] Y. Zhang, L. Wu, Face Pose Estimation by Chaotic Artificial Bee Colony, International Journal of Digital Content Technology and its Applications, vol. 5, no. 2, (2011), pp. 55-63 [14] Y. Zhang and L. Wu, Artificial Bee Colony for Two Dimensional Protein Folding, Advances in Electrical Engineering Systems, vol. 1, no. 1, (2012), pp. 19-23 10. Mustafa Sonmez,Discrete optimum design of truss structures using artificial bee colony algorithm, Structural and Multidisciplinary Optimization, Volume 43 Issue 1, January 2011 External links • Artificial Bee Colony Algorithm (http://mf.erciyes.edu.tr/abc) 154 Evolution strategy Evolution strategy In computer science, Evolution Strategy (ES) is an optimization technique based on ideas of adaptation and evolution. It belongs to the general class of evolutionary computation or artificial evolution methodologies. History The evolution strategy optimization technique was created in the early 1960s and developed further in the 1970s and later by Ingo Rechenberg, Hans-Paul Schwefel and his co-workers. Methods Evolution strategies use natural problem-dependent representations, and primarily mutation and selection, as search operators. In common with evolutionary algorithms, the operators are applied in a loop. An iteration of the loop is called a generation. The sequence of generations is continued until a termination criterion is met. As far as real-valued search spaces are concerned, mutation is normally performed by adding a normally distributed random value to each vector component. The step size or mutation strength (i.e. the standard deviation of the normal distribution) is often governed by self-adaptation (see evolution window). 
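As a minimal illustration of this mutation step (a sketch only, written in Python for brevity; the names mutate and tau are illustrative and not taken from the text), the following perturbs each vector component with a normally distributed value and lets the step size σ self-adapt through a log-normal factor, one common self-adaptation scheme:

import numpy as np

def mutate(x, sigma, tau=None):
    # One self-adaptive Gaussian mutation as commonly used in evolution strategies.
    # x     -- parent object variables (1-D numpy array)
    # sigma -- current mutation strength (standard deviation of the normal distribution)
    # tau   -- learning rate of the log-normal step-size change (defaults to 1/sqrt(n))
    n = len(x)
    if tau is None:
        tau = 1.0 / np.sqrt(n)
    # Self-adaptation: mutate the step size first with a log-normal factor ...
    new_sigma = sigma * np.exp(tau * np.random.randn())
    # ... then add a normally distributed value to each vector component.
    new_x = x + new_sigma * np.random.randn(n)
    return new_x, new_sigma

# Example: one (1+1)-style step on the sphere function
f = lambda x: np.sum(x**2)
parent, sigma = np.random.randn(5), 1.0
child, child_sigma = mutate(parent, sigma)
if f(child) <= f(parent):   # plus-selection: keep the mutant only if it is at least as good
    parent, sigma = child, child_sigma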
Individual step sizes for each coordinate or correlations between coordinates are either governed by self-adaptation or by covariance matrix adaptation (CMA-ES). The (environmental) selection in evolution strategies is deterministic and only based on the fitness rankings, not on the actual fitness values. The resulting algorithm is therefore invariant with respect to monotonic transformations of the objective function. The simplest evolution strategy operates on a population of size two: the current point (parent) and the result of its mutation. Only if the mutant's fitness is at least as good as the parent one, it becomes the parent of the next generation. Otherwise the mutant is disregarded. This is a (1 + 1)-ES. More generally, λ mutants can be generated and compete with the parent, called (1 + λ)-ES. In (1 , λ)-ES the best mutant becomes the parent of the next generation while the current parent is always disregarded. For some of these variants, proofs of linear convergence (in a stochastic sense) have been derived on unimodal objective functions.[1][2] Contemporary derivatives of evolution strategy often use a population of μ parents and also recombination as an additional operator, called (μ/ρ+, λ)-ES. This makes them less prone to get stuck in local optima.[3] References [1] Auger, A. (2005). "Convergence results for the (1,λ)-SA-ES using the theory of φ-irreducible Markov chains". Theoretical Computer Science (Elsevier) 334 (1-3): 35–69. doi:10.1016/j.tcs.2004.11.017. [2] Jägersküpper, J. (2006). "How the (1+1) ES using isotropic mutations minimizes positive definite quadratic forms". Theoretical Computer Science (Elsevier) 361 (1): 38–56. doi:10.1016/j.tcs.2006.04.004. [3] Hansen, N.; S. Kern (2004). "Evaluating the CMA Evolution Strategy on Multimodal Test Functions". Parallel Problem Solving from Nature - PPSN VIII. Springer. pp. 282–291. doi:10.1007/978-3-540-30217-9_29. Bibliography • Ingo Rechenberg (1971): Evolutionsstrategie – Optimierung technischer Systeme nach Prinzipien der biologischen Evolution (PhD thesis). Reprinted by Fromman-Holzboog (1973). • Hans-Paul Schwefel (1974): Numerische Optimierung von Computer-Modellen (PhD thesis). Reprinted by Birkhäuser (1977). • H.-G. Beyer and H.-P. Schwefel. Evolution Strategies: A Comprehensive Introduction. Journal Natural Computing, 1(1):3–52, 2002. • Hans-Georg Beyer: The Theory of Evolution Strategies: Springer April 27, 2001. 155 Evolution strategy • Hans-Paul Schwefel: Evolution and Optimum Seeking: New York: Wiley & Sons 1995. • Ingo Rechenberg: Evolutionsstrategie '94. Stuttgart: Frommann-Holzboog 1994. • J. Klockgether and H. P. Schwefel (1970). Two-Phase Nozzle And Hollow Core Jet Experiments. AEG-Forschungsinstitut. MDH Staustrahlrohr Project Group. Berlin, Federal Republic of Germany. Proceedings of the 11th Symposium on Engineering Aspects of Magneto-Hydrodynamics, Caltech, Pasadena, Cal., 24.–26.3. 1970. Research centers • Bionics & Evolutiontechnique at the Technical University Berlin (http://www.bionik.tu-berlin.de/institut/ xstart.htm) • Chair of Algorithm Engineering (Ls11) – University of Dortmund (http://ls11-www.cs.uni-dortmund.de/) • Collaborative Research Center 531 – University of Dortmund (http://sfbci.cs.uni-dortmund.de/) External links • http://www.scholarpedia.org/article/Evolution_Strategies :A peer-reviewed discussion of the subject. • Animation: Optimization of a Two-Phase Flashing Nozzle with an Evolution Strategy. 
(http://evonet.lri.fr/ CIRCUS2/node.php?node=72) Animation of the Classical Experimental Optimization of a two phase flashing nozzle made by Professor Hans-Paul Schwefel and J. Klockgether. The result was shown at the Proceedings of the 11th Symposium on Engineering Aspects of Magneto-Hydrodynamics, Caltech, Pasadena, Cal., 24.–26.3. 1970. • CMA Evolution Strategy (http://www.lri.fr/~hansen/cmaesintro.html) – a contemporary variant where the complete covariance matrix of the multivariate normal mutation distribution is adapted. • Comparison of Evolutionary Algorithms on a Benchmark Function Set – The 2005 IEEE Congress on Evolutionary Computation: Session on Real-Parameter Optimization (http://www.lri.fr/~hansen/cec2005. html) - The CMA-ES (Covariance Matrix Adaptation Evolution Strategy) applied in a benchmark function set and compared to nine other Evolutionary Algorithms. • Evolution Strategies (http://www.bionik.tu-berlin.de/institut/xs2evost.html) – A brief description. • Evolution Strategies Animations (http://www.bionik.tu-berlin.de/institut/xs2anima.html) - Some interesting animations and real world problems (such as format of lenses, bridges configurations, etc) solved through Evolution Strategies. • Evolution Strategy in Action – 10 ES-Demonstrations. By Michael Herdy and Gianino Patone (http://www. bionik.tu-berlin.de/user/giani/esdemos/evo.html) – 10 problems solved through Evolution Strategies. • Evolutionary Algorithms Demos (http://www.frankiedrk.de/demos.html) – There are some applets with Evolution Strategies and Genetic Algorithms that the user can manipulate to solve problems. Very interesting for a comparison between the two Evolutionary Algorithms. • Evolutionary Car Racing Videos (http://togelius.blogspot.com/2006/04/evolutionary-car-racing-videos.html) – The application of Evolution Strategies to evolve cars' behaviours. • EvoWeb. (http://evonet.lri.fr/index.php) – The European Network of Excellence in Evolutionary Computing. • Learning To Fly: Evolving Helicopter Flight Through Simulated Evolution (http://togelius.blogspot.com/2006/ 08/learning-to-fly.html) – A (10 + 23)-ES applied to evolve a helicopter flight controller. • Professor Hans-Paul Schwefel talks to EvoNews (http://evonet.lri.fr/evoweb/news_events/news_features/ article.php?id=5) – An interview with Professor Hans-Paul Schwefel, one of the Evolution Strategy pioneers. 156 Evolution window 157 Evolution window It was observed in evolution strategies that significant progress toward the fitness/objective function's optimum, generally, can only happen in a narrow band of the mutation step size σ. That narrow band is called evolution window. There are three well-known methods to adapt the mutation step size σ in evolution strategies: • (1/5-th) Success Rule • Self-Adaptation (for example through log-normal mutations) • Cumulative Step Size Adaptation (CSA) On simple functions all of them have been empirically shown to keep the step size within the evolution window. References • H.-G. Beyer. Toward a Theory of Evolution Strategies: Self-Adaptation. Evolutionary Computation, 3(3), 311-347. • Ingo Rechenberg: Evolutionsstrategie '94. Stuttgart: Frommann-Holzboog 1994. CMA-ES CMA-ES stands for Covariance Matrix Adaptation Evolution Strategy. Evolution strategies (ES) are stochastic, derivative-free methods for numerical optimization of non-linear or non-convex continuous optimization problems. They belong to the class of evolutionary algorithms and evolutionary computation. 
An evolutionary algorithm is broadly based on the principle of biological evolution, namely the repeated interplay of variation (via mutation and recombination) and selection: in each generation (iteration) new individuals (candidate solutions, denoted as x) are generated by variation, usually in a stochastic way, and then some individuals are selected for the next generation based on their fitness or objective function value f(x). Like this, over the generation sequence, individuals with better and better f-values are generated.

In an evolution strategy, new candidate solutions are sampled according to a multivariate normal distribution in R^n. Pairwise dependencies between the variables in this distribution are represented by a covariance matrix. The covariance matrix adaptation (CMA) is a method to update the covariance matrix of this distribution. This is particularly useful if the function f is ill-conditioned. Adaptation of the covariance matrix amounts to learning a second-order model of the underlying objective function, similar to the approximation of the inverse Hessian matrix in the quasi-Newton method in classical optimization. In contrast to most classical methods, fewer assumptions on the nature of the underlying objective function are made. Only the ranking between candidate solutions is exploited for learning the sample distribution; neither derivatives nor even the function values themselves are required by the method.

Principles

[Figure: Concept behind the covariance matrix adaptation. As the generations develop, the distribution shape adapts to an ellipsoidal or ridge-like landscape.]

Two main principles for the adaptation of parameters of the search distribution are exploited in the CMA-ES algorithm. First, a maximum-likelihood principle, based on the idea to increase the probability of successful candidate solutions and search steps. The mean of the distribution is updated such that the likelihood of previously successful candidate solutions is maximized. The covariance matrix of the distribution is updated (incrementally) such that the likelihood of previously successful search steps is increased. Both updates can be interpreted as a natural gradient descent. Also, in consequence, the CMA conducts an iterated principal components analysis of successful search steps while retaining all principal axes. Estimation of distribution algorithms and the Cross-Entropy Method are based on very similar ideas, but estimate (non-incrementally) the covariance matrix by maximizing the likelihood of successful solution points instead of successful search steps.

Second, two paths of the time evolution of the distribution mean of the strategy are recorded, called search or evolution paths. These paths contain significant information about the correlation between consecutive steps. Specifically, if consecutive steps are taken in a similar direction, the evolution paths become long. The evolution paths are exploited in two ways. One path is used for the covariance matrix adaptation procedure in place of single successful search steps and facilitates a possibly much faster variance increase in favorable directions. The other path is used to conduct an additional step-size control. This step-size control aims to make consecutive movements of the distribution mean orthogonal in expectation. The step-size control effectively prevents premature convergence while allowing fast convergence to an optimum.
Algorithm

In the following the most commonly used (μ/μ_w, λ)-CMA-ES is outlined, where in each iteration step a weighted combination of the μ best out of λ new candidate solutions is used to update the distribution parameters. The main loop consists of three main parts: 1) sampling of new solutions, 2) re-ordering of the sampled solutions based on their fitness, 3) update of the internal state variables based on the re-ordered samples. A pseudocode of the algorithm looks as follows.

set λ                                        // number of samples per iteration, at least two, generally > 4
initialize m, σ, C = I, p_σ = 0, p_c = 0     // initialize state variables
while not terminate                          // iterate
    for i in {1, ..., λ}                     // sample λ new solutions and evaluate them
        x_i = sample_multivariate_normal(mean = m, covariance_matrix = σ² C)
        f_i = fitness(x_i)
    x_{1...λ} ← x_{s(1)...s(λ)} with s = argsort(f_1, ..., f_λ)   // sort solutions
    m' = m                                   // we need m − m' later
    m ← update_m(x_1, ..., x_λ)              // move mean to better solutions
    p_σ ← update_ps(p_σ, σ⁻¹ C^(−1/2) (m − m'))          // update isotropic evolution path
    p_c ← update_pc(p_c, σ⁻¹ (m − m'), ‖p_σ‖)            // update anisotropic evolution path
    C ← update_C(C, p_c, (x_1 − m')/σ, ..., (x_λ − m')/σ) // update covariance matrix
    σ ← update_sigma(σ, ‖p_σ‖)               // update step-size using isotropic path length
return m or x_1

The order of the five update assignments is relevant. In the following, the update equations for the five state variables are specified. Given are the search space dimension n and the iteration step k. The five state variables are

• m_k ∈ R^n, the distribution mean and current favorite solution to the optimization problem,
• σ_k > 0, the step-size,
• C_k, a symmetric and positive definite n×n covariance matrix with C_0 = I, and
• p_σ ∈ R^n and p_c ∈ R^n, two evolution paths, initially set to the zero vector.

The iteration starts with sampling λ > 1 candidate solutions x_i ∈ R^n from a multivariate normal distribution N(m_k, σ_k² C_k), i.e. for i = 1, ..., λ

    x_i ~ N(m_k, σ_k² C_k)
        ~ m_k + σ_k N(0, C_k).

The second line suggests the interpretation as perturbation (mutation) of the current favorite solution vector m_k (the distribution mean vector). The candidate solutions x_i are evaluated on the objective function f: R^n → R to be minimized. Denoting the f-sorted candidate solutions as

    {x_{i:λ} | i = 1, ..., λ} = {x_i | i = 1, ..., λ}  with  f(x_{1:λ}) ≤ ... ≤ f(x_{μ:λ}) ≤ ... ≤ f(x_{λ:λ}),

the new mean value is computed as

    m_{k+1} = Σ_{i=1}^{μ} w_i x_{i:λ} = m_k + Σ_{i=1}^{μ} w_i (x_{i:λ} − m_k),

where the positive (recombination) weights w_1 ≥ w_2 ≥ ... ≥ w_μ > 0 sum to one. Typically, μ ≤ λ/2 and the weights are chosen such that the variance effective selection mass μ_w := 1 / Σ_{i=1}^{μ} w_i² is about λ/4. The only feedback used from the objective function here and in the following is an ordering of the sampled candidate solutions due to the indices i:λ.

The step-size σ_k is updated using cumulative step-size adaptation (CSA), sometimes also denoted as path length control. The evolution path (or search path) p_σ is updated first.

    p_σ ← (1 − c_σ) p_σ + √(c_σ (2 − c_σ) μ_w) · C_k^(−1/2) (m_{k+1} − m_k) / σ_k
    σ_{k+1} = σ_k · exp( (c_σ / d_σ) ( ‖p_σ‖ / E‖N(0, I)‖ − 1 ) )

where

• c_σ^(−1) is the backward time horizon for the evolution path p_σ and larger than one,
• μ_w = (Σ_{i=1}^{μ} w_i²)^(−1) is the variance effective selection mass and 1 ≤ μ_w ≤ μ by definition of the w_i,
• C_k^(−1/2) is the unique symmetric square root of the inverse of C_k, and
• d_σ is the damping parameter, usually close to one. For d_σ = ∞ or c_σ = 0 the step-size remains unchanged.

The step-size σ_k is increased if and only if ‖p_σ‖ is larger than the expected value E‖N(0, I)‖ ≈ √n (1 − 1/(4n) + 1/(21n²)) and decreased if it is smaller. For this reason, the step-size update tends to make consecutive steps C_k^(−1)-conjugate,[1] in that after the adaptation has been successful (m_{k+2} − m_{k+1})^T C_k^(−1) (m_{k+1} − m_k) ≈ 0.

Finally, the covariance matrix is updated, where again the respective evolution path is updated first.

    p_c ← (1 − c_c) p_c + 1_{[0, α√n]}(‖p_σ‖) · √(c_c (2 − c_c) μ_w) (m_{k+1} − m_k) / σ_k
    C_{k+1} = (1 − c_1 − c_μ + c_s) C_k + c_1 p_c p_c^T + c_μ Σ_{i=1}^{μ} w_i ((x_{i:λ} − m_k)/σ_k) ((x_{i:λ} − m_k)/σ_k)^T

where ^T denotes the transpose and

• c_c^(−1) is the backward time horizon for the evolution path p_c and larger than one,
• the indicator function 1_{[0, α√n]}(‖p_σ‖) evaluates to one iff ‖p_σ‖ ∈ [0, α√n] or, in other words, ‖p_σ‖ is not much larger than its expected length, which is usually the case,
• c_s = (1 − 1_{[0, α√n]}(‖p_σ‖)²) c_1 c_c (2 − c_c) makes partly up for the small variance loss in case the indicator is zero,
• c_1 is the learning rate for the rank-one update of the covariance matrix and
• c_μ is the learning rate for the rank-μ update of the covariance matrix and must not exceed 1 − c_1.

The covariance matrix update tends to increase the likelihood for p_c and for (x_{i:λ} − m_k)/σ_k to be sampled from N(0, C_{k+1}).
This completes the iteration step. The number of candidate samples per iteration, , is not determined a priori and can vary in a wide range. Smaller values, for example default value increasing , lead to more local search behavior. Larger values, for example with , render the search more global. Sometimes the algorithm is repeatedly restarted with by a factor of two for each restart.[2] Besides of setting (or possibly instead, if for example is predetermined by the number of available processors), the above introduced parameters are not specific to the given objective function and therefore not meant to be modified by the user. Example code in Matlab/Octave function xmin=purecmaes % (mu/mu_w, lambda)-CMA-ES % -------------------- Initialization -------------------------------% User defined input parameters (need to be edited) strfitnessfct = 'frosenbrock'; % name of objective/fitness function N = 20; % number of objective variables/problem dimension xmean = rand(N,1); % objective variables initial point sigma = 0.5; % coordinate wise standard deviation (step size) CMA-ES 161 stopfitness = 1e-10; stopeval = 1e3*N^2; evaluations % stop if fitness < stopfitness (minimization) % stop after stopeval number of function % Strategy parameter setting: Selection lambda = 4+floor(3*log(N)); % population size, offspring number mu = lambda/2; % number of parents/points for recombination weights = log(mu+1/2)-log(1:mu)'; % muXone array for weighted recombination mu = floor(mu); weights = weights/sum(weights); % normalize recombination weights array mueff=sum(weights)^2/sum(weights.^2); % variance-effectiveness of sum w_i x_i % Strategy parameter setting: Adaptation cc = (4+mueff/N) / (N+4 + 2*mueff/N); % time constant for cumulation for C cs = (mueff+2) / (N+mueff+5); % t-const for cumulation for sigma control c1 = 2 / ((N+1.3)^2+mueff); % learning rate for rank-one update of C cmu = 2 * (mueff-2+1/mueff) / ((N+2)^2+mueff); % and for rank-mu update damps = 1 + 2*max(0, sqrt((mueff-1)/(N+1))-1) + cs; % damping for sigma % usually close to 1 % Initialize dynamic (internal) strategy parameters and constants pc = zeros(N,1); ps = zeros(N,1); % evolution paths for C and sigma B = eye(N,N); % B defines the coordinate system D = ones(N,1); % diagonal D defines the scaling C = B * diag(D.^2) * B'; % covariance matrix C invsqrtC = B * diag(D.^-1) * B'; % C^-1/2 eigeneval = 0; % track update of B and D chiN=N^0.5*(1-1/(4*N)+1/(21*N^2)); % expectation of % ||N(0,I)|| == norm(randn(N,1)) % -------------------- Generation Loop -------------------------------counteval = 0; % the next 40 lines contain the 20 lines of interesting code while counteval < stopeval % Generate and evaluate lambda offspring CMA-ES 162 for k=1:lambda, arx(:,k) = xmean + sigma * B * (D .* randn(N,1)); % m + sig * Normal(0,C) arfitness(k) = feval(strfitnessfct, arx(:,k)); % objective function call counteval = counteval+1; end % Sort by fitness and compute weighted mean into xmean [arfitness, arindex] = sort(arfitness); % minimization xold = xmean; xmean = arx(:,arindex(1:mu))*weights; % recombination, new mean value % Cumulation: Update evolution paths ps = (1-cs)*ps ... + sqrt(cs*(2-cs)*mueff) * invsqrtC * (xmean-xold) / sigma; hsig = norm(ps)/sqrt(1-(1-cs)^(2*counteval/lambda))/chiN < 1.4 + 2/(N+1); pc = (1-cc)*pc ... + hsig * sqrt(cc*(2-cc)*mueff) * (xmean-xold) / sigma; % Adapt covariance matrix C artmp = (1/sigma) * (arx(:,arindex(1:mu))-repmat(xold,1,mu)); C = (1-c1-cmu) * C ... % regard old matrix + c1 * (pc*pc' ... 
% plus rank one update + (1-hsig) * cc*(2-cc) * C) ... % minor correction if hsig==0 + cmu * artmp * diag(weights) * artmp'; % plus rank mu update % Adapt step size sigma sigma = sigma * exp((cs/damps)*(norm(ps)/chiN - 1)); % Decomposition of C into B*diag(D.^2)*B' (diagonalization) if counteval - eigeneval > lambda/(c1+cmu)/N/10 % to achieve O(N^2) eigeneval = counteval; C = triu(C) + triu(C,1)'; % enforce symmetry [B,D] = eig(C); % eigen decomposition, B==normalized eigenvectors D = sqrt(diag(D)); % D is a vector of standard deviations now invsqrtC = B * diag(D.^-1) * B'; end % Break, if fitness is good enough or condition exceeds 1e14, better termination methods are advisable CMA-ES 163 if arfitness(1) <= stopfitness || max(D) > 1e7 * min(D) break; end end % while, end generation loop xmin = arx(:, arindex(1)); % Return best point of last iteration. % Notice that xmean is expected to be even % better. % --------------------------------------------------------------function f=frosenbrock(x) if size(x,1) < 2 error('dimension must be greater one'); end f = 100*sum((x(1:end-1).^2 - x(2:end)).^2) + sum((x(1:end-1)-1).^2); Theoretical Foundations Given the distribution parameters—mean, variances and covariances—the normal probability distribution for sampling new candidate solutions is the maximum entropy probability distribution over , that is, the sample distribution with the minimal amount of prior information built into the distribution. More considerations on the update equations of CMA-ES are made in the following. Variable Metric The CMA-ES implements a stochastic variable-metric method. In the very particular case of a convex-quadratic objective function the covariance matrix adapts to the inverse of the Hessian matrix fluctuations. More general, also on the function preserving and , where is convex-quadratic, the covariance matrix , up to a scalar factor and small random is strictly increasing and therefore order adapts to , up to a scalar factor and small random fluctuations. Maximum-Likelihood Updates The update equations for mean and covariance matrix maximize a likelihood while resembling an expectation-maximization algorithm. The update of the mean vector maximizes a log-likelihood, such that where denotes the log-likelihood of from a multivariate normal distribution with mean and any positive definite covariance matrix . To see that is independent of remark first that this is the case for any diagonal matrix , because the coordinate-wise maximizer is independent of a scaling factor. Then, rotation of the data points or choosing The rank- non-diagonal are equivalent. update of the covariance matrix, that is, the right most summand in the update equation of maximizes a log-likelihood in that , CMA-ES for 164 (otherwise is singular, but substantially the same result holds for denotes the likelihood of Therefore, for ). Here, from a multivariate normal distribution with zero mean and covariance matrix and , . is the above maximum-likelihood estimator. See estimation of covariance matrices for details on the derivation. Natural Gradient Descent in the Space of Sample Distributions Akimoto et al.[3] recently found that the update of the distribution parameters resembles the descend in direction of a sampled natural gradient of the expected objective function value E f (x) (to be minimized), where the expectation is taken under the sample distribution. With the parameter setting of and , i.e. 
without step-size control and rank-one update, CMA-ES can thus be viewed as an instantiation of Natural Evolution Strategies (NES).[3][4] The natural gradient is independent of the parameterization of the distribution. Taken with respect to the parameters θ of the sample distribution p, the gradient of E f (x) can be expressed as where depends on the parameter vector , the so-called score function, , indicates the relative sensitivity of p w.r.t. θ, and the expectation is taken with respect to the distribution p. The natural gradient of E f (x), complying with the Fisher information metric (an informational distance measure between probability distributions and the curvature of the relative entropy), now reads where the Fisher information matrix is the expectation of the Hessian of -lnp and renders the expression independent of the chosen parameterization. Combining the previous equalities we get A Monte Carlo approximation of the latter expectation takes the average over λ samples from p where the notation from above is used and therefore for a more robust approximation, rather such that expression for and for are monotonously decreasing in . We might use, as defined in the CMA-ES and zero for i > μ and let is the density of the multivariate normal distribution . Then, we have an explicit CMA-ES 165 and, after some calculations, the updates in the CMA-ES turn out as[3] \begin{align} m_{k+1} &= m_k - \underbrace{[\tilde{\nabla} \widehat{E}_\theta(f)]_{1,\dots, n}}_{ \!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\! \text{natural gradient for mean} \!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\! } \\ &= m_k + \sum_{i=1}^\lambda w_i (x_{i:\lambda} - m_k) \end{align} and where mat forms the proper matrix from the respective natural gradient sub-vector. That means, setting , the CMA-ES updates descend in direction of the approximation of the natural gradient while using different step-sizes (learning rates) for the orthogonal parameters and respectively. Stationarity or Unbiasedness It is comparatively easy to see that the update equations of CMA-ES satisfy some stationarity conditions, in that they are essentially unbiased. Under neutral selection, where , we find that and under some mild additional assumptions on the initial conditions and with an additional minor correction in the covariance matrix update for the case where the indicator function evaluates to zero, we find Invariance Invariance properties imply uniform performance on a class of objective functions. They have been argued to be an advantage, because they allow to generalize and predict the behavior of the algorithm and therefore strengthen the meaning of empirical results obtained on single functions. The following invariance properties have been established for CMA-ES. • Invariance under order-preserving transformations of the objective function value the behavior is identical on invariance is easy to verify, because only the choice of . • Scale-invariance, in that for any given for all strictly increasing . This -ranking is used in the algorithm, which is invariant under the the behavior is independent of and , in that for any for the objective function . • Invariance under rotation of the search space in that for any is independent of the orthogonal matrix and any , given algorithm is also invariant under general linear transformations the behavior on . More general, the when additionally the initial covariance matrix is chosen as . 
Any serious parameter optimization method should be translation invariant, but most methods do not exhibit all the above described invariance properties. A prominent example with the same invariance properties is the CMA-ES 166 Nelder–Mead method, where the initial simplex must be chosen respectively. Convergence Conceptual considerations like the scale-invariance property of the algorithm, the analysis of simpler evolution strategies, and overwhelming empirical evidence suggest that the algorithm converges on a large class of functions fast to the global optimum, denoted as . On some functions, convergence occurs independently of the initial conditions with probability one. On some functions the probability is smaller than one and typically depends on the initial and . Empirically, the fastest possible convergence rate in for rank-based direct search methods can often be observed (depending on the context denoted as linear or log-linear or exponential convergence). Informally, we can write for some , and more rigorously or similarly, This means that on average the distance to the optimum is decreased in each iteration by a "constant" factor, namely by . The convergence rate is roughly , given is not much larger than the dimension . Even with optimal recombination weights and , the convergence rate cannot largely exceed are all non-negative. The actual linear dependencies in , given the above and are remarkable and they are in both cases the best one can hope for in this kind of algorithm. Yet, a rigorous proof of convergence is missing. Interpretation as Coordinate System Transformation Using a non-identity covariance matrix for the multivariate normal distribution in evolution strategies is equivalent to a coordinate system transformation of the solution vectors,[5] mainly because the sampling equation can be equivalently expressed in an "encoded space" as The covariance matrix defines a bijective transformation (encoding) for all solution vectors into a space, where the sampling takes place with identity covariance matrix. Because the update equations in the CMA-ES are invariant under coordinate system transformations (general linear transformations), the CMA-ES can be re-written as an adaptive encoding procedure applied to a simple evolution strategy with identity covariance matrix.[5] This adaptive encoding procedure is not confined to algorithms that sample from a multivariate normal distribution (like evolution strategies), but can in principle be applied to any iterative search method. CMA-ES 167 Performance in Practice In contrast to most other evolutionary algorithms, the CMA-ES is, from the users perspective, quasi parameter-free. However, the number of candidate samples λ (population size) can be adjusted by the user in order to change the characteristic search behavior (see above). CMA-ES has been empirically successful in hundreds of applications and is considered to be useful in particular on non-convex, non-separable, ill-conditioned, multi-modal or noisy objective functions. The search space dimension ranges typically between two and a few hundred. 
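As a usage illustration (not part of the original text), the sketch below drives a CMA-ES run through the ask-and-tell interface of the third-party Python package pycma; the package, the Rosenbrock test function and all parameter values are assumptions chosen for the example, not prescriptions from the article.

import cma  # third-party package "pycma"; assumed to be installed

def rosenbrock(x):
    # Classic non-separable, ill-conditioned test function.
    return sum(100.0 * (x[i + 1] - x[i] ** 2) ** 2 + (1.0 - x[i]) ** 2
               for i in range(len(x) - 1))

# Initial mean (20-dimensional) and initial step-size sigma0 = 0.5
es = cma.CMAEvolutionStrategy(20 * [0.1], 0.5)

while not es.stop():
    solutions = es.ask()                                      # sample lambda candidate solutions
    es.tell(solutions, [rosenbrock(x) for x in solutions])    # rank them and update m, sigma, C

print(es.result.xbest)  # best solution found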
Assuming a black-box optimization scenario, where gradients are not available (or not useful) and function evaluations are the only considered cost of search, the CMA-ES method is likely to be outperformed by other methods in the following conditions: • on low-dimensional functions, say , for example by the downhill simplex method or surrogate-based methods (like kriging with expected improvement); • on separable functions without or with only negligible dependencies between the design variables in particular in the case of multi-modality or large dimension, for example by differential evolution; • on (nearly) convex-quadratic functions with low or moderate condition number of the Hessian matrix, where BFGS or NEWUOA are typically ten times faster; • on functions that can already be solved with a comparatively small number of function evaluations, say no more than , where CMA-ES is often slower than, for example, NEWUOA or Multilevel Coordinate Search (MCS). On separable functions the performance disadvantage is likely to be most significant, in that CMA-ES might not be able to find at all comparable solutions. On the other hand, on non-separable functions that are ill-conditioned or rugged or can only be solved with more than function evaluations, the CMA-ES shows most often superior performance. Variations and Extensions The (1+1)-CMA-ES [6] generates only one candidate solution per iteration step which only becomes the new distribution mean, if it is better than the old mean. For it is a close variant of Gaussian adaptation. The CMA-ES has also been extended to multiobjective optimization as MO-CMA-ES .[7] Another remarkable extension has been the addition of a negative update of the covariance matrix with the so-called active CMA .[8] References [1] Hansen, N. (2006), "The CMA evolution strategy: a comparing review", Towards a new evolutionary computation. Advances on estimation of distribution algorithms, Springer, pp. 1769–1776 [2] Auger, A.; N. Hansen (2005). "A Restart CMA Evolution Strategy With Increasing Population Size" (http:/ / citeseerx. ist. psu. edu/ viewdoc/ download?doi=10. 1. 1. 97. 8108& rep=rep1& type=pdf). 2005 IEEE Congress on Evolutionary Computation, Proceedings. IEEE. pp. 1769–1776. . [3] Akimoto, Y.; Y. Nagata and I. Ono and S. Kobayashi (2010). "Bidirectional Relation between CMA Evolution Strategies and Natural Evolution Strategies". Parallel Problem Solving from Nature, PPSN XI. Springer. pp. 154–163. [4] Glasmachers, T.; T. Schaul, Y. Sun, D. Wierstra and J. Schmidhuber (2010). "Exponential Natural Evolution Strategies" (http:/ / www. idsia. ch/ ~tom/ publications/ xnes. pdf). Genetic and Evolutionary Computation Conference GECCO. Portland, OR. . [5] Hansen, N. (2008). "Adpative Encoding: How to Render Search Coordinate System Invariant" (http:/ / hal. archives-ouvertes. fr/ inria-00287351/ en/ ). Parallel Problem Solving from Nature, PPSN X. Springer. pp. 205–214. . [6] Igel, C.; T. Suttorp and N. Hansen (2006). "A Computational Efficient Covariance Matrix Update and a (1+1)-CMA for Evolution Strategies" (http:/ / www. cs. york. ac. uk/ rts/ docs/ GECCO_2006/ docs/ p453. pdf). Proceedings of the Genetic and Evolutionary Computation Conference (GECCO). ACM Press. pp. 453–460. . [7] Igel, C.; N. Hansen and S. Roth (2007). "Covariance Matrix Adaptation for Multi-objective Optimization" (http:/ / www. mitpressjournals. org/ doi/ pdfplus/ 10. 1162/ evco. 2007. 15. 1. 1). Evolutionary Computation (MIT press) 15 (1): 1–28. doi:10.1162/evco.2007.15.1.1. 
PMID 17388777. . [8] Jastrebski, G.A.; D.V. Arnold (2006). "Improving Evolution Strategies through Active Covariance Matrix Adaptation". 2006 IEEE World Congress on Computational Intelligence, Proceedings. IEEE. pp. 9719–9726. doi:10.1109/CEC.2006.1688662. CMA-ES Bibliography • Hansen N, Ostermeier A (2001). Completely derandomized self-adaptation in evolution strategies. Evolutionary Computation, 9(2) (http://www.mitpressjournals.org/toc/evco/9/2) pp. 159–195. (http://www.lri.fr/ ~hansen/cmaartic.pdf) • Hansen N, Müller SD, Koumoutsakos P (2003). Reducing the time complexity of the derandomized evolution strategy with covariance matrix adaptation (CMA-ES). Evolutionary Computation, 11(1) (http://www. mitpressjournals.org/toc/evco/11/1) pp. 1–18. (http://mitpress.mit.edu/journals/pdf/evco_11_1_1_0.pdf) • Hansen N, Kern S (2004). Evaluating the CMA evolution strategy on multimodal test functions. In Xin Yao et al., editors, Parallel Problem Solving from Nature - PPSN VIII, pp. 282–291, Springer. (http://www.lri.fr/~hansen/ ppsn2004hansenkern.pdf) • Igel C, Hansen N, Roth S (2007). Covariance Matrix Adaptation for Multi-objective Optimization. Evolutionary Computation, 15(1) (http://www.mitpressjournals.org/toc/evco/15/1) pp. 1–28. (http://www. mitpressjournals.org/doi/pdfplus/10.1162/evco.2007.15.1.1) External links • A short introduction to CMA-ES by N. Hansen (http://www.lri.fr/~hansen/cmaesintro.html) • The CMA Evolution Strategy: A Tutorial (http://www.lri.fr/~hansen/cmatutorial.pdf) • CMA-ES source code page (http://www.lri.fr/~hansen/cmaes_inmatlab.html) Cultural algorithm Cultural algorithms (CA) are a branch of evolutionary computation where there is a knowledge component that is called the belief space in addition to the population component. In this sense, cultural algorithms can be seen as an extension to a conventional genetic algorithm. Cultural algorithms were introduced by Reynolds (see references). Belief space The belief space of a cultural algorithm is divided into distinct categories. These categories represent different domains of knowledge that the population has of the search space. The belief space is updated after each iteration by the best individuals of the population. The best individuals can be selected using a fitness function that assesses the performance of each individual in population much like in genetic algorithms. List of belief space categories • Normative knowledge A collection of desirable value ranges for the individuals in the population component e.g. acceptable behavior for the agents in population. • Domain specific knowledge Information about the domain of the problem CA is applied to. • Situational knowledge Specific examples of important events - e.g. successful/unsuccessful solutions • Temporal knowledge History of the search space - e.g. the temporal patterns of the search process • Spatial knowledge Information about the topography of the search space 168 Cultural algorithm Population The population component of the cultural algorithm is approximately the same as that of the genetic algorithm. Communication protocol Cultural algorithms require an interface between the population and belief space. The best individuals of the population can update the belief space via the update function. Also, the knowledge categories of the belief space can affect the population component via the influence function. The influence function can affect population by altering the genome or the actions of the individuals. Pseudo-code for cultural algorithms 1. 
Initialize population space (choose initial population)
2. Initialize belief space (e.g. set domain-specific knowledge and normative value ranges)
3. Repeat until the termination condition is met:
   1. Perform the actions of the individuals in population space
   2. Evaluate each individual by using the fitness function
   3. Select the parents to reproduce a new generation of offspring
   4. Let the belief space alter the genome of the offspring by using the influence function
   5. Update the belief space by using the accept function (this is done by letting the best individuals affect the belief space)

Applications
• Various optimization problems
• Social simulation

References
• Robert G. Reynolds, Ziad Kobti, Tim Kohler: Agent-Based Modeling of Cultural Change in Swarm Using Cultural Algorithms [1]
• R. G. Reynolds, "An Introduction to Cultural Algorithms", in Proceedings of the 3rd Annual Conference on Evolutionary Programming, World Scientific Publishing, pp. 131–139, 1994.
• Robert G. Reynolds, Bin Peng. Knowledge Learning and Social Swarms in Cultural Systems. Journal of Mathematical Sociology, 29:1-18, 2005.
• Reynolds, R. G., and Ali, M. Z., "Embedding a Social Fabric Component into Cultural Algorithms Toolkit for an Enhanced Knowledge-Driven Engineering Optimization", International Journal of Intelligent Computing and Cybernetics (IJICC), Vol. 1, No. 4, pp. 356-378, 2008.
• Reynolds, R. G., and Ali, M. Z., Exploring Knowledge and Population Swarms via an Agent-Based Cultural Algorithms Simulation Toolkit (CAT), in Proceedings of the IEEE Congress on Computational Intelligence, 2007.

References
[1] http://www.cscs.umich.edu/swarmfest04/Program/PapersSlides/Kobti-SwarmFest04_kobti_reynolds_kohler.pdf

Learning classifier system
A learning classifier system, or LCS, is a machine learning system with close links to reinforcement learning and genetic algorithms. First described by John Holland, his LCS consisted of a population of binary rules on which a genetic algorithm altered and selected the best rules. Rule fitness was based on a reinforcement learning technique. Learning classifier systems can be split into two types depending upon where the genetic algorithm acts. A Pittsburgh-type LCS has a population of separate rule sets, where the genetic algorithm recombines and reproduces the best of these rule sets. In a Michigan-style LCS there is only a single set of rules in a population and the algorithm's action focuses on selecting the best classifiers within that set. Michigan-style LCSs have two main types of fitness definitions, strength-based (e.g. ZCS) and accuracy-based (e.g. XCS). The term "learning classifier system" most often refers to Michigan-style LCSs. Initially the classifiers or rules were binary, but recent research has expanded this representation to include real-valued, neural network, and functional (S-expression) conditions. Learning classifier systems are not fully understood mathematically and doing so remains an area of active research. Despite this, they have been successfully applied in many problem domains.

Overview
A learning classifier system (LCS) is an adaptive system that learns to perform the best action given its input. By "best" is generally meant the action that will receive the most reward or reinforcement from the system's environment. By "input" is meant the environment as sensed by the system, usually a vector of numerical values.
The set of available actions depends on the decision context, for instance a financial one, the actions might be "buy", "sell", etc. In general, an LCS is a simple model of an intelligent agent interacting with an environment. An LCS is "adaptive" in the sense that its ability to choose the best action improves with experience. The source of the improvement is reinforcement—technically, payoff--provided by the environment. In many cases, the payoff is arranged by the experimenter or trainer of the LCS. For instance, in a classification context, the payoff may be 1.0 for "correct" and 0.0 for "incorrect". In a robotic context, the payoff could be a number representing the change in distance to a recharging source, with more desirable changes (getting closer) represented by larger positive numbers, etc. Often, systems can be set up so that effective reinforcement is provided automatically, for instance via a distance sensor. Payoff received for a given action is used by the LCS to alter the likelihood of taking that action, in those circumstances, in the future. To understand how this works, it is necessary to describe some of the LCS mechanics. Inside the LCS is a set—technically, a population--of "condition-action rules" called classifiers. There may be hundreds of classifiers in the population. When a particular input occurs, the LCS forms a so-called match set of classifiers whose conditions are satisfied by that input. Technically, a condition is a truth function t(x) which is satisfied for certain input vectors x. For instance, in a certain classifier, it may be that t(x)=1 (true) for 43 < x3 < 54, where x3 is a component of x, and represents, say, the age of a medical patient. In general, a classifier's condition will refer to more than one of the input components, usually all of them. If a classifier's condition is satisfied, i.e. its t(x)=1, then that classifier joins the match set and influences the system's action decision. In a sense, the match set consists of classifiers in the population that recognize the current input. Among the classifiers—the condition-action rules—of the match set will be some that advocate one of the possible actions, some that advocate another of the actions, and so forth. Besides advocating an action, a classifier will also contain a prediction of the amount of payoff which, speaking loosely, "it thinks" will be received if the system takes that action. How can the LCS decide which action to take? Clearly, it should pick the action that is likely to receive the highest payoff, but with all the classifiers making (in general) different predictions, how can it decide? The technique adopted is to compute, for each action, an average of the predictions of the classifiers advocating that action—and then choose the action with the largest average. The prediction average is in fact weighted by another 170 Learning classifier system classifier quantity, its fitness, which will be described later but is intended to reflect the reliability of the classifier's prediction. The LCS takes the action with the largest average prediction, and in response the environment returns some amount of payoff. If it is in a learning mode, the LCS will use this payoff, P, to alter the predictions of the responsible classifiers, namely those advocating the chosen action; they form what is called the action set. In this adjustment, each action set classifier's prediction p is changed mathematically to bring it slightly closer to P, with the aim of increasing its accuracy. 
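A minimal sketch of this decision step, assuming classifiers are stored as simple Python objects with condition, action, prediction and fitness fields (the class and variable names are illustrative, not taken from any particular LCS implementation):

from dataclasses import dataclass

@dataclass
class Classifier:
    condition: callable   # truth function t(x): returns True if the input is matched
    action: str
    prediction: float     # predicted payoff p
    fitness: float        # reliability of the prediction (assumed positive)

def choose_action(population, x):
    # Form the match set: classifiers whose condition is satisfied by the input x
    # (assumes at least one classifier matches).
    match_set = [cl for cl in population if cl.condition(x)]
    # Fitness-weighted average prediction for each advocated action.
    averages = {}
    for action in {cl.action for cl in match_set}:
        advocates = [cl for cl in match_set if cl.action == action]
        total_fitness = sum(cl.fitness for cl in advocates)
        averages[action] = sum(cl.fitness * cl.prediction for cl in advocates) / total_fitness
    # Pick the action with the largest weighted average prediction.
    best_action = max(averages, key=averages.get)
    action_set = [cl for cl in match_set if cl.action == best_action]
    return best_action, action_set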
Besides its prediction, each classifier maintains an estimate ε of the error of its predictions. Like p, ε is adjusted on each learning encounter with the environment by moving ε slightly closer to the current absolute error |p - P|. Finally, a quantity called the classifier's fitness is adjusted by moving it closer to an inverse function of ε, which can be regarded as measuring the accuracy of the classifier. The result of these adjustments will hopefully be to improve the classifier's prediction and to derive a measure—the fitness—that indicates its accuracy. The adaptivity of the LCS is not, however, limited to adjusting classifier predictions. At a deeper level, the system treats the classifiers as an evolving population in which accurate—i.e. high fitness—classifiers are reproduced over less accurate ones and the "offspring" are modified by genetic operators such as mutation and crossover. In this way, the population of classifiers gradually changes over time, that is, it adapts structurally. Evolution of the population is the key to high performance since the accuracy of predictions depends closely on the classifier conditions, which are changed by evolution. Evolution takes place in the background as the system is interacting with its environment. Each time an action set is formed, there is finite chance that a genetic algorithm will occur in the set. Specifically, two classifiers are selected from the set with probabilities proportional to their fitnesses. The two are copied and the copies (offspring) may, with certain probabilities, be mutated and recombined ("crossed"). Mutation means changing, slightly, some quantity or aspect of the classifier condition; the action may also be changed to one of the other actions. Crossover means exchanging parts of the two classifiers. Then the offspring are inserted into the population and two classifiers are deleted to keep the population at a constant size. The new classifiers, in effect, compete with their parents, which are still (with high probability) in the population. The effect of classifier evolution is to modify their conditions so as to increase the overall prediction accuracy of the population. This occurs because fitness is based on accuracy. In addition, however, the evolution leads to an increase in what can be called the "accurate generality" of the population. That is, classifier conditions evolve to be as general as possible without sacrificing accuracy. Here, general means maximizing the number of input vectors that the condition matches. The increase in generality results in the population needing fewer distinct classifiers to cover all inputs, which means (if identical classifiers are merged) that populations are smaller, and also that the knowledge contained in the population is more visible to humans—which is important in many applications. The specific mechanism by which generality increases is a major, if subtle, side-effect of the overall evolution. Summarizing, a learning classifier system is a broadly-applicable adaptive system that learns from external reinforcement and through an internal structural evolution derived from that reinforcement. In addition to adaptively increasing its performance, the LCS develops knowledge in the form of rules that respond to different aspects of the environment and capture environmental regularities through the generality of their conditions. 
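The payoff-driven adjustments described earlier in this overview can be written as simple delta-rule (Widrow-Hoff style) updates. The sketch below extends the Classifier objects from the previous sketch with an error attribute; the learning rate beta and the particular inverse function of the error used as accuracy are illustrative assumptions, not values fixed by the text.

def update_action_set(action_set, payoff, beta=0.2, eps0=10.0):
    # Move each action-set classifier's prediction, error estimate and fitness
    # slightly toward the quantities observed on this learning step.
    for cl in action_set:
        abs_error = abs(payoff - cl.prediction)
        # Prediction p moves slightly closer to the received payoff P.
        cl.prediction += beta * (payoff - cl.prediction)
        # Error estimate epsilon moves slightly closer to the current absolute error |p - P|.
        cl.error = getattr(cl, "error", abs_error)
        cl.error += beta * (abs_error - cl.error)
        # Fitness moves toward an inverse function of the error, a measure of accuracy.
        accuracy = 1.0 if cl.error < eps0 else eps0 / cl.error
        cl.fitness += beta * (accuracy - cl.fitness)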
Many important aspects of LCS were omitted in the above presentation, including among others: use in sequential (multi-step) tasks, modifications for non-Markov (locally ambiguous) environments, learning in the presence of noise, incorporation of continuous-valued actions, learning of relational concepts, learning of hyper-heuristics, and use for on-line function approximation and clustering. An LCS appears to be a widely applicable cognitive/agent model that can act as a framework for a diversity of learning investigations and practical applications. 171 Learning classifier system External links • • • • Review article by Urbanowicz & Moore [1] LCS & GBML Central [2] UWE Learning Classifier Research Group [3] Prediction Dynamics [4] References [1] [2] [3] [4] http:/ / www. hindawi. com/ archive/ 2009/ 736398. html http:/ / gbml. org/ http:/ / www. cems. uwe. ac. uk/ lcsg/ http:/ / prediction-dynamics. com/ Memetic algorithm Memetic algorithms (MA) represent one of the recent growing areas of research in evolutionary computation. The term MA is now widely used as a synergy of evolutionary or any population-based approach with separate individual learning or local improvement procedures for problem search. Quite often, MA are also referred to in the literature as Baldwinian Evolutionary algorithms (EA), Lamarckian EAs, cultural algorithms or genetic local search. Introduction The theory of “Universal Darwinism” was coined by Richard Dawkins in 1983[1] to provide a unifying framework governing the evolution of any complex system. In particular, “Universal Darwinism” suggests that evolution is not exclusive to biological systems, i.e., it is not confined to the narrow context of the genes, but applicable to any complex system that exhibit the principles of inheritance, variation and selection, thus fulfilling the traits of an evolving system. For example, the new science of memetics represents the mind-universe analogue to genetics in culture evolution that stretches across the fields of biology, cognition and psychology, which has attracted significant attention in the last decades. The term “meme” was also introduced and defined by Dawkins in 1976[2] as “the basic unit of cultural transmission, or imitation”, and in the Oxford English Dictionary as “an element of culture that may be considered to be passed on by non-genetic means”. Inspired by both Darwinian principles of natural evolution and Dawkins’ notion of a meme, the term “Memetic Algorithm” (MA) was first introduced by Moscato in his technical report[3] in 1989 where he viewed MA as being close to a form of population-based hybrid genetic algorithm (GA) coupled with an individual learning procedure capable of performing local refinements. The metaphorical parallels, on the one hand, to Darwinian evolution and, on the other hand, between memes and domain specific (local search) heuristics are captured within memetic algorithms thus rendering a methodology that balances well between generality and problem specificity. In a more diverse context, memetic algorithms are now used under various names including Hybrid Evolutionary Algorithms, Baldwinian Evolutionary Algorithms, Lamarckian Evolutionary Algorithms, Cultural Algorithms or Genetic Local Search. In the context of complex optimization, many different instantiations of memetic algorithms have been reported across a wide range of application domains, in general, converging to high quality solutions more efficiently than their conventional evolutionary counterparts. 
In general, using the ideas of memetics within a computational framework is called "Memetic Computing" (MC).[4][5] With MC, the traits of Universal Darwinism are more appropriately captured. Viewed in this perspective, MA is a more constrained notion of MC. More specifically, MA covers one area of MC, in particular dealing with areas of evolutionary algorithms that marry other deterministic refinement techniques for solving optimization problems. MC extends the notion of memes to cover conceptual entities of knowledge-enhanced procedures or representations.

The development of MAs

1st generation
The first generation of MA refers to hybrid algorithms, a marriage between a population-based global search (often in the form of an evolutionary algorithm) and a cultural evolutionary stage. Although this first generation of MA encompasses characteristics of cultural evolution (in the form of local refinement) in the search cycle, it may not qualify as a true evolving system according to Universal Darwinism, since all the core principles of inheritance/memetic transmission, variation and selection are missing. This suggests why the term MA stirred up criticism and controversy among researchers when first introduced.[3]

Pseudo code:
Procedure Memetic Algorithm
    Initialize: Generate an initial population;
    while Stopping conditions are not satisfied do
        Evaluate all individuals in the population.
        Evolve a new population using stochastic search operators.
        Select the subset of individuals, Ω_il, that should undergo the individual improvement procedure.
        for each individual in Ω_il do
            Perform individual learning using meme(s) with frequency or probability of f_il, for a period of t_il.
            Proceed with Lamarckian or Baldwinian learning.
        end for
    end while

2nd generation
Multi-meme,[6] Hyper-heuristic[7] and Meta-Lamarckian MA[8] are referred to as second generation MA exhibiting the principles of memetic transmission and selection in their design. In Multi-meme MA, the memetic material is encoded as part of the genotype. Subsequently, the decoded meme of each respective individual / chromosome is then used to perform a local refinement. The memetic material is then transmitted through a simple inheritance mechanism from parent to offspring(s). On the other hand, in hyper-heuristic and meta-Lamarckian MA, the pool of candidate memes considered will compete, based on their past merits in generating local improvements through a reward mechanism, deciding on which meme is selected to proceed with future local refinements. Memes with a higher reward have a greater chance of being replicated or copied. For a review of second generation MA, i.e., MA considering multiple individual learning methods within an evolutionary system, the reader is referred to.[9]

3rd generation
Co-evolution[10] and self-generating MAs[11] may be regarded as 3rd generation MA where all three principles satisfying the definitions of a basic evolving system have been considered. In contrast to 2nd generation MA, which assumes that the memes to be used are known a priori, 3rd generation MA utilizes a rule-based local search to supplement candidate solutions within the evolutionary system, thus capturing regularly repeated features or patterns in the problem space.

Some design notes
The frequency and intensity of individual learning directly define the degree of evolution (exploration) against individual learning (exploitation) in the MA search, for a given fixed and limited computational budget.
Clearly, more intense individual learning provides a greater chance of convergence to the local optima, but it limits the amount of evolution that may be expended without incurring excessive computational resources. Therefore, care should be taken when setting these two parameters to balance the computational budget available in achieving maximum search performance. When only a portion of the population individuals undergo learning, the issue of which subset of individuals to improve needs to be considered to maximize the utility of MA search. Last but not least, the individual learning procedure/meme used also favors a different neighborhood structure, hence it is necessary to decide which meme or memes to use for the optimization problem at hand.

How often should individual learning be applied?
One of the first issues pertinent to memetic algorithm design is to consider how often the individual learning should be applied, i.e., the individual learning frequency. In one case,[12] the effect of individual learning frequency on MA search performance was considered, where various configurations of the individual learning frequency at different stages of the MA search were investigated. Conversely, it was shown elsewhere[13] that it may be worthwhile to apply individual learning to every individual if the computational complexity of the individual learning is relatively low.

On which solutions should individual learning be used?
On the issue of selecting appropriate individuals among the EA population that should undergo individual learning, fitness-based and distribution-based strategies were studied for adapting the probability of applying individual learning on the population of chromosomes in continuous parametric search problems, with Land[14] extending the work to combinatorial optimization problems. Bambha et al. introduced a simulated heating technique for systematically integrating parameterized individual learning into evolutionary algorithms to achieve maximum solution quality.[15]

How long should individual learning be run?
The individual learning intensity, t_il, is the amount of computational budget allocated to an iteration of individual learning, i.e., the maximum computational budget allowable for individual learning to expend on improving a single solution.

What individual learning method or meme should be used for a particular problem or individual?
In the context of continuous optimization, individual learning exists in the form of local heuristics or conventional exact enumerative methods.[16] Examples of individual learning strategies include hill climbing, the Simplex method, the Newton/quasi-Newton method, interior point methods, the conjugate gradient method, line search, and other local heuristics. Note that most common individual learning methods are deterministic. In combinatorial optimization, on the other hand, individual learning methods commonly exist in the form of heuristics (which can be deterministic or stochastic) that are tailored to the problem of interest. Typical heuristic procedures and schemes include k-gene exchange, edge exchange, first-improvement, and many others.
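As a concrete illustration of these design choices (a sketch only, not a canonical implementation), the following hybridizes a simple generational evolutionary loop with a bounded hill-climbing meme. The individual learning frequency f_il and intensity t_il appear as explicit parameters, so the exploration/exploitation trade-off discussed above is visible in code; all function and parameter names, and the default values, are illustrative assumptions.

import random

def memetic_algorithm(fitness, random_solution, mutate, local_step,
                      pop_size=30, generations=100, f_il=0.2, t_il=10):
    # A minimal first-generation MA: evolutionary global search plus individual learning.
    # f_il -- probability that an individual undergoes individual learning (learning frequency)
    # t_il -- individual learning intensity: local-search steps spent on one selected individual
    # The refined solution is written back into the population (Lamarckian learning);
    # a Baldwinian variant would keep the original genotype and reuse only the refined fitness.
    population = [random_solution() for _ in range(pop_size)]
    for _ in range(generations):
        # Evaluate and select parents (truncation selection, minimization).
        population.sort(key=fitness)
        parents = population[:pop_size // 2]
        # Evolve a new population using a stochastic search operator (mutation only, for brevity).
        population = [mutate(random.choice(parents)) for _ in range(pop_size)]
        # Individual learning: a bounded hill climb (the "meme") on a randomly chosen subset.
        for i, individual in enumerate(population):
            if random.random() < f_il:
                best, best_fit = individual, fitness(individual)
                for _ in range(t_il):
                    candidate = local_step(best)
                    cand_fit = fitness(candidate)
                    if cand_fit <= best_fit:
                        best, best_fit = candidate, cand_fit
                population[i] = best   # Lamarckian write-back
    population.sort(key=fitness)
    return population[0]

# Example on a one-dimensional quadratic: minimize (x - 3)^2
best = memetic_algorithm(
    fitness=lambda x: (x - 3.0) ** 2,
    random_solution=lambda: random.uniform(-10, 10),
    mutate=lambda x: x + random.gauss(0, 1.0),
    local_step=lambda x: x + random.gauss(0, 0.1))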
Although many people employ techniques closely related to memetic algorithms, alternative names such as hybrid genetic algorithms are also employed. Furthermore, many people term their memetic techniques as genetic algorithms. The widespread use of this misnomer hampers the assessment of the total amount of applications. Researchers have used memetic algorithms to tackle many classical NP problems. To cite some of them: graph partitioning, multidimensional knapsack, travelling salesman problem, quadratic assignment problem, set cover problem, minimal graph colouring, max independent set problem, bin packing problem and generalized assignment problem. More recent applications include (but are not limited to): training of artificial neural networks,[17] pattern recognition,[18] robotic motion planning,[19] beam orientation,[20] circuit design,[21] electric service restoration,[22] medical expert systems,[23] single machine scheduling,[24] automatic timetabling (notably, the timetable for the NHL),[25] manpower scheduling,[26] nurse rostering and function optimisation,[27] processor allocation,[28] maintenance scheduling (for example, of an electric distribution network),[29] multidimensional knapsack problem,[30] VLSI design,[31] clustering of gene expression profiles,[32] feature/gene selection,[33][34] and multi-class, multi-objective feature selection.[35] Recent Activities in Memetic Algorithms • IEEE Workshop on Memetic Algorithms (WOMA 2009). Program Chairs: Jim Smith, University of the West of England, U.K.; Yew-Soon Ong, Nanyang Technological University, Singapore; Gustafson Steven, University of Nottingham; U.K.; Meng Hiot Lim, Nanyang Technological University, Singapore; Natalio Krasnogor, University of Nottingham, U.K. • Memetic Computing Journal [36], first issue appeared in January 2009. • 2008 IEEE World Congress on Computational Intelligence (WCCI 2008) [37], Hong Kong, Special Session on Memetic Algorithms [38]. • Special Issue on 'Emerging Trends in Soft Computing - Memetic Algorithm' [39], Soft Computing Journal, Completed & In Press, 2008. • IEEE Computational Intelligence Society Emergent Technologies Task Force on Memetic Computing [40] • IEEE Congress on Evolutionary Computation (CEC 2007) [41], Singapore, Special Session on Memetic Algorithms [42]. • 'Memetic Computing' [43] by Thomson Scientific's Essential Science Indicators as an Emerging Front Research Area. • Special Issue on Memetic Algorithms [44], IEEE Transactions on Systems, Man and Cybernetics - Part B, Vol. 37, No. 1, February 2007. • Recent Advances in Memetic Algorithms [45], Series: Studies in Fuzziness and Soft Computing, Vol. 166, ISBN 978-3-540-22904-9, 2005. • Special Issue on Memetic Algorithms [46], Evolutionary Computation Fall 2004, Vol. 12, No. 3: v-vi. 175 Memetic algorithm References [1] Dawkins, Richard (1983). "Universal Darwinism". In Bendall, D. S.. Evolution from molecules to man. Cambridge University Press. [2] Dawkins, Richard (1976). The Selfish Gene. Oxford University Press. ISBN 0199291152. [3] Moscato, P. (1989). "On Evolution, Search, Optimization, Genetic Algorithms and Martial Arts: Towards Memetic Algorithms". Caltech Concurrent Computation Program (report 826). [4] Chen, X. S.; Ong, Y. S.; Lim, M. H.; Tan, K. C. (2011). "A Multi-Facet Survey on Memetic Computation". IEEE Transactions on Evolutionary Computation 15 (5): 591-607. [5] Chen, X. S.; Ong, Y. S.; Lim, M. H. (2010). "Research Frontier: Memetic Computation - Past, Present & Future". 
IEEE Computational Intelligence Magazine 5 (2): 24-36. [6] Krasnogor N. (1999). "Coevolution of genes and memes in memetic algorithms". Graduate Student Workshop: 371. [7] Kendall G. and Soubeiga E. and Cowling P.. "Choice function and random hyperheuristics". 4th Asia-Pacific Conference on Simulated Evolution and Learning SEAL 2002: 667–671. [8] Ong Y. S. and Keane A. J. (2004). "Meta-Lamarckian learning in memetic algorithms". IEEE Transactions on Evolutionary Computation 8 (2): 99–110. doi:10.1109/TEVC.2003.819944. [9] Ong Y. S. and Lim M. H. and Zhu N. and Wong K. W. (2006). "Classification of Adaptive Memetic Algorithms: A Comparative Study". IEEE Transactions on Systems Man and Cybernetics -- Part B. 36 (1): 141. doi:10.1109/TSMCB.2005.856143. [10] Smith J. E. (2007). "Coevolving Memetic Algorithms: A Review and Progress Report". IEEE Transactions on Systems Man and Cybernetics - Part B 37 (1): 6–17. doi:10.1109/TSMCB.2006.883273. [11] Krasnogor N. and Gustafson S. (2002). "Toward truly "memetic" memetic algorithms: discussion and proof of concepts". Advances in Nature-Inspired Computation: the PPSN VII Workshops. PEDAL (Parallel Emergent and Distributed Architectures Lab). University of Reading. [12] Hart W. E. (1994). Adaptive Global Optimization with Local Search. [13] Ku K. W. C. and Mak M. W. and Siu W. C. (2000). "A study of the Lamarckian evolution of recurrent neural networks". IEEE Transactions on Evolutionary Computation 4 (1): 31–42. doi:10.1109/4235.843493. [14] Land M. W. S. (1998). Evolutionary Algorithms with Local Search for Combinatorial Optimization. [15] Bambha N. K. and Bhattacharyya S. S. and Teich J. and Zitzler E. (2004). "Systematic integration of parameterized local search into evolutionary algorithms". IEEE Transactions on Evolutionary Computation 8 (2): 137–155. doi:10.1109/TEVC.2004.823471. [16] Schwefel H. P. (1995). Evolution and optimum seeking. Wiley New York. [17] Ichimura, T.; Kuriyama, Y. (1998). "Learning of neural networks with parallel hybrid GA using a royal road function". IEEE International Joint Conference on Neural Networks. 2. New York, NY. pp. 1131–1136. [18] Aguilar, J.; Colmenares, A. (1998). "Resolution of pattern recognition problems using a hybrid genetic/random neural network learning algorithm". Pattern Analysis and Applications 1 (1): 52–61. doi:10.1007/BF01238026. [19] Ridao, M.; Riquelme, J.; Camacho, E.; Toro, M. (1998). "An evolutionary and local search algorithm for planning two manipulators motion". Lecture Notes in Computer Science. Lecture Notes in Computer Science (Springer-Verlag) 1416: 105–114. doi:10.1007/3-540-64574-8_396. ISBN 3-540-64574-8. [20] Haas, O.; Burnham, K.; Mills, J. (1998). "Optimization of beam orientation in radiotherapy using planar geometry". Physics in Medicine and Biology 43 (8): 2179–2193. doi:10.1088/0031-9155/43/8/013. PMID 9725597. [21] Harris, S.; Ifeachor, E. (1998). "Automatic design of frequency sampling filters by hybrid genetic algorithm techniques". IEEE Transactions on Signal Processing 46 (12): 3304–3314. doi:10.1109/78.735305. [22] Augugliaro, A.; Dusonchet, L.; Riva-Sanseverino, E. (1998). "Service restoration in compensated distribution networks using a hybrid genetic algorithm". Electric Power Systems Research 46 (1): 59–66. doi:10.1016/S0378-7796(98)00025-X. [23] Wehrens, R.; Lucasius, C.; Buydens, L.; Kateman, G. (1993). "HIPS, A hybrid self-adapting expert system for nuclear magnetic resonance spectrum interpretation using genetic algorithms". 
Analytica Chimica ACTA 277 (2): 313–324. doi:10.1016/0003-2670(93)80444-P. [24] França, P.; Mendes, A.; Moscato, P. (1999). "Memetic algorithms to minimize tardiness on a single machine with sequence-dependent setup times". Proceedings of the 5th International Conference of the Decision Sciences Institute. Athens, Greece. pp. 1708–1710. [25] Costa, D. (1995). "An evolutionary tabu search algorithm and the NHL scheduling problem". Infor 33: 161–178. [26] Aickelin, U. (1998). "Nurse rostering with genetic algorithms". Proceedings of young operational research conference 1998. Guildford, UK. [27] Ozcan, E. (2007). "Memes, Self-generation and Nurse Rostering". Lecture Notes in Computer Science. Lecture Notes in Computer Science (Springer-Verlag) 3867: 85–104. doi:10.1007/978-3-540-77345-0_6. ISBN 978-3-540-77344-3. [28] Ozcan, E.; Onbasioglu, E. (2006). "Memetic Algorithms for Parallel Code Optimization". International Journal of Parallel Programming 35 (1): 33–61. doi:10.1007/s10766-006-0026-x. [29] Burke, E.; Smith, A. (1999). "A memetic algorithm to schedule planned maintenance for the national grid". Journal of Experimental Algorithmics 4 (4): 1–13. doi:10.1145/347792.347801. [30] Ozcan, E.; Basaran, C. (2009). "A Case Study of Memetic Algorithms for Constraint Optimization". Soft Computing: A Fusion of Foundations, Methodologies and Applications 13 (8–9): 871–882. doi:10.1007/s00500-008-0354-4. [31] Areibi, S., Yang, Z. (2004). "Effective memetic algorithms for VLSI design automation = genetic algorithms + local search + multi-level clustering". Evolutionary Computation (MIT Press) 12 (3): 327–353. doi:10.1162/1063656041774947. PMID 15355604. 176 Memetic algorithm 177 [32] Merz, P.; Zell, A. (2002). "Clustering Gene Expression Profiles with Memetic Algorithms". Parallel Problem Solving from Nature — PPSN VII. Springer. pp. 811–820. doi:10.1007/3-540-45712-7_78. [33] Zexuan Zhu, Y. S. Ong and M. Dash (2007). "Markov Blanket-Embedded Genetic Algorithm for Gene Selection". Pattern Recognition 49 (11): 3236–3248. [34] Zexuan Zhu, Y. S. Ong and M. Dash (2007). "Wrapper-Filter Feature Selection Algorithm Using A Memetic Framework". IEEE Transactions on Systems, Man and Cybernetics - Part B 37 (1): 70–76. doi:10.1109/TSMCB.2006.883267. [35] Zexuan Zhu, Y. S. Ong and M. Zurada (2008). "Simultaneous Identification of Full Class Relevant and Partial Class Relevant Genes". IEEE/ACM Transactions on Computational Biology and Bioinformatics. [36] http:/ / www. springer. com/ journal/ 12293 [37] http:/ / www. wcci2008. org/ [38] http:/ / users. jyu. fi/ ~neferran/ MA2008/ MA2008. htm [39] http:/ / www. ntu. edu. sg/ home/ asysong/ SC/ Special-Issue-MA. htm [40] http:/ / www. ntu. edu. sg/ home/ asysong/ ETTC/ ETTC%20Task%20Force%20-%20Memetic%20Computing. htm [41] http:/ / cec2007. nus. edu. sg/ [42] http:/ / ntu-cg. ntu. edu. sg/ ysong/ MA-SS/ MA. htm [43] http:/ / www. esi-topics. com/ erf/ 2007/ august07-Ong_Keane. html [44] http:/ / ieeexplore. ieee. org/ Xplore/ login. jsp?url=/ iel5/ 3477/ 4067063/ 04067075. pdf?tp=& isnumber=& arnumber=4067075 [45] http:/ / www. springeronline. com/ sgw/ cda/ frontpage/ 0,11855,5-40356-72-34233226-0,00. html [46] http:/ / www. mitpressjournals. org/ doi/ abs/ 10. 1162/ 1063656041775009?prevSearch=allfield%3A%28memetic+ algorithm%29 Meta-optimization In numerical optimization, meta-optimization is the use of one optimization method to tune another optimization method. 
Meta-optimization is reported to have been used as early as the late 1970s by Mercer and Sampson [1] for finding optimal parameter settings of a genetic algorithm. Meta-optimization is also known in the literature as meta-evolution, super-optimization, automated parameter calibration, hyper-heuristics, etc.
(Figure: Meta-optimization concept.)

Motivation
Optimization methods such as genetic algorithms and differential evolution have several parameters that govern their behaviour and efficacy in optimizing a given problem, and these parameters must be chosen by the practitioner to achieve satisfactory results. Selecting the behavioural parameters by hand is a laborious task that is susceptible to human misconceptions of what makes the optimizer perform well. The behavioural parameters of an optimizer can be varied and the optimization performance plotted as a landscape. This is computationally feasible for optimizers with few behavioural parameters and optimization problems that are fast to compute, but when the number of behavioural parameters increases the time usage for computing such a performance landscape increases exponentially. This is the curse of dimensionality for the search-space consisting of an optimizer's behavioural parameters. An efficient method is therefore needed to search the space of behavioural parameters.
(Figure: Performance landscape for differential evolution.)

Methods
A simple way of finding good behavioural parameters for an optimizer is to employ another overlaying optimizer, called the meta-optimizer. There are different ways of doing this, depending on whether the behavioural parameters to be tuned are real-valued or discrete-valued, on what performance measure is being used, etc. Meta-optimizing the parameters of a genetic algorithm was done by Grefenstette [2] and Keane,[3] amongst others, and experiments with meta-optimizing both the parameters and the genetic operators were reported by Bäck.[4] Meta-optimization of particle swarm optimization was done by Meissner et al.[5] as well as by Pedersen and Chipperfield,[6] who also meta-optimized differential evolution. Birattari et al.[7][8] meta-optimized ant colony optimization. Statistical models have also been used to reveal more about the relationship between choices of behavioural parameters and optimization performance; see for example Francois and Lavergne,[9] and Nannen and Eiben.[10] A comparison of various meta-optimization techniques was done by Smit and Eiben.[11]
(Figure: Meta-optimization of differential evolution.)
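As a rough illustration of the overlaying-optimizer idea, the sketch below uses simple random search as the meta-optimizer to tune two behavioural parameters (mutation step size and population size) of a toy inner evolutionary optimizer. The inner optimizer, the performance measure and the parameter ranges are all assumptions made for the example.

import random

def sphere(x):                                   # illustrative problem to be optimized
    return sum(v * v for v in x)

def inner_optimizer(sigma, pop_size, dim=5, generations=50):
    # a toy evolutionary optimizer whose behaviour depends on sigma and pop_size
    pop = [[random.uniform(-5, 5) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(generations):
        parents = sorted(pop, key=sphere)[:max(2, pop_size // 2)]
        pop = [[v + random.gauss(0.0, sigma) for v in random.choice(parents)]
               for _ in range(pop_size)]
    return min(sphere(ind) for ind in pop)

def meta_optimize(trials=50, repeats=5):
    # random-search meta-optimizer: sample parameter settings, keep the best-performing one
    best_params, best_score = None, float("inf")
    for _ in range(trials):
        sigma = random.uniform(0.001, 1.0)       # behavioural parameter 1
        pop_size = random.randint(4, 40)         # behavioural parameter 2
        # performance measure: average final fitness over several independent runs
        score = sum(inner_optimizer(sigma, pop_size) for _ in range(repeats)) / repeats
        if score < best_score:
            best_params, best_score = (sigma, pop_size), score
    return best_params, best_score

print(meta_optimize())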
"Simplifying particle swarm optimization" (http:/ / www. hvass-labs. org/ people/ magnus/ publications/ pedersen08simplifying. pdf). Applied Soft Computing 10 (2): 618–628. doi:10.1016/j.asoc.2009.08.029. . [7] Birattari, M.; Stützle, T.; Paquete, L.; Varrentrapp, K. (2002). "A racing algorithm for configuring metaheuristics". Proceedings of the Genetic and Evolutionary Computation Conference (GECCO). pp. 11–18. [8] Birattari, M. (2004). The Problem of Tuning Metaheuristics as Seen from a Machine Learning Perspective (http:/ / iridia. ulb. ac. be/ ~mbiro/ paperi/ BirattariPhD. pdf) (PhD thesis). Université Libre de Bruxelles. . [9] Francois, O.; Lavergne, C. (2001). "Design of evolutionary algorithms - a statistical perspective". IEEE Transactions on Evolutionary Computation 5 (2): 129–148. doi:10.1109/4235.918434. [10] Nannen, V.; Eiben, A.E. (2006). "A method for parameter calibration and relevance estimation in evolutionary algorithms". Proceedings of the 8th Annual Conference on Genetic and Evolutionary Computation (GECCO). pp. 183–190. [11] Smit, S.K.; Eiben, A.E. (2009). "Comparing parameter tuning methods for evolutionary algorithms". Proceedings of the IEEE Congress on Evolutionary Computation (CEC). pp. 399–406. 178 Cellular evolutionary algorithm Cellular evolutionary algorithm A Cellular Evolutionary Algorithm (cEA) is a kind of evolutionary algorithm (EA) in which individuals cannot mate arbitrarly, but every one interacts with its closer neighbors on which a basic EA is applied (selection, variation, replacement). The cellular model simulates Natural evolution from the point of view of the individual, which encodes a tentative (optimization, learning, search) problem solution. The essential idea of this model is to provide the EA population with a special structure defined as a connected graph, in which each vertex is an individual who communicates with his nearest neighbors. Particularly, individuals are conceptually set in a toroidal mesh, and are only allowed to recombine with close individuals. This leads us to Example evolution of a cEA depending on the shape of the population, from squared a kind of locality known as isolation by (left) to unidimensional ring (right). Darker colors mean better solutions. Observe distance. The set of potential mates of how shapes different from the traditional square keep diversity (higher exploration) for a longer time. Four snapshots of cEAs at generations 0-50-100-150. an individual is called its neighborhood. It is known that, in this kind of algorithm, similar individuals tend to cluster creating niches, and these groups operate as if they were separate sub-populations (islands). Anyway, there is no clear borderline between adjacent groups, and close niches could be easily colonized by competitive niches and maybe merge solution contents during the process. Simultaneously, farther niches can be affected more slowly. 179 Cellular evolutionary algorithm Introduction A Cellular Evolutionary Algorithm (cEA) usually evolves a structured bidimensional grid of individuals, although other topologies are also possible. In this grid, clusters of similar individuals are naturally created during evolution, promoting exploration in their boundaries, while exploitation is mainly performed by direct competition and merging inside them. The grid is usually 2D toroidal structure, although the number of dimensions can be easily extended (to 3D) or reduced (to 1D, e.g. a ring). 
The neighborhood of a particular point of the grid (where an individual is placed) is defined in terms of the Manhattan distance from it to others in the population. Each point of the grid has a neighborhood that overlaps the neighborhoods of nearby individuals. In the basic algorithm, all the neighborhoods have the same size and identical shape. The two most commonly used neighborhoods are L5, also called Von Neumann or NEWS (North, East, West and South), and C9, also known as the Moore neighborhood. Here, L stands for Linear while C stands for Compact.
(Figure: Example models of neighborhoods in cellular EAs: linear, compact, diamond and... any other!)

In cEAs, the individuals can only interact with their neighbors in the reproductive cycle where the variation operators are applied. This reproductive cycle is executed inside the neighborhood of each individual and, generally, consists of selecting two parents among its neighbors according to a certain criterion, applying the variation operators to them (recombination and mutation, for example), and replacing the considered individual by the newly created offspring following a given criterion, for instance, replacing it if the offspring represents a better solution than the considered individual.

Synchronous versus Asynchronous cEAs
In a regular synchronous cEA, the algorithm proceeds from the first (top-left) individual to the right and then down through the successive rows, using the information in the population to create a new temporary population. After finishing with the last (bottom-right) individual, the temporary population is full of the newly computed individuals, and the replacement step starts. In it, the old population is completely and synchronously replaced with the newly computed one according to some criterion. Usually, the replacement keeps the best individual in the same position in both populations, that is, elitism is used. Note that, according to the update policy of the population used, we could also define an asynchronous cEA. This is also a well-known issue in cellular automata. In asynchronous cEAs, the order in which the individuals in the grid are updated changes depending on the criterion used: line sweep, fixed random sweep, new random sweep, and uniform choice. These are the four most usual ways of updating the population. All of them immediately use the newly computed individual (or the original one, if better) for the computations of its neighbors. This makes the population hold, at any time, individuals in different states of evolution, defining a very interesting new line of research.

The overlap of the neighborhoods provides an implicit mechanism of solution migration to the cEA. Since the best solutions spread smoothly through the whole population, genetic diversity in the population is preserved longer than in non-structured EAs. This soft dispersion of the best solutions through the population is one of the main ingredients of the good tradeoff between exploration and exploitation that cEAs perform during the search.
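The following Python sketch shows one synchronous update sweep of a cEA on a toroidal grid with the L5 (Von Neumann) neighborhood. It assumes a binary one-max problem; the fitness function, the operators and the parameter values are illustrative choices only, not taken from the article.

import random

def fitness(ind):                          # illustrative: one-max, count the ones
    return sum(ind)

def neighbors(i, j, rows, cols):
    # L5 / Von Neumann neighborhood on a toroidal grid: the cell plus N, E, W, S
    return [(i, j), ((i - 1) % rows, j), ((i + 1) % rows, j),
            (i, (j - 1) % cols), (i, (j + 1) % cols)]

def synchronous_step(grid, mut_prob=0.02):
    rows, cols = len(grid), len(grid[0])
    new_grid = [[None] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            hood = [grid[r][c] for r, c in neighbors(i, j, rows, cols)]
            p1, p2 = sorted(random.sample(hood, 2), key=fitness, reverse=True)
            cut = random.randrange(1, len(p1))                    # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [b ^ 1 if random.random() < mut_prob else b for b in child]  # bit-flip mutation
            # replace-if-better policy for the considered individual
            new_grid[i][j] = child if fitness(child) >= fitness(grid[i][j]) else grid[i][j]
    return new_grid                        # the whole grid is replaced at once (synchronous update)

grid = [[[random.randint(0, 1) for _ in range(20)] for _ in range(10)] for _ in range(10)]
for _ in range(30):
    grid = synchronous_step(grid)
print(max(fitness(ind) for row in grid for ind in row))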
(Figure: The ratio of the neighborhood radius to the topology radius defines the exploration/exploitation capability of the cEA. This can even be tuned during the run of the algorithm, giving the researcher a unique mechanism to search in very complex landscapes.)
It is then easy to see that we could tune this tradeoff (and hence, tune the genetic diversity level along the evolution) by modifying, for instance, the size of the neighborhood used, as the overlap degree between the neighborhoods grows with the size of the neighborhood.

A cEA can be seen as a cellular automaton (CA) with probabilistic rewritable rules, where the alphabet of the CA is equivalent to the potential number of solutions of the problem. Hence, if we see cEAs as a kind of CA, it is possible to import knowledge from the field of CAs to cEAs, and in fact this is an interesting open research line.

Parallelism and cEAs
Cellular EAs are very amenable to parallelism, and are thus usually found in the literature on parallel metaheuristics. In particular, fine-grained parallelism can be used to assign an independent thread of execution to every individual, thus allowing the whole cEA to run on a concurrent or actually parallel hardware platform. In this way, large time reductions can be obtained when running cEAs on FPGAs or GPUs. However, it is important to stress that cEAs are a model of search, in many senses different from traditional EAs. Also, they can be run on sequential and parallel platforms, reinforcing the fact that the model and the implementation are two different concepts. See here [3] for a complete description of the fundamentals for the understanding, design, and application of cEAs.

References
• E. Alba, B. Dorronsoro, Cellular Genetic Algorithms, Springer-Verlag, ISBN 978-0-387-77609-5, 2008 (http://www.springer.com/business/operations+research/book/978-0-387-77609-5)
• A.J. Nebro, J.J. Durillo, F. Luna, B. Dorronsoro, E. Alba, MOCell: A New Cellular Genetic Algorithm for Multiobjective Optimization, International Journal of Intelligent Systems, 24:726-746, 2009
• E. Alba, B. Dorronsoro, F. Luna, A.J. Nebro, P. Bouvry, L. Hogie, A Cellular Multi-Objective Genetic Algorithm for Optimal Broadcasting Strategy in Metropolitan MANETs, Computer Communications, 30(4):685-697, 2007
• E. Alba, B. Dorronsoro, Computing Nine New Best-So-Far Solutions for Capacitated VRP with a Cellular GA, Information Processing Letters, Elsevier, 98(6):225-230, 30 June 2006
• M. Giacobini, M. Tomassini, A. Tettamanzi, E. Alba, The Selection Intensity in Cellular Evolutionary Algorithms for Regular Lattices, IEEE Transactions on Evolutionary Computation, IEEE Press, 9(5):489-505, 2005
• E. Alba, B. Dorronsoro, The Exploration/Exploitation Tradeoff in Dynamic Cellular Genetic Algorithms, IEEE Transactions on Evolutionary Computation, IEEE Press, 9(2):126-142, 2005

External links
• THE site on Cellular Evolutionary Algorithms (http://neo.lcc.uma.es/cEA-web/)
• NEO Research Group at University of Málaga, Spain (http://neo.lcc.uma.es)

Cellular automaton
A cellular automaton (pl. cellular automata, abbrev. CA) is a discrete model studied in computability theory, mathematics, physics, complexity science, theoretical biology and microstructure modeling. It consists of a regular grid of cells, each in one of a finite number of states, such as "On" and "Off" (in contrast to a coupled map lattice). The grid can be in any finite number of dimensions. For each cell, a set of cells called its neighborhood (usually including the cell itself) is defined relative to the specified cell. For example, the neighborhood of a cell might be defined as the set of cells a distance of 2 or less from the cell.[1] An initial state (time t=0) is selected by assigning a state for each cell.
(Figure: Gosper's Glider Gun creating "gliders" in the cellular automaton Conway's Game of Life.)
A new generation is created (advancing t by 1), according to some fixed rule (generally, a mathematical function) that determines the new state of each cell in terms of the current state of the cell and the states of the cells in its neighborhood. For example, the rule might be that the cell is "On" in the next generation if exactly two of the cells in the neighborhood are "On" in the current generation, otherwise the cell is "Off" in the next generation. Typically, the rule for updating the state of cells is the same for each cell and does not change over time, and is applied to the whole grid simultaneously, though exceptions are known. Cellular automata are also called "cellular spaces", "tessellation automata", "homogeneous structures", "cellular structures", "tessellation structures", and "iterative arrays".[2]

Overview
One way to simulate a two-dimensional cellular automaton is with an infinite sheet of graph paper along with a set of rules for the cells to follow. Each square is called a "cell" and each cell has two possible states, black and white. The "neighbors" of a cell are the 8 squares touching it. For such a cell and its neighbors, there are 512 (= 2^9) possible patterns. For each of the 512 possible patterns, the rule table would state whether the center cell will be black or white on the next time interval. Conway's Game of Life is a popular version of this model. It is usually assumed that every cell in the universe starts in the same state, except for a finite number of cells in other states, often called a configuration. More generally, it is sometimes assumed that the universe starts out covered with a periodic pattern, and only a finite number of cells violate that pattern. The latter assumption is common in one-dimensional cellular automata.

Cellular automata are often simulated on a finite grid rather than an infinite one. In two dimensions, the universe would be a rectangle instead of an infinite plane. The obvious problem with finite grids is how to handle the cells on the edges. How they are handled will affect the values of all the cells in the grid. One possible method is to allow the values in those cells to remain constant. Another method is to define neighbourhoods differently for these cells. One could say that they have fewer neighbours, but then one would also have to define new rules for the cells located on the edges. These cells are usually handled with a toroidal arrangement: when one goes off the top, one comes in at the corresponding position on the bottom, and when one goes off the left, one comes in on the right. (This essentially simulates an infinite periodic tiling, and in the field of partial differential equations is sometimes referred to as periodic boundary conditions.) This can be visualized as taping the left and right edges of the rectangle to form a tube, then taping the top and bottom edges of the tube to form a torus (doughnut shape).
(Figure: A torus, a toroidal shape.)
Universes of other dimensions are handled similarly. This is done in order to solve boundary problems with neighborhoods, but another advantage of this system is that it is easily programmable using modular arithmetic functions. For example, in a 1-dimensional cellular automaton like the examples below, the neighborhood of a cell x_i^t—where t is the time step (vertical) and i is the index (horizontal) in one generation—is {x_{i-1}^{t-1}, x_i^{t-1}, x_{i+1}^{t-1}}. There will obviously be problems when a neighbourhood on a left border references its upper left cell, which is not in the cellular space, as part of its neighborhood.
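As a small illustration of the modular-arithmetic trick mentioned above (an illustrative sketch, not part of the original text), the snippet below reads the three-cell neighborhood {x_{i-1}, x_i, x_{i+1}} of a one-dimensional generation with toroidal wrap-around, so the left border poses no problem.

def neighborhood(row, i):
    # return (x_{i-1}, x_i, x_{i+1}) with periodic (toroidal) boundary conditions
    n = len(row)
    return row[(i - 1) % n], row[i], row[(i + 1) % n]

row = [0, 1, 1, 0, 1]
print(neighborhood(row, 0))   # the left border wraps around to the last cell: (1, 0, 1)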
History
Stanisław Ulam, while working at the Los Alamos National Laboratory in the 1940s, studied the growth of crystals, using a simple lattice network as his model. At the same time, John von Neumann, Ulam's colleague at Los Alamos, was working on the problem of self-replicating systems. Von Neumann's initial design was founded upon the notion of one robot building another robot. This design is known as the kinematic model.[3][4] As he developed this design, von Neumann came to realize the great difficulty of building a self-replicating robot, and the great cost of providing the robot with a "sea of parts" from which to build its replicant. Ulam suggested that von Neumann develop his design around a mathematical abstraction, such as the one Ulam used to study crystal growth. Thus was born the first system of cellular automata. Like Ulam's lattice network, von Neumann's cellular automata are two-dimensional, with his self-replicator implemented algorithmically. The result was a universal copier and constructor working within a CA with a small neighborhood (only those cells that touch are neighbors; for von Neumann's cellular automata, only orthogonal cells), and with 29 states per cell. Von Neumann gave an existence proof that a particular pattern would make endless copies of itself within the given cellular universe. This design is known as the tessellation model, and is called a von Neumann universal constructor.
(Figure: John von Neumann's Los Alamos ID badge.)

Also in the 1940s, Norbert Wiener and Arturo Rosenblueth developed a cellular automaton model of excitable media.[5] Their specific motivation was the mathematical description of impulse conduction in cardiac systems. Their original work continues to be cited in modern research publications on cardiac arrhythmia and excitable systems.[6]

In the 1960s, cellular automata were studied as a particular type of dynamical system and the connection with the mathematical field of symbolic dynamics was established for the first time. In 1969, Gustav A. Hedlund compiled many results following this point of view[7] in what is still considered a seminal paper for the mathematical study of cellular automata. The most fundamental result is the characterization in the Curtis–Hedlund–Lyndon theorem of the set of global rules of cellular automata as the set of continuous endomorphisms of shift spaces.

In the 1970s a two-state, two-dimensional cellular automaton named Game of Life became very widely known, particularly among the early computing community. Invented by John Conway and popularized by Martin Gardner in a Scientific American article,[8] its rules are as follows: If a cell has 2 black neighbours, it stays the same. If it has 3 black neighbours, it becomes black. In all other situations it becomes white. Despite its simplicity, the system achieves an impressive diversity of behavior, fluctuating between apparent randomness and order. One of the most apparent features of the Game of Life is the frequent occurrence of gliders, arrangements of cells that essentially move themselves across the grid.
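The rules just quoted translate directly into code. The following is a minimal Python sketch of one Game of Life generation, with black = 1 and white = 0; the grid size, the toroidal boundary and the starting glider are assumptions made for the example.

def life_step(grid):
    # one synchronous update: 3 black neighbours -> black, 2 -> unchanged, otherwise white
    rows, cols = len(grid), len(grid[0])
    new = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            live = sum(grid[(i + di) % rows][(j + dj) % cols]
                       for di in (-1, 0, 1) for dj in (-1, 0, 1)
                       if (di, dj) != (0, 0))
            if live == 3:
                new[i][j] = 1
            elif live == 2:
                new[i][j] = grid[i][j]
            # in all other situations the cell is 0 (white) in the next generation
    return new

# a glider on a 6x6 toroidal grid
grid = [[0] * 6 for _ in range(6)]
for i, j in [(0, 1), (1, 2), (2, 0), (2, 1), (2, 2)]:
    grid[i][j] = 1
grid = life_step(grid)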
It is possible to arrange the automaton so that the gliders interact to perform computations, and after much effort it has been shown that the Game of Life can emulate a universal Turing machine.[9] Possibly because it was viewed as a largely recreational topic, little follow-up work was done outside of investigating the particularities of the Game of Life and a few related rules.

In 1969, however, German computer pioneer Konrad Zuse published his book Calculating Space, proposing that the physical laws of the universe are discrete by nature, and that the entire universe is the output of a deterministic computation on a giant cellular automaton. This was the first book on what today is called digital physics.

In 1983 Stephen Wolfram published the first of a series of papers systematically investigating a very basic but essentially unknown class of cellular automata, which he terms elementary cellular automata (see below). The unexpected complexity of the behavior of these simple rules led Wolfram to suspect that complexity in nature may be due to similar mechanisms. Additionally, during this period Wolfram formulated the concepts of intrinsic randomness and computational irreducibility, and suggested that rule 110 may be universal—a fact proved later by Wolfram's research assistant Matthew Cook in the 1990s. In 2002 Wolfram published a 1280-page text A New Kind of Science, which extensively argues that the discoveries about cellular automata are not isolated facts but are robust and have significance for all disciplines of science. Despite much confusion in the press and academia, the book did not argue for a fundamental theory of physics based on cellular automata, and although it did describe a few specific physical models based on cellular automata, it also provided models based on qualitatively different abstract systems.

Elementary cellular automata
The simplest nontrivial CA would be one-dimensional, with two possible states per cell, and a cell's neighbors defined to be the adjacent cells on either side of it. A cell and its two neighbors form a neighborhood of 3 cells, so there are 2^3 = 8 possible patterns for a neighborhood. A rule consists of deciding, for each pattern, whether the cell will be a 1 or a 0 in the next generation. There are then 2^8 = 256 possible rules. These 256 CAs are generally referred to by their Wolfram code, a standard naming convention invented by Stephen Wolfram which gives each rule a number from 0 to 255. A number of papers have analyzed and compared these 256 CAs. The rule 30 and rule 110 CAs are particularly interesting. The images below show the history of each when the starting configuration consists of a 1 (at the top of each image) surrounded by 0's. Each row of pixels represents a generation in the history of the automaton, with t=0 being the top row. Each pixel is colored white for 0 and black for 1.

Rule 30 cellular automaton
current pattern:            111 110 101 100 011 010 001 000
new state for center cell:   0   0   0   1   1   1   1   0

Rule 110 cellular automaton
current pattern:            111 110 101 100 011 010 001 000
new state for center cell:   0   1   1   0   1   1   1   0

Rule 30 exhibits class 3 behavior, meaning even simple input patterns such as that shown lead to chaotic, seemingly random histories. Rule 110, like the Game of Life, exhibits what Wolfram calls class 4 behavior, which is neither completely random nor completely repetitive. Localized structures appear and interact in various complicated-looking ways.
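To illustrate the Wolfram-code convention, here is a brief Python sketch that evolves an elementary cellular automaton such as rule 30 or rule 110 from a single 1; the width, number of steps and periodic boundary are assumptions of the example.

def elementary_ca(rule_number, width=31, steps=15):
    # bit k of rule_number gives the new cell state for neighborhood pattern k (000..111)
    rule = [(rule_number >> k) & 1 for k in range(8)]
    row = [0] * width
    row[width // 2] = 1                      # a single 1 surrounded by 0's
    history = [row]
    for _ in range(steps):
        row = [rule[(row[(i - 1) % width] << 2) | (row[i] << 1) | row[(i + 1) % width]]
               for i in range(width)]
        history.append(row)
    return history

for row in elementary_ca(30):                # try 110 for class 4 behavior
    print("".join("#" if c else "." for c in row))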
In the course of the development of A New Kind of Science, as a research assistant to Stephen Wolfram in 1994, Matthew Cook proved that some of these structures were rich enough to support universality. This result is interesting because rule 110 is an extremely simple one-dimensional system, and one which is difficult to engineer to perform specific behavior. This result therefore provides significant support for Wolfram's view that class 4 systems are inherently likely to be universal. Cook presented his proof at a Santa Fe Institute conference on Cellular Automata in 1998, but Cellular automaton Wolfram blocked the proof from being included in the conference proceedings, as Wolfram did not want the proof to be announced before the publication of A New Kind of Science. In 2004, Cook's proof was finally published in Wolfram's journal Complex Systems [10] (Vol. 15, No. 1), over ten years after Cook came up with it. Rule 110 has been the basis over which some of the smallest universal Turing machines have been built, inspired on the breakthrough concepts that the development of the proof of rule 110 universality produced. Reversible A cellular automaton is said to be reversible if for every current configuration of the cellular automaton there is exactly one past configuration (preimage). If one thinks of a cellular automaton as a function mapping configurations to configurations, reversibility implies that this function is bijective. If a cellular automaton is reversible, its time-reversed behavior can also be described as a cellular automaton; this fact is a consequence of the Curtis–Hedlund–Lyndon theorem, a topological characterization of cellular automata.[11][12] For cellular automata in which not every configuration has a preimage, the configurations without preimages are called Garden of Eden patterns. For one dimensional cellular automata there are known algorithms for deciding whether a rule is reversible or irreversible.[13][14] However, for cellular automata of two or more dimensions reversibility is undecidable; that is, there is no algorithm that takes as input an automaton rule and is guaranteed to determine correctly whether the automaton is reversible. The proof by Jarkko Kari is related to the tiling problem by Wang tiles.[15] Reversible CA are often used to simulate such physical phenomena as gas and fluid dynamics, since they obey the laws of thermodynamics. Such CA have rules specially constructed to be reversible. Such systems have been studied by Tommaso Toffoli, Norman Margolus and others. Several techniques can be used to explicitly construct reversible CA with known inverses. Two common ones are the second order cellular automaton and the block cellular automaton, both of which involve modifying the definition of a CA in some way. Although such automata do not strictly satisfy the definition given above, it can be shown that they can be emulated by conventional CAs with sufficiently large neighborhoods and numbers of states, and can therefore be considered a subset of conventional CA. Conversely, it has been shown that every reversible cellular automaton can be emulated by a block cellular automaton.[16] Totalistic A special class of CAs are totalistic CAs. 
The state of each cell in a totalistic CA is represented by a number (usually an integer value drawn from a finite set), and the value of a cell at time t depends only on the sum of the values of the cells in its neighborhood (possibly including the cell itself) at time t−1.[17][18] If the state of the cell at time t does depend on its own state at time t−1 then the CA is properly called outer totalistic.[18] Conway's Game of Life is an example of an outer totalistic CA with cell values 0 and 1; outer totalistic cellular automata with the same Moore neighborhood structure as Life are sometimes called life-like cellular automata.[19] 186 Cellular automaton 187 3D totalistic cellular automata Classification Stephen Wolfram, in A New Kind of Science and in several papers dating from the mid-1980s, defined four classes into which cellular automata and several other simple computational models can be divided depending on their behavior. While earlier studies in cellular automata tended to try to identify type of patterns for specific rules, Wolfram's classification was the first attempt to classify the rules themselves. In order of complexity the classes are: • Class 1: Nearly all initial patterns evolve quickly into a stable, homogeneous state. Any randomness in the initial pattern disappears. • Class 2: Nearly all initial patterns evolve quickly into stable or oscillating structures. Some of the randomness in the initial pattern may filter out, but some remains. Local changes to the initial pattern tend to remain local. • Class 3: Nearly all initial patterns evolve in a pseudo-random or chaotic manner. Any stable structures that appear are quickly destroyed by the surrounding noise. Local changes to the initial pattern tend to spread indefinitely. • Class 4: Nearly all initial patterns evolve into structures that interact in complex and interesting ways. Class 2 type stable or oscillating structures may be the eventual outcome, but the number of steps required to reach this state may be very large, even when the initial pattern is relatively simple. Local changes to the initial pattern may spread indefinitely. Wolfram has conjectured that many, if not all class 4 cellular automata are capable of universal computation. This has been proven for Rule 110 and Conway's game of Life. These definitions are qualitative in nature and there is some room for interpretation. According to Wolfram, "...with almost any general classification scheme there are inevitably cases which get assigned to one class by one definition and another class by another definition. And so it is with cellular automata: there are occasionally rules...that show some features of one class and some of another."[20] Wolfram's classification has been empirically matched to a clustering of the compressed lengths of the outputs of cellular automata.[21] There have been several attempts to classify CA in formally rigorous classes, inspired by the Wolfram's classification. For instance, Culik and Yu proposed three well-defined classes (and a fourth one for the automata not matching any of these), which are sometimes called Culik-Yu classes; membership in these proved to be undecidable.[22][23][24] Cellular automaton Evolving cellular automata using genetic algorithms Recently there has been a keen interest in building decentralized systems, be they sensor networks or more sophisticated micro level structures designed at the network level and aimed at decentralized information processing. 
The idea of emergent computation came from the need of using distributed systems to do information processing at the global level.[25] The area is still in its infancy, but some people have started taking the idea seriously. Melanie Mitchell who is Professor of Computer Science at Portland State University and also Santa Fe Institute External Professor[26] has been working on the idea of using self-evolving cellular arrays to study emergent computation and distributed information processing.[25] Mitchell and colleagues are using evolutionary computation to program cellular arrays.[27] Computation in decentralized systems is very different from classical systems, where the information is processed at some central location depending on the system’s state. In decentralized system, the information processing occurs in the form of global and local pattern dynamics. The inspiration for this approach comes from complex natural systems like insect colonies, nervous system and economic systems.[27] The focus of the research is to understand how computation occurs in an evolving decentralized system. In order to model some of the features of these systems and study how they give rise to emergent computation, Mitchell and collaborators at the SFI have applied Genetic Algorithms to evolve patterns in cellular automata. They have been able to show that the GA discovered rules that gave rise to sophisticated emergent computational strategies.[28] Mitchell’s group used a single dimensional binary array where each cell has six neighbors. The array can be thought of as a circle where the first and last cells are neighbors. The evolution of the array was tracked through the number of ones and zeros after each iteration. The results were plotted to show clearly how the network evolved and what sort of emergent computation was visible. The results produced by Mitchell’s group are interesting, in that a very simple array of cellular automata produced results showing coordination over global scale, fitting the idea of emergent computation. Future work in the area may include more sophisticated models using cellular automata of higher dimensions, which can be used to model complex natural systems. Cryptography use Rule 30 was originally suggested as a possible Block cipher for use in cryptography (See CA-1.1). Cellular automata have been proposed for public key cryptography. The one way function is the evolution of a finite CA whose inverse is believed to be hard to find. Given the rule, anyone can easily calculate future states, but it appears to be very difficult to calculate previous states. However, the designer of the rule can create it in such a way as to be able to easily invert it. Therefore, it is apparently a trapdoor function, and can be used as a public-key cryptosystem. The security of such systems is not currently known. 188 Cellular automaton 189 Related automata There are many possible generalizations of the CA concept. One way is by using something other than a rectangular (cubic, etc.) grid. For example, if a plane is tiled with regular hexagons, those hexagons could be used as cells. In many cases the resulting cellular automata are equivalent to those with rectangular grids with specially designed neighborhoods and rules. Also, rules can be probabilistic rather than deterministic. A probabilistic rule gives, for each pattern at time t, the probabilities that the central cell will transition to each possible state at time t+1. 
Sometimes a simpler rule is used; for example: "The rule is the Game of Life, but on each time step there is a 0.001% probability that each cell will transition to the opposite color."
(Figure: A cellular automaton based on hexagonal cells instead of squares (rule 34/2).)

The neighborhood or rules could change over time or space. For example, initially the new state of a cell could be determined by the horizontally adjacent cells, but for the next generation the vertical cells would be used.

The grid can be finite, so that patterns can "fall off" the edge of the universe.

In CA, the new state of a cell is not affected by the new state of other cells. This could be changed so that, for instance, a 2 by 2 block of cells can be determined by itself and the cells adjacent to itself.

There are continuous automata. These are like totalistic CA, but instead of the rule and states being discrete (e.g. a table, using states {0,1,2}), continuous functions are used, and the states become continuous (usually values in [0,1]). The state of a location is a finite number of real numbers. Certain CA can yield diffusion in liquid patterns in this way.

Continuous spatial automata have a continuum of locations. The state of a location is a finite number of real numbers. Time is also continuous, and the state evolves according to differential equations. One important example is reaction-diffusion textures, differential equations proposed by Alan Turing to explain how chemical reactions could create the stripes on zebras and spots on leopards.[29] When these are approximated by CA, such CAs often yield similar patterns. MacLennan [30] considers continuous spatial automata as a model of computation. There are known examples of continuous spatial automata which exhibit propagating phenomena analogous to gliders in the Game of Life.[31]

Biology
Some biological processes occur—or can be simulated—by cellular automata.
(Figure: Conus textile exhibits a cellular automaton pattern on its shell.)
Patterns of some seashells, like those of the genera Conus and Cymbiola, are generated by natural CA. The pigment cells reside in a narrow band along the shell's lip. Each cell secretes pigments according to the activating and inhibiting activity of its neighbour pigment cells, obeying a natural version of a mathematical rule. The cell band leaves the colored pattern on the shell as it grows slowly. For example, the widespread species Conus textile bears a pattern resembling Wolfram's rule 30 CA.

Plants regulate their intake and loss of gases via a CA mechanism. Each stoma on the leaf acts as a cell.[32] Moving wave patterns on the skin of cephalopods can be simulated with a two-state, two-dimensional cellular automaton, each state corresponding to either an expanded or retracted chromatophore.[33] Threshold automata have been invented to simulate neurons, and complex behaviors such as recognition and learning can be simulated. Fibroblasts bear similarities to cellular automata, as each fibroblast only interacts with its neighbors.[34]

Chemical types
The Belousov–Zhabotinsky reaction is a spatio-temporal chemical oscillator that can be simulated by means of a cellular automaton. In the 1950s A. M. Zhabotinsky (extending the work of B. P. Belousov) discovered that when a thin, homogeneous layer of a mixture of malonic acid, acidified bromate, and a ceric salt was mixed together and left undisturbed, fascinating geometric patterns such as concentric circles and spirals propagate across the medium.
In the "Computer Recreations" section of the August 1988 issue of Scientific American,[35] A. K. Dewdney discussed a cellular automaton[36] which was developed by Martin Gerhardt and Heike Schuster of the University of Bielefeld (West Germany). This automaton produces wave patterns resembling those in the Belousov-Zhabotinsky reaction. Computer processors CA processors are physical (not computer simulated) implementations of CA concepts, which can process information computationally. Processing elements are arranged in a regular grid of identical cells. The grid is usually a square tiling, or tessellation, of two or three dimensions; other tilings are possible, but not yet used. Cell states are determined only by interactions with adjacent neighbor cells. No means exists to communicate directly with cells farther away. One such CA processor array configuration is the systolic array. Cell interaction can be via electric charge, magnetism, vibration (phonons at quantum scales), or any other physically useful means. This can be done in several ways so no wires are needed between any elements. This is very unlike processors used in most computers today, von Neumann designs, which are divided into sections with elements that can communicate with distant elements over wires. Error correction coding CA have been applied to design error correction codes in the paper "Design of CAECC – Cellular Automata Based Error Correcting Code", by D. Roy Chowdhury, S. Basu, I. Sen Gupta, P. Pal Chaudhuri. The paper defines a new scheme of building SEC-DED codes using CA, and also reports a fast hardware decoder for the code. CA as models of the fundamental physical reality As Andrew Ilachinski points out in his Cellular Automata, many scholars have raised the question of whether the universe is a cellular automaton.[37] Ilachinsky argues that the importance of this question may be better appreciated with a simple observation, which can be stated as follows. Consider the evolution of rule 110: if it were some kind of "alien physics", what would be a reasonable description of the observed patterns?[38] If you didn't know how the images were generated, you might end up conjecturing about the movement of some particle-like objects (indeed, physicist James Crutchfield made a rigorous mathematical theory out of this idea proving the statistical emergence of "particles" from CA).[39] Then, as the argument goes, one might wonder if our world, which is currently well described by physics with particle-like objects, could be a CA at its most fundamental level. While a complete theory along this line is still to be developed, entertaining and developing this hypothesis led scholars to interesting speculation and fruitful intuitions on how can we make sense of our world within a discrete 190 Cellular automaton framework. Marvin Minsky, the AI pioneer, investigated how to understand particle interaction with a four-dimensional CA lattice;[40] Konrad Zuse—the inventor of the first working computer, the Z3—developed an irregularly organized lattice to address the question of the information content of particles.[41] More recently, Edward Fredkin exposed what he terms the "finite nature hypothesis", i.e., the idea that "ultimately every quantity of physics, including space and time, will turn out to be discrete and finite."[42] Fredkin and Stephen Wolfram are strong proponents of a CA-based physics. In recent years, other suggestions along these lines have emerged from literature in non-standard computation. 
Stephen Wolfram's A New Kind of Science considers CA to be the key to understanding a variety of subjects, physics included. The Mathematics Of the Models of Reference—created by iLabs[43] founder Gabriele Rossi and developed with Francesco Berto and Jacopo Tagliabue—features an original 2D/3D universe based on a new "rhombic dodecahedron-based" lattice and a unique rule. This model satisfies universality (it is equivalent to a Turing Machine) and perfect reversibility (a desideratum if one wants to conserve various quantities easily and never lose information), and it comes embedded in a first-order theory, allowing computable, qualitative statements on the universe evolution.[44] In popular culture • One-dimensional cellular automata were mentioned in the Season 2 episode of NUMB3RS "Better or Worse".[45] • The Hacker Emblem, a symbol for hacker culture proposed by Eric S. Raymond, depicts a glider from Conway's Game of Life.[46] • The Autoverse, an artificial life simulator in the novel Permutation City, is a cellular automaton.[47] • Cellular automata are discussed in the novel Bloom.[48] • Cellular automata are central to Robert J. Sawyer's trilogy WWW in an attempt to explain how Webmind spontaneously attained consciousness.[49] Reference notes [1] Daniel Dennett (1995), Darwin's Dangerous Idea, Penguin Books, London, ISBN 978-0-14-016734-4, ISBN 0-14-016734-X [2] Wolfram, Stephen (1983). "Statistical mechanics of cellular automata" (http:/ / www. stephenwolfram. com/ publications/ articles/ ca/ 83-statistical/ ). Reviews of Modern Physics 55 (3): 601–644. Bibcode 1983RvMP...55..601W. doi:10.1103/RevModPhys.55.601. [3] John von Neumann, “The general and logical theory of automata,” in L.A. Jeffress, ed., Cerebral Mechanisms in Behavior – The Hixon Symposium, John Wiley & Sons, New York, 1951, pp. 1-31. [4] John G. Kemeny, “Man viewed as a machine,” Sci. Amer. 192(April 1955):58-67; Sci. Amer. 192(June 1955):6 (errata). [5] Wiener, N.; Rosenblueth, A. (1946). "The mathematical formulation of the problem of conduction of impulses in a network of connected excitable elements, specifically in cardiac muscle". Arch. Inst. Cardiol. México 16: 205. [6] Davidenko, J. M.; Pertsov, A. V.; Salomonsz, R.; Baxter, W.; Jalife, J. (1992). "Stationary and drifting spiral waves of excitation in isolated cardiac muscle". Nature 355 (6358): 349–351. Bibcode 1992Natur.355..349D. doi:10.1038/355349a0. PMID 1731248. [7] Hedlund, G. A. (1969). "Endomorphisms and automorphisms of the shift dynamical system" (http:/ / www. springerlink. com/ content/ k62915l862l30377/ ). Math. Systems Theory 3 (4): 320–3751. doi:10.1007/BF01691062. . [8] Gardner, M. (1970). "MATHEMATICAL GAMES The fantastic combinations of John Conway's new solitaire game "life"" (http:/ / www. ibiblio. org/ lifepatterns/ october1970. html). Scientific American: 120–123. . [9] Paul Chapman. Life universal computer. http:/ / www. igblan. free-online. co. uk/ igblan/ ca/ November 2002 [10] http:/ / www. complex-systems. com [11] Richardson, D. (1972). "Tessellations with local transformations". J. Computer System Sci. 6 (5): 373–388. doi:10.1016/S0022-0000(72)80009-6. [12] Margenstern, Maurice (2007). Cellular Automata in Hyperbolic Spaces - Tome I, Volume 1 (http:/ / books. google. com/ books?id=wGjX1PpFqjAC& pg=PA134). Archives contemporaines. p. 134. ISBN 978-2-84703-033-4. . [13] Serafino Amoroso, Yale N. Patt, Decision Procedures for Surjectivity and Injectivity of Parallel Maps for Tessellation Structures. J. Comput. Syst. Sci. 
6(5): 448-464 (1972) [14] Sutner, Klaus (1991). "De Bruijn Graphs and Linear Cellular Automata" (http:/ / www. complex-systems. com/ pdf/ 05-1-3. pdf). Complex Systems 5: 19–30. . [15] Kari, Jarkko (1990). "Reversibility of 2D cellular automata is undecidable". Physica D 45: 379–385. Bibcode 1990PhyD...45..379K. doi:10.1016/0167-2789(90)90195-U. 191 Cellular automaton [16] Kari, Jarkko (1999). "On the circuit depth of structurally reversible cellular automata". Fundamenta Informaticae 38: 93–107; Durand-Lose, Jérôme (2001). "Representing reversible cellular automata with reversible block cellular automata" (http:/ / www. dmtcs. org/ dmtcs-ojs/ index. php/ proceedings/ article/ download/ 264/ 855). Discrete Mathematics and Theoretical Computer Science AA: 145–154. . [17] Stephen Wolfram, A New Kind of Science, p. 60 (http:/ / www. wolframscience. com/ nksonline/ page-0060-text). [18] Ilachinski, Andrew (2001). Cellular automata: a discrete universe (http:/ / books. google. com/ books?id=3Hx2lx_pEF8C& pg=PA4). World Scientific. pp. 44–45. ISBN 978-981-238-183-5. . [19] The phrase "life-like cellular automaton" dates back at least to Barral, Chaté & Manneville (1992), who used it in a broader sense to refer to outer totalistic automata, not necessarily of two dimensions. The more specific meaning given here was used e.g. in several chapters of Adamatzky (2010). See: Barral, Bernard; Chaté, Hugues; Manneville, Paul (1992). "Collective behaviors in a family of high-dimensional cellular automata". Physics Letters A 163 (4): 279–285. doi:10.1016/0375-9601(92)91013-H; Adamatzky, Andrew, ed. (2010). Game of Life Cellular Automata. Springer. ISBN 978-1-84996-216-2. [20] Stephen Wolfram, A New Kind of Science p231 ff. [21] Hector Zenil, Compression-based investigation of the dynamical properties of cellular automata and other systems journal of Complex Systems 19:1, 2010 [22] G. Cattaneo, E. Formenti, L. Margara (1998). "Topological chaos and CA" (http:/ / books. google. com/ books?id=dGs87s5Pft0C& pg=PA239). In M. Delorme, J. Mazoyer. Cellular automata: a parallel model. Springer. p. 239. ISBN 978-0-7923-5493-2. . [23] Burton H. Voorhees (1996). Computational analysis of one-dimensional cellular automata (http:/ / books. google. com/ books?id=WcZTQHPrG68C& pg=PA8). World Scientific. p. 8. ISBN 978-981-02-2221-5. . [24] Max Garzon (1995). Models of massive parallelism: analysis of cellular automata and neural networks. Springer. p. 149. ISBN 978-3-540-56149-1. [25] The Evolution of Emergent Computation, James P. Crutchfield and Melanie Mitchell (SFI Technical Report 94-03-012) [26] http:/ / www. santafe. edu/ research/ topics-information-processing-computation. php#4 [27] The Evolutionary Design of Collective Computation in Cellular Automata, James P. Crutchfeld, Melanie Mitchell, Rajarshi Das (In J. P. Crutch¯eld and P. K. Schuster (editors), Evolutionary Dynamics|Exploring the Interplay of Selection, Neutrality, Accident, and Function. New York: Oxford University Press, 2002.) [28] Evolving Cellular Automata with Genetic Algorithms: A Review of Recent Work, Melanie Mitchell, James P. Crutchfeld, Rajarshi Das (In Proceedings of the First International Conference on Evolutionary Computation and Its Applications (EvCA'96). Moscow, Russia: Russian Academy of Sciences, 1996.) [29] Murray, J.. Mathematical Biology II. Springer. [30] http:/ / www. cs. utk. edu/ ~mclennan/ contin-comp. 
html [31] Pivato, M: "RealLife: The continuum limit of Larger than Life cellular automata", Theoretical Computer Science, 372 (1), March 2007, pp.46-68 [32] Peak, West; Messinger, Mott (2004). "Evidence for complex, collective dynamics and emergent, distributed computation in plants" (http:/ / www. pnas. org/ cgi/ content/ abstract/ 101/ 4/ 918). Proceedings of the National Institute of Science of the USA 101 (4): 918–922. Bibcode 2004PNAS..101..918P. doi:10.1073/pnas.0307811100. PMC 327117. PMID 14732685. . [33] http:/ / gilly. stanford. edu/ past_research_files/ APackardneuralnet. pdf [34] Yves Bouligand (1986). Disordered Systems and Biological Organization. pp. 374–375. [35] A. K. Dewdney, The hodgepodge machine makes waves, Scientific American, p. 104, August 1988. [36] M. Gerhardt and H. Schuster, A cellular automaton describing the formation of spatially ordered structures in chemical systems, Physica D 36, 209-221, 1989. [37] A. Ilachinsky, Cellular Automata, World Scientific Publishing, 2001, pp. 660. [38] A. Ilachinsky, Cellular Automata, World Scientific Publishing, 2001, pp. 661-662. [39] J. P. Crutchfield, "The Calculi of Emergence: Computation, Dynamics, and Induction", Physica D 75, 11-54, 1994. [40] M. Minsky, "Cellular Vacuum", Int. Jour. of Theo. Phy. 21, 537-551, 1982. [41] K. Zuse, "The Computing Universe", Int. Jour. of Theo. Phy. 21, 589-600, 1982. [42] E. Fredkin, "Digital mechanics: an informational process based on reversible universal cellular automata", Physica D 45, 254-270, 1990 [43] iLabs (http:/ / www. ilabs. it/ ) [44] F. Berto, G. Rossi, J. Tagliabue, The Mathematics of the Models of Reference, College Publications, 2010 (http:/ / www. mmdr. it/ defaultEN. asp) [45] Weisstein, Eric W.. "Cellular Automaton" (http:/ / mathworld. wolfram. com/ CellularAutomaton. html). . Retrieved 13 March 2011. [46] the Hacker Emblem page on Eric S. Raymond's site (http:/ / www. catb. org/ hacker-emblem/ ) [47] Blackford, Russell; Ikin, Van; McMullen, Sean (1999). "Greg Egan". Strange constellations: a history of Australian science fiction. Contributions to the study of science fiction and fantasy. 80. Greenwood Publishing Group. pp. 190–200. ISBN 978-0-313-25112-2; Hayles, N. Katherine (2005). "Subjective cosmology and the regime of computation: intermediation in Greg Egan's fiction". My mother was a computer: digital subjects and literary texts. University of Chicago Press. pp. 214–240. ISBN 978-0-226-32147-9. [48] Kasman, Alex. "MathFiction: Bloom" (http:/ / kasmana. people. cofc. edu/ MATHFICT/ mfview. php?callnumber=mf615). . Retrieved 27 March 2011. [49] http:/ / www. sfwriter. com/ syw1. htm 192 Cellular automaton References • "History of Cellular Automata" (http://www.wolframscience.com/reference/notes/876b) from Stephen Wolfram's A New Kind of Science • Cellular Automata: A Discrete View of the World, Joel L. Schiff, Wiley & Sons, Inc., ISBN 0-470-16879-X (0-470-16879-X) • Chopard, B and Droz, M, 1998, Cellular Automata Modeling of Physical Systems, Cambridge University Press, ISBN 0-521-46168-5 • Cellular automaton FAQ (http://cafaq.com/) from the newsgroup comp.theory.cell-automata • A. D. Wissner-Gross. 2007. Pattern formation without favored local interactions (http://www.alexwg.org/ publications/JCellAuto_4-27.pdf), Journal of Cellular Automata 4, 27-36 (2008). • Neighbourhood survey (http://cell-auto.com/neighbourhood/index.html) includes discussion on triangular grids, and larger neighbourhood CAs. 
• von Neumann, John, 1966, The Theory of Self-reproducing Automata, A. Burks, ed., Univ. of Illinois Press, Urbana, IL. • Cosma Shalizi's Cellular Automata Notebook (http://cscs.umich.edu/~crshalizi/notebooks/cellular-automata. html) contains an extensive list of academic and professional reference material. • Wolfram's papers on CAs (http://www.stephenwolfram.com/publications/articles/ca/) • A.M. Turing. 1952. The Chemical Basis of Morphogenesis. Phil. Trans. Royal Society, vol. B237, pp. 37 – 72. (proposes reaction-diffusion, a type of continuous automaton). • Jim Giles. 2002. What kind of science is this? Nature 417, 216 – 218. (discusses the court order that suppressed publication of the rule 110 proof). • Evolving Cellular Automata with Genetic Algorithms: A Review of Recent Work, Melanie Mitchell, James P. Crutchfeld, Rajarshi Das (In Proceedings of the First International Conference on Evolutionary Computation and Its Applications (EvCA'96). Moscow, Russia: Russian Academy of Sciences, 1996.) • The Evolutionary Design of Collective Computation in Cellular Automata, James P. Crutchfeld, Melanie Mitchell, Rajarshi Das (In J. P. Crutch¯eld and P. K. Schuster (editors), Evolutionary Dynamics|Exploring the Interplay of Selection, Neutrality, Accident, and Function. New York: Oxford University Press, 2002.) • The Evolution of Emergent Computation, James P. Crutchfield and Melanie Mitchell (SFI Technical Report 94-03-012) • Ganguly, Sikdar, Deutsch and Chaudhuri "A Survey on Cellular Automata" (http://www.wepapers.com/ Papers/16352/files/swf/15001To20000/16352.swf) • A. Ilachinsky, Cellular Automata, World Scientific Publishing, 2001 (http://www.ilachinski.com/ca_book. htm) External links • Cellular Automata (http://plato.stanford.edu/entries/cellular-automata) entry by Francesco Berto & Jacopo Tagliabue in the Stanford Encyclopedia of Philosophy • Cellular Automata modelling of landlsides and avalanches (http://www.nhazca.it/?page_id=1331&lang=en) • Mirek's Cellebration (http://www.mirekw.com/ca/index.html) – Home to free MCell and MJCell cellular automata explorer software and rule libraries. The software supports a large number of 1D and 2D rules. The site provides both an extensive rules lexicon and many image galleries loaded with examples of rules. MCell is a Windows application, while MJCell is a Java applet. Source code is available. • Modern Cellular Automata (http://www.collidoscope.com/modernca/) – Easy to use interactive exhibits of live color 2D cellular automata, powered by Java applet. Included are exhibits of traditional, reversible, hexagonal, multiple step, fractal generating, and pattern generating rules. Thousands of rules are provided for viewing. Free software is available. 193 Cellular automaton • Self-replication loops in Cellular Space (http://necsi.edu/postdocs/sayama/sdsr/java/) – Java applet powered exhibits of self replication loops. • A collection of over 10 different cellular automata applets (http://vlab.infotech.monash.edu.au/simulations/ cellular-automata/) (in Monash University's Virtual Lab) • Golly (http://www.sourceforge.net/projects/golly) supports von Neumann, Nobili, GOL, and a great many other systems of cellular automata. Developed by Tomas Rokicki and Andrew Trevorrow. This is the only simulator currently available which can demonstrate von Neumann type self-replication. • Wolfram Atlas (http://atlas.wolfram.com/TOC/TOC_200.html) – An atlas of various types of one-dimensional cellular automata. 
• Conway Life (http://www.conwaylife.com/) • First replicating creature spawned in life simulator (http://www.newscientist.com/article/mg20627653. 800-first-replicating-creature-spawned-in-life-simulator.html) • The Mathematics of the Models of Reference (http://www.mmdr.it/provaEN.asp), featuring a general tutorial on CA, interactive applet, free code and resources on CA as model of fundamental physics Artificial immune system In computer science, Artificial immune systems (AIS) are a class of computationally intelligent systems inspired by the principles and processes of the vertebrate immune system. The algorithms typically exploit the immune system's characteristics of learning and memory to solve a problem. Definition The field of Artificial Immune Systems (AIS) is concerned with abstracting the structure and function of the immune system to computational systems, and investigating the application of these systems towards solving computational problems from mathematics, engineering, and information technology. AIS is a sub-field of Biologically-inspired computing, and Natural computation, with interests in Machine Learning and belonging to the broader field of Artificial Intelligence. Artificial Immune Systems (AIS) are adaptive systems, inspired by theoretical immunology and observed immune functions, principles and models, which are applied to problem solving.[1] AIS is distinct from computational immunology and theoretical biology that are concerned with simulating immunology using computational and mathematical models towards better understanding the immune system, although such models initiated the field of AIS and continue to provide a fertile ground for inspiration. Finally, the field of AIS is not concerned with the investigation of the immune system as a substrate computation, such as DNA computing. History AIS began in the mid 1980s with Farmer, Packard and Perelson's (1986) and Bersini and Varela's papers on immune networks (1990). However, it was only in the mid 90s that AIS became a subject area in its own right. Forrest et al. (on negative selection) and Kephart et al.[2] published their first papers on AIS in 1994, and Dasgupta conducted extensive studies on Negative Selection Algorithms. Hunt and Cooke started the works on Immune Network models in 1995; Timmis and Neal continued this work and made some improvements. De Castro & Von Zuben's and Nicosia & Cutello's work (on clonal selection) became notable in 2002. The first book on Artificial Immune Systems was edited by Dasgupta in 1999. New ideas, such as danger theory and algorithms inspired by the innate immune system, are also now being explored. Although some doubt that they are yet offering anything over and above existing AIS algorithms, this is 194 Artificial immune system hotly debated, and the debate is providing one the main driving forces for AIS development at the moment. Other recent developments involve the exploration of degeneracy in AIS models,[3][4] which is motivated by its hypothesized role in open ended learning and evolution.[5][6] Originally AIS set out to find efficient abstractions of processes found in the immune system but, more recently, it is becoming interested in modelling the biological processes and in applying immune algorithms to bioinformatics problems. In 2008, Dasgupta and Nino [7] published a textbook on Immunological Computation which presents a compendium of up-to-date work related to immunity-based techniques and describes a wide variety of applications. 
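The techniques catalogued in the next section differ in their immunological inspiration, but most share a common computational pattern: generate candidate detectors or antibodies, score them against data, and keep, clone or discard them accordingly. As a rough illustration only, the following Python sketch implements a toy negative-selection detector for anomaly detection; the data, radii and function names are invented for this example and are not taken from any of the works cited in this article.

import math
import random

def train_detectors(self_samples, n_detectors=50, self_radius=0.15, seed=1):
    """Generate random detectors, keeping only those that do not match 'self' data."""
    rng = random.Random(seed)
    detectors = []
    while len(detectors) < n_detectors:
        candidate = (rng.random(), rng.random())
        # Negative selection: censor candidates that lie too close to any self sample.
        if all(math.dist(candidate, s) > self_radius for s in self_samples):
            detectors.append(candidate)
    return detectors

def is_anomalous(point, detectors, match_radius=0.1):
    """Flag a point as non-self if any surviving detector matches (lies close to) it."""
    return any(math.dist(point, d) <= match_radius for d in detectors)

if __name__ == "__main__":
    rng = random.Random(0)
    # 'Self' (normal) behaviour: points clustered around (0.5, 0.5).
    self_samples = [(0.5 + rng.gauss(0, 0.05), 0.5 + rng.gauss(0, 0.05))
                    for _ in range(200)]
    detectors = train_detectors(self_samples)
    print(is_anomalous((0.50, 0.52), detectors))  # self-like point: False
    print(is_anomalous((0.90, 0.10), detectors))  # far from self: usually True

In practice the detector and matching rules are far richer than this distance test, but the censoring step (discard anything that reacts to "self") is the characteristic ingredient of negative-selection approaches.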
Techniques The common techniques are inspired by specific immunological theories that explain the function and behavior of the mammalian adaptive immune system. • Clonal Selection Algorithm: A class of algorithms inspired by the clonal selection theory of acquired immunity that explains how B and T lymphocytes improve their response to antigens over time called affinity maturation. These algorithms focus on the Darwinian attributes of the theory where selection is inspired by the affinity of antigen-antibody interactions, reproduction is inspired by cell division, and variation is inspired by somatic hypermutation. Clonal selection algorithms are most commonly applied to optimization and pattern recognition domains, some of which resemble parallel hill climbing and the genetic algorithm without the recombination operator.[8] • Negative Selection Algorithm: Inspired by the positive and negative selection processes that occur during the maturation of T cells in the thymus called T cell tolerance. Negative selection refers to the identification and deletion (apoptosis) of self-reacting cells, that is T cells that may select for and attack self tissues. This class of algorithms are typically used for classification and pattern recognition problem domains where the problem space is modeled in the complement of available knowledge. For example in the case of an anomaly detection domain the algorithm prepares a set of exemplar pattern detectors trained on normal (non-anomalous) patterns that model and detect unseen or anomalous patterns.[9] • Immune Network Algorithms: Algorithms inspired by the idiotypic network theory proposed by Niels Kaj Jerne that describes the regulation of the immune system by anti-idiotypic antibodies (antibodies that select for other antibodies). This class of algorithms focus on the network graph structures involved where antibodies (or antibody producing cells) represent the nodes and the training algorithm involves growing or pruning edges between the nodes based on affinity (similarity in the problems representation space). Immune network algorithms have been used in clustering, data visualization, control, and optimization domains, and share properties with artificial neural networks.[10] • Dendritic Cell Algorithms: The Dendritic Cell Algorithm (DCA) is an example of an immune inspired algorithm developed using a multi-scale approach. This algorithm is based on an abstract model of dendritic cells (DCs). The DCA is abstracted and implemented through a process of examining and modeling various aspects of DC function, from the molecular networks present within the cell to the behaviour exhibited by a population of cells as a whole. Within the DCA information is granulated at different layers, achieved through multi-scale processing.[11] 195 Artificial immune system Notes [1] de Castro, Leandro N.; Timmis, Jonathan (2002). Artificial Immune Systems: A New Computational Intelligence Approach. Springer. pp. 57–58. ISBN 1852335947, 9781852335946. [2] Kephart, J. O. (1994). "A biologically inspired immune system for computers". Proceedings of Artificial Life IV: The Fourth International Workshop on the Synthesis and Simulation of Living Systems. MIT Press. pp. 130–139. [3] Andrews and Timmis (2006). "A Computational Model of Degeneracy in a Lymph Node". Lecture Notes in Computer Science 4163: 164. [4] Mendao et al. (2007). "The Immune System in Pieces: Computational Lessons from Degeneracy in the Immune System". Foundations of Computational Intelligence (FOCI): 394–400. 
[5] Edelman and Gally (2001). "Degeneracy and complexity in biological systems". Proceedings of the National Academy of Sciences, USA 98 (24): 13763–13768. doi:10.1073/pnas.231499798. [6] Whitacre (2010). "Degeneracy: a link between evolvability, robustness and complexity in biological systems" (http:/ / www. tbiomed. com/ content/ 7/ 1/ 6). Theoretical Biology and Medical Modelling 7 (6). . Retrieved 2011-03-11. [7] Dasgupta, Dipankar; Nino, Fernando (2008). CRC Press. pp. 296. ISBN 978-1-4200-6545-9. [8] de Castro, L. N.; Von Zuben, F. J. (2002). "Learning and Optimization Using the Clonal Selection Principle" (ftp:/ / ftp. dca. fee. unicamp. br/ pub/ docs/ vonzuben/ lnunes/ ieee_tec01. pdf) (PDF). IEEE Transactions on Evolutionary Computation, Special Issue on Artificial Immune Systems (IEEE) 6 (3): 239–251. . [9] Forrest, S.; Perelson, A.S.; Allen, L.; Cherukuri, R. (1994). "Self-nonself discrimination in a computer" (http:/ / www. cs. unm. edu/ ~immsec/ publications/ virus. pdf) (PDF). Proceedings of the 1994 IEEE Symposium on Research in Security and Privacy. Los Alamitos, CA. pp. 202–212. . [10] Timmis, J.; Neal, M.; Hunt, J. (2000). "An artificial immune system for data analysis". BioSystems 55 (1): 143–150. doi:10.1016/S0303-2647(99)00092-1. PMID 10745118. [11] Greensmith, J.; Aickelin, U. (2009). "Artificial Dendritic Cells: Multi-faceted Perspectives" (http:/ / ima. ac. uk/ papers/ greensmith2009. pdf) (PDF). Human-Centric Information Processing Through Granular Modelling: 375–395. . References • J.D. Farmer, N. Packard and A. Perelson, (1986) "The immune system, adaptation and machine learning", Physica D, vol. 2, pp. 187–204 • H. Bersini, F.J. Varela, Hints for adaptive problem solving gleaned from immune networks. Parallel Problem Solving from Nature, First Workshop PPSW 1, Dortmund, FRG, October, 1990. • D. Dasgupta (Editor), Artificial Immune Systems and Their Applications, Springer-Verlag, Inc. Berlin, January 1999, ISBN 3-540-64390-7 • V. Cutello and G. Nicosia (2002) "An Immunological Approach to Combinatorial Optimization Problems" Lecture Notes in Computer Science, Springer vol. 2527, pp. 361–370. • L. N. de Castro and F. J. Von Zuben, (1999) "Artificial Immune Systems: Part I -Basic Theory and Applications", School of Computing and Electrical Engineering, State University of Campinas, Brazil, No. DCA-RT 01/99. • S. Garrett (2005) "How Do We Evaluate Artificial Immune Systems?" Evolutionary Computation, vol. 13, no. 2, pp. 145–178. http://mitpress.mit.edu/journals/pdf/EVCO_13_2_145_0.pdf • V. Cutello, G. Nicosia, M. Pavone, J. Timmis (2007) An Immune Algorithm for Protein Structure Prediction on Lattice Models, IEEE Transactions on Evolutionary Computation, vol. 11, no. 1, pp. 101–117. http://www.dmi. 
unict.it/nicosia/papers/journals/Nicosia-IEEE-TEVC07.pdf 196 Artificial immune system People • • • • • • • • • Uwe Aickelin (http://aickelin.com) Leandro de Castro (http://www.dca.fee.unicamp.br/~lnunes/) Fernando José Von Zuben (http://www.dca.fee.unicamp.br/~vonzuben/) Dipankar Dasgupta (http://www.msci.memphis.edu/~dasgupta/) Jon Timmis (http://www-users.cs.york.ac.uk/jtimmis/) Giuseppe Nicosia (http://www.dmi.unict.it/nicosia/) Stephanie Forrest (http://www.cs.unm.edu/~forrest/) Pablo Dalbem de Castro (http://www.dca.fee.unicamp.br/~pablo) Julie Greensmith (http://www.cs.nott.ac.uk/~jqg/) External links • AISWeb: The Online Home of Artificial Immune Systems (http://www.artificial-immune-systems.org) Information about AIS in general and links to a variety of resources including ICARIS conference series, code, teaching material and algorithm descriptions. • ARTIST: Network for Artificial Immune Systems (http://www.elec.york.ac.uk/ARTIST) Provides information about the UK AIS network, ARTIST. It provides technical and financial support for AIS in the UK and beyond, and aims to promote AIS projects. • Computer Immune Systems (http://www.cs.unm.edu/~immsec/) Group at the University of New Mexico led by Stephanie Forrest. • AIS: Artificial Immune Systems (http://ais.cs.memphis.edu/) Group at the University of Memphis led by Dipankar Dasgupta. • IBM Antivirus Research (http://www.research.ibm.com/antivirus/) Early work in AIS for computer security. • The ISYS Project (http://www.aber.ac.uk/~dcswww/ISYS) A now out of date project at the University of Wales, Aberystwyth interested in data analysis with AIS. • AIS on Facebook (http://www.facebook.com/group.php?gid=12481710452) Group for people interested in the scientific field of artificial immune systems. • The Center for Modeling Immunity to Enteric Pathogens (MIEP) (http://www.modelingimmunity.org) 197 Evolutionary multi-modal optimization Evolutionary multi-modal optimization In applied mathematics, multimodal optimization deals with Optimization (mathematics) tasks that involve finding all or most of the multiple solutions (as opposed to a single best solution). Motivation Knowledge of multiple solutions to an optimization task is especially helpful in engineering, when due to physical (and/or cost) constraints, the best results may not always be realizable. In such a scenario, if multiple solutions (local and global) are known, the implementation can be quickly switched to another solution and still obtain a optimal system performance. Multiple solutions could also be analyzed to discover hidden properties (or relationships), which makes them high-performing. In addition, the algorithms for multimodal optimization usually not only locate multiple optima in a single run, but also preserve their population diversity, resulting in their global optimization ability on multimodal functions. Moreover, the techniques for multimodal optimization are usually borrowed as diversity maintenance techniques to other problems [1]. Background Classical techniques of optimization would need multiple restart points and multiple runs in the hope that a different solution may be discovered every run, with no guarantee however. Evolutionary algorithms (EAs) due to their population based approach, provide a natural advantage over classical optimization techniques. 
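As a concrete, hedged illustration of this population-based advantage, the sketch below applies a very small genetic algorithm with fitness sharing (the sharing-function idea of Goldberg and Richardson, mentioned later in this article) to a one-dimensional function with two peaks. The test function, parameter values and function names are chosen purely for this example and are not drawn from the cited works.

import random

def f(x):
    # Bimodal test function on [0, 1] with peaks near x = 0.25 and x = 0.75.
    return max(0.0, 1.0 - 40.0 * min((x - 0.25) ** 2, (x - 0.75) ** 2))

def shared_fitness(pop, i, sigma=0.1):
    # Raw fitness divided by the niche count, so crowded peaks are penalised.
    niche = sum(1.0 - abs(pop[i] - x) / sigma
                for x in pop if abs(pop[i] - x) < sigma)
    return f(pop[i]) / niche  # niche >= 1 because the individual counts itself

def evolve(pop_size=60, generations=80, seed=2):
    rng = random.Random(seed)
    pop = [rng.random() for _ in range(pop_size)]
    for _ in range(generations):
        weights = [shared_fitness(pop, i) for i in range(len(pop))]
        parents = rng.choices(pop, weights=weights, k=pop_size)
        # Gaussian mutation, clipped to the search interval [0, 1].
        pop = [min(1.0, max(0.0, p + rng.gauss(0, 0.02))) for p in parents]
    return pop

if __name__ == "__main__":
    final = evolve()
    near_first = sum(1 for x in final if abs(x - 0.25) < 0.05)
    near_second = sum(1 for x in final if abs(x - 0.75) < 0.05)
    # With sharing, both peaks usually retain a sub-population; without the
    # niche penalty the whole population tends to collapse onto one optimum.
    print(near_first, near_second)

The sharing term is one of several niching mechanisms discussed below; clearing, restricted mating and multiple subpopulations pursue the same goal of keeping several optima represented at once.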
Evolutionary algorithms maintain a population of possible solutions, which are processed every generation; if the multiple solutions can be preserved over all these generations, then at termination of the algorithm we will have multiple good solutions rather than only the best one. Note that this works against the natural tendency of EAs, which will always converge to the best solution, or to a sub-optimal solution in a rugged, "badly behaving" function. Finding and maintaining multiple solutions is therefore the central challenge of using EAs for multi-modal optimization. Niching [2] is a generic term for techniques that find and preserve multiple stable niches, or favorable parts of the solution space possibly around multiple solutions, so as to prevent convergence to a single solution. The field of EAs today encompasses Genetic Algorithms (GAs), Differential Evolution (DE), Particle Swarm Optimization (PSO) and Evolution Strategies (ES), among others. Attempts have been made to solve multi-modal optimization in all these realms, and most, if not all, of the various methods implement niching in some form or other. Multimodal optimization using GAs Petrowski's clearing method, Goldberg's sharing function approach, restricted mating, and maintaining multiple subpopulations are some of the popular approaches that have been proposed by the GA community [3]. The first two methods are especially well studied and respected in the GA community. Recently, an Evolutionary Multiobjective Optimization (EMO) approach was proposed [4], in which a suitable second objective is added to the originally single-objective multimodal optimization problem, so that the multiple solutions form a weak Pareto-optimal front; the multimodal optimization problem can then be solved for its multiple solutions using an EMO algorithm. Improving upon their work [5], the same authors have made their algorithm self-adaptive, thus eliminating the need to pre-specify the parameters. An approach that does not use any radius for separating the population into subpopulations (or species) but employs the space topology instead is proposed in [6]. Multimodal optimization using DE The niching methods used in GAs have also been explored with success in the DE community. DE-based local selection and global selection approaches have also been attempted for solving multi-modal problems. DEs coupled with local search algorithms (memetic DE) have been explored as an approach to solve multi-modal problems. (Figure: finding multiple optima using genetic algorithms in a multi-modal optimization task; the algorithm demonstrated is the one proposed by Deb and Saha in the multi-objective approach to multimodal optimization.) For a comprehensive treatment of multimodal optimization methods in DE, refer to the Ph.D. thesis of Ronkkonen, J. (2009), Continuous Multimodal Global Optimization with Differential Evolution Based Methods.[7] References [1] Wong, K.C. et al. (2012), Evolutionary multimodal optimization using the principle of locality (http://dx.doi.org/10.1016/j.ins.2011.12.016), Information Sciences [2] Mahfoud, S.W. (1995), "Niching methods for genetic algorithms" [3] Deb, K. (2001), "Multi-objective optimization using evolutionary algorithms", Wiley [4] Deb, K., Saha, A. (2010) "Finding Multiple Solutions for Multimodal Optimization Problems Using a Multi-Objective Evolutionary Approach" (GECCO 2010, in press) [5] Saha, A., Deb, K.
(2010) "A Bi-criterion Approach to Multimodal Optimization: Self-adaptive Approach " (Lecture Notes in Computer Science, 2010, Volume 6457/2010, 95-104) [6] C. Stoean, M. Preuss, R. Stoean, D. Dumitrescu (2010) Multimodal Optimization by means of a Topological Species Conservation Algorithm (http:/ / ieeexplore. ieee. org/ xpl/ freeabs_all. jsp?arnumber=5491155). In IEEE Transactions on Evolutionary Computation, Vol. 14, Issue 6, pages 842-864, 2010. [7] Ronkkonen,J., (2009). Continuous Multimodal Global Optimization with Diferential Evolution Based Methods (https:/ / oa. doria. fi/ bitstream/ handle/ 10024/ 50498/ isbn 9789522148520. pdf) Bibliography • D. Goldberg and J. Richardson. (1987) "Genetic algorithms with sharing for multimodal function optimization". In Proceedings of the Second International Conference on Genetic Algorithms on Genetic algorithms and their application table of contents, pages 41–49. L.Erlbaum Associates Inc. Hillsdale, NJ, USA, 1987. • A. Petrowski. (1996) "A clearing procedure as a niching method for genetic algorithms". In Proceedings of the 1996 IEEE International Conference on Evolutionary Computation, pages 798–803. Citeseer, 1996. • Deb,K., (2001) "Multi-objective Optimization using Evolutionary Algorithms", Wiley ( Google Books) (http:// books.google.com/books?id=OSTn4GSy2uQC&printsec=frontcover&dq=multi+objective+optimization& source=bl&ots=tCmpqyNlj0&sig=r00IYlDnjaRVU94DvotX-I5mVCI&hl=en& ei=fHnNS4K5IMuLkAWJ8OgS&sa=X&oi=book_result&ct=result&resnum=8& ved=0CD0Q6AEwBw#v=onepage&q&f=false) • F. Streichert, G. Stein, H. Ulmer, and A. Zell. (2004) "A clustering based niching EA for multimodal search spaces". Lecture notes in computer science, pages 293–304, 2004. 199 Evolutionary multi-modal optimization • Singh, G., Deb, K., (2006) "Comparison of multi-modal optimization algorithms based on evolutionary algorithms". In Proceedings of the 8th annual conference on Genetic and evolutionary computation, pages 8–12. ACM, 2006. • Ronkkonen,J., (2009). Continuous Multimodal Global Optimization with Diferential Evolution Based Methods (https://oa.doria.fi/bitstream/handle/10024/50498/isbn 9789522148520.pdf) • Wong,K.C., (2009). An evolutionary algorithm with species-specific explosion for multimodal optimization. GECCO 2009: 923-930 (http://portal.acm.org/citation.cfm?id=1570027) • J. Barrera and C. A. C. Coello. "A Review of Particle Swarm Optimization Methods used for Multimodal Optimization", pages 9–37. Springer, Berlin, November 2009. • Wong,K.C., (2010). Effect of Spatial Locality on an Evolutionary Algorithm for Multimodal Optimization. EvoApplications (1) 2010: 481-490 (http://www.springerlink.com/content/jn23t10366778017/) • Deb,K., Saha,A. (2010) Finding Multiple Solutions for Multimodal Optimization Problems Using a Multi-Objective Evolutionary Approach. GECCO 2010: 447-454 (http://portal.acm.org/citation. cfm?id=1830483.1830568) • Wong,K.C., (2010). Protein structure prediction on a lattice model via multimodal optimization techniques. GECCO 2010: 155-162 (http://portal.acm.org/citation.cfm?id=1830483.1830513) • Saha, A., Deb,K. (2010), A Bi-criterion Approach to Multimodal Optimization: Self-adaptive Approach. SEAL 2010: 95-104 (http://www.springerlink.com/content/8676217j87173p60/) • C. Stoean, M. Preuss, R. Stoean, D. Dumitrescu (2010) Multimodal Optimization by means of a Topological Species Conservation Algorithm (http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=5491155). In IEEE Transactions on Evolutionary Computation, Vol. 
14, Issue 6, pages 842-864, 2010. External links • Multi-modal optimization using Particle Swarm Optimization (PSO) (http://tracer.uc3m.es/tws/pso/multimodal.html) • Niching in Evolution Strategy (ES) (http://www.princeton.edu/~oshir/NichingES/index.htm) Evolutionary music Evolutionary music is the audio counterpart to Evolutionary art, whereby algorithmic music is created using an evolutionary algorithm. The process begins with a population of individuals which, by some means or other, produce audio (e.g. a piece, melody, or loop); the population is either initialized randomly or seeded with human-generated music. Then, through the repeated application of computational steps analogous to biological selection, recombination and mutation, the aim is for the produced audio to become more musical. Evolutionary sound synthesis is a related technique for generating sounds or synthesizer instruments. Evolutionary music is typically generated using an interactive evolutionary algorithm in which the fitness function is the user or audience, as it is difficult to capture the aesthetic qualities of music computationally. However, research into automated measures of musical quality is also active. Evolutionary computation techniques have also been applied to harmonization and accompaniment tasks. The most commonly used evolutionary computation techniques are genetic algorithms and genetic programming. History NEUROGEN (Gibson & Byrne, 1991 [1]) employed a genetic algorithm to produce and combine musical fragments and a neural network (trained on examples of "real" music) to evaluate their fitness. A genetic algorithm is also a key part of the improvisation and accompaniment system GenJam [10], which has been developed since 1993 by Al Biles. Al and GenJam are together known as the Al Biles Virtual Quintet and have performed many times for human audiences. Since 1996 Rodney Waschka II has been using genetic algorithms for music composition, including works such as Saint Ambrose [2] and his string quartets.[3] In 1997 Brad Johanson and Riccardo Poli developed the GP-Music System [4] which, as the name implies, used genetic programming to breed melodies according to both human and automated ratings. Several systems for drum loop evolution have been produced (including one commercial program called MuSing [5]). Conferences The EvoMUSART Conference[6] (a workshop until it became a full conference in 2012) has been part of the annual Evo*[7] event since 2003. This event on evolutionary music and art is one of the main outlets for work on evolutionary music. An annual Workshop in Evolutionary Music[8] has been held at GECCO (the Genetic and Evolutionary Computation Conference[9]) since 2011. Recent work The EuroGP Song Contest [10] (a pun on the Eurovision Song Contest) was held at EuroGP 2004 [11]. In this experiment several tens of users were first tested for their ability to recognise musical differences, and then a short piano-based melody was evolved. Al Biles gave a tutorial on evolutionary music [12] at GECCO 2005 and co-edited a book [13] on the subject with contributions from many researchers in the field. Evolutune [14] is a small Windows application from 2005 for evolving simple loops of "beeps and boops". It has a graphical interface where the user can select parents manually. The GeneticDrummer [15] is a genetic-algorithm-based system for generating human-competitive rhythm accompaniment. The easy Song Builder [16] is an evolutionary composition program.
The user decides which version of the song will be the germ for the next generation. Books • Evolutionary Computer Music. Miranda, Eduardo Reck; Biles, John Al (Eds.) London: Springer, 2007.[17] • The Art of Artificial Evolution: A Handbook on Evolutionary Art and Music, Juan Romero and Penousal Machado (eds.), 2007, Springer[18] • Creative Evolutionary Systems by David W. Corne, Peter J. Bentley[19] References [1] http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=140338 [2] Capstone Records: Rodney Waschka II - Saint Ambrose (http://www.capstonerecords.org/CPS-8708.html) [3] SpringerLink - Book Chapter (http://www.springerlink.com/content/j1up38mn7205g552/?p=e54526113482447681a3114bed6f5eef&pi=5) [4] http://graphics.stanford.edu/~bjohanso/gp-music/ [5] http://www.geneffects.com/musing/ [6] "EvoMUSART" (http://evostar.dei.uc.pt/2012/call-for-contributions/evomusart/). [7] "Evo* (EvoStar)" (http://www.evostar.org/). [8] "GECCO workshops" (http://www.sigevo.org/gecco-2012/workshops.html). [9] "GECCO 2012" (http://www.sigevo.org/gecco-2012/). [10] http://evonet.lri.fr/eurogp2004/songcontest.html [11] http://evonet.lri.fr/eurogp2004/index.html [12] http://www.it.rit.edu/~jab/EvoMusic/BilesEvoMusicSlides.pdf [13] http://www.springer.com/uk/home/generic/search/results?SGWID=3-40109-22-173674005-0 [14] http://askory.phratry.net/projects/evolutune/ [15] http://phoenix.inf.upol.cz/~dostal/evm.html [16] http://www.compose-music.com [17] Evolutionary Computer Music - Multimedia Information Systems Journals, Books & Online Media | Springer (http://www.springer.com/computer/information+systems/book/978-1-84628-599-8?detailsPage=toc) [18] The Art of Artificial Evolution: A Handbook on Evolutionary Art and Music (http://art-artificial-evolution.dei.uc.pt/) [19] Creative evolutionary systems (http://books.google.co.uk/books/about/Creative_evolutionary_systems.html?id=kJTUG2dbbMkC). Morgan Kaufmann. 2002. pp. 576. Links • Al Biles' Evolutionary Music Bibliography (http://www.it.rit.edu/~jab/EvoMusic/EvoMusBib.html) - also includes pointers to work on evolutionary sound synthesis. • Evolectronica (http://evolectronica.com), interactive evolving streaming electronic music Coevolution In biology, coevolution is "the change of a biological object triggered by the change of a related object."[1] Coevolution can occur at many biological levels: it can be as microscopic as correlated mutations between amino acids in a protein, or as macroscopic as covarying traits between different species in an environment. Each party in a coevolutionary relationship exerts selective pressures on the other, thereby affecting each other's evolution. Coevolution of different species includes the evolution of a host species and its parasites (host–parasite coevolution), and examples of mutualism evolving through time. (Figure: bumblebees and the flowers they pollinate have coevolved so that both have become dependent on each other for survival.) Evolution in response to abiotic factors, such as climate change, is not coevolution (since climate is not alive and does not undergo biological evolution).
Coevolution between pairs of entities exists, such as that between predator and prey, host and symbiont or host and parasite, but many cases are less clearcut: a species may evolve in response to a number of other species, each of which is also evolving in response to a set of species. This situation has been referred to as "diffuse coevolution." There is little evidence of coevolution driving large-scale changes in Earth's history, since abiotic factors such as mass extinction and expansion into ecospace seem to guide the shifts in the abundance of major groups.[2] However, there is evidence for coevolution at the level of populations and species. For example, the concept of coevolution was briefly described by Charles Darwin in On the Origin of Species, and developed in detail in Fertilisation of Orchids.[3][4][5] It is likely that viruses and their hosts may have coevolved in various scenarios.[6] Coevolution is primarily a biological concept, but has been applied by analogy to fields such as computer science and astronomy. Models One model of coevolution was Leigh Van Valen's Red Queen's Hypothesis, which states that "for an evolutionary system, continuing development is needed just in order to maintain its fitness relative to the systems it is co-evolving with".[7] Emphasizing the importance of sexual conflict, Thierry Lodé described the role of antagonist interactions in evolution, giving rise to a concept of antagonist coevolution.[8] Coevolution branching strategies for asexual population dynamics in limited resource environments have been modeled using the generalized Lotka–Volterra equations.[9] 203 Coevolution Specific examples Hummingbirds and ornithophilous flowers Hummingbirds and ornithophilous (bird-pollinated) flowers have evolved a mutualistic relationship. The flowers have nectar suited to the birds' diet, their color suits the birds' vision and their shape fits that of the birds' bills. The blooming times of the flowers have also been found to coincide with hummingbirds' breeding seasons. Flowers have converged to take advantage of similar birds.[10] Flowers compete for pollinators, and adaptations reduce unfavourable effects of this competition.[10] Bird-pollinated flowers usually have higher volumes of nectar and higher sugar production than those pollinated by insects.[11] This meets the birds' high energy requirements, which are the most important determinants of their flower choice.[11] Following their respective breeding seasons, several species of hummingbirds occur at the same locations in North America, and several hummingbird flowers bloom simultaneously in these habitats. These flowers seem to have converged to a common morphology and color.[11] Different lengths and curvatures of the corolla tubes can affect the efficiency of extraction in hummingbird species in relation to differences in bill morphology.[11] Tubular flowers force a bird to orient its bill in a particular way when probing the flower, especially when the bill and corolla are both curved; this also allows the plant to place pollen on a certain part of the bird's body.[11] This opens the door for a variety of morphological co-adaptations. An important requirement for attraction is conspicuousness to birds, which reflects the properties of avian vision and habitat features.[11] Birds have their greatest spectral sensitivity and finest hue discrimination at the red end of the visual spectrum,[11] so red is particularly conspicuous to them. 
Hummingbirds may also be able to see ultraviolet "colors".[11] The prevalence of ultraviolet patterns and nectar guides in nectar-poor entomophilous (insect-pollinated) flowers warns the bird to avoid these flowers.[11] Hummingbirds form the family Trochilidae, whose two subfamilies are the Phaethornithinae (hermits) and the Trochilinae. Each subfamily has evolved in conjunction with a particular set of flowers. Most Phaethornithinae species are associated with large monocotyledonous herbs, while the Trochilinae prefer dicotyledonous plant species.[11] Angraecoid orchids and African moths Angraecoid orchids and African moths coevolve because the moths are dependent on the flowers for nectar and the flowers are dependent on the moths to spread pollen so they can reproduce. Coevolution has led to deep flowers and moths with long proboscises. Old world swallowtail and fringed rue An example of antagonistic coevolution is the old world swallowtail (Papilio machaon) caterpillar living on the fringed rue (Ruta chalepensis) plant. The rue produces etheric oils which repel plant-eating insects. The old world swallowtail caterpillar developed resistance to these poisonous substances, thus reducing competition with other plant-eating insects. (Figure: the old world swallowtail caterpillar on fringed rue.) Garter snake and rough-skinned newt Coevolution of predator and prey species is illustrated by the rough-skinned newt (Taricha granulosa) and the common garter snake (Thamnophis sirtalis). The newts produce a potent neurotoxin that concentrates in their skin. Garter snakes have evolved resistance to this toxin through a series of genetic mutations, and prey upon the newts. The relationship between these animals has resulted in an evolutionary arms race that has driven toxin levels in the newt to extreme levels. This is an example of coevolution because differential survival caused each organism to change in response to changes in the other. California buckeye and pollinators When beehives are populated with bee species that have not coevolved with the California buckeye (Aesculus californica), sensitivity to aesculin, a neurotoxin present in its nectar, may be noticed; this sensitivity is only thought to be present in honeybees and other insects that did not coevolve with A. californica.[12] Acacia ant and bullhorn acacia tree The acacia ant (Pseudomyrmex ferruginea) protects the bullhorn acacia (Acacia cornigera) from preying insects and from other plants competing for sunlight, and the tree provides nourishment and shelter for the ant and its larvae.[13] Nevertheless, some ant species can exploit trees without reciprocating, and hence have been given various names such as 'cheaters', 'exploiters', 'robbers' and 'freeloaders'. Although cheater ants do significant damage to the reproductive organs of trees, their net effect on host fitness is difficult to forecast and not necessarily negative.[14] Yucca moth and the yucca plant In this mutualistic symbiotic relationship, the yucca plant (Yucca whipplei) is pollinated exclusively by Tegeticula maculata, a species of yucca moth that in turn relies on the yucca for survival.[15] Yucca moths tend to visit the flowers of only one species of yucca plant. In the flowers, the moth eats the seeds of the plant, while at the same time gathering pollen on special mouth parts. The pollen is very sticky, and will easily remain on the mouth parts when the moth moves to the next flower.
The yucca plant also provides a place for the moth to lay its eggs, deep within the flower where they are protected from any potential predators.[16] The adaptations that both species exhibit characterize coevolution because the species have evolved to become dependent on each other. Coevolution outside biology A flowering yucca plant that would be pollinated by a yucca moth Coevolution is primarily a biological concept, but has been applied to other fields by analogy. Technological coevolution Computer software and hardware can be considered as two separate components but tied intrinsically by coevolution. Similarly, operating systems and computer applications, web browsers and web applications. All of these systems depend upon each other and advance step by step through a kind of evolutionary process. Changes in hardware, an operating system or web browser may introduce new features that are then incorporated into the corresponding applications running alongside. Algorithms Coevolutionary algorithms are a class of algorithms used for generating artificial life as well as for optimization, game learning and machine learning. Coevolutionary methods have been applied by Daniel Hillis, who coevolved sorting networks, and Karl Sims, who coevolved virtual creatures. Cosmology and astronomy In his book The Self-organizing Universe, Erich Jantsch attributed the entire evolution of the cosmos to coevolution. In astronomy, an emerging theory states that black holes and galaxies develop in an interdependent way analogous to biological coevolution.[17] Coevolution References [1] Yip et al.; Patel, P; Kim, PM; Engelman, DM; McDermott, D; Gerstein, M (2008). "An integrated system for studying residue coevolution in proteins" (http:/ / bioinformatics. oxfordjournals. org/ cgi/ content/ full/ 24/ 2/ 290). Bioinformatics 24 (2): 290–292. doi:10.1093/bioinformatics/btm584. PMID 18056067. . [2] Sahney, S., Benton, M.J. and Ferry, P.A. (2010). "Links between global taxonomic diversity, ecological diversity and the expansion of vertebrates on land" (http:/ / rsbl. royalsocietypublishing. org/ content/ 6/ 4/ 544. full. pdf+ html) (PDF). Biology Letters 6 (4): 544–547. doi:10.1098/rsbl.2009.1024. PMC 2936204. PMID 20106856. . [3] Thompson, John N. (1994). The coevolutionary process (http:/ / books. google. com/ ?id=AyXPQzEwqPIC& pg=PA27& lpg=PA27& dq=Wallace+ "creation+ by+ law"+ Angræcum). Chicago: University of Chicago Press. ISBN 0-226-79760-0. . Retrieved 2009-07-27. [4] Darwin, Charles (1859). On the Origin of Species (http:/ / darwin-online. org. uk/ content/ frameset?itemID=F373& viewtype=text& pageseq=1) (1st ed.). London: John Murray. . Retrieved 2009-02-07. [5] Darwin, Charles (1877). On the various contrivances by which British and foreign orchids are fertilised by insects, and on the good effects of intercrossing (http:/ / darwin-online. org. uk/ content/ frameset?itemID=F801& viewtype=text& pageseq=1) (2nd ed.). London: John Murray. . Retrieved 2009-07-27. [6] C.Michael Hogan. 2010. Encyclopedia of Earth (http:/ / www. eoearth. org/ articles/ view/ 156858/ ?topic=49496''Virus''. ). Editors: Cutler Cleveland and Sidney Draggan [7] Van Valen L. (1973): "A New Evolutionary Law", Evolutionary Theory 1, p. 1-30. Cited in: The Red Queen Principle (http:/ / pespmc1. vub. ac. be/ REDQUEEN. html) [8] Lodé, Thierry (2007). La guerre des sexes chez les animaux, une histoire naturelle de la sexualité (http:/ / www. amazon. fr/ guerre-sexes-chez-animaux-naturelle/ dp/ 2738119018). Paris: Odile Jacob. 
ISBN 2-7381-1901-8. . [9] G. S. van Doorn, F. J. Weissing (April 2002). "Ecological versus Sexual Selection Models of Sympatric Speciation: A Synthesis" (http:/ / www. bio. vu. nl/ thb/ course/ ecol/ DoorWeis2001. pdf). Selection (Budapest, Hungary: Akadémiai Kiadó) 2 (1-2): 17–40. doi:10.1556/Select.2.2001.1-2.3. ISBN 1588-287X. ISSN 1585-1931. . Retrieved 2009-09-15. "The intuition behind the occurrence of evolutionary branching of ecological strategies in resource competition was confirmed, at least for asexual populations, by a mathematical formulation based on Lotka–Volterra type population dynamics. (Metz et al., 1996)." [10] Brown James H., Kodric-Brown Astrid (1979). "Convergence, Competition, and Mimicry in a Temperate Community of Hummingbird-Pollinated Flowers" (http:/ / www. jstor. org/ sici?sici=0012-9658(197910)60:5<1022:CCAMIA>2. 0. CO;2-D). Ecology 60 (5): 1022–1035. doi:10.2307/1936870. . [11] Stiles, F. Gary (1981). "Geographical Aspects of Bird Flower Coevolution, with Particular Reference to Central America" (http:/ / www. jstor. org/ sici?sici=0026-6493(1981)68:2<323:GAOBCW>2. 0. CO;). Annals of the Missouri Botanical Garden 68 (2): 323–351. doi:10.2307/2398801. . [12] C. Michael Hogan (13 September 2008). California Buckeye: Aesculus californica (http:/ / globaltwitcher. auderis. se/ artspec_information. asp?thingid=82383& lang=us), GlobalTwitcher.com [13] National Geographic. "Acacia Ant Video" (http:/ / video. nationalgeographic. com/ video/ player/ animals/ bugs-animals/ ants-and-termites/ ant_acaciatree. html). . [14] Palmer TM, Doak DF, Stanton ML, Bronstein JL, Kiers ET, Young TP, Goheen JR, Pringle RM (2010). "Synergy of multiple partners, including freeloaders, increases host fitness in a multispecies mutualism". Proceedings of the National Academy of Sciences of the United States of America 107 (40): 17234–9. doi:10.1073/pnas.1006872107. PMC 2951420. PMID 20855614. [15] Hemingway, Claire (2004). "Pollination Partnerships Fact Sheet" (http:/ / www. fna. org/ files/ imported/ Outreach/ FNAfs_yucca. pdf) (PDF). Flora of North America: 1–2. . Retrieved 2011-02-18. "Yucca and Yucca Moth" [16] Pellmyr, Olle; James Leebens-Mack (1999-08). "Forty million years of mutualism: Evidence for Eocene origin of the yucca-yucca moth association" (http:/ / www. pnas. org/ content/ 96/ 16/ 9178. full. pdf+ html) (PDF). Proc. Natl. Acad. Sci. USA 96 (16): 9178–9183. doi:10.1073/pnas.96.16.9178. PMC 17753. PMID 10430916. . Retrieved 2011-02-18. [17] Britt, Robert. "The New History of Black Holes: 'Co-evolution' Dramatically Alters Dark Reputation" (http:/ / www. space. com/ scienceastronomy/ blackhole_history_030128-1. html). . Further reading • Dawkins, R. Unweaving the Rainbow. • Futuyma, D. J. and M. Slatkin (editors) (1983). Coevolution. Sunderland, Massachusetts: Sinauer Associates. pp. 555 pp. ISBN 0-87893-228-3. • Geffeney, Shana L., et al. "Evolutionary diversification of TTX-resistant sodium channels in a predator-prey interaction". Nature 434 (2005): 759–763. • Michael Pollan The Botany of Desire: A Plant's-eye View of the World. Bloomsbury. ISBN 0-7475-6300-4. Account of the co-evolution of plants and humans 207 Coevolution • Thompson, J. N. (1994). The Coevolutionary Process. Chicago: University of Chicago Press. pp. 376 pp. ISBN 0-226-79759-7. External links • Coevolution (http://www.cosmolearning.com/video-lectures/coevolution-6703/), video of lecture by Stephen C. Stearns (Yale University) • Mintzer, Alex; Vinson, S.B.. 
"Kinship and incompatibility between colonies of the acacia ant Pseudomyrex ferruginea". Behavioral Ecology and Sociobiology 17 (1): 75–78. Abstract (http://www.jstor.org/stable/ 4599807) • Armstrong, W.P.. "The Yucca and its Moth" (http://waynesword.palomar.edu/ww0902a.htm). Wayne's Word. Palomar College. Retrieved 2011-03-29. Evolutionary art Evolutionary art is created using a computer. The process starts by having a population of many randomly generated individual representations of artworks. Each representation is evaluated for its aesthetic value and given a fitness score. The individuals with the higher fitness scores have a higher chance of remaining in the population while individuals with lower fitness scores are more likely to be removed from the population. This is the evolutionary principle of Survival of the fittest. The survivors are randomly selected in pairs to mate with each other and have offspring. Each offspring will also be a representation of an art work with some inherited properties from both of its parents. These offspring will then be added to the population and will also be evaluated and given a fitness score. This process of evaluation, selection and An image generated using an evolutionary algorithm mating is repeated for many generations. Sometimes mutation is also applied to add new properties or change existing properties of a few randomly selected individuals. Over time the pressure from the fitness selection generally causes the evolution of more aesthetic combinations of properties that make up the representations of the artworks. Evolutionary art is a branch of Generative art, which system is characterized by the use of evolutionary principles and natural selection as generative procedure. It distinguishes from BioArt by its medium dependency. If the latter adapts a similar project with carbon-based organisms, Evolutionary Art evolves silicon-based systems. In common with natural selection and animal husbandry, the members of a population undergoing artificial evolution modify their form or behavior over many reproductive generations in response to a selective regime. In interactive evolution the selective regime may be applied by the viewer explicitly by selecting individuals which are aesthetically pleasing. Alternatively a selection pressure can be generated implicitly, for example according to the length of time a viewer spends near a piece of evolving art. Equally, evolution may be employed as a mechanism for generating a dynamic world of adaptive individuals, in which the selection pressure is imposed by the program, and the viewer plays no role in selection, as in the Black Shoals project. 208 Evolutionary art Further reading • Metacreations: Art and Artificial Life, M Whitelaw, 2004, MIT Press • The Art of Artificial Evolution: A Handbook on Evolutionary Art and Music [1], Juan Romero and Penousal Machado (eds.), 2007, Springer • Evolutionary Art and Computers, W Latham, S Todd, 1992, Academic Press • Genetic Algorithms in Visual Art and Music Special Edition: Leonardo. VOL. 35, ISSUE 2 - 2002 (Part I), C Johnson, J Romero Cardalda (eds), 2002, MIT Press • Evolved Art: Turtles - Volume One [2], ISBN 978-0-615-30034-4, Tim Endres, 2009, EvolvedArt.biz Conferences • "Evomusart. 
1st International Conference and 10th European Event on Evolutionary and Biologically Inspired Music, Sound, Art and Design" [3] External links • "Evolutionary Art Gallery" [4], by Thomas Fernandez • "Biomorphs" [5], by Richard Dawkins • EndlessForms.com [3], Collaborative interactive evolution allowing you to evolve 3D objects and have them 3D printed. • "MusiGenesis" [6], a program that evolves music on a PC • "Evolve" [7], a program by Josh Lee that evolves art through a voting process. • "Living Image Project" [8], a site where images are evolved based on votes of visitors. • "An evolutionary art program using Cartesian Genetic Programming" [9] • Evolutionary Art on the Web [4] Interactively generate Mondriaan, Theo van Doesburg, Mandala and Fractal art. • "Darwinian Poetry" [12] • "One mans eyes?" [10], Aesthetically evolved images by Ashley Mills. • "E-volver" [8], interactive breeding units. • "Breed" [11], evolved sculptures produced by rapid manufacturing techniques. • "ImageBreeder" [12], an online breeder and gallery for users to submit evolved images. • "Picbreeder" [15], Collaborative breeder allowing branching from other users' creations that produces pictures like faces and spaceships. • "CFDG Mutate" [13], a tool for image evolution based on Chris Coyne's Context Free Design Grammar. • "xTNZ" [14], a three-dimensional ecosystem, where creatures evolve shapes and sounds. • The Art of Artificial Evolution: A Handbook on Evolutionary Art and Music [1] • Evolved Turtle Website [15] Evolved Turtle Website - Evolve art based on Turtle Logo using the Windows app BioLogo. • Evolvotron [16] - Evolutionary art software (example output [17]). 209 Evolutionary art 210 References [1] http:/ / art-artificial-evolution. dei. uc. pt/ [2] http:/ / www. amazon. com/ Evolved-Art-Turtles-Tim-Endres/ dp/ 0615300340/ [3] http:/ / evostar. dei. uc. pt/ 2012/ call-for-contributions/ evomusart/ [4] http:/ / www. cse. fau. edu/ ~thomas/ GraphicApDev/ ThomasFernandezArt. html [5] http:/ / www. freethoughtdebater. com/ ALifeBiomorphsAbout. htm [6] http:/ / www. musigenesis. com [7] http:/ / artdent. homelinux. net/ evolve/ about/ [8] http:/ / w-shadow. com/ li/ [9] http:/ / www. emoware. org/ evolutionary_art. asp [10] http:/ / www. ashleymills. com/ ?q=ae [11] http:/ / www. xs4all. nl/ ~notnot/ breed/ Breed. html [12] http:/ / www. imagebreeder. com [13] http:/ / www. wickedbean. co. uk/ cfdg/ index. html [14] http:/ / www. pikiproductions. com/ rui/ xtnz/ index. html [15] http:/ / www. evolvedturtle. com/ [16] http:/ / www. bottlenose. demon. co. uk/ share/ evolvotron/ index. htm [17] http:/ / www. bottlenose. demon. co. uk/ share/ evolvotron/ gallery. htm Artificial life Artificial life (often abbreviated ALife or A-Life[1]}) is a field of study and an associated art form which examine systems related to life, its processes, and its evolution through simulations using computer models, robotics, and biochemistry.[2] The discipline was named by Christopher Langton, an American computer scientist, in 1986.[3] There are three main kinds of alife,[4] named for their approaches: soft,[5] from software; hard,[6] from hardware; and wet, from biochemistry. Artificial life imitates traditional biology by trying to recreate biological phenomena.[7] The term "artificial life" is often used to specifically refer to soft alife.[8] Overview Artificial life studies the logic of living systems in artificial environments. 
The goal is to study the phenomena of living systems in order to come to an understanding of the complex information processing that defines such systems. Also sometimes included in the umbrella term Artificial Life are agent based systems which are used to study the emergent properties of societies of agents. While life is, by definition, alive, artificial life is generally referred to as being confined to a digital environment and existence. Philosophy The modeling philosophy of alife strongly differs from traditional modeling, by studying not only “life-as-we-know-it”, but also “life-as-it-might-be”.[9] A Braitenberg simulation, programmed in breve, an artificial life simulator In the first approach, a traditional model of a biological system will focus on capturing its most important parameters. In contrast, an alife modeling approach will generally seek to decipher the most simple and general Artificial life 211 principles underlying life and implement them in a simulation. The simulation then offers the possibility to analyse new, different life-like systems. Red'ko proposed to generalize this distinction not just to the modeling of life, but to any process. This led to the more general distinction of "processes-as-we-know-them" and "processes-as-they-could-be" [10] At present, the commonly accepted definition of life does not consider any current alife simulations or softwares to be alive, and they do not constitute part of the evolutionary process of any ecosystem. However, different opinions about artificial life's potential have arisen: • The strong alife (cf. Strong AI) position states that "life is a process which can be abstracted away from any particular medium" (John von Neumann). Notably, Tom Ray declared that his program Tierra is not simulating life in a computer but synthesizing it. • The weak alife position denies the possibility of generating a "living process" outside of a chemical solution. Its researchers try instead to simulate life processes to understand the underlying mechanics of biological phenomena. Software-based - "soft" Techniques • Cellular automata were used in the early days of artificial life, and they are still often used for ease of scalability and parallelization. Alife and cellular automata share a closely tied history. • Neural networks are sometimes used to model the brain of an agent. Although traditionally more of an artificial intelligence technique, neural nets can be important for simulating population dynamics of organisms that can learn. The symbiosis between learning and evolution is central to theories about the development of instincts in organisms with higher neurological complexity, as in, for instance, the Baldwin effect. Notable simulators This is a list of artificial life/digital organism simulators, organized by the method of creature definition. Name Driven By Started Ended Aevol translatable dna 2003 NA Avida executable dna 1993 NA breve executable dna 2006 NA Creatures neural net Darwinbots executable dna 2003 DigiHive executable dna 2006 2009 Evolve 4.0 executable dna 1996 2007 Framsticks executable dna 1996 NA Primordial life executable dna 1996 2003 TechnoSphere modules Tierra executable dna Noble Ape neural net Polyworld neural net 3D Virtual Creature Evolution neural net early 1990s ? NA Artificial life Program-based Further information: programming game These contain organisms with a complex DNA language, usually Turing complete. This language is more often in the form of a computer program than actual biological DNA. 
Assembly derivatives are the most common languages used. Use of cellular automata is common but not required. Module-based Individual modules are added to a creature. These modules modify the creature's behaviors and characteristics either directly, by hard coding into the simulation (leg type A increases speed and metabolism), or indirectly, through the emergent interactions between a creature's modules (leg type A moves up and down with a frequency of X, which interacts with other legs to create motion). Generally these are simulators which emphasize user creation and accessibility over mutation and evolution. Parameter-based Organisms are generally constructed with pre-defined and fixed behaviors that are controlled by various parameters that mutate. That is, each organism contains a collection of numbers or other finite parameters. Each parameter controls one or several aspects of an organism in a well-defined way. Neural net–based These simulations have creatures that learn and grow using neural nets or a close derivative. Emphasis is often, although not always, more on learning than on natural selection. Hardware-based - "hard" Further information: Robot Hardware-based artificial life mainly consist of robots, that is, automatically guided machines, able to do tasks on their own. Biochemical-based - "wet" Further information: Synthetic life and Synthetic biology Biochemical-based life is studied in the field of synthetic biology. It involves e.g. the creation of synthetic DNA. The term "wet" is an extension of the term "wetware". Related subjects 1. Artificial intelligence has traditionally used a top down approach, while alife generally works from the bottom up.[11] 2. Artificial chemistry started as a method within the alife community to abstract the processes of chemical reactions. 3. Evolutionary algorithms are a practical application of the weak alife principle applied to optimization problems. Many optimization algorithms have been crafted which borrow from or closely mirror alife techniques. The primary difference lies in explicitly defining the fitness of an agent by its ability to solve a problem, instead of its ability to find food, reproduce, or avoid death. The following is a list of evolutionary algorithms closely related to and used in alife: • Ant colony optimization • Evolutionary algorithm • Genetic algorithm 212 Artificial life • Genetic programming • Swarm intelligence 4. Evolutionary art uses techniques and methods from artificial life to create new forms of art. 5. Evolutionary music uses similar techniques, but applied to music instead of visual art. 6. Abiogenesis and the origin of life sometimes employ alife methodologies as well. Criticism Alife has had a controversial history. John Maynard Smith criticized certain artificial life work in 1994 as "fact-free science".[12] However, the recent publication of artificial life articles in widely read journals such as Science and Nature is evidence that artificial life techniques are becoming more accepted in the mainstream, at least as a method of studying evolution.[13] References [1] Molecules and Thoughts Y Tarnopolsky - 2003 "Artificial Life (often abbreviated as Alife or A-life) is a small universe existing parallel to the much larger Artificial Intelligence. The origins of both areas were different." [2] "Dictionary.com definition" (http:/ / dictionary. reference. com/ browse/ artificial life). . Retrieved 2007-01-19. [3] The MIT Encyclopedia of the Cognitive Sciences (http:/ / books. google. 
com/ books?id=-wt1aZrGXLYC& pg=PA37& cd=1#v=onepage), The MIT Press, p.37. ISBN 978-0-262-73144-7 [4] Mark A. Bedau (November 2003). "Artificial life: organization, adaptation and complexity from the bottom up" (http:/ / www. reed. edu/ ~mab/ publications/ papers/ BedauTICS03. pdf) (PDF). TRENDS in Cognitive Sciences. . Retrieved 2007-01-19. [5] Maciej Komosinski and Andrew Adamatzky (2009). Artificial Life Models in Software (http:/ / www. springer. com/ computer/ mathematics/ book/ 978-1-84882-284-9). New York: Springer. ISBN 978-1-84882-284-9. . [6] Andrew Adamatzky and Maciej Komosinski (2009). Artificial Life Models in Hardware (http:/ / www. springer. com/ computer/ hardware/ book/ 978-1-84882-529-1). New York: Springer. ISBN 978-1-84882-529-1. . [7] Christopher Langton. "What is Artificial Life?" (http:/ / zooland. alife. org/ ). . Retrieved 2007-01-19. [8] John Johnston, (2008) "The Allure of Machinic Life: Cybernetics, Artificial Life, and the New AI", MIT Press [9] See Langton, C. G. 1992. Artificial Life (http:/ / www. probelog. com/ texts/ Langton_al. pdf). Addison-Wesley. ., section 1 [10] See Red'ko, V. G. 1999. Mathematical Modeling of Evolution (http:/ / pespmc1. vub. ac. be/ MATHME. html). in: F. Heylighen, C. Joslyn and V. Turchin (editors): Principia Cybernetica Web (Principia Cybernetica, Brussels). For the importance of ALife modeling from a cosmic perspective, see also Vidal, C. 2008. The Future of Scientific Simulations: from Artificial Life to Artificial Cosmogenesis (http:/ / arxiv. org/ abs/ 0803. 1087). In Death And Anti-Death , ed. Charles Tandy, 6: Thirty Years After Kurt Gödel (1906-1978) p. 285-318. Ria University Press.) [11] "AI Beyond Computer Games" (http:/ / web. archive. org/ web/ 20080701040911/ http:/ / www. lggwg. com/ wolff/ aicg99/ stern. html). Archived from the original (http:/ / lggwg. com/ wolff/ aicg99/ stern. html) on 2008-07-01. . Retrieved 2008-07-04. [12] Horgan, J. 1995. From Complexity to Perplexity. Scientific American. p107 [13] "Evolution experiments with digital organisms" (http:/ / myxo. css. msu. edu/ cgi-bin/ lenski/ prefman. pl?group=al). . Retrieved 2007-01-19. External links • • • • • Computers: Artificial Life (http://www.dmoz.org/Computers/Artificial_Life/) at the open directory project Computers: Artificial Life Framework (http://www.artificiallife.org/) International Society of Artificial Life (http://alife.org/) Artificial Life (http://www.mitpressjournals.org/loi/artl) MIT Press Journal The Artificial Life Lab (http://www.envirtech.com/artificial-life-lab.html) Envirtech Island, Second Life 213 Machine learning Machine learning Machine learning, a branch of artificial intelligence, is a scientific discipline concerned with the design and development of algorithms that allow computers to evolve behaviors based on empirical data, such as from sensor data or databases. A learner can take advantage of examples (data) to capture characteristics of interest of their unknown underlying probability distribution. Data can be seen as examples that illustrate relations between observed variables. A major focus of machine learning research is to automatically learn to recognize complex patterns and make intelligent decisions based on data; the difficulty lies in the fact that the set of all possible behaviors given all possible inputs is too large to be covered by the set of observed examples (training data). Hence the learner must generalize from the given examples, so as to be able to produce a useful output in new cases. 
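As a minimal illustration of this idea, the sketch below (Python with NumPy; the toy data and the nearest-neighbour rule are purely illustrative, not a method described in this article) trains on a handful of labelled examples and then produces labels for inputs it has never seen:

import numpy as np

# Labelled training examples: two small clusters of 2-D points.
X_train = np.array([[1.0, 1.0], [1.2, 0.8], [4.0, 4.2], [3.8, 4.0]])
y_train = np.array([0, 0, 1, 1])

def predict(x):
    # Generalize by assigning the label of the closest training example.
    distances = np.linalg.norm(X_train - x, axis=1)
    return y_train[np.argmin(distances)]

# Inputs the learner has never observed during training.
print(predict(np.array([0.9, 1.1])))  # expected 0
print(predict(np.array([4.1, 3.9])))  # expected 1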
Definition Tom M. Mitchell provided a widely quoted definition: A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.[1] Generalization Generalization is the ability of a machine learning algorithm to perform accurately on new, unseen examples after training on a finite data set. The core objective of a learner is to generalize from its experience.[2] The training examples from its experience come from some generally unknown probability distribution and the learner has to extract from them something more general, something about that distribution, that allows it to produce useful answers in new cases. Machine learning, knowledge discovery in databases (KDD) and data mining These three terms are commonly confused, as they often employ the same methods and overlap strongly. They can be roughly separated as follows: • Machine learning focuses on the prediction, based on known properties learned from the training data • Data mining (which is the analysis step of Knowledge Discovery in Databases) focuses on the discovery of (previously) unknown properties on the data However, these two areas overlap in many ways: data mining uses many machine learning methods, but often with a slightly different goal in mind. On the other hand, machine learning also employs data mining methods as "unsupervised learning" or as a preprocessing step to improve learner accuracy. Much of the confusion between these two research communities (which do often have separate conferences and separate journals, ECML PKDD being a major exception) comes from the basic assumptions they work with: in machine learning, the performance is usually evaluated with respect to the ability to reproduce known knowledge, while in KDD the key task is the discovery of previously unknown knowledge. Evaluated with respect to known knowledge, an uninformed (unsupervised) method will easily be outperformed by supervised methods, while in a typical KDD task, supervised methods cannot be used due to the unavailability of training data. 214 Machine learning Human interaction Some machine learning systems attempt to eliminate the need for human intuition in data analysis, while others adopt a collaborative approach between human and machine. Human intuition cannot, however, be entirely eliminated, since the system's designer must specify how the data is to be represented and what mechanisms will be used to search for a characterization of the data. Algorithm types Machine learning algorithms can be organized into a taxonomy based on the desired outcome of the algorithm. • Supervised learning generates a function that maps inputs to desired outputs (also called labels, because they are often provided by human experts labeling the training examples). For example, in a classification problem, the learner approximates a function mapping a vector into classes by looking at input-output examples of the function. • Unsupervised learning models a set of inputs, like clustering. See also data mining and knowledge discovery. • Semi-supervised learning combines both labeled and unlabeled examples to generate an appropriate function or classifier. • Reinforcement learning learns how to act given an observation of the world. Every action has some impact in the environment, and the environment provides feedback in the form of rewards that guides the learning algorithm. 
• Transduction, or transductive inference, tries to predict new outputs on specific and fixed (test) cases from observed, specific (training) cases. • Learning to learn learns its own inductive bias based on previous experience. Theory The computational analysis of machine learning algorithms and their performance is a branch of theoretical computer science known as computational learning theory. Because training sets are finite and the future is uncertain, learning theory usually does not yield guarantees of the performance of algorithms. Instead, probabilistic bounds on the performance are quite common. In addition to performance bounds, computational learning theorists study the time complexity and feasibility of learning. In computational learning theory, a computation is considered feasible if it can be done in polynomial time. There are two kinds of time complexity results. Positive results show that a certain class of functions can be learned in polynomial time. Negative results show that certain classes cannot be learned in polynomial time. There are many similarities between machine learning theory and statistics, although they use different terms. Approaches Decision tree learning Decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the item's target value. Association rule learning Association rule learning is a method for discovering interesting relations between variables in large databases. Artificial neural networks 215 Machine learning An artificial neural network (ANN) learning algorithm, usually called "neural network" (NN), is a learning algorithm that is inspired by the structure and functional aspects of biological neural networks. Computations are structured in terms of an interconnected group of artificial neurons, processing information using a connectionist approach to computation. Modern neural networks are non-linear statistical data modeling tools. They are usually used to model complex relationships between inputs and outputs, to find patterns in data, or to capture the statistical structure in an unknown joint probability distribution between observed variables. Genetic programming Genetic programming (GP) is an evolutionary algorithm-based methodology inspired by biological evolution to find computer programs that perform a user-defined task. It is a specialization of genetic algorithms (GA) where each individual is a computer program. It is a machine learning technique used to optimize a population of computer programs according to a fitness landscape determined by a program's ability to perform a given computational task. Inductive logic programming Inductive logic programming (ILP) is an approach to rule learning using logic programming as a uniform representation for examples, background knowledge, and hypotheses. Given an encoding of the known background knowledge and a set of examples represented as a logical database of facts, an ILP system will derive a hypothesized logic program which entails all the positive and none of the negative examples. Support vector machines Support vector machines (SVMs) are a set of related supervised learning methods used for classification and regression. Given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that predicts whether a new example falls into one category or the other. 
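For instance, using the scikit-learn library listed under Software below (a hedged sketch with synthetic data, not code taken from this article), a two-category SVM can be trained and then applied to previously unseen points as follows:

import numpy as np
from sklearn import svm

rng = np.random.default_rng(0)
# Synthetic training set: 20 points near (0, 0) labelled 0 and 20 points near (3, 3) labelled 1.
X = np.vstack([rng.normal(0.0, 0.5, (20, 2)), rng.normal(3.0, 0.5, (20, 2))])
y = np.array([0] * 20 + [1] * 20)

clf = svm.SVC(kernel="linear")  # support vector classifier with a linear kernel
clf.fit(X, y)                   # build the model from the labelled examples

# Predict the category of new examples the model has not seen.
print(clf.predict([[0.2, 0.1], [2.9, 3.1]]))  # expected [0 1]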
Clustering Cluster analysis is the assignment of a set of observations into subsets (called clusters) so that observations in the same cluster are similar in some sense, while observations in different clusters are dissimilar. The variety of clustering techniques make different assumptions on the structure of the data, often defined by some similarity metric and evaluated for example by internal compactness (similarity between members of the same cluster) and separation between different clusters. Other methods are based on estimated density and graph connectivity. Clustering is a method of unsupervised learning, and a common technique for statistical data analysis. Bayesian networks A Bayesian network, belief network or directed acyclic graphical model is a probabilistic graphical model that represents a set of random variables and their conditional independencies via a directed acyclic graph (DAG). For example, a Bayesian network could represent the probabilistic relationships between diseases and symptoms. Given symptoms, the network can be used to compute the probabilities of the presence of various diseases. Efficient algorithms exist that perform inference and learning. 216 Machine learning Reinforcement learning Reinforcement learning is concerned with how an agent ought to take actions in an environment so as to maximize some notion of long-term reward. Reinforcement learning algorithms attempt to find a policy that maps states of the world to the actions the agent ought to take in those states. Reinforcement learning differs from the supervised learning problem in that correct input/output pairs are never presented, nor sub-optimal actions explicitly corrected. Representation learning Several learning algorithms, mostly unsupervised learning algorithms, aim at discovering better representations of the inputs provided during training. Classical examples include principal components analysis and cluster analysis. Representation learning algorithms often attempt to preserve the information in their input but transform it in a way that makes it useful, often as a pre-processing step before performing classification or predictions, allowing to reconstruct the inputs coming from the unknown data generating distribution, while not being necessarily faithful for configurations that are implausible under that distribution. Manifold learning algorithms attempt to do so under the constraint that the learned representation is low-dimensional. Sparse coding algorithms attempt to do so under the constraint that the learned representation is sparse (has many zeros). Deep learning algorithms discover multiple levels of representation, or a hierarchy of features, with higher-level, more abstract features defined in terms of (or generating) lower-level features. It has been argued that an intelligent machine is one that learns a representation that disentangles the underlying factors of variation that explain the observed data.[3] Sparse Dictionary Learning Sparse dictionary learning has been successfully used in a number of learning applications. In this method, a datum is represented as a linear combination of basis functions, and the coefficients are assumed to be sparse. Let x be a d-dimensional datum, D be a d by n matrix, where each column of D represent a basis function. r is the coefficient to represent x using D. Mathematically, sparse dictionary learning means the following where r is sparse. 
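In a common statement of the problem (for example the form used in the K-SVD paper cited below, given here as the standard sparse-coding objective rather than as a quotation of the original article), this reads

\min_{r} \; \lVert x - D r \rVert_2^2 \quad \text{subject to} \quad \lVert r \rVert_0 \le k,

that is, x \approx D r with r sparse; the bound k is introduced here only to make the sparsity constraint explicit.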
Generally speaking, n is assumed to be larger than d to allow the freedom for a sparse representation. Sparse dictionary learning has been applied in several contexts. In classification, the problem is to determine whether a new data belongs to which classes. Suppose we already build a dictionary for each class, then a new data is associate to the class such that it is best sparsely represented by the corresponding dictionary. People also applied sparse dictionary learning in image denoising. The key idea is that clean image path can be sparsely represented by a image dictionary, but the noise cannot. User can refer to [4] if interested. Applications Applications for machine learning include: • • • • • • • • • machine perception computer vision natural language processing syntactic pattern recognition search engines medical diagnosis bioinformatics brain-machine interfaces cheminformatics • Detecting credit card fraud • stock market analysis 217 Machine learning • • • • • • • • • • • • Classifying DNA sequences Sequence mining speech and handwriting recognition object recognition in computer vision game playing software engineering adaptive websites robot locomotion computational finance structural health monitoring. Sentiment Analysis (or Opinion Mining). Affective computing In 2006, the on-line movie company Netflix held the first "Netflix Prize" competition to find a program to better predict user preferences and beat its existing Netflix movie recommendation system by at least 10%. The AT&T Research Team BellKor beat out several other teams with their machine learning program "Pragmatic Chaos". After winning several minor prizes, it won the grand prize competition in 2009 for $1 million.[5] Software RapidMiner, LIONsolver, KNIME, Weka, ODM, Shogun toolbox, Orange, Apache Mahout, scikit-learn, mlpy are software suites containing a variety of machine learning algorithms. Journals and conferences • • • • • • Machine Learning (journal) Journal of Machine Learning Research Neural Computation (journal) Journal of Intelligent Systems(journal) [6] International Conference on Machine Learning (ICML) (conference) Neural Information Processing Systems (NIPS) (conference) References [1] * Mitchell, T. (1997). Machine Learning, McGraw Hill. ISBN 0-07-042807-7, p.2. [2] Christopher M. Bishop (2006) Pattern Recognition and Machine Learning, Springer ISBN 0-387-31073-8. [3] Yoshua Bengio (2009). Learning Deep Architectures for AI (http:/ / books. google. com/ books?id=cq5ewg7FniMC& pg=PA3). Now Publishers Inc.. p. 1–3. ISBN 978-1-60198-294-0. . [4] Aharon, M, M Elad, and A Bruckstein. 2006. “K-SVD: An Algorithm for Designing Overcomplete Dictionaries for Sparse Representation.” Signal Processing, IEEE Transactions on 54 (11): 4311-4322 [5] "BelKor Home Page" (http:/ / www2. research. att. com/ ~volinsky/ netflix/ ) research.att.com [6] http:/ / www. degruyter. de/ journals/ jisys/ detailEn. cfm 218 Machine learning Further reading • Sergios Theodoridis, Konstantinos Koutroumbas (2009) "Pattern Recognition", 4th Edition, Academic Press, ISBN 978-1-59749-272-0. • Ethem Alpaydın (2004) Introduction to Machine Learning (Adaptive Computation and Machine Learning), MIT Press, ISBN 0-262-01211-1 • Bing Liu (2007), Web Data Mining: Exploring Hyperlinks, Contents and Usage Data (http://www.cs.uic.edu/ ~liub/WebMiningBook.html). 
Springer, ISBN 3-540-37881-2 • Toby Segaran, Programming Collective Intelligence, O'Reilly ISBN 0-596-52932-5 • Ray Solomonoff, " An Inductive Inference Machine (http://world.std.com/~rjs/indinf56.pdf)" A privately circulated report from the 1956 Dartmouth Summer Research Conference on AI. • Ray Solomonoff, An Inductive Inference Machine, IRE Convention Record, Section on Information Theory, Part 2, pp., 56-62, 1957. • Ryszard S. Michalski, Jaime G. Carbonell, Tom M. Mitchell (1983), Machine Learning: An Artificial Intelligence Approach, Tioga Publishing Company, ISBN 0-935382-05-4. • Ryszard S. Michalski, Jaime G. Carbonell, Tom M. Mitchell (1986), Machine Learning: An Artificial Intelligence Approach, Volume II, Morgan Kaufmann, ISBN 0-934613-00-1. • Yves Kodratoff, Ryszard S. Michalski (1990), Machine Learning: An Artificial Intelligence Approach, Volume III, Morgan Kaufmann, ISBN 1-55860-119-8. • Ryszard S. Michalski, George Tecuci (1994), Machine Learning: A Multistrategy Approach, Volume IV, Morgan Kaufmann, ISBN 1-55860-251-8. • Bishop, C.M. (1995). Neural Networks for Pattern Recognition, Oxford University Press. ISBN 0-19-853864-2. • Richard O. Duda, Peter E. Hart, David G. Stork (2001) Pattern classification (2nd edition), Wiley, New York, ISBN 0-471-05669-3. • Huang T.-M., Kecman V., Kopriva I. (2006), Kernel Based Algorithms for Mining Huge Data Sets, Supervised, Semi-supervised, and Unsupervised Learning (http://learning-from-data.com), Springer-Verlag, Berlin, Heidelberg, 260 pp. 96 illus., Hardcover, ISBN 3-540-31681-7. • KECMAN Vojislav (2001), Learning and Soft Computing, Support Vector Machines, Neural Networks and Fuzzy Logic Models (http://support-vector.ws), The MIT Press, Cambridge, MA, 608 pp., 268 illus., ISBN 0-262-11255-8. • MacKay, D.J.C. (2003). Information Theory, Inference, and Learning Algorithms (http://www.inference.phy. cam.ac.uk/mackay/itila/), Cambridge University Press. ISBN 0-521-64298-1. • Ian H. Witten and Eibe Frank Data Mining: Practical machine learning tools and techniques Morgan Kaufmann ISBN 0-12-088407-0. • Sholom Weiss and Casimir Kulikowski (1991). Computer Systems That Learn, Morgan Kaufmann. ISBN 1-55860-065-5. • Mierswa, Ingo and Wurst, Michael and Klinkenberg, Ralf and Scholz, Martin and Euler, Timm: YALE: Rapid Prototyping for Complex Data Mining Tasks, in Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-06), 2006. • Trevor Hastie, Robert Tibshirani and Jerome Friedman (2001). The Elements of Statistical Learning (http:// www-stat.stanford.edu/~tibs/ElemStatLearn/), Springer. ISBN 0-387-95284-5. • Vladimir Vapnik (1998). Statistical Learning Theory. Wiley-Interscience, ISBN 0-471-03003-1. 219 Machine learning External links • International Machine Learning Society (http://machinelearning.org/) • There is a popular online course by Andrew Ng, at ml-class.org (http://www.ml-class.org). It uses GNU Octave. The course is a free version of Stanford University's actual course, whose lectures are also available for free (http://see.stanford.edu/see/courseinfo.aspx?coll=348ca38a-3a6d-4052-937d-cb017338d7b1). • Machine Learning Video Lectures (http://videolectures.net/Top/Computer_Science/Machine_Learning/) Evolvable hardware Evolvable hardware (EH) is a new field about the use of evolutionary algorithms (EA) to create specialized electronics without manual engineering. It brings together reconfigurable hardware, artificial intelligence, fault tolerance and autonomous systems. 
Evolvable hardware refers to hardware that can change its architecture and behavior dynamically and autonomously by interacting with its environment. Introduction In its most fundamental form an evolutionary algorithm manipulates a population of individuals where each individual describes how to construct a candidate circuit. Each circuit is assigned a fitness, which indicates how well a candidate circuit satisfies the design specification. The evolutionary algorithm uses stochastic operators to evolve new circuit configurations from existing ones. Done properly, over time the evolutionary algorithm will evolve a circuit configuration that exhibits desirable behavior. Each candidate circuit can either be simulated or physically implemented in a reconfigurable device. Typical reconfigurable devices are field-programmable gate arrays (for digital designs) or field-programmable analog arrays (for analog designs). At a lower level of abstraction are the field-programmable transistor arrays that can implement either digital or analog designs. The concept was pioneered by Adrian Thompson at the University of Sussex, England, who in 1996 evolved a tone discriminator using fewer than 40 programmable logic gates and no clock signal in a FPGA. This is a remarkably small design for such a device and relied on exploiting peculiarities of the hardware that engineers normally avoid. For example, one group of gates has no logical connection to the rest of the circuit, yet is crucial to its function. Why evolve circuits? In many cases, conventional design methods (formulas, etc.) can be used to design a circuit. But in other cases, the design specification doesn't provide sufficient information to permit using conventional design methods. For example, the specification may only state desired behavior of the target hardware. In other cases, an existing circuit must adapt—i.e., modify its configuration—to compensate for faults or perhaps a changing operational environment. For instance, deep-space probes may encounter sudden high radiation environments, which alter a circuit's performance; the circuit must self-adapt to restore as much of the original behavior as possible. 220 Evolvable hardware Finding the fitness of an evolved circuit The fitness of an evolved circuit is a measure of how well the circuit matches the design specification. Fitness in evolvable hardware problems is determined via two methods:: • extrinsic evolution: all circuits are simulated to see how they perform • intrinsic evolution : physical tests are run on actual hardware. In extrinsic evolution only the final best solution in the final population of the evolutionary algorithm is physically implemented, whereas with intrinsic evolution every individual in every generation of the EA's population is physically realized and tested. Future research directions Evolvable hardware problems fall into two categories: original design and adaptive systems. Original design uses evolutionary algorithms to design a system that meets a predefined specification. Adaptive systems reconfigure an existing design to counteract faults or a changed operational environment. Original design of digital systems is not of much interest because industry already can synthesize enormously complex circuitry. For example, one can buy IP to synthesize USB port circuitry, ethernet microcontrollers and even entire RISC processors. 
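In either category, the underlying search is the loop described in the introduction: maintain a population of candidate configurations, score each one, and derive the next generation with stochastic operators. A minimal extrinsic-evolution sketch of that loop is given below (Python; the bitstring encoding and the simulate() fitness function are hypothetical stand-ins for a real configuration bitstream and a circuit simulator):

import random

BITS = 64          # length of the hypothetical configuration bitstring
POP_SIZE = 30
GENERATIONS = 100

def simulate(config):
    # Hypothetical extrinsic fitness: a placeholder for scoring the candidate circuit in a simulator.
    # Here it simply counts bits that match an arbitrary target pattern.
    target = [i % 2 for i in range(BITS)]
    return sum(1 for a, b in zip(config, target) if a == b)

def mutate(config, rate=0.02):
    # Flip each bit with a small probability (stochastic variation operator).
    return [bit ^ 1 if random.random() < rate else bit for bit in config]

population = [[random.randint(0, 1) for _ in range(BITS)] for _ in range(POP_SIZE)]
for generation in range(GENERATIONS):
    ranked = sorted(population, key=simulate, reverse=True)
    parents = ranked[:POP_SIZE // 2]                        # keep the fitter half (truncation selection)
    offspring = [mutate(random.choice(parents)) for _ in range(POP_SIZE - len(parents))]
    population = parents + offspring

best = max(population, key=simulate)
print("best simulated fitness:", simulate(best), "of", BITS)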
Some research into original design still yields useful results, for example genetic algorithms have been used to design logic systems with integrated fault detection that outperform hand designed equivalents. Original design of analog circuitry is still a wide-open research area. Indeed, the analog design industry is nowhere near as mature as is the digital design industry. Adaptive systems has been and remains an area of intense interest. Literature • Garrison W. Greenwood and Andrew M. Tyrrell, Introduction to Evolvable Hardware: A Practical Guide for Designing Self-Adaptive Systems, Wiley-IEEE Press, 2006 External links • • • • • • • • NASA-DoD-sponsored conference 2004 [1] NASA-DoD-sponsored conference 2005 [2] NASA/ESA Conference on Adaptive Hardware and Systems (AHS-2006) [3] NASA/ESA Conference on Adaptive Hardware and Systems (AHS-2007) [4] NASA used a genetic algorithm to design a novel antenna [5] (see PDF [6] paper for details) Adrian Thompson's Research Page [7] Adrian Thompson's paper on the Discriminator [8] Evolutionary Electronics at the University of Sussex [9] 221 Evolvable hardware 222 References [1] [2] [3] [4] [5] [6] [7] [8] [9] http:/ / ehw. jpl. nasa. gov/ events/ nasaeh04/ http:/ / ic. arc. nasa. gov/ projects/ eh2005/ http:/ / ehw. jpl. nasa. gov/ events/ ahs2006/ http:/ / www. see. ed. ac. uk/ ahs2007/ AHS. htm http:/ / www. arc. nasa. gov/ exploringtheuniverse-evolvablesystems. cfm http:/ / www. genetic-programming. org/ gecco2004hc/ lohn-paper. pdf http:/ / www. informatics. sussex. ac. uk/ users/ adrianth/ ade. html http:/ / www. informatics. sussex. ac. uk/ users/ adrianth/ ices96/ paper. html http:/ / www. informatics. sussex. ac. uk/ users/ adrianth/ NEAT Particles NEAT Particles is an Interactive evolutionary computation program that enables users to evolve particle systems intended for use as special effects in video games or movie graphics. Rather than being hand-coded like typical particle systems, the behaviors of NEAT Particle effects are evolved by user preference. Therefore non-programmer, non-artist users may evolve complex and unique special effects in real time. NEAT Particles is meant to augment and assist the time-consuming computer graphics content generation process. Method In NEAT Particles, each particle system is controlled by a Compositional pattern-producing network (CPPN), a type of artificial neural network, or ANN. In other words, the usually hand-coded 'rules' of a particle system are replaced by automatically generated CPPNs. The CPPNs are evolved and complexified by NeuroEvolution of Augmenting Topologies (NEAT). A simple, interactive evolutionary computation (IEC) interface enables user guided evolution. In this manner increasingly complex particle system effects are evolved by user preference. NEAT Particles IEC interface. Benefit The main benefit of NEAT Particles is to decouple particle system creation from programming, allowing unique and interesting effects to be quickly evolved by users without programming or artistic skill. Additionally, it provides a way for content developers to explore the range of possible effects. And finally, it can act as a concept art tool or idea generator, in which novel and useful effects are easily discovered. Close up of an evolved particle effect and its ANN. NEAT Particles Implications The methodology of NEAT Particles can be applied to generation of other types of content, such as 3D models or programmable shader effects. 
The most significant implication of NEAT Particles, and of other interactive evolutionary computation applications, is the possibility of automated content generation within a game itself, while it is played. Bibliography • Erin Hastings, Ratan Guha, and Kenneth O. Stanley (2007). "NEAT Particles: Design, Representation, and Animation of Particle System Effects" [1]. Proceedings of the IEEE Symposium on Computational Intelligence and Games (CIG'07). External links • "Evolutionary Complexity Research Group at UCF" [2] - home of NEAT Particles and other evolutionary complexity research projects • "NEAT Particles" [3] - latest source code and executable References [1] http://eplex.cs.ucf.edu/papers/hastings_cig07.pdf [2] http://www.cs.ucf.edu/eplex [3] http://eplex.cs.ucf.edu/software/NEAT_Particles1.0.zip
DRHagen, Damian Yerrick, Daniel Dickman, Daryakav, Dattorro, Daveagp, David Eppstein, David Martland, David.Monniaux, Deeptrivia, Delaszk, Deuxoursendormis, Dianegarey, Diego Moya, Diracula, Discospinster, Dmitrey, DonSiano, Doobliebop, Dpbert, Dsol, Dsz4, Duoduoduo, Dwassel, Dysprosia, Edesigner, Ekojnoekoj, EncMstr, Encyclops, Epistemenical, Erkan Yilmaz, Fintor, FiveColourMap, Fred Bauder, G.de.Lange, Galoubet, Georg Stillfried, Gglockner, Ggpauly, Giftlite, H Padleckas, H.ehsaan, Harrycaterpillar, Headbomb, Heroestr, HiYoSilver01, Hike395, Hosseininassab, Hu12, Hua001, Iknowyourider, InverseHypercube, Ish ishwar, Isheden, Isnow, JFPuget, JPRBW, Jackzhp, Jason Quinn, Jasonb05, Jean-Charles.Gilbert, Jitse Niesen, Jmc200, John of Reading, Johngcarlsson, JonMcLoone, Jonnat, Jowa fan, Jurgen, Justin W Smith, KaHa242, Kamitsaha, Karada, Katie O'Hare, Kiefer.Wolfowitz, Kiril Simeonovski, Klochkov.ivan, Knillinux, KrakatoaKatie, Krystofer, LSpring, LastChanceToBe, Lavaka, Leonardo61, Lethe, LokiClock, Ltk, Lycurgus, Lylenorton, MIT Trekkie, MSchlueter, Mange01, Mangogirl2, Marcol, MarkSweep, MartinDK, Martynas Patasius, Mat cross, MaxSem, MaximizeMinimize, Mcld, Mcmlxxxi, Mdd, Mdwang, Metafun, Michael Hardy, Mikewax, Misfeldt, Moink, MrOllie, Mrbynum, Msh210, Mxn, Myleslong, Nacopt, Nageh, Nimur, Nojhan, NormDor, Nwbeeson, Obradovic Goran, Ojigiri, Oleg Alexandrov, Olegalexandrov, Oli Filth, Optimering, Optiy, Osiris, Oğuz Ergin, Paolo.dL, Patrick, Pcap, Peterlin, Philip Trueman, PhotoBox, PimBeers, Polar Bear, Pontus, Pownuk, Procellarum, Pschaus, Psogeek, RKUrsem, Rade Kutil, Ravelite, Rbdevore, Retired username, Riedel, Rinconsoleao, Robiminer, Roleplayer, Rxnt, Ryguasu, Sabamo, Sahinidis, Salih, Salix alba, Sapphic, Saraedum, Schaber, Schlitz4U, Skifreak, Sliders06, Smartcat, Smmurphy, Srinnath, Stebulus, Stevan White, Struway, Suegerman, Syst analytic, TPlantenga, Tbbooher, TeaDrinker, The Anome, The Nut, Thiscomments voice, Thomasmeeks, Thoughtfire, Tizio, Topbanana, Travis.a.buckingham, Truecobb, Tsirel, Twocs, Van helsing, Vermorel, VictorAnyakin, Voyevoda, Wamanning, Wikibuki, Wmahan, Woohookitty, X7q, Xprime, YuriyMikhaylovskiy, Zfeinst, Zundark, Zwgeem, Іванко1, Щегол, 331 anonymous edits Nonlinear programming Source: http://en.wikipedia.org/w/index.php?oldid=465609172 Contributors: Ajgorhoe, Alexander.mitsos, BarryList, Broom eater, Brunner7, Charles Matthews, Dmitrey, Dto, EconoPhysicist, EdJohnston, EncMstr, Frau Holle, FrenchIsAwesome, G.de.Lange, Garde, Giftlite, Headbomb, Hike395, Hu12, Isheden, Jamelan, Jaredwf, Jean-Charles.Gilbert, Jitse Niesen, Kiefer.Wolfowitz, KrakatoaKatie, Leonard G., McSush, Mcmlxxxi, Mdd, Metiscus, Miaow Miaow, Michael Hardy, Mike40033, Monkeyman, MrOllie, Myleslong, Nacopt, Oleg Alexandrov, Olegalexandrov, PimBeers, Psvarbanov, RekishiEJ, Sabamo, Stevenj, Tgdwyer, User A1, Vgmddg, 57 anonymous edits Combinatorial optimization Source: http://en.wikipedia.org/w/index.php?oldid=487087141 Contributors: Akhil999in, Aliekens, Altenmann, Arnab das, Ben pcc, Ben1220, Bonniesteiglitz, Brunato, Brunner ru, CharlesGillingham, Cngoulimis, Cobi, Ctbolt, Daveagp, David Eppstein, Deanlaw, Diomidis Spinellis, Dmyersturnbull, Docu, Duoduoduo, Ebe123, Eiro06, Estr4ng3d, Giftlite, Giraffedata, Hike395, Isheden, Jcc1, Jonkerz, Kiefer.Wolfowitz, Kinema, Ksyrie, Lepikhin, Mellum, Michael Hardy, Miym, Moxon, Nocklas, NotQuiteEXPComplete, Pjrm, RKUrsem, Remuel, Rjpbi, RobinK, Ruud Koot, Sammy1007, Sdorrance, SilkTork, Silverfish, StoneIsle, ThomHImself, Tizio, Tomo, 
Tribaal, Unara, 40 anonymous edits Travelling salesman problem Source: http://en.wikipedia.org/w/index.php?oldid=487033681 Contributors: 130.233.251.xxx, 28421u2232nfenfcenc, 4ndyD, 62.202.117.xxx, ANONYMOUS COWARD0xC0DE, Aaronbrick, Adammathias, Aftermath1983, Ahoerstemeier, Akokskis, Alan.ca, AlanUS, Aldie, Altenmann, Andreas Kaufmann, Andreasr2d2, Andris, Angus Lepper, Apanag, ArglebargleIV, Aronisstav, Astral, AstroNomer, Azotlichid, B4hand, Bathysphere, Bender2k14, BenjaminTsai, Bensin, Bernard Teo, Bjornson81, Bo Jacoby, Bongwarrior, Boothinator, Brian Gunderson, Brucevdk, Brw12, Bubba73, C. lorenz, CRGreathouse, Can't sleep, clown will eat me, Capricorn42, ChangChienFu, Chris-gore, ChrisCork, Classicalecon, Cngoulimis, Coconut7594, Conversion script, CountingPine, DVdm, Daniel Karapetyan, David Eppstein, David.Mestel, David.Monniaux, David.hillshafer, DavidBiesack, Davidhorman, Dbfirs, Dcoetzee, Devis, Dino, Disavian, Donarreiskoffer, Doradus, Downtown dan seattle, DragonflySixtyseven, DreamGuy, Dwhdwh, Dysprosia, Edward, El C, Ellywa, ErnestSDavis, Fanis84, Ferris37, Fioravante Patrone, Flapitrr, Fmccown, Fmorstatter, Fredrik, French Tourist, Gaeddal, Galoubet, Gdessy, Gdr, Geofftech, Giftlite, Gnomz007, Gogo Dodo, Graham87, Greenmatter, H, Hairy Dude, Hans Adler, Haterade111, Hawk777, Herbee, Hike395, Honnza, Hyperneural, Ironholds, Irrevenant, Isaac, IstvanWolf, IvR, Ixfd64, J.delanoy, JackH, Jackbars, Jamesd9007, Jasonb05, Jeffhoy, Jim.Callahan,Orlando, John of Reading, Johngouf85, Johnleach, Jok2000, JonathanFreed, Jsamarziya, Jugander, Justin W Smith, KGV, Kane5187, Karada, Kenneth M Burke, Kenyon, Kf4bdy, Kiefer.Wolfowitz, Kjells, Klausikm, Kotasik, Kri, Ksana, Kvamsi82, Kyokpae, LFaraone, LOL, Lambiam, Lanthanum-138, Laudaka, Lingwanjae, MSGJ, MagicMatt1021, Male1979, Mantipula, MarSch, Marj Tiefert, Martynas Patasius, Materialscientist, MathMartin, Mdd, Mellum, Melsaran, Mhahsler, Michael Hardy, Michael Slone, Mild Bill Hiccup, Miym, Mojoworker, Monstergurkan, MoraSique, Mormegil, Musiphil, Mzamora2, Naff89, Nethgirb, Nguyen Thanh Quang, Ninjagecko, Nobbie, Nr9, Obradovic Goran, Orfest, Ozziev, Paul Silverman, Pauli133, Pegasusbupt, PeterC, Petrus, Pgr94, Phcho8, Piano non troppo, PierreSelim, Pleasantville, Pmdboi, Pschaus, Qaramazov, Qorilla, Quadell, R3m0t, Random contributor, Ratfox, Raul654, Reconsider the static, RedLyons, Requestion, Rheun, Richmeister, Rjwilmsi, RobinK, Rocarvaj, Ronaldo, Rror, Ruakh, Ruud Koot, Ryan Roos, STGM, Saeed.Veradi, Sahuagin, Sarkar112, Scravy, Seet82, Seraphimblade, Sergey539, Shadowjams, Sharcho, ShelfSkewed, Shoujun, Siddhant, Simetrical, Sladen, Smmurphy, Smremde, Smyth, Some standardized rigour, Soupz, South Texas Waterboy, SpNeo, Spock of Vulcan, SpuriousQ, Stemonitis, Stevertigo, Stimpy, Stochastix, StradivariusTV, Superm401, Superninja, Tamfang, Teamtheo, Tedder, That Guy, From That Show!, The Anome, The Thing That Should Not Be, The stuart, Theodore Kloba, Thisisbossi, Thore Husfeldt, Tigerqin, Tinman, Tobias Bergemann, Tom Duff, Tom3118, Tomgally, Tomhubbard, Tommy2010, Tsplog, Twas Now, Vasiľ, Vgy7ujm, WhatisFeelings?, Wizard191, Wumpus3000, Wwwwolf, Xiaojeng, Xnn, Yixin.cao, Ynhockey, Zaphraud, Zeno Gantner, ZeroOne, Zyqqh, 538 anonymous edits Constraint (mathematics) Source: http://en.wikipedia.org/w/index.php?oldid=481608171 Contributors: Ajgorhoe, Allens, ClockworkSoul, Correogsk, EmmetCaulfield, Finell, Jitse Niesen, Jrtayloriv, Michael Hardy, Nbarth, Oleg Alexandrov, Paolo.dL, Skashoob, Stefano85, T.ogar, Wohingenau, 
Zheric, Іванко1, 26 anonymous edits Constraint satisfaction problem Source: http://en.wikipedia.org/w/index.php?oldid=486218160 Contributors: 777sms, Alai, AndrewHowse, BACbKA, Beland, Bender2k14, Bengkui, Coneslayer, David Eppstein, Delirium, Dgessner, Diego Moya, DracoBlue, Ertuocel, Headbomb, Jamelan, Jdpipe, Jgoldnight, Jkl, Jradix, Karada, Katieh5584, Linas, Mairi, MarSch, Michael Hardy, Ogai, Oleg Alexandrov, Oliphaunt, Ott2, Patrick, R'n'B, Rl, Simeon, The Anome, Tizio, Uncle G, 35 anonymous edits Constraint satisfaction Source: http://en.wikipedia.org/w/index.php?oldid=460624017 Contributors: AndrewHowse, Antonielly, Auntof6, Carbo1200, D6, Deflective, Delirium, Diego Moya, EagleFan, EncMstr, Epktsang, Ertuocel, Grafen, Harburg, Jdpipe, Linas, LizBlankenship, MilFlyboy, Nabeth, Ott2, R'n'B, Radsz, Tgdwyer, That Guy, From That Show!, Timwi, Tizio, Uncle G, Vuara, WikHead, 27 anonymous edits Heuristic (computer science) Source: http://en.wikipedia.org/w/index.php?oldid=484595763 Contributors: Altenmann, Chris G, Leonardo61, RJFJR, 1 anonymous edits Multi-objective optimization Source: http://en.wikipedia.org/w/index.php?oldid=486621744 Contributors: Anne Koziolek, Anoyzz, BenFrantzDale, Bieren, Billinghurst, Bovineone, Brian.woolley, CheoMalanga, DanMG, DavidCBryant, Dcirovic, Dcraft96, Diego Moya, Duoduoduo, Dvvar Reyn, Gerontech, Gjacquenot, Hello Control, JRSP, Jbicik, Juanjo.durillo, Kamitsaha, Kenneth M Burke, Kiefer.Wolfowitz, Klochkov.ivan, Leonardo61, LilHelpa, Marcuswikipedian, MathMaven, Michael Hardy, Microfries, Miym, MrOllie, Mullur1729, MuthuKutty, Nojhan, Oli Filth, Paradiseo, Paskornc, Phuzion, Pruetboonma, Rjwilmsi, Robiminer, Shd, Sliders06, Timeknight, Zfeinst, 47 anonymous edits Pareto efficiency Source: http://en.wikipedia.org/w/index.php?oldid=485714019 Contributors: 524, AdamSmithee, Aenar, Alex695, Ali.erfani, Alphachimp, Anupa.chakraborty, Audacity, Bebestbe, Beefman, Bfinn, Bkessler, Blathnaid, Bluemoose, Bozboy, BrendelSignature, Brenton, Brighterorange, C S, CRGreathouse, Caseyc1031, Cgray4, Chrisbbehrens, Clementmin, Cntras, Colin Rowat, Colonies Chris, Conchisness, Correogsk, Cretog8, Dabigkid, Daspranab, DavidLevinson, Destynova, Dhochron, Diego Moya, Diomidis Spinellis, Dissident, Dlohcierekim, Dolphonia, Dungodung, DwightKingsbury, Ekoontz, ElementoX, Ellywa, EmersonLowry, Enchanter, Erianna, Ezrakilty, Filippowiki, Fit, Fluffernutter, Frank Romein, Fuzzy Logic, Geometry guy, Giftlite, Gingerjoos, Gomm, Gregalton, Gregbard, Halcyonhazard, Haonhien, Hede2000, Henrygb, Hugetim, I dream of horses, IOLJeff, Igodard, Iridescent, JForget, JaGa, Jackftwist, Jacob Lundberg, Jamdavmiller, Jameslangstonevans, Jdevine, Jeff G., Johnuniq, Josevellezcaldas, João Carlos de Campos Pimentel, Jrincayc, KarmicRag, Kazkaskazkasako, Kiefer.Wolfowitz, Kolesarm, Koolkao, Krigsmakten, Kylesable, Kzollman, Lambiam, LizardJr8, Lmdav2, Logan.aggregate, Los3, Ludwig, MPerel, Maghnus, Marek69, Mausy5043, MaxEnt, Maziotis, Mechanical digger, Meelar, Metamatic, Michael Hardy, Mikechen, Mild Bill Hiccup, Moink, Moosesheppy, Mullur1729, Mydogategodshat, Nakos2208, Nbarth, Neutrality, Niku, Ojigiri, Oleg Alexandrov, 224 Article Sources and Contributors Oliphaunt, Omnipaedista, PAR, Panscient, Patrick, Pbrandao, Pete.Hurd, Petrb, Piotrus, Postdlf, Prari, R Lowry, R'n'B, RainbowOfLight, Ratiocinate, Ravik, RayAYang, Rdalimov, Rjensen, Rjwilmsi, Roberthust, Ruy Lopez, SchfiftyThree, Scott Ritchie, Sheitan, Shervin.j, SidP, SilverStar, SimonP, Smmurphy, Splash, Staffwaterboy, Stephen B 
Streater, Stirling Newberry, Sydneycathryn, Tarotcards, Tercerista, The Anome, Thomasmeeks, Tide rolls, Toddnob, Tschirl, Vantelimus, Volunteer Marek, Walden, Warren Dew, Wikiborg, Wikid, Woood, Wooster, Wragge, Xnn, Zj, ZoFreX, 281 anonymous edits Stochastic programming Source: http://en.wikipedia.org/w/index.php?oldid=480197062 Contributors: 4th-otaku, BarryList, Bluebusy, Charles Matthews, Headbomb, Hike395, Hongooi, Jaredwf, Jitse Niesen, Kiefer.Wolfowitz, Marcoacostareyes, Mcmlxxxi, Michael Hardy, Myleslong, Pete.Hurd, Pierce.Schiller, Pycoucou, Rinconsoleao, Treeshar, Tribaal, Widefox, 7 anonymous edits Parallel metaheuristic Source: http://en.wikipedia.org/w/index.php?oldid=486062307 Contributors: Enrique.alba1, Falcon8765, Gregbard, Michael Hardy, Mild Bill Hiccup, Paradiseo, 1 anonymous edits There ain't no such thing as a free lunch Source: http://en.wikipedia.org/w/index.php?oldid=485155465 Contributors: A J Luxton, Aaron Schulz, Adashiel, Albmont, Altenmann, Anomalocaris, AnotherSolipsist, Avicennasis, AzaToth, Bdodo1992, Beardo, Bem47, Bevo, Bkkbrad, Brendan Moody, Brion VIBBER, Bunnyhop11, Callmederek, Chuck Marean, Classical geographer, Conical Johnson, Connelly, Dcandeto, Delldot, Denelson83, Dickpenn, DieBuche, Dpbsmith, Eregli bob, Fabulous Creature, Ghosts&empties, Hairy Dude, HalfShadow, Harami2000, Hertz1888, Hu, Iamfscked, InTeGeR13, Inhumandecency, JIP, Jdevine, Jeffq, Jlc46, Jm34harvey, Joerg Kurt Wegner, John Quiggin, John Vandenberg, Kindall, Kingturtle, Kmorozov, LGagnon, Lambiam, Larklight, Llavigne, Lowellian, Lsi, Mandarax, Master shepherd, Mattg82, MissFubar, Mozza, Mxcl, Mzajac, Nedlum, Nervousenergy, NetRolller 3D, Nwbeeson, Ossipewsk, PRRfan, Paladinwannabe2, Patrick, Paul Nollen, Pavel Vozenilek, Pcb21, Peligro, Phil Boswell, Pmanderson, Pol098, PrePressChris, Prezbo, Primadog, Priyadi, Quidam65, R Lowry, Raul654, Reinyday, Rhobite, Richy, Rls, Robert Brockway, Root4(one), Rune.welsh, Rydra Wong, Samwaltz, Sannita, Sardanaphalus, Sasuke Sarutobi, Simon Slavin, Sketch051, Skomorokh, Smallbones, Solace098, Stormwriter, Svetovid, TJRC, Tabletop, Tad Lincoln, The Shreder, TheBigR, Thetorpedodog, ThomHImself, Timwi, Tombomp, Tregoweth, Turidoth, Twas Now, Viriditas, Voidvector, Volcom65, Voretus, Waldir, Walkie, Whcodered, Winhunter, Wk muriithi, Wombletim, Woohookitty, Ww, Wwoods, X7q, YahoKa, Yopienso, Zap Rowsdower, ZimZalaBim, Zmoboros, Ὁ οἶστρος, 139 anonymous edits Fitness landscape Source: http://en.wikipedia.org/w/index.php?oldid=483946666 Contributors: AManWithNoPlan, Adam1128, AdamRetchless, AndrewHowse, Artur adib, Bamkin, BertSeghers, Cmart1, Dmr2, Donarreiskoffer, Dondegroovily, Dougher, Duncharris, HTBrooks, Harizotoh9, I am not a dog, Ian mccarthy, JonHarder, Kae1is, Kilterx, Lauranrg, Lexor, Lightmouse, Michael Hardy, Mohdavary, PAR, Samsara, Shyamal, Simeon, Sodmy, Swpb, Template namespace initialisation script, Tesseract2, Thric3, WAS 4.250, WhiteHatLurker, Wilke, ZayZayEM, 23 anonymous edits Genetic algorithm Source: http://en.wikipedia.org/w/index.php?oldid=486556124 Contributors: "alyosha", .:Ajvol:., 2fargon, A. S. 
Aulakh, A.Nath, AAAAA, Aabs, Acdx, AdamRaizen, Adrianwn, Ahoerstemeier, Ahyeek, Alansohn, Alex Kosorukoff, Algorithms, Aliekens, Allens, AlterMind, Andreas Kaufmann, Andy Dingley, Angrysockhop, Antandrus, AnthonyQBachler, Antonielly, Antzervos, Anubhab91, Arbor, Arkuat, Armchair info guy, Arthur Rubin, Artur adib, Asbestos, AussieScribe, Avinesh (usurped), Avoided, BAxelrod, Baguio, Beetstra, BertSeghers, Bidabadi, Biker Biker, Bjtaylor01, Bobby D. Bryant, Bockbockchicken, Bovineone, Bradka, Brat32, Breeder8128, Brick Thrower, Brinkost, BryanD, Bumbulski, CShistory, CWenger, CardinalDan, Carl Turner, Centrx, Chaosdruid, CharlesGillingham, Chipchap, Chocolateboy, Chopchopwhitey, Chris Capoccia, CloudNine, Cngoulimis, Cnilep, CoderGnome, Conway71, CosineKitty, Cpcjr, Crispin Cooper, Curps, DabMachine, Daryakav, David Eppstein, David Martland, DavidCBryant, DerrickCheng, Destynova, Dionyziz, Diroth, DixonD, Diza, Djhache, Download, Duncharris, Dylan620, Dzkd, Dúnadan, Edin1, Edrucker, Edward, Eleschinski2000, Esotericengineer, Euhapt1, Evercat, Ewlyahoocom, Felsenst, Ferrarisailor, Fheyligh, Francob, Freiberg, Frongle, Furrykef, Gaius Cornelius, Gatator, George100, Giftlite, Giraffedata, Glrx, Goobergunch, Gpel461, GraemeL, Gragus, GregorB, Grein, Grendelkhan, Gretchen Hea, Guang2500, Hellisp, Hike395, Hippietrail, Hu, InverseHypercube, J.delanoy, Janto, Jasonb05, Jasper53, Jcmiras, Jeff3000, Jeffrey Mall, Jetxee, Jitse Niesen, Jkolom, Johnuniq, Jonkerz, Josilber, Jr271, Justin W Smith, Justinaction, Jwdietrich2, Jwoodger, Jyril, K.menin, KaHa242, Kaell, Kane5187, Kcwong5, Kdakin, Keburjor, Kindyroot, Kjells, Kku, Klausikm, Kon michael, KrakatoaKatie, Kuzaar, Kwertii, Kyokpae, LMSchmitt, Larham, Lawrenceb, Lee J Haywood, Leonard^Bloom, Lexor, LieAfterLie, Loudenvier, Ludvig von Hamburger, Lugel74, MER-C, Madcoverboy, Magnus Manske, Malafaya, Male1979, Manu3d, Marco Krohn, Mark Krueger, Mark Renier, Marksale, Massimo Macconi, MattOates, Mctechuciztecatl, Mdd, Metricopolus, Michael Hardy, MikeMayer, Mikeblas, Mikołaj Koziarkiewicz, Mild Bill Hiccup, Mohan1986, Mohdavary, Mpo, Negrulio, Nentrex, Nikai, No1sundevil, Nosophorus, Novablogger, Oleg Alexandrov, Oli Filth, Omicronpersei8, Oneiros, Open2universe, Optimering, Orenburg1, Otolemur crassicaudatus, Papadim.G, Parent5446, Paskornc, Pecorajr, Pegship, Pelotas, PeterStJohn, Pgr94, Phyzome, Plasticup, Postrach, Poweron, Projectstann, Pruetboonma, Purplesword, Qed, QmunkE, Qwertyus, Radagast83, Raduberinde, Ratfox, Raulcleary, Rdelcueto, Redfoxtx, RevRagnarok, Rfl, Riccardopoli, Rjwilmsi, Roberta F., Robma, Ronz, Ruud Koot, SDas, SSZ, ST47, SamuelScarano, Sankar netsoft, Scarpy, Shyamal, Silver hr, Simeon, Simonham, Simpsons contributor, SlackerMom, Smack, Soegoe, Spoon!, SteelSoul, Stefano KALB, Steinsky, Stephenb, Stewartadcock, Stochastics, Stuartyeates, Sunandwind, Sundaryourfriend, Swarmcode, Tailboom22, Tameeria, Tapan bagchi, Tarantulae, Tarret, Taw, Techna1, TempestCA, Temporary-login, Terryn3, Texture, The Epopt, TheAMmollusc, Thomas weise, Thric3, Tide rolls, TimVickers, Timwi, Toncek, Toshke, Tribaal, Tulkolahten, Twelvethirteen, Twexcom, TyrantX, Unixcrab, Unyounyo, Useight, User A1, Utcursch, VernoWhitney, Versus, Vietbio, Vignaux, Vincom2, VladB, Waveguy, William Avery, Wjousts, Xiaojeng, Xn4, Yinon, YouAndMeBabyAintNothingButCamels, Yuanwang200409, Yuejiao Gong, Zawersh, Zwgeem, 597 anonymous edits Toy block Source: http://en.wikipedia.org/w/index.php?oldid=461201397 Contributors: 2015magroan, 21655, Android.en, ArielGold, Bkell, 
CIreland, Davecrosby uk, EC77QY, Enviroboy, ErinHowarth, Gamaliel, Gasheadsteve, Graham87, Hmains, Interchange88, Interiot, Jvhertum, Katharineamy, Malecasta, Nakon, Nethgirb, ONUnicorn, OTB, Picklesauce, Polylerus, Punctured Bicycle, Rajah, Reinyday, Robogun, Siawase, T.woelk, Tariqabjotu, Telescope, TenOfAllTrades, Thrissel, WhatamIdoing, Zzffirst, 竜 龍 竜 龍, 53 anonymous edits Chromosome (genetic algorithm) Source: http://en.wikipedia.org/w/index.php?oldid=481703047 Contributors: Amir saniyan, Brookie, Ceyockey, Chopchopwhitey, Dali, David Cooke, Eequor, Heimstern, Kwertii, Michael Hardy, Mikeblas, Peter Grey, Zawersh, 8 anonymous edits Genetic operator Source: http://en.wikipedia.org/w/index.php?oldid=451611920 Contributors: Artur adib, BertSeghers, CBM, Chopchopwhitey, Diou, Docu, Edward, Eequor, Kwertii, Mark Renier, Missionpyo, NULL, Nick Number, Oleg Alexandrov, Tomaxer, WissensDürster, Yearofthedragon, Zawersh, 3 anonymous edits Crossover (genetic algorithm) Source: http://en.wikipedia.org/w/index.php?oldid=486887380 Contributors: Canterbury Tail, Capricorn42, CharlesGillingham, Chire, Chopchopwhitey, Chris the speller, Costyn, Ebe123, Eequor, Fastfinge, Ficuep, Insanity Incarnate, Julesd, Koala man, Kwertii, Mark Renier, Missionpyo, Mohdavary, Neilc, Otcin, Pelotas, Ph.eyes, Ramana.iiit, Rgarvage, Runtime, Shwetakambare, Simonham, Ssd, Timo, Tulkolahten, WissensDürster, Woodshed, Zawersh, 46 anonymous edits Mutation (genetic algorithm) Source: http://en.wikipedia.org/w/index.php?oldid=481243903 Contributors: BertSeghers, Chaosdruid, Chopchopwhitey, Dionyziz, Eequor, Fastfinge, Ficuep, Flamerecca, Jag123, Jeffrey Henning, Kalzekdor, Mtoxcv, Postcard Cathy, R. S. Shaw, Rgarvage, Sae1962, Shwetakambare, Tasior, Wikid77, YahoKa, 17 anonymous edits Inheritance (genetic algorithm) Source: http://en.wikipedia.org/w/index.php?oldid=470235943 Contributors: Alai, Biochemza, Grafen, Hooperbloob, RJFJR, RedWolf, Ta bu shi da yu, TakuyaMurata, Wapcaplet Selection (genetic algorithm) Source: http://en.wikipedia.org/w/index.php?oldid=481596180 Contributors: Aitter, Alai, Audrey.nemeth, BertSeghers, Evolutionarycomputation, Karada, Mild Bill Hiccup, Oleg Alexandrov, Owen, Pablo-flores, Radagast83, Ruud Koot, Yearofthedragon, 10 anonymous edits Tournament selection Source: http://en.wikipedia.org/w/index.php?oldid=476746359 Contributors: Atreys, Brim, Chiassons, Evolutionarycomputation, J. Finkelstein, Ksastry, Radagast83, Rbrwr, Robthebob, Will Thimbleby, Zawersh, 17 anonymous edits Truncation selection Source: http://en.wikipedia.org/w/index.php?oldid=336284941 Contributors: BertSeghers, Biochemza, Chopchopwhitey, D3, JulesH, Pearle, Ruud Koot, Snoyes, Steinsky, Tigrisek, 4 anonymous edits Fitness proportionate selection Source: http://en.wikipedia.org/w/index.php?oldid=480682378 Contributors: Acdx, AnthonyQBachler, Basu, Chopchopwhitey, Cyan, Evolutionarycomputation, Fuhghettaboutit, Hooperbloob, Jleedev, Magioladitis, Mahanga, Perfectpasta, Peter Grey, Philipppixel, Radagast83, Rmyeid, Saund, Shd, Simon.hatthon, Vinocit, 9 anonymous edits Reward-based selection Source: http://en.wikipedia.org/w/index.php?oldid=476900680 Contributors: Bearcat, Evolutionarycomputation, Felix Folio Secundus, Rjwilmsi, Skullers, TheHappiestCritic Edge recombination operator Source: http://en.wikipedia.org/w/index.php?oldid=466442663 Contributors: Allstarecho, AvicAWB, FF2010, Favonian, Gary King, J. 
Finkelstein, Kizengal, Koala man, Mandolinface, Moggie100, Raunaky, Sadads, TheAMmollusc, Tr00st, 8 anonymous edits Population-based incremental learning Source: http://en.wikipedia.org/w/index.php?oldid=454417448 Contributors: Adoniscik, CoderGnome, Edaeda, Edaeda2, Foobarhoge, FredTschanz, Jitse Niesen, Michael Hardy, Tkdice, WaysToEscape, 8 anonymous edits Defining length Source: http://en.wikipedia.org/w/index.php?oldid=477105522 Contributors: Alai, Alexbateman, Doranchak, JakubHampl, Jyril, Mak-hak, Melaen, Michal Jurosz, NOCHEBUENA, Nick Number, R'n'B, Torzsmokus, Trampled, Where, 5 anonymous edits 225 Article Sources and Contributors Holland's schema theorem Source: http://en.wikipedia.org/w/index.php?oldid=457008231 Contributors: Ajensen, Beetstra, Buenasdiaz, ChrisKalt, Geometry guy, Giftlite, J04n, Linas, Macha, Mthwppt, Oleg Alexandrov, Omnipaedista, SlipperyHippo, Torzsmokus, Uthbrian, 15 anonymous edits Genetic memory (computer science) Source: http://en.wikipedia.org/w/index.php?oldid=372805518 Contributors: Dbachmann, Ora Stendar, RobinK Premature convergence Source: http://en.wikipedia.org/w/index.php?oldid=457670211 Contributors: Chire, Edward, EncycloPetey, Ganymead, Gragus, J3ff, Jitse Niesen, Michael Hardy, Private meta, Tomaxer, 8 anonymous edits Schema (genetic algorithms) Source: http://en.wikipedia.org/w/index.php?oldid=466008377 Contributors: Allens, Arthena, Boing! said Zebedee, Chaosdruid, Epistemenical, J04n, Linas, The Fish, Torzsmokus, 6 anonymous edits Fitness function Source: http://en.wikipedia.org/w/index.php?oldid=439826924 Contributors: Alex Kosorukoff, Andreas Kaufmann, Artur adib, BertSeghers, Ihsankhairir, Ingolfson, Jitse Niesen, Jiuguang Wang, Kwertii, MarSch, Markus.Waibel, Mohdavary, Oleg Alexandrov, Piano non troppo, Rizzoj, S3000, Stern, TheAMmollusc, TubularWorld, VKokielov, 27 anonymous edits Black box Source: http://en.wikipedia.org/w/index.php?oldid=485144765 Contributors: AdamWro, Adoniscik, Alatro, Alex756, Alexisapple, Alhen, Altenmann, Amire80, Andreas.sta, Antonio Lopez, Badgernet, Benhocking, Benmiller314, Billgordon1099, BillyH, Blenxi, BrokenSegue, Bryan Derksen, Burns28, Bxzhang88, Can't sleep, clown will eat me, Chanlyn, Clangin, Conversion script, Correogsk, Curps, D, Daniel.Cardenas, DerHexer, Dgw, Dodger67, Drmies, Duja, DylanW, Edgar181, Espoo, Feministo, Figaro, Fosnez, Frap, Frau Holle, Garth M, Gimboid13, Glenn, Goatasaur, Grammaticus Repairo, Gronky, Hidro, Hulten, Ike9898, Inwind, IronGargoyle, Ivar Y, J.delanoy, JMSwtlk, Jebus989, Jjiijjii, Jjron, Johnuniq, Jugander, Jwrosenzweig, KeithH, Kskk2, Ksyrie, Kusluj, Kuzaar, L Kensington, LC, Lekoman, Lissajous, Lockesdonkey, Lupinelawyer, Marek69, Mark Christensen, Mausy5043, Mdd, Meggar, Metahacker, Michael Hardy, MrOllie, Mrsocial99mfine, Mstrehlke, Mtnerd, N5iln, Naohiro19, Neve224, Nick Green, Nihiltres, Nmenachemson, Ohnoitsjamie, OlEnglish, Oleg Alexandrov, Oran, Parishan, Peter Fleet, PieterJanR, Piledhigheranddeeper, Psb777, Purple omlet, R'n'B, RTC, Ravidreams, Ray G. 
Van De Walker, Ray Van De Walker, RazorXX8, Reallybored999, RucasHost, Rumping, Rwestera, Rz1115, SD6-Agent, Schoen, ScottMHoward, Sgtjallen, Shadowjams, Sharon08tam, Slambo, Smily, Smurrayinchester, Snowmanradio, Sopranosmob781, Spinningspark, StaticGull, TMC1221, Tarquin, Template namespace initialisation script, The Anome, Thinktdub, Thoglette, Tibinomen123, Tide rolls, Tobias Hoevekamp, Tomo, Treesmill, Tregoweth, Tubeyes, Unint, Van helsing, Vanished user 39948282, Vchadaga, Wapcaplet, WhisperToMe, Whiteghost.ink, XJamRastafire, Xerxes314, Xin0427, Zacatecnik, Zhou Yu, דולב, 158 anonymous edits Black box theory Source: http://en.wikipedia.org/w/index.php?oldid=476402791 Contributors: Alfinal, Amire80, Anarchia, BD2412, Bryan Derksen, Cjmclark, Dendodge, Derekchan831, Drift chambers, Drilnoth, Fyyer, Gregbard, Iridescent, James086, Jjron, Katharineamy, Kenji000, Linas, MCTales, Mandarax, Mdd, Neelix, Osarius, Snoyes, Susan Elisabeth McDonald, Treesmill, Viriditas, Zhen Lin, Zorblek, 26 anonymous edits Fitness approximation Source: http://en.wikipedia.org/w/index.php?oldid=475913779 Contributors: Bmiller98, Dhatfield, Drunauthorized, Jitse Niesen, LilHelpa, Michael Hardy, Mohdavary, Oliverm1983, Rich Farmbrough, TLPA2004, 24 anonymous edits Effective fitness Source: http://en.wikipedia.org/w/index.php?oldid=374309381 Contributors: AJCham, Cyrius, Falcon8765, Muchness, R'n'B, Uncle G, 3 anonymous edits Speciation (genetic algorithm) Source: http://en.wikipedia.org/w/index.php?oldid=478543716 Contributors: DavidWBrooks, J04n, Pascal.Tesson, R'n'B, Ridernyc, Tailboom22, TheAMmollusc, Uther Dhoul, Woodshed, 1 anonymous edits Genetic representation Source: http://en.wikipedia.org/w/index.php?oldid=416498614 Contributors: Alex Kosorukoff, Annonymous3456543, AvicAWB, Bobo192, Gurch, Jleedev, Kri, Mark Renier, Michal Jurosz, WAS 4.250, 6 anonymous edits Stochastic universal sampling Source: http://en.wikipedia.org/w/index.php?oldid=477453298 Contributors: Adrianwn, Evolutionarycomputation, J. Finkelstein, Melcombe, Simon.hatthon, 3 anonymous edits Quality control and genetic algorithms Source: http://en.wikipedia.org/w/index.php?oldid=461778149 Contributors: Aristides Hatjimihail, CharlesGillingham, Chase me ladies, I'm the Cavalry, Freek Verkerk, Hongooi, J04n, King of Hearts, Mdd, Michael Hardy, R'n'B, ShelfSkewed, Sigma 7, YouAndMeBabyAintNothingButCamels, 9 anonymous edits Human-based genetic algorithm Source: http://en.wikipedia.org/w/index.php?oldid=458651034 Contributors: Alex Kosorukoff, CharlesGillingham, Chimaeridae, DXBari, DanMS, DerrickCheng, Dorftrottel, Ettrig, Mark Renier, Michael Allan, Nunh-huh, Silvestre Zabala, 17 anonymous edits Interactive evolutionary computation Source: http://en.wikipedia.org/w/index.php?oldid=458673886 Contributors: Alex Kosorukoff, Bobby D.
Bryant, Borderiesmarkman432, DXBari, DerrickCheng, Duncharris, FiP, Gaius Cornelius, InverseHypercube, Kamitsaha, Lordvolton, Mattw2, Michael Allan, Oleg Alexandrov, Peterdjones, Radagast83, Ruud Koot, Spot, TheProject, 24 anonymous edits Genetic programming Source: http://en.wikipedia.org/w/index.php?oldid=487139985 Contributors: 139.57.232.xxx, 216.60.221.xxx, Ahoerstemeier, Alaa safeef, Algorithms, Allens, Andreas Kaufmann, Aris Katsaris, Arkenflame, Artur adib, BAxelrod, Barek, BenjaminTsai, Boffob, Brat32, BrokenSegue, Bryan Derksen, Ceran, CharlesGillingham, Chchen, Chuffy, Classicalecon, Cmbay, Conversion script, Crispin Cooper, Cyde, David Martland, DeLarge, Diego Moya, Don4of4, Duncharris, EminNew, Especialist, Farthur2, Feijai, Firegnome, Furrykef, Golmschenk, Gragus, Guaka, Gwax, Halhen, Hari, Ianboggs, Ilent2, Jamesmichaelmcdermott, Janet Davis, Jleedev, Joel7687, Jorge.maturana, Karlyoxall, Klausikm, Klemen Kocjancic, Knomegnome, Kri, Lexor, Liao, Linas, Lmatt, Mahlon, Mark Renier, MartijnBodewes, Mdd, Mentifisto, Michal Jurosz, Micklin, Minesweeper, Mohdavary, Mr, Mrberryman, Nentrex, NicMcPhee, Nuwanda, ParadoxGreen, Pengo, Pgan002, PowerMacX, Riccardopoli, Roboo.jack, Rogerfgay, RozanovFF, Sergey539, Snowscorpio, Soler97, Squillero, Stewartadcock, Tarantulae, Teja.Nanduri, Terrycojones, Thattommyguy, Themfromspace, Thomas weise, Timwi, TittoAssini, Tualha, Tudlio, Uncoolbob, Waldir, Wavelength, Wrp103, YetAnotherMatt, 296 anonymous edits Gene expression programming Source: http://en.wikipedia.org/w/index.php?oldid=392404486 Contributors: Bob0the0mighty, Cholling, Destynova, Exabyte, Frazzydee, James Travis, Mark Renier, Michal Jurosz, Phoebe, Torst, WurmWoode, 25 anonymous edits Grammatical evolution Source: http://en.wikipedia.org/w/index.php?oldid=426090190 Contributors: Conorlime, Edward, Harrigan, Johnmarksuave, PigFlu Oink, Polydeuces, Vernanimalcula, Whenning, 18 anonymous edits Grammar induction Source: http://en.wikipedia.org/w/index.php?oldid=457670509 Contributors: 1ForTheMoney, Aabs, Antonielly, Bobblehead, Chire, Delirium, Dfass, Erxnmedia, Gregbard, Hiihammuk, Hukkinen, Jim Horning, Koavf, KoenDelaere, MCiura, Mgalle, NTiOzymandias, Rizzardi, Rjwilmsi, Took, Tremilux, 5 anonymous edits Java Grammatical Evolution Source: http://en.wikipedia.org/w/index.php?oldid=471112319 Contributors: Aragorngr, RHaworth, Racklever, 6 anonymous edits Linear genetic programming Source: http://en.wikipedia.org/w/index.php?oldid=487148986 Contributors: Academic Challenger, Alai, Algorithms, Artur adib, Lineslarge, Marudubshinki, Master Mar, Michal Jurosz, Mihoshi, Oblivious, Riccardopoli, Rogerfgay, TheParanoidOne, Yonir, Ческий, 17 anonymous edits Evolutionary programming Source: http://en.wikipedia.org/w/index.php?oldid=471487669 Contributors: Alan Liefting, Algorithms, CharlesGillingham, Customline, Dq1, Gadig, Hirsutism, Jitse Niesen, Karada, Melaen, Mira, Pgr94, Pooven, Psb777, Samsara, Sergey539, Sm8900, Soho123, Tobym, Tsaitgaist, 18 anonymous edits Gaussian adaptation Source: http://en.wikipedia.org/w/index.php?oldid=454455774 Contributors: Alastair Haines, Avicennasis, Centrx, Colonies Chris, CommonsDelinker, DanielCD, Guy Macon, Hrafn, Jitse Niesen, Kjells, Lambiam, Mattisse, Michael Devore, Michael Hardy, Obscurans, Plrk, SheepNotGoats, Sintaku, Sjö, Vampireesq, 18 anonymous edits Differential evolution Source: http://en.wikipedia.org/w/index.php?oldid=482162977 Contributors: Alkarex, Aminrahimian, Andreas Kaufmann, Athaenara, Calltech, Chipchap, D14C050, Diego Moya, 
Discospinster, Dvunkannon, Fell.inchoate, Guroadrunner, Hongooi, J.A. Vital, Jamesontai, Jasonb05, Jorge.maturana, K.menin, Kjells, KrakatoaKatie, Lilingxi, Michael Hardy, MidgleyDJ, Mishrasknehu, MrOllie, NawlinWiki, Oleg Alexandrov, Optimering, Ph.eyes, R'n'B, RDBury, Rich Farmbrough, Rjwilmsi, Robert K S, Ruud Koot, Wmpearl, 54 anonymous edits Particle swarm optimization Source: http://en.wikipedia.org/w/index.php?oldid=487144595 Contributors: AdrianoCunha, Amgine, Anne Koziolek, Armehrabian, Bdonckel, Becritical, BenFrantzDale, Betamoo, Blake-, Blanchardb, Bolufe, Bshahul44, CharlesGillingham, Chipchap, CoderGnome, Cybercobra, Daryakav, Datakid, Dbratton, Diego Moya, DustinFreeman, Dzkd, Ehheh, Ender.ozcan, Enzzef, Epipelagic, Foma84, George I. Evers, Gfoidl, Giftlite, Hgkamath, Hike395, Horndude77, Huabdo, Jalsck, Jder, Jitse Niesen, Jiuguang Wang, K.menin, Khafanus, Kingpin13, KrakatoaKatie, Lexor, Lysy, MClerc, Ma8thew, Mange01, Mcld, Mexy ok, Michael Hardy, Mild Bill Hiccup, Mishrasknehu, MrOllie, MuffledThud, Murilo.pontes, MuthuKutty, Mxn, My wing hk, NawlinWiki, Neomagus00, NerdyNSK, Oleg Alexandrov, Oli Filth, Optimering, PS., Prometheum, Rich Farmbrough, Rjwilmsi, Ronaldo, Ronz, Ruud Koot, Saeed.Veradi, Sanremofilo, Saveur, Seamustara, Seb az86556, Sepreece, Sharkyangliu, Slicing, Sliders06, Sriramvijay124, Storkk, Swarming, Swiftly, Tjh22, Unknown, Waldir, Wgao03, Whenning, Wingman4l7, YakbutterT, Younessabdussalam, Yuejiao Gong, Zhanapollo, Σμήνος, ﺳﺮﺏ, 190 anonymous edits Ant colony optimization algorithms Source: http://en.wikipedia.org/w/index.php?oldid=487053465 Contributors: 4th-otaku, AllenJB, Altenmann, Amossin, Andrewpmk, Asbestos, BenFrantzDale, BrotherE, Bsod2, Bsrinath, CMG, CalumH93, Cburnett, Cobi, Damzam, Daryakav, Dcoetzee, Der Golem, Diego Moya, Dl2653, Dzkd, Editdorigo, Edokter, Enochlau, Epipelagic, Explicit, Favonian, Feraudyh, Fubar Obfusco, Gnewf, Gretchen Hea, Gueleri, Haakon, Halberdo, IDSIAupdate, Itub, J ham3, J04n, Jaardon, Jamie King, Jbinder, Jonkerz, Jpgordon, Kopophex, KrakatoaKatie, Lasta, Lawrenceb, Leonardo61, LiDaobing, MSchlueter, Maarten van Emden, Magioladitis, Mattbr, Matthewfallshaw, Maximus Rex, Mdorigo, Melcombe, Mernst, Michael Hardy, Miguel Andrade, Mindmatrix, Mmanfrin73, MoyMan, Mrwojo, NerdyNSK, Nickg, NicoMon, Nojhan, Oleg Alexandrov, Omegatron, PHaze, Paul August, Pepanek Nezdara, Petebutt, Philip Trueman, Pratik.mallya, Praveenv253, Pyxzer, Quadrescence, Quiddity, Ratchet11111, Redgolpe, Retodon8, Rich Farmbrough, Richardsonlima, Ritchy, Rjwilmsi, Ronz, Royote, Runtime, SWAdair, Saeed.Veradi, Santiperez, Scott5834, Sdornan, Senarclens, SiobhanHansa, Smitty1337, Speicus, Spiritia, SunCreator, Swagato Barman Roy, Tabletop, Tamfang, Tango.ta, Tholme, Thumperward, Tomaxer, Trylks, Tupolev154, Vberger, Ventania, Vprashanth87, Welsh, Whenning, WikHead, Woohookitty, Xanzzibar, ZILIANGdotME, Zwgeem, 186 anonymous edits Artificial bee colony algorithm Source: http://en.wikipedia.org/w/index.php?oldid=477636059 Contributors: Andreas Kaufmann, Bahriyebasturk, Buddy23Lee, Courcelles, Diego Moya, Eugenecheung, Fluffernutter, JamesR, Jiuguang Wang, K.menin, Michael Hardy, Minimac, Rjwilmsi, Smooth O, Tony1, Truthanado, WRK, WikHead, 24 anonymous edits Evolution strategy Source: http://en.wikipedia.org/w/index.php?oldid=479795525 Contributors: Alai, Alex Kosorukoff, Algorithms, Alireza.mirian, An ES Expert, Dhollm, Gjacquenot, JHunterJ, Jeodesic, Jjmerelo, Lectonar, Lh389, MattOates, Melcombe, Michael
Hardy, Muutze, Nosophorus, Oleg Alexandrov, Risk one, Rls, Ronz, Sannse, Sergey539, Skapur, TenPoundHammer, Tomschaul, Txrazy, 49 anonymous edits Evolution window Source: http://en.wikipedia.org/w/index.php?oldid=337059158 Contributors: Davewild, Hongooi, Nosophorus, Zawersh, 7 anonymous edits CMA-ES Source: http://en.wikipedia.org/w/index.php?oldid=481310018 Contributors: CBM, Dhatfield, Edward, Elonka, Frank.Schulz, Jamesontai, Jitse Niesen, K.menin, Malcolma, Mandarax, Mild Bill Hiccup, Obscurans, Optimering, Rjwilmsi, Sentewolf, Tangonacht, Thiseye, Tomschaul, 359 anonymous edits Cultural algorithm Source: http://en.wikipedia.org/w/index.php?oldid=478898204 Contributors: Aitias, CommonsDelinker, Discospinster, EagleFan, Jitse Niesen, Ludvig von Hamburger, Mandarax, Mark Renier, Michael Hardy, Motevallian, Neelix, RobinK, Tabletop, TyIzaeL, Zwgeem, 32 anonymous edits Learning classifier system Source: http://en.wikipedia.org/w/index.php?oldid=482751468 Contributors: Binksternet, Chire, Cholling, D6, Darkmeerkat, DavidLevinson, Docu, Docurbs, Frencheigh, Hopeiamfine, Joe Wreschnig, Loadquo, MikiWiki, Reedy, Toujoursmoi, Zearin, 17 anonymous edits Memetic algorithm Source: http://en.wikipedia.org/w/index.php?oldid=480962739 Contributors: Alai, Alex.g, Bikeable, D6, Diego Moya, DoriSmith, Elkman, Ender.ozcan, Jder, Josedavid, Jyril, Kamruladfa, Macha, Mark Arsten, Michael Hardy, Moonriddengirl, Nihola, Oyewsoon, Rjwilmsi, SeineRiver, Timekeeper77, Tonyfaull, Werdna, WikHead, Wingman4l7, Xtyx.r, 23 anonymous edits Meta-optimization Source: http://en.wikipedia.org/w/index.php?oldid=465609610 Contributors: Kiefer.Wolfowitz, Michael Hardy, MrOllie, Optimering, Ruud Koot, This, that and the other, Will Beback Auto Cellular evolutionary algorithm Source: http://en.wikipedia.org/w/index.php?oldid=470819128 Contributors: Bearcat, Beeblebrox, Enrique.alba1, Katharineamy, Khazar, Shashwat986, Thompson.matthew Cellular automaton Source: http://en.wikipedia.org/w/index.php?oldid=485642337 Contributors: -Ril-, 524, ACW, Acidburn24m, AdRock, Agora2010, Akramm1, Alexwg, Allister MacLeod, Alpha Omicron, Angela, AnonEMouse, Anonymous Dissident, Argon233, Asmeurer, Avaya1, Axd, AxelBoldt, B.huseini, Baccyak4H, Balsarxml, Banus, Bearian, Beddowve, Beeblebrox, Benjah-bmm27, Bento00, Bevo, Bhumiya, BorysB, Bprentice, Brain, Bryan Derksen, Caileagleisg, Calwiki, CharlesC, Chmod007, Chopchopwhitey, Christian Kreibich, Chuckwolber, Ckatz, Crazilla, Cstheoryguy, Curps, DVdm, Dalf, Dave Feldman, David Eppstein, Dawnseeker2000, Dcornforth, Dekart, Deltabeignet, Dhushara, Dmcq, Dra, Dysprosia, EagleFan, Edward Z. 
Yang, Elektron, EmreDuran, Erauch, Eric119, Error, Evil saltine, Ezubaric, Felicity Knife, Ferkel, FerrenMacI, Froese, GSM83, Geneffects, Giftlite, Gioto, Gleishma, Gragus, Graham87, GregorB, Gthen, Guanaco, HairyFotr, Hannes Eder, Headbomb, Hephaestos, Hfastedge, Hillgentleman, Hiner, Hmonroe, Hope09, I do not exist, Ideogram, Ilmari Karonen, Imroy, InverseHypercube, Iridescent, Iseeaboar, Iztok.jeras, J.delanoy, JaGa, Jarble, Jasper Chua, Jdandr2, Jlopez1967, JocK, Joeyramoney, Jogloran, Jon Awbrey, Jonkerz, Jose Icaza, Joseph Myers, JuliusCarver, Justin W Smith, K-UNIT, Kaini, Karlscherer3, Kb, Keenan Pepper, Kiefer.Wolfowitz, Kieff, Kizor, Kku, Kneb, Kotasik, Kyber, Kzollman, LC, Laesod, Lamro, Lgallindo, Lightmouse, Lpdurocher, LunchboxGuy, Mahlon, Mandalaschmandala, MarSch, Marasmusine, Marcus Wilkinson, Mattisse, Mbaudier, Metric, Mgiganteus1, Michael Hardy, Mihai Damian, Mosiah, MrOllie, Mudd1, MuthuKutty, Mydogtrouble, NAHID, Nakon, Nekura, NickCT, Ninly, Nippashish, Oliviersc2, On you again, Orborde, Oubiwann, P0lyglut, PEHowland, Pasicles, Pcorteen, Peak Freak, Perceval, Phaedriel, Pi is 3.14159, PierreAbbat, Pixelface, Pleasantville, Pygy, Quuxplusone, R.e.s., RDBury, Radagast83, Raven4x4x, Requestion, RexNL, Rjwilmsi, Robin klein, RyanB88, Sadi Carnot, Sam Tobar, Samohyl Jan, Sbp, ScAvenger, Schneelocke, Selket, Setoodehs, Shoemaker's Holiday, Smjg, Spectrogram, Srleffler, Sumanafsu, SunCreator, Svrist, The Temple Of Chuck Norris, Throwaway85, Tijfo098, TittoAssini, Tobias Bergemann, Torcini, Tropylium, Ummit, Versus22, Visor, Warrado, Watcher, Watertree, Wavelength, Welsh, Wik, William R. Buckley, Wolfpax50, Woohookitty, XJamRastafire, Xerophytes, Xihr, Yonkeltron, Yugsdrawkcabeht, ZeroOne, Zoicon5, Zom-B, Zorbid, 341 anonymous edits Artificial immune system Source: http://en.wikipedia.org/w/index.php?oldid=480634937 Contributors: Alai, Aux1496, Betacommand, BioWikiEditor, CBM, CRGreathouse, Calltech, Canon, CharlesGillingham, Chris the speller, Dfletter, Hadal, Hiko-seijuro, Jamelan, Jasonb05, Jeff Kephart, Jitse Niesen, Jtimmis, K.menin, KrakatoaKatie, Kumioko, Leonardo61, Lisilec, MattOates, Michal Jurosz, Moxon, Mpo, MrOllie, Mrwojo, Narasimhanator, Nicosiagiuseppe, Ravn, Retired username, Rjwilmsi, Sietse Snel, SimonP, Tevildo, That Guy, From That Show!, Wavelength, Ymei, Мих1991, 72 anonymous edits Evolutionary multi-modal optimization Source: http://en.wikipedia.org/w/index.php?oldid=473395850 Contributors: Autoerrant, Chire, Kamitsaha, Kcwong5, Matt5091, Michael Hardy, MrOllie, Scata79, 6 anonymous edits Evolutionary music Source: http://en.wikipedia.org/w/index.php?oldid=484187691 Contributors: Crystallina, Dfwedit, Iridescent, Kvng, LittleHow, Oo7565, Rainwarrior, Skittleys, Uncoolbob, 19 anonymous edits Coevolution Source: http://en.wikipedia.org/w/index.php?oldid=487107884 Contributors: 12tsheaffer, Aliekens, Andycjp, Anþony, Artemis Gray, Avoided, AzureCitizen, Bornslippy, Bourgaeana, BrownHairedGirl, CDN99, Cadiomals, Chopchopwhitey, Cohesion, ConCompS, Cremepuff222, DARTH SIDIOUS 2, Danger, Dave souza, Dhess13, El C, Emw, Espresso Addict, Etan J. 
Tal, Extremophile, Extro, Favonian, Fcummins, Flammifer, Gaius Cornelius, Goethean, Harizotoh9, JHunterJ, Jef-Infojef, JimR, Joan-of-arc, Johnuniq, KYPark, Kaiwhakahaere, Kbodouhi, Kdakin, Kotasik, LilHelpa, Look2See1, M rickabaugh, MER-C, Macdonald-ross, Mccready, Mexipedium xerophyticum, Midgley, MilitaryTarget, Momo san, Morel, Nathanielvirgo, Nightmare The Incarnal, Odinbolt, Plastikspork, Plumpurple, Polaron, Rhetth, Rich Farmbrough, Richard001, Rick Block, Rjwilmsi, Ruakh, Sannab, Sawahlstrom, Scientizzle, Smsarmad, Srbauer, Stfg, Succulentpope, TedPavlic, Thehelpfulone, Tijfo098, Tommyjs, Uncle Dick, Vanished user, Velella, Vicki Rosenzweig, Vicpeters, Victor falk, Viriditas, Vlmastra, Vsmith, Wetman, WikHead, Wlodzimierz, Xiaowei JIANG, Z10x, 124 anonymous edits Evolutionary art Source: http://en.wikipedia.org/w/index.php?oldid=453236792 Contributors: Andrewborrell, Biggiebistro, Bokaratom, Darlene4, Dlrohrer2003, Fheyligh, Haakon, JiFish, JmountZedZed, JockoJonson, KAtremer, Marudubshinki, Simonham, Spot, Svea Kollavainen, Timendres, Uncoolbob, Wolfsheep113, Yworo, ZeroOne, 50 anonymous edits Artificial life Source: http://en.wikipedia.org/w/index.php?oldid=485074045 Contributors: -ts-, AAAAA, Ahyeek, Ancheta Wis, Aniu, Barbalet, BatteryIncluded, Bcameron54, Beetstra, BenRayfield, BloodGrapefruit, Bobby D. Bryant, Bofoc Tagar, Brion VIBBER, Bryan Derksen, CLW, CatherineMunro, Cdocrun, Cedric71, Chaos, CharlesGillingham, Chris55, Ckatz, Cmdrjameson, Cough, Dan Polansky, David Latapie, DavidCary, Davidcofer73, Davidhorman, Dbachmann, Demomoer, DerBorg, DerHexer, Dggreen, Discospinster, Draeco, Drpickem, Ds13, EagleOne, El C, Emperorbma, Erauch, Eric Catoire, Erikwithaknotac, Extro, Ferkel, Fheyligh, ForestDim, Francis Tyers, Franksbnetwork, Gaius Cornelius, Graham87, GreenReaper, Guaka, Hajor, Heron, Hingfat, Husky, In ictu oculi, Iota, Ivan Štambuk, JDspeeder1, JLaTondre, Jackobogger, James pic, JiFish, JimmyShelter, Jjmerelo, Joel7687, Jon Awbrey, Jwdietrich2, Kbh3rd, Kenrinaldo, Kenstauffer, Khazar, Kimiko, Kwekubo, Levil, Lexor, Liam Skoda, Ligulem, Lordvolton, MKFI, Macrakis, MakeRocketGoNow, Marasmusine, Markus.Waibel, MattBan, MattOates, Matthew Stannard, Mav, Mdd, Melongrower, Michal Jurosz, Mikael Häggström, Milkbreath, MisfitToys, MrDolomite, MrOllie, Myles325a, N16HTM4R3, NeilN, Newsmare, Ngb, Nick, Numsgil, Oddity-, Oliviermichel, Omermar, Onorem, Peruvianllama, Phoenixthebird, Pietro speroni, Pinar, Pjacobi, Predictor, Psb777, Quuxplusone, RainbowCrane, Rankiri, RashmiPatel, Rfl, Rjwilmsi, Ronz, RoyBoy, SDC, SaTaMaS, SallyForth123, Sam, Sam Hocevar, Samsara, Seth Manapio, Sina2, Skinsmoke, Slark, Snleo, Spacemonster, Spamburgler, SpikeZOM, Squidonius, Stephenchou0722, Stewartadcock, Svea Kollavainen, Tailpig, Tarcieri, Taxisfolder, Tesfatsion, The Anome, The Transhumanist, TheCoffee, Themfromspace, Thsgrn, Timwi, Tobias Bergemann, Tommy2010, Trovatore, Truthnlove, Wbm1058, Why Not A Duck, Wik, Wilke, William Caputo, William R. 
Buckley, Zach Winkler, Zeimusu, 190 anonymous edits Machine learning Source: http://en.wikipedia.org/w/index.php?oldid=486816105 Contributors: APH, AXRL, Aaron Kauppi, Aaronbrick, Aceituno, Addingrefs, Adiel, Adoniscik, Ahoerstemeier, Ahyeek, Aiwing, AnAj, André P Ricardo, Anubhab91, Arcenciel, Arvindn, Ataulf, Autologin, BD2412, BMF81, Baguasquirrel, Beetstra, BenKovitz, BertSeghers, Biochaos, BlaiseFEgan, Blaz.zupan, Bonadea, Boxplot, Bumbulski, Buridan, Businessman332211, CWenger, Calltech, Candace Gillhoolley, Casia wyq, Celendin, Centrx, Cfallin, ChangChienFu, ChaoticLogic, CharlesGillingham, Chire, Chriblo, Chris the speller, Chrisoneall, Clemwang, Clickey, Cmbishop, CommodiCast, Crasshopper, Ctacmo, CultureDrone, Cvdwalt, Damienfrancois, Dana2020, Dancter, Darnelr, DasAllFolks, Dave Runger, DaveWF, DavidCBryant, Debejyo, Debora.riu, Defza, Delirium, Denoir, Devantheryv, Dicklyon, Dondegroovily, Dsilver, Dzkd, Edouard.darchimbaud, Essjay, Evansad, Examtester, Fabiform, FidesLT, Fram, Funandtrvl, Furrykef, Gareth Jones, Gene s, Genius002, Giftlite, GordonRoss, Grafen, Graytay, Gtfjbl, Haham hanuka, Helwr, Hike395, Hut 8.5, Innohead, Intgr, InverseHypercube, IradBG, Ishq2011, J04n, James Kidd, Jbmurray, Jcautilli, Jdizzle123, Jim15936, JimmyShelter, Jmartinezot, Joehms22, Joerg Kurt Wegner, Jojit fb, JonHarder, Jrennie, Jrljrl, Jroudh, Jwojt, Jyoshimi, KYN, Keefaas, KellyCoinGuy, Khalid hassani, Kinimod, Kithira, Kku, KnightRider, Kumioko, Kyhui, L Kensington, Lars Washington, Lawrence87, Levin, Lisasolomonsalford, LittleBenW, Liuyipei, LokiClock, Lordvolton, Lovok Sovok, MTJM, Masatran, Mdd, Mereda, Michael Hardy, Misterwindupbird, Mneser, Moorejh, Mostafa mahdieh, Movado73, MrOllie, Mxn, Nesbit, Netalarm, Nk, NotARusski, Nowozin, Ohandyya, Ohnoitsjamie, Pebkac, Penguinbroker, Peterdjones, Pgr94, Philpraxis, Piano non troppo, Pintaio, Plehn, Pmbhagat, Pranjic973, Prari, Predictor, Proffviktor, PseudoOne, Quebec99, QuickUkie, Quintopia, Qwertyus, RJASE1, Rajah, Ralf Klinkenberg, Redgecko, RexSurvey, Rjwilmsi, Robiminer, Ronz, Ruud Koot, Ryszard Michalski, Salih, Scigrex14, Scorpion451, Seabhcan, Seaphoto, Sebastjanmm, Shinosin, Shirik, Shizhao, Silvonen, Sina2, Smorsy, Soultaco, Spiral5800, Srinivasasha, StaticGull, Stephen Turner, Superbacana, Swordsmankirby, Tedickey, Tillander, Topbanana, Trondtr, Ulugen, Utcursch, VKokielov, Velblod, Vilapi, Vivohobson, Vsweiner, WMod-NS, Webidiap, WhatWasDone, Wht43, Why Not A Duck, Wikinacious, WilliamSewell, Winnerdy, WinterSpw, Wjbean, Wrdieter, Yoshua.Bengio, YrPolishUncle, Yworo, ZeroOne, Zosoin, Иъ Лю Ха, 330 anonymous edits Evolvable hardware Source: http://en.wikipedia.org/w/index.php?oldid=484709551 Contributors: Crispin Cooper, Foobar, Hooperbloob, Lordvolton, Luzian, Mdd, Michael Hardy, MikeCombrink, Nabarry, Nicklott, Rajpaj, Rl, Sejomagno, Thekingofspain, Wbm1058, 32 anonymous edits NEAT Particles Source: http://en.wikipedia.org/w/index.php?oldid=429948708 Contributors: JockoJonson, Rjwilmsi, 5 anonymous edits Image Sources, Licenses and Contributors File:MaximumParaboloid.png Source: http://en.wikipedia.org/w/index.php?title=File:MaximumParaboloid.png License: GNU Free Documentation License Contributors: Original uploader was Sam Derbyshire at en.wikipedia Image:Nonlinear programming jaredwf.png Source: http://en.wikipedia.org/w/index.php?title=File:Nonlinear_programming_jaredwf.png License: Public Domain Contributors: Jaredwf Image:Nonlinear
programming 3D.svg Source: http://en.wikipedia.org/w/index.php?title=File:Nonlinear_programming_3D.svg License: Public Domain Contributors: derivative work: McSush (talk) Nonlinear_programming_3D_jaredwf.png: Jaredwf Image:TSP Deutschland 3.png Source: http://en.wikipedia.org/w/index.php?title=File:TSP_Deutschland_3.png License: Public Domain Contributors: Original uploader was Kapitän Nemo at de.wikipedia. Later version(s) were uploaded by MrMonstar at de.wikipedia. Image:William Rowan Hamilton painting.jpg Source: http://en.wikipedia.org/w/index.php?title=File:William_Rowan_Hamilton_painting.jpg License: Public Domain Contributors: Quibik Image:Weighted K4.svg Source: http://en.wikipedia.org/w/index.php?title=File:Weighted_K4.svg License: Creative Commons Attribution-Sharealike 2.5 Contributors: Sdo Image:Aco TSP.svg Source: http://en.wikipedia.org/w/index.php?title=File:Aco_TSP.svg License: Creative Commons Attribution-ShareAlike 3.0 Unported Contributors: User:Nojhan, User:Nojhan Image:Pareto_Efficient_Frontier_for_the_Markowitz_Portfolio_selection_problem..png Source: http://en.wikipedia.org/w/index.php?title=File:Pareto_Efficient_Frontier_for_the_Markowitz_Portfolio_selection_problem..png License: Creative Commons Attribution-Sharealike 3.0 Contributors: User:Marcuswikipedian File:Production Possibilities Frontier Curve Pareto.svg.png Source: http://en.wikipedia.org/w/index.php?title=File:Production_Possibilities_Frontier_Curve_Pareto.svg.png License: Creative Commons Attribution-Sharealike 3.0 Contributors: Jarry1250, Joxemai, Sheitan Image:Front pareto.svg Source: http://en.wikipedia.org/w/index.php?title=File:Front_pareto.svg License: Creative Commons Attribution-ShareAlike 3.0 Unported Contributors: User:Nojhan, User:Nojhan File:Parallel_models.png Source: http://en.wikipedia.org/w/index.php?title=File:Parallel_models.png License: Creative Commons Attribution-Sharealike 3.0 Contributors: User:Enrique.alba1 Image:fitness-landscape-cartoon.png Source: http://en.wikipedia.org/w/index.php?title=File:Fitness-landscape-cartoon.png License: Public Domain Contributors: User:Wilke Image:Toyblocks.JPG Source: http://en.wikipedia.org/w/index.php?title=File:Toyblocks.JPG License: Public Domain Contributors: Briho, Stilfehler File:Eakins, Baby at Play 1876.jpg Source: http://en.wikipedia.org/w/index.php?title=File:Eakins,_Baby_at_Play_1876.jpg License: Public Domain Contributors: User:Picasa Review Bot Image:SinglePointCrossover.png Source: http://en.wikipedia.org/w/index.php?title=File:SinglePointCrossover.png License: GNU Free Documentation License Contributors: RedWolf, Rgarvage Image:TwoPointCrossover.png Source: http://en.wikipedia.org/w/index.php?title=File:TwoPointCrossover.png License: GNU Free Documentation License Contributors: Quadell, Rgarvage Image:CutSpliceCrossover.png Source: http://en.wikipedia.org/w/index.php?title=File:CutSpliceCrossover.png License: GNU Free Documentation License Contributors: RedWolf, Rgarvage File:UniformCrossover.png Source: http://en.wikipedia.org/w/index.php?title=File:UniformCrossover.png License: GNU Free Documentation License Contributors: Missionpyo Image:Fitness proportionate selection example.png Source: http://en.wikipedia.org/w/index.php?title=File:Fitness_proportionate_selection_example.png License: Creative Commons Attribution-Sharealike 2.5 Contributors: Lukipuk, Simon.Hatthon Image:Genetic ero crossover.svg Source: http://en.wikipedia.org/w/index.php?title=File:Genetic_ero_crossover.svg License: Public Domain Contributors: GregManninLB, 
Koala man Image:Genetic indirect binary crossover.svg Source: http://en.wikipedia.org/w/index.php?title=File:Genetic_indirect_binary_crossover.svg License: Public Domain Contributors: GregManninLB, Koala man Image:Ero vs pmx vs indirect for tsp ga.png Source: http://en.wikipedia.org/w/index.php?title=File:Ero_vs_pmx_vs_indirect_for_tsp_ga.png License: Public Domain Contributors: Koala man Image:Blackbox.svg Source: http://en.wikipedia.org/w/index.php?title=File:Blackbox.svg License: Public Domain Contributors: Original uploader was Frap at en.wikipedia Image:Statistically Uniform.png Source: http://en.wikipedia.org/w/index.php?title=File:Statistically_Uniform.png License: Creative Commons Attribution-Sharealike 2.5 Contributors: Simon.Hatthon Image:Genetic Program Tree.png Source: http://en.wikipedia.org/w/index.php?title=File:Genetic_Program_Tree.png License: Public Domain Contributors: Original uploader was BAxelrod at en.wikipedia Image:Fraktal.gif Source: http://en.wikipedia.org/w/index.php?title=File:Fraktal.gif License: Creative Commons Attribution-ShareAlike 3.0 Unported Contributors: Gregor Kjellström Image:Mountain crest.GIF Source: http://en.wikipedia.org/w/index.php?title=File:Mountain_crest.GIF License: GNU Free Documentation License Contributors: Gregor Kjellström Image:Schematic_of_a_neural_network_executing_the_Gaussian_adaptation_algorithm.GIF Source: http://en.wikipedia.org/w/index.php?title=File:Schematic_of_a_neural_network_executing_the_Gaussian_adaptation_algorithm.GIF License: Creative Commons Attribution-ShareAlike 3.0 Unported Contributors: Gregor Kjellström Image:Efficiency.GIF Source: http://en.wikipedia.org/w/index.php?title=File:Efficiency.GIF License: GNU Free Documentation License Contributors: Gregor Kjellström Image:DE Meta-Fitness Landscape (Sphere and Rosenbrock).JPG Source: http://en.wikipedia.org/w/index.php?title=File:DE_Meta-Fitness_Landscape_(Sphere_and_Rosenbrock).JPG License: Public Domain Contributors: Pedersen, M.E.H., Tuning & Simplifying Heuristical Optimization, PhD Thesis, 2010, University of Southampton, School of Engineering Sciences, Computational Engineering and Design Group. Image:PSO Meta-Fitness Landscape (12 benchmark problems).JPG Source: http://en.wikipedia.org/w/index.php?title=File:PSO_Meta-Fitness_Landscape_(12_benchmark_problems).JPG License: Public Domain Contributors: Pedersen, M.E.H., Tuning & Simplifying Heuristical Optimization, PhD Thesis, 2010, University of Southampton, School of Engineering Sciences, Computational Engineering and Design Group. 
Image:Safari ants.jpg Source: http://en.wikipedia.org/w/index.php?title=File:Safari_ants.jpg License: Creative Commons Attribution-ShareAlike 3.0 Unported Contributors: Mehmet Karatay Image:Aco branches.svg Source: http://en.wikipedia.org/w/index.php?title=File:Aco_branches.svg License: Creative Commons Attribution-ShareAlike 3.0 Unported Contributors: User:Nojhan, User:Nojhan, User:Nojhan Image:Knapsack ants.svg Source: http://en.wikipedia.org/w/index.php?title=File:Knapsack_ants.svg License: Creative Commons Attribution-Sharealike 2.5 Contributors: Andreas Plank, Dake, 1 anonymous edits Image:Aco shortpath.svg Source: http://en.wikipedia.org/w/index.php?title=File:Aco_shortpath.svg License: Creative Commons Attribution-ShareAlike 3.0 Unported Contributors: User:Nojhan, User:Nojhan File:Magnify-clip.png Source: http://en.wikipedia.org/w/index.php?title=File:Magnify-clip.png License: Public Domain Contributors: User:Erasoft24 Image:Concept of directional optimization in CMA-ES algorithm.png Source: http://en.wikipedia.org/w/index.php?title=File:Concept_of_directional_optimization_in_CMA-ES_algorithm.png License: Public Domain Contributors: Sentewolf Image:Meta-Optimization Concept.JPG Source: http://en.wikipedia.org/w/index.php?title=File:Meta-Optimization_Concept.JPG License: Public Domain Contributors: Pedersen, M.E.H., Tuning & Simplifying Heuristical Optimization, PhD Thesis, 2010, University of Southampton, School of Engineering Sciences, Computational Engineering and Design Group. Image:DE Meta-Fitness Landscape (12 benchmark problems).JPG Source: http://en.wikipedia.org/w/index.php?title=File:DE_Meta-Fitness_Landscape_(12_benchmark_problems).JPG License: Public Domain Contributors: Pedersen, M.E.H., Tuning & Simplifying Heuristical Optimization, PhD Thesis, 2010, University of Southampton, School of Engineering Sciences, Computational Engineering and Design Group. Image:DE Meta-Optimization Progress (12 benchmark problems).JPG Source: http://en.wikipedia.org/w/index.php?title=File:DE_Meta-Optimization_Progress_(12_benchmark_problems).JPG License: Public Domain Contributors: Pedersen, M.E.H., Tuning & Simplifying Heuristical Optimization, PhD Thesis, 2010, University of Southampton, School of Engineering Sciences, Computational Engineering and Design Group. File:evolution of several cEAs.png Source: http://en.wikipedia.org/w/index.php?title=File:Evolution_of_several_cEAs.png License: Creative Commons Attribution-Sharealike 3.0 Contributors: User:Enrique.alba1 File:cEA neighborhood types.png Source: http://en.wikipedia.org/w/index.php?title=File:CEA_neighborhood_types.png License: Creative Commons Attribution-Sharealike 3.0 Contributors: User:Enrique.alba1 File:ratio concept in cEAs.png Source: http://en.wikipedia.org/w/index.php?title=File:Ratio_concept_in_cEAs.png License: Creative Commons Attribution-Sharealike 3.0 Contributors: User:Enrique.alba1 Image:Gospers glider gun.gif Source: http://en.wikipedia.org/w/index.php?title=File:Gospers_glider_gun.gif License: GNU Free Documentation License Contributors: Kieff Image:Torus.png Source: http://en.wikipedia.org/w/index.php?title=File:Torus.png License: Public Domain Contributors: Kieff, Rimshot, SharkD Image:John von Neumann ID badge.png Source: http://en.wikipedia.org/w/index.php?title=File:John_von_Neumann_ID_badge.png License: Public Domain Contributors: Bomazi, Diego Grez, Fastfission, Frank C.
Müller, Kilom691, Materialscientist, 1 anonymous edits Image:CA rule30s.png Source: http://en.wikipedia.org/w/index.php?title=File:CA_rule30s.png License: GNU Free Documentation License Contributors: Falcorian, InverseHypercube, Maksim, Simeon87, 1 anonymous edits Image:CA rule110s.png Source: http://en.wikipedia.org/w/index.php?title=File:CA_rule110s.png License: GNU Free Documentation License Contributors: InverseHypercube, Maksim, Simeon87 Image:AC rhombo.png Source: http://en.wikipedia.org/w/index.php?title=File:AC_rhombo.png License: Creative Commons Attribution 3.0 Contributors: Akramm Image:Oscillator.gif Source: http://en.wikipedia.org/w/index.php?title=File:Oscillator.gif License: GNU Free Documentation License Contributors: Original uploader was Grontesca at en.wikipedia Image:Textile cone.JPG Source: http://en.wikipedia.org/w/index.php?title=File:Textile_cone.JPG License: GNU Free Documentation License Contributors: Ausxan, InverseHypercube, Rling, Valérie75, 1 anonymous edits File:GA-Multi-modal.ogv Source: http://en.wikipedia.org/w/index.php?title=File:GA-Multi-modal.ogv License: Creative Commons Attribution 3.0 Contributors: Kamitsaha Image:Bombus 6867.JPG Source: http://en.wikipedia.org/w/index.php?title=File:Bombus_6867.JPG License: GNU Free Documentation License Contributors: ComputerHotline, Josette Image:Papilio machaon caterpillar on Ruta.jpg Source: http://en.wikipedia.org/w/index.php?title=File:Papilio_machaon_caterpillar_on_Ruta.jpg License: Creative Commons Attribution 3.0 Contributors: איתן טל Etan Tal File:Yuccaharrimaniae.jpg Source: http://en.wikipedia.org/w/index.php?title=File:Yuccaharrimaniae.jpg License: Public Domain Contributors: Epibase, Martin H., Stickpen Image:Imagebreeder_example.png Source: http://en.wikipedia.org/w/index.php?title=File:Imagebreeder_example.png License: Public Domain Contributors: Simonham Image:Braitenberg.jpg Source: http://en.wikipedia.org/w/index.php?title=File:Braitenberg.jpg License: GNU Free Documentation License Contributors: Original uploader was Rxke at en.wikipedia Image:NEAT PARTICLES 1.jpg Source: http://en.wikipedia.org/w/index.php?title=File:NEAT_PARTICLES_1.jpg License: Public Domain Contributors: JockoJonson Image:NEAT PARTICLES 2.jpg Source: http://en.wikipedia.org/w/index.php?title=File:NEAT_PARTICLES_2.jpg License: Public Domain Contributors: JockoJonson License Creative Commons Attribution-Share Alike 3.0 Unported //creativecommons.org/licenses/by-sa/3.0/