Avida V2.6.2 PDF (UNIX) Docs
Avida : A Guided Tour of an Ancestor and its Hardware    08/28/2007 04:33 PM

Revised 2006-09-05 DMB

A Guided Tour of an Ancestor and its Hardware

This document describes the structure of the classic virtual CPU and an example organism running on it.

The Virtual CPU Structure

The virtual CPU, which is the default "body" or "hardware" of the organisms, contains the following set of components (as further illustrated in the figure below):

  A memory that consists of a sequence of instructions, each associated with a set of flags to denote if the instruction has been executed, copied, mutated, etc.

  An instruction pointer (IP) that indicates the next site in the memory to be executed.

  Three registers that can be used by the organism to hold data currently being manipulated. These are often operated upon by the various instructions, and can contain arbitrary 32-bit integers.

  Two stacks that are used for storage. The organism can theoretically store an arbitrary amount of data in the stacks, but for practical purposes we currently limit the maximum stack depth to ten.

  An input buffer and an output buffer that the organism uses to receive information and return the processed results.

  A Read-Head, a Write-Head, and a Flow-Head, which are used to specify positions in the CPU's memory. A copy command reads from the Read-Head and writes to the Write-Head. Jump-type statements move the IP to the Flow-Head.

Instruction Set Configuration

The instruction set in Avida is loaded on startup from a configuration file specified in the avida.cfg file. This allows selection of different instruction sets without recompiling the source code, as well as allowing different sized instruction sets to be specified.
It is not possible to alter the behavior of individual instructions or add new instructions without recompiling Avida; such changes have to be made directly in the source code. The available instructions are listed in the inst_set.* files with a 1 or a 0 next to an instruction to indicate if it should or should not be included. Changing the instruction set to be used simply involves adjusting these flags.

The instructions were created with three things in mind:

  To be as complete as possible (both in a "Turing complete" sense -- that is, it can compute any computable function -- and, more practically, to ensure that simple operations only require a few instructions).

  For each instruction to be as robust and versatile as possible; all instructions should take an "appropriate" action in any situation where they can be executed.

  To have as little redundancy as possible between instructions. (Several instructions have been implemented that are redundant, but such combinations will typically not be turned on simultaneously for a run.)

One major concept that differentiates this virtual assembly language from its real-world counterparts is in the additional uses of nop instructions (no-operation commands). These have no direct effect on the virtual CPU when executed, but often modify the effect of any instruction that precedes them. In a sense, you can think of them as purely regulatory genes. The default instruction set has three such nop instructions: nop-A, nop-B, and nop-C.

The remaining instructions can be separated into three classes. The first class is those few instructions that are unaffected by nops. Most of these are the "biological" instructions involved directly in the replication process. The second class of instructions is those for which a nop changes the head or register affected by the previous command.
For example, an inc command followed by the instruction nop-A would cause the contents of the AX register to be incremented, while an inc command followed by a nop-B would increment BX.

In instruction definitions, we indicate that a default component (that is, a register or head) can be replaced by a nop command by surrounding the component name with ?'s. The component listed is the default one to be used, but if a nop follows the command, the component it represents in this context will replace this default. If the component between the question marks is a register, then a subsequent nop-A represents the AX register, nop-B is BX, and nop-C is CX. If the component listed is a head (including the instruction pointer), then a nop-A represents the Instruction Pointer, nop-B represents the Read-Head, and nop-C is the Write-Head. Currently the Flow-Head has no nop associated with it.

The third class of instructions is those that use a series of nop instructions as a template (label) for a command that needs to reference another position in the code, such as h-search. If nop-A follows a search command, it scans for the first complementary template (nop-B) and moves the Flow-Head there. Templates may be composed of more than a single nop instruction. A series of nops is typically abbreviated to the associated letters, separated by colons. Thus the sequence "nop-A nop-A nop-C" would be displayed as "A:A:C".

The label system used in Avida allows for an arbitrary number of nops. By default, we have three: nop-A's complement is nop-B, nop-B's is nop-C, and nop-C's is nop-A. Likewise, some instructions talk about the complement of a register or head -- the same pattern is used in those cases. So if an instruction tests if ?BX? is equal to its complement, it will test if BX == CX by default, but if it is followed by a nop-C it will test if CX == AX.
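The nop rules described above can be illustrated with a small Python sketch. This is not Avida source code: the function and table names are hypothetical, and the sketch simplifies by not wrapping around at the end of the genome. It only shows how a following nop could select a register and how complements rotate A -> B -> C -> A.

```python
# Illustrative sketch only; every name below is hypothetical, not an Avida identifier.

NOPS = ["nop-A", "nop-B", "nop-C"]
REGISTERS = ["AX", "BX", "CX"]
REGISTER_FOR_NOP = dict(zip(NOPS, REGISTERS))


def complement_nop(nop):
    """Return the complement of a nop: A -> B, B -> C, C -> A."""
    return NOPS[(NOPS.index(nop) + 1) % len(NOPS)]


def complement_label(label):
    """Complement a template written as 'A:A:C', giving 'B:B:A'."""
    letters = label.split(":")
    return ":".join(complement_nop("nop-" + letter)[-1] for letter in letters)


def complement_register(register):
    """Registers follow the same rotation: AX -> BX, BX -> CX, CX -> AX."""
    return REGISTERS[(REGISTERS.index(register) + 1) % len(REGISTERS)]


def resolve_register(genome, position, default="BX"):
    """Pick the register used by the instruction at `position`.

    If the next instruction is a nop, it replaces the default register.
    Simplification: no wrap-around at the end of the genome.
    """
    following = position + 1
    if following < len(genome) and genome[following] in REGISTER_FOR_NOP:
        return REGISTER_FOR_NOP[genome[following]]
    return default
```

For example, resolve_register(["inc", "nop-A"], 0) selects AX, while an inc with no nop after it falls back to the default BX.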
Instruction Set Reference

The full instruction set description is included here. An abbreviated description of the 26 default instructions is below.

  (a-c)  nop-A, nop-B, nop-C   No-operation instructions; these modify other instructions.
  (d)    if-n-equ              Execute next instruction only if ?BX? does not equal its complement
  (e)    if-less               Execute next instruction only if ?BX? is less than its complement
  (f)    pop                   Remove a number from the current stack and place it in ?BX?
  (g)    push                  Copy the value of ?BX? onto the top of the current stack
  (h)    swap-stk              Toggle the active stack
  (i)    swap                  Swap the contents of ?BX? with its complement.
  (j)    shift-r               Shift all the bits in ?BX? one to the right
  (k)    shift-l               Shift all the bits in ?BX? one to the left
  (l)    inc                   Increment ?BX?
  (m)    dec                   Decrement ?BX?
  (n)    add                   Calculate the sum of BX and CX; put the result in ?BX?
  (o)    sub                   Calculate BX minus CX; put the result in ?BX?
  (p)    nand                  Perform a bitwise NAND on BX and CX; put the result in ?BX?
  (q)    IO                    Output the value ?BX? and replace it with a new input
  (r)    h-alloc               Allocate memory for an offspring
  (s)    h-divide              Divide off an offspring located between the Read-Head and Write-Head.
  (t)    h-copy                Copy an instruction from the Read-Head to the Write-Head and advance both.
  (u)    h-search              Find a complement template and place the Flow-Head after it.
  (v)    mov-head              Move the ?IP? to the same position as the Flow-Head
  (w)    jmp-head              Move the ?IP? by a fixed amount found in CX
  (x)    get-head              Write the position of the ?IP? into CX
  (y)    if-label              Execute the next instruction only if the given template complement was just copied
  (z)    set-flow              Move the Flow-Head to the memory position specified by ?CX?

An Example Ancestor

The following organism is stored in the file organism.heads.15, which you should find in the support/config/misc/ directory.
This is a simplified version of organism.default and organism.heads.100, of lengths 50 and 100 respectively (each has additional instructions placed before the copy loop).

  # --- Setup ---
  h-alloc    # Allocate extra space at the end of the genome to copy the offspring into.
  h-search   # Locate an A:B template (at the end of the organism) and place the Flow-Head after it
  nop-C      #
  nop-A      #
  mov-head   # Place the Write-Head at the Flow-Head (which is at beginning of offspring-to-be).
  nop-C      # [ Extra nop-C commands can be placed here w/o harming the organism! ]

  # --- Copy Loop ---
  h-search   # No template, so place the Flow-Head on the next line
  h-copy     # Copy a single instruction from the read head to the write head (and advance both heads!)
  if-label   # Execute the line following this template only if we have just copied an A:B template.
  nop-C      #
  nop-A      #
  h-divide   # ...Divide off offspring! (note if-statement above!)
  mov-head   # Otherwise, move the IP back to the Flow-Head at the beginning of the copy loop.
  nop-A      # End label.
  nop-B      # End label.

This program begins by allocating extra space for its offspring. The exact amount of space does not need to be specified -- it will allocate as much as it is allowed to. The organism will then do a search for the end of its genome (where this new space was just placed) so that it will know where to start copying. First the Flow-Head is placed there, and then the Write-Head is moved to the same point.

It is after this initial setup and before the actual copying process commences that extra nop instructions can be included. The only caveat is that you need to make sure that you don't duplicate any templates that the program will be searching for, or else it will no longer function properly. The easiest thing to do is insert a long sequence of nop-C instructions.

Next we have the beginning of the "copy loop".
This segment of code starts off with an h-search command with no template following it. In such a case, the Flow-Head is placed on the line immediately following the search. This head will be used to designate the place that the IP keeps returning to with each cycle of the loop.

The h-copy command will copy a single instruction from the Read-Head (still at the very start of the genome, where it begins) to the Write-Head (which we placed at the beginning of the offspring). With any copy command there is a user-specified chance of a copy mutation. If one occurs, the Write-Head will place a random instruction rather than the one that it gathered from the Read-Head. After the copy occurs (for better or worse), both the Read-Head and the Write-Head are advanced to the next instruction in the genome. It is for this reason that a common mutation we see happening will place a long string of h-copy instructions one after another.

The next command, if-label (followed by a nop-C and a nop-A), tests to see if the complement of C:A is the most recent thing copied. That is, it tests if the two most recent instructions copied were a nop-A followed by a nop-B, as is found at the end of the organism. If so, we are done! Execute the next instruction, which is h-divide (when this occurs, the read and write heads will surround the portion of memory to be split off as the offspring's genome). If not, then we need to keep going. Skip the next instruction and move on to the mov-head, which will move the head specified by the nop that follows (in this case nop-A, which is the Instruction Pointer) to the Flow-Head at the beginning of the copy loop. This process will continue until all of the lines of code have been copied and an offspring is born.
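To make the copy loop concrete, here is a minimal Python sketch of the replication process just described. It is a simplification under stated assumptions, not a model of the real virtual CPU: it ignores mutations and the setup instructions, assumes the allocation doubles the memory and fills it with nop-A, and hard-codes the A:B end label. The names ANCESTOR, END_LABEL and replicate are hypothetical.

```python
# Simplified walk-through of the ancestor's copy loop (no mutations).
# All names are illustrative; this is not how Avida itself is implemented.

ANCESTOR = [
    "h-alloc", "h-search", "nop-C", "nop-A", "mov-head", "nop-C",   # setup
    "h-search", "h-copy", "if-label", "nop-C", "nop-A",             # copy loop
    "h-divide", "mov-head", "nop-A", "nop-B",                       # divide / loop back / end label
]

# if-label is followed by C:A, so it fires once the complement, A:B, was just copied.
END_LABEL = ["nop-A", "nop-B"]


def replicate(genome):
    """Copy `genome` into freshly allocated space and return the offspring."""
    # h-alloc: assumption for this sketch -- the allocation doubles the memory
    # and initializes the new space to a default instruction.
    memory = list(genome) + ["nop-A"] * len(genome)
    read_head = 0             # starts at the beginning of the genome
    write_head = len(genome)  # placed at the start of the offspring-to-be
    copied = []

    while write_head < len(memory):
        # h-copy: copy one instruction, then advance both heads.
        instruction = memory[read_head]
        memory[write_head] = instruction
        copied.append(instruction)
        read_head += 1
        write_head += 1

        # if-label: was the end label the most recent thing copied?
        if copied[-len(END_LABEL):] == END_LABEL:
            # h-divide: the offspring lies between the Read-Head and Write-Head.
            return memory[read_head:write_head]
        # Otherwise mov-head sends the IP back to the Flow-Head; loop again.

    raise RuntimeError("ran out of allocated memory before the end label was copied")
```

Calling replicate(ANCESTOR) yields a list identical to ANCESTOR, since no copy mutations are modeled here.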
An Example Logic Gene

Here is a short example program to demonstrate one way for an organism to perform the "OR" logic operation. This time I'm only going to show the contents of the registers after each command because the functionality of the individual instructions should be clear, and the logic itself won't be helped much by a line-by-line explanation in English.

  Line #  Instruction  AX   BX      CX   Stack  Output
  1       IO           ?    X       ?    ?      ?
  2       push         ?    X       ?    X, ?
  3-4     pop nop-C    ?    X       X    ?
  5-6     nand nop-A   ~X   X       X    ?
  7       IO           ~X   Y       X    ?      X
  8       push         ~X   Y       X    Y, ?
  9-10    pop nop-C    ~X   Y       Y    ?
  11      nand         ~X   ~Y      Y    ?
  12-13   swap nop-C   Y    ~Y      ~X   ?
  14      nand         Y    X or Y  ~X   ?
  15      IO           Y    Z       ~X   ?      X or Y

Avida : Directory and File Structure    08/28/2007 04:33 PM

Revised 2006-09-05 DMB

Directory and File Structure

This document contains a guide to the files present in Avida, and where they are located.

Filenames

Source code files in Avida follow a standard naming convention. The C++ core, in general, maintains one class per header/source file pair. The file name should exactly match the class that it defines. All header files use .h and all source files use .cc as their respective file extensions.

When you compile a program in C++, it goes through a compilation phase and then a link phase. The compilation phase takes each source (.cc) file and compiles it independently into an object (.o) file. In the link phase, all of these compiled object files are linked together into a single executable (such as avida).

Since the bodies of the methods are only in the source files, they only need to be compiled once into a single object file. If you place a function body in the header file it will get compiled again each time another class includes that header.
Since a header will often be included because only one or two methods from a class are required, this can increase compile time dramatically -- a function will be compiled as long as its body is included, even if the method is never directly called within the object file being created.

For example: The cOrganism object is declared in the file cOrganism.h and fully defined in cOrganism.cc. When this file is compiled, it creates the object file cOrganism.o. Both the cPopulation class (cPopulation.cc) and the cTestCPU class (cTestCPU.cc) use the cOrganism object. Since the majority of its methods are defined in cOrganism.cc, the compiler only needs to compile these methods once. During the link phase the linker connects the references together.

Occasionally short functions are implemented with their bodies directly in the header file. When a function compiled in one object file is run from another, the linker basically points the caller to the location of that function. A few extra CPU cycles must be expended while the program jumps to the function. Many small functions, especially one-line access methods, can be made inline, which means each one will be placed, as a whole, right inside the function that calls it. If the function is short enough, it only takes up as much space as the call to it would have taken anyway, and hence does not increase the size of the executable.

Directory Structure

The following sections provide a high level overview of the directory structure within the Avida source code distribution. Many directory sections contain partial listings of the files contained within them; however, these lists are not to be considered complete.

Top Level Directory

All of the files for the current version of Avida reside in the directory labeled trunk/ by default when checked out of Subversion.
In addition to the subdirectories documentation/, source/ and support/ (all described below), this directory contains several key sources of information and automatic compilation files. The most important of these are described here.

AUTHORS
  This file contains information about the authorship of Avida.

Avida.xcodeproj
  This file (or directory on non-Mac OS platforms) contains the Xcode project information for development and building Avida within the Xcode IDE on Mac OS. This project file requires Xcode 2.1 or greater.

BuildAvida.py
  The main entry point for the new experimental SCONS python based build system.

CHANGES
  A listing of important changes to Avida that affect users of previous releases.

COPYING
COPYING.gpl
  These files contain copyright information.

KNOWN_BUGS
  A listing of known issues that may be pertinent to various users.

README
  A general guide on how to get started once you put the Avida files on your machine.

build_avida
  A one step build script for compiling Avida under Unix platforms that have CMake installed.

test_avida
  After Avida has been built, this script serves as an entry point for executing a series of consistency tests on the produced executable.

Directory: build/work/ (CMake)
Directory: build/{Target Name}/work/ (Xcode)

After compilation, this directory will contain all of the configuration files necessary for Avida (explained in more detail in their own documentation files). The key files and directories here are:

analyze.cfg
  The default file used to write analysis scripts.

avida.cfg
  This is the main configuration file that is used by default.

environment.cfg
  This file contains the default environment information.

events.cfg
  This file contains the default event list.

inst_set.default
  This is the main, heads-based instruction set that is used by default.
organism.default
  This file contains the default starting ancestor of length 100.

data/
  This is the name of the default output directory and is created by Avida if it does not exist. The name and location of this directory can be configured in avida.cfg.

Directory: source/

This is a large sub-directory structure that contains all of the source code that makes up Avida. Each sub-directory here includes its own CMake and SCONS build information. The high level purpose of each sub-directory is:

actions/
  Contains various source files that define action classes that are usable as schedule events and analyze commands. Also contains the cActionLibrary responsible for instantiating objects based on cString names.

analyze/
  Contains classes responsible for performing and managing data from detailed analyses.

classification/
  Classes that define and manage classification of current and past properties of the population are stored here.

cpu/
  Files and classes used to implement all of the virtual hardware within the Avida software.

drivers/
  Classes and infrastructure used to orchestrate the execution of Avida.

event/
  Contains classes responsible for event scheduling and triggering.

main/
  Contains all of the core classes that define the world and the population within it.

platform/
  Contains platform specific software in various subdirectories, such as the high performance malloc library for POSIX platforms.

targets/
  Target (executable) specific source code. The source code of the NCurses viewer resides in the avida-viewer/ subdirectory.

tools/
  Contains a number of generic tools classes, including custom data structures and robust string manipulation classes.

Directory: source/main/

This sub-directory contains all of the core source code files for the software.
For ease, there are two separate groups of more important components and less important components, each in alphabetical order. The syntax name.?? refers to header/source file pairs, name.h and name.cc. The more important files are:

cAvidaConfig.??
  These files define the cAvidaConfig object that maintains the current configuration state of Avida. This class is initialized by the avida.cfg file and processed command line arguments and can be modified via various events during the run.

cEnvironment.??
  These files define the cEnvironment object, which controls all of the environmental interactions in an Avida run. It makes use of reactions, resources, and tasks.

cGenome.??
  The cGenome object maintains a sequence of objects of class cInstruction.

cInstruction.??
  The cInstruction class is very simple, maintaining a single instruction in Avida.

cInstLibBase.h
  The cInstLibBase class serves as a base class for objects that associate instructions with their corresponding functionality in the virtual hardware.

cMutationRates.??
  These files contain the cMutationRates class, which maintains the probability of occurrence for each type of mutation.

cOrganism.??
  The cOrganism class represents a single organism, and contains the initial genome of that organism, its phenotypic information, its virtual hardware, etc.

cPopulation.??
  The cPopulation class manages the organisms that exist in an Avida population. It maintains a collection of cPopulationCell objects (either as a grid, or independent cells for mass action) and contains the scheduler, genebank, event manager, etc.

cPopulationCell.??
  A cPopulationCell is a single location in an Avida population. It can contain an organism, and has its own mutation rates (but not yet its own environment).

cStats.??
  A cStats object keeps track of many different population-wide statistics.

cWorld.??
  The cWorld object contains all of the state information used by a particular run and can be used to access many globally important classes.
Below are various less important files that may still be useful to know about:

cOrgInterface.h
  The cOrgInterface class defines the interface used by organisms to interact back with the population or test CPU environment.

cReaction.??
  The cReaction class contains all of the information for what triggers a reaction, its restrictions, and the process that occurs.

cReactionResult.??
  The cReactionResult class contains all of the information about the results of a reaction after one occurs, such as the amount of resources consumed, what the merit change is, what tasks triggered it, etc.

cResource.??
  The cResource class contains information about a single resource, such as its inflow rate, outflow, name, etc.

cResourceCount.??
  The resource count keeps track of how much of each resource is present in the region being tracked.

cTaskLib.??
  This class contains all of the information associated with task evaluation.

Directory: source/analyze/

The primary class in this directory is cAnalyze. This class processes analyze.cfg files to perform data analysis on run data. The additional classes in this directory support various types of analyses and provide the foundation for multithreaded execution. The cAnalyzeJobQueue object, instantiated by cAnalyze, orchestrates queuing and executing jobs on parallel worker objects.

Directory: source/cpu/

This sub-directory contains the files used to define the virtual CPUs in Avida.

cCodeLabel.??
  The cCodeLabel class marks labels (series of no-operation instructions) in a genome. These are used when a label needs to be used as an instruction argument.

cCPUMemory.??
  The cCPUMemory class inherits from the cGenome class, extending its functionality to facilitate insertions and deletions. It also associates flags with each instruction in the genome to mark if they have been executed, copied, mutated, etc.
cCPUStack.??
  The cCPUStack class is an integer-stack component in the virtual CPUs.

cHardwareBase.??
  The cHardwareBase class is an abstract base class that all other hardware types must be derived from. It has minimal built-in functionality.

cHardwareCPU.??
  The cHardwareCPU class extends cHardwareBase into a proper virtual CPU, with registers, stacks, memory, IO Buffers, etc.

cHardwareManager.??
  The cHardwareManager manages the building of new hardware as well as Test CPU creation.

cHardwareSMT.??
  This class represents the in-progress experimental implementation of next generation virtual hardware.

cHardwareTransSMT.??
  An intermediate step on the path to cHardwareSMT, this transitional hardware is used in a number of ongoing research projects.

cHeadCPU.??
  The cHeadCPU class implements a head pointing to a position in the memory of a virtual CPU.

cTestCPU.??
  The cTestCPU class maintains a test environment for running organisms that we don't want to be able to directly affect the real population.

cTestUtil.??
  The cTestUtil utility class is for test-related functions that require a test CPU, such as printing out a genome to a file with collected information.

Directory: source/tools/

The tools sub-directory contains C++ source code that is used throughout Avida, but is not specific to the project.

cDataEntry.??
  Associates data names with functions for printing out data files with a user specified format.

cDataFile.??
  A class useful for handling output files with named columns.

cDataFileManager.??
  This class manages a collection of data files and handles the creation and output of user-designed data files at runtime.

cMerit.??
  Provides a very large integer number, dissectable in useful ways.

cRandom.??
  A powerful and portable random number generator that can output numbers in a variety of formats.

cString.??
  A standard string object, but with lots of functionality.

cStringList.??
  A specialized class for collections of strings, with added functionality over a normal list.

cStringUtil.??
  Contains a bunch of static methods to manipulate and compare strings.

functions.h
  Some useful math functions such as Min, Max, and Log.

Templates are special classes that interact with another data-type that doesn't need to be specified until the programmer instantiates an object in the class. It's a hard concept to get used to, but allows for remarkably flexible programming, and makes very reusable code. The main drawback (other than brain-strain) is that templates must be entirely defined in header files since separate code is generated for each class the template interacts with.

tArray.h
  A fixed-length array template; array sizes may be adjusted manually when needed.

tBuffer.h
  A container that keeps only the last N entries, indexed with the most recent first.

tDictionary.h
  A container template that allows the user to search for a target object based on a keyword (of type cString).

tHashTable.h
  A mapping container that maps keys to values using a hashing function to provide fast lookup.

tList.h
  A reasonably powerful linked list and iterators. The list will keep track of the iterators and never allow them to have an illegal value.

tManagedPointerArray.h
  A derivative of tArray, a managed pointer array is ideal for storing arrays of large objects that may need to be resized. The backing storage mechanism simply resizes an array of pointers, preventing the unnecessary copying of large objects.

tMatrix.h
  A fixed size matrix template with arbitrary indexing.

tMemTrack.h
  This is a template that can be put over any class or data type to keep track of it.
  If all creations of objects in the class are done through this template rather than (or in conjunction with) "new", memory leaks should be detectable. This is new, and not yet used in Avida.

tSmartArray.h
  A derivative of tArray that provides hidden capacity management. This type of array is ideal for arrays of small objects that may be resized often.

tVector.h
  A variable-length array object; array sizes will be automatically adjusted to accommodate any positions accessed in it.

Directory: support/config/

This directory contains all of the originals of the files that are copied into the work/ directory during the installation process for the user to modify. There is also a misc/ sub-directory under here with additional, optional configuration files that you may want to look at to see other possible preconfigured settings.

Avida : The Avida Configuration File    08/28/2007 04:34 PM

Revised 2006-09-05 DMB

The Avida Configuration File

The Avida configuration file (avida.cfg) is the main configuration file for Avida. With this file, the user can set up all of the basic conditions for a run. Below are detailed descriptions for some of the settings in the configuration file, with particularly important settings highlighted in green. The non-colored entries will probably never need to change unless you are performing a very specialized experiment.

Architecture Variables

This section covers all of the basic variables that describe the Avida run. This is effectively a miscellaneous category for settings that don't fit anywhere below.

MAX_UPDATES
MAX_GENERATIONS
END_CONDITION_MODE
  These settings allow the user to determine for how long the run should progress in generations and in updates, and determine if one or both criteria need to be met for the run to end. The run will also end if ever the entire population has died out.
  A setting of -1 for either ending condition will indicate no limit. End conditions can also be set in the events file, as is done by default, so you typically won't need to worry about this.

WORLD_X
WORLD_Y
  These settings determine the size of the Avida grid that the organisms populate. In mass action mode the shape of the grid is not relevant, only the number of organisms that are in it.

RANDOM_SEED
  The random number seed initializes the random number generator. You should alter only this seed if you want to perform a collection of replicate runs. Setting the random number seed to zero (or a negative number) will base the seed on the starting time of the run -- effectively a random random number seed. In practice, you want to always be able to re-do an exact run in case you want to get more information about what happened.

Configuration Files

This section relates Avida to other files that it requires.

DATA_DIR
  The name (or path) of the directory where output files generated by Avida should be placed.

INST_SET
EVENT_FILE
ANALYZE_FILE
ENVIRONMENT_FILE
START_CREATURE
  These settings indicate the names of all of the other configuration files used in an Avida run. See the individual documents for more information about how to use these files.

Reproduction

These settings control how creatures are born and die in Avida.

BIRTH_METHOD
  The birth method sets how the placement of a child organism is determined. Currently, there are six ways of doing this -- the first four (0-3) are all grid-based (offspring are only placed in the immediate neighborhood), and the last two (4-5) assume a well-stirred population. In all non-random methods, empty sites are preferred over replacing a living organism.

DEATH_METHOD
AGE_LIMIT
  By default, replacement is the only way for an organism to die in Avida. However, if a death method is set, organisms will die of old age.
  In method 1, organisms will die when they reach the user-specified age limit. In method 2, the age limit is a multiple of their length, so larger organisms can live longer.

ALLOC_METHOD
  During the replication process in the default virtual CPU, parent organisms must allocate memory space for their child-to-be. Before the child is copied into this new memory, it must have an initial value. Setting the alloc method to zero sets this memory to a default instruction (typically nop-A). Mode 1 leaves it uninitialized (and hence keeps the contents of the last organism that inhabited that space; if only a partial copy occurs, the child is a hybrid of the parent and the dead organism, hence the name necrophilia). Mode 2 just randomizes each instruction. This means that the organism will behave unpredictably if the uninitialized code is executed.

DIVIDE_METHOD
  When a divide occurs, does the parent divide into two children, or else do we have a distinct parent and child? The latter method will allow more age structure in a population where an organism may behave differently when it produces its second or later offspring.

GENERATION_INC_METHOD
  The generation of an organism is the number of organisms in the chain between it and the original ancestor. Thus, the generation of a population can be calculated as the average generation of the individual organisms. When a divide occurs, the child always receives a generation one higher than the parent, but what should happen to the generation of the parent itself? In general, this should be set the same as divide method.

Divide Restrictions

These place limits on when an organism can successfully issue a divide command to produce an offspring.

CHILD_SIZE_RANGE
  This is the maximal difference in genome size between a parent and offspring.
The default of 2.0 means that the genome of the child must be between one-half and twice the length of the parent. This is to prevent out-of-control size changes. Setting this to 1.0 will ensure fixed-length organisms (but make sure to also turn off insertion and deletion mutations).

MIN_COPIED_LINES, MIN_EXE_LINES: These settings place limits on what the parent must have done before the child can be born; they set the minimum fraction of instructions that must have been copied into the child (vs. left as default) and the minimum fraction of instructions in the parent that must have been executed. If either of these is not met, the divide will fail. These settings prevent organisms from producing pathological offspring. In practice, either of them can be set to 0.0 to turn it off.

REQUIRE_ALLOCATE: Is an allocate required between each successful divide (in virtual hardware types where allocate is meaningful)? If so, this will limit the flexibility of how organisms produce children (they can't make multiple copies and divide them off all at once, for example). But if we don't require allocates, the resulting organisms can be a lot more difficult to understand.

REQUIRED_TASK: This was originally a hack. It allows the user to set the ID number for a task that must occur for a divide to be successful. At -1, no tasks are required. Ideally, this should be incorporated into the environment configuration file. NOTE: A task can fire without triggering a reaction. To add a required reaction see below.

IMMUNITY_TASK: Allows the user to set the ID number for a task which, if it occurs, provides immunity from the required task (above): the divide will proceed even if the required task is not done, as long as the immunity task is done. Defaults to -1, no immunity task present.

REQUIRED_REACTION: Allows the user to set the ID number for a reaction that must occur for a divide to be successful. At -1, no reactions are required.
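The size and copy/execute restrictions described above (CHILD_SIZE_RANGE, MIN_COPIED_LINES, MIN_EXE_LINES) can be read as a simple predicate. The following is only an illustrative Python sketch, not Avida's implementation: the function name, its arguments, and the 0.5 fractions used as defaults are assumptions made for the example (only the 2.0 size range is stated in the text).

```python
def divide_allowed(parent_len, child_len, copied_count, executed_count,
                   child_size_range=2.0,   # default stated in the text
                   min_copied_lines=0.5,   # assumed value, for illustration only
                   min_exe_lines=0.5):     # assumed value, for illustration only
    """Return True if a divide passes the size, copy, and execution checks."""
    if parent_len <= 0 or child_len <= 0:
        return False
    # CHILD_SIZE_RANGE: child length must lie in [parent/range, parent*range].
    low = parent_len / child_size_range
    high = parent_len * child_size_range
    if not (low <= child_len <= high):
        return False
    # MIN_COPIED_LINES: fraction of the child's instructions that were copied.
    if copied_count / child_len < min_copied_lines:
        return False
    # MIN_EXE_LINES: fraction of the parent's instructions that were executed.
    if executed_count / parent_len < min_exe_lines:
        return False
    return True
```

With a range of 2.0, a parent of length 100 may produce children of length 50 through 200; with a range of 1.0 only a child of exactly length 100 passes the size check.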
DIE_PROB: Determines the probability of an organism dying when the 'die' instruction is executed. Set to 0 by default, making the instruction neutral.

Mutations

These settings control how and when mutations occur in organisms. Ideally, there will be more options here in the future.

POINT_MUT_PROB: Point mutations (sometimes referred to as "cosmic ray" mutations) occur every update; the rate set here is a probability for each site that it will be mutated each update. In other words, this should be a very low value if it is turned on at all. If a mutation occurs, that site is replaced with a random instruction. In practice this also slows Avida down if it is non-zero because it requires so many random numbers to be tested every update.

COPY_MUT_PROB: The copy mutation probability is tested each time an organism copies a single instruction. If a mutation occurs, a random instruction is copied to the destination. In practice this is the most common type of mutation that we use in most of our experiments.

INS_MUT_PROB, DEL_MUT_PROB: These probabilities are tested once per gestation cycle (when an organism is first born) at each position where an instruction could be inserted or deleted, respectively. Each of these mutations changes the genome length. Deletions just remove an instruction while insertions add a new, random instruction at the position tested. Multiple insertions and deletions are possible each generation.

DIVIDE_MUT_PROB, DIVIDE_INS_PROB, DIVIDE_DEL_PROB: Divide mutation probabilities are tested when an organism is being divided off from its parent. If one of these mutations occurs, a random site is picked for it within the genome. At most one divide mutation of each type is possible during a single divide.
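To make the difference between per-copy and per-divide mutations concrete, here is a minimal Python sketch. It is not Avida's code: the function names, the 26-letter stand-in instruction set, and the order in which the three divide mutations are applied are all assumptions made for illustration.

```python
import random

# Stand-in for the 26 default instructions (one letter each); hypothetical.
INST_SET = list("abcdefghijklmnopqrstuvwxyz")

def copy_instruction(inst, copy_mut_prob, rng, inst_set=INST_SET):
    """COPY_MUT_PROB: tested once for every single instruction copied."""
    if rng.random() < copy_mut_prob:
        return rng.choice(inst_set)  # a random instruction is written instead
    return inst

def divide_mutations(genome, mut_prob, ins_prob, del_prob, rng, inst_set=INST_SET):
    """DIVIDE_*_PROB: each tested once per divide, at one random site."""
    child = list(genome)
    if child and rng.random() < mut_prob:   # DIVIDE_MUT_PROB: substitution
        child[rng.randrange(len(child))] = rng.choice(inst_set)
    if rng.random() < ins_prob:             # DIVIDE_INS_PROB: insertion
        child.insert(rng.randrange(len(child) + 1), rng.choice(inst_set))
    if child and rng.random() < del_prob:   # DIVIDE_DEL_PROB: deletion
        del child[rng.randrange(len(child))]
    return child
```

Because the copy test runs once per copied instruction, a genome of length L copied at rate p receives about L * p copy mutations per replication on average, whereas a divide mutation of a given type occurs at most once per divide regardless of genome length.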
Mutation Reversions

This section covers tests that are very CPU intensive, but allow for Avida experiments that would not be possible in any other system. Basically, each time a mutation occurs, we can run the resulting organism in a test CPU, and determine if the effect of the mutation was lethal, detrimental, neutral, or beneficial. This section allows us to act on this. (Note that as soon as anything here is turned on, the mutations need to be tested. Turning multiple settings on will not cause an additional speed decrease.)

REVERT_FATAL, REVERT_DETRIMENTAL, REVERT_NEUTRAL, REVERT_BENEFICIAL: When a mutation occurs of the specified type, the number listed next to that entry is the probability that the mutation will be reverted. That is, the child organism's genome will be restored as if the mutation had never occurred. This allows us either to manually manipulate the abundance of certain mutation types, or to entirely eliminate them.

STERILIZE_FATAL, STERILIZE_DETRIMENTAL, STERILIZE_NEUTRAL, STERILIZE_BENEFICIAL: The sterilize options work similarly to revert; the difference being that an organism never has its genome restored. Instead, if the selected mutation category occurs, the child is sterilized so that it still takes up space, but can never produce an offspring of its own.

FAIL_IMPLICIT: If this toggle is set, organisms must be able to produce exact copies of themselves or else they are sterilized and cannot produce any offspring. An organism that naturally (without any external effects) produces an inexact copy of itself is said to have implicit mutations. If this flag is set, explicit mutations (as described in the mutations section above) can still occur.

Time Slicing

These settings describe exactly what an update is, and how CPU time is allocated to organisms during that update.
AVE_TIME_SLICE: This sets the average number of instructions an organism should execute each update. Organisms with a low merit will consistently obtain fewer, while organisms of a higher merit will receive more.

SLICING_METHOD: This setting determines the method by which CPU time is handed out to the organisms. Method 0 ignores merit, and hands out time on the CPU evenly; each organism executes one instruction for the whole population before moving on to the second. Method 1 is probabilistic; each organism has a chance of executing the next instruction proportional to its merit. This method is slow due to the large number of random values that need to be obtained and evaluated (and it only gets slower as merits get higher). Method 2 is fully integrated; the organisms get CPU time proportional to their merit, but in a fixed, deterministic order.

SIZE_MERIT_METHOD: This setting determines the base value of an organism's merit. Merit is typically proportional to genome length; otherwise there is a strong selective pressure for shorter genomes (shorter genome => less to copy => reduced copying time => replicative advantage). Unfortunately, organisms will cheat if merit is proportional to the full genome length -- they will add on unexecuted and uncopied code to their genomes, creating code bloat. This setting isn't the most elegant fix, but it works.

MAX_LABEL_EXE_SIZE: Labels are sequences of nop (no-operation) instructions used only to modify the behavior of other instructions. Quite often, an organism will have these labels in its genome where the nops are used by another instruction, but never executed directly. To represent the executed length of an organism correctly, we need to somehow count these labels. Unfortunately, if we count the entire label, the organisms will again "cheat", artificially increasing their length by growing huge labels.
This setting limits the number of nops that are counted as executed when a label is used.

MAX_CPU_THREADS: Determines the number of simultaneous processes that an organism can run. That is, basically, the number of things it can do at once. This setting is meaningless unless threads are supported in the virtual hardware and the instructions are available within the instruction set.

Genealogy Info

These settings control how Avida monitors and deals with genotypes, species, and lineages.

THRESHOLD: For some statistics, we only want to measure organisms that we are sure are alive, but it's not worth taking the time to run them all in isolation, without outside effects (and in some eco-system situations that isn't even possible!). For these purposes, we call a genotype "threshold" if there have ever been more than a certain number of organisms of that genotype. A higher number here ensures a greater probability that the organisms are indeed "alive". Recently, we've been shifting away from using threshold genotypes and instead finding other, more accurate testing methods.

GENOTYPE_PRINT: Should all genotypes be printed out upon reaching threshold? Each will receive its own file in the archive directory, so this can get very hard disk intensive. Many runs will have organisms numbering in the millions.

GENOTYPE_PRINT_DOM: Printing only the dominant genotype keeps track of the most successful individual genotypes without costing a huge amount of memory. The number you place here is the total number of updates that a genotype must remain dominant for it to be printed out. A 0 turns this off.

SPECIES_THRESHOLD: In Avida, two organisms are said to be of the same species if you can perform all possible crossovers between them, and no more than a certain threshold (set here) fail to be viable offspring. The crossovers are done in isolation, and never affect the population as a whole.
SPECIES_RECORDING: This entry sets if and how species should be recorded in Avida. A setting of 0 turns all species tests off. A setting of 1 means that every time a genotype reaches threshold, it is tested against all currently existing species to determine if it is part of any of them. If so, its species is set, and if not, it becomes the prototype of a new species. Finally, a setting of 2 only tests a new threshold genotype against the species of its parent (since each species test can take a long time) and if that fails immediately creates a new species. In practice, methods 1 and 2 produce similar results, but method 1 can take a lot longer to run.

SPECIES_PRINT: Toggle: Should new species be printed as soon as they are created?

TEST_CPU_TIME_MOD: Many of our analysis methods (such as species testing) require that we be able to run organisms in isolation. Unfortunately, some of the organisms we test might be non-viable. At some point, we have to give up the test and label the organism as non-viable, but we can't give up too soon or else we might miss a viable, though slow, replicator. This setting is multiplied by the length of the organism's genome in order to determine how many CPU cycles to run the organism for. A setting of 20 effectively means that the average instruction must be executed twenty times before we give up. In practice, most organisms have an efficiency here of about 5, so 20 works well, but for accurate tests on some pathological organisms, we will be required to raise this number.

TRACK_MAIN_LINEAGE: In a normal Avida run, the genebank keeps track of all existing genotypes, and deletes them when the last organism of that genotype dies out. With this flag set, a genotype will not be deleted unless both it and all of its descendants have died off.
This allows us to track back from any genotype to its distant ancestors, monitoring all of the differences along the way. Once this information is being saved, see the events file for how to output it.

Log Files

Log files are printed every time a specified event occurs. By default, all log settings are 0 (i.e. the logs are turned off). Each time a logged event is printed, the update and identifying information on the individual that triggered it is always included.

LOG_CREATURES: If toggle is set, print an entry to creature.log whenever a new organism is born. Includes position information, parent organism, and a link to its genotype so the run can be reconstructed. This gets very large.

LOG_GENOTYPES: If toggle is set, print an entry to genotype.log whenever a new genotype is created. Includes information on its parent genotype.

LOG_THRESHOLD: If toggle is set, print an entry to threshold.log whenever a genotype reaches threshold. Includes information on what species it is.

LOG_SPECIES: If toggle is set, print an entry to species.log whenever a new species is created. Includes information on the genotype that triggered the creation.

LOG_LINEAGES: Lineages can be given unique identifiers and printed (into the file lineage.log) whenever they are created. Includes details about the event that created the lineage.

LINEAGE_CREATION_METHOD: Details when lineages are created. See config file comments for more detailed information.

Avida : The Instruction Set File 08/28/2007 04:34 PM
Revised 2006-09-05 DMB

The Instruction Set File

An instruction set file consists of a list of instructions that belong to that instruction set, each of which is followed by a series of numbers that define how that instruction should be used.
The exact format is as follows:

inst-name redundancy cost ft_cost prob_fail

inst-name: The name of the instruction to include in the described instruction set.

redundancy: The frequency of the instruction in the set. One instruction with twice the redundancy of another will also have twice the probability of being mutated to. A redundancy of zero is allowed, and indicates that injected organisms are allowed to have this instruction, but it can never be mutated to.

cost: The number of CPU cycles required to execute this instruction. One is the default if this value is not specified.

ft_cost: The additional cost to be paid the first time this instruction is executed. This is used to lower the diversity of instructions inside an organism. The default value here is 0.

prob_fail: The probability of this instruction not working properly. If an instruction fails it will simply do nothing, but still cost the CPU cycles to execute. The default probability of failure is zero.

Normally only the first column of numbers is used in the file.

Description of Default Instruction Set

Below are the descriptions of the instructions turned on in the file instsetclassic.cfg. The one-letter codes are assigned automatically to each instruction in the set, so if additional instructions are turned on, the letters given below may no longer correspond to the instructions they are presented with. If more than 26 instructions are in a set, both lowercase and capital letters will be used, and then numbers. Currently, no more than 62 distinct instructions will be represented by unique symbols.

Most terminology below that may not be familiar to you has been given a link to a file containing its definition.

(a - c) Nop Instructions: The instructions nop-A (a), nop-B (b), and nop-C (c) are no-operation instructions, and will not do anything when executed.
They will, however, modify the behavior of the instruction preceding them (by changing the CPU component that it affects; see also nop-register notation and nop-head notation) or act as part of a template to denote positions in the genome.

(d) if-n-equ: This instruction compares the ?BX? register to its complement. If they are not equal, the next instruction (after a modifying no-operation instruction, if one is present) is executed. If they are equal, that next instruction is skipped.

(e) if-less: This instruction compares the ?BX? register to its complement. If ?BX? is the lesser of the pair, the next instruction (after a modifying no-operation instruction, if one is present) is executed. If it is greater or equal, then that next instruction is skipped.

(f) pop: This instruction removes the top element from the active stack, and places it into the ?BX? register.

(g) push: This instruction reads in the contents of the ?BX? register, and places it as a new entry at the top of the active stack. The ?BX? register itself remains unchanged.

(h) swap-stk: This instruction toggles the active stack in the CPU. All other instructions that use a stack will always use the active one.

(i) swap: This instruction swaps the contents of the ?BX? register with its complement.

(j) shift-r: This instruction reads in the contents of the ?BX? register, and shifts all of the bits in that register to the right by one. In effect, it divides the value stored in the register by two, rounding down.

(k) shift-l: This instruction reads in the contents of the ?BX? register, and shifts all of the bits in that register to the left by one, placing a zero as the new rightmost bit, and truncating any bits beyond the 32 maximum. For values that require fewer than 32 bits, it effectively multiplies that value by two.

(l) inc and (m) dec: These instructions read in the contents of the ?BX?
register and increment or decrement it by one.

(n) add and (o) sub: These instructions read in the contents of the BX and CX registers and either sum them together or subtract CX from BX (respectively). The result of this operation is then placed in the ?BX? register.

(p) nand: This instruction reads in the contents of the BX and CX registers (each of which is a 32-bit number) and performs a bitwise nand operation on them. The result of this operation is placed in the ?BX? register. Note that this is the only logic operation provided in the basic Avida instruction set.

(q) IO: This is the input/output instruction. It takes the contents of the ?BX? register and outputs it, checking it for any tasks that may have been performed. It will then place a new input into ?BX?.

(r) h-alloc: This instruction allocates additional memory for the organism up to the maximum it is allowed to use for its offspring.

(s) h-divide: This instruction is used for an organism to divide off a finished offspring. The original organism keeps the state of its memory up until the read-head. The offspring's memory is initialized to everything between the read-head and the write-head. All memory past the write-head is removed entirely.

(t) h-copy: This instruction reads the contents of the organism's memory at the position of the read-head, and copies it to the position of the write-head. If a nonzero copy mutation rate is set, a test will be made based on this probability to determine if a mutation occurs. If so, a random instruction (chosen from the full set with equal probability) will be placed at the write-head instead.

(u) h-search: This instruction will read in the template that follows it, and find the location of a complement template in the code. The BX register will be set to the distance to the complement from the current position of the instruction pointer, and the CX register will be set to the size of the template.
The flow-head will also be placed at the beginning of the complement template. If no template follows, both BX and CX will be set to zero, and the flow-head will be placed on the instruction immediately following the h-search.

(v) mov-head: This instruction will cause the ?IP? to jump to the position in memory of the flow-head.

(w) jmp-head: This instruction will read in the value of the CX register, and then move the ?IP? by that fixed amount through the organism's memory.

(x) get-head: This instruction will copy the position of the ?IP? into the CX register.

(y) if-label: This instruction reads in the template that follows it, and tests if its complement template was the most recent series of instructions copied. If so, it executes the next instruction; otherwise it skips it. This instruction is commonly used for an organism to determine when it has finished producing its offspring.

(z) set-flow: This instruction moves the flow-head to the memory position denoted in the ?CX? register.

Other available instructions

h-push and h-pop: These instructions act similarly to push and pop above, but instead of working with registers, they place the position of the ?IP? on the stack, or put the ?IP? at the position taken from the stack (respectively).

inject: This instruction acts similarly to divide, but instead of splitting off an offspring, it will remove the section of code between the read and write heads, and attempt to inject it into the neighbor that the organism is facing. The template following this instruction will be used; if an exact match is found (with no extra nops in it) in the target organism, the injected code will be placed immediately after that template. Otherwise the command fails, and the code intended for injection is instead discarded.

rotate-l and rotate-r: These instructions rotate the facing of an organism.
If no template follows, the organism will turn one cell in the appropriate direction (left or right). If a template is present, it will keep turning in that direction until either it has made a full 360 degree turn, or else it finds an organism that possesses the complement template.

div-asex: Same as h-divide (added for symmetry with div-sex).

div-sex: Divide with recombination. After the offspring genome is created, it is not immediately placed into the population. Instead, it goes into the "birth chamber". If there is already another genome there, they recombine. If not, it waits until the next sexually produced genotype arrives. When another genome arrives, two random points are picked in the genome, and the area between them is swapped between the two genomes in the birth chamber. Then, they are both placed into the population.

div-asex-w: Control for the effect of sexual genomes waiting in the birth chamber. There is no recombination here, but each genome must wait in the birth chamber until another one arrives before they are both placed into the population.

die: When executed, kills the organism, with the probability set by DIE_PROB in genesis.

Avida : The Events File 08/28/2007 04:34 PM
Revised 2006-09-05 DMB

The Events File

The events file controls events that need to occur throughout the course of a run. This includes the output of data files as well as active events that affect the population (such as extinction events or changes to the mutation rate).

File Formats

This file consists of a list of events that will be triggered either singly or periodically. The format for each line is:

type timing event arguments

The type determines what kind of timing the event will be based on.
This can be immediate [i], based on update [u], or based on generation [g]. The timing should only be included for non-immediate events. If a single number is given for timing, the event occurs at that update/generation. A second number can be included (separated by a colon ':') to indicate how often the event should be repeated. And if a third number is listed (again, colon separated) this will be the last time the event can occur. For example, the type and timing u 100:100:5000 would indicate that the event that follows first occurs at update 100, and repeats every 100 updates thereafter until update 5000. A type timing of g 10:10 would cause the event to be triggered every 10 generations for the entire run.

The event is simply the name of the action that should be performed, and the arguments detail exactly how it should work when it is triggered. Each action has its own arguments. See the List of Actions for details about all of the available options.

Some examples:

i Inject
Inject an additional start creature immediately.

u 100:100 PrintAverageData
Print out all average measurements collected every one hundred updates, starting at update 100.

g 10000:10:20000 PrintData dom_info.dat update,dom_fitness,dom_depth,dom_sequence
Between generations 10,000 and 20,000, append the specified information to the file dom_info.dat every ten generations. Specifically, the first column in the file would be the update number, the second is the fitness of the dominant genotype, followed by the depth in the phylogenetic tree of the dominant genotype, and finally its genome sequence.

Avida : The Environment File 08/28/2007 04:35 PM
Revised 2006-09-05 DMB

The Environment File

This is the setup file for the task/resource system in Avida.
Two main keywords are used in this file, RESOURCE and REACTION. Their formats are:

RESOURCE name[:flow] {name ...}
REACTION name task [process:...] [requisite:...]

Where name is a unique identifier. Resources can have additional flow information to indicate starting amounts, inflow and outflow. Reactions are further described by the task that triggers them, the processes they perform (including resources used and the results of using them), and requisites on when they can occur.

All entries on a resource line are names of individual resources. Resources have a global quantity depletable by all organisms. The resource name infinite is used to refer to an undepletable resource. The following chart specifies additional descriptions for resource initialization.

Table 1: Resource Specifications (blue variables used for all resources while red variables are only used for spatial resources)

inflow: The number of units of the resource that enter the population over the course of an update. For a global resource this inflow occurs evenly throughout the update, not all at once. For a spatial resource this inflow amount is added every update evenly to all grid cells in the rectangle described by the points (inflowx1,inflowy1) and (inflowx2,inflowy2). Default: 0.

outflow: The fraction of the resource that will flow out of the population each update. As with inflow, this happens continuously over the course of the update for a global resource. In the case of a spatial resource the fraction is withdrawn each update from each cell in the rectangle described by the points (outflowx1,outflowy1) and (outflowx2,outflowy2). Default: 0.0.

initial: The initial abundance of the resource in the population at the start of an experiment. For a spatial resource the initial amount is spread evenly to each cell in the world grid. Default: 0.

geometry: The layout of the resource in space.
global -- the entire pool of a resource is available to all organisms.
grid -- organisms can only access resources in their grid cell. Resource cannot flow past the edges of the world grid. (Resource will use spatial parameters.)
torus -- organisms can only access resources in their grid cell. Resource can flow to the opposite edges of the world grid. (Resource will use spatial parameters.)
Default: global.

inflowx1: Leftmost coordinate of the rectangle where resource will flow into the world grid. Default: 0.

inflowx2: Rightmost coordinate of the rectangle where resource will flow into the world grid. Default: 0.

inflowy1: Topmost coordinate of the rectangle where resource will flow into the world grid. Default: 0.

inflowy2: Bottommost coordinate of the rectangle where resource will flow into the world grid. Default: 0.

outflowx1: Leftmost coordinate of the rectangle where resource will flow out of the world grid. Default: 0.

outflowx2: Rightmost coordinate of the rectangle where resource will flow out of the world grid. Default: 0.

outflowy1: Topmost coordinate of the rectangle where resource will flow out of the world grid. Default: 0.

outflowy2: Bottommost coordinate of the rectangle where resource will flow out of the world grid. Default: 0.

xdiffuse: How fast material will diffuse right and left. This flow depends on the amount of resources in a given cell and the amount in the cells to the right and left of it. (0.0 - 1.0) Default: 1.0.

xgravity: How fast material will move to the right or left. This movement depends only on the amount of resource in a given cell. (-1.0 - 1.0) Default: 0.

ydiffuse: How fast material will diffuse up and down. This flow depends on the amount of resources in a given cell and the amount in the cells above and below it. (0.0 - 1.0) Default: 1.0.

ygravity: How fast material will move up or down. This movement depends only on the amount of resource in a given cell.
(-1.0 - 1.0) Default: 0.

An example of a RESOURCE statement that begins a run with a fixed amount of the (global) resource in the environment, but has no inflow or outflow, is:

RESOURCE glucose:initial=10000

If you wanted to make this into a chemostat with an equilibrium concentration of 10000 for unused resources (the level at which the inflow of 100 equals the 0.01 fraction flowing out, i.e. 100 / 0.01), you could put:

RESOURCE maltose:initial=10000:inflow=100:outflow=0.01

If you want a resource that exists spatially where the resource enters from the top and flows towards the bottom where it exits the system, you could use:

RESOURCE lactose:initial=100000:inflow=100:outflow=0.1:inflowx1=0:\
inflowx2=100:inflowy1=0:inflowy2=0:outflowx1=0:outflowx2=100:\
outflowy1=100:outflowy2=100:ygravity=0.5

Defining a resource with no parameters means that it will start at a zero quantity and have no inflow or outflow. This is sometimes desirable if you want that resource to only be present as a byproduct of a reaction. Remember, though, that you should still have an outflow rate if it's in a chemostat.

Each reaction must have a task that triggers it. Currently, eighty tasks have been implemented, as summarized in the following table (in approximate order of complexity):

Table 2: Available Tasks

echo: This task is triggered when an organism inputs a single number and outputs it without modification.

add: This task is triggered when an organism inputs two numbers, sums them together, and outputs the result.

sub: This task is triggered when an organism inputs two numbers, subtracts one from the other, and outputs the result.

not: This task is triggered when an organism inputs a 32 bit number, toggles all of the bits, and outputs the result. This is typically done either by nanding (by use of the nand instruction) the sequence to itself, or negating it and subtracting one.
The latter approach only works since numbers are stored in two's-complement notation.

nand: This task is triggered when two 32 bit numbers are input, the values are 'nanded' together in a bitwise fashion, and the result is output. Nand stands for "not and". The nand operation returns a zero if and only if both inputs are one; otherwise it returns a one.

and: This task is triggered when two 32 bit numbers are input, the values are 'anded' together in a bitwise fashion, and the result is output. The and operation returns a one if and only if both inputs are one; otherwise it returns a zero.

orn: This task is triggered when two 32 bit numbers are input, the values are 'orn-ed' together in a bitwise fashion, and the result is output. The orn operation stands for or-not. For each bit pair, it returns a one if one input is a one or the other input is a zero. Otherwise it returns a zero.

or: This task is triggered when two 32 bit numbers are input, the values are 'ored' together in a bitwise fashion, and the result is output. It returns a one if either the first input or the second input is a one; otherwise it returns a zero.

andn: This task is triggered when two 32 bit numbers are input, the values are 'andn-ed' together in a bitwise fashion, and the result is output. The andn operation stands for and-not. For each bit pair, it returns a one only if one input is a one and the other input is not a one. Otherwise it returns a zero.

nor: This task is triggered when two 32 bit numbers are input, the values are 'nored' together in a bitwise fashion, and the result is output. The nor operation stands for not-or and returns a one only if both inputs are zero. Otherwise a zero is returned.

xor: This task is triggered when two 32 bit numbers are input, the values are 'xored' together in a bitwise fashion, and the result is output. The xor operation stands for "exclusive or" and returns a one if one, but not both, of the inputs is a one. Otherwise a zero is returned.
equ This task is triggered when two 32 bit numbers are input, the values are equated together in a bitwise fashion, and the result is output. The equ operation stands for 'equals' and will return a one if both bits are identical, and a zero if they are different. logic_3AA- These tasks include all 68 possible unique 3-input logic operations, many of logic_3CP which don't have easy-to-understand human readable names. When describing a reaction, the process portion determines consumption of resources, their byproducts, and the resulting bonuses. There are several arguments (separated by colons; example below) to detail the use of a resource. Default values are in brackets: file:///Users/boccio/Desktop/documentation/environment.html Page 3 of 6 Avida : The Environment File 08/28/2007 04:35 PM Table 3: Reaction Process Specifications Argument Description Default resource The name of the resource consumed. By default, no resource is being consumed, and the 'max' limit is the amount absorbed. infinite value Multiply the value set here by the amount of the resource consumed to obtain the bonus. (0.5 may be inefficient, while 5.0 is very efficient.) This allows different reactions to make use of the same resource at different efficiency levels. 1.0 type Determines how to apply the bonus (i.e. the amount of the resource absorbed times the value of this process) to change the merit of the organism. add: Directly add the bonus to the current merit. mult: Multiply the current merit by the bonus (warning: if the bonus add is ever less than one, this will be detrimental!) pow: Multiply the current merit by 2 bonus . this is effectively multiplicative, but positive bonuses are always beneficial, and negative bonuses are harmful. max The maximum amount of the resource consumed per occurrence. 1.0 min The minimum amount of resource required. If less than this quantity is available, the reaction ceases to proceed. 
0.0 frac The maximum fraction of the available resource that can be consumed. 1.0 product The name of the by-product resource. At the moment, only a single by-product can be produced at a time. conversion The conversion rate to by-product resource none 1.0 inst The instruction that gets executed when this reaction gets preformed. If you do not want an organism to be able to have the instruction in their genome, you still must put it in the none instruction set file, but set its weight to zero. The instruction is executed at no cost to the organism. lethal Whether the cell dies after performing the process 0 If no process is given, a single associated process with all default settings is assumed. If multiple process statements are given, all are acted upon when the reaction is triggered. Assuming you were going to set all of the portions of process to be their default values, this portion of the reaction statement would appear as: process:resource=infinite:value=1:type=add:max=1:min=0:frac=1:product=none:conversion=1 This statement has many redundancies; for example, it would indicate that the associated reaction should use the inifite resource, making 'frac' and 'min' settings irrelevant. Likewise, since 'product' is set to none, the 'conversion' rate is never considered. The requisite entry limits when this reaction can be triggered. The following requisites (in any combination) are possible: Table 4: Reaction Requisite Specifications Argument Description Default This limits this reaction from being triggered until the other file:///Users/boccio/Desktop/documentation/environment.html Page 4 of 6 Avida : The Environment File reaction 08/28/2007 04:35 PM reaction specified here has been triggered first. With this, the user none can force organisms to perform reactions in a specified order. This limits this reaction from being triggered if the reaction specified here has already been triggered. 
This allows the user to noreaction make mutually exclusive reactions, and force organisms to "choose" their own path. none min_count This restriction requires that the task used to trigger this reaction must be performed a certain number of times before the trigger will actually occur. This (along with max_count) allows the user to 0 provide different reactions depending on the number of times an organism has performed a task. max_count This restriction places a cap on the number of times a task can be done and still trigger this reaction. It allows the user to limit the number of times a reaction can be done, as well as (along with INT_MAX min_count) provide different reactions depending on the number of times an organism as performed a task. No restrictions are present by default. If there are multiple requisite entries, only *one* of them need be satisfied in order to trigger the reaction. Note though that a single requisite entry can have as many portions as needed. Examples We could simulate the pre-environment system (in which no resources were present and task performace was rewarded with a fixed bonus) with a file including only lines like: REACTION AND logic:2a process:type=mult:value=4.0 REACTION EQU logic:2h process:type=mult:value=32.0 requisite:max_count=1 requisite:max_count=1 No RESOURCE statements need be included since only the infinite resource is used (by default, since we don't specify another resources' name) # To create an environment with two resources that are converted back and forth as tasks are performed, we might have: RESOURCE RESOURCE REACTION REACTION yummyA:initial=1000 yummyB:initial=1000 AtoB gobbleA process:resource=yummyA:frac=0.001:product=yummyB BtoA gobbleB process:resource=yummyB:frac=0.001:product=yummyA A value of 1.0 per reaction is default. Obviously gobbleA and gobbleB would have to be tasks described within Avida. 
A requisite against the other reaction being performed would prevent a single organism from garnering both rewards in equal measure.

As an example, to simulate a chemostat, we might have:

    RESOURCE glucose:inflow=100:outflow=0.01

This would create a resource called "glucose" that has a fixed inflow rate of 100 units where 1% flows out every update (leaving a steady state of 10,000 units if no organism consumption occurs).

Limitations to this system:

Resources are currently all global; at some point soon we need to implement local resources.

Only a single resource can be required at a time, and only a single by-product can be produced.

The default setup is:

    REACTION NOT  not  process:value=1.0:type=pow requisite:max_count=1
    REACTION NAND nand process:value=1.0:type=pow requisite:max_count=1
    REACTION AND  and  process:value=2.0:type=pow requisite:max_count=1
    REACTION ORN  orn  process:value=2.0:type=pow requisite:max_count=1
    REACTION OR   or   process:value=3.0:type=pow requisite:max_count=1
    REACTION ANDN andn process:value=3.0:type=pow requisite:max_count=1
    REACTION NOR  nor  process:value=4.0:type=pow requisite:max_count=1
    REACTION XOR  xor  process:value=4.0:type=pow requisite:max_count=1
    REACTION EQU  equ  process:value=5.0:type=pow requisite:max_count=1

This creates an environment where the organisms get a bonus for performing any of nine tasks. Since none of the reactions are associated with a resource, the infinite resource is assumed, which is non-depletable. The max_count of one means they can only get the bonus from each reaction a single time.
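The effect of the default setup on merit can be worked through with a small sketch. This is not Avida code: the function name merit_after is hypothetical, and it assumes that with the infinite resource the amount absorbed is the default 'max' of 1.0, so the bonus of each reaction equals its 'value', that type=pow multiplies merit by 2^bonus, and that max_count=1 means repeated performances of a task earn nothing further.

```python
# Illustrative sketch only (not part of Avida). Assumptions: the amount
# absorbed from the infinite resource is max=1.0, so bonus == value;
# type=pow multiplies merit by 2**bonus; max_count=1 rewards a task once.
DEFAULT_VALUES = {
    "NOT": 1.0, "NAND": 1.0, "AND": 2.0, "ORN": 2.0, "OR": 3.0,
    "ANDN": 3.0, "NOR": 4.0, "XOR": 4.0, "EQU": 5.0,
}


def merit_after(base_merit, tasks_performed):
    """Apply the default pow-type reactions to a starting merit."""
    merit = base_merit
    rewarded = set()
    for task in tasks_performed:
        if task in rewarded:
            continue  # max_count=1: later performances do not trigger
        rewarded.add(task)
        merit *= 2 ** DEFAULT_VALUES[task]
    return merit


# NOT is rewarded once (x2), EQU once (x32); the second NOT is ignored.
print(merit_after(1.0, ["NOT", "NOT", "EQU"]))
```

Under these assumptions an organism that performs all nine tasks would have its merit multiplied by 2 raised to the sum of the nine values.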
A similar setup that has 9 resources, one corresponding to each of the nine possible tasks listed above, is:

    RESOURCE resNOT:inflow=100:outflow=0.01  resNAND:inflow=100:outflow=0.01
    RESOURCE resAND:inflow=100:outflow=0.01  resORN:inflow=100:outflow=0.01
    RESOURCE resOR:inflow=100:outflow=0.01   resANDN:inflow=100:outflow=0.01
    RESOURCE resNOR:inflow=100:outflow=0.01  resXOR:inflow=100:outflow=0.01
    RESOURCE resEQU:inflow=100:outflow=0.01

    REACTION NOT  not  process:resource=resNOT:value=1.0:frac=0.0025
    REACTION NAND nand process:resource=resNAND:value=1.0:frac=0.0025
    REACTION AND  and  process:resource=resAND:value=2.0:frac=0.0025
    REACTION ORN  orn  process:resource=resORN:value=2.0:frac=0.0025
    REACTION OR   or   process:resource=resOR:value=4.0:frac=0.0025
    REACTION ANDN andn process:resource=resANDN:value=4.0:frac=0.0025
    REACTION NOR  nor  process:resource=resNOR:value=8.0:frac=0.0025
    REACTION XOR  xor  process:resource=resXOR:value=8.0:frac=0.0025
    REACTION EQU  equ  process:resource=resEQU:value=16.0:frac=0.0025

The Analyze File

Revised 2006-09-03 DMB

The file analyze.cfg is used to set up Avida when it is run in analyze mode, which can be done by running avida -a. Analyze mode is useful for performing additional tests on genotypes after a run has completed.

This analysis language is basically a simple programming language. The structure of a program involves loading in genotypes in one or more batches, and then either manipulating single batches, or doing comparisons between batches. Currently there can be up to 2000 batches of genotypes, but we will eventually remove this limit.

The rest of this file describes how individual commands work, as well as some notes on other language features, like how to use variables. As a formatting guide, command arguments will be presented between brackets, such as [filename].
If that argument is mandatory, it will be in blue. If it is optional, it will be in green, and (if relevant) a default value will be listed, such as [filename='output.dat'].

Analyze Mode Commands

Analyze mode provides a number of commands for loading, manipulating, and saving analysis data. In addition to the analyze mode specific commands detailed in the following sections, all of the Avida actions can be called as well.

Load Commands

There are currently four ways to load in genotypes:

LOAD_ORGANISM [filename]
Load in a normal single-organism file of the type that is output from Avida. These consist of lots of organismal information inside of comments, and then the full genome of the organism with one instruction per line.

LOAD [filename]
Load in a file that contains a list of genotypes, one per line, with additional information about those genotypes. Avida now includes a header on such files indicating the values contained in each column.

LOAD_SEQUENCE [sequence]
Load in a user-provided sequence as the genotype. Avida has a symbol associated with each instruction; this command is simply followed by a sequence of such symbols that is then translated back into a proper genotype.

LOAD_MULTI_DETAIL [start-UD] [step-UD] [stop-UD] [dir='./'] [start batch=0]
Allows the user to load in multiple detail files at once, one per batch. This is helpful when you're trying to do parallel analysis on many detail files, or else to create a phylogenetic depth map. Example:

    LOAD_MULTI_DETAIL 100 100 100000 ../my_run/run100/

This would load in the files detail_pop.100 through detail_pop.100000 in steps of 100, from the directory of my choosing. Since 1000 files will be loaded and we didn't specify a starting batch, they will be put in batches 0 through 999.
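The file-to-batch mapping in the LOAD_MULTI_DETAIL example above can be sketched in a few lines. This is an illustration, not Avida's implementation: the function name multi_detail_plan is hypothetical, and it assumes that the stop update is inclusive and that the directory string is simply prefixed to each file name, which is what the example's "detail_pop.100 through detail_pop.100000" and "1000 files" suggest.

```python
# Illustrative sketch only (not part of Avida). Assumptions: stop_ud is
# inclusive, and the file path is the directory followed by
# "detail_pop.<update>"; batches are numbered consecutively.
def multi_detail_plan(start_ud, step_ud, stop_ud, directory="./", start_batch=0):
    """Return a list of (batch_id, path) pairs, one detail file per batch."""
    plan = []
    updates = range(start_ud, stop_ud + 1, step_ud)
    for offset, update in enumerate(updates):
        plan.append((start_batch + offset, f"{directory}detail_pop.{update}"))
    return plan


plan = multi_detail_plan(100, 100, 100000, "../my_run/run100/")
print(len(plan))   # number of files (and batches) used
print(plan[0])     # first batch and file
print(plan[-1])    # last batch and file
```

Passing a non-zero start_batch shifts the whole range, which is useful when earlier batches already hold genotypes that should not be mixed with the newly loaded files.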
A future addition to this list is a command that will use the "dominant.dat" file to identify all of the dominant genotypes from a run, and then look up and load their individual genomes from the archive directory.

Batch Control Commands

All of the load commands place the new genotypes into the current batch, which can be set with the SET_BATCH command. Below is the list of control functions that allow you to manipulate the batches.

SET_BATCH [id]
Set the batch that is currently active; the initial active batch at the start of a program is 0.

NAME_BATCH [name]
Attach a name to the current batch. Some of the printing methods will print data from multiple batches, and we want the data from each batch to be attached to a meaningful identifier.

PURGE_BATCH [id=current]
Remove all genotypes in the specified batch (if no argument is given, the current batch is purged).

DUPLICATE [id1] [id2=current]
Copy the genotypes from batch id1 into id2. By default, copy id1 into the current batch. Note that duplicate is non-destructive, so you should purge the target batch first if you don't want to just add more genotypes to the ones already in that batch.

STATUS
Print out (to the screen) the genotype count of each non-empty batch and identify the currently active batch.

Analysis Control Commands

There are several other commands that will allow you to interact with the analysis mode in some very important ways, but don't actually trigger any analysis tests or output. Below is a list of some of the more important control commands.

SYSTEM [command]
Run the command listed on the command line. This is particularly useful if you need to unzip files before you can use them, or if you want to delete files no longer in use.

INCLUDE [filename]
Include another file into this one and run its contents immediately.
This is useful if you have some pre-written routines that you want to have available in several analysis files. Watch out because there are currently no protections against circular includes.

INTERACTIVE
Place Avida analysis into interactive mode so that you can type commands and have them immediately acted upon. You can place this anywhere within the analyze file, so that you can have some processing done before interactive mode starts. You can type quit at any point to continue with the normal processing of the file.

DEBUG [message]
ECHO [message]
These are both echo commands that will print a message (the arguments given) onto the screen. If there are any variables (see below) in the message, they will be translated before printing, so this is a good way of debugging your programs.

Genotype Manipulation Commands

Now that we know how to interact with analysis mode and load in genotypes, it's important to be able to manipulate them. The next batch of commands will do basic analysis on genotypes, and allow the user to prune batches to only include those genotypes that are needed.

RECALCULATE [use_resources=0]
Run all of the genotypes in the current batch through a test CPU and record the measurements taken (fitness, gestation time, etc.). This overrides any values that may have been loaded in with the genotypes. The use_resources flag signifies whether or not the test cpu will use resources when it runs. For more information on resources, see the summary below.

FIND_GENOTYPE [type='num_cpus' ...]
Remove all genotypes but the one selected. Type indicates which genotype to choose. Options available for type are num_cpus (to choose the genotype with the maximum organismal abundance at time of printing), total_cpus (number of organisms ever of this genotype), fitness, or merit. If the type entered is numerical, it is used as an id number to indicate the desired genotype (if no such id exists, a warning will be given).
Multiple arguments can be given to this command, in which case all those genotypes in that list will be preserved and the remainder deleted.

FIND_ORGANISM [random]
Picks out a random organism from the population and removes all others. It is different from FIND_GENOTYPE because it takes into account the relative number of organisms within each genotype. To pick more than one organism, list the word 'random' multiple times. This is essentially sampling without replacement from the population.

FIND_LINEAGE [type="num_cpus"]
Delete everything except the lineage from the chosen genotype back to the most distant ancestor available. This command will only function properly if parental information was loaded in with the genotypes. Type is the same as for the FIND_GENOTYPE command.

FIND_SEX_LINEAGE [type="num_cpus"] [parent_method="rec_region_size"]
Delete everything except the lineage from the chosen genotype back to the most distant ancestor available. Similar to FIND_LINEAGE, but works in sexual populations. To simplify things, only the maternal lineage plus immediate fathers are saved, i.e. info about the father's parents is discarded. The second option, parent_method, determines which parent is considered the 'mother' in each particular recombination. If parent_method is "rec_region_size", the 'mother' is the parent contributing more code to the offspring genome (default); if it is genome_size, the 'mother' is the parent with the longer genome, no matter how much of it was contributed to the offspring. This command will only function properly if parental information was loaded in with the genotypes. Type is the same as for the FIND_GENOTYPE command.

ALIGN
Create an alignment of all the genome sequences; it will place '_'s in the sequences to show the alignment. Note that a FIND_LINEAGE must first be run on the batch in order for the alignment to be possible.
SAMPLE_ORGANISMS [fraction] [test_viable=0]
Keep only the given fraction of organisms in the current batch. This is done per organism, not per genotype. Thus, genotypes of high abundance may only have their abundance lowered, while genotypes of abundance 1 will either stay or be removed entirely. If test_viable is set to 1, sample only from the viable organisms.

SAMPLE_GENOTYPES [fraction] [test_viable=0]
Keep only the given fraction of genotypes in the current batch. If test_viable is set to 1, sample only from the viable genotypes.

RENAME [start_id=0]
Change the id numbers of all the genotypes to start at a given value. Often in long runs we will be dealing with IDs in the millions. In particular, after reducing a batch to a lineage, we will often want to number the genotypes in order from the ancestor to the final one.

Basic Output Commands

Next, we are going to look at the standard output commands that are used to save information generated in analyze mode.

PRINT [dir='archive/'] [filename]
Print the genotypes from the current batch as individual files (one genotype per file) in the directory given. If no filename is specified, the files will be named by the genotype name, with a .gen appended to them. Specifying the filename is useful when printing a single genotype.

TRACE [dir='archive/'] [use_resources=0]
Trace all of the genotypes and print a listing of their execution. This will show step-by-step the status of all of the CPU components and the genome during the course of the execution. The filename used for each trace will be the genotype's name with a .trace appended. The use_resources flag signifies whether or not the test cpu will use resources when it runs. For more information on resources, see the summary below.

PRINT_TASKS [file='tasks.dat']
This will print out the tasks doable by each genotype, one per line in the output file specified.
Note that this information must either have been loaded in, or a RECALCULATE must have been run to collect it.

DETAIL [file='detail.dat'] [format ...]
Print out all of the stats for each genotype, one per line. The format indicates the layout of columns in the file. If the filename specified ends in .html, html formatting will be used instead of plain text. For the format, see the section on Output Formats below.

DETAIL_TIMELINE [file='detail_timeline.dat'] [time_step=100] [max_time=100000]
Details a time-sequence of dump files.

DETAIL_BATCHES [file='detail_baches.dat'] [format ...]
Details all batches.

DETAIL_INDEX [file] [min_batch] [max_batch] [format ...]
Detail all the batches between min_batch and max_batch.

DETAIL_AVERAGE [file="detail.dat"] [format ...]
Detail the current batch, but print out the average for each argument, as opposed to the individual values for each genotype, the way DETAIL would. Arguments are the same as for DETAIL. It takes into account the relative abundance of each genotype in the batch when calculating the averages.

Analysis Commands

And at last, we have the actual analysis commands that perform tests on the data and output the results.

ANALYZE_EPISTASIS [file='epistasis.dat'] [num_test=(all)]
For each genotype in the current batch, test possible double mutants and the single mutations composing them; print both the individual relative fitnesses and the double mutant relative fitness. By default all double mutants are tested. If in a hurry, specify the number to be tested.

MAP_TASKS [dir="phenotype/"] [flags ...] [format ...]
Construct a genotype-phenotype array for each genotype in the current batch. The format is the list of stats that you want to include as columns in the array. Additionally you can have special format flags; the possible flags are 'html' to print output in HTML format, and 'link_maps' to create html links between consecutive genotypes in a lineage.

MAP_MUTATIONS [dir="mutations/"] [flags ...]
Construct a genome-mutation array for each genotype in the current batch. The format has each line in the genome as a row in the chart, and all available instructions representing the columns. The cells in the chart indicate the fitness were a mutation to occur at that position in the matrix, to the listed instruction. If the 'html' flag is used, the charts will be output in HTML format.

MAP_DEPTH [filename='depth_map.dat'] [min_batch=0] [max_batch=cur_batch-1]
This will create a depth map (like those we use for phylogeny visualization) in the filename specified. You can direct which batches to take this from, but by default it will work perfectly after a LOAD_MULTI_DETAIL.

AVERAGE_MODULATITY [file='modularity.dat'] [task.0 task.1 task.2 task.3 task.4 task.5 task.6 task.7 task.8]
Calculate several modularity measures, such as how many tasks an instruction is involved in, the number of sites required for each task, etc. The measures are averaged over all the organisms in the current batch that perform any tasks. For the full output list, do AVERAGE_MODULATITY legend.dat. At the moment this command does not support the html output format and works only with 1- and 2-input tasks.

HAMMING [file="hamming.dat"] [b1=current] [b2=b1]
Calculate the Hamming distance between batches b1 and b2. If only one batch is given, calculations are on all pairs within that batch.

LEVENSTEIN [file='lev.dat'] [batch1] [b2=b1]
Calculate the Levenshtein distance (edit distance) between batches b1 and b2. This metric is similar to Hamming distance, but calculates the minimum number of single insertions, deletions, and mutations to move from one sequence to the other.

SPECIES [file='species.dat'] [batch1] [batch2] [num_recombinants]
Calculates the percentage of non-viable recombinants between all pairs of organisms from batches 1 and 2.
The number of random recombination events for each pair of organisms is specified by num_recombinants. Recombination is done in the same way as in the birth chamber when divide-sex is executed. Output: Batch1Name Batch2Name AveDistance Count FailCount

RECOMBINE [batch1] [batch2] [batch3] [num_recombinants]
Similar to the SPECIES command, but instead of calculating things on the spot, just create all the recombinant genotypes using organisms from batches 1 and 2 and put them in batch3.

Using Test CPU Resources Summary

This summary is given to help explain the use and constraints for using resources.

When a command specifies the use of resources for the test cpu, it should not affect the state of the test cpu after the command has finished. However, this means that the test cpu is no longer guaranteed to be reentrant. Each command will set up the environment and the resource count in the test cpu with its own environment and resource count. When the command has finished it will set the test cpu's environment and resource count back to what they were before the command was executed.

Resource usage for the test cpu occurs by setting the environment and then setting up the resource count using the environment. Once the resource count has been set up, it will not change during the use of the test cpu. When an organism performs an IO, completing a task, the concentrations are not changed. This was a design decision, but is easily changed.

In analyze, a new data structure was included which contains a time ordered list of resource concentrations. This list can be used to set up resources from different time points. By using the FillResources function, you can have the resource library updated with resource concentrations from a time point closest to the user specified time point.
If the LOAD_RESOURCES command is not called, the list defaults to a single entry which is the initial concentrations of the resources specified in the environment configuration file.

PRINT_TEST_CPU_RESOURCES
This command first prints whether or not the test cpu is using resources. Then it will print the concentration for each resource.

LOAD_RESOURCES [file_name="resource.dat"]
This command loads a time oriented list of resource concentrations. The command takes a file name containing this type of data, and defaults to resource.dat. The format of the file must be the same as resource.dat, and each line must be in the correct chronological order with oldest first.

Output Formats

Several commands (such as DETAIL and MAP) require format parameters to specify what genotypic features should be output. Before such commands are used, other collection functions may need to be run.
Allowable formats after a normal load (assuming these values were available from the input file to be loaded in) are:

    id (Genome ID)
    parent_id (Parent ID)
    num_cpus (Number of CPUs)
    total_cpus (Total CPUs Ever)
    length (Genome Length)
    update_born (Update Born)
    update_dead (Update Dead)
    depth (Tree Depth)
    sequence (Genome Sequence)

After a RECALCULATE, the additional formats become available:

    viable (Is Viable [0/1])
    copy_length (Copied Length)
    exe_length (Executed Length)
    merit (Merit)
    comp_merit (Computational Merit)
    gest_time (Gestation Time)
    efficiency (Replication Efficiency)
    fitness (Fitness)
    div_type (Divide type used; 1 is default)
    task.n (# of times task number n is done)
    task.n:binary (is task n done, 0/1)

If a FIND_LINEAGE was done before the RECALCULATE, the parent genotype for each regular genotype will be available, enabling the additional formats:

    parent_dist (Parent Distance)
    parent_muts (Mutations from Parent)
    comp_merit_ratio (Computational Merit Ratio with parent)
    efficiency_ratio (Replication Efficiency Ratio with parent)
    fitness_ratio (Fitness Ratio with parent)
    html.sequence (Genome Sequence in Color; html format)

If an ALIGN is run, one additional format is available:

    alignment (Aligned Sequence)

Finally, there are a handful of formats that will automatically perform landscaping. The landscape will only be run once per organism even when multiple output variables are used. For enhanced performance on multiprocessor/multi-core systems, see the PrecalcLandscape action.
    frac_dead (Fraction of Lethal Mutations)
    frac_neg (Fraction of Harmful Mutations)
    frac_neut (Fraction of Neutral Mutations)
    frac_pos (Fraction of Beneficial Mutations)
    complexity (Physical Complexity of Organism)
    land_fitness (Average Mutation Fitness)

Variables

For the moment, all variables can only be a single character (letter or number) and begin with a $ whenever they need to be translated to their value. Lowercase letters are global variables, capital letters are local to a function (described later), and numbers are arguments to a function. A $$ will act as a single dollar sign, if needed.

SET [variable] [value]
Sets the variable to the value...

FOREACH [variable] [value] [value ...]
Set the variable to each of the values listed, and run the code that follows between here and the next END command once for each of those values.

FORRANGE [variable] [min_value] [max_value] [step_value=1]
Set the variable to each of the values between min and max (at steps given), and run the code that follows between here and the next END command, once for each of those values.

Functions

These functions are currently very primitive with fixed inputs of $0 through $9. $0 is always the function name, and then there can be up to 9 other arguments passed through. Once a function is created, it can be run just like any other command.

FUNCTION [name]
This will create a function of the given name, including in it all of the commands up until an END is found. These commands will be bound to the function, but are not executed until the function is run as a command. Inside the function, the variables $1 through $9 can be used to access arguments passed in.

Currently there are no conditionals or mathematical commands in this scripting language. These are both planned for the future.
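The variable translation rules from the Variables section above (single-character names, a leading $ to request the value, and $$ for a literal dollar sign) can be illustrated with a short sketch. This is not Avida's parser: the function name expand is hypothetical, and leaving a $ untouched when it is followed by an unset variable is an assumption made here, since the text does not say what Avida does in that case.

```python
# Illustrative sketch only (not Avida's parser). Assumptions: variable
# names are exactly one character; "$$" yields a literal "$"; a "$"
# followed by an unset variable is left unchanged.
def expand(text, variables):
    """Translate $x references in text using a dict of one-char names."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "$" and i + 1 < len(text):
            nxt = text[i + 1]
            if nxt == "$":              # "$$" acts as a single dollar sign
                out.append("$")
                i += 2
                continue
            if nxt in variables:        # "$x" becomes the value of x
                out.append(str(variables[nxt]))
                i += 2
                continue
        out.append(ch)
        i += 1
    return "".join(out)


print(expand("lineage.$i.html", {"i": 100}))
print(expand("$d/detail.dat", {"d": "/runs/evo_neut_100"}))
print(expand("price: $$5", {}))
```

Because names are a single character, a reference such as $i can be followed directly by other text (as in lineage.$i.html) without any delimiter.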
Sample Programs from Analyze Mode

Revised 2006-09-03 DMB

This document gives some example analyze programs and explains how they function.

Testing a genome sequence

The following program will load in a genome sequence, run it through a test CPU, and output the information about it in a couple of formats.

    VERBOSE
    LOAD_SEQUENCE rmzavcgmciqqptqpqcpctletncogcbeamqdtqcptipqfpgqxutycuastttva
    RECALCULATE
    DETAIL detail_test.dat fitness merit gest_time length viable sequence
    TRACE
    PRINT

This program starts off with the VERBOSE command so that Avida will print to the screen all of the details about what is going on as it runs the analyze script; I recommend you begin all of your programs this way for debugging purposes. The program then uses the LOAD_SEQUENCE command to allow the user to enter a specific genome sequence in its compressed format. This will translate the genome into the proper genotype (as long as you are using the correct instruction set file, since that file determines the mapping of letters to instructions).

The RECALCULATE command places the genome sequence into a test CPU, and determines its fitness, merit, gestation time, etc. so that the DETAIL command that follows it can have access to all of this information as it prints it to the file "detail_test.dat" (its first argument). The TRACE and PRINT commands will then print individual files about this genome, the first tracing its execution line-by-line, and the second summarizing all sorts of statistics about it and displaying the genome. Since no directory was specified for these commands, archive/ is assumed, and the filenames are org-S1.trace and org-S1.gen. If a genotype has a name when it is loaded, that name will be kept, but if it doesn't, it will be assigned a name starting at org-S1, then org-S2, and so on counting higher.
The TRACE and PRINT commands add their own suffixes to the genome's name to determine the filename they will be printed as.

Using Variables

Often, you will want to run the same section of analyze code with multiple different inputs each time through, or else you might simply want a single value to be easy to change throughout the code. To facilitate such programming practices, variables are available in analyze mode that can be altered for each repetition through the code.

There are actually several types of variables, all of which are a single letter or number. For a command that requires a variable name as an input, you simply put that variable where it is requested. For example, if you were going to set the variable i to be equal to the number 12, you would type:

    SET i 12

But later on in the code, how does Avida know when you type an i if you really want the letter 'i' there, or if you prefer the number 12 to be there? To distinguish these cases, you must put a dollar sign '$' before a variable wherever you want it to be translated to its value instead of just using the variable name itself.

There are a few different commands that allow you to manipulate a variable's value, and sometimes execute a section of code multiple times based off of each of the possible values. Here is one example:

    FORRANGE i 100 199
      SET d /home/charles/dev/avida/runs/evo-neut/evo_neut_$i
      PURGE_BATCH
      LOAD_DETAIL_DUMP $d/detail_pop.100000
      RECALCULATE
      DETAIL $d/detail.dat update length fitness sequence
    END

The FORRANGE command runs the contents of the loop once for each possible value in the range, setting the variable i to each of these values in turn. Thus the first time through the loop, 'i' will be equal to the value '100', then '101', '102', all the way up to '199'.
In this particular case, we have 100 runs (numbered 100 through 199) that we want to work with. The first thing we do once we're inside the loop is set the value of the variable 'd' to be the name of the directory we're going to be working with. Since this is a long directory name, we don't want to have to type it over every time we need it. If we set it to the variable d, then all we need to do is type '$d' in the future, and it will be translated to the full name. Note that in this case we are setting a variable to a string instead of a number; that's just fine and Avida will figure out how to handle it properly. Note also that the directory we are working with will change each time through the loop, and that it is no problem to use one variable as part of setting another.

After we know what directory we are using, we run a PURGE_BATCH to get rid of all of the genotypes from the last time through the loop (lest we just keep building up more and more genotypes in the current batch) and then we refill the batch by using LOAD_DETAIL_DUMP to load in all of the genotypes saved in the file detail_pop.100000 within our chosen directory. The RECALCULATE command runs all of the genotypes through a test CPU so we have all the statistics we need, and finally DETAIL will print out the stats we want to the file detail.dat, again placing it in the proper directory. The END command signifies the end of the FORRANGE loop.

Finding Lineages

Quite often, the portion of an Avida run that we will be most interested in is the lineage from the final dominant genotype back to the original ancestor. As such, there are tools in Avida to get at this information.
    FORRANGE i 100 199
      SET d /home/charles/dev/avida/runs/evo-neut/evo_neut_$i
      PURGE_BATCH
      LOAD_DETAIL_DUMP $d/detail_pop.100000
      LOAD_DETAIL_DUMP $d/historic_dump.100000
      FIND_LINEAGE num_cpus
      RECALCULATE
      DETAIL lineage.$i.html depth parent_dist length fitness html.sequence
    END

This program looks very similar to the last one. The first four lines are actually identical, but after loading the detail dump at update 100,000, we also want to load the historic dump from the same time point. A detail file contains all of the genotypes that were currently alive in the population at the time it was printed, while the historic files contain all of the genotypes that are direct ancestors of those that were still alive. The combination of these two files gives us the lineages of the entire population back to the original ancestor.

Since we are only interested in a single lineage, the next thing we do is run the FIND_LINEAGE command to pick out a single genotype, and discard everything else except for its lineage. In this case, we pick the genotype with the highest abundance (the most virtual CPUs associated with it) at the time of printing.

As before, the RECALCULATE command gets us any additional information we may need about the genotypes, and then we print that information to a file using the DETAIL command. The filenames that we are using this time have the format lineage.$i.html, so they are all being written to the current directory with filenames that incorporate the run number right in them. Also, because the filename ends in the suffix '.html', Avida knows to print the file in a proper html format. Note that the specific values that we choose to print take advantage of the fact that we have a lineage (and hence measured things like the genetic distance to the parent) and are in html mode (and thus can print the sequence using colors to specify where exactly mutations occurred).
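The idea behind FIND_LINEAGE can be sketched in a few lines of Python: merge the living and historic genotypes, pick the one with the highest abundance, and follow parent links back as far as the loaded data goes. This is a simplified illustration, not Avida's implementation; the record layout (dictionaries keyed by genotype id with parent_id and num_cpus fields) and the function name find_lineage are assumptions made for the example.

```python
def find_lineage(genotypes, key="num_cpus"):
    """Return genotype ids from the oldest known ancestor to the chosen genotype.

    `genotypes` maps id -> {"parent_id": ..., key: ...}. The record layout is
    hypothetical; it only mirrors the column names used in the text.
    """
    # Pick the genotype with the largest value of `key` (highest abundance).
    current = max(genotypes, key=lambda gid: genotypes[gid][key])
    lineage, seen = [], set()
    # Follow parent links until we leave the loaded data (or hit a cycle).
    while current in genotypes and current not in seen:
        seen.add(current)
        lineage.append(current)
        current = genotypes[current]["parent_id"]
    lineage.reverse()
    return lineage

# Living genotypes (as from a detail dump) and their ancestors (historic dump).
detail = {5: {"parent_id": 3, "num_cpus": 40},
          6: {"parent_id": 4, "num_cpus": 10}}
historic = {1: {"parent_id": -1, "num_cpus": 0},
            3: {"parent_id": 1, "num_cpus": 0},
            4: {"parent_id": 1, "num_cpus": 0}}

batch = {**detail, **historic}
print(find_lineage(batch))  # prints [1, 3, 5]
```

Genotypes 4 and 6 are dropped because they are not on the path from the most abundant genotype back to the ancestor, which is the "discard everything else" step described above.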
Working with Batches

In analyze mode, we can load genotypes into multiple batches and then operate on a single batch at a time. So, for example, if we wanted to only consider the dominant genotypes at time points 100 updates apart, but all we had to work with were the detail files (containing all genotypes at each time point), we might write a program like:

    SET d /home/charles/avida/runs/mydir/here-it-is
    SET_BATCH 0
    FORRANGE u 100 100000 100            # Cycle through updates
      PURGE_BATCH                        # Purge current batch (0)
      LOAD_DETAIL_DUMP $d/detail_pop.$u  # Load in the population at this update
      FIND_GENOTYPE num_cpus             # Remove all but most abundant genotype
      DUPLICATE 0 1                      # Duplicate batch 0 into batch 1
    END
    SET_BATCH 1                          # Switch to batch 1
    RECALCULATE                          # Recalculate statistics...
    DETAIL dom.dat fitness sequence      # Print info for all dominants!

This program is slightly more complicated than the others, so I added comments directly inside it. Basically, what we do here is use batch 0 as our staging area: we load the full detail dumps into it, strip them down to only the single most abundant genotype, and then copy that genotype over into batch one. By the time we're done, we have all of the dominant genotypes inside of batch one, so we can print anything we need right from there.

Building your own Commands

One really useful feature that I have added to the analyze mode is the ability for the user to construct a variety of their own commands without modifying the source code. This is done with the FUNCTION command.
For example, if you know you will always need a file called lineage.html with very specific information in it, you might write a helper command for yourself as follows:

    FUNCTION MY_HTML_LINEAGE    # arg1=run_directory
      PURGE_BATCH
      LOAD_DETAIL_DUMP $1/detail_pop.100000
      LOAD_DETAIL_DUMP $1/historic_dump.100000
      FIND_LINEAGE num_cpus
      RECALCULATE
      DETAIL $1/lineage.html depth parent_dist length fitness html.sequence
    END

This works identically to how we found lineages and printed their data in the section above. Only this time, it has created a new command called MY_HTML_LINEAGE that you can use anytime thereafter. Arguments to functions work similarly to variables, but they are numbers instead of letters. Thus $1 translates to the first argument, $2 becomes the second, and so on. You are limited to 9 arguments at this point, but that should be enough for most tasks. $0 is the name of the function you are running, in case you ever need to use that.

You may also be interested in using functions in conjunction with the SYSTEM command. Anything you type as arguments to this command gets run on the command line, so you can make functions to do anything that could otherwise be done were you at the shell prompt. For example, imagine that you were going to use a lot of compressed files in your analysis that you would first need to uncompress. You might write a function like:

    FUNCTION UNZIP    # Arg1=filename
      SYSTEM gunzip $1
    END

This is a shorter example than you might typically want to write a function for, but it does get the point across. This would allow you to just type UNZIP <filename> whenever you needed to uncompress something.

Functions are particularly useful in conjunction with the INCLUDE command.
You can create a file called something like my_functions.cfg in your Avida work directory, define a bunch of functions there, and then start all of your analyze.cfg files with the line:

    INCLUDE my_functions.cfg

and you will have access to all of your functions thereafter. Ideally, as this language becomes more flexible, so will your ability to create functions within the language, so you will be able to develop flexible and useful libraries for yourself.

Try it Out...

Here are a couple of example problems you can try to see how well you can use analyze mode. These should get you used to working with it for future projects.

Problem 1.
A detail file in Avida contains one line associated with each genotype, in order from the most abundant to the least. Currently, the LOAD_DETAIL_DUMP command will load the entire file's worth of genotypes into the current batch, but what if you only wanted the top few? You should write a function called LOAD_DETAIL_TOP that takes two arguments. The first ($1) is the name of the file that needs to be loaded in (just as in the original command), and the second ($2) is the number of lines you want to load. The easiest way to go about doing this is by using the SYSTEM command along with the Unix command head, which will output the very top of a file. If you typed the line:

    head -42 detail_pop.1000 > my_temp_file

the file my_temp_file would be created, and its contents would be the first 42 lines of detail_pop.1000. So, what you need this function to do is create a temporary file with the proper number of lines from the detail file in it, load that temp file into the current batch, and then delete the file (using the rm command). Warning: be very careful with the automated deletions -- you don't want to accidentally remove something that you really need! I recommend that you use the command rm -i until you finish debugging. This problem may end up being a little tricky for you, but you should be able to work your way through it.

Problem 2.
Now that you have a working LOAD_DETAIL_TOP command, you can run

    LOAD_DETAIL_TOP <filename> 1

in order to load only the most dominant genotype from the detail file. Rewrite the example program from the section "Working with Batches" above such that you now only need to work within a single batch.

Avida : List of Actions 08/28/2007 04:35 PM

Revised 2006-09-03 DMB

List of Actions

There is a large library of actions available for scheduling as events. Additionally, all of these actions can be used within analyze scripts. Below you will find a listing of the high-level groupings of these actions, along with detailed sections for each of them.

Print
    Print actions are the primary way of saving data from an Avida experiment.

Population
    Population actions modify the state of the population, and will actually change the course of the run.

Environment
    Actions that allow the user to change properties of the environment, including resources.

Save and Load
    Actions that allow for saving and loading large data sets, such as full populations.

Landscape Analysis
    Actions that use data from the current state of Avida, process it, and then output the results.

Driver
    Actions that allow the user to control program execution, including experiment termination.
Alphabetical Listing of Available Actions

    AnalyzeLandscape
    AnalyzePopulation
    CompeteDemes
    ConnectCells
    CopyDeme
    DeletionLandscape
    DisconnectCells
    DumpDonorGrid
    DumpFitnessGrid
    DumpGenotypeIDGrid
    DumpMemory
    DumpPopulation
    DumpReceiverGrid
    DumpTaskGrid
    Echo
    Exit
    ExitAveLineageLabelGreater
    ExitAveLineageLabelLess
    FullLandscape
    HillClimb
    Inject
    InjectAll
    InjectParasite
    InjectRandom
    InjectRange
    InjectResource
    InjectScaledResource
    InjectSequence
    InsertionLandscape
    JoinGridCol
    JoinGridRow
    KillProb
    KillRate
    KillRectangle
    LoadClone
    LoadPopulation
    ModMutProb
    OutflowScaledResource
    PairTestLandscape
    PrecalcLandscape
    PredictNuLandscape
    PredictWLandscape
    PrintAverageData
    PrintCountData
    PrintData
    PrintDebug
    PrintDemeStats
    PrintDepthHistogram
    PrintDetailedFitnessData
    PrintDivideMutData
    PrintDominantData
    PrintDominantGenotype
    PrintDominantParaData
    PrintDominantParasiteGenotype
    PrintErrorData
    PrintGeneticDistanceData
    PrintGenotypeAbundanceHistogram
    PrintGenotypeMap
    PrintGenotypes
    PrintInstructionAbundanceHistogram
    PrintInstructionData
    PrintLineageCounts
    PrintLineageTotals
    PrintMutationRateData
    PrintPhenotypeData
    PrintPhenotypeStatus
    PrintPopulationDistanceData
    PrintResourceData
    PrintSpeciesAbundanceData
    PrintStatsData
    PrintTasksSnapshot
    PrintTasksExeData
    PrintTasksQualData
    PrintTimeData
    PrintTotalsData
    PrintTreeDepths
    PrintVarianceData
    PrintViableTasksData
    RandomLandscape
    ResetDemes
    SampleLandscape
    SaveClone
    SaveHistoricPopulation
    SaveHistoricSexPopulation
    SaveParasitePopulation
    SavePopulation
    SaveSexPopulation
    SerialTransfer
    SetMutProb
    SetReactionInst
    SetReactionValue
    SetReactionValueMult
    SetResource
    SetVerbose
    SeverGridCol
    SeverGridRow
    TestDominant
    ZeroMuts

Print Actions

Output events are the primary way of saving data from an Avida experiment.
The two main types are continuous output, which appends to a single file every time the event is triggered, and singular output, which produces a single, complete file for each trigger.

PrintAverageData [string filename='average.dat']
    Print all of the population averages to the specified file.

PrintErrorData [string filename='error.dat']
    Print all of the standard errors of the average population statistics.

PrintVarianceData [string filename='variance.dat']
    Print all of the variances of the average population statistics.

PrintDominantData [string filename='dominant.dat']
    Print all of the statistics relating to the dominant genotype.

PrintStatsData [string filename='stats.dat']
    Print all of the miscellaneous population statistics.

PrintCountData [string filename='count.dat']
    Print all of the statistics that keep track of counts (such as the number of organisms in the population or the number of instructions executed).

PrintTotalsData [string filename='totals.dat']
    Print various totals for the entire length of the run (for example, the total number of organisms ever).

PrintTasksData [string filename='tasks.dat']
    Print the number of organisms that are able to perform each task. This uses the environment configuration to determine what tasks are in use.

PrintTasksExeData [string filename='tasks_exe.dat']
    Print the number of times each task has been executed this update.

PrintTasksQualData [string filename='tasks_quality.dat']
    Print the total quality of each task. By default a successful task is valued as 1.0. Some tasks, however, can grant partial values and/or special bonuses via the quality value.

PrintResourceData [string filename='resource.dat']
    Print the current counts of each resource available to the population. This uses the environment configuration to determine what resources are in use.
    Also creates separate files resource_resource_name.m (in a format that is designed to be read into Matlab) for each spatial resource.

PrintTimeData [string filename='time.dat']
    Print all of the timing-related statistics.

PrintMutationRateData [string filename='mutation_rates.dat']
    Output (regular and log) statistics about individual copy mutation rates (aver, stdev, skew, cur). Useful only when the mutation rate is set per organism.

PrintDivideMutData [string filename='divide_mut.dat']
    Output (regular and log) statistics about individual, per-site divide mutation rates (aver, stdev, skew, cur) to divide_mut.dat. Use with a multiple-divide instruction set.

PrintDominantParaData [string filename='parasite.dat']
    Print various quantities related to the dominant parasite.

PrintInstructionData [string filename='instruction.dat']
    Print the by-organism counts of what instructions they _successfully_ executed between birth and divide. Prior to their first divide, organisms report the values for their parents.

PrintGenotypeMap [string filename='genotype_map.m']
    This event is used to output a map of the genotype IDs for the population grid to a file that is suitable to be read into Matlab.

PrintPhenotypeData [string filename='phenotype_count.dat']
    Print the number of phenotypes based on tasks executed this update. Executing a task any number of times is considered the same as executing it once.

PrintPhenotypeStatus [string filename='phenotype_status.dat']

PrintDemeStats

PrintData <string fname> <string format>
    Append to the file specified (continuous output) the data given in the column list. The column list needs to be a comma-separated list of keywords representing the data types. Many possible data types can be output; see the complete listing for details. Note that this event will even create a detailed column legend at the top of your file so you don't need to separately keep track of what the columns mean.
PrintInstructionAbundanceHistogram [string filename='instruction_histogram.dat']
    Appends a line containing the bulk count (abundance) of each instruction in the population onto a file.

PrintDepthHistogram [string filename='depth_histogram.dat']

Echo <string message>
    Print the supplied message to standard output.

PrintGenotypeAbundanceHistogram [string fname='genotype_abundance_histogram.dat']
    Writes out a genotype abundance histogram.

PrintSpeciesAbundanceHistogram [string fname='species_abundance_histogram.dat']
    Writes out a species abundance histogram.

PrintLineageTotals [string fname='lineage_totals.dat'] [int verbose=1]

PrintLineageCounts [string fname='lineage_counts.dat'] [int verbose=1]

PrintDominantGenotype [string fname='']
    Print the dominant organism's genome (and lots of information about it) into the file specified. If no filename is given, the genotype's assigned name is used and the file is placed into the archive subdirectory.

PrintDominantParasiteGenotype [string fname='']
    Print the dominant parasite's genome (and lots of information about it) into the file specified. If no filename is given, the parasite's assigned name is used and the file is placed into the archive subdirectory.

PrintDetailedFitnessData [int save_max_f_genotype=0] [int print_fitness_histo=0] [double hist_fmax=1] [double hist_fstep=0.1] [string datafn='fitness.dat'] [string histofn='fitness_histos.dat'] [string histotestfn='fitness_histos_testCPU.dat']

PrintGeneticDistanceData [string ref_creature_file='START_CREATURE'] [string filename='genetic_distance.dat']

PrintPopulationDistanceData [string creature='START_CREATURE'] [string fname=''] [int save_genotypes=0]

PrintDebug

PrintGenotypes [string data_fields='all'] [int print_historic=0] [string filename='genotypes-<update>.dat']
    This command is used to print out information about all of the genotypes in the population.
    The file output from here can be read back into the analyze mode of Avida with the LOAD command.

    The data_fields parameter indicates what columns should be included in the file, which must be comma separated. Options are: all, id, parent_id, parent2_id (for sex), parent_dist, num_cpus, total_cpus, length, merit, gest_time, fitness, update_born, update_dead, depth, lineage, sequence. Use all (the default) if you want all of the fields included.

    The print_historic parameter indicates how many updates back in time should be included in this output. For example, '200' would indicate that any ancestor of the current population that died out in the last 200 updates should also be printed. A '-1' in this field indicates that all ancestors should be printed.

    The filename parameter simply indicates what you want to call the file.

    Example:

        u 1000:1000 print_genotypes id,parent_id,fitness 1000

    This will print out the full population every 1000 updates, including all genotypes that have died out since the last time it was printed.

TestDominant [string fname='dom-test.dat']

PrintTaskSnapshot [string fname='']
    Run all organisms in the population through test CPUs and print out the number of tasks each can perform.

PrintViableTasksData [string fname='viable_tasks.dat']

PrintTreeDepths [string fname='']
    Reconstruction of phylogenetic trees.

DumpMemory [string filename='memory_dump-<update>.dat']
    Dump memory summary information.

DumpFitnessGrid [string filename='grid_fitness.<update>.dat']
    Print out the grid of organism fitness values.

DumpGenotypeIDGrid [string filename='grid_genotype_id.<update>.dat']
    Print out the grid of genotype IDs.

DumpTaskGrid [string filename='grid_task.<update>.dat']
    Print out the grid of tasks that organisms do. For each organism, tasks are first encoded as a binary string (e.g.
    100000001 means that organism is doing NOT and EQU) and then reported as a base-10 number (257 in the example above).

DumpDonorGrid [string filename='grid_donor.<update>.dat']
    Print out the grid of organisms who donated their merit.

DumpReceiverGrid [string filename='grid_receiver.<update>.dat']
    Print out the grid of organisms who received merit.

SetVerbose [string verbosity='']
    Change the level of output verbosity. Verbose messages will print all of the details of what is happening to the screen. Minimal messages will only briefly state the process being run. Verbose messages are recommended if you're in interactive analysis mode. When no arguments are supplied, the action will toggle between NORMAL and ON.

    Levels: SILENT, NORMAL, ON, DETAILS, DEBUG

Population Actions

Population events modify the state of the population, and will actually change the course of the run. There are a wide variety of these.

Inject [string fname='START_CREATURE'] [int cell_id=0] [double merit=-1] [int lineage_label=0] [double neutral_metric=0]
    Inject a single organism into the population. Arguments must be included from left to right; if all arguments are left out, the default creature is the ancestral organism, and it will be injected into cell 0, have an uninitialized merit, and be marked as lineage id 0.

InjectRandom <int length> [int cell_id=0] [double merit=-1] [int lineage_label=0] [double neutral_metric=0]
    Injects a randomly generated genome of the supplied length into the population.

InjectAll [string fname='START_CREATURE'] [double merit=-1] [int lineage_label=0] [double neutral_metric=0]
    Same as Inject, but no cell_id is specified and the organism is placed into all cells in the population.
InjectRange [string fname='START_CREATURE'] [int cell_start=0] [int cell_end=-1] [double merit=-1] [int lineage_label=0] [double neutral_metric=0]
    Injects identical organisms into a range of cells of the population.

    Example:

        InjectRange 000-aaaaa.org 0 10

    Will inject 10 organisms into cells 0 through 9.

InjectSequence <string sequence> [int cell_start=0] [int cell_end=-1] [double merit=-1] [int lineage_label=0] [double neutral_metric=0]
    Injects identical organisms based on the supplied genome sequence into a range of cells of the population.

    Example:

        InjectSequence ckdfhgklsahnfsaggdsgajfg 0 10 100

    Will inject 10 organisms into cells 0 through 9 with a merit of 100.

InjectParasite <string filename> <string label> [int cell_start=0] [int cell_end=1]
    Attempt to inject a parasite genome into the supplied population cell range with the specified label.

InjectParasitePair <string filename_genome> <string filename_parasite> <string label> [int cell_start=0] [int cell_end=-1] [double merit=-1] [int lineage_label=0] [double neutral_metric=0]
    Inject host-parasite pairs into the population cell range specified.

KillProb [double probability=0.9]
    Using the specified probability, test each organism to see if it is killed off.

KillRate [double probability=0.9]
    Randomly removes a certain proportion of the population. In principle, this action does the same thing as the KillProb event. However, instead of a probability, here one has to specify a rate. The rate has the same unit as fitness. So if the average fitness is 20000, then you remove 50% of the population on every update with a removal rate of 10000.

KillRectangle [int x1=0] [int y1=0] [int x2=0] [int y2=0]
    Kill off all organisms in a rectangle defined by the points (x1, y1) and (x2, y2).
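The arithmetic in the KillRate entry above can be checked with a short Python sketch. It assumes the removed fraction is simply the rate divided by the average fitness, which is one reading consistent with the single example given in the text; the actual formula used by Avida is not stated here, and the function name is made up for illustration.

```python
def expected_removed_fraction(rate, average_fitness):
    """Fraction of the population removed per update, ASSUMING fraction = rate / fitness.

    This relationship is inferred from the documentation's one example and is
    not confirmed against the Avida source.
    """
    if average_fitness <= 0:
        raise ValueError("average fitness must be positive")
    # A fraction cannot exceed the whole population.
    return min(1.0, rate / average_fitness)

print(expected_removed_fraction(10000, 20000))  # prints 0.5
```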
SerialTransfer [int transfer_size=1] [int ignore_deads=1]
    Similar to KillProb, but we specify the exact number of organisms to keep alive after the event. The ignore_deads argument determines whether only living organisms are retained.

SetMutProb [string mut_type='copy'] [double prob=0.0] [int start_cell=-1] [int end_cell=-1]

ModMutProb [string mut_type='copy'] [double prob=0.0] [int start_cell=-1] [int end_cell=-1]

ZeroMuts
    This event will set all mutation rates to zero.

CompeteDemes [int type=1]

ResetDemes

CopyDeme <int src_id> <int dest_id>

SeverGridCol [int col_id=-1] [int min_row=0] [int max_row=-1]
    Remove the connections between cells along a column in an Avida grid.

SeverGridRow [int row_id=-1] [int min_col=0] [int max_col=-1]
    Remove the connections between cells along a row in an Avida grid.

JoinGridCol [int col_id=-1] [int min_row=0] [int max_row=-1]
    Add connections between cells along a column in an Avida grid.

JoinGridRow [int row_id=-1] [int min_col=0] [int max_col=-1]
    Add connections between cells along a row in an Avida grid.

ConnectCells <int cellA_x> <int cellA_y> <int cellB_x> <int cellB_y>
    Connects a pair of specified cells.

DisconnectCells <int cellA_x> <int cellA_y> <int cellB_x> <int cellB_y>
    Disconnects a pair of specified cells.

Environment Actions

Events that allow the user to change environment properties, such as resources and reaction parameters.

InjectResource <string res_name> <double res_count>
    Inject (add) a specified amount of a specified resource. res_name must already exist as a resource in the environment file.
InjectScaledResource <string res_name> <double res_count>

OutflowScaledResource <string res_name> <double res_percent>

SetResource <string res_name> <double res_count>
    Set the resource amount to a specific level. res_name must already exist as a resource in the environment file.

SetReactionValue <string reaction_name> <double value>
    Set the reaction value to a specific level. reaction_name must already exist in the environment file. value can be negative.

SetReactionValueMult <string reaction_name> <double value>
    Multiply the reaction value by the value. reaction_name must already exist in the environment file. value can be negative.

SetReactionInst <string reaction_name> <string inst>
    Set the instruction triggered by this reaction. reaction_name must already exist in the environment file. inst must be in the instruction set.

Save Load Actions

SaveClone [string fname='']
    Save a clone of this organism to the file specified; if no filename is given, use the name clone.update. The update number allows regular clones with distinct filenames to be saved with the same periodic event. Running avida -l filename will start an Avida population with the saved clone. Note that a clone only consists of the genomes in the population, and their current state is lost, so the run may not proceed exactly as it would have had it simply continued.

LoadClone <string fname>

LoadPopulation <string fname> [int update=-1]
    Sets up a population based on a save file such as written out by SavePopulation. It is also possible to append a history file to the save file, in order to preserve the history of a previous run.

DumpPopulation [string fname='']

SavePopulation [string fname='']
    Save the genotypes and lots of statistics about the population to the file specified; if no filename is given, use the name detail-update.pop.
    As with clones, the update number allows a single event to produce many detail files. The details are used to collect cross-section data about the population.

SaveSexPopulation [string fname='']

SaveParasitePopulation [string fname='']

SaveHistoricPopulation [int back_dist=-1] [string fname='']
    This action is used to output all of the ancestors of the currently living population to the file specified, or historic-update.pop.

SaveHistoricSexPopulation [string fname='']

Landscape Analysis Actions

Landscape analysis actions perform various types of mutation studies to calculate properties of the fitness landscape for a particular genome. When scheduled as an event during a run, these actions will typically perform analysis on the dominant genotype. In analyze mode, analysis is performed on the entire currently selected batch.

These actions are often very computationally intensive, and thus will take a long time to compute. In order to take advantage of increasingly available multi-processor/multi-core systems, a number of these actions have been enhanced to make use of multiple threads to parallelize work. Set the configuration setting MT_CONCURRENCY to the number of logical processors available to make use of all processor resources for these computations.

AnalyzeLandscape [filename='land-analyze.dat'] [int trials=1000] [int min_found=0] [int max_trials=0] [int max_dist=10]

PrecalcLandscape
    Precalculate the distance 1 full landscape for the current batch in parallel using multiple threads. The resulting data is stored into the current batch and can be used by many subsequent output commands within Analyze mode.

FullLandscape [string filename='land-full.dat'] [int distance=1] [string entropy_file=''] [string sitecount_file='']
    Do a landscape analysis of the dominant genotype or current batch of genotypes, depending on the current mode.
    The resulting output is a collection of statistics obtained from examining all possible mutations at the distance specified. The default distance is one.

DeletionLandscape [string filename='land-del.dat'] [int distance=1] [string sitecount_file='']

InsertionLandscape [string filename='land-ins.dat'] [int distance=1] [string sitecount_file='']

PredictWLandscape [string filename='land-predict.dat']

PredictNuLandscape [string filename='land-predict.dat']

RandomLandscape [string filename='land-random.dat'] [int distance=1] [int trials=0]

SampleLandscape [string filename='land-sample.dat'] [int trials=0]

HillClimb [string filename='hillclimb.dat']
    Does a hill climb with the dominant genotype.

PairTestLandscape [string filename=''] [int sample_size=0]
    If sample_size = 0, pairtest the full landscape.

AnalyzePopulation [double sample_prob=1] [int landscape=0] [int save_genotype=0] [string filename='']

Driver Actions

These actions control the driver object responsible for executing the current run.

Exit
    Unconditionally terminate the current run.

ExitAveLineageLabelGreater <double threshold>
    Halts the run if the current average lineage label is larger than threshold.

ExitAveLineageLabelLess <double threshold>
    Halts the run if the current average lineage label is smaller than threshold.