Avida V2.6.2 PDF (UNIX) Docs

Transcription

Avida : A Guided Tour of an Ancestor and its Hardware
08/28/2007 04:33 PM
Revised 2006-09-05 DMB
A Guided Tour of an Ancestor and its Hardware
This document describes the structure of the classic virtual CPU and an
example organism running on it.
The Virtual CPU Structure
The virtual CPU, which is the default "body" or "hardware" of the organisms,
contains the following set of components (as further illustrated in the
figure below).
A memory that consists of a sequence of instructions, each associated
with a set of flags to denote if the instruction has been executed,
copied, mutated, etc.
An instruction pointer (IP) that indicates the next site in the memory to
be executed.
Three registers that can be used by the organism to hold data currently
being manipulated. These are often operated upon by the various
instructions, and can contain arbitrary 32-bit integers.
Two stacks that are used for storage. The organism can theoretically store
an arbitrary amount of data in the stacks, but for practical purposes we
currently limit the maximum stack depth to ten.
An input buffer and an output buffer that the organism uses to receive
information, and return the processed results.
A Read-Head, a Write-Head, and a Flow-Head which are used to specify
positions in the CPU's memory. A copy command reads from the Read-Head
and writes to the Write-Head. Jump-type statements move the IP to the
Flow-Head.
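The component list above can be pictured as a small data structure. The
following is only a sketch: the class and field names are invented for
illustration and are not Avida's own, the initial register value of 0 is an
assumption, and so is the choice to discard the oldest stack entry once the
depth limit of ten is reached.

```python
from collections import deque
from dataclasses import dataclass, field

STACK_DEPTH = 10  # maximum stack depth stated in the text


@dataclass
class Site:
    """One memory site: an instruction plus its bookkeeping flags."""
    instruction: str
    executed: bool = False
    copied: bool = False
    mutated: bool = False


def new_stack():
    # Assumption: once the stack is full, the oldest entry is discarded.
    return deque(maxlen=STACK_DEPTH)


@dataclass
class VirtualCPU:
    """Hypothetical container mirroring the components listed above."""
    memory: list
    registers: dict = field(default_factory=lambda: {"AX": 0, "BX": 0, "CX": 0})
    stacks: tuple = field(default_factory=lambda: (new_stack(), new_stack()))
    active_stack: int = 0
    inputs: list = field(default_factory=list)
    outputs: list = field(default_factory=list)
    heads: dict = field(
        default_factory=lambda: {"IP": 0, "READ": 0, "WRITE": 0, "FLOW": 0}
    )

    def push(self, register="BX"):
        """Copy a register value onto the top of the active stack."""
        self.stacks[self.active_stack].append(self.registers[register])


cpu = VirtualCPU(memory=[Site("nop-A"), Site("inc")])
cpu.registers["BX"] = 7
for _ in range(12):      # push more values than the stack can hold
    cpu.push()
print(len(cpu.stacks[0]))  # prints 10
```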
Instruction Set Configuration
The instruction set in Avida is loaded on startup from a configuration file
specified in the avida.cfg file. This allows selection of different
instruction sets without recompiling the source code, as well as allowing
different sized instruction sets to be specified. It is not possible to alter
the behavior of individual instructions or add new instructions without
recompiling Avida; such activities have to be done directly in the source
code.
The available instructions are listed in the inst_set.* files with a 1 or a 0
next to an instruction to indicate if it should or should not be included.
Changing the instruction set to be used simply involves adjusting these flags.
The instructions were created with three things in mind:
To be as complete as possible (both in a "Turing complete" sense -- that
is, it can compute any computable function -- and, more practically, to
ensure that simple operations only require a few instructions).
For each instruction to be as robust and versatile as possible; all
instructions should take an "appropriate" action in any situation where
they can be executed.
To have as little redundancy as possible between instructions. (Several
instructions have been implemented that are redundant, but such
combinations will typically not be turned on simultaneously for a run.)
One major concept that differentiates this virtual assembly language from its
real-world counterparts is in the additional uses of nop instructions
(no-operation commands). These have no direct effect on the virtual CPU when
executed, but often modify the effect of any instruction that precedes them.
In a sense, you can think of them as purely regulatory genes. The default
instruction set has three such nop instructions: nop-A, nop-B, and nop-C.
The remaining instructions can be separated into three classes. The first
class is those few instructions that are unaffected by nops. Most of these are
the "biological" instructions involved directly in the replication process.
The second class of instructions is those for which a nop changes the head or
register affected by the previous command. For example, an inc command
followed by the instruction nop-A would cause the contents of the AX register
to be incremented, while an inc command followed by a nop-B would increment
BX.
The notation we use in instruction definitions to describe that a default
component (that is, a register or head) can be replaced due to a nop command
is by surrounding the component name with ?'s. The component listed is the
default one to be used, but if a nop follows the command, the component it
represents in this context will replace this default. If the component between
the question marks is a register, then a subsequent nop-A represents the AX
register, nop-B is BX, and nop-C is CX. If the component listed is a head
(including the instruction pointer) then a nop-A represents the Instruction
Pointer, nop-B represents the Read-Head, and nop-C is the Write-Head.
Currently the Flow-Head has no nop associated with it.
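The ?component? notation can be read as a simple lookup: use the default
unless the next instruction is a nop. The sketch below illustrates this for
inc (defined as "Increment ?BX?"). The function names are hypothetical, and
the 32-bit wraparound on overflow is an assumption rather than documented
behavior.

```python
REGISTER_FOR_NOP = {"nop-A": "AX", "nop-B": "BX", "nop-C": "CX"}
HEAD_FOR_NOP = {"nop-A": "IP", "nop-B": "Read-Head", "nop-C": "Write-Head"}


def resolve(default, following, table):
    """Return the component an instruction acts on.

    `default` is the name written between ?'s in the instruction's
    definition; `following` is the next instruction in memory (or None).
    """
    return table.get(following, default)


def inc(registers, following=None):
    """Sketch of `inc`: increment ?BX?."""
    target = resolve("BX", following, REGISTER_FOR_NOP)
    # Assumption: values wrap around at 32 bits.
    registers[target] = (registers[target] + 1) & 0xFFFFFFFF
    return target


regs = {"AX": 0, "BX": 0, "CX": 0}
inc(regs)            # no nop follows: BX is incremented
inc(regs, "nop-A")   # nop-A follows: AX is incremented
inc(regs, "dec")     # a non-nop follows: the default BX is used
print(regs)          # {'AX': 1, 'BX': 2, 'CX': 0}
```

The same lookup with HEAD_FOR_NOP covers instructions whose default
component is a head, such as mov-head.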
The third class of instructions consists of those that use a series of nop
instructions as a template (label) for a command that needs to reference
another position in the code, such as h-search. If nop-A follows a search
command, it scans for the first complementary template (nop-B) and moves the
Flow-Head there. Templates may be composed of more than a single nop
instruction. A series of nops is typically abbreviated to the associated
letters, separated by colons. Thus the sequence "nop-A nop-A nop-C" would
be displayed as "A:A:C".
The label system used in Avida allows for an arbitrary number of nops. By
default, we have three: nop-A's complement is nop-B, nop-B's is nop-C, and
nop-C's is nop-A. Likewise, some instructions talk about the complement of a
register or head -- the same pattern is used in those cases. So if an
instruction tests if ?BX? is equal to its complement, it will test if BX == CX
by default, but if it is followed by a nop-C it will test if CX == AX.
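The complement rule and the label abbreviation are easy to express in a few
lines. This is an illustrative sketch only: the helper names are invented,
and the search strategy (scan forward from a start position and wrap around
the end of memory) is an assumption about how a complement template might be
located.

```python
NOPS = ["nop-A", "nop-B", "nop-C"]


def complement(nop):
    """nop-A -> nop-B, nop-B -> nop-C, nop-C -> nop-A."""
    return NOPS[(NOPS.index(nop) + 1) % len(NOPS)]


def abbreviate(label):
    """['nop-A', 'nop-A', 'nop-C'] -> 'A:A:C'."""
    return ":".join(nop[-1] for nop in label)


def find_complement(memory, label, start=0):
    """Return the position just after the first complement of `label`.

    Scans forward from `start`, wrapping around (assumption). Returns
    None if the complement template does not occur in memory.
    """
    target = [complement(nop) for nop in label]
    size = len(memory)
    for offset in range(size):
        pos = (start + offset) % size
        if memory[pos:pos + len(target)] == target:
            return pos + len(target)
    return None


memory = ["h-search", "nop-C", "nop-A", "inc", "nop-A", "nop-B", "dec"]
label = memory[1:3]
print(abbreviate(label))                               # C:A
print(abbreviate([complement(n) for n in label]))      # A:B
print(find_complement(memory, label, start=3))         # 6
```

Registers follow the same cycle, which is why a test on ?BX? compares BX
with CX by default and CX with AX when a nop-C follows.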
Instruction Set Reference
The full instruction set description is included here. An abbreviated
description of the 26 default instructions is below.
(a-c) nop-A, nop-B, and nop-C
                No-operation instructions; these modify other instructions.
(d) if-n-equ    Execute next instruction only if ?BX? does not equal its
                complement
(e) if-less     Execute next instruction only if ?BX? is less than its
                complement
(f) pop         Remove a number from the current stack and place it in ?BX?
(g) push        Copy the value of ?BX? onto the top of the current stack
(h) swap-stk    Toggle the active stack
(i) swap        Swap the contents of ?BX? with its complement.
(j) shift-r     Shift all the bits in ?BX? one to the right
(k) shift-l     Shift all the bits in ?BX? one to the left
(l) inc         Increment ?BX?
(m) dec         Decrement ?BX?
(n) add         Calculate the sum of BX and CX; put the result in ?BX?
(o) sub         Calculate BX minus CX; put the result in ?BX?
(p) nand        Perform a bitwise NAND on BX and CX; put the result in ?BX?
(q) IO          Output the value ?BX? and replace it with a new input
(r) h-alloc     Allocate memory for an offspring
(s) h-divide    Divide off an offspring located between the Read-Head and
                Write-Head.
(t) h-copy      Copy an instruction from the Read-Head to the Write-Head and
                advance both.
(u) h-search    Find a complement template and place the Flow-Head after it.
(v) mov-head    Move the ?IP? to the same position as the Flow-Head
(w) jmp-head    Move the ?IP? by a fixed amount found in CX
(x) get-head    Write the position of the ?IP? into CX
(y) if-label    Execute the next instruction only if the given template
                complement was just copied
(z) set-flow    Move the Flow-Head to the memory position specified by ?CX?
An Example Ancestor
The following organism is stored in the file organism.heads.15, which you
should find in the support/config/misc/ directory. This is a simplified
version of organism.default and organism.heads.100, of lengths 50 and 100
respectively (each has additional instructions placed before the copy loop).

# --- Setup ---
h-alloc     # Allocate extra space at the end of the genome to copy the
            # offspring into.
h-search    # Locate an A:B template (at the end of the organism) and place
            # the Flow-Head after it
nop-C       #
nop-A       #
mov-head    # Place the Write-Head at the Flow-Head (which is at beginning of
            # offspring-to-be).
nop-C       # [ Extra nop-C commands can be placed here w/o harming the
            # organism! ]

# --- Copy Loop ---
h-search    # No template, so place the Flow-Head on the next line of code
h-copy      # Copy a single instruction from the read head to the write head
            # (and advance both heads!)
if-label    # Execute the line following this template only if we have just
            # copied an A:B template.
nop-C       #
nop-A       #
h-divide    #    ...Divide off offspring! (note if-statement above!)
mov-head    # Otherwise, move the IP back to the Flow-Head at the beginning
            # of the copy loop.
nop-A       # End label.
nop-B       # End label.

This program begins by allocating extra space for its offspring. The exact
amount of space does not need to be specified -- it will allocate as much as
it is allowed to. The organism will then do a search for the end of its genome
(where this new space was just placed) so that it will know where to start
copying. First the Flow-Head is placed there, and then the Write-Head is moved
to the same point.
It is after this initial setup and before the actual copying process commences
that extra nop instructions can be included. The only caveat is that you need
to make sure that you don't duplicate any templates that the program will be
searching for, or else it will no longer function properly. The easiest thing
to do is insert a long sequence of nop-C instructions.
Next we have the beginning of the "copy loop". This segment of code starts
off with an h-search command with no template following it. In such a case,
the Flow-Head is placed on the line immediately following the search. This
head will be used to designate the place that the IP keeps returning to with
each cycle of the loop.
The h-copy command will copy a single instruction from the Read-Head (still at
the very start of the genome, where it begins) to the Write-Head (which we
placed at the beginning of the offspring). With any copy command there is a
user-specified chance of a copy mutation. If one occurs, the Write-Head will
place a random instruction rather than the one that it gathered from the
Read-Head. After the copy occurs (for better or worse), both the Read-Head and
the Write-Head are advanced to the next instruction in the genome. It is for
this reason that a common mutation we see happening will place a long string
of h-copy instructions one after another.
The next command, if-label (followed by a nop-C and a nop-A) tests to see if
the complement of C:A is the most recent thing copied. That is, if the two most
recent instructions copied were a nop-A followed by a nop-B as is found at the
end of the organism. If so, we are done! Execute the next instruction which is
h-divide (when this occurs, the read and write heads will surround the portion
of memory to be split off as the offspring's genome). If not, then we need to
keep going. Skip the next instruction and move on to the mov-head which will
move the head specified by the nop that follows (in this case nop-A which is
the Instruction Pointer) to the Flow-Head at the beginning of the copy loop.
This process will continue until all of the lines of code have been copied,
and an offspring is born.
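To make the walk-through concrete, here is a toy interpreter that runs the
15-instruction ancestor through one replication. It is a sketch under several
assumptions that are not taken from the Avida source: h-alloc doubles the
memory and fills the new space with nop-A, searches scan forward and wrap
around, no mutations occur, and only the six instructions this organism uses
are implemented. All function names are hypothetical.

```python
NOPS = ["nop-A", "nop-B", "nop-C"]
HEAD_FOR_NOP = {"nop-A": "ip", "nop-B": "read", "nop-C": "write"}

ANCESTOR = [
    "h-alloc", "h-search", "nop-C", "nop-A", "mov-head", "nop-C",
    "h-search", "h-copy", "if-label", "nop-C", "nop-A", "h-divide",
    "mov-head", "nop-A", "nop-B",
]


def complement(label):
    return [NOPS[(NOPS.index(n) + 1) % len(NOPS)] for n in label]


def label_at(mem, pos):
    """Collect the run of nops starting at `pos`."""
    label = []
    while pos < len(mem) and mem[pos] in NOPS:
        label.append(mem[pos])
        pos += 1
    return label


def find_after(mem, target, start):
    """Position just past the first `target`, scanning forward with wrap."""
    for offset in range(len(mem)):
        pos = (start + offset) % len(mem)
        if mem[pos:pos + len(target)] == target:
            return pos + len(target)
    return None


def run(genome, max_steps=1000):
    """Execute until h-divide; return (parent, offspring) genomes."""
    mem = list(genome)
    head = {"ip": 0, "read": 0, "write": 0, "flow": 0}
    copied = []                      # history of copied instructions
    for _ in range(max_steps):
        ip = head["ip"]
        inst = mem[ip]
        label = label_at(mem, ip + 1)
        next_ip = ip + 1
        if inst == "h-alloc":
            mem.extend(["nop-A"] * len(genome))   # assumption: double in size
        elif inst == "h-search":
            next_ip += len(label)
            found = find_after(mem, complement(label), next_ip) if label else None
            head["flow"] = next_ip if found is None else found
        elif inst == "mov-head":
            which = HEAD_FOR_NOP[label[0]] if label else "ip"
            next_ip += 1 if label else 0
            head[which] = head["flow"]
            if which == "ip":
                next_ip = head["flow"]
        elif inst == "h-copy":
            mem[head["write"]] = mem[head["read"]]
            copied.append(mem[head["read"]])
            head["read"] += 1
            head["write"] += 1
        elif inst == "if-label":
            next_ip += len(label)
            target = complement(label)
            tail = copied[-len(target):] if target else []
            if tail != target:
                next_ip += 1                      # skip the next instruction
        elif inst == "h-divide":
            return mem[:head["read"]], mem[head["read"]:head["write"]]
        head["ip"] = next_ip % len(mem)
    raise RuntimeError("organism did not divide")


parent, child = run(ANCESTOR)
print(child == ANCESTOR)   # True: the offspring is an exact copy
```

With no copy mutations the offspring is identical to its parent; adding a
random substitution inside the h-copy branch would be the natural place to
model the copy mutation described above.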
An Example Logic Gene
Here is a short example program to demonstrate one way for an organism to
perform the "OR" logic operation. This time I'm only going to show the
contents of the registers after each command because the functionality of the
individual instructions should be clear, and the logic itself won't be helped
much by a line-by-line explanation in English.
Line #  Instruction  AX   BX       CX   Stack  Output
1       IO           ?    X        ?    ?      ?
2       push         ?    X        ?    X, ?
3       pop          ?    X        X    ?
4       nop-C
5       nand         ~X   X        X    ?
6       nop-A
7       IO           ~X   Y        X    ?      X
8       push         ~X   Y        X    Y, ?
9       pop          ~X   Y        Y    ?
10      nop-C
11      nand         ~X   ~Y       Y    ?
12      swap         Y    ~Y       ~X   ?
13      nop-C
14      nand         Y    X or Y   ~X   ?
15      IO           Y    Z        ~X   ?      X or Y
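The logic behind this gene is the identity (NOT X) NAND (NOT Y) = X OR Y,
with NOT built as a NAND of a value with itself. The sketch below replays the
register trace in ordinary code to check that identity on 32-bit values. The
function names are invented, and the unknown "?" register contents are
arbitrarily set to 0.

```python
MASK = 0xFFFFFFFF  # registers hold 32-bit integers


def nand(a, b):
    """Bitwise NAND restricted to 32 bits."""
    return ~(a & b) & MASK


def or_gene(x, y):
    """Replay the register trace and return the final output."""
    ax, bx, cx = 0, x, 0     # line 1: IO reads X into BX ("?" taken as 0)
    cx = bx                  # lines 2-4: push, then pop nop-C copies BX to CX
    ax = nand(bx, cx)        # lines 5-6: nand nop-A, AX = ~X
    bx = y                   # line 7: IO outputs X and reads Y into BX
    cx = bx                  # lines 8-10: push, then pop nop-C
    bx = nand(bx, cx)        # line 11: nand, BX = ~Y
    cx, ax = ax, cx          # lines 12-13: swap nop-C exchanges CX and AX
    bx = nand(bx, cx)        # line 14: nand, BX = ~Y NAND ~X
    return bx                # line 15: IO outputs BX


print(bin(or_gene(0b1100, 0b1010)))  # 0b1110
```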
Avida : Directory and File Structure
Directory and File Structure
This document contains a guide to the files present in Avida, and where they
are located.
Filenames
Source code files in Avida follow a standard naming convention. The C++ core,
in general, maintains one class per header/source file pair. The file name
should exactly match the class that it defines. All header files use .h and
all source files use .cc as their respective file extensions.
When you compile a program in C++, it goes through a compilation phase and
then a link phase. The compilation phase takes each source (.cc) file and
compiles it independently into an object (.o) file. In the link phase, all of
these compiled object files are linked together into a single executable (such
as avida).
Since the bodies of the methods are only in the source files, they only need
to be compiled once into a single object file. If you place a function body in
the header file it will get compiled again each time another class includes
that header. Since a header will often be included because only one or two
methods from a class are required, this can increase compile time dramatically
-- a function will be compiled as long as its body is included, even if the
method is never directly called within the object file being created.
For example: The cOrganism object is declared in the file cOrganism.h and
fully defined in cOrganism.cc. When this file is compiled, it creates the
object file cOrganism.o. Both the cPopulation class (cPopulation.cc) and the
cTestCPU class (cTestCPU.cc) use the cOrganism object. Since the majority of
its methods are defined in cOrganism.cc, the compiler only needs to compile
these methods once. During the link phase the linker connects the references
together.
Occasionally short functions are implemented with their bodies directly in the
header file. When a function compiled in one object file is run from another,
the linker basically points the caller to the location of that function. A few
extra CPU cycles must be expended while the program jumps to the function.
Many small functions, especially one-line access methods, can be made inline,
which means each will be placed, as a whole, right inside of the function that
calls it. If the function is short enough, it only takes up as much space as
the call to it would have taken anyway, and hence does not increase the size
of the executable.
Directory Structure
The following sections provide a high level overview of the directory
structure within the Avida source code distribution. Many directory sections
contain partial listings of the files contained within them; however, these
lists are not to be considered complete.
Top Level Directory
All of the files for the current version of Avida reside in the directory
labeled trunk/ by default when checked out of Subversion. In addition to the
subdirectories documentation/, source/ and support/ (all described below),
this directory contains several key sources of information and automatic
compilation files. The most important of these are described here.
AUTHORS
This file contains information about the authorship of Avida.
Avida.xcodeproj
This file (or directory on non-Mac OS platforms) contains the Xcode
project information for development and building Avida within the Xcode
IDE on Mac OS. This project file requires Xcode 2.1 or greater.
BuildAvida.py
The main entry point for the new experimental SCONS python based build
system.
CHANGES
A listing of important changes to Avida that affect users of previous
releases.
COPYING
COPYING.gpl
These files contain copyright information.
KNOWN_BUGS
A listing of known issues that may be pertinent to various users.
README
A general guide on how to get started once you put the Avida files on
your machine.
build_avida
A one step build script for compiling Avida under Unix platforms that
have CMake installed.
test_avida
After Avida has been built, this script serves as an entry point for
executing a series of consistency tests on the produced executable.
Directory: build/work/ (CMake)
Directory: build/{Target Name}/work/ (Xcode)
After compilation, this directory will contain all of the configuration files
necessary for Avida (explained in more detail in their own documentation
files). The key files and directories here are:
analyze.cfg
The default file used to write analysis scripts.
avida.cfg
This is the main configuration file that is used by default.
environment.cfg
This file contains the default environment information.
events.cfg
This file contains the default event list.
inst_set.default
This is the main, heads-based instruction set that is used by default.
organism.default
This file contains the default starting ancestor of length 100.
data/
This is the name of the default output directory and is created by Avida
if it does not exist. The name and location of this directory can be
configured in avida.cfg.
Directory: source/
This is a large sub-directory structure that contains all of the source code
that makes up Avida. Each sub-directory here includes its own CMake and SCONS
build information. The high level purpose of each sub-directory is:
actions/
Contains various source files that define action classes that are usable
as schedule events and analyze commands. Also contains the cActionLibrary
responsible for instantiating objects based on cString names.
analyze/
Contains classes responsible for performing and managing data from
detailed analyses.
classification/
Classes that define and manage classification of current and past
properties of the population are stored here.
cpu/
Files and classes used to implement all of the virtual hardware within
the Avida software.
drivers/
Classes and infrastructure used to orchestrate the execution of Avida.
event/
Contains classes responsible for event scheduling and triggering.
main/
Contains all of the core classes that define the world and the population
within it.
platform/
Contains platform specific software in various subdirectories, such as
the high performance malloc library for POSIX platforms.
targets/
Target (executable) specific source code. The source code of the NCurses
viewer resides in the avida-viewer/ subdirectory.
tools/
Contains a number of generic tools classes, including custom data
structures and robust string manipulation classes.
Directory: source/main/
This sub-directory contains all of the core source code files for the
software. For ease, there are two separate groups of more important components
and less important components, each in alphabetical order. The syntax name.??
refers to header/source file pairs, name.h and name.cc. The more important
files are:
cAvidaConfig.??
These files define the cAvidaConfig object that maintains the current
configuration state of Avida. This class is initialized by the avida.cfg
file and processed command line arguments and can be modified via various
events during the run.
cEnvironment.??
This file defines the cEnvironment object, which controls all of the
environmental interactions in an Avida run. It makes use of reactions,
resources, and tasks.
cGenome.??
The cGenome object maintains a sequence of objects of class
cInstruction.
cInstruction.??
The cInstruction class is very simple, maintaining a single instruction
in Avida.
cInstLibBase.h
The cInstLibBase class serves as a base class for objects that associate
instructions with their corresponding functionality in the virtual
hardware.
cMutationRates.??
These files contain the cMutationRates class, which maintains the
probability of occurrence for each type of mutation.
cOrganism.??
The cOrganism class represents a single organism, and contains the
initial genome of that organism, its phenotypic information, its virtual
hardware, etc.
cPopulation.??
The cPopulation class manages the organisms that exist in an Avida
population. It maintains a collection of cPopulationCell objects (either
as a grid, or independent cells for mass action) and contains the
scheduler, genebank, event manager, etc.
cPopulationCell.??
A cPopulationCell is a single location in an Avida population. It can
contain an organism, and has its own mutation rates (but not yet its own
environment.)
cStats.??
A cStats object keeps track of many different population-wide statistics.
cWorld.??
The cWorld object contains all of the state information used by a
particular run and can be used to access many globally important classes.
Below are various less important files that may still be useful to know about:
cOrgInterface.h
The cOrgInterface class defines the interface used by organisms to
interact back with the population or test CPU environment.
cReaction.??
The cReaction class contains all of the information for what triggers a
reaction, its restrictions, and the process that occurs.
cReactionResult.??
The cReactionResult class contains all of the information about the
results of a reaction after one occurs, such as the amount of resources
consumed, what the merit change is, what tasks triggered it, etc.
cResource.??
The cResource class contains information about a single resource, such as
its inflow rate, outflow, name, etc.
cResourceCount.??
The resource count keeps track of how much of each resource is present in
the region being tracked.
cTaskLib.??
This class contains all of the information associated with task
evaluation.
Directory: source/analyze/
The primary class in this directory is cAnalyze. This class processes
analyze.cfg files to perform data analysis on run data. The additional classes
in this directory support various types of analyses and provide the
foundation for multithreaded execution. The cAnalyzeJobQueue object,
instantiated by cAnalyze, orchestrates queuing and executing jobs on parallel
worker objects.
Directory: source/cpu/
This sub-directory contains the files used to define the virtual CPUs in
Avida.
cCodeLabel.??
The cCodeLabel class marks labels (series of no-operation instructions)
in a genome. These are used when a label needs to be used as an
instruction argument.
cCPUMemory.??
The cCPUMemory class inherits from the cGenome class, extending its
functionality to facilitate insertions and deletions. It also associates
flags with each instruction in the genome to mark if they have been
executed, copied, mutated, etc.
cCPUStack.??
The cCPUStack class is an integer-stack component in the virtual CPUs.
cHardwareBase.??
The cHardwareBase class is an abstract base class that all other hardware
types must be derived from. It has minimal built-in functionality.
cHardwareCPU.??
The cHardwareCPU class extends cHardwareBase into a proper virtual CPU,
with registers, stacks, memory, IO Buffers, etc.
cHardwareManager.??
The cHardwareManager manages the building of new hardware as well as test
CPU creation.
cHardwareSMT.??
This class represents the in-progress experimental implementation of next
generation virtual hardware.
cHardwareTransSMT.??
An intermediate step on the path to cHardwareSMT, this transitional
hardware is used in a number of ongoing research projects.
cHeadCPU.??
The cHeadCPU class implements a head pointing to a position in the memory
of a virtual CPU.
cTestCPU.??
The cTestCPU class maintains a test environment in which to run organisms
that we don't want to be able to directly affect the real population.
cTestUtil.??
The cTestUtil utility class is for test-related functions that require a
test CPU, such as printing out a genome to a file with collected
information.
Directory: source/tools/
The tools sub-directory contains C++ source code that is used throughout
Avida, but is not specific to the project.
cDataEntry.??
Associates data names with functions for printing out data files with a
user specified format.
cDataFile.??
A class useful for handling output files with named columns.
cDataFileManager.??
This class manages a collection of data files and handles the creation
and output of user-designed data files at runtime.
cMerit.??
Provides a very large integer number, dissectable in useful ways.
cRandom.??
A powerful and portable random number generator, that can output numbers
in a variety of formats.
cString.??
A standard string object, but with lots of functionality.
cStringList.??
A specialized class for collections of strings, with added functionality
over a normal list.
cStringUtil.??
Contains a bunch of static methods to manipulate and compare strings.
functions.h
Some useful math functions such as Min, Max, and Log.
Templates are special classes that interact with another data-type that
doesn't need to be specified until the programmer instantiates an object in
the class. It's a hard concept to get used to, but allows for remarkably
flexible programming, and makes very reusable code. The main drawback (other
than brain-strain) is that templates must be entirely defined in header files
since separate code is generated for each class the template interacts with.
tArray.h
A fixed-length array template; array sizes may be adjusted manually when
needed.
tBuffer.h
A container that keeps only the last N entries, indexed with the most
recent first.
tDictionary.h
A container template that allows the user to search for a target object
based on a keyword (of type cString).
tHashTable.h
A mapping container that maps keys to values using a hashing function to
provide fast lookup.
tList.h
A reasonably powerful linked list and iterators. The list will keep track
of the iterators and never allow them to have an illegal value.
tManagedPointerArray.h
A derivative of tArray, a managed pointer array is ideal for storing
arrays of large objects that may need to be resized. The backing storage
mechanism simply resizes an array of pointers, preventing the unnecessary
copying of large objects.
tMatrix.h
A fixed size matrix template with arbitrary indexing.
tMemTrack.h
This is a template that can be put over any class or data type to keep
track of it. If all creations of objects in the class are done through
this template rather than (or in conjunction with) "new", memory leaks
should be detectable. This is new, and not yet used in Avida.
tSmartArray.h
A derivative of tArray that provides hidden capacity management. This
type of array is ideal for arrays of small objects that may be resized
often.
tVector.h
A variable-length array object; array sizes will be automatically
adjusted to accommodate any positions accessed in it.
Directory: support/config/
This directory contains all of the originals of the files that are copied into
the work/ directory during the installation process for the user to modify. There
is also a misc/ sub-directory under here with additional, optional
configuration files that you may want to look at to see other possible preconfigured settings.
Avida : The Avida Configuration File
08/28/2007 04:34 PM
The Avida Configuration File
The Avida configuration file (avida.cfg) is the main configuration file for
Avida. With this file, the user can setup all of the basic conditions for a
run. Below are detailed descriptions for some of the settings in the
configuration file, with particularly important settings highlighted in green.
The non-colored entries will probably never need to change unless you are
performing a very specialized experiment.
Architecture Variables
This section covers all of the basic variables that describe the Avida run.
This is effectively a miscellaneous category for settings that don't fit
anywhere below.
MAX_UPDATES
MAX_GENERATIONS
END_CONDITION_MODE
These settings allow the user to determine for how long
the run should progress in generations and in updates, and
determine if one or both criteria need to be met for the
run to end. The run will also end if ever the entire
population has died out. A setting of -1 for either ending
condition will indicate no limit. End conditions can also
be set in the events file, as is done by default, so you
typically won't need to worry about this.
WORLD_X
WORLD_Y
These settings determine the size of the Avida grid that the
organisms populate. In mass action mode the shape of the
grid is not relevant, only the number of organisms that
are in it.
RANDOM_SEED
The random number seed initializes the random number
generator. You should alter only this seed if you want to
perform a collection of replicate runs. Setting the random
number seed to zero (or a negative number) will base the
seed on the starting time of the run -- effectively a
random random number seed. In practice, you want to always
be able to re-do an exact run in case you want to get more
information about what happened.
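The seed rule can be summarized in code. This is a sketch of the behavior
described above, not Avida's implementation: the function name is
hypothetical, and deriving the fallback seed from the whole-second clock is
an assumption about what "based on the starting time" means.

```python
import random
import time


def effective_seed(configured):
    """Return the seed a run would use for a given RANDOM_SEED setting."""
    if configured > 0:
        return configured          # explicit seed: the run is repeatable
    return int(time.time())        # zero or negative: time-based (assumption)


# Replicate runs: identical settings, only the seed differs.
for seed in (101, 102, 103):
    rng = random.Random(effective_seed(seed))
    print(seed, rng.random())

# The same positive seed always reproduces the same random stream.
same = random.Random(effective_seed(101)).random() == random.Random(101).random()
print(same)  # True
```

Recording the effective seed in the run's output is a sensible habit when a
time-based seed is used, since otherwise the run cannot be repeated exactly.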
Configuration Files
This section relates Avida to other files that it requires.
DATA_DIR
The name (or path) of the directory where output files
generated by Avida should be placed.
INST_SET
EVENT_FILE
ANALYZE_FILE
ENVIRONMENT_FILE
START_CREATURE
These settings indicate the names of all of the other
configuration files used in an Avida run. See the individual
documents for more information about how to use these files.
Reproduction
These settings control how creatures are born and die in Avida.
BIRTH_METHOD
The birth method sets how the placement of a child
organism is determined. Currently, there are six ways
of doing this -- the first four (0-3) are all grid-based
(offspring are only placed in the immediate
neighborhood), and the last two (4-5) assume a
well-stirred population. In all non-random methods, empty
sites are preferred over replacing a living organism.
DEATH_METHOD
AGE_LIMIT
By default, replacement is the only way for an organism
to die in Avida. However, if a death method is set,
organisms will die of old age. In method one, organisms
will die when they reach the user-specified age limit.
In method 2, the age limit is a multiple of their
length, so larger organisms can live longer.
ALLOC_METHOD
During the replication process in the default virtual
CPU, parent organisms must allocate memory space for
their child-to-be. Before the child is copied into this
new memory, it must have an initial value. Setting the
alloc method to zero sets this memory to a default
instruction (typically nop-A). Mode 1 leaves it
uninitialized (and hence keeps the contents of the last
organism that inhabited that space; if only a partial
copy occurs, the child is a hybrid of the parent and
the dead organism, hence the name necrophilia). Mode 2
just randomizes each instruction. This means that the
organism will behave unpredictably if the uninitialized
code is executed.
DIVIDE_METHOD
When a divide occurs, does the parent divide into two
children, or else do we have a distinct parent and
child? The latter method will allow more age structure
in a population where an organism may behave
differently when it produces its second or later
offspring.
GENERATION_INC_METHOD
The generation of an organism is the number of
organisms in the chain between it and the original
ancestor. Thus, the generation of a population can be
calculated as the average generation of the individual
organisms. When a divide occurs, the child always
receives a generation one higher than the parent, but
what should happen to the generation of the parent
itself? In general, this should be set the same as
divide method.
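Two of the settings above, DEATH_METHOD with AGE_LIMIT and ALLOC_METHOD, are
simple enough to sketch. The function names are hypothetical, treating a
death method of 0 as "not set" is an assumption, and the random choice in
mode 2 is assumed to be uniform over the instruction set.

```python
import random


def max_age(death_method, age_limit, genome_length):
    """Age at which an organism dies, or None if it never dies of old age."""
    if death_method == 1:
        return age_limit                    # fixed, user-specified limit
    if death_method == 2:
        return age_limit * genome_length    # limit scales with genome length
    return None                             # assumption: 0 means no age death


def init_allocated(alloc_method, old_contents, instruction_set,
                   default="nop-A", rng=random):
    """Initial contents of newly allocated offspring memory."""
    if alloc_method == 0:
        return [default] * len(old_contents)     # default instruction
    if alloc_method == 1:
        return list(old_contents)                # leftovers ("necrophilia")
    if alloc_method == 2:
        return [rng.choice(instruction_set) for _ in old_contents]
    raise ValueError("unknown ALLOC_METHOD: %r" % alloc_method)


print(max_age(1, 20, 100))   # 20
print(max_age(2, 20, 100))   # 2000
leftover = ["inc", "dec", "nand"]
print(init_allocated(0, leftover, ["inc", "dec"]))   # ['nop-A', 'nop-A', 'nop-A']
print(init_allocated(1, leftover, ["inc", "dec"]))   # ['inc', 'dec', 'nand']
```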
Divide Restrictions
These place limits on when an organism can successfully issue a divide command
to produce an offspring.
CHILD_SIZE_RANGE
This is the maximal difference in genome size between a
parent and offspring. The default of 2.0 means that the
genome of the child must be between one-half and twice the
length of the parent. This is to prevent out-of-control
size changes. Setting this to 1.0 will ensure fixed length
organisms (but make sure to also turn off insertion and
deletion mutations).
MIN_COPIED_LINES
MIN_EXE_LINES
These settings place limits on what the parent must have
done before the child can be born; they set the minimum
fraction of instructions that must have been copied into
the child (vs. left as default) and the minimum fraction of
instructions in the parent that must have been executed. If
either of these minimums is not met, the divide will fail. These
settings prevent organisms from producing pathological
offspring. In practice, either of them can be set to 0.0 to
turn them off.
REQUIRE_ALLOCATE
Is an allocate required between each successful divide (in
virtual hardware types where allocate is meaningful)? If
so, this will limit the flexibility of how organisms
produce children (they can't make multiple copies and
divide them off all at once, for example). But if we don't
require allocates, the resulting organisms can be a lot
more difficult to understand.
REQUIRED_TASK
This was originally a hack. It allows the user to set the
ID number for a task that must occur for a divide to be
successful. At -1, no tasks are required. Ideally, this
should be incorporated into the environment configuration
file. NOTE: A task can fire without triggering a reaction.
To add a required reaction see below.
IMMUNITY_TASK
Allows the user to set the ID number for a task which, if it
occurs, provides immunity from the required task (above):
the divide will proceed even if the required task is not
done, as long as the immunity task is done. Defaults to -1
(no immunity task present).
REQUIRED_REACTION Allows the user to set the ID number for a reaction that
must occur for a divide to be successful. At -1, no
reactions are required.
DIE_PROB
Determines the probability of an organism dying when the 'die'
instruction is executed. Set to 0 by default, making the
instruction neutral.
Mutations
These settings control how and when mutations occur in organisms. Ideally,
there will be more options here in the future.
POINT_MUT_PROB
Point mutations (sometimes referred to as "cosmic ray"
mutations) occur every update; the rate set here is a
probability for each site that it will be mutated each
update. In other words, this should be a very low value if it
is turned on at all. If a mutation occurs, that site is
replaced with a random instruction. In practice this also
slows Avida down if it is non-zero because it requires so
many random numbers to be tested every update.
COPY_MUT_PROB
The copy mutation probability is tested each time an organism
copies a single instruction. If a mutation occurs, a random
instruction is copied to the destination. In practice this is
the most common type of mutation that we use in most of our
experiments.
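As a rough illustration of how a per-copy mutation test behaves, the sketch below copies a genome one instruction at a time and replaces each copied instruction with a random one with probability copy_mut_prob. The function and parameter names are hypothetical, and drawing the replacement uniformly from the instruction list is an assumption.

```python
import random

def copy_genome(genome, inst_set, copy_mut_prob, rng):
    """Copy a genome, testing the copy mutation probability per instruction."""
    child = []
    for inst in genome:
        if rng.random() < copy_mut_prob:
            # Mutation: a random instruction is written instead of the original.
            child.append(rng.choice(inst_set))
        else:
            child.append(inst)
    return child

parent = ["h-alloc", "h-search", "h-copy", "if-label", "h-divide", "mov-head"]
inst_set = ["nop-A", "nop-B", "nop-C", "h-copy", "h-divide"]
child = copy_genome(parent, inst_set, 0.0075, random.Random(42))
```

The value 0.0075 is only an example rate chosen for the sketch.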
INS_MUT_PROB
DEL_MUT_PROB
These probabilities are tested once per gestation cycle (when
an organism is first born) at each position where an
instruction could be inserted or deleted, respectively. Each
of these mutations changes the genome length. Deletions just
remove an instruction while insertions add a new, random
instruction at the position tested. Multiple insertions and
deletions are possible each generation.
DIVIDE_MUT_PROB
DIVIDE_INS_PROB
DIVIDE_DEL_PROB
Divide mutation probabilities are tested when an organism is
being divided off from its parent. If one of these mutations
occurs, a random site is picked for it within the genome. At
most one divide mutation of each type is possible during a
single divide.
Mutation Reversions
This section covers tests that are very CPU intensive, but allow for Avida
experiments that would not be possible in any other system. Basically, each
time a mutation occurs, we can run the resulting organism in a test CPU, and
determine if the effect of the mutation was lethal, detrimental, neutral, or
beneficial. This section allows us to act on this. (Note that as soon as
anything here is turned on, the mutations need to be tested. Turning multiple
settings on will not cause an additional speed decrease.)
REVERT_FATAL
REVERT_DETRIMENTAL
REVERT_NEUTRAL
REVERT_BENEFICIAL
When a mutation occurs of the specified type, the
number listed next to that entry is the probability
that the mutation will be reverted. That is, the child
organism's genome will be restored as if the mutation
had never occurred. This allows us either to manually
manipulate the abundance of certain mutation types, or
to entirely eliminate them.
STERILIZE_FATAL
STERILIZE_DETRIMENTAL
STERILIZE_NEUTRAL
STERILIZE_BENEFICIAL
The sterilize options work similarly to revert; the
difference being that an organism never has its genome
restored. Instead, if the selected mutation category
occurs, the child is sterilized so that it still takes
up space, but can never produce an offspring of its
own.
FAIL_IMPLICIT
If this toggle is set, organisms must be able to
produce exact copies of themselves or else they are
sterilized and cannot produce any offspring. An
organism that naturally (without any external effects)
produces an inexact copy of itself is said to have
implicit mutations. If this flag is set, explicit
mutations (as described in the mutations section above)
can still occur.
Time Slicing
These settings describe exactly what an update is, and how CPU time is
allocated to organisms during that update.
AVE_TIME_SLICE
This sets the average number of instructions an organism
should execute each update. Organisms with a low merit
will consistently obtain fewer, while organisms of a
higher merit will receive more.
SLICING_METHOD
This setting determines the method by which CPU time is
handed out to the organisms. Method 0 ignores merit, and
hands out time on the CPU evenly; every organism in the
population executes one instruction before any organism
moves on to its second. Method 1 is probabilistic; each
organism has a chance of executing the next instruction
proportional to its merit. This method is slow due to the
large number of random values that need to be obtained and
evaluated (and it only gets slower as merits get higher).
Method 2 is fully integrated; the organisms get CPU time
proportional to their merit, but in a fixed, deterministic
order.
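The difference between methods 0 and 1 can be sketched as follows. This is not Avida's scheduler: the function names are hypothetical, and the sketch assumes that the total number of instructions per update equals AVE_TIME_SLICE times the population size.

```python
import random

def slice_even(merits, ave_time_slice):
    """Method 0 sketch: merit is ignored, every organism gets the same count."""
    return [ave_time_slice] * len(merits)

def slice_probabilistic(merits, ave_time_slice, rng):
    """Method 1 sketch: each executed instruction goes to an organism
    chosen with probability proportional to its merit."""
    total_steps = ave_time_slice * len(merits)  # assumed update budget
    counts = [0] * len(merits)
    for idx in rng.choices(range(len(merits)), weights=merits, k=total_steps):
        counts[idx] += 1
    return counts

counts = slice_probabilistic([1.0, 2.0, 4.0], 30, random.Random(0))
```

Here counts always sums to 90, and organisms with higher merit tend to receive more of those 90 instructions. One random draw is needed per executed instruction, which is why the text describes this method as slow.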
SIZE_MERIT_METHOD
This setting determines the base value of an organism's
merit. Merit is typically proportional to genome length;
otherwise there is a strong selective pressure for shorter
genomes (shorter genome => less to copy => reduced copying
time => replicative advantage). Unfortunately, organisms
will cheat if merit is proportional to the full genome
length -- they will add on unexecuted and uncopied code to
their genomes, creating code bloat. This setting isn't the
most elegant fix, but it works.
MAX_LABEL_EXE_SIZE
Labels are sequences of nop (no-operation) instructions
used only to modify the behavior of other instructions.
Quite often, an organism will have these labels in its
genome where the nops are used by another instruction,
but never executed directly. To represent the executed
length of an organism correctly, we need to somehow count
these labels. Unfortunately, if we count the entire label,
the organisms will again "cheat" by artificially increasing
their length through growing huge labels. This setting limits
the number of nops that are counted as executed when a
label is used.
MAX_CPU_THREADS
Determines the number of simultaneous processes that an
organism can run. That is, basically, the number of things
it can do at once. This setting is meaningless unless
threads are supported in the virtual hardware and the
instructions are available within the instruction set.
Genealogy Info
These settings control how Avida monitors and deals with genotypes, species,
and lineages.
THRESHOLD
For some statistics, we only want to measure organisms
that we are sure are alive, but it's not worth taking the
time to run them all in isolation, without outside effects
(and in some eco-system situations that isn't even
possible!). For these purposes, we call a genotype
"threshold" if there have ever been more than a certain
number of organisms of that genotype. A higher number here
ensures a greater probability that the organisms are
indeed "alive". Recently, we've been shifting away from
using threshold genotypes and instead finding other, more
accurate testing methods.
GENOTYPE_PRINT
Should all genotypes be printed out upon reaching
threshold? Each will receive its own file in the archive
directory, so this can use a great deal of disk space. Many
runs will have millions of organisms.
GENOTYPE_PRINT_DOM Printing only the dominant genotype keeps track of the
most successful individual genotypes without costing a
huge amount of memory. The number you place here is the
total number of updates that a genotype must remain
dominant for it to be printed out. A 0 turns this off.
SPECIES_THRESHOLD
In Avida, two organisms are said to be of the same species
if you can perform all possible crossovers between them,
and no more than a certain threshold (set here) fail to be
viable offspring. The crossovers are done in isolation,
and never affect the population as a whole.
SPECIES_RECORDING
This entry sets if and how species should be recorded in
Avida. A setting of 0 turns all species tests off. A
setting of 1 means that every time a genotype reaches
threshold, it is tested against all currently existing
species to determine if it is part of any of them. If so,
its species is set, and if not, it becomes the prototype
of a new species. Finally, a setting of 2 only tests a new
threshold genotype against the species of its parent
(since each species test can take a long time) and if that
fails immediately creates a new species. In practice,
methods 1 and 2 produce similar results, but method 1 can
take a lot longer to run.
SPECIES_PRINT
Toggle: Should new species be printed as soon as they are
created?
TEST_CPU_TIME_MOD
Many of our analysis methods (such as species testing)
require that we be able to run organisms in isolation.
Unfortunately, some of these organisms we test might be
non-viable. At some point, we have to give up the test and
label it as non-viable, but we can't give up too soon or
else we might miss a viable, though slow replicator. This
setting is multiplied by the length of the organism's
genome in order to determine how many CPU-cycles to run
the organism for. A setting of 20 effectively means that
the average instruction must be executed twenty times
before we give up. In practice, most organisms have an
efficiency here of about 5, so 20 works well, but for
accurate tests on some pathological organisms, we will be
required to raise this number.
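The cycle budget described above is a single multiplication. The helper below is a hypothetical illustration of that arithmetic only.

```python
def cpu_cycle_budget(genome_length: int, time_mod: int = 20) -> int:
    """CPU cycles a test organism may run before being labeled non-viable."""
    return genome_length * time_mod

# A 100-instruction genome with the setting at 20 is run for 2000 cycles.
budget = cpu_cycle_budget(100)
```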
TRACK_MAIN_LINEAGE
In a normal Avida run, the genebank keeps track of all
existing genotypes, and deletes them when the last
organism of that genotype dies out. With this flag set, a
genotype will not be deleted unless both it and all of its
descendants have died off. This allows us to track back
from any genotype to its distant ancestors, monitoring
all of the differences along the way. Once this
information is being saved, see the events file for how to
output it.
Log Files
Log files are printed every time a specified event occurs. By default, all
log settings are 0 (i.e. the logs are turned off). Each time a logged event
is printed, the update and identifying information on the individual that
triggered it is always included.
LOG_CREATURES
If toggle is set, print an entry to creature.log
whenever a new organism is born. Includes position
information, parent organism, and a link to its
genotype so the run can be reconstructed. This gets
very large.
LOG_GENOTYPES
If toggle is set, print an entry to genotype.log
whenever a new genotype is created. Includes
information on its parent genotype.
LOG_THRESHOLD
If toggle is set, print an entry to threshold.log
whenever a genotype reaches threshold. Includes
information on what species it is.
LOG_SPECIES
If toggle is set, print an entry to species.log
whenever a new species is created. Includes
information on the genotype that triggered the
creation.
LOG_LINEAGES
Lineages can be given unique identifiers and printed
(into the file lineage.log) whenever they are
created. Includes details about the event that
created the lineage.
LINEAGE_CREATION_METHOD Details when lineages are created. See config file
comments for more detailed information.
Avida : The Instruction Set File
The Instruction Set File
An instruction set file consists of a list of instructions that belong to that
instruction set, each of which is followed by a series of numbers that define
how that instruction should be used. The exact format is as follows:
inst-name redundancy cost ft_cost prob_fail
inst-name
The name of the instruction to include in the described instruction set.
redundancy
The frequency of the instruction in the set. One instruction with twice
the redundancy of another will also have twice the probability of being
mutated to. A redundancy of zero is allowed, and indicates that injected
organisms are allowed to have this instruction, but it can never be
mutated to.
cost
The number of CPU cycles required to execute this instruction. One is the
default if this value is not specified.
ft_cost
The additional cost to be paid the first time this instruction is
executed. This is used to lower the diversity of instructions inside an
organism. The default value here is 0.
prob_fail
The probability of this instruction not working properly. If an
instruction fails it will simply do nothing, but still cost the CPU
cycles to execute. The default probability of failure is zero.
Normally only the first column of numbers is used in the file.
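To make the column layout and its defaults concrete, here is a small parsing sketch. It is not Avida's own loader: the InstEntry type and parse_inst_line function are hypothetical, and the sketch assumes that fields are whitespace-separated and that the redundancy column is always present.

```python
from typing import NamedTuple

class InstEntry(NamedTuple):
    name: str
    redundancy: int
    cost: int = 1           # default stated in the text
    ft_cost: int = 0        # default stated in the text
    prob_fail: float = 0.0  # default stated in the text

def parse_inst_line(line: str) -> InstEntry:
    """Parse one 'inst-name redundancy cost ft_cost prob_fail' line."""
    fields = line.split()
    if not 2 <= len(fields) <= 5:
        raise ValueError(f"expected 2 to 5 fields, got {len(fields)}")
    name, *nums = fields
    redundancy = int(nums[0])
    cost = int(nums[1]) if len(nums) > 1 else 1
    ft_cost = int(nums[2]) if len(nums) > 2 else 0
    prob_fail = float(nums[3]) if len(nums) > 3 else 0.0
    return InstEntry(name, redundancy, cost, ft_cost, prob_fail)

entry = parse_inst_line("h-copy 1")
```

A line that gives only the first numeric column, as is normally the case, yields cost 1, ft_cost 0 and prob_fail 0.0.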
Description of Default Instruction Set
Below are the descriptions of the instructions turned on in the file
instset-classic.cfg. The one-letter codes are assigned automatically to each
instruction in the set, so if additional instructions are turned on, the
letters given below may no longer correspond to the instructions they are
presented with. If more than 26 instructions are in a set, both lowercase and
capital letters will be used, and then numbers. Currently, no more than 62
distinct instructions will be represented by unique symbols.
Most terminology below that may not be familiar to you has been given a link
to a file containing its definition.
(a - c) Nop Instructions
The instructions nop-A (a), nop-B (b), and nop-C (c) are no-operation
instructions, and will not do anything when executed. They will, however,
modify the behavior of the instruction preceding them (by changing the CPU
component that it affects; see also nop-register notation and nop-head
notation) or act as part of a template to denote positions in the genome.
(d) if-n-equ
This instruction compares the ?BX? register to its complement. If they are not
equal, the next instruction (after a modifying no-operation instruction, if
one is present) is executed. If they are equal, that next instruction is
skipped.
(e) if-less
This instruction compares the ?BX? register to its complement. If ?BX? is the
lesser of the pair, the next instruction (after a modifying no-operation
instruction, if one is present) is executed. If it is greater or equal, then
that next instruction is skipped.
(f) pop
This instruction removes the top element from the active stack, and places it
into the ?BX? register.
(g) push
This instruction reads in the contents of the ?BX? register, and places it as
a new entry at the top of the active stack. The ?BX? register itself remains
unchanged.
(h) swap-stk
This instruction toggles the active stack in the CPU. All other instructions
that use a stack will always use the active one.
(i) swap
This instruction swaps the contents of the ?BX? register with its complement.
(j) shift-r
This instruction reads in the contents of the ?BX? register, and shifts all of
the bits in that register to the right by one. In effect, it divides the value
stored in the register by two, rounding down.
(k) shift-l
This instruction reads in the contents of the ?BX? register, and shifts all of
the bits in that register to the left by one, placing a zero as the new
rightmost bit, and truncating any bits beyond the 32 maximum. For values that
require fewer than 32 bits, it effectively multiplies that value by two.
(l) inc and (m) dec
These instructions read in the contents of the ?BX? register and increment or
decrement it by one.
(n) add and (o) sub
These instructions read in the contents of the BX and CX registers and either
sum them together or subtract CX from BX (respectively). The result of this
operation is then placed in the ?BX? register.
(p) nand
This instruction reads in the contents of the BX and CX registers (each of
which is a 32-bit number) and performs a bitwise nand operation on them. The
result of this operation is placed in the ?BX? register. Note that this is the
only logic operation provided in the basic Avida instruction set.
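Because nand is the only logic instruction, every other bitwise operation has to be composed from it. The sketch below shows the operation on 32-bit values and one such composition. Storing registers as unsigned 32-bit integers via a mask is an assumption of this sketch, and the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF  # keep results within 32 bits

def nand32(bx: int, cx: int) -> int:
    """Bitwise nand of two 32-bit values."""
    return ~(bx & cx) & MASK32

def not32(x: int) -> int:
    """NOT built from nand: nanding a value with itself flips every bit."""
    return nand32(x, x)

def and32(bx: int, cx: int) -> int:
    """AND built from nand: negate the nand result."""
    return not32(nand32(bx, cx))
```

For example, nand32(0xFFFFFFFF, 0xFFFFFFFF) is 0 and not32(0x0F0F0F0F) is 0xF0F0F0F0.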
(q) IO
This is the input/output instruction. It takes the contents of the ?BX?
register and outputs it, checking it for any tasks that may have been
performed. It will then place a new input into ?BX?.
(r) h-alloc
This instruction allocates additional memory for the organism up to the
maximum it is allowed to use for its offspring.
(s) h-divide
This instruction is used for an organism to divide off a finished offspring.
The original organism keeps the state of its memory up until the read-head.
The offspring's memory is initialized to everything between the read-head and
the write-head. All memory past the write-head is removed entirely.
(t) h-copy
This instruction reads the contents of the organism's memory at the position
of the read-head, and copies that to the position of the write-head. If a
non-zero copy mutation rate is set, a test will be made based on this probability
to determine if a mutation occurs. If so, a random instruction (chosen from
the full set with equal probability) will be placed at the write-head instead.
(u) h-search
This instruction will read in the template that follows it, and find the
location of a complement template in the code. The BX register will be set to
the distance to the complement from the current position of the instruction
pointer, and the CX register will be set to the size of the template. The
flow-head will also be placed at the beginning of the complement template. If
no template follows, both BX and CX will be set to zero, and the flow-head
will be placed on the instruction immediately following the h-search.
(v) mov-head
This instruction will cause the ?IP? to jump to the position in memory of the
flow-head.
(w) jmp-head
This instruction will read in the value of the CX register, and then move the
?IP? by that fixed amount through the organism's memory.
(x) get-head
This instruction will copy the position of the ?IP? into the CX register.
(y) if-label
This instruction reads in the template that follows it, and tests if its
complement template was the most recent series of instructions copied. If so,
it executes the next instruction, otherwise it skips it. This instruction is
commonly used for an organism to determine when it has finished producing its
offspring.
(z) set-flow
This instruction moves the flow-head to the memory position denoted in the
?CX? register.
Other available instructions
h-push and h-pop
These instructions act similarly to push and pop above, but instead of working
with registers, they place the position of the ?IP? on the stack, or put the
?IP? at the position taken from the stack (respectively).
inject
This instruction acts similar to divide, but instead of splitting off an
offspring, it will remove the section of code between the read and write
heads, and attempt to inject it into the neighbor that the organism is facing.
The template following this instruction will be used; if an exact match is
found (with no extra nops in it) in the target organism, the injected code
will be placed immediately after that template. Otherwise the command fails,
and the code intended for injection is instead discarded.
rotate-l and rotate-r
These instructions rotate the facing of an organism. If no template follows,
the organism will turn one cell in the appropriate direction (left or right).
If a template is present, it will keep turning in that direction until either
it has made a full 360 degree turn, or else it finds an organism that
possesses the complement template.
div-asex
Same as h-divide (added for symmetry with div-sex).
div-sex
Divide with recombination. After the offspring genome is created, it is not
immediately placed into the population. Instead, it goes into the "birth
chamber". If there is already another genome there, they recombine. If not, it
waits until the next sexually produced genotype arrives. When another genome
arrives, two random points are picked in the genome, and the area between them
is swapped between the two genomes in the birth chamber. Then, they are both
placed into the population.
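The swap described above resembles a two-point crossover. The sketch below illustrates the idea only: the function name is hypothetical, and choosing both cut points within the shorter of the two genomes is an assumption, since the text does not say how genomes of different lengths are handled.

```python
import random

def two_point_swap(g1, g2, rng):
    """Swap the region between two random cut points of two genomes."""
    n = min(len(g1), len(g2))          # assumed: cuts fall inside both genomes
    a, b = sorted(rng.sample(range(n + 1), 2))
    child1 = g1[:a] + g2[a:b] + g1[b:]
    child2 = g2[:a] + g1[a:b] + g2[b:]
    return child1, child2

c1, c2 = two_point_swap(list("aaaaaa"), list("bbbbbb"), random.Random(3))
```

Each offspring keeps the length of the genome that supplied its ends, and at every position the two offspring together still hold one instruction from each parent.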
div-asex-w
Control for the effect of sexual genomes waiting in the birth chamber. There
is no recombination here, but each genome must wait in the birth chamber until
another one arrives before they are both placed into the population.
die
When executed, kills the organism, with the probability set by DIE_PROB in
genesis.
Avida : The Events File
The Events File
The events file controls events that need to occur throughout the course of a
run. This includes the output of data files as well as active events that
affect the population (such as extinction events or changes to the mutation
rate).
File Formats
This file consists of a list of events that will be triggered either singly or
periodically. The format for each line is:
type timing event arguments
The type determines what kind of timing the event will be based on. This
can be immediate [i], based on update [u], or based on generation [g].
The timing should only be included for non-immediate events. If a single
number is given for timing, the event occurs at that update/generation. A
second number can be included (separated by a colon ':') to indicate how often
the event should be repeated. And if a third number is listed (again, colon
separated) this will be the last time the event can occur. For example, the
type and timing u 100:100:5000 would indicate that the event that follows
first occurs at update 100, and repeats every 100 updates thereafter until
update 5000. A type timing of g 10:10 would cause the event to be triggered
every 10 generations for the entire run.
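The timing field can be expanded into an explicit list of trigger times. The sketch below handles only the numeric start[:interval[:stop]] form described above. The function name and the run_end parameter are hypothetical, introduced so that an open-ended repeat has a finite list to return.

```python
def trigger_times(timing: str, run_end: int) -> list:
    """Expand 'start[:interval[:stop]]' into the times the event fires."""
    parts = [int(p) for p in timing.split(":")]
    start = parts[0]
    if len(parts) == 1:
        # Single number: the event occurs once.
        return [start] if start <= run_end else []
    interval = parts[1]
    if interval <= 0:
        raise ValueError("interval must be positive")
    stop = parts[2] if len(parts) > 2 else run_end
    return list(range(start, min(stop, run_end) + 1, interval))

times = trigger_times("100:100:5000", run_end=100000)
```

For u 100:100:5000 this gives 50 trigger times, from update 100 through update 5000.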
The event is simply the name of the action that should be performed, and the
arguments detail exactly how it should work when it is triggered. Each action
has its own arguments. See the List of Actions for details about all of the
available options.
Some examples:
i Inject
Inject an additional start creature immediately.
u 100:100 PrintAverageData
Print out all average measurements collected every one hundred updates,
starting at update 100.
g 10000:10:20000 PrintData dom_info.dat update,dom_fitness,dom_depth,dom_sequence
Between generations 10,000 and 20,000, append the specified information to
the file dom_info.dat every ten generations. Specifically, the first column
in the file would be update number, second is the fitness of the dominant
genotype, followed by the depth in the phylogenetic tree of the dominant
genotype, and finally its genome sequence.
Avida : The Environment File
The Environment File
This is the setup file for the task/resource system in Avida.
Two main keywords are used in this file, RESOURCE and REACTION. Their formats are:
RESOURCE name[:flow] {name ...}
REACTION name task [process:...] [requisite:...]
Where name is a unique identifier. Resources can have additional flow information to
indicate starting amounts, inflow and outflow. Reactions are further described by the
task that triggers them, the processes they perform (including resources used and the
results of using them), and requisites on when they can occur.
All entries on a resource line are names of individual resources. Resources have a global
quantity depletable by all organisms. The resource name infinite is used to refer to an
undepletable resource. The following chart specifies additional descriptions for
resource initialization.
Table 1: Resource Specifications
(blue variables used for all resources while red variables are only used for spatial
resources)
Each entry gives the argument and its default, followed by its description.
inflow (default: 0)
The number of units of the resource that enter the population over the
course of an update. For a global resource this inflow occurs evenly
throughout the update, not all at once. For a spatial resource this
inflow amount is added every update evenly to all grid cells in the
rectangle described by the points (inflowx1,inflowy1) and
(inflowx2,inflowy2).
outflow (default: 0.0)
The fraction of the resource that will flow out of the population each
update. As with inflow, this happens continuously over the course of
the update for a global resource. In the case of a spatial resource
the fraction is withdrawn each update from each cell in the rectangle
described by the points (outflowx1,outflowy1) and
(outflowx2,outflowy2).
initial (default: 0)
The initial abundance of the resource in the population at the start
of an experiment. For a spatial resource the initial amount is spread
evenly to each cell in the world grid.
geometry (default: global)
The layout of the resource in space.
global -- the entire pool of a resource is available to all organisms.
grid -- organisms can only access resources in their grid cell.
Resource cannot flow past the edges of the world grid. (resource will
use spatial parameters)
torus -- organisms can only access resources in their grid cell.
Resource can flow to the opposite edges of the world grid. (resource
will use spatial parameters)
inflowx1 (default: 0)
Leftmost coordinate of the rectangle where resource will flow into
world grid.
inflowx2 (default: 0)
Rightmost coordinate of the rectangle where resource will flow into
world grid.
inflowy1 (default: 0)
Topmost coordinate of the rectangle where resource will flow into world
grid.
inflowy2 (default: 0)
Bottommost coordinate of the rectangle where resource will flow into
world grid.
outflowx1 (default: 0)
Leftmost coordinate of the rectangle where resource will flow out of
world grid.
outflowx2 (default: 0)
Rightmost coordinate of the rectangle where resource will flow out of
world grid.
outflowy1 (default: 0)
Topmost coordinate of the rectangle where resource will flow out of
world grid.
outflowy2 (default: 0)
Bottommost coordinate of the rectangle where resource will flow out of
world grid.
xdiffuse (default: 1.0)
How fast material will diffuse right and left. This flow depends on
the amount of resources in a given cell and amount in the cells to the
right and left of it. (0.0 - 1.0)
xgravity (default: 0)
How fast material will move to the right or left. This movement
depends only on the amount of resource in a given cell. (-1.0 - 1.0)
ydiffuse (default: 1.0)
How fast material will diffuse up and down. This flow depends on the
amount of resources in a given cell and amount in the cells above and
below it. (0.0 - 1.0)
ygravity (default: 0)
How fast material will move up or down. This movement depends
only on the amount of resource in a given cell. (-1.0 - 1.0)
An example of a RESOURCE statement that begins a run with a fixed amount of the (global)
resource in the environment, but has no inflow or outflows is:
RESOURCE glucose:initial=10000
If you wanted to make this into a chemostat with a 10000 equilibrium concentration for
unused resources, you could put:
RESOURCE maltose:initial=10000:inflow=100:outflow=0.01
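The equilibrium of 10000 follows from balancing inflow against outflow: the unused resource level is steady when inflow equals outflow times the level, that is, at inflow / outflow = 100 / 0.01. The sketch below checks this numerically. It applies inflow and outflow once per update, which is a simplifying assumption (the text says a global resource flows continuously over the update), and the function names are hypothetical.

```python
def equilibrium(inflow: float, outflow: float) -> float:
    """Level at which inflow exactly balances fractional outflow."""
    return inflow / outflow

def simulate(initial: float, inflow: float, outflow: float, updates: int) -> float:
    """Apply inflow and fractional outflow once per update (no consumption)."""
    level = initial
    for _ in range(updates):
        level += inflow - outflow * level
    return level

steady = equilibrium(100, 0.01)
from_empty = simulate(0.0, 100, 0.01, 2000)
```

Starting from an empty pool instead of initial=10000, the level still approaches the same value, which is why the initial amount in a chemostat only affects the early part of a run.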
If you want a resource that exists spatially where the resource enters from the top and
flows towards the bottom where it exits the system, you could use:
RESOURCE lactose:initial=100000:inflow=100:outflow=0.1:inflowx1=0:\
inflowx2=100:inflowy1=0:inflowy2=0:outflowx1=0:outflowx2=100:\
outflowy1=100:outflowy2=100:ygravity=0.5
Defining a resource with no parameters means that it will start at a zero quantity and
have no inflow or outflow. This is sometimes desirable if you want that resource to only
be present as a byproduct of a reaction. Remember, though, that you should still have an
outflow rate if it's in a chemostat.
Each reaction must have a task that triggers it. Currently, eighty tasks have been
implemented, as summarized in the following table (in approximate order of complexity):
Table 2: Available Tasks
Task
Description
echo
This task is triggered when an organism inputs a single number and outputs it
without modification.
add
This task is triggered when an organism inputs two numbers, sums them
together, and outputs the result.
sub
This task is triggered when an organism inputs two numbers, subtracts one
from the other, and outputs the result.
not
This task is triggered when an organism inputs a 32 bit number, toggles all
of the bits, and outputs the result. This is typically done either by nanding
(by use of the nand instruction) the sequence to itself, or negating it and
subtracting one. The latter approach only works since numbers are stored in
twos-complement notation.
nand
This task is triggered when two 32 bit numbers are input, the values are
'nanded' together in a bitwise fashion, and the result is output. Nand stands
for "not and". The nand operation returns a zero if and only if both inputs
are one; otherwise it returns a one.
and
This task is triggered when two 32 bit numbers are input, the values are
'anded' together in a bitwise fashion, and the result is output. The and
operation returns a one if and only if both inputs are one; otherwise it
returns a zero.
orn
This task is triggered when two 32 bit numbers are input, the values are
'orn' together in a bitwise fashion, and the result is output. The orn
operation stands for or-not. It returns a one if, for each bit pair, one
input is one or the other one is zero.
or
This task is triggered when two 32 bit numbers are input, the values are
'ored' together in a bitwise fashion, and the result is output. It returns a
one if either the first input or the second input is a one, otherwise it
returns a zero.
andn
This task is triggered when two 32 bit numbers are input, the values are
'andn-ed' together in a bitwise fashion, and the result is output. The andn
operation stands for and-not. It only returns a one if, for each bit pair, one
input is a one and the other input is not a one. Otherwise it returns a zero.
nor
This task is triggered when two 32 bit numbers are input, the values are
'nored' together in a bitwise fashion, and the result is output. The nor
operation stands for not-or and returns a one only if both inputs are zero.
Otherwise a zero is returned.
xor
This task is triggered when two 32 bit numbers are input, the values are
'xored' together in a bitwise fashion, and the result is output. The xor
operation stands for "exclusive or" and returns a one if one, but not both,
of the inputs is a one. Otherwise a zero is returned.
equ
This task is triggered when two 32 bit numbers are input, the values are
equated together in a bitwise fashion, and the result is output. The equ
operation stands for 'equals' and will return a one if both bits are
identical, and a zero if they are different.
logic_3AA - logic_3CP
These tasks include all 68 possible unique 3-input logic operations, many of
which don't have easy-to-understand human readable names.
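The one- and two-input logic tasks in the table can be written as bitwise expressions on 32-bit values. The sketch below reports which of them a given output matches. It is an illustration, not Avida's task checker: the names are hypothetical, values are assumed to be unsigned 32-bit integers, and because the table does not say which input is negated in orn and andn, both input orders are tried.

```python
MASK32 = 0xFFFFFFFF

# Each entry maps a task name to its bitwise definition on inputs a and b.
LOGIC_TASKS = {
    "not":  lambda a, b: ~a & MASK32,
    "nand": lambda a, b: ~(a & b) & MASK32,
    "and":  lambda a, b: a & b,
    "orn":  lambda a, b: (a | ~b) & MASK32,
    "or":   lambda a, b: a | b,
    "andn": lambda a, b: a & ~b & MASK32,
    "nor":  lambda a, b: ~(a | b) & MASK32,
    "xor":  lambda a, b: a ^ b,
    "equ":  lambda a, b: ~(a ^ b) & MASK32,
}

def tasks_matched(a: int, b: int, output: int) -> list:
    """Names of the logic tasks whose result on (a, b) or (b, a) equals output."""
    return [name for name, op in LOGIC_TASKS.items()
            if output in (op(a, b), op(b, a))]

matched = tasks_matched(0b1100, 0b1010, 0b1000)
```

With inputs 12 and 10, an output of 8 matches only the and task, and an output of 6 matches only xor.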
When describing a reaction, the process portion determines consumption of resources,
their byproducts, and the resulting bonuses. There are several arguments (separated by
colons; example below) to detail the use of a resource. Default values are in brackets:
Table 3: Reaction Process Specifications
Each entry gives the argument and its default, followed by its description.
resource (default: infinite)
The name of the resource consumed. By default, no resource is being
consumed, and the 'max' limit is the amount absorbed.
value (default: 1.0)
Multiply the value set here by the amount of the resource consumed
to obtain the bonus. (0.5 may be inefficient, while 5.0 is very
efficient.) This allows different reactions to make use of the same
resource at different efficiency levels.
type (default: add)
Determines how to apply the bonus (i.e. the amount of the resource
absorbed times the value of this process) to change the merit of the
organism.
add: Directly add the bonus to the current merit.
mult: Multiply the current merit by the bonus (warning: if the bonus
is ever less than one, this will be detrimental!)
pow: Multiply the current merit by 2^bonus. This is effectively
multiplicative, but positive bonuses are always beneficial, and
negative bonuses are harmful.
max (default: 1.0)
The maximum amount of the resource consumed per occurrence.
min (default: 0.0)
The minimum amount of resource required. If less than this quantity
is available, the reaction ceases to proceed.
frac (default: 1.0)
The maximum fraction of the available resource that can be consumed.
product (default: none)
The name of the by-product resource. At the moment, only a single
by-product can be produced at a time.
conversion (default: 1.0)
The conversion rate to by-product resource.
inst (default: none)
The instruction that gets executed when this reaction gets
performed. If you do not want an organism to be able to have the
instruction in its genome, you still must put it in the
instruction set file, but set its weight to zero. The instruction is
executed at no cost to the organism.
lethal (default: 0)
Whether the cell dies after performing the process.
If no process is given, a single associated process with all default settings is assumed.
If multiple process statements are given, all are acted upon when the reaction is
triggered. Assuming you were going to set all of the portions of process to be their
default values, this portion of the reaction statement would appear as:
process:resource=infinite:value=1:type=add:max=1:min=0:frac=1:product=none:conversion=1
This statement has many redundancies; for example, it would indicate that the associated
reaction should use the infinite resource, making 'frac' and 'min' settings irrelevant.
Likewise, since 'product' is set to none, the 'conversion' rate is never considered.
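As a rough illustration of how the process arguments combine, the following Python sketch computes the amount consumed and applies the resulting bonus to an organism's merit. The function names are hypothetical, and the exact interplay of 'max', 'frac' and 'min' is an assumption drawn from the descriptions above, not taken from the Avida source.

```python
def amount_consumed(available=None, max_amount=1.0, min_amount=0.0, frac=1.0):
    """Amount of resource absorbed by one occurrence of a reaction.

    available=None stands for the infinite resource, where only 'max' matters.
    Assumption: a finite resource yields min(max, frac * available), and
    nothing at all if less than 'min' is available.
    """
    if available is None:
        return max_amount
    if available < min_amount:
        return 0.0
    return min(max_amount, frac * available)


def apply_process(merit, consumed, value=1.0, kind="add"):
    """Return the new merit after applying the bonus (consumed * value)."""
    bonus = consumed * value
    if kind == "add":
        return merit + bonus
    if kind == "mult":
        return merit * bonus
    if kind == "pow":
        return merit * 2.0 ** bonus
    raise ValueError(f"unknown process type: {kind}")


# All defaults on the infinite resource: one unit absorbed, bonus added to merit.
print(apply_process(10.0, amount_consumed()))               # add
print(apply_process(10.0, amount_consumed(), 4.0, "mult"))  # mult
print(apply_process(10.0, amount_consumed(), 1.0, "pow"))   # pow
```

The 'pow' branch shows why positive bonuses are always beneficial under that type: 2 raised to any positive bonus is greater than one.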
The requisite entry limits when this reaction can be triggered. The following requisites
(in any combination) are possible:
Table 4: Reaction Requisite Specifications
Argument
Description
Default
reaction
This limits this reaction from being triggered until the other
reaction specified here has been triggered first. With this, the user
can force organisms to perform reactions in a specified order.
none
noreaction
This limits this reaction from being triggered if the reaction
specified here has already been triggered. This allows the user to
make mutually exclusive reactions, and force organisms to "choose"
their own path.
none
min_count
This restriction requires that the task used to trigger this reaction
must be performed a certain number of times before the trigger will
actually occur. This (along with max_count) allows the user to
provide different reactions depending on the number of times an
organism has performed a task.
0
max_count
This restriction places a cap on the number of times a task can be
done and still trigger this reaction. It allows the user to limit the
number of times a reaction can be done, as well as (along with
min_count) provide different reactions depending on the number of
times an organism has performed a task.
INT_MAX
No restrictions are present by default. If there are multiple requisite entries, only
*one* of them need be satisfied in order to trigger the reaction. Note though that a
single requisite entry can have as many portions as needed.
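The requisite rules can be sketched as a small predicate. This Python sketch is illustrative only: the class and function names are hypothetical, and it assumes that all portions of one requisite entry must hold together, that `task_count` is the number of times the task was performed before the current attempt, and that the count window is `min_count <= task_count < max_count`.

```python
from dataclasses import dataclass
from typing import Optional

INT_MAX = 2**31 - 1  # stand-in for the default max_count


@dataclass
class Requisite:
    reaction: Optional[str] = None    # must already have been triggered
    noreaction: Optional[str] = None  # must not have been triggered yet
    min_count: int = 0
    max_count: int = INT_MAX

    def satisfied(self, triggered, task_count):
        """True if every portion of this single requisite entry holds."""
        if self.reaction is not None and self.reaction not in triggered:
            return False
        if self.noreaction is not None and self.noreaction in triggered:
            return False
        return self.min_count <= task_count < self.max_count


def can_trigger(requisites, triggered, task_count):
    """No requisites means no restrictions; otherwise one satisfied entry is enough."""
    if not requisites:
        return True
    return any(r.satisfied(triggered, task_count) for r in requisites)


once_only = [Requisite(max_count=1)]
print(can_trigger(once_only, set(), 0))  # first performance of the task
print(can_trigger(once_only, set(), 1))  # task already performed once
```

Under these assumptions, `max_count=1` lets a reaction fire only on the first performance of its task.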
Examples
We could simulate the pre-environment system (in which no resources were present and task
performance was rewarded with a fixed bonus) with a file including only lines like:
REACTION AND logic:2a process:type=mult:value=4.0
REACTION EQU logic:2h process:type=mult:value=32.0
requisite:max_count=1
No RESOURCE statements need be included since only the infinite resource is used (by
default, since we don't specify another resource's name).
To create an environment with two resources that are converted back and forth as tasks
are performed, we might have:
RESOURCE yummyA:initial=1000
RESOURCE yummyB:initial=1000
REACTION AtoB gobbleA process:resource=yummyA:frac=0.001:product=yummyB
REACTION BtoA gobbleB process:resource=yummyB:frac=0.001:product=yummyA
A value of 1.0 per reaction is default. Obviously gobbleA and gobbleB would have to be
tasks described within Avida.
A requisite against the other reaction being performed would prevent a single organism
from garnering both rewards in equal measure.
As an example, to simulate a chemostat, we might have:
RESOURCE glucose:inflow=100:outflow=0.01
This would create a resource called "glucose" that has a fixed inflow rate of 100 units
per update, where 1% flows out every update (leaving a steady state of 10,000 units if no
organism consumption occurs).
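The steady state of such a chemostat follows from balancing inflow against outflow: the level stops changing when inflow = outflow x level, so level = inflow / outflow. The Python sketch below checks this with a simple per-update simulation. The update rule (add the inflow, then remove the outflow fraction of the current level) is an assumption about the model, and the function names are hypothetical.

```python
def steady_state(inflow, outflow):
    """Resource level at which inflow and fractional outflow balance."""
    return inflow / outflow


def simulate(inflow, outflow, updates, initial=0.0):
    """Iterate the assumed per-update rule with no organism consumption."""
    level = initial
    for _ in range(updates):
        level += inflow - outflow * level
    return level


# inflow=100 with 1% outflow settles near 10,000 units;
# an inflow of 10000 with 20% outflow would settle near 50,000 units.
print(steady_state(100, 0.01), simulate(100, 0.01, 2000))
print(steady_state(10000, 0.2), simulate(10000, 0.2, 200))
```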
Limitations to this system:
Resources are currently all global; at some point soon we need to implement local
resources.
Only a single resource can be required at a time, and only a single by-product can
be produced.
The default setup is:
REACTION NOT  not
REACTION NAND nand
REACTION AND  and
REACTION ORN  orn
REACTION OR   or
REACTION ANDN andn
REACTION NOR  nor
REACTION XOR  xor
REACTION EQU  equ
process:value=1.0:type=pow
This creates an environment where the organisms get a bonus for performing any of nine
tasks. Since none of the reactions are associated with a resource, the infinite resource
is assumed, which is non-depletable. The max_count of one means they can only get the
bonus from each reaction a single time.
A similar setup that has 9 resources, one corresponding to each of the nine possible
tasks listed above is:
RESOURCE resNOT:inflow=100:outflow=0.01 resNAND:inflow=100:outflow=0.01
RESOURCE resAND:inflow=100:outflow=0.01 resORN:inflow=100:outflow=0.01
RESOURCE resOR:inflow=100:outflow=0.01 resANDN:inflow=100:outflow=0.01
RESOURCE resNOR:inflow=100:outflow=0.01 resXOR:inflow=100:outflow=0.01
RESOURCE resEQU:inflow=100:outflow=0.01
REACTION NOT  not  process:resource=resNOT:value=1.0:frac=0.0025
REACTION NAND nand process:resource=resNAND:value=1.0:frac=0.0025
REACTION AND  and  process:resource=resAND:value=2.0:frac=0.0025
REACTION ORN  orn  process:resource=resORN:value=2.0:frac=0.0025
REACTION OR   or   process:resource=resOR:value=4.0:frac=0.0025
REACTION ANDN andn process:resource=resANDN:value=4.0:frac=0.0025
REACTION NOR  nor  process:resource=resNOR:value=8.0:frac=0.0025
REACTION XOR  xor  process:resource=resXOR:value=8.0:frac=0.0025
REACTION EQU  equ  process:resource=resEQU:value=16.0:frac=0.0025
The Analyze File
The file analyze.cfg is used to set up Avida when it is run in analyze mode,
which can be done by running avida -a. Analyze mode is useful for performing
additional tests on genotypes after a run has completed.
This analysis language is basically a simple programming language. The
structure of a program involves loading in genotypes in one or more batches,
and then either manipulating single batches, or doing comparisons between
batches. Currently there can be up to 2000 batches of genotypes, but we will
eventually remove this limit.
The rest of this file describes how individual commands work, as well as some
notes on other language features, like how to use variables. As a formatting
guide, command arguments will be presented between brackets, such as
[filename]. If that argument is mandatory, it will be in blue. If it is
optional, it will be in green, and (if relevant) a default value will be
listed, such as [filename='output.dat'].
Analyze Mode Commands
Analyze mode provides a number of commands for loading, manipulating, and
saving analysis data. In addition to the analyze mode specific commands
detailed in the following sections, all of the Avida actions can be called as
well.
Load Commands
There are currently four ways to load in genotypes:
LOAD_ORGANISM [filename]
Load in a normal single-organism file of the type that is output from
Avida. These consist of lots of organismal information inside of
comments, and then the full genome of the organism with one instruction
per line.
LOAD [filename]
Load in a file that contains a list of genotypes, one-per-line with
additional information about those genotypes. Avida now includes a header
on such files indicating the values contained in each column.
LOAD_SEQUENCE [sequence]
Load in a user-provided sequence as the genotype. Avida has a symbol
associated with each instruction; this command is simply followed by a
sequence of such symbols that is then translated back into a proper
genotype.
LOAD_MULTI_DETAIL [start-UD] [step-UD] [stop-UD] [dir='./'] [start batch=0]
Allows the user to load in multiple detail files at once, one per batch.
This is helpful when you're trying to do parallel analysis on many detail
files, or else to create a phylogenetic depth map.
Example: LOAD_MULTI_DETAIL 100 100 100000 ../my_run/run100/
This would load in the files detail_pop.100 through detail_pop.100000 in
steps of 100, from the directory of my choosing. Since 1000 files will be
loaded and we didn't specify starting batch, they will be put in batches
0 through 999.
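The mapping from update numbers to files and batches in that example can be written out as a short Python sketch. The function name is hypothetical; the filename pattern detail_pop.<update> and the inclusive stop value are taken from the example above.

```python
def multi_detail_plan(start_ud, step_ud, stop_ud, directory="./", start_batch=0):
    """List (batch, filename) pairs that a LOAD_MULTI_DETAIL call would cover."""
    updates = range(start_ud, stop_ud + 1, step_ud)  # stop update is included
    return [
        (start_batch + offset, f"{directory}detail_pop.{update}")
        for offset, update in enumerate(updates)
    ]


plan = multi_detail_plan(100, 100, 100000, "../my_run/run100/")
print(len(plan))   # number of files loaded
print(plan[0])     # first batch and file
print(plan[-1])    # last batch and file
```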
A future addition to this list is a command that will use the "dominant.dat"
file to identify all of the dominant genotypes from a run, and then lookup and
load their individual genomes from the archive directory.
Batch Control Commands
All of the load commands place the new genotypes into the current batch, which
can be set with the SET_BATCH command. Below is the list of control functions
that allow you to manipulate the batches.
SET_BATCH [id]
Set the batch that is currently active; the initial active batch at the
start of a program is 0.
NAME_BATCH [name]
Attach a name to the current batch. Some of the printing methods will
print data from multiple batches, and we want the data from each batch to
be attached to a meaningful identifier.
PURGE_BATCH [id=current]
Remove all genotypes in the specified batch (if no argument is given, the
current batch is purged).
DUPLICATE [id1] [id2=current]
Copy the genotypes from batch id1 into id2. By default, copy id1 into the
current batch. Note that duplicate is non-destructive so you should purge
the target batch first if you don't want to just add more genotypes to
the ones already in that batch.
STATUS
Print out (to the screen) the genotype count of each non-empty batch and
identify the currently active batch.
Analysis Control Commands
There are several other commands that will allow you to interact with the
analysis mode in some very important ways, but don't actually trigger any
analysis tests or output. Below is a list of some of the more important
control commands.
SYSTEM [command]
Run the command listed on the command line. This is particularly useful
if you need to unzip files before you can use them, or if you want to
delete files no longer in use.
INCLUDE [filename]
Include another file into this one and run its contents immediately. This
is useful if you have some pre-written routines that you want to have
available in several analysis files. Watch out because there are
currently no protections against circular includes.
INTERACTIVE
Place Avida analysis into interactive mode so that you can type commands
and have them immediately acted upon. You can place this anywhere within
the analyze file, so that you can have some processing done before
interactive mode starts. You can type quit at any point to continue with
the normal processing of the file.
DEBUG [message]
ECHO [message]
These are both echo commands that will print a message (the arguments
given) onto the screen. If there are any variables (see below) in the
message, they will be translated before printing, so this is a good way
of debugging your programs.
Genotype Manipulation Commands
Now that we know how to interact with analysis mode, and load in genotypes,
it's important to be able to manipulate them. The next batch of commands will
do basic analysis on genotypes, and allow the user to prune batches to only
include those genotypes that are needed.
RECALCULATE [use_resources=0]
Run all of the genotypes in the current batch through a test CPU and
record the measurements taken (fitness, gestation time, etc.). This
overrides any values that may have been loaded in with the genotypes. The
use_resources flag signifies whether or not the test cpu will use
resources when it runs. For more information on resources, see the
summary below.
FIND_GENOTYPE [type='num_cpus' ...]
Remove all genotypes but the one selected. Type indicates which genotype
to choose. Options available for type are num_cpus (to choose the genotype
with the maximum organismal abundance at time of printing), total_cpus
(number of organisms ever of this genotype), fitness, or merit. If the
type entered is numerical, it is used as an id number to indicate the
desired genotype (if no such id exists, a warning will be given).
Multiple arguments can be given to this command, in which case all those
genotypes in that list will be preserved and the remainder deleted.
FIND_ORGANISM [random]
Picks out a random organism from the population and removes all others.
It is different from FIND_GENOTYPE because it takes into account relative
number of organisms within each genotype. To pick more than one
organism, list the word 'random' multiple times. This is essentially
sampling without replacement from the population.
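The difference between picking a genotype and picking an organism can be illustrated with a short Python sketch: each genotype is repeated once per living organism, and organisms are then drawn without replacement. The function name and the abundance dictionary are hypothetical and only model the behavior described above.

```python
import random


def sample_organisms(abundance, n, rng):
    """Draw n organisms without replacement; return the genotype of each pick.

    abundance maps a genotype name to its number of organisms, so abundant
    genotypes are proportionally more likely to be picked.
    """
    population = [name for name, count in abundance.items() for _ in range(count)]
    return rng.sample(population, n)


rng = random.Random(42)  # fixed seed only to make the sketch repeatable
picks = sample_organisms({"org-A": 90, "org-B": 9, "org-C": 1}, 3, rng)
print(picks)
```

Because the draw is without replacement, a genotype with a single organism can appear at most once in the result.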
FIND_LINEAGE [type="num_cpus"]
Delete everything except the lineage from the chosen genotype back to the
most distant ancestor available. This command will only function properly
if parental information was loaded in with the genotypes. Type is the
same as the FIND_GENOTYPE command.
FIND_SEX_LINEAGE [type="num_cpus"] [parent_method="rec_region_size"]
Delete everything except the lineage from the chosen genotype back to the
most distant ancestor available. Similar to FIND_LINEAGE but works in
sexual populations. To simplify things, only maternal lineage plus
immediate fathers are saved, i.e. info about father's parents is
discarded. The second option, parent_method, determines which parent is
considered the 'mother' in each particular recombination. If
parent_method is "rec_region_size" : 'mother' is the parent contributing
more code to the offspring genome (default); if it's genome_size, 'mother'
is the parent with the longer genome, no matter how much of it was
contributed to the offspring. This command will only function properly if
parental information was loaded in with the genotypes. Type is the same
as the FIND_GENOTYPE command.
ALIGN
Create an alignment of all the genome sequences; it will place '_'s in
the sequences to show the alignment. Note that a FIND_LINEAGE must first
be run on the batch in order for the alignment to be possible.
SAMPLE_ORGANISMS [fraction] [test_viable=0]
Keep only the given fraction of organisms in the current batch. This is done per
organism, not per genotype. Thus, genotypes of high abundance may only
have their abundance lowered, while genotypes of abundance 1 will either
stay or be removed entirely. If test_viable is set to 1, sample only from
the viable organisms.
SAMPLE_GENOTYPES [fraction] [test_viable=0]
Keep only the given fraction of genotypes in the current batch. If test_viable is
set to 1, sample only from the viable genotypes.
RENAME [start_id=0]
Change the id numbers of all the genotypes to start at a given value.
Often in long runs we will be dealing with ID's in the millions. In
particular, after reducing a batch to a lineage, we will often want to
number the genotypes in order from the ancestor to the final one.
Basic Output Commands
Next, we are going to look at the standard output commands that will be used to
save information generated in analyze mode.
PRINT [dir='archive/'] [filename]
Print the genotypes from the current batch as individual files (one
genotype per file) in the directory given. If no filename is specified,
the files will be named by the genotype name, with a .gen appended to
them. Specifying the filename is useful when printing a single genotype.
TRACE [dir='archive/'] [use_resources=0]
Trace all of the genotypes and print a listing of their execution. This
will show step-by-step the status of all of the CPU components and the
genome during the course of the execution. The filename used for each
trace will be the genotype's name with a .trace appended. The use_resources
flag signifies whether or not the test cpu will use resources
when it runs. For more information on resources, see the summary below.
PRINT_TASKS [file='tasks.dat']
This will print out the tasks doable by each genotype, one per line in
the output file specified. Note that this information must either have
been loaded in, or a RECALCULATE must have been run to collect it.
DETAIL [file='detail.dat'] [format ...]
Print out all of the stats for each genotype, one per line. The format
indicates the layout of columns in the file. If the filename specified
ends in .html, html formatting will be used instead of plain text. For
the format, see the section on Output Formats below.
DETAIL_TIMELINE [file='detail_timeline.dat'] [time_step=100] [max_time=100000]
Details a time-sequence of dump files.
DETAIL_BATCHES [file='detail_baches.dat'] [format ...]
Details all batches.
DETAIL_INDEX [file] [min_batch] [max_batch] [format ...]
Detail all the batches between min_batch and max_batch.
DETAIL_AVERAGE [file="detail.dat"] [format ...]
Detail the current batch, but print out the average for each argument, as
opposed to the individual values for each genotype, the way DETAIL would.
Arguments are the same as for DETAIL. It takes into account the relative
abundance of each genotype in the batch when calculating the averages.
Analysis Commands
And at last, we have the actual analysis commands that perform tests on the
data and output the results.
ANALYZE_EPISTASIS [file='epistasis.dat'] [num_test=(all)]
For each genotype in the current batch, test possible double mutants
and the single mutations composing them; print both the individual relative
fitnesses and the double mutant relative fitness. By default all double
mutants are tested. If in a hurry, specify the number to be tested.
MAP_TASKS [dir="phenotype/"] [flags ...] [format ...]
Construct a genotype-phenotype array for each genotype in the current
batch. The format is the list of stats that you want to include as
columns in the array. Additionally you can have special format flags; the
possible flags are 'html' to print output in HTML format, and 'link_maps'
to create html links between consecutive genotypes in a lineage.
MAP_MUTATIONS [dir="mutations/"] [flags ...]
Construct a genome-mutation array for each genotype in the current batch.
The format has each line in the genome as a row in the chart, and all
available instructions representing the columns. The cells in the chart
indicate the fitness were a mutation to occur at the position in the
matrix, to the listed instruction. If the 'html' flag is used, the charts
will be output in HTML format.
MAP_DEPTH [filename='depth_map.dat'] [min_batch=0] [max_batch=cur_batch-1]
This will create a depth map (like those we use for phylogeny
visualization) in the filename specified. You can direct which batches to
take this from, but by default it will work perfectly after a
LOAD_MULTI_DETAIL.
AVERAGE_MODULATITY [file='modularity.dat'] [task.0 task.1 task.2 task.3 task.4
task.5 task.6 task.7 task.8]
Calculate several modularity measures, such as how many tasks an
instruction is involved in, the number of sites required for each task, etc. The
measures are averaged over all the organisms in the current batch that
perform any tasks. For the full output list, do AVERAGE_MODULATITY
legend.dat. At the moment, this command doesn't support html output format and
works with only 1 and 2 input tasks.
HAMMING [file="hamming.dat"] [b1=current] [b2=b1]
Calculate the hamming distance between batches b1 and b2. If only one
batch is given, calculations are on all pairs within that batch.
LEVENSTEIN [file='lev.dat'] [batch1] [b2=b1]
Calculate the Levenshtein distance (edit distance) between batches b1 and
b2. This metric is similar to hamming distance, but calculates the
minimum number of single insertions, deletions, and mutations to move
from one sequence to the other.
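To make the difference between the two metrics concrete, here is a minimal Python sketch of both distances on genome sequences written as strings of instruction symbols. It is a textbook implementation, not Avida's own code; in particular, requiring equal lengths for the Hamming distance is an assumption of this sketch.

```python
def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("this sketch only compares sequences of equal length")
    return sum(x != y for x, y in zip(a, b))


def levenshtein(a, b):
    """Minimum number of single insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        cur = [i]
        for j, y in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,             # delete x
                cur[j - 1] + 1,          # insert y
                prev[j - 1] + (x != y),  # substitute, or keep if equal
            ))
        prev = cur
    return prev[-1]


# A sequence shifted by one site differs everywhere under Hamming distance,
# but is only one deletion plus one insertion away under edit distance.
print(hamming("abcd", "bcda"), levenshtein("abcd", "bcda"))
```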
SPECIES [file='species.dat'] [batch1] [batch2] [num_recombinants]
Calculates the percentage of non-viable recombinants between all pairs of
organisms from batches 1 and 2. Number of random recombination events for
each pair of organisms is specified by num_recombinants. Recombination is
done in the same way as in the birth chamber when divide-sex is executed.
Output: Batch1Name Batch2Name AveDistance Count FailCount
RECOMBINE [batch1] [batch2] [batch3] [num_recombinants]
Similar to the SPECIES command, but instead of calculating things on the
spot, just create all the recombinant genotypes using organisms from
batches 1 and 2 and put them in batch3.
Using Test CPU Resources Summary
This summary is given to help explain the use and constraints for using
resources.
When a command specifies the use of resources for the test cpu, it should not
affect the state of the test cpu after the command has finished. However, this
means that the test cpu is no longer guaranteed to be reentrant. Each command
will set up the environment and the resource count in the test cpu with its
own environment and resource count. When the command has finished it will set
the test cpu's environment and resource count back to what they were
before the command was executed.
Resource usage for the test cpu occurs by setting the environment and then
setting up the resource count using the environment. Once the resource count
has been set up, it will not change during the use of the test cpu. When an
organism performs an IO, completing a task, the concentrations are not
changed. This was a design decision, but is easily changed.
In analyze, a new data structure was included which contains a time ordered
list of resource concentrations. This list can be used to set up resources
from different time points. By using the FillResources function, you can have
the resource library updated with resource concentrations from a time point
closest to the user specified time point. If the LOAD_RESOURCES command is not
called, the list defaults to a single entry which is the initial
concentrations of the resources specified in the environment configuration
file.
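The lookup of the entry closest to a requested time point can be sketched with a binary search over the time-ordered list. This Python sketch is illustrative: the function name, the (update, concentrations) tuple layout, and the choice to prefer the earlier entry on a tie are all assumptions, not details of the FillResources function.

```python
import bisect


def closest_entry(timeline, update):
    """Return the (update, concentrations) entry nearest to the requested update.

    timeline must be non-empty and sorted by update, oldest first.
    """
    times = [t for t, _ in timeline]
    i = bisect.bisect_left(times, update)
    if i == 0:
        return timeline[0]
    if i == len(times):
        return timeline[-1]
    before, after = timeline[i - 1], timeline[i]
    # On a tie, this sketch prefers the earlier entry.
    return before if update - before[0] <= after[0] - update else after


timeline = [
    (0, {"glucose": 1000.0}),
    (100, {"glucose": 500.0}),
    (200, {"glucose": 250.0}),
]
print(closest_entry(timeline, 130))
```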
PRINT_TEST_CPU_RESOURCES
This command first prints whether or not the test cpu is using
resources. Then it will print the concentration for each resource.
LOAD_RESOURCES [file_name="resource.dat"]
This command loads a time-ordered list of resource concentrations. The
command takes a file name containing this type of data, and defaults to
resource.dat. The format of the file must be the same as resource.dat,
and each line must be in the correct chronological order with oldest
first.
Output Formats
Several commands (such as DETAIL and MAP) require format parameters to specify
what genotypic features should be output. Before such commands are used,
other collection functions may need to be run.
Allowable formats after a normal load (assuming these values were available
from the input file to be loaded in) are:
id (Genome ID)
total_cpus (Total CPUs Ever)
update_dead (Update Dead)
parent_id (Parent ID)
length (Genome Length)
depth (Tree Depth)
num_cpus (Number of CPUs)
update_born (Update Born)
sequence (Genome Sequence)
After a RECALCULATE, the additional formats become available:
viable (Is Viable [0/1])
merit (Merit)
efficiency (Replication Efficiency)
task.n (# of times task number n is done)
copy_length (Copied Length)
comp_merit (Computational Merit)
fitness (Fitness)
exe_length (Executed Length)
gest_time (Gestation Time)
div_type (Divide type used; 1 is default)
task.n:binary (is task n done, 0/1)
If a FIND_LINEAGE was done before the RECALCULATE, the parent genotype for
each regular genotype will be available, enabling the additional formats:
parent_dist (Parent Distance)
efficiency_ratio (Replication Efficiency Ratio with parent)
parent_muts (Mutations from Parent)
comp_merit_ratio (Computational Merit Ratio with parent)
fitness_ratio (Fitness Ratio with parent)
html.sequence (Genome Sequence in Color; html format)
If an ALIGN is run, one additional format is available:
alignment (Aligned Sequence)
Finally, there are a handful of formats that will automatically perform
landscaping. The landscape will only be run once per organism even when
multiple output variables are used. For enhanced performance on
multiprocessor/multi-core systems, see the PrecalcLandscape action.
frac_dead (Fraction of Lethal Mutations)
frac_neut (Fraction of Neutral Mutations)
complexity (Physical Complexity of Organism)
frac_neg (Fraction of Harmful Mutations)
frac_pos (Fraction of Beneficial Mutations)
land_fitness (Average Mutation Fitness)
Variables
For the moment, all variables can only be a single character (letter or
number) and begin with a $ whenever they need to be translated to their value.
Lowercase letters are global variables, capital letters are local to a
function (described later), and numbers are arguments to a function. A $$ will
act as a single dollar sign, if needed.
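The translation rule for variables can be modeled in a few lines of Python. This sketch only illustrates the behavior described above: the function name is hypothetical, and leaving an unknown variable untouched is an assumption about what Avida does in that case.

```python
def expand(text, variables):
    """Replace each $x with the value of single-character variable x; $$ yields $."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "$" and i + 1 < len(text):
            name = text[i + 1]
            if name == "$":          # escaped dollar sign
                out.append("$")
                i += 2
                continue
            if name in variables:    # known single-character variable
                out.append(str(variables[name]))
                i += 2
                continue
        out.append(ch)               # ordinary character (or unknown variable)
        i += 1
    return "".join(out)


print(expand("lineage.$i.html", {"i": 12}))
print(expand("price: $$5", {}))
```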
SET [variable] [value]
Sets the variable to the value...
FOREACH [variable] [value] [value ...]
Set the variable to each of the values listed, and run the code that
follows between here and the next END command once for each of those
values.
FORRANGE [variable] [min_value] [max_value] [step_value=1]
Set the variable to each of the values between min and max (at steps
given), and run the code that follows between here and the next END
command, once for each of those values.
Functions
These functions are currently very primitive with fixed inputs of $0 through
$9. $0 is always the function name, and then there can be up to 9 other
arguments passed through. Once a function is created, it can be run just like
any other command.
FUNCTION [name]
This will create a function of the given name, including in it all of the
commands up until an END is found. These commands will be bound to the
function, but are not executed until the function is run as a command.
Inside the function, the variables $1 through $9 can be used to access
arguments passed in.
Currently there are no conditionals or mathematical commands in this scripting
language. These are both planned for the future.
Sample Programs from Analyze Mode
This document gives some example analyze programs and explains how they
function.
Testing a genome sequence
The following program will load in a genome sequence, run it through a test
CPU, and output the information about it in a couple of formats.
VERBOSE
LOAD_SEQUENCE rmzavcgmciqqptqpqcpctletncogcbeamqdtqcptipqfpgqxutycuastttva
RECALCULATE
DETAIL detail_test.dat fitness merit gest_time length viable sequence
TRACE
PRINT
This program starts off with the VERBOSE command so that Avida will print to
the screen all of the details about what is going on as it runs the analyze
script; I recommend you begin all of your programs this way for debugging
purposes. The program then uses the LOAD_SEQUENCE command to allow the user to
enter a specific genome sequence in its compressed format. This will translate
the genome into the proper genotype (as long as you are using the correct
instruction set file, since that file determines the mapping of letters to
instructions).
The RECALCULATE command places the genome sequence into a test CPU, and
determines its fitness, merit, gestation time, etc. so that the DETAIL command
that follows it can have access to all of this information as it prints it to
the file "detail_test.dat" (its first argument). The TRACE and PRINT commands
will then print individual files about this genome, the first tracing its
execution line-by-line, and the second summarizing all sorts of statistics
about it and displaying the genome. Since no directory was specified for these
commands, archive/ is assumed, and the filenames are org-S1.trace and
org-S1.gen. If a genotype has a name when it is loaded, that name will be kept,
but if it doesn't, it will be assigned a name starting at org-S1, then org-S2,
and so on counting higher. The TRACE and PRINT commands add their own suffixes
to the genome's name to determine the filename they will be printed as.
Using Variables
Often, you will want to run the same section of analyze code with multiple
different inputs each time through, or else you might simply want a single
value to be easy to change throughout the code. To facilitate such programming
practices, variables are available in analyze mode that can be altered for
each repetition through the code.
There are actually several types of variables, all of which are a single
letter or number. For a command that requires a variable name as an input, you
simply put that variable where it is requested. For example, if you were going
to set the variable i to be equal to the number 12, you would type:
SET i 12
But later on in the code, how does Avida know when you type an i if you really
want the letter 'i' there, or if you prefer the number 12 to be there? To
distinguish these cases, you must put a dollar sign '$' before a variable
wherever you want it to be translated to its value instead of just using the
variable name itself.
There are a few different commands that allow you to manipulate a variable's
value, and sometimes execute a section of code multiple times based off of
each of the possible values. Here is one example:
FORRANGE i 100 199
SET d /home/charles/dev/avida/runs/evo-neut/evo_neut_$i
PURGE_BATCH
LOAD_DETAIL_DUMP $d/detail_pop.100000
RECALCULATE
DETAIL $d/detail.dat update length fitness sequence
END
The FORRANGE command runs the contents of the loop once for each possible
value in the range, setting the variable i to each of these values in turn.
Thus the first time through the loop, 'i' will be equal to the value '100',
then '101', '102', all the way up to '199'. In this particular case, we have
100 runs (numbered 100 through 199) that we want to work with.
The first thing we do once we're inside the loop is set the value of the
variable 'd' to be the name of the directory we're going to be working with.
Since this is a long directory name, we don't want to have to type it over
every time we need it. If we set it to the variable d, then all we need to do
is type '$d' in the future, and it will be translated to the full name. Note
that in this case we are setting a variable to a string instead of a number;
that's just fine and Avida will figure out how to handle it properly. The
directory we are working with will change each time through the loop, and
it is no problem to use one variable as part of setting another.
After we know what directory we are using, we run a PURGE_BATCH to get rid of
all of the genotypes from the last time through the loop (lest we just keep
building up more and more genotypes in the current batch) and then we refill
the batch by using LOAD_DETAIL_DUMP to load in all of the genotypes saved in
the file detail_pop.100000 within our chosen directory. The RECALCULATE
command runs all of the genotypes through a test CPU so we have all the
statistics we need, and finally DETAIL will print out the stats we want to the
file detail.dat, again placing it in the proper directory. The END command
signifies the end of the FORRANGE loop.
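To make the variable substitution concrete, here is a small Python sketch (not Avida code) that lists the files the loop above would touch. It assumes, as described above, that FORRANGE includes both endpoints and that $i and $d are replaced textually; the helper name expand_run_paths is made up for this illustration.

```python
# Illustrative only: mimics how the loop above expands $i and $d.
# Assumes FORRANGE includes both endpoints, as the text describes.
BASE = "/home/charles/dev/avida/runs/evo-neut/evo_neut_"

def expand_run_paths(first, last):
    """Return (input, output) path pairs for runs first..last inclusive."""
    pairs = []
    for i in range(first, last + 1):
        d = f"{BASE}{i}"  # SET d .../evo_neut_$i
        pairs.append((f"{d}/detail_pop.100000", f"{d}/detail.dat"))
    return pairs

paths = expand_run_paths(100, 199)
print(len(paths))     # number of runs processed
print(paths[0][0])    # first file loaded
print(paths[-1][1])   # last file written
```

Listing the expanded names like this before a long analysis is a cheap way to catch a mistyped directory.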
Finding Lineages
Quite often, the portion of an Avida run that we will be most interested in is
the lineage from the final dominant genotype back to the original ancestor. As
such, there are tools in Avida to get at this information.
FORRANGE i 100 199
SET d /home/charles/dev/avida/runs/evo-neut/evo_neut_$i
PURGE_BATCH
LOAD_DETAIL_DUMP $d/detail_pop.100000
LOAD_DETAIL_DUMP $d/historic_dump.100000
FIND_LINEAGE num_cpus
RECALCULATE
DETAIL lineage.$i.html depth parent_dist length fitness html.sequence
END
This program looks very similar to the last one. The first four lines are
actually identical, but after loading the detail dump at update 100,000, we
also want to load the historic dump from the same time point. A detail file
contains all of the genotypes that were currently alive in the population at
the time it was printed, while the historic files contain all of the genotypes
that are direct ancestors of those that were still alive. The combination of
these two files gives us the lineages of the entire population back to the
original ancestor. Since we are only interested in a single lineage, the next
thing we do is run the FIND_LINEAGE command to pick out a single genotype, and
discard everything else except for its lineage. In this case, we pick the
genotype with the highest abundance (the most virtual CPUs associated with it)
at the time of printing.
As before, the RECALCULATE command gets us any additional information we may
need about the genotypes, and then we print that information to a file using
the DETAIL command. The filenames that we are using this time have the format
lineage.$i.html, so they are all being written to the current directory with
filenames that incorporate the run number right in them. Also, because the
filename ends in the suffix '.html', Avida knows to print the file in a proper
html format. Note that the specific values that we choose to print take
advantage of the fact that we have a lineage (and hence measured things like
the genetic distance to the parent) and are in html mode (and thus can print
the sequence using colors to specify where exactly mutations occurred).
Working with Batches
In analyze mode, we can load genotypes into multiple batches and we then
operate on a single batch at a time. So, for example, if we wanted to only
consider the dominant genotypes at time points 100 updates apart, but all we
had to work with were the detail files (containing all genotypes at each time
point) we might write a program like:
SET d /home/charles/avida/runs/mydir/here-it-is
SET_BATCH 0
FORRANGE u 100 100000 100            # Cycle through updates
PURGE_BATCH                          # Purge current batch (0)
LOAD_DETAIL_DUMP $d/detail_pop.$u    # Load in the population at this update
FIND_GENOTYPE num_cpus               # Remove all but most abundant genotype
DUPLICATE 0 1                        # Duplicate batch 0 into batch 1
END
SET_BATCH 1                          # Switch to batch 1
RECALCULATE                          # Recalculate statistics...
DETAIL dom.dat fitness sequence      # Print info for all dominants!
This program is slightly more complicated than the others, so I added in
comments directly inside it. Basically, what we do here is use batch 0 as our
staging area: we load each full detail dump into it, strip it down to only
the single most abundant genotype, and then copy that genotype over into batch
1. By the time we're done, we have all of the dominant genotypes inside of
batch 1, so we can print anything we need right from there.
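The staging pattern may be easier to see in a toy model. The Python sketch below is not Avida code: the detail files are faked as small lists of (genotype name, abundance) pairs with invented values, and it assumes, as the text above implies, that duplicating batch 0 into batch 1 adds to whatever batch 1 already holds.

```python
# Illustrative model of the two-batch pattern; not Avida code.
# Each "detail file" is faked as a list of (genotype_name, num_cpus) pairs.
fake_dumps = {
    100: [("aaa", 5), ("aab", 12), ("aac", 3)],
    200: [("aab", 7), ("aad", 9)],
    300: [("aad", 20)],
}

batches = {0: [], 1: []}

for update in sorted(fake_dumps):
    batches[0] = []                                  # PURGE_BATCH
    batches[0].extend(fake_dumps[update])            # LOAD_DETAIL_DUMP
    dominant = max(batches[0], key=lambda g: g[1])   # FIND_GENOTYPE num_cpus
    batches[0] = [dominant]
    batches[1].extend(batches[0])                    # DUPLICATE 0 1 (assumed to append)

dominants = [name for name, _ in batches[1]]
print(dominants)
```

Batch 0 never holds more than one update's worth of genotypes, while batch 1 collects one dominant per update.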
Building your own Commands
One really useful feature that I have added to the analyze mode is the ability
for the user to construct a variety of their own commands without modifying
the source code. This is done with the FUNCTION command. For example, if you
know you will always need a file called lineage.html with very specific
information in it, you might write a helper command for yourself as follows:
FUNCTION MY_HTML_LINEAGE # arg1=run_directory
PURGE_BATCH
LOAD_DETAIL_DUMP $1/detail_pop.100000
LOAD_DETAIL_DUMP $1/historic_dump.100000
FIND_LINEAGE num_cpus
RECALCULATE
DETAIL $1/lineage.html depth parent_dist length fitness html.sequence
END
This works identically to how we found lineages and printed their data in the
section above. Only this time, it has created the new command called
MY_HTML_LINEAGE that you can use anytime thereafter. Arguments to functions
work similarly to variables, but they are numbers instead of letters. Thus $1
translates to the first argument, $2 becomes the second, and so on. You are
limited to 9 arguments at this point, but that should be enough for most
tasks. $0 is the name of the function you are running, in case you ever need
to use that.
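As a rough picture of how the numbered arguments behave, here is a Python sketch of the substitution. It is only a model of the description above, not Avida's implementation: the helper substitute_args and the example path are invented, and leaving an unsupplied argument untouched is a choice made for this sketch.

```python
import re

def substitute_args(line, name, args):
    """Replace $0..$9 in one line of a function body (illustrative model)."""
    values = [name] + list(args)

    def lookup(match):
        index = int(match.group(1))
        # Leave unsupplied positions untouched rather than guessing.
        return values[index] if index < len(values) else match.group(0)

    return re.sub(r"\$(\d)", lookup, line)

body_line = "DETAIL $1/lineage.html depth parent_dist length fitness html.sequence"
expanded = substitute_args(body_line, "MY_HTML_LINEAGE", ["/path/to/run"])
print(expanded)
```

Calling the function with a different run directory changes only the $1 part of each line.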
You may be interested in also using functions in conjunction with the SYSTEM
command. Anything you type as arguments to this command gets run on the
command line, so you can make functions to do anything that could otherwise be
done were you at the shell prompt. For example, imagine that you were going to
use a lot of compressed files in your analysis that you would first need to
uncompress. You might write a function like:
FUNCTION UNZIP # Arg1=filename
SYSTEM gunzip $1
END
This is a shorter example than you might typically want to write a function
for, but it does get the point across. This would allow you to just type UNZIP
<filename> whenever you needed to uncompress something.
Functions are particularly useful in conjunction with the INCLUDE command. You
can create a file called something like my_functions.cfg in your Avida work
directory, define a bunch of functions there, and then start all of your
analyze.cfg files with the line:
INCLUDE my_functions.cfg
and you will have access to all of your functions thereafter. Ideally, as this
language becomes more flexible, so will your ability to create functions
within the language, so you will be able to develop flexible and useful
libraries for yourself.
Try it Out...
Here are a couple of example problems you can try to see how well you can use
analyze mode. These should get you used to working with it for future
projects.
Problem 1. A detail file in Avida contains one line associated with each
genotype, in order from the most abundant to the least. Currently, the
LOAD_DETAIL_DUMP command will load the entire file's worth of genotypes into
the current batch, but what if you only wanted the top few? You should write a
function called LOAD_DETAIL_TOP that takes two arguments. The first ($1) is
the name of the file that needs to be loaded in (just as in the original command),
and the second is the number of lines you want to load.
The easiest way to go about doing this is by using the SYSTEM command along
with the Unix command head which will output the very top of a file. If you
typed the line:
head -42 detail_pop.1000 > my_temp_file
The file my_temp_file would be created, and its contents would be the first 42
lines of detail_pop.1000. So, what you need this function to do is create a
temporary file with the proper number of lines from the detail file in it, load
that temp file into the current batch, and then delete the file (using the rm
command). Warning: be very careful with the automated deletions -- you don't
want to accidentally remove something that you really need! I recommend that
you use the command rm -i until you finish debugging. This problem may end up
being a little tricky for you, but you should be able to work your way through
it.
Problem 2. Now that you have a working LOAD_DETAIL_TOP command, you can run
LOAD_DETAIL_TOP <filename> 1 in order to only load the most dominant genotype
from the detail file. Rewrite the example program from the section "Working
with Batches" above such that you now only need to work within a single batch.
List of Actions
There is a large library of actions available for scheduling as events. Additionally,
all of these actions can be used within analyze scripts. Below you will find a listing
of the high level groupings of these actions, along with detailed sections for each
of them.
Print
Print actions are the primary way of saving data from an Avida experiment.
Population
Population actions modify the state of the population, and will actually change the
course of the run.
Environment
Actions that allow the user to change properties of the environment, including
resources.
Save and Load
Actions that allow for saving and loading large data sets, such as full
populations.
Landscape Analysis
Actions that use data from the current state of Avida, process it and then output
the results.
Driver
Actions that allow the user to control program execution, including experiment
termination.
Alphabetical Listing of Available Actions
AnalyzeLandscape
AnalyzePopulation
CompeteDemes
ConnectCells
CopyDeme
DeletionLandscape
DisconnectCells
DumpDonorGrid
DumpFitnessGrid
DumpGenotypeIDGrid
DumpMemory
DumpPopulation
DumpReceiverGrid
DumpTaskGrid
Echo
Exit
ExitAveLineageLabelGreater
ExitAveLineageLabelLess
FullLandscape
HillClimb
Inject
InjectAll
InjectParasite
InjectRandom
InjectRange
InjectResource
InjectScaledResource
InjectSequence
InsertionLandscape
JoinGridCol
JoinGridRow
KillProb
KillRate
KillRectangle
LoadClone
LoadPopulation
ModMutProb
OutflowScaledResource
PairTestLandscape
PrecalcLandscape
PredictNuLandscape
PredictWLandscape
PrintAverageData
PrintCountData
PrintData
PrintDebug
PrintDemeStats
PrintDepthHistogram
PrintDetailedFitnessData
PrintDivideMutData
PrintDominantData
PrintDominantGenotype
PrintDominantParaData
PrintDominantParasiteGenotype
PrintErrorData
PrintGeneticDistanceData
PrintGenotypeAbundanceHistogram
PrintGenotypeMap
PrintGenotypes
PrintInstructionAbundanceHistogram
PrintInstructionData
PrintLineageCounts
PrintLineageTotals
PrintMutationRateData
PrintPhenotypeData
PrintPhenotypeStatus
PrintPopulationDistanceData
PrintResourceData
PrintSpeciesAbundanceData
PrintStatsData
PrintTasksSnapshot
PrintTasksExeData
PrintTasksQualData
PrintTimeData
PrintTotalsData
PrintTreeDepths
PrintVarianceData
PrintViableTasksData
RandomLandscape
ResetDemes
SampleLandscape
SaveClone
SaveHistoricPopulation
SaveHistoricSexPopulation
SaveParasitePopulation
SavePopulation
SaveSexPopulation
SerialTransfer
SetMutProb
SetReactionInst
SetReactionValue
SetReactionValueMult
SetResource
SetVerbose
SeverGridCol
SeverGridRow
TestDominant
ZeroMuts
Print Actions
Output events are the primary way of saving data from an Avida experiment. The main
two types are continuous output, which appends to a single file every time the event is
triggered, and singular output, which produces a single, complete file for each trigger.
PrintAverageData [string filename='average.dat']
Print all of the population averages to the specified file.
PrintErrorData [string filename='error.dat']
Print all of the standard errors of the average population statistics.
PrintVarianceData [string filename='variance.dat']
Print all of the variances of the average population statistics.
PrintDominantData [string filename='dominant.dat']
Print all of the statistics relating to the dominant genotype.
PrintStatsData [string filename='stats.dat']
Print all of the miscellaneous population statistics.
PrintCountData [string filename='count.dat']
Print all of the statistics that keep track of counts (such as the number of
organisms in the population or the number of instructions executed).
PrintTotalsData [string filename='totals.dat']
Print various totals for the entire length of the run (for example, the total
number of organisms ever).
PrintTasksData [string filename='tasks.dat']
Print the number of organisms that are able to perform each task. This uses the
environment configuration to determine what tasks are in use.
PrintTasksExeData [string filename='tasks_exe.dat']
Print the number of times each task has been executed this update.
PrintTasksQualData [string filename='tasks_quality.dat']
Print the total quality of each task. By default a successful task is valued as
1.0. Some tasks, however, can grant partial values and/or special bonuses via the
quality value.
PrintResourceData [string filename='resource.dat']
Print the current counts of each resource available to the population. This uses
the environment configuration to determine what resources are in use. Also creates
separate files resource_resource_name.m (in a format that is designed to be read
into Matlab) for each spatial resource.
PrintTimeData [string filename='time.dat']
Print all of the timing related statistics.
PrintMutationRateData [string filename='mutation_rates.dat']
Output (regular and log) statistics about individual copy mutation rates (aver,
stdev, skew, cur). Useful only when mutation rate is set per organism.
PrintDivideMutData [string filename='divide_mut.dat']
Output (regular and log) statistics about individual, per-site divide mutation
rates (aver, stdev, skew, cur) to divide_mut.dat. Use with a multiple-divide
instruction set.
PrintDominantParaData [string filename='parasite.dat']
Print various quantities related to the dominant parasite.
PrintInstructionData [string filename='instruction.dat']
Print the by-organism counts of which instructions they _successfully_ executed
between birth and divide. Prior to their first divide, organisms report the values for their
parents.
PrintGenotypeMap [string filename='genotype_map.m']
This event is used to output a map of the genotype IDs for the population grid to a
file that is suitable to be read into Matlab.
PrintPhenotypeData [string filename='phenotype_count.dat']
Print the number of phenotypes based on tasks executed this update. Executing a
task any number of times is considered the same as executing it once.
PrintPhenotypeStatus [string filename='phenotype_status.dat']
PrintDemeStats
PrintData <string fname> <string format>
Append to the file specified (continuous output) the data given in the column
list. The column list needs to be a comma-separated list of keywords representing
the data types. Many possible data types can be output; see the complete listing
for details. Note that this event will even create a detailed column legend at the
top of your file so you don't need to separately keep track of what the columns
mean.
PrintInstructionAbundanceHistogram [string filename='instruction_histogram.dat']
Appends a line containing the bulk count (abundance) of each instruction in the
population onto a file.
PrintDepthHistogram [string filename='depth_histogram.dat']
Echo <string message>
Print the supplied message to standard output.
PrintGenotypeAbundanceHistogram [string fname='genotype_abundance_histogram.dat']
Writes out a genotype abundance histogram.
PrintSpeciesAbundanceHistogram [string fname='species_abundance_histogram.dat']
Writes out a species abundance histogram.
PrintLineageTotals [string fname='lineage_totals.dat'] [int verbose=1]
PrintLineageCounts [string fname='lineage_counts.dat'] [int verbose=1]
PrintDominantGenotype [string fname='']
Print the dominant organism's genome (and lots of information about it) into the
file specified. If no filename is given, the genotype's assigned name is used and
the file is placed into the archive subdirectory.
PrintDominantParasiteGenotype [string fname='']
Print the dominant parasite's genome (and lots of information about it) into the
file specified. If no filename is given, the parasite's assigned name is used and
the file is placed into the archive subdirectory.
PrintDetailedFitnessData [int save_max_f_genotype=0] [int print_fitness_histo=0]
[double hist_fmax=1] [double hist_fstep=0.1] [string datafn='fitness.dat'] [string
histofn='fitness_histos.dat'] [string histotestfn='fitness_histos_testCPU.dat']
PrintGeneticDistanceData [string ref_creature_file='START_CREATURE'] [string
filename='genetic_distance.dat']
PrintPopulationDistanceData [string creature='START_CREATURE'] [string fname='']
[int save_genotypes=0]
PrintDebug
PrintGenotypes [string data_fields='all'] [int print_historic=0] [string
filename='genotypes-<update>.dat']
This command is used to print out information about all of the genotypes in the
population. The file output from here can be read back into the analyze mode of
Avida with the LOAD command.
The data_fields parameter indicates what columns should be included in the file,
as a comma-separated list. Options are: all, id, parent_id, parent2_id (for sex),
parent_dist, num_cpus, total_cpus, length, merit, gest_time, fitness, update_born,
update_dead, depth, lineage, sequence. Use all (the default) if you want all of the
fields included.
The print_historic parameter indicates how many updates back in time should be
included in this output. For example, '200' would indicate that any ancestor of the
current population that died out in the last 200 updates should also be printed. A
'-1' in this field indicates that all ancestors should be printed.
The filename parameter simply indicates what you want to call the file.
Example:
u 1000:1000 print_genotypes id,parent_id,fitness 1000
This will print out the full population every 1000 updates, including all genotypes
that have died out since the last time it was printed.
TestDominant [string fname='dom-test.dat']
PrintTaskSnapshot [string fname='']
Run all organisms in the population through test cpus and print out the number of
tasks each can perform.
PrintViableTasksData [string fname='viable_tasks.dat']
PrintTreeDepths [string fname='']
Reconstruction of phylogenetic trees.
DumpMemory [string filename='memory_dump-<update>.dat']
Dump memory summary information.
DumpFitnessGrid [string filename='grid_fitness.<update>.dat']
Print out the grid of organism fitness values.
DumpGenotypeIDGrid [string filename='grid_genotype_id.<update>.dat']
Print out the grid of genotype IDs.
DumpTaskGrid [string filename='grid_task.<update>.dat']
Print out the grid of tasks that organisms do. For each organism, tasks are first
encoded as a binary string (e.g. 100000001 means that organism is doing NOT and EQU)
and then reported as a base-10 number (257 in the example above).
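The conversion between the binary task string and the reported number can be checked with a few lines of Python. This is only a sketch of the arithmetic in the example above; the function names are made up, and the width of nine digits is taken from the example string rather than from Avida itself.

```python
def encode_tasks(bits):
    """Binary task string -> base-10 number, as in the example above."""
    return int(bits, 2)

def decode_tasks(value, num_tasks=9):
    """Base-10 grid value -> zero-padded binary string of num_tasks digits."""
    return format(value, f"0{num_tasks}b")

print(encode_tasks("100000001"))
print(decode_tasks(257))
```

Decoding in this way is handy when post-processing a grid file, since the base-10 values alone do not show which tasks are set.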
DumpDonorGrid [string filename='grid_donor.<update>.dat']
Print out the grid of organisms who donated their merit.
DumpReceiverGrid [string filename='grid_receiver.<update>.dat']
Print out the grid of organisms who received merit.
SetVerbose [string verbosity='']
Change the level of output verbosity. Verbose messages will print all of the
details of what is happening to the screen. Minimal messages will only briefly
state the process being run. Verbose messages are recommended if you're in
interactive analysis mode. When no arguments are supplied, action will toggle
between NORMAL and ON.
Levels: SILENT, NORMAL, ON, DETAILS, DEBUG
Population Actions
Population events modify the state of the population, and will actually change the
course of the run. There are a wide variety of these.
Inject [string fname='START_CREATURE'] [int cell_id=0] [double merit=-1] [int
lineage_label=0] [double neutral_metric=0]
Inject a single organism into the population. Arguments must be included from left
to right; if all arguments are left out, the default creature is the ancestral
organism, and it will be injected into cell 0, have an uninitialized merit, and be
marked as lineage id 0.
InjectRandom <int length> [int cell_id=0] [double merit=-1] [int lineage_label=0]
[double neutral_metric=0]
Injects a randomly generated genome of the supplied length into the population.
InjectAll [string fname='START_CREATURE'] [double merit=-1] [int lineage_label=0]
Same as Inject, but no cell_id is specified and the organism is placed into all
cells in the population.
InjectRange [string fname='START_CREATURE'] [int cell_start=0] [int cell_end=-1]
[double merit=-1] [int lineage_label=0] [double neutral_metric=0]
Injects identical organisms into a range of cells of the population.
Example:
InjectRange 000-aaaaa.org 0 10
Will inject 10 organisms into cells 0 through 9.
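The example implies that the end of the range is not itself filled. A short Python sketch of that reading follows; the helper name is invented, and the exclusive upper bound is inferred from this one example only.

```python
def cells_injected(cell_start, cell_end):
    """Cells filled by a call such as 'InjectRange <file> cell_start cell_end',
    assuming (from the example) that cell_end itself is excluded."""
    return list(range(cell_start, cell_end))

cells = cells_injected(0, 10)
print(len(cells), cells[0], cells[-1])
```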
InjectSequence <string sequence> [int cell_start=0] [int cell_end=-1] [double
merit=-1] [int lineage_label=0] [double neutral_metric=0]
Injects identical organisms based on the supplied genome sequence into a range of
cells of the population.
Example:
InjectSequence ckdfhgklsahnfsaggdsgajfg 0 10 100
Will inject 10 organisms into cells 0 through 9 with a merit of 100.
InjectParasite <string filename> <string label> [int cell_start=0] [int cell_end=1]
Attempt to inject a parasite genome into the supplied population cell range with
the specified label.
InjectParasitePair <string filename_genome> <string filename_parasite> <string
label> [int cell_start=0] [int cell_end=-1] [double merit=-1] [int lineage_label=0]
Inject host parasite pairs into the population cell range specified.
KillProb [double probability=0.9]
Using the specified probability, test each organism to see if it is killed off.
KillRate [double probability=0.9]
Randomly removes a certain proportion of the population. In principle, this action
does the same thing as the KillProb event. However, instead of a probability, here
one has to specify a rate. The rate has the same unit as fitness. So if the average
fitness is 20000, then you remove 50% of the population on every update with a
removal rate of 10000.
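The worked example suggests that the removed fraction is the rate divided by the average fitness. The Python sketch below encodes that reading; it is an inference from the single example above, not a statement of how Avida computes it, and the cap at 1.0 is an assumption added here.

```python
def expected_fraction_removed(rate, average_fitness):
    """Fraction removed per update under the reading rate / average fitness.
    Inferred from the single worked example; capped at 1.0 by assumption."""
    if average_fitness <= 0:
        raise ValueError("average_fitness must be positive")
    return min(1.0, rate / average_fitness)

print(expected_fraction_removed(10000, 20000))
```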
KillRectangle [int x1=0] [int y1=0] [int x2=0] [int y2=0]
Kill off all organisms in a rectangle defined by the points (x1, y1) and (x2, y2).
SerialTransfer [int transfer_size=1] [int ignore_deads=1]
Similar to KillProb, but we specify the exact number of organisms to keep alive
after the event. The ignore_deads argument determines whether only living organisms
are retained.
SetMutProb [string mut_type='copy'] [double prob=0.0] [int start_cell=-1] [int
end_cell=-1]
ModMutProb [string mut_type='copy'] [double prob=0.0] [int start_cell=-1] [int
end_cell=-1]
ZeroMuts
This event will set all mutation rates to zero.
CompeteDemes [int type=1]
ResetDemes
CopyDeme <int src_id> <int dest_id>
SeverGridCol [int col_id=-1] [int min_row=0] [int max_row=-1]
Remove the connections between cells along a column in an Avida grid.
SeverGridRow [int row_id=-1] [int min_col=0] [int max_col=-1]
Remove the connections between cells along a row in an Avida grid.
JoinGridCol [int col_id=-1] [int min_row=0] [int max_row=-1]
Add connections between cells along a column in an Avida grid.
JoinGridRow [int row_id=-1] [int min_col=0] [int max_col=-1]
Add connections between cells along a row in an Avida grid.
ConnectCells <int cellA_x> <int cellA_y> <int cellB_x> <int cellB_y>
Connects a pair of specified cells.
DisconnectCells <int cellA_x> <int cellA_y> <int cellB_x> <int cellB_y>
Disconnects a pair of specified cells.
Environment Actions
Events that allow the user to change environment properties, such as resources and reaction
parameters.
InjectResource <string res_name> <double res_count>
Inject (add) a specified amount of a specified resource. res_name must already
exist as a resource in the environment file.
InjectScaledResource <string res_name> <double res_count>
OutflowScaledResource <string res_name> <double res_percent>
SetResource <string res_name> <double res_count>
Set the resource amount to a specific level. res_name must already exist as a
resource in the environment file.
SetReactionValue <string reaction_name> <double value>
Set the reaction value to a specific level. reaction_name must already exist in the
environment file. value can be negative.
SetReactionValueMult <string reaction_name> <double value>
Multiply the reaction value by the value. reaction_name must already exist in the
environment file. value can be negative.
SetReactionInst <string reaction_name> <string inst>
Set the instruction triggered by this reaction. reaction_name must already exist in
the environment file. inst must be in the instruction set.
Save and Load Actions
SaveClone [string fname='']
Save a clone of this organism to the file specified; if no filename is given, use
the name clone.update. The update number allows regular clones with distinct
filenames to be saved with the same periodic event. Running avida -l filename will
start an Avida population with the saved clone. Note that a clone only consists of
the genomes in the population, and their current state is lost, so the run may not
proceed exactly as it would have had it continued uninterrupted.
LoadClone <string fname>
LoadPopulation <string fname> [int update=-1]
Sets up a population based on a save file such as written out by SavePopulation. It
is also possible to append a history file to the save file, in order to preserve
the history of a previous run.
DumpPopulation [string fname='']
SavePopulation [string fname='']
Save the genotypes and lots of statistics about the population to the file
specified; if no filename is given, use the name detail-update.pop. As with
clones, the update number allows a single event to produce many detail files. The
details are used to collect cross-section data about the population.
SaveSexPopulation [string fname='']
SaveParasitePopulation [string fname='']
SaveHistoricPopulation [int back_dist=-1] [string fname='']
This action is used to output all of the ancestors of the currently living
population to the file specified, or historic-update.pop.
SaveHistoricSexPopulation [string fname='']
Landscape Analysis Actions
Landscape analysis actions perform various types of mutation studies to calculate
properties of the fitness landscape for a particular genome. When scheduled as an event
during a run, these actions will typically perform analysis on the dominant genotype.
In analyze mode, analysis is performed on the entire currently selected batch.
These actions are often very computationally intensive, and thus will take a long time to
compute. In order to take advantage of increasingly available multi-processor/multi-core
systems, a number of these actions have been enhanced to make use of multiple
threads to parallelize work. Set the configuration setting MT_CONCURRENCY to the number
of logical processors available to make use of all processor resources for these
computations.
AnalyzeLandscape [filename='land-analyze.dat'] [int trials=1000] [int min_found=0]
[int max_trials=0] [int max_dist=10]
PrecalcLandscape
Precalculate the distance 1 full landscape for the current batch in parallel using
multiple threads. The resulting data is stored into the current batch and can be
used by many subsequent output commands within Analyze mode.
FullLandscape [string filename='land-full.dat'] [int distance=1] [string
entropy_file=''] [string sitecount_file='']
Do a landscape analysis of the dominant genotype or current batch of genotypes,
depending on the current mode. The resulting output is a collection of statistics
obtained from examining all possible mutations at the distance specified. The
default distance is one.
DeletionLandscape [string filename='land-del.dat'] [int distance=1] [string
sitecount_file='']
InsertionLandscape [string filename='land-ins.dat'] [int distance=1] [string
sitecount_file='']
PredictWLandscape [string filename='land-predict.dat']
PredictNuLandscape [string filename='land-predict.dat']
RandomLandscape [string filename='land-random.dat'] [int distance=1] [int trials=0]
SampleLandscape [string filename='land-sample.dat'] [int trials=0]
HillClimb [string filename='hillclimb.dat']
Does a hill climb with the dominant genotype.
PairTestLandscape [string filename=''] [int sample_size=0]
If sample_size = 0, pairtest the full landscape.
AnalyzePopulation [double sample_prob=1] [int landscape=0] [int save_genotype=0]
[string filename='']
Driver Actions
These actions control the driver object responsible for executing the current run.
Exit
Unconditionally terminate the current run.
ExitAveLineageLabelGreater <double threshold>
Halts the run if the current average lineage label is larger than threshold.
ExitAveLineageLabelLess <double threshold>
Halts the run if the current average lineage label is smaller than threshold.