Integration of QTL Information with Traditional Animal Breeding Programs
Jack C.M. Dekkers
Department of Animal Science
225 Kildee Hall
Iowa State University
Ames, IA, 50011, USA
Introduction
To date, most genetic progress for quantitative traits in livestock has been made by selection on
phenotype or on estimates of breeding values (EBV) derived from phenotype, without
knowledge of the number of genes that affect the trait or the effects of each gene. In this
quantitative genetic approach to genetic improvement, the genetic architecture of traits of interest
has essentially been treated as a 'black box'. Despite this, the substantial rates of genetic
improvement that have been and continue to be achieved in the main livestock species are clear
evidence of the power of quantitative genetic approaches to selection.
The success of quantitative genetic approaches does, however, not mean that genetic progress
could not be enhanced if we could gain insight into the black box of quantitative traits. By being
able to study the genetic make-up of individuals at the DNA level, molecular genetics has given
us the tools to make those opportunities a reality. Molecular data is of interest for use in genetic
selection because genotype information has heritability equal to 1 (assuming no genotyping
errors), it can be obtained in both sexes and on all animals, it can be obtained early in life, and it
may require the recording of less phenotypic information.
The eventual application of molecular genetics in livestock breeding programs depends on
developments in the following four key areas, which jointly culminate in the successful
implementation of strategies for marker-assisted selection (MAS):
i. Molecular genetics: identification and mapping of genes and genetic polymorphisms
ii. QTL detection: detection and estimation of associations of identified genes and genetic
markers with economic traits
iii. Genetic evaluation: integration of phenotypic and genotypic data in statistical methods to
estimate breeding values of individual animals in a breeding population
iv. Marker-assisted selection: development of breeding strategies and programs for the use of
molecular genetic information in selection and mating programs.
The objective of this paper is to review the potential role and integration of each of these four
key areas in genetic improvement programs for livestock.
Molecular Genetics
Through the use of molecular genetic technology, a large number of genes have been mapped
over the past 10 years in the main livestock species. Although some of these genes have a
functional role in the animal's physiology (i.e. they contain the genetic code for a protein), most
are non-functional or 'neutral' genes. The latter are referred to as 'genetic markers'. The fact that
genetic markers are non-functional does, however, not mean that they are not useful. In
particular, genetic markers can be used to identify genes that affect the quantitative trait we are
interested in (so-called quantitative trait loci or QTL). The important difference between genetic
markers and their 'linked QTL' is that we can determine what genotype an animal has for the
genetic marker but not for the QTL. Because the observable genetic marker is linked to the QTL,
we can, however, use a genetic marker to indirectly select for the QTL, which is the concept
behind MAS.
A marker that is linked to a QTL can be detected by contrasting the mean phenotype of
individuals that have alternate genotypes at the marker. If a difference in mean phenotype
exists, this indicates that the marker is linked to a QTL. However, not every marker
that is linked to a QTL is expected to show a mean difference in phenotype; besides linkage, the
second condition that is needed to create a difference in mean phenotype between alternate
marker genotypes is the presence of linkage disequilibrium (LD) between the marker and the
QTL. The concept of LD is important for both QTL detection and MAS and will be explained
next.
Linkage disequilibrium
Consider a marker locus with alleles M and m and a QTL with alleles Q and q that is on the same
chromosome as the marker, i.e. the marker and the QTL are linked. An individual that is
heterozygous for both loci would have genotype MmQq. Alleles at the two loci are arranged in
haplotypes on the two chromosomes of a homologous pair that each individual carries. An
individual with genotype MmQq could have the following two haplotypes: MQ/mq, where the /
separates the two homologous chromosomes. Alternatively, it could carry the following two
haplotypes: Mq/mQ. These alternative arrangements of linked alleles on homologous
chromosomes are referred to as the marker-QTL linkage phase. The arrangement of alleles in
haplotypes is important because progeny inherit one of the two haplotypes that a parent carries,
barring recombination.
Presence of linkage equilibrium or disequilibrium relates to the relative frequencies of alternative
haplotypes in a population. In a population that is in linkage equilibrium, alleles at two loci are
randomly assorted into haplotypes (Figure 2). In other words, chromosomes or haplotypes
that carry marker allele M are not more likely to carry QTL allele Q than chromosomes that
carry m. In technical terms, the frequency of the MQ haplotype is equal to the product of the
population allele frequency of M and the frequency of Q. If a marker and QTL are in linkage
equilibrium, there is no value in knowing an individual's marker genotype because it provides no
information on QTL genotype. If the marker and QTL are in linkage disequilibrium (Figure 3),
however, there will be a difference in the probability of carrying Q between chromosomes that
carry M and m marker alleles and, therefore, we would also expect a difference in mean
phenotype between marker genotypes.
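The definition of linkage equilibrium given above can be made concrete with the standard
disequilibrium coefficient D, the difference between the observed frequency of the MQ haplotype
and the product of the allele frequencies of M and Q. The following is a minimal sketch; the
function name and the example frequencies are illustrative assumptions, not data from any
population.

```python
def ld_coefficient(freq_MQ, freq_Mq, freq_mQ, freq_mq):
    """Return D = f(MQ) - f(M) * f(Q) from the four haplotype frequencies."""
    total = freq_MQ + freq_Mq + freq_mQ + freq_mq
    if abs(total - 1.0) > 1e-9:
        raise ValueError("haplotype frequencies must sum to 1")
    freq_M = freq_MQ + freq_Mq  # allele frequency of marker allele M
    freq_Q = freq_MQ + freq_mQ  # allele frequency of QTL allele Q
    return freq_MQ - freq_M * freq_Q


# Hypothetical populations: random assortment versus complete association.
d_equilibrium = ld_coefficient(0.25, 0.25, 0.25, 0.25)
d_complete = ld_coefficient(0.5, 0.0, 0.0, 0.5)
```

A value of D equal to zero corresponds to linkage equilibrium, in which case marker genotype
carries no information on QTL genotype; any non-zero value indicates disequilibrium.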
Linkage disequilibrium (LD) between markers and QTL forms the basis for both QTL detection
and the use of markers in selection. Thus, an understanding of the factors that affect the presence
and extent of LD is important. The main factors that create LD in a population are mutation,
selection, drift (inbreeding), and migration or crossing (see below). The main factor that breaks
down LD is the process of recombination that rearranges haplotypes that exist within a parent in
every generation (Figure 4). Figure 5 shows the effect of recombination on the decay of LD over
generations. The rate of decay depends on the rate of recombination between the loci, i.e. on
their genetic distance on the chromosome; for tightly linked loci, any LD that has been created
will persist over many generations but for loosely linked loci (r > 0.1), LD will decline rapidly
over generations. Figure 6 shows the same concept but from a different angle.
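The decay of LD through recombination can be sketched with the standard expectation for a large
random-mating population, in which disequilibrium is reduced by a factor (1 - r) per generation.
The sketch below assumes no selection, drift, or migration; the function name and the example
values are illustrative assumptions.

```python
def ld_after_generations(d0, r, generations):
    """Expected LD after `generations` of random mating: D_t = D_0 * (1 - r)**t."""
    if not 0.0 <= r <= 0.5:
        raise ValueError("recombination rate must be between 0 and 0.5")
    if generations < 0:
        raise ValueError("generations must be non-negative")
    return d0 * (1.0 - r) ** generations


# Hypothetical initial disequilibrium of 0.25, followed for 10 generations.
tight = ld_after_generations(0.25, 0.01, 10)   # tightly linked loci
loose = ld_after_generations(0.25, 0.20, 10)   # loosely linked loci
```

Under these assumptions most of the initial LD is retained after 10 generations for tightly
linked loci, whereas little remains for loosely linked loci.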
Factors that create linkage disequilibrium:
Mutation and selection: Consider a population that is fixed for QTL allele q. Thus, the only two
marker-QTL haplotypes that are present in the population are Mq and mq. Now, assume a novel
mutation of QTL allele q to allele Q occurs in an Mq haplotype, which is thus converted to MQ.
Now, if this QTL allele has a favorable effect on phenotype, it will be selected for and increase
in frequency. If the marker is closely linked to the QTL, allele M will hitch-hike along with
allele Q and the frequency of the MQ haplotype will increase. The result is a population in which
the marker and QTL are in LD because of the preponderance of MQ haplotypes relative to mQ
haplotypes. This will, however, only occur if the marker is closely linked to the QTL. With loose
linkage, the LD will be broken up by recombination. Another way in which selection can create
LD is when the marker is located between two QTL that are jointly selected for. In this case the
marker allele will also hitch-hike along with the chromosomal segment that is selected for.
Random drift (inbreeding): Random drift results when only a limited number of parents
contribute to the next generation. As a result, by chance an excess of MQ haplotypes relative to
Mq haplotypes may be contributed to the next generation, creating a deviation from linkage
equilibrium in the progeny generation. The effects of drift accumulate over generations as a
function of effective population size (i.e. inbreeding) and recombination rate.
Migration or crossing: Two breeds can have different frequencies of marker and QTL alleles
and, therefore, different frequencies of marker-QTL haplotypes. If such breeds are crossed,
extensive LD is created among the progeny. The most extreme case is the crossing of two inbred
lines that are fixed for MQ and mq, respectively (Figure 7). Then, the F1 progeny will all be
MQ/mq and in this generation the LD will be complete because all M alleles are exclusively
associated with Q and the m alleles exclusively with q.
Population-wide vs. within-family LD:
Although a marker and a linked QTL may be in linkage equilibrium across the population, LD
will always exist within a family, even between loosely linked loci. Consider a double
heterozygous sire with haplotypes MQ/mq. The genotype of this sire is identical to that of an F1
cross between inbred lines. This sire will produce four types of gametes: non-recombinants MQ
and mq and recombinants Mq and mQ. Because the non-recombinants will have higher
frequency, depending on recombination rate (Figure 4), this sire will produce gametes that will
be in LD, which will extend over larger distance (Figure 6). This specific type of LD, however,
only exists within this family; progeny from another sire, e.g. an Mq/mQ sire, will also show LD,
but the LD is in the opposite direction because of the different marker-QTL linkage phase in the
sire. On the other hand, MQ/mQ and Mq/mq sire families will not be in LD because the QTL
does not segregate in these families. When pooled across families these four types of LD will
cancel each other out, resulting in linkage equilibrium across the population. Nevertheless, the
within-family LD can be used to detect QTL and for MAS, provided the differences in linkage
phase are taken into account.
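The gametes produced by a double heterozygous sire can be written out explicitly to show how
within-family LD arises and why it cancels across families with opposite linkage phase. This is a
minimal sketch that assumes a recombination rate r between marker and QTL and equal numbers of
progeny from each sire; the function names are illustrative assumptions.

```python
def sire_gamete_frequencies(r, phase="MQ/mq"):
    """Gamete frequencies for a double heterozygous sire with the given linkage phase."""
    if not 0.0 <= r <= 0.5:
        raise ValueError("recombination rate must be between 0 and 0.5")
    non_recombinant = (1.0 - r) / 2.0
    recombinant = r / 2.0
    if phase == "MQ/mq":
        return {"MQ": non_recombinant, "mq": non_recombinant,
                "Mq": recombinant, "mQ": recombinant}
    if phase == "Mq/mQ":
        return {"Mq": non_recombinant, "mQ": non_recombinant,
                "MQ": recombinant, "mq": recombinant}
    raise ValueError("phase must be 'MQ/mq' or 'Mq/mQ'")


def gametic_ld(freqs):
    """D = f(MQ) - f(M) * f(Q) among the gametes of one sire."""
    freq_M = freqs["MQ"] + freqs["Mq"]
    freq_Q = freqs["MQ"] + freqs["mQ"]
    return freqs["MQ"] - freq_M * freq_Q


# Loose linkage (r = 0.2) still gives substantial within-family LD.
d_coupling = gametic_ld(sire_gamete_frequencies(0.2, "MQ/mq"))
d_repulsion = gametic_ld(sire_gamete_frequencies(0.2, "Mq/mQ"))
pooled = (d_coupling + d_repulsion) / 2.0
```

The two sires show LD of equal size but opposite sign, which is why marker effects must be
interpreted within family when the population as a whole is in linkage equilibrium.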
Use of linkage disequilibrium to detect QTL
Methods to detect QTL using genetic markers rely on identifying markers that are
associated/correlated with phenotype. This will only occur for markers that are in LD with a
QTL and therefore depends on the type and extent of LD that exist in the population under
analysis.
Population-wide LD in outbred populations
The amount and extent of LD that exists in the populations that are used for genetic improvement
is the net result of all the forces that create and break down LD and is, therefore, the result of the
breeding and selection history of each population, along with random sampling. On this basis,
populations that have been closed for many generations are expected to be in linkage
equilibrium, except for closely linked loci. Thus, in those populations, only markers that happen
to be tightly linked to QTL may show an association with phenotype, and even then there is no
guarantee because of the chance effects of random sampling.
There are two strategies for finding markers that are in population-wide LD with QTL: 1)
evaluating markers that are in or close to genes that are thought to be associated with the trait of
interest (candidate genes), or 2) using a high-density marker map, with a marker every 1 or 2 cM.
Such maps are not available at present for livestock species but are being developed for
humans. The success of these approaches obviously depends on the extent of LD in the
population. Studies in human populations have generally found that LD extends over less than 1
cM. Thus, many markers are needed to get sufficient marker coverage in human populations to
enable detection of QTL based on population-wide LD. Opportunities to utilize population-wide
LD to detect QTL in livestock populations may be considerably greater because of the effects of
selection and inbreeding. Indeed, Farnir et al. (2000) identified substantial LD in the Dutch
Holstein population, which extended over 5 cM. The presence of extensive LD in livestock
populations is advantageous for QTL detection, but disadvantageous for identifying the causative
mutations of these QTL; with extensive LD, markers that are some distance from the causative
mutation can show an association with phenotype.
Population-wide LD in crossbred populations.
Crossing two breeds that differ in gene and, therefore, haplotype frequencies, creates extensive
LD in the crossbred population that extends over larger distances (Figure 6). This enables
detection of QTL that differ between the two breeds based on a limited number of markers
spread over the genome (approximately every 15 to 20 cM) and has formed the basis for the use of F2 or
backcrosses between breeds or lines for QTL detection (e.g. Malek et al. 2001a,b). This
extensive LD enables detection of QTL that are some distance from the markers but also limits
the accuracy with which the position of the QTL can be determined.
More extensive population-wide LD is also expected to exist in synthetic lines, i.e. lines that
were created from a cross in recent history. Depending on the number of generations since the
cross, the extent of LD will have eroded somewhat over generations and will, therefore, span shorter
distances than in F2 populations. This will require a denser marker map to scan the
genome but will enable more precise positioning of the QTL.
Within-family LD in outbred populations.
Because linkage phases between the marker and QTL can differ from family to family, use of
within-family LD for QTL detection requires marker effects to be fitted on a within-family basis,
rather than across the population. Similar to F2 or backcrosses, however, the extent of
within-family LD is extensive and, thus, genome-wide coverage is provided by a limited number of
markers.
Incorporating QTL information in genetic improvement programs.
Strategies for selection on QTL information:
Once markers that are linked to QTL have been identified, their effects can be estimated based
on the association between phenotype and genotype and used to assign a 'molecular score' to
each selection candidate, which can be used to predict the genetic value of the individual and
used for selection. The constitution and method of quantification of the molecular score depend
on the type of LD that is used and the method of marker use (see below). In addition to a molecular
score, individuals can also obtain a regular estimate of the breeding value for the collective effect
of all the other genes. The following three selection strategies can then be distinguished:
1) select on the molecular score alone
2) two-stage selection, with selection on molecular score, followed by selection on regular
phenotype-based EBV
3) selection on an index of the molecular score and the regular EBV.
Selection on molecular score alone ignores information that is available on all the other genes
that affect the trait and is expected to result in the lowest response to selection, unless all genes
that affect the trait are included in the molecular score. This strategy does, however, not require
additional phenotypes, other than those that are needed to estimate marker-effects, and can be
attractive when phenotype is difficult or expensive to record (e.g. disease traits, meat quality,
etc.). If both phenotypic and molecular information is available on selection candidates, index
selection is expected to result in a greater response to selection than two-stage selection. The
reason is similar to why two-trait selection using independent culling levels is expected to give
lower multiple-trait response than index selection; two-stage selection does not select individuals
for which a low molecular score may be compensated by a high phenotype-based EBV.
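The difference between the three strategies can be illustrated with a small example. The sketch
below is not taken from any of the cited studies: the candidate values are hypothetical, the
function names are illustrative, and the index simply sums the molecular score and the EBV with
equal weights, which is an assumption.

```python
def select_on_score(candidates, n_selected):
    """Strategy 1: rank on molecular score alone."""
    ranked = sorted(candidates, key=lambda c: c["score"], reverse=True)
    return [c["id"] for c in ranked[:n_selected]]


def select_two_stage(candidates, n_stage_one, n_selected):
    """Strategy 2: pre-select on molecular score, then rank survivors on EBV."""
    stage_one = sorted(candidates, key=lambda c: c["score"], reverse=True)[:n_stage_one]
    ranked = sorted(stage_one, key=lambda c: c["ebv"], reverse=True)
    return [c["id"] for c in ranked[:n_selected]]


def select_on_index(candidates, n_selected, weight_score=1.0, weight_ebv=1.0):
    """Strategy 3: rank on a weighted sum of molecular score and EBV."""
    ranked = sorted(
        candidates,
        key=lambda c: weight_score * c["score"] + weight_ebv * c["ebv"],
        reverse=True,
    )
    return [c["id"] for c in ranked[:n_selected]]


# Hypothetical candidates; D has a low molecular score but a high EBV.
candidates = [
    {"id": "A", "score": 2.0, "ebv": -1.0},
    {"id": "B", "score": 1.0, "ebv": 1.5},
    {"id": "C", "score": 0.5, "ebv": 0.2},
    {"id": "D", "score": -1.0, "ebv": 4.0},
]

by_score = select_on_score(candidates, 2)
by_two_stage = select_two_stage(candidates, 3, 2)
by_index = select_on_index(candidates, 2)
```

In this example candidate D is culled in the first stage of two-stage selection but is retained
by the index, because its high EBV compensates for its low molecular score.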
Use of molecular information to capitalize on QTL that segregate between breeds:
Breed or line crosses provide the most powerful populations to identify QTL, in particular if the
breeds are divergent for the main traits of interest. Such studies, however, identify QTL that
segregate between rather than within breeds. Nevertheless, this information can be used for
genetic improvement in a number of ways. If a large proportion of the breed difference in the
trait of interest is due to a small number of genes, introgression strategies can be used. If a larger
number of genes is involved, marker-assisted selection within a synthetic line is the preferred
method of improvement.
Marker-assisted introgression
The aim of an introgression program is to introduce one or more favorable genes (target genes)
from a breed that is inferior for other performance characteristics (the donor breed) into a high
performance line that lacks the target genes (the recipient breed). This is done through an initial
F1 cross followed by multiple backcrosses to the recipient breed and one or more generations of
intercrossing (Figure 8). The aim of the backcross generations is to maintain the target gene(s)
while recovering the background genome of the recipient breed. The purpose of the intercrosses
is to fix the line for the target gene(s).
The effectiveness of introgression schemes is limited by the ability to identify backcross or
intercross individuals that carry the target gene(s) and by the ability to identify backcross
individuals that have a high proportion of the recipient genome, in particular in the region(s)
around the target gene(s). The latter affects the number of backcross generations required to
recover the recipient genome. Molecular genetics can enhance the effectiveness of both phases of
an introgression program. Effectiveness of the backcrossing phase can be increased in two ways:
i) by identifying carriers of the target gene(s) (foreground selection), and ii) by enhancing
recovery of the recipient genetic background (background selection). Effectiveness of the
intercrossing phase can also be enhanced through foreground selection on the target gene(s).
Foreground selection relies on population-wide LD in the crossbred population between the
target gene(s) and linked markers. Ideally, the target gene can be identified directly through a
genetic test or even based on phenotype (e.g. the naked neck gene), in which case the LD will be
complete. If linked markers must be used, the effectiveness of foreground selection depends on
the number of target genes and on the confidence interval for the position of those genes. The
latter determines the size of the genomic region that must be introgressed. Both factors have a
large impact on the number of individuals that is required to find individuals that are carriers for
all target genes during the backcrossing phase and homozygous during the intercrossing phase.
For the introgression of multiple target genes, gene pyramiding strategies can be used during the
backcrossing phase to reduce the number of individuals required (Hospital and Charcosset 1997,
Koudandé et al. 2000). Alternatively, the requirement that selected backcross individuals must
be heterozygous for all target genes could be relaxed. Although this will result in a decline in the
frequency of the target genes in the backcross population, it may still be large enough to enable
subsequent selection for these genes during the intercross phase (Figure 9).
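The impact of the number of target genes on the number of individuals required can be sketched
from Mendelian expectations. The sketch assumes unlinked target genes whose genotype is known
without error, backcross parents that are heterozygous for every target gene, and intercross
parents that are both heterozygous for every target gene; with linked markers and uncertain QTL
positions the requirements would be larger. Function names and the 95% confidence level are
illustrative assumptions.

```python
import math


def desired_genotype_probability(n_genes, phase):
    """Probability that one progeny has the desired genotype at all target genes."""
    per_gene = {"backcross": 0.5,     # heterozygous carrier
                "intercross": 0.25}   # homozygous for the target allele
    if phase not in per_gene:
        raise ValueError("phase must be 'backcross' or 'intercross'")
    return per_gene[phase] ** n_genes


def individuals_needed(n_genes, phase, confidence=0.95):
    """Smallest number of progeny giving at least one desired genotype with `confidence`."""
    p = desired_genotype_probability(n_genes, phase)
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p))


needed_one_gene = individuals_needed(1, "backcross")
needed_three_genes = individuals_needed(3, "backcross")
needed_three_genes_intercross = individuals_needed(3, "intercross")
```

The rapid increase in required numbers with each additional target gene is what motivates gene
pyramiding or relaxing the requirement that selected individuals carry all target genes.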
The use of molecular markers in background selection involves estimating the proportion of the
recipient genome on the basis of markers across the genome and selecting individuals with the
highest proportion. To reduce linkage drag, greater emphasis can be given to markers around the
target gene(s). This relies on LD in the crossbred population of marker alleles that originated
from the recipient line and linked genomic regions that originated from the recipient breed,
which are expected to contain alleles with favorable effects compared to alleles that originated
from the donor breed. Note that in this case, estimates of QTL effects and position are not
needed, but the method relies on the assumption that, on average, genomic regions that
originated from the recipient will contain the better alleles for genes that affect performance.
Yancovich et al. (1996) used marker-assisted background selection to introgress the naked neck
gene into a commercial broiler line.
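A simple way to quantify the criterion used in background selection is the (optionally weighted)
proportion of marker alleles of recipient origin. The sketch below assumes that the breed origin
of every marker allele can be determined, i.e. that markers are fully informative; the function
name, the weights, and the example genotypes are hypothetical.

```python
def recipient_proportion(recipient_allele_counts, weights=None):
    """Weighted proportion of marker alleles that originate from the recipient breed.

    recipient_allele_counts: number of recipient-origin alleles (0, 1 or 2) per marker.
    weights: optional emphasis per marker, e.g. larger near the target gene(s).
    """
    if weights is None:
        weights = [1.0] * len(recipient_allele_counts)
    if len(weights) != len(recipient_allele_counts):
        raise ValueError("need one weight per marker")
    if any(count not in (0, 1, 2) for count in recipient_allele_counts):
        raise ValueError("allele counts must be 0, 1 or 2")
    total_weight = sum(weights)
    weighted = sum(w * count / 2.0 for w, count in zip(weights, recipient_allele_counts))
    return weighted / total_weight


# Hypothetical backcross individual scored at four markers; the last two
# markers flank the target gene and receive three times the emphasis.
counts = [2, 2, 1, 1]
unweighted = recipient_proportion(counts)
weighted_near_target = recipient_proportion(counts, weights=[1.0, 1.0, 3.0, 3.0])
```

Candidates that carry the target gene(s) would then be ranked on this proportion, with the
weighted version penalizing donor segments around the target gene more heavily.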
Marker-assisted synthetic line development
Lande and Thompson (1990) proposed a strategy for MAS within a hybrid population created by
crossing two inbred lines (Figure 10). The strategy capitalizes on population-wide LD that
initially exists in crosses between lines or breeds. Thus, marker-QTL associations identified in
the F2 generation can be selected on for several generations, until the QTL are fixed or the
population-wide LD disappears. Zhang and Smith (1992) evaluated the use of markers in such a
situation with selection on BLUP EBV. They compared the following three selection strategies:
selection on molecular score alone (Mol.S), selection on BLUP EBV derived from phenotype, and
selection on an index of molecular score and BLUP EBV. Data for a cross between inbred lines
were simulated on the basis of 100 QTL and 100 markers in a genome of 2000 cM. Marker
effects were estimated in the F2 generation using a two-step procedure. In the first step, a
separate F2 population from the same cross was used to identify markers with the largest effects.
Then, to obtain unbiased estimates, the effects of those markers were re-estimated in the F2
population under selection. The latter estimates were used to obtain marker-based EBV
throughout the selection process. Results illustrated in Figure 11 show that index selection
resulted in the greatest response, followed by selection on BLUP EBV and selection on markers
alone. Rates of response declined over generations for all strategies because data were simulated
using a finite number of loci, which were moved to fixation by selection. Rates of response
declined faster for selection on molecular score alone because recombination eroded the
disequilibrium between the markers and QTL. Nevertheless, substantial rates of response were
obtained using selection on markers alone.
Zhang and Smith (1992) considered the ideal situation of a cross with inbred lines. Although the
lines were not divergent for the trait of interest, they were homozygous at alternate alleles for all
loci. Breeds used in a livestock cross will typically have different means, which will increase the
extent of linkage disequilibrium in the cross. However, both breeds will likely segregate for most
QTL, which will reduce the disequilibrium. Nevertheless, even in crosses between commercial
breeds, substantial numbers of QTL have been found for which the breeds have sufficient
differences in frequency to allow their detection (e.g. Malek et al. 2001a,b). In addition,
favorable effects have been found to originate from the breed with the lower mean for a number
of QTL (Malek et al. 2001b).
A greater problem with the use of crosses between outbred instead of inbred lines is the limited
ability to follow QTL past the F2 generation. In contrast to inbred lines, markers are not fully
informative in crosses between outbred lines. Therefore, the ability to track breed origin of
markers or marker haplotypes will decrease over generations, unless a substantial number of
markers are genotyped within the QTL regions.
An important advantage of selection in a breed cross population is that it can capitalize on QTL
identified in breed-cross studies. This could remove the first step in the estimation process used
by Zhang and Smith (1992), i.e. that of identification of markers with large effects. Although this
does entail the risk that different QTL may segregate in the population under selection, in
particular if QTL studies were based on different breeds, there would be a substantial cost
saving. It is crucial, however, that the second step of the estimation process be conducted in the
population under selection, in order to obtain unbiased estimates of QTL effects that are relevant
to the population under selection. An alternative approach to QTL detection and estimation was
suggested and evaluated by Whittaker et al. (1997). They used a cross-validation approach that
allowed the same F2 population to be used for both selection of markers and estimation of
marker effects, while maximizing power. This would remove the need for prior QTL
information, although such information could still be useful for reducing the genotyping load by
focusing only on the most promising genomic regions.
Instead of an F2 population, a backcross population could be used as the starting point for MAS.
This could be beneficial if the breed difference for performance is large and favorable effects for
QTL originate from both breeds at alternate loci. Then, a backcross to the high performance
breed would reduce the genetic lag for performance traits. The frequency of favorable QTL
alleles from the other breed would, however, only be ¼. Thus considerable emphasis would need
to be placed on these QTL during the initial generations of selection. Use of a backcross for
selection does not negate the use of an F2 cross or prior data on such a cross for marker selection
or QTL identification.
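The frequency of ¼ follows from halving the frequency of the allele with each backcross. As a
minimal sketch, assuming the two breeds are fixed for alternate alleles at the QTL and that no
selection is applied, the expected frequency after any number of backcrosses is given below; the
function name is an illustrative assumption.

```python
def donor_allele_frequency(n_backcrosses):
    """Expected frequency of an allele from the other breed after an F1 cross
    followed by `n_backcrosses` unselected backcrosses to the high performance breed."""
    if n_backcrosses < 0:
        raise ValueError("number of backcrosses must be non-negative")
    return 0.5 ** (n_backcrosses + 1)


freq_f1 = donor_allele_frequency(0)              # F1 cross
freq_first_backcross = donor_allele_frequency(1)
freq_second_backcross = donor_allele_frequency(2)
```

Each further unselected backcross halves the frequency again, which is why emphasis on these QTL
is needed from the first generations of selection.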
Use of molecular information for within-breed selection:
When considering within-breed improvement using molecular data, it is important to distinguish
between the use of markers that are in population-wide linkage disequilibrium with a QTL and
markers that are in population-wide equilibrium. The latter require the use of the LD within
families. The use of population-wide versus within-family LD has important consequences for
the use of markers in selection and for the phenotypic data that is required to support their use.
Smith and Smith (1993) advocated the use of markers that are in population-wide disequilibrium
with QTL because marker effects are easier to estimate and require smaller amounts of
phenotypic data. This is important in particular for traits that are difficult or expensive to
measure. Marker requirements are, however, greater for utilization of population-wide LD
because they must be tightly linked to the QTL, whereas sufficient within-family LD will exist
even for markers that are more distant from the QTL (within 10 cM). The use of population-wide
versus within-family LD will be discussed further in what follows.
Selection on markers that are in population-wide LD
Markers that are in population-wide LD with a QTL include markers identified using candidate
gene and related approaches. The ideal case is a marker that is known to represent the functional
polymorphism, but this is not required for the effective use of population-wide LD. For markers
that are in population-wide LD with the QTL, selection can be directly on marker genotype or on
marker haplotype if multiple linked markers are used to track the QTL. It is, however, essential
to estimate the effects of the markers within the population under selection to capture the degree
of LD and linkage phases that are present in the population and to guard against potential
interactions of the QTL with the background genome. For the same reason, it will also be
prudent to re-estimate the effects on a regular basis. Estimation requires marker genotypes and
phenotypes on a random sample of individuals in the population (~500) and should be based on
an animal model with marker genotypes or haplotypes included as fixed effects (e.g. Short et al.
1997, Israel and Weller 1998):
Phenotype = fixed effects + marker genotype + breeding value + residual
In this model, the regular animal genetic effect (breeding value) models the collective effect of
all genes other than those associated with the markers. For animals that are not genotyped for the
marker/candidate gene, which will often be the case for ancestors, the effect fitted should be the
probability that the individual has each possible genotype (Israel and Weller, 1998). These
probabilities can be estimated from the available marker/candidate gene data. Selection should
be on the sum of the estimates of marker effects (= molecular score) and breeding value.
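One standard way to obtain solutions for a model of this form is through Henderson's mixed model
equations. The following is a minimal sketch, not the procedure of the cited studies: it assumes
a known ratio of residual to polygenic variance, unrelated animals (an identity relationship
matrix), and hypothetical records. Genotype classes are fitted as fixed effects, and the row for
an ungenotyped animal holds genotype probabilities instead of 0/1 codes.

```python
import numpy as np


def solve_mixed_model(y, X, Z, A, variance_ratio):
    """Solve Henderson's mixed model equations for y = Xb + Zu + e.

    variance_ratio: residual variance / polygenic variance (assumed known).
    Returns (b_hat, u_hat): marker genotype effects and polygenic breeding values.
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    Z = np.asarray(Z, dtype=float)
    A_inv = np.linalg.inv(np.asarray(A, dtype=float))
    lhs = np.block([[X.T @ X, X.T @ Z],
                    [Z.T @ X, Z.T @ Z + variance_ratio * A_inv]])
    rhs = np.concatenate([X.T @ y, Z.T @ y])
    solution = np.linalg.solve(lhs, rhs)
    n_fixed = X.shape[1]
    return solution[:n_fixed], solution[n_fixed:]


# Hypothetical data: five animals with one record each.
# Columns of X are the genotype classes MM, Mm and mm; the last animal is
# not genotyped, so its row contains genotype probabilities.
X = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.0, 1.0, 0.0],
              [0.25, 0.5, 0.25]])
Z = np.eye(5)   # each record belongs to one animal
A = np.eye(5)   # assumption: no pedigree relationships among the animals
y = np.array([12.0, 10.5, 9.0, 11.0, 10.0])

b_hat, u_hat = solve_mixed_model(y, X, Z, A, variance_ratio=2.0)
molecular_score = X @ b_hat            # estimated marker effect per animal
total_ebv = molecular_score + u_hat    # selection criterion
```

In practice the relationship matrix would be built from the pedigree and the variance ratio
would be estimated from the data; both are fixed here only to keep the example short.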
Population-wide LD can also be capitalized on using high-density marker maps with, e.g., a
marker every 1 or 2 cM. The power of this approach was recently demonstrated by Meuwissen et
al. (2001) through simulation. They showed that for populations with an effective population
size of 100 and a 1 or 2 cM spacing between markers across the genome, sufficient
disequilibrium was present that genetic values could be predicted with substantial accuracy for
several generations on the basis of associations of marker haplotypes with phenotype on as few as
500 individuals. Although genotyping costs would be too high when applied to the entire genome,
opportunities may exist to utilize this approach on a limited scale by saturating previously
identified QTL regions with markers.
Selection using within-family LD
Use of within-family LD between a QTL and a linked marker requires marker effects or, at a
minimum, marker-QTL linkage phases to be determined separately for each family. This requires
marker genotypes and phenotypes on family members. If linkage between the marker and QTL is
loose, phenotypic records must be from close relatives of the selection candidate because
associations will erode through recombination. With progeny data, marker-QTL effects or
linkage phases can be determined based on simple statistical tests that contrast the mean
phenotype of progeny that inherited alternate marker alleles from the common parent.
Alternatively, marker-assisted BLUP animal models have been developed to incorporate marker
data in genetic evaluation for complex pedigrees (Fernando and Grossman 1989, Goddard 1992).
These methods expand the traditional BLUP model by including two additional random effects
for each QTL that is fitted: an effect for the QTL allele the individual obtained from the sire
(paternal allele), and an effect for the maternal QTL allele:
Phenotype = fixed effects + pat. QTL allele + mat. QTL allele + breeding value + residual
Effects of the paternal and maternal QTL are fitted as random and a gametic relationship matrix
is used to tie alleles of relatives together. This gametic relationship is computed based on the
probability that QTL alleles of two relatives are identical by descent. Marker data is used to
compute these probabilities. For example, if a progeny has inherited marker allele M from a sire
that is heterozygous for this marker, Mm, then the probability that this progeny inherited the
QTL allele that was associated with marker M in the sire is (1-r), where r is the recombination
rate between the marker and the QTL. Thus, the paternal QTL allele of a progeny that
inherited allele M will have a greater covariance with the QTL allele that is associated with marker M
in the sire than that of a progeny that inherited marker allele m from this sire. In this manner, marker
data is used to construct the variance-covariance matrix for the QTL effects, which is then used
to get BLUP estimates of QTL allele effects for each individual. These models result in BLUP
EBV of QTL effects along with polygenic EBV, which can be summed to obtain an estimate of
the total EBV, which can be used for selection.
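The identity-by-descent probability used in the example above can be written as a small function.
This sketch covers only the single sire-progeny step with one informative marker; the complete
gametic relationship matrix for a pedigree is not constructed here, and the function and argument
names are illustrative assumptions.

```python
def prob_sire_qtl_allele(progeny_marker_allele, linked_marker_allele, r):
    """Probability that a progeny's paternal QTL allele is the sire QTL allele
    that was linked to `linked_marker_allele` in a marker-heterozygous sire."""
    if not 0.0 <= r <= 0.5:
        raise ValueError("recombination rate must be between 0 and 0.5")
    if progeny_marker_allele == linked_marker_allele:
        return 1.0 - r  # no recombination between marker and QTL
    return r            # QTL allele reached the progeny through recombination


# Sire Mm with recombination rate 0.1 between marker and QTL.
p_inherited_M = prob_sire_qtl_allele("M", "M", 0.1)
p_inherited_m = prob_sire_qtl_allele("m", "M", 0.1)
```

With r = 0.5 the two probabilities are equal, so an unlinked marker provides no information
on which QTL allele was transmitted.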
The potential benefit of MAS using within-family LD was evaluated by Meuwissen and Goddard
(1996) and is illustrated in Figure 12. Benefits were substantial, in particular for traits for which
regular selection is less effective, including traits for which phenotype is observed following
selection, sex-limited traits, and carcass traits.
Implementation of strategies for selection on within-family LD requires extensive phenotyping
and genotyping. In addition, data should be available for several generations prior to initiating
MAS to accurately estimate QTL effects. For example, results in Figure 11 assumed phenotypic
and genotypic data for five generations prior to initiation of MAS and responses dropped
substantially without the buildup of such data (Meuwissen and Goddard 1996).
Discussion
Although the process of MAS has been extensively evaluated by computer simulation, there is
little or no experimental evidence on the effectiveness of MAS in livestock. The limited reports
that are available in plants primarily focus on the introgression of known genes or QTL regions
and few results of a similar nature are available for livestock (Hanset et al. 1995, Yancovich et
al. 1996). Plant and mouse (Koudandé et al. 2000) studies on the introgression of QTL regions
show that foreground selection based on markers was effective in moving the targeted region
into the recipient genome. However, the improvement in performance of the recipient breed was
generally less than expected based on the initial QTL effect estimates (Dekkers and Hospital
2002). Apart from false positives or overestimation of effects in the initial population, reasons
suggested for the lower response include presence of epistatic interactions among QTL and
between QTL and the genetic background, and genotype by environment interactions. Similar
factors could reduce the realized gain from MAS in synthetic or purebred populations.
Given the uncertainties about the sustainability of marker effects, it appears prudent to use
molecular genetic information in a manner that does not prevent progress toward the overall
breeding goal that can be achieved through conventional selection. A crucial concept in this
regard is to apply MAS in selection space that is unused or under-utilized by conventional selection
(Soller and Medjugorac 1999). A prime example is pre-selection on the basis of markers among
members of a full-sib family for further testing, prior to availability of individual or progeny
records. In such situations conventional selection has no basis for selection because EBV are
derived from pedigree information, which is the same for all members of a full-sib family.
Family members can, however, differ for the markers they inherited, which then provides a basis
for selection, instead of having to make a random choice.
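Pre-selection within a full-sib family can be sketched as follows. This is an illustrative example only: the function name, marker names, genotype coding (number of copies of the favorable marker allele) and effect estimates are all hypothetical and are not taken from the studies cited above.

```python
def preselect_full_sibs(sibs, marker_effects, n_select):
    """Rank full sibs on a marker score and return the top n_select IDs.

    sibs: dict mapping animal ID -> dict of marker name -> number of
          copies (0, 1 or 2) of the favorable marker allele.
    marker_effects: dict mapping marker name -> estimated effect per copy.
    """
    def marker_score(genotype):
        # Sum of estimated marker effects weighted by allele count
        return sum(effect * genotype.get(marker, 0)
                   for marker, effect in marker_effects.items())

    ranked = sorted(sibs, key=lambda animal: marker_score(sibs[animal]),
                    reverse=True)
    return ranked[:n_select]


# Hypothetical full-sib family: all members share the same pedigree-based
# EBV, so only the marker score distinguishes them.
full_sibs = {
    "sib1": {"marker_A": 2, "marker_B": 0},
    "sib2": {"marker_A": 1, "marker_B": 1},
    "sib3": {"marker_A": 0, "marker_B": 2},
    "sib4": {"marker_A": 2, "marker_B": 2},
}
effects = {"marker_A": 1.0, "marker_B": 0.5}  # hypothetical estimates

selected = preselect_full_sibs(full_sibs, effects, n_select=2)
print(selected)
```

Because the pedigree EBV is identical within the family, the marker score replaces what would otherwise be a random choice of which sibs to test further.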
An important decision for the application of MAS is which QTL or markers should be used in
selection. QTL mapping studies typically apply very stringent thresholds based on genome-wide
testing to reduce the rate of false positives, as suggested by Lander and Kruglyak (1995). This,
however, increases the rate of false negatives and removes opportunities to select on those QTL.
Several studies have shown that greater gains from MAS can be obtained by allowing a higher
rate of false positives, in order to reduce the number of false negatives (Moreau et al. 1998,
Spelman and Garrick 1998). Thus, alternative strategies, such as use of the false discovery rate
(Weller et al. 1998), are needed to more adequately balance the cost of false positive against
false negative results for MAS.
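As an illustration of how a false discovery rate criterion relaxes a stringent genome-wide threshold, the sketch below applies the standard Benjamini-Hochberg step-up procedure to a set of made-up p-values. It is a generic example under assumed inputs and is not necessarily the exact procedure applied in the studies cited above.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return the (sorted) indices of tests declared significant when the
    false discovery rate is controlled at level fdr using the
    Benjamini-Hochberg step-up procedure."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        # Largest rank k with p_(k) <= (k / m) * fdr
        if p_values[idx] <= rank / m * fdr:
            largest_k = rank
    return sorted(order[:largest_k])


def bonferroni(p_values, alpha=0.05):
    """Indices significant under a Bonferroni-corrected threshold."""
    m = len(p_values)
    return [i for i, p in enumerate(p_values) if p <= alpha / m]


# Made-up p-values for ten marker-trait tests
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.065, 0.074, 0.205, 0.212, 0.216]

print(bonferroni(p, alpha=0.05))
print(benjamini_hochberg(p, fdr=0.05))
print(benjamini_hochberg(p, fdr=0.10))
```

Raising the accepted FDR admits more candidate QTL for selection at the cost of more false positives, which is the trade-off discussed above.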
Acknowledgements
The author's research program on QTL detection and MAS is funded by grants from USDA-NRI,
USDA-IFAFS, and PIC/SYGEN. Fruitful discussion and collaboration with colleagues,
post-docs, and graduate students at Iowa State University is gratefully acknowledged.
References
Andersson, L., 2001. Genetic dissection of phenotypic diversity in farm animals. Nature Reviews
Genetics 2, 130-138.
Dekkers, J.C.M., Hospital, F., 2002. The use of molecular genetics in improvement of
agricultural populations. Nature Reviews Genetics 3, 22-32.
Farnir, F., Coppieters, W., Arranz, J-J., Berzi, P., Cambisano, et al. 2000. Extensive genome-wide
linkage disequilibrium in cattle. Genome Res. 10, 220-227.
Fernando, R.L., Grossman, M., 1989. Marker-assisted selection using best linear unbiased
prediction. Genet. Sel. Evol. 21, 467-477.
Goddard, M.E., 1992. A mixed model for analyses of data on multiple genetic markers. Theor.
Appl. Genet. 83, 878-886.
Hospital, F., Charcosset, A., 1997. Marker-assisted introgression of quantitative trait loci.
Genetics 147, 1469-1485.
Israel, C., Weller, J.I., 1998. Estimation of candidate gene effects in dairy cattle populations. J.
Dairy Sci. 81, 1653-1662.
Koudandé, O.D., Iraqi, F., Thomson, P.C., Teale, A.J., van Arendonk, J.A.M., 2000. Strategies to
optimize marker-assisted introgression of multiple unlinked QTL. Mammalian Genome 11,
145-150.
Koudandé, O.D., van Arendonk, J.A.M., Bovenhuis, H., Gibson, J.P., Iraqi, F., 2000.
Trypanotolerance QTL introgression in mice: experimental results. In: Koudandé, O.D.,
Introgression of trypanotolerance genes, Doctoral dissertation, Wageningen Agricultural
University, the Netherlands.
Lande, R., Thompson, R., 1990. Efficiency of marker-assisted selection in the improvement of
quantitative traits. Genetics 124, 743-756.
Lander, E., Kruglyak, L., 1995. Genetic dissection of complex traits: guidelines for interpreting
and reporting linkage results. Nature Genetics 11, 241-247.
Malek, M., Dekkers, J.C.M., Lee, H.K., Baas, T.J., Rothschild, M.F., 2001a. A molecular
genome scan analysis to identify chromosomal regions influencing economic traits in the pig.
I. Growth and body composition. Mamm. Genome 12, 630-636.
Malek, M., Dekkers, J.C.M., Lee, H.K., Baas, T.J., Rothschild, M.F., 2001b. A molecular genome
scan analysis to identify chromosomal regions influencing economic traits in the pig. II. Meat
and muscle composition. Mamm. Genome 12, 637-645.
Meuwissen, T.H.E., Goddard, M.E., 1996. The use of marker haplotypes in animal breeding
schemes. Genet. Sel. Evol. 28, 161-176.
Meuwissen, T.H.E., Hayes, B., Goddard, M.E., 2001. Prediction of Total Genetic Value Using
Genome-Wide Dense Marker Maps. Genetics 157, 1819-1829.
Moreau, L., Charcosset, A., Hospital, F., Gallais, A., 1998. Marker-assisted selection efficiency
in populations of finite size. Genetics 148, 1353-1365.
Short, T.H., Rothschild, M.F., Southwood, O.I., McLaren, D.G., de Vries, A., van der Steen,
H., Eckardt, G.R., Tuggle, C.K., Helm, J., Vaske, D.A., Mileham, A.J., Plastow, G.S.,
1997. Effect of the estrogen receptor locus on reproduction and production traits in four
commercial pig lines. J. Anim. Sci. 75, 3138-3142.
Smith, C., Smith, D.B., 1993. The need for close linkages in marker-assisted selection for
economic merit in livestock. Anim. Breed. Abstr. 61, 197-204.
Soller, M., Medjugorac, I., 1999. A successful marriage: making the transition from quantitative
trait locus mapping to marker-assisted selection. In: Dekkers, J.C.M., Lamont, S.J.,
Rothschild, M.F. (Eds.), From Jay Lush to genomics: visions for animal breeding and
genetics. Dept. Animal Science, Iowa State University, Ames, USA.
Weller, J.I., Song, J.Z., Heyen, D.W., Lewin, H.A., Ron, M., 1998. A new approach to the problem
of multiple comparisons in the genetic dissection of complex traits. Genetics 150, 1699-1706.
Whittaker, J.C., Haley, C.S., Thompson, R., 1997. Optimal weighting of information in
marker-assisted selection. Genet. Research 69, 137-144.
Yancovich, A., Levin, I., Cahaner, A., Hillel, J., 1996. Introgression of the avian naked neck
gene assisted by DNA fingerprints. Anim. Genet. 27, 149-155.
Zhang, W., Smith, C., 1992. Computer simulation of marker-assisted selection utilizing linkage
disequilibrium. Theor. Appl. Genet. 83, 813-820.
[Figure 1. Use of linked markers. QTL detection relies on association of marker genotype with phenotype (mean phenotype by marker genotype: MM 20, Mm 18, mm 14; allele M is associated with the favorable QTL allele). MAS: select MM or individuals that inherited allele M. Requires linkage disequilibrium between marker and QTL.]
[Figure 2. Linkage equilibrium. M is as often associated with Q as m is associated with Q; D = P(MQ) - P(M)P(Q) = 0. Marker genotype is unrelated to phenotype.]
[Figure 3. Linkage disequilibrium. M is more often associated with Q than m is associated with Q; D = P(MQ) - P(M)P(Q) ≠ 0. Marker genotype is related to phenotype.]
[Figure 4. LD is continuously eroded by recombination. After meiosis, each non-recombinant gamete has frequency 1/2(1-r) and each recombinant gamete has frequency 1/2 r.]
[Figure 5. Erosion of LD by recombination over generations 0 to 25, for recombination rates including r = .001.]
[Figure 8. QTL introgression program.]
[Figure 9. Introgression of ...]
[Figure 10.]
[Figure 11. Genetic progress in ...]
[Figure 12. Gains from MAS by generation, with phenotyping after selection and before selection; QTL with 1/3 of genetic variance, haplotype-marked (Meuwissen & Goddard, 1996).]
Questions
Jim Arthur: Breed crosses have the disadvantage that they identify marker-QTL associations for
QTL which are likely fixed within each population contributing to the cross. However, assuming
that there are a limited number of loci with significant influence on the trait of interest, can breed
crosses identify the marker and chromosomal regions on which to focus within populations?
Answer: Breed crosses identify QTL for which the contributing breeds differ in frequencies.
Thus, QTL that are identified in a breed cross are not necessarily fixed for alternate alleles in the
parental breeds, although such QTL are detected with greatest power. So, yes, QTL regions
identified in a cross are good candidate regions for identifying QTL that segregate within the
breeds. In fact, initial results from studies in swine populations that have followed that strategy
look promising.
Ed Buss: When you use the word family, do you put any limitation on the number that are in the
family?
Answer: Most designed studies for QTL detection using within-family LD are based on a
specific type of family, most often half-sib progeny. In that case, larger families give you more
power to detect QTL (a smaller number of large families gives greater power than a large
number of small families). When it comes to utilizing within-family LD in genetic evaluation
and MAS, the term 'family' can refer to any type of relationship; the methods of Fernando and
Grossman (1989), for example, do not require the use of a specific family relationship or family
structure. Instead, they can be applied to any pedigree that one would encounter in livestock
breeding populations.
Hein van der Steen: Methods can be evaluated based on 'power' of the approach and the time it
takes from start of a project to a product that can be used and has commercial value. How do the
different approaches you discussed compare from the latter perspective?
Answer: This is indeed an important aspect. Divergent breed crosses have high power to detect
QTL but such QTL cannot be applied directly for within-breed selection, which requires
identification of QTL that segregate within breeds. Thus, additional work will be needed to
determine whether the QTL segregate within the breeds. Approaches that rely on population-wide
LD result in marker-QTL associations that can be used immediately for within-breed
selection, provided that the associations have been estimated in the population in which they will be
used. Thus, approaches that utilize population-wide LD give quicker results. Cost is another
component that must be considered, however, and that can also differ substantially between
approaches.