Bioinformatical Approaches to Ranking of anti-HIV

Transcription

Bioinformatical Approaches to Ranking of anti-HIV
Dissertation
Bioinformatical Approaches to
Ranking of anti-HIV Combination Therapies
and Planning of Treatment Schedules
Author
André Altmann
Dissertation
zur
des
der
der
Erlangung des Grades
Doktors der Naturwissenschaften (Dr. rer. nat.)
Naturwissenschaftlich–Technischen Fakultäten
Universität des Saarlandes
Saarbrücken
2010
Tag des Kolloquiums:
8.6.2010
Dekan:
Vorsitzender des Prüfungsausschusses:
Berichterstatter:
Prof. Dr. Holger Hermanns
Prof. Dr. Matthias Hein
Prof. Dr. Thomas Lengauer, Ph.D.
Prof. Dr. Jörg Rahnenführer
Prof. Dr. Hans-Peter Lenhof
Dr. Francisco Domingues
Beisitzer:
Abstract
The human immunodeficiency virus (HIV) pandemic is one of the most serious health challenges humanity is facing today. Combination therapy comprising multiple antiretroviral
drugs resulted in a dramatic decline in HIV-related mortality in the developed countries.
However, the emergence of drug resistant HIV variants during treatment remains a problem for permanent treatment success and seriously hampers the composition of new active
regimens.
In this thesis we use statistical learning for developing novel methods that rank combination therapies according to their chance of achieving treatment success. These depend
on information regarding the treatment composition, the viral genotype, features of viral
evolution, and the patient’s therapy history. Moreover, we investigate different definitions
of response to antiretroviral therapy and their impact on prediction performance of our
method. We address the problem of extending purely data-driven approaches to support
novel drugs with little available data. In addition, we explore the prospect of prediction
systems that are centered on the patient’s treatment history instead of the viral genotype.
We present a framework for rapidly simulating resistance development during combination
therapy that will eventually allow application of combination therapies in the best order.
Finally, we analyze surface proteins of HIV regarding their susceptibility to neutralizing
antibodies with the aim of supporting HIV vaccine development.
Kurzfassung
Die Humane Immundefizienz-Virus (HIV) Pandemie ist eine der schwerwiegendsten gesundheitlichen Herausforderungen weltweit. Kombinationstherapien bestehend aus mehreren Medikamenten führten in entwickelten Ländern zu einem drastischen Rückgang der
HIV-bedingten Sterblichkeit. Die Entstehung von Arzneimittel-resistenten Varianten während der Behandlung stellt allerdings ein Problem für den anhaltenden Behandlungserfolg
dar und erschwert die Zusammenstellung von neuen aktiven Kombinationen.
In dieser Arbeit verwenden wir statistisches Lernen zur Entwicklung neuer Methoden,
welche Kombinationstherapien bezüglich ihres erwarteten Behandlungserfolgs sortieren.
Dabei nutzen wir Informationen über die Medikamente, das virale Erbgut, die Virus Evolution und die Therapiegeschichte des Patienten. Außerdem untersuchen wir unterschiedliche Definitionen für Therapieerfolg und ihre Auswirkungen auf die Güte unserer Modelle.
Wir gehen das Problem der Erweiterung von daten-getriebenen Modellen bezüglich neuer
Wirkstoffen an, und untersuchen weiterhin die Therapiegeschichte des Patienten als Ersatz für das virale Genom bei der Vorhersage. Wir stellen das Rahmenwerk für die schnelle
Simulation von Resistenzentwicklung vor, welches schließlich erlaubt, die bestmögliche Reihenfolge von Kombinationstherapien zu suchen.
Schließlich analysieren wir das HIV Oberflächenprotein im Hinblick auf seine Anfälligkeit
für neutralisierende Antikörper mit dem Ziel die Impfstoff Entwicklung zu unterstützen.
iii
iv
Acknowledgements
This work was carried out in the Department for Computational Biology and Applied Algorithmics at the Max Planck Institute for Informatics in Saarbrücken. Foremost, I would
like to thank my advisor Prof. Thomas Lengauer for his encouragement, support, and
advice throughout the years, and for creating an inspiring environment where I could follow my own ideas. Thank you for giving me the opportunity to participate in this great
project. I am also grateful to Prof. Jörg Rahnenführer and Prof. Hans-Peter Lenhof, who
kindly agreed to referee this thesis.
This work would not have been possible without the great number of even greater collaborators. I am much obliged to Robert W. Shafer (Stanford University), W. Jeffrey
Fessel (Kaiser-Permanente Medical Care Program), and Klaus Überla (Ruhr-Universität
Bochum). I am thankful to Rolf Kaiser and all members of his group at the Institute
of Virology (University of Cologne), specifically Eugen Schülter, Nadine Sichtig, Melanie
Balduin, and Jens Verheyen. I am grateful to the members of the Arevir consortium, foremost Rolf Kaiser, Martin Däumer, and Hauke Walter for sharing their knowledge on HIV.
Thanks to all members of EuResist (IST-2004-027173) coordinated by Maurizio Zazzi
and Francesca Incardona for creating this unique experience: especially Mattia Prosperi
and Michal Rosen-Zvi for being partners in crime in developing the prediction engine.
Thanks to all former and present members of our group for making the department
a great and fun place to work (and have breaks). Especially my former office mates,
Christoph Bock, Alexander Thielen, and Kasia Bozek; Bastian Beggel, Jasmina Bogojeska, Alejandro Pironti, Hiroto Saigo, Oliver Sander, Laura Tolosi, Hendrik Weisser for
all the joint projects we worked on; Alejandro Pironti, Alexander Thielen, and Francisco
Domingues for proof-reading parts of this thesis; Christoph Bock, Francisco Domingues,
Oliver Sander, Tobias Sing, and Alexander Thielen for the numerous discussions. Special thanks go to Niko Beerenwinkel, Tobias Sing, and Martin Däumer for their invaluable advice during the beginning of this work, and to Achim Büch and Ruth SchneppenChristmann for their support and innumerable favors.
Yassen Assenov, Rayna Dimitrova, Konstantin Halachev, Christoph Hartmann, Andreas
Kämper, Lars Kunert, Jochen Maydt, Oliver Sander, Tobias Sing, and Laura Tolosi were
the main criminals for making life in Saarbrücken a very fun time. Thank you!
Furthermore, I would like to thank the students that worked with me on different
projects: Fabian Müller for his excellent work in both his Bachelor’s and Master’s Thesis
(congratulations for the Günter-Hotz medal!), Juliane Perner for her great work on her
Bachelor’s Thesis and her invaluable help for computing clinical cutoffs, and Peter Ebert
for his superb job in his Bachelor’s Thesis. It was great working with all of you. I wish
you all the best!
Thanks go also to Jorge Cham for his PhD comics that are a continuous source of fun
because they are so true and straight to the point of every-day PhD-life12 .
Finally, I would like to thank my family and friends for their love and support, especially
my parents, Bärbel, and Günter. Above all though, I am grateful to Evangelia; for sharing
ävery day with me: life is so much easier knowing you are by my side.
1
2
http://www.phdcomics.com/comics/archive.php?comicid=1159
http://www.phdcomics.com/comics/archive.php?comicid=1164
v
Contents
1
2
Introduction
AIDS, HIV, and Antiretroviral Therapy
2.1
2.2
2.3
2.4
2.5
2.6
3
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
GENO 2 PHENO
4.2
4.3
31
Features of Viral Evolution .
4.1.1 Activity Score . . . . .
4.1.2 Mutagenetic Trees . .
geno2pheno-THEO . . . . . .
4.2.1 Material and Methods
4.2.2 Results . . . . . . . .
4.2.3 Discussion . . . . . . .
Clinical Validation . . . . . .
4.3.1 Material and Methods
4.3.2 Results . . . . . . . .
4.3.3 Discussion . . . . . . .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
45
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
E U R ESIST: Uniting Data for Fighting HIV
5.1
5.2
5
7
9
10
13
15
15
20
22
23
25
26
geno2pheno[integrase] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
Estimating Clinically Relevant Cutoffs for geno2pheno . . . . . . . . . . . 35
Improvements Using Semi-Supervised Learning . . . . . . . . . . . . . . . . 40
Predicting Response to Combination Therapy
4.1
5
5
The AIDS pandemic . . . . . . . . . . . . . . . . . . . . . . . . .
2.1.1 HIV Diversity . . . . . . . . . . . . . . . . . . . . . . . . .
The Human Immunodeficiency Virus Type 1 . . . . . . . . . . . .
2.2.1 HIV Replication Cycle . . . . . . . . . . . . . . . . . . . .
Clinical progression of an HIV infection . . . . . . . . . . . . . .
HIV Treatment and Drug Resistance . . . . . . . . . . . . . . . .
2.4.1 Antiretroviral drugs . . . . . . . . . . . . . . . . . . . . .
2.4.2 The Era of Highly Active Antiretroviral Therapy . . . . .
HIV Therapy Management . . . . . . . . . . . . . . . . . . . . . .
2.5.1 Rules-Based Interpretation Systems . . . . . . . . . . . .
2.5.2 Predicting the Drug Resistance Phenotype from Genotype
Further Advancements . . . . . . . . . . . . . . . . . . . . . . . .
Extensions to
3.1
3.2
3.3
4
1
Project Background . . . . . . . . . . .
EuResist Prediction Engines . . . . . .
5.2.1 Generative Discriminative Engine
5.2.2 Mixed Effects Engine . . . . . . .
45
45
47
51
52
56
60
63
64
67
69
75
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
75
78
79
80
vii
5.3
5.4
5.5
5.6
5.7
5.8
5.9
6
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
80
81
84
84
84
89
91
94
94
94
95
95
99
99
100
101
102
103
104
105
106
107
107
108
110
111
113
6.1
114
116
119
121
123
124
127
131
139
140
145
150
152
6.4
6.5
6.6
Weighted Finite State Transducers . . . . . . . . . . . . . . . . . . . . . . .
6.1.1 Composition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
6.1.2 Single Source Shortest Path . . . . . . . . . . . . . . . . . . . . . . .
6.1.3 N -Shortest-Strings . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Transducers in Therapy Planning . . . . . . . . . . . . . . . . . . . . . . . .
Mutation Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
6.3.1 Mutation Probabilities Derived From geno2pheno . . . . . . . . .
6.3.2 Mutation Probabilities Derived From Treatment Data . . . . . . . .
Validation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
6.4.1 Prediction of Response to the Next But One Treatment . . . . . . .
6.4.2 Comparing Predicted Future Drug Options to Data From Real Cases
Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
A Solution to the HIV Pandemic: A Vaccine
7.1
7.2
viii
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Planning Sequences of HIV Therapies
6.2
6.3
7
5.2.3 Evolutionary Engine . . . . . . . . . . . . . . . . . .
Combining Classifiers . . . . . . . . . . . . . . . . . . . . . .
Predictive Performance of the EuResist Prediction Engine
5.4.1 Data . . . . . . . . . . . . . . . . . . . . . . . . . . .
5.4.2 Results . . . . . . . . . . . . . . . . . . . . . . . . .
5.4.3 Discussion . . . . . . . . . . . . . . . . . . . . . . . .
5.4.4 Conclusion . . . . . . . . . . . . . . . . . . . . . . .
The EuResist Web Service . . . . . . . . . . . . . . . . . .
5.5.1 Implementation of the EV Engine . . . . . . . . . . .
5.5.2 EuResist web tool . . . . . . . . . . . . . . . . . .
Towards Prediction of Sustained Response . . . . . . . . . .
5.6.1 Material and Methods . . . . . . . . . . . . . . . . .
5.6.2 Results . . . . . . . . . . . . . . . . . . . . . . . . .
5.6.3 Discussion . . . . . . . . . . . . . . . . . . . . . . . .
Effect of Modified TCE Definitions . . . . . . . . . . . . . .
5.7.1 Material and Methods . . . . . . . . . . . . . . . . .
5.7.2 Results . . . . . . . . . . . . . . . . . . . . . . . . .
5.7.3 Discussion . . . . . . . . . . . . . . . . . . . . . . . .
Integrating Novel Drugs . . . . . . . . . . . . . . . . . . . .
5.8.1 Materials and Methods . . . . . . . . . . . . . . . . .
5.8.2 Results . . . . . . . . . . . . . . . . . . . . . . . . .
5.8.3 Conclusions . . . . . . . . . . . . . . . . . . . . . . .
Therapy History: Replacement for the Genotype? . . . . . .
5.9.1 Material and Methods . . . . . . . . . . . . . . . . .
5.9.2 Results . . . . . . . . . . . . . . . . . . . . . . . . .
5.9.3 Discussion . . . . . . . . . . . . . . . . . . . . . . . .
155
The Immune System and Vaccines . . . . . . . . . . . . . . . . . . . . . . . 155
Vaccines and HIV . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
7.3
7.4
8
Predicting Neutralization from Genotype
7.3.1 Material and Methods . . . . . .
7.3.2 Results . . . . . . . . . . . . . .
7.3.3 Discussion . . . . . . . . . . . . .
Outlook . . . . . . . . . . . . . . . . . .
Conclusion and Outlook
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
161
161
166
170
172
175
Bibliography
179
Appendix: List of Publications
201
x
1 Introduction
Todays methods of modern molecular biology generate massive amounts of data. Proteomics focuses on the composition of proteins (our molecular machines), (epi-)genomics
studies the influence of changes in a person’s (epi-)genetic information on e.g. susceptibility to diseases, where transcriptomics analyzes which genes are active under certain
conditions. Research conducted with these methods in the context of diseases often aims
at providing personalized therapy.
In essence, personalized therapy refers to the delivery of the right drugs in the right
dosage to the patient. Decisions regarding the selection of drugs are in general based on
available biomarkers, e.g. the patient’s genome; personalization ensures a (cost-)effective
therapy. The ways leading to the “personalization” are manifold, for example:
1. Understanding the molecular basis of the disorder. It is crucial to understand how
genetic changes lead to the disease, because deeper understanding will allow a classification of defects with similar phenotype into subgroups depending on their molecular
origin. This, in turn, will lead to the development of better drugs targeted at the
causes of the disease leading to shorter treatment times and reduced costs.
2. Understanding of pathways involved in drug metabolization. Here, genetic changes
influencing the effective concentration of the drug are of interest. For instance, mutations resulting in overdosing of the drug increase the risk of side-effects, while
mutations that lower the drug concentration may result in treatment failure.
3. Pathogen tailored treatment. Obviously, knowing the disease-causing pathogen allows the selection of better matched drugs for its eradication (e.g. targeted vs.
broad-spectrum antibiotic). Moreover, resistance of pathogens against antimicrobial
agents often requires screening for active compounds prior to treatment start.
Typically, “omics” strategies are employed for identifying disease-related biomarkers. In
comparative studies, two groups of individuals (patients vs. healthy controls or responders
vs. non-responders) are studied with proteomics, (epi-)genomics, and transcriptomics.
In these settings, data are generated at a large-scale and their analysis is far beyond the
feasibility of manual inspection. Thus, bioinformatics methods are needed. These methods
often rely on methodologies from statistics and statistical/machine learning to separate
healthy from sick individuals based on the given data. Moreover, bioinformatics methods
are needed to design tools that analyze routinely generated omics data, and thereby bring
personalized therapy into clinical routine.
In this thesis we focus on personalized treatment of infections with the human immunodeficiency virus (HIV). HIV therapy is one of the most prominent examples of personalized therapy today and belongs to the third category of personalized medicine: finding
antiretroviral agents that are effective against the predominant viral variant in the host.
1
2
1 Introduction
Personalized HIV therapy
Briefly, HIV mainly infects cells of the human immune system and, when untreated, HIV
infections typically lead to the acquired immunodeficiency syndrome (AIDS) followed by
death due to opportunistic infections. By now, AIDS accounts for an estimated annual 2
million deaths and thus poses one of the major worldwide health challenges.
Patients in developed countries have access to about two dozen antiretroviral drugs.
These compounds interrupt the replication cycle of the virus and thereby allow for a
partial recovery of the patient’s immune system. Complete eradication of all viral particles
from the patient is not possible. Moreover, treatment success as defined by undetectable
amounts of virus in the blood can only be maintained for a limited time. The cause of the
limited treatment success is the dynamic viral population within the host: high turnover
rates and short replication cycles paired with an error-prone replication process lead to
the evolutionary selection of mutations that reduce the susceptibility of the virus to the
applied antiretroviral drug. These resistance mutations eventually render the antiretroviral
drug useless for treating the patient.
To date, testing for drug resistance prior to prescription of antiretroviral compounds is
highly recommended. Resistance can be assessed either by slow and expensive laboratory
based tests in cell cultures or via fast and cheap standardized methods that determine the
genetic sequence of the viral drug targets. For the output of the latter method, a multitude
of tools provides classification for single drugs in terms of drug resistance.
In order to delay the emergence of resistance mutations, modern anti-HIV regimens
combine at least three drugs with a minimum of two different mechanisms of action. The
introduction of this highly active antiretroviral therapy (HAART) resulted in a dramatic
decline of HIV-related mortality in developed countries. Resistance testing provides the
pivotal tool for selecting active compounds against the viral population in the patient.
Methods for resistance testing, however, still provide only recommendations for single
drugs and thereby ignore the interplay between multiple drugs for attacking HIV.
Outline
We begin in Chapter 2 with a brief overview of the discovery of HIV, followed by a summary of the current pandemic. Then, we continue with a description of the HIV particle,
its replication cycle, and clinical aspects of the HIV infection. This is followed by a review
of state-of-the-art HIV treatment including a detailed description of the viral drug targets.
We continue with a brief summary on HIV resistance testing and available genotype interpretation methods. We conclude the chapter with an outlook on potential advancements
towards richer decision support systems.
In Chapter 3 we present ways for improving and updating geno2pheno, a web service
that predicts drug resistance from the viral genome.
In Chapter 4 we start with a motivation for interpretation systems that consider combination therapies and summarize previously developed features that encode viral evolution.
Then we present geno2pheno-THEO, a freely available interpretation tool, that predicts
clinical response to combination therapy using information on the treatment, the virus, and
evolutionary features. We continue with a retrospective clinical validation of geno2pheno-
3
THEO, where we compare the prediction performance to three widely-used expert-based
interpretation systems. Moreover, we study the robustness of geno2pheno-THEO and explore the benefit of regimen-specific models.
Chapter 5 begins with an introduction to the EuResist project, which is followed by a
presentation of the three decision support systems for inferring short-term in vivo response
to combination therapy developed within the project. We continue with a description of
the EuResist prediction engine including combination approaches, performance assessment (including human experts as reference), and the web service. We proceed with an
investigation on alternative definitions for response to combination treatment and study
the performance of resulting prediction engines. Then we approach the update-problem,
which constitutes the Achilles’ heel of data-driven decision support systems for HIV combination therapy. We conclude the chapter with the presentation of a prediction system
that uses the patient’s treatment history instead of the viral genotype to infer response to
treatment.
In Chapter 6 we motivate the requirement for planning sequences of HIV therapies. We
present transducers, a family of finite state machines, that constitute the computational
framework for our novel method to rapidly simulate viral evolution during combination
therapy. Furthermore, we introduce five mutation models that are based on in vitro and
in vivo resistance data. The framework is studied in two validation settings. In the first
setting we challenge the framework to separate failing from succeeding treatments on the
basis of mutations caused by the preceding regimen. Within the second setting we assess
whether overall resistance development is correctly reflected by resistance development
within drug classes. The chapter is concluded by a case study comparing the impact of
typical first-line regimens on future drug options.
Chapter 7 makes the transition from personalized anti-HIV therapy to the probably only
true solution for the HIV pandemic: an HIV vaccine. In the beginning of the chapter we
summarize the basic concept of the human immune system comprising innate and adaptive immune response. This summary is followed by a short description of how vaccines
establish immunity. Further, we list the challenges posed by HIV to vaccine design and
review the most prominent vaccine candidates. The focus of the chapter is the prediction of
neutralization by antibodies based on the genetic sequence of the viral genome. Analysis
of the statistical learning models reveals positions that are essential for antibody binding. We conclude the chapter with potential improvements for our approach and further
applications to the development of HIV vaccines.
Chapter 8 concludes the thesis and provides an outlook on further advancements on
personalized HIV therapy.
4
1 Introduction
2 AIDS, HIV, and Antiretroviral Therapy
“In the field of observation, chance favours the prepared mind.” – Louis Pasteur
This chapter aims at providing the necessary background for this thesis on the human
immunodeficiency virus (HIV), the acquired immunodeficiency syndrome (AIDS), and
state-of-the-art antiretroviral treatment with the focus on available antiretroviral drugs
and routine treatment guidance. However, many important aspects that are not of direct
relevance for this thesis had to be omitted.
2.1 The AIDS pandemic
In 1981 a mysterious disease was discovered among the gay community in the United States
of America. The patients were hospitalized due to diseases resulting from an impaired
immune system. Owing to the media, the illness quickly became known as GRID (gayrelated immunodeficiency). The cause of the disease was unknown and the race for its
discovery commenced. The first effective tool to screen for a possible infection with the
unknown agent was a positive test for an infection with the Hepatitis B virus (HBV).
Roughly one year after the first cases of this immunodeficiency in the USA were reported, the virus was identified and isolated by researchers at the Institute Pasteur in
Paris (France). For their discovery of the infectious agent that is now known as the Human Immunodeficiency Virus (HIV), Luc Montagnier and Françoise Barré-Sinoussi were
awarded the Nobel-Price for Physiology or Medicine in 2008 (Barré-Sinoussi et al., 1983).
HIV infects cells of the human immune system and eventually leads to their death, thus
explaining the typical symptoms of immunodeficiency in HIV infected patients. Robert C.
Gallo, who also claimed to be the first discoverer of the infectious agent causing AIDS was
not awarded the Nobel Price (Gallo et al., 1983). Indeed, it turned out that the probes investigated by Gallo’s group were co-infected with a Human T cell Leukemia Virus (HTLV),
and thus “the paper by the Montaginer/Chermann group is unequivocally the first reported
true isolation of HIV from a patient with lymphadenopathy” (Gallo, 2002). Gallo, however, demonstrated that the identified agent is the cause of AIDS (Popovic et al., 1984)
and developed the first diagnostic blood test that prevented new infections and probably
saved thousands of lives. The omission of Gallo from the prestigious Nobel Price, despite
his numerous major contributions in the early years, resulted in a massive response by the
international scientific community (Abbadessa et al., 2009). The history of the discovery
of HIV has been briefly summarized by the main contributers Montagnier (2002) and Gallo
(2002) in featured articles.
The period following the identification of HIV (which in the early works was termed LAV
or HTLV-III due to its believed relation to the group of HTL viruses) as the cause of AIDS
was part of the scientific success story of modern medicine. Just to name a few milestones:
5
6
2 AIDS, HIV, and Antiretroviral Therapy
Figure 2.1: Global prevalence of HIV in December 2007. The figure was adapted from
the WHO 2008 Report on the global AIDS epidemic available at http://www.
who.int/hiv/data/en/. By the end of 2008 at total of 33.4 million [31.1-35.8]
people were estimated to live with HIV.
the genome was sequenced and inter- as well as intra-patient variation in viral populations
was detected (Shaw et al., 1984; Ratner et al., 1985; Wong-Staal et al., 1985; Hahn et al.,
1986), the virus was found in the brain of AIDS patients (Shaw et al., 1985), the modes
of HIV transmission were elucidated, all of HIV’s genes and most proteins were defined.
But most importantly, the blood supplies in most developed nations were rendered safe by
screening for HIV. The capability of screening for HIV infections, however, provided a first
hint on how severe the HIV pandemic should become, e.g. sera from hemophiliacs in Japan
were tested HIV negative early 1984, by end of 1984 already 20% of those patients were
tested HIV positive after they had been treated with HIV-contaminated blood products
from the United States (Gallo, 2002).
The spread of AIDS was rapid, soon after first reported in 1981 in the USA, fourteen
countries reported cases of AIDS in 1982, and 33 already in 1983. By the end of 2008,
33.4 million people worldwide were estimated to be living with HIV. Figure 2.1 displays
the prevalence of HIV in different regions of the world. According to the WHO 2009
report on the global AIDS epidemic, 2.7 [2.4-3.0] million people were newly infected with
HIV and about 2.0 million [1.7-2.4] million people died of AIDS in 2008. HIV is believed
to be responsible for approximately 25 million deaths, thus rendering it the most serious
biological killer to date.
The prevalence of HIV infected individuals varies heavily between geographic regions,
e.g. ranging from 0.1 to 0.5% in Central Europe to 15.0 to 28.0% in Southern Africa. These
regional differences are likely the result of many factors, including: different availability of
antiretroviral drugs, information campaigns (e.g. the government of South Africa pursued
an “AIDS denialism” policy that caused deaths of about 330,000 people (Ano, 2006, 2008)),
2.1 The AIDS pandemic
7
the stigmatization of the HIV infection, and frequency of rapes (e.g. South Africa has an
extremely high rate of rapes (Carries et al., 2007)). The high prevalence in sub-Saharan
Africa probably results also from the fact that this region is the origin of HIV (Zhu et al.,
1998). It is estimated that HIV entered the human population in 1931 [95% CI 1915-1941]
through multiple infections from simian immunodeficiency virus (SIV)-infected nonhuman
primates (Korber et al., 2000). Thus, approximately 50 years passed between the entry of
the virus into the human population and the enrichment of AIDS cases in 1981. During
these years, the virus could spread unrecognized, mainly because the late symptoms of
AIDS coincide with symptoms of e.g. malnutrition and tuberculosis, which happen to
be frequent problems in the infected population. As a consequence, the symptoms were
attributed to rather well-known causes instead of a new infectious agent.
2.1.1 HIV Diversity
As briefly mentioned, HIV was introduced into the human population by multiple crossspecies transmissions from SIV infected nonhuman primates. To date, we distinguish
two types of HIVs: HIV type-1 (HIV-1), which is responsible for the current pandemic,
evolved from an SIV variant present in chimpanzees, whereas HIV-2 is the result of a
zoonotic infection from SIV in sooty mangabeys (Heeney et al., 2006). HIV-1 comprises
four genetically distinct groups, namely M (main or major), N (non-M, non-O), O (outlier),
and P, which were the results from at least four different zoonotic infections. Group P was
only recently identified by Plantier et al. (2009) and originates from an SIV variant in
gorillas. This work focuses on HIV-1 group M as it accounts for most infections worldwide
and is therefore of premier interest.
HIV-1 group M is further divided into subtypes. This devision is based on clusters typically appearing in phylogenetic analyses of genetic sequences of HIV-1 group M (Robertson
et al., 2000). These subtypes are named A to D, F to H, J, and K and have a different
prevalence in different geographic regions of the world (Table 2.1), hence, suggesting that
their population structure is the result of founder effects. However, it is possible that some
phylogenetic clusters only appear because of incomplete sampling of the global viral population, e.g. newly sequenced HIV-1 strains from Central Africa fall in-between established
subtype clusters (Rambaut et al., 2004).
In addition to the pure subtypes recombinant forms exist, as well. These are either well
described established circulating recombinant forms (CRFs) or unique recombinant forms
resulting from super-infection of a patient with two or more different subtypes.
Questions related to HIV-1 subtypes and its influence on disease progression (Kanki
et al., 1999) and efficacy of antiretroviral treatment are still of major interest (Kantor et al.,
2005). Subtype B is most prevalent in the developed, industrialized regions of the world,
and therefore a representative of this clade was used for development of antiretroviral drugs.
Thus, the genetic changes that distinguish B variants from non-B variants are believed to
hamper the effectiveness of antiretroviral therapy (Descamps et al., 1998). Recently, it was
shown that HIV infection-related complications may vary between different subtypes, e.g.
the risk of developing HIV-induced dementia is increased in patients that are infected with
subtype D compared to patients infected with subtype A (Sacktor et al., 2009).
HIV-1 subtype B is the best-studied variant of HIV-1 owing to the available resources
8
2 AIDS, HIV, and Antiretroviral Therapy
Figure 2.2: Phylogenetic relation of lentiviruses in men and non-human primates.
Reprinted from The Lancet (Simon et al., 2006) with permission from Elsevier. The recently discovered HIV-1 group P (Plantier et al., 2009) is missing.
and infrastructure for research in its region of prevalence. The other HIV-1 subtypes clearly
deserve more attention than they have received so far. However, data collection efforts in
regions of their prevalence are less advanced than in Europe and North America. Also
the data studied in this work originates mainly for collection efforts in North America and
Central and Western Europe, hence, there exists a bias towards subtype B - simply due to
practical limitations.
The diversity of HIV poses also one of the major challenges for HIV vaccine design
(Walker and Burton, 2008). The sequence diversity within a single subtype, for example,
can reach up to 20%. Clearly, the sequence diversity is an issue especially for antibody
based vaccines that require conserved epitopes on surface proteins. For instance, in Africa
with a multitude of different circulating viruses the sequence of the envelope protein can
differ by up to 38% (Walker and Burton, 2008).
2.2 The Human Immunodeficiency Virus Type 1
geographic region
Sub-Saharan Africa
South & South-East Asia
Latin America
Eastern Europe & central Asia
North America
East Asia
Western & Central Europe
North Africa & Middle East
Caribbean
Oceania
number of
infections
22.400
3.800
2.000
1.500
1.400
0.850
0.850
0.310
0.240
0.059
major
risk groups
HSex
HSex, IDU
HSex, MSM, IDU
IDU
HSex, MSM, IDU
HSex, MSM, IDU
MSM, IDU
HSex, IDU
HSex, MSM
MSM
9
subtypes
A, D, F, G,
C, H, J, K, CRF
B, AE
B, BF
A, B, AB
B
B, C, BC
B
B, C
B
B
Table 2.1: Worldwide distribution of HIV-1 infections (as of December 2008), typical
modes of transmission, and prevalent HIV-1 subtypes. HSex=heterosexual,
MSM=Men who have sex with men, IDU=injection drug users. Numbers of
infections are given in millions and originate from the WHO 2009 Report on the
global AIDS epidemic. CRF=circulating recombinant form. Risk group and
subtype distribution is based on (Simon et al., 2006).
2.2 The Human Immunodeficiency Virus Type 1
The human immunodeficiency virus (HIV) received its name officially in 1986. Extensive
knowledge on HIV has been accumulated and a substantial amount of literature is available.
In the following we provide a brief summary, for further details please refer to Fields et al.
(2007) or similar reference literature.
HIV belongs of to the family of retroviruses and, more precisely, is a member of the genus
of lentiviruses, which indicates a long incubation period. Electron microscopy of particles
in infected cell cultures shows spherical entities with a diameter of 100 - 120 µm (Figure 2.3
a). A conceptual representation of the virus architecture is depicted in Figure 2.3 b). The
characteristic of retroviruses is that they store their genetic information in ribonucleic acid
(RNA) and thus require a mechanism to translate RNA to deoxyribonucleic acid (DNA),
which is the carrier of genetic information in their hosts. Each viral particle contains
two single stranded RNAs of approximately 10 kb length that are tightly bound to viral
nucleocapsid proteins and two viral enzymes (reverse transcriptase and integrase) that are
vital for a successful infection of the host cell. This complex is protected by a cone-shaped
capsid comprising approximately 2,000 copies of the capsid protein. The viral cone – also
known as the viral core – is clearly visible in the electron micrograph (Figure 2.3 a). The
viral cone is surrounded by a spherical matrix that is in turn covered by a lipid membrane.
Attached to the matrix, is the viral spike, which is responsible for target cell recognition
and cell entry.
Figure 2.4 shows the organization of the HIV genome. HIV encodes a total of 15 viral
proteins in overlapping reading frames. The majority of proteins is part of the three large
precursor proteins that have to be cleaved into functional subunits. The polymerase gene
10
2 AIDS, HIV, and Antiretroviral Therapy
(a) Electron microscopy of viral particles
(b) Schematic representation of an HIV virion
Figure 2.3: Electron micrograph of a budding HIV particle and mature HIV particles with
visible viral core from Fields et al. (2007) and reprinted with kind permission
from Lipincott Williams & Wilkins (a). Schematic representation of a single
HIV particle from Wikipedia (http://commons.wikimedia.org/wiki/File:
HIV_Virion-en.png) (b).
(Pol ) encodes for three proteins: the protease, the reverse transcriptase, and the integrase.
The Gag gene is the precursor of four viral structural proteins: p24 (the viral capsid),
p6 and p7 (the nucleocapsid proteins), and p17 (the viral matrix). The third precursor is
encoded by the Env gene and comprises the two subunits of the viral spike, the glycoprotein
gp120 and the transmembrane glycoprotein gp41. As indicated in Figure 2.3 b) the viral
spike consists of 6 proteins: one trimer of gp41 and one trimer of gp120, which is heavily
shielded by glycans. The smaller genes encode for transactivators (Tat, Rev, Vpr ), which
enhance gene expression, and other regulatory proteins (Vif, Nef, Vpu) helping the virus to
be more efficient in its reproduction and counter defense mechanisms of the host cell. For
instance, Vif disrupts the antiviral activity of the human enzyme APOBEC3G (Sheehy
et al., 2002). APOBEC3G causes G-to-A hypermutations in the genome of HIV and other
retroviruses, and thereby ultimately destroys the coding and replicative capacity of the
invading pathogen.
2.2.1 HIV Replication Cycle
Figure 2.5 depicts the replication cycle of HIV: starting from cell entry to maturation of
new infectious viral particles. The turnaround time is estimated with 1.5 days from entry
to the production of new infectious virions. The life cycle of HIV is complex and our
current understanding of each step is the result of extensive research. Nevertheless, many
details are not fully understood, yet. Thus, we focus the description of the replication
cycle on the targets for modern antiretroviral therapy. Further fascinating aspects of the
interplay between viral regulatory proteins and the host as well as the interplay of the
host’s immune system with the viral infection are omitted. For a full description of the
2.2 The Human Immunodeficiency Virus Type 1
11
Figure 2.4: Diagram of the genome organization of HIV. Reprinted from Freed (2004) with
permission from Elsevier.
molecular life cycle see e.g. Fields et al. (2007).
HIV enters the host cell by first binding one of its gp120 surface proteins to the CD4
receptor of the target cell. The importance of the CD4 receptor in HIV cell entry was
identified shortly after the isolation of HIV (Dalgleish et al., 1984). The CD4 receptor is
expressed on CD4+ T cells, macrophages, microglia, and dendritic cells that thereby are
the main targets of HIV. After the anchoring step, the gp120 subunit of the viral spike
undergoes a conformational change with the consequence of exposing an epitope that, in
turn, allows binding to a chemokine receptor (also called a coreceptor). The most important
coreceptors used by HIV in vivo are the chemokine receptors CCR5 and CXCR4 (Berger
et al., 1999). After coreceptor binding, another conformational change occurs in the gp41
subunit of the envelope protein. This change brings viral and host cell membranes in close
proximity with the result of membrane fusion (Esté and Telenti, 2007). Recent results
support the hypothesis that HIV primarily enters the target cell by endocytosis followed
by fusion in the endosome and not by fusion directly at the plasma membrane (see Uchil
and Mothes (2009) for a review). However, the successful fusion is followed by release of the
viral core into the cytoplasm of the target cell. Right after the viral core is uncoated, the
viral enzyme reverse transcriptase (RT) transcribes the viral RNA into a double stranded
DNA (dsDNA). Together with viral and host proteins the dsDNA forms the preintegration
complex (PIC), which is guided to the nuclear pore. Once the PIC has entered the nucleus
of the host cell, the viral enzyme integrase, which was part of the PIC, catalyzes the
integration of the viral DNA into the host chromosome. The integration of the viral DNA
into the host chromosome marks a central point in the HIV infection, as from now on the
cell is irreversibly infected.
The provirus, i.e. the integrated viral DNA, exploits the molecular machinery of the host
cell for the production of new infectious viral particles. To be precise, the viral DNA serves
as a template for the DNA-dependent RNA-polymerase II (pol II ) that produces messenger
RNA (mRNA). This mRNA is (partially) spliced and exported from the nucleus where it
is translated into the different viral (poly-)proteins. In addition, full RNA transcripts of
the complete integrated viral genome are generated and can be integrated into new virions.
The polyproteins Env and Gag/Gag-Pol are transported via different pathways to the
viral membrane where they participate in forming new viral particles. The Env precursor
protein, gp160, is glycolyzed in the endoplasmatic reticulum of the host cell, where the
12
2 AIDS, HIV, and Antiretroviral Therapy
Figure 2.5: HIV replication cycle. The basic steps of the HIV replication cycle: viral entry,
reverse transcription, integration, and formation of infectious particles. See
main text for further details. Figure reprinted with kind permission of Saleta
Sierra, Department of Virology, University Cologne.
monomers also undergo oligomerization. The predominant form is a trimer. The gp160
complexes are further transported to the cell’s Golgi complex where they are cleaved into
their subunits gp120 and gp41. There, cleavage is mediated by cellular enzymes and not
by the viral protease. The heavy glycosylation of gp120 provides the means for HIV to
shield its surface protein with a sugar coat against the immune response of the host.
The Pol polyprotein is only expressed as part of the larger Gag-Pol protein, which in
turn is a result of a -1 translational frameshift (Jacks et al., 1988). The ratio of Gag to
Gag-Pol, which is approximated to be 20:1 (20 copies of the Gag polyprotein for each
Gag-Pol polyprotein), is crucial for the successful replication of HIV (Shehu-Xhilaga et al.,
2001). The HIV protease (PR) is part of Gag-Pol and only active as a dimer. In fact,
enzyme activation is initiated when the PR domains of two Gag-Pol precursors dimerize
and the complex begins with intramolecular activity followed by intermolecular activity
(Pettit et al., 2004). The shorter Gag protein is rapidly targeted to the inner surface of the
cell membrane after its synthesis. At the membrane the precursor protein is then cleaved
by the viral PR during or after budding from the host cell into its four subunits, which
then form the viral matrix, capsid, and nucleocapsid, respectively. Only fully maturated
viruses are able to successfully infect new host cells.
2.3 Clinical progression of an HIV infection
13
2.3 Clinical progression of an HIV infection
The main infection route of HIV is via sexual transmission across a mucosal surface. The
actual risk of transmission varies and greatly depends on applied sexual practices. Maleto-male transmissions are one order of magnitude more likely than male-to-female and
female-to-male transmissions. Further prominent routes of infection are mother-to-child
transmission (via birth and/or breast feeding) and needle sharing among injection drug
users. In the beginning of the HIV pandemic, also tainted blood transfusions were responsible for new infections.
The two main markers that are used for monitoring HIV infections are the viral load
(copies of HIV RNA per milliliter (ml) blood plasma) and the CD4+ T cell count per
microliter (µl) blood. During the acute phase following primary infection, the viral load
increases rapidly and reaches a peak of 106 to 107 copies/milliliter blood. HIV directly
infects and causes the death of cells that are critical for effective immune response, and thus
the CD4+ T cell count decreases rapidly during this phase from over 1000 (as observed in
healthy individuals) to around 500. Symptoms of acute HIV infection (for example fever,
head ache, weight loss) appear about two weeks after exposure to the pathogen and are
rather nonspecific and therefore, in general, are not distinguishable from infections with
other viruses like influenza. Consequently, those symptoms are not used to diagnose an
HIV infection. Moreover, the nonspecific symptoms result in an unrecognized infection,
which in turn, leads to transmission of the virus to further individuals.
It is now known that, regardless of the route of transmission, HIV rapidly establishes a
persistent infection of the gut-associated lympohid tissue, which is also the principal site of
its replication. Arthos et al. (2008) showed that the HIV surface protein gp120 binds via
one of its loops (V2) to the receptor integrin α4 β7 , which in turn mediates the migration
of leukocytes to the gut (von Andrian and Mackay, 2000). Furthermore, the gp120-α4 β7
complex facilitates the activation of lymphocyte function-associated antigen 1 (LFA-1) on
CD4+ T cells. LFA-1 is a central element for establishing virological synapses (Bromley
et al., 2001) and consequently paves the way for HIV cell-to-cell transmission. Right after establishing the infection in the gut, the gut-associated lymphoid tissue undergoes a
substantial depletion of CD4+ T cells. It is estimated that half of the CD4+ memory cells
are killed during the initial attack (Mattapallil et al., 2005). Picker (2006) proposed that
this depletion represents an irreversible damage to the immune system that ultimately
results in AIDS. Another characteristic of the acute phase is the establishment of a latent
viral reservoir. More precisely, HIV infects resting memory CD4+ cells that, despite the
integrated provirus, remain replication-competent. The provirus remains dormant in this
reservoir, i.e. no viral proteins are actively transcribed, thus the infected cells are not especially targeted by the immune system. In this state, the provirus can survive for decades
until eventually HIV replication is initiated and new viral particles are produced. Of note,
the latency of the virus is regulated by epigenetic mechanisms (Kauder et al., 2009). Epigenetics refers to the change of phenotype or gene expression based on mechanisms other
than changes in the genetic code. For instance, attachment of methyl groups to cytosines
in the DNA stably alters gene expression profiles.
Towards the end of the acute phase, the viral load drops to a level of 103 to 104 and
the immune system partially recovers from the primary attack as indicated by an increase
14
2 AIDS, HIV, and Antiretroviral Therapy
Figure 2.6: Typical progression of an untreated HIV infection based on Pantaleo et al.
(1993).
in CD4+ T cells. The acute phase is followed by a phase of clinical latency where the
viral load slowly increases and the CD4 count continuously depletes. It is thought that
HIV kills CD4+ T cells directly mainly by inducing apoptosis or necrosis in infected cells.
Nevertheless, it is believed that the continuous decrease of CD4+ T cells is a result of the
combination of the persistent immune hyper-activation (Hazenberg et al., 2003) followed
by activation-induced cell death (Green et al., 2003) of T cells and the massive depletion
of mucosal CD4+ cells during the acute phase (Grossman et al., 2006).
At the point where the patient’s immune system is too weak, opportunistic diseases and
malignancies occur. Most HIV/AIDS-related complications are the result of bacterial or
viral challenges to the already depleted immune system. The diseases range from cancers,
which occur rarely among uninfected people, like Karposi’s sarcoma as a result of an
infection with human herpes virus type-8 to pneumonia induced by Pneumocystis jirovecy.
The progression of an untreated HIV infection is depicted in Figure 2.6.
AIDS follows the phase of clinical latency and is defined by either reaching a CD4 count
of less than 200 cells per µl or the manifestation of specified opportunistic infections. If untreated, AIDS is succeeded by death of the patient from these infections. Progression of the
primary infection to AIDS varies among patients, ranging from only six months (Markowitz
et al., 2005) to more than 25 years. One predictor of the rate of progression of the disease is the level of viral load established after one year without treatment (Lyles et al.,
2000). Two groups of patients are able to control the replication rate of the virus such that
the viral load does not exceed 5000 copies per ml (termed long-term non-progressors) or
even 50 copies per ml (termed elite or natural controllers). These patients are recruited
for genome-wide association studies that aim at uncovering the factors for their superior
control of the infection (Fellay et al., 2007).
2.4 HIV Treatment and Drug Resistance
15
2.4 HIV Treatment and Drug Resistance
Since the discovery of HIV a large array of antiretroviral drugs has been developed, targeting several stages of the viral life cycle. The first drug zidovudine (abbreviated ZDV or
AZT) was approved by the U.S. Food and Drug Administration (FDA) already in 1987 –
only four years after the isolation of the virus. ZDV is a nucleoside reverse transcriptase
inhibitor (NRTI) that targets the viral reverse transcriptase (RT), and thereby inhibits the
process of generating DNA from the viral RNA. The synthetization of ZDV dates back to
1964.
The introduction of the first antiretroviral drug was a milestone in HIV therapy – this triumph, however, was only of short nature. Larder et al. (1989) and Larder and Kemp (1989)
reported the emergence of resistant variants of HIV after prolonged treatment with ZDV.
A major cause for the emergence of resistance is the viral RT itself, since the mechanism of
reverse transcription is error prone and the enzyme lacks a proof-reading mechanism. Gao
et al. (2004) estimated a mutation rate of HIV with 5.4×10−5 mutations per nucleotide per
round of replication (initially Mansky and Temin (1995) reported 3.4 × 10−5 mutations per
nucleotide per round, their result, however, was based on shorter fragments of viral RNA).
Thus, the RT eventually generates a viral variant that is less susceptible to the drug and
therefore is selected evolutionarily under treatment. By now, resistance mutations in all
viral target proteins against all available compounds have been reported (Johnson et al.,
2008).
2.4.1 Antiretroviral drugs
Since the introduction of zidovudine a multitude of anti-HIV drugs has been approved by
the FDA and EMEA (European Medicines Agency) for treating HIV infections. Table 2.2
lists all currently FDA approved drugs. The targets of those drugs and their mode of
action will be explained in the following paragraphs. Here we proceed in order of the steps
of the viral replication cycle that they block.
Despite the large number of already existing anti-HIV drugs, multiple drugs in established and novel drug classes are under investigation. Table 2.3 lists an excerpt of currently
investigated drugs.
intercept the viral replication at the first possible point:
entry of the viral core into the cytosol of the host cell. Of all approved anti-HIV drugs, entry
inhibitors are the only drugs that target a host protein rather than a viral protein. The
development of entry inhibitors targeting human receptors originates from the observation
that 4%-16% of the European population (higher rates in the north and lower rates in
the south) have a homozygous ∆32 mutation in the CCR5 gene rendering the resulting
CCR5 receptor nonfunctional (Novembre et al., 2005). However, the population affected
by the genetic defect was also reported to be virtually immune against HIV infections,
while on the other hand, no severe side effects resulting from the nonfunctional receptor
are known (Novembre et al., 2005). In addition to maraviroc, which is a CCR5 inhibitor
and currently the only approved entry inhibitor, more CCR5 and also CXCR4 inhibitors
are under investigation (Esté and Telenti, 2007). Prior to the use of coreceptor blockers, it
Entry and Fusion Inhibitors
16
2 AIDS, HIV, and Antiretroviral Therapy
Generic name
Abbreviation
Trade name
Nucleoside and nucleotide reverse transcriptase inhibitors (NRTIs)
zidovudine
ZDV, AZT
Retrovir
didanosine
ddI
Videx
zalcitabine
ddC
Hivid
stavudine
d4T
Zerit
lamivudine
3TC
Epivir
abacavir
ABC
Ziagen
tenofovir
TDF
Viread
emtricitabine
FTC
Emtriva
Non-nucleos(t)ide reverse transcriptase inhibitors (NNRTIs)
nevirapine
NVP
Viramune
delavirdine
DLV
Rescriptor
efavirenz
EFV
Sustiva
etravirine
ETR, ETV
Intelence
Protease inhibitors (PIs)
saquinavir
SQV
Fortovase, Invirase (SQV + RTV)
ritonavir
RTV
Norvir
indinavir
IDV
Crixivan
nelfinavir
NFV
Viracept
fos-/amprenavir FPV/APV
Lexiva/Agenerase
lopinavir
LPV
Kaletra (LPV+RTV)
atazanavir
ATV
Reyataz
tipranavir
TPV
Aptivus
darunavir
DRV
Prezista
Fusion inhibitors (FIs)
enfuvirtide
ENF, T-20
Fuzeon
Entry inhibitors (EIs)
maraviroc
MVC
Selzentry
Integrase inhibitors (InIs)
raltegravir
RAL
Isentress
FDA approval
1987
1991
1992-2006
1994
1995
1998
2001
2003
1996
1997
1998
2008
1995
1996
1996
1997
2003/1999
2000
2003
2005
2006
2003
2007
2007
Table 2.2: Antiretroviral drugs approved by the FDA. Table is based on information
from http://aidsinfo.nih.gov.
2.4 HIV Treatment and Drug Resistance
Generic name
NRTIs
apricitabine
elvucitabine
racivir
NNRTIs
rilpivirine
lersivirine
MIs
bevirimat
vivecon
EIs
AMD11070
vicriviroc
ibalizumab
InIs
elvitegravir
Abbreviation
Phase
ATC
ACH-126443
RCV
III
II
II
TMC278
UK-453061
III
II
BVM
MPC-9055
II
II
AMD070
VCV
TNX-355
PRO 140
II
III
II
II
CXCR4 antagonist, trial was halted
CCR5 antagonist
monoclonal antibody targeting CD4
monoclonal antibody targeting CCR5
EVG
III
high cross resistance with RAL
17
Comment
Table 2.3: Some investigational antiretroviral drugs. Table is based on information
from http://aidsinfo.nih.gov and http://www.aidsmeds.com.
is necessary to determine which coreceptor is used by the virus for entering (see Lengauer
et al. (2007) for a review).
A further mechanism of preventing HIV from entering a target cell is to inhibit fusion
of virus and host cell membranes. The currently only licensed fusion inhibitor (FI), Enfuvirtide (abbreviated ENF or T-20), is a synthetic peptide that mimics the amino acids
127-162 of gp41. ENF binds to a subunit of gp41 and therefore prevents the required conformational change that facilitates the fusion of host and viral membrane. Drug resistance
mutations are usually located in the ENF-binding site on gp41 (direct resistance) or confer
resistance indirectly via mutations in other regions of gp41 and even in gp120 (Miller and
Hazuda, 2004). The latter mechanism is less well understood. The two major drawbacks
of ENF are its price (approximately 2,200 Euros for one month of treatment compared
to e.g. 400 Euros for one month of ZDV treatment in Germany; price estimates are
based on retail prices at http://www.docmorris.de and the recommended daily dose at
http://aidsinfo.nih.gov) and the way of administration: ENF has to be injected twice
daily, and thereby causing a number of skin irritations.
interfere with the process of generating a DNA copy of
the viral genome. The reverse transcriptase (RT) functions as a heterodimer comprising
the p66 and p51 subunits. The p66 subunit is the full product of the RT region of the Pol
gene, while the p51 subunit is obtained from the p66 unit by removal of a fragment at the
C-terminus. This cleavage process is catalyzed by the viral protease. The p51 subunit plays
only a structural role while the larger unit has two enzymatic centers: a DNA polymerase
Reverse Transcriptase Inhibitors
18
2 AIDS, HIV, and Antiretroviral Therapy
and a ribonuclease H (RNaseH). The DNA polymerase copies either viral RNA or DNA,
while the RNaseH is responsible for the degradation of tRNA primers and genomic RNA
present in DNA-RNA hybrid intermediates. There are two classes of reverse transcriptase
inhibitors that are distinguished by their mode of action.
The group of nucleoside/nucleotide reverse transcriptase inhibitors (NRTIs) are nucleoside and nucleotide analogs that are incorporated by the viral RT into the newly synthesized
DNA strand. These analogs lack a free 3’-hydroxyl group and therefore terminate the transcription process after their incorporation into the DNA. The cost of NRTIs for one month
of treatment varies from 300 Euros to 500 Euros (all prices apply to Germany only as they
are based on information from http://www.docmorris.de). Three different mechanisms,
associated with specific sets of mutations, used by HIV to lower the effect of NRTIs have
been observed (see Sluis-Cremer et al. (2000) for a review). These mechanisms involve
improved discrimination between NRTIs and their real dNTP counterpart, enhanced removal of the chain terminating NRTI at the 3’ end, and alteration in RT-template primer
interactions.
Non-nucleoside reverse transcriptase inhibitors (NNRTIs) form the second group of RT
inhibitors. NNRTIs are small molecules that inhibit the RT by binding to a hydrophobic
pocket in the proximity of the active site of the enzyme. Once the inhibitor is bound, it
impairs the flexibility of the RT resulting in its inability to synthesize DNA. Resistance
to NNRTIs occurs by mutations that reduce the affinity of the inhibitor to the protein.
Usually, a single mutation selected by one NNRTI is sufficient to confer complete resistance
to all compounds of the drug class (Clavel and Hance, 2004). This phenomenon is termed
cross-resistance. Thus, newer inhibitors like etravirine (ETR) are designed such that they
still exhibit high affinity in the presence of mutations induced by other NNRTIs. One
month of NNRTI treatment costs about 450 Euros for older drugs and 650 Euros for
recently approved products.
Figure 2.7 b) shows the structure of a functional heterodimer with typical NRTI and
NNRTI resistance mutations highlighted in the p66 subunit.
aim at preventing the enzyme from integrating the viral DNA into
the host chromosome. The integrase (IN) functions as a tetramer. Each monomer, which is
cleaved out by the PR from the C-terminal portion of the Gag-Pol polyprotein, has three
domains. The N-terminal domain contains a HH-CC zinc finger motif that is partially
responsible for multimerization, optimal activity, and protein stability. The DDE motif in
the core domain forms the catalytic triad. The C-terminal domain binds non-specifically
to DNA with high affinity. So far it was not possible to crystallize the entire 288-aminoacid long protein due to its low solubility and propensity to aggregate. As an intermediate
solution, the three domains have been crystallized individually. Moreover, structures of the
N-terminus plus the core domain (Wang et al., 2001) and the core domain plus C-terminal
domains (Chen et al., 2000) have been generated. The integration of the viral DNA requires
three subsequent steps. During the 3’ processing step the integrase removes a dinocleotide
from the long terminal repeat of each HIV-DNA strand. This step is followed by a process
termed strand transfer occurring in the nucleus where the integrase cuts the cellular DNA
and covalently links the viral DNA 3’ ends to the target DNA. The final step, the required
gap repair, is believed to be carried out by host DNA repair enzymes (Yoder and Bushman,
Integrase Inhibitors
2.4 HIV Treatment and Drug Resistance
19
2000).
Currently only one FDA approved integrase inhibitor (INI) is available. Raltegravir
(RAL) is a strand transfer inhibitor that interferes with the process by binding to the DDE
motif in the catalytic domain (Hazuda et al., 2000). Successful inhibition of the integration
process leaves the viral DNA in the nucleus, where it is recircularized by the host’s repair
enzymes. Hence, the HIV life cycle is interrupted. The mechanism of resistance of HIV
against INIs is still subject to investigation. Some clinically relevant mutations are found
to increase the rate of dissociation of the inhibitor from the IN-DNA complex (Grobler
et al., 2009). The price for one month of raltegravir treatment can be assessed with 1,100
Euros.
interfere with the process of forming new infectious
viral particles. The viral enzyme mainly engaged in virion maturation is the viral protease (PR). The PR cleaves the larger precursor proteins Gag and Gag-Pol into smaller
functional units. Unlike its cellular relatives the viral PR acts as a true homodimer. Each
monomer comprises 99 amino acids and the active site is formed by a cleft in between these
symmetrically arranged monomers (Figure 2.7 a). The cleft accommodates the substrate
to be cleaved and the two flexible flaps stabilize the substrate in the active site. The active
site has room for a peptide of approximately seven amino acid length.
Because of its important role in virion maturation, the viral PR was soon subject of investigation for potential inhibitors. Protease inhibitors (PI) are small molecules that bind
to the active site of the protease and therefore compete with its natural substrates. In
fact, the design of protease inhibitors constitutes a good example of structure-based drug
design, since the chemical structure of the inhibitors were chosen to mimic the structure
of the viral peptides that are naturally recognized and cleaved by the protease. There are
two mechanisms that result in resistance of HIV against PIs. The first one is the exchange
of amino acids in the PR such that the affinity to the inhibitor is decreased while the
natural substrates can be bound efficiently (Clavel and Hance, 2004). Modifications of the
affinity to the natural substrate alter also the efficiency of the protease. Thus, the second
mechanism introduces compensatory mutations aiming at reestablishing the efficiency of
the enzyme while maintaining resistance against the inhibitor. These compensatory mutations can occur either in the protease or in its substrate, i.e. at cleavage sites (Nijhuis
et al., 2007). As in the case of NNRTIs, the new generation of PIs shows antiviral activity
in the presence of mutations selected earlier by other PIs. The expenses for one month of
PI treatment range from 800 Euros to 1200 Euros.
Another class of drugs that interfere with the production of mature virions are maturation inhibitors (MIs). Unlike PIs they bind to the substrate of the protease instead of to
the protease itself (Salzwedel et al., 2007). Currently no MI is approved by the FDA. The
most advanced MI, bevirimat, prevents cleavage by the protease at the junction between
the capsid and a spacer in the Gag precursor protein. Unfortunately, it was shown that a
natural and frequently occurring polymorphism in Gag inactivates the antiviral activity of
bevirimat (McCallister et al., 2008; Baelen et al., 2009), and therefore gives the inhibitor
an unfavorable perspective (Verheyen et al., 2009).
Protease and Maturation Inhibitors
20
2 AIDS, HIV, and Antiretroviral Therapy
(a) HIV protease
(b) HIV reverse transcriptase
Figure 2.7: Structural models of viral targets for antiviral therapy rendered with the
software PyMOL. (a) Protease with bound inhibitor Amprenavir. The two
monomers of the functional homodimer are colored differently (PDB entry
3EKV). (b) Structural model of the functional RT dimer comprising the subunits p66 (mangenta) and p51 (cyan) from PDB entry 1RTJ. Typical NRTI
resistance mutations are highlighted in orange, and NNRTI resistance mutations in green.
Novel drugs are under investigation in nearly all established drug
classes described in the previous paragraphs. However, also novel targets or mechanisms are
under ongoing investigation. The studied approaches range from new ways of attacking old
targets, e.g. small peptides that prevent formation of an active RT dimer by blocking the
binding of the p51 and p66 subunits (Agopian et al., 2009), aptamers targeting the RNaseH
activity of the RT (Andreola et al., 2001), or integrase binding inhibitors with activity at
the 3’ processing step (Thibaut et al., 2009), to more exotic approaches. Sarkar et al.
(2007) reported a successful excision of the integrated provirus from the host chromosome.
A successful drug based on this mechanism would eventually allow to cure HIV infections,
rather than (merely) delaying disease progression. A further possibility to eradicate the
latent reservoirs of HIV is to trigger the transition from the latent phase to the active
phase by targeting the epigenetic regulation (Kauder et al., 2009). Finally, targeting the
frame shifting that maintains the ratio of Gag to Gag-Pol required for the generation of
infectious virions is an option (Gareiss and Miller, 2009).
Future Developments.
2.4.2 The Era of Highly Active Antiretroviral Therapy
The rapid resistance development of HIV against individual drugs required a new pharmaceutical strategy. Soon after the release of ZDV other nucleoside analogs were marketed
and dual therapy comprising two NRTIs was the first attempt to control viral replication (Hammer et al., 1996). The approach of combining several antiretroviral compounds
benefited the most from the release of drugs in other drug classes: NNRTIs and PIs.
This marks also the start of the era of highly active antiretroviral therapy (HAART) in
2.4 HIV Treatment and Drug Resistance
21
1995. HAART combines a minimum of three drugs from at least two different drug classes
(Clavel and Hance, 2004). A typical HAART combines two NRTIs plus either one PI or
one NNRTI (Dybul et al., 2002). The rationale of HAART is to maximally suppress viral
replication and thereby delay the progression of the infection to AIDS and death. The
use of different drug targets provides the means of erecting a higher barrier for HIV to
escape the regimen by developing resistance mutations. The success of HAART is based
on the fact that HIV has to acquire multiple resistance mutations against the different
drugs in the regimen. Here, the use of multiple drug classes ensures that different sets
of resistance mutations are required for a successful escape. Thus, the combination of a
successfully suppressed viral load, i.e. few viruses that can perform the “experiment”, and
a high genetic barrier to resistance, i.e. requirement of multiple mutations, substantially
slows down the progression of the infection.
The success of HAART was manifest in the immediate decline in HIV related mortality
following its introduction (Mocroft et al., 1998; Crum et al., 2006). Despite its success,
HAART presents a burden to the patients as many pills have to be taken on a daily basis.
Moreover, a strict adherence to the treatment schedule is required for optimal suppression
of viral replication. In addition, every drug comes with a extensive list of side effects ranging
from headache and diarrhoea to lipodystrophy (fat redistribution) and even peripheral
neuropathy (nerve damage) and therefore can seriously impair the patient’s quality of life.
Given that currently there is no cure for HIV and antiretroviral treatment is a life long
struggle, pharmaceutical companies are aiming at improving pharmacokinetics and the side
effect profile of the drugs with the aim to render treatment more bearable. Nowadays, some
antiretroviral combination therapies are available as a “once-daily” treatment delivered
by a single pill. For example, the drug Atripla comprise three antiretroviral compounds
(FTC+TDF+EFV) and is typically used for first-line treatment (one month of treatment
costs approximate 1,300 Euros).
A further milestone in HAART was the discovery that the protease inhibitor ritonavir
interferes with the liver enzyme cytochrome P450 (Kumar et al., 1996). This enzyme
is involved in the metabolic processing of most protease inhibitors. Thus, the use of a
small dose of ritonavir inhibits the liver enzyme, and helps to maintain optimal levels of
other protease inhibitors in the patient’s blood for a longer period of time. The boosting
of protease inhibitors with ritonavir is standard as of 2001 – following the introduction
of Kaletra (LPV+RTV) – and is usually denoted by PI/r. Currently, pharmaceutical
companies are searching for further compounds achieving the same effect as ritonavir but
with fewer side effects.
Despite of today’s potency of HAART, drug resistance is still an issue. By acquiring drug
resistance mutations the virus regains the ability to replicate in the presence of the drugs
leading to a high viral load, which marks virological failure. Furthermore, virological failure
usually precedes immunological failure marked by substantial decrease of CD4+ T cells.
A factor that contributes to resistance development in patients undergoing HAART is
incomplete adherence (Harrigan et al., 2005; Glass et al., 2008). Patients not taking their
drugs at all do not impose enough selective pressure on the virus. Conversely, if the
drugs are taken correctly and optimal drug levels are always reached, then the virus has
little chance to develop mutations. On the other hand, drugs taken at irregular intervals
leave the opportunity of developing mutations that are subsequently selected due to the
22
2 AIDS, HIV, and Antiretroviral Therapy
selective pressure. Unfortunately, drug resistance may also appear in patients with perfect
prescription refill rates. Harrigan et al. (2005) found at least one abnormally low drug
plasma concentration in 36% of the patients with 95% prescription refill rate (i.e. almost
perfect adherence to treatment). This points to either host genetic factors lowering the
amount of available drug in the patient’s body or to incomplete adherence during the
daily schedule, e.g. all drugs for a day are taken in the morning. A major factor towards
non-adherence are again the side effects of the compounds. Patients occasionally remove
drugs from an effective HAART without consulting with their physician in order to achieve
remedy for troubling side effects.
2.5 HIV Therapy Management
The circumstance that HIV evolves into drug resistance is further complicated by the fact
that resistance mutations selected by one drug also confer resistance against other drugs
from the same class. This phenomenon of cross-resistance is most pronounced among
NNRTIs where, prior to the release of etravirine, a single mutation could render the complete drug class useless. The second generation of NNRTIs and PIs has therefore been
designed in a way that they retain antiviral activity in the presence of commonly selected
resistance mutations by older drugs.
In order to ensure an effective treatment the extent of drug resistance has to be appraised
prior to onset of the regimen. Resistance of HIV against antiretroviral drugs can be assessed
by two different approaches. The first approach, phenotyping, measures the resistance of
the virus in an experimental assay (see e.g. Walter et al. (1999) and Petropoulos et al.
(2000)). In such assays the replication of the patient’s virus (clinical isolate) is compared to
the replication of a reference wild type strain in the presence of a varying concentration of
the drug. The concentration that cuts the replication rate in half is termed 50% inhibitory
concentration (IC50 ). The fold-change (FC) in resistance, also often referred to as resistance
factor (RF), is then simply the quotient between the IC50 of the clinical isolate and the
reference virus:
IC50 (clinical isolate)
FC =
.
IC50 (reference isolate)
Figure 2.8 depicts an example of dose-response curves generated by a phenotypic resistance
test. The experiment has to be carried out for every single drug, thus rendering the whole
process time- and cost-intensive – results are available within weeks and costs are in the
range of 1,000 Euros. Moreover, owing to the nature of the assay the test is restricted to
few specialized laboratories with sufficient security level and expertise.
The second method focuses on the genetic sequence of the viral drug targets. In genotyping the viral sequence is obtained using standardized fast and cheap methods (result
available within days at costs of 100 Euros) that can be carried out in virtually every
laboratory. The outcome of genotyping is simply a list of mutations compared to a wild
type virus and clearly much harder to interpret in terms of drug resistance as the single
number per drug generated by phenotyping. In order to assist the interpretation of the
viral genotype the international AIDS society (IAS) maintains an annually updated list of
resistance mutations (Johnson et al., 2008). This list, however, mainly informs about the
mere existence of mutations, the necessary knowledge about how many of the mutations
2.5 HIV Therapy Management
23
Figure 2.8: Dose-response curve of a phenotypic drug resistance test. Reprinted by permission from Macmillan Publishers Ltd: Nature Reviews Microbiology (Lengauer
and Sing, 2006), copyright (2006).
and in which combination are required to confer intermediate or complete resistance is
not provided. The missing semantics of the resistance mutations is introduced by rulesbased interpretation systems developed by virologists and clinicians (see Lengauer and Sing
(2006) for a review).
The benefit of resistance tests (genotypic or phenotypic) has been demonstrated in
prospective clinical trials. For example, Durant et al. (1999) showed that treatment decisions based on a genotype have a significantly better outcome than uninformed readjustments of medication. Moreover, Oette et al. (2006) showed that genotypic resistance
analysis is beneficial even in treatment-naïve patients, i.e. patients that have never received
any antiretroviral drug. The observed improvement is a direct consequence of transmitted
drug resistance mutations, where patients are infected with already resistant viral strains.
The advantage of resistance testing and the broad availability of genotyping led to routine
sequencing of relevant parts of the viral genome. The most relevant part comprises the 99
amino acids of the viral protease and up to the first 240 amino acids of the RT. These data
are now generated routinely in laboratories for genotype based resistance tests and are
stored together with frequently obtained markers of the treatment success (viral load and
CD4+ T cell count) and detailed information about the patient’s regimen (drugs, start,
stop, cause of stop) in databases (Rhee et al., 2003; Roomp et al., 2006). These data collections provide a rich basis for further developments in terms of resistance interpretations
as we will show in the remainder to this thesis.
2.5.1 Rules-Based Interpretation Systems
The need for adding semantics to the resistance mutations promoted the development of
many rules-based interpretation systems. The rules are generated by expert panels comprising virologists and clinicians on the basis of genotype-phenotype relationships, publica-
24
2 AIDS, HIV, and Antiretroviral Therapy
tions, clinical response (in the form of mutations observed in patients failing antiretroviral
therapy), and personal experience. The predictions by different rules-based interpretations
are often discordant owing to their history of origins, which is manifest in the differing sets
of rules and design principles. Frequently the discordant predictions lead to confusion by
users consulting multiple decision support systems (De Luca et al., 2003). Moreover, the
degree of disagreement can also depend on the HIV-1 subtype (Snoeck et al., 2006) thereby
reflecting the general lack of knowledge on non-B subtypes. The rule sets are maintained
regularly in meetings of the expert boards, and the rules have to be updated because of
the ongoing release of new antiretroviral drugs. Rule sets for novel compounds have to be
added as well as the impact of mutations selected by new compounds on the established
drugs has to be studied. Thus, over time the rules undergo refinement that reflects the
state-of-the-art knowledge on HIV drug resistance.
Nowadays, antiretroviral therapy comprises multiple drugs. The decision support systems, however, provide only classifications for individual drugs. In order to overcome this
limitation, the verbal classification provided by the tools is often mapped to a rating between 0 and 1 assessing the susceptibility of the virus against an individual drug. The
ratings of individual drugs are summed to form a genotypic susceptibility score (GSS) that
usually ranges between 0 and the number of drugs in the regimen (De Luca et al., 2003).
For treatment-naïve patients clinicians usually aim at achieving a GSS of 3.0 (i.e. three
fully active drugs) while for treatment experienced patients with heavily mutated viruses
a GSS of 2.0 is recommended.
The rules-based systems enjoy great popularity among clinicians and virologists involved
in HIV patient care, mainly owing to the fact that they are not providing “black box”
predictions. More precisely, the basis of the classification, i.e. the rule sets, are freely
available (for most systems) and thereby allow users to follow the rationale behind the
decision. In the following we provide a brief overview about the most popular rules-based
decision support systems.
The interpretation system provides a three-level classification of drug
resistance and is developed by the French ANRS (National Agency for AIDS Research)
AC 11 Resistance group. Their set of rules is now available in version 18 on the web site:
http://www.hivfrenchresistance.org/ (Meynard et al., 2002).
ANRS 07/2009 V18
Is a rule set maintained by the Rega Institute of the Catholic University in Leuven. It also provides a three-class rating. In contrast to the ANRS system it
applies a weighted score for the conversion from the verbal classification (i.e. susceptible,
intermediate, and resistant) into the scores that are then used to compute the GSS. Precisely, the score for boosted PIs is 0.75 and 1.5 instead of 0.5 and 1 for intermediate and
fully susceptible viruses, respectively, and the score corresponding to intermediate resistance of INIs, EIs and some NNRTIs is 0.25 instead of 0.5. The specifications are available
at http://www.rega.kuleuven.be/cev/ (Van Laethem et al., 2002).
Rega Version 8.01
The HIVdb system is maintained by the Stanford university and
available at http://hivdb.stanford.edu. Unlike in the other systems, each mutation
receives a score between -10 and 60 and therefore the systems resembles rather a linear
HIVdb Version 6.0.5
2.5 HIV Therapy Management
25
model with expert-derived weights than a rules-based system. Negative scores indicate
resensitization effects, i.e. a virus with such mutations is more susceptible against that
particular drug in the presence of the mutation. Based on the weights the resistance
against each drug is assigned to one of five possible classes (Rhee et al., 2003). Drugs scored
with 0-9 are estimated to be “Susceptible”, 10-14 indicates “Potential low-level resistance”,
while 15-29 and 30-59 correspond to “Low-level resistance” and “Intermediate resistance”,
respectively. Values of 60 or greater indicate “High-level resistance”. Moreover, the web
service provides implementations of other rule sets like ANRS and Rega and additionally
offers the option to use a personalized rule set.
HIV-grade is an interpretation system maintained by HIVgrade e.V. an association of German clinicians and virologists. Like HIVdb it provides the
possibility to compare different systems (all those mentioned above including geno2pheno
described in Section 2.5.2). HIV-grade differs from the other systems by providing special
rules for some drugs. These special rules consider which other drugs are part of the
combination. In particular, the additional rules are aiming to capture mutations with
resensitization effect and require that the drug selecting for the resensitization mutation
is maintained in the regimen for keeping up the selective pressure. The susceptibility
for each drug is provided in a three-class system. The web service is available at http:
//hiv-grade.de.
HIV-grade Version 12-2008
The AntiRetroScan algorithm is maintained by the Italian
Antiretroviral Resistance Cohort Analysis (ARCA) consortium (Zazzi et al., 2009). Like
HIVdb it provides five resistance classes and like the Rega rules it applies different weights
for different drug classes that should be used to compute a GSS. The rule set is available
at http://www.hivarca.net/includeGenpub/AntiRetroScan.htm.
AntiRetroScan version 2.0
In addition to the free decision support systems introduced in
the previous paragraphs, there also exists a number of proprietary rule sets that are
part of diagnostic kits. Those kits usually include a sequencing machine and primers
needed for the sequencing. Moreover, the interpretation software has to be approved
by the FDA. An example of a proprietary system is the ViroSeq (Version 2.8) rule set,
which is developed by Celera Diagnostics and provides three output categories (http:
//www.celeradiagnostics.com/cdx/ViroSeq). The tool accompanies the ViroSeq HIV-1
Genotyping System developed by Abbott. Likewise, the GuideLines Rules (Version 14) are
updated annually by an independent expert panel and accompany the TRUGENE HIV-1
Genotyping Assay offered by Siemens (http://www.medical.siemens.com/).
Proprietary Systems
2.5.2 Predicting the Drug Resistance Phenotype from Genotype
Currently, there exists a multitude of expert-based decision support systems. The systems
frequently disagree on the classification results, which is no surprise given the different design principles and experts involved in the numerous boards. However, there has been the
desire to base the classification of drug resistance on a more objective foundation. Obviously, an objective criterion of drug resistance is the phenotypic drug resistance as measured
26
2 AIDS, HIV, and Antiretroviral Therapy
in vitro by experimental assays. There are two frequently used tools that predict phenotypic drug resistance from the viral genotype. One system, VircoTYPE, is proprietary and
maintained by the company Virco BVBA (Mechelen, Belgium), the other, geno2pheno,
is a freely available web service offered via http://www.geno2pheno.org (Beerenwinkel
et al., 2002, 2003a). Both systems use statistical learning to infer drug resistance models
from matched genotype-phenotype pairs. Application of these models allows to infer the
more expensive and objective information on the basis of the cheap and easily available
genotype.
The first version of geno2pheno used decision trees (Breiman et al., 1984) to classify
the virus as resistant or susceptible with respect to several drugs. The training data for the
initial models comprised approximately 450 genotype-phenotype pairs (Beerenwinkel et al.,
2002). The decision trees achieved moderate prediction performance but demonstrated in
an appealing way, how rules of drug resistance can be automatically derived from data.
Precisely, decision trees are a white box classifier, i.e. in addition to the classification result
the user can learn what input led to the final decision, as opposed to black box models
like neural networks and support vector machines (SVMs) with non-linear kernels (Boser
et al., 1992). In a later version of geno2pheno a SVM-based prediction was added to
the web service and the models were trained on approximately 650 genotype-phenotype
pairs (Beerenwinkel et al., 2003a). The SVM exhibited better prediction performance
than the decision trees and allowed to solve a regression problem, i.e. predicting the FC in
resistance rather than only a class label. Because linear SVMs cannot operate on categorical
data, e.g. amino acid sequences, the viral genotype was represented as binary vector, with
one indicator for every possible amino acid at each sequence position. Hence, the encoding
required 20 times the length of the amino acid sequence. The model interpretation of
SVMs, however, is not as trivial as in the case of decision trees. Based on the work by Sing
et al. (2005b) the linear SVM models for the individual drugs could be interpreted and a list
of scored mutations (main contributers to observed resistance) is now provided with every
prediction (for details see Section 3.1). The current version of geno2pheno is trained on
approximately 1,000 genotype-phenotype pairs per drug.
VircoTYPE uses also a linear model but, unlike geno2pheno, uses linear regression with
pairwise interaction terms for mutations. The statistical learning task was carried out on
a median of 46,100 genotype-phenotype pairs per drug (Vermeiren et al., 2007). Again,
the use of a linear model facilitates model interpretation. In contrast to geno2pheno,
VircoTYPE is a commercial tool that charges the user for every prediction.
2.6 Further Advancements
Since the first methods that predict in vitro drug resistance have been published, this task
was subject of numerous publications. The wealth of publications was also boosted by a
freely available genotype-phenotype dataset published by Stanford’s HIVdb (Rhee et al.,
2006). The publications range from use of established statistical learning models like simple linear models (Wang et al., 2004), linear models including interaction terms (Vermeiren
et al., 2007), and rules-learning algorithms (Kierczak et al., 2009) to the development of
novel techniques like non-parametric methods (DiRienzo et al., 2003) and machine learning
approaches aiming at the discovery of patterns of resistance mutations (Saigo et al., 2007).
2.6 Further Advancements
27
Other approaches extended the genotype-phenotype relationship by incorporating descriptors of the antiretroviral drug in the prediction model (Drăghici and Potter, 2003; Lapins
et al., 2008). Additionally, methods from docking and molecular dynamics simulation were
used to compute the binding energy of PIs to variants of the protease. The binding energy turned out to be a good predictor of resistance without requiring a training step nor
training data (Jenwitheesuk and Samudrala, 2005). These systems, however, represent a
proof-of-concept and are not used as clinical decision support tools.
Despite all the approaches for predicting in vitro drug resistance, the overall aim is
still to infer clinical in vivo response of the patient to the drugs. Thus, the estimation of
relevant cutoffs is an essential part for establishing a useful prediction system. The expert
derived rules-based systems, on the other hand, mainly aim at interpreting resistance
mutations in terms of clinical response and consequently have no need for relevant cutoffs.
Nonetheless, all approaches have in common that they only classify single drugs as resistant
or susceptible, but what lies at the heart of the problem is the prediction of response to
a combination of drugs. In fact, HIV experts have discouraged inferring response from
genotype via the intermediate step of first predicting the resistance phenotype (Larder
et al., 2007; Brun-Vézinet et al., 2004).
In contrast to this criticism, we could show that predicted phenotypes – using the VircoTYPE system – combined with clinically relevant cutoffs are at least as predictive as
several expert derived rule sets. Additionally, predictive accuracy could be further enhanced for all systems by weighting individual drugs using statistical learning methods
instead of the simple summation usually applied to derive a GSS or PSS (Altmann et al.,
2009b). Moreover, statistical learning allowed to include predictions by multiple interpretation systems to infer virological response. In fact, combinations of predicted phenotypes
with expert predictions showed a slight improvement, which could also be confirmed on
a large independent test set. In contrast, combinations of only rules-based predictions
showed no added value over the use of a single system. In this study we also evaluated the
importance of predictions for single drugs in terms of response to antiretroviral treatment.
Indeed, the benefit of statistical learning originated from the different weighting of drugs
and confirms earlier results (Swanstrom et al., 2004). This weighting is presumably the
result of the abundance of the drug in the training data, its potency, the accuracy of the
prediction algorithm for that drug, pharmacokinetics (e.g. the forgiveness of a drug when
a dosage was missed or taken too late), and further factors adding to the success of the
drug in vivo.
Given a sufficient amount of data, it is possible to develop models that directly predict response to antiretroviral combination using genetic information and the drugs in the
treatment. Such models thereby circumvent the need to first assess resistance against single compounds and then subsequently combine the single predictions to a score for the
complete regimen. A further advantage is that such direct models can learn negative and
positive pharmacokinetic interactions between drugs from the data. Indeed, there exists
a large number of interactions between the different drugs (see Boffito et al. (2005) for a
review), ranging over different drug classes and even to non-anti-HIV drugs (e.g. all drugs
metabolized by cytochrome P450 are affected by ritonavir boosted PIs). The worst interactions between drugs in terms of drug resistance are effects that lower the bioavailability
of one or more compounds of the combination treatment: frequent suboptimal levels of
28
2 AIDS, HIV, and Antiretroviral Therapy
a drug in the blood plasma provide the right mixture of selective pressure and extent of
viral replication to eventually generate resistant viruses. An online repository providing
lectures and overviews on pharmacokinetic interactions is maintained by the Liverpool HIV
Pharmacology Group (http://www.hiv-druginteractions.org/).
Plasma levels of drugs can also be influenced by the genetic predisposition of the host.
For example, Mahungu et al. (2009) showed that a mutation in cytochrome P450 has a
positive influence on the plasma level of the NNRTI nevirapine. Hence, consideration of
host genetics during the selection of compounds for a combination therapy can yield further
improvements. On the other hand, severe adverse effects can be increased depending
on host genetics, e.g., the patient immunotype HLA-B*5701 is strongly associated with
hypersensitivity to the NRTI abacavir (Mallal et al., 2008). Still, despite its evident benefit,
pharmacogenetic testing is currently not widely applied.
A further host-specific aspect not considered by current tools is the presence of coinfections with other viruses (e.g. HBV) or other (chronic) diseases that require medication.
Again, compounds used to treat other diseases may limit the efficacy of anti-HIV drugs or
might add limitations in the selection of anti-HIV drugs.
Further important information is neglected by the restriction of the resistance analysis
to the baseline genotype alone: archived resistance mutations in the viral reservoirs (latent
viruses). Resistance mutations selected by earlier treatments can disappear from the viral
population as soon as the selective pressure is removed. For instance, protease resistance
mutations disappear in the absence of protease inhibitors because the protease is more
efficient without those resistance mutations. Increased efficiency translates into a higher
number of infectious viruses that eventually outnumber the protease resistant variants. The
mutations, however, persist in viral reservoirs and are rapidly reselected after reappearance
of the selective pressure by reusing the drug. Consequently, just studying the most recent
genotype might miss important mutations. And indeed, the patient’s treatment history has
long been recognized as valuable information (Bratt et al., 1998), and recently it has been
shown that inspection of previous genotypes helps to explain response to antiretroviral
combination therapy (Zaccarelli et al., 2009).
A quantitative analysis of the viral population with new sequencing techniques revealed
advantages in predicting the coreceptor usage phenotype from genotype (Däumer et al.,
2008). In terms of drug resistance, the interest is mainly on resistance mutations that are
only present in a minority of the viral population at the onset of the treatment. Minorities
might increase the risk of selecting that resistance mutation early after treatment start.
Hence, the question is, if minorities are harmful for the success of treatment at all, and
if so, what threshold of the minority leads to rapid selection of resistant variants. Using
standard Sanger sequencing (also called bulk-sequencing) a minority in the population
can only be detected, if it accounts at least for 20% in the overall sample. In contrast,
ultra-deep sequencing allows to determine the sequence of individual viruses. Hence, here
the resolution is mainly bound by the cost of the experiment. Initial studies using ultradeep sequencing demonstrated that resistance mutations found in minorities of the viral
population correlate well with previous antiretroviral therapies and can therefore provide
information on archived resistance mutations even in the absence of treatment records (Le
et al., 2009).
Eventually, every known drug combination fails due to acquired drug resistance muta-
2.6 Further Advancements
29
tions. The fact that anti-HIV therapy is a life-long effort paired with the limited number
of antiretroviral drugs and the cross-resistance between them, raises the need to consider
different strategies of effectively using the available array of compounds. For example, one
could follow a “hit-hard” strategy and apply one drug from each class, which will probably
lead to a very long lasting successful treatment (e.g. six years), but once it fails there
are resistance mutations against drugs from all classes. Consequently, only a very limited
number of drugs is available after treatment failure. On the other hand, one can restrict
the treatment to two drug classes, like in standard HAART, which remains effective for e.g.
three years. The fact that other drug classes have been spared, allows for multiple repetition of the medium term success. More specifically, one could optimize the order in which
drugs of one class should be administered for minimizing the impact of cross-resistance.
This planning of sequences of treatments, or therapy sequencing, aims at prolonging the
patient’s life.
Summing up, improvement over state-of-the art treatment guidance can be achieved by:
1. Construction of models that predict in vivo response to antiretroviral combination
therapy and thereby capture interactions between drugs and between drugs and mutations in vivo.
2. Inclusion of features of viral evolution for assessing the risk of the virus to escape a
putative treatment by developing further mutations.
3. Incorporation of data on the patient’s treatment history for assessing the risk of
resistance mutations archived in viral reservoirs.
4. Consideration of host specific characteristics ranging from genetic factors to coinfections for addressing the interactions between drugs and host (pharmacogenetics)
and virus and host (e.g. immune control)
5. Investigation of the impact of minorities in the viral population on the treatment
outcome and resistance development using novel sequencing techniques.
6. Generation of personalized treatment schedules that ensure the optimal use of the
available array of antiretroviral drugs.
This thesis addresses some of these issues. Dealing with points four and five, for example,
requires novel data: presence of co-infections or genotyping of the patient and sequencing
the viral quasi species, respectively. These data are not routinely collected in treatment
databases, and the corresponding issues are therefore not addressed. The objective of the
thesis is the inference of virological response in vivo directly from the viral genotype, the
applied drug combination, features derived from the viral genotype, and further available
information (Chapters 4 and 5). The models are restricted to RTIs and PIs (excluding
next generation drugs within these classes), simply due to lack of data on novel drugs in
databases collecting clinical response data. This poses no major limitation, as RTIs and
PIs remain the major building block for today’s antiretroviral therapy of HIV infections.
Novel drugs are typically spared for later treatment lines often termed salvage treatments.
Finally, we undertake first steps towards therapy sequencing by introducing a fast method
for estimating the viral evolution during combination therapy (Chapter 6).
30
2 AIDS, HIV, and Antiretroviral Therapy
3 Extensions to GENO 2 PHENO
The web service geno2pheno for predicting phenotypic drug resistance from the viral
genotype has been in operation since December 2000. Updates and continuous research is
required for maintaining its benefit for virologists and clinicians treating HIV patients. In
this chapter we describe the web tool geno2pheno[integrase] that predicts from the viral
genotype phenotypic resistance against two integrase inhibitors (Section 3.1). Moreover,
we present research aiming at improving the usefulness of predicted fold-change in IC50
by deriving clinically relevant cutoffs of the continuous measure (Section 3.2). Finally,
in Section 3.3 we explore approaches for improving prediction of resistance to individual
antiretroviral drugs by using abundantly available viral genotype data in addition to the
genotype-phenotype pairs.
3.1 GENO 2 PHENO[integrase]
The viral integrase comprises 288 amino acids and is a product of the C-terminal portion
of the large precursor protein encoded by the Pol gene. Only recently, in 2008, the first
integrase inhibitor raltegravir (RAL) was approved by the FDA. Knowledge of drug resistance against RAL and elvitegravir (EVG), an integrase inhibitor in phase III clinical trials,
accumulates only slowly. Moreover, due the recent release of the drug, routine resistance
testing is not established yet, mainly owing to the fact that the integrase of the patient’s
virus is assumed to be wild type and thus susceptible to the novel drug. As a consequence,
genotype-phenotype data is rare. However, some researchers investigate mutations, which
were reported to emerge during integrase inhibitor containing regimens, with phenotypic
resistance assays. These genotype-phenotype data are partially available via Stanford’s
HIVdb1 .
Material and Methods
Using this freely available resource we compiled a dataset comprising 113 and 126 genotypephenotype pairs for RAL and EVG, respectively. The raw data were preprocessed exactly
like in the case of geno2pheno: the decadic logarithm was applied to the fold-change
values, and each of the 288 amino acid positions was represented by 20 binary variables
indicating the presence (or absence) of a specific amino acid at that position. In accordance
to the geno2pheno web service we applied linear support vector regression (SVR) for
computing the continuous log10 (FC) from genotype. The cost parameter C of the SVR
was optimized in a five-fold cross-validation. Performance was measured using Pearson’s
correlation coefficient (r) between predicted and measured log10 (FC) in a leave-one-out
cross-validation (LOOCV) with the optimized cost parameter. In addition, we trained one
1
http://hivdb.stanford.edu/cgi-bin/IN_Phenotype.cgi
31
32
3 Extensions to geno2pheno
linear SVR model for each drug on the complete set for investigating the contribution of
each mutation to the overall observed resistance. More precisely, linear support vector
machines (SVMs) allow to rewrite the decision function for classification and regression
as a simple linear model with trivial access to the contribution of every mutation (Guyon
et al., 2002). The training of a SVM results in a set of support vectors x~k with label yk
and a set of non-zero weights ak . In case of a classification task, the label for a new sample
~x is then computed by:
!
X
f (~x) = sgn
yk αk K(x~k , ~x) ,
(3.1)
k
where K(x~k , ~x) is the kernel function. Essentially, a kernel function expresses the similarity
between two vectors. Hence, during the decision process the similarity of the input vector
to each support vector x~k is computed and multiplied with its label yk and its influence αk
derived during training. Popular kernel functions are for instance, the polynomial kernel
with degree d, K(x~k , ~x) = (1 + hx~k , ~xi)d and the radial basis function kernel K(x~k , ~x) =
exp(−βk~x − x~k k2 /(2σ 2 )). However, when using a plain linear kernel we simply set
K(x~k , ~x) = hx~k , ~xi,
which allows us to rewrite the decision function as
f (~x) = sgn (w
~ · ~x + b) , with
X
w
~=
αk yk x~k and b = hyk − w
~ · x~k i.
k
Thus, the weight vector w
~ is simply a linear combination of the support vectors and offers
a straightforward interpretation.
Results
Figure 3.1 depicts the scatter plots between predicted and observed log10 (FC) values. For
both drugs the correlation coefficients are high, whereas resistance to RAL (r = 0.87) was
predicted more accurately than resistance to EVG (r = 0.79).
The influence of each mutation on the measured phenotypic resistance against RAL and
EVG is displayed in Figures 3.2 a) and b), respectively. As expected, most mutations
lead to an increase in FC; a few mutations, however, slightly increase susceptibility to the
drugs. Figure 3.2 c) shows a scatter plot between the SVR weights obtained from the
RAL and EVG models. Here it becomes evident that most mutations confer resistance
to both drugs. A few exceptions are mutations at position 66, which influence the EVG
phenotype but not the RAL phenotype, and mutation 151I that increases RAL resistance,
but decreases EVG resistance. The offset (i.e. b in the linear model) is close to 0 for both
models, thus indicating no resistance for wild type viruses.
Discussion
A clear limitation of this study is the low number of training instances, which, in addition, originate from a total of eleven different research groups using different experimental
assays for measuring phenotypic resistance. These factors clearly increased the variation
3.1 geno2pheno[integrase]
33
4
4
●
●
● ●●
●
●
●
● ●
●
●
●
●
●
●
● ●
● ●
●
●
●
●●
●
●●
●
●
●
●
●●
●
●
●
●
●●
●
●
●● ●
●
●
●
●●
●●●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
● ●
● ●
●●
●
●●
● ●●
●● ●●●
●
●
●
●
●
●
● ●
●
●
●
●
●
● ●● ●
●
●
●
●●
● ●
●
●
●●●●
●
●
●●
●
●
●●
●
●
●
●
●
● ●●
●●●
●
●
●
●
●
●
●
●
●
●
●●
●●
r=0.79
●
●● ●
●
●
●● ●
●
●
●
● ●
●
●
●
●
●●●
●
●
●
●
●
●
●
●
0
0
●
●
●
2
●
r=0.87
1
●
● ●
●
predicted log10(FC)
2
●●
●●
1
predicted log10(FC)
●
● ●
●
●●
3
3
●●
●
●
●
●
●
●
● ●●
●
●●
●
●
●
●
●
●●
−1
−1
●
−1
0
1
2
actual log10(FC)
(a) RAL
3
4
−1
0
1
2
3
4
actual log10(FC)
(b) EVG
Figure 3.1: Scatter plots between predicted and observed log10 (FC) values for (a) RAL
(n=113) and (b) EVG (n=126).
in the phenotypic measurements we used for training the SVR models. Despite this shortcoming, the achieved prediction accuracy was surprisingly good for both drugs. This,
however, might be a direct consequence of the origin of the data: researchers were interested in the effect of a few mutations, which frequently emerge during treatment with
integrase inhibitors. Thus, they conducted site-directed mutagenesis to introduce these
mutations, individually or in combination, into a wild type virus. Hence, our model summarizes the knowledge derived from these experiments, but it does not allow to identify,
yet un-described integrase resistance mutations as it was done with the training data for
geno2pheno (Sing et al., 2005b). Furthermore, as a consequence of the bias towards wellknown resistance mutations, some training instances had identical genotypes. Of course,
identical samples can artificially boost the performance assessed in cross-validation settings. In order to study the impact of duplicate samples on the prediction accuracy, we
repeated the LOOCV study with a smaller set of unique samples (n=76 for RAL and
n=90 for EVG). The observed correlation was slightly worse for RAL (r = 0.85) and even
slightly better for EVG (r = 0.81) than with the full dataset. Nonetheless, analysis of
the feature importance could confirm the high cross-resistance potential between the two
inhibitors (McColl et al., 2007).
The prediction of phenotypic resistance to the integrase inhibitors RAL and EVG based
on SVR models is implemented in the freely available web service geno2pheno[integrase] 2 .
Along with the predicted FC values, a list of mutations contributing to the drug resistance
is provided. Currently, prediction of resistance to INIs is decoupled from prediction to
resistance to RTIs and PIs, since, as of now, determining the genotype of the viral integrase
and the Pol fragment containing protease and reverse transcriptase are separate working
steps.
2
http://integrase.geno2pheno.org
140S
143C 143R
148H
148K
148R
151A
151I
155S155H
3 Extensions to geno2pheno
●
●
66A
68V
●
●
●
●
0
●
●
●
●
●
●
1
●
●
●
121Y
92Q
92V
●
0
25
50
75
234V
●
113V
●
−1
influence on log(FC)
97A
2
34
100
125
150
175
200
225
250
275
amino acid position
●
●
●
●
●
0
●
263K
●
●
●
●
234V
●
●
●
●
●
125K
95K
113V
114Y
●
68V
51Y
1
140S
121Y
66A
66I
●
●
●
●
146P
147G
148H
148K
148R
151A
153Y
155H
155S
157Q
2
92Q92V
●
●
●
●
25
50
75
100
138K
●
151I
20K
0
●
●
230R
●
128T
●
−1
influence on log(FC)
3
(a) RAL feature importance
125
150
175
200
225
250
275
amino acid position
(b) EVG feature importance
148H
148K
1.5
92Q
1.0
66I
155H
121Y
140S
66A
151A
147G
146P
263K
0.5
change in log(FC) for EVG
2.0
148R
92V
155S
0.0
61I
68I
74M
154I
230N
128T
20K
230R
−0.5
113V
234V
153Y
114Y
95K
offset 68V
125K
51Y
157Q
138K
−0.5
0.0
143R
97A
143C
151I
0.5
1.0
1.5
2.0
change in log(FC) for RAL
(c) Correlation of mutations
Figure 3.2: Contribution of each mutation to the predicted phenotypic resistance to RAL
(a) and EVG (b). All mutations that cause a change in log10 (FC) of more than
0.15 are labeled with the corresponding substitution in red (increase) or green
(decrease). Correlation of mutations (c): the x-axis (y-axis) displays the effect
on log10 (FC) by a specific mutation on RAL (EVG) resistance.
3.2 Estimating Clinically Relevant Cutoffs for geno2pheno
35
3.2 Estimating Clinically Relevant Cutoffs for GENO 2 PHENO
The appealing benefit of the phenotypic drug resistance test, and consequently also of the
predicted FC in drug resistance, is the output of a scalar value that objectively quantifies
drug resistance. These values, however, have to be interpreted in terms of clinical impact.
Intuitively, high values predict unfavorable outcome, while low values indicate good response to the drug. Still, where exactly the cutoffs are located, which classify a drug as
susceptible, intermediate, and resistant, is unclear. The computation of clinically relevant
cutoffs is therefore an essential task for converting an in vitro measure into a clinically useful tool. Moreover, the observed ranges of FC vary heavily between drugs. For example,
FC for ZDV in the training data for geno2pheno has a median of 29 [interquartile range
(IQR): 1.5; 225], while ABC has a median of only 3.9 [IQR: 1.7; 8.2]. In geno2pheno this
scaling issue is addressed by normalizing the raw predicted FC with the predicted FC as
observed in treatment-naïve patients (Beerenwinkel et al., 2003a). Briefly, the log10 (FC)
of one drug in a group of untreated patients, i.e. patients, who are assumed not to have resistance mutations, is predicted with geno2pheno and the mean µnaïve and the standard
deviation σnaïve are computed. The log10 (FC) for a new prediction is now normalized by
computing the z-score
log10 (FC) − µnaïve
z=
.
σnaïve
Hence, drug resistance is expressed as the distance (measured in standard deviations)
from a treatment inexperienced group, and therefore makes FCs for different drugs more
comparable. The problem, however, which values correspond to intermediate or complete
drug resistance remains unsolved by this transformation.
The first generation of cutoffs in geno2pheno was based on the observation that on a
logarithmic scale the distribution of the FC values displays a bimodal distribution (Figure 3.3) – more precisely, a mixture of two Gaussian distributions, with one Gaussian representing the susceptible subpopulation and the other the resistant subpopulation. Hence,
the density of the x = log10 (FC) can be modeled as
α · φ(x; µ1 , σ1 ) + (1 − α) · φ(x; µ2 , σ2 ),
with φ(x; µ, σ) denoting a normal distribution with mean µ and standard deviation σ
(Beerenwinkel et al., 2003a). The model parameters can be efficiently estimated using the
expectation-maximization (EM) algorithm (Dempster et al., 1977), and the intersection
(between µ1 and µ2 ) of the two (weighted) Gaussians represents a logical candidate for a
cutoff between susceptible and resistant. Beerenwinkel et al. (2003a) exploited the bimodal
nature of the distribution further for calculating the probability of an FC value of belonging
to the susceptible subpopulation, that is where prob(sus|FC) > prob(res|FC) (Beerenwinkel
et al., 2003b). To this end, the zero x0 of the log-likelihood function l(x) = prob(sus|x)
prob(res|x)
is computed and the log-likelihood function is approximated by its tangent L(x) at x0 .
Finally, the probability of an FC value to belong to the susceptible subpopulation can be
1
approximated with prob(sus|FC) ≈ 1+exp(−L(x))
, where x := log10 (FC). This quantity is
also referred to as the activity of a drug d against a virus.
Despite its appealing mathematical properties, the probability of belonging to the susceptible subgroup was not favored by the clinicians and virologists, who considered the
36
3 Extensions to geno2pheno
−1
0
1
log10(FC)
2
3
1
0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
Probability
0
0
0.0
0.1
0.2
Density
Density
0.3
0.4
0.5
0.6
0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
Probability
Density
Density
0.0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
EFV
0.7
NFV
−2
−1
0
1
2
3
log10(FC)
Figure 3.3: Distribution of log10 (FC) for the PI NFV and the NNRTI EFV. The Gaussian
representing the susceptible (resistant) subpopulation is depicted by a green
(red) dashed line. The intersection of the Gaussians is denoted by the vertical
dashed line, and the approximation of prob(sus|FC) is given by the blue line.
measure being too conservative. That is, viruses are being declared resistant against a
compound although the drug still shows some clinically useful activity in the patient. The
current set of cutoffs used in geno2pheno are so-called clinically relevant cutoffs, meaning, they reflect the clinical usefulness of predicted in vitro resistance. For estimating
those cutoff values special treatment changes were manually examined. In these treatment
changes only a single drug was added to the regimen and the achieved change in viral load
can therefore be fully attributed to this new drug in the combination. Hence, clinically
relevant cutoffs can be computed by correlating the predicted drug resistance against the
compound to its effect on viral load decline (Däumer et al., 2007). These special treatment
changes, however, occur only rarely in databases collecting routine treatment information, and consequently, the estimated cutoffs are unreliable. As a result, the approach
was restricted to the drug LPV for which most data was available. A further method for
deriving clinically relevant cutoffs (employed for the remaining drugs) was based on the
correlation of predicted drug resistance to a drug with the viral load measured during a
failing treatment containing that drug. In short, a linear model between log10 (FC) and
log10 (VL) was fitted and the FC corresponding to a VL of 103.41 copies per ml (i.e. two
standard deviations from the mean of VL in treatment-naïve patients) was selected to be
the upper cutoff (Däumer et al., 2007). The lower cutoff could not be estimated with this
method and was set to the biological cutoff (i.e. two standard deviations from the mean
IC50 observed in treatment-naïve patients).
The company Virco with its proprietary FC prediction system VircoTYPE was facing a
similar problem. For deriving clinically relevant cutoffs Winters et al. (2008) compiled one
dataset per drug comprising only regimens containing that drug. Briefly, the treatments
are grouped with respect to the estimated background activity of the other drugs (using
a preliminary cutoff), thus allowing for studying the relation between change in viral load
and predicted FC of the target drug. The lower and upper cutoffs are defined as the FC
3.2 Estimating Clinically Relevant Cutoffs for geno2pheno
37
values at which the drug exhibits a 20% and 80% loss in activity compared to a wild type
virus, respectively. The thresholds are optimized iteratively per drug, that is, after values
for all drugs have been computed sequentially, the next round of cutoff optimization uses
the improved values to achieve a better estimate of the background activity. The iterative
procedure was stopped when the computed clinical cutoffs remained stable. Moreover,
stability of the derived cutoffs was estimated using 1000 bootstrap replicates of the initial
data (Winters et al., 2008).
For allowing a more systematic derivation of clinical relevant cutoffs for the geno2pheno
system we investigated an approach akin to the one applied by Virco. In contrast to that
approach, all cutoffs are estimated simultaneously using global probabilistic optimization
algorithms like simulated annealing (Kirkpatrick et al., 1983) and genetic algorithms. In
the following, we briefly describe our approach.
Material and Methods
Data
The basis for the optimization is a set of treatment change episodes (TCEs) extracted
from the EuResist integrated database. Briefly, a TCE comprises the viral genotype
(at most three months prior to treatment start), the prescribed drug combination, and a
viral load before (at most three months) and after treatment start (within 4-12 weeks).
Treatment success was defined as a reduction of the VL below 500 copies per ml blood or
by at least a 100-fold reduction compared to the pre-treatment VL measurement. Details
on this standard datum definition and the EuResist database are provided in Chapter 5.
Application of the definition to data stored in the current version of the EuResist database
resulted in 5,012 TCEs. Of those, 4,514 instances were randomly selected and formed the
development set. The remaining 498 TCEs were used as an independent test set. Success
rate was 72.5% in both datasets. Furthermore, FC in drug resistance against the drugs
applied in a treatment was predicted with geno2pheno.
Global Probabilistic Optimization
Simulated annealing (SA) and a genetic algorithm (GA) were used to compute two cutoffs
for every compound. Below (above) the lower (upper) cutoff activity of the drug was rated
1.0 (0.0), representing full (no) activity. Between the cutoffs the activity was set to 0.5.
SA starts with a random set of cutoffs, and a subset of those are randomly modified in
each iteration of the algorithm. The cutoffs are used to compute the phenotypic susceptibility score (PSS) of the applied regimen, i.e. simply the sum of individual drug activities.
If a modified set of cutoffs improves the correlation between the PSS and the change in VL
(∆VL) on the development set, then it is retained. If, on the other hand, the modified set
lowers the correlation, then it is only retained with a small probability. This probability
depends on the magnitude of decrease of the correlation and on the runtime of the algorithm. Acceptance of slightly worse solutions is a tool for avoiding to get stuck in local
maxima. Precisely, the probability paccept of accepting a worse set of cutoffs is
δ
1
i
paccept = exp
, with T (i) =
1−
,
T (i)
50
I
38
3 Extensions to geno2pheno
where δ is the difference in performance between the old set and the new set (i.e. negative),
T (i) is the temperature of the system, I is the maximal number of iterations (here 5000)
and i is the current iteration. Based on the current best set of cutoffs, a neighboring set
of cutoffs was tested. In order to reduce computational complexity, the continuous IC50
values were searched in steps of 0.05. A neighboring set of cutoffs is defined as a set in
which no cutoff differs by more than 0.15 (i.e. three grid steps) from the corresponding
value in the current best set. Finally, it was ensured that lower cutoffs were smaller than
upper cutoffs. Among all neighboring sets of cutoffs, one set was randomly selected for
being tested in the next iteration.
In contrast to SA, GA maintains multiple sets of cutoffs (the population) with each
set containing cutoffs for all drugs (one individual). In every iteration of the algorithm
(generation), the chance to copy a set of cutoffs depends on its performance (fitness).
During the copying process the cutoffs are randomly modified (mutated) and even parts of
individuals might be exchanged (cross-over). The algorithm terminates when a stopping
criterion is fulfilled. Usually, the stopping criterion comprises a maximum number of
generations and a maximum number of generations without substantial improvement (stall
generation). For GA we used the implementation of the genetic algorithm and directed
search toolbox of the programming environment Matlab.
For robust estimates of the cutoffs, SA and GA were carried out on 100 bootstrap
replicates of the original development set and the mean of the cutoffs obtained from the
bootstrap replicates represents the final set of cutoffs. This set was used to compute PSS
values on the validation set where the PSS was correlated to ∆VL and to the binary outcome – performance was measured as Pearson correlation (r) and area under the receiver
operating characteristics (ROC) curve (AUC). Briefly, ROC curves depict classifier performance by giving a true-positive rate (TPR; percentage of correctly predicted successes)
for every false-positive rate (FPR; percentage of failing therapies that were predicted to
be successful). The AUC summarizes the performance and is a convenient measure for
comparing scoring systems without the need to provide a particular cutoff (Brun-Vézinet
et al., 2004). The AUC is a value between 0 and 1 corresponding to the probability that a
randomly selected success receives a higher score than a randomly selected failure (Fawcett,
2006). The achieved performance was compared to the old cutoffs, to the original activity
scores, and to Stanford’s HIVdb.
In addition to the original approach, we investigated slight modifications. Namely, the
use of AUC instead of correlation as the target function, a linear interpolation of the
intermediate region (instead of simply using 0.5), and the combination of both.
Results
Figure 3.4 a) depicts the cutoffs based on the activity score, the old manually derived
clinical cutoffs, and the set of new automatically derived clinical cutoffs. This new set of
cutoffs was derived with GA and correlation between PSS and ∆VL as a target function
without interpolation between the cutoffs. Of note, cutoffs based on the bimodal activity
model tend to be lower for most drugs than the (old or new) clinical cutoffs, and thereby
confirm the subjective impression of virologists and clinicians. When used with linear
interpolation between the lower and upper threshold, the new cutoffs achieve an AUC
3.2 Estimating Clinically Relevant Cutoffs for geno2pheno
(a) clinical cutoffs
39
(b) ROC curve
Figure 3.4: New and old clinical cutoffs (CCO) in comparison (a). Cutoffs based on the
bimodal activity model are shown in the upper bar. The old clinical cutoffs
are depicted in the middle bar, and the newly derived cutoffs are depicted
below. Green corresponds to susceptible, yellow and red indicate intermediate
and full resistance, respectively. Performance of different sets of cutoffs (b).
Performance achieved with the different sets of cutoffs on the validation data
in comparison to a HIVdb 5.0.0 based prediction.
(r) of 0.784 (0.469), compared to 0.759 (0.433) with the old cutoffs, 0.776 (0.461) with
the activity score, and 0.781 (0.457) with HIVdb using the five state interpretation. GA
provided an improved set of clinical cutoffs for geno2pheno that performed slightly better
than (as good as) HIVdb with respect to correlation (AUC). Figure 3.4 b) depicts the
ROC curves on the validation set computed with the old and new clinical cutoffs and
the reference methods HIVdb and activity scores. Based on a method by DeLong et al.
(1988) the significance of the difference between two AUC values could be assessed. Only
improvements over the old clinical cutoffs by any other method showed a statistical trend
(new clinical cutoffs: p = 0.0877; activity score: p = 0.1097) or even statistical significance
(HIVdb: p = 0.0204). The p-values for differences between the remaining pairs of methods
were all larger than 0.19.
Table 3.1 lists the achieved performance on the validation set using either 0.5 for the
intermediate resistance or the linear interpolation. Of note, for simulated annealing it
beneficial, in general, to use interpolation during the estimation of the cutoffs, in addition,
using the correlation between PSS and ∆VL as target function optimized both r and
AUC while AUC as target function results in worse r on the validation set. For GA such
conclusions cannot be drawn. Here, using AUC instead of correlation as optimization
function achieved slightly better performance in terms of AUC. The upper thresholds for
some drugs, however, were extremely large, e.g. ZDV and 3TC received approximately an
upper cutoff of 100-fold (data not shown). That is, for these drugs, viruses were regarded
as resistant only for FC values of 100 and more. Overall, cutoffs derived with GA yielded
40
3 Extensions to geno2pheno
D
I
D I
r
AUC
development
validation
simulated
r
D
I
0.455 0.464
0.465 0.467
0.421 0.396
0.446 0.436
annealing
AUC
D
I
0.763 0.774
0.759 0.766
0.759 0.756
0.752 0.751
genetic algorithm
r
AUC
D
I
D
I
0.472 0.469 0.780 0.784
0.468 0.471 0.774 0.784
0.467 0.463 0.786 0.789
0.479 0.466 0.784 0.790
Table 3.1: Performance of different sets of automatically derived clinical cutoffs. Performance was measured in correlation (r) and AUC on the independent validation
set. The rows denote the setting in the training stage, i.e. the different target functions and whether interpolations between thresholds was used (I for
interpolation, D for discrete). The columns correspond to different settings in
the validation setting. The first (last) 4 columns denote performance of cutoffs
derived with simulated annealing (genetic algorithm).
better performance on the independent validation set than the cutoffs derived with SA
when the same training setup was used. The method of DeLong et al. (1988) indicates
that the observed improvements in AUC are significant (p-values range from 0.0502 to
0.0041).
Clearly, SA and GA are not the only probabilistic global optimization methods. Other
methods like ant colony optimization (Dorigo and Gambardella, 1997) can also be used to
simultaneously optimize the cutoffs. However, the genetic algorithm performs well, and it
is likely that modifications of the target function and the computation of the PSS play a
more pivotal role in the final quality of the cutoffs than the optimization method.
3.3 Improvements Using Semi-Supervised Learning
The advantage of data-driven methods for developing decision support systems, namely
that information is only derived from data without interference of expert opinions, is also
their major weak point: a sufficient amount of data is required during the training step
of the models. This weakness becomes more evident when new drugs are released. Rulesbased systems can easily be extended on the basis of first reports of the drug in clinical trials
or extended access programs. For the phenotypic resistance test, which is an important
step during data generation, the drug has to be available to the laboratory conducting the
assay. This, however, is usually only the case when the drug is already FDA approved.
Consequently, there exists a serious time lag between approval of a drug and the availability
of sufficient data for training statistical models.
In a recent work we investigated the use of semi-supervised learning (SSL) methods
for improving the prediction of drug resistance (Perner et al., 2009). Briefly, SSL uses
labeled data (genotype-phenotype pairs) and unlabeled data (just genotypes from routine
diagnostic) to derive improved models. The idea behind SSL is that unlabeled data can
provide information on the distribution of the data in the input space. This information
3.3 Improvements Using Semi-Supervised Learning
41
can be exploited by the machine learning algorithm to avoid the separation of clusters, since
points in the same cluster are assumed to belong to the same class (cluster assumption).
Moreover, SSL methods generally assume that samples that are close in the input space are
also close in the output space (smoothness or manifold assumption). Zhu (2007) provides
an introduction into SSL and a survey on available SSL methods.
We examined two types of unlabeled data: the first category originates from patients that
were treated with the particular drug (for which we are building a model) at the time the
viral sequence was obtained (Sdrug ), while the second category required only the use of one
drug of the same drug class (Sclass ) at the time of genotyping. The second set of unlabeled
data can be expected to be larger but probably less informative for the learning as the
virus was (at the time of genotyping) not exposed to the drug in question. We found that
if all labeled data (up to 1,000 genotype-phenotype pairs) were used, then the SSL learning
methods showed little benefit over the standard supervised learning methods regardless of
the type of unlabeled data. If, however, only little labeled data were available (e.g. 10% of
all labeled data) then the SSL methods showed a significant improvement (Figure 3.6; upper
row). Unfortunately, the benefit could not be witnessed for all drugs/drug classes and all
tested SSL methods (data not shown). In fact, a violation of the SSL assumptions by the
underlying data, led to inferior models. For example, the performance of the 3TC model
was heavily corrupted by SSL methods. In the case of 3TC, one amino acid exchange
is sufficient to confer complete resistance, while for other NRTIs several mutations are
necessary. NRTIs are usually given to the patients in pairs. Thus viruses that were exposed
to 3TC were also exposed to other NRTIs with more complicated resistance patterns. As
a consequence, the data density does not reflect the labeling of 3TC resistance, which is a
clear violation of the smoothness assumption.
The transductive SVM (tSVM) is a SSL version of the SVM and, just like its supervised
counterpart, provides a decision boundary (see Eqn. 3.1) after the training phase. Briefly,
the standard soft margin SVM optimizes the following function:
N
X
1
min kwk
~ +C
ξi , subject to ξi ≥ 0, yi (w
~ · x~i + b) ≥ 1 − ξi , ∀i
w
~ 2
(3.2)
i=1
where N is the number of labeled training instances, x~i and yi are the features and the label
of the ith instance, respectively, w
~ and b define the hyperplane, ξi are the slack variables
that allow for misclassification and C is the cost parameter for misclassified examples.
The tSVM aims at determining a separating hyperplane under consideration of the M
unlabeled samples {~x∗1 , . . . , ~x∗M }, therefore Eqn. 3.2 is extended in the following way:
N
M
X
X
1
min kwk
~ +C
ξi + C ∗
ξj∗ , subject to
w
~ 2
i=1
j=1
ξi , ξj∗
≥ 0, yi (w
~ · x~i + b) ≥ 1 − ξi , yj∗ (w
~ · ~x∗j + b) ≥ 1 − ξj∗ , ∀i,j
(3.3)
where the additional parameters ξj∗ and C ∗ are the slack variables and the misclassification
cost parameter for the unlabeled instances, respectively. Thus, the optimization problem
∗ for the
in Eqn. 3.3 differs from Eqn. 3.2 in that the tSVM has to find a labeling y1∗ , . . . , ym
unlabeled data and a hyperplane < w,
~ b > simultaneously. An approximative optimization
procedure, which is required due to the complexity of the optimization problem, has been
42
3 Extensions to geno2pheno
Figure 3.5: Decision boundaries fitted by SVM-based models. Decision boundary derived
only from labeled instances (colored circles) using a supervised SVM method
(left). The transductive SVM takes also the unlabeled samples (black circles)
into account when fitting the decision boundary (right).
implemented in the software library SVMlight by Joachims (1999). The approach begins
with a labeling of ~x∗1 , . . . , ~x∗m based on the classification of an inductive SVM and a low
weight C ∗ for the penalty for misclassified unlabeled data points. Then the labels of two
randomly selected samples (one positive and one negative) are swapped. If the objective
function is improved by that exchange of labels, then the switch is made permanent.
This process is repeated until there are no more switches possible that yield an improved
objective function. At this point the penalty for misclassified unlabeled data points C ∗ is
increased and further labels are swapped to greedily improve the objective function. The
iterative procedure stops when C ∗ exceeds a user defined value. Figure 3.5 depicts the
concept of transductive SVMs.
The resulting hyperplane (i.e. decision boundary) has the same form as the one derived
with the standard supervised SVM approach. Consequently, the boundary derived with
an tSVM can be used to assess the class of new unseen samples. The ability of classifying
unseen samples is not granted for SSL methods. In fact, many methods only provide a
classification for all unlabeled samples that are available during the training phase. Hence,
classification of a new sample requires retraining of the entire model with all previous
training data and the new unlabeled sample. For instance, low-density separation (Chapelle
and Zien, 2005), one of the SSL methods we studied, cannot be used on unseen samples.
This inability is a direct result from the fact that low-density separation (LDS) constructs
a kernel which is based on all samples (labeled and unlabeled) in training data. LDS is
only capable of making predictions for samples that were used to build the kernel.
Nevertheless, the benefit of using tSVMs in the transductive setting (i.e. only prediction
of unlabeled samples used in the training phase) that we previously observed (Perner et al.,
2009), could be transferred to the inductive setting (i.e. prediction of new unseen samples
without retraining). Figure 3.6 depicts the learning curves (size of labeled training data
versus model performance) for the two drugs LPV and ZDV using the tSVM implementation SVMlight by Joachims (1999) and both types of unlabeled data. A standard supervised
SVM served as reference method. The results show that with only very little labeled data
3.3 Improvements Using Semi-Supervised Learning
43
(e.g. 2.5%) the SSL models almost reach the performance of the supervised method using
all available labeled data.
Concluding, SSL does not yet provide a useful tool for improving the prediction of
resistance for the first drug in a novel drug class (e.g. integrase inhibitors), simply due
to the fact that only little relevant unlabeled data exist at the time when the drug is
released. For instance, there are currently about 3800 integrase sequences stored in the
Los Alamos HIV Sequence Database (http://www.hiv.lanl.gov/), compared to about
75.000 sequences comprising the protease. Moreover, since prior to the release of raltegravir
there were no integrase inhibitors available, the majority of the available sequences will
not contain any resistance mutations. The question, whether SSL provides a benefit for a
new drug in an established drug class, depends on the resistance profile of the class and
how the drugs are used. For instance, the NNRTI and NRTI models suffered from the
circumstance that multiple drugs with different resistance patterns were simultaneously
targeting the same viral protein, while PIs generally benefited from SSL. Likewise, future
INIs might benefit from sequence data generated after exposure to raltegravir.
44
3 Extensions to geno2pheno
1.0
1.0
1.1
ZDV
1.1
LPV
●
●
●
●
●
●
●
●
●
●
0.9
●
●
AUC
●
●
●
●
●
●
●
●
0.7
●
0.5
●
200
300
400
500
●
●
600
0
200
400
SVM on L
SVMLight on L and S_drug
SVMLight on L and S_class
LDS on L and S_drug
LDS on L and S_class
600
Number of labeled samples
Number of labeled samples
LPV
ZDV
800
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
0.9
0.9
●
1.0
1.0
1.1
100
1.1
0
SVM on L
SVMLight on L and S_drug
SVMLight on L and S_class
LDS on L and S_drug
LDS on L and S_class
0.6
●
0.5
0.6
0.7
●
●
●
●
●
●
●
●
●
●
●
●
0.5
●
0
100
200
300
SVM on L
SVMLight on L and S_drug
SVMLight on L and S_class
400
Number of labeled samples
500
600
0.8
0.6
0.7
●
●
●
0.5
0.6
0.7
0.8
AUC
●
AUC
●
0.8
●
●●
0.8
AUC
0.9
●
0
200
400
SVM on L
SVMLight on L and S_drug
SVMLight on L and S_class
600
800
Number of labeled samples
Figure 3.6: Learning curve for the drugs LPV and ZDV. Model performance is given as
the area under the ROC curve (AUC) and based on a 10-fold cross-validation.
For LPV (ZDV) a total of 682 (1055) labeled samples (L) were available. For
the tSVM 1442 (2717) or 4435 (7887) sequences, which were exposed to LPV
(ZDV) or at least one PI (NRTI) at the time of sequencing, respectively, were
used. The performance was estimated using 2.5%, 5%, 10%, 20%, 40%, 60%,
80%, and 100% of the labeled data. The upper row depicts the performance
of tSVM and LDS in the transductive setting (i.e. test instances were used as
unlabeled training data). By contrast, the lower row depicts the performance
of tSVM in the inductive setting.
4 Predicting Response to Combination
Therapy
In Chapter 2 we introduced the basis of modern anti-HIV therapy. In order to ensure the
selection of potent regimens, several genotype interpretation systems are used to infer in
vitro drug susceptibility and/or in vivo response to antiretroviral treatment on the basis
of HIV-1 genotype. Most of these tools use a set of rules carefully crafted by experts
and classify the virus as susceptible, intermediate, or resistant to each of the individual
compounds. Few tools are fully data driven rather than based on expert knowledge. The
most prominent examples are geno2pheno (Beerenwinkel et al., 2003a) and VirtualPhenotype (Vermeiren et al., 2007), which both apply methods from statistical learning for
predicting in vitro resistance on the basis of genotype. Although all of these methods
are designed to infer susceptibility to individual compounds (Vercauteren and Vandamme,
2006), recently developed decision support tools are being explored to infer virological response directly to a typical 3-4-drug HAART regimen. In one study (Larder et al., 2007),
artificial neural networks were used to predict the change in viral load, given the sequence,
regimen, and additional host-specific features. In Section 4.2 we introduce the software
pipeline geno2pheno-THEO, which predicts the probability of reaching an undetectable
viral load during the course of the regimen given the applied drug combination and the
genetic makeup of the viral population. Geno2pheno-THEO uses covariates that encode
the estimated viral evolution during treatment in addition to drug combination and viral
genotype. These evolutionary features are based on works by Beerenwinkel et al. and are
briefly summarized in Section 4.1. The resulting statistical model is evaluated on a large
external database in a comparison to well-established HIV genotype interpretation systems
(Section 4.3).
4.1 Features of Viral Evolution
The features estimating viral evolution that are used to improve the prediction of response
to antiretroviral treatment are based on works by Beerenwinkel et al. and are briefly
summarized in the following sections.
4.1.1 Activity Score
The activity score for combination treatments is based on the activity score for individual
drugs as computed by geno2pheno. Precisely, let D be the set of all drugs and T =
{d1 , . . . , dn } ⊆ D be a combination of n drugs. Furthermore, Tc represents all drugs of T
belonging to drug class c ∈ {NRTI, NNRTI, PI}. The activity of a treatment T against
45
46
4 Predicting Response to Combination Therapy
virus seq is defined as:
activity(T, seq) =
X
c
max activity(d, seq),
d∈Tc
where rf d (seq) denotes the resistance factor (RF) of the virus seq against drug d, and
activity(d, seq) = prob(sus| log rf d (seq)) as defined in Section 3.2. Inhibitors sharing drug
target and mechanism of action are competing, e.g. for the binding site, thus the activity
score is restricted to the most active representative of a class (Jordan et al., 2002). No such
restrictions hold for drugs from different drug classes, and consequently the activity score
is additive for inhibitors from different drug classes. Obviously, the number of different
drug classes constitutes the upper bound of the activity for treatments. Thus, the activity
score is in [0,3] for our case with at most three drug classes.
Maximizing over drugs in the same class might at first glance seem limiting, since there
is no immediate benefit of administering multiple drugs from the same class. The benefit
becomes evident though, when considering the change of the activity score during the
course of treatment. The selective pressure posed by the treatment leads to mutations in
the viral genome, which in turn might affect drugs from the same class differently and thus
the most active drug from one class changes over time. For instance, at start of a regimen
comprising ZDV and 3TC the RT of a virus shows only the T215Y mutation, which solely
reduces the susceptibility of ZDV. Thus, the activity of 3TC is unharmed and consequently
the activity of the treatment is 1. During the time of therapy, the virus develops another
RT mutation. For instance, M184V that completely inhibits the activity of 3TC but does
not affect ZDV at all. The activity score for the combination treatment is now equal to the
activity of ZDV, in contrast to a 3TC monotherapy where the activity score would have
decreased immediately to 0.
This example illustrates that, in addition to the resistance to drugs at treatment start,
the long-term success of an antiviral therapy depends greatly on the ability of the virus to
escape from selective pressure presented by the treatment. An estimate of the virus’ ability
to escape from the treatment might therefore be an useful information for predicting the
success of combination treatments. Such an estimate can be derived by studying changes
of activity in the mutational neighborhood of the virus.
Due to the high dimension of the sequence space, i.e. there are 20319 possible amino acid
sequences of length 319, the neighborhood of the virus is explored using a beam search.
Precisely, starting from the original sequence, all variants with one additional mutation are
generated and the activity of the treatment against all variants is computed. The resulting
activity score for these in silico mutants is used to rank the variants. Only for the top b
mutants showing the least activity all variants with one additional mutation are generated
and their activity score is computed. The search stops when d mutations are introduced
into the original sequence. The breath b and depth d are the two parameters of the beam
search. The activity score at depth r is defined by:
activityr (T, seq) = min
activity(T, seq0 ),
0
seq ∈Br
where Br denotes the neighborhood of the original sequence with r additional mutations.
Br is generated from the b viruses with the lowest activity score in Br−1 by adding an
additional mutation.
For further details on an the activity score see the work by Beerenwinkel et al. (2003b).
4.1 Features of Viral Evolution
47
4.1.2 Mutagenetic Trees
The beam search approach assumes that every mutation is equally likely, and that only the
increase in resistance is important. This, however, is an oversimplification. For example,
there exist well-known resistance pathways for the first approved anti-HIV drug zidovudine.
The thymidine analog mutation (TAM) pathway 1 and 2 comprise the RT mutations 41L,
210W, and 215F/Y and 67N, 70R, and 219E/Q, respectively. The mutations belonging to
one pathway are mainly accumulated in a preferred order (hence pathway). The molecular
causes for the preferred order or for the existence of two distinct pathways are still unknown.
However, these pathways occur so frequently that they were identified by experts without
the use of computational analyses.
Beerenwinkel et al. (2005b) introduced mixtures of mutagenetic trees to estimate mutational pathways from cross-sectional data. Briefly, a mutagenetic tree is a weighted directed
tree with vertices representing the appearance a new mutation (event). Edge weights are
conditional probabilities between vertices with the constraint that the parent event has
to be present before the child event can occur. Thus, every path from the root of the
tree to a vertex represents the ordered accumulation of mutations. Basically, mutagenetic
trees are restricted Bayesian networks and constitute an extension of the oncogenetic tree
model introduced by Desper et al. (1999) that was restricted to undirected trees. A further
extension of the original approach is the ability to generate mixtures of mutagenetic trees.
Formally, a mutagenetic tree is a quartuple T = (V, E, r, ρ), with V = {1, . . . , l} being a
set of vertices representing the events, E being the set of edges, r ∈ V being the root vertex
that encodes the absence of any events, and ρ : E → [0, 1] being the mapping from edges
to conditional probabilities. A single tree is constructed by first building the complete
digraph G = (V, V × V, w) on all vertices. The weights w are defined as
∀u, v ∈ V : w(u, v) = log P r(u, v) − log(P r(u) + P r(v)) − log P r(v),
where P r(u) and P r(u, v) denote the marginal probability of event u and the joint probability of events u and v, respectively, as estimated from the data. The edge weights are
P r(u,v)
the logarithm of the independence of the mutations P r(u)P
r(v) multiplied by the direction
P r(u)
of the dependence P r(u)+P
r(v) . The mutagenetic tree is defined as the branching in G that
maximizes the sum of its edge weights. Using the algorithm by Edmonds (1967) the maximum weighted branching can be computed in O(|V ||E|) time, for a fully connected graph
this corresponds to O(|V |3 ). The weighting function w(u, v), however, displays an undesired behavior for edges leaving the root node, since the less likely events receive a higher
score w(r, v) = − log(1 + P r(v)). Thus, for edges leaving the root node the alternative
weighting function w(r, v) = log P r(v) is used.
The likelihood of a mutational pattern x is the probability that a given mutagenetic
tree T generates x: L(x|T ) = P r(x|T ). Let S ⊆ V be the set of events defined by the
mutational pattern x. If there is a subset of edges E 0 ⊆ E such that exactly all vertices of
S can be reached in the subtree (V, E 0 ), then x can be generated by T with likelihood:
Y
Y
P r(e)
(1 − P r(e)).
L(x|T ) =
e∈E 0
e∈{S×V \S}
The first product computes the probability of reaching all events in S and the second
product represents the probability of not going beyond the events in S. If there is no
48
4 Predicting Response to Combination Therapy
appropriate subtree of T , then the pattern x cannot be generated by T . Hence, L(x|T ) = 0.
For instance, the right tree in Figure 4.1 does support the mutational pattern 215F, 41L
but not the pattern 215F, 41L, and 70R, since the mutations 67N and either 219E or 219Q
have to be acquired before the mutation at position 70. Thus, the likelihood of the pattern
215F, 41L, and 70R to be generated by that tree is 0. On the other hand, the likelihood
of the 215F, 41L is 0.4 × 0.77 × (1 − 0.61) × (1 − 0.45).
The mixture of mutagenetic trees is estimated from cross-sectional data in an EM-like
learning algorithm (Dempster et al., 1977). Formally, a k-mutagenetic tree mixture model
is defined as
k
k
X
X
M=
αi Ti with αi ∈ [0, 1] ∧
αi = 1,
i=1
i=1
with Ti = (V, Ei , r, ρi ). Consequently, the likelihood of a pattern x given the mixture
model is defined as
k
X
L(x|M ) =
αi L(x|Ti ).
i=1
In order to handle noisy real-world data, one component of the mixture is forced to attain
a star-like topology modeling the independence of mutations. Briefly, during the learning
phase we want to find the mixture of k mutagenetic trees that maximize the log-likelihood
of the training data:
N
X
n=1
log
k
X
αi L(xn |Ti ).
(4.1)
i=1
The responsibility γni of the tree Ti for the training sample xn is defined as the probability
of xn being generated by Ti given the mixture model M . In the M-like step of the training
algorithm the parameters of the current mixture model, i.e. the edge weights of the star
component, the k − 1 trees, and the mixture parameters αi , are updated. In the E step the
responsibilities are computed using the updated model. E and M step are iterated until
the log-likelihood converges. At the initialization of the algorithm initial responsibilities
are assigned according to a (k − 1)-means clustering of the training data. Algorithm 1 is
adapted from (Beerenwinkel et al., 2005b) and summarizes the complete training process.
For each antiretroviral drug one mixture of mutagenetic trees is trained. The training
data for each drug comprises sequences from patients that were obtained during treatment
with that drug. Figure 4.1 depicts a 2-mutagenetic tree mixture model learned for the
drug ZDV. The left component is a star that supports every possible pattern and therefore
models the noise of the data. The right component is a tree estimated from data. The
mixture weights α are depicted above the components. The actual tree estimated from the
data is able to generate 72% of the instances found in the training data. The mixture of
mutagenetic trees can be used to derive two evolutionary features. The genetic progression
score provides an estimate on the expected waiting time for a mutational pattern to occur
and therefore summarizes how advanced the virus in terms of drug resistance is. The
genetic barrier to drug resistance is the probability that the virus will not escape from
drug pressure by developing further mutations. Both features are briefly described in the
next sections.
4.1 Features of Viral Evolution
49
Algorithm 1 k-mutagenetic tree learning (X = (xnj )1≤n≤N,1≤j≤l , k)
N is the number of training instances, and l is the number of events
1. Guess initial responsibilities:
(a) Run (k-1)-means clustering algorithm
(b) Set responsibilities
(
γni =
1
2
if xn is in cluster i − 1,
1
2(k−1)
else.
2. M-like step. Update model parameters:
P
Ni = N
n=1 γni ∀i ∈ {1, . . . , k}
P
P
T1 is a star with edge weights: β = lN1 1 lj=1 N
n=1 γn1 xnj
∀i ∈ {2, . . . , k} :
(a) Estimate the joint probability for all pairs of events (u, v):
N
1 X
pi (u, v) =
γni xnu xnv .
Ni
n=1
(b) Compute the maximum weight branching Ti from the complete digraph with weights
w derived from pi .
i
(c) Compute the mixture parameter αi = N
N .
3. E step. Compute responsibilities:
γni = Pk
αi L(xn |Ti )
m=1 αi L(xn |Tm )
.
4. Iterate steps 2 and 3 until convergence, i.e. no changes in Eqn. 4.1.
return T1 , . . . , Tk and α1 , . . . , αk
The Genetic Progression Score
The genetic progression score (GPS) was introduced by Rahnenführer et al. (2005). Briefly,
a timed mutagenetic tree can be obtained by assuming independent Poisson processes
for the occurrence of events on the edges and for the sampling time of the virus. The
difference of occurrence time of the event j and its parents is denoted by Zj . Since event
j was generated by a Poisson process, the waiting time Zj is exponentially distributed
with parameter λj . Furthermore, let the sampling time of the virus ZS be exponentially
distributed with parameter λS . The probability of event j can be computed as P r(j) =
λj /(λj + λS ). Thus, the expected waiting time for event j is defined as:
E[Zj ] =
1
1 − P r(j)
1 − P r(j)
=
=
E[ZS ].
λj
P r(j)λS
P r(j)
Estimation of the λS from real data is usually impossible, since the time of infection is
usually unknown and the virus might have had events before starting the particular drug.
Thus, we set E[ZS ] = 1, which defines a unit-less waiting time. We emphasize that the GPS
is not intended to estimate absolute waiting times. Rather it provides a dimension-free
50
4 Predicting Response to Combination Therapy
Figure 4.1: Example Mutagenetic tree mixture.
measure of genetic progression that allows for comparing mutational patterns of different
viruses.
Unfortunately – apart from the case of a single event – there exists no formula for
computing the waiting time for arbitrary pattern of events. However, within the timed
mutagenetic tree model the expected waiting time can be estimated by simulating the
waiting process. To this end, the estimated probabilities P r(j) are used to compute the
required λj = P r(j)/(1−P r(j)). For a sufficiently large number of simulations one receives
a stable estimate for the waiting time Zx for the mutational pattern x with respect to the
distribution induced by the mutagenetic mixture model M . This waiting time is the genetic
progression score: GPS(x) = EM [Zx ].
Of note, the GPS of a virus is computed for every drug (mutagenetic tree model) separately. Due to the definition of E[ZS ] = 1 the waiting times cannot be compared between
drugs.
The Genetic Barrier to Drug Resistance
Instead of inspecting the past of the virus, i.e. to which state(s) in the (mixture of) tree(s)
the virus already has progressed, we can use the evolutionary model for studying the future
of the virus, i.e. to which states will it most likely move. Briefly, the genetic barrier is
defined as the probability of not escaping from a drug by developing further mutations.
With escape from a drug being defined as exceeding a predefined threshold of phenotypic
drug resistance.
In the mutagenetic tree model it is sufficient to describe the given viral sequence as
one of 2l possible mutational patterns. Given a tree topology one can now compute the
transition probability from the current pattern to any of the 2l possible patterns. If the tree
topology does not allow the transition – e.g. when loosing mutations, then the probability
is set to 0. Furthermore, using the paired genotype-phenotype data, which was used to
4.2 geno2pheno-THEO
51
train geno2pheno, one can compute the mean phenotypic resistance associated with each
mutational pattern. Due to the limited number of genotype-phenotype pairs it is likely that
some mutational patterns do not occur, in this case the weighted mean of all mutational
patterns with non-zero transition probabilities is used, with the transition probabilities
serving as weights. The genetic barrier to drug resistance is then simply defined as the
sum of all transition probabilities to mutational patterns with phenotypic resistance below
a certain threshold.
The genetic barrier to drug resistance combines evolutionary information, the transition
probabilities, with phenotypic information.
4.2 geno2pheno-THEO
This section describes a joint work with Niko Beerenwinkel, Tobias Sing, Igor Savenkov,
Martin Däumer, Rolf Kaiser, Soo-Yon Rhee, W Jeffrey Fessel, Robert W Shafer, and
Thomas Lengauer. The work was published in Antiviral Therapy under the title “Improved
prediction of response to antiretroviral combination therapy using the genetic barrier to
drug resistance” (Altmann et al., 2007a).
With about 24 drugs available and novel drugs being approved almost every year, it
becomes increasingly difficult for the treating physician to select an optimal drug combination. There are a variety of different therapeutic goals, including VL reduction, increase
in CD4+ T cell counts, minimization of adverse effects and preservation of future drug
options. Importantly, different optimization criteria will tend to favor different therapies (Jiang et al., 2003). To date, most methods predict virological response to therapy
based on the baseline genotype and the compounds in the applied combination. Specifically, artificial neural networks (Wang et al., 2003) and fuzzy rules combined with a genetic
algorithm (Prosperi et al., 2004) were used in this manner to predict the change in VL.
Related approaches include the application of case-based reasoning (Prosperi et al., 2005)
and combinatorial optimization based on expert rules (Lathrop and Pazzani, 1999).
In this section, we report new ways of analyzing treatment change episodes (TCE) and
demonstrate how these new approaches might be more useful than current methods for
prediction of response to therapy. The improvement results from incorporating genetic
analysis, phenotypic prediction, and a prediction of the probability that further evolution
of resistance will occur. The applied methodologies involve various techniques of statistical
learning. We dichotomize virological response and compare the performance of several
classifiers that predict the therapeutic success or failure of each genotype-therapy pair.
The novel features derived from genotype and drug combination encode information about
the evolutionary potential of the virus and the predicted level of phenotypic drug resistance.
Specifically, we consider predicted phenotypes (see Section 2.5.2), the activity score based
on a heuristic search over in silico mutants (see Section 4.1.1), the genetic barrier and
the genetic progression score (see Section 4.1.2). Analyzing over 6,300 TCEs observed in a
clinical setting, we show that all new descriptors significantly improve prediction of therapy
outcome, especially if they combine evolutionary and phenotypic information.
52
4 Predicting Response to Combination Therapy
start
Sequencing
> 14 days
stop
> 0 days
Therapy
(a) failure
start
start
Sequencing
< 90 days
> 0 days
> 14 days
VL < 400 cp/ml
> 0 days
Therapy
stop
> 0 days
Therapy
stop
(b) success
start
start
Sequencing
< 90 days
> 0 days
> 14 days
Therapy
VL < 400 cp/ml
> 0 days
VL < 400 cp/ml
> 56/112 days
stop
> 0 days
Therapy
stop
(c) sustained success
Figure 4.2: A treatment change episode (TCE) resulting in sequencing is always considered
to be a failure (a). If the virus is undetectable during the treatment following a
failure, then the TCE is called a success (b). In addition, the alternative definition for successful TCEs requires two subsequent viral load (VL) measurements
below the threshold with at least eight (or 16) weeks in between (c).
4.2.1 Material and Methods
Treatment change episodes
A TCE (Larder et al., 2003) consists of a baseline genotype, a drug combination, and
a binary outcome indicating success or failure of the regimen. For our analysis, valid
successful or failing TCEs were defined as follows (Figure 4.2): any available genotype
is considered as evidence of a failing regimen, because, in general, sequencing can only
be performed if the VL exceeds ∼1,000 copies/ml. Successful regimens are defined by
inspecting therapies that follow a genotype measurement. When multiple genotypes are
available, the most recent sequence sample before the onset of therapy was used. If the
VL decreases below 400 copies/ml at least once during the course of the follow-up therapy
and if genotyping was performed no earlier than three months before starting the therapy,
then the respective treatment is considered a success. This definition of success focuses on
initial response. Sustained response is also investigated by using an alternative definition
of success that requires a second follow-up VL value below the threshold at least 8 weeks
(or 16 weeks) after the first (Figure 4.2 c).
4.2 geno2pheno-THEO
53
Figure 4.3: The 6,337 analysed TCEs comprise a total of 875 distinct combination therapies. The histogram summarizes all 354 drug combinations that occur at least
three times in the dataset. Another 116 combinations appeared only twice and
405 only once. Grey bars indicate successful therapies defined as initial virological responses below 400 copies/ml. The inlet histogram shows the 20 most
abundant drug combinations.
Datasets
We analysed data obtained from the Stanford HIV Drug Resistance Database (comprising
data from clinical studies ACTG 320, ACTG 364, GART and HAVANA) and from two
Northern California clinic populations undergoing genotypic resistance testing at Stanford
University. From a total of 25,717 therapies, 10,288 sequences, and 6,706 patients, we extracted 6,337 TCEs, including 4,776 failures and 1,561 successes, according to the definition
based on initial response. The median time period between treatment change and the first
follow-up VL measurement <400 copies/ml is 38 weeks (interquartile range: 16-78 weeks).
Figure 4.3 shows the distribution of drug combinations for this dataset (denoted A). The
alternative definitions resulted in 1,082 and 900 successes for eight and 16 weeks sustained
response, respectively. Because of lack of data on the drug enfuvirtide (at time of the study
the only approved EI), we considered only regimens consisting of NRTIs, NNRTIs and PIs.
From dataset A, we generated two special subsets that are balanced with respect to
sequences (BS) and with respect to therapies (BT), respectively. The dataset BS contains
one successful and one failing regimen for each of 1,364 sequences exactly. Each genotype in
this dataset gives rise to two TCEs, defined by the regimens before and after a successful
treatment change. Similarly, in the set BT each regimen is paired with two different
sequences, giving rise to exactly one successful TCE and one failure. This selection resulted
in a total of 2,436 TCEs comprising 321 different regimens. If, for a fixed drug combination,
54
4 Predicting Response to Combination Therapy
there were more sequences resulting in failure than in success, then failures have been
selected randomly to match the number of successes (or vice versa).
Dataset A was split into two groups denoted A1 and A2 according to the clinical centre
in which the TCE was observed. Group A1 comprises 816 successes and 2,098 failures from
one health care provider, and group A2 includes the remaining 745 successes and 2,678
failures from clinical studies and other hospitals. By using, in turn, one of these subsets for
training and the other for testing, we seek to identify possible biases that might be related
to the specifics of individual clinics, such as preferred treatment protocols. Datasets BS
and BT were split in the same manner.
In addition to the clinical data, we also used in vitro data on phenotypic drug resistance.
For each drug, 880 sequences with fold-change (FC) determined by a recombinant virus
assay (Walter et al., 1999) were available. These data were used to regress phenotype on
genotype and to identify resistant states in defining the genetic barrier.
Features
The baseline approach for predicting TCE outcome uses one indicator variable for each
resistance-associated mutation and one for each drug. For this approach and approaches
based on mutagenetic trees (Section 4.1.2), we considered the resistance mutations presented in Johnson et al. (2005) resulting in 66 binary variables (49 mutation indicators, 17
drug indicators). We refer to this encoding of genotypes and therapies as the indicator representation. All other encodings contain these straightforward covariates and additional,
more elaborate, features.
The phenotype representation includes, for each drug in the respective regimen, the
predicted FC in susceptibility. These predictions are based on a linear support vector
machine that has been trained on the 880 matched genotype-phenotype pairs as described
previously (Section 2.5.2).
In the activity representation, the indicator vector is extended by the estimated activity
of the drug cocktail against the virus population. For the beam search both parameters,
both the breath and depth are set to 10.
The genetic barrier representation also adds both phenotypic and evolutionary information to the indicator representation, and it can be regarded as an advancement over the
activity score. More precisely, we used mutagenetic trees, a family of probabilistic graphical models, to estimate the order and rate of occurrence of resistance mutations. Using the
Mtreemix software1 (Beerenwinkel et al., 2005c), for each drug, a mixture model of mutagenetic trees was learned from sequences derived under regimens comprising that drug. The
number of trees was selected using the Bayesian information criterion (BIC) (Yin et al.,
2006). A validation of mutagenetic tree models in terms of tree stability and goodness of
fit has been presented in Beerenwinkel et al. (2005b). Viral escape is approximated by exceeding a predefined level of phenotypic resistance. These levels are defined by the cutoffs
listed in Table 4.1 and are exactly the (old) clinical cutoffs for geno2pheno (Section 3.2).
Unlike the sequence space search for low-activity mutants, the genetic barrier accounts
for the fact that not all mutations are equally likely to occur. This is also an advantage
over simple counting of resistance mutations, a frequently employed approximation to the
1
http://mtreemix.bioinf.mpi-sb.mpg.de/
4.2 geno2pheno-THEO
drug
cutoff
trees
ZDV
30.0
5
ddC
2.2
2
ddI
2.4
4
d4T
2.0
4
3TC
15.4
5
ABC
3.4
7
TDF
2.1
3
drug
cutoff
trees
SQV
4.5
4
IDV
4.6
4
RTV
2.6
4
NFV
5.8
6
APV
12.0
3
LPV
10.0
5
ATV
4.2
2
NVP
9.0
5
DLV
9.7
2
55
EFV
7.0
4
Table 4.1: Resistance cutoff used for each drug and number of trees in the mutagenetic
tree mixture model.
genetic barrier.
Finally, the genetic progression score (GPS) involves only evolutionary information that
is extracted from the mutagenetic tree models. The GPS of a genotype is defined as the
expected waiting time for the mutational pattern to occur. Thus, the GPS also accounts
for different probabilities of different mutations, but it does not include any phenotypic
information.
Statistical Learning Methods
The five feature sets defined the input to several different machine-learning techniques,
which were used to predict treatment response. We selected several standard classification
methods (Hastie et al., 2001, pp.79-114), including linear discriminant analysis (LDA),
which was used in Beerenwinkel et al. (2003b), together with the activity representation,
least-squares regression, linear support vector machines (SVMs), decision trees (C4.5 software2 ), and logistic regression. We also included the more recent method of logistic model
trees (LMT), which combine decision trees with logistic regression in the leaves of the
tree (Landwehr et al., 2005).
Receiver Operating Characteristic Analysis
We used receiver operating characteristic (ROC) curves to compare the predictive power
of classifiers. ROC curves arise from varying a parameter of the classifier that controls
the trade-off between sensitivity and specificity. Each point on the ROC curve represents
one classifier, and the curve allows for reading off its false positive and true positive rate.
For example, the point (0.1, 0.8) represents a classifier that will falsely predict 10% of
failing regimens as “success”, but correctly detect 80% of the successful regimens. The
comparison of ROC curves is preferred to comparing error rates, because it corrects for
skewed class distributions (as in dataset A) and controls both sensitivity and specificity of
the classifier (Brun-Vézinet et al., 2004). The area under the ROC curve (AUC) is used
as a summary performance measure. The trivial classifier that makes random predictions
produces a linear ROC curve with an AUC of 0.5. The maximum AUC a classifier can
achieve is 1.0 and the larger this value, the better its prediction performance. AUC values
were compared using Wilcoxon rank-sum tests. The ROCR software3 was used for ROC
analysis (Sing et al., 2005a).
2
3
http://www.rulequest.com
http://bioinf.mpi-sb.mpg.de/projects/rocr/
56
4 Predicting Response to Combination Therapy
4.2.2 Results
Table 4.2 shows the performance of all learning techniques in combination with all feature
encodings for the complete dataset A using the initial response definition of therapeutic
success. Classifier performance was estimated by 10-fold cross-validation and is reported as
the AUC. All of the proposed extensions of the indicator representation improved predictive power (Table 4.2, last column). This improvement was observed across all statistical
learning methods. The additional features activity score and GPS yield the same predictive power on average. The largest improvement is achieved by the genetic barrier and by
phenotype predictions, which both use phenotypic information for prediction. Compared
with the feature encoding, the choice of the learning method has only a small effect. On
average, the AUC differs by as much as 0.064 (7.7%) between different feature encodings,
but only by 0.027 (3.2%) between different learning methods. The AUC of the phenotype
representation decreases if combined with decision trees. However, as illustrated in Figure 4.4, the ROC curves reveal that this difference is mainly due to lower true positive
rates at very high false positive rates, which might not be relevant for practical purposes.
We restrict the further analysis of ROC curves to the LMT learning method, which, on
average, outperformed all other techniques (Table 4.2).
In Figure 4.2, the ROC curves for the five different feature encodings used with LMT on
dataset A are shown. The indicator representation (black line) is improved significantly
by all four additional features (p<0.0002). This advancement is most prominent for the
genetic barrier that incorporates phenotypic information indirectly (p=0.0002), followed
by the predicted phenotype, the activity score and the GPS. If we accept a false positive
rate of 10%, then the indicator representation will detect 65% of the successful TCEs
correctly (represented by the point [0.1, 0.65] on the black line in Figure 4.2), whereas the
genetic barrier representation achieves an accuracy of 76.6%. The accuracy can be further
increased by accepting more false positives. For example, if 90% of the successes were to
be detected, we would have to accept 19.5% false positives for the genetic barrier or the
phenotype encoding, and 35.5% for the indicator representation. The differences between
the four encodings are even more strongly articulated in the analysis of the two balanced
subsets of the complete dataset.
In the dataset BS, each viral genotype is paired with a drug combination that gave rise
to a successful TCE and with another TCE that resulted in failure. Thus, the genotype
alone does not provide any information on the outcome of these TCEs. As might have
been expected, the GPS does not improve the predictive power on this dataset, because
this feature is derived only from the genotype (Figure 4.5 a). By contrast, the remaining
three encodings, namely activity, genetic barrier, and phenotype, enhance the performance
significantly (p<0.004). The genetic barrier encoding outperforms the activity encoding,
and the phenotype representation provides the best encoding on this dataset. However,
the difference between genetic barrier and phenotype is negligible (p>0.9).
In the dataset BT, the therapies (rather than the sequence, as in BS) are balanced. For
every therapy there exists the same number of genotypes that gave rise to successful and to
failing TCEs. Here, the drug combination alone does not provide any information on the
outcome of the TCEs. Usage of the GPS on this dataset increases the performance of the
indicator representation to the level reached with the activity representation (Figure 4.5 b).
LSR
0.825 (0.005)
0.911 (0.004)
0.870 (0.005)
0.892 (0.003)
0.856 (0.005)
0.871 (0.015)
SVM
0.819 (0.009)
0.910 (0.004)
0.864 (0.006)
0.891 (0.005)
0.864 (0.005)
0.870 (0.015)
C4.5
0.845 (0.005)
0.839 (0.005)
0.842 (0.007)
0.875 (0.005)
0.861 (0.005)
0.852 (0.007)
LOGR
0.820 (0.007)
0.912 (0.002)
0.868 (0.003)
0.891 (0.002)
0.868 (0.005)
0.872 (0.015)
LMT
0.868 (0.004)
0.905 (0.004)
0.898 (0.004)
0.916 (0.005)
0.899 (0.004)
0.897 (0.008)
Mean
0.834 (0.008)
0.898 (0.012)
0.868 (0.007)
0.893 (0.005)
0.867 (0.007)
Table 4.2: Classifier performance on the full dataset A. The table displays, for all combinations of feature encodings (rows) and learning
techniques (columns), the resulting area under the receiver operating characteristic (ROC) curve (AUC) and its standard error (in
parentheses) computed using 10-fold cross-validation. C4.5, C4.5 software; GPS, genetic progression score; LDA, linear discriminant
analysis; LMT, logistic model trees; LOGR, logistic regression; LSR, least-squares regression; SVM, linear support vector machines.
Indicator
+ Phenotype
+ Activity
+ Genetic barrier
+ GPS
Mean
LDA
0.825 (0.005)
0.912 (0.004)
0.868 (0.005)
0.891 (0.005)
0.856 (0.006)
0.870 (0.015)
4.2 geno2pheno-THEO
57
4 Predicting Response to Combination Therapy
0.6
0.4
indicator
+ phenotype
+ phenotype (C4.5)
+ activity
+ genetic barrier
+ GPS
0.0
0.2
True positive rate
0.8
1.0
58
0.0
0.2
0.4
0.6
0.8
1.0
False positive rate
Figure 4.4: ROC curves for the complete dataset A using LMT (unless stated otherwise
in the legend) and 10-fold cross-validation. Every feature encoding is represented by a receiver operating characteristic (ROC) curve, namely the baseline
indicator representation (indicator), and the following additional features: predicted phenotypes, activity score, genetic barrier, and genetic progression score
(GPS). Each point on the curve represents a classifier and allows determining
its true positive rate and false positive rate. For example, the point (0.355, 0.9)
on the black line represents a logistic model trees (LMT) classification model
trained on the plain indicator representation with an expected 35.5% of false
positives and 90% of true positives.
0.6
0.8
1.0
59
indicator
+ phenotype
+ activity
+ genetic barrier
+ GPS
0.0
0.0
0.2
0.4
True positive rate
0.6
0.4
indicator
+ phenotype
+ activity
+ genetic barrier
+ GPS
0.2
True positive rate
0.8
1.0
4.2 geno2pheno-THEO
0.0
0.2
0.4
0.6
False positive rate
(a) Dataset BS
0.8
1.0
0.0
0.2
0.4
0.6
0.8
1.0
False positive rate
(b) Dataset BT
Figure 4.5: ROC curves for dataset BS (a) and dataset BT (b) using LMT and 10-fold crossvalidation. BS referes to “balanced with respect to sequences” and likewise BT
refers to “balanced with respect to therapies”. Fur further details see caption
of Figure 4.4.
Similarly to dataset BS, application of the phenotypic and the genetic barrier representation results in maximal performance. Here, the genetic barrier encoding outperforms the
phenotype encoding, but the difference does not reach statistical significance (p=0.1655).
In the interesting region of low false positive rates (<30%), however, the difference can be
substantial. For example, at a false positive rate of 10% the indicator approach recognizes
28% of the successful TCEs correctly, GPS yields a true positive rate of 54.7%, using the
activity score increases true positives to 54.5%, the phenotype representation achieves almost 58%, and the genetic barrier recognizes as many as 64.9% of successes correctly. This
rate more than doubles the precision of the indicator encoding alone.
In order to investigate possible biases in the estimated models resulting from the specific
clinical centres which collected the TCE data, we used the split of dataset A into A1
and A2. The TCEs in A1 originate from a single health care provider, but those in A2
stem from several different clinical studies and hospitals and hence are expected to be less
homogeneous. When A1 and A2 are used separately in the same cross-validation procedure
described above for the pooled dataset A, then the predictive performance is similar in both
cases (AUC of 0.898 for A1 and 0.906 for A2). If A1 (A2) is used for training and A2 (A1)
for testing, we estimate an AUC of 0.837 (0.875). The performance loss due to separation
of the data by clinics was slightly more pronounced for the balanced dataset BS than for
BT (data not shown).
All of the reported results remain qualitatively unchanged when therapeutic success is
defined by a more sustained response over at least eight or 16 weeks instead of by initial
response. Using the alternative definitions we re-calculated the AUC values for all feature
60
4 Predicting Response to Combination Therapy
encodings with LMT on all datasets. None of the derived performance measures showed
any significant difference compared with the initial definition.
4.2.3 Discussion
Given the increasing number of possible drug combinations and the genetic diversity of
HIV, it is unlikely that simple hand-crafted rules will capture the complex interplay between
drug cocktails and mutational patterns that determine response to antiretroviral therapy.
Thus, statistical and computational approaches are required for optimal use of the available
drugs in each individual patient. Here we have analysed the ability of various statistical
learning techniques in combination with different feature encodings to predict therapy
outcome from the baseline genotype and the applied drug combination.
The viral genotype is only one of many patient-specific characteristics, such as immune
status or genetic predisposition, and it is straightforward to extend our approach to situations where additional parameters are available (see Chapter 5). Nevertheless, the viral
genotype has a prominent role among those covariates as it encodes the structure of the
target proteins. The challenge in using these genetic data is to predict the evolution of
the virus population under drug pressure and to understand the complex relationship between mutational patterns and in vivo drug resistance. We have addressed these issues
by considering, in addition to drug and mutation indicators, features that make use of
phenotypic and evolutionary information. Our computational experiments have identified
the predicted phenotype and the genetic barrier to drug resistance as the most beneficial
features, significantly boosting the predictive power of all classifiers. The choice of the
learning technique had less impact, but LMT consistently showed an advantage over the
other methods. In general, the representations that incorporate in vitro phenotype predictions yielded better performance than the GPS, a purely evolutionary feature, but both
sources of information led to improvements among all tested classifiers and datasets.
The two subanalyses of the complete TCE dataset A showed opposing results. Whereas
the performance was increased compared with dataset A for the balanced sequences (BS),
it was decreased for the balanced therapies (BT). In the case of the BS data, classifiers need
to learn the differences between drug combinations conditioned on the genotype. However,
in effect, with this dataset the dependence on genotype is largely masked by the differing
application profiles of regimen use accumulated in the dataset. For example, lopinavir
appeared in only 38 failing regimens and in 284 successful follow-up therapies. Likewise,
the combination of zidovudine and didanosine defined 60 failures, but not a single success.
This is because the underlying clinical cohort data reflects the historical approval and use
of drugs. For example, the combination of zidovudine and didanosine has been available
for some time and many patients have received it until failure. This is easy to learn
from the data, but the generalization that this combination always fails, although strongly
supported by the data, is certainly wrong and thus misleading when evaluating treatment
options containing zidovudine and didanosine. This problem is eliminated with the other
subset, BT, where classifiers learn differences between mutational patterns conditioned on
the regimen. Because, in learning therapy outcome, we aim to generalize the mutation-drug
interactions and not to reconstruct historical drug-use patterns, we regard the performance
using dataset BT as a more genuine estimate of our ability to predict treatment response.
4.2 geno2pheno-THEO
61
We have addressed the concern that the learned models might be biased by the specific
sampling of patients by analysing TCEs from different clinics separately. We found a small
decrease in predictive performance that was more pronounced when the more homogeneous
subset A1 was used for training. This finding highlights the importance of including data
from many different clinical centers. The performance loss was greater for the BS dataset
than for the BT dataset indicating that the patterns of drug use can vary among clinics.
Larger studies that include sufficient data from several clinics need to investigate this source
of bias in more detail in the future.
In the present study, we have dichotomized therapy response into “success” and “failure”. However, with response defined as the change in VL after a certain time, regression
methods can be used for learning in a similar way, and this is unlikely to affect the results
presented here. Furthermore, most classifiers predict a score (in fact, often a probability)
rather than only the class label. Thus, for a given genotype, they can be used for ranking
drug combinations with respect to the expected therapeutic success. In the future, ranking systems could evolve into valuable tools for supporting the complex decision-making
process of clinicians. However, such systems will have to provide an interface that allows
physicians to incorporate their prior knowledge on a given case, and to constrain the range
of possible regimens by explicitly excluding or including specific drugs. In this way, it is
possible to spare drugs or drug classes, limit total pill burden, or react to adverse effects,
while being able to identify the most promising options from the remaining regimens. The
development of improved therapy rankers will crucially depend on their public availability,
which allows experts to identify strengths and weaknesses of different approaches. For this
reason, we have implemented the prototypical therapy ranker THEO (THErapy Optimizer,
Figure 4.6). It is freely accessible for research purposes as part of the geno2pheno web site4 .
In order to produce a ranking of therapies, geno2pheno-THEO (g2p-THEO) applies the
classifier trained for predicting therapy response to the sequences of PR and RT provided
by the user and a predefined set of therapies (consisting of any combination of either two
PIs or two NRTIs plus one NNRTI or one PI). The score produced by the classifier for every
combination of drugs is the expected therapeutic success, which is used to rank the considered therapies. In particular, g2p-THEO applies the LMT classifier, which was trained
on dataset BT using the genetic barrier encoding as input, to derive the scores needed for
the ranking. This selection of feature encoding and statistical learning method was based
on the cross-validation results obtained on dataset BT. Figure 4.6 depicts the results of a
g2p-THEO analysis of a genotype from a patient failing treatment after first-line therapy
with abacavir, lamivudine and efavirenz (for a list of mutations, see caption of Figure 4.6).
The first two regimens proposed by g2p-THEO are double PI therapies (saquinavir and
lopinavir/ritonavir, and amprenavir and lopinavir/ritonavir) with a predicted probability
of virological success of over 70%.
As discussed above, the statistical model used by g2p-THEO has some limitations. Thus,
suggestions made by g2p-THEO must be used with care. Moreover, the ranking displayed
by g2p-THEO might appear counterintuitive, because it is based on confidence scores
provided by the statistical learning method and the single criterion for therapy success is
reduction of VL below a threshold. Hence, regimens that are highly active tend to occur
among the top-ranked therapies. Although these regimens are most likely to reduce the
4
http://www.geno2pheno.org
62
4 Predicting Response to Combination Therapy
Figure 4.6: The g2p-THEO applet for selecting and evaluating drug combinations. The
applet allows for limiting the number of drugs that can be part of a therapy
(No. of drugs). Furthermore, the daily burden (No. of pills per day) can be
limited. The number of compounds from each drug class can be set (for example, 1≤NRTIS≤3), and the use of single drugs can be enforced (for example,
3TC=include) or excluded. Pushing the “Compute” button ranks all remaining
therapies according to the underlying prediction model. In the resulting table,
the components of the regimens are listed (Regimen), the number of pills, how
many pills of every compound are included in the regimen (Comment), and
the score calculated by the model (Success). Additionally, the predictions are
compared in a histogram between all selected (according to the constraints specified by the user; red bars) and all possible therapies (black bars). The figure
shows the result of a g2p-THEO analysis of a genotype from a patient failing treatment after first line therapy with abacavir, lamivudine and efavirenz.
The sequence (subtype B) contained the following mutations (compared with
the reference strain HXB2): PR V3I, E35D, S37N, L63P and A71T and RT
L74V, K102N, K103N, Y115F, E122K, D123E, D177E, I178V, V179D, M184V,
G190A, T200A, R211K, L214F, H221Y, T286A, E297R and S322T.
4.3 Clinical Validation
63
VL below the threshold necessary for classifying a TCE as a success, less active regimens
might be a better choice for preserving future drug options and might therefore be ranked
higher by clinicians (Jiang et al., 2003). Thus, we emphasize that the purpose of therapy
rankers will always be to support, and not to replace, the complex decision-making process
of clinicians.
In future work, further improvement of the genetic barrier representation might be
achieved by estimating mutagenetic trees for combinations of drugs instead of single drugs.
Such trees would model evolutionary pathways to multi-drug resistance and handle the interplay of drugs within the applied regimen. However, this approach is currently infeasible
due to the lack of sufficient data. We also emphasize that the drug-wise computation of the
genetic barrier, as employed here, facilitates the incorporation of a new drug, because only
one additional mutagenetic tree needs to be learned. Moreover, to a first approximation,
this tree can be learned from in vitro and clinical study data even prior to approval. The
most promising avenues to increased prediction accuracy and improved therapy ranking are
via larger datasets, the use of additional parameters (such as CD4+ T cell counts, plasma
levels of drugs, viral replication capacity, and host genetic factors), analysis of previous
sequences of the viral population that represent important background information, and a
profound understanding of viral evolutionary escape from drug pressure.
In the following section, we provide a retrospective validation of g2p-THEO using data
from European patient cohorts. These type of validations are used to ensure that interpretation approaches are not overfitted to a specific patient cohort. Furthermore, validations
facilitate the comparison between different approaches on independent data, and thereby
allow for objectively rating their performance.
4.3 Clinical Validation
This section describes a joint work with Martin Däumer, Niko Beerenwinkel, Yardena
Peres, Eugen Schülter, Joachim Büch, Soo-Yon Rhee, Anders Sönnerborg, W. Jeffrey Fessel, Robert W. Shafer, Maurizio Zazzi, Rolf Kaiser, and Thomas Lengauer. The work was
published in the Journal of Infectious Diseases under the title “Predicting the Response to
Combination Antiretroviral Therapy: Retrospective Validation of geno2pheno-THEO on
a Large Clinical Database” (Altmann et al., 2009a).
In this section, we present the external validation of g2p-THEO in a dataset containing
7600 treatment-sequence pairs collected in a Europe-wide effort (Aharoni et al., 2007).
This validation is of similar nature as the split of dataset A into A1 and A2 (based on the
center where the data was collected) presented in the previous section. Virological response
was dichotomized, and performance was compared with three state-of-the-art expert-based
interpretation tools. In subsequent analyses, various techniques of statistical learning were
applied to (1) assess the putative improvement in prediction accuracy incurred by applying
models for specific drug combinations and (2) investigate the reliability of g2p-THEO when
applied to unseen combinations of compounds - that is, those combinations that are not
contained in the training dataset.
64
4 Predicting Response to Combination Therapy
4.3.1 Material and Methods
Treatment Change Episodes (TCEs)
The present study used the previously introduced definition of a TCE (Altmann et al.,
2007a), on which g2p-THEO is based (Figure 4.2). As opposed to the previous analysis
we do not make use of the sustained response. Although current assays have a threshold
of sensitivity of 40 or 50 copies, the 400-copy threshold was used to include data obtained
by earlier assays.
Datasets
The statistical model applied by g2p-THEO was trained on data obtained from the Stanford HIV Drug Resistance Database (Rhee et al., 2003) (comprising data from clinical
studies ACTG 320, ACTG 364, GART, and HAVANA) and from two northern California
clinic populations undergoing genotypic resistance testing at Stanford University. From a
total of 25,717 therapies, 10,288 sequences, and 6706 patients, 6359 TCEs were extracted
(4776 failing and 1583 successful therapies). This dataset is hereafter called “StanfordCalifornia” (A in the previous section). Overrepresentation of certain compounds in failing
or successful therapies within the Stanford-California dataset led the statistical learning
models to often base their decisions only on the drug combination, irrespective of the genotype. To eliminate this artifact, g2p-THEO was trained not on the full dataset but on a
subset that contained the same number of failure- and success-associated genotypes for
every drug combination (see Section 4.2 for details). The number of genotypes per drug
combination ranges from two to 446 (2478 TCEs in total). Hereafter, this dataset is called
“Stanford-CaliforniaBT” (for “balanced therapies”). In some analyses, both the StanfordCalifornia and the Stanford-CaliforniaBT datasets were evaluated again after removal of
ZDV+3TC+IDV combination therapy, which was overrepresented because of the inclusion of the large ACTG 320 dataset. For further analysis, a subset of Stanford-California
containing only drug combinations with ≥ 20 successes and ≥ 20 failures was selected.
Only six treatments met this requirement (Table 4.3); hereafter, this dataset is called
“Stanford-California6”.
Using the same definition, an independent TCE dataset (EuResistDB) was extracted
from the EuResist integrated database (version 2007/05/29), comprising data from Germany (Arevir) (Roomp et al., 2006), Italy (ARCA) (De Luca et al., 2006), and Sweden
(Karolinska Institute). For more details on the EuResist database see Chapter 5. From
a total of 58,195 therapies, 19,258 sequences, and 16,999 patients, we obtained 7603 TCEs
(6217 failing and 1386 successful therapies). For further analysis, we generated the subset EuResistDB6, comprising the same six drug combinations as in Stanford-California6
(Table 4.3). Because the EuResist integrated database includes antiretroviral treatments
started from 1990 to 2007, the main analysis was repeated on the TCE subset derived
from treatments started after 31 December 2000, to minimize the contribution of obsolete
therapy records featuring e.g. non-boosted protease inhibitors.
4.3 Clinical Validation
Regimen
ZDV+3TC+IDV
d4T+3TC+SQV/r
ddI+d4T+EFV
d4T+3TC+EFV
ddI+d4T+NFV
d4T+3TC+NFV
All regimens
Stanford-California6
Failure Success Total
223
229
452
71
28
99
96
54
150
60
27
87
110
28
138
275
28
303
835
394 1229
65
EuResistDB6
Failure Success Total
108
5
113
27
2
29
132
25
157
114
16
130
131
22
153
209
16
225
721
86
807
Table 4.3: Distribution of failing and successful treatment change episodes for the 6
most common regimens in the Stanford-California and EuResistDB datasets
(Stanford-California6 and EuResistDB6 subsets).
ANRS
Rega
Stanford HIVdb
1.0
susceptible
susceptible
susceptible, potential
low-level resistance
Score
0.5
possible resistance
intermediate
low-level resistance,
intermediate resistance
0.0
resistant
resistant
high-level
resistance
Table 4.4: Mapping of the classification given by different expert-based interpretation systems to continuous values.
Interpretation systems
ANRS (Meynard et al., 2002, version 2006/07), Rega (Van Laethem et al., 2002, version
7.1.1), and Stanford HIVdb (Rhee et al., 2003, version 4.3.0) are expert-based interpretation methods. These algorithms apply carefully handcrafted interpretation rules or tables
derived by expert panels from the analysis of available in vitro and in vivo resistance data.
In addition, some rules of the ANRS algorithm were derived from the statistical association
between baseline genotypic data and virological response. The classification generated by
the interpretation systems was normalized into a score by mapping the verbal classification
to a score according to Table 4.4. Thus, the rating numerically represents the activity of a
drug against the virus on a scale ranging from 0.0 (inactive) to 1.0 (fully active). Individual
scores for NNRTIs and for boosted PIs computed using the Rega algorithm were converted
to 1.0, 0.25, and 0.0 and to 1.5, 0.75, and 0.0, respectively, as indicated by the algorithm
developers. The treatment score or genotypic susceptibility score (GSS) (De Luca et al.,
2004) was then defined as the sum of single-drug scores for the compounds included in the
regimen.
g2p-THEO is a data-driven interpretation system that directly computes a rating for a
combination therapy. This value can be interpreted as the probability of the viral load
being reduced to below the limit of detection during the course of therapy. g2p-THEO
represents the HIV-1 genotype by 49 indicator variables, each of them indicating the presence (1) or absence (0) of a resistance mutation (Johnson et al., 2005) (the complete
66
4 Predicting Response to Combination Therapy
list: for protease, 10F/ I/R/V, 16E, 20M/R/I, 24I, 30N, 32I, 33F/I/V, 36I/L/V, 46I/L,
47V/A, 48V, 50L/V, 53L, 54L/V/M/T/A/S, 60E, 62V, 63P, 71V/T/I/L, 73S/ A/C/T,
77I, 82A/F/T/S, 84V, 88D/S, 90M, and 93L; for reverse transcriptase, 41L, 44D, 62V, 65R,
67N, 70R, 74V, 75T/M/S/A/I, 77L, 100I, 103N, 106A/M, 108I, 115F, 116Y, 118I, 151M,
181C/I, 184V/I, 188C/L/H, 190S/A, 210W, 215F/Y, and 219Q/E). Similarly, treatment
is encoded using 17 indicator variables, each representing the presence (1) or absence (0)
of a compound in the regimen (for the list of considered compounds, see Table 4.1). In
addition, viral evolution during the course of therapy is represented by the genetic barrier
to drug resistance (Beerenwinkel et al., 2005a) for all drugs in the regimen. The genetic
barrier is the probability that the virus will remain susceptible under drug pressure, given
as a numerical value between 0.0 (no genetic barrier [i.e., the virus is expected to become
resistant]) and 1.0 (insurmountable genetic barrier [i.e., the virus is expected to remain
susceptible]). Together with the indicator variables, the genetic barrier (one value per
drug) is used as input to the logistic model tree (LMT) (Landwehr et al., 2005) applied by
g2p-THEO to compute a score for a combination therapy.
Receiver Operating Characteristic (ROC) Curves
ROC curves depict classifier performance by giving a true-positive rate (TPR; percentage
of correctly predicted successes) for every false-positive rate (FPR; percentage of failing
therapies [i.e., with a genotype obtained during treatment] that were predicted to be successful [i.e., to decrease viral load to <400 copies/ml]). The area under the ROC curve
(AUC) summarizes the performance and is a convenient measure for comparing scoring systems without the need to provide a particular cutoff (Brun-Vézinet et al., 2004). Briefly,
the AUC is a value between 0.0 and 1.0 corresponding to the probability that a randomly
selected success receives a higher score than a randomly selected failure (Fawcett, 2006).
The ROCR software (version 1.0-2) (Sing et al., 2005a) was used for the ROC analysis.
Comparative Analysis
The statistical model applied by g2p-THEO was used to predict the outcome of the
genotype-therapy pairs in the external EuResistDB dataset. The performance was compared with that of the three expert-based interpretation tools: ANRS, Rega, and Stanford
HIVdb. The EuResistDB dataset was used as an independent test set. One hundred bootstrap replicates of the EuResistDB dataset were used for computing standard deviations.
Stability
To further analyze the robustness of the approach, the prediction of response to drug
combinations that were not present in the training data was simulated by a variant of
cross-validation on the training data. In standard cross-validation, the available data are
split randomly into n equally sized non-overlapping subsets. Then, n − 1 pooled subsets
are used as a training set, and the remaining subset is used as a test set to compute
the performance of the model. This procedure is repeated n times. Hence, every subset
is used as a test set once. To simulate the prediction of unseen drug combinations, the
splits were not carried out randomly, and every subset contained only TCEs with the same
4.3 Clinical Validation
Analysis, setting
Stability analysis
All
Regimen-specific models
Regimen specific
All
Kernel
C
Linear
0.1
1
Linear
Linear
0.1
0.1
C optimizing AUC in 5-fold CV
4 (optimized AUC in 10-fold CV)
67
Table 4.5: Support vector machine settings. C is the cost factor, CV refers to crossvalidation.
drug combination. Results derived by this “therapy-fold” cross-validation were compared
with results of standard cross-validation, with the number of folds equal to the number of
different drug combinations. A substantial loss in performance (e.g., a loss of 0.1 in AUC)
for the therapy-fold cross-validation compared with the normal protocol indicates unstable
behavior with respect to unseen drug combinations, because it indicates the requirement to
include examples of the drug combination that should be predicted in the training data for
maintaining the performance. The experiment was repeated on the special subset StanfordCaliforniaBT and on the complete EuResistDB dataset. Because of limited computational
resources, the LMT applied by g2p-THEO was replaced by the faster linear support vector
machines (SVMs) (Chang and Lin, 2001), with similar predictive performance (see e.g.
Table 4.2). SVM settings are listed in Table 4.5.
Regimen-specific Models
For some drug combinations, many samples were available. Models trained exclusively on
TCEs for one drug combination are expected to predict response to that regimen more
accurately than models trained on all available TCEs. The Stanford-California6 subset
contained sufficient data for training 6 regimen-specific models and for measuring their
performance. The performance of the individual models was assessed by 10 repetitions of
five-fold cross-validation on data comprising only one drug combination. The performance
of the full model was assessed by 10 repetitions of a variant of five-fold cross-validation in
which all TCEs with other drug combinations were added to the four subsets forming the
training data. SVMs with a linear kernel were used as a statistical learning method. SVM
settings are listed in Table 4.5. Performance in an independent test set was assessed by
predicting the outcome of TCEs in EuResistDB6 with the full model and the six regimenspecific models (10 repetitions). Results were compared with those obtained using g2pTHEO and the three expert-based interpretation tools.
4.3.2 Results
Comparative Analysis
Figure 4.7 depicts the ROC curves for the three expert-based interpretation tools and
g2p-THEO. The curves for Stanford HIVdb, ANRS, and Rega did not differ substantially,
resulting in comparable TPRs and FPRs for every GSS cutoff. In contrast, in the FPR
range from 0% to 40%, the curve for g2p-THEO was distinctly located above all the other
68
4 Predicting Response to Combination Therapy
curves. For higher FPRs, all curves proceeded with similar slopes. The AUC value for
g2p-THEO was significantly larger than those for the expert-based approaches (p < 0.001;
paired Wilcoxon test). ROC curves allow for a detailed analysis of specific points on the
curve. For example, at a FPR of 20% (close to a GSS cutoff of 2.5 for Rega and ANRS
and of 2.0 for Stanford HIVdb), Stanford HIVdb, ANRS, and Rega yielded TPRs of 44.2%
(±3.2%), 47.8% (±2.7%), and 46.5% (±2.2%), respectively. In contrast, at the same FPR
g2p-THEO achieved a TPR of 64.0% (±1.6%). On the other hand, at a false-negative rate
(FNR) of 10% (close to a GSS cutoff of 1.0 for all systems), Stanford HIVdb, ANRS, and
Rega yielded true-negative rates of 54.1% (±2.0%), 51.0% (±2.1%), and 55.9% (±0.8%),
respectively, compared with 54.7% (±1.8%) for g2p-THEO. Restriction of the EuResistDB
dataset to therapies started after 31 December 2000 led to an even more pronounced
difference in AUC between g2p-THEO (0.824±0.006) and the expert-based approaches
(for Stanford HIVdb, 0.728±0.008; for ANRS, 0.733±0.008; for Rega, 0.754±0.007).
Stability
Table 4.6 summarizes the results of the stability analysis. The standard cross-validation
performance on the Stanford-California dataset was in line with previously computed results (Table 4.2). For both Stanford-California datasets containing the highly prevalent
ZDV+3TC+IDV regimen, the therapy-fold cross-validation yielded slightly worse results
than the standard protocol (difference in AUC of ∼ 0.03). In contrast, in the EuResistDB
dataset and both Stanford-California datasets without ZDV+3TC+IDV, no loss in performance was observed.
Regimen-specific Models
Table 4.7 shows the mean AUC and standard deviation for the regimen-specific models and the full model for the six most common drug combinations in the StanfordCalifornia dataset. The results for the combination of d4T+3TC+SQV/r were the worst
for both models, and there was no benefit in using the regimen-specific model. However, for the remaining five drug combinations, the benefits of regimen-specific models
ranged from 0.017 to 0.048 and reached statistical significance. Within the EuResistDB6
dataset, an insufficient number of successful TCEs was available for ZDV+3TC+IDV
and d4T+3TC+SQV/r (Table 4.3). The full model outperformed the regimen-specific
model only for d4T+3TC+EFV. For the remaining three drug combinations, the benefit
of the regimen-specific model was more pronounced than in the cross-validation setting and
ranged from 0.046 to 0.301 for d4T+3TC+NFV, a combination for which the full model
actually failed to make useful predictions. The LMTs in g2p-THEO also outperformed
the regimen-specific model for d4T+3TC+EFV. In the remaining three cases, the benefit
of the regimen-specific models ranged from 0.011 to 0.082. All regimen-specific models
outperformed the expert-based methods.
Figure 4.8 depicts the ROC curves for the regimen-specific models (g2p-THEO-SVMRS), the full model (g2p-THEO-SVM), g2p-THEO, and the expert-based methods applied
to the EuResistDB6 dataset. The full model performed better than the expert-based
methods in the area below a FPR of 32% but worse in the remaining region. As in
the previous ROC plot (Figure 4.7), g2p-THEO performed better than the expert-based
4.3 Clinical Validation
69
Figure 4.7: Receiver operating characteristic (ROC) curves for the EuResistDB dataset.
Every method is represented by a single ROC curve, namely Stanford HIVdb,
ANRS, Rega, and geno2pheno-THEO (g2p-THEO). Each point on the curve
represents a classifier with a different cutoff and allows the true-positive rate
(TPR) and false-positive rate (FPR) for that cutoff to be determined. Whiskers
indicate the standard deviations of the TPRs at a specific FPR. The genotypic
susceptibility score (GSS) and predicted success probability (p) cutoffs leading
to specific TPR and FPR values are indicated within the plot for expert-based
approaches and g2p-THEO, respectively. For each method, the area under the
ROC curve and its standard deviation are given parenthetically in the box.
interpretation tools in the area below a 50% FPR and performed as well in the remaining
region. However, the regimen-specific models outperformed the other methods over the
whole range of FPRs. More specifically, they yielded a TPR of 58.4% at a FPR of 20%,
compared with 39.8% for the expert-based methods and 53.6% for g2p-THEO.
4.3.3 Discussion
Validation of HIV genotype interpretation systems is a crucial step in translating computerbased methods into clinically effective treatment decision support tools. In the present
study, using an external dataset of ≈ 7600 TCEs extracted from the EuResist integrated
database, the recently developed g2p-THEO system was shown to outperform the three
most widely used expert-based interpretation systems. Although the EuResist dataset
included many obsolete therapies because of its long observation period, the same results
70
4 Predicting Response to Combination Therapy
AUC
Dataset
Stanford-California
With ZDV+3TC+IDV
Without ZDV+3TC+IDV
Stanford-CaliforniaBT
With ZDV+3TC+IDV
Without ZDV+3TC+IDV
EuResistDB
Drug combinations,
no.
Standard
cross-validation
Therapy-fold
cross-validation
876
875
0.901
0.898
0.870
0.895
323
322
712
0.838
0.812
0.847
0.812
0.813
0.844
Table 4.6: Area under the receiver operating characteristic curve (AUC) values obtained
by standard cross-validation and therapy-fold cross-validation. The rows of the
table correspond to the five datasets, with both Stanford-California datasets
studied with and without the ZDV+3TC+IDV drug combination, in light of the
overrepresentation of this regimen caused by the ACTG 320 data. (StanfordCaliforniaBT is the “balanced therapies” subset.) Note that computation of
the AUC requires positive and negative samples, but because of the nature of
therapy-fold cross-validation, it could not be ensured that positive and negative samples were present in every subset of the cross-validation. Thus, it was
not possible to compute fold-wise AUC values or their standard deviations and
statistical significance.
were confirmed when therapies started before 1 January 2001 were removed. The g2pTHEO system was more accurate than ANRS, Stanford HIVdb, and Rega by 16.2% 19.8% in the detection of therapeutic success (20% FPR). However, all of the systems
were comparable in detecting treatment failure at a 10% FNR. This finding suggests that
expert-based systems are better suited to detect the failure of therapy than to detect
success, probably because their original purpose was to detect resistance to individual
drugs. However, whether a treatment will most likely be successful is exactly the response
a user expects from a decision support tool. Current expert-based approaches are indeed
evolving into clinically oriented tools aimed at building effective combination regimens.
Computing a regimen GSS by simple summation of the individual drug scores derived by
expert-based systems fails by definition to weight both different drug potencies and drug
interaction effects. However, such an unweighted GSS is still commonly used (Maggiolo
et al., 2007; Cozzi-Lepri et al., 2007) in the absence of any agreed-upon standard for a
weighted GSS. Notably, the latest Rega algorithm has introduced arbitrary drug weights
in an attempt to account for the expected increased potency of ritonavir-boosted PIs and
a lack of intermediate NNRTI activity.
The superior performance of g2p-THEO may have derived from two factors. First, the
calculated genetic barrier provides useful information by estimating the probability of viral
evolution under drug pressure (Altmann et al., 2007a). Second, the training process assigns
weights to all drugs. Hence, during the decision making process, drugs are not treated
equally. As shown by Altmann et al. (2007b, 2009b), this can significantly improve the
performance of genotype interpretation tools. On the other hand, g2p-THEO is currently
p
<0.001
0.256
<0.001
0.003
<0.001
<0.001
Regimen-specific
model
0.596 (0.038)
0.420 (0.012)
0.925 (0.003)
0.727 (0.016)
0.821 (0.008)
0.829 (0.016)
0.720
0.810 (0.005)
Full model
0.570
0.815
0.879
0.771
0.709
0.528
0.712
0.733
g2p-THEO
0.610
0.444
0.914
0.769
0.739
0.782
0.710
0.781
EuResistDB6, AUC
Stanford
HIVdb
0.656
0.407
0.803
0.693
0.665
0.782
0.668
0.707
ANRS
0.692
0.463
0.798
0.724
0.671
0.781
0.688
0.718
Rega
0.755
0.426
0.777
0.708
0.684
0.784
0.689
0.707
Table 4.7: Area under the receiver operating characteristic curve (AUC) values for cross-validation and training test set-up. Rows correspond to
the six regimens in the two datasets. The columns for the Stanford-California6 data show the AUC values derived by 10 repetitions of
five-fold cross-validation for the regimen-specific models and the full model using all available training data (SDs are in parentheses);
p-values for the comparison of the performances of the two models were obtained by the Wilcoxon rank-sum test. The columns for
the EuResistDB6 dataset show the AUC values for the regimen-specific models (standard deviations are in parentheses), with the
full model using all Stanford-California data, the original geno2pheno-THEO (g2p-THEO) predictions, and the three expert-based
approaches (Stanford HIVdb, ANRS, and Rega).
Regimen
ZDV+3TC+IDV
D4T+3TC+SQV/r
ddI+d4T+EFV
d4T+3TC+EFV
ddI+d4T+NFV
d4T+3TC+NFV
Mean
Full dataset
Stanford-California6
Regimen-specific Full model,
model, AUC
AUC
0.965 (0.005)
0.928 (0.002)
0.747 (0.035)
0.740 (0.022)
0.988 (0.006)
0.950 (0.002)
0.924 (0.028)
0.876 (0.015)
0.974 (0.005)
0.957 (0.005)
0.946 (0.010)
0.915 (0.010)
0.924
0.895
4.3 Clinical Validation
71
72
4 Predicting Response to Combination Therapy
Figure 4.8: Receiver operating characteristic (ROC) curves for regimen-specific models applied to the EuResistDB6 dataset. Every method is represented by a ROC
curve for a subset of the EuResistDB data comprising six different treatments.
In addition to the methods depicted in Figure 4.7, ROC curves are shown for
g2p-THEO-SVM (g2p-THEO in which logistic model trees were replaced by
SVMs) and g2p-THEO-SVM-RS (g2p-THEO using regimen-specific SVMs).
For each method, the area under the ROC curve is given parenthetically in the
box.
limited to a set of well-established resistance mutations, which might explain its inability
to improve the detection of failing regimens.
In the computation of a score for a drug combination, the robustness of the tool with
respect to unseen drug combinations is an important issue. In both Stanford-California
datasets, a slight decrease in the AUC was observed in predictions for unseen drug combinations. Overrepresentation of ZDV+3TC+IDV therapy due to inclusion of the ACTG
320 clinical trial in the datasets was identified as a possible confounder of this analysis,
because no decrease in performance with unseen drug combinations was observed after
removal of TCEs containing the ZDV+3TC+IDV combination. Thus, our stability analysis indicated that g2p-THEO returns reliable scores for unobserved drug combinations.
The major reason for the preservation of performance is that the prediction is based on
a linear model. Indeed, during the learning process of the linear model, contributions of
every single covariate to the outcome are computed. This also holds for drugs in a regimen,
because the impact has to be distributed among these drugs. In the end, observed and
4.3 Clinical Validation
73
unobserved drug combinations are both composed of observed compounds. However, this
property is also a potential disadvantage of linear models. Specifically, if the prediction
is based on a linear model, the synergistic effects between drugs or mutations cannot be
represented unless a large number of covariates are introduced to explicitly model these
effects. In contrast, in regimen-specific models all mutations are evaluated in the context
of the same drug combination, thus rendering the explicit modeling of interactions unnecessary. In the Stanford-California6 dataset, the regimen-specific models for all drug
combinations exhibited increased performance compared with the full model, even though
the full model had access to many more training samples. This finding was confirmed in
independent data. However, the benefit of regimen-specific models decreased when they
were compared with the full statistical model for g2p-THEO. This can be explained by
the fact that g2p-THEO applies LMTs that directly train multiple linear models on distinct subsets of the training data. Unfortunately, only a few outdated regimens gave rise
to enough training data for regimen-specific models. However, a promising approach has
been recently proposed (Bickel et al., 2008), one that pools data on “similar” regimens to
overcome this limitation in generating regimen-specific models.
A major issue with any fully data-driven system is the inability to generate predictions
for newly licensed compounds because of the delayed availability of sufficient training data.
This drawback can be addressed only by multicenter efforts and cooperation between drug
companies and regulatory bodies for immediate release of clinical trial data. However,
because an optimized background regimen is recommended for the effective use of any
new compound, interpretation systems are still relevant for choosing the backbone drugs,
particularly in heavily experienced patients. Expert-based systems can complement datadriven systems for predicting the activity of novel drugs until sufficiently large genotyperesponse datasets are available. An attempt to combine expert scores for novel drugs with
a data-driven approach will be presented in Section 5.8.
Data-driven systems need a large amount of data for training, so observational cohort
data are often used. These provide a valuable source for assessing the impact that drug
resistance has on the response to treatment but typically lack other relevant information,
including adherence levels and pharmacokinetics data. Weighting for these factors is expected to help us develop better systems aimed at building effective regimens. It must also
be noted that modern and future antiretroviral treatment strategies are expected to limit
the development of drug resistance by providing increased potency and convenience, perhaps making treatment toxicity issues relatively more relevant than resistance over time.
However, drug resistance and cross-resistance remain issues for a substantial proportion of
patients harboring viral populations that display a complex mutational pattern because of
multiple treatment failures. In addition, toxicity is also a major contributor to the selection
of drug resistance through decreased adherence. Developed as a clinically oriented tool,
g2p-THEO allows the user to exclude specific drugs for toxicity issues and provides a total
pill count for each regimen. Although no data-driven system is meant to replace a comprehensive patient evaluation by an expert HIV specialist, validated tools such as g2p-THEO
can provide an appropriate support to most care-givers of HIV-infected patients.
74
4 Predicting Response to Combination Therapy
5 E U R ESIST: Uniting Data for Fighting HIV
In the previous chapter we demonstrated that features derived from the viral genotype
are helpful in predicting response to combination therapy. The predictive performance of
statistical learning models, however, depends to a great deal on the amount of available
training data. In the case of geno2pheno approximately 1000 genotype-phenotype pairs
were sufficient for building well performing models for inferring phenotypic drug resistance
against a singe drug from genotype. The predicted resistance information is based on
information derived in vitro. Clinicians, however, are mainly interested in knowing whether
a drug-cocktail will work in vivo or not. The fact that anti-HIV drugs can be used in
hundreds of combinations calls for massive amounts of training data for systems such as
g2p-THEO that directly predict response to combination therapy. The HIV Resistance
Database Initiative (RDI) was the first international initiative aiming at the integration
of multiple databases from different centers for collecting sufficient data to train such
statistical models (Larder et al., 2002). Since its foundation in 2002, the RDI database
grew from 3,500 to approximately 30,000 patients in 20071 . However, till now RDI failed
to provide a freely accessible web tool that predicts response to combination treatments.
This chapter describes our work within the EU project: EuResist. EuResist, like
RDI, aims at the collection of data and a the development of a free web accessible tool
that predicts response to combination treatments. The current version of the database
holds approximately records from 33,000 different patients. Unlike RDI, EuResist made
the prediction system freely available already in 2007. This chapter begins with a background on the EuResist project (Section 5.1). In Sections 5.2 and 5.3 we introduce the
individual EuResist prediction engines and the combination strategies, respectively. The
final web service is described in Section 5.5. Sections 5.6 and 5.7 investigate how changes
in the definition to treatment response affect the prediction performance. Furthermore, in
Section 5.8 an approach for overcoming a serious limitation of all prediction engines aiming at inferring response to combination therapies, namely the update ability with respect
to novel compounds, is studied. Finally, section 5.9 explores the possibility of predicting
response to antiretroviral therapy based on the treatment history alone.
5.1 Project Background
The EuResist project aimed at developing a European integrated system for clinical
management of antiretroviral drug resistance. The system should provide the clinicians
with a prediction of response to antiretroviral treatment in HIV patients, thus helping
the clinicians to choose the best drugs and drug combinations for any given HIV genetic
variant. To this end, a massive European integrated dataset was created, linking some of
the largest locally existing resistance databases.
1
source: http://www.hivrdi.org
75
76
5 EuResist: Uniting Data for Fighting HIV
The project involved participants from eight institutions. Three institutions acted as
data providers and consultants concerning virological knowledge and clinical aspects of
HIV treatment, namely the University of Siena, the Karolinska Institute in Stockholm,
and the University of Cologne. Four participating institutions were involved in modeling
the prediction engines: IBM Israel, KFKI Research Institute for Particle and Nuclear
Physics Hungarian Academy of Sciences, Informa S.r.l., and the Max Planck Institute for
Informatics. A division of IBM Israel was also responsible for the integration of the multiple
data sources into a unified database, and Informa S.r.l. was concerned with the overall
project management and the development of the web front end for the prediction engine.
The Kingston University dealt with matters of data quality of the integrated database.
Like other HIV resistance tools, the response models should be able to provide a prediction even when the genotype is the only available information. The data collected in
the resistance databases, however, enable exploiting patient related data: for instance the
patient’s treatment history, the current immune status, and demographic data. To allow
for both: the use of the genotype only and the exploitation of additional data, the modeling was restricted to genotype and treatment information only (minimal feature set) and
opened for all available information (maximal feature set). In operation mode, the model
chosen for the prediction depends on the information provided by the user.
A considerable effort within the project was the definition of the standard datum. Briefly,
the standard datum is the equivalent to the TCEs in the previous chapter and defines
which data from the database are considered for learning and what has to be learned.
For instance, if one is interested in virological response to an anti-HIV treatment around
three months after onset of the regimen (short-term response), then one has to extract
treatments from the database where a viral load measurement is available around three
months of treatment. The standard datum definition can also be a source of error or
overfitting. For instance, in the work by Larder et al. (2007) the time between onset of the
treatment and the viral load measure is used as a feature. In their study, one treatment
with different viral load measurements gave rise to at most three training points for the
same statistical model. The setup ensured that the patients in training and test set formed
a distinct set. However, the high correlation of the cases, due to multiple samples from the
same treatment, in the small test set of 50 instances leads very likely to an overestimation
of model performance. Another important issue, which the standard datum definition has
to address, concerns the baseline measurements, i.e. what is the maximal allowed time
span between measurement of a biomarker and start of the therapy so that the value can
be considered a baseline measurement. For example, a genotype obtained about one year
before the treatment start is clearly too outdated to serve as a baseline genotype. On the
other hand, a genotype that was generated a few weeks before treatment start is clearly
suitable. In general, the stricter a standard datum definition, the fewer data are available
for the statistical learning step, thus the definition has to find a good balance between
strictness and resulting training data.
The standard datum definition applied within the EuResist project considers values
that were obtained at most three months before start of the regimen as baseline values
– only if there was no intermediate treatment of at least two weeks length between the
measurement and start of the treatment to be considered. With this definition, EuResist
follows the recommendations of The Forum for Collaborative HIV Research. Furthermore,
5.1 Project Background
Sequencing & VL
start
VL < 500 cp/ml or 100−fold reduction?
77
stop
< 90 days
28 days
> 0 days
56 days
Therapy
Figure 5.1: Standard datum. A treatment change is considered a valid training/testing
instance if the baseline measures (viral genotype and VL) were obtained at
most 90 days before start of the new treatment and given that there was no
other treatment lasting longer than 14 days between time point of measurement
and start of treatment. A treatment is considered successful if the VL at 8
weeks (±4) is below the limit of detection (here: 500 copies of viral RNA per
ml blood serum) or constitutes a 100-fold reduction compared to the baseline
value.
the EuResist standard datum focuses on initial response around eight (four to 12) weeks
of treatment; one treatment can only give rise to only one training instance since only
the closest measurement to the eight weeks time point is considered. Thus, the definition
facilitated a regression approach, i.e. predicting the change in viral load between the
baseline measurement and the follow-up measurement. In addition, like in the development
of g2p-THEO, a classification approach was investigated within the project. To this end,
treatment response was dichotomized into success and failure based on the reduction of
viral load. Precisely, if the viral load could be reduced below the limit of detection, i.e. 500
copies of viral RNA per ml blood serum, then the treatment was considered a success. Some
patients, though, start with extremely high viral load values and a reduction below the limit
of detection is hard to achieve within the short time frame. Thus, alternatively, a success
can be achieved by a 100-fold reduction of the viral load compared to the baseline value.
Figure 5.1 depicts a schematic overview of the standard datum definition. Modifications
of this standard datum definition that focus on sustained response (VL at 24±8 weeks)
are studied later in this chapter.
Figure 5.2 depicts the growth of the EuResist Integrated database (EIDB) over the
period of the project. The amount of patients and HIV sequences almost doubled from
initially about 17,000 to now 34,000 entries. Likewise, the number of stored treatments
nearly tripled from 35,000 to 98,000. The observed increase of data is not only a result of
the increased size of the initial three databases, but also a direct consequence of the addition of new databases. In November 2007 the Luxembourg database joined the EuResist
integrated database, and in October 2008 three additional databases started their collaboration with EuResist: IRSICAIXA Foundation (Spain), Catholic University of Leuven
(Belgium), and Univeristy of Brescia (Italy). More databases are expected to join the
EuResist effort. It strikes that the number of usable standard datum instances is by far
smaller than the number of sequences in the database. This discrepancy is a result of the
strict requirements posed by the standard datum definition. For instance a sequence has to
be available at most 90 days prior to treatment start, thus not all available sequences in the
database gave rise to a usable learning instance. However, the collection effort increased
78
5 EuResist: Uniting Data for Fighting HIV
6000
100000
●
●
4500
SD instances
Patients
Sequences
Therapies
●
75000
●
3000
●
●
●
1500
●
●
●
●
50000
●
●
●
●
25000
●
10.10.08
27.11.07
16.08.07
01.05.07
29.05.07
23.01.07
0
08.11.06
07.12.06
0
Figure 5.2: Growth of the EuResist integrated database (EIDB). The graph depicts the
increase in patients, viral sequences, and recorded treatments in the database
of the time of the project (scale on the right). The blue line (scale on the
left) corresponds to the increase of standard datum (SD) instances available
for training and testing the prediction engine.
the available training instances from about 1500 to over 5000. Since not all experiments
have been carried out with the most recent release of the EuResist integrated database,
we provide for every computer experiment the release date of the database that was used,
and thus allowing to put the results in the correct context.
5.2 E U R ESIST Prediction Engines
Within the EuResist project four institutions developed statistical models for predicting
the response to combination antiretroviral therapy. Three of the prediction engines were
selected for the final system to be implemented as a web service. Our contribution, the
Evolutionary (EV) engine, was based on the g2p-THEO software described in Chapter 4.
The two other contributions the Generative Discriminative (GD) engine and the Mixed
Effects (ME) engine originate from the Machine Learning Group of the IBM Research
Laboratory in Haifa, Israel and from the Department of Computer Science and Automation
of the University of Roma TRE located in Rome, Italy, respectively. In the following
sections we provide a brief summary for the GD and ME engines. Furthermore, we specify
the differences between the original g2p-THEO software and our EV engine.
5.2 EuResist Prediction Engines
79
outcome
PI
IDV
SQV
NRTI
3TC
NNRTI
ZDV
EFV
NVP
Figure 5.3: The Bayesian network used in the GD engine. Variables indicating use of
individual drugs are connected to the outcome (black arrows) and to indicators
for their drugs class or in the case of the maximal feature set: historic use of
a drug from the same class (green arrows). The variables in the middle layer
representing the (historic) use of a drug class are also connected to the outcome
variable (blue arrows).
5.2.1 Generative Discriminative Engine
The Generative Discriminative (GD) engine applies generative models to derive additional
features for the classification using logistic regression. When inspecting Figure 5.2 it strikes
that only a small percentage of therapies in the database (Table 5.1) have an associated
genotype and are therefore suitable for training a classifier, which is supposed to receive
sequence information. However, a much larger fraction of the therapies can be labeled
as success or failure on the basis of the baseline and follow-up VL measures alone, since
for the labeling no viral genotype is required. The GD engine thus trains a Bayesian
network on about 20,000 therapies (with and without associated genotype). The network
is organized in three layers and uses an indicator for the outcome of the therapy, indicators
for individual drugs, and indicators for drug classes (Figure 5.3). This generative model
is used to compute a probability of therapy success on the basis of the drug combination
alone. This probability is used as an additional feature for the classification by a logistic
regression model: the discriminative step of the approach. Furthermore, indicators for
individual drugs and single mutations are input for the logistic regression.
Indicators representing a drug class are replaced with a count of the number of previously used drugs from that class when working with the maximal feature set. In this
way information about past treatments is incorporated. In addition to features from the
minimal set, the maximal feature set comprises indicators for mutations in previously observed genotypes, the number of past treatment lines, and the VL measure at baseline.
Correlation between single mutations and the outcome of the therapy was used to select
relevant mutations for the model. A detailed description on the network’s setup and the
selected mutations can be found in Rosen-Zvi et al. (2008).
80
5 EuResist: Uniting Data for Fighting HIV
5.2.2 Mixed Effects Engine
The Mixed Effects (ME) engine explores the benefit of including second- and third-order
variable interactions. Basically, since modern regimens combine multiple drugs, binary
indicators representing usage of two or three specific drugs in the same regimen are introduced. Further indicators represent the occurrence of two specific mutations in the viral
genome for modeling interaction effects between them. Moreover, interactions between
specific single drugs and single mutations or pairs of drugs and single mutations are represented by additional covariates. In addition to the terms modeling the mixed effects, other
clinical measures (VL and CD4+ T cell counts), demographic information (risk factor,
country of infection, risk, sex, . . . ), covariates based on previous treatments (indicators
for previous use of drugs and drug classes), and the predicted viral subtype are used as
additional covariates.
The large number of features (due to the mixed effects) requires a strong effort in feature
selection. Thus, multiple feature selection methods were used for generating candidate
feature sets. Filters and embedded methods, i.e. methods that are intrinsically tied to a
statistical learning method, were applied sequentially: (i) univariable filters, such as χ2
with rank-sum test and correlation-based feature selection (Hall, 1999), were applied to
reduce the set of candidate features; (ii) embedded multivariate methods, such as ridge
shrinkage (Le Cessie and Van Houwelingen, 1992) and the Akaike information criterion
(AIC) selection (Akaike, 1974) were used to eliminate correlated features and to assess the
significance of features with respect to the outcome in multivariate analysis. In multiple 10fold cross-validation runs on the training data the performance of the resulting feature sets
were compared with a t-statistic (adjusted for sample overlap and multiple testing). The
approach leading to the best feature set was applied on all training samples to generate the
final model. Unlike the GD engine, the ME engine employs one logistic regression model
that only uses the maximal feature set. Missing variables, e.g. when working with the
minimal feature set, are replaced by the mean (or mode) of that variable in the training
data.
5.2.3 Evolutionary Engine
As stated before, one major obstacle in HIV-1 treatment is the development of resistance
mutations. The Evolutionary (EV) engine uses derived evolutionary features to model
the virus’s expected escape path from drug therapy. The representation of viral evolution
is based on mutagenetic trees. Unlike in g2p-THEO the mixture of mutagenetic trees
comprises only two components, the noise component and one tree that was estimated
from the data. A further difference to the original approach concerns the way of how
mutational patterns that define complete drug resistance against an individual compound
are identified. This information is crucial for computing the genetic barrier to drug resistance. The original approach, which is briefly described in Section 4.1.2, uses the available
genotype-phenotype pairs. Since there are only about 1,000 such pairs, it is very likely that
mutational patterns that can occur are not observed within the small sample. Here, we
used geno2pheno for predicting the drug resistance phenotypes for approximately 16,000
viruses stored in the EIDB. Based on this much larger sample the procedure of identifying
resistant patterns was carried out as initially explained. The modified approach resulted in
5.3 Combining Classifiers
81
genetic barrier values that were more predictive for response to treatment than the values
obtained with the original protocol (increase of 0.02 and 0.027 in AUC and correlation,
respectively, in a 10-fold cross-validation on the training set). The genetic barrier to drug
resistance was provided together with other features, like indicators for individual drugs
in the treatment and indicators for IAS mutations in the genotype as in the prototype
g2p-THEO. In the maximal feature set, indicators for previous use of a drug and the baseline VL measure extend this list. As a general improvement over the original g2p-THEO
prototype, interactions up to second order between indicator variables were considered as
well. This was achieved by introducing further indicator variables that explicitly model
these interactions.
The resulting large number of covariates required the implementation of a feature selection step as part of the training process of the EV engine. The chosen feature selection
approach was based on SVMs with a linear kernel. The approach works in three steps: (i)
optimization of the misclassification cost parameter C in a 10-fold cross-validation setting
to maximize the area under the ROC curve (AUC); (ii) generation of 25 different SVMs
by five repetitions of five-fold cross-validation using the optimized cost parameter; (iii)
computation of the z-score for every feature. All features with a mean z-score larger than
2 were selected for the final model based on logistic regression.
5.3 Combining Classifiers
In order to provide the end-user of the prediction system with a single outcome for a request,
instead of multiple results generated by the different engines, the three classifications have
to be combined to one final decision. In principle, there are two approaches for combining
classifiers, namely classifier fusion and classifier selection. In classifier fusion, complete
information on the feature space is provided to every individual system and all outputs
from the systems have to be combined; in classifier selection, every system is an expert in
a specific domain of the feature space and the local expert alone decides for the output
of the ensemble. However, the individual classifiers described above were designed to be
global experts, thus only classifier fusion methods were explored.
Methods for classifier fusion can operate on class labels or continuous values (e.g. support, posterior probability) provided by every classifier. The methods range from simple
non-trainable combiners like the majority vote, to very sophisticated methods that require
an additional training step. In order to find the best combination method we compared several approaches ranging from simple methods to more sophisticated ones. All results were
compared to a combination that has access to an oracle telling which classifier is correct.
Intuitively, the predictive performance of this oracle represents the upper bound on the
performance that can be achieved by combining the classifiers. The following subsections
briefly introduce the combination methods considered.
Non-trainable Combiners
As mentioned above, there are a number of simple methods to combine outputs from
multiple classifiers. The most intuitive one is a simple majority vote, whereby every individual classifier computes a class label (in this case success or failure) and the label
82
5 EuResist: Uniting Data for Fighting HIV
that receives the most votes is the output of the ensemble. One can also combine the
posterior probability of observing a successful treatment as computed by the logistic regression. This continuous measure can be combined using further simple functions: mean
returns the mean probability of success by the three classifiers (Kittler et al., 1998); min
yields the minimal probability of success (a pessimistic measure); max results in the maximal predicted probability of success (an optimistic measure); median returns the median
probability.
Meta-classifiers
The use of meta-classifiers is a more sophisticated method of classifier combination, which
uses the individual classifiers’ outputs as input for a second classification step. This allows
for weighting the output of the individual classifiers. In this work we applied quadratic
discriminant analysis (QDA), logistic regression, decision trees, and naïve Bayes (operating
on class labels) as meta-classifiers.
Decision Templates and Dempster-Shafer
The decision template combiner was introduced by Kuncheva et al. (2001). The main idea
is to remember the most typical output of the individual classifiers for each class, termed
decision template. Given the predictions for a new instance by all classifiers, the class with
the closest (according to some distance measure) decision template is the output of the
ensemble.
Let ~x be an instance, then DP~x is the associated decision profile. The decision profile for
an instance contains the support (e.g. the posterior probability) by every classifier for every
class. Thus, DP~x is an I × J-matrix, where I and J correspond to the number of classifiers
and classes, respectively. The decision template combiner is trained by computing the
decision templates DT for every class. The DT for the class j is simply the mean of all
decision profiles for instances ~x belonging that class. Hence,
DT j =
1 X
DP ~x , ∀j ∈ {1, . . . , J},
Nj
~
x∈ωj
where Nj is the number of elements in class ωj . For a new sample, the corresponding
decision profile is computed and compared with the decision templates for all classes using
a suitable distance measure. The class with the closest decision template is the output
of the ensemble. Thus, the decision template combiner is a nearest-mean classifier that
operates on decision space rather than on feature space. We used the squared Euclidean
distance to compute the support for every class:
µj = 1 −
J
I
1 XX
[DT j (j 0 , j) − DP ~x (j 0 , i)]2 ,
JI 0
j =1 i=1
where DTj (j 0 , i) is the (j 0 , i)-th entry in DTj . Decision templates were reported to outperform other combiners, e.g. Kuncheva et al. (2001) and Kuncheva (2002).
Decision templates can also be used to compute a combination that is motivated by the
evidence combination of the Dempster-Shafer theory. Instead of computing the similarity
5.3 Combining Classifiers
83
between a decision template and the decision profile, a more complex computation is carried
out as described in detail in Rogova (1994). We refer to these two methods as Decision
Templates and Dempster-Shafer, respectively.
Clusters in Decision Space
Regions in decision space where the classifiers disagree on the outcome are of particular
interest in classifier combination. Therefore, we propose the following method that finds
clusters in decision space and learns separate logistic regression models for every cluster
for fusing the individual predictions. Let si be the posterior probability of observing a
successful treatment predicted by classifier i. Then we express the (dis)agreement between
two classifiers by computing:
αij =
(
0
si − sj
classifiers i and j agree on the label,
else,
for all i and j where i < j. Thus, in case of disagreement between two classifiers, the
computed value expresses the magnitude of disagreement. These agreements are computed
for all instances of the training set and used as input to a k-medoid clustering (Rousseeuw
and Kaufman, 1990). For all resulting k clusters, an individual logistic regression is trained
on all instances associated with the cluster using the si as input. The idea is that in
clusters where e.g. classifier 1 and 2 agree, and classifier 3 tends to predict lower success
probabilities the logistic regression can either increase or decrease the influence of classifier
3, depending on how often predictions by that classifier are correct or incorrect, respectively.
When a new instance has to be classified then first the agreement between the classifiers is
computed for locating the closest cluster. In a second step the logistic regression associated
with that cluster is used to calculate the output of the ensemble. The number of clusters
k, the only parameter of this method, is optimized in a 10-fold cross-validation. The
approach is motivated by the behavior knowledge space (BKS) method (Huang and Suen,
1995), which uses a look-up table to generate the output of the ensemble. However, the
BKS method is known to easily overtrain, and does not work with continuous predictions.
Local Accuracy-based Weighting
Woods et al. (1997) propose a method that uses one k-nearest-neighbor (knn) classifier
for every individual classifier to assess the local accuracy of that classifier given the input
features. The final output is then solely given by the most reliable classifier of the ensemble.
Since the three classifiers in this setting are trained to be global experts, we applied the
proposed method to compute the reliability estimate for each classifier given the features
of an instance. In contrast to the method proposed by Woods et al. (1997), the output is
a weighted mean based on these reliability estimates.
In order to use a knn classifier as a reliability estimator the labels from the original
instances are replaced by indicators of whether the classifier in question was correct on
that instance or not. With the replaced labels and the originally used features the knn
classifier reports the fraction of correctly classified samples in the neighborhood of the
query instance. This fraction can be used as a local reliability estimator. The output by
84
5 EuResist: Uniting Data for Fighting HIV
the ensemble is then defined by a weighted mean:
P
ri si
s̄ = Pi
i ri
where si and ri are the posterior probability of observing a successful treatment and the
local accuracy for classifier i, respectively. For simplicity, only Euclidean distance was used
in the knn classifier, the number of neighbors k was optimized in a 10-fold cross-validation
setting.
Combining Classifiers on the Feature Level
As described in Section 5.2, every individual classifier uses a different feature set, specifically, different derived features, but the same statistical learning method. Thus, a further
combination strategy is the use of all features selected for the individual classifier as input to a single logistic regression rather than computing a consensus of the individual
classifiers’ predictions.
5.4 Predictive Performance of the E U R ESIST Prediction Engine
5.4.1 Data
About 3,000 instances in the EIDB (version from 2007/11/27) met the requirements of
the standard datum definition and can therefore be used as learning instances. From this
complete set, 10% of the data were randomly set aside and used as an independent test
set (Table 5.1). The split of the training data in 10 equally sized folds was fixed, allowing
for 10-fold cross-validation of the individual classifiers. The same 10 folds were used for
a 10-fold cross-validation of the combination approaches. Classification performance was
measured in terms of accuracy (i.e. the fraction of correctly classified examples) and the
area under the receiver operating characteristics curve (AUC). Briefly, the AUC is a value
between 0.0 and 1.0 and corresponds to the probability that a randomly selected positive
example receives a higher score than a randomly selected negative example (Fawcett, 2006).
Thus, a higher AUC corresponds to a better performance.
5.4.2 Results
Results for the individual classifiers using the minimal and maximal feature set are summarized in Table 5.2. The use of the extended feature set significantly improved the
performance of the GD and EV engine with respect to the AUC (p ≈ 0.002 for both using
a paired Wilcoxon test). With respect to accuracy only the improvement observed by the
EV engine reached statistical significance (p = 0.007). Remarkably, replacement of all
missing additional features in the case of the ME engine when working with the minimal
feature set did not result in a significant loss in performance (p = 0.313 and p = 0.312
with respect to AUC and accuracy, respectively).
Correlation among classifiers
The performances of the individual classifiers were very similar. Pearson’s correlation
coefficient (r) indicated that the predicted probability of success for the training instances
5.4 Predictive Performance of the EuResist Prediction Engine
EIDB
Labeled Therapies
Training Set
Test Set
Patients
18,467
8,233
2,389
297
Sequences
22,006
3,492
2,722
301
VL measurements
240,795
40,498
5,444
602
Therapies
64,864
20,249
2,722
301
Successes
13,935
1,822
202
85
Failures
6,314
900
99
Table 5.1: Summary of the EuResist Integrated Database (version from 2007/11/27) and
training and test set. The table displays the number of patients, sequences, VL
measurements, and therapies for the complete EuResist Integrated Database
(EIDB) and the set of therapies that could be labeled with the standard datum
definition. 469 of the sequences associated with all labeled therapies belong
to historic genotypes and are not directly associated with a therapy change.
Moreover, detailed information on training set and test set (comprising labeled
therapies with an associated sequence each) is given.
Engine
GD
ME
EV
minimal feature set
AUC
Accuracy
Train
Test
Train
Test
0.747 (0.027) 0.744 0.745 (0.024) 0.724
0.758 (0.019) 0.745 0.748 (0.031) 0.757
0.766 (0.030) 0.768 0.754 (0.031) 0.748
maximal feature set
AUC
Accuracy
Train
Test
Train
Test
0.768 (0.025) 0.760 0.752 (0.028) 0.757
0.762 (0.021) 0.742 0.754 (0.030) 0.757
0.789 (0.023) 0.804 0.780 (0.032) 0.751
Table 5.2: Results for the individual classifiers on training set and test set. The table
displays the performance, measured in AUC and Accuracy, achieved by the
individual classifiers on the training set (using 10-fold cross-validation; standard
deviation in brackets) and the test set using different feature sets.
using the minimal (maximal) feature set were highly correlated (i.e. close to 1.0): GD-ME
0.812 (0.868); GD-EV 0.797 (0.786); ME-EV 0.774 (0.768). In fact, the three classifiers
agreed on the same label in 80.4% (81.7%) of the cases using the minimal (maximal) feature
set. Notably, agreement of the three classifiers on the wrong label occurred more frequently
in instances labeled as failure than in instances labeled as success (39% vs. 4% and 37%
vs. 4% using the minimal and maximal feature set, respectively; both p < 2.2 × 10−16 with
Fisher’s exact test).
This behavior led to further investigation of the instances labeled as failures in the
EIDB. Indeed, 145 of 350 failing instances, which were predicted to be a success by all
three engines, achieve a VL below 500 copies per ml once during the course of the therapy.
However, this reduction was not achieved during the time interval that was used in the
applied definition of therapy success. Among the remaining 550 failing cases this occurred
only 100 times. Using a Fisher’s exact test this difference was highly significant (p =
4.8 × 10−14 ). These results were qualitatively the same when using the maximal feature
set.
Results of combination methods
Table 5.3 summarizes the results achieved by combining the individual classifiers, and Figure 5.4 depicts the improvement in AUC on training and test set of the combination meth-
86
Method
SB
Oracle
Min
Max
Median
Mean
Majority
QDA
LR
DTrees
NB
DT
D-S
Cluster
LA
Feature
5 EuResist: Uniting Data for Fighting HIV
minimal feature set
AUC
Accuracy
Train
Test
Train
Test
0.766 (0.030) 0.768 0.754 (0.031) 0.748
0.914 (0.015) 0.911 0.842 (0.025) 0.844
0.771 (0.020) 0.765 0.746 (0.027) 0.761
0.760 (0.023) 0.765 0.742 (0.030) 0.731
0.773 (0.020) 0.766 0.759 (0.027) 0.766
0.777 (0.020) 0.772 0.760 (0.024) 0.744
0.683 (0.023) 0.660 0.759 (0.027) 0.738
0.771 (0.020) 0.763 0.755 (0.031) 0.738
0.778 (0.021) 0.774 0.762 (0.028) 0.744
0.718 (0.044) 0.741 0.748 (0.032) 0.757
0.732 (0.027) 0.740 0.759 (0.027) 0.738
0.777 (0.021) 0.774 0.755 (0.027) 0.754
0.777 (0.021) 0.772 0.755 (0.024) 0.754
0.775 (0.019) 0.773 0.758 (0.029) 0.741
0.777 (0.020) 0.771 0.761 (0.025) 0.741
0.750 (0.026) 0.747 0.745 (0.029) 0.751
maximal feature set
AUC
Accuracy
Train
Test
Train
Test
0.789 (0.023) 0.804 0.780 (0.032) 0.751
0.917 (0.013) 0.920 0.850 (0.022) 0.860
0.792 (0.021) 0.793 0.760 (0.030) 0.764
0.779 (0.021) 0.779 0.757 (0.030) 0.741
0.789 (0.029) 0.786 0.768 (0.029) 0.761
0.794 (0.019) 0.793 0.780 (0.028) 0.781
0.697 (0.027) 0.683 0.768 (0.029) 0.761
0.790 (0.022) 0.794 0.769 (0.027) 0.764
0.798 (0.020) 0.805 0.781 (0.030) 0.771
0.722 (0.033) 0.678 0.777 (0.032) 0.757
0.752 (0.028) 0.753 0.768 (0.029) 0.761
0.796 (0.019) 0.797 0.766 (0.026) 0.767
0.796 (0.019) 0.796 0.767 (0.026) 0.764
0.797 (0.018) 0.800 0.783 (0.028) 0.784
0.795 (0.019) 0.791 0.781 (0.029) 0.777
0.786 (0.021) 0.779 0.780 (0.029) 0.767
Table 5.3: Results for the combined classifiers on training and test set. The table summarizes the results achieved by the different combination approaches on the training set (10-fold cross-validation; standard deviation in brackets) and the test
set. The reference methods are Single Best (SB) and Oracle, the non-trainable
combiners are named according to their function, the meta-classifiers according
to the statistical learning methods, logistic regression, decision trees, and Naïve
Bayes are abbreviated by LR, DTrees, and NB, respectively. Decision Templates
(DT), Dampster-Shafer (D-S), Clustering (Cluster) and Local Accuracy (LA)
are the methods described in detail in Section 5.3. Feature is the combination
on the feature level.
ods compared to the single best and single worst classifier, respectively. Most combination
methods improved performance significantly over the single worst classifier. However, only
the oracle could establish a significant improvement over the single best classifier. Overall,
performances of the combination approaches were quite similar. Of note, the pessimistic
min combiner yielded better performance than the optimistic max combiner. Among the
non-trainable approaches tested, the mean combiner yielded the best performance. The
logistic regression was the best performing meta-classifier. In fact, the logistic regression
can be regarded as a weighted mean, with the weights depending on the individual classifier’s accuracy, and the correlation between classifiers. Moreover, using all features of the
individual classifiers as input to a single logistic regression did not improve over the single
best approach.
Figure 5.5 shows the learning curves for the three individual classifiers, the mean combiner, and the combiner on the feature level. The curves depict the mean AUC (after 10
repetitions) on the test set achieved with varying sizes of the training set (25, 50, 100, 200,
400, 800, 1600, 2722). In every repetition the training samples were randomly selected
from the complete set of training instances. The mean combiner appeared to learn faster
and significantly outperformed the single best engine with a training set size of 200 samples
5.4 Predictive Performance of the EuResist Prediction Engine
87
Figure 5.4: Improvement in AUC of combination methods compared to the single best and
single worst classifiers. The figure displays the improvement in AUC of all combination methods over the single best (blue bars) and single worst (red bars)
classifiers on the training set (upper panel) and the test set (lower panel). Significance of the improvement on the training set was computed with a one-sided
paired Wilcoxon test. Solidly colored bars indicate significant improvements
(at a 0.05 p-value threshold), as opposed to lightly shaded bars for insignificant
improvements. On the test set no p-values could be computed.
(p ≈ 0.010 with a paired one-sided Wilcoxon test). The improvement remained significant
up to a training set size of 1,600 samples (p ≈ 0.002). The combination on feature level
was significantly worse (p ≈ 0.002) than the worst single approach for all training set sizes
(except for complete set).
Finally, Figure 5.6 depicts the ROC curves of the three individual engines and the
combined engine (mean) using both feature sets on the test set. The results are compared
to a Stanford HIVdb based GSS prediction (for details see e.g. Section 4.3). The Stanford
HIVdb service did not support the rarely used drug ddC. Consequently, the four instances
in the test set containing ddC were excluded. Moreover, Stanford HIVdb did not provide
prediction results for two sequences. Thus, the reported AUC values (in parentheses in the
figure legend) differ slightly from the values reported in Tables 5.2 and 5.3 as only 295 of
the 301 test instances were used for generating the figure.
Impact of Ambiguous Failures
In order to further study the impact of ambiguous failures (i.e. instances labeled as failure
but achieving a VL below 500 cp per ml once during the course of treatment) on the
5 EuResist: Uniting Data for Fighting HIV
0.75
88
●
●
●
●
0.70
●
●
●
0.65
●
●
●
●
0.60
AUC on test set
●
●
●
●
0.55
●
●
●
0.50
●
●
●●
EV
GD
ME
combined feature level
combined mean
●
● ●
0
500
1000
1500
2000
2500
size of training set
Figure 5.5: Learning curves for the individual classifiers, the mean combiner, and the combination on feature level. The figure shows the development of the mean AUC
on the test set depending on the amount of available training data for the individual classifiers, the mean combiner, and the combination on the feature level
using the minimal feature set. Error bars indicate the standard deviation on
10 repetitions.
performance of the individual classifiers and the combination by mean or on the feature
level, they were removed from the training set, the test set, or both sets. After removal the
classifiers were retrained and tested on the resulting new training and test set, respectively.
The results in Table 5.4 suggest that training with the ambiguous failures does not impact
the classification performance (columns “none” vs. “only train”, and columns “only test”
vs. “both”). However, the ambiguous cases have great impact on the assessed performance.
Removal of these cases increases the resulting AUC by 0.05.
However, there might still be an influence of these ambiguous failures on the performance
of the trainable combination methods. For verification we removed these cases whenever
performance measures were computed (also in 10-fold cross-validation) and trained a selection of the combination methods on the complete training data and on the cleaned
(i.e. without ambiguous failures) training data. The results in Table 5.5 suggest that the
trainable combination methods were not biased by the ambiguous failures.
A possibility for circumventing the need for dichotomization of virological response is
the prediction of change in VL between the baseline value and the measurement taken
5.4 Predictive Performance of the EuResist Prediction Engine
0.8
0.6
True positive rate
EV (0.797)
GD (0.752)
ME (0.735)
combined (0.786)
Stanford (0.726)
0.0
0.0
0.2
0.4
0.6
0.4
EV (0.762)
GD (0.737)
ME (0.736)
combined (0.766)
Stanford (0.726)
0.2
True positive rate
0.8
1.0
ROC curve with maximal feature set
1.0
ROC curve with minimal feature set
89
0.0
0.2
0.4
0.6
False positive rate
0.8
1.0
0.0
0.2
0.4
0.6
0.8
1.0
False positive rate
Figure 5.6: ROC curves for the prediction engines on the test data. The left (right) plot
depicts ROC curves for the individual prediction engines and the combination
using the mean combiner when working with the minimal (maximal) feature
set. Results are compared to the performance of the Stanford HIVdb based
genotypic susceptibility score.
at the follow-up time point. Logistic regression was replaced by linear regression in the
individual classifiers for predicting the change in VL. Using the maximal feature set the
GD (ME, EV) engine achieved a correlation (r) of 0.658±0.023 (0.664±0.023, 0.679±0.020)
on the training set (Rosen-Zvi et al., 2008). The mean combiner yielded a correlation of
0.691±0.019. However, the oracle achieved r = 0.834±0.012. Although small, the difference
between EV and the mean prediction reached statistical significance (p ≈ 0.005) using a
one-sided paired Wilcoxon test. Results on the test set were qualitatively the same: GD
(ME, EV) reached a correlation of 0.657 (0.642, 0.678) and the mean combiner (oracle)
reached 0.681 (0.814).
5.4.3 Discussion
The performance of the methods considered for combining the individual classifiers improved only little over the single best method on both sets of available features. It turns
out that the simple non-trainable methods perform quite well, especially the mean combiner. This phenomenon has been previously discussed in literature (Kuncheva et al.,
2001; Liu and Yuan, 2001). Here we focused on finding the best combination strategy
for a particular task. The advantage of the mean combiner is that it does not require an
additional training step (and therefore no additional data), although it ranges among the
best methods studied. Moreover, this combination strategy is easy to explain to end-users
of the prediction system.
The learning curves in Figure 5.5 show that the mean combiner learns faster (gives
more reliable predictions with fewer training data) than the individual prediction systems.
Moreover, the curves show that the combined performance is not dominated by the single
90
5 EuResist: Uniting Data for Fighting HIV
Engine
GD
ME
EV
Mean
Feature
minimal feature set
ambiguous instances removed from
none only Train only Test both
0.744
0.738
0.784
0.786
0.745
0.739
0.770
0.771
0.768
0.776
0.811
0.824
0.772
0.767
0.812
0.814
0.747
0.754
0.797
0.808
maximal feature set
ambiguous instances removed from
none only Train only Test both
0.760
0.747
0.808
0.806
0.742
0.757
0.808
0.810
0.804
0.812
0.846
0.855
0.793
0.791
0.849
0.849
0.779
0.787
0.832
0.842
Table 5.4: Results on the (un)cleaned test set when individual classifiers are trained on
the (un)cleaned training set. The table summarizes the results, measured in
AUC for the individual classifiers, the mean combiner, and the combination
of feature level when retrained on the (un)cleaned training set and tested on
the (un)cleaned test set. Cleaned refers to the removal of ambiguous failing
instances.
best approach as the results on the full training set might suggest. Furthermore, the
learning curve for the combination on the feature level indicates that more training data is
needed to achieve full performance. In general, combining the three individual approaches
leads to a reduction of the standard deviation for almost all combination methods. This
suggests a more robust behavior of the combined system.
In the cases of failing regimens, all three classifiers very frequently agree upon the wrong
label, precisely in 350 of 900 (39%) failing regimens in the training data using the minimal
feature set. There are two possible scenarios why the VL drop below 500 copies per ml
did not take place during the observed time interval despite the concordant prediction of
success by all the three engines:
1. Resistance against one or more antiretroviral agents is not visible in the available
baseline genotype but stored in the viral population and rapidly selected, which
would lead to an initial decrease in VL shortly after therapy switch, and a subsequent
rapid increase before the target time frame (Figure 5.7 (b) lower right).
2. The patient/virus is heavily pretreated and therefore takes longer to respond to the
changed regimen, or the patient is not completely adherent to the regimen, both
cases lead to a delayed reduction in VL after the observed time frame (Figure 5.7 (b)
lower left).
Figure 5.7 (a) shows the distribution of predicted success provided by the mean combiner
using the minimal feature set. There is a clear peak around 0.8 for instances labeled
as success whereas the predictions for the failing cases seem to be uniformly distributed.
Interestingly, the distribution of the failing cases with a VL below 500 copies per ml
resembles more the distribution for success than for failure.
The approach to predicting the change in VL exhibited moderate performance. In general, the task of predicting change in VL is harder, since many host factors, which are not
available to the prediction engines, contribute to the effective change in individual patients.
However, guidelines for treating HIV patients recommend a complete suppression of the
virus below the limit of detection (Hammer et al., 2008). Thus, dichotomizing the outcome
5.4 Predictive Performance of the EuResist Prediction Engine
Method
minimal feature set
Train
Test
removed from test only
Single Best
0.809 (0.021)
Oracle
0.935 (0.012)
Min
0.817 (0.019)
Max
0.807 (0.024)
Median
0.820 (0.020)
Mean
0.823 (0.019)
Logistic Regression 0.824 (0.019)
Decision Templates 0.823 (0.018)
Clustering
0.822 (0.020)
Local Accuracy
0.823 (0.019)
removed from train and test
Logistic Regression 0.825 (0.019)
Decision Templates 0.823 (0.018)
Clustering
0.822 (0.021)
Local Accuracy
0.823 (0.019)
91
maximal feature set
Train
Test
0.811
0.936
0.807
0.810
0.810
0.816
0.816
0.818
0.808
0.813
0.839
0.945
0.847
0.832
0.844
0.850
0.852
0.851
0.852
0.850
(0.017)
(0.014)
(0.022)
(0.018)
(0.021)
(0.019)
(0.017)
(0.019)
(0.017)
(0.019)
0.847
0.950
0.848
0.824
0.835
0.847
0.856
0.850
0.850
0.844
0.816
0.818
0.796
0.813
0.852
0.851
0.852
0.850
(0.017)
(0.019)
(0.017)
(0.019)
0.856
0.850
0.844
0.843
Table 5.5: AUC for the combined engines on training set and test set with the ambiguous
cases removed from test set and training set or test set only. The table displays
the results, measured in AUC, on training set (10-fold cross-validation; standard
deviation in brackets) and test set for a selection of combination approaches
when trained on the (un)cleaned training set. For computation of the AUC the
ambiguous cases were always removed.
and instead solving the classification task is an adequate solution, since classifiers can be
used for computing the probability of achieving complete suppression.
5.4.4 Conclusion
The use of the maximal feature set consistently outperformed that of the minimal feature
set in the combined system. Among the studied combination approaches the logistic regression performed best, although not significantly better than the mean of the individual
classifications. The mean is a simple and effective combination method for this scenario.
Variations in the size of the training set showed that a system combining the individual
classifiers by the mean achieves better performance with fewer training samples than the
individual classifiers themselves or a logistic regression using all the features of the individual classifiers. This and the consistent reduction of the standard deviation of the
performance measures lead to the conclusion that the mean combiner is more robust than
the individual classifiers, although the performance is not always significantly improved.
Moreover, the mean is a combination strategy that is easily explainable to the end-users
of the system.
In this study we discovered ambiguous failures. These therapies are classified as failure
but have a VL measurement below 500 copies per ml. Although these instances did significantly influence neither the learning of the individual classifiers nor the learning of the
combination method, they lead to an underestimation of the performance. This suggests
92
5 EuResist: Uniting Data for Fighting HIV
(a) Distribution of predicted success probabilities
(b) Different viral load trajectories
Figure 5.7: (a) Distribution of the predicted success for all successful therapies (blue solid),
all failing therapies (red solid), failing therapies with at least one VL measure
below 500 during the regimen (red dashed), and failing therapies with all VL
measures above 500 (red dotted) of the mean combiner using the minimal
feature set. (b) Different viral load trajectories. The top row depicts a clear
success (left) and a clear failure (right). The bottom row depicts trajectories
that are labeled as failures because the VL closest to eight weeks of treatment
exceeds the limit of detection (horizontal line). Those cases are nevertheless
often predicted as success.
that clinically relevant adjustments of the definition of success and failure can result in
increased accuracy of the combined engine.
The training and test data comprise patients with very differing level of pre-treatment,
i.e. from therapy-naïve patients to patients receiving their 15th antiretroviral regimen.
Hence, it is of interest how the engine performs for those patients. Figure 5.8 a) depicts
the performance of the EV engine for different groups of patients measured by 10-fold
cross-validation on the training set (EuResist release from 2008/10/10). The accuracy
is the highest for naïve patients, and interestingly, the AUC is the lowest for this group.
The low AUC can be explained by the fact that all patients in that group receive high
predicted success probabilities, and a substantial fraction of the few failing regimens might
be indeed ambiguous failures. Nevertheless, the AUC rapidly increases and reaches 0.84
for the group with the highest level of pre-treatment. The accuracy, on the other hand,
decreases from the initial 0.82 and stabilizes at 0.75. Again, usage of the maximal feature
set shows a better performance in all groups (except for naïve patients).
An older version of the EuResist engine (trained on release from 2007/08/29) was
compared to the opinion of 12 international HIV treatment experts (who were allowed to
use any available interpretation algorithm) on 25 cases from the test set. Only ten experts
handed in all their predictions and were therefore considered. The result of this Engine vs.
Experts (EVE) study is shown in Figure 5.8 b). It turned out that the prediction engine
performed as accurately as the best expert (expert 8). Moreover, the EuResist prediction
93
140
5.4 Predictive Performance of the EuResist Prediction Engine
expert 9
consensus
expert 7
expert 3
expert 2
expert 5
0.5
40
(a) Group-wise Performance
expert 1
10+
20
1−2
expert 6
6−9
0
0
expert 10
3−5
# pretreatments
●
●
●
expert 4
●
●
expert 8
●
● ●
●
case
●
● ●
●●
0.6 0.7 0.8 0.9
AUC / accuracy
1
100
80
●
●
●
●
60
# instances
●●
25
24
23
22
21
20
19
18
17
16
15
14
13
12
11
10
9
8
7
6
5
4
3
2
1
outcome
AUC
accuracy
120
●
EuResist
success
failure
(b) Result of the EVE Study
Figure 5.8: Performance of the EV engine with respect to pre-treatment level (a). Bars
denote the mean number of samples in the group. Green (red) denotes the
mean number of successful (failing) therapies in the group. Performance is
measured in AUC (solid line) and accuracy (dashed line) using the minimal
(black) and maximal (blue) feature set. Result of the EVE study (b). Every
row corresponds to one case, and the columns denote the true outcome of the
treatment, the EuResist prediction, predictions by the 10 experts, and the
consensus of the expert predictions. Green relates to (predicted) success and
red to (predicted) failure.
94
5 EuResist: Uniting Data for Fighting HIV
engine provides only six wrong predictions, and interestingly, the decision profile of the
prediction engine is very similar to the consensus decision of all ten experts, which makes
also six mistakes.
5.5 The E U R ESIST Web Service
The prediction engines developed by the three institutes are hosted on servers within these
institutes. Every institute is running a SOAP (once defined as: simple object access protocol) server. A SOAP protocol defines which data and in which format may be exchanged
between a client and a server. In essence, SOAP is the successor of XML-RPC. The specifications are usually written down in a WSDL (web service description language) file.
Modern programming languages support the construction of SOAP clients and servers on
the basis of the WSDL files.
Each engine has its own endpoint that conveniently provides the required WSDL file
upon request (i.e. by adding “?wsdl” to the endpoint of the engine): EV engine2 , GD
engine3 , and ME engine4 . Thus, once the endpoint of an engine is known, virtually everyone
can build a SOAP client to send valid queries to the server. The structure of the SOAP
services selected in the EuResist project allows every institution to implement the service
in a preferred programming language. Moreover, the use of SOAP services allows for a
variety of tools: web sites, stand-alone programs, or integration into existing programs
for management of clinical data. If preferred, then only a subset of the engines can be
queried as well. All the different clients accessing the array of prediction engines can be
implemented without any adaptation of the underlying SOAP servers.
Indeed, the EuResist prediction engine was integrated into InfCare HIV (http://www.
infcare.se), a Swedish management tool for HIV data.
5.5.1 Implementation of the EV Engine
The SOAP server of the EV engine is written in PHP. Functions that are specific to the EV
engine – especially the computation of the genetic barrier to drug resistance – are implement
in C++ and made available to PHP via a PHP extension. The original computation of
the genetic barrier to drug resistance was rather slow, mainly due to the requirement of
computing all transition probabilities in the mixture of mutagenetic trees. The speed up
was achieved by precomputing all possible values of the genetic barrier offline. This was
possible since the mixture of mutagenetic trees perceive the virus only as a binary pattern
of length l, with l being the number of predefined events (Section 4.1.2). Thus, there is
only a finite set of possible values the genetic barrier can take. This workaround facilitates
fast response times of the web service.
5.5.2 E U R ESIST web tool
The most visible portal for accessing the EuResist prediction engine is the dedicated web
site (http://engine.euresist.org), which was created by one of the EuResist partners.
2
http://euresist.bioinf.mpi-inf.mpg.de/prediction/?wsdl
http://srv-peres.haifa.il.ibm.com:9080/EuResistIbmWeb/services/EuResistIBMEngine?wsdl
4
http://www.informadoc.net/euresistibmserviceinterfaces.asmx?wsdl
3
5.6 Towards Prediction of Sustained Response
95
The web site is responsible for collecting the data that is sent to the array of prediction
engines. Figure 5.9 displays the data input page of the web site. Mandatory information
for a successful request is the HIV sequence comprising protease and reverse transcriptase.
If no further information is provided, then the sequence is aligned to a consensus sequence;
the list of mutations along with a predefined list of putative treatments is sent to the three
individual engines. The user can also provide optional information that is available on
the patient: the number of previous treatment lines, previously taken antiretroviral drugs,
VL, CD4+ T cell counts, and demographic data (gender, age, mode of HIV transmission).
Like in g2p-THEO, the user can influence the list of putative treatments by excluding
drugs from being part of any treatment. The three engines predict response to each of the
treatments on the provided list and send their results back to the web site.
After receiving predictions for all putative treatments, the web site generates and displays
a top 10 list of combination treatments (Figure 5.10). Unlike in g2p-THEO, the ranking
is not only based on the mean predicted success but also on the range of the predictions.
Precisely, the ranking criterion is:
(mean_of_predictions) × (1 −
range_of_predictions
),
α
with α = 1.3. The user can then interactively reorder the top 10 list with respect to success
rate or range of the predictions only. Moreover, for drugs that are currently not supported
by the EuResist prediction engine, the web site queries Stanford’s HIVdb and provides
the result. The prediction results are followed by a summary of the data that was provided
to the engines and an analysis of the mutations, e.g. whether they are known resistance
mutations or not. At the top right of the results page the user can click on “Request
detail”. The resulting page (Figure 5.11) states which engines were contacted and whether
the request was successful.
5.6 Towards Prediction of Sustained Response
Currently, as many other tools, the EuResist prediction engine focuses on predicting initial virological response, i.e. reduction of VL within 4-12 weeks of treatment. However,
clinicians are also interested in the response to the selected treatment beyond this short
period. Thus, performance of the EuResist prediction engine of inferring sustained response has to be properly assessed. For this analysis, sustained response is measured at
16-32 weeks of treatment. Sustained response data were extracted from the EuResist
integrated database comprising data from Italy, Sweden, Germany, and Luxembourg (i.e.
version from 2007/11/27).
5.6.1 Material and Methods
Data
The data for this study were extracted according to the standard datum definition presented in Section 5.1. For the comparison we used the originally employed definition that
focuses on initial response without changes, i.e. response measured at 8 (±4) weeks of
treatment, and a modified version with sustained response measured at 24 (±8) weeks of
96
5 EuResist: Uniting Data for Fighting HIV
Figure 5.9: Data input page of the EuResist prediction engine. The form contains fields
for pasting a HIV sequence comprising protease and reverse transcriptase. Alternatively, the sequence can be uploaded as fasta or mutation can be selected
from a drop-down menu. In the field below one can select the drugs that should
be considered when ranking different regimens. The remaining fields ask for
additional data, e.g. number past treatment lines and previously used drugs.
5.6 Towards Prediction of Sustained Response
Figure 5.10: Results page of the EuResist prediction engine.
97
98
5 EuResist: Uniting Data for Fighting HIV
Figure 5.11: Details of the request. The page summarizes how long did the generation of
the result took and which SOAP servers were queried.
treatment. In both cases a success was defined as a drop of viral load below 500 copies
per ml or, alternatively, a 100-fold reduction compared to the baseline value. 3,023 treatments met these requirements for initial response and 2,722 (301) were used for training
(testing) the EuResist prediction engine as described in the previous section (short-term
dataset). A comparable number of 2,778 treatments met the requirements for sustained
response (long-term dataset). For avoiding over optimistic estimation of the prediction
performance, the instances in the long-term dataset were grouped depending on whether
they were also present in the short-term dataset. Precisely, dataset A comprised 1,837
treatments of the short-term training set for which a classification with respect to longterm response was available (1,540 of those share the same label). Dataset B comprised all
742 treatments for which only a long-term label was available. And dataset C comprised
199 treatments from the short-term test set for which also a long-term label was available
(167 of them have the same label). Treatments in dataset A and B contributed to the
training set for a long-term prediction model, while dataset C was used as an independent
test set. Changes of the label from success at initial response to failure at sustained response were with 142 (17) as frequent as changes from failure to success with 155 (15) in
dataset A (C).
Comparative Analysis
The EuResist prediction engine was used to predict the response to antiretroviral therapy
for all treatments in datasets B and C. In order to rate the achieved prediction performance
correctly, the results were compared to a long-term prediction engine. This engine was
trained from scratch (i.e. feature selection and statistical learning) for this task solely
on data from datasets A and B. The resulting model was used to predict the sustained
response for all treatments in dataset C. Predictions by this model for treatments in dataset
B were obtained in a cross-validation like procedure that will be explained in the following
paragraph.
The training procedure for the sustained prediction model was as follows. Training data
for the long-term model comprised datasets A and B. Precisely, dataset B was split into 10
equally sized subsets, and dataset A plus nine of the subsets from dataset B were used as
training set. The remaining subsets from dataset B operated as a test set. This procedure
5.6 Towards Prediction of Sustained Response
99
was repeated 10 times. Hence, every subset of dataset B was used as a test set once.
Therefore, results can be compared to the predictions given by the short-term model for
instances in dataset B. The final long-term model was trained using the complete data
from datasets A and B. And response to treatments in dataset C was predicted and results
compared to the short-term predictions of dataset C. Due to unavailability of predictions
of the GD engine for dataset B, only the ME and EV engines were used for the comparison
on dataset B. Results on dataset C include the predictions made by all three engines.
As described in Section 5.2, logistic regression was the only statistical learning method
for all engines. Again, the three predictions provided by the individual engines were combined using the mean. Performance was measured in AUC (area under the ROC curve).
Statistical significance was computed using a Wilcoxon test for testing whether the two
areas under the ROC curve are different.
All experiments were carried out using both, the minimal feature set and the maximal
feature set. Results on dataset C were also compared to a GSS prediction based on the
Stanford HIVdb (for details see e.g. Section 4.3). In addition to the sustained outcome of
the treatment, the initial response for the instances in dataset C was available. Thus, on
dataset C performance of the all models is assessed with respect to both time points.
5.6.2 Results
Figures 5.12 (a) and (b) show the ROC curves on dataset B. Briefly, the short-term model
achieved an AUC on dataset B of 0.799±0.029 (0.823±0.05) compared to 0.799±0.05
(0.846±0.055) achieved by the long-term model using the minimal (maximal) feature set.
In both cases the difference was not significant (p=0.99). When using the maximal feature
set, however, the performance difference between the two models was more pronounced
in favor of the long-term prediction engine. Results were qualitatively the same on the
independent test set C. The short-term model achieved an AUC of 0.757 (0.799) compared to 0.77 (0.81) achieved by the long-term model using the minimal (maximal) feature
set. Figures 5.13 (a) and (b) show the corresponding ROC curves. Here the long-term
model is better when predicting sustained response and the short-term model is better
when studying initial response. The performance of HIVdb was better for predicting sustained response than initial response. In general HIVdb performed worse than either of
the prediction models.
5.6.3 Discussion
In this analysis we showed that the EuResist prediction engine predicted long-term response as well as short-term response. This fact has been demonstrated previously for
rules-based prediction systems (Zazzi et al., 2009). Moreover, the EuResist prediction
engine performs as well as a system trained for predicting long-term response. A likely
explanation is that in many cases the short-term response and long-term response share
the same label (≈ 82% in dataset A and C). Interestingly, the performance for all tested
systems increased when studying the sustained response.
0.8
0.6
short 0.823 (0.050)
long 0.846 (0.055)
0.0
0.0
0.2
0.4
Average true positive rate
0.6
0.4
short 0.799 (0.029)
long 0.799 (0.050)
0.2
Average true positive rate
0.8
1.0
5 EuResist: Uniting Data for Fighting HIV
1.0
100
0.0
0.2
0.4
0.6
0.8
1.0
False positive rate
(a) CV performance on B (min)
0.0
0.2
0.4
0.6
0.8
1.0
False positive rate
(b) CV Performance on B (max)
Figure 5.12: ROC curves for dataset B. Curves were generated using the minimal (a) and
maximal (b) feature set, respectively, with whiskers indicating the standard
deviation, for the initial response model (short) and the sustained response
model (long).
5.7 Effect of Modified TCE Definitions
The results in Section 5.3 indicated that the standard datum definition is suitable but
has serious drawbacks, the main one being that treatments are labeled as failures albeit
the fact that the VL is sufficiently reduced during course of the treatment. As discussed,
the factors leading to a delayed response, e.g. suboptimal adherence, usually cannot be
inferred from the viral genotype. Luckily, the suboptimal labeling does not have a great
impact on the learned statistical models but only on assessed performance leading to an
underestimation of the usefulness of the tool in clinical practice (Section 5.3). Likewise,
factors that lead to an initial response but a virological failure only a few weeks later may
not be found in the viral genotype as well, for example, decreased adherence of the patient
(after improved health conditions due to initial response or due to the appearance of sideeffects), archived drug resistance mutations that are not visible in the baseline genotype,
rapid development of drug resistance.
An improved standard datum definition could consult multiple VL measurements before
labeling a treatment as success or failure. The problem with relying on multiple measurements at specific time points is the expected decrease of training samples. For instance,
only about 2000 of the 3000 standard datum instances used in the previous section have
also a VL available in the four months window around 24 weeks of treatment.
Here we introduce two modified standard datum definitions that use multiple time points
of VL measurement for dichotomizing virological response to success and failure.
0.6
0.8
1.0
101
short 0.799
long 0.81
hivdb 0.752
short (short) 0.786
long (short) 0.77
hivdb (short) 0.735
0.0
0.0
0.2
0.4
True positive rate
0.6
0.4
short 0.757
long 0.77
hivdb 0.752
short (short) 0.747
long (short) 0.737
hivdb (short) 0.735
0.2
True positive rate
0.8
1.0
5.7 Effect of Modified TCE Definitions
0.0
0.2
0.4
0.6
0.8
False positive rate
(a) CV performance on C (min)
1.0
0.0
0.2
0.4
0.6
0.8
1.0
False positive rate
(b) CV Performance on C (max)
Figure 5.13: ROC curves for the independent test set C. Here both outcomes, initial response (dashed lines) and sustained response (solid lines) are studied for the
short-term model (short) and the sustained response model (long). The performance is compared to the performance of Stanford HIVdb (hivdb).
5.7.1 Material and Methods
Alternative Standard Datum Definition
The basic idea behind the first modified standard datum definition is to extract from the
database instances with initial and instances with sustained response separately. If there
are instances that receive a label for both time points, then those with discordant labels
are removed from the final dataset. For this study we use the most recent version of the
EuResist integrated database (2008/10/10). The database gives rise to 5001 and 4456
instances for initial and sustained response, respectively. 3504 instances received a label
for both time points; 2959 of those instances received the same label. This initial data is
augmented by instances that only received a label for initial response (n=1497) or only for
sustained response (n=952). The resulting dataset comprises 5408 treatment switches of
which 10% are randomly assigned to a test set (n=541). The success rate was 0.747 and
0.756 in the training and test set, respectively. This definition is termed TEP (for: two
end points).
The second modified standard datum definition examines the area under the log10 (VL)
curve after treatment start as basis for the labeling. Here, all available VL measurements
between treatment start and at 24 (±8) weeks are used to compute the area under the
log10 (VL) curve. By definition, at least two VL measurements are required, the time
point of the first measurement is set to treatment start, and the time point of the last
measurement is set to 24 weeks. The minimum expected area under the VL curve is
168 × log10 (50) ≈ 285, corresponding to the fact that all measurements are at the level of
detection, i.e. 50 copies of viral RNA per ml blood. However, we usually apply a threshold
0.8
0.6
0.2
0.4
True positive rate
0.6
0.4
0.2
Average true positive rate
0.8
1.0
5 EuResist: Uniting Data for Fighting HIV
1.0
102
minimal 0.849
maximal 0.856
0.0
0.0
minimal 0.818 (0.028)
maximal 0.856 (0.030)
0.0
0.2
0.4
0.6
0.8
False positive rate
(a) Performance on training data
1.0
0.0
0.2
0.4
0.6
0.8
1.0
False positive rate
(b) Performance on test data
Figure 5.14: Performance of the EV engine when trained and tested using the TEP definition. The engine was trained with the minimal and maximal feature set.
Performance in 10-fold cross-validation (a) and on an independent test set (b).
of 500 copies per ml as the database features also values using an older, less sensitive VL
assay. Applying the 500 copies threshold leads to a cutoff for the area under the VL curve
of approximately 450. As the choice of the cutoff is not straight forward, we apply an array
of cutoffs: 400, 450, 500, 600, and 700, corresponding to an average VL of approximately
240, 477, 946, 3727, and 14677, respectively. The dataset comprises a subset of instance
from the sustained response dataset. Precisely, those instances with more than one VL
measurement during the observation period (n=4059). An independent test set is created
by randomly selecting 10% of the instances (n=406). This definition is termed AVL (for:
area under VL).
Prediction Engine
Due to practical reasons only the EV engine is trained and tested on the datasets. For
training the engine we use exactly the same features and feature selection model as described earlier in Section 5.2. For the AVL labeling, feature selection was carried out only
once with a cutoff of 500. Performance is again assessed using the area under the receiver
operating characteristics curve.
5.7.2 Results
Using the TEP definition, the EV engine achieves an AUC of 0.818±0.028 (0.856±0.03) and
0.849 (0.856) in the 10-fold cross-validation and on the independent test set, respectively,
using the minimal (maximal) feature set. Figure 5.14 depicts the corresponding ROC
curves.
Table 5.6 lists the AUCs achieved on training and test data using the AVL labeling. The
5.7 Effect of Modified TCE Definitions
success rate
minimal
maximal
train
test
train
test
train
test
103
400
450
500
600
700
0.627
0.611
0.712 (0.019)
0.724
0.791 (0.023)
0.794
0.728
0.695
0.753 (0.028)
0.742
0.822 (0.024)
0.797
0.785
0.749
0.788 (0.035)
0.772
0.860 (0.021)
0.823
0.858
0.825
0.810 (0.032)
0.840
0.867 (0.020)
0.878
0.915
0.911
0.782 (0.040)
0.829
0.867 (0.036)
0.894
Table 5.6: Performance using the area under the log10 (VL) curve for labeling. The columns
correspond to different thresholds used for dichotomizing the area into success
and failure. The first two rows state the fraction of successful treatments in
the dataset (train and test). The following rows show the performance of the
minimal and maximal feature set measured in AUC on the training and test
data.
values range from 0.712±0.019 to 0.782±0.04 in the 10-fold cross-validation when restricted
to the minimal feature set. Application of the maximal feature set leads to an increase of
around 0.07 in AUC. Moreover, the higher the selected threshold the higher the achieved
AUC.
5.7.3 Discussion
The observed performance with the TEP definition is close to the one achieved after removing the ambiguous failures from the dataset in Section 5.3. This suggests that the
labeling using the modified definition is clearer. A confounding factor is obviously the
increase in data compared to the previous analysis. However, when the modified definition
was applied to an older version of the EuResist database (2007/11/27) a similar boost in
performance was observed (data not shown). Moreover, using a training set of comparable
size originating from the most recent release, with labels according to initial response only,
shows AUC values around 0.75 (see e.g. results in Section 5.8). Thus, the observed boost
in performance mainly originates from the change in the standard datum definition.
One could argue that removing the cases with discordant labels at the two time points
would only leave the “easy to predict” cases in the training and test data. In general, cases
are considered to be simple if the viral genotype shows few mutations, i.e. the patient is
not very treatment experienced. However, the number of recorded treatments prior to the
treatment switch, which is a measure of treatment experience, did not differ significantly
between the time points for initial and sustained response (p = 0.6529 computed with a
Kolmogorov-Smirnov test). Neither was the difference between initial response and the
combined response significant (p = 0.99). Thus, a differing level of treatment experience
as a source for improved prediction performance can be excluded.
In general, results obtained with the AVL definition are inferior to the ones obtained
using the TEP definition, probably owing to the fact that the AVL definition does not
exclude instances that represent unclear cases (e.g. slow decrease in VL). And indeed, if
the instances with discordant labels in initial and sustained response are excluded from the
dataset the AUC reaches 0.824±0.032 and 0.837 on training and test set, respectively, with
the minimal feature set and a cutoff of 500. However, the increase in AUC with increasing
104
5 EuResist: Uniting Data for Fighting HIV
threshold used for the labeling indicates that the task of sorting out clear failures is not so
hard to solve. The cutoffs 600 and 700 represent treatments with an average VL of 3727
and 14677, respectively.
As long as there are known factors that significantly influence the response to treatment
and these factors cannot be included in the statistical model (for any reason) one should
select a labeling that removes (or at least reduces) the impact of these factors. One such
factor is the adherence of the patient to the prescribed treatment. Also the RDI tried to
exclude cases that are likely to be related to poor adherence. Precisely, if at baseline the
patient had a VL of less than 1000 copies per ml and the VL increased after the treatment
change by more than 300-fold (i.e. 2.5 log10 VL) then the TCE was excluded (Larder
et al., 2007). The TEP definition provided a labeling involving instances that are less
likely corrupted by incomplete adherence. However, inspection of the VL trajectories of a
patient’s past treatments could give a clue about his or her general adherence and provide
a useful covariate for predicting response to combination therapy.
There are many ways to modify the original standard datum definition, e.g. by changing
the outcome time points or considering more than two outcome time points. When modifying the definition of response one has to consider not to pose too many restrictions as one
would drastically reduce the amount of available data. Moreover, and most importantly,
the definition of treatment response must be in concordance with clinical practice: only in
this case one can provide a tool that can be used in clinical routine.
5.8 Integrating Novel Drugs
The work described in this section has been presented at the 7th European HIV Drug
Resistance Workshop 2009, Stockholm, Sweden (Altmann et al., 2009c).
The beauty of purely data-driven approaches to predict either drug resistance of the virus
to single drugs or the virological response to combination therapy is that they rely solely
on data for constructing a model that solves the task. These approaches, however, require
a sufficient amount of data to solve the task with a certain reliability. Hence, the beauty of
the approach is also its major weakness, as data for newly licensed drugs are rarely available
in sufficient amounts. In the case of geno2pheno, phenotypic resistance data are required.
The data, however, demands the availability of the drug to the laboratory, and this usually
only the case after official release of the drug on the market. Moreover, once the drug
is available, the experimental data are still cost- and labor-intensive to obtain. In the
case of prediction engines that infer response to combinations of antiretroviral agents, the
databases collecting the training data have to be updated with treatments comprising the
novel compounds. Unfortunately, this process can take years, e.g. in the latest release of the
EuResist database (2008/10/10) there exist only 119 standard datum instances featuring
three novel drugs (the oldest was approved in June 2005). Owing to the problems of
update ability, the data-driven approaches are doomed to lag behind the new developments
by the pharmaceutical companies (unless they will agree to share their data on clinical
trials to facilitate updating such tools). For geno2pheno we investigated the use of
semi-supervised learning techniques for improving the prediction of HIV drug resistance
in situations with only few training samples (see Perner et al. (2009) and Section 3.3).
Unfortunately, a benefit of semi-supervised learning could not be confirmed for every drug
5.8 Integrating Novel Drugs
105
class. More precisely, some model assumptions of the employed semi-supervised learning
methods could even be harmful for the classification performance of some drugs.
A solution to the update problem of prediction engines like g2p-THEO and EuResist
that is based on semi-supervised learning strategies is unlikely to be successful, simply owing to the use of drug combinations instead of single drugs as in the case of geno2pheno.
Here we investigate a strategy that aims at incorporating expert-based prediction systems,
which generally are rapidly updated following the release of a new drug, into the data
driven approach. The approach borrows ideas from language processing where discounting
is the process of replacing the original counts of events with modified counts in order to redistribute the probability mass from the frequently observed events to rare or even unseen
events. For example, absolute discounting removes a constant number from the counts for
all events (Ney et al., 1994).
5.8.1 Materials and Methods
Standard Encodings
From the latest release of the EuResist integrated database (2008/10/10) we extracted
4,538 standard datum instances (original definition as in Figure 5.1). These therapy
switches were used as training data for the EV engine. Response to antiretroviral treatment was dichotomized to success and failure based on the 8 (±4) week follow-up viral
load (VL).
The baseline model used the minimal feature set, i.e. indicators for drugs in the regimen, mutations in the baseline genotype, and the genetic barrier to drug resistance. The
extended model used the maximal feature set, i.e. additionally baseline VL, indicators for
previous use of a drug, and indicators modeling previous exposure to NRTIs, NNRTIs, and
PIs. Both encodings are termed standard encodings.
ndGSS Encodings
The alternative EV engine incorporates rules-based ratings for novel drugs as additional
features. We overcome the fact that only very few real novel drugs (considered by today’s
standards) are available in the training data by treating all drugs in a regimen that were
approved by the FDA less than four years prior to treatment start as novel drugs; simply
because these drugs were real novel drugs at the time the treatment started. The genotypic
susceptibility score (GSS) for drugs that are considered novel was computed using Stanford’s HIVdb (version 5.0.0; HIVdb5) and used as an additional covariate termed ndGSS
(for: novel drug GSS). This step constitutes the discounting-like part of the modeling strategy, since indicators and genetic barrier for drugs contributing to ndGSS were set to 0, i.e.
with respect to the standard encoding the novel drugs were not existing in the treatment.
Table 5.7 lists the year of FDA approval for the considered drugs and the year until when
those drugs were considered novel. Moreover, the table lists in how many instances the
drugs were used and how many instances were affected by the discounting. In order to be
able to distinguish between the absence of novel drugs and all the virus being resistant to
all novel drugs (in both cases the ndGSS is 0), the ndGSS was set to -1 if all novel drugs
were classified as resistant by HIVdb5. Moreover, as novel compounds can be expected
106
5 EuResist: Uniting Data for Fighting HIV
Year FDA approval
Year until novel
Drug used in instances
# instances changed
ZDV
1987
1991
1355
0
ddI
1991
1995
1085
0
d4T
1994
1998
900
60
3TC
1995
1999
3123
133
ABC
1998
2001
752
153
TDF
2001
2003
1852
413
Year FDA approval
Year until novel
Drug used in instances
# instances changed
IDV
1996
1999
237
66
SQV
1995
1999
245
54
NFV
1997
2000
359
128
LPV
2000
2002
1726
394
APV
1999
2003
314
69
ATV
2003
2004
423
110
NVP
1996
2000
395
75
EFV
1998
2002
823
243
Table 5.7: Year of approval of the drugs by the FDA and year until the drugs were considered novel. The last two rows list the number of standard datum instances in
which the drugs were used, and the number of instances that were affected by
the discounting-like step.
to be more powerful drugs, we introduce a weighting factor (wf) for the ndGSS feature.
Precisely, the ndGSS feature is multiplied by the weighting factor during the prediction
process.
Performance Assessment
Performances of standard and the ndGSS encodings were compared in a 10-fold crossvalidation on the training data. In this setting we seek to investigate whether the changes
applied to the standard encoding have a serious impact on the performance with respect
to the established compounds. To this end, the folds making up the training data for the
ndGSS encoding were used as is to train the classifier, the test fold, however, comprises
the standard encoding. Thus both approaches are tested on the same encoding (i.e. the
standard encoding).
An additional 119 standard datum instances containing real new drugs (DRV n=59,
TPV n=52, ETR n=4, DRV+ETR n=4) were extracted from the EuResist database
for independently assessing performance of the ndGSS encoding. The ndGSS for these
new drugs was computed using HIVdb5 as well. Performances of the ndGSS encodings
were compared with the performance achieved by a HIVdb5-based GSS for the complete
regimen. The weighting factor was optimized by five-fold cross-validation on these 119
instances. Classification performance was assessed using the area under the ROC curve
(AUC).
5.8.2 Results
In the cross-validation setting the baseline (extended) model using the standard and
the ndGSS encoding achieved an AUC of 0.742±0.020 (0.775±0.022) and 0.735±0.027
(0.768±0.027), respectively. For comparison, HIVdb5 reached an AUC of 0.708±0.026. On
the 119 instances including real novel drugs HIVdb5 achieved an AUC of 0.545 (almost
random) while the baseline (extended) model using the ndGSS encoding yielded an AUC
5.9 Therapy History: Replacement for the Genotype?
(a) Performance on training data
107
(b) Performance on instances with real new drugs
Figure 5.15: ROC curves on training data (a) and test data containing novel drugs (b).
Whiskers indicate the standard deviation.
of 0.587 (0.636). Using the optimized weighting factor for ndGSS elevated the AUC to
0.608 (wf=8), 0.642 (wf=7), and 0.677 (wf=4) for HIVdb5, baseline, and extended model,
respectively.
5.8.3 Conclusions
The cross-validation analysis demonstrated that there was no statistical significance between the standard and the ndGSS encoding (p > 0.13 using a paired Wilcoxon test).
This indicates that the precision of the system regarding established drugs is not compromised by the discounting-like approach. The baseline encoding significantly outperformed
HIVdb5 and the extended encoding significantly outperformed the baseline encoding (both:
p = 0.02).
Performance on the 119 test instances featuring real novel drugs was disastrous. However,
even the reference approach (HIVdb5) yielded a performance close to random. Both, the
baseline and the extended ndGSS models outperformed HIVdb5, i.e. they could predict
response better than the GSS alone. Thus, the modeling approach can be regarded as
a step in the right direction. Boosting the influence of the ndGSS feature elevated the
AUC of all models by approximately 0.05. This effect is probably related to the increased
potency of the novel drugs.
5.9 Therapy History: Replacement for the Genotype?
The vast majority of tools that predict response to antiretroviral therapy focus on the viral
genotype as the major source of information. Indeed, obtaining the sequence of the viral
drug targets is a standard approach for finding the best treatment for HIV patients in
developed countries. However, despite dropping prices for genome sequencing, genotyping
is not a standard practice in resource-limited settings, where even the standard VL assay
108
5 EuResist: Uniting Data for Fighting HIV
exceeds the budget for standard care. Unfortunately, the majority of people infected with
HIV live in such resource-limited countries.
The viral drug targets are steadily responding to the ongoing treatment and each treatment leaves traces in the viral genotype. In fact, the genotype obtained by bulk sequencing
may not provide all mutations that ever occurred in the viral population. Precisely, if mutations do not present a replicative advantage, resistance mutations may disappear from
the currently predominant viral variant. For example, the predominant viral population
in patients having a treatment break has usually the characteristic of the wild type virus.
Unfortunately, the patient harbors previous viral variants in the form of proviral DNA
in several infected tissues. This constitutes a memory of resistance mutations provoked
by previous treatments. As a consequence, recycling of drugs leads to a rapid reselection
of previously existing resistance mutations, which are not prevalent in the predominant
viral variant. For avoiding such a short-term viral rebound, the treating clinician considers
the patient’s treatment history, i.e. the previously taken drugs, in addition to the viral
genotype when selecting a new regimen. Treatment history has long been recognized as
clinically relevant (Bratt et al., 1998). More recently it was shown that taking all available
genotypes into account improves the prediction of treatment response in heavily pretreated
patients (Zaccarelli et al., 2009). Thus, a prediction model focusing on treatment history
could be an effective alternative to a genotype-based model in resource-limited settings.
This study aimed at investigating which representation of the treatment history is the
most useful one for inferring response to combination antiretroviral therapy. And how the
predictions of an history-only engine compare to state-of-the-art predictions based on the
viral genotype. The research was part of Fabian Müller’s Bachelor Thesis (Müller, 2008)
and only the main results are summarized in the following sections.
5.9.1 Material and Methods
Data and Response Definition
For the experiments we used the 2007/08/16 release of the EuResist integrated database.
The definition of successful virological response was modified compared to the standard
datum definition for allowing to exploit the wealth of data in the database. A success was
defined if at least one VL measurement during the treatment was below 500 copies per
ml. If the lowest measurement in the database exceeded the value, then the treatment was
labeled as a treatment failure. Moreover, since we are interested in the impact of treatment
history, a baseline genotype was not required for inclusion of a treatment in the study. Of
note, the response definition used here is yet another alternative to the originally defined
standard datum definition.
The definition resulted in 35,149 treatments being labeled as success or failure, we refer
to this dataset as HO (for: history only). Of those treatment changes, a set of 3,910 could
be associated with a genotype that was obtained at most 90 days prior to treatment start.
These instances were collected in a second dataset termed WG (for: with genotype). The
HO dataset comprised 1689 distinct combinations of antiretroviral drugs.
5.9 Therapy History: Replacement for the Genotype?
109
Features of Treatment History
The current therapy, i.e. the therapy of which the virological response has to be inferred,
was encoded using 17 binary variables each representing a single drug. This encoding was
used in previous sections and is termed no history.
The indicators for the current therapy were augmented with features derived from the
patient’s treatment history. The simplest encoding used one indicator for previous exposure
to a drug for each of the 17 compounds, hence it is denoted as binary history. A more
elaborate encoding takes into account the time since last exposure to the drug. Here the
history is not encoded as a binary variable, but as a continuous variable ranging between
0 (never exposed) to 1 (very recently exposed). Precisely, the variable was defined as
f (t, k) =
(
1−
0
1
tk
4300k
0 ≤ t ≤ 4300
,
else
where t is the time since last exposure in days (with a maximum of 4300) and k is a
parameter controlling the decay of the influence of previously used drugs. A value of 1 for
k equals linear decay, values larger (smaller) than 1 lead to a slower (faster) decay of the
influence. Due to the nature of the encoding it is called continuous history. The motivation
for this feature is that compounds that were taken a long time ago may have a reduced
impact on the current regimen.
In addition to a simple concatenation of features for the current drugs and historic use
of drugs, special interaction features were introduced aiming at modeling the interaction
between previously used drugs and drugs in the current regimen. For example, if the
current treatment does not include any protease inhibitor, the previous use of protease
inhibitors is expected to have little impact on the treatment outcome. Like in the ME
engine, second order features represent the previous use of a compound and the current
use of another drug from the same class, i.e. 49 (7×7) and 100 (10×10) additional features
model interaction between PIs and RTIs, respectively. These 149 variables extend the 17
binary variables for the current regimen. This approach was applied to the binary and
continuous history features, thus, leading to the 2nd order binary and 2nd order continuous
encodings.
Genotype Features
For correctly rating the performance of the history-only prediction within a state-of-theart context, we compared it to a prediction based on genotype. Precisely, we applied the
encoding used for g2p-THEO (49 binary variables representing mutations in protease and
reverse transcriptase, 17 binary indicators for the drugs in the current regimen, and 17
variables for the genetic barrier to drug resistance) and trained a classifier on the WG
dataset.
Statistical Learning
The features were tested with two statistical learning techniques: logistic regression and
random forest. The number of trees in the forest was set to 100. The parameters of
the history encoding, the cutoff in days for a compounds for being considered as part
110
5 EuResist: Uniting Data for Fighting HIV
of the history, and k, which controls the decay of the influence of previously used drugs,
were optimized in a 10-fold cross-validation setting (separately for both learning methods).
Moreover, all history encodings were compared in a 10-fold cross-validation setting on the
HO dataset. Performance of the genotype-based model was assessed in a 10-fold crossvalidation setting on the WG dataset. And for the comparison to the genotype-based
model the history-only models was trained on only those instances of HO for which no
associated genotype was available (i.e. HO without WG: 31,239), the remaining instances
were used as a test set and split into 10 equally sized folds matching the folds of the
cross-validation for the genotype-based model.
In addition to a performance comparison we also studied the best way for combining
genotype and history predictions. Here we focused on three strategies. In strategy one
we added the history features as additional covariates to the genotype encoding (concatenation). This approach, however, limits the data used for training to the WG dataset.
Strategy two used the prediction computed by the history-only model as an additional
covariate in the genotype encoding (prediction feature). This approach is akin to the one
used in GD engine (Section 5.2), which adds the prediction of a Bayesian network to the
list of features. The third strategy simply computed the mean of the history-only and the
genotype-based prediction (combined by mean). In the latter two approaches the history
model could be trained on all available data (i.e. HO without WG), and the genotype-based
model only on WG.
Performance is assessed as the area under the ROC curve. Significance is assessed using
a paired one-sided Wilcoxon Rank Sum test.
5.9.2 Results
The results suggested that previous use of a drug should be considered from the first day
on. Linear increase of the threshold at which drugs are considered for the history encoding
resulted in a linear decrease in AUC for both classifiers (data not shown; see Müller (2008)
instead). Moreover, the best parameter k for modeling the influence of past treatments
on the current regimen was 1 (i.e. linear decay) for both classifiers. It turned out that
values of k that are smaller than 1 impair the performance of logistic regression. The
random forest classifier was more robust with respect to the choice of k. For all further
experiments, k was set to 1 and drugs were considered from the first day on.
Table 5.8 summarizes the performance of the different history encodings obtained on the
HO dataset. Using the binary history significantly, outperformed the predictions based on
the current treatment only for both classifiers (p < 0.001). For logistic regression the best
encoding was the 2nd order binary encoding that significantly outperformed the standard
binary encoding (p < 0.001). The random forest classifier achieved the best results with
the continuous encoding, which also significantly outperformed the binary one (p < 0.02).
The use of second order features seriously impaired the performance of random forest,
which performed significantly worse than their standard counterparts (p < 0.001). The
random forest classifier performs in general better than the logistic regression. However,
the initial difference of 0.03 in AUC (no history) could be reduced to 0.008 using the best
performing history features for both approaches.
Table 5.9 shows the performance of the best history encoding and the genotype-based
5.9 Therapy History: Replacement for the Genotype?
no history
binary
continuous
2nd order binary
2nd order continuous
logistic regression
0.670 (0.011)
0.713 (0.008)
0.714 (0.009)
0.726 (0.008)
0.707 (0.011)
111
random forest
0.700 (0.008)
0.729 (0.007)
0.734 (0.008)
0.720 (0.007)
0.727 (0.007)
Table 5.8: Mean 10-fold cross-validation performance in AUC on the HO dataset. Standard
deviation is given in parenthesis.
history-only
genotype-based
concatenation
prediction feature
combined by mean
logistic regression
0.722 (0.027)
0.766 (0.025)
0.778 (0.025)
0.781 (0.024)
0.780 (0.025)
random forest
0.746 (0.023)
0.771 (0.022)
0.793 (0.018)
0.787 (0.020)
0.794 (0.022)
Table 5.9: Mean 10-fold cross-validation performance on the dataset WG achieved by the
best performing history-only models and the genotype based prediction and the
three combination approaches. Standard deviation is given in parenthesis.
predictor on the WG dataset. The genotype based prediction clearly outperformed the
history-only predictions for both classifiers (p < 0.002). Again, the random forest classifier
performs better than the logistic regression. Using the genotype features, however, the
difference is only marginal – an effect that we observed before with g2p-THEO (Chapter 4). All three ways for combining the history information and the genotype information
significantly outperformed the genotype alone (p = 0.032 for the smallest improvement:
logistic regression and concatenation). None of the improvements among the different ways
of combination achieved statistical significance.
5.9.3 Discussion
The study showed that prediction of treatment response based on the current regimen
and the genotype of the virus only can significantly be improved by including information
on the patient’s treatment history. Moreover, recoding the exposure to a previously used
drugs from the first day on provided the best results. We explored further features derived
from the patient’s treatment history, among those were similarities of the current therapy
to the preceding regimen, but none of the other encodings could improve the performance
over the results presented here (Müller, 2008).
A possible confounder of the analysis was the fact that the treatment history was likely
to be incomplete for a large number of patients. However, repeating the analysis with
a dataset comprising only patients with their treatment history completely recorded in
the database, which resulted in a dataset with 17,720 treatment changes, led to the same
results. The AUC for all encodings of the history (including no history) were about 0.03
higher then the results in Table 5.8 (data not shown).
112
5 EuResist: Uniting Data for Fighting HIV
In our study, the history-based prediction is clearly inferior to the genotype-based prediction. Recently, other studies presented results where the difference between a genotypebased and a treatment history-based prediction were much less pronounced (Prosperi et al.,
2009a; Revell et al., 2009). A potential cause for the discrepancy could be that the latter
two studies aimed at predicting virological response at a fixed time point and additionally
allowed the use of baseline VL as an additional covariate. In both studies, which applied random forests as a statistical learning method, baseline VL was ranked as the most
important feature. However, due to a known bias of the random forest to rate continuous or categorical variables higher than variables with few categories – especially binary
ones (Strobl et al., 2007), the impact of VL should be reassessed. The baseline VL might
nevertheless be a good predictor for response to combination therapy, its application in
resource-limited settings, however, remains doubtful; owing to the high cost of the viral
load assay. Fiscus et al. (2006) investigate alternatives of the established VL assay that
can be used in regions with limited resources where response to antiretroviral treatment is
still typically monitored using the CD4+ T cell counts.
In an ongoing work Saigo et al. (in preparation) use a tailored machine learning approach
based on subsequence mining (Nowozin et al., 2007) that tries to exploit the order in which
drugs were applied in the patient’s history. Here the value of genotypic information over
treatment history alone could be confirmed using the original standard datum as response
definition. Moreover, interpretation of the model revealed that the information about
initial response (success or failure) of treatments in the past plays an important role.
To conclude, the study demonstrated that response to treatment can be predicted with
good performance based on treatment history alone. The best representation of the treatment history was different for the two evaluated statistical learning methods. Such historybased models might be an option for resource-limited settings and are currently subject
of further investigations (Prosperi et al., 2009a; Revell et al., 2009), and (Saigo et al., in
preparation). Such models will certainly be useful as soon as the multitude of antiretroviral
drugs is also available in resource-limited settings. So far, the WHO addressed the inability
to provide personalized HIV treatment to those patients by issuing simplified treatment
protocols that cover first- and second-line treatment (Gilks et al., 2006).
It still remains a question whether the performances of the history-based models assessed
on databases reflecting the pandemic in the developed countries can be transferred to the
situation in resource-limited settings; with the majority of viruses originating from other
subtypes than B. Moreover, the performance of the history-based models is assessed on
treatment change decisions that are usually genotype-based, either on a baseline genotype
or a historic genotype. Thus, the genotypic information is implicitly contained in the drug
combination that the model is asked to assess and therefore may lead to an overestimated
performance in the comparison to genotype based methods. Therefore, either prospective
studies or retrospective analyses on treatment changes that were not based on genotypes
are required to assess the usefulness of history-only-based prediction models.
6 Planning Sequences of HIV Therapies
In the previous chapters we demonstrated that predicting virological response to combination antiretroviral therapy is a feasible task. The resulting models, however, focus solely
on the efficacy of a regimen, which is only one aspect of a putative treatment the clinician has to assess. The other side requiring consideration is the effect of the regimen on
the virus (or the whole viral population); specifically, which mutations will the regimen
provoke and how will these mutations affect the efficacy of future treatments. Patients
infected with HIV cannot be cured currently since the virus integrates its DNA into the
genome of the host cell, thus making a life-long therapy a necessity. For ensuring the best
use of the currently available panel of antiretroviral drugs it is essential to apply them in
an order that avoids or at least minimizes cross-resistance effects and exploits potential
resensitization effects.
Jiang et al. (2003) introduced the notion of future drug options (FDO) based on resistance of the virus against single drugs and complete drug classes as assessed by a rulesbased interpretation algorithm. The FDO measure was applied to data from the clinical
trial ACTG 364 (Katzenstein et al., 2003) conducted by the AIDS clinical trial group
(ACTG) and focused on assessing the impact of baseline phenotypic resistance on treatment length. The trail comprised mainly treatments with two NRTIs and the PI NFV
and/or the NNRTI EFV. Difference in the FDO between the viral genotype before treatment start and at virological failure was studied for determining the most effective regimen
in terms of resistance cost. When corrected with respect to the time to virological failure,
EFV based treatments exhibit a higher resistance cost than NFV or NFV+EFV based
regimens. The study clearly demonstrates the use of the FDO concept for devising general
treatment guidelines. For using this approach to develop a personalized treatment plan,
however, one has to compare the current predominant viral population in the patient to
the makeup of that population at failure of the possible regimens.
Various approaches have been undertaken to stochastically simulate the evolution of
the viral population during treatment, for examples see Deforche et al. (2008); Prosperi
et al. (2009b); Perelson (2002); Ribeiro and Bonhoeffer (2000) and references therein.
These approaches focus on modeling of the infection dynamics of HIV, beginning with a
simple hunter-pray dynamics up to models that consider changes in the viral environment
affecting its replicative success, for example the presence of antiretroviral drugs. The latter
models qualify for approaching the problem of estimating the viral population at failure of
a regimen. The computational time, however, typically required for this in silico evolution
significantly exceeds the time available for interactive web services (e.g. about 3 min for
a single drug combination (Prosperi et al., 2009b)). Moreover, these models are based on
certain modeling assumptions and rely on parameters (e.g. rate at which new target cells
are generated, death rate of infected and uninfected cells) that are hard to estimate from
available biological data.
113
114
6 Planning Sequences of HIV Therapies
This chapter introduces an approach for computing the viral population at the time
of treatment failure. The approach is based on finite state machines and is scalable, thus
eventually affording web services with moderate waiting times. The strategy for calculating
personalized future drug options comprises three steps: (i) selection of putative effective
regimens with methods described earlier, (ii) in silico evolution of the virus according to
a particular treatment, and (iii) assessment of the FDO at failure of that putative drug
combination.
Sections 6.1 and 6.2 introduce the framework employed for simulating the viral evolution
during treatment. Section 6.3 describes the applied mutation models along with their
assumptions, and Section 6.4 presents a validation for the mutation models. Section 6.5
concludes the chapter with final remarks and a case study focusing on the four most
prominent first-line regimens.
6.1 Weighted Finite State Transducers
Briefly, weighted finite state transducers are a family of finite state automata. They extend
the concept of simple acceptors that comprise an input alphabet (A), a set of states (Q)
with directed edges between them (E), as well as initial (I) and final (F ) states by an
output alphabet (B), costs for every edge and cost functions for initial (λ) and final (ρ)
states. Formally, a finite state acceptor is quintuple:
F = (A, Q, I, F, E),
where I, F ⊆ Q and E ⊆ Q × A ∪ {} × Q, with beeing the “empty word”. Basically,
finite state acceptors verify whether words defined over the input alphabet are valid or
not. Here, a word is a concatenation of elements of the alphabet. The verification process
starts always at the beginning of the word and in one of the initial states of the automaton.
From this initial state, the elements of the word are read one-by-one: if from the current
state there exists an outgoing edge that is labeled with the corresponding element, then
the acceptor proceeds to the target state of the arc and is ready to read the next element
of the word. A word is considered valid, only if there exists at least one initial state such
that reading the word from that initial state ends in one of the final states.
Analogously, a weighted finite state transducer is defined as:
T = (A, B, Q, I, F, E, λ, ρ),
where I, F ⊆ Q, E ⊆ Q × A ∪ {} × B ∪ {} × K × Q, λ : I → K and ρ : F → K, with
K being part of a semiring (K, ⊕, ⊗, 0̄, 1̄). Thus, (K, ⊕, 0̄) is a commutative monoid with
identity element 0̄, (K, ⊗, 1̄) is a monoid with identity element 1̄, ⊗ distributes of ⊕, and
0̄ is an annihilator for ⊗, i.e. ∀a ∈ K : a ⊗ 0̄ = 0̄ ⊗ a = 0̄. In addition to acceptors,
transducers emit elements of the output alphabet when traversing along an edge from one
state to the another. Consequently, for every valid input word, an word over the output
alphabet is generated. Moreover, in weighted finite state transducers, edges, initial states,
and final states are equipped with scores (over K). Hence, a valid input word and the
emitted output word are associated with a total score.
Finite state transducers have been used in the field of speech recognition and language
translation and provide a uniform way of representing knowledge for these tasks. In natural
6.1 Weighted Finite State Transducers
115
language processing knowledge is typically modeled in the form of conditional probabilities.
For example. the probability of observing a certain word w depends on the preceding words
in the sentence w0 , and can therefore be represented as P (w|w0 ). In a finite state automata
representation each possible sequence of words, the history, gives rise to a state; every
possible succeeding word is represented by an edge leaving that state, and the probability
of this word following the history corresponds to the edge weight. For practical reasons
the history is limited to a fixed length, i.e. n-gram models model the probability of the
n-th word given the n − 1 preceding ones. Weighted finite state acceptors are sufficient
for representing such language models. Transducers, however, are required to transform
instances from one alphabet to instances from another alphabet, e.g. translating a French
input sentence into English. The naïve approach, a word-to-word translation, maps one
French input word to an English word. Unfortunately, the translation of a single word is
often ambiguous, i.e. the word can have different translations which are highly dependent
of the context. As a consequence, one has to model P (e, f |e0 , f 0 ), where e is the translation for f and f 0 and e0 are the history of the French and English words, respectively.
Thus, every history of English and French words corresponds to a state in the transducer,
every French word together with a possible translations gives rise to an edge leaving that
state, and the probability of the French word following the history and the translation to
a particular English word given that history constitutes the edge weight. Since the probabilities have to be estimated from training data, which are always scarce, the distribution
P (e, f |e0 , f 0 ) is decomposed into P (e|f ), P (f |f 0 ), and P (e|e0 ) representing the translation
model, the French language model, and the English language model, respectively. These
models can then be learned independently from different datasets and represented as individual transducers and acceptors. The algorithm compose described below constructs an
approximation to the P (e, f |e0 , f 0 ) representing transducer from the three individual finite
state machines.
For applications in natural language processing the obvious choice for a semiring for
transducers is the Probability semiring: (R+ , +, ×, 0, 1). Due to computational reasons
the Log semiring is preferred: (R ∪ {−∞, +∞}, ⊕log , +, +∞, 0), here all probabilities
are represented by their negative natural logarithm and ⊕log is defined as: x ⊕log y =
− log(e−x + e−y ). According to these two semirings the probability of events along a
path from an initial state to a final state are multiplied, and probabilities of alternative paths are summed up. However, in language processing tasks the total probability
of a sequence of events if often dominated by the probability of the most likely path.
Thus, instead of accumulating the probabilities of all alternative paths it is sufficient to
compute the probability of the most likely path. This approximation is termed Viterbi
approximation or maximum approximation and is implemented by the tropical semiring:
(R ∪ {−∞, +∞}, min, +, +∞, 0). Hence, many tasks can be solved efficiently by simply
searching the shortest path from an initial to a final state in an automaton.
Due to the computational demands required in natural language processing domains,
libraries and toolkits offering efficient implementations of algorithms based on transducers
are available, e.g. Mohri et al. (2000); Hetherington (2004); Lombardy et al. (2004); Allauzen et al. (2007). For the experiments presented here the RWTH FSA toolkit (Kanthak
and Ney, 2004) is used. The toolkit is open source and offers a range of algorithms, C++
and Python interfaces as well as an command line tool. Many algorithms based on trans-
116
6 Planning Sequences of HIV Therapies
ducers afford an on-demand implementation. Briefly, an on-demand algorithm computes
the outgoing edges (and their weights) of a given state only as soon as the state is visited.
For instance, if one wants to multiply all edge weights of a transducer with a scalar and
afterwards process a word of the input alphabet, then the multiplication can be done for all
edges explicitly before reading the word. Alternatively, in an on-demand implementation
the multiplication can be carried out while the word is processed. This has the advantage
that only edges of states that are actually required during a computation are updated.
Kanthak and Ney (2004) give a definition of local algorithms, which is a prerequisite for
the on-demand implementation. Consider an algorithm that generates a new transducer
based on one or more transducers constituting its input (e.g. composition of transducers; see following section). If this algorithm is able to generate for the new transducer
any arbitrary state and all its outgoing edges depending only on the information about
the corresponding state(s) of the algorithm’s input transducer(s), then it has the local
property. For instance, multiplication of edge weights with a scalar value has the local
property: for computing an arbitrary state in the resulting transducer, only information
on that state in the input transducer is needed (weights of the outgoing edges), all other
states are irrelevant.
The command line toolkit reads transducers in a binary and an XML format. The latter
is mainly used for construction of new transducers by the user, while the former allows
efficient storage and input-output operations. The following sections introduce some basic
and important algorithms like the composition of two weighted finite state transducers and
the extraction of the shortest path or the n-shortest paths of acceptors.
6.1.1 Composition
Weighted finite state transducer composition is the generalization of nondeterministic finite
state automata intersection. The algorithm is used to construct large complex automata
from small and simple ones. The intersection of nondeterministic finite state machines is
defined as follows. Let F1 = (A, Q1 , I1 , F1 , E1 ) and F2 = (A, Q2 , I2 , F2 , E2 ) be two finite
state acceptors, then the intersection automaton is defined as:
F1 ∩ F2 = F = (A, Q1 × Q2 , I1 × I2 , F1 × F2 , E),
where ((q1 , q2 ), a, (q10 , q20 )) ∈ E ⇔ (q1 , a, q10 ) ∈ E1 ∧ (q2 , a, q20 ) ∈ E2 . Basically, the new
automaton comprises one state for every combination of states of the two input automata
(Q1 × Q2 ). Thus, each state in the resulting acceptor can be mapped to exactly one state
in each of the input acceptors. Likewise, the new initial (final) states are defined by all
combinations of initial (final) states of the input acceptors. Edges in the resulting acceptor
are added based on pairs of edges of the input automata that have the same edge label:
the source (target) of the new edge is the state representing the two source (target) states
of the edges in the two input automata.
When moving from acceptors to weighted transducers one has to respect the edge weights
as well as the input and output symbols. Thus, let T1 = (A, B, Q1 , I1 , F1 , E1 , λ1 , ρ1 ) and
T2 = (B, C, Q2 , I2 , F2 , E2 , λ2 , ρ2 ) be two weighted transducers. The composition of the two
transducers is defined as:
T1 ◦ T2 = T = (A, C, Q1 × Q2 , I1 × I2 , F1 × F2 , E, λ, ρ),
6.1 Weighted Finite State Transducers













117

























Figure 6.1: Example of acceptor intersection and non-commutative transducer composition. States are represented as circles, initial states and final states are marked
using bold and double circle boundaries, respectively. Numbers inside the circles stand for the labels. Edge labels in acceptors comprise a single element of
the input alphabet, while edge labels in transducers comprise an element of the
input alphabet (left) and one element of the output alphabet (right) separated
by a colon. The first row shows the two input acceptors and intersection result
in the last column. The second row shows two input transducers and the result
of the composition left ◦ right and right ◦ left in the last column of the second
and third row, respectively.
where ((q1 , q2 ), a, c, w1 ⊗ w2 , (q10 , q20 )) ∈ E ⇔ (q1 , a, b, w1 , q10 ) ∈ E1 ∧ (q2 , b, c, w2 , q20 ) ∈ E2 ,
λ(q1 , q2 ) → w1 ⊗w2 ⇔ λ1 (q1 ) = w1 ∧λ2 (q2 ) = w2 , and ρ(q1 , q2 ) → w1 ⊗w2 ⇔ ρ1 (q1 ) = w1 ∧
ρ2 (q2 ) = w2 . The major difference (apart from updating the weights for edges, initial states,
and final states) to the construction of the intersection acceptor is that edges in transducer
composition are added based on pairs of edges of the two input transducers where the
output element of one edge matches the input element of the other edge. Consequently,
the composition of transducers is not commutative. An example of acceptor intersection
(top row) and transducer composition (lower rows) is shown in Figure 6.1.
Algorithm 2 provides the composition algorithm based on Pereira and Riley (1997)
in pseudocode. It constructs a transducer T = (A, C, Q, I, F, E, λ, ρ) from two input
transducers T1 and T2 . This algorithm, however, assumes -free input transducers and uses
a queue S. The function E[q] retrieves all edges leaving the state q and functions i[e], o[e],
and n[e] return the input label, output label, and target state of edge e, respectively.
A problem with -containing transducers can occur when an -output edge of T1 is
matched to (a sequence of) -input edges of T2 . In this case, the application of Algorithm 2
could generate redundant -paths in the resulting transducers that (depending on the
semiring) could lead to incorrect path costs. Thus, all but one -path have to be removed
from the composite transducers. Remarkably, this filtering can be carried out by another
118
6 Planning Sequences of HIV Therapies
Algorithm 2 Weighted-Composition(T1 , T2 )
Q ← I1 × I2
S ← I1 × I2
while S 6= ∅ do
(q1 , q2 ) ← Head(S); Dequeue(S)
5:
if (q1 , q2 ) ∈ I1 × I2 then
I ← I ∪ {(q1 , q2 )}
λ(q1 , q2 ) ← λ1 (q1 ) ⊗ λ2 (q2 )
end if
if (q1 , q2 ) ∈ F1 × F2 then
10:
F ← F ∪ {(q1 , q2 )}
ρ(q1 , q2 ) ← ρ1 (q1 ) ⊗ ρ2 (q2 )
end if
for each (e1 , e2 ) ∈ E[q1 ] × E[q2 ] such that o[e1 ] = i[e2 ] do
if (n[e1 ], n[e2 ]) ∈
/ Q then
15:
Q ← Q ∪ {(n[e1 ], n[e2 ])}
Enqueue(S, (n[e1 ], n[e2 ]))
end if
E ← E ∪ {((q1 , q2 ), i[e1 ], o[e2 ], w[e1 ] ⊗ w[e2 ], (n[e1 ], n[e2 ]))}
end for
20: end while
return T
transducer. To this end, the input transducers have to be preprocessed: output (input)
-labels in transducer T1 (T2 ) are mapped to 2 (1 ) resulting in T˜1 (T˜2 ). Using the filter
transducer FT , T1 ◦ T2 can correctly be computed by T˜1 ◦ FT ◦ T˜2 using Algorithm 2 (for
details see e.g. Pereira and Riley (1997) and Mohri (2005)).
Returning to the language translation example from above, one would create two acceptors, one representing the English language model Elm and one the French language model
Flm . A simple translation transducer from French to English FE t can be constructed by
a transducer comprising one state and edges for each French input word and a possible
English translation. The operation Flm ◦ F E t ◦ Elm builds a transducer that translates
French sentences to English TF →E . For translation, a single French sentence is represented
as a linear acceptor f with edge labels corresponding to French words, and f ◦ TF →E generates all possible English translations of f . However, for translating f most parts of the
translation transducer TF →E are not visited. Here the use of an on-demand implementation together with pruning (see following sections) results in improved computational
performance.
A transducer example from the domain of Bioinformatics is given by the combination of
translating a sequence of nucleotides into amino acids and aligning the result to a reference
amino acid sequence. For this task two transducers are required. Transducer TT r translates
a sequence of nucleotides into a sequence of amino acids, and a second transducer encodes
the alignment TAl . TT r can be constructed so that all three reading frames are translated,
i.e. there is the possibility to read at most two nucleotides without starting the translation
6.1 Weighted Finite State Transducers
119
process (see Figure 6.2). The lower part of Figure 6.2 depicts a sketch of the alignment
transducer, it supports affine gap costs (g for gap open and ge for gap extend) and one
edge for each possible mismatch of amino acids, the cost of a mismatch m can, for instance,
be related to the BLOSUM matrix. Let Sq,nt and Sr,A be linear acceptors representing the
query nucleotide sequence and the reference amino acid sequence using the edge labels,
respectively, then Sq,nt ◦ TT r ◦ TAl ◦ Sr,A computes three possible translations of nucleotides
to amino acids and the cost of all possible alignments to a reference.
6.1.2 Single Source Shortest Path
The previous section demonstrated how large transducers can be constructed from small
ones. The information one wants to compute, however, is still concealed in these large
directed graphs. In particular, the set of all possible translations of a French sentence
into English is of less interest than the most likely translation. The same holds true
for possible alignments. If the transducer is defined over a tropical semiring, however,
then the desired information can be retrieved rather efficiently from the large automaton,
since standard algorithms can be applied to solve the single source shortest path (SSSP)
problem for computing the shortest path from an initial state to a final state. For example,
Dijkstra’s algorithm (Dijkstra, 1959) can be applied if all edge weights are non-negative,
the single source limitation can be circumvented by a slight modification of the automaton:
introduction of a new single initial state that links via -edges to all previous initial states.
The same modification can be applied to the final states, thus only the cost of the shortest
path between the single initial and the single final state is of interest. Theoretically, as
soon as the execution of Dijkstra’s algorithm extracts the final state from the queue, it
can be stopped. For allowing also the more general case with negative edge weights (but
non-negative circles) the Bellman-Ford algorithm (Bellman, 1958) can be used. Of note, if
the semiring used by the transducer is defined over R+ , then one can safely apply Dijkstra’s
algorithm.
The RWTH FSA toolkit implements Algorithm 3 that was introduced by Mohri (2002).
The algorithm is termed Generic Single Source Shortest Distance because some well known
and widely used algorithms are special cases of this algorithm (hence generic) and depending on the semiring the term path is not pertinent for what the algorithm computes (hence
distance). Mohri showed that the algorithm is correct for any semiring and queuing discipline. Precisely, given a transducer that is defined over a tropical semiring the queuing
discipline determines which algorithm is executed: if the queue used in Algorithm 3 follows
a first-in first-out or shortest-first principle then the algorithm is equivalent to executing
Bellman-Ford or Dijkstra, respectively.
Briefly, the algorithm maintains two properties for each state, d[q] and r[q] store the
current distance estimate from the initial state i to state q and the total weight added to
d[q] since the last time q was extracted from the queue S, respectively. Like in Algorithm 2,
E[q] represents all edges leaving q, and n[e] and w[e] return the target state and the edge
weight, respectively. In the tropical semiring, ⊕ is equal to min. Thus, intuitively, in this
case the statement in Line 11 checks whether the current distance of n[e] to i is shorter
than the current distance R from i to q plus the edge weight from q to n[e]. If this is not
the case, then d[n[e]] and r[n[e]] are updated. Furthermore, if n[e] is not in the queue it is
120
6 Planning Sequences of HIV Therapies























Figure 6.2: Example transducers for translating nucleotide sequences into amino acid sequences and alignment of two sequences. The notation for the transducers is
the same as in Figure 6.1. Furthermore, *EPS* corresponds to (the “empty
word”) and the slash in the edge label separates the edge weight (on the right)
from input and output elements (on the left). The top transducer TT r translates a sequence of nucleotides into amino acids, the first two states (0 and 1)
allow to read any nucleotide (n) without starting the translation process, the
states (3 and 4) represent the fact that the first (n1) and second (n2) nucleotide
of a codon have been read, respectively, and the arc between state 4 and 2 emits
the amino acid (A) encoded by the codon n1n2n3. The bottom transducer TAl
encodes an alignment with affine gap costs. State 0 has arcs for matches (X:X)
and mismatches (X:Y) at cost m. Furthermore, states 1 and 2 encode deletions
and insertions, respectively, with respect to the reference sequence.
6.1 Weighted Finite State Transducers
121
Algorithm 3 Generic-Single-Source-Shortest-Distance(T , i)
for j ← 1 to |Q| do
d[j] ← r[j] ← 0̄
end for
d[i] ← r[i] ← 1̄
5: S ← {i}
while S 6= ∅ do
q ← Head(S); Dequeue(S)
R ← r[q]
r[q] ← 0̄
10:
for each e ∈ E[q] do
if d[n[e]] 6= d[n[e]] ⊕ (R ⊗ w[e]) then
d[n[e]] ← d[n[e]] ⊕ (R ⊗ w[e])
r[n[e]] ← r[n[e]] ⊕ (R ⊗ w[e])
if n[e] ∈
/ S then
15:
Enqueue(S, n[e])
end if
end if
end for
end while
added to S so that its outbound edges can be updated later. For a detailed analysis of the
running time, possible modifications of the algorithm and correctness see Mohri (2002).
Given the result of any single source shortest distance algorithm, the shortest path from
the source to all other states can easily be extracted. In particular, the paths leading
to final states. Therefore, the best translation of a French sentence into English can
be found by computing best(f ◦ TF →E ). Analog to this, the best reading frame and
alignment of a nucleotide sequence to an amino acid sequence is retrieved by executing
best(Sq,nt ◦ TT r ◦ TAl ◦ Sr,A ).
6.1.3 N -Shortest-Strings
Due to imperfect models or violated model assumptions, the best hypothesis of an automaton, as represented by the shortest path, need not be the best solution with respect
to a scoring function. If the model is sufficiently accurate, however, then the best (or a
better) solution might be among the top n (i.e. most likely) hypotheses represented by the
automaton. Moreover, these n-best lists can be post-processed to form a final solution.
In the case of a deterministic acceptor, the n-best paths in the automaton are equivalent
to the n-best hypotheses or n-best strings. In nondeterministic acceptors, however, the
n-best paths may contain duplicate hypotheses (with different costs). Thus, in order to
obtain n distinct hypotheses, nondeterministic automata have to be made deterministic
before computing the n-best paths.
The process of computing a deterministic weighted finite state automata from a nondeterministic one works analogously to the unweighted case via the classic subset construction
algorithm (Aho et al., 1986). Unlike in the unweighted case, not all weighted automata
122
6 Planning Sequences of HIV Therapies
can be made deterministic. Fortunately, any acyclic weighted automaton is determinizable (Allauzen and Mohri, 2002). Furthermore, the determinization algorithm affords a
local implementation (Mohri and Riley, 2002). This makes the algorithm extremely useful
for extraction of n-shortest-strings.
Given an (non)deterministic acceptor F = (A, Q, I, F, E), for which we can assume
(without loss of generality) that |I| = |F | = 1, one has to carry out the following steps
to compute the n-best strings. First, a potential function φ is computed, which provides
for every state q ∈ Q the shortest distance to the single final state. Obviously, φ can
be computed by inverting all edges in the automaton and executing Algorithm 3 starting
from the single final state. Then, the automaton F is determinized, resulting in the
deterministic automaton F 0 = (A, Q0 , I 0 , F 0 , E 0 ). Based on the potential function φ defined
for F, the potential function Φ for F 0 can be derived on demand (Mohri and Riley, 2002).
This potential function for the states of the determinized automaton is the basis for the
ordering of the elements in the priority queue S that is used in Algorithm 4 (based on Mohri
and Riley (2002)). Precisely, the ordering of the queue is defined as: (p, c) < (p0 , c0 ) ⇔
(c + Φ[p] < c0 + Φ[p0 ]), where p, p0 ∈ Q0 , and c and c0 are the costs from the initial state to
the states q and q 0 , respectively. Since Φ never underestimates the cost from the current
state to a final state, one can view Φ also as the admissible heuristic that plays a central
role in the A∗ algorithm (Hart et al., 1968).
Algorithm 4 n-shortest-paths(F 0 , n)
for p ← 1 to |Q0 | do
r[p] ← 0̄
end for
π[(i0 , 0)] ← NIL
5: S ← {(i0 , 0)}
while S 6= ∅ do
(p, c) ← Head(S); Dequeue(S)
r[p] ← r[p] + 1
if r[p] = n ∧ p ∈ F then
10:
Exit
end if
if r[p] ≤ n then
for each e ∈ E[p] do
c0 ← c ⊗ w[e]
15:
π[(n[e], c0 )] ← (p, c)
Enqueue(S, (n[e], c0 ))
end for
end if
end while
The algorithm maintains for each state of the automaton, which was made deterministic,
the attribute r[p] that stores the number of times the state (p) was extracted from the
priority queue S. This attribute is essential as it provides the stopping criterion for the
algorithm: r[p] is initialized with 0 for all states and as soon as the single finial state is
6.2 Transducers in Therapy Planning
123
extracted from the queue n times (lines 9-11) the algorithm terminates. Paths are defined
by storing a predecessor π for each pair (p, c) (Mohri and Riley, 2002).
The potential function φ can be exploited in further ways together with pruning. Briefly,
when applying forward pruning the search algorithm does not follow edges that lead to a
total path cost exceeding a certain threshold. Mohri and Riley (2001) introduced a weight
pushing algorithm that requires the potential function in order to move path weights further
to the beginning (or end) of a path. Having the costs of the complete path accumulated
at the beginning of the path facilitates more efficient forward pruning, i.e. one can decide
not to search along paths that promise a high total cost early in the search process.
6.2 Transducers in Therapy Planning
The strategy for assessing future drug options includes the computation of an estimate of
the viral sequence at failure of the therapy. The baseline sequence can be represented as a
linear acceptor with the edge labels corresponding to the sequence positions. For allowing
an unambiguous identification, edges are labeled Tar_Pos_AA with Tar, Pos, and AA
corresponding to the protein the sequence encodes (here Tar ∈ {PRO, RT}), the amino
acid position, and the amino acid at that position, respectively.
Given a function Mt (p, a, a0 ) that provides the probability of mutating the amino acids a
to a0 at position p under the therapy t, one can easily construct a transducer representing
this function. This is illustrated by the left transducer in Figure 6.3. The self loops at the
initial state and the final state read single sequence positions and leave them unchanged
(at no cost). All transitions from the initial state to the final state introduce exactly
one mutation, e.g. at position p from a0 to a0 at the cost of Mt (p, a, a0 ). A composition
of the linear sequence automaton with this mutation transducers introduces all possible
mutations at all positions. Using the algorithms described above, one can retrieve the
sequence with the most likely mutation or the n most likely sequences. Figure 6.4 displays a
toy example. The base sequence comprises (the first) three amino acids of the protease (a).
The mutation transducer (b) allows one possible mutation for each position (colored edges).
The composition of the two automata results in a transducer that has one mutation for each
path from the initial state to the final state (c). The most likely mutation can be extracted
by finding the shortest path in that transducer (d), and the list of n-best mutations can
be generated by extracting the n-best paths (e). The latter algorithm, however, is only
defined for acceptors, thus the mutations (colored edges) cannot be inferred from the
graph anymore. For introducing two mutations in the baseline sequence one simply has to
compute the composition of the transducer in Figure 6.4 c) and the mutation transducer.
Transducers constructed in a different way can be used to assess the activity of a regimen with respect to a given genotype. The right transducer in Figure 6.3 illustrates the
concept of these rating transducers. In this automaton the output labels are identical to
the corresponding input label for all edges. Thus, a composition of an other transducers
with the rating transducers leaves the edge labels unchanged. In the rating transducer,
the edges are equipped with a weight derived from the relevant rating function Rt (p, a).
The rating function Rt (p, a) provides a score for amino acid a at position p in the context
of treatment t. For instance, Rt (p, a) can express the resistance amino acid a at position
p causes to treatment t. The initial state has a cost Rt (0) which corresponds to the offset
124
6 Planning Sequences of HIV Therapies







Figure 6.3: Concept of mutation and rating transducers. The notation of the transducers is
the same as in previous figures. The mutation transducer (left) for each treatment has two states, connections between these states introduce mutations.
The edge weight corresponds to the score of the mutation as represented by
the function Mt (p, a, a0 ). The rating transducer (right) has one loop edge for
every position and amino acid, the edge weight Rt (p, a) can for example correspond to the weights for an amino acid at a certain position as derived from the
linear SVM models, which predict in vitro drug resistance. The weight of the
state Rt (0) then corresponds to the offset of the linear model z0 (see Eqn. 6.2).
of that rating function. After computing the composition of a linear acceptor representing
a sequence (short: sequence acceptor) with such a rating transducer, each edge in the sequence acceptor receives a score. The sum of all scores along a path constitutes the activity
of a treatment against the sequence represented by the path. Rating transducers can for
instance be used to remove ambiguities from an input sequence. In our case, ambiguities
are by parallel edges between subsequent states in the sequence acceptor. In the composition of the sequence acceptor and the rating transducer these edges are extended by a score
corresponding to the amino acid at the alignment position. Thus, one can arrange that
the shortest path in the transducer corresponds to the most resistant variant represented
by the sequence.
So far the transducers only represent mutation or rating instructions for a single treatment. However, it is trivial to extend them to allow mutation and rating instructions for
different treatments within one automaton. This allows to store all models within a single
large transducer instead of multiple small ones. Figure 6.5 extends the toy example with a
second treatment in the mutation transducer. For selecting the correct mutation function,
the first edge in the sequence acceptor is labeled with the corresponding treatment.
6.3 Mutation Models
The major problem with building the mutation transducer is the estimation of correct mutation probabilities (or scores). The following two subsections present approaches that use
either in vitro or in vivo data for estimating mutation probabilities. The proposed models
all aim at assessing mutation probabilities with respect to individual drugs (exceptions
are the RTI mutation probabilities estimated from in vivo data). This approach has a
drawback since mutation models for individual drugs have to be combined in some way
6.3 Mutation Models
125




















(a) sequence acceptor
(b) mutation transducer













(c) composition







(d) best result





















(e) 3-best result
Figure 6.4: Toy transducer example. HIV sequences are represented by linear acceptors
(a), the mutation transducer for each treatment comprises two states, with
the connecting edges encoding the mutations (b). The composition introduces
every possible mutation into the sequence (c). The most likely mutation can be
retrieved via finding the shortest path (d). Likewise the n most likely mutations
can be extracted by listing the n-best paths.
126
6 Planning Sequences of HIV Therapies









(a) sequence acceptor
































(b) mutation transducer
Figure 6.5: Toy transducer example for multiple treatments. The sequence information
is preceded by an edge with a treatment label (a), the complete mutation
transducer comprises two states for each treatment, the initial state connects
to the pre-mutation state via an edge labeled with the treatment (b).
for building the Mt functions that model mutation probabilities for a specific treatment.
On the other hand, having mutation models for individual drugs available, enables the
computation of therapy mutation model for every possible combination of drugs. If not
stated otherwise the therapy mutation models are constructed as geometric mean of the
mutation models for individual drugs
Mt (p, a, a0 ) =
Y
1
Md (p, a, a0 ) |t| ,
d∈t
where d are the drugs used in therapy t and |t| corresponds to the number of drugs in the
therapy. Since the transducers are defined over the tropical semiring this is equivalent to
log Mt (p, a, a0 ) =
1 X
log Md (p, a, a0 ).
|t|
d∈t
Of note, the framework also supports the use of a weighted geometric mean:
Y
X
Mt (p, a, a0 ) =
Md (p, a, a0 )wd,t ⇔ log Mt (p, a, a0 ) =
wd,t log Md (p, a, a0 )
d∈t
with
P
d∈t wd,t
= 1.
d∈t
(6.1)
6.3 Mutation Models
127
6.3.1 Mutation Probabilities Derived From GENO 2 PHENO
The first model is motivated by the work of Beerenwinkel et al. (2003b) and assumes
that mutations, which confer the most dramatic change in resistance with respect to the
current regimen, are selected first during therapy. Unlike the work by Beerenwinkel et al.,
the model used here does not make use of the “maximum assumption” between drugs of
the same class (see Section 4.1.1), but it simply takes into account the change of resistance
against all drugs when a mutation is introduced. The SVM models in geno2pheno use
a linear kernel to predict the decadic logarithm of the resistance factor; this allows for
rewriting the regression function as a linear model (see Section 3.1). In order to simplify
further steps we rewrite the linear model w
~ · ~x + b for predicting phenotypic drug resistance
as:
N X
X
log (rf(seq)) = z0 +
zp,a ∗ δ(a, seq(p)),
(6.2)
p=1 a
where z0 is the offset (before b), zp,a is the weight for amino acid a at position p ∈ {1, . . . , N }
(extracted from the corresponding position in w),
~ and δ(a, seq(p)) is 1 only if the query
sequence presents amino acid a at position p (representing the feature vector ~x). Because
of this structure the weights zp,a can be used directly to construct the mutation function
log M (p, a, a0 ). Precisely, the mutation score is defined by the difference in predicted resistance between a sequence seq and a mutant at position p exchanging amino acid a with
a0 (seqp,a→a0 ):
log M (p, a, a0 )
=
(6.2)
=
log (rf(seq)) − log rf(seqp,a→a0 )
z0 +
N X
X
zp,a ∗ δ(a, seq(p))
p=1 a

− z0 +
N X
X

zp,a ∗ δ(a, seqp,a→a0 (p))
p=1 a
=
zp,a − zp,a0 .
The use of the raw resistance scores, however, introduces a problem since different drugs
show different ranges of resistance levels. Thus, for minimizing range-related problems, the
models were scaled to exhibit similar resistance levels for each drug. The scaling works in
two steps and focuses on the bimodal distribution of resistance factors (Figure 3.3): first
mean µ and variance σ of the susceptible subpopulation were determined for each drug and
. Second, the values were scaled
the resistance factor was transformed to log(rf0 ) = log(rf)−µ
σ
(i.e. divided by a scalar s) such that the mean of the resistant subpopulation µresist has
the same value for all drugs. Note that the first transformation is similar to the z-scores
used in geno2pheno (see Section 3.2). Using this scaling the score of a mutation is
log M (p, a, a0 ) =
zp,a − zp,a0
σ·s
and a combination of Eqn. 6.1 and Eqn. 6.3 yields for arbitrary therapies
log Mt (p, a, a0 ) =
X
d∈t
wd,t
zd,p,a − zd,p,a0
σd sd
(6.3)
128
6 Planning Sequences of HIV Therapies
with d being a drug in treatment t, wd,t standing for the weight of the drug in the treatment
P
( d∈t wd,t = 1). All other variables receive and additional index d indicating that they
are specific for an individual drug. This mutation model is termed g2praw .
An alternative model, also derived from geno2pheno, deduces M (p, a, a0 ) from the
probability that a susceptible sequence becomes resistant after acquiring an additional
mutation. Precisely, given a set of sequences, let S be the number of all susceptible
sequences, Sp,a ≤ S be the number of all susceptible sequences with amino acid a at
position p, and Rp,a→a0 ≤ Sp,a be the number of sequences that move to the resistant
subpopulation after mutating a to a0 . Hence
M (p, a, a0 ) =
Rp,a→a0
Sp,a
(6.4)
expresses the probability that a sequence becomes resistant after mutating a to a0 . Obviously, 0 ≤ M (p, a, a0 ) ≤ 1 holds and no scaling for the scores is required for achieving
compatibility between different drugs. Here, we define the cutoff between susceptible and
resistant viruses as the intersection of the Gaussians modeling the distribution of RF values for susceptible and resistant viruses (Figure 3.3). The probability of transcending the
boundary from susceptible to resistant by accumulating one mutation clearly depends on
the level of resistance the virus currently exhibits and the magnitude of resistance conferred
by that additional mutation. Hence, the potential of a certain mutation is estimated from
a dataset by averaging over all sequences. The resulting M (p, a, a0 ) scores are dominated
by the magnitude of change in resistance conferred by the mutation from a to a0 and are
therefore highly correlated with the corresponding scores of g2praw . For instance, a mutation that increases phenotypic resistance by 10 fold can turn a larger fraction of susceptible
viruses into resistance ones than a mutation that increases resistance only by 2 fold. Consequently the former mutation will have a larger M (p, a, a0 ) score. Approximately 30, 000
HIV-1 pol sequences containing protease and reverse transcriptase from the EuResist
database were used to estimate the probabilities. This model is termed g2ptransition . Figure 6.6 depicts scatter plots between log M (p, a, a0 ) from g2praw (x-axis) and g2ptransition
(y-axis) for all drugs.
The scatter plots show clearly that there is a substantial correlation between the two
mutation scores. However, the plots reveal also that for some drugs the assumption, that
the most resistance conferring mutation will be selected first during treatment, may lead
to the preferred accumulation of rather rare mutations (e.g. RT mutation Q151M for ddI
and d4T). For smoothing the probabilities with respect to the chance of a mutation to
actually emerge Eqn. 6.4 is modified to
M (p, a, a0 ) =
Rp,a→a0 Rp,a0
∗
,
Sp,a
R
with R and Rp,a representing the number of all resistant sequences and the number of resistant sequences having amino acid a0 at position p, respectively. Thus, Rp,a0 /R expresses
the probability of observing a certain mutation in the resistant subpopulation. This mutation model is termed g2pmixed . Figure 6.7 depicts scatter plots between log M (p, a, a0 )
from g2praw (x-axis) and g2pmixed (y-axis) for all drugs.
For practical reasons the models were restricted to allowing only mutations from and to
amino acids that actually occur in the training data of geno2pheno at a given position.
6.3 Mutation Models
0.4
0.8
1.0
0.00
0.10
change in log(rf)
0.15
0.15
Q151M
0.25
0.30
0.30
0.0
0.4
0.6
G48V
0.8
1.0
0.1
0.2
0.3
change in log(rf)
0.5
0.6
I50V
0.7
M184V
0.20
0.25
0.2
0.4
0.4
G190Q
G190V
G190A
G190C
G190T
G190E
K103SC188L G190SY188L
R103N
K103N
0.6
0.8
1.0
1.2
K14S
Q18K
I72E
I15M
E34IS63C
S63Q
S63F
S63R
T4QS63G
S37H
I66M
C95L
A12E
E61N
I20T
R43K
S37E
S37I
T80S
P12S
P12K
E17RH63Q
H63R
I19T
Q18I
Y67L
P39S
T4N
N69Y
N12I
N12A
N12E
R8A
L63C
E67W
E67S
E35K
T19Q
L63G
T19V
I10F
R14T
C63Q
C63R
N69T
Y69Q
Q63R
L19R
K69R
E16W
E16D
K37H
K37E
K37I
V15I
Q18T
L63F
Q7D
I12A
V15M
G94C
V32G
H69Y
R14E
N37Q
L63Q
S37Q
Q18R
L63R
Q7E
A37D
Q18A
K70T
P79L
M72L
R41T
W6G
H69T
D60N
W6L
G16A
L38F
I15K
A63V
A63E
D35N
E67Y
K69E
K70Q
Q39T
E12Q
Q39P
Q39S
F53V
F53I
E12P
E12S
E12K
S37T
L19V
R70E
R41Q
V33I
V33M
T12Q
C95I
K55I
K14T
V19Q
L89S
T12P
N69Q
A71T
T4I
R14Q
P79H
L64M
S37Y D67C
E37Q
I19R
D29E
K55Q
Q61E
R69Y
R69T
I12E
Q92H
V63H
V63D
V63N
V63S
D67Y
D67L
A63H
K14E
T96P
T12S
T12K
E17A
Q7R
Q2I
I66F
N98I
V3I
Q61N
C37D
T37D
N37T
E68R
T63C
H61D
H61R
R70N
V15K
R8D
A63Q
A63D
A63N
A63L
A63F
L19Q
T72R
A63S
A63C
A63G
N37Y
Y67C
K45R
I72F
V56L
I19Q
I19V
C95S
D65K
H37Q
H37T
H37Y
R8L
I82C
A63R
T39P
T39S
M93L
I13R
E21Q
I72S
T63G
Q2L
L33F
V33L
E16G
L10S
K14Q
Q37D
E35N
K41T
H69Q
G49R
G51A
K41Q
A74R
I13V
G94D
K43I
E17G
L38M
Y37D
V77I
I85M
W67L
T63F
W67C
E37T
E37Y
Q58K
Q18E
V63L
I11R
I11F
L10C
V82I
I62V
E16A
T63Q
T63R
N98K
N98D
M89L
M89V
L15M
L15I
A12Q
Q18L
E70Q
E70N
E70K
E70T
A12P
K69Y
I36A
L10V
D65E
I63P
K37Q
K69T
R69Q
N12Q
T72I
N12P
N12S
N12K
M64V
S37D
P19R
P19V
W6R
Q63P
G6R
Q61D
I64V
T72E
A12S
A12K
L6R
I93L
Q61R
H61K
R70K
D29A
S4Q
S4T
V63C
V63G
S63P
N61D
N61R
L89I
P9L
H37D
I12Q
I72L
I12P
I12S
I12K
R43I
E61D
V33F
N61K
Q61K
E18H
P9T
V75I
K20L
L63P
V63Q
V63F
I33M
S67Y
S67L
V85M
D29V
K69Q
Q92E
M89S
A13R
A13V
G48Q
R63P
E21K
Q58E
M72V
L76V
V63R
V12Q
V12P
V12S
V12K
T36V
T36L
R72F
R72S
M46V
L64V
N37D
K20R
T74A
K37T
K37Y
R70T
P19Q
K43T
R55Q
R55I
L18H
L72V
I11V
E61R
E37D
R70Q
C63PH63P A63P
N14Q
I36V
I72V
N14T
N14E
V82C
I33L
I33F
G73A
K92R
I36L
L24I
Q18H
I84L
V32I
T72F
K18I
K18T
K18R
K18A
K18E
K18L
M62V
E67L
D60E
E72F
E72S
T74R
E61K
A74K
R20S
L15K
M20I
T72S
S4N
S4I
E72L
N55Q
E63Q
N55R
N55K
E63D
E63N
E63L
N55I
E63F
E63R
R16A
E63S
E63C
E63G
Q92R
T63P
R43T
M89I
A71V
G48I
S67C
T74K
T72L
R20M
M36A
K37D
M20T
P10I
P10F
I37D
G48R
L24F
K20S
T74E
T72V
L10I
L90V
G73C
A82F N7D
L10F
E68G
F99L
A74E
N60E
K20M
E72V
N7E
N7R
T74P
G48L
V10I
A91I
A91T
K18H
I82M
V10F
M36V
T71V
A74P
V71I
M36L
V82M
L54F
L54M
L54S
E67C
V82D
V82G
G63P
I82D
I82L
V63P
I82G
G48A
V82L
R72L
R72V
L13R
L13V
L82A
L82FI97F
M63PE63P
L82S
A71I
L90W
H92R
G67C
T74S
M46L
I54L
V93L
V89S
R20I
G73S
N63P D30V
R20T
A74S
L11V
I24F V89I
K20I
V82A
K20T
V82F
G73T
I54F
T71I G48V
I82A
D83N
I82F
D30Y
L54A
I97L
V82S
I84V
V20M
V20S
V50I
I54S
K74S
I54M
V20I
V20T
G48M
L54V
A82S
N88K
V47I
I82S
F82S
L46I
M46I
I54A
I54VN88S
I54TL90M
L54T
S88D
N88D
D30N
3.0
2.5
2.0
1.5
−log(prob)
1.0
0.5
NFV
0.5
0.6
3.5
3.0
2.5
2.0
−log(prob)
1.0
0.5
I82S
I54A
I54T
0.7
L89V
I13K
0.0
0.1
0.2
0.3
0.4
ATV
S37E
S37H
K70Q
N37Y
L38M
K43I
T72K
G48M
L63S
T12Q
K43E
D35N
V15I
T37N
T37Y
G94D
I19V
A63Q
L64I
S39Q
Q19M
S74P
R14V
N88T
N12P
K14V
I13V
I15M
S37N
S37Y
G48Q
K41N
P19Q
P19T
P19MC63T
E21Q
E21K
P1L
H69T
E37N
G73C
D67W
K37I
K37CR69H
T12P
E68G
I66N
I72R
I72S
A37N
A37Y
E35D
I93V
L19I
D65Q
M72T
M72E
M72K
E37Y
E21G
R43E
R43I
A12Q
E35N
A12T
V10C
P39T
A74T
V15M
T19M
C37D
L63T
Q18R
L5F
Y69H
Y69T
S12T
T74K
I36A
L15V
A12P
I20R
A71T
R69T
S12Q
M36I
D67F
D67L
L19V
K37Q
Q61D
H61E
K69H
Q35D
Q35N
Y67E
L93F
I72V
A37Q
A37I
A37C
Q92K
Q18A
S37I
A63C
T31P
V62M
H18L
W6C
N83D
G16A
T91I
T96P
S12P
E72I
Q61N
M72I
M72R
M72S
M72V
R41T
R41I
S63T
P19L
R70Q
R70K
H37Y
E18L
T37I
S37C
L5R
L5I
N37I
M93L
M93F
M93I
M93V
E37I
E67D
E67W
V32A
T72I
K69T
S37Q
V33L
Y37D
E34D
Q63C
I66L
Q12P
I66M
E37C
E34Q
N69H
E17R
W6R
T70Q
D25Y
A63L
T70K
L63P
E37Q
I62V
L10T
I20T
M36A
Q18H
L15I
V63P
N37C
N69T
T37C
T19L
N41R
E17G
K20V
K41R
L6W
I63T
K12Q
K12T
K12PP19I E7Q
E70Q
E70K
I36K
L76I
L24S
L10V
D67C
A74K
E67F
V64M
K57S
K37D
T31S
T91A
Q63L
A63S
N63P
L38I
D29V
H63T
N37Q
S39P
T72R
T72S
L93I
G48W
G73S
R20T
Q39P
Q18L
K55N
K55I
S63P
L10C
S67D
F53L
R8L
I85M
I62M
I85V
I77V
K57R
E67L
C63P
L64M
V10P
I20M
D29E
D65E
T80I
N41T
N41I
T80S
P19V
I12K
Y69Q
H37I
I10R
H37C
M36K
I64M
T19I
T19V
L15M
T31A
T72V
A63T
Q19L
I3F
E37D
R12Q
N83K
G16R
R12T
R12P
S67W
S67F
S67L
Y67D
Y67W
T63P
S37D
H69Q
A16E
T37Q
K41T
I12Q
L72I
S39TK41I
I12S
I12T
G48E
R72V
R69Q
P79A
Y67F
Y67L
M89V
L93V
H37Q
H37D
G51A
G73A
Q19I
R92H
G73T
E58R
Q63S
L10P
I12P
K69Q
D61E
E72R
E72S
Q19V
N61E
A71V
K20R
I3V
T74S
L24F
N83S
E67C
N88D
L6R
I63P
L6C
Y67C
A37D
T74P
Q63T
E58Q
V32L
L2Q
E63P
T37D
L2S
N37D
E60N
D25E
V71I
E72V
I36V
Q39T
Q63P
M36V
L72R
L72S
L72V
L89V
D60N
N69Q
H63P
Q37D
R20M
A63P
G48A
A74S
T20M
D25N
I36L
T71V
K98N
K20T
L33F
Q61E
V33F
V10R
A71I
I47L
A74P
L90M
M36L
I37Q
G16E
I37D
I98N
I37C
S67C
L10R
P10R
I82T
N7Q
A82M
V36L
T36L
K20M
I33F T10F
L24I
V82I
L76V
T71I
Q92H
H2P
H2L
H2S
N30D
I10Y
R63P
I82C
R7Q
I10F
T8LT69Q
V10Y
M46V
L10Y
S73A
S73T
I89V
L10F
P10Y
P10FV10F
I82L A71L
E92H
A7Q
V71L
F66V
N85V
N85M
R61E
F66N
F66M
F66L
F24I
K74S
K74P
M46L
I82D
I82G
T71L
H2Q I47V
V20R
VK72V
20T
V20M
I82A
S88T
I84V
I82M G48V
V82T
K72R
K72S
K92H V22A
V82C
I54L
S88D
I71L
I50V
V82L
C73S
V82D
V82G
L54F
L54S
T88D
I32L
V82M
I82F
I82S V82A
I54F
L46IL54M
C73A
C73T
V46I
M46I
L54A
L54T
I54S
A82F
A82S
L54V
Q92E
L19Q
L19V
I85L
P63H
H69T
H69R
L63C
I47V
K70Q
L89I
Q69K
I93L
N88T
L19P
P12I
G94S
I20T
Q19P
V11S
N69H
V63C
N37E
C37D
S37Y
P12E
V64M
N69R
N69T
S37I
Q63T
I62V
D67W
T37N
T37E
T37C
I72L
E34G
D65Q
V15M
P63T
D30N
I72V
N88K
S63Q
S63P
K92Q
K92E
L63S
V64IT37Y
N69Y
H69Y
L97F
Q61N
S63H
A22S
K57S
M20R
R8D
H61R
I85M
I15M
Q18T
R57S
T37I
V13M
V15L
A22V
I13M
N83K
L63Q
V19A
V19S
Q69E
G16R
V33F
Q39P
S63T
L63P
S37D
A16G
E12N
L63H
H63T
N37Y
V36I
R8T
S39P
R43T
S74A
L18T
K69E
G16E
N37I
N83D
A12N
V63S
T12N
I13R
Q19A
K41N
A63L
A63V
A71T
E35G
F10V
R70K
M64I
I15L
C67Y
L19A
R14Q
R14E
L38F
E37Y
I63Q
I63P
I63H
I63T
I63S
K12N
L19S
L63T
G51A
I50V
I85V
M89I
E34Q
V19I
E37I
N61H
R70Q
R43K
T72L
R14K
P12N
A63C
E34A
C63Q
C63S
C63P
C63H
V11C
K45N
I10F
T74A
V63Q
Q19S
V63P
V63H
V11F
S12N
I10M
V13R
Y67W
D67F
D67L
L10R
M93I
M93L
E21G
C63T
E34D
E34I
T72V
N88D
N61E
V63T
L24I
L10T D65E
R69Y
Y67F
Y67L
L64M
L19I
T37D
R63Q
R63P
R63H
R63T
N61R
L24F
Q61H
I10V
Q69H
T43K
L46I
Q19I
S67W
T4I
N88S
I12N
L64I
D88S
R8P
R8L
E72L
E72V
Q61E
D60E
Q69R
Q69T
C67W
A63S
Q61R
I11C
I11F
L33F
A63Q
Q58E
L24S
A63P
C67F
N41T
M36A
A63H
N37D
P19A
P19S
L36A
A16R
A16E
I41R
C67L
K69H
S67F
E7Q
E7H
D35G
V32I
K20L
G48M
A63T
L10M
K69T
L10F
K69R
S67L
M46V
T39P
K41T
K98N
V89I
I37D
R6W
R6S
G73C
V71I
I75V
E37D
N35G
L10VM36V
T69Y
L36V
L36I
A11C
A11F
M36I
K35G
Q69Y
N7Q
R19A
R
19I
I24F
R19S
I24S
N7H
G48V
F53Y
K65E
V82I
N14Q
N14E
N14K
K69Y
K34Q
I33F
P19I N41R
D8P
D8L
K20R
K41R
K45R
N60E
T71D
T71L
A71G
A71D
A71L
T14Q
T14E
T14K
D12N
F53L
R92L
V20I
V20TT71G
N70Q
N43K
R92Q
R92H
R92E
V82D
V82G
V82F
R20I
M20I
M20T
V82M
N85M
K34A
N85V
I43T
I43K M62I
R20T
H2Q
V93I
V93L
G73T
M46L
F95L I82G
A71V
K20I
I82D
I82F
V82S
K20T
V46L
V46I T71VT71I
I82M
A71I
K34D
L54M
K34I
C6W
C6S M62V
M46I
L90M
K74A
I82S
F82A
I54A
I54L
V82A
S11C
S11FF95C
G73S
I54M
Q35G
L54V
I82A
L82A
0.2
0.3
change in log(rf)
0.4
−log(prob)
3
N37C
L63V
0.1
0.5
change in log(rf)
Q18K
0.0
0.30
change in log(rf)
1
3
2
−log(prob)
1
I54T
I54A
I84V I54LI54M
T82F
L76V
V48M
0
4
3
0
0.0
LPV
0.4
0.15
E44D
T39D
T35K
I179F
T48F
D121A
C162T
T142L
L120I
H121Y
K39R
V35A
M184L
K73N
K211E
M178F
T39L
I47L
A207S
R104N
G211R
I5T
A98S
T200R
K174T
I35E
A207E
L193I
A200K
V178I
L210E
L210G
K39N
P4S
K211S
I142T
K32T
Q197N
D40A
R83E
A173P
C162F
V142L
K122D
K102Q
K11A
G196E
D169H
D169Y
C162K
V202T
G123E
I60V
P97S
K70Q
T215N
L74M
K102H
I5M G123Q
E219Q
I142V
D207N
R78S
R135M
E200M
E200I
G211Q
E169R
H174N
D177E
V50L
K174Q
Q197L
E207L
K101R
K83S
Y171I
Y171F
A114V
E44K
T39Q
E89K
T35L
N207H
E207G
Q197P
T35M
T211D
T211W
T211I
T215H
A138D
K11N
I5N
E207R
I47M
A173M
L195T
N43Q
N43T
L100F
T7S
V202K
W210L
E204A
S211D
K46Q
M135L
R104Q
R166T
R166I
K173E
I50V
Q122N
Q122I
A138Q
A173K
D207H
K135R
K135M
V135I
R43M
E32K
E32T
Q219R
T39I
D192T
G177Q
A162S
G177N
G177K
I142L
M35E
K139Q
R174K
Q91L
K139R
P9Q
K139E
K139I
R206K
R49Q
A207L
A207R
A207G
K32I
G15V
S69I
V200Q
T165P
R166Q
S173P
V200T
V200R
G123T
E29Q
R20I
N175Y
D121H
Q101T
Q101I
I184V
E39Q
D69I
E39D
E39L
E39I
D121E
K122P
C162V
T200L
S162N
I50L
N39Q
Q207K
N39D
N39L
Y127F
T11R
D162Y
R104T
S200Q
S200M
S200I
S200T
K35E
K35V
K35A
K35S
L195M
L195R
L195K
L195I
I47N
G207P
D185N
T142F
T142R
T142K
E200Q
D121Y
E200T
E79K
E211D
N67G
T165L
K49Q
Y162R
N174Q
N174G
N174K
N174T
S68R
Q211A
S68N
R174T
S39K
S39R
S39N
R139I
T173R
L35S
S211W
S211I
S68T
D44K
L41M
T215I
R201M
R172K
M164L
K39D
K39L
T27S
E197Q
N207K
N207AG123D
E35S
R211Q
K211D
E204N
M35S
N211T
N211K
N211E
I167F
N68T
N68Y
T211G
T211R
T215S
F210W
F210L
P176Q
E200R
Y162W
S69E
S68Y
N123S
T35E
K32Q
W210E
Q207A
W210G
G18A
V184W
A162N
V60R
E174K
V202I
K43M
K220E
R11A
A138V
K207A
K211I
N104Q
E36D
H207K
H207A
N104T
I35S
L197R
L197T
S215E
S215C
K173N
S173M
S173K
M135T
Q197T
E39P
E39A
Q197R
E203K
T166K
K211W
E218Y
V142F
T48Q
T165I
T173A
T173S
E169D
E6A
G93R
I200Q
D204A
E200L
N32R
K173Q
K197E
V184T
R43N
E32I
H174G
E40K
V184M
K203V
K203A
K207S
K39Q
K139V
K20I
R32Q
V142R
V142K
Q207S
R32K
R32T
R32I
E79D
A36G
A173E
R135L
Q207E
K83E
V184L
S69P
V200L
K49E
E122Q
K73T
T39P
E122N
D169G
V189I
E173Q
E173N
K197Q
R199E
N207E
D69E
K197N
N207S
C162A
G211A
K39I
E174Q
S48T
E174T
I200T
R70K
N39I
N39P
N39A
K43N
K207E
T216S
K211G
R49E
I181V
I181F
T11A
N162G
L135T
H207E
H207S
I37F
E219R
D162R
D162W
R174Q
T215E
A138T
E122I
S200R
A138G
A138S
S211G
E123T
E43K
T7N
K28D
I200R
V60Y
S173E
K138Q
K82N
K138D
K138V
M139P
T142S
T39A
D204N
D204K
K211R
M35V
M35A
S211R
D69P
R39Q
K197P
K197L
I60R
R173P
R173M
A139V
R173K
R39I
R39P
R39A
E169H
E169Y
I2V
S48F
E197N
I35V
R11N
T35S
K207L
K207R
N219G
I135A
I135P
H174K
H174T
K207G
E6Q
A28D
Q151K
E194K
T11N
Y162T
Y162F
L187F
Y162C
K6N
K6T
P52H
T215C
E204K
I35A
I142F
K43Q
K13I
E203V
S107P
S107T
L35V
L35A
C38F
V178M
V178F
H197N
H197P
H197L
H197R
H197T
N177Q
R122P
Q207L
I142R
I142K
E53K
N177K
D169Q
R201K
K135L
R43Q
E32Q
F215L
F215A
M16V
R199K
R43T
V159L
T139V
Y162K
A207P
E203A
D6K
T7I
G15R
E197P
E197L
V142S
K49T
N207L
N207R
N207P
L210S
N207G
Q207G
I178M
E40S
F215Y
S3R
W24C
S162G
G18V
T162F
T162K
T162V
T69N
N211S
N211D
N211W
N211I
N211G
N211R
R173E
N68G
R70Q
E6S
A36K
E102Q
Q207R
A204N
E102K
E102H
F210E
F210G
T173P
R49T
Y215N
E44V
K197R
K197T
V200A
V200K
Y208H
G123N
H175Y
H174Q
G70K
L74I
A173Q
V135A
V135P
A173N
R196Q
K39P
I31M
V180I
I178F
T70Q
L215N
L215H
D40F
T70R
L215I
L215E
I166K
T70K
S88G
W88C
L215S
L215C
I200L
I31V
S211Q
I60Y
T48E
T173M
T173K
E29G
T35V
Q43T
A162G
K39A
C162S
I135N
N162H
H207L
H207R
N104K
T35A
H207G
E207P
Y215H
D211W
D211I
I135R
K73M
K211Q
E194Q
M39P
H64R
E123D
L80T
K203R
E218T
F162A
W88S
R211A
E197R
E197T
D123N
N69A
K28G
K43T
G123S
D207K
E6D
D169K
A69I
A69E
A69P
D6N
E173I
E53G
T173E
N67E
K173I
A139P
V215L
V215A
V215Y
L173R
L173A
I167M
I142S
G67E
L210Y
S173Q
E211W
E211I
E211G
A36E
S173N
M16R
M16I
K82R
Y162V
I135M
N147K
A138E
Q145K
Q145E
M200A
M200K
K102N
R173Q
R173N
K219E
K139P
V135N
D162T
D162F
D162K
R166K
D162C
D215N
T195M
T195R
T195K
T195I
V203R
V203N
E169G
Q219D
E43M
V203G
Y146F
E40A
Y146C
T173N
V135R
G141R
A173I
K102M
I180F
K219Q
E200A
E218H
R143S
T173Q
Y215I
E35V
E35A
E44G
T200A
V111I
L173P
L173S
M178L
G177D
Q204A
N211Q
Y215S
T211Q
Q122K
Q122D
E43N
D6T
C162N
Q219N
A36D
C38G
D207A
D207E
D207L
H207P
D207S
P217Q
F210S
E28D
A158S
P55Q
P55L
R104K
S69G
E43Q
N69D
V21G
S139V
T200K
N69S
I132L
T139P
V135M
L210M
I135L
D44V
K203N
L214F
H208F
R35M
R35L
R35E
E169Q
I109M
I109V
R35V
R35A
I109L
S162H
Y162A
R35S
I47F
E203R
G177E
R143G
L173M
E211R
T142M
A203Q
I118V
N69I
D123S
A203D
A162H
S11A
S11N
K203G
I2L
K219R
Q102N
R207P
M39A
Q145C
K
211A
W210S
H211Q
E70Q
N215H
V167F
I69E
N215I
I69P
V167M
K207P
E70R
E70K
N215S
L142F
L142R
L142K
E70S
S39D
S39L
Y215E
V75T
G134N
G134T
G134I
E89G
S48Q
F115N
R139V
C88G
K102R
E122K
I179M
L80I
S191Y
D215H
G18C
D207R
E44A
E219D
E219N
E219H
D207G
R22K
E203N
V135L
E219G
R135T
E200K
Q219G
S191A
S191P
S191T
H208L
D110N
S176Q
N147S
K123T
K123D
K123N
L173K
V180F
S191C
K123S
A7N
D44G
S211A
Y215C
F87S
Q207P
Q102M
N64Q
V179I
E122D
S200L
T173I
F215N
E6K
N219H
L149I
Q101H
T162A
V132L
T162N
T162S
V179F
N211A
V106G
V21I
I200A
E43T
N177D
F215H
Q23H
Y146S
T211A
V142M
E203G
A204K
E102N
T207K
E102M
T207A
T207E
T207L
T207R
T207S
T207G
S68G
I31K
T69A
K13V
D215I
Y162S
D215S
D69G
E169K
L173E
W210Y
Q49E
Q49T
G70Q
I179S
Q219H
C162G
R99W
K36E
R28G
D162V
A28G
E6N
N177E
N32K
N32T
R173I
G67D
V8I
E40F
E28G
N67D
I200K
L74V
K13N
V178L
T69S
I142M
Q204N
S39Q
T69D
P4H
W162H
S39I
S39P
S39A
L149F
W162G
N69E
E122P
L173Q
D44A
L173N
F215I
D211G
V108A
N69P
S200A
S200K
W88G
F215S
E123N
R169Q
R169D
R169H
R169Y
F162N
T75A
T75L
F144Y
L142S
R169G
F162S
E45G
L142M
T69I
Q122P
Q91E
K138T
K138G
K138S
W210M
I159L
N32I
E6T
V135T
A69G
G155R
A98G
S98G
V215N
V215H
D162A
E211Q
I184W
S48E
E211A
I135T
P4T
D196Q
S173I
D207P
D196E
D196G
K64R
Y162N
Y146D
I178L
D86E
S3G
E53D
T69E
F215E
R219D
Q151M
V203Q
T69P
V203D
I214L
L4H
L4T
V148M
N32Q
R65K
K219D
K203D
V148L
F215C
I74V
Q102R
F208L
E218D
D162S
L173I
C162H
E123S
R219N
A190V
D215E
R162V
R162A
A196E
G138E
D215C
A196G
R162S
S4H
S4T
V62T
K219N
V179M
M164K
E102R
F210Y
F210M
T207P
S139P
V215I
V215S
R139P
E203D
G203Q
Q200A
Q200K
Q177E
I184M
I184T
I184L
K138E
H101E
Y57K
Y57N
K40S
H4T
G203D
N47F
Q204K
K219G
P119T
V179S
F87L
S207P
I179A
I179G
Y162G
D211Q
E203Q
D211R
D211A
K203Q
P48Q
V16R
V16I
N215E
A157P
P58T
Y208F
P58S
N215C
H211A
I122P
A117T
P27T
P27I
C163S
G134S
P27S
A117S
F115Y
K219H
I31L
D162N
N54I
R219H
R219G
P119S
R196G
V215E
V215C
A7I
E179L
I181C
R125G
Y113D
I139V
A123N
I139P
T162H
T68G
A123S
T162G
A14P
K135T
N69G
T69G
N147H
V62P
V62A
R103H
K36D
S58T
R196E
V179A
V179G
Y208L
N64R
M75I
A190Q
F34IF34L
Y116S
E152G
L77F
V108I
V106L
V148G
P59A
Y188C
R169K
T75M
I179E
Y162H
N54K
Y181V
Q142I
Q142T
Q142V
Q142L
Q142F
Q142R
Q142K
Q142S
Q142M
A189I
I164K
N73T
N73M
G172R
G212W
G172K
I106A
V75A
P119L
I69G
D162G
P59Q
Y181F
K101G
A107P
A107T
P35V
P35A
N220R
E30R
I10L
I10V
E104K
N220E
T101E
R162N
TR101G
101P
P35S
I10G
F162G
R45G
R162G
V179E
Y116L
Y116F
V181C
D162HF100I
K40A
K40F
N22K
T158S
P48E
A179L
F162H
K101N
E69G
K68G
K125G
V75L
V106M
K101Q
R103T
I179L
R101N
R101Q
A75I
K101I
K101T
V179L
I214F
R101T
R101I
R99G
V75M
V106I
R162H
A190C
E101P
Y181C
R101H
K101H
L75M
L75I
H101P
Y188N
N65K
R3G
Q101E
R103E
Y188F
T75I
N166K
E179D
A190T
M106A
E30K
R152G
Y188H
Y188S
R103I
V106A
K103R
C188N
C188F
C188H
C188S I179D
V75I
V179D
L100I
K103H
Q101P
K101E
A190E A179D
K103T
R101E
K103E
K103I
K101P
G190R
R190A A190S R103S R101P
R190Q
R190V
R190T
R190E
R190C
R190S
change in log(rf)
N37T
0.3
1.2
S37Y
G94C
I72M
E16W
S37C
S12P
V77I
V32G
T72E
Q61K
R8V
L63R
D35G
R43K
L10V
D35E
A63L
A63F
G94R
H63V
H63K
E34D
S63V
K70E
R57K
E16D
I93M
I72T
T37S
T19Q
I19Q
T19L
Q92K
C63A
V33M
K14T
I72E
S37N
I15M
R70T
C95L
I19L
I13V
K69E
V13L
K37Y
K37S
K37C
I12A
R69Y
D67F
D67V
S63K
S63C
H69P
H69L
T4Q
I62V
V15I
L90V
T37Y
T37C
G68R
A63R
K69T
M72T
M72E
I36A
V33I
H37V
H37S
R14E
L64R
Q63A
A12E
G68D
P39L
G48Q
F53L
N12Q
V63A
V63L
E37T
L19R
E12R
E12T
H63A
H63L
H63F
H63R
H63C
T91S
A12R
A12T
C63L
N69K
L10S
S63A
S63L
Q63L
Q63F
P19A
L63T
E37H
R14Q
L15S
S63F
T37N
Q7A
Q19R
D65Q
D65K
V15M
I64M
T80K
E67F
E67V
T12P
E37V
L89F
R8T
G40V
G40E
S12N
K41R
Q63R
V63F
E17G
K20R
I82V
K14R
H61N
H61D
L10H
C37Q
C37D
T19R
I13L
A37V
A37S
M20I
V10S
D65H
V3L
R20Q
G73A
C63F
C63R
L10P
K43E
V56L
K55I
L63P
H37Y
H37NH37Q
H37C
E18H
P19Q
P19L
S63R
I19R
W6S
V77L
L24S
S37Q
V10H
M89L
C67W
I93F
I3F
E61N
V63R
V72S
V10P
T20I
P39T
R43E
I84L
N69E
N69T
E37Y
R70K
E37S
E12P
Q12K
T96N
T96I
T19V
Q39L
K37N
Y37Q
I12E
I12R
I12T
E61D
Y37D
A74S
R70E
A12P
Q61N
V12Q
E37C
D17G
V12K
Q61D
V33LH63T
T63P
M64V
N37Q
I3N
Q63T
I77L
A37Y
A37N
A37C
P9L
S12Q
P12N
I64V
W67Y
K20Q
I33L
L15I
I36R
I36T
K69P
K69L
S67D
S67E
K14E
N41Q
I89S
N41I
L90F
A63T
I63V
I63K
I63A
I63L
I63R
I63T
I63CI63F
H69R
T74E
Q35N
Q35D
Q35G
Q35E
I3L
L6S
E67R
E67G
E67C
T26S
A71M
M89F
C67L
L19V
H63P
E16G
K14Q
I15E
S63T
G6W
P39S
L64M
L10M
T37Q
D25N
E21G
S37D
I66M
C67Y
Q39T
Q39S
N35E
N35G
V72K
Q69Y
L76I
E37N
T12N
L89I
L64V
L76S
D60E
N12K
H69Q
V15E
Q18I
N69P
N69L
I89V
Q58R
N41T
Q58H
Q63P
P19R
T96S
I12P
K43I
T96Y
G48I
T4P
L15M
I66F
V71D
L10R
H18L
M93L
A63P
I19V
S67F
K69R
K55R
A12N
I36K
V10M
N37D
W6A
T96P
Q18K
I93L
G63R
G63T
Q19V
N69R
A74P
H37D
S12K
D67R
D67G
D67C
S63P
Q92H
K69Q
C63T
C63P
I72R
Q18E
S67V
T36L
K65E
Q2H
H69Y
A71D
E67W
M89I
P19V
N69Q
G48R
M36I
L89S
F33L
T80S
R72V
K37Q
K37D
R43I
T72R
I12N
T12Q
D65E
E21V
D67W
E37Q
E12N
P12Q
I36V
N98I
Q2I
M36A
I63P
N63C
A71G
T37D
G16A
N98R
N98S
S74P
N69Y
K69Y
V72L
A37Q
E63V
E63KE63L
E63A
E63F
E63R
E63C
V10R
I66N
L10T
V71G
I12Q
T71A
K43T
I72V
Q2P
Q2T
M89S
A12Q
S67G
S67R
S67C
Q18H
L89V
K20S
K70Q
G6S
K20A
I82C
R70Q
L24M
E67L
E67Y
E12Q
A37D
R20S
S4Q
N60D
H67W
H67L
H67Y
T1L
S4TT4I
S4L
S79P
M89V
I12K
D25E
K20T
R20A
A12K
M46V
R20T
W6P
I36L
E37D
V36L
T12K
E18L
T72V
T70Q
A22V
G67L
G67Y
I72S
R41Q
V82C
R43T
E16A
Q58E
D63L
D63F
D63R
D63T
L15E
R41I
E72R
E72V
P12K
M36R
M36T
G49R
G51A
V63T
T69P
T69L
N63A
N63L
V93F
T69R
V93L
E21D
D67L
D67Y
R20M
Q18L
I72K
E12K
K20M
K92H
E21Q
E72S
T74R
A11R
A11C
A11I
L1P
G63P
A11V
V10T
W6C
A71L
A71I
G6A
F10M
S67W
W6R
Q92L
N7Q
R41T
N7K
N7D
N7L
N7A
K41Q
D29G
T74K
K41I
L10Y
G48W
S4P
I11V
Q7E
M36K
T72S
I72L
R72S
R72K
R72L
K20I
E72K
V63P
T74A
D25Y
L10I
T10Y
T10I
N63F
N63R
N63T
T74S
M72R
I54L
G48M
V32I
M36V
R20I
L6A
L6P
L6R
K41T
L6C
I50V
T14Q
D8R
D8V
D8T
G34A
G34T
T14R
G34K
G34V
G34E
T14E
G34D
A91S
T72K
G6P
G6C
N98K
G6R
K45R
G48L
G73C
D88S
E92R
G48A
E72L
F38L
L54F
N60E
S4I
F10R
M46K
N63P
M72V
G48E
N98H
S67L
S67Y
T71M
M36L
N88K
T72L
N70Q I82L
L82M
M72S
H7E
R7E
M62I
M62V
N98D
N55I
N55R
D63P
T74P
V10Y
V10I
D21Q
T69Q
H12Q
M72K
T69Y
I37D
H12P
H12N
S31T
H12K
Q92E
V71L
D29E
E63T
E63P
M72L
V71I
I85L
T71D
D29A
N43K
I23V
V46K
V46L
I45R
N7E
F24I
V82L
N30V
T1P
K92L
T71G
L24I
F10T
I85M
Q8R
Q8V
Q8T
H59Y
S10H
S10P
S11R
S11C
S10M
S11I
S10R
S10T
S10Y
S11V
S10I
F99L
D29V
N88H
N88Y
D29N
K92E
L76V
N88T
N88I
E21K
N88D
Q92R
T71L
I82M
I54F
T71I
L82D
L82A
L82G
M46L
V47IL11V
I85V
V20R
I82D
I82G
D21K
F10Y
F10I
V82M
N88S
V20Q
I84V
V20A
V20T
G73T
F66N
N43E
NV20S
43I
I23L
V46I
L90M
L54V
V82D
V82G
I82A
N30D
L62I
L62V K92R
G73S
V20M
V20I
V82A
H92L A7E
G48V
N43T
M46I
L46I H92R
H92E
I54V
L54M
L54S
I54M
I54SV82F I82F
A82F
L82F
A82S
L54A
L54T V82S
L82S
0.0
K70N
K20T
K20Q
E34K
E34G
N88D
L10R
L63S
A12N
K20M
Q63T
M64I
S37H
R14I
I64V
P12K
R8D
L10P
Q19V
L93V
R14T
V63Q
V63H
V63R
V63N
V63D
V63K
V63M
V63S
T19R
K57R
C37S
C37H
C37N
C37T
T72R
L64M
P79T
P79S
I62V
E34D
S12A
S12N
P19I
I36M
G17E
E37D
M64V
S67W
K12Q
K37N
K37T
KM72F
37Y
R69Q
Q35G
H63S
P39T
A37H
A37S
Q19T
Q19P
Q19R
D25E
T20M
I15V
C67L
R14E
P12Q
L63Q
N12E
R70Q
R70K
R70N
I72L
I20R
M89L
H69L
E21G
N69E
C63H
C63R
I93V
K69H
T12P
L10S
A71I
R43K
S63Q
K20V
L24S
W6G
L10H
S12E
T37Y
Q92H
E34I
D35K
I13M
L63T
S37N
M20V
K37E
T71M
L64I
L90V
K43E
S37T
H63QV63T
E34Q
W6R
W6S
C67S
A12E
I72V
L64V
C63N
C63D
C63K
C63M
T12K
S12T
F99L
N41T
E72T
N41I
E72R
E72I
N37Y
V10C
G68D
V33I
S63T
K69L
K55I
H37Y
Q2P
A12T
Q63P
D65E
A13L
A13V
A13R
T37E
E67L
N12T
Q39S
E16A
N69R
E12P
E12K
Y37E
Y37D
C37Y
C37E
T72I
N69Y
I12E
K41N
C63S
V10L
T12Q
V12Q
Q58K
I75V
T63P
V12E
V12T
V12P
V12K
C67F
H63T
L63P
A63Q
L33M
C95I
K92H
E67S
T37D
V19T
V19P
M72T
M72R
R8P
L19V
K41R
S37Y
V10R
V11I
Q19I
V63P
Y67L
A63T
V82I
T70N
T91A
V11F
P9H
D25Y
F53L
C67W
C95L
C37D
T19I
E70Q
E70K
E70N
H63P
A12P
V82S
S12P
I63P
H69E
V19R
V72K
G48E
N37E
V56L
R14K
L38M
K37D
L19T
I12T
I12P
I12K
T4I
V33L
G48R
R43E
K45N
I20Q
N60E
I20K
I20T
S39P
K69E
V15K
L19P
S37E
E61H
E61K
V10P
T96P
G86E
L38W
A12K
Q58R
P79A
K45I
E7Q
I72K
E67F
V13R
I33L
I82D
I82G
N12P
K45T
A12Q
E72L
G16A
S37D
L93F
Q61D
I20M
H69R
V33M
S63P
Q2T
N37D
S74T
L19R
V19I
D67E
R72L
Y69Q
R41T
R41I
K69R
D60E
R20Q
G48W
Y67S
R20K
R20T
E35D
S39T
I93F
V10S
N12Q
S12K
R12E
R12T
N12K
K69Y
Q61R
T72L
T72V
A37N
A37T
H69Y
A63P
R20M
I13L
Q37E
Q2D
I82A
N41S
E72V
E67W
K35Q
I12Q
L10M
D35Q
T20V
I20V
M36K
V10H
I15K
R8T
K43N
I13V
K65E
E61Q
A37Y
N98H
T91S
C63Q
S12Q
L39P
L39T
E12Q
E35K
L6P
L6W
E72K
M36V
I3F
N98S
H61Q
D61N
R41S
G73C
V15E
T4P
L10T
T72K
T71I
A37E
A37D
E63Q
N69Q
L19I
V71T
I36K
K41T
K41I
R20V
I85M
K18Q
R43N
K69Q
Q2L
I3V
N98D
Y67F
N83S
D25N
Q37D
H37E
I36V
K45R
C63T
D35N
N83D
T1P
M46K
Q39P
Q39T
M36A
D29N
K41SM89S
C95S
Y67W
V46L
V77L
L10I
Q61N
N35G
R72V
Q14N
Q14I
Q14T
Q14E
Q14K
V36A
D63T
D63P
L6G
R12Q
M10F
R12P
R12K
H69Q
Q92L
I82L
G73T
Q65E
D67L
I15E
H61D
R87K
M36L
H37D
M89I
Q58H
I36A
L24I
K34Q
K34I
I13R
V10M
G51A
I84L
L38F
A74P
K35N
E35Q
K35G
H18Q
I33M
P10H
V36L
P10M
P10T
P10I
P10S
L1P
H61R
I36L
V71M
N83K
N7Q
I37N
I37T
I91A
K61D
K61R
I41S
I91S
M72I
A71L
V15L
K43I
L6R
L6S
V10T
D83K
L38I
M72L
R55I
E35N
I89V
C63P
R72K
D12Q
L66V
T9H
T10F
M15L
E63T
D12K
I14K
E61D
I77L
E61R
V32A
R43I
D67F
D67S
V82D
L72K
V82G
I47K
N30D
V10I
I10F
F97L
I15L
D29A
D29G
M89V
L89S
M93I
M93V
H61N
G34Q
R10H
G34D
G34I
R10S
Q92R
D67W
A74S
A74T
V46I
K21G
M72V
V82A
R7Q
R19I
R16A
N3F
N3V
L10F
Q92E
F66L
F66V
S73G
I47L
V71I
D29E
L90M
H12Q
K7Q
D35G
E63P
H12T
K7T
K7E
H12P
K7D
H12K
L11F
L11I
M72K
E61N
D29V
L89I
M93F
P10F
V82L
A82L
I85V
L20R
I37Y
I37E
I37D
K61N
L46I
Q58E
K43T
E35G
K92L
V10F
R43T
L62I
L62V
T71L
I43T
T82V
K92R
I23L
K92E
L89V
L13R
V32I
V54S
M46L
S88H
S88Y
V22A
I82F
L66I
V14K
S88T
S88N
V48I
S88D
L76I
V48Q
L20Q
L33F
L20K
L20T
L20M
L20V
V71L
V33F
V54FR10M
I54V
M46I
V66I
E40G
I33F
T82I
T82S
I54S
V48G
R10T
R10IF66I
R10F
I45R V48A
I71L
G48M
I54F
S73C
V82F
I47V
L82F
S73T
V48E
V48R
T82D
I50M
H92LH92R
V48W T82G
0.2
Y188LK103N
1.0
T91I
Q18R
T4L
APV
H92E
V54T
A82F T82A
T82L
V54A
V54LV54M
0.8
0.0
3.0
2.5
2.0
1.5
−log(prob)
1.0
0.5
0.0
3.5
3.0
2.5
2.0
1.5
−log(prob)
1.0
0.5
0.0
3.5
0.6
Q2S
0.1
0.10
IDV
I19R
I13L
N69T
K14E
L89M
I93M
K45T
R14E
P63N
T4I
E35K
A12M
A22S
A22V
E16W
N69P
N69L
N61D
Q18H
Q63R
Q63C
S37I
M64I
H63T
N83K
L10C
L23I
K41N
V10Y
L76S
N37K
F53Y
D37V
I77V
R8T
L19I
C37Y
T72E
T72R
I10T
L19R
Y69R
D60A
L63R R69P
P39T
E35D
S63R
V15L
A82I
A12K
I36L
V72K
R69L
V82L
R45N
R45I
I12K
E68D
D37T
I93L
L63C
R20I
T37Q
E72F
L63D
M72K
Q7E
E37Y
L46I
I89S
Q63D
R8A
S37N
R70N
S63C
N69H
I72L
A63L
A63I
C37D
C37V
Q18L
S63D
H63P
H63N
N98I
P19M
L5F
W6L
V33F
V82T
V72I
L5I
L63H
E16D
T12A
Y67R
Y67G
D37Q
R70Q
Q58K
G17R
S67G
S67R
A63R
S37K
I15S
L38W
K57I
D67R
D67G
E67W
E16A
T12I
H61K
L15M
A63C
S39P
I12N
I12S
I10V
Q63H
M20N
I82D
I82G
G17A
C37T
I3N
I36K
I66L
G16D
P19A
L19V
T12M
Y69K
Y69E
Y69T
K43N
I20N
Q19L
I13V
A63D
V15M
S63H
T37H
S39T
K69E
T63P
T63N
W6A
V63F
M93L
I10Y
G16A
R41S
I19V
L36K
N12R
I64V
K43E
V72L
L63T
L64I
K12N
K12S
R69H
E37D
T70R
T70E
T70N
T19F
Q61H
Q92K
Y37T
P19Q
I11F
C63T
C63P
C63N
V12S
Y69P
Y69L
E67D
S37Y
W67F
L10S
D65Q
E37V
E37T
M72I
V56L
D60H
R70K
S63T
T12K
E37Q
S67L
R57S
L63P
C37Q
M64V
L63N
L38F
M89V
D37H
L19T
P12N
P12S
D67L
K37Y
I20L
E67S
T72F
V10F
P39L
H61N
H61D
S63P
Y69H
V82N
L15E
R41T
R41I
Y67L
K20R
K41R
A63H
N37Y
V15E
E68G
E21K
Q19R
Q19I
P79T
P79S
Q39S
D60E
K69T
S
S37V
S37T
37D
A12N
S12R
C37H
V19T
S63N
K20I
I19T
C67F
I20A
I10F
S67C
E67R
E67G
K69P
K69L
E72M
S4T
S4I
S37Q
V62I
T72M
I15V
H18L
L89V
N41R
Q35G
E18Q
E18H
D25Y
Q61R
Q63T
A71T
E61K
R72F
M46L
R45K
R45T
V82A
I66M
A13V
L33V
L19F
V82I
I36A
A12S
L64V
I15L
V63E
V63A
S37H
N69Q
P19L
P19R
P19IY69Q
N37D
L36A
N12E
M46I
A63T
N37V
Q39P
D67C
K37D
K37V
P9L
R63H
R63T
R63P
R63N
L10I
T70Q
T70K
G17D
N37T
Y59H
Q2H
I19F
L10T
Q58R
I36V
Q63P
I85M
W6S
V12R
V12E
P12R
I85N
Q2P
H69Q
P79R
E12Q
V13R
I15M
A63P
N37Q
Q63N
V63L
Q2S
W6P
K12R
K41S
R20N
I85V
L10V
Y67C
T12N
A63N
L33F
D65E
A12R
S39L
I12R
I12E
L10Y
R43I
R8V
R43S
E18L
P9T
K92H
T12S
K20N
E37H
V82D
V82G
M36L
I63R
I63D
I63H
I63T
I63C
G17E
K37Q
K37T
E67L
L24F
V19F
K12E
E61N
D29G
R69Q
T96P
T96S
I15E
V63I
R43Q
S67F
M72L
I13R
G6L
G6A
C95W
I33L
N37H
M36K
F53S
R8D
K41T
K41I
S12E
P19V
P19T
P19F
Q39T
M89I
Y59D
V10M
S79R
T1L
T1P
K69H
W6C
A82D
I50V
A82G
L10F
Q19V
P12E
R8G
M89S
Y37Q
D12N
D12S
D12R
D12E
S74E
Q61K
N35K
N35D
D67F
T12R
I10M
Q34K
K43R
P79L
R72M
E72V
T72V
Q92H
K20L
L36V
R16E
R16W
M19L
M19R
M19I
R16D
R16A
M19V
K58E
M19T
M19F
F10M
P79D
A12E
L89I
M20L
E67C
L89S
Q19T
Q61N
Q65E
R12Q
I37Q
D63T
D63P
D63N
V93F
I37Y
I37D
V93M
I37V
I37T
V
V93L
I37H
93I
S19V
S19T
S19F
T72K
I63P
I63N
E72K
T71D
K57S
M36A
Y67F
D34T
L6P
G63T
G63P
G63N
L6S
L10M
N98D
V63R
V63C
R20L
Q19F
K35G
T12E
G6S
G6P
G6C
Y37H
E72I
I11V
K37H
E61D
Q61D
T72I
T39L
K65E
S88K
V63D
A71D
A74K
A74R
Q39L
M36V
K43S
E72L
N7Q
L82N
L82A
L82I
L82D
S74A
L82G
N7A
N7K
N7R
N7E
K43I
D60N
G49R
G51A
K20A
K43Q
T72L
N12Q
R20A
I33V
I33F
E35G
N88S
D29N
V12Q
F53I
L38I
I54F
E67F
T82N
N55K
E63L
E63I
K34T
I32A
E63R
E63D
D29E
E63C
M47L
D29H
K92L
Y59F
N41S
F99L
R43T
I54L
L24M
V54T
D35G
S79L
S79D
I32L
E60N
V63H
K69Q
A13R
R55Q
D29A
R55T
K12Q
I12Q
D12Q
G67F
Q92L
T80S
N35G
L24I
H7Q
D29V
I32G
L6C
H7E
A71G
T71G
R72V
R72K
R72I
M62L
M62V
V20T
V20K
V20R
V20I
V20N
V20L
V20A
V89I
M20A
S74K
N88K
Q2I
H12Q
I43T
F38I
K7E
H12E
T80I
T71V
Q58E
N41T
N41I
A71V
Q92R
A12Q
S12Q
T20R
T20I
F53V
V63T
C95I
C95L
K43T
T12Q
H2I
R72L
K92R
S74R
V63P
V63N
I98D
C95R
V89S
N43Q
Q8V
Q8D
Q8G
N43E
N43R
N43I
P12Q
D94G
N43S
V71I
A74P
M12Q
E63H
E63T
E63P
E63N
M47I
R55N
I84L
C95S
M62I
V71L
R55I
Q34D
Q34T
S74P
K34V
R55K C95F
T20N
L90V
K98D
T20L
Q2L
I54M
L90W
T82A
F66M
T82I
T82D
L54M
F97L
T82G
V46L
T74E
M54A
M54V
H2L
T20A
I82S
T74A
K34I
V46I
D34V
N43T
V47M
A71I
T74K
N30D
T74R
Q92E
M54T
L54A
L54V
L54T
K34E
T71I
F53L
A71L
S11V
Q34V
Q34I
Q34E
R92E
T71L
K92E
D34I
V76L
T74P
V76S
I54A
V47K
I54V
S73T
F82C
V47L
V47I
V82S
F82M
F82V
I54T
F82L
F82T
L90F
G48Q
F82N
G73C
L82S
F82A
F82I
H92E
F82D
F82G
D34EA82S
N88D
S88D G73S
T82S
L90M
G48R
G48T
G48E
G73T
G48I
G48M
G48L
G48A
F82S
I84V
0.4
0.05
change in log(rf)
R14K
0.2
1.0
3.5
0.2
0.0
3
2
−log(prob)
1
0
3.0
2.5
2.0
1.5
−log(prob)
1.0
0.5
0.0
K65R
0.25
change in log(rf)
2
0.00
EFV
K11R
T215Y
0.25
change in log(rf)
T165P
R70Q
S123R
S68G
D86H
Y215D
T165L
K11N
V148G
K39Q
K39N
A173I
N67H
K211S
M35A
Y121D
K174T
G211M
G211E
G211T
F171V
F171I
V60I
T142R
T142N
T142K
T142S
G196K
A138Q
K174G
K39L
L195V
E40F
E207K
F61L
K64Q
T39A
A162D
V148L
T173E
T173S
K83G
N175K
K207D
K207G
T11Q
R83E
K20I
T11A
Y215S
V142T
V142R
V142N
V142K
V142S
Y146F
Y146C
G67E
G207A
S162T
T7N
L193F
L173I
P52L
R20I
D6A
A162F
R49T
D6S
R174T
R174G
K70T
I132V
C162W
R206K
K32N
K64Y
F214V
R199K
Q197T
N207L
N175Y
E29K
I167L
D76E
S48E
S48P
D162F
P217Q
D215E
D215N
K73N
N57K
I37N
S48F
S48Q
G45E
R172I
E122A
K174N
K173M
K13I
T11K
T11R
T11N
R78K
F171L
H121N
Q102T
D69I
R72T
V142L
A162V
A162N
Q122P
C162K
N32I
D76R
S211R
K166N
K39P
K39T
E207G
F214S
C162A
M35E
S48L
A162S
A200Q
F215L
E174T
H121C
V184I
V200Q
E39Q
E207D
V200L
V200E
E39N
E39L
R70S
T215V
T200S
E169R
E79R
E32T
S105A
S105T
N104T
D86Y
H207Q
Q197L
H207N
D186E
S200Q
G211A
S200V
S200L
S200E
K39A
V178F
M16T
Q211K
I200V
H174L
L39T
L39A
I167F
A200L
N54D
L100V
N174Q
R172K
I135K
T7I
P119T
M35R
M35I
F115C
I2V
N67S
I179S
K43T
L173N
K6Q
D169Q
K6N
D113E
D36E
K173L
T142L
E79Q
I135A
I135P
R22N
R174N
Y215E
G196E
T216H
E169A
R135I
D169G
P140Q
N211Q
E218H
N211K
N211S
N207E
R173E
Y171F
Y171V
Y171I
Y171L
R173S
Q207L
G15V
E197P
G177N
A200E
A122T
K204E
K173A
L74M
E86N
T35S
A44V
A44G
A44N
E39P
E39T
K220E
T173M
N69E
F210W
V202L
L197R
L197H
R72K
A162T
K122V
E174G
E174N
Q151R
Q173M
Q43K
Q173L
Q173A
Q173I
I159Y
E35L
I142Q
I35L
R70K
Q102M
E28D
C162D
V180I
W88S
I179F
A173N
E203A
E28K
T173L
Q101E
E39A
S173M
K83E
K135M
H207L
H207E
H207K
C162F
N69K
T173A
D6Q
D6N
T35K
N204E
R101G
L142M
H174T
E6D
F87S
I165S
K102R
W88G
A211Q
I179M
K102N
V50L
S215E
Q207E
K32I
N32A
E89G
R135K
R135A
R135P
R207H
I47N
T215F
M41V
Q211S
E203Y
M35L
T35M
K65N
S3G
G211Q
E203G
T195M
Y215N
M184L
N207K
T35A
E203V
K32A
V200R
P55Q
I2L
N39T
N39A
I195M
L35V
Q161E
D192T
P122K
D207A
T165A
E169H
E169Y
N67A
T173I
I47F
F215A
A36Q
A36K
A36G
Q204G
Q204K
Q204N
Q204A
I50V
I37F
Q11N
H121E
K35E
K35R
K35I
K35L
K35V
R166K
N104R
I35V
G67D
A35E
A35R
A35I
A35L
A35V
E29Q
V135R
G177Q
Q173N
K197P
S173L
I200Q
H174G
D39T
D39A
A211K
A211S
K173I
D123G
V142M
Q207K
E211Q
E6A
E211A
E211K
E211S
E6S
D123S
T200M
A138D
E40D
Y208H
R70T
G211K
G211S
A162G
Y162C
V202K
Y146D
V184M
C162V
E122T
R43Q
M178F
D44V
T215L
S173A
K13N
R125G
W88C
R199E
E219N
V178L
S215N
S215C
S215I
R122Q
D123R
S163C
R122M
R122D
R122P
R122K
R122V
I90V
K126E
Q219S
I200L
T35E
K73T
E197Q
E197N
T207Q
A122Q
A138G
T207N
I31M
T207L
T207E
T207K
T207D
T207AA36D
A122I
A122M
A122D
T207G
A122P
T200I
Q197R
D169K
H207D
T35R
T35I
E86D
E86H
E86Y
D44G
H207G
I165T
I165P
I165L
L210M
E123Q
C162N
D162V
M35V
R32Q
T211Q
I200E
T211A
Q166I
H174N
T128P
Q102K
M139A
M139V
M139P
F210V
F210L
M139S
Y121N
A162H
D69S
Q219R
C162S
K211R
A204E
V159L
N64H
T27P
G123N
T142M
R166N
E35V
K64R
V200K
K166Q
E197T
P119S
K166E
S163N
K65R
L215A
L215H
L215Y
L215D
L215E
L215N
T75S
L215S
V111L
K174Q
Q207G
V60S
T173N
G15R
Q207D
Y121C
R135M
T215A
A169Q
I50L
E102T
E102M
A169G
G177E
V202T
V202I
K73M
C162T
I106L
A36E
R104K
E44V
C38S
C38F
E177D
K43E
N207D
I60S
N207G
T35L
S200R
K32T
S162G
F162V
F162N
E197L
D211G
D211V
F162S
R172T
P122V
M178L
E122I
E44G
N177Q
S68T
E79K
V8G
D44N
L195M
I37M
K166I
D162N
D162T
Y162W
Y162K
N175I
D162S
Q122K
K219Q
F116I
I135M
T211K
T211S
D110N
A98E
W162V
W162N
W162T
N32T
M200R
M200K
W162S
R207Q
V135I
S39E
D179L
R207N
R22K
G68T
D121N
Q197H
H64K
Q211R
C38G
G162H
A69S
T35V
E44N
E6Q
E6N
D121C
G211R
S173I
E169Q
E218Y
V10L
P4N
I165A
K197N
E169G
H174Q
V215L
V215A
V8I
R173M
Y162A
L210S
S162H
S123N
S210Y
A122K
A122V
Y121E
V75T
A138E
T69N
M16V
T200V
I159V
A33G
S48T
P176Q
E203Q
P176T
P176L
K197Q
I109V
Y162D
Y162F
R43K
R43T
T211R
Q161H
R196D
F210M
N177E
R174Q
K138E
E207A
T27L
T69H
K207A
G141R
N54I
Q102R
Q102N
N67G
E28A
E173M
I39Q
N64Q
M16R
F208L
N68V
I39R
I39E
I39D
R173L
I39N
N64K
N64Y
N64R
L193I
N68C
I39S
N68S
N68G
I39G
L210Y
E203K
N43T
P119L
T4S
S207A
R101I
V10I
F87L
C38W
A139V
A139P
A139K
A139I
E122Q
E211R
E79D
P4H
P4T
E122M
K197T
D204K
Q151K
N162G
D204G
V135K
V135A
V135P
I132L
R173A
N104K
E174Q
V60Y
A158T
S39D
S39G
E218V
E123K
T215H
E123D
Y162V
Y162N
W210V
A200R
Y162S
R166Q
R166E
E122D
S98G
K197L
A203Q
A203K
L200R
L200K
A203D
R43E
A211R
V21G
E173L
N207A
R166I
E32Q
D204N
D204A
A158S
K173N
I179A
E179L
K36D
N69P
S139R
G134R
V75S
I184L
P197Q
T69E
P197N
E122P
I60Y
F214L
K219S
N175H
R207L
R207E
R207K
R207D
R207G
E173A
E173I
H64Q
H64Y
T215Y
K32Q
K219R
N69A
N69D
D215C
T27I
E70Q
A14Q
C162G
T215D
R68Y
E70R
R68N
R68V
M139K
M139I
E70K
R45E
K123N
R68C
E70S
A14P
R173I
T69K
V60L
Y215C
D204E
V184L
G138E
R101T
T215S
K28A
K197R
W210L
Y215I
V189I
F215H
Y162T
N162H
H207A
G155R
H208F
Y175I
E194G
A123N
A179G
I69S
N69I
Q122V
S39N
Q219G
E218T
E102K
A169K
E102R
E102N
L74V
T215E
D215I
Q219P
T200Q
Q204E
P197T
E169K
P4S
T200L
E218R
N211R
I60L
T215N
S69G
F40D
S200K
I142T
F215Y
Q151M
A33P
F162T
Q43T
D211M
D211E
D211T
D211A
C162H
D121E
T27S
D194K
S173N
V135M
A200K
T200E
I142N
I142R
I142K
I142S
K82Q
A98S
H208L
M211Q
R143S
H219E
A43T
A43E
E53Q
N67E
M211A
M211K
F115Y
I179G
R173N
F215D
R196Q
S163G
F210S
G177D
E200R
W162H
I200R
K142M
C121E
W162G
E123G
F116Y
I142L
K197H
S39Q
N68T
S39L
S39P
E123S
A33V
A69G
S163R
I178F
V21I
S163T
E123R
I159L
N177D
A55Q
V179I
E203D
F215S
A139R
R172G
Y64R
K219G
K203D
T69P
V108A
K219P
Q138G
R169Q
Q138E
F160L
R169H
R169Y
I34L
E48T
L94I
R
169K
T69A
R169G
F215E
K49E
R172S
L77F
Q207A
E194Q
T69D
K172G
K172S
E53G
R219P
R219G
K49Q
I109M
I109L
M135L
M135T
I167M
F210Y
N67D
I142M
E200K
V179S
T69I
E122K
T215C
H67A
Y57H
Y57N
V196E
Y57K
H67E
H67D
I39L
P62A
P59A
H67S
H67G
F215N
T215I
N32Q
I31L
V62P
I178L
Y175H
T75A
L215I
T75M
L215C
P197L
N43E
Q219E
R101H
V179M
V179F
R135L
R135T
E197R
N192T
Q91E
D69G
M145Q
Q200K
R213V
P131T
V31L
S163I
I214F
M145C
I214V
R213G
I214S
I122K
T122K
G28A
I122V
T122V
D162G
K49R
V215H
N69S
I200K
E173N
R196A
R196K
D211Q
Y162G
R196G
Q43E
F162G
I135L
E218D
I118V
I135T
Y162H
E197H
H219NK219E
V75A
K49T
T166Q
T166E
T166I
T69S
E122V
Q190A
Q190E
N215I
D28A
K36E
P59Q
N215C
A117S
G134C
G134N
G134T
A98G
F215C
W210M
F215I
P197R
P197H
K101N
R207A
D162H
R101Q
I106M
M139R
E70T
R196E
R68S
R68G
I179E
K82T
S191Y
T200R
C134S
V135L
S191C
V135T
E49R
E49T
N134I
P48T
R66K
R99G
N134S
S39T
S39A
Q219N
V179A
I179D
E194K
R101E
I74V
V75M
V111I
Q49R
Q49T
K101R
H64R
D123N
R103H
R103E
I39P
I39T
I39A
R219E
F162H
D211K
D211S
W210S
V139K
V139I
L139P
L139K
L139I
H215I
L139R
D219N
V215Y
V215D
V215E
V215N
G148L
H215C
P69G
M211S
G143S
V215S
C163I
M211R
K219N
N69G
K101G
T69G
V179G
S58T
K82N
T200K
W210Y
L4S
L100I
T75L
T75I
Q194K
Y208F
Q46K
Y181S
E101P
V62T
T139Q
T101Q
V203Q
H182Q
T101H
T101E
V203K
I181F
N186E
A107T
R28A
V109L
I179L
I211S
C188F
G134I
G134S
K135L
E53D
Y208L
T139E
K135T
K101I
N166Q
N166E
V179E
K101T
V108I
V106G
T139S
T139A
V179D
V75L
G194K
I214L
E123N
V215C
D211R
K101H
Q101P
K101Q
K68T
R219N
V215I K82R
T139V
T139P
R68T
K101E
V179L
M75L
A190E
T139I
T139K
V75I
V203D
K53D
V62A
N166I
Y181I
T139R
V139R
F193I
V106I
A179E
A179D
A179L
V106L
H101Q
Q190S
R190Q
H101E
M50V
M50L
R190T
T101P
F100I
Y181F
I211R
R190C
M75I
Y188N
R101P
V106M
I181V
M106A
I69G
R103I
Y188C
G190R
Y188S
Y181V
K101P
E190S
Y188H
R190V
R190A
R190E
A190S
K103R
K143S
K103H
K103E
Y188F
I106A
R103T
H101P
K103I
G190T
G190Q
G190C
G190V
V106A
G190A
K103T
G190E
K103S
R190S
Y181C
R103S
I181C
G190S
V181C
R103N
C188L
SQV
−log(prob)
1.5
I35G
K197H
K197T
V10I
I94M
R101N
K73M
I31K
A36D
A173E
E174Q
F124L
R103T
T135V
K39N
K39I
T27P
K207A
A173K
I35R
A207G
P122Q
E173K
I135M
E138A
H121D
K64E
I179E
A28N
K11S
K219D
N123E
N39LK135T
S200V
E53Q
D204E
N103T
G177Y
Y121N
D169N
E207T
P122E
E32R
P1T
K35R
K35G
Q91H
V106M
E200K
F87L
K101G
R174K
E39K
E39N
E39I
A162N
A162C
A162S
W210Y
W210F
R49Q
I180F
D207L
G190S
R11T
F116S
G190V
K197E
Y121H
H174K
N162I
I202T
K207G
E197Q
E101G
W212L
R211M
R211G
N173E
N173K
M16I
E177G
M35G
K32R
K22N
E203D
Q102K
K6E
A35G
G123S
K101Q
R211K
V202L
K219H
E39L
E177Y
D194A
D194V
T200V
S48T
Q211A
Q211V
Q211D
Q211E
L173K
D162I
A105S
K28Q
K28E
K103T
N174Q
G138K
G190R
D207E
K83R
V179L
M135R
N175H
A39K
M35R
K101P
R172K
S3C
K219G
D6E
E86N
D215E
R173E
T211A
T211V
K135V
V179D
S215D
I200V
K154R
P95Q
K166R
G99R
R64Q
V200A
M39K
M39N
M39I
T173E
K20L
S211A
S211V
P4S
A39N
A39I
N207L
I184M
N81K
S211D
K174E
L35G
S211E
V178L
V60I
D53Q
A162I
T173K
Y121D
R135T
E102K
I5V
K204E
G68N
T211D
T211E
E101Q
G211A
G211V
D203K
N204E
R173K
N57K
Q23H
P197E
R172S
E200V
I106V
T139K
N68V
T139R
T139I
R135V
D123K
A139R
A139K
A139I
K35V
S158A
D123G
N64Y
I39L
C121N
C121H
K122Q
S27T
S27P
A35R
S39K
I173E
K49Q
G211D
D179I
D179G
K122E
T39M
Q151M
G196K
M210S
E36A
A114P
V50I
S215E
L47V
L47I
S207T
G211E
V180F
M39L
I173K
D207T
K197Q
K211A
K211V
T35Q
M135T
T69P
K64Q
D194E
I8V
D218E
K211D
I94L
A28Q
K211E
E36D
D179E
V111I
T69D
I132V
K194A
K194V
R39L
K194E
H101G
G162Y
T35S
D169R
M135V
C162I
S173E
E101P
K70N
I35V
Q203DQ203K
R101G
N113D
I135R
Y115F
S200A
S162I
R64Y
I165T
R102K
R174E
K174Q
K200V
G141R
T84I
R101Q
R101P
S39N
S39I
N207E
F210S
K39L
Y144F
K64Y
I106M
G177N
R206K
Q207L
L35R
E196K
E48T
D69N
S173K
T69N
G207L
G207E
E200A
N177D
N104R
T200A
D169E
W210M
L197E
D79E
V118F
I200A
I2V
M178L
N207T
G207T
A39L
P1L
A179I
K169R
A179E
A6E
K169E
A179G
K44Y
L4S
K104R
R82K
L135I
D86N
A207L
E203K
D123S
V202T
E177N
I135T
Q207E
L205F
E123K
R174Q
E123G
R211A
R211V
G190C
R207L
C88S
I166R
R211D
T215V
R211E
G68V
I135V
N70Q
N68K
H207L
A207E
V179G
A138K
K219T
E138K
V179I
R20L
L188Y
C121D
S11T
K219R
A162Y
K11T
S39L
K22R
Q190E
R207E
T35G
S190C
A28E
N32R
L135M
Q207T
V184T
A69S
H207E
V179E
M35V
A98G
S68T
K207L
V90I
V21F
M139R
M139K
M139I
A204E
V16I
T4S
N123K
N123G
A207T
E123S
H101Q
T35R
H101P
T70R
C215N
T162Y
E40D
V21I
K200A
T39K
F77L
G190E
Q200V
Q200A
P39K
P39N
P39I
E67S
P197Q
R207T
H
197Q
H197E
T39N
T39I
G68K
K42E
T69A
E70Q
T69I
H207T
R35V
V75T
T69E
S98G
V118I
R219E
L215F
H211A
H211V
H211D
H211E
A62P
K207E
I195L
G177D
I178L
A35V
L135R
C162Y
S162Y
K70Q
N69A
G190A
K207T
G67D
E154R
F188Y
H219T
H219R
D28E
E177D
S137N
A69G
L35V
L135T
L135V
H174E
T215S
N69I
A106V
T58N
C181Y
N123S
W88C
H188Y
R190C
S190E
W88R
T39L
N43T
N43E
T58I
T215D
T69S
K219E
D185E
N69E
A106M
Q217P
Q70R
T32R
I181Y
A203K
A165T
S58N
S58I
P69S
P69G
G134S
T69G
K43Q
S190A
F116Y
T215E
S215N
T35V
N219Q
T158A
V215D
V215E
K36A
K36D
V215N
N219E
D67N
V215S
H174Q
Q43N
Q43R
S68N
R43T
E219Q
Q190A
L197Q
C215I
E215N
R219Q
S68V
D162Y
N69S
L210W
K219Q
L210Y
L210F
D40F
I215F
N162Y
E49Q
K44E
L210M
T75I
S68K
N101Q
P39L
N101P
N101G
W88S
N69G
R190E
R190A
V215I
L210S
Q43T
N70R
W210S
T105S
S215I
T215N
D69A
D69I
V184M
I74L
I100L
R43E
K43N
C134S
K43R
D44Y
G70R
E70R
D69E
K70R
K40F
C215F
D215N
T215I
D67S
E215I
L184M
M75T
L75I
M41L
D69S
K43T
V75I
N67S
V74L
D69G
V181Y
D44E
N215F
S215F
E40F
T215F
D215IH219E
K43E
Q43E
H219QP62V
A62V
E215F
D215F
F215Y
V215F G67N
M75I
S215Y
E215Y
L215Y
G67S
C215Y
A40F
D215Y
N215Y
I215Y
V215Y
0.0
M184V
1.0
V75M
C162S
0.20
M184V
0.20
C162N
I135V
K101E
G211R
M35I
D121N
S68G
E39S
I142Q
V21I
I35V
K43N
E207T
Q101R
H121N
M142V
V180M
T211K
T211N
D36A
G123S
I8V
K135L
S162D
S162G
M35V
T135L
K204E
C162A
T11K
I165T
K211N
A39S
V106M
R173L
K122P
M39S
H174R
K122E
T166R
K39S
L142M
L142V
S162Y
Q174R
Y181V
V179D
R135L
A162D
A162G
E211N
E79K
R173K
Q207T
L35I
L35V
D204E
M135L
R207T
A173L
K207T
D207T
A162Y
S173L
S211K
E173L
G207T
N175Y
E207A
C215V
C215D
K22R
K35V
G177D
Q211K
K83R
T173L
I106M
N207T
K174R
T195L
M210S
A139R
L39S
N69I
I173L
V60I
V135L
G45E
K104R
S173K
T173K
K166R
A173K
E173K
D113E
C134S
S200E
G99R
E203K
V75M
Q211N
E203D
E177D
T139R
K20R
S27T
D207A
V202T
D162Y
D44A
K70T
I202T
T39S
C88S
E36A
Y121N
Q204E
G211K
R200E
E215N
G207A
S211N
D123G
I173K
I195L
I142M
Q122P
Q122E
K207A
N123G
D218E
T35I
R199G
R211K
R207A
K101R
K139R
H207T
V200E
C162D
I200E
I166R
Y208H
S207A
A211K
C162G
D40F
A35I
A35V
W88S
T142M
T107S
D123S
Q207A
A33V
C181I
K66N
E48T
E48S
I135L
V50I
C162Y
R206K
N173L
N173K
N147H
C145Q
V122P
V122E
A211N
N104R
N123S
K172R
H175Y
E138A
T35V
H162D
H162G
D67G
K70R
T69D
A200E
I180M
E40F
G211N
I142V
R211N
T200E
Q46K
Y188L
L215F
T4S
N177D
F124L
K49R
K200E
N162A
H162Y
T142V
A123G
A123S
V111I
Q173L
A190S
D186N
K103S
K219N
R35I
R35V
D186E
Y115F
P4S
P133L
R103S
E123G
A106M
W162D
W162Y
M200E
W162G
N69S
T58I
E123S
Q173K
D79K
N166R
N207A
H207A
N162D
N162G
R162D
R45E
A142M
R162G
P39S
R43E
T58N
N103S
Q142M
L75A
K123G
K123S
G44A
V75A
T215C
S215N
G190S
V179I
E44A
N162Y
H182Q
I215N
L200E
K40F
I215F
I181V
A179I
T215V
K43E
T69I
E179I
E101R
T215D
K142M
D67S
T69S
F77L
Q43E
D179I
D67N
N43E
K44A
C181V
I108V
I100L
K219G
R162Y
E70T
E70R
E215F
A142V
G134S
L210S
F116Y
D215N
S215F
Q142V
L75I L215Y
E190S
K142V
V75I
T215N
R66N
N188L
S58N
K219R
C215N
A62V
K65R
D69I
F210S
M41L
D215F
L74V
E219G
D69S
T215F
Q151M
C215F
L210W
Q219G
G67S
M210W
G67N
F215Y
Q219R
E219R
D215Y
F210W S215Y
C215Y
T215Y
M184L
I215Y
M184I
E215Y I184V
NVP
0.15
0.15
V202I
S48T
change in log(rf)
change in log(rf)
1
−log(prob)
3.0
2.5
2.0
0.5
I202T
0.10
1.5
−log(prob)
1.0
0.5
M184W
M184R
M184L
M184I
0.0
M184L
Q151M
M184I
0.10
ABC
S163R
G196A
E196K
K201Q
R143S
I94T
Q151H
S162H
T39L
R199S
I35A
A162R
A162K
A162H
A162S
I60L
K211A
E204N
D123S
T35K
T35A
T35I
T35S
K13E
S211R
A62P
V111M
N123Y
Q145H
R211E
I165K
N54H
S163T
P150S
V106G
F171L
Y188H
I200T
D40G
K39R
P97A
E123Y
A162T
S134I
V50L
R125K
K102R
E203K
I31M
S211E
M178V
D177E
V75L
Q174R
I142T
A36D
A36N
A36K
A36G
T215C
V111L
K207Q
K207T
K207L
V142M
T11A
K211D
S68T
Y121H
N123G
D218T
N137K
K174E
N67E
A200S
K122D
E123G
P1T
K197T
A173K
A173I
E44Y
A200M
K102G
K82R
I173E
D86N
Q219S
T142L
T142V
I165L
E40K
I202T
A207K
K211V
E40S
T107P
I135T
K32R
L209M
P150T
V106A
I195T
D185N
D185E
E36D
E101G
V135S
L74V
T35Q
L109I
K102M
K46E
G99R
L209S
K39T
K39L
S162T
H96Y
K166E
C162D
C162Y
V108I
R103N
E39R
E39T
E39L
E39S
P14A
A207Q
E36G
I178P
V50I
F214S
T215V
M135K
M135R
K174T
I178N
D67N
L109R
N177H
Q197K
T215D
Y188L
Q207T
E138K
D169Q
K64N
D169A
C162G
E36N
D67E
T215L
V148G
T142M
T107S
R199K
E44D
G177D
G177E
A173E
L92I
Q197T
T200Q
P14S
K219E
V90I
D169R
V135K
K174Q
Q219T
K64H
L135K
A200K
R104K
T35L
G211K
G211A
S123Y
S123G
K173E
D203V
T139M
D203G
F87L
E196R
K174R
D207A
G190Q
T135V
T135L
Q207L
E6T
G51A
G196E
T211Q
G196K
I165T
T211S
T211R
T211E
K70E
H174Q
H174E
H174T
H174R
A39E
E40F
K122T
L109V
L35V
E204A
R101N
R101H
T215I
D123Y
D67K
C162R
I202L
T215N
K70T
T69D
M35Q
T139R
K30T
A207T
A207L
A207N
A207E
S215Y
G51K
C162K
C162S
C162H
K103S
I195L
K207N
K207E
T11N
T11K
T173K
T173I
D40K
G155R
L178P
L178N
A138G
Q122D
Q122T
Q122P
K122P
Q173K
Q173I
I142L
K70R
E36K
D123G
V118I
C162T
C181V
E169D
E6Q
G211Q
I31K
T35V
G211D
G211V
G211T
G211S
G211R
G211E
I178M
G68Y
A39S
K219N
R11N
R11K
R173T
Q207N
T215S
I135V
I135L
R143G
D218E
D215S
V75S
D207Q
D207K
E89G
Q91E
Q145C
D110N
S173K
S173I
S173E
I142V
R72K
K166R
M164H
I200Q
Y115N
A39R
R206K
R207H
V179L
E138D
K103H
D121H
D40S
N147H
F116Y
G196R
N173K
N173I
N32R
K103R
V135R
E32R
Q207E
S200E
R43Q
K200E
E204K
V179S
E207H
A39T
S48Q
A138K
E89K
A207H
S68Y
N39E
N39R
N39T
N39L
M41L
N39S
E197K
E197T
Q197H
K103N
K49R
I94L
E203V
E101Q
V179I
E203G
T173E
Q101K
K20R
T215Y
E53D
E122Q
A114P
A211T
T58N
I178V
A62V
A200E
K126E
I47F
Q151K
E6D
I63K
K197H
S39T
K30E
K64Y
L34I
A28Q
D207T
D207L
A28G
F124L
R201Q
A39L
D67S
L178M
I142M
W24R
T135S
A158Q
K211T
P4T
I100V
Y162G
K43Q
L214S
I215Y
I60V
A138D
R173K
R173I
F215I
N68V
L187F
K219Q
T200S
T69Y
E28K
R172I
I135S
N211Q
L34F
E194K
N211T
N211S
N211R
N211E
Q161H
P140Q
D67G
L210V
T200M
F210S
E169Q
L210M
P1L
V75T
R139K
I63V
K219S
I200S
E219N
S68V
K83R
D40F
E42K
N43T
K219T
I35Q
A122D
A122T
A122P
S98E
L210F
E48L
M75T
Q91H
Q173E
T135K
K211S
V75I
E169A
L210W
L178V
K135R
P150L
M35L
I135K
D186E
R172S
S176P
E169R
A211Q
I200M
R101E
V106M
A211S
R101G
A211R
A211E
K64R
N162H
K207H
T139V
E203Q
R173E
K211Q
F215N
D215Y
E29G
K102E
C215V
C215D
C215L
E197H
T139I
K66N
N177R
N177D
N177E
T200K
T139A
E53K
S48L
L210S
V106I
D53Q
D53K
D53G
K211R
Q151M
M35V
Q207H
K172I
K172S
E53Q
K6Q
K6E
K6T
K6G
T69P
D113E
N67K
A158P
K66R
V21I
T69E
D186N
A190Q
R101Q
R101K
I200K
G68V
K102Q
T69I
K211E
D207N
D207E
E53G
H64R
T69A
T135R
K169Q
E190Q
I35L
K169A
K169R
E101K
V111I
L135R
L215Y
L215S
I180V
Y162R
Y162K
I2V
E194V
P7Q
E194N
T69N
S48P
T139K
V202T
E28N
A35V
S39L
I135R
P4K
T200E
Y162H
Y162S
I200E
N69S
A158T
P4H
W162R
W162K
W162H
W162T
S103H
S103R
W162S
V202L
C215N
C215I
C215S
E122D
D207H
S27T
S190Q
P101E
L12I
E28Q
S98A
Y64R
I181F
I181Y
I132V
I132L
I35V
A158S
W210S
K6D
A33G
V75A
Y162T
N43K
P4S
Q166I
T69S
M75I
S48T
R102E
I189V
I189L
V122D
E194A
K28N
E122T
E122P
Q161P
E28G
H101Q
K219R
N162T
C215Y
C181F
D162G
D69Y
G98E
E48P
R102Q
P101Q
K194A
P101G
F215S
K142V
I39T
I39L
K203V
K203G
D162R
V215N
V215I
E48T
N173E
E43Q
E67K
Q138D
F208H
T4K
T4H
T4S
K138D
D203Q
R68Y
R68V
S103N
C181Y
K28Q
K28G
P101K
K65R
N166R
L75T
L75I
L75A
I69N
N219Q
L31M
L31I
L31K
F215Y
V203Q
D162K
D162H
E219Q
V122T
V122P
D162S
E219S
H101K
M139V
M139A
M139I
M139K
E219T
D69P
Q219R
D69E
N47F
G98A
D69I
Q200M
Q200S
Q200K
Q200E
A203V
K203Q
A203G
D69A
N219T
N219S
N67S
K142M
I100L
E69A
E69N
V181F
V181Y
D69N
A106M
A106I
P7T
Q166E
F115N
Y57N
T16M
I75A
D162T
N67G
I74V
D211Q
T75A
D211S
D211R
D211E
V215S
Y208H
Q166R
M102Q
M102E
F167I
A67G
D69S
M195L
S142M
C3G
T67S
C3S
T67G
M75A
E49R
A44Y
V215Y
E67S
H219R
G203Q
E219R
A44D
E67G
N43Q
I69S
A203Q
T202L
E69S
N219R
TDF
0.05
0.05
change in log(rf)
0.0
4
3
2
−log(prob)
1
V75I
0.20
K219N
0.0
0.00
3TC
change in log(rf)
0.00
0.30
1.5
0.10
0.25
Q23H
0
2.0
1.5
0.0
0.5
1.0
−log(prob)
2.5
3.0
d4T
0.05
Q151M
0.20
G211N
L142V
L142T
E207Q
R211G
K122E
K102E
K135T
H207A
Q11K
Q211G
A105S
V135K
A36G
E197Q
Q211N
H121D
H208Y
P4L
N162H
N162Y
A173T
I94L
C134S
E173T
S69N
L178V
K173R
S123G
H175Y
N207T
Y121D
K207E
M135T
P101Q
R211N
R135T
L197Q
R72K
S103K
L135T
K83R
A211G
H207T
A169E
V122E
E123G
D123G
T215Y
K207Q
A162H
S11K K172R
M35V
L178M
T162Y
W153G
K101Q
R11K
M210S
N81K
K22R
I135T
E211G
C215T
K169E
K197Q
I35V
C162H
L210F
G138K
C121D
A162Y
A138K
N32K
N68V
K43Q
C162Y
D162H
D162Y
N173T
A211N
Q207A
N215T
N6E
G207A
T216P
K211G
V10I
R219Q
R32K
L200E
T191S
A207T
A139P
R173T
V215T
M142V
R103K
R197Q
A122K
I173T
G99R
M142T
K44D
A14P
S162H
K139P
T139P
D179I
D215T
E211N
T211G
L215T
G67S
E32K
E219R
S162Y
T58N
K211N
E101Q
D177E
P197Q
Q207T
A122E
L188Y
T211N
E138K
K6E
I142V
D169E
H219Q
I8V
A179I
K57N
A40F
P170L
E203V
P122K
N67S
L35V
V200E
D79E
K142V
K142T
S211G
I202T
H197Q
T69D
P7T
I142T
D218E
R43Q
I165T
E203Q
I202V
I178V
Q200E
E36G
A215T
T163I
A215Y
D67S
D6E
V179I
K207A
M139P
K173T
G68V
A33G
D207A
S215T
I178M
R200E
N123G
I180M
N175Y
I202L
R139P
G69S
G69N
T4S
L4S
Q122K
E203D
I60V
P122E
E44D
Y127C
L173T
N43Q
N215Y
D36G
E43Q
I200E
R101Q
K219R
T68V
T123G
A44D
E207A
R102Q
K200E
T200E
F87L
V118I
L165T
P4S
N219Q
N219R
N103K
K102R
S200E
C215Y
S211N
I215T
S173T
A35V
Q122E
Q203D
T101Q
Q217P
N102Q
E40F
G134S
N177E
F210S
L210S
G177E
D207T
K207T
V135T
V75L
L215Y
G207T
S68V
E219Q
H4S
E215T
D215Y
T11K
K219Q
L207A
L207T
H211G
G44D
E207T
A200E
K35V
I215Y
E215Y
K102Q
S215Y
V215Y
P62V
F215T
M41L
L75A
K203V
L168S
R35V
K203Q
D40F
L74V
A43Q
T69S
N22R
T69N
P133L
K65R
H211N
I74V
T35V
F77L
D69S
K203D
Q6E D69N
M210W
F215Y V75A
L75M
L75T
V75T
V75M
E102R
L210W
F116Y
S210W
L31I E102QF210W A62V
V75I
L75I
E210S
I184V
E210W
change in log(rf)
R172I
S211M
V179F
T142I
D218V
G190R
V179G
T39K
Y188F
T35I
D121Y
D40G
K39A
T135M
A207T
A207L
W153M
K219E
G211V
L173Q
L173T
V50F
S98A
E122K
I173Q
I173T
T135R
K103Q
T173Q
A207K
H198P
T39A
D123Y
D177Q
I35A
I5LI8V
Q173S
K103H
E89G
S200L
E194G
L39T
L39K
I179E
I173S
K39I
I179S
Q207T
K207G
D179E
D179S
C215I
A173T
H198Q
K65N
T165I
V135I
T35A
T200L
N175Y
T69D
K207E
R102N
N207H
L26M
Q207L
D169K
D39T
D218H
W153R
T39I
A122T
E211V
I50M
L135M
L135R
A39I
K204N
H207T
H207L
H207K
S211V
D123G
R101E
R64Q
K211S
P9S
V180F
A173Q
R173K
Q122V
Q122E
Q122K
A207G
S162N
E101P
L173S
R32K
K103S
A211V
Q204N
T173S
L195I
C162R
S48T
Q207K
Q23H
N173T
K211M
D67G
I200L
K197R
L39A
K6E
L39I
M142T
M142I
A207E
G138E
Q197K
T11R
H207E
H207G
A173S
R72T
E28A
K138A
K70T
N123D
C215F
T139M
K70R
K35M
K39S
V74I
A35M
T195I
L35T
D186H
P25A
K13T
I100F
D204N
T39S
R174E
S123D
E196G
N103E
D207T
D207L
P119L
G45E
S123Y
E200L
D113E
C145Q
D194A
Y162C
G211N
N81K
I132V
S163C
R196K
D53Q
R139K
L215F
L215I
L215C
Y171F
C162T
R200L
R101P
K200L
D39K
D39A
D39I
T107S
E203Q
L12I
N32Q
V179L
W162N
W162S
E173Q
N123Y
E173T
E173S
Q101E
L39S
K196Q
V200L
F214L
A200L
D207K
M35G
T215L
S123G
I202V
E203D
Q91H
T215C
D179V
D179F
K64Y
D179G
M184L
S215D
Q207G
V215D
K135M
E44K
V5L
I106M
Q211V
I35M
P157A
D79E
F144S
T211V
A39S
R143T
R125S
T142V
A69S
A28G
S162D
R207T
D6E
A33V
L35I
N173Q
N136T
N173S
N39T
N39K
S139A
Q207E
K172G
N177D
A114P
I106L
E48T
F162N
E44Y
F162S
E70G
T11N
D169R
I47V
I47L
E179L
R122V
E177D
D39S
I135M
K135R
G162N
G162D
C121H
T35M
R211V
N123G
N177Q
I135R
I39S
K64N
S211N
D36A
A139K
G68N
S27T
K139P
L210Y
K20L
L210F
D207G
A55P
Q197R
A35V
A35G
D169A
R102K
G177N
G177E
R11N
E28G
R207L
I31L
K194A
Y144S
R143K
D179L
K49R
A211N
R39T
R39K
R39A
R39I
E177Q
A105S
P52L
C88S
D169E
I90V
E197K
E204N
T69N
I106V
T215I
G196R
K142T
K142I
H174E
M35V
D207E
K203V
I35G
K174E
I142V
Y188H
T4A
T4L
N69G
M142V
R139P
N207T
P14A
T215F
Y121E
V50M
P133L
L178I
Q203K
Q203V
N113E
K123D
K123Y
R13T
K123G
K173T
G196A
R207K
L205F
S215A
S215E
S215T
E32R
G18C
D121E
I50L
Q138E
K211V
I34F
E190A
N39A
V16M
W162D
N39I
N32R
E211N
K102Q
I179V
E215L
E215C
I35V
I179F
N207L
N207K
N219K
E40D
L168S
T58N
K70G
L165T
D215A
D215E
D215T
L35A
I179G
Q11N
W88S
Y64N
K28A
A179L
G196K
N174E
M211V
R122E
M211N
R122K
R104K
R43N
R43K
K169R
K169A
K169E
K173Q
E194V
Q101P
P197K
Y144D
C162S
G190C
D53G
M178V
W153G
I180L
R173T
Y162R
M210S
R35G
V74L
S146Y
E40G
K35G
E194D
D17G
T7A
K20R
E197R
T69G
Q174E
D218Y
D203K
A158T
N137H
K43A
Q190C
N102Q
Q36A
N102K
H197K
T67E
H197R
H219T
A7P
R152G
K101E
M184I
I2L
E39N
P122V
M164I
R39S
T7P
H96Y
K173S
T35G
C162NT11K
T75A
R190A
D192E
T75S
R190C
H96R
S163R
D218E
L165I
E53Q
E36G
K22R
R173Q
T211N
S139K
N39S
K35V
R103Q
G213D
R64Y
K43Q
Y162T
N207G
R11K
M106L
M106V
N70R
N70T
F162D
T162N
T162S
N134S
N70G
L187F
D186E
R35V
M164L
V75M
N162D
Q211N
K101P
G177D
K138E
R211N
R103H
G162A
P122E
R207G
R64N
P122K
E123D
G196Q
E203K
N207E
E196R
K211N
N32K
R93G
N22R
I178M
D186N
V135M
G177Q
C162D
L142T
L142I
V135R
N43K
L193F
V50L
N219E
T35V
S162A
R173S
N103Q
R196Q
E196A
S190C
E123Y
N103H
A11K
D203V
A138E
Q142T
Q142I
Q142V
L207K
L207E
T16M
A43E
L207G
G44K
G44Y
I214F
G69P
S170P
G69A
G69S
S117A
Y121H
E196K
I63V
D121H
D162A
E215F
E215I
E123G
P197R
L35M
K36D
S68N
T163CT163R
K36G
E36D
N192E
M41V
Q11K
K142V
K126E
A139P
L142V
R207E
E194A
R103S
I179L
L210M
L132V
R28A
G190A
F144D
R
28G
P140T
N103S
K219T
E32K
K28G
F210M
T139A
P140Q
E203V
K65R
T162D
V10G
P4A
M210W
I74L
N215F
N215I
P4L
T69Y
E53G
L197K
H208Y
Q35M
L94I
R45E
R66K
I69A
W162A
S139P
K42E
I69S
A122V
A122E
A122K
N81D
I2V
I181F
E102H
E102G
K219R
C162A
T69E
L35G
L178M
F116I
T139K
H101E
R102Q
E196Q
D69N
C134S
L197R
T69P
V202L
K53Q
H182Q
H218Y
H218E
A121E
K53G
A121H
V10I
L34I
A62V
M41L
E39T
G98S
L35V
V111I
D44K
Q190A
S215L
S215C
I178V
E39K
C181Y
Q46K
K172T
K172R
K172I
T139P
I202L
T215Y
V202T
I180V
D67E
E39A
Q70G
E152D
E102N
E30K
K36A
E152G
S58N
N162A
E39I
P4S
L178V
R70G
R199G
Y162S
T69A
A106M
A106L
L210S
D44Y
M139A
I202T
G98A
T68N
R162N
R162D
F162A
R162A
R162S
E39S
Y162N
P150L
V215A
V215E
V215T
E36A
R43Q
R43A
D215L
D215C
Q4A
Q4L
Q4S
R68N
L184I
M195I
I122V
I122E
I122K
G134S
P27T
T69S
E215Y
I180F
L34F
S215I
N69Y
N104K
T70G
L149F
F210S
S215F
M139K
M139P
Y162D
L149I
H64Y
H64N
N69E
D69G
L210W
N69P
H101P
A106V
Q104K
P39T
P39K
P39A
P39I
E102K
G39T
G39K
G39A
P39S
G39I
T4S
D86E
T162A
V184L
K43E
N215Y
D215I
Y162A
V118I
V21I
Q35G
Q35V
D215F
A215Y
L188Y
I181Y
F210W
E102Q
V181Y
L188F
S190A
I100L
A158S
N69A
N219T
H219R
I214L
V184I
C215Y
R43E
D67N
V109L
V215L
H103S
I215Y
I109L
P62V
V75L
G39S
N69S
V215C
L215Y
T158S
Q219T
N219R
L188H
V75T
Q43E
F215Y
A135M
A135R
F100L
S67E
F77L
N43A
D69Y
V75A
Q219R
D215Y
G67E
N43Q
S215Y
V75S
V215I
D69E
E67N
E219T
D69P
V215F
Q151L
E219R
H67E
T67N
D69A
D69S
M75L
M75T
I211N S67N
Q151H
M75A
M75S
Q151K
G67N
N43E
F116Y
V215Y
L75I
H67N
T75I
M75I
0.00
1.5
2.0
0.05
0.5
2.5
2.0
1.5
−log(prob)
1.0
0.5
T215Y
T215F
0.6
2
0.2
E36K
K11A
I200T
I200V
T200V
R135T
A173S
V142M
K207L
E173L
E173K
T216H
I173Q
I173A
M41V
T58I
K11S
G177N
L214I
A162H
A162G
K211D
E219T
G207K
G207L
H174K
V35L
I173S
T135M
Q173S
K39E
Y171F
L173K
Y121E
V108I
F215I
Y188F
K135R
K135T
N173I
T165A
D67G I8V
F87C
Q207D
S200M
K207H
E123S
K174N
K28E
W153R
S162R
R135M
L214S
K43T
I37F
S123T
K104N
Q211M
Q211K
Q211D
V179F
H198P
K174L
K102M
L142V
L142M
N207K
N207L
I178F
Q207R
A200I
G98E
A200T
K101R
S69G
H207A
W153S
N173Q
N173A
P197E
A162Y
N137I
A200V
D207R
V179D
K201N
R11N
A211E
M35R
T195I
N162S
A190G
T142V
K207T
S162T
T35V
M35S
N57K
D44K
R211A
V215D
H162Y
L165K
M210S
Q91H
R39E
A105T
T39K
T39R
D192E
D6E
T142M
Q197H
R174Q
R172S
R83S
S211A
K13S
I37L
E194D
D39M
K135M
H208F
G196Q
A169R
E79K
A169D
N123G
N173S
E39L
L215N
L215F
T163C
G207H
K207A
N192E
W162D
N123D
E123T
T39E
R173E
R173L
T35L
H174N
K43A
T211A
I35T
R102G
I35A
G211A
G196K
Y121N
P7S
A122N
A122V
Q102G
Q102K
M39R
M39K
M39E
Y144H
P150L
M164L
G196E
I195L
D79E
R173K
D79K
R101P
I165T
D215E
E177D
H101R
A138K
R22K
A11T
C162N
H174L
P197H
V5I
M16I
K39L
R102K
M16T
R196Q
D39I
R196K
R196E
D39R
D39K
D39E
E207Q
T215N
C162S
S48L
D215S
R207K
R207L
I165A
D113E
M200L
M200K
T173E
N162R
T173L
Q23H
I39R
I39K
I39E
S200E
S200L
I142V
Q122N
Q122V
N39R
N39K
N39E
K101P
E203A
E203G
D67H
Q211A
K49R
A69E
A69P
A69I
T215F
A36G
P122N
S211E
T173K
T215I
I35V
A39R
A39K
I142M
G162D
K220Q
T69D
R39L
S39R
S39K
S98A
N207H
E174R
E42G
V90I
S48P
F210M
E102K
E207D
E102M
E102G
A122E
S139P
S139R
C162R
Q173E
Q173L
I35L
C215N
C215F
C215I
M39L
S105T
G207T
Q102M
K194A
R82K
R45E
Q122E
D39L
E29G
D194Q
M139I
D194V
K122N
D194K
W210M
I173E
I173L
S200K
V139R
K36G
A162D
I135V
R43T
N69E
L193F
A39E
E207R
E35T
E35A
R200M
R200E
R200L
R200K
A173E
T163S
T39L
G207A
S27T
C162T
Q173K
G190R
I173K
P122V
A44K
A173L
V60L
K122V
S88W
C145H
E174Q
F116S
P119L
L187F
R190E
R207H
D218Y
R190S
A173K
V202I
K172R
Y121D
A28D
V106M
V106I
K43E
K211A
K219D
V178L
T195L
E44D
Q207K
R43A
H175Y
N207T
A204E
R155G
S150L
T158S
P7T
S107T
K122E
P122E
E40K
K174R
K83S
G211E
A114P
D204E
K11N
T166R
V50L
R211E
S123N
E169R
I63V
K35T
K35A
D207K
F116I
V215E
K169R
K169D
D40F
R207T
P4L
E44K
E36G
T200M
S39E
N207A
R43E
L210Y
S162A
A36D
H208Y
R197H
V200M
V200E
V200L
V200K
K142V
K142M
F144D
G204E
E169D
I200M
D207L
Q207L
C88W
K103H
R11T
A139P
N175Y
V189L
H174R
Q145H
I75T
V16T
D194A
M139P
A7S
A7T
Y162D
E215N
E215F
E215I
E123N
N174R
E138K
Q211E
S162H
F214L
M35T
M35A
S162G
L215I
N162T
S39L
T211E
K139M
R102M
K174Q
S123G
I50L
F214I
L12F
R162A
R162H
V132I
R162Y
I39L
S11NS11T
R162G
E194V
S123D
D67N
I200E
I200L
R199S
T162A
T162H
T162Y
D207H
T162G
K103R
Q32E
L197H
E179I
E197H
C162A
T200E
T200L
A200M
E36D
R207A
F214S
T139P
P197K
N211A
I200K
R135V
G68Y
K204E
K172S
K35V
K35L
A28K
D186H
K123N
A43E
H162D
P62V
V215S
D211A
D211E
I60L
K13T
N174Q
K6D
G68N
D121H
T200K
Q207H
Q204E
N57H
A106M
A106I
V118I
N69P
N69I
E123G
C162H
E35V
E35L
G45E
S215N
E123D
C162G
V21I
E207K
N173E
N173L
E219N
G177E
G99R
K211E
G194Q
H188L
D196K
D196E
E194K
G194V
G194K
I122N
I122V
G194A
I122E
P27T
C188H
C188L
G190T
S162Y
S68G
Q43T
R70G
D186E
H101P
M178L
W153G
C162Y
N177E
A39L
Y144D
K125R
H197K
K36D
L12I
C121D
G67H
R206K
N173K
A200E
R32E
A200L
G68R
E207L
L74I
K104R
Q197K
Y188H
A200K
A28E
V189I
S215F
S215I
Q196E
H219N
V31I
N6E
N166R
E194Q
G143S
G143R
S58T
S58I
M35V
D215N
D215F
D215I
N67T
A139R
Q207T
R70K
T139R
T69E
M35L
D67T
K6E
Q166R
T215Y
K11T
D185E
Q219G
G190V
V179I
E40Q
T4S
Q207A
A158S
K219G
G190E
E40D
E194A
K64H
L34F
L210M
N104R
G203Q
L135V
E67T
L132I
L75I
I139P
N11T
E190S
E121H
D207T
D207A
K219E
T69I
G177D
N39L
K139I
T69P
K166R
M139R
E207H
I181F
D218E
R101Q
K219T
D67S
Q219E
N103H
N177D
T135V
S48E
C162D
K101Q
H174Q
Q43A
Q103R
T101Q
F100L
T43E
T68R
N162A
I109L
H218E
R197K
G44D
G28D
G44K
R199K
V10G
I184L
M135V
F124S
S48T
S162D
E101H
E207T
D17G
P4H
R35T
R35A
Q46K
L47I
I166R
F124L
K219N
E207A
Y121H
P39L
N32E
V10I
T69S
L4S
C181V
Q219T
A190R
E101N
N69S
E215Y
C121H
L168S
Y188L
N43T
E197K
N211E
L41M
N103R
D179I
I178L
N67S
T69G
S173E
V111I
E70K
A135V
R162D
F181Y
K154E
E70G
G148V
I7SI7T
C163S
S173L
S68Y
E203Q
L41V
N204E
S68N
L197K
A69S
A69G
D215Y
V181Y
N113E
E40F
K135V
K123D
K123G
E210S
S173K
S68R
N162H
N162G
N43A
E67S
K219R
S215Y
F210S
N69G
K32E
K66N
P4S
Q43E
T103H
L31I
I34F
E219R
C215Y
V75M
I215Y
G98A
L165T
L210S
N64H
S103H
S103R
G67N
R64H
R35V
R35L
Q219N
G28K
G28E
L215Y
Q138K
G190S M184L
R30K
N162Y
Y64H
T162D
D203Q
F215Y
E101R
D69E
C181Y
N162D
I184V
N219R
R66N
L165A
F188H
I139R
F188L
N43E
I32E
D69I
D69P
E46K A62V
W210S
H219R
A190T
E101P
V75L
K40F
Q219R
I100F
F208Y
M75L
T103R
A190V
A190E
V215N
V215F
V215I
K139P
G67T
D69S
A203Q
K139R
K203Q
M184V
P101Q
D69G
F116Y
I100L
V203Q
L75T
H101Q
G67S
Q151K
F77L
V215YE101Q
I181V
V75I
A190S
I181Y
L74V
V75T
K65R
N101Q
M75I
M75T
I74V
E203K
V178M
S163I
I180F
T200I
I60L
N207A
V202L
Q122P
V180M
V142T
S162C
V82F
V82S
I54M
I54A
I54T
0.5
I54V
0.6
0
0.0
0.0
3
2
−log(prob)
1
0
R211W
P14Q
V75S
V35K
V135L
K83E
I135K
D218T
I179L
K22R
T84I
L209S
I159T
A39M
A39D
A200M
T173K
C162H
R125K
F171Y
D86N
V135I
D123K
R199E
C162Y
T48Q
E36D
I35A
I142T
E122R
D113N
V135K
K49Q
E169K
K49E
K70N
P97A
I135R
P97G
M178F
R104Q
R104N
R104T
R104E
V75T
K122Q
E122I
A39K
G190V
I106M
S173L
E138K
V111G
C162N
Y162N
D207T
Y162D
D207S
V35G
T27A
Q211I
Q211D
Q211W
Q211R
R166K
D40Q
V135R
E207D
E197K
A173K
N123Y
E196A
R173K
R173N
E207T
A162G
A200R
N175H
N175D
R206K
A200T
T142L
P52H
N57K
G123T
G123D
C162D
N175K
V189I
I35Q
I35L
T11K
N123T
E101T
R83K
P150L
N175Y
L109I
Q197E
D179V
I35N
D179S
I35S
G68T
I5L
T27I
K200P
S69A
T39A
M178I
T173N
C162A
C162I
N103K
K211N
M178V
N123D
D113E
Y146F
Y146C
Y146S
I179F
V179S
K101P
Q211A
W212S
R83E
D110N
K43N
N207R
E207S
H174Q
P4T
H174R
I2L
D36A
K65N
H162D
T107K
E6D
I200Q
R101G
V180F
G190E
N81K
A138D
L209V
T39M
T139R
N81Y
T69D
L187S
L135K
L135I
K83T
H96Y
Q197N
Q173N
R173S
G190T
I200A
K83G
A207D
A36G
R211A
I142L
R174Q
R20T
D40K
K43R
D36G
P4H
T84S
G18A
G123K
C162G
Q197R
T39D
K173N
A62P
N147H
A114P
K46T
D186H
W153R
L135R
E204N
G207L
G207A
T48A
H121V
K13S
E169N
N32T
A173N
I200M
I200R
I200T
S200Q
A162T
S200A
S200M
Q211V
Q211K
F160L
M35I
A98G
I47V
I47L
R20I
T39K
E173N
T173L
T173S
D218E
E200V
E200K
E200P
A28E
E200S
M16T
A28G
K196G
V184T
W153G
K83S
V184I
W88G
L187W
K200L
S117P
G177H
G177E
C88R
K103Q
N123K
E122D
A207T
D39S
K211G
K28Q
E32R
K28A
K28R
E32L
K28N
V111L
E194V
E179M
E179L
E179I
E179F
E179S
E179G
T84N
C162T
R173L
D123S
A207S
R196A
V60L
S211Q
E39L
W88C
I159V
Y144S
A173S
K204E
G45R
K103E
K204G
T128N
P4L
T139M
L187C
E43Q
I90V
A139K
A139I
V50L
E200L
F116L
V148E
V148A
F214L
T7Q
R83T
K70E
G207D
G207T
G207S
T211V
T211K
R101N
E177D
I108V
R11Q
S191P
S191A
S191T
I35V
I35K
R11S
K173S
V179M
E174H
E174N
A39S
Y162A
Y162I
I132L
K64N
M135T
R20L
P170L
V10I
R65I
M35Q
R135M
K70T
M35A
M35L
M35N
R139M
D204E
V10G
D204N
M35S
L193I
I165L
D204G
I35G
V118F
R72K
F61C
E194K
V75L
G152D
E194Q
L197E
R201M
L197N
L197R
D36K
W71R
D179M
D179I
D121V
Q197K
E36A
D179G
E203K
K64Y
R83G
A173L
H121N
N123S
S211I
W212L
I31M
S69I
K174E
H207V
K173L
H207G
R211V
N147S
N147K
N147I
V35T
R211K
S3V
V135M
R102Q
R102H
R102M
R102K
R102G
V179G
P14A
E203R
E122K
S3R
G123S
E39N
I180V
V118I
H96P
E211V
E211K
E211N
L39S
S211D
K174L
K135R
K135M
K154R
I37V
A106L
A106V
S134I
I37L
S158P
S211W
S211R
I60V
D121N
E174Q
T27P
A138V
N32R
N32L
E174R
A138G
I135M
L165P
S200R
S200T
I132V
E36G
F116S
P4A
I173R
A35V
A35K
A35T
L135M
Q102K
A35G
T27S
A36K
T69P
R122Q
A138E
A138K
K20T
R122D
R122K
D207N
E200Q
S134G
E200A
E200M
K204N
A190C
D179L
D179F
H162A
H162I
D186N
T211N
T211G
R101T
R101K
E122Q
M16V
R11T
R11K
E89K
E35V
E35K
E35T
K207R
S69G
S69E
E35G
L178F
A162S
E207N
K43T
N103Q
A211V
A211K
A211N
I202V
R39A
K103I
D40E
S134R
K219D
V179I
I165P
R83S
K13E
D169K
E39A
T142M
V75A
R104K
K39S
Q211N
E173S
V179L
K20I
E200R
E200T
N103E
G207N
G207K
D207K
A105S
Q173S
Q173L
E207K
E196G
R211N
K70I
R65K
G190A
E194D
E101K
E101P
S211A
L178V
L178I
A11Q
Q176P
Q23R
R32K
A11S
K201E
K103T
Q91H
D169N
T207N
T207K
V111I
A62T
V74M
V50I
T107P
A62S
D67G
L135T
L34V
V135T
Q207G
Q207V
A190Q
T163R
T39S
A190R
C162S
V179F
K200Q
K82Q
I180F
D67E
A204D
R35V
R35K
L188H
Q207L
R35G
K82M
A211G
R43T
T48S
N177G
H162G
E89G
K20L
R135T
Q101K
A207N
I106L
I106V
E173L
E39M
E39D
M16R
K70G
K172I
K172T
K172R
K174H
I173K
I173N
I173L
I35T
I173S
K6N
K13T
A207K
W162H
W162Y
W162N
W162D
S211V
S211K
R211G
K220N
W162C
W88L
K174N
D186E
L193F
Q211G
Q207A
A98V
F77S
G98V
G98E
E190Q
T7I
H197L
E190R
H197E
H197N
H197R
T162S
E190C
E123Y
T139Q
G190C
G18C
P59A
L197K
Y115F
Y56S
S3I
Y208Q
E36K
K174R
Y208F
K13N
D79E
E39K
R101P
I142M
A98E
Q161E
E123T
M142V
K64R
K200A
K200M
W88S
N177H
N177E
Y162G
H207L
S139QT139A
Q101P
G207R
N6A
K103H
S139A
S139P
K142M
S139K
S139I
N6S
D121C
V74L
Q23H
K49R
C181S
E123D
K28E
K28G
V202K
V202T
K43G
T139P
E48A
E70G
N43G
K174Q
I184W
H162T
Y144F
I135T
E53G
N162A
N162I
N162T
N162G
H121C
E123K
L35V
L35K
L35G
I60L
G190Q
A190S
E211G
E203D
K135T
W88R
D121Y
R196G
V184W
I167L
S211N
R43G
N177D
M35V
M35K
M35G
C134G
C134I
C134R
I165T
N103I
K200R
K200T
H207A
H121Y
T69S
G177D
I2V
R65N
N32K
C181F
A69N
E53Q
T69Y
S211G
V184L
G190R
N104Q
N104E
M210F
I32L
I32K
K82R
D207R
R122P
N103T
K66N
R143G
T69A
Q151K
H96R
K82N
P4S
S3G
S3C
T69I
K43A
L200M
I184L
I8V
L142M
T4S
I8G
I195L
E207R
Q207D
A33V
E39S
Y162T
T142V
R39M
R39D
R39K
E123S
M39S
E32K
F116I
A158Q
L188Y
T69E
Q207T
T69G
A207R
T107S
Q122P
K6A
K219I
D69P
R219E
Q207S
C215V
I202K
I202T
H162S
H207D
Q161P
M35T
A11T
A11K
A98S
K122P
T139K
T207R
T139I
K201R
N103H
A158T
E44N
I100F
S117A
V202L
A204E
K203D
L34F
A204G
R43A
E102H
E102M
Q151H
E102G
K103S
A33G
S163R
L142V
Y144D
C181Y
W162A
W162I
L31M
W162T
L165T
N102K
G190S
W162G
I74L
W162S
R143T
H207T
H207S
E53D
I184M
D44G
E190S
Q161H
I202L
V184M
V5L
I34F
D162T
V139K
V139I
A6D
Y208N
Y208L
D162G
K219R
K219S
N6E
N6D
N6G
R39S
K6S
K6G
E44G
G98S
K219H
N166K
E45R
E49R
N43A
R93G
Q151R
I142V
C145Q
L35T
N104K
Q207N
Q219E
R43Q
R43E
Q207K
S58T
R190S
A158G
K6E
K6D
K201M
R143K
T75L
E44Y
N103S
L210Y
K43E
T70I
L75I
H211V
H211K
H211N
H211G
E122P
T69N
L210S
L200R
L200T
H207N
H207K
D67Y
N162S
Y162S
L210V
R139Q
R139A
R139P
K43Q
N70E
N70T
T158P
I69N
N70I
H197K
D44Y
S210F
A7I
D44K
D44A
N70G
F77L
D67T
F116Y
A158S
Y208H
D162S
E48S
V75I
S68Y
K142V
D67S
A158P
D28N
D28E
T122R
N121C
T122I
G143T
N121Y
T122D
G143K
D28G
E44K
Q207R
I181F
I181S
E44A
A204N
R35T
L210M
F162S
D67H
R143S
S68V
Y215F
S68I
I122P
S69N
R139K
R139I
L210F
I100L
K219N
S68C
H207R
S68R
L4S
D69Y
D69S
K219Q
T140Q
K68V
K68I
K68R
K68N
T140P
K68T
A135T
T32K
K68C
S137N
K68G
F142MF142V
I214F
I214L
S68N
S68G
E102Q
S68T
S215C
S215E
I181Y
M41F
D69A
C215H
C215L
C215I
M75V
M75T
M75L
M75A
M75S
K219E
T70G
C188H
M41V
E102K
N43Q
S215V
N43E
D69I
K70R
N70R
V122P
T75A
A62V
D67N
T122Q
T122K
G143S
T122P
C188Y
D69E
D69G
D215C
N64R
D215E
F210WE70R
P62VQ151M
D215V
D215H
D215L
D69N
C215N
M75I
E215V
E215H
E215L
E215I
D215I
S215H
G67Y
G67T
S215L
T215D
L210W
E67Y
E67T
T215S
T75I
G67S
M41L
G67H
S215I
D215N
G70R
M210W
T215C
T215E
T215V
G67N
T215H
E67H
E67N
T215L
E67S
T215I
S215NS210W
T215N
E215N
T70R
C215Y
S215Y
S215F
E215Y
E215F
D215Y
D215F
C215F
ddC
T35A
0.0
I178V
4
3.5
ddI
3.0
ZDV
P95Q
129
0.0
0.1
0.2
0.3
I84V
I54V
0.4
change in log(rf)
Figure 6.6: Scatter plots between mutation scores of g2praw (x-axis) and g2ptransition
(y-axis) for different drugs. Mutations from the wild type amino acid to a
mutation listed by the IAS list (Johnson et al., 2008) for the corresponding
drug class are colored red.
6 Planning Sequences of HIV Therapies
ddI
0.4
0.8
1.0
0.00
0.05
0.10
change in log(rf)
0.15
0.15
Q151M
0.20
0.25
0.30
0.20
0.25
0.30
0.2
0.4
0.6
0.6
0.8
1.0
0.0
0.1
0.2
0.3
0.10
0.15
M184V
0.20
0.25
0.30
6
5
4
3
−log(prob)
7
EFV
0.8
6
5
4
3
−log(prob)
2
Y188L
K103N
1.0
1.2
T48F
L120I
I5T
Q197N
D40A
R83E
A173P
K122D
K11A
D169H
D169Y
C162K
A207S
K102H
L74M
K83S
A114V
T211W
T215H
I5N
L210G
D121A
V202K
K73N
Q122N
D192T
K139Q
Q91L
K139E
G15V
T39D
T165P
S173P
R20I
K174T
P97S
D121E
L193I
R104T
I179F
K35S
K39R
L195R
G207P
R78S
K32T
E79K
E169R
N174G
M184L
L35S
S211W
C162F
Y171I
T39Q
E35S
M35S
N68Y
V35A
T211I
G123Q
S68Y
I47M
I47L
V184W
L100F
V60R
T7S
K220E
R11A
N104T
E204A
I5M
I35S
K211W
E218Y
T48Q
V184T
H174G
V50L
G177K
K20I
K83E
G123T
K73T
E122N
E89K
E39Q
R199E
K197N
C162V
N39Q
T216S
T11A
I37F
A138T
A173M
T35K
L195M
T7N
D185N
T142F
T142R
I60R
R173P
E169H
E169Y
E197N
S48F
I135P
T35S
Q151K
K6T
P52H
K13I
E207L
Q122I
S211I
C38F
L210E
H197N
R201M
T215N
E32T
Y162K
E44K
A207P
N207P
G177Q
E40S
S3R
T162K
G18V
N211W
E6S
E102H
T173P
A138D
K11N
V135P
T200R
C162T
N43T
W210G
L215H
S88G
I50L
A138V
M178F
I135N
K211I
L197T
Q197T
E207P
R104Q
Y215H
D211W
L80T
K203R
E218T
V142F
D204A
I167M
E211W
M16R
M16I
N174T
Q145K
K207S
K39Q
P9Q
R174T
A207L
Q207S
V142R
V135N
D162K
T195R
V203R
V203N
E40A
T142L
E29Q
D169G
N207S
L173P
K70Q
I167F
Q122D
D6T
Y127F
C38G
H207P
P55Q
H207S
P55L
E123T
K28D
K203N
V60Y
K82N
K138V
M139P
T142S
R35S
E203R
K211E
G18A
R39Q
R49Q
I135A
A28D
S11A
S173M
E39P
R207P
I142F
N215H
V167M
K207P
G134T
G134I
S48Q
F115N
E89G
C88G
L80I
D215H
H197T
S191Y
E203N
G18C
D44K
D169Q
I142R
N177K
S191P
R43M
S191C
A7N
Q207P
F87S
T7I
G15R
V142S
N64Q
K49T
E122D
T162V
W24C
N211I
V106G
F215H
L195K
K46Q
I47N
R32T
A36G
K32I
R49T
E200M
F210G
I31K
K13V
E44V
V200Q
K197T
R99W
V135A
T39P
K49Q
Y162R
Q101I
K13N
E174T
I60Y
N39P
L149F
T39I
E29G
N104Q
S200Q
D211I
V108A
E122I
W88G
R166I
R169H
R169Y
E197T
Q91E
K138T
E200Q
E6T
E53G
R173M
R39P
A139P
V215H
Q101T
R104N
I184W
I142S
E211I
E6Q
E39I
D207P
H174T
Y162F
Y162V
L187F
K139P
S39R
S107P
V202T
N177Q
T195M
V148M
Y146F
E169G
Y146C
K35A
G141R
V148L
R143S
Q204A
T162F
M164K
E6A
I181F
G93R
T207P
E28D
D207S
V200R
T139P
S68T
I31M
I35E
D44V
E169Q
K138D
I184T
I109M
T39L
K40S
I47F
K39NK39P
K43M
T173M
R206K
R11N
K207L
S207P
T11N
L197R
I2L
Q197R
M39P
N68T
P48Q
V16R
V16I
L142F
L142R
A117T
T200L
I200Q
Q207L
E53K
M164L
V142L
L195T
W210E
K123T
R43T
E32I
N147S
E40K
D162R
N207L
N147K
S200M
L149I
K139I
K139V
R32I
D162F
T207S
V184L
S69P
Q49T
N219G
D162V
I180F
E44G
R166T
S39Q
Q43T
H207L
P217Q
E200R
V21G
R199K
P59A
R169Q
L142S
R169G
K32Q
K43T
R143G
D69P
A139V
L173M
N54K
I164K
N73T
K39I
Y215N
Y146D
K6N
N39I
P119L
S3G
V167F
G134N
L215N
P59Q
R139I
H197R
S191A
D110N
F215A
V180F
T139V
D44G
H174N
M35A
R39I
E204N
N220R
R162V
N220E
P35S
I10G
R32Q
Y116L
D207L
P176Q
E200L
I35A
K197R
S139P
H121Y
Q197L
L210Y
L35A
R139P
N32T
K40A
Q145E
N47F
P119T
S39P
S11N
D215N
T195K
V200L
K73M
S200R
F210E
E197R
Q197P
P27I
Y162W
I200R
S68R
A69P
D6N
P119S
V215A
A7IA36K
E207R
E39D
D207N
N39D
R125G
I142L
I139P
E43T
Q23H
T35A
R70Q
E194Q
T207L
V62P
E43M
E44D
E32Q
E207G
T70Q
P4S
E218H
M35E
V62T
I179M
N162G
E219G
Q219G
H208L
T211D
S139V
A207R
D204N
K39D
T142K
F215N
E35A
Q142F
Q142R
Q142S
A138S
E194K
W210Y
N211E
I69P
K35E
S211D
D162W
R139V
I2V
I200L
I31V
S39N
R35A
E79D
R196Q
R166Q
A207G
I10L
S162G
Y188S
Y146S
A204N
NV148G
147H
S69E
R103I
V215N
I179S
K102M
V178F
E6N
S191T
Y116S
C188S
A162G
N69P
E39L
E70Q
N207H
F208L
E70S
K125G
T35E
N39L
K49E
N32I
T215S
I178F
V179M
V203G
F210Y
E211D
S39I
R49E
F215L
A200K
S68N
V142K
Y162T
T69P
K139R
G70Q
K219G
E122Q
N69A
T165L
Q102M
D69E
G155R
S69I
A107P
V159L
N32R
I109V
D207H
K203A
N175Y
K211D
K39L
E102M
D69I
R219G
K203V
K203G
I181V
S215E
I184L
Y57K
S200L
K101R
A138Q
G211Q
V179S
Y188N
R3G
Q219D
M16V
S162N
A190V
Q145C
V215L
R103E
L210S
Y208L
K103I
I142K
E203G
K211S
S176Q
Y181F
K197L
T11R
A203Q
I139V
K197P
A138G
K207R
N32Q
C162G
T215E
H197L
D162T
N54I
E203V
V179F
H197P
E203A
Q204N
K28G
E197L
C188N
W162G
K173Q
F87L
E219D
E197P
D196Q
N207R
T69A
P4H
Q207R
A69E
K207G
K173E
Q211A
N73M
A162N
I132L
E173Q
L215S
K138S
T27S
R135M
T35L
K135R
H207R
L210M
R101G
F210S
N207G
Q207G
S69G
I142T
E179L
K173N
K101G
A190T
L215E
E30R
K102N
A190C
L4H
V8I
N211D
I50V
S39D
G177N
I179G
D169K
V203Q
E219H
T173R
S4H
W210S
V132L
H207G
R219D
D86E
N219H
Y162G
P35A
K219D
D121Y
K103E
T211G
I69E
L142K
W88S
Y215S
Y188C
R35E
Q49E
E173N
Q219H
I179A
E36D
K102Q
G203Q
D69G
A173Q
R28G
I159L
P58S
R211Q
A28G
Q102N
V179G
S215C
K101I
Q207K
N211T
T162G
E203Q
E28G
K203Q
K135M
T215I
R101I
P4T
D204K
G211A
W210M
D40F
N215S
S39K
R172K
D207R
K138Q
H175Y
N69E
A69G
E102N
L4T
S173Q
E204K
V179A
K101T
R103H
R173Q
R101T
Y215E
T207R
S4T
D215S
K82R
H208F
N207K
A69I
T173Q
K211G
T69E
A179L
F210M
D207G
E169K
A173N
T35M
D162G
H4T
E200I
S211G
I179L
F215S
A173E
V106L
N162H
S39L
K197E
F162G
T207G
R162G
T215C
D6K
T173S
T75A
H207K
E173I
V179L
K173I
N43Q
W88C
K138G
K219H
R219H
S173N
R173N
A98S
S173E
R103T
D121H
R211A
N211G
A173I
D162Y
T173N
C162N
Q219R
N69I
Y188F
N67E
F215E
L215C
N69G
T69G
G67E
V215S
V111I
T165I
T48E
V189I
D215E
S162H
T75L
L173Q
R173E
A162H
E40F
T142M
S211Q
E211G
R190T
L173R
C188F
R43N
K211Q
T162N
K122P
L215I
T173I
N215E
M135L
I69G
Q142L
Q142K
T69I
I135R
V215E
K211A
T
173E
S200I
V142M
N67G
V200K
I142V
A204K
Y181V
R173I
K43N
S211A
F162N
N207A
Y208F
W162H
N123S
I142M
T173A
N211A
L173N
T211A
N211Q
R190C
Y162N
T211Q
S48T
A190Q
R169K
Q207A
L142M
V135R
E6K
F34I
S173I
Y215I
K207A
Y215C
D207K
E69G
H207A
V75AA36D
M200K
N211S
V75T
H211Q
L173I
H64R
E211A
L173S
A207E
C162H
N69S
E6D
Q101H
T200K
N215I
D211G
D162N
I135M
I31L
Q204K
L173E
E219R
R174K
S48E
K103T
Y188H
D215I
R35L
E200K
V21I
K103H
E203K
E44A
F215C
A158S
E219Q
E211Q
T162H
C188H
D211A
P27S
E169D
D215C
K43Q
F215I
R190V
H211A
V135M
T69S
K101N
K102R
R43Q
G196E
T207K
E43N
I200K
R101N
N174K
R135L
L74I
S200K
L173A
D44A
E102Q
R122P
N215C
Y162H
C162A
V75L
V215C
K40F
Q142M
G123S
K36D
D211Q
G190R
R162N
E53D
V215I
G211R
E174K
I179E
G123N
D162H
K135L
Q219N
K64R
Q200K
V106M
Q102R
D177E
R35M
D207A
T69N
D123N
E43Q
E102R
F162H
D123S
E39A
V179E
E219N
K219R
K123S
F162A
H174K
R162H
Y162C
V178M
T207A
N39A
G172K
G123E
A203D
I178M
P48E
T39A
R39A
I135L
N64R
I106A
K123N
A189I
Y162A
K219E
E122P
V135L
R219N
N68G
K219N
Q122P
E123S
Q151M
T162A
V178I
E123N
K39A
D162C
T101PR101H
M178L
K101H
V203D
R169D
K203D
A190E
Q207E
N69D
T158S
A123S
D162A
N207E
F210W
I60V
K207E
V135I
E203D
H207E
G203D
I122P
R190Q
S98G
A98G
M75I
R162A
Q142T
T75M
M39A
I184V
E101P
V178L
A123N
T69D
K35V
H101P
M106A
S68G
K174Q
W210L
I178L
V106A
M135T
T211R
V179I
V200T
S39A
K219Q
E32K
L74V
R190E
N211K
T70R
Y171F
E179D
S200T
G123D
E200T
A162S
A173K
H101E
V200A
L41M
L135T
A75I
D207E
I179D
V106I
I74V
V184M
V75MK103R
V179D
K101Q
Q101P
R101Q
M200A
L75M
L75I
F100I
S211R
K211R
V108I
F215Y
F210L
E200A
A179D
T200A
T207E
T68G
N174Q
E70R
M35V
T75I
I35V
G177E
L195I
Q142V
N211R
L35V
V215Y
K101P
E197Q
I200T
I214L
R32K
T101E
R101P
V75I
S173K
V202I
I200A
N177E
D196E
T166K
T35V
E123D
R70K
S200A R103S
A196E
E174Q
I181C
R135T
E43K
K68G
E35V
R174Q
K197Q
E211R
R173K
R35V
Q177E
G70K
Q122K
T70K
Q200A
Q101E
R196E
K123D
V135T
L100I
C162S
S107T
I135T
A36E
R201K
H174Q
E102K
Y208H
I166K
T173K
E122K
V181C
G177D
V180I
N104K
R166K
A138E
A190S
R22K
G67D
N67D
I118V
L214F
E70K
T195I
K135T
N177D
D211R
K101E
I184M
K36E
N32K
R104K
T162S
Y181C R190S
Y162S
I109L
L173K
R101E
F162S
D196G
D162S
A196G
R162S
E45G
P27T
F144Y
R65K
E218D
A14P
P35V
G138E
R196G
K138E
Y57N
F34L
F115Y
P58T
A157P
C163S
G134S
A117S
V62A
Y113D
I10V
S58T
Q142I
L77F
E152G
G172R
G212W
R190A
E104K
Y116F
R45G N22K
A107T
I214F
R99G
E30K
N65K
N166K
R152G
0.0
0.2
0.4
0.4
G190T
G190C
G190V
G190Q
G190E
K103S
C188L
G190S
Y188L
G190A
K103N
R103N
0.6
0.8
1.0
1.2
change in log(rf)
0.5
0.6
5
6
7
NFV
I54T
I82S
I54A
0.7
S63F
T4QS63G
K14S
I13K
E34I
I15M
I66M S63R
Q18K
T4N
L63G
E16W
Q18T
L63F
Q7D
G94C
V32G
Q18A
P79L
F53V
R41Q
E17R
H63R
K55I
Q18I
T80S
R8A
C95L
Q7R
E17A
Q2I
E68R
A63F
A63G
C63R
V56L
R41T
C95S
S37I
Q63R
I13R
I72S
T63G
V15M
K41Q
L38M
C95I
T63F
Q58K
T4I
L63R
P79H
I11R
L19R
N98D
D29E
R14T
T96P
K69E
D29A
S4Q
E16D
V63G
P9L
E35K
R8L
R14E
V63F
L38F
D29V
I15K
K37I
A13R
Y67L
K41T
G49R
I72E
G94D
R72S
I85M
L89S
R8D
R55I
K70Q
I19R
D65K
A63R
S37H
I84L
Q18R
K18T
K18A
Q92H
K14T
S63C
E72S
V15K
A74R
R20S
T72S
S4N
N55I
E63F
E63G
K14E
E67W
N69T
T63R
V85M G48IK20S
L15M
V33M
R70N
I72F
R14Q
V63D
Q2L
A63D
L89V
N7D
N7R
K55Q
F53I
H69T
L10S
P19R
W6G
L54F
V82D
I82D
S63Q
V63R
E70Q
S4I
E70N
A63E
N98I
L13R
A12E
K14Q
D67L
I11F
L10C
N37Q
L90W
S37Q
I36A
M89ST74R
G48Q
Q39T
K18I
G48R
N98K
R69T
E63R
Q7E
E67S
H61R
R72F
P9T
E21Q
W67L
W6R
G6R
R70Q L90V D30V I54F
L6R
N14T
N14E
E37Q
L15K
I33M
P39S
L63C
T72F
K43I
N12A
V20S
E72F
K37H
P12K
N12E
D60N
W6L
Q92E
N12I
K70T
A91I
I19T
E12Q
N14Q
H61K
H37Q
G51A
S37Y
K69T
V63N
T12Q
S67L
R55Q
R70E
K18R
N61K
Q61K
N69Y
A63N
E63D
I97F
I12A
Q61R
H63Q
N37Y
E21K
R43I
D30Y
N61R
Q18L
Q18E
M36A
E61N
H37Y
A63V
K37Q
N55Q
E67L
H69Y
E37Y
Q39SK20L I82G G48L
V89S
C63Q
E61R
E61K
T19Q
S37E
A74K
A12Q
T63C
E12K
D35N
T19V
Y69Q
I12E
A63C T74K V82G
H61D
L63Q
N12Q
P12S
K18L
K69R
L64M
S37T
T72R
E70T
E67Y
K18E
T39S
N88K
I12Q
R69Y
T12K
I20T
I82C
V63H
A63H
K37Y
T74E
V12Q
A74E
N37T
D67Y
T72E
M46V
E63N
H37T
N69Q
Q61D
V19Q
K37E
E35N
L19V
I82M
V63C
N61D
V82M
R70T
A63Q
E61D
K69Y
V75I
G16A
E37T
V63S
N12K
V33I
L19Q
A12K
I19Q
A63S
N7E
E12S
I66F
M72L
I12K
I19V
F99LG48A
H69Q
T63Q
E12P
V82L
K45R
Q61N
T12P
V82CI82L
V12K
V82I
E63C
T12S
S67Y
R69Q
I10F
K37T
V63Q
Q61E
P19V
L24F
G73A
K92R
E16A
K69Q
E18H
T36V
I36V
P19Q
T74A
Q92R
N12S
L89I
A12P
A12S
L18H
A63L
E63Q
G73C
N12P
Q18H
I12S
M89V
L54S
I12P
I24F
E63S
V12S
A71T
I72L
V63L
T36L
L76V
V12P
A37D
I36L
R16A
H92R
R20M
M36V
L82S
K18HM89IT74P
A82F
R43KL33F
A74P
L10V
E72L
G48M
T72L K20M
L54M
I54S
L82F
T37D
C37D
Q58E
K43T
V32I
M72V
E63L
L24I
L54A
V89I
Q37D
M20T
M64V
L72V
I72V V33F
V71I
V82F V82S
I64V
V15I
I54L
Y37D
G73T
R72L
N55R
A82S
I82F M36L
R43T
A71I
F82S
V77I
S37D
I13V
M93L
M20I
L64V
K20R
V20M
H37D
I54T
L54TI82S
D60E
I33F
T72V
T71I
P10F
I54M I54A N88S
R20T
E72V
Q39P
L10F
N37D
I62V
V3I
K20T
G48V
E37D
I93L
T74S
V10F
R72V
A13V
D67C
A74S
N60E
V33L
Y67C
T39P
E16G
R20I
K37D
I37D
L15I
M46L
K20I
I63P
T72I
V20T
W67C
M89L
E17G
Q63P
M62V
E70K
S63P G73S
D65E
K74S
L63P
R70K
S4T
A71V
R63P
P10I
H63P
L10I A63P
C63P
I33L
L82A
V20I
L13V
V10I
I11V
T71V
V82A I84V
N55K
V93L T63P
I82A
S88D
S67C
E68G
G63P
V63P
M63P
E63P
A91T
E67C
L54V
N63P
L46I
M46I
I54V
L11V G67C
D83N
I97L
V50IV47I
L90M
0.0
0.1
0.2
0.3
0.4
N88D
D30N
0.5
change in log(rf)
LPV
ATV
0.0
0.1
0.2
0.3
0.4
change in log(rf)
0.5
0.6
0.7
0.0
0.1
0.2
0.3
change in log(rf)
0.4
5
4
3
−log(prob)
2
I54T
V82S
I54A
V82F
I54M
1
6
5
4
3
−log(prob)
2
I50V
1
E34G
K20Q
K70N
V63K
R8D
V63M
L10P
L24S
V63R
V63D
W6S
C63K
D25E
P79T
K55I
R14T
Q2P
A13R
E21G
N69E
Q58K
R8P
D25Y
E34I
L38M
T4I
L90V
L93V
T96P
H69L
Q58R
C63R
C63M
V13R
I82D
N41T
Q2T
R14E
R70N
Q92H
M72F
E34K
Q2D
T19R
C95I
N98H
R70Q
C63D
L6P
P9H
R14I
N98S
K69L
N37T
V15E
D35K
H69E
L10S
V56L
I85M
T71M
G48R
I20Q
K69E
L10R
N98D
D25N
Q19R
K45T
M46K
L93F
D29N
C95S
K92H
R41T
R20Q
G48W
I93F
I93V
I15E
Q58H
I13R
I84L
K45N
V15K
V10P
G86E
W6R
C67L
V63N
P39T
T70N
N41I
L6S
K43E
V10C
R55I
E70N
S67W
K41T
V82D
L10H
P79S
A12N
N41S
K37Y
M36K
E70Q
I15K
R8T
S37H
G68D
V11F
W6G
I47L
R41S
L33M
K34I
K7T
K7D
I36K
D29V
L10M
V10S
S12A
C37H
E35K
K12Q
I82G
I3F
L63S
R43E
K41S
E67L
R41I
T9H
V19R
I47K
E34D
T37Y
G17E
R87K
V33M
D29A
D29G
M89S
C95L
P12Q
G34I
Y67L
P12K
N83K
K21G
L19R
I41S
A37H
Q2L
Q14T
Q14E
A13L
L13R
T72R
D83K
N37Y
Q92L
S88Y
K43N
K20T
N88D
S12N
H37Y
C63N
S88T
G48E
V10M
K41I
V32A
M93F
V48I
P10M
V63H
E61K
F99L
L64M
V71M
CD29E
37YS88H
C37T
V10H
M36A
R43N
C67W
S37Y
L10T
V36A
P10S
V63Q
N12E
I13M
S39T
I36A
L89S
K20M
Q61R
K45I
K41N
K37T
V77L
V63S
N3F
L38F
L6R
I33M
Q19T
S12E
L39T
C67S
T12Q
V12Q
Q19V
L38W
I13L
Q19P
R10S
E67W
Q63T
V82G
P10T
I50M
L20Q
A12E
Q39S
C63H
P10H
K92L
V10T
K35Q
D35Q
L38I
V10R
A37Y
Q35G
L63Q
D67L
Q39T
M93V
H63S
I77L T82D L76I
L6G
E72R
A12Q
E67S
T91A
L11F
Q14N
Q14I
I12E
R69Q
S37T
V12E
N12Q
S63Q
Y67W
V54F
R10H
H61R
T12K
V72K
K61R
I12Q
M72R
K43I
S12Q
V48R
H63Q
I72K
G51A
E12Q
E61R
E12K
V48Q
Y67S
V48W
N69Y
K20V
R10M
R43I
E35Q
C37S
I3V
V12K
T20M
I54F
M20V
P79A
V19T
R12E
V19P
Q61D
Q92E
I82L
T4P
I12K
A63Q
L19T
D67W
R12Q
E72K
A12K
T72K
L19P
D67E
A37S
N83S
C63S
E61H
V82I
S12K
N12K
K69Y
K7E
T12P
A37T
H69Y
N69R
H92L
I72L
L63T
D12Q
V15L
V63T
V82S
V33I
I37Y
K37E
A71I
P19I
M15L
K92E
I91A
I64V
A71L
L19V
S63T
C67F
I15L
E34Q
N3V
F66L
M36V
R10T
E16A
T82G
H61D
C63Q
T37E
D67S
V11I
R72K
H12Q
V82L
A82L
E72T
G73C
Y37E
L72K
H63T
I36V
E63Q
H69R
E12P
C37E
G34D
Y69Q
K61D
K69R
T20V
A63T
I20V
R12K
E67F
D35N
V12P
E61D
N83D
T91S
M64V
I20M
M72T
N37E
A12P
I37T
S12P
R20V
M72K
I12P
S37E
K35N
G16A
R20M
N35G
D12K
N69Q
V48E
T71L
N12P
L66V
K69Q
D61N
K45R
E35N
I20T
K35G
E37D
I72V
Q37E
E72L
I20R
Y67F
R72L
H69Q
H92E
F66V
T72L
R20T
H12K
Q61N
I15V
F53L
I91S
V48A
A37E
Q92R
M89I
C63T
A74P
V71L
Q19I
H37E
T19I
L64V
M64I
G73T
I62V
D63T
T71I
I36M
D35G
C37N
R12P
D67F
M36L
K34Q
V19I
I71L
H61N
E35G
V36L
V10L
T72V
E63T
V54S
T82L
K37N
E72V
N60E
K57R
K92R
E61N
Y37D
L89I
L11I
M72L
K61N
R16A
L19II36L
D60E
G34Q
T37D
G48M
A13V
I89V
C37D
V71T
K37D
H12P
I54S
M89V
R72V
I37E
V71I
K69H
L20V
S37D
S37N
N37D
L24I
M89L
T82I
S73C
R70K
M10F
R43K
A74S
T82S
V46L
L64I
I85V
E72I
L20M
S88D I82F
E35D
H92R
V48M
A37D
I45R
T10F
S12T
Q63P
M72V
R19I
Q37D
T72I
I82A
I10F
A12T
I13V
V54T
I20K
N12T
L89V
I54T
D65E
T63P
K41R
L63P
H37D
V12T
V63P
L10F
R20K
K43T
L20T
V33L
H63P
R43T
V82F
I63P
I75V
S73T
L82F
P10F
A37N
Q58E
V54A
I33L
I54A
I12T
I43T
R14K
E70K
S63P
V32I
S39P
S74T
A63P
E7Q
R12T
A82FV10F
L20R
L10I
E61Q
P10I
V54M I47V T82F
K65E
H61Q
M46L
T1P
L39P
L6WI37D
I54LI54M
V54L
V82A
V46I
K18Q
I37N
V10I
D63P
L76V
L1P
Q14K
Q39P
R10F
M72I
L46I
M93I
Q65E
H18Q
C63P
L90M L33FI33F V33F
N7Q
I14K E63P
L62V
T82V
L62I
S73G
N30D
A74T
F97L
H12T
R7Q K7Q
M46I
I54V
L20KR10I
I23L V14K
I84V
T82A
L66I
V22A
S88N
E40GV48GV66IF66I
L38M
Q18K
K70Q
G94D
I15M
R14V
K14V
I66N
K43E
Q18A
T31P
R41I
E21G
L5I
V15M
K43I
N37Y
G48Q
D25Y
I72S
S37H
P1L
L76I
L24S
K57S
D29V
K55I
L93F
T96P
T80I
N41I
K37I
L5R
M93F
K41I
I66M
Q19M
V10C
E58R
T37Y
D65Q
T91I
M72S
R41T
T12Q
H69T
I93V
L2S
N88T
G48W
P19M
R8L
I85M
E21Q
E17R
D29E
R43E
R70Q
T72K
I36A
L15M
I47L
G16R
V32A
Q18R
A37I
E21K
S37I
I36K
T37I
T70Q
T72S
S39Q
N37I
E37I
V10P
S37Y
N41T
Y69T
M36K
E70Q
R43I
D67W
T19M
N83K
T31S
G48M
H2S
K41T
R69T
P39T
M93V
W6R
T80S
S37E
D25N
D67L
L5F
L10P
M36A
A37Y
E72S
I3F
L10C
E37Y
T74K
V32L
D25E
H37I
K69T
V62M
D35N
W6C
K41N
R92H
L72S
L10T
K37C
N69T
H2P
H18L
A12Q
T31A
L63S
E18L
K37Q
T8L
A63Q
L38I
E67W
L93V
A37Q
L6R
E67L
M72E
M72K
H37Y
H2L
I62M
S67L
S37Q
S12Q
A74K
Y67L
N85M
F66N
F66M Q92H
P19T
A37C
Q18L
E37Q
I19V
I72R
G48E
S37C
G73C
S39T
N37Q
L6C
E37C
S67W
I66L
Y67W
I82D
N37C
E92H
Q61D
P19Q
T37C
G51A
K12Q
E35N
T37Q
S74P
E67D
Q39T
A63C
K72S
H37Q
E34D
H37C
M72R
K55N
R12Q
Q63C
I12Q
T91A
K92H
Y67E
N12P
Q35N
S67D
C63T
I10R
V82DI54F
N83D
I3V
L54F
I37Q
A82M
Y67D
I32L
G48A
T72R
E60N
T12P
D60N I82G
N83S
L19V
I37C
P79A
D67F
V64M
M72T
I12K
L63T
Q61N
L64M
E72R
I64M
V10R
L24F
S88T
V82G
L10R
A12P
M46V
P10R
Q18H
K20V
G16A
L72R
I10Y
A63S
Q92K
S63T
L19I
S12P
V10Y
G73A
L10Y
Y69Q
H61E
E67F
I82L I82M
H69Q
P10Y
P19V
R69Q
Q12P
E34Q
T19V
I36V
A71L
M36V
K69Q
F66L
Q63S
S67F
I63T
I20M
Y67F
K12P
H63T
V71L
Q19V
A63L
N69QT71L
I12S
I20T
A63T
G73T
T37N
I82C
R12P
T74P
I72V
Q63L
M72V
I13V
A71T
V15I
R20T
A16E
I85V
I12P
V82I
R20M
V82LV82M
A74P
Q63T
T20M
S73A
M89V
N88D
P19I
I20R
D61E
F53L
N61E
I71L
F66V
E35D
C37D
K72R
V71I
L64I
L15V
T69Q
I36L
T19I
L10V
K20M
S37N
M36L
L89V
A71I
V82C
Q61E
G16E
T72V
E37N
V36L
T36L
Q19I
R72V
T74S
K20T
A37N
G73S
Q35D
S73T
Y37D
R69H
L76V
T71I
A74S
I82T
E72V
K74P
L72V
M36I
L54S I89V
M93L
K37D
E68G
A12T
V20M
A74T
Y69H
K20R
E37D
S37D
I54S
S12T
C73A
L24I L54T
K69H
R61E
H37D
I62V
M93I
E72I
N85V
M72I
I82S
I50V
T10F
A37D
I47V
T37D
N37D
T72I
P19L
V20T
V33L
V33F
L33F
Q37D
N69H
L54A
R70K
G48V
K74S
I10F
V82TI54L
L93I
L15I
L63P
S88D
V63P
I37D
I33F
V10F L54M
N41R
L10F
I82F
A82S
K41R
F24I
P10F
T19L
T70K
C73T
A71V
N63P
I77V
K12T
T88D
S63P
E17G
D67C
L6W
E70K
C63P
K57R
K72V
L72I
S39P
Q39P
T63P
Q19L
E7Q M46L
T71V
R12T
D65E
A82F
I12T
L2Q
L90M
I63P
V20RH63P
E63P
Q63P
E67C
E58Q
Y67C
A63P
K98N
I84V
C73SI98NN7Q
S67C
I82A
R63P
N30D
R7Q A7Q H2Q
V82A M46I
L46I
V46I
V22A
L54V
6
7
8
APV
6
0.05
change in log(rf)
Q2S
4
0.00
4
7
6
5
4
−log(prob)
3
2
G48V
0.25
change in log(rf)
T4L
G94C
E16W
T91I
V32G
R8V
Q18R
A63F
G94R
D67V
H63K
T4Q
H69P
H37V
L64R
L63R
H63F
Q63F
P19A
L15S
Q7A
I15M
S63F
D65K
K69E
T80K
E67V
S37Y
E37V
V63F
S63K
L90V
G68R
A37V
D65H
C63F
V56L
K55I
P39L
L24S
V72S
I84L
T96I
H69L
Q61K
E16D
P9L
I36R
I36T
K69P
N41Q
L90F
I63F
V15M
E67R
L89F
G40V
K14T
I15E
V3L
R20Q
A63R
L76S
V15E
N69P
Q58R
S37C
Q58H
T96Y
G48I
W6S
G48Q
V71D
V33M
N69E
L19R
T96N
Q39L
G40E
R14E
D67R
H63R
S67V
A71D
K20Q
L10P
I36A
E21V
Q19R
N41I
Q2I
I63K
A71G
N98R
N98S
E63F
I3L
V71G
I93F
L6S
E67G
L10S
Q2T
M89F
R8T
S67R
Q63R
K20S
V10P
K20A
D25N
I66M
T19R
S4Q
R20S
S4L
L76I
R20A
C63R
C95L
I72M
I3N
I19R
S63R
R41Q
I72S
G68D
D63F
L15M
L15E
K69L
I3F
M36R
M36T
V63R
T69P
D65Q
V10S
W6A
E12R
E72S
T72E
A11R
A11C
T26S
R14Q
D67G
A12R
K69T
N7D
N7A
E21G
K41Q
D29G
G48R
T72S
R72S
Q18I
N69L
I89S
N41T
D25Y
N63F
I63R
T96S
V77L
I66N
D8V
A71M
S67G
G6S
T4I
I36K
T96P
V13L
L54F
KE63K
14E
N98H
K43E
W6P
N88K
L10H
P19R
M72S
I77L
T80S
N55I
L10M
R41I
G49R
K70E
R43E
V10H
K37Y
G63R
Q2P
I12R
E34D
Q92H
G6A
T71D
T37Y
N7L
Q2H
K41I
N30V
G48W
L89S
T71G
H63V
K14Q
V10M
K37C
Q8V
S11R
S11C
E63R
D29V
T37C
N88H
N88Y
N69T
L6A
L6P
D29N
M89S
S63V
G6P
T69L
V93F
E21D
Q18K
T74R
S4I
M46K
D25E
R70T
R41T
I72E
I54F
N12Q
I13L
C67L
N98D
D63R
D35G
M36AM36K
T1L
C67W
D29E
K92H
K41T
C37Q
V20A
V20S
S12P
D29A
V46K
P39T
I12A
H37Q
M72E
S63C
S37Q
T74E
H37Y
I85M
N63R
L10T
F10M
D8T
K43I
Y37Q
K70Q
R70Q
E37H
E37Y
H37C
N37Q
A12E
Q92L
I23V
T71M
N7K
T70Q
Q39T
L82D
A37Y
N98I
T14E
H63C
R69Y
L24M
G48L
I93M
E67L
Q18E
H67L
S10P
R43I
V12Q
E37C
T37Q
V10T
A37C
S12Q
G34T
V20Q
G67L
H18L
W6R
F66N
T19Q
I19Q
N88IV82D I82D
E37T
E67W
T14Q
D67L
R70E
C63A
Q8T
D67W
L6R
H61D
G6R
E21Q
S12N
G34V
S10M
N98K
H67W
K37Q
N70Q
E37Q
W6C
I64M
K92L
A37Q
I12E
D67F
Q63A
E18L
E61D
S67L
S67D
G34A
T74K
G51A
Q61D
I63C
Q92K
V72K
I63V
T12Q
A63L
V63A
L6C
P12Q
T4P
T91S
S67W
Q18L
L10R
F10T
H63A
I12Q
A12Q
G73A
I72T
S10T
P12N
G48E
S63A
E12Q
P39S
Q12K
M46V
V33I
Q39S
D21Q
Q92E
S10H
T37S
I85L
V12K G6C
L82G
S67E
V10R
T12N
Q35N
L64M
L10V
E67F
P19Q
L82M
Q69Y
V82G I82G
K37S
N63C
E18H
A12N
H92L
K92E
N43E
L10Y
W67Y
M72T
E63C
V77I
N12K
Q7E
I72R
E63V
I82L
H61N
T10Y
H37S
T72R
Q35G
G48A
A71L
L63T
I12N
T19V
E12N
C67Y
I72K
S12K
H69Y
N35G
S4P
I63A
E61N
I82C
T12P
R57K
H69R
V63L
H7E
R7E
R72K
N69Y
K69Y
E72K
H12Q
G34K
V82L
Q61N
F53L
N43I
H69Q
V10Y
H63L
N88T
E21K
C63L
G34D
L19V
E72R
V82C
S63L
Q63L
T72K
F10R
I12K
N7E
A12K
I36V
T12K
D21K
A37S
V82M I82M
E12P
K69Q
V71L
K69R
H63T
L89I
S10Y
N69Q
A12P
P12K
I19V
S67F
Q63T
G48M
I66F
N69R
M72K
E12K
M72R
E67Y
Q19V
A63T
H67Y
I63T
E37S
A74P
E63A
P19V
F99L
G67Y
S63T
S10R
M89I
A22V
Q18H
H92E
D67Y
F10Y
S74P
T71L
I12P
K14R
G73C
H12N
I63L
N63A
T36L
G63T
G16A
A74S
T69R
I89V
C63T
M20I
T69Y
M36V
N69K
E92R
H12K
D88S
D35E
T74A
A91S
I13V
S67Y
A11I
T20I
A7E
E16A
R20M
K20R
K20M
I36L
V36L
V72L
R43K
I50V
D63T
L89V
T69Q
V63T
M89V
I62V
E63L
K45R
K55R
S37N
C37D
I54L
M64V
K20T
R20T
T74P
A71I
I64V
Q92R
N63T
T19L
I72L
D63L
R72L
D60E
N63L
R72V
Y37D
K43T
M36L
I45R K92R
L64V
I19L
N88S
S11I
V15I
I72V
E72L
R43T
T72L
H12P
E63T
T72V
I82V
S37D
Q58E
V71I
T37N
E72V
M72L
T74S
Q35D
E12T
V32I
A12T
N37D
L54S
H37D
H37N
L76V
G73T
I54S
K41R
T14R
K20I
K37D
H92R
T71I
T37D
K37N
R20I
N88D
V20M
M72V
N55R
L63P
A37N
P19L
E17G
A37D
M93L
E37D
F24I
M89L
I85V
I93L
V33L
Q35E
N60E
L24I
T63P
T71A
I12T
L54T
L15I
I33L
E37N
N35E
R70K
M36I
V20T
A82S
L82SI54M
D17G
H63P
V46L
L54M
E67C
E16G
Q63P
G6W
A63P
V93L
V82FL54AI82FV82S
A82F
L82F
S63P
G48V
C63P
F33L
I37D
M46L
D67C
I63P
K65E
N43T
T10I
L10I
V20R
D65E
V20I
N60D
S67C
S4T
L1P
S79P
G73S
M62V
G63P
V10I
M62I
V63P
A11V
I84V
N7Q L82A
S10I
N63P
G34E
D8R
V46I
F38L I11V
E63P
L54VV82A I82A
T1P
F10ID63P
N43K
S31T
L62V
L90M M46II54V
S11V
Q8R
H59Y
L62I
L46I
V47IL11V
I23L N30D
change in log(rf)
−log(prob)
1.5
IDV
1
4
3
0
1
2
−log(prob)
5
6
SQV
2
2
6
1.0
T165P
S123R
D86H
I202T
F171V
V148G
T142N
F61L
V148L
K174T
K83G
N175K
R83E
K20I
T11A
R70Q
V142N
K39Q
F171I
T7N
R20I
D6S
K11N
D76E
T142R
T142S
S48Q
S48F
K174G
K13I
Q102T
R72T
C162K
D76R
E79R
N104T
D86Y
Y146F
V142R
V142S
Y146C
G211M
H174L
N54D
L195V
P119T
T7I
F115C
P52L
R49T
E79Q
I135P
R174T
R174G
F214V
Y171V
K64Q
G15V
E29K
I167L
L74M
E86N
T35S
A44N
P217Q
S48P
I37N
R172I
V202L
Q151R
I159Y
E28D
R78K
K83E
F87S
W88G
A162V
L193F
E89G
R135P
F214S
E203Y
E174T
P55Q
Q161E
D192T
E169H
E169Y
E32T
I37F
G45E
K173M
M35A
E6S
D113E
A162F
F171L
I135A
H121N
K11R
V202K
P140Q
K13N
R199E
D123R
Q197T
R122M
R122D
K126E
Y171I
K73T
E197N
A122T
A122M
A122D
E86H
E86Y
I165P
K39P
A44V
D162F
K220E
S48L
K73N
R72K
E174G
E39Q
D6A
L215H
V60S
E102T
T11N
C38S
C38F
N207L
I60S
R172T
H174T
D44N
Y162K
N175I
F116I
K6Q
D169Q
N32A
T11Q
R135A
M41V
C38G
N67H
T216H
E44N
S3G
R199K
D169G
T195M
E218Y
V10L
K197N
Y171L
K32A
I195M
E39P
A36G
P176L
T173M
N57K
Q161H
H121E
T27L
T69H
M16R
N68V
N68C
Q173M
H174G
I167F
P119L
C38W
K39N
Q151K
E122M
V135P
S173M
D215N
T215H
D6Q
R206K
Y146D
E122D
C162V
E122T
D44V
E169R
V50L
R125G
D186E
P197N
A44G
D162V
M16T
T128P
R68Y
R68V
R68C
K43T
F215H
P119S
Y175I
E194G
I2L
V111L
T165A
I132V
E218T
A36Q
N32I
Q219P
E218R
E44V
C162F
Q207L
G211E
G177Q
K32T
F162V
A200Q
T165L
K82Q
I142N
E53Q
L195M
V200Q
I179S
A98E
K6N
W162V
N32T
E123R
S200Q
A55Q
E218H
P4N
V108A
K219P
L100V
R169H
R169Y
I31M
A122I
Y121E
H207L
R172S
E123Q
K172S
I47F
R219P
Q204G
Q204A
I167M
M139P
Q11N
I165S
Y121N
I179M
E29Q
P59A
E122A
G15R
A173I
A169Q
Q91E
I50L
N192T
A169G
V135A
R213V
S163I
T142K
V60Y
E218V
M184L
V215H
D6N
Y162V
E122I
R101G
N177Q
R70S
E79K
L197R
I37M
V8G
F215A
D110N
P59Q
G134T
H121C
D44G
V200L
I60Y
R22N
D121N
E6Q
A14Q
V60L
K82T
E169Q
S191Y
S105A
I165A
E6A
C162W
E169G
K166N
R173M
S191C
E197T
A138D
S200L
K32I
A200L
I47N
G141R
I60L
I39Q
N64Q
E173M
Q219S
I142Q
T207L
D121E
Q102M
E44G
K39L
Y215N
I142R
I142S
R143S
G148L
C163I
V10I
F87L
A139P
G196K
S162T
C121E
A36K
K65N
L215A
V142K
Y181S
T215A
V21G
I200Q
Y215D
S105T
F160L
G134R
N69K
E53G
H64Q
V178F
D211V
Y162F
T27I
R45E
F210V
F215L
Y57H
L193I
I2V
V62P
K166E
S215N
K207G
N67A
Q197R
N67S
K197T
D204G
I214V
I214S
V215A
T35A
E40F
K174N
A33P
I179F
K64Y
S210Y
S68T
T27P
K49T
S163C
Q204N
L215N
R43T
D204A
S
163G
S39Q
K32N
Y215S
F208L
S39P
K73M
G68T
A33V
L210Y
A162G
S163R
E6N
N43T
K70T
E203G
V142L
K122V
R169Q
E49T
N134I
T200M
R169G
A138Q
P176T
D215E
I109M
P197T
Q49T
E39N
Q197L
R103E
I39R
L173I
D211M
I184L
K219S
R207L
R101I
G143S
G211T
E102M
I200L
I159V
V184L
W210V
S162G
R166E
S163N
G155R
N54I
R32Q
Q219G
Q166I
T139Q
R174N
N69P
G134C
G134I
K173L
T142L
S200V
E207G
Q43T
M35E
V200R
E70Q
I200V
Y121C
H208L
A43T
I39G
N69E
R103I
D179L
E39L
M139S
V106G
I179A
G67E
M178F
K166I
A162T
R196D
E174N
R166N
E197P
R172G
T200S
S39G
K219G
I39P
K211S
Q173L
N175Y
R101T
K197R
K102N
K172G
L139P
R219G
R122V
D121C
T173L
H207N
F210Y
N162G
L197H
N68T
K82N
T200Q
V202T
Y215E
M139V
T215L
T215N
I109V
V179M
E203A
D69I
V62T
T69P
I106L
E169A
T139E
T11R
C162G
V179S
Y57K
D204N
E179L
P122V
E207K
V159L
S200R
E70S
T69K
T173E
R166I
E32Q
T35K
P176Q
K49Q
W88S
V8I
A162N
L210M
K103I
S173L
M200R
K32Q
E203V
S48E
F215N
P4H
S215E
E203Q
T166E
T173S
W162G
A122V
G134N
K197P
E197R
A139I
Y162W
Q173I
A33G
A179G
I135K
T139P Y188N
S163T
K207D
V215L
T200L
H174N
W210Y
I39D
A139V
A203Q
P197R
I179G
K35E
G211A
A158T
T215V
Y188S
H67A
A35E
M139I
E197L
N186E
E79D
M35R
T75S
R122Q
C162T
Y208L
S39D
L173N
S68G
A200R
K101G
A122Q
L200R
V179A
D162G
Y162G
D162T
F162G
M16V
R70T
A138G
R196Q
T173I
W162T
F210M
A162D
D169K
N69A
T35E
M50L
V215N
R190T
T207G
K143S
N32Q
Q122V
H207G
E194Q
N175H
K103E
R196A
P4T
I132L
N211Q
Q102N
K173I
L210S
D123G
T207N
K101I
N166E
R135K
M145C
L215E
Q207G
A173N
K166Q
Q197H
I181F
I178F
F193I
I39N
T166I
R173L
N207G
K197L
E200R
I200R
I165L
V179F
L215S
T200V
V184I
R207H
K101T
Q204K
E123K
K68T
R190C
L142M
D194K
R68T
T69A
R172K
E207D
A211Q
N64Y
H67S
T69E
E173L
R207N
I159L
S69G
V75S
E102N
K35R
E122Q
A35R
R173E
G211Q
Y162T
I122V
T122V
Q173N
V200E
V179G
S39N
D211E
I179L
G177N
Y175H
A69G
V142T
M139A
E211A
E122V
G207A
R173S
W88C
A162H
F162T
S173I
S200E
C188F
H64Y
V142M
R103H
R166Q
F210S
N104R
G190T
E211Q
T35R
C162N
I142K
A36D
V139I
L139I
I142L
T200R
R207G
T211A
N64H
D69G
H207K
K49E
A200E
E28K
T173N
A169K
P197L
K197H
V135R
N211S
E169K
G162H
T215E
Y181F
S215C
T211Q
E35L
F162N
I35L
V179L
T142M
R68N
K102R
W210M
D162N
V203Q
S162H
T215S
D211G
T139S
W162N
H208F
E173I
E194K
G190C
E6D
N207K
R173I
I39S
K135M
Q138G
S39L
V135K
Q122P
E40D
R169K
D204K
M35L
N166I
F215E
Q207K
R190V
A179L
Q194K
C162D
P69G
S215I
E197H
N69G
T69G
Q211S
T75A
T4S
F215S
K173A
I39L
E123G
P197H
K35L
T139IY188C
N69I
L215D
A35L
Y162N
K36D
T207K
T139V
Q173A
N162H
W210S
V75A
I50V
E28A
T39A
T207D
C162A
H207D
K173N
S39E
T166Q
R196K
G194K
G190R
T173A
A211S
C162H
E211S
K101N
V200K
E70T
D69S
G211S
W162H
Q207D
V75T
S139R
P4S
I200E
I39E
N207D
V106L
K64R
D211A
T69I
N67E
S173N
D215C
M200K
M211A
V215E
R173N
K28A
I31L
Q190E
Y215C
T35L
R135M
T35M
A139K
V31L
A69S
Y188FG190V
M211Q
T27S
K142M
V215S
I135M
V60I
D123S
Q102R
H67E
T211S
Y162H
S173A
Y208F
T215D
D207A
A139R
N166Q
T75L
R103T
M139K
I106M
D162H
I142M
D211T
K103H
N64R
L200K
E173N
Y215I
R207K
M35I
D215I
E219N
F40D
T215C
R43Q
R207D
Y188H
F215D
L215C
K103T
F162H
I69G
D211Q
V189I
G28A
T207A
V75L
R122P
Y162D
S48T
A122P
E102R
R101H
Q219R
I69S
S200K
I179E
D28A
K39A
R190Q
A200K
M139R
K65R
N215C
V21I
F215C
L39A
V111I
A158S
M75L
Q101E
L4S
T215I
L215I
T200E
T200I
A190E
T139A
V135M
L139R
R173A
E200K
H215C
Y64R
G123N
N215I
F215I
K35I
Q200K
E173A
E39A
E207A
K207A
A35I
I200K
R28A
V179E
N67G
K82R
S207A
N69S
T101H
T69S
V215D
V139K
L139K
Y162A
N207A
H215I
T35I
E48T
I181V
N39A
K219R
V215C
S123N
V178L
E122P
Y162C
A203D
G190Q
Y181IY181V
E123S
H207A
I179D
F116Y
D39A
K101H
T215F
K101R
K43E
T200K
R190E
I142T
E203K
H64R
Y121D
N207E
V106M
T69N
Q151M
T139R
V139R
M178L
D211S
A203K
G196E
M211S
A179E
K123N
E101P
P48T
Q207A
T75M
H207E
V215I
A123N
E203D
T139K
Q207E
E53D
I211S
K203D
H67G
S98G
V179D
H219N
R43E
Q101P
N68G
F210W
G190E
R207A
S211R
V75M
Q219N
M135L
D219N
K53D
K219N
R101Q
A43E
T207E
R135L V135L
N69D
T75I
H219E
T11K
A179D
Q211K
I135L
T101P
K219Q
R135I
N211K
L74V
N43E
M50V
D123N
R101P
T101Q
Q219E
A98S
M106A
Q43E
I178L
G177E
A98G
K39T
R219N
T69D
K101P
A162S
K219E
V75I
L100I
H207Q
V203K
K135L
V179I
K101Q
L39T
D36E
R101E
I106A
N177E
R68G
E123N
R219E
V203D
L35V
R207E
H101P
Q190S
K35V
I35V
N174Q
V106I
A35V
E39T
A211K
E211K
F214L
L215Y
R70K
M75I
G211K
S39A
P122K
K49R
T101E
V184M
K103S
I74V
I39A
Q43K
R103S
K211R
E70R
G67D
E190S
H101Q
Y171F
M35V
K204E
A190S
V108I
K103R V106A
E35V
E49R
R122K
Q49R
N39T
V180I
Q211R
T211K
V135I
Q190A
F100I
N204E
G211R
K101E
D39T
T35V
V196E
T211R
R166K
T207Q
Q122K
F210L
T215Y
E211R
A211R
Y208H
A122K
E123D
H101E
C162S
A36E
R190S
R196E
F215Y
R22K
E177D
I90V
G190S
E197Q
Q102K
F162S
R207Q
I165T
K174Q
E86D
D162S
N211R
W162S
V202I
A204E
M211K
M135T
R104K
R135T V135T
R43K
H174Q
H64K
A14P
I135T
I214L
R174Q
A138E
Y162S
N68S
K197Q
K138E
E70K
W210L
N64K
E174Q
N104K
E122K
V215Y
N67D
H67D
P197Q
G177D
I122K
T122K
G138E
C188L
N177D
D204E
E102K
I34L
M211R
D211K
Q204E
K135T
R190A
F115Y
Q138E
I214F
K36E
I118V
R196G
L77F
L94I
I109L
P62A
D211R
Y57N
S39T
E218D
M145Q
P131T
R213G
I39T
G190A
R68S
A117S
C134S
R66K
N134S
R99G
I211R
I181C
Y181C
S58T
Q46K
V181C
V109L
H182Q
A107T
G134S
V62A
R103N
0.0
0.20
D121N
I142Q
E79K
R199G
H121N
I180M
D113E
P133L
T58I
E207T F124L
A33V
K66N
N147H
D79K
S162G
C162N
G45E
Q207T
Y121N
D186N
A162G
R207T
K207T
S48T
D207T
E39S
G207T
G99R
N207T
R206K
D186E
H207T
R45E
V21I
C162G
R66N
E215N
V106M
K22R
V202T
H162G
T211N
I202T
Y181V
D36A
A39S
R173L
M210S
M39S
W162G
K211N
K39S
Q101R
T107S
I106M
L142M
N162G
K101E
R162G
A173L
E211N
T58N K219G
K43N
C88S
S173L
E173L
N69I
I135V
W88S
T173L
V179D
S215N
N175Y
K70T
I173L
L39S
I215N
E219G
A139R
M35I
H174R
Q219G
C215V
T39S
Q174R
D123G
N123G
T139R
Q211N
S162D
I142M
C181I
N173L
S211N
D67ST215N
D40F
S68G
K139R
D215N
L75A
T142M
V75A
E36A
A106M
S58N
G123S
H175Y
C215D
T195L
A211N
A162D
A123G
E40F
C215N
K174R
Q173L
M184L
G211N
V111I
T4S
R211N
E123G
K104R
K135L
K122P
S162Y
I195L
P4S L210S
D44A
T135L
I181V
C162A
S200E
P39S
K101R
K123G
A142M
R200E
M142V
A162Y
K103S T69I
G67S
N104R
C181V
Q142M
R103S
V200E
I200E
V202I
R135L
V75M
K40F
F210S
E70T
M135L
C162D
E207A
A200E
T215C
N103S
T200E
K142M
K200E
D162Y
E48T
T215V
H162D
E203D
N69S
G211R
Y115F
L35I
A190S
V135L
M200E
D69I
D207A
Q122P
L142V
C162Y
D123S
G207A
K207A
E203K
G44A
R207A
V122P
W162D
N123S
S207A
T166R
E44A
K219N
Q207A
D67G
T215D
H162Y
E101R
N162D
Y188L
I135L
R162D
A123S
L200E
G190S
E138A
W162Y
T69S
D218E
K44A
I35V
N162A
T35I
E123S
A35I
F77L
N207A
H207A
K166R
K123S
N162Y
K20R
E190S
T211K
F116Y
R35I
D69S
I166R
T69D
I142V
L75I
T142V
R162Y
M35V
V75I
K122E
K49R
I8V
V60I
L215F
R43E
K65R
T11K
N188L
L35V
M184I
K43E K219R
Q151M
N166R
A62V
S211K
V179I
Q43E
Q211K
N43E
K35V
A179I
K204E
E179I
I165T
Q219R
E219R
K70R
I215F
Q122E
R173K
A142V
D179I
G211K
V122E
Q142V
G177D
R211K
K142V
A211K
D204E
S27T
A35V
E215F
L74V
K83R
S215F
E177D
T35V
S173K
T173K
A173K
E173K
C134S
R35V
I173K
D215F
T215F
Q204E
Y208H
C215F
N173K
N177D
E48S
V50I
E70R
C145Q
K172R
Q46K
D67N
Q173K
L215Y
L210W
M41L
M210W
H182Q
F210W
I108VS215Y
I100LG67N
F215Y
G134S
D215Y
C215Y
T215Y
I215Y
E215Y
I184V
change in log(rf)
T4I
A22S
E16W
N69P
K45T
K14E
I19R
L76S
R14E
D60A R69P
R45N
N69L
Q63R
D37V
A63I
N69T
Y67R
Y67G
Q58K
S67G
S67R
E35K
I15S
K57I
D67R
D67G
I13L
R8T
S37I
N83K
M20N
I82D
G17A
I20NP19A
A12M
L63R
S63R
V63F
L10C
C37V
R69L
T19F
Y69P
L5I
V56L
D60H
R8A
L19R
P39L
L15E
V15E
I20A
E67R
K69P
E67G
S4I
I89S
D25Y
W6A
L19F
R70N
G17R
A63R
E72F
L63D
P9L
L15M
I10T
Y59H
I19F
Q58R
Q63D
V13R
I3N
E37V
P63N
Q2S
R20N
I36K
Y69E
S39L
R57S
R8V
R43S
E68D
S63D
K20N
V15M
V82D
K69E
V19F
D29G
I15E
V63I
V82N
R41T
I13R
F53S
L36K
E16D
P19F
S37V
Y59D
A82D
R8G
P19M
Y69L
R45T
I66M
R16W
M19F
R41S
S19F
N37V
T71D
G16D
K37V
N98D
T70N
P39T
Q19F
R70Q
R45I
A63D
I85M
W6S
T39L
Q2P
A71D
Q39L
I82G
W6P
K69L
K43S
L82D
N7A
N7R
N98I
K20A
K43N
R20A
D29N
T96P
I54F
E63I
R41I
M47L
L10S
D29H
G6A
C95W
R8D
P79T
K41T
A13R
D29A
V10Y
I11F
T12M
C37Y
I85N
T72F
I15M
D65Q
D29V
I32G
K41N
L5F
V20N
V20A
Q19R
M20A
N37K
V82L
Q2I
I37V
T80I
L38F
I63R
Q63C
K57S
K43E
Q18L
L6P
L6S
F53V
G6S
G6P
T37Q
M36K
R72F
H2I
H61K
P79R
N61D
Q7E
L82N
I98D
K41S
P19R
I93M
G49R
N12R
E67W
T72E
Q8V
Q8G
N43S
P79L
F53Y
Q2H
E37Y
Y69T
I36A
T96S
T82N
D29E
I84L
V93F
L36A
K41I
S79R
I32L
S39T
S67L
V15L
V63R
P9T
R55I
M89S
L38W
I63D
S88K
D67L
H63N
A71G
T71G
D37Q
Y67L
T20N
K98D
T70Q
T72R
E21K
V72K
S12R
L89S
K92H
T1L
A22V
W6L
V82G
N41T
D34T
T82D
I32A
E63R
L10T
C95I
A74R
T20A
M72K
M19R
R16D
V63E
S79L
A82G
K69T
N35K
L63C
Q8D
V47M
M62L
Q92H
V10M
Q18H
S37K
V63D
V12R
P12R
K34T
N88K
E61K
N41S
C95SV47K
L89M
T63N
I10M
K12R
N7K
Q61R
M36A
A12R
I12R
R55T
S63C
F10M
L23I
H18L
T80S
V76S
E67L
A12K
E63D
I20L
C63N
E37Q
L82G
I66L
C37Q
K92L
L10M
I10Y
S37Y
I12K
P79S
V47L
N41I
T12A
D12R
Q92L
L90W
D37T
L63N
F66M
T12R
W6C
Y59F
K37Y
S37Q
T70E
S74R
T37H
N37Y
K34I
C95R
A63C
V89S
S63N
E18L
A82I
Q39T
Q61K
L90V
R43I
Y69R
F82D
L63H
H63T
G17D
E67D
G6C
Q34I
T12I
S74E
D34I
R55Q
E67S
N37Q
R63N
Q34T
Q63H
D37H
I12N
Q63N
L24M
F53I
G48I
K37Q
A63N
C37H
L38I
S63H
C37T
A74K
F82N
Y37Q
I15L
K20L
L6C
E12Q
M20L
L10Y
S37H
Q39S
G6L
K43I
N43E
G51A
R43Q
K12N
I36L
C95L
T74RG48Q
I37Q
R20L
F38I
H61D
P79D
Y37T
D63N
I37Y
T82G
Q34K
I63N
T12K
E37T
Q61H
H2L
G63N
A63H
E37H
N7EQ2L
P12N
T70R
K34V
L33V
S74K
N37H
E72M
T72M
A12N
L19T
R12Q
R14K
S37T
G17E
N12E
L19V
L90F
I63C
K43Q
V19T
I19T
V20L
I72L
R63H
F99L
H7E
E16A
V12E
I37H
L19I
P19Q
D34V
S79D
L24F
I19V
N12Q
N43I
T72K
E72K
K7E
Y37H
I12E
K37H
V12Q
Q34V
K12E
N37T
I63H
W67F
T12N
D60N
I12S
S12E
K12Q
Q35G
I12Q
D12Q
G16A
A63L
P12E
V63C
V82T
D12E
G48R G48L
V63N
V82I
K37T
I36V
V63A
H12Q
T74E
E18H
L63T
A12E
F82G
E60N
D12N
C67F
N69Q
R72M
A12Q
S12Q
C63T
E63N
E61D
P19T
Q92K
Q61D
R55N
R20I
K12S
T12Q
H61N
N43Q
T12E
E63C
Y69Q
T20L
R72K
H69Q
V12S
S63T
I33V
V72L
T74K
K43R
P12Q
I37T
M19T
Q19T
M12Q
S19T
N88S
P12S
R69Q
G48T
L36V
Q92E
Q63T
M89V
V71L
V63H
R92E
F82M
K92E
V93M
S67F
Q34D
A63T
P19V
R63T
A12S
M36V
Q19V
I50V
H12E
K35G
D67F
M89I
I10V
E61N
L89V
L82I
M19V
V54T
S19V
I63T
M36L
Y67F
E35G
L89I
E35D
V33F
F82L
T12S
G48E
D35G
H92E
Q92R
R16A
Q61N
E63H
C37D
N35G
S74A
K69Q
I85V
M64I
M72L
T71LA71L
K92R
D12S
L46I
E67F
R16E
Q19I
I93L
D63T
N43R
Y69K
G63T
V10F
V63L
G67F
I77V
I64V
I10F
K20I
P19I
V89I
I54L
M64V
E72L
A71T
M46L
D60E
G48A
T72L
K20R
E37D
S37N
A74P
L10V
I13V
F82C
T82I
L64V
V63T
M54T
I82S
L54T
E72V
T72V
M19I
N69H
M93L
M54A
S74P
L10F
C95F
S37D
K58E
E63L
I15V
R72L
E63T
V20T
H63P
V72I
I54M
L54A
N37D
L33F
L54M
V71I
K37D
G73C
I54T
R43T
G48M
V82A
I54A
V82S
L24I
A13V
M46I
R72V T74A
L82S
I43T
T74P
F82I
Q19L
T63P
S39P
K43T
R69H
Q58E
V20I
L64I
V62I
C63P
L10I
S73T
I37D
M72I
I33F
T20I
A71I
T82S
L63P
T71IA82S
Y69H
F82S
K41R
S63P
N35D
V20R
R70K
N41R
T20R
V93L
L82A
S67C
E68G
S4T
P19LF82T
F53L
R63P
E18Q
Q63P
N43T
R45K
G73T
A63P
D67C
Q39P
T70K
V46L
I33L
T1P
Y67C
K69H
V93I
D65E
M19L
D63P
M62V
I63P
G63P
E72I
T71V
T72I
A71V
E67C
Q65E
N88D
V20K
K65E
N7Q
T82A
N55K
R72II11VS88D
M54V
V46I
M62I
H7Q
V63P
L54V
D94G E63P
I54V G73S
M47I F82A
R55K
I84V
F97L
F82V
N30D
K34E
S11V
Q34E
V76L
V47I
L90M
D34E
0.4
4
5
0.5
change in log(rf)
0.2
3
M184I
1
6
5
4
−log(prob)
3
2
1
6
5
4
3
−log(prob)
2
1
7
0.15
T215Y
0.15
V180M
NVP
K65R
M184V
0.10
change in log(rf)
F124L
V10I
I94M
K64E
A28N
E53Q
G177Y
D169N
I31K
P1T
K35G
K73M
K197T
N162I
K39I
R211M
M16I
K11S
M35G
A35G
E177Y
D194V
D162I
Q91H
P95Q
K154R
R64Q
K20L
N81K
L35G
G190V
D53Q
A162I
W212L
R172S
R103T
T27P
E39I
V202L
K64Q
K28Q
I180F
D207L
K194V
C162I
S162I
T84I
F87L
G99R
K44Y
F116S
Y121N
K197H
K219T
R20L
N103T
K101G
E86N
V184T
L47V
M39I
W210Y
V16I
V21F T35G
A39I
A28Q
N207L
T35S
E207T
E101G
R101N
N68V
D194A
Q211V
A114P
E154R
H219T
P1L
V180F
I5V
T211V
T58I
D185E
G190R
S211V
S58I
K22N
G141R
S39I
K219G
K103T
Q207L
G207L
T35Q
G211V
V118F
Q23H
R49Q
D169R
K39N
D86N
A207L
K219N
A207G
G190C
R207L
K211V
G68V
H207L
R206K
C121N
K194A
S27P
D44Y
S190C
N57K
V179L
K169R
P122Q
K219D
K207L
A36D
W88R
H101G
P39I
R101G
T39I
T69P
N39L
V75M
S207T
S200V
D207T
R211V
T139I
A139I
K70N
T39M
L205F
R190C
E39N
I202T
D179G
K207G
E32R
K49Q
N207T
G207T
I35R
N175H
H211V
A173E
V106M
I132V
E39L
A62P S68V
Q207T
S3C
A179G
K32R
A207T
I179E
T200V
S68T
R207T
Y144F
V179G
K35R
H207T
N70Q
M139I
M39N
K219H
A39N
I200V
G138K
K207T L210Y
W210M
Q211E
N43T
C215N
I2V
W210F
N101G
D123K
I39L
D215E
K122Q
E200V
E70Q
R43T
M39L
A162N
R211G
I135M
R39L
Q211D
K70Q
S211E
K39L
K22R
S39N
T211E
M35R
K200V
Q43T
V202T
I94L
R11T
S215E
S215N
V215N
G196K
M210S
N68K
A39L
E215N
K174E
S211D
Q211A
T58N
G211E
E123K
E200K
T211D
E49Q
K211E
K207A
T69A
N173E
S58N
K43T
S39L
F210S
D123G
N64Y
E177G
T211A
E196K
A35R
T139R
G211D
I106M
Q190E
K197E
A139R
G68K
E67S
S211A
T215N
N69A
P4S
E36D
V179D
E39K
E36A
N123K
G68N
A69G
K211D
R172K
A138K
E138K
N32R
P39N
D215N
R174E
G190E
R64Y
G211A
T39N
Q200V
R173E
P69G
T69G
K101P
L210M
G190S
T173E
C88S
R211E
K64Y
L35R
T69E
T139K
D179E
A139K
M135R
K211A
V111I
E138A
T39L
E123G
S190E
N69G
R211D
I173E
N69E
D69A
S68K
T135V
Y121H
S48T
A179E
S215D
A39K
H211E
T32R
N123G
P197E
S173E
L4S
M39K
P39L
M139R
E101P
R211A
G123S
T215E
R190E
V215E
R101P
A106M
V179E
T69IT35R
H211D
Y115F
D67S
D69G
I135R
K101Q
T215V
H174E
R174K
T4S
S39K
D69E
N67S
E203D
N69I
S11T
K11T
M139K
G177N
K36D
L197E
N104R
W88S
L210S
W210S
H211A
C162S
H174K
K36A
K104R
E177N
W88C
T215S
C121H
Q43R
H101P
V75T
E48T
E101Q
S68N
V21I
L135M
E40D
A162C
V215S
L210F
A69S
D69I
Q151M
H197E
I195L
D40F
L135R
G67S
K43R
G162Y
R101Q
T39K
P39K
Q203D
D123S
D203K
K40F
C215I
T69S
K135V
P69S
N101P
T215D
F77L
V178L
V215D
E123S
V215I
K219R
N69S
A162Y
Q203K
R135V
S215I
M75T
K166R
H101Q
D218E
K135T
T162Y
D69N
T69N
T215I E40F
V90I
M135V
E203K
N123S
E215I
F116Y
Q43N
D179I
N123E
C162Y
S162Y
H219R
D69S
M178L
D207E
E174Q
D215I
P122E
T69D
A98G
I135V
T75I
K43N
A40F
V200A
N101Q
D162Y
A179I
S98G
N162Y
A173K
K43Q
E173K
I166R
V60I
V179I
R219E
H121D
A203K
I178L
L75I
V75I
N207E
R211K
L135V
D204E
K219E
G207E
S200A
A162S
N219E
K6E
R135T
E200A
G190A
T200A
Q207E
I200A
N43E
K28E
A207E
I184M
P62V M75I
R207E
S190A
L215F
H207E
M135T
D6E
Q190A
K122E
E197Q
N173K
A62V
Q102K
K35V
L173K
R190A
K200A
K207E
Q200A
R43E
N174Q
K83R
A105S
S27T
I135T
I35V
I215F
I8V
H219E
T70R
V118I
T173K
L135I
R173K
Y121D
E102K
N219Q
K204E
Q43E
K43E
E219Q
I106V
N204E
C215F
R219Q
A6E
S158A
I173K
K219Q
V50I
K197Q
L47I
M35V
Q70R
D194E
N177D
L135T
K174Q
K194E
N215F
R102K
A28E
S215F
N113D
I165T
S173K
T215F
R35V
G67D
D169E
D79E
A35V
K169E
R174Q
E215F
R82K
L35V
L210W
N70R
D215F
D67N
G70R
E70R
C121D
L188Y
D28E
K70R
V215F
T35V
H219Q
A204E
G177D
P197Q
E177D
H197Q
K42E
V184M
C181Y
F188Y
A106V
I181Y
S137N
H188Y
M41L
L184M
H174Q
A165T
Q217P
G134S
T158A
L197Q
K44E
I74L
I100LF215Y
T105S
C134S
G67N
D44E
V181Y
V74L
S215Y
E215Y
L215Y
C215Y
N215Y D215Y V215Y
I215Y
0.0
−log(prob)
2
M184L
M184V
0.0
M184I
Q151M
ABC
M184R
M184W
TDF
0.10
0.05
M184L
change in log(rf)
1
6
4
−log(prob)
2
V75I
I35G
0.05
0.00
3TC
Q23H
R143S
S163R
I94T
Q151H
K13E
V111M
K201Q
A162K
N54H
R199S
T35S
D40G
P97A
S134I
I60L
A36N
N123Y
D218T
N137K
K122D
P1T
E44Y
G196A
D86N
A62P
A162R
T107P
E40S
L209M
P150T
D185E
V106G
F171L
L209S
E123Y
I178P
V50L
F214S
I178N
L109R
N177H
D169Q
E36N
A36G
K207L
P14S
K102G
Q145H
E6T
I165K
I31M
D67K
I202L
D185N
V111L
V135S
P150S
T11N
H96Y
L178P
L178N
Q122D
E36G
G68Y
R11N
K174T
R143G
Q91E
E89G
M164H
Y115N
D40S
L92I
I35A
S68Y
Q219T
S123Y
K126E
A28Q
Q207L
W24R
G51A
A158Q
L214S
H174T
T69Y
D123Y
R172I
Q161H
K30T
A207L
E169Q
G51K
C162K
P1L
K197T
T35Q
T35A
K102M
K46E
K39R
A122D
Q91H
Q219S
D169R
P150L
K211V
R125K
R172S
S163T
E196K
T11A
E29G
S68T
N177R
E53K
E101G
D110N
D53Q
R72K
D53K
T39L
K172I
K172S
V148G
E53Q
K6T
K6G
N67K
E204N
S48Q
K169Q
Y188H
A114P
E194V
P7Q
Q151K
E28N
I63K
D207L
F124L
P4K
R199K
T135S
E6Q
I31K
E40K
Q197T
V202L
K122T
E122D
K207T
E28Q
A36K
I135S
M35Q
N147H
K219T
T200Q
E89K
S162H
Q122T
E48L
S98E
E39R
P14A
V122D
K28N
Q161P
F87L
C162R
K30E
A162H
S48L
G99R
E138D
D69Y
N68V
L187F
G211V
P140Q
I63V
A200M
Y162K
S68V
T35K
E194N
K166E
C162G
I47F
T215L
V179L
R201Q
E67K
T215N
A162T
D186E
Q207T
T4K
E169R
I202T
A138D
W162K
R68Y
E197T
K28Q
K66N
R211E
I200Q
D53G
K6Q
D113E
A158P
D186N
D40K
E204A
E42K
I35Q
N43T
A122T
G68V
E53G
I100V
K211A
K169R
D203G
S48P
L34F
T142L
V179S
L109V
A39R
R206K
K219S
K103H
S211E
G98E
A207T
L12I
V75L
I181F
N39R
R101G
L210V
I195T
I189L
E36K
S162T
F115N
A200S
Q145C
C181F
Y162R
E48P
G155R
K39L
F215N
E39L
Y162G
D162K
E194A
K211D
E203G
D207T
W162R
E122T
T215C
D169A
V142M
T139V
G190Q
E219T
T107S
L109I
N123G
T58N
C3G
T69P
A173I
R68V
K32R
T139I
K194A
A44Y
M178V
N219T
E123G
I142L
K70E
L210M
K82R
N67E
C215L
L31M
L31K
P4H
K64N
C215N
K70T
A138G
R101N
Q138D
I165L
E138K
C162T
K138D
P101G
E203Q
S103H
T215D
T200M
T139M
K174E
T211E
D162R
V75S
E196R
D67S
A158T
D203V
E39S
A36D
E194K
I200M
T215S
V122T
D215S
T142M
T202L
V181F
D123S
Q197H
M135K
G211E
N47F
A28G
D67E
P4T
K102R
K66R
E122Q
K197H
L34I
S123G
Q174R
V215N
G196K
C181V
N39L
I2V
E219S
T4H
T69A
V135K
D162G
L135K
V202T
A207N
T215V
K207N
G196R
A39L
E36D
C162H
I94L
T173I
N219S
Q173I
I173E
V106A
K64Y
G211D
N32R
E203V
K64H
G211A
Q207N
E32R
A138K
D123G
E40F
E169A
A207K
T200S
T69E
A190Q
K102E
E197H
S173I
T139R
N211E
I132V
F210S
N173I
H174E
E190Q
Q166I
A39S
I200S
R101H
K203G
E204K
T211Q
M139V
D69P
A211E
T35I
D203Q
K169A
M135R
M139I
L210S
S190Q
L215S
N39S
W162T
A173E
T215I
I132L
I142M
I142T
R173I
V106M
K211E
G211Q
T139A
A33G
Y162T
S39L
K173E
V203Q
C215S
Y121H
L210F
G211T
I195L
R207H
T69I
A200K
C215D
I178V
E28G
K174R
R102E
V75A
N162T
K103S
A39E
Q197K
E207H
T35L
A207H
H174R
N162H
C162D
K203Q
A203G
W210S
T135K
D40F
C162Y
D207N
A211T
Q166E
I135K
E203K
L178V
Q200M
K211T
S173E
N211Q
F215S
F215I
D207K
T211S
V75T
I180V
Y162H
K28G
N211T
D69A
R139K
M75T
W162H
N39E
V135R
I39L
T173E
A211Q
K207H
E69A
K103R
K211Q
E53D
C215V
N67S
L75A
Y188L
Q207H
F116Y
G211S
D69E
V111I
V90I
K203V
E28K
S200E
K200E
D207A
Q173E
E197K
G203Q
T139K
A200E
V108I
R173E
D207H
K135R
E6D
D162T
P4S
K219N
C215I
T200K
M35L
Q200S
T67S
I75A
D121H
M139A
A203Q
N211S
L74V
T135L
T142V
Q122P
K122P
I200K
N69S
T135R
V21I
D69I
E67S
T75A
E101Q
L135R
K211S
A203V
R173T
K64R
V215S
M102E
D162H
I135R
A211S
I35L
K219E
T69S
A106M
V75I
R43Q
D211E
V106I
T4S
S103R
H64R
I135L
A62V
Q151M
M75A
E169D
E219N
D67G
V215I
K142M
E44D
T200E
I200E
S211R
K43Q
L75T
T135V
Y64R
D218E
A162S
D177E
A122P
R101Q
K6D
T69D
N173E
I178M
M139K
A158S
K102Q
M75I
I135V
R101E
S142M
S48T
I142V
R103N
L178M
D211Q
H101Q
K166R
V179I
K49R
K219R
Q200K
P101Q
I135T
E48T
G177E
E122P
I200T
P101E
L75I
K20R
T69N
K65R
R102Q
G196E
A207E
K207E
Q200E
M195L
D67N
D69S
V118I
E43Q
Q207E
K207Q
A106I
K70R
K219Q
D211S
I69S
V122P
K103N
Q219R
G211K
A173K
T211R
S215Y
K39T
A207Q
E39T
E69S
I69N
N67G
L35V
G177D
G211R
K142V
M41L
D207E
A67G
V50I
T35V
M102Q
T215Y
L210W
N177E
T67G
T11K
E69N
D69N
I215Y
K174Q
R11K
H219R
R104K
H174Q
C162S
E67G
N211R
N166R
E219R
I165T
D207Q
D215Y
T173K
Q173K
A211R
N43Q
A39T
I74V
N39T
K211R
S173K
L215Y
N173K
S39T
N219R
N219Q
M35V
Q101K
E219Q
I60V
S103N
A35V
R173K
K6E
K83R
N177D
Q166R
C215Y
I35V
E49R
S176P
S27T
A44D
R101K
E101K
Y162S
W162S
F215Y
S98A
I181Y
N43K
I189V
I39T
L31I
F208H
C181Y
D211R
P101K
D162S
P7T
H101K
V215Y
T16M
G98A
V181Y
I100L
C3S
Y57N
Y208H
F167I
change in log(rf)
0.00
0.30
−log(prob)
0.10
0.25
0
4
3
1
2
−log(prob)
5
6
7
d4T
0.05
0.20
S163I
V180M
I180F
V202L
I60L
N81K
A36G T163I
R72K I180M
Y127C
A139P
K139P
T139P
N68V
T216P
W153G
M139P
I202L L168S
R139P
P133L
E36G
V10I
G68V
D36G
P170L
T68V
P4L
N207T
K102E
H207T
G99R S68V
A207T
F87L
I94LT58N
Q207T
E203Q
K22R
G211N
G67S
M210S
N67S
V135K
D67S
D207T
K207T
R211G
G207T
I202T
N162H
L207T
E207T
E203K
Q211N
G138K
A138K
H175Y
T200I
L210F
Q211G
K203Q
S123G
A33G
Q122P
A162H
R211N
K173R
L178V
C162H
E123G
D123G
V142T
E203V
D162H
E138K
V178M
A211G
N207A
A211N
F210S
L210S
S162H
E211G
L142T
V75L
N22R
E211N
A40F
K211G
K211N
L75A
T211N
N175Y
T211G
N162Y
K203V
N123G
V75A
H207A
S162C
T4S
L4S
I178V
S211G
P101Q
T123G
E40F
S211N
P4S
L200E
T162Y
K101Q
H4S
L142V
A173T
E173T
A162Y
C162Y
D162Y
V200E
S69N
G69S
D40F
Q200E
H211G
R200E
H208Y
K102R
H211N
K43Q
M142T
I200E
E210S
S162Y
E101Q
K200E
E219R
S200ET200E
K142T
N173T
Q207A
I142T
G207A
A200E
L75T
R173T
I173T
R101Q
R43Q
L178M
K219R
T69SV75T
N219R
D69S
T101Q
E203D
N43Q
E43Q
R102Q
K207A
D207A
E102R
K173T F77L
D218E
N102Q
Q203D
M142V
L173T
K135T
E207A
G69N
K207E
K44D
K122E
K65RV75M
S173T
K102Q
P62V
I142V
D179I
T69D
K142V
L75M
L207A
I202V
I178M
A179I
F116Y
A43Q
R219Q
V179I
K203D
E44D
E207Q
M135T
A44D
R135T
L135T
Q11KD69N
H219Q
T69N
V75I
L75I
E102Q
V122E
I135T
T215Y
G44D
A62V
L74V
I74V
C215T
M35V
N219Q
I35V
D177E
A105S
N215T
E219Q
E197Q
H121D
S103K
S11K
V215T
K219Q
K207Q
V118I
R11K
C134S
Y121D
D215T
K83R
L215T
L197Q
A122E
A122K
A169E
N6E
A215Y
N32K
A215T
N177E
L35V
G177E
S215T
R32K
K169E
N215Y
P122E
K197Q
K172R
P122K
R103K
A14P
C121D
V135T
C215Y
K6E
E32K
Q122K
I8V
Q122E
I215T
L215Y
P7T
D6E
T191S
R197Q
D215Y
A35V
E215T
I215Y
E215Y
S215Y
V215Y
M41L
P197Q
L188Y
D169E
M210W
F215T
K35V
K57N
I60V
H197Q
D79E
I165T
N103K
R35V
T11K
L210W
S210W
F215Y
T35V F210W
L165T
Q217P
G134S
Q6E
E210W
L31I
I184V
change in log(rf)
S211M
D218V
R172I
D40G
A207L
W153M
Y188F
H198P
E194G
G190R
V179F
L26M
Q207L
K103Q
H207L
R64Q
P9S
D123Y
I5L
V50F
E89G
K211M
K39I
G211V
D186H
P25A
N103E
A207T
D207L
P119L
N81K
D53Q
V179G
L12I
T39I
I50M
A39I
M35G
V180F
H198Q
F144S
R143T
Q23H
L39I
E44Y
T11N
I47V
I47L
R72T
W153R
K139P
K20L
A35G
K13T
I100F
R11N
R207L
Y144S
P52L
G45E
S123Y
I179S
I35G
Q207T
T4A
R139P
P133L
D179S
D39I
G18C
D177Q
N123Y
E211V
K65N
N207L
L168S
Q11N
S211V
E194V
Q91H
K103H
R35G
E40G
K35G
E44K
D17G
V5L
D218Y
I35A
I2L
R125S
A211V
A122T
D192E
T35G
H96R
C162R
E53Q
H207T
E36G
K172G
A114P
G213D
L187F
N32Q
D186N
G44Y
R39I
S200L
T35A
M184L
K36G
D113E
N192E
K126E
A139P
V50M
V10G
P4A
K123Y
R13T
D218H
T69Y
A33V
S215A
N39I
S139P
N81D
E102H
E102G
D207T
D215A
D169R
F116I
L35G
T200L
Y144D
G190C
D194A
W153G
I180L
K53Q
I132V
H218Y
Q211V
Q190C
H219T
T139P
K172T
T211V
R143K
E152D
R190C
N136T
R199G
R103Q
K197R
D44Y
Y121E
P150L
R211V
I50L
Q4A
D121E
V179L
N69Y
N103Q
Q122V
S190C
L149F
E123Y
K169R
R207T
M139P
K207G
G44K
I200L
K204N
E203Q
F144D
K219T
R102N
N113E
H96Y
L210Y
S163R
Q35G
R45E
K211V
K194A
Q204N
E179L
M211V
E200L
D53G
V202L
D169K
Y162R
N207T
V10I
V50L
R200L
K200L
D44K
L205F
K172I
I202L
D179F
D179L
I34F
I63V
V200L
T163R
A200L
N177Q
E39I
A207G
M41V
D204N
Y188H
S163C
P14A
D186E
K196Q
V215A
T215L
E177Q
L173Q
I106L
D179G
I180F
D69Y
Q197R
H207G
A179L
P157A
L149I
A121E
I173Q
T39K
N137H
T173Q
P39I
P140T
G39I
Q151L
P140Q
C162T
T135M
E53G
R122V
T107S
T4L
I181F
L188F
K43A
L35A
D121Y
I179E
M164I
E194A
L132V
D179E
K53G
D123G
N207H
I179F
N219T
E197R
H197R
A28G
M164L
G69P
R103H
Q207G
N175Y
G177Q
I179L
N103H
E215L
Q219T
V184L
E204N
T135R
T11R
I179G
G196A
A158T
E28G
Q173S
M106L
T69P
E219T
P122V
P197R
T7A
A207K
G196R
D207G
I173S
L34F
N69G
E39N
T139M
G196Q
A173Q
K70T
L193F
R196Q
P4L
L197R
C215I
S215E
E32R
N32R
S123G
Q151H
E194D
K39S
E28A
T58N
D215E
E196A
T39S
S117A
T69G
Q151K
R43A
N69P
R174E
T163C
E70G
K211S
G68N
S162N
Y162T
A122V
K22R
D169A
I106M
L210M
F210M
E196Q
L39K
E196R
R196K
A106L
L39S
G69A
L135R
N207G
L173S
N123G
N22R
L210F
T173S
R207G
S215L
Q4L
R28G
L173T
L135M
K28G
L215C
K64Y
A173S
H207K
E173Q
L207G
A39S
I69A
I31L
K169A
I173T
K70G
T75S
K123G
E101P
K103S
D215L
I122V
S139A
T165I
T215C
D39S
S215D
V215D
D36A
I39S
Q207K
C88S
W162N
M210S
A7P
K64N
T35I
G211N
N173Q
I2V
T7P
R139K
T69A L188H
D69P
E102N
E173S
N70G
N43A
K203V
L215I
H174E
S48T
K174E
Q197K
W88S
Q203V
T75A
S58N
S68N
R101P
F162N
K219E
V215E
A173T
M142T
Y64N
N174E
G162N
N173S
K28A
Q36A
V215L
G196K
D69G
E123G
Q174E
D207K
V202T
R39S
K135M
E215C
L34I
K39A
N70T
K173Q
N39S
S211N
I202T
D39K
M178V
T69E
A139K
N69A
K135R
D86E
R64Y
I135R
A211N
R64N
I135M
Q70G
T68N
D203V
R173Q
D67G
R70G
E40D
T215I
R68N
E211N
G177N
N173T
L210S
K138A
E203V
E196K
T139A
M211N
E48T
K35M
N39K
R28A
T70G
A35M
C162N
A69S
T67E
T162N
K173S
F210S
N69E
A106M
R207K
V74I
E197K
N207K
T211N
Q101P
K36D
R39K
R101E
E36D
S139K
T39A
K36A
Q211N
R211N
D69A
R173S
I180V
K211N
K142T
E173T
E36A
E39S
M139A
I35M
P197K
I178V
S215C
V135R
M184I
K101P
H197K
L178V
V135M
E215I
V75S
T142I
H64N
P39S
H64Y
S162D
T35M
N215I
D215C
L207K
R103S
T139K
N103S
E203D
Y162C
C121H
G162D
V111I
R162N
Y162N
D69E
L165I
D67E
Q101E
V75A
L35T
L142T
M75S V75L
R43N
Q142T
P4S
G69S
L197K
T69D
K173T
G39S
K102Q
W162D
Q4S
M139K
S215I
R173T
I69S
L39A
N102Q
K65R
V215C
E39K
T4S
M75A
K207E
L184I
L35I
H101P
M75L
D215I
F162D
N162D
T142V
E122K
C215F
I202V
L35M
T69N
C162D
D39A
Y121H
D121H
V75M
T69S
Q35M
L215F
P39K
G39K
T162D
S67E
K101E
H103S
G67E
G162A
A207E
H207E
Q203K
I142V
M142V
A135R
S162A
R102Q
V21IV75T
V184I
A121H
V215I
D162A
A135M
H67E
K49R
K43Q
D203K
S98A
F214L
W162A
V135I
R39A
R162D
I8V
I211N
C162A
K219R
N69S
K70R
A62V
E203K
D218E
L39T
N219E
I178M
N39A
Y162D
H101E
E190A
Q122E
N162A
M75T
Q142V
T215F
F162A
R162A
D69N
Q122K
Q207E
K142V
E102Q
L142V
R190A
D39T
S215T
L178M
K20R
G177E
D215T
D69S
T162A
D207E
R43Q
A158S
N123D
Y162A
G98S
H218E
K6E
A43E
S123D
R32K
G190A
F77L
T158S
E215F
H208Y
H219R
R173K
P62V
N215F
N219R
Q190A
M142I
E196G
E39A
N207E
L195I
Q219R
G138E
L207E
R207E
E219R
D6E
T195I
A35V
W162S
N70R
N43Q
P39A
G39A
C145Q
M35V
N39T
F116Y
S27T
S215FK43E
Y171F
V215T
D179V
R122E
K123D
N177D
R122K
I35V
S190A
E177D
F162S
L75I
N219K
R39T
D215F
D79E
V16M
R43E
P122E
M75I
K35V
T11KT75I
R35V
Q43E
P122K
R102K
E123D
R43K
K142I
T35V
L178I
A55P
D169E
R11K
I90V
I106V
A105S
I179V
C162S
M210W
A122E
I214L
Q138E
A11K
A122K
V74L
V215F
T16M
L165T
K169E
Q11K
R104K
G177D
N32K
N43K
T162S
N102K
S146Y
L35V
R152G
M41L
N43E
E32K
M106V
I214F
L142I
I122E
K138E
N134S
Q142IE30K
I122K
T215Y
V118I
R93G
A138E
E39T
I74L
S170P
L210W
E215Y
P27T
K42E
F210W
Q35V
L94I
R66K
C181Y
Y162S
C134S
N215Y
R162S
H182Q
G98A
A215Y
P39T
K172R
G39T
Q46K
E152G
C215Y
M195I
D67N
G134S
N104K
I215Y
L215Y
E102K
A106V
F215Y
I181Y
Q104K
V181Y
D215Y
S215Y
I100L
S67N
T67NL188Y
V109LE67N
I109L
F100L
G67N
V215Y
H67N
0.00
0
5
4
3
−log(prob)
2
0.6
3
0.2
Q151M
2
0.0
1
4
−log(prob)
6
6
8
T215F
T215Y
0
2
P14Q
T84I
R211W
I159T
L209S
K83E
R199E
T48Q
D218T
P97A
P97G
V111G
D86N
N175D
P52H
P150L
R104T
I35N
K200P
C162I
V35G
L209V
N123Y
R125K
N81Y
L187S
Q197N
K83G
R20T
T84S
G123T
D207S
I179L
K46T
D186H
H121V
K13S
E169N
N123T
Q211W
R20I
R104E
E200P
I35S
K83S
W88G
L187W
S117P
G177H
C88R
E32L
K28N
E194V
V75S
T84N
Y144S
Y146F
T128N
L187C
I178V
R83E
D110N
N175K
T7Q
I2L
T107K
S191T
N81K
Y162I
R20L
R65I
M35N
L193I
V10G
F61C
L197N
K83T
D121V
E122I
R83G
T27I
H207V
A36G
G190V
K22R
T27A
S3V
R102H
D36G
K154R
I35Q
K174L
N147H
A200M
D113E
Y146C
T48A
N32L
Q211I
F160L
P4A
I47V
E207S
D40Q
K20T
V184T
H162I
R206K
V180F
I5L
E122D
K103I
K28Q
A39M
V111L
R83S
K13E
K20I
H96Y
D207T
K70I
G190T
V60L
Q23R
K201E
I35A
D169N
K70N
V74M
E122R
A62S
Q207V
V148E
K82Q
V35K
A114P
K82M
R83T
W153R
K49Q
K20L
G207L
M16R
S191P
S191A
K6N
R11S
N32T
W88L
E101T
M35S
A98V
I35G
G98V
H197N
T7I
R104Q
N57K
P59A
G152D
G18C
R201M
W71R
Y208Q
K13N
Q161E
E207T
N177H
N147S
N147I
S139P
N6S
R102G
C181S
T139P
I184W
D113N
N162I
R101G
A207S
W88R
D179S
V184W
I167L
A39D
S134I
N103I
V50L
A138V
W212S
E36G
F116L
A35G
I32L
R122D
A190C
E196A
G207S
R143G
H96R
M178F
S3G
E35G
G18A
V10I
V179S
F116I
V118F
A158Q
R72K
K219I
K65N
Q211V
Q161P
K103E
A11S
Q91H
K49E
E203R
A162G
V148A
R35G
D123K
I200Q
E89G
Y146S
Q151H
E102H
S211W
K172I
K172T
K13T
A138D
G68T
Y144D
L165P
W162I
M35Q
I135K
R143T
F77S
G98E
T139Q
E123Y
E190C
G190C
Y56S
E169K
A62P
Y208N
I179F
W153G
A98E
E123T
E89K
W212L
I31M
S139Q
N6G
K6S
K6G
I165P
V202K
E48A
K43G
D40K
Q151R
N43G
T211V
I47L
I60L
L35G
H96P
M16T
I37V
P170L
R43G
M35G
C134I
K103Q
S200Q
E53Q
Q197R
T69Y
E44Y
E179M
N175H
E179S
I180F
Q207L
T70I
R211V
N147K
K66N
D186N
Q151K
I8G
D67Y
K204G
S3R
A207T
R139P
E211V
N70I
D44Y
A7I
N103E
I37L
S3I
H121N
C162H
S211I
I202K
S68Y
H207L
R11Q
F116S
K122Q
V179M
D28N
A62T
T107P
G143T
E44N
P4H
L34V
D204G
T163R
V202L
I181S
S173L
D179M
I159V
A211V
G207T
P4L
S158P
E102G
E179L
E179F
C162G
D121N
T27P
T142L
A200R
K200L
P14A
S69A
T69P
T39M
E138K
I202L
G45R
I200M
E179G
R101T
S200M
E36D
F171Y
Q23H
A33V
R196A
R102M
S134R
I106M
K68I
L197R
E200L
Q207S
M41F
S211V
A158GS68I
K201M
N103Q
K43T
D36K
E204N
P4T
N104E
E200Q
A139I
D179G
A11Q
E53G
K103T
R139Q
D186E
H207S
Q161H
K70E
V179G
Y208L
M35A
R43T
G190E
V135K
A36K
T122D
D179L
D179F
I100F
A190R
K28R
Q211D
K82N
K200Q
A28G
C134R
R173N
I142L
N6A
V179L
S163R
L31M
K220N
V184L
E190R
T39D
V135L
V5L
D204N
I132V
D69Y
I184L
L109I
V179F
S134G
E200M
R65N
A162T
L188H
H197R
C181F
A106L
G123K
I200R
N103T
D69P
I132L
V75T
C162N
Y162N
E194Q
E36K
G67Y
G190R
K219S
L193F
C162T
I135R
K6A
R104N
T173N
R143S
S68V
E194K
K204N
N123K
E200V
H162G
E67Y
Q207T
L178F
E39L
L34F
A204G
K43A
T173L
N207R
N175Y
E32R
K68V
D36A
H211V
K200M
I106L
Y162G
S139I
E200S
M178V
I34F
R173L
R101N
E102M
V135R
Q173N
N162G
D44K
K103H
R143K
E39M
H207T
N104Q
E194D
C134G
Y144F
S68C
R43A
A39K
M16V
K173N
K43R
E44K
V189I
T139R
E174N
T140Q
T158P
K68C
A173N
L200M
D121C
A138G
S200R
K211N
S69G
Q211A
E173N
H121C
I165L
A158P
A173L
K219D
I142T
H207G
D44G
D169K
N43A
T122I
G143K
G143S
K173L
S117A
C162Y
E207D
V75L
L135K
M41V
T139I
E200R
E44G
E45R
N32R
L210V
E197K
K28G
L210Y
E39N
N103H
A158T
I2V
S69E
I181F
R211A
T139M
R39M
W162G
I35K
E39D
A190Q
R122Q
V139I
D162G
V75A
E122Q
D39S
K101P
K70T
V202T
A138K
K211G
Q173L
R139M
W88C
E190Q
Y162D
I35L
Q197E
T69A
E36A
K28A
Q207G
H162T
N162T
S69I
K174E
H174R
K64Y
G190Q
E173L
A35K
I173L
K174N S68T
K200R
A39S
S211Q
R173S
H197L
E123K
I180V
E35K
C215H
S211D
N70E
S3C
L135R
K70G
Y162T
T122R
R39D
T69G
I202T
K207R
V184I
K64N
D207N
L39S
A139K
C162D
E174H
E207N
T107S
E6D
W88S
T173S
A204D
R35K
E70G
G207N
K68T
A207D
R139I
T69E
T142M
T207N
W162T
C188H
I173N
T211G
E211N
K39S
N121C
A173S
D162T
E200K
T139A
L178V
A207N
T27S
I173R
A204N
S139A
K43N
L35K
T39K
D67T
W162H
G207R
Y208F
T39S
M35K
A33G
T211N
K173S
A211N
K201R
D28G
H162D
A211G
Q211N
S211A
D215H
I142M
V111I
R211N
L210M
R211G
Q211G
K142M
R11T
K135R
C162A
D207R
E215H
L200R
S215H
D67E
E207R
G207D
T69I
D67S
K174H
R135M
A207R
D67H
L197E
K82R
E211G
E174R
W162N
M35L
T207R
M210F
V135M
L142M
S211G
Q197K
G207A
D69A
E101P
K219H
E173S
E39S
Q173S
K135M
T75L
M39S
D123S
S139K
S68R
L210S
S211N
I135M
L135M
K68R
G207K
D207K
N70G
E207K
N177G
M75S
I173S
P4S
T4S
R101P
T207K
Q101P
T215H
Q207N
N123S
K174R
A207K
R39S
Y115F
R139A
G123S
M75A
N70T
H197E
T139K
A11T
H207N
E39K
Y162A
C215V
D69G
R102Q
Q207R
I195L
L197K
M75L
H211G
T69S
G67T
V139K
E43Q
S210F
C215L
K64R
T39A
C215N
T75A
R83K
T70G
A98G
K103S
H207R
M35I
K68N
D69E
E67T
D121Y
S68N
H211N
T122Q
A190S
H121Y
Q207D
R39K
H162A
D218E
S215E
L210F
W162D
H207D
T69D
F142M
V135I
Q207A
E53D
N103S
W162Y
E203K
D215N
G67H
G67S
D69I
L4S
D67G
Q207K
R139K
A6D
H207A
E179I
D215L
E123S
N6D
G190S
E67H
S215N
E190S
N162A
E215L
H207K
T215N
E215N
E67S
K6D
T173K
S215L
Q211R
D215E
E203D
R190S
M75T
D179I
R122P
V35T
H197K
I200A
D44A
S215C
Q122P
S215V
A35T
D69S
N121Y
K122P
E44A
S200A
E35T
W162A
C215I
G123D
I202V
V179I
T215L
K219R
K203D
F77L
L75I
R39A
G190A
F214L
W162C
F116Y
E39A
G177E
T215E
I35TR43Q
A158S
A69N
D215C
V75I
M142V
N123D
R166K
T11K
A200T
D215V
E122P
K49R
A173K
E215V
R219E
R173K
L135I
K219N
T215S T215V
T142V
K43Q
E200A
E215I
D215I
N103K
I122P
E102Q
N64R
D179V
V118I
M35T
A98S
Q211K
M178I
L142V
M135T
Q219E
T69N
S215I
I142V
R43E
G98S
A28E
H174Q
I69N
T215DT215C
K200A
E49R
L35T
K43E
I200T
V122P
T211K
A62V
I35V
N177E
T122P
N43Q
S211R
R174Q
T215I
R211K
K196G
S69N
M75I
E211K
Q151M
K142V
L135T
V135T
P62V
E122K
R135T
A35V
R35T
E177D
A211K
E35V
T75I
R122K
K219E
F142V
K68G
I135T
K204E
I90V
S68G
K135T
R11K
I108V
S200T
R35V
S211K
N43E
I60V
D204E
E123D
R102K
E200T
Y215F
A162S
E174Q
L35V
A106V
R32K
E196G
R101K
M35V
Q102K
D40E
A138E
D69N
K28E
L178I
E101K
C162S
R104K
R65K
I214L
A105S
Q101K
T162S
Q176P
V50I
I173K
K200T
I184M
I106V
I8V
N177D
T48S
R196G
V184M
N32K
G177D
K172R
V74L
I32K
A11K
K219Q
K174Q
D79E
E32K
I165T
H162S
N6E
H211K
L188Y
K6E
W162S
A135T
C181Y
I74L
A204E
N102K
N70R
K70R
L165T
L200T
N166K
D28E
N104K
R93G
N162S
C145Q
Y162S
D162S
S58T
Y208H
E70RD67N E215F
F162S
F210W
E48S
T32K
T122K
I214F
S215F
I100L
D215F
C215F
I181Y
L210W
M75V
T140P
S137N
G70R
M210W
S210W
M41L
T70R
C188Y E102K
G67N
E67N
C215YS215Y
E215Y
D215Y
ddC
K207L
T35A
E36K
T58I
K11A
K11S
G207L
M41V
Y121E
T165A
F87C
T216H
E219T
K43T
H198P
K174L
N207L
W153S
K201N
R11N
Q91H
D192E
Y188F
R172S
R83S
K13S
W153R
I200V
E79K
L214S
I37F
S123T
N192E
L214I
Q211M
P7S
Y144H
P150L
G98E
D79K
H174L
M16I
R207L
I165A
D113E
N57K
M35S
L165K
K220Q
S48P
S139P
D194V
R43T
T200V
E123T
P119L
L187F
D218Y
R102G
Y121N
A162G
S150L
K83S
Q102G
A114P
S162R
K11N
V50L
F116I
F144D
N137I
D207L
Q207L
A139P
D39I
M139P
A7S
S48L
Q23H
I50L
S11N
E194V
A36G
T139P
G68Y
K172S
D44K
K102M
A122N
E102G
G194V
E29G
Q43T
I173Q
Y144D
K36G
E207L
G190R
S58I
V60L
D185E
I139P
Q122N
S200M
V142M
A169R
P122N
E36G
E42G
T39R
V10G
F124S
V179F
D17G
R45E
I37L
V189L
K122N
N43T
L168S
L193F
L215N
I7S
S68Y
C145H
F116S
N113E
F214S
A28D
D186H
I60L
G196Q
K13T
E173L
M16T
M39R
M164L
K207T
G99R
K194A
I63V
N162R
K43A
L12I
A44K
G143S
D39R
Q145H
T215N
C162R
T163C
A162H
I39R
L165A
L12F
N39R
D39M
I178F
M139I
R199S
I181F
K219T
R196Q
A39R
I35A
A173S
S69G
E44K
N57H
S39R
G45E
K211D
E102M
C215N
N173I
I100F
Q102M
I122N
G190T
F124L
D194A
Q219T
A190R
W153G
A105T
K139P
E169R
K169R
L41V
A69P
V16T
G194A
K174N
L210Y
G28D
I184L
R43A
G207T
S162T
E194D
G177N
Q207R
G190V
E215N
M200L
E194A
L34F K66N
E35A
R102M
F214I
R66N
S200L
D186E
Q197H
A200V
D207R
N207T
N173Q
S215N
G44K
E203G
A122V
K35A
R207T
P4L
E40K
R206K
K139I
I173A
R200M
R200L
A211E
K103H
D215N
D215E
S105T
Q219G
E40Q
K219G
D194Q
M184L
M35A
N69P
Q151K
V10I
L142M
P197H
G207K
Q122V
I34F
K207H
F210M
K135R
Q211D
F215I
A190T
K154E
T200M
K104N
V200M
V200L
I173S
I200M
W210M
E39L
M210S
Q173S
T135M
K39E
Q43A
D67H
R199K
H174N
Q207T
T69P
D215S
I200L
V215D
D194K
T200L
A200M
S162G
P122V
K122V
R162G
V35L
C162T
T162G
D207T
N103H
M35R
T142M
C188H
A190V
C162G
K39L
A200L
N43A
Y188H
G196K
E207T
R35A
V215E
H208F
G194Q
A69E
Q207D
S211E
E194Q
E207R
N123G
R197H
E203A
R39L
G203Q
N162T
R196K
K219D
M39L
V179D
A138K
D39L
N69E
R173L
R135M
T103H
R211A
T39L
S103H
V215N
I122V
N67T
L197H
E197H
K101R
D67T
G207H
S211A
E194K
G194K
L210M
E67T
D69P
G211E
E203Q
P4H
R211E
I142M
N207K
P197E
V215S
N162G
T211A
G67H
G211A
T173L
V106M
T69G
Q211E
R190E
S39L
T211E
A69G
E123S
A69I
I39L
Q173L
F188H
S139R
K139M
D67G
D203Q
I195L
D67S
N69G
D211E
H174K
G68R
I173L
R39E
V139R
K211E
N207H
A173L
R173E
N173A
Q211A
C162N
K135M
A39L
N173S
K142M
R101P
H175Y
T69E
N67S
E138K
H101R
T39E
A106M
R70G
T68R
S162H
R207H
N39L
D196K
V108I
A203Q
R162H
E67S
K203Q
T162H
M39E
N175Y
K101P
T215I
T173E
E101N
V203Q
S123G
A11T
K211A
C162H
T35L
C215I
T39K
R135T
I200T
A200I
P39L
E174R
D39E
S68R
G190E
N211E
E123G
G67T
T195L
A162Y
R207K
Q173E
I39E
D69G
N39E
N173L
D40F
N69I
G68N
I173E
D207H
H207A
M200K
A173E
C181V
E70G
Q207H
E207D
A36D
S200E
M39K
K174R
A139R
T139R
H162Y
N211A
E210S
D211A
A39E
F210S
D69E
M139RL210S
T69I
E215I
D39K
W162D
S200K
I35L
E36D
L215I
H174R
R200K
G67S
N174R
Q207K
R200E
I39K
N39K
E207H
N162H
A39K
S39E
D207K
K207A
I75T
S39K
K36D
S173L
T4S
Q138K
R11T
K123G
W210S
H101P
V200K
E101H
S68N
N173E
S215I
D215I
S11T
V200E
S48E
K104R
Q32E
V75L
A28K
A169D
K64H
L142V
M75L
V189I
G162D
L4S
N104R
E207K
V21I
R190S
I200E
I200K
T200E
A190E
T200K
K6D
V106I
R32E
K35L
V111I
A162D
I139R
E40F
K103R
K11T
E35L
I181V
P4S
E40D
D69I
V90I
A200K
N11T
A200E
T142V
T158S
P197K
M35L
S173E
G207A
K139R
N32E
N64H
T69S
R64H
H197K
E173K
N69S
Y162D
A106I
K135T
Q197K
Y64H
K40F
N207A
A69S
Q103R
R162Y
E101R
I35T
H162D
T162Y
K32E
S162A
S123N
E190S
E101P
N103R
L215F
G28K
R197K
S162Y
R207A
R162A
T162A
C162Y
D121H
V215I
K169D
C162A
E123N
A158S
E197K
E169D
E219N
I142V
K49R
R35L
L74I
I32E
I135V
L197K
S48T
K123N
S103R
H219N
C162D
Q207A
H188L
S162D
C188L
P62V
L75I
E121H
T69D
D207A
R101Q
V178L
E35T
D69S
L75T
R162D
T215F
K101Q
T101Q
K43E
G190S
C215F
K219N
N162A
Y121H
T103R
E207A
C121H
Q211K
L173K
R43E
K35T
K142V
Y171F
I8V
R135V
H208Y
V75M
G196E
E44D
T162D
V75T
Y188L
N162Y
K219R
M35T
N162D
K28E
T166R
E219R
Q219N
A43E
R196E
M75T
M178L
S68G
E215F
L135V
K219E
Q219E
T35V
N219R
A200T
T135V
E179I
D218E
M135V
H219R
F188L
H218E
F77L
Q219R
A190S F116Y
A62V
A135V
N123D
T43E
S215F
D215F
P101Q
I178L
N166R
K135V
V179I
Q166R
R35T
H101Q
N162S
D6E
F214L
K166R
A190G
E101Q
G44D
Q43E
V75I
T195I
I166R
D179I
I35V
R174Q
G177E
A122E
K65R
Q122E
N101Q
D196E
N177E
R22K
M75I
E177D
V5I
N43E
Q196E
E207Q
V118I
R173K
K122E
P122E
Q102K
C162S
F208Y
D79E
I165T
R102K
S27T
T173K
S98A
D67N
V215F T163S
S123D
K35V
E102K
P7T
E35V
I122E
E123D
Q173K
I173K
R82K
E174Q
A173K
V202I
M35V
A7T
T215Y
L74V
Y121D
S88W
K172R
A204E
S107T
R155G
D204E
P27T
G204E
K174Q
C88W
E215Y
A28E
N6E
R70K
L41M
K6E
V31I
N174Q
I74V
D215Y
V132I
K204E
S215Y
K123D
Q204E
G177D
N173K
I215Y
C215Y
N177D
G67N
C121D
L215Y
R35V
K125R
F215Y
S58T
G143R
E70K
I7T
L132I
H174Q
F100L
F181YV215Y
I109L
G28E
L31I
V181Y
Q46K
L47I
S173K
R30K
C163S
G148V
N204E
I184V
G98A
L165T
C181Y
M184V
E46K
I100L
I181Y
1
ZDV
P95Q
1
130
I54V
0.5
0.6
G94S
Q92E
N37C
V11S
K70Q
N88K
L63V
I85L
A22S
K57S
E34G
V15M
Q18T
R57S
V19A
H69T
L18T
S37I
I13R
Q19A
L19A
R8D
V11C
I85M
I15M
V13R
V19S
G16R
D65Q
T4I
R8P
N88T
N83K
I11C
Q69E
L97F
L19S
L24S
P19A
R8T
N69T
Q19S
T37I
K69E
K92E
E34I
A11C
R19A
I24S
N37I
K45N
R8L
L38F
L63C
D8P
E21G
N41T
E37I
S37Y
R70Q
P19S
A16R
P63H
R14E
D67W
K41TT71D
A71D
L19Q
L19V
R19S
R14Q
V82D
R6S
D8L
H61R
P12I
T37C
V11F
V63C
I82D
H69R
L10T T71G
P12E
T37Y
M36A A71G
L19P
D67L
L36A
E34A
I11F
I10M
Y67L
N85M
V64M
Q19P
S11C
N14E
L89I
N37Y
G51A
E7H
A11F
V13M
Q69T
I13M
K34I
N61R
C67LL10M
E37Y
V15L
R92L
N14Q
Y67W
N70Q
S67L
Q61R
T14E
N7H
K69T
K41N
S67W
I47V
C6S
R92H
I15L
C67W
N69R
N69Y
H69Y
T14Q
R92EK20L
V82G
S63Q
S63H A12N
N37E
E12N
L10R
A63V
I82GM46V
A63C
T12N
I20T
E34D
L63H
Q63T
K12N
T37E
L63Q
P12N
L63S
N83D
A22V
S12N
I63H
K34A
F95L
I72L
L24F
C63H
P63T
V63H
C67Y
D30N
R69Y
I63Q
Q61N
S11F
I12N
L64M
N61H
S74A
E35G
C63Q
V82M
R63H
V63Q
Q69K
G48M
D88S
N88S
F53Y
V63S
R63Q
Q61H
A63H
T74A
I82M
I63S
I24F
T71L
I50V
A71L
C63S
A63Q
M89I
S63T
D67F
T69Y
Q69Y
I72V
H63T
G73C
Y67F
G16E
K69Y
E34Q
I93L
D35G
I63T
M36V
L63T
Q69R
A63S
L36V
C37D
C67F
T72L
N35G
S67F
D12N
I85V
K35G
C63T
V63T
K69R
N61E
A63L
R63T
R43T
K34D
N88D
V89I
E72L
Q61E A63T
V19I
A16E
M20R
A71T
F10V
I62V
N69H
L24I
V71I
V33F
L19I
T72V
I10FE72V
K34Q
K45R
V82I
Q19I
T37N
V82S
S37D
V32I
I10V
V82F
Q58E
G48V
I82S
K74A
D60E
I82F
L10F
R19IP19I
S63P
I54A
L54M
L10V F53L G73T
V20T
L33F
V36I
V64I
N85V
T37D
Q35G
K92Q
M20T
T71I
A71I I54L I54M
M93LR20T
L46I
N37D
L63P
K20T
I43T
N60E
I37D
I33F K20R
A16G
Q39P
M64I
I63P
M93IE37D
S39P
V20I
C63P
R70K
V63P
R43K
R14K
R20I
M20IV46L
Q69H
R63P
L36I
M36I M46LK20I
T43K
D65E
A63P
K69HL64I
I41R
E7Q
G73S
T39P
R6W
K98N
I75V
N41R
V93L
N14K
N7Q
V46I T71V
K41R
K65E
M46I
M62I
A71V
F82A
M62V
V93I
T14K
N43K
H2Q
L90M V82A
R92Q L82A
I43K
L54V
I82A
C6W
F95C
0.0
0.1
0.2
0.3
I84V
I54V
0.4
change in log(rf)
Figure 6.7: Scatter plots between mutation scores of g2praw (x-axis) and g2pmixed (y-axis)
for different drugs. Mutations from the wild type amino acid to a mutation
listed by the IAS list (Johnson et al., 2008) for the corresponding drug class
are colored red.
6.3 Mutation Models
131
Moreover, the g2pmixed model excludes mutations that did not appear among resistant
sequences from the EuResist database by definition, since Rp,a0 = 0 ⇒ M (p, a, a0 ) = 0.
6.3.2 Mutation Probabilities Derived From Treatment Data
The previously introduced mutation models assume (to a varying extent) that mutations
conferring the largest increase of resistance as measured in vitro are selected first during treatment. Certainly, this assumption does not hold for real treatments. In fact, the
order of mutations can also be expected to be influenced by restrictions imposed by the
protein structure. For example, a mutation might result in an impossible side chain conformation and therefore requires a prior mutation in the vicinity that may relax the side
chain placement. In the case of the HIV integrase, it was suggested that different pathways may be induced by alternative binding conformations of the inhibitor to the target
protein (Loizidou et al., 2009). These circumstances are not respected by in vitro drug
resistance. A perfect mutation model should consider all these aspects. However, as briefly
mentioned in the previous section there are some rarely occurring mutations that confer
a large change in resistance. Unfortunately, the mechanisms underlying these rare events
are not yet understood and, in general, knowledge on how and why a certain mutation is
preferred over another one is rather scarce.
Instead of integrating all these different (and probably incomplete) sources of knowledge,
one can study the history of the virus for constructing a mutation model. The mutagenetic
trees introduced in Chapter 4 rely on the fact that in a large database of viral sequences
every step of a resistance pathway occurs sufficiently often. Briefly, if mutation X is rarely
observed alone but rather frequently in the presence of mutation Y , one can assume that Y
has to be acquired before X. The success of the mutagenetic trees demonstrates that relying
on large databases for deriving evolutionary models is a promising alternative. Obviously,
longitudinal sequence data from monotherapies constitute the ideal source of information
for estimating in vivo mutation probabilities. Monotherapies, however, are considered
insufficient regarding today’s standard of care and are therefore not enriched in large
databases. Nowadays standard therapies typically comprise multiple drugs from different
drug classes. The majority of these treatments, however, uses only one protease inhibitor
or one NNRTI, thus even modern therapies can be regarded as pseudo-monotherapies.
Precisely, from the approximately 100,000 therapies stored in the EuResist database
99.6% and 95.6% of the NNRTI- and PI-containing regimens, respectively, apply only
one representative of the corresponding class. In contrast, only 15.8% of the recorded
treatments are pseudo NRTI monotherapies. NRTIs are given mostly in combination and
consequently 73.5% of the NRTI regimens contain exactly two NRTIs. These statistics
suggest that in general it should be possible to deduce in vivo mutation probabilities for
individual drugs (PIs and NNRTIs) or at least for drug pairs (NRTIs) from large databases.
The bottleneck of this estimation is the availability of sequence data, as at least one
sequence is required before a treatment change (baseline genotype) and one during the
treatment (follow-up genotype). Figure 6.8 depicts a schematic description of the training
data requirements and a graph relating the time allowed between baseline genotype and
treatment start to the size of the resulting training set.
Beyond a threshold of 180 days, the amount of available training sequences grows only
132
6 Planning Sequences of HIV Therapies
start
Sequencing A
< x days
Sequecing B
stop
111111111111111111111111
000000000000000000000000
000000000000000000000000
111111111111111111111111
> 0 days
> 0 days
000000000000000000000000
111111111111111111111111
Therapy
training set size
4000
3000
2000
1000
0
0
100
200
300
400
500
600
700
days allowed between treatment start and baseline sequence
Figure 6.8: The top figure shows a schematic description of the training data requirements.
The baseline sequence has to be obtained at most x before start of the treatment, while the follow-up sequence may be obtained at any time during the
treatment. The lower graph depicts the relationship between x and the size of
the dataset.
slowly with respect to the time frame allowed between obtaining the baseline sequence and
treatment start. The downside of a large time frame is that the viral population is under
continuous selective pressure by the old treatment and thus might accumulate additional
mutations before start of the next treatment. From this point of view, a small time frame
is preferred. Here, however, the downside is clearly the lack of available sequence data.
For complying with the standard datum definition the cutoff of 90 days was selected and
gave rise to approximately 1,900 sequence pairs. For exploiting the wealth of the EuResist database a cutoff of 720 days was investigated as well. These datasets provide
for every treatment the applied drugs, the baseline sequence, the follow-up sequence, and
the three time points. For estimating the in vivo mutation rates, the available information for a sequence pair is represented as an itemset. On this dataset, itemset mining is
applied to discover interesting rules, e.g. rules in the form of “drug → mutation”. The
confidence of a rule serves as the mutation probability. The plain application of an itemset
mining algorithm unfortunately retrieves a number of artifacts, for instance, rules stating
that zidovudine causes NNRTI related mutations with a high confidence. These artifacts
mainly result from the co-administration of several compounds and have to be removed
in a postprocessing step. Furthermore, the approach assumes that RTIs do not influence
mutation probabilities of positions in the protease and vice versa. This allows to treat the
estimation of protease and RT mutations as two independent problems and consequently
reduces the amount of artifacts. The following two sections describe the itemset mining
6.3 Mutation Models
133
and subsequent filtering in detail. The resulting mutation models are termed vivo90 and
vivo720 .
Mining Frequent Mutations
In frequent itemset mining instances are represented as sets containing a subset of N
binary attributes termed items. Due to historic reasons, each instance is called transaction t ⊆ {i1 , i2 , . . . , iN } and a collection of transactions is referred to as database
D = {t1 , t2 , . . . , tM }. In our case, each drug applied in a regimen is represented by one item.
Furthermore, every difference between the baseline sequence and the follow-up sequence is
encoded as an item. For limiting the number of items to be considered in total, only the
mutations in the baseline sequence with respect to the reference HXB2 were encoded as
single items (instead of all positions of the baseline sequences). Ambiguities at positions
were resolved, i.e. a1 , . . . , aj at one position in the baseline sequence was encoded as j
items. The same holds true for encoding the differences between baseline and follow-up
sequence.
Given a large database, one wants to find sets of items that frequently occur together
in transactions, i.e. item sets whose support exceeds a certain threshold. The support of
an itemset X is defined as the number of times all items of X occur together in the same
transaction in the database: supp(X) = |{m|X ⊆ tm }|. Frequent itemsets can be used to
derive rules X1 ⇒ X2 , with X1 , X2 ⊂ X ∧ X1 ∩ X2 = ∅. The confidence of a rule is defined
as
supp(X)
conf(X1 ⇒ X2 ) =
.
supp(X1 )
These rules are termed association rules and can be interpreted as P (X|X1 ) (Agrawal
et al., 1993).
The best-known algorithm for finding frequent itemsets is the Apriori algorithm (Agrawal
and Srikant, 1994). Apriori exploits the fact that
X1 ⊆ X ⇒ supp(X1 ) ≥ supp(X).
(6.5)
In consequence, this means that if X1 is not considered a frequent itemset, then a set
containing X1 can never be a frequent itemset, and also that if X is a frequent itemset,
then all subsets of X are frequent itemsets, too. The algorithm limits the search space by
first finding all frequent n-itemsets (|X| = n) and based on these defines candidates for
frequent n + 1-itemsets. Precisely, if X1 , X2 are frequent n-itemsets and X1 ∩ X2 = n − 1,
then X1 ∪ X2 is a candidate for a frequent n + 1-itemset. The non-frequent candidate
sets are then discarded by checking if all subsets of size n are frequent. The candidate
generation and validation, however, can be quite costly in terms of computation.
The frequent pattern (FP) tree mining algorithm avoids the candidate generation step (Han
et al., 2004). The FP tree mining algorithm first represents the database D as an FP tree.
In essence, an FP tree is a prefix tree with counts of occurrences of a prefix stored in the
node representing that prefix. For efficient tree traversal the FP tree maintains an item
header table pointing to an item’s occurrences in the tree via a linked list. Figure 6.9
depicts an example comprising nine transactions and the resulting FP tree including its
header table.
134
6 Planning Sequences of HIV Therapies
List of itemIDs
{I1, I2, I5}
{I2, I4}
{I2, I3}
{I1, I2, I4}
{I1, I3}
{I2, I3}
{I1, I3}
{I1, I2, I3, I5}
{I1, I2, I3}
reordered list Item ID Support Node−link
(I2, I1, I5)
(I2, I4)
I2 7
(I2, I3)
I1 6
(I2, I1, I4)
I3 6
I1:4
(I1, I3)
I4 2
(I2, I3)
I5:1
I5 2
(I1, I3)
I3:2
(I2, I1, I3, I5)
(I2, I1, I3)
null {}
I2:7
I1:2
I3:2
I4:1
I3:2
I4:1
I5:1
Figure 6.9: The left table lists a database comprising 9 transactions. In the right column
the items I1,. . . ,I5 are ordered with respect to their frequency in the database.
The FP tree represents exactly this database. The header table allows for an
efficient retrieval of all occurrences of an item within the tree. This is needed to
construct the conditional pattern base. The colored edges indicate the links one
has to traverse for constructing the conditional pattern base for I3. Example
adapted from Kamber and Han (2001).
For the FP tree construction the items within every transactions are first ordered in
decreasing order of their support in the whole database. Of note, after the reordering the
itemsets are no longer sets but item tuples. The root node of the tree is labeled with the
empty set, then each transaction tm is inserted into the tree in the following way: starting
from the root the tree is traversed following the order imposed by the items in transaction
tm , if no branch exists then it is created and the counter at the target node is initialized
with 0, after the last item of tm the count on each visited node is increased by one. The task
of finding frequent patterns in the database is now transformed to mine frequent patterns
in the FP tree.
The FP tree is mined recursively. The process starts with constructing the conditional
pattern base for each frequent 1-itemset. The frequent 1-itemsets can easily be determined
from the header table. The conditional pattern base for an item in is simply given by the
itemsets on all paths from nodes labeled with in to the root node with the support set to the
count stored in the in node. For example, the conditional pattern base for I3 in Figure 6.9
is {(I2 I1: 2), (I2: 2), (I1: 2)} as indicated by the colored solid edges. The occurrences
of I3 in the tree can easily be found by following the linked list (colored dashed edges).
Another FP tree is constructed for the conditional pattern base and mined for frequent
patterns. The end of the recursion is reached when either the FP tree for the conditional
pattern base is empty, i.e. comprises only the root node, or the FP tree comprises only
a single path. In the latter case, all combinations of nodes on that single path are (parts
of) a frequent pattern. Algorithm 5 provides pseudo code for the recursive FPgrowth
function (adapted from Kamber and Han (2001)).
Due to the excessive computational requirements of Apriori, the FP tree mining was
used to find frequent mutations caused by protease and reverse transcriptase inhibitors. A
pattern was considered frequent in the case of protease mutations and reverse transcriptase
mutations if it occurred 12 and 20 times, respectively. These thresholds were manually set
6.3 Mutation Models
135
Algorithm 5 FPgrowth(Tree, α)
if Tree contains a single path P then
for each combination (denoted as β) of the nodes in path P do
generate pattern β ∪ α with supp(β ∪ α) = minimum support of any node in β
end for
5: else
for each αi in the header of Tree do
generate pattern β = αi ∪ α with supp(β) = supp(αi )
construct β’s conditional pattern base and β’s conditional FP_tree: Tree β
if Tree β 6= ∅ then
10:
FPgrowth(Tree β , β)
end if
end for
end if
and represent a trade-off between robustness of mined rules and the number of distinct
rules. For estimating the mutation probability and its variance 1,000 bootstrap samples of
the original dataset were generated.
Postprocessing of Mutation Rules
For protease and reverse transcriptase, executing the FP tree mining algorithm results in
3323 and 2283 (61,498 and 24,968) frequent patterns on average per bootstrap replicate, respectively, using a cutoff of 90 (720) days. A large fraction of these frequent patterns is not
of interest for our application. Combining results from the bootstrap replicates by taking
the union of the discovered patterns and limiting the list of frequent patterns to itemsets
containing at least a drug and one additional mutation (i.e. difference between baseline
sequence and follow-up sequence) yields 171,993 and 21,124 (4,844,840 and 217,042) potentially interesting patterns for protease and RT, respectively, using the 90 (720) days
cutoff. Figure 6.10 a) depicts the relationship between the median support (x-axis) and
the number of bootstrap replicates, in which a pattern was considered frequent (y-axis) for
the protease dataset using the 720 days cutoff. As expected, the majority of the patterns
was selected only in very few bootstrap replicates (Figure 6.10 b)), e.g 105,032 and 4117
patterns were selected once and at least 200 times, respectively, on the protease 90 days
dataset. However, at first, all frequent patterns were rewritten as rules. In general, a
P
n
frequent n-itemset can be written as n−1
i=1 i different rules. In our application, however,
there is only one meaningful rule per pattern. Precisely, an item of a pattern is either a
compound, a baseline mutation, or an additional mutation, thus the only interesting rule
has drugs and baseline mutations on the left side and additional mutations on the right
side. This allows for interpreting the confidence of a rule as the probability of developing
a certain mutation (or a list of mutations) given a drug and a set of baseline mutations.
Unfortunately, the application of an association rule mining algorithm on our kind of data
generates a large number of artifact rules originating from the frequent co-administration
of drugs. For example, since the year 2000 the activity of protease inhibitors is increased
by administering a boosting dose of ritanovir (RTV) that occupies the cytochrome P450-
136
6 Planning Sequences of HIV Therapies
(a) support vs. bootstrap selection
(b) histogram bootstrap selection
Figure 6.10: Scatter plot between the median support and the times a pattern was selected
in a bootstrap replicate (a). Histogram of the times a pattern was selected
from the 1000 bootstrap replicates (b). Both figures are derived from the
protease 720 days cutoff dataset.
mediated metabolism in the human liver (Kumar et al., 1996) and therefore allows improved
plasma levels of drugs that undergo the same metabolism (typically protease inhibitors).
As a consequence, today RTV (or RTVb for boosting dose) is part of every PI containing
regimen and indeed, approximately 70% of PI containing treatments in the dataset list
RTV as one of the drugs, thus many mined rules claim that RTV is the cause for certain
mutations. This is most certainly not the case, however, and corresponding rules have to
be filtered out. For the reverse transcriptase there exist similar cases, lamivudine (3TC)
for example is used in approximately 60% of the NRTI containing therapies and a number
of rules accordingly blame 3TC for causing a great variety of mutations. In fact, 3TC is
only associated with mutations at positions 65 and 184 (Johnson et al., 2008). On the
other hand, other RTIs are connected with the appearance of mutations at position 184
since due to the relation in Eqn. 6.5 all frequent itemsets containing 3TC imply a frequent
itemset without 3TC. This example demonstrates that inferring which RTI causes which
RT mutation is a challenge. The subsequent application of the mutation models, however,
does not require the estimation of models for individual drugs. Especially the models for
RTIs will have to be combined, while the PI models are independent to a greater or lesser
extent. Therefore, solving the inference of causality for RTI models can be avoided by
estimating models for combinations of RTIs directly. The challenges for a postprocessing
filter differ between the drug targets. The following two paragraphs describe the filtering
steps that were applied to remove most of the artifacts from the list of protease and
reverse transcriptase mutation rules. In the following the left side and right side of a rule
are denoted by Xl and Xr , respectively. The set of PIs and RTIs is denoted by Cpro and
6.3 Mutation Models
137
Crt , respectively, with C = Cpro ∪ Crt ∪ “empty00 . The set of all baseline and additional
mutations is denoted by Ebase and Eadd , respectively.
All rules containing RTV as the only PI are most likely artifacts
and are therefore filtered: |RTV ∩ Xl | = 1 ∧ |Cpro ∩ Xl | = 1; the first part of the expression
checks whether RTV is on the left side of the rule, while the second expression checks
whether there is only one PI listed on the left side of the rule. Moreover, since not all
regimens contain a PI, there are many transactions for the protease task that contain the
item “empty00 . These transactions represent pseudo-treatment breaks for the protease.
Rules with “empty” as only compound on the left side are interesting, since they provide
information on how resistance mutations disappear from the predominant viral population
when there is no selective pressure on the protease. For this task, however, these rules
constitute an artifact and have to be removed: |“empty00 ∩ Xl | = 1. In the last step, all
Patterns (Xl ∪ Xr ) that were not selected a sufficient number of times in the bootstrap
analysis were discarded.
Protease Rules Filter:
In a first step, all rules that only contain one drug
are discarded (|Xl ∩ C| = 1), this also includes the rare case of “empty” as the only listed
compound. In contrast, all rules comprising three or more drugs are retained (|Xl ∩ Crt | >
2). Finally, all rules comprising exactly two RTIs have to undergo a further filtering
step. Briefly, if two drugs are said to cause a certain mutation, then there is the chance
that actually a third drug (not part of the rule) is the true cause. This case obviously
occurs when three drugs are frequently administered together. For example, Atripla is
available as a single pill and comprises three compounds, two NRTIs (FTC+TDF) and one
NNRTI (EFV). As a consequence one obtains the artifact rule {FTC, TDF} ⇒ {K103N}.
The mutation is clearly an NNRTI related mutation and therefore most likely caused by
EFV. Given a rule Xl ⇒ Xr containing exactly two drugs {c1 , c2 } = Xl ∩ Crt , then
0
= Xl − {c1 , c2 } is the set of baseline mutations. The filter checks for every RTI
Ebase
0
that is not part of the rule x ∈ Crt − {c1 , c2 } whether the rules {c1 , x} ∪ Ebase
⇒ Xr and
0
{c2 , x}∪Ebase ⇒ Xr exist. If for any drug x both rules exist and both exhibit a significantly
higher confidence than the original rule, then the original rule is discarded. For assessing
whether the difference in confidence is significant a t-test based on mean and standard
deviation of the confidence obtained from the bootstrap replicates can be applied. In
accordance with the protease filter, all patterns that were not selected a sufficient number
of times in the bootstrap analysis were discarded as well.
Reverse Transcriptase Rules Filter:
From Rules to Mutation Models
Applying the described filters with a bootstrap selection threshold of 20% for both sets
results in 1188 and 4541 (12,572 and 3861) rules for protease and RT, respectively, using
the 90 (720) days cutoff. These rules have two major advantages over the mutation scores
derived from in vitro measured resistance. Firstly, the rules reflect the mutation rates in
vivo, and secondly, the presence of baseline mutations on the left side of the rules allows to
model mutation rates in dependence of pre-existing mutations, rather than independently.
1
10% minimum selection threshold
138
6 Planning Sequences of HIV Therapies
An interesting example for rules obtained with the 720 day cutoff is given by the rather
novel protease inhibitor tipranavir (TPV). The rule mining discovered three major TPV
related protease mutations 33F, 82T, and 84V, with probability 0.27, 0.63, and 0.38, respectively. The mutation at position 82, however, does not occur from the wild type amino
acid (Valine), but from a mutant (Alanine) that was probably selected during previous
treatment with another PI. For this mutation there exist two rules in the list: {TPV}
⇒ {A82T} and {TPV, 82A} ⇒ {A82T} with confidence 0.28 and 0.63, respectively. For
constructing the mutation model one has to take into account that A82T is not a mutation from wild type and therefore use the confidence of the rule listing the corresponding
baseline mutation on the left side.
While in theory one can construct transducers that represent mutation models with an
arbitrary number of baseline and additional mutations, their construction algorithm can
be rather complex. For instance, in order to model the mutation probability in presence
of baseline mutations, the mutation transducer must have a path that is different from the
two states in Figure 6.3 and comprises at the position of the baseline mutation only the
corresponding amino acid. Moreover, one has to distinguish between baseline mutations
that occur before and after the follow-up mutation in the sequence (regarding the alignment
position). For example, for extending the mutation transducer in Figure 6.4 with differences
in the mutation probability of position two in the presence of mutations at position one
or three, one has to introduce two new states (2 and 3). One new edge connects the
initial state with state 2, its input and output label is “PRO_1A” and the weight is equal
to the mutation probability of position 2 in presence of 1A. State 2 and the final state
are connected via an edge with input label and output label “PRO_2I” and “PRO_2C”,
respectively. The same labeling is used for the edge between the initial state and state 3.
Finally, state 3 is connected to the final state using an edge with input and output label
“PRO_3E” and has the altered mutation probability as weight. Moreover, in theory, states
2 and 3 need loop edges for all alignment positions between the baseline and the follow-up
mutation. As can be easily seen, the size of the transducer grows rapidly with respect to
the number of baseline mutations that are represented. For practical reasons we restrict
the transducer construction to rules that provide the probability of a single mutation, i.e.
|Xr | = 1, and with at most one baseline mutation. The construction algorithm is simplified
0
by the fact that for the vast majority of rules conf(Xl ⇒ Xr ) ≤ conf(Xl ∪ Ebase
⇒ Xr )
holds (see Figure 6.11 a)). Thus, one simply has to add an alternative path featuring the
elevated probability in addition to the normal mutation probability. In accordance with
in vitro mutation models we define mutation models derived from treatment data as
Md,m (p, a, a0 ) = max{conf({d, m} ⇒ {apa0 }), conf({d, m, pa} ⇒ {apa0 })},
where m is a single baseline mutation, and d is either a single drug (like above) or a
combination of drugs. Hence, for complete treatments we have
Y
Mt,m (p, a, a0 ) =
Mt0 ,m (p, a, a0 )−k ,
t0 ⊆t
with k = |{t0 |t0 ⊆ t ∧ Mt0 (p, a, a0 ) 6= 0}|.
Figure 6.11 b) depicts a the number of rules per PI derived from the 720 days cutoff
dataset. The figure also clearly demonstrates the existence of a bias towards frequently
6.4 Validation
(a) effect of baseline mutations
139
(b) rules per protease inhibitor
Figure 6.11: Scatter plot between the probability for one mutation with and without the
presence of baseline mutations given the same drug (a). Final number of rules
for different protease inhibitors, different grey scales indicate the number of
baseline mutations in those rules (b). Both figures were derived from the
protease 720 days cutoff dataset.
used drugs that are obviously captured better by the model. For example, the majority of
rules describe mutation development under LPV/r containing treatments. Moreover, the
figure indicates that by restricting the mutation models to at most one baseline mutation
the majority of discovered rules is exploited (except for LPV).
6.4 Validation
Finding a validation scenario for an FDO score is a challenging task. In contrast to the
prediction of treatment response where one has the viral load as a direct measure, there
is no useful covariate in the HIV treatment databases, which necessarily has to correlate
with a good FDO. For instance, it is unlikely that a suboptimal choice regarding FDOs in
the first treatment will significantly shorten the patient’s life as so many other factors most
likely have a more observable impact, e.g. adherence. Moreover, the group of patients that
are followed from their first therapy till death is rather small and most likely did only have
very limited alternatives for the first treatment.
Nonetheless, there are possibilities for (at least) validating the mutation models alone.
The following two sections describe approaches for estimating the quality of the introduced
mutation models.
140
6 Planning Sequences of HIV Therapies
6.4.1 Prediction of Response to the Next But One Treatment
In previous chapters we aimed at predicting the virological response to the patient’s upcoming treatment. To this end we extracted retrospective treatment change episodes from
a database collecting routine treatment data. Then, statistical learning was used to correlate information about the virus and the treatment to the patient’s virological response
to the antiretroviral combination treatment. The availability of a mutation models allowing to evolve the viral population during a treatment makes a modification of this initial
setting possible. Knowing the genotype of a patient shortly before one treatment, one can
estimate the additional mutations at end of that treatment, and predict response to the
subsequent treatment based on this simulated genetic makeup of the viral population. This
scenario allows to study the impact of parameters of the mutation models. First, one can
study which mutation model provides the basis for the best predictions. Second, one can
study whether a fixed or a variable number of newly introduced mutations provides better
results. Third, one can assess the benefit of using the n-most likely results of the in silico
evolution over the single best variant. The following sections provide the experimental
setup, the results, and the discussion of this validation scenario.
Experimental Setup
Figure 6.12 illustrates the requirements for TCEs that are used for this validation. Briefly,
each TCE comprises two treatment changes. A genotype has to be available at most
90 days prior to the start of treatment A. In analogy to the EuResist standard datum
definition for assessing response to the treatment, a baseline viral load and follow-up viral
load have to be available at most 90 days before and about eight weeks (4 to 12 weeks)
after onset of treatment B, respectively. Application of this modified TCE definition to the
EuResist integrated database (release 2008/10/10) yields 1952 instances. A treatment
success is defined by suppressing the viral load below the limit of detection (400 copies of
viral RNA per ml blood) or achieving at least a 100-fold reduction compared to the baseline
value. Of note, the response to treatment A is of no interest, thus an instance is considered
successful if treatment B is a successful treatment. Using this definition, 1395 instances
are considered to be successful and 557 are considered treatment failures. The instances
selected for this validation overlap with the instances used for training the EuResist
prediction model. Precisely, 1428 TCEs and 350 TCEs of the EuResist training data
correspond to treatment A and treatment B, respectively. Since we predict neither the
response to treatment A nor use the real viral genotype obtained shortly before treatment
B for our predictions, we can consider this dataset as an independent test set.
Each of the sequences in the dataset was represented by a linear acceptor. Ambiguities in the amino acid sequence were encoded as alternative edges between two states. In
a first step only those amino acids in ambiguities conferring the most resistance against
the current treatment were retained (using a rating transducer as in Figure 6.3). Then,
novel mutations were introduced into the sequence by applying all five mutation models
described above. The number of newly introduced mutations is a parameter. We follow
three strategies for setting the number of mutations. The first strategy introduced a fixed
number of mutations into each sequences and we vary this number from 0 to 10, with 0 representing the reference model that does not introduce any mutation. The second approach
6.4 Validation
start
Sequencing
< 90 days
start
VL
111111111111111111111111
000000000000000000000000
000000000000000000000000
111111111111111111111111
> 14 days
000000000000000000000000
111111111111111111111111
< 90 days
> 0 days
VL < 400 cp/ml ?
141
stop
111111111111111111111111
000000000000000000000000
000000000000000000000000
111111111111111111111111
> 0 days
28 days
56 days
000000000000000000000000
111111111111111111111111
Therapy A
Therapy B
stop
Figure 6.12: Modified TCE definition. Every TCE comprises two treatments. The sequence had to be obtained at most 90 days before treatment A. A baseline
viral load measurement had to be available at most 90 days before treatment
B and a follow-up viral load measure had to be available within 4 to 12 weeks
after onset of the therapy. A TCE is successful if treatment B is successful,
i.e. viral load is reduced 100-fold or below 400 copies of viral RNA per ml
(limit of detection).
produces a number of mutations that depends on the composition of the antiretroviral
therapy. More precisely, the number of mutations is simply based on the average potency
of the regimen. Here, we use the instantaneous inhibitory potential (IIP) after 24 hours
of the maximal concentration of the drug in the plasma (IIP24 ) as defined by Shen et al.
(2008). The IIP24 ranges from 0 for d4T to 8.4 for DRV/r. The number of mutations
based on the average potency pot is computed by
n(pot) = dα · exp(1 − pot)e, with α ∈ {1, . . . , 5}.
(6.6)
If n(pot) is less than 1 or exceeds 10 mutations, then n(pot) is set to 1 or 10, respectively.
The parameter α controls the increase of mutations with the decrease of average potency.
The exponential function is used to realize a convex function.
The third strategy originates from the fact that an effective treatment gives the virus
fewer opportunities to mutate. Thus, the number of mutations is determined by the predicted success probability of treatment A (pA ). Precisely, the number of mutations is
calculated as
n0 (pA ) = dα(exp(1 − pA ) − 1)e, with α ∈ {1, . . . , 5}.
(6.7)
Again, the exponential function is applied for realizing a convex function.
Finally, we use a combination of potency and predicted success – the effect – for estimating the number of mutations. Precisely, we apply Eqn. 6.6 to pot · pA instead of pot
alone.
For models with varying number of mutations we use two different baselines approaches.
The first approach generates a random number of mutations that is uniformly sampled
from 1 to 10 (randomA). The second method generates a random number of mutations
that follows the distribution of the pA -based model. This is achieved by randomly shuffling
the number of mutations computed with Eqn. 6.7 (randomB). For both random models
the provided performance measure is the mean of 100 repetitions.
For each of these models we compute the list of 10,000 variants receiving the highest
score according to the mutation model.
Our prediction model, which we contributed to the EuResist prediction engine (5.2), is
used to predict the response of the 10,000 most likely outcomes of the simulated evolution
142
6 Planning Sequences of HIV Therapies
g2praw
g2ptransition
g2pmixed
baseline
0.757
fixed number
1
0.757
0.764
0.762
2
0.749
0.755
0.756
3
0.724
0.735
0.741
4
0.702
0.714
0.725
5
0.694
0.699
0.710
6
0.692
0.690
0.696
7
0.690
0.674
0.683
8
0.687
0.660
0.671
9
0.686
0.651
0.660
10
0.684
0.645
0.651
dependent on current treatment composition
randomA 0.694
0.687
0.690
randomB 0.737
0.738
0.741
potency
0.759
0.768
0.768
success
0.766
0.767
0.768
effect
0.762
0.769
0.768
vivo90
vivo720
0.749
0.743
0.738
0.728
0.718
0.710
0.703
0.697
0.693
0.689
0.757
0.756
0.750
0.735
0.721
0.711
0.705
0.701
0.696
n/a
0.707
0.737
0.757
0.755
0.759
0.717
0.745
0.764
0.765
0.768
Table 6.1: Performance for different mutation models and mutation numbers measured in
AUC.
to the drugs in treatment B. The predicted success for treatment B is the weighted mean
of the top n of these 10,000 predictions
1
Pn
n
X
j=i sj i=1
s i pi ,
(6.8)
where si and pi are the mutation score and the predicted success probability of the ith
variant, respectively. For g2praw si corresponds to the increase in resistance, and for all
other models si equals the real probability (not the negative decadic logarithm).
Results
Table 6.1 lists the AUC values that were achieved in this setting with all five mutation
models. None of the models manages to substantially improve over the baseline (no mutations) and only subtle differences are visible between the mutation models. A general
trend is the decline of prediction performance with increased fixed number of mutations.
Consequently, the introduction of only one mutations provides the highest AUCs.
The introduction of a variable number of mutations depending on the treatment composition of regimen A provides a slight improvement over the baseline (no mutations).
Moreover, all three proposed ways of estimating the number of mutations improve substantially over models generating a random number of mutations. Here, the uniform sampling
provides the worst results and is probably an underestimation of the baseline.
6.4 Validation
143
Still, none of the mutation models paired with an approach to estimate the number of
mutations provides a substantial improvement over the baseline. One reason for this observation is doubtless the large fraction of treatments having a low success rate with respect
to regimen B even before regimen A caused changes in the viral population. Figure 6.13 a)
depicts the distribution of predicted success rate according to regimen B (pB ) at baseline
for the viral sequence obtained before onset of regimen A. Interestingly, the distribution
resembles the one that can be expected for pA as seen in Figure 5.7 a). It becomes obvious
that a large number of treatments failing regimen B had a bad perspective of succeeding
even before regimen A was administered. Hence, the baseline of zero mutations is almost
insurmountable and little improvement is visible by the mutation models. For a fair comparison of the mutation models, the distributions of the predicted success rate in the failing
and succeeding groups should be identical for regimen B before onset of regimen A.
In order to have an ideal starting point for the validation of the mutation models, the
distribution of pB should be identical in failing and succeeding cases. Thus, for studying
a distribution of pB that is more similar in failing and succeeding cases, we applied a
threshold on pB and considered only cases exceeding that threshold. The cutoff was varied
from 0.0 (all cases) to 0.9 in steps of 0.1. Moreover, ambiguous failures (i.e. treatments
labeled as failure but achieving a VL reduction below 500 copies at least once during the
treatment; see Section 5.3) were removed for obtaining a more robust quality assessment.
Figure 6.13 c) displays the decline in AUC for increasing cutoffs of pB for the baseline
(no mutations), different mutation models with variable number of mutations based on
pA , and both random number of mutations models (using g2pmixed ). This setting reveals
qualitative differences among the g2p-based mutation models and the mutation models
derived from treatment data. All three g2p-based models perform the best and achieve an
AUC of approximately 0.630 compared to the baseline of 0.547 at a pB cutoff of 0.8. At this
threshold, the vivo90 model is indistinguishable from the baseline with an AUC of 0.556,
while the vivo720 model reaches a slightly better AUC of 0.602 and places itself between the
baseline and the g2p-based models. The results also demonstrate the importance of linking
the right number of mutations to the right treatment. The performance of randomB is
very close to the baseline, while g2pmixed with pA -based number of mutations demonstrates
clearly superior prediction accuracy. Potency- or effect-based computation of the number
of mutations produces similar curves (data not shown). Of note, only 10 cases were labeled
as failure when the threshold of 0.9 was applied. Due to the low number of failing cases,
results obtained using the highest cutoff cannot be considered reliable. For instance, the
increase observed for both random models when moving from the 0.8 to the 0.9 cutoff are
most likely an artifact.
For investigating the impact of the number of most likely viral variants considered for
the computation of the success for the next but one regimen, we chose the pB cutoff of 0.8,
the g2pmixed model, pA -based number of mutations, and varied the number of considered
in silico variants: {1, 10, 100, 1000, 10000}. Figure 6.13 b) depicts the corresponding ROC
curves. Clearly, it is beneficial to consider more than the single most likely variant: the
top 10 yield already an improvement of 0.03 in AUC. Nevertheless, no improvement was
observed beyond the top 1000, which increased the AUC by 0.051 compared to the single
most likely variant.
6 Planning Sequences of HIV Therapies
0.6
0.4
1 (0.58)
10 (0.61)
100 (0.62)
1000 (0.631)
10000 (0.631)
0.0
0.2
True positive rate
0.8
1.0
144
0.0
0.2
0.4
0.6
0.8
1.0
False positive rate
(a) Distribution of pB
0.8
(b) Impact of varying top n variants
●
●
●
0.7
●
●
AUC
●
0.5
0.6
●
baseline
g2p raw
g2p mix
g2p trans
prob 90
prob 720
randomA
randomB
●
●
●
●
0.0
0.2
0.4
0.6
0.8
success cutoff
(c) Development of AUC
Figure 6.13: (a) Distribution of pB before onset of regimen A in failing treatments (red)
and successful treatments (green). (b) Change of AUC depending on the
considered top n viral variants. AUC computed with g2pmix mutation model
at cutoff 0.8 using. (c) Development of AUC after restricting the data in terms
of pB . Apart for the random models, the number of mutation was based on
pA .
6.4 Validation
145
Discussion
The results demonstrated that it is possible to infer response to the treatment following
the next treatment on the basis of the current genotype. Unfortunately for our validation,
the baseline (i.e. no additional mutations) provided already a very good prediction performance. Indeed, the performance was close to AUC values observed for predicting response
to the next antiretroviral regimen (see results in Chapter 5). The high performance was
the result of a large number of cases that had bad perspectives for succeeding with regimen
B even before regimen A caused additional mutations. Consequently, the mutation models
showed only little improvement.
A scenario that is more suitable for assessing the quality of the mutation models is
the separation of failing and succeeding regimens that had the same (or at least similar)
distribution of success rate pB before the onset of regimen A. Here, mutations selected by
regimen A are the cause for the later failure, and the mutation models should allow for a
separation. The removal of all instances with pB below a certain cutoff served as way for
matching the distributions of pB in successful and failing regimens.
The altered scenario revealed differences between mutation models: g2p-based models
generally outperform the models estimated from treatment data. Moreover, the importance
of assigning the right number of mutations to the treatments was made evident and we
could demonstrate a clear improvement over the baseline (no mutations). Of note, the
performance of the baseline can be seen as a measure of dissimilarity of the distribution of
pB in both groups (if the distributions were identical, then the baseline would achieve an
AUC of 0.5).
6.4.2 Comparing Predicted Future Drug Options to Data From Real Cases
The second validation scenario aims at comparing predicted future drug options after a
treatment to real future drug options observed at treatment failure in patients. In order
to avoid problems with archived drug resistance mutations in patients, which had already
received antiretroviral therapies, this scenario is restricted to treatment naïve patients.
That restriction has another advantage, as it allows us to assume that the virus before
the treatment is a wild type virus and therefore we only need to retrieve sequences of the
patient at end of the first treatment serving for the computation of future drug options.
Consequently, this maximizes the amount of available data for this analysis. The simplification bears also one risk: by assuming that a patient was infected with a wild type virus,
we ignore the existence of conferred drug resistance mutations.
Validation Setup
The EuResist database was queried for HIV genotypes that were obtained during the
first treatment the patient ever received. More precisely, the sequence comprising RT
and protease had to be obtained 30 days after start earliest and 14 days before stop of
the treatment latest, respectively. As argued earlier (Section 4.2), the ability to sequence
the virus while the patient is on antiretroviral treatment indicates virological failure. In
order to ensure that the patient indeed received the first treatment, it was required that all
treatments, which the patient had ever received, were stored in the database. Table 6.2 lists
146
6 Planning Sequences of HIV Therapies
all antiretroviral treatments that met the requirements and occur at least three times. For
all sequences fold-change in resistance against 17 drugs was computed with geno2pheno
and the activity score for each drug was determined (Section 3.2). The FDO was simply
defined as the sum of the activity for all 17 drugs, thus the FDO was in the range of [0, 17].
In addition to the FDO for all drugs, drug class specific FDOs were calculated.
Unfortunately, a number of sequences (about 5.9%) suggested complete resistance against
a large number of drugs from all classes. Given that the sequences were obtained at the
end of the first antiretroviral treatment, this was an unexpected result and most likely originates from faulty entries in the database or cases of transmitted drugs resistance. Hence,
we excluded all cases that exhibit an FDO score of 5.0 or less from the analysis. Table 6.3
lists FDO and drug class wise FDO after removal of the cases.
Predictions for the FDOs were generated by constructing a linear acceptor representing
the consensus B wild type sequence. Then, for every treatment listed in Table 6.3 a
mutation model was used to introduce a fixed number of mutations (1 to 10), and the 10,000
variants receiving the highest mutation score were stored. For simplicity, the mutation
model was restricted to g2pmixed , which also ranked among the best models in the previous
setting. Next, for all predicted variants the FDO was computed. The consensus FDO of
the (at most) 10,000 variants was computed using a weighted mean (see Eqn. 6.8).
Finally, we correlated the predicted FDOs with the observed FDOs. Here, however, the
number of mutations that have to be introduced into the wild type sequence is unknown.
Thus, to get an upper estimate on the prediction performance we selected the number of
mutations for each treatment that best fits the observed value (best fit). In addition, we
used a crude heuristic for estimating the number of mutations (manual): the number of
mutations to be introduced is 2, treatments containing boosted PIs introduce one mutation
less, treatments with less than 3 drugs one mutation more, and so do treatments containing
the combination d4T+ddI.
Results
Table 6.4 lists the obtained correlation for a varying number of top N most likely viral
variants. For the best fit approach the correlation steadily increases with larger viral
population and reaches up to 0.951 for the full list of 10,000 variants. Using the crude
manual heuristic for inferring the number of mutations the correlation between predicted
FDO and actual FDO decreases to approximately 0.570.
Figure 6.14 shows scatter plots of the real FDO against the predicted FDO using the
100 most likely variants. A surprising observation is that using the best fit approach a
small number of mutations is sufficient to achieve a high correlation with the observed
values. Precisely, the majority of treatments are best modeled when only one, two, or
three mutations are introduced. In order to verify whether the drug classes are correctly
modeled, we correlated the observed class specific FDO to the predicted class specific FDO.
Here, we used the number of mutations determined with best fit approach on the FDO
for all drugs. The correlations was very high for NRTIs and NNRTIs, reaching 0.809 and
0.796, respectively. The PIs, however, showed with 0.407 only an acceptable correlation. A
confounder of the FDOs for PIs are treatments that did not contain any PIs. Consequently,
removal of non-PI treatments elevated the correlation to 0.599. Figure 6.14 c) shows the
6.4 Validation
drug combination
ZDV
ddC+ZDV
3TC/FTC+ZDV
3TC/FTC+LPV+ZDV
3TC/FTC+NFV+ZDV
3TC/FTC+NVP+ZDV
3TC/FTC+IDV+ZDV
3TC/FTC+EFV+ZDV
3TC/FTC+LPV+TDF
3TC/FTC+ABC+ZDV
ddI+ZDV
3TC/FTC+SQV+ZDV
3TC/FTC+d4T+NFV
3TC/FTC+EFV+TDF
3TC/FTC+d4T
d4T+ddI+NFV
3TC/FTC+d4T+EFV
3TC/FTC+d4T+NVP
ddC+SQV+ZDV
3TC/FTC+d4T+SQV
d4T+ddI+EFV
3TC/FTC+APV/FPV+TDF
3TC/FTC+ATV+TDF
3TC/FTC+ABC+LPV
3TC/FTC+d4T+IDV
3TC/FTC+d4T+LPV
d4T+ddI
d4T+ddI+NVP
ddI+NVP+ZDV
3TC/FTC
3TC/FTC+NVP+TDF
d4T+ddI+IDV
d4T+ddI+SQV
ddI+EFV+TDF
count
100
44
40
38
38
34
31
24
23
20
20
19
15
11
11
9
7
7
7
6
6
5
5
4
4
4
4
4
4
3
3
3
3
3
all
13.94 (3.39)
14.13 (3.17)
13.01 (2.49)
15.55 (1.97)
10.70 (3.99)
13.11 (3.13)
13.42 (3.39)
12.17 (4.77)
14.15 (4.82)
12.01 (3.13)
12.74 (2.59)
12.09 (3.61)
10.37 (4.89)
11.65 (5.42)
11.39 (1.67)
10.58 (4.31)
13.04 (2.10)
14.38 (2.73)
14.62 (2.59)
9.25 (4.44)
12.12 (3.62)
12.65 (6.95)
13.59 (2.58)
15.46 (2.26)
12.69 (5.34)
13.38 (6.31)
9.90 (3.58)
11.25 (2.72)
11.60 (2.42)
12.57 (0.39)
13.95 (3.92)
14.28 (2.22)
10.56 (5.78)
9.15 (4.99)
NRTI
4.80 (2.21)
4.83 (2.33)
3.49 (2.13)
6.17 (1.30)
3.67 (1.63)
4.60 (2.07)
5.04 (1.80)
4.62 (2.32)
5.94 (1.80)
2.60 (2.60)
3.23 (2.24)
3.82 (1.90)
3.44 (2.34)
4.47 (2.44)
2.02 (1.24)
3.07 (2.29)
5.02 (1.74)
5.85 (1.48)
5.32 (1.73)
2.16 (1.79)
5 (2.35)
4.95 (2.88)
4.78 (1.59)
5.96 (1.59)
4.79 (2.36)
5.41 (2.98)
3.43 (2.74)
4.48 (2.42)
2.88 (1.68)
3.23 (0.39)
5.39 (2.55)
5.59 (1.30)
3.72 (2.98)
4.89 (1.60)
NNRTI
2.66 (0.69)
2.82 (0.32)
2.64 (0.77)
2.88 (0.24)
2.64 (0.69)
1.63 (1.23)
2.83 (0.27)
1.48 (1.39)
2.38 (1.01)
2.68 (0.77)
2.55 (0.88)
2.76 (0.67)
2.89 (0.18)
1.67 (1.44)
2.53 (0.86)
2.89 (0.07)
1.64 (1.24)
1.62 (1.50)
2.86 (0.19)
2.96 (0.03)
0.66 (1.19)
2.11 (1.21)
2.97 (0.04)
2.81 (0.22)
2.93 (0.09)
2.72 (0.52)
2.98 (0.02)
0.86 (1.44)
1.79 (0.97)
2.85 (0.25)
1.96 (1.70)
1.69 (1.53)
2.96 (0.04)
0.00 (0.00)
147
PI
6.49 (1.70)
6.49 (1.63)
6.88 (0.34)
6.50 (0.99)
4.38 (2.60)
6.88 (0.25)
5.55 (2.27)
6.07 (1.95)
5.83 (2.35)
6.73 (0.72)
6.97 (0.06)
5.51 (2.36)
4.03 (2.87)
5.52 (2.18)
6.84 (0.34)
4.62 (2.80)
6.38 (1.02)
6.91 (0.15)
6.43 (1.27)
4.12 (3.29)
6.46 (1.17)
5.60 (3.13)
5.84 (1.60)
6.69 (0.47)
4.97 (3.35)
5.25 (3.50)
3.48 (4.02)
5.91 (2.16)
6.92 (0.15)
6.49 (0.88)
6.60 (0.42)
7 (0.00)
3.88 (3.48)
4.26 (3.41)
Table 6.2: First line antiretroviral treatments and observed future drug options (FDO).
The column count states the sample size that served to estimate the FDO.
The column all, NRTI, NNRTI, and PI state the FDOs for all drugs, NRTIs,
NNRTIs, and PIs, respectively. Values are the mean among all observed cases
and standard deviation is state in parenthesis.
148
6 Planning Sequences of HIV Therapies
drug combination
ZDV
ddC+ZDV
3TC/FTC+ZDV
3TC/FTC+LPV+ZDV
3TC/FTC+NVP+ZDV
3TC/FTC+NFV+ZDV
3TC/FTC+IDV+ZDV
3TC/FTC+EFV+ZDV
3TC/FTC+LPV+TDF
3TC/FTC+ABC+ZDV
ddI+ZDV
3TC/FTC+SQV+ZDV
3TC/FTC+d4T+NFV
3TC/FTC+d4T
3TC/FTC+EFV+TDF
d4T+ddI+NFV
3TC/FTC+d4T+EFV
3TC/FTC+d4T+NVP
ddC+SQV+ZDV
d4T+ddI+EFV
3TC/FTC+ATV+TDF
3TC/FTC+ABC+LPV
3TC/FTC+APV/FPV+TDF
3TC/FTC+d4T+SQV
d4T+ddI
d4T+ddI+NVP
ddI+NVP+ZDV
3TC/FTC
3TC/FTC+NVP+TDF
3TC/FTC+d4T+IDV
3TC/FTC+d4T+LPV
d4T+ddI+IDV
count
95
43
40
38
34
33
30
22
21
20
20
18
12
11
10
8
7
7
7
6
5
4
4
4
4
4
4
3
3
3
3
3
all
14.47 (2.55)
14.39 (2.70)
13.01 (2.49)
15.55 (1.97)
13.11 (3.13)
11.75 (3.06)
13.71 (3.05)
13.24 (3.26)
15.39 (2.62)
12.01 (3.13)
12.74 (2.59)
12.55 (3.06)
11.95 (4.08)
11.39 (1.67)
12.79 (4.09)
11.41 (3.77)
13.04 (2.10)
14.38 (2.73)
14.62 (2.59)
12.12 (3.62)
13.59 (2.58)
15.46 (2.26)
15.67 (1.89)
11.93 (1.97)
9.90 (3.58)
11.25 (2.72)
11.60 (2.42)
12.57 (0.39)
13.95 (3.92)
15.30 (1.38)
16.53 (0.63)
14.28 (2.22)
NRTI
4.98 (2.10)
4.94 (2.24)
3.49 (2.13)
6.17 (1.30)
4.60 (2.07)
4.04 (1.41)
5.10 (1.79)
5.01 (1.99)
6.40 (0.97)
2.60 (2.60)
3.23 (2.24)
3.83 (1.95)
4.28 (1.78)
2.02 (1.24)
4.89 (2.11)
3.34 (2.28)
5.02 (1.74)
5.85 (1.48)
5.32 (1.73)
5 (2.35)
4.78 (1.59)
5.96 (1.59)
6.18 (0.96)
2.90 (1.76)
3.43 (2.74)
4.48 (2.42)
2.88 (1.68)
3.23 (0.39)
5.39 (2.55)
5.76 (1.63)
6.90 (0.05)
5.59 (1.30)
NNRTI
2.67 (0.69)
2.81 (0.33)
2.64 (0.77)
2.88 (0.24)
1.63 (1.23)
2.70 (0.57)
2.87 (0.20)
1.61 (1.38)
2.60 (0.72)
2.68 (0.77)
2.55 (0.88)
2.91 (0.14)
2.87 (0.20)
2.53 (0.86)
1.83 (1.40)
2.88 (0.07)
1.64 (1.24)
1.62 (1.50)
2.86 (0.19)
0.66 (1.19)
2.97 (0.04)
2.81 (0.22)
2.49 (0.99)
2.96 (0.04)
2.98 (0.02)
0.86 (1.44)
1.79 (0.97)
2.85 (0.25)
1.96 (1.70)
2.91 (0.10)
2.63 (0.59)
1.69 (1.53)
PI
6.82 (0.87)
6.64 (1.31)
6.88 (0.34)
6.50 (0.99)
6.88 (0.25)
5.01 (2.17)
5.74 (2.06)
6.62 (0.64)
6.38 (1.54)
6.73 (0.72)
6.97 (0.06)
5.81 (2.01)
4.80 (2.68)
6.84 (0.34)
6.07 (1.25)
5.18 (2.37)
6.38 (1.02)
6.91 (0.15)
6.43 (1.27)
6.46 (1.17)
5.84 (1.60)
6.69 (0.47)
7 (0.00)
6.07 (1.68)
3.48 (4.02)
5.91 (2.16)
6.92 (0.15)
6.49 (0.88)
6.60 (0.42)
6.63 (0.62)
6.99 (0.01)
7 (0.00)
Table 6.3: First line antiretroviral treatments and observed future drug options (FDO).
The column count states the sample size that served to estimate the FDO.
The column all, NRTI, NNRTI, and PI state the FDOs for all drugs, NRTIs,
NNRTIs, and PIs, respectively. Values are the mean among all observed cases
and standard deviation is state in parenthesis.
6.4 Validation
top N
best fit
manual
1
0.784
0.148
10
0.876
0.351
100
0.907
0.561
1,000
0.943
0.588
149
10,000
0.951
0.572
Table 6.4: Pearson correlation between predicted and actual FDOs. Top N indicates how
many most likely variants were used for computing the predicted drug options.
The two methods for finding the number of mutations to be introduced are best
fit and manual.
(a) 100 variants (best fit)
(b) 100 variants (manual)
(c) 100 variants protease (best fit)
Figure 6.14: Scatter plot between real FDO and predicted FDO using 100 most likely
variants. FDO for all 17 drugs, and the number of mutations was set using
the best fit approach (a) or the manual heuristic (b). FDO for PIs only, the
number of mutations was set using the best fit approach on all 17 drugs (c).
corresponding scatter plot for only PI containing regimens.
Discussion
The best fit approach explains the future drug options very well, while requiring only a
small number of additional mutations for the majority of treatments to explain the observed
FDO value. Moreover, the best fit (in terms of mutations) regarding the FDO of all drugs
reflects also the class-wise FDOs with correlations raging from 0.599 to 0.809, indicating
that the underlying mutation model reflects the resistance development in the different
drug targets in vivo to a satisfying extent.
Indeed, the measured FDOs exhibit a high variance themselves, most likely owing to
differing treatment lengths, differing patient adherence, and also differing mutations in
the virus preexisting at treatment start. Clearly, these sources of uncertainty add to
the difficulty of fitting a single FDO to a single drug combination. Consideration of the
other factors, however, would either lead to overfitting the model to the noisy data (e.g.
introduction of a treatment length parameter) or unreliable estimations of the FDOs due to
low sample numbers (e.g. requirement of baseline genotype). Moreover, for other external
factors like adherence no data are available.
We made a further assumption which does certainly not hold: every patient started
with a wild type consensus B virus. As can be seen from the first extraction of data from
the database, a substantial number of cases showed an FDO of 5.0 or less and, therefore,
150
6 Planning Sequences of HIV Therapies
were excluded from the analysis. Moreover, also the cleaned set of instances contains
unlikely post-treatment FDOs. For instance, patients treated with ddI+d4T only have
only 3.5 active PIs at treatment failure, on average. A change in FDO among PIs after
a PI-less combination therapy can only be explained by either preexisting drug resistance
mutations or errors in the database, e.g. the treatment contained a PI or stop date of the
first treatment is incorrect and the genotype was actually obtained during the following
PI-based regimen.
The best fit estimation clearly serves as an upper bound of the model performance. Despite the many sources of uncertainty originating from the gold standard, i.e. the real data,
the model explains the observations using only few mutations. And, most importantly, the
resistance development in the different drug classes is captured well. Thus, the mutation
model provides the means to simulate resistance development in antiretroviral combination
treatments, an essential basis for the sequencing of treatments.
6.5 Remarks
We introduced five different mutation models together with a framework for quickly simulating resistance development in HIV against combination treatments. The mutation
models demonstrated, to varying extent, that prediction of response to the one after next
regimen can be achieved at moderate performance. Moreover, a second validation setting
showed that observed future drug options can be explained using one of the models (g2pmix )
by the introduction of only few mutations. Here, the overall resistance development was
reflected well by resistance development within drug classes.
A crucial variable contributing to success and failure of the model is the number of
mutations that are introduced by a treatment. The amount of mutations depends on a
range of parameters, including activity of the regimen, potency of the drugs in the regimen,
duration of the regimen, and the patient’s adherence to the regimen. The exact number
is therefore hard to estimate. Within the first validation setting, we explored different
functions to infer a suitable number of mutations based on activity and potency of the
regimen. These approaches showed an improved behavior compared to the introduction of
a fixed number of mutations. However, there is clearly room for further improvements, for
instance by functions based on adherence of the patient. Of note, inferring the number of
mutations from the duration of the regimen alone is not likely to succeed. For example,
one can expect that a long treatment duration results in a higher number of mutations.
But, on the other hand, a long duration indicates good compliance of the patient and, as
a consequence, a sustained treatment success resulting in only few mutations. Thus, only
a measure of patient adherence paired with treatment length is likely to succeed.
The introduced model can still provide insight on how to sequence treatments even when
the number of mutations is unknown. More precisely, one can observe the development of
the estimated success probability of potential nth line treatments with respect to (n − 1)th
line regimen. For instance, on a Dual-Core AMD OpteronTM processor with 2.6 GHz and
32 GB memory, the simulation from 1 to 10 additional mutations including storage of
the 1,000 most likely variants takes approximately 35 seconds. The interpretation of all
variants using the EV EuResist engine requires (due to suboptimal implementation for
this task) another 70 seconds. Furthermore, the framework for simulating the evolution is
6.5 Remarks
151
easily scalable and one can achieve further speedup by limiting the number of variants (6
seconds with 100 variants instead of 1000), the number of different mutations (17 seconds
with 5 mutations instead of 10) or a combination of both (3 seconds). Hence, we reached
a computational performance that enables the construction of web services. Such a service
might allow to upload the sequence of a patient isolate as well as the selection of a few
potential treatments and possible follow-up regimens. As a result the service will provide
the change in treatment options (using different measures) as a consequence of the selected
regimen. In the following we provide a case-study of four typical first-line treatments using
three different measures for future treatment options.
Figure 6.15 depicts the development of four different regimens (different colors) in response to two potential first-line regimens (different line styles) in terms of PSS (a) and
predicted success by the EV EuResist engine (b). The simulation started with a consensus B wild type sequence, up to 10 mutations were introduced and the 1,000 most
likely variants were analyzed. Figure 6.15 a) reveals that up to six mutations there
is no difference for TDF-containing regimens (black and green) whether the preceding
treatment contained TDF or ZDV. For the ZDV-containing regimens, however, the TDFcontaining pre-treatments exhibit a higher score (around 0.3). Hence, first using TDF
and then ZDV seems to be a better choice. Also the analysis in terms of estimated success probability (Figure 6.15 b) demonstrates that the use of 3TC+ZDV+LPV provides
a slightly worse perspective for future regimens than administration of 3TC+TDF+LPV
(i.e. all dashed lines are below solid lines of same color). The benefit of 3TC+TDF over
3TC+ZDV in terms of resistance at treatment failure is well recognized in the medical
community, and indeed, TDF-containing regimens are suggested for first-line regimens
(http://www.aidsinfo.nih.gov/Guidelines/). The advantage of TDF is that it selects
for the mutation K65R, which causes hypersusceptibility of ZDV and other NRTIs (Parikh
et al., 2006). Of note, also the applied mutation model assigns a high probability to the
appearance of K65R during TDF treatment (see Figure 6.7). The simulation of both treatments (including interpretation) took approximately 3.5 minutes (using only the 100 most
likely variants the computational time is reduced to 30 seconds, while maintaining the
same qualitative result).
Instead of studying the development of the predicted success rate for a set of fixed regimens, one can also focus on resistance against individual drugs. For instance, Figure 6.16
shows the decrease in FDO as response to the same four drug combinations. By inspecting
the FDO for all drugs no difference between the use of TDF or ZDV is visible. When focusing on the FDO in the class of NRTIs, however, the inferiority of ZDV becomes evident,
as there is a difference of approximately half a drug with four and five mutations.
Concluding, the transducer-based framework provides the computational means for efficiently simulating resistance development during anti-HIV therapy, and thereby allows the
search for optimal orderings of antiretroviral therapies. The computational performance
can be even further increased by using the C++ interface instead of the command-line
tool; the latter requires to store transducers on the hard drive, while the former allows to
store transducers in the main memory and therefore avoid the I/O overhead. Furthermore,
the mutation models, on which are simulation depends, can also be based on expert-based
rating systems, as opposed to geno2pheno. Likewise, depending on the users’ preference
the used rating scheme for computing FDOs can be based on expert-based systems.
●
●
●
●
●
●
●
●
3TC+TDF+LPV
3TC+ZDV+LPV
3TC+TDF+EFV
3TC+ZDV+EFV
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
0.5
3TC+TDF+LPV
3TC+ZDV+LPV
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
0.0
4
●
●
●
●
●
●
2
●
●
●
●
●
●
●
●
0.6
●
●
●
●
●
●
●
●
●
0.4
●
predicted success
●
●
●
●
●
●
●
●
3TC+TDF+LPV
3TC+ZDV+LPV
3TC+TDF+EFV
3TC+ZDV+EFV
●
●
0.2
●
●
6
number of mutations
(a) geno2pheno PSS
8
●
●
●
10
0.0
2.0
1.5
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
1.0
geno2pheno PSS (activity)
●
●
●
●
●
●
●
●
●
●
0.8
2.5
●
●
1.0
6 Planning Sequences of HIV Therapies
3.0
152
●
3TC+TDF+LPV
3TC+ZDV+LPV
2
4
6
8
10
number of mutations
(b) success probability
Figure 6.15: Development of geno2pheno PSS (a) and predicted success probability (b)
of four different regimens (different colors) after treatment with two typical
first-line regimens (different line styles)
6.6 Discussion
Transducers provide a solid framework for simulating development of resistance mutations
in HIV. The problem of finding the most likely viral variant, which may emerge during a
combination therapy, is modeled as a shortest path problem, in which edge costs are related
to the probability of single mutations to emerge during treatment. Due to the application of transducers in the domain of natural language processing (e.g. spoken language
recognition and text translation) a number of free and efficient implementations of algorithms that are based on transducers are available. Moreover, the transducer framework
can easily be scaled to the available computational hardware. For instance, the number
of mutations to be added, the number of most likely viral variants to be tracked, and the
amino acid positions of the HIV sequence to be considered (positions have to be simply
omitted from the linear input acceptor representing the baseline sequence) can be adapted
without modifying the mutation models.
Most importantly though, transducers allow to easily implement different mutation models. In addition to the models presented here, mutation models based on Hidden Markov
Models (HMMs) can be realized (see for instance Healy and Degruttola (2007)). The capability of representing HMMs follows from the fact that weighted finite state transducers
comprise output labels and weights which can be used to model observation (emissions of
the HMM) and probabilities for transitions and emissions, respectively.
A major drawback of using transducers for implementing mutation models is the practical requirement to model mutations independently from each other. That is, each mutation
will lead to the same increase in total path cost; independent of the mutations that are
already present in the genotype. In theory it is possible to construct mutation transduc-
●
15
153
7
6.6 Discussion
●
●
●
●
6
●
●
●
●
5
●
●
●
●
●
●
3TC+LPV+TDF
3TC+LPV+ZDV
3TC+EFV+TDF
3TC+EFV+ZDV
4
●
●
●
●
●
●
●
●
3
10
●
●
FDO NRTI
●
●
●
●
●
●
●
●
●
●
2
●
●
●
0
0
1
5
FDO ALL
●
1
2
3
4
5
6
1
2
3
●
●
●
●
7
●
●
●
●
●
●
●
●
●
●
●
●
●
●
2.0
5
●
1.5
●
4
●
●
3
FDO PI
●
●
1.0
FDO NNRTI
6
6
●
●
5
mutations
2.5
3.0
mutations
4
●
2
●
●
●
●
1
0.5
●
●
●
●
●
●
1
2
3
4
mutations
5
6
0
0.0
●
1
2
3
4
5
6
mutations
Figure 6.16: Change of future drug options (FDO) as response to four typical first-line
treatments. The panels depict the FDO for all drugs (top left), NRTIs (top
right), NNRTIs (bottom left), and PIs (bottom right).
ers that implement mutation scores that are dependent on present mutations; however,
these transducers are complicated to construct, and the results are in general very large
automata (i.e. many additional states and edges are required to represent alternative paths
that go along preexisting mutations). Of course, large transducers increase the required
computation time (and the I/O overhead).
Another method that takes preexisting mutations into account is the approach introduced by Deforche et al. (2008) for estimating in vivo viral fitness landscapes. Here, a
Bayesian network model for an individual drug is trained on viral sequences that were
obtained during treatment with that drug. This network is used to capture interactions
between mutations. Furthermore, in a second training phase the conditional probabilities
of the Bayesian network are replaced with conditional fitness contributions of each mutation. This fitness function is applied together with a simple finite ideal Wright-Fisher
population model (Wright, 1931) for simulating the resistance development in the viral
population during anti-HIV treatment. However, the approach has to model and simu-
154
6 Planning Sequences of HIV Therapies
late the resistance development for each viral variant in the population (Deforche et al.
(2008) used a population size of 104 ), and is therefore likely not to meet the computational
demands for an interactive system.
The major problem for all approaches is the estimation of mutation models for drug
combinations instead of individual drugs. For instance, longitudinal in vivo data is usually
scarce (see Figure 6.8); the available data decreases even further when one is restricted
to a single drug combination. Thus, constructing models for individual drugs is likely
to achieve more robust models. Estimation of such models from in vivo data, however,
is problematic since drugs are given in combination and the causal inference between
drugs and emerging mutations is hard to solve. In the end, the most stable approach is
the estimation of mutation models for individual drugs from in vitro data followed by a
combination approach to realize models for drug combinations.
Finally, another possibility for finding sequences of beneficial drug combinations is to
omit the explicit resistance development in the viral population and directly learn from
the sequences of combination therapies in the patients’ pasts. For instance, one can focus on the success of a treatment given the N − 1 preceding combinations. We recently
investigated such an N -gram graph of therapy sequences as an potential encoding for a
patient’s therapy history (Müller, 2008). Unfortunately, only few treatment sequences occurred sufficiently often. However, it cannot be excluded that with the significantly larger
EuResist integrated database, which is available today, and a more elaborate statistical
learning method (or encoding of therapy sequences) information about beneficial orders of
drug combinations can be extracted without explicitly simulating the viral population.
7 A Solution to the HIV Pandemic: A Vaccine
Current treatment options do not provide a cure to HIV infections. Hence, the ultimate
solution to the HIV pandemic is to prevent further infections. One established instrument
for preventing infections are vaccines, and therefore the development of a successful HIV
vaccine is of major interest today. In this chapter we will present a bioinformatics approach
to aide the search for immunogens that elicit broadly neutralizing antibodies against HIV.
In Section 7.1 we will very briefly summarize the major elements of the immune system
and the molecular basis of immunity conferred by vaccination. This is followed by a short
overview on HIV vaccine research, including a focus on the viral spike (gp160), which is the
target for neutralizing antibodies (Section 7.2). Section 7.3 focuses on predicting antibodymediated neutralization from gp160 genotype based on statistical models. These models are
also used to identify features of gp160 that increase susceptibility to neutralizing antibodies.
This knowledge will be used to find gp160 variants that can be used as immunogens in a
successful HIV vaccine. Section 7.4 concludes the chapter with an outlook.
7.1 The Immune System and Vaccines
In the following we provide a brief overview on the immune system based on DeFranco
et al. (2007). The immune system of an organism is organized as a layered defense. The
first layer comprises physical barriers (e.g. shells, skin) and mechanical mechanisms (e.g.
coughing, sneezing). Once a pathogen manages to surmount the first line of defense, it
faces mechanisms of the innate immune system and the adaptive immune system, where
the latter one is a characteristic of jawed vertebrates.
Innate Immune System
The cells and molecules of the innate immune system recognize features that are conserved across broad groups of pathogens. This system reacts to pathogens in a generic
(non-specific) way and it confers neither long-lasting immunity nor increased efficacy after
previous exposure to a pathogen. The innate immune system includes different defense
mechanisms: humoral and chemical barriers of the innate system include inflammation
and the complement system. Inflammation is a process for recruiting immune cells to
the sites of infection; it is produced by eicosanoids and cytokines that are released by injured or infected cells. The complement system consists of about 30 serum and membrane
proteins that can trigger a variety of immune reactions by a biochemical signal cascade.
Furthermore, the innate response depends on a range of leukocytes (white blood cells). A
major subgroup of leukocytes comprises phagocytes, which act by engulfing particles or
pathogens: the pathogen is trapped in a vesicle and subsequently “destroyed” by digestive
enzymes. Dendritic cells and macrophages (a type of phagocyte) present parts of the digested pathogen on their surface. These antigens (antibody generators) bridge the gap
155
156
7 A Solution to the HIV Pandemic: A Vaccine
to the adaptive immune system by triggering the response of T cells and B cells, which
constitute the main building block of the adaptive immune system.
Adaptive Immune System
T cells and B cells belong to the group of lymphoctytes, a special type of leukocytes. T cells
are involved in a cell-mediated immune response, whereas B cells are part of the humoral
response. Both T and B cells carry receptor molecules that recognize only specific targets,
and each cell produces only one type of antigen-recognizing receptor. T cells are divided
into two subgroups: killer or cytotoxic T cells and helper T cells. T cells only respond to
antigens that are bound to a protein of the major histocompatibility complex (MHC), killer
T cells (CD8+ T cells) require antigens coupled to Class I MHC molecules, whereas helper
T cells (CD4+ T cells) rely on Class II MHC molecule-antigen complexes. As their name
suggests cytotoxic T cells release substances, which are toxic for cells (cytotoxins), and
thereby directly kill infected (e.g. by viruses) and damaged or dysfunctinal (e.g. cancer)
cells. Helper T cells activate other cells of the immune system and thereby regulate both
the innate and adaptive immune responses. For instance, helper T cells stimulate killer
T cells and enhance the microbicidal function of macrophages. Moreover, helper T cells
also activate B cells to produce neutralizing antibodies. Both helper and killer T cells
originate from naïve T cells, and the required proliferation into effector T cells is mediated
by the dendritic cells from the innate immune system. Naïve B cells carry B cell antigen
receptors (BCR) on their surface that contain immunoglobulins (antibodies). Once the
BCR binds to its specific antigen (e.g. a virus surface protein) the complex undergoes
endocytosis. The antigen is cleaved into small peptides that are presented by Class II
MHC molecules on the surface of the cell. After recognition and subsequent activation
by a helper T cell the B cell becomes activated and undergoes differentiation into an
effector B cell that secretes copies of exactly the antibody, which recognized the antigen.
Antibodies can directly inactivate (neutralize) pathogens by binding to its surface proteins
that are used to infect target cells or to toxins secreted by bacteria. In addition, bound
antibodies mark their target for destruction by the innate immune system, i.e. complement
activation or processing by phagocytes. In principle, only antigens that can be recognized
by the preexisting pool of B cells can trigger an immune response. However, on successful
activation, the antigens produced by B cells undergo a process termed affinity maturation
in which somatic hypermutations lead to modified binding affinities of the antibody to the
antigen, and antibodies with higher affinity are selected. In contrast to the innate immune
system, the adaptive immune system maintains specific cells targeting specific pathogens,
or more precisely: specific antigens. Moreover, an immunologic memory is established by
offsprings of T and B cells that begin to replicate after activation. These memory cells can
survive for decades and therefore the immune system is prepared for a challenge with the
same pathogen or more precisely by pathogens presenting the same antigens.
The aim of a vaccine is to induce immunity against a pathogen, that is, invoking a
strong immune response against the pathogen upon recognition. While the main goal of a
vaccine is to achieve sterilizing immunity (complete protection from infection), it is often
sufficient to substantially reduce the amount of active pathogens in the host and therefore
limit disease-related symptoms. Protection against a pathogen is elicited by challenging
7.2 Vaccines and HIV
157
the adaptive immune system with adequate antigens that are able to trigger a response
by the immune system and consequently establish the immunologic memory in the form
of T and B cells.
An immune response can be triggered by mild versions of the pathogen that usually do
not cause the disease (live attenuated vaccines), by killed pathogens (inactivated vaccines),
or by parts of the pathogens (compound vaccines) (Plotkin, 2005). The general idea is to
mimic an attack by the pathogen and thereby create the immunity that would be naturally
established after the real challenge. In general, the more similar the immunogen comprising
the vaccine is to the real pathogen, the better and longer lasting is the protection by
the vaccines. Hence, live attenuated vaccines are more effective than inactivated vaccine,
which, in turn, are more effective than compound vaccines (Lambert et al., 2005). For
instance, multiple shots of the new hepatitis B vaccine (a compound vaccine) are required
to establish immunity. Inactivated and compound vaccines, however, are safer than live
attenuated vaccines, since they are incapable of causing the disease.
Most of today’s successful vaccines rely on the establishment of memory B cells and
therefor on the rapid generation of matching neutralizing antibodies (nAbs) upon a challenge by the pathogen. Vaccines are a major instrument for controlling diseases, and since
the introduction of the first vaccine against smallpox in 1798, smallpox have been eradicated and other pathogens have almost disappeared. Today, vaccination is extended from
infectious to noninfectious diseases (immunotherapy). Among the infectious diseases HIV,
malaria, and tuberculosis are still of leading interest, while research in the area of noninfectious diseases currently focuses on various types of cancer, autoimmune diseases (e.g.
multiple sclerosis), atherosclerosis, and Alzheimer’s disease (Plotkin, 2005).
7.2 Vaccines and HIV
After identification of the pathogen that causes AIDS, researchers expected to have a
vaccine within a few years. Indeed, on April 23 in 1994 Magaret Heckler (then U.S. Health
Secretary) proclaimed publicly that a vaccine candidate will be available for testing within
two years (Walker and Burton, 2008). Now, 28 years after the discovery of HIV there
is still no vaccine. Even worse, there is not even a promising candidate in close sight.
The challenges that researchers are facing during the development of an HIV vaccine are
manifold: First, HIV is extremely diverse, which makes it hard to identify conserved
epitopes, and due to the error-prone replication process HIV can quickly evade the immune
response by mutating epitopes that are recognized by the adaptive immune system. Second,
a characteristic of HIV is the unique choice of target cells: the virus infects cells of the
immune system, in particular CD4+ helper T cells that coordinate the immune response.
Consequently, the virus slowly and steadily destroys is natural antagonist. Third, HIV
has evolved mechanisms to counter immune responses, e.g. the accessory protein Nef
down-regulates the synthesis of MHC molecules, which are crucial for the identification
of infected cells by T cells (Yang et al., 2002), and the surface envelope proteins exhibit
multiple features for successfully avoiding antibody recognition. A further complication
for vaccine design is resulting from the fact that HIV integrates its genetic information
into the host genome rapidly after infecting a cell. When the provirus is in its latent state,
the cell shows no signs of infection and is immunologically silent and thus unrecognized
158
7 A Solution to the HIV Pandemic: A Vaccine
by the immune system. This way the virus can survive for decades in memory cells of the
immune system, its latent viral reservoir.
All these unique features of HIV lead to following requirements for a successful vaccine:
recognition of an array of diverse viruses and fast response to the challenge in order to
prevent the establishment of viral reservoirs. If the vaccine fails to prevent the latter, it
has at least to assist the natural immune response in preventing the destruction of CD4+
cells during the acute phase (Walker and Burton, 2008). This would enable the patient to
better control the infection, i.e. maintain a low viremia, and thereby substantially delay
progression to AIDS and, in addition, reduce the risk of transmission to further individuals.
Because cells present fragments of proteins they produce on their surface via their MHC
molecules, HIV vaccines aiming at eliciting T cell-mediated immunity can (in theory) be
based on any of the viral proteins. Vaccines that rely on eliciting neutralizing antibodies, on
the other hand, can only target the viral spike (gp160). As mentioned earlier (Section 2.2)
the viral spike comprises two sub-units, gp41 (transmembrane) and gp120, that occur in
trimeric form (Figure 7.1 a) (Liu et al., 2008). During cell entry, the glycoprotein gp120
binds to the CD4 receptor and the coreceptors, while gp41 is responsible for membrane
fusion. The viral spike is heavily shielded against the host immune system by multiple
mechanisms. For instance, the glycoprotein, as its name suggests, is densely covered with
sugar molecules that shield conserved epitopes against the adaptive immune response. The
subunit gp120 contains five variable regions (V1-V5) that are inter-spaced with conserved
regions. These five variable loops exhibit a very high sequence diversity and provide further
protection of the conserved parts of gp120 against immune recognition. The variable surface
region of gp120 is also referred to as the “silent face” because only few antibodies are elicited
against it (Karlsson Hedestam et al., 2008). Figure 7.1 b) shows the core of the gp120
protein in contact with the CD4 receptor. Also its own structure protects the viral spike
against the immune response. Conserved regions in gp41, for instance, are inaccessible,
simply because there is too little space between gp120 and the viral membrane to allow
for binding of the B cell receptor. A more advanced protection mechanism is termed
conformational masking. Here, conserved epitopes are only accessible after conformational
changes in the protein. For instance, the coreceptor binding site of gp120 is only accessible
after the conformational change induced by CD4 binding (Douek et al., 2006).
Upon infection, a race between HIV and the adaptive immune system commences. Unfortunately, the virus is always one step ahead. More precisely, the adaptive immune
system inflicts selective pressure on the evolution of HIV, which drives HIV to evolve into
variants that cannot be neutralized by the available array of neutralizing antibodies. Richman et al. (2003) studied blood sera and viruses from patients at different time points.
Briefly, their analysis showed that the blood serum at time point t is very well capable of
neutralizing older viral variants, i.e. obtained at time point t − 1 and earlier. However,
the sera are only insufficiently capable or incapable of neutralizing the current or future
variants, respectively. Thus, antibodies typically generated during a chronic HIV infection
cannot be used as basis for a vaccine. Fortunately, some chronically infected individuals
elicit antibodies that are capable of neutralizing a broad range of HIV variants. These
broadly neutralizing antibodies are of major interest for the design of a successful HIV
vaccine because they provide the means of establishing sterilizing immunity. In addition,
currently existing neutralizing antibodies demonstrate that the viral spike has a few weak
7.2 Vaccines and HIV
(a) The viral spike
159
(b) Structure of the gp120 core
Figure 7.1: A schematic representation of the HIV-1 envelope glycoprotein trimer in contact
with the CD4 receptor (red) and the coreceptor (green) of the host cell (a). The
trimer comprises three copies of the transmembrane glycoprotein gp41 (brown),
the three copies of the surface glycoprotein gp120 (blue) are attached to gp41.
Figure reprinted from Phogat and Wyatt (2007) with kind permission from
Bentham Science Publishers. Structure of core gp120 (purple) in complex with
the CD4 receptor (orange) (b). The figure was rendered with PyMOL on the
basis of PDB entry 2nxy. Estimated position of the five loops is given in blue
font.
spots that can be exploited by rational vaccine design. This knowledge can help to develop
an immunogen that elicits broadly neutralizing antibodies (bnAbs) in vivo.
Until now, a number of broadly neutralizing antibodies have been identified in chronically
infected individuals (Burton et al., 2004) and the search for new bnAbs is still going on
(Walker et al., 2009). One of the first bnAbs discovered, which is called b12, targets an
epitope that substentially overlaps with the conserved CD4 binding site on gp120 (Burton
et al., 1994). Consequently, upon binding to gp120 the nAb prevents CD4 attachment,
and thus the conformational change that is a prerequisite for successful entry of HIV into
the target cell. The nAbs 2F5 and 4E10 recognize conserved linear epitopes on gp41. It
is expected that these nAbs act like fusion inhibitors, i.e. they exhibit their neutralizing
effect by preventing the conformational change in gp41 leading to fusion of viral and cell
membranes. Figure 7.2 depicts the viral spike and epitopes recognized by some bnAbs.
Recently, Trkola et al. (2008) demonstrated the efficacy of bnAbs in vivo with the aim to
determine clinically relevant titers of the antibodies.
HIV Vaccine Candidates
So far, there have been a few vaccine candidates that were tested in clinical trials. The
most prominent examples are AIDSVAX and the STEP study.
Within STEP (launched in 2004) a candidate vaccine using a replication-deficient adenovirus type 5 vector that expresses some HIV clade (subtype) B proteins (Gag, Pol, Nef )
was investigated. As it is clear from the list of HIV proteins involved, the candidate was
160
7 A Solution to the HIV Pandemic: A Vaccine
Figure 7.2: Structure of the envelope trimer with highlighted epitopes of some broadly
neutralizing antibodies: 2F5 (red), 4E10 (yellow), b12 (grey), and 17b (orange).
Figure reprinted from Phogat and Wyatt (2007) with kind permission from
Bentham Science Publishers.
targeted to elicit T cell mediated immunity. The clinical trial, however, was stopped in
2007 by the Data and Safety Monitoring Board. The reason for the premature end of the
trial was that individuals that received the vaccine exhibited an enhanced rate of HIV-1
acquisition compared to people in the control group (Barouch, 2008). This, of course, was
an immense setback for the HIV vaccine research.
The AIDSVAX vaccine candidate developed by the company VaxGen uses monomeric
gp120 to elicit humoral immune response, i.e. neutralizing antibodies. The candidate
managed to elicit type-specific antibody response. A protective effect, however, could not
be detected in two clinical trials in 2005 and 2006 (Barouch, 2008).
Recently, AIDSVAX received further media coverage when in September 2009, researchers
reported a success in the search for an HIV vaccine. In a study started in October 2003
in Thailand, about 16,000 participants of HIV negative men and women received a combination of the two earlier tested vaccine candidates AIDSVAX and ALVAC-HIV. ALVACHIV uses a canarypox vector that, like the candidate in the STEP trial, expresses HIV
(poly-)proteins (here: Gag, Env, PR). At the endpoint of the study, 76 participants from
the control group were infected with HIV, compared to 56 of the vaccinated participants.
Hence, the vaccine’s efficacy was 26.4%. This result, however, did not reach statistical
significance. Statistical significance and 31.2% vaccine efficacy was achieved after excluding seven individuals that were found to have HIV at baseline (Rerks-Ngarm et al., 2009).
Although a glimmer of hope, the protection conferred by the vaccine is only marginal and
the analysis of the trial is regarded a “data crunch” for achieving statistical significance1 .
1
http://www.nature.com/news/2009/091021/full/news.2009.1035.html
7.3 Predicting Neutralization from Genotype
161
Despite the immense promise given by bnAbs, it has not been possible yet to identify an
immunogen that elicits bnAbs in vivo. With the work presented in the following we investigate what determines the susceptibility of an Env variant against three of the available
bnAbs.
7.3 Predicting Neutralization from Genotype
This section describes a joint work with Fabian Müller, Oliver Sander, Thomas Lengauer,
Klaus Überla, and Francisco Domingues. Here, using an array of statistical learning methods, we investigate the genotype-phenotype-relationship between the Env genotype and
the neutralization phenotype of 147 viral variants with three different bnAbs. The experimental assay for determining the neutralization phenotype is very similar to the approach
used for assessing in vitro drug resistance phenotypes (Section 2.5) and thus not described
in further detail (Richman et al. (2003) for example used an adapted version of the original drug resistance assay developed by Petropoulos et al. (2000)). We selected statistical
learning methods that allow for assessing feature importance and analyze statistically relevant positions in terms of susceptibility to neutralization by bnAbs. Finally, based on the
knowledge derived from interpreting the statistical learning methods we propose in vitro
experiments for experimental validation of our model. The focus of our investigation is the
antibody b12, which targets an epitope that largely overlaps with the CD4 binding site.
The model will be used to optimize an immunogen in an HIV vaccine development project.
7.3.1 Material and Methods
Data
The data for this study originate from four publications. Binley et al. (2004) studied
the neutralization phenotype regarding a panel of nAbs of 90 viral variants from different
clades, Li et al. (2005, 2006) analyzed 19 and 18 variants from clade B and C, respectively, and Schweighardt et al. (2007) examined another 20 variants from subtype B. The
datasets are named according to their origin and subtype of focus (Binley, Li.B, Li.C, and
Schweighardt.B ). Due to an overlap of viral variants studied in Binley and Li.B, five of the
total 147 samples were excluded from the analysis. All publications provided IC50 values
for the three nAbs b12, 2F5, and 4E10. The continuous neutralization phenotypes were
dichotomized to “neutralizing” and “non-neutralizing” using an antibody-dependent cutoff;
IC50 values below the cutoff indicate neutralization by the antibody. For b12 and 2F5
the cutoff selection was straightforward, since a large fraction of the samples was clearly
not susceptible to the antibody (values “>50” and “>25” of mg antibody/ml, respectively),
and variants with intermediate susceptibility were rare. The neutralization effect of 4E10,
however, was so broad that only very few viruses were completely resistant. Thus, a cutoff between very susceptible and intermediate susceptible isolates was selected based on
the distribution of IC50 values. Figure 7.3 a) depicts the distribution of IC50 values for all
three bnAbs and the applied cutoffs, and Figure 7.3 b) shows the distribution of the binary
neutralization phenotype within each clade in the dataset. The b12 phenotype data for
clades B, C, and D is almost balanced, while other clades (represented by a few isolates
only) are mainly non-neutralizing.
7 A Solution to the HIV Pandemic: A Vaccine
60
50
Frequency
40
50
0
10
20 >25
30
40
>50
20
10
0
0
10
20
30
Frequency
40
50
40
30
0
10
20
Frequency
4E10
60
2F5
60
b12
30
162
0
10
IC50 (mg antibody/ml)
20 >25
30
40
>50
0
10
IC50 (mg antibody/ml)
20 >25
30
40
>50
IC50 (mg antibody/ml)
20
30
40
50
neutralizing
non−neutralizing
0
10
number of samples
60
70
(a) IC50 distribution
A
AC
AE
AG
B
BF
BG
C
D
F
G
J
clade
(b) Clade distribution
Figure 7.3: Histograms of IC50 values for antibodies b12, 2F5, and 4E10 (a). Origin of
the data is color-coded: Binley (blue), Li.B (light green), Li.C (dark green),
Schweighardt.B (orange). Cutoffs are depicted by red dashed lines. Distribution of the binary neutralization phenotype within each clade (b). Left, middle,
and right bars correspond to the b12, 2F5, and 4E10 phenotype, respectively.
Amino-acid sequence data for the 142 Env variants were obtained from GenBank using
the accession numbers listed in the four publications. The Env gene comprises the signal peptide, gp120, and gp41. The sequences were aligned using MUSCLE (Edgar, 2004)
according to a guide tree based on the HIV phylogenetic tree published by Leitner et al.
(2005). The sequence of HXB2 was added to the multiple sequence alignment as an alignment independent reference. For details on the sequence processing please refer to the
Master thesis of Fabian Müller (Müller, 2009). The final alignment comprised 975 amino
acid positions.
Feature Encodings
The amino acids at each of the 975 alignment positions were encoded in two different ways.
The categorical encoding was used for statistical methods that can operate on categorical
data. Here, each alignment position was represented as a single character: one of the 20
7.3 Predicting Neutralization from Genotype
163
amino acids, a gap symbol (-), symbols for uncertain amino acids (B and Z, representing
presence of Asparagine or Aspartic acid and Glutamine or Glutamic acid, respectively), or
a symbol representing any amino acid (X). Hence, each viral variant was represented by
one feature for each alignment position. The binary encoding was input to support vector
machines (SVMs), which are unable to operate on categorical features with more than two
categories. Consequently, we used the same encoding as in geno2pheno (Section 2.5.2),
i.e. we encoded an amino acid sequence using binary indicator variables. More precisely,
one indicator variable was used for each possible symbol at each alignment position. Thus,
a sequence was represented as a binary vector of length 24 × 975. We refer to the categorical
encoding and the binary encoding as the sequence encoding.
A property that is only implicitly captured by the sequence encoding is the location of
potential N-linked glycosylation sites. As stated before, sugar molecules attached to the
surface of gp120 mask conserved epitopes from recognition by the immune system. The
glycosylation processes is carried out by the host cell at Asparagines (N) that are followed
by any amino acid (X) and Serine (S) or Threonine (T), i.e. the typical NX[S,T] pattern
of N-linked glycosylation. There were 268 positions in the alignment showing at least one
N among all sequences. Of those, 148 fulfilled the NX[S,T] criterion in at least one Env
variant. The respective 148 binary features constitute the glycosylation encoding, which
were added to the sequence encoding (glyco).
Apart from regions of high sequence variability, the multiple sequence alignment was
very stable. Unfortunately, regions of high sequence variability comprise exactly the five
variable loops that are believed to play an important role in determining antibody neutralization (Wyatt and Sodroski, 1998; Pantophlet and Burton, 2006). In order to further
investigate the influence of variable loops on the ability to predict neutralization by antibodies, they were completely removed from the sequence encoding (no loops). In addition,
we used sequence encodings that maintained sequence information on exactly one variable
loop, for all five loops (only Vx ). Furthermore, as a replacement for the sequence information, we encoded the five loops using alignment-independent features meant to capture
physicochemical properties, which can be derived from the sequence information. More
precisely, for variable loop features we used the molecular weight (sum of all molecular
weights of residues in the loop; one continuous feature per loop), the charge (number of
positively and negatively charged amino acids or the overall charge; three integer features
per loop), the length (number of residues in the loop, one integer variable per loop), the
hydropathy (sum of all hydropathy indices of the residues (Kyte and Doolittle, 1982); one
continuous feature per loop), the number of potential hydrogen bonds (absolute number of
residues potentially participating in hydrogen bonds, and sum of all donors and acceptors;
two integer values per loop), the N-linked glycosylation (number of potential glycosylation
sites; one integer per loop), the fold index (based on Prilusky et al. (2005); one continuous
variable per loop), and simply the abundance of each amino acid in the loop (20 integer
values per loop). These variable loop features were added to the no loops encoding (loop
feat).
Finally, we also investigated subsets of the sequence encoding that were restricted to the
three parts of Env : the signal peptide (signal ), gp120, and gp41.
All applied encodings are briefly summarized in Table 7.1.
164
encoding
sequence
glyco
no loops
loop feat
only Vx
signal
gp120
gp41
7 A Solution to the HIV Pandemic: A Vaccine
description
comprises sequence information of the Env gene: for the RF, amino
acids sequences were encoded using categorical features; for the SVM,
sequences were represented by a binary encoding
sequence encoding and additional 148 binary features indicating
potential glycosylation sites
sequence information removed from all variable loops
sequence information removed from all variable loops; instead loops are
represented by physicochemical properties (see text)
sequence information removed from all variable loops, but Vx, with
x ∈ {1, . . . , 5}
sequence encoding restricted to signal peptide
sequence encoding restricted to gp120
sequence encoding restricted to gp41
Table 7.1: List of feature encodings. The features are briefly summarized, for a full description see the text.
Statistical Learning Methods
We employed two statistical learning methods for predicting the binary neutralization
phenotype of viral Env variants. The Random Forests (RF) method (Breiman, 2001)
trains B decision trees on B bootstrap replicates of the training data. At each node, only
m randomly chosen variables of all p variables are considered for the split. The variable
that reduces the impurity of the labels the most is selected for splitting the node. In RF
P
the impurity is measured using the Gini index: k pk (1 − pk ), where k and pk are the class
and the frequency of that class at the node to be split, respectively. All B trees are fully
grown and left unpruned. For the two parameters of the method, the number of features
√
checked at each split (m) and the number of trees (B) we chose standard values b pc and
500, respectively. For the RF classifier the categorical sequence encoding was used since
RF can operate on categorical variables.
As second classifier we applied linear support vector machines (SVMs). For the parameter C, which penalizes misclassified samples in the training set, values of 2−7 , 2−6 , . . . , 22
were tested in a 10-fold cross-validation. The choice of the parameter did not influence the
performance substantially, and was therefore set to 1. Since linear SVMs do not support
categorical data the binary sequence encoding was applied. For achieving optimal performance, constant features were removed, and the built-in scaling option for features in the
libSVM implementation (Chang and Lin, 2001) was used.
The classification performance for both methods was assessed using the area under the
ROC curve (AUC) in a 10 times five-fold cross-validation setting. Both classifiers were used
together with all feature encodings for predicting neutralization by all three antibodies.
Statistical significance is assessed using a one-sided Wilcoxon test on the 10 mean AUCs
obtained in the five-fold cross-validation runs.
7.3 Predicting Neutralization from Genotype
165
Measures of Variable Importance
We used Mutual Information (MI) and p-values from Fisher’s exact test (Fisher, 1922)
as univariate measures of feature importance. Both measures were applied only to the
categorical sequence encoding. Furthermore, we used the mean decrease in Gini index
(MDG) extracted from the RF models and z-scores of features weights derived from the
linear SVM (Section 3.1) as multivariate measures of feature importance. Briefly, the MDG
of a feature is the decrease in Gini index observed when a node was split using that feature
– averaged over all B trees in the forest. Due to the binary encoding of the sequence
for the SVM, the feature importance (z-scores) was provided for each amino acid at each
alignment position. In order to make the measure comparable to the measures provided
by the other three methods, which provide one value for each alignment position, only the
maximum z-score at each alignment position was used as a measure of feature importance.
As recently described, the MDG is biased towards features with a high number of categories (Strobl et al., 2007), e.g. an uninformative variable with 10 categories receives
a higher MDG than an uninformative variable with two categories. Consequently, informative features with few categories may go unnoticed among uninformative features with
many categories. This bias is of particular interest for our problem, since alignment positions in unstable regions of the alignment show a higher number of different amino acids,
i.e. categories, than positions in stable regions. Thus, MDG for amino acids located in the
variable loops are likely to be boosted by this artifact.
In order to correct for this bias, we developed a heuristic based on permutations. Briefly,
the response vector is randomly permuted k times, and for each permutation the RF
classifier is trained and MDG for each variable is computed. Mean µ and standard deviation
σ of MDG for one feature are estimated from the k models. Under the assumption that the
MDG of an uninformative variable is normally distributed, given µ and σ, the probability
of observing the MDG value obtained in combination with the original response vector
serves as a p-value. We termed this approach permutation importance (PImp; Altmann
et al. (2010)). For correcting the MDG importance, we used 1000 permutations of the
original response vector.
For MI and the combined z-scores we observed the same bias as for MDG (Müller, 2009),
and consequently, we applied PImp for correcting it. For both MI and SVM-based z-scores
the PImp p-value was based on 1000 iterations.
All measures of variable importance were applied to the complete set of 142 training
samples, i.e. not derived from cross-validation models.
In vitro Validation
In order to experimentally validate our b12 prediction model, site-directed mutagenesis is
performed to confirm important positions derived from the feature importance analysis.
To this end, the b12 neutralization phenotype is measured before (original viral variant)
and after the introduction of up to three mutations. In total, 18 sets of mutations were
introduced into 10 different viral variants.
166
7 A Solution to the HIV Pandemic: A Vaccine
7.3.2 Results
Classification Performance
Table 7.2 lists the mean AUC and its standard deviation assessed in 10 repetitions of 5-fold
cross-validation. Based on sequence information alone both classifiers achieve an AUC of
around 0.8 for predicting the b12 neutralization phenotype. Adding information on putative glycosylation sites slightly decreased the performance for both methods. Removal of
the sequence information from the five variable loops led to a substantial and significant
decrease in performance for the RF (p = 0.00098) and the SVM model (p = 0.00195).
Replacement of the sequence information of the variable loops with the variable loop features led to a further decrease in performance. Maintaining the sequence information of
the V2 loop resulted in re-establishment (and even slight improvement) of the performance
obtained with the complete sequence. Maintaining one of the other variable regions did not
improve performance substantially over the model without any variable loop information
(data not shown). The model based on the gp120 region alone achieved a performance
comparable to the sequence encoding, while models based on the signal peptide or gp41
alone were clearly inferior (both p = 0.00098). For most encodings (except “signal” and
“gp120”) the SVM outperformed the RF model significantly (p < 0.03223).
RF performs significantly better than the linear SVM (p < 0.00977) for the prediction
of neutralization by nAb 2F5. With AUCs above 0.9 the performance for the RF models
is better than that for the b12 model. Addition of glycosylation features resulted in a
slight decrease in performance, while removal of variable loop information, unlike in the
case of b12, resulted in a slight increase. The best performance was achieved for both
learners when the sequence information was restricted to gp41. For both methods the
model restricted to gp41 sequence information model significantly better than the sequence
encoding (p < 0.004883).
With values around 0.72 the performance of 4E10 neutralization prediction is slightly
worse than the performance observed for the b12 models. Here, addition of glycosylation
and removal of the variable regions slightly increased the model performance. As in the
case of 2F5, the best models were obtained when the information was restricted to the
target of the nAb: gp41. The improvement reached statistical significance only for the RF
classifier (p = 0.001953).
Variable Importance
Figure 7.4 a) depicts the raw MDG importance extracted from the RF model predicting
b12 neutralization. A large number of alignment positions within the variable loops (primarily V1,V2, and V4) received high MDG values. The PImp correction of the MDG
(Figure 7.4 b), however, yields a different picture. Here, most of the variable loops do
not contain any information. The exception is V2, which contains four of the top 20 most
important positions. In general, according to this model, only few positions are linked to
the b12 neutralization phenotype. Surprisingly, among the top 20, two important positions
are located in the signal peptide, and another five are in gp41. One of the positions in
the signal peptide (5b; an insertion compared to HXB2) is indeed the top-most important
position. The second most important position is located in the b12 binding site.
5b
●
135
140136
147
148
150
164
1.0
●
●
●
31
MDG
460
●
●
446
●
396
400
410
360
●
●
167
●
●
185188
1.5
134
146 151
7.3 Predicting Neutralization from Genotype
●
●
●
● ●
●
● ●
0.0
0.5
●
0
200
400
600
800
1000
alignment position
5b
25
(a) MDG
369
15
●●
●
●
●
750
633
641
●
●
446
●
●
●
●
●
0
●
372
134
●
155
164
173
188
84
47
●
63
5
●
786b
360
●
676
185
●
●
21
10
−log10(MDG.pimp)
20
●
0
200
400
600
800
1000
alignment position
750
774
525
535
633
646
●
●
●
●
●
●
●
● ●
●
● ●
● ●
●
●
0
●
786b
403
●
372
386
●
333
●
277
292
●
234
●
190
●
155
173
●
132
●
84
2
726
●
706
676
641
●
467
●
460
●
432
446
369
360
●
352
●
161
●
63
●
3
188
185
●
164
134
●
47
●
21
●
●
1
consensus
●
5b
4
(b) PImp MDG
0
200
400
600
800
1000
alignment position
(c) consensus
Figure 7.4: Feature importance derived from models that predict b12 neutralization: RF
MDG (a), PImp of MDG (b), consensus of all four methods (c). The regions of
Env (signal peptide, gp120, and gp41) are separated by vertical dashed lines.
The variable regions within gp120 are highlighted in the background orange
(V1), green (V2), beige (V3), light blue (V4), and yellow (V5). Furthermore,
known epitopes are highlighted with red (CD4 binding), blue (b12 binding),
purple (CD4 and b12), orange (2F5 binding), and green (4E10 binding). For
the top 20 important residues, the position in reference to HXB2 is provided.
168
7 A Solution to the HIV Pandemic: A Vaccine
b12
sequence
glyco
no loops
loop feat
only V2
signal
gp120
gp41
RF
0.791 (0.018)
0.782 (0.021)
0.765 (0.013)
0.755 (0.015)
0.801 (0.017)
0.708 (0.024)
0.788 (0.015)
0.734 (0.019)
SVM
0.808 (0.020)
0.803 (0.020)
0.781 (0.019)
0.773 (0.021)
0.813 (0.019)
0.579 (0.031)
0.792 (0.026)
0.752 (0.020)
2F5
RF
0.914 (0.013)
0.905 (0.019)
0.927 (0.016)
0.911 (0.017)
0.924 (0.011)
0.766 (0.016)
0.825 (0.028)
0.956 (0.013)
SVM
0.840 (0.012)
0.836 (0.014)
0.845 (0.015)
0.849 (0.013)
0.829 (0.011)
0.693 (0.021)
0.807 (0.016)
0.866 (0.016)
4E10
RF
SVM
0.717 (0.026)
0.718 (0.018)
0.724 (0.032)
0.723 (0.020)
0.736 (0.025)
0.719 (0.022)
0.719 (0.023)
0.710 (0.021)
0.727 (0.028)
0.708 (0.029)
0.621 (0.038)
0.587 (0.037)
0.694 (0.023)
0.680 (0.026)
0.754 (0.019) 0.726 (0.028)
Table 7.2: Classification performance. The mean (sd) classification performance measured
using AUC and assessed via 10 repetitions of 5-fold cross-validation for the two
classifiers (RF and SVM) and the three different antibodies. Rows correspond to
different feature sets (see Table 7.1 for a brief description). The best performance
for each antibody is indicated by bold font.
Every measure for feature importance provided a slightly different picture (data not
shown). Thus, for the final interpretation we will rely on a consensus of the four methods.
The consensus (Figure 7.4 c) counts how often each position occurs among the top 20
most important positions (thus, only values of 0, . . . , 4 are possible). Figure 7.5 depicts
the location of important positions (i.e. appearing in two or more top 20 lists) on the
gp120 structure (PDB ID 2nxy). Of the eight positions with a consensus count of two
or larger in gp120 and outside the variable loops, two cannot be mapped to the structure
(HXB2 positions 47 and 63), two are located in the b12 binding site (positions 369 and
432), three are surrounding the b12 binding side (positions 352, 360, and 460), and one
is located on the opposite site of the b12 binding site (position 446). Of note, position
446 is a potential glycosylation site, and also, mutations at this position might influence
the glycosylation status of position 444. Due to the trimeric structure of gp120 (see for
instance Liu et al. (2008)), glycans attach at this position might influence binding of b12.
Of note, two additional b12 binding sites appear among the top 20 list of one method
(positions 372 (RF) and 386 (SVM)).
Figure 7.6 a) shows the PImp MDG from the RF 2F5 model. One positions (665) in
gp41 receives outstanding importance and is located in the 2F5 epitope (Zwick et al.,
2001). Among the top 20, two additional positions (662 and 667) belonging the linear
epitope are discovered. Likewise, Figure 7.6 b) depicts the PImp MDG importance for
the 4E10 model. Important positions are more scattered in the complete Env gene, than
in the case of 2F5. The most important position is located in gp41 and the second most
important one between the variable loops V3 and V4. With position 674, one known 4E10
binding site position (Zwick et al., 2001) is ranked third by the corrected MDG measure.
In vitro Validation
Table 7.3 lists the sets of mutations that were selected for experimental validation. With
pathways of three mutations we aim at observing a substantial change in the b12 neutralization phenotype. Mutations in the signal peptide and in gp41 are used to verify, whether
these positions are statistical artifacts or have biological impact. Further mutations are
used to assess the effect of adding and removing potential glycosylation sites. The remain-
7.3 Predicting Neutralization from Genotype
169
0
200
400
600
853
●
●
●
●
787
800
●
662
●
490
492
444
●
184
99
102
●
●
●
63
20
21
22
23
30
32
0
●
● ●
●● ●
730
667
400
600
●
200
−log10(MDF.pimp)
800
665
Figure 7.5: Important positions for b12 neutralization. Important positions were mapped
to the structure of gp120 (PDB ID 2nxy). The right picture depicts the surface
of b12 and CD4 binding sites, the left picture provides the location of the
important positions on the HXB2 reference sequence.
● ●
●
800
1000
alignment position
60
567
(a) PImp MDG 2F5
348
40
30
658
674
417
●
●
●
●
●●
●
793
●
629
●
585
588
●
●
557
321
●
513
525
265
●
429
134
●
●
●
●
0
20
●
360a
373
92
20
●
10
−log10(MDF.pimp)
50
●
0
200
400
600
800
1000
alignment position
(b) PImp MDG 4E10
Figure 7.6: Feature importance from models that predict 2F5 and 4E10 neutralization:
PImp of RF MDG for 2F5 (a) and 4E10 (b). For further information see
caption of Figure 7.4.
170
7 A Solution to the HIV Pandemic: A Vaccine
ing experiments shall clarify the effect of mutations close to the b12 binding site and within
conserved parts of V1 and V2. For further information on the mutation selection process
please refer to the Master thesis of Fabian Müller (Müller, 2009).
The mutagenesis and neutralization experiments are currently being conduced at Prof.
Dr. Klaus Überla’s laboratory at the Ruhr-University in Bochum, Germany.
7.3.3 Discussion
We trained statistical learning models that predict the neutralization of viral variants by
three nAbs based on the Env amino acid sequence. With AUC values from 0.717 to 0.914
all models achieved good to very good prediction performance using the sequence encoding.
The removal of sequence information of the five variable loops resulted in a sensible
loss of performance for the RF model in the prediction of neutralization by b12. The
dependence on information on the variable loop V2 for predicting the b12 phenotype is
confirmed by the re-established performance after providing the V2 sequence information
and by the consensus variable importance (Figure 7.4). Here, several positions within V2
receive high importance values. The influence of the V2 loop on the neutralization efficacy
by b12 was recognized soon after the discovery of b12 (Roben et al., 1994). Of note,
the corrected MDG measure concurs better with the increase in performance due to the
addition of single loops than the uncorrected MDG measure. The variable loop features
derived from sequence information that aimed at capturing physicochemical properties
of the loops failed to improve the model. In fact, inclusion of these features resulted in
decreased performance with both statistical learning methods.
Several of the important sequence positions in the conserved region of gp120 were located
either in the b12 binding site or structurally close to the b12 and CD4 binding site. The
virus has to maintain the capability to bind the CD4 receptor, hence mutations in the CD4
binding site are unlikely. Consequently, substitutions near the or at the b12 binding site,
but not at the CD4 binding site, are likely to affect susceptibility to nAb b12.
For other important positions that do not afford an obvious explanation, we will conduct
in vitro neutralization experiments that will either confirm their importance or mark them
as an artifact of the statistical methods.
The interpretation of the 2F5 prediction models revealed that decisions of the RF model
were mainly based on a single position within the known 2F5 epitope. Also the analysis of
the 4E10 RF model reveals a known position in the binding epitope among the top three
important positions. Given that the cutoff selection for 4E10 was not as straightforward
as for the other two antibodies, the rediscovery of a known epitope is reassuring. The
agreement between the positions relevant for 2F5 and 4E10 neutralization and their known
epitopes confirm the validity of the approach.
Unfortunately, the described method is susceptible to evolutionary bias. Here, evolutionary bias occurs in the form of two (or more) mutations, with one mutation having a causal
link to the phenotype, and the other mutations merely being correlated to the first one
due to emergence in a common ancestor or accumulation in the same lineage. Of course,
the statistical learning models cannot discriminate between a causal link and a correlation
to the phenotype. These correlations might explain the positions falsely assessed as being
important by the method. Mutations that are located in the signal peptide or in gp41
accession
AY835438
AY835438
AY835438
DQ388515
DQ388515
DQ388515
DQ435683
DQ435683
DQ411853
DQ411853
DQ411853
DQ388517
DQ388517
AY835452
DQ411852
AY423984
AY835448
AY835448
mutations
T132S, L134A
T132S, L134A, P369L
D386N
K360V, L369P
K360V, L369P, R432K
N460I
S132T, A134L
S132T, A134L, L369P
V134A
V134A, D185N
K446T
T185D
T185D, M161T
N386D
H352I
N462S
K5bQ
D750N
expected effect
more resistant
more resistant
more resistant (SVM)
more susceptible
more susceptible
more susceptible
more susceptible
more susceptible
more resistant
more resistant
more resistant
more susceptible
more susceptible
more susceptible
more resistant
more susceptible
more resistant
more susceptible (SVM)
measured effect
comment
path of three mutations
path of three mutations
add glycosylation; RF model shows no effect
path of three mutations
path of three mutations
remove glycosylation
path of three mutations
path of three mutations
mutations in V1, V2
mutations in V1, V2
close to two glycosylation sites
mutations in V1, V2
mutations in V1, V2
remove glycosylation
near binding site
remove glycosylation
mutation in signal peptide
mutation in gp41, RF model shows no effect
Table 7.3: Site-directed mutagenesis experiments. The table lists the name of the isolate, its accession number, the mutations that have to be
introduced (sequence position is given in reference to HXB2; capital letters before and after the position denote wild type amino
acid found in the isolate and the mutated amino acid, respectively), the expected effect, the measured effect, and the effect that is
to be tested (comment).
isolate
6535
6535
6535
ZM197M
ZM197M
ZM197M
CAP210.2.00.E8
CAP210.2.00.E8
Du172
Du172
Du172
ZM233M
ZM233M
CAAN
Du156
ZM53M.PB12
THRO
TRJO
7.3 Predicting Neutralization from Genotype
171
172
7 A Solution to the HIV Pandemic: A Vaccine
and are supposed to be determinants of b12 binding are likely examples for this bias. For
instance, position 5b in the signal peptide is highly correlated with position 352 in the
proximity of the b12 binding site (Müller, 2009). Moreover, best performance was usually
achieved when the sequence information was restricted to the Env region targeted by the
nAb. Models based on the remaining regions that accommodate (potential) correlated mutations were clearly inferior. With such a restricted dataset a cautious interpretation of the
results is required. Also the clade membership is a strong indicator for this evolutionary
bias. In order to study a less biased dataset we repeated the RF analysis of b12 neutralization prediction with a dataset restricted to sequences from clades B, C, and D that show a
balanced neutralization phenotype. The dataset comprised 106 instances, and the AUC of
the sequence encoding was 0.75±0.037, corresponding to a slight decrease in comparison
to the full dataset. The qualitative findings, however, could be confirmed: decrease of performance without information on variable loops (0.67±0.038), models restricted to signal
peptide (0.578±0.054) and gp41 (0.63±0.043) achieved worst performance, and sequence
information on the V2 loop is pivotal for prediction performance (0.764±0.028).
A drawback of this study was the fact that experimental data were pooled from four
different publications using different assays. Different neutralization assays may show a
different phenotype in combination with some or all nAbs (Fenyö et al., 2009). However,
dichotomizing the neutralization phenotype to “neutralizing” and “non-neutralizing” using
a cutoff is likely to remove most of the variation originating from different assays. Moreover,
we did not use the misclassification rate as a performance measure, but the area under
the ROC curve. The latter measure can be interpreted as the probability that a randomly
chosen positive (neutralizing) sample receives a higher prediction score than a randomly
selected negative (non-neutralizing) sample. More precisely, the AUC is computed by first
sorting all samples according to their predicted value. Hence, even if, due to experimental
variation, a sample is mislabeled, it is less likely to substantially perturb the performance
measure. Simply because samples close to the cutoff are more likely to receive prediction
values in-between clear negatives and clear positives.
Despite the fact that only relatively few training data were available, we were able to
build statistical models with high predictive power. Furthermore, interpretation of these
models revealed sequence positions that were known to be in the binding site of the studied
nAbs. Therefore, this seems to be a valid approach for identifying residues that determine
antibody neutralization.
In terms of vaccine design our approach can be used of identify viral variants that
are potential immunogens for eliciting a broadly neutralizing antibody. More precisely,
assuming that variants that are very susceptible to a given nAb are also potential potent
immunogens, which elicit the nAb in vivo, our in silico prediction method can help to
screen large available sequence databases for interesting candidate immunogens.
7.4 Outlook
In the search for new broadly neutralizing antibodies, comparatively large panels of different viruses are tested against the new candidate antibody, and the three antibodies studied
here are typically used as a reference, see for instance the work by Walker et al. (2009).
Hence, if made public, additional data can be used to improve the prediction models intro-
7.4 Outlook
173
duced here, or alternatively, the additional samples can be used as an independent validation set. Moreover, our approach can also be extended to study the genotype-phenotype
relation of newly discovered nAbs.
An additional approach to improving models that predict neutralization by antibodies
is the utilization of features derived from the protein structure instead of the sequence.
Clearly, successful binding of a nAb to its epitope on the surface of the viral spike is determined by shape and chemical complementarity. The protein sequence is only an indicator
for the structure of the protein in vivo. As a consequence of its high impact for vaccine
design, the structure of the gp120 core was studied in great detail, and today there are
a number of experimentally generated protein crystal structures available. Some of these
structures contain even the variable loop V3 (Huang et al., 2005) that is crucial in coreceptor binding. And, precisely this structure of the V3 loop was previously used to derive
structural descriptors for the prediction of coreceptor usage (Sander et al., 2007). In this
previous work, a statistical learning model using structural descriptors could outperform
a sequence-based approach, and a combination of both, structure and sequence, provided
the best model. Likewise, the prediction of neutralization by nAbs could benefit from the
application of structural descriptors. This approach is of particular interest for b12, as a
gp120 structure in complex with b12 is available (Zhou et al., 2007).
The proposed approach to model nAb neutralization will be applied to assist vaccine
development. The knowledge extracted from the statistical learning models and the models themselves can help to construct a large library of Env variants that are likely to be
neutralized by the nAb. The library of Env variants is then used in an immunogen optimization protocol to develop an HIV vaccine. In particular, the variants in the library will
be screened for their ability to evoke a strong immune response and additional statistical
models will also be implemented for refining the search of candidate immunogens.
The approach for finding immunogens that elicit a desired type of antibodies can be
extended to find nAbs for other pathogens or even for specific types of cancer.
174
7 A Solution to the HIV Pandemic: A Vaccine
8 Conclusion and Outlook
We have presented methods for improving HIV patient care. The developed models use
techniques from statistical learning and focus on the prediction of response to combination
treatment as opposed to resistance against single drugs.
In particular, we investigated the benefit of features encoding the viral evolution during
therapy. The most predictive feature was the genetic barrier to drug resistance, which
quantifies the likelihood that the virus will escape from the drug by developing further
resistance mutations. A statistical model was trained on clinical data from the United
States of America comprising viral genotype, treatment, and outcome of the regimen. The
use of the genetic barrier together with indicators for mutations in protease and reverse
transcriptase as well as indicators for the usage of antiretroviral drugs in the regimen outperforms commonly used rules-based systems in finding successful treatments. Precisely,
a retrospective analysis on data from Europe demonstrated that our tool geno2phenoTHEO reaches a true positive rate (TPR) of 64% at a false positive rate (FPR) of 20%.
By contrast, the rules-based approaches yield only 44% to 48% TPR at the same FPR, i.e.
our method identifies up to 20% more successful combination treatments than standard
approaches.
Within the EuResist project, which integrated multiple HIV resistance databases storing treatment related information, we further developed geno2pheno-THEO. In particular,
we improved the genetic barrier by relying on a large number of predicted resistance phenotypes rather than a small number of measured phenotypes for defining mutation patterns
that correspond to complete resistance. And, more importantly, we made use of the patient’s treatment history (i.e. drugs the patient has been exposed to) and baseline viral
load to improve prediction to combination therapy. Apart from geno2pheno-THEO, which
is (due to the employed features based on viral evolution) also referred to as evolutionary
engine, two other prediction models were developed within the project. In order to provide
a single prediction for one request, we explored ways for optimally combining the output
of all three engines. Here, it turned out that simply using the mean of all predictions is
an efficient and robust way. On a small set of 25 treatment change episode the predictions
of the combined prediction engine were compared to the predictions of ten international
human HIV treatment experts. Our system performed as well as the best human expert
and also as well as the consensus of all human experts.
The availability of three prediction engines trained on the same training data unveiled a
serious limitation of the current definition to of treatment response. The standard datum
definition focuses on short-term response measured at eight weeks (4-12) of therapy. The
cutoff for success was a viral load below 500 cp/ml at the time-point closest to eight weeks.
A substantial number of treatments, however, which were labeled as failure, reached a
VL below the threshold at a later time during the treatment. Interestingly, all three
engines captured that trend frequently, and disagreed with the label of the treatment.
175
176
8 Conclusion and Outlook
As a consequence, we studied modified treatment response definitions: sustained response
at 24 weeks (16-32) of treatment, the area under the viral load curve during one year
of therapy, and a labeling that combines short-term and sustained response. The latter
definition rejects treatments that have a discordant response at eight and 24 weeks of
therapy, and therefore filters instances with potential adherence problems. In this setting
our methodology achieves an AUC of 0.85 compared to an AUC of 0.77 on the original
response definition (using the richest model).
Furthermore, we addressed the update-problem of data-driven decision support systems.
Briefly, the update-problem originates from the fact that data-collection efforts providing
the training data for those systems lag behind the introduction of novel drugs. More
precisely, novel drugs tend to be prescribed to patients in an advanced state of the disease.
Hence, it takes time to collect a sufficient amount of training data for the data-driven
systems. In order to circumvent that problem, we introduced a new covariate, which
represents the activity of novel drugs in the regimen. Here, activity of novel drugs is
assessed by a rules-based system and when applied to treatments comprising newly licensed
drugs the modified system outperforms the purely rules-based approach – manifested in
an increase of the TPR by approximately 25% at a FPR of 20%.
We also investigated treatment history as a potential replacement for the viral sequence.
The rationale behind the approach is that the prescribed drugs shape the genetic makeup
of the viral population. Consequently, the treatment history, i.e. list of drugs the patient
was exposed to, is a strong predictor of treatment response. Here, we found that prediction
models based on history information alone are slightly inferior (AUC of 0.75) to genotypebased predictions (AUC of 0.77). History and genotype information seem to be partially
complementary, since combining genotype-based and history-based predictions resulted in
a further increase in performance (AUC of 0.79), regardless of the mode of combination.
We made first progress towards the sequencing of anti-HIV therapies. To this end, we
used methods from large vocabulary language processing to develop a framework that
allows rapid search for likely viral variants arising at treatment failure. We developed
five mutation models on the basis of in vitro phenotypic resistance data and emergence of
resistance mutations observed in vivo. The framework was challenged to separate successful
from failing treatments on the basis of mutations caused by the immediately preceding
regimen. The performance was moderate (AUC of 0.63) but constitutes a substantial
improvement over the baseline (AUC of 0.55). Additionally, we could demonstrate that
resistance development is correctly captured within drug classes. Simulating resistance
development during antiretroviral treatment with a maximum of five mutations takes in
the new framework only 3 seconds while keeping track of the 100 most likely viral variants.
Hence, the framework affords web services with acceptable response time.
Finally, we developed statistical models that predict neutralization of HIV variants by
three broadly neutralizing antibodies based on the Env genotype. The predictive performance of the models using sequence information of complete Env ranged from 0.72 AUC to
0.91 AUC, depending on the antibody. Interpretation of the statistical learning methods for
all three antibodies recovered at least one known position located in the antibody-specific
epitope. For the antibody b12, sequence information on the variable loop V2 is pivotal
to maintain model performance. Furthermore, selected mutations from the b12 model are
currently tested in vitro to further validate the approach. Knowledge derived from the
177
statistical models and the models themselves can be used to build focused libraries that
screen in vitro for potent immunogens. Our methodology can, in theory, be applied to
novel HIV targeting neutralizing antibodies or even be extended to other pathogens and
cancer.
The work described in this thesis resulted in a number of freely available web services. Geno2pheno-THEO is available as part of the geno2pheno web suite (http://www.
geno2pheno.org). Part of this web suite is also a prediction system for resistance against
integrase inhibitors and can be directly accessed at http://integrase.geno2pheno.org.
Finally, our contribution to the EuResist prediction engine can be queried as part of the
complete system using the website http://engine.euresist.org.
Outlook
A most straightforward direction is the extension of the models developed in this thesis to
the newly available drugs from different classes. In particular, integrase inhibitors and entry
inhibitors. A further challenge is the development of prediction tools for treatment naïve
patients. This group faces the unique problem of transmitted drug resistance mutations.
Transmitted mutations may, due to replicative disadvantage, vanish from the majority of
the viral population, and consequently, go unnoticed using standard interpretation tools.
However, traces of those mutations in viral genome are likely to be present at nucleotide or
in rare cases even at amino acid level. In addition, it is of particular interest to infer drug
adherence problems from stored treatment data (primarily viral load trajectories). Such
information will not only enhance response prediction, but, more importantly, will also
provide the means for treating clinicians to identify patients having trouble in correctly
taking their medication. Here, one can provide a tool for detering development of resistance
mutations.
Personalized treatment in the HIV field has, so far, only focused on the interaction between the drug and virus. With our approaches for predicting response to antiretroviral
drugs in vivo, a third player enters the interaction network: the patient. Until now, the
patient was almost neglected in the prediction of treatment response. However, in order to
make further advancements in personalized anti-HIV therapy the interplay between virus
and host as well as between drugs and host has to be considered. Indeed, initial studies
focusing on the relationship between human genomics and the control of HIV infection
have already been conduced (Telenti and Goldstein, 2006). The next steps in this direction should involve the systematic exploitation of available resistance databases that offer
a wealth of treatment data. Hence, it is comparatively uncomplicated to updated these
databases gradually with the missing human genetics data. This will offer a wealth of possibilities, including pharmocogenetic studies uncovering the interaction between drugs and
host-related differences in the involved metabolic pathways (Telenti and Zanger, 2008)1 .
The resulting knowledge will help to provide the right dosage of drugs for the patient in
addition to the right drug attacking the virus. The correct dosing is of particular interest,
since drug-related side-effects are a major obstacle in the life-long HIV treatment. Also
the use of next generation sequencing techniques will allow to study the evolution of the
1
http://www.hiv-pharmacogenomics.org/
178
8 Conclusion and Outlook
viral population during treatment and result in more advanced decision support tools.
The evaluation of clinical decision support systems used for assessing drug resistance
should be carried out on standardized publicly available datasets. To date, validation of
such tools is based on retrospective analysis of clinical data – with different sets of data used
for different tools. Consequently, the performance of the various number of tools is hard to
compare. Especially since the datasets are prone to local differences in drug prescriptions
or widely differing extent of pre-treatment of involved patients. Hence, a publicly available
benchmark dataset with fair distributions of drug usage and certain level of pre-treatment
(i.e. not to many “simple cases”) is needed. Moreover, publicly available datasets attract
a variety of scientists applying available methods or developing new tailored methods to
the problem. For instance, the freely available (standardized) genotype-phenotype dataset
(Rhee et al., 2006) caused a number of follow-up publications (Saigo et al., 2007; Kjaer
et al., 2008; Kierczak et al., 2009).
The methods developed in this theses can also be applied to optimization of treatment for
other viral diseases such as Hepatitis B and Hepatitis C. While data-collection efforts and
statistics will remain the same, the extension to these viral diseases is not straightforward,
as clinically relevant definitions of response to treatment have to be identified and agreed
upon. Furthermore, using matched genotype-phenotype data it is comparably simple to
identify resistance mutations. Understanding their mechanism of resistance, however, still
remains a challenge.
Bibliography
(2006). Denying science. Nature Medicine, 12(4):369.
(2008). The cost of silence? Nature, 456(7222):545.
Abbadessa, G., Accolla, R., Aiuti, F., Albini, A., Aldovini, A., Alfano, M., Antonelli, G.,
Bartholomew, C., Bentwich, Z., Bertazzoni, U., Berzofsky, J. A., Biberfeld, P., Boeri,
E., Buonaguro, L., Buonaguro, F. M., Bukrinsky, M., Burny, A., Caruso, A., Cassol,
S., Chandra, P., Ceccherini-Nelli, L., Chieco-Bianchi, L., Clerici, M., Colombini-Hatch,
S., de Giuli Morghen, C., de Maria, A., de Rossi, A., Dierich, M., Della-Favera, R.,
Dolei, A., Douek, D., Erfle, V., Felber, B., Fiorentini, S., Franchini, G., Gershoni,
J. M., Gotch, F., Green, P., Greene, W. C., Hall, W., Haseltine, W., Jacobson, S.,
Kallings, L. O., Kalyanaraman, V. S., Katinger, H., Khalili, K., Klein, G., Klein, E.,
Klotman, M., Klotman, P., Kotler, M., Kurth, R., Lafeuillade, A., Placa, M. L., Lewis,
J., Lillo, F., Lisziewicz, J., Lomonico, A., Lopalco, L., Lori, F., Lusso, P., Macchi, B.,
Malim, M., Margolis, L., Markham, P. D., McClure, M., Miller, N., Mingari, M. C.,
Moretta, L., Noonan, D., O’Brien, S., Okamoto, T., Pal, R., Palese, P., Panet, A.,
Pantaleo, G., Pavlakis, G., Pistello, M., Plotkin, S., Poli, G., Pomerantz, R., Radaelli, A.,
Robertguroff, M., Roederer, M., Sarngadharan, M. G., Schols, D., Secchiero, P., Shearer,
G., Siccardi, A., Stevenson, M., Svoboda, J., Tartaglia, J., Torelli, G., Tornesello, M. L.,
Tschachler, E., Vaccarezza, M., Vallbracht, A., van Lunzen, J., Varnier, O., Vicenzi, E.,
von Melchner, H., Witz, I., Zagury, D., Zagury, J.-F., Zauli, G., and Zipeto, D. (2009).
Unsung hero Robert C. Gallo. Science, 323(5911):206–7.
Agopian, A., Gros, E., Aldrian-Herrada, G., Bosquet, N., Clayette, P., and Divita, G.
(2009). A new generation of peptide-based inhibitors targeting HIV-1 reverse transcriptase conformational flexibility. J Biol Chem, 284(1):254–64.
Agrawal, R., Imielinski, T., and Swami, A. (1993). Mining association rules between sets of
items in large databases. In Proceedings of ACM SIGMOD Conference on Management
of Data, pages 207–216.
Agrawal, R. and Srikant, R. (1994). Fast algorithms for mining association rules. Proc.
20th Int. Conf. Very Large Data Bases, pages 487–499.
Aharoni, E., Altmann, A., b Borgulya, D’utilia, R., Incardona, F., Kaiser, R., Kent,
C., Lengauer, T., Neuvirth, H., Peres, Y., Petroczi, A., Prosperi, M., Rosen-Zvi, M.,
Schulter, E., Sing, T., Sonnerborg, A., Thompson, R., and Zazzi, M. (2007). Integration
of viral genomics with clinical data to predict response to anti-HIV treatment. IST-Africa
2007 Conference Proceedings, Paul Cunningham and Miriam Cunningham (Eds).
Aho, A., Sethi, R., and Ullman, J. (1986). Compilers: Principles, techniques, and tools.
Addison-Wesley.
179
180
8 Bibliography
Akaike, H. (1974). A new look at the statistical model identification. IEEE transactions
on automatic control, AC-19:716–723.
Allauzen, C. and Mohri, M. (2002). On the determinizability of weighted automata and
transducers. In Proceedings of the workshop Weighted Automata: Theory and Application
(WATA).
Allauzen, C., Riley, M., Schalkwyk, J., Skut, W., and Mohri, M. (2007). OpenFst: A
general and efficient weighted finite-state transducer library. CIAA 2007, Lecture Notes
in Computer Science, 4783:11–23.
Altmann, A., Beerenwinkel, N., Sing, T., Savenkov, I., Däumer, M., Kaiser, R., Rhee, S.-Y.,
Fessel, W. J., Shafer, R. W., and Lengauer, T. (2007a). Improved prediction of response
to antiretroviral combination therapy using the genetic barrier to drug resistance. Antivir
Ther (Lond), 12(2):169–78.
Altmann, A., Däumer, M., Beerenwinkel, N., Peres, Y., Schülter, E., Büch, J., Rhee, S.-Y.,
Sönnerborg, A., Fessel, W. J., Shafer, R. W., Zazzi, M., Kaiser, R., and Lengauer, T.
(2009a). Predicting the response to combination antiretroviral therapy: retrospective
validation of geno2pheno-THEO on a large clinical database. J Infect Dis, 199(7):999–
1006.
Altmann, A., Sing, T., Vermeiren, H., Winters, B., Craenenbroeck, E. V., der Borght,
K. V., Rhee, S. Y., Shafer, R. W., Schuelter, E., Kaiser, R., Peres, Y., Sonnerborg, A.,
Fessel, W. J., Incardona, F., Zazzi, M., Bacheler, L., Vlijmen, H. V., and Lengauer,
T. (2007b). Inferring virological response from genotype: with or without predicted
phenotypes? Antivir Ther (Lond), 12(5):S169–S169.
Altmann, A., Sing, T., Vermeiren, H., Winters, B., Craenenbroeck, E. V., der Borght,
K. V., Rhee, S.-Y., Shafer, R. W., Schülter, E., Kaiser, R., Peres, Y., Sönnerborg, A.,
Fessel, W. J., Incardona, F., Zazzi, M., Bacheler, L., Vlijmen, H. V., and Lengauer, T.
(2009b). Advantages of predicted phenotypes and statistical learning models in inferring
virological response to antiretroviral therapy from HIV genotype. Antivir Ther (Lond),
14(2):273–83.
Altmann, A., Thielen, A., Frentz, D., Laethem, K. V., Bracciale, L., Incardorna, F., Sönnerborg, A., Zazzi, M., Kaiser, R., and Lengauer, T. (2009c). Keeping models that predict response to antiretroviral therapy up-to-date: fusion of pure data-driven approaches
with rules-based methods. Reviews in Antiviral Therapy, page A92.
Altmann, A., Toloşi, L., Sander, O., and Lengauer, T. (2010). Permutation importance: a
corrected feature importance measure. Bioinformatics, 26(10):1340–7.
Andreola, M. L., Pileur, F., Calmels, C., Ventura, M., Tarrago-Litvak, L., Toulmé, J. J.,
and Litvak, S. (2001). DNA aptamers selected against the HIV-1 RNase H display in
vitro antiviral activity. Biochemistry, 40(34):10087–94.
Arthos, J., Cicala, C., Martinelli, E., Macleod, K., Ryk, D. V., Wei, D., Xiao, Z., Veenstra,
T. D., Conrad, T. P., Lempicki, R. A., McLaughlin, S., Pascuccio, M., Gopaul, R.,
8 Bibliography
181
McNally, J., Cruz, C. C., Censoplano, N., Chung, E., Reitano, K. N., Kottilil, S., Goode,
D. J., and Fauci, A. S. (2008). HIV-1 envelope protein binds to and signals through
integrin alpha4 beta7, the gut mucosal homing receptor for peripheral T cells. Nat
Immunol, 9(3):301–9.
Baelen, K. V., Salzwedel, K., Rondelez, E., Eygen, V. V., Vos, S. D., Verheyen, A., Steegen, K., Verlinden, Y., Allaway, G. P., and Stuyver, L. J. (2009). Susceptibility of
human immunodeficiency virus type 1 to the maturation inhibitor bevirimat is modulated by baseline polymorphisms in gag spacer peptide 1. Antimicrob Agents Chemother,
53(5):2185–8.
Barouch, D. H. (2008). Challenges in the development of an HIV-1 vaccine. Nature,
455(7213):613–9.
Barré-Sinoussi, F., Chermann, J. C., Rey, F., Nugeyre, M. T., Chamaret, S., Gruest,
J., Dauguet, C., Axler-Blin, C., Vézinet-Brun, F., Rouzioux, C., Rozenbaum, W., and
Montagnier, L. (1983). Isolation of a T-lymphotropic retrovirus from a patient at risk
for acquired immune deficiency syndrome (AIDS). Science, 220(4599):868–71.
Beerenwinkel, N., Däumer, M., Oette, M., Korn, K., Hoffmann, D., Kaiser, R., Lengauer,
T., Selbig, J., and Walter, H. (2003a). Geno2pheno: Estimating phenotypic drug resistance from HIV-1 genotypes. Nucleic Acids Res, 31(13):3850–5.
Beerenwinkel, N., Däumer, M., Sing, T., Rahnenfuhrer, J., Lengauer, T., Selbig, J., Hoffmann, D., and Kaiser, R. (2005a). Estimating HIV evolutionary pathways and the
genetic barrier to drug resistance. J Infect Dis, 191(11):1953–60.
Beerenwinkel, N., Lengauer, T., Däumer, M., Kaiser, R., Walter, H., Korn, K., Hoffmann,
D., and Selbig, J. (2003b). Methods for optimizing antiviral combination therapies.
Bioinformatics, 19 Suppl 1:i16–25.
Beerenwinkel, N., Rahnenführer, J., Däumer, M., Hoffmann, D., Kaiser, R., Selbig, J., and
Lengauer, T. (2005b). Learning multiple evolutionary pathways from cross-sectional
data. J Comput Biol, 12(6):584–98.
Beerenwinkel, N., Rahnenführer, J., Kaiser, R., Hoffmann, D., Selbig, J., and Lengauer,
T. (2005c). Mtreemix: a software package for learning and using mixture models of
mutagenetic trees. Bioinformatics, 21(9):2106–7.
Beerenwinkel, N., Schmidt, B., Walter, H., Kaiser, R., Lengauer, T., Hoffmann, D., Korn,
K., and Selbig, J. (2002). Diversity and complexity of HIV-1 drug resistance: a bioinformatics approach to predicting phenotype from genotype. Proc Natl Acad Sci USA,
99(12):8271–6.
Bellman, R. (1958). On a routing problem. Quaterly of Applied Mathematics, 16(1):87–90.
Berger, E. A., Murphy, P. M., and Farber, J. M. (1999). Chemokine receptors as HIV-1
coreceptors: roles in viral entry, tropism, and disease. Annu Rev Immunol, 17:657–700.
182
8 Bibliography
Bickel, S., Bogojeska, J., Lengauer, T., and Scheffer, T. (2008). Multi-task learning for
HIV therapy screening. Proceedings of the 25th International Conference on Machine
Learning, pages 56–63.
Binley, J. M., Wrin, T., Korber, B., Zwick, M. B., Wang, M., Chappey, C., Stiegler,
G., Kunert, R., Zolla-Pazner, S., Katinger, H., Petropoulos, C. J., and Burton, D. R.
(2004). Comprehensive cross-clade neutralization analysis of a panel of anti-human
immunodeficiency virus type 1 monoclonal antibodies. J Virol, 78(23):13232–52.
Boffito, M., Acosta, E., Burger, D., Fletcher, C. V., Flexner, C., Garaffo, R., Gatti, G.,
Kurowski, M., Perno, C. F., Peytavin, G., Regazzi, M., and Back, D. (2005). Therapeutic
drug monitoring and drug-drug interactions involving antiretroviral drugs. Antivir Ther
(Lond), 10(4):469–77.
Boser, B., Guyon, I., and Vapnik, V. (1992). A training algorithm for optimal margin
classifiers. Proceedings of the fifth annual workshop on Computational Learning Theory,
pages 144–152.
Bratt, G., Karlsson, A., Leandersson, A. C., Albert, J., Wahren, B., and Sandström, E.
(1998). Treatment history and baseline viral load, but not viral tropism or CCR-5
genotype, influence prolonged antiviral efficacy of highly active antiretroviral treatment.
AIDS, 12(16):2193–202.
Breiman, L. (2001). Random forests. Machine learning.
Breiman, L., Friedman, J. H., Olshen, R. A., and Stone, C. J. (1984). Classification and
regression trees. Wadsworth, Belmont.
Bromley, S. K., Burack, W. R., Johnson, K. G., Somersalo, K., Sims, T. N., Sumen, C.,
Davis, M. M., Shaw, A. S., Allen, P. M., and Dustin, M. L. (2001). The immunological
synapse. Annu Rev Immunol, 19:375–96.
Brun-Vézinet, F., Costagliola, D., Khaled, M. A., Calvez, V., Clavel, F., Clotet, B.,
Haubrich, R., Kempf, D., King, M., Kuritzkes, D., Lanier, R., Miller, M., Miller, V.,
Phillips, A., Pillay, D., Schapiro, J., Scott, J., Shafer, R., Zazzi, M., Zolopa, A., and
DeGruttola, V. (2004). Clinically validated genotype analysis: guiding principles and
statistical concerns. Antivir Ther (Lond), 9(4):465–78.
Burton, D. R., Desrosiers, R. C., Doms, R. W., Koff, W. C., Kwong, P. D., Moore, J. P.,
Nabel, G. J., Sodroski, J., Wilson, I. A., and Wyatt, R. T. (2004). HIV vaccine design
and the neutralizing antibody problem. Nat Immunol, 5(3):233–6.
Burton, D. R., Pyati, J., Koduri, R., Sharp, S. J., Thornton, G. B., Parren, P. W., Sawyer,
L. S., Hendry, R. M., Dunlop, N., and Nara, P. L. (1994). Efficient neutralization
of primary isolates of HIV-1 by a recombinant human monoclonal antibody. Science,
266(5187):1024–7.
Carries, S., Muller, F., Muller, F. J., Morroni, C., and Wilson, D. (2007). Characteristics,
treatment, and antiretroviral prophylaxis adherence of south african rape survivors. J
Acquir Immune Defic Syndr, 46(1):68–71.
8 Bibliography
183
Chang, C. and Lin, C. (2001). LIBSVM: a library for support vector machines. Software
avaliable at http: // www. csie. ntu. edu. tw/ ~cjlin/ libsvm .
Chapelle, O. and Zien, A. (2005). Semi-supervised classification by low density separation.
Proceedings of the 10th International Workshop on Artificial Intelligence and Statistics,
pages 57–64.
Chen, J. C., Krucinski, J., Miercke, L. J., Finer-Moore, J. S., Tang, A. H., Leavitt, A. D.,
and Stroud, R. M. (2000). Crystal structure of the HIV-1 integrase catalytic core and Cterminal domains: a model for viral DNA binding. Proc Natl Acad Sci USA, 97(15):8233–
8.
Clavel, F. and Hance, A. J. (2004). HIV drug resistance. N Engl J Med, 350(10):1023–35.
Cozzi-Lepri, A., Phillips, A. N., Ruiz, L., Clotet, B., Loveday, C., Kjaer, J., Mens, H.,
Clumeck, N., Viksna, L., Antunes, F., Machala, L., Lundgren, J. D., and Group, E. S.
(2007). Evolution of drug resistance in HIV-infected patients remaining on a virologically
failing combination antiretroviral therapy regimen. AIDS, 21(6):721–32.
Crum, N. F., Riffenburgh, R. H., Wegner, S., Agan, B. K., Tasker, S. A., Spooner, K. M.,
Armstrong, A. W., Fraser, S., Wallace, M. R., and Consortium, T. A. C. (2006). Comparisons of causes of death and mortality rates among HIV-infected persons: analysis
of the pre-, early, and late HAART (highly active antiretroviral therapy) eras. J Acquir
Immune Defic Syndr, 41(2):194–200.
Dalgleish, A. G., Beverley, P. C., Clapham, P. R., Crawford, D. H., Greaves, M. F., and
Weiss, R. A. (1984). The CD4 (T4) antigen is an essential component of the receptor
for the AIDS retrovirus. Nature, 312(5996):763–7.
Däumer, M., Beerenwinkel, N., Hoffmann, D., Selbig, J., Lengauer, T., Oette, M., Fätkenheuer, G., Rockstroh, J. K., Pfister, H. J., and Kaiser, R. (2007). Geno2pheno: Determination of clinically relevant cut-offs. European Journal of Medical Research, 12(Supplement III):1–2.
Däumer, M. P., Kaiser, R., Klein, R., Lengauer, T., Thiele, B., and Thielen, A. (2008).
Inferring viral tropism from genotype with massively parallel sequencing: qualitative
and quantitative analysis. Antivir Ther (Lond), 13(4):A101–A101.
De Luca, A., Cingolani, A., Giambenedetto, S. D., Trotta, M. P., Baldini, F., Rizzo, M. G.,
Bertoli, A., Liuzzi, G., Narciso, P., Murri, R., Ammassari, A., Perno, C. F., and Antinori,
A. (2003). Variable prediction of antiretroviral treatment outcome by different systems
for interpreting genotypic human immunodeficiency virus type 1 drug resistance. J Infect
Dis, 187(12):1934–43.
De Luca, A., Giambenedetto, S. D., Romano, L., Gonnelli, A., Corsi, P., Baldari, M.,
Pietro, M. D., Menzo, S., Francisci, D., Almi, P., Zazzi, M., and Group, A. R. C.
A. S. (2006). Frequency and treatment-related predictors of thymidine-analogue mutation patterns in HIV-1 isolates after unsuccessful antiretroviral therapy. J Infect Dis,
193(9):1219–22.
184
8 Bibliography
De Luca, A., Vendittelli, M., Baldini, F., Giambenedetto, S. D., Trotta, M. P., Cingolani,
A., Bacarelli, A., Gori, C., Perno, C. F., Antinori, A., and Ulivi, G. (2004). Construction,
training and clinical validation of an interpretation system for genotypic HIV-1 drug
resistance based on fuzzy rules revised by virological outcomes. Antivir Ther (Lond),
9(4):583–93.
Deforche, K., Camacho, R., Laethem, K. V., Lemey, P., Rambaut, A., Moreau, Y., and
Vandamme, A.-M. (2008). Estimation of an in vivo fitness landscape experienced by
HIV-1 under drug selective pressure useful for prediction of drug resistance evolution
during treatment. Bioinformatics, 24(1):34–41.
DeFranco, A. L., Locksley, R. M., and Robertson, M. (2007). Immunity: the immune
response in infectious and inflammatory disease. New Science Press.
DeLong, E. R., DeLong, D. M., and Clarke-Pearson, D. L. (1988). Comparing the areas
under two or more correlated receiver operating characteristic curves: a nonparametric
approach. Biometrics, 44(3):837–45.
Dempster, A., Laird, N., and Rubin, D. (1977). Maximum likelihood from incomplete data
via the EM algorithm. Journal of the Royal Statistical Society B, 39:1–38.
Descamps, D., Apetrei, C., Collin, G., Damond, F., Simon, F., and Brun-Vézinet, F.
(1998). Naturally occurring decreased susceptibility of HIV-1 subtype G to protease
inhibitors. AIDS, 12(9):1109–11.
Desper, R., Jiang, F., Kallioniemi, O. P., Moch, H., Papadimitriou, C. H., and Schäffer,
A. A. (1999). Inferring tree models for oncogenesis from comparative genome hybridization data. J Comput Biol, 6(1):37–51.
Dijkstra, E. (1959). A note on two problems in connexion with graphs. Numerische
mathematik, 1:269–271.
DiRienzo, A. G., DeGruttola, V., Larder, B., and Hertogs, K. (2003). Non-parametric
methods to predict HIV drug susceptibility phenotype from genotype. Statistics in
medicine, 22(17):2785–98.
Dorigo, M. and Gambardella, L. (1997). Ant colony system: A cooperative learning approach to the traveling salesman problem. IEEE Transactions on evolutionary computation.
Douek, D. C., Kwong, P. D., and Nabel, G. J. (2006). The rational design of an AIDS
vaccine. Cell, 124(4):677–81.
Drăghici, S. and Potter, R. B. (2003). Predicting HIV drug resistance with neural networks.
Bioinformatics, 19(1):98–107.
Durant, J., Clevenbergh, P., Halfon, P., Delgiudice, P., Porsin, S., Simonet, P., Montagne, N., Boucher, C. A., Schapiro, J. M., and Dellamonica, P. (1999). Drug-resistance
genotyping in HIV-1 therapy: the VIRADAPT randomised controlled trial. Lancet,
353(9171):2195–9.
8 Bibliography
185
Dybul, M., Fauci, A., Bartlett, J., Kaplan, J., and Pau, A. (2002). Guidelines for using
antiretroviral agents among HIV-infected adults and adolescents - the panel on clinical
practices for treatment of HIV. Ann Intern Med, 137(5):381–433.
Edgar, R. C. (2004). Muscle: multiple sequence alignment with high accuracy and high
throughput. Nucleic Acids Res, 32(5):1792–7.
Edmonds, J. (1967). Optimum branchings. J Res Nbs B Math Sci, B 71(4):233–&.
Esté, J. A. and Telenti, A. (2007). HIV entry inhibitors. Lancet, 370(9581):81–8.
Fawcett, T. (2006). An introduction to ROC analysis. Pattern recognition letters, 27:861–
874.
Fellay, J., Shianna, K. V., Ge, D., Colombo, S., Ledergerber, B., Weale, M., Zhang, K.,
Gumbs, C., Castagna, A., Cossarizza, A., Cozzi-Lepri, A., Luca, A. D., Easterbrook,
P., Francioli, P., Mallal, S., Martinez-Picado, J., Miro, J. M., Obel, N., Smith, J. P.,
Wyniger, J., Descombes, P., Antonarakis, S. E., Letvin, N. L., McMichael, A. J., Haynes,
B. F., Telenti, A., and Goldstein, D. B. (2007). A whole-genome association study of
major determinants for host control of HIV-1. Science, 317(5840):944–7.
Fenyö, E. M., Heath, A., Dispinseri, S., Holmes, H., Lusso, P., Zolla-Pazner, S., Donners,
H., Heyndrickx, L., Alcami, J., Bongertz, V., Jassoy, C., Malnati, M., Montefiori, D.,
Moog, C., Morris, L., Osmanov, S., Polonis, V., Sattentau, Q., Schuitemaker, H., Sutthent, R., Wrin, T., and Scarlatti, G. (2009). International network for comparison of
HIV neutralization assays: the NeutNet report. PLoS ONE, 4(2):e4505.
Fields, B. N., Knipe, D. M., and Howley, P. M. (2007). Fields virology. Lippincott Williams
& Wilkins.
Fiscus, S. A., Cheng, B., Crowe, S. M., Demeter, L., Jennings, C., Miller, V., Respess,
R., Stevens, W., and for Collaborative HIV Research Alternative Viral Load Assay
Working Group, F. (2006). HIV-1 viral load assays for resource-limited settings. PLoS
Med, 3(10):e417.
Fisher, R. (1922). On the interpretation of χ2 from contingency tables, and the calculation
of P. Journal of the Royal Statistical Society.
Freed, E. O. (2004). HIV-1 and the host cell: an intimate association. Trends Microbiol,
12(4):170–7.
Gallo, R. C. (2002).
298(5599):1728–30.
Historical essay. the early years of HIV/AIDS.
Science,
Gallo, R. C., Sarin, P. S., Gelmann, E. P., Robert-Guroff, M., Richardson, E., Kalyanaraman, V. S., Mann, D., Sidhu, G. D., Stahl, R. E., Zolla-Pazner, S., Leibowitch, J.,
and Popovic, M. (1983). Isolation of human T-cell leukemia virus in acquired immune
deficiency syndrome (AIDS). Science, 220(4599):865–7.
186
8 Bibliography
Gao, F., Chen, Y., Levy, D. N., Conway, J. A., Kepler, T. B., and Hui, H. (2004). Unselected mutations in the human immunodeficiency virus type 1 genome are mostly
nonsynonymous and often deleterious. J Virol, 78(5):2426–33.
Gareiss, P. C. and Miller, B. L. (2009). Ribosomal frameshifting: an emerging drug target
for HIV. Current opinion in investigational drugs (London, England : 2000), 10(2):121–
8.
Gilks, C. F., Crowley, S., Ekpini, R., Gove, S., Perriens, J., Souteyrand, Y., Sutherland, D.,
Vitoria, M., Guerma, T., and Cock, K. D. (2006). The WHO public-health approach to
antiretroviral treatment against HIV in resource-limited settings. Lancet, 368(9534):505–
10.
Glass, T. R., Geest, S. D., Hirschel, B., Battegay, M., Furrer, H., Covassini, M., Vernazza,
P. L., Bernasconi, E., Rickenboch, M., Weber, R., Bucher, H. C., and Study, S. H. C.
(2008). Self-reported non-adherence to antiretroviral therapy repeatedly assessed by two
questions predicts treatment failure in virologically suppressed patients. Antivir Ther
(Lond), 13(1):77–85.
Green, D. R., Droin, N., and Pinkoski, M. (2003). Activation-induced cell death in T cells.
Immunol Rev, 193:70–81.
Grobler, J. A., Mckenna, P. M., Ly, S., Stillmock, K. A., Bahnck, C. M., Danovich, R. M.,
Dornadula, G., Hazuda, D. J., and Miller, M. D. (2009). Hiv integrase inhibitor dissociation rates correlate with efficacy in vitro. Antivir Ther (Lond), 14(4):A27–A27.
Grossman, Z., Meier-Schellersheim, M., Paul, W. E., and Picker, L. J. (2006). Pathogenesis
of HIV infection: what the virus spares is as important as what it destroys. Nature
Medicine, 12(3):289–95.
Guyon, I., Weston, J., Barnhill, S., and Vapnik, V. (2002). Gene selection for cancer
classification using support vector machines. Machine learning, 46:389–422.
Hahn, B. H., Shaw, G. M., Taylor, M. E., Redfield, R. R., Markham, P. D., Salahuddin,
S. Z., Wong-Staal, F., Gallo, R. C., Parks, E. S., and Parks, W. P. (1986). Genetic
variation in HTLV-III/LAV over time in patients with AIDS or at risk for AIDS. Science,
232(4757):1548–53.
Hall, M. (1999). Correlation-based feature selection for machine learning. Ph.D. Thesiss,
Department of Computer Science, University of Waikato, Hamilton, New Zealand.
Hammer, S. M., Eron, J. J., Reiss, P., Schooley, R. T., Thompson, M. A., Walmsley, S.,
Cahn, P., Fischl, M. A., Gatell, J. M., Hirsch, M. S., Jacobsen, D. M., Montaner, J.
S. G., Richman, D. D., Yeni, P. G., Volberding, P. A., and Society-USA, I. A. (2008).
Antiretroviral treatment of adult HIV infection: 2008 recommendations of the international AIDS society-USA panel. JAMA, 300(5):555–70.
Hammer, S. M., Katzenstein, D. A., Hughes, M. D., Gundacker, H., Schooley, R. T.,
Haubrich, R. H., Henry, W. K., Lederman, M. M., Phair, J. P., Niu, M., Hirsch, M. S.,
and Merigan, T. C. (1996). A trial comparing nucleoside monotherapy with combination
8 Bibliography
187
therapy in HIV-infected adults with CD4 cell counts from 200 to 500 per cubic millimeter.
N Engl J Med, 335(15):1081–90.
Han, J., Pei, J., Yin, Y., and Mao, R. (2004). Mining frequent patterns without candidate
generation: A frequent-pattern tree approach. Data Mining and Knowledge Discovery,
8:53–87.
Harrigan, P. R., Hogg, R. S., Dong, W. W. Y., Yip, B., Wynhoven, B., Woodward, J.,
Brumme, C. J., Brumme, Z. L., Mo, T., Alexander, C. S., and Montaner, J. S. G.
(2005). Predictors of HIV drug-resistance mutations in a large antiretroviral-naive cohort
initiating triple antiretroviral therapy. J Infect Dis, 191(3):339–47.
Hart, P., Nilsson, N., and Raphael, B. (1968). A formal basis for the heuristic determination of minimum cost paths. IEEE transactions on Systems Science and Cybernetics,
SSC4(2):100–107.
Hastie, T., Tibshirani, R., and Friedman, J. H. (2001). The elements of statistical learning:
data mining, inference, and prediction. Springer Series in Statistics.
Hazenberg, M. D., Otto, S. A., van Benthem, B. H. B., Roos, M. T. L., Coutinho, R. A.,
Lange, J. M. A., Hamann, D., Prins, M., and Miedema, F. (2003). Persistent immune
activation in HIV-1 infection is associated with progression to AIDS. AIDS, 17(13):1881–
8.
Hazuda, D. J., Felock, P., Witmer, M., Wolfe, A., Stillmock, K., Grobler, J. A., Espeseth, A., Gabryelski, L., Schleif, W., Blau, C., and Miller, M. D. (2000). Inhibitors of
strand transfer that prevent integration and inhibit HIV-1 replication in cells. Science,
287(5453):646–50.
Healy, B. and Degruttola, V. (2007). Hidden markov models for settings with intervalcensored transition times and uncertain time origin: application to HIV genetic analysis.
Biostatistics, 8(2):438–452.
Heeney, J. L., Dalgleish, A. G., and Weiss, R. A. (2006). Origins of HIV and the evolution
of resistance to AIDS. Science, 313(5786):462–6.
Hetherington, L. (2004). The MIT finite-state transducer toolkit for speech and language
processing. Proceeding of the 8th International Conference on Spoken Language Processing.
Huang, C., Tang, M., Zhang, M.-Y., Majeed, S., Montabana, E., Stanfield, R. L., Dimitrov,
D. S., Korber, B., Sodroski, J., Wilson, I. A., Wyatt, R., and Kwong, P. D. (2005).
Structure of a V3-containing HIV-1 gp120 core. Science, 310(5750):1025–8.
Huang, Y. and Suen, C. (1995). A method of combining multiple experts for the recognition
of unconstrained handwritten numberals. Ieee T Pattern Anal, 17:90–94.
Jacks, T., Power, M. D., Masiarz, F. R., Luciw, P. A., Barr, P. J., and Varmus, H. E.
(1988). Characterization of ribosomal frameshifting in HIV-1 gag-pol expression. Nature,
331(6153):280–3.
188
8 Bibliography
Jenwitheesuk, E. and Samudrala, R. (2005). Prediction of HIV-1 protease inhibitor
resistance using a protein-inhibitor flexible docking approach. Antivir Ther (Lond),
10(1):157–66.
Jiang, H., Deeks, S. G., Kuritzkes, D. R., Lallemant, M., Katzenstein, D., Albrecht, M.,
and DeGruttola, V. (2003). Assessing resistance costs of antiretroviral therapies via
measures of future drug options. J Infect Dis, 188(7):1001–8.
Joachims, T. (1999). Transductive inference for text classifcation using support vector
machines. International Conference on Machine Learning (ICML).
Johnson, V. A., Brun-Vezinet, F., Clotet, B., Conway, B., Kuritzkes, D. R., Pillay, D.,
Schapiro, J. M., Telenti, A., and Richman, D. D. (2005). Update of the drug resistance mutations in HIV-1: Fall 2005. Topics in HIV medicine : a publication of the
International AIDS Society, USA, 13(4):125–31.
Johnson, V. A., Brun-Vezinet, F., Clotet, B., Gunthard, H. F., Kuritzkes, D. R., Pillay, D.,
Schapiro, J. M., and Richman, D. D. (2008). Update of the drug resistance mutations
in HIV-1. Topics in HIV medicine : a publication of the International AIDS Society,
USA, 16(5):138–45.
Jordan, R., Gold, L., Cummins, C., and Hyde, C. (2002). Systematic review and metaanalysis of evidence for increasing numbers of drugs in antiretroviral combination therapy. BMJ, 324(7340):757.
Kamber, M. and Han, J. (2001). Data mining: Concepts and techniques. Morgan Kaufmann.
Kanki, P. J., Hamel, D. J., Sankalé, J. L., c Hsieh, C., Thior, I., Barin, F., Woodcock,
S. A., Guèye-Ndiaye, A., Zhang, E., Montano, M., Siby, T., Marlink, R., NDoye, I.,
Essex, M. E., and MBoup, S. (1999). Human immunodeficiency virus type 1 subtypes
differ in disease progression. J Infect Dis, 179(1):68–73.
Kanthak, S. and Ney, H. (2004). FSA: an efficient and flexible C++ toolkit for finite state
automata using on-demand computation. ACL ’04: Proceedings of the 42nd Annual
Meeting on Association for Computational Linguistics, pages 510–517.
Kantor, R., Katzenstein, D. A., Efron, B., Carvalho, A. P., Wynhoven, B., Cane, P., Clarke,
J., Sirivichayakul, S., Soares, M. A., Snoeck, J., Pillay, C., Rudich, H., Rodrigues, R.,
Holguin, A., Ariyoshi, K., Bouzas, M. B., Cahn, P., Sugiura, W., Soriano, V., Brigido,
L. F., Grossman, Z., Morris, L., Vandamme, A.-M., Tanuri, A., Phanuphak, P., Weber,
J. N., Pillay, D., Harrigan, P. R., Camacho, R., Schapiro, J. M., and Shafer, R. W.
(2005). Impact of HIV-1 subtype and antiretroviral therapy on protease and reverse
transcriptase genotype: results of a global collaboration. PLoS Med, 2(4):e112.
Karlsson Hedestam, G. B., Fouchier, R. A. M., Phogat, S., Burton, D. R., Sodroski, J.,
and Wyatt, R. T. (2008). The challenges of eliciting neutralizing antibodies to HIV-1
and to influenza virus. Nat Rev Microbiol, 6(2):143–55.
8 Bibliography
189
Katzenstein, D. A., Bosch, R. J., Hellmann, N., Wang, N., Bacheler, L., Albrecht,
M. A., and Team, A. . S. (2003). Phenotypic susceptibility and virological outcome
in nucleoside-experienced patients receiving three or four antiretroviral drugs. AIDS,
17(6):821–30.
Kauder, S. E., Bosque, A., Lindqvist, A., Planelles, V., and Verdin, E. (2009). Epigenetic
regulation of HIV-1 latency by cytosine methylation. PLoS Pathog, 5(6):e1000495.
Kierczak, M., Ginalski, K., Draminski, M., Koronacki, J., Rudnicki, W., and Komorowski,
J. (2009). A rough set-based model of HIV-1 reverse transcriptase resistome. Bioinformatics and Biology Insights.
Kirkpatrick, S., Gelatt, C., and Vecchi, M. (1983). Optimization by simulated annealing.
Science, 220(4598):671–680.
Kittler, J., Hatef, M., Duin, R., and Matas, J. (1998). On combining classifiers. Ieee T
Pattern Anal, 20(3):226–239.
Kjaer, J., Høj, L., Fox, Z., and Lundgren, J. D. (2008). Prediction of phenotypic susceptibility to antiretroviral drugs using physiochemical properties of the primary enzymatic
structure combined with artificial neural networks. HIV Med, 9(8):642–52.
Korber, B., Muldoon, M., Theiler, J., Gao, F., Gupta, R., Lapedes, A., Hahn, B. H.,
Wolinsky, S., and Bhattacharya, T. (2000). Timing the ancestor of the HIV-1 pandemic
strains. Science, 288(5472):1789–96.
Kumar, G. N., Rodrigues, A. D., Buko, A. M., and Denissen, J. F. (1996). Cytochrome
p450-mediated metabolism of the HIV-1 protease inhibitor ritonavir (ABT-538) in human liver microsomes. J Pharmacol Exp Ther, 277(1):423–31.
Kuncheva, L. (2002). Switching between selection and fusion in combining classifiers:
anexperiment. IEEE Transactions on Systems Man And Cybernetics, Part B-cybernetics,
32(2):146–156.
Kuncheva, L., Bezdek, J., and Duin, R. (2001). Decision templates for multiple classifier
fusion: an experimental comparison. Pattern Recognition, 34(2):299–314.
Kyte, J. and Doolittle, R. F. (1982). A simple method for displaying the hydropathic
character of a protein. J Mol Biol, 157(1):105–32.
Lambert, P.-H., Liu, M., and Siegrist, C.-A. (2005). Can successful vaccines teach us how
to induce efficient protective immune responses? Nature Medicine, 11(4 Suppl):S54–62.
Landwehr, N., Hall, M., and Frank, E. (2005). Logistic model trees. Machine learning,
59:161–205.
Lapins, M., Eklund, M., Spjuth, O., Prusis, P., and Wikberg, J. E. S. (2008). Proteochemometric modeling of HIV protease susceptibility. BMC Bioinformatics, 9:181.
190
8 Bibliography
Larder, B., Degruttola, V., Hammer, S., Harrigan, R., Wegner, S., Winslow, D., and Zazzi,
M. (2002). The international HIV resistance response database initiative: a new global
collaborative approach to relating viral genotype and treatment to clinical outcome.
Antivir Ther (Lond), 7:S111–S111.
Larder, B., Wang, D., Revell, A., and Lane, C. (2003). Neural network model identified
potentially effective drug combinations for patients failing salvage therapy. 2nd IAS
Conference on HIV Pathogenesis and Treatment 13–17 July2003, Paris, France. Poster
LB39.
Larder, B., Wang, D., Revell, A., Montaner, J., Harrigan, R., Wolf, F. D., Lange, J.,
Wegner, S., Ruiz, L., Pérez-Elías, M. J., Emery, S., Gatell, J., Monforte, A. D., Torti,
C., Zazzi, M., and Lane, C. (2007). The development of artificial neural networks to
predict virological response to combination HIV therapy. Antivir Ther (Lond), 12(1):15–
24.
Larder, B. A., Darby, G., and Richman, D. D. (1989). HIV with reduced sensitivity to
zidovudine (AZT) isolated during prolonged therapy. Science, 243(4899):1731–4.
Larder, B. A. and Kemp, S. D. (1989). Multiple mutations in HIV-1 reverse transcriptase
confer high-level resistance to zidovudine (AZT). Science, 246(4934):1155–8.
Lathrop, R. and Pazzani, M. (1999). Combinatorial optimization in rapidly mutating
drug-resistant viruses. Journal of Combinatorial Optimization, 3:301–320.
Le, T., Chiarella, J., Simen, B. B., Hanczaruk, B., Egholm, M., Landry, M. L., Dieckhaus,
K., Rosen, M. I., and Kozal, M. J. (2009). Low-abundance HIV drug-resistant viral
variants in treatment-experienced persons correlate with historical antiretroviral use.
PLoS ONE, 4(6):e6079.
Le Cessie, S. and Van Houwelingen, J. C. (1992). Ridge estimators in logistic regression.
Applied Statistics, 41:191–201.
Leitner, T., Korber, B., Daniels, M., Calef, C., and Foley, B. (2005). HIV-1 subtype
and circulating recombinant form (CRF) reference sequences, 2005. HIV sequence compendium.
Lengauer, T., Sander, O., Sierra, S., Thielen, A., and Kaiser, R. (2007). Bioinformatics
prediction of HIV coreceptor usage. Nat Biotechnol, 25(12):1407–10.
Lengauer, T. and Sing, T. (2006). Bioinformatics-assisted anti-HIV therapy. Nat Rev
Microbiol, 4(10):790–7.
Li, M., Gao, F., Mascola, J. R., Stamatatos, L., Polonis, V. R., Koutsoukos, M., Voss,
G., Goepfert, P., Gilbert, P., Greene, K. M., Bilska, M., Kothe, D. L., Salazar-Gonzalez,
J. F., Wei, X., Decker, J. M., Hahn, B. H., and Montefiori, D. C. (2005). Human immunodeficiency virus type 1 env clones from acute and early subtype B infections for standardized assessments of vaccine-elicited neutralizing antibodies. J Virol, 79(16):10108–
25.
8 Bibliography
191
Li, M., Salazar-Gonzalez, J. F., Derdeyn, C. A., Morris, L., Williamson, C., Robinson,
J. E., Decker, J. M., Li, Y., Salazar, M. G., Polonis, V. R., Mlisana, K., Karim, S. A.,
Hong, K., Greene, K. M., Bilska, M., Zhou, J., Allen, S., Chomba, E., Mulenga, J.,
Vwalika, C., Gao, F., Zhang, M., Korber, B. T. M., Hunter, E., Hahn, B. H., and
Montefiori, D. C. (2006). Genetic and neutralization properties of subtype C human
immunodeficiency virus type 1 molecular env clones from acute and early heterosexually
acquired infections in Southern Africa. J Virol, 80(23):11776–90.
Liu, J., Bartesaghi, A., Borgnia, M. J., Sapiro, G., and Subramaniam, S. (2008). Molecular
architecture of native HIV-1 gp120 trimers. Nature, 455(7209):109–13.
Liu, R. and Yuan, B. (2001). Multiple classifiers combination by clustering and selection.
Information Fusion, 2:163–168.
Loizidou, E. Z., Kousiappa, I., Zeinalipour-Yazdi, C. D., Van de Vijver, D. A. M. C., and
Kostrikis, L. G. (2009). Implications of HIV-1 M group polymorphisms on integrase
inhibitor efficacy and resistance: genetic and structural in silico analyses. Biochemistry,
48(1):4–6.
Lombardy, S., Regis-Gianas, Y., and Sakarovitch, J. (2004). Introducing Vaucanson. Theoretical Computer Science, 328:77–96.
Lyles, R. H., Muñoz, A., Yamashita, T. E., Bazmi, H., Detels, R., Rinaldo, C. R., Margolick, J. B., Phair, J. P., and Mellors, J. W. (2000). Natural history of human immunodeficiency virus type 1 viremia after seroconversion and proximal to AIDS in a large
cohort of homosexual men. J Infect Dis, 181(3):872–80.
Maggiolo, F., Airoldi, M., Callegaro, A., Ripamonti, D., Gregis, G., Quinzan, G., Bombana, E., Ravasio, V., and Suter, F. (2007). Prediction of virologic outcome of salvage
antiretroviral treatment by different systems for interpreting genotypic HIV drug resistance. Journal of the International Association of Physicians in AIDS Care (JIAPAC),
6:87–93.
Mahungu, T., Smith, C., Turner, F., Egan, D., Youle, M., Johnson, M., Khoo, S., Back,
D., and Owen, A. (2009). Cytochrome p450 2b6 516g–>t is associated with plasma
concentrations of nevirapine at both 200 mg twice daily and 400 mg once daily in an
ethnically diverse population. HIV Med, 10(5):310–7.
Mallal, S., Phillips, E., Carosi, G., Molina, J.-M., Workman, C., Tomazic, J., Jägel-Guedes,
E., Rugina, S., Kozyrev, O., Cid, J. F., Hay, P., Nolan, D., Hughes, S., Hughes, A.,
Ryan, S., Fitch, N., Thorborn, D., and Benbow, A. (2008). HLA-B*5701 screening for
hypersensitivity to abacavir. N Engl J Med, 358(6):568–79.
Mansky, L. M. and Temin, H. M. (1995). Lower in vivo mutation rate of human immunodeficiency virus type 1 than that predicted from the fidelity of purified reverse
transcriptase. J Virol, 69(8):5087–94.
Markowitz, M., Mohri, H., Mehandru, S., Shet, A., Berry, L., Kalyanaraman, R., Kim,
A., Chung, C., Jean-Pierre, P., Horowitz, A., Mar, M. L., Wrin, T., Parkin, N., Poles,
192
8 Bibliography
M., Petropoulos, C., Mullen, M., Boden, D., and Ho, D. D. (2005). Infection with
multidrug resistant, dual-tropic HIV-1 and rapid progression to AIDS: a case report.
Lancet, 365(9464):1031–8.
Mattapallil, J. J., Douek, D. C., Hill, B., Nishimura, Y., Martin, M., and Roederer, M.
(2005). Massive infection and loss of memory CD4+ T cells in multiple tissues during
acute SIV infection. Nature, 434(7037):1093–7.
McCallister, S., Lalezari, J., Richmond, G., Thompson, M., Harrigan, R., Martin, D.,
Salzwedel, K., and Allaway, G. (2008). HIV-1 Gag polymorphisms determine treatment
response to bevirimat (PA-457). Antivir Ther (Lond), 13(4):A10–A10.
McColl, D. J., Fransen, S., Gupta, S., Parkin, N., Margot, N., Chuck, S., Cheng, A. K.,
and Miller, M. D. (2007). Resistance and cross-resistance to first generation integrase
inhibitors: insights from a Phase II study of elvitegravir (GS-9137). Antivir Ther (Lond),
12(5):S11–S11.
Meynard, J.-L., Vray, M., Morand-Joubert, L., Race, E., Descamps, D., Peytavin, G.,
Matheron, S., Lamotte, C., Guiramand, S., Costagliola, D., Brun-Vézinet, F., Clavel,
F., Girard, P.-M., and Group, N. T. (2002). Phenotypic or genotypic resistance testing
for choosing antiretroviral therapy after treatment failure: a randomized trial. AIDS,
16(5):727–36.
Miller, M. D. and Hazuda, D. J. (2004). HIV resistance to the fusion inhibitor enfuvirtide:
mechanisms and clinical implications. Drug Resist Updat, 7(2):89–95.
Mocroft, A., Vella, S., Benfield, T. L., Chiesi, A., Miller, V., Gargalianos, P., d’Arminio
Monforte, A., Yust, I., Bruun, J. N., Phillips, A. N., and Lundgren, J. D. (1998). Changing patterns of mortality across europe in patients infected with hiv-1. eurosida study
group. Lancet, 352(9142):1725–30.
Mohri, M. (2002). Semiring frameworks and algorithms for shortest-distance problems.
Journal of Automata, Languages and Combinatorics, 7(3):321–350.
Mohri, M. (2005). Statistical natural language processing. M. Lothaire, editor, Applied
Combinatorics on Words. Cambridge University Press., pages 199–226.
Mohri, M., Pereira, F., and Riley, M. (2000). The design principles of a weighted finite-state
transducer library. Theoretical Computer Science, 231:17–32.
Mohri, M. and Riley, M. (2001). A weight pushing algorithm for large vocabulary speech
recognition. Proceedings of the Seventh European Conference on Speech Communication
and Technology, pages 1303–1606.
Mohri, M. and Riley, M. (2002). An efficient algorithm for the N-best-strings problem.
Proceedings of the Seventh International Conference on Spoken Language Processing,
pages 1313–1316.
Montagnier, L. (2002).
298(5599):1727–8.
Historical essay. a history of HIV discovery.
Science,
8 Bibliography
193
Müller, F. (2008). Inferring virological response to antiretroviral combination therapy based
on past treatment lines. Bachelor Thesis, Saarland University, Saarbrücken, Germany.
Müller, F. (2009). Assessing antibody neutralization oh HIV-1 as an initial step in the
search of gp160-based immunogens. Master Thesis, Saarland University, Saarbrücken,
Germany.
Ney, H., Essen, U., and Kneser, R. (1994). On structuring probabilistic dependences in
stochastic language modelling. Computer Speech and Language, 8:1–38.
Nijhuis, M., van Maarseveen, N. M., Lastere, S., Schipper, P., Coakley, E., Glass, B.,
Rovenska, M., de Jong, D., Chappey, C., Goedegebuure, I. W., Heilek-Snyder, G., Dulude, D., Cammack, N., Brakier-Gingras, L., Konvalinka, J., Parkin, N., Kräusslich,
H.-G., Brun-Vezinet, F., and Boucher, C. A. B. (2007). A novel substrate-based HIV-1
protease inhibitor drug resistance mechanism. PLoS Med, 4(1):e36.
Novembre, J., Galvani, A. P., and Slatkin, M. (2005). The geographic spread of the CCR5
δ32 HIV-resistance allele. PLoS Biol, 3(11):e339.
Nowozin, S., BakIr, G., and Tsuda, K. (2007). Discriminative subsequence mining for
action classification. 11th IEEE International Conference on Computer Vision, pages
1919–1923.
Oette, M., Kaiser, R., Däumer, M., Petch, R., Fätkenheuer, G., Carls, H., Rockstroh, J. K.,
Schmalöer, D., Stechel, J., Feldt, T., Pfister, H., and Häussinger, D. (2006). Primary
HIV drug resistance and efficacy of first-line antiretroviral therapy guided by resistance
testing. J Acquir Immune Defic Syndr, 41(5):573–81.
Pantaleo, G., Graziosi, C., and Fauci, A. S. (1993). New concepts in the immunopathogenesis of human immunodeficiency virus infection. N Engl J Med, 328(5):327–35.
Pantophlet, R. and Burton, D. R. (2006). GP120: target for neutralizing HIV-1 antibodies.
Annu Rev Immunol, 24:739–69.
Parikh, U. M., Bacheler, L., Koontz, D., and Mellors, J. W. (2006). The K65R mutation in human immunodeficiency virus type 1 reverse transcriptase exhibits bidirectional
phenotypic antagonism with thymidine analog mutations. J Virol, 80(10):4971–7.
Pereira, F. and Riley, M. (1997). Speech recognition by composition of weighted finite
automata. Roche, E. and Schabes, Y. (Eds.) Finite-State language processing. MIT
Press, pages 431–453.
Perelson, A. S. (2002). Modelling viral and immune system dynamics. Nat Rev Immunol,
2(1):28–36.
Perner, J., Altmann, A., and Lengauer, T. (2009). Semi-supervised learning for improving
prediction of HIV drug resistance. Lecture Notes in Informatics, Proceedings of the
German Conference on Bioinformatics 2009, P-157:55–65.
194
8 Bibliography
Petropoulos, C. J., Parkin, N. T., Limoli, K. L., Lie, Y. S., Wrin, T., Huang, W., Tian, H.,
Smith, D., Winslow, G. A., Capon, D. J., and Whitcomb, J. M. (2000). A novel phenotypic drug susceptibility assay for human immunodeficiency virus type 1. Antimicrob
Agents Chemother, 44(4):920–8.
Pettit, S. C., Everitt, L. E., Choudhury, S., Dunn, B. M., and Kaplan, A. H. (2004). Initial
cleavage of the human immunodeficiency virus type 1 gagpol precursor by its activated
protease occurs by an intramolecular mechanism. J Virol, 78(16):8477–85.
Phogat, S. and Wyatt, R. (2007). Rational modifications of HIV-1 envelope glycoproteins
for immunogen design. Curr Pharm Des, 13(2):213–27.
Picker, L. J. (2006). Immunopathogenesis of acute AIDS virus infection. Curr Opin
Immunol, 18(4):399–405.
Plantier, J.-C., Leoz, M., Dickerson, J. E., Oliveira, F. D., Cordonnier, F., Lemée, V.,
Damond, F., and Simon, D. L. R. F. (2009). A new human immunodeficiency virus
derived from gorillas. Nature Medicine.
Plotkin, S. A. (2005). Vaccines: past, present and future. Nature Medicine, 11(4 Suppl):S5–
11.
Popovic, M., Sarngadharan, M. G., Read, E., and Gallo, R. C. (1984). Detection, isolation,
and continuous production of cytopathic retroviruses (HTLV-III) from patients with
AIDS and pre-AIDS. Science, 224(4648):497–500.
Prilusky, J., Felder, C. E., Zeev-Ben-Mordehai, T., Rydberg, E. H., Man, O., Beckmann,
J. S., Silman, I., and Sussman, J. L. (2005). Foldindex: a simple tool to predict whether
a given protein sequence is intrinsically unfolded. Bioinformatics, 21(16):3435–8.
Prosperi, M., Giambenedetto, S. D., Trotta, M., Cingolani, A., Ruiz, L., Baxter, J., Clevenbergh, P., Perno, C., Cauda, R., Ulivi, G., Antinori, A., and De Luca, A. (2004).
A fuzzy relational system trained by genetic algorithms and HIV-1 resistance genotypes/virological response data from prospective studies usefully predicts treatment outcomes. Antivir Ther (Lond), 9(4):U89–U89.
Prosperi, M., Rosen-Zvi, M., Altmann, A., Aharoni, E., Peres, Y., Sönnerbord, A., Incardona, F., Struck, D., Kaiser, R., and Zazzi, M. (2009a). Antiretroviral therapy optimisation without genotype resistance testing: a perspective on treatment history based
models. Reviews in Antiviral Therapy, 1:28–29.
Prosperi, M., Zazzi, M., Perno, C., Giambenedetto, S. D., Baxter, J., Ruiz, L., Clevenbergh, P., Ulivi, G., Antinori, A., and De Luca, A. (2005). ’Common law’ applied to
treatment decisions for drug resistant HIV. Antivir Ther (Lond), 10(4):S62–S62.
Prosperi, M. C. F., D’Autilia, R., Incardona, F., De Luca, A., Zazzi, M., and Ulivi, G.
(2009b). Stochastic modelling of genotypic drug-resistance for human immunodeficiency
virus towards long-term combination therapy optimization. Bioinformatics, 25(8):1040–
7.
8 Bibliography
195
Rahnenführer, J., Beerenwinkel, N., Schulz, W. A., Hartmann, C., von Deimling, A.,
Wullich, B., and Lengauer, T. (2005). Estimating cancer survival and clinical outcome
based on genetic tumor progression scores. Bioinformatics, 21(10):2438–46.
Rambaut, A., Posada, D., Crandall, K. A., and Holmes, E. C. (2004). The causes and
consequences of HIV evolution. Nat Rev Genet, 5(1):52–61.
Ratner, L., Haseltine, W., Patarca, R., Livak, K. J., Starcich, B., Josephs, S. F., Doran,
E. R., Rafalski, J. A., Whitehorn, E. A., Baumeister, K., Ivanoff, L., Petteway, S. R.,
Pearson, M. L., Lautenberger, J. A., Papas, T. S., Ghrayeb, J., Chang, N. T., Gallo,
R. C., and Wong-Staal, F. (1985). Complete nucleotide sequence of the AIDS virus,
HTLV-III. Nature, 313(6000):277–84.
Rerks-Ngarm, S., Pitisuttithum, P., Nitayaphan, S., Kaewkungwal, J., Chiu, J., Paris, R.,
Premsri, N., Namwat, C., de Souza, M., Adams, E., Benenson, M., Gurunathan, S.,
Tartaglia, J., McNeil, J., Francis, D., Stablein, D., Birx, D., Chunsuttiwat, S., Khamboonruang, C., Thongcharoen, P., Robb, M., Michael, N., Kunasol, P., Kim, J., and
the MOPH-TAVEG Investigators (2009). Vaccination with ALVAC and AIDSVAX to
prevent HIV-1 infection in thailand. N Engl J Med.
Revell, A. D., Wang, D., Harrigan, R., Gatell, J., Ruiz, L., Emery, S., Perez-Elias, M. J.,
Torti, C., Baxter, J., DeWolf, F., Gazzard, B., Geretti, A. M., Staszewski, S., Hamers, R.,
Wensing, A. M. J., Lange, J., Montaner, J. M., and Larder, B. A. (2009). Computational
models developed without a genotype for resource-poor countries predict response to HIV
treatment with 82% accuracy. Antivir Ther (Lond), 14(4):A38–A38.
Rhee, S.-Y., Gonzales, M. J., Kantor, R., Betts, B. J., Ravela, J., and Shafer, R. W. (2003).
Human immunodeficiency virus reverse transcriptase and protease sequence database.
Nucleic Acids Res, 31(1):298–303.
Rhee, S.-Y., Taylor, J., Wadhera, G., Ben-Hur, A., Brutlag, D. L., and Shafer, R. W.
(2006). Genotypic predictors of human immunodeficiency virus type 1 drug resistance.
Proc Natl Acad Sci USA, 103(46):17355–60.
Ribeiro, R. M. and Bonhoeffer, S. (2000). Production of resistant HIV mutants during
antiretroviral therapy. Proc Natl Acad Sci USA, 97(14):7681–6.
Richman, D. D., Wrin, T., Little, S. J., and Petropoulos, C. J. (2003). Rapid evolution of
the neutralizing antibody response to HIV type 1 infection. Proc Natl Acad Sci USA,
100(7):4144–9.
Roben, P., Moore, J. P., Thali, M., Sodroski, J., Barbas, C. F., and Burton, D. R. (1994).
Recognition properties of a panel of human recombinant Fab fragments to the CD4
binding site of gp120 that show differing abilities to neutralize human immunodeficiency
virus type 1. J Virol, 68(8):4821–8.
Robertson, D. L., Anderson, J. P., Bradac, J. A., Carr, J. K., Foley, B., Funkhouser, R. K.,
Gao, F., Hahn, B. H., Kalish, M. L., Kuiken, C., Learn, G. H., Leitner, T., McCutchan,
F., Osmanov, S., Peeters, M., Pieniazek, D., Salminen, M., Sharp, P. M., Wolinsky, S.,
and Korber, B. (2000). HIV-1 nomenclature proposal. Science, 288(5463):55–6.
196
8 Bibliography
Rogova, G. (1994). Combining the results of several neural network classifiers. Neural
networks, 7(5):777–781.
Roomp, K., Beerenwinkel, N., Sing, T., Schuelter, E., Buech, J., Sierra-Aragon, S.,
Daeumer, M., Hoffmann, D., Kaiser, R., Lengauer, T., and Selbig, J. (2006). Arevir: A secure platform for designing personalized antiretroviral therapies against HIV.
Lect Notes Comput Sc, 4075:185–194.
Rosen-Zvi, M., Altmann, A., Prosperi, M., Aharoni, E., Neuvirth, H., Sönnerborg, A.,
Schülter, E., Struck, D., Peres, Y., Incardona, F., Kaiser, R., Zazzi, M., and Lengauer,
T. (2008). Selecting anti-HIV therapies based on a variety of genomic and clinical factors.
Bioinformatics, 24(13):i399–406.
Rousseeuw, P. and Kaufman, L. (1990). Finding groups in data: an introduction to cluster
analysis. Wiley.
Sacktor, N., Nakasujja, N., Skolasky, R. L., Rezapour, M., Robertson, K., Musisi, S.,
Katabira, E., Ronald, A., Clifford, D. B., Laeyendecker, O., and Quinn, T. C. (2009).
HIV subtype D is associated with dementia, compared with subtype A, in immunosuppressed individuals at risk of cognitive impairment in kampala, uganda. Clin Infect Dis,
49(5):780–6.
Saigo, H., Uno, T., and Tsuda, K. (2007). Mining complex genotypic features for predicting
HIV-1 drug resistance. Bioinformatics, 23(18):2455–62.
Salzwedel, K., Martin, D. E., and Sakalian, M. (2007). Maturation inhibitors: a new
therapeutic class targets the virus structure. AIDS reviews, 9(3):162–72.
Sander, O., Sing, T., Sommer, I., Low, A. J., Cheung, P. K., Harrigan, P. R., Lengauer, T.,
and Domingues, F. S. (2007). Structural descriptors of gp120 V3 loop for the prediction
of HIV-1 coreceptor usage. PLoS Comput Biol, 3(3):e58.
Sarkar, I., Hauber, I., Hauber, J., and Buchholz, F. (2007). HIV-1 proviral DNA excision
using an evolved recombinase. Science, 316(5833):1912–5.
Schweighardt, B., Liu, Y., Huang, W., Chappey, C., Lie, Y. S., Petropoulos, C. J., and
Wrin, T. (2007). Development of an HIV-1 reference panel of subtype B envelope clones
isolated from the plasma of recently infected individuals. J Acquir Immune Defic Syndr,
46(1):1–11.
Shaw, G. M., Hahn, B. H., Arya, S. K., Groopman, J. E., Gallo, R. C., and Wong-Staal, F.
(1984). Molecular characterization of human T-cell leukemia (lymphotropic) virus type
III in the acquired immune deficiency syndrome. Science, 226(4679):1165–71.
Shaw, G. M., Harper, M. E., Hahn, B. H., Epstein, L. G., Gajdusek, D. C., Price, R. W.,
Navia, B. A., Petito, C. K., O’Hara, C. J., and Groopman, J. E. (1985). HTLV-III infection in brains of children and adults with AIDS encephalopathy. Science, 227(4683):177–
82.
8 Bibliography
197
Sheehy, A. M., Gaddis, N. C., Choi, J. D., and Malim, M. H. (2002). Isolation of a human
gene that inhibits HIV-1 infection and is suppressed by the viral Vif protein. Nature,
418(6898):646–50.
Shehu-Xhilaga, M., Crowe, S. M., and Mak, J. (2001). Maintenance of the Gag/Gag-Pol
ratio is important for human immunodeficiency virus type 1 RNA dimerization and viral
infectivity. J Virol, 75(4):1834–41.
Shen, L., Peterson, S., Sedaghat, A. R., McMahon, M. A., Callender, M., Zhang, H., Zhou,
Y., Pitt, E., Anderson, K. S., Acosta, E. P., and Siliciano, R. F. (2008). Dose-response
curve slope sets class-specific limits on inhibitory potential of anti-HIV drugs. Nature
Medicine, 14(7):762–6.
Simon, V., Ho, D. D., and Karim, Q. A. (2006). HIV/AIDS epidemiology, pathogenesis,
prevention, and treatment. Lancet, 368(9534):489–504.
Sing, T., Sander, O., Beerenwinkel, N., and Lengauer, T. (2005a). ROCR: visualizing
classifier performance in R. Bioinformatics, 21(20):3940–1.
Sing, T., Svicher, V., Beerenwinkel, N., Ceccherinig-Silberstein, F., Däumer, M., Kaiser,
R., Walter, H., Korn, K., Hoffmann, D., Oette, M., Rockstroh, J., Fatkenheuer, G.,
Perno, C., and Lengauer, T. (2005b). Characterization of novel HIV drug resistance
mutations using clustering, multidimensional scaling and SVM-based feature ranking.
Lect Notes Artif Int, 3721:285–296.
Sluis-Cremer, N., Arion, D., and Parniak, M. A. (2000). Molecular mechanisms of HIV1 resistance to nucleoside reverse transcriptase inhibitors (NRTIs). Cell Mol Life Sci,
57(10):1408–22.
Snoeck, J., Kantor, R., Shafer, R. W., Laethem, K. V., Deforche, K., Carvalho, A. P.,
Wynhoven, B., Soares, M. A., Cane, P., Clarke, J., Pillay, C., Sirivichayakul, S., Ariyoshi,
K., Holguin, A., Rudich, H., Rodrigues, R., Bouzas, M. B., Brun-Vézinet, F., Reid, C.,
Cahn, P., Brigido, L. F., Grossman, Z., Soriano, V., Sugiura, W., Phanuphak, P., Morris,
L., Weber, J., Pillay, D., Tanuri, A., Harrigan, R. P., Camacho, R., Schapiro, J. M.,
Katzenstein, D., and Vandamme, A.-M. (2006). Discordances between interpretation
algorithms for genotypic resistance to protease and reverse transcriptase inhibitors of
human immunodeficiency virus are subtype dependent. Antimicrob Agents Chemother,
50(2):694–701.
Strobl, C., Boulesteix, A.-L., Zeileis, A., and Hothorn, T. (2007). Bias in random forest
variable importance measures: illustrations, sources and a solution. BMC Bioinformatics, 8:25.
Swanstrom, R., Bosch, R. J., Katzenstein, D., Cheng, H., Jiang, H., Hellmann, N.,
Haubrich, R., Fiscus, S. A., Fletcher, C. V., Acosta, E. P., and Gulick, R. M. (2004).
Weighted phenotypic susceptibility scores are predictive of the HIV-1 RNA response in
protease inhibitor-experienced HIV-1-infected subjects. J Infect Dis, 190(5):886–93.
Telenti, A. and Goldstein, D. B. (2006). Genomics meets HIV-1. Nat Rev Microbiol,
4(11):865–73.
198
8 Bibliography
Telenti, A. and Zanger, U. M. (2008). Pharmacogenetics of anti-HIV drugs. Annu Rev
Pharmacol Toxicol, 48:227–56.
Thibaut, L., Rochas, S., Dourlat, J., Monneret, C., Carayon, K., Deprez, E., Mouscadet,
J. F., Soma, E., and Lebel-Binay, S. (2009). New integrase binding inhibitors acting in
synergy with raltegravir. Antivir Ther (Lond), 14(4):A29–A29.
Trkola, A., Kuster, H., Rusert, P., von Wyl, V., Leemann, C., Weber, R., Stiegler, G.,
Katinger, H., Joos, B., and Günthard, H. F. (2008). In vivo efficacy of human immunodeficiency virus neutralizing antibodies: estimates for protective titers. J Virol,
82(3):1591–9.
Uchil, P. D. and Mothes, W. (2009). HIV entry revisited. Cell, 137(3):402–4.
Van Laethem, K., De Luca, A., Antinori, A., Cingolani, A., Perno, C. F., and Vandamme,
A.-M. (2002). A genotypic drug resistance interpretation algorithm that significantly
predicts therapy response in HIV-1-infected patients. Antivir Ther (Lond), 7(2):123–9.
Vercauteren, J. and Vandamme, A.-M. (2006). Algorithms for the interpretation of HIV-1
genotypic drug resistance information. Antiviral Res, 71(2-3):335–42.
Verheyen, J., Verhofstede, C., Knops, E., Vandekerckhove, L., Fun, A., Brunen, D., Dauwe,
K., Wensing, A., Pfister, H., Kaiser, R., and Nijhuis, M. (2009). High prevalence of
bevirimat resistance mutations in protease inhibitor-resistant hiv isolates. AIDS.
Vermeiren, H., Van Craenenbroeck, E., Alen, P., Bacheler, L., Picchio, G., Lecocq, P., and
Team, V. C. R. C. (2007). Prediction of HIV-1 drug susceptibility phenotype from the
viral genotype using linear regression modeling. J Virol Methods, 145(1):47–55.
von Andrian, U. H. and Mackay, C. R. (2000). T-cell function and migration. two sides of
the same coin. N Engl J Med, 343(14):1020–34.
Walker, B. D. and Burton, D. R. (2008). Toward an AIDS vaccine. Science, 320(5877):760–
4.
Walker, L. M., Phogat, S. K., Chan-Hui, P.-Y., Wagner, D., Phung, P., Goss, J. L., Wrin,
T., Simek, M. D., Fling, S., Mitcham, J. L., Lehrman, J. K., Priddy, F. H., Olsen,
O. A., Frey, S. M., Hammond, P. W., Investigators, P. G. P., Kaminsky, S., Zamb,
T., Moyle, M., Koff, W. C., Poignard, P., and Burton, D. R. (2009). Broad and potent
neutralizing antibodies from an african donor reveal a new HIV-1 vaccine target. Science,
326(5950):285–9.
Walter, H., Schmidt, B., Korn, K., Vandamme, A. M., Harrer, T., and Überla, K. (1999).
Rapid, phenotypic HIV-1 drug sensitivity assay for protease and reverse transcriptase
inhibitors. J Clin Virol, 13(1-2):71–80.
Wang, D., Larder, B., Revell, A., Harrigan, R., and Montaner, J. (2003). A neural network
model using clinical cohort data accurately predicts virological response and identifies
regimens with increased probabilisticbility of success in treatment failures. Antivir Ther
(Lond), 8(3):U99–U99.
8 Bibliography
199
Wang, J. Y., Ling, H., Yang, W., and Craigie, R. (2001). Structure of a two-domain
fragment of HIV-1 integrase: implications for domain organization in the intact protein.
EMBO J, 20(24):7333–43.
Wang, K., Jenwitheesuk, E., Samudrala, R., and Mittler, J. E. (2004). Simple linear model
provides highly accurate genotypic predictions of HIV-1 drug resistance. Antivir Ther
(Lond), 9(3):343–52.
Winters, B., Montaner, J., Harrigan, P. R., Gazzard, B., Pozniak, A., Miller, M. D., Emery,
S., van Leth, F., Robinson, P., Baxter, J. D., Perez-Elias, M., Castor, D., Hammer, S.,
Rinehart, A., Vermeiren, H., Craenenbroeck, E. V., and Bacheler, L. (2008). Determination of clinically relevant cutoffs for HIV-1 phenotypic resistance estimates through
a combined analysis of clinical trial and cohort data. J Acquir Immune Defic Syndr,
48(1):26–34.
Wong-Staal, F., Shaw, G. M., Hahn, B. H., Salahuddin, S. Z., Popovic, M., Markham, P.,
Redfield, R., and Gallo, R. C. (1985). Genomic diversity of human T-lymphotropic virus
type III (HTLV-III). Science, 229(4715):759–62.
Woods, K., Kegelmeyer Jr., W. P., and Bowyer, K. (1997). Combination of multiple
classifiers using local accuracy estimates. Ieee T Pattern Anal, 19:405–410.
Wright, S. (1931). Evolution in mendelian populations. Genetics, 16:97–159.
Wyatt, R. and Sodroski, J. (1998). The HIV-1 envelope glycoproteins: fusogens, antigens,
and immunogens. Science, 280(5371):1884–8.
Yang, O. O., Nguyen, P. T., Kalams, S. A., Dorfman, T., Göttlinger, H. G., Stewart, S.,
Chen, I. S. Y., Threlkeld, S., and Walker, B. D. (2002). Nef-mediated resistance of
human immunodeficiency virus type 1 to antiviral cytotoxic T lymphocytes. J Virol,
76(4):1626–31.
Yin, J., Beerenwinkel, N., Rahnenführer, J., and Lengauer, T. (2006). Model selection for
mixtures of mutagenetic trees. Statistical applications in genetics and molecular biology,
5:Article17.
Yoder, K. E. and Bushman, F. D. (2000). Repair of gaps in retroviral DNA integration
intermediates. J Virol, 74(23):11191–200.
Zaccarelli, M., Lorenzini, P., Ceccherini-Silberstein, F., Tozzi, V., Forbici, F., Gori, C.,
Trotta, M. P., Boumis, E., Narciso, P., Perno, C. F., and Antinori, A. (2009). Historical
resistance profile helps to predict salvage failure. Antivir Ther (Lond), 14(2):285–91.
Zazzi, M., Prosperi, M., Vicenti, I., Giambenedetto, S. D., Callegaro, A., Bruzzone, B.,
Baldanti, F., Gonnelli, A., Boeri, E., Paolini, E., Rusconi, S., Giacometti, A., Maggiolo,
F., Menzo, S., De Luca, A., and Group, A. C. (2009). Rules-based HIV-1 genotypic
resistance interpretation systems predict 8 week and 24 week virological antiretroviral
treatment outcome and benefit from drug potency weighting. J Antimicrob Chemother,
64(3):616–24.
200
8 Bibliography
Zhou, T., Xu, L., Dey, B., Hessell, A. J., Ryk, D. V., Xiang, S.-H., Yang, X., Zhang, M.-Y.,
Zwick, M. B., Arthos, J., Burton, D. R., Dimitrov, D. S., Sodroski, J., Wyatt, R., Nabel,
G. J., and Kwong, P. D. (2007). Structural definition of a conserved neutralization
epitope on HIV-1 gp120. Nature, 445(7129):732–7.
Zhu, T., Korber, B. T., Nahmias, A. J., Hooper, E., Sharp, P. M., and Ho, D. D. (1998).
An african HIV-1 sequence from 1959 and implications for the origin of the epidemic.
Nature, 391(6667):594–7.
Zhu, X. (2007). Semi-supervised learning literature survey. Computer Science.
Zwick, M. B., Labrijn, A. F., Wang, M., Spenlehauer, C., Saphire, E. O., Binley, J. M.,
Moore, J. P., Stiegler, G., Katinger, H., Burton, D. R., and Parren, P. W. (2001).
Broadly neutralizing antibodies targeted to the membrane-proximal external region of
human immunodeficiency virus type 1 glycoprotein gp41. J Virol, 75(22):10892–905.
Appendix: List of Publications
Jasmina Bogojeska, Steffen Bickel, André Altmann, Thomas Lengauer (submitted).
Dealing with sparse data in predicting outcomes of HIV combination therapies. Bioinformatics (submitted).
André Altmann*, Laura Tolosi*, Oliver Sander, Thomas Lengauer (2010). Permuation
importance: a corrected feature importance measure. Bioinformatics, 2010, 26(10): 13401347.
Hendrik Weisser, André Altmann, Saleta Sierra, Francesca Incardona, Daniel Struck,
Anders Sönnerborg, Rolf Kaiser, Maurizio Zazzi, Monika Tschochner, Hauke Walter, and
Thomas Lengauer (2010). Only slight impact of predicted replicative capacity for therapy
response prediction. PLoS ONE, 2010, 5 (2): e9044.
Thomas Lengauer, André Altmann, Alexander Thielen, Rolf Kaiser (2010). Chasing the
AIDS virus. Communications of the ACM, 2010, 53(3): 66-74.
Thomas Lengauer, André Altmann, Alexander Thielen (2009). Bioinformatische Unterstützung der Auswahl von HIV-Therapien. Informatik-Spektrum, 2009, 32(4): 320-331.
Juliane Perner , André Altmann, Thomas Lengauer (2009). Semi-Supervised Learning
for Improving Prediction of HIV Drug Resistance. Lecture Notes in Informatics (Proceedings of GCB 2009), P-157: 55-65.
Nadine Sichtig, Saleta Sierra, Rolf Kaiser, Martin Däumer, Stefan Reuter, Eugen Schülter,
André Altmann, Gerd Fätkenheuer, Ulf Dittmer, Herbert Pfister, Stefan Esser (2009).
Evolution of Raltegravir Resistance During Therapy. Journal of Antimicrobial Chemotherapy, 64: 25-32.
Mattia CF Prosperi, André Altmann, Michal Rosen-Zvi, Ehud Aharoni, Gabor Borgulya,
Fulop Bazso, Anders Sönnerborg, Yarenda Peres, Eugen Schülter, Daniel Struck, Giovanni
Ulivi, Francesca Incardona, Anne-Mieke Vandamme, Jürgen Vercauteren and Maurizio
Zazzi for the EuResist and Virolab study groups (2009). Investigation of Expert Rule
Bases, Logistic Regression, and Non-Linear Machine Learning Techniques for Predicting
Response to Antiretroviral Treatment. Antiviral Thearapy, 14: 433-442.
André Altmann*, Tobias Sing*, Hans Vermeiren, Bart Winters, Elke Van Craenenbroeck, Koen Van der Borght, Soo-Yon Rhee, Robert W Shafer, Eugen Schülter, Rolf
Kaiser, Yardena Peres, Anders Sönnerborg, W Jeffrey Fessel, Francesca Incardona, Maurizio Zazzi, Lee Bacheler, Herman Van Vlijmen, Thomas Lengauer (2009). Advantages of
Predicted Phenotypes and Statistical Learning Models in Inferring Virological Response
to Antiretroviral Therapy from HIV Genotype. Antiviral Thearapy, 14: 273-283.
201
202
8 Bibliography
André Altmann, Martin Däumer, Niko Beerenwinkel, Yardena Peres, Eugen Schülter,
Joachim Büch, Soo-Yon Rhee, Anders Sönnerborg, Jeffrey Fessel, Robert W Shafer, Maurizio Zazzi, Rolf Kaiser and Thomas Lengauer (2009). Predicting response to combination
antiretroviral therapy: retrospective validation of geno2pheno-THEO on a large clinical
database. Journal of Infectious Diseases, 199(7), 999-1006.
André Altmann, Michal Rosen-Zvi , Mattia Prosperi, Ehud Aharoni, Hani Neuvirth, Anders Sönnerborg, Eugen Schülter, Joachim Büch, Daniel Struck, Yardena Peres, Francesca
Incardona, Rolf Kaiser, Maurizio Zazzi and Thomas Lengauer (2008). Comparison of Classifier Fusion Methods for Predicting Response to Anti HIV-1 Therapy. PLoS ONE, 3(10):
e3470.
Jasmina Bogojeska, Adrian Alexa, André Altmann, Thomas Lengauer and Jörg Rahnenführer (2008). Rtreemix: an R package for estimating evolutionary pathways and genetic
progression scores. Bioinformatics, 24 (20): 2391.
Michal Rosen-Zvi, André Altmann, Mattia Prosperi, Ehud Aharoni, Hani Neuvirth,
Anders Sönnerborg, Eugen Schülter, Daniel Struck, Yardena Peres, Francesca Incardona,
Rolf Kaiser, Maurizio Zazzi and Thomas Lengauer (2008). Selecting anti-HIV therapies
based on a variety of genomic and clinical factors. Bioinformatics, 24 (13):i399-406.
André Altmann, Niko Beerenwinkel, Tobias Sing, Igor Savenkov, Martin Däumer, Rolf
Kaiser, Soo-Yon Rhee, W Jeffrey Fessel, Robert W Shafer and Thomas Lengauer (2007).
Improved prediction of response to antiretroviral combination therapy using the genetic
barrier to drug resistance. Antiviral Therapy, 12 (2):169-178.

Similar documents