FEMS Microbiology Ecology, 91, 2015, fiv081 (2015, Vol. 91, No. 8)
doi: 10.1093/femsec/fiv081
Advance Access Publication Date: 17 July 2015
Research Article

Assessment of the genetic and phenotypic diversity among rhizogenic Agrobacterium biovar 1 strains infecting solanaceous and cucurbit crops

Lien Bosmans1,†, Sergio Álvarez-Pérez2,†, Rob Moerkens3, Lieve Wittemans4, Bart Van Calenberge4, Stefan Van Kerckhove5, Anneleen Paeleman5, René De Mot6, Hans Rediers1 and Bart Lievens1,∗

1 Laboratory for Process Microbial Ecology and Bioinspirational Management (PME&BIM), Department of Microbial and Molecular Systems (M2S), KU Leuven, Campus De Nayer, B-2860 Sint-Katelijne-Waver, Belgium
2 Department of Animal Health, Faculty of Veterinary Medicine, Universidad Complutense de Madrid, E-28040 Madrid, Spain
3 Research Centre Hoogstraten vzw, B-2328 Meerle, Belgium
4 Research Station for Vegetable Production vzw, B-2860 Sint-Katelijne-Waver, Belgium
5 Scientia Terrae vzw, B-2860 Sint-Katelijne-Waver, Belgium
6 Centre of Microbial and Plant Genetics, M2S, KU Leuven, B-3001 Leuven, Belgium

∗ Corresponding author: Laboratory for Process Microbial Ecology and Bioinspirational Management (PME&BIM), Department of Microbial and Molecular Systems (M2S), KU Leuven, Campus De Nayer, Fortsesteenweg 30A, B-2860 Sint-Katelijne-Waver, Belgium. Tel: +32-15-305590; Fax: +32-15-305599; E-mail: [email protected]
† Both authors contributed equally to this work.

One sentence summary: This study provides a first glimpse of the genetic and phenotypic diversity of rhizogenic biovar 1 agrobacteria infecting solanaceous and cucurbit crops.

Editor: Angela Sessitsch

Received: 8 March 2015; Accepted: 12 July 2015
© FEMS 2015. All rights reserved. For permissions, please e-mail: [email protected]
Downloaded from http://femsec.oxfordjournals.org/ by guest on October 18, 2016

ABSTRACT

Rhizogenic Agrobacterium biovar 1 strains have been found to cause extensive root proliferation on hydroponically grown Cucurbitaceae and Solanaceae crops, resulting in substantial economic losses. As these agrobacteria live under similar ecological conditions, infecting a limited number of crops, it may be hypothesized that genetic and phenotypic variation among such strains is relatively low. In this study we assessed the phenotypic diversity as well as the phylogenetic and evolutionary relationships of several rhizogenic Agrobacterium biovar 1 strains from cucurbit and solanaceous crops. A collection of 41 isolates was subjected to a number of phenotypic assays and characterized by MLSA targeting four housekeeping genes (16S rRNA gene, recA, rpoB and trpE) and two loci from the root-inducing Ri-plasmid (part of rolB and virD2). Besides phenotypic variation, remarkable genotypic diversity was observed, especially for some chromosomal loci such as trpE. In contrast, genetic diversity was lower for the plasmid-borne loci, indicating that the studied chromosomal housekeeping genes and Ri-plasmid-borne loci might not share the same evolutionary history. Furthermore, phylogenetic and network analyses and several recombination tests suggested that recombination could be contributing to some extent to the evolutionary dynamics of rhizogenic Agrobacterium populations. Finally, a genomospecies-level identification analysis revealed that at least four genomospecies may occur on cucurbit and tomato crops (G1, G3, G8 and G9). Together, this study gives a first glimpse at the genetic and phenotypic diversity within this economically important plant pathogenic bacterium.
Keywords: Cucurbitaceae; genomospecies; hairy roots; phenotypic and phylogenetic diversity; Solanaceae

INTRODUCTION

'Hairy root' disease is characterized by extensive root proliferation and occurs on several dicotyledonous plants, many of which are economically important crops. It was first described as a soilborne disease of economic importance on apples in the early 20th century (Riker et al. 1939). However, since the early 1990s, hydroponically grown cucumber and tomato crops have also been affected by the disease (Weller et al. 2000), resulting in significant economic losses.

Although hairy root disease is generally associated with pathogenic Rhizobium rhizogenes strains (formerly Agrobacterium rhizogenes; Escobar and Dandekar 2003; Lindström and Young 2011; Chandra 2012), the cause of the disease on (hydroponically grown) Cucurbitaceae and Solanaceae plants is generally Agrobacterium biovar 1 strains (A. radiobacter) harbouring a root-inducing Ri-plasmid (Weller et al. 2000; Weller and Stead 2002; Weller, Stead and Young 2004, 2006). Symptoms are induced following transfer of a particular segment of the Ri-plasmid ('transferred DNA'; T-DNA) into the plant genome and its expression there, in a manner similar to that of tumour-inducing Ti-plasmids in A. tumefaciens (Hooykaas and Beijersbergen 1994). Isolates that do not harbour an Ri-plasmid are considered avirulent (Gelvin 2003, 2009). However, as these plasmids can be readily transferred to non-pathogenic agrobacteria or related rhizobia carrying no Ti- or Ri-plasmids (Kerr and Panagopoulos 1977; Weller, Stead and Young 2004), novel pathogenic populations may arise regularly.

The rhizosphere represents a highly dynamic and complex system where intricate chemical, physical and biological interactions take place that select for specific microbial populations (Berg and Smalla 2009). Therefore, as rhizogenic (i.e. root-inducing) Agrobacterium biovar 1 populations present on hydroponically grown crops live under very similar ecological conditions and seem to infect a limited number of crops (Solanaceae and Cucurbitaceae), it may be hypothesized that genetic and phenotypic variation among such strains is relatively low and that these stringent conditions are an important factor in determining their population structure (Souza et al. 1999). Surprisingly, in contrast to tumour-inducing agrobacteria (Irelan and Meredith 1996; Llop et al. 2003; Pulawska and Kalużna 2012), little is known so far about the diversity among rhizogenic Agrobacterium strains at the genetic and phenotypic level.

Agrobacterium biovar 1 currently encompasses nine genomospecies (genomovars) (G1–G9) for which no Latin binomials have been accepted yet, except for G4, which represents the A. radiobacter lineage (Lindström and Young 2009, 2011). So far, no information is available on the genomospecies occurring on Cucurbitaceae and Solanaceae crops. Moreover, studies on the population structure of rhizogenic strains from the same host and across different hosts are currently lacking. To fill these research gaps, we assessed the phenotypic diversity as well as the phylogenetic and evolutionary relationships among a set of rhizogenic Agrobacterium biovar 1 strains from both Cucurbitaceae and Solanaceae. To this end, first a collection of 41 isolates from three Cucurbitaceae/Solanaceae host species [cucumber (Cucumis sativus), melon (Cucumis melo) (both Cucurbitaceae) and tomato (Solanum lycopersicum; Solanaceae)] from different countries and years of isolation was subjected to a number of phenotypic assays. Next, the strains were characterized by multilocus sequence analysis (MLSA) targeting four core housekeeping genes and two regions located on the Ri-plasmid. Finally, we assessed the relative contribution of mutation and recombination in shaping the observed genetic diversity within the investigated collection of isolates. Altogether, this study should lead to a better understanding of the epidemiology and ecology of rhizogenic Agrobacterium biovar 1 strains, and ultimately to adequate strategies to control hairy roots.

MATERIALS AND METHODS

Bacterial strains

Forty-one rhizogenic Agrobacterium biovar 1 strains isolated from three different Cucurbitaceae and Solanaceae host species [cucumber (Cucumis sativus), melon (Cucumis melo) (both Cucurbitaceae) and tomato (Solanum lycopersicum; Solanaceae)] were used in this study (Table 1). Each studied strain represented a unique genotype, as demonstrated by different fingerprinting techniques (Fig. S1, Supporting Information). Cucurbit isolates were obtained from culture collections; the majority of tomato strains were isolated in this study. For the latter, sampling was conducted in 2014 at 12 different greenhouses in Sint-Katelijne-Waver and Hoogstraten (both in the province of Antwerp, Belgium) in which hydroponic tomatoes are grown. Half a gram of affected tomato roots was crushed in 9 ml phosphate-buffered saline (pH 7.4) and macerated for at least 30 min at room temperature. Next, samples were plated on modified 1A-medium (0.001% crystal violet; 0.104% K2HPO4; 0.054% KH2PO4; 0.304% L-arabitol; 0.025% MgSO4·7H2O; 0.016% NH4NO3; 0.029% sodium taurocholate; 0.3% bile salts; and 1.5% agar; supplemented after autoclaving with 0.002% cycloheximide and 0.008% K2TeO3) (Brisbane and Kerr 1983). After 7 days of incubation at 25 °C, colonies were picked, purified and identified as Ri-plasmid-harbouring bacteria by PCR screening with virD2 and rolB primers (Table S1, Supporting Information). Further, 23 isolates were classified as biovar 1 according to a set of biochemical tests as described by Weller et al. (2000) [including growth on Schroth's medium (positive), Brisbane and Kerr's Medium 2E (negative), and Roy and Sasser Medium 3 (negative), and oxidase positive]. In the phylogenetic analyses (see below), the type strains A. radiobacter LMG 140T and R. rhizogenes LMG 150T were included as a reference and an outgroup, respectively. Isolates were preserved at −80 °C in Luria broth (Oxoid, Basingstoke, UK) containing 32.5% glycerol.

Table 1. Rhizogenic strains of Agrobacterium biovar 1 used in this study.

Identifier(a) | Isolate(b) | Host family | Host species | Geographic origin(c) | Year of isolation | Hydroponics(d)
1 | MAFF 106580 | Cucurbitaceae | Cucumis melo | Japan | 1985 | –
2 | MAFF 106587 | Cucurbitaceae | Cucumis melo | Japan | 1985 | –
3 | MAFF 106591 | Cucurbitaceae | Cucumis melo | Japan | 1985 | –
4 | MAFF 301724 | Cucurbitaceae | Cucumis melo | Japan | – | –
5 | MAFF 210265 | Cucurbitaceae | Cucumis melo | Japan | – | –
6 | MAFF 210268 | Cucurbitaceae | Cucumis melo | Japan | – | –
7 | NCPPB 2655 | Cucurbitaceae | Cucumis sativus | UK | 1974 | N
8 | NCPPB 2656 | Cucurbitaceae | Cucumis sativus | UK | 1974 | N
9 | NCPPB 2657 | Cucurbitaceae | Cucumis sativus | UK | 1974 | N
10 | NCPPB 2659 | Cucurbitaceae | Cucumis sativus | UK | 1974 | N
11 | NCPPB 2660 | Cucurbitaceae | Cucumis sativus | UK | 1974 | N
12 | NCPPB 4043 | Cucurbitaceae | Cucumis sativus | UK | 1997 | Y
13 | NCPPB 4042 | Cucurbitaceae | Cucumis sativus | UK | 1998 | Y
14 | ST15.13/067 | Cucurbitaceae | Cucumis sativus | UK | – | –
15 | ST15.13/001 | Solanaceae | Solanum lycopersicum | Belgium, Hoogstraten, greenhouse 1 | 2014 | Y
16 | ST15.13/004 | Solanaceae | Solanum lycopersicum | Belgium, Sint-Katelijne-Waver, greenhouse 2 | 2014 | Y
17 | ST15.13/006 | Solanaceae | Solanum lycopersicum | Belgium, Sint-Katelijne-Waver, greenhouse 2 | 2014 | Y
18 | ST15.13/007 | Solanaceae | Solanum lycopersicum | Belgium, Sint-Katelijne-Waver, greenhouse 2 | 2014 | Y
19 | ST15.13/012 | Solanaceae | Solanum lycopersicum | Belgium, Hoogstraten, greenhouse 3 | 2014 | Y
20 | ST15.13/013 | Solanaceae | Solanum lycopersicum | Belgium, Hoogstraten, greenhouse 3 | 2014 | Y
21 | ST15.13/039 | Solanaceae | Solanum lycopersicum | Belgium, Sint-Katelijne-Waver, greenhouse 4 | 2014 | Y
22 | ST15.13/040 | Solanaceae | Solanum lycopersicum | Belgium, Hoogstraten, greenhouse 5 | 2014 | Y
23 | ST15.13/042 | Solanaceae | Solanum lycopersicum | Belgium, Hoogstraten, greenhouse 6 | 2014 | Y
24 | ST15.13/046 | Solanaceae | Solanum lycopersicum | Belgium, Hoogstraten, greenhouse 6 | 2014 | Y
25 | ST15.13/048 | Solanaceae | Solanum lycopersicum | Belgium, Hoogstraten, greenhouse 6 | 2014 | Y
26 | ST15.13/054 | Solanaceae | Solanum lycopersicum | Belgium, Sint-Katelijne-Waver, greenhouse 7 | 2014 | Y
27 | ST15.13/056 | Solanaceae | Solanum lycopersicum | Belgium, Sint-Katelijne-Waver, greenhouse 8 | 2014 | Y
28 | ST15.13/057 | Solanaceae | Solanum lycopersicum | Belgium, Sint-Katelijne-Waver, greenhouse 9 | 2014 | Y
29 | ST15.13/059 | Solanaceae | Solanum lycopersicum | Belgium, Sint-Katelijne-Waver, greenhouse 8 | 2014 | Y
30 | ST15.13/060 | Solanaceae | Solanum lycopersicum | Belgium, Sint-Katelijne-Waver, greenhouse 2 | 2014 | Y
31 | ST15.13/064 | Solanaceae | Solanum lycopersicum | Belgium, Hoogstraten, greenhouse 10 | 2014 | Y
32 | ST15.13/077 | Solanaceae | Solanum lycopersicum | Belgium, Sint-Katelijne-Waver, greenhouse 9 | 2014 | Y
33 | ST15.13/090 | Solanaceae | Solanum lycopersicum | Belgium, Sint-Katelijne-Waver, greenhouse 9 | 2014 | Y
34 | ST15.13/091 | Solanaceae | Solanum lycopersicum | Belgium, Sint-Katelijne-Waver, greenhouse 11 | 2014 | Y
35 | ST15.13/095 | Solanaceae | Solanum lycopersicum | Belgium, Sint-Katelijne-Waver, greenhouse 11 | 2014 | Y
36 | ST15.13/097 | Solanaceae | Solanum lycopersicum | Belgium, Sint-Katelijne-Waver, greenhouse 12 | 2014 | Y
37 | ST15.13/098 | Solanaceae | Solanum lycopersicum | Belgium, Sint-Katelijne-Waver, greenhouse 12 | 2014 | Y
38 | NCPPB 4062 | Solanaceae | Solanum lycopersicum | UK | 1998 | Y
39 | ST15.13/043 | Solanaceae | Solanum lycopersicum | Switzerland | 2014 | Y
40 | ST15.13/045 | Solanaceae | Solanum lycopersicum | Switzerland | 2014 | Y
41 | NCIB 8196 | – | – | – | – | –

(a) Melon, cucumber and tomato origin are indicated by dark green, light green and red filled circles, respectively.
(b) NCPPB, National Collection of Plant Pathogenic Bacteria, York, UK; MAFF, NIAS Genebank (National Institute of Agrobiological Sciences), Ibaraki, Japan; ST, own isolates, Belgium; NCIB, now moved to NCIMB, National Collections of Industrial, Marine and Food Bacteria, Aberdeen, Scotland.
(c) Isolates of Belgian origin were obtained from hydroponically grown tomato plants from 12 different greenhouses.
(d) Y, isolate obtained from plant in hydroponic system; N, other than hydroponic growing medium. –, unknown.

Phenotypic characterization

For phenotypic characterization, all strains investigated were first grown aerobically in liquid Luria–Bertani (LB) medium at 25 °C for 24 h. Subsequently, strains were subjected to different phenotypic assays. First, catalase activity was determined by evaluating bubble production with a 3% (v/v) hydrogen peroxide solution (Cappuccino and Sherman 2002). Next, isolates [10 μL of a culture with an optical density (OD600) of 0.6] were inoculated in several 96-well plates (Thermo Scientific Nunc MicroWell 96-Well Microplates) containing 190 μL of LB medium per well, and were each subjected to different treatments. As a control, one row of 12 wells was not inoculated. In a first treatment, isolates were exposed to buffered LB medium with different pH values, including pH 3, 5, 7 (both buffered with citric acid), 9 and 11 (buffered with sodium bicarbonate) (incubation at 25 °C). In a second treatment, isolates were exposed to various temperatures, including 4, 22, 25, 30, 37, 41.5 and 44 °C. In a third treatment, isolates were cultured at 25 °C in LB medium supplemented with 300 or 600 ppm hydrogen peroxide, a commonly used disinfectant in hydroponic horticulture. For all treatments, plates were incubated with gentle agitation and growth was monitored spectrophotometrically (OD600) after 48 h of incubation. Further, biofilm forming ability was evaluated using the crystal violet assay (Peeters, Nelis and Coenye 2008).
To this end, a round-bottomed polystyrene 96-well microtiter plate (Thermo Scientific Nunc MicroWell 96-Well Microplates) was inoculated with 200 μL per well of an overnight LB culture containing approximately 10^8 CFU/ml. Following an incubation period of 48 h at 25 °C with gentle agitation, the culture medium was discarded and 100 μL 99% methanol was added. After 15 min, the methanol was removed and the plate was air-dried. Then, 200 μL of a crystal violet solution (Merck, Galloping, USA) was added to each well. After 20 min, the crystal violet not attached to the biofilm was removed by washing the plates under running tap water. Finally, bound crystal violet was released by adding 250 μL of 33% acetic acid (Sigma-Aldrich) and the absorbance was measured at 600 nm. For all experiments in 96-well plates, a strain was scored positive when the measured absorbance (OD600) exceeded 1.1 times the background value (negative control). All phenotypic assays were performed three times.

Sequencing of selected loci and multiple sequence alignments

For all isolates, genomic DNA was extracted from 2-day-old cultures grown on trypticase soy agar (TSA; Oxoid) using the phenol–chloroform extraction method described by Lievens et al. (2003). Four core housekeeping genes [16S rRNA gene, recA (encoding the DNA repair protein RecA), rpoB (encoding the β subunit of RNA polymerase) and trpE (encoding the anthranilate synthase component I)] and two regions located on the Ri-plasmid (rolB and virD2) were partially amplified and sequenced. PCR amplification was performed using a Bio-Rad T100 thermal cycler in a reaction volume of 20 μL, consisting of 0.15 mM of each dNTP, 0.5 μM of each primer (for primer pairs, see Table S1, Supporting Information), 1 unit Titanium Taq DNA polymerase, 1× Titanium Taq PCR buffer and 5 ng genomic DNA. Before amplification, DNA samples were denatured at 94 °C for 2 min, followed by 35 cycles of 45 s at 94 °C, 45 s (all except rpoB) or 1 min (rpoB) at the annealing temperature indicated in Table S1 (Supporting Information) and 45 s at 72 °C, with a final extension at 72 °C for 10 min. Following agarose gel electrophoresis, target amplicons were cut from the gel and purified using the QIAquick gel extraction kit (Qiagen, Valencia, CA, USA). All amplicons were sequenced with the same reverse primers as those used for amplification. Subsequently, the obtained sequences were individually trimmed for quality, using a minimum Phred score of 20, and, in cases of ambiguous base calls, manually edited based on the obtained electropherograms. Sequences were aligned using the MUSCLE algorithm (Edgar 2004). Next, the resulting alignments were trimmed to ensure that all sequences had the same start and end point. The nucleotide sequences determined in this work have been deposited in GenBank under the accession numbers KP851457–KP851702 (see Table S2, Supporting Information, for further details).

Analysis of nucleotide diversity

For each studied locus, different haplotypes were assigned arbitrary numbers. Unique combinations of haplotype numbers (i.e. allelic profiles) for the four chromosomal housekeeping genes analysed were used to unambiguously define the nucleotide sequence type (ntST) of isolates. The same was done for plasmid-borne loci. In addition, amino acid sequence types (aaSTs) were determined from protein alignments. Molecular diversity indices and the guanine plus cytosine (G + C) content of the nucleotide sequences were computed using the DnaSP v.5.0 program (Librado and Rozas 2009). The same software was used to perform Tajima's D test of neutrality. The ratio between the number of non-synonymous substitutions per non-synonymous site and synonymous substitutions per synonymous site (dN/dS; Nei and Gojobori 1986) was computed using the Sequence Type Analysis and Recombinational Tests (START) v.2 program (Jolley et al. 2001).

Analysis of genetic population structure

Genetic population structure was assessed with a Bayesian clustering approach using a Markov Chain Monte Carlo (MCMC) assignment method implemented in STRUCTURE v.2.3.4 (Pritchard, Stephens and Donnelly 2000). STRUCTURE assigns individuals to inferred ancestral populations or lineages (K) representing the best fit for the observed genetic variation. The admixture model was used together with the LOCPRIOR option. This model allows for the possibility of mixed ancestry, while LOCPRIOR allows the model to use sampling location information to assist the clustering. We varied K from 1 to 10, and for each value of K an MCMC was run 20 times using 100 000 iterations with a burn-in period of 50 000. The optimal number of genetically homogeneous clusters of isolates was identified using ΔK, following the approach of Evanno, Regnaut and Goudet (2005). For K values selected in this way, the run with the highest likelihood value was used to assign posterior membership coefficients to isolates, and those isolates with a probability of membership to any inferred cluster ≥80% were considered to be representative of that lineage.

Phylogenetic reconstruction

Phylogenetic trees for each individual chromosomal and Ri-plasmid-borne gene were constructed using Bayesian inference (BI), maximum likelihood (ML) and neighbour-joining (NJ) methods. In addition, phylogenetic trees were also obtained by the same methods for two different concatenations of housekeeping genes (a four-loci concatenation of all chromosomal sequences: 16S rRNA + recA + rpoB + trpE; and the three-loci concatenation of protein-encoding genes: recA + rpoB + trpE), and for a concatenation of the two plasmid-borne genes (rol + vir).

BI analyses were performed with MrBayes v.3.1.2 (Ronquist and Huelsenbeck 2003). The simplest models of sequence evolution among those available in MrBayes that best fitted the sequence data according to the Akaike Information Criterion were determined using the jModeltest v.0.1.1 package (Posada 2008). Four Metropolis-coupled Markov chains (five for the 16S rRNA data set) were run twice, until the average standard deviation of split frequencies dropped below 0.01. The chains were sampled every 100 generations and chain temperature was set to 0.2. Fifty per cent majority-rule consensus trees were calculated using the sumt command, discarding the first 25% of the trees, to yield the final Bayesian estimates of phylogeny. Posterior probabilities (PP) from the 50% majority-rule consensus trees were used as estimates of robustness.

ML analyses were carried out using the online version of PhyML v.3.0 (Guindon et al. 2010). Model parameters were estimated in all cases from the data set and support for the inferred topologies was tested using 1000 bootstrap replications. NJ trees were obtained using MEGA v.5 (Tamura et al. 2011). Evolutionary distances were computed by the Tamura–Nei method (Tamura and Nei 1993), and the rate of variation among sites was modelled by a gamma distribution, with shape parameters set at the values estimated in ML analyses (see above). A total of 1000 bootstrap replications were used to infer consensus trees. The confidence of alternative tree topologies based on single gene and concatenated data sets was evaluated by the Shimodaira–Hasegawa (SH) test (Shimodaira and Hasegawa 1999), as implemented in TREE-PUZZLE v.5.2 (Schmidt et al. 2002).
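The haplotype numbering and sequence-type assignment described under 'Analysis of nucleotide diversity' can be illustrated with a minimal sketch. It is not the START2 or DnaSP implementation used in the study; the function names and the toy sequences below are hypothetical and serve only to show the bookkeeping: each distinct aligned sequence at a locus receives an arbitrary number, and each unique combination of numbers (allelic profile) defines one sequence type.

```python
def number_haplotypes(seqs_by_isolate):
    """Give each distinct sequence at one locus an arbitrary number (1, 2, ...)."""
    numbering = {}
    return {
        isolate: numbering.setdefault(seq, len(numbering) + 1)
        for isolate, seq in seqs_by_isolate.items()
    }


def sequence_types(alignments, loci):
    """Derive allelic profiles and sequence types.

    alignments: locus name -> {isolate: aligned, trimmed sequence}
    loci: ordered list of locus names that make up the profile
    """
    haplotypes = {locus: number_haplotypes(alignments[locus]) for locus in loci}
    isolates = list(alignments[loci[0]])
    # Allelic profile = tuple of haplotype numbers, one per locus.
    profiles = {
        iso: tuple(haplotypes[locus][iso] for locus in loci) for iso in isolates
    }
    # Each unique profile gets its own sequence-type number.
    st_numbering = {}
    types = {
        iso: st_numbering.setdefault(profile, len(st_numbering) + 1)
        for iso, profile in profiles.items()
    }
    return profiles, types


# Hypothetical toy data: three isolates, two loci (not sequences from the study).
toy = {
    "recA": {"A": "ACGT", "B": "ACGT", "C": "ACGA"},
    "trpE": {"A": "GGCC", "B": "GGCA", "C": "GGCC"},
}
profiles, types = sequence_types(toy, ["recA", "trpE"])
print(profiles)
print(types)
```

Applying the same function to translated protein alignments would give amino acid sequence types in the same way, since only the equality of sequences matters.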
Phylogenetic clustering with defined genomospecies In order to determine the lineage affiliation of the studied isolates, recA and rpoB sequences for reference strains of Agrobacterium biovar 1 belonging to different genomospecies (G1–G9) were retrieved from GenBank (Table S3, Supporting Information) (only a limited number of rpoB sequences for defined genomospecies were available in GenBank; 16S rRNA gene and trpE sequences were not available), and aligned with the corresponding sequences determined for our isolates. ML, BI and NJ phylogenetic trees were then inferred from these alignments using the procedures described above. Interpretation of the resulting clades and their relationships was in accordance with Costechareyre et al. (2010). binary trait is random with respect to the phylogeny, whereas a D value of 0 represents a distribution expected under Brownian motion (Fritz and Purvis 2010). Similarly, a D value > 1 means that the studied trait is more overdispersed than expected at random, while a negative D suggests a highly phylogenetically clustered trait (Fritz and Purvis 2010). In all cases, estimated D values were compared with simulated distributions (1000 permutations) of D under (i) randomly reshuffled trait values across the tips of the tree, and (ii) trait evolution under Brownian motion. Only phenotypic traits with each count of state (i.e. 0 or 1) greater than 5 were considered and the concatenated ML trees were used in these calculations, but with all branch lengths set to 1 so as to avoid the effects of zero branch lengths in the original trees. RESULTS Split networks of single locus and the different alignments of concatenated sequences were constructed with SplitsTree4 (Huson and Bryant 2006) by the Neighbor-Net algorithm, using uncorrected P distances. 
For ideal data, this distance-based method results in tree-like representations, whereas networklike structures can represent conflicting phylogenetic signals in the data set under study (Bryant and Moulton 2004). The existence of conflicting phylogenetic signals was further assessed across taxa by calculating delta (δ) scores (Holland et al. 2002) in SplitsTree4. The value of this score ranges from 0 and 1, where 0 reflects an exact fit to a bifurcating tree and 1 complete departure of the data from a tree-like structure (Holland et al. 2002). The possible presence of recombination in the studied loci was determined by the pairwise homoplasy index (PHI) test (Bruen, Philippe and Bryant 2006) as implemented in SplitsTree4, and linkage equilibrium (i.e. free recombination among sequences) among the four housekeeping genes and between the two plasmid-borne loci was evaluated from haplotype data using the classical (Smith et al. 1993) and standardized (Haubold and Hudson 2000) indexes of association (IA and I AS , respectively, with 1000 resamplings in both cases), which were calculated in START2. Potential recombination events in the studied genes were subsequently identified using the Recombination Detection Program (RDP) v.3.44 (Martin et al. 2010) following the procedure detailed in Álvarez-Pérez, de Vega and Herrera (2013). Statistical significance was set at the P < 0.01 level, and only recombination events that were identified by at least five different analysis methods among those available in RDP were accepted as evidence of recombination and kept for subsequent analyses. Finally, the LDhat module (McVean, Awadalla and Fearnhead 2002) available within RDP was used to estimate recombination and mutation rates (ρ and θ, respectively), again as detailed in Álvarez-Pérez, de Vega and Herrera (2013). Phenotypic characterization The phylogenetic signal (i.e. 
a quantitative measure of how much variation in a trait is related to phylogenetic relatedness) of the different phenotypic traits was calculated by the ‘phylo.d’ function of the caper package (Orme et al. 2012) in R. Briefly, the studied phenotypic traits were coded as binary (1/0) data, and the D metric (Fritz and Purvis 2010) was calculated for each of them. D compares the number of observed changes in the state of a binary categorical trait with the number expected under a Brownian motion model of evolution that produces the same number of tip species with each character state as the observed pattern. An estimated D of 1 means that the distribution of the Seven isolates (one from cucumber and six from tomato plants) were catalase positive, while all others did not display catalase activity. All isolates were found to grow after 48 h of incubation between 22 and 37◦ C [except ST15.13/001 and ST15.13/042 which did not grow at 22◦ C (replicated three times), but were positive at 25◦ C]. Remarkably, some isolates were able to grow at 4◦ C (11 isolates) and/or at 44◦ C (12 isolates) (Fig. 1). Additionally, all isolates except one (NCPPB2659, unable to grow at pH 9) were able to grow at pH 5, 7 and 9, whereas none of the tested isolates grew at pH 11. A total of 11 isolates were found to grow at pH 3, among which 10 that showed already growth after 24 h of incubation (data not shown). Twenty-seven isolates scored positive for biofilm formation. A total of 34 and 26 isolates were able to resist and grow in the presence of 300 ppm and 600 ppm hydrogen peroxide, respectively (Fig. 1). Sequence variation in core housekeeping and plasmid-borne genes The length of the nucleotide sequences analysed ranged from 228 bp for rol to 682 bp for the 16S rRNA gene (Table S4, Supporting Information). 
The mean number of nucleotide haplotypes per chromosomal housekeeping gene was 28.5 and ranged from 24 (for rpoB) to 33 (trpE) (Tables S4 and S5, Supporting Information), while the number of protein haplotypes ranged from 7 (for both recA and rpoB) to 28 (trpE). Combined data of the core housekeeping genes yielded 41 different ntSTs and 30 aaSTs out of 41 isolates (Table S5, Supporting Information). In addition, 10 and 5 nucleotide haplotypes (corresponding to 10 and 5 protein haplotypes) were found for the rol and vir genes, respectively (Tables S4 and S5, Supporting Information). Among all studied loci, trpE exhibited the highest proportion of polymorphic sites (40.9%) and the 16S rRNA gene the lowest (13.6%; Table S4, Supporting Information). Nucleotide diversity (π) varied widely across genes for the entire collection of isolates (between 0.021 and 0.1, for the 16S rRNA gene and trpE, respectively; Table S4, Supporting Information). The G + C content of the studied gene fragments ranged from 53.5% (for vir) to 63.8% (for trpE). dN/dS ratios were below 1 for all studied genes, with the highest values obtained for the plasmid-borne ones (Table S4, Supporting Information). Tajima’s D values were negative for all studied loci, but only deviated significantly from zero for rol (P < 0.05), suggesting that this plasmid-borne gene was the single one subject to selection. Cucurbitaceae- and Solanaceae-associated groups of isolates displayed some relevant differences in the aforementioned Downloaded from http://femsec.oxfordjournals.org/ by guest on October 18, 2016 Network and recombination analyses Testing for phylogenetic signal in phenotypic traits 5 6 FEMS Microbiology Ecology, 2015, Vol. 91, No. 8 descriptive statistics (Table S4, Supporting Information). 
For example, Solanaceae isolates displayed higher percentages of polymorphic nucleotide sites (between 1.9 and 21.2 times) and higher values of π (between 1.1 and 27.7 times) than Cucurbitaceae isolates for all studied loci, with the largest differences corresponding in both cases to rol and the smallest to trpE. Nevertheless, trpE was the most variable locus in both groups of isolates (although closely followed by rol in the Solanaceae group), and the G + C content of the four chromosomal and two plasmid-borne loci was very similar regardless of the plant host. Tajima's D values deviated significantly from zero only for the rol gene in the Cucurbitaceae-associated group (P < 0.05), but not for Solanaceae isolates (Table S4, Supporting Information).

Analysis of genetic population structure

STRUCTURE analyses performed for the concatenations of chromosomal protein-encoding and plasmid-borne genes revealed that the most probable number of genetic clusters was two, while the four-loci data set yielded a K value of 3 (Table S6, Supporting Information). Barplots of the individual assignment of isolates into genetic clusters are shown in Fig. 2. For plasmid-borne genes, most isolates (87.5%; Table S6, Supporting Information) were classified into the same cluster regardless of their plant host. In contrast, for chromosomal housekeeping genes one of the inferred clusters grouped together most Cucurbitaceae-associated isolates, whereas melon isolate MAFF 210265 was included in a different cluster and tomato isolates were distributed across all clusters. Furthermore, a certain amount of admixture was apparent for some isolates and, notably, tomato isolate ST15.13/064 showed mixed ancestry in the three data sets analysed (Fig. 2). STRUCTURE analyses performed without selecting the LOCPRIOR option yielded similar results (data not shown).

Phylogenetic analysis

The ML consensus tree based on the concatenation of the four chromosomal housekeeping genes analysed (2368 bp in total) grouped most Agrobacterium biovar 1 isolates together with reference strain LMG 140T into a major clade which was well supported by all phylogenetic methods (100% ML and NJ bootstrap support and BI PP = 1; Fig. 3). Other smaller subgroups could be recognized within this largest clade, some of which included both Cucurbitaceae and Solanaceae isolates. Nevertheless, the support for these minor clades depended on the phylogenetic method used. Tomato isolate ST15.13/046 was independent from, but clustered with, the major clade in all trees, although this grouping was only well supported by the BI and NJ methods. In addition, tomato isolate ST15.13/042 was excluded from the main clade by all analysis methods, and also clustered apart from ST15.13/046. Notably, the consensus tree based on the concatenation of sequences of the three protein-encoding chromosomal genes (i.e. recA + rpoB + trpE, 1686 bp in total; Fig. S2, Supporting Information) and the trpE single-gene tree (Fig. S3, Supporting Information) showed a distribution and support of clades similar to those of the four-loci tree. In contrast, the other trees constructed from each independent housekeeping gene (16S rRNA, recA and rpoB) did not support the clustering pattern described above and, in particular, the position of isolates ST15.13/042 and ST15.13/046 was variable (Fig. S3, Supporting Information). The position of a third tomato isolate (ST15.13/064), which was included within the major clade in the four- and three-loci trees, was also variable in the single-gene trees: it clustered within the main clade in the trpE tree but apart from all or most other biovar 1 isolates, albeit with low support, in the 16S rRNA, recA and rpoB phylogenies (Fig. S3, Supporting Information).

The topologies of the ML trees based on rol and vir sequences (Fig. S4, Supporting Information) and their concatenation (486 bp in total, Fig. 4) were notably different from those described above for the chromosomal genes. Two well-supported clades were observed in the rol-based tree, one of them including the sequences of tomato isolates ST15.13/057, ST15.13/077 and ST15.13/090, and a second which clustered all other studied biovar 1 isolates and contained some minor subdivisions (Fig. S4, Supporting Information). The topology of the tree built from vir sequences was somewhat similar, but some isolates included within the major clade in the rol tree clustered in this case, albeit with low support, with strains ST15.13/057, ST15.13/077 and ST15.13/090 (Fig. S4, Supporting Information). Finally, the ML tree obtained from the concatenation of rol and vir nucleotide sequences (Fig. 4) clustered most biovar 1 isolates in a clade leaving out the aforementioned tomato strains (ST15.13/057, ST15.13/077 and ST15.13/090), but this clustering was only supported by the BI method. In addition, two subclades were observed in this concatenated tree: a minor one not supported by any phylogenetic method that included five isolates (four of melon origin: MAFF 106580, MAFF 210265, MAFF 210268 and MAFF 301724; and one of unknown origin: NCIB 8196), and another with 98% ML bootstrap support and 100% BI PP which clustered the remaining 33 biovar 1 isolates of diverse origin (Fig. 4).

The results of the SH test revealed that most single-locus ML trees were not supported by the data sets of other chromosomal loci or their concatenation (Table S7, Supporting Information). Furthermore, although the four- and three-loci ML, BI and NJ trees were congruent with the rpoB, trpE and multilocus data sets, all those trees were incongruent with the 16S rRNA alignment (Table S7, Supporting Information). Similarly, the recA data set did not support the ML and BI four-loci trees, but was congruent with the phylogenies obtained by those methods (but not by NJ) for the concatenation of the three chromosomal protein-encoding genes. On the other hand, the rol and vir data sets supported the trees obtained for the concatenation of individual Ri-plasmid-borne genes by all phylogenetic methods used, but were incongruent with any single-locus tree other than their own and with the trees obtained for the four- and three-loci concatenations of housekeeping gene sequences (Table S8, Supporting Information).

Phylogenetic comparison with well-defined genomospecies

Phylogenetic analysis of the recA and rpoB sequences of the isolates characterized in this study and reference Agrobacterium biovar 1 strains belonging to different genomospecies allowed classification of our isolates in different lineages (Figs 5, S5 and S6, Supporting Information). Based on the recA data set, 35 isolates clustered with high confidence (≥90 ML and/or NJ bootstrap support, and/or ≥BI PP) with reference sequences. However, based on rpoB sequences, only 24 isolates grouped with reference sequences. This discrepancy is probably due to the limited number of reference rpoB sequences that were available in GenBank. Twenty isolates clustered with high confidence with reference sequences for both the recA and rpoB data sets, enabling classification at the genomospecies level: 1 isolate (of tomato origin) with genomospecies 1 (G1); 2 (both from tomato crops) with genomospecies 3 (G3); 1 (of unknown origin) with genomospecies 8 (G8); and 16 (1 from melon, 2 from cucumber, and 13 from tomato plants) with genomospecies 9 (G9) (Table 2). Interestingly, recA and rpoB clustering of the reference sequences suggested that genomospecies 7 (G7) represents a polyphyletic group, possibly containing a substantial part of our isolates, as suggested by the recA tree (Fig. 5). Notably, tomato isolates ST15.13/042 (represented by code '23') and ST15.13/064 (code '31') were again not included in the well-supported major clade comprising all other biovar 1 isolates in these recA and rpoB trees (Figs S5 and S6, Supporting Information). Furthermore, the position of isolate ST15.13/046 (code '24'), also of tomato origin, was uncertain, as it clustered with most other biovar 1 sequences in the rpoB tree but was excluded from the major clade in the recA tree.

Network and recombination analyses

Some degree of reticulation (i.e. conflicting phylogenetic signal) was observed in the Neighbor-Net networks obtained for all studied data sets and, in particular, for the concatenations of individual housekeeping and plasmid-borne genes (Figs S7 and S8, Supporting Information). Moreover, the δ scores obtained for most data sets suggested a moderate departure from a tree-like structure (Table S9, Supporting Information). However, the PHI test only detected significant evidence of recombination in trpE, rol and the two multilocus combinations of chromosomal genes (Table S9, Supporting Information). RDP analyses found significant intragenic recombination for the 16S rRNA gene and rol (one recombination event each; Table S10, Supporting Information), but not for recA, rpoB, trpE or vir. Notably, most recombinants (the single one detected for the 16S rRNA gene, and three out of five for rol) and the putative parental haplotypes that could be determined for them had a tomato origin (Table S10, Supporting Information).

For the whole set of studied biovar 1 isolates, all housekeeping loci analysed except rpoB, as well as the four- and three-loci combinations, displayed per-site recombination/mutation rate ratios (ρ/θw) greater than 1 (Table 3). In contrast, that ratio was <1 for rpoB, and for vir, rol and their concatenation, suggesting that mutation occurred more often than recombination in these cases. Moreover, when isolates from Cucurbitaceae and Solanaceae host plants were considered separately, a ρ/θw > 1 was only observed for the four-loci concatenation of chromosomal genes in the Cucurbitaceae-associated group, but not for any other combination of housekeeping or plasmid-borne sequences (Table 3).

Figure 1. Checkerboard diagram illustrating the phenotypic diversity between Agrobacterium biovar 1 isolates. The clustering was made in R and based on phenotypic information. The columns of the checkerboard represent the tested parameters (catalase activity, pH and temperature values allowing growth, biofilm forming ability and resistance to hydrogen peroxide), and the rows represent the different isolates tested. Black represents a positive phenotype, white a negative phenotype. Strains isolated from melon, cucumber and tomato crops are indicated by dark green, light green and red filled circles, respectively.

Figure 2. Ancestry attribution of Agrobacterium biovar 1 isolates from Cucurbitaceae and Solanaceae plants, as inferred by STRUCTURE analysis using the admixture model together with the LOCPRIOR option (see details in the main text). The colouring of each vertical line is proportional to the ancestry of each isolate from each of the two (for the recA + rpoB + trpE and rol + vir data sets) or three (for 16S rRNA + recA + rpoB + trpE) inferred clusters. Numeric codes for the strains characterized in this study are as in Table 1.

Figure 3. ML consensus tree obtained from concatenated sequences of four chromosomal housekeeping genes (16S rRNA + recA + rpoB + trpE) for the Agrobacterium biovar 1 isolates characterized in this study. Node support values (bootstrap percentages, based on 1000 simulations) ≥90% are shown next to the branches. BI PP ≥90% and NJ bootstrap node support values ≥90% are denoted by asterisks and daggers, respectively. ML and BI analyses were performed under the GTR + I + G model of sequence evolution. The small phylogram is included to illustrate branch length heterogeneity (scale bar = 0.05 nucleotide substitutions per site). Strains isolated from melon, cucumber and tomato crops are indicated by dark green, light green and red filled circles, respectively. Rhizobium rhizogenes LMG 150T was used as an outgroup to root the tree, and A. radiobacter LMG 140T was included as a biovar 1 group reference strain.

Figure 4. ML consensus tree based on concatenated sequences of two Ri-plasmid-borne genes (rol + vir) for the Agrobacterium biovar 1 isolates characterized in this study. Node support values (bootstrap percentages, based on 1000 simulations) ≥90% are shown next to the branches. BI PP ≥90% and NJ bootstrap node support values ≥90% are denoted by asterisks and daggers, respectively. A partitioned model of sequence evolution (GTR + I for rol and SYM + G for vir) was used for BI analysis, while the GTR + I + G model was assumed to infer the ML phylogeny. The small phylogram is included to illustrate branch length heterogeneity (scale bar = 0.05 nucleotide substitutions per site). Strains isolated from melon, cucumber and tomato crops are indicated by dark green, light green and red filled circles, respectively.
Rhizobium rhizogenes LMG 150T was used as an outgroup to root the tree. Note that A. radiobacter LMG 140T (type strain of biovar 1 group) has no Ri-plasmid and, therefore, is not included in this tree.

The null hypothesis of linkage equilibrium among the chromosomal loci was rejected for the whole collection of isolates and the Cucurbitaceae- and Solanaceae-associated groups (Table 4). Similarly, linkage equilibrium between plasmid-borne loci was also rejected for the Solanaceae-associated group (P < 0.01), but not for the whole collection of isolates or those originating from Cucurbitaceae crop plants (Table 4).

Testing for phylogenetic signal in phenotypic traits

Estimated D values for the studied binary traits varied depending on the trait and the concatenated ML tree used in the analysis (Table 5). Simulation tests indicated that the distribution pattern of most traits differed significantly from a Brownian motion model of gradual divergent evolution (i.e. strong phylogenetic signal) but not from the expectation of a random distribution pattern. Only thermophily (which was defined as growth at 44 °C) was found to follow the Brownian model when the four-loci ML tree was used as input in D metric calculations, but this conclusion was not supported by subsequent analyses using the other concatenated trees (Table 5).

Figure 5. ML consensus tree displaying the relationships of the studied isolates and reference strains of Agrobacterium biovar 1 belonging to different genomospecies (G1–G9), as determined by phylogenetic analysis of recA (A) and rpoB (B) sequences. In both cases, G7 sequences are distributed in different clades within the corresponding tree, suggesting a possible polyphyletic nature for this genomospecies. Isolates that significantly cluster with G7 reference sequences in the recA tree but not in the rpoB tree are displayed within green dotted circles in Fig. 5B. Numeric codes for the strains characterized in this study are as in Table 1. Melon, cucumber and tomato isolates are indicated by dark green, light green and red filled circles, respectively. The scale bar represents the number of nucleotide substitutions per site. Rhizobium rhizogenes LMG 150T was used as an outgroup to root the trees. The complete version of these trees displaying GenBank accession numbers for reference strains and actual strain numbers is provided as supporting information.

Table 2. Tentative classification into genomospecies of the Agrobacterium biovar 1 isolates characterized in this study.

Putative genomospecies | Classification based on recA (a) | Classification based on rpoB (a)
1 | ST15.13/045 (40) | ST15.13/045 (40)
2 | – | ST15.13/056 (27)
3 | ST15.13/043 (39), ST15.13/095 (35) | ST15.13/043 (39), ST15.13/095 (35)
4 | – | –
5 | – | –
6 | – | –
7 | MAFF 106580 (1), MAFF 106587 (2), MAFF 106591 (3), MAFF 210268 (6), MAFF 301724 (4), NCPPB 2655 (7), NCPPB 2659 (10), NCPPB 2660 (11), NCPPB 4042 (13), NCPPB 4043 (12), NCPPB 4062 (38), ST15.13/059 (29), ST15.13/067 (14), ST15.13/077 (32), ST15.13/090 (33) | –
8 | NCIB 8196 (41) | NCIB 8196 (41), ST15.13/057 (28)
9 | MAFF 210265 (5), NCPPB 2656 (8), NCPPB 2657 (9), ST15.13/004 (16), ST15.13/006 (17), ST15.13/007 (18), ST15.13/012 (19), ST15.13/013 (20), ST15.13/039 (21), ST15.13/040 (22), ST15.13/048 (25), ST15.13/054 (26), ST15.13/060 (30), ST15.13/091 (34), ST15.13/097 (36), ST15.13/098 (37) | MAFF 210265 (5), NCPPB 2656 (8), NCPPB 2657 (9), ST15.13/001 (15), ST15.13/004 (16), ST15.13/006 (17), ST15.13/007 (18), ST15.13/012 (19), ST15.13/013 (20), ST15.13/039 (21), ST15.13/040 (22), ST15.13/046 (24), ST15.13/048 (25), ST15.13/054 (26), ST15.13/060 (30), ST15.13/091 (34), ST15.13/097 (36), ST15.13/098 (37)
Unknown (b) | ST15.13/001 (15), ST15.13/042 (23), ST15.13/046 (24), ST15.13/056 (27), ST15.13/057 (28), ST15.13/064 (31) | MAFF 106580 (1), MAFF 106587 (2), MAFF 106591 (3), MAFF 210268 (6), MAFF 301724 (4), NCPPB 2655 (7), NCPPB 2659 (10), NCPPB 2660 (11), NCPPB 4042 (13), NCPPB 4043 (12), NCPPB 4062 (38), ST15.13/042 (23), ST15.13/059 (29), ST15.13/064 (31), ST15.13/067 (14), ST15.13/077 (32), ST15.13/090 (33)

(a) Melon, cucumber and tomato origin are indicated by dark green, light green and red filled circles, respectively. Isolates classified into the same genomospecies with high confidence (≥90 ML and/or NJ bootstrap support, and/or ≥BI PP; see Figs S5 and S6, Supporting Information) for both data sets are underlined. Strain identifiers used in Fig. 5 are given in between brackets.
(b) Isolates which did not significantly cluster with any reference sequence.

DISCUSSION

Phenotypic and genetic variation within Agrobacterium biovar 1 isolates from Cucurbitaceae and Solanaceae

A remarkable phenotypic diversity was observed for our collection of Agrobacterium biovar 1 strains. We observed variation not only in growth characteristics under different environmental conditions (pH and temperature), but also in biofilm forming capability, resistance to hydrogen peroxide, swarming capability and catalase activity. In addition, in accordance with previously performed genetic work (Portier et al. 2006), we found high genetic heterogeneity within Agrobacterium biovar 1 strains from Cucurbitaceae and Solanaceae crops. More specifically, we found a high genetic diversity for some chromosomal loci, which resulted in unique nucleotide sequence types for all strains investigated. Additionally, linkage disequilibrium analysis of the multilocus chromosomal data set yielded I_A^S values departing significantly from zero for the whole culture collection and the Cucurbitaceae- and Solanaceae-associated groups, which is suggestive of a predominantly clonal population structure. This is in line with earlier findings delineating a clonal population of A. radiobacter of human origin using a multilocus sequence-based analysis of seven chromosomal housekeeping genes (Aujoulat et al. 2011). High genetic diversity in the context of extensive clonality has been detected for several plant-associated bacteria such as Pseudomonas spp. (Frapolli, Défago and Moënne-Loccoz 2007; Frapolli et al. 2012; Álvarez-Pérez, de Vega and Herrera 2013). Environmental heterogeneity and the consequent existence of differential selective pressures can favour the long-term persistence of different clonal lineages across different microsites, and eventually maintain the overall genotypic diversity (see e.g. Herrera, Pozo and Bazaga 2011). In contrast, only moderate genetic diversity was observed for the plasmid-borne loci and, in this case, significant linkage disequilibrium was observed only for the Solanaceae-associated group but not for the whole collection or the Cucurbitaceae-associated group, thus suggesting different models of genetic structure for the plasmid-borne loci (clonal vs. recombinant, respectively). Additionally, we can conclude from these results that the chromosomal housekeeping genes do not exhibit the same evolutionary history as the studied Ri-plasmid-borne loci, which is in line with earlier observations that plasmids may be readily spread from one bacterium to another (Garcillán-Barcia, Alvarado and de la Cruz 2011).

Although the studied strains seemed to cluster according to their host plant in some phylogenetic trees based on chromosomal or plasmid-borne genes, and the Cucurbitaceae- and Solanaceae-associated groups displayed some relevant differences in their nucleotide diversity indices, STRUCTURE analyses revealed that isolates from different sources most likely belonged to the same genetic cluster. Unfortunately, further discussion of host-based genetic differences among rhizogenic strains is hampered by the limited number of Cucurbitaceae isolates available for analysis. Besides, geographic and/or temporal structuring of the observed genetic diversity cannot be ruled out, as most tomato isolates were of Belgian origin and dated from 2014, while those from melon and cucumber were from Japan and the UK and dated from 1985 and 1974, respectively.
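As an illustration of the linkage disequilibrium statistics discussed above, the following Python sketch computes the classical index of association (I_A) and its standardized form (I_A^S) from multilocus allelic profiles. It assumes the usual definitions (observed versus expected variance of pairwise allelic mismatches, with an unbiased gene diversity estimator, and I_A^S = I_A/(l − 1) for l loci); the function names and the toy profiles are hypothetical, and the resampling used to obtain P-values is not reproduced.

```python
from itertools import combinations


def gene_diversity(alleles):
    """Unbiased gene diversity h = n/(n-1) * (1 - sum(p_i^2)) at one locus
    (the choice of the unbiased estimator is an assumption of this sketch)."""
    n = len(alleles)
    freqs = [alleles.count(a) / n for a in set(alleles)]
    return n / (n - 1) * (1.0 - sum(p * p for p in freqs))


def index_of_association(profiles):
    """Return (I_A, I_A^S) for a list of allelic profiles.

    Each profile is a tuple with one allele identifier per locus.
    """
    n_loci = len(profiles[0])

    # Pairwise distance: number of loci at which two isolates differ.
    dists = [sum(a != b for a, b in zip(p, q))
             for p, q in combinations(profiles, 2)]
    mean_d = sum(dists) / len(dists)
    v_obs = sum((d - mean_d) ** 2 for d in dists) / len(dists)

    # Variance expected under linkage equilibrium: sum of h_j * (1 - h_j).
    h = [gene_diversity([p[j] for p in profiles]) for j in range(n_loci)]
    v_exp = sum(hj * (1.0 - hj) for hj in h)

    i_a = v_obs / v_exp - 1.0
    return i_a, i_a / (n_loci - 1)


if __name__ == "__main__":
    # Two hypothetical clones that differ at all four loci.
    toy_profiles = [(1, 1, 1, 1)] * 3 + [(2, 2, 2, 2)] * 3
    i_a, i_as = index_of_association(toy_profiles)
    print(f"I_A = {i_a:.3f}, I_A^S = {i_as:.3f}")
```

The standardization makes values comparable between data sets with different numbers of loci. For the four chromosomal loci studied here, dividing the reported I_A by l − 1 = 3 reproduces the reported I_A^S (for example, 0.676/3 ≈ 0.225 for the whole collection; Table 4).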
Multiple genomospecies within Agrobacterium biovar 1 populations from Cucurbitaceae and Solanaceae

Agrobacterium biovar 1 currently encompasses nine well-delineated genomospecies, of which genomospecies G4 harbours the type strain of A. radiobacter (and the ex-type strain of A. tumefaciens) (Costechareyre et al. 2010). A genomospecies-level identification analysis based on recA and rpoB sequences revealed that several genomospecies may occur on cucurbit and tomato crops. More specifically, based on both markers we were able to identify at least four genomospecies within our collection of isolates, i.e. the genomospecies G1, G3, G8 and G9. Most of our isolates were found to belong to genomospecies G9, particularly those of tomato/Belgian origin. Additionally, rpoB and, to a lesser extent, recA clustering suggested that genomospecies G7 represents a polyphyletic group, possibly represented by several Cucurbitaceae- and Solanaceae-associated isolates investigated in this study. Furthermore, our collection contained several isolates that could not be classified within any of the previously identified genomospecies. Altogether, these analyses suggest that our collection may include novel (genomo)species or even novel biovars (e.g. for the phylogenetically distantly related strains ST15.13/042, ST15.13/046 and ST15.13/064). In this regard, further polyphasic study, e.g. by carrying out complementary phenotypic analyses, DNA–DNA hybridizations or full genome sequencing, is needed.

Table 3. Recombination and mutation indices for the studied isolates.

Data set | No. of segregating sites (% a) | ρ b (× 10^-2) | θw c (× 10^-2) | ρ/θw d (range e)
All isolates (n = 41)
16S rRNA | 78 (11.7) | 3.45 (2.85–4.13) | 2.89 | 1.19 (0.99–1.43)
recA | 74 (21.3) | 17.38 (14.57–20.59) | 5.83 | 2.98 (2.50–3.53)
rpoB | 106 (15.6) | 3.55 (2.74–4.50) | 4.26 | 0.83 (0.64–1.06)
trpE | 179 (27.1) | 10.97 (8.81–13.61) | 6.71 | 1.63 (1.31–2.03)
16S rRNA + recA + rpoB + trpE | 437 (18.6) | 5.81 (4.12–8.05) | 4.34 | 1.34 (0.95–1.85)
recA + rpoB + trpE | 359 (21.3) | 6.41 (4.97–7.89) | 5.16 | 1.24 (0.96–1.53)
rol | 73 (32.4) | 2.63 (1.96 × 10^-3–11.65) | 11.57 | 0.23 (1.69 × 10^-4–1.01)
vir | 44 (17.1) | 0.03 (1.02 × 10^-3–0.12) | 8.41 | 3.67 × 10^-3 (1.21 × 10^-4–0.01)
rol + vir | 117 (24.2) | 0.51 (1.05 × 10^-3–1.23) | 7.77 | 0.07 (1.35 × 10^-4–0.16)
Cucurbitaceae-associated isolates (n = 14)
16S rRNA + recA + rpoB + trpE | 215 (9.1) | 5.41 (4.33–7.22) | 2.89 | 1.87 (1.50–2.50)
recA + rpoB + trpE | 182 (10.8) | 1.55 (1.33–1.79) | 3.51 | 0.44 (0.38–0.51)
rol + vir | 15 (3.1) | 0.38 (2.01 × 10^-2–1.06) | 1.50 | 0.25 (0.01–0.71)
Solanaceae-associated isolates (n = 26)
16S rRNA + recA + rpoB + trpE | 433 (18.4) | 3.79 (2.72–5.08) | 4.82 | 0.79 (0.56–1.05)
recA + rpoB + trpE | 361 (21.4) | 5.91 (4.33–7.92) | 5.92 | 1.00 (0.73–1.34)
rol + vir | 114 (23.6) | 4.49 (1.44–9.15) | 9.07 | 0.49 (0.16–1.01)

a Excluding gaps and missing data.
b Rho per site (lower-upper bound, 95th percentiles).
c Theta per site (Watterson estimator); upper and lower bounds for this parameter are not available.
d Rho/theta per site.
e Calculated using the lower and upper bounds for estimates of ρ.

Table 4. Results of linkage equilibrium analyses (a).

Group of isolates (n b) | Data set c | Classical method d: I_A | P-value | Standardized method d: I_A^S | P-value
All isolates (41) | Chromosomal loci | 0.676 | <0.001* | 0.225 | <0.001*
All isolates (41) | Plasmid-borne loci | 0.029 | 0.380 | 0.029 | 0.403
Cucurbitaceae-associated (14) | Chromosomal loci | 0.645 | 0.001* | 0.215 | 0.001*
Cucurbitaceae-associated (14) | Plasmid-borne loci | -0.183 | 1 | -0.183 | 1
Solanaceae-associated (26) | Chromosomal loci | 0.885 | <0.001* | 0.295 | <0.001*
Solanaceae-associated (26) | Plasmid-borne loci | 0.319 | 0.003* | 0.319 | 0.003*

a The null hypothesis of complete linkage equilibrium (i.e. free recombination among sequences) was tested by the 'classical' and 'standardized' methods, with 1000 resamplings in both cases.
b Number of isolates included in the group.
c Chromosomal loci: 16S rRNA, recA, rpoB and trpE; plasmid-borne loci: rol and vir.
d I_A, classical index of association (Smith et al. 1993); I_A^S, standardized index of association (Haubold and Hudson 2000). Significant P-values are marked by an asterisk.

Signatures of recombination

The main sources of new genetic variants are mutation and the novel combination of existing alleles through recombination (Lewis-Rogers, Crandall and Posada 2004). Although the impact of recombination on genetic diversity has become a central interest in the field of population genetics of bacteria (Tanabe, Kasai and Watanabe 2007), the role of homologous gene exchange in many species remains unclear (Vos and Didelot 2009). The incongruences observed in this study among the single-gene phylogenies inferred from chromosomal housekeeping genes suggest the possible occurrence of recombination events among isolates. This suggestion is further supported by our Neighbor-Net network analysis, and confirmed in some cases by the PHI test, δ scores > 0 and ρ/θw values > 1. However, the number of recombination events detected in RDP analyses was low and the effect of recombination seemed to be limited to a few isolates, most of them of Belgian tomato origin, thus being insufficient to erase the clonal background evidenced by linkage disequilibrium analysis. A similar result has been obtained for other plant pathogens as well, such as Ralstonia solanacearum (Wicker et al. 2012). Similarly, some of the aforementioned signatures of recombination were also observed for the investigated Ri-plasmid-borne genes. However, one of the limitations of our study is the low number of isolates included in the recombination analyses, which can result in reduced statistical power. Moreover, multiple distinct populations of rhizogenic strains subjected to differential local adaptation and/or genetic drift might have been lumped together, which can result in the underestimation of the actual recombination rate (Vos and Didelot 2009). In this regard, the identification of ecologically differentiated populations differing in their patterns of homologous recombination (e.g. because of adaptive evolution or environmental constraints) could help to avoid confounding effects which affect the estimation of recombination rates (Vos and Didelot 2009). Another limitation is that recombination between distant evolutionary lineages is easier to detect than between closely related ones and, therefore, recombination events between and within phylotypes can remain undetectable (Wicker et al. 2012). In any case, despite these limitations, our results suggest that recombination could be contributing to some extent to the evolutionary dynamics of rhizogenic Agrobacterium populations.

Table 5. Phylogenetic signal of key phenotypic traits. For each concatenated ML tree, values are given as estimated D, p random model (a), p Brownian model (a).

Trait | Counts of states (0/1) | 16S rRNA + recA + rpoB + trpE | recA + rpoB + trpE | rol + vir
Biofilm production | 14/27 | 1.234, 0.799, 0.001* | 1.368, 0.954, <0.001* | 1.592, 0.997, <0.001*
Catalase activity | 34/7 | 1.193, 0.704, 0.007* | 1.069, 0.533, 0.012* | 1.340, 0.899, 0.002*
Acidophily (pH = 3) | 29/12 | 1.127, 0.676, <0.001* | 1.215, 0.782, <0.001* | 0.866, 0.274, 0.013*
Psychrophily (4 °C) | 30/11 | 0.793, 0.209, 0.034* | 1.338, 0.933, <0.001* | 0.896, 0.312, 0.023*
Mild thermophily (41.5 °C) | 21/20 | 0.924, 0.356, 0.008* | 0.854, 0.248, 0.021* | 1.039, 0.561, 0.003*
Thermophily (44 °C) | 29/12 | 0.399, 0.020*, 0.207 | 0.767, 0.168, 0.049* | 1.043, 0.523, 0.009*
Resistance to 300 ppm H2O2 | 7/34 | 1.142, 0.627, 0.011* | 1.063, 0.530, 0.009* | 1.358, 0.913, 0.002*
Resistance to 600 ppm H2O2 | 15/26 | 0.837, 0.225, 0.022* | 0.756, 0.157, 0.040* | 1.369, 0.935, <0.001*

a Significant P values (<0.05) are marked by an asterisk.

Impact on horticultural crop production

Our study not only provides novel insights into the phylogeny and ecology of rhizogenic Agrobacterium biovar 1 isolates but also serves as a basis for future applied research, which should eventually lead to better disease management. Because Ri-plasmids are conjugative elements that can be transferred to various members of the family Rhizobiaceae (Teyssier-Cuvelle et al. 2004), determination of the causative agent of hairy root disease is complicated, and cannot be based on colony morphology alone, or on commonly used chromosomal genetic markers such as 16S rRNA genes. In contrast, based on the plasmid-borne sequence data obtained in this study, we should be able to develop accurate tools for detection and identification of disease-causing strains harbouring the Ri plasmid, e.g. based on the presence of the rol locus. However, genetic variation assessment of our collection of Agrobacterium biovar 1 isolates revealed substantial genetic diversity between strains at the rol locus. This implies that some populations may be underestimated or may even remain undetected by rol-based detection assays developed using the sequences of only one or a few strains (e.g. Weller and Stead 2002) due to suboptimal primer-template binding (Lievens et al. 2005). Molecular detection assays such as real-time PCR assays to accurately detect and quantify pathogenic populations should therefore be designed and validated using a large number of reference sequences and a large collection of reference strains, respectively. Additionally, our study has demonstrated that different lineages of rhizogenic Agrobacterium strains are able to form biofilms, which may pose a particular risk in hydroponic recirculating systems such as those commonly used worldwide in the cultivation of cucurbits and tomato. Biofilms are known to act as a reservoir of infection which periodically releases cells that are capable of developing new biofilm colonies and rapidly disseminating infection. Moreover, biofilms protect microbial inhabitants from predators and biocides (Madsen et al. 2012). One such commonly used biocide in hydroponic systems is hydrogen peroxide. However, various strains tested in this study were able to tolerate hydrogen peroxide, even at concentrations of 600 ppm (in practice, concentrations of at most 100 ppm are generally used). Additionally, strains were found that possess catalase activity, enabling them to survive hydrogen peroxide exposure.

In summary, our study demonstrated high phenotypic and (phylo)genetic diversity among rhizogenic strains of Agrobacterium biovar 1 occurring on Cucurbitaceae and Solanaceae. Future work should be aimed at correlating the different genotypes/phenotypes of Agrobacterium biovar 1 with differences in pathogenicity (host range, severity of symptoms, etc.) and management strategies for this economically important plant pathogenic bacterium.

SUPPLEMENTARY DATA

Supplementary data are available at FEMSEC online.
ACKNOWLEDGEMENTS

We are grateful to Hille Jan van Zwol (Netherlands) and Xavier Nesme (France) for providing bacterial strains and to Jannick Van Cauwenberghe for discussing data analysis.

FUNDING

We thank IWT (Agency for Innovation by Science and Technology) for supporting this research (project IWT-LA: 120761).

Conflict of interest. None declared.

REFERENCES

Álvarez-Pérez S, de Vega C, Herrera CM. Multilocus sequence analysis of nectar pseudomonads reveals high genetic diversity and contrasting recombination patterns. PLoS One 2013;8:e75797.
Aujoulat F, Jumas-Bilak E, Masnou A, et al. Multilocus sequence-based analysis delineates a clonal population of Agrobacterium (Rhizobium) radiobacter (Agrobacterium tumefaciens) of human origin. J Bacteriol 2011;193:2608–18.
Berg G, Smalla K. Plant species and soil type cooperatively shape the structure and function of microbial communities in the rhizosphere. FEMS Microbiol Ecol 2009;68:1–13.
Brisbane PG, Kerr A. Selective media for three biovars of Agrobacterium. J Appl Bacteriol 1983;54:425–31.
Bruen T, Philippe H, Bryant D. A quick and robust statistical test to detect the presence of recombination. Genetics 2006;172:2665–81.
Bryant D, Moulton V. Neighbor-Net: an agglomerative method for the construction of phylogenetic networks. Mol Biol Evol 2004;21:255–65.
Cappuccino JG, Sherman N. Microbiology: A Laboratory Manual, 6th edn. Menlo Park, CA: Pearson Education/Benjamin Cummings, 2002.
Chandra S. Natural plant genetic engineer Agrobacterium rhizogenes: role of T-DNA in plant secondary metabolism. Biotechnol Lett 2012;34:407–15.
Costechareyre D, Rhouma A, Lavire C, et al. Rapid and efficient identification of Agrobacterium species by recA allele analysis. Microbial Ecol 2010;60:862–72.
Edgar RC. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res 2004;32:1792–7.
Escobar MA, Dandekar AM. Agrobacterium tumefaciens as an agent of disease. Eur Microbiol 2003;1:37–41.
Evanno G, Regnaut S, Goudet J. Detecting the number of clusters of individuals using the software STRUCTURE: a simulation study. Mol Ecol 2005;14:2611–20.
Frapolli M, Défago G, Moënne-Loccoz Y. Multilocus sequence analysis of biocontrol fluorescent Pseudomonas spp. producing the antifungal compound 2,4-diacetylphloroglucinol. Environ Microbiol 2007;9:1939–55.
Frapolli M, Pothier JF, Défago G, et al. Evolutionary history of synthesis pathway genes for phloroglucinol and cyanide antimicrobials in plant-associated fluorescent pseudomonads. Mol Phylogenet Evol 2012;63:877–90.
Fritz SA, Purvis A. Selectivity in mammalian extinction risk and threat types: a new measure of phylogenetic signal strength in binary traits. Conserv Biol 2010;24:1042–51.
Garcillán-Barcia MP, Alvarado A, de la Cruz F. Identification of bacterial plasmids based on mobility and plasmid population biology. FEMS Microbiol Rev 2011;35:936–56.
Gelvin SB. Agrobacterium-mediated plant transformation: the biology behind the ‘gene-jockeying’ tool. Microbiol Mol Biol R 2003;67:16–37.
Gelvin SB. Agrobacterium in the genomics age. Plant Physiol 2009;150:1665–76.
Guindon S, Dufayard JF, Lefort V, et al. New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0. Syst Biol 2010;59:307–21.
Haubold H, Hudson RR. LIAN 3.0: detecting linkage disequilibrium in multilocus data. Bioinformatics 2000;16:847–8.
Herrera CM, Pozo MI, Bazaga P. Clonality, genetic diversity and support for the diversifying selection hypothesis in natural populations of a flower-living yeast. Mol Ecol 2011;20:4395–407.
Holland BR, Huber KT, Dress A, et al. δ plots: a tool for analyzing phylogenetic distance data. Mol Biol Evol 2002;19:2051–9.
Hooykaas PJJ, Beijersbergen AG. The virulence system of Agrobacterium tumefaciens. Annu Rev Phytopathol 1994;32:157–81.
Huson DH, Bryant D. Application of phylogenetic networks in evolutionary studies. Mol Biol Evol 2006;23:254–67.
Irelan NA, Meredith CP. Genetic analysis of Agrobacterium tumefaciens and A. vitis using randomly amplified polymorphic DNA. Am J Enol Viticult 1996;47:145–51.
Jolley KA, Feil EJ, Chan MS, et al. Sequence type analysis and recombinational tests (START). Bioinformatics 2001;17:1230–1.
Kerr A, Panagopoulos CG. Biotypes of Agrobacterium radiobacter var. tumefaciens and their biological control. J Phytopathol 1977;90:172–9.
Lewis-Rogers N, Crandall KA, Posada D. Evolutionary analyses of genetic recombination. In: Parisi V, De Fonzo V, Aluffi-Pentini F (eds). Dynamical Genetics. Kerala, India: Research Signpost, 2004;49–78.
Librado P, Rozas J. DnaSP v5: a software for comprehensive analysis of DNA polymorphism data. Bioinformatics 2009;25:1451–2.
Lievens B, Brouwer M, Vanachter ACRC, et al. Design and development of a DNA array for rapid detection and identification of multiple tomato vascular wilt pathogens. FEMS Microbiol Lett 2003;223:113–22.
Lievens B, Grauwet TJMA, Cammue BPA, et al. Recent developments in diagnostics of plant pathogens: a review. Recent Res Dev Microbiol 2005;9:57–79.
Lindström K, Young JPW. International committee on systematics of prokaryotes; subcommittee on the taxonomy of Agrobacterium and Rhizobium minutes of the meetings, 31 August Gent, Belgium. Int J Syst Evol Micr 2009;59:921–2.
Lindström K, Young JPW. International committee on systematics of prokaryotes; subcommittee on the taxonomy of Agrobacterium and Rhizobium minutes of the meeting, 7 September 2010, Geneva, Switzerland. Int J Syst Evol Micr 2011;61:3089–93.
Llop P, Lastra B, Marsal H, et al. Tracking Agrobacterium strains by a RAPD system to identify single colonies from plant tumours. Eur J Plant Pathol 2003;109:381–9.
McVean G, Awadalla P, Fearnhead P. A coalescent-based method for detecting and estimating recombination from gene sequences. Genetics 2002;160:1231–41.
Madsen JS, Burmølle M, Hansen LH, et al. The interconnection between biofilm formation and horizontal gene transfer. FEMS Immunol Med Mic 2012;65:183–95.
Martin DP, Lemey P, Lott M, et al. RDP3: a flexible and fast computer program for analyzing recombination. Bioinformatics 2010;26:2462–3.
Nei M, Gojobori T. Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Mol Biol Evol 1986;3:418–26.
Orme D, Freckleton R, Thomas G, et al. Comparative Analyses of Phylogenetics and Evolution in R; R Package Version 0.5. 2012. http://CRAN.R-project.org/package=caper (21 July 2015, date last accessed).
Peeters E, Nelis HJ, Coenye T. Comparison of multiple methods for quantification of microbial biofilms grown in microtiter plates. J Microbiol Meth 2008;72:157–65.
Portier P, Fischer-Le Saux M, Mougel C, et al. Identification of genomic species in Agrobacterium biovar 1 by AFLP genomic markers. Appl Environ Microb 2006;72:7123–31.
Posada D. jModelTest: phylogenetic model averaging. Mol Biol Evol 2008;25:1253–6.
Pritchard JK, Stephens M, Donnelly P. Inference of population structure using multilocus genotype data. Genetics 2000;155:945–59.
Pulawska J, Kalużna M. Phylogenetic relationship and genetic diversity of Agrobacterium spp. isolated in Poland based on gyrB gene sequence analysis and RAPD. Eur J Plant Pathol 2012;133:379–90.
Riker AJ, Banfield WM, Wright WH, et al. Studies on infectious hairy root of nursery trees of apples. J Agr Res 1939;41:507–40.
Ronquist F, Huelsenbeck JP. MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics 2003;19:1572–4.
Schmidt HA, Strimmer K, Vingron M, et al. TREE-PUZZLE: maximum likelihood phylogenetic analysis using quartets and parallel computing. Bioinformatics 2002;18:502–4.
Shimodaira H, Hasegawa M. Multiple comparisons of log-likelihoods with applications to phylogenetic inference. Mol Biol Evol 1999;16:1114–6.
Smith JM, Smith NH, O’Rourke M, et al. How clonal are bacteria? P Natl Acad Sci USA 1993;90:4384–8.
Souza V, Rocha M, Valera A, et al. Genetic structure of natural populations of Escherichia coli in wild hosts on different continents. Appl Environ Microb 1999;65:3373–85.
Tamura K, Nei M. Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees. Mol Biol Evol 1993;10:512–26.
Tamura K, Peterson D, Peterson N, et al. MEGA5: Molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol Biol Evol 2011;28:2731–9.
Tanabe Y, Kasai F, Watanabe MM. Multilocus sequence typing (MLST) reveals high genetic diversity and clonal population structure of the toxic cyanobacterium Microcystis aeruginosa. Microbiology 2007;153:3695–703.
Teyssier-Cuvelle S, Oger P, Mougel C, et al. A highly selectable and highly transferable Ti plasmid to study conjugal host range and Ti plasmid dissemination in complex ecosystems. Microbial Ecol 2004;48:10–8.
Vos M, Didelot X. A comparison of homologous recombination rates in bacteria and archaea. ISME J 2009;3:199–208.
Weller SA, Stead DE. Detection of root mat associated Agrobacterium strains from plant material and other sample types by post-enrichment TaqMan PCR. J Appl Microbiol 2002;92:118–26.
Weller SA, Stead DE, O’Neill TM, et al. Root mat of tomato caused by rhizogenic strains of Agrobacterium biovar 1 in the UK. Plant Pathol 2000;49:799.
Weller SA, Stead DE, Young JPW. Acquisition of an Agrobacterium Ri plasmid and pathogenicity by other α-Proteobacteria in cucumber and tomato crops affected by root mat. Appl Environ Microb 2004;70:2779–85.
Weller SA, Stead DE, Young JPW. Recurrent outbreaks of root mat in cucumber and tomato are associated with a monomorphic, cucumopine, Ri-plasmid harboured by various Alphaproteobacteria. FEMS Microbiol Lett 2006;258:136–43.
Wicker E, Lefeuvre P, de Cambiaire JC, et al. Contrasting recombination patterns and demographic histories of the plant pathogen Ralstonia solanacearum inferred from MLSA. ISME J 2012;6:961–74.