Heredity 2012 Vol: 108(6):626-632. DOI: 10.1038/hdy.2011.133

QTL detection power of multi-parental RIL populations in Arabidopsis thaliana

A major goal of today's biology is to understand the genetic basis of quantitative traits. This can be achieved by statistical methods that evaluate the association between molecular marker variation and phenotypic variation in different types of mapping populations. The objective of this work was to evaluate the statistical power of quantitative trait loci (QTL) detection for various multi-parental mating designs, and to assess the reasons for the observed differences. Our study was based on empirical data from 20 Arabidopsis thaliana accessions, which were selected to capture the maximum genetic diversity. The examined mating designs differed strongly in their statistical power to detect QTL. We observed the highest power for the diallel cross with random mating design. Our results suggested that sibling mating within subpopulations of joint-linkage mapping populations has the potential to considerably increase the power for QTL detection. They also revealed, however, that designs in which more than two parental alleles segregate in each subpopulation increase the power even more.

Figures
Figure 1: Histograms of the allele frequencies at an average QTL for the following mating designs compared with PI: REF, REFS, DC, DCS, FHC, THDC and FHDC with 10 or 100 individuals per F2 subpopulation (FHDC10 or FHDC100).
Figure 2: Power to detect QTLs 1−β* when neglecting population structure for different α* levels in a scenario with 50 QTLs, heritability $h^2 = 0.5$ and population size N=5000. The following mating designs were examined: REF, REFS, DC, DCS, FHC, THDC, and FHDC with 10 or 100 individuals per F2 subpopulation (FHDC10 or FHDC100). The whiskers represent the s.e.m. across all replications.
References
  1. Aulchenko YS, de Koning DJ, Haley C. Genomewide rapid association using mixed model and regression: a fast and simple method for genomewide pedigree-based quantitative trait loci association analysis. Genetics 177: 577-585 , (2007) .
    • . . . The random model was fitted using the statistical software ASReml (Gilmour et al., 2009) and the R package GenABEL (Aulchenko et al., 2007). . . .
    • . . . For QTL detection, the above described two-step procedure was used, where instead of phenotypic values, the residuals of the random model (3) were considered as dependent variables (Aulchenko et al., 2007). . . .
  2. Bernardo R. Molecular markers and selection for complex traits in plants: learning from the last 20 years. Crop Sci 48: 1649-1664 , (2008) .
    • . . . With a few exceptions, however, most of these QTL have not been successfully validated in other populations (Bernardo, 2008) . . .
  3. Blanc G, Charcosset A, Mangin B, Gallais A, Moreau L. Connected populations for detecting quantitative trait loci and testing for epistasis: an application in maize. Theor Appl Genet 113: 206-224 , (2006) .
    • . . . Subsequently, different mating designs were recommended and used for the QTL detection in a plant genetics context (Blanc et al., 2006; Paulo et al., 2008; Yu et al., 2008; Buckler et al., 2009; Kover et al., 2009; Stich, 2009) . . .
  4. Brachi B, Faure N, Horton M, Flahauw E, Vazquez A, Nordborg M et al.. Linkage and association mapping of Arabidopsis thaliana flowering time in nature. PLoS Genet 6: e1000940 , (2010) .
    • . . . This problem cannot be completely prevented even by considering the population structure in the statistical analysis (Brachi et al., 2010) . . .
    • . . . This finding can be explained by the fact that association between haplomarkers, which differ only in state between subpopulations, and the phenotype cannot be as simply detected when population structure is corrected for during the QTL analysis (Yu et al., 2006; Sneller et al., 2009; Brachi et al., 2010) . . .
  5. Breseghello F, Sorrells ME. Association analysis as a strategy for improvement of quantitative traits in plants. Crop Sci 46: 1323-1330 , (2006) .
    • . . . A problem of association-mapping populations, however, is that some individuals might be more related to each other than individuals are related on average, and this leads to false-positive associations between the pheno- and genotypes (Breseghello and Sorrells, 2006; Sneller et al., 2009) . . .
  6. Buckler ES, Holland JB, Bradbury PJ, Acharya CB, Brown PJ, Browne C et al.. The genetic architecture of maize flowering time. Science 325: 714-718 , (2009) .
    • . . . Subsequently, different mating designs were recommended and used for the QTL detection in a plant genetics context (Blanc et al., 2006; Paulo et al., 2008; Yu et al., 2008; Buckler et al., 2009; Kover et al., 2009; Stich, 2009) . . .
  7. Cavanagh C, Morell M, Mackay I, Powell W. From mutations to MAGIC: resources for gene discovery, validation and delivery in crop plants. Curr Opin Plant Biol 11: 215-221 , (2008) .
    • . . . The FHDC design is a combination of the Arabidopsis multi-parental RIL design and the multi-parent, advanced generation inter-cross design (Cavanagh et al., 2008) . . .
  8. Churchill GA, Airey DC, Allayee H, Angel JM, Attie AD, Beatty J et al.. The collaborative cross, a community resource for the genetic analysis of complex traits. Nat Genet 36: 1133-1137 , (2004) .
    • . . . A method combining the strengths of linkage mapping and association mapping was proposed in the field of animal genetics (Mott et al., 2000; Churchill et al., 2004) . . .
  9. Clark RM, Schweikert G, Toomajian C, Ossowski S, Zeller G, Shinn P et al.. Common sequence polymorphisms shaping genetic diversity in Arabidopsis thaliana. Science 317: 338-342 , (2007) .
    • . . . Our study was based on empirical data from 20 A. thaliana accessions, namely Bay-0, Bor-4, Br-0, Bur-0, C24, Col-0, Cvi-0, Est-1, Fei-0, Got-7, Ler-1, Lov-5, Nfa-8, Rrs-7, Rrs-10, Sha, Tamm-2, Ts-1, Tsu-1 and Van-0 (Clark et al., 2007) . . .
  10. Doerge RW. Mapping and analysis of quantitative trait loci in experimental populations. Nat Rev Genet 3: 43-52 , (2002) .
    • . . . This can be achieved by means of statistical methods that evaluate the association between molecular marker variation and phenotypic variation in different types of mapping populations (for review, see Doerge, 2002; Sneller et al., 2009). . . .
  11. Gilmour AR, Gogel BJ, Cullis BR, Thompson R. ASReml User Guide Release 3.0. VSN International Ltd, Hemel Hempstead, www.vsni.co.uk , (2009) .
    • . . . The random model was fitted using the statistical software ASReml (Gilmour et al., 2009) and the R package GenABEL (Aulchenko et al., 2007). . . .
  12. Huang X, Paulo MJ, Boer M, Effgen S, Keizer P, Koornneef M et al.. Analysis of natural allelic variation in Arabidopsis using a multiparent recombinant inbred line population. Proc Natl Acad Sci USA 108: 1-6 , (2011) .
    • . . . The THDC design is similar to the Arabidopsis multi-parental RIL design (Paulo et al., 2008; Huang et al., 2011), where the PIs are crossed in pairs to create two-way hybrids, which were then crossed in a diallel . . .
  13. Jannink JL, Wu XL. Estimating allelic number and identity in state of QTLs in interconnected families. Genet Res 81: 133-144 , (2003) .
    • . . . In addition, statistical methods for the analysis of multi-parental populations were developed (Xu, 1998; Rebaï and Goffinet, 2000; Jannink and Wu, 2003) . . .
  14. Kover PX, Valdar W, Trakalo J, Scarcelli N, Ehrenreich IM, Purugganan MD et al.. A multiparent advanced generation inter-cross to fine-map quantitative traits in Arabidopsis thaliana. PLoS Genet 5: e1000551 , (2009) .
    • . . . Subsequently, different mating designs were recommended and used for the QTL detection in a plant genetics context (Blanc et al., 2006; Paulo et al., 2008; Yu et al., 2008; Buckler et al., 2009; Kover et al., 2009; Stich, 2009) . . .
    • . . . The multi-parent, advanced-generation inter-cross design (Kover et al., 2009) is based on a diallel cross of all PIs, followed by four generations of random mating . . .
    • . . . The DCR design is similar to the design described by Kover et al. (2009), in which a diallel cross was followed by three generations of random crosses among all progenies . . .
  15. Lande R, Thompson R. Efficiency of marker-assisted selection in the improvement of quantitative traits. Genetics 124: 743-756 , (1990) .
    • . . . The maximum genotypic effect per QTL $a_k$, with $k = 1, 2, \ldots, l$, was drawn randomly without replacement from the geometric progression $a_k = a_0 q^k$, with $a_0 = 100(1-q)/(1-q^l)$ and $q = 0.90$ for 25 QTL, $q = 0.96$ for 50 QTL, and $q = 0.99$ for 100 QTL (Lande and Thompson, 1990) . . .
  16. Lee M, Sharopova N, Beavis WD, Grant D, Katt M, Blair D et al.. Expanding the genetic map of maize with the intermated B73 × Mo17 (IBM) population. Plant Mol Biol 48: 453-461 , (2002) .
    • . . . Furthermore, sibling mating within the biparental populations has proven to increase the mapping resolution (Lee et al., 2002) . . .
  17. Lynch M, Walsh B. Genetics and Analysis of Quantitative Traits. Sinauer Associates, Inc.: Sunderland , (1998) .
    • . . . Quantitative traits, which include most fitness and agronomic traits, show a continuous distribution of phenotypic values as they are influenced by many genes, epistatic interactions and the environment (Lynch and Walsh, 1998) . . .
  18. Mackay TFC, Stone EA, Ayroles JF. The genetics of quantitative traits: challenges and prospects. Nat Rev Genet 10: 565-577 , (2009) .
    • . . . The genomes of the individuals of this population are mosaics of the genomes of the parental genotypes, owing to the recombination events that have occurred (Mackay et al., 2009) . . .
    • . . . The mapping resolution of the association-mapping populations compared with the biparental populations is high, as the former allow the utilization of historical recombination events (Mackay et al., 2009) . . .
  19. Maurer HP, Melchinger AE, Frisch M. Population genetic simulation and data analysis with Plabsoft. Euphytica 161: 133-139 , (2007) .
    • . . . All simulations were performed with the software PLABSOFT (Maurer et al., 2007), which is implemented as an extension of the statistical software R (R Development Core Team, 2009) . . .
  20. McMullen MD, Kresovich S, Villeda HS, Bradbury P, Li H, Sun Q et al.. Genetic properties of the maize nested association mapping population. Science 325: 737-740 , (2009) .
    • . . . We examined the power of the REF design, which is similar to the design used to establish the nested association-mapping population (Yu et al., 2008; McMullen et al., 2009) . . .
  21. Mott R, Talbot CJ, Turri MG, Collins AC, Flint J. A method for fine mapping quantitative trait loci in outbred animal stocks. Proc Natl Acad Sci USA 97: 12649-12654 , (2000) .
    • . . . A method combining the strengths of linkage mapping and association mapping was proposed in the field of animal genetics (Mott et al., 2000; Churchill et al., 2004) . . .
  22. Nordborg M, Hu TT, Ishino Y, Jhaveri J, Toomajian C, Zheng H et al.. The pattern of polymorphism in Arabidopsis thaliana. PLoS Biol 3: e196 , (2005) .
    • . . . These inbreds were selected on the basis of polymorphisms in 876 genome-wide distributed fragments from a sample of 96 A. thaliana genotypes to capture the maximum genetic diversity (Nordborg et al., 2005) . . .
  23. Paulo MJ, Boer M, Huang X, Koornneef M, van Eeuwijk FA. A mixed model QTL analysis for a complex cross population consisting of a half diallel of two-way hybrids in Arabidopsis thaliana: analysis of simulated data. Euphytica 161: 107-114 , (2008) .
    • . . . Subsequently, different mating designs were recommended and used for the QTL detection in a plant genetics context (Blanc et al., 2006; Paulo et al., 2008; Yu et al., 2008; Buckler et al., 2009; Kover et al., 2009; Stich, 2009) . . .
    • . . . In the first step of the Arabidopsis multi-parental recombinant inbred line (RIL) mating design (Paulo et al., 2008), hybrid crosses between pairs of the PIs were performed . . .
    • . . . The THDC design is similar to the Arabidopsis multi-parental RIL design (Paulo et al., 2008; Huang et al., 2011), where the PIs are crossed in pairs to create two-way hybrids, which were then crossed in a diallel . . .
  24. Piepho HP. An algorithm for a letter-based representation of all-pairwise comparisons. J Comput Graph Statist 13: 456-466 , (2004) .
    • . . . The pairwise differences (significance level P<0.05) were presented via letter-based comparisons (Piepho, 2004). . . .
  25. R Development Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing: Vienna, Austria , (2009) .
    • . . . All simulations were performed with the software PLABSOFT (Maurer et al., 2007), which is implemented as an extension of the statistical software R (R Development Core Team, 2009) . . .
    • . . . The QTL detection was performed using the statistical software R (R Development Core Team, 2009). . . .
  26. Rebaï A, Goffinet B. Power of tests for QTL detection using replicated progenies derived from a diallel cross. Theor Appl Genet 86: 1014-1022 , (1993) .
  27. Rebaï A, Goffinet B. More about quantitative trait locus mapping with diallel designs. Genet Res 75: 243-247 , (2000) .
    • . . . In addition, statistical methods for the analysis of multi-parental populations were developed (Xu, 1998; Rebaï and Goffinet, 2000; Jannink and Wu, 2003) . . .
  28. Rockman MV, Kruglyak L. Breeding designs for recombinant inbred advanced intercross lines. Genetics 179: 1069-1078 , (2008) .
    • . . . The increase in power by sibling mating accords with earlier results (Rockman and Kruglyak, 2008), and is due to a slower increase of homozygosity by sibling mating compared with selfing . . .
    • . . . This leads to more genetic recombination in the segregating populations and thus to a better resolution, but also to a higher power in the detection of QTLs (Vales et al., 2005; Rockman and Kruglyak, 2008) . . .
  29. Singer T, Fan Y, Chang HS, Zhu T, Hazen SP, Briggs SP. A high-resolution map of Arabidopsis recombinant inbred lines by whole-genome exon array hybridization. PLoS Genet 2: e144 , (2006) .
    • . . . Therefore, the physical map position of the middle SNP of each haplomarker was linearly projected on to the genetic map (Singer et al., 2006), resulting in an average genetic map distance of ~0.7 cM . . .
  30. Sneller CH, Mather DE, Crepieux S. Analytical approaches and population types for finding and utilizing QTL in complex plant populations. Crop Sci 49: 363-380 , (2009) .
    • . . . This can be achieved by means of statistical methods that evaluate the association between molecular marker variation and phenotypic variation in different types of mapping populations (for review, see Doerge, 2002; Sneller et al., 2009). . . .
    • . . . A problem of association-mapping populations, however, is that some individuals might be more related to each other than individuals are related on average, and this leads to false-positive associations between the pheno- and genotypes (Breseghello and Sorrells, 2006; Sneller et al., 2009) . . .
    • . . . This finding can be explained by the fact that association between haplomarkers, which differ only in state between subpopulations, and the phenotype cannot be as simply detected when population structure is corrected for during the QTL analysis (Yu et al., 2006; Sneller et al., 2009; Brachi et al., 2010) . . .
  31. Stich B. Comparison of mating designs for establishing nested association mapping populations in maize and Arabidopsis thaliana. Genetics 183: 1525-1534 , (2009) .
    • . . . Subsequently, different mating designs were recommended and used for the QTL detection in a plant genetics context (Blanc et al., 2006; Paulo et al., 2008; Yu et al., 2008; Buckler et al., 2009; Kover et al., 2009; Stich, 2009) . . .
    • . . . The power for QTL detection 1−β* was calculated on the basis of these α levels as proportion of correctly identified QTLs from the total number of QTLs l (Stich, 2009). . . .
    • . . . In our study, the power to detect QTL of the REF and DC design was considerably lower than that observed by Stich (2009) . . .
    • . . . This finding can be explained by the different benchmarks of residual variance and hence, of heritability used in these two studies when simulating phenotypic values. Stich (2009) considered the genetic variance per subpopulation, whereas we used the genetic variance of the PIs as the basis for the simulation of phenotypic values . . .
    • . . . A second reason is the number of degrees of freedom required in the stepwise regression in our study due to the higher number of assumed alleles compared with the study of Stich (2009) . . .
    • . . . Our observation accords with the findings of Stich (2009) . . .
  32. Stich B, Melchinger AE, Piepho HP, Hamrit S, Schipprack W, Maurer HP et al.. Potential causes of linkage disequilibrium in a European maize breeding program investigated with computer simulations. Theor Appl Genet 115: 529-536 , (2007) .
    • . . . Therefore, the concept of mapping in multi-parental linkage mapping populations was developed, which minimizes the effect of population structure by crossing diverse individuals while still providing a high mapping resolution (Stich et al., 2007). . . .
  33. Valdar W, Flint J, Mott R. Simulating the collaborative cross: power of quantitative trait loci detection and mapping resolution in large sets of recombinant inbred strains of mice. Genetics 172: 1783-1797 , (2006) .
    • . . . From the genotypic values of the set of PIs, the genotypic variance $\sigma_g^2$ was calculated (Valdar et al., 2006), which was the same for all mating designs . . .
  34. Vales MI, Schön CC, Capettini F, Chen XM, Corey AE, Mather DE et al.. Effect of population size on the estimation of QTL: a test using resistance to barley stripe rust. Theor Appl Genet 111: 1260-1270 , (2005) .
    • . . . This leads to more genetic recombination in the segregating populations and thus to a better resolution, but also to a higher power in the detection of QTLs (Vales et al., 2005; Rockman and Kruglyak, 2008) . . .
  35. Xu S. Mapping quantitative trait loci using multiple families of line crosses. Genetics 148: 517-524 , (1998) .
    • . . . In addition, statistical methods for the analysis of multi-parental populations were developed (Xu, 1998; Rebaï and Goffinet, 2000; Jannink and Wu, 2003) . . .
  36. Yu J, Holland JB, McMullen MD, Buckler ES. Genetic design and statistical power of nested association mapping in maize. Genetics 178: 539-551 , (2008) .
    • . . . Subsequently, different mating designs were recommended and used for the QTL detection in a plant genetics context (Blanc et al., 2006; Paulo et al., 2008; Yu et al., 2008; Buckler et al., 2009; Kover et al., 2009; Stich, 2009) . . .
    • . . . The mating design underlying the nested association-mapping strategy (Yu et al., 2008) is based on crosses between one parental inbred (PI) line with all other PIs . . .
    • . . . We examined the power of the REF design, which is similar to the design used to establish the nested association-mapping population (Yu et al., 2008; McMullen et al., 2009) . . .
  37. Yu J, Pressoir G, Briggs WH, Vroh Bi I, Yamasaki M, Doebley JF et al.. A unified mixed-model method for association mapping that accounts for multiple levels of relatedness. Nat Genet 38: 203-208 , (2006) .
    • . . . This finding can be explained by the fact that association between haplomarkers, which differ only in state between subpopulations, and the phenotype cannot be as simply detected when population structure is corrected for during the QTL analysis (Yu et al., 2006; Sneller et al., 2009; Brachi et al., 2010) . . .
  38. Zhao K, Aranzana MJ, Kim S, Lister C, Shindo C, Tang C et al.. An Arabidopsis example of association mapping in structured samples. PLoS Genet 3: e4 , (2007) .
    • . . . The relationship matrix was calculated from pedigree records or based on the proportion of shared haplomarker for each pair of individuals (Zhao et al., 2007) . . .
  39. Zhu C, Gore M, Buckler ES, Yu J. Status and prospects of association mapping in plants. Plant Genome 1: 5-20 , (2008) .
    • . . . To overcome this problem, the detection of QTLs using a set of genotypes with unknown ancestry, which is called association mapping, has become popular (for review, see Zhu et al., 2008). . . .
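
Two of the excerpts above describe small calculations: the geometric progression of QTL effects (quoted under reference 15) and the power for QTL detection as the proportion of correctly identified QTLs (quoted under reference 31). The following is a minimal Python sketch of both, not the authors' PLABSOFT/R implementation. The function names `qtl_effects` and `detection_power` are hypothetical, and treating a QTL as "correctly identified" when its identifier appears in the detected set is an assumption, since the excerpts do not state the matching rule.

```python
import random


def qtl_effects(n_qtl, q, seed=None):
    """Effects a_k = a_0 * q**k for k = 1..n_qtl, returned in random order.

    a_0 = 100 * (1 - q) / (1 - q**n_qtl), as in the quoted excerpt.
    With k starting at 1, the effects sum to 100 * q.
    """
    a0 = 100.0 * (1.0 - q) / (1.0 - q ** n_qtl)
    effects = [a0 * q ** k for k in range(1, n_qtl + 1)]
    # "Drawn randomly without replacement": each value is used exactly once,
    # so the assignment to QTL is a random permutation of the progression.
    rng = random.Random(seed)
    return rng.sample(effects, len(effects))


def detection_power(true_qtl, detected):
    """Proportion of true QTL that appear among the detected loci.

    Assumption: a QTL counts as detected only on an exact identifier match.
    """
    true_set = set(true_qtl)
    if not true_set:
        raise ValueError("at least one true QTL is required")
    return len(true_set & set(detected)) / len(true_set)


if __name__ == "__main__":
    effects = qtl_effects(n_qtl=50, q=0.96, seed=1)
    print(len(effects), round(sum(effects), 6))
    print(detection_power([1, 2, 3, 4], [2, 4, 9]))
```

The values of `q` in the excerpt (0.90, 0.96, 0.99 for 25, 50 and 100 QTL) control how quickly effect sizes decay: the ratio of the largest to the smallest effect is `q ** -(n_qtl - 1)`.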