Genetic Diversity Analyses in Maize Inbred Lines Using Microsatellite Markers
Directorate of Maize Research (Indian Council of Agricultural Research) Pusa Campus, New Delhi 110 012 (India)
Directorate of Maize Research Pusa Campus, New Delhi 110 012, India Website: www.maizeindia.org, Email: firstname.lastname@example.org Phone: 011-25841805, 25842372 Fax:011-25848195
Sain Dass, Avinash Singode, Sujay Rakshit, Manivannan A, Jyoti Kaul, JC Sekhar, Ravinder, Meenakshi and Chikappa GK. Genetic Diversity Analyses in Maize Inbred Lines using Microsatellite Markers. Directorate of Maize Research, Pusa Campus, New Delhi-110012, Technical Bulletin No. 2010/2, pp. 36.
Directorate of Maize Research Pusa Campus, New Delhi-110 012 (India) Ph: 91-11-25841805, 25842372, 25849725 FAX: 91-11-25848195 Email: email@example.com
Published in 2010
Alpha Printographics (India) Mobile : 9811199620, 9999039940
PREFACE
Maize is an important crop for food, feed and industrial uses. The current utilization pattern shows that a large share of the maize produced (52%) is consumed as poultry feed, and about 61% in total goes to feed across the animal husbandry sector, including cattle and piggery. Only about 25% of the maize produced is consumed as human food, and the rest is used as raw material in various industries; in the USA, about 3000 products have been developed from maize. Maize is a potentially exportable commodity as poultry feed to neighbouring countries. Specialty corn, viz. baby corn and sweet corn, is processed and exported to various countries, thereby providing nutritional and livelihood security to Indian farmers. Maize therefore has a unique place in the Indian economy. In India, the projected demand for maize production has been set at 4.7 by the Planning Commission to meet future demand. The current growth rate of maize production and productivity has exceeded the required growth rate, mainly because of single cross hybrid technology. Single cross hybrids have a higher yield potential than any other type of hybrid. Sustaining maize production requires continuous effort in developing elite maize inbred lines that are resistant to biotic stresses and tolerant to abiotic stresses. Molecular approaches to crop improvement are an important area of research that needs to be explored, and molecular techniques have the potential to complement conventional breeding. Among the various molecular approaches in plant breeding, one important area is the use of molecular markers in marker-assisted selection (MAS), QTL mapping, diversity analysis and DNA fingerprinting for the protection of elite lines and germplasm. With this brief introduction, the present manual gives an outline of research in the field of molecular breeding.
The Directorate of Maize Research is actively engaged in DNA fingerprinting of a large number of maize inbred lines. A part of this research programme is used here to illustrate the utility of molecular markers in diversity analysis: the genetic diversity analysis of 46 important QPM lines using microsatellite markers is discussed in this manual. The manual serves as a primary resource for establishing a molecular breeding laboratory and includes protocols and lists of the reagents and instruments required for microsatellite analysis.
Authors
CONTENTS
Introduction to Molecular Markers
Applications of Molecular Markers in Plant Genome Analysis and Breeding
Practical Manual on Diversity Analysis
Isolation and Purification of Genomic DNA
DNA Quantification and Quality Analysis
Polymerase Chain Reaction (PCR)
SSR Data Compilation and Data Analysis
Genetic Diversity Analysis in QPM Inbred Lines
Introduction to Molecular Markers
Morphological and cytological markers dominated the classical era of genetic variability and mapping studies. From cytological analysis, through in-vitro and in-situ DNA reassociation studies, to molecular markers, DNA cloning and sequence analysis, we can trace the evolution in the understanding of genome structure and function. Among these landmark discoveries, molecular markers have a distinct place in plant breeding. Molecular markers include biochemical constituents (e.g. secondary metabolites in plants) and macromolecules, viz. proteins, isozymes and deoxyribonucleic acid (DNA). Analysis of secondary metabolites is, however, restricted to those plants that produce a suitable range of metabolites which can be easily analyzed and which distinguish between varieties. Metabolites used as markers should ideally be neutral to environmental effects and management practices. Among the molecular markers in use, DNA markers are the most suitable, as they are neutral to environmental effects and present in virtually all living organisms. Molecular markers are modern diagnostic tools that help breeders to solve practical problems. They facilitate cultivar identification and the determination of genetic similarities among breeding stocks, and they enable the calculation of polymorphism level and heterozygosity. The main expectation from molecular markers, however, is their potential use in marker-assisted selection (MAS). Molecular markers have been used to examine DNA sequence variation within and among crop species and to create new sources of genetic variation by introducing new and favourable traits from landraces and related crop species.
Markers can aid selection for target alleles that are not easily assayed in individual plants, minimize linkage drag around the target gene, and reduce the number of generations required to recover a very high percentage of the recurrent parent genetic background. Improvements in marker detection systems, and in the techniques used to identify markers linked to useful traits, have enabled great advances in diversity analysis, phylogenetic studies and molecular breeding in recent years. Molecular markers are now also being used to determine linkage disequilibrium between loci and are therefore useful for association mapping of important target traits in plants and animals.
DNA-based markers
Genetic polymorphism is classically defined as the simultaneous occurrence, in the same population, of two or more discontinuous variants or genotypes of a trait. Although DNA sequencing is a straightforward approach for identifying variation at a locus, it is expensive and laborious. A wide variety of techniques have therefore been developed for visualizing DNA sequence polymorphism. The term DNA fingerprinting was introduced by Alec Jeffreys in 1985 to describe the bar-code-like DNA fragment patterns generated by multilocus probes after electrophoretic separation of genomic DNA fragments. The resulting pattern is a unique feature of the analysed individual and is currently considered the ultimate tool for biological individualization. More recently, the term DNA fingerprinting has also come to describe the combined use of several single-locus detection systems, which serve as versatile tools for investigating various aspects of plant genomes. These include characterization of genetic variability, genome fingerprinting, genome mapping, gene localization, analysis of genome evolution, population genetics, taxonomy, plant breeding and diagnostics.
Properties desirable for ideal DNA markers
- Highly polymorphic nature
- Codominant inheritance (determination of homozygous and heterozygous states of diploid organisms)
- Frequent occurrence in the genome
- Selectively neutral behavior (the DNA sequences of any organism are neutral to environmental conditions or management practices)
- Easy access (availability)
- Easy and fast assay
- High reproducibility
- Easy exchange of data between laboratories
DNA-based markers can be classified as hybridization-based markers and polymerase chain reaction (PCR)-based markers. In hybridization-based markers such as RFLP, a labeled probe is used to visualize the profile of restriction-digested DNA. PCR-based markers, in contrast, involve in-vitro amplification of particular DNA sequences or loci with the help of specifically or arbitrarily chosen oligonucleotide sequences (primers) and a thermostable DNA polymerase. The amplified fragments (amplicons) are separated electrophoretically, and the banding patterns are detected by methods such as staining or autoradiography, as in the case of RAPD, microsatellites, STMS and EST markers.
Types and description of DNA markers
Single or low copy probes
Restriction fragment length polymorphism (RFLP)
RFLP markers were first used in the construction of genetic maps by Botstein et al. in 1980. RFLPs are codominant and show a Mendelian inheritance pattern. Polymorphism in the restriction fragments arises from DNA rearrangements that occur through evolutionary processes, point mutations within restriction enzyme recognition sites, insertions or deletions within the fragments, and unequal crossing over. The technique relies on DNA-DNA hybridization between a cloned probe and the restricted genomic DNA. There are generally two sources of probes: cDNA clones and PstI-derived genomic clones. PstI clones are based on the suggestion that expressed genes are not methylated. CG and CXG methylation is the most prominent form of cytosine methylation in plants. PstI is sensitive to cytosine methylation and therefore cuts only non-methylated sites. If a gene is expressed, its sequence will not be methylated and will be susceptible to PstI digestion. Because such fragments probably contain expressed sequences, they are more likely to be of low copy number. The probes are mostly species-specific, single-locus probes of about 0.5–3.0 kb, obtained from a cDNA library or a genomic library. The steps involved in a typical RFLP assay are:
(i) restriction digestion of genomic DNA using restriction endonuclease(s);
(ii) resolution of the restricted genomic fragments by gel electrophoresis;
(iii) transfer of the resolved fragments from the gel to a nitrocellulose membrane by Southern blotting;
(iv) hybridization of the membrane-bound DNA fragments with a labeled probe (Southern hybridization);
(v) detection of polymorphism by X-ray autoradiography or a chemiluminescent technique.
Strengths
- Co-dominant, robust and highly reproducible.
- Useful in comparative genome mapping.
Constraints
- The assay is laborious and not amenable to complete automation.
- Requires a large quantity of DNA.
- Limited utility in MAS due to low assay efficiency.
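The way a point mutation in a recognition site changes the restriction fragment pattern can be illustrated with a small in-silico digest. The sketch below is illustrative only: the two allele sequences are invented for the example, the function name `digest` is hypothetical, and the only biological facts assumed are that PstI recognizes CTGCAG and cuts the top strand after CTGCA.

```python
PSTI_SITE = "CTGCAG"   # PstI recognition sequence
PSTI_CUT_OFFSET = 5    # top-strand cut falls after CTGCA, before the final G

def digest(seq, site=PSTI_SITE, cut_offset=PSTI_CUT_OFFSET):
    """Return fragment lengths after complete digestion of a linear sequence."""
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]

# Two invented alleles of the same 50 bp region.
# Allele B carries a G->A point mutation that destroys the second PstI site.
allele_a = "A" * 10 + "CTGCAG" + "T" * 20 + "CTGCAG" + "G" * 8
allele_b = "A" * 10 + "CTGCAG" + "T" * 20 + "CTGCAA" + "G" * 8

fragments_a = digest(allele_a)  # three fragments
fragments_b = digest(allele_b)  # two fragments, one of them longer
```

A probe that hybridizes to the region between the two sites would detect a shorter band in allele A and a longer band in allele B, and both bands in a heterozygote, which is why RFLPs are scored as codominant markers.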
Multi locus probes
Genome analysis has led to the discovery that about 30–90% of the genome of almost all species is made up of regions of repetitive DNA, which are highly polymorphic in nature. These regions may be used as markers because they contain several hundred alleles, differing from each other in length, sequence or both, and they are interspersed in tandem arrays throughout the genome. These repetitive regions act as buffers that absorb mutations in the genome. Variation in the repeat unit, together with this mutation, is the main reason that repetitive sequences make an efficient marker system for various applications in plant genome analysis. Markers based on repetitive sequences may be either hybridization based or PCR based.
Microsatellites and minisatellites
The term microsatellite was coined by Litt and Luty (1989), while the term minisatellite was introduced by Alec Jeffreys (1985). Both are multilocus probes that create complex banding patterns, and both belong to the repetitive DNA family. Fingerprints generated by these probes are also known as oligonucleotide fingerprints. Minisatellites are tandem repeats of DNA sequence with repeat motifs of 10-100 bp, whereas microsatellites are tandem repeats with repeat motifs of 2-6 bp. Microsatellites and minisatellites form the hypervariable regions of the genome, which makes them an ideal marker system. These repetitive DNAs are uniformly dispersed throughout the genome of most eukaryotes. They are also referred to as Variable Number of Tandem Repeats (VNTRs), and variation in repeat number is one basis of polymorphism at a locus. Many alleles exist in a population, the level of heterozygosity is high, and the markers follow Mendelian inheritance.
SSRs (Simple Sequence Repeats): These are also known as microsatellite markers, short tandem repeats (STRs) or simple sequence length polymorphisms (SSLPs). They are tandem repeats of di-, tri-, tetra- or penta-nucleotide units and have been characterized in many crop species, including maize, rice, sorghum, Brassica, barley and tomato. SSRs are PCR-based markers in which the forward and reverse primers are complementary to the conserved regions flanking the repeats.
The di- and tetra-nucleotide repeats are present mostly in the non-coding regions of the genome, while 57% of trinucleotide repeats are shown to reside in or around genes. A strong positive relationship has been observed between the number of alleles detected and the total number of simple repeats within the targeted microsatellite DNA. Thus, the larger the repeat number in the microsatellite DNA, the greater the number of alleles detected in a large population.
Strengths
- Abundant and uniformly distributed in the genome.
- Hypervariable (large number of alleles per locus).
- Codominant markers with known genomic locations.
- Highly reliable and reproducible assay.
- Powerful tools in genotype differentiation, seed purity evaluation, mapping, marker-assisted selection, population genetic studies and genetic diversity analysis.
Constraints
- Expensive and time-consuming to detect SSR loci and design primers (although in many crop plants, such as maize, rice and wheat, a large number of SSR primers are available in the public domain).
- Not available for all plant species; primers are usually species specific.
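The definition of an SSR as a short motif repeated in tandem between conserved flanking regions can be made concrete with a small search routine. The following is a minimal sketch, not a substitute for dedicated SSR-mining software: the function name `find_ssrs`, the minimum of four repeats and the example sequences are all assumptions made for illustration, and only perfect (uninterrupted) repeats of 2-6 bp motifs are reported.

```python
import re

def find_ssrs(seq, min_unit=2, max_unit=6, min_repeats=4):
    """Return (start, motif, repeat_count) for each perfect tandem repeat found."""
    # A lazy motif group followed by at least (min_repeats - 1) further copies of it.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAA
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits

# Invented example: a dinucleotide and a trinucleotide repeat in one sequence.
example = "GGTC" + "AG" * 8 + "TTCA" + "CTT" * 5 + "GA"
ssrs = find_ssrs(example)

# Two invented alleles sharing the same flanks but differing in repeat number.
allele_1 = "GGTC" + "AG" * 8 + "TTCA"
allele_2 = "GGTC" + "AG" * 11 + "TTCA"
size_difference = len(allele_2) - len(allele_1)  # three extra AG units
```

Because primers anneal to the conserved flanks, the two alleles in the example would give amplicons that differ by exactly the length of the extra repeat units, which is the length polymorphism scored on a gel.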
Other repetitive DNA-type markers
Transposable elements: A large number of transposable repeat elements have been studied in plants; however, only a few have been exploited as molecular markers. In evolutionary terms, they have contributed to genetic differences between species and individuals through retrotransposition events that promote unequal crossing over. Retrotransposon-mediated fingerprinting (REMAP, Retrotransposon-Microsatellite Amplified Polymorphism, and IRAP, Inter-Retrotransposon Amplified Polymorphism) has been shown to be an efficient method for detecting genetic differences between species.
Arbitrary sequence markers
Randomly-amplified polymorphic DNA markers (RAPD)
In 1991, Welsh and McClelland developed a new PCR-based genetic assay, namely randomly amplified polymorphic DNA (RAPD). This procedure detects nucleotide sequence polymorphisms in DNA by using a single primer of arbitrary nucleotide sequence (8-12 bp). The primer anneals to complementary sequences in the template DNA, in forward or reverse orientation, at a multitude of locations in the genome. Amplification occurs between forward and reverse annealing sites that are generally 150–4000 bp apart. When the resulting amplicons are resolved, a profile with multiple bands can be seen. No knowledge of the DNA sequence of the targeted gene is required, as the primers will bind somewhere in the sequence, although it is not certain exactly where. These markers lack reproducibility, and the assay is sensitive to variation in DNA concentration. They are dominant markers and hence have limitations in their use for mapping, which can be overcome to some extent by selecting markers that are linked in coupling. The RAPD assay has been used by several groups as an efficient tool for identifying markers linked to agronomically important traits that are introgressed during the development of near isogenic lines (NILs). RAPDs and their related modified markers have been widely applied in variability analysis and individual-specific genotyping, but they are less popular because of problems such as poor reproducibility, faint or fuzzy products, and difficulty in scoring bands, all of which can lead to inappropriate inferences. Variations of the RAPD technique include DNA amplification fingerprinting (DAF), Arbitrarily Primed Polymerase Chain Reaction (AP-PCR), Sequence Characterized Amplified Regions (SCAR) for amplification of a specific band, and Cleaved Amplified Polymorphic Sequences (CAPS).
Strengths
- Requires small quantities of DNA.
- Needs limited investment in time and training.
- Sets of several hundred primers are commercially available.
Constraints
- Lack of reproducibility in marker patterns across labs and across experiments, as the assay is sensitive to variation in DNA concentration, optimal primer concentration and thermal cycling conditions.
- Inability to discern differences in sequence homology among similarly sized fragments.
Amplified fragment length polymorphism (AFLP)
A novel DNA fingerprinting technique called AFLP was described by Vos et al. (1995). The AFLP technique is based on the selective PCR amplification of restriction fragments from a total digest of genomic DNA. The technique involves three steps: (i) restriction of the DNA and ligation of oligonucleotide adapters, (ii) selective amplification of sets of restriction fragments, and (iii) gel analysis of the amplified fragments. PCR amplification of restriction fragments is achieved by using the adapter and restriction site sequences as target sites for primer annealing. Selective amplification is achieved by using primers that extend into the restriction fragments, so that only those fragments in which the primer extensions match the nucleotides flanking the restriction sites are amplified. Using this method, sets of restriction fragments can be visualized by PCR without knowledge of the nucleotide sequence. The method allows the specific co-amplification of a high number of restriction fragments. The number of fragments that can be analyzed simultaneously, however, depends on the resolution of the detection system. Typically, 50-100 restriction fragments are amplified and detected on denaturing polyacrylamide gels. AFLP is a very powerful DNA fingerprinting technique for DNA of any origin or complexity and an effective tool for detecting a large number of polymorphic genetic markers that are highly reliable and reproducible. The technique has been used extensively to detect genetic polymorphisms, evaluate and characterize breeding resources, construct genetic maps and identify genes. As with RAPDs, bands of interest obtained by AFLP can be converted into SCARs. AFLP is thus an important tool for a variety of applications.
Strengths
- Stable amplification and high repeatability.
- Hypervariability, coupled with a large number of mappable loci in a single amplification (high assay efficiency), which allows a region of the genome to be saturated rather quickly.
- Provides raw material for STS derivation.
- Can generate fingerprints of any DNA regardless of its origin or complexity.
- Can act as a bridge between genetic and physical maps.
Constraints
- Time-consuming procedure.
- Requires significant technical skill and financial resources.
Expressed sequence tags (EST)
Expressed sequence tags (ESTs) are a subset of STSs derived from cDNA clones. ESTs can serve the same purpose as random STSs, with the advantage that ESTs are derived from expressed genes, that is, from spliced mRNA, which is usually free of introns as well as repetitive DNA. A large number of ESTs have already been detected in plants such as rice and maize, with a large amount of cDNA sequence data becoming available in a relatively short time. Polymorphic ESTs will be increasingly available and used more widely than at present.
Strengths
- Represent real functional genes and are therefore more useful as genetic markers than anonymous non-functional sequences.
- Advantageous for comparative genome analysis and for gaining information on genome structure.
Constraints
- High development/start-up costs.
- Available at present in a very limited number of crop plants.
Single nucleotide polymorphisms (SNPs)
Single nucleotide polymorphisms (SNPs) can be considered third-generation markers. They are point mutations in which one nucleotide is substituted for another at a particular locus. SNPs are the most common type of sequence difference between alleles, are codominant in nature, and represent an inexhaustible source of polymorphic markers for use in high-resolution genetic mapping of traits. Detection of the codominant SNPs is based on DNA amplification using primers designed from known sequence information for specific genes. SNP assays can be carried out in plants such as rice and maize, where genome sequencing is either advanced or progressing at a rapid pace. EST sequences have proved useful for the discovery of SNPs in plants such as maize. Resequencing studies with a set of 502 EST-derived loci from eight elite maize inbreds, covering 400-500 bp per locus, disclosed one SNP in every 48 base pairs (bp) in the 3' untranslated regions (UTRs) and one SNP per 130 bp in coding regions. Two hundred and fifteen insertion/deletion (indel) polymorphisms of at least one bp in size were also detected. In soybean, SNP frequency was found to be 1.64 SNPs per kb in coding regions and 4.85 SNPs per kb in non-coding regions, and thirty-three percent of 3' UTRs were found to contain an SNP. In some species, such as rice, rates of polymorphism are relatively low in comparison with the high rate of SNP polymorphism found in maize. Thus, for some species, pre-screening of amplicons may be necessary to determine whether sufficient polymorphism exists to justify further screening for SNPs. Denaturing high-performance liquid chromatography (dHPLC), single-strand conformational polymorphism (SSCP), or various chemical or enzymatic cleavage methods may be used for pre-screening. Many assays for SNP genotyping are commercially available, but none has yet emerged as the dominant leader for this application.
A high-throughput allele-specific hybridization assay for SNP scoring has been developed in a commercial setting for use in marker-assisted breeding of soybean.
Strengths
- Easier to work with SNPs than with SSRs or AFLPs.
- May lend themselves more readily to high-throughput approaches.
- Most useful when several SNP loci are closely positioned and allow haplotype definition and the development of 'haplotype tags'.
- Serve to integrate physical and genetic maps.
Constraints
- Requirement for sequence information for the genes.
- High development/start-up cost.
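The maize and soybean SNP frequencies quoted above are expressed in different units (bp per SNP versus SNPs per kb). The short sketch below simply puts them on the same scale; the helper name `snps_per_kb` is hypothetical, and the comparison is only approximate because the maize non-coding figure refers specifically to 3' UTRs.

```python
def snps_per_kb(bp_per_snp):
    """Convert 'one SNP every N bp' into SNPs per kilobase."""
    return 1000.0 / bp_per_snp

# Maize figures quoted in the text (one SNP per 48 bp and per 130 bp).
maize_utr = snps_per_kb(48)      # about 20.8 SNPs per kb in 3' UTRs
maize_coding = snps_per_kb(130)  # about 7.7 SNPs per kb in coding regions

# Soybean figures quoted in the text, already in SNPs per kb.
soybean_coding = 1.64
soybean_noncoding = 4.85

# How many times denser SNPs are in maize coding regions than in soybean.
coding_ratio = maize_coding / soybean_coding
```

On these figures, SNPs in maize coding regions are between four and five times as frequent as in soybean coding regions, which illustrates why the amount of pre-screening needed differs between species.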
Applications of Molecular Markers in Plant Genome Analysis and Breeding
Molecular markers have evolved as a potential tool for a large number of applications, ranging from localization of a gene to improvement of plant varieties by marker-assisted selection. Their hypervariability and the ease with which polymorphism can be detected make them extremely popular for phylogenetic analysis, adding new dimensions to evolutionary theories. With advances in molecular marker technology, our understanding of genetic analysis, genome analysis and genomics has gained significant impetus.
Fingerprinting of crop plants
DNA fingerprinting refers to the unambiguous identification of an individual using multilocus DNA profiling. It can be done using hybridization markers, PCR-based markers (either locus-specific amplification or random primers) and sequencing. A large body of scientific literature on DNA fingerprinting is available for various crops. Alec Jeffreys and his associates were the first to develop a method of 'DNA fingerprinting', through simultaneous detection of highly variable DNA fragments by hybridizing multilocus probes with electrophoretically separated restriction fragments. RFLP (Restriction Fragment Length Polymorphism) is the classic technique of DNA fingerprinting. Microsatellites and minisatellites are dispersed throughout the genome, including the transcription units, which makes them suitable for DNA fingerprinting. RAPD (Random Amplified Polymorphic DNA) is one of the most commonly used techniques for DNA fingerprinting. It is a multilocus, simple and rapid technique that does not require any sequence information for developing primers; hence, it can be used in plants where sequence information is not available. However, its dominant nature and low reproducibility make it less powerful than markers that are co-dominant and highly reproducible. Amplified Fragment Length Polymorphism (AFLP) is a marker intermediate between RFLP and RAPD. AFLPs provide an effective means of covering larger areas of the genome in a single assay and are as reproducible as RFLPs. DNA fingerprinting is of remarkable importance in Plant Variety Protection (PVP). Its uses include identification of cultivars and genotypes; of true-to-type plants at the juvenile stage (DUS testing) for seed purity; of mutants and chimeras; of nucellar and zygotic embryos; of somatic hybrids in fusion experiments; and of somaclonal variants.
Mapping and tagging of genes: Tools for MAS
Plant breeding is the science that aims at crop improvement using the available variability. The outcome of crop improvement is the selection of the right kind of plant with the right combination of genes/alleles. Conventional breeding takes a lot of time for the evaluation, identification and introgression of novel genes. Molecular markers have accelerated conventional plant breeding. They are a powerful tool for the identification of diverse lines and for the mapping and tagging of genes. With molecular markers it is now routine to trace valuable alleles in a segregating population. Once mapped, these markers enable complex traits to be dissected into their component genetic units more precisely, thus providing breeders with new tools to manage these complex units more efficiently in a breeding programme. There are several examples of gene mapping and its utility in marker-assisted selection in various crops using different molecular markers. The very first genome map in plants was reported in maize, followed by rice, Arabidopsis and others, using RFLP markers. Maps have since been constructed for several other crops, such as potato, barley, banana and members of Brassicaceae. Once the framework maps are generated, a large number of markers derived from various techniques are used to saturate the maps as far as possible. Microsatellite markers, especially STMS markers, have been found to be extremely useful in this regard. Because they follow clear Mendelian inheritance, they can easily be used in the construction of index maps, which provide an anchor or reference point for specific regions of the genome. Once mapped, these markers are efficiently employed in tagging individual traits that are extremely important in breeding programmes, such as yield, disease resistance, stress tolerance and seed quality.
Molecular markers have been exploited in mapping and tagging various genes in plant species, and these tagged genes are currently being used by breeders in marker-assisted selection. RFLPs and SSRs have been used extensively for locating QTL for yield and for biotic and abiotic stress resistance. RFLP markers have proved their importance in gene tagging for traits such as water use efficiency in tomato, the Lr9 and Lr24 leaf rust resistance genes in wheat, and resistance to root knot nematodes (Meloidogyne sp.) in tomato. Allele-specific associated primers have also shown their utility in genotyping allelic variants of loci that result from both size differences and point mutations. Examples include the waxy gene locus in maize, opaque-2 (o2) in maize, the Glu D1 complex locus associated with bread-making quality in wheat, the Lr1 leaf rust resistance locus in wheat, the Gro1 and H1 alleles conferring resistance to the root cyst nematode Globodera rostochiensis in potato, and allele-specific amplification of polymorphic sites for detection of powdery mildew resistance loci in cereals. Arbitrary markers such as RAPDs have also been employed in saturating genetic linkage maps and in gene tagging. They are especially important marker systems where RFLPs have failed to reveal the polymorphism that is essential for mapping. RAPD markers identified in near isogenic lines can be converted into SCARs and used as diagnostic markers. A SCAR/STS marker linked to the translocated segment on 4AL of bread wheat carrying the Lr28 gene has been tagged. Apart from the mapping and tagging of genes, RFLP and SSR markers have proved useful in detecting gene introgression in backcross breeding programmes and in synteny mapping among closely related species. STMS markers have shown similar utility for reliable preselection in marker-assisted backcross schemes.
Phylogeny and evolution
Molecular markers are powerful tools in phylogenetic and evolutionary studies. Such studies have strengthened earlier work, based on morphological and cytological evidence, on the relationships between cultivated species and their wild relatives. Comprehensive studies of genetic structure using molecular markers have revealed the evolutionary forces that led from the wild relatives to the present cultivated forms. RFLP, DNA sequencing and a number of PCR-based markers are being used extensively for reconstructing the phylogenies of various species. These techniques are expected to provide path-breaking information on the fine time scale over which closely related species have diverged and on the kinds of genetic variation associated with species formation. Furthermore, these studies hold great promise for revealing more about the pattern of genetic variation within species. In plant breeding, they are very helpful in understanding crop evolution from wild progenitors and in classifying crops into appropriate groups. This helps in the introgression of useful genes from wild progenitors into cultivated high-yielding varieties of a crop species.
Diversity Analysis
One of the important uses of molecular markers is diversity analysis. Lines with similar morphological characters may differ substantially from each other at the DNA level, and vice versa. Diversity analysis can be based on pedigree data, morphological data, agronomic performance data, biochemical data and, more recently, molecular (DNA-based) data. DNA-based markers can unambiguously distinguish two different lines. The DNA polymorphism revealed through molecular markers can be used to deduce genetic distance among germplasm, breeding lines and populations. Diversity analysis has several uses, such as (i) selection of parents for developing hybrids; (ii) selection of parents for developing mapping populations; (iii) study of the genetic inheritance of a trait; (iv) combining ability studies; (v) development of heterotic pools; (vi) understanding the environmental effect on geographically diverse lines; (vii) population genetics studies; and (viii) identification of region-specific fixed alleles in landraces. The key points for an accurate and unbiased estimation of genetic diversity are (i) sampling strategies; (ii) utilization of various data sets on the basis of an understanding of their strengths and constraints; (iii) choice of genetic distance measure(s), clustering procedures and other multivariate methods in the analysis of data; and (iv) objective determination of genetic relationships. There are various strategies for diversity analysis, and a judicious combination of statistical tools and techniques is vital for addressing complex issues of data analysis and interpretation. Genetic relationship or genetic diversity estimates can be used for clustering, which gives a graphical representation of the genetic relationships among the lines. The commonly used measures of genetic distance (GD) or genetic similarity (GS) for binary data are (i) Nei and Li's (1979) coefficient (GD_NL); (ii) Jaccard's (1908) coefficient (GD_J); (iii) Sokal and Michener's (1958) simple matching coefficient (GD_SM); and (iv) modified Rogers' distance (1972) (GD_MR). For two individuals i and j, let N11 be the number of bands (alleles) present in both, N00 the number absent in both, N10 the number present only in i, N01 the number present only in j, and N the total number of bands scored. The genetic distances are then estimated as follows:
$$GD_{NL} = 1 - \frac{2N_{11}}{2N_{11} + N_{10} + N_{01}}$$
$$GD_{J} = 1 - \frac{N_{11}}{N_{11} + N_{10} + N_{01}}$$
$$GD_{SM} = 1 - \frac{N_{11} + N_{00}}{N_{11} + N_{10} + N_{01} + N_{00}}$$
$$GD_{MR} = \left[\frac{N_{10} + N_{01}}{2N}\right]^{0.5}$$
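The four distance measures named above can be computed directly from binary band scores (1 = band present, 0 = band absent). The sketch below is illustrative and makes several assumptions: the coefficients are written in their usual published forms with sums in the denominators, N is taken as the total number of scored band positions, and the function names and the two example profiles are invented.

```python
import math

def band_counts(a, b):
    """Count shared and unshared bands between two binary profiles."""
    if len(a) != len(b):
        raise ValueError("profiles must be scored at the same band positions")
    n11 = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)  # present in both
    n10 = sum(1 for x, y in zip(a, b) if x == 1 and y == 0)  # only in first
    n01 = sum(1 for x, y in zip(a, b) if x == 0 and y == 1)  # only in second
    n00 = sum(1 for x, y in zip(a, b) if x == 0 and y == 0)  # absent in both
    return n11, n10, n01, n00

def genetic_distances(a, b):
    """Return the four distances; assumes at least one band is present."""
    n11, n10, n01, n00 = band_counts(a, b)
    n = n11 + n10 + n01 + n00  # total band positions scored (assumption)
    return {
        "GD_NL": 1 - 2 * n11 / (2 * n11 + n10 + n01),  # Nei and Li
        "GD_J": 1 - n11 / (n11 + n10 + n01),           # Jaccard
        "GD_SM": 1 - (n11 + n00) / n,                  # simple matching
        "GD_MR": math.sqrt((n10 + n01) / (2 * n)),     # modified Rogers
    }

# Invented band profiles for two inbred lines scored at eight band positions.
line_1 = [1, 1, 0, 1, 0, 0, 1, 0]
line_2 = [1, 0, 0, 1, 1, 0, 1, 0]
d = genetic_distances(line_1, line_2)
# Rounded values: GD_NL 0.25, GD_J 0.4, GD_SM 0.25, GD_MR 0.354
```

Note that the simple matching coefficient counts shared absences (N00) as evidence of similarity, whereas the Nei and Li and Jaccard coefficients ignore them, so the choice of measure matters when many bands are absent in both lines.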
Practical Manual on Diversity Analysis
Isolation and Purification of Genomic DNA
Genomic DNA is isolated following the cetyl trimethyl ammonium bromide (CTAB) method (Saghai-Maroof et al., 1984) with minor modifications.
1. Crush 1 g of young leaf sample to a fine powder in liquid nitrogen using a pestle and mortar, and transfer it to an Oakridge tube (centrifuge tube of 50 ml capacity).
2. Add 10 ml of prewarmed lysis buffer (CTAB 2% w/v, NaCl 1.4 M, Tris-HCl 100 mM pH 8.0, EDTA 20 mM and 2-mercaptoethanol 100 mM) to the Oakridge tube containing the powdered leaf sample. Mix the contents gently.
3. Incubate at 65ºC in a water bath for 1 hour with occasional swirling.
4. Emulsify the mixture with an equal volume of chloroform: isoamyl alcohol (24:1).
5. Centrifuge at 10,000 rpm for 15 minutes.
6. Transfer the upper aqueous layer to a fresh 50 ml centrifuge tube.
7. Add 10 µl RNase A and incubate at 37ºC for 1 hour.
8. Add an equal volume of phenol: chloroform: isoamyl alcohol (25:24:1) and mix gently for 10 minutes.
9. Centrifuge at 10,000 rpm for 15 minutes.
10. Collect the upper aqueous layer in a fresh tube and mix with an equal volume of chloroform: isoamyl alcohol (24:1).
11. Centrifuge at 10,000 rpm for 10 minutes, again collect the upper aqueous phase in a fresh tube and mix with 0.6 volume of isopropanol. Keep at -20ºC for one hour or overnight for precipitation.
12. Centrifuge again at 10,000 rpm for 10 minutes and discard the supernatant.
13. Dissolve the pellet in 1 ml sterile distilled water and incubate at 37ºC for 1 hour.
14. Add two volumes of chilled absolute ethanol and 0.1 volume of sodium acetate (3 M, pH 5.7).
15. Centrifuge at 10,000 rpm for 10 minutes, discard the supernatant and wash the pellet with 70% ethanol by centrifuging at 8,000 rpm for 7 minutes.
16. Air dry the pellet at room temperature or incubate at 37ºC for 10 minutes, dissolve the pellet in 1 ml TE buffer and store at 4ºC.
DNA Quantification and Quality Analysis
After isolation of DNA, it is necessary to quantify the DNA and assess its quality before using the sample for further analysis. This is important for many applications, including digestion of DNA by restriction enzymes and PCR amplification of target DNA. The most commonly used methods for quantifying nucleic acids are (i) gel electrophoresis and (ii) spectrophotometric analysis. If the amount of sample is small, the former method is usually preferred.
A. Agarose Gel Electrophoresis for DNA Quantification and Quality Analysis
Ethidium bromide is a fluorescent dye used in electrophoresis that intercalates between the bases of DNA. The fluorescence of the dye-DNA complex is much higher than that of unbound dye, because DNA absorbs UV radiation at 254 nm and transmits it to the bound dye, while the dye itself absorbs UV radiation at 302 and 366 nm. The absorbed energy is re-emitted at 590 nm, which falls in the reddish-orange region of the visible spectrum. The quantity of DNA in a sample can be estimated by comparing the band intensity (or fluorescent yield) with that of uncut lambda (λ) DNA of known concentration. A sharp and clear band indicates good quality. RNA and sheared DNA can be identified visually in the gel (Fig. 1).
Fig. 1. DNA quantification image using a 0.8% agarose gel
Procedure
1. Prepare a 0.8% agarose gel.
2. Add 1 µl of 6X gel loading dye to 2-3 µl of each DNA sample before loading the wells of the gel. The dye shows how far the samples have migrated during electrophoresis, so that the run can be halted at an appropriate stage.
3. Load at least 1 or 2 wells with uncut, good quality λ DNA or any previously quantified DNA samples (50 ng and 100 ng) as molecular weight standards.
4. Run the submarine electrophoretic gel at 70 V till the dye has migrated one-third of the distance in the gel.
5. Visualize the DNA using a UV transilluminator and quantify it in comparison with the fluorescent yield of the standards.
Note: For SSR and RAPD analysis, it is more important to have good quality DNA samples (unsheared/undegraded DNA) than high quantities of DNA. In contrast, RFLP analysis requires larger quantities of DNA, since the technique is not PCR-based.
B. Spectrophotometric Determination
Spectrophotometric determination gives a simple and accurate estimate of the concentration of nucleic acids in a sample. Purines and pyrimidines in nucleic acids show maximum absorption of UV light around 260 nm (e.g., dATP: 259 nm; dCTP: 272 nm; dTTP: 247 nm) if the DNA sample is pure, without significant contamination from proteins or organic solvents. The ratio of optical density at 260 and 280 nm (OD260/OD280) is used to assess the purity of the sample. This method is, however, limited by the quantity of DNA and the purity of the preparation. Accurate analysis of the DNA preparation may be impeded by impurities in the sample or by too small an amount of DNA. In the estimation of total genomic DNA, for example, the presence of RNA, sheared DNA etc. could interfere with the accurate estimation of total high molecular weight genomic DNA.
Procedure
1. Take 1 ml TE buffer in a cuvette and calibrate the spectrophotometer at 260 nm as well as 280 nm.
2. Add 10 µl of each DNA sample to 990 µl TE (Tris-EDTA buffer) and mix well; this gives the 100-fold dilution assumed in the concentration formula.
3. Use TE buffer as a blank in the other cuvette of the spectrophotometer.
4. Note the OD260 and OD280 values on the spectrophotometer.
5. Calculate the OD260/OD280 ratio.
Comments:
• A ratio between 1.8 and 2.0 denotes that the absorption in the UV range is due to nucleic acids.
• A ratio lower than 1.8 indicates the presence of proteins and/or other UV absorbers.
• A ratio higher than 2.0 indicates that the samples may be contaminated with chloroform or phenol.
In either case (<1.8 or >2.0) it is advisable to re-precipitate the DNA.
The amount of DNA can be quantified using the formula:
DNA concentration (µg/µl) = [OD260 x 100 (dilution factor) x 50 µg/ml] / 1000
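As a worked illustration of the formula and of the purity thresholds given in the comments, the following Python sketch may help. It assumes the standard conversion of 50 µg/ml of double-stranded DNA per OD260 unit and the 100-fold dilution used in the procedure; the function names and the example readings are hypothetical.

```python
def dna_concentration(od260, dilution_factor=100, ug_per_ml_per_od=50.0):
    """Return the DNA concentration in ug/ul (numerically equal to mg/ml)."""
    return od260 * dilution_factor * ug_per_ml_per_od / 1000.0


def purity(od260, od280):
    """Return the OD260/OD280 ratio and an interpretation of it."""
    ratio = od260 / od280
    if ratio < 1.8:
        verdict = "possible protein or other UV-absorbing contaminants"
    elif ratio > 2.0:
        verdict = "possible chloroform or phenol contamination"
    else:
        verdict = "absorption mainly due to nucleic acids"
    return ratio, verdict


# Hypothetical readings for one diluted sample
od260, od280 = 0.25, 0.13
print(dna_concentration(od260), "ug/ul")
ratio, verdict = purity(od260, od280)
print(round(ratio, 2), verdict)
```

With these readings the concentration is 0.25 x 100 x 50 / 1000 = 1.25 µg/µl, and the ratio of about 1.92 falls in the acceptable range.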
Reference
Hoisington, D., Khairallah, M. and Gonzalez-de-Leon, D. (1994). Laboratory Protocols: CIMMYT Applied Biotechnology Center. Second Edition, Mexico, D.F.: CIMMYT.
Polymerase Chain Reaction (PCR)
SSR-PCR reactions are performed in 15 µl of reaction mixture containing DNA (50 ng), Taq buffer (1X), MgCl2 (1.5 mM), dNTPs (0.5 mM), primer (1 pM) and Taq polymerase (0.5 unit).
Composition of reaction mixture for single SSR reaction (15 µl):
Taq Buffer (10X)
dNTPs mix. (10 mM each) -
Taq DNA Pol. (5U/ µl)
Template DNA (5ng/ µl) -
Mix all these components (except template DNA) and add 10 µl of the mixture to PCR tubes or titre plates containing the template DNA (5 µl of genomic DNA at a concentration of 10 ng/µl). Place the samples in a thermocycler and run the PCR programme.
PCR condition for SSR:
Step I 94ºC
72ºC
35 cycles of Step II
Step III 72ºC
Store amplified products at 4ºC.
Note: Annealing temperature depends on primer properties (Tm value). The annealing temperatures and primer sequences of the markers used in the diversity analysis study at DMR are annexed.
Composition of reaction mixture for single RAPD reaction (25 µl):
Taq Buffer (10X) - 2.5 µl
MgCl2 (25 mM) - 2.5 µl
dNTPs mix (10 mM each) - 0.5 µl
Primer (10 mM) - 1.5 µl
Taq DNA Pol. (5 U/µl) - 0.2 µl (1 U)
Water - 4.8 µl
Template DNA - 2.5 µl (25 ng/µl)
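When many reactions are set up together, the per-reaction volumes are usually scaled into a master mix that contains everything except the template DNA. The Python sketch below only illustrates that arithmetic: it takes the per-reaction volumes exactly as listed in the table, and the 10% pipetting overage, the function name and the dictionary are assumptions rather than part of the DMR protocol.

```python
# Per-reaction volumes (ul) as listed for the RAPD reaction, template DNA excluded
RAPD_PER_REACTION_UL = {
    "Taq Buffer (10X)": 2.5,
    "MgCl2 (25 mM)": 2.5,
    "dNTPs mix (10 mM each)": 0.5,
    "Primer": 1.5,
    "Taq DNA Pol. (5 U/ul)": 0.2,
    "Water": 4.8,
}


def master_mix(per_reaction_ul, n_reactions, overage=0.10):
    """Scale per-reaction volumes to n reactions plus a pipetting overage.

    The overage (10% by default) is an assumed allowance for pipetting loss.
    """
    if n_reactions < 1:
        raise ValueError("at least one reaction is required")
    factor = n_reactions * (1 + overage)
    return {name: round(vol * factor, 2) for name, vol in per_reaction_ul.items()}


# Master mix for 10 reactions
for component, volume in master_mix(RAPD_PER_REACTION_UL, 10).items():
    print(f"{component}: {volume} ul")
```

The template DNA is then added separately to each tube, as described for the SSR reaction.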
PCR condition for RAPD:
Step I 94ºC - 4.0 min.
Step II 94ºC - 1 min.
35ºC - 1 min.
72ºC - 2 min.
40 repeats of Step II
Step III 72ºC - 7.0 min.
Store amplified products at 4ºC.
Gel electrophoresis for SSR-PCR products:
Fig. 2. Microsatellite profile of maize inbred lines. Amplifications performed with primer bnlg1523. M denotes the 50bp GeneRuler ladder and the numbers identify each inbred line analysed in the study. The list of inbred lines is annexed.
• Resolve the amplified products in 3.5% SFR (Super Fine Resolution) agarose gel containing ethidium bromide, using 1X TAE buffer.
• Mix the total reaction mixture with 3 µl of 6X DNA loading dye and load the whole sample on the gel.
• Load a 50bp DNA ladder along with the unknown samples to determine the size of the amplified products.
• Run the gel at 90 V for four hours.
• Take the gel image using a gel documentation system.
Note: The above-mentioned protocol is standardized at the DMR Biotechnology lab, New Delhi-12.
SSR Data Compilation and Data Analysis
• Score the alleles manually in terms of the positions of the bands relative to the ladder, sequentially from the smallest to the largest band.
• In the data matrix, denote the presence of a band with ‘1’ and its absence with ‘0’. Treat diffused bands or bands that are ambiguous to score as missing data and designate them as ‘9’.
• Consider genotypes showing two allelic bands of equal intensity as heterozygous for the locus.
• Determine the polymorphism information content (PIC), described as
PICi = 1 - Σj (Pij)^2
where Pij is the frequency of the jth allele at the ith locus. Allele number and allele frequency can be calculated for SSRs because the resulting bands are allelic. The same cannot be done for RAPD and AFLP, so for those markers the bands are simply scored as ‘0’ and ‘1’ and the distance is calculated. PIC values range from 0 (monomorphic) to 1 (very highly discriminative, with many alleles in equal and not in low frequency).
• Use Jaccard’s coefficient (J) to calculate the genetic similarities among pairwise comparisons of genotypes based on SSR data, as follows:
J = N11 / (N11 + N10 + N01)
where N11 is the number of bands present in both genotypes, N10 is the number of bands present only in the first genotype (lane) and N01 the number of bands present only in the second genotype.
• Subject the similarity matrix to NTSYS-pc 2.02 to produce a Sequential, Agglomerative, Hierarchical, Nested (SAHN) classification by employing UPGMA (Unweighted Pair Group Method using Arithmetic averages).
• Test the goodness of fit of the clustering by estimating cophenetic values using the COPH and MXCOMP options of the NTSYS-pc programme.
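The scoring, similarity and clustering steps above can be sketched in plain Python. This is only an illustration of the idea and not a substitute for NTSYS-pc: the four band profiles are hypothetical, positions scored ‘9’ in either genotype are simply skipped (an assumption about how missing data are handled), and the cophenetic test is not covered.

```python
from itertools import combinations

MISSING = 9  # code used in the data matrix for ambiguous or diffused bands


def jaccard(a, b):
    """Jaccard similarity J = N11 / (N11 + N10 + N01) for two 0/1/9 profiles."""
    n11 = n10 = n01 = 0
    for x, y in zip(a, b):
        if MISSING in (x, y):
            continue  # assumption: ignore positions with missing data
        if x == 1 and y == 1:
            n11 += 1
        elif x == 1 and y == 0:
            n10 += 1
        elif x == 0 and y == 1:
            n01 += 1
    informative = n11 + n10 + n01
    return n11 / informative if informative else 0.0


def average_similarity(c1, c2, pair_sim):
    """UPGMA criterion: mean of all pairwise similarities between two clusters."""
    total = sum(pair_sim[frozenset((a, b))] for a in c1 for b in c2)
    return total / (len(c1) * len(c2))


def upgma(profiles):
    """Return the merge history as (cluster, cluster, similarity) tuples."""
    names = list(profiles)
    pair_sim = {frozenset((a, b)): jaccard(profiles[a], profiles[b])
                for a, b in combinations(names, 2)}
    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        # Join the two clusters with the highest average similarity
        c1, c2 = max(combinations(clusters, 2),
                     key=lambda pair: average_similarity(pair[0], pair[1], pair_sim))
        merges.append((c1, c2, average_similarity(c1, c2, pair_sim)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


# Hypothetical data matrix: four genotypes scored over six band positions
profiles = {
    "A": [1, 1, 0, 1, 0, 1],
    "B": [1, 1, 0, 1, 1, 0],
    "C": [0, 0, 1, 0, 1, 1],
    "D": [0, 9, 1, 0, 1, 0],
}
for left, right, similarity in upgma(profiles):
    print(left, right, round(similarity, 3))
```

Each tuple in the merge history corresponds to one node of the dendrogram, with the similarity level at which the two groups are joined.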
Genetic Diversity Analysis in QPM Inbred Lines
SSRs have been widely used in maize diversity analysis owing to their co-dominant nature, abundance in the genome, high polymorphism, repeatability and reliability. Genetic diversity analysis of 45 QPM inbreds using 46 SSR markers was done at DMR. The study revealed considerable diversity among these lines. A total of 210 alleles were observed, with an average of 4.2 alleles per locus. There were 85 rare alleles, with a frequency <0.05. Twelve unique alleles were found in these inbreds, which can be used for fingerprinting. The genetic diversity parameters are summarized in Table 1. The resulting UPGMA dendrogram (Fig. 3) grouped the genotypes into four clusters (Table 2). The relationship between the clusters in the dendrogram shows that Cluster IV is more diverse than the other three clusters. Cluster IV is the largest and consists mainly of the HKI-164 series. HKI-163 is derived from CML-163, and this relationship is clearly depicted in the dendrogram (Cluster I). Inbreds of the HKI-164 series tend to group in Cluster IV, except for three: HKI164-7-2, HKI-164-D-3-3-2 and HKI-164-1-4. This finding suggests that these lines, though derived from the same source, have undergone considerable changes during the course of development. HKI-193-1, HKI-161 and HKI-163 are productive lines which have been used in combination to develop QPM Single Cross Hybrids: HQPM-1 (HKI 193-1 x HKI-163), HQPM-7 (HKI 193-1 x HKI-161) and HQPM-5 (HKI-163 x HKI-161).
Table 1. Summary table for SSR analysis in 45 QPM lines.
Parameter                      Value
QPM inbred lines               45
Total markers (loci)           46
Total alleles                  210
Average alleles                4.2
Rare alleles                   85
Unique alleles                 12
Jaccard’s coefficient range    0.0725-0.4898
PIC range                      0.263-0.85
Average PIC                    0.655
No. of clusters                5
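Summary parameters of the kind listed in Table 1 can be derived per locus from the allele calls. The Python sketch below is a minimal illustration under stated assumptions: each inbred contributes one allele per locus (scored as a band size in bp), PIC is taken in its widely used form 1 - Σ p², rare alleles are those with a frequency below 0.05, and unique alleles are those seen in a single line. The function name and the example data are hypothetical.

```python
from collections import Counter


def locus_summary(calls, rare_threshold=0.05):
    """Summarize one SSR locus from a list of allele calls (None = missing)."""
    observed = [allele for allele in calls if allele is not None]
    if not observed:
        raise ValueError("no allele calls for this locus")
    counts = Counter(observed)
    freqs = {allele: n / len(observed) for allele, n in counts.items()}
    return {
        "n_alleles": len(freqs),
        # assumed form of PIC: 1 minus the sum of squared allele frequencies
        "pic": 1 - sum(p * p for p in freqs.values()),
        "rare": sorted(a for a, p in freqs.items() if p < rare_threshold),
        "unique": sorted(a for a, n in counts.items() if n == 1),
    }


# Hypothetical locus scored in 25 inbred lines (band sizes in bp)
calls = [150] * 10 + [160] * 10 + [170] * 4 + [180]
summary = locus_summary(calls)
print(summary["n_alleles"], round(summary["pic"], 4),
      summary["rare"], summary["unique"])
```

Here the allele frequencies are 0.40, 0.40, 0.16 and 0.04, so PIC = 1 - (0.16 + 0.16 + 0.0256 + 0.0016) = 0.6528, and the 180 bp allele is both rare and unique. Summing such results over all loci gives the totals and averages reported in a summary table.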
Table 2. Clustering of Quality protein maize inbred lines
Cluster I: HKI-14-2, HKI 161-TR-5-2, HKI-163, CML163, CML-142, CML 150, CML-161, CL-QRCYQ51
Cluster II: HKI-1647-7-2, HKI-193-2-1, HKI-5072-2-BT, CML-176, HKI-27-3, HKI164-7-2, HKI-188, HKI-17-2, CML175, HKI-35-5-2, HKI-26-2-4 (1-4), CML-165, HKI-34-(1+2)-1
Cluster III: HKI-15-2-2(1-3), HKI-164-D-3-3-2, HKI-191-1-2-5, HKI-164-1-4, HKI170 (H-2), HKI-162, CML-140, CL-02457
Cluster IV: HKI-13-2, DMRQPM-03-124, HKI-193, HKI-193-2-2, HKI193-1, HKI164-3 (2-1), HKI-164-7-6, HKI-164-D-4, HKI-164-7-4, HKI-164-7-3, HKI-164-4(1-3), HKI-164-TB-3-4, HKI 164-7-7ER, HKI-164-7-6X161, CML451
Fig. 3. Dendrogram of Quality protein maize inbred lines
Suggested Reading
Botstein D, White RL, Skolnick M and Davis RW. 1980. Construction of a genetic linkage map in man using restriction fragment length polymorphisms. Amer. J. Human Genet. 32: 314-331.
Gupta PK, Balyan HS, Sharma PC and Ramesh B. 1996. Microsatellites in plants: A new class of molecular markers. Curr. Sci. 70: 45-54.
Karp A, Kresovich S, Bhat KV, Ayad WG and Hodgkin T. 1997. Molecular tools in plant genetic resources conservation: a guide to the technologies. IPGRI Technical Bulletin No. 2, International Plant Genetic Resources Institute, Rome, Italy.
Phillips RL and Vasil IK (eds.). 1994. DNA-based Markers in Plants. Kluwer Academic Publishers, Netherlands.
Powell W, Morgante M, Andre C, Hanafey M, Vogel J, Tingey S and Rafalski A. 1996a. The comparison of RFLP, RAPD, AFLP and SSR (microsatellite) markers for germplasm analysis. Mol. Breed. 2: 225-238.
Tanksley SD. 1983. Molecular markers in plant breeding. Plant Mol. Biol. Rep. 1: 3-8.
Vos P, Hogers R, Bleeker M, Reijans M, van de Lee T, Hornes M, Frijters A, Pot J, Peleman J, Kuiper M and Zabeau M. 1995. AFLP: a new technique for DNA fingerprinting. Nucleic Acids Res. 23: 4407-4414.
Welsh J and McClelland M. 1990. Fingerprinting genomes using PCR with arbitrary primers. Nucleic Acids Res. 18: 7213-7218.
Jeffreys AJ, Wilson V and Thein SL. 1985. Hypervariable minisatellite regions in human DNA. Nature 314: 67-73.
Litt M and Luty JA. 1989. A hypervariable microsatellite revealed by in vitro amplification of a dinucleotide repeat within the cardiac muscle actin gene. Am. J. Hum. Genet. 44: 397-401.
Jaccard P. 1908. Nouvelles recherches sur la distribution florale. Bull. Soc. Vaudoise Sci. Nat. 44: 223-270.
Rogers JS. 1972. Measures of genetic similarity and genetic distance. p. 145-153. In: Studies in Genetics VII (ed. J.S. Rogers). Publ. 7213. Univ. of Texas, Austin, TX.
Burr B and Burr FA. 1991. Recombinant inbred lines for molecular mapping in maize. Theor. Appl. Genet. 85: 55-60.
Saghai-Maroof MA, Soliman KM, Jorgensen RA and Allard RW. 1984. Ribosomal DNA spacer length polymorphisms in barley: Mendelian inheritance, chromosomal location and population dynamics. Proc. Natl. Acad. Sci. USA 81: 8014-8018.
Solutions and buffers used:
1) 0.5 M EDTA (pH 8.0): 37.22 g EDTA di-sodium salt and 4.0 g sodium hydroxide dissolved in 150 ml of water, pH adjusted with NaOH solution and the final volume made up to 200 ml.
2) 1 M Tris-HCl (pH 8.0): 24.23 g Tris base dissolved in 150 ml of water, pH adjusted with HCl and the final volume made up to 200 ml.
3) 5 M NaCl: 58.44 g NaCl dissolved in water and the final volume made up to 200 ml.
4) 10 mM TE Buffer (pH 8.0): 1 ml of Tris-HCl (1 M) and 0.2 ml of EDTA (0.5 M) added to water and the final volume made up to 100 ml.
5) 50X Tris Acetic acid EDTA or TAE Buffer (pH 8.0): 242 g of Tris base dissolved in 500 ml water, 57.1 ml acetic acid and 100 ml of 0.5 M EDTA added, and the final volume made up to 1000 ml.
6) DNA Loading dye (Bromophenol blue):
a. 0.5 M EDTA - 10 ml
b. 40% Sucrose - 20.0 g
c. 25% Bromophenol blue - 13.5 g
d. Final volume made up to 50 ml with distilled water.
7) DNA Lysis Buffer (CTAB):
0.5 M EDTA - 2.0 ml (20 mM)
1 M Tris-HCl - 2.0 ml (100 mM)
5 M NaCl - 4.16 ml (1.4 M)
2% w/v CTAB - 400 mg
100 mM 2-Mercaptoethanol added freshly. Mixed and the final volume made up to 20 ml with distilled water.
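The amounts for the stock solutions and for TE buffer follow from two simple relations: mass (g) = molarity x volume (l) x molar mass, and C1 x V1 = C2 x V2 for dilutions. The Python sketch below reproduces a few of the figures listed above; the molar masses (EDTA disodium salt dihydrate 372.24, Tris base 121.14, NaCl 58.44 g/mol) are standard values supplied here as assumptions, and the function names are illustrative.

```python
# Standard molar masses in g/mol (assumed; not stated in the recipes above)
MOLAR_MASS = {
    "EDTA disodium salt dihydrate": 372.24,
    "Tris base": 121.14,
    "NaCl": 58.44,
}


def grams_needed(molarity, volume_ml, molar_mass):
    """Mass of solute (g) for a solution of the given molarity and volume."""
    return molarity * (volume_ml / 1000.0) * molar_mass


def stock_volume_ml(stock_conc, final_conc, final_volume_ml):
    """Volume of stock (ml) from C1 x V1 = C2 x V2; both concentrations in the same unit."""
    return final_conc * final_volume_ml / stock_conc


# Stock solutions, 200 ml each
print(round(grams_needed(0.5, 200, MOLAR_MASS["EDTA disodium salt dihydrate"]), 2))
print(round(grams_needed(1.0, 200, MOLAR_MASS["Tris base"]), 2))
print(round(grams_needed(5.0, 200, MOLAR_MASS["NaCl"]), 2))

# TE buffer, 100 ml: 10 mM Tris from a 1 M (1000 mM) stock
print(stock_volume_ml(1000, 10, 100), "ml of 1 M Tris-HCl")
```

The same dilution function applies to any working solution, for example the volume of 50X TAE needed to prepare a given volume of 1X running buffer.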
S.No. Chemicals/ Reagents
DNA low range ruler
Lambda Hind III Digest
Taq DNA Polymerase (1U)
dNTPs (4 x 500 µl)
10 X taq buffer B
MgCl2 (1ml x 2)
RNase A (100mg)
Water protease & Nuclease free
Tris Saturated Phenols
Acetic acid glacial AR
Propan-2-ol (iso-propyl alcohol) HPLC
Instruments used S.No. Instrument
Deep Freezer (-20oC)
MJ Research PTC- 200, Biored
Gel Documentation system
Hp alpha imager
Eppendorf & Nichipet
Plastic-ware (Oakridge tube stand, Eppendorf tubes etc.)
PCR Plates and PCR tubes
ANNEXURE-IV List of markers and their genomic location
Source: Maize GDB (www.maizegdb.com)