Genome skimming with degraded DNA from herbarium specimens
Ignition Grant Round 3 (June 2014)
- Adrienne Nicotra, ANU
- Alexander Schmidt-Lebuhn, CSIRO
Genomic DNA in old collection material is generally degraded, most often into fragments of 50-100 bp in length, making it unsuitable for analysis with Sanger sequencing or traditional genotyping methods such as AFLP or microsatellites.
Modern sequencing techniques, however, lend themselves to work on fragmented DNA: they have short to intermediate read lengths and therefore require shearing or enzymatic cutting as a first step anyway, even when sequencing high-quality DNA. The two major challenges are biochemical and bioinformatic:
- Making genomic fragments from old specimens suitable for sequencing, and
- Dealing with errors resulting from biochemical changes in old DNA, especially in the case of low coverage.
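The second challenge can be illustrated with a toy example. The sketch below is not part of the proposal's pipeline; it only shows why read depth matters when individual reads may carry damage-induced miscalls (for example, deaminated cytosines read as T). The function names and the `min_depth` and `min_fraction` thresholds are hypothetical values chosen for illustration.

```python
from collections import Counter


def consensus_base(column, min_depth=3, min_fraction=0.7):
    """Majority base of one alignment column, or 'N' if support is too weak.

    column: string of bases observed at one position across reads.
    min_depth and min_fraction are illustrative thresholds, not recommendations.
    """
    bases = [b for b in column.upper() if b in "ACGT"]
    if len(bases) < min_depth:
        return "N"  # too few reads to outvote a damaged one
    base, count = Counter(bases).most_common(1)[0]
    return base if count / len(bases) >= min_fraction else "N"


def consensus(columns, min_depth=3, min_fraction=0.7):
    """Consensus sequence over a list of alignment columns."""
    return "".join(consensus_base(c, min_depth, min_fraction) for c in columns)


if __name__ == "__main__":
    # Second column: one read shows T where four show C, as damage might cause.
    print(consensus(["AAAA", "CCCTC", "GG", "TTTT"]))
```

With four or five reads per position a single miscall is outvoted, whereas a position covered by only two reads is left undetermined.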
Although there have been previous attempts to establish sequencing of degraded plant material, no consensus for an effective approach has emerged.
Recently a new protocol was published for the biochemical treatment of degraded DNA from insect specimens (Tin et al. 2014). In discussions with Alexander Mikheyev at a recent Degraded DNA Workshop at the Australian National Insect Collection, it was suggested that the protocol should work equally well in other groups of organisms. This project consequently aims to adapt the approach to herbarium specimens.
Many applications of modern sequencing require the availability of a reference genome, and Mikheyev and his collaborators have also mapped their results against known model genomes, e.g. that of Drosophila melanogaster. However, in many phylogenetically, ecologically or conservation-relevant groups no reasonably close reference genome is available. An alternative that will be used in the proposed project is genome skimming, that is, low-coverage sequencing aiming to retrieve mostly the high-copy regions of the genome (chloroplasts, mitochondria, ribosomal DNA and its spacers).
This approach, which has been used successfully for a variety of purposes (e.g. Bock et al. 2013; Male et al. 2014), trades off the number of independent markers recovered against several other advantages: a full reference genome becomes unnecessary; the aforementioned problems caused by biochemical changes in old DNA are easier to deal with in high copy regions; and because lower coverage is needed, more barcoded samples can be sequenced together, crucially reducing the cost per sample for applications using high sample numbers such as population studies or phylogenetics.
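The trade-off between pooling depth and coverage can be made concrete with a rough calculation. The following is a minimal sketch, assuming equal read numbers per sample and a known fraction of reads derived from the high-copy target; all function names and example figures (reads per run, plastid read fraction, plastome size) are hypothetical and would need to be replaced with measured values.

```python
import math


def expected_depth(total_reads, n_samples, read_length_bp,
                   target_fraction, target_size_bp):
    """Mean depth on a high-copy target for one sample in an even pool.

    target_fraction: assumed share of a sample's reads that come from the
    target (e.g. the chloroplast genome); this varies widely between taxa.
    """
    reads_per_sample = total_reads / n_samples
    target_bases = reads_per_sample * target_fraction * read_length_bp
    return target_bases / target_size_bp


def max_pool_size(total_reads, read_length_bp, target_fraction,
                  target_size_bp, min_depth):
    """Largest number of samples that still reaches min_depth on the target."""
    reads_needed = min_depth * target_size_bp / (read_length_bp * target_fraction)
    return math.floor(total_reads / reads_needed)


if __name__ == "__main__":
    # Hypothetical run: 48 million reads of 100 bp, 3% plastid reads,
    # 150 kb plastome.
    print(expected_depth(48_000_000, 48, 100, 0.03, 150_000))
    print(max_pool_size(48_000_000, 100, 0.03, 150_000, min_depth=25))
```

Because the depth on the target scales inversely with the number of pooled samples, halving the required depth roughly doubles how many barcoded samples fit into one run.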
Because the project is meant to establish the method rather than already produce a large phylogeny, it will use only a limited number of specimens to infer what coverage of the relevant regions can be achieved in this way. Reflecting the shared interest of the investigators, the specimens will mostly be alpine Asteraceae, but will also include selected species that have proved particularly problematic with Sanger sequencing from herbarium material in the past, presumably because of their behaviour in drying, such as the ephemeral Epitriche. Finally, fresh samples of selected species will be sheared and then processed in the same way alongside conspecific herbarium material to verify comparability.
The protocol to be used is genomic shotgun sequencing. After DNA extraction and quantitation it involves denaturation, a phosphatase treatment, addition of GTP to the 3' termini, second strand re-synthesis, ligation of barcoded adaptors, amplification and bead-based library purification. Samples with different barcodes are quantitated and pooled equimolarly before Illumina sequencing. Sequenced Asteraceae genomes (esp. sunflower) can be used for mapping of organellar and ribosomal sequences.
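The equimolar pooling step amounts to a simple unit conversion. The sketch below assumes double-stranded libraries, the commonly used average of 660 g/mol per base pair, and concentrations measured in ng/µL; the function names, library names and example values are hypothetical.

```python
def library_molarity_nM(conc_ng_per_ul, mean_fragment_bp):
    """Convert a dsDNA library concentration to nM.

    Uses the usual approximation of 660 g/mol per base pair.
    mean_fragment_bp should include the ligated adaptors.
    """
    return conc_ng_per_ul * 1e6 / (660.0 * mean_fragment_bp)


def equimolar_volumes(libraries, fmol_per_library):
    """Volume (µL) of each library that contributes the same molar amount.

    libraries: dict mapping name -> (concentration in ng/µL, mean size in bp).
    1 nM equals 1 fmol/µL, so volume = fmol wanted / molarity in nM.
    """
    return {
        name: fmol_per_library / library_molarity_nM(conc, size)
        for name, (conc, size) in libraries.items()
    }


if __name__ == "__main__":
    # Hypothetical libraries; values are for illustration only.
    libs = {"herbarium_A": (6.6, 200), "fresh_A": (3.3, 200)}
    for name, vol in equimolar_volumes(libs, fmol_per_library=100).items():
        print(f"{name}: {vol:.2f} µL")
```

Libraries from degraded specimens tend to have shorter inserts, so using each library's own mean fragment size rather than a single shared value keeps the pool balanced in molecules rather than in mass.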
If the Tin et al. approach can be successfully adapted to herbarium material, it will facilitate the use of old specimens for numerous types of research and will be particularly valuable for the study of remote, rare or extinct species. Making better use of old material from biodiversity collections will reduce the need to collect additional samples from rare species in the wild, thus avoiding additional damage to those populations and providing an alternative to sometimes difficult field trips. The study of extinct or presumed extinct species will benefit most, as old specimens with potentially degraded DNA constitute all available material.