Collections-based landscape genomics: Red-browed finches as a test case

Ignition Grant Round 3 (June 2014)

Landscape genomics is an emerging field that attempts to find relationships between environmental variables and genetic variants [1]. Such studies look for the presence or absence of a variant in individuals sampled across an environmental gradient. Landscape genomics differs from classical population genetics in that the individual, rather than the population, is the sampling unit [2]. Furthermore, the aim is to sample from as many locations along the environmental gradient as possible. Landscape genomics studies are therefore an ideal fit for most scientific collections: collections aim to capture biodiversity, and so rarely contain more than one individual from a particular time and place.
 
As a case in point, the Australian National Wildlife Collection (ANWC) has tissue samples from Neochmia temporalis (Red-browed finches) collected across a latitudinal range from Cape York to southern Victoria. The species' distribution covers a number of environmental gradients relevant to climate change, most notably a north-to-south temperature gradient.
 
We aim to describe genetic variation in N. temporalis across its range, through whole genome sequencing of tissue samples held at ANWC.
 
This will allow us to assess the utility of scientific collections for landscape genomics, and inform future sampling designs for a large-scale landscape genomics study. Ultimately, this will provide insights into the pathways, genes, and alleles relevant to heat tolerance in birds.
 
Whole genome sequence (WGS) data for 48 Red-browed finches with tissue samples housed at ANWC will be generated. Phenotypic data, for example body size, will also be collated. These 48 individuals represent a subsample of the 76 available samples; they are selected to represent a latitudinal gradient along the east coast of Australia (see Figure 1). Each individual will be sequenced to ~2x coverage. A previous WGS analysis of the related species Lonchura castaneothorax indicates this is sufficient to detect polymorphisms across the majority of the genome (personal communication, Katie Faust-Stryjewski). Individually barcoded samples will be pooled and sequenced using four lanes of an Illumina HiSeq2000 sequencing run.
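The sequencing design above can be checked with some simple arithmetic. The sketch below is illustrative only: the genome size of roughly 1.2 Gb is an assumed round figure (not stated in this proposal), the function name coverage_plan is hypothetical, and the per-individual breadth of coverage uses the idealised Poisson (Lander-Waterman) model, which ignores alignment failures and uneven coverage.

```python
import math


def coverage_plan(n_individuals, depth_per_individual, genome_size_bp, n_lanes):
    """Rough sequencing budget for a low-coverage, multi-individual design.

    genome_size_bp is an ASSUMED value supplied by the caller.
    """
    total_bases = n_individuals * depth_per_individual * genome_size_bp
    return {
        # Total aligned sequence needed across all individuals, in gigabases
        "total_gb": total_bases / 1e9,
        # Share of that total each lane would have to deliver
        "gb_per_lane": total_bases / n_lanes / 1e9,
        # Combined depth at a site when all individuals are considered together
        "pooled_depth": n_individuals * depth_per_individual,
        # Poisson expectation: fraction of sites with at least one read
        # in a single individual sequenced to the given mean depth
        "frac_covered_per_individual": 1.0 - math.exp(-depth_per_individual),
    }


# 48 individuals at ~2x over four lanes; 1.2e9 bp is an assumed genome size
plan = coverage_plan(48, 2.0, 1.2e9, 4)
for key, value in plan.items():
    print(f"{key}: {value:.3f}")
```

Under these assumptions, any one individual has reads at only a subset of sites, while the combined depth across all 48 individuals is much higher, which is why variant discovery relies on calling polymorphisms jointly across samples.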
 
Data will be aligned to the Taeniopygia guttata (Zebra Finch) genome [3] and polymorphic sites identified using software such as CRISP [4]. Excellent alignment is anticipated: in the L. castaneothorax study mentioned above, more than 92% of reads aligned to the T. guttata genome (personal communication, Katie Faust-Stryjewski). In addition to allowing common SNPs to be called across low-coverage samples, pooling allows rare SNPs to be accurately distinguished from errors, by harnessing the fact that errors are highly correlated between samples sequenced simultaneously [5].
 
Basic population structure will be investigated using software such as STRUCTURE [6]. An initial landscape genomic analysis will also be performed to determine the data's potential and identify future sampling requirements. To this end, climate data will be sourced from the Bureau of Meteorology; to account for the fact that samples were collected in different years, historical data matching the year of collection will be used where appropriate. Relationships between individual alleles and environmental variables will then be investigated through statistical analysis, e.g. by multiple logistic regression as implemented in SAM [7].
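To illustrate the kind of allele-environment test described above, the following is a minimal sketch of a single logistic regression of allele presence/absence on one environmental variable, fitted by Newton-Raphson and compared with an intercept-only model using a likelihood-ratio (G) statistic. It is not the SAM implementation: the function names are hypothetical, the temperatures and presence/absence calls are made-up toy values, and a real analysis would repeat the test over many alleles and variables with correction for multiple testing.

```python
import math


def fit_logistic(x, y, n_iter=50, tol=1e-10):
    """Fit P(y=1) = 1 / (1 + exp(-(b0 + b1*x))) by Newton-Raphson."""
    mean_x = sum(x) / len(x)
    xc = [xi - mean_x for xi in x]  # centre the predictor for stability
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(xc, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p            # score for the intercept
            g1 += (yi - p) * xi     # score for the slope
            h00 += w                # entries of X'WX
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    # Convert the intercept back to the uncentred scale
    return b0 - b1 * mean_x, b1


def predict(b0, b1, x):
    """Fitted probabilities of allele presence at each value of x."""
    return [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in x]


def log_likelihood(probs, y):
    """Bernoulli log-likelihood of the observations given fitted probabilities."""
    return sum(yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
               for p, yi in zip(probs, y))


def g_test(x, y):
    """Likelihood-ratio test of the slope against an intercept-only model."""
    b0, b1 = fit_logistic(x, y)
    ll_full = log_likelihood(predict(b0, b1, x), y)
    p_null = sum(y) / len(y)
    ll_null = log_likelihood([p_null] * len(y), y)
    g = max(0.0, 2.0 * (ll_full - ll_null))
    # Chi-square survival function with 1 degree of freedom
    p_value = math.erfc(math.sqrt(g / 2.0))
    return b0, b1, g, p_value


# Toy data (made up): mean temperature at each collection site (deg C)
# and presence (1) or absence (0) of one allele in the sampled individual
temps = [14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27]
allele = [0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1]

b0, b1, g, p_value = g_test(temps, allele)
print(f"slope={b1:.3f}  G={g:.3f}  p={p_value:.4f}")
```

Because each individual contributes one observation, this formulation matches the individual-based sampling that collections provide; the same function could be applied per allele with climate values matched to each specimen's year of collection.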
 
The project will identify putative functional candidate genes involved in climate adaptation in birds, increasing our understanding of the biology underlying this process. Based on similarity between the zebra finch and the chicken genomes, it is reasonable to expect that such insights may be applicable to other avian species, with implications for management and conservation.
 
This project will also provide a road map for future collections-based analysis of adaptation across landscapes. Although steps are underway to include genomic information in policy development and application (e.g., as evidenced by the recent OCE Genomics Symposium on "Species' ability to adapt to climate change", held at ANU in December 2013), our ability to incorporate genomic resources into decision frameworks remains limited.
 
Our understanding of the usefulness of genomic information to policy makers is also still in its infancy. Challenges include difficulty in obtaining samples; identifying adaptive, rather than neutral genomic signals; and interpreting the consequences of findings. This project aims to address these problems, by demonstrating that meaningful adaptive signatures can be identified from available samples with only moderate investment.
 
[1] TRENDS ECOL EVOL (2013) 28(10), 614-621; [2] ANNU REV ECOL EVOL S (2012) 43, 23-43; [3] NATURE (2010) 464(7289), 757-762; [4] BIOINFORMATICS (2010) 26(12), i318-i324; [5] NAT COMMUN (2012) 3, 811; [6] GENETICS (2000) 155(2), 945-959; [7] ECOGRAPHY (2010) 33(1), 46-50.