Teasing apart neutral and adaptive genomic processes and identifying loci that are targets of selection can be difficult, particularly for nonmodel species that lack a reference genome. However, identifying such loci and the factors driving selection has the potential to greatly assist conservation and restoration practices, especially for the management of species in the face of contemporary and future climate change. Here, we focus on assessing adaptive genomic variation within a nonmodel plant species, the narrow-leaf hopbush (Dodonaea viscosa ssp. angustissima), commonly used for restoration in Australia. We used a hybrid-capture target enrichment approach to selectively sequence 970 genes across 17 populations along a latitudinal gradient from 30°S to 36°S. We analysed 8462 single-nucleotide polymorphisms (SNPs) for FST outliers as well as associations with environmental variables. Using three different methods, we found 55 SNPs with significant correlations to temperature and water availability, and 38 SNPs to elevation. Genes containing SNPs identified as under environmental selection were diverse, including aquaporin and abscisic acid genes, as well as genes with ontologies relating to responses to environmental stressors such as water deprivation and salt stress. Redundancy analysis demonstrated that only a small proportion of the total genetic variance was explained by environmental variables. We demonstrate that selection has led to clines in allele frequencies in a number of functional genes, including those linked to leaf shape and stomatal variation, which have been previously observed to vary along the sampled environmental cline. Using our approach, gene regions subject to environmental selection can be readily identified for nonmodel organisms.
Sample availability limits population genetics research on many species, especially taxa from regions with high diversity. However, many such species are well represented in museum collections assembled before the molecular era. Development of techniques to recover genetic data from these invaluable specimens will benefit biodiversity science. Using a mixture of freshly preserved and historical tissue samples, and a sequence capture probe set targeting >5000 loci, we produced high-confidence genotype calls on thousands of single nucleotide polymorphisms (SNPs) in each of five South-East Asian bird species and their close relatives (N = 27–43). On average, 66.2% of the reads mapped to the pseudo-reference genome of each species. Of these mapped reads, an average of 52.7% was identified as PCR or optical duplicates. We achieved deeper effective sequencing for historical samples (122.7×) compared to modern samples (23.5×). The number of nucleotide sites with at least 8× sequencing depth was high, with averages ranging from 0.89 × 10⁶ bp (Arachnothera, modern samples) to 1.98 × 10⁶ bp (Stachyris, modern samples). Linear regression revealed that the amount of sequence data obtained from each historical sample (represented by per cent of the pseudo-reference genome recovered with ≥8× sequencing depth) was positively and significantly (P ≤ 0.013) related to how recently the sample was collected. We observed characteristic post-mortem damage in the DNA of historical samples. However, we were able to reduce the error rate significantly by truncating ends of reads during read mapping (local alignment) and conducting stringent SNP and genotype filtering.
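The two read-level averages reported in the abstract above (66.2% of reads mapped, 52.7% of mapped reads flagged as duplicates) can be combined into a rough estimate of the share of reads that remain usable. The sketch below is illustrative only: the function name is hypothetical, and multiplying two averages is an approximation, since the per-sample values are not given in the abstract.

```python
def unique_mapped_fraction(mapped_frac: float, duplicate_frac: float) -> float:
    """Fraction of all reads that are mapped and not flagged as duplicates.

    mapped_frac: share of all reads that mapped to the reference.
    duplicate_frac: share of the *mapped* reads flagged as PCR/optical duplicates.
    """
    if not (0.0 <= mapped_frac <= 1.0 and 0.0 <= duplicate_frac <= 1.0):
        raise ValueError("fractions must lie in [0, 1]")
    return mapped_frac * (1.0 - duplicate_frac)


# Averages taken from the abstract; combining them this way is an assumption.
usable = unique_mapped_fraction(0.662, 0.527)
print(f"{usable:.1%}")  # prints 31.3%
```

Under this approximation, roughly a third of the sequenced reads contribute to genotype calling once unmapped and duplicate reads are set aside.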
Here, we present a set of RNA-based probes for whole mitochondrial genome in-solution enrichment, targeting a diversity of mammalian mitogenomes. This probe set was designed from seven mammalian orders and tested to determine the utility for enriching degraded DNA. We generated 63 mitogenomes representing five orders and 22 genera of mammals that yielded varying coverage ranging from 0 to >5400×. Based on a threshold of 70% mitogenome recovery and at least 10× average coverage, 32 individuals or 51% of samples were considered successful. The estimated sequence divergence of samples from the probe sequences used to construct the array ranged up to nearly 20%. Sample type was more predictive of mitogenome recovery than sample age. The proportion of reads from each individual in multiplexed enrichments was highly skewed, with each pool having one sample that yielded a majority of the reads. Recovery across each mitochondrial gene varied, with most samples exhibiting regions with gaps or ambiguous sites. We estimated the ability of the probes to capture mitogenomes from a diversity of mammalian taxa not included here by performing a clustering analysis of published sequences for 100 taxa representing most mammalian orders. Our study demonstrates that a general array can be cost- and time-effective when there is a need to screen a modest number of individuals from a variety of taxa. We also address practical concerns for using such a tool with regard to pooling samples and generating high-quality mitogenomes, and we detail a pipeline to remove chimeric molecules.
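The success criterion stated in the abstract above (at least 70% mitogenome recovery and at least 10× average coverage) is easy to express as a filter. The following is a minimal sketch, not the authors' pipeline: the `MitogenomeResult` class, its field names, and the three sample records are hypothetical and exist only to illustrate the rule.

```python
from dataclasses import dataclass


@dataclass
class MitogenomeResult:
    sample_id: str             # hypothetical identifier
    fraction_recovered: float  # share of the mitogenome recovered, 0..1
    mean_coverage: float       # average read depth across the mitogenome


def is_successful(result: MitogenomeResult,
                  min_recovered: float = 0.70,
                  min_coverage: float = 10.0) -> bool:
    """Apply both thresholds; a sample must pass each one to count."""
    return (result.fraction_recovered >= min_recovered
            and result.mean_coverage >= min_coverage)


# Invented records, for illustration only.
samples = [
    MitogenomeResult("s1", 0.98, 850.0),  # passes both thresholds
    MitogenomeResult("s2", 0.75, 6.0),    # enough recovery, too little coverage
    MitogenomeResult("s3", 0.40, 35.0),   # enough coverage, too little recovery
]
passed = [s.sample_id for s in samples if is_successful(s)]
print(passed)  # prints ['s1']

# The reported success rate follows from the counts in the abstract.
print(round(32 / 63 * 100))  # prints 51
```

Requiring both conditions matters because high average coverage can be concentrated in a small part of the mitogenome, and broad recovery can rest on very shallow depth.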
Phylogenetics benefits from using a large number of putatively independent nuclear loci and their combination with other sources of information, such as the plastid and mitochondrial genomes. To facilitate the selection of orthologous low-copy nuclear (LCN) loci for phylogenetics in nonmodel organisms, we created an automated and interactive script to select hundreds of LCN loci by a comparison between transcriptome and genome skim data. We used our script to obtain LCN genes for southern African Oxalis (Oxalidaceae), a speciose plant lineage in the Greater Cape Floristic Region. This resulted in 1164 LCN genes greater than 600 bp. Using target enrichment combined with genome skimming (Hyb-Seq), we obtained on average 1141 LCN loci, nearly the whole plastid genome and the nrDNA cistron from 23 southern African Oxalis species. Despite a wide range of gene trees, the phylogeny based on the LCN genes was very robust, as retrieved through various gene and species tree reconstruction methods as well as concatenation. Cytonuclear discordance was strong. This indicates that organellar phylogenies alone are unlikely to represent the species tree and stresses the utility of Hyb-Seq in phylogenetics.
Acropyga ants are a widespread clade of small subterranean formicines that live in obligate symbiotic associations with root mealybugs. We generated a data set of 944 loci of ultraconserved elements (UCEs) to reconstruct the phylogeny of 41 representatives of 23 Acropyga species using both concatenation and species-tree approaches. We investigated the biogeographic history of the genus through divergence dating analyses and ancestral range reconstructions. We also explored the evolution of the Acropyga-mealybug mutualism using ancestral state reconstruction methods. We recovered a highly supported species phylogeny for Acropyga with both concatenation and species-tree analyses. The age for crown-group Acropyga is estimated to be around 30 Ma. The geographic origin of the genus remains uncertain, although phylogenetic affinities within the subfamily Formicinae point to a Paleotropical ancestor. Two main Acropyga lineages are recovered with mutually exclusive distributions in the Old World and New World. Within the Old World clade, a Palearctic and African lineage is suggested as sister to the remaining species. Ancestral state reconstructions indicate that Old World species have diversified mainly in close association with xenococcines from the genus Eumyrmococcus, although present-day associations also involve other mealybug genera. In contrast, New World Acropyga predominantly evolved with Neochavesia until a recent (10–15 Ma) switch to rhizoecid mealybug partners (genus Rhizoecus). The striking mandibular variation in Acropyga evolved most likely from a 5-toothed ancestor. Our results provide an initial evolutionary framework for extended investigations of potential co-evolutionary interactions between these ants and their mealybug partners.
Molecular ecologists seek to genotype hundreds to thousands of loci from hundreds to thousands of individuals at minimal cost per sample. Current methods, such as restriction-site-associated DNA sequencing (RADseq) and sequence capture, are constrained by costs associated with inefficient use of sequencing data and sample preparation. Here, we introduce RADcap, an approach that combines the major benefits of RADseq (low cost with specific start positions) with those of sequence capture (repeatable sequencing of specific loci) to significantly increase efficiency and reduce costs relative to current approaches. RADcap uses a new version of dual-digest RADseq (3RAD) to identify candidate SNP loci for capture bait design and subsequently uses custom sequence capture baits to consistently enrich candidate SNP loci across many individuals. We combined this approach with a new library preparation method for identifying and removing PCR duplicates from 3RAD libraries, which allows researchers to process RADseq data using traditional pipelines, and we tested the RADcap method by genotyping sets of 96–384 Wisteria plants. Our results demonstrate that our RADcap method: (i) methodologically reduces (to <5%) and allows computational removal of PCR duplicate reads from data, (ii) achieves 80–90% reads on target in 11 of 12 enrichments, (iii) returns consistent coverage (≥4×) across >90% of individuals at up to 99.8% of the targeted loci, (iv) produces consistently high occupancy matrices of genotypes across hundreds of individuals and (v) costs significantly less than current approaches.
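Result (iii) in the abstract above describes coverage of at least 4× across more than 90% of individuals at a given share of targeted loci. One way to compute such a summary from a locus-by-individual depth table is sketched below. This is one reading of the metric, not the RADcap authors' code: the function name, the input layout, and the toy depths are assumptions.

```python
def fraction_of_loci_consistently_covered(depth_by_locus,
                                          min_depth=4,
                                          min_individual_frac=0.90):
    """Share of loci covered at >= min_depth in more than
    min_individual_frac of individuals.

    depth_by_locus: list of rows, one per locus; each row holds the
    read depth observed for every individual at that locus.
    """
    if not depth_by_locus:
        raise ValueError("no loci supplied")
    passing = 0
    for depths in depth_by_locus:
        if not depths:
            raise ValueError("locus with no individuals")
        covered = sum(1 for d in depths if d >= min_depth)
        if covered / len(depths) > min_individual_frac:
            passing += 1
    return passing / len(depth_by_locus)


# Toy table: 3 loci x 20 individuals (invented depths).
toy = [
    [5] * 20,             # 20/20 individuals at >= 4x: passes
    [5] * 19 + [0],       # 19/20 = 95%: passes
    [5] * 18 + [0, 1],    # 18/20 = 90%, not more than 90%: fails
]
result = fraction_of_loci_consistently_covered(toy)
print(f"{result:.3f}")  # prints 0.667
```

The strict inequality mirrors the ">90% of individuals" wording; a study using "at least 90%" would change `>` to `>=`.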
The Cracidae (curassows, guans, and chachalacas) include some of the most spectacular and endangered Neotropical bird species. They lack a comprehensive phylogenetic hypothesis, hence their geographic origin and the history of their diversification remain unclear. We present a species-level phylogeny of Cracidae inferred from a matrix of 430 ultraconserved elements (UCEs; at least one species sampled per genus) and eight more variable loci (introns and mtDNA; all available species). We use this phylogeny along with probabilistic biogeographic modeling to test whether Gondwanan vicariance, ancient dispersal to South America, ancient dispersal from South America, or massive global cooling isolated cracids in the Neotropics. Contrary to previous estimates that extant cracids diversified in the Cretaceous, our fossil-calibrated divergence time estimates instead support that crown Cracidae originated in the late Miocene. Species-rich genera Crax, Penelope, and Ortalis began diversifying as recently as 3 Mya. Biogeographic reconstructions indicate that modern cracids originated in Mesoamerica and were isolated from a widespread Laurasian ancestor, consistent with the massive global cooling hypothesis. Current South American diversity is the result of multiple colonization events following uplift of the Panamanian Isthmus, coupled with rapid diversification and evolution of secondary sympatry. Of the four major cracid lineages (curassows, chachalacas, typical guans, horned guan), the only lineage that has failed to colonize and diversify in South America is the unique horned guan (Oreophasis derbianus), which is sister to curassows and chachalacas rather than typical guans.
New DNA sequencing technologies are allowing researchers to explore the genomes of the millions of natural history specimens collected prior to the molecular era. Yet, we know little about how well specific next-generation sequencing (NGS) techniques work with the degraded DNA typically extracted from museum specimens. Here, we use one type of NGS approach, sequence capture of ultraconserved elements (UCEs), to collect data from bird museum specimens as old as 120 years. We targeted 5060 UCE loci in 27 western scrub-jays (Aphelocoma californica) representing three evolutionary lineages that could be species, and we collected an average of 3749 UCE loci containing 4460 single nucleotide polymorphisms (SNPs). Despite older specimens producing fewer and shorter loci in general, we collected thousands of markers from even the oldest specimens. More sequencing reads per individual helped to boost the number of UCE loci we recovered from older specimens, but more sequencing was not as successful at increasing the length of loci. We detected contamination in some samples and determined that contamination was more prevalent in older samples that were subject to less sequencing. For the phylogeny generated from concatenated UCE loci, contamination led to incorrect placement of some individuals. In contrast, a species tree constructed from SNPs called within UCE loci correctly placed individuals into three monophyletic groups, perhaps because of the stricter analytical procedures used for SNP calling. This study and other recent studies on the genomics of museum specimens have profound implications for natural history collections, where millions of older specimens should now be considered genomic resources.
Ectoparasites frequently vector pathogens from often unknown pathogen reservoirs to both human and animal populations. Simultaneous identification of the ectoparasite species, the wildlife host that provided their most recent blood meal(s), and their pathogen load would greatly facilitate the understanding of the complex transmission dynamics of vector-borne diseases. Currently, these identifications are principally performed using multiple polymerase chain reaction (PCR) assays. We developed an assay (EctoBaits) based on in-solution capture paired with high-throughput sequencing to simultaneously identify ectoparasites, host blood meals and pathogens. We validated our in-solution capture results using double-blind PCR assays, morphology and collection data. The EctoBaits assay effectively and efficiently identifies ectoparasites, blood meals, and pathogens in a single capture experiment, allowing for high-resolution taxonomic identification while preserving the DNA sample for future analyses.
Identification of genes underlying genomic signatures of natural selection is key to understanding adaptation to local conditions. We used targeted resequencing to identify SNP markers in 5321 candidate adaptive genes associated with known immunological, metabolic and growth functions in ovids and other ungulates. We selectively targeted 8161 exons in protein-coding and nearby 5′ and 3′ untranslated regions of chosen candidate genes. Targeted sequences were taken from bighorn sheep (Ovis canadensis) exon capture data and directly from the domestic sheep genome (Ovis aries v. 3; oviAri3). The bighorn sheep sequences used in the Dall’s sheep (Ovis dalli dalli) exon capture aligned to 2350 genes on the oviAri3 genome with an average of 2 exons each. We developed a microfluidic qPCR-based SNP chip to genotype 476 Dall’s sheep from locations across their range and test for patterns of selection. Using multiple corroborating approaches (lositan and bayescan), we detected 28 SNP loci potentially under selection. We additionally identified candidate loci significantly associated with latitude, longitude, precipitation and temperature, suggesting local environmental adaptation. The three methods demonstrated consistent support for natural selection on nine genes with immune and disease-regulating functions (e.g. Ovar-DRA, APC, BATF2, MAGEB18), cell regulation signalling pathways (e.g. KRIT1, PI3K, ORRC3), and respiratory health (CYSLTR1). Characterizing adaptive allele distributions from novel genetic techniques will facilitate investigation of the influence of environmental variation on local adaptation of a northern alpine ungulate throughout its range. This research demonstrated the utility of exon capture for gene-targeted SNP discovery and subsequent SNP chip genotyping using low-quality samples in a nonmodel species.