Here, we present a set of RNA-based probes for whole mitochondrial genome in-solution enrichment, targeting a diversity of mammalian mitogenomes. This probe set was designed from seven mammalian orders and tested to determine its utility for enriching degraded DNA. We generated 63 mitogenomes representing five orders and 22 genera of mammals, with coverage ranging from 0 to >5400×. Based on a threshold of 70% mitogenome recovery and at least 10× average coverage, 32 individuals (51% of samples) were considered successful. The estimated sequence divergence of samples from the probe sequences used to construct the array ranged up to nearly 20%. Sample type was more predictive of mitogenome recovery than sample age. The proportion of reads from each individual in multiplexed enrichments was highly skewed, with each pool having one sample that yielded a majority of the reads. Recovery across each mitochondrial gene varied, with most samples exhibiting regions with gaps or ambiguous sites. We estimated the ability of the probes to capture mitogenomes from a diversity of mammalian taxa not included here by performing a clustering analysis of published sequences for 100 taxa representing most mammalian orders. Our study demonstrates that a general array can be cost- and time-effective when there is a need to screen a modest number of individuals from a variety of taxa. We also address practical concerns for using such a tool, with regard to pooling samples and generating high-quality mitogenomes, and we detail a pipeline to remove chimeric molecules.
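
The success criterion stated in this abstract (at least 70% mitogenome recovery and at least 10× average coverage) can be illustrated with a minimal Python sketch. The `Sample` class, its field names, and the example values are hypothetical and are not taken from the study; only the two thresholds come from the text.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """Hypothetical per-sample summary; field names are illustrative only."""
    name: str
    fraction_recovered: float  # fraction of mitogenome positions recovered (0 to 1)
    mean_coverage: float       # average read depth across the mitogenome


def is_successful(sample, min_recovery=0.70, min_coverage=10.0):
    """Apply the two thresholds from the abstract: >=70% recovery and >=10x coverage."""
    return (sample.fraction_recovered >= min_recovery
            and sample.mean_coverage >= min_coverage)


def success_rate(samples, min_recovery=0.70, min_coverage=10.0):
    """Return the fraction of samples that meet both thresholds."""
    if not samples:
        return 0.0
    passed = sum(is_successful(s, min_recovery, min_coverage) for s in samples)
    return passed / len(samples)


# Made-up example values, not data from the study.
samples = [
    Sample("A", 0.95, 5400.0),  # passes both thresholds
    Sample("B", 0.72, 9.5),     # fails on coverage
    Sample("C", 0.40, 50.0),    # fails on recovery
    Sample("D", 0.70, 10.0),    # passes exactly at both thresholds
]
rate = success_rate(samples)
```

With the counts reported in the abstract, 32 successful samples out of 63 gives 32/63, which rounds to the stated 51%.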

Phylogenetics benefits from using a large number of putatively independent nuclear loci and their combination with other sources of information, such as the plastid and mitochondrial genomes. To facilitate the selection of orthologous low-copy nuclear (LCN) loci for phylogenetics in nonmodel organisms, we created an automated and interactive script to select hundreds of LCN loci by a comparison between transcriptome and genome skim data. We used our script to obtain LCN genes for southern African Oxalis (Oxalidaceae), a speciose plant lineage in the Greater Cape Floristic Region. This resulted in 1164 LCN genes greater than 600 bp. Using target enrichment combined with genome skimming (Hyb-Seq), we obtained on average 1141 LCN loci, nearly the whole plastid genome and the nrDNA cistron from 23 southern African Oxalis species. Despite a wide range of gene trees, the phylogeny based on the LCN genes was very robust, as retrieved through various gene and species tree reconstruction methods as well as concatenation. Cytonuclear discordance was strong. This indicates that organellar phylogenies alone are unlikely to represent the species tree and stresses the utility of Hyb-Seq in phylogenetics.

Acropyga ants are a widespread clade of small subterranean formicines that live in obligate symbiotic associations with root mealybugs. We generated a data set of 944 loci of ultraconserved elements (UCEs) to reconstruct the phylogeny of 41 representatives of 23 Acropyga species using both concatenation and species-tree approaches. We investigated the biogeographic history of the genus through divergence dating analyses and ancestral range reconstructions. We also explored the evolution of the Acropyga-mealybug mutualism using ancestral state reconstruction methods. We recovered a highly supported species phylogeny for Acropyga with both concatenation and species-tree analyses. The age for crown-group Acropyga is estimated to be around 30 Ma. The geographic origin of the genus remains uncertain, although phylogenetic affinities within the subfamily Formicinae point to a Paleotropical ancestor. Two main Acropyga lineages are recovered with mutually exclusive distributions in the Old World and New World. Within the Old World clade, a Palearctic and African lineage is suggested as sister to the remaining species. Ancestral state reconstructions indicate that Old World species have diversified mainly in close association with xenococcines from the genus Eumyrmococcus, although present-day associations also involve other mealybug genera. In contrast, New World Acropyga predominantly evolved with Neochavesia until a recent (10–15 Ma) switch to rhizoecid mealybug partners (genus Rhizoecus). The striking mandibular variation in Acropyga evolved most likely from a 5-toothed ancestor. Our results provide an initial evolutionary framework for extended investigations of potential co-evolutionary interactions between these ants and their mealybug partners.

Molecular ecologists seek to genotype hundreds to thousands of loci from hundreds to thousands of individuals at minimal cost per sample. Current methods, such as restriction-site-associated DNA sequencing (RADseq) and sequence capture, are constrained by costs associated with inefficient use of sequencing data and sample preparation. Here, we introduce RADcap, an approach that combines the major benefits of RADseq (low cost with specific start positions) with those of sequence capture (repeatable sequencing of specific loci) to significantly increase efficiency and reduce costs relative to current approaches. RADcap uses a new version of dual-digest RADseq (3RAD) to identify candidate SNP loci for capture bait design and subsequently uses custom sequence capture baits to consistently enrich candidate SNP loci across many individuals. We combined this approach with a new library preparation method for identifying and removing PCR duplicates from 3RAD libraries, which allows researchers to process RADseq data using traditional pipelines, and we tested the RADcap method by genotyping sets of 96–384 Wisteria plants. Our results demonstrate that our RADcap method: (i) methodologically reduces (to <5%) and allows computational removal of PCR duplicate reads from data, (ii) achieves 80–90% reads on target in 11 of 12 enrichments, (iii) returns consistent coverage (≥4×) across >90% of individuals at up to 99.8% of the targeted loci, (iv) produces consistently high occupancy matrices of genotypes across hundreds of individuals and (v) costs significantly less than current approaches.
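
One way to read result (iii) above is as a per-locus check: a locus counts as consistently covered if more than 90% of individuals have at least 4× depth at it. The Python sketch below computes the fraction of loci that meet this condition from a depth matrix. The function name, the matrix layout, and the example depths are assumptions for illustration; this is not the RADcap authors' pipeline.

```python
def fraction_consistent_loci(depth, min_depth=4, min_individual_fraction=0.90):
    """Fraction of loci with >= min_depth coverage in more than
    min_individual_fraction of individuals.

    depth: list of rows, one per individual; each row lists per-locus read depth.
    """
    if not depth or not depth[0]:
        return 0.0
    n_individuals = len(depth)
    n_loci = len(depth[0])
    passing = 0
    for locus in range(n_loci):
        # Count individuals whose depth at this locus reaches the minimum.
        covered = sum(1 for row in depth if row[locus] >= min_depth)
        if covered / n_individuals > min_individual_fraction:
            passing += 1
    return passing / n_loci


# Made-up depths for 10 individuals at 3 loci (not data from the study).
depth = [[10, 5, 4] for _ in range(10)]
depth[0][1] = 3  # one individual falls below 4x at the second locus

result = fraction_consistent_loci(depth)
```

In this example the second locus is covered in exactly 90% of individuals, so it does not satisfy the strict ">90%" condition, and two of the three loci pass.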

The Cracidae (curassows, guans, and chachalacas) include some of the most spectacular and endangered Neotropical bird species. They lack a comprehensive phylogenetic hypothesis, hence their geographic origin and the history of their diversification remain unclear. We present a species-level phylogeny of Cracidae inferred from a matrix of 430 ultraconserved elements (UCEs; at least one species sampled per genus) and eight more variable loci (introns and mtDNA; all available species). We use this phylogeny along with probabilistic biogeographic modeling to test whether Gondwanan vicariance, ancient dispersal to South America, ancient dispersal from South America, or massive global cooling isolated cracids in the Neotropics. Contrary to previous estimates that extant cracids diversified in the Cretaceous, our fossil-calibrated divergence time estimates instead support that crown Cracidae originated in the late Miocene. Species-rich genera Crax, Penelope, and Ortalis began diversifying as recently as 3 Mya. Biogeographic reconstructions indicate that modern cracids originated in Mesoamerica and were isolated from a widespread Laurasian ancestor, consistent with the massive global cooling hypothesis. Current South American diversity is the result of multiple colonization events following uplift of the Panamanian Isthmus, coupled with rapid diversification and evolution of secondary sympatry. Of the four major cracid lineages (curassows, chachalacas, typical guans, horned guan), the only lineage that has failed to colonize and diversify South America is the unique horned guan (Oreophasis derbianus), which is sister to curassows and chachalacas rather than typical guans.

New DNA sequencing technologies are allowing researchers to explore the genomes of the millions of natural history specimens collected prior to the molecular era. Yet, we know little about how well specific next-generation sequencing (NGS) techniques work with the degraded DNA typically extracted from museum specimens. Here, we use one type of NGS approach, sequence capture of ultraconserved elements (UCEs), to collect data from bird museum specimens as old as 120 years. We targeted 5060 UCE loci in 27 western scrub-jays (Aphelocoma californica) representing three evolutionary lineages that could be species, and we collected an average of 3749 UCE loci containing 4460 single nucleotide polymorphisms (SNPs). Despite older specimens producing fewer and shorter loci in general, we collected thousands of markers from even the oldest specimens. More sequencing reads per individual helped to boost the number of UCE loci we recovered from older specimens, but more sequencing was not as successful at increasing the length of loci. We detected contamination in some samples and determined that contamination was more prevalent in older samples that were subject to less sequencing. For the phylogeny generated from concatenated UCE loci, contamination led to incorrect placement of some individuals. In contrast, a species tree constructed from SNPs called within UCE loci correctly placed individuals into three monophyletic groups, perhaps because of the stricter analytical procedures used for SNP calling. This study and other recent studies on the genomics of museum specimens have profound implications for natural history collections, where millions of older specimens should now be considered genomic resources.

Ectoparasites frequently vector pathogens from often unknown pathogen reservoirs to both human and animal populations. Simultaneous identification of the ectoparasite species, the wildlife host that provided their most recent blood meal(s), and their pathogen load would greatly facilitate the understanding of the complex transmission dynamics of vector-borne diseases. Currently, these identifications are principally performed using multiple polymerase chain reaction (PCR) assays. We developed an assay (EctoBaits) based on in-solution capture paired with high-throughput sequencing to simultaneously identify ectoparasites, host blood meals and pathogens. We validated our in-solution capture results using double-blind PCR assays, morphology and collection data. The EctoBaits assay effectively and efficiently identifies ectoparasites, blood meals, and pathogens in a single capture experiment, allowing for high-resolution taxonomic identification while preserving the DNA sample for future analyses.

Identification of genes underlying genomic signatures of natural selection is key to understanding adaptation to local conditions. We used targeted resequencing to identify SNP markers in 5321 candidate adaptive genes associated with known immunological, metabolic and growth functions in ovids and other ungulates. We selectively targeted 8161 exons in protein-coding and nearby 5′ and 3′ untranslated regions of chosen candidate genes. Targeted sequences were taken from bighorn sheep (Ovis canadensis) exon capture data and directly from the domestic sheep genome (Ovis aries v. 3; oviAri3). The bighorn sheep sequences used in the Dall’s sheep (Ovis dalli dalli) exon capture aligned to 2350 genes on the oviAri3 genome with an average of 2 exons each. We developed a microfluidic qPCR-based SNP chip to genotype 476 Dall’s sheep from locations across their range and test for patterns of selection. Using multiple corroborating approaches (LOSITAN and BayeScan), we detected 28 SNP loci potentially under selection. We additionally identified candidate loci significantly associated with latitude, longitude, precipitation and temperature, suggesting local environmental adaptation. The three methods demonstrated consistent support for natural selection on nine genes with immune and disease-regulating functions (e.g. Ovar-DRA, APC, BATF2, MAGEB18), cell regulation signalling pathways (e.g. KRIT1, PI3K, ORRC3), and respiratory health (CYSLTR1). Characterizing adaptive allele distributions from novel genetic techniques will facilitate investigation of the influence of environmental variation on local adaptation of a northern alpine ungulate throughout its range. This research demonstrated the utility of exon capture for gene-targeted SNP discovery and subsequent SNP chip genotyping using low-quality samples in a nonmodel species.

The Cyclophyllidea is the most diverse order of tapeworms, encompassing species that infect all classes of terrestrial tetrapods including humans and domesticated animals. Available phylogenetic reconstructions based either on morphology or molecular data lack the resolution to allow scientists to either propose a solid taxonomy or infer evolutionary associations. Molecular markers available for the Cyclophyllidea mostly include ribosomal DNA and mitochondrial loci. In this study, we identified 3641 single-copy nuclear coding loci by comparing the genomes of Hymenolepis microstoma, Echinococcus granulosus and Taenia solium. We designed RNA baits based on the sequence of H. microstoma, and applied target enrichment and Illumina sequencing to test the utility of those baits to recover loci useful for phylogenetic analyses. We captured DNA from five species of tapeworms representing two families of cyclophyllideans. We obtained an average of 3284 (90%) of the targets from the test samples and then used captured sequences (2 181 361 bp in total; fragment size ranging from 301 to 6969 bp) to reconstruct a phylogeny for the five test species plus the three species for which genomic data are available. The results were consistent with the current consensus regarding cyclophyllidean relationships. To assess the potential for our method to yield informative genetic variation at intraspecific scales, we extracted 14 074 single nucleotide polymorphisms (SNPs) from alignments of four Arostrilepis macrocirrosa and two A. cooki and successfully inferred their relationships. The results showed that our target gene tools yield data sets that provide robust inferences at a range of taxonomic scales in the Cyclophyllidea.

Recent studies have advocated biomonitoring using DNA techniques. In this study, two high-throughput sequencing (HTS)-based methods were evaluated: amplicon metabarcoding of the cytochrome c oxidase subunit I (COI) mitochondrial gene and gene enrichment using MYbaits (targeting nine different genes including COI). The gene-enrichment method does not require PCR amplification and thus avoids biases associated with universal primers. Macroinvertebrate samples were collected from 12 New Zealand rivers. Macroinvertebrates were morphologically identified and enumerated, and their biomass determined. DNA was extracted from all macroinvertebrate samples and HTS undertaken using the Illumina MiSeq platform. Macroinvertebrate communities were characterized from sequence data using either six genes (three of the original nine were not used) or just the COI gene in isolation. The gene-enrichment method (all genes) detected the highest number of taxa and obtained the strongest Spearman rank correlations between the number of sequence reads, abundance and biomass in 67% of the samples. Median detection rates across rare (<1% of the total abundance or biomass), moderately abundant (1–5%) and highly abundant (>5%) taxa were highest using the gene-enrichment method (all genes). Our data indicated primer biases occurred during amplicon metabarcoding with greater than 80% of sequence reads originating from one taxon in several samples. The accuracy and sensitivity of both HTS methods would be improved with more comprehensive reference sequence databases. The data from this study illustrate the challenges of using PCR amplification-based methods for biomonitoring and highlight the potential benefits of using approaches, such as gene enrichment, which circumvent the need for an initial PCR step.
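
The Spearman rank correlation used above to compare sequence read counts with abundance or biomass can be sketched in plain Python as the Pearson correlation of average ranks. The function names and the example numbers are hypothetical; the study's own data and software are not reproduced here.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, assigning tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation computed on average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one input is constant")
    return cov / math.sqrt(var_x * var_y)


# Made-up per-taxon values for one sample (not data from the study).
reads = [1200, 300, 4500, 50, 800]
abundance = [40, 12, 95, 3, 60]
rho = spearman_rho(reads, abundance)
```

Because the statistic depends only on rank order, it is insensitive to the very different scales of read counts, individual counts and biomass, which is presumably why a rank-based measure suits this kind of comparison.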